Title: A novel filter feature selection method using rough set for short text data
Authors: Çekik, Rasim
Uysal, Alper Kürşat
Keywords: Short text classification
Rough set
Feature selection
Issue Date: 2020
Publisher: Pergamon-Elsevier Science Ltd
Abstract: High dimensionality is an important concern in short text classification because it affects both the computational cost and the accuracy of classifiers. Short text data, besides being high dimensional, also has an incomplete, inconsistent, and sparse structure. Selecting important features that provide a better representation is one solution to the high dimensionality problem. In this study, we developed a novel filter feature selection method, Proportional Rough Feature Selector (PRFS), which uses rough sets to make a regional distinction according to the value set of a term, identifying documents that exactly belong to a class and documents that possibly belong to a class. Documents that possibly belong to a class are penalized by multiplying by a coefficient named a. Additionally, the effect of sparsity in the term vector space is calculated with the help of rough sets. PRFS is compared with state-of-the-art filter feature selection methods such as Gini index, information gain, distinguishing feature selector, and the recently proposed max-min ratio and normalized difference measure methods. The comparison is carried out using various feature sizes on four different short text datasets, with Macro-F1 as the success measure. Experimental results demonstrated that PRFS offers either better or competitive performance with respect to the other feature selection methods in terms of Macro-F1. This study may be a pioneering one in this research field, as it proposes a novel feature selection method for short text classification using rough set theory. (c) 2020 Elsevier Ltd. All rights reserved.
ISSN: 0957-4174
Appears in Collections: Mathematics Department Collection
Scopus Indexed Publications Collection
WoS Indexed Publications Collection
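
The abstract's distinction between documents that exactly belong to a class and documents that possibly belong to it corresponds to the lower approximation and the boundary region in standard rough set theory. The following is a minimal sketch of that partition only, assuming documents are grouped by the value a single term takes in them. It is not the PRFS scoring formula, which this record does not give; the function name `rough_regions` and the toy data are hypothetical.

```python
from collections import defaultdict


def rough_regions(term_values, labels, target):
    """Return (lower, boundary) sets of document indices for class `target`.

    Documents with the same term value form one equivalence block.
    lower    : blocks whose documents all carry the target label
               (documents that exactly belong to the class)
    boundary : blocks that mix the target label with other labels
               (documents that possibly belong to the class)
    """
    blocks = defaultdict(list)
    for idx, value in enumerate(term_values):
        blocks[value].append(idx)

    lower, boundary = set(), set()
    for members in blocks.values():
        hits = sum(1 for i in members if labels[i] == target)
        if hits == len(members):
            lower.update(members)
        elif hits > 0:
            boundary.update(members)
    return lower, boundary


# Hypothetical toy data: the value of one term in six short documents.
term_values = [2, 2, 1, 1, 0, 0]
labels = ["sport", "sport", "sport", "tech", "tech", "tech"]

lower, boundary = rough_regions(term_values, labels, "sport")
print(sorted(lower), sorted(boundary))
```

In this toy case the block with term value 2 lies entirely inside the class, the block with value 1 is mixed, and the block with value 0 contains no documents of the class. A selector in the spirit of the abstract would weight the mixed block less than the pure one.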


Items in GCRIS Repository are protected by copyright, with all rights reserved, unless otherwise indicated.