Improved ICHI square feature selection method for Arabic classifiers

Hadeel N. Alshaer, Mohammed A. Otair, Laith Abualigah

Abstract


Feature selection is one of the most important problems in text and data mining. This paper presents a comparative study of feature selection methods for Arabic text classification. Five feature selection methods were compared: ICHI square, CHI square, Information Gain, Mutual Information, and Wrapper. Each was tested with five classification algorithms: Bayes Net, Naive Bayes, Random Forest, Decision Tree, and Artificial Neural Networks. The experiments used an Arabic data collection of 9055 documents, and the methods were compared on four criteria: Precision, Recall, F-measure, and time to build the model. The results showed that the improved ICHI feature selection method achieved the best results in almost all cases in comparison with the other methods.
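
As a rough illustration of the baseline CHI square method named in the abstract, the sketch below scores a term against a category from a 2x2 document-count table and keeps the highest-scoring terms. It uses the standard textbook chi-square statistic only; the ICHI variant evaluated in the paper is not specified in the abstract and is not reproduced here. The function names `chi_square` and `select_top_k` and the count names `a`, `b`, `c`, `d` are assumptions made for this example.

```python
def chi_square(a, b, c, d):
    """Standard chi-square score of a term for one category.

    Hypothetical count names (not taken from the paper):
      a: documents in the category that contain the term
      b: documents outside the category that contain the term
      c: documents in the category that lack the term
      d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no signal.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def select_top_k(scores, k):
    """Return the k terms with the highest scores, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy counts for illustration only; these are not data from the study.
    scores = {
        "term_a": chi_square(10, 0, 0, 10),  # perfectly associated
        "term_b": chi_square(5, 5, 5, 5),    # independent of the category
    }
    print(select_top_k(scores, 1))
```

A term that occurs equally often inside and outside the category scores zero, while a term confined to one category scores highest, which is why ranking by this score tends to keep category-specific vocabulary.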


DOI: http://doi.org/10.11591/ijict.v9i3.pp157-170

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

The International Journal of Informatics and Communication Technology (IJ-ICT)
p-ISSN 2252-8776, e-ISSN 2722-2616
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).