• E-ISSN:

    2454-9584

    P-ISSN

    2454-8111

    Impact Factor 2020

    5.051

    Impact Factor 2021

    5.610

INTERNATIONAL JOURNAL OF INVENTIONS IN ENGINEERING & SCIENCE TECHNOLOGY

International Peer Reviewed (Refereed), Open Access Research Journal
(By Aryavart International University, India)

Paper Details

Evaluation of Text Classifier Based on Different Stemming Algorithms

Ebtehal Talib Kudair

Ministry of Higher Education and Scientific Research, Iraq

38 - 43 Vol. 6, Jan-Dec, 2020
Receiving Date: 2020-09-01;    Acceptance Date: 2020-09-29;    Publication Date: 2020-10-15

Abstract

Text classification is an important field of machine learning. It is a supervised learning method that assigns texts to predefined categories. Text generally carries a great deal of information, but in unstructured form, and this unstructured data must be converted into structured data. In this paper, texts are classified using the traditional k-Nearest Neighbor (KNN) algorithm, and the performance of the KNN classifier is compared across text preprocessing with different stemming algorithms (Porter Stemmer and Snowball Stemmer). The Snowball stemmer reduced the number of features in comparison with the Porter stemmer, and the results showed that the classifier is more accurate when using the Snowball stemmer.

Keywords: Text Classification; k-Nearest Neighbor algorithm; Stemming
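
The comparison described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the paper's implementation: `light_stem` and `aggressive_stem` are hypothetical toy stemmers standing in for two stemmers of different strength (they are not the Porter or Snowball algorithms), the `TRAIN` corpus is invented for illustration, and raw term counts with cosine similarity are an assumed feature representation.

```python
import math
from collections import Counter


def light_stem(word):
    # Hypothetical toy stemmer (NOT Porter): strips only a trailing plural "s".
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def aggressive_stem(word):
    # Hypothetical toy stemmer (NOT Snowball): strips several common suffixes,
    # keeping at least three characters of the word.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def vectorize(text, stem):
    # Bag-of-words term counts over stemmed, lowercased, whitespace-split tokens.
    return Counter(stem(tok) for tok in text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, query, stem, k=3):
    # Majority vote among the k training documents most similar to the query.
    q = vectorize(query, stem)
    scored = sorted(
        ((cosine(vectorize(text, stem), q), label) for text, label in train),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


def vocabulary_size(train, stem):
    # Number of distinct features (stems) produced by a given stemmer.
    return len({stem(tok) for text, _ in train for tok in text.lower().split()})


# Invented toy corpus, for illustration only.
TRAIN = [
    ("players scored goals", "sport"),
    ("player scoring goal", "sport"),
    ("team played matches", "sport"),
    ("banks raised rates", "finance"),
    ("bank raising rate", "finance"),
    ("markets traded shares", "finance"),
]

if __name__ == "__main__":
    for name, stem in (("light", light_stem), ("aggressive", aggressive_stem)):
        print(name, "features:", vocabulary_size(TRAIN, stem))
        print(name, "prediction:", knn_predict(TRAIN, "the team scoring goals", stem))
```

In this sketch the stronger stemmer merges more word forms into a shared stem, so the feature set shrinks; whether that improves accuracy on real data depends on the corpus and has to be measured, as the paper does for its own dataset.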

    References

  1. Bijalwan, Vishwanath & Kumar, Vinay & Kumari, Pinki and Pascual, Jordan (2014) “KNN based Machine Learning Approach for Text and Document Mining” International Journal of Database Theory and Application Vol.7, No.1.
  2. Jivani, Anjali Ganesh (2016) “A Comparative Study of Stemming Algorithms” Int. J. Comp. Tech. Appl., Vol 2 (6), 1930-1938.
  3. Kannan, S. and Gurusamy, Vairaprakash (2015) “Preprocessing Techniques for Text Mining” Conference Paper.
  4. Korde, Vandana and Mahender, C. Namrata (2012) “Text Classification and Classifiers: A Survey” International Journal of Artificial Intelligence & Applications (IJAIA), Vol.3, No.2, March.
  5. Miah, Muhammed “Improved k-NN Algorithm for Text Classification”, Department of Computer Science and Engineering University of Texas at Arlington, TX, USA.
  6. Qaiser, Shahzad and Ali, Ramsha (2018) “Text Mining: Use of TF-IDF to Examine the Relevance of Words to Documents” International Journal of Computer Applications (0975 – 8887) Volume 181 – No.1, July.
  7. Sulaiman, M.N. and Hossin, M. (2015) “A Review on Evaluation Metrics for Data Classification Evaluations” International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.5, No.2, March.
  8. Venkatesh, B. and Anuradha, J. (2019) “A Review of Feature Selection and Its Methods” Cybernetics and Information Technologies, Volume 19, No 1.