doi: 10.4304/jsw.5.10.1137-1143
The Chinese Text Categorization System with Category Priorities
Abstract—Text categorization requires some understanding of the content of the documents, some prior knowledge of the categories, or both. For document content, our Chinese text categorization system uses a filtering measure for feature selection, and we modify the Term Frequency-Inverse Document Frequency (TF-IDF) formula to strengthen the weights of important keywords and weaken the weights of unimportant ones. For category knowledge, we use category priority to represent the relationship between two different categories. The experimental results show that our method effectively reduces noisy text and increases both the accuracy rate and the recall rate of text categorization.
Index Terms—text categorization, feature selection, filtering measure, text mining.
Cite: Huan-Chao Keh, Ding-An Chiang, Chih-Cheng Hsu, Hui-Hua Huang, "The Chinese Text Categorization System with Category Priorities," Journal of Software, vol. 5, no. 10, pp. 1137-1143, 2010.
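The abstract refers to a modified TF-IDF weighting but does not state the modified formula. For orientation only, the following is a minimal sketch of the standard, unmodified TF-IDF baseline, not the authors' variant. It assumes the documents have already been segmented into word tokens, as Chinese text requires; the function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Standard TF-IDF (baseline only, not the paper's modified formula).

    docs: list of token lists, assumed to be pre-segmented.
    Returns one {term: weight} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, pre-segmented sample documents.
docs = [
    ["股票", "市场", "上涨"],
    ["股票", "基金", "市场"],
    ["球队", "比赛", "胜利"],
]
weights = tf_idf(docs)
```

In this baseline, a term that occurs in only one document, such as "上涨", receives a higher weight than a term shared by several documents, such as "股票". A modification like the one the abstract describes would adjust these weights further.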