Volume 6 Number 9 (Sep. 2011)
JSW 2011 Vol.6(9): 1837-1843 ISSN: 1796-217X
doi: 10.4304/jsw.6.9.1837-1843

An Improved Algorithm of Bayesian Text Categorization

Tao Dong1, Wenqian Shang1, Haibin Zhu2

1School of Computer, Communication University of China, Beijing, China
2Dept. of Computer Science, Nipissing University, North Bay, Canada


Abstract—Text categorization is a fundamental technique in text mining and has been an active research topic in data mining and web mining in recent years. It plays an important role in traditional information retrieval, web indexing architectures, web information retrieval, and related applications. This paper presents an improved text categorization algorithm that combines feature weighting with the Naïve Bayesian classifier. Experimental results show that weighting features with the improved Gini index algorithm effectively improves the performance of the Naïve Bayesian classifier. The algorithm has been applied successfully in a sensitive information recognition system.

Index Terms—text categorization, Gini index, feature weighting, Naïve Bayes
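
As a rough illustration of the idea summarized in the abstract, the sketch below weights each term's contribution to a multinomial Naïve Bayes score by a Gini-index-style value. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract gives no formulas, so the weight sum_c P(t|c)^2 * P(c|t)^2 (one form of the Gini index used for text features), the document-frequency estimates, the choice to multiply log-likelihoods by the weight, and the names `gini_weights` and `WeightedNaiveBayes` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def gini_weights(docs, labels):
    """Assumed weight per term: sum over classes of P(t|c)^2 * P(c|t)^2,
    with both probabilities estimated from document frequencies."""
    class_sizes = Counter(labels)
    df = defaultdict(Counter)  # df[term][cls] = docs of class cls containing term
    for tokens, cls in zip(docs, labels):
        for term in set(tokens):
            df[term][cls] += 1
    weights = {}
    for term, per_class in df.items():
        total = sum(per_class.values())  # docs containing the term, any class
        weights[term] = sum(
            (n / class_sizes[cls]) ** 2 * (n / total) ** 2
            for cls, n in per_class.items()
        )
    return weights


class WeightedNaiveBayes:
    """Multinomial Naive Bayes with Laplace smoothing; each term's
    log-likelihood is scaled by its weight (an assumed weighting scheme)."""

    def fit(self, docs, labels):
        labels = list(labels)
        self.weights = gini_weights(docs, labels)
        self.vocab = set(self.weights)
        self.classes = sorted(set(labels))
        n_docs = len(labels)
        self.log_prior = {
            c: math.log(labels.count(c) / n_docs) for c in self.classes
        }
        counts = {c: Counter() for c in self.classes}
        for tokens, cls in zip(docs, labels):
            counts[cls].update(tokens)
        self.log_like = {}
        for c in self.classes:
            denom = sum(counts[c].values()) + len(self.vocab)
            self.log_like[c] = {
                t: math.log((counts[c][t] + 1) / denom) for t in self.vocab
            }
        return self

    def predict(self, tokens):
        def score(c):
            # Terms unseen in training are skipped.
            return self.log_prior[c] + sum(
                self.weights[t] * self.log_like[c][t]
                for t in tokens
                if t in self.vocab
            )
        return max(self.classes, key=score)
```

In this sketch, a term that appears in every document of one class and in no other class gets weight 1, while terms that are rare or spread across classes get smaller weights and therefore influence the class score less.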


Cite: Tao Dong, Wenqian Shang, Haibin Zhu, "An Improved Algorithm of Bayesian Text Categorization," Journal of Software, vol. 6, no. 9, pp. 1837-1843, 2011.
