Volume 5 Number 2 (Feb. 2010)
JSW 2010 Vol.5(2): 179-186 ISSN: 1796-217X
doi: 10.4304/jsw.5.2.179-186

PicAChoo: A Text Analysis Tool for Customizable Feature Selection with Dynamic Composition of Primitive Methods

Jaeseok Myung, Jung-Yeon Yang, Sang-goo Lee
School of Computer Science and Engineering, Seoul National University

Abstract—Although a document collection may contain hundreds of thousands of unique words, only a small number of them are significantly useful for text analysis. Feature selection has therefore become an important issue in many text analysis studies. A number of feature selection techniques and algorithms are available, but it is hard to say that any one algorithm outperforms the others, because feature selection results depend mostly on the source documents. We have to pick and choose an appropriate algorithm and the best subset of feature words each time we analyze a new set of source documents. In this paper, we present a framework named ‘PicAChoo’ (short for ‘Pick And Choose’) that enables customizable feature selection environments by composing several primitive feature selection methods without hard-coding. As its name indicates, the framework provides many strategies for extracting appropriate features and allows dynamic composition of several feature selection methods. In addition, it aims to give users an environment that exploits linguistic characteristics of textual data, such as part-of-speech and sentence structure. Finally, we illustrate that the selected feature words can be used for various intelligent services.

Index Terms—text analysis, feature selection, dynamic composition, feature storing model, complex feature.
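
The idea of composing primitive feature selection methods without hard-coding can be illustrated with a minimal Python sketch. This is not PicAChoo's actual API, which the abstract does not describe; the names `min_term_frequency`, `max_document_ratio`, and `compose`, as well as the two filtering criteria and the toy documents, are assumptions chosen only for illustration.

```python
from collections import Counter
from typing import Callable, List, Set

# Hypothetical types: a document is a list of tokens, and a primitive
# method narrows a candidate vocabulary given the documents.
Docs = List[List[str]]
Selector = Callable[[Docs, Set[str]], Set[str]]


def min_term_frequency(threshold: int) -> Selector:
    """Keep words that occur at least `threshold` times overall."""
    def select(docs: Docs, vocab: Set[str]) -> Set[str]:
        counts = Counter(tok for doc in docs for tok in doc)
        return {w for w in vocab if counts[w] >= threshold}
    return select


def max_document_ratio(ratio: float) -> Selector:
    """Drop words that appear in more than `ratio` of the documents."""
    def select(docs: Docs, vocab: Set[str]) -> Set[str]:
        doc_freq = Counter(tok for doc in docs for tok in set(doc))
        return {w for w in vocab if doc_freq[w] / len(docs) <= ratio}
    return select


def compose(*selectors: Selector) -> Callable[[Docs], Set[str]]:
    """Chain primitive selectors; each one filters the previous result."""
    def pipeline(docs: Docs) -> Set[str]:
        vocab = {tok for doc in docs for tok in doc}
        for selector in selectors:
            vocab = selector(docs, vocab)
        return vocab
    return pipeline


# Toy documents (invented for this example).
docs = [
    ["the", "camera", "lens", "is", "sharp"],
    ["the", "battery", "life", "is", "short"],
    ["the", "camera", "battery", "is", "heavy"],
]

# The composition is built at run time from a list of primitives.
select_features = compose(min_term_frequency(2), max_document_ratio(0.9))
features = select_features(docs)
print(sorted(features))
```

Because each primitive shares the same signature, the list passed to `compose` could be read from user input or a configuration file rather than fixed in code, which is one plausible reading of "dynamic composition" in the abstract.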


Cite: Jaeseok Myung, Jung-Yeon Yang, Sang-goo Lee, "PicAChoo: A Text Analysis Tool for Customizable Feature Selection with Dynamic Composition of Primitive Methods," Journal of Software, vol. 5, no. 2, pp. 179-186, 2010.
