Volume 9 Number 12 (Dec. 2014)
JSW 2014 Vol.9(12): 3028-3034 ISSN: 1796-217X
doi: 10.4304/jsw.9.12.3028-3034

Lexical-semantic SLVM for XML Document Classification

Jun Long, Luda Wang, Zude Li, Zuping Zhang, Huiling Li, and Guihu Zhao

School of Information Science and Engineering, Central South University, Changsha 410075, China

Abstract—The structured link vector model (SLVM) and its improved version rely on statistical term measures to represent XML documents. As a result, they ignore the lexical semantics of terms and their mutual information, which leads to text classification errors. This paper proposes an XML document representation method, the WordNet-based lexical-semantic SLVM, to solve this problem. Using WordNet, the method constructs a data structure that characterizes the lexical-semantic content of an XML document and adapts EM modeling to disambiguate word stems. A synset matrix of the lexical-semantic content is then built in the lexical-semantic feature space to represent the XML document, and lexical-semantic relations are marked on it to construct the feature matrix of the lexical-semantic SLVM. On a categorized Wikipedia XML dataset, using the NWKNN classification algorithm, the experimental results show that the feature matrix of our method achieves a better F1 measure than the original SLVM and the frequent sub-tree SLVM based on TFIDF.

Index Terms—Semi-structured document, SLVM, Lexical semantics, Classification, Feature matrix
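
The synset matrix mentioned in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `TOY_SYNSETS` lexicon is a hypothetical stand-in for WordNet, the `synset_matrix` helper is invented for this example, and the EM-based disambiguation and the marking of lexical-semantic relations are omitted. It only counts synsets per structural element of one XML document.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical toy lexicon standing in for WordNet: word -> synset label.
# A real system would look up synsets in WordNet and disambiguate them.
TOY_SYNSETS = {
    "car": "synset:automobile",
    "automobile": "synset:automobile",
    "bank": "synset:financial_institution",
    "loan": "synset:loan",
}

def synset_matrix(xml_text):
    """Return {element_tag: Counter(synset -> count)} for one XML document.

    Only the leading text of each element is read; tail text is ignored.
    """
    root = ET.fromstring(xml_text)
    matrix = {}
    for elem in root.iter():
        words = (elem.text or "").lower().split()
        counts = Counter(TOY_SYNSETS[w] for w in words if w in TOY_SYNSETS)
        if counts:
            matrix.setdefault(elem.tag, Counter()).update(counts)
    return matrix

doc = "<article><title>Car loan</title><body>The automobile bank car</body></article>"
m = synset_matrix(doc)
for tag, counts in m.items():
    print(tag, dict(counts))
```

In this toy document, "car" and "automobile" fall into the same synset column, which a purely term-based representation would treat as two unrelated features.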

Cite: Jun Long, Luda Wang, Zude Li, Zuping Zhang, Huiling Li, and Guihu Zhao, "Lexical-semantic SLVM for XML Document Classification," Journal of Software, vol. 9, no. 12, pp. 3028-3034, 2014.
