Volume 8 Number 1 (Jan. 2013)
JSW 2013 Vol.8(1): 55-62 ISSN: 1796-217X
doi: 10.4304/jsw.8.1.55-62

A Structured Information Extraction Algorithm for Scientific Papers based on Feature Rules Learning

Jianguo Chen (1, 2), Hao Chen (2)

(1) Fujian University of Technology, Fuzhou, Fujian, China
(2) Software School, Hunan University, Changsha, Hunan, China


Abstract: Traditional scientific papers are unstructured documents, which makes it difficult for them to support structured retrieval, statistical classification, association analysis, and other high-level applications. Extracting and analyzing the structured information in such papers is therefore a challenging problem. This paper proposes a structured information extraction algorithm for unstructured and/or semi-structured machine-readable documents. The scheme first analyzes the basic structure and format features of traditional scientific papers and learns extraction rules from those features. It then applies the rules to extract the title, authors, abstract, keywords, text, and other elements of a paper from the unstructured document. Finally, it exports the extracted content as structured text in the format required by multi-dimensional scientific papers, which supports structured retrieval, statistical classification, and other high-level applications of scientific papers.

Index Terms: Information Extraction, Feature Rules, Multi-dimensional Scientific Papers.
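
The abstract does not list the learned rules themselves, so the following is only a minimal sketch of what rule-based field extraction could look like, not the authors' algorithm. It assumes a plain-text paper whose first line is the title, whose second line is a comma-separated author list, and whose abstract and keyword lines start with a label. The function name `extract_fields`, the two regular expressions, and the sample text are all hypothetical, and the rules here are written by hand rather than learned from format features.

```python
import re

# Hypothetical hand-written feature rules. The paper learns its rules from
# format features; those rules are not given in the abstract.
ABSTRACT_RE = re.compile(r"^Abstract\s*[-:\u2014]\s*(.*)$", re.IGNORECASE)
KEYWORDS_RE = re.compile(
    r"^(?:Index Terms|Keywords)\s*[-:\u2014]\s*(.*)$", re.IGNORECASE
)


def extract_fields(text):
    """Split a plain-text paper into title, authors, abstract, keywords, body."""
    # Drop blank lines so positional rules (line 0, line 1) are stable.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    record = {"title": "", "authors": [], "abstract": "", "keywords": [], "body": ""}
    body = []
    for i, line in enumerate(lines):
        abstract = ABSTRACT_RE.match(line)
        keywords = KEYWORDS_RE.match(line)
        if i == 0:
            # Positional rule (assumption): the first non-empty line is the title.
            record["title"] = line
        elif i == 1 and not abstract:
            # Positional rule (assumption): the second line lists the authors.
            record["authors"] = [a.strip() for a in line.split(",") if a.strip()]
        elif abstract:
            record["abstract"] = abstract.group(1)
        elif keywords:
            terms = keywords.group(1).rstrip(".").split(",")
            record["keywords"] = [k.strip() for k in terms if k.strip()]
        else:
            # Anything not claimed by a rule is treated as body text.
            body.append(line)
    record["body"] = "\n".join(body)
    return record


sample = """A Sample Paper Title
Alice Smith, Bob Jones
Abstract: This paper does something.
Index Terms: Information Extraction, Feature Rules.
1. Introduction
Body text here."""

result = extract_fields(sample)
print(result["title"])
print(result["keywords"])
```

The returned dictionary stands in for the structured export the abstract describes; a real system would replace the fixed positional and label rules with rules derived from observed layout features such as font size, position, and section labels.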


Cite: Jianguo Chen, Hao Chen, "A Structured Information Extraction Algorithm for Scientific Papers based on Feature Rules Learning," Journal of Software, vol. 8, no. 1, pp. 55-62, 2013.
