Volume 8 Number 1 (Jan. 2013)
JSW 2013 Vol.8(1): 55-62 ISSN: 1796-217X
doi: 10.4304/jsw.8.1.55-62

A Structured Information Extraction Algorithm for Scientific Papers based on Feature Rules Learning

Jianguo Chen (1, 2), Hao Chen (2)

(1) Fujian University of Technology, Fuzhou, Fujian, China
(2) Software School, Hunan University, Changsha, Hunan, China


Abstract—Traditional scientific papers are unstructured documents, which makes it difficult for them to meet the requirements of structured retrieval, statistical classification, association analysis, and other high-level applications. Extracting and analyzing the structured information in such papers is therefore a challenging problem. This paper proposes a structured information extraction algorithm for unstructured and/or semi-structured machine-readable documents. The scheme analyzes the basic structure and format features of traditional scientific papers and learns feature rules from them. Using these rules, it extracts the title, author, abstract, keywords, text, and other elements of a paper from the unstructured document. It then exports the structured text in the format required by multi-dimensional scientific papers, which meets the requirements of structured retrieval, statistical classification, and other high-level applications of scientific papers.

Index Terms—Information Extraction, Feature Rules, Multi-dimensional Scientific Papers.
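
The abstract describes extracting the title, author, abstract, keywords, and text of a paper by applying feature rules. The page does not give the rules themselves, so the following is only a minimal sketch under stated assumptions: the rules are hand-written regular expressions rather than learned ones, the input is plain text with the title on the first line and the authors on the second, and the names `extract_fields`, `ABSTRACT_RE`, `KEYWORDS_RE`, and `SAMPLE` are hypothetical.

```python
import re

# Hypothetical hand-written rules. The paper learns its rules from format
# features; this sketch does not attempt to reproduce that learning step.
ABSTRACT_RE = re.compile(r"^Abstract\s*[-:\u2014]\s*(.+)$", re.IGNORECASE)
KEYWORDS_RE = re.compile(
    r"^(?:Index Terms|Keywords)\s*[-:\u2014]\s*(.+)$", re.IGNORECASE
)


def extract_fields(text):
    """Split a plain-text paper into title, authors, abstract, keywords, body."""
    # Drop blank lines so positional rules (line 0, line 1) are stable.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    record = {"title": "", "authors": [], "abstract": "", "keywords": [], "body": ""}
    body_lines = []
    for i, line in enumerate(lines):
        abstract = ABSTRACT_RE.match(line)
        keywords = KEYWORDS_RE.match(line)
        if i == 0:
            # Assumed positional rule: the first non-blank line is the title.
            record["title"] = line
        elif i == 1:
            # Assumed positional rule: the second line is a comma-separated author list.
            record["authors"] = [a.strip() for a in line.split(",") if a.strip()]
        elif abstract and not record["abstract"]:
            record["abstract"] = abstract.group(1)
        elif keywords and not record["keywords"]:
            parts = keywords.group(1).rstrip(".").split(",")
            record["keywords"] = [k.strip() for k in parts if k.strip()]
        else:
            # Anything not claimed by a rule is treated as body text.
            body_lines.append(line)
    record["body"] = "\n".join(body_lines)
    return record


# Made-up input used only to exercise the rules above.
SAMPLE = """A Sample Paper Title
Alice Example, Bob Example
Abstract: This paper describes a sample method.
Keywords: extraction, rules.
1. Introduction
Body text goes here."""

record = extract_fields(SAMPLE)
```

The resulting dictionary is one possible structured form. The export format required by multi-dimensional scientific papers is not specified on this page, so it is left out of the sketch.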


Cite: Jianguo Chen, Hao Chen, "A Structured Information Extraction Algorithm for Scientific Papers based on Feature Rules Learning," Journal of Software, vol. 8, no. 1, pp. 55-62, 2013.
