IGSNRR OpenIR
Automatic extraction and structuration of soil-environment relationship information from soil survey reports
Wang De-sheng [1,2,3]; Liu Jun-zhi [1,2,3]; Zhu A-xing [1,2,3,4,5]; Wang Shu [1,2,3]; Zeng Can-ying [1,2,3]; Ma Tian-wu [1,2,3]
2019-02-01
Source Publication: JOURNAL OF INTEGRATIVE AGRICULTURE
ISSN: 2095-3119
Volume: 18; Issue: 2; Pages: 328-339
Corresponding Author: Liu Jun-zhi
Abstract: In addition to soil samples, conventional soil maps, and experienced soil surveyors, text about soils (e.g., soil survey reports) is an important potential data source for extracting soil-environment relationships. Because the words describing soil-environment relationships are often mixed with unrelated words, the first step is to extract the needed words and organize them in a structured way. This paper applies natural language processing (NLP) techniques to automatically extract and structure information on soil-environment relationships from soil survey reports. The method includes two steps: (1) construction of a knowledge frame and (2) information extraction using either a rule-based method or a statistic-based method, depending on the type of information. For information written in a uniform style, the rule-based approach was used; the variables of this type are slope, elevation, accumulated temperature, annual mean temperature, annual precipitation, and frost-free period. For information written in diverse styles, the statistic-based method was adopted; the variables of this type are landform and parent material. The soil species of China soil survey reports were selected as the experimental dataset. Precision (P), recall (R), and F1-measure (F1) were used to evaluate the performance of the method. For the rule-based method, the P values were 100%, the R values were above 92%, and the F1 values were above 96% for all the variables involved. For the statistic-based method, which used conditional random fields (CRFs), the P, R, and F1 values for parent material were 84.15, 83.13, and 83.64%, respectively; the values for landform were 88.33, 76.81, and 82.17%, respectively. To explore the impact of text type on the performance of the CRFs-based method, CRFs models were trained and validated separately on the descriptive texts of soil types and of typical profiles. For parent material, the maximum F1 value for the descriptive text of soil types was 90.7%, while the maximum F1 value for the descriptive text of soil profiles was only 75%. For landform, the maximum F1 value for the descriptive text of soil types was 85.33%, which was similar to that for the descriptive text of soil profiles (85.71%). These results suggest that NLP techniques are effective for the extraction and structuration of soil-environment relationship information from a text data source.
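The abstract reports P, R, and F1 without stating their formulas. The sketch below assumes the standard definitions, P = TP / (TP + FP), R = TP / (TP + FN), and F1 = 2PR / (P + R), and recomputes F1 from the P and R percentages reported for the CRFs-based method. The function names and the `reported` dictionary are illustrative and do not come from the paper.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (output is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, F1) percentages as reported in the abstract for the CRFs-based method.
reported = {
    "parent material": (84.15, 83.13, 83.64),
    "landform": (88.33, 76.81, 82.17),
}

# Recompute F1 from the reported P and R, rounded to two decimals like the abstract.
recomputed = {name: round(f1_score(p, r), 2) for name, (p, r, _) in reported.items()}

for name, (p, r, f1) in reported.items():
    print(f"{name}: P={p} R={r} reported F1={f1} recomputed F1={recomputed[name]}")
```

Under these assumed definitions, the recomputed F1 values agree with the reported ones to two decimal places.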
Keywords: soil-environment relationship; text; natural language processing; extraction; structuration
DOI: 10.1016/S2095-3119(18)62071-4
WOS Keyword: MAPS
Indexed By: SCI
Language: English
Funding Project: National Natural Science Foundation of China [41431177]; National Natural Science Foundation of China [41601413]; National Basic Research Program of China [2015CB954102]; Natural Science Research Program of Jiangsu Province, China [BK20150975]; Natural Science Research Program of Jiangsu Province, China [14KJA170001]; Outstanding Innovation Team in Colleges and Universities in Jiangsu Province, China; Vilas Associate Award; Hammel Faculty Fellow Award; Manasse Chair Professorship from the University of Wisconsin-Madison
Funding Organization: National Natural Science Foundation of China; National Basic Research Program of China; Natural Science Research Program of Jiangsu Province, China; Outstanding Innovation Team in Colleges and Universities in Jiangsu Province, China; Vilas Associate Award; Hammel Faculty Fellow Award; Manasse Chair Professorship from the University of Wisconsin-Madison
WOS Research Area: Agriculture
WOS Subject: Agriculture, Multidisciplinary
WOS ID: WOS:000459947300008
Publisher: ELSEVIER SCI LTD
Document Type: Journal article
Identifier: http://ir.igsnrr.ac.cn/handle/311030/49262
Collection: Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences
Affiliation:
1. Nanjing Normal Univ, Key Lab Virtual Geog Environm, Nanjing 210023, Jiangsu, Peoples R China
2. State Key Lab Cultivat Base Geog Environm Evolut, Nanjing 210023, Jiangsu, Peoples R China
3. Jiangsu Ctr Collaborat Innovat Geog Informat Reso, Nanjing 210023, Jiangsu, Peoples R China
4. Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing 100101, Peoples R China
5. Univ Wisconsin, Dept Geog, Madison, WI 53706 USA
Recommended Citation
GB/T 7714: Wang De-sheng, Liu Jun-zhi, Zhu A-xing, et al. Automatic extraction and structuration of soil-environment relationship information from soil survey reports[J]. JOURNAL OF INTEGRATIVE AGRICULTURE, 2019, 18(2): 328-339.
APA: Wang De-sheng, Liu Jun-zhi, Zhu A-xing, Wang Shu, Zeng Can-ying, & Ma Tian-wu. (2019). Automatic extraction and structuration of soil-environment relationship information from soil survey reports. JOURNAL OF INTEGRATIVE AGRICULTURE, 18(2), 328-339.
MLA: Wang De-sheng, et al. "Automatic extraction and structuration of soil-environment relationship information from soil survey reports". JOURNAL OF INTEGRATIVE AGRICULTURE 18.2 (2019): 328-339.