
An intelligent retrieval system for Chinese agricultural scientific literature

Date post: 14-Apr-2017
Upload: aims-agricultural-information-management-standards
An Intelligent Retrieval System for Chinese Agricultural Scientific Literature. Ping Qian, Xiaolu Su. Scientech Documentation and Information Center, Chinese Academy of Agricultural Sciences, China. {pingq, suxiaolu}@mail.caas.net.cn
Transcript
Page 1: An intelligent retrieval system for Chinese agricultural scientific literature

An Intelligent Retrieval System for Chinese Agricultural Scientific Literature

Ping Qian, Xiaolu Su

Scientech Documentation and Information Center,

Chinese Academy of Agricultural Sciences, China.

{pingq, suxiaolu}@mail.caas.net.cn

Page 2: An intelligent retrieval system for Chinese agricultural scientific literature

Introduction

• How to find the desired information in huge information resources quickly and accurately has become a serious obstacle for people who develop and use network information resources.

• This project attempts to use new theory and technology to explore a solution to the above problem.

• Ontology-based knowledge engineering, currently an active research area, is an important theoretical foundation and applied technology for knowledge discovery and acquisition.

Page 3: An intelligent retrieval system for Chinese agricultural scientific literature

Information Retrieval Based on Ontology

• Build up the domain ontology

• Create the database, referring to the ontology

• Conduct the retrieval with the help of the ontology

• Process the results, then display them

Establishment Process of the System

• Import the classification method based on ontology theory

• Create the agricultural navigation information database

• Create the index database (Agricultural Scientific Literature Database)

• Create the Web information retrieval system

• Display the results

Page 4: An intelligent retrieval system for Chinese agricultural scientific literature

Foundation of Building the Agricultural Scientech Navigation Information Database

• Theory: Ontology

• Data Source: Agricultural Scientech Literature Database (more than 560,000 records)

• Tool: Statistical Analysis

• Standard: Chinese Library Classification Method

Page 5: An intelligent retrieval system for Chinese agricultural scientific literature

Stages of Building the Agricultural Navigation Information Database

1. Agricultural Theoretical Classification Tree
2. Agricultural Actual Classification Tree
3. Class - Keyword Cross Table
4. Keyword - Class Cross Table
5. Agricultural Navigation Information Database

Page 6: An intelligent retrieval system for Chinese agricultural scientific literature

Agricultural Theoretical Classification Tree

– Component
  • All of the classes relevant to the Chinese Library Classification Method

– Purpose
  • Solve the problems in creating the actual classification tree:
    – The relation between a class number and its name
    – The hierarchical relation among some class numbers

– Data Amount
  • Classes and subclasses: 42,948
  • First-layer classes: 17
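The two relations named above (class number to class name, and the hierarchy among class numbers) can be illustrated with a minimal sketch. It assumes that the parent of a class number is the longest shorter prefix that is itself a known class, which is a simplification rather than the project's documented rule. The function names and the sample entries below the top-level class S are illustrative and not taken from the presentation.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Optional

# Class number -> class name. Illustrative entries only (assumption).
CLASSES = {
    "S": "Agriculture, Agricultural Sciences",
    "S5": "Crops",
    "S51": "Cereal crops",
    "S512": "Wheat-type cereals",
    "S512.1": "Wheat",
}


def parent_of(class_no: str, known: set[str]) -> Optional[str]:
    """Return the longest proper prefix of class_no that is a known class."""
    candidate = class_no[:-1]
    while candidate:
        trimmed = candidate.rstrip(".")  # "S512." counts as "S512"
        if trimmed in known:
            return trimmed
        candidate = candidate[:-1]
    return None  # top-level class


def build_children(classes: dict[str, str]) -> dict[str, list[str]]:
    """Group class numbers under their parent to form the tree."""
    known = set(classes)
    children = defaultdict(list)
    for class_no in sorted(classes):
        parent = parent_of(class_no, known)
        if parent is not None:
            children[parent].append(class_no)
    return dict(children)


children = build_children(CLASSES)
```

Real class numbers also use ranges and auxiliary subdivisions, which this prefix rule does not cover.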

Page 7: An intelligent retrieval system for Chinese agricultural scientific literature

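First-Order Class Names in the Theoretical Tree

No.  Class  Class name                                                      Records
1    S      Agriculture, Agricultural Sciences                              470,213
2    F      Economics                                                        47,503
3    T      Industrial Technology                                            23,555
4    Q      Biological Sciences                                              10,440
5    X      Environmental Science, Labor Protection Science (Safety Science)  6,252
6    P      Astronomy, Earth Sciences                                         1,109
7    G      Culture, Science, Education, Sports                               1,106
8    O      Mathematical and Physical Sciences, Chemistry                       433
9    U      Transportation                                                      398
10   R      Medicine, Health                                                    391
11   C      Social Sciences (General)                                           209
12   D      Politics, Law                                                       102
13   Z      General Works                                                        22
14   N      Natural Sciences (General)                                           21
15   K      History, Geography                                                   19
16   H      Language, Writing                                                     5
17   V      Aviation, Aerospace                                                   2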
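As a quick consistency check, the per-class record counts on this page can be summed and compared with the "more than 560,000 records" stated for the source database. The sketch below uses only the counts from the table; the variable names are illustrative.

```python
# Record counts per first-order class, copied from the table on this page.
RECORDS = {
    "S": 470_213, "F": 47_503, "T": 23_555, "Q": 10_440, "X": 6_252,
    "P": 1_109, "G": 1_106, "O": 433, "U": 398, "R": 391, "C": 209,
    "D": 102, "Z": 22, "N": 21, "K": 19, "H": 5, "V": 2,
}

total_records = sum(RECORDS.values())
share_of_s = RECORDS["S"] / total_records  # fraction of records in class S

print(f"total records: {total_records:,}")
print(f"share of class S: {share_of_s:.1%}")
```

The sum is 561,780, and class S accounts for roughly 84% of it.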

Page 8: An intelligent retrieval system for Chinese agricultural scientific literature

Agricultural Actual Classification Tree

– Component:
  • All of the classes actually used in indexing

– Purpose:
  • Founding the navigation information database
  • Knowing the actual distribution of agricultural information, in order to find new growth points in the development of agricultural sciences

– Data amount:
  • Classes: 21,391, among them:
    • Coordinated classes: 10,748
    • Non-coordinated classes: 10,643

Page 9: An intelligent retrieval system for Chinese agricultural scientific literature

Agricultural Actual Classification Tree

Key Point:

More than 100,000 class numbers and their corresponding class names

Solution:

Create professional modeled class tables (9)

Create modeled class tables (6), among them:

  General modeled class tables (2)

  Professional modeled class tables (4)

Page 10: An intelligent retrieval system for Chinese agricultural scientific literature

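Modeled Class Table

Table name   Modeled range  Range name                                                          Model class number
f401_406     F407.1/.9      Economics of individual industrial sectors                          F401/406
s220         S221/229       Various agricultural machinery and implements                       S220
s50          S51/59         Various crops                                                       S50
s60          S63/68         Various horticultural crops                                         S60
S763_30      S763.31/.49    Various insect pests and their control                              S763.30
s821         S822/829.9     Various livestock                                                   S821
s831         S823/839       Various poultry                                                     S831
s881_884_9   S885.1/.9      Other silkworm species                                              S881/884.9
s965         S943           Diseases and natural enemies of various fishes, and their control   S965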
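To illustrate how such a table might be consulted, the sketch below looks up which model class applies to a given class number. It hard-codes three of the simpler rows from the table above and assumes that a range such as S51/59 covers every class number whose leading characters fall between S51 and S59. That reading of the range notation, and the function name `model_for`, are assumptions rather than the project's documented procedure.

```python
from typing import Optional

# (range start, range end, model class number), expanded by hand from
# three rows of the table: S51/59 -> S50, S63/68 -> S60, S221/229 -> S220.
MODELED_RANGES = [
    ("S51", "S59", "S50"),
    ("S63", "S68", "S60"),
    ("S221", "S229", "S220"),
]


def model_for(class_no: str) -> Optional[str]:
    """Return the model class whose range covers class_no, if any."""
    for start, end, model in MODELED_RANGES:
        prefix = class_no[: len(start)]  # compare at the range's own depth
        if start <= prefix <= end:
            return model
    return None


print(model_for("S512.1"))  # falls in S51/59
print(model_for("S435"))    # not covered by any listed range
```

Rows with mixed-depth ranges, such as S822/829.9, would need a fuller parser than this prefix comparison.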

Page 11: An intelligent retrieval system for Chinese agricultural scientific literature

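General Compound Class Table

Table name  Range name                     Records    Fields
fb2         World Regions Compound Table   F401/406   5
fb3         China Regions Compound Table   S220       4

Professional Compound Class Table

Table name  Compound range  Range class name                                       Records
F33_37      F33/37          Agricultural economy of individual countries           21
F43_47      F43/47          Industrial economy of individual countries             19
S727_728    S727/728        Afforestation by forest type and in special regions    5
S79         S791/796        Various forest tree species                            8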

Page 12: An intelligent retrieval system for Chinese agricultural scientific literature

Examples of Modeled Class Table

Page 13: An intelligent retrieval system for Chinese agricultural scientific literature

Examples of General Modeled Class Table

Page 14: An intelligent retrieval system for Chinese agricultural scientific literature

Examples of Professional Modeled Class Table

Page 15: An intelligent retrieval system for Chinese agricultural scientific literature

Class - Keyword Cross Table (17,582)

Page 16: An intelligent retrieval system for Chinese agricultural scientific literature

Keyword - Class Cross Table

Before removing duplicates: about 1,210,000 words

After removing duplicates: about 320,000 words
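A minimal sketch of how the two cross tables and the duplicate removal could be produced from indexed records follows. It assumes each record carries one class number and a list of keywords. The record layout, the sample data, and the function name are hypothetical, and the project's actual deduplication rule is not described in the slides.

```python
from collections import defaultdict

# Hypothetical indexed records: (class number, keywords).
RECORDS = [
    ("S512.1", ["wheat", "breeding", "yield"]),
    ("S512.1", ["wheat", "irrigation"]),
    ("S513", ["maize", "breeding"]),
]


def build_cross_tables(records):
    """Build class -> keywords and keyword -> classes, dropping duplicates."""
    class_to_keywords = defaultdict(set)
    keyword_to_classes = defaultdict(set)
    for class_no, keywords in records:
        for keyword in keywords:
            class_to_keywords[class_no].add(keyword)
            keyword_to_classes[keyword].add(class_no)
    return class_to_keywords, keyword_to_classes


class_to_keywords, keyword_to_classes = build_cross_tables(RECORDS)

raw_count = sum(len(keywords) for _, keywords in RECORDS)  # with duplicates
unique_count = len(keyword_to_classes)                     # distinct keywords
print(raw_count, unique_count)
```

Using sets makes the duplicate removal implicit: a keyword that occurs in many records under the same class is stored once.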

Page 17: An intelligent retrieval system for Chinese agricultural scientific literature

Agricultural Navigation Information Database

• Determine the regulations for organizing the information

• Make XML files for navigation information

• Choose the database management system

• Define database structure

Page 18: An intelligent retrieval system for Chinese agricultural scientific literature

The Regulations for Organizing the Information

• Never lose any class or subclass that has records

• Display order: classes with more records are listed first, then from higher class layers to lower ones

• If a node has no records and only one sub-node, the node is deleted and its sub-node is moved up to the upper layer

• Subclasses below a third-layer class are merged up into that third-layer class

• Subclasses with fewer than 30 records are ignored temporarily
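The rules above can be sketched as a single recursive pass over the classification tree. This is one possible reading, not the project's implementation: it assumes first-order classes are layer 1, applies the 30-record threshold to a subclass's own records plus those of its descendants, and applies the rules in the order shown in the code. The `Node` type and the function names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field

MIN_RECORDS = 30  # subclasses below this are ignored for now
MAX_LAYER = 3     # deeper subclasses are merged into the third layer


@dataclass
class Node:
    class_no: str
    records: int = 0
    children: list[Node] = field(default_factory=list)


def total(node: Node) -> int:
    """Records of the node itself plus all of its descendants."""
    return node.records + sum(total(child) for child in node.children)


def organize(node: Node, layer: int = 1) -> Node:
    if layer >= MAX_LAYER:
        # Merge everything below the third layer up into this node.
        return Node(node.class_no, total(node), [])
    kids = [organize(child, layer + 1) for child in node.children]
    kids = [kid for kid in kids if total(kid) >= MIN_RECORDS]
    kids.sort(key=total, reverse=True)  # classes with more records first
    if node.records == 0 and len(kids) == 1:
        # Empty node with a single sub-node: promote the sub-node.
        return kids[0]
    return Node(node.class_no, node.records, kids)


tree = organize(Node("S", 100, [
    Node("S5", 0, [Node("S51", 40, [Node("S512", 10, [Node("S512.1", 25)])])]),
    Node("S6", 200, [Node("S63", 12)]),
    Node("S2", 50),
]))
```

In this example S5 has no records and one sub-node, so S51 is promoted in its place, and S63 is dropped for having fewer than 30 records.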

Page 19: An intelligent retrieval system for Chinese agricultural scientific literature

XML files for Navigation Information(33MB)
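As an illustration of how a classification tree might be serialized into navigation XML, the sketch below uses Python's standard `xml.etree.ElementTree` module. The element name `class`, the attribute names, and all sample values except the name and record count of class S are hypothetical. The slides do not show the actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical navigation tree; only the S entry uses figures from the slides.
NAV_TREE = {
    "no": "S", "name": "Agriculture, Agricultural Sciences", "records": 470213,
    "children": [
        {"no": "S5", "name": "Crops", "records": 1000, "children": []},
        {"no": "S6", "name": "Horticulture", "records": 500, "children": []},
    ],
}


def to_xml(node: dict) -> ET.Element:
    """Convert one tree node and its sub-nodes into nested XML elements."""
    elem = ET.Element("class", {
        "no": node["no"],
        "name": node["name"],
        "records": str(node["records"]),
    })
    for child in node.get("children", []):
        elem.append(to_xml(child))
    return elem


xml_text = ET.tostring(to_xml(NAV_TREE), encoding="unicode")
print(xml_text)
```

Nesting child elements inside their parent keeps the class hierarchy explicit in the file, so a menu can be rendered by walking the XML directly.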

Page 20: An intelligent retrieval system for Chinese agricultural scientific literature

Data Check and Display Menu

Page 21: An intelligent retrieval system for Chinese agricultural scientific literature

Database Management System

• Relational Database
  – XML-Enabled Database
    • Needs data conversion, low efficiency

• Native XML Database
  – Software AG Tamino
    • Reads XML data directly
    • Saves data in XML format

Page 22: An intelligent retrieval system for Chinese agricultural scientific literature

Define Database Structure

Page 23: An intelligent retrieval system for Chinese agricultural scientific literature

System Framework

XMLDBMS/RDBMS + XML + JAVA/JSP

Browser / Server 3-layer system structure

Environment for running JSP and XML

Java SDK 1.3.1

Xalan 2.2.0

Tomcat 3.2

Page 24: An intelligent retrieval system for Chinese agricultural scientific literature

Demo of The Retrieval System

Page 25: An intelligent retrieval system for Chinese agricultural scientific literature

Registration

Page 26: An intelligent retrieval system for Chinese agricultural scientific literature

Login

Page 27: An intelligent retrieval system for Chinese agricultural scientific literature

Browse Retrieval

Page 28: An intelligent retrieval system for Chinese agricultural scientific literature

Enter Keyword

Page 29: An intelligent retrieval system for Chinese agricultural scientific literature

Display the Results

Page 30: An intelligent retrieval system for Chinese agricultural scientific literature

Second-Order Retrieval

Page 31: An intelligent retrieval system for Chinese agricultural scientific literature

Retrieval from the Tree Directly

Page 32: An intelligent retrieval system for Chinese agricultural scientific literature

Retrieval from the Tree Directly

Page 33: An intelligent retrieval system for Chinese agricultural scientific literature

Intelligent Retrieval

Page 34: An intelligent retrieval system for Chinese agricultural scientific literature

Fined Retrieval

Page 35: An intelligent retrieval system for Chinese agricultural scientific literature

Fined Retrieval

Page 36: An intelligent retrieval system for Chinese agricultural scientific literature

Conclusion

• The establishment of the agricultural scientific navigation information database and the development of its web search system change the traditional retrieval method from one based on keywords to one based on knowledge organization structure.

• It is also foundational work. The actual classification table and the cross tables between classes and keywords established in the project are valuable Chinese agricultural semantic resources.

• It is useful for further studies on the automatic recognition and classification of agricultural information, as well as for constructing a strict agricultural domain ontology.

• The work is just the beginning of the study of ontology and its application in agriculture.

Page 37: An intelligent retrieval system for Chinese agricultural scientific literature

The End

Thanks to All

