ReDefine - Research articles summarization

Date post: 23-Jan-2017
Upload: dig-vijay-kumar-yarlagadda
Transcript
Page 1: ReDefine - Research articles summarization

ReDefine – Research Articles Summarization

Presentation by Dig Vijay Kumar Yarlagadda

[email protected]@mail.umkc.edu

Page 2: ReDefine - Research articles summarization

Motivation

• Publications and research articles are often complex and difficult to understand, for reasons including:

  • New terms are coined in many papers, and their definitions are often buried deep in the content of the paper.

  • Papers are written concisely because of length restrictions.

  • Obscure abbreviations are common.

• Journal articles are written in more accessible language but are still incomprehensible to the general public.

“When everyday life is understood in terms of spatialization, temporalization and embodiment, ubiquitous computing offers a unique opportunity to evaluate the ‘relational’ as flows, intensities and transductions that mobilize sociotechnical assemblages.” - Galloway, A. (2004). Intimations of Everyday Life. Cultural Studies 18(2/3), 384–408.

Page 3: ReDefine - Research articles summarization

Objectives

● A graphical representation of the contents of the research paper including the important key terms and the relations between them. This would help understand the contents of the paper and the topic of discussion.

 

● Categorize a research article/publication into one of the subfields of Computer Science.

● Generate a text summary.

Page 4: ReDefine - Research articles summarization

Approach

• Prepare the dataset

  • Convert publications from PDF to text using the IBM Watson Document Conversion service

  • Extract metadata of the PDF files using Apache PDFBox

• Extract N-ary relations from the text using Allen AI Open IE 4.1

  • Sentence: The U.S. president Barack Obama gave his speech on Tuesday to thousands of people.

  • Extracted relations:

    • (Barack Obama, is the president of, the U.S.)

    • (Barack Obama, gave, [his speech, on Tuesday, to thousands of people])

  • Allen AI Open IE 4.1 performs much better than earlier Open IE systems, including TextRunner, ReVerb and OLLIE

Page 5: ReDefine - Research articles summarization

Approach (Cont.)

• Perform NLP

  • Lemmatization

  • Stopword removal (with an updated stopword list)

  • TF-IDF

• Train a Naïve Bayes model on 10 categories

• Extract topics using LDA
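
The preprocessing and classification steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the presenter's implementation: the stopword list, the toy corpus and the two category labels are made up, the classifier is trained on raw term counts rather than TF-IDF weights for simplicity, and lemmatization and LDA are omitted because they would need an external library.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; the slides mention updating one but do not give it.
STOPWORDS = {"a", "an", "the", "of", "for", "in", "on", "and", "to", "is", "we", "with"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return weights

def train_nb(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    vocab = {t for doc in docs for t in doc}
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(doc)
    return vocab, class_docs, term_counts

def predict_nb(model, doc):
    """Pick the class with the highest log-posterior, using add-one smoothing."""
    vocab, class_docs, term_counts = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        total_terms = sum(term_counts[label].values())
        score = math.log(n_docs / total_docs)
        for t in doc:
            if t in vocab:  # ignore terms never seen in training
                score += math.log((term_counts[label][t] + 1) / (total_terms + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy corpus with made-up category labels (the slides use 10 categories).
corpus = [
    ("We train a neural network classifier on labeled images", "machine learning"),
    ("Gradient descent is used for training the neural model", "machine learning"),
    ("The query optimizer selects an index for the database table", "databases"),
    ("Transactions in the database use locking for the table", "databases"),
]
docs = [tokenize(text) for text, _ in corpus]
labels = [label for _, label in corpus]

weights = tfidf(docs)
model = train_nb(docs, labels)
prediction = predict_nb(model, tokenize("an index on the database table"))
```

In this sketch a term that occurs in only one document (such as "query") receives a higher TF-IDF weight than one shared across documents (such as "database"), which is the property that makes TF-IDF useful for picking out key terms.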

Page 6: ReDefine - Research articles summarization

Workflow

Page 7: ReDefine - Research articles summarization

Results

• Open IE 4.1 Relation Extraction:

• N-ary relations are represented in JSON format
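
The slide's JSON output is not reproduced in this transcript. As a rough illustration only, the two relations extracted from the Barack Obama example sentence could be serialized as below. The field names `sentence`, `extractions`, `arg1`, `rel` and `args` are hypothetical and are not claimed to match the schema that Open IE 4.1 or the ReDefine pipeline actually emits.

```python
import json

# Hypothetical layout: one record per sentence, with a list of extractions.
# An N-ary relation keeps all of its trailing arguments in the "args" list.
document = {
    "sentence": "The U.S. president Barack Obama gave his speech on Tuesday "
                "to thousands of people.",
    "extractions": [
        {"arg1": "Barack Obama", "rel": "is the president of", "args": ["the U.S."]},
        {
            "arg1": "Barack Obama",
            "rel": "gave",
            "args": ["his speech", "on Tuesday", "to thousands of people"],
        },
    ],
}

encoded = json.dumps(document, indent=2)  # JSON text that could be stored or sent on
decoded = json.loads(encoded)             # parsing it back yields the same structure
```

Keeping the arguments in a list lets a binary relation and an N-ary relation share one record shape.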

Page 8: ReDefine - Research articles summarization

Results

• Key terms and relations expressed in a graph
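
The graph figure itself is not part of this transcript. One simple way to hold key terms and relations as a graph is a labeled adjacency list, sketched below under the assumption that each extraction has been reduced to a (subject, relation, object) triple, here by keeping only the first argument of an N-ary relation. The function name `build_graph` is illustrative, not taken from the project.

```python
from collections import defaultdict

def build_graph(triples):
    """Map each subject term to its outgoing (relation, object) edges."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)

# Triples taken from the example sentence on the Approach slide.
triples = [
    ("Barack Obama", "is the president of", "the U.S."),
    ("Barack Obama", "gave", "his speech"),
]

graph = build_graph(triples)

# Every term that appears as a subject or an object becomes a node.
nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
```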

Page 9: ReDefine - Research articles summarization

Results

• Ontology

Page 10: ReDefine - Research articles summarization

Results

• Classification of terms into sub-fields of Computer Science
