Date post: 14-Jan-2016
Authors: Prabha Yadav, Hoa T Dang, Anita de Waard, Lucy Vanderwende, Kevin B. Cohen

[email protected]

Biomed Summarization With Citation Sentences

Task Definition

• Given:
  – A “reference” paper
  – 10 “citing” papers that cite the reference paper
  – Citations in the citing paper

• Return:
  – Task 1A: Substrings of the reference paper that are the source of specific citations in the citing papers
  – Task 1B: Identify the facet of the reference span
  – Task 2: Write a 250-word summary of the reference paper that takes into account the citations
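One way to picture the inputs and outputs listed above is as a small set of records. The sketch below is illustrative only: the class names, field names, and the helper `summary_within_limit` are hypothetical, and the facet tuple contains just the four labels that appear on these slides, which may not be the complete official set.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Facet labels as they appear on the slides; assumed, possibly incomplete.
FACETS = ("Goal", "Method", "Result", "Conclusion")


@dataclass
class Citance:
    """One citation sentence from a citing paper, plus the task outputs."""
    citing_paper_id: str
    text: str
    reference_spans: List[str] = field(default_factory=list)  # Task 1A output
    facet: Optional[str] = None                               # Task 1B output


@dataclass
class Topic:
    """One reference paper together with the citances that point at it."""
    reference_paper_id: str
    reference_sentences: List[str]
    citances: List[Citance]
    summary: str = ""                                         # Task 2 output


def summary_within_limit(summary: str, max_words: int = 250) -> bool:
    """Check the 250-word limit from Task 2 using whitespace tokenization."""
    return len(summary.split()) <= max_words
```

Whitespace splitting is only one possible way to count words; the official word-counting rule is not stated on the slides.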

Example of Citation Mapping

…suggesting that miR-21 overexpression may contribute to the malignant phenotype by suppressing critical apoptosis-related genes (115). Voorhoeve et al. (116) employed a novel strategy by combining an miRNA vector library and corresponding bar code array…

Osada and Takehashi (2007), MicroRNAs in biological processes and carcinogenesis

Citing Paper

Voorhoeve et al. (2006), A Genetic Screen Implicates miRNA-372 and miRNA-373 As Oncogenes in Testicular Germ Cell Tumors

We subsequently created a human miRNA expression library (miR-Lib) by cloning almost all annotated human miRNAs into our vector (Rfam release 6) (Figure S3). Additionally, we made a corresponding microarray (miR-Array) containing all miR-Lib inserts, which allow the detection of miRNA effects on proliferation.

Reference Paper

Given a citance (or citing clause):

Task 1: Find the most pertinent sentence(s) in the reference paper to the citation text

Task 2: Identify the facet (from a given set of facets) of the reference span(s)
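As a rough illustration of Task 1, the sketch below ranks reference sentences by word overlap (Jaccard similarity) with the citance. This is a naive baseline chosen for the example, not the method used by the organizers or by any participating team; the function names `tokenize` and `rank_reference_sentences` are hypothetical. The sentences are taken from the citation mapping example on the earlier slide.

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_reference_sentences(
    citance: str, sentences: List[str], top_k: int = 1
) -> List[Tuple[float, str]]:
    """Return the top_k (score, sentence) pairs by Jaccard overlap with the citance."""
    citance_tokens = tokenize(citance)
    scored = []
    for sentence in sentences:
        sentence_tokens = tokenize(sentence)
        union = citance_tokens | sentence_tokens
        score = len(citance_tokens & sentence_tokens) / len(union) if union else 0.0
        scored.append((score, sentence))
    # Highest overlap first; the sort is stable, so ties keep document order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


citance = (
    "Voorhoeve et al. (116) employed a novel strategy by combining an miRNA "
    "vector library and corresponding bar code array"
)
reference_sentences = [
    "We subsequently created a human miRNA expression library (miR-Lib) by "
    "cloning almost all annotated human miRNAs into our vector (Rfam release 6) "
    "(Figure S3).",
    "Additionally, we made a corresponding microarray (miR-Array) containing all "
    "miR-Lib inserts, which allow the detection of miRNA effects on proliferation.",
    "The cancerous process can be modeled by in vitro neoplastic transformation "
    "assays in primary human cells (Hahn et al., 1999).",
]

best = rank_reference_sentences(citance, reference_sentences, top_k=2)
for score, sentence in best:
    print(f"{score:.3f}  {sentence[:60]}")
```

On these three sentences the two highest-scoring candidates are the two sentences shown as the reference span in the example. Plain overlap also counts uninformative tokens such as "et" and "al", so a practical system would likely need stopword removal or term weighting at minimum.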

Method

Voorhoeve et al. (2006), A Genetic Screen…

In mammals, a near-perfect complementarity between miRNAs and protein coding genes almost never exists, making it difficult to directly pinpoint relevant downstream targets of a miRNA. Several algorithms were developed that predict miRNA targets, most notably TargetScanS, PicTar, and miRanda (John et al., 2004, Lewis et al., 2005 and Robins et al., 2005).

These programs predict dozens to hundreds of target genes per miRNA, making it difficult to directly infer the cellular pathways affected by a given miRNA. Furthermore, the biological effect of the downregulation depends greatly on the cellular context, which exemplifies the need to deduce miRNA functions by in vivo genetic screens in well-defined model systems.

The cancerous process can be modeled by in vitro neoplastic transformation assays in primary human cells (Hahn et al., 1999). Using this system, sets of genetic elements required for transformation were identified. For example, the joint expression of the telomerase reverse transcriptase subunit (hTERT), oncogenic H-RASV12, and SV40-small t antigen combined with the suppression of p53 and p16INK4A were sufficient to render primary human fibroblasts tumorigenic (Voorhoeve and Agami, 2003).

Application: Summarize the reference paper from ordered faceted citances

Voorhoeve et al. (116) employed a novel strategy by combining an miRNA vector library and corresponding bar code array…

miR-372 and miR-373 were consequently found to permit proliferation and tumorigenesis of these primary cells carrying both oncogenic RAS and wild-type p53, probably through direct inhibition of the expression of the tumor-suppressor LATS2 and subsequent neutralization of the p53 pathway.

to identify miRNAs that when overexpressed could substitute for p53 loss and allow continued proliferation in the context of Ras activation

Goal

Method

Result

Conclusion

Conclusion
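The application above, building a summary from citances ordered by facet, can be sketched as follows. This is a minimal illustration under stated assumptions: the facet order (Goal, Method, Result, Conclusion) follows the labels on this slide, the function names `order_by_facet` and `draft_summary` are hypothetical, and simple concatenation with truncation stands in for whatever summarization method a real system would use.

```python
from typing import List, Tuple

# Assumed ordering, taken from the facet labels shown on the slide.
FACET_ORDER = ["Goal", "Method", "Result", "Conclusion"]


def order_by_facet(citances: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Sort (facet, text) pairs by facet order; unknown facets go last.

    sorted() is stable, so citances sharing a facet keep their input order.
    """
    rank = {facet: i for i, facet in enumerate(FACET_ORDER)}
    return sorted(citances, key=lambda pair: rank.get(pair[0], len(FACET_ORDER)))


def draft_summary(citances: List[Tuple[str, str]], max_words: int = 250) -> str:
    """Concatenate the ordered citance texts and cut the draft at max_words."""
    words: List[str] = []
    for _, text in order_by_facet(citances):
        words.extend(text.split())
    return " ".join(words[:max_words])


# Placeholder texts; the facet assignments here are purely illustrative.
example = [
    ("Result", "result citance text"),
    ("Goal", "goal citance text"),
    ("Method", "method citance text"),
    ("Conclusion", "conclusion citance text"),
]
print(draft_summary(example))
```

Truncating at a word boundary can cut a sentence in half; a fuller implementation would probably select whole sentences until the 250-word budget is exhausted.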

Citing Papers

Reference Paper

            Topics   Citations   Reference Spans
Training        20         313             2,567
Test            30         492             4,000
Total           50         805             6,567

Counts of annotations.

                Words   Minimum Words   Maximum Words   Articles
Training    1,791,740           1,526          23,113        220
Test        2,635,267           1,526          23,005        330
Total       4,427,007               -               -        550

Size of Corpus.

No of reference papers

Conclusions

• 15 teams participated in the shared task—significant impact for the first year

• Many valuable lessons learnt about automation of quality control

• Significant code base developed for public release

• Data available free of charge: http://www.nist.gov/tac/2014/BiomedSumm/data.html

Thank You

