Mining text and data on chemicals Lars Juhl Jensen.

Date post: 21-Dec-2015

Mining text and data on chemicals

Lars Juhl Jensen

three parts

text mining

data integration

medical records

Part 1: text mining

exponential growth

some things are constant

~45 seconds per paper

information retrieval

find the relevant papers

still too much to read

computer

as smart as a dog

teach it specific tricks

named entity recognition

identify the concepts

small molecules

proteins

diseases

comprehensive lexicon

synonyms

orthographic variation

“black list”

unfortunate names
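
The lexicon-based tagging outlined on the preceding slides (comprehensive lexicon, synonyms, orthographic variation, black list) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation behind any tool named in the talk: the lexicon entries, the identifiers, the black-listed name, and the function names `normalize`, `build_index`, and `tag` are all made up for the example.

```python
import re

# Hypothetical toy lexicon: identifier -> known synonyms (illustrative only).
LEXICON = {
    "CHEM:aspirin": ["aspirin", "acetylsalicylic acid"],
    "PROT:CDK1": ["CDK1", "cdc2", "cyclin-dependent kinase 1"],
    "PROT:WAS": ["WAS", "WASP"],
}

# "Unfortunate names" that collide with ordinary words are blocked outright.
BLACK_LIST = {"was"}


def normalize(name):
    """Lowercase and drop spaces/hyphens so 'CDK-1', 'cdk 1' and 'CDK1' compare equal."""
    return re.sub(r"[\s\-]+", "", name.lower())


def build_index(lexicon, black_list):
    """Map every normalized, non-black-listed synonym to its identifier."""
    index = {}
    for identifier, names in lexicon.items():
        for name in names:
            if name.lower() in black_list:
                continue
            index[normalize(name)] = identifier
    return index


def tag(text, index, max_words=4):
    """Greedy longest-match lookup of word n-grams against the index."""
    tokens = list(re.finditer(r"\w+", text))
    hits, i = [], 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            start, end = tokens[i].start(), tokens[i + n - 1].end()
            identifier = index.get(normalize(text[start:end]))
            if identifier:
                hits.append((text[start:end], identifier))
                i += n  # skip past the matched span
                break
        else:
            i += 1  # no match starting at this token
    return hits


INDEX = build_index(LEXICON, BLACK_LIST)
print(tag("Aspirin inhibits CDK-1, which was unexpected.", INDEX))
```

Normalizing both the lexicon and the text is one simple way to absorb orthographic variation; the black list keeps the word "was" from being tagged as a protein.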

Reflect

augmented browsing

browser add-on

Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009
O’Donoghue et al., Journal of Web Semantics, 2010

Firefox

Internet Explorer

Google Chrome

Safari

Utopia Documents

web services

collaboration

SciVerse

information extraction

formalize the facts

co-mentioning
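
As a rough illustration of co-mentioning, the sketch below counts how often two tagged entities appear in the same document. It assumes each document has already been reduced to a set of entity identifiers by a tagging step; the identifiers and the function name `comention_counts` are hypothetical, and any weighting or scoring of the raw counts is left out.

```python
from collections import Counter
from itertools import combinations


def comention_counts(documents):
    """Count, for every unordered entity pair, the documents that mention both.

    documents: iterable of collections of entity identifiers, one per document.
    """
    pair_counts = Counter()
    for entities in documents:
        # Sort so each pair is stored under a single, order-independent key.
        for a, b in combinations(sorted(set(entities)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Made-up example input: three abstracts after entity tagging.
abstracts = [
    {"CHEM:aspirin", "PROT:PTGS1"},
    {"CHEM:aspirin", "PROT:PTGS1", "PROT:PTGS2"},
    {"PROT:PTGS2"},
]
for pair, count in comention_counts(abstracts).most_common():
    print(pair, count)
```

Raw counts like these say only that two names appear together, not what the relation between them is, which is where the NLP approach on the following slides comes in.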

NLP: Natural Language Processing

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

Part 2: data integration

STITCH

Kuhn et al., Nucleic Acids Research, 2012

~300,000 small molecules

~2.6 million proteins

1100+ genomes

experimental data

physical binding

chemical–protein

protein–protein

curated knowledge

drug targets

complexes

pathways

Letunic & Bork, Trends in Biochemical Sciences, 2008

text mining

co-mentioning

NLP: Natural Language Processing

many data types

many databases

different formats

different identifiers

variable quality

not comparable

spread over many genomes

quality scores

von Mering et al., Nucleic Acids Research, 2005

calibrate vs. gold standard

von Mering et al., Nucleic Acids Research, 2005

probabilistic scores

orthology transfer

combine the evidence
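
One common way to combine probabilistic scores from several evidence channels is a noisy-OR, which treats the channels as independent and asks how likely it is that at least one of them is right. The sketch below shows only that arithmetic. Whether the resource described in the talk uses exactly this formula, with or without a correction for prior probability, is not stated on the slides, and the function name `combine_scores` and the example scores are assumptions.

```python
from math import prod


def combine_scores(scores):
    """Noisy-OR combination: 1 - product of (1 - s) over all channel scores.

    Each score is taken to be the calibrated probability that one evidence
    channel (experiments, curated knowledge, text mining, ...) is correct,
    and the channels are assumed to be independent.
    """
    scores = list(scores)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must be probabilities between 0 and 1")
    return 1.0 - prod(1.0 - s for s in scores)


# Made-up scores for one chemical-protein pair from three channels.
print(combine_scores([0.6, 0.4, 0.3]))
```

The combined score is never lower than the strongest single channel, and it rises as more independent channels agree. That only makes sense once the raw scores have been calibrated onto a common probabilistic scale.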

Part 3: patient records

a hard problem

in Danish

by busy doctors

about psychiatric patients

no lexicon

acronyms

typos

delusions

domain specific system

patient record excerpt

F20

F200

Negation

Family
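
The labels above suggest that a mention in a record is only useful once it is known whether it is negated or refers to a family member rather than the patient. A very rough, rule-based sketch of that idea is given below. It is an assumption-laden toy: the cue words are English stand-ins (the records in the talk are Danish), the fixed look-back window is arbitrary, and `classify_mention` is a hypothetical name, not part of the system described.

```python
import re

# Hypothetical English cue words; a real system would need Danish,
# domain-specific cue lists.
NEGATION_CUES = {"no", "not", "denies", "without"}
FAMILY_CUES = {"mother", "father", "brother", "sister", "family"}


def classify_mention(sentence, term, window=4):
    """Label the first occurrence of `term` as 'negated', 'family' or 'patient'.

    Looks only at the `window` words preceding the term. Returns None when
    the term does not occur in the sentence.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    term_tokens = re.findall(r"\w+", term.lower())
    n = len(term_tokens)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == term_tokens:
            context = set(tokens[max(0, i - window):i])
            if context & NEGATION_CUES:
                return "negated"
            if context & FAMILY_CUES:
                return "family"
            return "patient"
    return None


print(classify_mention("The patient denies hallucinations.", "hallucinations"))
print(classify_mention("Mother diagnosed with schizophrenia.", "schizophrenia"))
```

Mentions labelled as negated or as family history would be kept apart from the patient's own findings, so they are not counted as the patient's diagnoses.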

medication

adverse drug events

diagnoses

pharmacovigilance

patient stratification

Roque et al., PLoS Computational Biology, 2011

disease comorbidity

Roque et al., PLoS Computational Biology, 2011

DNA sequencing

genotype

phenotype

Acknowledgments

Reflect: Sune Frankild, Heiko Horn, Evangelos Pafilis, Juan-Carlos Silla-Castro, Michael Kuhn, Reinhardt Schneider, Sean O’Donoghue

STITCH: Michael Kuhn, Damian Szklarczyk, Andrea Franceschini, Milan Simonovic, Alexander Roth, Pablo Minguez, Tobias Doerks, Manuel Stark, Christian von Mering, Peer Bork

EPJ-mining: Francisco S Roque, Peter B Jensen, Robert Eriksson, Henriette Schmock, Marlene Dalgaard, Massimo Andreatta, Thomas Hansen, Karen Søeby, Søren Bredkjær, Anders Juul, Thomas Werge, Søren Brunak

larsjuhljensen

