Date posted: 17-Jul-2015
The Language Diversity of Computing
Or, how to talk with a computer.
Jeremy Yang (Mgr., Systems & Programming)
Translational Informatics Div., Dept. of Internal Medicine, University of New Mexico
UNM D2K Day -- Feb. 20, 2015
Language Diversity Examples
Python, Perl, Natural Language, Icons, Menus
C++, Java, APIs, SQL, SPARQL
XML, XSD, XPath, URLs, bash
HTML, HTTP, ASCII, UTF-8, regex
TCP/IP, Controlled Vocabularies, REST, OWL, RDF
Languages: Some major advances
● FORTRAN (1953)
● COBOL (1960)
● C (1969)
● SQL (1979)
● WWW (~1990)
● Semantic Web (~2000)
  ○ RDF
  ○ SPARQL
  ○ OWL
● C++ (1979)
● Perl (1987)
● Python (1989)
● Java (1995)
Why do we care about languages?
● Compatibility
● Efficiency
● Usability
● Knowledge representation
● Intelligence
● Evolution
Of course we care. Duh!
Q: So what is the problem?
A: Standards (so many!)
“Why can’t my iPhone talk to my ...”
● TV
● Audio system
● Car
● Medical records
Illuminating the Druggable Genome Knowledge Management Center (IDG-KMC)
Translational Informatics Division
Chief: Tudor Oprea, MD, PhD
IDG-KMC Workflow
IDG-KMC Language Challenge: Case #2: Disease Ontology
17k codes in ICD-9.
155k codes in ICD-10.
Medical “coding” is a huge $ item.
TCRD - Target Central Research Db
Sample query for disease substring “gout”:
+------------+------------+--------+------+--------------------------------------------------------------------+--------+--------+
| doid       | Disease    | zscore | conf | Protein                                                            | idgfam | tdl    |
+------------+------------+--------+------+--------------------------------------------------------------------+--------+--------+
| DOID:13189 | Gout       |  3.512 |  1.8 | alpha-kinase 1                                                     | Kinase | Tmacro |
| DOID:13189 | Gout       |  3.214 |  1.6 | salt-inducible kinase 1                                            | Kinase | Tclin  |
| DOID:13189 | Gout       |  2.922 |  1.5 | melanocortin 3 receptor                                            | GPCR   | Tclin  |
| DOID:13189 | Gout       |  3.036 |  1.5 | olfactory receptor, family 2, subfamily M, member 3                | GPCR   | Tdark  |
| DOID:13189 | Gout       |  2.797 |  1.4 | taste receptor, type 2, member 30                                  | GPCR   | Tdark  |
| DOID:13189 | Gout       |  2.576 |  1.3 | taste receptor, type 2, member 16                                  | GPCR   | Tmacro |
| DOID:13189 | Gout       |  2.379 |  1.2 | hepatocyte nuclear factor 4, gamma                                 | NR     | Tgray  |
| DOID:13189 | Gout       |  2.441 |  1.2 | spleen tyrosine kinase                                             | Kinase | Tclin  |
| DOID:13189 | Gout       |  1.948 |  1.0 | protein kinase, cGMP-dependent, type II                            | Kinase | Tclin  |
| DOID:13189 | Gout       |  1.798 |  0.9 | pannexin 1                                                         | IC     | Tmacro |
| DOID:13189 | Gout       |  1.531 |  0.8 | transient receptor potential cation channel, subfamily V, member 1 | IC     | Tclin+ |
| DOID:13189 | Gout       |  1.517 |  0.8 | taste receptor, type 2, member 38                                  | GPCR   | Tmacro |
| DOID:13189 | Gout       |  1.565 |  0.8 | transient receptor potential cation channel, subfamily A, member 1 | IC     | Tclin+ |
| DOID:13189 | Gout       |  1.375 |  0.7 | transient receptor potential cation channel, subfamily M, member 3 | IC     | Tmacro |
| DOID:13189 | Gout       |  1.427 |  0.7 | interleukin-1 receptor-associated kinase 1                         | Kinase | Tclin  |
| DOID:13189 | Gout       |  1.388 |  0.7 | adenosine kinase                                                   | Kinase | Tclin  |
| DOID:1156  | Pseudogout |  2.410 |  1.2 | chloride channel, voltage-sensitive Kb                             | IC     | Tmacro |
| DOID:1156  | Pseudogout |  2.423 |  1.2 | purinergic receptor P2Y, G-protein coupled, 11                     | GPCR   | Tchem  |
| DOID:1156  | Pseudogout |  1.742 |  0.9 | calcium-sensing receptor                                           | GPCR   | Tclin+ |
| DOID:1156  | Pseudogout |  1.767 |  0.9 | activin A receptor, type I                                         | Kinase | Tclin  |
| DOID:1156  | Pseudogout |  1.454 |  0.7 | purinergic receptor P2Y, G-protein coupled, 2                      | GPCR   | Tclin+ |
+------------+------------+--------+------+--------------------------------------------------------------------+--------+--------+
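The slide shows only the result set, not the SQL that produced it. As a rough sketch of what a disease-substring query might look like, the example below uses Python's built-in sqlite3 module with an in-memory table. The table name disease_target, its flat single-table layout, and the sort order are assumptions made for illustration; they are not the actual TCRD schema. The three rows are copied from the TCRD output for “gout”.

```python
import sqlite3

# Three rows copied from the slide's result set. The flat layout is an
# assumption; the real TCRD schema is not shown in the slides.
ROWS = [
    ("DOID:13189", "Gout", 3.512, 1.8, "alpha-kinase 1", "Kinase", "Tmacro"),
    ("DOID:13189", "Gout", 3.214, 1.6, "salt-inducible kinase 1", "Kinase", "Tclin"),
    ("DOID:1156", "Pseudogout", 2.423, 1.2,
     "purinergic receptor P2Y, G-protein coupled, 11", "GPCR", "Tchem"),
]


def make_demo_db():
    """Build an in-memory database with a hypothetical disease_target table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE disease_target ("
        "doid TEXT, disease TEXT, zscore REAL, conf REAL, "
        "protein TEXT, idgfam TEXT, tdl TEXT)"
    )
    conn.executemany(
        "INSERT INTO disease_target VALUES (?, ?, ?, ?, ?, ?, ?)", ROWS
    )
    return conn


def find_targets(conn, substring):
    """Return rows whose disease name contains the given substring.

    The substring is passed as a bound parameter rather than pasted into
    the SQL text. SQLite's LIKE is case-insensitive for ASCII letters, so
    "gout" matches both "Gout" and "Pseudogout".
    """
    sql = (
        "SELECT doid, disease, zscore, conf, protein, idgfam, tdl "
        "FROM disease_target "
        "WHERE disease LIKE ? "
        "ORDER BY conf DESC, zscore DESC"  # ordering is an assumption
    )
    return conn.execute(sql, (f"%{substring}%",)).fetchall()


conn = make_demo_db()
hits = find_targets(conn, "gout")
for row in hits:
    print(row)
```

Because the match is on a substring, a search for “gout” also returns Pseudogout, a different disease with its own DOID. That is one small instance of the vocabulary problem this case study describes.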