Date posted: 01-Jul-2015
Category: Technology
Uploaded by: alasdair-gray
Scientific lenses to support multiple views over linked Chemistry data
Alasdair J G Gray
[email protected] @gray_alasdair
Open PHACTS
[email protected] @open_phacts
Multiple Identities
P12047 X31045
P12047
GB:29384 RS_2353
21 October 2014 Scientific Lenses – A. J. G. Gray 2
Gleevec®: Imatinib Mesylate
DrugBank, ChemSpider, PubChem
Imatinib
Mesylate
Imatinib Mesylate
YLMAHDNUQAMNNX-UHFFFAOYSA-N
Are these records the same?
It depends upon your task!
Example Use Cases
I need to perform an analysis, give me details of the active compound in Gleevec.
Which targets are known to interact with Gleevec?
Structure Lens
I need to perform an analysis, give me details of the active compound in Gleevec.
skos:exactMatch (InChI)
Strict (Analysing) to Relaxed (Browsing)
Name Lens
Which targets are known to interact with Gleevec?
skos:closeMatch (Drug Name)
skos:closeMatch (Drug Name)
skos:exactMatch (InChI)
Strict (Analysing) to Relaxed (Browsing)
What is a Scientific Lens?
A lens defines a conceptual view over the data
Specifies operational equivalence conditions
Consists of:
• Identifier (URI)
• Title (dct:title)
• Description (dct:description)
• Documentation link (dcat:landingPage)
• Creator (pav:createdBy)
• Timestamp (pav:createdOn)
• Equivalence rules (bdb:linksetJustification)
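The lens metadata listed above could be carried in a small record type. This is a minimal sketch, not the Open PHACTS implementation: the class name `Lens`, the `accepts` helper, the example.org URIs, the title, description, creator and timestamp are all hypothetical placeholders. Only the field-to-property mapping comes from the slide, and treating the two CHEMINF justifications as the rules of a stereoisomer lens is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Lens:
    """Hypothetical record mirroring the lens metadata listed on the slide."""
    uri: str                   # Identifier (URI)
    title: str                 # dct:title
    description: str           # dct:description
    landing_page: str          # dcat:landingPage
    created_by: str            # pav:createdBy
    created_on: datetime       # pav:createdOn
    justifications: frozenset  # bdb:linksetJustification values

    def accepts(self, justification: str) -> bool:
        """True if links carrying this justification are followed under the lens."""
        return justification in self.justifications


# Illustrative values only; none of these URIs or dates come from Open PHACTS.
stereo_lens = Lens(
    uri="http://example.org/lens/stereoisomer",
    title="Stereoisomer Lens",
    description="Treats stereoisomers of a compound as operationally equivalent.",
    landing_page="http://example.org/docs/lenses",
    created_by="http://example.org/people/someone",
    created_on=datetime(2014, 1, 1, tzinfo=timezone.utc),
    # Assumption: a stereoisomer lens activates these two justifications.
    justifications=frozenset({"ci:CHEMINF_000456", "ci:CHEMINF_000461"}),
)
```

Calling `stereo_lens.accepts(...)` with a justification IRI then answers whether a given linkset would be active under that lens.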
Lens Effects: Ibuprofen
Ibuprofen consists of two equally active stereoisomers.
• Stereoisomers not always represented in data
Users wish to retrieve information for any stereoisomer.
CHEMBL427526
CHEMBL521
CHEMBL175
Default Lens
Stereoisomer Lens
Mapping Generation
ops:OPS437281
✔
ops:OPS380297
has_stereoundefined_parent [ci:CHEMINF_000456]
ops:OPS380292
is_stereoisomer_of [ci:CHEMINF_000461]
Other relationships
• has part
• is tautomer of
• uncharged counterpart
• isotope
…
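One way to picture how justified links turn into a lens-specific view is as a graph search that only follows links whose justification the lens allows. The sketch below is illustrative, not the platform's algorithm. The identifiers and justification IRIs appear on the slide, but which pair of identifiers each relationship connects, and the choice to treat links as undirected and transitive, are assumptions made for the example. The function name `equivalents` is hypothetical.

```python
from collections import deque

# (source, target, justification) triples.
# ASSUMPTION: the pairing of identifiers with relationships is invented here
# for illustration; the slide does not spell it out.
LINKS = [
    ("ops:OPS380297", "ops:OPS437281", "ci:CHEMINF_000456"),  # has_stereoundefined_parent
    ("ops:OPS380292", "ops:OPS437281", "ci:CHEMINF_000456"),  # has_stereoundefined_parent
    ("ops:OPS380297", "ops:OPS380292", "ci:CHEMINF_000461"),  # is_stereoisomer_of
]


def equivalents(start, links, allowed):
    """Return every identifier reachable from `start` through links whose
    justification is in `allowed`, treating each link as undirected."""
    neighbours = {}
    for src, dst, why in links:
        if why in allowed:
            neighbours.setdefault(src, set()).add(dst)
            neighbours.setdefault(dst, set()).add(src)
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the allowed links
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# A lens that activates no stereo links versus one that activates both.
default_view = equivalents("ops:OPS380297", LINKS, allowed=set())
stereo_view = equivalents(
    "ops:OPS380297", LINKS, allowed={"ci:CHEMINF_000456", "ci:CHEMINF_000461"}
)
```

With no justifications allowed the view contains only the starting identifier; allowing both stereo justifications pulls in the other two records.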
Explorer Screenshot
OPS Discovery Platform
[Architecture diagram; labels only]
• Apps
• Linked Data API (RDF/XML, TTL, JSON)
• Domain Specific Services
• Semantic Workflow Engine
• Core Platform
• Data Cache (Virtuoso Triple Store)
• Identity Resolution Service
• Identifier Management Service
• Chemistry Registration Normalisation & Q/C
• Indexing
• Data source labels: RDF, Nanopub, Db, VoID
• Content: Public Content, Commercial, Public Ontologies, User Annotations
• Example identifiers: P12374, EC2.43.4, CS4532, “Adenosine receptor 2a”
Lenses: Under the hood
Query Expander Service: takes Q and L1, returns Q’
Identity Mapping Service (BridgeDB): Profiles, Mappings
  Request: cw:979b545d-f9a9, L1
  Response: [cw:979b545d-f9a9, cs:2157, chembl:1280, db:db00945]
Q:
  GRAPH <http://rdf.chemspider.com> {
    cw:979b545d-f9a9 cheminf:logd ?logd .
  }
  GRAPH <http://…
Q’:
  ?iri cheminf:logd ?logd .
  FILTER (?iri = cw:979b545d-f9a9 || ?iri = cs:2157 || ?iri = chembl:1280 || ?iri = db:db00945 )
• IMS call adds overhead
• Call time below human perception [1]
• Can also be achieved through UNION
[1] C. Y. A. Brenninkmeijer, C. Goble, A. J. G. Gray, P. Groth, A. Loizou, and S. Pettifer, “Including Co-referent URIs in a SPARQL Query,” COLD2013, http://ceur-ws.org/Vol-1034/
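The rewrite from Q to Q’ can be sketched as plain string manipulation. This is an illustration under stated assumptions, not the Query Expander Service itself: the function names `expand_with_filter` and `expand_with_union` are hypothetical, the list of co-referent IRIs is hard-coded where a real system would call the Identity Mapping Service, and whitespace tokenisation stands in for proper SPARQL parsing.

```python
def expand_with_filter(pattern, iri, equivalents, var="?iri"):
    """Replace `iri` in a triple pattern with a variable and restrict that
    variable to the co-referent IRIs with a FILTER."""
    tokens = pattern.split()
    if iri not in tokens:
        raise ValueError(f"{iri} does not occur in the pattern")
    rewritten = " ".join(var if tok == iri else tok for tok in tokens)
    condition = " || ".join(f"{var} = {eq}" for eq in equivalents)
    return f"{rewritten}\nFILTER ({condition})"


def expand_with_union(pattern, iri, equivalents):
    """Alternative: one copy of the pattern per co-referent IRI, joined by UNION."""
    tokens = pattern.split()
    if iri not in tokens:
        raise ValueError(f"{iri} does not occur in the pattern")
    branches = [
        "{ " + " ".join(eq if tok == iri else tok for tok in tokens) + " }"
        for eq in equivalents
    ]
    return "\nUNION\n".join(branches)


# Values taken from the slide; in practice MAPPED would be the response to
# an identity mapping request for the IRI under lens L1.
PATTERN = "cw:979b545d-f9a9 cheminf:logd ?logd ."
MAPPED = ["cw:979b545d-f9a9", "cs:2157", "chembl:1280", "db:db00945"]

filtered = expand_with_filter(PATTERN, "cw:979b545d-f9a9", MAPPED)
unioned = expand_with_union(PATTERN, "cw:979b545d-f9a9", MAPPED)
print(filtered)
print(unioned)
```

The FILTER form keeps a single triple pattern, while the UNION form repeats the pattern once per mapped IRI; which one a triple store evaluates faster is an empirical question that this sketch does not address.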
API Hits
April 2013 – March 2014: 15.8 million
April 2014 – Sept 2014: 14 million
Total: 29.8 million
Conclusions
Scientific data is complex and messy
Requires flexibility in linking
Equivalence depends upon context
Lenses provide support for operational equivalence
Chemical structures support automatic computing of links with justification
Co-authors
Royal Society of Chemistry: Colin Batchelor, Karen Karapetyan, Jon Steele, Valery Tkachenko, Antony Williams
University of Manchester: Christian Brenninkmeijer, Ian Dunlop, Carole Goble, Steve Pettifer, Robert Stevens
Swiss Institute for Bioinformatics: Christine Chichester
European Bioinformatics Institute: Mark Davies, Anna Gaulton, John Overington
University of Vienna: Daniela Digles
Maastricht University: Chris Evelo, Andra Waagmeester, Egon Willighagen
VU University of Amsterdam: Paul Groth, Antonis Loizou
Connected Discovery: Lee Harland
Questions
Alasdair J G Gray
[email protected] @gray_alasdair
Open PHACTS
[email protected] @open_phacts
Demo at stall 33 this evening!
Open PHACTS Data
Source         Initial Records   Triples          Properties
ChEMBL         1,481,473         304,360,749      77
DrugBank       19,628            517,584          74
UniProt        564,246           405,473,138      82
ENZYME         6,187             73,838           2
ChEBI          40,575            1,673,863        2
GeneOntology   38,137            2,447,682        26
GOA            661,232           1,765,622,393    15
ChemSpider     1,361,568         215,193,441      23
ConceptWiki    2,828,966         4,291,131        1
WikiPathways   946               1,949,074        34
Over 2.7 billion triples
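As a quick arithmetic check, the per-source triple counts in the table can be summed and compared with the "over 2.7 billion" headline. The dictionary name `TRIPLES` is just a label for this sketch; the numbers are copied from the table.

```python
# Triple counts per source, copied from the Open PHACTS Data table.
TRIPLES = {
    "ChEMBL": 304_360_749,
    "DrugBank": 517_584,
    "UniProt": 405_473_138,
    "ENZYME": 73_838,
    "ChEBI": 1_673_863,
    "GeneOntology": 2_447_682,
    "GOA": 1_765_622_393,
    "ChemSpider": 215_193_441,
    "ConceptWiki": 4_291_131,
    "WikiPathways": 1_949_074,
}

total = sum(TRIPLES.values())
print(f"{total:,} triples ({total / 1e9:.2f} billion)")
```

The sum comes to just over 2.7 billion, which matches the figure quoted on the slide.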
App Ecosystem
An “App Store”?
http://www.openphactsfoundation.org/apps.html
Explorer Explorer2 ChemBioNavigator Target Dossier Pharmatrek Helium
MOE Collector Cytophacts Utopia Garfield SciBite
KNIME Mol. Data Sheets PipelinePilot scinav.it Taverna
Discovery Platform
Drug Discovery Platform
Apps
Domain API
Interactive responses
Production quality integration platform
Method Calls
Linked Data API
Drug
Disease (1.4)
Pathway
Target
https://dev.openphacts.org/