Department of Biomedical Informatics
Nanoinformatics: Advancing in silico Cancer Research
David E. Jones
John D. Morgan Award
Research partially supported by NLM Training Grant #T15LM007124
What is Nanotechnology?
• The study of controlling and manipulating matter at the atomic or molecular level
• Focuses on the development of materials, devices, and other structures at the nanoscale
• Very diverse field that bridges multiple sciences
  – Molecular Biology
  – Organic Chemistry
  – Molecular Physics
  – Material Science
http://www.nanoinstitute.utah.edu/
Nanomedicine Defined
• The medical application of nanotechnology used in the diagnosis, treatment, and prevention of diseases in the clinical setting
Science-to-Informatics
Clinical Informatics
Bioinformatics
?
Nanoinformatics
• Defined in 2007 by the United States National Science Foundation
  – Improve research in the field of nanotechnology by using informatics techniques and tools on nanoparticle data and information
http://www.nsf.gov/
Background: Nanoinformatics
• National Nanotechnology Initiative
  – Enhance quality and availability of data
    • Data acquisition, analysis, and sharing
  – Expand theory, modeling, and simulation
    • Structural and predictive models
  – Informatics infrastructure
    • Semantic search and sharing of data/models
    • Web-enabled tools for collaboration
http://www.nano.gov/node/681
Nanomedicine Areas of Focus
http://www.wikipedia.org/
http://www.nanotech-now.com
http://www.universityofcalifornia.edu
Theranostics
In Vitro Detection
Nanocarriers
Why are Nanocarriers so Important?
• Nanomedicine delivery devices are important to the future of cancer treatment
  – Promising due to their properties
    • Suitable size, high solubility, and ability to change design
Tanner P, et al. Polymeric Vesicles: From Drug Carriers to Nanoreactors and Artificial Organelles. 2011.
Why are Nanocarriers so Important?
• Enhanced permeability and retention (EPR) effect
http://krauthammerlab.med.yale.edu/imagefinder/Figure.external?sp=443431&state:Home=BrO0ABXcTAAAAAQAADHNlYXJjaFN0cmluZ3QAEG1pUiogYnJhaW4gaGVhcnQ%3D
Park K. Polysaccharide-based near-infrared fluorescence nanoprobes for cancer diagnosis. 2012.
Types of Nanocarriers
Cho K, et al. Therapeutic Nanoparticles for Drug Delivery in Cancer. 2008.
Poly(amido amine) Dendrimers
• PAMAM dendrimers are particularly promising
  – Have potential for oral delivery
  – Cancer drugs can bind to the surface and interior of the molecule
  – The molecule's surface can easily be modified
http://www.dendritech.com
Design Challenges for Nanocarriers
http://bioserv.rpbs.univ-paris-diderot.fr/services/FAF-Drugs/admetox.html
For Small Molecule Pharmaceutics
• Well-known in silico approaches exist
• Quantitative Structure-Activity Relationships (QSAR)
  – Analyze the structures and functions of pharmaceutical and chemical compounds
    • Used for many different bioactive molecules in the fields of medicinal chemistry and cheminformatics
    • Has seen limited application to empirically calculating the biochemical properties of nanoparticles
Nanoinformatics Challenges
• These approaches have not been applied to nanocarriers for many reasons
  – Availability of nanoparticle data
  – Actual atomic size of the nanoparticle structures
  – Computational capability and algorithms
http://www.nanoinstitute.utah.edu/
Ultimate Goal of this Research
• Demonstrate that in silico aided design of nanocarriers is possible by developing and adapting advanced informatics techniques
• Utilize state-of-the-art data mining and machine learning techniques to develop a model linking PAMAM dendrimer cytotoxicity to the molecular descriptors and structure of the nanoparticle
Where Do We Start?
• Availability of Nanoparticle Data
  – Databases containing information relevant to biomedical nanoparticles are critical for secondary uses such as data mining and predictive modeling
caNanoLab
• Database containing information relevant to nanomedicine on nanoparticles and their properties
• Developed by the National Cancer Institute for sharing nanoparticle information
https://cananolab.nci.nih.gov/caNanoLab/
caNanoLab
• Issues
  – Limited number of nanoparticles (not all-inclusive or current)
  – Incomplete information regarding the chemical and physical properties of nanoparticles
  – No simple way to download the data to apply machine learning or statistical analyses
  – The system cannot be queried, and no data model exists to compare the properties of the molecule to its biochemical activity
Data Not Easily Accessible
• Availability of nanoparticle data
  – To our knowledge, there is no authoritative, up-to-date database
  – Manual extraction is not feasible
Natural Language Processing (NLP)
• Information extraction method
  – Used to automatically extract information from unstructured (free-text) documents
  – Shown to be successful in extracting information in related biomedical fields
http://www.conversational-technologies.com/nldemos/nlDemos.html
Nano-NLP
• Garcia-Remesal, Maojo, and colleagues
  – Text classification method
  – Identified:
    • Nanoparticle names
    • Routes of exposure
    • Toxic effects
    • Particle targets
  – Successful, but qualitative, not quantitative
Our Approach
• Two-step process
Text Classification
Text Extraction
Text Extraction Purpose
• Extract numeric values associated with PAMAM dendrimer properties from the cancer nanomedicine literature
  – NanoSifter
    • 10 properties taken from the NanoParticle Ontology (NPO)
    • Hydrodynamic diameter, particle diameter, molecular weight, zeta potential, cytotoxicity, IC50, cell viability, encapsulation efficiency, loading efficiency, and transfection efficiency
Jones DE, Igo S, Hurdle J, Facelli JC. Automatic Extraction of Nanoparticle Properties Using Natural Language Processing: NanoSifter an Application to Acquire PAMAM Dendrimer Properties. PloS one. 2014;9(1):e83932. Epub 2014/01/07.
Properties to be Extracted
VARIABLE: DEFINITION
Hydrodynamic Diameter: The hydrodynamic size which is the diameter of a particle or molecule (approximated as a sphere) in an aqueous solution.
Particle Diameter: Diameter which inheres in a particle.
Molecular Weight: The sum of the relative atomic masses of the constituent atoms of a molecule.
Zeta Potential: The potential difference between the bulk dispersion medium (liquid) and the stationary layer of liquid near the surface of the dispersed particulate.
Cytotoxicity: Toxicity that impairs or damages cells, and it is a desired property for the killing of growing tumor cells.
IC50: A measure of toxicity which is the concentration of a drug or inhibitor that is required to inhibit a biological process or a participant's activity in that process by half.
Cell Viability: Viability of a cell to proliferate, grow, divide, or repair damaged cell components.
Encapsulation Efficiency: The efficiency inhering in a nanomaterial or supramolecular structure by virtue of its capacity to encapsulate an amount of molecular entity, isotope or nanomaterial.
Loading Efficiency: A quality inhering in a material entity by virtue of it having the capacity to carry an amount of another material entity.
Transfection Efficiency: The efficiency inhering in a bearer's ability to facilitate transfection.
NanoSifter Extraction Pipeline
NanoSifter Performance
Nanoparticle Property Term   TP    FP   FN   Recall   Precision   F-measure
Encapsulation Efficiency     1     0    0    1.00     1.00        1.00
Hydrodynamic Diameter        8     0    0    1.00     1.00        1.00
Loading Efficiency           5     0    0    1.00     1.00        1.00
Zeta Potential               41    0    1    0.98     1.00        0.99
Cytotoxicity                 124   18   1    0.99     0.87        0.93
Molecular Weight             143   23   2    0.99     0.86        0.92
Particle Diameter            211   39   1    1.00     0.84        0.91
IC50                         47    8    1    0.98     0.85        0.91
Cell Viability               78    31   0    1.00     0.72        0.83
Transfection Efficiency      19    13   1    0.95     0.59        0.73
NanoSifter Performance
Type of Average   Recall   Precision   F-measure
Macro             0.99     0.87        0.92
Micro             0.99     0.84        0.91
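The per-property and averaged scores can be recomputed from the TP/FP/FN counts in the performance table. The following is a minimal sketch: the counts are copied from the table, while the function and variable names are illustrative assumptions and not part of NanoSifter. It assumes the macro F-measure is the mean of the per-property F-measures, which is the reading that matches the reported 0.92.

```python
# Counts (TP, FP, FN) copied from the NanoSifter performance table.
COUNTS = {
    "Encapsulation Efficiency": (1, 0, 0),
    "Hydrodynamic Diameter": (8, 0, 0),
    "Loading Efficiency": (5, 0, 0),
    "Zeta Potential": (41, 0, 1),
    "Cytotoxicity": (124, 18, 1),
    "Molecular Weight": (143, 23, 2),
    "Particle Diameter": (211, 39, 1),
    "IC50": (47, 8, 1),
    "Cell Viability": (78, 31, 0),
    "Transfection Efficiency": (19, 13, 1),
}


def prf(tp, fp, fn):
    """Return (recall, precision, F-measure) for one set of counts."""
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return recall, precision, f_measure


def macro_average(counts):
    """Score each property separately, then average the scores."""
    scores = [prf(*c) for c in counts.values()]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


def micro_average(counts):
    """Pool the counts over all properties, then score once."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return prf(tp, fp, fn)


if __name__ == "__main__":
    for label, avg in (("Macro", macro_average(COUNTS)),
                       ("Micro", micro_average(COUNTS))):
        print(label, [round(v, 2) for v in avg])
```

Micro averaging weights every extracted instance equally, so frequent properties such as particle diameter dominate it; macro averaging weights every property equally, so the three small, perfectly scored properties pull it up.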
NanoSifter Observations
• Recall vs. precision
  – A higher recall is desired because it means we are capturing most instances (i.e., missing very few in the literature)
  – The tradeoff is that the number of false positives increases, which in turn reduces the precision
NanoSifter Limitations
• Data extracted by our method are not always directly associated with a dendrimer nanoparticle
• The system only pairs a nanoparticle property term with a single numeric value annotation before and after itself (co-reference resolution)
• Cannot extract data from tables and figures
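The pairing limitation can be illustrated with a small sketch. This is not NanoSifter's implementation; it is a simplified regular-expression stand-in, and the function name, patterns, and example sentence are all hypothetical. It pairs each property term with the nearest number before and after it in a sentence, which is enough to show how a value belonging to one property becomes a candidate for another.

```python
import re

# The ten property terms targeted by the extraction step.
PROPERTY_TERMS = [
    "hydrodynamic diameter", "particle diameter", "molecular weight",
    "zeta potential", "cytotoxicity", "IC50", "cell viability",
    "encapsulation efficiency", "loading efficiency",
    "transfection efficiency",
]

TERM = re.compile("|".join(re.escape(t) for t in PROPERTY_TERMS),
                  re.IGNORECASE)
# A signed integer or decimal that is not glued to a preceding letter or
# digit, so the "50" in "IC50" and the "4" in "G4" are not treated as values.
NUMBER = re.compile(r"(?<![\w.])[-+]?\d+(?:\.\d+)?")


def pair_terms_with_numbers(sentence):
    """Pair each property term with the closest number on either side."""
    numbers = [(m.start(), m.end(), m.group())
               for m in NUMBER.finditer(sentence)]
    pairs = []
    for term in TERM.finditer(sentence):
        before = [n for n in numbers if n[1] <= term.start()]
        after = [n for n in numbers if n[0] >= term.end()]
        pairs.append({
            "term": term.group().lower(),
            "before": before[-1][2] if before else None,
            "after": after[0][2] if after else None,
        })
    return pairs


if __name__ == "__main__":
    # Hypothetical sentence written for this example.
    text = ("The zeta potential of the G4 dendrimer was -25.3 mV and "
            "the particle diameter was 4.5 nm.")
    for pair in pair_terms_with_numbers(text):
        print(pair)
```

In the example sentence, the value -25.3 is returned both as the number after "zeta potential" and as the number before "particle diameter". Only the first pairing is correct, which is the kind of false positive that lowers precision while recall stays high.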
NanoSifter Discussion
• Next steps
  – Continue work on text classification methods to improve the precision of the system
  – Expand the property terms and numeric values that the system targets
  – Annotate and extract information from other subclasses of nanoparticles
  – Add a negation analysis tool to the system
Text Classification Purpose
• Identify and annotate entities in the unstructured nanomedicine literature
  – Augment the text extraction method
  – Improve the precision of extracted property data
Text Classification Pipeline
Now Have the Necessary Data…
• Data mining and predictive modeling
  – Previous studies
    • Liu et al. analyzed a number of attributes of a variety of nanoparticles in order to predict post-fertilization mortality in zebrafish
    • Horev-Azaria and colleagues used predictive modeling to explore the effect of cobalt-ferrite nanoparticles on the viability of seven different cell lines
  – This method has not been applied to empirically predict the cytotoxicity of PAMAM dendrimers
In Silico Platform
Jones DE, Ghandehari H, Facelli JC. Data Mining in Nanomedicine: Predicting Toxicity of PAMAM Dendrimers by Molecular Descriptors and Structure. Submitted 2014.
PAMAM Dendrimers
G3
G4
PAMAM Dendrimers
G5
Molecular Descriptors
Sample Name   Molecular Weight (g/mol)   Aliphatic Atom Count   Refractivity
G3 PAMAM      6908.8403                  484                    1847.28
G4 PAMAM      14214.1651                 996                    3798.47
G5 PAMAM      28824.8147                 2020                   7700.85
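All three descriptors in the table grow in near lockstep with dendrimer generation. The short sketch below makes this visible by computing the ratio of each descriptor between successive generations. Only the numbers come from the table; the dictionary layout and function name are illustrative assumptions.

```python
# Descriptor values copied from the Molecular Descriptors table.
DESCRIPTORS = {
    "G3 PAMAM": {"molecular_weight": 6908.8403,
                 "aliphatic_atom_count": 484,
                 "refractivity": 1847.28},
    "G4 PAMAM": {"molecular_weight": 14214.1651,
                 "aliphatic_atom_count": 996,
                 "refractivity": 3798.47},
    "G5 PAMAM": {"molecular_weight": 28824.8147,
                 "aliphatic_atom_count": 2020,
                 "refractivity": 7700.85},
}


def generation_ratios(descriptors, order):
    """Ratio of each descriptor between consecutive samples in `order`."""
    ratios = {}
    for lower, upper in zip(order, order[1:]):
        ratios[(lower, upper)] = {
            name: descriptors[upper][name] / descriptors[lower][name]
            for name in descriptors[lower]
        }
    return ratios


if __name__ == "__main__":
    order = ["G3 PAMAM", "G4 PAMAM", "G5 PAMAM"]
    for step, values in generation_ratios(DESCRIPTORS, order).items():
        print(step, {k: round(v, 3) for k, v in values.items()})
```

Each descriptor roughly doubles per generation, which suggests these three columns may carry largely overlapping size information when used together as model features.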
Classification Analysis
• Initial analysis
Classifier                      Precision   Recall   F-Measure   Accuracy
J48                             0.838       0.835    0.836       83.5%
Bagging                         0.836       0.835    0.835       83.5%
Filtered Classifier             0.789       0.748    0.750       74.8%
LWL                             0.775       0.738    0.741       73.8%
SMO                             0.738       0.738    0.725       73.8%
Classification via Regression   0.724       0.728    0.723       72.8%
DTNB                            0.691       0.670    0.674       67.0%
NBTree                          0.681       0.670    0.673       67.0%
Decision Table                  0.678       0.660    0.664       66.0%
Naïve Bayes                     0.621       0.602    0.607       60.2%
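In every row of the table the Recall column equals the Accuracy column. That is what one would expect if the reported precision, recall, and F-measure are class-weighted averages, the kind of summary produced by the Weka toolkit, whose classifier names (J48, SMO, DTNB, NBTree) match those listed here. The slides do not state this, so treat it as an assumption. The sketch below computes weighted metrics from a confusion matrix and shows why weighted recall and accuracy coincide; the matrix values are hypothetical and are not taken from the study.

```python
def weighted_metrics(matrix):
    """Class-weighted precision, recall, F-measure, plus accuracy.

    matrix[i][j] is the number of instances of actual class i that
    were predicted as class j.
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n_classes))
    precision_w = recall_w = f_w = 0.0
    for i in range(n_classes):
        support = sum(matrix[i])                 # actual members of class i
        predicted = sum(row[i] for row in matrix)  # predicted as class i
        tp = matrix[i][i]
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        weight = support / total
        precision_w += weight * precision
        recall_w += weight * recall
        f_w += weight * f_measure
    return {
        "precision": precision_w,
        "recall": recall_w,
        "f_measure": f_w,
        "accuracy": correct / total,
    }


if __name__ == "__main__":
    # Hypothetical two-class confusion matrix (toxic vs. non-toxic).
    example = [[50, 10],
               [5, 35]]
    for name, value in weighted_metrics(example).items():
        print(name, round(value, 3))
```

Weighting each class's recall by its share of the data sums the correctly classified instances over the total, which is the definition of accuracy. Under that assumption, the Precision and F-Measure columns carry the information that Accuracy does not.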
Classification Analysis
• Feature selection analysis
Classifier            Precision   Recall   F-Measure   ROC Area   Accuracy
J48                   0.888       0.883    0.884       0.844      88.3%
Filtered Classifier   0.736       0.718    0.722       0.800      71.8%
LWL                   0.819       0.767    0.769       0.834      76.7%
J48 Decision Tree
Regression Analysis
[Figure: Prediction of Cell Viability. Scatter plot of Predicted (y-axis) vs. Actual (x-axis) values with linear fit f(x) = 0.417360044201466 x + 55.0859041898336, R² = 0.493152153805777]
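The fitted line from the regression plot can be evaluated directly to see how the predictions relate to the measured values. The sketch below uses only the slope and intercept reported on the plot; the function names are illustrative assumptions.

```python
# Coefficients of the linear fit reported on the regression plot,
# relating predicted cell viability to actual cell viability.
SLOPE = 0.417360044201466
INTERCEPT = 55.0859041898336


def trend_predicted(actual):
    """Predicted viability on the fitted trend line for a given actual value."""
    return SLOPE * actual + INTERCEPT


def crossover_point():
    """Actual value at which the trend line meets the ideal line y = x."""
    return INTERCEPT / (1.0 - SLOPE)


if __name__ == "__main__":
    for actual in (20, 60, 100):
        print(actual, round(trend_predicted(actual), 1))
    print("crossover:", round(crossover_point(), 1))
```

A perfect model would give a slope of 1 and an intercept of 0. With a slope of about 0.42, the trend line sits above the ideal line for actual viabilities below roughly 94.5 and below it above that point, so low viabilities tend to be overestimated. The R² of about 0.49 indicates that the linear relationship accounts for roughly half of the variance in the predicted values.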
Discussion
• Greatest prediction accuracies were achieved after supplementing the expert-selected features with experimental conditions
• The properties presented in the decision tree diagram represent the more general properties of charge, size, and concentration
• Experimentally, these properties have been hypothesized to be primary causes of cytotoxicity
Conclusion
• The results indicate that data mining and machine learning can be used to predict cytotoxicity and cell viability of PAMAM dendrimers on Caco-2 cells with good accuracy (up to 88.3% classification accuracy with J48 after feature selection)
• Nanoinformatics methods could be implemented to significantly reduce the search space necessary to create suitable PAMAM dendrimers which exhibit less cytotoxicity
References
1. Jain K. The Handbook of Nanomedicine. 1st ed. Totowa, New Jersey: Humana; 2008.
2. Staggers N, McCasky T, Brazelton N, Kennedy R. Nanotechnology: the coming revolution and its implications for consumers, clinicians, and informatics. Nursing outlook. 2008;56(5):268-74. Epub 2008/10/17.
3. de la Iglesia D, Maojo V, Chiesa S, Martin-Sanchez F, Kern J, Potamias G, et al. International efforts in nanoinformatics research applied to nanomedicine. Methods of information in medicine. 2011;50(1):84-95. Epub 2010/11/19.
4. Thomas DG, Pappu RV, Baker NA. NanoParticle Ontology for cancer nanotechnology research. J Biomed Inform. 2011;44(1):59-74. Epub 2010/03/10.
5. National Cancer Institute. caNanoLab. 2011 [cited 2011]. Available from: https://cananolab.nci.nih.gov/caNanoLab/.
6. Hunter L, Lu Z, Firby J, Baumgartner WA, Jr., Johnson HL, Ogren PV, et al. OpenDMAP: an open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression. BMC bioinformatics. 2008;9:78. Epub 2008/02/02.
7. Garcia-Remesal M, Garcia-Ruiz A, Perez-Rey D, de la Iglesia D, Maojo V. Using nanoinformatics methods for automatically identifying relevant nanotoxicology entities from the literature. BioMed research international. 2013;2013:410294. Epub 2013/03/20.
8. Cunningham H, et al. Text Processing with GATE. University of Sheffield Department of Computer Science; 2011.
9. Yang Y. An Evaluation of Statistical Approaches to Text Categorization. Information Retrieval. 1999;1(1-2):69-90.
10. Tropsha A, Golbraikh A. Predictive QSAR modeling workflow, model applicability domains, and virtual screening. Current pharmaceutical design. 2007;13(34):3494-504. Epub 2008/01/29.
11. Liu X, Tang K, Harper S, Harper B, Steevens JA, Xu R. Predictive modeling of nanomaterial exposure effects in biological systems. International journal of nanomedicine. 2013;8 Suppl 1:31-43. Epub 2013/10/08.
12. Horev-Azaria L, Baldi G, Beno D, Bonacchi D, Golla-Schindler U, Kirkpatrick JC, et al. Predictive toxicology of cobalt ferrite nanoparticles: comparative in-vitro study of different cellular models using methods of knowledge discovery from data. Particle and fibre toxicology. 2013;10:32. Epub 2013/07/31.
13. ChemAxon, Berry I, Ruyts B. Future-proofing Cheminformatics Platforms. 2012 10/31/2013:[1-16 pp.]. Available from: http://www.chemaxon.com/wp-content/uploads/2012/04/Future_proofing_cheminformatics_platforms.pdf.
14. ChemAxon Ltd. Marvin. 2013.
15. Witten I, Frank E, Hall M. Data Mining: Practical Machine Learning Tools and Techniques. 3 ed: Morgan Kaufmann Publishers; 2011. 629 p.
16. Vasumathi V, Maiti PK. Complexation of siRNA with Dendrimer: A Molecular Modeling Approach. Macromolecules. 2010;43:8264-74.
17. Karatasos K, Posocco P, Laurini E, Pricl S. Poly(amidoamine)-based dendrimer/siRNA complexation studied by computer simulations: effects of pH and generation on dendrimer structure and siRNA binding. Macromolecular bioscience. 2012;12(2):225-40. Epub 2011/12/08.
Acknowledgements
• Morgan Family
• National Library of Medicine Training Grant
• Department of Biomedical Informatics at the University of Utah
• Ph.D. Committee
  – Julio C. Facelli, Ph.D.
  – Hamidreza S. Ghandehari, Ph.D.
  – John F. Hurdle, M.D., Ph.D.
  – Karen Eilbeck, Ph.D.
  – Bruce E. Bray, M.D.
Questions