Using Relational Structure for Learning and Modeling in Biomedical and Social Domains
Mark Goadrich
Computer Science and Mathematics
Centenary College of Louisiana
Natural Science Colloquium
November 6th, 2007
Overview
• First-Order Logic and Machine Learning
– The world is full of Objects
– Model these Objects to understand the world
• Inductive Logic Programming
– Objects and Relations/Properties
• Agent-Based Modeling
– Objects and Interactions/Behaviors
Bongard Problems
• 6 positive examples of a concept on left
• 6 negative examples on right
• How to learn this concept using a computer?
First-Order Logic using PROLOG
Positive Example 3
• Objects
– e3, t1, t2, c1
• Types
– example(e3)
– triangle(t1)
– triangle(t2)
– circle(c1)
• Relations
– has_shape(e3, t1)
– has_shape(e3, t2)
– has_shape(e3, c1)
– inside(t2, c1)
– left(t2, t1)
– size(c1, 2.5)
– above(t2, t1) …
Repeat this process for each example in the dataset
Inductive Logic Programming (ILP)
• Search the space of possible rules “positive(E) :- …”
• Judge rule quality by positive minus negative coverage
positive(E).
positive(E) :- has_shape(E, A).
positive(E) :- has_shape(E, A), triangle(A).
positive(E) :- has_shape(E, A), has_shape(E, B), triangle(A), circle(B), inside(A, B).
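The scoring step can be sketched in Python. This is only an illustration under stated assumptions: the four toy examples (p1, p2, n1, n2) and their facts are invented for this sketch, the three candidate rules are hand-translated from the clauses above, and no real ILP system is being reproduced. It shows how "positive minus negative coverage" ranks increasingly specific rules.

```python
# Toy Bongard-style dataset. Every fact here is a hypothetical illustration,
# not data from the talk. Each example maps to a set of ground facts.
EXAMPLES = {
    "p1": {("has_shape", "p1", "t1"), ("has_shape", "p1", "c1"),
           ("triangle", "t1"), ("circle", "c1"), ("inside", "t1", "c1")},
    "p2": {("has_shape", "p2", "t2"), ("has_shape", "p2", "t3"),
           ("has_shape", "p2", "c2"), ("triangle", "t2"), ("triangle", "t3"),
           ("circle", "c2"), ("inside", "t2", "c2")},
    "n1": {("has_shape", "n1", "c3"), ("has_shape", "n1", "t4"),
           ("circle", "c3"), ("triangle", "t4"), ("inside", "c3", "t4")},
    "n2": {("has_shape", "n2", "c4"), ("circle", "c4")},
}
POSITIVE = {"p1", "p2"}  # the remaining examples are negative


def shapes(facts, example):
    """All objects A such that has_shape(example, A) holds."""
    return [f[2] for f in facts if f[0] == "has_shape" and f[1] == example]


def rule_any_shape(facts, e):
    # positive(E) :- has_shape(E, A).
    return len(shapes(facts, e)) > 0


def rule_has_triangle(facts, e):
    # positive(E) :- has_shape(E, A), triangle(A).
    return any(("triangle", a) in facts for a in shapes(facts, e))


def rule_triangle_in_circle(facts, e):
    # positive(E) :- has_shape(E, A), has_shape(E, B),
    #                triangle(A), circle(B), inside(A, B).
    s = shapes(facts, e)
    return any(("triangle", a) in facts and ("circle", b) in facts
               and ("inside", a, b) in facts
               for a in s for b in s)


def score(rule):
    """Positives covered minus negatives covered."""
    pos = sum(1 for e, f in EXAMPLES.items() if e in POSITIVE and rule(f, e))
    neg = sum(1 for e, f in EXAMPLES.items() if e not in POSITIVE and rule(f, e))
    return pos - neg


for rule in (rule_any_shape, rule_has_triangle, rule_triangle_in_circle):
    print(rule.__name__, score(rule))
```

On this toy data the scores are 0, 1 and 2 respectively: each added literal excludes more negatives while still covering both positives.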
Research Issues in ILP
• Enormous space to search for rules
• Enormous number of examples
• Incorporation of continuous features
• Learning of probabilistic rules
• Evaluation of rule quality
• Survey of ILP domains and future interests
Mutagenesis
• Designing effective and selective drugs
• Represent chemicals as atoms and bonds between them
atm(127, 127_1, c, 22, 0.191)
bond(127, 127_1, 127_6, 7)
• Learned mutagenic rule:
mutagenic(A) :- atm(A, B, c, 27, C), bond(A, D, E, 1), bond(A, B, E, 7).
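To make the learned rule concrete, here is a rough Python sketch of checking it against atom and bond facts. The compounds m1 and m2 and all of their facts are hypothetical, and the argument order (compound, atom, element, type, charge) and (compound, atom1, atom2, bond type) is assumed from the two facts shown above. The sketch matches bond arguments in the stored order, as a literal reading of the clause would.

```python
# Hypothetical facts for two made-up compounds (not from the real dataset).
ATOMS = [  # (compound, atom, element, atom type, charge)
    ("m1", "m1_1", "c", 27, 0.01),
    ("m1", "m1_2", "c", 22, -0.12),
    ("m1", "m1_3", "h", 3, 0.14),
    ("m2", "m2_1", "c", 22, 0.19),
    ("m2", "m2_2", "c", 22, -0.10),
]
BONDS = [  # (compound, atom1, atom2, bond type)
    ("m1", "m1_1", "m1_2", 7),
    ("m1", "m1_3", "m1_2", 1),
    ("m2", "m2_1", "m2_2", 7),
]


def mutagenic(compound):
    """mutagenic(A) :- atm(A, B, c, 27, C), bond(A, D, E, 1), bond(A, B, E, 7)."""
    for a, b, element, atom_type, _charge in ATOMS:
        # atm(A, B, c, 27, C): a carbon atom B of type 27 in compound A
        if a != compound or element != "c" or atom_type != 27:
            continue
        for a2, b2, e, bond_type in BONDS:
            # bond(A, B, E, 7): B is bonded to some atom E with bond type 7
            if a2 != compound or b2 != b or bond_type != 7:
                continue
            # bond(A, D, E, 1): some atom D is bonded to the same E with type 1
            if any(a3 == compound and e3 == e and t == 1
                   for a3, _d, e3, t in BONDS):
                return True
    return False


print(mutagenic("m1"), mutagenic("m2"))
```

With these made-up facts, m1 satisfies the rule and m2 does not, because m2 has no carbon atom of type 27.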
Breast Cancer Detection
• Large dataset of abnormalities found in mammograms
• Not enough radiologists
• Relational features
– More than one abnormality per mammogram
– More than one mammogram per person over time
malignant(A) :- not asymmetric(A), in_same_mammogram(A, A2), spiculated_margin(A2), not distorted(A2).
Robot Scientist
• Represent Metabolic Pathways as a Regulatory Network Graph
• Knock out genes, and then systematically deduce the unknown function
• Try to learn the network from time-series microarray data
Social Networks
• People are connected by friendships into networks
• Each person has likes/dislikes, possibly influenced by their network
• Can we learn your interests based on who you know and what they like? Targeted advertisements?
Netflix Prize
• What movies should Netflix recommend you watch next?
• Large relational dataset
– Movies
– Titles
– Ratings
– Friends
– Friends' ratings
– Genre
• $1 million if you achieve a 10% improvement over their algorithm, Cinematch
Zendo
• Board game about inductive logic
• Master creates a rule which some 3-D pyramid structures fit and others do not
• Players build structures and try to guess the Master's rule
• Easier to design a computer Master that decides if a structure fits the rule
• Harder to design a computer Player that must efficiently guess the rule
Crab Claws
• What physical characteristics distinguish between two species?
• Within the same species, what changes due to growth, diet and their relation to predation?
• Find the “shock graph” of each image
• Use ILP to learn differences based on these graphs
Agent-Based Modeling
• Objects have interactions with each other
– Flocks of Birds, Schools of Fish: Separation, Alignment, Cohesion
• Objects interact with their environment
– Ant Foraging, Pheromones, Traffic Laws
• Agent-Based Modeling (ABM)
– Create discrete-time computational simulation
– Align models with known behavior
– Vary parameters to test new hypotheses
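A discrete-time flocking step built from separation, alignment and cohesion might look like the Python sketch below. It is a minimal illustration, not the simulation used in the talk: the weights, the separation distance and the simplification that every agent sees the whole flock are all assumptions.

```python
import math


def step(boids, sep_dist=1.0, w_sep=0.05, w_ali=0.05, w_coh=0.01):
    """Advance all agents by one time step.

    Each agent is a tuple (x, y, vx, vy). The parameter values are
    arbitrary assumptions chosen for illustration only.
    """
    updated = []
    for i, (x, y, vx, vy) in enumerate(boids):
        others = [b for j, b in enumerate(boids) if j != i]
        nvx, nvy = vx, vy
        if others:
            n = len(others)
            # Cohesion: steer toward the average position of the others.
            cx = sum(b[0] for b in others) / n
            cy = sum(b[1] for b in others) / n
            # Alignment: steer toward the average velocity of the others.
            avx = sum(b[2] for b in others) / n
            avy = sum(b[3] for b in others) / n
            # Separation: steer away from agents closer than sep_dist.
            close = [b for b in others
                     if math.hypot(x - b[0], y - b[1]) < sep_dist]
            sx = sum(x - b[0] for b in close)
            sy = sum(y - b[1] for b in close)
            nvx = vx + w_coh * (cx - x) + w_ali * (avx - vx) + w_sep * sx
            nvy = vy + w_coh * (cy - y) + w_ali * (avy - vy) + w_sep * sy
        # Positions are updated from the new velocities (synchronous update).
        updated.append((x + nvx, y + nvy, nvx, nvy))
    return updated


# Usage: run a small flock for a few steps.
flock = [(0.0, 0.0, 1.0, 0.0), (5.0, 0.0, 0.0, 1.0), (0.0, 5.0, -1.0, 0.0)]
for _ in range(10):
    flock = step(flock)
```

Varying w_sep, w_ali and w_coh between runs is the kind of parameter sweep the last bullet describes: the rules stay fixed while the hypotheses about their relative strength change.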
Cellular Process
Social Science
Conclusions
• First-Order Logic combines with ILP and ABM to create a powerful representation of the world
• Research Opportunities
– Social Networks
– Zendo Player
– Claws and Shock Graphs
– Cellular Simulation
– Social Simulation
– [Insert your favorite dataset here]