SETN 2002, April 11 2002, Thessaloniki, Greece -- George Potamias, ICS/FORTH -- Combined Distance & Feature-Based Clustering of Time-Series: An Application on Neurophysiology
Combined Distance and Feature-Based Clustering of Time-Series: An Application on Neurophysiology
George Potamias
Institute of Computer Science, FORTH
Heraklion, Crete
SETN 2002, April 10-12 2002
Thessaloniki, Greece
The Application Domain
Brain development: a series of events, namely cell proliferation and migration, growth of axons and dendrites, formation of functional connections and synapses, cell death, myelination of axons, and refinement of neuronal specificity.
Adult brain: complex network of fibers; brain nuclei; functional structures.
Knowledge of the underlying mechanisms that govern these complex processes, and the study of histogenesis and neural plasticity during brain development, are critical for understanding the function of the normal or injured brain.
Study & Goal
The late embryonic development of the avian brain was selected for this study.
Biosynthetic activity, such as protein synthesis, underlies brain-development events. The history of in vivo protein-synthesis activity of specific brain areas could:
yield insight into their pattern of maturation;
reveal relationships between distantly located structures;
suggest different roles of the topographically organized brain structures in the maturation processes.
[Figure: avian brain]
Study: the time course of protein-synthesis activity of individual brain areas, as a model to correlate critical periods during development.
Goal: extract the critical relationships that govern the normal ontogenic processes.
Biomedical Background
The late embryonic development between day 11 (E11) and day 19 (E19), as well as post-hatching day 1 (P1), was studied.
During that time, proliferation of neurons has ceased, and cell growth, differentiation, migration and death, axon elongation, refinement of connections, and establishment of functional neuronal networks occur.
To determine biosynthetic activity, the in vivo autoradiographic method of carboxyl-labeled L-leucine (an essential amino acid present in most proteins) was used.
The experimental data concern 30 chick embryos.
Time-Series Representation
49 brain areas (nuclei) were identified.
Autoradiographic film -> image analysis -> intensities
For each area, the mean intensity over all chicks was recorded.
[Figure: Protein Synthesis Patterns; x-axis: Days (E11, E13, E15, E17, E19, P1); y-axis: Intensities (90 to 190)]
The final outcome is a set of 49 time-series, each spanning 6 time-points (five embryonic days and one post-hatching day).
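A minimal sketch of how such a representation could be assembled. It assumes, hypothetically, that image analysis yields a list of per-chick intensity readings for each brain area and developmental age. The area names `area_A` and `area_B` and all numbers are made up for illustration; they are not the study's data.

```python
from statistics import mean

# The six developmental ages named on the slide.
AGES = ["E11", "E13", "E15", "E17", "E19", "P1"]

def mean_series(readings):
    """Turn {age: [intensity per chick]} into a 6-point series of means."""
    return [mean(readings[age]) for age in AGES]

# Made-up readings (illustrative values only, hypothetical area names).
raw = {
    "area_A": {"E11": [100, 110], "E13": [120, 124], "E15": [130, 134],
               "E17": [140, 150], "E19": [150, 158], "P1": [160, 170]},
    "area_B": {"E11": [180, 184], "E13": [170, 172], "E15": [150, 154],
               "E17": [140, 144], "E19": [120, 130], "P1": [110, 114]},
}

# One time-series per brain area, ordered by AGES.
series = {area: mean_series(r) for area, r in raw.items()}
```

With the real data, `series` would hold 49 entries of length 6.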
The Problem
[Figure: the 49 time-series overlaid in one plot; x-axis: E11, E13, E15, E17, E19, P1; y-axis: intensities (40 to 290); legend: brain-area abbreviations]
How to get meaning from the mesh?
How to get indicative developmental patterns?
Method: Discovery of Coherences between Time-Series
Time-series collection
[Figure: the 49 overlaid time-series; x-axis: E11, E13, E15, E17, E19, P1]
… need for hierarchical modeling
Time-series discretization
Compute distances (similarities)
Distance & feature-based hierarchical clustering
Visualize and interpret clustering result(s)
Induce underlying/hidden models = brain-development hierarchy
Time-Series Matching: Problems & Tasks
Use of a normal distance metric … outliers; different scaling factors and baselines?
Need for an adjustable and adaptive time-series matching operation:
Ignore small or non-significant parts
Translate the offset (align vertically)
Amplitude scaling (fixed width)
… then apply the matching metric
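The baseline and scaling problem can be illustrated with a short sketch. It uses plain min-max rescaling as one common way to translate the offset and scale the amplitude to a fixed width; this is an assumption for illustration, not necessarily the operation used in the talk, and the function name `normalize` is hypothetical.

```python
def normalize(series, width=1.0):
    """Shift a series so its minimum is 0 and rescale so its range equals `width`.

    Assumption: min-max rescaling stands in for 'translate the offset'
    plus 'amplitude scaling'; the slides do not give a formula for this step.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series has no amplitude to scale.
        return [0.0 for _ in series]
    return [(x - lo) * width / (hi - lo) for x in series]

# Two made-up series with the same shape but different baselines and scales.
a = [10, 20, 30, 20]
b = [105, 110, 115, 110]

norm_a = normalize(a)
norm_b = normalize(b)
```

A raw Euclidean distance between `a` and `b` is large, while the normalized versions coincide, which is the behaviour an adaptive matching operation is after.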
Time-Series Discretization
Achieves, in a convenient way, amplitude scaling, vertical alignment, and identification of (non-)significant parts.
Example: v2 v2 v2 v2 v4 v1 v3
v1: drastic-increase; v2: increase; v3: decrease; v4: drastic-decrease
4 intervals = 4 nominal values
QDT: Qualitative Discrete Transformation. A new continuous value is assigned to the same discrete value as its preceding values if it belongs to the same population (based on statistical-significance testing).
… the number of discrete intervals is to be specified by the user
Lopez et al., 2000
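Discretization specifics
For a time-series T: {X1, X2, …, Xn}
s: number of discrete values
$$\mathrm{width} = \frac{\max\{X_t\} - \min\{X_t\}}{s}$$
$$v_i = \mathrm{discr}(X_i) = \begin{cases} s & \text{if } X_i = \max\{X_t\} \\ \left\lfloor \dfrac{X_i - \min\{X_t\}}{\mathrm{width}} \right\rfloor + 1 & \text{otherwise} \end{cases}$$
Discrete transform of T: T': {v1, v2, …, vm}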
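A minimal sketch of this equal-width discretization, assuming the bracket in the formula denotes the integer part (floor). Whether the transform is applied to raw values or to successive changes is not settled by the slide, so the function simply discretizes whatever sequence it is given. The function name and the sample values are illustrative only.

```python
import math

def discretize(series, s):
    """Map each value to one of s equal-width intervals, numbered 1..s."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # Assumption: a constant series collapses to a single discrete value.
        return [1] * len(series)
    width = (hi - lo) / s
    out = []
    for x in series:
        if x == hi:
            out.append(s)  # the maximum falls into the last interval
        else:
            # min() guards against floating-point round-up near the maximum
            out.append(min(s, math.floor((x - lo) / width) + 1))
    return out

# Made-up series over six time-points, discretized into s = 4 values.
example = discretize([90, 110, 130, 150, 170, 190], 4)
```

Here `width` is 25, so 90 and 110 fall in interval 1, 130 in interval 2, 150 in interval 3, and 170 and 190 in interval 4.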
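Distance Metric
$$\mathrm{dist}(T_a, T_b) = \mathrm{dist}(T'_a, T'_b) = \frac{\sum_{i=1}^{n} \mathrm{distance}(v_{a,i}, v_{b,i})}{n}$$
$$\mathrm{distance}(v_{a,i}, v_{b,i}) = \begin{cases} 1 & \text{if } v_{a,i} \neq v_{b,i} \\ 0 & \text{otherwise} \end{cases}$$
DTW, Segmentation, …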
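As a sketch, the metric is the fraction of positions at which two discretized series disagree. The function name `dist` follows the slide; the two example sequences are made up.

```python
def dist(ta, tb):
    """Fraction of positions where two equal-length discrete series differ."""
    if len(ta) != len(tb):
        raise ValueError("series must have the same length")
    mismatches = sum(1 for a, b in zip(ta, tb) if a != b)
    return mismatches / len(ta)

# Made-up discretized series: they differ at positions 3 and 7.
t1 = [2, 2, 2, 2, 4, 1, 3]
t2 = [2, 2, 3, 2, 4, 1, 1]
d12 = dist(t1, t2)
```

The value lies between 0 (identical discrete series) and 1 (different at every position).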
Graph Theoretic Hierarchical Clustering: The Basics
Time-series -> nodes
Time-series distance, dist(Ta,Tb) -> weighted edge
Fully connected weighted graph
Minimum Spanning Tree: preserves the minimum distance between time-series; offers the ability to 'isolate' and group nodes
Iterative partitioning: … which sub-group to form? … when to stop?
Hierarchical clustering
Category Utility: a probabilistic metric
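The graph construction can be sketched as follows. The code builds a minimum spanning tree with Prim's algorithm over a symmetric distance matrix and shows that removing one tree edge isolates two groups of nodes. Cutting the heaviest edge is an assumption made for this example; in the slides the choice of partition is guided by Category Utility. The function names and the matrix are hypothetical.

```python
def minimum_spanning_tree(d):
    """Prim's algorithm on a symmetric distance matrix; returns (i, j, weight) edges."""
    n = len(d)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        # Cheapest edge leaving the current tree.
        i, j = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: d[e[0]][e[1]])
        edges.append((i, j, d[i][j]))
        in_tree.add(j)
    return edges

def cut(edges, n, removed):
    """Remove one tree edge and return the two resulting groups of nodes."""
    adj = {k: set() for k in range(n)}
    for e in edges:
        if e != removed:
            adj[e[0]].add(e[1])
            adj[e[1]].add(e[0])
    seen, stack = set(), [removed[0]]
    while stack:
        k = stack.pop()
        if k not in seen:
            seen.add(k)
            stack.extend(adj[k] - seen)
    return sorted(seen), sorted(set(range(n)) - seen)

# Made-up distances between four time-series (nodes 0..3).
d = [[0.0, 0.1, 0.8, 0.9],
     [0.1, 0.0, 0.7, 0.8],
     [0.8, 0.7, 0.0, 0.2],
     [0.9, 0.8, 0.2, 0.0]]

tree = minimum_spanning_tree(d)
heaviest = max(tree, key=lambda e: e[2])
groups = cut(tree, len(d), heaviest)
```

Because the tree has n - 1 edges, there are only n - 1 candidate binary splits to evaluate at each step.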
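Category Utility
$$CU(G_1, G_2, \ldots, G_g) = \frac{1}{g} \sum_{k=1}^{g} p(G_k) \left[ \sum_i \sum_j p(A_i = V_{ij} \mid G_k)^2 - \sum_i \sum_j p(A_i = V_{ij})^2 \right]$$
First inner sum: distribution of feature-values if clustered
Second inner sum: distribution of feature-values if not clustered
Sum over k: over all formed clusters
g: number of formed clusters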
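A minimal sketch of the Category Utility computation. It assumes each feature A_i is a time-point and each value V_ij is one of the discrete values at that time-point; this reading fits the feature-based framing but is not stated explicitly on the slide. The function name and the toy data are illustrative.

```python
from collections import Counter

def category_utility(groups):
    """CU of a partition; each group is a list of equal-length discrete series."""
    all_items = [t for g in groups for t in g]
    n = len(all_items)
    n_feat = len(all_items[0])

    def sum_sq(items):
        # Sum over features i and values j of p(A_i = V_ij)^2 within `items`.
        total = 0.0
        for i in range(n_feat):
            counts = Counter(t[i] for t in items)
            total += sum((c / len(items)) ** 2 for c in counts.values())
        return total

    base = sum_sq(all_items)  # distribution of feature-values if not clustered
    return sum(len(g) / n * (sum_sq(g) - base) for g in groups) / len(groups)

# Made-up discretized series over two time-points.
a, b, c, e = (1, 1), (1, 1), (2, 2), (2, 2)
cu_good = category_utility([[a, b], [c, e]])  # groups of identical series
cu_bad = category_utility([[a, c], [b, e]])   # groups that mix the two patterns
```

The partition that groups identical series scores higher, which is what lets CU rank candidate splits.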
Stopping Criterion
Candidate partitionings: {G11, G12} and {G21, G22}
CU(G11,G12) > CU(G21,G22), so {G11, G12} is the best partitioning
Split G11 into G111, G112: current best CU(G111,G112,G12) < previous best CU(G11,G12) -> STOP
Split G12 into G121, G122: current best CU(G121,G122,G11) > previous best CU(G11,G12) -> continue
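The stopping rule can be sketched as a per-branch loop: a group is split only if the best resulting partition scores higher than the previous best, otherwise that branch stops. The function `refine`, the callables `split_options` and `score`, and the toy scoring function are all hypothetical. In the method described here, `score` would be Category Utility and the candidate splits would come from the minimum spanning tree.

```python
def refine(groups, split_options, score):
    """Split groups greedily while the partition score keeps improving."""
    groups = list(groups)
    best = score(groups)
    pending = list(groups)
    while pending:
        g = pending.pop(0)
        rest = [x for x in groups if x != g]
        options = [rest + [left, right] for left, right in split_options(g)]
        if not options:
            continue  # nothing to split (single-element group)
        candidate = max(options, key=score)
        if score(candidate) > best:
            # Current best beats previous best: accept and continue on the parts.
            groups, best = candidate, score(candidate)
            pending += candidate[-2:]
        # Otherwise: STOP on this branch and keep g whole.
    return groups, best

def contiguous_splits(g):
    """Toy split generator: every way to cut a tuple into two contiguous parts."""
    return [(g[:k], g[k:]) for k in range(1, len(g))]

def toy_score(groups):
    """Toy stand-in for CU: rewards tight groups, penalizes many groups."""
    return -sum(max(g) - min(g) for g in groups) - 1.5 * len(groups)

final_groups, final_score = refine([(1, 2, 10, 11)], contiguous_splits, toy_score)
```

In the toy run, the first split into (1, 2) and (10, 11) improves the score, while splitting either part further lowers it, so both branches stop.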
GTC - Graph Theoretic Clustering: The Procedure
[Figure: the GTC procedure, iterative partitioning with STOP decisions, producing the hierarchical clustering tree]
Complexity: ~O(n^2 F V) (preliminary)
Patterning Brain Developmental Events: The Clusters
Cluster | # Objects | Brain Nuclei (areas)
c1 | 13 | CA, CP, E, FPLa, LC, LPO, Mld, PL, PT, SP, Spi, TPc, VeM
c2 | 20 | Ac, CDL, DL, FPLp, GLv, IO, MM, N, NI, OcM, Ov, Rt, SM, Slu, Tov, nBOR, Loc, PA, PM, RPO
c3 | 16 | AM, Ad, Bas, Cpi, DM, GCt, HV, Hip, Co, POM, SL, Tn, Lli, PP, Imc, SCA
The biosynthetic activities of each cluster's brain areas, over the stamped developmental ages, exhibit no statistically significant deviation from the respective mean of the cluster.
So, the mean of each cluster offers an indicative and representative model for the brain-developmental events … induction of critical relationships between the brain areas.
Patterning Brain Developmental Events: The Patterns
[Figure: mean patterns of clusters c1, c2, c3; x-axis: E11, E13, E15, E17, E19, P1]
C1: Decrease - Increase
C2: Decrease
C3: Increase
Patterning Brain Developmental Events: Hierarchical-Tree Critical Relationships
[Figure: mean patterns of clusters c1, c2, c3 over E11, E13, E15, E17, E19, P1, and the hierarchical tree over c1, c2, c3]
Tree annotations: late maturation; early maturation; early maturation, or control
Patterning Brain Developmental Events: Biomedical Interpretation
[Figure: mean patterns of clusters c1, c2, c3; x-axis: E11, E13, E15, E17, E19, P1]
Clusters {c1}, {c2}: second-order sensory and limbic areas. Decline in protein synthesis: cell death, or cell displacement due to migration, represents a common phenomenon in many brain regions under development.
The two clusters differ significantly at the post-hatching day. {c1}: receive sensory input, increase. {c2}: leucine incorporation is decreased.
Cluster {c3}: somatosensory, motor, and white-matter areas. Increase in protein synthesis: myelination and motor activity.
Conclusion & Future work
The introduced time-series mining methodology (QDT/GTC), and the respective analysis of the history of in vivo protein-synthesis activity of specific brain areas, yields insight into their maturation patterns and reveals relationships between distantly located structures.
The presented study contributes to the identification of the common origin of brain structures and provides possible homologies in the mammalian brain.
Inclusion of additional formulas and procedures for computing the distance between time-series
Experimentation on other application domains in order to validate the approach and examine its scalability to huge collections of time-series (initial experiments on economic time-series are already in progress with encouraging preliminary results)
GTC on the ASL (Australian Sign Language) dataset
A subset of the dataset for the words: “spend”, “lose”, “forget”, “innocent”, “norway”, “happy”, “later”, “eat”, “cold”, “crazy”
Keogh and Pazzani, 1999, 3rd Conf. on Principles & Practice of Knowledge Discovery in Databases
“one vs. another”: word-1 vs. word-2
Method | Result (out of 45)
Euclidean | 2
DTW | 22
SDTW | 21
QDT/GTC | 25