Object Orie’d Data Analysis, Last Time

Post on 18-Jan-2016


• SiZer Analysis
– Zooming version, Dependent version
– Mass flux data, Cell cycle data
• Image Analysis
– 1st Generation, 2nd Generation
• Object Representation
– Landmarks
– Boundaries
– Medial

OODA in Image Analysis

First Generation Problems:

• Denoising

• Segmentation (find object boundaries)

• Registration (align objects)

(all about single images)

OODA in Image Analysis

Second Generation Problems:

• Populations of Images

– Understanding Population Variation

– Discrimination (a.k.a. Classification)

• Complex Data Structures (& Spaces)

• HDLSS Statistics

Image Object Representation

Major Approaches for Images:

• Landmark Representations

• Boundary Representations

• Medial Representations

Landmark Representations

Landmarks for fly wing data: [figure]

Landmark Representations

Major Drawback of Landmarks:

• Need to always find each landmark

• Need same relationship

• I.e. Landmarks need to correspond

• Often fails for medical images

• E.g. How many corresponding landmarks on a set of kidneys, livers or brains???

Boundary Representations

Major sets of ideas:

• Triangular Meshes
– Survey: Owen (1998)
• Active Shape Models
– Cootes, et al (1993)
• Fourier Boundary Representations
– Kelemen, et al (1997 & 1999)

Boundary Representations

Example of triangular mesh rep’n:

From: www.geometry.caltech.edu/pubs.html

Boundary Representations

Main Drawback: Correspondence

• For OODA (on vectors of parameters): Need to “match up points”
• Easy to find triangular mesh
– Lots of research on this driven by gamers
• Challenge to match mesh across objects
– There are some interesting ideas…

Medial Representations

Main Idea: Represent Objects as:
• Discretized skeletons (medial atoms)
• Plus spokes from center to edge
• Which imply a boundary

Very accessible early reference:
• Yushkevich, et al (2001)

Medial Representations

2-d M-Rep Example: Corpus Callosum (Yushkevich)

Atoms, Spokes, Implied Boundary

Medial Representations

3-d M-Rep Example: From Ja-Yeon Jeong

Bladder – Prostate - Rectum

Atoms - Spokes - Implied Boundary

Medial Representations

3-d M-reps: there are several variations

Two choices: From Fletcher (2004)

Medial Representations

Statistical Challenge

• M-rep parameters are:
– Locations $\in \mathbb{R}^2, \mathbb{R}^3$
– Radii $> 0$
– Angles (not comparable)
• Stuffed into a long vector
• I.e. many direct products of these

Medial Representations

Statistical Challenge:
• How to analyze angles as data?
• E.g. what is the average of: $3^\circ, 4^\circ, 358^\circ, 359^\circ$
– $181^\circ$ ??? (average of the numbers)
– $1^\circ$ (of course!)
• Correct View of angular data: Consider as points on the unit circle

Medial Representations

What is the average ($181^\circ$?) or ($1^\circ$?) of: $3^\circ, 4^\circ, 358^\circ, 359^\circ$
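The two candidate answers can be reproduced numerically. The following is a minimal sketch, not taken from the slides: the function names are illustrative, and the "points on the unit circle" average is implemented here as the direction of the averaged unit vectors, which is one common convention for a circular mean.

```python
import math

def arithmetic_mean(degrees):
    """Plain average of the numbers, ignoring that they are angles."""
    return sum(degrees) / len(degrees)

def circular_mean(degrees):
    """Direction of the averaged unit vectors, in degrees in [0, 360)."""
    s = sum(math.sin(math.radians(d)) for d in degrees)
    c = sum(math.cos(math.radians(d)) for d in degrees)
    return math.degrees(math.atan2(s, c)) % 360.0

angles = [3.0, 4.0, 358.0, 359.0]
naive = arithmetic_mean(angles)     # treats 359 and 3 as far apart
circular = circular_mean(angles)    # treats them as neighbors on the circle
print(naive, circular)
```

The naive average lands on the opposite side of the circle from all four data points, while the circular version stays among them.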

Medial Representations

Statistical Challenge
• Many direct products of:
– Locations $\in \mathbb{R}^2, \mathbb{R}^3$
– Radii $> 0$
– Angles (not comparable)
• Appropriate View: Data Lie on Curved Manifold, Embedded in higher dim’al Eucl’n Space

Medial Representations

Data on Curved Manifold Toy Example:

Medial Representations

Data on Curved Manifold Viewpoint:
• Very Simple Toy Example (last movie)
• Data on a Cylinder = $\mathbb{R}^1 \times S^1$
• Notes:
– Simplest non-Euclidean Example
– 2-d data, embedded on manifold in $\mathbb{R}^3$
– Can flatten the cylinder, to a plane
– Have periodic representation
– Movie by: Suman Sen
• Same idea for more complex direct prod’s
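The cylinder viewpoint can be sketched in a few lines. This is an illustrative example under assumptions not on the slide: a unit-radius cylinder, points stored as (height, angle in radians), and hypothetical function names. It shows the embedding in 3-d, the flattening back to a periodic plane, and a distance measured along the surface.

```python
import math

def embed(height, theta):
    """Map a cylinder point (height, angle) to its embedding in 3-d."""
    return (math.cos(theta), math.sin(theta), height)

def flatten(point):
    """Inverse of embed: recover (height, angle), with angle in [0, 2*pi)."""
    x, y, z = point
    return (z, math.atan2(y, x) % (2.0 * math.pi))

def geodesic_distance(p, q):
    """Distance along the unit cylinder between two (height, angle) pairs."""
    dh = p[0] - q[0]
    dtheta = abs(p[1] - q[1]) % (2.0 * math.pi)
    dtheta = min(dtheta, 2.0 * math.pi - dtheta)  # go the short way around
    return math.hypot(dh, dtheta)

a = (0.0, math.radians(359.0))
b = (0.0, math.radians(1.0))
roundtrip = flatten(embed(0.5, 1.0))
print(geodesic_distance(a, b), roundtrip)
```

In the flattened plane the two points `a` and `b` sit at opposite edges, yet along the surface they are only 2 degrees of arc apart, which is the periodic representation at work.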

A Challenging Example

• Male Pelvis
– Bladder – Prostate – Rectum
– How do they move over time (days)?
– Critical to Radiation Treatment (cancer)
• Work with 3-d CT
– Very Challenging to Segment
– Find boundary of each object?
– Represent each Object?

Male Pelvis – Raw Data

One CT Slice (in 3d image), labeled: Tail Bone, Rectum, Prostate

Male Pelvis – Raw Data

Prostate: manual segmentation, slice by slice, then reassembled

Male Pelvis – Raw Data

Prostate: slices reassembled in 3d

How to represent?

Thanks: Ja-Yeon Jeong

Object Representation

• Landmarks (hard to find)

• Boundary Rep’ns (no correspondence)

• Medial representations

– Find “skeleton”

– Discretize as “atoms” called M-reps

3-d m-reps

Bladder – Prostate – Rectum (multiple objects, J. Y. Jeong)

• Medial Atoms provide “skeleton”

• Implied Boundary from “spokes” → “surface”

3-d m-reps

M-rep model fitting

• Easy, when starting from binary (blue)

• But very expensive (30 – 40 minutes technician’s time)

• Want automatic approach

• Challenging, because of poor contrast, noise, …

• Need to borrow information across training sample

• Use Bayes approach: prior & likelihood → posterior

• ~Conjugate Gaussians, but there are issues:
– Major HDLSS challenges
– Manifold aspect of data

Mildly Non-Euclidean Spaces

Statistical Analysis of M-rep Data

Recall: Many direct products of:
• Locations
• Radii
• Angles

I.e. points on smooth manifold

Data in non-Euclidean Space

But only mildly non-Euclidean

Mildly Non-Euclidean Spaces

Good source for statistical analysis of Mildly non-Euclidean Data: Fletcher (2004), Fletcher, et al (2004)

Main ideas:
• Work with geodesic distances
• I.e. distances along surface of manifold

Mildly Non-Euclidean Spaces

What is the mean of data on a manifold?
• Bad choice:
– Mean in embedded space
– Since will probably leave manifold
– Think about unit circle
• How to improve?
• Approach: study characterizations of mean
– There are many
– Most fruitful: Fréchet mean

Mildly Non-Euclidean Spaces

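Fréchet mean of numbers:

$$\bar{X} = \arg\min_{x} \sum_{i=1}^{n} (X_i - x)^2$$

Fréchet mean in Euclidean Space ($\mathbb{R}^d$):

$$\bar{X} = \arg\min_{x} \sum_{i=1}^{n} \|X_i - x\|^2 = \arg\min_{x} \sum_{i=1}^{n} d(X_i, x)^2$$

Fréchet mean on a manifold: Replace Euclidean $d$ by Geodesic $d$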
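To make the "replace Euclidean by geodesic" step concrete, here is a minimal sketch for data on a circle. It assumes angles in degrees, arc length as the geodesic distance, and a brute-force grid search over candidate centers; the function names and the grid step are illustrative choices, not something specified in the slides.

```python
def arc_distance(a, b):
    """Geodesic (arc length) distance between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def frechet_objective(x, data):
    """Sum of squared geodesic distances from candidate center x to the data."""
    return sum(arc_distance(x, xi) ** 2 for xi in data)

def frechet_mean_circle(data, step=0.01):
    """Grid search for a minimizer of the Fréchet objective on the circle."""
    n = int(round(360.0 / step))
    candidates = [i * step for i in range(n)]
    return min(candidates, key=lambda x: frechet_objective(x, data))

data = [3.0, 4.0, 358.0, 359.0]
mean = frechet_mean_circle(data)
print(mean, frechet_objective(mean, data))
```

A grid search is crude but makes no smoothness assumptions, which matters because the objective on a circle can have several local minima.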

Mildly Non-Euclidean Spaces

Fréchet Mean:
• Only requires a metric (distance) space
• Geodesic distance gives geodesic mean

Well known in robust statistics:
• Replace Euclidean distance
• With Robust distance, e.g. $L^2$ with $L^1$
• Reduces influence of outliers
• Gives another notion of robust median
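The effect of swapping the squared ($L^2$) criterion for an $L^1$ criterion can be seen on ordinary numbers. This is a small sketch with made-up data and a hypothetical helper name; it minimizes the sum of distances raised to a chosen power over a grid of candidate centers.

```python
def frechet_minimizer(data, candidates, power):
    """Candidate minimizing the sum of |x - x_i| ** power over the data."""
    return min(candidates, key=lambda x: sum(abs(x - xi) ** power for xi in data))

data = [1.0, 2.0, 3.0, 4.0, 100.0]              # one gross outlier
candidates = [i * 0.5 for i in range(0, 201)]   # 0.0, 0.5, ..., 100.0

l2_center = frechet_minimizer(data, candidates, power=2)  # the usual mean
l1_center = frechet_minimizer(data, candidates, power=1)  # a median
print(l2_center, l1_center)
```

The squared criterion is dragged toward the outlier, while the $L^1$ criterion stays with the bulk of the data.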

Mildly Non-Euclidean Spaces

E.g. Fréchet Mean for data on a circle

Mildly Non-Euclidean Spaces

E.g. Fréchet Mean for data on a circle:
• Not always easily interpretable
– Think about “distances along arc”
– Not about “points in $\mathbb{R}^2$”
– Sum of squared distances “strongly feels the largest”
• Not always unique
– But unique “with probability one”
– Non-unique requires strong symmetry
– But possible to have many means
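Non-uniqueness under strong symmetry is easy to exhibit. The sketch below uses two hypothetical, highly symmetric data sets on the circle (angles in degrees), evaluates the sum of squared arc distances on a one-degree grid, and returns every grid point that attains the minimum.

```python
def arc_distance(a, b):
    """Geodesic (arc length) distance between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def all_frechet_means(data, candidates, tol=1e-9):
    """Every candidate whose sum of squared arc distances ties for the minimum."""
    values = [sum(arc_distance(x, xi) ** 2 for xi in data) for x in candidates]
    best = min(values)
    return [x for x, v in zip(candidates, values) if v - best <= tol]

grid = [float(i) for i in range(360)]
two_points = all_frechet_means([0.0, 180.0], grid)
four_points = all_frechet_means([0.0, 90.0, 180.0, 270.0], grid)
print(two_points, four_points)
```

Two antipodal points give two means, and four equally spaced points give four; moving any one data point slightly breaks the tie, which is the sense in which uniqueness holds with probability one.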

Mildly Non-Euclidean Spaces

E.g. Fréchet Mean for data on a circle:
• Not always sensible notion of center
– Sometimes prefer “top & bottom”?
– At end: farthest points from data
• Not continuous Function of Data
– Jump from 1 – 2
– Jump from 2 – 8
• All false for Euclidean Mean
• But all happen generally for manifold data

Mildly Non-Euclidean Spaces

E.g. Fréchet Mean for data on a circle:
• Also of interest is Fréchet Variance:

$$\hat{\sigma}^2 = \min_{x} \frac{1}{n} \sum_{i=1}^{n} d(X_i, x)^2$$

• Works like sample variance
• Note values in movie, reflecting spread in data
• Note theoretical version:

$$\sigma^2 = \min_{x} E_X \, d(X, x)^2$$

• Useful for Laws of Large Numbers, etc.
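A sample Fréchet variance on the circle can be computed with the same grid-search idea. This sketch assumes angles in degrees, arc length as the distance, a one-degree grid, and the $1/n$ normalization; the function names are illustrative.

```python
def arc_distance(a, b):
    """Geodesic (arc length) distance between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def frechet_variance_circle(data, candidates):
    """Minimum over candidate centers of the mean squared arc distance."""
    n = len(data)
    return min(sum(arc_distance(x, xi) ** 2 for xi in data) / n for x in candidates)

grid = [float(i) for i in range(360)]
tight = frechet_variance_circle([3.0, 4.0, 358.0, 359.0], grid)
spread = frechet_variance_circle([0.0, 90.0, 180.0, 270.0], grid)
print(tight, spread)   # in squared degrees
```

As with the ordinary sample variance, tightly clustered data give a small value and widely spread data give a large one.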

Mildly Non-Euclidean Spaces

Useful Viewpoint for data on manifolds:
• Tangent Space
• Plane touching at one point
• At which point? Geodesic (Fréchet) Mean

Hence terminology “mildly non-Euclidean”

(pic next page)

Mildly Non-Euclidean Spaces

Pics from: Fletcher (2004)

Mildly Non-Euclidean Spaces

“Exponential Map” Terminology: From Complex Exponential Function

$$e^{i\theta} = \cos\theta + i \sin\theta$$

Exponential Map: $\theta$ (In Tangent Space) $\mapsto e^{i\theta}$ (On Manifold)

Mildly Non-Euclidean Spaces

Exponential Map Terminology

Memory Trick:
• Exponential Map: Tangent Plane → Curved Manifold
• Log Map (Inverse): Curved Manifold → Tangent Plane
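For a concrete instance of the two maps, the unit sphere has closed-form expressions. The sketch below is an illustration under stated assumptions, not code from the cited sources: points are 3-tuples of unit length, tangent vectors at `p` are 3-tuples orthogonal to `p`, and the log map is only used for points that are not antipodal to `p`.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def exp_map(p, v):
    """Tangent plane -> curved manifold: walk distance |v| from p along v."""
    t = math.sqrt(dot(v, v))
    if t < 1e-12:
        return tuple(p)
    return tuple(math.cos(t) * a + math.sin(t) * b / t for a, b in zip(p, v))

def log_map(p, q):
    """Curved manifold -> tangent plane at p (q assumed not antipodal to p)."""
    c = max(-1.0, min(1.0, dot(p, q)))
    w = tuple(b - c * a for a, b in zip(p, q))   # part of q orthogonal to p
    nw = math.sqrt(dot(w, w))
    if nw < 1e-12:
        return (0.0, 0.0, 0.0)
    theta = math.acos(c)                         # geodesic distance from p to q
    return tuple(theta * x / nw for x in w)

north = (0.0, 0.0, 1.0)
v = (math.pi / 2.0, 0.0, 0.0)    # quarter turn toward the x-axis
q = exp_map(north, v)
back = log_map(north, q)
print(q, back)
```

Applying the log map after the exponential map returns the original tangent vector, which is the inverse relationship in the memory trick.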

Mildly Non-Euclidean Spaces

Analog of PCA? Principal geodesics (PGA):
• Replace line that best fits data
• By geodesic that best fits the data (geodesic through Fréchet mean)
• Implemented as PCA in tangent space
• But mapped back to surface
• Fletcher (2004)
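The "PCA in tangent space, mapped back to surface" recipe can be sketched on the unit sphere. This is a rough illustration under assumptions of my own, not the algorithm as published in Fletcher (2004): the data are synthetic, the Fréchet mean is found by a simple fixed-point iteration, and the first principal direction comes from power iteration on the tangent-space covariance.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def exp_map(p, v):
    """Tangent vector v at p -> point on the unit sphere."""
    t = math.sqrt(dot(v, v))
    if t < 1e-12:
        return tuple(p)
    return tuple(math.cos(t) * a + math.sin(t) * b / t for a, b in zip(p, v))

def log_map(p, q):
    """Point q on the sphere -> tangent vector at p (q not antipodal to p)."""
    c = max(-1.0, min(1.0, dot(p, q)))
    w = tuple(b - c * a for a, b in zip(p, q))   # part of q orthogonal to p
    nw = math.sqrt(dot(w, w))
    if nw < 1e-12:
        return (0.0, 0.0, 0.0)
    return tuple(math.acos(c) * x / nw for x in w)

def frechet_mean(points, iterations=50):
    """Average in the tangent space, map back to the sphere, repeat."""
    mu = points[0]
    for _ in range(iterations):
        logs = [log_map(mu, q) for q in points]
        avg = tuple(sum(v[k] for v in logs) / len(logs) for k in range(3))
        mu = exp_map(mu, avg)
    return mu

def leading_direction(vectors, iterations=200):
    """Top eigenvector of the tangent-space covariance, by power iteration."""
    n = len(vectors)
    cov = [[sum(v[i] * v[j] for v in vectors) / n for j in range(3)]
           for i in range(3)]
    d = (1.0, 1.0, 1.0)  # arbitrary start; assumed not orthogonal to the answer
    for _ in range(iterations):
        d = tuple(sum(cov[i][j] * d[j] for j in range(3)) for i in range(3))
        length = math.sqrt(dot(d, d))
        d = tuple(x / length for x in d)
    return d

def sphere_point(lon_deg, lat_deg):
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))

# Synthetic data: wide spread along the equator, narrow spread in latitude.
data = [sphere_point(lon, 0.0) for lon in (-40.0, -20.0, 20.0, 40.0)]
data += [sphere_point(0.0, lat) for lat in (-5.0, 5.0)]

mean = frechet_mean(data)
tangent_data = [log_map(mean, q) for q in data]      # log map to tangent plane
pg1 = leading_direction(tangent_data)                # PCA step in tangent plane
point_on_pg1 = exp_map(mean, tuple(0.5 * x for x in pg1))  # back to the surface
print(mean, pg1, point_on_pg1)
```

For this data set the first principal geodesic runs along the equator through the Fréchet mean, and points generated along it stay on the sphere by construction.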

PGA for m-reps, Bladder-Prostate-Rectum

Bladder – Prostate – Rectum, 1 person, 17 days

PG 1 PG 2 PG 3

(analysis by Ja Yeon Jeong)


Mildly Non-Euclidean Spaces

Other Analogs of PCA???
• Why pass through geodesic mean?
• Sensible for Euclidean space
• But obvious for non-Euclidean?

Perhaps “geodesic that explains data as well as possible” (no mean constraint)?
• Does this add anything?
• All same for Euclidean case (since least squares fit contains mean)
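The Euclidean remark, that the least squares fit contains the mean, follows from a short calculation. The notation here is introduced for illustration and is not on the slide: the line is $\{a + t u\}$ with unit direction $u$, the data are $x_1, \dots, x_n$ with sample mean $\bar{x}$, and $S(a)$ is the sum of squared orthogonal distances to the line.

```latex
\begin{align*}
S(a) &= \sum_{i=1}^{n} \left\| (I - u u^{\top})(x_i - a) \right\|^2 \\
\nabla_a S(a) &= -2\,(I - u u^{\top}) \sum_{i=1}^{n} (x_i - a)
               = -2n\,(I - u u^{\top})(\bar{x} - a) \\
\nabla_a S(a) = 0 &\iff \bar{x} - a \parallel u
               \iff \bar{x} \in \{a + t u\}
\end{align*}
```

So for any fixed direction the best line passes through $\bar{x}$, and the mean constraint costs nothing in Euclidean space. The argument relies on the projection $I - u u^{\top}$ being linear, which has no direct counterpart on a curved manifold.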

Mildly Non-Euclidean Spaces

E.g. PGA on the unit sphere:

[Figure, built up over several slides: Unit Sphere, Data, Geodesic Mean, PG 1, Best Fit Geodesic]

Mildly Non-Euclidean Spaces

E.g. PGA on the unit sphere:

Which is “best”?
• Perhaps best fit?
• What about PG2?
– Should go through geo mean?
• What about PG3?
– Should cross PG1 & PG2 at same point?
– Need constrained optimization
• Gaussian Distribution on Manifold???