EE 492 ENGINEERING PROJECT
LIP TRACKING
Yusuf Ziya Işık & Ashat Turlibayev
Advisor: Prof. Dr. Bülent Sankur
Outline
IDENTIFICATION OF THE PROBLEM
LIP CONTOUR EXTRACTION
LIP TRACKING
RESULTS AND CONCLUSION
FUTURE WORK
IDENTIFICATION OF THE PROBLEM
Automatic Speech Recognition (ASR) systems
1. Systems Using Only Acoustic Information
- Poor performance in noisy environments
2. Bimodal Audio-Visual Systems
- The visual signal often contains information that is complementary to the audio information
- Visual information is not affected by acoustic noise
- The overall performance of the combined system is better
[Figure: Recognition ratio of audio, visual, and audio-visual approaches]
LIP READING
Obtaining the visual information is known as the lip reading problem.
Lip tracking is a crucial step in extracting visual features.
LIP TRACKING
The lip tracking problem can be solved in 2 steps:
- Extracting the lip boundary in the first frame with the help of the user
- Tracking the obtained contour through the subsequent frames automatically
Lip Contour Extraction
Fully automatic segmentation is a very difficult task.
Semi-automatic methods are therefore unavoidable, and also desirable.
Intelligent Scissors is a robust, accurate, and interactive semi-automatic boundary extraction tool which requires minimal user input.
Intelligent Scissors I
The Intelligent Scissors tool extracts an object's contour from several seed points specified interactively by the user.
The Intelligent Scissors algorithm converts object boundary extraction into the problem of optimal path search in a weighted graph.
Obtaining Weighted Graph
Weighted Graph: A local cost is calculated from every pixel in the image to each of its neighbouring pixels.
Local Cost Functionals:
- Laplacian zero crossing
- Gradient magnitude
- Gradient direction
Pixels that exhibit strong edge features are made to have low local costs.
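As a rough illustration of how a strong edge becomes a low cost, the sketch below computes only the gradient-magnitude term and maps it to a cost of 1 - G/max(G). The function names, the central-difference gradient, and the toy image are assumptions made for this example; the Laplacian and gradient-direction terms are left out.

```python
import math

def gradient_magnitude(img):
    """Central-difference gradient magnitude of a 2-D list of grey levels."""
    h, w = len(img), len(img[0])
    g = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp indices at the border so every pixel has two neighbours.
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            g[y][x] = math.hypot(gx, gy)
    return g

def gradient_cost(img):
    """Map gradient magnitude to a cost in [0, 1]: strong edge -> low cost."""
    g = gradient_magnitude(img)
    g_max = max(max(row) for row in g)
    if g_max == 0:
        # Flat image: no edge evidence anywhere, so every pixel is equally costly.
        return [[1.0] * len(row) for row in g]
    return [[1.0 - v / g_max for v in row] for row in g]

# Toy image (assumed data): a vertical step edge between columns 1 and 2.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
cost = gradient_cost(image)
```

In this toy image the two columns next to the step get cost 0 and the flat columns get cost 1, which is the behaviour the slide describes.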
Optimal Path Selection
User Interaction: Seed points are specified on the image after all local costs are calculated.
Contour = Minimal Cost Path: The optimal path from every pixel in the image to the seed point is determined using Dijkstra's algorithm.
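The search can be sketched as Dijkstra's algorithm run outward from the seed over an 8-connected pixel grid, storing one back-pointer per pixel. This is a minimal sketch, not the project's implementation: the names `shortest_paths` and `live_wire`, the diagonal step weighting, and the toy cost grid are assumptions.

```python
import heapq

def shortest_paths(cost, seed):
    """Dijkstra from `seed`; returns a back-pointer for every reachable pixel.

    Entering pixel q costs cost[q], scaled by the step length
    (1 for axial moves, sqrt(2) for diagonal moves).
    """
    h, w = len(cost), len(cost[0])
    dist = {seed: 0.0}
    parent = {seed: None}
    heap = [(0.0, seed)]
    done = set()
    while heap:
        d, (y, x) = heapq.heappop(heap)
        if (y, x) in done:
            continue  # stale heap entry
        done.add((y, x))
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy, dx) == (0, 0) or not (0 <= ny < h and 0 <= nx < w):
                    continue
                step = 2 ** 0.5 if dy and dx else 1.0
                nd = d + step * cost[ny][nx]
                if nd < dist.get((ny, nx), float("inf")):
                    dist[(ny, nx)] = nd
                    parent[(ny, nx)] = (y, x)
                    heapq.heappush(heap, (nd, (ny, nx)))
    return parent

def live_wire(parent, free_point):
    """Follow back-pointers from the free point to the seed."""
    path = []
    p = free_point
    while p is not None:
        path.append(p)
        p = parent[p]
    return path

# Toy cost grid (assumed data): the middle column is a strong edge (cost 0).
cost = [
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
]
parent = shortest_paths(cost, seed=(0, 1))
path = live_wire(parent, (2, 1))
```

Because the back-pointers are computed once per seed, the path for any cursor position is just a pointer walk, which is what makes interactive display feasible.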
Live-Wire Tool
Live-Wire Tool: As the user moves the mouse, the optimal path from the free point to the seed point is displayed.
Property of the 'live-wire': If the cursor comes close to an edge, the 'live-wire' snaps to the object boundary.
Extracting the Contour: When a new seed point is specified, the live wire from this point to the previous seed point is taken as a segment of the contour.
Extracting a Lip Contour Using Intelligent Scissors
At every move of the mouse, the previous 'live-wire' is deleted and a new one, beginning at the current cursor position and ending at the seed point, is displayed.
LIP TRACKING
Method 1: Non-Rigid Object Tracking Algorithm
Method 2: Tracking with "Intelligent Scissors"
Method 3: Active Shape Models
Results of Non-Rigid Object Tracking
[Figures: Esra-8 Video Sequence, Aysel-0 Video Sequence, Esra-6 Video Sequence]
Remarks
The overall performance of the algorithm is satisfactory.
Advantage: Ability to track the lips through a large number of frames.
Drawback: The long computation time of this algorithm in closed-loop mode makes it unsuitable for accurate tracking in real-time applications.
Lip Tracking Using Intelligent Scissors
Motivations: A desire to obtain a more accurate and faster lip tracking tool.
Intelligent Scissors can easily be extended from lip segmentation to lip tracking.
Lip Tracking Using Intelligent Scissors
Seed points from the first frame are tracked into the following frames, and the lip contour is then extracted automatically with Intelligent Scissors.
Suitable seed points are located by using a priori information about the lip image.
Used Features:
- Gradient magnitude
- Hue value
- Distance between successive seed points
Gradient Magnitude Feature
The lip region has a larger gradient magnitude than its surrounding region.
The N points with the highest gradient magnitudes (N << M×M, where M is the search range) are the seed candidates.
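The candidate selection can be illustrated as a top-N pick inside an M×M window around the seed's previous position. The function name `seed_candidates`, the window clipping at the image border, and the toy gradient map are assumptions made for this sketch.

```python
import heapq

def seed_candidates(grad, centre, m, n):
    """Return the n pixels with the largest gradient magnitude inside the
    m x m search window centred on `centre` (clipped to the image)."""
    h, w = len(grad), len(grad[0])
    cy, cx = centre
    r = m // 2
    window = [
        (grad[y][x], (y, x))
        for y in range(max(cy - r, 0), min(cy + r + 1, h))
        for x in range(max(cx - r, 0), min(cx + r + 1, w))
    ]
    # nlargest sorts by gradient value first; positions only break ties.
    return [pos for _, pos in heapq.nlargest(n, window)]

# Toy gradient-magnitude map (assumed data) with a strong vertical edge.
grad = [
    [0, 1, 2, 1, 0],
    [1, 3, 9, 4, 1],
    [0, 2, 8, 5, 0],
    [1, 2, 7, 3, 1],
    [0, 1, 2, 1, 0],
]
candidates = seed_candidates(grad, centre=(2, 2), m=3, n=3)
```

Here the three candidates are the strongest pixels of the 3×3 window around (2, 2); the remaining features then choose among them.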
Hue Values
The hue value is very useful for separating the boundary from the inner lip regions.
Selected Seed Point: Of the N points with the largest gradients, the one whose hue triple is most similar to the previous seed's triple is selected.
Hue triple: In addition to the hue of the seed point being tracked, the hues of the neighbours p pixels above and below the current point are calculated.
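A possible reading of the hue-triple comparison is sketched below. The slides do not say how similarity is measured, so the sum of circular hue differences used here is an assumption, as are the function names and the two RGB colours in the toy image.

```python
import colorsys

def hue(rgb):
    """Hue in [0, 1) of an 8-bit RGB pixel."""
    r, g, b = (c / 255.0 for c in rgb)
    return colorsys.rgb_to_hsv(r, g, b)[0]

def hue_triple(img, point, p):
    """Hues p pixels above the point, at the point, and p pixels below it."""
    y, x = point
    last_row = len(img) - 1
    rows = (max(y - p, 0), y, min(y + p, last_row))
    return tuple(hue(img[r][x]) for r in rows)

def hue_distance(t1, t2):
    """Sum of circular hue differences (assumed similarity measure)."""
    return sum(min(abs(a - b), 1.0 - abs(a - b)) for a, b in zip(t1, t2))

def select_seed(img, candidates, previous_triple, p):
    """Pick the candidate whose hue triple is closest to the previous seed's."""
    return min(
        candidates,
        key=lambda c: hue_distance(hue_triple(img, c, p), previous_triple),
    )

# Toy frame (assumed colours): column 0 crosses a skin-to-lip boundary,
# column 1 is skin only.
SKIN, LIP = (224, 172, 150), (200, 60, 80)
frame = [
    [SKIN, SKIN],
    [LIP, SKIN],
    [LIP, SKIN],
]
previous_triple = (hue(SKIN), hue(LIP), hue(LIP))
best = select_seed(frame, [(1, 0), (1, 1)], previous_triple, p=1)
```

The candidate on the boundary column reproduces the previous seed's skin/lip/lip pattern exactly, so it wins over the candidate lying entirely on skin.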
The Distance Between Seed Points
[Figure: The search range of seed point s2 in the following frame]
The relative position of the seed points is very important during tracking. The Intelligent Scissors tool gives wrong results if the seed points get too close to or too far away from each other.
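One simple way to apply this constraint is to discard candidates whose distance to the neighbouring seed falls outside an allowed band. The limits `d_min` and `d_max` and the function name are hypothetical; the slides give no numeric values.

```python
import math

def filter_by_spacing(candidates, neighbour_seed, d_min, d_max):
    """Keep candidates that are neither too close to nor too far from
    the neighbouring seed point (Euclidean distance in pixels)."""
    return [
        c for c in candidates
        if d_min <= math.dist(c, neighbour_seed) <= d_max
    ]

# Assumed example: the neighbouring seed sits at (10, 10).
kept = filter_by_spacing(
    [(10, 10), (10, 12), (10, 30)],
    neighbour_seed=(10, 10),
    d_min=2.0,
    d_max=15.0,
)
```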
Result
[Figure: Result of the "Tracking Using Intelligent Scissors" method applied to the 20-frame lip sequence]
Active Shape Models
Motivations: Lip tracking is a specific case of the general object tracking problem. Therefore, taking knowledge about the shape of the lip into account will increase the performance of a tracker.
Active Shape Models may be used for lip tracking on their own, as well as for complementing and correcting the errors of a tracker based on Intelligent Scissors.
Lip Training Set
The shape of a lip is represented by a set of n 2-D points:
x = {x1, x2, x3, ..., xn, y1, y2, y3, ..., yn}
If there are s training examples in the set, the corresponding s vectors are constructed and brought into the same coordinate frame.
Active Shape Models I
Shape Model: We look for a parametric model x = M(b), where b is a vector of model parameters.
Principal Component Analysis: Helps to reduce the dimensionality of the data.
Covariance matrix S of the shape vectors (x_m denotes the mean shape vector):
$S = \frac{1}{s-1} \sum_{i=1}^{s} (x_i - x_m)(x_i - x_m)^T$
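The covariance computation, S = 1/(s-1) Σ (x_i - x_m)(x_i - x_m)^T with x_m taken as the mean shape, can be written out directly. The sketch below uses plain Python lists so that it runs without dependencies; the function names and the three toy shape vectors (n = 2 points each, stored as x1, x2, y1, y2) are assumptions.

```python
def mean_shape(shapes):
    """Component-wise mean x_m of the s shape vectors."""
    s = len(shapes)
    return [sum(column) / s for column in zip(*shapes)]

def covariance(shapes):
    """S = 1/(s-1) * sum_i (x_i - x_m)(x_i - x_m)^T."""
    s = len(shapes)
    x_m = mean_shape(shapes)
    dim = len(x_m)
    S = [[0.0] * dim for _ in range(dim)]
    for x in shapes:
        d = [a - m for a, m in zip(x, x_m)]
        for i in range(dim):
            for j in range(dim):
                S[i][j] += d[i] * d[j] / (s - 1)
    return S

# Toy training set (assumed data): only the x-coordinate of point 2 varies.
shapes = [
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 4.0, 0.0, 0.0],
    [0.0, 6.0, 0.0, 0.0],
]
x_m = mean_shape(shapes)
S = covariance(shapes)
```

In this toy set all the variance sits in one component, so S has a single non-zero entry on its diagonal.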
Active Shape Models II
Eigenlips: The eigenvectors of S (φi) are computed and the corresponding eigenvalues (λi) are determined.
The matrix Φ is formed, containing the t eigenvectors corresponding to the t largest eigenvalues. Hence:
$x = x_m + \Phi b$
New Lip Shapes: By changing the components of the vector b in a controlled way, we may obtain new plausible lip shapes.
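To make the shape model concrete, the sketch below extracts only the dominant eigenvector of a covariance matrix and moves the mean shape along it, x = x_m + k·sqrt(λ)·φ. Power iteration is used here purely to keep the example dependency-free; it is a stand-in for a full eigen-decomposition and can fail if the start vector happens to be orthogonal to the dominant eigenvector. The matrix S, the mean shape, and the idea of expressing b in units of sqrt(λ) (keeping |k| small, for instance at most 3) are assumptions for illustration.

```python
import math

def mat_vec(S, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in S]

def dominant_mode(S, iterations=200):
    """Largest eigenvalue and its unit eigenvector, by power iteration."""
    dim = len(S)
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = mat_vec(S, v)
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break  # S maps v to zero; nothing to iterate on
        v = [c / norm for c in w]
    lam = sum(a * b for a, b in zip(v, mat_vec(S, v)))
    return lam, v

def vary_mode(x_m, phi, lam, k):
    """New shape x = x_m + b * phi with b = k * sqrt(lam)."""
    return [m + k * math.sqrt(lam) * p for m, p in zip(x_m, phi)]

# Assumed covariance for shape vectors (x1, x2, y1, y2): point 2 moves
# in x and y together, point 1 stays fixed.
S = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 2.0],
]
x_m = [0.0, 4.0, 0.0, 1.0]
lam, phi = dominant_mode(S)
new_shape = vary_mode(x_m, phi, lam, k=1.0)
```

Sweeping k from negative to positive values produces a family of shapes along the strongest mode of variation of the training set.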
Applications of Active Shape Models
1. Determining Visemes of a Language
2. Increasing Robustness of any Tracking Algorithm
3. If the shape model of an object is extracted a priori:
i) To locate the object in the image
ii) To track that object through an image sequence
Visemes of a Language
Determining the viseme of each letter: Using Active Shape Models, the parameter vector b of the lip shape corresponding to a letter of a language is obtained.
Benefits to Speech Recognition: Parameter vectors obtained from an image sequence may be fused with acoustic information, thus increasing the recognition rate.
Contribution of EigenLips to Lip Tracking Algorithms
Lip tracking algorithms may give wrong lip contours for frames far from the first frame.
The shape vector x' of a wrong lip contour is projected into the shape space:
$b' = \Phi^T (x' - x_m)$
Distribution of the parameter vector b:
- If p(b') is larger than a given threshold, the contour is accepted as correct.
- If p(b') is smaller, the closest b vector is assigned to the lip, thus correcting the wrong boundary.
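The correction step might look like the sketch below. The slides do not specify p(b) or how the closest b is found, so two assumptions are made here: p(b) is treated as a Gaussian with independent modes, which turns the threshold on p(b') into a limit `d_max` on the Mahalanobis distance, and an implausible b' is pulled back along its own direction until it reaches that limit. All names and numbers are illustrative.

```python
import math

def project(x, x_m, Phi):
    """b = Phi^T (x - x_m); Phi is stored as a list of t eigenvectors."""
    d = [a - m for a, m in zip(x, x_m)]
    return [sum(p * di for p, di in zip(phi, d)) for phi in Phi]

def correct(b, lambdas, d_max=3.0):
    """Leave plausible b unchanged; otherwise rescale it to the limit."""
    d = math.sqrt(sum(bi * bi / lam for bi, lam in zip(b, lambdas)))
    if d <= d_max:
        return list(b)
    return [bi * d_max / d for bi in b]

def reconstruct(x_m, Phi, b):
    """x = x_m + Phi b."""
    x = list(x_m)
    for bi, phi in zip(b, Phi):
        x = [xi + bi * p for xi, p in zip(x, phi)]
    return x

# Assumed one-mode model for shape vectors (x1, x2, y1, y2).
x_m = [0.0, 4.0, 0.0, 0.0]
Phi = [[0.0, 1.0, 0.0, 0.0]]
lambdas = [4.0]

x_wrong = [0.0, 14.0, 0.0, 0.0]          # tracker output that drifted
b_wrong = project(x_wrong, x_m, Phi)      # 5 standard deviations from the mean
b_fixed = correct(b_wrong, lambdas)
x_fixed = reconstruct(x_m, Phi, b_fixed)
```

Choosing the nearest shape from a stored library of training parameter vectors would be another valid reading of "closest b vector"; the rescaling above is just the shortest to write down.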
Conclusion I
"Intelligent Scissors" is an interactive semi-automatic image segmentation tool.
It may be used for extracting the initial lip boundary as well as for tracking that boundary through an image sequence.
Conclusion II
Non-Rigid Object Tracking Algorithm
- High time complexity
- Tracking through a large number of frames
Tracking with Intelligent Scissors
- More accurate results
- Low time complexity
- Tracking through a small number of frames
Future Works
Active Shape Models
- The library of lip shapes was obtained
- Viseme group for the Turkish language
- Correction of wrong contours
- Extraction & tracking of contours
Future Works II
The method of "Lip Tracking Using Intelligent Scissors" may be made more robust by imposing a shape constraint factor.
Given an image, the region of the lip may be located by using Shape Models.
A lip tracking system based entirely on Active Shape Models may be developed.