Robust Computer Vision
Homework 5:
Mean Shift for Segmentation and Tracking
Eric Wengrowski
April 5, 2016
1 Mean Shift Segmentation
In this section, we compare image segmentation using Mean Shift and Synergistic Segmentation, which incorporates both Mean Shift and a Weight Map. The 3 Mean Shift parameters are the spatial bandwidth 2hs + 1, the color bandwidth 2hr, and the minimum region size in pixels. The 3 Weight Map parameters used in Synergistic Segmentation are the gradient window size 2n + 1, the mixture parameter, and the threshold for the mixture. The Windows program EDISON.EXE is used to perform the segmentation. The code and binaries can be found here: http://coewww.rutgers.edu/riul/research/code/EDISON/.
In each of the following examples, the relevant area of the image was isolated prior to segmentation.
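Mean Shift segmentation clusters pixels in the joint spatial-range domain: each pixel's (position, gray value) feature vector is moved iteratively to the mean of its neighbors until it converges to a mode, and pixels sharing a mode form a region. A minimal Python sketch of this filtering step (a grayscale toy with a flat kernel, purely illustrative and not the EDISON implementation):

```python
import numpy as np

def mean_shift_point(points, x, hs, hr):
    """One mean-shift iteration in the joint spatial-range domain.
    points: (N, 3) array of (row, col, gray) feature vectors.
    A flat kernel keeps neighbors within spatial radius hs and range radius hr."""
    spatial = np.linalg.norm(points[:, :2] - x[:2], axis=1) <= hs
    rng = np.abs(points[:, 2] - x[2]) <= hr
    neighbors = points[spatial & rng]
    return neighbors.mean(axis=0)

def mean_shift_filter(image, hs=4, hr=20, n_iter=5):
    """Replace each pixel's gray value with its mean-shift mode estimate."""
    rows, cols = np.indices(image.shape)
    points = np.stack([rows.ravel(), cols.ravel(),
                       image.ravel().astype(float)], axis=1)
    out = points.copy()
    for i in range(len(out)):
        x = out[i]
        for _ in range(n_iter):
            x = mean_shift_point(points, x, hs, hr)
        out[i] = x
    return out[:, 2].reshape(image.shape)

# Two flat regions separated by a step edge: filtering smooths each side
img = np.zeros((8, 8), dtype=np.uint8)
img[:, 4:] = 100
img = img + np.random.default_rng(0).integers(0, 5, img.shape).astype(np.uint8)
smoothed = mean_shift_filter(img)
```

Pixels on either side of a strong edge never fall inside each other's range window, so the two regions are smoothed independently. This is why the choice of 2hs + 1 and 2hr controls how finely the image is carved into segments.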
church.jpg
The task was to segment the area around the bell in church.jpg.
Mean Shift Segmentation
Figure 1: The Mean Shift Segmented church bell. The spatial bandwidth 2hs + 1 = 10, the color bandwidth 2hr = 6.5, and the minimum region size = 20 pixels. Notice that the underside of the bell is segmented separately, as is the shadow on the upper right side, and the ring underneath near the opening of the bell.
Synergistic Segmentation
Figure 2: The Synergistic Segmented church bell. The spatial bandwidth 2hs + 1 = 9, the color bandwidth 2hr = 8.5, and the minimum region size = 15 pixels. The gradient window size 2n + 1 = 7, the mixture parameter = 0.1, and the threshold for the mixture = 0.5. Notice that the underside of the bell is still segmented separately, but the ring underneath near the opening of the bell is no longer jointly segmented with the rest of the underside. The shadow on the right face of the bell is still separately segmented, but more of the bright, specular areas on the bell's front face have been removed from the main bell body segment.
fishcoral.jpg
The task was to segment a few coral elements from the reef in fishcoral.jpg.
Mean Shift Segmentation
Figure 3: The Mean Shift Segmented coral elements. The spatial bandwidth 2hs + 1 = 7, the color bandwidth 2hr = 4, and the minimum region size = 90 pixels. Notice that the tips of each coral segment are disjointly segmented from the rest of the coral stems. By and large, the coral stems are jointly segmented with neighboring stems, especially near the base of each stem. For most coral stems, there are several segments that change with distance to the tip.
Synergistic Segmentation
Figure 4: The Synergistic Segmented coral elements. The spatial bandwidth 2hs + 1 = 9, the color bandwidth 2hr = 8.5, and the minimum region size = 50 pixels. The gradient window size 2n + 1 = 4, the mixture parameter = 0.7, and the threshold for the mixture = 0.3. Similar to the Mean Shift segmentation above, the tips of the coral are often separately segmented from the rest of the coral stem. However, two of the coral tips are jointly segmented with the coral stems in this image, but not in the Mean Shift image above. The tips of the coral are usually contained in only 1 or 2 blue segments. The two close coral tips in the mid left of the above image are better separated than in the Mean Shift image.
halfdress.jpg
The task was to segment the shore shack in halfdress.jpg.
Mean Shift Segmentation
Figure 5: The Mean Shift Segmented shore shack. The spatial bandwidth 2hs + 1 = 7, the color bandwidth 2hr = 4.5, and the minimum region size = 90 pixels. The shore shack is uniformly segmented from the sky and forest. The line segment at the top of the building is well preserved on the right, and somewhat preserved on the left.
Synergistic Segmentation
Figure 6: The Synergistic Segmented shore shack. The spatial bandwidth 2hs + 1 = 15, the color bandwidth 2hr = 1.8, and the minimum region size = 90 pixels. The gradient window size 2n + 1 = 4, the mixture parameter = 0.9, and the threshold for the mixture = 0.9. Similar to the Mean Shift Segmented image above, the shore shack is uniformly segmented from the sky and forest. Again, the line segment at the top of the building is well preserved on the right, and somewhat preserved on the left. However, the differences lie towards the base of the structure, where the Synergistic Segmentation includes pixels to the left.
violinist.jpg
The task was to segment the two small cars in violinist.jpg.
Mean Shift Segmentation
Figure 7: The Mean Shift Segmented 2 small cars. The spatial bandwidth 2hs + 1 = 8, the color bandwidth 2hr = 5.5, and the minimum region size = 40 pixels. Sections of both cars are jointly segmented. Additionally, each car body is broken into several segments. The left car's windows are broken into several segments.
Synergistic Segmentation
Figure 8: The Synergistic Segmented 2 small cars. The spatial bandwidth 2hs + 1 = 8, the color bandwidth 2hr = 3.5, and the minimum region size = 90 pixels. The gradient window size 2n + 1 = 5, the mixture parameter = 0.5, and the threshold for the mixture = 0.5. Here, both cars are clearly segmented separately, and for the most part, each car body is segmented into only 1 section. There are separate segmentations of the pixels near and underneath the driver-side mirrors. The windows are still segmented separately from the rest of the car body, and the left car's windows are still broken into several segments.
In general, there is not an enormous difference between the ordinary Mean Shift Segmentation and the Synergistic Segmentation. The biggest difference appears when there is an important local edge, such as the separation of the 2 cars in the last example. Typically, it is difficult to separate those cars with Mean Shift Segmentation without over-segmenting the car bodies.
The mixture parameter and the average threshold are closely related because the threshold is just the sum along the edge.
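As a rough illustration of that relationship (the helper names and the linear blending rule below are assumptions for illustration, not EDISON's actual code), one can think of each boundary pixel receiving a weight that mixes gradient magnitude with edge confidence, and a region boundary surviving merging only if its average blended weight clears the threshold:

```python
import numpy as np

def boundary_weight(grad_mag, confidence, mixture):
    """Blend normalized gradient magnitude and edge confidence into a
    per-pixel edge weight (hypothetical helper; EDISON differs in detail)."""
    return (1.0 - mixture) * grad_mag + mixture * confidence

def keep_boundary(weights_on_edge, threshold):
    """A boundary survives only if the average blended weight along it
    exceeds the threshold."""
    return weights_on_edge.mean() > threshold
```

Under this reading, raising the mixture parameter shifts emphasis from raw gradient strength to edge confidence, and the threshold then decides which of the resulting blended boundaries are strong enough to keep.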
2 Mean Shift Tracking
The mean shift tracking implementation in Matlab was Mean-Shift Video Tracking by Sylvain Bernhardt. The code can be found on the Matlab File Exchange here: https://www.mathworks.com/matlabcentral/fileexchange/35520-mean-shift-video-tracking
The video sequence walking44.mpeg was used as input. We were expected to track 2 people:
1. A man entering the scene from the right about 2.7 seconds into the video, and
2. Another man entering the scene from the left about 6.7 seconds into the video.
The video frames not containing either of these two men were removed.
Contents
• Prune Excess Frames
• Call Mean-Shift Tracking with Kalman Filtering
• Mean-Shift Video Tracking
• Description
• Import movie
• Play the movie
• Variables
• Target Selection in Reference Frame
• Run the Mean-Shift algorithm
• Kalman Filtering
• Export/Show the processed movie
Prune Excess Frames
v = VideoReader('walking44.mpeg');
% 1st Tracking Sequence
v.CurrentTime = 2.7;
k = 1;
clip1_file = VideoWriter('walking44_1.avi','Uncompressed AVI');
open(clip1_file)
while v.CurrentTime <= 6.0
clip1(k).cdata = readFrame(v);
writeVideo(clip1_file,clip1(k).cdata);
k = k+1;
end
close(clip1_file)
% 2nd Tracking Sequence
v.CurrentTime = 6.7;
k = 1;
clip2_file = VideoWriter('walking44_2.avi','Uncompressed AVI');
open(clip2_file)
while v.CurrentTime <= 14.0
clip2(k).cdata = readFrame(v);
writeVideo(clip2_file,clip2(k).cdata);
k = k+1;
end
close(clip2_file)
Call Mean-Shift Tracking with Kalman Filtering
MS_Tracking('walking44_1.avi');
MS_Tracking('walking44_2.avi');
Elapsed time is 0.186746 seconds.
Elapsed time is 0.390448 seconds.
Mean-Shift Video Tracking
by Sylvain Bernhardt, July 2008. Modified by Eric Wengrowski in April 2016.
Description
This is a simple example of how to use the Mean-Shift video tracking algorithm implemented in 'MeanShift Algorithm.m'. It imports the video 'Ball.avi' from the 'Videos' folder and tracks a selected feature in it. The resulting video sequence is played after tracking, but is also exported as an AVI file.
Import movie
tic
[Length,height,width,Movie]=Import_mov(fname);
toc
Play the movie
% Put the figure in the center of the screen,
% without axes and menu bar.
scrsz = get(0,'ScreenSize');
figure('Position',[scrsz(3)/2-width/2 scrsz(4)/2-height/2 width height],...
'MenuBar','none');
axis off
% Image position inside the figure
set(gca,'Units','pixels','Position',[1 1 width height])
% Play the movie
movie(Movie);
Variables
index_start = 1;
% Similarity Threshold
f_thresh = 0.16;
% Number max of iterations to converge
max_it = 5;
% Parzen window parameters
kernel_type = 'Gaussian';
radius = 1;
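In this family of trackers, the similarity measure compared against f_thresh is typically the Bhattacharyya coefficient between the target and candidate color PDFs. A small Python sketch of that measure (illustrative only; the Matlab code computes the equivalent quantity internally):

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two discrete color PDFs:
    1.0 for identical histograms, 0.0 for disjoint support."""
    return float(np.sum(np.sqrt(p * q)))

# Two normalized 4-bin color histograms
p = np.array([0.5, 0.3, 0.2, 0.0])
q = np.array([0.4, 0.4, 0.1, 0.1])
rho = bhattacharyya(p, q)
```

A candidate window whose coefficient against the target falls below f_thresh = 0.16 is treated as a poor match, which is how the tracker detects target loss.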
Target Selection in Reference Frame
if(strcmp(fname,'walking44_1.avi'))
if(exist('clip1patch.mat','file'))
load 'clip1patch.mat';
else
[T,x0,y0,H,W] = Select_patch(Movie(index_start).cdata,0);
save('clip1patch.mat','T','x0','y0','H','W');
end
elseif(strcmp(fname,'walking44_2.avi'))
if(exist('clip2patch.mat','file'))
load 'clip2patch.mat';
else
[T,x0,y0,H,W] = Select_patch(Movie(index_start).cdata,0);
save('clip2patch.mat','T','x0','y0','H','W');
end
else
[T,x0,y0,H,W] = Select_patch(Movie(index_start).cdata,0);
end
%pause(0.2);
Run the Mean-Shift algorithm
Calculation of the Parzen Kernel window
[k,gx,gy] = Parzen_window(H,W,radius,kernel_type,0);
% Conversion from RGB to Indexed colours
% to compute the colour probability functions (PDFs)
[~,map] = rgb2ind(Movie(index_start).cdata,65536);
Lmap = length(map)+1;
T = rgb2ind(T,map);
% Estimation of the target PDF
q = Density_estim(T,Lmap,k,H,W,0);
% Flag for target loss
loss = 0;
% Similarity evolution along tracking
f = zeros(1,(Length-1)*max_it);
% Sum of iterations along tracking and index of f
f_indx = 1;
% Draw the selected target in the first frame
Movie(index_start).cdata = Draw_target(x0,y0,W,H,...
Movie(index_start).cdata,2);
%%%% TRACKING
WaitBar = waitbar(0,'Tracking in progress, be patient...');
% From 1st frame to last one
for t=1:Length-1
% Next frame
I2 = rgb2ind(Movie(t+1).cdata,map);
% Apply the Mean-Shift algorithm to move (x,y)
% to the target location in the next frame.
[x,y,loss,f,f_indx] = MeanShift_Tracking(q,I2,Lmap,...
height,width,f_thresh,max_it,x0,y0,H,W,k,gx,...
gy,f,f_indx,loss);
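The location update inside MeanShift_Tracking follows the standard mean-shift tracking rule: each pixel in the candidate window is weighted by sqrt(q/p) of its color bin, and the window center moves to the weighted mean of the pixel coordinates. A Python sketch of one such step (illustrative; the variable names are assumptions, not the Matlab code's):

```python
import numpy as np

def mean_shift_step(coords, bins, p, q, eps=1e-10):
    """One mean-shift location update for tracking.
    coords: (N, 2) pixel coordinates inside the candidate window.
    bins:   (N,) histogram bin index of each pixel.
    p, q:   candidate and target color PDFs over those bins.
    Pixels whose colors are over-represented in the target relative to
    the candidate pull the window center towards themselves."""
    w = np.sqrt(q[bins] / np.maximum(p[bins], eps))
    return (coords * w[:, None]).sum(axis=0) / w.sum()

# Four pixels in a row; the target color only appears on the right half,
# so the center shifts right.
coords = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [0.0, 3.0]])
bins = np.array([0, 0, 1, 1])
q = np.array([0.0, 1.0])   # target is entirely the second color
p = np.array([0.5, 0.5])   # candidate window is half and half
new_center = mean_shift_step(coords, bins, p, q)
```

Iterating this step until the shift is small (or max_it is reached) converges the window onto the nearby image region that best matches the target histogram.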
Kalman Filtering
My implementation of a Kalman Filter is based on the functions in Matlab's Computer Vision System Toolbox. See: https://www.mathworks.com/help/vision/ref/configurekalmanfilter.html and https://www.mathworks.com/help/control/ug/kalman-filtering.html.
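For reference, the constant-velocity predict/correct cycle that configureKalmanFilter sets up can be sketched in a few lines of Python. This is an illustrative stand-in, with dt and the noise covariances chosen arbitrarily, not the Toolbox implementation:

```python
import numpy as np

class ConstantVelocityKalman:
    """Minimal 2-D constant-velocity Kalman filter.
    State: [x, y, vx, vy]; measurement: [x, y]."""
    def __init__(self, initial_xy, init_err=25.0, q=25.0, r=5.0):
        self.x = np.array([initial_xy[0], initial_xy[1], 0.0, 0.0])
        self.P = init_err * np.eye(4)        # initial estimate error
        dt = 1.0                             # one frame per step
        self.F = np.array([[1, 0, dt, 0],    # state transition
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],     # measure position only
                           [0, 1, 0, 0]], dtype=float)
        self.Q = q * np.eye(4)               # process noise covariance
        self.R = r * np.eye(2)               # measurement noise covariance

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def correct(self, z):
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)   # Kalman gain
        self.x = self.x + K @ (np.asarray(z, float) - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

# Track a point moving right at 2 px/frame
kf = ConstantVelocityKalman([10.0, 20.0])
for t in range(1, 6):
    kf.predict()
    est = kf.correct([10.0 + 2 * t, 20.0])
```

The mean-shift location plays the role of the measurement z here, so the filter smooths the raw detections and can coast through a frame or two of poor matches.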
% Kalman Filtering
detectedLocation = [x,y];
if(t==1) %first iteration
motionModel = 'ConstantVelocity';
initialLocation = detectedLocation;
sigma = 5;
initialEstimateError = sigma.^2*ones(1,2);
%The process noise covariance, Q:
%Q = diag(repmat(MotionNoise, [1, M]))
motionNoise = sigma.^2*ones(1,2);
%The measurement noise covariance, R:
%R = diag(repmat(MeasurementNoise, [1, M])).
measurementNoise = sigma;
kalmanFilter = configureKalmanFilter(motionModel,initialLocation,...
initialEstimateError,motionNoise,measurementNoise);
trackedLocation = correct(kalmanFilter, detectedLocation);
else %not first iteration
predict(kalmanFilter);
trackedLocation = correct(kalmanFilter, detectedLocation);
end
% New Estimates of [x,y] after Kalman Filtering
x = round(trackedLocation(1));
y = round(trackedLocation(2));
% Check for target loss. If true, end the tracking
if loss == 1
break;
else
% Drawing the target location in the next frame
Movie(t+1).cdata = Draw_target(x,y,W,H,Movie(t+1).cdata,2);
% Next frame becomes current frame
y0 = y;
x0 = x;
% Updating the waitbar
waitbar(t/(Length-1));
end
end
close(WaitBar);
%%%% End of TRACKING
Export/Show the processed movie
Export the video sequence as an AVI file in the Videos folder
WaitBar = waitbar(0,'Exporting the output AVI file, be patient...');
tracked_move_name = [fname '.TrackedOutput.avi'];
trackedv = VideoWriter(tracked_move_name,'Uncompressed AVI');
open(trackedv)
%movie2avi(Movie,'Videos\Movie_out','Quality',50);
writeVideo(trackedv,Movie);
close(trackedv)
waitbar(1);
close(WaitBar);
% Put a figure in the center of the screen,
% without menu bar and axes.
scrsz = get(0,'ScreenSize');
figure(1)
set(1,'Name','Movie Player','Position',...
[scrsz(3)/2-width/2 scrsz(4)/2-height/2 width height],...
'MenuBar','none');
axis off
% Image position inside the figure
set(gca,'Units','pixels','Position',[1 1 width height])
% Play the movie
movie(Movie);
Figure 9: Person 1 at the beginning of his tracking sequence.
Figure 10: Person 1 at the end of his tracking sequence.
Figure 11: Person 2 at the beginning of his tracking sequence.
Figure 12: Person 2 at the end of his tracking sequence.
The first person was difficult to track because his background changed greatly throughout his trajectory. The background is similar to the foreground when he steps in front of the lady in black. Because both people have similar colors, the tracking is most vulnerable to error here. For simple cases, background subtraction might help, especially when initially selecting a target window, because it is easy to track the background rather than the intended subject. But background subtraction will probably not help much in the case where paths are crossed with a similarly colored object.
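The simplest form of background subtraction mentioned above can be sketched as follows (Python, illustrative only; a real system would maintain a running background model rather than a single static frame):

```python
import numpy as np

def foreground_mask(frame, background, thresh=25):
    """Simple background subtraction: pixels differing from a static
    background model by more than `thresh` gray levels are foreground."""
    return np.abs(frame.astype(int) - background.astype(int)) > thresh

# A flat background with a bright "person" entering the scene
bg = np.full((6, 6), 50, dtype=np.uint8)
frame = bg.copy()
frame[2:4, 2:4] = 200
mask = foreground_mask(frame, bg)
```

Restricting the initial target window to the foreground mask would make it harder to accidentally lock onto background pixels, though, as noted, this does little when two similarly colored subjects cross paths.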
The second person was difficult to track because he moves towards the camera, so the number of pixels he occupies grows larger. In this implementation, the target window does not change with the relative distance of the intended target. The effects of this can be seen towards the end of the tracking sequence, where the subject's arm and the background are the focus of the target window.
Generally speaking, the first person was more difficult to track for the majority of his path, but the second person was harder to track towards the end of his path.