PyData NYC by Akira Shibata

Post on 10-Jul-2015


description

Deck used for my talk at PyData NYC, in which I described how we improved thumbnail cropping in our news app, Kamelio. We used Deep Learning object detection to identify the interesting regions of an image, which were then fed into the image cropping logic.

transcript

Copyright 2014 Shiroyagi Corporation. All rights reserved.

Akira Shibata, PhD

Putting Together World's Best Data Processing Research with Python

Shiroyagi Corporation


Who am I

Akira Shibata, PhD.

TW: @punkphysicist

CEO, Shiroyagi Corporation (shiroyagi.co.jp)

Kamelio: Personalised News Curation

Kamect: Contents Discovery Platform

2004 - 2010:

Data Scientist @ NYU

Statistical data modelling @ LHC, CERN

2010 - 2013

Boston Consulting Group


Statistical modelling of Physics data

Confirmatory: Highly theory driven model building


Telling discovery from noise

The model tells you the expected uncertainty


Kamelio

“Deep Learning”

“Internet of Things”

“Global Strategy”

“Medical IT”

Collects news across more than 3M topics to choose from


“Cats”

“Anime”

“Cats' reaction to sighting dogs for the first time”


Python puts all our tools together

Pipeline steps and the tools behind them (all driven from IPython and Python scripts):

0. Image in
1. Detect regions (Matlab + SciPy)
2. Object recognition (C++ + libraries)
3. Scoring (NumPy)
4. Cropping (NumPy, PIL)


Our approach is heavily influenced by Berkeley Vision and Learning Center

Acknowledgement


Step 1: Detect regions


Region detection: telling the recogniser where to look

How do we find regions to feed into object recognition? Default strategy was to look at the center


Exhaustive windows -> segmentation

Search over position, scale, aspect ratio

Grouping parts of image at different scales

Exhaustive search is far too slow for use with Deep Learning
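To get a feel for why exhaustive search is impractical, the sketch below counts the windows a sliding-window search would visit. The image size, stride, scales and aspect ratios are illustrative assumptions, not values from the talk, and `count_exhaustive_windows` is a hypothetical helper.

```python
def count_exhaustive_windows(width, height, scales, aspect_ratios, stride):
    """Count sliding windows over position, scale and aspect ratio."""
    total = 0
    for scale in scales:                 # window height in pixels
        for ratio in aspect_ratios:      # window width / window height
            win_h = scale
            win_w = int(round(scale * ratio))
            if win_w > width or win_h > height:
                continue                 # window does not fit in the image
            n_x = (width - win_w) // stride + 1
            n_y = (height - win_h) // stride + 1
            total += n_x * n_y
    return total


# Assumed example grid: a 640x480 image, three scales, three aspect ratios.
n_windows = count_exhaustive_windows(
    width=640, height=480,
    scales=[64, 128, 256],
    aspect_ratios=[0.5, 1.0, 2.0],
    stride=8,
)
print(n_windows)
```

Even this coarse grid yields tens of thousands of windows, each of which would need a pass through the network, which is why a proposal method based on segmentation is attractive.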


Region detection: in practice

1. Install Matlab and the Selective Search code from its authors

2. Run Matlab as a subprocess

    Command line (held in mc):

    matlab -nojvm -r "try; selective_search({'image_file.jpg'}, 'output.mat'); catch; exit; end; exit"

    Launch it from Python:

    import shlex
    import subprocess

    pid = subprocess.Popen(shlex.split(mc), stdout=open('/dev/null', 'w'), cwd=script_dirname)

3. Import the output using scipy.io

    import numpy as np
    import scipy.io

    all_boxes = list(scipy.io.loadmat('output.mat')['all_boxes'][0])
    # Matlab indices are 1-based: shift the first two coordinates to 0-based
    subtractor = np.array((1, 1, 0, 0))[np.newaxis, :]
    all_boxes = [boxes - subtractor for boxes in all_boxes]
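The variable `mc` on the slide holds the Matlab command line. A minimal sketch of how it could be assembled and tokenised follows; `build_matlab_command` is a hypothetical helper and the file names are the placeholders from the slide. Launching Matlab itself is left as a comment, since it needs a local install.

```python
import shlex


def build_matlab_command(image_fname, output_fname):
    """Build the Matlab call from the slide for one image (hypothetical helper)."""
    matlab_expr = (
        "try; selective_search({{'{0}'}}, '{1}'); catch; exit; end; exit"
        .format(image_fname, output_fname)
    )
    # Double quotes keep the whole -r expression together as one argument.
    return 'matlab -nojvm -r "{0}"'.format(matlab_expr)


mc = build_matlab_command('image_file.jpg', 'output.mat')
args = shlex.split(mc)
print(args)
# With Matlab installed, the call from the slide would then be:
#   subprocess.Popen(args, stdout=open('/dev/null', 'w'), cwd=script_dirname)
```

Because the Matlab expression is wrapped in double quotes, `shlex.split` passes it to `-r` as a single argument with its inner single quotes intact.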


Region detection: proposals generated

~200 proposals generated per image


Step 2: Object recognition


Object recognition

Deep Blue beat Kasparov at chess in 1997…


Deep Learning: Damn good at it


Convolutional Neural Network


Caffe: open-source CNN framework under rapid development

C++/CUDA with Python wrapper


Pre-trained models published

We used the 200-category object recognition model developed for the 2013 ImageNet Challenge


Object recognition: in practice

1. Install a bunch of libraries and Caffe: CUDA, Boost, OpenCV, BLAS…

2. Import the wrapper and configure

    import caffe
    import numpy as np

    MODEL_FILE = 'models/bvlc_…_ilsvrc13/deploy.prototxt'
    PRETRAINED_FILE = 'models/…/bvlc_…_ilsvrc13.caffemodel'
    MEAN_FILE = 'caffe/imagenet/ilsvrc_2012_mean.npy'
    detector = caffe.Detector(MODEL_FILE, PRETRAINED_FILE,
                              mean=np.load(MEAN_FILE), raw_scale=255,
                              channel_swap=[2, 1, 0])

3. Pass the found regions for object detection

    self.detect_windows(zip(image_fnames, windows_list))


Object recognition: Result

Takes minutes to detect all windows

#   Obj             Score
0   domestic cat     1.03649377823
1   domestic cat     0.0617411136627
2   domestic cat    -0.097744345665
3   domestic cat    -0.738470971584
4   chair           -0.988844156265
5   skunk           -0.999914288521
6   tv or monitor   -1.00460898876
7   rubber eraser   -1.01068615913
8   chair           -1.04896986485
9   rubber eraser   -1.09035253525
10  band aid        -1.09691572189


Object recognition: Result

#   Obj           Score
0   person         0.126184225082
1   person         0.0311727523804
2   person        -0.0777613520622
3   neck brace    -0.39757412672
4   person        -0.415030777454
5   drum          -0.421649754047
6   neck brace    -0.481261610985
7   tie           -0.649109125137
8   neck brace    -0.719438135624
9   face powder   -0.789100408554
10  face powder   -0.838757038116
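The result tables appear to list the best-scoring class for each proposed window, ordered by score. A sketch of how such a ranking could be produced, assuming each window comes with one score per class; `rank_detections` is a hypothetical helper and the numbers are made up for illustration, not results from the talk.

```python
def rank_detections(window_scores, labels):
    """Return (label, score) of the best class per window, best window first."""
    best = []
    for scores in window_scores:
        # Index of the highest-scoring class for this window.
        top = max(range(len(scores)), key=lambda k: scores[k])
        best.append((labels[top], scores[top]))
    return sorted(best, key=lambda pair: pair[1], reverse=True)


# Made-up scores for three windows and three classes.
labels = ['domestic cat', 'chair', 'skunk']
window_scores = [
    [-1.2, -0.9, -1.5],
    [1.0, -2.0, -1.8],
    [-0.1, -1.1, -1.0],
]
ranked = rank_detections(window_scores, labels)
for rank, (label, score) in enumerate(ranked):
    print(rank, label, score)
```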


Step 3: Scoring


Scoring

1. For every pixel, sum up the scores from all detections

    for i in xrange(len(detections)):
        arr[ymin:ymax, xmin:xmax] += math.exp(score)
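The loop on the slide omits how `arr`, the window bounds and `score` are obtained. A self-contained sketch of the same accumulation, assuming each detection is a `((xmin, ymin, xmax, ymax), score)` pair; that layout, the helper name `score_heatmap` and the toy detections are assumptions for illustration.

```python
import math

import numpy as np


def score_heatmap(height, width, detections):
    """Add exp(score) to every pixel covered by each detection window."""
    arr = np.zeros((height, width), dtype=float)
    for (xmin, ymin, xmax, ymax), score in detections:
        # exp() keeps every contribution positive, even for negative scores.
        arr[ymin:ymax, xmin:xmax] += math.exp(score)
    return arr


# Two made-up, overlapping detections on a 6x6 image.
detections = [((0, 0, 4, 4), 1.0), ((2, 2, 6, 6), 0.0)]
heat = score_heatmap(6, 6, detections)
print(heat)
```

Pixels covered by both windows collect both contributions, so overlapping high-scoring detections reinforce each other in the heatmap.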


Score heatmap

We used the 200-category object recognition model developed for the 2013 ImageNet Challenge


Step 4: Cropping


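Cropping

1. Generate all possible crop areas

    # step (stride in pixels) and the x/y updates are assumed; the slide elides them
    y = 0
    while y + hws <= h:
        x = 0
        while x + hws <= w:
            window_locs = np.vstack((window_locs, [x, y, x + hws, y + hws]))
            x += step
        y += step

2. Find the crop that encloses the highest point of interest in the centre

    for i, window_loc in enumerate(window_locs):
        x1, y1, x2, y2 = window_loc
        if max_val != np.max(arr_con[y1:y2, x1:x2]):
            scores[i] = np.nan  # window does not contain the peak
        else:
            # squared distance from the window centre to the peak (xp, yp)
            scores[i] = ((x1 + x2) / 2. - xp) ** 2 + ((y1 + y2) / 2. - yp) ** 2

3. Crop and save!

    img_pil = Image.open(fn)
    # the smallest distance wins; nanargmin skips the windows marked NaN
    crop_area = tuple(int(v) for v in window_locs[np.nanargmin(scores)])
    img_crop = img_pil.crop(crop_area)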
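The three cropping steps can be combined into one small function. The sketch below assumes a square crop of side `size` and a heatmap with a single peak; `best_square_crop`, the `step` parameter and the toy heatmap are hypothetical. It skips windows that miss the peak instead of marking them NaN, which has the same effect.

```python
import numpy as np


def best_square_crop(heat, size, step=1):
    """Pick the size x size window that contains the peak and is best centred on it."""
    h, w = heat.shape
    yp, xp = np.unravel_index(np.argmax(heat), heat.shape)  # peak location
    best_box, best_dist = None, None
    for y in range(0, h - size + 1, step):
        for x in range(0, w - size + 1, step):
            inside = x <= xp < x + size and y <= yp < y + size
            if not inside:
                continue  # corresponds to the NaN score on the slide
            # Squared distance between the window centre and the peak.
            dist = (x + size / 2.0 - xp) ** 2 + (y + size / 2.0 - yp) ** 2
            if best_dist is None or dist < best_dist:
                best_box, best_dist = (x, y, x + size, y + size), dist
    return best_box


heat = np.zeros((10, 10))
heat[6, 3] = 5.0  # made-up single peak at x=3, y=6
box = best_square_crop(heat, size=4)
print(box)
# With Pillow, the crop itself would be: Image.open(fn).crop(box)
```

The returned tuple is in the `(left, upper, right, lower)` order that `Image.crop` expects.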


Finally


Future improvements

Aspect detection: square or rectangle?

Magnification

Fast face/human detection

Unseen object

Object weighting