Artificial Intelligence, Neural Networks & Deep Learning

Post on 15-Apr-2017

Hello World

BrainCreators

BrainCreators builds software solutions

AI technology is our comfort zone

Data is our source of inspiration

1. What is deep learning?
   Overview
   Convolutional Neural Network

2. Case study
   Automated category assignment
   Discovering new categories

Today

Machine Learning

Machine learning > Artificial neural networks

10^11 neurons, 10^4 synapses per neuron, 10^16 “operations” per second

Cortex: 2,500 cm^2, 2 mm thick; 1.4 kg, 1.7 liters; 250 million neurons per mm^3; 180,000 km of “wires”; 25 Watts

Learning causes synapses to strengthen or weaken, to appear or disappear.

The Human Brain

8x10^12 operations/second, 500 Watts, 5760 (small) cores, $3000

Are we only a factor 10,000 away from the power of the human brain?
Probably more like 1 million; synapses are complicated.
A factor of 1 million is 30 years of Moore's Law.
2045?
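The "30 years" figure can be checked with a short calculation. This is a sketch that assumes compute doubles roughly every 18 months; that doubling period is an assumption, not a number from the slide.

```python
import math

def years_to_close_gap(factor, doubling_period_years=1.5):
    """Years of steady doubling needed to gain `factor` in compute.

    doubling_period_years=1.5 is an assumed Moore's Law period.
    """
    doublings = math.log2(factor)  # a factor of 1 million is about 20 doublings
    return doublings * doubling_period_years

print(round(years_to_close_gap(1_000_000), 1))
```

With a two-year doubling period instead, the same gap would take about 40 years, so the estimate is sensitive to that assumption.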

The Machine Brain: NVIDIA Titan-Z GPU

Neural Networks

Inspired by the architecture of the brain, researchers have been trying to train neural networks for the last 25 years.

We have had good algorithms for learning the weights in networks with 1 hidden layer.

But no successful attempts for deep layers were reported before 2006 …

Deep learning

‘Deep Learning’ means using a neural network with several (hidden) layers of nodes between input and output

The series of layers between input & output do feature identification and processing in a series of stages, just as our brains seem to.

What’s new:

algorithms for training networks

better hardware (faster / more cores)

a lot more data (the internet)

Deep learning

Train in steps as autoencoders.

An autoencoder is a network with an input layer, an output layer and one or more hidden layers connecting them, but with the output layer having the same number of nodes as the input layer, and with the purpose of reconstructing its own inputs.

Making this happen with (many) fewer hidden units than inputs forces the ‘hidden layer’ units to become good feature detectors.
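A minimal sketch of the idea in NumPy: a single narrow hidden layer is trained to reconstruct its own input. The function name, the tanh activation, the learning rate and the toy data are all assumptions chosen for illustration; the slides do not specify them.

```python
import numpy as np

def train_autoencoder(X, n_hidden, epochs=500, lr=0.1, seed=0):
    """Train a one-hidden-layer autoencoder with plain gradient descent."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W_enc = rng.normal(0, 0.1, (n_in, n_hidden))  # input -> hidden code
    W_dec = rng.normal(0, 0.1, (n_hidden, n_in))  # hidden code -> reconstruction
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W_enc)   # narrow hidden layer
        X_hat = H @ W_dec        # output layer, same width as the input
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the squared reconstruction error (up to a constant factor).
        grad_dec = H.T @ err / len(X)
        grad_hidden = (err @ W_dec.T) * (1 - H ** 2)
        grad_enc = X.T @ grad_hidden / len(X)
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return W_enc, W_dec, losses

# Toy data: 8 inputs that really depend on only 2 underlying factors.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 8)) * 0.3
W_enc, W_dec, losses = train_autoencoder(X, n_hidden=2)
print(losses[0], losses[-1])  # reconstruction error before and after training
```

Because the hidden layer has only 2 units for 8 inputs, the network can only lower its reconstruction error by discovering the underlying factors.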

Deep learning algorithms

Convolutional Neural Network

Deep Belief Network

Restricted Boltzmann Machine

Deep Reinforcement Learning

Deep Q Learning

Hierarchical Temporal Memory

Stacked Denoising Autoencoders

Convolutional Neural Network for image classification

[Figure: a CNN should recognise an X or an O despite translation, scaling, rotation and weight]

What computers see

=?

Filtering: 1 x 1 = 1

Filtering: (1+1-1+1+1+1-1+1+1) / 9 = 5/9 ≈ 0.55

Convolution: Apply every possible match
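The filtering arithmetic and the "every possible match" step can be sketched as follows. The function names and the 3x3 diagonal feature are illustrative assumptions; what deep learning libraries call convolution is this sliding match (strictly speaking a cross-correlation), usually without dividing by the pixel count.

```python
import numpy as np

def match_score(patch, feature):
    """Multiply pixel by pixel, add up, divide by the number of pixels."""
    return float(np.sum(patch * feature) / feature.size)

def convolve(image, feature):
    """Compute the match score at every position where the feature fits."""
    fh, fw = feature.shape
    out_h = image.shape[0] - fh + 1
    out_w = image.shape[1] - fw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = match_score(image[r:r + fh, c:c + fw], feature)
    return out

# A diagonal-stroke feature: +1 on the stroke, -1 on the background.
feature = np.array([[ 1, -1, -1],
                    [-1,  1, -1],
                    [-1, -1,  1]])

# A patch that disagrees with the feature in two corner pixels.
patch = np.array([[ 1, -1,  1],
                  [-1,  1, -1],
                  [ 1, -1,  1]])

print(match_score(feature, feature))  # perfect match
print(match_score(patch, feature))    # (1+1-1+1+1+1-1+1+1) / 9
```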

Convolution layer

One image becomes a stack of filtered images

Pooling: Reducing size

1. Pick a window size (usually 2 or 3).

2. Pick a stride (usually 2).

3. Walk your window across your filtered images.

4. From each window, take the maximum value.
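The four steps above can be sketched in a few lines of NumPy. The function name is an assumption; windows that run past the edge of the image are simply cut off at the edge.

```python
import numpy as np

def max_pool(filtered, window=2, stride=2):
    """Walk a window across a filtered image and keep each window's maximum."""
    rows = range(0, filtered.shape[0], stride)
    cols = range(0, filtered.shape[1], stride)
    return np.array([[filtered[r:r + window, c:c + window].max() for c in cols]
                     for r in rows])

filtered = np.arange(16).reshape(4, 4)
print(max_pool(filtered))
# [[ 5  7]
#  [13 15]]
```

A 4x4 image shrinks to 2x2, while the strongest response in each region is preserved.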

[Figure: max pooling keeps the maximum value of each window]

Rectified Linear Units (ReLUs)

Remove negative values
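In code, a ReLU sets every negative value to zero and leaves the rest unchanged. A minimal sketch (the function name is an assumption):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative values become zero."""
    return np.maximum(x, 0)

print(relu(np.array([-0.5, 0.0, 0.77])))
```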

Fully connected layer

Weighted votes: X .92, O .51
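A fully connected layer can be read as weighted voting: every input value votes for every output class, and each vote is scaled by a learned weight. The sketch below uses made-up values and weights purely for illustration; they are not the numbers behind the .92 and .51 on the slide.

```python
import numpy as np

def fully_connected(values, weights):
    """Each output score is a weighted sum of all input values."""
    return weights @ values

# Hypothetical flattened feature values and hypothetical voting weights.
values = np.array([1.0, 0.5, 0.0, 0.9])
weights = np.array([
    [ 0.4, 0.2, -0.3, 0.5],  # how strongly each value votes for "X"
    [-0.2, 0.1,  0.6, 0.1],  # how strongly each value votes for "O"
])

scores = fully_connected(values, weights)
winner = ["X", "O"][int(np.argmax(scores))]
print(scores, winner)
```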

Deep Stacking

Convolution -> ReLU -> Pooling -> Convolution -> ReLU -> Convolution -> ReLU -> Pooling -> Fully connected -> Fully connected -> X .92, O .51

Put it all together

Training/Learning

Q: Where do all the magic numbers come from?

Features in convolutional layers

Voting weights in fully connected layers

A: Backpropagation

Learned

Applications

Speech recognition

Image recognition

Natural language processing

Recommendation systems

Customer relationship management

Etc...

A Case Study

How to categorise data from 100 scraped webshops?

Object recognition

Automatically Classify Product Images

Dress Boot

Problem Description

How to create training data

How can we discover “the right” categories

Tools

Nvidia Digits / Caffe: Deep learning

Elasticsearch: Document store

Apache Spark: Distributed clustering

Scrapy: Distributed scraping

Scraped JSON Data

"productID": "37801580",
"price": 12400,
"originalPrice": 12400,
"name": "Dirk Bikkembergs Polo Shirt",
"priceCurrency": "USD",
"url": "http://www.yoox.com/us/37801580KS/item",
"brand": "DIRK BIKKEMBERGS",
"description": "Dirk Bikkembergs Men Polo Shirt on YOOX.COM. The best online selection of Polo Shirts Dirk Bikkembergs. YOOX.COM exclusive items of Italian and international designers - Secure payments",
"seller": "Yoox",
"variantID": "66",
"image": [ "http://images.yoox.com/37/37801580ks_12_f.jpg", "http://images.yoox.com/37/37801580ks_12_r.jpg"],
"availability": "InStock",
"raw_tag": [ "Polo shirts", "T-Shirts and Tops", "men", "jersey, solid color, polo collar, long sleeves, logo, no pockets", "46% Cotton, 46% Modal, 8% Spandex", "DIRK BIKKEMBERGS"],

Giant product database

5 million products, uncategorised

Basic Steps

Initialization Phase - creating initial training data

Feature Discovery / Refinement Phase

Creating Initial training set

Manually predefined ontology / categorization

Text search by tag

Use simple image features: Edge Histogram, JCD

Create 50 categories with 500 samples

Generate rough initial neural network

Initial Phase: Create labeled training data

Initial Categories

acc_bag, acc_clothing_belts, acc_clothing_bowtie, acc_clothing_gloves, acc_clothing_tie, acc_glasses_glasses, acc_hats, acc_jewelry, clothing_bodywear_bodysuit, clothing_bodywear, clothing_bottomwear_bikini, clothing_bottomwear_leggings, clothing_bottomwear_pajama, clothing_bottomwear_pants, clothing_bottomwear_pants_jeans, clothing_bottomwear_short, clothing_bottomwear_skirt, clothing_dresswear, clothing_footwear, clothing_hosiery, clothing_topwear_blazer, clothing_topwear_blouse, clothing_topwear_coat, clothing_topwear_jacket, clothing_topwear_polo, clothing_topwear_sweater, clothing_topwear_tank, clothing_topwear_tshirt, other_bed, other_electronics, other_home, other_media_book, other_media_movies, other_other_cosmetics

Manual labeling - by tag

search by tag

We would select the top matches and assign the training label
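The tag-based labeling step can be sketched in plain Python. This is only a stand-in for the text search used in the talk: the function name, the keyword list and the `top_n` cut-off are hypothetical, and the field names follow the scraped JSON shown earlier.

```python
def label_by_tag(products, keywords, label, top_n=500):
    """Rank products by how many keywords appear in their raw tags,
    then assign `label` to the top matches."""
    keywords = [k.lower() for k in keywords]
    scored = []
    for product in products:
        tags = [t.lower() for t in product.get("raw_tag", [])]
        score = sum(any(k in t for t in tags) for k in keywords)
        if score > 0:
            scored.append((score, product["productID"]))
    scored.sort(key=lambda pair: -pair[0])  # best matches first
    return {product_id: label for _, product_id in scored[:top_n]}

products = [
    {"productID": "37801580",
     "raw_tag": ["Polo shirts", "T-Shirts and Tops", "men"]},
    {"productID": "hypothetical-2", "raw_tag": ["Boots", "women"]},
]
print(label_by_tag(products, ["polo"], "clothing_topwear_polo"))
```

Only products whose tags mention a keyword receive a label; everything else stays unlabeled for later phases.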

Simple Image features

LIRE: Lucene Image Retrieval

LIRE is a Java library that provides a simple way to retrieve images and photos based on color and texture characteristics. LIRE creates a Lucene index of image features for content-based image retrieval (CBIR) using local and global state-of-the-art methods. Easy-to-use methods for searching the index and result browsing are provided. Best of all: it's all open source.

Edge Histogram example
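To give a feel for what an edge-based image feature captures, here is a deliberately simplified sketch: a histogram of gradient directions over the strong-edge pixels of a grayscale image. It is not the Edge Histogram descriptor that LIRE implements, and the function name, bin count and threshold are assumptions.

```python
import numpy as np

def gradient_orientation_histogram(gray, bins=4, threshold=0.1):
    """Normalised histogram of gradient directions at strong-edge pixels."""
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx) % np.pi  # direction modulo 180 degrees
    strong = magnitude > threshold      # ignore flat regions
    hist, _ = np.histogram(angle[strong], bins=bins, range=(0.0, np.pi))
    total = hist.sum()
    return hist / total if total else hist.astype(float)

# An image with a single vertical edge: dark left half, bright right half.
gray = np.zeros((8, 8))
gray[:, 4:] = 1.0
print(gradient_orientation_histogram(gray))
```

Images with similar histograms tend to have similar outlines, which is what makes such features useful for visual search and sorting.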

Manual labeling

search by image

We created our own Elasticsearch Plugin to help us visually search and sort by simple image features

by image feature

Training a neural network

Define a network

Define a dataset

Transfer learning: AlexNet

1.5 million training examples

1000 categories

Define Digits Model

Create Dataset

Training A Model

Applying model

High Confidence / Low Confidence
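Splitting predictions into high and low confidence can be sketched as below. It assumes the model outputs one probability per category for each product; the function name is hypothetical, and the 0.96 default mirrors the threshold quoted in the results.

```python
import numpy as np

def split_by_confidence(probabilities, threshold=0.96):
    """Return the predicted label, its confidence, and a high-confidence mask."""
    probabilities = np.asarray(probabilities)
    labels = probabilities.argmax(axis=1)    # most likely category per product
    confidence = probabilities.max(axis=1)   # probability of that category
    return labels, confidence, confidence > threshold

# Two hypothetical products scored against three categories.
probabilities = [[0.98, 0.01, 0.01],
                 [0.50, 0.30, 0.20]]
labels, confidence, is_high = split_by_confidence(probabilities)
print(labels, confidence, is_high)
```

High-confidence products can be accepted as classified; low-confidence ones are candidates for the clustering phase.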

Phase 2

Improving categories by clustering features and human selection

Clustering using Kmeans

Applied Feature Discovery Process

Create deep learning model

Apply model to data

Automatically create new clusters using kmeans

Human verify useful clusters

Build new training set from top confidence

Create new cluster labels
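The automatic clustering step in this loop can be sketched with a small k-means in NumPy. This is a single-machine illustration, not the distributed Spark setup from the talk; the function name and the toy 2-D "features" are assumptions.

```python
import numpy as np

def kmeans(features, k, iterations=20, seed=0):
    """Plain k-means: alternate between assigning points and moving centers."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()

    def assign(current_centers):
        distances = np.linalg.norm(
            features[:, None, :] - current_centers[None, :, :], axis=2)
        return distances.argmin(axis=1)

    for _ in range(iterations):
        assignment = assign(centers)
        for j in range(k):
            members = features[assignment == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return assign(centers), centers

# Toy feature vectors forming two obvious groups.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
assignment, centers = kmeans(features, k=2)
print(assignment)
```

In the workflow above, a human would then inspect each cluster and decide whether it deserves a new category label.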

Clustering (Bracelets)

Clustering (Sneakers)

Final Categories

acc_bag_backpack, acc_bag_backpack_human, acc_bag_briefcase, acc_bag_bucket, acc_bag_bucket_human, acc_bag_clutch, acc_bag_duffel, acc_bag_hobo, acc_bag_messenger, acc_bag_pouch, acc_bag_satchel, acc_bag_shoulderbag, acc_bag_shoulderbag_human, acc_bag_suitcase, acc_bag_totes, acc_bag_wallet, acc_clothing_belts, acc_clothing_belts_human, acc_clothing_bowtie, acc_clothing_cufflinks, acc_clothing_gloves, acc_clothing_gloves_human, acc_clothing_suspenders, acc_clothing_tie, acc_clothing_umbrella, acc_glasses_glasses, acc_glasses_glasses_human, acc_glasses_sunglasses, acc_glasses_sunglasses_human, acc_hats_beanie, acc_hats_beanie_human, acc_hats_bucket, acc_hats_bucket_doll, acc_hats_bucket_human, acc_hats_cap, acc_hats_cap_human, acc_hats_fedora, acc_hats_fedora_doll, acc_hats_fedora_human, acc_headwear_headband, acc_headwear_headband_human, acc_headwear_scarf, acc_headwear_scarf_human, acc_jewelry_bracelet, acc_jewelry_brooche, acc_jewelry_earrings, acc_jewelry_earrings_human, acc_jewelry_hairpins, acc_jewelry_hairpins_human, acc_jewelry_necklace, acc_jewelry_necklace_human, acc_jewelry_ring, acc_jewelry_watch, acc_jewelry_watch_strap, clothing_bodywear_bodysuit, clothing_bodywear_bodysuit_human, clothing_bodywear_bodysuit_zoom, clothing_bodywear_swimsuit, clothing_bodywear_swimsuit_human, clothing_bottomwear_bikini, clothing_bottomwear_bikini_human, clothing_bottomwear_leggings_capri, clothing_bottomwear_leggings_capri_human, clothing_bottomwear_leggings_default, clothing_bottomwear_leggings_default_human, clothing_bottomwear_leggings_sport, clothing_bottomwear_pajama_female, clothing_bottomwear_pajama_human, clothing_bottomwear_pajama_human_female, clothing_bottomwear_pajama_human_male, clothing_bottomwear_pants, clothing_bottomwear_pants_chino, clothing_bottomwear_pants_chino_human_female, clothing_bottomwear_pants_chino_human_male, clothing_bottomwear_pants_human, clothing_bottomwear_pants_human_female, clothing_bottomwear_pants_jeans,
clothing_bottomwear_pants_jeans_boyfriend, clothing_bottomwear_pants_jeans_boyfriend_human, clothing_bottomwear_pants_jeans_human_female, clothing_bottomwear_pants_jeans_human_male, clothing_bottomwear_pants_jeans_skinny, clothing_bottomwear_pants_jeans_skinny_human, clothing_bottomwear_pants_jeans_skinny_zoom, clothing_bottomwear_pants_jeans_straight, clothing_bottomwear_pants_jeans_straight_human_female, clothing_bottomwear_pants_jeans_straight_human_male, clothing_bottomwear_pants_jeans_wideflaredbootcut, clothing_bottomwear_pants_jeans_wideflaredbootcut_human, clothing_bottomwear_pants_joggers_sweatpants,...

Conclusion/Results

Conclusions:

From chaotic data we can discover unknown categories and classify the data by repeating the following loop.

Discover new categories. How: Feature Analysis

Classify/structure the data. How: Deep Learning

Repeat

Results:

More than 95% of the 5M products classified with confidence > 0.96

More than 250 new labeled categories

Thank you! Please stay in touch

Gerbert Kaandorp, gerbert@braincreators.com

Jasper Wognum, jasper@braincreators.com

BrainCreators

Prinsengracht 796, 1017 JV Amsterdam

The Netherlands

www.braincreators.com