The Science of Managing Data Scientists

Post on 01-Sep-2014

description

Creating great products powered by big data can be challenging. Data science work is often ambiguous, which can make results unpredictable and scheduling almost impossible. Many of the popular software engineering processes just won’t work for these innovative and ambitious projects. Waterfall falls apart; it just doesn’t make sense to define the product before understanding the limitations of the data and technology. And shoehorning data experiments into tight agile sprints is difficult, and it doesn’t necessarily lend itself to discoveries that involve a lot of inspiration and perspiration before a light bulb moment. Even with a working process, few teams collaborate truly effectively. Projects that involve machine learning, algorithm development, or other deeply technical endeavors are filled with advanced math and complicated terminology, which leaves plenty of teams with communication gaps that prevent the synergy that comes from working cohesively. Thankfully there are solutions to these problems! Based on personal experience and interviews with many other leaders spearheading big data initiatives, this session aims to distill these lessons into actionable strategies you can use to improve process and communication for your own team.

transcript

The Science of Managing Data Scientists

Kate Matsudaira

Tuesday, February 26, 13

What are they doing all day?

http://data.whicdn.com/images/29273643/funny-science-news-experiments-memes-super-science_large.jpg

Data science is different

Research doesn’t fit the traditional SDLC

image src: http://laurajul.dk/wp-content/uploads/2011/10/Screen-shot-2011-10-06-at-00.03.57.png

image src: legoexpress.tumblr.com

Good help is hard to find

(and keep)

What are they doing all day?

Logistics

Trust

Communication

Communication

Do you speak the same language?

image src: abclang.livejournal.com

Define “quality”

image source: http://www.sodahead.com/fun/

What do precision and recall mean?

(diagram labels: P, R)

Give them a lesson in semantics

Before

Precision: 80%

Recall: 25%

After

For the top search terms*, accessories appear 25% of the time on the first page.

For the top search terms*, the head products are present 90% of the time in the top 3 results and 98% of the time in the top 5.

* Top search terms are the 1000 most popular queries on our website over the last 30 days.
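To illustrate the gap between the "Before" and "After" slides, here is a rough sketch that computes classic precision and recall, then a business-facing "head product in the top k results" rate. The function names, data shapes, and sample queries are hypothetical and do not come from the talk.

```python
def precision_recall(retrieved, relevant):
    """Classic definitions: precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def head_product_presence(results_by_query, head_product_by_query, k):
    """Fraction of queries whose head product appears in the top k results."""
    if not results_by_query:
        return 0.0
    found = sum(
        1
        for query, results in results_by_query.items()
        if head_product_by_query.get(query) in results[:k]
    )
    return found / len(results_by_query)


# Hypothetical data: 5 items returned, 4 of them relevant, 16 relevant items overall.
retrieved = ["a", "b", "c", "d", "e"]
relevant = ["a", "b", "c", "d"] + [f"other{i}" for i in range(12)]
print(precision_recall(retrieved, relevant))  # (0.8, 0.25)

# Hypothetical ranked results for two queries and their head products.
results = {
    "macbook": ["case", "macbook air", "charger"],
    "bravia": ["cable", "mount", "stand", "bravia ex500"],
}
heads = {"macbook": "macbook air", "bravia": "bravia ex500"}
print(head_product_presence(results, heads, k=3))  # 0.5
print(head_product_presence(results, heads, k=5))  # 1.0
```

The first function reports the numbers a data scientist would quote. The second reports a statement a product owner can check by running a search, which is the kind of translation the "After" slide makes.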

Measure with data that matters

image source: http://www.freepatentsonline.com/6971185-0-large.jpg

This is a really hard problem.

We know it is hard... but we don’t know why it is hard.

Use analogies & examples

The dandelion swayed in the gentle breeze like an oscillating electric fan set on medium.

Constructing model lineages for products is really difficult.

Be specific

Macbook Air, Macbook, Macbook Pro, ?

Bravia EX500 LCD, Bravia EX620 LED, Bravia EX523 LCD, ?

Take it up a level

We are building a lineage

Product Evolution

Logistics

How do you create a sense of urgency?

image source: http://favim.com/image/503182/

What are the reasons behind it?

image source: http://dailyfailcenter.com/sites/default/files/fail/explain-this-shit.jpg

Deadlines in research?

They don’t call it research for nothin’

image source: http://sausagetails.com/2012/07/

You can’t predict the future

image source: maxseesmovies.blogspot.com

Applying “agile” to R&D

image source: http://mousebreath.com/wp-content/uploads/2011/08/funny-farmers.jpg

Backlog of experiments

Regular Demos

Defined workflow with iterations

Trips to Hawaii

That doesn’t sound agile....

Research / Data

Start here: Collect data

Do we have the right data? No: collect more data. Yes: move on.

Do we have the infrastructure to analyze the data? No: build it... Yes: move on.

Run experiments

Get results

Are we finished or do we need more.....?

Take tools out of the equation

image source: http://www.holesinyoursocks.com/2011/02/14/funny-monday-tool-love-happy-valentines-day/

Focus on feeds and files

Image source: http://carcat.files.wordpress.com/2009/03/funny-pictures-cat-searches-for-a-file.jpg

Format & storage standards

/prices
    /full
        /date=2012-07-01
            price-obs.2012-07-01.csv.gz
        /date=2012-07-02
    /inc
        /date=2012-07-01
            2012-07-01T00-10-00.csv.gz
            2012-07-01T00-20-00.csv.gz
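A minimal sketch of how a team might generate paths that follow a naming convention like the one on this slide. Only the directory and file name patterns come from the slide. The helper names are hypothetical, and the reading that /full holds one daily snapshot while /inc holds timestamped increments is an assumption.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Root directory name as shown on the slide.
ROOT = PurePosixPath("/prices")


def full_snapshot_path(day):
    """Path of the daily full file (assumed to live under /full)."""
    date_part = day.strftime("%Y-%m-%d")
    return ROOT / "full" / f"date={date_part}" / f"price-obs.{date_part}.csv.gz"


def incremental_path(timestamp):
    """Path of one timestamped incremental file (assumed to live under /inc)."""
    date_part = timestamp.strftime("%Y-%m-%d")
    file_part = timestamp.strftime("%Y-%m-%dT%H-%M-%S")
    return ROOT / "inc" / f"date={date_part}" / f"{file_part}.csv.gz"


print(full_snapshot_path(datetime(2012, 7, 1)))
# /prices/full/date=2012-07-01/price-obs.2012-07-01.csv.gz
print(incremental_path(datetime(2012, 7, 1, 0, 10, 0)))
# /prices/inc/date=2012-07-01/2012-07-01T00-10-00.csv.gz
```

Keeping the convention in one small module means researchers and engineers agree on where a feed lands without having to agree on which tools produce or consume it.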

Trust

What are they doing all day?

Building trust

image source: http://writealoud.com/funny-dinosaur-pictures/

Their motivations

image src: http://www.fredhoogervorst.com/oni.app/local/upload/03897400db.jpg

Hard problems to solve

Recognition for a job well done

Open the door to higher-ups

My work in the wild serving customers

Being GOLD!

Minimize time-to-ship

image source: http://2smallerwheels.blogspot.com/2011/09/speedy-delivery.html

Let them own the data

image source: http://www.cloudproviderusa.com/weekly-dose-of-humor-mo-data-mo-problems/

Algorithms can’t solve everything

Even though we wish they could!

Manual data can be awesome

image source: http://www.funnyjunk.com/funny_pictures/4416618/LOOK+AT+DE+PUPPY/

Can you do the impossible?

image source: anybody-want-a-peanut.blogspot.com

Data Miners aren’t magicians

image source: http://1.bp.blogspot.com/_zmT48r8jfbE/S1z4t5L7JxI/AAAAAAAAAJU/1vYZmc5lhBU/s400/steve-brooks-magician.jpg

You CAN make failure look good

image source: http://fightthebees.com/wp-content/uploads/2011/11/cow-fail.jpg

Your challenge:

Take a difficult problem and transform it into a feature

Should you choose to accept it....

HARD: Value Estimation

1.7GHz dual-core Intel Core i5: $300

256GB flash storage

Intel HD Graphics 4000

8GB SDRAM

Apple premium

Other component values: $250, $200, $200, $100

Estimated Value: $1050

EASIER: Comparing Products

Think with your business hat

image source: memegenerator.net

How do you surface ideas & insights?

image source: http://static.someecards.com/someecards/usercards/1327680406361_4248272.png

Show ‘em

image source: neslihandurmusoglu.edublogs.org

And show ‘em often

image source: http://www.social-science.co.uk/research/

Negative results happen

image source: http://www.theofrak.com/2012/09/existential-star-wars.html

The journey is the reward

image source: failblog.org

Logistics

Trust

Communication

Questions

http://katemats.com

@katemats

May the forces of evil get lost on the way to your doorstep.