
Stanford GSB - Data Science Lessons from the Field

Date post: 17-Aug-2015
Upload: gaurav-kataria
Transcript

Data Science Lessons from the Field

Gaurav Kataria, Head of Product Adoption, Google for Work

Guest Lecturer at Stanford Business School

The views expressed here are my own and do not necessarily represent views of my employer

Lessons from the field

1. Create data-driven culture

2. Invest in the key capabilities

3. Iterate and adapt fast

4. Make the business tradeoffs

5. Don’t forget security and privacy

Lesson #1: Create data-driven culture

Data-Driven Decision Making vs. Decision-Driven Data Making (finding the data to support a decision that has already been made)

Lesson #2: Invest in the Capabilities

● Describe the Trends: 360° DATA

● Predict the Future: MACHINE LEARNING

● Change the Future: EXPERIMENTATION


“80% of a Data Scientist’s time is spent on cleaning and organizing the data”


“Correlation ≠ Causation”


“Let’s do it and then we’ll see what happens” ≠ Experiment

Lesson #2: Invest in the Capabilities (examples of common pitfalls)

Claim: Demand is inelastic

Reality: Channel partners did not pass on the discount

Claim: Feature is increasing user engagement

Reality: Users are frustrated because it takes them longer to get things done

Claim: Launch increased sales

Reality: Holidays increased sales; the launch actually depressed them

[Charts: Sales around a Price Cut; Time spent on site around a Feature Launch; Sales around a Product Launch, Sep to Dec]
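The third pitfall (a launch that coincides with the holidays) can be checked with a simple difference-in-differences comparison. The sketch below is only an illustration: the monthly figures, the assumption that the launch happened at the start of November, and the `control` product (a comparable product with no launch) are all hypothetical and do not come from the talk.

```python
# Hypothetical monthly sales in units; these numbers are made up for illustration.
sales = {
    "launched": {"Sep": 100, "Oct": 100, "Nov": 110, "Dec": 130},
    "control": {"Sep": 100, "Oct": 100, "Nov": 130, "Dec": 150},
}

BEFORE = ("Sep", "Oct")  # months before the assumed launch
AFTER = ("Nov", "Dec")   # months after the assumed launch (holiday season)


def average(series, months):
    """Mean monthly sales over the given months."""
    return sum(series[m] for m in months) / len(months)


def change(series):
    """Average sales after the launch date minus average sales before it."""
    return average(series, AFTER) - average(series, BEFORE)


# Naive reading: sales of the launched product went up, so the launch worked.
naive_lift = change(sales["launched"])

# Difference-in-differences: subtract the change seen in the control product,
# which captures the holiday effect that both products share.
launch_effect = naive_lift - change(sales["control"])

print(f"Naive lift: {naive_lift:+.0f} units/month")
print(f"Launch effect after removing the holiday trend: {launch_effect:+.0f} units/month")
```

With these made-up numbers the naive lift is positive while the adjusted effect is negative, which is the same shape as the claim and the reality on the slide. A randomized experiment would be a stronger check than this kind of after-the-fact comparison.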

Lesson #3: Iterate and adapt fast (Examples)

Movie/Song Recommendations

● Trends change

● Users’ preferences change

● Licensing costs change

Competition is not sleeping

● Big Data is getting democratized

● Machine learning offered as a service

● People are tuning their algorithms

© Shivon Zilis, Bloomberg Beta

Lesson #4: Make the business tradeoffs (Example)

● 100 customers: 90 will renew their subscription and 10 will not

● If we had a simple model that guessed our customers would always renew, it would be accurate 90% of the time

● However, we’d never be able to identify the 10 customers who won’t renew

● Most business data follows a similar pattern (called class imbalance)

● We need an intelligent model, not just an accurate model
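The "always renew" baseline from this example can be reproduced in a few lines. This is a minimal sketch that uses only the 90/10 split stated on the slide; the label strings and variable names are illustrative choices.

```python
# 100 customers: 90 renew and 10 do not (figures from the slide).
actual = ["renew"] * 90 + ["non-renew"] * 10

# A "simple model" that always predicts renewal.
predicted = ["renew"] * len(actual)

# Accuracy: share of customers whose outcome was predicted correctly.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# Recall on the non-renew class: share of non-renewing customers the model caught.
caught = sum(a == "non-renew" and p == "non-renew" for a, p in zip(actual, predicted))
recall_non_renew = caught / actual.count("non-renew")

print(f"Accuracy: {accuracy:.0%}")
print(f"Non-renewing customers identified: {caught} of {actual.count('non-renew')}")
```

The baseline is right 90% of the time and still finds none of the customers the business cares about, which is why accuracy alone is a poor yardstick on imbalanced data.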

Confusion matrix for the renewal prediction:

                          Actual: Non-renew (10)    Actual: Renew (90)
Predicted: Non-renew      True Positive (TP)        False Positive (FP)
Predicted: Renew          False Negative (FN)       True Negative (TN)

Precision = TP / (TP + FP) = 55%

Recall = TP / (TP + FN) = 60%
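The two percentages can be reproduced from confusion-matrix counts. The counts below are not printed on the slide; they are inferred from the cost figures in this example ($1,100 at $100 per customer implies 11 customers flagged, and $900 at $150 per customer implies 6 true positives), so treat them as an assumption.

```python
# Inferred counts (an assumption, see the note above), with "non-renew" as the positive class.
tp = 6   # flagged as non-renew and did not renew
fp = 5   # flagged as non-renew but renewed
fn = 4   # not flagged but did not renew
tn = 85  # not flagged and renewed

# Sanity checks against the class sizes on the slide.
assert tp + fn == 10  # actual non-renewals
assert fp + tn == 90  # actual renewals

precision = tp / (tp + fp)  # of the customers we flagged, how many were right
recall = tp / (tp + fn)     # of the customers who left, how many we flagged
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"Precision: {precision:.0%}")
print(f"Recall:    {recall:.0%}")
print(f"Accuracy:  {accuracy:.0%}")
```

Under these inferred counts, accuracy is only one point above the 90% "always renew" baseline, yet the model identifies 6 of the 10 non-renewing customers.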

● Cost plays a big role. For example,

○ Cost of action is $100/customer; Total cost of action for all predicted: $1,100

○ Benefit of action is $150/customer; Total benefit (true positives only): $900

○ Cost > Benefit

● So, what is more important: Precision or Recall?
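The cost comparison above can be worked through as follows. The per-customer cost and benefit are from the slide; the counts of 11 flagged customers and 6 true positives are inferred from the stated totals, and the break-even calculation is an added illustration rather than something the talk presents.

```python
COST_PER_ACTION = 100    # dollars spent on each customer flagged as non-renew
BENEFIT_PER_SAVE = 150   # dollars gained for each true positive

flagged = 11         # inferred: $1,100 total cost / $100 per customer
true_positives = 6   # inferred: $900 total benefit / $150 per customer

total_cost = flagged * COST_PER_ACTION
total_benefit = true_positives * BENEFIT_PER_SAVE
net = total_benefit - total_cost

# Acting on a flagged customer pays off on average only if
# precision * benefit > cost, i.e. precision > cost / benefit.
precision = true_positives / flagged
break_even_precision = COST_PER_ACTION / BENEFIT_PER_SAVE

print(f"Total cost:    ${total_cost:,}")
print(f"Total benefit: ${total_benefit:,}")
print(f"Net:           ${net:,}")
print(f"Precision {precision:.0%} vs. break-even {break_even_precision:.0%}")
```

Under these assumptions the action loses money because precision is below the break-even level of roughly 67%, which is one way to frame the precision-or-recall question: the answer depends on the cost and benefit attached to each outcome.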

Lesson #4: Make the business tradeoffs (Example)

[Chart: Importance of Precision (Cost of a False Positive) vs. Importance of Recall (Cost of a False Negative)]

Examples: Automated Email, Large Discount, Product Recommendation, Customer Churn
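One way to make this tradeoff concrete is to price each kind of error and compare models on total error cost. Everything in the sketch below is hypothetical: the two operating points, the dollar costs, and the mapping of "large discount" to an expensive false positive and "automated email" to a cheap one are one plausible reading, not figures or placements from the talk.

```python
def error_cost(fp, fn, cost_fp, cost_fn):
    """Total cost of a model's mistakes given a price for each error type."""
    return fp * cost_fp + fn * cost_fn


# Two hypothetical operating points on a 10 non-renew / 90 renew population.
MODELS = {
    "high_precision": {"fp": 1, "fn": 6},   # flags few customers, misses more
    "high_recall": {"fp": 12, "fn": 1},     # flags many customers, misses few
}


def cheapest_model(cost_fp, cost_fn):
    """Name of the operating point with the lowest total error cost."""
    costs = {
        name: error_cost(m["fp"], m["fn"], cost_fp, cost_fn)
        for name, m in MODELS.items()
    }
    return min(costs, key=costs.get)


# Expensive false positive (e.g. giving a large discount to a customer who
# would have renewed anyway): precision matters more.
large_discount_choice = cheapest_model(cost_fp=500, cost_fn=100)

# Cheap false positive (e.g. an automated email): recall matters more.
automated_email_choice = cheapest_model(cost_fp=1, cost_fn=100)

print("Large discount:", large_discount_choice)
print("Automated email:", automated_email_choice)
```

The same pair of models ranks differently once the error costs change, so the choice between precision and recall is a business decision rather than a purely statistical one.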

Lesson #5: Don’t forget data security and user privacy

Thank You!


