The Journey to Data Science Maturity: Sailing the Seven C’s

Date post: 14-Aug-2020
Transcript

Page 1:

The Journey to Data Science Maturity: Sailing the Seven C’s

Kirk Borne (@KirkDBorne)

Principal Data Scientist, Booz Allen Hamilton

http://www.boozallen.com/datascience

Get these slides here: http://www.kirkborne.net/ApexSystems2019/

Page 2:

This talk is *not* about this…

The unicorns of the new data world…

https://www.figure-eight.com/figure-eight-2018-data-scientist-report/

http://www.marketingdistillery.com/2014/11/29/is-data-science-a-buzzword-modern-data-scientist-defined/

Page 3:

… but this talk is about this…

https://twitter.com/dez_blanchfield/status/645139875440668672

Page 4:

OUTLINE

• Big Data Preliminaries

• Data Literacy & Ethics

• Data Science

• Data Storytelling

• The “7 C’s” (actually 12)

Source: https://www.expertsystem.com/government-data-mining/


Page 7:

We’ve had data and databases for years!

So, why all the fuss right now?

• Data Science and Machine Learning are hot right now because of the enormous interest in the new massive data collections and their potential to enable significant new discoveries and to deliver new value (and opportunities) to organizations.

• Data Science is ready for “prime time” because it engages three technologies that are now sufficiently mature:

1) Massive data collection

2) More powerful computing (and virtual / distributed computing = Cloud)

3) More sophisticated and more powerful ML algorithms.

Page 8:

Defining Big Data

• We collect evidence (data) to answer our questions about the world around us … How? Why? What if?

– We are curious creatures… and that is how we end up in a world of BIG DATA!

• Big Data refers to data collections in which “everything is now being quantified and data-fied” (= full-population samples of everything = The End of Demographics!)

– Examples: Social networks (Twitter, YouTube), search & online histories, web logs, financial and e-commerce transactions, environment & health monitors (wearable devices, EHRs), IoT, Astronomy,…

– Huge quantities of data are now being collected and used everywhere.

Source for graphic: http://hinalockim.blogspot.com/2012/08/6th-week-cognitive-learning.html

Page 9:

Some Quick Definitions:

• Statistics = the practice (and science) of collecting and analyzing numerical data.

• Machine Learning (ML) = mathematical algorithms that learn from experience (= pattern recognition from previous evidence).

• Data Mining = application of ML algorithms to data.

• Artificial Intelligence (AI) = application of ML algorithms to robotics and machines = taking actions based on data ( #bots ).

• Data Science = application of scientific method to discovery from data (including statistics, ML, and more: visual analytics, computer vision, computational modeling, semantics, graphs, network analysis, NLU, data indexing schemes [Google!], …).

• Analytics = the products of machine learning & data science. For example: Health Analytics, Marketing Analytics, Behavior Analytics, Predictive Analytics (predictive models).

“The 2 most important things in Data Science are the Data and the Science!”

Page 10:

An “Easy Button” for Extracting Value from Data through Data Science

• Pattern Discovery (Detection)

– D2D: data-to-discovery

• Pattern Recognition

– D2D: data-to-decisions

• Pattern Exploration

– D2D: data-to-dollars (innovation)

• Pattern Exploitation

– D2V: Data-to-Value (action)

– D2A: Data-to-Action (value)

Page 11:

Example: Google! (now Alphabet Inc.)

• Google’s mission statement (since their beginning):

– “To organize the world's information and make it universally accessible & useful.”

– Ka-Ching!

– This was achieved through… Mathematical Algorithms for search! (PageRank)

• Recent advances in “search” (pattern recognition) algorithms:

– Voice-based search and response (Google Assistant, Siri, Cortana, Alexa)

– “Search” for data! (not just keywords and text, but patterns in data) http://diginomica.com/2016/08/16/thoughtspots-search-for-data-analytics-finds-results-fast/

– Deep Learning used to find those patterns (in video, images, audio, networks,…)

– But… sometimes the algorithm gets it wrong …

“Good judgment comes from experience, and experience comes from bad judgment.”

Actually, you want to get at least two models wrong in gradient descent optimization!
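One way to read the gradient descent remark: the method uses the error of the current (wrong) model to decide which way to move, so every step before convergence is a wrong model being improved. The sketch below is a minimal illustration and is not from the slides; the data, the learning rate, and the name `gradient_descent` are assumptions chosen for the example.

```python
def gradient_descent(xs, ys, lr=0.05, steps=200):
    """Fit y ≈ w * x by minimizing mean squared error with plain gradient descent."""
    w = 0.0  # deliberately wrong starting model
    history = []  # (w, loss) for every model tried along the way
    n = len(xs)
    for _ in range(steps):
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        history.append((w, loss))
        # Gradient of the mean squared error with respect to w.
        grad = 2 * sum((w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad  # step "downhill", away from the current error
    return w, history

# Hypothetical data generated by y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w, history = gradient_descent(xs, ys)
print(f"learned slope: {w:.4f}")
print("first three (w, loss) pairs:", history[:3])
```

Each entry in `history` is a model that was wrong by a measurable amount; the shrinking loss is the "experience" the optimizer learns from.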


Page 13:

Data Literacy

“Data Literacy includes the ability to read, work with, analyze, and argue with data.”

(Jordan Morrow, Qlik) http://www.dataliteracynetwork.org/definitions.html

Source: http://bit.ly/2mEzJsr

Page 14:

Data Literacy in 2 parts: Data Science and Data Ethics

1) How to use data correctly

2) How to use data ethically

http://www.kirkborne.net/cds151/

http://dilbert.com/strip/2000-11-13

http://dilbert.com/strip/2008-05-07

Page 15:

Huge quantities of data are now being used everywhere!

“With great power comes great responsibility.” – Spider-Man’s uncle (…or… Voltaire)

Source: http://verix.com/6-new-articles-on-harnessing-the-power-of-big-data/

Page 16:

Huge quantities of data are now being used everywhere!

“We were so preoccupied with whether or not we could, we didn't stop to think if we should.” – Dr. Ian Malcolm (Jurassic Park)

Data Ethics Example:

• App purchases

Source: https://www.youtube.com/watch?v=9nazm3_OXac

Page 17:

Data Literacy Matters!

Source: https://lovestats.wordpress.com/dman/

Page 18:

Statistical & Data Literacy Matter!

Quote from H.G. Wells (1903; writer) …

“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write.”

Well, that day is here now!

Page 19:

Quote from Ronald Coase (economist) …

“If you torture your data long enough, it will confess to anything.”

Page 20:

Quote from Steven Wright (comedian) …

“42.7% of all statistics are made up on the spot.”

Page 21:

Quote from somebody (?) …

“It is now beyond any doubt that cigarettes are the biggest cause of statistics”

Page 22:

In our rush to build and validate our predictive models, we often are too quick to overlook our own cognitive biases and other data fallacies, such as: “Correlation does not imply Causation!”

https://www.geckoboard.com/learn/data-literacy/statistical-fallacies/

https://bit.ly/2pPnUSu
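A small simulation can make the correlation-versus-causation fallacy concrete. In the sketch below, which is an illustration and not from the slides, a hidden driver `z` influences two outcomes `a` and `b` that have no effect on each other. All variable names, the noise levels, and the sample size are assumptions.

```python
import math
import random

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical hidden driver (a confounder), e.g. a standardized daily temperature.
z = [rng.gauss(0, 1) for _ in range(n)]
# Two outcomes that each depend on z but not on each other.
a = [zi + rng.gauss(0, 0.5) for zi in z]
b = [zi + rng.gauss(0, 0.5) for zi in z]

r_raw = pearson(a, b)
# Remove the confounder's contribution (known here only because we built the data).
r_adjusted = pearson([ai - zi for ai, zi in zip(a, z)],
                     [bi - zi for bi, zi in zip(b, z)])
print(f"raw correlation: {r_raw:.2f}, after removing z: {r_adjusted:.2f}")
```

The raw correlation between `a` and `b` is strong even though neither causes the other; once the shared driver is removed, what remains is close to zero.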

Page 23:

Feature Selection and Projection

Feature Selection is important in order to disambiguate different classes.

More importantly, Class Discovery depends on choosing the right projection and selecting the right features!

Source: https://www.quora.com/How-was-classification-as-a-learning-machine-developed

Page 24:

Projection Matters

Your chosen data attributes represent a low-dimension projection of the full truth – the feature space (dimensions) in which you explore your data is a form of cognitive bias –… it matters!

Source: http://www.transformativeinsights.co.nz/blog/new-perspective-on-conflict
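The point about projection can be shown with a toy data set. This is a hypothetical sketch, not from the slides: two classes overlap completely when viewed through either original feature alone, yet separate cleanly under a different projection. The class layout and the helper `ranges_overlap` are assumptions made for the example.

```python
# Two hypothetical classes in a 2-D feature space (x, y).
class_a = [(float(x), x + 1.0) for x in range(5)]  # points on the line y = x + 1
class_b = [(float(x), x - 1.0) for x in range(5)]  # points on the line y = x - 1

def ranges_overlap(u, v):
    """True if the value ranges of two 1-D projections intersect."""
    return max(min(u), min(v)) <= min(max(u), max(v))

# Projection 1: keep only the x feature.
overlap_on_x = ranges_overlap([x for x, _ in class_a], [x for x, _ in class_b])
# Projection 2: keep only the y feature.
overlap_on_y = ranges_overlap([y for _, y in class_a], [y for _, y in class_b])
# Projection 3: a derived feature, y - x.
overlap_on_diff = ranges_overlap([y - x for x, y in class_a],
                                 [y - x for x, y in class_b])

print(overlap_on_x, overlap_on_y, overlap_on_diff)  # prints: True True False
```

Neither raw feature distinguishes the classes on its own; the derived feature `y - x` does, which is the sense in which class discovery depends on the chosen projection.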

Page 25:

The Data Science of Feature-rich Chocolate Brownies

http://www.datasciencebowl.com/data-science-of-chocolate-brownies/

High-Variety Data enables better (and tastier) analytics models

Variety is the spice of discovery!


Page 27:

Our mission as data scientists: To discover Value in Big Data through the ethical application of Data Science and Machine Learning

Page 28:

Recommended Reading: The Data Science Playbook

https://www.boozallen.com/s/insight/publication/data-science-playbook.html

Page 29:

Predictive Analytics is currently the most significant application of Machine Learning (*)

(*) The set of mathematical algorithms that learn (patterns) from experience (data)

Source for graphic: https://data-flair.training/blogs/machine-learning-applications/

Page 30:

Predictive Analytics is everywhere in Business Data and Machine Learning (AI) Strategy Discussions

Source for graphic: https://www.altexsoft.com/blog/datascience/machine-learning-strategy-7-steps/

Page 31:

Typical Machine Learning Applications in Our Lives:

PREDICT: Your Purchase Preferences, Recommender Systems, Credit Scoring, Smart Phone auto-complete …

OPTIMIZE: Your Thermostat, Your Commute Time and Routing, Personalized Learning …

DISCOVER: Your Health Issues (wearables), Your Best Deal (Bed & Breakfast or Restaurant) …

DETECT: Your Social Sentiment, Identity Theft, Credit Card Fraud …

Page 32:

Typical Machine Learning Applications in the Enterprise:

PREDICT: Predict outcomes, events, prices, costs, risks, product demand …

OPTIMIZE: Optimize processes, products, and people (delivery of services, supplies, personnel) …

DISCOVER: Discover insights in social media, documents, quarterly business reports, customer call records...

DETECT: Detect fraud, anomalies in safety events, behaviors, outbreaks, data usage (GDPR), systems (cybersecurity breaches) …

Page 33:

4 Types of Insights Discovery from Data:

(Graphic by S. G. Djorgovski, Caltech)

1) Class Discovery: Find the categories of objects (population segments), events, and behaviors in your data. + Learn the rules that constrain the class boundaries (that uniquely distinguish them).

2) Correlation (Predictive and Prescriptive Power) Discovery: (insights discovery) – Find trends, patterns, dependencies in data that reveal the governing principles or behavioral patterns (the object’s “DNA”).

3) Outlier / Anomaly / Novelty / Surprise Discovery: Find the new, surprising, unexpected one-in-a-[million / billion / trillion] object, event, or behavior.

4) Association (or Link) Discovery: (Graph and Network Analytics) – Find both the usual and the unusual (interesting) data associations / links / connections across the entities in your domain.

Page 34:

How does a Data Scientist build a model of a complex dynamic system?

Page 35:

We might start by modeling a complex system like this…

Page 36:

We can add more features to model the system with higher fidelity …

Page 37:

We can add more features to model the system with higher fidelity …

Page 38:

Pattern Discovery is easy, but Pattern Exploitation requires more data science…

Generalization is key! (The Goldilocks model)

The most generally useful model captures the fundamental pattern in the data and takes into account the natural variance in the data.

Source for graphic: http://www.holehouse.org/mlclass/10_Advice_for_applying_machine_learning.html
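The Goldilocks idea can be sketched by comparing three models on held-out data: one too simple, one too complex, and one in between. This is a hypothetical illustration, not from the slides; the underlying pattern `truth`, the noise level, the train/test split, and the three model choices are all assumptions.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def truth(x):
    return 2.0 * x  # hypothetical underlying pattern

# Noisy observations: even x values for training, odd x values for testing.
train = [(x, truth(x) + rng.gauss(0, 0.5)) for x in range(0, 40, 2)]
test = [(x, truth(x) + rng.gauss(0, 0.5)) for x in range(1, 40, 2)]

def fit_mean(data):
    """Too simple: ignores x and always predicts the average y."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y

def fit_line(data):
    """Matches the pattern here: least-squares straight line."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    slope = (sum((x - mx) * (y - my) for x, y in data)
             / sum((x - mx) ** 2 for x, _ in data))
    return lambda x: my + slope * (x - mx)

def fit_memorizer(data):
    """Too complex: memorizes every training point, noise included."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

results = {}
for name, fit in [("mean", fit_mean), ("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(train)
    results[name] = (mse(model, train), mse(model, test))
    print(f"{name:10s} train MSE = {results[name][0]:8.2f}   test MSE = {results[name][1]:8.2f}")
```

The memorizer reproduces the training data exactly but does worse than the line on unseen points, while the mean misses the pattern altogether. The line captures the pattern and tolerates the natural variance, so it generalizes best in this setup.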

Page 39:

Developing insight and scientific intuition are connected and essential, and slow...

Insight: the capacity to gain an accurate and deep intuitive understanding of a person or thing.

Developing Scientific Intuition: "People are particularly bad at Statistical Thinking, which requires careful weighing of evidence by the slower analytic mind."

https://www.sigmaxi.org/news/article/2018/12/12/from-the-president-developing-scientific-intuition

https://amzn.to/2RI1mlS

Page 40:

Remember this …


Page 43:

5 Levels of Analytics Maturity in Data-Driven Applications

1) Descriptive Analytics

– Hindsight (What happened?)

2) Diagnostic Analytics

– Oversight (real-time / What is happening? Why did it happen?)

3) Predictive Analytics

– Foresight (What will happen?)

4) Prescriptive Analytics

– Insight (How can we optimize what happens?) (Follow the dots / connections in the graph!)

5) Cognitive Analytics

– Right Sight (the 360 view, what is the right question to ask for this set of data in this context = Game of Jeopardy)

– Finds the right insight, the right action, the right decision,… right now!

– Moves beyond simply providing answers, to generating new questions and hypotheses.

Page 46:

Predictive vs Prescriptive: What’s the Difference?

Confucius says… “Study your past to know your future”

Baseball philosopher Yogi Berra says… “The future ain’t what it used to be.”

PREDICTIVE Analytics

Find a function (i.e., the model) f(d,t) that predicts the value of some predictive variable y = f(d,t) at a future time t, given the set of conditions found in the training data {d}.

=> Given {d}, find y.

PRESCRIPTIVE Analytics

Find the conditions {d’} that will produce a prescribed (desired, optimum) value y at a future time t, using the previously learned conditional dependencies among the variables in the predictive function f(d,t).

=> Given y, find {d’}.
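The two directions can be sketched with one learned function used forwards and then backwards. This is a minimal illustration under strong assumptions and is not from the slides: a single condition `d`, a straight-line model, no explicit time variable, and made-up training data. The names `fit_line`, `predict`, and `prescribe` are hypothetical.

```python
def fit_line(ds, ys):
    """Least-squares fit of y = intercept + slope * d."""
    n = len(ds)
    md, my = sum(ds) / n, sum(ys) / n
    slope = (sum((d - md) * (y - my) for d, y in zip(ds, ys))
             / sum((d - md) ** 2 for d in ds))
    return my - slope * md, slope

# Hypothetical training data {d} with observed outcomes y.
ds = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(ds, ys)

def predict(d):
    """Predictive: given the condition d, find y."""
    return intercept + slope * d

def prescribe(y_target):
    """Prescriptive: given the desired y, find the condition d' that produces it."""
    return (y_target - intercept) / slope

y_hat = predict(5.0)        # what will happen if d = 5?
d_prime = prescribe(15.0)   # what d do we need so that y = 15?
print(f"predicted y for d = 5: {y_hat}")
print(f"prescribed d' for y = 15: {d_prime}")
```

A straight line can be inverted algebraically. With a more complex learned function, the prescriptive step would typically become a search or optimization over candidate conditions, still driven by the same predictive model.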


Page 48:

Zoom deeper into your data for both Predictive and Prescriptive Power Discovery!

(from the Booz Allen “Field Guide to Data Science”)

Page 49:

“What is going on in that neighborhood on Saturday evenings between 6pm and 8pm?”

Source for graphic: https://www.boozallen.com/s/insight/publication/field-guide-to-data-science.html

Page 50:

Association Discovery Example #1

◼ Classic Textbook Example of Data Mining (Legend?): Data mining of grocery store logs indicated that men who buy diapers also tend to buy beer at the same time.

Page 51:

Association Discovery Example #2

◼ Amazon.com mines its customers’ purchase logs to recommend books to you: “People who bought this book also bought this other one.”

Page 52:

Association Discovery Example #3

◼ Netflix mines its video rental history database to recommend rentals to you based upon other customers who rented similar movies as you.

Page 53:

Association Discovery Example #4

◼ Wal-Mart studied product sales in their Florida stores in 2004 when several hurricanes passed through Florida.

◼ Wal-Mart found that, before the hurricanes arrived, people purchased 7 times as many of {one particular product} compared to everything else.

Page 54:

Association Discovery Example #4

◼ Wal-Mart studied product sales in their Florida stores in 2004 when several hurricanes passed through Florida.

◼ Wal-Mart found that, before the hurricanes arrived, people purchased 7 times as many strawberry pop tarts compared to everything else.

Page 55:

Strawberry pop tarts???

http://www.nytimes.com/2004/11/14/business/yourmoney/14wal.html

http://www.hurricaneville.com/pop_tarts.html

http://bit.ly/1gHZddA
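Association discovery of the kind in these examples is often summarized with three numbers for a rule such as "diapers → beer": support, confidence, and lift. The sketch below computes them on a tiny, entirely made-up set of shopping baskets; the data and the helper name `support` are assumptions for illustration, not figures from any retailer.

```python
# Hypothetical shopping baskets (one set of items per transaction).
baskets = [
    {"diapers", "beer", "chips"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "beer", "milk"},
    {"bread"},
    {"milk", "chips"},
]

def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

# Rule under test: diapers -> beer
support_both = support({"diapers", "beer"})        # how often the pair occurs together
confidence = support_both / support({"diapers"})   # P(beer | diapers)
lift = confidence / support({"beer"})              # > 1 means bought together more than chance

print(f"support = {support_both:.3f}, confidence = {confidence:.2f}, lift = {lift:.2f}")
```

In this toy data the lift is above 1, meaning beer shows up in diaper baskets more often than its overall popularity would predict. A lift near 1 would indicate no association.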


Page 58:

Curious

https://www.pinterest.com/pin/248683210647831264/

Page 59:

Creative (design thinking)

Source: https://infocus.dellemc.com/william_schmarzo/design-thinking-innovation/

https://jaywalker-digital.ch/en/ebook-how-to-apply-design-thinking-to-data-science/

https://blog.westmonroepartners.com/when-design-thinking-meets-data-science/

Page 60:

Computational

Data Scientists survey results: https://www.kdnuggets.com/2018/05/poll-tools-analytics-data-science-machine-learning-results.html

Page 61:

Collaborative

Source: https://www.boozallen.com/s/insight/publication/data-science-playbook.html

Page 62:

Critical Thinker

Source: https://plus.google.com/collection/UVYWTB

Source: https://www.adam-eason.com/critical-thinking-importance-ways-improve/

Source: https://successatschool.org/advicedetails/964/critical-thinking-skills

Page 63:

Community Focus

https://datasciencebowl.com/

Page 64:

Courageous Problem-Solver

https://middlesexconsulting.com/lesson-from-winston-churchill/

Page 65:

Cool under pressure (tolerance for ambiguity)

https://www.slideserve.com/mave/shades-of-gray-ambiguity-tolerance-statistical-thinking

Page 66:

Consultative (customer-focus)

Creating value at the “pull” of the customer!

“A pull strategy becomes more important than push because you want to create enough value so that the customer comes to you!”

https://www.marketing91.com/pull-strategy-in-marketing/

Page 67:

Compassion (empathy)

Source: https://www.jitbit.com/news/customer-service-skills/

Page 69:

Continuous Life-long Learner

http://bit.ly/2hYkRU4

Page 70:

Continuous Life-long Learner

… or just follow this guy on Twitter …

@KirkDBorne

Page 71:

Booz Allen Hamilton

SAILING THE “7 SEAS” OF DATA SCIENCE: The Individual’s Journey to Data Science Maturity

The “Seven” Seas (C’s):

1) Cognitively Curious (ask questions … the right questions!)

2) Creative (design thinker)

3) Courageous problem-solver (rocks the culture, willingness to fail)

4) Cool under pressure (tolerance for ambiguity)

5) Continuous life-long learner (hackathons, online classes, …)

6) Communicator (data storyteller)

7) Collaborative (“data science is a team sport”)

+ 5 more:

8) Critical Thinker

9) Computational

10) Consultative

11) Community-focus

12) Compassion (Empathy)

Page 72:

DATA SCIENTISTS ARE EXPLORERS –– EXPLORING VAST AND ENDLESS SEAS OF DATA!

“If you want to build a ship, don’t drum up people to gather wood and don’t assign them tasks and work, but rather teach them to yearn for the vast and endless sea.” - Antoine de Saint-Exupery

https://www.pinterest.com/pin/377106168772298092/

http://www.nytimes.com/2008/04/11/world/europe/11exupery.html

