Australian CIO Summit 2012: Big Data, New Physics, and Geospatial Super-Food by Tristan Sternson,...

Post on 30-Oct-2014


Australian CIO Summit 2012: Big Data, New Physics, and Geospatial Super-Food by Tristan Sternson, Managing Director, InfoReady


Big Data, New Physics, and Geospatial Super-Food

© 2012 Infoready Pty Ltd

Tristan Sternson, InfoReady Managing Director

My Background – Tristan Sternson

• Past 12 years focussed purely on IM / BI solutions

• Started InfoReady in 2008

• Prior roles: Accenture Data Management & Architecture / IM Lead, PwC Consulting / IBM

• Personally designed, deployed and led many large IM and DW applications in Australia and the UK

• Thought leader in Information Management in Australia and APAC

• Early adopter of Data Governance, Big Data, Industry Data Models, and Appliance DW solutions

Who is InfoReady?

• Pure-play Information Management and Business Intelligence consulting firm

• Team InfoReady: career IM and BI experts

• One of the fastest growing consulting firms in Australia

• IM-focused Tier One consulting capability

• Focus: people, process and technology

• Assisting companies to turn valuable information into actionable intelligence

• Strategy, Architecture, Solution Design & Delivery

Big Data Definition

Datasets that grow so large that they become difficult to work with, including capture, storage, search, sharing, analytics, and visualization.

Working with larger and larger datasets allows analysts to “spot business trends, prevent diseases, combat crime.”

We haven’t seen anything yet, as more devices come online, e.g. mobile, airborne, logs, cameras, microphones, etc.

Wikipedia - 2012

The Big Data Opportunity

V3


Big Data – Why the hype?

• By 2015, nearly 3B people will be online, pushing the data created and shared to nearly 8 zettabytes.

• 30 billion pieces of content were added to Facebook this past month by 600M plus users.

• More than 2B videos were watched on YouTube … yesterday.

• In the US, mobile phone users between the ages of 18 and 24 send an incredible 110 text messages per day.

• 32B searches were performed last month … on Twitter.

• Worldwide IP traffic will quadruple by 2015.

Business Value

• 1 in 3 business leaders frequently make decisions based on information they don’t trust, or don’t have

• 1 in 2 business leaders say they don’t have access to the information they need to do their jobs

• 83% of CIOs cited “Business intelligence & Analytics” as part of their visionary plans to enhance competitiveness

• 60% of CEOs recognise they need to better understand information more rapidly in order to make swift decisions

Big Data Trends

[Chart: 80% / 20%]

What the Industry Analysts say

Gartner predicts Big Data to be one of the top-10 strategic initiatives for 2012

What the Industry Analysts say

Key take-aways from analyst perspectives (Gartner, TDWI):

• Data will grow exponentially

• Fusion of structured and unstructured data

• The connection between big data and advanced analytics will get even stronger

• Future users will not be able to put all useful information into a single data warehouse

Enterprise Intelligence vs. Enterprise Amnesia

Trend: Organizations Are Getting Dumber

[Chart. Axes: Computing Power Growth vs. Time. Labels: Available Observation Space, Context, Enterprise Amnesia]

Trend: Organizations Are Getting Dumber

[Chart. Axes: Computing Power Growth vs. Time. Labels: Available Observation Space, Sensemaking Algorithms, Context, WHY?]

Algorithms at Dead End.

You Can’t Squeeze Knowledge Out of a Pixel.

No Context

scrila34@msn.com

Context, definition

Better understanding something by taking into account the things around it.

Information in Context … and Accumulating

scrila34@msn.com

• Top 200 Customer

• Job Applicant

• Identity Thief

• Criminal Investigation

The Puzzle Metaphor

• Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes and colors

• What it represents is unknown – there is no picture on hand

• Is it one puzzle, 15 puzzles, or 1,500 different puzzles?

• Some pieces are duplicates, missing, incomplete, low quality, or have been misinterpreted

• Some pieces may even be professionally fabricated lies

• Until you take the pieces to the table and attempt assembly, you don’t know what you are dealing with

Puzzling

[Figure labels: 12 pieces, 100%; 1000 pieces, 100%; 12 pieces, 100%; 100 pieces, 10% (duplicates); 100%; 66 pieces, 66%; 100% (pure noise)]

First Discovery – “we found Dora?”

Sorting Algorithm

Another Puzzle …

10 Mins – Completed Dora Puzzles

Data Finds Data

Obvious Duplicates in Front Of Your Eyes

Incremental Context – Incremental Discovery

10:00am START

< 1min “I can see Dora”

1min “How many puzzles are there?”

8min “Are there 1000 pieces and 3 or 4 puzzles?”

10min 2 x Dora puzzles complete

12min “I have blue sky and an animal”

18min “The other puzzle is more colourful – maybe a red motorbike”

23min “We’ve found Jenny Sanders – can I search Google on my iPhone for the picture?”

35min “How can we have 2 pieces the same?”

Lots of Sorted Pieces

Pieces in Context

Quickly we find meaning (90mins)

66 pieces of 1190 pieces: only 5.5%

Wow 1%

11 pieces of 1190 pieces: only 1%

Koala, Possum or Monkey?

Foundation

More Data Finds Data

Out of Tablespace…

Incremental Context – Incremental Discovery

55min “Second puzzle is definitely a motorbike – I can see a wheel and seat”

65min Motorcycle coming together very quickly

70min “It’s definitely a koala”

75min “The koala has a baby”


83min “The middle piece of the bike is missing – do I really need it, I know what it is”

88min “These are both Australian puzzles”

114min One of the kids starts isolating pieces that are causing her “noise”

130min 7 chunks emerge from 7 piles of SORTED pieces

165min Pieces beginning to come together quite quickly and picture starts to really emerge

How Context Accumulates

• With each new observation … one of three assertions is made: 1) un-associated; 2) placed near like neighbors; or 3) connected

• Must favor the false negative

• New observations sometimes reverse earlier assertions

• Some observations produce novel discovery

• As the working space expands, computational effort increases

• Given sufficient observations, there can come a tipping point

• Thereafter, confidence improves while computational effort decreases!
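The three assertions on this slide can be illustrated with a small sketch. It is not the speaker's algorithm: the split into "strong" identifiers (assumed unique to one entity) and "weak" descriptive features, the function name `place`, and the sample values are all assumptions made for illustration. Favoring the false negative is modelled by never merging on weak overlap alone.

```python
def place(strong, weak, entities):
    """Place one observation into the working space and return the assertion made.

    strong:   set of identifier values assumed unique to one entity (hypothetical)
    weak:     set of descriptive values that many entities may share (hypothetical)
    entities: list of {"strong": set, "weak": set} dicts, updated in place
    """
    near = False
    for entity in entities:
        if entity["strong"] & strong:
            # Shared identifier: connect, and let the entity accumulate context.
            entity["strong"] |= strong
            entity["weak"] |= weak
            return "connected"
        if entity["weak"] & weak:
            near = True
    # Favor the false negative: weak overlap alone never merges two entities.
    entities.append({"strong": set(strong), "weak": set(weak)})
    return "near like neighbors" if near else "un-associated"


entities = []
first = place({"DL 00001234"}, {"Smith"}, entities)
second = place({"0413123123"}, {"Smith"}, entities)
third = place({"DL 00001234"}, {"Mark"}, entities)
```

The sketch leaves out two behaviours the slide mentions: it never reverses an earlier assertion, and it does not merge two existing entities when a later observation bridges them.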

[Chart. Axes: Unique Identities vs. Observations. Labels: Overstated Population, True Population]

Counting Is Difficult

File 1: Mark Smith, 6/12/1978, 0413123123

File 2: Mark R Smith, (614) 13-123-123, DL: 00001234

The Rise and Fall of a Population

[Chart. Axes: Unique Identities vs. Observations. Label: True Population]

Data Triangulation

File 1: Mark Smith, 6/12/1978, 0413123123

File 2: Mark R Smith, (614) 13-123-123, DL: 00001234

New Record: Mark Randy Smith, 0413123123, DL: 00001234
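A minimal sketch of the triangulation idea, using the three records on the slide: two files look like two people until a third record shares a feature with each. Which values belong to File 1 and File 2 is a reading of the flattened slide. The field names, the `resolve` function, and the choice to match only on exact phone and licence strings are assumptions, not the method the talk describes.

```python
def resolve(records):
    """Count unique identities, linking records that share a phone or licence value."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    seen = {}  # (field, value) -> index of the first record carrying it
    for i, record in enumerate(records):
        for key in ("phone", "licence"):
            value = record.get(key)
            if value is None:
                continue
            if (key, value) in seen:
                parent[find(i)] = find(seen[(key, value)])
            else:
                seen[(key, value)] = i
    return len({find(i) for i in range(len(records))})


file_1 = {"name": "Mark Smith", "dob": "6/12/1978", "phone": "0413123123"}
file_2 = {"name": "Mark R Smith", "phone": "(614) 13-123-123", "licence": "00001234"}
new_record = {"name": "Mark Randy Smith", "phone": "0413123123", "licence": "00001234"}

before = resolve([file_1, file_2])              # two apparent identities
after = resolve([file_1, file_2, new_record])   # the new record links both files
```

Adding an observation lowers the identity count from two to one, which is the "rise and fall of a population" effect: more data can shrink the count rather than grow it.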

Big Data [in context]. New Physics.

• More data: better the predictions
  – Lower false positives
  – Lower false negatives

• More data: bad data good
  – Suddenly glad your data is not perfect

• More data: less compute

Big Data


Pile of ____ In Context

One Form of Context: “Expert Counting”

• Is it 5 people each with 1 account … or is it 1 person with 5 accounts?

• Is it 20 cases of H1N1 in 20 cities … or one case reported 20 times?

• If one cannot count … one cannot estimate vector or velocity (direction and speed).

• Without vector and velocity … prediction is nearly impossible.

Expert Counting: Degrees of Difficulty

• Exactly Same: Bob Jones 123455 / Bob Jones 123455

• Fuzzy: Bob Jones 123455 / Robert T Jonnes 000123455

• Incompatible Features: Bob Jones 123455 / bjones@hotmail

• Deceit: Bob Jones 123455 / Ken Wells 550119
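The first two degrees of difficulty can be sketched with Python's standard `difflib`. This is an illustration only: the `compare` function, the 0.5 similarity threshold, and the rule of stripping leading zeros from the number are assumptions. The incompatible-features and deceit cases are out of reach for string comparison, which is the slide's point.

```python
from difflib import SequenceMatcher


def normalise(name, number):
    """Lower-case the name and drop leading zeros from the number (assumed rule)."""
    return name.lower().strip(), number.lstrip("0")


def compare(a, b, threshold=0.5):
    """Classify two (name, number) records as exactly same, fuzzy, or no match."""
    name_a, num_a = normalise(*a)
    name_b, num_b = normalise(*b)
    if (name_a, num_a) == (name_b, num_b):
        return "exactly same"
    similarity = SequenceMatcher(None, name_a, name_b).ratio()
    # Require the number to agree before accepting a loosely similar name.
    if num_a == num_b and similarity >= threshold:
        return "fuzzy"
    return "no match"


exact = compare(("Bob Jones", "123455"), ("Bob Jones", "123455"))
fuzzy = compare(("Bob Jones", "123455"), ("Robert T Jonnes", "000123455"))
deceit = compare(("Bob Jones", "123455"), ("Ken Wells", "550119"))
```

The deceit pair comes back as "no match" because nothing in the two records overlaps. Counting those correctly needs additional context rather than a better string comparison.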

Key Features Enable Expert Counting

People          Cars                Router
Name            Make                Device ID
Address         Model               Make
Date of Birth   Year                Model
Phone           License Plate No.   Firmware Vers.
Passport        VIN                 Asset ID
Nationality     Owner               Etc.
Biometric       Etc.
Etc.

Consider Lying Identical Twins

PASSPORT: #123, Sue, 3/3/84, Uberstan, Exp 2011

PASSPORT: #123, Sue, 3/3/84, Uberstan, Exp 2011

Fingerprint

DNA

Most Trusted Authority: “Same person – trust me.”

• The same thing cannot be in two places … at the same time.

• Two different things cannot occupy the same space … at the same time.

Space & Time Enables Absolute Disambiguation

People          Cars                Router
Name            Make                Device ID
Address         Model               Make
Date of Birth   Year                Model
Phone           License Plate No.   Firmware Vers.
Passport        VIN                 Asset ID
Nationality     Owner               Etc.
Biometric       Etc.
Etc.
When            When                When
Where           Where               Where

“Life Arcs” Are Also Telling

Bill Smith, 13/4/67, Melbourne, Victoria

Address History:
Melbourne, Vic 2008-2008
St Kilda, Vic 2005-2008
Hampton, Vic 1996-2005
Brighton, Vic 1984-1996

Bill Smith, 13/4/67, Brisbane, Queensland

Address History:
Carina, QLD 2005-2009
Brisbane, QLD 2005-2005
Bondi, NSW 1990-2005
Carina, QLD 1982-1990
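One way to read the two address histories is to look for years in which the two Bill Smiths were living in different places. The sketch below does that with a simple interval overlap. The function name, the year-level granularity, and the rule that stays touching only at a boundary year do not count as overlapping are assumptions for illustration.

```python
def overlapping_stays(history_a, history_b):
    """Return (place_a, place_b, start, end) for stays in different places that overlap in time."""
    conflicts = []
    for place_a, start_a, end_a in history_a:
        for place_b, start_b, end_b in history_b:
            start, end = max(start_a, start_b), min(end_a, end_b)
            # Strict inequality: a shared boundary year could simply be a move.
            if start < end and place_a != place_b:
                conflicts.append((place_a, place_b, start, end))
    return conflicts


melbourne_bill = [
    ("Melbourne, Vic", 2008, 2008),
    ("St Kilda, Vic", 2005, 2008),
    ("Hampton, Vic", 1996, 2005),
    ("Brighton, Vic", 1984, 1996),
]
brisbane_bill = [
    ("Carina, QLD", 2005, 2009),
    ("Brisbane, QLD", 2005, 2005),
    ("Bondi, NSW", 1990, 2005),
    ("Carina, QLD", 1982, 1990),
]

conflicts = overlapping_stays(melbourne_bill, brisbane_bill)
```

With identical names and birth dates, the overlapping stays in different states are what suggest two different people, in line with the rule that the same thing cannot be in two places at the same time. A person can hold two addresses at once, so this is evidence rather than proof.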

OMG


Space-Time-Travel

• Cell phones are generating a staggering amount of geo-locational data – 600B transactions per day being created in the US alone

• This data is being “de-identified” and shared with third parties – in volume and in real-time

• Your movement quickly reveals where you spend your time (e.g., evenings vs. working hours)

• Re-identification (figuring out who is who) is somewhat trivial
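To make the "evenings vs. working hours" point concrete, here is a small sketch that takes de-identified location pings and picks the most frequent location in each time window. The ping format, the cell identifiers, and the hour windows (09:00 to 17:00 for work, 20:00 to 06:00 for evening) are all hypothetical. Real data and methods would be far richer.

```python
from collections import Counter


def most_common_or_none(counter):
    """Return the most frequent key of a Counter, or None if it is empty."""
    return counter.most_common(1)[0][0] if counter else None


def usual_places(pings):
    """pings: iterable of (hour_of_day, cell_id). Returns (evening_cell, working_cell)."""
    evening = Counter()
    working = Counter()
    for hour, cell in pings:
        if 9 <= hour < 17:
            working[cell] += 1
        elif hour >= 20 or hour < 6:
            evening[cell] += 1
    return most_common_or_none(evening), most_common_or_none(working)


# Hypothetical pings for one de-identified device.
sample = [(22, "A"), (23, "A"), (2, "A"), (10, "B"), (14, "B"), (15, "C"), (21, "C")]
evening_cell, working_cell = usual_places(sample)
```

Once a likely home and workplace pair is known, matching it against other records is what makes re-identification practical.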

Powerful Predictions

• Prediction with 87% certainty where you will be next Thursday at 5:35pm

• Names of the top 10 people you co-locate with, not at home and not at work

• Intelligence service preempts the next mass protest in real-time

• Robbery of a convenience store is about to happen at 10:42pm

Consequences

• Space-time-travel data is the ultimate biometric

• It will enable enormous opportunity

• It will unravel one’s secrets

• It will challenge existing notions of privacy

• And it’s here now, with more to come

Macro Trends


The Greater the Context, the Greater the Value

[Chart. Axes: Value of Data vs. Records Managed (Big to Ludicrous Big). Curves: Data in Context, Pile of Data]

Time Is Of The Essence

The better the predictions … the faster they will be wanted.

“Why did we have to wait until the end of the day for the smart answer?”

[Chart. Axes: Willingness to Wait vs. Relevance (Iffy to Totally). Labels: Day, Hour, Batch, 200ms, Real-Time]

Enterprise Intelligence: One Plausible Journey

[Diagram, built up over five slides. Labels: Observation Space; New Observations; What you know; Sense and Respond; Data Finds Data; Relevance Finds the Sensor (<200ms); Decide?; Explore and Reflect; Deep Reflection; Curated Data; Pattern Discovery; Directed Attention; Relevance Finds You; NEW INTERESTS; Report and Manage]

Closing Thoughts

The most competitive organizations are going to make sense of what they are observing fast enough to do something about it while they are observing it.

Wish This On The Enemy

[Chart. Axes: Computing Power Growth vs. Time. Labels: Available Observation Space, Sensemaking Algorithms, Context, Enterprise Amnesia]

The Way Forward: Enterprise Intelligence

[Chart. Axes: Computing Power Growth vs. Time. Labels: Available Observation Space, Sensemaking Algorithms, Context]

Questions?


Email: tristan.sternson@infoready.com.au

Twitter: http://www.twitter.com/tsternson

Blog: www.infoready.com.au

LinkedIn: http://www.linkedin.com/in/tristansternson