Date post: 27-Jan-2015
Category: Technology
Upload: gigaom
Ira A. (Gus) Hunt, Chief Technology Officer
Beyond Big Data
Riding the Technology Wave
Our Mission
We are the nation's first line of defense. We accomplish what others cannot accomplish and go where others cannot go. We carry out our mission by:
• Collecting information that reveals the plans, intentions and capabilities of our adversaries and provides the basis for decision and action.
• Producing timely analysis that provides insight, warning and opportunity to the President and decisionmakers charged with protecting and advancing America's interests.
• Conducting covert action at the direction of the President to preempt threats or achieve US policy objectives.
4 Big Bets
Revolutionize Big Data Exploitation
– Acquire, federate, secure and exploit. Grow the haystack, magnify the needles.
Accelerate Operational Excellence
– Innovate IT operations and run IT like a business.
Serve CIA by supporting the IC
– Assume a leadership role in IC activities that matter to CIA; build to share.
Drive Performance through Talent Management
– Focus on continuous learning and diversity of thought, experience, and background.
6 Key Technology Enablers
Advanced Mission Analytics: Analytics as a Service
– World-class abilities to discover patterns, correlate information, understand plans and intentions, and find and identify operational targets in a sea of data. Big Data analytics as a service.
Enterprise Widgets and Services
– A customizable, integrated and adaptive webtop that lets analysts, ops officers, and targeters “have it their way”. Personalization in context.
Security as a Service
– One environment, all data, protected and secure: ubiquitous encryption, enterprise authentication, audit, DRM, secure ID propagation, and Gold Version C&A.
Data Harbor: Data as a Service
– An ultra-high performance data environment that enables CIA missions to acquire, federate, position and securely exploit huge volumes of data. Data in context.
Cloud Computing: Infrastructure as a Service
– Capacity ahead of demand. Large scale, elastic, commodity hosting, storage, and compute.
Secure Mobility
– Immediate, secure and appropriate access to people, data and tools from anywhere at any time.
It’s a Big Data World
Google: > 100 PB; > 1T indexed URLs; > 3 million servers; > 7.2B page-views/day
Facebook: > 1 billion users; > 300 PB; +> 500 TB/day; > 35% of world’s photographs
YouTube: > 1000 PB; +> 72 hours/minute; > 37 million hours/year; > 4 billion views/day
World Population: > 7,057,065,162
Twitter: > 124B tweets/year; > 390M/day; ~4500/sec
Global Text Messages: > 6.1T per year; > 193,000 per second; > 876 per person per year
US Cell Calls: > 2.2T minutes/year; > 19 minutes/person/day (uncompressed < 1 YouTube/year)
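The per-second rates quoted on the Twitter and text-message slides follow from the daily and yearly totals by simple division. A minimal sketch of that arithmetic, assuming a 365-day year (the helper name `per_second` is illustrative, not from the slides):

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: non-leap year


def per_second(count, period_seconds):
    """Average number of events per second over the given period."""
    return count / period_seconds


# Totals taken from the slides: 390M tweets/day, 6.1T text messages/year.
tweets_per_sec = per_second(390e6, SECONDS_PER_DAY)
texts_per_sec = per_second(6.1e12, SECONDS_PER_YEAR)

print(f"Tweets: about {tweets_per_sec:,.0f} per second")
print(f"Text messages: about {texts_per_sec:,.0f} per second")
```

Both results land close to the rounded figures on the slides (roughly 4,500 tweets and 193,000 text messages per second).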
3 Driving Forces
• Social
• Mobile
• Cloud
Social + Mobile + Cloud = Big Data
Social + Mobile + Cloud:
• Increases the velocity of innovation
• Accelerates social change
• Altered the flow of information
3 Emerging Forces
• Nano
• Bio
• Sensors
Mobile Sensor Platform
• Microphone
• Image
• 3-axis accelerometer
• Touch
• Light
• Proximity
• Geolocation
Communicator, Tricorder, Transporter
Mobile Health Platform
• Pacemaker
• Blood sugar tester
• Insulin controller
• Health monitor
• Exercise coach
• Remote tune-ups
• Early warning system
Mobile Sensor Platform
Identity by 3-axis accelerometer
• Gender (71%)
• Height, tall or short (80%)
• Weight, heavy or light (80%)
• You by your gait (100%)
Actitracker: Android App
Social + Mobile + Cloud + Nano + Bio + Sensors = The inanimate becomes sentient
Smarter Planet
• Cars drive themselves
• Machines know your needs
• Drive radical efficiencies
• Enhance social engagement
• Improve information sharing
• Enables global reach
• Green (automatic routing)
• Improve our health
• Stop/prevent crime
• …
Sensors are Really Big
• Sensors are unbounded
• Sensors are indiscriminate
• Sensors are promiscuous
The Internet of Things is Bigger
• Everything is Connected
• Everything is a Sensor
• Everything Communicates
That’s the Really Big Data Challenge of the future
Why We Care
Impact of Big Data
• Know what we know
• Discover the gaps in our knowledge
• More effective use of expensive or long lead collection assets
• Focus targeting to fill the gaps
• Better global coverage to limit surprise
• Enhance understanding and improve analysis
Implications
4 Rules of Big Data
• It’s the data… (apologies to James Carville)
• Power to the people (apologies to the Black Panthers)
• Latency breeds contempt (apologies to Aesop)
• Context, context, context (apologies to Lord Harold Samuel)
It’s the Data…
Data vs Tools—A History Lesson
• Sophisticated tools without the data are useless
• Mediocre tools with the data are frustrating
• Analysts will always opt for frustration over futility, if that is their only option
Our Job
• Leverage the Big Data world
• Find the Information that Matters
• Connect the Dots
• Understand the Plans of our Adversaries
• Safeguard our national security
The Problem
Our Problem: Which 5K
• Don’t know the future value of data
• We cannot connect dots we don’t have
• Traditional, requirements-driven collection fails in the Big Data world
– Can’t task for data you don’t know you need
– The few cannot know the needs of the many
– Global Coverage requires Global Data
Characteristics of Big Data
• More is always better
• Signal to noise only gets worse
• Requirements are usually hindsight
• Enumeration not modeling
Data as a Service
• Analysts and operators are not data engineers
• Need insight and understanding
• Ask a question and get a coherent answer
• Cannot know what data sets contain information of value to them
• Imbue data services and tools with those smarts
• Smart Data, smart tools, smarter intelligence
Power to the People
Today
• Analytics and tools are hard to use
• Specialists are required to derive value
• Skilled people are in short supply
• Algorithms are dense and arcane
• Require a lot of hand curation
• Built for business not for intelligence
New Fields of Expertise
• Data Scientist
• Information Engineer
Data Science
Data science combines elements from many fields (source: Wikipedia):
• Math
• Statistics
• Data Engineering
• Pattern Recognition and Learning
• Advanced Computing
• Visualization
• Uncertainty Modeling
• Data Warehousing
• High performance computing
The power of big data can only be fully realized when it is in the hands of the average user
Big Data Democracy Wins
Tomorrow
• Elegant, powerful and easy to use tools and visualizations
• Machines to do more of the heavy lifting
• Intelligent systems that learn from the user
• Correlation not search
• “Curiosity layer”– machines that are curious on your behalf
7 Universal Constructs for Analytics
• People
• Places
• Organizations
• Time
• Events
• Concepts
• Things
User Built Recipes
Keep it Simple
• Data Scientists focus on hard problems
• Build reusable components that anyone can apply (Recipes)
• Share them widely (Apps Store/Apps Mall, Recipe Book)
• Let users assemble components their way
• Experiment and fail quickly to succeed faster
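One way to picture the recipe idea is as small reusable steps that a user chains together in any order. The sketch below is only an illustration under that assumption. The step names (`lowercase`, `drop_short`, `unique`) and the `compose` helper are hypothetical and do not describe any actual CIA tooling.

```python
from functools import reduce


def compose(*steps):
    """Chain steps left to right into a single reusable recipe."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


# Hypothetical reusable components, each doing one simple thing.
def lowercase(words):
    return [w.lower() for w in words]


def drop_short(words):
    return [w for w in words if len(w) > 3]


def unique(words):
    return sorted(set(words))


# A user assembles the components "their way" into a recipe.
recipe = compose(lowercase, drop_short, unique)
result = recipe(["Data", "the", "DATA", "Cloud", "of"])
print(result)
```

Because each step takes and returns the same kind of value, a recipe can be rearranged or extended without touching the components themselves, which is what makes quick experimentation cheap.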
Latency Breeds Contempt
It’s All About Speed
• Hadoop/MapReduce: batch
– Flexible, powerful, slow
• Equivalent of real-time MapReduce
– Flexible, powerful and fast
– Dremel, Caffeine, Impala, Apache Drill, Spanner…
• Recursive streams processing with complex analytics
• In-memory: peta-scale RAM architectures
– Distributed, in-memory analytics
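For readers unfamiliar with the batch model named in the first bullet, the following is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the example. It is not Hadoop's API. The function names are illustrative, and a real cluster would run the phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


records = ["connect the dots", "find the needles", "grow the haystack"]
counts = map_reduce(records)
print(counts["the"])
```

The whole input must pass through every phase before any answer appears, which is the source of the latency the slide contrasts with real-time and in-memory approaches.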
Tectonic Technology Shifts
Traditional Processing | Mass Analytics/Big Data
Data on SAN | Data at processor
Move Data to Question | Move Question to Data
Backup | Replication management
Vertical scaling | Horizontal scaling
Capacity after demand | Capacity ahead of demand
DR | COOP
Size to peak load | Dynamic/elastic provisioning
Tape, SAN, Disk | SAN, Disk, SSD
RAM limited | Peta-scale RAM
New Computing Architectures
• Data close to compute
• Power at the edge
• Optical Computing/Optical Bus
• End of the motherboard: shared pools of everything
• Software defined everything: compute, storage, networking, data center
• Network is the bottleneck and constraint
Context, Context, Context
Everything in Your Frame of Reference
• Widgets—Webtop in context to business
• Schema on Read—Data in context to your question
• User assembled analytics—answers in context to your questions
• Elastic computing—computing in context to your demand
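The schema-on-read bullet can be illustrated with a small sketch: records are stored raw, and a schema is applied only when a question is asked, so different questions can read the same data differently. The sample records, field names, and the `read_with_schema` helper are all made up for illustration.

```python
import json

# Raw records stored as-is; no schema was enforced at write time.
RAW_RECORDS = [
    '{"name": "alpha", "place": "Paris", "time": "2013-03-20"}',
    '{"name": "bravo", "place": "Lima", "extra": 42}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> cast function) only when reading."""
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            # Fields missing from a record are surfaced as None.
            row[field] = cast(record[field]) if field in record else None
        yield row


# Two different questions, two different schemas, same stored data.
who_where = list(read_with_schema(RAW_RECORDS, {"name": str, "place": str}))
who_when = list(read_with_schema(RAW_RECORDS, {"name": str, "time": str}))
print(who_where)
print(who_when)
```

Unknown fields such as `extra` are simply ignored until some later question needs them, which fits the earlier point that the future value of data is not known at collection time.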
Closing Thoughts
High Noon in the Information Age
It is nearly within our grasp to compute on all human generated information
Facebook: > 1 billion users; > 35% of all photographs
The inanimate is rapidly becoming sentient
Smarter Planet
• Cars drive themselves
• Machines know your needs
3rd Wave of Computing
• Cognitive Machines
• Watson
Social + Mobile + Cloud + Nano + Bio + Sensors:
• Moving faster than government can keep up
• The legal system is woefully behind
• What are your rights?
• Who owns your data?
• Driving the pace of social change
• Exponentially increasing cyber threats