Date post: | 14-Jun-2015 |
Category: | Technology |
Upload: | nasscom |
© 2013 IBM Corporation
Big Data in the Real World
Chandra S Kallur, Service Area Leader, Business Analytics and Optimization
December 8, 2013
Agenda
Big Data – Myths & Truths
The Big Data Strategy
Examples of Big Data Instantiation in real world
Future of Big Data
What can Big Data do for your Organization?
Big Data Myths
Big Data is only about unstructured information
False! Most projects include structured information sources.
Big Data projects are expensive
False! You should start small, and projects should be ROI positive.
Big Data technologies make traditional databases and warehouses obsolete
False! Databases and warehouses remain a vital part of analytic solutions.
Big Data technologies require BIG datasets
False! Flexibility, not data size, is the most important aspect.
Big Data: Is It Only For A Few Industries? False?
The Big Data Strategy: Move the Analytics Closer to the Data
New analytic applications drive the requirements for a big data platform
• Integrate and manage the full variety, velocity and volume of data
• Apply advanced analytics to information in its native form
• Visualize all available data for ad-hoc analysis
• Development environment for building new analytic applications
• Workload optimization and scheduling
• Security and Governance
T-Mobile uses big data to optimize network performance and reduce costs
• Needed a solution to store and analyze two years' worth of Call Detail Records (CDRs), switch, billing and network event data for over 30 million subscribers to identify and address network bottlenecks
• Analyzes over 17 billion events per day to provide over 1,300 users with network Quality of Experience (QoE) analytics, traffic engineering, dropped session analytics as well as voice and data session analytics
• Business users can perform ad-hoc network and traffic analysis to identify performance issues in seconds and address them faster
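To put the volume figure in perspective, the short calculation below converts the slide's "over 17 billion events per day" into an average per-second rate and a two-year total. It is only a back-of-the-envelope sketch: it assumes the daily figure is a steady average held for two 365-day years, which the slide does not state, and the variable names are illustrative.

```python
# Figures taken from the slide; everything derived from them is an estimate.
EVENTS_PER_DAY = 17_000_000_000      # "over 17 billion events per day" (lower bound)
SECONDS_PER_DAY = 24 * 60 * 60

# Average sustained rate, assuming events arrive evenly over the day.
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY

# Rough count of events in "two years' worth" of data,
# assuming the daily volume stays constant (an assumption, not a slide fact).
events_in_two_years = EVENTS_PER_DAY * 365 * 2

print(f"{events_per_second:,.0f} events per second on average")
print(f"{events_in_two_years:,} events over two years")
```

Because real traffic is bursty, peak rates would be higher than this average.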
Ufone uses real-time analytics to reduce customer churn
Need
• Difficulty in managing marketing campaigns
• No direct ability to correlate campaigns with earned business
• Execute a successful marketing campaign based on real-time customer insights
Benefits
• Analyzed customer call detail records (CDRs) and created customer profile segmentation
• Data is streamed and analyzed in real time, so offers reach clients in a timely manner
• Campaign response time improved from 25% to 50%, CDR analysis time fell from 1 day to 30 seconds, and customer churn was reduced by 15 to 20%
A European utility uses streams and predictive analytics to create accurate estimates of demand to fully capture and optimize the use of distributed generation resources
Need
•Real-time, scalable and accurate forecasts at a very low level of locality
•Very high number of forecast models automatically updated with limited user interaction
•Incorporate local, diverse information, such as local weather conditions or events
•Simulation for test and what-if analysis on huge amounts of data
Benefits
•Accuracy: 20% improvement over industry and academic state of the art. Validated onsite with real consumption data
•Performance: Hundreds of thousands of time series processed on an IBM Blade server
•Abrupt changes in demand were resolved with network reconfigurations
Optimizing capital investments based on double digit Petabyte analysis
•Model the weather to optimize placement of turbines, maximizing power generation and longevity
• Modeling based on a global 1x1 kilometer grid with hundreds of variables
• Time-to-analysis curve flattened from 3 weeks to 3 days!
•Build models to cover forecasting and real-time operation of power generation units
• Wind turbine sensor data collection to store and understand petabytes of actual operating results once the turbine is in production
• Scope includes service intervals, mean time to failure, and optimization of turbine interaction with wind conditions
A large U.S. regulated energy provider deploys condition-based maintenance to assess natural gas pipeline risks
Need
Correlate data from multiple sources into one actionable platform, using that information to better plan and deploy inspection, detection, maintenance, repair and replacement resources and personnel.
Benefits
•Unified source of truth by integrating data from:
• GIS, EAM, historians
• Corrosion history, drawings, cathodic protection
• External data sources such as weather and soil
•Analytics-driven condition-based assessment
•Estimates of mean residual life, true asset age
•The ability to associate asset condition with the probability of failure and with mitigation actions
•Identify prescriptive options on assets
TerraEchos identifies and classifies potential security threats – miles away
Need
•More secure facilities
•A U.S. high security facility needed a physical intrusion detection system able to detect, classify, locate and track potential threats – above and below ground
Benefits
•Because the solution captures and transmits in real time, security personnel have unprecedented insight into any event – even when the disturbance is miles away – and can take appropriate action
“Helps detect life threatening conditions up to 24 hours sooner”
• Performing real-time analytics using physiological data from neonatal babies
• Continuously correlates data from medical monitors to detect subtle changes and alert hospital staff sooner
• Early warning gives caregivers the ability to proactively deal with complications
Results
• Helps detect life threatening conditions up to 24 hours sooner
• Lower morbidity and improved patient care
University of Ontario Institute of Technology (UOIT) Detects Neonatal Patient Symptoms Sooner
Capabilities Utilized
Stream Computing
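The slide does not describe how the monitor data is analyzed, so the following is only an illustrative sketch of the general idea of stream computing: each reading is checked against a rolling baseline as it arrives, instead of being stored and analyzed later. The rolling z-score rule, the function name `detect_shifts`, the window size, the threshold and the sample heart-rate values are all assumptions made for the example; they are not the method used by UOIT or IBM.

```python
from collections import deque
from statistics import mean, pstdev


def detect_shifts(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the rolling baseline.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    """
    recent = deque(maxlen=window)  # holds only the latest `window` readings
    for i, value in enumerate(readings):
        if len(recent) == window:
            baseline = mean(recent)
            spread = pstdev(recent)
            # Skip the check when the window is perfectly flat (spread == 0).
            if spread > 0 and abs(value - baseline) / spread > threshold:
                yield i, value
        recent.append(value)


# Hypothetical heart-rate samples (beats per minute), invented for illustration.
heart_rate = [140, 142, 141, 139, 140, 141, 165, 140, 142]
alerts = list(detect_shifts(heart_rate))
print(alerts)
```

Because the detector keeps only the most recent window in memory, it can run continuously on an unbounded stream. A production system would correlate several signals at once, which this single-signal sketch does not attempt.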
Future of Big Data – Cognitive computing at play
Understands natural language and human speech
Adapts and Learns from user selections and responses
Generates and evaluates hypotheses for better outcomes
Big Data and Watson
InfoSphere BigInsights
POS Data
CRM Data
Social Media
Distilled Insight
- Spending habits
- Social relationships
- Buying trends
Advanced search and analysis
Watson can consume insights from Big Data for advanced analysis
Big Data technology is used to build Watson’s knowledge base
Watson uses the Apache Hadoop open framework to distribute the workload for loading information into memory.
Watson’s Memory: approx. 200M pages of text (to compete on Jeopardy!)
DeepQA: The Technology Behind Watson
Massively Parallel Probabilistic Evidence-Based Architecture
[Diagram: Question → Question & Topic Analysis → Question Decomposition → Hypothesis Generation → Hypothesis and Evidence Scoring → Synthesis → Final Confidence Merging & Ranking → Answer & Confidence. Annotations: multiple interpretations; 100s of sources; 100s of possible answers; 1000s of pieces of evidence; 100,000s of scores from many simultaneous text analysis algorithms.]
Generates and scores many hypotheses using a combination of thousands of Natural Language Processing, Information Retrieval, Machine Learning and Reasoning algorithms.
These gather, evaluate, weigh and balance different types of evidence to deliver the answer with the best support it can find.
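The scoring and ranking stages named on this slide can be illustrated with a toy example. The sketch below is not Watson's implementation: it assumes each candidate answer gets a score between 0 and 1 from several independent scorers, and that the final confidence is a simple weighted average. The function `rank_hypotheses`, the scorer names, the weights and the evidence values are all hypothetical.

```python
def rank_hypotheses(candidates, scorers, weights):
    """Score each candidate with every scorer and rank by merged confidence.

    The merge here is a weighted average, chosen only for simplicity.
    """
    total_weight = sum(weights)
    ranked = []
    for candidate in candidates:
        scores = [scorer(candidate) for scorer in scorers]
        confidence = sum(w * s for w, s in zip(weights, scores)) / total_weight
        ranked.append((candidate, confidence))
    # Highest confidence first.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Invented evidence scores for two candidate answers.
evidence = {
    "answer A": {"type_match": 0.2, "passage_support": 0.6},
    "answer B": {"type_match": 0.9, "passage_support": 0.7},
}
scorers = [
    lambda c: evidence[c]["type_match"],       # does the answer type fit the question?
    lambda c: evidence[c]["passage_support"],  # how well do retrieved passages support it?
]
weights = [2.0, 1.0]  # hypothetical weights; a real system would learn them

ranking = rank_hypotheses(list(evidence), scorers, weights)
best_answer, best_confidence = ranking[0]
print(best_answer, round(best_confidence, 3))
```

Each candidate is scored independently of the others, so this step parallelizes naturally across cores, which fits the "massively parallel" description on the slide.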
One Jeopardy! question can take 2 hours on a single 2.6 GHz core. Optimized and scaled out on a 2,880-core IBM HPC system using UIMA-AS, Watson answers in 2-6 seconds.
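The figures on this slide can be checked with a little arithmetic. The sketch below computes the answer time that perfect linear scaling across 2,880 cores would give, and the speedup implied by the quoted 2-6 second range. Perfect scaling is an idealizing assumption, and the variable and function names are illustrative.

```python
SINGLE_CORE_SECONDS = 2 * 60 * 60   # 2 hours on a single core, per the slide
CORES = 2_880                       # core count quoted on the slide

# Time per question if the work divided perfectly across all cores.
ideal_seconds = SINGLE_CORE_SECONDS / CORES


def implied_speedup(observed_seconds):
    """Speedup over the single-core time for a given observed answer time."""
    return SINGLE_CORE_SECONDS / observed_seconds


print(ideal_seconds)                            # ideal time per question
print(implied_speedup(2), implied_speedup(6))   # speedup at each end of the range
```

The ideal figure of 2.5 seconds falls inside the quoted 2-6 second range. An answer in 2 seconds implies a speedup greater than the core count, which would have to come from the optimization the slide mentions as well as from scaling out.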
IBM Watson Oncology Advisor
IBM Confidential: References to potential future products are subject to the Important Disclaimer provided earlier in the presentation
Oncology Diagnosis and Treatment Demonstration
What can Big Data do for your organization?
Create Innovative New Products
• Social media product/brand sentiment analysis
• Brand strategy
• Market analysis
• RFID tracking & analysis
• Transaction analysis to create insight-based product/service offerings
Act on Deeper Customer Insight
• Social media customer sentiment analysis
• Promotion optimization
• Segmentation
• Customer profitability
• Click-stream analysis
• CDR processing
• Multi-channel interaction analysis
• Loyalty program analytics
• Churn prediction
Optimize your Operational Processes
• Smart grid/meter management
• Supply chain optimization
• Sales reporting
• Inventory & merchandising optimization
• Options trading
• ICU patient monitoring
• Disease surveillance
• Transportation network optimization
• Store performance
• Environmental analysis
• Experimental research
Prevent Fraud and Reduce Risk
• Multimodal surveillance
• Cyber security
• Fraud modeling & detection
• Risk modeling & management
• Regulatory reporting
Proactively Maintain your Assets
• Network analytics
• Asset management and predictive issue resolution
• Website analytics
• IT log analysis
Questions?