Date post: | 22-Nov-2014 |
Category: | Data & Analytics |
Upload: | prasad-chitta |
August 22, 2014
Big Data & Analytics - Introduction
Faculty Development Program @BIET, Davangere
Prasad Chitta
2#FDPBigDataAnalytics
Discussion Topics
• Data & Processing – small and BIG
• Big Data, Data Science and Art
• Analytics and Optimization
Data – A historic perspective
Systems of Records
Systems of Engagement
Sensor Aggregators
Independent Providers
Data
The data processing lifecycle
Sensing
Acquiring, Validating
Storing, Transactional Update
Operational Reporting, Dashboards
ETL, Warehousing, OLAP reporting
Analytics
Archiving & Purging
Aspects of ‘Data’
Data
Meta data
Master Data, Reference Data
Integration, Migration
Quality
Visualization
Data Scenarios…
No Data
• New product design
• Simulation
• Knowledge representation
Structured Data
• From normalized OLTP systems
• Variables, mostly numbers
BIG data
• Unstructured
• Quickly varying
• Mostly alphanumeric
Processing of data
Serial: bring data to process (traditional)
Parallel: take process to data (modern)
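The contrast between bringing data to the process and taking the process to the data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the partitions, the word-count task, and the function names are hypothetical, and threads stand in for the separate nodes that a real parallel platform would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list stands in for a partition held on a separate node.
partitions = [
    ["big", "data", "analytics"],
    ["data", "science", "data"],
    ["analytics", "big", "data"],
]

def serial_count(parts):
    """Traditional style: pull every record into one process, then count."""
    all_records = [word for part in parts for word in part]
    return Counter(all_records)

def count_partition(part):
    """Work shipped to a single partition; only a small summary comes back."""
    return Counter(part)

def parallel_count(parts):
    """Modern style: run the same function next to each partition, then merge."""
    # Assumption: threads simulate nodes; a real cluster would run these remotely.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        partial_counts = list(pool.map(count_partition, parts))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total

serial_result = serial_count(partitions)
parallel_result = parallel_count(partitions)
```

Both functions produce the same counts. The difference is what moves: the serial version moves all the records, while the parallel version moves only the per-partition summaries.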
The Data Explosion
http://pennystocks.la/internet-in-real-time/
Big Data – Ingestion to insights
System of Records
ETL / ELT
Landing & Staging
Integration Store
Semantic (Logical & Physical)
In-Memory Databases
Visualization Tools & Framework
The Big Data Landscape
Analytical Processing of Data
Operational Reporting / MI
OLAP / BI / ETL
Analytics
Content (Unstructured)
Structured
Analytics
Descriptive (uni- or bivariate)
Diagnostic or Inquisitive
Discovery
Predictive
Predictive: Statistical Techniques, Machine Learning
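As a small illustration of descriptive analytics, the sketch below computes univariate summaries (mean, median, standard deviation) for one variable and a bivariate summary (Pearson correlation) for a pair of variables. The ages and premiums are made-up values and the helper name `pearson` is hypothetical; only the Python standard library is used.

```python
import statistics
from math import sqrt

# Hypothetical sample: customer ages and the annual premium each one pays.
ages = [25, 32, 47, 51, 38]
premiums = [200.0, 240.0, 330.0, 360.0, 280.0]

# Univariate: describe one variable at a time.
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
stdev_age = statistics.stdev(ages)  # sample standard deviation

def pearson(xs, ys):
    """Bivariate: Pearson correlation coefficient between two variables."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    covariance_sum = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return covariance_sum / (spread_x * spread_y)

age_premium_correlation = pearson(ages, premiums)
```

A correlation close to 1 here would only describe the sample. Explaining why the variables move together is the job of the diagnostic step, and estimating future premiums is the job of the predictive step.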
Analytics Landscape Overview
SQL Analytics: Count, Mean, OLAP
Descriptive Analytics: Univariate Distribution, Central Tendency, Dispersion
Data Mining: Association Rules, Clustering, Feature Extraction
Predictive Analytics: Classification, Regression, Forecasting
Simulation: Monte Carlo, Agent-based modeling, Discrete-event modeling
Optimization: Linear Optimization, Non-linear Optimization
Other techniques: Spatial, Machine Learning, Text Analytics
Range: BI to Advanced Analytics
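Monte Carlo simulation, listed under Simulation above, estimates a quantity by averaging many random trials. The sketch below is a minimal illustration, not a production model: the portfolio size, claim probability, and claim-cost range are hypothetical, and claim costs are assumed to be uniformly distributed purely to keep the example short.

```python
import random

def simulate_annual_claims(policies, claim_prob, min_cost, max_cost, rng):
    """One trial: total claim cost for a portfolio over one simulated year."""
    total = 0.0
    for _ in range(policies):
        if rng.random() < claim_prob:  # this policy files a claim
            total += rng.uniform(min_cost, max_cost)  # assumed uniform cost
    return total

def monte_carlo_mean(trials=5000, seed=7):
    """Average the simulated annual totals over many independent trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = [
        simulate_annual_claims(200, 0.05, 1000.0, 5000.0, rng)
        for _ in range(trials)
    ]
    return sum(totals) / trials

estimated_mean = monte_carlo_mean()

# Closed-form check: policies * claim probability * average claim cost.
analytic_mean = 200 * 0.05 * (1000.0 + 5000.0) / 2
```

For this toy model the closed-form answer is available, so the simulated estimate can be checked against it. Simulation earns its keep when the model is too complicated for a closed-form answer.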
Business Value - Analytics Matrix
Axis: Business Value
• OLAP Reporting: Drill-thru, Drill-Across
• Insights/Limited What-if
• Actionable insights
• Descriptive Modeling: Describe historical event
• Predictive Modeling: Baseline Demand, Impact of Causal Factors
• Optimization: Linear/Non-linear programming & Simulations
• Standard Reporting: Sales, Inventory, Business Performance
• Data Management: Internal, Syndicated,
Stages: Decision Support, Decision Guidance, Advanced analytics
Questions answered:
• What happened?
• Why something happened?
• What will happen?
• What is the best that can happen?
Analytics, RTBI, DSS
DSS – Decision Support Systems, RTBI – Real Time Business Intelligence
Analytics Value Chain
Focus Areas for Insurance Analytics
Marketing Analysis
• Customer Lead Management
• Campaign Management
• Channel Profitability Analysis
• Social Media Analytics
Sample KPI and Business Drivers
• Lead conversion rate
• Channel ROI or Effectiveness
• Market share for each channel
• Customer Satisfaction Index
Customer Management
• Customer Segmentation
• Customer Churn Analysis
• Lifetime Value Analytics
• Cross-sell & Up-sell Analytics
Sample KPI and Business Drivers
• Profiling of customers
• Customer Attrition/Retention Rate
• % of Repeat Business from customer
• Customer Net worth and Lifetime value
Claims Management
• Fraud Analytics & Models
• Subrogation Models
• Claims Analysis
Sample KPI and Business Drivers
• Loss due to Fraudulent claims
• Loss ratios
• Claims Process Cycle ratios
• Claims reserves and Provisions
Underwriting / Risk Management
• Risk Assessment and Evaluation
• Automated Underwriting
• Reinsurance Retention Analysis
Sample KPI and Business Drivers
• Underwriting Margins / Profit Margins
• Capacity required for Underwriters
• Improve the retentions and profit margins
Insurance Business Analytics for effective decision making by analysing the historic data
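Several of the KPIs named above are simple ratios over historic data. The sketch below shows one common way to compute three of them. The formulas are widely used definitions rather than ones given in the slides, and the function names and sample figures are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Divide, but fail loudly on an empty base instead of returning nonsense."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator

def lead_conversion_rate(converted_leads, total_leads):
    """Share of leads that became customers."""
    return safe_ratio(converted_leads, total_leads)

def retention_rate(customers_retained, customers_at_start):
    """Share of customers at period start who are still customers at period end."""
    return safe_ratio(customers_retained, customers_at_start)

def loss_ratio(incurred_claims, earned_premium):
    """Claims cost as a share of premium earned (assumed definition)."""
    return safe_ratio(incurred_claims, earned_premium)

# Hypothetical figures for one period.
conversion = lead_conversion_rate(converted_leads=30, total_leads=200)
retention = retention_rate(customers_retained=900, customers_at_start=1000)
attrition = 1 - retention
losses = loss_ratio(incurred_claims=600_000, earned_premium=1_000_000)
```

Attrition is taken here as the complement of retention, which matches the paired "Attrition/Retention Rate" KPI under this definition.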
Traditional Analytics Process
• Data Consolidation: Extracting and consolidating data from various sources and databases (DB1, DB2) into the final modeling universe
• Sampling: Generating random samples to create Development (Dev, 70%) and Validation (Val, 30%) samples
• Diagnostics: Understanding the data and the nature of the variables: distributions, relationships, differences
• Data Prep: Cleansing and preparing the data for modeling: outlier and missing-value treatment, variable transformation and derivation
• Model Building
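The sampling and data preparation steps above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: median imputation and capping at 1.5 times the interquartile range are common choices rather than methods named in the slides, and all function names and data are hypothetical.

```python
import random
import statistics

def split_dev_val(records, dev_fraction=0.7, seed=42):
    """Randomly split records into development and validation samples."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    cut = round(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]

def impute_missing(values):
    """Missing-value treatment: replace None with the median of observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def cap_outliers(values, k=1.5):
    """Outlier treatment: clip values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

# Hypothetical modeling universe of 100 record ids.
dev, val = split_dev_val(list(range(100)))

# Hypothetical variable with one missing value and one extreme value.
filled = impute_missing([1, None, 3])
capped = cap_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 1000])
```

In practice the imputation value and the capping thresholds would usually be derived from the development sample only and then applied unchanged to the validation sample, so that validation stays an honest test.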
Data Scientist - Skills needed
Business and Domain knowledge
Planning & Architecting Data Science Solutions
Statistical Modeling
Technology Stack – R, Hadoop
Text Mining, Social Network Analysis and Natural Language Processing
Methods and Algorithms in Machine Learning
Optimization and Decision Analysis
Storytelling and Visualization
Privacy, Security and Ethical Concerns
Let the data speak, do not torture data!
Thank You. You can find me on….