Date post: | 09-May-2015 |
Category: | Technology |
Upload: | aureus-analytics |
Aureus Claims Solution
Copyright 2013 RESTRICTED CIRCULATION
Implementation Challenges in Big Data Analytics
• Dr. Nilesh N. Karnik
What we will discuss
• The Challenge of BIG Data
• ADVANCED Analytics
• SOLUTIONS in the Pipeline
Big Data : Distributed Processing
OLD IDEA, NEW IDEA!
EXAMPLE 1: Task of storing books on a shelf
Simple, right?
Image source Flickr. Image copyright belongs with original artist.
EXAMPLE 1: Task of storing books on a shelf
And now?
EXAMPLE 2 : Summarizing a Report
SUMMER PROJECT REPORT
Simple, right?
EXAMPLE 2 : Summarizing a Report
And now?
EXAMPLE 3: Baking a Cake
Simple, right? And now?
Image source PINTEREST. Image copyright belongs with original artist.
Advanced Analytics
• Well-developed tool set for “small data” environments
• Challenges in Big Data environments
Advanced Analytics: MapReduce Difficulties
ITERATIVE
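To make the iterative difficulty concrete, here is a minimal sketch in plain Python, not Hadoop code. The helper `map_reduce` is a hypothetical in-memory stand-in for one cluster job, and 1-D k-means is chosen only as a familiar iterative algorithm; neither comes from the original slides. The point it illustrates is that every iteration needs its own full map, shuffle, and reduce pass over the data.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """One in-memory map -> shuffle -> reduce pass (stand-in for one cluster job)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def kmeans_1d(points, centroids, max_iters=20):
    """1-D k-means where each iteration is a separate full pass over the data."""
    passes = 0
    for _ in range(max_iters):
        def mapper(x, cs=tuple(centroids)):
            # Emit (index of the nearest centroid, point).
            nearest = min(range(len(cs)), key=lambda i: abs(x - cs[i]))
            yield nearest, x

        def reducer(_key, values):
            # New centroid = mean of the points assigned to it.
            return sum(values) / len(values)

        means = map_reduce(points, mapper, reducer)
        passes += 1
        new_centroids = [means.get(i, c) for i, c in enumerate(centroids)]
        if new_centroids == centroids:
            break  # converged: assignments no longer change the centroids
        centroids = new_centroids
    return centroids, passes


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, passes = kmeans_1d(points, [0.0, 5.0])
print(centroids, passes)
```

On a real cluster each of those passes would typically pay job start-up and data-reload costs again, which is why iterative algorithms fit the plain MapReduce model poorly.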
Advanced Analytics: MapReduce Difficulties
INCREMENTAL PROCESSING REQUIRES RESTART
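A small sketch of what "requires restart" means, using word counting as a hypothetical workload (the function names and sample documents below are illustrative assumptions, not from the slides). A batch job recomputes over every document when new data arrive, while an incremental approach folds only the new documents into saved state.

```python
from collections import Counter


def batch_count(all_docs):
    """Batch style: recompute from scratch, touching every document."""
    counts = Counter()
    for doc in all_docs:
        counts.update(doc.split())
    return counts, len(all_docs)


def incremental_count(saved_counts, new_docs):
    """Incremental style: fold only the new documents into saved state."""
    updated = saved_counts.copy()
    for doc in new_docs:
        updated.update(doc.split())
    return updated, len(new_docs)


old_docs = ["big data", "big analytics"]
new_docs = ["data pipeline"]

base, _ = batch_count(old_docs)
rerun, rerun_touched = batch_count(old_docs + new_docs)       # restart from zero
inc, inc_touched = incremental_count(base, new_docs)          # reuse earlier work
print(rerun == inc, rerun_touched, inc_touched)
```

Both routes give the same counts, but the rerun touches all documents while the incremental update touches only the new ones.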
Advanced Analytics: MapReduce Difficulties
BATCH LEARNING SCANS ALL DATA IN ONE GO
Some Solutions Data Scientists are working on
New frameworks
• E.g., HaLoop*, PrIter# (Extensions of Hadoop)
• Percolator$ (Proprietary Google framework)
* Y. Bu, B. Howe, M. Balazinska, and M. Ernst, “HaLoop: Efficient iterative data processing on large clusters”, VLDB, 2010.
# Y. Zhang, Q. Gao, L. Gao, and C. Wang, “PrIter: A distributed framework for prioritized iterative computations”, SoCC, 2011.
$ D. Peng and F. Dabek, “Large-scale incremental processing using distributed transactions and notifications”, OSDI, 2010.
Some Solutions Data Scientists are working on
Smarter algorithms / Different implementations
• Random forest
• Parallelized Stochastic Gradient Descent
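As an illustration of the second bullet, here is a minimal sketch of one common way to parallelize SGD, parameter averaging: split the data into shards, run ordinary SGD on each shard independently, then average the resulting weights. This is an assumption about which variant is meant; the slides do not specify one. The model (`y = w * x`), the noise-free toy data, the learning rate, and the function names are all hypothetical, and the shards run sequentially here where a real system would run them on separate workers.

```python
import random


def sgd_shard(shard, lr=0.05, epochs=50, seed=0):
    """Plain SGD for the model y = w * x on one shard of (x, y) pairs."""
    rng = random.Random(seed)
    data = list(shard)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # gradient of the squared error (w*x - y)^2
            w -= lr * grad
    return w


def parallel_sgd(data, n_shards=4):
    """Train one model per shard, then average the weights."""
    shards = [data[i::n_shards] for i in range(n_shards)]
    # Each call is independent of the others, so they could run on separate workers.
    weights = [sgd_shard(shard, seed=i) for i, shard in enumerate(shards)]
    return sum(weights) / len(weights)


# Toy data generated from y = 3 * x, so the weight to recover is 3.
data = [(x / 10.0, 3.0 * x / 10.0) for x in range(1, 21)]
w = parallel_sgd(data)
print(round(w, 6))
```

Each worker makes a single pass of communication (its final weight), which avoids the per-iteration job restarts that plain MapReduce would impose.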
SINGAPORE
Aureus Analytics Pte. Ltd.
17, Phillip Street, #05-01, Grand Building
Singapore (048695)
INDIA
Aureus Analytics Pvt. Ltd.
706, Powai Plaza
Hiranandani Gardens, Powai
Mumbai – [email protected]
www.aureusanalytics.com
Thank You!