© 2015 The MathWorks, Inc.
Data Analytics with MATLAB
Tackling the Challenges of Big Data
Guangyuan Yang
Application Engineer
Applications Engineering Group
MathWorks Benelux
June 11, 2015
Value of Big Data & Data Analytics
[Figure: plot matrix of MPG, Acceleration, Displacement, Weight, and Horsepower]
How to Gain Value from Your Data?
Observation → Organization → Understanding → Decisions & Design
Physical Sensors
Data → Information → Knowledge → Action
[Figure: active power (per-unit) vs. time (secs), comparing NN and measured]
Agenda
Data: Volume, Variety, Velocity
Techniques: Advanced Statistics, Machine Learning
Action: Prediction, Decision Making
Access → Explore → Prototype → Scale → Share/Deploy
Data Capabilities in MATLAB
Memory and Data Access
64-bit processors
Memory Mapped Variables
Disk Variables
Databases
Datastores
Platforms
Desktop (Multicore, GPU)
Clusters
Cloud Computing (MDCS on EC2)
Hadoop
Programming Constructs
Streaming
Block Processing
Parallel-for loops
GPU Arrays
SPMD and Distributed Arrays
MapReduce
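The streaming and block-processing constructs listed above share one idea: read the data a block at a time and keep only a running summary in memory. What follows is a minimal, language-neutral sketch in Python, not MATLAB code. The names `read_blocks` and `streaming_mean` are hypothetical, and the generator of numbers stands in for a file or database too large to load at once.

```python
from typing import Iterable, Iterator, List


def read_blocks(values: Iterable[float], block_size: int) -> Iterator[List[float]]:
    """Yield successive blocks of at most block_size values."""
    block: List[float] = []
    for value in values:
        block.append(value)
        if len(block) == block_size:
            yield block
            block = []
    if block:  # final, possibly shorter block
        yield block


def streaming_mean(values: Iterable[float], block_size: int = 4) -> float:
    """Compute the mean while holding only one block in memory at a time."""
    total = 0.0
    count = 0
    for block in read_blocks(values, block_size):
        total += sum(block)
        count += len(block)
    if count == 0:
        raise ValueError("no data")
    return total / count


# Illustrative input only: a generator, so the full data set is never materialized.
mean_value = streaming_mean((float(i) for i in range(1, 11)), block_size=4)
print(mean_value)
```

Only the running `total` and `count` outlive each block, so memory use depends on the block size rather than on the size of the data set.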
Example - Access and Organize Big Data
Datastore (records read in three blocks of four):
1503  UA  LAX  -5  -10  2356
 540  PS  BUR  13    5   186
1920  DL  BOS  10   32  1876
1840  DL  SFO   0   13   568
 272  US  BWI   4   -2   359
 784  PS  SEA   7    3   176
 796  PS  LAX  -2    2   237
1525  UA  SFO   3   -5  1867
 632  US  SJC   2   -4   245
1610  UA  MIA  60   34  1365
2032  DL  EWR  10   16   789
2134  DL  DFW  -2    6   914
Map (carrier code and last column taken from each block):
Block 1: UA 2356, PS 186, DL 1876, DL 568
Block 2: US 359, PS 176, PS 237, UA 1867
Block 3: US 245, UA 1365, DL 789, DL 914
Key-value pairs passed from Map to Reduce (largest value per carrier within each block):
UA: 2356, 1867, 1365
PS: 186, 237
DL: 1876, 914
US: 359, 245
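The records and intermediate key-value pairs in this example are consistent with a map step that keeps, for each block of four records, the largest last-column value per carrier. The sketch below reproduces that flow in plain Python; it is not MATLAB's `datastore` or `mapreduce` API. The function names are hypothetical, the meaning of the columns is not stated in the source, and the final reduce step (overall maximum per carrier) is an assumption, since the slide text shows only the intermediate pairs.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Records as they appear on the slide; only the carrier code and the last
# column are used. The meaning of the columns is not stated in the source.
RECORDS = """\
1503 UA LAX -5 -10 2356
540 PS BUR 13 5 186
1920 DL BOS 10 32 1876
1840 DL SFO 0 13 568
272 US BWI 4 -2 359
784 PS SEA 7 3 176
796 PS LAX -2 2 237
1525 UA SFO 3 -5 1867
632 US SJC 2 -4 245
1610 UA MIA 60 34 1365
2032 DL EWR 10 16 789
2134 DL DFW -2 6 914
""".splitlines()


def read_blocks(lines: List[str], block_size: int) -> List[List[str]]:
    """Stand-in for a datastore: hand out the records one block at a time."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def map_block(block: List[str]) -> List[Tuple[str, int]]:
    """Map: emit (carrier, largest last-column value) for one block."""
    best: Dict[str, int] = {}
    for line in block:
        fields = line.split()
        carrier, value = fields[1], int(fields[-1])
        best[carrier] = max(value, best.get(carrier, value))
    return list(best.items())


def reduce_values(values: List[int]) -> int:
    """Reduce (assumed, not shown on the slide): overall largest value."""
    return max(values)


# Group the map output by key, then reduce each group.
intermediate: Dict[str, List[int]] = defaultdict(list)
for block in read_blocks(RECORDS, block_size=4):
    for carrier, value in map_block(block):
        intermediate[carrier].append(value)

result = {carrier: reduce_values(values) for carrier, values in intermediate.items()}
print(dict(intermediate))
print(result)
```

Because each map call sees only one block, the same two functions would work unchanged if the blocks lived on different machines; only the grouping step has to bring values for the same key together.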
Integrate with Hadoop easily
[Diagram: Datastore on HDFS; three nodes, each holding data and running Map and Reduce tasks]
Machine Learning techniques
Machine learning uses data and produces a model to perform a task
Model
Task: Human Activity Detection
Machine Learning techniques
Type of Learning: Categories of Algorithms
Supervised Learning (develop a predictive model based on both input and output data): Classification, Regression
Unsupervised Learning (group and interpret data based only on input data): Clustering
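The difference between the two types of learning can be made concrete with a small sketch. The Python code below is illustrative only and does not use any MATLAB toolbox. The feature values and the activity labels are invented toy data, and the nearest-centroid classifier and the 1-D k-means routine are simple stand-ins chosen for brevity, not the methods used in the presentation.

```python
from statistics import mean
from typing import Dict, List, Sequence

# Toy 1-D feature values and labels, invented for illustration (not from the source).
features = [0.9, 1.1, 1.0, 4.8, 5.2, 5.0]
labels = ["sitting", "sitting", "sitting", "walking", "walking", "walking"]


def fit_centroids(x: Sequence[float], y: Sequence[str]) -> Dict[str, float]:
    """Supervised: learn one centroid per class from inputs AND known outputs."""
    classes = sorted(set(y))
    return {c: mean(v for v, label in zip(x, y) if label == c) for c in classes}


def predict(centroids: Dict[str, float], value: float) -> str:
    """Classify a new value by its nearest class centroid."""
    return min(centroids, key=lambda c: abs(centroids[c] - value))


def kmeans_1d(x: Sequence[float], k: int = 2, iterations: int = 10) -> List[int]:
    """Unsupervised: group inputs into k clusters using the inputs only."""
    if k < 2:
        raise ValueError("k must be at least 2")
    ordered = sorted(x)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    assignment = [0] * len(x)
    for _ in range(iterations):
        assignment = [min(range(k), key=lambda j: abs(v - centers[j])) for v in x]
        for j in range(k):
            members = [v for v, a in zip(x, assignment) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = mean(members)
    return assignment


centroids = fit_centroids(features, labels)
prediction = predict(centroids, 4.5)   # uses the labeled training data
clusters = kmeans_1d(features, k=2)    # never sees the labels
print(prediction, clusters)
```

The classifier returns a named class because it was trained with labels, whereas the clustering routine can only return anonymous group indices that still have to be interpreted.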
Apply Machine Learning techniques easily
Data:
3-axial Accelerometer data
3-axial Gyroscope data
Scale up your computation easily
MATLAB
Parallel Computing Toolbox
MATLAB Distributed Computing Server (MDCS)
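Scaling up a computation usually starts from a loop whose iterations do not depend on each other, so that they can be handed to a pool of workers. The sketch below shows that pattern in Python rather than MATLAB; `simulate` is a hypothetical placeholder task. Threads are used only so that the sketch runs anywhere; for CPU-bound work a pool of separate processes or cluster workers would be the closer analogue.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def simulate(parameter: int) -> int:
    """Hypothetical independent task; iterations share no state."""
    return sum(i * i for i in range(parameter))


parameters = [10, 100, 1000, 10000]

# Serial loop.
serial: List[int] = [simulate(p) for p in parameters]

# Same loop spread over a pool of workers; map() returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel: List[int] = list(pool.map(simulate, parameters))

print(parallel == serial)
```

The loop body is unchanged between the two versions, which is what makes this kind of loop a natural first candidate for parallel execution.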
Integrate with your business easily
Excel® add-ins
Desktop
MATLAB Production Server(s)
Web Server(s)
Web & Enterprise
• Royalty-free
• Encryption to protect intellectual property
STIWA Increases Total Production Output of Automation Machinery
Challenge
Apply sophisticated mathematical methods to optimize automation machinery and increase total production output
Solution
Use AMS ZPoint-CI to collect large production data sets in near real time and use MATLAB to analyze the data and identify optimal trajectories
Results
Total cycle time reduced by 30%
Large data sets analyzed in seconds
Deployment to multiple machines streamlined
“Our shopfloor management system AMS ZPoint-CI collects a huge amount of machine, process, and product data 24 hours a day. By analyzing this data immediately in MATLAB and AMS Analysis-CI we have achieved a tenfold increase in precision, a 30% reduction in total cycle time, and a significant increase in production output.”
Alexander Meisinger, STIWA
[Figure: STIWA’s shopfloor management system, based on MATLAB, AMS ZPoint-CI, and AMS Analysis-CI.]
Key takeaways
Whatever size and type of data you have,
however complex your models are,
whichever infrastructure you need to deploy to,
MATLAB can help you gain value easily.
Thank you!