SACBD/ECSA - September 9th, 2019
Measuring Performance Quality Scenarios in Big Data
Analytics Applications: A DevOps and Domain-Specific
Model Approach
Cristian Camilo Castellanos¹, Carlos A. Varela², Dario Correal¹
¹Department of Systems Engineering, Universidad de los Andes, Bogotá, Colombia
²Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
Overview
● Problem: the deployment gap between DevOps and software architecture models
● Proposal: ACCORDANT, a domain-specific model
● Experimentation: detecting Near Mid-Air Collisions (NMAC)
Context Challenge Proposal Experimentation Conclusions Q&A
Deployment Gap Phenomenon
● “Despite the increasing interest in BDA adoption, actual deployments are still scarce” [1]
● “50% of companies do not have a specific data science production procedure.” [2]
● Delayed deployment of ready-to-use models (months: 31%, or years: 30%) [3]
● Incompatibility across multiple tools and communication problems. [4]
● It is not yet clear how to define and monitor different QoS in BDA applications [5]
[1] Chen, Kazman & Matthes (2015). Demystifying Big Data Adoption: Beyond IT Fashion and Relative Advantage.
[2] Dataiku (2017). Building Production-Ready Predictive Analytics.
[3] Rexer, K., Gearan, P., & Allen, H. (2016). 2015 Data Science Survey.
[4] Rexer, K., Gearan, P., & Allen, H. (2016). 2015 Data Science Survey.
[5] Ranjan, R. (2014). Streaming Big Data Processing in Datacenter Clouds.
And what if I need multiple iterations and configurations?
Big Data Analytics (BDA) Development
Business
● Real-time NMAC (Near Mid-Air Collision) service
● Response time ≤ 3 s
● Decision tree model
● Filtering and cleaning
● Modeling and evaluation
IT Architecture
● Latency < 3 s
● Kafka, Python, Spark
● Cloud vs. fog computing
● Lambda Architecture
From functional requirements and quality scenarios (QS), through data science/analytics and modeling, to deployment and monitoring: the deployment gap ("months: 31%, or years: 30%").
How can the big data analytics deployment gap be reduced by specifying and measuring quality scenarios and by speeding up their deployment and performance monitoring?
ACCORDANT
An exeCutable arChitectural mOdel foR big Data ANalyTics
1. Strategy (DSM and DevOps)
2. Proposal Process
1- Proposal Strategy
ACCORDANT: A Domain-Specific Model and DevOps Approach
A Domain-Specific Model
IT architecture concerns (latency < 3 s; Kafka, Python, Spark; cloud vs. fog computing; Lambda Architecture) are captured in a Domain-Specific Language (DSL) through two views:
● Functional Viewpoint Model
● Deployment Viewpoint Model
Automatic code generation:
● Software components
● Infrastructure as Code
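The Infrastructure as Code generation step can be illustrated with a minimal model-to-text sketch; this is a hedged illustration only, assuming a hypothetical dictionary of deployment attributes (the names are illustrative, not ACCORDANT's actual metamodel, which uses model transformations):

```python
from string import Template

# Hypothetical deployment attributes captured by the DSL; names are
# illustrative, not the real ACCORDANT metamodel.
deployment = {"name": "spark-worker-ex", "image": "ramhiser/spark:2.0.1",
              "replicas": 3, "port": 8081, "cpu": "0.25"}

# Abridged template for a Kubernetes Deployment manifest.
K8S_TEMPLATE = Template("""\
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: $replicas
  template:
    spec:
      containers:
      - name: $name
        image: $image
        ports:
        - containerPort: $port
        resources:
          requests:
            cpu: $cpu
""")

# Instantiate the template with the model's attribute values.
manifest = K8S_TEMPLATE.substitute(deployment)
```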
ACCORDANT Metamodel
● Functional Viewpoint [metamodel figure]
● Deployment Viewpoint [metamodel figure]
ACCORDANT DSL Example
2 - Proposal Process
BDA Deployment Process (Requirements → Development → Deployment → Operation)
1. Quality Scenarios: the user defines QS from business requirements.
2. Models and Transformations: the data scientist designs the analytics models (imported as PMML).
3. Software Architecture: the software architect designs the architecture, guided by the ACCORDANT metamodel.
4. Integration
5. Code Generation
6. Code Execution: the deployed BDA solution is monitored in operation.
Process Overview
Deployment Gap
● Specify performance QS integrated with the software architecture.
● Speed up BDA deployment and monitoring.
Avionics BDA Deployment
Business (FAA, private pilots)
● Real-time NMAC (Near Mid-Air Collision) service
● Response time ≤ 3 s
● Decision tree model
● Filtering and cleaning
● NMAC detection model
IT Architecture
● Latency < 3 s
● Kafka, Python, Spark
● Cloud vs. fog computing
● Lambda Architecture
Data: ADS-B. The avionics data scientist builds the model; functional requirements and quality scenarios (QS) must cross the deployment gap ("months: 31%, or years: 30%") to reach deployment and monitoring.
Experimentation in Avionics
● Feasibility using avionics use cases
○ UC1: Near Mid-Air Collision analysis for route planning.
○ UC2: Near Mid-Air Collision detection in operation.
● Deployment effort
○ Time
○ Lines of code (complexity)
https://wcl.cs.rpi.edu/
Business: Data Collection for 2, 20, and 200 nmi around JFK
● 2 nmi: 13,328 compares
● 20 nmi: 656,177 compares
● 200 nmi: 18,899,217 compares
Data source: ADS-B Exchange (Automatic Dependent Surveillance–Broadcast).
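The growth in compares follows from checking every aircraft pair: n aircraft yield n(n-1)/2 comparisons per snapshot, so widening the radius grows the work quadratically. A minimal sketch, assuming hypothetical separation thresholds (the deployed detector is the trained decision tree, not this rule):

```python
from itertools import combinations
from math import hypot

# Hypothetical NMAC separation thresholds; assumptions for illustration only.
H_SEP_NMI = 0.5   # horizontal separation (nmi)
V_SEP_FT = 500.0  # vertical separation (ft)

def is_close(a, b):
    """a, b: (x_nmi, y_nmi, alt_ft) in a local flat projection around JFK."""
    horizontal = hypot(a[0] - b[0], a[1] - b[1])
    vertical = abs(a[2] - b[2])
    return horizontal < H_SEP_NMI and vertical < V_SEP_FT

def scan(aircraft):
    """Check every pair once: n*(n-1)/2 compares, hence the quadratic growth
    as the collection radius (and traffic count) increases."""
    pairs = list(combinations(aircraft, 2))
    hits = [p for p in pairs if is_close(*p)]
    return len(pairs), hits

# 3 aircraft -> 3 pairwise compares
n_compares, hits = scan([(0.0, 0.0, 10000.0),
                         (0.1, 0.2, 10200.0),
                         (5.0, 5.0, 11000.0)])
```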
Data Scientist: Build Analytics Model for NMAC Detection
ADS-B Exchange data → trained model exported as Dtree.pmml
IT Architect: Define Software Architecture of Two Use Cases
● UC1: deadline < 3600 s
● UC2: latency < 3 s
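Checking a response against the UC2 latency scenario amounts to wrapping one request with a timer. A minimal sketch (a hedged illustration; in the actual approach the measurements come from the generated tracing code, and the stand-in classifier below is hypothetical):

```python
import time

UC2_LATENCY_S = 3.0  # quality scenario: latency < 3 s

def measure(fn, *args):
    """Time one response and check it against the latency scenario."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed, elapsed < UC2_LATENCY_S

# e.g. scoring one ADS-B record with a hypothetical stand-in classifier:
result, elapsed, ok = measure(lambda record: "no-NMAC", {"alt": 10000})
```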
IT Architect: Define Deployment Strategies
● Functional view (deadline < 3600 s)
● Technology assignments
● Deployments
IT Architect: Specify Functional and Deployment Models
ACCORDANT: Automatic Code Generation
// Load the PMML model and build a Spark ML Transformer (JPMML-Evaluator)
Evaluator evaluator = EvaluatorUtil.createEvaluator(new File("Dtree.pmml"));
TransformerBuilder pmmlTransformerBuilder = new TransformerBuilder(evaluator)
    .withTargetCols()
    .withOutputCols()
    .exploded(false);
// Input schema for the incoming ADS-B records
List<StructField> fields = new ArrayList<StructField>();
fields.add(DataTypes.createStructField("a", DataTypes.IntegerType, true));
...
fields.add(DataTypes.createStructField("sz_norm", DataTypes.FloatType, true));
StructType schema = DataTypes.createStructType(fields);
Transformer pmmlTransformer = pmmlTransformerBuilder.build();
Logging.traceMetrics(Logging.DEADLINE, timestamp); // TRACING
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: spark-worker-ex
        image: ramhiser/spark:2.0.1
        ports:
        - containerPort: 8081
        resources:
          requests:
            cpu: 0.25
ACCORDANT XMI
ACCORDANT: Monitoring application operation
ACCORDANT XMI
QS Monitoring of UC1
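The traced metrics can be checked against the QS offline. A minimal sketch, assuming hypothetical (metric, timestamp) trace records like those emitted by the generated Logging.traceMetrics calls (the event names and values below are assumptions, not real measurements):

```python
# Hypothetical trace records: (metric_name, timestamp_in_seconds).
trace = [("DEADLINE_START", 100.0), ("DEADLINE_END", 2580.0)]

UC1_DEADLINE_S = 3600.0  # UC1 quality scenario: deadline < 3600 s

def deadline_elapsed(events):
    """Elapsed seconds between the start and end trace events."""
    times = dict(events)
    return times["DEADLINE_END"] - times["DEADLINE_START"]

elapsed = deadline_elapsed(trace)
satisfied = elapsed < UC1_DEADLINE_S  # 2480.0 s -> scenario satisfied
```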
QS Monitoring of UC2
[Latency plots for the 2 nmi and 20 nmi radii]
Results
● Deployment gap reduction: -57.3%, -73.47%, -32.86%
● Speed up of BDA deployment and monitoring iterations.
Contributions
● A DSM and DevOps approach to formalize and accelerate BDA solution development and deployment using FV and DV.
● A performance metrics specification and monitoring approach.
● An evaluation on avionics use cases with different deployment strategies and quality scenarios.
We believe this work is a step forward towards reducing the deployment gap!
Future Work
● Train models to predict performance behavior.
● Verification of architectural properties.
Open Challenges
● Design vs. development effort.
● Adoption in other industry cases.
● Other deployment paradigms, such as serverless or fog computing.
Q & A Session
Thanks!!!