Date posted: 21-Apr-2017
Category: Data & Analytics
Uploaded by: john-evans
PNDA
• Volume of network data into terabytes
• Siloed data limits ability to perform correlation and causal analysis
• Relational databases limit the ability to …
• Application of big data analytics to the network dataset is key to providing both real-time and historical insights
• Data science is driving the bifurcation of the OSS stack
Network data is becoming a big data problem
3-fold increase in total IP traffic
>60% increase in devices and connections
Telemetry data streamed in near real-time
Source: Cisco VNI/GCI Global IP Traffic Forecast
What changes?
Aspect | PMO (present mode of operation) | FMO (future mode of operation)
Orientation | Single domain | Cross domain
Realisation | Small data, tool driven | Big data, data driven
Data collection | Polled | Streamed
Data aggregation and analysis | Coupled | Decoupled
Domain data schema | Schema-on-write | Schema-on-read
Analysis | Prescriptive | Prescriptive + Stochastic + ML
Customisation | Design time | Run time
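The schema-on-write versus schema-on-read row can be illustrated with a small sketch. It is illustrative only: the record fields (`host`, `in_octets`, `cpu`) and the helper names are hypothetical and not part of PNDA. Raw records are stored untouched, and each consumer applies its own schema when it reads.

```python
import json

# Raw store: records are appended as received, with no schema enforced on write.
raw_store = []

def ingest(raw_line: str) -> None:
    """Keep the raw payload exactly as it arrived."""
    raw_store.append(raw_line)

def read_with_schema(schema: dict) -> list:
    """Apply a schema (field name -> type converter) only when reading.

    Records that lack a requested field, or whose value cannot be
    converted, are skipped at read time instead of rejected at ingest.
    """
    rows = []
    for line in raw_store:
        record = json.loads(line)
        try:
            rows.append({name: cast(record[name]) for name, cast in schema.items()})
        except (KeyError, TypeError, ValueError):
            continue
    return rows

ingest('{"host": "r1", "if_name": "ge-0/0/0", "in_octets": "1200"}')
ingest('{"host": "r2", "cpu": 17}')  # different shape, still stored

# Two consumers read the same raw data through different schemas.
traffic = read_with_schema({"host": str, "in_octets": int})
cpu = read_with_schema({"host": str, "cpu": int})
```

With schema-on-write, the second record would have been rejected or forced into the first record's shape at ingest. Here the decision is deferred to whichever application reads the data.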
• Tight coupling of data aggregation/store/analysis
• Multiple analytics pipelines implemented from open source components
• Common design patterns: ~75% of effort wasted / duplicated
• Silos limit the potential of big data analytics and lead to industry divergence
Today’s siloed analytics pipelines
[Diagram labels. Stages: Data sources, Data aggregation, Data store, Data analysis, Query, Outputs. Data sources: Telemetry, Metrics, Logs. Components: Kafka, StreamSets, NiFi, Storm, Spark Streaming, HDFS, HBase, MapR, Impala. Outputs: Dashboard & Reporting.]
What is PNDA?
PNDA brings together a number of open source technologies to provide a simple, scalable open big data analytics Platform for Network Data Analytics
Linux Foundation Collaborative Project based on the Apache ecosystem
• Simple, scalable open data platform
• Provides a common set of services for developing analytics applications
• Accelerates the process of developing big data analytics applications whilst significantly reducing the TCO
• PNDA provides a platform for convergence of network data analytics
[Architecture diagram labels.
PNDA plugins: ODL, Logstash, OpenBMP, pmacct, XR Telemetry
APIs: PNDA Producer API, PNDA Consumer API
Platform: Real-time Data Distribution; File Store; Stream and Batch Processing; SQL Query; OLAP Cube; Search/Lucene; NoSQL; Time Series
Platform Services: Installation, Mgmt, Security, Data Privacy; App Packaging and Mgmt
PNDA Applications: PNDA managed apps, unmanaged apps
Query, Visualisation and Exploration: Data Exploration, Metric Visualisation, Event Visualisation]
• Horizontally scalable platform for analytics and data processing applications
• Support for near-real-time stream processing and in-depth batch analysis on massive datasets
• PNDA decouples data aggregation from data analysis
• Consuming applications can be either platform apps developed for PNDA or client apps integrated with PNDA
• Client apps can use one of several structured query interfaces or consume streams directly.
• Leverages best current practice in big data analytics
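The decoupling of data aggregation from data analysis can be sketched with a minimal in-memory publish/subscribe log. This is a toy stand-in for a distribution layer such as Kafka, not PNDA code. The `TopicLog` class, the topic name and the group names are hypothetical.

```python
from collections import defaultdict

class TopicLog:
    """Append-only log per topic; each consumer group tracks its own offset."""

    def __init__(self):
        self._logs = defaultdict(list)    # topic -> list of messages
        self._offsets = defaultdict(int)  # (group, topic) -> next index to read

    def publish(self, topic, message):
        self._logs[topic].append(message)

    def poll(self, group, topic):
        """Return messages this group has not yet seen and advance its offset."""
        start = self._offsets[(group, topic)]
        messages = self._logs[topic][start:]
        self._offsets[(group, topic)] = start + len(messages)
        return messages

log = TopicLog()

# Aggregation side: publishes without knowing who will consume.
log.publish("telemetry", {"host": "r1", "cpu": 40})
log.publish("telemetry", {"host": "r1", "cpu": 90})

# Analysis side: two independent applications read the same stream.
alerts = [m for m in log.poll("alerting", "telemetry") if m["cpu"] > 80]
archive = log.poll("archiver", "telemetry")
```

Because each consumer keeps its own read position, a new analytics application can be added later without touching the producers or the other consumers.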
Why PNDA?
There are a bewildering number of big data technologies out there, so how do you decide what to use?
We've evaluated and chosen the best tools, based on technical capability and community support.
PNDA combines them to streamline the process of developing data processing applications.
PNDA Technologies
Why PNDA?
Innovation in the big data space is extremely rapid, but combining multiple technologies into an end-to-end solution can be extremely complex and time-consuming
PNDA removes this complexity and allows you to focus on developing the analytics applications, not on developing the pipeline – significantly reducing the effort required and time-to-insight
PNDA Software Components
• Platform for data aggregation, distribution, processing and storage
• Automated installation, creation, and configuration
  • OpenStack, AWS and bare metal
  • Typical install ~1hr
  • Modular install
• Open producer and consumer APIs
  • Avro platform schema
• Plugins for Logstash, pmacct, OpenBMP, OpenDaylight, Cisco XR-telemetry, bulk ingest …
• Data distribution: Apache Kafka
• Data store:
  • Automated data partitioning and storage (HDFS)
  • OpenTSDB: time series analysis
  • HBase: NoSQL
• Support for batch and stream processing:
  • Apache Spark and Spark Streaming
• Jupyter notebook server for app prototyping and data exploration
• Impala-based SQL query support
• Grafana for time series visualisation
• PNDA application packaging
• PNDA management and dashboard
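As a rough illustration of what automated data partitioning can look like, the sketch below derives a storage directory from a record's source and timestamp. The layout (`source=…/year=…/month=…/day=…/hour=…`), the root path and the function name are assumptions made for this example. They are not taken from the slides and may differ from what PNDA actually writes to HDFS.

```python
from datetime import datetime, timezone

def partition_path(root: str, source: str, epoch_ms: int) -> str:
    """Build a directory path partitioned by source and by UTC hour."""
    ts = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return (f"{root}/source={source}/year={ts.year}/month={ts.month:02d}"
            f"/day={ts.day:02d}/hour={ts.hour:02d}")

# 1492768800000 ms is 2017-04-21 10:00:00 UTC.
path = partition_path("/datasets", "netflow", 1492768800000)
# path == "/datasets/source=netflow/year=2017/month=04/day=21/hour=10"
```

Partitioning by source and time lets batch jobs and SQL queries read only the directories that match their filter instead of scanning the whole dataset.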
PNDA 3.4 Capabilities
• The PNDA console provides a dashboard across all components in a cluster
• Inbuilt platform test agents verify the operation of all components
• Active platform testing verifies the end-to-end data pipeline
PNDA Console
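The idea behind active end-to-end testing can be sketched as a canary check: inject a uniquely tagged probe at the ingest end and confirm it appears at the output. This is a generic illustration, not PNDA's test agent. `run_canary` and the stand-in pipeline are hypothetical.

```python
import uuid

def run_canary(publish, read_back, attempts: int = 3) -> bool:
    """Inject a uniquely tagged probe and look for it at the far end.

    `publish` sends one message into the pipeline; `read_back` returns
    the messages currently visible at the output.
    """
    marker = f"canary-{uuid.uuid4()}"
    publish(marker)
    for _ in range(attempts):
        if marker in read_back():
            return True
    return False

# Stand-in pipeline: a list plays the role of ingest, transport and storage.
store = []
healthy = run_canary(store.append, lambda: list(store))

# A pipeline that silently drops messages fails the check.
broken = run_canary(lambda msg: None, lambda: list(store))
```

A real check would wait between attempts and read from the actual data store. The sketch omits both to stay self-contained.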
• Ingested data should be encapsulated in PNDA Avro schema and published on a pre-defined Kafka topic or set of topics
Publishing Data to PNDA
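The encapsulation step can be sketched with a hand-rolled Avro binary encoder using only the standard library. The field list used here (`timestamp` long, `src` string, `host_ip` string, `rawdata` bytes) is an assumption based on recollection of the PNDA guide and should be checked against the platform schema. The function names are hypothetical. The encoding rules themselves (zigzag varint longs, length-prefixed strings and bytes, record fields concatenated in schema order) follow the Avro specification.

```python
import time

# Assumed field order for the platform schema (verify against the PNDA guide):
# timestamp (long), src (string), host_ip (string), rawdata (bytes).

def encode_long(n: int) -> bytes:
    """Avro long: zigzag-encode, then emit as a base-128 varint (low bits first)."""
    z = (n << 1) ^ (n >> 63)  # valid for values in the signed 64-bit range
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # set continuation bit
        z >>= 7
    out.append(z)
    return bytes(out)

def encode_bytes(b: bytes) -> bytes:
    """Avro bytes: length as a long, followed by the raw bytes."""
    return encode_long(len(b)) + b

def encode_string(s: str) -> bytes:
    """Avro string: length-prefixed UTF-8."""
    return encode_bytes(s.encode("utf-8"))

def encode_event(timestamp_ms: int, src: str, host_ip: str, rawdata: bytes) -> bytes:
    """Avro record: field encodings concatenated in schema order."""
    return (encode_long(timestamp_ms) + encode_string(src)
            + encode_string(host_ip) + encode_bytes(rawdata))

payload = encode_event(int(time.time() * 1000), "syslog", "192.0.2.1", b"<14>link up")
# `payload` would then be sent as the message value to the agreed Kafka topic.
```

In practice an Avro library and a Kafka client would replace the hand-written encoder. The sketch only shows what "encapsulated in the PNDA Avro schema" means at the byte level: the original payload travels unchanged in `rawdata`, wrapped with a timestamp and source metadata.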
PNDA Plugins
Data Type | Data Aggregator | Data Aggregator Reference | PNDA Producer Plugin Reference
BGP (inc. BGP LS) | OpenBMP | http://www.openbmp.org/#!index.md#Using_Kafka_for_Collector_Integration | http://pnda.io/pnda-guide/producer/openbmp.html
BGP | PMACCT (BGP listener) | http://www.pmacct.net/ | http://pnda.io/pnda-guide/producer/pmacct.html
Bulk Ingest | PNDA Bulk Ingest Tool | - | http://pnda.io/pnda-guide/bulkingest/
ISIS | PMACCT (ISIS listener) | http://www.pmacct.net/ | http://pnda.io/pnda-guide/producer/pmacct.html
Cisco XR streaming telemetry | Pipeline | https://github.com/cisco/bigmuddy-network-telemetry-collector | -
CollectD (CollectD supports multiple plugins as listed here https://collectd.org/wiki/index.php/Table_of_Plugins) | Logstash | https://www.elastic.co/guide/en/logstash/current/plugins-codecs-collectd.html | http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/
IoT sensor via HTTP | Node-RED | https://nodered.org | -
Logstash (Logstash supports multiple plugins as listed here https://www.elastic.co/guide/en/logstash/current/input-plugins.html) | Logstash | - | http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/
NETCONF Notifications | ODL | http://www.opendaylight.org/ | http://pnda.io/pnda-guide/producer/opendl.html
Netflow / IPFIX | Logstash | https://www.elastic.co/guide/en/logstash/current/plugins-codecs-netflow.html | http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/
Netflow / IPFIX / sFlow | pmacct | http://www.pmacct.net/ | http://pnda.io/pnda-guide/producer/pmacct.html
OpenStack | Work in progress | - | -
sFlow | Logstash | https://github.com/ashangit/logstash-codec-sflow | http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/
SNMP Metrics and Traps | ODL | https://wiki.opendaylight.org/view/SNMP_Plugin:Getting_Started | http://pnda.io/pnda-guide/producer/opendl.html
SNMP Traps | Logstash | https://www.elastic.co/guide/en/logstash/current/plugins-inputs-snmptrap.html | http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/
Syslog | Logstash | https://www.elastic.co/guide/en/logstash/current/plugins-inputs-syslog.html | http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/
Syslog (RFC3164 or RFC5424 - needed for newer IOS / IOS XR / NX OS etc.) | Logstash | https://gist.github.com/donaldh/89b7304981f96497c94fe4d98bb03d71 | http://pnda.io/pnda-guide/repos/prod-logstash-codec-avro/
Design time vs. runtime
[Diagram labels: pico, standard]
BGP Analytics Pipeline
[Diagram labels: BGP, BMP, OpenBMP Collector, Logstash; PNDA Cluster: Kafka, Gobblin, HDFS, Spark, OpenTSDB, Impala; BGP Data Service; UI]
PNDA Applied to NFV
[Diagram labels: Infrastructure; Analytics; Data Sources (Alerts, Metrics, Telemetry, Logs); Data Aggregators; Open Data Platform (PNDA); Analytics Applications (Open Source, Custom, Licensed); Inventory; Orchestration; NFVO, VNFM, VIM, NFVI, VNF; Data Center, Core, Aggregation, Access, User; State Data; Context, Network, Control; "Related as loosely coupled systems"]
Convergence of network data analytics
Operational Intelligence
Planning Intelligence
Security Intelligence
• PNDA 3.5
  • Elasticsearch integration
  • CentOS / RHEL
  • Offline install
• Future
  • Apache Kylin
  • Apache Ambari
  • Containerisation
  • Deep-learning framework
  • Red PNDA: the smallest PNDA yet!
What’s coming?
Come and join us!