
Apache Zeppelin Meetup Christian Tzolov 1/21/16

Date post: 16-Jan-2017
Upload: pivotalopensourcehub
Unified Data Analytics Platform (with Zeppelin, Ambari, Geode, SpringXD and HAWQ) by Christian Tzolov @christzolov
Transcript
Page 1: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Unified Data Analytics Platform (with Zeppelin, Ambari, Geode, SpringXD and HAWQ)

by Christian Tzolov @christzolov

Page 3: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Contents

• DEMO
• Zeppelin Interpreters
  • PSQL (to become JDBC in 0.6.x)
  • Geode
  • SpringXD
• Apache Ambari
  • Zeppelin Service
  • Geode, HAWQ and Spring XD services
  • Webpage Embedder View

Page 4: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Demo: Twitter Streams with SpringXD, Geode and HAWQ

Page 5: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Technical Stack

• Apache HDFS: Data Lake - PHD or HDP Hadoop
• Apache HAWQ: SQL on Hadoop (OLAP)
• Apache Geode: In-memory data grid (OLTP)
• Spring XD: Integration and Streaming Runtime
• Apache Ambari: Manages All Clusters
• Apache Zeppelin: Web UI for interaction with Data Systems

[Diagram: Hadoop/HDFS, Geode, HAWQ, SpringXD, Ambari, Zeppelin]

Page 6: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Spring XD

Orchestrates and automates all steps across multiple data stream pipelines

Sources: • HTTP • Tail • File • Mail • Twitter • Gemfire • Syslog • TCP • UDP • JMS • RabbitMQ • MQTT • Kafka • Reactor TCP/UDP

Processors: • Filter • Transformer • Object-to-JSON • JSON-to-Tuple • Splitter • Aggregator • HTTP Client • Groovy Scripts • Java Code • JPMML Evaluator • Spark Streaming

Sinks: • File • HDFS • JDBC • TCP • Log • Mail • RabbitMQ • Gemfire • Splunk • MQTT • Kafka • Dynamic Router • Counters

Page 7: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Apache Geode

• Cache - Performance / Consistency / Resiliency

• Region - Highly available, redundant, distributed Map

China Railway Corporation

• 5,700 train stations
• 4.5 million tickets per day
• 20 million daily users
• 1.4 billion page views per day
• 40,000 visits per second

Indian Railways

• 7,000 stations
• 72,000 miles of track
• 23 million passengers daily
• 120,000 concurrent users
• 10,000 transactions per minute

Page 8: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Apache HAWQ

• Built around a Greenplum MPP DB

• 100% ANSI SQL compliant: SQL-92/99/2003…

• ODBC and JDBC

• Hadoop Native: Parquet, HDFS and YARN

• Extensible - Web Tables, PXF

• TPC-DS benchmark: outperforms Impala by 454% overall

Page 9: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Demo

tweets = twittersearch --query=<keyword> | hdfs --directory=/user/zeppelin/xd/tweets

geodeTap = tap:stream:tweets > gemfire-json-server --regionName=regionTweet

hawqTap = tap:stream:tweets > transform --script=tweetJsonToTsv.groovy | gpfdist --table=xdsink

tweetsCount = tap:stream:tweets > json-to-tuple | transform --expression='payload.id_str' | counter
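
The four definitions above follow a "name = definition" pattern: one primary stream (tweets) and three taps that read from it. A minimal Python sketch of that structure follows. It only parses the text of the slide and does not talk to Spring XD. The helper names (`parse_definitions`, `tapped_stream`, `form_body`) are hypothetical, and the URL-encoded form body is an assumed submission format shown for illustration only.

```python
from urllib.parse import urlencode

# The four definitions from the demo slide, written as "name = definition".
DEMO = """
tweets = twittersearch --query=<keyword> | hdfs --directory=/user/zeppelin/xd/tweets
geodeTap = tap:stream:tweets > gemfire-json-server --regionName=regionTweet
hawqTap = tap:stream:tweets > transform --script=tweetJsonToTsv.groovy | gpfdist --table=xdsink
tweetsCount = tap:stream:tweets > json-to-tuple | transform --expression='payload.id_str' | counter
"""


def parse_definitions(text):
    """Split 'name = definition' lines into (name, definition) pairs."""
    pairs = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first " = " only; module options use "--key=value" without spaces.
        name, definition = line.split(" = ", 1)
        pairs.append((name.strip(), definition.strip()))
    return pairs


def tapped_stream(definition):
    """Return the name of the tapped stream, or None for a primary stream."""
    prefix = "tap:stream:"
    if not definition.startswith(prefix):
        return None
    return definition[len(prefix):].split(">", 1)[0].strip()


def form_body(name, definition):
    """URL-encode one definition as a form body (assumed format, for illustration)."""
    return urlencode({"name": name, "definition": definition})


pairs = parse_definitions(DEMO)
for name, definition in pairs:
    print(name, "taps", tapped_stream(definition))
```

Running it lists each stream name next to the stream it taps, which makes the fan-out visible: the Geode, HAWQ and counter pipelines all read from tweets without altering the primary HDFS pipeline.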

Page 10: Apache Zeppelin Meetup Christian Tzolov 1/21/16

SpringXD Interpreter(s)

• %xd.stream and %xd.job

• Multiple streams or jobs in a paragraph.

• Special Deploy/Launch Semantics

• Zeppelin Dynamic Forms (${…})

• Comprehensive Stream and Job DSL auto-completion (Ctrl+.)
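
To illustrate the Dynamic Forms bullet, here is a small Python sketch of the substitution idea: a `${name=default}` placeholder in a paragraph is replaced by the value entered in the form, or by its default. This is a simplified stand-in, not Zeppelin's implementation. It ignores select-style forms, and the form name `keyword` and default `zeppelin` are hypothetical.

```python
import re

# Matches ${name} or ${name=default}; select-style forms with option lists are not handled.
FORM = re.compile(r"\$\{(?P<name>[^=}]+)(?:=(?P<default>[^}]*))?\}")


def render(paragraph, values):
    """Replace each ${name=default} placeholder with a supplied value or its default."""
    def substitute(match):
        name = match.group("name").strip()
        default = match.group("default") or ""
        return str(values.get(name, default))
    return FORM.sub(substitute, paragraph)


# Hypothetical stream paragraph with one text-input form.
template = "tweets = twittersearch --query=${keyword=zeppelin} | hdfs --directory=/user/zeppelin/xd/tweets"
print(render(template, {}))                    # falls back to the default value
print(render(template, {"keyword": "geode"}))  # uses the value typed into the form
```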

Page 11: Apache Zeppelin Meetup Christian Tzolov 1/21/16

SpringXD Conf

Page 12: Apache Zeppelin Meetup Christian Tzolov 1/21/16

PSQL Interpreter

• Prefix: %psql.sql

• PostgreSQL, HAWQ/PXF, Greenplum … JDBC

• PSQL command line shell (via %sh)

• Zeppelin Dynamic Forms (${…})

• Comprehensive SQL/JDBC autocompletion (Ctrl+.)

Page 13: Apache Zeppelin Meetup Christian Tzolov 1/21/16

PSQL Configuration

Page 14: Apache Zeppelin Meetup Christian Tzolov 1/21/16

PSQL Doc

https://zeppelin.incubator.apache.org/docs/0.5.5-incubating/interpreter/postgresql.html

Page 15: Apache Zeppelin Meetup Christian Tzolov 1/21/16

PSQL/HAWQ Demo

• http://10.68.58.121:9995/#/notebook/2B2ZYS18Y

Page 16: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Geode Interpreter

• Prefix: %geode.oql

• OQL and PDX nested access (user.name)

• Geode command line shell (via %sh)

• Zeppelin Dynamic Forms (${…})

• Basic OQL auto-completion (Ctrl+.)
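
The "nested access (user.name)" bullet refers to reaching a field inside a nested object by a dotted path. The Python sketch below shows only that idea on plain dictionaries. It does not use Geode, OQL or PDX, and the sample records and the helper names `get_path` and `project` are hypothetical.

```python
def get_path(record, path):
    """Follow a dotted path such as 'user.name' through nested dictionaries."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # missing field: return None rather than raise
        current = current[key]
    return current


def project(records, paths):
    """Select the given dotted paths from each record, like a projection list."""
    return [tuple(get_path(r, p) for p in paths) for r in records]


# Hypothetical tweet-like entries, standing in for values stored in a region.
tweets = [
    {"id_str": "1", "text": "hello", "user": {"name": "alice", "lang": "en"}},
    {"id_str": "2", "text": "hallo", "user": {"name": "bob"}},
]
rows = project(tweets, ["id_str", "user.name", "user.lang"])
print(rows)
```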

Page 17: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Geode Configuration

Page 19: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Geode Tutorial

• http://10.68.58.121:9995/#/notebook/2AW57BUN4

Page 20: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Apache Ambari

Zeppelin, Geode, HAWQ, SpringXD Services …

Page 21: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Ambari Services

Page 23: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Ambari Blueprint

http://<ambari>:8080/api/v1/clusters/mv10?format=blueprint
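
A small Python sketch of building that export request is given here, using only the standard library. The host name and credentials are hypothetical placeholders, the use of HTTP Basic authentication is an assumption about the Ambari setup, and the request is prepared but not sent.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request


def blueprint_url(ambari_host, cluster, port=8080):
    """Build the blueprint-export URL shown on the slide for a given cluster."""
    query = urlencode({"format": "blueprint"})
    return f"http://{ambari_host}:{port}/api/v1/clusters/{quote(cluster, safe='')}?{query}"


def blueprint_request(ambari_host, cluster, user, password):
    """Prepare (but do not send) a GET request with HTTP Basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return Request(blueprint_url(ambari_host, cluster),
                   headers={"Authorization": f"Basic {token}"})


# Hypothetical host and credentials; "mv10" is the cluster name from the slide.
req = blueprint_request("ambari.example.com", "mv10", "admin", "admin")
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call against a reachable Ambari server, and the JSON response could then be saved as a blueprint file.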

Page 24: Apache Zeppelin Meetup Christian Tzolov 1/21/16

Webpage Embedder

https://github.com/tzolov/ambari-webpage-embedder-view

