[Spark meetup] Spark Streaming Overview

Date post: 14-Jul-2015
Upload: stratio

SPARK STREAMING OVERVIEW

Spark SQL

Spark Streaming

MLlib (machine learning)

GraphX (graph)

• Kafka decouples information producers from consumers: producers are never blocked, and they do not need to know who the final consumers are.

• Each consumer keeps control of its own read offset.

• On-demand topic creation.
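
The three points above can be illustrated with a small in-memory model. This is a plain-Python sketch, not the Kafka client API: `Broker`, `Consumer`, `send`, `poll` and `seek` are hypothetical names chosen for illustration, and a single list stands in for one topic partition.

```python
from collections import defaultdict


class Broker:
    """In-memory stand-in for a broker: one append-only log per topic."""

    def __init__(self):
        # defaultdict creates a topic the first time it is used (on-demand creation).
        self.topics = defaultdict(list)

    def send(self, topic, message):
        # Producers only append; they never wait for, or learn about, consumers.
        self.topics[topic].append(message)


class Consumer:
    """Each consumer tracks its own read position in the shared log."""

    def __init__(self, broker, topic):
        self.log = broker.topics[topic]
        self.offset = 0

    def poll(self, max_messages=10):
        batch = self.log[self.offset:self.offset + max_messages]
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        # Rewinding or skipping is purely a consumer-side decision.
        self.offset = offset


broker = Broker()
for event in ["a", "b", "c"]:
    broker.send("events", event)

fast = Consumer(broker, "events")
slow = Consumer(broker, "events")
fast_batch = fast.poll()
slow_batch = slow.poll(max_messages=1)
```

Because the offset lives in the consumer, `fast` and `slow` read the same log at different speeds without affecting each other or the producer.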

• ETL and ELT, wide catalog of sources and sinks

• Flexible design of topologies and agent deployment strategies.

• Data transformation, thanks to interceptors.
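
As a rough illustration of the interceptor idea, the sketch below passes each event through a chain of small functions before it would reach a sink. It assumes an event is a dictionary with `headers` and `body`; the function names and the `"demo"` header value are hypothetical, and this is not the API of any particular ingestion tool.

```python
def drop_empty(event):
    """Filter-style interceptor: discard events with a blank body."""
    return event if event["body"].strip() else None


def tag_source(event):
    """Enrichment-style interceptor: add a header without mutating the input."""
    return {**event, "headers": {**event["headers"], "source": "demo"}}


def run_interceptors(event, interceptors):
    """Apply interceptors in order; stop as soon as one drops the event."""
    for intercept in interceptors:
        event = intercept(event)
        if event is None:
            return None
    return event


chain = [drop_empty, tag_source]
kept = run_interceptors({"headers": {}, "body": "GET /index"}, chain)
dropped = run_interceptors({"headers": {}, "body": "   "}, chain)
```

Keeping each interceptor as a small, independent step is what makes the chain easy to reorder or extend.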

readClob, readCSV, readLine, readMultiLine, readAvro, readJson

addCurrentTime, addLocalHost, geoIP, findReplace, Split

generateUUID, decompress, If, extractJsonPaths, detectMimeType

xquery, extractURIComponents, xslt, Grok (regular expressions)

exec

spooling

logger

CASSANDRA

Kafka

STRATIO DEEP


Shark (SQL)

Spark Streaming

MLlib (machine learning)

GraphX (graph)

RDD, what is that?

Spark Streaming: Overall view

Discretized Stream or DStream.

Overall view

Input DStreams and Receivers.

• Basic (distributed with Spark Streaming).

• Advanced (available as dependency).

Basic sources

• File Stream.

• Sockets.

• Actors (Akka).

• Queue RDDs (Testing).

Advanced sources

Do It Yourself

• Code onStart()

• Code onStop()

• Code receive()

• Custom Receiver ready!
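
The lifecycle in these steps can be sketched as follows. Real custom receivers extend Spark's Scala/Java `Receiver` class; the class below is only a plain-Python stand-in for the control flow, so `SketchReceiver`, `on_start`, `on_stop`, `wait_until_drained` and the list-backed `store` are all hypothetical names. The key idea it tries to show is that the start hook returns immediately and the actual reading happens in a background thread.

```python
import threading


class SketchReceiver:
    """Plain-Python stand-in for the onStart / onStop / receive lifecycle."""

    def __init__(self, source):
        self.source = source          # any iterable of records (assumption)
        self.stored = []              # stands in for handing data to Spark
        self._stopped = threading.Event()
        self._thread = None

    def on_start(self):
        # Must not block: delegate the reading loop to a background thread.
        self._thread = threading.Thread(target=self.receive, daemon=True)
        self._thread.start()

    def on_stop(self):
        # Signal the loop to finish and wait briefly for the thread to exit.
        self._stopped.set()
        if self._thread is not None:
            self._thread.join(timeout=1.0)

    def receive(self):
        for record in self.source:
            if self._stopped.is_set():
                break
            self.store(record)

    def store(self, record):
        self.stored.append(record)

    def wait_until_drained(self, timeout=1.0):
        # Demo-only helper: the source here is finite, so the thread ends by itself.
        self._thread.join(timeout=timeout)


receiver = SketchReceiver(["line 1", "line 2", "line 3"])
receiver.on_start()
receiver.wait_until_drained()
receiver.on_stop()
```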

• map(func), flatMap(func), filter(func), count()

• repartition(numPartitions)

• union(otherStream)

• reduce(func), countByValue(), reduceByKey(func, [numTasks])

• join(otherStream, [numTasks]), cogroup(otherStream, [numTasks])

• transform(func)

• updateStateByKey(func)

• window(windowLength, slideInterval)

• countByWindow(windowLength, slideInterval)

• reduceByWindow(func, windowLength, slideInterval)

• reduceByKeyAndWindow(func, windowLength, slideInterval, [numTasks])

• countByValueAndWindow(windowLength, slideInterval, [numTasks])

• print()

• foreachRDD(func)

• saveAsObjectFiles(prefix, [suffix])

• saveAsTextFiles(prefix, [suffix])

• saveAsHadoopFiles(prefix, [suffix])
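
To make the difference between per-batch and windowed operations concrete, the sketch below models a DStream as a plain list of micro-batches and mimics `countByValue()` and `countByValueAndWindow(windowLength, slideInterval)`. It is a simulation in plain Python, not Spark code: the sample data is invented, and window length and slide interval are expressed in number of batches rather than in time.

```python
from collections import Counter

# One inner list per batch interval (invented sample data).
batches = [
    ["spark", "kafka", "spark"],
    ["flume", "spark"],
    ["kafka"],
    ["spark", "kafka"],
]


def count_by_value(batch):
    """Per-batch operation: looks only at the records of a single batch."""
    return dict(Counter(batch))


def count_by_value_and_window(batches, window_length, slide_interval):
    """Every `slide_interval` batches, count over the last `window_length` batches."""
    results = []
    for end in range(slide_interval, len(batches) + 1, slide_interval):
        window = batches[max(0, end - window_length):end]
        totals = Counter()
        for batch in window:
            totals.update(batch)
        results.append(dict(totals))
    return results


per_batch = [count_by_value(batch) for batch in batches]
windowed = count_by_value_and_window(batches, window_length=3, slide_interval=2)
```

With a window of 3 batches sliding every 2, the second batch contributes to both windows, which is why windowed operations need the data of earlier batches to stay available.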

Checkpointing is required for:

• Stateful transformations (updateStateByKey, reduceByKeyAndWindow).

• Fault tolerance, to recover when the driver crashes.

HDFS (or another fault-tolerant, Hadoop-compatible file system) is mandatory if you are going to use operations that require checkpointing.
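
A minimal sketch of why stateful transformations depend on checkpointing: the state after each batch is derived from the state before it, so it has to be saved somewhere durable to survive a restart. The code is plain Python that imitates the shape of an `updateStateByKey` update function; the helper names, the sample batches and the JSON file in a temporary directory (standing in for a checkpoint directory) are assumptions for illustration, and a real Spark checkpoint stores considerably more than this.

```python
import json
import os
import tempfile


def update_count(new_values, running_count):
    """Update function: values seen in this batch plus the previous state."""
    return sum(new_values) + (running_count or 0)


def update_state_by_key(batch, state, update_func):
    """Apply update_func to every key seen now or in earlier batches."""
    grouped = {}
    for key, value in batch:
        grouped.setdefault(key, []).append(value)
    return {
        key: update_func(grouped.get(key, []), state.get(key))
        for key in set(state) | set(grouped)
    }


checkpoint_path = os.path.join(tempfile.mkdtemp(), "state.json")
state = {}
history = []
for batch in [[("a", 1), ("b", 1)], [("a", 1)], [("b", 1), ("b", 1)]]:
    state = update_state_by_key(batch, state, update_count)
    history.append(dict(state))
    with open(checkpoint_path, "w") as handle:
        json.dump(state, handle)  # persist state after every batch

# Simulated restart: the in-memory state is gone, the checkpoint is not.
with open(checkpoint_path) as handle:
    recovered = json.load(handle)
```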

Configuration parameters

• spark.streaming.receiver.maxRate

• spark.streaming.concurrentJobs

• spark.streaming.receiver.writeAheadLog.enable

• spark.streaming.unpersist
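
One way to pass these parameters is as `--conf key=value` pairs on the `spark-submit` command line. The sketch below only builds that argument list in plain Python; the values are illustrative assumptions, not recommendations, and the write-ahead-log key is spelled `writeAheadLog` as in the Spark configuration documentation.

```python
# Illustrative values only (assumptions, not tuning advice).
settings = {
    "spark.streaming.receiver.maxRate": "1000",             # records/second per receiver
    "spark.streaming.concurrentJobs": "1",                  # streaming jobs run at once
    "spark.streaming.receiver.writeAheadLog.enable": "true",
    "spark.streaming.unpersist": "true",                    # let Spark drop old RDDs
}


def to_submit_args(settings):
    """Turn a settings dict into a flat list of spark-submit arguments."""
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", f"{key}={value}"]
    return args


submit_args = to_submit_args(settings)
```

The resulting list could be appended to a `spark-submit` invocation, for example through `subprocess.run`, ahead of the application jar or script.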

Each node has mutable state, and for each record it has to update that state and send new records.
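
The record-at-a-time model described here can be sketched in a few lines of plain Python. `CounterNode` and `on_record` are hypothetical names, and the callback passed as `emit` stands in for the network link to the next node.

```python
class CounterNode:
    """A long-lived operator that owns mutable state and handles one record at a time."""

    def __init__(self, emit):
        self.counts = {}   # mutable state, lost if this node fails
        self.emit = emit   # callback standing in for "send to the next node"

    def on_record(self, word):
        # 1) update local state, 2) send a new record downstream.
        self.counts[word] = self.counts.get(word, 0) + 1
        self.emit((word, self.counts[word]))


emitted = []
node = CounterNode(emitted.append)
for word in ["a", "b", "a"]:
    node.on_record(word)
```

Because the counts live only inside the node, recovering from a failure in this model requires replicating the node or replaying its input.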
