Date posted: 21-Jan-2017
Category: Technology
Uploaded by: hkbhadraa
Big Data Pipeline: Lambda Architecture - Batch Layer
with AngularJS, Java RESTful Web Services, Apache Hadoop, Apache Spark,
and Apache Cassandra on the Amazon Web Services Cloud Platform
Big Data Pipeline: Ingest, Store, Process, Visualize
Big Data Batch Layer Pipeline
[Diagram. Stages: Ingest, Store, Process, Visualize. Components: AngularJS web app, REST web services, Apache web logs, log/data files, S3, HDFS, Apache Cassandra, Spark engine and Spark SQL on a Spark cluster, interactive queries. The visualization is a chart in the AngularJS web app with the months April through July on one axis.]
Big Data Real-Time Layer Pipeline
[Diagram. Stages: Ingest, Stream, Process, Visualize, Store. Components: AngularJS web app, clickstream data, Apache web logs, log/data files, TCP sockets, Apache Kafka, Spark Streaming and Spark SQL on a Spark cluster, S3, HDFS, Apache Cassandra, interactive queries. The visualization is a chart in the AngularJS web app with the months April through July on one axis.]
Install Web Server
EC2 instance for Web Server
Commands to set up Apache Tomcat 8.0:
cat /etc/*-release
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
java -version
mkdir webserver
cd webserver
wget http://www-eu.apache.org/dist/tomcat/tomcat-8/v8.0.36/bin/apache-tomcat-8.0.36.tar.gz
tar xvzf apache-tomcat-8.0.36.tar.gz
cd apache-tomcat-8.0.36/bin
./startup.sh
Apache Tomcat 8.0 running on EC2 Instance
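Once startup.sh has run, a quick way to confirm that Tomcat is answering is to request its root page. The following is a minimal sketch, assuming the connector is still on Tomcat's default HTTP port 8080 and that the EC2 security group allows inbound traffic on it. The function names and the host in the usage comment are hypothetical, not part of the original deck.

```python
import urllib.error
import urllib.request


def tomcat_url(host, port=8080):
    """Build the root URL of a Tomcat instance (8080 is Tomcat's default HTTP port)."""
    return f"http://{host}:{port}/"


def tomcat_is_up(host, port=8080, timeout=5.0):
    """Return True if anything answers HTTP at host:port, False otherwise."""
    try:
        with urllib.request.urlopen(tomcat_url(host, port), timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status; it is still running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or host unreachable.
        return False


# Usage (hypothetical host; substitute the public DNS name of the EC2 instance):
#   tomcat_is_up("ec2-host.example.com")
```

A False result does not distinguish between Tomcat being down and the port being blocked, so the security group rules are worth checking first.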
Install Apache Cassandra - 3 Node Cluster on AWS
3 EC2 instances for Cassandra Cluster
cassandra1 -> 52.87.183.121
cassandra2 -> 52.207.239.229
cassandra3 -> 54.174.185.29
Commands to set up Apache Cassandra 3.0.7 (repeat on all 3 EC2 instances):
cat /etc/*-release
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
java -version
mkdir db
cd db
wget http://www-eu.apache.org/dist/cassandra/3.0.7/apache-cassandra-3.0.7-bin.tar.gz
tar xvzf apache-cassandra-3.0.7-bin.tar.gz
cd apache-cassandra-3.0.7/
bin/cassandra -f
# in a second terminal, because -f keeps Cassandra in the foreground:
bin/cqlsh
Change the following in conf/cassandra.yaml (values shown for cassandra3):
cluster_name: 'Test Cluster'
listen_address:
broadcast_address: 54.174.185.29
seeds: "52.87.183.121,52.207.239.229"
rpc_address:
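Because the same edit is repeated on each instance, the per-node values can be generated rather than typed by hand. The sketch below prints the lines to change for every node. It assumes that cassandra1 and cassandra2 each use their own public IP as broadcast_address, in the same way the slide shows for cassandra3; the slide does not show their files. The helper name yaml_overrides is hypothetical.

```python
# Public IPs of the three nodes, as listed in the slides.
NODES = {
    "cassandra1": "52.87.183.121",
    "cassandra2": "52.207.239.229",
    "cassandra3": "54.174.185.29",
}

# The slides use the first two nodes as seeds.
SEEDS = ["52.87.183.121", "52.207.239.229"]


def yaml_overrides(node_name):
    """Return the cassandra.yaml lines to change for one node."""
    return [
        "cluster_name: 'Test Cluster'",
        "listen_address:",
        # Assumption: every node broadcasts its own public IP.
        f"broadcast_address: {NODES[node_name]}",
        'seeds: "' + ",".join(SEEDS) + '"',
        "rpc_address:",
    ]


for name in NODES:
    print(f"# {name}")
    print("\n".join(yaml_overrides(name)))
```

Only broadcast_address differs between nodes; cluster_name and seeds must be identical on all three for them to join the same cluster.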
3 Node Cassandra Server running on AWS EC2 Instances
3 Node Cassandra Server running
CREATE KEYSPACE users
WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 3};
USE users;
CREATE TABLE user( id int PRIMARY KEY, name text );
SELECT * FROM user;
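The same keyspace and table can also be created from an application instead of cqlsh. The sketch below is one way to do that, assuming the DataStax Python driver (the cassandra-driver package) is installed wherever connect_and_apply is called. The function names are hypothetical, and the IF NOT EXISTS clauses are an addition so that the statements can be re-run safely.

```python
def schema_statements(keyspace="users", replication_factor=3):
    """Return the CQL statements that create the keyspace and the user table."""
    return [
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = "
        f"{{'class': 'SimpleStrategy', 'replication_factor': {replication_factor}}}",
        f"CREATE TABLE IF NOT EXISTS {keyspace}.user (id int PRIMARY KEY, name text)",
    ]


def apply_schema(session):
    """Run each schema statement on any object that offers execute(statement)."""
    for statement in schema_statements():
        session.execute(statement)


def connect_and_apply(contact_points):
    """Connect to the cluster, create the schema, and return the rows of users.user."""
    # Imported here so the helpers above work without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(contact_points)
    session = cluster.connect()
    try:
        apply_schema(session)
        return list(session.execute("SELECT * FROM users.user"))
    finally:
        cluster.shutdown()


# Usage, with the node IPs from the slides as contact points:
#   connect_and_apply(["52.87.183.121", "52.207.239.229", "54.174.185.29"])
```

With a replication factor of 3 on a 3-node cluster, every node holds a copy of every row.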
AngularJS - Java RESTful Web Services Deployed on AWS Cloud
AngularJS - Java RESTful Web Services
Tomcat web server log that we will process with Apache Hadoop/Spark
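To make the log format concrete, here is a minimal sketch of parsing such a file and counting requests by HTTP status. It assumes the log uses Tomcat's "common" access log pattern (%h %l %u %t "%r" %s %b). It is written in plain Python rather than Spark, so it illustrates the kind of aggregation a batch job could perform, not the deck's actual job. The sample lines and function names are hypothetical.

```python
import re
from collections import Counter

# One line of Tomcat's "common" access log pattern: %h %l %u %t "%r" %s %b
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line):
    """Parse one access log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Tomcat writes "-" for %b when no body bytes were sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def count_by_status(lines):
    """Count well-formed log lines per HTTP status code, skipping malformed lines."""
    parsed = (parse_line(line) for line in lines)
    return Counter(record["status"] for record in parsed if record is not None)


# Hypothetical sample lines, not taken from the original log.
SAMPLE = [
    '127.0.0.1 - - [21/Jan/2017:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '127.0.0.1 - - [21/Jan/2017:10:15:33 +0000] "GET /missing HTTP/1.1" 404 -',
    'not a log line',
]

print(count_by_status(SAMPLE))
```

Returning None for malformed lines keeps a single bad record from aborting the whole run, which matters for large log files.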
Web log and Python application deployed to an AWS S3 bucket
Spark job executed on AWS EMR - Spark Cluster
Results stored in Cassandra Database
Results stored in AWS S3 Bucket
Python Application BatchLogAnalyzer.py executed on AWS Spark Cluster
Results compared in console and Cassandra Database