Real-Time, High-Volume Log Processing with Flume & Cassandra
Gemini Mobile Technologies
11.3.6 Gemini Mobile Technologies, Inc. 1
Overview
1. Log Collection & Storage in DB
• Reliably and efficiently moves logs from multiple application nodes using Flume
• Stores raw and processed log data in Cassandra DB
2. Real-Time and On-Demand Reports
• Via a Web GUI that queries Cassandra, e.g. Transactions Per Second (TPS) vs. time, or a search of a user's records
3. Summary Reports by Map-Reduce
• E.g. monthly usage by category (voice, data, mail, etc.) for groups of users
[Diagram: Application Nodes, Log Aggregators, Cassandra nodes, Reports (Web GUI), OA&M]
Key Benefits
1. Real-Time, Up-to-Date Business Intelligence
• Dynamic, near-real-time reports
2. Flexible Analysis on Large Historical Data
• Instant query by time range, raw log fields, or processed log fields (data is stored in a database for fast querying, not in flat log files)
• Create on-demand custom summary reports with Map-Reduce
3. Multiple Data Center Support
• Collect and store in the local data center; query and analyze across data centers
4. Reliable, Easy Operation, Maintenance, and Scalability
• No data loss if the network or PCs fail
• As data volume (size of data stored) or velocity (speed of new data arrival) grows, scale to 100s of nodes and TBs of data per day by adding PCs horizontally
• Easy to set up, configure, and monitor for a large network
5. Easy Customization
• Open source; easy to change for custom log formats, custom reports, and queries
Log Collection: Flume
• Open-source log collection system: http://archive.cloudera.com/cdh/3/flume/UserGuide.html
• Flume Agent: Reads logs at a configurable interval (e.g., 100 ms) and sends them to Collector nodes.
• Flume Collector: Parses logs and inserts them into Cassandra.
• Flume Master: Monitors the health and processing state of Agents and Collectors.
[Diagram: App Node 1 (Flume agent1_src1, agent1_src2) and App Node 2 (Flume agent2_src1, agent2_src2) feed the Log Aggregator (Flume collector_src1, collector_src2), which writes to the Cassandra nodes; a Flume Master is also shown.]
Storage Layer: Cassandra
• Cassandra, an open-source Apache project, is the storage layer. It is a high-performance, highly scalable distributed database.
• Top-level Apache project (http://cassandra.apache.org/)
• Key Features
  • Optimized for fast writes of small data (<100 KB each)
  • Peer-to-peer nodes; easy to add or remove nodes ad hoc
  • Scalable for clusters from 2 to 100s of nodes
  • Multiple data center replication
  • Tunable consistency level, per request
Log Collection System Monitoring (Flume Master)
Reports
• Search by attribute:
  • Date range
  • Log fields (e.g., userID, Message Type)
• List view (rows of log data)
• Graph view (quantity vs. time)
• Data downloadable in CSV format.
Reports Example: CDR Search
Reports Example: CDR Search Results
Reports Example: Graphs
Sizing Example
Node Hardware: Supermicro (CPU: 2 quad-core Intel E5420, 32 GB RAM, 16 x 1 TB SATA HD), ~$6,000.
Monitoring Layer:
• Nodes required: 2 (1 Master + 1 Standby for High Availability)
Collector Layer:
• Nodes required = MAX(2, (log bytes per transaction * transactions per second (TPS)) / node write throughput (MB/s)), rounded up
• Example: 1 KB/transaction at 1000 TPS = 1 MB/s of writes. With 1 MB/s write throughput per node, 1 node covers the load, so the minimum of 2 nodes applies.
Storage Layer:
• Nodes required = MAX(Replication Factor, Data Per Day * # of Days to Keep / (Effective Node Storage / Replication Factor)), rounded up
• Example: Data Per Day = 100 GB, # of Days to Keep = 365, Effective Node Storage = 8 TB, Replication Factor = 2. Then Nodes Required = 100 * 365 / (8000 / 2) = 9.125, rounded up to 10 nodes.
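The sizing rules can be checked with a small calculator. This is a minimal sketch, not part of the Gemini package: the function names are hypothetical, and it reads the collector rule as incoming log volume (log bytes per transaction * TPS) divided by per-node write throughput, with a floor of 2 nodes.

```python
import math


def collector_nodes(bytes_per_tx, tps, node_write_mb_per_s, minimum=2):
    """Collector nodes = MAX(minimum, incoming MB/s / per-node write MB/s)."""
    # Decimal megabytes, so 1 KB/transaction * 1000 TPS = 1 MB/s.
    incoming_mb_per_s = bytes_per_tx * tps / 1_000_000
    return max(minimum, math.ceil(incoming_mb_per_s / node_write_mb_per_s))


def storage_nodes(gb_per_day, days_to_keep, node_storage_gb, replication_factor):
    """Storage nodes = MAX(RF, total data / usable storage per node)."""
    # Each node holds only 1/RF of its capacity in unique data.
    usable_gb_per_node = node_storage_gb / replication_factor
    needed = math.ceil(gb_per_day * days_to_keep / usable_gb_per_node)
    return max(replication_factor, needed)


print(collector_nodes(bytes_per_tx=1000, tps=1000, node_write_mb_per_s=1.0))
print(storage_nodes(gb_per_day=100, days_to_keep=365,
                    node_storage_gb=8000, replication_factor=2))
```

With the example figures from this slide, the first call gives the 2-node minimum and the second gives 10 storage nodes.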
[Chart: Collector nodes required vs. MB/sec (log bytes/tx * TPS)]
Effective storage (GB) / node | Replication factor | Data (GB) / day | # days of data / node | # of nodes for 365 days
8000                          | 2                  | 10              | 400                   | 2
8000                          | 3                  | 10              | 266                   | 3
8000                          | 2                  | 100             | 40                    | 10
Open Source Components
• Flume and Cassandra are available as open-source components. We add the following components:
1. Custom Flume-Cassandra connector: reads our log format and inserts into Cassandra
2. Cassandra data design, including schemas and configuration
3. Browser UI and queries to Cassandra
4. Post-processor to generate custom log format files
Cassandra Data Model
Currently, Flume inserts into 4 tables:
1. Raw Data Table
• Function: Store original log data as received.
• Row key: YYYYMMDDHH, one row for each hour.
• Column: Name: log entry UUID. Value: log data.
2. CDR Entry Table
• Function: Represent each log field as a column. Useful for querying and indexing.
• Row key: Log entry UUID.
• Column: Name: log data field name. Value: log data field value.
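The two tables can be modelled with plain dictionaries to show how one log line lands in both. This is an illustrative sketch only, not the Flume-Cassandra connector: the function name and the field positions are assumptions inferred from the sample CDR lines in this deck.

```python
import uuid
from collections import defaultdict

# Position of each named field in a comma-separated log line.
# Assumption: names and positions are inferred from the sample CDR lines;
# unnamed or empty positions are skipped.
FIELD_POSITIONS = {
    0: "type", 1: "market", 2: "id", 3: "timestamp", 4: "moipaddress",
    6: "mtipaddress", 7: "msisdn", 8: "senderdomain", 9: "recipientdomain",
}

raw_data = defaultdict(dict)  # row key YYYYMMDDHH -> {log entry UUID: raw line}
cdr_entry = {}                # row key log entry UUID -> {field name: value}


def insert_log_line(line, entry_id=None):
    """Store one log line in both tables and return its entry ID."""
    entry_id = entry_id or str(uuid.uuid4())
    values = line.split(",")
    fields = {
        name: values[pos]
        for pos, name in FIELD_POSITIONS.items()
        if pos < len(values) and values[pos]
    }
    hour_key = fields["timestamp"][:10]  # YYYYMMDDHH prefix of the timestamp
    raw_data[hour_key][entry_id] = line
    cdr_entry[entry_id] = fields
    return entry_id


line = ("01S,Market1,12345AA,20110111071200000,10.10.2.9,,10.10.2.10,"
        "09012345673,carrier.ne.jp,carrier.ne.jp,,,,,")
entry = insert_log_line(line, entry_id="AAB32431352")
print(sorted(raw_data["2011011107"]))
print(cdr_entry[entry]["msisdn"])
```

The raw table keeps the line untouched for later reprocessing, while the per-field table is what a field-level query would read.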
Example: Raw Data Table (a row is added for each hour; a column is added for each log entry in that hour, sorted by unique log entry ID (UUID))
Row key 2011011107:
  AAB32431352 = 01S,Market1,12345AA,20110111071200000,10.10.2.9,,10.10.2.10,09012345673,carrier.ne.jp,carrier.ne.jp,,,,,
  ABC32433781 = 04RR,Market1,12345ZZ,20110111071200005,10.10.2.9,,10.10.2.10,09023456890,carrier.ne.jp,carrier.ne.jp,,,,,
  BCD32433901 = 07S,Market1,12345BB,20110111071200010,10.10.2.9,,10.10.2.10,09012345673,carrier.ne.jp,carrier.ne.jp,,,,,
Example: CDR Entry Table (a row is added for each log entry)
Row key     | type | market  | id      | timestamp         | moipaddress | mtipaddress | msisdn      | senderdomain  | recipientdomain
AAB32431352 | 01S  | Market1 | 12345AA | 20110111071200000 | 10.10.2.9   | 10.10.2.10  | 09012345673 | carrier.ne.jp | carrier.ne.jp
ABC32433781 | 04RR | Market1 | 12345ZZ | 20110111071200005 | 10.10.2.9   | 10.10.2.10  | 09023456890 | carrier.ne.jp | carrier.ne.jp
BCD32433901 | 07S  | Market1 | 12345BB | 20110111071200010 | 10.10.2.9   | 10.10.2.10  | 09012345673 | carrier.ne.jp | carrier.ne.jp
Cassandra Data Model
3. MSISDN Timeline Table
• Function: Organize by MSISDN, then by timestamp.
• Row key: MSISDN.
• Column: Name: timestamp. Value: log entry UUID pointing to CDREntry.
4. HourlyTimeline Table
• Function: Organize by time (hour), then by timestamp.
• Row key: YYYYMMDDHH.
• Column: Name: timestamp. Value: log entry UUID pointing to CDREntry.
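Because columns within a row are kept sorted by timestamp, a time-range lookup for one MSISDN reduces to a slice of that row. The sketch below models the two timeline tables in memory to show the idea; the function names are hypothetical and this is not Cassandra client code.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict

# Each row is a list of (timestamp, log entry UUID) pairs kept in sorted order.
msisdn_timeline = defaultdict(list)  # row key: MSISDN
hourly_timeline = defaultdict(list)  # row key: YYYYMMDDHH


def index_entry(msisdn, timestamp, entry_id):
    """Add one log entry to both timeline tables."""
    insort(msisdn_timeline[msisdn], (timestamp, entry_id))
    insort(hourly_timeline[timestamp[:10]], (timestamp, entry_id))


def entries_for_msisdn(msisdn, start, end):
    """Return entry UUIDs for one MSISDN with start <= timestamp <= end."""
    row = msisdn_timeline.get(msisdn, [])
    lo = bisect_left(row, (start, ""))
    hi = bisect_right(row, (end, "\uffff"))
    return [entry_id for _, entry_id in row[lo:hi]]


index_entry("09012345673", "20110111071200000", "AAB32431352")
index_entry("09023456890", "20110111071200005", "ABC32433781")
index_entry("09012345673", "20110111071200010", "BCD32433901")

print(entries_for_msisdn("09012345673", "20110111071200000", "20110111071200010"))
print([entry_id for _, entry_id in hourly_timeline["2011011107"]])
```

The returned UUIDs are the keys that would then be looked up in the CDR Entry table to fetch the full records.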
Example: MSISDN Timeline Table (a row is added for each MSISDN; a column is added for each log entry for that MSISDN, sorted by timestamp)
Row key 09012345673:
  20110111071200000 = AAB32431352
  20110111071200010 = BCD32433901
Row key 09023456890:
  20110111071200005 = ABC32433781
Example: HourlyTimeline Table (a row is added for each hour; a column is added for each log entry in that hour, sorted by timestamp)
Row key 2011011107:
  20110111071200000 = AAB32431352
  20110111071200005 = ABC32433781
  20110111071200010 = BCD32433901
Row key 2011011108:
  20110111081200001 = BDB32431352
  20110111081200010 = CDC32431352
Next Steps
• Gemini has open sourced the package at https://github.com/geminitech/logprocessing
  • README, sample data, package
• To try:
  • Download and install Flume, Cassandra, and Gemini's code
  • Try with sample data
• To use for a production system:
  • Get sample logs from the actual system; customize the Flume plug-in if needed
  • Decide what reports are needed; customize the Cassandra table format and UI if needed
  • Test functionality and performance with sample logs
  • Deploy: lab system first, then production system
Backup
Database Storage Alternatives
Cassandra is the storage system used.
Comparisons to some alternatives:
• SQL. Cannot insert this much data at a high rate. Cannot easily scale horizontally.
• Hadoop. Cannot query flexibly or manipulate data, since the data is not in a database-like system.
• HBase or Hibari. Provide much of the same capability as Cassandra. Cassandra was chosen because of:
  • Multiple data center support
  • Peer-to-peer nodes; easy to add or remove nodes ad hoc
  • Tunable consistency
    • Not currently used, but would be useful with multiple data centers, or with different classes of data (e.g. billing records vs. statistics records)
FAQ (Page 1)
From the viewpoint of using stored logs, please share your know-how, e.g.:
Q. Which storage approach (lumped storage of multiple logs, or concentrated storage of similar data) would be effective for analysis or parsing?
A. It depends on which analyses and reports are needed later. In our solution, one table stores all logs, and 3 other tables provide indexing based on similar data (e.g. MSISDN, timestamp, log ID) for fast queries. The exact tables and schema may be customized depending on the actual logs and desired reports.
Q. When logs are analyzed or parsed later, is it better to use the data stored in the distributed DB as-is, or to convert it into a particular data structure before storing it in the distributed DB and DWH?
A. For real-time queries and fast report generation, it is useful to convert the data into specific tables. As shown in our example, we store the log both "as-is" and in table format. This allows the most flexible usage.
FAQ (Page 2)
Q. What types of logs would best fit aggregation: logs with short records (e.g., syslog and a system's APlog), a variety of logs such as Lifelog, or large logs such as multimedia data or web pages?
A. All of these fit well. In our Cassandra-based system, we can set an expiry time for each log entry. Short-lived records are then deleted automatically after a certain period, while long-lived logs can stay for a long time. So it is possible to keep different types of logs in the same database.
Q. Is there any assumed or example BI tool for analysis or parsing?
A. No. Once the data is in the database, any BI tool can be used for analysis. The BI tool would need to be integrated with Cassandra. There are a variety of ways to do this, and the amount of customization depends on the BI tool.
FAQ (Page 3)
Q. How is older data deleted?
A. Cassandra supports a time-to-live (TTL), in seconds, for each column. After the TTL expires, the data is automatically deleted at compaction time.
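The TTL behaviour described in this answer can be pictured with a toy in-memory store. This is a simplified model, not Cassandra's implementation: the class and method names are hypothetical, and it only shows that an expired column stops being returned by reads and that its space is reclaimed later, when compaction runs.

```python
import time


class TtlColumnStore:
    """Toy model: expired columns are hidden from reads, purged on compact()."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._columns = {}  # (row key, column name) -> (value, expiry or None)

    def insert(self, row, column, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._columns[(row, column)] = (value, expires_at)

    def get(self, row, column):
        value, expires_at = self._columns.get((row, column), (None, None))
        if expires_at is not None and expires_at <= self._clock():
            return None  # expired: no longer visible to queries
        return value

    def compact(self):
        """Physically drop every column whose TTL has passed."""
        now = self._clock()
        self._columns = {
            key: (value, expires_at)
            for key, (value, expires_at) in self._columns.items()
            if expires_at is None or expires_at > now
        }

    def stored_columns(self):
        return len(self._columns)


now = [0.0]  # fake clock so the example does not need to sleep
store = TtlColumnStore(clock=lambda: now[0])
store.insert("2011011107", "AAB32431352", "short-lived log", ttl_seconds=60)
store.insert("2011011107", "ABC32433781", "long-lived log")
now[0] = 61.0
print(store.get("2011011107", "AAB32431352"), store.stored_columns())
store.compact()
print(store.stored_columns())
```

Setting a different TTL per log type is what allows short-lived and long-lived records to share one database.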
Q. How do we detect and process alarms when the data store gets full? How do we predict when the data store will be full so we can expand?
A. SNMP (netsnmp) can be used to monitor server disk usage. When usage exceeds a certain threshold, an SNMP trap is generated.
Q. How does this compare with a Hadoop-based log processing system?
A. By adding a database (i.e., Cassandra), we can query in real time, issue complex queries, and perform other database-type operations.
Q. Do we use Map/Reduce?
A. A map-reduce script can be used to post-process the log data and to generate other log formats or analyses. We have not tested this, but it "should work."
FAQ (Page 4)
Q. How real-time is this system (exactly how much delay is there under the best circumstances)? What would it take to make it more real-time?
A. Total latency is A + B + C, where A is the configurable delay to read the log file, B is the time to move data from the Agent node to the Collector node, and C is the time to insert into Cassandra. As an example scenario, with A = 100 ms, B = 50 ms, and C = 10 ms, the total is 160 ms.
Q. How many lines of code? In what language?
A. Flume-to-Cassandra plugin (~40 lines of Java), UI (~2000 lines of Java and JSP), post-processor for log format (~250 lines of Java).
Q. Areas to improve?
A. 1. Generalize the UI so it can work with any log format.
2. Extensive load and large-system testing.
3. Add Pig scripts to post-process log data.
Pig for Cassandra
• Pig (http://pig.apache.org/) is a high-level, relational language for writing queries that are then translated to Map/Reduce jobs.
• The Map/Reduce jobs are supported by Cassandra.
• Example Pig script that finds the top 100 MSISDNs with the highest number of log records since 2011-01-01:
    -- Each row is (MSISDN row key, bag of (timestamp, log entry UUID) columns).
    msisdn = LOAD 'cassandra://CDRLogs/MSISDNTimeline' USING CassandraStorage();
    -- Keep the MSISDN row key next to each flattened column.
    cdrs = FOREACH msisdn GENERATE $0, FLATTEN($1);
    -- $0 = MSISDN, $1 = timestamp (the column name).
    cdrtime = FOREACH cdrs GENERATE $0, $1;
    -- Timestamps are 17 digits (YYYYMMDDHHMMSSmmm); assumes they compare as numbers.
    sinceNewYear = FILTER cdrtime BY $1 > 20110101000000000L;
    cdrsByMsisdn = GROUP sinceNewYear BY $0;
    msisdnCount = FOREACH cdrsByMsisdn GENERATE COUNT($1), group;
    orderedMsisdn = ORDER msisdnCount BY $0 DESC;
    topUserAfterNewYear = LIMIT orderedMsisdn 100;
    DUMP topUserAfterNewYear;