Date posted: 15-Aug-2015
Category: Education
Uploaded by: knowbigdata
www.KnowBigData.com Hadoop
FLUME
A distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of data from
many different sources to a centralized data store.
www.KnowBigData.comHadoop
Supports a large variety of sources, including:
• tail (like unix tail -f)
• syslog
• log4j, allowing Java applications to write logs to HDFS via Flume
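As a sketch of the log4j case: Flume ships a log4j appender that forwards application log events to a Flume agent's avro source. The hostname and port below are assumptions for illustration, the agent side must expose a matching avro source.

```properties
# log4j.properties (sketch): route application logs to a Flume agent.
# Assumes a Flume agent with an avro source listening on localhost:41414.
log4j.rootLogger = INFO, flume
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = localhost
log4j.appender.flume.Port = 41414
log4j.appender.flume.layout = org.apache.log4j.PatternLayout
```

The agent receiving these events can then use an HDFS sink, as in the example later in these slides, so application logs land in HDFS without the application knowing about Hadoop at all.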
Flume nodes can be arranged in arbitrary topologies. Typically there is a node running on each source machine, with tiers of aggregating nodes that the data flows through on its way to HDFS.
Delivery reliability modes:
• best-effort: does not tolerate any node failure
• end-to-end: guarantees delivery even in the event of node failures
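A tiered topology like the one described above is wired together with avro sinks and avro sources: each source-machine agent forwards events to an aggregating agent, which writes to HDFS. The agent names, hostname, and port below are illustrative assumptions, not part of the original example.

```properties
# Tier 1 (sketch): agent on each source machine forwards to an aggregator.
agent1.sinks = avroSink
agent1.sinks.avroSink.type = avro
agent1.sinks.avroSink.hostname = collector.example.com
agent1.sinks.avroSink.port = 4545
agent1.sinks.avroSink.channel = c1

# Tier 2 (sketch): aggregating agent receives avro events, writes to HDFS.
collector.sources = avroSrc
collector.sources.avroSrc.type = avro
collector.sources.avroSrc.bind = 0.0.0.0
collector.sources.avroSrc.port = 4545
collector.sources.avroSrc.channels = c1
```

Multiple tier-1 agents can point at the same aggregator, which is how the fan-in from many source machines down to HDFS is usually arranged.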
Flume Example: Read the data at a port and push it to HDFS
Step 0. Create flume.properties (also in sgiri/flume)
# Name the components on this agent
a1.sources = r1
a1.sinks = s1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
#a1.sinks.k1.type = logger
a1.sinks.s1.type = hdfs
a1.sinks.s1.hdfs.path = hdfs://hadoop1.knowbigdata.com/user/student/sgiri/flume/webdata
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
#a1.sinks.k1.channel = c1
a1.sinks.s1.channel = c1
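One caveat with the memory channel above: buffered events are lost if the agent dies, so it gives only best-effort delivery. For end-to-end reliability, a file channel can be swapped in; the directories below are assumed paths for illustration.

```properties
# Alternative channel (sketch): durable file channel instead of memory.
# Events survive an agent restart at the cost of disk I/O.
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data
```

The source and sink bindings stay exactly the same; only the channel definition changes.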
1. Start The Agent
flume-ng agent --conf conf --conf-file conf/flume.properties \
--name a1 -Dflume.root.logger=INFO,console
2. Generate Some Data
telnet localhost 44444
3. Check the HDFS
/user/student/sgiri/flume/webdata

More Sinks