Date post: | 13-Jul-2015 |
Real Time Processing With Storm
Mahender Immadi, Sr. Software Engineer @ Cerner, www.linkedin.com/in/mahenderimmadi/
Thirupathi Guduru, Sr. Software Engineer @ Cerner, www.linkedin.com/in/thirupathireddyguduru/
Batch vs. Real-Time processing
• Batch processing
- Data is gathered and then processed as a group at one time.
- Jobs run to completion.
- Results might be out of date.
• Real-time processing
- Data is processed as the information is being entered.
- Jobs run forever.
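The contrast above can be sketched in a few lines of plain Python. This is only an illustration, not Storm code: the function names `batch_total` and `running_totals` and the sample readings are made up for the example. The batch function needs the complete data set before it produces one answer, while the streaming function emits an updated answer for every reading it sees.

```python
from typing import Iterable, Iterator, List


def batch_total(readings: List[float]) -> float:
    """Batch style: the whole data set is gathered first, then processed once."""
    return sum(readings)


def running_totals(stream: Iterable[float]) -> Iterator[float]:
    """Real-time style: emit an updated total as each reading arrives."""
    total = 0.0
    for reading in stream:
        total += reading
        yield total


# Hypothetical sample data; a real stream would be unbounded.
readings = [2.0, 3.0, 5.0]
batch_result = batch_total(readings)
stream_results = list(running_totals(readings))
```

With a truly unbounded source, `running_totals` would never finish, which is the "run forever" property of real-time jobs.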
Real Time Use Cases
• Social Media Feeds
• Network Sensors
• App/Web Logs
• Stock Tick Data
• Weather Data
• Auctions
• Payment Transactions
Storm
Apache Storm is a free and open source distributed realtime computation system.
Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing.
Stream Grouping
• A grouping decides which task in the subscribing bolt receives each tuple.
• Possible groupings:
- Shuffle
- Fields
- All
- Global
- None
- Direct
- Local or Shuffle
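To make the routing idea concrete, here is a rough plain-Python sketch of four of the groupings listed above. It does not use the Storm API. `NUM_TASKS`, the function names, and the use of CRC32 as the hash are all assumptions chosen for illustration; Storm's own hashing and task selection may differ in detail. Each function returns the list of task indexes that would receive the tuple.

```python
import random
import zlib
from typing import Dict, List

NUM_TASKS = 4  # hypothetical number of tasks in the subscribing bolt


def shuffle_grouping(tup: Dict[str, str], rng: random.Random) -> List[int]:
    """Shuffle: send the tuple to one randomly chosen task."""
    return [rng.randrange(NUM_TASKS)]


def fields_grouping(tup: Dict[str, str], fields: List[str]) -> List[int]:
    """Fields: tuples with equal values in `fields` always reach the same task."""
    key = "|".join(tup[f] for f in fields).encode("utf-8")
    # CRC32 is an assumed stand-in for a stable hash function.
    return [zlib.crc32(key) % NUM_TASKS]


def all_grouping(tup: Dict[str, str]) -> List[int]:
    """All: replicate the tuple to every task."""
    return list(range(NUM_TASKS))


def global_grouping(tup: Dict[str, str]) -> List[int]:
    """Global: the entire stream goes to a single task (index 0 here)."""
    return [0]


# Hypothetical tuples from a word-count style stream.
first = {"word": "storm", "source": "a"}
second = {"word": "storm", "source": "b"}
```

In this sketch, `fields_grouping(first, ["word"])` and `fields_grouping(second, ["word"])` return the same task, which is what lets a bolt keep per-key state such as a word count.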
References
• https://storm.incubator.apache.org/
• http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.3/bk_user-guide/content/ch_storm-using.html
Books:
• Getting Started with Storm - Jonathan Leibiusky, Gabriel Eisbruch, Dario Simonassi
• Storm Blueprints: Patterns for Distributed Real-time Computation - P. Taylor Goetz, Brian O'Neill