Storm - Altamira University Presentation
![Page 1: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/1.jpg)
Apache Storm
A distributed, real-time computation system
Some content borrowed from Nathan Marz's presentation of a similar name
Ryan Lanman
![Page 2: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/2.jpg)
Objectives
1. Their Motivation
2. Our Motivation
3. Storm Basics
4. Demo
![Page 3: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/3.jpg)
Their Motivation
How Storm Came To Be
![Page 4: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/4.jpg)
![Page 5: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/5.jpg)
![Page 6: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/6.jpg)
![Page 7: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/7.jpg)
![Page 8: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/8.jpg)
![Page 9: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/9.jpg)
![Page 10: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/10.jpg)
What They Wanted
• Guaranteed data processing
• Horizontal scalability
• Fault-tolerance
• No intermediate message brokers!
• Higher level abstraction than message passing
• “Just works”
![Page 11: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/11.jpg)
Our Motivation
Why We (Eventually) Chose Storm
![Page 12: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/12.jpg)
Lumify Ingest
Raw Data
Text Extraction
Entity Extraction
Text Highlighting
Location Extraction
Full Text Indexing
![Page 13: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/13.jpg)
Issues
• No Reducers
• High DB Read/Writes
• Batch-style processing
• M/R Overhead
• Zero Fault Tolerance
![Page 14: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/14.jpg)
What We Really Wanted
• Distributed, Stream-type Processing
• Simple Logical DAG
• Better Fault Tolerance
![Page 15: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/15.jpg)
Storm Ingest Workflow
[Diagram. Box labels: Raw Data, Content Sorter, Text, Documents, Video, Images, Text Extraction, Video Frame Splitting, Video Frame Text Extraction, Image Text Extraction, …]
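The Content Sorter step in this workflow can be pictured as a small routing function. The sketch below is only an illustration in plain Python, not Lumify or Storm code. The function name `sort_content`, the use of file names and MIME types, and the choice to treat anything unrecognized as a document are all assumptions made for the example; the slides do not say how the real sorter decides.

```python
import mimetypes


def sort_content(filename: str) -> str:
    """Return the branch of the workflow a raw file would be sent to.

    Assumption: the branch is chosen from the MIME type guessed from the
    file name. Unknown types fall back to the "documents" branch.
    """
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        return "documents"
    if mime_type.startswith("video/"):
        return "video"
    if mime_type.startswith("image/"):
        return "images"
    return "documents"


for name in ["clip.mp4", "photo.jpg", "report.pdf"]:
    print(name, "->", sort_content(name))
```

Each branch name would correspond to its own downstream step (frame splitting for video, text extraction for documents, and so on).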
![Page 16: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/16.jpg)
Storm Basics
What the heck’s a Topology?
![Page 17: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/17.jpg)
Storm Cluster
[Diagram: one Nimbus node, three Zookeeper nodes, five Supervisor nodes]
![Page 18: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/18.jpg)
![Page 19: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/19.jpg)
![Page 20: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/20.jpg)
![Page 21: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/21.jpg)
Storm Data Concepts
• Tuples
• Streams
• Spouts
• Bolts
• Topologies
![Page 22: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/22.jpg)
Tuples
• Single unit of data in Storm
• Examples
  – Tweet
  – User Activity Log Entry
  – File Info
![Page 23: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/23.jpg)
Streams
An unbounded sequence of Tuples
[Diagram: a row of Tuples]
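As a rough illustration of tuples and streams, the following Python sketch models a tuple as a small record with named fields and a stream as a generator that never ends. It is not Storm's API. `Tweet`, `tweet_stream`, and the sample text are made up for the example.

```python
from itertools import count, islice
from typing import Iterator, NamedTuple


class Tweet(NamedTuple):
    """One tuple: a single unit of data with named fields (hypothetical)."""
    tweet_id: int
    text: str


def tweet_stream() -> Iterator[Tweet]:
    """A stream with no end: yields tuples one at a time, forever."""
    for i in count(1):
        yield Tweet(tweet_id=i, text=f"sample tweet {i}")


# The stream never finishes, so a consumer only ever takes a finite slice.
first_three = list(islice(tweet_stream(), 3))
print(first_three)
```

Because the generator is lazy, nothing is produced until a consumer asks for the next tuple, which is a reasonable mental model for processing data as it arrives rather than in batches.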
![Page 24: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/24.jpg)
Spouts
Producers of Streams
[Diagram: a Spout emitting streams of Tuples]
![Page 25: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/25.jpg)
Bolts
Process input streams to create new streams
[Diagram: streams of Tuples entering a Bolt and new streams of Tuples leaving it]
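The relationship between spouts and bolts can be sketched with two plain Python generators: one that produces a stream and one that consumes a stream and emits a new one. This is a conceptual stand-in, not Storm's spout or bolt interface. The names `sentence_spout` and `split_bolt` and the sample sentences are assumptions for the example.

```python
from typing import Iterable, Iterator, List, Tuple


def sentence_spout(sentences: List[str]) -> Iterator[Tuple[str]]:
    """Spout stand-in: the source of a stream of one-field tuples.

    A real spout would keep reading from an external source; a fixed list
    keeps this example finite.
    """
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt stand-in: reads an input stream and emits a new stream of words."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


words = list(split_bolt(sentence_spout(["storm is fast", "storm scales"])))
print(words)
```

The bolt does not know or care where its input comes from, so bolts can be chained after a spout or after one another.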
![Page 26: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/26.jpg)
Examples
Spout Examples
• HDFS Filesystem Spout
• Kafka Queue Spout
Bolt Examples
• Filtering
• Aggregation
• DB Operations
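Two of the bolt examples, filtering and aggregation, can be sketched in a few lines of plain Python. Again this only models the idea and is not Storm code. The record layout (`user`, `text`), the function names, and the length threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def filter_bolt(stream: Iterable[dict], min_length: int) -> Iterator[dict]:
    """Filtering: pass along only records whose text is long enough."""
    for record in stream:
        if len(record["text"]) >= min_length:
            yield record


def count_by_user(stream: Iterable[dict]) -> Dict[str, int]:
    """Aggregation: keep a running count of records seen per user."""
    counts: Counter = Counter()
    for record in stream:
        counts[record["user"]] += 1
    return dict(counts)


events = [
    {"user": "ann", "text": "hello storm"},
    {"user": "bob", "text": "hi"},
    {"user": "ann", "text": "streams everywhere"},
]
totals = count_by_user(filter_bolt(events, min_length=5))
print(totals)
```

In a long-running topology the aggregate would be updated continuously and read or persisted periodically, rather than returned once at the end as it is here.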
![Page 27: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/27.jpg)
Topologies
[Diagram: a topology containing three Spouts]
![Page 28: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/28.jpg)
Demo
![Page 29: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/29.jpg)
Demo Topology
[Diagram: Twitter Hosebird Spout, SentenceSplitter, WordCount, Accumulo]
![Page 30: Storm - Altamira University Presentation](https://reader036.vdocument.in/reader036/viewer/2022062511/54c663e64a795937198b4579/html5/thumbnails/30.jpg)
Demo Topology
[Diagram: Twitter Hosebird Spout, SentenceSplitter, WordCount, Accumulo; connections labeled Shuffle Grouping and Field Grouping]
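To make the two groupings concrete, here is a small simulation of the demo topology in plain Python. It assumes the usual word-count wiring: a shuffle grouping from the spout into SentenceSplitter and a fields grouping on the word from SentenceSplitter into WordCount (Storm's API calls the latter `fieldsGrouping`). The task counts, the CRC32 hash, the sample tweets, and all function names are assumptions, and the final write to Accumulo is left out.

```python
import random
import zlib
from collections import Counter
from typing import List

NUM_SPLITTERS = 2  # parallel SentenceSplitter tasks (assumed)
NUM_COUNTERS = 3   # parallel WordCount tasks (assumed)


def shuffle_grouping(rng: random.Random, num_tasks: int) -> int:
    """Pick a task at random: any splitter can handle any tweet."""
    return rng.randrange(num_tasks)


def fields_grouping(word: str, num_tasks: int) -> int:
    """Pick a task from a hash of the word, so equal words share a task."""
    return zlib.crc32(word.encode("utf-8")) % num_tasks


def run_topology(tweets: List[str], seed: int = 0) -> List[Counter]:
    rng = random.Random(seed)
    splitter_inbox: List[List[str]] = [[] for _ in range(NUM_SPLITTERS)]
    for tweet in tweets:  # spout -> splitters
        splitter_inbox[shuffle_grouping(rng, NUM_SPLITTERS)].append(tweet)

    counters = [Counter() for _ in range(NUM_COUNTERS)]
    for inbox in splitter_inbox:  # splitters -> counters
        for tweet in inbox:
            for word in tweet.lower().split():
                counters[fields_grouping(word, NUM_COUNTERS)][word] += 1
    return counters


counters = run_topology(["Storm is fast", "storm is distributed", "fast data"])
totals = sum(counters, Counter())
print(totals)
```

The design point is that a per-word count is only correct if every occurrence of a word reaches the same WordCount task, which is what hashing on the field provides. Sentence splitting keeps no state, so a random spread is enough there.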