heatmap
DESCRIPTION
A presentation Tikal Fullstack Israel - http://www.tikalk.com/TRANSCRIPT
![Page 1: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/1.jpg)
1
ProcessingProcessing“BIG-DATA”“BIG-DATA”In In Real TimeReal Time
Yanai Franchi , TikalYanai Franchi , Tikal
![Page 2: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/2.jpg)
2
![Page 3: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/3.jpg)
3
Vacation to BarcelonaVacation to Barcelona
![Page 4: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/4.jpg)
4
After a Long Travel DayAfter a Long Travel Day
![Page 5: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/5.jpg)
5
Going to a Salsa Club
![Page 6: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/6.jpg)
6
Best Salsa Club NOW
● Good Music
● Crowded – Now!
![Page 7: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/7.jpg)
7
Same Problem in “gogobot”
![Page 8: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/8.jpg)
8
![Page 9: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/9.jpg)
9
gogobot checkinHeat Map Service
Lets' Develop“Gogobot Checkins Heat-Map”
![Page 10: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/10.jpg)
10
Key Notes● Collector Service - Collects checkins as text addresses
– We need to use GeoLocation ServiceWe need to use GeoLocation Service
● Upon elapsed interval, the last locations list will be displayed as Heat-Map in GUI.
● Web Scale service – 10Ks checkins/seconds all over the world (imaginary, but lets do it for the exercise).
● Accuracy – Sample data, NOT critical data.
– Proportionately representative
– Data volume is large enough to is large enough to compensate for data loss.compensate for data loss.
![Page 11: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/11.jpg)
11
Heat-Map Context
Text-Address
Checkins Heat-MapService
Gogobot System
GogobotMicro Service
GogobotMicro Service
GogobotMicro Service
Geo LocationService
Get-GeoCode(Address)
Heat-Map
Last Interval Locations
![Page 12: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/12.jpg)
12
Database
Persist Checkin Intervals
ProcessingCheckins
ReadText Address
Check-in #1Check-in #2Check-in #3Check-in #4Check-in #5Check-in #6Check-in #7Check-in #8Check-in #9...
Simulate Checkins with a File
Plan A
GET Geo Location
Geo LocationService
![Page 13: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/13.jpg)
13
Tons of Addresses Arriving Every Second
![Page 14: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/14.jpg)
14
Architect - First Reaction...
![Page 15: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/15.jpg)
15
Second Reaction...
![Page 16: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/16.jpg)
16
DeveloperFirst
Reaction
![Page 17: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/17.jpg)
17
SecondReaction
![Page 18: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/18.jpg)
18
Problems ?
● Tedious: Spend time confi guring where to send messages, deploying workers, and deploying intermediate queues.
● Brittle: There's little fault-tolerance.
● Painful to scale: Partition of running worker/s is complicated.
![Page 19: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/19.jpg)
19
What We Want ?● Horizontal scalability● Fault-tolerance● No intermediate message brokers!● Higher level abstraction than message
passing● “Just works”● Guaranteed data processing (not in this
case)
![Page 20: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/20.jpg)
20
Apache Storm
✔Horizontal scalability
✔Fault-tolerance
✔No intermediate message brokers!
✔Higher level abstraction than message passing
✔“Just works”
✔Guaranteed data processing
![Page 21: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/21.jpg)
21
Anatomy of Storm
![Page 22: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/22.jpg)
22
What is Storm ?
● CEP - Open source and distributed realtime computation system. – Makes it easy to Makes it easy to reliably process unboundedreliably process unbounded streams streams ofof
tuplestuples– Doing for realtime processing what Hadoop did for batch Doing for realtime processing what Hadoop did for batch
processing.processing.
● Fast - 1M Tuples/sec per node. – It is scalable,fault-tolerant, guarantees your data will be It is scalable,fault-tolerant, guarantees your data will be
processed, and is easy to set up and operate.processed, and is easy to set up and operate.
![Page 23: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/23.jpg)
23
Streams
Tuple Tuple Tuple Tuple Tuple Tuple
Unbounded sequence of tuples
![Page 24: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/24.jpg)
24
Spouts
TupleTuple
Sources of Streams
Tuple Tuple
![Page 25: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/25.jpg)
25
Bolts
Tuple
TupleTuple
Processes input streams and produces new streams
TupleTupleTupleTuple
Tuple TupleTuple
![Page 26: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/26.jpg)
26
Storm Topology
Network of spouts and bolts
Tuple
TupleTuple
TupleTuple TupleTupleTuple TupleTupleTuple
Tuple
Tuple
TupleTuple TupleTupleTuple
![Page 27: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/27.jpg)
27
Guarantee for Processing
● Storm guarantees the full processing of a tuple by tracking its state
● In case of failure, Storm can re-process it.● Source tuples with full “acked” trees are removed
from the system
![Page 28: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/28.jpg)
28
Tasks (Bolt/Spout Instance)
Spouts and bolts execute asmany tasks across the cluster
![Page 29: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/29.jpg)
29
Stream Grouping
When a tuple is emitted, which task (instance) does it go to?
![Page 30: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/30.jpg)
30
Stream Grouping
● Shuffl e grouping: pick a random task● Fields grouping: consistent hashing on a subset of
tuple fi elds● All grouping: send to all tasks● Global grouping: pick task with lowest id
![Page 31: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/31.jpg)
31
Tasks , Executors , Workers
Task Task Task
Worker Process
Sput /Bolt
Sput /Bolt
Sput /Bolt=
Executor Thread
JVM
Executor Thread
![Page 32: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/32.jpg)
32
Bolt B Bolt B
Worker Process
Executor
Spout A
Executor
Node
SupervisorBolt C Bolt C
Executor
Bolt B Bolt B
Worker Process
Executor
Spout A
Executor
Node
SupervisorBolt C Bolt C
Executor
![Page 33: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/33.jpg)
33
Nimbus
Supervisor Supervisor
Supervisor Supervisor
Supervisor Supervisor
Upload/Rebalance Heat-Map Topology
Zoo Keeper Nodes
Storm Architecture
Master Node (similar to Hadoop JobTracker)
NOT criticalfor running topology
![Page 34: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/34.jpg)
34
Nimbus
Supervisor Supervisor
Supervisor Supervisor
Supervisor Supervisor
Upload/Rebalance Heat-Map Topology
Zoo Keeper
Storm Architecture
Used For Cluster Coordination
A few nodes
![Page 35: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/35.jpg)
35
Nimbus
Supervisor Supervisor
Supervisor Supervisor
Supervisor Supervisor
Upload/Rebalance Heat-Map Topology
Zoo Keeper
Storm Architecture
Run Worker Processes
![Page 36: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/36.jpg)
36
Assembling Heatmap Topology
![Page 37: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/37.jpg)
37
HeatMap Input/Output Tuples
● Input Tuples: Timestamp and Text Address : – (9:00:07 PM , “287 Hudson St New York NY 10013”)(9:00:07 PM , “287 Hudson St New York NY 10013”)
● Output Tuple: Time interval, and a list of points for it:– (9:00:00 PM to 9:00:15 PM, (9:00:00 PM to 9:00:15 PM,
ListList((((40.719,-73.98740.719,-73.987),(40.726,-74.001),(),(40.726,-74.001),(40.719,-73.98740.719,-73.987))))
![Page 38: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/38.jpg)
38
Checkins Spout
GeocodeLookup
Bolt
HeatmapBuilder
Bolt
PersistorBolt
(9:01 PM @ 287 Hudson st)
(9:01 PM , (40.736, -74,354)))
Heat Map Storm
Topology(9:00 PM – 9:15 PM , List((40.73, -74,34),
(51.36, -83,33),(69.73, -34,24))
Upon Elapsed Interval
![Page 39: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/39.jpg)
39
Checkins Spoutpublic class CheckinsSpout extends BaseRichSpout {
private List<String> sampleLocations;private int nextEmitIndex;private SpoutOutputCollector outputCollector;
@Overridepublic void open(Map map, TopologyContext topologyContext,
SpoutOutputCollector spoutOutputCollector) {this.outputCollector = spoutOutputCollector;this.nextEmitIndex = 0;sampleLocations = IOUtils.readLines(
ClassLoader.getSystemResourceAsStream("sanple-locations.txt"));}
@Overridepublic void nextTuple() {
String address = checkins.get(nextEmitIndex);String checkin = new Date().getTime()+"@ADDRESS:"+address;
outputCollector.emit(new Values(checkin));nextEmitIndex = (nextEmitIndex + 1) % sampleLocations.size();
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {declarer.declare(new Fields("str"));
}}
We hold stateNo need for thread safety
Declare output fields
Been called iteratively by Storm
![Page 40: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/40.jpg)
40
Geocode Lookup Bolt
public class GeocodeLookupBolt extends BaseBasicBolt {private LocatorService locatorService;
@Overridepublic void prepare(Map stormConf, TopologyContext context) {
locatorService = new GoogleLocatorService();}
@Overridepublic void execute(Tuple tuple, BasicOutputCollector outputCollector) {
String str = tuple.getStringByField("str");String[] parts = str.split("@");Long time = Long.valueOf(parts[0]);String address = parts[1];
LocationDTO locationDTO = locatorService.getLocation(address);if(checkinDTO!=null)
outputCollector.emit(new Values(time,locationDTO) );}
@Overridepublic void declareOutputFields(OutputFieldsDeclarer fieldsDeclarer) {
fieldsDeclarer.declare(new Fields("time", "location"));}
}
Get Geocode, Create DTO
![Page 41: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/41.jpg)
41
Tick Tuple – Repeating Mantra
![Page 42: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/42.jpg)
42
Two Streams to Heat-Map Builder
On tick tuple, we fl ush our Heat-Map
Checkin 1 Checkin 4 Checkin 5 Checkin 6
HeatMap-Builder Bolt
![Page 43: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/43.jpg)
43
Tick Tuple in Actionpublic class HeatMapBuilderBolt extends BaseBasicBolt {
private Map<String, List<LocationDTO>> heatmaps;
@Overridepublic Map<String, Object> getComponentConfiguration() {
Config conf = new Config();conf.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 60 );return conf;
}
@Overridepublic void execute(Tuple tuple, BasicOutputCollector outputCollector) {
if (isTickTuple(tuple)) {// Emit accumulated intervals
} else {// Add check-in info to the current interval in the Map
}}
private boolean isTickTuple(Tuple tuple) { return tuple.getSourceComponent().equals(Constants.SYSTEM_COMPONENT_ID)
&& tuple.getSourceStreamId().equals(Constants.SYSTEM_TICK_STREAM_ID);}
Tick interval
Hold latest intervals
![Page 44: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/44.jpg)
44
Persister Boltpublic class PersistorBolt extends BaseBasicBolt {
private Jedis jedis;
@Overridepublic void execute(Tuple tuple, BasicOutputCollector outputCollector) {
Long timeInterval = tuple.getLongByField("time-interval"); String city = tuple.getStringByField("city"); String locationsList = objectMapper.writeValueAsString
( tuple.getValueByField("locationsList"));
String dbKey = "checkins-" + timeInterval+"@"+city;
jedis.setex(dbKey, 3600*24 ,locationsList);
jedis.publish("location-key", dbKey);
}}
Publish in Redis channel for debugging
Persist in Redisfor 24h
![Page 45: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/45.jpg)
45
Shuffle Grouping
Shuffle Grouping
Check-in #1Check-in #2Check-in #3Check-in #4Check-in #5Check-in #6Check-in #7Check-in #8Check-in #9...
Sample Checkins File
Read Text Addresses
Transforming the TuplesCheckins
Spout
GeocodeLookup
Bolt
HeatmapBuilder
Bolt
DatabasePersistor
Bolt
Get Geo Location
Geo LocationService
Field Grouping(city)
Group by city
![Page 46: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/46.jpg)
46
Heat Map Topologypublic class LocalTopologyRunner { public static void main(String[] args) {
TopologyBuilder builder = buildTopolgy();StormSubmitter.submitTopology(
"local-heatmap", new Config(), builder.createTopology());
}
private static TopologyBuilder buildTopolgy() { topologyBuilder builder = new TopologyBuilder(); builder.setSpout("checkins", new CheckinsSpout());
builder.setBolt("geocode-lookup", new GeocodeLookupBolt() ).shuffleGrouping("checkins");
builder.setBolt("heatmap-builder", new HeatMapBuilderBolt() ).fieldsGrouping("geocode-lookup", new Fields("city"));
builder.setBolt("persistor", new PersistorBolt() ).shuffleGrouping("heatmap-builder");
return builder;
}}
![Page 47: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/47.jpg)
47
Its NOT Scaled
![Page 48: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/48.jpg)
48
![Page 49: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/49.jpg)
49
Scaling the Topologypublic class LocalTopologyRunner {conf.setNumWorkers(20); public static void main(String[] args) {
TopologyBuilder builder = buildTopolgy();Config conf = new Config();conf.setNumWorkers(2);StormSubmitter.submitTopology(
"local-heatmap", conf, builder.createTopology()); }
private static TopologyBuilder buildTopolgy() { topologyBuilder builder = new TopologyBuilder(); builder.setSpout("checkins", new CheckinsSpout(), 4 );
builder.setBolt("geocode-lookup", new GeocodeLookupBolt() , 8 ).shuffleGrouping("checkins").setNumTasks(64);
builder.setBolt("heatmap-builder", new HeatMapBuilderBolt() , 4).fieldsGrouping("geocode-lookup", new Fields("city"));
builder.setBolt("persistor", new PersistorBolt() , 2 ).shuffleGrouping("heatmap-builder").setNumTasks(4);
return builder;
}}
Parallelism hint
Increase TasksFor Future
Set no. of workers
![Page 50: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/50.jpg)
50
Demo
![Page 51: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/51.jpg)
51
Database
Storm Heat-Map Topology
Persist Checkin Intervals
GET Geo Location
Check-in #1Check-in #2Check-in #3Check-in #4Check-in #5Check-in #6Check-in #7Check-in #8Check-in #9...
ReadText Address
Sample Checkins File
Recap – Plan A
Geo LocationService
![Page 52: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/52.jpg)
52
We have something working
![Page 53: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/53.jpg)
53
Add Kafka Messaging
![Page 54: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/54.jpg)
54
Plan B - Kafka Spout&Bolt to HeatMap
GeocodeLookup
Bolt
HeatmapBuilder
Bolt
KafkaCheckins
Spout
Database
PersistorBolt
Geo LocationService
Read Text Addresses
Checkin Kafka Topic
Publish Checkins
Locations Topic
KafkaLocations
Bolt
![Page 55: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/55.jpg)
55
![Page 56: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/56.jpg)
56
They all are GoodBut not for all use-cases
![Page 57: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/57.jpg)
57
KafkaA little introduction
![Page 58: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/58.jpg)
58
![Page 59: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/59.jpg)
59
Pub-Sub Messaging System
![Page 60: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/60.jpg)
60
![Page 61: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/61.jpg)
61
![Page 62: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/62.jpg)
62
![Page 63: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/63.jpg)
63
![Page 64: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/64.jpg)
64
Stateless Broker &Doesn't Fear the File System
![Page 65: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/65.jpg)
65
![Page 66: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/66.jpg)
66
![Page 67: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/67.jpg)
67
![Page 68: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/68.jpg)
68
Topics● Logical collections of partitions (the physical fi les). ● A broker contains some of the partitions for a topic
![Page 69: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/69.jpg)
69
A partition is Consumed byExactly One Group's Consumer
![Page 70: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/70.jpg)
70
Distributed & Fault-Tolerant
![Page 71: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/71.jpg)
71
Broker 1 Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 72: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/72.jpg)
72
Broker 1 Broker 4Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 73: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/73.jpg)
73
Broker 1 Broker 4Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 74: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/74.jpg)
74
Broker 1 Broker 4Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 75: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/75.jpg)
75
Broker 1 Broker 4Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 76: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/76.jpg)
76
Broker 1 Broker 4Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 77: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/77.jpg)
77
Broker 1 Broker 4Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 78: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/78.jpg)
78
Broker 1 Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 79: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/79.jpg)
79
Broker 1 Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 80: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/80.jpg)
80
Broker 1 Broker 3Broker 2
Zoo Keeper
Consumer 1 Consumer 2
Producer 1 Producer 2
![Page 81: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/81.jpg)
81
Broker 1 Broker 3Broker 2
Zoo Keeper
Consumer 1
Producer 1 Producer 2
![Page 82: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/82.jpg)
82
Broker 1 Broker 3Broker 2
Zoo Keeper
Consumer 1
Producer 1 Producer 2
![Page 83: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/83.jpg)
83
Broker 1 Broker 3Broker 2
Zoo Keeper
Consumer 1
Producer 1 Producer 2
![Page 84: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/84.jpg)
84
Performance Benchmark1 Broker
1 Producer1 Consumer
![Page 85: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/85.jpg)
85
![Page 86: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/86.jpg)
86
![Page 87: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/87.jpg)
87
Add Kafka to our Topologypublic class LocalTopologyRunner { ...
private static TopologyBuilder buildTopolgy() {
... builder.setSpout("checkins", new KafkaSpout(kafkaConfig)); ... builder.setBolt("kafkaProducer", new KafkaOutputBolt ( "localhost:9092",
"kafka.serializer.StringEncoder","locations-topic"))
.shuffleGrouping("persistor");
return builder;}
}
Kafka Bolt
Kafka Spout
![Page 88: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/88.jpg)
88
Checkin HTTP Reactor
Publish Checkins
Plan C – Add Reactor
Database
Checkin Kafka Topic
Consume Checkins
Storm Heat-Map Topology
Locations Kafka Topic
Publish Interval Key
Persist Checkin Intervals
Geo LocationServiceGET Geo
Location
Index Interval Locations
Search Server
Index
Text-Address
![Page 89: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/89.jpg)
89
Why Reactor ?
![Page 90: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/90.jpg)
90
C10KProblem
![Page 91: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/91.jpg)
91
2008:Thread Per Request/Response
![Page 92: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/92.jpg)
92
...events trigger handlers
Application registers handlers
Reactor Pattern Paradigm
![Page 93: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/93.jpg)
93
Reactor Pattern – Key Points
● Single thread / single event loop● EVERYTHING runs on it● You MUST NOT block the event loop● Many Implementations (partial list):
– Node.js (JavaScrip), EventMachine (Ruby), Twisted Node.js (JavaScrip), EventMachine (Ruby), Twisted (Python)... and Vert.X(Python)... and Vert.X
![Page 94: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/94.jpg)
94
Reactor Pattern Problems
● Some work is naturally blocking:– Intensive data crunchingIntensive data crunching– 3rd-party blocking API’s (e.g. JDBC)3rd-party blocking API’s (e.g. JDBC)
● Pure reactor (e.g. Node.js) is not a good fi t for this kind of work!
![Page 95: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/95.jpg)
95
![Page 96: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/96.jpg)
96
● Vertciles are Execution unit of Vert.x● Single threaded● Verticles communicate by message passing
Verticles
For Blocking IORun in Thread-PoolRun in
Event Loop
![Page 97: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/97.jpg)
97
Vert.X Architecture
Event Bus
Vert.X Architecture
![Page 98: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/98.jpg)
98
Vert.X Goodies● Growing Module● Repository● web server● Persistors (Mongo,
JDBC, ...)● Work queue● Authentication● Manager● Session manager● Socket.IO
● TCP/SSL servers/clients● HTTP/HTTPS servers/
clients● WebSockets support● SockJS support● Timers● Buffers● Streams and Pumps● Routing● Asynchronous File I/O
![Page 99: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/99.jpg)
99
Node.JS vs Vert.X
![Page 100: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/100.jpg)
100
Node.js vs Vert.X
● Node.js– JavaScript OnlyJavaScript Only
– Inherently Single Inherently Single ThreadedThreaded
– No help much with IPCNo help much with IPC
– All code MUST be in All code MUST be in Event loopEvent loop
● Vert.X– Polyglot (JavaScript, Polyglot (JavaScript,
Java, Ruby, Python...)Java, Ruby, Python...)
– Leverages JVM multi-Leverages JVM multi-threadingthreading
– Nervous Event BusNervous Event Bus
– Blocking work can be Blocking work can be done off the event loopdone off the event loop
![Page 101: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/101.jpg)
101
Node.js vs Vert.X Benchmark
AMD Phenom II X6 (6 core), 8GB RAM, Ubuntu 11.04
http://vertxproject.wordpress.com/2012/05/09/vert-x-vs-node-js-simple-http-benchmarks/
![Page 102: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/102.jpg)
102
Event Bus
HTTPServer Verticle
Kafka module
Kafka Topic
StormTopology
HeatMap Reactor ArchitectureVert.X Instance
Automatically sendsEventBus Msg → KafkaTopic
Vert.X Instance
![Page 103: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/103.jpg)
103
Heat-Map Server – Only 6 LOC !
var vertx = require('vertx');var container = require('vertx/container');var console = require('vertx/console');var config = container.config;
vertx.createHttpServer().requestHandler(function(request) {request.dataHandler(function(buffer) {
vertx.eventBus.send(config.address, {payload:buffer.toString()});});request.response.end();
}).listen(config.httpServerPort, config.httpServerHost);
console.log("HTTP CheckinsReactor started on port "+config.httpServerPort);
Send checkin to Vert.X EventBus
![Page 104: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/104.jpg)
104
Database
Checkin HTTP Reactor Checkin
Kafka Topic
Consume Checkins
Storm Heat-Map Topology
Hotzones Kafka Topic
Publish Interval Key
Persist Checkin Intervals
Web App
Geo LocationServiceGET Geo
Location
Get Interval Locations
Consume Intervals Keys
Push via WebSocket
Publish Checkins
Search Server
Index
Index Interval Locations
Search
Checkin HTTP Firehose
![Page 105: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/105.jpg)
105
Demo
![Page 106: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/106.jpg)
106
Lambda Architecture
![Page 107: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/107.jpg)
107
Until Now...
![Page 108: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/108.jpg)
108
Doesn't Answer Many Answers...
● What are the most popular Salsa club in last month?
● How many unique visitors this year , per Salsa club?
● Show histogram of “bouncing” checkins for the last year?
![Page 109: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/109.jpg)
109
Batch Processing
![Page 110: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/110.jpg)
110
Sing in Concert ?
![Page 111: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/111.jpg)
111
Complementary Views
Batch Views Real TimeViews
Just a few hours of data
Time
Now
![Page 112: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/112.jpg)
112
Lambda Architecture
New Data01101001101...
real-timeview
real-timeview
real-timeview
Speed Layer
master dataset
batchview
batchview
batchview
Serving Layer
Batch Layer
QueryHow Many ?
![Page 113: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/113.jpg)
113
Lambda Advantages
● Recover from Human mistakes● No need for random writes on Batch DB● Processed with high precision, and involve
algorithms without losing short-term information
![Page 114: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/114.jpg)
114
Summary
![Page 115: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/115.jpg)
115
When You go out to Salsa Club
● Good Music
● Crowded
![Page 116: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/116.jpg)
116
More Conclusions..
● Storm – Great for real-time BigData processing. Complementary for Hadoop batch jobs.
● Kafka – Great messaging for logs/events data, been served as a good “source” for Storm spout
● Vert.X – Worth trial and check as an alternative for reactor.
● Lambda Architecture – Bring Real-Time and Batch-Processing concert for Big Data.
![Page 117: Heatmap](https://reader035.vdocument.in/reader035/viewer/2022081800/54c6ddce4a79597d0e8b45e6/html5/thumbnails/117.jpg)
117
Thanks