TRANSCRIPT
Rackspace Email’s solution for indexing 50k documents per second
George Bailey – Software Developer, Rackspace
Cameron Baker – Linux Engineer, Rackspace
Who we are…
• “Rackers” dedicated to Fanatical Support!
• Based out of San Antonio, TX
• Part of Cloud Office
• Email infrastructure development and engineering
2008: Original Problem
• 1 million mailboxes
• Support needs to track message delivery
• We need event aggregation + search
• Needed to provide Fanatical Support!
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
Original System Design
• Scribed: log aggregation, deposit into HDFS
• Hadoop 0.20: index via MapReduce
• Solr 1.4: search
• Custom tools: index loader, MapReduce jobs, scheduler
Past performance

Step                            Time
Transport                       < 1 minute
Index Generation (MapReduce)    10 minutes (cron)
Index Merge                     10 minutes (cron)
Searchable Events               20+ minutes
7 years later…
• 4+ million mailboxes
• Still running Solr 1.4, Hadoop 0.20, Scribed
• Scaling, maintenance issues
• Grew to 100+ physical servers, 15 VMs
• Events need to be used in other contexts
• 20+ minute time-to-search no longer acceptable
Goals
• Improve customer experience – Fanatical Support!
• Provide search results faster
• Reduce technologies
• Reduce the amount of custom code
• Reduce the number of physical servers
New System - Components
• Apache Flume: aggregation + processing
• Solr 1.4 to 4.x/5.x: NRT indexing, distributed search
• SolrCloud allowed us to reduce custom code by 75%
Flume: backpressure + hop availability
• Sinks may be unreachable or slow
• File Channel = durable buffering (config sketch below)
• capacity: disk / event size
• transactionCapacity: match source / sink
• minimumRequiredSpace
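A minimal file-channel sketch, assuming a hypothetical agent and channel name; the talk gives the property names but not values, so the numbers below are placeholders that illustrate the sizing rules:

# Durable on-disk buffer so events survive a slow or unreachable next hop.
agent.channels = fileChannel
agent.channels.fileChannel.type = file
agent.channels.fileChannel.checkpointDir = /var/lib/flume/checkpoint
agent.channels.fileChannel.dataDirs = /var/lib/flume/data
# capacity ~= usable disk / average event size
agent.channels.fileChannel.capacity = 10000000
# transactionCapacity should match the source/sink batch sizes
agent.channels.fileChannel.transactionCapacity = 1000
# stop accepting events if free disk falls below this many bytes
agent.channels.fileChannel.minimumRequiredSpace = 524288000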
Flume: batching and throughput
• Batch size is important (sketch below)
• File channels = slow
• Memory channels = fast
• “Loopback” flows
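A hedged sketch of the batching side, with hypothetical names and values: a fast in-memory channel whose transactionCapacity matches the batch size of the downstream Avro sink.

# Fast path: memory channel with matched batch sizes.
agent.channels.memChannel.type = memory
agent.channels.memChannel.capacity = 100000
agent.channels.memChannel.transactionCapacity = 5000
# Avro sink forwarding events to the next hop; batch-size should line up
# with the channel's transactionCapacity for best throughput.
agent.sinks.avroSink.type = avro
agent.sinks.avroSink.channel = memChannel
agent.sinks.avroSink.hostname = next-hop.example.com
agent.sinks.avroSink.port = 4545
agent.sinks.avroSink.batch-size = 5000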
Flume: controlling the flows
• One event, multiple uses
• Channel selectors
• Optional channels
• Interceptors
agent.sources.avroSource.selector.type = multiplexing
agent.sources.avroSource.selector.header = eventType
agent.sources.avroSource.selector.default = defaultChannel
agent.sources.avroSource.selector.mapping.authEvent = authEventChannel
agent.sources.avroSource.selector.mapping.mailEvent = mailEventChannel
agent.sources.avroSource.selector.optional.authEvent = optionalChannel
Flume: Morphlines + Solr
• Works with SolrCloud
• Many helpful built-in commands (morphline sketch below)
• Scripting support for Java
• Route to multiple collections
• Validate, modify events in-flight
http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html
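A short morphline sketch under stated assumptions: the collection name, ZooKeeper hosts, and field paths are hypothetical, and the commands shown (readJson, extractJsonPaths, sanitizeUnknownSolrFields, loadSolr) are standard Kite Morphlines commands rather than Rackspace's actual pipeline.

# Hypothetical morphline: parse JSON events, keep only schema-known fields, load into SolrCloud.
SOLR_LOCATOR : {
  collection : mail-events                       # hypothetical collection/alias name
  zkHost : "zk1:2181,zk2:2181,zk3:2181/solr"     # hypothetical ZooKeeper ensemble
}
morphlines : [
  {
    id : indexMailEvents
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      { readJson {} }
      { extractJsonPaths { paths : { message : /message, timestamp : /timestamp } } }
      { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR} } }
      { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
    ]
  }
]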
Requirements for Solr
• Near real time indexing of 30,000+ docs per sec
• Few queries (< 10,000 per day)
• Heavy distributed facet/group/sort queries
• Support removing documents older than X days
• Minimize JVM GC impact on indexing performance
Basic Solr install
(Diagram: one collection with Shard 1 and Shard 2, one Solr replica on each of Servers A–D – two replicas per shard.)
• ~2,500 docs per second per server; goal is 30,000 (30,000 / 2,500 = 12)
• 12 × the 4 servers above = 48 total servers
Consult the experts…
• Days of talking / hundreds of emails with Rishi Easwaran
• Recommendations from Shalin Mangar
• [email protected]
Result:
• Fewer physical servers
• Faster indexing
Collections – Optimized for additions/deletions
collection-2015-10-11
collection-2015-10-12
collection-2015-10-13
collection-2015-10-14
collection-2015-10-15
collection-2015-10-16
• Rolling collections by date (alias management sketched below)
• ~1 billion documents removed
• Aliases for updates/queries
• 25 shards - 2 replicas per shard
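A hedged sketch of how rolling collections plus aliases can be driven with the standard SolrCloud Collections API; the host, collection, alias, and configset names are hypothetical, not taken from the talk.

# Create the next day's collection (names and configset are hypothetical).
curl "http://solrhost:8983/solr/admin/collections?action=CREATE&name=collection-2015-10-17&numShards=25&replicationFactor=2&collection.configName=events_conf"

# Point an update alias at the newest collection and a query alias at the
# collections inside the retention window.
curl "http://solrhost:8983/solr/admin/collections?action=CREATEALIAS&name=events-update&collections=collection-2015-10-17"
curl "http://solrhost:8983/solr/admin/collections?action=CREATEALIAS&name=events-query&collections=collection-2015-10-13,collection-2015-10-14,collection-2015-10-15,collection-2015-10-16,collection-2015-10-17"

# Drop the oldest collection to remove its documents in one operation.
curl "http://solrhost:8983/solr/admin/collections?action=DELETE&name=collection-2015-10-11"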
JVM – Lean and mean
• 4GB max/min JVM heap size
• 5 Solr JVM processes per server
• Using Concurrent Mark Sweep GC (example flags after the diagram below)
• GC only on very heavy queries
• GC < 10ms; occurs < 10 times a day
• No impact on index performance
• Reads 28 indexes; writes 2 indexes
(Diagram: Server A with a single Solr process vs. Server A with 5 Solr JVM processes.)
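The talk doesn't list the exact JVM flags; a plausible sketch of a lean 4GB CMS configuration for each Solr process would look like this (standard HotSpot options):

# Hypothetical startup options for each of the 5 Solr JVMs on a server:
# fixed 4GB heap, Concurrent Mark Sweep collector, GC logging for monitoring.
SOLR_JAVA_OPTS="-Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:+PrintGCDetails -Xloggc:/var/log/solr/gc.log"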
JVM Monitoring – before it’s too late
• Proactive OOM monitoring
• Memory not being released
• Trapped in GC
• Restart processes (one hedged approach sketched below)
• Can impact entire cluster
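One hedged way to make the restart automatic (not necessarily what Rackspace did) is to have the JVM kill itself on OutOfMemoryError so a process supervisor restarts it, using standard HotSpot flags:

# Kill the process on OOM so a supervisor (systemd, monit, etc.) restarts it
# instead of leaving it trapped in endless GC cycles.
-XX:OnOutOfMemoryError="kill -9 %p"
# On Java 8u92+ the simpler built-in flag achieves the same:
-XX:+ExitOnOutOfMemoryError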
autoCommit for near real time indexing
Tested autoCommit and autoSoftCommit settings of:
• autoCommit: 5 seconds to 5 minutes
• autoSoftCommit: 1 second to 1 minute
Result:
• autoSoftCommit of 5 seconds and autoCommit of 1 minute balanced out memory usage and disk IO
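In solrconfig.xml the chosen settings look roughly like this; openSearcher=false on the hard commit is the usual NRT practice, though the slide doesn't state that detail explicitly:

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit every 60s: flushes to disk, does not open a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit every 5s: makes newly indexed documents searchable -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>
</updateHandler>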
DocValues – Reduced OOM Errors
• Struggled with OOME under heavy load
• Automated restart for nodes trapped in GC cycle
• Distributed facet/group/sort queries
Solution:
• docValues="true" for facet/group/sort fields (schema sketch below)
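docValues is a per-field schema attribute; a sketch with a hypothetical facet field:

<!-- hypothetical field used in facet/group/sort queries: docValues stores the
     column data on disk instead of un-inverting it onto the JVM heap -->
<field name="eventType" type="string" indexed="true" stored="true" docValues="true"/>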
Caching/Cache Warming – Measure and tune
• filterCache / queryResultCache / documentCache / etc.
• Very diverse queries (cache hit rates were too low)
• Benefits for our workload did not justify the cost
• Willing to accept slower queries
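For reference, these caches are configured in the <query> section of solrconfig.xml; a hedged sketch of keeping them small when, as here, hit rates don't justify the heap (the sizes are illustrative, not Rackspace's values):

<query>
  <!-- measure hit ratios (admin UI / cache MBeans) before spending heap here -->
  <filterCache      class="solr.FastLRUCache" size="64" initialSize="64" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache"     size="64" initialSize="64" autowarmCount="0"/>
  <documentCache    class="solr.LRUCache"     size="64" initialSize="64" autowarmCount="0"/>
</query>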
Configs - Keep it simple
• Example configs show off advanced features
• If you are not using the feature, turn it off
• Start with a trimmed-down config
• Only add features as needed
Present performance
• Sustained indexing of ~50,000 docs per sec
• Each replica indexes ~1,000 docs per sec
• New documents are searchable within 5 seconds
• 10,000 distributed facet/group/sort queries per day
• 1 billion new documents are indexed per day
• 13 billion documents are searchable
• 7TB of data across all indexes
Performance Comparison

Step                 Performance (2008)    Performance (2015)
Transport            < 1 minute            < 1 second (NRT)
Index Generation     10 minutes            < 5 seconds
Index Merge          10 minutes            N/A
Search               20+ minutes           < 5 seconds

• Faster transport
• No more batch processing
• No external index generation
• NRT indexing with SolrCloud
Environment Comparison

Server Type            Servers (2008)               Servers (2015)
Transport              Physical: 4, Virtual: 15     Physical: 4, Virtual: 20
Storage / processing   Physical: 100+, Virtual: 0   Physical: 0, Virtual: 0
Search                 Physical: 12, Virtual: 0     Physical: 10, Virtual: 5
Total                  Physical: 100+, Virtual: 15  Physical: 14, Virtual: 25

• Flume / Solr handle event storage and processing
• No more Hadoop footprint
• Over 80% reduction in servers
Future…
• Dedicated Solr nodes with SSDs for indexing
• Shard query collections for improved performance
• Larger JVM size for query nodes
• Multiple-datacenter SolrCloud (replication/mirroring)
Rackspace Email’s solution for indexing 50k documents per second
George Bailey – Software Developer, Rackspace
Cameron Baker – Linux Engineer, Rackspace
Thank you