TRANSCRIPT
Rackspace Email’s solution for indexing 50k documents per second
George Bailey – Software Developer, Rackspace
Cameron Baker – Linux Engineer, Rackspace
Who we are…
• “Rackers” dedicated to Fanatical Support!
• Based out of San Antonio, TX
• Part of Cloud Office
• Email infrastructure development and engineering
2008: Original Problem
• 1 million mailboxes
• Support needs to track message delivery
• We need event aggregation + search
• Needed to provide Fanatical Support!
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
Original System Design
• Scribed: log aggregation, deposit into HDFS
• Hadoop 0.20: index via MapReduce
• Solr 1.4: search
• Custom tools: index loader, MapReduce jobs, scheduler
Past performance

Step                            Time
Transport                       < 1 minute
Index Generation (MapReduce)    10 minutes (cron)
Index Merge                     10 minutes (cron)
Searchable Events               20+ minutes
7 years later…
• 4+ million mailboxes
• Still running Solr 1.4, Hadoop 0.20, Scribed
• Scaling, maintenance issues
• Grew to 100+ physical servers, 15 VMs
• Events need to be used in other contexts
• 20+ minute time-to-search no longer acceptable
Goals
• Improve customer experience – Fanatical Support!
• Provide search results faster
• Reduce technologies
• Reduce the amount of custom code
• Reduce the number of physical servers
New System - Components
• Apache Flume: aggregation + processing
• Solr 1.4 to 4.x/5.x: NRT indexing, distributed search
• SolrCloud allowed us to reduce custom code by 75%
Flume: backpressure + hop availability
• Sinks may be unreachable or slow
• File Channel = durable buffering (config sketch below)
• capacity: disk / event size
• transactionCapacity: match source / sink
• minimumRequiredSpace
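A minimal file-channel sketch, assuming a hypothetical agent and channel name; the talk gives the property names but not values, so the numbers below are placeholders that illustrate the sizing rules:

# Durable on-disk buffer so events survive a slow or unreachable next hop.
agent.channels = fileChannel
agent.channels.fileChannel.type = file
agent.channels.fileChannel.checkpointDir = /var/lib/flume/checkpoint
agent.channels.fileChannel.dataDirs = /var/lib/flume/data
# capacity ~= usable disk / average event size
agent.channels.fileChannel.capacity = 10000000
# transactionCapacity should match the source/sink batch sizes
agent.channels.fileChannel.transactionCapacity = 1000
# stop accepting events if free disk falls below this many bytes
agent.channels.fileChannel.minimumRequiredSpace = 524288000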
Flume: batching and throughput
• Batch size is important (sketch below)
• File channels = slow
• Memory channels = fast
• “Loopback” flows
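A hedged sketch of the batching side, with hypothetical names and values: a fast in-memory channel whose transactionCapacity matches the batch size of the downstream Avro sink.

# Fast path: memory channel with matched batch sizes.
agent.channels.memChannel.type = memory
agent.channels.memChannel.capacity = 100000
agent.channels.memChannel.transactionCapacity = 5000
# Avro sink forwarding events to the next hop; batch-size should line up
# with the channel's transactionCapacity for best throughput.
agent.sinks.avroSink.type = avro
agent.sinks.avroSink.channel = memChannel
agent.sinks.avroSink.hostname = next-hop.example.com
agent.sinks.avroSink.port = 4545
agent.sinks.avroSink.batch-size = 5000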
Flume: controlling the flows
• One event, multiple uses
• Channel selectors
• Optional channels
• Interceptors
agent.sources.avroSource.selector.type = multiplexing
agent.sources.avroSource.selector.header = eventType
agent.sources.avroSource.selector.default = defaultChannel
agent.sources.avroSource.selector.mapping.authEvent = authEventChannel
agent.sources.avroSource.selector.mapping.mailEvent = mailEventChannel
agent.sources.avroSource.selector.optional.authEvent = optionalChannel
Flume: Morphlines + Solr
• Works with SolrCloud
• Many helpful built-in commands (morphline sketch below)
• Scripting support for Java
• Route to multiple collections
• Validate, modify events in-flight
http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html
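A short morphline sketch under stated assumptions: the collection name, ZooKeeper hosts, and field paths are hypothetical, and the commands shown (readJson, extractJsonPaths, sanitizeUnknownSolrFields, loadSolr) are standard Kite Morphlines commands rather than Rackspace's actual pipeline.

# Hypothetical morphline: parse JSON events, keep only schema-known fields, load into SolrCloud.
SOLR_LOCATOR : {
  collection : mail-events                       # hypothetical collection/alias name
  zkHost : "zk1:2181,zk2:2181,zk3:2181/solr"     # hypothetical ZooKeeper ensemble
}
morphlines : [
  {
    id : indexMailEvents
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      { readJson {} }
      { extractJsonPaths { paths : { message : /message, timestamp : /timestamp } } }
      { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR} } }
      { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
    ]
  }
]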
Requirements for Solr
• Near real time indexing of 30,000+ docs per sec
• Few queries (< 10,000 per day)
• Heavy distributed facet/group/sort queries
• Support removing documents older than X days
• Minimize JVM GC impact on indexing performance
Basic Solr install
(Diagram: one collection with Shard 1 and Shard 2, one Solr replica on each of Servers A–D – two replicas per shard.)
• ~2,500 docs per second per server; goal is 30,000 (30,000 / 2,500 = 12)
• 12 × the 4 servers above = 48 total servers
Consult the experts…
• Days of talking / hundreds of emails with Rishi Easwaran
• Recommendations from Shalin Mangar
• [email protected]
Result:
• Fewer physical servers
• Faster indexing
Collections – Optimized for additions/deletions
collection-2015-10-11
collection-2015-10-12
collection-2015-10-13
collection-2015-10-14
collection-2015-10-15
collection-2015-10-16
• Rolling collections by date (alias management sketched below)
• ~1 billion documents removed
• Aliases for updates/queries
• 25 shards - 2 replicas per shard
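A hedged sketch of how rolling collections plus aliases can be driven with the standard SolrCloud Collections API; the host, collection, alias, and configset names are hypothetical, not taken from the talk.

# Create the next day's collection (names and configset are hypothetical).
curl "http://solrhost:8983/solr/admin/collections?action=CREATE&name=collection-2015-10-17&numShards=25&replicationFactor=2&collection.configName=events_conf"

# Point an update alias at the newest collection and a query alias at the
# collections inside the retention window.
curl "http://solrhost:8983/solr/admin/collections?action=CREATEALIAS&name=events-update&collections=collection-2015-10-17"
curl "http://solrhost:8983/solr/admin/collections?action=CREATEALIAS&name=events-query&collections=collection-2015-10-13,collection-2015-10-14,collection-2015-10-15,collection-2015-10-16,collection-2015-10-17"

# Drop the oldest collection to remove its documents in one operation.
curl "http://solrhost:8983/solr/admin/collections?action=DELETE&name=collection-2015-10-11"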
JVM – Lean and mean
• 4GB max/min JVM heap size
• 5 Solr JVM processes per server
• Using Concurrent Mark Sweep GC (example flags after the diagram below)
• GC only on very heavy queries
• GC < 10ms; occurs < 10 times a day
• No impact on index performance
• Reads 28 indexes; writes 2 indexes
(Diagram: Server A with a single Solr process vs. Server A with 5 Solr JVM processes.)
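The talk doesn't list the exact JVM flags; a plausible sketch of a lean 4GB CMS configuration for each Solr process would look like this (standard HotSpot options):

# Hypothetical startup options for each of the 5 Solr JVMs on a server:
# fixed 4GB heap, Concurrent Mark Sweep collector, GC logging for monitoring.
SOLR_JAVA_OPTS="-Xms4g -Xmx4g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:+PrintGCDetails -Xloggc:/var/log/solr/gc.log"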
JVM Monitoring – before it’s too late
• Proactive OOM monitoring
• Memory not being released
• Trapped in GC
• Restart processes (one hedged approach sketched below)
• Can impact entire cluster
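One hedged way to make the restart automatic (not necessarily what Rackspace did) is to have the JVM kill itself on OutOfMemoryError so a process supervisor restarts it, using standard HotSpot flags:

# Kill the process on OOM so a supervisor (systemd, monit, etc.) restarts it
# instead of leaving it trapped in endless GC cycles.
-XX:OnOutOfMemoryError="kill -9 %p"
# On Java 8u92+ the simpler built-in flag achieves the same:
-XX:+ExitOnOutOfMemoryError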
autoCommit for near real time indexing
Tested autoCommit and autoSoftCommit settings of:
• autoCommit: 5 seconds to 5 minutes
• autoSoftCommit: 1 second to 1 minute
Result:
• autoSoftCommit of 5 seconds and autoCommit of 1 minute balanced out memory usage and disk IO
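In solrconfig.xml the chosen settings look roughly like this; openSearcher=false on the hard commit is the usual NRT practice, though the slide doesn't state that detail explicitly:

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit every 60s: flushes to disk, does not open a new searcher -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit every 5s: makes newly indexed documents searchable -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>
</updateHandler>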
DocValues – Reduced OOM Errors
• Struggled with OOME under heavy load
• Automated restart for nodes trapped in GC cycle
• Distributed facet/group/sort queries
Solution:
• docValues="true" for facet/group/sort fields (schema sketch below)
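docValues is a per-field schema attribute; a sketch with a hypothetical facet field:

<!-- hypothetical field used in facet/group/sort queries: docValues stores the
     column data on disk instead of un-inverting it onto the JVM heap -->
<field name="eventType" type="string" indexed="true" stored="true" docValues="true"/>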
Caching/Cache Warming – Measure and tune
• filterCache / queryResultCache / documentCache / etc.
• Very diverse queries (cache hit rates were too low)
• Benefits for our workload did not justify the cost
• Willing to accept slower queries
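For reference, these caches are configured in the <query> section of solrconfig.xml; a hedged sketch of keeping them small when, as here, hit rates don't justify the heap (the sizes are illustrative, not Rackspace's values):

<query>
  <!-- measure hit ratios (admin UI / cache MBeans) before spending heap here -->
  <filterCache      class="solr.FastLRUCache" size="64" initialSize="64" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache"     size="64" initialSize="64" autowarmCount="0"/>
  <documentCache    class="solr.LRUCache"     size="64" initialSize="64" autowarmCount="0"/>
</query>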
Configs - Keep it simple
• Example configs show off advanced features
• If you are not using the feature, turn it off
• Start with a trimmed-down config
• Only add features as needed
Present performance
• Sustained indexing of ~50,000 docs per sec
• Each replica indexes ~1,000 docs per sec
• New documents are searchable within 5 seconds
• 10,000 distributed facet/group/sort queries per day
• 1 billion new documents are indexed per day
• 13 billion documents are searchable
• 7TB of data across all indexes
Performance Comparison

Step                 Performance (2008)    Performance (2015)
Transport            < 1 minute            < 1 second (NRT)
Index Generation     10 minutes            < 5 seconds
Index Merge          10 minutes            N/A
Search               20+ minutes           < 5 seconds

• Faster transport
• No more batch processing
• No external index generation
• NRT indexing with SolrCloud
Environment Comparison

Server Type            Servers (2008)               Servers (2015)
Transport              Physical: 4, Virtual: 15     Physical: 4, Virtual: 20
Storage / processing   Physical: 100+, Virtual: 0   Physical: 0, Virtual: 0
Search                 Physical: 12, Virtual: 0     Physical: 10, Virtual: 5
Total                  Physical: 100+, Virtual: 15  Physical: 14, Virtual: 25

• Flume / Solr handle event storage and processing
• No more Hadoop footprint
• Over 80% reduction in servers
Future…
• Dedicated Solr nodes with SSDs for indexing
• Shard query collections for improved performance
• Larger JVM size for query nodes
• Multiple-datacenter SolrCloud (replication/mirroring)
Rackspace Email’s solution for indexing 50k documents per second
George Bailey – Software Developer, Rackspace
Cameron Baker – Linux Engineer, Rackspace
Thank you