scaling up solr 4.1 to power big search in social media analytics

Scaling Solr 4 to Power Big Search in Social Media Analytics

Timothy Potter, Architect, Big Data Analytics, Dachis Group / Co-author, Solr in Action

® 2011 Dachis Group.

dachisgroup.com

• Anyone running SolrCloud in production today?

• Who is running a pre-Solr 4 version in production?

• Who has fired up Solr 4.x in SolrCloud mode?

• Personal interest – who has purchased Solr in Action in MEAP?

Audience poll



• Gain insight into the key design decisions you need to make when using SolrCloud

Wish I knew back then ...

• Solr 4 feature overview in context

• Zookeeper

• Distributed indexing

• Distributed search

• Real-time GET

• Atomic updates

• A day in the life ...

• Day-to-day operations

• What happens if you lose a node?

Goals of this talk



Our business intelligence platform analyzes relationships, behaviors, and conversations between 30,000 brands and 100M social accounts every 15 minutes.

About Dachis Group



• In production on 4.2.0

• 18 shards ~ 33M docs / shard, 25GB on disk per shard

• Multiple collections

• ~620 Million docs in main collection (still growing)

• ~100 Million docs in 30-day collection

• Inherent Parent / Child relationships (tweet and re-tweets)

• ~5M atomic updates to existing docs per day

• Batch-oriented updates

• Docs come in bursts from Hadoop; 8,000 docs/sec

• 3-4M new documents per day (deletes too)

• Business Intelligence UI, low(ish) query volume

Solution Highlights



• Scalability

Scale-out: sharding and replication

A little scale-up too: Fast disks (SSD), lots of RAM!

• High-availability

Redundancy: multiple replicas per shard

Automated fail-over: automated leader election

• Consistency

Distributed queries must return consistent results

Accepted writes must be on durable storage

• Simplicity - wip

Self-healing, easy to set up and maintain, able to troubleshoot

• Elasticity - wip

Add more replicas per shard at any time

Split large shards into two smaller ones

Pillars of my ideal search solution



Nuts and Bolts

Nice tag cloud wordle.net!



1. Zookeeper needs at least 3 nodes to establish quorum with fault tolerance. Embedded Zookeeper is only for evaluation purposes; deploy a stand-alone ensemble for production

2. Every Solr core creates ephemeral “znodes” in Zookeeper, which automatically disappear if the Solr process crashes

3. Zookeeper pushes notifications to all registered “watchers” when a znode changes; Solr caches cluster state

4. Zookeeper provides “recipes” for solving common problems faced when building distributed systems, e.g. leader election

5. Zookeeper provides centralized configuration distribution, leader election, and cluster state notifications

Zookeeper in a nutshell
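Why 3 nodes? A minimal sketch of the majority arithmetic behind a quorum. The function names are made up for illustration and are not part of any Zookeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest majority of an ensemble: strictly more than half the nodes."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 5):
    print(f"{n} node(s): quorum={quorum_size(n)}, "
          f"can lose={tolerated_failures(n)}")
```

With 1 or 2 nodes, no failure can be tolerated. 3 is the smallest ensemble that survives losing one node.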



• Number and size of indexed fields

• Number of documents

• Update frequency

• Query complexity

• Expected growth

• Budget

Number of shards?

Yay for shard splitting in 4.3 (SOLR-3755)!
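A back-of-the-envelope sketch for the "number of documents" and "expected growth" factors, using figures from this deck (~620M docs, 18 shards, 3-4M new docs per day). The 40M docs-per-shard ceiling is purely hypothetical, and the sketch ignores deletes and the other factors listed above.

```python
def docs_per_shard(total_docs: int, num_shards: int) -> int:
    """Average number of docs each shard holds (rounded down)."""
    return total_docs // num_shards


def days_until_ceiling(total_docs: int, num_shards: int,
                       ceiling_per_shard: int, net_docs_per_day: int) -> int:
    """Whole days of growth left before the average shard hits the ceiling."""
    headroom = ceiling_per_shard * num_shards - total_docs
    return max(0, headroom // net_docs_per_day)


print(docs_per_shard(620_000_000, 18))
# Hypothetical ceiling of 40M docs per shard, ~3.5M net new docs per day
print(days_until_ceiling(620_000_000, 18, 40_000_000, 3_500_000))
```

Running this kind of estimate before picking a shard count matters most when shards cannot be split later.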



We use Uwe Schindler’s advice on 64-bit Linux:

<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.MMapDirectoryFactory}"/>

See: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

java -Xmx4g ...

(hint: rest of our RAM goes to the OS to load index in memory mapped I/O)

Small cache sizes with aggressive eviction – spread GC penalty out over time vs. all at once every time you open a new searcher

<filterCache class="solr.LFUCache" size="50" initialSize="50" autowarmCount="25"/>

Index Memory Management
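To illustrate what "small cache with aggressive eviction" means, here is a toy least-frequently-used cache. It is a generic sketch, not Solr's LFUCache implementation, and the class name is made up.

```python
class SmallLFUCache:
    """Toy LFU cache: when full, evict the entry with the fewest hits."""

    def __init__(self, size):
        self.size = size
        self.values = {}   # key -> cached value
        self.hits = {}     # key -> number of successful lookups

    def get(self, key):
        if key in self.values:
            self.hits[key] += 1
            return self.values[key]
        return None

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.size:
            # Evict the least frequently used entry to make room
            victim = min(self.values, key=lambda k: self.hits[k])
            del self.values[victim]
            del self.hits[victim]
        self.values[key] = value
        self.hits.setdefault(key, 0)


cache = SmallLFUCache(2)
cache.put("fq1", "bits1")
cache.put("fq2", "bits2")
cache.get("fq1")            # fq1 now has one hit, fq2 has none
cache.put("fq3", "bits3")   # cache is full, so fq2 is evicted
```

A cache this small never holds much garbage at once, which is the point made above about spreading the GC penalty out.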



• Not a master

• Leader is a replica (handles queries)

• Accepts update requests for the shard

• Increments the _version_ on the new or updated doc

• Sends updates (in parallel) to all replicas

Leader = Replica + Add’l Work



Don’t let your tlogs get too big – use “hard” commits with openSearcher=“false”

Distributed Indexing

[Diagram: 2 shards with 1 replica each, spread over Node 1 and Node 2, each shard with a leader and a replica. Labels: View of cluster state from Zk; Zookeeper; CloudSolrServer “smart client”: get URLs of current leaders, hash on docID; leader sets the _version_; a tlog on each node; numbered steps 1-5.]

<autoCommit>

<maxDocs>10000</maxDocs>

<maxTime>60000</maxTime>

<openSearcher>false</openSearcher>

</autoCommit>

8,000 docs / sec to 18 shards
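A rough sketch of "hash on docID" routing: split the hash space into one contiguous range per shard and send each doc to the shard whose range contains its hash. Assumption: `zlib.crc32` stands in for the hash function here only to keep the example self-contained; it is not the hash Solr uses, and the function names are made up.

```python
import zlib


def shard_ranges(num_shards):
    """Split the 32-bit hash space into contiguous, near-equal ranges."""
    space = 2 ** 32
    step = space // num_shards
    ranges = []
    for i in range(num_shards):
        lo = i * step
        # The last shard absorbs the remainder of the hash space
        hi = space - 1 if i == num_shards - 1 else (i + 1) * step - 1
        ranges.append((lo, hi))
    return ranges


def route(doc_id, ranges):
    """Return the index of the shard whose range contains hash(doc_id)."""
    h = zlib.crc32(doc_id.encode("utf-8")) & 0xFFFFFFFF  # stand-in hash
    for shard, (lo, hi) in enumerate(ranges):
        if lo <= h <= hi:
            return shard
    raise ValueError("hash outside all ranges")


ranges = shard_ranges(18)
print(route("tweet-123", ranges))
```

Because the mapping depends only on the doc ID and the ranges, a smart client can send each update straight to the right shard leader.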



Send query request to any node

Two-stage process

1. Query controller sends query to all shards and merges results

One host per shard must be online or queries fail

2. Query controller sends 2nd query to all shards with documents in the merged result set to get requested fields

Solr client applications built for 3.x do not need to change (our query code still uses SolrJ 3.6)

Limitations

JOINs / Grouping need custom hashing

Distributed search

[Diagram: 2 nodes hosting Shard 1 and Shard 2, each shard with a leader and a replica. Labels: View of cluster state from Zk; Zookeeper; CloudSolrServer: get URLs of all live nodes (or just a load balancer works too); q=*:* sent to a query controller; get fields; numbered steps 1-4.]
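The two-stage process can be sketched with in-memory dictionaries standing in for shards. This is a toy model: the data layout, function names, and scoring function are assumptions, and it skips everything a real query controller does over the network.

```python
import heapq


def stage_one(shards, score_fn, rows):
    """Stage 1: collect each shard's top `rows` hits (score and id only), then merge."""
    candidates = []
    for shard_name, docs in shards.items():
        hits = [(score_fn(doc), doc_id, shard_name)
                for doc_id, doc in docs.items()]
        candidates.extend(heapq.nlargest(rows, hits))
    return heapq.nlargest(rows, candidates)


def stage_two(shards, merged, fields):
    """Stage 2: fetch the requested fields only for docs in the merged result."""
    results = []
    for score, doc_id, shard_name in merged:
        doc = shards[shard_name][doc_id]
        results.append({"id": doc_id, "score": score,
                        **{f: doc[f] for f in fields}})
    return results


shards = {
    "shard1": {"a": {"text": "solr", "likes": 5},
               "b": {"text": "cloud", "likes": 1}},
    "shard2": {"c": {"text": "zk", "likes": 9},
               "d": {"text": "tlog", "likes": 3}},
}
merged = stage_one(shards, lambda doc: doc["likes"], rows=2)
print(stage_two(shards, merged, fields=["text"]))
```

Only IDs and scores travel in the first stage. Stored fields are fetched in the second stage, and only for the docs that made the merged list.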



Search by daily activity volume

Drive analysis that measures the impact of a social message over time ... A company posts a tweet on Monday; how much activity is there around that message on Thursday?



Problem: Find all documents that had activity on a specific day

• tweets that had retweets or YouTube videos that had comments

• Use Solr join support to find parent documents by matching on child criteria

fq=_val_:"{!join from=echo_grouping_id_s to=id}day_tdt:[2013-05-01T00:00:00Z TO 2013-05-02T00:00:00Z}" ...

... But joins don’t work in distributed queries and are probably too slow anyway

Solution: Index daily activity into multi-valued fields. Use real-time GET to look up a document by ID to get the current daily volume fields

fq:daily_volume_tdtm('2013-05-02')

sort=daily_vol(daily_volume_s,'2013-04-01','2013-05-01')+desc

daily_volume_tdtm: [2013-05-01, 2013-05-02] <= doc has child signals on May 1 and 2

daily_volume_ssm: 2013-05-01|99, 2013-05-02|88 <= stored only field, doc had 99 child signals on May 1, 88 on May 2

daily_volume_s: 13050288|13050199 <= flattened multi-valued field for sorting using a custom ValueSource
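A sketch of how the flattened sort field could be built and summed over a date range. The encoding is inferred from the single example above (six digits of yyMMdd followed by the count, newest day first) and may not match the real field exactly. `daily_vol` here is a plain Python stand-in for the custom ValueSource, with an inclusive end date assumed.

```python
from datetime import date


def flatten(daily_counts):
    """Encode {date: count} as yyMMdd+count tokens joined by '|', newest first."""
    ordered = sorted(daily_counts.items(), reverse=True)
    return "|".join(f"{day:%y%m%d}{count}" for day, count in ordered)


def daily_vol(flattened, start, end):
    """Sum the counts of all days that fall within [start, end]."""
    total = 0
    for token in flattened.split("|"):
        if not token:
            continue  # empty field: no activity recorded
        day = date(2000 + int(token[0:2]), int(token[2:4]), int(token[4:6]))
        if start <= day <= end:
            total += int(token[6:])
    return total


field = flatten({date(2013, 5, 1): 99, date(2013, 5, 2): 88})
print(field)
print(daily_vol(field, date(2013, 4, 1), date(2013, 5, 1)))
```

Packing everything into one string field lets a function query compute the range total per document at sort time, with no join.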

Atomic updates and real-time get
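A sketch of the request shapes involved: an atomic update that adds one day's values to the multi-valued fields, and a real-time GET URL to read the doc back by ID. The base URL, collection name, doc ID, and the exact values written to each field are assumptions for illustration. Nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def atomic_update(doc_id, day, count):
    """Build an atomic update that appends one day to the daily volume fields."""
    return {
        "id": doc_id,
        # "add" appends a value to a multi-valued field
        "daily_volume_tdtm": {"add": f"{day}T00:00:00Z"},
        "daily_volume_ssm": {"add": f"{day}|{count}"},
    }


def realtime_get_url(base_url, doc_id):
    """Build a real-time GET URL that fetches the latest version of a doc."""
    return f"{base_url}/get?" + urlencode({"id": doc_id, "wt": "json"})


# Hypothetical doc ID and collection URL
payload = json.dumps([atomic_update("tweet-1", "2013-05-02", 88)])
url = realtime_get_url("http://localhost:8983/solr/signals", "tweet-1")
print(payload)
print(url)
```

Atomic updates only work when the other fields of the doc are stored, which is why the lessons learned say to store all fields.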



Will it work? Definitely!

Search can be addictive for your organization: the queries we tested for 6 months ago are vastly different from the ones we run today

Buy RAM – OOMs and aggressive garbage collection cause many issues

Give RAM from ^ to the OS – MMapDirectory

Need a disaster recovery process in addition to SolrCloud replication; helps with migrating to new hardware too

Use Jetty ;-)

Store all fields! Atomic updates are a life saver

Lessons learned



Schema will evolve – we thought we understood our data model but have since added at least 10 new fields and deprecated some too

Partition if you can! e.g. 30-day collection

We don't optimize – segment merging works great

Size your staging environment so that shards have about as many docs and same resources as prod. I have many more nodes in prod but my staging servers have roughly the same number of docs per shard, just fewer shards.

Don’t be afraid to customize Solr! It’s designed to be customized with plug-ins

• ValueSource is very powerful

• Check out PostFilters:

{!frange l=1 u=1 cost=200 cache=false}imca(53313,employee)

Lessons learned cont.



• Backups

.../replication?command=backup&location=/mnt/backups

• Monitoring

Replicas serving queries?

All replicas report same number of docs?

Zookeeper health

New searcher warm-up time

• Configuration update process

Our solrconfig.xml changes frequently – see Solr’s zkCli.sh

• Upgrade Solr process (it’s moving fast right now)

• Recover failed replica process

• Add new replica

• Kill the JVM on OOM (from Mark Miller)

-XX:OnOutOfMemoryError=/home/solr/on_oom.sh

-XX:+HeapDumpOnOutOfMemoryError

Minimum DevOps Reqts
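One of the monitoring checks above, "all replicas report same number of docs?", sketched as a pure function. It assumes you have already collected a doc count from each replica (for example with a non-distributed query per core). The shard names, replica names, and input shape are made up for illustration.

```python
def out_of_sync_shards(counts):
    """Return the shards whose replicas disagree on their doc count.

    counts: {shard_name: {replica_name: num_docs}}
    """
    return sorted(shard for shard, replicas in counts.items()
                  if len(set(replicas.values())) > 1)


counts = {
    "shard1": {"node1": 33_000_120, "node2": 33_000_120},
    "shard2": {"node1": 32_998_004, "node2": 32_997_950},
}
print(out_of_sync_shards(counts))
```

Counts can differ briefly while updates are in flight, so a check like this is best taken after a commit or repeated before alerting.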



Nodes will crash! (ephemeral znodes)

Or, sometimes you just need to restart a JVM (rolling restarts to upgrade)

Peer sync via update log (tlog) when the replica is no more than 100 updates behind, else ...

Good ol’ Solr replication from leader to replica

Node recovery
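The recovery decision on this slide, reduced to a few lines. The function name and return values are made up, and treating exactly 100 missed updates as still eligible for peer sync is an assumption.

```python
PEER_SYNC_LIMIT = 100  # updates a replica may have missed and still peer sync


def recovery_strategy(missed_updates):
    """Pick how a recovering replica catches up with its shard leader."""
    if missed_updates <= PEER_SYNC_LIMIT:
        return "peer_sync"         # replay the missed updates from a peer's tlog
    return "full_replication"      # copy the index from the leader


print(recovery_strategy(12))      # quick restart: few updates missed
print(recovery_strategy(50_000))  # long outage during a bulk load
```

At 8,000 docs/sec, a replica that is down during a bulk load falls past the limit almost immediately, so full replication is worth planning for.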



• Moving to a near real-time streaming model using Storm

• Buying more RAM per node

• Looking forward to shard splitting as it has become difficult to re-index 600M docs

• Re-building the index with DocValues

• We've had shards get out of sync after a major failure – resolved it by going back to the raw data and doing a key-by-key comparison of what we expected to be in the index and re-indexing any missing docs.

• Custom hashing to put all docs for a specific brand in the same shard

Roadmap / Futures
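The key-by-key comparison mentioned above boils down to a set difference between the IDs expected from the raw data and the IDs actually in the index. This is a minimal sketch with made-up IDs. At hundreds of millions of keys you would likely stream sorted ID lists instead of holding sets in memory.

```python
def missing_doc_ids(expected_ids, indexed_ids):
    """IDs present in the raw data but absent from the index."""
    return sorted(set(expected_ids) - set(indexed_ids))


expected = ["tweet-1", "tweet-2", "tweet-3", "tweet-4"]
indexed = ["tweet-1", "tweet-3"]
print(missing_doc_ids(expected, indexed))
```

The resulting list is the set of docs to re-index.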



If you find yourself in this situation, buy more RAM!

Obligatory lolcats slide

CONTACT

Timothy Potter

[email protected]

twitter: @thelabdude

