Refactoring a Solr based API application

Architectural lessons learned from refactoring a Solr based API application. Torsten Bøgh Köster (Shopping24) Apache Lucene Eurocon, 19.10.2011

Upload: torsten-koester

Post on 16-Dec-2014


DESCRIPTION

Held at Apache Lucene Eurocon in Barcelona in October 2011

TRANSCRIPT

Page 1: Refactoring a Solr based API application

Architectural lessons learned from refactoring a Solr based API application.

Torsten Bøgh Köster (Shopping24) Apache Lucene Eurocon, 19.10.2011

Page 2: Refactoring a Solr based API application

Contents

Shopping24 and its API

Technical scaling solutions

• Sharding
• Caching
• Solr Cores
• „Elastic“ infrastructure

business requirements as key factor

Page 3: Refactoring a Solr based API application

@tboeghk

Software and systems architect
2 years of experience with Solr
3 years of experience with Lucene

Team of 7 Java developers currently at Shopping24

Page 4: Refactoring a Solr based API application

shopping24 internet group

Page 5: Refactoring a Solr based API application

1 portal became n portals

Page 6: Refactoring a Solr based API application

30 partner shops became 700

Page 7: Refactoring a Solr based API application

500k to 7m documents

Page 8: Refactoring a Solr based API application

index fact time

• 16 GB of data
• Single-core layout
• Up to 17s response time
• Machine size limited
• Stalled at Solr version 1.4
• API designed for small tools

Page 9: Refactoring a Solr based API application

scaling goal: 15-50m documents

Page 10: Refactoring a Solr based API application

ask the nerds

„Shard!“ That’ll be fun!

„Use spare compute cores at Amazon?“

breathe load into the cloud

„Reduce that index size“

„Get rid of those long running queries!“

Page 11: Refactoring a Solr based API application

data sharding ...

Page 12: Refactoring a Solr based API application

... is highly effective.

[Chart: time in ms (axis ticks 125 to 500) over concurrent requests (1 to 20), one series each for 1, 2, 3, 4, 6 and 8 shards]

Page 13: Refactoring a Solr based API application

Sharding: size matters

The bigger your index gets, the more complex your queries are, and the more concurrent requests you serve, the more sharding you need.
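
A minimal sketch of what sharding means on the client side, assuming a setup without automatic document routing: documents are assigned to a shard by a stable hash of their id, and queries are fanned out with Solr's `shards` request parameter. The host names and the helper names `shard_for` and `distributed_query_url` are hypothetical and not taken from the talk.

```python
import zlib
from urllib.parse import urlencode

# Hypothetical shard hosts; replace with your own.
SHARDS = [
    "solr1.example.com:8983/solr",
    "solr2.example.com:8983/solr",
    "solr3.example.com:8983/solr",
]


def shard_for(doc_id: str, shards=SHARDS) -> str:
    """Pick the shard a document is indexed into, using a stable hash of its id."""
    return shards[zlib.crc32(doc_id.encode("utf-8")) % len(shards)]


def distributed_query_url(query: str, shards=SHARDS, rows: int = 10) -> str:
    """Build a select URL that asks the first shard to fan the query out to all shards."""
    params = urlencode({"q": query, "rows": rows, "shards": ",".join(shards)})
    return f"http://{shards[0]}/select?{params}"


if __name__ == "__main__":
    print(shard_for("sku-4711"))
    print(distributed_query_url("red shoes"))
```

Adding a shard changes the modulo and therefore the assignment of most documents, so a scheme like this implies a full reindex whenever the shard count changes.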

Page 14: Refactoring a Solr based API application

but wait ...

Page 15: Refactoring a Solr based API application

Why do we have such a big index?

Page 16: Refactoring a Solr based API application

7m documents vs. 2m active products

Page 17: Refactoring a Solr based API application

fashion product lifecycle meets SEO

Bastografie / photocase.com

Page 18: Refactoring a Solr based API application

Separation of duties! Remove unsearchable data from your index.
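
One possible reading of this slide, sketched under assumptions: only products that can currently be found go into the search index, while expired products that are kept for other purposes (for example SEO landing pages) are served from a separate store. The `active` flag and the function name `split_for_indexing` are hypothetical.

```python
def split_for_indexing(products):
    """Return (searchable, archive_only) lists based on a hypothetical 'active' flag."""
    searchable, archive_only = [], []
    for product in products:
        target = searchable if product.get("active") else archive_only
        target.append(product)
    return searchable, archive_only


if __name__ == "__main__":
    catalog = [
        {"id": "1", "name": "summer dress", "active": True},
        {"id": "2", "name": "last season's coat", "active": False},
    ]
    to_index, to_archive = split_for_indexing(catalog)
    print(len(to_index), "documents for Solr,", len(to_archive), "for the archive store")
```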

Page 19: Refactoring a Solr based API application

Why do we have complex queries?

Page 20: Refactoring a Solr based API application

A Solr index designed for 1 portal

Page 21: Refactoring a Solr based API application

Grown into a multi-portal index

Page 22: Refactoring a Solr based API application

Let “sharding” follow your data ...

Page 23: Refactoring a Solr based API application

... and build separate cores for every client.
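
A rough sketch of a core-per-client layout, assuming a multi-core Solr with the CoreAdmin handler reachable at `/admin/cores`. The base URL, the `portal_` naming scheme and the helper names are illustrative assumptions, not details from the talk.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # hypothetical host


def core_name(portal: str) -> str:
    """Derive a core name from a portal name, e.g. 'Fashion DE' -> 'portal_fashion_de'."""
    slug = "".join(c.lower() if c.isalnum() else "_" for c in portal).strip("_")
    return f"portal_{slug}"


def create_core_url(portal: str) -> str:
    """CoreAdmin CREATE request that provisions the portal's own core."""
    name = core_name(portal)
    params = urlencode({"action": "CREATE", "name": name, "instanceDir": name})
    return f"{SOLR_BASE}/admin/cores?{params}"


def select_url(portal: str, query: str) -> str:
    """Query only the core that belongs to the given portal."""
    return f"{SOLR_BASE}/{core_name(portal)}/select?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(create_core_url("Fashion DE"))
    print(select_url("Fashion DE", "red shoes"))
```

Because each portal queries its own core, per-portal filters no longer have to be part of every query.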

Page 24: Refactoring a Solr based API application

Duplicate data as long as access is fast.

andybahn / photocase.com

Page 25: Refactoring a Solr based API application

Streamline your index provisioning process.

Page 26: Refactoring a Solr based API application

A thousand splendid cores at your fingertips.

Page 27: Refactoring a Solr based API application

Throwing hardware at problems. Automated.

Page 28: Refactoring a Solr based API application

evil traps: latency, $$

Page 29: Refactoring a Solr based API application

mirror your complete system – solve load balancer problems

froodmat / photocase.com

Page 30: Refactoring a Solr based API application

I said faster!

Page 31: Refactoring a Solr based API application

use a cache layer like Varnish.
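
To illustrate what such a cache layer does, here is a small in-process sketch. It is not Varnish and not Varnish configuration; it only shows the two ideas involved: normalizing the request so equivalent queries share one entry, and serving repeated requests from memory until a time-to-live expires. The names `cache_key` and `TtlCache` are made up for this example.

```python
import time
from urllib.parse import parse_qsl, urlencode


def cache_key(query_string: str) -> str:
    """Normalize parameter order so equivalent requests share one cache entry."""
    return urlencode(sorted(parse_qsl(query_string, keep_blank_values=True)))


class TtlCache:
    """Tiny time-to-live cache placed in front of a slow backend call."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # key -> (expires_at, response)
        self.hits = 0
        self.misses = 0

    def fetch(self, query_string: str, backend):
        key = cache_key(query_string)
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        response = backend(key)  # only cache misses reach the backend
        self.entries[key] = (now + self.ttl, response)
        return response


if __name__ == "__main__":
    cache = TtlCache(ttl_seconds=60)
    slow_backend = lambda qs: f"results for {qs}"
    cache.fetch("q=shoes&rows=10", slow_backend)
    cache.fetch("rows=10&q=shoes", slow_backend)  # same entry, served from cache
    print(cache.hits, cache.misses)
```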

Page 32: Refactoring a Solr based API application

What about those complex queries? Why do we have them? And how do we get rid of them?

Page 33: Refactoring a Solr based API application

Lost in encapsulation: Solr API exposed to world.

Page 34: Refactoring a Solr based API application

What’s the key factor?

Page 35: Refactoring a Solr based API application

look at your business requirements

Page 36: Refactoring a Solr based API application

decrease complexity
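
One way to act on the last few slides, sketched under assumptions: instead of passing raw Solr parameters through to the outside world, the API accepts a small set of business-level parameters and translates them itself. The public parameter names (`term`, `category`, `sort`, `page_size`) and the Solr field names `category` and `price` are hypothetical.

```python
ALLOWED_PARAMS = {"term", "category", "sort", "page_size"}
ALLOWED_SORTS = {
    "relevance": "score desc",
    "price_asc": "price asc",
    "price_desc": "price desc",
}
MAX_ROWS = 50


def to_solr_params(request: dict) -> dict:
    """Translate a small public request into Solr params and reject everything else."""
    unknown = set(request) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    term = str(request.get("term", "")).strip()
    if not term:
        raise ValueError("term is required")
    sort = request.get("sort", "relevance")
    if sort not in ALLOWED_SORTS:
        raise ValueError(f"unsupported sort: {sort}")
    params = {
        "q": term,
        "defType": "dismax",  # user input is not parsed as full Lucene query syntax
        "rows": min(int(request.get("page_size", 10)), MAX_ROWS),
        "sort": ALLOWED_SORTS[sort],
    }
    if "category" in request:
        value = str(request["category"]).replace("\\", "\\\\").replace('"', '\\"')
        params["fq"] = f'category:"{value}"'
    return params


if __name__ == "__main__":
    print(to_solr_params({"term": "red shoes", "category": "fashion", "page_size": 500}))
```

Clients can no longer send arbitrary, expensive queries, and the index layout can change without breaking them.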

Page 37: Refactoring a Solr based API application

Questions? Comments? Ideas?

Twitter: @tboeghk
Github: @tboeghk
Email: [email protected]

Web: http://www.s24.com

Images: sxc.hu (unless noted otherwise)