Apache Solr - search for everyone!


Upload: jaran-nilsen

Post on 12-Jan-2015

DESCRIPTION

Talk presented at the Baksia meetup in Oslo on November 23rd, 2011.

TRANSCRIPT

Page 1: Apache Solr - search for everyone!

Apache Solr

Apache Solr- search for everyone!

http://www.flickr.com/photos/malikdhadha/

Page 2: Apache Solr - search for everyone!

Apache Solr

Jaran Nilsen

twitter.com/jarannilsen

• Co-founder and R&D Director at Integrasco

• Founder and developer of Notpod

• Leader of javaBin Sørlandet

• Programmer and Open Source enthusiast

Page 3: Apache Solr - search for everyone!

A global leader in social intelligence

Page 4: Apache Solr - search for everyone!

Apache Solr

What is search?

http://www.flickr.com/photos/denverjeffrey/5133538450/

Page 5: Apache Solr - search for everyone!
Page 6: Apache Solr - search for everyone!

Apache Solr

Page 7: Apache Solr - search for everyone!

http://www.flickr.com/photos/somegeekintn/3709203268/

Page 8: Apache Solr - search for everyone!

This is Apache Solr

• Open Source enterprise search server from Apache

• Built on Apache Lucene

• Offers additional features to those of Lucene

Page 9: Apache Solr - search for everyone!

First, a little history...

Page 10: Apache Solr - search for everyone!

• Started out in 2004 as an in-house CNET project to add search functionality to the CNET website

Page 11: Apache Solr - search for everyone!


• Donated to the Apache Software Foundation in 2006

Page 12: Apache Solr - search for everyone!


• Graduated from incubation status in 2007

Page 13: Apache Solr - search for everyone!

• Since version 3.1 (March 2011), Solr and Lucene share the same codebase.

Page 14: Apache Solr - search for everyone!


• This means features and fixes are shared between the two projects at a much higher rate.

Page 15: Apache Solr - search for everyone!
Page 16: Apache Solr - search for everyone!

wget http://apache.uib.no/lucene/solr/3.6.1/apache-solr-3.6.1.tgz

tar xvf apache-solr-3.6.1.tgz

cd apache-solr-3.6.1/example/

java -jar start.jar

4 small steps...

Page 17: Apache Solr - search for everyone!

...and we’re up!

Page 18: Apache Solr - search for everyone!
Page 19: Apache Solr - search for everyone!

cd exampledocs/

./post.sh ipod_other.xml
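What post.sh does can be approximated from Python. This is a minimal sketch, assuming the example server from the previous steps listens on http://localhost:8983/solr (an assumption, adjust to your setup). The helper name build_update_request and the sample document are made up for illustration. The code only builds the HTTP requests; sending them is left commented out because it needs a running Solr.

```python
import urllib.request

# Assumed URL of the example server started with "java -jar start.jar".
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_body, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST request carrying an XML update message."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# A hypothetical document; the field names must exist in your schema.xml.
add_request = build_update_request(
    '<add><doc><field name="id">demo-1</field>'
    '<field name="name">Demo document</field></doc></add>'
)
# Added documents only become searchable after a commit.
commit_request = build_update_request("<commit/>")

# To actually send them (requires a running Solr):
# for request in (add_request, commit_request):
#     urllib.request.urlopen(request).read()
```

Keeping the add and the commit as two separate requests mirrors the idea of indexing many documents first and committing once at the end.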

Page 20: Apache Solr - search for everyone!
Page 21: Apache Solr - search for everyone!

The obvious part – full text searching

http://www.flickr.com/photos/49889874@N05/6877840735/

Page 22: Apache Solr - search for everyone!

• q=yourquery

• Example: q=android AND ios&rows=100
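The spaces in a query like the one above have to be URL-encoded before the request is sent. A minimal sketch of building such a URL, assuming the example server's select handler lives at http://localhost:8983/solr/select (an assumption); build_search_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed base URL of the example server; adjust to your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(query, rows=10, base_url=SOLR_SELECT_URL):
    """Return a select URL with the query string safely URL-encoded."""
    params = {"q": query, "rows": rows}
    return base_url + "?" + urlencode(params)


url = build_search_url("android AND ios", rows=100)
print(url)
# prints http://localhost:8983/solr/select?q=android+AND+ios&rows=100
```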

Page 23: Apache Solr - search for everyone!
Page 24: Apache Solr - search for everyone!
Page 25: Apache Solr - search for everyone!
Page 26: Apache Solr - search for everyone!
Page 27: Apache Solr - search for everyone!

Don’t worry - it’s not just XML!

Page 28: Apache Solr - search for everyone!

The Schema

http://www.flickr.com/photos/14804582@N08/2111269218/

Page 29: Apache Solr - search for everyone!

Key elements of schema.xml

• Unique identifier

• Default search field

• Types

• Fields and dynamic fields

• Copy fields
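To make the elements listed above concrete, here is a small sketch that parses a deliberately tiny, hypothetical schema.xml and pulls out each of them. The field and type names are invented for illustration, and a real schema would also define analyzers for its text types.

```python
import xml.etree.ElementTree as ET

# A hypothetical, stripped-down schema.xml; real schemas are much longer.
SCHEMA_XML = """
<schema name="demo">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text" class="solr.TextField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
    <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>text</defaultSearchField>
  <copyField source="title" dest="text"/>
</schema>
"""

root = ET.fromstring(SCHEMA_XML)

summary = {
    # The field that uniquely identifies a document.
    "unique_key": root.findtext("uniqueKey"),
    # The field searched when a query names no field.
    "default_search_field": root.findtext("defaultSearchField"),
    "types": [t.get("name") for t in root.iter("fieldType")],
    "fields": {f.get("name"): f.get("type") for f in root.iter("field")},
    # Dynamic fields match by name pattern, here any field ending in "_s".
    "dynamic_fields": [d.get("name") for d in root.iter("dynamicField")],
    # Copy fields duplicate one field's content into another at index time.
    "copy_fields": [(c.get("source"), c.get("dest")) for c in root.iter("copyField")],
}

for key, value in summary.items():
    print(key, "=", value)
```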

Page 30: Apache Solr - search for everyone!
Page 31: Apache Solr - search for everyone!
Page 32: Apache Solr - search for everyone!
Page 33: Apache Solr - search for everyone!

Solr configuration

http://www.flickr.com/photos/esetianto/4099842490/

Page 34: Apache Solr - search for everyone!

Key elements of solrconfig.xml

• Settings for your search index

• Warm-up routines

• Cache settings

• Replication

• Update chain

Page 35: Apache Solr - search for everyone!

Features

http://xkcd.com/619/

Page 36: Apache Solr - search for everyone!

Facets

Page 37: Apache Solr - search for everyone!

Facets

Page 38: Apache Solr - search for everyone!

Facets

Page 39: Apache Solr - search for everyone!

Just add this to your URL:

• facet=true&facet.field=field

• Example: facet=true&facet.field=language
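A sketch of how a client might request facets and read the counts back. It assumes the JSON response writer (wt=json) with its default layout, where the counts for a field arrive as one flat list alternating value and count. The sample response below is invented, and both helper names are hypothetical.

```python
from urllib.parse import urlencode


def build_facet_params(query, facet_field):
    """Return the query string for a search faceted on a single field."""
    return urlencode(
        {"q": query, "facet": "true", "facet.field": facet_field, "wt": "json"}
    )


def pair_facet_counts(flat_counts):
    """Turn ["en", 12, "no", 7] into [("en", 12), ("no", 7)]."""
    return list(zip(flat_counts[0::2], flat_counts[1::2]))


# Invented excerpt of a JSON response, already decoded into Python objects.
sample_response = {
    "facet_counts": {
        "facet_fields": {"language": ["en", 12, "no", 7, "de", 3]}
    }
}

query_string = build_facet_params("android", "language")
language_counts = pair_facet_counts(
    sample_response["facet_counts"]["facet_fields"]["language"]
)
print(query_string)
print(language_counts)
```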

Page 40: Apache Solr - search for everyone!
Page 41: Apache Solr - search for everyone!

Facet queries

Page 42: Apache Solr - search for everyone!

Facet queries

&facet=true&facet.query=price:[* TO 100]&facet.query=price:[100 TO 200]&facet.query=price:[200 TO 300]&facet.query=price:[300 TO 400]&facet.query=price:[400 TO 500]&facet.query=price:[500 TO *]
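Writing the buckets by hand gets tedious, so they can be generated. A minimal sketch that reproduces the six price ranges above from a list of boundaries; price_facet_queries is a hypothetical helper name.

```python
from urllib.parse import urlencode


def price_facet_queries(boundaries, field="price"):
    """Build one range query per bucket, open-ended at both extremes."""
    edges = ["*"] + [str(b) for b in boundaries] + ["*"]
    return [
        f"{field}:[{low} TO {high}]"
        for low, high in zip(edges, edges[1:])
    ]


queries = price_facet_queries([100, 200, 300, 400, 500])

# A list of pairs lets the same parameter name repeat once per bucket.
params = [("facet", "true")] + [("facet.query", q) for q in queries]
query_string = urlencode(params)

for q in queries:
    print(q)
```

Square brackets are inclusive in the range syntax, so a document priced at exactly 100 is counted in both the first and the second bucket.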

Page 43: Apache Solr - search for everyone!

Now you want to drill down!

http://www.flickr.com/photos/kk/4712925031/

Page 44: Apache Solr - search for everyone!

Filter queries

Page 45: Apache Solr - search for everyone!

Filter queries

Page 46: Apache Solr - search for everyone!

Filter queries

Page 47: Apache Solr - search for everyone!

Just add this to your URL:

• fq=field:value

• Example: fq=source:facebook.com
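Drilling down usually means adding one fq parameter per facet value the user has clicked. A minimal sketch, with a hypothetical helper name; the language filter is invented for illustration.

```python
from urllib.parse import urlencode


def build_filtered_query(query, filters):
    """Combine a main query with any number of field:value filter queries."""
    params = [("q", query)]
    # Each filter becomes its own fq parameter, so filters narrow the
    # result set independently of the main query.
    params += [("fq", f"{field}:{value}") for field, value in filters]
    return urlencode(params)


query_string = build_filtered_query(
    "android",
    [("source", "facebook.com"), ("language", "en")],
)
print(query_string)
# prints q=android&fq=source%3Afacebook.com&fq=language%3Aen
```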

Page 48: Apache Solr - search for everyone!

Produce «word clouds»

Page 49: Apache Solr - search for everyone!

• TermsComponent

• TermVectorComponent
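Once term counts are available, turning them into a word cloud is mostly a matter of scaling. A sketch under assumptions: the counts below are invented stand-ins for what a terms request might return, and word_cloud_sizes is a hypothetical helper that maps counts linearly onto font sizes.

```python
def word_cloud_sizes(term_counts, min_size=10, max_size=40):
    """Map each term's count linearly onto a font size in [min_size, max_size]."""
    if not term_counts:
        return {}
    low = min(term_counts.values())
    high = max(term_counts.values())
    span = high - low
    sizes = {}
    for term, count in term_counts.items():
        # When all counts are equal, every term gets the smallest size.
        fraction = (count - low) / span if span else 0.0
        sizes[term] = round(min_size + fraction * (max_size - min_size))
    return sizes


# Invented term counts, for illustration only.
sample_counts = {"android": 120, "ios": 80, "solr": 40}
sizes = word_cloud_sizes(sample_counts)
print(sizes)
# prints {'android': 40, 'ios': 25, 'solr': 10}
```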

Page 50: Apache Solr - search for everyone!

TermVectorComponent

Term vector information aggregator

Page 51: Apache Solr - search for everyone!
Page 52: Apache Solr - search for everyone!

Scalability

http://www.flickr.com/photos/dickyfeng/3249837481/

Page 53: Apache Solr - search for everyone!

• Sharding

• Replication

Page 54: Apache Solr - search for everyone!

Index sharding strategy

(Diagram: the index is split across Solr instances 1, 2, 3, 4 ... N.)

Page 55: Apache Solr - search for everyone!

Index sharding strategy

(Diagram: a search for «ipod OR iphone» against Solr instances 1, 2, 3, 4 ... N.)

Page 56: Apache Solr - search for everyone!


Page 57: Apache Solr - search for everyone!

Just add this to your URL:

• shards=shard1,shard2

• Example: q=android&shards=solr1.node.com/solr,solr2.node.com/solr,solr3.node.com/solr
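The shards parameter is just a comma-separated list, so it is easy to assemble from configuration. A minimal sketch using the host names from the example above; build_sharded_query is a hypothetical helper name.

```python
from urllib.parse import urlencode


def build_sharded_query(query, shards):
    """Return a query string that fans the search out to the given shards."""
    params = {"q": query, "shards": ",".join(shards)}
    # Keep "/" and "," unescaped so the shard list stays readable.
    return urlencode(params, safe="/,")


shards = [
    "solr1.node.com/solr",
    "solr2.node.com/solr",
    "solr3.node.com/solr",
]
query_string = build_sharded_query("android", shards)
print(query_string)
# prints q=android&shards=solr1.node.com/solr,solr2.node.com/solr,solr3.node.com/solr
```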

Page 58: Apache Solr - search for everyone!

Replication

(Diagram: Indexer, Master and Slave nodes, with a search for «android».)

Page 59: Apache Solr - search for everyone!

Replication configuration

Page 60: Apache Solr - search for everyone!

Integration of Solr

http://www.flickr.com/photos/certified_su/229016531/

Page 61: Apache Solr - search for everyone!

Solr has client support for many different programming languages

• Ruby

• PHP

• Java

• Scala

• Python

• .NET

• Perl

• JavaScript

Page 62: Apache Solr - search for everyone!
Page 63: Apache Solr - search for everyone!

Tips & Gotchas

Or: how to avoid the sinkholes!

http://www.flickr.com/photos/67165210@N00/4661419386/

Page 64: Apache Solr - search for everyone!

«What data do your clients need?»

Page 65: Apache Solr - search for everyone!

«Figure out what kind of searches you will be doing»

Page 66: Apache Solr - search for everyone!

«Spend a significant amount of time designing schema.xml»

Page 67: Apache Solr - search for everyone!

«Add dynamic fields for ALL your field types»

Page 68: Apache Solr - search for everyone!

«Do not use Solr as your primary data store!»

Page 69: Apache Solr - search for everyone!

«The 20 million mark»

Page 70: Apache Solr - search for everyone!

But most importantly...

Don’t panic!

Page 71: Apache Solr - search for everyone!

http://www.flickr.com/photos/11304375@N07/2046228644

Page 72: Apache Solr - search for everyone!

http://www.flickr.com/photos/davidw/2201099990/

Page 73: Apache Solr - search for everyone!

Thank you!

http://www.jeremiahblatz.com/personal/pics/Australia_Travel_Pictures_2009/day12/164_Sunrise_Great_Barrier_Reef.html