Apache Solr - search for everyone!

Post on 12-Jan-2015

DESCRIPTION

Talk presented at the Baksia meetup in Oslo on November 23rd, 2011.

TRANSCRIPT

Apache Solr

Apache Solr - search for everyone!

http://www.flickr.com/photos/malikdhadha/

Jaran Nilsen, twitter.com/jarannilsen

• Co-founder and R&D Director at Integrasco

• Founder and developer of Notpod

• Leader of javaBin Sørlandet

• Programmer and Open Source enthusiast

A global leader in social intelligence

What is search?

http://www.flickr.com/photos/denverjeffrey/5133538450/

http://www.flickr.com/photos/somegeekintn/3709203268/

This is Apache Solr

• Open Source enterprise search server from Apache

• Built on Apache Lucene

• Offers additional features to those of Lucene

First, a little history...

• Started out in 2004 as an in-house CNET project for adding search functionality to the CNET website

• Donated to the Apache Software Foundation in 2006

• Graduated from incubation status in 2007

• Since version 3.1 (March 2011), Solr and Lucene share the same codebase

• This means features and fixes move between the two projects at a much higher rate

wget http://apache.uib.no/lucene/solr/3.6.1/apache-solr-3.6.1.tgz

tar xvf apache-solr-3.6.1.tgz

cd apache-solr-3.6.1/example/

java -jar start.jar

4 small steps...

...and we’re up!

cd exampledocs/

./post.sh ipod_other.xml

The obvious part – full text searching

http://www.flickr.com/photos/49889874@N05/6877840735/

• q=yourquery

• Example: q=android AND ios&rows=100
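The query parameters above have to be URL-encoded before they are sent. A minimal sketch of building such a request URL follows. The base URL is an assumption: it is where the bundled example server usually listens, so adjust it for your own setup.

```python
from urllib.parse import urlencode

# Assumption: the example server started with `java -jar start.jar`
# is reachable at this address; change it to match your installation.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_search_url(query, rows=10, base=SOLR_SELECT):
    """Return a select URL with the q and rows parameters URL-encoded."""
    params = [("q", query), ("rows", rows)]
    return base + "?" + urlencode(params)


url = build_search_url("android AND ios", rows=100)
print(url)
# → http://localhost:8983/solr/select?q=android+AND+ios&rows=100
```

Spaces become `+` in the encoded URL, which the server decodes back to the original query string.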

Don’t worry - it’s not just XML!

The Schema

http://www.flickr.com/photos/14804582@N08/2111269218/

Key elements of schema.xml

• Unique identifier

• Default search field

• Types

• Fields and dynamic fields

• Copy fields
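To make the list above concrete, here is a small sketch that parses a heavily trimmed schema.xml and prints each key element. The schema fragment is hypothetical (the field and type names are invented for illustration), and it follows the Solr 3.x layout as I understand it, not a complete or recommended schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down schema.xml; all names are illustrative only.
SCHEMA_XML = """
<schema name="example">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text_general" class="solr.TextField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="title" type="text_general" indexed="true" stored="true"/>
    <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
    <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>text</defaultSearchField>
  <copyField source="title" dest="text"/>
</schema>
"""


def summarize_schema(xml_text):
    """Pull the key schema elements out of a schema.xml document."""
    root = ET.fromstring(xml_text)
    return {
        "unique_key": root.findtext("uniqueKey"),
        "default_search_field": root.findtext("defaultSearchField"),
        "types": [t.get("name") for t in root.findall("types/fieldType")],
        "fields": [f.get("name") for f in root.findall("fields/field")],
        "dynamic_fields": [d.get("name") for d in root.findall("fields/dynamicField")],
        "copy_fields": [(c.get("source"), c.get("dest")) for c in root.findall("copyField")],
    }


summary = summarize_schema(SCHEMA_XML)
for key, value in summary.items():
    print(key, "=", value)
```

In this fragment the copy field sends `title` into `text`, so a search against the default field also matches titles.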

Solr configuration

http://www.flickr.com/photos/esetianto/4099842490/

Key elements of solrconfig.xml

• Settings for your search index

• Warm-up routines

• Cache settings

• Replication

• Update chain

Features

http://xkcd.com/619/

Facets

Just add this to your URL:

• facet=true&facet.field=field

• Example: facet=true&facet.field=language
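A sketch of how a client might request a field facet and read the counts back. The sample response is invented for illustration. It assumes the JSON response writer (`wt=json`) with its default layout, where each facet field is a flat list alternating between value and count.

```python
from urllib.parse import urlencode


def facet_params(query, field):
    """Query string asking for a field facet, with a JSON response."""
    return urlencode([
        ("q", query),
        ("facet", "true"),
        ("facet.field", field),
        ("wt", "json"),
    ])


def facet_counts(response, field):
    """Turn the flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


print(facet_params("android", "language"))
# → q=android&facet=true&facet.field=language&wt=json

# Hypothetical response body, already decoded from JSON.
sample_response = {
    "facet_counts": {
        "facet_queries": {},
        "facet_fields": {"language": ["en", 12, "no", 5, "sv", 2]},
    }
}
counts = facet_counts(sample_response, "language")
print(counts)
# → {'en': 12, 'no': 5, 'sv': 2}
```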

Facet queries

&facet=true&facet.query=price:[* TO 100]&facet.query=price:[100 TO 200]&facet.query=price:[200 TO 300]&facet.query=price:[300 TO 400]&facet.query=price:[400 TO 500]&facet.query=price:[500 TO *]
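Writing out every range by hand gets tedious, so a client can generate the facet queries from a list of boundaries. This is a sketch only: the helper name is made up, and the boundaries are the ones used on the slide.

```python
from urllib.parse import urlencode


def price_range_queries(field, boundaries):
    """Build one range query per bucket, open-ended at both extremes."""
    edges = ["*"] + [str(b) for b in boundaries] + ["*"]
    return ["%s:[%s TO %s]" % (field, low, high)
            for low, high in zip(edges, edges[1:])]


queries = price_range_queries("price", [100, 200, 300, 400, 500])
for q in queries:
    print(q)

# Each range becomes its own facet.query parameter on the request.
params = [("facet", "true")] + [("facet.query", q) for q in queries]
query_string = urlencode(params)
```

Square brackets are inclusive at both ends, so a document priced at exactly 100 is counted in both of the first two buckets. Keep that in mind when the counts are displayed.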

Now you want to drill down!

http://www.flickr.com/photos/kk/4712925031/

Filter queries

Just add this to your URL:

• fq=field:value

• Example: fq=source:facebook.com
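Drilling down usually means keeping the original query and adding one `fq` parameter per selected facet value. A minimal sketch, with a hypothetical second filter on a `language` field:

```python
from urllib.parse import urlencode


def drill_down_params(query, filters):
    """Keep the main query and add one fq parameter per (field, value)."""
    params = [("q", query)]
    params += [("fq", "%s:%s" % (field, value)) for field, value in filters]
    return urlencode(params)


query_string = drill_down_params(
    "android",
    [("source", "facebook.com"), ("language", "en")],
)
print(query_string)
# → q=android&fq=source%3Afacebook.com&fq=language%3Aen
```

Each `fq` narrows the result set further, so multiple filters act as an intersection. They restrict which documents match without contributing to the relevance score.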

Produce «word clouds»

• TermsComponent

• TermVectorComponent

TermVectorComponent

Term vector information aggregator

Scalability

http://www.flickr.com/photos/dickyfeng/3249837481/

• Sharding

• Replication

[Diagram: Index sharding strategy. The search «ipod OR iphone» is sent to Solr instance 1, 2, 3, 4 ... N.]

Just add this to your URL:

• shards=shard1,shard2

• Example: q=android&shards=solr1.node.com/solr,solr2.node.com/solr,solr3.node.com/solr
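A sketch of both sides of sharding from the client's point of view: choosing a shard when indexing a document, and listing all shards when searching. The hash-modulo rule is just one possible index sharding strategy, chosen here for illustration and not taken from the talk. The host names are the ones from the example above.

```python
import zlib
from urllib.parse import urlencode

SHARDS = [
    "solr1.node.com/solr",
    "solr2.node.com/solr",
    "solr3.node.com/solr",
]


def pick_shard(doc_id, shards=SHARDS):
    """Assumed strategy: hash the unique id and take it modulo the shard count."""
    return shards[zlib.crc32(doc_id.encode("utf-8")) % len(shards)]


def sharded_query_params(query, shards=SHARDS):
    """Query string that fans the search out to every listed shard."""
    params = [("q", query), ("shards", ",".join(shards))]
    # Keep "/" and "," readable instead of percent-encoding them.
    return urlencode(params, safe="/,")


print(pick_shard("doc-42"))
print(sharded_query_params("android"))
# → q=android&shards=solr1.node.com/solr,solr2.node.com/solr,solr3.node.com/solr
```

A stable hash matters here: the same document id must always land on the same shard, otherwise updates create duplicates on other instances.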

Replication

[Diagram: Indexer, Master and Slave nodes, with the search «android».]

Replication configuration

Integration of Solr

http://www.flickr.com/photos/certified_su/229016531/

Solr has client support for many different programming languages

• Ruby

• PHP

• Java

• Scala

• Python

• .NET

• Perl

• JavaScript

Tips & Gotchas, or: how to avoid the sinkholes!

http://www.flickr.com/photos/67165210@N00/4661419386/

«What data do your clients need?»

«Figure out what kind of searches you will be doing»

«Spend a significant amount of time designing schema.xml»

«Add dynamic fields for ALL your field types»

«Do not use Solr as your primary data store!»

«The 20 million mark»

But most importantly...

Don’t panic!

http://www.flickr.com/photos/11304375@N07/2046228644

http://www.flickr.com/photos/davidw/2201099990/

Thank you!

http://www.jeremiahblatz.com/personal/pics/Australia_Travel_Pictures_2009/day12/164_Sunrise_Great_Barrier_Reef.html
