Meet Solr for the First Time Again

Meet Solr for the First Time Again

Varun Thacker

Upload: varun-thacker

Post on 05-Jul-2015


Category:

Engineering



DESCRIPTION

Anyone who has tried integrating search into an application knows how powerful Solr is, but has probably wished it were simpler to get started with and simpler to take to production. I will talk about recent features that make Solr easier for users, and about some of the changes we plan to add soon to make the experience even better.

TRANSCRIPT

Page 1: Meet Solr for the First Time Again

Meet Solr for the First Time Again

Varun Thacker

Page 2: Meet Solr for the First Time Again

Apache Solr has a huge install base and tremendous momentum

• The most widely used search solution on the planet

• 8M+ total downloads

• 250,000+ monthly downloads

• Solr is both established and growing

• Tens of thousands of applications in production

• You use Solr every day

• 2,500+ open Solr jobs

Activity summary (via https://www.openhub.net/p/solr)

• 30-day summary, Aug 18 - Sep 17, 2014: 128 commits, 18 contributors

• 12-month summary, Sep 17, 2013 - Sep 17, 2014: 1351 commits, 29 contributors

Page 3: Meet Solr for the First Time Again

Search - Until recently

• Large organizations (Enterprise)

• Expensive

• Complex

• $$$$$

Page 4: Meet Solr for the First Time Again

New Age Search

• Everyone… startups, websites

• Special use cases

• E-commerce

• Mails and personal data

• Personal data - Across devices

• Social and Local!

• Analytics

Page 5: Meet Solr for the First Time Again

Decision making!

• Short time frame

• Confidence measure:

• Getting started quick

• Configure and see the tip of the iceberg

• Issues only surface later in the story

Page 6: Meet Solr for the First Time Again

Until recently…

• Getting started:

• Download

• java -jar start.jar

• SolrCloud, getting started….

• Download

• Copy example directory ‘x’ times over.

• java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar

• java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

• It runs!

Page 7: Meet Solr for the First Time Again

Times… they are a changin…

• Download

• cd solr

• Standalone: bin/solr start

• SolrCloud, example, interactive:

• bin/solr start -e cloud (< 2 minutes!)

Page 8: Meet Solr for the First Time Again

Let’s index some data

• Flexible JSON indexing - Solr accepts arbitrary JSON documents and can map them to the fields you need at index time

• More reading: https://lucidworks.com/blog/indexing-custom-json-data/
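As a rough illustration of the flexible JSON indexing bullet, the sketch below builds (without sending) a request to Solr's `/update/json/docs` endpoint, using its `split` and `f` parameters to say where the input is split into documents and which JSON paths map to which fields. The host, port, collection name `gettingstarted`, and the student record with its field names are all assumptions made for the example, not something taken from the slides.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumptions: a local Solr on the default port and a collection named
# "gettingstarted"; the student record and its field names are hypothetical.
SOLR_URL = "http://localhost:8983/solr"
COLLECTION = "gettingstarted"


def json_docs_request(document, split, field_mappings):
    """Build (but do not send) a POST to Solr's /update/json/docs endpoint.

    split          -- JSON path at which the input is split into Solr documents
    field_mappings -- "solr_field:/json/path" strings, one per field to extract
    """
    params = [("split", split)] + [("f", m) for m in field_mappings]
    params.append(("commit", "true"))
    url = f"{SOLR_URL}/{COLLECTION}/update/json/docs?{urlencode(params)}"
    return Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A nested record: one student with two exam results.
student = {
    "first": "John",
    "last": "Doe",
    "exams": [
        {"subject": "Maths", "marks": 90},
        {"subject": "Biology", "marks": 86},
    ],
}

request = json_docs_request(
    student,
    split="/exams",
    field_mappings=["first:/first", "last:/last",
                    "subject:/exams/subject", "marks:/exams/marks"],
)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it to a running Solr. With `split=/exams`, each exam entry should become its own Solr document carrying the parent's `first` and `last` values.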

Page 9: Meet Solr for the First Time Again

Managed Schema

• Solr is the schema owner

• REST APIs - Hide the implementation details

• Schema-less mode

• Add and update fields and field types

• More reading: https://lucidworks.com/blog/schemaless-solr-part-1/
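A minimal sketch of what "Solr is the schema owner" can look like from the client side: instead of editing a schema file, the client posts JSON commands to the collection's `/schema` endpoint. The command style used here (`add-field-type`, `add-field`) is the one documented for later Solr releases, so treat it as an assumption relative to the version shown in the talk. The host, collection name, and the `text_simple` and `summary` names are hypothetical.

```python
import json
from urllib.request import Request

# Assumptions: local Solr on the default port, collection "gettingstarted".
SOLR_URL = "http://localhost:8983/solr"
COLLECTION = "gettingstarted"


def schema_request(command, definition):
    """Build (but do not send) a Schema API command such as add-field."""
    body = {command: definition}
    return Request(
        f"{SOLR_URL}/{COLLECTION}/schema",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# First define a field type (hypothetical name "text_simple") ...
add_type = schema_request("add-field-type", {
    "name": "text_simple",
    "class": "solr.TextField",
    "analyzer": {"tokenizer": {"class": "solr.StandardTokenizerFactory"}},
})

# ... then a field (hypothetical name "summary") that uses it.
add_field = schema_request("add-field", {
    "name": "summary",
    "type": "text_simple",
    "stored": True,
    "indexed": True,
})

for req in (add_type, add_field):
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Sending each request with `urllib.request.urlopen`, in that order, would apply the changes. The order matters because the field refers to the field type by name.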

Page 10: Meet Solr for the First Time Again

Configuration APIs

• Configure Solr using APIs

• solrconfig.xml… What did you say?
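To make the "configure Solr using APIs" idea concrete, here is a small sketch that builds a `set-property` command for the collection's `/config` endpoint rather than editing solrconfig.xml by hand. It follows the Config API as documented for later Solr releases, which is an assumption relative to the talk. The host, collection name, and the chosen value of 2000 ms are illustrative only.

```python
import json
from urllib.request import Request

# Assumptions: local Solr on the default port, collection "gettingstarted".
SOLR_URL = "http://localhost:8983/solr"
COLLECTION = "gettingstarted"


def set_property_request(name, value):
    """Build (but do not send) a Config API set-property command."""
    body = {"set-property": {name: value}}
    return Request(
        f"{SOLR_URL}/{COLLECTION}/config",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: make new documents searchable within about two seconds
# by setting the soft auto-commit interval (value chosen arbitrarily).
soft_commit = set_property_request("updateHandler.autoSoftCommit.maxTime", 2000)
print(soft_commit.full_url)
print(soft_commit.data.decode("utf-8"))
```

As with the other sketches, `urllib.request.urlopen(soft_commit)` would send the command to a running Solr.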

Page 11: Meet Solr for the First Time Again

Solr Scale Toolkit

• Easily deploy SolrCloud clusters

• Live patching and rolling restarts

• Dependency on AWS soon to go away

• Chef or Puppet are still valid approaches

• More reading: http://lucidworks.com/blog/introducing-the-solr-scale-toolkit/

Page 12: Meet Solr for the First Time Again

Talking about the Admin UI…

• Already improved from 3.x

• Uploading documents

• Collections API is coming soon

Collection Actions

Page 13: Meet Solr for the First Time Again

Recently Added Features

• Document expiration and Time To Live (TTL)

• Cursors: Efficient Deep Paging

• Export Sorted Result Sets

• SSL support in SolrCloud

• Distributed Pivot Faceting

• Suggester v2

• CollapsingQParserPlugin

• ReRankingQParserPlugin

• Collections API improvements

Page 14: Meet Solr for the First Time Again

There’s so much more coming up…

• Schema Bulk API

• Distributed IDF

• Query DSL

• Cross Data-center replication

• Cluster Backup and Restore

• SOLR - Make an application, not ‘war’.

Page 15: Meet Solr for the First Time Again

It’s easy… and stable!

• Benchmarking

• Tons of users testing it

• Evolving test framework

Page 16: Meet Solr for the First Time Again

Solr scalability is unmatched.

• 10TB+ Index Size

• 10 Billion+ Documents

• 100 Million+ Daily Requests

Page 17: Meet Solr for the First Time Again

Where is it headed?

• Download

• See that server directory?

• Use start scripts

• Send a document, or a few…

• Things don’t really look the way they should?

• Use the schema APIs

• Add fields… not enough?

• Add field types and then add fields

• Configure Solr using REST APIs

For Production:

• Use Solr Scale Toolkit to deploy, patch and manage!

• Configure Solr using REST APIs

Page 18: Meet Solr for the First Time Again

Lucidworks Fusion

[Architecture diagram. Components shown: Intelligent Search Services/API, Recommendation Module, Signal Processing, Analytics Service, Discovery Engine, Analytics Store, Enrichment Services, Analyst Workbench, eCommerce Solution, Admin/Management, SiLK Log Analysis, Search/Discovery, Partner Solutions, Connector Framework]

Page 19: Meet Solr for the First Time Again

Connect @

https://twitter.com/varunthacker

http://in.linkedin.com/in/varunthacker
