Apache Solr/Lucene: Looking Ahead
Lucid Imagination, Inc.
Topics
Me. You?
Quick Overview of Lucene and Solr
Solr demo
Where are we now?
What’s in a version number?
Looking Ahead
Apache Lucene 3.1 and beyond
Apache Solr 3.1 and beyond
Me. You?
Lucene?
Solr?
New to Search?
Other Search Engines?
Crawling?
Database?
Scale?
Lucene is a mature, high performance Java API to provide search capabilities to applications
Supports indexing, searching and a number of other commonly used search features (highlighting, spell checking, etc.)
Not a crawler and doesn’t know anything about Adobe PDF, MS Word, etc.
Created in 1997 and now part of the Apache Software Foundation
Important to note that Lucene does not have distributed index (shard) support
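To make "indexing" and "searching" concrete, here is a minimal sketch of an inverted index in plain Python. It is not Lucene's API or its on-disk format; the function names `build_index` and `search` and the sample documents are invented for illustration, and the tokenizer is just a lowercase whitespace split.

```python
from collections import defaultdict


def build_index(docs):
    """Indexing: map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Searching: return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Lucene is a search library",
    2: "Solr is a search server built on Lucene",
    3: "A crawler fetches pages",
}
index = build_index(docs)
print(search(index, "search lucene"))
```

A real engine adds analysis (stemming, stop words), scoring and compact storage on top of this basic term-to-documents mapping.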
Solr
Solr is the Lucene-based search server that provides the infrastructure most users need to work with Lucene
Without knowing Java!
Also provides:
Easy setup and configuration
Faceting
Highlighting
Replication/Sharding
Lucene Best Practices
http://search.lucidimagination.com
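Because Solr is driven over HTTP, a client in any language can ask for faceting and highlighting with request parameters. The sketch below only builds the request URL. The base URL matches the demo's default port, and the field names `cat` and `name` are assumed to exist in the schema (they are used here only as examples); `build_select_url` is a made-up helper name.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, facet_field, highlight_field):
    """Build a Solr select URL that requests facet counts and highlighted snippets."""
    params = [
        ("q", query),                 # the user's query
        ("wt", "json"),               # response format
        ("facet", "true"),            # turn faceting on
        ("facet.field", facet_field), # field to count values for
        ("hl", "true"),               # turn highlighting on
        ("hl.fl", highlight_field),   # field to generate snippets from
    ]
    return base_url + "/select?" + urlencode(params)


# Assumed base URL and field names; adjust to your own schema.
url = build_select_url("http://localhost:8983/solr", "ipod", "cat", "name")
print(url)
```

Fetching this URL with any HTTP client returns the matching documents along with facet counts and highlighted fragments, with no Java code on the client side.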
Quick Solr Demo
Pre-reqs:
Apache Ant 1.7.x
SVN
svn co https://svn.apache.org/repos/asf/lucene/dev/trunk solr-trunk
cd solr-trunk/solr/
ant example
cd example
java -jar start.jar
cd exampledocs; java -jar post.jar *.xml
http://localhost:8983/solr/browse
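The files that post.jar sends in the demo are Solr XML update messages: an `<add>` element wrapping one `<doc>` per document, with one `<field name="...">` per value. As a rough sketch, the same kind of payload can be generated from Python dictionaries. The helper name `build_add_xml` and the sample field values are assumptions, and the fields must match whatever the target schema defines.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Serialize a list of dicts into a Solr <add> update message."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", {"name": name})
            field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


# Hypothetical document; "id" and "name" are assumed schema fields.
payload = build_add_xml([{"id": "doc1", "name": "Hello Solr"}])
print(payload)
```

The resulting string would be sent as the body of an HTTP POST to the update handler, followed by a commit, which is roughly what post.jar does for each file.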
Where are we now?
Current releases
Apache Lucene 3.0.2 and 2.9.2
Apache Solr 1.4.1
Last March, the Lucene and Solr development communities merged to reduce duplication, ease development, etc.
Mail: [email protected]
User communities are still [email protected], [email protected]
Where are we now?
Is the next release Solr 1.5 or 3.1?
Solr 3.1 (99% certain!)
Two main branches of development for both Lucene and Solr
Trunk (i.e. 4.0)
• https://svn.apache.org/repos/asf/lucene/dev/trunk/
• No guarantee of back compatibility (but best efforts are made)
3.x Branch
• https://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/
• Tries to stay backwards compatible with the 1.4.x releases
Most things are applied to both branches, but not all
Words to the Wise
“Some or all of the following statements may contain projections or other forward-looking statements regarding
future events or implementations in Lucene/Solr”
“The statements are not meant to be inclusive of all changes”
Apache Lucene 3.1 and Beyond
https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/CHANGES.txt
Performance Improvements in many areas
Bytes instead of Strings: better memory savings
Phrase scoring
Packed Ints
Analysis Contributions
Many new languages/dialects supported: Hindi, Indic, Arabic, Armenian, Persian, Indonesian, etc. on top of support for English, most European languages, Chinese, Japanese, Korean
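The idea behind the packed ints item above is to store each small integer in only as many bits as the largest value needs, instead of a full 32 or 64 bits. A minimal sketch of that idea follows. It is not Lucene's implementation; the function names are invented, and a single Python integer stands in for the backing array of longs.

```python
def bits_required(max_value):
    """Number of bits needed to represent values in the range 0..max_value."""
    return max(1, max_value.bit_length())


def pack(values, bits_per_value):
    """Pack non-negative ints side by side, bits_per_value bits each."""
    packed = 0
    for i, value in enumerate(values):
        if value < 0 or value >= (1 << bits_per_value):
            raise ValueError(f"{value} does not fit in {bits_per_value} bits")
        packed |= value << (i * bits_per_value)
    return packed


def unpack_one(packed, index, bits_per_value):
    """Read back the value stored at the given position."""
    mask = (1 << bits_per_value) - 1
    return (packed >> (index * bits_per_value)) & mask


values = [3, 0, 7, 5]
bits = bits_required(max(values))  # 3 bits are enough for values up to 7
packed = pack(values, bits)
restored = [unpack_one(packed, i, bits) for i in range(len(values))]
print(bits, restored)
```

Four values fit in 12 bits here rather than four machine words, which is where the memory savings come from when the same trick is applied to millions of entries.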
Lucene 3.1 and Beyond
Expert Level
Flex APIs
• Different codecs for the index
• Total control over what is in the index
• Pluggable scoring models
(Near) Real Time Search
Make newly indexed documents instantly available for search
See https://svn.apache.org/repos/asf/lucene/dev/branches/realtime_search/
Much, much more
Apache Solr 3.1 and Beyond
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/CHANGES.txt
Solr Cloud
Make it easy to deploy and manage truly large scale search applications
• 10B+ (100B? 1T?) docs with subsecond search/faceting
See http://wiki.apache.org/solr/SolrCloud
(Near) Real Time Search
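Searching a sharded index generally means sending the query to every shard and merging the per-shard top hits into one ranked list. The sketch below shows only that merge step, as a generic scatter-gather illustration. It is not how Solr or SolrCloud implements distributed search, and the shard contents, scores and function name are made up.

```python
import heapq
from itertools import islice


def merge_shard_results(shard_results, k):
    """Merge per-shard hit lists into the overall top k.

    Each shard's list holds (score, doc_id) tuples already sorted by
    descending score, as a shard would return them.
    """
    merged = heapq.merge(*shard_results, key=lambda hit: -hit[0])
    return list(islice(merged, k))


# Hypothetical results from two shards for the same query.
shard_a = [(0.9, "a1"), (0.4, "a2")]
shard_b = [(0.8, "b1"), (0.7, "b2"), (0.1, "b3")]
top3 = merge_shard_results([shard_a, shard_b], 3)
print(top3)
```

Because each shard only has to return its own top k, the coordinating node merges a small number of hits no matter how many documents the shards hold.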
Apache Solr 3.1 and Beyond
Spatial Search
“Find me all the Lulu authors that live within 50 miles of HQ”
Boost, sort, filter documents by distance and other spatial information
http://wiki.apache.org/solr/SpatialSearch
http://www.openstreetmap.org/?lat=44.9744&lon=-93.2484&zoom=14&layers=B000FTFT
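A "within 50 miles" request like the one above could be expressed with the spatial parameters described on the SpatialSearch wiki page: a `geofilt` filter, a center point `pt`, a spatial field `sfield`, and a distance `d` in kilometers. The sketch below only assembles those parameters. The field name `location` is hypothetical, the center point simply reuses the coordinates from the map link above, and exact parameter support should be checked against the Solr version in use.

```python
from urllib.parse import urlencode

KM_PER_MILE = 1.609344


def build_spatial_params(lat, lon, radius_miles, location_field):
    """Request parameters for 'everything within radius_miles of (lat, lon)'."""
    return {
        "q": "*:*",                                  # match all documents
        "fq": "{!geofilt}",                          # filter by distance from pt
        "sfield": location_field,                    # field holding lat,lon values
        "pt": f"{lat},{lon}",                        # center point
        "d": f"{radius_miles * KM_PER_MILE:.2f}",    # radius converted to km
        "sort": "geodist() asc",                     # nearest first
    }


# "location" is an assumed field name, not one taken from the slides.
params = build_spatial_params(44.9744, -93.2484, 50, "location")
query_string = urlencode(params)
print(query_string)
```

The same `geodist()` function used here for sorting can also be used in a boost, which covers the "boost, sort, filter" cases listed above.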
Solr 3.1 and Beyond
Group By/Field Collapsing
http://wiki.apache.org/solr/FieldCollapsing
Roll up results that have a common “token”
Examples:
• All documents from the same URL
• All documents by the same author that match
• All documents in the same price range
Auto-suggest
Pivoted Faceting
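The field collapsing idea above, rolling up results that share a common value, can be sketched in a few lines. This is a toy illustration of the concept on an already ranked result list, not Solr's implementation; the `collapse` function, the `author` field and the sample hits are invented.

```python
def collapse(hits, field, limit=1):
    """Group ranked hits by a field value, keeping at most `limit` hits per group.

    Groups appear in the order their best-ranked hit appears in the results.
    """
    groups = {}
    for hit in hits:
        bucket = groups.setdefault(hit[field], [])
        if len(bucket) < limit:
            bucket.append(hit)
    return groups


# Hypothetical ranked results, best match first.
hits = [
    {"id": 1, "author": "ann"},
    {"id": 2, "author": "bob"},
    {"id": 3, "author": "ann"},
]
by_author = collapse(hits, "author")
print(by_author)
```

With `limit=1` each author appears once, represented by their best match; raising the limit shows a few hits per author instead.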
Resources
http://lucene.apache.org/solr
http://lucene.apache.org/java
http://www.lucidimagination.com