Enterprise Search in Plone using Solr Calvin Hendryx-Parker Plone Conference 2010 Wednesday, October 27, 2010

Upload: calvin-hendryx-parker

Post on 19-May-2015


TRANSCRIPT

Page 1: Enterprise search in plone using solr

Enterprise Search in Plone using Solr

Calvin Hendryx-Parker

Plone Conference 2010

Wednesday, October 27, 2010

Page 2: Enterprise search in plone using solr

What is Solr?

• Java Based
• Full-Text Search
• Web Services API
• Standards Based Interfaces
• Scalable
• XML Configuration
• Extensible

Page 3: Enterprise search in plone using solr

Playing with Solr

• Indexing
• Query
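Indexing and querying both go through Solr's HTTP web services API. The following is a minimal sketch, not taken from the talk, that only builds the XML update message and the query URL without sending anything. The host, port, and the field names `id` and `title` are assumptions for illustration.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as ET

# Assumed location of a local Solr instance; adjust for a real deployment.
SOLR_BASE = "http://localhost:8983/solr"


def build_add_message(doc):
    """Build the XML body of a Solr <add> update from a dict of fields."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="unicode")


def build_select_url(query, rows=10):
    """Build the URL for a query against the /select handler."""
    return "%s/select?%s" % (SOLR_BASE, urlencode({"q": query, "rows": rows}))


# Field names here are illustrative, not from a real schema.
add_xml = build_add_message({"id": "doc-1", "title": "Plone and Solr"})
select_url = build_select_url("title:plone")
print(add_xml)
print(select_url)
```

In a real setup the XML message would be POSTed to the update handler and followed by a commit before the document shows up in query results.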


Page 6: Enterprise search in plone using solr

Solr Features

• Data Schema
• Faceted Search
• Administrative Interface
• Incremental Updates
• Supports Sharding
• Index Databases, Local Files and Web Pages
• Supports Multiple Indexes
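Faceted search is requested through ordinary query parameters. As a hedged sketch, the snippet below builds a query string using Solr's `facet` and `facet.field` parameters. The field names `portal_type` and `Subject` are assumptions: they are common Plone index names and would have to exist in the Solr schema.

```python
from urllib.parse import urlencode


def faceted_query_string(query, facet_fields):
    """Build a query string that asks Solr for facet counts per field."""
    params = [("q", query), ("facet", "true")]
    # facet.field may be repeated, once per field to facet on.
    for field_name in facet_fields:
        params.append(("facet.field", field_name))
    return urlencode(params)


# Assumed field names, for illustration only.
query_string = faceted_query_string("solr", ["portal_type", "Subject"])
print(query_string)
```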

Page 7: Enterprise search in plone using solr

Solr Features

• Stopwords
• Synonyms
• Highlighted Context Snippets
• Spelling Suggestions
• More Like This Suggestions
• Supports Rich Documents
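Stopwords and synonyms are applied by Solr during text analysis, as configured in its schema. The toy function below only imitates the effect in plain Python to show what the two features do to a piece of text. The word lists are made up for illustration and this is not how Solr implements them.

```python
# Hypothetical word lists, for illustration only.
STOPWORDS = {"a", "an", "the", "in", "of", "using"}
SYNONYMS = {"search": ["search", "find"]}


def analyze(text):
    """Lowercase, split on whitespace, drop stopwords, expand synonyms."""
    tokens = []
    for token in text.lower().split():
        if token in STOPWORDS:
            continue
        # A token with synonyms is replaced by all of its variants.
        tokens.extend(SYNONYMS.get(token, [token]))
    return tokens


tokens = analyze("Enterprise Search in Plone using Solr")
print(tokens)
```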


Page 11: Enterprise search in plone using solr

Solr Performance

• Wiktionary Dataset
• 49.5 million lines of XML
• 1.3 GB of data
• 1.7 million pages indexed in 5.5 hours
• ZODB size after import: 1.1 GB
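The figures above imply an average indexing rate of roughly 86 pages per second. A short calculation, assuming the slide's numbers are rounded and that GB means decimal gigabytes:

```python
pages_indexed = 1_700_000
hours = 5.5
data_bytes = 1.3 * 10**9  # assumes decimal gigabytes

# Average throughput over the whole import.
pages_per_second = pages_indexed / (hours * 3600)
# Average amount of source XML per indexed page.
bytes_per_page = data_bytes / pages_indexed

print("%.1f pages/second" % pages_per_second)
print("%.0f bytes of XML per page" % bytes_per_page)
```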

Page 12: Enterprise search in plone using solr

Integration Options with Plone

collective.solr

Page 13: Enterprise search in plone using solr

collective.solr Issues

• Monkey Patching
• Relies on collective.indexing
• Duplicates all indexes
• Sub-Optimal Integration with Zope Transactions
• Relies on Thread Locals

Page 14: Enterprise search in plone using solr

What to do?

Page 15: Enterprise search in plone using solr

Reevaluate

Page 16: Enterprise search in plone using solr

Solr Integration as a Catalog Index

• No Monkey Patching
• Simpler Code

Page 17: Enterprise search in plone using solr

Enter alm.solrindex

• ZCatalog Index
• Doesn't depend on Plone
• Utilizes new foreign_connections Connection Method
• Pass through Solr Queries
• Direct access to the Solr Response


Page 20: Enterprise search in plone using solr

Sorting

• Still handled by the ZCatalog
• Could change in the future

Page 21: Enterprise search in plone using solr

alm.solrindex Field Handlers

• Handle Parsing Attributes for Indexing
• Translate field-specific queries to Solr
• Registered as Zope Utilities

Page 22: Enterprise search in plone using solr

Example Handler

class TextFieldHandler(DefaultFieldHandler):

    def parse_query(self, field, field_query):
        # Normalize the catalog query for this field into a record
        # whose keys are the search terms.
        name = field.name
        request = {name: field_query}
        record = parseIndexRequest(request, name, ('query',))
        if not record.keys:
            return None

        # Join the terms into one query string; skip empty queries.
        query_str = ' '.join(record.keys)
        if not query_str:
            return None

        # The leading '+' marks the clause as required in Lucene syntax.
        return {'q': u'+%s:%s' % (name, quote_query(query_str))}
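To see what the handler's translation produces without a Zope installation, here is a standalone sketch of the same logic. `Record`, `parse_index_request`, and this `quote_query` are hypothetical stand-ins written for illustration. They are not the Zope or alm.solrindex implementations, and the real quoting rules may differ.

```python
class Record:
    """Hypothetical stand-in for the record returned by parseIndexRequest."""

    def __init__(self, keys):
        self.keys = keys


def parse_index_request(request, name):
    """Hypothetical simplification: normalize a query value to a list."""
    value = request.get(name)
    if value is None:
        keys = []
    elif isinstance(value, str):
        keys = [value]
    else:
        keys = list(value)
    return Record(keys)


def quote_query(text):
    """Hypothetical quoting: wrap in double quotes and escape \\ and "."""
    return '"%s"' % text.replace("\\", "\\\\").replace('"', '\\"')


def text_parse_query(name, field_query):
    """Translate a catalog text query into Solr parameters, or None."""
    record = parse_index_request({name: field_query}, name)
    if not record.keys:
        return None
    query_str = " ".join(record.keys)
    if not query_str:
        return None
    # '+' makes the clause required in Lucene query syntax.
    return {"q": "+%s:%s" % (name, quote_query(query_str))}


params = text_parse_query("SearchableText", "plone solr")
print(params)
```

Returning `None` for an empty query lets the caller skip the Solr clause for that field entirely.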

Page 23: Enterprise search in plone using solr

Other alm.solrindex Features

• GenericSetup Profile
• Tests
• Uses solrpy instead of the unsupported solr.py

Page 24: Enterprise search in plone using solr

Tips

• Can replace several ZCatalog indexes
• Remove any indexes you have replaced
• Use it for all Text Indexes
• Still Utilize the ZCatalog Indexes for Everything Else
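The tips above work because a catalog evaluates each index separately and intersects the resulting sets of record ids, so a Solr-backed index can answer the text part of a query while ordinary indexes answer the rest. The sketch below imitates that behavior in plain Python. The index functions and their data are hypothetical, and this is not the ZCatalog code itself.

```python
def search(indexes, query):
    """Intersect the record-id sets returned by every index in the query."""
    result = None
    for name, index in indexes.items():
        if name not in query:
            continue
        rids = index(query[name])
        result = rids if result is None else result & rids
    return result if result is not None else set()


def solr_text_index(term):
    """Hypothetical stand-in for a Solr-backed text index."""
    hits = {"plone": {1, 2, 3}, "solr": {2, 3, 4}}
    return set(hits.get(term, set()))


def portal_type_index(value):
    """Hypothetical stand-in for a regular catalog field index."""
    docs = {"Document": {1, 2}, "News Item": {3, 4}}
    return set(docs.get(value, set()))


indexes = {"SearchableText": solr_text_index, "portal_type": portal_type_index}
rids = search(indexes, {"SearchableText": "solr", "portal_type": "Document"})
print(rids)
```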

Page 25: Enterprise search in plone using solr

Demo

Project Gutenberg Data

Page 26: Enterprise search in plone using solr

Questions?

Page 27: Enterprise search in plone using solr

Check out sixfeetup.com/demos