PLAT-4: Understanding the SOLR Integration
DESCRIPTION
The video that accompanies this presentation is at: http://www.youtube.com/watch?v=1t3Z2pJyulA Join us for a guided tour of the Alfresco SOLR integration and new search sub-systems. We’ll discuss how it works, the limitations of eventual consistency, and guidance for configuration and set-up. We’ll also cover the steps required to migrate, improved PATH performance, in-query ACL evaluation, cross-language support, and monitoring as well as performance.
TRANSCRIPT
Understanding The SOLR Integration Andy Hind • Senior Developer • twitter @andy_hind
Agenda
• Why SOLR? • What is supported? • Eventual consistency • Configuration and setup • How to migrate • Status/reporting • Improvements
Why SOLR?
• Issues… o Cluster – index per node o Performance
• Permission evaluation • Structural queries • In-transaction indexing
o Scale query independently o Cross-locale support o Sub-system and dynamic configuration
What is supported?
• Spaces store • Archive • Query languages • NOT supported
o WCM based on AVM o Records Management o All stores o Multi-tenant o In transaction (eventually consistent)
Eventual consistency
• SOLR is tracking Alfresco o Following transactions – a bit like clustering o Eventual consistency o Transactions that may take some time to commit o Two cores
• SpacesStore • ArchiveStore
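The tracking idea on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions: `Repository`, `IndexTracker`, and `poll` are hypothetical names, not Alfresco or SOLR classes, and the sketch ignores the slide's caveat about transactions that take some time to commit.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for Alfresco: an ordered log of committed transactions."""
    committed: list = field(default_factory=list)  # (txn_id, node_id) pairs

    def commit(self, node_id):
        txn_id = len(self.committed) + 1
        self.committed.append((txn_id, node_id))
        return txn_id

    def transactions_after(self, last_txn_id):
        return [t for t in self.committed if t[0] > last_txn_id]


@dataclass
class IndexTracker:
    """Hypothetical stand-in for one SOLR core that follows the repository."""
    last_txn_id: int = 0
    indexed_nodes: set = field(default_factory=set)

    def poll(self, repo):
        # Pull every transaction committed since the last one this core indexed.
        for txn_id, node_id in repo.transactions_after(self.last_txn_id):
            self.indexed_nodes.add(node_id)
            self.last_txn_id = txn_id


repo = Repository()
core = IndexTracker()
repo.commit("doc-1")
visible_before_poll = "doc-1" in core.indexed_nodes  # committed, not yet searchable
core.poll(repo)
visible_after_poll = "doc-1" in core.indexed_nodes   # searchable once the core catches up
```

The gap between the commit and the next poll is the window in which a search can miss a node that already exists in the repository.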
Eventual consistency
• Models • ACLs • Metadata • Content • Ownership • Structure - PATH
High Level Architecture
[Architecture diagram. Labels: Repository (with Content Store(s) and Database Storage) • Solr (Solr Cores: Workspace, Archive) • Search Requests • Search Results • Async: Index Polling • Updates (Models, ACLs, Properties & Content)]
Setup
• SOLR is a web app o zip
• Communicates over SSL o Generate and configure your certificates …
• Per core configuration in SOLR o Data location
• Installer default
Configuration
• Search sub-systems o solr, lucene o Change configuration without restarting Alfresco
• JMX/Share admin • Lucene
o Lots – sub-set in Share
• SOLR o Host/port/SSL
• Properties
How to migrate
• Carry on using lucene • Configure SOLR • Configure Alfresco
o Support SOLR tracking
• Monitor SOLR tracking • Switch sub-systems when ready • You can switch back to lucene
o It will check its state as it does now at start up
Stats and reporting
• JMX/Share o Later ….
• Direct to SOLR o https://localhost:8443/solr/admin/cores?action=SUMMARY o https://localhost:8443/solr/admin/cores?action=REPORT
• Fix o JMX o https://localhost:8443/solr/admin/cores?action=FIX
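The three admin URLs above differ only in the `action` parameter, so they can be built from one helper. A minimal sketch: the base URL and the action names come from the slide, while `admin_url` is a hypothetical helper name. Actually fetching these URLs would presumably also need the SSL certificates mentioned under Setup, which this sketch does not handle.

```python
from urllib.parse import urlencode

# Base URL and actions as shown on the slide; adjust host and port for your install.
BASE_URL = "https://localhost:8443/solr/admin/cores"
ACTIONS = ("SUMMARY", "REPORT", "FIX")


def admin_url(action, base_url=BASE_URL):
    """Build the core admin URL for one of the actions listed on the slide."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return f"{base_url}?{urlencode({'action': action})}"


summary_url = admin_url("SUMMARY")
```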
Improvements
• PATH • Access evaluation
o Query time
• Cross-language/locale support o Query/Tokenisation o Sorting
• SOLR o Query caching o Facets
Improvements …
• Cross-language o Standard tokenisation o Configurable o Default – SOLR WordDelimiterFilterFactory
• BigWoof-123-A47.txt • .txt, Big, 123A, 123a47txt, 47, A47, BigWoof123A47txt
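The BigWoof-123-A47.txt example can be approximated in a few lines. This is a simplified sketch of word-delimiter style splitting (on punctuation, case changes, and letter/digit boundaries), not the SOLR WordDelimiterFilterFactory itself; `split_parts` and `catenate_all` are hypothetical names, and the real filter has further options that decide which combined forms are also indexed.

```python
import re

# One "part" is a capitalised or lower-case word, a run of capitals, or a run of digits.
_PART = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|[0-9]+")


def split_parts(text):
    """Split on punctuation, case changes, and letter/digit boundaries."""
    return _PART.findall(text)


def catenate_all(text):
    """Join every part back together, dropping the delimiters."""
    return "".join(split_parts(text))


parts = split_parts("BigWoof-123-A47.txt")   # ['Big', 'Woof', '123', 'A', '47', 'txt']
joined = catenate_all("BigWoof-123-A47.txt")  # 'BigWoof123A47txt'
```

Indexing both the individual parts and their catenations is what would let partial terms such as Big, 47, or A47 match the file name.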
Improvements …
• Cross-language o Sort
• d:text – en: peach péché pêche sin – fr: peach pêche péché sin
• d:mltext – Nearest match
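The two orderings on this slide differ only in how accents break ties: the English order compares accents left to right, while the French order compares them from the end of the word. The sketch below reproduces both orderings with a deliberately tiny model. `ACCENT_RANK` is an assumed ranking covering just three accents, and `sort_key` is a hypothetical helper, not what Alfresco or SOLR uses internally.

```python
import unicodedata

# Assumed, simplified accent ranking: none < acute < grave < circumflex.
ACCENT_RANK = {"": 0, "\u0301": 1, "\u0300": 2, "\u0302": 3}


def sort_key(word, backwards_accents=False):
    """Compare base letters first, then accents (from the end for French)."""
    base, accents = [], []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch) and accents:
            accents[-1] = ACCENT_RANK.get(ch, 9)  # accent belongs to previous letter
        else:
            base.append(ch)
            accents.append(0)
    if backwards_accents:
        accents.reverse()
    return ("".join(base), accents)


words = ["sin", "pêche", "peach", "péché"]
english = sorted(words, key=sort_key)
french = sorted(words, key=lambda w: sort_key(w, backwards_accents=True))
```

With these inputs, `english` comes out as peach, péché, pêche, sin and `french` as peach, pêche, péché, sin, matching the slide.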
Improvements …
• Indexing Control o cm:indexControl o cm:isIndexed (Boolean)
• Enable/disable All indexing (properties & content)
o cm:isContentIndexed (Boolean) • Enable/disable Content Indexing
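The effect of the two flags can be written as a small decision function. A sketch only: the property names come from the slide, `indexing_plan` is a hypothetical helper, and treating a missing property as true is an assumption made here.

```python
def indexing_plan(properties):
    """Decide what to index for a node from its cm:indexControl properties.

    Assumption: a missing property is treated as True (index as normal).
    """
    is_indexed = properties.get("cm:isIndexed", True)
    is_content_indexed = properties.get("cm:isContentIndexed", True)
    return {
        # cm:isIndexed switches all indexing (properties and content) on or off.
        "properties": is_indexed,
        # cm:isContentIndexed only controls the content part.
        "content": is_indexed and is_content_indexed,
    }


metadata_only = indexing_plan({"cm:isIndexed": True, "cm:isContentIndexed": False})
```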
Improvements …
• Canned Queries o How is Share affected by eventual consistency? o DB o Not lucene/SOLR
Where is SOLR/Lucene used?
• Advanced Search • Filters • Tags (not the roll up) • Categories (facets) • Dashlets
o E.g. Recently Modified
• People, Groups, Sites will use DB query unless o Start with *xyz o Other wildcards
SOLR futures
• SOLR cloud • SOLR/Lucene improvements
o Performance o Future 3.4, 4.0, ...
• Geo
Demos ….
Questions?