PLAT-4 Understanding the SOLR Integration

Understanding The SOLR Integration Andy Hind • Senior Developer • twitter @andy_hind

Upload: alfresco-software

Post on 26-Jun-2015

DESCRIPTION

The video that accompanies this presentation is at http://www.youtube.com/watch?v=1t3Z2pJyulA. Join us for a guided tour of the Alfresco SOLR integration and the new search sub-systems. We’ll discuss how it works, the limitations of eventual consistency, and guidance for configuration and set-up. We’ll also cover the steps required to migrate, improved PATH performance, in-query ACL evaluation, cross-language support, monitoring and performance.

TRANSCRIPT

Page 1: PLAT-4 Understanding the SOLR Integration

Understanding The SOLR Integration
Andy Hind • Senior Developer • twitter @andy_hind

Page 2: PLAT-4 Understanding the SOLR Integration

Agenda

• Why SOLR?
• What is supported?
• Eventual consistency
• Configuration and setup
• How to migrate
• Status/reporting
• Improvements

Page 3: PLAT-4 Understanding the SOLR Integration

Why SOLR?

• Issues…
  o Cluster – index per node
  o Performance
    • Permission evaluation
    • Structural queries
    • In-transaction indexing
  o Scale query independently
  o Cross-locale support
  o Sub-system and dynamic configuration

Page 4: PLAT-4 Understanding the SOLR Integration

What is supported?

• Spaces store
• Archive
• Query languages
• NOT supported
  o WCM based on AVM
  o Records Management
  o All stores
  o Multi-tenant
  o In transaction (eventually consistent)

Page 5: PLAT-4 Understanding the SOLR Integration

Eventual consistency

• SOLR is tracking Alfresco
  o Following transactions – a bit like clustering
  o Eventual consistency
  o Transactions that may take some time to commit
  o Two cores
    • SpacesStore
    • ArchiveStore
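The tracking behaviour described on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions: `Repository`, `Tracker`, `commit`, `poll` and `search` are hypothetical names invented for the illustration, not Alfresco or SOLR APIs, and the "index" is a plain dictionary. It shows one thing: a committed change is not searchable until the tracker has caught up.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Hypothetical stand-in for the repository: a list of committed transactions."""
    transactions: list = field(default_factory=list)

    def commit(self, node_id, text):
        # Transaction ids grow by one with every commit.
        tx_id = len(self.transactions) + 1
        self.transactions.append((tx_id, node_id, text))
        return tx_id

    def transactions_after(self, tx_id):
        # Everything committed after the given transaction id.
        return [tx for tx in self.transactions if tx[0] > tx_id]


@dataclass
class Tracker:
    """Hypothetical stand-in for the SOLR side: follows transactions it has not seen."""
    repository: Repository
    last_indexed_tx: int = 0
    index: dict = field(default_factory=dict)

    def poll(self):
        # Pull every transaction newer than the last one indexed.
        for tx_id, node_id, text in self.repository.transactions_after(self.last_indexed_tx):
            self.index[node_id] = text
            self.last_indexed_tx = tx_id

    def search(self, word):
        # Only sees what has been indexed so far.
        return sorted(n for n, text in self.index.items() if word in text.split())


repo = Repository()
tracker = Tracker(repo)
repo.commit("doc-1", "quarterly report")
before_poll = tracker.search("report")  # the commit has not been tracked yet
tracker.poll()
after_poll = tracker.search("report")   # visible once the tracker catches up
```

Comparing `tracker.last_indexed_tx` with the repository's latest transaction id is one simple way to picture "how far behind" the index is.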

Page 6: PLAT-4 Understanding the SOLR Integration

Eventual consistency

• Models
• ACLs
• Metadata
• Content
• Ownership
• Structure - PATH

Page 7: PLAT-4 Understanding the SOLR Integration

High Level Architecture

[Diagram: the Repository, backed by Content Store(s) and Database Storage, sends search requests to Solr and receives search results. Solr polls the Repository asynchronously for updates to models, ACLs, properties and content, and indexes them into its cores (Workspace and Archive).]

Page 8: PLAT-4 Understanding the SOLR Integration

Setup

• SOLR is a web app
  o zip
• Communicates over SSL
  o Generate and configure your certificates …
• Per core configuration in SOLR
  o Data location
• Installer default

Page 9: PLAT-4 Understanding the SOLR Integration

Configuration

• Search sub-systems
  o solr, lucene
  o Change configuration without restarting Alfresco
    • JMX/Share admin
• Lucene
  o Lots – sub-set in share
• SOLR
  o Host/port/SSL
• Properties

Page 10: PLAT-4 Understanding the SOLR Integration

How to migrate

• Carry on using lucene
• Configure SOLR
• Configure Alfresco
  o Support SOLR tracking
• Monitor SOLR tracking
• Switch sub-systems when ready
• You can switch back to lucene
  o It will check its state as it does now at start up

Page 11: PLAT-4 Understanding the SOLR Integration

Stats and reporting

• JMX/Share
  o Later ….
• Direct to SOLR
  o https://localhost:8443/solr/admin/cores?action=SUMMARY
  o https://localhost:8443/solr/admin/cores?action=REPORT
• Fix
  o JMX
  o https://localhost:8443/solr/admin/cores?action=FIX
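The three admin URLs above differ only in the `action` parameter, so a small helper can build them for a monitoring script. This is a sketch, not part of the talk: `admin_url` and `ADMIN_ACTIONS` are hypothetical names, and the default host and port are simply the ones shown on the slide.

```python
from urllib.parse import urlencode

# Actions that appear on the slide for the SOLR cores admin endpoint.
ADMIN_ACTIONS = {"SUMMARY", "REPORT", "FIX"}


def admin_url(action, host="localhost", port=8443):
    """Build a cores admin URL for one of the actions listed on the slide."""
    if action not in ADMIN_ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return f"https://{host}:{port}/solr/admin/cores?" + urlencode({"action": action})


summary_url = admin_url("SUMMARY")
report_url = admin_url("REPORT")
```

Since SOLR communicates over SSL, actually fetching these URLs will presumably require an HTTP client configured with the certificates generated during setup. That step is left out here.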

Page 12: PLAT-4 Understanding the SOLR Integration

Improvements

• PATH
• Access evaluation
  o Query time
• Cross-language/locale support
  o Query/Tokenisation
  o Sorting
• SOLR
  o Query caching
  o Facets

Page 13: PLAT-4 Understanding the SOLR Integration

Improvements …

• Cross-language
  o Standard tokenisation
  o Configurable
  o Default – SOLR WordDelimiterFilterFactory
    • BigWoof-123-A47.txt
    • .txt, Big, 123A, 123a47txt, 47, A47, BigWoof123A47txt
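To make the file-name example concrete, here is a rough approximation of word-delimiter style splitting. It is not the SOLR WordDelimiterFilterFactory itself, which is configurable and produces more variants than this (the slide's "123A" and "A47", for instance). `split_parts` and `index_terms` are hypothetical helpers. The sketch only splits on punctuation, case changes and letter/digit boundaries, then adds the fully joined form.

```python
import re

# A run of capitals not followed by lower case, a (capitalised) word, or a run of digits.
_PART = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_parts(name):
    """Split on punctuation, case changes and letter/digit boundaries."""
    return _PART.findall(name)


def index_terms(name):
    """Lower-cased parts plus the form with all parts joined together."""
    parts = [p.lower() for p in split_parts(name)]
    return set(parts) | {"".join(parts)}


parts = split_parts("BigWoof-123-A47.txt")
terms = index_terms("BigWoof-123-A47.txt")
```

With terms like these in the index, a query for "Big", "47" or "BigWoof123A47txt" can match the file name once the query text is lower-cased and split in the same way.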

Page 14: PLAT-4 Understanding the SOLR Integration

Improvements …

• Cross-language
  o Sort
    • d:text
      – en: peach péché pêche sin
      – fr: peach pêche péché sin
    • d:mltext
      – Nearest match
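One way to account for the two orderings above is the French convention of comparing accents from the end of the word, where English compares them from the start. The sketch below reproduces the slide's example on that assumption. It is a simplified model, not a real collator: `sort_key` is a hypothetical helper and `ACCENT_RANK` is a made-up ranking that covers only the accents needed here.

```python
import unicodedata

# Hypothetical accent ranks, just enough for the words on the slide.
ACCENT_RANK = {"\u0301": 1, "\u0300": 2, "\u0302": 3}  # acute, grave, circumflex


def sort_key(word, backwards_accents=False):
    """Base letters first; accents only break ties between equal base letters."""
    base, accents = [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            if accents:
                # Attach the accent to the letter it follows.
                accents[-1] = ACCENT_RANK.get(ch, 9)
        else:
            base.append(ch.lower())
            accents.append(0)
    if backwards_accents:
        # French-style: the last accent difference in the word decides.
        accents.reverse()
    return ("".join(base), accents)


words = ["sin", "pêche", "péché", "peach"]
english = sorted(words, key=sort_key)
french = sorted(words, key=lambda w: sort_key(w, backwards_accents=True))
```

In both cases "peach" sorts first because its base letters differ. Only the tie between "péché" and "pêche", which share the base letters "peche", depends on the locale.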

Page 15: PLAT-4 Understanding the SOLR Integration

Improvements …

• Indexing Control
  o cm:indexControl
  o cm:isIndexed (Boolean)
    • Enable/disable All indexing (properties & content)
  o cm:isContentIndexed (Boolean)
    • Enable/disable Content Indexing
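The interaction of the two flags can be sketched as a small decision function. This is an illustration only: `indexing_plan` is a hypothetical helper, the property names are taken from the slide, and treating a missing flag as "indexed" is an assumption.

```python
def indexing_plan(properties):
    """Decide what to index for a node from its cm:indexControl flags."""
    # Assumption: a node without the flag is indexed as normal.
    is_indexed = properties.get("cm:isIndexed", True)
    is_content_indexed = properties.get("cm:isContentIndexed", True)
    return {
        # cm:isIndexed switches all indexing (properties and content) on or off.
        "metadata": is_indexed,
        # Content is indexed only if indexing as a whole is still enabled.
        "content": is_indexed and is_content_indexed,
    }


default_plan = indexing_plan({})
metadata_only = indexing_plan({"cm:isContentIndexed": False})
nothing = indexing_plan({"cm:isIndexed": False})
```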

Page 16: PLAT-4 Understanding the SOLR Integration

Improvements …

• Canned Queries
  o How is share affected by eventual consistency?
  o DB
  o Not lucene/SOLR

Page 17: PLAT-4 Understanding the SOLR Integration

Where is SOLR/Lucene used?

• Advanced Search
• Filters
• Tags (not the roll up)
• Categories (facets)
• Dashlets
  o E.g. Recently Modified
• People, Groups, Sites will use DB query unless
  o Start with *xyz
  o Other wildcards

Page 18: PLAT-4 Understanding the SOLR Integration

SOLR futures

• SOLR cloud
• SOLR/Lucene improvements
  o Performance
  o Future 3.4, 4.0, ...
• Geo

Page 19: PLAT-4 Understanding the SOLR Integration

Demos ….

Page 20: PLAT-4 Understanding the SOLR Integration

Questions?