Apache Solr in AEM 6

Introduction to Apache SOLR in Adobe AEM 6 Dr. Yash Mody, PhD CTO | Tekno Point Consulting

Upload: yash-mody

Post on 26-Jan-2015


DESCRIPTION

Introduction to Apache SOLR and configuring Apache SOLR with AEM 6

TRANSCRIPT

Page 1: Apache SOLR in AEM 6

Introduction to Apache SOLR in Adobe AEM 6

Dr. Yash Mody, PhD CTO | Tekno Point Consulting

Page 2: Apache SOLR in AEM 6

About Me

Adobe AEM, Apache Hadoop Instructor & Consultant

Application Architecture and Design Consultant

Need I say more?

www.teknopoint.us  

Page 3: Apache SOLR in AEM 6


Page 4: Apache SOLR in AEM 6

Information Retrieval

Document
Term
Inverted Index
Term Frequency (tf)
Skip Pointers
Positional Index
Collection Frequency
Document Frequency (df)
Inverse Document Frequency: idf = log10(N/df), where N is the number of documents in the collection
Term Frequency-Inverse Document Frequency: tf-idf = tf * idf
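A small worked example may make the formulas on this slide concrete. The numbers below (1,000 documents, a term found in 10 of them, occurring 3 times in one document) are made up for illustration, and the function names are not from the talk.

```python
import math


def idf(total_docs: int, doc_freq: int) -> float:
    """Inverse document frequency: log10(N / df)."""
    return math.log10(total_docs / doc_freq)


def tf_idf(term_freq: int, total_docs: int, doc_freq: int) -> float:
    """tf-idf = tf * idf, as defined on the slide."""
    return term_freq * idf(total_docs, doc_freq)


# Hypothetical collection: N = 1000 documents, the term appears in df = 10
# of them, and tf = 3 times in the document being scored.
print(tf_idf(term_freq=3, total_docs=1000, doc_freq=10))
```

By hand: idf = log10(1000/10) = 2, so tf-idf = 3 * 2 = 6. A term that appears in every document gets idf = log10(1) = 0, which is why very common words contribute nothing to the score.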


Page 5: Apache SOLR in AEM 6

More???

PHEW! No Way


Page 6: Apache SOLR in AEM 6

Apache SOLR

Fire Powered Lucene
Distributed
Replicated
Remote

And just for the record, it's… SEARCH On LUCENE w/REPLICATION (TBHPHB)


Page 7: Apache SOLR in AEM 6

Installation

Unpack the SOLR distribution
Add solr.war to webapps
Add -Dsolr.solr.home=…

OR use http://bitnami.com/stack/solr


Page 8: Apache SOLR in AEM 6

Getting solr ready

Starting SOLR (bundled Jetty)

cd /usr/local/Cellar/solr/4.7.2/libexec/example/
java -jar start.jar

Then open http://localhost:8983/solr/#/

Adding content using


Page 9: Apache SOLR in AEM 6

Index and search

Indexing Data

java -jar post.jar solr.xml

Searching

http://localhost:8983/solr/select?q=solr&wt=json
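The same search can be issued from code. The sketch below uses only the Python standard library and assumes the Solr instance from the previous slide is running on localhost:8983; the helper names `build_select_url` and `search` are illustrative, not part of Solr.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the slide; adjust for your own installation.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(query: str, wt: str = "json") -> str:
    """Build a /select URL; urlencode escapes spaces and special characters."""
    return f"{SOLR_BASE}/select?{urlencode({'q': query, 'wt': wt})}"


def search(query: str) -> dict:
    """Run the query and parse the JSON response (needs a running Solr)."""
    with urlopen(build_select_url(query)) as response:
        return json.load(response)


print(build_select_url("solr"))
```

Calling `search("solr")` against a running instance returns the parsed response as a dictionary; building the URL separately keeps the escaping logic testable without a server.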


Page 10: Apache SOLR in AEM 6

Configurations

Configurations are done in 2 XML files:

schema.xml: SOLR index configurations
solrconfig.xml: SOLR configurations


Page 11: Apache SOLR in AEM 6

Indexing

Indexing uses HTTP POST, so documents to be indexed can be posted to SOLR via a web request
Data can be pulled using the Data Import Handler (uses HTTP GET or a DB)
SOLR can index binary content (text + metadata) from docs, video, mp3, images and other binary files
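As a rough sketch of indexing over HTTP POST, the snippet below builds a JSON update request with the standard library. It assumes a Solr 4.x instance on localhost:8983 whose schema.xml defines `id` and `title` fields (as the bundled example schema does); the helper names are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Assumed endpoint; commit=true makes the documents searchable right away.
UPDATE_URL = "http://localhost:8983/solr/update?commit=true"


def build_update_request(docs: list) -> Request:
    """Wrap a list of documents in a JSON POST request for the update handler."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        UPDATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index(docs: list) -> int:
    """Send the documents to Solr and return the HTTP status (needs a running Solr)."""
    with urlopen(build_update_request(docs)) as response:
        return response.status


# Field names are assumptions: they must exist in your schema.xml.
request = build_update_request([{"id": "doc-1", "title": "Hello Solr"}])
print(request.get_method(), request.full_url)
```

This is the push model from the first bullet; the Data Import Handler covers the pull model and is configured on the Solr side instead.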


Page 12: Apache SOLR in AEM 6

Search

Search features: Paging, Filtering, Sorting, Faceting

Results: xml (default), json, php, ruby, python etc.

Query Parser: used to interpret queries. 2 types of query parsers:

Lucene Query Syntax Parser
DisMax Parser (Disjunction Max)
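The search features on this slide map onto standard Solr request parameters: `start` and `rows` for paging, `fq` for filtering, `sort` for sorting, `facet` and `facet.field` for faceting, and `defType` to pick the query parser. The sketch below assembles them into a query string; the function name and the field names in the example call (`cat`, `price`, `title`, `text`) are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_search_params(text, page=0, page_size=10, filters=(), sort=None,
                        facet_fields=(), use_dismax=False, query_fields=None):
    """Assemble a Solr /select query string from common search options."""
    params = [
        ("q", text),
        ("wt", "json"),
        ("start", page * page_size),  # paging: offset of the first result
        ("rows", page_size),          # paging: results per page
    ]
    for fq in filters:                # filtering: one fq parameter per filter
        params.append(("fq", fq))
    if sort:                          # sorting, e.g. "price asc"
        params.append(("sort", sort))
    if facet_fields:                  # faceting: counts per field value
        params.append(("facet", "true"))
        for field in facet_fields:
            params.append(("facet.field", field))
    if use_dismax:                    # DisMax parser instead of the Lucene parser
        params.append(("defType", "dismax"))
        if query_fields:
            params.append(("qf", query_fields))
    return urlencode(params)


print(build_search_params("solr", page=2, page_size=5, filters=["cat:book"],
                          sort="price asc", facet_fields=["cat"],
                          use_dismax=True, query_fields="title^2 text"))
```

Leaving `use_dismax` off sends the query to the default Lucene parser, which accepts the full query syntax; DisMax is more forgiving of raw user input and spreads the terms over the fields listed in `qf`.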


Page 13: Apache SOLR in AEM 6

Solr integration approaches

Crawl using an external crawler like Nutch or Heritrix
CQ servlets to serialize content into a format Solr can ingest (JSON/XML)
JCR Observer for page modifications to trigger indexing to Solr
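The serialization step in the second approach boils down to mapping page properties onto Solr fields. In AEM this would normally live in a Java servlet; the sketch below only illustrates the mapping idea in Python. The JCR property names are common AEM page properties, while the Solr field names and the page path are hypothetical and would have to match your own schema.xml.

```python
import json

# Hypothetical mapping from JCR page properties to Solr field names.
FIELD_MAP = {
    "jcr:title": "title",
    "jcr:description": "description",
    "cq:lastModified": "last_modified",
}


def page_to_solr_doc(path: str, properties: dict) -> dict:
    """Turn a page's properties into a Solr document, using the path as id."""
    doc = {"id": path}
    for jcr_name, solr_name in FIELD_MAP.items():
        if jcr_name in properties:
            doc[solr_name] = properties[jcr_name]
    return doc


# Example input shaped like a page's jcr:content node; values are made up.
page = {"jcr:title": "Home", "jcr:primaryType": "cq:PageContent"}
print(json.dumps(page_to_solr_doc("/content/site/en", page)))
```

Properties without a mapping (such as `jcr:primaryType` here) are dropped, which keeps repository internals out of the index.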


Page 14: Apache SOLR in AEM 6

AEM 6

2 types:

Built-in (embedded)
Remote (for distributed setups)

ZooKeeper (for setting up a cluster)

Shard: horizontal partition of the index
Replication: number of copies of the index files
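To illustrate the two terms: sharding decides which slice of the index a document lands in, and replication decides how many copies of each slice exist. The toy sketch below uses a CRC32 hash modulo the shard count, which is a simplification and not Solr's actual routing algorithm; the replica names are invented for the example.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Pick a shard by hashing the document id (toy routing, not Solr's)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def replicas_for(shard: int, replication_factor: int) -> list:
    """List hypothetical replica names holding copies of one shard."""
    return [f"shard{shard}_replica{n}" for n in range(1, replication_factor + 1)]


# Hypothetical cluster: 3 shards, each stored twice.
for doc_id in ["/content/site/en", "/content/site/fr", "/content/site/de"]:
    shard = shard_for(doc_id, num_shards=3)
    print(doc_id, "->", replicas_for(shard, replication_factor=2))
```

Each document belongs to exactly one shard, so adding shards spreads the index over more machines, while adding replicas adds redundancy and query capacity.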


Page 15: Apache SOLR in AEM 6

SOLR things we didn’t see

https://github.com/evolvingweb/ajax-solr
http://wiki.apache.org/solr/SolrQuerySyntax


Page 16: Apache SOLR in AEM 6

Thanks

@yash_mody
http://www.linkedin.com/in/modyyash