Solr has a lot of extensive features Solr Integration and Enhancements Todd Hatcher

Upload: della-jewel-atkins

Post on 18-Dec-2015


Page 1: Solr has a lot of extensive features

Solr Integration and Enhancements

Todd Hatcher

Page 2: What is Solr?

- Solr offers advanced, optimized, scalable searching capabilities
- Communicate with Solr using XML or JSON over HTTP
- Includes an HTML admin interface
- Solr is built on top of Lucene
- Rich features of Lucene can be leveraged when using Solr
- Solr is very configurable

Page 3: Integration with ColdFusion

- Very little direct integration with ColdFusion
- ColdFusion communicates with Solr using HTTP
- Solr runs in its own JVM and does not share one with ColdFusion
- With the ColdFusion installation, Solr runs in a Jetty servlet container on port 8983 (http://localhost:8983/solr)
- Solr is exposed in production by default
- Important files are located in C:\ColdFusion9\solr\multicore
- Solr offers a lot more than what is available through cfindex, cfcollection, and cfsearch
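
Because the only link between ColdFusion and Solr is HTTP, any HTTP client can talk to the same endpoint. The following is a minimal sketch in Python that only builds the request URL. The core name "products" and the helper name select_url are hypothetical, and the /select path assumes the stock search handler.

```python
from urllib.parse import urlencode

# Default endpoint from the slide; the core name used below is made up.
SOLR_BASE = "http://localhost:8983/solr"

def select_url(core, query, writer="json"):
    """Return the URL for a basic search against one core."""
    params = urlencode({"q": query, "wt": writer})
    return f"{SOLR_BASE}/{core}/select?{params}"

# "*:*" matches every document; urlencode percent-escapes the special characters.
url = select_url("products", "*:*")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP library, would return the raw Solr response without going through cfsearch.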

Page 4: Solr

- What is a core? It's like a Verity collection (a searchable data group)
- Single core (one index) vs. multicore (multiple isolated configurations/schemas/indexes using the same Solr instance)
- C:\ColdFusion9\solr\multicore\solr.xml is the central file that points to the locations of the Solr cores' configuration and data (this is what the CF Administrator reads and writes when creating and using Solr collections)
- You can put your Solr cores under your project directory and keep them in source control
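
To make the role of solr.xml concrete, here is a small sketch that reads a multicore file and lists where each core lives. The XML below is an illustrative sample in the legacy multicore layout, not the file ColdFusion ships; the core names and paths are made up.

```python
import xml.etree.ElementTree as ET

# Illustrative solr.xml; core names and instanceDir paths are hypothetical.
SOLR_XML = """<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="products" instanceDir="C:/projects/shop/solr/products"/>
    <core name="articles" instanceDir="C:/projects/shop/solr/articles"/>
  </cores>
</solr>"""

def list_cores(xml_text):
    """Map each core name to the directory holding its conf and data."""
    root = ET.fromstring(xml_text)
    return {core.get("name"): core.get("instanceDir") for core in root.iter("core")}

cores = list_cores(SOLR_XML)
for name, path in cores.items():
    print(name, "->", path)
```

Pointing instanceDir at a folder inside the project is what allows the core's configuration to sit in source control alongside the application code.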

Page 5: [core]/conf/solrconfig.xml

- Main configuration for a Solr core
- <queryResponseWriter name="json" /> determines the format of the results. ColdFusion uses xslt by default
- You can return JSON, XML, Python, Ruby, or PHP
- Multiple query response writers can be configured; one can be set as the default and the others selected by passing the wt parameter with the writer's name (e.g. wt=json)
- cfsearch-type methods will not work if the response writer is not the one ColdFusion is expecting
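
When the JSON writer is selected, the client has to unpack the response itself. The sketch below parses a hand-written sample body that follows the usual Solr JSON layout (a responseHeader plus a response object with numFound, start, and docs). The documents and field names are invented for illustration.

```python
import json

# Hand-written sample of a wt=json response; values are made up.
SAMPLE_BODY = """{
  "responseHeader": {"status": 0, "QTime": 3},
  "response": {"numFound": 2, "start": 0, "docs": [
    {"id": "1", "title": "Solr basics"},
    {"id": "2", "title": "Lucene in depth"}
  ]}
}"""

def extract_docs(body):
    """Return the total hit count and the documents on this page."""
    payload = json.loads(body)
    result = payload["response"]
    return result["numFound"], result["docs"]

total, docs = extract_docs(SAMPLE_BODY)
print(total, [d["title"] for d in docs])
```

A client written this way depends on the JSON writer, just as cfsearch depends on the writer ColdFusion expects, which is why changing the default writer can break one or the other.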

Page 6: [core]/conf/schema.xml

- Field types map custom types to the Solr/Lucene type
- The type solr.TextField allows for analyzers
- Analyzers can be run at index time or query time
- They allow for manipulation of the data (typically filtering)
- Filters are processed in the order in which they are declared
- StopFilterFactory removes common words that do not help the search results
- WordDelimiterFilterFactory can add words like WiFi, Wi, Fi by splitting the original into subwords
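
The effect of filter order can be illustrated with a toy chain in Python. This is not how Solr implements its filters; the three functions below are simplified stand-ins for a word-delimiter filter, a lowercase filter, and a stop filter, and the stop-word list is a tiny made-up sample.

```python
import re

STOP_WORDS = {"the", "a", "of", "and"}  # tiny illustrative list

def split_subwords(tokens):
    """Keep each token and add its case-change parts, e.g. WiFi -> WiFi, Wi, Fi."""
    out = []
    for tok in tokens:
        parts = re.findall(r"[A-Z][a-z]+|[a-z]+|[A-Z]+|\d+", tok)
        out.append(tok)
        if len(parts) > 1:
            out.extend(parts)
    return out

def lowercase(tokens):
    return [t.lower() for t in tokens]

def remove_stop_words(tokens):
    return [t for t in tokens if t not in STOP_WORDS]

def analyze(text, filters):
    """Split on whitespace, then apply the filters in the order given."""
    tokens = text.split()
    for apply_filter in filters:
        tokens = apply_filter(tokens)
    return tokens

split_first = analyze("The WiFi router", [split_subwords, lowercase, remove_stop_words])
lower_first = analyze("The WiFi router", [lowercase, split_subwords, remove_stop_words])
print(split_first)
print(lower_first)
```

Lowercasing before splitting erases the case change that the splitter relies on, so the second chain never produces the Wi and Fi subwords. The same reasoning applies to the declaration order of filters in schema.xml.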

Page 7: [core]/conf/schema.xml cont.

- EnglishPorterFilterFactory determines the root word from variations such as -ing and adds it to the index
- SynonymFilterFactory treats words as the same
- DoubleMetaphoneFilterFactory provides phonetic logic (better than Soundex, which Verity uses)
- TextSpell/TextSpellPhrase provide "did you mean" feedback
- <copyField source="fieldName" dest="d"/>: the dest field type can run different analyzers on the source field and store the result
- wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
- Adobe adds quite a bit to the file to create field types compatible with what was in Verity

Page 8: [core]/conf/schema.xml cont.

- Similar to creating a database table: maps field names to types using <field />
- Gives you the ability to store additional data
- A field can be indexed (searchable)
- A field can be stored (referenced and returned with results)
- A field can be required
- <uniqueKey>[field name]</uniqueKey>
- <solrQueryParser defaultOperator="OR" />
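
As a rough illustration of the field attributes above, the sketch below reads a small schema fragment and reports which fields are indexed, stored, and required. The fragment is a made-up example, not the schema Adobe ships, and the field names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up schema fragment; field names and types are illustrative only.
SCHEMA_FRAGMENT = """<schema name="example" version="1.2">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="body" type="text" indexed="true" stored="false"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <solrQueryParser defaultOperator="OR"/>
</schema>"""

def describe_fields(xml_text):
    """Return per-field flags and the name of the unique key field."""
    root = ET.fromstring(xml_text)
    fields = {}
    for field in root.iter("field"):
        fields[field.get("name")] = {
            "indexed": field.get("indexed") == "true",
            "stored": field.get("stored") == "true",
            "required": field.get("required") == "true",
        }
    return fields, root.findtext("uniqueKey")

fields, unique_key = describe_fields(SCHEMA_FRAGMENT)
returnable = [name for name, flags in fields.items() if flags["stored"]]
print(unique_key, returnable)
```

In this sample, body is searchable but not stored, so a query can match on it while the results can only return id and title.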

Page 9: Indexing

- Data is sent using the API: an HTTP POST to Solr as XML, JSON, or binary
- Commit is an intensive task. Do bulk adds first, then call commit
- <cfindex /> calls commit after each index (confirmed?)
- Committing after each add would noticeably increase index time
- Efficient process: add data (queue), commit, optimize
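
The add, commit, optimize sequence can be sketched as a list of XML update messages. The code below only builds the messages; it assumes they would then be POSTed in order to the core's update handler, and the function names and batch size are hypothetical.

```python
import xml.etree.ElementTree as ET

def add_message(docs):
    """Build one <add> message holding every document in the batch."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")

def index_plan(docs, batch_size):
    """Return the messages to send in order: batched adds, one commit, one optimize."""
    messages = [add_message(docs[i:i + batch_size])
                for i in range(0, len(docs), batch_size)]
    messages.append("<commit/>")
    messages.append("<optimize/>")
    return messages

sample_docs = [{"id": n, "title": f"Document {n}"} for n in range(1, 4)]
plan = index_plan(sample_docs, batch_size=2)
for message in plan:
    print(message)
```

Three documents with a batch size of two yield two add messages followed by a single commit and a single optimize, rather than one commit per document.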

Page 10: Search Syntax

- field:term (*:* returns everything)
- A score is generated at query time. The value itself has no meaning; scores are relevant only relative to each other (a scale)
- fq filters the query based on a supplied condition
- wt is the return type of the results (xml, json, etc.)
- qt is the request handler used to process the request (default is "standard")
- fl is the list of fields to return (each field must be stored)
- q is the query string
- You can specify the start value and maxrows
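
The parameters listed above combine into a single query string. The following sketch assembles one for a paged search; the helper name search_params and the field names are hypothetical, and it uses Solr's rows parameter for the page size, which is what the maxrows attribute of cfsearch is assumed to correspond to.

```python
from urllib.parse import urlencode

def search_params(query, page, page_size, filters=(), fields=("id", "score")):
    """Build the query string for one page of results (pages are 1-based)."""
    params = [
        ("q", query),
        ("wt", "json"),
        ("fl", ",".join(fields)),
        ("start", (page - 1) * page_size),  # offset of the first result
        ("rows", page_size),                # number of results to return
    ]
    # Each filter becomes its own fq parameter.
    params += [("fq", condition) for condition in filters]
    return urlencode(params)

query_string = search_params("title:solr", page=3, page_size=10,
                             filters=["category:docs"])
print(query_string)
```

Including the score pseudo-field in fl returns each document's score, which is only useful for comparing results within the same query.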

Page 11: DisMaxRequestHandler

- Declared in solrconfig.xml
- Allows simplified searching without strict syntax
- Can be configured with default weighted parameters (which can be overridden)
- Causes the q parameter to be parsed differently
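
Overriding the default weights at request time might look like the sketch below. It assumes a Solr version that accepts the defType parameter; a setup that registers a DisMax handler in solrconfig.xml would select it with qt and the handler's configured name instead. The field names and boosts are made up.

```python
def dismax_params(user_text, weights):
    """Build DisMax parameters; qf lists the fields to search with their boosts."""
    qf = " ".join(f"{field}^{boost}" for field, boost in weights.items())
    return {"q": user_text, "defType": "dismax", "qf": qf}

# Plain user input with no field:term syntax; title matches count twice as much.
params = dismax_params("wifi router", {"title": 2, "body": 1})
print(params)
```

The q value is ordinary text typed by a user, which is the simplified searching the slide refers to; the per-field weighting lives in qf rather than in the query itself.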

Page 12: Resources

- Lucene in Action
- http://wiki.apache.org/solr/
- http://cfadminsearcher.riaforge.org/
- http://cfsolrlib.riaforge.org/ (CF Solr Lib, written by Shannon Hicks: a wrapper for Solr functionality)