solr has a lot of extensive features solr integration and enhancements todd hatcher
TRANSCRIPT
Solr has a lot of extensive features
Solr Integration and Enhancements
Todd Hatcher
What is Solr?Solr offers advanced, optimized, scalable
searching capabilitiesCommunicate with Solr using XML, JSON
and HTTPIncludes a HTML admin interfaceSolr is built on top of LuceneRich features of Lucene can be leveraged
when using SolrSolr is very configurable
Integration with ColdFusionVery little direct integration with ColdFusionColdFusion communicates with Solr using HTTPSolr runs in its own JVM, does not share with
ColdFusionUsing ColdFusion installation, Solr runs in a jetty
servlet container on port 8983 (http://localhost:8983/solr)
Solr is exposed in production by defaultImportant files located C:\ColdFusion9\solr\
multicoreSolr offers a lot more than what is available
using cfindex cfcollection cfsearch
SolrWhat is a core? – it’s like a verity collection (a
searchable data group)Single Core (one index) vs Multicore (multiple
isolated configurations/schemas/indexes using same Solr instance)
C:\ColdFusion9\solr\multicore\solr.xml is the central file that points to locations of the Solr cores’ configuration and data (this what CF administrator reads/writes to when creating and using Solr collections)
You can put your Solr cores under you project directory and keep them in source control
[core]/conf/solrconfig.xmlMain configuration for solr core<queryResponseWriter name=“json” />
determines the format of the results. ColdFusion uses xslt by default
You can return JSON, XML, python, ruby, phpMultiple query response writers can be
configured, one can be set as default others can be specified by passing parameter wt:[name] (eg. wt:json)
cfsearch type of methods will not work if the response writer is not what ColdFusion is expecting
[core]/conf/schema.xmlField Types maps custom types to the solr/lucene
typetype solr.TextField allows for analyzersAnalyzers can be run at index time or query timeThey allow for manipulations of the data (typically
filtering)The order in which filters are declared is the order
processedStopFilterFactory removes common words that do
not help the search resultsWordDelimiterFilterFactory can adds words like
WiFi, Wi, Fi by splitting the original into subwords
[core]/conf/schema.xml cont.EnglishPorterFilterFactory determines root word
using word variations like -ing determines root word and adds to index
SynonymFilterFactory treats words as sameDoubleMetaphoneFilterFactory for phonetic logic
(better than Soundex which Verity uses)TextSpell/TextSpellPhrase feedback “did you mean”<copyField source=“fieldName” dest=“d”/> dest
fieldtype can run different analyzers on source field and store result
wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
Adobe adds quite a bit to the file to create fieldtypes to be compatible with what was in verity
[core]/conf/schema.xml cont.Similar to creating a database table. Maps
field names to types using <field />Gives you the ability to store additional
dataField can be indexed (searchable)Field can be stored (referenced and
returned with results)Field can be required<uniqueKey>[field name]</uniqueKey><solrQueryParse defaultOperator=“OR” />
IndexingData is sent using api - HTTP POST to Solr
as XML/JSON/BinaryCommit is an intensive task. Do bulk adds
first then call commit<cfindex /> calls commit after each index
(confirmed?)Commit after each would noticeably
increase index timeEfficient Process : add data (queue),
commit, optimize
Search Syntaxfield:term (*:* returns everything)A score is generated at query time, the value itself
doesn’t have any meaning, the scores are relevant only when relative to each other (a scale)
fq can filter query based on some supplied condition
wt is the return type of the results (xml,json, etc.)qt is the request handler used to process the
request (default is “standard”)fl is the list of fields to return (field must be stored)q is the query stringYou can specify the start value and maxrows
DisMaxRequestHandlerDeclared in solrconfig.xmlAllows simplified searching without strict
syntaxCan be configured with default weighted
parameters (which can be overriden)Causes the q parameters to be parsed
differently
ResourcesLucene In Actionhttp://wiki.apache.org/solr/http://cfadminsearcher.riaforge.org/http://cfsolrlib.riaforge.org/
CF Solr Lib written by Shannon Hicks – Wrapper for Solr functionality