Search is the new UI
OpenSource Connections
@DanielBeach
‣ Search strategist @ OpenSource Connections
‣ Elasticsearch + AngularJS video course
‣ Spyglass search UI
‣ New dad
“Search results” (google.com)
Powered by Search
‣ Search results
‣ Data visualization dashboards
‣ Recommendation engines
‣ Anomaly detection
Not your typical search interface. (parse.ly)
Search Misconceptions
‣ Search only powers “search pages”
‣ Search is best left to Google
‣ Search is only useful for text documents
Movie Data (tmdb.com)
Typical Search Inputs
‣ Text entered by a user
‣ Selected filters
Passive Search Inputs
‣ Current time & events
‣ The user’s location
‣ How recently was the content published?
‣ The popularity of an item
Time Data (google.com)
Talk Overview
‣ Principles of good search
‣ How search engines work
‣ Front-end search patterns
GENERAL SEARCH PRINCIPLES
Search is an opportunity to have a conversation.
‣ Users are telling you what they expect to find on your site
‣ Search is an opportunity to learn about your users / customers
Natural searches (SoundHound)
Principles of Search Design
‣ Values search
‣ Listens to your users
‣ Shows relevant content
‣ Gets out of their way
Search is about Clarity
‣ The relevancy of the results is more important than the design of the results.
This is what search should be. (amazon.com)
Only Show what is Important
‣ If you have twenty average results, but one result is statistically more important, don’t show the others.
Accommodate Multiple Types of Searching
‣ Informational users are interested in the breadth of your data
‣ Navigational users want to get somewhere fast
Bad Search
‣ Invalid results
‣ No results
‣ Confusing result hierarchy
‣ Visually messy results
‣ No clear input or submit
Good Search Design
‣ Shows relevant data to users
‣ Only shows useful / actionable results
‣ Visually clean results
‣ Differentiated results
‣ Recognizable input and submit
Seasonal results, or scary-accurate data mining? (amazon.com)
HOW SEARCH ENGINES WORK
Open Source Search Engines
‣ Elasticsearch
‣ Solr
How humans see text
–Arthur Clark
How a search engine sees text
This is called text analysis, and it happens at indexing time.
Indexed terms (simplification)
Query Analysis
The query string must pass through an analysis chain compatible with the one used for the indexed content; otherwise the query terms will not match the indexed terms.
TERM: MEANING
‣ Tokenization: Splitting text into indexable pieces, called tokens. A word is often an example of a token.
‣ Stemming: Collapsing words to their root (interpretation, interpreting --> interpret)
‣ Inverted index: An index of tokens. Maps tokens to document position
‣ Term frequency: The number of times a token occurs in a document
‣ Inverse document frequency: Tokens that appear in fewer documents are calculated to be more important (simplified)
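The terms above can be sketched in a few lines of JavaScript. This is a toy illustration, not how Elasticsearch or Solr implement analysis: the `analyze` function and its crude suffix stripping are assumptions standing in for a real tokenizer and stemmer, and `buildInvertedIndex` and `lookup` are hypothetical helper names. The point it shows is that the query goes through the same analysis chain as the indexed text.

```javascript
// Toy analysis chain: lowercase, split on non-alphanumerics, strip a few suffixes.
// The suffix rule is a stand-in for a real stemmer and only touches longer tokens.
function analyze(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0)
    .map((token) =>
      token.length > 5 ? token.replace(/(ation|ing|s)$/, "") : token
    );
}

// Inverted index: token -> (document id -> positions of that token in the document).
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const [docId, text] of Object.entries(docs)) {
    analyze(text).forEach((token, position) => {
      if (!index.has(token)) index.set(token, new Map());
      const postings = index.get(token);
      if (!postings.has(docId)) postings.set(docId, []);
      postings.get(docId).push(position);
    });
  }
  return index;
}

// Query analysis: the query string goes through the same chain as the content.
function lookup(index, queryText) {
  const matches = new Set();
  for (const token of analyze(queryText)) {
    const postings = index.get(token) || new Map();
    for (const docId of postings.keys()) matches.add(docId);
  }
  return [...matches].sort();
}
```

With two documents, "Interpreting search results" and "An interpretation of robots", a query for "interpret" would find both, because all three forms collapse to the same indexed term.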
Document Scoring
‣ A document's score is based on how often the query terms match in that document, weighed against how common each term is across all documents.
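As a rough sketch of that idea, the function below multiplies term frequency by inverse document frequency and sums over the query terms. The formula `log(1 + N / df)` is one common simplification chosen for illustration; production engines use more refined formulas. The name `scoreDocuments` and the input shape (document id mapped to an array of already analyzed tokens) are assumptions.

```javascript
// docs: { docId: [token, token, ...] } with tokens already analyzed.
// Returns { docId: score } using a simplified TF-IDF sum.
function scoreDocuments(docs, queryTerms) {
  const docIds = Object.keys(docs);
  const scores = {};
  for (const id of docIds) scores[id] = 0;

  for (const term of queryTerms) {
    // Document frequency: how many documents contain the term at all.
    const df = docIds.filter((id) => docs[id].includes(term)).length;
    if (df === 0) continue;
    // Rarer terms get a larger weight.
    const idf = Math.log(1 + docIds.length / df);
    for (const id of docIds) {
      // Term frequency: how often the term occurs in this document.
      const tf = docs[id].filter((token) => token === term).length;
      scores[id] += tf * idf;
    }
  }
  return scores;
}
```

A document that matches a rare term once can outscore a document that matches a common term twice, which is the behavior the slide describes.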
Types of Boosts
‣ Field boosts
‣ Text matching (title^5 description)
‣ Function scores
‣ Boost newer content
‣ Multiply by % of popularity
‣ …
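One way these boosts might be combined in Elasticsearch is a `function_score` query wrapped around a field-boosted `multi_match`. The sketch below only builds the request body. The field names `published` and `popularity`, the 30 day decay scale, and the helper name `buildBoostedQuery` are assumptions for illustration, not values from the talk.

```javascript
// Builds a request body that boosts title matches, newer content, and popular items.
function buildBoostedQuery(userText) {
  return {
    query: {
      function_score: {
        // Field boost: a match in title counts five times as much as in description.
        query: {
          multi_match: { query: userText, fields: ["title^5", "description"] },
        },
        functions: [
          // Boost newer content: the score decays as the date moves away from now.
          { gauss: { published: { origin: "now", scale: "30d" } } },
          // Multiply by popularity; documents without the field keep a factor of 1.
          { field_value_factor: { field: "popularity", missing: 1 } },
        ],
        score_mode: "multiply", // how the function results combine with each other
        boost_mode: "multiply", // how the combined result combines with the text score
      },
    },
  };
}
```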
Minimum Match
‣ How many of the query terms have to match in order for a document to be returned?
‣ Precision vs recall
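To make the trade-off concrete, the sketch below computes how many query terms must match for a given `minimum_should_match` expression. It follows my reading of the documented semantics (integers, percentages rounded down, negative values meaning "all but this many", and conditional `n<rule` parts that apply only when the number of terms exceeds `n`). The function name `requiredMatches` is hypothetical and this is not the engine's own code.

```javascript
// Returns how many of clauseCount optional clauses must match under spec.
function requiredMatches(clauseCount, spec) {
  let result = clauseCount;
  const trimmed = spec.trim();

  if (trimmed.includes("<")) {
    // Conditional parts such as "2<-1 5<70%", checked from left to right.
    for (const part of trimmed.split(/\s+/)) {
      const [bound, rule] = part.split("<");
      if (clauseCount <= Number(bound)) return result;
      result = requiredMatches(clauseCount, rule);
    }
    return result;
  }

  const value = parseInt(trimmed, 10);
  // Percentages are applied to the clause count and truncated toward zero.
  const calc = trimmed.endsWith("%")
    ? Math.trunc((clauseCount * value) / 100)
    : value;
  // Negative values say how many clauses may be missing.
  result = calc < 0 ? clauseCount + calc : calc;
  return Math.min(clauseCount, Math.max(0, result));
}
```

For the expression `2<-1 5<70%`, one or two terms must all match, three to five terms may miss one, and longer queries need 70% of their terms. Raising the requirement favors precision; lowering it favors recall.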
FRONT-END SEARCH PATTERNS
Search patterns
Search request
{
  "query": {
    "multi_match": {
      "fields": ["title^5", "description"],
      "query": "descender",
      "minimum_should_match": "2<-1 5<70%"
    }
  }
}
Response
{
  "took": 47,
  "timed_out": false,
  "hits": {
    "total": 2,
    "max_score": 2.17284,
    "hits": [
      {
        "_index": "catalog",
        "_type": "comics",
        "_id": "84",
        "_score": 2.17284,
        "_source": {
          "title": "Descender",
          "description": "One young robot’s struggle to stay alive in a universe where all androids have been outlawed and bounty hunters lurk on every planet."
          ...
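On the front end, a response of this shape is usually flattened into something the view can render directly. A minimal sketch, assuming the response layout shown above; the helper name `toResults` and the output shape are hypothetical.

```javascript
// Flattens a search response into { total, results } for rendering.
function toResults(response) {
  const hits = response.hits || { total: 0, hits: [] };
  return {
    total: hits.total,
    results: hits.hits.map((hit) => ({
      id: hit._id,
      score: hit._score,
      title: hit._source.title,
      description: hit._source.description,
    })),
  };
}
```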
Aggregations
Aggregations Request
{
  query: query,
  aggs: {
    "comic formats": {
      terms: {field: "format"}
    }
  }
}
Aggregations Response
{
  "hits": {
    "total": 2,
    "hits": [ ... ]
  },
  "aggregations": {
    "comic formats": {
      "buckets": [
        { "key": "Trades", "doc_count": 63 },
        { "key": "Graphic Novels", "doc_count": 35 },
        { "key": "Compilations", "doc_count": 9 }
      ]
    }
  }
}
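The buckets in an aggregation response map naturally onto a list of facets with counts. The sketch below also computes each bucket's size relative to the largest one, which could drive a small bar next to each facet. The helper name `toFacets` and the `share` property are assumptions for illustration.

```javascript
// Turns one terms aggregation into [{ label, count, share }] for a facet list.
function toFacets(response, aggName) {
  const agg = (response.aggregations || {})[aggName];
  if (!agg || !agg.buckets) return [];
  // Largest bucket defines 100%; the floor of 1 avoids dividing by zero.
  const max = Math.max(1, ...agg.buckets.map((bucket) => bucket.doc_count));
  return agg.buckets.map((bucket) => ({
    label: bucket.key,
    count: bucket.doc_count,
    share: bucket.doc_count / max,
  }));
}
```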
Filtering
{
  filtered: {
    query: query,
    filter: {
      bool: {
        must: [
          {term: {"illustrator": "Dustin Nguyen"}}
        ]
      }
    }
  }
}
Search categories (etsy.com)
Showing relative document counts (assignment.uspto.gov)
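When filters come from facets the user has selected, the `must` list can be generated rather than written by hand. The sketch below produces the same `filtered` shape as the slide; newer Elasticsearch versions express this with a `bool` query and its `filter` clause instead. The helper name `buildFilteredQuery` and the `selections` object are hypothetical.

```javascript
// selections: { fieldName: selectedValue }, for example { illustrator: "Dustin Nguyen" }.
function buildFilteredQuery(query, selections) {
  const must = Object.entries(selections).map(([field, value]) => ({
    term: { [field]: value },
  }));
  // With nothing selected, the original query is returned unchanged.
  if (must.length === 0) return query;
  return { filtered: { query: query, filter: { bool: { must: must } } } };
}
```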
Autocomplete
Autocomplete Suggestions
‣ Spelling corrections
‣ More popular phrases
Autocomplete Request
{
  "query": {
    "simple_query_string": {
      "fields": ['title'],
      "query": baseTerms + '(' + lastTerm + '|' + lastTerm + '*)',
      "default_operator": "and"
    }
  },
  "size": 3,
  "_source": ["title"]
}
Autocomplete – disambiguation (google.com)
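The request above relies on `baseTerms` and `lastTerm`, which the slide does not define. One plausible reading is that everything before the final word is matched as typed, while the word still being typed is matched either exactly or as a prefix. A sketch under that assumption; the helper name `buildAutocompleteRequest` is hypothetical, and the input is not escaped for `simple_query_string` operators.

```javascript
// Splits the input into completed words and the word in progress,
// then matches the last word either exactly or as a prefix.
function buildAutocompleteRequest(input) {
  const terms = input.trim().split(/\s+/).filter((term) => term.length > 0);
  if (terms.length === 0) return null; // nothing typed yet

  const lastTerm = terms.pop();
  const baseTerms = terms.length > 0 ? terms.join(" ") + " " : "";

  return {
    query: {
      simple_query_string: {
        fields: ["title"],
        query: baseTerms + "(" + lastTerm + "|" + lastTerm + "*)",
        default_operator: "and",
      },
    },
    size: 3, // only a handful of suggestions are shown
    _source: ["title"], // the dropdown only needs the title
  };
}
```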
Suggestion Request
{
  "query": query,
  "suggest": {
    "text": searchTerms,
    "phraseSuggestion": {
      "phrase": {
        "field": "title",
        "direct_generator": [{
          "field": "title",
          "suggest_mode": "popular",
          "min_word_length": 3
        }]
      }
    }
  }
}
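The suggester's answer comes back under a `suggest` key named after the suggester, as a list of entries that each carry scored `options`. The sketch below picks the highest-scoring option for a "did you mean" line. The helper name `bestSuggestion` is hypothetical, and it assumes a single entry, as with one phrase suggestion over the whole input.

```javascript
// Returns the best suggested phrase, or null when there is nothing better to offer.
function bestSuggestion(response, suggesterName, originalText) {
  const entries = (response.suggest || {})[suggesterName] || [];
  const options = entries.length > 0 ? entries[0].options || [] : [];
  if (options.length === 0) return null;

  // Pick the option with the highest score rather than relying on ordering.
  const best = options.reduce((a, b) => (b.score > a.score ? b : a));
  // Suggesting what the user already typed would be noise.
  if (best.text.toLowerCase() === originalText.toLowerCase()) return null;
  return best.text;
}
```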
Highlighting
Highlighting
‣ Highlight search terms
‣ Snippets for large blocks of text
Highlighting Request
{
  query: query,
  highlight: {
    fields: {
      "title": {number_of_fragments: 0},
      "detailed description": {number_of_fragments: 0}
    }
  }
}
Highlighting Response
{
  "took": 28,
  "hits": {
    "total": 2,
    "max_score": 1.44856,
    "hits": [{
      "_index": "catalog",
      "_type": "comics",
      "_id": "84",
      "_score": 1.44856,
      "_source": { "title": "Descender" },
      "highlight": { "title": ["<em>Descender</em>"] }
    }]
  }
}
Result highlighting (library.oreilly.com)
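A hit only carries a `highlight` entry for fields where something matched, so the rendering code needs a fallback to the stored value. A minimal sketch, assuming the response shape above; the helper name `displayField` and the " ... " separator between fragments are assumptions.

```javascript
// Returns the highlighted version of a field when present, otherwise the stored value.
function displayField(hit, field) {
  const fragments = (hit.highlight || {})[field];
  if (fragments && fragments.length > 0) {
    // With number_of_fragments set to 0 this is a single fragment holding the whole field.
    return fragments.join(" ... ");
  }
  const stored = (hit._source || {})[field];
  return stored === undefined ? "" : String(stored);
}
```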
Load more
Loading More Results
{
  "query": {"match_all": {}},
  "from": resultsPage * 10,
  "sort": [{"fieldName": "desc"}]
}
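A "load more" button needs two things: the request for the next page and a way to know when to hide itself. The sketch below makes the page size explicit instead of relying on the default of 10. The names `PAGE_SIZE`, `buildPageRequest`, and `hasMore` are hypothetical.

```javascript
const PAGE_SIZE = 10;

// Builds the request for a zero-based page of results, sorted descending by one field.
function buildPageRequest(query, resultsPage, sortField) {
  return {
    query: query,
    from: resultsPage * PAGE_SIZE, // number of results to skip
    size: PAGE_SIZE,
    sort: [{ [sortField]: "desc" }],
  };
}

// The button stays visible while fewer results are loaded than the total reported.
function hasMore(totalHits, loadedCount) {
  return loadedCount < totalHits;
}
```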
Think outside the search box
‣ Search gives you extreme flexibility to return relevant content quickly, given a wide range of inputs
‣ Retrieval and ranking engine