David Hawking - The Search Master's Toolbox
DESCRIPTION
David Hawking, Funnelback's Chief Scientist, presented "The Search Master's Toolbox" at Online Information 2010 in London. The talk provided considerations and advice for website and marketing managers to apply to search solutions employed in their organisations. It highlighted the reasons why search is so vitally important to the overall success of a website and provided information on the tools required to deliver and optimise an effective search solution.
TRANSCRIPT
![Page 1: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/1.jpg)
Online Information, London, 30 Nov 2010
The Search Master's Toolbox
David Hawking, Funnelback / Squiz
![Page 2: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/2.jpg)
Funnelback’s UK Customers
From 2004/5: Staffordshire University, Scottish Care Commission
From 2009: The Electoral Commission, Digital UK, Hargreaves Lansdown
From 2010: LSE, The Electoral Commission, Incisive Media, British Medical Journal, East Ayrshire Council, Skype international, UCL, ...
![Page 3: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/3.jpg)
“Search is life”
![Page 4: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/4.jpg)
Costs of poor search
Butler Group: Up to 10% of salary costs wasted through ineffective search
IDC: A company with 1000 information workers wastes more than $5M p.a. due to poor search
Accenture: Survey of 1000 middle managers shows they spend up to 2 hrs/day searching.
Econsultancy: Only 41% of companies satisfied that their site search is delivering on business objectives.
ABC Shop: 24% increase in online sales after upgrade in search effectiveness
Search is a critical part of the web experience.
![Page 5: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/5.jpg)
Who's the SearchMaster in your organisation?
![Page 6: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/6.jpg)
Stakeholders expect every SearchMaster to do her duty!
To make external website search work
◦ Sales conversions
◦ Information dissemination
◦ Reduced inquiry handling load
To provide effective search of corporate information
◦ Happy, productive employees (plus students and other stakeholders)
![Page 7: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/7.jpg)
Give them the tools and they will do the job!
[Diagram labels: Searchmaster, End-user; Simple, Powerful]
![Page 8: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/8.jpg)
1. The basic search tool
Should:
◦ Have good performance out of the box, without weeks of implementation.
◦ Be simple to configure
◦ Avoid features which are too complex to use or set up.
◦ Be able to cover your content and scale to the necessary level
![Page 9: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/9.jpg)
2. FineTuner
Every search deployment is different
◦ Web, database, fileshare, Lotus
The weighting of ranking features must accommodate the differences
Manual tweaking is fraught with danger
◦ Fix one query, break a dozen
Make a test file and use a tuning tool to learn feature weightings
![Page 10: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/10.jpg)
Testfile Desiderata
Representative of real workload
◦ Need an unbiased sample
Many queries (typically >> 100)
Identify the best answer(s)
Equivalent answers
See es.csiro.au/C-TEST/
![Page 11: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/11.jpg)
Spreadsheet testfile
employment               health.gov.au/health-career-vacant.htm
jobs                     health.gov.au/health-career-vacant.htm
vacancies                health.gov.au/health-career-vacant.htm
recruitment              health.gov.au/health-career-vacant.htm
tenders                  health.gov.au/list-of-tenders-and-grants.htm
grants                   health.gov.au/list-of-tenders-and-grants.htm
tenders                  health.gov.au/list-of-tenders-and-grants.htm
mental health            health.gov.au/mental-health-and-wellbeing
mental health strategy   health.gov.au/mental-strategy
aged care                health.gov.au/aged-care.htm
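A test file of this kind can drive an automatic evaluation. The sketch below is only an illustration, not Funnelback's tooling: it assumes the file is stored as tab-separated query and best-answer URL pairs, and `toy_search` is a hypothetical stand-in for a real engine that returns a ranked list of URLs. It measures the fraction of test queries whose best answer appears in the top k results.

```python
import csv
import io
from collections import defaultdict


def load_testfile(handle):
    """Read tab-separated (query, best-answer URL) rows into {query: set of URLs}."""
    answers = defaultdict(set)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:
            # Several rows for one query are treated as equivalent answers.
            answers[row[0].strip()].add(row[1].strip())
    return dict(answers)


def success_at_k(answers, search, k=1):
    """Fraction of test queries with at least one correct URL in the top k results."""
    if not answers:
        return 0.0
    hits = sum(1 for query, urls in answers.items() if urls & set(search(query)[:k]))
    return hits / len(answers)


# Hypothetical stand-in for a real search engine, for demonstration only.
def toy_search(query):
    index = {
        "jobs": ["health.gov.au/health-career-vacant.htm"],
        "tenders": [
            "health.gov.au/aged-care.htm",
            "health.gov.au/list-of-tenders-and-grants.htm",
        ],
    }
    return index.get(query, [])


SAMPLE = (
    "jobs\thealth.gov.au/health-career-vacant.htm\n"
    "tenders\thealth.gov.au/list-of-tenders-and-grants.htm\n"
)
answers = load_testfile(io.StringIO(SAMPLE))
print(success_at_k(answers, toy_search, k=1))
print(success_at_k(answers, toy_search, k=2))
```

Storing the answers as a set per query lets one query have several equivalent best answers, as the desiderata slide asks.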
![Page 12: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/12.jpg)
Sources of testfiles at LSE
A-Z Sitemap (>500 entries)
◦ Biased toward anchortext
Keymatches file (>500 entries)
◦ Pessimistic
Click data (>250 queries with > t clicks)
◦ Biased toward clicks – 100% success!
Pop/crit queries (134 manually judged)
All biased – Use a sampling tool!
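One simple way to get a less biased set of test queries is to sample submissions at random from the query log. The sketch below is an assumption about what such a sampling tool might do, not a description of any particular product; the in-memory list `log` is a hypothetical stand-in for a real log file.

```python
import random
from collections import Counter


def sample_queries(query_log, n, seed=0):
    """Draw up to n query submissions uniformly at random, without replacement.

    Sampling submissions (rather than distinct query strings) keeps popular
    queries proportionally represented, as they are in the real workload.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(query_log, min(n, len(query_log)))


# Hypothetical log: one entry per submitted query.
log = ["jobs"] * 6 + ["tenders"] * 3 + ["aged care"]
sample = sample_queries(log, 5)
print(Counter(sample))
```

The sampled queries still need their best answers identified by a person before they can be used as a test file.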
![Page 13: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/13.jpg)
Dimension-at-a-time tuning
[Diagram: tuning steps numbered 1, 2, 3 across axes dim1 and dim2]
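The idea of tuning one dimension at a time can be sketched as a coordinate-wise search. This is a generic illustration under stated assumptions, not Funnelback FineTune's actual algorithm: `toy_score` is a made-up objective with a known optimum, whereas a real objective would run every test-file query with the trial weights and measure success.

```python
def tune_dimension_at_a_time(score, weights, candidates, passes=2):
    """Coordinate-wise search: vary one weight at a time, keeping the others fixed.

    score      -- function mapping a weight list to a quality measure (higher is better)
    weights    -- starting weights, one per ranking feature (dimension)
    candidates -- values to try for each dimension
    passes     -- how many times to sweep over all dimensions
    """
    best = list(weights)
    best_score = score(best)
    for _ in range(passes):
        for dim in range(len(best)):
            for value in candidates:
                trial = best[:dim] + [value] + best[dim + 1:]
                trial_score = score(trial)
                if trial_score > best_score:
                    best, best_score = trial, trial_score
    return best, best_score


# Toy objective with its optimum at weights (0.3, 0.7), for demonstration only.
def toy_score(w):
    return -((w[0] - 0.3) ** 2 + (w[1] - 0.7) ** 2)


grid = [i / 10 for i in range(11)]
best, best_score = tune_dimension_at_a_time(toy_score, [0.0, 0.0], grid)
print(best)
```

Each trial costs one full run of the test file, so the number of query executions grows with dimensions, candidate values, passes and test queries.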
![Page 14: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/14.jpg)
Popular/Critical Set
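[Bar chart comparing five configurations: Out of box; As configured; -daat (tuned); -daat20000 (tuned); -daat0/TAAT (tuned). Axis ticks run from 0 to 30; the bar values are not recoverable from the transcript.]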
![Page 15: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/15.jpg)
Fine Tuning Summary
Tuning a large number of dimensions (Funnelback FineTune covers 38)
Millions of query executions
Achieves substantial gains
![Page 16: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/16.jpg)
But why do queries still fail?
Misspelled
◦ Onlion Imformation
Query words don't match document
◦ “door” or “MOPEM” v. “manually operated personnel egress mechanism”
There is no answer to that question.
◦ Maybe there should be
◦ Scope issues.
![Page 17: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/17.jpg)
Need more tools!
![Page 18: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/18.jpg)
3. Spelling suggestion tools
Suggestions may be useful even if words are correctly spelled:
◦ Manchester Untied → Chelsea
Suggestions based on whole query, not word-by-word
Don't suggest queries which make no sense in the collection being searched
Autocompletion: Guide users to the best query
Context is king
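A minimal sketch of whole-query suggestion follows. It assumes a list of queries known to work well in the collection (`known` here is hypothetical) and uses Python's standard `difflib` for string similarity; a production tool would more likely use query logs and index statistics.

```python
import difflib


def suggest_query(query, known_queries, cutoff=0.75):
    """Return the known query most similar to `query`, or None if nothing is close.

    Matching the whole query string against queries known to work in this
    collection avoids suggestions that make no sense for the content.
    """
    normalised = " ".join(query.lower().split())
    candidates = [q for q in known_queries if q != normalised]
    matches = difflib.get_close_matches(normalised, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Hypothetical list of queries that succeed in this collection.
known = ["online information", "online registration", "information management"]
print(suggest_query("Onlion Imformation", known))
```

Because candidates come only from the collection's own successful queries, a suggestion can never point at content that is not there.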
![Page 19: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/19.jpg)
4. Query expansion tools
Manual rules:
◦ Rego → [registration rego]
◦ MOPEM → [“manually operated personnel egress mechanism door”]
Related queries (automatic)
◦ Based on co-clicking
Contextual navigation (on-the-fly)
◦ Finding superphrases in a deep result set
Faceting (semi-automatic)
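The first two ideas can be sketched briefly. The rule table below is modelled on the slide's examples, and the click-log format of (query, clicked URL) pairs is an assumption for illustration; neither reflects a specific product's configuration syntax.

```python
from collections import defaultdict

# Hypothetical manual rules, modelled on the slide's examples.
EXPANSION_RULES = {
    "rego": ["registration", "rego"],
    "mopem": ["manually operated personnel egress mechanism door"],
}


def expand_query(query, rules=EXPANSION_RULES):
    """Replace each query word that has a rule with its list of alternatives."""
    return [rules.get(word, [word]) for word in query.lower().split()]


def related_queries(click_log, query):
    """Rank other queries by how many clicked URLs they share with `query`."""
    urls_by_query = defaultdict(set)
    for logged_query, url in click_log:
        urls_by_query[logged_query].add(url)
    target = urls_by_query.get(query, set())
    scores = {
        other: len(target & urls)
        for other, urls in urls_by_query.items()
        if other != query and target & urls
    }
    # Most shared clicks first; ties broken alphabetically.
    return sorted(scores, key=lambda q: (-scores[q], q))


# Hypothetical click log of (query, clicked URL) pairs.
clicks = [("jobs", "/careers"), ("vacancies", "/careers"), ("tenders", "/tenders")]
print(expand_query("Rego renewal"))
print(related_queries(clicks, "jobs"))
```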
![Page 20: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/20.jpg)
![Page 21: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/21.jpg)
![Page 22: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/22.jpg)
5. Reporting and alerting tools
Reporting on queries which:
◦ Produced no results
◦ Logged behaviour suggestive of unfulfilment
Alerting when:
◦ Submissions of a query (or group of related queries) sharply increase in frequency
For:
◦ Business intelligence
◦ Triggering creation or changes to content
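A report of queries that produced no results can be as simple as a frequency count. The sketch below assumes log entries are (query, result count) pairs, which is a hypothetical format chosen for illustration.

```python
from collections import Counter


def zero_result_report(log_entries, top=10):
    """Return the most frequent queries that produced no results.

    log_entries -- iterable of (query, result_count) pairs; this log format
                   is an assumption for illustration.
    """
    failures = Counter(
        query.strip().lower()
        for query, result_count in log_entries
        if result_count == 0
    )
    return failures.most_common(top)


# Hypothetical log entries.
entries = [("MOPEM", 0), ("door", 12), ("mopem", 0), ("rego", 0)]
print(zero_result_report(entries))
```

The most frequent entries are the natural candidates for new content or a manual expansion rule.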
![Page 23: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/23.jpg)
Query spike alerting
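One plausible way to detect a sharp increase is to compare today's count for a query with its recent daily history. The sketch below is an assumption about how such an alert could work; the `min_count` and `threshold` defaults are illustrative values, not settings taken from the talk.

```python
from statistics import mean, pstdev


def is_spike(history, today, min_count=20, threshold=3.0):
    """Flag today's count if it is far above the recent daily average.

    history   -- submission counts for the query on previous days
    today     -- submission count for the current day
    min_count -- ignore rare queries whose counts are too small to judge
    threshold -- number of standard deviations above the mean that counts as sharp
    """
    if not history or today < min_count:
        return False
    baseline = mean(history)
    spread = pstdev(history)
    # Use a spread of at least 1.0 so a perfectly flat history does not
    # make every small increase look like a spike.
    return today > baseline + threshold * max(spread, 1.0)


print(is_spike([10, 12, 9, 11, 10], 60))
print(is_spike([10, 12, 9, 11, 10], 13))
```

The same check can be applied to a group of related queries by summing their daily counts first.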
![Page 24: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/24.jpg)
Conclusions
Search is important
Organisations benefit when someone takes responsibility for effective search – the SearchMaster.
The core search tool must be effective, and able to be adapted to your organisation's publishing and searching characteristics.
Further tools are needed to overcome poor queries and missing content.
Thanks to Mike Swanson of Oxfam Australia for the Ned Kelly line.
![Page 25: David Hawking - The Search Master's Toolbox](https://reader035.vdocument.in/reader035/viewer/2022062510/54b4f4064a7959b9428b45b6/html5/thumbnails/25.jpg)