search analytics for content strategists @csofnyc
DESCRIPTION
Search is a conversation. Learn to listen to what your visitors are telling you by understanding their search behavior. In this presentation we'll cover information foraging, search analysis, and how to use them and other techniques to improve your content without having to be a statistician.
TRANSCRIPT
Marko Hurst
Search Analytics For the Content Strategist
Using Search Data To Improve Your Content
Content Strategy NYC Sep. 2009
Keynote: Marko Hurst
Me
Book: Search Analytics - Conversations With Your Visitors
• Anticipated release: December 2009
• Book website: RosenfeldMedia.com/books/SearchAnalytics
• Co-Author: Lou Rosenfeld
Consultant, Author, & Speaker
Enterprise websites & applications
Web & Search Analytics
User Experience
Machine Learning
Principal: MDH Studios
Blog: MarkoHurst.com “Insightful Analytics”
Twitter: MarkoHurst
Contact: [email protected]
About the book
Who: UX & Web Analytics (WA) communities
What: We are bringing UX & WA together. By using both qualitative & quantitative data in our decision-making process, we have created the only complete user model
Why: There are better and more efficient ways of doing our work, but tradition and ignorance keep us siloed and working apart. It’s time to change that. It’s time to move the industry forward.
Before We Begin
Establishing a baseline
Viva La Revolution! Power To the Content!
Content isn’t king, it’s the dictator
• Doesn’t matter what it is, it’s content…
• Article
• FAQ
• Product / service
• File (PDF, PPT, XML)
• Entertainment (game, video, images)
• Form
• Image
• Etc
The only real goal online is to get visitors to the content they need / want
Flickr Photographer: miranda_goode
Terms & Definitions
Taxonomy
• Strict hierarchy of parent / child relationships
Ontology
• Associated relationship between content
Metadata
• Data that describes data/content, including where to find it
Controlled Vocabulary
• Closed list of words used to describe a certain piece of content
Classification system
• Generic categorizing of objects to show their structured order
SSA Benefits & Expectations
SSA* produces actionable insights
• Techniques used are about analysis, NOT reporting
• For some, this is like reaching Nirvana
• For others, this is like opening Pandora’s Box
To achieve maximum benefits of SSA expect to:
Change site design/layout
Change content
Keywords, copy, metadata, labels, etc.
Change information architecture
• Navigation, taxonomy, ontology, user flows, etc.
• Add &/or remove pages
• And much more!
* SSA = Site Search Analytics
Agenda
Information Foraging
Search Analysis
Anatomy of Search
SSA & Content Techniques
Q&A
How We Find Information
Information Foraging
Information Trail
Humans forage for information similar to how animals forage for food
1. Move outwards in a direction we think (predict) will provide the expected results
2. Continue on a path as long as we ‘smell’ signs that we are still on the correct path (information scent)
3. When we no longer smell those signs we retrace our path or find a new path entirely where the ‘smell’ is stronger, which we remember for next time (recursive learning) to better predict where/where not to go
Flickr Photographer: a walk on the wild side
Information Scent
Information scent is how people evaluate the options they encounter while looking for information on a site
• Strong information scents are good at guiding users to the content they want/need
• Weak information scents cause visitors to spend more time evaluating options and increase the chance that they will select the wrong option and be forced to backtrack or leave entirely
Flickr Photographer: RaffertyEvans
How Humans Find Content Online
Three ways of finding content
1. Browse
2. Ask
3. Search
Browse (Navigate)
Ask
Search
Search Analysis
Getting Started
Getting Started: Basics - Overview
Business Model
Data
• Log files
• Search Engine / Web Analytics
Analyzing data
• Data analysis tools
• Zipf Distribution (long-tail)
• Excel (spreadsheet) skills
• Low / no budget software
• No need for code or higher mathematics
NOTE: everything I show you is 100% technology agnostic
Business Models
Your content should be in line with, and dictated by, the “site” business goals
Four Online Business Models
• eCommerce
• Content
  • Advertising
  • Subscription
• Lead Generation
• Self Service
• Most sites fall into at least 2 categories
• Each model inherently comes with its own set of KPIs
Where Data Comes From: Search Logs (Google Search Appliance)
Critical elements (highlighted in red on the slide): IP address, time/date stamp, query, and # of results
XXX.XXX.XX.130 - - [10/Jul/2006:10:24:38 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=regional+transportation+governance+commission&ip=XXX.XXX.X.130 HTTP/1.1" 200 9718 62 0.17
XXX.XXX.X.104 - - [10/Jul/2006:10:25:46 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ud=1&site=AllSites&ie=UTF-8&client=www&oe=UTF-8&proxystylesheet=www&q=lincense+plate&ip=XXX.XXX.X.104 HTTP/1.1" 200 971 0 0.02
XXX.XXX.X.104 - - [10/Jul/2006:10:25:48 -0800] "GET /search?access=p&entqr=0&output=xml_no_dtd&sort=date%3AD%3AL%3Ad1&ie=UTF-8&client=www&q=license+plate&ud=1&site=AllSites&spell=1&oe=UTF-8&proxystylesheet=www&ip=XXX.XXX.X.104 HTTP/1.1" 200 8283 146 0.16
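A spreadsheet is enough for this kind of analysis, but for anyone who prefers a script, here is a minimal Python sketch of pulling the critical elements out of log lines shaped like the samples above. It rests on assumptions: the regular expression is fitted to these three lines only, the third number after the request is read as the number of results (inferred from the slide's note and from the misspelled query showing 0), and the function name `parse_search_log_line` is hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Pattern fitted to the sample lines above (an assumption, not a vendor spec):
# ip, two dashes, [timestamp], "GET url HTTP/x", status, bytes, results, seconds.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"GET (?P<url>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d+) (?P<bytes>\d+) (?P<results>\d+) (?P<seconds>[\d.]+)$'
)


def parse_search_log_line(line):
    """Return the critical elements of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    # parse_qs decodes "+" and %-escapes, so "license+plate" becomes "license plate".
    params = parse_qs(urlsplit(match.group("url")).query)
    return {
        "ip": match.group("ip"),
        "timestamp": match.group("timestamp"),
        "query": params.get("q", [""])[0],
        "results": int(match.group("results")),
    }
```

Run over a whole log file, the resulting rows can be pasted into a spreadsheet for the analysis that follows.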
Where Data Comes From: Search Engine / Web Analytics
Data collection & options vary by vendor
Data collection is typically a separate step if you want to combine it with web analytics
• I.e. most analytics vendors (page tag model) do NOT have built-in search data capabilities
Zipf Distribution: The Long Tail, Power Law, 80/20, etc.
Head
Torso
Tail
Flickr Photographer: hjallig
3 characteristics
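To see the long tail in your own data, rank queries by frequency and watch how quickly the cumulative share of all searches climbs. The sketch below is one way to do that in Python; the same table can be built in a spreadsheet. The function name `cumulative_share` and the choice to lowercase and trim queries before counting are assumptions made for illustration.

```python
from collections import Counter


def cumulative_share(queries):
    """Rank queries by frequency and report the running share of all searches."""
    counts = Counter(q.strip().lower() for q in queries)
    total = sum(counts.values())
    rows = []
    running = 0
    for rank, (query, count) in enumerate(counts.most_common(), start=1):
        running += count
        # Each row: rank, query, times searched, cumulative share of all searches.
        rows.append((rank, query, count, running / total))
    return rows
```

Where the head ends and the tail begins is a judgment call; a typical starting point is to look at how few queries it takes to cover the first half of all searches.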
Improving Your Content
SSA Techniques & Search Behavior
Flickr Image : Peter Morville Based on original image from “In Defense of Search” by Peter Morville
The Anatomy of Search: Search Components
Six components of a single search experience
1 2 3 4 5 6
Component 1: Visitor (User)
When / Where Do Visitors Search?
Most often when a visitor becomes frustrated with browsing (i.e. your content: design, architecture, labeling, etc.)
• Caution: Some visitors use search as their first / primary method
• You will need to filter out these types of searchers
• This behavior also occurs when a visitor ‘knows’ what they are looking for
?’s
How could knowing where search was initiated from be useful?
What insights could be derived from this?
What changes might be made?
Component 1: Visitor (User) Search Analysis
When / where did your visitors initiate search from?
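One way to answer this question, sketched in Python under assumptions: each search event is a dictionary with a hypothetical `origin_page` field holding the page the visitor was on when they searched. Your search engine or web analytics tool may call this field something else or may not record it at all.

```python
from collections import Counter


def top_search_origins(search_events, limit=5):
    """Count which pages visitors were on when they started a search."""
    origins = Counter(event["origin_page"] for event in search_events)
    return origins.most_common(limit)
```

Pages near the top of this list are candidates for a closer look at their navigation, labels, and content.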
Component 2: Query (Keywords)
When a visitor uses search, they are speaking to you in their natural language, i.e. not yours
They are confessing their needs & desires to you hoping you can help them
This is your chance to have “a conversation”. Don’t waste it!
Conversation = Good. Monolog = BAD!
Are you speaking the same language, or a foreign language?
?’s
How might you apply natural language to your copy, navigation, labels?
…taxonomy, ontology, metadata, controlled vocabulary?
…SEO & SEM?
How could you determine your most valuable content?
Component 2: Keyword Analysis
What are your visitors looking for?
Component 2: Keyword Analysis
Trends
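A trend is simply the same keyword counted over time. As a minimal Python sketch, assuming your data is a list of (date, query) pairs, the hypothetical function below counts each query per calendar month; a pivot table in a spreadsheet gives the same result.

```python
from collections import Counter, defaultdict
from datetime import date


def monthly_query_counts(searches):
    """Map each normalized query to a Counter of searches per 'YYYY-MM' month."""
    trends = defaultdict(Counter)
    for day, query in searches:
        trends[query.strip().lower()][day.strftime("%Y-%m")] += 1
    return trends
```

Seasonal spikes in a keyword suggest when related content should be refreshed or promoted.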
Component 3: Search Interface
Minimum: search query box & search button
Sometimes a filter or facets will also be used
Component 3: Search Interface Analysis
How many characters should your query box display?
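Your own query data can suggest an answer. The Python sketch below finds the query length, in characters, that covers a chosen share of searches, using a simple nearest-rank percentile. The function name and the 95% default are assumptions for illustration, not a recommendation from the talk.

```python
import math


def query_length_percentile(queries, percentile=0.95):
    """Return the character length that covers the given share of queries."""
    lengths = sorted(len(q) for q in queries)
    if not lengths:
        return 0
    # Nearest-rank method: the smallest length with at least `percentile` of queries at or below it.
    index = math.ceil(percentile * len(lengths)) - 1
    return lengths[max(index, 0)]
```

If the result is well above the number of characters your query box displays, many visitors cannot see their whole query as they type it.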
Component 4: Search Engine
While the Search Engine is an/the essential component…
• Opening the ‘black box’ is beyond the scope of the book & this talk
Things to remember… options and details vary by vendor
Common features: reporting, ranking, best bets, did you mean…, stemming, faceting, weighting, most frequent, clustering, etc.
Component 4: Search Engine Analysis
The success of a search is the bottom line of search analytics
How to measure that success…
• Precision is the % of content retrieved that is relevant to the user’s query.
• Recall is the % of the content relevant to the query that is successfully retrieved.
• Fall-out is the % of non-relevant content that is retrieved, out of all non-relevant content available
* Images courtesy of: http://en.wikipedia.org/wiki/Information_retrieval
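The three measures above can be worked out for a single query once you have judged which content is relevant. A minimal Python sketch, assuming content items are identified by IDs and that `retrieved`, `relevant`, and `collection` (hypothetical names) are the results returned, the items judged relevant, and everything searchable:

```python
def retrieval_metrics(retrieved, relevant, collection):
    """Return (precision, recall, fall_out) for one query, each as a fraction."""
    retrieved, relevant, collection = set(retrieved), set(relevant), set(collection)
    hits = retrieved & relevant
    non_relevant = collection - relevant
    # Guard against empty sets so an empty result list does not divide by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    fall_out = len(retrieved & non_relevant) / len(non_relevant) if non_relevant else 0.0
    return precision, recall, fall_out
```

For example, if 3 results come back, 2 of them are relevant, and 4 relevant items exist in a collection of 10, precision is 2/3, recall is 2/4, and fall-out is 1/6.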
Component 5: Content
Search is about getting visitors to relevant content
Part of your content’s success can be determined by how your visitors behave and interact with your content
?’s
What type of content can you improve via search data?*
Someone try and walk us through how this could be done
* Hint - all of it
Component 5: Content
SEO is NOT about being ranked #1 in Google
• I.e. it doesn’t matter that you’re ranked in the top 10 for 25 keywords when no one comes to your site using those keywords!
• SEO is about getting visitors to relevant content
User-generated SEO
• SEO: Your goal is to get relevant content ranked high in the search engines to achieve business goals
• Writers: Your goal is to write compelling content that achieves business goals
• Natural language in your environment, not Google’s, is the better place to start
• You both have access to it
• You both should use it
• HINT - talk to each other before, during, after content creation
Search Analytics is a GREAT PLACE for UX, Content Strategists, SEO, & Web Analysts to work together, NOT against each other
Component 5: Content Analysis
Power of CONVERSATION: Are you listening to or ignoring your visitors?
What content / products / services are your visitors looking for?
• Do you not have it? Or can’t they find it?
• Maybe you should add / remove content / products?
Natural Language
• Your visitors may be speaking a language you don’t understand
• Worse, you may be trying to speak to them in a language they don’t understand
Look for patterns / relationships between content
Informs your taxonomy, ontology, metadata, controlled vocab, & your SEO / SEM
Surveys: tie attitudinal & behavioral data together
• What & why analysis
• Complete user model (the only one)
http://4q.iperceptions.com
Component 6: Results (SERP)
SERP (Search Engine Result Page)
The (inferred) quality of your results / content can be determined by:
• Refinement
• Null results
• Bounce Rate
• Where did they go?
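The first three signals above can be turned into simple rates. The Python sketch below assumes each search record carries a `results` count and a hypothetical `next` field describing what the visitor did after seeing the results page: "search" (searched again), "result" (clicked a result), or "exit" (left the site). Your tools will likely name and record these differently.

```python
def serp_quality(searches):
    """Return the share of searches with null results, a refinement, or a bounce."""
    total = len(searches)
    if total == 0:
        return {"null_rate": 0.0, "refinement_rate": 0.0, "bounce_rate": 0.0}
    null_results = sum(1 for s in searches if s["results"] == 0)
    refined = sum(1 for s in searches if s["next"] == "search")
    bounced = sum(1 for s in searches if s["next"] == "exit")
    return {
        "null_rate": null_results / total,
        "refinement_rate": refined / total,
        "bounce_rate": bounced / total,
    }
```

High rates on any of the three point to queries worth entering yourself to see what your visitors saw.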
Component 6: Results (SERP) Analysis
There are lots of great reports out there; here are a few I find critical for successful analysis…
Single Greatest Piece of Advice I Can Provide…
Reports & data are fantastic and essential for analysis.
But if you REALLY REALLY want to find out how well or poorly your search engine & content are working, all you have to do is… “walk a mile in your visitor’s shoes”.
MEANING: Take your visitors’ keywords and manually input them YOURSELF and experience what they did
Summary
Pay attention! Especially you in the back row
Summary
All content can be optimized via data
Improving search improves your… visitor satisfaction, site usability, SEO, SEM, ROI, design, content, overall user experience, and more
Tear down the traditional walls around data & ownership that hold us back
Combine qualitative & quantitative (what & why) data for analysis and decision making
• Provides the only complete user model
• We actually might work together as a team
• It’s good for the soul and gives you the warm & fuzzies when you’re done
6 components to search
Visitor
Keywords
Search interface
Search engine
Content
SERP (results)
My book “Search Analytics” will be out in December’ish
Thank You!
Book: RosenfeldMedia.com/books/SearchAnalytics
Blog: MarkoHurst.com
Contact: [email protected]
Twitter: MarkoHurst