Know Your Market – Know Your Customer: What Web Data Reveals If You Know Where and How to Look
TRANSCRIPT
Presenters: Christian Giaretta, VP of Sales Engineering, Connotate
Dennis Clark, Chief Strategy Officer, Luminoso
Moderator: Gina Cerami, VP of Marketing, Connotate
Date: November 1, 2012
Today’s Discussion
• What Web Data Reveals: The Fundamentals
  • The business case
  • Where to start? Best practices and the automation process
• Know Your Market
  • Use cases: market transparency, digital strategy, PDF extraction
  • Differences in data sources
• Know Your Customer: Part 1
  • Use case: online advertising - aggregating customer response to ads
  • Manual versus automated approaches
• Know Your Customer: Part 2
  • Text analysis – overview of options
  • Concept-based text analysis
  • Use case: consumer packaged goods
  • Other considerations
• Q&A
IDC Research – October 2012
• CEOs are looking at Big Data on the Web to understand their markets and customers
• The number of sites with valuable content continues to expand at a tremendous rate
• Factors to consider when collecting Web data
  • Timeliness
  • Legitimacy
  • Aggregation
Can I Trust Web Data for Market Research???
Good question! You may have to… Factors to consider:
• It’s harder and harder to get people to answer surveys
• Focus groups take time – which you may not have
• Proprietary data sources may not answer all of your important questions
• Organizations and government agencies are moving more and more data, content and forms onto the Web
Can I Trust Web Data for Market Research???
Timely? YES!!
Refresh primary research
Expose new trends or questions rapidly
Aggregate? YES!!
Volumes of data reveal insights
The longer you retain it, the more valuable it gets
Legitimate? Uhh…
Be vigilant about spam and bias in Web data
Some sites are better than others
Polling Question: Web Data Collection
Are you currently collecting data from the Web?
Yes – we are doing this using an automated process
Yes – however, we are collecting Web data using a manual process
No – we are not collecting Web data
Where to Start? Follow Proven Best Practices
Work with experts with deep experience evaluating Web sources for data extraction to help you…
• Clarify “What do you really want to do with this data?”
• Decide which sites to target
• Identify how easy or difficult it will be to extract data from target sites
• Outline the scope of the project
• Estimate long-term maintenance costs (and how to minimize them)
An Overview of the Automation Process
Collect Data
Internal Sources
• Database
• Market Basket
• Inventory, etc.
External Sources
• Social Media
• Surface Web
• Hidden Web
• Secured Sites
Transform
• Structure
• Classify
• Prep for Analysis
Deliver
• Reports
• Dashboards
• Workflow
• BI Plug-ins
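The collect, transform, deliver flow on this slide can be sketched in a few lines of Python. This is a minimal illustration only, not Connotate's implementation: the function names, the sample records, and the CSV output format are all assumptions made for the example.

```python
import csv
import io


def collect():
    """Collect: gather raw records from internal and external sources.

    The records here are hypothetical samples, hard-coded for illustration.
    """
    internal = [{"source": "inventory", "text": " Widget A | 12 units "}]
    external = [{"source": "surface_web", "text": "Widget A | $9.99 "}]
    return internal + external


def transform(records):
    """Transform: structure each raw record and classify its value."""
    rows = []
    for rec in records:
        product, value = [part.strip() for part in rec["text"].split("|")]
        kind = "price" if value.startswith("$") else "stock"
        rows.append({"source": rec["source"], "product": product,
                     "kind": kind, "value": value})
    return rows


def deliver(rows):
    """Deliver: emit CSV text that a spreadsheet or BI tool can load."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["source", "product", "kind", "value"])
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


rows = transform(collect())
report = deliver(rows)
print(report)
```

Keeping the three stages as separate functions mirrors the slide: a new source only touches `collect`, and a new output format only touches `deliver`.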
Building Permits Reveal Construction Activity
AP_Title          Mr & Mrs
AP_Forename       Samuel John
AP_Surname        MacNaughton
AP_CompanyName
AP_Building       Orana
AP_AddressLine1   Easter Kinkell
AP_AddressLine2   Dingwall
AP_Town           Ross-Shire
AP_Postcode       IV7 8HY
Excel
Insurance Coverage Predicts Drug Sales
Drug Name                   Tier
A/b otic                    2
Abilify                     4
Accolate                    4
Accupril                    4
Accuretic                   4
Accutane                    4
Acebutolol HCL              2
Aceon                       4 (1/2)
Acetaminophen w/ codeine    2
Acetasol HC                 2
Acetazolamide               2
Aciphex                     X
Aclovate ointment           4
Acticin                     2
Activella                   4
Actonel                     4
Actoplus met                3
Actos                       3
PDF Document Excel File
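Once text has been pulled out of a PDF like this one, turning it into spreadsheet rows is largely a parsing exercise. The sketch below assumes each line holds a drug name, then a tier code, then an optional note in parentheses. The regular expression, the `parse_formulary` name, and the three-column CSV layout are illustrative assumptions, not a description of how any particular product works.

```python
import csv
import io
import re

# Assumed line shape: "<drug name> <tier> [(note)]", e.g. "Aceon 4 (1/2)".
# The tier is taken to be a single digit or the letter X.
LINE = re.compile(
    r"^(?P<name>.+?)\s+(?P<tier>[0-9X])(?:\s+\((?P<note>[^)]*)\))?$")


def parse_formulary(lines):
    """Return (name, tier, note) tuples for every line that fits the pattern."""
    parsed = []
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed shape
        parsed.append((match["name"], match["tier"], match["note"] or ""))
    return parsed


sample = [
    "A/b otic 2",
    "Abilify 4",
    "Aceon 4 (1/2)",
    "Acetaminophen w/ codeine 2",
    "Aciphex X",
]
drug_rows = parse_formulary(sample)

# Write the rows as CSV, which Excel opens directly.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Drug Name", "Tier", "Note"])
writer.writerows(drug_rows)
print(buffer.getvalue())
```

A real formulary would need more forgiving rules (multi-line names, page headers, footnotes), which is where most of the maintenance effort tends to go.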
Benefits of Using Automation to Understand Markets and Market-Moving Events
• Reduce costs associated with manual processes
• Speed up processes by doing this continually instead of sporadically
• Improve accuracy
• Repurpose data for new uses by converting PDFs and other unstructured data into Excel, XML or other usable formats
Altitude Digital – Buyer Behavior in Real Time
• Push the boundaries of “Big Data” in interactive advertising
• Use Connotate to collect real-time Web data
• Increase clients’ ad revenues by 30% - 300%
• Continually display aggregated dynamic ad exchange data
• Publishers view real-time, side-by-side comparisons of online ad traffic
• They can instantaneously optimize ad placement
Many of these sites are password-protected… not a problem!
Manual versus Automated Approaches
Your Data Needs                                      To Automate or Not?
Complex product-matching tasks                       ? May want to consider crowdsourcing
Small amount of data, needed a few times per year    ? A manual approach may suffice
Specific external data (under $5K/year)              ? Purchase from 3rd party
High volume data monitoring                          Automate
Variety of sources                                   Automate
Frequent updates and/or monitoring                   Automate
Need for data post-processing                        Automate
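The guidance on this slide amounts to a small decision rule, sketched below. The `DataNeed` fields, the `recommend` function, and especially the order in which the checks run are assumptions made for illustration: the slide does not say which consideration wins when several apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataNeed:
    """Hypothetical description of a data requirement (field names assumed)."""
    complex_matching: bool = False        # e.g. product matching
    high_volume: bool = False
    many_sources: bool = False
    frequent_updates: bool = False
    needs_post_processing: bool = False
    # Annual price if a third party sells this data; None if not for sale.
    third_party_cost_per_year: Optional[float] = None


def recommend(need):
    """Map a data need to one of the slide's four suggestions."""
    if need.complex_matching:
        return "consider crowdsourcing"
    if (need.high_volume or need.many_sources
            or need.frequent_updates or need.needs_post_processing):
        return "automate"
    if (need.third_party_cost_per_year is not None
            and need.third_party_cost_per_year < 5000):
        return "purchase from third party"
    # Small amount of data, needed only a few times per year.
    return "manual approach may suffice"


print(recommend(DataNeed(frequent_updates=True)))
print(recommend(DataNeed(third_party_cost_per_year=3000)))
print(recommend(DataNeed()))
```

In practice several rows of the table often apply at once, so a rule like this is a starting point for discussion rather than a verdict.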
A Closer Look at Different Approaches
Approach: Considerations
Manual offshore: No economies of scale; human error compromises quality.
Crowdsourcing: A viable approach for complex tasks like product matching of apparel for one-shot projects; may be less reliable for ongoing monitoring and long-term projects.
In-house or low-cost Web scrapers: Not resilient; scrapers break when Web page HTML changes, creating a maintenance headache; scrapers may not monitor well or support scheduling.
Robust automation installed on-premise: High degree of control; better resiliency to change but should consider project complexity and future need to add new Web sources on short notice.
Robust solution hosted by vendor: Highest resiliency; no maintenance burden; 24/7 follow-the-sun support; infinitely scalable and no capital expenditures for hardware or IT resources.
Polling Question: Data Analysis
What type of data analysis tools do you use?
Only basic tools – Excel spreadsheets, etc.
Text analysis and basic tools
Applications built in-house and basic tools
None
Text Analysis Options
Main ‘Schools’ of Text Analytics
Machine Learners: Understanding through Data
• Learn meaning through correlations
Ontologists: Understanding through Instruction
• People tell computers what words mean
Luminoso Approach: Concept-based text analysis
• Know the “Common Sense” about the world
• Add new connections from datasets
Language is Creative
It was really stuffy.
Smelled really musty.
Reminds me of a dusty closet.
Was like a wet dog.
It was like it had been shut away for a long time.
Smells like an old house.
Really stale.
It smelled terrible.
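One way to read the comments on this slide: they appear to describe the same kind of unpleasant scent, yet few of them share any vocabulary, so a plain keyword search undercounts them. The sketch below illustrates only that limitation. It is not the concept-based method itself, and the choice of "smell" as the keyword is an assumption.

```python
comments = [
    "It was really stuffy.",
    "Smelled really musty.",
    "Reminds me of a dusty closet.",
    "Was like a wet dog.",
    "It was like it had been shut away for a long time.",
    "Smells like an old house.",
    "Really stale.",
    "It smelled terrible.",
]


def keyword_hits(texts, keyword):
    """Return the texts that contain the keyword, ignoring case."""
    needle = keyword.lower()
    return [text for text in texts if needle in text.lower()]


hits = keyword_hits(comments, "smell")
print(f"{len(hits)} of {len(comments)} comments mention 'smell'")
# prints "3 of 8 comments mention 'smell'"
```

The five comments that the search misses ("stuffy", "dusty closet", "wet dog", and so on) are exactly the creative phrasings the slide is pointing at.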
Concept-based analytics has…
• Shown how reaction to product scent changes with price point
• Determined the customer segments for a sports Web site
• Discovered if customers notice unannounced in-store policy changes
• Matched those who should connect at a large enterprise software company’s user conference
Digital Intuition
We boil down the meaning of text into actionable, mathematically justifiable insights.
Case Study: Swiffer SweeperVac
Consumer product design example: Swiffer SweeperVac
Idea: Use social data on Twitter to understand customer reactions to product design
Result: Failure. Twitter lacks depth.
Better Idea: Product Reviews
Use Connotate to extract comment text
Feed input into analytical engine to reveal sentiment
Graphical User Interface/Presentation of Insights
Obtaining Customer Sentiment from YouTube
Manually search YouTube for <“product name”> <“review”>
Use the Connotate automation package to follow links to individual video reviews and more results
Connotate Partners
Another Look at the Automation Process
Collect Data
Internal Sources
• Database
• Market Basket
• Inventory, etc.
External Sources
• Social Media
• Surface Web
• Hidden Web
• Secured Sites
Transform
• Classify
• Structure
• Prep for Analysis
Deliver
• Reports
• Dashboards
• Workflow
• BI Plug-ins
• Connotate provides precise quality data, structured for delivery to your analysis and presentation tools.
• Connotate maximizes the value of your investment in business intelligence, text analytics and semantic analysis tools.
Web Data Can Reveal Insights of Tremendous Value
Valid insights require precise, quality data
Automation is the key to extracting precise, quality data
Automation reduces the cost of monitoring Web sites for updates
Automation makes it easier to collect data for trending
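Monitoring a site for updates can be as simple as comparing a fingerprint of the page content between visits. The sketch below is a generic illustration using a SHA-256 hash from Python's standard library. The `page_fingerprint` and `has_changed` names, the in-memory `seen` store, and the example URL are hypothetical, and fetching the page is left out so the example runs offline.

```python
import hashlib


def page_fingerprint(html):
    """Hash the page text after collapsing runs of whitespace."""
    normalized = " ".join(html.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(url, html, seen):
    """Return True if the page differs from the last visit, then record it.

    A URL that has never been seen before also counts as changed.
    """
    fingerprint = page_fingerprint(html)
    changed = seen.get(url) != fingerprint
    seen[url] = fingerprint
    return changed


seen = {}
url = "https://example.com/permits"  # placeholder address
first = has_changed(url, "<p>12 permits</p>", seen)
same = has_changed(url, "<p>12   permits</p>\n", seen)
updated = has_changed(url, "<p>13 permits</p>", seen)
print(first, same, updated)
# prints "True False True"
```

Storing one fingerprint per visit also gives a simple history of when each page changed, which is a starting point for the trending mentioned on this slide.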
Web Data Can Reveal Insights of Tremendous Value
Spot market trends faster
Detect shifts in competitors’ digital strategy
Monitor buyer behavior online and in aggregate
Detect changes to regulatory sites, download PDFs and extract data
Obtain new insights into customer preferences
Q & A
Connotate will email a link to this presentation as well as a copy of the slides to you within 2 business days.
If you have an immediate need and would like us to contact you about a forthcoming project, please check the appropriate box in the last polling question or call (+1) 732-296-8844.
For more information, you may also visit www.connotate.com or www.connotate.co.uk.