Converseon 2012 CASRO Technology Conference
© 2012 Converseon Inc. Proprietary and Confidential
Distilling Actionable Insights from the Deluge of Social Media Data
Jasper Snyder, VP, Converseon
From Data Deluge to Insights
Social Media Channel | Approx. Monthly Volume | Furthermore…
Blogs | 30 million new posts | On-site comments and social cues and sharing
Facebook | 1.8 billion status updates | Social cues (e.g., “likes”) and comments
Twitter | 4 billion tweets | Social cues like favoriting and flagging other users
YouTube | 400 million social actions | 240 years of video content uploaded each month
The vast scope of social media data available today requires scalable tech solutions. Human-machine collaboration is the only way to deal with this deluge.
Social-media research can support both traditional market research goals and PR use cases.
Traditional Market Research through Social Media Listening
• Consumer segmentation
• Purchase triggers
• Thoughts and opinions about products and brands
• Market awareness of products or brands

Communications Functions through Social Media Monitoring
• Consumer complaints and product malfunctions
• Adverse reactions for pharmaceutical companies
• Crisis monitoring and response
• Reputation management
These two use cases – market research and communications – closely align with two services.
Social Listening
When what matters most is understanding a consumer segment or market.
• Goal is to acquire just enough data to understand a population “out there” in the world.
• Higher tolerance for missing content.
• Lower tolerance for irrelevant content.

Social Media Monitoring
When what matters most is delivering customer service, navigating a crisis situation, or detecting reputation threats.
• Goal is comprehensive, real-time coverage.
• Higher tolerance for irrelevant content.
• Lower tolerance for missing content.
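One way to read the two tolerance profiles above is in terms of precision (how much of what was collected is relevant) and recall (how much of the relevant content was collected). The slides do not use these terms, so the mapping is an interpretation, and the document IDs below are hypothetical. A minimal sketch:

```python
def precision_recall(collected: set, relevant: set) -> tuple:
    """Return (precision, recall) of a collected set against the truly relevant set."""
    hits = len(collected & relevant)
    precision = hits / len(collected) if collected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
relevant = {"d1", "d2", "d3", "d4"}

# Listening-style pull: narrow query, misses content but collects no noise.
listening_pull = {"d1", "d2"}

# Monitoring-style pull: broad query, collects everything relevant plus noise.
monitoring_pull = {"d1", "d2", "d3", "d4", "x1", "x2", "x3", "x4"}

print(precision_recall(listening_pull, relevant))   # (1.0, 0.5)
print(precision_recall(monitoring_pull, relevant))  # (0.5, 1.0)
```

Under this reading, listening favors precision and monitoring favors recall.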
The Social Media Research Process: From Raw Data to Insights
1. Data Collection
2. Data Enrichment
3. Analysis & Insight Generation
Stage 1: Social Data Collection
Primary Challenges:
1. Pull in relevant data and metadata
2. Cover the appropriate social media channels
3. Eliminate spam and irrelevant content
Primary Goal:
Identify and acquire the data that can answer your business questions.
Stage 2: Data Enrichment
Primary Challenges:
1. Data normalization
2. Classification
3. Scalability
Primary Goal:
Implement document- and sub-document-level enrichments like topic, consumer segment, emotion and sentiment.
Stage 3: Analysis & Insight Generation
Primary Challenges:
1. Reliability
2. Strategic Value
Primary Goal:
Connect the dots between a suite of metrics and data points in order to reach sound strategic conclusions.
Social media is a massive compendium of documents…
Harvesting Data and Metadata from Social Media Documents: A Tweet Dissected
Datapoints:
• Author Name
• Text
• Publication Date
• Some hashtags
Metadata:
• Person or tweet that a tweet is in reply to
• Follower count of author
• Times retweeted
• Times favorited
• Author description
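The datapoint and metadata fields listed for a tweet could be held in a single record. The sketch below is illustrative only: the class name, field names, and the `extract_hashtags` helper are hypothetical and do not correspond to any particular Twitter API response format.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class TweetRecord:
    # Datapoints: the visible content of the tweet itself.
    author_name: str
    text: str
    published_at: datetime
    hashtags: List[str] = field(default_factory=list)
    # Metadata: context about the tweet and its author.
    in_reply_to: Optional[str] = None
    follower_count: int = 0
    retweet_count: int = 0
    favorite_count: int = 0
    author_description: str = ""


def extract_hashtags(text: str) -> List[str]:
    """Collect whitespace-delimited tokens that start with '#', minus trailing punctuation."""
    tags = []
    for token in text.split():
        tag = token.rstrip(".,!?;:")
        if tag.startswith("#") and len(tag) > 1:
            tags.append(tag)
    return tags


text = "Test drove the new #Ford #Focus today."
tweet = TweetRecord(
    author_name="example_user",  # hypothetical author
    text=text,
    published_at=datetime(2012, 5, 1, 9, 30),
    hashtags=extract_hashtags(text),
    follower_count=250,
)
print(tweet.hashtags)  # ['#Ford', '#Focus']
```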
Sorting Social Metadata
Tweets that contain #Ford in the text.
[Figure: examples labeled A, B, and C]
Relevancy as a Sorting Task…
All Social Media Documents
All Documents Containing Your Boolean Query
Relevant Documents
Irrelevant Documents
• Spam
• Documents not in target language (e.g., not English)
• Contain keyword but not relevant to client question
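The relevancy sort described above can be sketched as a two-step filter: first keep only documents that match the query, then split those into relevant and irrelevant. This is a toy version under stated assumptions: the `Doc` class, the `is_spam` and `language` flags (assumed to come from upstream detectors), and the `off_topic_terms` list are all hypothetical, and real relevancy decisions usually need human or machine classification rather than a term list.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    language: str = "en"   # assumed output of an upstream language detector
    is_spam: bool = False  # assumed output of an upstream spam filter


def sort_by_relevancy(docs, keyword, off_topic_terms, target_language="en"):
    """Split documents matching `keyword` into (relevant, irrelevant) lists."""
    relevant, irrelevant = [], []
    for doc in docs:
        lowered = doc.text.lower()
        if keyword.lower() not in lowered:
            continue  # outside the query altogether
        off_topic = any(term.lower() in lowered for term in off_topic_terms)
        if doc.is_spam or doc.language != target_language or off_topic:
            irrelevant.append(doc)
        else:
            relevant.append(doc)
    return relevant, irrelevant


docs = [
    Doc("Loving my new Ford Focus"),
    Doc("Harrison Ford was great in that movie"),  # keyword match, wrong subject
    Doc("Ford deals!!! click here", is_spam=True),
    Doc("Mi Ford nuevo es genial", language="es"),
    Doc("Thinking about a Honda"),                 # does not match the query
]
relevant, irrelevant = sort_by_relevancy(docs, "ford", ["harrison"])
print(len(relevant), len(irrelevant))  # 1 3
```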
Data Enrichment: What Should We Measure?
Metric | Explanation
Sentiment | Does the author make a negative or positive point about a product or brand?
Topics | In relation to which topic does the author discuss the product or brand?
Purchase Stage | Has the author of a document already purchased the product when writing about it online?
Consumer Segmentation | What segment is the document’s author from?
Emotions | What emotions do authors express toward the target brand or product?
Data Enrichment: What Should We Measure?
Metric | Sorting Categories
Sentiment | Positive, negative, neutral
Topics | Pre-selected topic and unexpected topics
Purchase Stage | Before making a purchase or after.
Consumer Segmentation | Young male, middle-aged woman, etc.
Emotions | Joy, anticipation, surprise, fear, etc.
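The metrics and sorting categories above could be attached to each document as a small enrichment record. The sketch below is one possible shape, not the presenter's schema: the class and field names are hypothetical, and the open-ended categories (topics, segments, emotions) are left as free strings because the slide lists them with "etc."

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Sentiment(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


class PurchaseStage(Enum):
    BEFORE_PURCHASE = "before purchase"
    AFTER_PURCHASE = "after purchase"


@dataclass
class Enrichment:
    sentiment: Sentiment
    topics: List[str] = field(default_factory=list)    # pre-selected or unexpected topics
    purchase_stage: Optional[PurchaseStage] = None     # None when the text does not say
    segment: Optional[str] = None                      # e.g. "young male"
    emotions: List[str] = field(default_factory=list)  # e.g. "joy", "fear"


# Hypothetical enrichment of a single document.
example = Enrichment(
    sentiment=Sentiment.POSITIVE,
    topics=["fuel economy"],
    purchase_stage=PurchaseStage.AFTER_PURCHASE,
    emotions=["joy"],
)
print(example.sentiment.value, example.purchase_stage.value)  # positive after purchase
```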
How can we implement the sorting tasks we’ve discussed so far?
Sorting Tasks
Human Sorters Machine Sorters
Q: How do you know when a computer is correct?
A: The same way you know that a human is correct:
“I know it when I see it…”
Establishing A Basis for How Well Humans Agree With One Another
Example 1: Inter-Coder Agreement on Sentiment
Item | Coder 1 | Coder 2
1 | Positive | Positive
2 | Positive | Neutral
3 | Neutral | Neutral
4 | Negative | Positive
etc. | … | …

Example 2: Inter-Coder Agreement on Emotion
Tweet | Coder 1 | Coder 2
I do not like the cats with thumbs “advert” | Disgust | Anger
I say that video is real, definitely. | Trust | No Emotion Expressed
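Agreement between two coders can be quantified in more than one way. The sketch below computes simple percent agreement and Cohen's kappa for the four sentiment items listed in Example 1. Kappa is not mentioned on the slides; it is included here as one common chance-corrected statistic. The four items are only the visible portion of a longer list, so the resulting numbers are illustrative, not a result from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of items on which both coders chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    labels = set(count_a) | set(count_b)
    p_expected = sum((count_a[l] / n) * (count_b[l] / n) for l in labels)
    if p_expected == 1:
        return 1.0  # both coders used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# The four sentiment items shown in Example 1.
coder_1 = ["Positive", "Positive", "Neutral", "Negative"]
coder_2 = ["Positive", "Neutral", "Neutral", "Positive"]

print(percent_agreement(coder_1, coder_2))       # 0.5
print(round(cohens_kappa(coder_1, coder_2), 3))  # 0.2
```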
Using Human Parallel Coding to Establish Gold Standards
Confusion Matrix: Human as Gold Standard
 | POSITIVE | NEGATIVE | NEUTRAL | TOTAL
POSITIVE | 365 | 24 | 159 | 548
NEGATIVE | 57 | 81 | 65 | 203
NEUTRAL | 274 | 60 | 415 | 749
TOTAL | 696 | 165 | 639 | 1500
Raw Accuracy: 61.5%
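Raw accuracy is conventionally the share of items on the diagonal of the confusion matrix, that is, items where both codings agree. The sketch below applies that definition to the counts as printed. The slide does not say which axis is the gold standard, but the diagonal sum does not depend on it. On these counts the diagonal is 861 of 1,500 (57.4%), which does not match the 61.5% stated on the slide; the stated figure may come from a different data set or definition.

```python
# Counts copied from the confusion matrix as printed (totals omitted).
LABELS = ["POSITIVE", "NEGATIVE", "NEUTRAL"]
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]


def raw_accuracy(matrix):
    """Diagonal (exact agreement) count divided by the total item count."""
    total = sum(sum(row) for row in matrix)
    agree = sum(matrix[i][i] for i in range(len(matrix)))
    return agree / total


print(f"{raw_accuracy(CONFUSION):.1%}")  # 57.4%
```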
Using A Credit Matrix to Create Improved Measurement
Credit Matrix
 | POSITIVE | NEGATIVE | NEUTRAL
POSITIVE | 100% | 0% | 50%
NEGATIVE | 0% | 100% | 50%
NEUTRAL | 50% | 50% | 100%

Confusion Matrix: Human 1 as Gold Standard
 | POSITIVE | NEGATIVE | NEUTRAL
POSITIVE | 365 | 24 | 159
NEGATIVE | 57 | 81 | 65
NEUTRAL | 274 | 60 | 415

Partial Credit Figure of Merit: 82.3%
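A natural reading of the partial-credit measure is a weighted sum: each cell of the confusion matrix earns the credit in the matching cell of the credit matrix, and the total is divided by the number of items. The slides do not spell out the formula, so this is an assumption. Because the credit matrix is symmetric, the result does not depend on which axis is the gold standard. On the counts as printed this gives 1,140 of 1,500 (76.0%), which does not match the 82.3% stated on the slide; the stated figure may come from a different data set or weighting.

```python
# Counts and credits copied from the two matrices as printed.
CONFUSION = [
    [365, 24, 159],
    [57, 81, 65],
    [274, 60, 415],
]
CREDIT = [
    [1.0, 0.0, 0.5],
    [0.0, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]


def partial_credit_score(confusion, credit):
    """Credit-weighted agreement: sum of count * credit over all cells, over total count."""
    total = sum(sum(row) for row in confusion)
    earned = sum(
        confusion[i][j] * credit[i][j]
        for i in range(len(confusion))
        for j in range(len(confusion[i]))
    )
    return earned / total


print(f"{partial_credit_score(CONFUSION, CREDIT):.1%}")  # 76.0%
```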
But how does the machine learn?
1. Collect human-annotated data.
2. The machine ingests the coded data and finds the patterns that characterize each category.
3. The machine applies the model from step 2 to raw data. The results are compared to human coding of the same material.
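The three steps above can be sketched with a very small classifier. The slides do not name the learning algorithm, so a word-count Naive Bayes model with add-one smoothing stands in here purely for illustration. The training and held-out texts are hypothetical and far smaller than any real annotated set.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train(examples):
    """Step 2: learn per-label word counts from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Step 3: choose the label with the highest log-probability."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Step 1: hypothetical human-annotated data.
coded = [
    ("love this car", "positive"),
    ("great ride love it", "positive"),
    ("hate this car", "negative"),
    ("terrible ride hate it", "negative"),
]
# Held-out items with human labels, used to check the machine's output.
held_out = [("love the ride", "positive"), ("hate the ride", "negative")]

model = train(coded)
agreement = sum(predict(model, t) == gold for t, gold in held_out) / len(held_out)
print(agreement)  # 1.0
```

In practice the comparison in step 3 would be scored against human codes in the same way as the human-to-human comparison, for example with a confusion matrix.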
In conclusion…
Converseon Inc. 53 West 36th Street, 8th Floor, New York, NY 10018 t: 212.213.4279 | f: 646.304.2364 www.converseon.com
Thank You! Jasper Snyder, VP, Converseon