Carnegie Mellon University
Influences of a Shocking News Event on Web Browsing
Danai Koutra, Paul Bennett, Eric Horvitz
TAIA ’14 (SIGIR), Gold Coast, Australia
In a nutshell - What?
Focus on polarizing topics: topics for which there exist ‘many’ different / contradictory opinions
Genetically Modified Food
Gun Control
- Why? The queries we issue often reveal our opinions, which are in turn reinforced by search engines.
- Goal: Make all results accessible, and
allow the user to skip them.
Steps along this path
Understand how people browse webpages on a polarizing topic
Time evolution + effect of events
How does user browsing behavior change after an event?
How do external events affect the polarity of a topic over time?
Data and time period to study
Sandy Hook Elementary Shooting, Dec 14, 2012
Where: Sandy Hook Elementary, Newtown, Connecticut, USA
Who: Single perpetrator (20-year-old male)
Historical significance: Deadliest shooting in US history at a high school or grade school, and second-deadliest mass shooting in US history by a single perpetrator.
Casualties: 20 children (mostly 6- or 7-year-olds), 6 staff members, perpetrator’s mother, perpetrator
Primary weapon: Legally acquired (by mother) semi-automatic Bushmaster XM15-E2S rifle with multiple 30-round magazines
At school: 154 shots with rifle, 2 shots with pistol
Primary span: 5 minutes (all shootings except mother): 9:35 a.m. shooter enters; 9:35:39 a.m. 911 call; 9:39:00 a.m. police on scene; 9:40:03 a.m. shooter suicide
Motive: none determined
Other factors: multiple psychological conditions
RQ: Does a shocking event of historical significance prompt both sides of a debate to explore the views of the other side?
• As evidenced by search and browsing behavior
• Over non-news articles (indicates intent to explore the topic rather than simply follow the news)
• On topic (any of guns, gun rights, and gun control)
Data
Data source: web browser toolbar logs (search + browsing behavior)
Topic: gun control
Event: Sandy Hook shooting
Longer time period to compensate for the increased activity after the event
Data extraction, cleaning
Data Extraction: Main Idea
◦ Start from queries that are on-topic
◦ Expand them to get more relevant queries and URLs
http://www.banhandgunsnow.org/
http://www.adequacy.org/stories/2001....7.html
Data Extraction: Identification of Relevant Queries
Identify relevant queries and URLs: (a) find seed queries from the whole period
◦ Queries containing one of the phrases: gun control, gun rights
(b) get relevant queries
◦ By doing a 2-step random walk on the query-click graph
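The seed-and-expand step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `click_log` format, the example queries and URLs, and the uniform start over seed queries are all assumptions. The walk takes one step from seed queries to clicked URLs and a second step from those URLs back to queries.

```python
from collections import Counter, defaultdict

SEED_PHRASES = ("gun control", "gun rights")

def two_step_walk(click_log, seed_phrases=SEED_PHRASES):
    """Return (url_prob, query_prob) after a 2-step walk from the seed queries.

    click_log is a hypothetical list of (query, clicked_url) pairs.
    """
    # Edge weights of the bipartite query-click graph: clicks per (query, url).
    edge_counts = Counter(click_log)
    query_out = defaultdict(dict)
    url_out = defaultdict(dict)
    for (query, url), n in edge_counts.items():
        query_out[query][url] = n
        url_out[url][query] = n

    # (a) Seed queries: those containing one of the seed phrases.
    seeds = [q for q in query_out if any(p in q.lower() for p in seed_phrases)]
    if not seeds:
        return {}, {}

    # Step 1: seed query -> URL, starting uniformly over seeds (assumption).
    url_prob = defaultdict(float)
    for q in seeds:
        total = sum(query_out[q].values())
        for url, n in query_out[q].items():
            url_prob[url] += (1 / len(seeds)) * n / total

    # Step 2: URL -> query, proportional to click counts.
    query_prob = defaultdict(float)
    for url, p in url_prob.items():
        total = sum(url_out[url].values())
        for q, n in url_out[url].items():
            query_prob[q] += p * n / total
    return dict(url_prob), dict(query_prob)

# Hypothetical toy log, for illustration only.
click_log = [
    ("gun control laws", "a.example/control"),
    ("gun control laws", "b.example/stats"),
    ("assault weapons ban", "a.example/control"),
    ("second amendment", "c.example/rights"),
    ("gun rights groups", "c.example/rights"),
    ("cookie recipes", "d.example/cookies"),
]
url_prob, query_prob = two_step_walk(click_log)
```

In this toy log, "assault weapons ban" contains no seed phrase but is reached through a shared clicked URL, while "cookie recipes" is never reached. How the resulting probabilities are thresholded into a final relevant set is not stated in the slides.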
Data Extraction: Identification of Relevant URLs
[Figure: click trails 1 and 2, showing relevant queries, relevant URLs, and an irrelevant query marked X]
Data Labeling
Labeling the webpages
- Label all non-news on-topic sites visited by at least two users
- 2 assessors
  - Self-identified moderate gun rights
  - Self-identified moderate gun control
- 6 labels: extreme gun control, moderate gun control, highly balanced, purely factual, moderate gun rights, extreme gun rights
- 3 labels: gun control, balanced, gun rights
[Table: inter-rater agreement]
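The agreement statistic used on the slide is not preserved in this transcript. As one common choice for two assessors, Cohen's kappa could be computed at both label granularities, as sketched below. The `COLLAPSE` mapping from 6 to 3 labels and the example ratings are assumptions made for illustration, not the study's data.

```python
from collections import Counter

# Assumed mapping from the 6-label scheme to the 3-label scheme.
COLLAPSE = {
    "extreme gun control": "gun control",
    "moderate gun control": "gun control",
    "highly balanced": "balanced",
    "purely factual": "balanced",
    "moderate gun rights": "gun rights",
    "extreme gun rights": "gun rights",
}

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of four pages by the two assessors.
rater_a = ["extreme gun control", "moderate gun control",
           "highly balanced", "moderate gun rights"]
rater_b = ["moderate gun control", "moderate gun control",
           "purely factual", "moderate gun rights"]

kappa_6 = cohens_kappa(rater_a, rater_b)
kappa_3 = cohens_kappa([COLLAPSE[x] for x in rater_a],
                       [COLLAPSE[x] for x in rater_b])
```

In this toy example the raters disagree only within a side or within the balanced group, so agreement is higher under the 3-label scheme than under the 6-label scheme.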
Temporal Evolution
Daily visits for all users
The event clearly had considerable influence on
information seeking about gun control related topics.
Normalized daily visits for all users
[Figure: timeline annotated with Gun Ban List, Gun control petition, Sandy Hook, Bob Costas]
Web Transitions
Goals
1. Which are the most common transitions?
2. Are there changes in transitions due to the news on the Newtown shootings?
=> Obtain a micro-level view of information consumption patterns.
Setup for Analysis
1. Model the browsing history as a Markov chain
$p_{ij} = \Pr(X_{t+1} = j \mid X_t = i)$
e.g., $p_{ij} = \Pr(X_{t+1} = \text{‘gun control’} \mid X_t = \text{‘gun rights’})$
2. Use mobility indices widely used in economics and sociology to identify trends and changes in the system
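A first-order Markov chain of this kind can be estimated by counting consecutive label pairs in each user's trail and normalizing each row. The sketch below uses the 3-label scheme; the trail format and the example trails are assumptions for illustration.

```python
# States ordered from the gun control side to the gun rights side.
STATES = ["gun control", "balanced", "gun rights"]

def transition_counts(trails, states=STATES):
    """Count transitions i -> j over consecutive page labels in each trail."""
    index = {s: k for k, s in enumerate(states)}
    counts = [[0] * len(states) for _ in states]
    for trail in trails:
        for current, nxt in zip(trail, trail[1:]):
            counts[index[current]][index[nxt]] += 1
    return counts

def transition_probabilities(counts):
    """Row-normalize counts so row i estimates Pr(X_{t+1} = j | X_t = i)."""
    probs = []
    for row in counts:
        total = sum(row)
        # Rows with no observed transitions are left as all zeros.
        probs.append([c / total if total else 0.0 for c in row])
    return probs

# Hypothetical trails: each is one user's sequence of labeled page visits.
trails = [
    ["gun control", "gun rights", "gun rights"],
    ["gun rights", "gun control"],
]
counts = transition_counts(trails)
P = transition_probabilities(counts)
```

Here `P[2][0]` estimates the probability of moving from a gun rights page to a gun control page, matching the example on the slide.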
Transition matrices for all users
Immobility ratio IR = % same-state transitions
IR = 0.2486
25% of transitions are same-state
50% of transitions move toward extreme gun rights
Only 25% of transitions move toward extreme gun control
Moving up MU = % transitions from control to rights, MU = 0.4997
Moving down MD = % transitions from rights to control, MD = 0.2518
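The three indices can be sketched as shares of all observed transitions, assuming the states are ordered from the gun control end to the gun rights end, so that "up" means any move toward gun rights. That reading is an assumption; it is consistent with the reported values, which sum to roughly 1 (0.2486 + 0.4997 + 0.2518). The count matrix below is hypothetical.

```python
def mobility_indices(counts):
    """Compute IR, MU, MD from a square matrix of transition counts.

    Rows are the current state, columns the next state, with states ordered
    from the gun control end to the gun rights end (assumed ordering).
    """
    n = len(counts)
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no transitions observed")
    same = sum(counts[i][i] for i in range(n))
    up = sum(counts[i][j] for i in range(n) for j in range(n) if j > i)
    down = sum(counts[i][j] for i in range(n) for j in range(n) if j < i)
    return {"IR": same / total, "MU": up / total, "MD": down / total}

# Hypothetical 3-state count matrix, for illustration only.
example_counts = [
    [2, 1, 3],
    [1, 1, 2],
    [1, 1, 4],
]
indices = mobility_indices(example_counts)
```

Because every transition is either same-state, upward, or downward, the three shares always sum to 1 under this definition.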
Change in transition matrices for common users
[Figure: transition matrices before and after Sandy Hook (S.H.), with the change in mobility indices after the event: Immobility (IR), Control -> Rights (MU), Rights -> Control (MD)]
Overall, the system moves towards extreme stances and mainly
exploration of gun rights.
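One simple way to quantify how much a transition matrix changed between two periods is an entrywise matrix distance. The sketch below uses the Frobenius distance, which is an assumption: the slides do not say which distance was used. The two matrices are hypothetical.

```python
import math

def frobenius_distance(p, q):
    """Frobenius distance between two equally sized matrices (lists of rows)."""
    if len(p) != len(q) or any(len(rp) != len(rq) for rp, rq in zip(p, q)):
        raise ValueError("matrices must have the same shape")
    return math.sqrt(sum((a - b) ** 2
                         for row_p, row_q in zip(p, q)
                         for a, b in zip(row_p, row_q)))

# Hypothetical 2-state transition matrices before and after the event.
before = [[0.5, 0.5],
          [0.5, 0.5]]
after = [[0.2, 0.8],
         [0.1, 0.9]]
change = frobenius_distance(before, after)
```

A distance of 0 means the two periods have identical transition behavior; larger values indicate a larger shift.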
Conclusions
◦ Topically
  ◦ People largely use the web to access agreeable information.
  ◦ Half of the transitions are from gun control to gun rights pages.
  ◦ After the shootings, the system moves into extreme stances, mainly towards content taking an extreme gun rights stance.
◦ Methodologically
  ◦ Mobility indices / matrix distances as change measures
  ◦ All users vs. users observed both before and after
  ◦ Entropy as a diversity measure
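Entropy as a diversity measure can be sketched as the Shannon entropy of the labels of the pages a user visits. Whether the study computed it per user over page labels in exactly this way is an assumption; the example visit lists are hypothetical.

```python
import math
from collections import Counter

def visit_entropy(labels):
    """Shannon entropy (in bits) of the label distribution of visited pages."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of p * log2(1/p) over the observed labels.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

# Hypothetical users: one visits a single stance, one splits evenly.
one_sided = visit_entropy(["gun rights"] * 4)
two_sided = visit_entropy(["gun control", "gun rights"])
```

A user who only visits pages with one stance has entropy 0, and entropy grows as visits spread more evenly across stances.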
Thank you!
Questions? Suggestions?
Are balanced webpages bridges between opposing views?
Most transitions between the two sides occur directly, without accessing balanced webpages.
Future work: Location based analysis
[Figure: Change in user diversity after S.H. (per state), on a scale from less diverse to more diverse]