Detecting Influenza Outbreaks by Analyzing Twitter Messages
By Aron Culotta
Jedsada Chartree 02/28/11
Outline
• Introduction
• Motivations
• Data
• Methodology
• Results
• Conclusion
• References
Introduction
• Growing interest in monitoring disease outbreaks using the Internet
• The growth of Twitter
Motivations
• Developing methods that can reliably track influenza-like illness (ILI) rates in real time.
Data
• The U.S. Centers for Disease Control and Prevention (CDC)
• Twitter data
• 36-week period from August 29, 2009 to May 8, 2010
Data
The ILI rates from the CDC’s weekly tracking statistics (09/05/09 to 05/08/10)
The number of Twitter messages collected per week
Methodology
• Gathering the ILI rates and Twitter messages
• Finding the correlation between the ILI rates and Twitter messages

$P$ = the proportion of the population exhibiting ILI symptoms
$W = \{w_1, \dots, w_k\}$ = a set of $k$ keywords
$D$ = the document collection
$\beta_1, \beta_2$ = the coefficients
$\varepsilon$ = the error term
$Q(W, D) = |D_W| / |D|$ = the fraction of documents in $D$ that match $W$
$\operatorname{logit}(P) = \ln\left(\frac{P}{1 - P}\right)$
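The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation. It assumes the simple linear model has the form logit(P) = β1 · logit(Q(W, D)) + β2 + ε, that a document "matches" W when it contains at least one keyword, and that the coefficients are fit by ordinary least squares. The function names and the weekly values are hypothetical.

```python
import math

def logit(x):
    """Log-odds transform: ln(x / (1 - x)), defined for 0 < x < 1."""
    return math.log(x / (1.0 - x))

def query_fraction(keywords, documents):
    """Q(W, D): fraction of documents containing at least one keyword (assumed rule)."""
    keywords = {w.lower() for w in keywords}
    matches = sum(1 for d in documents if keywords & set(d.lower().split()))
    return matches / len(documents)

def fit_simple_linear(xs, ys):
    """Ordinary least squares for y = b1 * x + b2; returns (b1, b2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b2 = mean_y - b1 * mean_x
    return b1, b2

def predict_ili(q, b1, b2):
    """Map a query fraction to a predicted ILI rate by inverting the logit."""
    z = b1 * logit(q) + b2
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weekly values, for illustration only (not data from the study).
weekly_q = [0.002, 0.004, 0.008, 0.005]
weekly_ili = [0.010, 0.020, 0.040, 0.025]
b1, b2 = fit_simple_linear([logit(q) for q in weekly_q],
                           [logit(p) for p in weekly_ili])
```

Fitting on the logit scale keeps the predicted rate inside (0, 1) once it is mapped back, which is the usual reason for choosing this transform when the target is a proportion.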
Methodology
• Filtering spurious matches (noise)
The number of messages containing the keyword “flu” and a number of keywords that might lead to spurious correlations.
Methodology
• Filtering spurious matches by supervised learning
  - Training a document classifier using logistic regression
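A document classifier of this kind could look roughly like the sketch below. It assumes bag-of-words features and plain stochastic gradient updates; the slides do not specify the features, the optimizer, or the training set, so the labeled messages and all function names here are hypothetical.

```python
import math

def featurize(text):
    """Bag-of-words features: the set of lowercase tokens in a message."""
    return set(text.lower().split())

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(messages, labels, epochs=200, lr=0.5):
    """Fit word weights and a bias by stochastic gradient ascent on the log-likelihood."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, y in zip(messages, labels):
            feats = featurize(text)
            p = sigmoid(bias + sum(weights.get(w, 0.0) for w in feats))
            step = lr * (y - p)  # gradient of the log-likelihood for binary features
            bias += step
            for w in feats:
                weights[w] = weights.get(w, 0.0) + step
    return weights, bias

def prob_flu(text, weights, bias):
    """Estimated probability that a message reports an ILI symptom."""
    return sigmoid(bias + sum(weights.get(w, 0.0) for w in featurize(text)))

# Hypothetical labeled messages (1 = reports illness, 0 = spurious match).
train_msgs = [
    "i have the flu and a fever",
    "home sick with flu",
    "get your flu shot today",
    "flu vaccine now available",
]
train_labels = [1, 1, 0, 0]
weights, bias = train_logistic(train_msgs, train_labels)
```

Calling `prob_flu("sick with fever", weights, bias)` then returns a probability that can be used to filter or down-weight keyword matches.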
Methodology
• Filtering spurious matches by supervised learning
  - Combining filtering with regression
    1. Soft classifier
Methodology
• Filtering spurious matches by supervised learning
  - Combining filtering with regression
    2. Hard classifier
• Applying both classifiers to the simple linear model
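One plausible reading of the soft and hard classifiers is sketched below; the slides do not give the formulas, so this is an assumption. In the soft variant each keyword-matching message contributes its classifier probability to the query fraction. In the hard variant a message contributes a full count only when its probability exceeds a threshold, assumed here to be 0.5. The function names and the probabilities are hypothetical.

```python
def soft_query_fraction(match_probs, total_docs):
    """Each matching message counts in proportion to its classifier probability."""
    return sum(match_probs) / total_docs

def hard_query_fraction(match_probs, total_docs, threshold=0.5):
    """Each matching message counts fully, but only above the threshold."""
    return sum(1 for p in match_probs if p > threshold) / total_docs

# Hypothetical classifier outputs for four keyword-matching messages
# out of 100 messages collected in one week.
probs = [0.9, 0.8, 0.3, 0.1]
soft_q = soft_query_fraction(probs, 100)
hard_q = hard_query_fraction(probs, 100)
```

Either fraction can replace the raw keyword fraction as the input to the simple linear model.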
Methodology
• Evaluating false alarms by simulation
  - Sample 1,000 messages deemed to be spurious.
  - Sample, with replacement, an increasing number of the spurious messages and add them to the original message set.
  - Use the same trained regression models.
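The simulation steps above can be sketched as follows. This is an illustrative outline under stated assumptions: `simulate_false_alarms`, the `is_match` predicate, and the example messages are hypothetical, and the resulting query fractions would be passed to the already-trained regression model rather than used to refit it.

```python
import random

def simulate_false_alarms(messages, spurious_pool, counts, is_match, seed=0):
    """For each count n, add n spurious messages sampled with replacement
    to the original set and return the resulting query fractions."""
    rng = random.Random(seed)
    fractions = []
    for n in counts:
        injected = rng.choices(spurious_pool, k=n)  # sampling with replacement
        augmented = messages + injected
        matched = sum(1 for m in augmented if is_match(m))
        fractions.append(matched / len(augmented))
    return fractions

# Hypothetical data: a naive keyword matcher that any "flu" message satisfies.
original = ["i have the flu", "nice weather", "lunch time", "go team"]
spurious = ["flu shot today", "swine flu news"]
keyword_match = lambda m: "flu" in m.lower().split()
fractions = simulate_false_alarms(original, spurious, [0, 4, 16], keyword_match)
```

With a naive keyword matcher the query fraction rises as spurious messages are added, which would inflate the predicted ILI rate. Swapping `is_match` for a classifier-based filter shows how much of that inflation the filter removes.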
Results
Fitted and predicted ILI rates using regression over query fractions of Twitter messages
Results
Fitted and predicted ILI rates using regression over query fractions of Twitter messages
Results
Correlation results with refinements of the flu query
Results
Correlation results with refinements of the flu query
Results
Results
Number of false messages added
Conclusion
• The proposed method can be used to track influenza rates from Twitter messages.
• The proposed method for evaluating false alarms works satisfactorily.
References
• Aron Culotta. 2010. Detecting influenza outbreaks by analyzing Twitter messages.
• Jeremy Ginsberg et al. 2009. Detecting influenza epidemics using search engine query data.