twitinfo: aggregating and visualizing microblogs for event ... · journalist interview...

Post on 25-Mar-2020

0 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

1

TwitInfo: Aggregating and Visualizing Microblogsfor Event Exploration

Adam Marcus, Michael Bernstein, Osama Badar,David Karger, Sam Madden, Rob Miller

MIT CSAIL

2

Tell stories using Twitter

3

4

5

too much

too little

6

Identify events of interest

7

Identify events of interestLabel events

8

Identify events of interestLabel eventsAdd context: sentiment? location?

9

Identify events of interestLabel eventsAdd context: sentiment? location?Streaming

10

Identify events of interestLabel eventsAdd context: sentiment? location?Streaming

TwitInfo automates the process of telling stories using Twitter

11

12

13

14

TwitInfo

event detectionevent labeling

contextstreaming algorithms

15

TwitInfo

event detectionevent labeling

contextstreaming algorithms

sentiment correction algorithmevaluation: news consumers

evaluation: pulitzer prize journalist

16

[Andre et al. UIST2007] [Diakopoulos et al. VAST 2010][Diakopoulos+Shamma CHI2010] [Dork et al. InfoVis2010][Go et al. 2010] [Leskovec et al. KDD2009][Shamma et al. CSCW2010+2011]

17

[Andre et al. UIST2007] [Diakopoulos et al. VAST 2010][Diakopoulos+Shamma CHI2010] [Dork et al. InfoVis2010][Go et al. 2010] [Leskovec et al. KDD2009][Shamma et al. CSCW2010+2011]

18

[Andre et al. UIST2007] [Diakopoulos et al. VAST 2010][Diakopoulos+Shamma CHI2010] [Dork et al. InfoVis2010][Go et al. 2010] [Leskovec et al. KDD2009][Shamma et al. CSCW2010+2011]

19

[Andre et al. UIST2007] [Diakopoulos et al. VAST 2010][Diakopoulos+Shamma CHI2010] [Dork et al. InfoVis2010][Go et al. 2010] [Leskovec et al. KDD2009][Shamma et al. CSCW2010+2011]

TwitInfo is a streaming storytelling layer that complements these visualizations

20

ui walkthroughalgorithm designstudy/evaluation

discussion

21

22

23

24

25

26

27

ui walkthroughalgorithm designstudy/evaluation

discussion

28

29

Large

30

LargeRelatively

31

LargeRelatively

Streaming

32

LargeRelatively

Streamingmean deviation

33

LargeRelatively

Adaptive Streamingmean deviation

34

LargeRelatively

Adaptive StreamingExponentially weighted moving mean deviation

35

LargeRelatively

Adaptive Streaming

RFC 2988: ComputingTCP’s retransmission timer

Exponentially weighted moving mean deviation

36

LargeRelatively

Adaptive Streaming

RFC 2988: ComputingTCP’s retransmission timer

Exponentially weighted moving mean deviation

TF-IDF

37

LargeRelatively

Adaptive Streaming

RFC 2988: ComputingTCP’s retransmission timer

Exponentially weighted moving mean deviation

TF-IDF

2 datasets: earthquakes, soccer games

Precision: 80-100%Recall: 80-100%

38

39

40

41

42MAYBE CUT!

43

ui walkthroughalgorithm designstudy/evaluation

discussion

44

12 participants: four female/8 malefirst half: directed tasks

second half: 5 min write article (soccer, Obama)

how do they use the interface?what can they learn from TwitInfo?

45

“After the peace talks, Obama traveled to the ASEAN conference and to NATO to work on issues in those parts of the world. He then spent the week dedicated to domestic economic issues. First he proposed a research tax break, then a $50 billion investment in infrastructure, then the issue came up about whether he should keep some tax breaks that Bush had implemented, and he’s asking for some tax breaks from business, and these are generating some controversy because [. . . ]”

Participant 6:

46

47

Everyone

48

EveryoneSplit

49

EveryoneSplit

NewsJunkies

50

earthquakes

51

“my thoughts and prayers go out to everyone in that country. I hope...”

earthquakes

52

Journalist Interview

53

Journalist Interview

backgrounding

54

Journalist Interview

backgrounding eyewitnesses

55

ui walkthroughalgorithm designstudy/evaluation

discussion

56

appropriate algorithms for the medium

57

appropriate algorithms for the medium

other uses for streaming algorithms/signal processing

58

appropriate algorithms for the medium

other uses for streaming algorithms/signal processing

sentiment: caution!

59

appropriate algorithms for the medium

other uses for streaming algorithms/signal processing

sentiment: caution!

mixed-initiative storytelling

60

TwitInfo

http://twitinfo.csail.mit.edu

marcua@csail.mit.edu / @marcua

Adam Marcus, Michael Bernstein, Osama Badar,David Karger, Sam Madden, Rob Miller

61

62

Twitter data for fun and stories

63

Twitter data for fun and stories

How can we tell stories with the tweet stream?

64

Twitter data for fun and stories

How can we tell stories with the tweet stream?

How can we extract data from the tweet stream?

65

Twitter data for fun and stories

How can we extract data from the tweet stream?

TwitInfo

How can we tell stories with the tweet stream?

66

Twitter data for fun and stories

How can we extract data from the tweet stream?

TwitInfo

TweeQL

How can we tell stories with the tweet stream?

67

TwitInfo

How can we tell stories with the tweet stream?

68

TwitInfo

How can we tell stories with the tweet stream?

69

TwitInfo Uses

Automatically identify eventse.g., goals, earthquakes

70

TwitInfo Uses

Automatically identify eventse.g., goals, earthquakes

Backgrounding e.g., Obama's last two weeks

71

TwitInfo Uses

Automatically identify eventse.g., goals, earthquakes

Backgrounding e.g., Obama's last two weeks

Identifying sources on the grounde.g., interviewing earthquake survivors

72

Cautionary Tweet TalesSentiment is tricky e.g., “My warmest prayers go out to the people of Christchurch.”

73

Cautionary Tweet Tales

Location is evasive

Sentiment is tricky e.g., “My warmest prayers go out to the people of Christchurch.”

74

Cautionary Tweet Tales

Location is evasive

Sentiment is tricky e.g., “My warmest prayers go out to the people of Christchurch.”

Spam is everywhere

75

TwitInfo

Automated event detection makes tweet-basedstory-telling possible

76

TwitInfo

Automated event detection makes tweet-basedstory-telling possible

Interfaces tell a story, people add context

77

TwitInfo

Automated event detection makes tweet-basedstory-telling possible

Interfaces tell a story, people add context

78

TweeQL

How can we extract data from the tweet stream?

79

TweeQL

“It's a balmy 89°C in Phoenix”

location=Phoenix, temperatureC=89

How can we extract data from the tweet stream?

80

TweeQL

“It's a balmy 89°C in Phoenix”

location=Phoenix, temperatureC=89

“I'm starting to dig Obamacare!”

topic=Obama, sentiment=positive

How can we extract data from the tweet stream?

81

Twitter data hacking is hard

● Learn the API● Transform data● Stuff it into a database

82

Twitter data hacking is hard

● Learn the API● Transform data● Stuff it into a database

Ad-hoc data processing

83

Twitter data hacking is hard

● Learn the API● Transform data● Stuff it into a database

Ad-hoc data processing

84

TweeQL extracts data from tweets as they pass through the stream

85

TweeQL extracts data from tweets as they pass through the stream

Data extractione.g., location, sentiment, temperature, opencalais

86

TweeQL extracts data from tweets as they pass through the stream

Data extractione.g., location, sentiment, temperature, opencalais

SQL-like queriese.g., SELECT location, text FROM twitter

87

TweeQL extracts data from tweets as they pass through the stream

Data extractione.g., location, sentiment, temperature, opencalais

SQL-like queriese.g., SELECT location, text FROM twitter

stream, not table

88

TweeQL demo*nerd alert*

89

TweeQL's other features

● Aggregation● Outlier detection● Joins/mashups with other sources

90

91

Unstructureddata

92

+

Structuredqueries

Unstructureddata

93

+

Structuredqueries

=> Structureddata

Unstructureddata

94

+

Structuredqueries

=> Structureddata

Meaningfulvisualizations

=>

Unstructureddata

95

Thanks!

● TweeQL: http://github.com/marcua/tweeql● TwitInfo: http://twitinfo.csail.mit.edu/● Ask me about Mechanical Turk + Journalism!

Adam Marcus@marcuamarcua@csail.mit.eduhttp://people.csail.mit.edu/marcua

96

97

98

99

TweeQL Lessons Learned

● Geographic limitations● Requires grungy regular expressions● Data is not necessarily relational

100

101

102

103

104

105

Breadth

Depth

106

107

108

Challenges

109

Challenges

Event annotation

110

Challenges

Event annotation

Streaming

111

Challenges

Event annotation

StreamingSentiment

112

Challenges

Event annotation

StreamingLocationSentiment

top related