![Page 1: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/1.jpg)
Visualization Taxonomies and Techniques
Text: Words, phrases, sentences, …
University of Texas – Pan American
CSCI 6361, Spring 2014
![Page 2: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/2.jpg)
Introduction
• Text is ubiquitous
– Documents, and more generally text, are a primary information source
• (Verbal has its place!)
– Access to documents and text has grown exponentially in recent years due to networking infrastructure
• WWW • Digital libraries • Social media
• Visualization to aid users in understanding and gathering information from text and document collections
![Page 3: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/3.jpg)
Introduction
• Visualization can aid in performing tasks
• For example:
– Which documents contain text on topic XYZ?
– Which documents are of interest to me?
– Are there other documents that are similar to this one (so they are worthwhile)?
– How are different words used in a document or a document collection?
– What are the main themes and ideas in a document or a collection?
– Which documents have an angry tone?
– How are certain words or themes distributed through a document?
– Identify “hidden” messages or stories in this document collection.
– How does one set of documents differ from another set?
– Quickly gain an understanding of a document or collection in order to subsequently do XYZ.
– Understand the history of changes in a document.
– Find connections between documents.
From Stasko, 2013
![Page 4: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/4.jpg)
Introduction: Challenges of Text Visualization
• Text is unlike the other data types seen so far, for example:
• Context and semantics
– Context is relevant to understanding and meaning
– Indeed, natural language understanding is a challenge of the nth + 1 century
• Dimensionality
– Inherently “not dimensional”, so a “visually realizable” visual encoding must be created
– Often, the first step is n-D, then 2- or 3-D
• Modeling abstraction
– Consider the level of “understanding” required for the task
– Match the analysis task with appropriate tools and models
![Page 5: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/5.jpg)
Introduction: Related topics
• Information Retrieval
– Active search process that brings back particular/specific items (will discuss that some today, but it is not always the focus)
– InfoVis and HCI can help some…
• Visualization may be most useful when you are not sure precisely what you’re looking for when retrieving information
– More of a browsing paradigm than a search one
– But this is part of the information retrieval task
• Define information need, formulate “query”, examine/evaluate results, … repeat
• Sensemaking
– Gaining a better understanding of the facts at hand in order to take some next steps
• A principal focus in visual analytics
– Visualization can help make a large document collection more understandable more rapidly
• Which is good: “Overview, zoom and filter, details on demand”
![Page 6: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/6.jpg)
Recall, Visualization Pipeline: Visualization Stages
• Data transformations:
– Map raw data (idiosyncratic form) into data tables (relational descriptions including metatags)
• Text is nominal data
– A word, or any text unit, does not map easily to any quantitative representation!
– The “Raw data --> Data Table” mapping is a principal element of creating any visual representation
• How do you get numbers from words, sentences, …?
– Will see several solutions
[Figure: visualization pipeline. Labels: Raw Information, Dataset, Visual Form, Views; Data Transformations, Visual Mappings, View Transformations; F, F⁻¹; Interaction; Visual Perception; User - Task]
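One common answer to "how do you get numbers from words" is a bag-of-words count table, with one row per document and one column per term. The sketch below is a minimal illustration and not necessarily one of the solutions the lecture goes on to show; the tokenizer, the function names, and the sample documents are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def term_document_table(docs):
    """Return (vocabulary, rows): one row of raw term counts per document."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows

# Hypothetical sample documents, used only to exercise the function.
docs = ["Text is ubiquitous", "Text is nominal data", "Data tables hold numbers"]
vocabulary, rows = term_document_table(docs)
for doc, row in zip(docs, rows):
    print(row, doc)
```

Each row is now a quantitative vector that a visual mapping can consume; word order and sentence structure are discarded, which is the trade-off of this representation.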
![Page 7: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/7.jpg)
Recall, Visualization Pipeline: Visualization Stages
• Visual Mappings:
– Transform data tables into visual structures that combine spatial substrates, marks, and graphical properties
• And … visual mappings, as well, require at least “the usual level” of creativity
[Figure: visualization pipeline, repeated from the previous slide]
![Page 8: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/8.jpg)
Understanding Text Content
![Page 9: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/9.jpg)
Understanding Text Content
• Visual representations of words, phrases, and sentences
– Main goal is understanding, versus search
• Visual presentation is always part of text presentation
– Standard typography uses layout, font, style, color …
– Electronic media, especially: pick a web page
– “Single text content”
![Page 10: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/10.jpg)
Single Text Content: Word Counts
• 2012 National Conventions
• NY Times: http://www.nytimes.com/interactive/2012/08/28/us/politics/convention-word-counts.html
![Page 11: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/11.jpg)
Tag / Word Clouds
![Page 12: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/12.jpg)
Tag / Word Clouds
• Lots of popular interest
– E.g., on the web
• Idea is to show word/concept importance through visual means
– Tags: user-specified metadata (descriptors) about something
– Sometimes generalized to just reflect word frequencies
• Not a new technique
– Milgram’s ‘76 experiment to have people label landmarks in Paris
– Flanagan’s ‘97 “Search referral Zeitgeist”
– Fortune’s ‘01 Money Makes the World Go Round
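The core encoding of a tag cloud, importance shown through size, can be sketched as a mapping from word counts to font sizes. The logarithmic scaling, the point-size range, and the function name below are assumptions chosen for illustration; real tag clouds use a variety of scalings.

```python
import math
from collections import Counter

def font_sizes(frequencies, min_pt=10, max_pt=48):
    """Map each word's count (assumed >= 1) to a font size using log scaling."""
    if not frequencies:
        return {}
    logs = {word: math.log(count) for word, count in frequencies.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    sizes = {}
    for word, value in logs.items():
        # Position of this word between the rarest (0.0) and most frequent (1.0).
        t = (value - lo) / span if span else 1.0
        sizes[word] = round(min_pt + t * (max_pt - min_pt))
    return sizes

# Hypothetical word list, used only to exercise the function.
counts = Counter("jobs economy jobs tax jobs economy".split())
sizes = font_sizes(counts)
print(sizes)
```

Log scaling keeps a few very frequent words from dwarfing everything else, at the cost of making size differences harder to read as exact ratios.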
![Page 13: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/13.jpg)
Tag / Word Clouds - Example: US State of the Union Speeches
• Guardian
• http://www.guardian.co.uk/news/datablog/2011/jan/25/state-of-the-union-text-obama#
• http://image.guardian.co.uk/sys-files/Guardian/documents/2011/01/26/State_of_the_union_2011.pdf?guni=Graphic:in body link
![Page 14: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/14.jpg)
Flickr Tag Cloud
![Page 15: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/15.jpg)
delicious Tag Cloud
![Page 16: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/16.jpg)
Alternate Order
![Page 17: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/17.jpg)
Many Eyes Tag Cloud
• Word pairs
![Page 18: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/18.jpg)
Wordle
![Page 19: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/19.jpg)
Wordle - “Beautiful Word Clouds”, http://www.wordle.net/
• Tightly packed words
– Horizontal, vertical or diagonal
• Size correlated with frequency
• Multiple color palettes
• User gets some control
• Layout algorithm
– Details not published
– Sort words by weight, in decreasing order
– For each word, the initial position is randomly chosen according to a distribution for the target shape
– The update step moves the position out radially
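Since the details of Wordle's layout are not published, the following is only a sketch of the general greedy idea outlined above: place the heaviest words first, start each at a random position, and push it outward until it no longer collides. Everything specific here is an assumption: the rectangular box model, the Gaussian starting distribution, the spiral step, and all names.

```python
import math
import random

def overlaps(a, b):
    """Axis-aligned boxes given as (x, y, w, h), with (x, y) the centre."""
    return (abs(a[0] - b[0]) * 2 < a[2] + b[2]) and (abs(a[1] - b[1]) * 2 < a[3] + b[3])

def layout(weighted_words, seed=0, step=2.0):
    """Greedily place word boxes, heaviest first, moving each outward until it fits."""
    rng = random.Random(seed)
    placed = {}
    for word, weight in sorted(weighted_words.items(), key=lambda kv: -kv[1]):
        # Hypothetical box model: height is the weight, width grows with word length.
        w, h = 0.6 * weight * len(word), float(weight)
        # Initial position: random, clustered around the centre of the canvas.
        x0, y0 = rng.gauss(0, 20), rng.gauss(0, 20)
        angle, radius = rng.uniform(0, 2 * math.pi), 0.0
        while True:
            box = (x0 + radius * math.cos(angle), y0 + radius * math.sin(angle), w, h)
            if not any(overlaps(box, other) for other in placed.values()):
                placed[word] = box
                break
            # Update step: spiral outward from the initial position.
            angle += 0.5
            radius += step
    return placed

# Hypothetical weights, used only to exercise the function.
words = {"text": 40, "visualization": 30, "word": 24, "cloud": 18, "tag": 12}
boxes = layout(words)
for word, (x, y, w, h) in boxes.items():
    print(f"{word:>14}: centre=({x:6.1f}, {y:6.1f}) size=({w:5.1f} x {h:4.1f})")
```

Because the radius grows without bound, every word eventually finds a free spot; a production layout would test glyph outlines rather than bounding boxes to pack words as tightly as Wordle does.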
![Page 20: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/20.jpg)
Wordle - “Beautiful Word Clouds”, http://www.wordle.net/
• Course schedule, table of topics, and assignments
![Page 23: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/23.jpg)
Can be many variations …
• A bit more order
• Order the words more by frequency
![Page 24: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/24.jpg)
Mani-Wordle: User control
• Mani-Wordle
– Start with a nice default algorithm
– Give the user more control over design
• Alter color (within a palette) • Pin words, redo the rest • Move and rotate words
– http://www.cg.tuwien.ac.at/courses/InfoVis/HallOfFame/2012/Gruppe03/Homepage/index.html
– Koh et al TVCG (InfoVis) ‘10
![Page 25: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/25.jpg)
Tag / Word Clouds: Conclusions
• Weaknesses
– Sub-optimal visual encoding (size vs. position)
– Inaccurate size encoding (long words are bigger)
– Font sizes are hard to compare
– May not facilitate comparison (unstable layout)
– Word frequency may not be meaningful
• Most use words vs. stems
– Does not show structure of the text
– Studies have even shown they underperform (Gruen et al CHI ’06)
• Why so popular?
– OK for “quick look”
– Serve as social signifiers that provide a friendly atmosphere and a point of entry into a complex site
– Act as individual and group mirrors
– Fun, not business-like
![Page 26: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/26.jpg)
BTW - Text Analysis Tools: voyeur, http://voyeurtools.org/
• Book
• + tools for text analysis and visualization
![Page 28: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/28.jpg)
![Page 29: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/29.jpg)
Visualization and Information Retrieval
• Examples so far have focused on representing a single document
– …, or, really, a set of words, as there is no consideration of even word order, let alone sentence structure, etc.
• The principal question is how visual representations might aid text, or document, search
– I.e., how to find the proverbial needle in a haystack, where the haystack is all the documents on the WWW or in a digital library
– The term information retrieval refers to this search, and its history antedates computers
• IR entails:
– Determine information need
– Query formulation
– Retrieval
– Assessment of results
– Reformulation of query or even information need
– Repeat (until information need met)
![Page 30: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/30.jpg)
Visualization and Information Retrieval
…
• IR entails:
– Determine information need
– Query formulation
– Retrieval
– Assessment of results
– Reformulation of query or even information need
– Repeat (until information need met)
• Provide visual representations that help during this process
– Show the document collection visually, support browsing, …
• Even for determining information need!
– Show query results visually
– Show how query terms relate to results
– … any aspect
From Stasko, 2013
![Page 32: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/32.jpg)
Evaluating Query Results: TileBars, Hearst, 1996
• Hearst points out that query responses do not include:
– How strong the match is
– How frequent each term is
– How each term is distributed in the document
– Overlap between terms
– Length of document
• Document ranking is opaque
• Inability to compare between results
• Input limits term relationships
![Page 33: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/33.jpg)
TileBars: Overview
• Goal: Minimize time and effort for deciding which documents to view in detail
• Show the role of the query terms in the retrieved documents, making use of document structure
• Graphical representation of term distribution and overlap
• Simultaneously indicate:
– Relative document length
– Frequency of term sets in document
– Distribution of term sets with respect to the document and each other
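The data behind a TileBars-style display can be sketched as a small grid: one row per query term set, one column per document segment, each cell holding the number of hits. This is a simplified illustration, not Hearst's implementation; in particular, splitting the document into equal-sized token runs is an assumption standing in for proper segmentation by document structure, and the function name and sample text are hypothetical.

```python
import re

def tilebar(text, term_sets, n_segments=4):
    """Return one row per term set: hit counts in each of at most n_segments segments."""
    tokens = re.findall(r"[a-z]+", text.lower())
    size = max(1, -(-len(tokens) // n_segments))  # ceiling division
    segments = [tokens[i:i + size] for i in range(0, len(tokens), size)]
    return [[sum(tok in terms for tok in seg) for seg in segments]
            for terms in term_sets]

# Hypothetical document and query term sets, used only to exercise the function.
doc = "law court law judge ruling court patent patent software patent license code"
grid = tilebar(doc, [{"law", "court"}, {"patent"}])
for row in grid:
    # Darker characters stand in for darker tiles.
    print("".join(" .:#"[min(count, 3)] for count in row))
```

The number of columns conveys relative document length, the cell values convey frequency, and comparing rows column by column shows where the term sets overlap.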
From Stasko, 2013
![Page 34: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/34.jpg)
TileBars: Screen
• TileBars screen:
From Stasko, 2013
![Page 35: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/35.jpg)
TileBars: Document representation
• Visual representation of retrieved documents
• Video: TileBars-80mb-chi96_05_m1.mpeg
From Stasko, 2013
![Page 36: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/36.jpg)
TileBars
•TileBars
•Video
![Page 37: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/37.jpg)
TileBars: Conclusions
• Clearly provides, in visual form, the information intended about each document
• Ease/effort/time of comparison?
– Surely would improve with use
• … ?
![Page 38: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/38.jpg)
Evaluating Query Results: Sparkler
• Abstract result documents more
– Havre et al InfoVis ‘01
• Show “distance” from query in order to give the user a better feel for the quality of match(es)
• Also shows documents in response to multiple queries
• Visualizing one query
– Triangle: query
– Square: document
• Distance between query and documents represents their relevance
From Stasko, 2013
![Page 39: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/39.jpg)
Sparkler
• Visualizing multiple queries
• Six queries here
• Bullseye allows viewer to select quality results
From Stasko, 2013
![Page 40: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/40.jpg)
Sparkler
• Test example
• Text Retrieval Conference (TREC-3) test document collection
• AP news stories from June 24–30, 1990
• TREC topic: Japan Protectionist Measures
• Sparkler found 16 of 17 relevant documents
From Stasko, 2013
![Page 41: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/41.jpg)
Evaluating Query Results: RankSpiral
• Compare search results from different search engines
– Spoerri InfoVis ’04 poster
From Stasko, 2013
![Page 42: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/42.jpg)
RankSpiral
• Color represents different search engines
From Stasko, 2013
![Page 44: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/44.jpg)
Evaluating Query Results: ResultMaps
• Treemap-style vis for showing query results in a digital library
– Clarkson, Desai & Foley TVCG (InfoVis) ‘09
From Stasko, 2013
![Page 45: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/45.jpg)
Representing Multiple Documents
![Page 46: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/46.jpg)
Representing Multiple Documents
• Previously, we have seen various techniques for comparing multiple documents that are the results of a query, i.e., a subset of all documents
• Also, may want to just show everything, and then let the user do a “manual search”, or user-directed search
• Such displays of all documents also support the type of search common in visual analytics
– Query, browse, connect, drill-down
• Will see:
– Parallel word clouds
– Tree layout of synonyms
– …
![Page 47: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/47.jpg)
Multiple Documents: Parallel Tag Clouds
• Tag clouds increase size of word as f(frequency)
• Showing multiple documents as tag clouds allows visual inspection
– Automated and user-directed, visual analytics
• Parallel Tag Clouds - name says it all
– Video - Collins et al VAST ‘09 – different circuit courts
– http://www.youtube.com/watch?v=rL3Ga6xBgLw
![Page 48: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/48.jpg)
Multiple Documents: Do different circuit courts differ in cases they handle?
![Page 50: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/50.jpg)
Multiple Documents - Counting Words: Overview & Timeline
• E.g., can count words across speeches
• State of the Union Addresses
• http://www.nytimes.com/ref/washington/20070123_STATEOFUNION.html?initialWord=iraq
• NY Times demo
![Page 52: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/52.jpg)
Multiple Document Word Use: DocuBurst
• Sets of synonyms grouped together
– Uses WordNet to show words from a document in terms of their hypernym (IS-A) links
– Size: # of leaves in subtree
– Hue: different synsets of word
– Shade: frequency of use
• Demo, etc.
– http://vialab.science.uoit.ca/portfolio/docuburst-visualizing-document-content-using-language-structure
![Page 53: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/53.jpg)
FeatureLens
• Show patterns of words or n-grams – Don et al. CIKM ‘07
• Video
![Page 55: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/55.jpg)
Combinations of words, phrases, and sentences
![Page 56: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/56.jpg)
Multiple Sentences: SeeSoft Display
• Originally for software visualization
• One line of text on each horizontal line
• Color highlight for attributes
– E.g., for software, how often modified, days since modification
– E.g., for text, where a particular word appears in a sentence
• Conversations might be revealed
• Detail view in pop up window
![Page 57: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/57.jpg)
Multiple Sentences: TextArc - Simple Single Document Visualization
• Visualize an entire book
– Word appearances
– Sentences
– …
– http://textarc.org
• Sentences laid out on the circumference, in a spiral, in order of appearance
• Frequently occurring words inside the spiral
• Selecting a word draws lines to the sentences containing it
– A kind of “visual concordance”
• Significant interaction
![Page 58: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/58.jpg)
TextArc
![Page 59: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/59.jpg)
Concordances and Word Frequencies
• From field of literary analysis
• Concordance
– An alphabetical index of the principal words in a book or the works of an author with their immediate context
• Word of interest in center, with the text in which it appears to the left and right
• As in KWIC
– Key word in context
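A key-word-in-context listing is simple to compute, which is part of why it is a staple of literary analysis. The sketch below is one minimal way to do it; the function names, the word-based context window, and the sample text are assumptions for illustration.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, keyword, right) triples with up to `width` words of context."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

def format_kwic(rows):
    """Right-align the left context so the keyword lines up in a centre column."""
    pad = max((len(left) for left, _, _ in rows), default=0)
    return "\n".join(f"{left:>{pad}}  {kw}  {right}" for left, kw, right in rows)

# Hypothetical sample text, used only to exercise the functions.
sample = "The cat sat on the mat. The dog saw the cat run."
rows = kwic(sample, "cat", width=2)
print(format_kwic(rows))
```

Aligning the keyword in a centre column is what lets a reader scan down the listing and compare the contexts on either side.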
![Page 60: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/60.jpg)
Word Tree
• Shows context of a word or words
– Follow word with all the phrases that follow it
• Wattenberg & Viégas TVCG (InfoVis) ‘08
• Font size shows frequency of appearance
• Continue branch until hitting unique phrase
• Clicking on phrase makes it the focus
• Ordered alphabetically, by frequency, or by first appearance
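The structure underneath a word tree can be sketched as a prefix tree of the words that follow the chosen root, with a count at each node that a renderer could map to font size. This is an illustrative sketch, not the Many Eyes implementation: it stops at a fixed depth instead of continuing each branch until the phrase is unique, it ignores sentence boundaries, and the names are hypothetical.

```python
import re

def word_tree(text, root, depth=3):
    """Nested dict of the words that follow `root`; each node stores a count."""
    tokens = re.findall(r"\w+", text.lower())
    tree = {"count": 0, "children": {}}
    for i, tok in enumerate(tokens):
        if tok != root:
            continue
        tree["count"] += 1
        node = tree
        # Walk (and extend) one branch per occurrence of the root word.
        for nxt in tokens[i + 1:i + 1 + depth]:
            node = node["children"].setdefault(nxt, {"count": 0, "children": {}})
            node["count"] += 1
    return tree

def show(node, label, indent=0):
    """Print the tree with counts, most frequent branches first."""
    print(" " * indent + f"{label} ({node['count']})")
    for word, child in sorted(node["children"].items(), key=lambda kv: -kv[1]["count"]):
        show(child, word, indent + 2)

# Hypothetical sample text, used only to exercise the functions.
tree = word_tree("I have a dream. I have a plan. I have hope.", "have", depth=2)
show(tree, "have")
```

Shared prefixes collapse into a single branch, which is how the visualization exposes repeated phrasing around a word.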
![Page 61: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/61.jpg)
Word Tree: Interaction
![Page 62: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/62.jpg)
Word Tree: From King James Bible
![Page 63: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/63.jpg)
Word Tree: Many Eyes
![Page 64: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/64.jpg)
Finding Structure: Phrase Nets
![Page 65: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/65.jpg)
Find Structure: Phrase Nets
• Concordances show local, repeated structure of word context
• Phrase Nets in Many Eyes, van Ham et al.
• Other types of patterns
– Lexical: <A> at <B>, <A> and <B>, <A> (is|are|was|were) <B>
– Syntactic: <Noun> <Verb> <Object>
• Visualize extracted patterns in a node-link view
– Occurrences -> Node size
– Pattern position -> Edge direction
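The extraction step for a lexical pattern can be sketched with a regular expression: each match of `<A> connector <B>` becomes a directed edge from A to B, and the occurrence totals give node sizes. This is an illustrative sketch rather than the Many Eyes implementation; the function name, the `connector` parameter, and the sample sentence are assumptions.

```python
import re
from collections import Counter

def phrase_net(text, connector="and"):
    """Count directed edges A -> B for every '<A> connector <B>' match."""
    # The lookahead leaves <B> unconsumed so it can also act as <A> in a chain.
    pattern = re.compile(rf"\b(\w+)\s+(?:{connector})\s+(?=(\w+))", re.IGNORECASE)
    edges = Counter((m.group(1).lower(), m.group(2).lower())
                    for m in pattern.finditer(text))
    nodes = Counter()
    for (a, b), n in edges.items():
        nodes[a] += n  # occurrences drive node size
        nodes[b] += n
    return edges, nodes

# Hypothetical sample text, used only to exercise the function.
edges, nodes = phrase_net("salt and pepper, salt and vinegar, pepper and salt")
for (a, b), n in edges.items():
    print(f"{a} -> {b} ({n})")
```

Passing a regular-expression alternation such as `connector="is|are|was|were"` covers the other lexical patterns listed above; syntactic patterns would need a part-of-speech tagger, which this sketch does not attempt.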
![Page 66: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/66.jpg)
Phrase Net (larger next slide)
Portrait of the Artist as a Young Man: <A> and <B>
![Page 67: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/67.jpg)
Phrase Net
Portrait of the Artist as a Young Man: <A> and <B>
![Page 68: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/68.jpg)
Phrase Nets - The Bible: <A> begat <B>
![Page 69: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/69.jpg)
Phrase Nets - Old and New Testaments: <A> of <B>
![Page 70: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/70.jpg)
Phrase Nets: (<A> and <B>) and (<A> at <B>)
![Page 71: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/71.jpg)
End
![Page 72: Visualization Taxonomies and Techniques Text: Words, phrases, sentences, … University of Texas – Pan American CSCI 6361, Spring 2014](https://reader036.vdocument.in/reader036/viewer/2022062511/551c4446550346a5458b46ca/html5/thumbnails/72.jpg)
References
• F. Viegas, M. Wattenberg, "Tag Clouds and the Case for Vernacular Visualization", interactions, Vol. 15, No. 4, Jul-Aug 2008, pp. 49-52.