Visualizing Translation Variation : Shakespeare's Othello
Zhao Geng 1, Robert S. Laramee 1, Tom Cheesman 2, Alison Ehrmann 2, David M. Berry 2
1 Visual Computing Group, Computer Science Department, Swansea University, UK, {cszg,r.s.laramee}@swansea.ac.uk
2 College of Arts and Humanities, Swansea University, UK, {T.Cheesman,d.m.berry}@swansea.ac.uk, alison.ehrmann@t-online.de
2
Overview
Introduction and Motivation
Related Work
Background Data
Text Pre-processing
Structure-aware Treemap
Parallel Coordinates for Multi-Document Comparison
Conclusion
Acknowledgement
3
Introduction
Shakespeare's plays have been translated into dozens of languages over about 300 years
Every translation:
Offers a different interpretation of the play
Reflects a changing culture or expresses the individual thinking of its author
Builds connections between different regions and gives a retrospective view of their histories
Researchers in Modern Languages are currently collecting a large number of German translations of Shakespeare's play Othello
4
Motivation
Goals of Visualization
Present different facets of the data
Analyze the data in detail
Explore relationships and patterns to form new hypotheses
Complex Multi-Dimensional Data Set (translation, author, place, year, popularity)
Exploratory Specifications
Where, when, and into which languages has Othello been translated?
How have translators influenced one another?
How do versions vary globally / locally?
Which translation is most similar to the original play?
5
Related Work
Multiple Document Visualization prototypes:
ThemeRiver [HHWN02]
Parallel Tag Clouds [CBW09]
DocuBurst [CCP09]
SparkClouds [LRKC10]
6
Related Work (Cont.)
Ben Fry, “On the Origin of Species: The Preservation of Favoured Traces” (2009), http://benfry.com/traces
Lev Manovich, “Cultural Analytics: Visualizing Cultural Patterns in the Era of ‘More Media’”, http://manovich.net/articles/
Stephan Thiel, “Understanding Shakespeare: Towards a Visual Form for Dramatic Texts and Language” (2010), http://www.understanding-shakepseare.com/
7
8
Background Data (Cont.)
57 translations of Othello from 7 countries, dating from 1766 to 2006
9
Text Pre-Processing
Document Collection
Document Standardization: stored in ASCII format, readable in a standard text editor
Tokenization: break the stream of characters into words or tokens; remove the common words (language dependent)
Lemmatization: convert to a standard form; stemming to a root
Concordance: tokens + frequency
Vector Generation (LSI Model)
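The tokenization, common-word removal, stemming, and concordance steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the talk: the stop-word list, the suffix rules, and the function names `tokenize`, `crude_stem`, and `concordance` are all hypothetical, and a real system would use a language-specific stop-word list and a proper German lemmatizer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real lists are language dependent.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def tokenize(text):
    """Lower-case the text and split it into runs of letters (umlauts included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def crude_stem(token):
    """Toy suffix stripper standing in for real lemmatization or stemming."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def concordance(text):
    """Return a Counter mapping each stemmed, non-stop-word token to its frequency."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    return Counter(crude_stem(t) for t in tokens)
```

Calling `concordance` on one translation segment gives the "tokens + frequency" table that the vector generation step takes as input.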
10
LSI Model
LSI model (Latent Semantic Indexing):
Tf (Term Frequency): the frequency with which a term Θ occurs in a document.
Idf (Inverse Document Frequency): the inverse of the number of documents in the corpus in which the term Θ occurs.
The weight of a term Θ can be defined as:
W(Θ) = Tf(Θ) × Idf(Θ)
Each document D then becomes a vector:
D = (W(Θ1), W(Θ2), …, W(Θn))^T
The similarity between two documents D1 and D2 is measured by the cosine of the angle between their vectors:
Sim(D1, D2) = cos(D1, D2) = (D1 · D2) / (|D1| × |D2|)
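A minimal Python sketch of the weighting and similarity formulas on this slide follows. It covers only the Tf × Idf weights and the cosine measure as written here, and it takes Idf literally as 1 divided by the number of documents containing the term; a log-scaled Idf is a common alternative. The function names `tfidf_vectors` and `cosine_similarity` and the token-list input format are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns (vocabulary, list of weight vectors)."""
    vocab = sorted({t for doc in docs for t in doc})
    # Document frequency: the number of documents each term occurs in.
    df = Counter(t for doc in docs for t in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # W(term) = Tf(term) * Idf(term), with Idf taken as 1 / df (assumption).
        vectors.append([tf[t] * (1.0 / df[t]) for t in vocab])
    return vocab, vectors


def cosine_similarity(d1, d2):
    """(D1 . D2) / (|D1| * |D2|); returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(d1, d2))
    norm = math.sqrt(sum(a * a for a in d1)) * math.sqrt(sum(b * b for b in d2))
    return dot / norm if norm else 0.0
```

Two documents with the same term counts score 1, and documents that share no terms score 0, which is the range a similarity axis would span.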
11
Structure-aware Treemap
Meta Data Hierarchy
Century -> Decades -> Country -> Author -> Title
Visualization
Treemap: aggregation of numerical values; re-ordering of hierarchies
DOI-Tree: structural clarity; initiate a searching task
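The metadata hierarchy and the two treemap operations named above (aggregation and re-ordering) can be illustrated with nested dictionaries. The sketch below is an assumption about one possible data layout, not the system's actual implementation: the record fields `year`, `country`, `author`, and `title`, and the helpers `with_periods`, `build_hierarchy`, and `aggregate`, are hypothetical names.

```python
def with_periods(record):
    """Return a copy of the record with century and decade derived from its year."""
    out = dict(record)
    out["century"] = record["year"] // 100 + 1   # e.g. 1766 -> 18 (18th century)
    out["decade"] = record["year"] // 10 * 10    # e.g. 1766 -> 1760
    return out


def build_hierarchy(records, levels):
    """Nest records by the given metadata levels; leaves hold occurrence counts."""
    root = {}
    for rec in records:
        node = root
        for level in levels[:-1]:
            node = node.setdefault(rec[level], {})
        leaf = rec[levels[-1]]
        node[leaf] = node.get(leaf, 0) + 1
    return root


def aggregate(node):
    """Sum the leaf counts under a node: the value a treemap cell is sized by."""
    if isinstance(node, int):
        return node
    return sum(aggregate(child) for child in node.values())
```

Re-ordering the hierarchy then amounts to passing the levels in a different order, for example `["country", "century", "decade", "author", "title"]`, and rebuilding the tree.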
12
Structure-Aware Treemap
13
Focus+Context Parallel Coordinates
14
PC-Patterns (1)
15
PC-Patterns (2)
16
Video
17
Conclusion
An interactive system for exploring variation among different German translations of Othello
A Structure-Aware Treemap was developed for metadata analysis
Focus+Context Parallel Coordinates incorporate an objective similarity measure and allow multi-document comparison
In the future, we will extend the analysis to the whole play of Othello
We will also use more methods from computational linguistics to summarize more semantic features
18
Acknowledgement
Thanks for listening! Any questions?