Introduction
Language Variation Suite
Visual Analytics for Digital Humanities
Interactive Text Mining Suite
Conclusion
References
Data Visualization: Language Variation Suite and Interactive Text Mining Suite
Olga Scrivner
Indiana University
LSU, April 2016
Data Analysis and Visualization
“As our collective knowledge continues to be digitized and stored (...) it becomes more difficult to find and discover what we are looking for.” (Blei 2012)
“Mastery of quantitative methods is increasingly becoming a vital component of linguistic training” (Johnson, 2008)
Data Types
1 Structured Data
2 Unstructured Data
Quantitative Analysis for Structured Data
Traditional Tools
a. Categorical variable
b. Independence of observations
c. Normally distributed data
d. Large corpus size
Linguistic Data
a. Categorical, continuous, multivariate, ordinal
b. Correlated data
c. Unbalanced data
d. Small corpus size
Data Visualization
Word Order in Latin (Passarotti et al., 2013)
Visual Analytics - “The science of analytical reasoning facilitated by visual interactive interfaces” (Thomas et al., 2005)
New Tools of Linguistic Analysis (Baayen 2008, Tagliamonte 2014, Gries 2015)
1 Mixed Model:
A statistical regression model containing fixed effects (independent variables) and random effects (e.g., individual- or word-specific effects).
Measures variability between subjects and correlation of observations within subjects
Can handle unbalanced data
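For a binary dependent variable, one common form of such a model is a logistic regression with a random intercept per speaker. The notation below is only a sketch under that assumption; the symbols are illustrative and do not come from the slides. Here y_ij is the i-th token produced by speaker j, the beta terms are fixed effects, and u_j is the speaker-specific random effect:

```latex
\operatorname{logit} P(y_{ij} = 1) = \beta_0 + \beta_1 x_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

A large estimated variance for u_j indicates that speakers differ strongly from one another, which is the between-subject variability mentioned above.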
New Tools of Linguistic Analysis (Baayen 2008, Tagliamonte 2014, Gries 2015)
2 Conditional inference trees and Random Forests
Uses predictive modeling
“Proves to be more stable than stepwise variable selection approaches available for logistic regression” (Strobl 2009:325)
Can handle skewed data that often violate the assumptions of regression approaches
RStudio and Shiny Application
1 R - a free programming language for statistical computing and graphics
2 RStudio - Integrated Development Environment: a source code editor, an executor and a debugger
3 Shiny App - a web application framework for R
Computational power of R + Web interactivity
Language Variation Suite (LVS) - a Statistical Shiny Application
1 From https://languagevariationsuite.wordpress.com/ download Labov’s New York 1966 data (LabovData.csv) and the Caracas data from Bentivoglio & Sedano 1993 (CaracasData.csv)
2 Open the LVS application: https://languagevariationsuite.shinyapps.io/Pages
Language Variation Suite - Introduction
1 Data in csv format (no spaces in column names)
Files
2 Upload your file
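Before uploading, the column-name requirement from step 1 can be checked locally. The snippet below is a minimal sketch in Python (not part of LVS, which is written in R); the function name `columns_with_spaces` and the toy CSV text are invented for illustration.

```python
import csv
import io

def columns_with_spaces(csv_text):
    """Return the header names that contain whitespace."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    return [name for name in header if any(ch.isspace() for ch in name)]

# Toy CSV text invented for illustration (this is not LabovData.csv).
sample = "store,lexical item,r\nSaks,floor,retention\n"
bad_columns = columns_with_spaces(sample)
print(bad_columns)
```

Any name the function reports would need to be renamed (for example, `lexical item` to `lexical_item`) before the file is uploaded.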
Descriptive Data Analysis - Table
The table displays your dataset and lets you filter columns by a search word or sort them in ascending or descending order.
Summary
The summary provides a quantitative overview of each variable, e.g. frequency count, mean, median.
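The kind of per-variable summary described here can be sketched as follows. This is an illustrative Python approximation, not the code LVS runs; the helper `summarize` and the toy columns are assumptions made for the example.

```python
import statistics
from collections import Counter

def summarize(values):
    """Counts for text columns; mean and median for numeric columns."""
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        # At least one value is not numeric: treat the column as categorical.
        return {"counts": dict(Counter(values))}
    return {"mean": statistics.mean(numbers), "median": statistics.median(numbers)}

# Toy columns invented for illustration.
store_summary = summarize(["Saks", "Macys", "Saks", "Klein"])
score_summary = summarize(["1", "2", "6"])
print(store_summary)
print(score_summary)
```

Categorical columns get frequency counts, while numeric columns get a mean and a median, matching the two kinds of output the slide mentions.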
Data Structure
1 factor - categorical values, ex. m/f (gender), 20-34/65+ (age), low/high (economic level)
2 num - numerical values, ex. 0.95, 1.53
3 int - integer values, ex. 1, 2, 10
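The three labels can be illustrated with a rough type-inference routine. This is a Python sketch under simplifying assumptions (every value arrives as a string, and a column is numeric only if all of its values parse); R's own rules differ in detail, and the name `infer_type` is hypothetical.

```python
def infer_type(values):
    """Classify a column as 'int', 'num', or 'factor', mirroring the labels above."""
    def all_parse(cast):
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "int"
    if all_parse(float):
        return "num"
    return "factor"

# Toy columns built from the examples on the slide.
column_types = {
    "gender": infer_type(["m", "f", "f"]),
    "duration": infer_type(["0.95", "1.53"]),
    "count": infer_type(["1", "2", "10"]),
}
print(column_types)
```

Note that an age band such as `20-34` is not a number, so a column of age bands is treated as a factor.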
Data Subset
Cross-Tabulation
Cross-tabulation is a useful feature to examine the distribution of your dependent variable.
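A cross-tabulation is simply a count of how often each combination of two variables occurs. The sketch below shows the idea in Python; the rows are toy observations invented for illustration (they are not Labov's counts), and `crosstab` is a hypothetical helper rather than an LVS function.

```python
from collections import Counter

def crosstab(rows, row_key, col_key):
    """Count how often each (row_key value, col_key value) pair occurs."""
    return Counter((row[row_key], row[col_key]) for row in rows)

# Toy observations invented for illustration.
rows = [
    {"store": "Saks", "r": "retention"},
    {"store": "Saks", "r": "retention"},
    {"store": "Saks", "r": "deletion"},
    {"store": "Klein", "r": "deletion"},
]
table = crosstab(rows, "store", "r")
for (store, variant), n in sorted(table.items()):
    print(store, variant, n)
```

Combinations that never occur are simply absent from the counter and read as zero, which makes empty cells easy to spot.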
Cross-Tabulation
Saks (upper middle-class store), Macy’s (middle-class store), Klein (working-class)
Cluster
Cluster Analysis allows you to classify your data into sub-groups (clusters), which are defined by your data. Items in the same cluster will be very similar to one another.
Saks (upper middle-class store), Macy’s (middle-class store), Klein (working-class)
LVS - Inferential Analysis
Fixed Regression Model - ignoring individual variations (speakers or words) may lead to Type I Error: “a chance effect is mistaken for a real difference between the populations”
Mixed Regression Model - prone to Type II Error: “if speaker variation is at a high level, we cannot discern small population effects without a large number of speakers” (Johnson 2009, 2015)
Regression Model Selection
Model Output
Interpretation
Dependent Variable: deletion and retention
By default, deletion is the reference level (the first level in alphabetical order)
Results are therefore interpreted as effects on retention
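The alphabetical default can be made concrete with a small coding example. This Python sketch assumes a two-level variable and treatment coding; `dummy_code` is a hypothetical helper, not an LVS or R function.

```python
def dummy_code(values):
    """Treatment-code a two-level variable: the alphabetically first level is 0."""
    levels = sorted(set(values))
    reference = levels[0]
    codes = [0 if v == reference else 1 for v in values]
    return reference, codes

reference, codes = dummy_code(["retention", "deletion", "retention"])
print(reference, codes)
```

Because `deletion` sorts before `retention`, it is coded 0 and the model's coefficients describe changes in the odds of the level coded 1, here `retention`.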
Interpretation
Lexical item Fourth has a negative effect on retention and is significant
Normal style has a slightly negative effect on retention but its coefficient is not significant
Macy’s and Saks have a positive and significant effect on retention. Saks (upper middle class store) is more significant than Macy’s (middle class store)
Conditional Tree
Conditional Tree
Store is the most significant factor for R-use: Klein (working-class store) shows more R-deletion, while Macy’s and Saks have a higher rate of R-retention, which also depends on the lexical item (Floor shows more retention than Fourth).
Random Forest
Random Forest
The variable importance score demonstrates that Store is the most important predictor, followed by Lexical Item. A variable is irrelevant if its importance is around zero and the cut-off value (red dotted line).
Data with Token Frequency
Upload CaracasData.csv from https://languagevariationsuite.wordpress.com/
Tokens
Let’s Have a Short Break
Visual Analytics for Digital Humanities
The “epic transformation of archives” - shifting from print to digital archival form (Folsom, 2007)
Digital Humanities Manifesto 2.0 (2009) and Berry (2011)
1st Wave: “The first wave of digital humanities work was quantitative, mobilizing the search and retrieval powers of the database, automating corpus linguistics, stacking hypercards into critical arrays”
2nd Wave: “The second wave is qualitative, interpretive”, concentrating on new tools for creating and curating digital repositories (Berry, 2011)
3rd Wave: Concentration on computationality, search, retrieval and analysis originating in humanities-based work
New Ways of Exploring Data Collections
Graphs, maps and trees for literature analysis (Moretti, 2005)
Visual Analytics
Word clouds to analyze a novel (Vuillemot et al., 2009)
Visual Analytics
Social network graphs of characters in Greek tragedies (Rydberg-Cox, 2011)
Visual Analytics
Literary fingerprint and summaries (Oelke et al., 2012)
Visual Analytics
Tracking emotion and sentiment in fairy tales (Mohammad, 2012)
Topic Modeling
Discovering the underlying themes of a collection from Science magazine, 1990-2000 (Blei, 2012)
For more information on topic modeling: http://www.matthewjockers.net/2011/09/29/the-lda-buffet-is-now-open-or-latent-dirichlet-allocation-for-english-majors/
Interactive Text Mining Suite - Introduction
1 Download 3 text files (dante01.txt, dante02.txt, dante03.txt) from https://languagevariationsuite.wordpress.com/ (workshop)
2 ITMS Application: https://languagevariationsuite.shinyapps.io/TextMining/
Upload Files - txt
Explore
Metadata
ID Date Title Author Other
Extract from pdf files
Upload from csv file
Stopwords
Frequency
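Word frequency after stopword removal can be sketched in a few lines. The Python example below is an illustration only, not the ITMS implementation; the function `word_frequencies`, the three-word stopword set, and the single input line are assumptions (the contents of the workshop files are not reproduced here).

```python
import re
from collections import Counter

def word_frequencies(text, stopwords):
    """Lowercase the text, split it into words, and count the non-stopwords."""
    # The pattern matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(w for w in words if w not in stopwords)

# Opening line of Dante's Inferno, used as toy input.
text = "Nel mezzo del cammin di nostra vita"
stopwords = {"nel", "del", "di"}  # tiny hand-picked set for illustration
freq = word_frequencies(text, stopwords)
print(freq.most_common(3))
```

Extending the stopword set and re-running the count is the same loop the Stopwords and Frequency tabs support interactively.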
Frequency Visualization
More Stopwords
Topic Modeling
Selection of topics (how many different themes)
Selection of words per theme (how many words per topic)
Identification of the best topic number
Models
LDA (Latent Dirichlet Allocation)
STM (Structural Topic Model)
Chronological topic visualization (LDA): requires metadata
Cluster Analysis
Punctuation Analysis
Future Directions
1 New LVS features:
(a) Traditional Rbrul analysis (for comparison)
(b) Variable re-coding and dataset modification
2 New ITMS features:
(a) Network graphs
(b) Dynamic graphs
Acknowledgements
I would like to thank Professor Rafael Orozco, Professor Irina Shport and LSU Linguistics for inviting me and organizing this workshop.
References I
[1] Baayen, Harald. 2008. Analyzing linguistic data: A practical introduction to statistics. Cambridge: Cambridge University Press
[2] Bentivoglio, Paola and Mercedes Sedano. 1993. Investigación sociolingüística: sus métodos aplicados a una experiencia venezolana. Boletín de Lingüística 8. 3-35
[3] Gries, Stefan Th. 2015. Quantitative designs and statistical techniques. In Douglas Biber & Randi Reppen (eds.), The Cambridge Handbook of English Corpus Linguistics. Cambridge: Cambridge University Press
[4] Jockers, Matthew. 2014. Text Analysis with R for Students of Literature. Quantitative Methods in the Humanities and Social Sciences. Springer International Publishing, Cham
[5] Labov, W. 1966. The Social Stratification of English in New York City. Washington: Center for Applied Linguistics
[6] Moretti, Franco. 2005. Graphs, Maps, Trees: Abstract Models for a Literary History. Verso
[7] Oelke, Daniella, Dimitrios Kokkinakis, and Mats Malm. 2012. Advanced visual analytics methods for literature analysis. Proceedings of the 6th EACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, pages 35-44
[8] Passarotti, Marco, Barbara McGillivray, and David Bamman. 2013. A Treebank-based Study on Latin Word Order. Proceedings of the 16th International Colloquium on Latin Linguistics, Uppsala, Sweden, 340–352
[9] Schnapp, Jeffrey, and Peter Presner. 2009. Digital Humanities Manifesto 2.0.
[10] http://blog.kandu.com/post/57065268403/book-reading-gif
[11] http://cdn.business2community.com/wp-content/uploads/2014/09/archives01.jpg