ggwordcloud: a word cloud geometry for ggplot2

E. Le Pennec

useR! - Toulouse - 11/07/2019

Word Clouds?

Text visualization tool

Words scaled according to a quantity (often a count or a proportion)

Nice (meaningless) word position/order.

For documents, vocabulary often pruned (stop words/rarely used words)

Word Clouds in R

wordcloud: base graphics, fast but crude placement algorithm in R

wordcloud2: html graphics, placement algorithm in js, lots of bells and whistles (arbitrary rotation, mask...)

ggwordcloud

A new ggplot2 geometry

ggplot2 ecosystem: a new geometry similar to geom_text and geom_text_repel

fast placement algorithm in C++

more functionalities than wordcloud and wordcloud2...

geom_text_wordcloud

library(ggwordcloud)
data("love_words_small")
set.seed(42)
ggplot(love_words_small,
       aes(label = word, size = speakers)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 24) +
  theme_minimal()

A dedicated text geometry with the geom_text syntax:

label for the word

size for the count

automatic placement without overlapping around a default (0, 0) position

Text Scaling

geom_text_wordcloud: font size proportional to (the square root of) the count/proportion. Ink area depends on the number of letters...

geom_text_wordcloud_area: ink area proportional to the count/proportion. Perception not biased by the number of letters.
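
A minimal sketch comparing the two geometries on the love_words_small data used earlier. The object names p_size and p_area are chosen here for illustration, and max_size = 24 is simply reused from the earlier example.

```r
library(ggplot2)
library(ggwordcloud)

data("love_words_small")
set.seed(42)

# Font size driven by the count: long words end up with more ink than short ones.
p_size <- ggplot(love_words_small, aes(label = word, size = speakers)) +
  geom_text_wordcloud() +
  scale_size_area(max_size = 24) +
  theme_minimal()

# Ink area driven by the count: the font size is adjusted for the word length.
p_area <- ggplot(love_words_small, aes(label = word, size = speakers)) +
  geom_text_wordcloud_area() +
  scale_size_area(max_size = 24) +
  theme_minimal()
```

Printing p_size and p_area side by side should make the difference visible on short words with high counts.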

Rotation and Facet

Rotation: arbitrary rotation with the angle aesthetic.

Facet: compatible with the ggplot2 faceting system.
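
A short sketch combining both features, again on love_words_small. The angle and group columns are not part of the talk: they are hypothetical columns built here only to have something to map to the angle aesthetic and to facet on.

```r
library(ggplot2)
library(ggwordcloud)

data("love_words_small")
set.seed(42)

words <- love_words_small
# Hypothetical angle column: roughly 40% of the words are turned by 90 degrees.
words$angle <- 90 * sample(c(0, 1), nrow(words), replace = TRUE, prob = c(0.6, 0.4))
# Hypothetical grouping column, used only to demonstrate faceting.
words$group <- ifelse(words$speakers > median(words$speakers),
                      "more speakers", "fewer speakers")

p <- ggplot(words, aes(label = word, size = speakers, angle = angle)) +
  geom_text_wordcloud_area() +
  scale_size_area(max_size = 24) +
  facet_wrap(~group) +
  theme_minimal()
```

Each facet gets its own cloud, placed independently of the others.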

Shapes and Mask

Shapes: the cloud may have different base shapes.

Mask: words stay within a prescribed mask.

Other functionalities: color, position, fonts...
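
As an illustration of the shape and color options, here is a hedged sketch using the shape argument of geom_text_wordcloud_area with the "diamond" value. The color gradient is an arbitrary choice made for this example, and masks are left out because they need an external image.

```r
library(ggplot2)
library(ggwordcloud)

data("love_words_small")
set.seed(42)

# Diamond-shaped cloud, with the color also mapped to the count.
p <- ggplot(love_words_small,
            aes(label = word, size = speakers, color = speakers)) +
  geom_text_wordcloud_area(shape = "diamond") +
  scale_size_area(max_size = 24) +
  scale_color_gradient(low = "darkred", high = "red") +
  theme_minimal()
```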

Algorithm

Compute text area using textGrob and deduce font size

Draw the text, again using textGrob, and compute a mask made of small bounding boxes

Use a fast spiraling placement algorithm (in C++) to place the words without any box overlap

Bottleneck: text size computation!
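
The placement step can be sketched in plain R. This is a simplified illustration, not the package's C++ code: each word is reduced to a single bounding box instead of a mask of small boxes, the text measurement with textGrob is skipped (widths and heights are assumed to be given), and the names boxes_overlap and place_on_spiral are hypothetical.

```r
# Two axis-aligned boxes (centre x, y; width w; height h) overlap when their
# centres are closer than the sum of the half-sizes on both axes.
boxes_overlap <- function(a, b) {
  abs(a$x - b$x) < (a$w + b$w) / 2 && abs(a$y - b$y) < (a$h + b$h) / 2
}

# Place the boxes one by one, largest first, walking outwards from (0, 0)
# along an Archimedean spiral until a position without overlap is found.
place_on_spiral <- function(widths, heights, step = 0.1, max_iter = 10000L) {
  n <- length(widths)
  x <- rep(NA_real_, n)
  y <- rep(NA_real_, n)
  done <- integer(0)  # indices of the boxes already placed
  for (i in order(widths * heights, decreasing = TRUE)) {
    for (k in 0:max_iter) {
      theta <- step * k
      cand <- list(x = step * theta * cos(theta),
                   y = step * theta * sin(theta),
                   w = widths[i], h = heights[i])
      hit <- FALSE
      for (j in done) {
        other <- list(x = x[j], y = y[j], w = widths[j], h = heights[j])
        if (boxes_overlap(cand, other)) {
          hit <- TRUE
          break
        }
      }
      if (!hit) {
        x[i] <- cand$x
        y[i] <- cand$y
        done <- c(done, i)
        break
      }
    }
    # A box that finds no free position within max_iter steps keeps NA coordinates.
  }
  data.frame(x = x, y = y, w = widths, h = heights)
}

set.seed(42)
layout <- place_on_spiral(widths = runif(20, 0.5, 3), heights = runif(20, 0.3, 1))
```

Every candidate position is tested against all boxes placed so far, so the cost grows quickly with the number of words. This sketch makes no attempt to be fast.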

Word Zones

Interest of a Grammar

Can produce graphs which were not planned!

Thank you to the R ecosystem

Code and algorithm

ggplot2: environment

ggrepel: basis of the code

wordcloud/wordcloud2: source of inspiration

Environment

rstudio...

usethis: package skeleton

testthat: unit testing

devtools/rhub: package dev/testing before CRAN

pkgdown: documentation

ggwordcloud

A word cloud geometry for ggplot2

A lovely, easy to use and full of functionalities package.

Available on CRAN.

Source and bug tracker at https://github.com/lepennec/ggwordcloud

Website: https://lepennec.github.io/ggwordcloud/