TRANSCRIPT
Data Visualization Tools
Taxiarchis Botsis, MSc, MPS, PhD
Assistant Professor of Oncology & Health Sciences Informatics
The Sidney Kimmel Comprehensive Cancer Center
EPC Journal Club - February 18, 2020
botsislab.com
Outline
Introduction
Tool Categories
The Right Tool(s)
Examples - Data - Goals - Audience - Design - Implementation
Key Points
Introduction
Data OR Information Visualizations OR Infographics represent Data/Information in a graphical and compelling form
A visual story combines multiple media (photographs, illustrations, video, animation, 3D models, and other graphics) with text and/or audio to create powerful stories for selected target audiences
They may be found in: - interactive or static e-format - physical forms
Tool Categories
1. Interactive nonprogrammatic: Tableau, Google Sheets, Infogram, ChartBlocks - small learning curve for beginners
2. Programmatic: R, D3.js, Chart.js, Python - large learning curve for beginners, some may take years to master
3. Image manipulation or illustration: Adobe Photoshop, Illustrator, Inkscape - average learning curve for beginners
Would you ever consider hardware tools for building visualizations?
The Right Tool(s)
According to Wilke*:
“the best visualization tool is the one that allows you to make the figures you need”
Tool selection is related to the overall goals and actual tasks:
• Reproduce visualizations
• Perform data and visualization exploration
• Tweak the visual output
*Claus O. Wilke, Fundamentals of Data Visualization, O’Reilly 2019
The Right Tool(s)
Reproducible visualizations: same data, similar visual output vs. Repeatable visualizations: same data, exact same output
• Reproducibility may be supported by interactive nonprogrammatic tools unless many steps are needed to develop the visualization
• Repeatability requires the use of programmatic tools -- a script will always generate the same output
Reproducibility
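The repeatability point can be illustrated with a minimal sketch. It assumes a tiny made-up dataset and a hypothetical helper, `render_bar_chart_svg`, that writes a bar chart as SVG text using only the Python standard library; none of these names come from the talk. Running the same script twice on the same data and comparing hashes of the output shows that the result is identical.

```python
import hashlib

# Hypothetical example data: (label, count) pairs, not taken from the talk.
DATA = [("A", 12), ("B", 30), ("C", 7)]


def render_bar_chart_svg(data, bar_width=40, gap=10, height=120):
    """Render a minimal vertical bar chart as an SVG string."""
    max_value = max(value for _, value in data)
    width = len(data) * (bar_width + gap) + gap
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for index, (label, value) in enumerate(data):
        # Scale each bar to the tallest one, leaving 20px for the labels.
        bar_height = round(value / max_value * (height - 20))
        x = gap + index * (bar_width + gap)
        y = height - 20 - bar_height
        parts.append(
            f'<rect x="{x}" y="{y}" width="{bar_width}" '
            f'height="{bar_height}" fill="steelblue"/>'
        )
        parts.append(f'<text x="{x}" y="{height - 5}">{label}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


def fingerprint(svg_text):
    """Hash the rendered output so two runs can be compared byte for byte."""
    return hashlib.sha256(svg_text.encode("utf-8")).hexdigest()


first_run = fingerprint(render_bar_chart_svg(DATA))
second_run = fingerprint(render_bar_chart_svg(DATA))
print(first_run == second_run)  # True: same data, same script, same output
```

A chart built by hand in an interactive tool has no such check, which is why a long sequence of manual steps can produce similar but not identical output.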
The Right Tool(s)
Exploration: quickly try multiple visualizations for a dataset vs. Production: generate an aesthetically appealing and efficient final visualization
• Exploration can only be supported by interactive nonprogrammatic tools
• Production includes many steps and may rely on: - interactive or programmatic tools OR - image manipulation or illustration tools OR - a combination of all the above, especially if complex information visualizations or visual stories are built
Exploration
The Right Tool(s)
Tweaking: rework the overall design vs. Prototyping: quickly generate a basic visual with all context
• Tweaking can be supported by any tool
• Prototyping can also be supported by any tool
Tweaking
Examples
Data: Adverse Event (AE) reports submitted to the Food and Drug Administration (FDA)
Audience: Safety Evaluators at the FDA
Goal: Improve the rigor of pharmacovigilance through efficient visualizations: - Summary presentation of major AEs per drug product - Focus on reports with serious outcomes, such as death and hospitalizations
Design: Interactive visualization with the following functionalities: - Easily select a drug product and view a summary of the AE information - Focus on either an AE or a serious outcome or both
Implementation: D3.js (many iterations with end users, completed in several months)
Bubble Plot
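As a rough sketch of the data preparation behind such a bubble plot (not the actual D3.js implementation), the snippet below counts reports per adverse event for one selected drug product, optionally keeps only serious outcomes, and derives a bubble radius. The records, the field names, the `summarize_bubbles` helper, and the choice to make bubble area proportional to the report count are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical AE reports; drug names, events, and outcomes are made up.
REPORTS = [
    {"drug": "DrugA", "event": "nausea", "outcome": "hospitalization"},
    {"drug": "DrugA", "event": "nausea", "outcome": "other"},
    {"drug": "DrugA", "event": "rash", "outcome": "death"},
    {"drug": "DrugB", "event": "headache", "outcome": "other"},
    {"drug": "DrugA", "event": "nausea", "outcome": "death"},
]
SERIOUS_OUTCOMES = {"death", "hospitalization"}


def summarize_bubbles(reports, drug, serious_only=False, radius_scale=10.0):
    """Return one bubble per adverse event for the selected drug product."""
    selected = [
        report
        for report in reports
        if report["drug"] == drug
        and (not serious_only or report["outcome"] in SERIOUS_OUTCOMES)
    ]
    counts = Counter(report["event"] for report in selected)
    # Assumption: bubble area encodes the count, so radius grows with sqrt(n).
    return [
        {"event": event, "reports": n, "radius": radius_scale * math.sqrt(n)}
        for event, n in counts.most_common()
    ]


for bubble in summarize_bubbles(REPORTS, "DrugA", serious_only=True):
    print(bubble["event"], bubble["reports"], round(bubble["radius"], 1))
```

The `drug` argument mirrors the "select a drug product" interaction, and `serious_only` mirrors the option to focus on serious outcomes.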
Examples Bubble Plot
botsislab.com/#/ae-bubbles
Interactive
Examples
Data: Cervical cancer data per US State and Nationwide over the period 2010-2015 according to the US Cancer Statistics by the Centers for Disease Control and Prevention
Audience: Public
Goal: Inform the public about cervical cancer incidence rates: - Presentation of rates for each race per US State and Nationwide - Summarize the major finding(s) and observations
Design: Static visualization with adequate context, no complexity and familiar elements
Implementation: Tableau, Adobe Illustrator (completed in days)
Tile Grid Map
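For readers unfamiliar with the measure, a crude incidence rate is the number of new cases divided by the population at risk, conventionally expressed per 100,000 people. The sketch below computes it per state, race, and year, and labels missing counts as "No Data". The rows, field names, and helper functions are hypothetical; the numbers are not values from US Cancer Statistics, and the actual figure was built in Tableau and Adobe Illustrator.

```python
# Hypothetical counts for illustration only; not US Cancer Statistics values.
ROWS = [
    {"state": "StateX", "race": "White", "year": 2010,
     "cases": 150, "population": 2_000_000},
    {"state": "StateX", "race": "Hispanic", "year": 2010,
     "cases": 60, "population": 500_000},
    {"state": "StateX", "race": "American Indian/Alaska Native", "year": 2010,
     "cases": None, "population": 40_000},
]


def crude_rate_per_100k(cases, population):
    """Crude incidence rate per 100,000 people; None when data is missing."""
    if cases is None or not population:
        return None
    return cases / population * 100_000


def rates_by_tile(rows):
    """Group rates as {(state, race): {year: rate}}, one series per tile and race."""
    tiles = {}
    for row in rows:
        key = (row["state"], row["race"])
        rate = crude_rate_per_100k(row["cases"], row["population"])
        tiles.setdefault(key, {})[row["year"]] = rate
    return tiles


for (state, race), series in rates_by_tile(ROWS).items():
    for year, rate in series.items():
        label = "No Data" if rate is None else f"{rate:.1f}"
        print(state, race, year, label)
```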
Examples Tile Grid Map
Static + Source + Title
[Figure: tile grid map with one tile per US state plus D.C. Each tile plots the incidence crude rate (axis ticks at 10, 20, 30) by year (2010-2015) for five groups: American Indian/Alaska Native, Asian/Pacific Islander, Black, Hispanic, and White. Tiles without data are marked "No Data".]
Title: Is it a Privilege to Be Asian? Cervical Cancer Rates Lowest Among All Races
Caption: Asians or Pacific Islanders had the lowest cervical cancer incidence crude rates among all races nationwide in the period 2010-2015. Hispanics appear to suffer the most in many states. No solid data for American Indians or Alaska Natives.
Data Source: US Cancer Statistics (https://www.cdc.gov/cancer/npcr/uscs/download_data.htm)
botsislab.com/#/viz/cervical_cancer_map
Examples
Data: Mainly post-market vaccine reports, rotavirus mortality rates, and vaccine morbidity
Audience: Public
Goal: Inform the public about vaccine safety: - Presentation of solid data about immunization risks, impact of vaccines, etc. - Be objective and build a data-driven visualization
Design: A compound figure with static multiples, no complexity, familiar elements
Implementation: Tableau, MS Excel, Adobe Stock, Adobe Illustrator, Adobe Photoshop (completed in ~2 weeks)
Visual Story
Examples Visual Story
Static + Sources + Title
Examples
Data: Lung cancer survival data*
Audience: Lung cancer patients (and physicians)
Goal: Inform patients about immunotherapy benefits: - Clear presentation of complex concepts and recent findings - Improve communication between patients and physicians
Design: A 3D construction with an opportunity to physically interact with it
Implementation: Hardware tools and various materials (about 4 months)
*Anagnostou et al., Nature Cancer 2020
3D Survival Box
Examples 3D Survival Box
3D Physical
botsislab.com/#/viz/lung_cancer_3d
Key Points
• Prior to selecting any tool: - Evaluate data, target audience, and overall goals - Design the information visualization
• Select the appropriate tool(s) depending on the type of visualization and final product
• Use the tool(s) that you feel most comfortable with
• Think outside the... box
QUESTIONS?
THANK YOU!