A TEDMED Data Reveal: Big and Little Dr. Brand Niemann Director and Senior Data Scientist Semantic Community http://semanticommunity.info/ AOL Government Blogger http://gov.aol.com/bloggers/brand-niemann/ April 22, 2013 http://semanticommunity.info/A_TEDMED_Data_Reveal 1


Page 1

A TEDMED Data Reveal: Big and Little

Dr. Brand Niemann
Director and Senior Data Scientist

Semantic Community
http://semanticommunity.info/

AOL Government Blogger
http://gov.aol.com/bloggers/brand-niemann/

April 22, 2013
http://semanticommunity.info/A_TEDMED_Data_Reveal

Page 2

Background

• I did a story about TEDMED 2012 for the 2012 Health Datapalooza III and was invited to go to TEDMED 2013 as a journalist!

• Session 2: “How Can Big Data Become Real Wisdom?” and Session 6: “Going Farther while Staying Closer” were the most interesting and motivating to me. See next slide.

• I heard about Big and Little Data and saw an opportunity to help TEDMED with a taxonomy that is a semantic index to a knowledge base for improved search, and to help TEDMED with examples of big and little data science.

• And the best data source for my work was Professor Christopher Murray’s (IHME/GBD) presentation and demonstration on “What does a $100 million public health data revolution look like?” funded by the Bill and Melinda Gates Foundation to prioritize global health research and help.

• It made me think of Monica Rogati’s tweet at Strata 2012: “More data beats clever algorithms, but better data beats more data.”

• I Tweeted: @TEDMED @Storify Yes, and working on IHME/GBD (Global Burden of Disease) Visualizations like: http://semanticommunity.info/Census_Data_Visualization

• But I want to volunteer to help TEDMED 2013 and 2014 as a data scientist/data journalist and saw on their Web site: If you are a talented designer and/or illustrator with experience in bringing presentations to life, you could help with our speaker presentation materials.

• I attended the First Great Challenges Day, participated in the Inventing Wellness Programs Breakout Session, and learned the importance of scientists storifying with “and, but, and therefore”.

• Therefore my story is a TEDMED Data Reveal: Big (IHME/GBD) and Little (TEDMED Web Site) with “and, but, and therefore.”

Page 3

My TEDMED 2013 Highlights

• SESSION 2: How Can Big Data Become Real Wisdom?

  – Jay Walker: Introduction.

    • Need a macro-scope to gather, network, store, and access data and to go from data to wisdom by finding patterns in the data.

  – Larry Smarr: Can you coordinate the dance of your body's 100 trillion microorganisms?

    • How to quantify self movement with medical detail in real time by an astrophysicist turned computer scientist.

• SESSION 6: Going Farther while Staying Closer

  – Christopher Murray: What does a $100 million public health data revolution look like?

    • Talk and live demo of Global Burden of Disease Treemap, Map, Time Plot, Age Plot, and Stacked Bar Chart by Age and Sex.

See: http://blog.tedmed.com/

Page 4

TEDMED 2013

http://www.tedmed.com/

My Note: I decided to make this a Searchable Knowledge Base.

Page 5

TEDMED Knowledge Base

http://semanticommunity.info/A_TEDMED_Data_Reveal

Google Chrome: Find

Page 6

TEDMED Speakers

http://www.tedmed.com/speakers

My Note: I decided to make this a little data set for faceted search.

Page 7

TEDMED Speakers Spreadsheet

http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx

My Note: The facets are Year, Keywords, and Tags.
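The faceted search over Year, Keywords, and Tags could be sketched roughly as below. This is only an illustration: the rows, speaker names, facet values, and the function names `facet_filter` and `facet_counts` are hypothetical, and the actual column layout of TEDMED.xlsx is not shown in these slides.

```python
from collections import Counter

# Hypothetical rows standing in for the speakers spreadsheet.
# Names and facet values are placeholders, not data from TEDMED.xlsx.
SPEAKERS = [
    {"name": "Speaker A", "year": 2013, "keywords": {"big data"}, "tags": {"session 2"}},
    {"name": "Speaker B", "year": 2013, "keywords": {"global health", "big data"}, "tags": {"session 6"}},
    {"name": "Speaker C", "year": 2012, "keywords": {"wellness"}, "tags": {"session 1"}},
]


def facet_filter(rows, year=None, keyword=None, tag=None):
    """Return the rows that match every facet value that was supplied."""
    result = []
    for row in rows:
        if year is not None and row["year"] != year:
            continue
        if keyword is not None and keyword not in row["keywords"]:
            continue
        if tag is not None and tag not in row["tags"]:
            continue
        result.append(row)
    return result


def facet_counts(rows, facet):
    """Count how many rows carry each value of one facet."""
    counts = Counter()
    for row in rows:
        value = row[facet]
        # Multi-valued facets are stored as sets; single values are wrapped.
        counts.update(value if isinstance(value, set) else {value})
    return counts


hits = facet_filter(SPEAKERS, year=2013, keyword="big data")
print([row["name"] for row in hits])
print(dict(facet_counts(hits, "tags")))
```

The counts per facet value are what a faceted interface would typically show next to each filter, so a reader can see how many speakers remain before narrowing further.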

Page 8

Institutions hosting TEDMEDLive 2013

http://www.tedmed.com/event/tedmedlive?ref=participating

My Note: I decided to make this a little data set for mapping, but it was difficult to get the geo-referenced data set.

Page 9

TEDMEDLive 2013 Institutions Spreadsheet

http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx

My Note: Simple Geo-referencing of Institutions.
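One simple way to geo-reference a list of institutions is to join each row to a small lookup table of city coordinates. The sketch below assumes that approach; the slides do not say how the geo-referencing was actually done. The institution names, the `GAZETTEER` table, and the `georeference` function are hypothetical, and the coordinates are approximate city centers.

```python
# Approximate city-center coordinates (latitude, longitude), keyed by
# (city, state). A real gazetteer would be far larger.
GAZETTEER = {
    ("Seattle", "WA"): (47.61, -122.33),
    ("Washington", "DC"): (38.91, -77.04),
}

# Placeholder rows; these are not the institutions in the spreadsheet.
INSTITUTIONS = [
    {"institution": "Institution A", "city": "Seattle", "state": "WA"},
    {"institution": "Institution B", "city": "washington ", "state": "dc"},
    {"institution": "Institution C", "city": "Unknown City", "state": "ZZ"},
]


def georeference(rows, gazetteer):
    """Attach lat/lon to each row; return (located, unmatched) lists."""
    located, unmatched = [], []
    for row in rows:
        # Normalize spacing and case so minor typing differences still match.
        key = (row["city"].strip().title(), row["state"].strip().upper())
        coords = gazetteer.get(key)
        if coords is None:
            unmatched.append(row)
        else:
            lat, lon = coords
            located.append({**row, "lat": lat, "lon": lon})
    return located, unmatched


located, unmatched = georeference(INSTITUTIONS, GAZETTEER)
print(len(located), "located;", len(unmatched), "need manual lookup")
```

Keeping the unmatched rows in a separate list makes the hard part of the job visible: those are the institutions that still need a manual lookup.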

Page 10

TEDMED 2013: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

Page 11

TEDMED 2009-2012: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

Page 12

Institute for Health Metrics and Evaluation (IHME)

http://www.healthmetricsandevaluation.org/

My Note: I heard this talk and decided to work with this big data.

My Note: There are three Web sites.

Page 13

Press Release

• The Global Burden of Disease (GBD) is a first-of-its-kind study of health around the world.

• The GBD findings present a new way to look at health, allowing countries to track progress against diseases ranging from malaria to cancer to diabetes, identify risks including smoking and poor diet, see how people in 187 countries are faring in terms of health, and gauge emerging health challenges. The GBD is a collaboration of nearly 500 researchers in 50 countries, and is led by IHME, part of the University of Washington.

• Some of the countries included in the GBD, such as the UK and Indonesia, already have started to produce their own policy recommendations as a result of the study. Australia and China are also planning to produce studies that use GBD to drill down and develop local-level health data.

• IHME is working with three localities in the US to produce GBD-type data at the community level as well.

• Efforts are underway to provide continuous updates to the GBD and expand the range of health issues included in the study.

• The GBD measures health issues around the world through more than 1 billion pieces of data that can also be explored through interactive visualization tools online.

http://semanticommunity.info/@api/deki/files/23885/TEDMED-media-advisory-Chris-Murray.docx

Page 14

GHDx Catalog of Demographic and Health Data by IHME

http://ghdx.healthmetricsandevaluation.org/

My Note: Download Data

Page 15

Global Burden of Disease Study 2010 Data Downloads

http://ghdx.healthmetricsandevaluation.org/global-burden-disease-study-2010-gbd-2010-data-downloads

My Note: I downloaded 17 files totaling 1.13 GB. Two Codebook files were damaged and I repaired them.

Page 16

GBD Compare

http://viz.healthmetricsandevaluation.org/gbd-compare/

My Note: Treemap and Map.

Page 17

GBD Cause Patterns

http://www.healthmetricsandevaluation.org/gbd/visualizations/gbd-cause-patterns

My Note: Stacked Bar Chart.

Page 18

GBD Cause Patterns: Reports

http://www.healthmetricsandevaluation.org/gbd/visualizations/gbd-cause-patterns#/publications-presentations/reports

Page 19

IHME-GBD Causes of Death: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

Page 20

IHME-GBD Life Expectancy: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

Page 21

IHME-GBD Mortality: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

Page 22

IHME-GBD Risk Factors: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

Page 23

IHME-GBD Breast and Cervical Cancer: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

Navigation and Metadata

Data Set

World Map

Bar Chart

My Note: Data Visualizations are Linked.

Page 24

Data Ecosystem: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?TEDMED2013-Spotfire

Page 25

IHME-GBD Life Expectancy by Country: Spotfire

Navigation and Metadata

Code Book

Filters

Details-on-Demand

My Note: The Visualizations Are Linked to One Another.

Data Set

My Note: 19 files totaling 1.13 GB of data in a Spotfire file of only 0.5 GB!

Life Expectancy by Region

Life Expectancy (LE) Versus Health-Adjusted Life Expectancy (HALE)

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD-Spotfire
https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire
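Life expectancy counts all expected years of life, while health-adjusted life expectancy counts only the years expected in full health, so their difference is commonly read as the expected years lived in less than full health. A minimal sketch of that comparison follows. The country names, values, and the `health_gap` function are hypothetical and are not taken from the GBD files.

```python
# Hypothetical values for illustration only; not taken from the GBD data.
RECORDS = [
    {"country": "Country A", "le": 80.0, "hale": 69.5},
    {"country": "Country B", "le": 62.0, "hale": 54.0},
]


def health_gap(rows):
    """Add the LE minus HALE gap (in years) and its share of LE to each row."""
    out = []
    for row in rows:
        gap = row["le"] - row["hale"]
        out.append({
            **row,
            "gap_years": round(gap, 1),
            "gap_share": round(gap / row["le"], 3),
        })
    return out


for row in health_gap(RECORDS):
    print(row["country"], row["gap_years"], row["gap_share"])
```

Reporting the gap as a share of LE as well as in years makes countries with very different life expectancies easier to compare on one chart.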

Page 26

Conclusions and Recommendations

• My story is a TEDMED Data Reveal: Big (IHME/GBD) and Little (TEDMED) with “and, but, and therefore.”

  – I have done as Jay Walker suggested: We need a macro-scope to gather, network, store, and access data and to go from data to wisdom by finding patterns in the data.

    • But to do that, TEDMED needs a taxonomy that is a semantic index to a knowledge base for improved search, and help with examples of big and little data science.

  – I found the best big data source for my work was Professor Christopher Murray’s IHME/GBD funded by the Bill and Melinda Gates Foundation to prioritize global health research and help.

    • But I found I could improve the access to, and simplify the visualizations of, the IHME/GBD data.

  – Therefore, I did both of the above and volunteered to help TEDMED 2013 and 2014 as a data scientist/data journalist.

Page 27

Data Visualizations

http://www.healthmetricsandevaluation.org/tools/data-visualizations?page=3

Page 28

GBD Data Visualizations Spreadsheet

http://semanticommunity.info/@api/deki/files/23881/TEDMED.xlsx

My Note: See All 13 Tabs.

Page 29

GBD Data Visualizations Inventory

My Note: Downloaded 36 files totaling 19 MB and selected a few for visualizations.

Page 30

Diabetes Prevalence by County (US) Maps

http://www.healthmetricsandevaluation.org/tools/data-visualization/diabetes-prevalence-county-us-maps#/overview/explore

My Note: I used this in my 2013 Health Datapalooza IV Submission and the Sanofi US 2013 Data Design Diabetes Innovation Challenge – Prove It!

Page 31

Research Articles

http://www.healthmetricsandevaluation.org/tools/data-visualization/diabetes-prevalence-county-us-maps#/publications-presentations/publications

My Note: Research Article.

Page 32

Research Articles

http://www.pophealthmetrics.com/content/8/1/26

My Note: Included this in the Knowledge Base.

Page 33

Datasets

http://www.healthmetricsandevaluation.org/publications/summaries/novel-framework-validating-and-applying-standardized-small-area-measurement-s#/data-methods

My Note: Downloaded this dataset.

Page 34

Diabetes prevalence rates by age, sex, and county, 2008 (21KB* xls)

http://www.healthmetricsandevaluation.org/sites/default/files/datasets/diabetes_prevalence_by_county_rank_age_and_sex_2008_US_IHME_1010.xls
http://ghdx.healthmetricsandevaluation.org/sites/ghdx/files/record-attached-files/IHME_USA_DIABETES_BY_COUNTY_2008.xls

*My Note: Actual size is 556KB.

My Note: Needed to be separated into county and state.
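Separating a mixed sheet into county-level and state-level rows could look roughly like this. It is a sketch under a loud assumption: that state summary rows are the ones with an empty county field. The slides do not show how the IHME file marks them, and the rows, values, and the `split_levels` function are hypothetical.

```python
# Placeholder rows. ASSUMPTION: a state-level summary row has an empty
# "county" field. The real IHME spreadsheet may mark these rows differently.
ROWS = [
    {"state": "State X", "county": "", "prevalence": 9.0},
    {"state": "State X", "county": "County 1", "prevalence": 8.5},
    {"state": "State X", "county": "County 2", "prevalence": 10.2},
    {"state": "State Y", "county": " ", "prevalence": 7.1},
]


def split_levels(rows):
    """Split mixed rows into (state_rows, county_rows)."""
    state_rows, county_rows = [], []
    for row in rows:
        # Whitespace-only county cells are treated as empty.
        if row["county"].strip():
            county_rows.append(row)
        else:
            state_rows.append(row)
    return state_rows, county_rows


states, counties = split_levels(ROWS)
print(len(states), "state rows;", len(counties), "county rows")
```

Splitting first keeps state totals from being ranked or mapped alongside individual counties.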

Page 35

Metadata

http://ghdx.healthmetricsandevaluation.org/record/united-states-diabetes-prevalence-county-2008

My Note: Another Excel file name, but same file.

Page 36

IHME Diabetes County 2009: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?Diabetes-Spotfire

Navigation and Metadata

Data Set

Map

Top 10 Counties With High Prevalence of Diabetes

Higher Female Than Male Diabetes Prevalence
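The two derived views on this slide, a top-10 ranking and a female-versus-male comparison, amount to a sort and a filter. The sketch below shows one way to compute them. The county names, prevalence values, field names, and the functions `top_n` and `female_higher` are hypothetical and do not come from the IHME file.

```python
# Hypothetical prevalence values (percent); not taken from the IHME data.
COUNTIES = [
    {"county": "County 1", "both": 11.2, "female": 12.0, "male": 10.5},
    {"county": "County 2", "both": 9.0, "female": 8.8, "male": 9.3},
    {"county": "County 3", "both": 13.1, "female": 13.5, "male": 12.6},
]


def top_n(rows, n=10, field="both"):
    """Return the n rows with the highest value in the given field."""
    return sorted(rows, key=lambda row: row[field], reverse=True)[:n]


def female_higher(rows):
    """Return rows where female prevalence exceeds male prevalence."""
    return [row for row in rows if row["female"] > row["male"]]


print([row["county"] for row in top_n(COUNTIES, n=2)])
print([row["county"] for row in female_higher(COUNTIES)])
```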

Page 37

IHME-GBD Mortality by Country: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire

Page 38

IHME-GBD Disability Factors by Health State: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire

Page 39

IHME-GBD Risk Factors by Region: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD-Spotfire

Page 40

IHME-GBD Cause of Death by Region: Spotfire

https://silverspotfire.tibco.com/us/library#/users/bniemann/Public?IHME-GBD1-Spotfire