Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community
Post on 05-Jan-2016
1
Challenges and Solutions for Big Data in the Public Sector:
DGI’s 3rd Annual Government Big Data Conference
October 9, Ronald Reagan Building, Washington, DC
Dr. Brand Niemann
Director and Senior Data Scientist
Semantic Community
http://semanticommunity.info/
http://www.meetup.com/Federal-Big-Data-Working-Group/
http://semanticommunity.info/Data_Science/Federal_Big_Data_Working_Group_Meetup
October 9, 2014
2
Overview
• Related Presentations:
– COM.BigData Conference (Keynote and Panel), August 4-6, Washington, DC, and
– IEEE 2014 Big Data Conference (Paper and NIST Big Data Workshop), October 27-30, Washington, DC.
• Moderator:
– Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community, and Co-organizer, Federal Big Data Working Group Meetup
• Panelists:
– Dr. Tom Rindflesch, Information Research Specialist at the Cognitive Science Branch, National Institutes of Health (NIH): Semantic Medline (Ontology, Cray Graph Appliance, and Relational Databases)
– Dr. Kirk Borne, Professor of Astrophysics and Computational Science, George Mason University: NSF Big Data Project of the Decade: LSST
http://www.digitalgovernment.com/Events/Conferences/Government-Big-Data-Conference--Expo.shtml
3
Mission Statement
• Federal: Supports the Federal Big Data Initiative, but not endorsed by the Federal Government or its Agencies;
• Big Data: Supports the Federal Digital Government Strategy, which is "treating all content as data", so big data = all your content;
• Working Group: Data Science Teams composed of Federal Government and Non-Federal Government experts producing big data products (How was the data collected, Where is it stored, What are the results, and Does the data story persuade?); and
• Meetup: The world's largest network of local groups to revitalize local community and help people around the world self-organize, like the MOOCs (Massive Open Online Courses) being considered by the White House to reduce the cost of higher education.
Co-organizers: Brand Niemann and Katherine Goodier
4
Decisions = Science + Art: The Challenger Accident
• Richard Feynman's famous conclusion to his report on the shuttle Challenger accident, which arose again in the Columbia accident, is "For a successful technology, reality must take precedence over public relations, for Nature cannot be fooled."
-- Edward Tufte
The Challenger: An Information Disaster
Note: These charts appear in Edward Tufte’s book, Visual Explanations: Images and Quantities, Evidence and Narrative.
5
Fourth Paradigm and Fourth Question
• The Fourth Paradigm of Science (1):
– First Paradigm. Observation, descriptions of natural phenomena, and experimentation.
– Second Paradigm. Theoretical science such as Newton’s laws of motion and Maxwell’s equations.
– Third Paradigm. Simulation and modelling, such as in astronomy.
– Fourth Paradigm. Data-intensive science that exploits the large volumes of data in new ways for scientific exploration, such as the International Virtual Observatory Alliance in astronomy.
• The Fourth Question of Big Data for Science (2):
– How was the data collected?
– Where is the data stored?
– What are the data results?
– Does the data story persuade?
(1) Bell, G., Hey, T., & Szalay, A. (2009) Beyond the data deluge, Science 323, 6 March 2009, pp. 1297-1298.
(2) de Waard, Anita (2014) About Stories, that Persuade With Data, Federal Big Data Working Group Meetup, 20 May, 41 slides.
President Obama Discovers Big Data in 2009
6
Symposium on Predictive Analytics For Defense and Government, November 18-19, Washington, DC
Big Data Analytics Data Science
All Content - Structured and Unstructured
Results and Decisions Mining and Discovery
Data Ecosystem: Data Set 1...Data Set N
Performance, Content, Network, Data
Descriptive, Prescriptive
Microscope and Telescope: Szalay (JHU)
Data FAIRport: Strawn (NITRD)
Data Commons: Bourne (NIH)
Data Publications in a Data Browser: Semantic Community
Data Science Central
7
NIH Data Commons
Dr. Phil Bourne (7/30/2014): Rules, Credit/Not Money, & More Offline
http://semanticommunity.info/Data_Science/Data_Science_for_RDA#Slide_50_The_Power_of_the_Commons
My Note: Registries, Repositories, Clearinghouses, Portals, GitHubs, Data Commons, & Data FAIRports to MindTouch and Spotfire
8
Best Practices for Data: A Biologist's View
BestPracticesForData_PhilipBourne.pdf
9
Examples of Data Publications in Data Browsers for Senior Government People
Person | Interest | Data Publication in Data Browser | Example
Dr. John Holdren | Climate Change | Data Publication in Data Browser | Climate Change Assessment
Dr. George Strawn | Research Objects as Digital Objects | Data Publication in Data Browser | VIVO
Dr. Farnam Jahanian | NSF Big Data Publications | Data Publication in Data Browser | NSF Big Data
Dr. Phil Bourne | Data Culture at NIH | Data Publication in Data Browser | Bourne Research & NIH
Dan Kaufman and Paul Cohen | Big Mechanism for Cancer | Data Publication in Data Browser | DARPA Contract
Bryan Sivak | Hack-a-Thon | Data Publication in Data Browser | HHS IDEALAB
Todd Park | Code-a-Palooza | Data Publication in Data Browser | Health Datapalooza V
Brian Lee | Health United States 2013 | Data Publication in Data Browser | Centers for Disease Control & Prevention Report
The Honorable Kathleen Sebelius | Dynamic Case Management | Data Publication in Data Browser | HealthCare.gov Web Site
Data Source: Semantic Community NSF BIG DATA PROPOSAL
10
Data Science Central: Meteors Descriptive and Predictive
Web Player
Data Science Data Publications in a Data Browser
11
Data Science for JHU/NSF DIBBs Project: Knowledge Bases
Data Science for JHU DIBBs Project SDSS.xlsx
Data Science Data Publication: Table of Contents is an Ontology!
Data Science Publication Index: Index is Linked Open Data!
12
Data Science for JHU/NSF DIBBs Project: Analytics & Visualizations
Spotfire Content, Network, and Data Analytics: Spotfire is a Microscope and a Telescope for 77 TB!
Web Player
Data Science Data Publications in a Data Browser
13
Cover Page-Performance Analytics: FDA TRACK
Web Player
My Note: Most programs do not have a Strategic Plan!
Data Science Data Publications in a Data Browser
14
October 6th Meetup Agenda
• 6:30 p.m. Welcome and Introduction – Report on Recent Meeting with Dr. Taha Kass-Hout, FDA’s First Chief Health Informatics Officer (CHIO) and FDA Data Science Data Publication Tutorial:
– Interest in our Meetup on OpenFDA, July 7th
– Keynote at AFCEA Bethesda’s Health IT Day, December 2nd
• 7:00 p.m. Brooke Aker, Big Data Lens, Predictive Analytics for OpenFDA and Other Examples
• 7:45 p.m. Brief Member Introductions and Inter-American Development Bank Open Data Portal Examples
• 8:30 p.m. Open Discussion
• 8:45 p.m. Networking
• 9:00 p.m. Depart