Visualize your Big Data
DESCRIPTION
One of the easiest ways to make sense of data is to visualize it. Alongside complex computational and statistical data manipulation, an effective visual exploration tool is important for extracting information from large volumes of data. This is especially true of genomics, where there is a critical need for intuitive tools that let researchers explore multidimensional data sets effectively. Learn more in this session.
TRANSCRIPT
Visualize your Big Data THT10350
Premjith Balakrishnan Sib De Business Analytics Product Group September 30, 2014
Copyright © 2014, Oracle and/or its affiliates. All rights reserved. |
Visualizations - Circos
• Visualizes data in a circular layout
• Ideal for exploring relationships between objects or positions
• Aesthetically appealing
• Very high data-ink ratio
• Ideal for visualizing genomics data
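The circular layout described on this slide can be sketched in a few lines. This is a minimal illustration of the idea only, not how Circos itself is implemented: each segment gets an arc proportional to its length, and a relationship between two positions becomes a link between two points on the circle. The segment names, lengths, gap size, and function names below are all hypothetical.

```python
import math

def circular_layout(segment_lengths, gap_deg=2.0):
    """Give each named segment an angular span (in degrees) proportional to its length."""
    total = sum(segment_lengths.values())
    usable = 360.0 - gap_deg * len(segment_lengths)  # leave a small gap after each segment
    spans, start = {}, 0.0
    for name, length in segment_lengths.items():
        width = usable * length / total
        spans[name] = (start, start + width)
        start += width + gap_deg
    return spans

def to_xy(spans, segment_lengths, name, position, radius=1.0):
    """Map a position inside a segment to an (x, y) point on the circle."""
    start, end = spans[name]
    fraction = position / segment_lengths[name]
    angle = math.radians(start + (end - start) * fraction)
    return (radius * math.cos(angle), radius * math.sin(angle))

# Hypothetical segments (for example, chromosomes) and their lengths
segments = {"chrA": 300, "chrB": 200, "chrC": 100}
spans = circular_layout(segments)

# A relationship between two positions is drawn as a link between two points
link = (to_xy(spans, segments, "chrA", 150), to_xy(spans, segments, "chrC", 50))
```

The two endpoints in `link` could then be handed to any plotting library to draw the connecting curve.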
Visualizations – Many more
• Genome Browser
• Volcano Plot
• Kaplan-Meier Survival Curve
• RDF graph
• Alignment Visualization
• many more …
Visualization in the context of Big Data Analytics
Many aspects to consider
• Isn’t BI visualization? Is it sufficient?
• Isn’t ‘discovery’ visualization? When is it applicable?
• What is “advanced” visualization?
• How to leverage complex visualizations on *all* my data?
• What about the data volume and performance characteristics?
• Deployment options? Enterprise scale? Ease of use? Tool availability?
• Security? Provisioning users to data?
Answer: IT DEPENDS!
What tools you pack depends on what job needs to be done
Different jobs need different tools
Different jobs, different needs
A workshop needs a toolbox and not just a ‘tool’
Next Generation Genome Sequencing
• Cost of genome sequencing is rapidly decreasing; throughput is increasing
• Each run generates huge volumes of data
• Computationally intensive processing
• Specialized skills needed to analyze the data
• Strong open source community, fostering rapid innovation
Big Data Analytics Solution Perspective
• $1 Billion → $1000
• 1.5 TB per run
• 10 years → Days
• Millions of reads
• 824 packages
Data reservoir with a flexible data model
Expanding variety and depth of data sets
Prescription Data
FAERS, MAUDE
Argus
Siebel
Electronic Medical Records
Insurance Claims Data
Discover and predict, fast
Many different users with varying analytic needs
• Safety
• Manufacturing
• Supply Chain
• R&D
• Clinical
Discovery / BI insight
Predictive analytics / statistics insight
Wide variety of insight needs
Expanding universe of users
Flexible Data Model
Clinician:
• Personalized medicine
• Drug effectiveness on specific patient
Drug Discovery:
• Pathways analysis
• Effectiveness study
Clinical Trials:
• Recruiting new patients for study
• Cohort analysis
Characteristics of visualization needs
Clinician:
• Patient centric view
• Need a scalable BI environment
• Hide data access complexity from end user
Drug Discovery:
• Computationally intensive queries
• Expose deeper access to data
• Ad hoc visualizations, based on investigation needs
• Need to blend in latest from open source community
• RDF graph storage for knowledge base
Clinical Trials:
• Blend in predictive analytics to help with patient selection
• Scalable BI environment for larger base of users
• Statistical insight as well as BI to drive decisions
Different demands on visualization and data access
Diversity in audience
Flexible Data Model
What type of analytics / data access needs exist
Clinician:
• Personalized medicine
• Drug effectiveness on specific patient
Drug Discovery:
• Pathways analysis
• Effectiveness study
Clinical Trials:
• Recruiting new patients for study
• Cohort analysis
Demo – Patient cohort recruitment for trials
1. Review a cohort of patients by their demographics and other attributes
2. Identify the specific set of patients of interest for a trial, most likely with an automated script
3. Review the survival curves for the profile and analyze how they vary with age
4. Use a genome browser view to review huge datasets while identifying specific areas of interest
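Steps 2 and 3 of the demo can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script or tooling used in the session: the patient records, the age cutoffs, and the function names are hypothetical, and the survival curve uses the standard Kaplan-Meier product-limit estimate.

```python
from collections import Counter

# Hypothetical records: (patient_id, age, months_followed, event_observed)
patients = [
    ("p1", 45, 6, True), ("p2", 52, 12, False), ("p3", 67, 3, True),
    ("p4", 71, 8, True), ("p5", 38, 15, False), ("p6", 64, 8, False),
]

def select_cohort(records, min_age, max_age):
    """Step 2: pick the patients of interest with a simple automated filter."""
    return [r for r in records if min_age <= r[1] <= max_age]

def kaplan_meier(records):
    """Step 3: return [(time, survival_probability)] at each time with an observed event."""
    events = Counter(t for _, _, t, observed in records if observed)
    exits = Counter(t for _, _, t, _ in records)  # events and censored patients
    at_risk, survival, curve = len(records), 1.0, []
    for t in sorted(exits):
        deaths = events.get(t, 0)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # everyone leaving at time t drops out of the risk set
    return curve

# Compare survival for two age bands (the cutoff of 60 is an arbitrary assumption)
younger = kaplan_meier(select_cohort(patients, 0, 59))
older = kaplan_meier(select_cohort(patients, 60, 200))
```

Plotting `younger` and `older` as step functions gives the kind of age-stratified survival view described in step 3.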
Many facets of Big Data Analytics
Discover and predict, fast
Simplify access to all data
Secure and govern all data
Industrial-Strength R Across All Data
One Fast, Secure Query Across All Data
NoSQL
BIG DATA SQL
Oracle Big Data Management System
SOURCES
DATA RESERVOIR
DATA WAREHOUSE
Oracle Database
Oracle Industry Models
Oracle Advanced Analytics
Oracle Spatial & Graph
Big Data Appliance
Apache Flume
Oracle GoldenGate
Oracle Event Processing
Cloudera Hadoop
Oracle Big Data SQL
Oracle NoSQL
Oracle R Advanced Analytics for Hadoop
Oracle R Distribution
Oracle Database
In-Memory, Multi-tenant
Oracle Industry Models
Oracle Advanced Analytics
Oracle Spatial & Graph
Exadata
Oracle GoldenGate
Oracle Event Processing
Oracle Data Integrator
Oracle Big Data Connectors
Oracle Data Integrator
Oracle Big Data Platform Big Data Management System
BIG DATA APPLICATIONS
• BY INDUSTRY & LINE OF BUSINESS
BUSINESS ANALYTICS
• DISCOVERY
• BUSINESS ANALYTICS
BIG DATA MANAGEMENT
• DATA RESERVOIR
• DATA WAREHOUSE
SOURCES
Enterprise Big Data Architecture
CREATE VALUE FROM DATA
• BIG DATA INTEGRATION: Streaming + Batch
• BIG DATA MANAGEMENT: Data Reservoir + Data Warehouse
• BIG DATA ANALYTICS: Discovery + Business Analytics
• BIG DATA APPLICATIONS: Mobile + Web + On-device
Anything That Speaks SQL Now Talks To Big Data
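The idea of one SQL query spanning separate stores can be illustrated with a small stand-in. This sketch does not use Oracle Big Data SQL or its syntax; it uses Python's built-in sqlite3 module with two in-memory databases, one playing the role of the warehouse and one the reservoir. The table names, columns, and rows are hypothetical.

```python
import sqlite3

# Main database stands in for the data warehouse
conn = sqlite3.connect(":memory:")
# A second, separate in-memory database stands in for the data reservoir
conn.execute("ATTACH DATABASE ':memory:' AS reservoir")

conn.execute("CREATE TABLE patients (patient_id TEXT PRIMARY KEY, age INTEGER)")
conn.execute("CREATE TABLE reservoir.variants (patient_id TEXT, gene TEXT)")

conn.executemany("INSERT INTO patients VALUES (?, ?)", [("p1", 45), ("p2", 67)])
conn.executemany(
    "INSERT INTO reservoir.variants VALUES (?, ?)",
    [("p1", "GENE_A"), ("p2", "GENE_A"), ("p2", "GENE_B")],
)

# One query joins data held in both stores
rows = conn.execute("""
    SELECT p.patient_id, p.age, COUNT(v.gene) AS variant_count
    FROM patients AS p
    JOIN reservoir.variants AS v ON v.patient_id = p.patient_id
    GROUP BY p.patient_id, p.age
    ORDER BY p.patient_id
""").fetchall()
```

Any tool that can issue this query sees a single result set, regardless of which store each table lives in.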