BIG DATA STATE OF THE ART
Mr. Paul Carpenter
Senior Researcher
Barcelona Supercomputing Center
www.bsc.es | www.rethinkbig-project.eu
@BSC_CNS
Barcelona Supercomputing Center
Paul Carpenter, Gina Alioto
Barcelona Big Data Congress, 5 Oct 2016
© 2014 Network Computing, John Klossner
big data \big dey-tuh\ n, pl big data
“all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using on-hand data management tools or traditional data processing applications.”
“Datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze”
“A new attitude by businesses, non-profits, government agencies, and individuals that combining data from multiple sources could lead to better decisions.”
Forbes, 3 Sep 2014, 12 Big Data Definitions – What’s Yours?, Gil Press
big data \big dey-tuh\ n, pl big data
“…answers to this question will tend to be almost entirely based on opinions.”
“If you can imagine an entire book that answers your question, you’re asking too much.”
The proof is in the processing…
In 2014, the European Commission asked for a roadmap for European leadership in hardware and networking optimizations for Big Data
Why hardware?
Big data analytics is about monetizing data
Hardware is not an asset but a liability
Hardware doesn’t matter
It matters to Microsoft… and Facebook and Twitter and…
What European companies are saying
Looking for right business model
Not yet interested in hardware
Key Findings (1): How to extract value?
Hyperscalers take risks and define the ecosystem
Europe is highly fragmented and cannot keep up
Key Findings (2): European companies follow hyperscalers
Key Findings (3): Europe waits for commodity
© 1998 Randy Glasbergen
Intel is the gatekeeper for data centre architectures
Key Findings (4): Dominance of non-EU hardware
In 2016, we delivered a roadmap for European leadership in hardware and networking optimizations for Big Data
Roadmap Recommendations
(1) Reduce risk and cost of using hardware accelerators, such as FPGAs
(2) Encourage European hardware and software experts to work together on whole systems tuned for specific applications (co-design)
(3) Prepare for next generation of hardware, and take advantage of the convergence of High-Performance Computing (HPC) and Big Data
Life Sciences & Medicine
Earth Sciences
Astro, High Energy & Plasma Physics
Materials, Chemistry & Nanoscience
Engineering
Barcelona Supercomputing Center (BSC)
70% PRACE 24% RES 6% BSC-CNS
3 PB of disk storage
100.8 TB of main memory
Nearly 50,000 cores
MareNostrum 3
Over 10^15 floating-point operations per second
BSC Departments
Computer Sciences: To influence the way machines are built, programmed and used: programming models, performance tools, Big Data, computer architecture, energy efficiency
Earth Sciences: To develop and implement global and regional state-of-the-art models for short-term air quality forecast and long-term climate applications
Life Sciences: To understand living organisms by means of theoretical and computational methods (molecular modeling, genomics, proteomics)
CASE: To develop scientific and engineering software to efficiently exploit supercomputing capabilities (biomedical, geophysics, atmospheric, energy, social and economic simulations)
Big Data @BSC: Fighting Cancer
• Analysis of 2000 tumors
• 20 different types
• Used to answer questions about the formation of these tumors.
Big Data @BSC: Smart Cities
• Sensors around Barcelona gather data on traffic, pollution levels.
• Processed in cabinets throughout the city
• BSC developing a solution that is quick-to-deploy
• Improved quality of life
Download the Roadmap for European Technologies in Hardware and Networking for Big Data at:
www.rethinkbig-project.eu/roadmap