The internet of things, do we need all that data?
The internet of things
Zettabytes of data, but do we really need all that data?
Christian Verstraete, Chief Technologist
A little story
Data generation increases at a fast pace
World Population: 7.210 billion
Active Internet Users: 3.010 billion (penetration: 42%)
Active Social Media Accounts: 2.078 billion (penetration: 29%)
Unique Mobile Users: 3.649 billion (penetration: 51%)
Active Mobile Social Accounts: 1.685 billion (penetration: 23%)
Things in IoT: 6.400 billion
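The penetration figures on this slide appear to be each count divided by the world population. A minimal sketch that reproduces them, assuming simple rounding to the nearest percent (the function and variable names are illustrative, not from the slides):

```python
# Figures as quoted on the slide.
WORLD_POPULATION = 7.210e9

COUNTS = {
    "Active Internet Users": 3.010e9,
    "Active Social Media Accounts": 2.078e9,
    "Unique Mobile Users": 3.649e9,
    "Active Mobile Social Accounts": 1.685e9,
}


def penetration(count, population=WORLD_POPULATION):
    """Share of the population, as a whole-number percentage (assumed rounding)."""
    return round(100 * count / population)


for label, count in COUNTS.items():
    print(f"{label}: {penetration(count)}%")
```

Under that assumption the computed values match the slide: 42%, 29%, 51% and 23%.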
Welcome to the new vocabulary
10^30 Geopbyte*: This will be our digital universe tomorrow…
10^27 Brontobyte: A 1 BB hard drive would cover the earth 23,000 times
10^24 Yottabyte: This is our digital universe today = 250 trillion DVDs
10^21 Zettabyte: 1.3 ZB of network traffic by 2016
10^18 Exabyte: 1 EB of data is created on the internet each day = 250 million DVDs worth of information. The proposed Square Kilometer Array telescope will generate an EB of data per day
10^15 Petabyte: The CERN Large Hadron Collider generates 1 PB per second
10^12 Terabyte: 500 TB of new data per day are ingested in Facebook databases
10^9 Gigabyte
10^6 Megabyte
*The terms Gegobyte and Geobyte are also used in the literature.
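To make the vocabulary concrete, here is a small sketch that converts between the decimal units listed on this slide and checks the DVD comparisons against each other. The table and helper names are hypothetical; the exponents are the ones on the slide.

```python
# Decimal exponents as listed on the slide (bytes = 10 ** exponent).
UNIT_EXPONENTS = {
    "megabyte": 6,
    "gigabyte": 9,
    "terabyte": 12,
    "petabyte": 15,
    "exabyte": 18,
    "zettabyte": 21,
    "yottabyte": 24,
    "brontobyte": 27,
    "geopbyte": 30,
}


def to_bytes(value, unit):
    """Convert a quantity in the given unit to bytes."""
    return value * 10 ** UNIT_EXPONENTS[unit]


def convert(value, from_unit, to_unit):
    """Convert a quantity from one unit on the slide to another."""
    return value * 10 ** (UNIT_EXPONENTS[from_unit] - UNIT_EXPONENTS[to_unit])


# DVD capacity implied by "1 EB = 250 million DVDs".
dvd_bytes_from_eb = to_bytes(1, "exabyte") // (250 * 10**6)
# DVD capacity implied by "1 YB = 250 trillion DVDs".
dvd_bytes_from_yb = to_bytes(1, "yottabyte") // (250 * 10**12)

print(dvd_bytes_from_eb, dvd_bytes_from_yb)
print(convert(1.3, "zettabyte", "exabyte"), "EB of network traffic by 2016")
```

Both DVD comparisons imply the same capacity of about 4 GB per DVD, so the two figures on the slide are consistent with each other.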
In 2013, 4.4 zettabytes were generated; by 2020,
44 zettabytes will be generated.
By 2020, 37% of the digital universe will contain
information that might be valuable if analyzed
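As a rough worked example, the growth from 4.4 ZB (2013) to 44 ZB (2020) can be turned into an annual rate. This sketch assumes smooth compound growth between the two years, which the slides do not state; the function names are illustrative.

```python
import math


def annual_growth_rate(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


def doubling_time(rate):
    """Years needed to double at a constant compound growth rate."""
    return math.log(2) / math.log(1 + rate)


rate = annual_growth_rate(4.4, 44.0, 2020 - 2013)
potentially_valuable_zb = 0.37 * 44.0  # 37% of the 2020 digital universe

print(f"annual growth: {rate:.1%}")
print(f"doubling time: {doubling_time(rate):.1f} years")
print(f"potentially valuable in 2020: {potentially_valuable_zb:.1f} ZB")
```

Under that assumption the data volume grows by roughly 39% per year and doubles about every 2.1 years, and the 37% share corresponds to about 16 ZB.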
Sources of Data
Transaction & Application Data
Internet of Things Social Media Enterprise Content
Structured Data Unstructured Data
10% of Data By 2020, 12% of Data
By 2030, storing all the data generated would require a datacenter 6X the size of Greater London. This datacenter would consume 25% of the world's energy.
Example: AWS datacenter locations
50,000 to 80,000 servers per DC
Up to 30 MW per DC
Analyze
Centralized versus Decentralized
[Chart: Digital Universe, data in zettabytes by year, 2006 to 2022. Values shown: 0.3, 0.8, 1.2, 1.8, 4.4, 7.9 and 44 ZB; callouts at '09, 2013 and 2020.]
Compute is not keeping up
Healthcare - Epidemiology
Don’t bring data to the algorithm, bring the algorithm to data
Confidential Healthcare Data – Geographically Constrained
Extract, Anonymize (×6)
Join
Analyze
Analyze (×5)
Report
Finalize Analysis
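One way to read "bring the algorithm to data" is that each location analyzes its own confidential records and reports only aggregates, which are then combined in a final step. The sketch below illustrates that reading; the record layout, the `LocalSummary` type and the function names are all hypothetical, and the prevalence calculation merely stands in for whatever analysis is actually run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocalSummary:
    """Aggregate that leaves a site; it contains no patient-level data."""
    site: str
    cases: int
    population: int


def analyze_locally(site, records):
    """Runs where the data lives; only counts are reported back."""
    cases = sum(1 for record in records if record["diagnosed"])
    return LocalSummary(site=site, cases=cases, population=len(records))


def finalize_analysis(summaries):
    """Combine the per-site reports into an overall prevalence."""
    total_cases = sum(s.cases for s in summaries)
    total_population = sum(s.population for s in summaries)
    return total_cases / total_population if total_population else 0.0


# Made-up records for two sites, for illustration only.
site_data = {
    "site-a": [{"diagnosed": True}, {"diagnosed": False},
               {"diagnosed": False}, {"diagnosed": False}],
    "site-b": [{"diagnosed": True}, {"diagnosed": True}, {"diagnosed": False},
               {"diagnosed": False}, {"diagnosed": False}, {"diagnosed": False}],
}

reports = [analyze_locally(site, records) for site, records in site_data.items()]
print(finalize_analysis(reports))
```

The point of the design is that the raw records never cross the geographic boundary; only the small `LocalSummary` objects do.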
Mesh Networks – Information that is important locally
– Process local information locally, don’t clog the internet.
– Information is only valid for a very short amount of time
– If there is a gap in the network, it means there is no car, so the information does not need to be taken further.
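The mesh behaviour described above can be sketched as a hop-by-hop relay along a road: a message travels from car to car while the next car is within radio range, stops at the first gap, and expires after a short time. This is a simplified one-dimensional model with hypothetical names and made-up numbers, not a description of any real vehicular protocol.

```python
def reachable_vehicles(positions, source_index, radio_range):
    """Indices of vehicles that receive a message relayed hop by hop.

    positions are distances along a road (same unit as radio_range).
    Relaying stops in each direction at the first gap wider than radio_range.
    """
    order = sorted(range(len(positions)), key=lambda i: positions[i])
    rank = order.index(source_index)
    reached = [source_index]
    for step in (1, -1):  # relay ahead of, then behind, the source
        j = rank
        while 0 <= j + step < len(order):
            here = positions[order[j]]
            neighbour = positions[order[j + step]]
            if abs(neighbour - here) > radio_range:
                break  # gap in the mesh: nobody nearby needs the message
            reached.append(order[j + step])
            j += step
    return sorted(reached)


def is_still_valid(created_at, now, ttl_seconds):
    """Local information is only useful for a short time after creation."""
    return now - created_at <= ttl_seconds


positions = [0, 80, 150, 400, 460]  # metres along the road (made up)
print(reachable_vehicles(positions, source_index=1, radio_range=100))
print(is_still_valid(created_at=0, now=12, ttl_seconds=10))
```

In this example the message from the car at 80 m reaches the cars at 0 m and 150 m but not the two cars beyond the 250 m gap, and nothing is ever sent to the internet.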
Cloud
Distributed and Parallel Processing of Data
[Diagram, steps 1 to 4. Labels: ?, MetaData, Where, Consolidation & Results]
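The diagram labels suggest a flow in which a question is checked against metadata to find where the data lives, the processing runs where the data is, and the partial results are consolidated. The sketch below is one possible reading of that flow, not the presenter's design: the node names, the metadata catalogue and the functions are hypothetical, and threads stand in for remote nodes.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data held on each node (stand-ins for remote systems).
NODES = {
    "node-a": [3, 5, 8],
    "node-b": [13, 21],
    "node-c": [34],
}

# Hypothetical metadata catalogue: which nodes hold which dataset.
METADATA = {
    "sensor-readings": ["node-a", "node-b"],
    "billing": ["node-c"],
}


def run_on_node(node, local_func):
    """Run the function next to the data; only its result is returned."""
    return local_func(NODES[node])


def distributed_query(dataset, local_func, consolidate):
    """Look up where the data is, process in parallel, consolidate results."""
    nodes = METADATA[dataset]
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda n: run_on_node(n, local_func), nodes))
    return consolidate(partials)


total = distributed_query("sensor-readings", local_func=sum, consolidate=sum)
print(total)
```

Only the per-node partial sums travel back to the caller; the readings themselves stay on their nodes.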
Computer Technology is Changing
Processor: CPU registers, Level 1 cache, Level 2 cache, Level 3 cache
Computer: Main memory, Local disk, Flash accelerator, SSD
Network: Network drive array*, Network backup, Archive
* actually an entire computer system with its own hierarchy
Physical Server (×2): SoC (×2 each), Local DRAM (×2 each)
Network
Memory Pool: NVM (×4)
Translator
Coordinator
Orchestrator
Arbitrator
Aggregator
Replicator
Anonymizer
Border guard
Learning engine
Distributed Mesh Computing
A mesh of connected aircraft …
Use case: the smart cell tower
Conclusion
– We will go from centralized to decentralized processing of data and information
  – Allows the processing
  – Creates opportunities for a whole new market of data providers
  – Enables improved analysis and allows new problems to be addressed
  – Improves security and reduces data duplication
– Requires new thinking and a different approach to analytics
– Builds on a new computing technology
– Enables full exploitation of IoT
Thank you
[email protected]
Twitter: @Christianve
Blog: http://community.hpe.com/t5/Cloud-Source/bg-p/CloudSource