Thriving and Surviving the Big Data Revolution
DESCRIPTION
Presentation on Big Data given at Collaborate 2014 #c14lv
TRANSCRIPT
1 Global Marketing Confidential
REMINDER
Check in on the COLLABORATE mobile app
207: Surviving and thriving in the big data revolution
Guy Harrison
Executive Director, R&D
Information Management Group
Dell Software
3 Software Group
Introductions
Web: guyharrison.net
Email: guy.harrison@software.dell.com
Twitter: @guyharrison
Google Plus: https://www.google.com/+GuyHarrison1
4 Software Group
5 Software Group
6 Software Group
7 Software Group
8 Software Group
Dell and Quest – a brief history
9 Software Group
But Seriously
10 Software Group
What is Big Data?
11 Software Group
Three or Four "V"s
• Volume: terabytes, petabytes, exabytes, zettabytes
• Variety: structured, unstructured, human generated, machine generated
• Velocity: user populations × transaction rates × machine data
• Value: competitive or collective advantage
12 Software Group
Instead - the industrial Revolution of data
13 Software Group
14 Software Group
15 Software Group
16 Software Group
17 Software Group
18 Software Group
19 Software Group
20 Software Group
21 Software Group
Data means more
1993
• Generated internally
• Key to operational efficiency
2013
• Generated externally
• Key to competitive advantage
• Source of product innovation
• Changing our lives
22 Software Group
Big Data is the culmination of cloud, social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail?
25 Software Group
Prevalence of Showrooming
[Bar chart: Pct (axis 0 to 70) by category: Consumer Electronics, Home Improvement]
Source: Gartner Research G00249458, "Survey Analysis: Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not Unbeatable", published 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
• Shelf assortment optimization
• In-store offers
• Customer entertainment
• Checkout anywhere
• Relationship management
• Customer analytics
33 Software Group
34 Software Group
Why showrooming?
• Selection
• Stock
• Faster
• Cheaper
Defences
• Dynamic pricing
• Predictive ordering
• Assortment optimization
• Predictive recommendations
• Personalization
35 Software Group
It's not enough to lay out products on tables
• Online has significant advantages
• Retailers can only survive by embracing online and emulating online practices
  – Dynamic pricing
  – Shelf optimization
  – Personalized service and selection
• Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Don't Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover: $45.99
Amazon Kindle: $39.99
Say "screw you, bookseller" to buy the Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads-up display
• Emotion/attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
[Diagram, top to bottom: Google Applications; Map Reduce and BigTable; Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: Start, many parallel Map tasks, Reduce]
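The map, group and reduce pattern in the diagram can be sketched in a few lines of single-process Python. This is only an illustration of the programming model, not Google's or Hadoop's implementation; the word-count task, the function names and the sample input are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group every emitted value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    # Each document stands in for a split that a separate mapper would process.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input splits
splits = ["big data big ideas", "data means more", "big data analytics"]
word_counts = map_reduce(splits)
```

In a real cluster the map calls run in parallel on the nodes that hold the data, and the shuffle moves intermediate pairs across the network; here everything runs in one process so the data flow is easy to follow.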
68 Software Group
Multi-stage Map-Reduce
[Diagram: Client; HDFS; Mapper tasks (Scan, Sort); further Mapper tasks (Aggregate); Reduce]
69 Software Group
Schema on Read vs Schema on Write
[Diagram: Schema on Write: Data; Extract, Transform, Load into a Data Warehouse, with Code, Cleanse, Normalize, Aggregate and Analyse steps; Utilize]
[Diagram: Schema on Read: Data; Load into Hadoop; Code, Cleanse and Analyse steps; Utilize]
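One way to see the difference is a small sketch in which the same raw events are loaded both ways. The event format, the field names and the `REQUIRED` schema are hypothetical; the point is only where the structure gets applied, at load time or at query time.

```python
import json

# Hypothetical raw events; two of them do not fit the fixed schema.
RAW_EVENTS = [
    '{"user": "dick", "site": "ebay", "hits": 3}',
    '{"user": "jane", "site": "google"}',
    '{"user": "jane", "hits": "7", "referrer": "x"}',
]

REQUIRED = ("user", "site", "hits")  # assumed warehouse schema


def load_schema_on_write(lines):
    """Validate against a fixed schema at load time; reject rows that do not fit."""
    table, rejected = [], []
    for line in lines:
        record = json.loads(line)
        if all(field in record for field in REQUIRED):
            table.append({field: record[field] for field in REQUIRED})
        else:
            rejected.append(line)
    return table, rejected


def load_schema_on_read(lines):
    """Store the raw lines untouched; no structure is imposed at load time."""
    return list(lines)


def hits_per_user(raw_store):
    """Apply structure at query time, tolerating missing or oddly typed fields."""
    totals = {}
    for line in raw_store:
        record = json.loads(line)
        user = record.get("user", "unknown")
        totals[user] = totals.get(user, 0) + int(record.get("hits", 0))
    return totals


warehouse, rejected = load_schema_on_write(RAW_EVENTS)
raw_store = load_schema_on_read(RAW_EVENTS)
totals = hits_per_user(raw_store)
```

With schema on write, two of the three events never reach the warehouse. With schema on read, all three are kept and each query decides how to interpret them.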
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
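Dividing the cluster totals by the node count gives a feel for how modest each machine is. This is a back-of-envelope sketch that assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an even spread of resources across nodes, neither of which the slide states.

```python
# Cluster totals as quoted on the slide
nodes = 4000
disk_pb = 16
ram_tb = 64
cores = 32000

# Assumption: decimal units and identical nodes
disk_tb_per_node = disk_pb * 1000 / nodes   # terabytes of disk per node
ram_gb_per_node = ram_tb * 1000 / nodes     # gigabytes of RAM per node
cores_per_node = cores / nodes              # CPU cores per node

print(f"{disk_tb_per_node} TB disk, {ram_gb_per_node} GB RAM, "
      f"{cores_per_node} cores per node")
```

Under those assumptions each node is an ordinary commodity server, which is the economic argument behind Hadoop.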
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: Hadoop Client (Java, Pig, Hive); Map Reduce (distributed processing) coordinated by the Job Tracker; HDFS (distributed storage) coordinated by the Name Node and Secondary Name Node; many worker nodes, each running a Data Node and a Task Tracker]
76 Software Group
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
[Diagram: Hadoop Client (Java, Pig, Hive); Resource Manager; Node Managers, each hosting a Container; Application Master]
77 Software Group
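Tez (Hindi for "fast")
[Diagram: three chained Map Reduce jobs (Job 1, Job 2, Job 3), each with Map and Reduce stages over HDFS, compared with the same Map and Reduce stages run as a single job (Job 1) over HDFS]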
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing two storage stacks. Oracle-style stack: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks. HBase stack: Tables, MemStore, WA Log (write-ahead log), HFiles, HDFS, Disks]
79 Software Group
Hbase Data Model

Flat table:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Relational (normalized) tables:
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase rows (each row carries its own set of columns):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
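The wide-row layout can be mimicked with nested dictionaries: one entry per row key, holding only the columns that row actually has. This is a sketch of the data model only, using the values from the slide; it does not use the HBase client API, and `to_wide_rows` is a hypothetical helper.

```python
# (name, site, counter) rows as shown on the slide
flat_rows = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot flat rows into one sparse, wide row per name (the row key)."""
    wide = {}
    for name, site, counter in rows:
        # Each site becomes a column that exists only in rows that need it.
        wide.setdefault(name, {})[site] = counter
    return wide


wide_table = to_wide_rows(flat_rows)

for row_key, columns in wide_table.items():
    print(row_key, columns)
```

Dick's row has no ILoveLarry.com column and Jane's has no Ebay column. Nothing is stored for the missing cells, which is what lets such a table have a very large number of possible columns.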
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram: SQL, Java, Results]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase, MLlib – machine learning
Map Reduce
Shark (SQL), Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop, $/TB (hardware only)
[Bar chart, axis $0 to $6000]
Exadata: $4911
Hadoop: $750
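The two bars can be turned into a ratio and a rough saving for a given capacity. This sketch assumes the larger figure belongs to Exadata and the smaller to Hadoop, as the chart ordering suggests; the 100 TB workload is a hypothetical example, and only hardware cost is considered.

```python
# Hardware-only cost per terabyte, as read from the chart
# (assumption: $4911 is Exadata and $750 is Hadoop)
exadata_per_tb = 4911
hadoop_per_tb = 750

ratio = exadata_per_tb / hadoop_per_tb


def hardware_cost(terabytes, dollars_per_tb):
    """Hardware cost in dollars for a given raw capacity."""
    return terabytes * dollars_per_tb


capacity_tb = 100  # hypothetical workload size
saving = (hardware_cost(capacity_tb, exadata_per_tb)
          - hardware_cost(capacity_tb, hadoop_per_tb))

print(f"Exadata costs about {ratio:.1f}x more per TB")
print(f"Difference for {capacity_tb} TB: ${saving:,}")
```

The comparison says nothing about software licences, staff, or the very different workloads the two platforms suit.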
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48 GB RAM per node (864 GB total)
  – 2 x 6-core CPUs per node (216 cores total)
  – 12 x 2 TB HDD per node (216 spindles, 432 TB)
  – 40 Gb/s InfiniBand between nodes
  – 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
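Multiplying out the per-node figures is a quick sanity check on the rack totals: 18 nodes with 12 disks of 2 TB each give 216 spindles and 432 TB of raw disk. The usable-capacity line is an added assumption: it applies the HDFS default replication factor of 3, which the slide does not mention.

```python
# Per-node figures from the slide
nodes = 18
ram_gb_per_node = 48
cpus_per_node = 2
cores_per_cpu = 6
disks_per_node = 12
tb_per_disk = 2

total_ram_gb = nodes * ram_gb_per_node
total_cores = nodes * cpus_per_node * cores_per_cpu
total_spindles = nodes * disks_per_node
raw_disk_tb = total_spindles * tb_per_disk

# Assumption (not on the slide): HDFS default replication factor of 3
hdfs_replication = 3
approx_usable_tb = raw_disk_tb / hdfs_replication

print(total_ram_gb, total_cores, total_spindles, raw_disk_tb, approx_usable_tb)
```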
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
• Machine Learning: programs that evolve with "experience"
• Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
• Predictive Analytics: programs that extrapolate from existing data into the future
• Big Data Analytics: AKA Data Science
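A minimal illustration of collective intelligence is a "customers who bought this also bought" list, built purely from what the crowd did. The baskets below are hypothetical and the co-occurrence count is one of the simplest possible approaches, not the method used by any particular retailer.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase histories, one set of items per customer
baskets = [
    {"tent", "sleeping bag", "stove"},
    {"tent", "sleeping bag"},
    {"tent", "stove"},
    {"novel", "reading lamp"},
    {"tent", "sleeping bag", "reading lamp"},
]


def co_occurrence(all_baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in all_baskets:
        for item, other in permutations(sorted(basket), 2):
            counts.setdefault(item, Counter())[other] += 1
    return counts


def also_bought(item, counts, top_n=2):
    """Return the items most often bought together with the given item."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]


counts = co_occurrence(baskets)
suggestions = also_bought("tent", counts)
```

The program knows nothing about camping; it seems intelligent only because it aggregates many customers' choices.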
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Scatter plot with fitted trend line f(x) = 0.971521231456065 x + 0.71906459527154; x axis 0 to 120, y axis -20 to 120]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
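The first technique in the list, linear regression, fits a straight line through the data and extends it to make predictions. The sketch below uses the ordinary least-squares formulas in plain Python; the sample points are hypothetical and are not the data behind the chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Extrapolate from the fitted line to an unseen x value."""
    return slope * x + intercept


# Hypothetical observations
xs = [10, 20, 30, 40, 50]
ys = [12, 19, 31, 42, 48]

slope, intercept = fit_line(xs, ys)
forecast = predict(60, slope, intercept)
```

The other techniques on the slide follow the same pattern of fitting a model to existing data and then extrapolating, with more flexible model shapes.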
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
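Spam detection is a convenient example of classification. The sketch below is a tiny naive Bayes classifier with add-one smoothing, trained on a handful of hypothetical messages; a real filter would need far more data and better text handling.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled training messages
training = [
    ("win cash prize now", "spam"),
    ("cheap prize win win", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]


def train(examples):
    """Count words per label; these counts are the whole model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Pick the label with the highest log-probability for the message."""
    word_counts, label_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in text.split():
            # Add-one smoothing so unseen words do not zero out the score
            score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(training)
verdict = classify("win a prize", model)
```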
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
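A common clustering algorithm is k-means, sketched here in plain Python. The points are hypothetical (items per basket, spend per basket), the starting centroids are chosen by hand to keep the run deterministic, and k-means is used only as one example of clustering; the slide does not name a specific algorithm.

```python
def closest(point, centroids):
    """Index of the centroid nearest to the point (squared distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[closest(point, centroids)].append(point)
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroid  # keep an empty cluster's centroid in place
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical baskets: (items per basket, spend per basket)
points = [(1, 10), (2, 12), (1, 11), (9, 95), (10, 100), (11, 98)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
```

No labels are supplied: the two groups (small baskets and large baskets) emerge from the data alone.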
121 Software Group
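Supervised Machine Learning
[Diagram: Raw Data is cleaned and split into a Training Set and a Validation Set; the Training Set produces a Candidate Model; the Candidate Model is validated against the Validation Set and becomes the Production Model; the Production Model takes New Data (new business, existing business) and produces a Prediction]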
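The workflow in the diagram can be sketched end to end with a deliberately trivial model: a single cut-off value. Everything here is hypothetical, including the churn data, the 30% validation split and the 0.9 accuracy bar; the point is the sequence of split, train, validate and promote.

```python
import random


def split(records, validation_fraction=0.3, seed=42):
    """Shuffle the cleaned data and split it into training and validation sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def train_threshold(training_set):
    """Candidate model: the cut-off that misclassifies the fewest training rows."""
    candidates = sorted({x for x, _ in training_set})
    return min(
        candidates,
        key=lambda t: sum((x >= t) != label for x, label in training_set),
    )


def accuracy(threshold, rows):
    """Fraction of rows the threshold model classifies correctly."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)


def predict(threshold, days_inactive):
    """Production use: score a new customer."""
    return days_inactive >= threshold


# Hypothetical cleaned data: (days since last purchase, churned?)
raw = [(days, days >= 60) for days in range(5, 125, 5)]

training_set, validation_set = split(raw)
candidate = train_threshold(training_set)
validation_accuracy = accuracy(candidate, validation_set)

# Promote the candidate only if it clears the (assumed) accuracy bar
production_model = candidate if validation_accuracy >= 0.9 else None
```

Keeping the validation rows out of training is what makes the accuracy figure an honest estimate of how the model will behave on new data.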
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics / Data Science
• Search optimization
• Recommendation systems
• Security: vulnerability, penetration detection
• Fraud detection
• CRM: churn, defaults
• Medical: risk analysis, diagnosis, prognosis
• Game optimization
• Advertising: targeting, tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Redo logs; Change Data Capture; JMS Queue; Hadoop Poster; outputs: batched HDFS file copy (audit change data) and HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: Server and Storage, STATISTICA, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
![Page 10: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/10.jpg)
10 Software Group
What is Big Data
11 Software Group
Three or Four ldquoVrdquos
VolumeTerabytesPetabytesExabytesZetabytes
VarietyStructuredUnstructuredHuman GeneratedMachine Generated
VelocityUser populations xTransaction rates xMachine data
Value Competitive or Collective advantage
12 Software Group
Instead - the industrial Revolution of data
13 Software Group
14 Software Group
15 Software Group
16 Software Group
17 Software Group
18 Software Group
19 Software Group
20 Software Group
21 Software Group
Generated internally
Key to operational efficiency
1993
Generated externally
Key to competitive advantage
Source of product innovation
Changing our lives
2013
Data means more
22 Software Group
Big Data is the culmination of cloud social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail
25 Software Group
Prevalence of Showrooming
Consumer Electronics
Home Improvement
0 10 20 30 40 50 60 70
Pct
Garter Research G00249458Survey Analysis Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not UnbeatablePublished 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- Itrsquos not enough to lay out products on tables
- Therersquos a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
11 Software Group
Three or Four "V"s
- Volume: Terabytes, Petabytes, Exabytes, Zettabytes
- Variety: Structured, Unstructured, Human Generated, Machine Generated
- Velocity: User populations x Transaction rates x Machine data
- Value: Competitive or Collective advantage
12 Software Group
Instead - the industrial Revolution of data
21 Software Group
Data means more
- 1993: Generated internally; key to operational efficiency
- 2013: Generated externally; key to competitive advantage; source of product innovation; changing our lives
22 Software Group
Big Data is the culmination of cloud, social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail?
25 Software Group
Prevalence of Showrooming
[Bar chart: prevalence of showrooming (Pct, axis 0 to 70) for Consumer Electronics and Home Improvement]
Source: Gartner Research G00249458, "Survey Analysis: Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not Unbeatable", published 12 February 2013
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
- Shelf assortment optimization
- In store offers
- Customer entertainment
- Checkout anywhere
- Relationship management
- Customer analytics
33 Software Group
34 Software Group
Why showrooming
- Selection
- Stock
- Faster
- Cheaper
Defences
- Dynamic Pricing
- Predictive ordering
- Assortment optimization
- Predictive recommendations
- Personalization
35 Software Group
It's not enough to lay out products on tables
- Online has significant advantages
- Retailers can only survive by embracing online and emulating online practices
  - Dynamic pricing
  - Shelf optimization
  - Personalized service and selection
- Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web, Transport, Power Grid, Dating, Retail, Security, Finance, Government, Science, Healthcare, Insurance, Telecom, Advertising
37 Software Group
The Revolution is not over yet
42 Software Group
Willy Bowman
Nationality: German
Don't Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
- Amazon softcover: $45.99
- Amazon Kindle: $39.99
Say "screw you bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
50 Software Group
Brain Control
53 Software Group
Muze
56 Software Group
The instrumented human
- Bluetooth Personal Area Network
- 3G/WiFi Wide Area Network
- GPS
- Storage
- Pulse, temp monitor
- Silent alarms
- Pedometer, sleep monitoring
- Compass
- Camera
- Mike/earphones
- Heads up display
- Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
66 Software Group
Google Software Architecture
[Layer diagram: Google Applications; Map Reduce and BigTable; Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: Start, many parallel Map tasks, Reduce]
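The Map Reduce pattern on this slide can be sketched in a few lines of plain Python. This is only an illustrative, single-process word count, not Hadoop or Google code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are assumptions made for the example.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    # Each document stands in for one input split handled by one mapper.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data big", "data revolution"])
```

In a real cluster the map calls would run in parallel on many nodes and the shuffle would move data across the network; here everything runs in one process to keep the idea visible.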
68 Software Group
Multi-stage Map-Reduce
[Diagram: HDFS feeding a stage of MAPPER tasks (SCAN, SORT), a further stage of MAPPER tasks (AGGREGATE), then REDUCE and the Client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram. Schema on Write: Data; Extract, Transform, Load (Code: Cleanse, Normalize, Aggregate); Data Warehouse; Analyse; Utilize. Schema on Read: Data; Load; Hadoop; Code (Cleanse, Analyse); Utilize.]
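The difference between the two approaches can be shown with a small Python sketch. It is an assumption-laden illustration, not a description of any product: the sample events, the field names (`user`, `site`, `hits`) and the helper functions are all hypothetical.

```python
import json

# Hypothetical raw events; the second one has no "hits" field.
RAW_EVENTS = [
    '{"user": "dick", "site": "Google", "hits": "3"}',
    '{"user": "jane", "site": "Facebook"}',
]

def load_schema_on_write(lines):
    """Schema on write: validate and convert at load time, reject bad rows."""
    table, rejected = [], []
    for line in lines:
        record = json.loads(line)
        try:
            table.append((record["user"], record["site"], int(record["hits"])))
        except KeyError:
            rejected.append(line)
    return table, rejected

def load_schema_on_read(lines):
    """Schema on read: store the raw lines untouched."""
    return list(lines)

def query_hits(raw_store):
    """Apply structure only when reading; treat missing hits as 0."""
    total = 0
    for line in raw_store:
        record = json.loads(line)
        total += int(record.get("hits", 0))
    return total

table, rejected = load_schema_on_write(RAW_EVENTS)
raw_store = load_schema_on_read(RAW_EVENTS)
```

With schema on write the incomplete event never reaches the table; with schema on read it is kept, and each query decides how to interpret it.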
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
- 4000 nodes
- 16 PB disk
- 64 TB of RAM
- 32000 cores
74 Software Group
Hadoop ecosystem
- Hadoop File System (HDFS)
- Map Reduce / YARN
- HBase (Database)
- ZooKeeper (Locking)
- SQOOP (RDBMS loader)
- Hive (Query)
- Pig (Scripting)
- Flume (Log Loader)
- Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: HADOOP CLIENT (Java, Pig, Hive); MAP REDUCE (distributed processing) with a JOB TRACKER; HDFS (distributed storage) with a NAME NODE and a SECONDARY NAME NODE; many worker nodes, each labelled DATA NODE and TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: HADOOP CLIENT (Java, Pig, Hive); RESOURCE MANAGER; APPLICATION MASTER; several NODE MANAGERs, each with a CONTAINER]
77 Software Group
Tez (Hindi for "fast")
[Diagram: chains of MAP and REDUCE stages between HDFS reads and writes, labelled Job 1, Job 2 and Job 3, shown alongside the same stages labelled as a single Job 1]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing two storage stacks. One stack: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks. HBase stack: Tables, MemStore, WA Log, HFiles, HDFS, Disks.]
79 Software Group
HBase Data Model

Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
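The wide, sparse rows in the tables above can be mimicked with nested dictionaries. The sketch below is only an analogy in plain Python, not the HBase API; the function name `to_wide_rows` is hypothetical, and the site names are written with their dots restored, which is an assumption.

```python
from collections import defaultdict

# (name, site, counter) rows from the flat table on the slide.
FLAT_ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]

def to_wide_rows(flat_rows):
    """Pivot flat rows into one sparse row per name: {row_key: {column: value}}."""
    wide = defaultdict(dict)
    for name, site, counter in flat_rows:
        wide[name][site] = counter
    return dict(wide)

wide = to_wide_rows(FLAT_ROWS)
```

Each row holds only the columns it actually uses, so Jane's row has no Ebay entry at all instead of a NULL, which is the point the slide's two differently shaped rows make.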
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
- Cloudera Impala
- MapR Drill
- Aster
- Greenplum (Pivotal HD)
- ParAccel
- Hadapt
- Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP moves RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytics Stack (BDAS)
- YARN / EC2
- Mesos - heterogeneous cluster manager
- Tachyon - in-memory file system
- Spark - memory-optimized distributed execution
- Spark Streaming
- MLbase, MLlib - Machine Learning
- Map Reduce
- Shark (SQL), Hive (SQL)
- BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
- Database servers: 64 cores, 576 GB RAM
- Storage servers: 112 cores; 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
[Bar chart: Exadata vs Hadoop, $/TB (hardware only), axis $0 to $6000: Exadata $4911, Hadoop $750]
93 Software Group
Oracle Big Data Appliance
- 18 Sun X4270 M2 servers
  - 48GB RAM per node (864GB total)
  - 2x6 Core CPU per node (216 total)
  - 12x2TB HDD per node (216 spindles, 864 TB)
  - 40Gb/s Infiniband between nodes
  - 10Gb/s Ethernet to datacentre
- Competitive Pricing: www.oracle.com/us/bigdata/index.html
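The totals quoted for the appliance follow from simple per-node multiplication, sketched below in Python. The constant names are illustrative; the figures are the ones on the slide. Note that the raw disk product comes to 432 TB rather than the 864 TB printed on the slide, so either the per-disk size or the total may have been recorded differently; this sketch does not resolve which.

```python
NODES = 18
RAM_GB_PER_NODE = 48
CORES_PER_NODE = 2 * 6      # two 6-core CPUs per node
DISKS_PER_NODE = 12
TB_PER_DISK = 2

total_ram_gb = NODES * RAM_GB_PER_NODE        # RAM across the rack, in GB
total_cores = NODES * CORES_PER_NODE          # CPU cores across the rack
total_spindles = NODES * DISKS_PER_NODE       # number of hard disks
raw_disk_tb = total_spindles * TB_PER_DISK    # raw disk capacity, in TB
```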
94 Software Group
Big Data Appliance Software
- Cloudera Enterprise
- Oracle Enterprise R
- Oracle NoSQL
- Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
- Machine Learning: Programs that evolve with "experience"
- Collective Intelligence: Programs that use inputs from "crowds" to seem intelligent
- Predictive Analytics: Programs that extrapolate from existing data into the future
- Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
112 Software Group
Artificial Intelligence Strikes back
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Scatter plot with fitted line; x axis 0 to 120, y axis -20 to 120; f(x) = 0.971521231456065 x + 0.71906459527154]
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistic regression
- CART
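The first technique in the list, linear regression, fits a line of the form f(x) = slope * x + intercept like the one on the chart. A minimal ordinary least squares sketch in plain Python follows; the function names and the sample points are assumptions for illustration and are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept

# Hypothetical observations lying exactly on y = x + 1.
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 21, 41, 61, 81, 101]
slope, intercept = fit_line(xs, ys)
forecast = predict(slope, intercept, 120)
```

Real data would scatter around the line, and the fitted slope and intercept would then be estimates with some error rather than exact values.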
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
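As a rough illustration of classification, the sketch below trains a nearest-centroid classifier in plain Python. This is one simple technique chosen for brevity, not a method named in the presentation; the features (links and exclamation marks per message), the labels and the training data are all hypothetical.

```python
def train_centroids(examples):
    """examples: list of (features, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def classify(centroids, features):
    """Assign the label whose centroid is closest to the new data point."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

# Hypothetical features: (number of links, number of exclamation marks).
TRAINING = [
    ((8, 6), "spam"), ((7, 9), "spam"),
    ((1, 0), "ham"), ((0, 1), "ham"),
]
centroids = train_centroids(TRAINING)
```

Calling `classify(centroids, (6, 7))` labels a new message by proximity to what the model has already seen, which is the "model that classifies new data" idea in miniature.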
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
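Grouping data without predefined labels can be sketched with k-means, one common clustering algorithm (the presentation does not name a specific one). The Python below is a bare-bones version for 2-D points; the points and the starting centroids are hypothetical, and the starting centroids are fixed so the result is repeatable.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical points, e.g. (items per basket, visits per month).
POINTS = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 8), (8, 9)]
centroids, clusters = kmeans(POINTS, centroids=[(0, 0), (10, 10)])
```

No labels are supplied anywhere: the two groups emerge from the distances alone.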
121 Software Group
Supervised Machine Learning
[Diagram: Raw Data, Clean, Training Set and Validation Set, Model, Candidate Model, Validate, Production Model; New Data (New Business, Existing Business) goes into the Production Model, which outputs a Prediction]
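The train, validate and promote flow in this diagram can be sketched in Python. Everything here is an assumption for illustration: the data (monthly support calls and a churn flag), the deliberately simple threshold model, the deterministic hold-out rule and the 0.9 acceptance bar are hypothetical choices, not taken from the presentation.

```python
def split(rows, every=4):
    """Hold out every `every`-th cleaned row as the validation set."""
    training = [r for i, r in enumerate(rows) if i % every != 0]
    validation = [r for i, r in enumerate(rows) if i % every == 0]
    return training, validation

def train(training_set):
    """Candidate model: a threshold halfway between the two class means."""
    pos = [x for x, label in training_set if label]
    neg = [x for x, label in training_set if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def accuracy(threshold, rows):
    """Share of rows where 'x above threshold' matches the known label."""
    correct = sum((x > threshold) == label for x, label in rows)
    return correct / len(rows)

# Hypothetical cleaned data: (monthly support calls, churned?).
ROWS = [(x, x >= 5) for x in range(10)]
training_set, validation_set = split(ROWS)
candidate = train(training_set)
score = accuracy(candidate, validation_set)
# Promote the candidate to production only if it validates well enough.
production_model = candidate if score >= 0.9 else None
```

The validation rows are never shown to `train`, so the score is an honest check of how the candidate behaves on data it has not seen.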
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
- Search Optimization
- Recommendation Systems
- Security
  - Vulnerability
  - Penetration Detection
- Fraud Detection
- CRM
  - Churn
  - Defaults
- Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
- Game optimization
- Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Redo-logs, Change Data Capture, JMS Queue, Hadoop Poster; outputs: batched HDFS file copy, audit change data, HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Offerings shown: Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Offerings shown: Server and Storage, STATISTICA, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
134 Software Group
135 Software Group
Data Visualization
136 Software Group
Live scoring - integration into operational systems
137 Software Group
Industry and cross-industry packaged solutions
138 Software Group
For your business
- How could data and algorithms transform your business?
- What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
- Where is the data?
  - Start collecting now
139 Software Group
For your career
- Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
- Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
- Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- Itrsquos not enough to lay out products on tables
- Therersquos a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 10 Architecture
- Hadoop 20 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through ldquoBig Data analyticsrdquo
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlexreg for Hadoop
- Toad BI Suite
- Slide 131
- Dellrsquos offering was not completehellip
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring ndash integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
-
![Page 12: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/12.jpg)
12 Software Group
Instead - the industrial Revolution of data
13 Software Group
14 Software Group
15 Software Group
16 Software Group
17 Software Group
18 Software Group
19 Software Group
20 Software Group
21 Software Group
Generated internally
Key to operational efficiency
1993
Generated externally
Key to competitive advantage
Source of product innovation
Changing our lives
2013
Data means more
22 Software Group
Big Data is the culmination of cloud social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail
25 Software Group
Prevalence of Showrooming
Consumer Electronics
Home Improvement
0 10 20 30 40 50 60 70
Pct
Garter Research G00249458Survey Analysis Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not UnbeatablePublished 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- Itrsquos not enough to lay out products on tables
- Therersquos a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 10 Architecture
- Hadoop 20 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through ldquoBig Data analyticsrdquo
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlexreg for Hadoop
- Toad BI Suite
- Slide 131
- Dellrsquos offering was not completehellip
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring ndash integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
-
![Page 13: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/13.jpg)
13 Software Group
14 Software Group
15 Software Group
16 Software Group
17 Software Group
18 Software Group
19 Software Group
20 Software Group
21 Software Group
Data means more
- 1993: generated internally; key to operational efficiency
- 2013: generated externally; key to competitive advantage; source of product innovation; changing our lives
22 Software Group
Big Data is the culmination of cloud, social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail?
25 Software Group
Prevalence of Showrooming
[Bar chart: prevalence of showrooming (Pct, axis 0 to 70) for Consumer Electronics and Home Improvement]
Source: Gartner Research G00249458, "Survey Analysis: Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not Unbeatable", published 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
- Shelf assortment optimization
- In-store offers
- Customer entertainment
- Checkout anywhere
- Relationship management
- Customer analytics
33 Software Group
34 Software Group
Why showrooming?
- Selection
- Stock
- Faster
- Cheaper
Defences
- Dynamic pricing
- Predictive ordering
- Assortment optimization
- Predictive recommendations
- Personalization
35 Software Group
It's not enough to lay out products on tables
- Online has significant advantages
- Retailers can only survive by embracing online and emulating online practices
  - Dynamic pricing
  - Shelf optimization
  - Personalized service and selection
- Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Don't Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
- Amazon softcover: $45.99
- Amazon Kindle: $39.99
Say "screw you, bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
- Bluetooth Personal Area Network
- 3G/WiFi Wide Area Network
- GPS
- Storage
- Pulse, temp monitor
- Silent alarms
- Pedometer, sleep monitoring
- Compass
- Camera
- Mike/earphones
- Heads up display
- Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Map Reduce
[Diagram: a Start node, many parallel Map tasks, and a Reduce step]
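The slide shows only the shape of the computation: many Map tasks feeding a Reduce. As a minimal single-process sketch of that pattern, the example below counts words. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical illustrations, not Hadoop or Google APIs; a real framework would run each map call on a different node.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for a split handled by a separate Map task.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    # Hypothetical input splits, invented for illustration.
    print(word_count(["big data big", "data revolution"]))
```

The map and reduce functions hold no shared state, which is what lets a framework spread the Map tasks across many machines.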
68 Software Group
Multi-stage Map-Reduce
[Diagram: HDFS feeds a first bank of mappers (scan/sort), then a second bank of mappers (aggregate), then a reduce step that returns results to the client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram comparing two pipelines]
- Schema on Write: Data; Extract, Transform, Load (code, cleanse, normalize, aggregate); Data Warehouse; Analyse; Utilize
- Schema on Read: Data; Load; Hadoop; Code, Cleanse, Analyse; Utilize
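A small sketch may make the schema-on-read idea concrete: the raw lines are loaded untouched, and structure, typing and cleansing are applied only when a query reads them. The field names (`user`, `site`, `hits`), the sample lines and the function names are all assumptions made up for this illustration.

```python
import json

# Raw events stored exactly as they arrived; nothing was validated on load.
RAW_LINES = [
    '{"user": "dick", "site": "Ebay", "hits": "3"}',
    '{"user": "jane", "site": "Google"}',
    "not json at all",
]


def read_with_schema(line):
    """Apply the schema at read time: parse, cleanse and type one record."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None  # cleansing happens in the reader, not in the loader
    if not isinstance(record, dict):
        return None
    return {
        "user": str(record.get("user", "")),
        "site": str(record.get("site", "")),
        "hits": int(record.get("hits", 0)),  # missing counts default to 0
    }


def query_total_hits(lines):
    """A query that interprets the raw lines only when it runs."""
    records = [r for r in map(read_with_schema, lines) if r is not None]
    return sum(r["hits"] for r in records)


if __name__ == "__main__":
    print(query_total_hits(RAW_LINES))
```

With schema on write, the malformed third line would have been rejected or repaired before loading. Here it stays in storage and each reader decides how to treat it.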
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
- 4000 nodes
- 16 PB disk
- 64 TB of RAM
- 32000 cores
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
- Hadoop File System (HDFS)
- Map Reduce / YARN
- HBase (Database)
- ZooKeeper (Locking)
- SQOOP (RDBMS loader)
- Hive (Query)
- Pig (Scripting)
- Flume (Log loader)
- Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: a Hadoop client (Java, Pig, Hive) sits above two layers. Map Reduce (distributed processing) is coordinated by the Job Tracker; HDFS (distributed storage) is coordinated by the Name Node and Secondary Name Node. Each worker node runs a Data Node and a Task Tracker.]
76 Software Group
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
[Diagram: a Hadoop client (Java, Pig, Hive), a Resource Manager, an Application Master, and several Node Managers, each hosting a Container]
77 Software Group
Tez (Hindi for "fast")
[Diagram: a chain of Map and Reduce stages reading from and writing to HDFS, shown once as three separate jobs (Job 1, Job 2, Job 3) and once as a single job (Job 1)]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram: two storage stacks side by side. First stack: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks. Second stack: Tables, MemStore, WA Log, HFiles, HDFS, Disks.]
79 Software Group
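HBase Data Model

Denormalized data:

| Name | Site | Counter |
|---|---|---|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

Normalized relational model:

| NameId | Name |
|---|---|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|---|---|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|---|---|---|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

HBase model (one wide, sparse row per name):

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|---|---|---|---|---|---|---|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|---|---|---|---|---|---|
| 2 | Jane | 716426 | 643261 | | 856767 |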
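To illustrate the difference between the flat rows and the wide rows on this slide, the sketch below pivots the slide's counters into one sparse dictionary per name, so a row holds only the columns it actually has. This is plain Python, not the HBase client API. The function name `to_wide_rows` is hypothetical, and the dots in the two site names are restored on the assumption that the transcript stripped them.

```python
# The visit counters from the slide as flat (name, site, counter) rows.
ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot flat rows into one sparse row per name, keyed by column name."""
    table = {}
    for name, site, counter in rows:
        table.setdefault(name, {})[site] = counter
    return table


wide = to_wide_rows(ROWS)

if __name__ == "__main__":
    for row_key, columns in wide.items():
        print(row_key, columns)
```

Jane's row carries no Ebay column at all rather than a NULL, and a new site only adds a key to the rows that visited it. No shared table definition has to change.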
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram: SQL, Java, Results]
83 Software Group
Other SQL-like Hadoop Interfaces
- Cloudera Impala
- MapR Drill
- Aster
- Greenplum (Pivotal HD)
- Paraccel
- Hadapt
- Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME carries web logs into HDFS; SQOOP carries RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos - heterogeneous cluster manager
Tachyon - in-memory file system
Spark - memory-optimized distributed execution
Spark Streaming
MLbase, MLlib - Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
- Database servers: 64 cores, 576 GB RAM
- Storage servers: 112 cores; 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop, $/TB (hardware only)
- Exadata: $4911
- Hadoop: $750
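As a quick worked check of what the two chart values imply, the sketch below computes the cost ratio and scales both figures to a storage size. It assumes the $4911 figure belongs to Exadata and the $750 figure to Hadoop, which is the order in which the chart lists them. The 100 TB size and the function names are hypothetical.

```python
# Hardware cost per terabyte as read from the chart (USD per TB).
# Assumption: the values pair with the labels in the order shown.
COST_PER_TB = {"Exadata": 4911, "Hadoop": 750}


def cost_ratio(costs, expensive, cheap):
    """How many times more one platform costs per TB than the other."""
    return costs[expensive] / costs[cheap]


def hardware_cost(costs, platform, terabytes):
    """Hardware-only cost of storing the given number of terabytes."""
    return costs[platform] * terabytes


if __name__ == "__main__":
    print(round(cost_ratio(COST_PER_TB, "Exadata", "Hadoop"), 2))
    for platform in COST_PER_TB:
        # 100 TB is an arbitrary example size.
        print(platform, hardware_cost(COST_PER_TB, platform, 100))
```

These are hardware-only figures, as the chart title says; software licences and operations are outside the comparison.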
93 Software Group
Oracle Big Data Appliance
- 18 Sun X4270 M2 servers
  - 48 GB RAM per node (864 GB total)
  - 2x6 core CPU per node (216 cores total)
  - 12x2 TB HDD per node (216 spindles, 432 TB)
  - 40 Gb/s Infiniband between nodes
  - 10 Gb/s Ethernet to datacentre
- Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
- Cloudera Enterprise
- Oracle Enterprise R
- Oracle NoSQL
- Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
- Machine Learning: programs that evolve with "experience"
- Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
- Predictive Analytics: programs that extrapolate from existing data into the future
- Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Scatter plot with fitted trend line f(x) = 0.971521231456065 x + 0.71906459527154; x axis 0 to 120, y axis -20 to 120]
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistic regression
- CART
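The trend line on this slide is an ordinary least-squares fit, the first technique in the list. The sketch below shows how such a line could be computed and then used to extrapolate. The function names are hypothetical, and the slope and intercept in the usage lines are the slide's coefficients with their decimal points restored, which is an assumption about the original chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


# Coefficients as shown on the slide (decimal points assumed).
SLIDE_SLOPE = 0.971521231456065
SLIDE_INTERCEPT = 0.71906459527154

if __name__ == "__main__":
    # Made-up points lying exactly on y = 2x + 1, to show the fit recovering it.
    print(fit_line([0, 1, 2, 3], [1, 3, 5, 7]))
    print(predict(SLIDE_SLOPE, SLIDE_INTERCEPT, 100))
```

The other techniques in the list replace the straight line with a richer model, but the workflow of fitting on known points and then predicting on new ones stays the same.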
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
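As a minimal sketch of grouping data without predefined classes, the example below runs k-means (Lloyd's algorithm) on one-dimensional values. The slide does not name an algorithm, so k-means is a swapped-in choice for illustration. The basket totals, the starting centroids and the function name are invented.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm in one dimension with caller-supplied start centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if the cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    basket_totals = [5, 6, 7, 50, 52, 55]  # hypothetical spend per basket
    centres, groups = kmeans_1d(basket_totals, centroids=[10, 40])
    print(centres, groups)
```

No labels are supplied: the two groups of small and large baskets emerge from the data alone, which is the contrast with classification.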
121 Software Group
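Supervised Machine Learning
[Diagram: raw data from the existing business is cleaned and split into a training set and a validation set; modelling on the training set yields a candidate model, which is validated against the validation set to become the production model; new data from new business is fed to the production model to produce a prediction]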
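The workflow on this slide (training set, candidate model, validation set, production model, prediction) can be sketched with the simplest possible model, a single threshold on one feature. Everything here is hypothetical: the labelled records (support calls per month, whether the customer churned), the function names and the 50/50 split are invented for illustration and do not come from the talk.

```python
# Hypothetical labelled history: (support_calls_per_month, churned).
LABELLED = [
    (1, False), (7, True), (2, False), (9, True),
    (3, False), (8, True), (0, False), (6, True),
]


def split(records, train_fraction=0.5):
    """Split cleaned records into a training set and a validation set."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


def train_stump(training_set):
    """Candidate model: the threshold that best separates the training labels."""
    best_threshold, best_correct = None, -1
    for threshold, _ in training_set:
        correct = sum((x >= threshold) == label for x, label in training_set)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, dataset):
    """Validate: fraction of records the model labels correctly."""
    return sum((x >= threshold) == label for x, label in dataset) / len(dataset)


def predict(threshold, x):
    """Production use: score a new, unlabelled record."""
    return x >= threshold


if __name__ == "__main__":
    training_set, validation_set = split(LABELLED)
    model = train_stump(training_set)
    print(model, accuracy(model, validation_set), predict(model, 10))
```

The validation accuracy comes out lower than the training accuracy because the validation set was never seen during training. That is the reason for holding it back before promoting a candidate model to production.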
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics / Data Science
- Search optimization
- Recommendation systems
- Security: vulnerability, penetration detection
- Fraud detection
- CRM: churn, defaults
- Medical: risk analysis, diagnosis, prognosis
- Game optimization
- Advertising: targeting, tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Redo logs, Change Data Capture, JMS Queue, Hadoop Poster; targets: batched HDFS file copy, audit change data, HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete...
Key components to build end-to-end BI/Analytics solutions
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Offerings shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Offerings shown: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
- How could data and algorithms transform your business?
- What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
- Where is the data?
  - Start collecting now
139 Software Group
For your career
- Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
- Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
- Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
![Page 15: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/15.jpg)
15 Software Group
16 Software Group
17 Software Group
18 Software Group
19 Software Group
20 Software Group
21 Software Group
Generated internally
Key to operational efficiency
1993
Generated externally
Key to competitive advantage
Source of product innovation
Changing our lives
2013
Data means more
22 Software Group
Big Data is the culmination of cloud social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail
25 Software Group
Prevalence of Showrooming
Consumer Electronics
Home Improvement
0 10 20 30 40 50 60 70
Pct
Garter Research G00249458Survey Analysis Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not UnbeatablePublished 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover: $45.99
Amazon Kindle: $39.99
Say "screw you, bookseller" to buy Kindle version
Data Input
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
Brain Control
Muze
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads-up display
• Emotion/Attention monitor
The instrumented world
All of which accelerates what we call Big Data
Big Database technologies
Pioneers of Big Data
Google Software Architecture
[Diagram: layered stack]
Google Applications
Map Reduce, BigTable
Google File System (GFS)
Map Reduce
[Diagram: Start, fanning out to many parallel Map tasks whose output feeds a single Reduce]
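The Map Reduce pattern on this slide can be illustrated with a minimal single-process sketch. It is not Hadoop or Google code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for illustration, and a real framework would run the map calls in parallel across many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for one input split handled by one mapper.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big", "data revolution"]))
```

Because each `map_phase` call touches only its own split, the map step can be spread over as many workers as there are splits; only the shuffle and reduce steps need to see data from more than one split.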
Multi-stage Map-Reduce
[Diagram: data in HDFS is processed by a first stage of MAPPER tasks (scan, sort), then a second stage of MAPPER tasks (aggregate), then a REDUCE step that returns the result to the Client]
Schema on Read vs Schema on Write
[Diagram: two pipelines]
Schema on Write: Data, Extract, Transform (Code: Cleanse, Normalize, Aggregate), Load, Data Warehouse, Analyse, Utilize
Schema on Read: Data, Load, Hadoop, Code (Cleanse, Analyse), Utilize
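The difference between the two pipelines can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular warehouse or of Hadoop itself: the sample records, the field names (`user`, `site`, `hits`) and the function names are all hypothetical.

```python
import json

# Hypothetical raw events; only the first one matches the expected shape.
RAW_EVENTS = [
    '{"user": "dick", "site": "Ebay", "hits": "3"}',
    '{"user": "jane", "site": "Google"}',
    '{"user": "dick", "referrer": "Facebook"}',
]


def load_schema_on_write(lines):
    """Schema on write: validate and coerce every record before storing it."""
    table, rejected = [], []
    for line in lines:
        record = json.loads(line)
        try:
            table.append((record["user"], record["site"], int(record["hits"])))
        except (KeyError, ValueError):
            rejected.append(line)  # does not fit the schema, so it is not loaded
    return table, rejected


def load_schema_on_read(lines):
    """Schema on read: store the raw lines untouched at load time."""
    return list(lines)


def query_hits_by_user(raw_store):
    """Structure is applied only now, by the code that reads the data."""
    totals = {}
    for line in raw_store:
        record = json.loads(line)
        user = record["user"]
        totals[user] = totals.get(user, 0) + int(record.get("hits", 0))
    return totals


if __name__ == "__main__":
    print(load_schema_on_write(RAW_EVENTS))
    print(query_hits_by_user(load_schema_on_read(RAW_EVENTS)))
```

In the schema-on-write path, two of the three records never reach the table; in the schema-on-read path, all three are kept and each query decides how to interpret them.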
Hadoop: Open Source Map-Reduce Stack
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log loader)
• Oozie (Workflow manager)
Hadoop 1.0 Architecture
[Diagram]
HADOOP CLIENT (JAVA, PIG, HIVE)
MAP REDUCE (DISTRIBUTED PROCESSING): JOB TRACKER
HDFS (DISTRIBUTED STORAGE): NAME NODE, SECONDARY NAME NODE
Worker nodes (15 shown), each running a DATA NODE and a TASK TRACKER
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram]
HADOOP CLIENT (JAVA, PIG, HIVE)
RESOURCE MANAGER
APPLICATION MASTER
Three NODE MANAGERs, each hosting a CONTAINER
Tez (Hindi for "fast")
[Diagram: MAP and REDUCE steps over HDFS, shown once as three separate jobs (Job 1, Job 2, Job 3) and once as a single job (Job 1)]
HBase
A real-time database built on Hadoop
[Diagram: two storage stacks side by side]
Oracle: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks
HBase: Tables, MemStore, WA Log, HFiles, HDFS, Disks
HBase Data Model

Denormalized table:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Normalized tables:
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase rows (one wide, sparse row per name):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
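The wide, sparse row layout can be mimicked with a dictionary of dictionaries. This is a toy stand-in rather than the HBase client API: `put` and `get` here are hypothetical helpers, the name is used directly as the row key for brevity, and real HBase additionally has column families, timestamps and byte-array values.

```python
# Toy model of a wide-column table: row key -> {column name: value}.
rows = {}


def put(row_key, column, value):
    """Add or overwrite one cell; a column exists only in rows that set it."""
    rows.setdefault(row_key, {})[column] = value


def get(row_key, column, default=None):
    """Read one cell; missing cells cost no storage and return the default."""
    return rows.get(row_key, {}).get(column, default)


# The counters from the slide, loaded one cell at a time.
visits = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]
for name, site, counter in visits:
    put(name, site, counter)

if __name__ == "__main__":
    print(rows["Dick"])
    print(get("Jane", "Ebay"))
```

Each row carries only the columns it actually uses, so adding a new site means adding a new column to one row, with no schema change and no join at read time.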
Hive
[Diagram labels: SQL, JAVA, RESULTS]
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
Pig
Pig Latin
SQL or Hive QL
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
Berkeley Data Analytic Stack (BDAS)
• Yarn, EC2
• Mesos - heterogeneous cluster manager
• Tachyon - in-memory file system
• Spark - memory-optimized distributed execution
• Spark Streaming
• MLbase, MLlib - machine learning
• Map Reduce
• Shark (SQL), Hive (SQL)
• BlinkDB
Meanwhile, back at the Death Star
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
Economies
Exadata vs Hadoop: $/TB (hardware only)
[Bar chart, axis $0 to $6000]
Exadata: $4911
Hadoop: $750
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  - 48 GB RAM per node (864 GB total)
  - 2x6-core CPU per node (216 total)
  - 12x2 TB HDD per node (216 spindles, 864 TB)
  - 40 Gb/s InfiniBand between nodes
  - 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
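The rack totals follow from the per-node figures by simple multiplication, which the short sketch below works through. The constant names are made up for illustration; the inputs are the per-node numbers on the slide. Under those inputs the RAM, core and spindle totals match the slide, while the raw disk capacity works out to 432 TB rather than the 864 TB shown.

```python
# Per-node figures as listed on the slide.
NODES = 18
RAM_GB_PER_NODE = 48
CORES_PER_NODE = 2 * 6      # two 6-core CPUs
DISKS_PER_NODE = 12
TB_PER_DISK = 2

# Rack-level totals derived from the per-node figures.
total_ram_gb = NODES * RAM_GB_PER_NODE
total_cores = NODES * CORES_PER_NODE
total_spindles = NODES * DISKS_PER_NODE
raw_disk_tb = total_spindles * TB_PER_DISK

if __name__ == "__main__":
    print(f"RAM: {total_ram_gb} GB")
    print(f"Cores: {total_cores}")
    print(f"Spindles: {total_spindles}")
    print(f"Raw disk: {raw_disk_tb} TB")
```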
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
Generating competitive advantage through "Big Data analytics"
Big Data Analytics (AKA Data Science)
• Machine Learning: programs that evolve with "experience"
• Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
• Predictive Analytics: programs that extrapolate from existing data into the future
Collective Intelligence
Google Flu Trends
Collective Intelligence outsmarts Artificial Intelligence
Artificial Intelligence Strikes back
Watson is big data AI
Predictive Analytics
[Scatter plot with fitted trend line; x-axis 0 to 120, y-axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
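A trend line of the form f(x) = slope * x + intercept, like the one on this slide, is what ordinary least squares produces for a single input variable. The sketch below is a minimal pure-Python version; `fit_line` and `predict` are illustrative names, and the sample points are made up rather than taken from the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Extrapolate from the fitted line to a new x value."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up points lying exactly on y = 2x + 1.
    model = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
    print(model)
    print(predict(model, 10))
```

The same two-step shape, fit on existing data and then predict for new inputs, carries over to the other techniques in the list; only the model being fitted changes.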
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
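Grouping data without predefined labels can be illustrated with k-means, one common clustering algorithm. The slide does not name an algorithm, so k-means is an assumption here, as are the function name, the starting centroids and the sample points (imagine two numeric features per customer).

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points with caller-supplied starting centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c
            else centroids[i]  # keep an empty cluster's centroid where it was
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    # Made-up points forming two visible groups.
    data = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
    centres, groups = kmeans(data, [(0, 0), (10, 10)])
    print(centres)
    print(groups)
```

No labels are supplied at any point; the two groups emerge purely from distances between points, which is what distinguishes clustering from the classification case above it.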
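Supervised Machine Learning
[Diagram: Raw Data from Existing Business is cleaned and split into a Training Set and a Validation Set. The Training Set produces a Candidate Model, which is validated against the Validation Set and becomes the Production Model. The Production Model takes New Data from New Business and outputs a Prediction.]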
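The train, validate and promote flow can be sketched end to end with a deliberately simple model. Everything specific here is an assumption for illustration: the records (support calls per month, churned or not), the threshold model, the 0.9 promotion bar and the function names. The split is by position so that the example is repeatable; a real pipeline would normally shuffle first.

```python
def split(records, every=4):
    """Hold out every Nth record for validation; train on the rest."""
    validation = records[::every]
    training = [r for i, r in enumerate(records) if i % every != 0]
    return training, validation


def train(training_set):
    """Candidate model: a threshold halfway between the two class means."""
    positives = [x for x, label in training_set if label]
    negatives = [x for x, label in training_set if not label]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def predict(threshold, x):
    """Classify a new observation with the fitted threshold."""
    return x > threshold


def accuracy(threshold, validation_set):
    """Share of held-out records the model labels correctly."""
    hits = sum(predict(threshold, x) == label for x, label in validation_set)
    return hits / len(validation_set)


# Made-up existing business data: (support calls per month, churned?).
existing = [(calls, calls >= 5) for calls in range(10)]
training_set, validation_set = split(existing)
candidate = train(training_set)
score = accuracy(candidate, validation_set)
# Promote the candidate to production only if it validates well enough.
production_model = candidate if score >= 0.9 else None

if __name__ == "__main__":
    print(candidate, score)
    if production_model is not None:
        print(predict(production_model, 7))  # prediction for a new customer
```

The validation records are never seen during training, so the score is an estimate of how the model will behave on new business rather than a measure of how well it memorized existing business.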
Unsupervised learning
Inmaps.linkedin.com
Big Data Analytics / Data Science
• Search Optimization
• Recommendation Systems
• Security
  - Vulnerability
  - Penetration Detection
• Fraud Detection
• CRM
  - Churn
  - Defaults
• Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
• Game optimization
• Advertising
  - Targeting
  - Tailoring
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
Data Scientists to the rescue
Kitenga Analytics Suite
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
SharePlex® for Hadoop
[Diagram: Redo logs, Change Data Capture, JMS Queue, Hadoop Poster; outputs are batched HDFS file copy, audit change data, and HBase real-time replication]
Toad BI Suite
Dell's offering was not complete...
Key components to build end-to-end BI/Analytics solutions: Data Integration, Database Management, Advanced Analytics, Business Intelligence, Server and Storage
Dell products shown: TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions: Data Integration, Database Management, Advanced Analytics, Business Intelligence, Server and Storage
Dell products shown: TOAD & Shareplex, TOAD BI, Boomi, Kitenga, STATISTICA
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
Data Visualization
Live scoring - integration into operational systems
Industry and cross-industry packaged solutions
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
• Where is the data?
  - Start collecting now
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
![Page 17: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/17.jpg)
17 Software Group
18 Software Group
19 Software Group
20 Software Group
21 Software Group
Generated internally
Key to operational efficiency
1993
Generated externally
Key to competitive advantage
Source of product innovation
Changing our lives
2013
Data means more
22 Software Group
Big Data is the culmination of cloud social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail
25 Software Group
Prevalence of Showrooming
Consumer Electronics
Home Improvement
0 10 20 30 40 50 60 70
Pct
Garter Research G00249458Survey Analysis Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not UnbeatablePublished 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 1.0 Architecture
[Diagram: a Hadoop client (Java, Pig, Hive) talks to the Job Tracker (Map Reduce, distributed processing) and to the Name Node and Secondary Name Node (HDFS, distributed storage); every worker node runs a Data Node and a Task Tracker]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: a Hadoop client (Java, Pig, Hive), a Resource Manager, an Application Master, and Node Managers each hosting containers]
77 Software Group
Tez (Hindi for “fast”)
[Diagram: chained map and reduce stages run as three separate jobs (Job 1, Job 2, Job 3) reading and writing HDFS, compared with the same stages run as a single job (Job 1)]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram: Oracle components (tables, buffer cache, log buffer, redo, datafiles, ASM, disks) shown beside their HBase counterparts (tables, MemStore, WA Log, HFiles, HDFS, disks)]
79 Software Group
Hbase Data Model

Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
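The wide, sparse rows in the last two tables can be sketched with nested dictionaries. This is a minimal illustration of the idea only, not the HBase client API; `to_wide_rows` is a hypothetical helper name and the rows are the (Name, Site, Counter) values from the slide.

```python
from collections import defaultdict

# (Name, Site, Counter) rows from the slide.
narrow_rows = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]

def to_wide_rows(rows):
    """Pivot (row key, column, value) triples into one sparse row per key."""
    wide = defaultdict(dict)
    for name, site, counter in rows:
        wide[name][site] = counter  # a column exists only in rows that use it
    return dict(wide)

wide_rows = to_wide_rows(narrow_rows)
for name, columns in wide_rows.items():
    print(name, columns)
```

Each row carries only the columns it actually has, so Jane's row has no Ebay column at all rather than a NULL.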
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram: SQL, Java, Results]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: Flume loads web logs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
• YARN, EC2
• Mesos - heterogeneous cluster manager
• Tachyon - in-memory file system
• Spark - memory-optimized distributed execution
• Spark Streaming
• MLbase, MLlib - machine learning
• Map Reduce
• Shark (SQL), Hive (SQL)
• BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop $/TB (hardware only)
• Exadata: $4,911
• Hadoop: $750
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  - 48 GB RAM per node (864 GB total)
  - 2x6-core CPU per node (216 total)
  - 12x2 TB HDD per node (216 spindles, 864 TB)
  - 40 Gb/s InfiniBand between nodes
  - 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through “Big Data analytics”
• Machine Learning: programs that evolve with “experience”
• Collective Intelligence: programs that use inputs from “crowds” to seem intelligent
• Predictive Analytics: programs that extrapolate from existing data into the future
Big Data Analytics, AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: scatter plot with a fitted trend line; both axes run from roughly 0 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
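A trend line of the form f(x) = slope · x + intercept can be fitted with ordinary least squares. The sketch below is an illustration only: the observations are made up and are not the data behind the chart, and `fit_line` and `predict` are hypothetical names.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept

# Hypothetical observations, not the data behind the slide's chart.
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 19, 41, 58, 82, 99]

slope, intercept = fit_line(xs, ys)
print(round(slope, 3), round(intercept, 3))
print(round(predict(slope, intercept, 120), 1))
```

The last line is the "predictive" part: the fitted line is evaluated at x = 120, beyond the range of the observed data.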
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
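Grouping without predefined classes can be sketched with k-means. The slide does not name an algorithm, so k-means is used here only as one common choice; the customer data, the starting centroids and the function names are all hypothetical.

```python
def assign(points, centroids):
    """Index of the nearest centroid for each point (squared Euclidean distance)."""
    labels = []
    for px, py in points:
        distances = [(px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centroids]
        labels.append(distances.index(min(distances)))
    return labels

def update(points, labels, centroids):
    """Move each centroid to the mean of its members; keep it if it has none."""
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == i]
        if members:
            new_centroids.append((sum(x for x, _ in members) / len(members),
                                  sum(y for _, y in members) / len(members)))
        else:
            new_centroids.append(old)
    return new_centroids

def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and update until the labels stop changing."""
    labels = assign(points, centroids)
    for _ in range(max_iterations):
        centroids = update(points, labels, centroids)
        new_labels = assign(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

# Hypothetical customers: (visits per month, average basket value).
customers = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (8, 78)]
labels, centroids = k_means(customers, centroids=[(0, 0), (10, 100)])
print(labels)
print(centroids)
```

No labels are supplied up front; the two groups emerge from the distances between the points alone.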
121 Software Group
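Supervised Machine Learning
[Diagram: raw data is cleaned and split into a training set and a validation set; the training set produces a candidate model, which is validated and becomes the production model; the production model turns new data (new business, existing business) into predictions]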
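The supervised learning flow (split, train a candidate, validate, promote, predict) can be sketched as follows. This is an illustration under stated assumptions: the churn history is made up, the "model" is just a threshold between the two class means, and the 0.9 accuracy bar is an arbitrary choice.

```python
import random

def split(rows, validation_fraction=0.25, seed=42):
    """Shuffle labelled rows and split them into training and validation sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]

def train(training_set):
    """Candidate model: a threshold halfway between the two class means."""
    positives = [x for x, label in training_set if label]
    negatives = [x for x, label in training_set if not label]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2

def predict(threshold, x):
    """Predict churn when the feature exceeds the learned threshold."""
    return x > threshold

def accuracy(threshold, rows):
    """Fraction of rows whose prediction matches the known label."""
    return sum(predict(threshold, x) == label for x, label in rows) / len(rows)

# Hypothetical labelled history: (support calls last quarter, customer churned?).
history = [(0, False), (1, False), (2, False), (1, False), (3, False), (2, False),
           (7, True), (8, True), (6, True), (9, True), (7, True), (8, True)]

training_set, validation_set = split(history)
candidate = train(training_set)
score = accuracy(candidate, validation_set)
if score >= 0.9:  # promote to production only if validation passes
    production_model = candidate
    print(predict(production_model, 9), predict(production_model, 0))
```

The validation set is held back from training so that the accuracy score says something about data the model has not seen.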
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics / Data Science
• Search optimization
• Recommendation systems
• Security: vulnerability, penetration detection
• Fraud detection
• CRM: churn, defaults
• Medical: risk analysis, diagnosis, prognosis
• Game optimization
• Advertising: targeting, tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: change data capture from the redo logs flows through a JMS queue to the Hadoop poster, which feeds batched HDFS file copy, audit change data, and real-time HBase replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell’s offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga, STATISTICA
Dell + StatSoft = completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
• Where is the data?
  - Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It’s not enough to lay out products on tables
- There’s a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through “Big Data analytics”
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell’s offering was not complete…
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
18 Software Group
19 Software Group
20 Software Group
21 Software Group
Data means more
1993:
• Generated internally
• Key to operational efficiency
2013:
• Generated externally
• Key to competitive advantage
• Source of product innovation
• Changing our lives
22 Software Group
Big Data is the culmination of cloud, social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail?
25 Software Group
Prevalence of Showrooming
[Bar chart: percentages (axis 0 to 70) for Consumer Electronics and Home Improvement]
Source: Gartner Research G00249458, “Survey Analysis: Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not Unbeatable”, published 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
• Shelf assortment optimization
• In-store offers
• Customer entertainment
• Checkout anywhere
• Relationship management
• Customer analytics
33 Software Group
34 Software Group
Why showrooming?
• Selection
• Stock
• Faster
• Cheaper
Defences
• Dynamic pricing
• Predictive ordering
• Assortment optimization
• Predictive recommendations
• Personalization
35 Software Group
It’s not enough to lay out products on tables
• Online has significant advantages
• Retailers can only survive by embracing online and emulating online practices
  - Dynamic pricing
  - Shelf optimization
  - Personalized service and selection
• Only big data analytics can provide these advantages
36 Software Group
There’s a similar story in every industry
• Web
• Transport
• Power Grid
• Dating
• Retail
• Security
• Finance
• Government
• Science
• Healthcare
• Insurance
• Telecom
• Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality: German
Don’t Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
• Amazon softcover: $45.99
• Amazon Kindle: $39.99
Say “screw you, bookseller” to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
“Siri, call me an ambulance”
From now on I’ll call you ‘An Ambulance’. OK?
“I want to jump off a bridge”
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
![Page 19: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/19.jpg)
19 Software Group
20 Software Group
21 Software Group
Generated internally
Key to operational efficiency
1993
Generated externally
Key to competitive advantage
Source of product innovation
Changing our lives
2013
Data means more
22 Software Group
Big Data is the culmination of cloud social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail
25 Software Group
Prevalence of Showrooming
Consumer Electronics
Home Improvement
0 10 20 30 40 50 60 70
Pct
Garter Research G00249458Survey Analysis Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not UnbeatablePublished 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  - 48 GB RAM per node (864 GB total)
  - 2 x 6-core CPU per node (216 total)
  - 12 x 2 TB HDD per node (216 spindles, 864 TB)
  - 40 Gb/s InfiniBand between nodes
  - 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
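The rack totals can be checked from the per-node figures. The short sketch below uses only numbers from the slide; the constant and function names are made up for the example. The RAM, core and spindle totals reproduce the slide's 864 GB, 216 and 216, while 216 disks of 2 TB come to 432 TB, so the 864 TB figure does not follow from the per-node numbers as transcribed.

```python
NODES = 18
RAM_GB_PER_NODE = 48
CORES_PER_NODE = 2 * 6      # two 6-core CPUs
DISKS_PER_NODE = 12
TB_PER_DISK = 2


def rack_totals():
    """Multiply the per-node figures out to whole-rack totals."""
    return {
        "ram_gb": NODES * RAM_GB_PER_NODE,
        "cores": NODES * CORES_PER_NODE,
        "spindles": NODES * DISKS_PER_NODE,
        "raw_disk_tb": NODES * DISKS_PER_NODE * TB_PER_DISK,
    }


if __name__ == "__main__":
    print(rack_totals())
```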
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
Machine Learning: Programs that evolve with "experience"
Collective Intelligence: Programs that use inputs from "crowds" to seem intelligent
Predictive Analytics: Programs that extrapolate from existing data into the future
Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Scatter plot with a fitted trend line; x axis from 0 to 120, y axis from -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
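A trend line like the f(x) on this slide comes from an ordinary least-squares fit. The sketch below shows the standard closed-form calculation for one variable; the function names and the sample points are invented for illustration and are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y over variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


if __name__ == "__main__":
    slope, intercept = fit_line([0, 10, 20, 30], [1, 21, 41, 61])
    print(slope, intercept, predict(40, slope, intercept))
```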
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
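As a rough illustration of "a model that classifies new data", the sketch below trains a nearest-centroid classifier on labelled examples and applies it to an unseen customer. The technique is chosen for brevity and is not taken from the slides; the feature pair (support calls, months as customer) and the "churn"/"stay" labels are hypothetical.

```python
def train_centroids(samples):
    """samples: list of (features, label). Returns the mean feature vector per label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(features, centroids):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(features, centroids[label]))


# Hypothetical training data: (support calls, months as customer) -> label.
TRAINING = [
    ((8, 3), "churn"),
    ((6, 5), "churn"),
    ((1, 30), "stay"),
    ((0, 24), "stay"),
]

if __name__ == "__main__":
    model = train_centroids(TRAINING)
    print(classify((5, 6), model), classify((1, 20), model))
```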
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
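Grouping without a pre-existing scheme can be sketched with k-means (Lloyd's algorithm), one common clustering method; the slides do not name a specific algorithm. The points below are invented two-dimensional "baskets" (for example, spend in two product categories), and the starting centroids are supplied by the caller so the result is deterministic.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points with caller-supplied starting centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical baskets: (spend in category A, spend in category B).
POINTS = [(1, 1), (2, 1), (1, 2), (9, 9), (8, 9), (9, 8)]

if __name__ == "__main__":
    centres, groups = kmeans(POINTS, [(0, 0), (10, 10)])
    print(centres)
    print(groups)
```

No labels are given to the algorithm; the two groups emerge from the distances alone.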
121 Software Group
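Supervised Machine Learning
[Diagram: Raw Data is cleaned and split into a Training Set and a Validation Set; the Training Set is modelled to give a Candidate Model, which is validated against the Validation Set and becomes the Production Model; New Data (New Business, Existing Business) goes into the Production Model to give a Prediction]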
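The workflow in this diagram (train on one set, validate on another, promote only a model that holds up) can be sketched in a few lines. Everything here is an assumption made for illustration: the one-feature threshold model, the 0.8 accuracy bar, and the sample rows; it also assumes the positive class has the larger feature values.

```python
def split(rows, train_fraction=0.75):
    """Split cleaned rows into a training set and a validation set."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def train_threshold_model(training_set):
    """Candidate model: midpoint between the two class means of a single feature."""
    pos = [x for x, label in training_set if label]
    neg = [x for x, label in training_set if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, rows):
    """Fraction of rows where 'feature above threshold' matches the label."""
    hits = sum((x > threshold) == label for x, label in rows)
    return hits / len(rows)


def promote(threshold, validation_set, minimum=0.8):
    """Only a candidate that scores well on unseen data becomes the production model."""
    return threshold if accuracy(threshold, validation_set) >= minimum else None


# Hypothetical cleaned rows: (feature value, label).
CLEAN_ROWS = [(1, False), (9, True), (2, False), (8, True),
              (3, False), (7, True), (2, False), (9, True)]

if __name__ == "__main__":
    training, validation = split(CLEAN_ROWS)
    candidate = train_threshold_model(training)
    production = promote(candidate, validation)
    print(candidate, production, 6 > production)  # last value: prediction for new data
```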
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
• Search Optimization
• Recommendation Systems
• Security
  - Vulnerability
  - Penetration Detection
• Fraud Detection
• CRM
  - Churn
  - Defaults
• Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
• Game optimization
• Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Redo logs, Change Data Capture, JMS Queue, Hadoop Poster; outputs: batched HDFS file copy, audit change data, HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Dell + StatSoft completes a strong, end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
• Where is the data?
  - Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data?
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud, social and mobile
- Not all upside
- Will Big Data kill retail?
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming?
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop: Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile, back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 20: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/20.jpg)
20 Software Group
21 Software Group
Data means more
1993: Generated internally; key to operational efficiency
2013: Generated externally; key to competitive advantage; source of product innovation; changing our lives
22 Software Group
Big Data is the culmination of cloud, social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail?
25 Software Group
Prevalence of Showrooming
[Bar chart: prevalence of showrooming by category (Consumer Electronics, Home Improvement); axis in Pct, 0 to 70]
Source: Gartner Research G00249458, "Survey Analysis: Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not Unbeatable", published 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
• Shelf assortment optimization
• In-store offers
• Customer entertainment
• Checkout anywhere
• Relationship management
• Customer analytics
33 Software Group
34 Software Group
Why showrooming?
Selection, Stock, Faster, Cheaper
Defences: Dynamic Pricing, Predictive ordering, Assortment optimization, Predictive recommendations, Personalization
35 Software Group
It's not enough to lay out products on tables
• Online has significant advantages
• Retailers can only survive by embracing online and emulating online practices
  - Dynamic pricing
  - Shelf optimization
  - Personalized service and selection
• Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality: German
Don't mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover $45.99
Amazon Kindle $39.99
Say "screw you bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads-up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Map Reduce
[Diagram: from Start, a large number of Map tasks run in parallel and feed a single Reduce step]
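The map and reduce phases can be imitated on a single machine to show the shape of the computation. This is a minimal sketch, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`) and the word-count task are assumptions chosen for the example, and each input string stands in for a split that a separate mapper would process.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(mapped_pairs):
    """Shuffle: group all values emitted for the same key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    # In a cluster the map calls would run in parallel, one per split.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["big data big", "data revolution"]))
```

Because each map call looks only at its own split, the map work can be spread across as many machines as there are splits; only the grouped results need to be brought together for the reduce.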
68 Software Group
Multi-stage Map-Reduce
[Diagram: data read from HDFS passes through a first stage of mappers (SCAN, SORT), a second stage of mappers (AGGREGATE), and a final REDUCE that returns the result to the Client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram. Schema on Write: Data; Extract, Transform, Load (Code: Cleanse, Normalize, Aggregate, Analyse); Data Warehouse; Utilize. Schema on Read: Data; Load; Hadoop; Code (Cleanse, Analyse); Utilize]
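The difference between the two approaches can be sketched with a handful of raw records. In this illustration (the record layout, field names and function names are all invented for the example), schema on write converts and validates records before storing them, while schema on read stores the raw lines and applies structure only at query time.

```python
import json

# Hypothetical raw events; the second one lacks the "hits" field.
RAW_EVENTS = [
    '{"user": "dick", "site": "ebay", "hits": "3"}',
    '{"user": "jane", "site": "google"}',
]


def load_schema_on_write(raw_lines):
    """Schema on write: validate and convert before storing; bad rows are rejected."""
    table, rejected = [], []
    for line in raw_lines:
        record = json.loads(line)
        try:
            table.append((record["user"], record["site"], int(record["hits"])))
        except KeyError:
            rejected.append(line)
    return table, rejected


def load_schema_on_read(raw_lines):
    """Schema on read: store the raw lines untouched."""
    return list(raw_lines)


def query_hits(stored_lines):
    """Structure is applied only when the data is read."""
    result = {}
    for line in stored_lines:
        record = json.loads(line)
        result[record["user"]] = int(record.get("hits", 0))
    return result


if __name__ == "__main__":
    print(load_schema_on_write(RAW_EVENTS))
    print(query_hits(load_schema_on_read(RAW_EVENTS)))
```

With schema on read nothing is lost at load time, so a later query can decide for itself how to treat the incomplete record.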
70 Software Group
Hadoop: Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4,000 nodes
• 16 PB disk
• 64 TB of RAM
• 32,000 cores
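Dividing the cluster totals by the node count shows how modest each individual machine is. The sketch uses only the four figures from the slide; the names are made up for the example and decimal units are assumed (1 PB = 1,000 TB, 1 TB = 1,000 GB).

```python
NODES = 4_000
DISK_PB = 16
RAM_TB = 64
CORES = 32_000


def per_node():
    """Average resources per node, assuming decimal units."""
    return {
        "disk_tb": DISK_PB * 1_000 / NODES,
        "ram_gb": RAM_TB * 1_000 / NODES,
        "cores": CORES / NODES,
    }


if __name__ == "__main__":
    print(per_node())
```

That works out to roughly 4 TB of disk, 16 GB of RAM and 8 cores per node.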
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); MAP REDUCE (distributed processing) with the JOB TRACKER; HDFS (distributed storage) with the NAME NODE and SECONDARY NAME NODE; many worker nodes, each running a DATA NODE and a TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; three NODE MANAGERs, each hosting a CONTAINER]
![Page 21: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/21.jpg)
21 Software Group
Generated internally
Key to operational efficiency
1993
Generated externally
Key to competitive advantage
Source of product innovation
Changing our lives
2013
Data means more
22 Software Group
Big Data is the culmination of cloud social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail
25 Software Group
Prevalence of Showrooming
Consumer Electronics
Home Improvement
0 10 20 30 40 50 60 70
Pct
Garter Research G00249458Survey Analysis Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not UnbeatablePublished 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
• Search Optimization
• Recommendation Systems
• Security: Vulnerability, Penetration Detection
• Fraud Detection
• CRM: Churn, Defaults
• Medical: Risk analysis, Diagnosis, Prognosis
• Game optimization
• Advertising: Targeting, Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Redo logs → Change Data Capture → JMS Queue → Hadoop Poster → Batched HDFS File Copy, Audit Change Data, HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell offerings: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell offerings: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
22 Software Group
Big Data is the culmination of cloud social and mobile
23 Software Group
Not all upside
24 Software Group
Will Big Data kill retail
25 Software Group
Prevalence of Showrooming
[Chart: prevalence of showrooming (Pct, axis 0 to 70) for Consumer Electronics and Home Improvement]
Source: Gartner Research G00249458, "Survey Analysis: Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not Unbeatable", published 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
• Shelf assortment optimization
• In-store offers
• Customer entertainment
• Checkout anywhere
• Relationship management
• Customer analytics
33 Software Group
34 Software Group
Why showrooming
• Selection
• Stock
• Faster
• Cheaper
Defences
• Dynamic pricing
• Predictive ordering
• Assortment optimization
• Predictive recommendations
• Personalization
35 Software Group
It's not enough to lay out products on tables
• Online has significant advantages
• Retailers can only survive by embracing online and emulating online practices
  – Dynamic pricing
  – Shelf optimization
  – Personalized service and selection
• Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Don't Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
• Amazon softcover: $45.99
• Amazon Kindle: $39.99
Say "screw you bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads-up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
[Diagram: Google Applications on top of Map Reduce and BigTable, on top of Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: Start → many Map tasks running in parallel → Reduce]
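To make the map and reduce steps concrete, here is a single-process sketch of the classic word count. It only imitates the programming model; a real framework such as Hadoop would run the map calls in parallel across many nodes. The function names are illustrative.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(lines):
    # Every line could be mapped on a different machine; here we just loop
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(map_reduce(["big data is big", "data means more"]))
```

Because each map call sees only its own line, the work divides cleanly across as many mappers as there are input splits.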
68 Software Group
Multi-stage Map-Reduce
[Diagram: data in HDFS passes through successive stages of mappers (scan/sort, then aggregate) before a final reduce returns results to the client]
69 Software Group
Schema on Read vs Schema on Write
Schema on Write: Data → Extract, Transform, Load (cleanse, normalize, aggregate, code) → Data Warehouse → Analyse → Utilize
Schema on Read: Data → Load → Hadoop → Cleanse, Code, Analyse → Utilize
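A small sketch of the difference, assuming hypothetical JSON event lines: schema on write validates and converts records while loading, whereas schema on read stores the raw lines and imposes structure only when a query runs.

```python
import json

# Hypothetical raw input; the second and third lines do not fit the schema
RAW_EVENTS = [
    '{"user": "dick", "site": "Ebay", "hits": "3"}',
    '{"user": "jane", "site": "Google"}',
    'not even json',
]


def load_schema_on_write(lines):
    """Validate and convert while loading; rows that do not fit are rejected."""
    table, rejected = [], []
    for line in lines:
        try:
            record = json.loads(line)
            table.append((record["user"], record["site"], int(record["hits"])))
        except (ValueError, KeyError):
            rejected.append(line)
    return table, rejected


def load_schema_on_read(lines):
    """Store everything untouched; no structure is imposed yet."""
    return list(lines)


def hits_per_user(raw_lines):
    """Apply a schema only at query time, tolerating missing fields."""
    totals = {}
    for line in raw_lines:
        try:
            record = json.loads(line)
        except ValueError:
            continue  # unreadable for this query, but still kept in storage
        if not isinstance(record, dict) or "user" not in record:
            continue
        user = record["user"]
        totals[user] = totals.get(user, 0) + int(record.get("hits", 0))
    return totals


table, rejected = load_schema_on_write(RAW_EVENTS)
print("schema on write:", table, "rejected:", len(rejected))
print("schema on read:", hits_per_user(load_schema_on_read(RAW_EVENTS)))
```

The trade-off is that the schema-on-read store keeps data a later query might interpret differently, at the cost of doing the cleansing work on every read.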
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: a Hadoop client (Java, Pig, Hive) works with the Job Tracker (Map Reduce, distributed processing) and the Name Node (HDFS, distributed storage), backed by a Secondary Name Node; each worker machine runs a Data Node and a Task Tracker]
76 Software Group
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
[Diagram: a Hadoop client (Java, Pig, Hive) talks to the Resource Manager; each worker runs a Node Manager hosting containers, one of which runs the Application Master]
77 Software Group
Tez (Hindi for "fast")
[Diagram: a chain of three Map-Reduce jobs (Job 1, Job 2, Job 3) reading from and writing to HDFS, contrasted with the same map and reduce steps run as a single job (Job 1)]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing storage layers]
Oracle: Disks, ASM, Datafiles, Buffer Cache, Redo, Log Buffer, Tables
HBase: Disks, HDFS, HFile, MemStore, WA Log, Tables
79 Software Group
Hbase Data Model

Source data:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Relational (normalized) model:
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase model (one row per name, one column per site visited):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
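The wide-row idea can be sketched with nested dictionaries: one row per name, with a column created only for the sites that person actually visited. This mimics the shape of the data, not HBase's client API; the function name is hypothetical.

```python
from collections import defaultdict

# The (name, site, counter) rows from the slide
VISITS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot relational rows into one sparse, wide row per name."""
    wide = defaultdict(dict)
    for name, site, counter in rows:
        wide[name][site] = counter  # the column springs into existence on write
    return dict(wide)


rows = to_wide_rows(VISITS)
print(rows["Dick"])
print(rows["Jane"])
print("Jane has an Ebay column:", "Ebay" in rows["Jane"])
```

Each row carries its own set of columns, so a site nobody has visited costs nothing and no join is needed to read everything about one person.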
80 Software Group
Hive
81 Software Group
82 Software Group
SQL → JAVA → RESULTS
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP connects RDBMS tables (CUSTOMERS, PRODUCTS) with HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase, MLlib – Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
• Database servers: 64 cores, 576 GB RAM
• Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop, $/TB (hardware only)
[Bar chart, axis $0 to $6000]
• Exadata: $4911
• Hadoop: $750
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48 GB RAM per node (864 GB total)
  – 2x6 core CPU per node (216 total)
  – 12x2 TB HDD per node (216 spindles, 864 TB)
  – 40 Gb/s InfiniBand between nodes
  – 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlexreg for Hadoop
- Toad BI Suite
- Slide 131
- Dellrsquos offering was not completehellip
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring ndash integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
-
24 Software Group
Will Big Data kill retail?
25 Software Group
Prevalence of Showrooming
[Bar chart: showrooming prevalence in Pct (axis 0 to 70) for Consumer Electronics and Home Improvement]
Source: Gartner Research G00249458, "Survey Analysis: Focus on Customer Basics to Challenge Amazon as Showrooming Is Universal but Not Unbeatable", published 12 February 2013
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
- Shelf assortment optimization
- In store offers
- Customer entertainment
- Checkout anywhere
- Relationship management
- Customer analytics
33 Software Group
34 Software Group
Why showrooming?
- Selection
- Stock
- Faster
- Cheaper
Defences
- Dynamic pricing
- Predictive ordering
- Assortment optimization
- Predictive recommendations
- Personalization
35 Software Group
It's not enough to lay out products on tables
- Online has significant advantages
- Retailers can only survive by embracing online and emulating online practices
  - Dynamic pricing
  - Shelf optimization
  - Personalized service and selection
- Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality: German
Don't Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover $45.99
Amazon Kindle $39.99
Say "screw you bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
- Bluetooth Personal Area Network
- 3G/WiFi Wide Area Network
- GPS
- Storage
- Pulse, temp monitor
- Silent alarms
- Pedometer, sleep monitoring
- Compass
- Camera
- Mike/earphones
- Heads up display
- Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
[Layer diagram: Google Applications; Map Reduce and BigTable; Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: Start, then many Map tasks running in parallel, then Reduce]
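The map-then-reduce flow on this slide can be illustrated with a small in-memory sketch. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample lines are assumptions chosen for illustration, and everything runs in one process rather than across a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    # Each line could be mapped independently (and so in parallel).
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical input lines.
counts = word_count(["big data", "Big Database technologies", "data input"])
```

Because each map call sees only its own line, the map work can be spread over many machines; only the grouped values need to meet at the reduce step.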
68 Software Group
Multi-stage Map-Reduce
[Diagram: HDFS; multiple MAPPER tasks in successive stages labelled SCAN, SORT and AGGREGATE; REDUCE; Client]
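A multi-stage job simply feeds the output of one map-reduce pass into the next. The sketch below shows the idea in memory; `run_stage`, the event data and the two stages are hypothetical and stand in for whatever the real pipeline would compute.

```python
from collections import defaultdict


def run_stage(records, mapper, reducer):
    """Run one map-reduce stage in memory: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [reducer(key, values) for key, values in sorted(groups.items())]


# Hypothetical input: (user, site) page-view events.
events = [("dick", "ebay"), ("jane", "google"), ("dick", "ebay"), ("dick", "google")]

# Stage 1: count views per (user, site).
stage1 = run_stage(
    events,
    mapper=lambda e: [((e[0], e[1]), 1)],
    reducer=lambda key, values: (key, sum(values)),
)

# Stage 2: consume stage 1's output and find each user's most viewed site.
stage2 = run_stage(
    stage1,
    mapper=lambda kv: [(kv[0][0], (kv[1], kv[0][1]))],
    reducer=lambda user, pairs: (user, max(pairs)[1]),
)
```

In a real cluster each stage's output would typically be written out and read back in, which is part of what makes long chains of stages slow.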
69 Software Group
Schema on Read vs Schema on Write
[Diagram comparing two pipelines]
Schema on Write (steps shown): Data, Extract, Transform, Load, Data Warehouse; Code, Cleanse, Normalize, Aggregate; Analyse; Utilize
Schema on Read (steps shown): Data, Load, Hadoop; Code, Cleanse; Analyse; Utilize
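The difference between the two approaches is where structure gets imposed. A minimal sketch, assuming a hypothetical feed of JSON-ish event lines: schema on write validates and converts at load time, while schema on read stores the raw lines and interprets them only when a question is asked.

```python
import json

# Hypothetical raw feed; field names are illustrative only.
RAW_EVENTS = [
    '{"user": "dick", "site": "ebay", "hits": "3"}',
    '{"user": "jane", "site": "google"}',  # missing "hits"
    'not json at all',                     # malformed record
]


def load_schema_on_write(lines):
    """Validate and convert at load time; anything that does not fit is rejected."""
    table, rejected = [], []
    for line in lines:
        try:
            rec = json.loads(line)
            table.append((rec["user"], rec["site"], int(rec["hits"])))
        except (ValueError, KeyError):
            rejected.append(line)
    return table, rejected


def load_schema_on_read(lines):
    """Store raw lines untouched; no structure is imposed at load time."""
    return list(lines)


def query_hits(raw_store):
    """Apply structure only when reading, tolerating missing fields."""
    total = 0
    for line in raw_store:
        try:
            rec = json.loads(line)
        except ValueError:
            continue  # skip lines this particular query cannot interpret
        total += int(rec.get("hits", 0))
    return total


table, rejected = load_schema_on_write(RAW_EVENTS)
raw_store = load_schema_on_read(RAW_EVENTS)
total_hits = query_hits(raw_store)
```

With schema on write, the two awkward records never reach the table; with schema on read they are kept, and a later query with different rules could still make use of them.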
70 Software Group
Hadoop: Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
- 4000 nodes
- 16 PB disk
- 64 TB of RAM
- 32000 cores
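Dividing the cluster totals by the node count gives a feel for how ordinary each machine is. This is just arithmetic on the slide's figures, assuming decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an even spread across nodes.

```python
# Figures from the slide.
nodes = 4000
disk_pb = 16
ram_tb = 64
cores = 32000

# Per-node averages, assuming decimal units and an even spread.
disk_per_node_tb = disk_pb * 1000 / nodes
ram_per_node_gb = ram_tb * 1000 / nodes
cores_per_node = cores / nodes
```

That works out to roughly 4 TB of disk, 16 GB of RAM and 8 cores per node: commodity servers rather than specialised hardware.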
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
- Hadoop File System (HDFS)
- Map Reduce / YARN
- HBase (Database)
- ZooKeeper (Locking)
- SQOOP (RDBMS loader)
- Hive (Query)
- Pig (Scripting)
- Flume (Log Loader)
- Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: a HADOOP CLIENT (Java, Pig, Hive) submits work to the cluster. MAP REDUCE (distributed processing) is coordinated by a JOB TRACKER; HDFS (distributed storage) is coordinated by a NAME NODE and a SECONDARY NAME NODE. Each worker machine is both a DATA NODE and a TASK TRACKER.]
76 Software Group
Hadoop 2.0 YARN
YARN: Yet Another Resource Negotiator
[Diagram: a HADOOP CLIENT (Java, Pig, Hive), a RESOURCE MANAGER, an APPLICATION MASTER, and several NODE MANAGERs, each hosting a CONTAINER]
77 Software Group
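Tez (Hindi for "fast")
[Diagram: a chain of Map-Reduce jobs (Job 1, Job 2, Job 3), each made of MAP and REDUCE steps with HDFS between them, shown against a single job (Job 1) that reads from and writes to HDFS]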
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing storage layers. Oracle: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks. HBase: Tables, MemStore, WA Log (write-ahead log), HFiles, HDFS, Disks.]
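The HBase side of this comparison (write-ahead log, MemStore, HFiles) follows a log-structured pattern that can be sketched in a few lines. `TinyStore` below is a toy, not HBase's API: the class name, the flush threshold and the use of plain dicts for files are all assumptions made for illustration.

```python
class TinyStore:
    """Toy log-structured store: write-ahead log, in-memory buffer, immutable files."""

    def __init__(self, flush_threshold=2):
        self.wal = []        # stands in for the write-ahead log
        self.memstore = {}   # stands in for the MemStore
        self.hfiles = []     # stands in for HFiles, oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.wal.append((key, value))  # log first, so the write can be recovered
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffer out as a sorted, immutable file and start afresh.
        self.hfiles.append(dict(sorted(self.memstore.items())))
        self.memstore = {}
        self.wal = []

    def get(self, key):
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):  # newest file wins
            if key in hfile:
                return hfile[key]
        return None


store = TinyStore()
store.put("dick", 1)
store.put("jane", 2)  # second put reaches the threshold and triggers a flush
store.put("dick", 3)  # newer value now sits in the memstore
```

Reads check the in-memory buffer first and then the files from newest to oldest, so the latest value for a key is always the one returned.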
79 Software Group
Hbase Data Model

Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
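The wide-row layout at the end of this slide amounts to a map of maps: one entry per name, holding only the site columns that name actually has. A minimal sketch using the slide's data, with a plain Python dict standing in for an HBase table:

```python
from collections import defaultdict

# (Name, Site, Counter) rows from the slide.
rows = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]

# Pivot into one wide row per name; columns exist only where there is data.
wide = defaultdict(dict)
for name, site, counter in rows:
    wide[name][site] = counter
```

Jane's row has no "Ebay" column at all, rather than a NULL, which is why sparse data with many possible columns suits this model.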
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram: SQL -> Java -> Results]
83 Software Group
Other SQL-like Hadoop Interfaces
- Cloudera Impala
- MapR Drill
- Aster
- Greenplum (Pivotal HD)
- ParAccel
- Hadapt
- Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos - heterogeneous cluster manager
Tachyon - in-memory file system
Spark - memory-optimized distributed execution
Spark Streaming
MLbase, MLlib - Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop $/TB (hardware only)
[Bar chart, axis $0 to $6000: Exadata $4911, Hadoop $750]
93 Software Group
Oracle Big Data Appliance
- 18 Sun X4270 M2 servers
  - 48 GB RAM per node (864 GB total)
  - 2x6 core CPU per node (216 total)
  - 12x2 TB HDD per node (216 spindles, 864 TB)
  - 40 Gb/s InfiniBand between nodes
  - 10 Gb/s Ethernet to datacentre
- Competitive pricing
www.oracle.com/us/bigdata/index.html
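The appliance totals can be rechecked from the per-node figures. This is plain arithmetic on the numbers as transcribed. RAM, cores and spindles agree with the slide, but 216 spindles at 2 TB each comes to 432 TB rather than the 864 TB shown, so either the disk size or the total on the slide may have been transcribed differently from the source; the slide text is left as-is.

```python
# Per-node figures from the slide.
nodes = 18
ram_gb_per_node = 48
cores_per_node = 2 * 6  # two 6-core CPUs
disks_per_node = 12
tb_per_disk = 2

# Rack totals derived from the per-node figures.
total_ram_gb = nodes * ram_gb_per_node
total_cores = nodes * cores_per_node
total_spindles = nodes * disks_per_node
total_raw_tb = total_spindles * tb_per_disk
```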
94 Software Group
Big Data Appliance Software
- Cloudera Enterprise
- Oracle Enterprise R
- Oracle NoSQL
- Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
- Machine Learning: programs that evolve with "experience"
- Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
- Predictive Analytics: programs that extrapolate from existing data into the future
- Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart with fitted trend line; x-axis 0 to 120, y-axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistic regression
- CART
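A trend line of the form f(x) = a x + b, like the one on this slide, comes from an ordinary least-squares fit. The sketch below fits one to made-up points (the slide's underlying data is not available), using only the standard library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


# Hypothetical observations, roughly on a line with slope close to 1.
xs = [0, 20, 40, 60, 80, 100]
ys = [2, 19, 41, 58, 82, 98]
slope, intercept = fit_line(xs, ys)
forecast = predict(120, slope, intercept)
```

The "predictive" part is the last line: once the line is fitted to existing data, it can be evaluated at an x value that has not been observed yet.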
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
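Spam detection is the classic classification example. The sketch below is a very small naive Bayes text classifier with add-one smoothing; it is one common technique among many, not necessarily what any product mentioned in this talk uses, and the training messages are invented.

```python
import math
from collections import Counter


def train(examples):
    """examples: list of (text, label). Returns per-label word counts and label counts."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the label with the highest log-probability under naive Bayes."""
    vocab = set().union(*word_counts.values())
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_docs)
        denom = sum(counts.values()) + len(vocab)  # add-one smoothing
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled messages.
training = [
    ("win cash prize now", "spam"),
    ("cheap prize offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status meeting", "ham"),
]
word_counts, label_counts = train(training)
label_a = classify("free cash prize", word_counts, label_counts)
label_b = classify("monday project meeting", word_counts, label_counts)
```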
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
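Grouping without predefined classes can be shown with k-means, one widely used clustering method. The sketch assumes hypothetical customers described by two numeric features and caller-supplied starting centroids; it is an illustration, not a production implementation.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points with caller-supplied starting centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical customers: (visits per month, items per basket).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

No labels are supplied anywhere: the two groups emerge from the distances alone.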
121 Software Group
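Supervised Machine Learning
[Diagram: Raw Data is cleaned and split into a Training Set and a Validation Set. The Training Set is used to model a Candidate Model, which is validated against the Validation Set and becomes the Production Model. The Production Model takes New Data (New Business, Existing Business) and produces a Prediction.]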
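The training set, validation set and prediction steps in this workflow can be sketched end to end. Everything here is hypothetical: the churn rule, the usage figures and the single-threshold "model" are stand-ins chosen so the example stays short.

```python
def split(rows, every=4):
    """Hold out every Nth row for validation; train on the rest."""
    validation = rows[::every]
    training = [r for i, r in enumerate(rows) if i % every != 0]
    return training, validation


def train_threshold(training):
    """Candidate model: predict churn when usage falls below a learned threshold."""
    churned = [usage for usage, label in training if label]
    stayed = [usage for usage, label in training if not label]
    return (max(churned) + min(stayed)) / 2


def accuracy(threshold, rows):
    """Share of rows where the model's prediction matches the known label."""
    return sum((usage < threshold) == label for usage, label in rows) / len(rows)


# Hypothetical history: (monthly usage, churned?) where low usage meant churn.
history = [(usage, usage < 10) for usage in range(1, 21)]
training, validation = split(history)
threshold = train_threshold(training)
score = accuracy(threshold, validation)

# Production use: score a new customer with usage 4.
prediction = 4 < threshold
```

The validation score comes out below 100% because a borderline customer was held out of training, which is exactly the kind of weakness a validation set exists to expose before a model goes to production.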
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
- Search Optimization
- Recommendation Systems
- Security
  - Vulnerability
  - Penetration Detection
- Fraud Detection
- CRM
  - Churn
  - Defaults
- Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
- Game optimization
- Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka and Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Change Data Capture reads the Redo-logs and places changes on a JMS Queue; a Hadoop Poster delivers them as Batched HDFS File Copy, Audit Change Data, and HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete...
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Dell products shown: Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Dell products shown: Server and Storage, STATISTICA, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
- How could data and algorithms transform your business?
- What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
- Where is the data?
  - Start collecting now
139 Software Group
For your career
- Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
- Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
- Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete...
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 26: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/26.jpg)
26 Software Group
27 Software Group
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
• Shelf assortment optimization
• In store offers
• Customer entertainment
• Checkout anywhere
• Relationship management
• Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
It's not enough to lay out products on tables
• Online has significant advantages
• Retailers can only survive by embracing online and emulating online practices
  – Dynamic pricing
  – Shelf optimization
  – Personalized service and selection
• Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Don't Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover $45.99
Amazon Kindle $39.99
Say "screw you bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri call me an ambulance"
From now on I'll call you 'An Ambulance' OK
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Map Reduce
[Diagram: a Start step fans out to many parallel Map tasks, whose outputs feed a Reduce step]
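The Map Reduce pattern on this slide can be sketched in a few lines of plain Python. This is a single-process illustration only, not Hadoop or Google code; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map over every line, group by word, then reduce each group."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big database technologies"]))
```

In a real cluster each `map_phase` call would run on a different node against a different block of the input, which is why the map step must not depend on any other line.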
68 Software Group
Multi-stage Map-Reduce
[Diagram labels: HDFS; MAPPER tasks; SCAN, SORT; further MAPPER tasks; AGGREGATE; REDUCE; Client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram: Schema on Write (Data Warehouse): Data, Extract, Transform, Load, Cleanse, Normalize, Aggregate, Code, Analyse, Utilize. Schema on Read (Hadoop): Data, Load, Cleanse, Code, Analyse, Utilize]
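The contrast between the two approaches can be illustrated with a small sketch. It assumes JSON-lines input and a made-up three-field schema (`SCHEMA`); neither comes from the slide, and all function names are hypothetical.

```python
import json

# Hypothetical schema, invented for this example.
SCHEMA = {"user": str, "site": str, "hits": int}


def load_schema_on_write(raw_lines):
    """Schema on write: validate and convert every record before storing it.

    A record missing a schema field raises KeyError, so bad data is
    rejected at load time.
    """
    table = []
    for line in raw_lines:
        record = json.loads(line)
        table.append({name: cast(record[name]) for name, cast in SCHEMA.items()})
    return table


def load_schema_on_read(raw_lines):
    """Schema on read: store the raw lines untouched."""
    return list(raw_lines)


def query_schema_on_read(store, field):
    """Apply structure only at query time, skipping records without the field."""
    values = []
    for line in store:
        record = json.loads(line)
        if field in record:
            values.append(record[field])
    return values
```

The trade-off shown here is that the schema-on-read store accepts a record with an unexpected field such as `referrer` and can answer questions about it later, while the schema-on-write loader rejects anything that does not fit the schema up front.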
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
Hadoop File System (HDFS)
Map Reduce / YARN
HBase (Database)
ZooKeeper (Locking)
SQOOP (RDBMS loader)
Hive (Query)
Pig (Scripting)
Flume (Log Loader)
Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); MAP REDUCE (DISTRIBUTED PROCESSING) with a JOB TRACKER; HDFS (DISTRIBUTED STORAGE) with a NAME NODE and a SECONDARY NAME NODE; many worker nodes, each running a DATA NODE and a TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; NODE MANAGERs, each hosting a CONTAINER]
77 Software Group
Tez (1)
(1) Hindi for "fast"
[Diagram: MAP and REDUCE stages between HDFS reads and writes, grouped as Job 1, Job 2 and Job 3 in one version and as a single Job 1 in the other]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram: two storage stacks side by side. Oracle: Table, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks. HBase: Table, MemStore, WA Log, HFile, HDFS, Disks]
79 Software Group
HBase Data Model

Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
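The last two tables on this slide show the wide, sparse row layout: one row per name, with a column only for the sites that person actually visited. A rough sketch of that pivot in plain Python follows. It uses nested dictionaries rather than the HBase API, the function name `to_wide_rows` is hypothetical, and the site names assume the dots were lost in extraction.

```python
from collections import defaultdict

# (Name, Site, Counter) triples from the slide's first table.
# Assumption: site names are domain names whose dots were lost in extraction.
VISITS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(visits):
    """Pivot triples into one sparse row per name, one column per site."""
    rows = defaultdict(dict)
    for name, site, counter in visits:
        rows[name][site] = counter
    return dict(rows)


if __name__ == "__main__":
    for name, columns in to_wide_rows(VISITS).items():
        print(name, columns)
```

Each row carries only its own columns, so Jane's row has no Ebay entry at all rather than a NULL, which is the property the slide's two differently shaped rows are illustrating.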
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill
Aster
Greenplum (Pivotal HD)
Paraccel
Hadapt
Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase, MLlib – Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop $/TB (Hardware only)
[Bar chart, axis from $0 to $6000: Exadata $4911, Hadoop $750]
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48 GB RAM per node (864 GB total)
  – 2 x 6-core CPU per node (216 cores total)
  – 12 x 2 TB HDD per node (216 spindles, 432 TB)
  – 40 Gb/s InfiniBand between nodes
  – 10 Gb/s Ethernet to datacentre
• Competitive pricing
wwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
Machine Learning: Programs that evolve with "experience"
Collective Intelligence: Programs that use inputs from "crowds" to seem intelligent
Predictive Analytics: Programs that extrapolate from existing data into the future
Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: fitted trendline f(x) = 0.971521231456065 x + 0.71906459527154]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
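The first technique in this list, linear regression, is what produces a trendline of the form f(x) = slope * x + intercept. A minimal sketch of an ordinary least squares fit in plain Python is given below; the function names and the sample data are assumptions for illustration and are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


if __name__ == "__main__":
    slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
    print(slope, intercept, predict(slope, intercept, 10))
```

Prediction here is simply evaluating the fitted line at an x value that was not in the original data.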
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance basket analysis
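One common way to group data without pre-existing labels is k-means. The sketch below is a deliberately tiny version for 2-D points, written in plain Python; the slide does not name an algorithm, so k-means, the function name `kmeans`, and the choice to seed the centroids from the first k points are all assumptions made for the example.

```python
def kmeans(points, k, iterations=10):
    """Tiny k-means on 2-D points; centroids start at the first k points."""
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


if __name__ == "__main__":
    data = [(1, 1), (9, 9), (1, 2), (8, 9), (2, 1), (9, 8)]
    print(kmeans(data, k=2))
```

No labels are supplied anywhere; the two groups emerge only from the distances between points.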
121 Software Group
Supervised Machine Learning
[Diagram: Raw Data, Clean, Validate; Training Set and Validation Set; Model, Candidate Model, Production Model; New Data (New Business, Existing Business); Prediction]
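The flow on this slide (split the cleaned data into a training set and a validation set, fit a candidate model, check it on held-out data, then score new data) can be sketched as follows. The one-feature threshold model, the every-fourth-row split, and all function names are assumptions chosen to keep the example small and deterministic.

```python
def split(rows, every=4):
    """Hold out every `every`-th row for validation; train on the rest."""
    training = [r for i, r in enumerate(rows) if i % every != every - 1]
    validation = [r for i, r in enumerate(rows) if i % every == every - 1]
    return training, validation


def train_threshold(training):
    """Candidate model: the threshold on x that classifies most training rows correctly."""
    best_threshold, best_correct = None, -1
    for threshold, _ in training:
        correct = sum((x >= threshold) == label for x, label in training)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the model's prediction."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)


def predict(threshold, x):
    """Production use: score a new, unlabeled value."""
    return x >= threshold


if __name__ == "__main__":
    rows = [(i, i >= 50) for i in range(100)]  # (feature, label) pairs
    training, validation = split(rows)
    model = train_threshold(training)
    print(model, accuracy(model, validation), predict(model, 72))
```

The validation set is never seen during training, so its accuracy is the estimate of how the model will behave on new data before it is promoted to production.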
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
This box will have simplified instructions about how to complete the session evaluation online
![Page 28: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/28.jpg)
28 Software Group
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: Hadoop client (Java, Pig, Hive); Resource Manager; Node Managers, each hosting a Container; Application Master]
77 Software Group
Tez¹
¹ Hindi for “fast”
[Diagram: a chain of Map and Reduce steps over HDFS run as three separate jobs (Job 1, Job 2, Job 3), next to a chain run as a single job (Job 1)]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing two storage stacks]
Oracle terms: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks
HBase terms: Tables, MemStore, WA Log, HFiles, HDFS, Disks
79 Software Group
HBase Data Model

Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
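The last two tables on this slide show each person as one wide row whose columns are the sites that person visited. A minimal Python sketch of that reshaping is below. It uses plain dictionaries, not the HBase API; the function name `to_wide_rows` is an assumption for illustration, and the site names are written with dots on the assumption that the transcript dropped them.

```python
from collections import defaultdict

# Rows taken from the slide's first table (name, site, counter).
visits = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Group by name so that each site becomes a column of that person's row."""
    wide = defaultdict(dict)
    for name, site, counter in rows:
        wide[name][site] = counter
    return dict(wide)


wide_rows = to_wide_rows(visits)
for name, columns in wide_rows.items():
    print(name, columns)
```

Each row carries only the columns it actually has, so Jane's row has no Ebay entry at all rather than a null, which is the sparse layout the slide illustrates.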
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase / MLlib – Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
[Bar chart: Exadata vs Hadoop, $/TB (hardware only); axis from $0 to $6000]
Exadata: $4911
Hadoop: $750
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48GB RAM per node (864GB total)
  – 2x6 Core CPU per node (216 total)
  – 12x2TB HDD per node (216 spindles 864 TB)
  – 40Gb/s Infiniband between nodes
  – 10Gb/s Ethernet to datacentre
• Competitive Pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through “Big Data analytics”
Machine Learning: Programs that evolve with “experience”
Collective Intelligence: Programs that use inputs from “crowds” to seem intelligent
Predictive Analytics: Programs that extrapolate from existing data into the future
Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Scatter plot with a fitted trend line; both axes run from about 0 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
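The trend line on this slide is a linear regression of the form f(x) = slope · x + intercept. A minimal sketch of how such a line can be fitted with ordinary least squares follows. The function names and the sample points are assumptions chosen for illustration (the points lie exactly on y = 2x + 1 so the result is easy to check); they are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


# Made-up points lying exactly on y = 2x + 1.
xs = [0, 10, 20, 30, 40]
ys = [1, 21, 41, 61, 81]
slope, intercept = fit_line(xs, ys)
forecast = predict(100, slope, intercept)
print(slope, intercept, forecast)
```

Predicting at x = 100 goes beyond the observed range, which is the "extrapolate from existing data into the future" step; real data would also need a check of how well the line fits.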
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
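Grouping data without predefined labels can be sketched with a tiny k-means routine. This is a simplified illustration, not a production algorithm: it works on a single numeric feature, the starting centroids are supplied by hand, and the function name `k_means_1d` and the basket values are made up for the example.

```python
def k_means_1d(values, centroids, iterations=10):
    """Tiny k-means on one numeric feature; starting centroids are supplied."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: put each value in the cluster of its nearest centroid.
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


basket_totals = [5, 7, 6, 52, 48, 50, 8, 49]  # made-up basket values, no labels
centroids, clusters = k_means_1d(basket_totals, centroids=[0, 100])
print(centroids)
print(clusters)
```

No labels such as "small basket" or "large basket" are given up front; the two groups emerge from the values themselves.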
121 Software Group
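Supervised Machine Learning
[Diagram: Raw Data is cleaned and split into a Training Set and a Validation Set; modelling on the Training Set gives a Candidate Model; validating it against the Validation Set gives the Production Model; New Data (New Business, Existing Business) is fed to the Production Model to produce a Prediction]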
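The supervised learning flow on this slide (clean, split into training and validation sets, build a candidate model, validate, then predict on new data) can be sketched as below. Everything here is an assumption for illustration: the data is made up, the "model" is a very simple nearest-centroid classifier on one feature, and the function names are invented for the example.

```python
def clean(rows):
    """Drop rows with a missing feature value."""
    return [(x, label) for x, label in rows if x is not None]


def split(rows, train_fraction=0.75):
    """Split cleaned rows into a training set and a validation set."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def train(training_set):
    """Candidate model: the mean feature value of each class."""
    sums, counts = {}, {}
    for x, label in training_set:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, x):
    """Pick the class whose mean is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def validate(model, validation_set):
    """Fraction of validation rows the model labels correctly."""
    hits = sum(1 for x, label in validation_set if predict(model, x) == label)
    return hits / len(validation_set)


# Made-up data: (support calls last month, outcome)
raw_data = [(1, "stay"), (9, "churn"), (2, "stay"), (None, "stay"),
            (8, "churn"), (0, "stay"), (10, "churn"), (3, "stay"), (7, "churn")]

training_set, validation_set = split(clean(raw_data))
candidate_model = train(training_set)
accuracy = validate(candidate_model, validation_set)
prediction = predict(candidate_model, 6)  # score a new, unseen customer
print(candidate_model, accuracy, prediction)
```

Keeping the validation rows out of training is the point of the split: the accuracy figure then says something about data the model has not seen.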
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell’s offering was not complete…
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Dell + StatSoft = completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It’s not enough to lay out products on tables
- There’s a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through “Big Data analytics”
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell’s offering was not complete…
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
29 Software Group
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
• Shelf assortment optimization
• In store offers
• Customer entertainment
• Checkout anywhere
• Relationship management
• Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
It’s not enough to lay out products on tables
• Online has significant advantages
• Retailers can only survive by embracing online and emulating online practices
  – Dynamic pricing
  – Shelf optimization
  – Personalized service and selection
• Only big data analytics can provide these advantages
36 Software Group
There’s a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality: German
Don’t Mention the WAR
43 Software Group
Buying choices
Amazon softcover $45.99
Oracle Performance Survival Guide
Amazon Kindle $39.99
Say “screw you bookseller” to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on I’ll call you ‘An Ambulance’. OK?
“Siri, call me an ambulance”
I found 14 bridges nearby
“I want to jump off a bridge”
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
![Page 30: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/30.jpg)
30 Software Group
Some novel defences
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
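As a rough illustration of the spam-detection case, here is a small naive Bayes sketch. The technique is a common choice for text classification, not one the slides name; the training messages, labels, and function names are all invented for the example.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label). Returns per-label word counts and label counts."""
    word_counts = {}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the label with the highest log-probability for the given text."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total)  # prior
        n_words = sum(counts.values())
        for word in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training set (assumption, for illustration only).
training = [
    ("win cash prize now", "spam"),
    ("cheap prize offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training)
```

With this toy model, `classify("win a prize", *model)` comes out as spam and `classify("monday meeting", *model)` as ham; a real system would need far more training data.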
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
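One widely used way to group data without predefined classes is k-means. The slides do not name an algorithm, so treat the sketch below as one possible illustration; the shopper data and the choice of the first k points as starting centroids are assumptions made to keep the example deterministic.

```python
def kmeans(points, k, iterations=10):
    """Tiny k-means for 2-D points; starts from the first k points (a simplification)."""
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, clusters


# Hypothetical shoppers: (baskets per month, average basket value).
shoppers = [(1, 10), (12, 90), (2, 12), (1, 8), (11, 95), (13, 85)]
centroids, clusters = kmeans(shoppers, k=2)
```

No labels are supplied; the two groups (occasional low spenders and frequent high spenders) emerge from the data alone.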
121 Software Group
Supervised Machine Learning
[Diagram: Existing Business -> Raw Data -> Clean -> Training Set and Validation Set; Training Set -> Model -> Candidate Model; Validation Set -> Validate -> Production Model; New Business -> New Data -> Production Model -> Prediction]
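The workflow on this slide (split cleaned data into a training set and a validation set, build a candidate model, validate it, then use it on new data) can be sketched in a few lines. Everything concrete here is an assumption for illustration: the churn data, the one-feature threshold model, and the 0.9 promotion bar.

```python
def split(rows, every=4):
    """Hold out every `every`-th row for validation; the rest is the training set."""
    validation = rows[::every]
    training = [row for i, row in enumerate(rows) if i % every != 0]
    return training, validation


def train_threshold(rows):
    """Candidate model: predict churn above the midpoint between the two class means."""
    churned = [calls for calls, label in rows if label]
    stayed = [calls for calls, label in rows if not label]
    return (sum(churned) / len(churned) + sum(stayed) / len(stayed)) / 2


def accuracy(threshold, rows):
    """Share of rows where the threshold rule matches the known label."""
    hits = sum((calls > threshold) == label for calls, label in rows)
    return hits / len(rows)


# Hypothetical history from existing business: (support calls, churned?).
history = [(0, False), (1, False), (2, False), (3, False),
           (7, True), (8, True), (9, True), (10, True)]

training, validation = split(history)
candidate = train_threshold(training)

# Promote the candidate only if it holds up on data it has not seen.
if accuracy(candidate, validation) >= 0.9:  # arbitrary bar (assumption)
    production = candidate
    prediction = 6 > production  # new customer with 6 support calls
```

The point of the validation set is that the candidate model is scored on rows it was not trained on before it is trusted with new business.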
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics / Data Science
• Search Optimization
• Recommendation Systems
• Security: Vulnerability, Penetration Detection
• Fraud Detection
• CRM: Churn, Defaults
• Medical: Risk analysis, Diagnosis, Prognosis
• Game optimization
• Advertising: Targeting, Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Redo logs -> Change Data Capture -> JMS Queue -> Hadoop Poster -> batched HDFS file copy, audit change data, HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products: Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products: Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga, STATISTICA
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group
135 Software Group
Data Visualization
136 Software Group
Live scoring - integration into operational systems
137 Software Group
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
• Where is the data?
  - Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  - Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
![Page 31: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/31.jpg)
31 Software Group
Web analytics for retail
32 Software Group
Connected Store
• Shelf assortment optimization
• In-store offers
• Customer entertainment
• Checkout anywhere
• Relationship management
• Customer analytics
33 Software Group
34 Software Group
Why showrooming?
Selection, Stock, Faster, Cheaper
Defences: Dynamic Pricing, Predictive ordering, Assortment optimization, Predictive recommendations, Personalization
35 Software Group
It's not enough to lay out products on tables
• Online has significant advantages
• Retailers can only survive by embracing online and emulating online practices
  - Dynamic pricing
  - Shelf optimization
  - Personalized service and selection
• Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality: German
Don't Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover: $45.99
Amazon Kindle: $39.99
Say "screw you, bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads-up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
[Diagram: Google Applications run on Map Reduce and BigTable, which run on the Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: Start -> many parallel Map tasks -> Reduce]
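The fan-out-then-combine shape in this diagram is easiest to see with the classic word-count example. The sketch below runs in a single process and only imitates the three phases; the function names are hypothetical and nothing here is Hadoop's actual API.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different node; here they run in turn.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big deal"])
```

Because each map call sees only its own split and each reduce call sees only one key, both phases can be spread across many machines without coordination beyond the shuffle.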
68 Software Group
Multi-stage Map-Reduce
[Diagram: HDFS -> mappers (scan/sort) -> mappers (aggregate) -> reduce -> client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram, Schema on Write: Data -> Extract, Transform, Load (Code, Cleanse, Normalize, Aggregate) -> Data Warehouse -> Analyse -> Utilize]
[Diagram, Schema on Read: Data -> Load -> Hadoop -> Code, Cleanse, Analyse -> Utilize]
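To make the schema-on-read idea concrete: the raw records are stored untouched, and structure is imposed only when a query reads them. The sketch below assumes JSON-lines input and a three-field target schema; both are illustrative choices, not something the slides specify.

```python
import json

# Raw data is loaded as-is, including records that are incomplete or malformed.
RAW_EVENTS = [
    '{"user": "dick", "site": "ebay", "hits": "3"}',
    '{"user": "jane", "site": "google"}',
    'not json at all',
]


def read_with_schema(lines):
    """Apply the schema while reading: parse, cleanse and type each record on the fly."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # cleanse: skip lines that do not parse
        yield {
            "user": record.get("user", "unknown"),
            "site": record.get("site", "unknown"),
            "hits": int(record.get("hits", 0)),  # coerce type at read time
        }


rows = list(read_with_schema(RAW_EVENTS))
```

Under schema on write, the cleansing and typing in `read_with_schema` would instead run once, before loading, and the malformed line would never reach storage.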
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
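Dividing the cluster totals by the node count gives a feel for how modest each machine is. The arithmetic below uses only the figures on the slide and assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB).

```python
# Cluster totals as quoted on the slide.
nodes = 4000
disk_pb = 16
ram_tb = 64
cores = 32000

# Per-node averages, assuming decimal units.
disk_per_node_tb = disk_pb * 1000 / nodes
ram_per_node_gb = ram_tb * 1000 / nodes
cores_per_node = cores / nodes

print(disk_per_node_tb, ram_per_node_gb, cores_per_node)
```

That works out to about 4 TB of disk, 16 GB of RAM and 8 cores per node on average, which is commodity-server territory rather than specialized hardware.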
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: Hadoop client (Java, Pig, Hive) talks to the Job Tracker (Map Reduce, distributed processing) and to the Name Node and Secondary Name Node (HDFS, distributed storage); every worker machine runs a Data Node and a Task Tracker]
76 Software Group
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
[Diagram: Hadoop client (Java, Pig, Hive) -> Resource Manager -> Node Managers hosting the Application Master and Containers]
77 Software Group
Tez (Hindi for "fast")
[Diagram: with Map Reduce, Jobs 1, 2 and 3 each run Map and Reduce stages and pass data through HDFS; with Tez the same stages run as a single Job 1]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing storage layers. Oracle: Tables -> Buffer Cache and Log Buffer -> Datafiles and Redo -> ASM -> Disks. HBase: Tables -> MemStore and WA Log -> HFiles -> HDFS -> Disks]
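The HBase side of this comparison (write-ahead log, MemStore, HFiles) follows a log-then-buffer-then-flush pattern. The toy class below imitates that pattern in memory only; `TinyStore`, its methods and the flush threshold are invented for illustration and are not HBase's API.

```python
class TinyStore:
    """Toy write path: append to a write-ahead log, buffer in memory, flush sorted files."""

    def __init__(self, flush_threshold=2):
        self.wal = []        # stands in for the write-ahead log
        self.memstore = {}   # in-memory buffer of recent writes
        self.hfiles = []     # each flush produces one immutable, sorted file
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.wal.append((key, value))  # log first so a crash could be replayed
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the buffer out as a sorted file and start a fresh buffer."""
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}

    def get(self, key):
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):  # newest file wins
            for k, v in hfile:
                if k == key:
                    return v
        return None


store = TinyStore()
store.put("b", 1)
store.put("a", 2)   # second write reaches the threshold and triggers a flush
store.put("b", 3)   # newer value stays in the memstore
```

Reads check the in-memory buffer first and then the flushed files from newest to oldest, so the latest write for a key is always the one returned.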
79 Software Group
HBase Data Model

Source data:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Relational model:
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase model (one row per name, one column per site visited):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
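The move from three relational tables to one sparse, wide row per name can be sketched with plain dictionaries. The values follow the name, site and counter pairs on the slide; the dictionary layout and the `to_wide_rows` helper are assumptions for illustration, not HBase's client API.

```python
# Relational form: two lookup tables and one fact table.
names = {1: "Dick", 2: "Jane"}
sites = {1: "Ebay", 2: "Google", 3: "Facebook",
         4: "ILoveLarry.com", 5: "MadBillFans.com"}
counters = [
    (1, 1, 507018), (1, 2, 690414), (2, 2, 716426), (1, 3, 723649),
    (2, 3, 643261), (2, 4, 856767), (1, 5, 675230),
]


def to_wide_rows(names, sites, counters):
    """Fold the fact table into one sparse row per name, one column per site."""
    rows = {}
    for name_id, site_id, counter in counters:
        row = rows.setdefault(name_id, {"Name": names[name_id]})
        row[sites[site_id]] = counter
    return rows


rows = to_wide_rows(names, sites, counters)
```

Each row carries only the columns it actually has values for, so Jane's row has no Ebay column at all rather than a NULL.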
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAVA
RESULTS
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: WebLogs -> FLUME -> HDFS; RDBMS tables (CUSTOMERS, PRODUCTS) -> SQOOP -> HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos - heterogeneous cluster manager
Tachyon - in-memory file system
Spark - memory-optimized distributed execution
Spark Streaming
MLbase, MLlib - machine learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
![Page 32: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/32.jpg)
32 Software Group
Connected Store
bull Shelf assortment optimization
bull In store offers
bull Customer entertainment
bull Checkout anywhere
bull Relationship management
bull Customer analytics
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
33 Software Group
34 Software Group
Why showrooming
Selection
Stock
Faster
Cheaper
Dynamic Pricing
Predictive ordering
Assortment optimization
Predictive recommendations
Personalization
Defences
35 Software Group
It's not enough to lay out products on tables
• Online has significant advantages
• Retailers can only survive by embracing online and emulating online practices
  – Dynamic pricing
  – Shelf optimization
  – Personalized service and selection
• Only big data analytics can provide these advantages
36 Software Group
There's a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
Security
Finance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality: German
Don't mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover $45.99
Amazon Kindle $39.99
Say "screw you bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
Google Applications
Map Reduce, BigTable
Google File System (GFS)
67 Software Group
Map Reduce
[Diagram: Start -> many parallel Map tasks -> Reduce]
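The Map Reduce flow on this slide (many independent map tasks feeding a reduce step) can be sketched in a few lines of plain Python. This is only an illustration of the programming model, not Hadoop's or Google's implementation: the function names, the in-memory shuffle and the word-count task are all assumptions chosen for the example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group every emitted value under its key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for an input split handled by its own mapper.
    emitted = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())


counts = word_count(["big data big", "data revolution"])
print(counts)
```

In a real cluster the map calls run on different nodes and the shuffle moves data across the network; here everything happens in one process so the three stages are easy to see.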
68 Software Group
Multi-stage Map-Reduce
[Diagram labels: HDFS, Mapper (x12), Scan, Sort, Aggregate, Reduce, Client]
69 Software Group
Schema on Read vs Schema on Write
Schema on Write: Data -> Extract, Transform, Load (Analyse, Aggregate, Normalize, Cleanse, Code) -> Data Warehouse -> Utilize
Schema on Read: Data -> Load -> Hadoop -> Analyse, Cleanse, Code -> Utilize
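As a rough illustration of the schema-on-read side of this slide, the sketch below stores raw records untouched and only applies structure when they are read. The record layout, field names and cleansing rule are assumptions made up for the example; nothing here is a Hadoop API.

```python
import json

# Raw events are stored exactly as they arrived; nothing is validated on load.
raw_store = [
    '{"user": "dick", "site": "Ebay", "hits": 3}',
    '{"user": "jane", "site": "Google"}',  # "hits" field is missing
    'not even json',                       # malformed record
]


def read_with_schema(lines):
    """Apply structure at query time: parse, cleanse and default as we read."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # cleanse: skip records that cannot be parsed
        yield {
            "user": record["user"],
            "site": record["site"],
            "hits": int(record.get("hits", 0)),
        }


rows = list(read_with_schema(raw_store))
total_hits = sum(row["hits"] for row in rows)
print(rows, total_hits)
```

With schema on write, the malformed and incomplete records would have been rejected or repaired before loading; here that work is deferred until someone actually queries the data.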
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
Hadoop File System (HDFS)
Map Reduce / YARN
HBase (Database)
ZooKeeper (Locking)
SQOOP (RDBMS loader)
Hive (Query)
Pig (Scripting)
Flume (Log Loader)
Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
HADOOP CLIENT (JAVA, PIG, HIVE)
MAP REDUCE (DISTRIBUTED PROCESSING): JOB TRACKER
HDFS (DISTRIBUTED STORAGE): NAME NODE, SECONDARY NAME NODE
[Diagram: 15 worker nodes, each labelled DATA NODE and TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
HADOOP CLIENT (JAVA, PIG, HIVE)
RESOURCE MANAGER
APPLICATION MASTER
[Diagram: three NODE MANAGERs, each with a CONTAINER]
77 Software Group
Tez (Hindi for "fast")
[Diagram labels: HDFS; Map and Reduce stages grouped as Job 1, Job 2 and Job 3; HDFS; Job 1]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram, left stack: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks]
[Diagram, right stack: Tables, MemStore, WA Log, HFiles, HDFS, Disks]
79 Software Group
HBase Data Model

Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
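The last two tables on this slide show the idea behind the HBase model: one row per name, with a column only for the sites that person actually visited. A minimal sketch of that pivot, using ordinary Python dictionaries rather than any HBase client (the function name is an assumption, and site names are written as domain names):

```python
# Rows from the slide's flat (Name, Site, Counter) table.
visits = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Build one sparse row per name, with one column per site visited."""
    table = {}
    for name, site, counter in rows:
        table.setdefault(name, {})[site] = counter
    return table


wide = to_wide_rows(visits)
for name, columns in wide.items():
    print(name, columns)
```

Jane's row simply has no Ebay column. Nothing is stored for the missing value, which is what makes very wide, sparse rows practical in this kind of store.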
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAVA
RESULTS
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill
Aster
Greenplum (Pivotal HD)
Paraccel
Hadapt
Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase, MLlib – Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop $/TB (hardware only)
Exadata: $4911
Hadoop: $750
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48 GB RAM per node (864 GB total)
  – 2x6 core CPU per node (216 total)
  – 12x2 TB HDD per node (216 spindles, 864 TB)
  – 40 Gb/s InfiniBand between nodes
  – 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
Machine Learning: Programs that evolve with "experience"
Collective Intelligence: Programs that use inputs from "crowds" to seem intelligent
Predictive Analytics: Programs that extrapolate from existing data into the future
Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: data points with a fitted trend line; x axis 0 to 120, y axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
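Linear regression, the first technique listed on this slide, fits a straight line of the form f(x) = slope * x + intercept to existing points and then extrapolates. The sketch below is a plain least-squares fit in standard Python; the sample points are invented for illustration and are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


# Made-up sample points, roughly along y = x.
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 19, 41, 59, 81, 99]

slope, intercept = fit_line(xs, ys)
print(slope, intercept, predict(120, slope, intercept))
```

The other items in the list (curve fitting, multivariate models, logistic regression, CART) follow the same pattern of fitting a model to known data and then applying it to new inputs, with different model shapes.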
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
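To make "group data without a pre-existing classification scheme" concrete, here is a small k-means sketch on one-dimensional data. k-means is one common clustering technique, chosen here as an assumption; the slide does not name an algorithm. The basket totals and the starting centroids are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Lloyd's algorithm (k-means) on one-dimensional data."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical basket totals in dollars; no labels are supplied.
basket_totals = [12, 15, 14, 80, 95, 88, 13, 91]
centroids, clusters = k_means_1d(basket_totals, centroids=[0, 100])
print(centroids, clusters)
```

The algorithm is never told which baskets are "small" or "large"; the two groups emerge from the data alone.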
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data, Clean, Training Set, Validation Set, Model, Candidate Model, Validate, Production Model, New Data (New Business, Existing Business), Prediction]
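The supervised learning flow on this slide (train on one set, validate on another, then predict for new data) might look like this in miniature. The nearest-centroid classifier, the churn data and the 6/2 split are all assumptions picked to keep the example short; they are not taken from the talk.

```python
import random


def train(rows):
    """Learn one centroid (mean feature value) per label from the training set."""
    sums, counts = {}, {}
    for feature, label in rows:
        sums[label] = sums.get(label, 0.0) + feature
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, feature):
    """Classify a new value by its nearest label centroid."""
    return min(model, key=lambda label: abs(feature - model[label]))


def accuracy(model, rows):
    """Fraction of held-out rows the model labels correctly."""
    hits = sum(1 for feature, label in rows if predict(model, feature) == label)
    return hits / len(rows)


# Hypothetical labelled data: (support calls last month, outcome).
data = [(0, "stay"), (1, "stay"), (2, "stay"), (1, "stay"),
        (7, "churn"), (8, "churn"), (9, "churn"), (6, "churn")]

random.Random(42).shuffle(data)  # fixed seed so the split is repeatable
training_set, validation_set = data[:6], data[6:]

candidate = train(training_set)              # candidate model
score = accuracy(candidate, validation_set)  # validate before production
print(score, predict(candidate, 10))
```

Only if the validation score is acceptable would the candidate be promoted to a production model and used on new data.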
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo logs, Change Data Capture, JMS Queue, Hadoop Poster, Batched HDFS File Copy, Audit Change Data, HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Dell + StatSoft = completes a strong end-to-end, analytics-driven information management value proposition
![Page 35: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/35.jpg)
35 Software Group
Itrsquos not enough to lay out products on tables
bull Online has significant advantages
bull Retailers can only survive by embracing online and emulating online practicesndash Dynamic pricingndash Shelf optimizationndash Personalized service and selection
bull Only big data analytics can provide these advantages
36 Software Group
Therersquos a similar story in every industry
Web
Transport
Power Grid
Dating
Retail
SecurityFinance
Government
Science
Healthcare
Insurance
Telecom
Advertising
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Map Reduce
[Diagram: a Start node fans out to many parallel Map tasks, whose outputs feed a Reduce step]
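The map and reduce steps named on this slide can be sketched in a few lines of plain Python. This is a single-process illustration of the pattern, not Hadoop code: the function names `map_phase`, `shuffle` and `reduce_phase`, the word-count task and the sample documents are all assumptions made for the example.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(mapped_pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(documents):
    # Each document stands in for a split handled by its own mapper.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(mapped))

counts = word_count(["big data is big", "data about data"])
```

In a real cluster the mappers run in parallel on different nodes; here they run one after another, which is enough to show how the work divides.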
68 Software Group
Multi-stage Map-Reduce
[Diagram: HDFS feeds a first stage of mappers (scan, sort), then a second stage of mappers (aggregate), then a reduce step that returns results to the client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram. Schema on Write: Data → Extract, Transform, Load (Cleanse, Normalize, Aggregate, Code) → Data Warehouse → Analyse → Utilize. Schema on Read: Data → Load → Hadoop → Cleanse, Code, Analyse → Utilize.]
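A small sketch may make the schema-on-read idea concrete: raw records are loaded untouched, and field names and types are applied only when the data is read. The record layout, the `RAW_STORE` list and the `read_with_schema` function are hypothetical, chosen only for illustration.

```python
import json

# Raw records are stored exactly as they arrive; no schema is enforced on load.
RAW_STORE = [
    '{"user": "dick", "site": "ebay", "hits": "507018"}',
    '{"user": "jane", "site": "google", "hits": 716426, "referrer": "mail"}',
    'corrupted line that a schema-on-write load would have rejected',
]

def read_with_schema(raw_lines):
    """Apply the schema (field names and types) only when the data is read."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # cleansing happens at read time, not at load time
        yield {
            "user": str(record["user"]),
            "site": str(record["site"]),
            "hits": int(record["hits"]),  # coerce types on read
        }

rows = list(read_with_schema(RAW_STORE))
total_hits = sum(row["hits"] for row in rows)
```

The trade-off is that loading is cheap and nothing is thrown away, but every reader has to carry its own cleansing code.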
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16PB disk
• 64 TB of RAM
• 32000 Cores
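Dividing the cluster totals by the node count gives a feel for how modest each machine is. The arithmetic below is only a back-of-envelope check and assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB); the variable names are illustrative.

```python
# Cluster totals as listed on the slide (decimal units assumed).
nodes = 4000
disk_pb = 16
ram_tb = 64
cores = 32000

disk_tb_per_node = disk_pb * 1000 / nodes   # PB -> TB
ram_gb_per_node = ram_tb * 1000 / nodes     # TB -> GB
cores_per_node = cores / nodes

print(f"{disk_tb_per_node:g} TB disk, {ram_gb_per_node:g} GB RAM, "
      f"{cores_per_node:g} cores per node")
```

Under those assumptions each node works out to roughly 4 TB of disk, 16 GB of RAM and 8 cores.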
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: a Hadoop client (Java, Pig, Hive) submits work to the Job Tracker (Map Reduce, distributed processing); the Name Node and Secondary Name Node manage HDFS (distributed storage); each worker node runs a Data Node and a Task Tracker]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: a Hadoop client (Java, Pig, Hive) talks to the Resource Manager; Node Managers on the worker nodes host the Application Master and Containers]
77 Software Group
Tez (Hindi for "fast")
[Diagram: a chain of Map and Reduce steps shown twice, once as three separate jobs (Job 1, Job 2, Job 3) passing data through HDFS, and once as a single job (Job 1)]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing Oracle and HBase storage. Oracle: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks. HBase: Tables, MemStore, WA Log, HFiles, HDFS, Disks.]
79 Software Group
HBase Data Model

Denormalized table:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Normalized tables:
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase rows (sparse columns):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
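The wide, sparse rows on this slide can be imitated with a dictionary of dictionaries. This is only a sketch of the data model, not the HBase API: `to_wide_rows` is a hypothetical helper, and the site names are written with their dots restored.

```python
# Denormalized rows from the slide: (name, site, counter).
visits = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]

def to_wide_rows(rows):
    """Pivot into one sparse row per name; each site becomes a column."""
    table = {}
    for name, site, counter in rows:
        table.setdefault(name, {})[site] = counter
    return table

wide = to_wide_rows(visits)
# Rows need not share columns: absent columns simply are not stored.
dick_columns = sorted(wide["Dick"])
jane_has_ebay = "Ebay" in wide["Jane"]
```

Because each row stores only the columns it actually has, adding a new site needs no schema change and costs nothing for the rows that never visit it.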
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAVA
RESULTS
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase, MLlib – Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop $/TB (hardware only)
Exadata: $4911
Hadoop: $750
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48GB RAM per node (864GB total)
  – 2x6 Core CPU per node (216 total)
  – 12x2TB HDD per node (216 spindles, 864 TB)
  – 40Gb/s Infiniband between nodes
  – 10Gb/s Ethernet to datacentre
• Competitive Pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
Machine Learning: Programs that evolve with "experience"
Collective Intelligence: Programs that use inputs from "crowds" to seem intelligent
Predictive Analytics: Programs that extrapolate from existing data into the future
Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: scatter plot, both axes running from about 0 to 120, with fitted trend line f(x) = 0.971521231456065 x + 0.71906459527154]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
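The first technique in the list, linear regression, fits a straight line to observed points and then extrapolates from it. A minimal sketch using the ordinary least-squares formulas follows; the function name `fit_line` and the sample points are assumptions for illustration and are not the data behind the chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations lying exactly on y = 2x + 1.
xs = [0, 10, 20, 30, 40]
ys = [1, 21, 41, 61, 81]
slope, intercept = fit_line(xs, ys)
prediction = slope * 50 + intercept  # extrapolate beyond the observed range
```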
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
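One common way to group data without labels is k-means. The slide does not name an algorithm, so this is an illustrative choice rather than the presenter's method. The sketch works on one-dimensional data; the basket totals and the starting centroids are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical basket totals; no labels are supplied.
basket_totals = [5, 7, 6, 48, 52, 50, 8, 51]
centroids, clusters = k_means_1d(basket_totals, centroids=[0, 100])
```

The two groups (small baskets and large baskets) emerge from the data alone, which is the sense in which clustering needs no pre-existing classification scheme.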
121 Software Group
Supervised Machine Learning
[Diagram: Raw Data → Clean → Training Set and Validation Set → Model → Candidate Model → Validate → Production Model; New Data (New Business, Existing Business) → Production Model → Prediction]
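The stages named on this slide (training set, validation set, candidate model, production model, prediction) can be walked through with a deliberately tiny classifier. Everything here is a sketch under stated assumptions: the spend figures, the threshold "model" and the 0.9 promotion bar are hypothetical.

```python
def train(training_set):
    """Learn a spend threshold midway between the two class means."""
    churned = [x for x, label in training_set if label == "churn"]
    stayed = [x for x, label in training_set if label == "stay"]
    return (sum(churned) / len(churned) + sum(stayed) / len(stayed)) / 2

def predict(threshold, monthly_spend):
    # Low spenders are predicted to churn.
    return "churn" if monthly_spend < threshold else "stay"

def accuracy(threshold, labelled):
    hits = sum(predict(threshold, x) == label for x, label in labelled)
    return hits / len(labelled)

# Hypothetical cleaned data: (monthly spend, known outcome).
data = [(10, "churn"), (90, "stay"), (20, "churn"), (80, "stay"),
        (15, "churn"), (85, "stay"), (30, "churn"), (70, "stay")]
training_set, validation_set = data[:6], data[6:]

candidate = train(training_set)               # candidate model
score = accuracy(candidate, validation_set)   # validate before promoting
production_model = candidate if score >= 0.9 else None
new_prediction = (predict(production_model, 25)
                  if production_model is not None else None)
```

Holding back the validation set is the point of the exercise: the candidate is judged on records it never saw during training before it is trusted with new data.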
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics / Data Science
• Search Optimization
• Recommendation Systems
• Security
  – Vulnerability
  – Penetration Detection
• Fraud Detection
• CRM
  – Churn
  – Defaults
• Medical
  – Risk analysis
  – Diagnosis
  – Prognosis
• Game optimization
• Advertising
  – Targeting
  – Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Redo logs → Change Data Capture → JMS Queue → Hadoop Poster → batched HDFS file copy, audit change data, HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & Shareplex
TOAD BI
Boomi
Kitenga
Dell + StatSoft = completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
![Page 36: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/36.jpg)
![Page 37: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/37.jpg)
37 Software Group
The Revolution is not over yet
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data, Clean, Training Set, Validation Set, Model, Candidate Model, Validate, Production Model, New Data (New Business, Existing Business), Prediction]
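The workflow in the diagram (clean data, split into training and validation sets, build a candidate model, validate it, then use it on new data) might look like the sketch below. The churn data, the threshold model and the 0.9 acceptance score are all assumptions made for illustration.

```python
import random

def split(rows, validation_fraction=0.25, seed=42):
    """Shuffle cleaned rows and split into training and validation sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - validation_fraction))
    return rows[:cut], rows[cut:]

def train(training_set):
    """Candidate model: a threshold halfway between the two class means."""
    pos = [x for x, label in training_set if label]
    neg = [x for x, label in training_set if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(threshold, x):
    return x > threshold

def accuracy(threshold, rows):
    return sum(predict(threshold, x) == label for x, label in rows) / len(rows)

# Hypothetical cleaned data: (monthly support calls, churned?)
raw = [(0, False), (1, False), (2, False), (1, False),
       (7, True), (8, True), (9, True), (8, True)]
training_set, validation_set = split(raw)
candidate = train(training_set)
score = accuracy(candidate, validation_set)
# Promote the candidate to production only if it validates well
production_model = candidate if score >= 0.9 else None
print(candidate, score)
```

The validation set is never seen during training, so its score is an honest check before the model is applied to new business.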
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics / Data Science
• Search Optimization
• Recommendation Systems
• Security
  – Vulnerability
  – Penetration Detection
• Fraud Detection
• CRM
  – Churn
  – Defaults
• Medical
  – Risk analysis
  – Diagnosis
  – Prognosis
• Game optimization
• Advertising
  – Targeting
  – Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs, Change Data Capture, JMS Queue, Hadoop Poster, Batched HDFS File Copy, Audit Change Data, HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell’s offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Offerings shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Offerings shown: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It’s not enough to lay out products on tables
- There’s a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through “Big Data analytics”
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell’s offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 38: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/38.jpg)
38 Software Group
39 Software Group
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Don’t Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover $45.99
Amazon Kindle $39.99
Say “screw you bookseller” to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
“Siri, call me an ambulance”
From now on I’ll call you ‘An Ambulance’. OK?
“I want to jump off a bridge”
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
[Diagram labels: Google Applications; Map Reduce; BigTable; Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: Start, many parallel Map tasks, Reduce]
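The map, group and reduce pattern in the diagram can be sketched in plain Python as a word count. This runs in a single process purely for illustration; in a real cluster the map calls run in parallel across nodes, and the function names here are hypothetical.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key."""
    return key, sum(values)

lines = ["big data big", "data revolution"]
emitted = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())
print(counts)
```

Because each map call only sees its own line, the map work can be spread over as many machines as there are input splits.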
68 Software Group
Multi-stage Map-Reduce
[Diagram labels: HDFS; MAPPER tasks; SCAN, SORT; MAPPER tasks; AGGREGATE; REDUCE; Client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram. Schema on Write labels: Data, Extract, Transform, Load, Cleanse, Normalize, Aggregate, Code, Data Warehouse, Analyse, Utilize. Schema on Read labels: Data, Load, Hadoop, Cleanse, Code, Analyse, Utilize.]
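A small sketch may help with the schema-on-read idea: raw records are loaded untouched and a structure is imposed only when the data is queried. The record layout and the `read_with_schema` helper are assumptions made for this example.

```python
# Raw lines are stored untouched; a schema is applied only when reading.
raw_store = [
    "2014-04-08,dick,ebay,3",
    "2014-04-08,jane,google,5",
    "not,a,valid,row,at,all",   # kept on load; handled at read time
]

def read_with_schema(lines):
    """Apply (date, user, site, hits) at query time, skipping bad rows."""
    for line in lines:
        parts = line.split(",")
        if len(parts) != 4 or not parts[3].isdigit():
            continue
        date, user, site, hits = parts
        yield {"date": date, "user": user, "site": site, "hits": int(hits)}

total_hits = sum(row["hits"] for row in read_with_schema(raw_store))
print(total_hits)
```

With schema on write, the malformed third line would have been rejected or repaired during the load; here the decision is deferred to each reader.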
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4,000 nodes
• 16 PB disk
• 64 TB of RAM
• 32,000 cores
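Dividing the cluster totals by the node count gives a feel for how ordinary each machine is. The arithmetic below assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an even spread across nodes, neither of which the slide states.

```python
# Per-node averages for the cluster figures quoted on the slide.
nodes = 4000
disk_tb = 16 * 1000      # 16 PB, taking 1 PB = 1000 TB
ram_gb = 64 * 1000       # 64 TB, taking 1 TB = 1000 GB
cores = 32000

disk_per_node_tb = disk_tb / nodes
ram_per_node_gb = ram_gb / nodes
cores_per_node = cores / nodes
print(disk_per_node_tb, ram_per_node_gb, cores_per_node)
```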
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); MAP REDUCE (DISTRIBUTED PROCESSING) with JOB TRACKER; HDFS (DISTRIBUTED STORAGE) with NAME NODE and SECONDARY NAME NODE; many worker nodes, each labelled DATA NODE and TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; three NODE MANAGERs, each with a CONTAINER]
77 Software Group
Tez¹
¹ Hindi for “fast”
[Diagram labels: HDFS; alternating MAP and REDUCE stages; Job 1, Job 2, Job 3; HDFS; Job 1]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram, two side-by-side stacks. First stack labels: Table, Buffer Cache, Log Buffer, Datafiles, Redo, ASM, Disks. Second stack labels: Table, MemStore, WA Log, HFile, HDFS, Disks.]
79 Software Group
HBase Data Model

Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
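The wide, sparse rows at the end of the slide can be imitated with nested dictionaries: each row key holds only the columns that row actually uses. This is an analogy in plain Python, not the HBase client API, and it leaves out column families and cell versions; `put` and `get` are hypothetical helpers.

```python
# Each row key maps to only the columns that row actually has (sparse).
table = {}

def put(row_key, column, value):
    """Store one cell, creating the row on first use."""
    table.setdefault(row_key, {})[column] = value

def get(row_key, column):
    """Return one cell, or None when the row or column is absent."""
    return table.get(row_key, {}).get(column)

put("Dick", "Ebay", 507018)
put("Dick", "Google", 690414)
put("Dick", "Facebook", 723649)
put("Dick", "MadBillFans.com", 675230)
put("Jane", "Google", 716426)
put("Jane", "Facebook", 643261)
put("Jane", "ILoveLarry.com", 856767)

dick_columns = sorted(table["Dick"])
jane_ebay = get("Jane", "Ebay")  # None: Jane's row has no Ebay column
print(dick_columns, jane_ebay)
```

Compared with the normalized three-table layout, a lookup for one person touches a single row and needs no joins, at the cost of every row carrying its own column names.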
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: WebLogs feed HDFS through FLUME; RDBMS tables (CUSTOMERS, PRODUCTS) feed HDFS through SQOOP]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
[Stack diagram labels:]
• Yarn, EC2
• Mesos – heterogeneous cluster manager
• Tachyon – in-memory file system
• Spark – memory-optimized distributed execution
• Spark Streaming
• MLbase, MLlib – Machine Learning
• Map Reduce
• Shark (SQL), Hive (SQL)
• BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
• Database servers: 64 cores, 576 GB RAM
• Storage servers: 112 cores; 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
[Bar chart: Exadata vs Hadoop, $/TB (hardware only); axis $0 to $6,000; bars labelled Exadata and Hadoop, values $4,911 and $750]
![Page 39: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/39.jpg)
![Page 40: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/40.jpg)
40 Software Group
41 Software Group
42 Software Group
Willy Bowman
Nationality German
Don't Mention the WAR
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover $45.99
Amazon Kindle $39.99
Say "screw you bookseller" to buy Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
"Siri, call me an ambulance"
From now on I'll call you 'An Ambulance'. OK?
"I want to jump off a bridge"
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Map Reduce
[Diagram: Start, fanning out to many parallel Map tasks, which feed a single Reduce]
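The fan-out of Map tasks into a Reduce step can be sketched in a few lines. This is a minimal single-process illustration, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are assumptions chosen for the example, and a real cluster would run the map calls in parallel on different nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group every emitted value under its key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input lines, one per "mapper".
    print(word_count(["big data", "Big database technologies"]))
```

Each call to `map_phase` is independent of the others, which is what lets the framework spread the Map boxes across many machines before the Reduce step combines the results.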
68 Software Group
Multi-stage Map-Reduce
[Diagram: HDFS feeds a first stage of MAPPERs (SCAN/SORT), then a second stage of MAPPERs (AGGREGATE), then REDUCE, which returns results to the Client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram: two pipelines]
Schema on Write: Data, Extract / Transform / Load (Code: Cleanse, Normalize, Aggregate), Data Warehouse, Analyse, Utilize
Schema on Read: Data, Load, Hadoop, Code (Cleanse, Analyse), Utilize
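The difference between the two pipelines can be illustrated with a small sketch. It assumes a hypothetical feed of JSON click records (`RAW_EVENTS`) and hypothetical function names; it is not taken from the slides. Schema on write validates and converts records at load time, while schema on read stores the raw lines and imposes structure only when a query runs.

```python
import json

# Hypothetical raw input: one good record, one incomplete, one malformed.
RAW_EVENTS = [
    '{"user": "jane", "site": "Google", "hits": "3"}',
    '{"user": "dick", "site": "Ebay"}',
    'not json at all',
]


def load_schema_on_write(raw_lines):
    """Schema on write: cleanse and type every record before storing it."""
    table = []
    for line in raw_lines:
        try:
            rec = json.loads(line)
            table.append((rec["user"], rec["site"], int(rec["hits"])))
        except (ValueError, KeyError):
            continue  # record rejected at load time
    return table


def load_schema_on_read(raw_lines):
    """Schema on read: store the raw lines untouched."""
    return list(raw_lines)


def query_hits(raw_store):
    """Structure is applied only now, by the code that reads the data."""
    total = 0
    for line in raw_store:
        try:
            rec = json.loads(line)
        except ValueError:
            continue  # skip lines this particular query cannot parse
        total += int(rec.get("hits", 0))
    return total


if __name__ == "__main__":
    print(load_schema_on_write(RAW_EVENTS))
    print(query_hits(load_schema_on_read(RAW_EVENTS)))
```

In the schema-on-read version nothing is discarded at load time, so a later query with different rules can still use the records that the schema-on-write loader rejected.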
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16PB disk
• 64 TB of RAM
• 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
Hadoop File System (HDFS)
Map Reduce / YARN
HBase (Database)
ZooKeeper (Locking)
SQOOP (RDBMS loader)
Hive (Query)
Pig (Scripting)
Flume (Log Loader)
Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); JOB TRACKER; NAME NODE; SECONDARY NAME NODE; many worker nodes, each running a DATA NODE and a TASK TRACKER; layers labelled MAP REDUCE (DISTRIBUTED PROCESSING) and HDFS (DISTRIBUTED STORAGE)]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; three NODE MANAGERs, each with a CONTAINER]
77 Software Group
Tez (Hindi for "fast")
[Diagram: MAP and REDUCE stages over HDFS, run as three separate jobs (Job 1, Job 2, Job 3) versus a single job (Job 1)]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram: database architecture labels (Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks) alongside the HBase equivalents (Tables, MemStore, WA Log, HFile, HDFS, Disks)]
79 Software Group
HBase Data Model

Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
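The wide-row layout in the last two tables can be sketched as a nested dictionary: one entry per row key, holding only the columns that row actually has. This is an in-memory illustration using the counters from the slide, not the HBase client API; `FLAT_ROWS` and `to_wide_rows` are names invented for the example.

```python
# Denormalized (name, site, counter) rows, as in the first table.
FLAT_ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(flat_rows):
    """Pivot flat rows into {row_key: {column_name: value}}.

    Each row keeps only the columns it has values for, so rows
    may carry different column sets (a sparse, wide table).
    """
    table = {}
    for name, site, counter in flat_rows:
        table.setdefault(name, {})[site] = counter
    return table


if __name__ == "__main__":
    wide = to_wide_rows(FLAT_ROWS)
    for row_key, columns in wide.items():
        print(row_key, columns)
```

Here Jane's row has no Ebay column at all, rather than an Ebay column holding a null, which is the point the two differently shaped rows on the slide make.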
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill
Aster
Greenplum (Pivotal HD)
Paraccel
Hadapt
Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase / MLlib – Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
[Bar chart: Exadata vs Hadoop, $/TB (hardware only); series Exadata and Hadoop; bar labels $4911 and $750; axis $0 to $6000]
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
– 48GB RAM per node (864GB total)
– 2x6 Core CPU per node (216 total)
– 12x2TB HDD per node (216 spindles, 864 TB)
– 40Gb/s InfiniBand between nodes
– 10Gb/s Ethernet to datacentre
• Competitive Pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
Machine Learning: Programs that evolve with "experience"
Collective Intelligence: Programs that use inputs from "crowds" to seem intelligent
Predictive Analytics: Programs that extrapolate from existing data into the future
Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart with fitted line; x axis 0 to 120, y axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
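A fitted line of the form f(x) = slope · x + intercept, like the one on the chart, can be obtained with ordinary least squares. The sketch below is a plain-Python illustration; the function names and the sample points are assumptions, not the data behind the slide.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical observations lying exactly on y = 2x + 1.
    slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
    print(slope, intercept, predict(slope, intercept, 10))
```

The same fit-then-extrapolate pattern underlies the other techniques in the list; they differ in the shape of the model rather than in the workflow.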
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
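As an illustration of the spam-detection case, here is a minimal naive Bayes text classifier with add-one smoothing. It is one of several techniques that could be used and is not named on the slide; the training messages, labels and function names are invented for the example.

```python
import math
from collections import Counter

# Hypothetical labelled training messages.
TRAINING = [
    ("win cash prize now", "spam"),
    ("cheap prize offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status report", "ham"),
]


def train(examples):
    """Count words per label and messages per label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts = model
    vocab = set().union(*word_counts.values())
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_docs)  # prior
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train(TRAINING)
    print(classify("free cash prize", model))
```

The model is built once from labelled data and then applied to messages it has never seen, which is the "classifies new data" step in the first bullet.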
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
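Grouping without predefined labels can be sketched with k-means, a common clustering algorithm (the slide does not say which algorithm is meant). The points below are hypothetical two-feature records, and starting from the first k points is a simplification that keeps the example deterministic.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(cluster):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def kmeans(points, k, iterations=10):
    """Assign points to the nearest centroid, then move each centroid."""
    centroids = list(points[:k])  # simplification: first k points as seeds
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ended up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical records, e.g. (items in basket, basket value band).
    pts = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
    print(kmeans(pts, 2))
```

No labels are supplied anywhere; the two groups emerge from the distances between the records alone.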
121 Software Group
Supervised Machine Learning
[Diagram: Raw Data, Clean, Training Set, Validation Set, Model, Validate, Candidate Model, Production Model, New Data (New Business, Existing Business), Prediction]
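The workflow in the diagram (split cleaned data into training and validation sets, fit a candidate model, validate it, then use it to predict on new data) can be sketched as follows. The one-feature threshold model, the sample records and all function names are assumptions made for illustration.

```python
import random

# Hypothetical cleaned records: (feature value, label).
DATA = [(1, False), (2, False), (3, False), (4, False),
        (10, True), (11, True), (12, True), (13, True)]


def split(records, validation_fraction=0.25, seed=42):
    """Shuffle and split records into a training set and a validation set."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def train_threshold(training_set):
    """Candidate model: midpoint between the mean feature of each class."""
    pos = [x for x, label in training_set if label]
    neg = [x for x, label in training_set if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, x):
    """Production use: score a new record against the trained threshold."""
    return x > threshold


def accuracy(threshold, validation_set):
    """Validate: fraction of held-out records the model labels correctly."""
    hits = sum(predict(threshold, x) == label for x, label in validation_set)
    return hits / len(validation_set)


if __name__ == "__main__":
    training_set, validation_set = split(DATA)
    candidate = train_threshold(training_set)
    print(candidate, accuracy(candidate, validation_set), predict(candidate, 20))
```

Holding back the validation set is what lets the candidate model be checked on records it was not fitted to before it is promoted to production.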
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
– Mobility
– Cloud
– Hadoop
– Big Data Analytics
• Where is the data?
– Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
– Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
– Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
– SQOOP
– Hive
– Pig
Please complete the session evaluation on the mobile app
We appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
![Page 41: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/41.jpg)
![Page 42: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/42.jpg)
42 Software Group
Willy Bowman
Nationality German
Donrsquot Mention the WAR
43 Software Group
Buying choices
Amazon softcover $4599
Oracle Performance Survival Guide
Amazon Kindle $3999
Say ldquoscrew you booksellerrdquo to buy kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: data points with fitted line f(x) = 0.971521231456065 x + 0.71906459527154]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
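The fitted line on this slide is the kind of result an ordinary least-squares fit produces. The sketch below shows one way to compute a slope and intercept and then extrapolate; the sample points are invented for illustration and are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical observations, roughly linear.
    xs = [0, 20, 40, 60, 80, 100]
    ys = [1, 19, 41, 58, 82, 99]
    slope, intercept = fit_line(xs, ys)
    print(slope, intercept, predict(slope, intercept, 120))
```

Predicting at x = 120 is the "extrapolate from existing data into the future" step; how far such an extrapolation can be trusted depends on how well a straight line describes the data.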
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
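As a rough illustration of the spam-detection case, here is a minimal naive Bayes text classifier. Naive Bayes is chosen only because it is short; the slide does not name a technique. The training messages and the labels "spam" and "ham" are invented for the example.

```python
import math
from collections import Counter


def train(examples):
    """Count words per label from (text, label) training pairs."""
    word_counts = {}
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocabulary.add(word)
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        counts = word_counts[label]
        # Add-one smoothing so unseen word/label pairs do not zero the score.
        denominator = sum(counts.values()) + len(vocabulary)
        score = math.log(n / total_examples)
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    model = train([
        ("win cash prize now", "spam"),
        ("cheap prize offer", "spam"),
        ("meeting agenda attached", "ham"),
        ("lunch meeting tomorrow", "ham"),
    ])
    print(classify(model, "free cash prize"))
    print(classify(model, "meeting tomorrow"))
```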
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
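Grouping without a pre-existing scheme can be sketched with k-means (Lloyd's algorithm), one common clustering method; the slide does not say which method it has in mind. The 2-D points and starting centroids below are hypothetical, and starting centroids are passed in explicitly so that the run is repeatable.

```python
def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def assign(points, centroids):
    """Put each point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, centroids, max_iterations=20):
    """Alternate assignment and centroid update until nothing moves."""
    centroids = list(centroids)
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append((sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
    centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
    print(centroids)
    print(clusters)
```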
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Training Set; Validation Set; Model; Candidate Model; Validate; Production Model; New Data; Prediction; Existing Business; New Business]
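The labels on this slide (training set, validation set, candidate model, production model) suggest a holdout workflow. The sketch below is one reading of it, not the presenter's own method: labelled rows are split, each candidate is fitted on the training set, and the candidate with the lowest validation error is promoted. The two candidate models and all function names are assumptions made for the example.

```python
import random


def split(rows, validation_fraction=0.3, seed=0):
    """Shuffle labelled rows and split them into training and validation sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_validation = max(1, round(len(rows) * validation_fraction))
    return rows[n_validation:], rows[:n_validation]


def fit_mean(training):
    """Candidate 1: always predict the average outcome seen in training."""
    average = sum(y for _, y in training) / len(training)
    return lambda x: average


def fit_line(training):
    """Candidate 2: least-squares straight line through the training rows."""
    n = len(training)
    mean_x = sum(x for x, _ in training) / n
    mean_y = sum(y for _, y in training) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in training)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in training)
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model, rows):
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def select_production_model(rows, candidates):
    """Fit every candidate on the training set; keep the best on validation."""
    training, validation = split(rows)
    fitted = {name: fit(training) for name, fit in candidates.items()}
    best = min(fitted,
               key=lambda name: mean_squared_error(fitted[name], validation))
    return best, fitted[best]


if __name__ == "__main__":
    # Hypothetical "existing business" history: outcome y observed for input x.
    existing_business = [(x, 2 * x + 1) for x in range(20)]
    name, model = select_production_model(
        existing_business, {"mean": fit_mean, "line": fit_line})
    print(name, model(25))  # prediction for a new, unseen input
```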
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
Redo logs
Change Data Capture
JMS Queue
Hadoop Poster
Batched HDFS File Copy
Audit Change Data
HBase Real-Time replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell’s offering was not complete…
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
• Where is the data?
  - Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  - Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
This box will have simplified instructions about how to complete the session evaluation online
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It’s not enough to lay out products on tables
- There’s a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through “Big Data analytics”
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell’s offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 43: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/43.jpg)
43 Software Group
Buying choices
Oracle Performance Survival Guide
Amazon softcover: $45.99
Amazon Kindle: $39.99
Say “screw you, bookseller” to buy the Kindle version
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
“Siri, call me an ambulance”
From now on I’ll call you ‘An Ambulance’. OK?
“I want to jump off a bridge”
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads-up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
[Diagram labels: Start; many parallel Map tasks; Reduce]
Map Reduce
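The many-maps-then-reduce shape of this slide can be sketched in a few lines. This is a single-process illustration of the programming model using word count, the usual teaching example; it is not Hadoop or Google code, and the function names are assumptions. In a real cluster each map call would run on a different node and the shuffle would move data across the network.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run map over every document, group by key, then reduce each group."""
    pairs = (pair for document in documents for pair in map_phase(document))
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(map_reduce(["big data", "big revolution"]))
```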
68 Software Group
[Diagram labels: HDFS; MAPPER (x12); SCAN; SORT; AGGREGATE; REDUCE; Client]
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
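The contrast on this slide can be sketched in code. With schema on write, structure is enforced when data is loaded, so malformed rows are rejected before storage. With schema on read, raw data is stored untouched and each query decides how to interpret it. The sketch assumes JSON lines as the raw format and a two-field schema; both are hypothetical, as are the function names.

```python
import json

# Hypothetical schema: field name -> conversion applied to the raw value.
SCHEMA = {"user": str, "amount": float}


def load_schema_on_write(raw_lines):
    """Validate and convert at load time; rows that do not fit are rejected."""
    table, rejected = [], []
    for line in raw_lines:
        try:
            record = json.loads(line)
            table.append({name: cast(record[name])
                          for name, cast in SCHEMA.items()})
        except (ValueError, KeyError, TypeError):
            rejected.append(line)
    return table, rejected


def load_schema_on_read(raw_lines):
    """Store the raw lines untouched; no structure is imposed yet."""
    return list(raw_lines)


def query_total(raw_store):
    """Apply structure at query time, skipping lines this query cannot use."""
    total = 0.0
    for line in raw_store:
        try:
            total += float(json.loads(line)["amount"])
        except (ValueError, KeyError, TypeError):
            continue
    return total


if __name__ == "__main__":
    raw = [
        '{"user": "dick", "amount": "10.5"}',
        '{"user": "jane", "amount": 4}',
        'not json',
        '{"user": "bob"}',
    ]
    table, rejected = load_schema_on_write(raw)
    print(len(table), len(rejected))
    print(query_total(load_schema_on_read(raw)))
```

The trade-off is where the cleansing cost lands: once at load time, or repeatedly in each query that touches the raw data.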
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce
YARN
HBase (Database)
ZooKeeper (Locking)
SQOOP (RDBMS loader)
Hive (Query)
Pig (Scripting)
Flume (Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 1.0 Architecture
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); MAP REDUCE (DISTRIBUTED PROCESSING) with JOB TRACKER; HDFS (DISTRIBUTED STORAGE) with NAME NODE and SECONDARY NAME NODE; many worker nodes, each labelled DATA NODE and TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; NODE MANAGER with CONTAINER (x3)]
77 Software Group
Tez (Hindi for “fast”)
[Diagram labels: HDFS; chained MAP and REDUCE stages; Job 1, Job 2, Job 3; Job 1; HDFS]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing two stacks. First stack: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks. Second stack: Tables, MemStore, WA Log, HFiles, HDFS, Disks]
79 Software Group
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767

HBase Data Model
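The last two tables on this slide show each person as one wide row whose columns differ from row to row. A plain dictionary of dictionaries gives a feel for that sparse layout. This is only an illustration of the data model using the slide's sample values; it does not use the HBase client API, and `to_wide_rows` is a made-up name.

```python
# (name, site, counter) rows as they would appear in a relational table.
ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot (row key, column, value) triples into one sparse row per key."""
    table = {}
    for row_key, column, value in rows:
        table.setdefault(row_key, {})[column] = value
    return table


if __name__ == "__main__":
    wide = to_wide_rows(ROWS)
    for row_key, columns in wide.items():
        # Each row stores only the columns it actually has.
        print(row_key, columns)
```

Jane's row has no Ebay column at all, rather than an Ebay column holding a null, which is the point of the sparse layout.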
![Page 44: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/44.jpg)
44 Software Group
45 Software Group
Data Input
46 Software Group
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- Itrsquos not enough to lay out products on tables
- Therersquos a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 10 Architecture
- Hadoop 20 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through ldquoBig Data analyticsrdquo
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell’s offering was not complete…
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
-
![Page 45: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/45.jpg)
45 Software Group
Data Input
46 Software Group
Siri
“Siri, call me an ambulance”
From now on I’ll call you ‘An Ambulance’. OK?
“I want to jump off a bridge”
I found 14 bridges nearby
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
[Diagram layers: Google Applications; Map Reduce, BigTable; Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: Start, many parallel Map tasks, Reduce]
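The Map Reduce flow pictured on this slide (many Map tasks feeding a Reduce) can be sketched in plain Python. This is a single-process illustration only, not Hadoop or Google code; the word-count task, the function names and the sample input are all assumptions chosen for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    # Each document stands in for a block handled by a separate Map task.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input splits.
    print(map_reduce(["big data big ideas", "data means more"]))
```

In a real cluster the Map calls run in parallel on the nodes that hold the data, and the grouping step moves data across the network; here both are ordinary function calls.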
68 Software Group
Multi-stage Map-Reduce
[Diagram labels: HDFS; Mapper (many, in two stages); Scan; Sort; Aggregate; Reduce; Client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram, Schema on Write (Data Warehouse): Data; Extract, Transform, Load; Cleanse, Normalize, Aggregate, Analyse, Code; Utilize]
[Diagram, Schema on Read (Hadoop): Data; Load; Cleanse, Analyse, Code; Utilize]
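The contrast between the two approaches can be illustrated with a small Python sketch. It is a toy under stated assumptions: the JSON event format, the field names and both loader functions are hypothetical and do not come from the slides.

```python
import json

# Hypothetical raw input: one JSON document per line, not all well-formed
# with respect to the (user, site, hits) structure a warehouse would expect.
RAW_EVENTS = [
    '{"user": "jane", "site": "Google", "hits": "3"}',
    '{"user": "dick", "site": "Ebay"}',
    '{"user": "jane", "hits": "2", "extra": "x"}',
]


def load_schema_on_write(lines):
    """Schema on write: validate and convert every record before storing it."""
    table, rejected = [], []
    for line in lines:
        record = json.loads(line)
        try:
            row = (record["user"], record["site"], int(record["hits"]))
        except KeyError:
            rejected.append(line)  # does not fit the schema, so not loaded
            continue
        table.append(row)
    return table, rejected


def load_schema_on_read(lines):
    """Schema on read: store the raw lines untouched."""
    return list(lines)


def hits_per_user(raw_store):
    """Impose structure at query time, only for the fields this query needs."""
    totals = {}
    for line in raw_store:
        record = json.loads(line)
        hits = int(record.get("hits", 0))
        totals[record["user"]] = totals.get(record["user"], 0) + hits
    return totals


if __name__ == "__main__":
    print(load_schema_on_write(RAW_EVENTS))
    print(hits_per_user(load_schema_on_read(RAW_EVENTS)))
```

The schema-on-write loader rejects two of the three records up front, while the schema-on-read store keeps everything and leaves interpretation to each query.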
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16PB disk
• 64 TB of RAM
• 32000 Cores
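As a rough check on what those cluster totals mean per machine, the averages can be worked out directly. The sketch assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an even spread across nodes; neither assumption is stated on the slide.

```python
NODES = 4000
DISK_PB = 16
RAM_TB = 64
CORES = 32000

# Assumption: decimal units and identical nodes.
disk_tb_per_node = DISK_PB * 1000 / NODES
ram_gb_per_node = RAM_TB * 1000 / NODES
cores_per_node = CORES / NODES

if __name__ == "__main__":
    print(f"Disk per node:  {disk_tb_per_node} TB")
    print(f"RAM per node:   {ram_gb_per_node} GB")
    print(f"Cores per node: {cores_per_node}")
```

Under those assumptions each node is an ordinary server: about 4 TB of disk, 16 GB of RAM and 8 cores.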
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• Hbase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: Hadoop client (Java, Pig, Hive); Map Reduce (distributed processing) with a Job Tracker; HDFS (distributed storage) with a Name Node and a Secondary Name Node; many worker nodes, each running a Data Node and a Task Tracker]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: Hadoop client (Java, Pig, Hive); Resource Manager; Application Master; several Node Managers, each with a Container]
77 Software Group
Tez¹
¹ Hindi for “fast”
[Diagram labels: HDFS; Map and Reduce stages marked Job 1, Job 2 and Job 3; HDFS; Job 1]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram labels, first stack: Table, Table; Buffer Cache; Log Buffer; Datafiles; Redo; ASM; Disks]
[Diagram labels, second stack: Table, Table; MemStore; HFile, HFile; WA Log; HDFS; Disks]
79 Software Group
Hbase Data Model

Single flat table:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Normalized relational tables:
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase rows (one sparse row per name, one column per site):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
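The wide, sparse row layout on this slide can be mimicked with nested dictionaries. The sketch below is only an analogy in plain Python, not the HBase client API; the function name is hypothetical and the data is the sample from the slide.

```python
from collections import defaultdict

# (name, site, counter) rows as they would appear in a relational table.
ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot relational rows into one sparse row per name.

    The site name becomes the column key, so each row only stores the
    columns it actually has a value for.
    """
    table = defaultdict(dict)
    for name, site, counter in rows:
        table[name][site] = counter
    return dict(table)


wide = to_wide_rows(ROWS)

if __name__ == "__main__":
    for row_key, columns in wide.items():
        print(row_key, columns)
```

Looking up `wide["Jane"]` returns only Jane's three sites; there is no empty "Ebay" cell to store, which is the point of the sparse layout.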
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: WebLogs loaded into HDFS by FLUME; RDBMS tables (CUSTOMERS, PRODUCTS) loaded into HDFS by SQOOP]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase, MLlib – Machine Learning
Map Reduce
Shark (SQL), Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop, $/TB (hardware only)
[Bar chart, axis $0 to $6000: Exadata $4911; Hadoop $750]
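The gap between the two figures is easier to appreciate as a ratio. The sketch assumes the chart labels pair up in the order listed (Exadata at $4911 per TB, Hadoop at $750 per TB); the 100 TB workload and the helper function are hypothetical.

```python
# Assumption: values read from the chart in the order the labels appear.
EXADATA_PER_TB = 4911
HADOOP_PER_TB = 750

ratio = EXADATA_PER_TB / HADOOP_PER_TB


def hardware_cost(terabytes, dollars_per_tb):
    """Hardware-only cost for a given capacity."""
    return terabytes * dollars_per_tb


if __name__ == "__main__":
    print(f"Exadata costs about {ratio:.1f}x as much per TB")
    # Hypothetical 100 TB data set.
    print("Exadata:", hardware_cost(100, EXADATA_PER_TB))
    print("Hadoop: ", hardware_cost(100, HADOOP_PER_TB))
```

As on the slide, this covers hardware only; software licences, staffing and operations are outside the comparison.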
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48GB RAM per node (864GB total)
  – 2x6 Core CPU per node (216 total)
  – 12x2TB HDD per node (216 spindles, 864 TB)
  – 40Gb/s Infiniband between nodes
  – 10Gb/s Ethernet to datacentre
• Competitive Pricing
www.oracle.com/us/bigdata/index.html
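The rack totals can be recomputed from the per-node figures with a few lines of Python. This is plain arithmetic on the numbers as transcribed from the slide. Note that 216 spindles at 2 TB each come to 432 TB, not the 864 TB capacity shown, and the sketch makes no attempt to reconcile the two.

```python
NODES = 18
RAM_GB_PER_NODE = 48
CPUS_PER_NODE = 2
CORES_PER_CPU = 6
DISKS_PER_NODE = 12
TB_PER_DISK = 2

total_ram_gb = NODES * RAM_GB_PER_NODE
total_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU
spindles = NODES * DISKS_PER_NODE
raw_tb = spindles * TB_PER_DISK  # implied by the per-node figures only

if __name__ == "__main__":
    print("RAM (GB):", total_ram_gb)
    print("Cores:   ", total_cores)
    print("Spindles:", spindles)
    print("Raw TB:  ", raw_tb)
```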
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through “Big Data analytics”
Machine Learning: Programs that evolve with “experience”
Collective Intelligence: Programs that use inputs from “crowds” to seem intelligent
Predictive Analytics: Programs that extrapolate from existing data into the future
Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: data with a fitted trend line; x axis 0 to 120, y axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
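The first technique in that list, linear regression, fits a line f(x) = slope · x + intercept to existing points and then extrapolates. A minimal least-squares sketch in plain Python follows; the function names and the sample points are made up for illustration and are not the data behind the chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor.

    Returns (slope, intercept) minimising the squared vertical error.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Extrapolate from the fitted line to a new x value."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical observations.
    xs = [0, 20, 40, 60, 80, 100]
    ys = [2, 19, 41, 58, 80, 99]
    model = fit_line(xs, ys)
    print(model, predict(model, 120))
```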
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
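One very small way to build such a model is a nearest-centroid classifier: average the labelled examples of each class, then assign new data to the closest average. The sketch uses spam detection as the example; the two features (links in the message, known contacts among the recipients) and all values are hypothetical.

```python
def train_centroids(samples):
    """samples: list of (features, label). Returns label -> mean feature vector."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Assign the label whose centroid is nearest (squared Euclidean distance)."""
    def distance(label):
        return sum((c - f) ** 2 for c, f in zip(centroids[label], features))
    return min(centroids, key=distance)


# Hypothetical training data: (links, known contacts) -> label.
TRAINING = [
    ([8, 1], "spam"),
    ([9, 0], "spam"),
    ([1, 5], "ham"),
    ([0, 6], "ham"),
]

if __name__ == "__main__":
    model = train_centroids(TRAINING)
    print(classify(model, [7, 1]))
    print(classify(model, [1, 4]))
```

Production classifiers use richer features and algorithms, but the shape is the same: learn from labelled history, then label new records.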
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
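Unlike classification, clustering starts with no labels at all. k-means is one common algorithm for it; the slides do not name a specific method, so treat this as one possible illustration. The one-dimensional data (basket totals) and the starting centroids are hypothetical.

```python
def k_means_1d(values, centroids, iterations=10):
    """Basic k-means on one-dimensional data.

    Repeatedly assigns each value to its nearest centroid, then moves each
    centroid to the mean of the values assigned to it.
    """
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical basket totals in dollars.
    baskets = [5, 7, 6, 48, 52, 50]
    print(k_means_1d(baskets, centroids=[0, 100]))
```

The algorithm is given only the number of groups; the split into small and large baskets emerges from the data.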
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Training Set; Validation Set; Model; Candidate Model; Validate; Production Model; New Data; Prediction; Existing Business; New Business]
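The labels on this slide suggest a workflow in which cleaned data is split into a training set and a validation set, a candidate model is trained and validated, and the accepted model then scores new data. A minimal sketch of that reading follows. The churn data, the one-feature threshold model and the deterministic split are all assumptions made for the example.

```python
def split(records, every=3):
    """Hold out every n-th record for validation; train on the rest."""
    training = [r for i, r in enumerate(records) if i % every != 0]
    validation = [r for i, r in enumerate(records) if i % every == 0]
    return training, validation


def train(training_set):
    """Candidate model: a threshold halfway between the two class means."""
    positives = [x for x, label in training_set if label]
    negatives = [x for x, label in training_set if not label]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def predict(threshold, x):
    return x > threshold


def accuracy(threshold, validation_set):
    hits = sum(predict(threshold, x) == label for x, label in validation_set)
    return hits / len(validation_set)


# Hypothetical existing-business data: (support calls per month, churned?).
RECORDS = [
    (1, False), (8, True), (2, False), (9, True),
    (0, False), (7, True), (3, False), (10, True),
]

if __name__ == "__main__":
    training_set, validation_set = split(RECORDS)
    candidate = train(training_set)
    if accuracy(candidate, validation_set) >= 0.9:  # promote to production
        print("Prediction for new customer with 6 calls:", predict(candidate, 6))
```

The validation set matters because it is data the model has not seen; accuracy measured on the training set alone would overstate how well the model generalises.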
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
• Search Optimization
• Recommendation Systems
• Security
  – Vulnerability
  – Penetration Detection
• Fraud Detection
• CRM
  – Churn
  – Defaults
• Medical
  – Risk analysis
  – Diagnosis
  – Prognosis
• Game optimization
• Advertising
  – Targeting
  – Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell’s offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell offerings shown: Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires Statsoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell offerings shown: Server and Storage, STATISTICA, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
Dell + StatSoft = completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
This box will have simplified instructions about how to complete the session evaluation online
![Page 46: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/46.jpg)
![Page 47: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/47.jpg)
Siri
From now on Irsquoll call you lsquoAn Ambulancersquo OK
ldquoSiri call me an ambulancerdquo
I found 14 bridges nearby
ldquoI want to jump off a bridgerdquo
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: x axis 0 to 120, y axis -20 to 120, with fitted trend line]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
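A trend line like the one on this slide can be produced by an ordinary least-squares fit. The sketch below is a minimal pure-Python version, not the tool used to draw the chart; the function names `fit_line` and `predict` and the sample points are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Extrapolate from the fitted line to a new x value."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical observations that lie exactly on y = 2x + 1.
    xs = [0, 20, 40, 60, 80, 100]
    ys = [2 * x + 1 for x in xs]
    model = fit_line(xs, ys)
    print(model, predict(model, 120))
```

Predictive analytics in this sense is the last step: once the line is fitted to existing data, `predict` extends it to values that have not been observed yet.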
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
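Grouping without a pre-existing scheme can be illustrated with k-means, one common clustering technique (the slide does not name a specific algorithm). The sketch below works on one-dimensional values and uses a deliberately naive initialisation; the function name `kmeans_1d` and the sample basket values are assumptions.

```python
def kmeans_1d(values, k, iterations=10):
    """Very small k-means on 1-D data; returns (centroids, clusters)."""
    centroids = list(values[:k])  # naive initialisation: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each value joins its nearest centroid.
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical basket values: small shops and big shops, no labels given.
    baskets = [1, 2, 3, 10, 11, 12]
    print(kmeans_1d(baskets, k=2))
```

No labels are supplied anywhere: the two groups emerge from the data alone, which is the point the slide makes.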
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Training Set; Validation Set; Model; Candidate Model; Validate; Production Model; New Data; Prediction; Existing Business; New Business]
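The training-set and validation-set flow on this slide can be sketched in a few functions. Everything here is illustrative: the single "support calls" feature, the threshold model, and the every-fourth-record hold-out are assumptions chosen to keep the example reproducible. A real pipeline would normally shuffle before splitting and use a richer model.

```python
def split(records, every=4):
    """Hold back every Nth record for validation; train on the rest."""
    validation = records[::every]
    training = [r for i, r in enumerate(records) if i % every != 0]
    return training, validation


def train(training_set):
    """Candidate model: a threshold halfway between the two class means."""
    pos = [x for x, label in training_set if label]
    neg = [x for x, label in training_set if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, x):
    """Production use: classify a new, unlabelled value."""
    return x > threshold


def accuracy(threshold, validation_set):
    """Validate the candidate model on data it has not seen."""
    hits = sum(predict(threshold, x) == label for x, label in validation_set)
    return hits / len(validation_set)


if __name__ == "__main__":
    # Hypothetical cleaned data: (support calls last month, customer churned?)
    records = [(x, x >= 5) for x in range(10)]
    training, validation = split(records)
    model = train(training)
    print(model, accuracy(model, validation), predict(model, 7))
```

Only a model that scores well on the validation set would be promoted to production and applied to new data.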
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS file copy; Audit change data; HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete...
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Dell + StatSoft = completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
• Where is the data?
  - Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
48 Software Group
49 Software Group
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth personal area network
• 3G/WiFi wide area network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mic/earphones
• Heads-up display
• Emotion/attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
[Diagram labels: Start; many parallel Map tasks; Reduce]
Map Reduce
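The Map and Reduce steps in this diagram can be sketched with the classic word count. This is a single-process illustration of the programming model only, not Google's or Hadoop's implementation; the function names and the sample input are assumptions.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for a split handled by an independent mapper.
    emitted = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())


if __name__ == "__main__":
    print(word_count(["big data big ideas", "data beats ideas"]))
```

Because each call to `map_phase` touches only its own split, the map work can be spread across as many machines as there are splits, which is what the many Map boxes on the slide represent.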
68 Software Group
[Diagram labels: HDFS; MAPPER (x12); SCAN/SORT; AGGREGATE; REDUCE; Client]
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
[Diagram labels, Schema on Write: Data; Analyse, Aggregate, Normalize, Cleanse, Code; Extract, Transform, Load; Data Warehouse; Utilize]
[Diagram labels, Schema on Read: Data; Load; Hadoop; Analyse, Cleanse, Code; Utilize]
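Schema on read means the raw data is loaded untouched and a structure is imposed only when a query runs. The sketch below illustrates that idea in Python; the log format, field names, and function names are all hypothetical.

```python
import csv

# Hypothetical raw log lines, stored exactly as they arrived at load time.
RAW_LINES = [
    "2014-04-08,dick,ebay,3",
    "2014-04-08,jane,google,5",
    "malformed line",
    "2014-04-09,dick,google,2",
]


def read_with_schema(raw_lines, fields):
    """Apply a schema only when the data is read; skip rows that do not fit."""
    for row in csv.reader(raw_lines):
        if len(row) != len(fields):
            continue  # cleansing happens at read time, not at load time
        yield dict(zip(fields, row))


def hits_per_user(raw_lines):
    """One possible query; another query could apply a different schema."""
    totals = {}
    for rec in read_with_schema(raw_lines, ["day", "user", "site", "hits"]):
        totals[rec["user"]] = totals.get(rec["user"], 0) + int(rec["hits"])
    return totals


if __name__ == "__main__":
    print(hits_per_user(RAW_LINES))
```

With schema on write, the malformed line would have to be dealt with before anything could be loaded; here it stays in storage and each reader decides how to treat it.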
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce / YARN
HBase (Database)
ZooKeeper (Locking)
SQOOP (RDBMS loader)
Hive (Query)
Pig (Scripting)
Flume (Log loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 1.0 Architecture
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); MAP REDUCE (DISTRIBUTED PROCESSING) with JOB TRACKER; HDFS (DISTRIBUTED STORAGE) with NAME NODE and SECONDARY NAME NODE; 15 worker nodes, each labelled DATA NODE and TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN
YARN: Yet Another Resource Negotiator
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; three NODE MANAGERs, each with a CONTAINER]
77 Software Group
Tez (Hindi for "fast")
[Diagram labels: HDFS; MAP and REDUCE stages shown as three separate jobs (Job 1, Job 2, Job 3); the same stages shown as a single job (Job 1); HDFS]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram labels, Oracle: Tables; Buffer Cache; Log Buffer; Redo; Datafiles; ASM; Disks]
[Diagram labels, HBase: Tables; MemStore; WA Log (write-ahead log); HFiles; HDFS; Disks]
79 Software Group
Hbase Data Model

Denormalized:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Normalized (relational):
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase (one wide, sparse row per name):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
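The wide-row layout on this slide can be mimicked with nested dictionaries: one entry per row key, holding only the columns that row actually has. This is a sketch of the data model only and does not use the HBase API; the function name `to_wide_rows` is an assumption, and the values are the ones on the slide.

```python
# (name, site, counter) triples as shown in the denormalized table.
ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot triples into one sparse row per name, one column per site."""
    wide = {}
    for name, site, counter in rows:
        wide.setdefault(name, {})[site] = counter
    return wide


if __name__ == "__main__":
    for name, columns in to_wide_rows(ROWS).items():
        print(name, columns)
```

Each row carries its own set of column names, so Jane's row simply has no Ebay column rather than an empty one.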
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAVA
RESULTS
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill, Aster
Greenplum (Pivotal HD), ParAccel, Hadapt
Oracle SQL Connector for Hadoop (External Table interface to HDFS)
![Page 50: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/50.jpg)
50 Software Group
Brain Control
51 Software Group
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
[Diagram labels: Google Applications; Map Reduce; BigTable; Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: from Start, work fans out to many parallel Map tasks whose output feeds a Reduce step]
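The fan-out-then-combine pattern in the Map Reduce diagram can be sketched in a few lines of Python. This is an in-memory illustration only, not how Hadoop or Google implement it; the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample input are assumptions made for the example.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)

def map_reduce(lines):
    # Each line could go to a separate mapper; here they simply run in a loop.
    emitted = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())

# Hypothetical input, standing in for blocks of a much larger file.
sample_lines = ["big data big revolution", "data revolution"]
word_counts = map_reduce(sample_lines)
print(word_counts)
```

Because each call to `map_phase` depends only on its own line, the map work can be spread across as many machines as there are input blocks, which is the point of the diagram.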
68 Software Group
Multi-stage Map-Reduce
[Diagram labels: HDFS; a first stage of MAPPER tasks (scan/sort); a second stage of MAPPER tasks (aggregate); REDUCE; Client]
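A multi-stage job is one where the output of one map-reduce pass becomes the input of the next. The sketch below chains two passes in memory; `run_job`, the mapper and reducer names, and the click data are all hypothetical and only illustrate the chaining idea.

```python
from collections import defaultdict

def run_job(records, mapper, reducer):
    """Run one map-reduce job in memory: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [reducer(key, values) for key, values in sorted(groups.items())]

# Stage 1: count page hits per (user, site) pair.
def hits_mapper(record):
    user, site = record
    yield (user, site), 1

def sum_reducer(key, values):
    return key, sum(values)

# Stage 2: take stage 1 output and keep each user's busiest site.
def busiest_mapper(record):
    (user, site), hits = record
    yield user, (hits, site)

def max_reducer(user, values):
    hits, site = max(values)
    return user, site, hits

clicks = [("dick", "ebay"), ("dick", "google"), ("dick", "google"),
          ("jane", "google"), ("jane", "facebook"), ("jane", "facebook")]
stage1 = run_job(clicks, hits_mapper, sum_reducer)
stage2 = run_job(stage1, busiest_mapper, max_reducer)
print(stage2)
```

In classic Map Reduce each stage would write its result to HDFS before the next stage reads it back, which is the overhead later engines try to avoid.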
69 Software Group
Schema on Read vs Schema on Write
[Diagram comparing two pipelines]
Schema on Write: Data, Extract / Transform / Load (Code, Cleanse, Normalize, Aggregate), Data Warehouse, Analyse, Utilize
Schema on Read: Data, Load, Hadoop, Code, Cleanse, Analyse, Utilize
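With schema on read, raw data is loaded untouched and structure is imposed only when a query runs. The following sketch assumes JSON-lines input and a made-up three-field schema (`user`, `site`, `hits`); both are illustrative choices, not anything the slides specify.

```python
import json

# Raw events are stored exactly as they arrived; nothing is validated on load.
raw_store = [
    '{"user": "dick", "site": "ebay", "hits": "3"}',
    '{"user": "jane", "site": "google"}',
    'not json at all',
]

def read_with_schema(raw_lines):
    """Apply the schema at read time: parse, coerce types, skip bad records."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed input is only discovered when it is read
        if not isinstance(record, dict) or "user" not in record:
            continue
        yield {
            "user": str(record["user"]),
            "site": str(record.get("site", "unknown")),
            "hits": int(record.get("hits", 0)),
        }

parsed_rows = list(read_with_schema(raw_store))
print(parsed_rows)
```

The trade-off is visible here: loading is trivial, but every reader has to carry the cleansing and typing logic that a schema-on-write warehouse would have applied once, up front.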
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16PB disk
• 64 TB of RAM
• 32000 Cores
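Dividing the cluster totals by the node count shows how ordinary each machine is. The arithmetic below assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB), which the slide does not state.

```python
# Cluster totals as quoted on the slide (decimal units assumed).
nodes = 4000
disk_tb = 16 * 1000      # 16 PB of disk
ram_gb = 64 * 1000       # 64 TB of RAM
cores = 32000

disk_per_node_tb = disk_tb / nodes
ram_per_node_gb = ram_gb / nodes
cores_per_node = cores / nodes

print(f"{disk_per_node_tb:.0f} TB disk, {ram_per_node_gb:.0f} GB RAM, "
      f"{cores_per_node:.0f} cores per node")
```

That works out to roughly 4 TB of disk, 16 GB of RAM and 8 cores per node: commodity servers rather than specialised hardware.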
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: a Hadoop client (Java, Pig, Hive) works with the Job Tracker for Map Reduce (distributed processing) and with the Name Node and Secondary Name Node for HDFS (distributed storage); each worker machine runs a Data Node and a Task Tracker]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram labels: Hadoop client (Java, Pig, Hive); Resource Manager; Node Managers; Containers; Application Master]
77 Software Group
Tez¹
¹ Hindi for “fast”
[Diagram: a chain of Map/Reduce jobs (Job 1, Job 2, Job 3) reading from and writing to HDFS, shown beside a single job (Job 1) covering the same Map and Reduce steps]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing Oracle and HBase storage layers]
Oracle: Table, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks
HBase: Table, MemStore, WA Log (write-ahead log), HFile, HDFS, Disks
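The HBase side of the comparison (write-ahead log, MemStore, HFile) describes a write path that logs each change, buffers it in memory, and periodically flushes the buffer to an immutable file. The toy class below illustrates that flow under heavy simplification; `TinyRegion` and its flush threshold are invented for the example, and real HBase adds compaction, log rolling and much more.

```python
class TinyRegion:
    """Toy model of the write path: log first, buffer in memory, flush to files."""

    def __init__(self, flush_threshold=2):
        self.wal = []          # write-ahead log: appended before a write is accepted
        self.memstore = {}     # in-memory buffer of recent writes
        self.hfiles = []       # immutable sorted files, newest last
        self.flush_threshold = flush_threshold

    def put(self, row_key, value):
        self.wal.append((row_key, value))
        self.memstore[row_key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the buffer as one sorted, read-only file and start a new buffer.
        self.hfiles.append(dict(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, row_key):
        if row_key in self.memstore:
            return self.memstore[row_key]
        for hfile in reversed(self.hfiles):   # newest file wins
            if row_key in hfile:
                return hfile[row_key]
        return None

region = TinyRegion(flush_threshold=2)
region.put("dick", 1)
region.put("jane", 2)     # second write triggers a flush
region.put("dick", 3)     # newer value stays in the memstore
print(region.get("dick"), region.get("jane"), len(region.hfiles))
```

The parallel with the Oracle column is loose but useful: the log plays the role of redo, the memory buffer the role of the buffer cache, and the flushed files the role of datafiles.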
79 Software Group
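Hbase Data Model

Flat table:

| Name | Site | Counter |
|---|---|---|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

Normalized relational tables:

| NameId | Name |
|---|---|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|---|---|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|---|---|---|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

HBase rows (each row carries its own set of columns):

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|---|---|---|---|---|---|---|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|---|---|---|---|---|---|
| 2 | Jane | 716426 | 643261 | | 856767 |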
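The move from the flat (Name, Site, Counter) rows to one wide row per name can be expressed as a simple pivot. The sketch below uses plain dictionaries rather than any HBase client API; the site names are written with the dots the transcript appears to have lost, which is an assumption.

```python
from collections import defaultdict

# (Name, Site, Counter) rows from the slide's flat table.
flat_rows = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]

def to_wide_rows(rows):
    """Pivot flat rows into one sparse row per name, with a column per site."""
    wide = defaultdict(dict)
    for name, site, counter in rows:
        wide[name][site] = counter   # a column exists only where there is data
    return dict(wide)

wide_rows = to_wide_rows(flat_rows)
for name, columns in wide_rows.items():
    print(name, columns)
```

Each row ends up with a different set of columns, so no space is spent on sites a given person never visited. That sparseness is what distinguishes the wide-row layout from the fixed-column relational tables.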
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME loads WebLogs into HDFS; SQOOP moves RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
[Stack diagram labels]
• Yarn, EC2
• Mesos – heterogeneous cluster manager
• Tachyon – in-memory file system
• Spark – memory-optimized distributed execution
• Spark Streaming
• MLbase / MLlib – Machine Learning
• Map Reduce
• Shark (SQL), Hive (SQL)
• BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
[Bar chart: Exadata vs Hadoop, $$/TB (hardware only). Exadata: $4911; Hadoop: $750]
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48GB RAM per node (864GB total)
  – 2x6 Core CPU per node (216 total)
  – 12x2TB HDD per node (216 spindles, 864 TB)
  – 40Gb/s Infiniband between nodes
  – 10Gb/s Ethernet to datacentre
• Competitive Pricing
www.oracle.com/us/bigdata/index.html
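The rack totals in parentheses can be checked by multiplying the per-node figures by the 18 nodes. The variable names below are just labels for the quoted numbers.

```python
# Per-node figures as quoted on the slide.
nodes = 18
ram_gb_per_node = 48
cores_per_node = 2 * 6        # two 6-core CPUs
disks_per_node = 12
disk_size_tb = 2

total_ram_gb = nodes * ram_gb_per_node
total_cores = nodes * cores_per_node
total_spindles = nodes * disks_per_node
raw_capacity_tb = total_spindles * disk_size_tb

print(total_ram_gb, total_cores, total_spindles, raw_capacity_tb)
```

RAM (864 GB), cores (216) and spindles (216) match the slide. Raw disk from 216 drives of 2 TB comes to 432 TB, so the quoted 864 TB presumably reflects a larger drive size or a different configuration; the slide does not say which.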
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through “Big Data analytics”
• Machine Learning: Programs that evolve with “experience”
• Collective Intelligence: Programs that use inputs from “crowds” to seem intelligent
• Predictive Analytics: Programs that extrapolate from existing data into the future
• Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Scatter plot with a fitted trend line; both axes run from about 0 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
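A trend line of the form f(x) = slope · x + intercept, like the one on the chart, is typically obtained by ordinary least squares. The sketch below fits such a line with plain Python; the sample points are hypothetical and are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical sample points lying exactly on y = x + 1.
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 21, 41, 61, 81, 101]
slope, intercept = fit_line(xs, ys)
prediction = slope * 120 + intercept   # extrapolate beyond the observed range
print(slope, intercept, prediction)
```

The final line is the "predictive" step: once the coefficients are known, the model is evaluated at an x value that has not been observed yet.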
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
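Grouping without predefined labels can be illustrated with k-means, one common clustering technique (the slides do not name a specific algorithm). The one-dimensional version below, its starting centroids and the basket-size data are all assumptions chosen to keep the example small.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Plain k-means on numbers: assign to nearest centroid, then re-average."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical basket sizes (items per purchase); no labels are supplied.
basket_sizes = [1, 2, 3, 20, 21, 22]
centres, groups = kmeans_1d(basket_sizes, centroids=[0, 10])
print(centres, groups)
```

Nothing told the algorithm that there are "small" and "large" baskets; the two groups emerge from the data alone, which is what separates clustering from classification.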
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Training Set; Validation Set; Model; Candidate Model; Validate; Production Model; New Data (New Business, Existing Business); Prediction]
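The supervised learning flow (train on one set, validate on another, then score new data) can be sketched with a deliberately simple one-feature classifier. The threshold rule, the feature ("number of suspicious words") and the labelled data are all hypothetical; real spam or churn models use far richer features and algorithms.

```python
def train(training_set):
    """Learn one threshold: the midpoint between the two class means."""
    spam = [x for x, label in training_set if label == "spam"]
    ham = [x for x, label in training_set if label == "ham"]
    return (sum(spam) / len(spam) + sum(ham) / len(ham)) / 2

def predict(threshold, x):
    return "spam" if x > threshold else "ham"

def accuracy(threshold, labelled):
    hits = sum(predict(threshold, x) == label for x, label in labelled)
    return hits / len(labelled)

# Hypothetical labelled data: (number of suspicious words, label).
labelled = [(0, "ham"), (1, "ham"), (2, "ham"), (8, "spam"), (9, "spam"),
            (10, "spam"), (1, "ham"), (9, "spam")]
training_set, validation_set = labelled[:6], labelled[6:]

candidate = train(training_set)                 # candidate model
score = accuracy(candidate, validation_set)     # validate on held-out data
production_label = predict(candidate, 7)        # score new data once accepted
print(candidate, score, production_label)
```

Holding back a validation set matters because a model is only useful if it performs on data it was not trained on.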
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics / Data Science
• Search Optimization
• Recommendation Systems
• Security
  – Vulnerability
  – Penetration Detection
• Fraud Detection
• CRM
  – Churn
  – Defaults
• Medical
  – Risk analysis
  – Diagnosis
  – Prognosis
• Game optimization
• Advertising
  – Targeting
  – Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell’s offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: STATISTICA, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
Dell + StatSoft = completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
![Page 51: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/51.jpg)
![Page 52: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/52.jpg)
52 Software Group
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Key components to build end-to-end BI/Analytics solutions
Dell's offering was not complete…
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Key components to build end-to-end BI/Analytics solutions
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 53: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/53.jpg)
53 Software Group
Muze
54 Software Group
55 Software Group
56 Software Group
The instrumented human
• Bluetooth Personal Area Network
• 3G/WiFi Wide Area Network
• GPS
• Storage
• Pulse, temp monitor
• Silent alarms
• Pedometer, sleep monitoring
• Compass
• Camera
• Mike/earphones
• Heads up display
• Emotion/Attention monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
[Diagram: Start, fanning out to many parallel Map tasks, feeding a single Reduce]
Map Reduce
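The fan-out of many Map tasks into a Reduce step can be sketched with the classic word count. This is a single-process illustration in plain Python, not Hadoop or Google code; the function names and the sample input are assumptions.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


splits = ["big data big ideas", "data beats ideas", "big data"]
# On a cluster each split would go to its own mapper; here they run in a loop.
emitted = [pair for doc in splits for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())
print(counts)
```

Because each call to `map_phase` depends only on its own split, the map work can be spread across as many machines as there are splits, which is what the wall of Map boxes in the diagram conveys.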
68 Software Group
[Diagram labels: HDFS; MAPPER ×8; SCAN, SORT; MAPPER ×4; AGGREGATE; REDUCE; Client]
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
[Diagram: two pipelines]
Schema on Write: Data; Extract; Transform; Load; Data Warehouse; Analyse, Aggregate, Normalize, Cleanse, Code; Utilize
Schema on Read: Data; Load; Hadoop; Analyse, Cleanse, Code; Utilize
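The difference between the two pipelines is where structure gets imposed: before storage (schema on write) or at query time (schema on read). A minimal sketch, assuming hypothetical JSON click records and made-up function names:

```python
import json

raw_events = [
    '{"user": "dick", "site": "Ebay", "hits": 3}',
    '{"user": "jane", "site": "Google", "hits": 5, "referrer": "email"}',
    '{"user": "jane", "hits": "n/a"}',
]


def load_schema_on_write(lines):
    """Schema on write: validate and shape every record before storing it."""
    table = []
    for line in lines:
        record = json.loads(line)
        if isinstance(record.get("hits"), int) and "site" in record:
            # Only the columns declared up front survive the load.
            table.append((record["user"], record["site"], record["hits"]))
    return table


def load_schema_on_read(lines):
    """Schema on read: store the raw lines untouched."""
    return list(lines)


def hits_by_user(stored_lines):
    """Structure is applied only now, by the query that needs it."""
    totals = {}
    for line in stored_lines:
        record = json.loads(line)
        hits = record.get("hits")
        if isinstance(hits, int):
            totals[record["user"]] = totals.get(record["user"], 0) + hits
    return totals


warehouse = load_schema_on_write(raw_events)
lake = load_schema_on_read(raw_events)
print(len(warehouse), len(lake))
print(hits_by_user(lake))
```

The schema-on-write load discards the malformed record and the extra `referrer` field for good, while the schema-on-read store keeps everything and leaves each query to decide what to ignore.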
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16PB disk
• 64 TB of RAM
• 32000 Cores
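Dividing the cluster totals by the node count shows how modest each machine is. The short calculation below assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an evenly built cluster, neither of which the slide states.

```python
nodes = 4000
disk_pb = 16
ram_tb = 64
cores = 32000

# Decimal units assumed: 1 PB = 1000 TB, 1 TB = 1000 GB.
disk_per_node_tb = disk_pb * 1000 / nodes
ram_per_node_gb = ram_tb * 1000 / nodes
cores_per_node = cores / nodes

print(disk_per_node_tb, ram_per_node_gb, cores_per_node)  # prints 4.0 16.0 8.0
```

Under those assumptions each node carries about 4 TB of disk, 16 GB of RAM and 8 cores, which is commodity-server territory.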
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce / YARN
HBase (Database)
ZooKeeper (Locking)
SQOOP (RDBMS loader)
Hive (Query)
Pig (Scripting)
Flume (Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 1.0 Architecture
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); MAP REDUCE (DISTRIBUTED PROCESSING) with a JOB TRACKER; HDFS (DISTRIBUTED STORAGE) with a NAME NODE and a SECONDARY NAME NODE; many worker nodes, each labelled DATA NODE and TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; three NODE MANAGERs, each with a CONTAINER]
77 Software Group
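Tez (Hindi for "fast")
[Diagram: a chain of three MapReduce jobs (Job 1, Job 2, Job 3) with MAP and REDUCE stages reading from and writing to HDFS, shown alongside the same MAP and REDUCE stages run as a single job (Job 1)]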
78 Software Group
HBase
A Real time database built on Hadoop
[Diagram: two storage stacks compared]
Relational database stack: Table, Table; Buffer Cache; Log Buffer; Redo; Datafiles; ASM; Disks
HBase stack: Table, Table; MemStore; WA Log; HFile, HFile; HDFS; Disks
79 Software Group
Denormalized source data:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Relational (normalized) model:
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase model (one wide row per name, with only the columns that row needs):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767

HBase Data Model
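The wide, sparse rows on this slide behave much like a dictionary of dictionaries: each row key maps to whatever columns that row happens to have. The sketch below uses plain Python dictionaries to show the idea; it is not the HBase client API, and the `put` and `get` helpers are hypothetical names.

```python
# Row key -> {column qualifier -> value}; absent columns simply are not stored.
hits = {}


def put(table, row_key, column, value):
    """Write one cell; new columns can appear on any row at any time."""
    table.setdefault(row_key, {})[column] = value


def get(table, row_key, column, default=None):
    """Read one cell; a missing column is not an error."""
    return table.get(row_key, {}).get(column, default)


observations = [
    ("Dick", "Ebay", 507018), ("Dick", "Google", 690414),
    ("Jane", "Google", 716426), ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261), ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]
for name, site, counter in observations:
    put(hits, name, site, counter)

print(sorted(hits["Dick"]))
print(get(hits, "Jane", "Ebay"))
```

Dick's row ends up with four columns and Jane's with three, and asking for a column Jane never had returns the default rather than requiring a schema change.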
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL; JAVA; RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill
Aster
Greenplum (Pivotal HD)
Paraccel
Hadapt
Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: WebLogs flow through FLUME into HDFS; RDBMS tables (CUSTOMERS, PRODUCTS) flow through SQOOP into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos – heterogeneous cluster manager
Tachyon – in-memory file system
Spark – memory-optimized distributed execution
Spark Streaming
MLbase, MLlib – Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
[Bar chart: Exadata vs Hadoop $$/TB (hardware only); two bars, Exadata and Hadoop, on an axis from $0 to $6000; data labels $4911 and $750]
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48GB RAM per node (864GB total)
  – 2x6 Core CPU per node (216 total)
  – 12x2TB HDD per node (216 spindles, 432 TB)
  – 40Gb/s Infiniband between nodes
  – 10Gb/s Ethernet to datacentre
• Competitive Pricing
www.oracle.com/us/bigdata/index.html
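The rack-level totals follow directly from the per-node figures. A quick check, using only the numbers quoted on the slide (variable names are illustrative):

```python
servers = 18
ram_gb_per_node = 48
cores_per_node = 2 * 6  # two 6-core CPUs
disks_per_node = 12
tb_per_disk = 2

total_ram_gb = servers * ram_gb_per_node
total_cores = servers * cores_per_node
total_spindles = servers * disks_per_node
total_raw_tb = total_spindles * tb_per_disk

print("RAM (GB):", total_ram_gb)        # 864
print("Cores:", total_cores)            # 216
print("Spindles:", total_spindles)      # 216
print("Raw disk (TB):", total_raw_tb)   # 432
```

The disk figure is raw capacity; usable space is lower once HDFS replication is taken into account.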
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
Machine Learning: programs that evolve with "experience"
Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
Predictive Analytics: programs that extrapolate from existing data into the future
Big Data Analytics, AKA Data Science
![Page 55: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/55.jpg)
55 Software Group
56 Software Group
The instrumented human
bull Bluetooth Personal Area Network
bull 3GWiFi Wide Area Network
bull GPSbull Storage
bull Pulse temp monitor
bull Silent alarmsbull Pedometer sleep
monitoring
bull Compass bull Camerabull Mikeearphonesbull Heads up displaybull EmotionAttention
monitor
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 1.0 Architecture
[Diagram]
- HADOOP CLIENT (JAVA, PIG, HIVE)
- MAP REDUCE (DISTRIBUTED PROCESSING): JOB TRACKER
- HDFS (DISTRIBUTED STORAGE): NAME NODE, SECONDARY NAME NODE
- Many nodes, each running a DATA NODE and a TASK TRACKER
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; three NODE MANAGERs, each with a CONTAINER]
77 Software Group
Tez (Hindi for "fast")
[Diagram: a chain of MAP and REDUCE steps over HDFS run as three jobs (Job 1, Job 2, Job 3), alongside the same chain run as a single job (Job 1)]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing two storage stacks]
Oracle: Tables; Buffer Cache; Log Buffer; Redo; Datafiles; ASM; Disks
HBase: Tables; MemStore; WA Log; HFiles; HDFS; Disks
79 Software Group
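HBase Data Model

Denormalized table:

| Name | Site | Counter |
|---|---|---|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

Normalized tables:

| NameId | Name |
|---|---|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|---|---|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|---|---|---|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

HBase rows (one sparse row per name, one column per site):

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|---|---|---|---|---|---|---|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|---|---|---|---|---|---|
| 2 | Jane | 716426 | 643261 | | 856767 |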
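The pivot from relational rows to sparse, wide rows shown on this slide can be sketched with nested dictionaries. The data is taken from the slide; the function name `pivot_to_wide_rows` is hypothetical, and real HBase details such as column families and cell versions are deliberately left out.

```python
from collections import defaultdict

# Relational-style rows from the slide: (name, site, counter)
hits = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]

def pivot_to_wide_rows(rows):
    """Build one sparse row per name; each site becomes a column
    that exists only when that name has a value for it."""
    wide = defaultdict(dict)
    for name, site, counter in rows:
        wide[name][site] = counter
    return dict(wide)

wide_rows = pivot_to_wide_rows(hits)
for row_key, columns in wide_rows.items():
    print(row_key, columns)
```

Each row carries only the columns it actually uses, so Jane's row has no Ebay entry at all rather than a NULL.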
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
- Cloudera Impala
- MapR Drill
- Aster
- Greenplum (Pivotal HD)
- ParAccel
- Hadapt
- Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: FLUME moves WebLogs into HDFS; SQOOP moves RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
- Yarn, EC2
- Mesos – heterogeneous cluster manager
- Tachyon – in-memory file system
- Spark – memory-optimized distributed execution
- Spark Streaming
- MLbase / MLlib – Machine Learning
- Map Reduce
- Shark (SQL), Hive (SQL)
- BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
- Database servers: 64 cores, 576 GB RAM
- Storage servers: 112 cores; 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop $/TB (Hardware only)
- Exadata: $4911
- Hadoop: $750
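The two per-terabyte figures on this chart can be turned into a ratio with a line of arithmetic. The sketch below uses only the two numbers from the slide; the variable names are made up for illustration.

```python
exadata_per_tb = 4911   # USD per TB, hardware only (from the slide)
hadoop_per_tb = 750     # USD per TB, hardware only (from the slide)

ratio = exadata_per_tb / hadoop_per_tb
difference_per_tb = exadata_per_tb - hadoop_per_tb

print(f"Exadata costs about {ratio:.1f}x as much per TB "
      f"(${difference_per_tb} more per TB)")
```

On these figures the hardware cost per terabyte differs by a factor of about 6.5.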
93 Software Group
Oracle Big Data Appliance
- 18 Sun X4270 M2 servers
  - 48 GB RAM per node (864 GB total)
  - 2x6 core CPU per node (216 total)
  - 12x2 TB HDD per node (216 spindles, 864 TB)
  - 40 Gb/s InfiniBand between nodes
  - 10 Gb/s Ethernet to datacentre
- Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
- Cloudera Enterprise
- Oracle Enterprise R
- Oracle NoSQL
- Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
- Machine Learning: programs that evolve with "experience"
- Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
- Predictive Analytics: programs that extrapolate from existing data into the future
- Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: linear trend line, f(x) = 0.971521231456065 x + 0.71906459527154; x axis 0 to 120, y axis -20 to 120]
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistic regression
- CART
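The trend line on this slide has the form f(x) = a·x + b. As a minimal sketch of how such a line can be obtained, here is an ordinary least-squares fit in plain Python. The data points and the names `fit_line` and `predict` are invented for illustration; the chart's own data is not available in the transcript.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up observations that lie exactly on y = x + 1
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 21, 41, 61, 81, 101]

slope, intercept = fit_line(xs, ys)

def predict(x):
    """Extrapolate from the fitted line to an unseen x."""
    return slope * x + intercept

print(slope, intercept, predict(120))
```

Predicting at x = 120 is the "extrapolate from existing data into the future" step: the model is applied beyond the range it was fitted on.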
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
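To make the idea of classifying new data concrete, here is a small nearest-centroid classifier for a toy spam example. This is one simple technique chosen for brevity, not a method named in the talk; the two features (link count, exclamation-mark count), the training messages and the function names are all assumptions.

```python
def centroid(points):
    """Average position of a list of equal-length feature tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def train(labelled):
    """labelled: list of (features, label). Returns {label: centroid}."""
    by_label = {}
    for features, label in labelled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}

def classify(model, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], features))

# Invented features per message: (number of links, number of exclamation marks)
training = [
    ((8, 6), "spam"), ((9, 4), "spam"), ((7, 5), "spam"),
    ((1, 0), "ham"), ((0, 1), "ham"), ((2, 1), "ham"),
]

model = train(training)
print(classify(model, (10, 7)), classify(model, (1, 1)))
```

The labels come from the training data, which is what separates classification from clustering.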
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
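Grouping without predefined labels can be sketched with k-means, one common clustering algorithm (the slide does not name a specific one). The points below are invented two-feature summaries of shopping baskets, and the naive "first k points" initialisation is an assumption made to keep the example deterministic.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on tuples; returns (centroids, clusters)."""
    centroids = list(points[:k])  # naive deterministic start: the first k points
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Invented basket summaries: (items in basket, spend in tens of dollars)
baskets = [(1, 1), (9, 8), (2, 1), (8, 9), (1, 2), (9, 9)]

centroids, clusters = kmeans(baskets, k=2)
print(clusters)
```

No labels are supplied; the two groups emerge purely from distances between the points.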
121 Software Group
Supervised Machine Learning
[Diagram: Raw Data → Clean → Training Set and Validation Set → Model → Candidate Model → Validate → Production Model; New Data (New Business, Existing Business) → Production Model → Prediction]
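The workflow on this slide (clean, split into training and validation sets, build a candidate model, validate it, then use it on new data) can be sketched end to end. Everything here is a toy under stated assumptions: the one-feature threshold model, the 0.9 acceptance bar, the data and all function names are invented for illustration.

```python
def clean(raw):
    """Drop records with a missing feature or label."""
    return [(x, y) for x, y in raw if x is not None and y is not None]

def split(data, train_fraction=0.75):
    """Hold back the tail of the data as a validation set."""
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

def train_threshold(training_set):
    """Candidate model: a threshold halfway between the two class means."""
    pos = [x for x, y in training_set if y == 1]
    neg = [x for x, y in training_set if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(threshold, x):
    return 1 if x >= threshold else 0

def accuracy(threshold, validation_set):
    hits = sum(predict(threshold, x) == y for x, y in validation_set)
    return hits / len(validation_set)

raw = [(2, 0), (9, 1), (None, 1), (3, 0), (8, 1), (1, 0), (7, 1), (4, 0), (10, 1)]

data = clean(raw)
training_set, validation_set = split(data)
candidate = train_threshold(training_set)
score = accuracy(candidate, validation_set)

# Promote the candidate to production only if it validates well enough.
production_model = candidate if score >= 0.9 else None
if production_model is not None:
    print(predict(production_model, 6))
```

The validation set is never used for fitting, so its score is an honest check on the candidate before it sees new business data.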
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
- Search Optimization
- Recommendation Systems
- Security
  - Vulnerability
  - Penetration Detection
- Fraud Detection
- CRM
  - Churn
  - Defaults
- Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
- Game optimization
- Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram: Redo-logs → Change Data Capture → JMS Queue → Hadoop Poster, with three outputs: Batched HDFS File Copy, Audit Change Data, HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Products shown: TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Products shown: STATISTICA, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
- How could data and algorithms transform your business?
- What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
- Where is the data?
  - Start collecting now
139 Software Group
For your career
- Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
- Lots of opportunities for those with Math & Statistics
  - Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
- Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
![Page 56: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/56.jpg)
![Page 57: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/57.jpg)
57 Software Group
The instrumented world
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207 Surviving and thriving in the big data revolution
- 207 Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data?
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail?
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming?
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
![Page 58: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/58.jpg)
58 Software Group
All of which accelerates what we call Big Data
59 Software Group
Big Database technologies
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google Software Architecture
[Diagram: Google Applications run on Map Reduce and BigTable, which sit on the Google File System (GFS)]
67 Software Group
Map Reduce
[Diagram: a Start step fans out to many parallel Map tasks, whose output feeds a Reduce step]
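The Map Reduce pattern on this slide can be sketched in a few lines of plain Python. This is a single-process illustration only, not Hadoop's or Google's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document stands in for an input split handled by one mapper.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    splits = ["big data big ideas", "data beats ideas"]
    print(word_count(splits))
```

In a real cluster the map calls run in parallel on different nodes and the shuffle moves data across the network; here they are ordinary function calls so the data flow is easy to follow.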
68 Software Group
Multi-stage Map-Reduce
[Diagram labels: HDFS; MAPPER (eight); SCAN, SORT; MAPPER (four); AGGREGATE; REDUCE; Client]
69 Software Group
Schema on Read vs Schema on Write
[Diagram labels, Schema on Write: Data; Extract, Transform, Load; Code (Cleanse, Normalize, Aggregate); Data Warehouse; Analyse; Utilize]
[Diagram labels, Schema on Read: Data; Load; Hadoop; Code (Cleanse, Analyse); Utilize]
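The difference between the two approaches can be sketched as follows. This is a toy illustration under stated assumptions: the three-field log format, the sample data in `RAW_LOG`, and the function names are all hypothetical and not taken from the slides.

```python
import csv
import io

# Hypothetical raw input: user, date, amount. Two rows are deliberately dirty.
RAW_LOG = "alice,2014-04-07,42\nbob,yesterday,17\ncarol,2014-04-08,oops\n"


def schema_on_write(raw):
    """Validate and type every row at load time; bad rows never reach the table."""
    table, rejected = [], []
    for user, day, amount in csv.reader(io.StringIO(raw)):
        if amount.isdigit() and len(day.split("-")) == 3:
            table.append({"user": user, "day": day, "amount": int(amount)})
        else:
            rejected.append((user, day, amount))
    return table, rejected


def schema_on_read(raw, parse):
    """Store the raw lines untouched; apply a schema only when querying."""
    stored = raw.splitlines()  # the "load" step is just a copy
    return [parse(line) for line in stored]


def users_only(line):
    # A query that needs only the first field can ignore bad dates and amounts.
    return line.split(",")[0]


if __name__ == "__main__":
    print(schema_on_write(RAW_LOG))
    print(schema_on_read(RAW_LOG, users_only))
```

With schema on write, the cleansing cost is paid once, up front, and rejected rows are lost to later analysis. With schema on read, loading is cheap and every query decides how much structure it needs.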
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); MAP REDUCE (DISTRIBUTED PROCESSING) with JOB TRACKER; HDFS (DISTRIBUTED STORAGE) with NAME NODE and SECONDARY NAME NODE; many worker nodes, each labelled DATA NODE and TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; APPLICATION MASTER; three NODE MANAGERs, each with a CONTAINER]
77 Software Group
Tez (Hindi for "fast")
[Diagram labels: HDFS; Job 1, Job 2 and Job 3, each made of MAP and REDUCE steps; HDFS; Job 1]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram labels, Oracle side: Disks; ASM; Datafiles; Buffer Cache; Log Buffer; Redo; Tables]
[Diagram labels, HBase side: Disks; HDFS; HFile; MemStore; WA Log; Tables]
79 Software Group
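HBase Data Model

Denormalized rows:

| Name | Site | Counter |
|---|---|---|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

Normalized relational tables:

| NameId | Name |
|---|---|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|---|---|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|---|---|---|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

HBase rows (one row per name, one column per site):

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|---|---|---|---|---|---|---|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|---|---|---|---|---|---|
| 2 | Jane | 716426 | 643261 | | 856767 |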
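The wide, sparse row layout on this slide can be mimicked with nested dictionaries. The sketch below uses the slide's sample data and is only an in-memory analogy: it does not use any HBase client library, and `to_wide_rows` is a name invented for the example.

```python
from collections import defaultdict

# (name, site, counter) rows, taken from the slide's sample data.
VISITS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(visits):
    """Pivot (name, site, counter) tuples into one sparse row per name."""
    rows = defaultdict(dict)
    for name, site, counter in visits:
        rows[name][site] = counter  # the site name itself becomes the column
    return dict(rows)


if __name__ == "__main__":
    for row_key, columns in to_wide_rows(VISITS).items():
        print(row_key, columns)
```

Each row carries only the columns it actually has, so Jane's row has no Ebay column at all rather than a NULL placeholder.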
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL; JAVA; RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• ParAccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram labels: Web Logs; FLUME; HDFS; RDBMS (CUSTOMERS, PRODUCTS); SQOOP]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
• Yarn, EC2
• Mesos - heterogeneous cluster manager
• Tachyon - in-memory file system
• Spark - memory-optimized distributed execution
• Spark Streaming
• MLbase / MLlib - Machine Learning
• Map Reduce
• Shark (SQL), Hive (SQL)
• BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop $/TB (hardware only)
• Exadata: $4911
• Hadoop: $750
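To put the two per-terabyte figures side by side, a short calculation helps. The sketch below uses only the two hardware-only numbers from the slide; the function names and the 100 TB sizing are assumptions chosen for illustration.

```python
# USD per TB, hardware only, as quoted on the slide.
COST_PER_TB = {"Exadata": 4911, "Hadoop": 750}


def hardware_cost(platform, terabytes):
    """Hardware-only cost of storing the given number of terabytes."""
    return COST_PER_TB[platform] * terabytes


def cost_ratio(a="Exadata", b="Hadoop"):
    """How many times more expensive platform a is than platform b, per TB."""
    return COST_PER_TB[a] / COST_PER_TB[b]


if __name__ == "__main__":
    for platform in COST_PER_TB:
        print(platform, hardware_cost(platform, 100))
    print(round(cost_ratio(), 2))
```

On these figures the per-terabyte hardware cost differs by a factor of roughly 6.5. Software licences, staffing, and performance are outside this comparison.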
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  - 48 GB RAM per node (864 GB total)
  - 2x6 core CPU per node (216 total)
  - 12x2 TB HDD per node (216 spindles, 864 TB)
  - 40 Gb/s InfiniBand between nodes
  - 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
• Machine Learning: programs that evolve with "experience"
• Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
• Predictive Analytics: programs that extrapolate from existing data into the future
Big Data Analytics, AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: data points with a fitted line; x axis 0 to 120, y axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
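A fitted line of the form f(x) = slope · x + intercept, like the one on this slide, can be obtained with ordinary least squares. The sketch below is a minimal pure-Python version; the function names and the sample points are assumptions for illustration, not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Extrapolate from the fitted line to a new x value."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Hypothetical observations lying exactly on y = 2x + 1.
    model = fit_line([0, 10, 20, 30], [1, 21, 41, 61])
    print(model, predict(model, 40))
```

Real data is noisy, so the fitted coefficients will rarely be round numbers; the long decimals on the slide are typical of a spreadsheet trend line.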
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
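One very simple way to build such a model is a nearest-centroid classifier: average the labelled examples for each class, then assign new data to the closest average. The sketch below assumes two made-up numeric features and hypothetical "stay"/"churn" labels; it is an illustration of the idea, not a method named in the talk, and it needs Python 3.8 or later for `math.dist`.

```python
import math


def train_centroids(examples):
    """examples: list of (features, label) pairs; returns {label: centroid}."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Assign new data to the label whose centroid is nearest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


if __name__ == "__main__":
    # Hypothetical features: (support calls, months since last purchase).
    labelled = [([1, 1], "stay"), ([2, 1], "stay"), ([8, 9], "churn"), ([9, 8], "churn")]
    model = train_centroids(labelled)
    print(classify(model, [7, 9]))
```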
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
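Grouping without predefined labels can be illustrated with k-means, one common clustering technique (the talk does not name a specific algorithm). The sketch below is a tiny one-dimensional version; the spend figures and starting centers are hypothetical.

```python
def k_means(points, centers, rounds=10):
    """Tiny 1-D k-means: repeatedly assign points to the nearest center, then re-center."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


if __name__ == "__main__":
    # Hypothetical spend per customer; no labels are supplied.
    spend = [1, 2, 3, 20, 21, 22]
    print(k_means(spend, centers=[0, 10]))
```

The algorithm is never told which customers are "small" or "large" spenders; the two groups emerge from the data alone.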
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Training Set; Validation Set; Model; Validate; Candidate Model; Production Model; New Data; New Business; Existing Business; Prediction]
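The workflow named in this diagram (clean the raw data, split it into training and validation sets, fit a candidate model, validate it, then predict on new data) can be sketched end to end. Everything specific here is an assumption for illustration: the one-feature threshold model, the every-fourth-row holdout, and the synthetic data.

```python
def clean(raw):
    """Drop rows with a missing feature value."""
    return [(x, label) for x, label in raw if x is not None]


def split(rows, validation_every=4):
    """Hold out every Nth cleaned row as the validation set."""
    training = [r for i, r in enumerate(rows) if i % validation_every != 0]
    validation = [r for i, r in enumerate(rows) if i % validation_every == 0]
    return training, validation


def train(training):
    """Candidate model: a threshold halfway between the two class means."""
    pos = [x for x, label in training if label]
    neg = [x for x, label in training if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict(threshold, x):
    return x > threshold


def accuracy(threshold, rows):
    """Share of held-out rows the candidate model gets right."""
    hits = sum(predict(threshold, x) == label for x, label in rows)
    return hits / len(rows)


if __name__ == "__main__":
    # Synthetic existing business: the label is True when the feature exceeds 50.
    raw = [(x, x > 50) for x in range(0, 100, 5)] + [(None, True)]
    training, validation = split(clean(raw))
    candidate = train(training)
    print(candidate, accuracy(candidate, validation))
    print(predict(candidate, 90))  # prediction for new business
```

Validating on rows the model never saw during training is what separates a candidate model from one that is ready for production.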
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics / Data Science
• Search Optimization
• Recommendation Systems
• Security
  - Vulnerability
  - Penetration Detection
• Fraud Detection
• CRM
  - Churn
  - Defaults
• Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
• Game optimization
• Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
![Page 60: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/60.jpg)
60 Software Group
Pioneers of Big Data
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: x axis 0 to 120, y axis -20 to 120, with trend line f(x) = 0.971521231456065 x + 0.71906459527154]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
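A trend line of the form f(x) = slope * x + intercept, like the one on this slide, can be produced by ordinary least squares. The sketch below is a minimal pure-Python version; the function names and the sample data are illustrative assumptions, not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


# Hypothetical observations that lie exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept, predict(slope, intercept, 10))
```

The other techniques in the list (curve fitting, multivariate models, time series, logistic regression, CART) follow the same pattern of fitting on known data and then predicting for new inputs.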
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
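As one possible illustration of classification, the sketch below uses a nearest-centroid classifier: it averages the labelled examples for each class and assigns new data to the closest average. This is a deliberately simple stand-in, not the method used by any product named in the talk, and the features (link count, exclamation marks) and sample messages are hypothetical.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """Average the feature vectors of each label. samples: [(features, label)]."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(centroids, features):
    """Assign new data to the label whose centroid is nearest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical training data: (number of links, number of exclamation marks).
training = [((8, 5), "spam"), ((7, 6), "spam"), ((1, 0), "ham"), ((0, 1), "ham")]
model = train_centroids(training)
print(classify(model, (6, 4)), classify(model, (1, 1)))
```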
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
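To show what grouping without a pre-existing scheme looks like, here is a minimal one-dimensional k-means sketch. K-means is only one of several clustering techniques and is chosen here for brevity; the basket values are hypothetical, and the starting centres are picked deterministically from the sorted data rather than at random.

```python
def assign(values, centres):
    """Put each value into the cluster of its nearest centre."""
    clusters = [[] for _ in centres]
    for value in values:
        nearest = min(range(len(centres)), key=lambda i: abs(value - centres[i]))
        clusters[nearest].append(value)
    return clusters


def k_means_1d(values, k, iterations=20):
    """Group numbers into k clusters; no labels are supplied in advance."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    step = len(ordered) / k
    # Deterministic start: evenly spaced picks from the sorted values.
    centres = [ordered[int(i * step)] for i in range(k)]
    for _ in range(iterations):
        clusters = assign(values, centres)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        updated = [sum(c) / len(c) if c else centres[i] for i, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    return centres, assign(values, centres)


# Hypothetical basket totals: small top-up shops and large weekly shops.
centres, clusters = k_means_1d([5, 6, 7, 50, 52, 55], k=2)
print(centres, clusters)
```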
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data, Clean, Training Set, Validation Set, Model, Validate, Candidate Model, Production Model, New Data, Prediction, New Business, Existing Business]
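The workflow in this diagram can be sketched as: split cleaned data into a training set and a validation set, fit a candidate model on the first, check it against the second, and only then use it to predict for new data. The code below is a minimal illustration under those assumptions; the least-squares line, the function names, the 25% validation fraction and the synthetic data are all hypothetical choices.

```python
import random


def split(rows, validation_fraction=0.25, seed=0):
    """Shuffle the cleaned rows and hold back a validation set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def train(training_set):
    """Candidate model: a least-squares line through (x, y) pairs."""
    n = len(training_set)
    mean_x = sum(x for x, _ in training_set) / n
    mean_y = sum(y for _, y in training_set) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in training_set)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in training_set) / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    slope, intercept = model
    return slope * x + intercept


def validate(model, validation_set):
    """Mean absolute error of the candidate model on unseen rows."""
    return sum(abs(predict(model, x) - y) for x, y in validation_set) / len(validation_set)


# Synthetic "existing business" data lying on y = 3x + 2.
rows = [(x, 3 * x + 2) for x in range(20)]
training_set, validation_set = split(rows)
candidate = train(training_set)
if validate(candidate, validation_set) < 0.5:  # hypothetical acceptance threshold
    production_model = candidate
    print(predict(production_model, 100))  # prediction for new data
```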
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
Redo-logs
Change Data Capture
JMS Queue
Hadoop Poster
Batched HDFS File Copy
Audit Change Data
HBase real-time replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell’s offering was not complete…
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
To address the demands facing mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Key components to build end-to-end BI/Analytics solutions
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
– Mobility
– Cloud
– Hadoop
– Big Data Analytics
• Where is the data?
– Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
– Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
– Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
– SQOOP
– Hive
– Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It’s not enough to lay out products on tables
- There’s a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through “Big Data analytics”
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell’s offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 61: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/61.jpg)
61 Software Group
62 Software Group
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
[Diagram: Start, fanning out to many parallel Map tasks whose outputs feed a Reduce step]
Map Reduce
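The map and reduce pattern can be sketched in a few lines with the classic word-count example. This is a single-process illustration of the idea, not Hadoop's or Google's API: the function names are hypothetical, and a real framework would run the map tasks in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run every map task, shuffle the output, then reduce each key."""
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


print(map_reduce(["big data", "Big deal"]))
```

Because each map call depends only on its own input split, the map tasks can be spread across as many nodes as there are splits.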
68 Software Group
[Diagram: HDFS feeding many MAPPER tasks arranged in stages labelled SCAN, SORT, AGGREGATE and REDUCE, with results returned to the Client]
Multi-stage Map-Reduce
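In a multi-stage job, the output of one map-reduce pass becomes the input of the next. The sketch below chains two stages in a single process; `run_stage`, the mapper and reducer names, and the log lines are hypothetical and only illustrate the chaining.

```python
from collections import defaultdict


def run_stage(records, mapper, reducer):
    """One map-reduce pass: map every record, group by key, reduce each group."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return [reducer(key, values) for key, values in grouped.items()]


# Stage 1: count hits per site from "user site" log lines.
def hits_mapper(line):
    _user, site = line.split()
    yield site, 1


def sum_reducer(site, ones):
    return site, sum(ones)


# Stage 2: regroup the stage 1 output so sites are listed by hit count.
def by_count_mapper(pair):
    site, hits = pair
    yield hits, site


def collect_reducer(hits, sites):
    return hits, sorted(sites)


logs = ["dick ebay", "jane google", "dick google", "jane facebook"]
stage1 = run_stage(logs, hits_mapper, sum_reducer)
stage2 = run_stage(stage1, by_count_mapper, collect_reducer)
print(stage1)
print(stage2)
```

In classic Hadoop each stage writes its results to HDFS before the next stage reads them, which is one reason long chains of jobs are slow.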
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
Code
Extract
Load Transform Data Warehouse
Data Load
Hadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
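The difference between the two approaches is where the structure gets applied. A minimal sketch, assuming a simple comma-separated record format invented for this example: with schema on write, rows are parsed and rejected at load time; with schema on read, raw lines are stored untouched and parsed only when a query runs.

```python
# Hypothetical raw input; the third line does not fit the schema.
RAW_LINES = ["dick,ebay,507018", "jane,google,716426", "malformed line"]


def parse(line):
    """Apply the schema: name, site, integer counter. Raises ValueError on bad input."""
    name, site, counter = line.split(",")
    return {"name": name, "site": site, "counter": int(counter)}


def load_schema_on_write(lines):
    """Warehouse style: enforce the schema while loading, reject what does not fit."""
    table, rejected = [], []
    for line in lines:
        try:
            table.append(parse(line))
        except ValueError:
            rejected.append(line)
    return table, rejected


def load_schema_on_read(lines):
    """Hadoop style: store the raw data as it arrives, with no structure imposed."""
    return list(lines)


def query_schema_on_read(store, site):
    """Impose the schema at query time, skipping lines that do not parse."""
    total = 0
    for line in store:
        try:
            row = parse(line)
        except ValueError:
            continue
        if row["site"] == site:
            total += row["counter"]
    return total


table, rejected = load_schema_on_write(RAW_LINES)
print(len(table), rejected)
print(query_schema_on_read(load_schema_on_read(RAW_LINES), "google"))
```

Schema on read keeps the malformed line available for a later, more forgiving parser, at the cost of repeating the parsing work on every query.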
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32000 cores
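Dividing the cluster totals by the node count shows how ordinary each machine is. The sketch below does that arithmetic, assuming decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an even spread across nodes; the dictionary keys and function name are illustrative.

```python
# Cluster totals as quoted on the slide.
CLUSTER = {"nodes": 4000, "disk_pb": 16, "ram_tb": 64, "cores": 32000}


def per_node(cluster):
    """Average resources per node, assuming decimal units and an even spread."""
    nodes = cluster["nodes"]
    return {
        "disk_tb": cluster["disk_pb"] * 1000 / nodes,
        "ram_gb": cluster["ram_tb"] * 1000 / nodes,
        "cores": cluster["cores"] / nodes,
    }


print(per_node(CLUSTER))
```

That works out to roughly 4 TB of disk, 16 GB of RAM and 8 cores per node, which is commodity server hardware rather than a specialised appliance.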
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce / YARN
HBase (Database)
ZooKeeper (Locking)
SQOOP (RDBMS loader)
Hive (Query)
Pig (Scripting)
Flume (Log loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 1.0 Architecture
HADOOP CLIENT (JAVA, PIG, HIVE)
MAP REDUCE (DISTRIBUTED PROCESSING): JOB TRACKER
HDFS (DISTRIBUTED STORAGE): NAME NODE, SECONDARY NAME NODE
[Diagram: many worker nodes, each running a DATA NODE and a TASK TRACKER]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
HADOOP CLIENT (JAVA, PIG, HIVE)
RESOURCE MANAGER
APPLICATION MASTER
[Diagram: several worker nodes, each running a NODE MANAGER that hosts a CONTAINER]
77 Software Group
Tez¹
¹ Hindi for “fast”
[Diagram: three chained MAP and REDUCE jobs (Job 1, Job 2, Job 3) reading from and writing to HDFS, shown alongside a single job (Job 1) between HDFS read and write]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram, Oracle side: ASM, Datafiles, Buffer Cache, Table, Table, Redo, Log Buffer, Disks]
[Diagram, HBase side: HDFS, HFile, HFile, MemStore, Table, Table, WAL, Disks]
79 Software Group
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767

Hbase Data Model
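The wide-row layout on this slide can be thought of as a pivot: instead of one row per (name, site) pair, each name gets a single row whose columns are the sites that person actually visited. The sketch below models that with nested dictionaries. It is an analogy for the data model only and does not use the HBase client API; the function name is hypothetical.

```python
from collections import defaultdict

# The (name, site, counter) rows from the slide.
ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot (name, site, counter) tuples into one sparse row per name."""
    wide = defaultdict(dict)
    for name, site, counter in rows:
        # Each site becomes a column; absent sites simply have no entry.
        wide[name][site] = counter
    return dict(wide)


wide = to_wide_rows(ROWS)
for name, columns in wide.items():
    print(name, columns)
```

Each row stores only the columns it uses, so Jane's row has no Ebay entry at all rather than a NULL placeholder.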
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAVA
RESULTS
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill
Aster
Greenplum (Pivotal HD)
Paraccel
Hadapt
Oracle SQL Connector for Hadoop (External Table interface to HDFS)
![Page 63: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/63.jpg)
63 Software Group
64 Software Group
65 Software Group
66 Software Group
Google File System (GFS)
Map Reduce BigTable
Google Applications
Google Software Architecture
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 1.0 Architecture
[Diagram: a Hadoop client (Java, Pig, Hive) submits work to the cluster. Map Reduce (distributed processing) is coordinated by the Job Tracker; HDFS (distributed storage) is managed by the Name Node, with a Secondary Name Node alongside. Each worker node runs a Data Node and a Task Tracker.]
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
[Diagram: a Hadoop client (Java, Pig, Hive), a Resource Manager, an Application Master, and Node Managers each hosting a Container]
77 Software Group
Tez (Hindi for "fast")
[Diagram labels: HDFS, Map and Reduce steps grouped into Job 1, Job 2 and Job 3, and a single combined Job 1]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing two storage stacks]
• Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks
• Tables, MemStore, WA Log, HFiles, HDFS, Disks
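The HBase side of the diagram (WA Log, MemStore, HFile) follows a log-then-memory-then-flush write path. The toy class below is a sketch of that idea only; `TinyStore`, its methods and the flush threshold are invented for illustration and bear no relation to the real HBase client API.

```python
class TinyStore:
    """Toy model of the write path: write-ahead log, in-memory store, immutable flushed files."""

    def __init__(self, flush_threshold=2):
        self.wal = []        # append-only log, which could be replayed after a crash
        self.memstore = {}   # recent writes held in memory
        self.hfiles = []     # flushed, sorted, immutable snapshots (newest last)
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.wal.append((key, value))   # 1. log the change first
        self.memstore[key] = value      # 2. then apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the memstore out as one sorted file and start a fresh memstore."""
        self.hfiles.append(sorted(self.memstore.items()))
        self.memstore = {}

    def get(self, key):
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):  # the newest file wins
            for k, v in hfile:
                if k == key:
                    return v
        return None
```

Flushed files are never rewritten in place, so a later value for the same key lives in the MemStore or in a newer file and is found first on read.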
79 Software Group
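HBase Data Model

Denormalized rows:

| Name | Site | Counter |
|---|---|---|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

Normalized (relational) tables:

| NameId | Name |
|---|---|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|---|---|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|---|---|---|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

HBase rows (one column per site, present only where a value exists):

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|---|---|---|---|---|---|---|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|---|---|---|---|---|---|
| 2 | Jane | 716426 | 643261 | | 856767 |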
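The wide, sparse rows on this slide can be modelled as a dictionary of dictionaries: one entry per row key, holding only the columns that actually have a value. This is a sketch of the data model, not of the HBase API; `to_wide_rows` is a hypothetical helper and the tuples restate the slide's sample data.

```python
from collections import defaultdict

ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot (name, site, counter) rows into one sparse row per name, one column per site."""
    wide = defaultdict(dict)
    for name, site, counter in rows:
        wide[name][site] = counter
    return dict(wide)


if __name__ == "__main__":
    for name, columns in to_wide_rows(ROWS).items():
        print(name, columns)
```

Jane's row has no Ebay column at all, rather than a NULL, which is what lets each row carry its own set of columns.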
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, Java, Results]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: Flume loads WebLogs into HDFS; SQOOP loads RDBMS tables (CUSTOMERS, PRODUCTS) into HDFS]
86 Software Group
Berkeley Data Analytics Stack (BDAS)
• Mesos – heterogeneous cluster manager
• Tachyon – in-memory file system
• Spark – memory-optimized distributed execution
• Spark Streaming
• MLbase, MLlib – machine learning
• Shark (SQL)
• BlinkDB
[Diagram also labels: YARN, EC2, Map Reduce, Hive (SQL)]
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
• Database servers: 64 cores, 576 GB RAM
• Storage servers: 112 cores; 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop $/TB (hardware only)
• Exadata: $4911
• Hadoop: $750
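To put the chart's two figures side by side, the sketch below computes the cost ratio and the hardware cost of a given capacity. It assumes the $4911 bar belongs to Exadata and the $750 bar to Hadoop, and that cost scales linearly with terabytes; the 100 TB capacity is an arbitrary illustration.

```python
EXADATA_USD_PER_TB = 4911  # figure from the chart (assumed to be the Exadata bar)
HADOOP_USD_PER_TB = 750    # figure from the chart (assumed to be the Hadoop bar)


def cost_ratio(a_per_tb, b_per_tb):
    """How many times more expensive option A is than option B, per terabyte."""
    return a_per_tb / b_per_tb


def hardware_cost(terabytes, usd_per_tb):
    """Hardware-only cost for a given capacity, assuming linear scaling."""
    return terabytes * usd_per_tb


if __name__ == "__main__":
    print("Ratio:", round(cost_ratio(EXADATA_USD_PER_TB, HADOOP_USD_PER_TB), 2))
    print("100 TB on Exadata:", hardware_cost(100, EXADATA_USD_PER_TB))
    print("100 TB on Hadoop:", hardware_cost(100, HADOOP_USD_PER_TB))
```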
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48 GB RAM per node (864 GB total)
  – 2 x 6-core CPU per node (216 total)
  – 12 x 2 TB HDD per node (216 spindles, 864 TB)
  – 40 Gb/s InfiniBand between nodes
  – 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
• Machine Learning: programs that evolve with "experience"
• Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
• Predictive Analytics: programs that extrapolate from existing data into the future
• Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: scatter plot with fitted line f(x) = 0.971521231456065 x + 0.71906459527154; axes run from about -20 to 120]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
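The fitted line on the chart is the kind of result ordinary least squares produces. A minimal sketch in plain Python follows; the function names and the sample points are assumptions for illustration, and the slide's own data points are not available.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Extrapolate from the fitted line to a new x value."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    model = fit_line([0, 10, 20, 30], [1, 21, 41, 61])
    print(model, predict(model, 40))
```

Prediction is then just evaluating the line beyond the data it was fitted on, which is the sense of "extrapolate" used in this talk.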
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
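One of the simplest ways to build such a model is a nearest-centroid classifier: average the labelled examples per class, then give new data the label of the closest average. The sketch below applies it to spam detection with two invented features (number of links, number of all-caps words); the training data is hypothetical, and this is just one classification technique among many.

```python
TRAINING = [
    ((8, 5), "spam"),
    ((6, 7), "spam"),
    ((1, 0), "ham"),
    ((0, 1), "ham"),
]


def train_centroids(samples):
    """Return the mean feature vector for each label in (features, label) samples."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    def dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: dist(centroids[label]))


if __name__ == "__main__":
    model = train_centroids(TRAINING)
    print(classify(model, (5, 4)), classify(model, (1, 1)))
```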
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
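k-means is a common way to group data with no labels supplied. The sketch below runs it on one-dimensional values (imagine basket totals, a hypothetical choice); seeding the centroids with the first k points is a simplification made to keep the example deterministic.

```python
def k_means(points, k, iterations=10):
    """Plain k-means on 1-D values; the first k points seed the centroids."""
    centroids = list(points[:k])
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


if __name__ == "__main__":
    print(k_means([1, 2, 3, 20, 21, 22], k=2))
```

No labels are supplied anywhere: the two groups emerge from the distances alone.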
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data, Clean, Training Set, Validation Set, Model, Candidate Model, Validate, Production Model, New Data (New Business, Existing Business), Prediction]
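The workflow on this slide (split the cleaned data, train a candidate model, validate it on held-back data, then use it on new data) can be sketched as follows. The least-squares line stands in for whatever model is being trained, and the function names, split fraction and seed are assumptions for illustration.

```python
import random


def split(records, validation_fraction=0.25, seed=0):
    """Shuffle deterministically, then hold back a validation set."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def train(training_set):
    """Candidate model: least-squares line through the (x, y) training points."""
    n = len(training_set)
    mx = sum(x for x, _ in training_set) / n
    my = sum(y for _, y in training_set) / n
    slope = (sum((x - mx) * (y - my) for x, y in training_set)
             / sum((x - mx) ** 2 for x, _ in training_set))
    return slope, my - slope * mx


def validate(model, validation_set):
    """Mean absolute error of the candidate on data it has not seen."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in validation_set) / len(validation_set)


def predict(model, x):
    """Score new data with the model that passed validation."""
    slope, intercept = model
    return slope * x + intercept
```

A candidate would be promoted to production only if its validation error is acceptable; the threshold is a business decision and is not shown here.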
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics / Data Science
• Search Optimization
• Recommendation Systems
• Security: vulnerability, penetration detection
• Fraud Detection
• CRM: churn, defaults
• Medical: risk analysis, diagnosis, prognosis
• Game optimization
• Advertising: targeting, tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs, Change Data Capture, JMS Queue, Hadoop Poster, Batched HDFS File Copy, Audit Change Data, HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: Server and Storage, STATISTICA, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytics Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 64: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/64.jpg)
![Page 65: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/65.jpg)
66 Software Group
Google Software Architecture
Layers, from storage upward:
• Google File System (GFS)
• Map Reduce, BigTable
• Google Applications
67 Software Group
Map Reduce
(Diagram: a Start step fans out to many parallel Map tasks, whose outputs feed a Reduce step.)
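The map-then-reduce flow on this slide can be sketched in a few lines of single-process Python. This is only an illustration of the idea, assuming a word-count job; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and are not part of Hadoop or any Google API.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle/sort: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    # Each document stands in for one input split handled by one mapper.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["big data big", "data revolution"]))
```

In a real cluster the map calls would run in parallel on different nodes and the shuffle would move data across the network; here everything runs in one process to keep the shape of the computation visible.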
68 Software Group
Multi-stage Map-Reduce
(Diagram: a Client job reads from HDFS through successive groups of MAPPER tasks, with SCAN, SORT, AGGREGATE and REDUCE stages.)
69 Software Group
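Schema on Read vs Schema on Write
• Schema on Write (Data Warehouse): Data, Extract, Transform, Load, Data Warehouse, Utilize; the work shown before the data can be used is Code, Cleanse, Normalize, Aggregate and Analyse
• Schema on Read (Hadoop): Data, Load, Hadoop, Utilize; the work shown after loading is Code, Cleanse and Analyse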
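The difference between the two approaches can be sketched as follows. This is a toy illustration, assuming a three-field comma-separated log; the sample data and the function names (`load_schema_on_write`, `load_schema_on_read`, `query_hits`) are hypothetical.

```python
import csv
import io

# Hypothetical raw input: date, user, hit count. The second row is malformed.
RAW = "2014-04-08,guy,42\n2014-04-09,jane,not-a-number\n"


def load_schema_on_write(raw):
    """Validate and type every row at load time; rows that do not fit are rejected."""
    table = []
    for date, user, hits in csv.reader(io.StringIO(raw)):
        try:
            table.append({"date": date, "user": user, "hits": int(hits)})
        except ValueError:
            pass  # the row does not match the schema, so it never reaches the table
    return table


def load_schema_on_read(raw):
    """Store the lines untouched; no structure is imposed at load time."""
    return raw.splitlines()


def query_hits(stored_lines):
    """Apply the schema only when the data is read."""
    total = 0
    for line in stored_lines:
        fields = line.split(",")
        if len(fields) == 3 and fields[2].isdigit():
            total += int(fields[2])
    return total


if __name__ == "__main__":
    print(load_schema_on_write(RAW))
    print(query_hits(load_schema_on_read(RAW)))
```

With schema on write the malformed row is lost at load time; with schema on read it stays in storage, and a later query with a different interpretation could still make use of it.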
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16PB disk
• 64 TB of RAM
• 32000 Cores
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
• HADOOP CLIENT (JAVA, PIG, HIVE)
• MAP REDUCE (DISTRIBUTED PROCESSING): JOB TRACKER
• HDFS (DISTRIBUTED STORAGE): NAME NODE, SECONDARY NAME NODE
• Many worker nodes, each running a DATA NODE and a TASK TRACKER
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
• HADOOP CLIENT (JAVA, PIG, HIVE)
• RESOURCE MANAGER
• APPLICATION MASTER
• NODE MANAGER (three shown), each with a CONTAINER
77 Software Group
Tez (Hindi for "fast")
(Diagram: with plain Map Reduce, the MAP and REDUCE steps run as separate jobs, Job 1, Job 2 and Job 3, between HDFS reads and writes; with Tez the same steps run as a single Job 1.)
78 Software Group
HBase
A real-time database built on Hadoop
(Diagram comparing the Oracle and HBase storage layers)
• Oracle: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks
• HBase: Tables, MemStore, WA Log (write-ahead log), HFiles, HDFS, Disks
79 Software Group
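HBase Data Model

Source data:

| Name | Site | Counter |
|------|------|---------|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

Relational (normalized) model:

| NameId | Name |
|--------|------|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|--------|----------|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|--------|--------|---------|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

HBase model (one wide, sparse row per name; each row has only the columns it needs):

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|----|------|------|--------|----------|-----------------|-----------------|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|----|------|--------|----------|-----------------|----------------|
| 2 | Jane | 716426 | 643261 | | 856767 |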
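The pivot from relational rows to wide, sparse rows can be sketched with plain dictionaries. This is only an illustration of the data model using the slide's sample values; it does not use the HBase client API, and `to_wide_rows` is a hypothetical helper.

```python
from collections import defaultdict

# (name, site, counter) rows taken from the slide's sample data.
ROWS = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot (name, site, counter) tuples into one sparse row per name."""
    wide = defaultdict(dict)
    for name, site, counter in rows:
        wide[name][site] = counter  # the site name itself becomes the column name
    return dict(wide)


if __name__ == "__main__":
    for row_key, columns in to_wide_rows(ROWS).items():
        print(row_key, columns)
```

Each row carries only the columns it actually has, so Jane's row has no Ebay entry at all rather than a NULL.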
80 Software Group
Hive
81 Software Group
82 Software Group
(Diagram labels: SQL, JAVA, RESULTS)
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• Paraccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
(Diagram: FLUME loads WebLogs into HDFS; SQOOP loads RDBMS tables such as CUSTOMERS and PRODUCTS into HDFS.)
86 Software Group
Berkeley Data Analytic Stack (BDAS)
• Yarn, EC2
• Mesos – heterogeneous cluster manager
• Tachyon – in-memory file system
• Spark – memory-optimized distributed execution
• Spark Streaming
• MLbase / MLlib – Machine Learning
• Map Reduce
• Shark (SQL), Hive (SQL)
• BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
• Database servers: 64 cores, 576 GB RAM
• Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
Exadata vs Hadoop, $/TB (hardware only)
• Exadata: $4911
• Hadoop: $750
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48GB RAM per node (864GB total)
  – 2x6 Core CPU per node (216 total)
  – 12x2TB HDD per node (216 spindles, 864 TB)
  – 40Gb/s InfiniBand between nodes
  – 10Gb/s Ethernet to datacentre
• Competitive Pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
• Machine Learning: Programs that evolve with "experience"
• Collective Intelligence: Programs that use inputs from "crowds" to seem intelligent
• Predictive Analytics: Programs that extrapolate from existing data into the future
• Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
112 Software Group
Artificial Intelligence Strikes back
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
(Scatter plot with a fitted trend line; x axis 0 to 120, y axis -20 to 120.)
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
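A trend line like the one on this slide comes from an ordinary least-squares fit. The sketch below shows the arithmetic in plain Python; the function names and the sample points are illustrative assumptions and are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


if __name__ == "__main__":
    slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
    print(slope, intercept, predict(slope, intercept, 10))
```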
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
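As a toy instance of classification, the sketch below labels a new message as spam or not by counting word overlap with labelled examples. It is a deliberately naive word-count scorer, not a production spam filter; the training messages and the names `train` and `classify` are made up for illustration.

```python
from collections import Counter


def train(labelled_messages):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in labelled_messages:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def classify(counts, text):
    """Pick the label whose training words overlap most with the new text."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    model = train([
        ("win cash now", "spam"),
        ("cheap cash prize", "spam"),
        ("meeting at noon", "ham"),
        ("lunch meeting tomorrow", "ham"),
    ])
    print(classify(model, "cash prize inside"))
    print(classify(model, "team meeting"))
```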
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
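One common way to group data without predefined classes is k-means, used here purely as an example technique; the slide does not name a specific algorithm. The sketch assumes 2-D points and caller-supplied starting centroids, and `k_means` is a hypothetical helper rather than a library call.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (squared distance).
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


if __name__ == "__main__":
    pts = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
    print(k_means(pts, [(0, 0), (10, 10)]))
```

No labels are supplied anywhere: the two groups emerge only from the distances between the points.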
121 Software Group
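Supervised Machine Learning
(Diagram: Raw Data from Existing Business is cleaned and split into a Training Set and a Validation Set; the Training Set produces a Candidate Model, the Validation Set is used to Validate it, and the resulting Production Model turns New Data from New Business into a Prediction.)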
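The train, validate, then promote workflow in this diagram can be sketched with a deliberately simple model. Everything here is an assumption made for illustration: the labelled data is synthetic, the "model" is a single numeric cut-off, and the 0.9 accuracy bar for promotion is arbitrary.

```python
def split(rows, every=4):
    """Hold out every Nth labelled row for validation; train on the rest."""
    validation = rows[::every]
    training = [row for i, row in enumerate(rows) if i % every]
    return training, validation


def train_threshold(training):
    """Candidate model: a cut-off halfway between the two class means."""
    pos = [x for x, label in training if label]
    neg = [x for x, label in training if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, rows):
    """Fraction of rows where the model's answer matches the known label."""
    return sum((x > threshold) == label for x, label in rows) / len(rows)


def predict(threshold, x):
    """Score a new, unlabelled value with the production model."""
    return x > threshold


if __name__ == "__main__":
    # Synthetic labelled history: the label is True when the value is 50 or more.
    history = [(x, x >= 50) for x in range(0, 100, 5)]
    training, validation = split(history)
    candidate = train_threshold(training)
    score = accuracy(candidate, validation)
    # Promote the candidate only if it does well on data it has not seen.
    production = candidate if score >= 0.9 else None
    print(candidate, score, production is not None and predict(production, 72))
```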
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
• Search Optimization
• Recommendation Systems
• Security
  – Vulnerability
  – Penetration Detection
• Fraud Detection
• CRM
  – Churn
  – Defaults
• Medical
  – Risk analysis
  – Diagnosis
  – Prognosis
• Game optimization
• Advertising
  – Targeting
  – Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka and Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
(Diagram: Change Data Capture reads the Redo-logs and passes changes through a JMS Queue to the Hadoop Poster, which feeds Batched HDFS File Copy, Audit Change Data and HBase RealTime replication.)
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: STATISTICA, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group
135 Software Group
Data Visualization
136 Software Group
Live scoring – integration into operational systems
137 Software Group
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app.
We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
![Page 67: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/67.jpg)
67 Software Group
Start ReduceMapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
MapMap
Map Reduce
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- Itrsquos not enough to lay out products on tables
- Therersquos a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 10 Architecture
- Hadoop 20 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through ldquoBig Data analyticsrdquo
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlexreg for Hadoop
- Toad BI Suite
- Slide 131
- Dellrsquos offering was not completehellip
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring ndash integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
-
![Page 68: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/68.jpg)
68 Software Group
HDFS
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
MAPPER
SCANSORT
MAPPER
MAPPER
MAPPER
MAPPER
AGGREGATE
REDUCEClient
Multi-stage Map-Reduce
69 Software Group
Schema on Read vs Schema on Write
Data
Analyse
Aggregate
Normalize
Cleanse
CodeExtract
Load Transform Data Warehouse
Data LoadHadoop
Analyse
Cleanse
Code
Utilize
Schema on Write
Schema on Read
Utilize
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Products shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Products shown: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft = a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 69: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/69.jpg)
69 Software Group
Schema on Read vs Schema on Write
[Diagram: two pipelines compared]
Schema on Write (data warehouse): Data; Extract, Transform, Load; Cleanse; Normalize; Aggregate; Code; Analyse; Utilize
Schema on Read (Hadoop): Data; Data Load; Cleanse; Code; Analyse; Utilize
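The contrast between the two approaches can be sketched in a few lines. In this illustration, schema on write validates and shapes each record before storing it, while schema on read stores the raw lines and applies structure only when a question is asked. The sample events, field names and function names are hypothetical.

```python
import json

# Hypothetical raw events, deliberately inconsistent.
RAW_EVENTS = [
    '{"user": "dick", "site": "ebay", "hits": "3"}',
    '{"user": "jane", "site": "google", "hits": 5, "referrer": "mail"}',
    '{"user": "jane", "hits": "oops"}',
]


def load_schema_on_write(lines):
    """Convert every record to a fixed (user, site, hits) shape before storing it."""
    table, rejected = [], []
    for line in lines:
        record = json.loads(line)
        try:
            table.append((record["user"], record["site"], int(record["hits"])))
        except (KeyError, ValueError):
            rejected.append(line)  # rows that do not fit the schema never get stored
    return table, rejected


def load_schema_on_read(lines):
    """Store the raw lines untouched; nothing is rejected or dropped at load time."""
    return list(lines)


def hits_per_user(raw_store):
    """Structure is applied here, at query time, by the code that needs it."""
    totals = {}
    for line in raw_store:
        record = json.loads(line)
        try:
            hits = int(record.get("hits", 0))
        except ValueError:
            continue  # this particular query chooses to skip malformed values
        totals[record["user"]] = totals.get(record["user"], 0) + hits
    return totals


warehouse_rows, rejected_rows = load_schema_on_write(RAW_EVENTS)
raw_store = load_schema_on_read(RAW_EVENTS)
totals = hits_per_user(raw_store)
```

The trade-off shown is that the warehouse-style load discards the extra "referrer" field and the malformed row up front, whereas the raw store keeps everything and leaves those decisions to each later query.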
70 Software Group
Hadoop Open Source Map-Reduce Stack
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
• 4000 nodes
• 16 PB disk
• 64 TB of RAM
• 32,000 cores
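Dividing these totals by the node count gives a feel for how modest each individual machine is. The short calculation below assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and an even spread across nodes, neither of which the slide states.

```python
nodes = 4000
disk_tb = 16 * 1000   # 16 PB expressed in TB (decimal units assumed)
ram_gb = 64 * 1000    # 64 TB expressed in GB (decimal units assumed)
cores = 32000

per_node = {
    "disk_tb": disk_tb / nodes,
    "ram_gb": ram_gb / nodes,
    "cores": cores / nodes,
}
```

Under those assumptions each node averages about 4 TB of disk, 16 GB of RAM and 8 cores, which is commodity-server territory.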
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
• Hadoop File System (HDFS)
• Map Reduce / YARN
• HBase (Database)
• ZooKeeper (Locking)
• SQOOP (RDBMS loader)
• Hive (Query)
• Pig (Scripting)
• Flume (Log Loader)
• Oozie (Workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); MAP REDUCE (DISTRIBUTED PROCESSING) with JOB TRACKER; HDFS (DISTRIBUTED STORAGE) with NAME NODE and SECONDARY NAME NODE; worker nodes, each a DATA NODE plus TASK TRACKER]
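The MAP REDUCE layer in this architecture splits work into map tasks and reduce tasks that run on the worker nodes. The single-process sketch below mimics that flow with a word count; it is plain Python for illustration only and does not use the Hadoop API, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def run_job(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = run_job(["big data", "big hadoop"])
```

In a real cluster each document split would be mapped on the node that stores it, and the shuffle would move data across the network to the reducers.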
76 Software Group
Hadoop 2.0 YARN
YARN: Yet Another Resource Negotiator
[Diagram labels: HADOOP CLIENT (JAVA, PIG, HIVE); RESOURCE MANAGER; NODE MANAGER (three shown), each with a CONTAINER; APPLICATION MASTER]
77 Software Group
Tez (Hindi for "fast")
[Diagram: chains of MAP and REDUCE stages between HDFS reads and writes, labelled Job 1, Job 2 and Job 3 in one case and Job 1 alone in the other]
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram: two storage stacks side by side]
First stack: Tables; Buffer Cache; Log Buffer; Redo; Datafiles; ASM; Disks
Second stack: Tables; MemStore; WA Log; HFile; HDFS; Disks
79 Software Group
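HBase Data Model

Source data:

| Name | Site | Counter |
|------|------|---------|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

Relational (normalized) model:

| NameId | Name |
|--------|------|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|--------|----------|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|--------|--------|---------|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

HBase model (one wide row per name; the set of columns varies by row):

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|----|------|------|--------|----------|-----------------|-----------------|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|----|------|--------|----------|-----------------|----------------|
| 2 | Jane | 716426 | 643261 | | 856767 |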
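One way to picture the wide-row layout is as a map of row keys to sparse column maps, where each row stores only the columns it actually has. The sketch below models that with plain Python dictionaries using the counters from the slide; it is an illustration of the idea, not the HBase client API, and `put` and `get` here are hypothetical helpers.

```python
from collections import defaultdict

# Row key -> {column name: value}. Rows may have entirely different columns.
table = defaultdict(dict)


def put(row_key, column, value):
    """Write one cell; a new column needs no schema change."""
    table[row_key][column] = value


def get(row_key, column, default=None):
    """Read one cell; a missing column is simply absent, not NULL."""
    return table.get(row_key, {}).get(column, default)


# The source rows from the slide, loaded as (name, site, counter).
for name, site, counter in [
    ("Dick", "Ebay", 507018), ("Dick", "Google", 690414),
    ("Jane", "Google", 716426), ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261), ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]:
    put(name, site, counter)

jane_columns = sorted(table["Jane"])
```

Looking up everything about one name is a single row read here, whereas the normalized relational model needs a join across three tables.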
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL; JAVA; RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
• Cloudera Impala
• MapR Drill
• Aster
• Greenplum (Pivotal HD)
• ParAccel
• Hadapt
• Oracle SQL Connector for Hadoop (External Table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram labels: WebLogs; FLUME; HDFS; SQOOP; RDBMS (CUSTOMERS, PRODUCTS)]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
[Diagram: layered stack]
• YARN, EC2
• Mesos – heterogeneous cluster manager
• Tachyon – in-memory file system
• Spark – memory-optimized distributed execution
• Spark Streaming
• MLbase / MLlib – Machine Learning
• Map Reduce
• Shark (SQL), Hive (SQL)
• BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
• 64 cores
• 576 GB RAM
Storage Servers
• 112 cores
• 100 TB SAS or 336 TB SATA
• plus 5 TB SSD
90 Software Group
Economies
[Bar chart: Exadata vs Hadoop, $/TB (hardware only); series in order: Exadata, Hadoop; data labels in order: $4,911, $750; axis from $0 to $6,000]
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48 GB RAM per node (864 GB total)
  – 2x6 core CPU per node (216 total)
  – 12x2 TB HDD per node (216 spindles, 864 TB)
  – 40 Gb/s InfiniBand between nodes
  – 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
• Machine Learning: programs that evolve with "experience"
• Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
• Predictive Analytics: programs that extrapolate from existing data into the future
• Big Data Analytics, AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
![Page 70: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/70.jpg)
![Page 71: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/71.jpg)
71 Software Group
Hadoop at Yahoo
Yahoo Hadoop cluster
bull 4000 nodesbull 16PB diskbull 64 TB of RAMbull 32000 Cores
72 Software Group
73 Software Group
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: fitted linear trend line, f(x) = 0.971521231456065 x + 0.71906459527154; both axes run up to 120]
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistic regression
- CART
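A trend line of the form f(x) = slope * x + intercept, like the one on the chart, can be obtained with ordinary least squares. The sketch below is a minimal pure-Python version; the sample points are made up for illustration and are not the data behind the slide.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up observations that roughly follow y = x.
xs = [0, 20, 40, 60, 80, 100]
ys = [3, 18, 41, 57, 82, 99]

slope, intercept = fit_line(xs, ys)


def predict(x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


print(slope, intercept, predict(120))
```

Extrapolating to x = 120 is exactly the "existing data into the future" step; how far such a line can be trusted outside the observed range is a separate question.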
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
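As one possible illustration of classification, the sketch below uses a nearest-centroid classifier, chosen here only because it is short. The two features (links and exclamation marks per message) and the training rows are invented for the example.

```python
from math import dist


def train_centroids(samples):
    """samples: list of (features, label). Returns {label: mean feature vector}."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, features):
    """Assign the label whose centroid is closest to the new data point."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labelled messages: (links, exclamation marks) -> label.
training = [
    ((8, 5), "spam"),
    ((6, 7), "spam"),
    ((0, 1), "ham"),
    ((1, 0), "ham"),
]

model = train_centroids(training)
print(classify(model, (7, 4)), classify(model, (1, 1)))
```

The same shape (train on labelled history, then label new records) applies to churn risk or customer value with different features.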
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
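To make "grouping without a pre-existing scheme" concrete, here is a minimal k-means sketch. k-means is one common clustering technique, not necessarily the one the speaker had in mind, and the basket data (two made-up spend figures per shopper) is purely illustrative.

```python
from math import dist


def kmeans(points, k, iterations=10):
    """Plain k-means with a naive deterministic start (the first k points)."""
    centroids = [list(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(centroids[i], p))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, clusters


# Hypothetical baskets: (spend on groceries, spend on electronics).
baskets = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]

centroids, clusters = kmeans(baskets, k=2)
print(clusters)
```

No labels are supplied; the two groups emerge from the distances alone, which is the contrast with the classification case.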
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Training Set; Validation Set; Model; Candidate Model; Validate; Production Model; New Data (New Business, Existing Business); Prediction]
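The workflow in the diagram (clean the raw data, split it into training and validation sets, fit a candidate model, validate it, then use it for predictions on new data) can be sketched as follows. The data, the 25% validation share and the error threshold of 1.0 are all assumptions made for this example.

```python
import random


def clean(rows):
    """Drop records with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def split(rows, validation_fraction=0.25, seed=42):
    """Shuffle reproducibly, then cut into training and validation sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_line(rows):
    """Least-squares line through the training rows; returns a predictor."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    slope = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x, _ in rows)
    return lambda x: slope * x + (my - slope * mx)


def mean_abs_error(model, rows):
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)


# Made-up raw data: y = 3x + 2, plus two dirty records.
raw = [(x, 3 * x + 2) for x in range(20)] + [(None, 5), (7, None)]

training_set, validation_set = split(clean(raw))
candidate = fit_line(training_set)
error = mean_abs_error(candidate, validation_set)

# Promote the candidate only if it validates well enough.
production_model = candidate if error < 1.0 else None
if production_model is not None:
    print(production_model(100))  # prediction for new data
```

The validation set is never seen during fitting, which is what makes the error figure a fair check before promotion.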
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics / Data Science
- Search optimization
- Recommendation systems
- Security: vulnerability, penetration detection
- Fraud detection
- CRM: churn, defaults
- Medical: risk analysis, diagnosis, prognosis
- Game optimization
- Advertising: targeting, tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka and Mahout are HARD
- Small and medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo logs; Change Data Capture; JMS Queue; Hadoop Poster; batched HDFS file copy; audit change data; HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete...
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Dell offerings shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Dell offerings shown: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
- How could data and algorithms transform your business?
- What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
- Where is the data?
  - Start collecting now
139 Software Group
For your career
- Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
- Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
- Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete...
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
72 Software Group
73 Software Group
74 Software Group
Hadoop ecosystem
- Hadoop File System (HDFS)
- Map Reduce / YARN
- HBase (database)
- ZooKeeper (locking)
- SQOOP (RDBMS loader)
- Hive (query)
- Pig (scripting)
- Flume (log loader)
- Oozie (workflow manager)
75 Software Group
Hadoop 1.0 Architecture
[Diagram: a Hadoop client (Java, Pig, Hive) in front of two layers, Map Reduce (distributed processing) and HDFS (distributed storage). Master services: Job Tracker, Name Node, Secondary Name Node. Fifteen worker boxes, each labelled Data Node / Task Tracker.]
76 Software Group
Hadoop 2.0 YARN (Yet Another Resource Negotiator)
[Diagram labels: Hadoop client (Java, Pig, Hive); Resource Manager; Application Master; three Node Managers, each with a Container]
77 Software Group
Tez [1]
[1] Hindi for "fast"
[Diagram labels: HDFS; Map and Reduce stages grouped into Job 1, Job 2 and Job 3; HDFS; a single Job 1]
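The multi-job pattern in the diagram can be imitated in a few lines of plain Python. This is only a toy model of chained Map Reduce jobs, not the Tez or Hadoop API; the log lines and all function names are invented for the example. In classic Map Reduce the output of each job is written out before the next job reads it, and engines such as Tez aim to run a chain like this as a single job.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """One job: map records to (key, value) pairs, group by key, reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [reducer(key, values) for key, values in groups.items()]


# Job 1: count hits per (user, site).
def hits_mapper(line):
    user, site = line.split()
    yield (user, site), 1


def sum_reducer(key, values):
    return key, sum(values)


# Job 2: from Job 1's output, find each user's most visited site.
def by_user_mapper(record):
    (user, site), count = record
    yield user, (count, site)


def top_site_reducer(user, values):
    count, site = max(values)
    return user, site


log = [
    "dick ebay", "dick google", "dick google",
    "jane facebook", "jane google", "jane facebook",
]

stage1 = map_reduce(log, hits_mapper, sum_reducer)  # intermediate result between jobs
stage2 = map_reduce(stage1, by_user_mapper, top_site_reducer)
print(dict(stage2))
```

The `stage1` list stands in for the intermediate data that a chain of separate jobs would store between steps.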
78 Software Group
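HBase
A real-time database built on Hadoop
[Diagram: two storage stacks side by side.
Left (Oracle terminology): Tables; Buffer Cache; Log Buffer; Redo; Datafiles; ASM; Disks.
Right (HBase): Tables; MemStore; WA Log (write-ahead log); HFiles; HDFS; Disks.]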
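The HBase side of the comparison (write-ahead log, MemStore, HFiles) can be pictured with a toy model. The class below is a simplified sketch of that write path, not HBase code: the class name, the flush threshold and the use of in-memory lists in place of files on HDFS are all assumptions for illustration.

```python
class ToyRegion:
    """Toy log-structured store: write-ahead log + in-memory store + immutable files."""

    def __init__(self, flush_threshold=3):
        self.wal = []        # stands in for the write-ahead log
        self.memstore = {}   # recent writes held in memory
        self.hfiles = []     # flushed, immutable files (oldest first)
        self.flush_threshold = flush_threshold

    def put(self, row_key, value):
        self.wal.append((row_key, value))  # 1. log the change for durability
        self.memstore[row_key] = value     # 2. apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the sorted in-memory contents out as a new immutable file."""
        self.hfiles.append(dict(sorted(self.memstore.items())))
        self.memstore = {}
        self.wal = []  # the logged edits are now persisted in a file

    def get(self, row_key):
        if row_key in self.memstore:
            return self.memstore[row_key]
        for hfile in reversed(self.hfiles):  # the newest file wins
            if row_key in hfile:
                return hfile[row_key]
        return None


region = ToyRegion(flush_threshold=2)
region.put("a", 1)
region.put("b", 2)   # reaches the threshold and triggers a flush
region.put("a", 3)   # newer value stays in memory for now
print(region.get("a"), region.get("b"))
```

The roles line up with the Oracle column on the slide: the log buffer and redo correspond to the write-ahead log, the buffer cache to the MemStore, and datafiles to HFiles.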
79 Software Group
HBase Data Model

Source data:
Name  Site             Counter
Dick  Ebay             507018
Dick  Google           690414
Jane  Google           716426
Dick  Facebook         723649
Jane  Facebook         643261
Jane  ILoveLarry.com   856767
Dick  MadBillFans.com  675230

Relational (normalized) form:
NameId  Name
1       Dick
2       Jane

SiteId  SiteName
1       Ebay
2       Google
3       Facebook
4       ILoveLarry.com
5       MadBillFans.com

NameId  SiteId  Counter
1       1       507018
1       2       690414
2       2       716426
1       3       723649
2       3       643261
2       4       856767
1       5       675230

HBase form (one row per name, one column per site visited):
Id  Name  Ebay    Google  Facebook  (other columns)  MadBillFans.com
1   Dick  507018  690414  723649                     675230

Id  Name  Google  Facebook  (other columns)  ILoveLarry.com
2   Jane  716426  643261                     856767
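The wide, sparse rows on this slide map naturally onto nested dictionaries: one entry per row key, holding only the columns that row actually has. The sketch below pivots the slide's (name, site, counter) records into that shape; the function and variable names are illustrative, and this is plain Python rather than an HBase client call.

```python
from collections import defaultdict

# The (name, site, counter) records shown on the slide.
hits = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(records):
    """Pivot (row key, column, value) records into one sparse row per key."""
    rows = defaultdict(dict)
    for name, site, counter in records:
        rows[name][site] = counter
    return dict(rows)


rows = to_wide_rows(hits)
print(sorted(rows["Dick"]))
print(sorted(rows["Jane"]))
```

Jane's row simply has no "Ebay" column; nothing is stored for it, which is the contrast with a fixed relational schema.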
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL; Java; Results]
83 Software Group
Other SQL-like Hadoop Interfaces
- Cloudera Impala
- MapR Drill
- Aster
- Greenplum (Pivotal HD)
- ParAccel
- Hadapt
- Oracle SQL Connector for Hadoop (external table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram: web logs are loaded into HDFS by Flume; RDBMS tables (CUSTOMERS, PRODUCTS) are loaded into HDFS by SQOOP]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos - heterogeneous cluster manager
Tachyon - in-memory file system
Spark - memory-optimized distributed execution
Spark Streaming
MLbase, MLlib - machine learning
Map Reduce
Shark (SQL), Hive (SQL)
BlinkDB
![Page 74: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/74.jpg)
74 Software Group
Hadoop File System (HDFS)
Map Reduce YARNHbase
(Database)ZooKeeper(Locking)
SQOOP(RDBMS loader)
Hive(Query)
Pig(Scripting)
Flume(Log Loader)
Oozie (Workflow manager)
Hadoop ecosystem
75 Software Group
Hadoop 10 Architecture
MAP REDUCE (DISTRIBUTED PROCESSING)
HADOOP CLIENT (JAVA PIG HIVE)
HDFS (DISTRIBUTED
STORAGE)
JOB TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
SECONDARY NAME NODE
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
DATA NODE TASK TRACKER
76 Software Group
Hadoop 20 YARN
APPLICATION MASTER
NODE MANAGER
CONTAINER
RESOURCE MANAGER
NODE MANAGER
CONTAINER
NODE MANAGER
CONTAINER
HADOOP CLIENT (JAVA PIG HIVE)
Yet Another Resource Negotiator
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
Machine Learning: programs that evolve with "experience"
Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
Predictive Analytics: programs that extrapolate from existing data into the future
Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: fitted trend line f(x) = 0.971521231456065 x + 0.71906459527154, x axis 0 to 120, y axis -20 to 120]
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistic regression
- CART
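
The first technique in the list, linear regression, can be sketched in a few lines. The following is a minimal ordinary least-squares fit for a single predictor; the function name `fit_line` and the sample data are illustrative assumptions, not taken from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points lying exactly on y = 2x + 1.
xs = [0, 20, 40, 60, 80, 100]
ys = [2.0 * x + 1.0 for x in xs]
slope, intercept = fit_line(xs, ys)
print(f"f(x) = {slope} x + {intercept}")
```

A trend line like the one in the chart is the same calculation applied to noisy data, where the fitted line no longer passes through every point.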
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
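
To make "grouping without a pre-existing scheme" concrete, here is a small k-means sketch on one-dimensional data. The slide does not name a clustering algorithm, so k-means is an assumption chosen for illustration, and the values and starting centroids are made up.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


values = [1, 2, 3, 20, 21, 22]
centroids, clusters = kmeans_1d(values, centroids=[0, 10])
print(centroids, clusters)
```

No labels are supplied; the two groups emerge from the distances between the values alone.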
121 Software Group
Supervised Machine Learning
Raw Data
Clean
Validate
Model
Candidate Model
Training Set
Validation Set
Production Model
New Data
New Business
Existing Business
Prediction
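
The training set / validation set step in this workflow can be sketched as follows. Everything here is a toy assumption for illustration: the data, the threshold "model", and the helper names are not from the slides.

```python
import random


def split_train_validation(rows, validation_fraction=0.25, seed=0):
    """Shuffle labelled rows and split them into training and validation sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(model, labelled_rows):
    """Fraction of rows for which the model predicts the recorded label."""
    correct = sum(1 for features, label in labelled_rows if model(features) == label)
    return correct / len(labelled_rows)


# Toy labelled data: the feature is a number, the label is whether it is >= 50.
rows = [(x, x >= 50) for x in range(100)]
train, validation = split_train_validation(rows)

# Candidate model: a threshold learned only from the training set.
largest_negative = max(x for x, label in train if not label)
smallest_positive = min(x for x, label in train if label)
threshold = (largest_negative + smallest_positive) / 2


def candidate(x):
    return x >= threshold


# The validation set, unseen during training, decides whether the candidate is promoted.
score = accuracy(candidate, validation)
print(f"validation accuracy: {score:.2f}")
```

Only a candidate that scores well on the held-out validation set would be promoted to a production model and applied to new data.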
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
- Search Optimization
- Recommendation Systems
- Security
  - Vulnerability
  - Penetration Detection
- Fraud Detection
- CRM
  - Churn
  - Defaults
- Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
- Game optimization
- Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
Batched HDFS File Copy
Audit Change Data
HBase Real-Time replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete...
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Dell + StatSoft = a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
- How could data and algorithms transform your business?
- What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
- Where is the data?
  - Start collecting now
139 Software Group
For your career
- Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
- Lots of opportunities for those with Math & Statistics
  - Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
- Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete...
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 75: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/75.jpg)
75 Software Group
Hadoop 1.0 Architecture
HADOOP CLIENT (JAVA, PIG, HIVE)
MAP REDUCE (DISTRIBUTED PROCESSING): JOB TRACKER
HDFS (DISTRIBUTED STORAGE): NAME NODE, SECONDARY NAME NODE
DATA NODE / TASK TRACKER (15 worker nodes shown in the diagram)
76 Software Group
Hadoop 2.0 YARN
Yet Another Resource Negotiator
HADOOP CLIENT (JAVA, PIG, HIVE)
RESOURCE MANAGER
APPLICATION MASTER
NODE MANAGER / CONTAINER (3 shown in the diagram)
77 Software Group
Tez (Hindi for "fast")
[Diagram: MAP and REDUCE stages grouped into Job 1, Job 2 and Job 3, reading from and writing to HDFS]
78 Software Group
HBase
A real-time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
Log Buffer
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
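79 Software Group
HBase Data Model

| Name | Site | Counter |
|---|---|---|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

| NameId | Name |
|---|---|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|---|---|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|---|---|---|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|---|---|---|---|---|---|---|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|---|---|---|---|---|---|
| 2 | Jane | 716426 | 643261 | | 856767 |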
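
The difference between the relational layout and the wide-row layout can be sketched with plain Python dictionaries. This is not the HBase client API, only an illustration of the shape of the data; the function name `to_wide_rows` is an assumption.

```python
# (name, site, counter) triples, as in the flat table on the slide.
triples = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]


def to_wide_rows(rows):
    """Pivot triples into one sparse row per name, with one column per site."""
    wide = {}
    for name, site, counter in rows:
        # Each row key holds only the columns it actually has a value for.
        wide.setdefault(name, {})[site] = counter
    return wide


wide = to_wide_rows(triples)
for name, columns in wide.items():
    print(name, columns)
```

Each row carries its own set of columns, so Jane has no Ebay column at all rather than a NULL, which is the point the wide-row tables make.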
![Page 76: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/76.jpg)
![Page 77: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/77.jpg)
77 Software Group
Tez1
1Hindi for ldquofastrdquo
HDFS
MAP
REDUCE
MAP
MAP
REDUCE
MAP
MAP
REDUCE
MAP
Job 2Job 1
Job 3
HDFS
Job 1
78 Software Group
HBase
A Real time database built on Hadoop
ASM
Datafiles
Buffer Cache
Table Table
Redo
Disks
LogBuffe
r
HDFS
HFile
MemStore
Table Table
WA Log
Disks
HFile
79 Software Group
Name Site Counter
Dick Ebay 507018
Dick Google 690414
Jane Google 716426
Dick Facebook 723649
Jane Facebook 643261
Jane ILoveLarrycom 856767
Dick MadBillFanscom 675230
NameId Name
1 Dick
2 Jane
SiteId SiteName
1 Ebay
2 Google
3 Facebook
4 ILoveLarrycom
5 MadBillFanscom
NameId SiteId Counter
1 1 507018
1 3 690414
2 3 716426
1 3 723649
2 3 643261
2 4 856767
1 5 675230
Id Name Ebay Google Facebook (other columns) MadBillFanscom
1 Dick 507018 690414 723649 675230
Id Name Google Facebook (other columns) ILoveLarrycom
2 Jane 716426 643261 856767
Hbase Data Model
80 Software Group
Hive
81 Software Group
82 Software Group
SQL
JAV
A
RES
ULT
S
83 Software Group
Other SQL-like Hadoop Interfaces
Cloudera Impala
MapR Drill Aster
Greenplumb (Pivotal HD) Paraccel Hadapt
Oracle SQL Connector for
Hadoop (External Table interface to
HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
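As an illustration of grouping data without predefined labels, here is a minimal k-means sketch in plain Python. K-means is only one common clustering technique and the slide does not name a specific algorithm. The customer data (visits per month, average basket value) is invented, and the initial centroids are simply the first `k` points so the example is deterministic.

```python
def kmeans(points, k, iterations=10):
    """Very small k-means: returns (centroids, clusters)."""
    # Assumption: seed the centroids with the first k points for repeatability.
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters


# Hypothetical customers: (visits per month, average basket value in dollars).
customers = [(1, 20), (9, 210), (2, 25), (10, 190), (1, 30), (8, 200)]

centroids, clusters = kmeans(customers, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels were supplied: the two groups (occasional small-basket shoppers and frequent large-basket shoppers) emerge from the distances alone.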
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data, Clean, Training Set, Validation Set, Model, Validate, Candidate Model, Production Model, New Data, Prediction, Existing Business, New Business]
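The workflow in this diagram (hold back part of the existing data, train on the rest, validate, then score new data) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the churn data is invented, the "model" is a single threshold on one feature, and `split`, `train_threshold`, and `accuracy` are hypothetical helper names rather than any library's API.

```python
import random


def split(rows, validation_fraction=0.3, seed=42):
    """Shuffle and split rows into (training_set, validation_set)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_validation = round(len(rows) * validation_fraction)
    cut = len(rows) - n_validation
    return rows[:cut], rows[cut:]


def train_threshold(training_set):
    """Pick the feature threshold that classifies the training set best."""
    best_threshold, best_correct = None, -1
    for threshold, _ in training_set:
        correct = sum((x >= threshold) == label for x, label in training_set)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the threshold rule."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)


# Invented "existing business" data: (support calls last quarter, churned?).
existing = [(0, False), (1, False), (1, False), (2, False), (3, False),
            (5, True), (6, True), (7, True), (8, True), (9, True)]

training_set, validation_set = split(existing)
candidate = train_threshold(training_set)
print("candidate threshold:", candidate)
print("validation accuracy:", accuracy(candidate, validation_set))

# Score a new customer with the candidate model.
new_customer_calls = 7
print("predicted churn:", new_customer_calls >= candidate)
```

Measuring accuracy on rows the model never saw during training is what justifies promoting a candidate model to production.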
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
- Search optimization
- Recommendation systems
- Security
  - Vulnerability
  - Penetration detection
- Fraud detection
- CRM
  - Churn
  - Defaults
- Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
- Game optimization
- Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, and Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo logs, Change Data Capture, JMS Queue, Hadoop Poster, Batched HDFS File Copy, Audit Change Data, HBase real-time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete...
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Offerings shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Offerings shown: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
- How could data and algorithms transform your business?
- What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
- Where is the data?
  - Start collecting now
139 Software Group
For your career
- Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
- Lots of opportunities for those with Math & Statistics
  - Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
- Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207 Surviving and thriving in the big data revolution
- 207 Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data?
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail?
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming?
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete...
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 78: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/78.jpg)
78 Software Group
HBase
A real-time database built on Hadoop
[Diagram comparing the Oracle and HBase storage stacks]
Oracle: Tables, Buffer Cache, Log Buffer, Redo, Datafiles, ASM, Disks
HBase: Tables, MemStore, WA Log (write-ahead log), HFiles, HDFS, Disks
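The HBase components named on this slide cooperate on every write: the change is appended to the write-ahead log, buffered in the MemStore, and eventually flushed to an immutable HFile. The toy model below sketches that flow in plain Python. It is not the HBase API; `ToyRegion`, its methods, and the tiny flush threshold are hypothetical and exist only to make the write and read paths visible.

```python
class ToyRegion:
    """Toy model of the HBase write path (not real HBase code)."""

    def __init__(self, flush_threshold=3):
        self.wal = []        # write-ahead log: appended first, for durability
        self.memstore = {}   # in-memory buffer of recent writes
        self.hfiles = []     # immutable sorted files, oldest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        # 1. Record the change in the log, 2. apply it in memory.
        self.wal.append((key, value))
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffered keys out as one sorted, immutable file.
        self.hfiles.append(dict(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, key):
        # Newest data wins: check memory first, then files newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):
            if key in hfile:
                return hfile[key]
        return None


region = ToyRegion(flush_threshold=2)
region.put("dick", 507018)
region.put("jane", 716426)   # second write triggers a flush to an "HFile"
region.put("dick", 690414)   # newer value lives in the MemStore
print(region.get("dick"), region.get("jane"), len(region.hfiles))
```

The parallel with the Oracle side of the diagram is loose but useful: the log buffer and redo play the role of the write-ahead log, and the buffer cache plays the role of the MemStore.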
79 Software Group
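Hbase Data Model

Source data:

| Name | Site | Counter |
|------|------|---------|
| Dick | Ebay | 507018 |
| Dick | Google | 690414 |
| Jane | Google | 716426 |
| Dick | Facebook | 723649 |
| Jane | Facebook | 643261 |
| Jane | ILoveLarry.com | 856767 |
| Dick | MadBillFans.com | 675230 |

Relational (normalized) model:

| NameId | Name |
|--------|------|
| 1 | Dick |
| 2 | Jane |

| SiteId | SiteName |
|--------|----------|
| 1 | Ebay |
| 2 | Google |
| 3 | Facebook |
| 4 | ILoveLarry.com |
| 5 | MadBillFans.com |

| NameId | SiteId | Counter |
|--------|--------|---------|
| 1 | 1 | 507018 |
| 1 | 2 | 690414 |
| 2 | 2 | 716426 |
| 1 | 3 | 723649 |
| 2 | 3 | 643261 |
| 2 | 4 | 856767 |
| 1 | 5 | 675230 |

HBase model (one wide row per name, one column per site):

| Id | Name | Ebay | Google | Facebook | (other columns) | MadBillFans.com |
|----|------|------|--------|----------|-----------------|-----------------|
| 1 | Dick | 507018 | 690414 | 723649 | | 675230 |

| Id | Name | Google | Facebook | (other columns) | ILoveLarry.com |
|----|------|--------|----------|-----------------|----------------|
| 2 | Jane | 716426 | 643261 | | 856767 |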
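The wide-row layout on this slide can be mimicked with a dictionary of dictionaries: one entry per row key, holding only the columns that row actually has. This is a minimal sketch in plain Python rather than HBase client code. The `site:` column prefix (standing in for a column family) and the `get_cell` helper are assumptions made for illustration.

```python
from collections import defaultdict

# The slide's source data: (name, site, counter).
visits = [
    ("Dick", "Ebay", 507018),
    ("Dick", "Google", 690414),
    ("Jane", "Google", 716426),
    ("Dick", "Facebook", 723649),
    ("Jane", "Facebook", 643261),
    ("Jane", "ILoveLarry.com", 856767),
    ("Dick", "MadBillFans.com", 675230),
]

# Wide-column layout: one row per name, one column per site visited.
# Rows are sparse, so a site nobody visited costs nothing to "store".
rows = defaultdict(dict)
for name, site, counter in visits:
    rows[name][f"site:{site}"] = counter


def get_cell(row_key, column):
    """Return one cell, or None when the row has no such column."""
    return rows.get(row_key, {}).get(column)


print(sorted(rows["Jane"]))
print(get_cell("Dick", "site:Ebay"))
print(get_cell("Jane", "site:Ebay"))   # Jane has no Ebay column
```

Compared with the three normalized tables, a lookup here needs no joins, at the cost of the schema being implied by the data instead of declared up front.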
80 Software Group
Hive
81 Software Group
82 Software Group
[Diagram labels: SQL, JAVA, RESULTS]
83 Software Group
Other SQL-like Hadoop Interfaces
- Cloudera Impala
- MapR Drill
- Aster
- Greenplum (Pivotal HD)
- ParAccel
- Hadapt
- Oracle SQL Connector for Hadoop (external table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
[Diagram labels: WebLogs, FLUME, HDFS, SQOOP, RDBMS (CUSTOMERS, PRODUCTS)]
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos - heterogeneous cluster manager
Tachyon - in-memory file system
Spark - memory-optimized distributed execution
Spark Streaming
MLbase / MLlib - machine learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
![Page 79: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/79.jpg)
![Page 80: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/80.jpg)
- 207 Surviving and thriving in the big data revolution
- 207 Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 81: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/81.jpg)
81 Software Group
82 Software Group
SQL
JAVA
RESULTS
83 Software Group
Other SQL-like Hadoop Interfaces
- Cloudera Impala
- MapR Drill
- Aster
- Greenplum (Pivotal HD)
- ParAccel
- Hadapt
- Oracle SQL Connector for Hadoop (external table interface to HDFS)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos - heterogeneous cluster manager
Tachyon - in-memory file system
Spark - memory-optimized distributed execution
Spark Streaming
MLbase / MLlib - machine learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
[Bar chart] Exadata vs Hadoop, $/TB (hardware only)
Exadata: $4,911
Hadoop: $750
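As a rough illustration of the hardware-only comparison above, the sketch below works out the cost ratio and the price of a hypothetical 100 TB deployment. The two per-terabyte prices are the figures from the slide; the 100 TB size and the `hardware_cost` helper are assumptions made only for this example.

```python
# Per-terabyte hardware prices as quoted on the slide (USD).
EXADATA_PER_TB = 4911
HADOOP_PER_TB = 750


def hardware_cost(terabytes, price_per_tb):
    """Hardware-only cost for a given raw capacity."""
    return terabytes * price_per_tb


# How many times more expensive Exadata is per terabyte.
ratio = EXADATA_PER_TB / HADOOP_PER_TB

# Hypothetical 100 TB deployment (size chosen only for illustration).
exadata_100tb = hardware_cost(100, EXADATA_PER_TB)
hadoop_100tb = hardware_cost(100, HADOOP_PER_TB)

print(f"Exadata costs about {ratio:.1f}x more per TB")
print(f"100 TB: Exadata ${exadata_100tb:,} vs Hadoop ${hadoop_100tb:,}")
```

The comparison covers hardware only, as the slide states; software licences and operating costs are outside this sketch.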
93 Software Group
Oracle Big Data Appliance
- 18 Sun X4270 M2 servers
  - 48 GB RAM per node (864 GB total)
  - 2 x 6-core CPU per node (216 total)
  - 12 x 2 TB HDD per node (216 spindles, 864 TB)
  - 40 Gb/s InfiniBand between nodes
  - 10 Gb/s Ethernet to datacentre
- Competitive pricing: www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
- Cloudera Enterprise
- Oracle Enterprise R
- Oracle NoSQL
- Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
- Machine Learning: programs that evolve with "experience"
- Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
- Predictive Analytics: programs that extrapolate from existing data into the future
- Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: data with fitted linear trend line; x axis 0 to 120, y axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistic regression
- CART
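The trend line on this slide is the kind of result an ordinary least-squares fit produces. A minimal sketch of such a fit in plain Python follows; the data points are made up for illustration (they are not the slide's data), and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate the fitted line to a new x value."""
    return slope * x + intercept


# Illustrative data only: points lying exactly on y = x + 1.
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 21, 41, 61, 81, 101]

slope, intercept = fit_line(xs, ys)
forecast = predict(slope, intercept, 120)
print(f"f(x) = {slope:.3f} x + {intercept:.3f}; f(120) = {forecast:.1f}")
```

Extrapolating to x = 120, beyond the observed range, is the "predictive" step; the other techniques in the list replace the straight line with a different model form.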
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
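To make the spam-detection case concrete, here is a small sketch of one common classification technique, a naive Bayes text classifier with add-one smoothing. The slide does not name a specific algorithm, so this choice is an assumption, as are the tiny training set and the `train` and `classify` helper names.

```python
import math
from collections import Counter


def train(examples):
    """Count words per label from (text, label) training pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts = model
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, counts in word_counts.items():
        # Prior probability of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labelled messages, for illustration only.
examples = [
    ("win cash prize now", "spam"),
    ("cheap prize offer now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status and agenda", "ham"),
]
model = train(examples)
print(classify(model, "claim your cash prize"))
print(classify(model, "monday project meeting"))
```

The same pattern, labelled history in and a label for new data out, applies to churn risk or customer value once the inputs are customer attributes rather than words.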
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
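One widely used way to group data without predefined classes is k-means. The sketch below is a bare-bones version in plain Python; the slide does not prescribe an algorithm, so k-means is an assumption here, and the customer figures (visits per month, average basket value) and the naive "first k points" initialisation are illustrative only.

```python
def kmeans(points, k, iterations=10):
    """Group points into k clusters by repeatedly moving centroids to cluster means."""
    # Naive initialisation: use the first k points as starting centroids.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Squared Euclidean distance from p to each centroid.
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        for i, members in enumerate(clusters):
            if members:
                # Move the centroid to the mean of its members, dimension by dimension.
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters


# Made-up customers: (visits per month, average basket value).
points = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (8, 78)]
centroids, clusters = kmeans(points, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No labels are supplied; the two groups (occasional small-basket shoppers and frequent large-basket shoppers) emerge from the data alone.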
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data, Clean, Training Set, Validation Set, Model, Candidate Model, Validate, Production Model, New Data, New Business, Existing Business, Prediction]
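The stages named in this diagram can be sketched end to end in a few lines. The ordering below (clean, split into training and validation sets, fit a candidate model, validate it, then use it on new data) is an assumed reading of the diagram, and the churn data, the threshold model and all function names are hypothetical.

```python
def clean(rows):
    """Clean step: drop rows with missing values."""
    return [r for r in rows if None not in r]


def split(rows, validation_every=3):
    """Hold out every Nth row as the validation set; the rest is the training set."""
    training = [r for i, r in enumerate(rows) if i % validation_every != 0]
    validation = [r for i, r in enumerate(rows) if i % validation_every == 0]
    return training, validation


def train_threshold_model(training):
    """Candidate model: predict churn (1) when logins fall below a learned threshold."""
    churned = [x for x, y in training if y == 1]
    stayed = [x for x, y in training if y == 0]
    threshold = (max(churned) + min(stayed)) / 2
    return lambda logins: 1 if logins < threshold else 0


def accuracy(model, rows):
    """Validate step: fraction of rows the model labels correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)


# Made-up raw data: (monthly logins, churned?) with one dirty row.
raw_rows = [(2, 1), (30, 0), (None, 1), (5, 1), (25, 0),
            (3, 1), (40, 0), (1, 1), (35, 0), (4, 1)]

training_set, validation_set = split(clean(raw_rows))
candidate = train_threshold_model(training_set)
validation_accuracy = accuracy(candidate, validation_set)

# Promote the candidate to production only if it validates well enough.
production_model = candidate if validation_accuracy >= 0.9 else None
if production_model is not None:
    print("prediction for a customer with 10 logins:", production_model(10))
```

The validation set is never used for fitting, which is what lets its accuracy stand in for how the model may behave on new business.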
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
- Search optimization
- Recommendation systems
- Security
  - Vulnerability
  - Penetration detection
- Fraud detection
- CRM
  - Churn
  - Defaults
- Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
- Game optimization
- Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
Redo logs
Change Data Capture
JMS Queue
Hadoop Poster
Batched HDFS File Copy
Audit Change Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
![Page 84: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/84.jpg)
84 Software Group
Pig
Pig Latin
SQL or Hive QL
85 Software Group
Flume and SQOOP
CUSTOMERS
WebLogs
PRODUCTS
HDFS
RDBMS
FLUME
SQOOP
86 Software Group
Berkeley Data Analytic Stack (BDAS)
Yarn Yarn EC2 Yarn
Mesos ndash heterogeneous cluster manager
Tachyon ndash in memory File system
Spark ndash memory optimized distributed execution
Spark Streaming
Mlbase Mlib ndash Machine Learning
Map Reduce
Shark (SQL) Hive (SQL)
BlinkDB
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Scatter plot with fitted trend line: f(x) = 0.971521231456065 x + 0.71906459527154]
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistical Regression
- CART
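The first technique in the list, linear regression, is the one behind the chart's trend line. A minimal ordinary-least-squares sketch in plain Python follows. The sample points are invented for illustration and are not the data behind the slide, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


# Made-up observations that lie exactly on y = 2x + 1.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
slope, intercept = fit_line(xs, ys)
forecast = predict(slope, intercept, 10)
print(slope, intercept, forecast)
```

Real data would scatter around the line, and the fitted coefficients would then look like the long decimals shown on the chart.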
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
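To make the idea concrete, here is a small sketch of one very simple classifier, nearest centroid, applied to a churn-risk example. The technique, the two features, and the sample rows are all assumptions chosen for illustration; the slide does not name a specific algorithm.

```python
def train_centroids(rows, labels):
    """Average the feature vectors of each class to get one centroid per label."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, row):
    """Return the label whose centroid is closest to the new row."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Hypothetical features: (support calls last month, months as a customer).
rows = [(5, 2), (6, 3), (4, 1), (0, 24), (1, 36), (0, 30)]
labels = ["churn", "churn", "churn", "stay", "stay", "stay"]
centroids = train_centroids(rows, labels)

new_customer = classify(centroids, (5, 3))
loyal_customer = classify(centroids, (1, 28))
print(new_customer, loyal_customer)
```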
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
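A short k-means sketch shows what grouping without predefined labels looks like in practice. K-means is one common clustering method, chosen here as an assumption; the slide does not say which algorithm it has in mind, and the points are generic made-up data, not basket data. The first k points seed the centroids so the run is deterministic.

```python
def kmeans(points, k, iterations=10):
    """Basic k-means: alternate between assigning points and moving centroids."""
    # Seed the centroids with the first k points (a simplification).
    centroids = [list(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assign each point to the index of its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignment


points = [(1, 1), (9, 8), (1, 2), (8, 9), (2, 1), (9, 9)]
centroids, assignment = kmeans(points, k=2)
print(assignment)
```

No labels are supplied anywhere: the two groups emerge from the distances alone.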
121 Software Group
Supervised Machine Learning
[Workflow diagram. Labels: Raw Data, Clean, Training Set, Validation Set, Model, Candidate Model, Validate, Production Model, New Data (New Business, Existing Business), Prediction]
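The workflow's labels suggest a hold-out process: split cleaned data into a training set and a validation set, fit a candidate model, validate it, and only then use it on new data. The sketch below follows that reading. The split ratio, the error threshold, the straight-line model, and all names are assumptions for illustration.

```python
import random


def split(rows, validation_fraction=0.25, seed=42):
    """Shuffle the cleaned rows and hold some back for validation."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit(rows):
    """Candidate model: least-squares line through (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    slope = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x, _ in rows)
    return slope, my - slope * mx


def mean_abs_error(model, rows):
    """Validation step: average prediction error on rows the model has not seen."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in rows) / len(rows)


# Made-up clean data that follows y = 2x + 1.
raw = [(x, 2 * x + 1) for x in range(20)]
training, validation = split(raw)
candidate = fit(training)
error = mean_abs_error(candidate, validation)

# Promote the candidate to production only if it validates well enough.
production = candidate if error < 0.5 else None
if production is not None:
    slope, intercept = production
    print("prediction for new data x=25:", slope * 25 + intercept)
```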
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics / Data Science
- Search Optimization
- Recommendation Systems
- Security: Vulnerability, Penetration Detection
- Fraud Detection
- CRM: Churn, Defaults
- Medical: Risk analysis, Diagnosis, Prognosis
- Game optimization
- Advertising: Targeting, Tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram. Labels: Redo-logs, Change Data Capture, JMS Queue, Hadoop Poster, Batched HDFS File Copy, Audit Change Data, HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete...
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Products: TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Products: STATISTICA, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
- How could data and algorithms transform your business?
- What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
- Where is the data?
  - Start collecting now
139 Software Group
For your career
- Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
- Lots of opportunities for those with Math & Statistics
  - Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
- Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data?
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud, social and mobile
- Not all upside
- Will Big Data kill retail?
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming?
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete...
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 85: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/85.jpg)
![Page 86: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/86.jpg)
![Page 87: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/87.jpg)
87 Software Group
Meanwhile back at the Death Star
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers
64 cores 576 GB RAM
Storage Servers112 cores 100 TB SAS or336 TB SATA plus5 TB SSD
90 Software Group
Economies
Exadata
Hadoop
$0 $1000 $2000 $3000 $4000 $5000 $6000
$4911
$750
Exadata vs Hadoop $$TB (Hardware only)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- Itrsquos not enough to lay out products on tables
- Therersquos a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 10 Architecture
- Hadoop 20 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through ldquoBig Data analyticsrdquo
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
-
![Page 88: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/88.jpg)
88 Software Group
89 Software Group
Oracle Exadata (X-2)
Database servers: 64 cores, 576 GB RAM
Storage servers: 112 cores, 100 TB SAS or 336 TB SATA, plus 5 TB SSD
90 Software Group
Economies
[Chart: Exadata vs Hadoop, $/TB (hardware only). Exadata: $4,911; Hadoop: $750]
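The gap between the two bars is easier to appreciate as a ratio. The sketch below is only an illustration of that arithmetic, and it assumes the $4,911 label belongs to the Exadata bar and the $750 label to the Hadoop bar; the function names and the 100 TB example are hypothetical and not from the slides.

```python
# Hardware-only cost per terabyte, as read from the chart labels.
# Assumption: $4,911 is the Exadata bar and $750 is the Hadoop bar.
COST_PER_TB = {"Exadata": 4911, "Hadoop": 750}


def cost_ratio(costs, platform_a, platform_b):
    """How many times more platform_a costs per TB than platform_b."""
    return costs[platform_a] / costs[platform_b]


def hardware_cost(costs, platform, terabytes):
    """Hardware-only cost of the given capacity on one platform."""
    return costs[platform] * terabytes


ratio = cost_ratio(COST_PER_TB, "Exadata", "Hadoop")
print(f"Exadata hardware costs about {ratio:.1f}x as much per TB as Hadoop")

# Hypothetical sizing exercise: 100 TB on each platform.
for name in COST_PER_TB:
    print(name, hardware_cost(COST_PER_TB, name, 100))
```

The comparison covers hardware only, so software licences, staffing and operations would change the picture in either direction.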
93 Software Group
Oracle Big Data Appliance
• 18 Sun X4270 M2 servers
  – 48 GB RAM per node (864 GB total)
  – 2 x 6-core CPUs per node (216 cores total)
  – 12 x 2 TB HDD per node (216 spindles, 864 TB)
  – 40 Gb/s InfiniBand between nodes
  – 10 Gb/s Ethernet to datacentre
• Competitive pricing
www.oracle.com/us/bigdata/index.html
94 Software Group
Big Data Appliance Software
• Cloudera Enterprise
• Oracle Enterprise R
• Oracle NoSQL
• Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
• Machine Learning: programs that evolve with "experience"
• Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
• Predictive Analytics: programs that extrapolate from existing data into the future
• Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: scatter plot, both axes running from about 0 to 120, with fitted trend line f(x) = 0.971521231456065 x + 0.71906459527154]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
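A trend line like the one on this slide can be produced with ordinary least squares. The following is a minimal sketch of simple (one-variable) linear regression in plain Python; the data points and the function names `fit_line` and `predict` are hypothetical and are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate from the fitted line to a new x value."""
    return slope * x + intercept


# Hypothetical observations that roughly follow y = x.
xs = [0, 20, 40, 60, 80, 100]
ys = [3, 18, 41, 57, 82, 99]

slope, intercept = fit_line(xs, ys)
print(f"f(x) = {slope:.4f} x + {intercept:.4f}")
print("forecast for x = 120:", round(predict(slope, intercept, 120), 1))
```

The other techniques listed on the slide (curve fitting, multivariate models, time series, logistic regression, CART) follow the same pattern of fitting a model to existing data and then extrapolating, but need more machinery than fits in a short sketch.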
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
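To make the spam detection example concrete, here is a small sketch of one common classification technique, a naive Bayes text classifier with Laplace smoothing. The slides do not say which algorithm was discussed, so the choice of naive Bayes, the training messages, and the names `train` and `classify` are all assumptions made for illustration.

```python
import math
from collections import Counter


def train(examples):
    """Build per-label word counts from (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts


def classify(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts = model
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from the prior probability of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(counts.values())
        for word in text.lower().split():
            # Laplace smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled training data.
training = [
    ("win cash prize now", "spam"),
    ("cheap prize offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status report attached", "ham"),
]
model = train(training)
print(classify(model, "claim your cash prize"))
print(classify(model, "monday project meeting"))
```

The same shape (labelled history in, a model that labels new records out) applies to the churn risk and customer value examples, with customer attributes in place of words.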
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
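As an illustration of grouping data without predefined classes, the sketch below implements a bare-bones k-means clustering in plain Python. K-means is one of several clustering algorithms and is not named on the slide; the customer data (visits per month, average basket value) and the naive "first k points" initialisation are assumptions chosen to keep the example deterministic.

```python
def dist2(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, k, iterations=20):
    """Cluster 2-D points into k groups; returns (centroids, assignments)."""
    # Naive deterministic initialisation: use the first k points as centroids.
    centroids = list(points[:k])
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, assignments


# Hypothetical customers: (visits per month, average basket value).
customers = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, assignments = kmeans(customers, k=2)
print(assignments)
print(centroids)
```

No labels are supplied anywhere: the two groups emerge from the data alone, which is the difference from the classification case.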
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Training Set; Validation Set; Model; Candidate Model; Validate; Production Model; New Data (New Business, Existing Business); Prediction]
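The stages named in this diagram (clean the raw data, split it into training and validation sets, fit a candidate model, validate it, then use it on new data) can be sketched end to end in a few lines. This is a toy illustration under stated assumptions: the "model" is a single learned threshold, the records are hypothetical (months of inactivity, churned or not), and none of the function names come from the slides.

```python
import random


def clean(records):
    """Drop records with a missing feature or a missing label."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def split(records, validation_fraction=0.25, seed=0):
    """Shuffle reproducibly, then cut into training and validation sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def accuracy(threshold, dataset):
    """Fraction of records where (x >= threshold) matches the label."""
    correct = sum((x >= threshold) == y for x, y in dataset)
    return correct / len(dataset)


def train(training_set):
    """Candidate model: the threshold that best fits the training set."""
    candidates = sorted({x for x, _ in training_set})
    return max(candidates, key=lambda t: accuracy(t, training_set))


def predict(threshold, x):
    """Apply the production model to a new record."""
    return x >= threshold


# Hypothetical raw data: (months inactive, churned?), with some gaps.
raw = [(1, False), (2, False), (None, True), (3, False), (7, True),
       (8, True), (9, True), (4, False), (6, True), (10, True),
       (2, None), (5, False), (11, True)]

training_set, validation_set = split(clean(raw))
threshold = train(training_set)
print("learned threshold:", threshold)
print("validation accuracy:", accuracy(threshold, validation_set))
print("prediction for 11 months inactive:", predict(threshold, 11))
```

The validation set is held back from training so that the accuracy figure reflects how the candidate model behaves on data it has not seen, which is the check made before promoting it to production.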
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
• Search Optimization
• Recommendation Systems
• Security: Vulnerability, Penetration Detection
• Fraud Detection
• CRM: Churn, Defaults
• Medical: Risk analysis, Diagnosis, Prognosis
• Game optimization
• Advertising: Targeting, Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell offerings: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell offerings: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
![Page 89: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/89.jpg)
![Page 90: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/90.jpg)
![Page 91: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/91.jpg)
93 Software Group
Oracle Big Data Appliance
bull 18 Sun X4270 M2 serversndash 48GB RAM per node (864GB total)ndash 2x6 Core CPU per node (216 total)ndash 12x2TB HDD per node (216 spindles 864 TB)ndash 40Gbs Infiniband between nodesndash 10Gbs Ethernet to datacentre
bull Competitive Pricingwwworaclecomusbigdataindexhtml
94 Software Group
Big Data Appliance Software
bull Cloudera Enterprise
bull Oracle Enterprise R
bull Oracle NoSQL
bull Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through ldquoBig Data analyticsrdquo Machine
LearningPrograms that evolve with ldquoexperiencerdquo
Collective IntelligencePrograms that use inputs from ldquocrowdsrsquo to seem intelligent
Predictive AnalyticsPrograms that extrapolate from existing data into the future
Big Data AnalyticsAKA Data Science
96 Software Group
Collective Intelligence
97 Software Group
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- Itrsquos not enough to lay out products on tables
- Therersquos a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
![Page 92: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/92.jpg)
94 Software Group
Big Data Appliance Software
- Cloudera Enterprise
- Oracle Enterprise R
- Oracle NoSQL
- Oracle Big Data Connectors
95 Software Group
Generating competitive advantage through "Big Data analytics"
- Machine Learning: programs that evolve with "experience"
- Collective Intelligence: programs that use inputs from "crowds" to seem intelligent
- Predictive Analytics: programs that extrapolate from existing data into the future
- Big Data Analytics: AKA Data Science
96 Software Group
Collective Intelligence
105 Software Group
Google Flu Trends
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
112 Software Group
Artificial Intelligence Strikes back
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: data with a fitted linear trend line; x-axis 0 to 120, y-axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
- Linear regression
- Non-linear (curve fit)
- Multivariate
- Time series
- Logistic regression
- CART
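As a rough illustration of the first technique in the list, ordinary least squares fits a trend line of the form f(x) = a·x + b and then extrapolates from it. The sketch below is a minimal pure-Python version; the function names and the sample data are illustrative assumptions, not taken from the talk.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of x/y co-deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate the fitted line to a new x value."""
    return slope * x + intercept


# Hypothetical observations that lie exactly on y = 2x + 1.
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 41, 81, 121, 161, 201]

slope, intercept = fit_line(xs, ys)
forecast = predict(slope, intercept, 120)  # a point beyond the observed range
print(slope, intercept, forecast)
```

Real data will not sit exactly on a line, so in practice the fit would be checked against held-out observations before trusting any extrapolation.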
119 Software Group
Classification
- Create a model that identifies/classifies new data
- Spam detection, churn risk, customer value
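One very simple way to build such a model is a nearest-centroid classifier: average the labelled examples of each class, then give new data the label of the closest average. The sketch below uses churn risk as the example; the feature choice (months since last purchase, open support tickets), the labels and all names are assumptions made for illustration only.

```python
from math import dist
from statistics import mean


def train_centroids(rows, labels):
    """Compute one centroid (mean feature vector) per class label."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: tuple(mean(column) for column in zip(*members))
        for label, members in grouped.items()
    }


def classify(centroids, row):
    """Assign the label whose centroid is closest to the new row."""
    return min(centroids, key=lambda label: dist(centroids[label], row))


# Hypothetical labelled history: (months since last purchase, open tickets).
rows = [(1, 0), (2, 1), (1, 1), (9, 5), (10, 4), (8, 6)]
labels = ["stays", "stays", "stays", "churns", "churns", "churns"]

centroids = train_centroids(rows, labels)
print(classify(centroids, (7, 5)))
print(classify(centroids, (2, 0)))
```

Production classifiers (logistic regression, decision trees such as CART) follow the same train-then-classify shape with a richer model in the middle.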
120 Software Group
Clustering
- Group data without a pre-existing classification scheme
- For instance, basket analysis
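To make "grouping without a pre-existing scheme" concrete, here is a minimal k-means sketch. It is one common clustering technique, not necessarily the one the talk had in mind, and the basket summaries (item count, total spend) are invented for illustration. Initialising the centroids from the first k points is a simplification that keeps the run deterministic.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(points[:k])  # naive, deterministic initialisation
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            tuple(sum(column) / len(members) for column in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical basket summaries: (number of items, total spend).
baskets = [(1, 10), (2, 12), (1, 11), (10, 95), (11, 100), (9, 90)]

centroids, clusters = k_means(baskets, k=2)
for members in clusters:
    print(members)
```

No labels are supplied anywhere: the small-basket and large-basket groups emerge from the data alone, which is the distinction from classification.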
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data, Clean, Training Set, Validation Set, Model, Candidate Model, Validate, Production Model, New Data, New Business, Existing Business, Prediction]
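The stages named on this slide (training set, validation set, candidate model, production model, prediction on new data) can be sketched end to end in a few lines. This is an assumed reading of the diagram rather than a reconstruction of it: the threshold "model", the 0.9 promotion cut-off, the hold-out rule and the customer data are all hypothetical.

```python
def split(rows, every=4):
    """Hold out every `every`-th row for validation; train on the rest."""
    training = [row for i, row in enumerate(rows) if i % every != 0]
    validation = [row for i, row in enumerate(rows) if i % every == 0]
    return training, validation


def train(training):
    """Candidate model: a threshold midway between the two class means."""
    positives = [x for x, label in training if label]
    negatives = [x for x, label in training if not label]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def predict(threshold, x):
    """Predict True when the feature exceeds the learned threshold."""
    return x > threshold


def accuracy(threshold, validation):
    """Fraction of held-out rows the candidate model labels correctly."""
    hits = sum(predict(threshold, x) == label for x, label in validation)
    return hits / len(validation)


# Hypothetical cleaned data: (monthly spend, became a repeat customer).
rows = [(5, False), (8, False), (12, False), (15, False),
        (40, True), (45, True), (52, True), (60, True)]

training, validation = split(rows)
candidate = train(training)
score = accuracy(candidate, validation)

# Promote the candidate to production only if it validates well enough.
production = candidate if score >= 0.9 else None
if production is not None:
    print(predict(production, 50), predict(production, 10))
```

Keeping the validation rows out of training is the point of the split: the score then estimates how the model behaves on data it has not seen.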
122 Software Group
Unsupervised learning
inmaps.linkedin.com
124 Software Group
Big Data Analytics
Data Science
- Search Optimization
- Recommendation Systems
- Security
  - Vulnerability
  - Penetration Detection
- Fraud Detection
- CRM
  - Churn
  - Defaults
- Medical
  - Risk analysis
  - Diagnosis
  - Prognosis
- Game optimization
- Advertising
  - Targeting
  - Tailoring
125 Software Group
Data Science is hard
- Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
- Small-medium businesses need help to compete
- Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs, Change Data Capture, JMS Queue, Hadoop Poster, Batched HDFS File Copy, Audit Change Data, HBase Real-Time replication]
130 Software Group
Toad BI Suite
132 Software Group
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Dell offerings shown: Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions
- Data Integration
- Database Management
- Advanced Analytics
- Business Intelligence
- Server and Storage
Dell offerings shown: STATISTICA, Server and Storage, TOAD & Shareplex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
135 Software Group
Data Visualization
![Page 96: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/96.jpg)
98 Software Group
99 Software Group
100 Software Group
101 Software Group
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Figure: scatter plot with a fitted linear trendline; x axis 0 to 120, y axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
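As a rough illustration of the first technique listed above, linear regression, here is a minimal ordinary-least-squares sketch in plain Python. The function names and the sample points are hypothetical; they are not the data behind the slide's chart.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical sample points lying roughly on y = x (illustration only).
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 19, 42, 58, 81, 99]
model = fit_line(xs, ys)
forecast = predict(model, 120)
```

The fitted pair plays the same role as the trendline equation on the slide: once the slope and intercept are known, a prediction is a single multiply-and-add.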
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
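To make the idea of a model that classifies new data concrete, here is a minimal sketch of a nearest-centroid classifier applied to churn risk. This is one simple technique chosen for illustration, not a method named in the talk; the feature names and customer rows are hypothetical.

```python
from math import dist


def train_centroids(rows, labels):
    """Average the feature vectors of each class into one centroid per label."""
    sums = {}
    counts = {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(centroids, row):
    """Assign a new row to the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], row))


# Hypothetical training data: (support calls per month, months as a customer).
rows = [(5, 2), (6, 3), (0, 36), (1, 48)]
labels = ["churn", "churn", "stay", "stay"]
centroids = train_centroids(rows, labels)

new_customer = (4, 5)
risk = classify(centroids, new_customer)
```

In practice the features would need to be put on comparable scales first, since a feature with a larger numeric range otherwise dominates the distance.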
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
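Grouping data without a pre-existing scheme can be sketched with k-means, a common clustering algorithm (the talk does not say which algorithm it has in mind). The points below are hypothetical, and the first k points are used as starting centroids only to keep the example deterministic.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Group points into k clusters by repeatedly re-averaging cluster centroids."""
    # Deterministic start for illustration: the first k points are the centroids.
    centroids = [list(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters


# Hypothetical two-feature observations with two visibly separate groups.
points = [(1, 1), (1, 2), (8, 8), (9, 8), (2, 1), (8, 9)]
centroids, clusters = k_means(points, k=2)
```

No labels are supplied anywhere; the two groups emerge from the distances alone, which is the contrast with the classification case.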
121 Software Group
Supervised Machine Learning
[Flow diagram. Labels: Raw Data, Clean, Training Set, Validation Set, Model, Candidate Model, Validate, Production Model, New Data (New Business, Existing Business), Prediction]
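The diagram labels above suggest a familiar workflow: split cleaned data into a training set and a validation set, fit a candidate model on the former, check it against the latter, and only then use it to score new data. A minimal sketch of that flow, assuming a deliberately trivial single-threshold model and hypothetical data, might look like this:

```python
import random


def split(rows, validation_fraction=0.25, seed=0):
    """Shuffle the cleaned rows and split them into training and validation sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def train_threshold(training_set):
    """Candidate model: the cut-off on x with the fewest training errors.

    The model predicts True when x >= threshold.
    """
    candidates = sorted({x for x, _ in training_set})

    def errors(threshold):
        return sum((x >= threshold) != label for x, label in training_set)

    return min(candidates, key=errors)


def accuracy(threshold, rows):
    """Fraction of rows the threshold model labels correctly."""
    return sum((x >= threshold) == label for x, label in rows) / len(rows)


# Hypothetical cleaned data: (monthly spend, is a high-value customer).
data = [(x, x >= 50) for x in range(0, 100, 5)]
training_set, validation_set = split(data)

candidate = train_threshold(training_set)      # fit on the training set only
score = accuracy(candidate, validation_set)    # validate on held-out rows

# Promote the candidate to production only if validation is good enough
# (the 0.9 bar is an arbitrary assumption for this sketch).
production_model = candidate if score >= 0.9 else None
prediction = None if production_model is None else (72 >= production_model)
```

The point of the held-out validation set is that the score is measured on rows the model never saw during training, so it is a fairer estimate of how the model will behave on new business.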
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
• Search Optimization
• Recommendation Systems
• Security
  – Vulnerability
  – Penetration Detection
• Fraud Detection
• CRM
  – Churn
  – Defaults
• Medical
  – Risk analysis
  – Diagnosis
  – Prognosis
• Game optimization
• Advertising
  – Targeting
  – Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram. Labels: Redo-logs, Change Data Capture, JMS Queue, Hadoop Poster, Batched HDFS File Copy, Audit Change Data, HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell’s offering was not complete…
Key components to build end-to-end BI/Analytics solutions: Data Integration, Database Management, Advanced Analytics, Business Intelligence, Server and Storage
Offerings shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions: Data Integration, Database Management, Advanced Analytics, Business Intelligence, Server and Storage
Offerings shown: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle Enterprise R)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app.
We appreciate your feedback and insight.
This box will have simplified instructions about how to complete the session evaluation online
![Page 97: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/97.jpg)
![Page 98: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/98.jpg)
![Page 99: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/99.jpg)
![Page 100: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/100.jpg)
102 Software Group
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle R Enterprise)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207 Surviving and thriving in the big data revolution
- 207 Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data?
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud, social and mobile
- Not all upside
- Will Big Data kill retail?
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming?
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 101: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/101.jpg)
103 Software Group
104 Software Group
105 Software Group
Google Flu Trends
106 Software Group
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: fitted trend line f(x) = 0.971521231456065 x + 0.71906459527154]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
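As an illustration of the first technique on this slide, here is a minimal sketch of an ordinary least-squares line fit, the kind of calculation that produces a trend line like the one in the chart. The sample points and the helper names `fit_line` and `predict` are hypothetical and not taken from the presentation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the fitted line to a new x value."""
    return slope * x + intercept


# Hypothetical sample data lying exactly on y = x + 1.
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 21, 41, 61, 81, 101]
slope, intercept = fit_line(xs, ys)
forecast = predict(slope, intercept, 120)
```

With real, noisy data the slope and intercept will not be round numbers, which is why the chart's equation carries so many decimal places.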
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
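The slide does not name a clustering algorithm. As one possible illustration of grouping data with no predefined labels, the sketch below uses plain k-means on 2-D points; the choice of k-means, the sample points, and the starting centroids are all assumptions made for this example.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (squared distance).
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if empty.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c
            else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical data: two visibly separate groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

No labels are supplied: the two groups emerge only from the distances between points, which is the sense in which clustering works "without a pre-existing classification scheme".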
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Validate; Model; Candidate Model; Training Set; Validation Set; Production Model; New Data; New Business; Existing Business; Prediction]
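The training-set and validation-set flow in this diagram can be sketched roughly as follows. This is a toy example under stated assumptions: the single feature (days since last purchase), the churn labels, and the one-threshold "model" are hypothetical stand-ins, not anything described in the presentation.

```python
import random


def split(rows, validation_fraction=0.25, seed=0):
    """Shuffle labelled rows and split them into training and validation sets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cut], shuffled[cut:]


def predict(threshold, x):
    """Toy model: predict churn when the feature reaches the threshold."""
    return x >= threshold


def accuracy(threshold, rows):
    """Fraction of rows whose prediction matches the known label."""
    return sum(predict(threshold, x) == label for x, label in rows) / len(rows)


def train_threshold(training_set):
    """Pick the candidate threshold with the best training accuracy."""
    candidates = sorted({x for x, _ in training_set})
    return max(candidates, key=lambda t: accuracy(t, training_set))


# Hypothetical cleaned data: (days since last purchase, churned?).
rows = [(d, d >= 30) for d in range(0, 60, 3)]
training_set, validation_set = split(rows)

threshold = train_threshold(training_set)                    # candidate model
validation_accuracy = accuracy(threshold, validation_set)    # validate
new_prediction = predict(threshold, 100)                     # score new data
```

The validation set is held back during training so that the reported accuracy reflects rows the model has not seen, which is the point of the separate "Validation Set" box in the diagram.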
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
107 Software Group
Collective Intelligence outsmarts Artificial Intelligence
108 Software Group
109 Software Group
110 Software Group
111 Software Group
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: fitted trend line; x-axis 0 to 120, y-axis -20 to 120]
f(x) = 0.971521231456065 x + 0.71906459527154
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
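As a rough illustration of the first technique on this slide, simple linear regression fits a line f(x) = a x + b by ordinary least squares. The sketch below is not the calculation behind the chart: the sample points are made up and `fit_line` is a hypothetical helper name.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope = covariance(x, y) / variance(x)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Illustrative points only (assumption); they lie exactly on y = 2x + 1.
xs = [0, 20, 40, 60, 80, 100]
ys = [1, 41, 81, 121, 161, 201]
slope, intercept = fit_line(xs, ys)
print(f"f(x) = {slope:.3f} x + {intercept:.3f}")
```

With real, noisy data the same function returns the best-fit slope and intercept, which is the kind of equation shown on the chart.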
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
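To make the spam-detection example concrete, here is a minimal sketch of one possible classifier, a naive Bayes model over word counts. The choice of naive Bayes, the function names `train` and `classify`, and the four training messages are all assumptions for illustration; the slides do not say which algorithm was meant.

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (text, label). Returns per-label word counts and label counts."""
    word_counts = {}          # label -> Counter of words seen with that label
    label_counts = Counter()  # label -> number of training messages
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_msgs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_msgs)  # class prior
        n_words = sum(counts.values())
        for word in text.lower().split():
            # Laplace (add-one) smoothing so unseen words do not zero the score
            score += math.log((counts[word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training data (assumption), labelled "spam" or "ham".
training = [
    ("win cash prize now", "spam"),
    ("cheap prize offer win", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
model = train(training)
print(classify("win a cash offer", *model))
print(classify("monday meeting", *model))
```

The same pattern (learn from labelled examples, then label new data) applies to churn risk or customer value, with customer attributes in place of words.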
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
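Grouping data without predefined labels can be sketched with k-means, one common clustering algorithm. The slide does not name an algorithm, so k-means is an assumption here. The points are invented, and taking the first k points as initial centroids is a simplification to keep the example deterministic.

```python
def kmeans(points, k, iterations=10):
    """Very small k-means: returns (centroids, assignments) for 2-D points."""
    centroids = list(points[:k])  # naive initialisation: first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        for i, (x, y) in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its points
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments

# Invented data (assumption): two visibly separate groups of points.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, assignments = kmeans(points, k=2)
print(centroids)
print(assignments)
```

No labels are supplied; the two groups emerge from the distances alone, which is what distinguishes clustering from classification.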
121 Software Group
Supervised Machine Learning
[Diagram labels: Raw Data; Clean; Training Set; Validation Set; Model; Candidate Model; Validate; Production Model; New Data (New Business, Existing Business); Prediction]
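The workflow named in this diagram (split cleaned data into training and validation sets, fit a candidate model, validate it, then use it to predict on new data) can be sketched as follows. Everything here is an assumption for illustration: the one-feature threshold model, the helper names, and the made-up churn data.

```python
import random

def split(rows, validation_fraction=0.25, seed=0):
    """Shuffle cleaned rows and split them into training and validation sets."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - validation_fraction))
    return rows[:cut], rows[cut:]

def train_threshold(training_set):
    """Candidate model: midpoint between the mean feature value of each class."""
    pos = [x for x, label in training_set if label]
    neg = [x for x, label in training_set if not label]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(threshold, x):
    """Predict True (churn) when the feature exceeds the learned threshold."""
    return x > threshold

def accuracy(threshold, rows):
    """Fraction of rows whose prediction matches the known label."""
    return sum(predict(threshold, x) == label for x, label in rows) / len(rows)

# Made-up, already-cleaned data (assumption): (monthly support calls, churned?)
raw = [(0, False), (1, False), (2, False), (3, False),
       (7, True), (8, True), (9, True), (10, True)]
training_set, validation_set = split(raw)
threshold = train_threshold(training_set)       # candidate model
print("validation accuracy:", accuracy(threshold, validation_set))
print("prediction for new customer with 6 calls:", predict(threshold, 6))
```

Holding back the validation set is the key design choice: the candidate model is only promoted to production if it performs well on data it did not see during training.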
122 Software Group
Unsupervised learning
inmaps.linkedin.com
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Products shown: Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Products shown: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition.
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle R Enterprise)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
![Page 106: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/106.jpg)
![Page 107: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/107.jpg)
![Page 108: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/108.jpg)
![Page 109: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/109.jpg)
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud, social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- Hbase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires Statsoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
112 Software Group
Artificial Intelligence Strikes back
113 Software Group
114 Software Group
115 Software Group
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
[Chart: scatter plot with a fitted linear trendline, f(x) = 0.971521231456065 x + 0.71906459527154; axis ticks run from about 0 to 120]
• Linear regression
• Non-linear (curve fit)
• Multivariate
• Time series
• Logistic regression
• CART
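A trendline like the one on the chart is an ordinary least-squares fit. The following is a minimal sketch in Python, assuming a handful of made-up (x, y) points rather than the data behind the slide; `fit_line` and `predict` are illustrative names only.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical sample points that roughly follow y = x.
xs = [10, 20, 30, 40, 50]
ys = [12, 19, 33, 38, 52]

slope, intercept = fit_line(xs, ys)

def predict(x):
    """Use the fitted line to predict y for a new x."""
    return slope * x + intercept
```

Once the line is fitted, `predict(60)` extrapolates to a value that was not in the sample, which is the "predictive" part of predictive analytics.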
119 Software Group
Classification
• Create a model that identifies/classifies new data
• Spam detection, churn risk, customer value
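As one possible illustration of classification, here is a small naive Bayes spam filter in Python. The slide does not say which technique is used, so naive Bayes is an assumption, and the four training messages are invented for the sketch.

```python
import math
from collections import Counter

def train(messages):
    """messages: list of (text, label). Returns per-label word counts and label counts."""
    word_counts = {}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability (Laplace-smoothed naive Bayes)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_msgs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        # Start from the prior probability of the label.
        score = math.log(label_counts[label] / total_msgs)
        denom = sum(counts.values()) + len(vocab)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out the score.
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labelled examples.
training = [
    ("win cash prize now", "spam"),
    ("cheap prize click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train(training)
```

Calling `classify("claim your cash prize", *model)` labels a message the model has never seen, based only on the words it shares with the training examples.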
120 Software Group
Clustering
• Group data without a pre-existing classification scheme
• For instance, basket analysis
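To make "grouping without a pre-existing scheme" concrete, here is a minimal k-means sketch in Python. K-means is only one clustering method and is an assumption here; the points and starting centroids are made up.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (squared Euclidean distance).
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical data: two visibly separate groups, with no labels supplied.
points = [(1, 2), (2, 1), (1, 1), (8, 9), (9, 8), (9, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

No labels are passed in: the two groups emerge purely from the distances between points.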
121 Software Group
Supervised Machine Learning
[Diagram labels: Existing Business; Raw Data; Clean; Training Set; Validation Set; Model; Candidate Model; Validate; Production Model; New Business; New Data; Prediction]
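The stages named in the diagram can be sketched end to end in Python. This is an illustration under stated assumptions, not the workflow of any particular product: the data is synthetic, the model is a simple least-squares line, and the 0.5 acceptance threshold is arbitrary.

```python
import random
from statistics import mean

def fit_line(rows):
    """Least-squares fit of y = slope * x + intercept over (x, y) rows."""
    x_bar = mean(x for x, _ in rows)
    y_bar = mean(y for _, y in rows)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in rows)
             / sum((x - x_bar) ** 2 for x, _ in rows))
    return slope, y_bar - slope * x_bar

# Raw data from existing business (hypothetical): (x, y) pairs, some incomplete.
raw = [(x, 2 * x + 1) for x in range(1, 21)] + [(5, None), (None, 3)]

# Clean: drop rows with missing values.
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# Split into a training set and a validation set (80/20, fixed seed).
rng = random.Random(0)
rng.shuffle(clean)
cut = int(len(clean) * 0.8)
training_set, validation_set = clean[:cut], clean[cut:]

# Model: fit a candidate model on the training set only.
slope, intercept = fit_line(training_set)

# Validate: mean absolute error on rows the model has not seen.
mae = mean(abs((slope * x + intercept) - y) for x, y in validation_set)

# Promote the candidate to production only if it is accurate enough.
production_model = (slope, intercept) if mae < 0.5 else None

# Prediction for new data (new business), here x = 25.
prediction = production_model[0] * 25 + production_model[1] if production_model else None
```

Holding back the validation set is the key design choice: the candidate model is judged on data it was not trained on before it is trusted with new business.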
122 Software Group
inmaps.linkedin.com
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Security
• Vulnerability
• Penetration Detection
Fraud Detection
CRM
• Churn
• Defaults
Medical
• Risk analysis
• Diagnosis
• Prognosis
Game optimization
Advertising
• Targeting
• Tailoring
125 Software Group
Data Science is hard
• Machine learning, collective intelligence, Hadoop, predictive analytics, R, Weka, Mahout are HARD
• Small-medium businesses need help to compete
• Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase Real-Time replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Key components to build end-to-end BI/Analytics solutions
Dell's offering was not complete…
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
![Page 114: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/114.jpg)
116 Software Group
117 Software Group
Watson is big data AI
118 Software Group
Predictive Analytics
0 20 40 60 80 100 120
-20
0
20
40
60
80
100
120
f(x) = 0971521231456065 x + 071906459527154
bull Linear regressionbull Non-linear (curve fit)bull Multivariatebull Time seriesbull Logistical Regressionbull CART
119 Software Group
Classificationbull Create a model that
identifiesclassifies new data
bull Spam detection churn risk customer value
120 Software Group
Clusteringbull Group data without a
pre-existing classification scheme
bull For instance basket analysis
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell products shown: STATISTICA, Server and Storage, TOAD & SharePlex, TOAD BI, Boomi, Kitenga
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to dust off that statistics textbook and play with R (maybe Oracle R Enterprise)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
This box will have simplified instructions about how to complete the session evaluation online
- 207: Surviving and thriving in the big data revolution
- 207: Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest – a brief history
- But Seriously
- What is Big Data?
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail?
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming?
- It’s not enough to lay out products on tables
- There’s a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through “Big Data analytics”
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell’s offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app We app
![Page 115: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/115.jpg)
![Page 116: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/116.jpg)
![Page 117: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/117.jpg)
![Page 118: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/118.jpg)
![Page 119: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/119.jpg)
121 Software Group
SupervisedMachine Learning
Raw Data Clean
Validate
Model
Candidate
ModelTraining Set
Validation Set
Production
ModelNew Data
New Business
Existing Business
Prediction
122 Software Group
Inmapslinkedincom
Unsupervised learning
123 Software Group
124 Software Group
Big Data Analytics
Data Science
Search Optimization
Recommendation Systems
Securitybull Vulnerabili
tybull Penetratio
n Detection
Fraud Detection
CRMbull Churn bull Defaults
Medicalbull Risk
analysisbull Diagnosisbull Prognosis
Game optimization
Advertisingbull Targetingbull Tailoring
125 Software Group
Data Science is hard
bull Machine learning collective intelligence Hadoop predictive analytics R Weka Mahout are HARD
bull Small-medium businesses need help to compete
bull Data scientists to the rescue
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell offerings shown: Server and Storage; TOAD & Shareplex; TOAD BI; Boomi; Kitenga
In order to address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Key components to build end-to-end BI/Analytics solutions:
• Data Integration
• Database Management
• Advanced Analytics
• Business Intelligence
• Server and Storage
Dell offerings shown: STATISTICA; Server and Storage; TOAD & Shareplex; TOAD BI; Boomi; Kitenga
Dell + StatSoft = a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring - integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  - Mobility
  - Cloud
  - Hadoop
  - Big Data Analytics
• Where is the data?
  - Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  - Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  - Good time to dust off that statistics textbook and play with R (maybe Oracle R Enterprise)
• Easy to get started with Hadoop
  - SQOOP
  - Hive
  - Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
- 207 Surviving and thriving in the big data revolution
- 207 Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest - a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud, social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- It's not enough to lay out products on tables
- There's a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring - integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
![Page 120: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/120.jpg)
![Page 121: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/121.jpg)
![Page 122: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/122.jpg)
![Page 123: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/123.jpg)
![Page 124: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/124.jpg)
126 Software Group
Data Scientists to the rescue
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
httpwwwtoadworldcomproductstoad-for-hadoopdefaultaspx
129 Software Group
SharePlexreg for Hadoop
Redo-logs
Change Data Capture
JMS Queue Hadoop Poster
BatchedHDFS File Copy Audit Change
Data
HBase RealTime replication
130 Software Group
Toad BI Suite
131 Software Group
132 Software GroupConfidential
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dellrsquos offering was not completehellip
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
In order to address the demands that face mid-market customers Dell must offer end-to-end solutions enabled with advanced analytic capabilities
133 Software GroupConfidential
Dell acquires Statsoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD amp Shareplex
TOAD BI
Boomi
Kitenga
Key co
mponents
to b
uild
end-
to-e
nd B
IA
naly
tics
solu
tions
Dell + StatSoft = completes a strong end-to-end analytics driven information management value proposition
134 Software GroupConfidentialConfidential13
4
135 Software GroupConfidentialConfidential
Data Visualization
135
136 Software GroupConfidentialConfidential
Live scoring ndash integration into operational systems
136
137 Software GroupConfidentialConfidential
Industry and cross-industry packaged solutions
137
138 Software Group
For your business
bull How could data and algorithms transform your business
bull What are the technologies that will be most importantndash Mobilityndash Cloudndash Hadoopndash Big Data Analytics
bull Where is the datandash Start collecting now
139 Software Group
For your career bull Hadoop and NoSQL creates
strong career opportunities for DBAs and developersndash Demand will exceed supply for
the foreseeable future
bull Lotrsquos of opportunities for those with Math amp Statisticsndash Good time to brush off that
statistics textbook and play with R (maybe Oracle Enterprise R)
bull Easy to get started with Hadoopndash SQOOPndash Hive ndash Pig
C
14
LV
C1
4LV
Please complete the session evaluation on the mobile appWe appreciate your feedback and insight
This box will have simplified instructions about how to complete the session evaluation online
- 207Surviving and thriving in the big data revolution
- 207Surviving and thriving in the big data revolution (2)
- Introductions
- Slide 4
- Slide 5
- Slide 6
- Slide 7
- Dell and Quest ndash a brief history
- But Seriously
- What is Big Data
- Slide 11
- Instead - the industrial Revolution of data
- Slide 13
- Slide 14
- Slide 15
- Slide 16
- Slide 17
- Slide 18
- Slide 19
- Slide 20
- Data means more
- Big Data is the culmination of cloud social and mobile
- Not all upside
- Will Big Data kill retail
- Prevalence of Showrooming
- Slide 26
- Slide 27
- Slide 28
- Slide 29
- Some novel defences
- Web analytics for retail
- Connected Store
- Slide 33
- Why showrooming
- Itrsquos not enough to lay out products on tables
- Therersquos a similar story in every industry
- The Revolution is not over yet
- Slide 38
- Slide 39
- Slide 40
- Slide 41
- Slide 42
- Slide 43
- Slide 44
- Data Input
- Slide 46
- Siri
- Slide 48
- Slide 49
- Brain Control
- Slide 51
- Slide 52
- Muze
- Slide 54
- Slide 55
- The instrumented human
- The instrumented world
- All of which accelerates what we call Big Data
- Big Database technologies
- Pioneers of Big Data
- Slide 61
- Slide 62
- Slide 63
- Slide 64
- Slide 65
- Google Software Architecture
- Map Reduce
- Multi-stage Map-Reduce
- Schema on Read vs Schema on Write
- Hadoop Open Source Map-Reduce Stack
- Hadoop at Yahoo
- Slide 72
- Slide 73
- Hadoop ecosystem
- Hadoop 1.0 Architecture
- Hadoop 2.0 YARN
- Tez1
- HBase
- HBase Data Model
- Hive
- Slide 81
- Slide 82
- Other SQL-like Hadoop Interfaces
- Pig
- Flume and SQOOP
- Berkeley Data Analytic Stack (BDAS)
- Meanwhile back at the Death Star
- Slide 88
- Oracle Exadata (X-2)
- Economies
- Oracle Big Data Appliance
- Big Data Appliance Software
- Generating competitive advantage through "Big Data analytics"
- Collective Intelligence
- Slide 97
- Slide 98
- Slide 99
- Slide 100
- Slide 101
- Slide 102
- Slide 103
- Slide 104
- Google Flu Trends
- Slide 106
- Collective Intelligence outsmarts Artificial Intelligence
- Slide 108
- Slide 109
- Slide 110
- Slide 111
- Artificial Intelligence Strikes back
- Slide 113
- Slide 114
- Slide 115
- Slide 116
- Watson is big data AI
- Predictive Analytics
- Classification
- Clustering
- Supervised Machine Learning
- Unsupervised learning
- Slide 123
- Big Data Analytics
- Data Science is hard
- Data Scientists to the rescue
- Kitenga Analytics Suite
- Toad for Hadoop
- SharePlex® for Hadoop
- Toad BI Suite
- Slide 131
- Dell's offering was not complete…
- Dell acquires StatSoft
- Slide 134
- Data Visualization
- Live scoring – integration into operational systems
- Industry and cross-industry packaged solutions
- For your business
- For your career
- Please complete the session evaluation on the mobile app
127 Software Group
Kitenga Analytics Suite
128 Software Group
Toad for Hadoop
http://www.toadworld.com/products/toad-for-hadoop/default.aspx
129 Software Group
SharePlex® for Hadoop
[Diagram labels: Redo-logs; Change Data Capture; JMS Queue; Hadoop Poster; Batched HDFS File Copy; Audit Change Data; HBase RealTime replication]
130 Software Group
Toad BI Suite
131 Software Group
132 Software Group Confidential
Dell's offering was not complete…
Key components to build end-to-end BI/Analytics solutions
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
To address the demands that face mid-market customers, Dell must offer end-to-end solutions enabled with advanced analytic capabilities.
133 Software Group Confidential
Dell acquires StatSoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Key components to build end-to-end BI/Analytics solutions
Dell + StatSoft completes a strong end-to-end, analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
![Page 131: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/131.jpg)
133 Software Group Confidential
Dell acquires StatSoft
Data Integration
Database Management
Advanced Analytics
Business Intelligence
Server and Storage
STATISTICA
Server and Storage
TOAD & SharePlex
TOAD BI
Boomi
Kitenga
Key components to build end-to-end BI Analytics solutions
Dell + StatSoft = completes a strong end-to-end analytics-driven information management value proposition
134 Software Group Confidential
135 Software Group Confidential
Data Visualization
136 Software Group Confidential
Live scoring – integration into operational systems
137 Software Group Confidential
Industry and cross-industry packaged solutions
138 Software Group
For your business
• How could data and algorithms transform your business?
• What are the technologies that will be most important?
  – Mobility
  – Cloud
  – Hadoop
  – Big Data Analytics
• Where is the data?
  – Start collecting now
139 Software Group
For your career
• Hadoop and NoSQL create strong career opportunities for DBAs and developers
  – Demand will exceed supply for the foreseeable future
• Lots of opportunities for those with Math & Statistics
  – Good time to brush off that statistics textbook and play with R (maybe Oracle R Enterprise)
• Easy to get started with Hadoop
  – SQOOP
  – Hive
  – Pig
Please complete the session evaluation on the mobile app. We appreciate your feedback and insight.
This box will have simplified instructions about how to complete the session evaluation online.
![Page 132: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/132.jpg)
![Page 133: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/133.jpg)
![Page 134: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/134.jpg)
![Page 135: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/135.jpg)
![Page 136: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/136.jpg)
![Page 137: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/137.jpg)
![Page 138: Thriving and surviving the Big Data revolution](https://reader037.vdocument.in/reader037/viewer/2022110119/555c042cd8b42a56448b5445/html5/thumbnails/138.jpg)