AWS Summit Tel Aviv - Enterprise Track - Data Warehouse
TRANSCRIPT
![Page 1: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/1.jpg)
AWS Summit 2013 Tel Aviv Oct 16 – Tel Aviv, Israel
Guy Ernest
Solutions Architecture, Amazon Web Services
Data Warehouse on AWS
![Page 2: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/2.jpg)
DATAWAREHOUSE
ERP
ANALYST CRM
DB
![Page 3: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/3.jpg)
DATAWAREHOUSE
ERP
ANALYST CRM
DB
OLTP
OLTP
OLTP
OLAP
![Page 4: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/4.jpg)
| Transactional Processing | Analytical Processing |
|---|---|
| Transactional context | Global context |
| Latency | Throughput |
| Indexed access | Full table scans |
| Random IO | Sequential IO |
| Disk seek times | Disk transfer rate |
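The contrast between seek-bound and transfer-bound access can be made concrete with a rough estimate. The sketch below is illustrative only: the seek time, transfer rate, and row width are assumed round numbers for a single spinning disk, not measurements from the talk.

```python
# Assumed, illustrative figures for one spinning disk (not from the slides).
SEEK_TIME_S = 0.010          # assumed average seek + rotational delay per random read
TRANSFER_RATE_MB_S = 200.0   # assumed sequential transfer rate
ROW_SIZE_BYTES = 240         # assumed (hypothetical) row width


def random_read_seconds(num_rows: int) -> float:
    """Indexed access: roughly one disk seek per row fetched."""
    return num_rows * SEEK_TIME_S


def full_scan_seconds(total_rows: int) -> float:
    """Full table scan: bounded by the sequential transfer rate."""
    total_mb = total_rows * ROW_SIZE_BYTES / 1_000_000
    return total_mb / TRANSFER_RATE_MB_S


table_rows = 100_000_000

# OLTP-style lookup of a handful of rows: latency dominated by seeks.
print(f"10 indexed reads:       {random_read_seconds(10):.2f} s")
# OLAP-style query touching every row, done two ways.
print(f"all rows via seeks:     {random_read_seconds(table_rows):,.0f} s")
print(f"all rows via full scan: {full_scan_seconds(table_rows):,.0f} s")
```

Under these assumptions a few indexed reads finish in a fraction of a second, while reading every row is far cheaper as one sequential scan than as millions of seeks. That is the trade-off the two columns describe.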
![Page 5: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/5.jpg)
OLTP
OLAP
![Page 6: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/6.jpg)
DATAWAREHOUSE ANALYST
BUSINESS INTELLIGENCE REPORTS, DASHBOARD, …
PRODUCTION OFFLOAD DIFFERENT DATA STRUCTURE, USING ETLs, …
![Page 7: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/7.jpg)
![Page 8: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/8.jpg)
![Page 9: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/9.jpg)
BIG ENTERPRISES
VERY EXPENSIVE (ROI)
DIFFICULT TO MAINTAIN
NOT SCALABLE
![Page 10: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/10.jpg)
BIG ENTERPRISES SME
WAY TOO EXPENSIVE!
VERY EXPENSIVE (ROI)
DIFFICULT TO MAINTAIN
NOT SCALABLE
![Page 11: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/11.jpg)
Jeff Bezos
![Page 12: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/12.jpg)
Data Sources
Queries
Value
![Page 13: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/13.jpg)
+ ELASTIC CAPACITY
+ NO CAPEX
+ PAY FOR WHAT YOU USE
+ DISPOSE ON DEMAND
= NO CONSTRAINTS
![Page 14: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/14.jpg)
COLLECT STORE ANALYZE SHARE
ACCELERATION
AMAZON REDSHIFT
![Page 15: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/15.jpg)
AMAZON REDSHIFT
![Page 16: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/16.jpg)
DWH that scales to petabyte and…
AMAZON REDSHIFT
… WAY LESS EXPENSIVE
… WAY FASTER
…WAY SIMPLER
![Page 17: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/17.jpg)
AMAZON REDSHIFT RUNNING ON OPTIMIZED HARDWARE
HS1.8XL: 128 GB RAM, 16 Cores, 16 TB Compressed Data, 2 GB/sec Disk Scan
HS1.XL: 16 GB RAM, 2 Cores, 2 TB Compressed Data
![Page 18: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/18.jpg)
Extra Large Node (HS1.XL)
Single Node (2 TB)
Cluster 2-32 Nodes (4 TB – 64 TB)
Eight Extra Large Node (HS1.8XL)
Cluster 2-100 Nodes (32 TB – 1.6 PB)
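The capacity ranges on this slide follow directly from the per-node storage figures (2 TB for HS1.XL, 16 TB for HS1.8XL). A minimal sketch of that arithmetic is below; the function names are hypothetical, and `nodes_needed` assumes the compressed data size is already known.

```python
import math

# Per-node storage and cluster size limits as quoted on the slides.
NODE_TYPES = {
    "HS1.XL": {"tb_per_node": 2, "min_nodes": 2, "max_nodes": 32},
    "HS1.8XL": {"tb_per_node": 16, "min_nodes": 2, "max_nodes": 100},
}


def cluster_capacity_tb(node_type: str, nodes: int) -> int:
    """Total compressed storage of a multi-node cluster, in TB."""
    spec = NODE_TYPES[node_type]
    if not spec["min_nodes"] <= nodes <= spec["max_nodes"]:
        raise ValueError(f"{node_type} clusters have "
                         f"{spec['min_nodes']}-{spec['max_nodes']} nodes")
    return nodes * spec["tb_per_node"]


def nodes_needed(node_type: str, data_tb: float) -> int:
    """Smallest multi-node cluster of this type that holds data_tb (hypothetical helper)."""
    spec = NODE_TYPES[node_type]
    nodes = max(spec["min_nodes"], math.ceil(data_tb / spec["tb_per_node"]))
    if nodes > spec["max_nodes"]:
        raise ValueError(f"{data_tb} TB does not fit in a {node_type} cluster")
    return nodes


for name, spec in NODE_TYPES.items():
    low = cluster_capacity_tb(name, spec["min_nodes"])
    high = cluster_capacity_tb(name, spec["max_nodes"])
    print(f"{name}: {low} TB to {high} TB")
```

The upper bound for HS1.8XL comes out at 1,600 TB, which is the 1.6 PB shown on the slide.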
![Page 19: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/19.jpg)
10 GigE (HPC)
Ingestion Backup
Restoration
JDBC/ODBC
![Page 20: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/20.jpg)
![Page 21: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/21.jpg)
…WAY SIMPLER
![Page 22: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/22.jpg)
LOADING DATA
Parallel Loading
Data sorted and distributed automatically
Linear Growth
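Parallel loading benefits from input that is already split into several files, so that the nodes can each work on a share of the data. The sketch below shows one simple way to split records into a fixed number of roughly equal parts before upload. The function name and the round-robin strategy are assumptions for illustration, not something prescribed by the talk.

```python
from typing import Iterable, List


def split_round_robin(records: Iterable[str], num_parts: int) -> List[List[str]]:
    """Distribute records over num_parts lists so that sizes differ by at most one."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    parts: List[List[str]] = [[] for _ in range(num_parts)]
    for index, record in enumerate(records):
        parts[index % num_parts].append(record)
    return parts


# Example: ten records split into four parts of near-equal size.
parts = split_round_robin([f"row-{i}" for i in range(10)], 4)
for number, part in enumerate(parts):
    print(f"part {number}: {len(part)} records")
```

Each part would then be written to its own file and uploaded. The right number of parts depends on the cluster size.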
![Page 23: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/23.jpg)
DATA SNAPSHOTS
Automatic and Incremental snapshots in Amazon S3
Configurable Retention Period
Manual Snapshots
“Streaming” Restore
![Page 24: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/24.jpg)
REPLICATION IN CLUSTER +
AUTOMATIC SNAPSHOT IN AMAZON S3 +
MONITORING OF CLUSTER NODES
![Page 25: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/25.jpg)
AUTOMATIC RESIZING
![Page 26: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/26.jpg)
Read-only mode while resizing
New cluster is created in the background
Parallel node-to-node data copy
Only charged for a single cluster
![Page 27: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/27.jpg)
Automatic DNS based endpoint cut-over
Deletion of source cluster
![Page 28: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/28.jpg)
![Page 29: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/29.jpg)
CREATE A DATAWAREHOUSE IN MINUTES
![Page 30: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/30.jpg)
![Page 31: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/31.jpg)
![Page 32: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/32.jpg)
![Page 33: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/33.jpg)
![Page 34: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/34.jpg)
![Page 35: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/35.jpg)
![Page 36: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/36.jpg)
![Page 37: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/37.jpg)
…WAY FASTER
![Page 38: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/38.jpg)
MEMORY CAPACITY AND CPU PERFORMANCE DOUBLE EVERY 2 YEARS
DISK PERFORMANCE DOUBLES EVERY 10 YEARS
![Page 39: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/39.jpg)
Progress is not evenly distributed

| | 1980 | Today | Change |
|---|---|---|---|
| Cost | 14,000,000$/TB | 30$/TB | 450,000 ÷ |
| Capacity | 100MB | 3TB | 30,000 X |
| Transfer rate | 4MB/s | 200MB/s | 50 X |
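The ratios on this slide can be recomputed from the 1980 and "today" figures. The sketch below assumes decimal units (1 TB = 1,000,000 MB). It also derives the time needed to read an entire disk, which is not on the slide but follows from capacity divided by transfer rate.

```python
MB_PER_TB = 1_000_000  # decimal units, an assumption for this estimate

disk_1980 = {"cost_per_tb_usd": 14_000_000, "capacity_mb": 100, "transfer_mb_s": 4}
disk_today = {"cost_per_tb_usd": 30, "capacity_mb": 3 * MB_PER_TB, "transfer_mb_s": 200}

capacity_gain = disk_today["capacity_mb"] / disk_1980["capacity_mb"]
transfer_gain = disk_today["transfer_mb_s"] / disk_1980["transfer_mb_s"]
cost_drop = disk_1980["cost_per_tb_usd"] / disk_today["cost_per_tb_usd"]


def full_read_seconds(disk: dict) -> float:
    """Time to read the whole disk sequentially at its transfer rate."""
    return disk["capacity_mb"] / disk["transfer_mb_s"]


print(f"capacity:      {capacity_gain:,.0f}x larger")
print(f"transfer rate: {transfer_gain:,.0f}x faster")
print(f"cost per TB:   {cost_drop:,.0f}x cheaper")
print(f"full read, 1980:  {full_read_seconds(disk_1980):,.0f} s")
print(f"full read, today: {full_read_seconds(disk_today):,.0f} s")
```

The capacity and transfer ratios match the slide (30,000 and 50). The computed cost ratio is roughly 467,000, so the slide's 450,000 appears to be a rounded figure. Because capacity grew so much faster than transfer rate, reading a full disk went from seconds to hours.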
![Page 40: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/40.jpg)
I/O IS THE MAIN FACTOR FOR PERFORMANCE
![Page 41: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/41.jpg)
• COLUMNAR STORAGE
• COMPRESSION PER COLUMN
• ZONE MAPS
• OPTIMIZED HARDWARE
• LARGE DATA BLOCK SIZE
| Id | Age | State |
|---|---|---|
| 123 | 20 | CA |
| 345 | 25 | WA |
| 678 | 40 | FL |
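To illustrate why columnar storage and zone maps reduce I/O, the sketch below stores the slide's three sample rows column by column and keeps a min/max pair per block of values. It is a toy in-memory model under assumed names and a tiny assumed block size, not a description of how Redshift lays out data on disk.

```python
from typing import Dict, List, Tuple

# The three sample rows from the slide: (Id, Age, State).
rows = [(123, 20, "CA"), (345, 25, "WA"), (678, 40, "FL")]

# Columnar layout: one list per column instead of one tuple per row.
columns: Dict[str, list] = {
    "id": [r[0] for r in rows],
    "age": [r[1] for r in rows],
    "state": [r[2] for r in rows],
}

# An aggregate over one column only needs to read that column.
average_age = sum(columns["age"]) / len(columns["age"])


def build_zone_map(values: List[int], block_size: int) -> List[Tuple[int, int]]:
    """Record (min, max) for each block of block_size consecutive values."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return [(min(block), max(block)) for block in blocks]


def blocks_to_scan(zone_map: List[Tuple[int, int]], low: int, high: int) -> List[int]:
    """Indexes of blocks whose [min, max] range overlaps the query range [low, high]."""
    return [i for i, (mn, mx) in enumerate(zone_map) if mx >= low and mn <= high]


age_zone_map = build_zone_map(columns["age"], block_size=2)
print(f"average age: {average_age:.1f}")
print(f"zone map for age: {age_zone_map}")
print(f"blocks to scan for age between 30 and 50: {blocks_to_scan(age_zone_map, 30, 50)}")
```

A query on `age` never touches the `id` or `state` data, and the zone map lets it skip any block whose value range cannot match the filter.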
![Page 42: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/42.jpg)
![Page 43: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/43.jpg)
![Page 44: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/44.jpg)
![Page 45: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/45.jpg)
![Page 46: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/46.jpg)
TEST:
2 BILLION RECORDS
6 REPRESENTATIVE QUERIES
![Page 47: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/47.jpg)
AMAZON REDSHIFT 2xHS1.8XL
Vs.
32 NODES, 4.2TB RAM, 1.6PB
![Page 48: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/48.jpg)
12x - 150x FASTER
![Page 49: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/49.jpg)
![Page 50: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/50.jpg)
30 MINUTES
12 SECONDS
![Page 51: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/51.jpg)
…WAY LESS EXPENSIVE
![Page 52: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/52.jpg)
2x HS1.8XL 3.65$ / HOUR
32 000$ / YEAR
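As a quick sanity check on the yearly figure, assuming the cluster runs around the clock for a 365-day year at the quoted hourly rate:

$$
3.65\ \tfrac{\$}{\text{hour}} \times 24\ \tfrac{\text{hours}}{\text{day}} \times 365\ \text{days} = 31{,}974\ \$ \approx 32{,}000\ \$ \text{ per year}
$$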
![Page 53: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/53.jpg)
| HS1.XL | Instance price per hour | Hourly price per TB | Yearly price per TB |
|---|---|---|---|
| On-Demand | 0.850 $ | 0.425 $ | 3 723 $ |
| 1 Year Reservation | 0.500 $ | 0.250 $ | 2 190 $ |
| 3 Years Reservation | 0.228 $ | 0.114 $ | 999 $ |
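The per-TB columns in this pricing table can be derived from the hourly instance price. The sketch below assumes 2 TB of compressed storage per HS1.XL node (as stated earlier in the deck) and 8,760 hours per year. The dictionary and function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365   # 8,760 hours, assuming the node runs all year
TB_PER_HS1_XL = 2           # compressed storage per HS1.XL node, from the hardware slide

# Hourly HS1.XL prices as quoted on the slide (USD).
hourly_node_price = {
    "On-Demand": 0.850,
    "1 Year Reservation": 0.500,
    "3 Years Reservation": 0.228,
}


def price_per_tb(hourly_price: float) -> tuple:
    """Return (hourly price per TB, yearly price per TB) for one HS1.XL node."""
    per_tb_hour = hourly_price / TB_PER_HS1_XL
    per_tb_year = per_tb_hour * HOURS_PER_YEAR
    return per_tb_hour, per_tb_year


for plan, price in hourly_node_price.items():
    per_hour, per_year = price_per_tb(price)
    print(f"{plan}: {per_hour:.3f} $/TB/hour, {per_year:,.0f} $/TB/year")
```

Rounded to whole dollars, the yearly results reproduce the slide's 3 723 $, 2 190 $ and 999 $ per TB.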
![Page 54: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/54.jpg)
![Page 55: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/55.jpg)
Intel Confidential
Intel Analytics on AWS
Assaf Araki
October, 2013
![Page 56: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/56.jpg)
Agenda
• Advanced Analytics @ Intel
• Enterprise on the Cloud
• Use Case
![Page 57: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/57.jpg)
Advanced Analytics
• Vision: Make analytics a competitive advantage for Intel
• Mission:
• Solve strategic high value business line problems
• Leverage analytics to grow Intel revenue
• About the team:
• ~100 employees - corporate ownership of advanced analytics
• Big data and Machine Learning are key focus areas
• Skills: Software Engineering / Decision Science / Business Acumen
• Value driven – ROI>$10M and/or key corporate problem as defined by VPs
• Part of the Israel Academy Computational research center
Intel AA Team
![Page 58: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/58.jpg)
Big Data Analytics Platform
• Highly scalable, hybrid platform to support a range of business use cases
MPP High Speed Data Loader
Rich advanced analytics and real-time, in-database data mining capabilities
Heterogeneous data, batch oriented on advanced analytics
Prediction Module
AA Overview
![Page 59: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/59.jpg)
Why Cloud?
• Known reasons
– Reduce cost
– Universal access
– Scale fast
• Additional reasons
– Flexible & Agile platform – no need for the engineering team to certify each tool
– Development accelerator – R&D team can start developing while engineering teams implement the platform on premise
Enterprise On the Cloud
![Page 60: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/60.jpg)
Use Case
• Characteristics:
– CPU behavior data
– Size: 30TB of data per month
– Type: Structured data
– Processing:
• Create aggregation facts and grant ad hoc analysis
• Create ML solutions
• Current Status:
– Data is sampled and processed on SMP RDBMS
– Takes almost 24 hours to process the entire data
• Problem Statement
– Limited ability to analyze all data
Use Case
![Page 61: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/61.jpg)
Platforms
• On premise
– HBase – Hadoop platform exists
• No HBase
– MPP DB – Exists with Machine Learning capabilities
• Evaluate and purchase a lower-cost platform
• Cloud
– HBase - EMR
– MPP DB - AWS Redshift
Enterprise On the Cloud
Go for POC on the Cloud
![Page 62: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/62.jpg)
Evaluation Criteria
• Capabilities
– Create statistics calculations
• Cost of HW per TB
– Replication
– Compression
• Performance
– Load, transformation, querying
• Scalability
• Ability to execute
Enterprise On the Cloud
![Page 63: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/63.jpg)
Preliminary Results
• Dataset example
– 34GB compressed data divided into files
– ~1,500,000,000 records
– 24B compressed, 240B per record (~15 columns)
• Performance & Scalability - 8 x 1XL nodes
– Load time – for 32 files – 2 hours (4 files – 5 hours)
– Table size – 202GB (compression rate ~1.5:1)
– SQL aggregation statements
• 38K records – 6 minutes
• 14M records – 7 minutes
• 66M records – 11 minutes (on 4 x 1XL – 22 minutes)
• 939M records – 34 minutes (on 4 x 1XL – 77 minutes)
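The scalability figures above can be turned into speedup ratios. The sketch below uses only the timings quoted on this slide. The helper names are illustrative, and "scaling efficiency" here simply means speedup divided by the increase in node count.

```python
# Query timings in minutes, keyed by node count, as quoted on the slide.
query_minutes = {
    "66M records": {4: 22, 8: 11},
    "939M records": {4: 77, 8: 34},
}

# Load times in hours, keyed by number of input files (8 x 1XL nodes).
load_hours = {32: 2, 4: 5}


def speedup(slow: float, fast: float) -> float:
    """How many times faster the second timing is than the first."""
    return slow / fast


def scaling_efficiency(observed_speedup: float, node_ratio: float) -> float:
    """1.0 means perfectly linear scaling for the given increase in nodes."""
    return observed_speedup / node_ratio


for name, timing in query_minutes.items():
    gain = speedup(timing[4], timing[8])
    print(f"{name}: {gain:.2f}x faster on 8 nodes, "
          f"efficiency {scaling_efficiency(gain, 8 / 4):.2f}")

print(f"load with 32 files vs 4 files: {speedup(load_hours[4], load_hours[32]):.1f}x faster")
```

On these numbers, doubling the node count at least halves the query time in both reported cases, and splitting the input into 32 files instead of 4 cuts the load time by a factor of 2.5.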
Use Case
![Page 64: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/64.jpg)
Capabilities and Cost
• No current ability to write code (Java/C++/Python/R)
– Implement statistics and algorithms in SQL
• Compression is not straightforward
• Cost is sensitive to the actual compression ratio
– 2.6 : 1 is break even
• 8XL vs. High Storage instance (16 cores 48TB)
• 3 years with 100% utilization
Use Case
![Page 66: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/66.jpg)
Thank You!
![Page 67: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/67.jpg)
USE CASE
![Page 68: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/68.jpg)
AMAZON ELASTIC
MAPREDUCE
AMAZON
DYNAMODB
AMAZON EC2
AWS STORAGE GATEWAY
AMAZON S3
DATA CENTER
AMAZON RDS
AMAZON REDSHIFT
![Page 69: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/69.jpg)
UPLOAD TO AMAZON S3
AWS IMPORT/EXPORT
AWS DIRECT CONNECT
DATA
INTEGRATION
INTEGRATION
SYSTEMS
![Page 70: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/70.jpg)
![Page 71: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/71.jpg)
![Page 72: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/72.jpg)
2 million
15 million
MEMBER REGISTRATIONS
2011 2012 2013
![Page 73: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/73.jpg)
1,500,000+ NEW MEMBERS EACH MONTH
![Page 74: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/74.jpg)
1,200,000,000+ SOCIAL CONNECTIONS IMPORTED
![Page 75: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/75.jpg)
Data Analyst
Raw Data
Get Data
Join via Facebook
Add a Skill Page
Invite Friends
Web Servers Amazon S3 User Action Trace Events
EMR Hive Scripts Process Content
• Process log files with regular expressions to parse out the information we need.
• Process cookies into useful, searchable data such as Session, UserId, and API security token.
• Filter out surplus information such as internal Varnish logging.
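The three processing steps above can be sketched as follows. The slide describes Hive scripts on EMR; this sketch uses plain Python instead, purely for illustration. The log line format, the cookie names, and the rule for spotting Varnish entries are all hypothetical assumptions, since the actual log layout is not shown in the deck.

```python
import re
from typing import Dict, Optional

# Hypothetical log format: ip [timestamp] "METHOD path" status "cookie string"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \[(?P<ts>[^\]]+)\] "(?P<method>[A-Z]+) (?P<path>\S+)" '
    r'(?P<status>\d{3}) "(?P<cookie>[^"]*)"'
)


def parse_cookie(cookie: str) -> Dict[str, str]:
    """Turn 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    pairs = (item.split("=", 1) for item in cookie.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Return the searchable fields of one log line, or None if it is dropped."""
    if "varnish" in line.lower():   # assumed marker for internal Varnish logging
        return None
    match = LOG_LINE.match(line)
    if match is None:               # line does not follow the assumed format
        return None
    cookie = parse_cookie(match.group("cookie"))
    return {
        "path": match.group("path"),
        "status": match.group("status"),
        "session": cookie.get("Session", ""),
        "user_id": cookie.get("UserId", ""),
    }


sample = ('10.0.0.1 [16/Oct/2013:10:00:00 +0000] "GET /skills/add" 200 '
          '"Session=abc123; UserId=42; Token=xyz"')
print(parse_line(sample))
```

The parsed records would then be aggregated and written back to Amazon S3 for loading into Amazon Redshift, as the diagram on this slide indicates.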
Amazon S3
Aggregated Data
Raw Events
Internal Web
Excel Tableau
Amazon Redshift
![Page 76: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/76.jpg)
ELASTIC DATA WAREHOUSE
![Page 77: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/77.jpg)
![Page 78: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/78.jpg)
![Page 79: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/79.jpg)
![Page 80: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/80.jpg)
![Page 81: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/81.jpg)
![Page 82: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/82.jpg)
![Page 83: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/83.jpg)
Monthly Reports on a new cluster
![Page 84: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/84.jpg)
Redshift Reporting
and BI EMR
S3
![Page 85: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/85.jpg)
DynamoDB Redshift
OLTP Web Apps
Reporting and BI
![Page 86: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/86.jpg)
RDBMS Redshift
OLTP ERP
Reporting & BI
![Page 87: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/87.jpg)
+
RDBMS Redshift
OLTP ERP
Reporting & BI
![Page 88: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/88.jpg)
JDBC/ODBC
Amazon Redshift
![Page 89: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/89.jpg)
![Page 90: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/90.jpg)
![Page 91: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/91.jpg)
![Page 92: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/92.jpg)
![Page 93: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/93.jpg)
![Page 94: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/94.jpg)
![Page 95: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/95.jpg)
DATAWAREHOUSE BY AWS
Pay per use, no CAPEX
Low cost for high performance
Open, and integrates with existing BI tools
Simple to use and scalable
![Page 96: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/96.jpg)
Speed and Agility
Frequent Experiments
Low Cost of Failure
More Innovation
Fewer Experiments
High Cost of Failures
Less Innovation
“On Premise”
![Page 97: AWS Summit Tel Aviv - Enterprise Track - Data Warehouse](https://reader031.vdocument.in/reader031/viewer/2022020206/540df08d8d7f728d7e8b4b77/html5/thumbnails/97.jpg)
Thank you very much