Oracle and SAS: Clouds, Community, Collaboration, and Computing Creativity
Goodbye Cloudy with a Chance of Showers; Hello Cloud-eze and Clear Skies
Maureen Chew, Principal Software Engineer, Oracle Corporation
Gary Granito, Enterprise Architect, Oracle Corporation
Thomas Mendicino, Senior Manager, Data Management, GE Capital
Paul Kent, Vice President, Big Data, SAS
Agenda
SAS Big Data & Oracle
Paul Kent, SAS
Leveraging Technology to Improve SAS Analytics Performance – Battling Change is Never Easy
Thomas Mendicino, GE Capital
Oracle and SAS On Collaboration
Cloud Perspective
Towards Cloud “readiness” with your SAS programs…
Questions, Kindle Drawing
SAS Scoring Accelerator for Oracle and Beyond
Paul Kent, Vice President, Big Data, SAS
SAS & Oracle
Oracle is a top SAS Partner
SAS on Solaris+SPARC is a top tier platform for SAS Enterprise Deployments
T4 / SuperCluster, Solaris 11
Oracle Enterprise Linux (OEL) is Red Hat (RH) compatible; SAS is supported on OEL
Announcing Scoring Accelerator Availability June 5th (2012)
Recent joint Exadata performance testing results
Scoring Scale Up on Exadata Full, Half and Quarter Rack
Expected Scalability: Full (32), Half(16), Qtr(8)
Performance even at the low end across all configs
Chart: Scoring Accelerator - Exadata Full, Half, Quarter Rack
Rows/Second - 80M rows, regression - higher is better
X-axis: Degree of Parallelism (1, 2, 4, 8, 16, 32); Y-axis: Rows/sec (0 to 300,000); Series: Full, Half, Quarter
Data labels at DOP 1, 2, 4, 8, 16, 32: 11,433; 21,534; 43,478; 87,241; 173,160; 290,909
Scoring Scale Up on Exadata Full, Half and Quarter Rack
Linear scalability as expected: Full 1→32, Half 1→16, Qtr 1→8
Full: ~2 hrs → 4.5 min
Chart: Scoring Accelerator - Exadata (Full, Half, Quarter Rack)
Time - regression, 80M rows - lower is better
Categories: Full, Half, Quarter; Y-axis: Time (HH:MM:SS), 00:00:00 to 02:24:00; Legend: DOP 32, 16, 8, 4, 2, 1
Scoring Time at varying Degrees of Parallelism (DOP)
Linear Scalability as DOP increases
Consistent Full/Half/Qtr Times at similar DOP
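The scale-up figures on these two slides can be checked with a little arithmetic. The sketch below is illustrative only: it takes the six data labels from the rows/second chart, assumes they belong to the Full rack series (the only one that reaches DOP 32), and derives speedup, parallel efficiency, and the elapsed time implied for 80M rows. The function and variable names are made up for this example.

```python
# Data labels from the rows/second chart (80M rows, regression).
# Assumption: these are the Full rack values at DOP 1..32.
ROWS = 80_000_000
rows_per_sec = {1: 11_433, 2: 21_534, 4: 43_478, 8: 87_241, 16: 173_160, 32: 290_909}


def scaling_table(throughput, total_rows):
    """Return (dop, speedup, efficiency, minutes) for each degree of parallelism."""
    base = throughput[1]
    table = []
    for dop in sorted(throughput):
        rate = throughput[dop]
        speedup = rate / base              # throughput relative to DOP=1
        efficiency = speedup / dop         # 1.0 would be perfectly linear
        minutes = total_rows / rate / 60   # elapsed time implied by the rate
        table.append((dop, speedup, efficiency, minutes))
    return table


if __name__ == "__main__":
    for dop, speedup, eff, minutes in scaling_table(rows_per_sec, ROWS):
        print(f"DOP {dop:>2}: speedup {speedup:5.1f}x, "
              f"efficiency {eff:4.0%}, elapsed {minutes:6.1f} min")
```

Under these assumptions the rate at DOP=1 implies about 1.9 hours for 80M rows and the rate at DOP=32 implies about 4.6 minutes, in line with the "~2 hrs → 4.5 min" figure; the speedup at DOP=32 works out to roughly 25x.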
Scoring Scale Up on Exadata Extra Large – 250M rows
Full rack with DOP=32
80M rows: 2 hrs → 4 min
250M rows: 6 hrs → 14 min
Scale up consistency: Time, Rows/Sec
Chart: Scoring - Scale up on Full Rack
Extra Large Table - Rows/Sec - 80M vs 250M
X-axis: Degree of Parallelism (1, 2, 4, 8, 16, 32); Y-axis: Rows/Sec (0 to 350,000); Series: 80M, 250M
Data labels at DOP 1, 2, 4, 8, 16, 32: 11,086; 22,401; 44,068; 87,966; 170,765; 303,030
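The "scale up consistency" claim can be illustrated by comparing the data labels from the two rows/second charts. The sketch below assumes the first set of labels is the 80M-row run and the second set is the 250M-row run; the slides do not say which series the labels belong to, so treat this as a plausibility check rather than a statement of the measured results. The helper names are invented for the example.

```python
# Data labels as they appear on the two charts, at DOP 1, 2, 4, 8, 16, 32.
# Assumption: the first set is the 80M-row run and the second the 250M-row run;
# the slides do not state which series the labels belong to.
DOPS = [1, 2, 4, 8, 16, 32]
RATE_80M = [11_433, 21_534, 43_478, 87_241, 173_160, 290_909]
RATE_250M = [11_086, 22_401, 44_068, 87_966, 170_765, 303_030]


def relative_difference(a, b):
    """Signed difference of b relative to a, as a fraction."""
    return (b - a) / a


def hours_minutes(total_rows, rate):
    """Elapsed time implied by a rows/second rate, as (hours, minutes)."""
    seconds = total_rows / rate
    return seconds / 3600, seconds / 60


if __name__ == "__main__":
    for dop, r80, r250 in zip(DOPS, RATE_80M, RATE_250M):
        print(f"DOP {dop:>2}: {r80:>7,} vs {r250:>7,} rows/sec "
              f"({relative_difference(r80, r250):+.1%})")
    hours, _ = hours_minutes(250_000_000, RATE_250M[0])
    _, minutes = hours_minutes(250_000_000, RATE_250M[-1])
    print(f"250M rows: about {hours:.1f} h at DOP=1, {minutes:.1f} min at DOP=32")
```

With these labels the two sets differ by less than 5% at every DOP, and the second set implies about 6.3 hours at DOP=1 and just under 14 minutes at DOP=32 for 250M rows, which agrees with the "6 hrs → 14 min" figure.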
Looking Ahead
HPA on Oracle: High Priority Focus
Leveraging Technology to Improve SAS Analytics Performance
Battling Change is Never Easy
Tom Mendicino
Manager, Information Delivery
GE Capital – Retail Finance
Current Environment
Retail Finance depends on 10+ Oracle 11g database instances on several independent Sun Fire E25K Servers to support 200+TB of operational data incorporating over 300 million accounts.
Serves over 600 users in risk, marketing, operations, finance, decision science, and fraud utilizing a wide range of SAS applications from simple reporting and portfolio analytics to advanced modeling and forecasting.
The SAS environment sits on a separate E25K with 144 cores spread across 5 domains with 60 TB of attached storage which includes /saswork.
It’s stable, but in need of an architecture refresh.
Future State - Database
Databases are currently being migrated to Oracle Exadata X-2 Servers (2 Full Racks)
Decision was based on POC performed between Oracle and another vendor.
In excess of 180,000 tests were conducted, showing order-of-magnitude improvements: hours to minutes, minutes to seconds.
Little to no impact on user queries – lift and load.
When pulling large SAS datasets down to storage, there is still an I/O delay but we still see 400% improvements.
Future State - SAS
The 144 core E25K will be replaced with 18 eight-core commodity servers in a SAS Grid enabled architecture.
96 GB RAM per node
8 x 600 GB HDD (2 for OS, 6 for /saswork)
Analytic Workload Management
• Ability to define Job Queues (Production, Normal, Business Critical, etc.)
• Jobs will queue based on queue priority, available compute capacity, and designated group consumption
• Jobs scheduled to run based on “fair share” scheduler so that no one group can consume all of the grid
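One way to picture the "fair share" idea is a scheduler that, whenever capacity frees up, starts the highest-priority queued job from the group that is furthest below its entitled share. The sketch below is a toy model only, not how SAS Grid is implemented or configured: the queue names come from the slide, while the numeric priorities, group names, share values, and the `pick_next` function are hypothetical.

```python
from dataclasses import dataclass

# Queue names from the slide; the numeric priorities are assumptions.
QUEUE_PRIORITY = {"Business Critical": 3, "Production": 2, "Normal": 1}


@dataclass
class Job:
    name: str
    group: str
    queue: str


def pick_next(pending, running_slots, shares):
    """Choose the next job to start, or None if nothing is pending.

    pending       -- list of queued Job objects
    running_slots -- slots currently used per group, e.g. {"risk": 4}
    shares        -- entitled share per group, e.g. {"risk": 2, "marketing": 1}
    """
    if not pending:
        return None

    def sort_key(job):
        used = running_slots.get(job.group, 0)
        # Usage relative to entitlement: lower means the group is owed capacity.
        usage_ratio = used / shares[job.group]
        # Higher queue priority first, then the most under-served group.
        return (-QUEUE_PRIORITY[job.queue], usage_ratio)

    return min(pending, key=sort_key)


if __name__ == "__main__":
    pending = [
        Job("risk_model", "risk", "Normal"),
        Job("campaign_report", "marketing", "Normal"),
    ]
    # Hypothetical state: risk already holds 6 slots, marketing holds 1.
    nxt = pick_next(pending, {"risk": 6, "marketing": 1}, {"risk": 2, "marketing": 1})
    print(nxt.name)
```

In this hypothetical state the marketing job is chosen, because the risk group is already using three times its share while marketing is using only its entitlement; a Business Critical job from either group would still jump ahead of both.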
Oracle and SAS: Clouds, Community, Collaboration, and Computing Creativity
Goodbye Cloudy with a Chance of Showers; Hello Cloud-eze and Clear Skies
Maureen Chew, Principal Software Engineer, Oracle Corporation
Gary Granito, Enterprise Architect, Oracle Corporation
On Collaboration
• Exadata, Exalogic, SPARC SuperCluster
• Early Access Programs
• Oracle Solaris Studio 12.3 Platinum Beta
• Solaris 11 – Early Preview Testing
• http://oracle.com/sas
SAS & Oracle Collaboration
Current Joint POCs (Proof of Concept) Underway
SPARC SuperCluster – SAS Grid, Marketing Automation, Enterprise Miner
SAS Grid with Sun ZFS Storage
SAS Scoring Accelerator on Exadata
SAS Grid on Oracle Exalogic and Sun ZFS Storage 7420 Appliance
½ Rack Oracle Exalogic Compute Appliance
• 16 Compute Nodes (Node 1 through Node 16)
• Each Compute Node: 2 x 6 core 2.93GHz Xeon, 96GB RAM
• Total: 192 Intel cores, 1.5 TB RAM, Infiniband Fabric
• Internal ZFS Storage Appliance: 40TB, 2 Mirrored Disk Trays (for SAS binaries, code)
Infiniband Network Switch
• Infiniband Network for NFS
Sun ZFS Storage 7420 Appliance (Infiniband Network Attach)
• Head Unit: 2 x 8 core 2.0GHz Xeon, 256 GB RAM, 4 x 512GB Read Cache
• 4 Infiniband Connections
• 8 Disk Trays (Disk Tray 1 through Disk Tray 8)
• Each Disk Tray: 20 x 1 TB SATA2 Disks, 4 x 18 GB Write Cache
• Total: 160 spindles, 2 TB Read Cache, 576 GB Write Cache (for /sasdata, /saswork)
Test result: 96 SAS sort jobs, 5.7TB total I/O, 3.7GB/sec throughput
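The totals quoted for this configuration follow directly from the per-component figures. The sketch below simply redoes that arithmetic; the elapsed-time estimate at the end is not from the slides and rests on two assumptions (1 TB = 1000 GB, and the 3.7GB/sec figure being a sustained average), so read it as a rough illustration.

```python
# Per-component figures from the configuration slide.
NODES = 16
CORES_PER_NODE = 2 * 6            # 2 x 6-core Xeon per compute node
RAM_GB_PER_NODE = 96

DISK_TRAYS = 8
DISKS_PER_TRAY = 20
WRITE_CACHE_GB_PER_TRAY = 4 * 18
READ_CACHE_GB = 4 * 512

TOTAL_IO_TB = 5.7
THROUGHPUT_GB_PER_SEC = 3.7


def totals():
    """Aggregate the per-component figures into system totals."""
    return {
        "cores": NODES * CORES_PER_NODE,
        "ram_tb": NODES * RAM_GB_PER_NODE / 1024,
        "spindles": DISK_TRAYS * DISKS_PER_TRAY,
        "write_cache_gb": DISK_TRAYS * WRITE_CACHE_GB_PER_TRAY,
        "read_cache_tb": READ_CACHE_GB / 1024,
    }


def implied_minutes(total_tb, gb_per_sec):
    """Minutes needed to move total_tb at a sustained gb_per_sec.

    Assumption: 1 TB = 1000 GB and the quoted rate is an average over the run.
    """
    return total_tb * 1000 / gb_per_sec / 60


if __name__ == "__main__":
    for name, value in totals().items():
        print(f"{name}: {value}")
    print(f"implied I/O time: {implied_minutes(TOTAL_IO_TB, THROUGHPUT_GB_PER_SEC):.1f} min")
```

The totals reproduce the quoted 192 cores, 1.5 TB RAM, 160 spindles, 576 GB write cache, and 2 TB read cache; under the stated assumptions, 5.7 TB at 3.7 GB/sec corresponds to roughly 26 minutes of sustained I/O.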
• SAS was one of the 1st major ISVs to announce support
• Tune in to the SAS audio interview on the 25APR Solaris Forum
Solaris 11
• SAS 9.3 Foundation: Supported!
  System Requirements for SAS 9.3 Foundation for Solaris: “SAS is supported on Solaris 10 Update 8 and Solaris 11 and higher”
• SAS 9.3 Enterprise Business Intelligence: Supported!
  Current Baseline: Oracle WebLogic 10.3.3, JDK 1.6.0_21
  Solaris 11 → WebLogic: SAS Third-Party Software Requirements – Baseline and Higher Support
  WebLogic 10.3.5 with JDK 1.6.0_26: All supported!
Solaris 11 – SAS 9.3 Support
Cloud Perspectives
On Clouds … we could talk about
Engineered HW Solutions Fully Integrated
Performance focussed
Virtualization
Exadata, Exalogic, SPARC SuperCluster
Enterprise Manager 12c: “c” is for Cloud
Management interface for creating and managing private clouds
Public cloud offerings, private cloud infrastructure
Engineered Systems & Appliances
General Purpose: SPARC SuperCluster
Purpose Built: Database Appliance, Exalytics, Big Data, Exalogic, Exadata
• Cloud Built In
ZFS Storage Appliances: Second Generation Systems
BEST DENSITY, ACTIVE-ACTIVE CONTROLLERS
BEST SCALABILITY, ACTIVE-ACTIVE CONTROLLERS
BEST VALUE, FULL SUITE OF DATA SERVICES
BEST FLEXIBILITY, SINGLE OR DUAL CONTROLLERS
NEW BENEFITS
• Best Density and Scale: Industry-leading density, scale up to 1PB for Consolidation
• Flash Everywhere and More Of It: Industry-leading flash capacity for Application Performance
• Doubled the Processing Power: Performance to drive enterprise Data Protection
STANDARD FEATURES (ALL MODELS)
• All Data Protocols: FC, iSCSI, IB, NFS, CIFS, WebDAV, etc.
• Advanced Data Services: Snap, dedup., compression, replication, etc.
CLIENTS AND APPLICATIONS (ALL MODELS)
• Oracle Solaris • Oracle Linux
• Oracle Database, Middleware, and Applications
• Oracle VM • VMware • Windows
• More than 50 business applications supported
Oracle “Cloud” Strategies
SAS Cloud Directions
SAS Solutions OnDemand (SSO) Overview
10+ year proven track record of providing outsourced, advanced analytical, mission-critical solutions:
• One of the fastest growing SAS Business Units
• 1000+ years of SAS experience on staff
Multiple lines of business representing more than 250 SAS solutions, hosted and supported for more than 200 customer sites with users from over 70 countries
SSO Enterprise Hosted Solutions involve huge amounts of highly regulated sensitive data from major drug, retail, financial, education, and government organizations.
Over 1PB (1,000 TB) of data under management
Two 10,000-square-foot server farms for our enterprise hosting, capable of providing 99%+ 7x24 Guarantee
Key Capabilities/Deliverables
• Decreased Time to Production for new customers - (no need to take Exadata down to expand HW or SW)
• Best of Breed:
  • Quality and quantity of tools for management, monitoring, development and diagnosis via Oracle Enterprise Manager
• Resource Management/Optimization through Exadata’s IO Resource Management and Database Resource Manager
• Best of Breed Security (HW to SW) capabilities through Oracle’s Advanced Security Option
• Business Continuity:
  • High Availability SLAs >99%
  • Superior backup, restore, and recovery
• Oracle DB License Consolidation - management efficiencies and savings
SAS OnDemand – Optimized Architecture
Diagram labels: SAS Users; Firewall; SAS Servers; External Company Data; Exadata Database Machine; IP over Ethernet
Towards Cloud Readiness
Small steps TODAY toward Cloud readiness TOMORROW
a SID for the GRID
Running on the GRID – you’re off to a good start…
• Grid environments allow for runtime “agility”: other people and policies determine where applications run, how they are provisioned and resourced, etc.
Utilize Checkpoint/Restart for long running SAS jobs
• Available since SAS 9.2
• Utilized by SAS Grid, SAS tools
• Can transparently add code support without affecting current application dynamics
• Can increase application availability immediately
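The idea behind step-level checkpoint/restart can be sketched independently of SAS. The toy model below is not SAS's implementation: it records, in a hypothetical checkpoint file, the number of the step about to execute, and on a restart it skips every step before the recorded one. All names here (`run_steps`, `checkpoint.json`) are invented for illustration.

```python
import json
import os
import tempfile


def run_steps(steps, checkpoint_path, restart=False):
    """Run callables in order, recording the step about to execute.

    On restart, steps numbered below the recorded checkpoint are skipped.
    Returns the list of step numbers that actually ran.
    """
    resume_at = 1
    if restart and os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            resume_at = json.load(fh)["step"]

    executed = []
    for number, step in enumerate(steps, start=1):
        if number < resume_at:
            continue                      # finished in an earlier run
        with open(checkpoint_path, "w") as fh:
            json.dump({"step": number}, fh)
        step()                            # may raise; checkpoint stays on disk
        executed.append(number)
    return executed


if __name__ == "__main__":
    state = {"first_run": True}

    def flaky():
        # Fails only on the first run, like a job hitting a transient problem.
        if state["first_run"]:
            raise RuntimeError("simulated failure in step 3")

    steps = [lambda: None, lambda: None, flaky, lambda: None]
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    try:
        run_steps(steps, path)
    except RuntimeError as err:
        print("first run stopped:", err)
    state["first_run"] = False
    print("restart ran steps:", run_steps(steps, path, restart=True))
```

In the demo, the first run stops in step 3 and the restart executes only steps 3 and 4, which is the same shape of behavior the SAS example on the following slides shows at its fifth step.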
Checkpoint / restart example
$ cat checkpoint.sas
/* first step */
data temp1; x=1; run;

/* second step */
proc print data=temp1; run;

/* global macro statement */
%let macval=123;

/* macro definition */
%macro doit(xyz);
data &xyz; &xyz=&macval; run;
proc print data=&xyz; run;
%mend;

/* macro invocation - creates 2 new steps */
%doit(first);

/* 5th step - force abend if first time invocation,
   resume here if restarting */
data _null_;
   if getoption('stepchkpt')='STEPCHKPT'
        then abort abend 999;
run;

/* 6th step - continue here after restarting, first should already exist */
data _null_; set first; put _all_;

/* ensure macro is redefined, 2 steps invoked */
%doit(again);

/* final step invoked */
data _null_; set temp1; put _all_;
$ cat run.sh
#!/bin/sh
SASDIR=/apps/sas_install/compute/SASFoundation/9.3
$SASDIR/sas checkpoint.sas -fullstimer -print ck1.lst -source -stepchkpt -log ck1.log -noworkinit -noworkterm -work WORK

$SASDIR/sas checkpoint.sas -steprestart -fullstimer -print ck2.lst -log ck2.log -noworkinit -noworkterm -work WORK
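A restart like the one in run.sh can also be automated. The sketch below is one hedged way to do it in Python: it builds the two command lines using only the options shown in run.sh, runs the checkpointed pass, and reruns with -steprestart if the first pass exits with an error. The function names, the injectable `runner`, and the failure threshold of 2 (an assumption that exit code 1 means warnings only) are illustrative and should be checked against your own environment.

```python
import subprocess

SAS = "/apps/sas_install/compute/SASFoundation/9.3/sas"  # path from run.sh
COMMON = ["-fullstimer", "-noworkinit", "-noworkterm", "-work", "WORK"]


def build_command(program, restart, log, lst):
    """Build the SAS command line for the first pass or the restart pass."""
    mode = ["-steprestart"] if restart else ["-source", "-stepchkpt"]
    return [SAS, program, *mode, *COMMON, "-print", lst, "-log", log]


def run_with_restart(program, runner=None, fail_at=2):
    """Run with checkpointing; rerun once in restart mode on failure.

    runner  -- callable taking an argument list and returning an exit code;
               defaults to subprocess.call (injectable so it can be stubbed)
    fail_at -- exit codes at or above this value count as failure (assumption)
    """
    runner = runner or subprocess.call
    rc = runner(build_command(program, False, "ck1.log", "ck1.lst"))
    if rc >= fail_at:
        rc = runner(build_command(program, True, "ck2.log", "ck2.lst"))
    return rc


if __name__ == "__main__":
    # Dry run: print the commands instead of invoking SAS.
    for restart, n in ((False, 1), (True, 2)):
        cmd = build_command("checkpoint.sas", restart, f"ck{n}.log", f"ck{n}.lst")
        print(" ".join(cmd))
```

Passing a stub as `runner` makes it possible to exercise the restart logic without a SAS installation. A real wrapper would normally address the cause of the failure before restarting; this sketch restarts unconditionally only to mirror the two-pass demo.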
Checkpoint / restart example – log 1
eecsas@oraclet4:~/check$ cat ck1.log
………
NOTE: Copyright (c) 2002-2010 by SAS Institute Inc., Cary, NC, USA.
….
NOTE: This session is executing on the SunOS 5.11 (SUN 64) platform.
……
NOTE: Begin CHECKPOINT execution mode.
NOTE: Checkpoint library: /home/eecsas/check/WORK/SAS_work108800001F2F_oraclet4.
NOTE: WORK library: /home/eecsas/check/WORK/SAS_work108800001F2F_oraclet4.
NOTE: CHECKPOINT 1.
……..
2 data temp1; x=1; run;
NOTE: The data set WORK.TEMP1 has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
……….
NOTE: CHECKPOINT 2.
5 proc print data=temp1; run;
……..
NOTE: CHECKPOINT 5.
21 data _null_;
22 if getoption('stepchkpt')='STEPCHKPT'
23 then abort abend 999;
24 run;
ERROR: Execution terminated by an ABORT statement at line 23 column 14, it specified the ABEND option.
_ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
Checkpoint / restart example – log 2
eecsas@oraclet4:~/check$ cat ck2.log
…
NOTE: SAS (r) Proprietary Software 9.3 (TS1M1) Licensed to MA5.41_BenchmarkEngagement, Site 70068123.
NOTE: This session is executing on the SunOS 5.11 (SUN 64) platform.
….
NOTE: Begin CHECKPOINT-RESTART(5) execution mode.
NOTE: Checkpoint library: /home/eecsas/check/WORK/SAS_workECE200001F2E_oraclet4.
…..
19 /* 5th step - force abend if first time invocation,
20 resume here if restarting */
NOTE: End CHECKPOINT-RESTART(5) draining and resume normal execution.
21 data _null_;
22 if getoption('stepchkpt')='STEPCHKPT'
23 then abort abend 999;
24 run;
Checkpoint / restart example – lst files
eecsas@oraclet4:~/check$ cat ck1.lst
The SAS System          03:44 Thursday, April 26, 2012   1
Obs    x
  1    1
The SAS System          03:44 Thursday, April 26, 2012   2
Obs    first
  1     123
eecsas@oraclet4:~/check$ cat ck2.lst
The SAS System          03:44 Thursday, April 26, 2012   1
Obs    again
  1     123
http://oracle.com/sas
Thank You