Post on 13-Jul-2020

Oracle and SAS: Clouds, Community, Collaboration, and Computing Creativity

Goodbye Cloudy with a Chance of Showers; Hello Cloud-eze and Clear Skies

Maureen Chew, Principal Software Engineer, Oracle Corporation

Gary Granito, Enterprise Architect, Oracle Corporation

Thomas Mendicino, Senior Manager, Data Management, GE Capital

Paul Kent, Vice President, Big Data, SAS

Agenda

SAS Big Data & Oracle (Paul Kent, SAS)

Leveraging Technology to Improve SAS Analytics Performance: Battling Change is Never Easy (Thomas Mendicino, GE Capital)

Oracle and SAS On Collaboration

Cloud Perspective

Towards Cloud “readiness” with your SAS programs…

Questions, Kindle Drawing

SAS Scoring Accelerator for Oracle and Beyond

Paul Kent, Vice President, Big Data, SAS

SAS & Oracle

Oracle is a top SAS Partner

SAS on Solaris+SPARC is a top tier platform for SAS Enterprise Deployments

T4 / SuperCluster, Solaris 11

Oracle Enterprise Linux (OEL) is RH compatible. SAS is supported on OEL

Announcing Scoring Accelerator Availability June 5th (2012)

Recent joint Exadata performance testing results

Scoring Scale Up on Exadata Full, Half and Quarter Rack

Expected Scalability: Full (32), Half(16), Qtr(8)

Performance even at the low end across all configs

[Chart] Scoring Accelerator - Exadata Full, Half, Quarter Rack

Rows/second, 80M rows, regression (higher is better)

X axis: Degree of Parallelism (1, 2, 4, 8, 16, 32). Y axis: rows/sec (0 to 300,000). Series: Full, Half, Quarter.

Data labels shown on the chart (rows/sec by degree of parallelism):

DOP 1: 11,433
DOP 2: 21,534
DOP 4: 43,478
DOP 8: 87,241
DOP 16: 173,160
DOP 32: 290,909
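The "linear scalability" claim can be checked against the chart's own data labels. The following is a minimal sketch, assuming the six labels belong to a single series with one value per degree of parallelism (DOP); the function name `scaling_table` is illustrative and not part of any SAS or Oracle tool.

```python
# Data labels read off the rows/second chart (80M rows, regression scoring).
# Assumption: the six labels form one series, one value per DOP.
dop = [1, 2, 4, 8, 16, 32]
rows_per_sec = [11_433, 21_534, 43_478, 87_241, 173_160, 290_909]


def scaling_table(dops, rates):
    """Return (dop, speedup, efficiency) tuples relative to the first entry."""
    base_rate_per_worker = rates[0] / dops[0]
    table = []
    for d, r in zip(dops, rates):
        speedup = r / rates[0]
        # Efficiency is the measured rate divided by the ideal linear rate.
        efficiency = r / (base_rate_per_worker * d)
        table.append((d, speedup, efficiency))
    return table


for d, s, e in scaling_table(dop, rows_per_sec):
    print(f"DOP {d:>2}: speedup {s:5.2f}x, efficiency {e:6.1%}")
```

Under that assumption, efficiency stays at roughly 94 to 95 percent through DOP 16 and drops to about 80 percent at DOP 32 (a speedup of about 25x), so the scaling is close to linear rather than exactly linear.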

Scoring Scale Up on Exadata Full, Half and Quarter Rack

Linear scalability as expected: Full 1→32, Half 1→16, Qtr 1→8

Full: ~2 hrs → 4.5 min

[Chart] Scoring Accelerator - Exadata (Full, Half, Quarter Rack)

Time, regression, 80M rows (lower is better)

Grouped by rack size (Full, Half, Quarter) and degree of parallelism (32, 16, 8, 4, 2, 1). Time axis: HH:MM:SS, 00:00:00 to 02:24:00.

Scoring Time at varying Degrees of Parallelism (DOP)

Linear Scalability as DOP increases

Consistent Full/Half/Qtr Times at similar DOP

Scoring Scale Up on Exadata Extra Large - 250M rows

Full rack with DOP=32

80M rows: 2 hrs → 4 min

250M rows: 6 hrs → 14 min

Scale up consistency: Time, Rows/Sec,

[Chart] Scoring - Scale up on Full Rack

Extra Large Table - Rows/Sec - 80M vs 250M

X axis: Degree of Parallelism (1, 2, 4, 8, 16, 32). Y axis: rows/sec (0 to 350,000). Series: 80M, 250M.

Data labels shown on the chart (rows/sec by degree of parallelism):

DOP 1: 11,086
DOP 2: 22,401
DOP 4: 44,068
DOP 8: 87,966
DOP 16: 170,765
DOP 32: 303,030
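The quoted elapsed times (about 2 hours down to about 4.5 minutes for 80M rows, 6 hours down to 14 minutes for 250M rows) can be cross-checked against the rows/second labels. This is a rough sketch, assuming the first chart's labels (11,433 and 290,909) are the 80M-row run, this chart's labels (11,086 and 303,030) are the 250M-row run, and the rate is constant over the run. The helper names are illustrative only.

```python
def elapsed_seconds(total_rows, rows_per_second):
    """Time to score total_rows at a constant rate."""
    return total_rows / rows_per_second


def pretty(seconds):
    """Format seconds as hours or minutes for quick comparison."""
    if seconds >= 3600:
        return f"{seconds / 3600:.1f} hr"
    return f"{seconds / 60:.1f} min"


# (label, rows, rows/sec at DOP 1, rows/sec at DOP 32), taken from the chart labels.
# Assumption: which series each label belongs to is inferred, not stated on the slide.
runs = [
    ("80M rows", 80_000_000, 11_433, 290_909),
    ("250M rows", 250_000_000, 11_086, 303_030),
]

for label, rows, rate_dop1, rate_dop32 in runs:
    before = pretty(elapsed_seconds(rows, rate_dop1))
    after = pretty(elapsed_seconds(rows, rate_dop32))
    print(f"{label}: {before} -> {after}")
```

Under those assumptions the arithmetic gives roughly 1.9 hr to 4.6 min and 6.3 hr to 13.8 min, which agrees with the rounded figures on the slides.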

Looking Ahead

HPA on Oracle: High Priority Focus

Leveraging Technology to Improve SAS Analytics Performance

Battling Change is Never Easy

Tom Mendicino

Manager, Information Delivery

GE Capital - Retail Finance

Current Environment

Retail Finance depends on 10+ Oracle 11g database instances on several independent Sun Fire E25K Servers to support 200+TB of operational data incorporating over 300 million accounts.

Serves over 600 users in risk, marketing, operations, finance, decision science, and fraud utilizing a wide range of SAS applications from simple reporting and portfolio analytics to advanced modeling and forecasting.

The SAS environment sits on a separate E25K with 144 cores spread across 5 domains with 60 TB of attached storage which includes /saswork.

It’s stable, but in need of an architecture refresh.

Future State - Database

Databases are currently being migrated to Oracle Exadata X-2 Servers (2 Full Racks)

Decision was based on POC performed between Oracle and another vendor.

In excess of 180,000 tests were conducted, showing order-of-magnitude improvements: hours to minutes, minutes to seconds.

Little to no impact on user queries – lift and load.

When pulling large SAS datasets down to storage, there is still an I/O delay but we still see 400% improvements.

Future State - SAS

The 144-core E25K will be replaced with 18 eight-core commodity servers in a SAS Grid-enabled architecture.

96 GB RAM per node

8 x 600 GB HDDs (2 for OS, 6 for /saswork)

Analytic Workload Management

• Ability to define Job Queues (Production, Normal, Business Critical, etc.)

• Jobs will queue based on queue priority, available compute capacity, and designated group consumption

• Jobs scheduled to run based on “fair share” scheduler so that no one group can consume all of the grid
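The queueing rules above can be illustrated with a toy model. The sketch below is not how SAS Grid Manager implements fair share; it is a hypothetical, simplified scheduler that ranks pending jobs by queue priority and then by how far each group is below its share. The class names, the numeric priorities, and the share values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    group: str
    queue: str
    slots: int = 1


@dataclass
class FairShareGrid:
    total_slots: int
    queue_priority: dict  # queue name -> priority, higher runs first (hypothetical values)
    group_share: dict     # group name -> relative share of the grid (hypothetical values)
    pending: list = field(default_factory=list)
    running: list = field(default_factory=list)

    def used_slots(self, group=None):
        """Slots in use, for the whole grid or for one group."""
        return sum(j.slots for j in self.running if group is None or j.group == group)

    def submit(self, job):
        self.pending.append(job)

    def _rank(self, job):
        # Higher queue priority first, then the group furthest below its share.
        usage_ratio = self.used_slots(job.group) / self.group_share[job.group]
        return (-self.queue_priority[job.queue], usage_ratio)

    def dispatch(self):
        """Start pending jobs while capacity remains; return the jobs started."""
        started = []
        while True:
            free = self.total_slots - self.used_slots()
            candidates = [j for j in self.pending if j.slots <= free]
            if not candidates:
                return started
            job = min(candidates, key=self._rank)  # ties keep submission order
            self.pending.remove(job)
            self.running.append(job)
            started.append(job)


grid = FairShareGrid(
    total_slots=4,
    queue_priority={"Business Critical": 3, "Production": 2, "Normal": 1},
    group_share={"risk": 1, "marketing": 1},
)
for i in range(1, 5):
    grid.submit(Job(f"risk-{i}", "risk", "Normal"))
for i in range(1, 3):
    grid.submit(Job(f"mkt-{i}", "marketing", "Normal"))

print([j.name for j in grid.dispatch()])
```

In this toy run the risk group submits four jobs before marketing submits any, yet the four available slots are split evenly between the two groups, which is the behavior the "no one group can consume all of the grid" bullet describes.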

Oracle and SAS: Clouds, Community, Collaboration, and Computing Creativity

Goodbye Cloudy with a Chance of Showers; Hello Cloud-eze and Clear Skies

Maureen Chew, Principal Software Engineer, Oracle Corporation

Gary Granito, Enterprise Architect, Oracle Corporation

On Collaboration

• Exadata, Exalogic, SPARC SuperCluster

• Early Access Programs

  • Oracle Solaris Studio 12.3 Platinum Beta

  • Solaris 11 - Early Preview Testing

• http://oracle.com/sas

SAS & Oracle Collaboration

Current Joint POCs (Proof of Concept) Underway

SPARC SuperCluster – SAS Grid, Marketing Automation, Enterprise Miner

SAS Grid with Sun ZFS Storage

SAS Scoring Accelerator on Exadata

SAS Grid on Oracle Exalogic and Sun ZFS Storage 7420 Appliance

½ Rack Oracle Exalogic Compute Appliance

16 compute nodes (Node 1 through Node 16); each compute node: 2 x 6 core 2.93GHz Xeon, 96GB RAM

Totals: 192 Intel cores, 1.5 TB RAM, Infiniband fabric

Internal ZFS Storage Appliance: 40TB, 2 mirrored disk trays, for SAS binaries and code

Infiniband Network Switch: Infiniband network attach, Infiniband network for NFS

Sun ZFS Storage 7420 Appliance

Head unit: 2 x 8 core 2.0GHz Xeon, 256 GB RAM, 4 x 512GB read cache, 4 Infiniband connections

8 disk trays (Disk Tray 1 through Disk Tray 8); each disk tray: 20 x 1 TB SATA2 disks, 4 x 18 GB write cache

Totals: 160 spindles, 2 TB read cache, 576 GB write cache, for /sasdata and /saswork

Test result: 96 SAS sort jobs, 5.7TB total I/O, 3.7GB/sec throughput
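The headline totals for this configuration follow from the per-node and per-tray figures, and the sort test numbers imply a rough run time. The sketch below works through that arithmetic. It assumes 1 TB = 1000 GB for the I/O volume and that 3.7 GB/sec is a sustained average rather than a peak; neither is stated on the slide.

```python
# Per-component figures as listed for the Exalogic half rack and the ZFS 7420.
COMPUTE_NODES = 16
CORES_PER_NODE = 2 * 6           # 2 x 6-core 2.93GHz Xeon
RAM_PER_NODE_GB = 96
DISK_TRAYS = 8
DISKS_PER_TRAY = 20
WRITE_CACHE_PER_TRAY_GB = 4 * 18
READ_CACHE_GB = 4 * 512

totals = {
    "cores": COMPUTE_NODES * CORES_PER_NODE,
    "ram_tb": COMPUTE_NODES * RAM_PER_NODE_GB / 1024,
    "spindles": DISK_TRAYS * DISKS_PER_TRAY,
    "write_cache_gb": DISK_TRAYS * WRITE_CACHE_PER_TRAY_GB,
    "read_cache_tb": READ_CACHE_GB / 1024,
}

# Sort test: 96 jobs, 5.7 TB total I/O at 3.7 GB/sec.
# Assumptions: 1 TB = 1000 GB, and the throughput is a sustained average.
SORT_JOBS = 96
TOTAL_IO_GB = 5.7 * 1000
THROUGHPUT_GB_PER_SEC = 3.7

elapsed_min = TOTAL_IO_GB / THROUGHPUT_GB_PER_SEC / 60
io_per_job_gb = TOTAL_IO_GB / SORT_JOBS

print(totals)
print(f"Estimated elapsed time: {elapsed_min:.1f} min")
print(f"Average I/O per sort job: {io_per_job_gb:.1f} GB")
```

The computed totals match the slide (192 cores, 1.5 TB RAM, 160 spindles, 576 GB write cache, 2 TB read cache). Under the stated assumptions the 96 sorts would take roughly 26 minutes at about 59 GB of I/O per job; if 3.7 GB/sec was a peak figure, the real run would have been longer.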

Solaris 11

• SAS was one of the 1st major ISVs to announce support

• Tune in to the SAS audio interview on the 25APR Solaris Forum

Solaris 11 - SAS 9.3 Support

• SAS 9.3 Foundation: Supported!

  System Requirements for SAS 9.3 Foundation for Solaris: “SAS is supported on Solaris 10 Update 8 and Solaris 11 and higher”

• SAS 9.3 Enterprise Business Intelligence: Supported!

  Current Baseline: Oracle WebLogic 10.3.3, JDK 1.6.0_21

  Solaris 11 → WebLogic: SAS Third-Party Software Requirements, Baseline and Higher Support

  WebLogic 10.3.5 with JDK 1.6.0_26: All supported!

Cloud Perspectives

On Clouds … we could talk about

Engineered HW Solutions Fully Integrated

Performance focussed

Virtualization

Exadata, Exalogic, SPARC SuperCluster

Enterprise Manager 12c “c” is for Cloud

Management interface for creating and managing private clouds

Public cloud offerings, private cloud infrastructure

Engineered Systems & Appliances

General Purpose: SPARC SuperCluster

Purpose Built: Database Appliance, Exalytics, Big Data, Exalogic, Exadata

• Cloud Built In

ZFS Storage Appliances: Second Generation Systems

BEST FLEXIBILITY SINGLE OR DUAL CONTROLLERS

BEST DENSITY ACTIVE-ACTIVE CONTROLLERS

BEST SCALABILITY ACTIVE-ACTIVE CONTROLLERS

BEST VALUE FULL SUITE OF DATA SERVICES

NEW BENEFITS

Best Density and Scale: Industry-leading density, scale up to 1PB for Consolidation

Flash Everywhere and More Of It: Industry-leading flash capacity for Application Performance

Doubled the Processing Power: Performance to drive enterprise Data Protection

STANDARD FEATURES (ALL MODELS)

All Data Protocols: FC, iSCSI, IB, NFS, CIFS, WebDAV, etc.

Advanced Data Services: Snap, dedup., compression, replication, etc.

CLIENTS AND APPLICATIONS (ALL MODELS)

Oracle Solaris • Oracle Linux

Oracle Database, Middleware, and Applications

Oracle VM • VMware • Windows

More than 50 business applications supported

Oracle “Cloud” Strategies

SAS Cloud Directions

SAS Solutions OnDemand (SSO) Overview

10+ year proven track record of providing outsourced advanced analytical mission-critical solutions:

One of the fastest growing SAS Business Units

1000+ years of SAS experience on staff

Multiple lines of business representing more than 250 SAS solutions, hosted and supported for more than 200 customer sites with users from over 70 countries:

SSO Enterprise Hosted Solutions involve huge amounts of highly regulated sensitive data from major drug, retail, financial, education, and government organizations.

Over 1PB (1,000 TB) of data under management

Two 10,000-square-foot server farms for our enterprise hosting, capable of providing 99%+ 7x24 Guarantee

Key Capabilities/Deliverables

• Decreased Time to Production for new customers (no need to take Exadata down to expand HW or SW)

• Best of Breed:

  • Quality and quantity of tools for management, monitoring, development and diagnosis via Oracle Enterprise Manager

• Resource Management/Optimization through Exadata’s IO Resource Management and Database Resource Manager

• Best of Breed Security (HW to SW) capabilities through Oracle’s Advanced Security Option

• Business Continuity:

  • High Availability SLAs >99%

  • Superior backup, restore, and recovery

• Oracle DB License Consolidation - management efficiencies and savings

SAS OnDemand – Optimized Architecture

[Diagram] Components shown: SAS Users, SAS Servers, External Company Data, Firewall, Exadata Database Machine. Network label: IP over Ethernet.

Towards Cloud Readiness

Small steps TODAY toward Cloud readiness TOMORROW

a SID for the GRID Running on the GRID – you’re off to a good start…

Grid environments allow for runtime “agility” – other people & policies determine where applications run, how provisioned and resourced, etc

Utilize Checkpoint/Restart for long-running SAS jobs

• Available since SAS 9.2

• Utilized by SAS Grid, SAS tools

• Can transparently add code support without affecting current application dynamics

• Can increase application availability immediately

Checkpoint / restart example

$ cat checkpoint.sas
/* first step */
data temp1; x=1; run;

/* second step */
proc print data=temp1; run;

/* global macro statement */
%let macval=123;

/* macro definition */
%macro doit(xyz);
data &xyz; &xyz=&macval; run;
proc print data=&xyz; run;
%mend;

/* macro invocation - creates 2 new steps */
%doit(first);

/* 5th step - force abend if first time invocation,
   resume here if restarting */
data _null_;
  if getoption('stepchkpt')='STEPCHKPT'
     then abort abend 999;
run;

/* 6th step - continue here after restarting, first should already exist */
data _null_; set first; put _all_;

/* ensure macro is redefined, 2 steps invoked */
%doit(again);

/* final step invoked */
data _null_; set temp1; put _all_;

$ cat run.sh
#!/bin/sh
SASDIR=/apps/sas_install/compute/SASFoundation/9.3
$SASDIR/sas checkpoint.sas -fullstimer -print ck1.lst -source -stepchkpt -log ck1.log -noworkinit -noworkterm -work WORK

$SASDIR/sas checkpoint.sas -steprestart -fullstimer -print ck2.lst -log ck2.log -noworkinit -noworkterm -work WORK

Checkpoint / restart example – log 1

eecsas@oraclet4:~/check$ cat ck1.log
………
NOTE: Copyright (c) 2002-2010 by SAS Institute Inc., Cary, NC, USA.
….
NOTE: This session is executing on the SunOS 5.11 (SUN 64) platform.
……
NOTE: Begin CHECKPOINT execution mode.
NOTE: Checkpoint library: /home/eecsas/check/WORK/SAS_work108800001F2F_oraclet4.
NOTE: WORK library: /home/eecsas/check/WORK/SAS_work108800001F2F_oraclet4.
NOTE: CHECKPOINT 1.
……..
2 data temp1; x=1; run;
NOTE: The data set WORK.TEMP1 has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
……….
NOTE: CHECKPOINT 2.
5 proc print data=temp1; run;
……..
NOTE: CHECKPOINT 5.
21 data _null_;
22 if getoption('stepchkpt')='STEPCHKPT'
23 then abort abend 999;
24 run;
ERROR: Execution terminated by an ABORT statement at line 23 column 14, it specified the ABEND option.
_ERROR_=1 _N_=1
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):

Checkpoint / restart example – log 2

eecsas@oraclet4:~/check$ cat ck2.log
…
NOTE: SAS (r) Proprietary Software 9.3 (TS1M1)
      Licensed to MA5.41_BenchmarkEngagement, Site 70068123.
NOTE: This session is executing on the SunOS 5.11 (SUN 64) platform.
….
NOTE: Begin CHECKPOINT-RESTART(5) execution mode.
NOTE: Checkpoint library: /home/eecsas/check/WORK/SAS_workECE200001F2E_oraclet4.
…..
19 /* 5th step - force abend if first time invocation,
20 resume here if restarting */

NOTE: End CHECKPOINT-RESTART(5) draining and resume normal execution.

21 data _null_;
22 if getoption('stepchkpt')='STEPCHKPT'
23 then abort abend 999;
24 run;

Checkpoint / restart example – lst files

eecsas@oraclet4:~/check$ cat ck1.lst
The SAS System          03:44 Thursday, April 26, 2012   1

Obs    x
  1    1

The SAS System          03:44 Thursday, April 26, 2012   2

Obs    first
  1      123

eecsas@oraclet4:~/check$ cat ck2.lst
The SAS System          03:44 Thursday, April 26, 2012   1

Obs    again
  1      123
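The two invocations in run.sh can be wrapped so that the restart only happens when the first run fails. The sketch below is one possible wrapper, not part of SAS: the function names `sas_command` and `run_with_restart` are hypothetical, the option flags are copied from run.sh, and treating any nonzero exit status as a failure is an assumption (a site may prefer to tolerate warning-level exit codes).

```python
import shlex
import subprocess

# Executable path taken from run.sh; adjust for the local install.
SAS_EXE = "/apps/sas_install/compute/SASFoundation/9.3/sas"


def sas_command(program, run_id, restart=False, work_dir="WORK"):
    """Build the argument list for a checkpointed run or a restart of it."""
    mode = ["-steprestart"] if restart else ["-source", "-stepchkpt"]
    return [
        SAS_EXE, program, *mode,
        "-fullstimer",
        "-print", f"{run_id}.lst",
        "-log", f"{run_id}.log",
        # Leave the WORK library in place across invocations, as run.sh does.
        "-noworkinit", "-noworkterm", "-work", work_dir,
    ]


def run_with_restart(program, runner=subprocess.call):
    """Run in checkpoint mode; on a nonzero exit status, rerun in restart mode.

    Assumption: any nonzero exit status counts as a failure worth restarting.
    """
    status = runner(sas_command(program, "ck1"))
    if status == 0:
        return status
    return runner(sas_command(program, "ck2", restart=True))


# Show the two command lines without executing SAS.
print(shlex.join(sas_command("checkpoint.sas", "ck1")))
print(shlex.join(sas_command("checkpoint.sas", "ck2", restart=True)))
```

Passing the runner as a parameter keeps the wrapper easy to try without a SAS install: a stub that records the commands and returns a chosen exit status is enough to exercise both paths.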

http://oracle.com/sas

Thank You
