
Spark the future.

May 4 – 8, 2015, Chicago, IL

Transform your mission-critical environment: Superdome X + Windows Server/SQL Server

Maurice De Vidts, Laurence Grizaud, Ken Pomaranski

BRK2584

Agenda
• Introduction
• HP Integrity Superdome X overview
• HP Integrity Superdome X architecture
• Scalability and flexibility with Microsoft Windows Server 2012 R2
• Enterprise Class Server with built-in reliability
• Leveraging SQL Server on HP Superdome X

The most exciting shifts of our time are underway
Changing the world we live in
• Healthcare is proactive
• Airlines go social
• Manufacturers mass customize
• Time to revenue is critical
• Decisions must be rapid
• Change is constant
• Business happens anywhere

• Cloud: 2-fold growth in next 2 years
• Security: $104B black market per year
• Mobility: 100B connected devices by 2020
• Big Data: 14.6 PB created per company

Creating new imperatives for IT
• Speed innovation
• Create flexible infrastructure
• Control energy and space costs

Speed, efficiency, business value

Reimagine the server.
Delivered as general-purpose, dedicated, physical infrastructure
• Siloed
• Technology centric
• Manual
Infrastructure is a cost center

Think Compute.
Dynamically aligns pools of resources with laser precision to business goals
Infrastructure is a service differentiator
• Software-defined and cloud-ready
• Workload optimized
• Converged

Compute enables efficient and effective IT services

Common customer challenges
Business Processing and Decision Support workloads

“We’re deploying new applications on Windows and need more reliability than we have today.”

“My scale-out solution has become too complex, and our management and networking costs are escalating.”

“I need more scalability and availability for our existing x86 applications.”

“We need to reduce our operational costs for mission-critical applications.”

“I’m not getting the x86 performance we require for our large database.”

Superdome X + Windows and SQL Server: A winning combination to meet your challenges today
• Redefine compute economics
• Boost business performance
• Increase competitive differentiation

Today’s challenges
• x86 performance not there for large database workloads
• x86 uptime not able to meet mission-critical SLAs
• 4-socket x86 scale-up not sufficient for the new generation of workloads
• Complex, high OPEX large scale-out environments

New with Superdome X
• Breakthrough x86 scalability and efficiencies
• Groundbreaking x86 performance
• Superior x86 uptime

The right compute for the right workload at the right economics … every time

What is Superdome X?
HP Integrity Superdome X, for your critical business processing and decision support workloads
• Power your most demanding workloads
• Support your largest enterprise applications
• Maximize the uptime of your critical x86 apps
• 16 sockets
• 12TB memory
• 7.5x performance of current HP 8-socket system
• 20x greater reliability with unique hardware partitions
• Certified
• New

Drive Real-Time Business with Real-Time Insights
SQL Server 2014
• Faster Insights (IN-MEMORY ANALYTICS): billions of rows per second with PowerPivot In-Memory for Excel; greater performance with In-Memory Analysis Services
• Faster Queries (IN-MEMORY DW): over 100x query speed and significant data compression with In-Memory ColumnStore
• Faster Transactions (IN-MEMORY OLTP): up to 30x faster transaction processing with In-Memory OLTP

Target workloads
Superdome X + Windows Server/SQL Server

Business Processing
• Enterprise Resource Planning (ERP)
• Customer Relationship Management (CRM)
• Online transaction processing (OLTP)
• Batch

Decision Support
• Data warehousing/data mart (Scale-up)
• Data analysis/data mining

Highest mission criticality
• Mega instance and in-memory OLTP
• UNIX migration
• Large scale workload consolidation

HP Integrity Superdome X BladeSystem Enclosure
Ideal for scale-up and consolidation

HP Superdome X – at a glance

Form Factor
• 2s Blade: full extended height
• Enclosure: 18U in standard HP 19” rack; standard power, airflow, cooling

Compute Blades & CPUs
• 2s Blade: 2 CPU sockets for Xeon E7 v2; low and high core count CPU SKUs
• Enclosure: up to eight 2s cell blades; 2-16 sockets; Xeon CPUs in one or many nPARs

Memory
• 2s Blade: 48 DIMM slots, 1.5TB per blade w/ 32GB DIMMs
• Enclosure: 12TB memory capacity w/ 32GB DIMMs

I/O
• 2s Blade: 2 LOM cards (fully configurable/customizable 10GbE Flex NICs); 3 mezzanine slots
• Enclosure: 16 LOM cards (configurable 10GbE Flex NICs); 24 mezzanine slots

RASUM
• 2s Blade: Mission-Critical RAS; iLO 4 management processor
• Enclosure: Mission-Critical RAS; SD2 based mission-critical OA

Partitioning and Virtualization
• 2s Blade: electrically isolated blades, can be grouped into nPARs through the flexible crossbar fabric
• Enclosure: electrically isolated nPARs; industry standard virtualization (Hyper-V, VMware)

Superdome X architecture
• Compute: 16 CPUs, 240 cores, 480 threads
• Memory: 384 DIMMs, 12 TB capacity
• I/O: 16 FlexLOMs, 24 Mezz I/O

Crossbar for Scaling
• Crossbar aggregate bandwidth (BW) > 1.2TB/s
• CPU/MEM aggregate BW > 1TB/s (measured)
• I/O aggregate BW ~ 800GB/s

Crossbar for Reliability
• End-to-end retry through multiple paths
• Electrical isolation of hard partitions (HP nPars) for ultimate flexibility and maintainability

Superdome X optimizes cache coherency

4-socket and 8-socket “glueless” designs use the E7 v2 “In memory snoop directory” (also known as directory mode) capability
• Removes snoop/snoop response processing from the critical path
• Reduces snoop traffic that consumes QPI interconnect bandwidth on 4S and larger systems

The external node controller (XNC2) chip isolates snoop traffic between blades in the nPartition, so SDX can use the new E7 v2 “Opportunistic Snoop Broadcast” (OSB) mode to improve performance
• Improves cache latency in many scenarios
• Reduced memory lookup translates to reduced memory latency and improved bandwidth usage

The result is an optimized cache coherency solution for Superdome X
• Intel Opportunistic Snoop Broadcast (OSB)
• Cache coherency enabled, even for enormous 16S partitions, without “snoop” traffic penalty
• Improves performance by reducing local memory access latency
• Increased performance when compared to Directory mode for NUMA optimized workloads
• Scalable and fast directory cache in the XNC2 ASIC for off-blade accesses

SQL Server 2014 Architecture
Perfect match for Superdome X hardware technology!

SQL Server tech pillars

Main-memory optimized
• Optimized for in-memory data
• Indexes (hash and range) exist only in memory
• No buffer pool
• Stream-based storage for durability

High concurrency
• Multiversion optimistic concurrency control with full ACID support
• Core engine uses lock-free algorithms
• No lock manager, latches, or spinlocks

T-SQL compiled to machine code
• T-SQL compiled to machine code via C code generator and Visual C compiler
• Invoking a procedure is just a DLL entry-point
• Aggressive optimizations at compile time

Maximize uptime
• SQL Server AlwaysOn
• Windows failover cluster
• Integrated HA and backup/restore
• Encryption
• Audit

Benefits
• High-performance data operations
• Frictionless scale-up
• Efficient business-logic processing
• Availability / Security

Drivers
• Hardware trends: steadily declining memory price, NVRAM; many-core processors; stalling CPU clock rate
• Business: TCO

Boost business outcomes with groundbreaking performance & scalability

Unprecedented x86 performance
HP Integrity Superdome X SPECcpu2006

8-socket Superdome X performance exceeds all x86 competition
• #1 8S x86 SPECfp_rate_base_2006 / #1 8S x86 SPECfp_rate2006
• #1 8S x86 SPECint_rate_base_2006 / #1 (tie) 8S x86 SPECint_rate2006
Overshadows the best from:
• Fujitsu PRIMEQUEST 2800E
• IBM System x3950 X6
• Hitachi BladeSymphony BS520X

16-socket Superdome X system performance even beats out “big iron”
• #1 16S SPECfp_rate_base_2006 / #2 16S SPECfp_rate2006
• #1 16S SPECint_rate_base_2006 / #1 16S SPECint_rate2006
Superdome X 240-core system wins over the best from:
• Fujitsu M10-4S (256 cores)
• IBM Power 780 (128 cores)

HPTC workloads
• Preliminary Graph 500 results already put 16S Superdome X as the #2 single-node solution in performance terms, as well as the #4 most power efficient, with tuning still left to do
• Early HPLinpack results at 5 TFlops out of a theoretical peak of 5.376 TFlops (93% efficiency)

SPEC and the benchmark name SPEC CPU are registered trademarks of the Standard Performance Evaluation Corporation (SPEC); see spec.org/ as of 12/1/2014
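The HPLinpack figures above can be checked with a few lines of arithmetic. This is a minimal sketch, not HP's method: the slide gives only the 240 cores, the 5 TFlops result, and the 5.376 TFlops peak, so the 2.8 GHz clock and the 8 double-precision FLOPs per core per cycle used here are assumptions chosen because they are consistent with the stated peak.

```python
CORES = 240            # from the slides: 16 sockets x 15 cores
CLOCK_GHZ = 2.8        # assumption, not stated on the slide
FLOPS_PER_CYCLE = 8    # assumption: double-precision FLOPs per core per cycle


def theoretical_peak_tflops(cores, clock_ghz, flops_per_cycle):
    """Peak = cores x clock (GHz) x FLOPs per cycle, converted from GFlops to TFlops."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


def efficiency(measured_tflops, peak_tflops):
    """Fraction of the theoretical peak that the benchmark run achieved."""
    return measured_tflops / peak_tflops


peak = theoretical_peak_tflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
print(f"peak = {peak:.3f} TFlops, efficiency = {efficiency(5.0, peak):.0%}")
```

Under these assumptions the sketch reproduces the peak and efficiency quoted on the slide; a different clock or FLOPs-per-cycle figure would change the peak accordingly.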

Full-up 16 Sockets – SQL Server 2014 demo

Superdome X 16S performance w/ SQL 2014
• 7.5x BI performance improvement over 8S DL980 G7
• 60 billion rows processed in 16 seconds
• Internal workload: DSS-focused query
• SDX: 16S/4TB RAM; DL980: 8S/4TB RAM
• 100% CPU-bound query

Benchmark work in progress… Stay tuned

Flexibility

The unique value of HP nPars
Hard partitions add resource and cost efficiencies
• Lower your TCO: optimize software costs by using HP nPars
• Maximize resource utilization: create different development, test, and production environments within a single enclosure
• Minimize downtime: take one partition offline while the others continue to run undisturbed
• 20x greater reliability than soft partitions
• Protect your data: electronic isolation provides a high degree of security between partitions

[Diagram: HP BladeSystem Superdome Enclosure. Eight blades (Blade 1/1 to Blade 1/8), each with two processors (1/n/0 and 1/n/1), memory, and blade I/O, are connected by the crossbar fabric and divided into three nPars, each running its own OS and application: nPar A (Dev system), nPar B (Test system), and nPar C (Prod system).]

Enterprise Class Server with built-in reliability

Experience superior x86 availability with Superdome X
• 60% downtime reduction: increase availability with end-to-end mission critical design
• Zero planned downtime: perform maintenance and updates online without application outage
• 95% reduction in memory outages: ensure continuity with HP Firmware First
• 20x greater reliability with HP nPars

Superdome X with Power-on-Once technology

Availability from components to complete solutions

Up to 100% application availability

Error identification, reporting, recovery

Infrastructure reliability

Common components

Hard partitioning (HP nPars)

‘Firmware First’ architecture

Error Analysis Engine

Enhanced Failover Clustering in R2

Insight Remote Support

Proactive services

Windows Server 2012 R2

Online optimization and repair

Fault-tolerant fabric

Integrated with SQL Server AlwaysOn at database level

The ideal foundation for your mission-critical x86 environment

Benefit from proven reliability, availability, and serviceability (RAS)

Superdome X

Extending the proven HP Integrity Superdome 2 mission-critical RAS features

HP fault management
• Diagnostics
• Error analysis engine
• True ‘One Stop’ fault management

Self healing
• Deconfiguration (core, DIMM and blade)
• Runtime deactivation (DIMM, I/O, and fabric)

Memory RAS
• Proactive memory scrubbing
• Enhanced DDDC+1

Platform RAS
• Clock redundancy
• Fault-tolerant crossbar fabric

Partitioning/error isolation
• Crossbar and hard partitions

Serviceability
• Redundant, hot swap: power supplies and fans, I/O switches, HP Onboard Administrator modules

HP firmware
• PCIe Live Error Recovery (LER)
• Advanced error reporting
• Viral error containment

HP hardware
• Advanced memory error recovery
• Corrupt data containment
• LER containment

OS level RAS

Processor RAS
• Processor interconnect (CRC)
• Advanced MCA recovery

Superdome X RAS features begin where most commodity x86 servers leave off

End-to-end mission-critical design

Protect your mission-critical data on HP Superdome X
Superdome X protects, analyzes the evidence, recovers: results in 60% downtime reduction

Generic x86: weak containment, no critical analysis
• MCA detects the error
• OS crashes, attempts reboot, no critical analysis
• Bad data may end up in storage

MC x86: strong containment, deep analysis
• MCA detects the error
• Bad data contained immediately
• Collect evidence
• Critical analysis
• Self-heal and repair
• OS recovery

Data Integrity

Results in fewer memory, I/O, and processor based outages
• HP Advanced Error Recovery: recovery from uncorrectable processor, cache and memory errors during execution
• HP Memory Quarantine: recovery from uncorrectable memory errors which may cause a system crash

HP Advanced Error Recovery
1. Uncorrectable error detected in processor, cache or memory under execution pipeline
2. Firmware informs OS, hypervisor and end-application
3. OS, hypervisor and end-application initiate recovery action: thread, process, VM, or application killed/restarted
4. System keeps running and prevents crash

HP Memory Quarantine
1. MCA Recovery detects uncorrectable memory error
2. HP Memory Quarantine tags memory location as bad and sends address to OS/hypervisor
3. OS/hypervisor decides how to handle recovery
4. OS/hypervisor blocks use of bad memory location

[Diagram: error sources along the execution path: processor core, L1 cache, L2 cache, uncore, and DRAM.]

HP Advanced Error Recovery and HP Memory Quarantine require recovery awareness in OS, hypervisor and end-application

Handling uncorrectable memory errors DEMO

Leveraging SQL Server on HP Superdome X

Windows Server 2012 R2 has a 4TB RAM Limit

Maximum configuration

                                             16 socket    8 socket    4 socket   2 socket
Processor cores (15 cores per socket)        240          120         60         30
Logical processors (Hyper-Threading on)      480          240         120        60
Memory capacity (current supported limit)    12TB (4TB)   6TB (4TB)   3TB        1.5TB
Mezzanine slots (current supported limit)    24 (16)      12 (8)      6 (4)      3 (2)
LOM                                          16           8           4          2
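The maximum-configuration figures follow from the per-blade numbers given earlier in the deck (2 sockets, 15 cores per socket, 1.5TB, 3 mezzanine slots and 2 LOMs per blade) together with the 4TB Windows Server 2012 R2 memory limit. The sketch below derives one column at a time. The names `NparLimits` and `npar_limits` are hypothetical, and the figure of 2 supported mezzanine slots per blade is an inference from the supported limits shown in parentheses, not something the slides state directly.

```python
from dataclasses import dataclass

# Per-blade figures taken from the slides.
SOCKETS_PER_BLADE = 2
CORES_PER_SOCKET = 15
THREADS_PER_CORE = 2          # Hyper-Threading on
MEMORY_TB_PER_BLADE = 1.5     # 48 DIMM slots x 32 GB
MEZZ_SLOTS_PER_BLADE = 3
LOMS_PER_BLADE = 2
OS_MEMORY_LIMIT_TB = 4.0      # Windows Server 2012 R2 limit
# Assumption: the supported mezzanine limits (16 of 24, 8 of 12, ...)
# work out to 2 usable slots per blade.
SUPPORTED_MEZZ_PER_BLADE = 2


@dataclass(frozen=True)
class NparLimits:
    sockets: int
    cores: int
    logical_processors: int
    memory_tb: float
    supported_memory_tb: float
    mezz_slots: int
    supported_mezz_slots: int
    loms: int


def npar_limits(sockets: int) -> NparLimits:
    """Derive one column of the maximum-configuration table."""
    if sockets not in (2, 4, 8, 16):
        raise ValueError("expected 2, 4, 8, or 16 sockets")
    blades = sockets // SOCKETS_PER_BLADE
    cores = sockets * CORES_PER_SOCKET
    memory_tb = blades * MEMORY_TB_PER_BLADE
    return NparLimits(
        sockets=sockets,
        cores=cores,
        logical_processors=cores * THREADS_PER_CORE,
        memory_tb=memory_tb,
        supported_memory_tb=min(memory_tb, OS_MEMORY_LIMIT_TB),
        mezz_slots=blades * MEZZ_SLOTS_PER_BLADE,
        supported_mezz_slots=blades * SUPPORTED_MEZZ_PER_BLADE,
        loms=blades * LOMS_PER_BLADE,
    )


for sockets in (16, 8, 4, 2):
    print(npar_limits(sockets))
```

The only non-linear entry is memory: installed capacity scales with the blade count, while the supported figure is capped by the operating system limit.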

Superdome X and Microsoft Windows Server/SQL Server

Deploy your most demanding Business Processing and Decision Support workloads with confidence

Scale-up (SQL Server mega instance on Windows)
• Large OLTP – in-memory DB
• Scale-up DW – real-time DW
• Large SQL for SAP backend

Multi-workload consolidation, multi-instance (SQL Server instances 1, 2, 3 … “n” on one Windows installation)
• SQL Server consolidation

Multi-workload consolidation, multi-partition (SQL Server/Windows in each partition)
• SQL Server consolidation
• OLTP – in-memory DB
• BI in a box
• SAP in a box

Mission critical

Mixtures of consolidation types provide additional flexibility

Scale-up Reference Configuration

• Scale-up single instance DB
• OLTP workload
• Windows Server 2012 R2 and SQL Server 2014
• 8 Sockets - 4 TB
• 16 Sockets - 4 TB

Reference Configuration white paper in development

[Diagram: two configurations on HP Integrity Superdome X, each an nPar (server) running Windows Server with a SQL Server mega instance for an OLTP workload, backed by HP 3PAR StoreServ storage: one nPar with 16 sockets/4TB and one with 8 sockets/4TB.]

Mixed Workload Reference Architecture

• Mixed workload (DW/OLTP) in multi-partition mixed workload configuration
• Performance information from the Reference Config.

Config:
• Windows Server 2012 R2 and SQL Server 2014
• 2x 8S 4TB nPar
• OLTP instance HA failover
• In-Memory Database engine
• Data Warehouse instance

[Diagram: HP Integrity Superdome X with 2x nPar (server), each 8 sockets/4TB, on HP 3PAR StoreServ storage. nPar1 runs Windows Server with the active SQL Server mega instance (In-Memory OLTP, OLTP workload). nPar2 runs Windows Server with the SQL Server data warehouse instance (DW/BI workload) and the passive SQL Server mega instance. A Windows Failover Cluster spans nPar1 and nPar2.]

SQL Server 2014 - Platform Migration options

Preserve legacy architecture (where possible)
• One to one server to nPAR mapping
• Preserve topology

Pros
• Quick
• Lower risk

Cons
• Suboptimal use of resources
• Stranded resources

[Diagram: three nPars (nPar1, nPar2, nPar3), each running Windows Server with one SQL Server instance.]

SQL Server 2014 - Platform Migration (Cont)

Refactor multi-server architecture into fewer, larger nPARs than there were servers
• Many to many server to nPAR mapping
• Refactored topology

Pros
• Better performance
• Better resource utilization
• Better ROI

Cons
• Requires architecture modification
• Takes longer planning/design time
• Higher implementation risks

[Diagram: three nPars (nPar1, nPar2, nPar3), each running Windows Server with one SQL Server instance.]

SQL Server 2014 - HA

Windows Failover Clustering
• SAN storage
• Traditional SQL Server Failover Cluster Instance (FCI) support between hardware partitions (nPars)
• Windows Failover Cluster between two hardware partitions (nPars)

[Diagram: a Windows Failover Cluster across nPar1 and nPar2 on shared HP 3PAR storage. Each nPar runs Windows Server; one hosts the active SQL Server instance and the other the passive instance.]

SQL Server 2014 - HA

Availability groups
• Requires duplicate storage on SAN
• Useful for quick as-is migrations of disparate report servers using replicated data
• Can be quickly set up using 3PAR virtual copy technology

[Diagram: a Windows Failover Cluster across nPar1 and nPar2 on HP 3PAR storage, with active and passive SQL Server instances, plus nPar3 running Windows Server with a SQL Server instance on its own HP 3PAR storage as the read-only secondary of a SQL Server Availability Group.]

SQL Server 2014 – Resource Management
• SQL Server instance is NUMA aware
• By default, all processors and all memory are allocated to an instance
• NUMA node or individual processor affinity

SQL Server 2014 – Resource Management

Memory Resource Management
• Set static limits to DB instance and SSIS engine(s)

Example: nPar 8 Sockets – 4 TB
• OLTP instance
• SSAS (Tabular)
• SSAS (Multidimensional)

[Diagram: one nPar running Windows Server across CPU 0 to CPU 7, hosting a SQL Server instance (OLTP workload), a SQL Server Analysis Services Tabular instance (analytics), and a SQL Server Analysis Services Multidimensional instance (analytics), with memory allocations of 1920 GB, 968 GB, and 968 GB.]
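The static limits in the example above can be sanity-checked against the 4 TB nPar before they are applied. This is a minimal sketch: the slide lists the three allocations without saying which instance receives which, so the mapping below is an assumption, and `remaining_gb` and `to_mb` are hypothetical helper names.

```python
NPAR_MEMORY_GB = 4 * 1024  # the 4 TB nPar from the example

# Assumption: the slide shows 1920 GB, 968 GB, and 968 GB but does not
# label them; assigning the largest share to the OLTP instance is a guess.
budget_gb = {
    "SQL Server OLTP instance": 1920,
    "SSAS Tabular instance": 968,
    "SSAS Multidimensional instance": 968,
}


def remaining_gb(total_gb: int, budget: dict) -> int:
    """Return the memory left for the OS and other services, or raise if overcommitted."""
    allocated = sum(budget.values())
    if allocated > total_gb:
        raise ValueError(f"overcommitted by {allocated - total_gb} GB")
    return total_gb - allocated


def to_mb(gb: int) -> int:
    """Convert GB to MB, the unit many static memory settings expect."""
    return gb * 1024


print("left for OS and other services:", remaining_gb(NPAR_MEMORY_GB, budget_gb), "GB")
for name, gb in budget_gb.items():
    print(f"{name}: {gb} GB = {to_mb(gb)} MB")
```

With the three allocations from the slide, the budget leaves some headroom below 4 TB for the operating system, which is the point of setting static limits rather than letting each engine grow on demand.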

SQL Server 2014 – Resource Management

SQL Server Database NUMA affinity
• SQL Server Configuration Manager
• Database/application soft NUMA by ports

I/O Resource Management
• SQL 2014 Resource Governor (IOPS only)
• 3PAR Priority Optimization (QoS): includes latency targets and automatically throttles down lower-priority workloads

SQL Server 2014 Performance

7.5x BI performance increase over DL980G7 (16s to 8s)

2x OLTP increase over DL980G7 (8s to 8s)

Visit Myignite at http://myignite.microsoft.com or download and use the Ignite Mobile App with the QR code above.

Please evaluate this session
Your feedback is important to us!

© 2015 Microsoft Corporation. All rights reserved.

Backup

Scaling together to drive the most demanding workloads
Windows Server 2012 R2 scalability
• Ever evolving hardware support: Windows Server 2012 R2, 64 sockets, 4TB RAM
• NUMA: processors have their own set of memory; faster access (local vs remote)
• Processor groups: scheduling entity
• x2APIC: interrupt address scaling above 256 LPs; MSI-X distribution across logical processors
• Many applications benefit, for example SQL Server
• HP BL920s Gen8 Windows Drivers Bundle: HP drivers and software

[Diagram: a processor group (k-group) contains NUMA nodes (sockets); each NUMA node contains cores; each core contains two logical processors.]

Support for Windows Server 2012 R2

Hardware configuration

Hardware Partition (nPar)
• Minimum of 1 BL920s
• Support for 2, 4, 8 blades (2S, 4S, 8S, 16S) per nPar
• Maximum 4 TB RAM per nPar (Microsoft Windows Server 2012 R2 limit)
• At least one NIC per blade (Flex LOM - HP Ethernet 10Gb 2-port 560FLB Adapter)
• At least one Fibre Channel adapter per blade (HP QMH2672 16Gb FC HBA) for maximum flexibility

SAN Boot
• There is no internal disk

UEFI Support
• Support of larger drives (GPT)

NUMA – OVERVIEW

[Diagram: two BL920s Gen8 blades (#1 and #3) connected through the crossbar. Blade #1 holds NUMA node 0 and NUMA node 1; blade #3 holds NUMA node 2 and NUMA node 3. Each NUMA node is a CPU with its own memory.]

PROCESSOR GROUP RULES
• NUMA nodes cannot span Processor Groups: all the processors within a NUMA node must be in the same group.
• If a NUMA node has more than 64 processors, the OS splits it into smaller nodes.
• Legacy applications that do not understand Processor Groups run within a single processor group only. They have full access to all available memory but not to all processors.
• Windows assigns processes at startup to the different processor groups in round-robin fashion. From the user's perspective, the assignment seems random.
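The rule that a NUMA node never spans processor groups, combined with the limit of 64 logical processors per group, determines how many groups a partition ends up with. The sketch below illustrates this with a simple in-order packing. It is a simplified model under stated assumptions, not the algorithm Windows actually uses (which may also weigh node proximity), and `assign_groups` and `group_of` are hypothetical names.

```python
MAX_LPS_PER_GROUP = 64  # a Windows processor group holds at most 64 logical processors


def assign_groups(lps_per_node, max_per_group=MAX_LPS_PER_GROUP):
    """Pack NUMA nodes into processor groups in order, never splitting a node."""
    groups, current, used = [], [], 0
    for node, lps in enumerate(lps_per_node):
        if lps > max_per_group:
            # The slides say the OS splits such a node; that case is not modelled here.
            raise ValueError(f"NUMA node {node} does not fit in one group")
        if current and used + lps > max_per_group:
            groups.append(current)
            current, used = [], 0
        current.append(node)
        used += lps
    if current:
        groups.append(current)
    return groups


def group_of(node, groups):
    """Return the index of the processor group that holds a NUMA node."""
    for index, nodes in enumerate(groups):
        if node in nodes:
            return index
    raise KeyError(node)


# Four NUMA nodes of 32 logical processors each.
small = assign_groups([32] * 4)
# A full 16-socket nPar: 16 NUMA nodes of 30 logical processors each.
full = assign_groups([30] * 16)
print(small, group_of(3, small), len(full))
```

In this model a 16-socket partition with 30 logical processors per socket yields groups of two NUMA nodes (60 logical processors) each, because a third node would exceed the 64-processor limit. A legacy application confined to one group would therefore see only a fraction of the partition's processors.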

PROCESSOR GROUPS FOR APPLICATIONS
• To get information about processor groups, applications must use the new system APIs, which are available only on Windows Server 2008 R2 and later.
• An application starts in one Processor Group: its first thread runs in that Processor Group, and any threads it creates also run in that Processor Group.
• API calls that specify an affinity in relation to a Processor Group, using the new calls, allow threads to cross processor group boundaries. Use SetThreadGroupAffinity(), for example.
• DLLs may also have to be modified to support the Processor Group structures.
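A group-relative affinity is expressed as a group number plus a bit mask of logical processors within that group. The sketch below builds such a value in the layout of the Win32 GROUP_AFFINITY structure using ctypes. It only constructs the structure and does not call any Windows API, so it runs on any platform. The helper name `affinity_for` is hypothetical, and the use of `c_size_t` for the mask assumes a 64-bit build.

```python
import ctypes


class GROUP_AFFINITY(ctypes.Structure):
    """Mirror of the Win32 GROUP_AFFINITY layout (assumes a 64-bit process)."""
    _fields_ = [
        ("Mask", ctypes.c_size_t),           # KAFFINITY: one bit per logical processor
        ("Group", ctypes.c_ushort),          # WORD: processor group number
        ("Reserved", ctypes.c_ushort * 3),   # must stay zero
    ]


def affinity_for(group, logical_processors):
    """Build a GROUP_AFFINITY selecting the given processors within one group."""
    mask = 0
    for lp in logical_processors:
        if not 0 <= lp < 64:
            raise ValueError("a processor group holds at most 64 logical processors")
        mask |= 1 << lp
    return GROUP_AFFINITY(Mask=mask, Group=group)


# Select logical processors 0, 1, and 31 of processor group 1.
affinity = affinity_for(1, [0, 1, 31])
print(affinity.Group, hex(affinity.Mask))
```

On Windows, a structure like this is what a caller would pass by reference to SetThreadGroupAffinity() to move a thread into another group; making that call from Python through ctypes is left out here because it is platform specific.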

RUNNING NON PROCESSOR GROUP AWARE APPLICATIONS IN SPECIFIC PROCESSOR GROUPS
• In a CMD window, START can override the Processor Group assignment by using the /NODE option.
• The process is assigned to start in the Processor Group that the NUMA node is affiliated with. It is not restricted to the processors of that NUMA node; it can run on the whole Processor Group.

START /NODE 3 LegacyApp.Exe

Starts LegacyApp within the Processor Group that contains NUMA node 3.

• Similarly, services can be set to have a preferred NUMA node with the SC.exe command:

sc.exe preferrednode AnyService 3

PROCESSOR GROUPS /NODE

START /NODE 3 LegacyApp causes the application to start in Processor Group 1 and run on the processors of both NUMA node 2 and NUMA node 3.

Processor Group spanning NUMA Nodes

[Diagram: Processor Group 0 contains NUMA node 0 and NUMA node 1; Processor Group 1 contains NUMA node 2 and NUMA node 3. Each NUMA node exposes logical processors LP 0 to LP 31. One group is labeled as the starting processor group ("Starting PG").]

SQL Server 2014 Performance Superdome vs DL980 8 socket comparison x.y increase using

X cores Y cores disabled on Superdome

SQL Server 2014 - Superdome vs DL980 8 socket comparison x.y increase X cores vs Y cores