Virtualizing Databases Doing IT Right – The Sequel VAPP1318 Michael Corey, Ntirety - A Division of Hosting Jeff Szastak, VMware, Inc

Upload: vuongthu

Post on 24-Jun-2018



Jeff Szastak
MSIA, CISSP, VCP, MCSE, etc.
Manager, Systems Engineering
CTO Ambassador
VMware, Inc.

Microsoft Exchange & SQL virtualization BC/DR SME

@szastak

Blog contributor: blogs.vmware.com/apps www.virtualinsanity.com

Michael J Corey

Books Include:
• Virtualizing SQL Server with VMware: Doing IT Right
• Oracle Database 12c: Install, Configure & Maintain Like a Professional
• Oracle 11g - A Beginner’s Guide
• Oracle 10g - A Beginner’s Guide
• Oracle 9i - A Beginner's Guide
• SQL Server 7 Data Warehousing
• Oracle8i - Data Warehousing
• Oracle8i - A Beginner's Guide
• Oracle8 - Data Warehousing
• Oracle8 - Tuning
• Oracle8 - A Beginner's Guide
• Oracle - Data Warehousing
• Oracle - A Beginner's Guide
• Tuning Oracle

Key Past/Current Affiliations:
• Past President of the IOUG
• Founding Board, IOUG Virtualization SIG
• Past Member, IOUG Board of Directors
• Past Director of Education, IOUG
• Founder, Professional Association of SQL Server
• Talkin’ Cloud Top 200 Channel Partner Experts Cloud
• Past Member, Microsoft Data Warehouse Council
• Past Member, Oracle Educational Advisory Council
• Past Director of Conferences, IOUG Alive
• Executive Board, Massachusetts Robert H. Goddard Council on Science, Technology, Engineering & Mathematics

• Started working with Oracle Version 3.0
• Beta tested Oracle 5, 6, 6.2, 7, 8.x, 9.x, …
• Presented on technology & business topics from Brazil to Australia
• Worked with Oracle on UNIX, Linux, Windows, MVS, VM, VMS, …

Shameless Plug

Doing Something Different
• Presentation Covers Both Oracle & Microsoft SQL Server
• More & More DBAs are faced with maintaining both
• Many Issues faced are shared

“This is a Database on Virtualized Infrastructure Session, Principles Apply to All Databases”

Dial Tone – The New World Order

Why Customers Are Virtualizing Databases (Business Critical Applications)

VMware

Concise Set

Very Efficient Drivers

Focused Driver Set

Well-Vetted O/S

Hardware Resource

O/S Du Jour

Many Drivers

Many Versions

New Drivers Can Cause Issues

Why Your Company Cares: Virtualization is Strategic

Physical World (1:1)
• 1:1 relationship between applications and hardware
• Relevant cost metric = cost per server
• 8% - 12% utilization is typical

Virtual World (Many:1)
• Many:1 relationship between applications and hardware
• Relevant cost metric = cost per application
• 60% - 80% utilization is typical
• 60% reduction in CapEx
• 30% reduction in OpEx
• 80% reduction in energy

The New Norm

“Can You Say Right-Sizing”

Memory Hot Add / CPU Hot Plug

Reduction in CPU Utilization

Increased processing rate

Adding Memory

Oracle – Hot Plug vCPU

Oracle - Hot Add Memory
Oracle database memory parameters are defined at instance startup. You will have to restart the database to take advantage of added memory, unless you have set SGA_MAX_SIZE large enough in advance.

Caution: Shared Resource Environment!

Typically SGA_TARGET <= SGA_MAX_SIZE, or you could be wasting memory

http://www.vmware.com/files/pdf/solutions/oracle/Oracle_Databases_VMware_Workload_Characterization_Study.pdf

For the 1st Time, the Goal of Consistency and Standardization Can Be Achieved

“Any Resource, Any Server, At Any Time” in the (Pool)

The 10 Millionth Model T was produced on June 4, 1927

Trend Keeps Growing

Trigger Points When to Virtualize

Architecting for Performance:The Right Hypervisor

Is your database too “Big” to Virtualize?

Very Large ERP System
• 75+ application tiers – VMware/RHEL
• 8 TB database; 8.8 billion rows of data
• 52 million transactions per day
• 79K IOPS
• 40K blocks per second interconnect traffic
• 40,000+ named users
• 4,000+ peak concurrent users

Source EMC

“Yes This is Virtualized”

Performance Test Environment (Topology)


■ VMware vSphere 5.1, Red Hat Enterprise Linux (RHEL) 6.3

■ Oracle 11gR2 (11.2.0.3) Single Instance and RAC

■ 3PAR StoreServ 10400

■ 192 x 15K RPM Fibre Channel Disks

■ 32 x 150K RPM Solid State Disk (SSD)

■ ProLiant DL580 G7 (client)

■ Intel® Xeon® CPU X7560 @ 2.26 GHz (8 cores)

■ 128GB memory

■ ProLiant BL660c Gen8 - 4 sockets / 24 cores (database server)

■ Intel® Xeon® CPU E5-4610 @ 2.40 GHz (6 cores)

■ 64GB memory

■ HP Virtual Connect FlexFabric 10Gb/24-Port Module

Recent “HP” Performance Study – Choose Your Vendor DU-JOUR

Performance Results
• Virtualization has ~5% overhead compared to native
• The database tps on a virtual machine is 5% less than on the physical machine
• 2P represents 12 cores and 4P represents 24 cores
• For 100 users the delta is ~6%, and it increases to ~10% for 1700 users
• When the system gets busier, native starts to have a slightly larger advantage over virtualization

Performance Results - Continued
• Both virtual and native, by moving from 2P (12 cores) to 4P (24 cores)
  – The database tps increases by 40% to 50%
  – The CPU utilization drops from 80% to 60%
• For RAC, by moving from 2P (12 cores) to 4P (24 cores)
  – The database tps increases by 40% to 60%
  – The CPU utilization drops from 75% to 60%

“Who Architects a Database With Less than 5% Overhead - One Busy Day You're Done”

Mega vMotion RAC on vSphere Functional Stress Test

VMW, EMC, Cisco
Executed by “Principled Technologies”, 2013
WWW.principledtechnologies.com/Vmware/vMotion_oracle_rac_1013.pdf
3 RAC Nodes, vMotion on all 3 Nodes Simultaneously – Without any network disruption

Service Level Agreement / The DBA
Situation: Customer monitors critical medical equipment within a hospital. A SQL Server database is at the core of the system, and it is having huge performance problems.

“Failure is not an option”.

Solution: Need to take the server down and adjust a BIOS setting that is causing SQL Server to have access to only 50% of the available CPU.

Customer: There is never a time they can take the server down for 5 minutes.

Stand-Alone Instance – Had it been virtualized, the DBA would have had options

No Win - SLA
Yet this situation points to a bigger issue concerning management's expectations for the availability of the database and the physical infrastructure's ability to support those expectations.

Have The Conversation
• Get the resources you need to meet the expectation
• OR – Reset expectations concerning database uptime

Avoid Good Intention BIOS Setting
Check Power Management Settings
• Default on a lot of servers is a “Green” friendly setting
  – Saves energy when the server is inactive
  – Many times does not ramp up the CPU quickly, and in some cases not completely
• Avoid the Dozing setting
  – Slows CPU to half its speed

Proper setting for a server hosting a database is “High Performance”

BIOS Settings to Consider
If your processors support it:
• Enable “Turbo Mode”
• Enable “Hyper-threading”

Enable all hardware-assisted virtualization features in the BIOS.

Virtualizing Databases: Doing IT Right

Lessons Learned – Tier 1

“What Works in Tier‐2 (non‐production), will not always work with Tier‐1 (production)”


Doing It Right 1st Time: Very Conservative

Designed to Ensure You Avoid Common Traps & Pitfalls Associated with Production Databases being Virtualized

Starting Out Right

Doing It Right: Read Best Practices Guides
Read The Documentation From All Your Vendors……
VMware, Microsoft, Storage Vendor, Network Vendor….

Appendix of this deck

Professional Association of SQL Server

http://virtualization.sqlpass.org/
“Take Advantage of All Resources Available to You”

• “Oracle Performance Management with vCenter Operations Manager and Oracle Enterprise Manager Adapter”

• “Virtualizing Oracle 11gR2 RAC on VMware vSphere: Best Practices”
• “Virtualization Bootcamp: Optimizing Oracle Databases on VMware”

Sign-up for the NEW VMware SIG and gain access to content, webinars and networking opportunities

Blogs: Longwhiteclouds.com


http://vsphere-land.com/news/2014-top-vmware-virtualization-blog-voting-results.html?utm_content=bufferc62e1&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer

#13

Most Up To Date Information 

Installation
• Plan your SQL Server installation
  – SLAs, RPOs, RTOs
  – Baseline current workload, at least 1 business cycle
  – Baseline existing (workload) vSphere implementation
  – Estimated growth rates
  – I/O requirements (I/O per sec, throughput, latency)
  – Storage (disk type/speed, RAID, flash cache solution, etc.)
  – Software versions (vSphere, Windows, SQL)
  – Product keys
  – Licensing (may determine architecture)
  – Workload type (OLTP, Batch, Warehouse)
  – Accounts needed for installation / service accounts
  – High Availability strategy
  – Backup & Recovery strategy

“If you aim at nothing, you will hit it every time” – Zig Ziglar

Planning a High Availability Strategy
Requirements
• Recovery Time Objective (RTO)
  – What does 99.99% availability really mean?
• Recovery Point Objective (RPO)
  – Zero data lost?
• HA vs. DR requirements

Evaluating a technology
• What’s the cost of implementing the technology?
• What’s the complexity of implementing and managing the technology?
• What’s the downtime potential?
• What’s the data loss exposure?

Availability %            Downtime / Year   Downtime / Month*   Downtime / Week
"Two Nines" - 99%         3.65 Days         7.2 Hours           1.69 Hours
"Three Nines" - 99.9%     8.76 Hours        43.2 Minutes        10.1 Minutes
"Four Nines" - 99.99%     52.56 Minutes     4.32 Minutes        1.01 Minutes
"Five Nines" - 99.999%    5.26 Minutes      25.9 Seconds        6.06 Seconds

* Using a 30 day month
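The downtime figures above follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year, the 30-day month from the footnote, and a week taken as 1/52 of a year (the function and constant names are illustrative, not from the deck):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

# Period lengths in seconds. The week definition is an assumption chosen
# because it reproduces the rounded weekly figures in the table.
PERIODS = {
    "year": SECONDS_PER_YEAR,
    "month": 30 * 24 * 3600,
    "week": SECONDS_PER_YEAR / 52,
}


def downtime_seconds(availability_pct, period_seconds):
    """Allowed downtime (seconds) in a period at the given availability."""
    return period_seconds * (1 - availability_pct / 100)


for pct in (99, 99.9, 99.99, 99.999):
    year = downtime_seconds(pct, PERIODS["year"])
    month = downtime_seconds(pct, PERIODS["month"])
    week = downtime_seconds(pct, PERIODS["week"])
    print(f"{pct}%: {year / 3600:.2f} h/year, "
          f"{month / 60:.2f} min/month, {week / 60:.2f} min/week")
```

Note that the calculation says nothing about how the downtime is distributed, which is the point of the "3 days in a row" question that follows.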

Is Being Down 3 Days In A Row Ok?

You Had 99% Availability !

Baseline, Baseline, Baseline………

Why will making it virtual make it perform better? If so, how?
– New Hardware?
– Faster CPU?
– Faster Drives?

“There are no silver bullets”

“IT” Food Groups: What to Baseline

• Existing Physical Database Infrastructure
• Existing/Proposed vSphere Infrastructure

When You Baseline a Database, Make Sure the Sample Interval Is Frequent
• CPU, Memory, Disk (15 seconds or less)
• SQL Server T-SQL (1 minute)

“A lot can happen in a short amount of time”

“SAME Applies to Oracle! A lot can happen”

Oracle 12c Cloud Control/DB Express

The default thresholds for alerting in Cloud Control 12c are a good starting point

Migrations ‐ The Bigger Picture

Database As A Service – Road Map
Multiple Tier Approach
• Different levels for different DB placement
• Basic and Premium
  – Basic = Low utilization, test / dev DBs
  – Premium = Moderate to High utilization, production, high visibility
• Different underlying hardware
• Different SLAs, RTOs, RPOs and HA between tiers
Center of Excellence
• Assist with migrations, net new DBs and Capacity Management
  – Communication, no “throwing it over the wall”
• VMware/SAN/Network/DB teams to discuss DB migrations
  – Optional Teams: Security, Procurement

“Few Dedicated Personnel to each Level of Stack – End Users are taking advantage of automation”

Understanding Workload Resource Requirements

Basic performance characteristics (CPU, memory, IO, Network)
• Daily average resource usage
• Daily peak resource usage
• Daily peak hours
• Month-end, quarter-end, year-end peaks
Monitoring Tools
• Windows Perfmon (Example)
  – Processor(*)\% Processor Time
  – Process(sqlservr)\% Processor Time
  – SQLServer:Memory Manager\Total Server Memory (KB)
  – PhysicalDisk(*)\Disk Reads/sec, Disk Writes/sec
  – PhysicalDisk(*)\Disk Read Bytes/sec, Disk Write Bytes/sec
  – Network Interface(*)\Bytes Received/sec, Bytes Sent/sec
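The daily average, daily peak and peak hour listed above can be derived from any set of timestamped samples. A minimal sketch, assuming the samples have already been exported from the monitoring tool as (hour of day, CPU %) pairs; the data layout and the function name are illustrative assumptions:

```python
from statistics import mean


def summarize(samples):
    """Summarize one day of (hour_of_day, cpu_pct) samples.

    Returns the daily average, the daily peak, and the hour whose
    average utilization is highest.
    """
    values = [pct for _, pct in samples]
    by_hour = {}
    for hour, pct in samples:
        by_hour.setdefault(hour, []).append(pct)
    hourly_avg = {hour: mean(pcts) for hour, pcts in by_hour.items()}
    peak_hour = max(hourly_avg, key=hourly_avg.get)
    return {
        "daily_avg": mean(values),
        "daily_peak": max(values),
        "peak_hour": peak_hour,
    }


# Hypothetical samples: two at 09:00 and two at 02:00.
print(summarize([(9, 20.0), (9, 40.0), (2, 5.0), (2, 15.0)]))
```

The same summary would be repeated over month-end, quarter-end and year-end windows to capture the longer cycles.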

vSphere Environment

SQL Server Baseline – Suggested Values

SQL Server – Perfmon Counters

SQL Profiler Counters

These are suggested values - work with your DBAs to determine their KPIs

Migration – Baseline: Physical (disk) Pre

Counter                              Measures
LogicalDisk\Avg. Disk sec/Read       Read latency
LogicalDisk\Avg. Disk sec/Write      Write latency
LogicalDisk\Disk Read Bytes/sec      Read throughput
LogicalDisk\Disk Write Bytes/sec     Write throughput
LogicalDisk\Disk Reads/sec           Read IOPS
LogicalDisk\Disk Writes/sec          Write IOPS
LogicalDisk\Disk Transfers/sec       Combined IOPS

Migration  – Baseline: Virtual (disk) Post

Export the output to Excel and graph it using a variety of tools, such as Jonathan Kehayias’ PowerShell script.

Compare the results against the required IOPS as measured in the pre-deployment assessment.

Determine IOPS & Throughput

ORION (part of 11.2 now)

    sudo -u root ./orion_linux_x86-64 -run advanced -testname traxpoc -num_disks 20 -cache_size 8000 -duration 240 -matrix basic

SLOB (Silly Little Oracle Benchmark)

Calibrate I/O – Native to Oracle starting in 11.1

    SQL> set serveroutput on
    SQL> declare
           l_latency integer;
           l_iops    integer;
           l_mbps    integer;
         begin
           -- arguments: number of physical disks, maximum tolerated latency (ms),
           -- then the three OUT values
           dbms_resource_manager.calibrate_io
             (5, 10, l_iops, l_mbps, l_latency);
           dbms_output.put_line ('max_iops = '||l_iops);
           dbms_output.put_line ('latency = '||l_latency);
           dbms_output.put_line ('max_mbps = '||l_mbps);
         end;
         /
    max_iops = 5348
    latency = 10
    max_mbps = 641

Other Free Tools:
• Swingbench
• TPC Benchmark
• Custom scripts

How do you know for sure?
Oracle’s ($$$): Database Replay

Oracle Calibrate I/O Tip

Don’t Keep It a Secret
• DBAs – tell vSphere, Storage, and Network Admins your needs
  – Storage: (IOPS / throughput)
  – CPU: (MHz)
  – Memory: (Total GB)
  – Network: Bandwidth
  – Features (i.e.: Windows clustering)
  – Anticipated Growth Rates
  – Anticipated Activity
  – Other

“They Flunked Mind Reading”

Before You Install a Database on a New VM
• Do basic throughput testing of the IO subsystem prior to deploying a database
• Tools you can use
  – SQLIO / IOMETER
  – SLOB …

“Check It Before You Wreck it”-- Jeff Szastak

Should You P2V (Via Converter)?

Production Environments: Build “New” From Scratch – GI/GO

SQL Server ‐ Unattended Installation Options

• VMware vCAC
• Command Line
  – http://msdn.microsoft.com/en-us/library/ms144259
• Configuration File
  – http://msdn.microsoft.com/en-us/library/dd239405
• Sysprep
  – http://msdn.microsoft.com/en-us/library/ee210664
  – FYI – Available as of SQL Server 2008 R2

ORACLE‐ Unattended Installation Options

You At the VMworldParty While your Database is Provisioned

• VMware vCAC
• DBCA Silent Install
  – http://docs.oracle.com/cd/E11882_01/install.112/e24321/app_nonint.htm#CIHHFDGG
• RAC Silent Install
  – http://docs.oracle.com/cd/E11882_01/install.112/e24660/cripts.htm#RILIN1119

Phone‐A‐Friend

VMware has stated that it will take the ______ support call if a customer calls ______ Support and ______ Support is being difficult because the customer is running on VMware.

• Hint…….    “TSANET.ORG--- Hardware or Software”

Use SQL Server/Oracle recommended installation guidelines for the respective operating system – same as physical!

Physical World 1 :1 Virtual World

Many :1

Same As Physical

If your OS and database don’t know they are virtualized do you need to tell them? 

Did You Hear That?

Architecting For Performance: Design

Database Workload Types

OLTP
• Large number of small queries
• Sustained CPU utilization during working hours
• Sensitive to peak contention (slowdowns affect SLA)
• Generally write intensive
• May generate many chatty network round trips

Batch / ETL
• Typically runs during off-peak hours, low CPU utilization during normal working hours
• Can withstand peak contention, but sustained activity is key

DSS
• Small number of large queries
• CPU, memory, disk IO intensive
• Peaks during month end, quarter end, year end
• Can benefit from inter-query parallelism with large number of threads

OLTP vs. Batch Workloads

OLTP Workload (avg. 15%). What this says:
• Average 15% utilization
• Moderate sustained activity (around 28% during working hours, 8am-6pm)
• Minimum activity during non-working hours
• Peak utilization of 58%

Batch Workload (avg. 15%). What this says:
• Average 15% utilization
• Very quiet during the working day (less than 8% utilization)
• Heavy activity during 1am-4am, with avg. 73% and peak 95%

OLTP vs. Batch Workloads: What This Means
• Better Server Utilization
• Improved Consolidation Ratios
• Less Equipment To Patch, Service, Etc.
• Saves Money / Less Licensing

OLTP/Batch Combined Workload
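Why the two workloads consolidate well can be seen by adding their hourly profiles. A rough sketch with hypothetical hourly numbers loosely modeled on the figures above (about 28% for OLTP during 8am-6pm, about 73% for batch during 1am-4am); it assumes both percentages are relative to the same host capacity, which is a simplification:

```python
def hourly_profile(busy_hours, busy_pct, idle_pct):
    """Build a 24-entry list of CPU % values, one per hour of the day."""
    return [busy_pct if hour in busy_hours else idle_pct for hour in range(24)]


# Hypothetical profiles, not measured data.
oltp = hourly_profile(range(8, 18), busy_pct=28.0, idle_pct=5.0)
batch = hourly_profile(range(1, 4), busy_pct=73.0, idle_pct=6.0)

# Stacking the two workloads on one host adds them hour by hour.
combined = [o + b for o, b in zip(oltp, batch)]

print("sum of individual peaks:", max(oltp) + max(batch))
print("peak of combined profile:", max(combined))
print("combined average:", round(sum(combined) / 24, 1))
```

Because the peaks fall in different hours, the combined peak stays well below the sum of the individual peaks, which is what allows the better consolidation ratio.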

“Many Tier-2 were built for capacity not performance”

Separate development, test from production environments into different host clusters in the beginning

Where?/What Year Was The First Documented Use Of The Word “Nerd”

?

The Year Was 1950


Which occasion do North Americans eat the most food on average?

For those who Guessed

Wrong

Super Bowl Sunday

According to Wiki.answers.com

More VMs vs. More DB Instances

More VMs
• Better resource isolation
• Better security, patch management
• Better performance
• Less risk

Fewer VMs (More Instances)
• Less expensive in some licensing models
• No OS isolation (configuration, security, fault)
• No resource isolation
• Less segmentation (HIPAA, PCI, …)

Note: Both Work, Both Valid Strategies

General Rules of Thumb
• Resource utilization is the basis, but not everything
• Consider business, security, management, and other requirements
• Consider workload characteristics
  – OLTP workloads can be stacked up to a sustained utilization level
  – OLTP workloads with high usage during the day and batch workloads that run during off-peak hours mix well together
  – Batch/ETL workloads with different peak periods share well together
• Consider operational history, e.g. month end, quarter end
  – Additional VMs may be added to handle the peak period during month end, quarter end, and year end if scale out is a possibility
  – CPU, memory hot-add may be used to handle the peak workload
  – Reduce VM density, or add more hosts to the cluster

Architecting For Performance: Storage

Golden Rules

“Your Database is just an 

extension of your Storage”  

Michael Webster

“Your Storage is Just a Set

of containers for your

database”

Don Sullivan

Storage
• The fundamental relationship between consumption and supply has not changed
  – Spindle count and RAID configuration still rule
  – Host demand is an aggregate of VMs
• Factors that affect storage performance
  – Storage protocols
  – Storage configuration
  – VMFS configuration (separate LUNs, all on one LUN, does it even matter?)

VMFS

More I/O In Flight to the Array

Use VMFS vs. RDM
• VMFS Advantages
  – Negligible performance cost and superior functionality
  – Ability to take full advantage of future functionality enhancements (Future Awesomeness)
• Align VMFS on 64K boundaries
  – Automatic with vCenter
  – www.vmware.com/pdf/esx3_partition_align.pdf
• With vSphere 4.1
  – Use VAAI (Storage API)*
• With vSphere 5.x
  – Use VASA (Storage API)*

[Chart: VMFS Scalability. IOPS (0 to 8000) at 4K, 16K and 64K IO sizes for VMFS, RDM (virtual) and RDM (physical)]

* Work With Storage Vendor For Details

Thin Provisioning Perf / Block Zeroing
[Chart: I/O throughput in MB/s]

• Use Thick Eager Zeroed disks for best performance
  – At minimum for databases, logs, tempdb
• With lazy zeroing, maximum performance happens eventually, but zeroing needs to occur before you can get it
• Check with your storage vendor to see how they handle thin provisioning. Your mileage may vary
• A VAAI capable array can alter the config

http://www.vmware.com/pdf/vsp_4_thinprov_perf.pdf

Database Thick Provision Eager Zeroed Options
• Inflation
• Storage vMotion
• Windows
• vmkfstools
  – VMware KB 1011170
  – vmkfstools -D "My VM.vmdk"
    shows whether the disk is eager or zeroedthick
  – vmkfstools -k "My VM.vmdk"
    converts to eager zeroed

Optimizations – SQL Server: Disk Disk

• Instant file initialization – add the SQL Server service account to “Perform volume maintenance tasks” under User Rights Assignment in the server's Local Policies.

• By default, every time the database file needs to grow, the OS will zero-fill the new space and block writes until complete

• Adding the right requires a restart of the SQL Server service

• Removal requires a reboot

http://msdn.microsoft.com/en‐us/library/ms175935(v=SQL.105).aspx

SQL Server: System Databases
Tempdb
• Depending on workload, consider creating multiple tempdb files (see next slide)
• Microsoft recommends 1 data file per CPU
• Isolate tempdb from database and logs, and consider a dedicated vSCSI adapter
• Verify via testing

http://technet.microsoft.com/library/Cc966534

Oracle - No data file to CPU relationship

For those who want to be less conservative (for tempdb): SQL 2005, 50% of the number of cores, up to 8; SQL 2008+, a 25%-50% ratio of files to cores, usually up to 8.

The number of data files and tempdb files is important enough that Microsoft has two spots in the Top 10 SQL Server Storage best practices highlighting the number of data files per CPU.

TEMPDB
• 1 data file per CPU (a dual core counts as 2 CPUs)
• RAID 1+0 – write intensive

Data Files
• 1 data file per CPU
• 200 GB DB / 4 vCPU = 4 files @ 50 GB
• Make equal size / grow equally

http://technet.microsoft.com/en-us/library/cc966534.aspx
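The file-count guidance above is simple arithmetic. A small sketch of both the equal-size data file split and the "less conservative" tempdb ratios; the function names, the rounding up, and the minimum of one file are assumptions made for illustration:

```python
import math


def equal_data_files(db_size_gb, vcpus):
    """One data file per vCPU, all the same size (in GB)."""
    return [db_size_gb / vcpus] * vcpus


def tempdb_file_range(cores, sql_2008_or_later=True, cap=8):
    """Less conservative tempdb file count as a (low, high) range.

    SQL 2005: 50% of the cores. SQL 2008+: 25% to 50% of the cores.
    Both are capped at `cap` files; rounding up is an assumption.
    """
    high = min(cap, max(1, math.ceil(cores * 0.5)))
    if not sql_2008_or_later:
        return high, high
    low = min(cap, max(1, math.ceil(cores * 0.25)))
    return low, high


print(equal_data_files(200, 4))      # the 200 GB / 4 vCPU case above
print(tempdb_file_range(16))         # SQL 2008+ on 16 cores
print(tempdb_file_range(16, False))  # SQL 2005 on 16 cores
```

Whatever count is chosen, the deck's own advice still applies: verify via testing.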

Storage Paravirtual SCSI (PVSCSI) adapters

PVSCSI adapters are high-performance storage adapters that can result in greater throughput and lower CPU utilization.
• Up to 30% CPU Savings
• Up to 12% I/O Improvement

Paravirtual Adapter Knows It's Virtual

* Very Important to Use Most Current Version

Always Check Storage Vendors Best Practices

“>80% of the issues in a virtualized environment have to do with storage misconfigurations”

Storage – Putting It All Together
• Work with storage engineer, deliver realistic requirements early in the cycle
• Size for performance, not capacity
  – Large number of small drives, not small number of large drives
  – More / faster spindles are better for performance
• Understand the I/O requirements of different workloads
  – Transactional data vs. log vs. backup
  – OLTP vs. DSS

“Golden Rule: Capacity Versus Performance”

Storage – Putting It All Together
• Understand the path to the drives, i.e. throughput, multi-pathing
• Use eagerzeroedthick disk provisioning to avoid lazy zeroing
• Place swap file on separate dedicated drive on SAN, mitigate the impact of swapping with EFD (for high performance workload)
  – Can potentially slow down vMotions
• Follow SQL Server storage best practices
  – http://technet.microsoft.com/en-us/library/cc966534.aspx
• Work with your SAN vendor as well; they have best practices for running these workloads on your array

The Bottom Line

“>80% of performance problems with virtualization occur at the storage layer”
Now that you know, don’t let it happen to YOU

Architecting For Performance: Processor

vCPUs – Hyper‐Threading

A hyper-threading processor appears as two "logical" processors to the host operating system

Still only One Processor

vCPUs
• With databases, avoid overcommitment of processor resources until you have “actionable” performance data you can scale on (vCOPs)
• 1:1 ratio of physical cores to vCPUs
  – Out of the gate!

Hyper-Threaded CPU != Full vCPU
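A quick way to check the 1:1 starting point is to compare allocated vCPUs against physical cores, deliberately leaving hyper-threads out of the count. A minimal sketch; the function name and the example host are illustrative assumptions:

```python
def vcpu_to_core_ratio(vm_vcpus, sockets, cores_per_socket):
    """Ratio of allocated vCPUs to physical cores on one host.

    Hyper-threads are intentionally not counted as cores, since a
    hyper-thread does not deliver the power of a full core.
    """
    physical_cores = sockets * cores_per_socket
    return sum(vm_vcpus) / physical_cores


# Hypothetical host: 4 sockets x 6 cores, three 8-vCPU database VMs.
ratio = vcpu_to_core_ratio([8, 8, 8], sockets=4, cores_per_socket=6)
print("ratio:", ratio, "overcommitted" if ratio > 1 else "within 1:1")
```

A ratio above 1.0 means processor resources are overcommitted, which the guidance above suggests avoiding until there is performance data to justify it.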

Within The VM
In a virtual environment each vCPU is a single thread. There is no virtual equivalent of a hyper-thread.

• The guest O/S sees the number of allocated vCPUs
• A non-virtualized O/S would see the hyper-threads
• Oracle: latches, parallelism, … are based upon visible CPUs. Be careful how you set these things.

Processor – Putting It All Together 

• Leverage hardware-assisted virtualization (enabled by default)
• Consider avg. and peak utilization
• Be aware of hyper-threading; a hyper-thread does not provide the full power of a physical core
• Consider future growth of the system; sufficient headroom should be reserved
• In a high performance environment, consider adding hosts when avg. host CPU utilization exceeds 65%
• Consider increasing CPU resources if guest VM CPU utilization is above 65% on average
• Ensure Power Saving Features are “OFF”
• Use vCOPs for consumption & capacity

Architecting For Performance: Memory

Optimizations – SQL Server: Memory
Memory – Max / Min
Min is set to 0 by default
• Only change when the OS is requesting memory for other apps

Max is 2,147,483,647 MB by default (effectively unlimited)
• Should not equal or exceed total VM RAM; may lead to OS starvation
• Do not set to 0; may prevent SQL from starting
• If using “Hot Add”, remember to modify this setting

SQL Max Memory = VMMem – ThreadStack – OS Mem – VM Overhead
• ThreadStack = NumOfSQLThreads × ThreadStackSize
• ThreadStackSize = 1 MB on x86 | 2 MB on x64

http://msdn.microsoft.com/en‐us/library/ms178067.aspx
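The max memory formula above can be worked through as plain arithmetic. A minimal sketch; the input values (VM size, thread count, OS and overhead figures) are hypothetical and only the structure of the formula comes from the slide:

```python
def sql_max_server_memory_mb(vm_mem_mb, num_sql_threads, os_mem_mb,
                             vm_overhead_mb, thread_stack_mb=2):
    """SQL Max Memory = VMMem - ThreadStack - OS Mem - VM Overhead.

    thread_stack_mb defaults to 2 MB, the x64 value given on the slide.
    """
    thread_stack_total_mb = num_sql_threads * thread_stack_mb
    return vm_mem_mb - thread_stack_total_mb - os_mem_mb - vm_overhead_mb


# Hypothetical 32 GB VM: 576 worker threads, 4 GB for the OS,
# 512 MB of virtualization overhead.
print(sql_max_server_memory_mb(32768, 576, 4096, 512))
```

The result is a starting value for max server memory; it should be revisited whenever memory is hot-added to the VM.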

Max SQL Mem Example: Ntirety Rule**
• 2 Gig + Additional 1 Gig per 16 Gig Physical Memory

**In the context of the VM size or Physical Machine Size
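One reading of the rule above is that it gives the amount of memory to hold back from SQL Server, with the remainder becoming max server memory. A sketch under that assumption; counting only whole 16 GB increments is also an assumption, as are the function names:

```python
def held_back_gb(total_gb):
    """2 GB plus 1 GB for every full 16 GB of VM or physical memory."""
    return 2 + total_gb // 16


def max_sql_memory_gb(total_gb):
    """Memory left for SQL Server after applying the rule (assumed reading)."""
    return total_gb - held_back_gb(total_gb)


for size_gb in (16, 64, 128):
    print(size_gb, "GB total ->", max_sql_memory_gb(size_gb), "GB for SQL Server")
```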

Running Multiple Instances on Same VM
Two options, and “do nothing” is not one of them.

Option 1: Use max server memory
• Create max setting for each instance
• Give each instance memory proportional to expected workload / db size
• Do not exceed total RAM allocated to VM

Option 2: Use min server memory
• Create min settings for each instance
• Give each instance memory proportional to expected workload / db size
• The sum should be 1-2 GB less than RAM allocated to VM

Settings can be modified without having to restart the instances.

Max server memory
  Pro: When a new process or instance starts, memory is available immediately to fulfill the request
  Con: If instances are not running, the running instances cannot access the available RAM

Min server memory
  Pro: Running instances can leverage memory previously used by instances that are no longer running
  Con: When a new process or instance starts, running instances need to release memory

SQL Server: Memory


Lock Pages in Memory

■ This keeps SQL more responsive when paging occurs

■ SQL Server Lock Pages in Memory is ON in >= 32/64 bit Standard Edition (2012)

■ Account needs the “Lock pages in memory” user right

▪ Give it the RIGHTS

http://msdn.microsoft.com/en‐us/library/ms178067.aspx

Non-Uniform Memory Access (NUMA)
• NUMA avoids the performance hit when several processors attempt to address the same memory by providing separate memory for each NUMA node
• Speeds up processing
• NUMA nodes are specific to each processor model

Non-Uniform Memory Access (NUMA)
“All Processors Can Use All Memory”

• 4 Sockets, 6 Cores per Socket
• 4 NUMA Nodes
• 128 Gig RAM
• Each NUMA Node = 32 Gig RAM

“In this example, Optimal Performance: Each VM < 32GB*”

*CPU overhead needs to be accounted for. Minimal.

*vNUMA minimizes the impact when this happens
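The sizing in this example is a division followed by a comparison. A minimal sketch that checks whether a VM fits inside a single NUMA node, assuming memory is spread evenly across nodes; the function names are illustrative:

```python
def memory_per_numa_node_gb(total_ram_gb, numa_nodes):
    """Memory local to each NUMA node, assuming an even split."""
    return total_ram_gb / numa_nodes


def fits_in_one_node(vm_mem_gb, vm_vcpus, total_ram_gb, numa_nodes,
                     cores_per_node):
    """True if the VM's memory and vCPUs both fit within one NUMA node."""
    node_mem_gb = memory_per_numa_node_gb(total_ram_gb, numa_nodes)
    return vm_mem_gb < node_mem_gb and vm_vcpus <= cores_per_node


# The host from the example: 128 GB RAM, 4 NUMA nodes, 6 cores per node.
print(memory_per_numa_node_gb(128, 4))
print(fits_in_one_node(24, 6, 128, 4, 6))   # stays local to one node
print(fits_in_one_node(48, 6, 128, 4, 6))   # spans nodes, remote access likely
```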

Home Node ‐ NUMA

The home node for a virtual machine is first selected considering current CPU and memory load across all NUMA nodes.
Wide NUMA allows for the efficient use of multiple NUMA nodes.
Hot Add CPU disables vNUMA.
**** Properly Size Database / Don’t Need Hot Add CPU ****

Swapping Occurs in Two Places
1. Guest VM Swapping
2. ESXi Host Swapping

Swapping can slow down I/O performance of disks for other VM’s

Ballooning, Memory Compression, Swapping Slow You Down

Stating the Obvious

Ballooning
• Kicks in when the physical host is experiencing memory contention
• Balloon driver runs on each individual VM
• Communicates with the guest O/S to determine what is happening with memory
• Works with the server to reclaim pages that are considered least valuable by the guest OS

Exceeding Host Memory can lead to ballooning, Memory Compression or Swapping

Swapping can slow down I/O performance of disks for other VM’s

Don’t Shut Off Memory Ballooning

Ballooning is Your First Line of Defense

How Many VMs can I Put on Host?

As many whose active memory will fit in physical RAM, while leaving some room for memory spikes.

Total Memory Demand = Active memory (%ACTV) of VMs + Memory Overhead – Page sharing of VMs (De-Duping)

DE‐Duping = Transparent Page Sharing
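The demand formula can be turned into a simple placement check. A rough sketch, assuming active memory, per-VM overhead and shared pages have already been measured in GB; the 10% headroom default and the function names are illustrative assumptions:

```python
def total_memory_demand_gb(active_gb, overhead_gb, shared_gb):
    """Active memory of all VMs + memory overhead - page sharing savings."""
    return sum(active_gb) + sum(overhead_gb) - shared_gb


def fits_on_host(demand_gb, host_ram_gb, headroom_fraction=0.1):
    """True if demand fits in physical RAM while leaving room for spikes."""
    return demand_gb <= host_ram_gb * (1 - headroom_fraction)


# Hypothetical host with 64 GB RAM and three VMs.
demand = total_memory_demand_gb(
    active_gb=[20, 30, 10],
    overhead_gb=[0.5, 0.5, 0.5],
    shared_gb=4.0,
)
print(demand, fits_on_host(demand, host_ram_gb=64))
```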

Transparent Page Sharing is more effective the more similar the VMs are

“Put Like Operating Systems On Same Physical Host”

TPS – When It Kicks In• Before Ballooning• Always Running on preset cycle looking for opportunity to reclaim memory

• Very Low Overhead• Runs At HOST Level

Disable Unnecessary Foreground/Background Processes within Guest O/S
• Windows Example
  – Alerter, Automatic Updates, ClipBook, Error Reporting
  – Help & Support, Indexing, Messenger, NetMeeting
  – Remote Desktop
  – Once Established (Clone for reuse by VMware)

Keep VM Footprint as Small as Possible: NUMA, Shared Resource Pool

Memory Reservations
• VM is only allowed to power on if the CPU & memory reservation is available (strict admission)
• The amount of memory can be guaranteed even under heavy loads
• SET CPU/Not Guaranteed
• VMware HA Strict Admission Control – Settings can override this behavior

Reservations Rock!
• Set the appropriate reservations to guarantee physical memory for the VM
• In many cases, the configured size and reservation size could be the same

Oracle Approximate Memory Architecture

Set the memory reservation to SGA size plus OS.(Reservation & configured memory might be the same.)

[Diagram: VM Configured Memory, containing: Client sessions and context; SGA (DB buffer cache, and others); Instance (PMON, SMON, DBWR, LGWR, CKPT, others); Operating System]

Reservations and vswp

Setting a full memory reservation creates a 0.00 KB .vswp file

Large Pages/Huge Pages – Broken Down at Hypervisor Level, Not Guest O/S

“Large/Huge PAGES Do Not Normally SWAP”

In the cases where host memory is overcommitted, ESX may have to swap out pages. Since ESX will not swap out large pages, during host swapping, a large page will be broken into small pages. ESX tries to share those small pages using the pre-generated hashes before they are swapped out. The motivation of doing this is that the overhead of breaking a shared page is much smaller than the overhead of swapping in a page if the page is accessed again in the future.

http://kb.vmware.com/kb/1021095

Oracle – HugePages
Use /etc/security/limits.conf to set soft and hard limits.

    oracle soft nofile 131072
    oracle hard nofile 131072
    oracle soft nproc 131072
    oracle hard nproc 131072
    oracle soft core unlimited
    oracle hard core unlimited

    # -- The following entries need to be adjusted with HugePages settings
    # oracle soft memlock 50000000
    # oracle hard memlock 50000000

“HUGE PAGES Do Not Normally SWAP”

SQL Server In-Guest Memory Best Practices
Use large pages in the guest (start SQL Server with trace flag -T834)

Memory – Putting It All Together
• Do not overcommit memory for production, mission critical SQL Server VMs
• Set provisioned memory = reservation = SQL Server max server memory + OS memory + virtualization overhead
• Set provisioned memory = reservation = Oracle SGA + OS memory + virtualization overhead
• To avoid swapping, the memory limit should never be set below the provisioned size. Setting a memory limit is not recommended in general
• To avoid NUMA remote memory access, size VM memory equal to or less than the memory per NUMA node if possible

Architecting For Performance: Network

Jumbo Frames
• Jumbo frames are Ethernet frames with more than 1500 bytes of payload. Conventionally, jumbo frames can carry up to 9000 bytes of payload

Data Movers, Pick One

Enable Jumbo Frames
Check to see

Will Succeed

    ping -M do -s 8972 -c 2 rac01a-priv
    ping -M do -s 8972 -c 2 rac01b-priv
    ping -M do -s 8972 -c 2 rac02a-priv
    ping -M do -s 8972 -c 2 rac02b-priv
    PING rac01a (10.17.33.31) 8972(9000) bytes of data.
    8980 bytes from rac01a-priv (10.17.33.31): icmp_seq=1 ttl=64 time=0.017 ms
    8980 bytes from rac01a-priv (10.17.33.31): icmp_seq=2 ttl=64 time=0.018 ms

Will Fail

    ping -M do -s 8973 -c 2 rac01a-priv
    ping -M do -s 8973 -c 2 rac01b-priv
    ping -M do -s 8973 -c 2 rac02a-priv
    ping -M do -s 8973 -c 2 rac02b-priv

Make sure: switch support is enabled

9000 Bytes - 20 Bytes IP Header - 8 Bytes ICMP Header = 8972 Bytes of ping payload
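The 8972 value used in the ping test is the MTU minus the IPv4 and ICMP headers. A small sketch that derives the payload size and builds the Linux ping command shown earlier for any MTU; the helper names are illustrative, and the 20-byte header assumes IPv4 without options:

```python
IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8


def max_ping_payload(mtu):
    """Largest ICMP payload that fits in one unfragmented frame."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


def ping_command(host, mtu=9000, count=2):
    """Build the don't-fragment ping used to verify the MTU end to end."""
    return f"ping -M do -s {max_ping_payload(mtu)} -c {count} {host}"


print(max_ping_payload(9000))
print(ping_command("rac01a-priv"))
```

One byte more than this payload should fail when every hop, including the switches, is set to the same MTU.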

“8192/64 = 128”

SQL Server: Network
Network default packet size is 4,096
• If jumbo frames are available for the entire stack, set packet size to 8,192

Maximize Data Throughput for Network Applications
• Limit file system cache by OS
• NIC > File & Printer Sharing for Microsoft Networks
• Use Minimize Memory or Balance

http://blogs.msdn.com/b/johnhicks/archive/2008/03/03/sql‐server‐checklist.aspx

Jumbo Frames
“Cost of Reducing To 1500 Bytes Then Back Again is Very Expensive”

Splitting Is Bad

Network – Putting All Together

• Separate SQL workloads with chatty network traffic (Microsoft AlwaysOn: “Are you there?”) from those with chunky access onto different physical NICs
  – With 10GbE, do it at the VLAN level (4 x 1GbE NICs = 4Gb total vs. 2 x 10GbE NICs = 20Gb total)

• Separate traffic for vMotion, service console, and SQL Server at the physical NIC level
  – 10GbE: sufficient bandwidth at the host, but separate by VLAN

• Have 4 NICs per host to ensure performance and redundancy of the network (Virtualized Environment = Network Heavy)
  – Using 4 10GbE NICs is overkill from a redundancy perspective; 2 10GbE NICs are usually enough

• vSphere 5.0 introduced the ability to use more than 1 NIC for vMotion (more vMotions going at one time; added specifically for memory intensive applications, i.e. databases)

• Use VMXNET3 (VMware driver – reduces physical CPU utilization)

WSFC – Cluster Validation Wizard

Use this to validate support for your configuration
• Required by Microsoft Support as a condition of support for YOUR configuration

Run this before installing AAG (AlwaysOn Availability Group), and every time you make changes
• Save resulting html reports for reference

If running non-symmetrical storage, hotfixes may be required
• http://msdn.microsoft.com/en-us/library/ff878487(SQL.110).aspx#SystemReqsForAOAG

http://www.pearsonitcertification.com/store/virtualizing-oracle-databases-on-vsphere-9780133570182
http://www.pearsonitcertification.com/store/virtualizing-sql-server-with-vmware-doing-it-right-9780321927750

New RDBMS books from VMware Press

vmwarepress.com

Thank You
Michael [email protected]: http://michaelcorey.ntirety.com
http://www.dbtablog.com/
@Michael_Corey

Jeff Szastak
@Szastak

Fill out a survey
Every completed survey is entered into a drawing for a $25 VMware company store gift certificate
