Krzysztof Pietrzak
Infrastructure Architect
Boanerges S.C.

Exchange Server 2010 Planning and Sizing

Agenda

Hardware
• Understand hardware
• Understand storage

Exchange
• Understand Exchange sizing

Next week
• Get tools
• Real sizing examples
• Certified solution
• Case Study

A quick tour through the sizing process

• Exchange sizing and performance tuning can be complex
− Use planning & sizing toolset to simplify
− Take advantage of hardware advances to get the most out of Exchange

Business Requirements

• Information retention (size and duration)
− Limited by Restore SLA
− Regulatory requirements
− Growth, mergers etc.
• Backup/restore strategy
• Site level disaster recovery
− RPO/RTO
• Administrative model
• Consolidation and virtualization
• Power consumption

Large Mailboxes

• Large Mailbox = 1-10GB+
− “Aggregate Mailbox” = Primary MB + Archive MB (including Dumpsters)
− ~1 year of mail (minimum)
• Increase Knowledge Workers Productivity
− ALL Mail is accessible
− Reduced MB management
− Client Accessibility (Outlook/OWA/Mobile)
• Eliminate/Reduce PST’s
• Eliminate/Reduce 3rd Party Archives

TIME       ITEMS      MB SIZE (MB)
1 DAY      200        10
1 MONTH    4,000      200
1 YEAR     48,000     2,400
4 YEARS    192,000    9,600

*Very Heavy Profile = 150 Receive + 50 Send /Day, 50KB, no deletions
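The growth table can be reproduced with simple arithmetic. The sketch below is illustrative only: the function name is made up, and it assumes 20 mail days per month and decimal units (1 MB = 1000 KB), which is what the slide's figures imply (4,000 items / 200 items per day).

```python
def mailbox_growth(days, items_per_day=200, avg_item_kb=50):
    """Return (item_count, size_mb) after `days` mail days with no deletions."""
    items = items_per_day * days
    size_mb = items * avg_item_kb / 1000  # decimal KB -> MB, as the table implies
    return items, size_mb


# Assumption: 20 mail days per month, 12 months per year (inferred from the table).
PERIODS = {"1 day": 1, "1 month": 20, "1 year": 240, "4 years": 960}

for label, days in PERIODS.items():
    items, size_mb = mailbox_growth(days)
    print(f"{label:8} {items:>8,} items {size_mb:>8,.0f} MB")
```

Changing `items_per_day` or `avg_item_kb` shows how quickly a lighter or heavier profile reaches the 1-10GB range.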

Large Mailbox Challenges & Solutions: Client Experience

Challenges
• Outlook 2007 Performance (Cached Mode)
• Outlook 2007 (Online)/OWA Performance
− Items/folder limitations
− View Creation Performance
• Client Search Performance

Solutions
• Performance Improvements Office 2007 SP2 (KB953195)
− Updated OST sizing guidance (10GB)
• Utilize Ex2010 Archive Mailbox to reduce cached data
• Ex2010 Store/ESE Improvements
• Ex2010 Search Performance Improvements
− Real-time results views
− 2x increase in indexing performance

Large Mailbox Challenges & Solutions: Deployment/Operations

Challenges
• Fast Recovery Requirements (RTO)
• Long Backup Times
• High Storage Costs/IOPS/RAID Overhead
• Move Mailbox Downtime
• Database Maintenance
− Online Maintenance Duration
− DB Corruption (-1018) pain points
− DB re-seed performance hit on active copy

Solutions
• Backup passive copies
• Daily incremental/Weekly full backups
• DPM Express Full backups
• Ex2010 HA + Hold Policy is your backup
• Ex2010 HA
• Ex2010 Online Move Mailbox
• Ex2010 Store/ESE changes

Understand Exchange

Scale Out vs. Scale Up

• Scale out is a strategic choice made by the product group
• Scale out provides the following at low cost:
− Large mailboxes
− High availability
− Rich feature set
• Scaling up increases risk that an outage or failure affects more users
• Scaling up usually costs more, and can force feature decisions due to hardware choices
− Consider all factors in the equation, particularly storage

Scale Up Options

• Multiple Role Servers (“brick” deployments)
− Likely the best option for big hardware (> 2 socket) – best hardware utilization overall
− Be aware of recommendations for max processor & memory
• Virtualization
− Evaluate whether potential added complexity and monitoring challenges make this a win
• Single role
− Product not engineered for single role high scale (> 2 socket)
− Extreme caution necessary – validate carefully in a test lab

Supported vs. Recommended

• Supported usually means well tested
• Support statements define strict boundaries
• Recommendations define the “best case” or the state that we want our customers to achieve
• Understand risks of going outside of recommendations or support boundaries

Exchange 2010 Enterprise Topology

[Diagram: Exchange 2010 server roles in the enterprise network]

• Mailbox: Storage of mailbox items
• Edge Transport: Routing & AV/AS
• Unified Messaging: Voice mail & voice access
• Client Access: Client connectivity, Web services
• Hub Transport: Routing & Policy

Other diagram elements: External SMTP servers, Phone system (PBX or VOIP), Web browser, Outlook (remote user), Mobile phone, Outlook (local user), Line of business application

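Active/Active User Distribution

[Diagram: Primary Datacenter and Secondary Datacenter sharing one DAG (DAG1) with mailbox servers MBX-A to MBX-D, a CAS (CAS-Pri, CAS-Sec) and Hub Transport (HT2010) server per datacenter, a file share witness (DAG1 FSW), Outlook clients on both sides, and active copies in both datacenters]

Active/Active User Distribution

[Diagram: Primary Datacenter and Secondary Datacenter with two DAGs: DAG1 (MBX-A to MBX-D, DAG1 FSW) and DAG2 (MBX-E to MBX-H, DAG2 FSW), a CAS and Hub Transport (HT2010) server per datacenter, Outlook clients on both sides, and a mix of active and passive copies in each datacenter]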

                                   2 HA Copies   3+ HA Copies   2+ HA Copies    1 Lagged   2+ Lagged Copies
                                   (Total)       (Total)        / Datacenter    Copy       / Datacenter
Server in Primary Datacenter       RAID          RAID or JBOD   RAID or JBOD    RAID       RAID or JBOD
Servers in Secondary Datacenter    RAID          RAID           RAID or JBOD    RAID       RAID or JBOD

Storage

• Host each copy of a database on isolated storage

• Deployment on RAID or JBOD will be based on several factors
− Cost
− Hardware
− Number and type of copies
− Datacenter topology

Network

• Complete redundancy is preferred but not required

• Must have < 500 ms round-trip return latency between DAG members

• Replication is always from source to target
− If you have multiple passive copies in a remote datacenter, you will have multiple log streams from the active (one to each passive)

Network

• DAGs include compression for log shipping
− Controllable setting for the DAG
− Controlled at subnet level (default is inter-subnet)
− MSIT sees 30% compression
− Amount will vary for each customer based on message traffic
• SP1 adds Continuous Replication Block Mode
− Reduces the exposure to data loss on failure by replicating all log writes to passive copies in parallel with persisting them locally
− Only active when replication is up-to-date in terms of copying complete logs

Network

• If using iSCSI storage, configure DAG and cluster to ignore iSCSI networks
− Set-DatabaseAvailabilityGroupNetwork -Identity <DAG Network Name> -ReplicationEnabled:$false -IgnoreNetwork:$true
• Block cross-network communication to minimize heartbeat traffic

[Diagram: Subnets 1 to 4 with networks labelled M and R, showing which cross-network paths are Allowed and which are Blocked]

Namespaces

• Use Split DNS for Exchange hostnames used by clients
− Minimizes number of needed hostnames
− mail.contoso.com for Exchange connectivity on intranet and Internet
− mail.contoso.com has different IP addresses in intranet/Internet DNS

[Diagram: two sites, Moscow and St. Petersburg, each with CAS, HT, MBX and AD servers]

Moscow
• Internal DNS: Mail.contoso.com, Pop.contoso.com, Imap.contoso.com, Autodiscover.contoso.com, Smtp.contoso.com, Outlook.contoso.com
• External DNS: Mail.contoso.com, Pop.contoso.com, Imap.contoso.com, Autodiscover.contoso.com, Smtp.contoso.com
• ExternalURL = mail.contoso.com
• CAS Array = outlook.contoso.com
• OA endpoint = mail.contoso.com

St. Petersburg
• Internal DNS: Mail.sp.contoso.com, Pop.sp.contoso.com, Imap.sp.contoso.com, Smtp.sp.contoso.com, Outlook.sp.contoso.com
• External DNS: Mail.sp.contoso.com, Pop.sp.contoso.com, Imap.sp.contoso.com, Smtp.sp.contoso.com
• ExternalURL = mail.sp.contoso.com
• CAS Array = outlook.sp.contoso.com
• OA endpoint = mail.sp.contoso.com

Namespaces

RPC Client Access Server Array

• 1 RPC CAS Array per Active Directory site
• RPC CAS Array does not provide any load balancing: you need a load balancer
− FQDN of the RPC CAS Array must resolve internally to a load-balanced virtual IP address in DNS
• RPCClientAccessServer is a property of Mailbox database
− If database was created before array, then it is set to random CAS FQDN (or local machine if role co-location)
− If database is created after array, then it is set to the array FQDN
− Configure pre-existing databases to use RPC CAS Array
− Set-MailboxDatabase -RPCClientAccessServer

Role sizing

Sizing The Mailbox Role

• Proper sizing is key across all resources:

Resource   Key Considerations
Storage    I/O and capacity requirements
Memory     Database cache requirements (reduces I/O)
CPU        Required for RPC operations, content indexing, mailbox assistants, replication operations
Network    Log replication and RPC operations consume bandwidth

• Storage & memory are most critical – ensure proper sizing for performance, capacity, reliability
• In depth detail available at http://tinyurl.com/262cpg9

Mailbox Role Sizing Rules Of Thumb

• 2-socket platform best for performance and TCO
• User profile determines resource requirements for IOPS, memory, CPU
• Don’t forget about high availability (Database Availability Groups)!

Check out UNC01-INT, UNC02-HOL, UNC305 for details on DAG design

Mailbox Storage Sizing

• Storage must be sized for
− Performance (IOPS)
− Capacity (GB)
• Performance sizing based on user profile (message throughput)
• Capacity sized based on user mailbox size
− See TechNet for details on required overhead (whitespace, dumpster, etc.)
− Consider whether “thin provisioning” makes sense
• Design will either be performance- or capacity-bound
• IOPS guidance based on production observations and internal testing

Estimated IOPS Per-Mailbox

Messages Sent+Received     Database cache      Estimated IOPS:         Estimated IOPS:
per mailbox per day        per mailbox (MB)    Single database copy    Multiple database copies
(~75KB average size)
50                         3                   .060                    .050
100                        6                   .120                    .100
150                        9                   .180                    .150
200                        12                  .240                    .200
250                        15                  .300                    .250
300                        18                  .360                    .300
350                        21                  .420                    .350
400                        24                  .480                    .400
450                        27                  .540                    .450
500                        30                  .600                    .500
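As a rough illustration of how the per-mailbox table turns into a server-level estimate, the sketch below looks up a profile and multiplies by the mailbox count. The table values come from the slide; the function name and the decimal MB-to-GB conversion are assumptions made for this example, and it ignores overheads that the real sizing tools account for.

```python
# profile (messages/day) -> (cache MB, IOPS single copy, IOPS multiple copies)
PER_MAILBOX = {
    50: (3, 0.060, 0.050), 100: (6, 0.120, 0.100), 150: (9, 0.180, 0.150),
    200: (12, 0.240, 0.200), 250: (15, 0.300, 0.250), 300: (18, 0.360, 0.300),
    350: (21, 0.420, 0.350), 400: (24, 0.480, 0.400), 450: (27, 0.540, 0.450),
    500: (30, 0.600, 0.500),
}


def estimate_mailbox_load(profile, mailboxes, multiple_copies=False):
    """Return total database IOPS and cache (GB) for `mailboxes` users of one profile."""
    cache_mb, iops_single, iops_multi = PER_MAILBOX[profile]
    iops = iops_multi if multiple_copies else iops_single
    return {
        "total_iops": iops * mailboxes,
        "cache_gb": cache_mb * mailboxes / 1000,  # assumption: 1 GB = 1000 MB
    }


print(estimate_mailbox_load(200, 4000, multiple_copies=True))
```

The result is a transactional IOPS figure only; whether a design ends up performance-bound or capacity-bound still depends on the disks chosen.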

Mailbox Memory Sizing

• Services need base memory for ongoing operations:
− Basic overhead for servicing user requests
− Content indexing
− Mailbox assistants
• Store process needs per-user memory for database cache, based on user profile
− Properly sized database cache memory required for IOPS reduction
• Deep checkpoint depth + 32KB pages allow E2010 to benefit from larger memory configurations than E2K7

Mailbox Role Cache Memory Sizing

Messages Sent+Received per mailbox    Database cache per
per day (~75KB average message size)  mailbox (MB)
50                                    3
100                                   6
150                                   9
200                                   12
250                                   15
300                                   18
350                                   21
400                                   24
450                                   27
500                                   30

Mailbox Memory Sizing

• Cache size defaults based on installed RAM
− Size per-mailbox memory, then map to fit in default cache
− Remaining memory reserved for base service requirements
• Nehalem platform has new rules for memory configuration
− Haven’t seen a need to optimize for memory speed, so optimize for memory size
• For example:
− 4000 users with the 200 profile (12MB per mailbox): 4000*12MB = 48GB
− 48GB fits in 53.6GB default cache
− Deploy 64GB server

Default Mailbox Database Cache Sizes

Server Installed     Database Cache Size     Database Cache Size
Physical Memory      (Mailbox Role Only)     (Multi-role)
2GB                  512MB                   Not supported
4GB                  1GB                     Not supported
8GB                  3.6GB                   2GB
16GB                 10.4GB                  8GB
24GB                 17.6GB                  14GB
32GB                 24.4GB                  20GB
48GB                 39.2GB                  32GB
64GB                 53.6GB                  44GB
96GB                 82.4GB                  68GB
128GB                111.2GB                 92GB
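The "size per-mailbox memory, then map to fit in default cache" step can be sketched as a table lookup. This is a minimal illustration using the Mailbox-role-only column above; the function name is hypothetical, and the decimal conversion (1 GB = 1000 MB) follows the slide's own 4000*12MB = 48GB example.

```python
# (installed RAM in GB, default database cache in GB) for Mailbox-role-only servers
DEFAULT_CACHE_GB = [
    (2, 0.512), (4, 1.0), (8, 3.6), (16, 10.4), (24, 17.6),
    (32, 24.4), (48, 39.2), (64, 53.6), (96, 82.4), (128, 111.2),
]


def smallest_server_ram_gb(mailboxes, cache_mb_per_mailbox):
    """Return (installed RAM GB, required cache GB); RAM is None if nothing in the table fits."""
    needed_gb = mailboxes * cache_mb_per_mailbox / 1000  # assumption: 1 GB = 1000 MB
    for ram_gb, cache_gb in DEFAULT_CACHE_GB:
        if cache_gb >= needed_gb:
            return ram_gb, needed_gb
    return None, needed_gb


# The slide's example: 4000 users on the 200-message profile (12MB per mailbox).
print(smallest_server_ram_gb(4000, 12))
```

For the slide's example this selects the 64GB row, because 48GB does not fit in the 39.2GB cache of a 48GB server.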

Mailbox CPU Sizing

• Proper CPU sizing is critical: sizing of other roles depends on it
• Megacycle values provided are based on a particular reference platform, newer CPUs differ
− Megacycle adjustment based on SPECint may be required
• Sizing process & calculation can get somewhat complex
− Use calculator tools to simplify this process
− See TechNet guidance for details on megacycle adjustments
• Recommend disabling hyperthreading
− May cause capacity planning & monitoring challenges

Estimated Per-Mailbox CPU Consumption

Messages Sent+Received     Megacycles for active or          Megacycles for
per mailbox per day        stand-alone mailbox (increase     passive mailbox
(~75KB average size)       by 10% for each passive copy)
50                         1                                 .15
100                        2                                 .3
150                        3                                 .45
200                        4                                 .6
250                        5                                 .75
300                        6                                 .9
350                        7                                 1.05
400                        8                                 1.2
450                        9                                 1.35
500                        10                                1.5
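A minimal sketch of the megacycle arithmetic implied by the table, including the 10% uplift per passive copy. The function and parameter names are made up for illustration, and the result is in reference-platform megacycles: the SPECint-based adjustment for newer CPUs mentioned earlier is not modelled here.

```python
# profile (messages/day) -> (active megacycles, passive megacycles) per mailbox
MEGACYCLES = {
    50: (1, 0.15), 100: (2, 0.3), 150: (3, 0.45), 200: (4, 0.6), 250: (5, 0.75),
    300: (6, 0.9), 350: (7, 1.05), 400: (8, 1.2), 450: (9, 1.35), 500: (10, 1.5),
}


def required_megacycles(profile, active_mailboxes, passive_mailboxes=0, passive_copies=0):
    """Estimate reference-platform megacycles for one Mailbox server.

    passive_copies: number of passive copies kept of each active database
    (adds 10% to the active cost per copy, as the table header states).
    passive_mailboxes: mailboxes in passive database copies hosted on this server.
    """
    active_mc, passive_mc = MEGACYCLES[profile]
    active_cost = active_mailboxes * active_mc * (1 + 0.10 * passive_copies)
    passive_cost = passive_mailboxes * passive_mc
    return active_cost + passive_cost


# 4000 active and 4000 passive mailboxes, 200-message profile, one passive copy each
print(required_megacycles(200, 4000, passive_mailboxes=4000, passive_copies=1))
```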

Sizing The Client Access Server Role

• CPU and memory are key for CAS:

Resource   Key Considerations
CPU        Required for handling client workload transactions, content conversion, garbage collection
Memory     Memory required for ongoing transaction processing
Network    All clients connect to CAS, network bandwidth and latency are important for client experience, load balancing likely required
Storage    Utilized for content conversion, logging

Check out UNC05-INT for details on CAS load balancing

Client Access Server Role Sizing Rules Of Thumb

• 2-socket platform best for performance and TCO
• CPU is typically the bottleneck, memory sizing is key as well
• 3 CAS CPU cores for every 4 Mailbox CPU cores (servicing active users)
• Load balancing is important for performance and high availability
• 2GB RAM per CPU core is optimal

Client Access Server Workload Sizing

• Workload impact on CAS server is variable depending on user profiles & mix of workloads
• CPU & memory scale guidance for CAS based on assumptions of a mixed-protocol heavy information worker profile
− Consider other workloads and adjust
− Remember all MAPI traffic now affects CAS
• Use Windows Server 2008 R2 for best CAS scale
− Major improvements in rpcproxy (Outlook Anywhere), potentially scaling to 15k Outlook Anywhere users on 8-core CAS

CAS Workload Relative Cost Comparison

Workload                                       CPU Cost      Network Cost
                                               (MHz/user)    (Kbytes/sec/user)
Outlook                                        0.35          0.37
Outlook Anywhere                               0.80          0.44
Exchange ActiveSync (delta from Outlook)       1.60          1.04
Exchange Web Services (Microsoft Entourage)    0.71          0.54
Outlook Web Access                             0.86          0.88
IMAP4*                                         0.86          0.14
POP3*                                          0.33          0.79

* IMAP4 & POP3 protocols do not support sending new mail, so the observed costs do not reflect any sent messages within the user profile.

Reference CAS server platform has 2 processor sockets on the motherboard populated with Intel Xeon L5335 4-core processors running at 2.00GHz for a total of 8 physical cores. Hyperthreading is disabled on this platform to allow for more accurate computation of CPU costs. The platform runs with 16GB of RAM. Observed costs will vary depending on user profile and other parameters defining a workload – these values are only meant to represent observed costs for a normalized sample workload.
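To compare a planned client mix against the reference platform, the per-user costs can simply be summed. The sketch below is illustrative: the function name and the example user counts are invented, the costs are the observed values from the table, and the 16,000 MHz figure is just 8 cores x 2.00GHz from the reference platform description.

```python
# workload -> (CPU MHz per user, network Kbytes/sec per user), from the table above
CAS_COST = {
    "Outlook": (0.35, 0.37),
    "Outlook Anywhere": (0.80, 0.44),
    "Exchange ActiveSync": (1.60, 1.04),  # delta on top of the user's Outlook cost
    "Exchange Web Services": (0.71, 0.54),
    "Outlook Web Access": (0.86, 0.88),
    "IMAP4": (0.86, 0.14),
    "POP3": (0.33, 0.79),
}

REFERENCE_PLATFORM_MHZ = 8 * 2000  # 8 physical cores at 2.00GHz


def cas_load(users_by_workload):
    """Return (total CPU MHz, total Kbytes/sec) for a dict of workload -> user count."""
    mhz = sum(CAS_COST[w][0] * n for w, n in users_by_workload.items())
    kbps = sum(CAS_COST[w][1] * n for w, n in users_by_workload.items())
    return mhz, kbps


# Hypothetical mix: 5000 Outlook users, 1000 of whom also use ActiveSync
mhz, kbps = cas_load({"Outlook": 5000, "Exchange ActiveSync": 1000})
print(f"{mhz:.0f} MHz ({mhz / REFERENCE_PLATFORM_MHZ:.0%} of reference platform), {kbps:.0f} Kbytes/sec")
```

Because the ActiveSync row is a delta from Outlook, users with both clients are counted under both workloads.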

Sizing The Hub Transport & Edge Role

• CPU and memory are key for transport:

Resource   Key Considerations
CPU        CPU required for message processing, hygiene activities, custom agents
Memory     Database cache requirements (reduces I/O), messages in queue represented in memory for perf
Storage    Low to moderate I/O requirement for relay & delivery activity, queuing needs more I/O+capacity
Network    Bandwidth utilized to relay/deliver messages – scales with message volume, latency can cause queuing

• Size storage capacity for queue requirements
• Use battery-backed write cache disk controller
− Disk I/O can be a bottleneck on an un-tuned Hub
− Log I/O becomes virtually free with a BBWC controller

Transport Roles Sizing Rules Of Thumb

• 2-socket platform best for performance and TCO
• CPU is typically the bottleneck, memory sizing is key as well
• 1 Transport CPU core for every 7 Mailbox CPU cores (no A/V) or 1 Transport CPU core for every 5 Mailbox CPU cores (with A/V)
• 1GB RAM per CPU core is optimal
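The core-ratio rules of thumb for CAS and transport can be applied mechanically once the Mailbox core count is known. A small sketch follows; the function names are hypothetical, and rounding up to whole cores is an assumption made here rather than something the slides state.

```python
import math


def cas_cores(mailbox_cores):
    """3 CAS cores for every 4 Mailbox cores servicing active users (rounded up)."""
    return math.ceil(mailbox_cores * 3 / 4)


def transport_cores(mailbox_cores, antivirus=False):
    """1 transport core per 7 Mailbox cores, or per 5 when A/V runs on transport (rounded up)."""
    ratio = 5 if antivirus else 7
    return math.ceil(mailbox_cores / ratio)


def role_ram_gb(cores, gb_per_core):
    """RAM from the per-core guidance: 2GB/core for CAS, 1GB/core for transport."""
    return cores * gb_per_core


mbx = 16  # example: 16 Mailbox cores servicing active users
print("CAS:", cas_cores(mbx), "cores,", role_ram_gb(cas_cores(mbx), 2), "GB RAM")
print("Hub (with A/V):", transport_cores(mbx, antivirus=True), "cores,",
      role_ram_gb(transport_cores(mbx, antivirus=True), 1), "GB RAM")
```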

Sizing The Unified Messaging Role

• CPU and network are key for UM:

Resource   Key Considerations
CPU        CPU used for media operations and Voice Mail Preview transcription – UM is typically “CPU heavy”
Network    Network bandwidth used for calls as well as communication with Mailbox role. Minimize latency for best user experience
Memory     Memory required for ongoing transaction processing
Storage    UM doesn’t have significant storage requirements

• Scale out UM servers based on concurrent call requirements
• Size CPU based on requirements for Voice Mail Preview

Unified Messaging Role Sizing Rules Of Thumb

• 2-socket platform best for performance and TCO
• 2GB RAM per core is optimal
• CPU is typically the bottleneck, particularly when Voice Mail Preview is being used
• Default 100 concurrent calls per server (inbound or outbound)
• Voice Mail Preview is CPU intensive: ~1 message/min/core
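The two UM rules of thumb (100 concurrent calls per server, roughly one Voice Mail Preview transcription per minute per core) translate into a simple estimate. This is a sketch with hypothetical function names; it treats the two limits independently and rounds up, which is a simplification.

```python
import math


def um_servers_for_calls(concurrent_calls, calls_per_server=100):
    """Servers needed for the peak number of concurrent calls (default limit from the slide)."""
    return math.ceil(concurrent_calls / calls_per_server)


def cores_for_voicemail_preview(messages_per_minute, messages_per_core_per_minute=1.0):
    """Cores needed to keep up with Voice Mail Preview at ~1 message/min/core."""
    return math.ceil(messages_per_minute / messages_per_core_per_minute)


print(um_servers_for_calls(250))       # peak of 250 concurrent calls
print(cores_for_voicemail_preview(6))  # 6 voice messages arriving per minute at peak
```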

Sizing Multi-Role Hub/CAS Servers

• Potentially optimal hardware utilization
− Server consolidation – minimize physical servers
• Simplified sizing
− Hub and CAS roles are relatively well balanced given resource requirements
− Virtualization – simplify server configuration with Hyper-V 4-core VMs (8-core physical = 1 4-core Mailbox, 1 4-core Hub/CAS)
• Must not load balance Hub-to-Hub traffic

Multi-Role Hub/CAS Servers Sizing Rules Of Thumb

• 2-socket platform best for performance and TCO
• CPU is typically the bottleneck, memory sizing is key as well
• 1 Hub/CAS CPU core for every 1 Mailbox CPU core
• 2GB RAM per CPU core is optimal

[Diagram: an 8-core root hosting one CAS/HUB VM and one Mailbox VM, a 16-core root hosting two of each, and a 24-core root hosting three of each]

Sizing Multi-Role “Brick” Servers

• Mailbox, CAS, and Hub Transport roles recommended
− UM supported, but not recommended
• Excellent solution for high core configurations
• Half of cores for Mailbox, half for CAS+Hub
• Use 8-24 cores
− 8GB RAM plus 3-30MB/mailbox recommended (follow mailbox database cache sizing guidance)
• Typical deployment scenarios:
− Simple unit of scale (brick) model
  − Each multi-role server represents a building block
  − Servers with on-board SATA storage (10-16 disks) are optimal
− Small organization/branch office – server consolidation
  − Minimize the number of physical servers, operating system instances, and Exchange server instances to manage

Multi-Role “Brick” Servers Sizing Rules Of Thumb

• Recommend maximum 4-socket platform for multi-role deployment
• Use 8GB RAM plus 3-30MB per mailbox (see Mailbox role sizing details)
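The brick memory rule (8GB base plus 3-30MB per mailbox) and the half/half core split can be written down directly. A minimal sketch, assuming decimal units and a hypothetical function name; the per-mailbox figure should come from the database cache table for the chosen profile.

```python
def brick_sizing(mailboxes, cache_mb_per_mailbox, total_cores):
    """Return RAM (GB) and the core split for one multi-role (Mailbox + CAS + Hub) server."""
    if not 3 <= cache_mb_per_mailbox <= 30:
        raise ValueError("per-mailbox cache is expected to be 3-30MB")
    ram_gb = 8 + mailboxes * cache_mb_per_mailbox / 1000  # assumption: 1 GB = 1000 MB
    mailbox_cores = total_cores // 2                       # half of cores for Mailbox
    return {
        "ram_gb": ram_gb,
        "mailbox_cores": mailbox_cores,
        "cas_hub_cores": total_cores - mailbox_cores,      # remainder for CAS + Hub
    }


# Example: 2000 mailboxes on the 200-message profile (12MB cache each), 8-core server
print(brick_sizing(2000, 12, 8))
```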

Sizing Virtualized Server Roles

• Exchange isn’t virtualization “aware” – VM is just a different hardware platform
• TechNet is the best source of support guidance and best practices
  http://tinyurl.com/26k4g5j
  http://tinyurl.com/5abmlh
• Server Virtualization Validation Program (SVVP) Support Policy Wizard helps to determine supported configurations
  http://tinyurl.com/lyko6t
• Be aware of major support limitations
− Root clustering + DAG
− Unified Messaging role
− Snapshots & differencing disks

Virtualized Server Roles Sizing Rules Of Thumb

• Size for physical resources, add ~12% CPU overhead for hypervisor
• Avoid resource oversubscription
• Don’t co-locate Mailbox databases on a root server
• CAS+Hub combination can make scale calculations easy

© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Understand Hardware

Driving Factors In The Technology Landscape

• Major changes in server hardware technology in the last decade have influenced changes in Exchange architecture
• Processor advances
− 64-bit: massive amounts of addressable memory
− Core density: cores per processor continues to grow
• Storage advances
− SAS/SATA/SSD
− Topologies: SCSI, FC, iSCSI, DAS
− Growth in drive capacity (and areal density)
• Power consumption & efficiency
• Virtualization
• Cloud computing

Processor Advances

• Overall available megacycles per processor (socket) increasing rapidly
− Per-core megacycles constant or decreasing to maintain power requirements
• Processor technology improvements make GHz comparison somewhat meaningless

Storage Advances

Hardware
• Since 2003, disk capacity has grown dramatically
− 2TB desktop class SATA (and midline SAS) disks available, larger sizes available shortly
• Sequential throughput increasing linearly based on areal density
− 2010 SATA =~ 250MB/sec
• Random I/O performance not expected to improve substantially
− 15k RPM is the ceiling

Workload
• Mailbox sizes rapidly increasing (1-10GB desired)
• Knowledge workers (and IT) want everything online and instantly searchable
− Reduced mailbox management effort
− Data accessible from everywhere (incl. mobile client)
− Increased knowledge worker productivity
• Average message size increasing

Storage Terminology

• Three classes of drive types
− Enterprise (ENT)
  − Dual port SAS interface
  − 10K, 15K RPM
  − 146, 300, 400, 450, 600GB +
  − Large Form Factor (LFF) & Small Form Factor (SFF)
− Midline (MDL)
  − SAS – dual port
  − SATA – single port
  − 7200 RPM + LFF & SFF
  − 500GB, 750GB, 1TB, 2TB +
− Entry (ETY) – Not suitable for Exchange

Disk Storage Technology 2010+

• Disk Capacity trend predicted to continue
− 2TB Desktop class SATA disks available (3-4TB next year)
− 1TB Near-line/Mid-line SAS disk available (2TB end of next year)
• Sequential throughput increasing linearly based on density
− SATA = ~250MB/sec
• Random I/O performance not expected to improve substantially
− 15K RPM is the ceiling
• Solid State Disks (SSD)/Flash:
− High $/GB, low $/IO
− Write performance improving
− Reliability mostly addressed

Random vs. Sequential Disk IO

• Random IO
− Disk head has to move to process subsequent IO
− Head movement = High IO latency
− Seek Latency limits IOPS
• Sequential IO
− Disk head does not move to process subsequent IO
− Stationary Head = Low IO latency
− Disk RPM speed limits IOPS

7.2K SATA Disk (20ms Latency)
Random = 50 IOPS
Sequential = 300+ IOPS!
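The 50 IOPS figure follows directly from the 20ms latency: a single disk that completes one random IO at a time can finish at most 1000/20 of them per second. A tiny sketch of that relationship (the function name is made up, and it deliberately ignores queuing and caching effects):

```python
def iops_from_latency(latency_ms):
    """Upper bound on IOPS for a disk serving one IO at a time with the given latency."""
    return 1000 / latency_ms


print(iops_from_latency(20))   # 7.2K SATA random IO at 20ms
print(1000 / 300)              # per-IO time in ms implied by 300 sequential IOPS
```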

What’s New: Modular Storage Systems
HP StorageWorks MDS600 as an example

[Figure: front view, rear view, and side view with one 35-drive drawer extended]

• 3.5” SAS or SATA drives
• Drawer 1: 35 drives; Drawer 2: 35 drives
• 2 power supplies, 4 for redundancy
• 4 fans
• 2-port I/O modules, up to 4 per MDS600 for dual-path
• Lower cost per drive bay than shelves (Internet List Price):
− $8500 / 70 = ~$121 / bay vs. $3400 / 12 = ~$283 / bay

Exchange 2010 Storage Design

Factors To Consider

Exchange 2010 HA Storage Design Flexibility

More options to reduce storage cost

SAN
• HA = Shared Storage Clustering
• +1.0 IOPS/Mailbox
• 3.5” 15K 146GB FC Disks
• RAID10 for DB & Logs
• Dedicated Spindles
• Multi-path (HBA’s, FC Switches, SAN array controllers)
• Backup = Streaming off active
• Fast Recovery = Hardware VSS (Snapshots/Clones)

DAS (SAS)
• HA = CCR
• .33 IOPS/Mailbox
• 2.5” 146GB 10K SAS Disks
• RAID5 for DB
• RAID10 for Logs
• SAS Array Controller (w/ BBU)
• Backup = VSS Snapshot
• Fast Recovery = CCR

DAS (SATA)
• HA = DAG (2+ DB copies)
• .11 IOPS/Mailbox
• 3.5” 2TB 7.2K SATA/SAS Disks
• RAID10 for DB & Logs
• SAS Array Controller (w/ BBU)
• Backup = VSS Snapshot/Optional
• Fast Recovery = Database Failover

JBOD (SATA)
• HA = DAG (3+ DB copies)
• .11 IOPS/Mailbox
• 3.5” 2TB 7.2K SATA/SAS Disks
• 1 DB = 1 Disk
• SAS Array Controller (w/ BBU)
• Backup = VSS Snapshot/Optional
• Fast Recovery = Database Failover

Exchange 2010 Storage Design Flexibility

• Exchange Online Archive provides mailbox storage flexibility (One Mailbox or two)
• Ex2010 optimized for DAS storage, SAN storage is supported
• IOPS reductions/SATA optimization enable lower performing storage
− Ex2010 HA architected for DAS (simpler)
• JBOD (HA) and RAID storage supported
• Ex2010 optimized for Tier 2 (SATA) disks (Enterprise disks supported)
• SSD storage supported but not recommended due to the high $/GB
• Storage Groups are gone; maximum 100 databases/server
• Max recommended DB size = 2TB (2+ Copies HA)
• Maximum recommended Folder count = 100K (no 3rd party app)

Ex2010 Storage Requirements

Storage guidance for Stand Alone, Ex2010 HA (2 copies) and Ex2010 HA (3+ copies):

Storage Type                           DAS, SAN (Fibre Channel, iSCSI)
Disk Type                              SAS, Fibre Channel, SATA, SSD
RAID                                   RAID Recommended; RAID Optional with Ex2010 HA (3+ copies)
RAID Type                              RAID-1/0, RAID-5, RAID-6 (256KB Stripe Size);
                                       with Ex2010 HA (3+ copies): JBOD*, RAID-1/0, RAID-5, RAID-6 (256KB Stripe Size)
DB/Log Isolation (volume + spindle)    Best Practice (Stand Alone); Not Required (Ex2010 HA)
Windows Disk Type                      Basic (recommended), Dynamic (supported)
Partition Type                         GPT (recommended), MBR (supported)
Partition Alignment                    Windows 2008 Default (1MB)
File System                            NTFS
NTFS Allocation Unit Size              64KB for both database and log volumes
Encryption Support                     Outlook Protection Rules, BitLocker

*JBOD = Single database per physical disk

Ex2010 Storage Design Considerations

• Account for Background Database Maintenance (Checksum)
− 5 MB/sec sequential IO per DB (active or passive)
− Storage bandwidth?
− Set it to run OLM if performance is a concern (may limit your DB size)
• 5.4/7.2k disks in RAID 5/6 configurations are not recommended
• Online Maintenance Duration design considerations gone
• Design servers with lots of memory
− Deep checkpoint depth (HA) + 32KB pages + mixing active copies with passives
− More DB cache = less IOPS/Mailbox

User type (usage profile)    Send/receive per day      Database cache per user
Light                        5 sent/20 received        2MB
Average                      10 sent/40 received       4MB
Heavy                        20 sent/80 received       6MB
Very Heavy                   30 sent/120 received      8MB
Extra Heavy                  40 sent/160 received      10MB

Ex2010 Storage Design Considerations (HA)

• Ensure balanced DB copy distribution across Database Availability Group (DAG)
− Design so server failure causes active DB copies to failover to multiple nodes within the DAG
− Design so servers within a DAG run a mix of active and passive DB copies
− Size for single failures in 2-3 node DAG’s. Size for double failures in 4+ node DAG’s
• Isolate DB’s to individual disks or RAID disk groups
− Do not mix active and passive DB’s on the same spindles (performance)
− Do not mix multiple copies from the same DB on the same storage sub-system (reliability/performance)
• Log Capacity Design
− Use 3+ days log generation rate for capacity rule of thumb, even when circular logging is enabled in HA

Ex2010 Storage Design Considerations (HA)

• Lagged DB copy Storage Design
− Do not use lagged copies for single copy restore (use HOLD Policy feature)
− Recovering from lagged copies is not straightforward (eseutil.exe)
− Log capacity: Account for lagged DB copies (size for lag period plus 3+ days of logs)
− Single Page Restore not supported in lagged copies
− JBOD disk failure = loss of lag capability
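The log capacity rules of thumb (3+ days of logs, plus the lag period for lagged copies) can be turned into a quick estimate. This sketch assumes a 1MB transaction log file size and decimal units; the function name and the example log generation rate are hypothetical, and the real rate has to be measured or taken from a sizing tool.

```python
def log_capacity_gb(logs_per_day, lag_days=0, buffer_days=3, log_file_mb=1):
    """Log volume capacity for one database copy.

    logs_per_day: measured/estimated log files generated per day (an input, not derived here)
    lag_days:     replay lag period for a lagged copy (0 for a normal copy)
    buffer_days:  the "3+ days" rule of thumb from the slide
    log_file_mb:  assumption: 1MB per transaction log file
    """
    return logs_per_day * log_file_mb * (lag_days + buffer_days) / 1000


print(log_capacity_gb(10_000))              # normal copy: 3 days of logs
print(log_capacity_gb(10_000, lag_days=7))  # copy with a 7-day lag: 7 + 3 days of logs
```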

Ex2010 Disk Options

There are several trade-offs when choosing disk types for Exchange 2010 storage. The correct disk is one that balances performance (both sequential and random) with capacity, reliability, power utilization and capital cost.

Disk Speed   Disk Form   Interface/     Capacity        Random IO       Sequential IO   Power
(RPM)        Factor      Transport                      Performance     Performance     Utilisation
5.4K         2.5"        SATA           Average         Poor            Poor            Excellent
5.4K         3.5"        SATA           Excellent       Poor            Poor            Above Average
7.2K         2.5"        SATA           Average         Average         Average         Excellent
7.2K         2.5"        SAS            Average         Average         Average         Excellent
7.2K         3.5"        SATA           Excellent       Average         Above Average   Above Average
7.2K         3.5"        SAS            Excellent       Average         Above Average   Above Average
7.2K         3.5"        FC             Excellent       Average         Above Average   Average
10K          2.5"        SAS            Below Average   Excellent       Above Average   Above Average
10K          3.5"        SATA           Average         Average         Above Average   Above Average
10K          3.5"        SAS            Average         Above Average   Above Average   Below Average
10K          3.5"        FC             Average         Above Average   Above Average   Below Average
15K          2.5"        SAS            Poor            Excellent       Excellent       Average
15K          3.5"        SAS            Average         Excellent       Excellent       Below Average
15K          3.5"        FC             Average         Excellent       Excellent       Poor
SSD:         N/A         SATA/SAS/FC    Poor            Excellent       Excellent       Excellent
Enterprise

Exchange 2010 Storage Design

Improvements and Changes

IOPS Reduction: Store Schema Changes

• Store Schema = The way the Store organizes data in the ESE DB
• Ex2010: One simple theme
− “Move away from doing many, random, small disk IO’s to doing fewer, sequential, large disk IO’s.”
• Significant Benefits
− Fast and efficient
− OWA/Outlook Online Modes
  − End user viewing “cold” states/first time view creation
  − Calendar operations
  − Search performance
− Outlook Cached Mode/Exchange Active Sync
  − OST sync = sequential IO
  − EAS sync = sequential IO
− Server Management
  − Move Mailbox
  − Content Index Crawl

IOPS Reduction: ESE Changes

• Optimized for new Store Schema
− Allocates database space in contiguous manner
− Maintains database contiguity over time
− Utilizes space efficiently (database compression)
• Increase IO Sizes
− Database page increased from 8KB to 32KB
− Improved IO coalescing (gap coalescing)
− Provides improved async read capability (pre-reads)
• Increase Cache Effectiveness
− DB page increase from 8KB to 32KB
− Improved IO coalescing (gap coalescing)
− Provides improved async read capability (pre-reads)
− DB Cache Compression (up to 30% more cache/mailbox server)

IOPS Reduction: Space Management

• Allocate space based on CONTIGUITY
− Allocate DB space based on either data compactness OR data contiguity (usage patterns)

[Diagram: DB cache pages (Page X and Page Y: Msg Header; Page Z: Event History) placed on disk under Space Compactness (Random/Compact) versus Space Contiguity (Sequential/Bloat)]

IOPS Reduction: Maintain Contiguity
New Database Maintenance Architecture

ESE Function: Cleanup (deleted items/mailboxes)
− Ex2007 SP1: Cleanup performed during Online Defrag (OLD), which occurs during the Online Maintenance (OLM) time window
− Ex2010: Cleanup performed at run time (when hard delete occurs). Happens during Store dumpster cleanup (OLM).

ESE Function: Space Compaction
− Ex2007 SP1: Database is compacted and space reclaimed during Online Defrag (OLD)
− Ex2010: Database is compacted and space reclaimed at run time. Auto-throttled.

ESE Function: Maintain Contiguity (defragmentation)
− Ex2007 SP1: N/A: Contiguity is compromised by space compaction
− Ex2010: Database is analysed for contiguity and space at run time and is defragmented in the background (B+Tree Defrag/OLD2). Auto-throttled.

ESE Function: Database Checksum
− Ex2007 SP1: When configured, ½ of OLD maintenance window reserved for sequential scan (Checksum), manual throttle. Active DB copy only.
− Ex2010: Two options (both Active and Passive copies):
  1. Run DB Checksum in the background 24x7 (default). Sequential IO
  2. Run DB Checksum during OLM window. Sequential IO

IOPS Reduction: DB Contiguity Results

[Figure: page layout by DB page number for the E2007 Message Folder Table (aka MFT) and the E2010 Message Header Table (aka MsgHeader). Blue = contiguous (good), Red = fragmented (bad). Labels: FRAGMENTED, CONTIGUOUS, “Random Deletes at the tail”]

*Production database analysis

IOPS Reduction: Other changes

• Store Table Architecture changes
− Now has a per database, per mailbox and per view table structure
− Reduces IO
• DB Write Smoothing
− Manages the bursty-ness of messaging data by throttling writes

Reduce DB Space Growth: DB Compression

• Store Schema changes, Space Hints, B+Tree Defrag & 32KB page size combine to increase DB file size by 20%
• Growth is 100% mitigated by Database Compression
− 7bit/XPRESS Compression for message headers and text/html bodies (longvalues)

DB File Size Comparison (relative DB file size)

Ex2007/RTF     1.00
Ex2010/RTF     1.20
Ex2010/Mix     1.00
Ex2010/HTML    0.88

1 Database, 750 x 250MB mailboxes, RTF = RTF Compressed, Mix = 77% HTML, 15% RTF, 8% Text, Avg. Message size = ~50KB

Optimize for SATA/Tier 2 Disks: DB Write IO Burstiness

• Problem: Bursty DB writes negatively affect DB read and Log write performance (i.e. Client response time)
• Solution: Throttle DB writes based on Checkpoint target (QoS)

[Chart: IO Latency Based on Max DB Write IO’s (ms). X-axis: Maximum DB Write IO's Issued (2, 4, 8, 16, 32, 64). Y-axis: Latency (ms), 0 to 120. Series: DB Read IO, Log Write IO]

IOPS Reduction: E2007 vs. E2010 Results

[Chart: DB IOPS Comparison, E2007 vs. E2010. Series: DB Read IO/Sec, DB Write IO/Sec, DB IO/Sec. Y-axis: 0 to 500]

+70% Reduction!

3000 Mailboxes, 3MB DB Cache/user, Loadgen Outlook 2007 Online Very Heavy Profile, 250MB Mailbox Size, E2010 Beta

Exchange IOPS Trends

[Chart: DB IOPS/Mailbox for Exchange 2003, Exchange 2007 and Exchange 2010. Y-axis: IOPS/Mailbox, 0 to 1]

+90% Reduction!

JBOD Storage: Now an option!• JBOD: 1 disk = 1 database/log stream• Requires Ex2010 HA (3+ DB copies)• Annual Disk Failure Rate = 5%

JBOD Advantages: Reducing Storage Costs/Complexity

• Eliminates unnecessary DB copies: Server and Storage redundancy can be symmetrical
• Reduces Disk IO: Eliminates RAID write penalty
• Enables Simple Storage Design: 1 disk = 1 database
• Enables Simple Storage Failure Recovery

JBOD Challenges: Exchange HA/Storage must replace RAID functionality

• Disk Striping performance (e.g. RAID10) cannot be leveraged
• Disk Failure = Database Failover (~30 second outage)
• Re-enabling Resiliency = Spare disk assignment/partitioning/format/DB re-seed (scriptable)
• Soft Disk Errors (bad blocks) must be detected and repaired
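
The "RAID write penalty" point can be made concrete with the usual rule of thumb that each host write costs several back-end disk IOs, depending on the RAID level. The penalty factors below are the commonly cited ones (2 for mirroring, 4 for RAID5, 6 for RAID6). They do not come from these slides, and real arrays with write cache may behave differently.

```python
# Commonly cited back-end IOs per host write (assumption, not from the slides).
WRITE_PENALTY = {"JBOD": 1, "RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6}


def backend_iops(read_iops, write_iops, layout):
    """Disk-level IOPS needed to serve a given host read/write load."""
    return read_iops + write_iops * WRITE_PENALTY[layout]
```

For a load of 100 reads/sec and 100 writes/sec this gives 200 back-end IOPS on JBOD, 300 on RAID10 and 500 on RAID5, which is the disk IO that JBOD avoids.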

JBOD Storage: Single Page Restore (Active)

[Diagram: Database Availability Group (DAG) with Mailbox Server Nodes 1, 2 and 3. Database copies: DB1-Active, DB1-CopyA and DB1-CopyB, each with a Database (Page1, Page2, Page3) and a Log.]

1. Page corruption detected on Active Copy (e.g. -1018)
2. Active DB places marker in log stream to notify passive copies to ship up-to-date page
3. Passive receives log and replays up to marker, retrieves good page, invokes Replay Service callback and ships page
4. Active receives good page, writes page to log, DB page is patched
5. Subsequent page repair from additional copies ignored
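
The five steps above can be sketched as a toy simulation. It is not how the Exchange Replay Service is implemented: the `DatabaseCopy` class, the CRC32 checksum and the in-memory "log" are stand-ins invented for the example, and log replay on the passive side is not modeled.

```python
import zlib


class DatabaseCopy:
    """Toy database copy: pages plus the checksum each page should have."""

    def __init__(self, name, pages):
        self.name = name
        self.pages = dict(pages)  # page number -> page bytes
        self.checksums = {n: zlib.crc32(d) for n, d in self.pages.items()}

    def is_corrupt(self, page_no):
        return zlib.crc32(self.pages[page_no]) != self.checksums[page_no]


def single_page_restore(active, passives, page_no):
    """Patch one corrupt page on the active copy; return an event log."""
    events = []
    if not active.is_corrupt(page_no):                    # step 1: detect
        return events
    events.append(f"marker: request page {page_no}")      # step 2: marker in log stream
    patched = False
    for passive in passives:                              # step 3: passives ship page
        if passive.is_corrupt(page_no):
            continue                                      # this copy cannot help
        if patched:                                       # step 5: extra repairs ignored
            events.append(f"ignored page from {passive.name}")
            continue
        active.pages[page_no] = passive.pages[page_no]    # step 4: patch the page
        events.append(f"patched page {page_no} from {passive.name}")
        patched = True
    return events
```

With one active and two healthy passive copies, corrupting a page on the active copy yields one marker, one patch from the first passive copy, and one ignored response from the second.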

Exchange 2010 Storage Guidance

Configurations compared: Stand Alone; Database Availability Group (DAG): 2 nodes, 2 Database copies; Database Availability Group (DAG): 3+ nodes, 3+ Database copies

Storage Type
• Direct Attached Storage (DAS): Supported in all three configurations
• Storage Area Network (SAN): iSCSI: Supported in all three configurations. Best Practice = Do not share physical disks backing Exchange data with other applications.
• Storage Area Network (SAN): Fibre Channel (FC): Supported in all three configurations. Best Practice = Do not share physical disks backing Exchange data with other applications.
− Both DAG configurations: Best Practice = Do not place both database copies on the same physical spindles.
• Network Attached Storage (NAS): SMB: Not Supported in any configuration

Physical Disk Type
• SATA: Supported in all three configurations; requires battery-backed caching array controller for data integrity
• SAS: Supported in all three configurations
• FC/FATA: Supported in all three configurations
• SSD (Flash Disk): Supported in all three configurations
• Physical Disk Write Caching (enabled): Not Supported in any configuration

Storage RAID
• RAID recommended for Stand Alone and for DAG: 2 nodes, 2 Database copies; RAID optional for DAG: 3+ nodes, 3+ Database copies
• EDB Volume: RAID5/6, RAID10, RAID1; DAG: 3+ nodes, 3+ Database copies also lists JBOD
• Log Volume: RAID1, RAID10; DAG: 3+ nodes, 3+ Database copies also lists JBOD
• Disk Array RAID Stripe Size: 256KB in all three configurations
• Storage Array Cache Settings: 75% Write Cache, 25% Read Cache (with Battery Backed Cache) in all three configurations

Database/Log file placement
• Database/Log Isolation
− Stand Alone: Best Practice (for recoverability) = separate database file (.edb) and logs from same Database on to different volumes backed by different physical disks
− DAG: 2 nodes, 2 Database copies: Database file (.edb) and logs from same Database can share same volume and same physical disk.
− DAG: 3+ nodes, 3+ Database copies: Database file (.edb) and logs from same Database can share same volume and same physical disk. This is a best practice for JBOD/RAID-less storage scenario where one or more volumes store the edb and log files backed by the same physical disk.
• Database Files/Volume
− Stand Alone and DAG: 2 nodes, 2 Database copies: Based on backup methodology
− DAG: 3+ nodes, 3+ Database copies: RAID = based on backup methodology, JBOD = one DB file/volume is recommended
• Log Streams/Volume
− Stand Alone and DAG: 2 nodes, 2 Database copies: Based on backup methodology
− DAG: 3+ nodes, 3+ Database copies: RAID = based on backup methodology, JBOD = one log stream/volume is recommended

The settings below are identical in all three configurations.

Windows Disk Type
• Basic Disk: Recommended
• Dynamic Disk: Supported

Partition Type
• GUID Partition Table (GPT): Recommended
• Master Boot Record (MBR): Supported

• Partition Alignment: Windows 2008 Default: 1MB
• Volume Path: Drive Letter or Mount Point (mount point host volume must be RAIDed)
• File System: NTFS support only
• NTFS Defragmentation: Not required, not recommended
• NTFS Allocation Unit Size: 64KB for both edb and log volumes
• NTFS Compression: Not Supported for Exchange Database files
• NTFS Encrypted File System (EFS): Not Supported for Exchange Database files
• Windows Bitlocker (volume encryption): Supported for all Exchange database and log files
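
A few of the rules in this guidance lend themselves to an automated pre-deployment check. The sketch below covers only a handful of them and uses a made-up configuration dictionary; the key names are hypothetical and do not correspond to any Exchange tool or cmdlet.

```python
def check_storage_config(config):
    """Return guidance violations for a hypothetical storage config dict."""
    problems = []
    if config.get("storage_type") == "NAS":
        problems.append("NAS (SMB) is not supported")
    if config.get("file_system") != "NTFS":
        problems.append("only NTFS is supported")
    if config.get("disk_write_cache_enabled"):
        problems.append("physical disk write caching must not be enabled")
    # JBOD appears only in the 3+ nodes, 3+ database copies column.
    if config.get("raid") == "JBOD" and config.get("database_copies", 1) < 3:
        problems.append("JBOD requires 3+ database copies")
    if config.get("ntfs_allocation_unit_kb") != 64:
        problems.append("NTFS allocation unit size should be 64KB")
    return problems
```

An empty list means none of the checked rules is violated; it does not mean the design is fully compliant, since most rows of the guidance are not covered here.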