Dynamic Partitioning in Windows Longhorn
Santosh Jodh, Software Design Engineer, Windows Kernel Platform Group, Microsoft Corporation (santoshj @ microsoft.com)
Mike Tricker, Program Manager, Windows Kernel Platform Group, Microsoft Corporation (miketri @ microsoft.com)
Session Outline
Introduction to Dynamic Partitioning (DP)
Clarifying the terminology:
Reliability, Availability & Serviceability (RAS)
Capacity on Demand (CoD)
Resource Management (RM)
Hot Add, Replace & Remove
Goals and non-goals for DP on Windows codenamed “Longhorn”
What we’re expecting others to do to support this
Session Goals
Attendees should leave this session with a good understanding of the following:
What Microsoft means by Dynamic Partitioning
DP-related terminology and acronyms
Microsoft’s goals and non-goals for DP in Windows Longhorn
Knowledge of where to find resources for DP
An Introduction to Dynamic Partitioning
A hardware partitionable server has the ability to create one or more isolated hardware partitions comprising processors, memory and I/O, each supporting a single Windows instance
A dynamically partitionable server has the ability to add, replace or remove hardware within a partition without needing to reboot the OS instance within the partition
Why is this interesting?
Hardware partition support has been available on some large servers for a number of years
Windows is supported on hardware partitionable systems today, but does not support dynamic hardware partitioning
With the projected increase in processor performance, Microsoft expects a number of these features to become available on mid-range systems
Microsoft plans to add support for dynamically partitionable hardware in Windows Longhorn
Why You Should Care About DP
Microsoft believes that the capabilities that have previously been limited to expensive high end systems are moving into the mainstream
Together with the introduction of multi-core processors this will make relatively small and inexpensive systems as powerful and reliable as today’s high end systems
This will push highly fault-tolerant enterprise-critical applications such as large databases and management information applications onto less expensive platforms
This means that a range of hardware that has not previously had to consider some of the issues with dynamic hardware will now need to, in the same way that RAID 5 changed how we thought about disks in the 1990s
What Do We Mean By a Partition?
[Figure: a single system built from six cells (Cell 1 to Cell 6), divided into hardware partitions that each run their own OS instance. Diagram labels: SQL, Exchange, Resource Management, Virtual Server, VM1, VM2, VM3.]
One scale-up application, e.g., Database
Multiple applications running on one OS
Multiple Virtual Machines running on one OS
All running on a single system
Reliability, Availability and Serviceability
Minimizing unplanned downtime due to failing hardware
E.g. if a processor starts to show signs of failing (increasing number of corrected errors or thermal events) swap it with one that’s on standby without needing to reboot the computer (similar to a hot spare disk in RAID 5)
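The failure-prediction idea in this example could be sketched roughly as follows. This is only an illustration in Python, not anything Windows provides: the `CorrectedErrorMonitor` class, its `window` and `threshold` parameters, and the idea of sampling a count once per interval are all assumptions made for the sketch.

```python
from collections import deque


class CorrectedErrorMonitor:
    """Hypothetical tracker of corrected-error counts for one processor."""

    def __init__(self, window=3, threshold=10):
        # Keep only the most recent `window` samples.
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def record(self, corrected_errors):
        """Store the corrected-error count seen in one sampling interval."""
        self.samples.append(corrected_errors)

    def should_replace(self):
        """True when the window is full, the count rose in every interval,
        and the latest count has reached the threshold."""
        if len(self.samples) < self.samples.maxlen:
            return False
        counts = list(self.samples)
        rising = all(later > earlier
                     for earlier, later in zip(counts, counts[1:]))
        return rising and counts[-1] >= self.threshold


monitor = CorrectedErrorMonitor(window=3, threshold=10)
for count in (1, 4, 12):
    monitor.record(count)
if monitor.should_replace():
    print("processor is a candidate for Hot Replace with a standby unit")
```

Requiring both a rising trend and a minimum count is one possible policy; a real management application would choose its own criteria.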
Capacity on Demand
The ability to enable processors that are physically present in the computer but not enabled by default
E.g. buy a system with 8 processors, only 4 of which are initially paid for, enabled and used by the OS, and then when the workload grows pay to enable 2 or 4 more
Resource Management
Sharing resources between two or more partitions
E.g. if the load on partition 1 is increasing whilst the workload on partition 2 is decreasing, move processors and/or memory from partition 2 to partition 1 to better handle the increasing workload
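As a rough sketch of the policy side of this example only, the decision to move a processor between partitions might look like the following. The `rebalance_once` function, the `high` and `low` utilisation thresholds, and the dictionary layout are hypothetical; the sketch models the bookkeeping and says nothing about how hardware is actually reassigned.

```python
def rebalance_once(partitions, high=0.8, low=0.3):
    """Move one processor from the idlest to the busiest partition.

    `partitions` maps a partition name to {"cpus": int, "load": float},
    where load is a utilisation fraction between 0 and 1.
    Returns (donor, recipient) if a processor moved, otherwise None.
    """
    busiest = max(partitions, key=lambda name: partitions[name]["load"])
    idlest = min(partitions, key=lambda name: partitions[name]["load"])
    if busiest == idlest:
        return None
    # Only act when one partition is clearly busy and another clearly idle.
    if partitions[busiest]["load"] < high or partitions[idlest]["load"] > low:
        return None
    if partitions[idlest]["cpus"] <= 1:
        # Never take the donor's last processor.
        return None
    partitions[idlest]["cpus"] -= 1
    partitions[busiest]["cpus"] += 1
    return idlest, busiest


partitions = {
    "partition1": {"cpus": 4, "load": 0.9},
    "partition2": {"cpus": 4, "load": 0.2},
}
moved = rebalance_once(partitions)
print(moved, partitions)
```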
More Terminology
Socket: A physical socket into which a processor and/or memory may be plugged mechanically
Sockets may also be independently powered
Partition Unit (PU): A collection of system resources that forms the smallest building block that can be assigned to a partition
E.g. processors, memory and I/O host bridges
More than one PU may be required to boot a partition
Yet More Terminology
Hot Add: Adding a socket or cell to a running partition
Hot Remove: Removing a socket or cell from a running partition
Hot Replace: Replacing a socket or cell in a running partition with one that is already physically present in the system but offline before the operation is started
Note that Hot Replace is NOT the same as Hot Remove followed by Hot Add
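One way to picture the distinction is with a toy model in which the OS sees logical processor numbers, each backed by a physical socket. That model is an assumption made for illustration, not a description of the Windows implementation, and the function names `hot_replace`, `hot_remove` and `hot_add` are hypothetical.

```python
def hot_replace(logical_to_physical, failing, spare):
    """Swap the backing hardware; the set of logical processors is unchanged."""
    return {lp: (spare if phys == failing else phys)
            for lp, phys in logical_to_physical.items()}


def hot_remove(logical_to_physical, failing):
    """Drop every logical processor backed by the failing hardware."""
    return {lp: phys for lp, phys in logical_to_physical.items()
            if phys != failing}


def hot_add(logical_to_physical, new_phys):
    """Expose new hardware as an additional logical processor."""
    next_lp = max(logical_to_physical, default=-1) + 1
    updated = dict(logical_to_physical)
    updated[next_lp] = new_phys
    return updated


before = {0: "socket0", 1: "socket1"}

# Hot Replace: software never observes a change in what is available.
replaced = hot_replace(before, failing="socket1", spare="socket2")
print(sorted(replaced))

# Hot Remove then Hot Add: there is an intermediate state with less hardware.
intermediate = hot_remove(before, failing="socket1")
print(sorted(intermediate))
print(sorted(hot_add(intermediate, "socket2")))
```

In this model the end states can look alike, but only the remove-then-add path passes through a configuration in which software sees fewer processors.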
And Yet More Terminology
Hot Swap: Some vendors support a model that does not require the stand-by hardware to be physically present before the Replace operation is started, and thus IS equivalent to a Hot Remove followed by a Hot Add
Hot Plug: A term typically covering Hot Add and Hot Remove
Assumptions We’re Making About Hardware
Future partitionable machines will contain PUs which comprise
Processors and memory together
Processors
Memory
I/O host bridges
The ACPI tables in those systems will be updated to expose specific methods required to support changing the hardware configuration without needing to reboot
The firmware will be able to assist the OS during Hot Add and Hot Replace operations
More Hardware Assumptions
Systems will include a Service Processor (SP) or Baseboard Management Controller (BMC)
PUs can be electrically isolated when not in use
No hardware assigned to a specific PU can be shared with other partitions, ensuring that a single failure cannot affect more than one partition
Dynamic Hardware Partitioning
[Figure: Future Hardware Partitionable Server. Diagram labels: Partition Manager, Service Processor, Memory, Core, Cache, IO Bridge, PCI Express.]
1. Partition Manager provides the UI for partition creation and management
2. Service Processor controls the inter-processor and IO connections
3. Platforms partitionable to the socket level. Virtualization used for sub-socket partitioning
4. Support for dynamic partitioning and socket replacement
Longhorn dynamic hardware partitioning features are focused on improving server RAS
Goals For Windows Longhorn
Support the Hot Add of:
Processors
Memory
I/O host bridges
Support the Hot Replace of:
Processors
Memory
OS support only
x64 and Itanium only – no 32-bit support will be provided
Server SKUs only, and only those SKUs supporting 4 or more processors
Non-Goals For Windows Longhorn
Hot Remove
Windows Longhorn will not support the Hot Remove of processors or memory
However, tools will be supplied to allow both device driver and application developers to validate that they behave correctly in the case of a Hot Remove operation for either processors or memory
Partition Manager
Today’s Partition Managers are proprietary to each major OEM’s platform, and Microsoft will not be providing equivalent functionality in Windows Longhorn
Microsoft will work with the system vendors to enable Windows DP support via their partition management tools
SP & BMC “drivers”
SPs and BMCs are devices that can be accessed from Windows via a device driver
Windows Longhorn will include an IPMI driver which can communicate with SPs and BMCs via a standard interface, but will not provide specific drivers for any vendor’s SP or BMC
Supporting Technologies
Windows Hardware Error Architecture (WHEA)
Error infrastructure designed to support (amongst other things) DP, especially Hot Replace operations
Makes hardware error information more easily available for management applications to analyze and make failure predictions
Extends the Machine Check Architecture available with the Intel Itanium platform
Multi-level rebalance
Windows Longhorn offers more sophisticated and extensive rebalance operations when hardware is added or removed
This is not specific to DP, but will be leveraged by DP to make these operations as efficient as possible
PCI Express and specifically Advanced Error Reporting (AER)
The PCI bus is unable to report many errors, and most end up as NMIs
PCI Express introduces AER and supports error correction, which will be exposed by WHEA for error prediction by management applications
Status of the Various Components
Hot Add of memory is already supported by Windows Server 2003
x86 support shipped in Windows Server 2003 RTM
x64 & Itanium support was added in Windows Server 2003 Service Pack 1
Hot Add of I/O
Various device classes supporting Hot Plug are already available
With Windows Longhorn the extended support for PCI Express devices makes this a very compelling feature
Hot Add Processor support is now in test
On x64 and Itanium
Hot Replace for processors and memory is under development
What DP Implies to an Application Developer
Add: applications can register for plug & play notifications of new hardware arriving
Application developers with hard dependencies on memory or number of threads should watch for these notifications and update their behavior accordingly
Resource management software, such as Microsoft’s WSRM, can abstract these changes such that the majority of applications do not need to explicitly handle these notifications
Replace: applications will be unaffected and will see no change in the system
Application developers need do nothing
Remove: applications cannot make hard assumptions about memory or thread affinity
Application developers cannot assume that memory is fixed, as they may today, and should not rely upon thread affinity or the size of thread pools being static
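A minimal sketch of the "don't cache the processor count" advice, written in Python for brevity. It assumes the platform's processor-count query reflects hardware changes; `desired_workers` and `ResizablePool` are hypothetical names, and the `per_cpu` multiplier is an arbitrary choice for the example. Only `os.cpu_count()` is a real library call.

```python
import os


def desired_workers(per_cpu=2, query=os.cpu_count):
    """Compute a worker count from the processor count seen right now.

    `query` is injectable so the behaviour can be exercised without real
    hardware changes. os.cpu_count() may return None, hence the fallback.
    """
    cpus = query() or 1
    return max(1, cpus * per_cpu)


class ResizablePool:
    """Hypothetical pool that re-derives its size instead of fixing it at start-up."""

    def __init__(self, query=os.cpu_count):
        self._query = query
        self.size = desired_workers(query=query)

    def refresh(self):
        """Call on a hardware-change notification, or periodically.

        Returns True if the target size changed.
        """
        new_size = desired_workers(query=self._query)
        changed = new_size != self.size
        self.size = new_size
        return changed


pool = ResizablePool()
print("workers:", pool.size)
```

The point of the design is that the size is recomputed on every `refresh()` rather than read once into a constant, so the same code copes with processors arriving and, in a later release, leaving.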
What DP Implies to a Driver Developer
Add: drivers can register for plug & play notifications of new hardware arriving
Driver developers have fewer memory size limitations than application developers, and pool sizes will not change even if overall memory grows.
The addition of processors and the related interrupt routing changes should also be invisible to drivers
So in the Add case most drivers will not do anything new
Replace: drivers will be unaffected and will see no change in the system
There are implications around device timeouts, as it will be necessary to quiesce the system whilst the replace operation completes
Remove: drivers cannot make hard assumptions about memory or thread affinity
Drivers cannot make any assumptions around thread affinity, or even that the affinity mask will remain contiguous as it is today
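The affinity-mask point can be illustrated with plain bit arithmetic. A real driver would be written in C against the kernel's own affinity types; the Python below only shows the logic, and both function names are hypothetical.

```python
def processors_in_mask(mask):
    """Return the processor numbers whose bits are set in `mask`."""
    if mask < 0:
        raise ValueError("affinity mask must be non-negative")
    processors = []
    index = 0
    while mask:
        if mask & 1:
            processors.append(index)
        mask >>= 1
        index += 1
    return processors


def processors_assuming_contiguous(mask):
    """The fragile approach: count the bits and assume they start at 0."""
    return list(range(bin(mask).count("1")))


mask = 0b1011  # processors 0, 1 and 3 are present; 2 is absent
print(processors_in_mask(mask))              # [0, 1, 3]
print(processors_assuming_contiguous(mask))  # [0, 1, 2]
```

Code that sizes per-processor data by counting bits and then indexes it by processor number would touch processor 2, which is not in the mask, and miss processor 3.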
Logo Requirements and Testing
NOTE: DP is a Server-only feature, so there are no new Client requirements arising from this feature
A number of new requirements are being proposed for the Microsoft logo program for Server to support DP
Most apply to either platform firmware or device drivers
Specific ACPI method support
Device drivers must not assume that the processor affinity mask is contiguous
We will also be providing test tools to ensure that you’re ready for Hot Remove support in a subsequent Windows release
These will apply to both applications and device drivers
Other Implications of DP
What about NUMA?
What happens to the System Resource Affinity Table (SRAT) or System Locality Distance Information Table (SLIT) when new hardware gets added?
Nothing happens to the SRAT as it’s a static table updated (and read by Windows) only at boot time
So it will be updated the first time the system reboots after hardware is added
For Windows Longhorn we’re not making use of the SLIT nor supporting the _SLI method to update locality information dynamically, so again nothing needs to be done here
Summary
Windows Longhorn is planned to contain support for:
Hot Add of processors, memory and I/O host bridges
Hot Replace of processors and memory
Windows Longhorn will not contain support for:
Hot Remove of memory and processors
An in-box Partition Manager
There are things you’ll need to do to:
Enable DP on your systems
Make use of the benefits offered by DP if your application is hardware-aware, and avoid failing when hardware changes underneath you
Ensure that your device drivers work correctly on DP-capable systems
Call to Action
Application developers can benefit from DP if they make their application DP-aware
Driver developers need to make their drivers DP-aware to work well on DP-capable systems
Either may fail completely if it behaves badly when hardware changes beneath it
You may already be talking to us if you’re interested in DP
If you’re interested and aren’t yet talking to us then please do!
Community Resources
Windows Hardware & Driver Central (WHDC): www.microsoft.com/whdc/default.mspx
Technical Communities: www.microsoft.com/communities/products/default.mspx
Non-Microsoft Community Sites: www.microsoft.com/communities/related/default.mspx
Microsoft Public Newsgroups: www.microsoft.com/communities/newsgroups
Technical Chats and Webcasts: www.microsoft.com/communities/chats/default.mspx
www.microsoft.com/webcasts
Microsoft Blogs: www.microsoft.com/communities/blogs
Additional Resources
Email: dpfb @ microsoft.com
Related Sessions
Windows Hardware Error Architecture
Error Management Solutions Synergy with WHEA