-
8/14/2019 TD MXC Intel Tech Session Stewart
1/36
Optimizing OpenSolaris* for Xeon
May, 2008
For Sun Tech Days
-
Agenda
Intel Server Advances
Intel and Sun collaboration
Key Development Areas
Summary/Call to Action
Intel is a trademark of Intel Corporation in the U.S. and other
countries.
* Other names and brands may be claimed as the property of others.
-
Executive Summary
One-year anniversary of the Intel/Sun collaboration agreement
Engineering teams show excellent collaboration
Collaboration intensifies in 2008, more projects in flight
Both companies are very upbeat about collaboration on SW, meeting all the goals
Deep and long-term engineering engagement and relationship
Solaris + Intel Architecture + 1 year = New Opportunities for our developers and customers in 2008
Strong Intel roadmap
Best-in-class mission-critical OS positioned to take advantage of new Intel server technologies
Solaris Openness, Indiana
IBM, Dell to OEM Solaris
Choice of Virtualization environments
Expansion of Sun SW portfolio
-
Intel Server Advances
-
Tick-Tock cadence: Tick (Shrink), Tock (Innovate); 2 years per process generation
65nm: Shrink/Derivative: Presler, Yonah, Dempsey. New Microarchitecture: Intel Core Microarchitecture
45nm: Shrink/Derivative: Penryn Family. New Microarchitecture: Nehalem
32nm: Shrink/Derivative: Westmere. New Microarchitecture: Sandy Bridge
See the "Intel Architecture and Silicon Cadence" whitepaper: http://download.intel.com/technology/eep/cadence-paper.pdf
Intel's Sustained Architecture Leadership
Stable roadmap for continued software innovation
Source: Intel. All future products, computer systems, dates, and figures specified are
preliminary based on current expectations, and are subject to change without notice.
-
Intel Quad-Core - A Superior Design
[Block diagram: two dual-core dies in one package]
Die 1: Core 0 and Core 1, each with 32KB L1 I-cache and 32KB L1 D-cache; 4 MB shared L2 cache; front-side bus interface
Die 2: Core 2 and Core 3, each with 32KB L1 I-cache and 32KB L1 D-cache; 4 MB shared L2 cache; front-side bus interface
Front-side bus: up to 1333MHz. Enables uniform access to shared memory
Intel Core uArch: leading perf and perf/W
64-bit, Intel Virtualization Tech.
Large L2 cache: 2X competitor's size; lower latency (vs L3); fewer cache misses; more efficient inclusive design; reduces bus traffic
Socket compatible: from dual-core through to 45nm quad-core
Dual-die vs monolithic: faster to design (6-9 mos)
Lower cost: smaller die size; better yield (~20%); lower mfg cost (~12%)
Better supply
Extends to 45nm
Leading performance, low cost and extensible
-
Legal Disclaimers
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104.
All dates and products specified are for planning purposes only and are subject to change without notice.
Relative performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other platforms, and assigning them a relative performance number that correlates with the performance improvements reported.
SPEC, SPECint2000, SPECfp2000, SPECint2006, SPECfp2006, SPECjbb, SPECWeb are trademarks of the Standard Performance Evaluation Corporation. See http://www.spec.org for more information.
Intel Virtualization Technology requires a computer system with an enabled Intel processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain platform software enabled for it. Functionality, performance or other benefits will vary depending on hardware and software configurations and may require a BIOS update. Software applications may not be compatible with all operating systems. Please check with your application vendor.
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor series, not across different processor sequences. See http://www.intel.com/products/processor_number for details.
Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.
* Other names and brands may be claimed as the property of others.
Copyright 2007-2007 Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon and Intel Core are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
-
Quad-Core Intel Xeon Processor 5400 series based platforms
Performance Comparison of 5400 Series versus AMD Opteron*
[Bar chart: relative performance by benchmark, y-axis from 0% to 250%. Benchmarks: SPECompM*2001, SPECfp*_rate_base2006, SPECfp*_rate2006, SPECWeb*2005, TPC-C*, SPECint*_rate_base2006, Abaqus Explicit 6.6-1, 3dsmax*, SPECint*_rate2006, SAP-SD* 2-Tier, Cinebench*, Fluent 6.3 (9 Workloads bmk), SPECjbb*2005, Black Scholes*, Linpack*]
Data labels (in extraction order): 70%, 78%, 57%, 59%, 63%, 69%, 94%, 16%, 29%, 51%, 57%, 88%, 96%, 107%, 112%, 125%, 126%, 126%, 138%, 141%, 165%, 216%, 42%
Legend: Quad-Core Intel Xeon 5400 Series; Quad-Core AMD Opteron 1.9 GHz; Quad-Core AMD Opteron 2.0 GHz; Quad-Core AMD Opteron 2.3 GHz; Quad-Core AMD Opteron 2.5 GHz
Relative Performance. Higher is better
Best available Dual-Core AMD Opteron* results used as baseline.
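The relative-performance rule stated in the legal disclaimers (baseline set to 1.0, every other result divided by the baseline result) can be sketched as follows. This is a minimal illustration only: the platform names and raw scores are hypothetical and are not taken from the chart.

```python
def relative_performance(results, baseline_platform):
    """Scale every result so the baseline platform scores exactly 1.0."""
    baseline = results[baseline_platform]
    if baseline <= 0:
        raise ValueError("baseline result must be positive")
    return {platform: score / baseline for platform, score in results.items()}


# Hypothetical raw scores for one benchmark (not taken from the slide).
raw_scores = {
    "baseline dual-core": 40.0,
    "quad-core A": 50.0,
    "quad-core B": 80.0,
}

relative = relative_performance(raw_scores, "baseline dual-core")
for platform, value in relative.items():
    # One way to report the result: the gain over baseline as a percentage.
    print(f"{platform}: {value:.2f}x ({(value - 1.0) * 100:+.0f}% vs. baseline)")
```

A value above 1.0 means the platform beat the baseline on that benchmark; a value below 1.0 means it was slower.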
Quad-Core Intel Xeon's sustained leadership continues
Copyright 2008, Intel Corporation. * Other names and brands may be claimed as the property of others.
Callouts: TPC-C: 96%; Integer: 57% (QC); SPECfp Rate: -7% (QC); Top500 Linpack: 312%; Java: 165%
Dual-Core AMD Opteron* Model 2222SE (3.0 GHz); Dual-Core AMD Opteron* Model 2220SE (2.80 GHz); Dual-Core AMD Opteron* Model 2218 (2.60 GHz)
Data Source: Published, measured, submitted or approved results as of April 7, 2008. See backup for details.
-
Sun and Intel Collaboration
-
Sun-Intel Solaris Collaboration is Significant
Get the best software and hardware for mission-critical applications
Broad multi-year strategic alliance
Sun roadmap commitment: 1P, 2P, 4P, >4P; Telco, WS, Enterprise
Intel endorsement of Solaris* as a mainstream OS for Intel Xeon processors
Joint investment in engineering, design, and marketing alliance for Solaris (and Java*)
-
Solaris on Xeon Experience
What worked before (Solaris 10):
Over 800 x86 systems supported in the Hardware Compatibility List: http://www.sun.com/bigadmin/hcl
146 Intel-based servers supported by Solaris 10 (vs. 86 AMD servers, 61 SPARC servers)
Performance optimizations to win World Record SPECint using Sun Studio 12
Majority of existing certified x64 Solaris applications already run on Intel Arch.
Fully supported by Sun on Intel servers, workstations, mobile, and desktop from multiple OEMs
What we're improving:
Power-on, errata, memory/string operation speedups for Penryn, Nehalem
Microcode update for serviceability
Drivers for Intel wireless, graphics, ICH storage, manageability
Xen / VT roadmap, IOAT
Power optimizations (PowerTOP, improved P-States, C-States, NPTM, etc.)
Xeon enhancements for Fault Management
-
Solaris Development Model
[Diagram. Labels: Intel; Community; Sun; Sun Firewall; Intel-Platform project; Hidden projects (x3); OpenSolaris Nevada; OpenSolaris (6m beat rate); Selective Backports; Solaris 10 Updates (6m beat rate); Intel-based HW]
-
Key Development Areas
-
Areas of Development
Performance Enhancement
CPU Performance tools
Compiler Vectorization and Tools
Power Management
Driver Support
I/O Acceleration Technology
Virtualization Technology
Predictive Self-Healing
Join us: http://opensolaris.org
Joint work touches most significant areas of OS
-
Intel Core Silicon Enabling
microcode update: increased serviceability
iommu: increased DMA capability and security
extended xAPIC: extends processor addressability for interrupt delivery (up to 4G-1 cores)
ICHx: LAN, AHCI, manageability (AMT)
CPUID: led to a 4.5% SPECint performance improvement on C2D
MONITOR/MWAIT: replaced halt, leads to 1.2x in certain microbenchmarks
Join us at http://opensolaris.org/os/project/intel-platform/
-
Performance Enhancements
Goal: Use Intel current and future technologies to improve Solaris performance
Libc optimizations
memcpy(), memmove(), and memset(): optimized to use SSE2 and/or SSSE3 instructions; significant performance improvements as measured by libMicro; available soon in OpenSolaris
str(n)cpy(), str(n)cmp() and strlen(): optimized to use SSE2 and/or SSSE3 instructions; significant performance improvements as measured by libMicro; available soon in OpenSolaris
Kernel optimizations in progress: bzero(), bcopy(), kcopy(), etc.
-
Power Management
P-states: Active Power Management
Performance states. Different P-states are at different frequency and voltage. You actually save energy.
C-states: Idle Power Management
C0: you execute code; no other mode executes code
C1: HALT instruction, no instructions get executed
C2: like C1, no code executed. Clock stopped
C3: FSB shut down. No snooping and caches can be shut off
T-state: Emergency brake
Parts of ACPI: static tables that the BIOS creates
Captures the platform power capabilities (how many P/C-states, power, switch latency, etc.)
Stay as long as you can in deeper C-states
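Why a lower P-state saves energy and not just power can be seen from the usual first-order CMOS switching model, P = C * V^2 * f. The sketch below applies that model to a fixed amount of work. It is a simplification that ignores leakage and idle power, and the capacitance, voltages, and frequencies are hypothetical values chosen for illustration, not data from any Xeon part.

```python
def dynamic_power(capacitance, voltage, frequency_hz):
    """Approximate CMOS switching power: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency_hz


def energy_for_work(capacitance, voltage, frequency_hz, cycles):
    """Energy to retire a fixed number of cycles at one P-state."""
    seconds = cycles / frequency_hz
    return dynamic_power(capacitance, voltage, frequency_hz) * seconds


# Hypothetical P-states: (name, core voltage in volts, frequency in Hz).
P_STATES = [("P0", 1.20, 3.0e9), ("P1", 1.10, 2.5e9), ("P2", 1.00, 2.0e9)]
C_EFF = 1.0e-9  # hypothetical effective switched capacitance, in farads
WORK = 6.0e9    # cycles of work to complete

for name, volts, hz in P_STATES:
    watts = dynamic_power(C_EFF, volts, hz)
    joules = energy_for_work(C_EFF, volts, hz, WORK)
    print(f"{name}: {watts:.2f} W, {WORK / hz:.1f} s, {joules:.2f} J")
```

In this model the frequency cancels out of the energy term, so the saving for a fixed job comes from the lower voltage; the job simply takes longer at the slower P-state.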
-
Power Management Development Areas
PowerTOP available for Solaris
Shows what wakes your system up from power-saving mode
Uses DTrace
Download at http://www.opensolaris.org/os/project/tesla/work/powertop
Screenshot callouts: C-state residency; ACPI info; top causes for wakeup; P-state residency
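A residency figure is the share of wall-clock time spent in each state. The sketch below shows that arithmetic on a hypothetical trace of (state, duration) samples; it is not PowerTOP's code and does not use DTrace, and the trace values are invented for illustration.

```python
from collections import defaultdict


def residency(intervals):
    """Return the percentage of total time spent in each state.

    intervals: iterable of (state_name, duration_in_ms) samples.
    """
    totals = defaultdict(float)
    for state, duration_ms in intervals:
        totals[state] += duration_ms
    elapsed = sum(totals.values())
    if elapsed == 0:
        return {}
    return {state: 100.0 * t / elapsed for state, t in totals.items()}


# Hypothetical trace of one CPU over 100 ms.
trace = [("C0", 5.0), ("C3", 40.0), ("C0", 5.0), ("C1", 10.0), ("C3", 40.0)]
for state, pct in sorted(residency(trace).items()):
    print(f"{state}: {pct:.1f}%")
```

Frequent wakeups show up in such a trace as many short idle intervals, which keeps the CPU out of the deeper C-states.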
-
Power Management Development Areas
Lots of kernel improvement areas
Tickless kernel
Power-friendly scheduling
P-State improvement
C-state support
HPET timer
Interrupt binding
Join us at http://opensolaris.org/os/project/tesla/
-
OpenSolaris vs best in class power use
[Chart: platform power use by component, OpenSolaris SNV b87 vs. Best in Class OS. Components: CPU, GMH, ICH, Memory, PCI x16 slot, LAN, Backlight, PS2, Serial I/O, CLK, SATA, USB]
We have more work to do for Solaris to be best-in-class
-
IO Acceleration Technologies
Low Latency Interrupt
Direct Cache Access
Message Signaled Interrupts
LAN stateless offloads
Intel QuickData Technology
Supported Features
Header-splitting/replication
TCP segmentation
TX/RX checksum offload
Receive Side Scaling
Header/data split
Next gen Gigabit Controller, Next gen 10GbE Controller
Intel 82575 Gigabit Controller, Intel 82598 10GbE Controller
Intel GbE Controller (Gilgal, Ophir)
MSI-X, MSI-X, MSI
Receive Side Coalescing (Next gen 10GbE Controller) (Intel 82598 10GbE Controller)
Intel I/O Acceleration Technology
IOAT v1 and v2 in progress
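Of the features listed, Receive Side Scaling is the easiest to illustrate: the NIC hashes each packet's flow identifiers and uses the result to pick a receive queue, so one connection always lands on the same queue (and CPU) while different connections spread out. The sketch below is an assumption-laden illustration: real controllers typically use a keyed Toeplitz hash in hardware, and CRC32 is swapped in here only to keep the example short. The function name and addresses are hypothetical.

```python
import zlib


def rx_queue_for_flow(src_ip, src_port, dst_ip, dst_port, num_queues):
    """Map a TCP/IP flow to a receive queue index.

    CRC32 is a stand-in for the hardware hash; it is not what a NIC uses.
    """
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


# Hypothetical packets: the first and last belong to the same connection.
packets = [
    ("10.0.0.1", 40000, "10.0.0.2", 80),
    ("10.0.0.3", 40001, "10.0.0.2", 80),
    ("10.0.0.1", 40000, "10.0.0.2", 80),
]
for pkt in packets:
    print(pkt, "-> queue", rx_queue_for_flow(*pkt, num_queues=4))
```

Keeping a flow on one queue preserves packet ordering within the connection while letting several CPUs share the receive load.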
-
Without hardware support
What the VMM Does
Emulates a complete hardware environment for every Virtual Machine
Allocates platform resources
Isolates execution in each virtual machine
[Diagram: VM 1 through VM n, each running an OS and apps, on top of the Virtual Machine Monitor, which runs on Shared Physical Hardware: Memory, KY/MS, Graphics, Storage, Network, Processors]
Virtualization solutions without hardware support work, but there are limitations and they require frequent software intervention
-
Unlocking Virtualization on Xeon
Intel Virtualization Technology
Interoperability
Performance optimizations
Manageability at scale
Availability
Security and compliance
-
Intel Virtualization Technology Evolution
VMM software evolution over time, with hardware support (timeline: Past, 2005, 2010)
Software-only VMMs: binary translation, paravirtualization, device emulations
Simpler and more secure VMM through use of hardware VT support
Better IO/CPU perf and functionality via hardware-mediated access to memory
Richer IO-device functionality and IO resource sharing
Vector 1: Processor Focus (VT-x, VT-x2, VT-x3)
Close basic processor virtualization holes in Intel 64 & Itanium CPUs
Perf improvements for interrupt-intensive env, faster VM boot
Richer/faster: Intel VT FlexPriority, FlexMigration, EPT, VPID, ECRR, APIC-V
Vector 2: Chipset Focus (VT-d, VT-d2)
Core support for IO robustness & performance via DMA remapping
Interrupt filtering & remapping, VT-d extensions to track PCI-SIG IOV
Vector 3: IO Device Focus (VT-c)
Assists for IO sharing: PCI IOV compliant devs; VMDq: multi-context IO; end-point DMA translation caching; IO virtualization assists
We are adding VT-d, VT-d2, VT-x, and VT-x2 into Solaris xVM
-
xVM Server Enabling in Solaris
xVM Server. Today: V1.0 will support VT-x, extended page tables, VTPR, WBINVD for better performance, reliability. Tomorrow: future version supports VT-d and VT-d2 device assignment and interrupt remapping for higher performance
xVM VirtualBox. Today: VT-x, good performance. Tomorrow: blazing fast on Intel Architecture
xVM Ops Center. Today: Intel Architecture support. Tomorrow: device assignment
Join us at http://opensolaris.org/os/community/xen/
-
Fault Management Architecture
Fault: a defect that may produce errors
The output of the diagnosis of errors
Something we can associate with an impact and a corrective action
Diagnosis software automates the steps
Error: an incorrect signal, datum, result
Observation that is a symptom of a fault
Old systems only know how to report errors
Diagnosis left to humans
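The error/fault distinction above can be sketched as a small diagnosis step that consumes error observations and emits a fault once they exceed a rate threshold. This is an illustrative model only, not the Solaris FMA API: the class name, threshold, fault name, impact, and action strings are all hypothetical.

```python
from collections import defaultdict, deque


class DimmDiagnosis:
    """Turn a stream of correctable-error reports into fault diagnoses."""

    def __init__(self, threshold, window_s):
        self.threshold = threshold          # errors needed to diagnose a fault
        self.window_s = window_s            # sliding time window, in seconds
        self.recent = defaultdict(deque)    # DIMM label -> error timestamps
        self.faulted = set()

    def report_error(self, dimm, timestamp_s):
        """Record one error; return a fault record if the DIMM is diagnosed."""
        if dimm in self.faulted:
            return None
        events = self.recent[dimm]
        events.append(timestamp_s)
        # Drop errors that fell out of the sliding time window.
        while events and timestamp_s - events[0] > self.window_s:
            events.popleft()
        if len(events) >= self.threshold:
            self.faulted.add(dimm)
            return {
                "fault": "dimm-degraded",            # hypothetical fault name
                "location": dimm,
                "impact": "errors may become uncorrectable",
                "action": "schedule DIMM replacement",
            }
        return None


engine = DimmDiagnosis(threshold=3, window_s=60)
for t in (0, 10, 20):
    fault = engine.report_error("DIMM-A1", t)
print(fault)
```

Isolated errors that are far apart in time never reach the threshold, so they stay observations; only a burst is promoted to a fault that carries an impact and a corrective action.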
-
FMA and Intel Xeon processors
Fault Management Architecture in Solaris saves millions in service costs
Intel platform support: Bensley and Caneland platforms
Error injection: ensures that FMA code paths work correctly
Reporting of physical location of failed DIMMs
Future processors: new RAS features in Nehalem
[Platform block diagram. Labels: CPU1, CPU2, North Bridge, DDR2 FBD 16GB, ESB2, PXH-V, PCI-X 100, PCI-X 133/100, PCI-e x16, PCI-e x8 in x16, PCI-33, SCSI, SATA x6, IDE, FLPY, LAN (Zoar), ZCR, PWR, VRM 4+4. Callouts: FMA model; Intel platform; Error injection; Location of failed DIMMs]
RAS support is great for 2 and 4 socket servers
-
Developer Tools
Sun Studio 12 Compiler (released June 2007) with Xeon-specific optimizations
Sun Studio Performance Analyzer: latest Intel Architecture performance counters
Threading Building Blocks for Solaris: threadingbuildingblocks.org
Transitive QuickTransit: run Solaris/SPARC binaries on Solaris/Xeon
-
Sun Studio Compiler Optimization Flags
Aggressive: for large projects
-fast -xtarget=woodcrest -m64 -xvector=simd,lib -xipo -xprofile=collect/use -Wu,-sched_first_pass=1
-xtarget=woodcrest expands to -xarch=ssse3 -xchip=core2 -xcache=32/64/8:4096/64/16
SSSE3 code generation, core2 architecture optimization and cache configuration selections, O5 level optimization, and inter-procedural optimization
Enable instruction scheduler for FP calculation on IA
Profile guided optimization
Medium: for most applications
-fast -xtarget=woodcrest -m64 -Wu,-sched_first_pass=1
All aggressive optimization but no IPO and profile guidance
Low: for extra precise floating point calculations
-O -xtarget=woodcrest -m64
O3 (medium) optimization level
Quickest compilation
Use Sun Studio to optimize your application on Solaris/IA
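The three tiers above can be captured in a small helper that assembles the compiler command line, which makes the two-pass profile workflow explicit (build with collect, run a training workload, rebuild with use). This is a sketch under assumptions: it only builds argument lists and does not invoke the compiler; `cc` is assumed to be the Sun Studio C driver on the PATH, and the source and output file names are hypothetical.

```python
COMMON = ["-xtarget=woodcrest", "-m64"]
SCHED = ["-Wu,-sched_first_pass=1"]


def build_flags(level, profile_phase=None):
    """Return the flag list for one optimization tier from the slide."""
    if level == "aggressive":
        flags = ["-fast"] + COMMON + ["-xvector=simd,lib", "-xipo"]
        if profile_phase in ("collect", "use"):
            # Pass 1 instruments the binary; pass 2 consumes the profile.
            flags.append(f"-xprofile={profile_phase}")
        return flags + SCHED
    if level == "medium":
        return ["-fast"] + COMMON + SCHED
    if level == "low":
        return ["-O"] + COMMON
    raise ValueError(f"unknown optimization level: {level}")


def compile_command(level, sources, output, profile_phase=None):
    """Assemble a full command line; 'cc' is assumed to be Sun Studio's."""
    return ["cc"] + build_flags(level, profile_phase) + ["-o", output] + sources


# Hypothetical project: app.c built at each tier.
print(" ".join(compile_command("aggressive", ["app.c"], "app", "collect")))
print(" ".join(compile_command("aggressive", ["app.c"], "app", "use")))
print(" ".join(compile_command("medium", ["app.c"], "app")))
print(" ".join(compile_command("low", ["app.c"], "app")))
```

The flag order follows the slide, with -fast first so that the later -xtarget setting is the one that takes effect.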
-
Solaris system-level tuning
Tuning is critical for best performance
Solaris is designed for safe handling of heavy, mixed workloads out-of-the-box; tune for optimal handling of specific workload characteristics
Processor binding/scheduling
Monitor application for threads that dominate CPU
Tie these to CPUs in a dedicated processor set to guarantee resource without contention
Shield application CPUs from interrupts
Use Fixed-Priority scheduling class for critical processes
Network stack tuning
Update driver: tuning as new NICs appear
Solaris buffers: size to avoid retransmissions without consuming too much memory
Look at applications that communicate with the app
Analyse with DTrace
InfiniBand for lowest-latency interconnection
Running on the same box using containers for ultimate low latency
-
Desktop/Mobile Driver Support - Wireless Driver
www.opensourcewireless.org
Focus on 4965; future WiFi planned
Downloadable uCode and dual-licensed header files
Phase 1 completed: 802.11 A/B/G; Infrastructure mode; power/temperature calibration (FCC regulatory); Rx sensitivity calibration; WEP
Phase 2, expected completion Jun: 802.11 A/N; WPA
4965 wireless driver is working and improvements are on the way
-
Device Drivers Support - Others
Graphics: all Intel graphics silicon is supported
AMT: HECI driver and LMS service are available for AMT 3.0; AMT 4/5 are under planning
NICs
Others: audio codec, USB, etc.
Intel platform laptop/desktop is supported.
-
Summary/Call to Action
Intel platform and Solaris bring the best technology to the end user
Intel and Sun teams at full strength through the community
Result is significant in various kernel areas: performance, drivers, FMA, virtualization, etc.
Call to action
Run OpenSolaris/Solaris on the latest Intel server platforms
Join development with us at OpenSolaris projects