td mxc intel tech session stewart

Upload: armandochagoya

Post on 30-May-2018


  • 8/14/2019 TD MXC Intel Tech Session Stewart

    1/36

    Optimizing OpenSolaris* for Xeon

    May, 2008

    For Sun Tech Days

    2/36

    Agenda

    Intel Server Advances
    Intel and Sun collaboration
    Key Development Areas
    Summary/Call to Action

    Intel is a trademark of Intel Corporation in the U.S. and other countries.

    * Other names and brands may be claimed as the property of others.

    3/36

    Executive Summary

    One year Anniversary of Intel/Sun collaboration agreement
      Engineering teams show excellent collaboration
      Collaboration intensifies in 2008, more projects in flight
      Both companies are very upbeat about collaboration on SW, meeting all the goals
      Deep and long term engineering engagement and relationship

    Solaris + Intel Architecture + 1 year = New Opportunities for our developers and customers in 2008
      Strong Intel roadmap
      Best in class mission critical OS positioned to take advantage of new Intel server technologies
      Solaris Openness, Indiana
      IBM, Dell to OEM Solaris
      Choice of Virtualization environments
      Expansion of Sun SW portfolio

    4/36

    Intel Server Advances

    5/36

    Intel's Sustained Architecture Leadership: Stable roadmap for continued software innovation

    Tick (Shrink) / Tock (Innovate) cadence, 2 years per process generation:

      65nm: Shrink/Derivative: Presler, Yonah, Dempsey; New Microarchitecture: Intel Core Microarchitecture
      45nm: Shrink/Derivative: Penryn Family; New Microarchitecture: Nehalem
      32nm: Shrink/Derivative: Westmere; New Microarchitecture: Sandy Bridge

    See the Intel Architecture and Silicon Cadence whitepaper: http://download.intel.com/technology/eep/cadence-paper.pdf

    Source: Intel. All future products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
    6/36

    Intel Quad-Core - A Superior Design

    [Diagram: two dual-core dies in one package. Die 1 holds Core 0 and Core 1, die 2 holds Core 2 and Core 3. Each core has a 32KB L1 I cache and a 32KB L1 D cache; each die has a 4 MB shared L2 cache and a Front Side Bus interface.]

    Front-side Bus: up to 1333MHz
      Enables uniform access to shared memory

    Intel Core uArch: Leading Perf and Perf/W

    64-bit
    Intel Virtualization Tech.

    Large L2 cache: 2X competitor's size
      Lower latency (vs L3)
      Fewer cache misses
      More efficient inclusive design
      Reduces bus traffic

    Socket compatible: From dual-core through to 45nm quad-core

    Dual-die vs Monolithic:
      Faster to design: 6-9 mos
      Lower Cost
        Smaller die size
        Better yield (~20%)
        Lower mfg cost (~12%)
      Better supply
      Extends to 45nm

    Leading performance, low cost and extensible

    7/36

    Legal Disclaimers

    Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104.

    All dates and products specified are for planning purposes only and are subject to change without notice

    Relative performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other platforms, and assigning them a relative performance number that correlates with the performance improvements reported.

    SPEC, SPECint2000, SPECfp2000, SPECint2006, SPECfp2006, SPECjbb, SPECWeb are trademarks of the Standard Performance Evaluation Corporation. See http://www.spec.org for more information.

    Intel Virtualization Technology requires a computer system with an enabled Intel processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain platform software enabled for it. Functionality, performance or other benefits will vary depending on hardware and software configurations and may require a BIOS update. Software applications may not be compatible with all operating systems. Please check with your application vendor.

    Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor series, not across different processor sequences. See http://www.intel.com/products/processor_number for details.

    Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications. All dates and products specified are for planning purposes only and are subject to change without notice

    * Other names and brands may be claimed as the property of others.

    Copyright 2007-2007 Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon and Intel Core are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

    8/36

    Quad-Core Intel Xeon Processor 5400 series based platforms
    Performance Comparison of 5400 Series versus AMD Opteron*

    [Bar chart: Relative Performance, higher is better; vertical axis from 0% to 250%. Best available Dual-Core AMD Opteron* results used as baseline.]

    Benchmarks: SPECompM*2001, SPECfp*_rate_base2006, SPECfp*_rate2006, SPECWeb*2005, TPC-C*, SPECint*_rate_base2006, Abaqus Explicit 6.6-1, 3dsmax*, SPECint*_rate2006, SAP-SD* 2-Tier, Cinebench*, Fluent 6.3 (9 Workloads bmk), SPECjbb*2005, Black Scholes*, Linpack*

    Series: Quad-Core Intel Xeon 5400 Series; Quad-Core AMD Opteron 1.9 GHz; Quad-Core AMD Opteron 2.0 GHz; Quad-Core AMD Opteron 2.3 GHz; Quad-Core AMD Opteron 2.5 GHz

    Callouts: TPC-C 96%; Integer 57% (QC); SPECfp Rate -7% (QC); Top500 Linpack 312%; Java 165%

    Quad-Core Intel Xeon's sustained leadership continues

    Dual-Core AMD Opteron* Model 2222SE (3.0 GHz); Dual-Core AMD Opteron* Model 2220SE (2.80 GHz); Dual-Core AMD Opteron* Model 2218 (2.60 GHz). Data Source: Published, measured, submitted or approved results as of April 7, 2008. See backup for details.

    Copyright 2008, Intel Corporation. * Other names and brands may be claimed as the property of others.
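
    The relative-performance rule quoted in the legal disclaimers (baseline fixed at 1.0, every other result divided by the baseline result) can be sketched as follows. The function name and the scores are hypothetical and are not figures from the chart.

```python
def relative_performance(results, baseline_name):
    """Divide every platform's benchmark result by the baseline platform's result."""
    baseline = results[baseline_name]
    if baseline <= 0:
        raise ValueError("baseline result must be positive")
    return {name: score / baseline for name, score in results.items()}


# Hypothetical scores for illustration only; NOT values read from the slide.
example_results = {"baseline": 50.0, "platform_a": 75.0, "platform_b": 100.0}

if __name__ == "__main__":
    for name, rel in relative_performance(example_results, "baseline").items():
        print(f"{name}: {rel:.2f}x relative to baseline")
```

    By construction the baseline platform always comes out at 1.0, so only the ratios between platforms carry information.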

    9/36

    10/36

    Sun and Intel Collaboration

    11/36

    Sun-Intel Solaris Collaboration is Significant

    Get the best software and hardware for mission-critical applications

    Broad multi-year strategic alliance

    Sun roadmap commitment
      1P, 2P, 4P, >4P
      Telco, WS, Enterprise

    Intel endorsement of Solaris* as a mainstream OS for Intel Xeon processors

    Joint investment in engineering, design, and marketing alliance for Solaris (and Java*)

    12/36

    Solaris on Xeon Experience

    What worked before (Solaris 10):

    Over 800 x86 systems supported in Hardware Compatibility List: http://www.sun.com/bigadmin/hcl
    146 Intel-based servers supported by Solaris 10 (vs. 86 AMD servers, 61 SPARC servers)
    Performance optimizations to win World Record SPECint using Sun Studio 12
    Majority of existing certified x64 Solaris applications already run on Intel Arch.
    Fully supported by Sun on Intel servers, workstations, mobile, and desktop from multiple OEMs

    What we're improving:

    Power-on, errata, memory/string operation speedups for Penryn, Nehalem
    Microcode update for serviceability
    Drivers for Intel wireless, graphics, ICH storage, manageability
    Xen / VT roadmap, IOAT
    Power optimizations (Powertop, improved P-States, C-States, NPTM, etc)
    Xeon enhancements for Fault Management

    13/36

    Solaris Development Model

    [Diagram labels: Intel, Community, Sun, Firewall, Hidden projects, Intel-Platform project, OpenSolaris Nevada, OpenSolaris (6m beat rate), Solaris 10, Selective back ports, Updates (6m beat rate), Intel-based HW]

    14/36

    Key Development Areas

    15/36

    Areas of Development

    Performance Enhancement
    CPU Performance tools
    Compiler Vectorization and Tools
    Power Management
    Driver Support
    I/O Acceleration Technology
    Virtualization Technology
    Predictive Self-Healing

    Join us: http://opensolaris.org

    Joint work touches most significant areas of OS

    16/36

    Intel Core Silicon Enabling

    microcode update: increased serviceability

    iommu: increased DMA capability and security

    extended xAPIC: extends processor addressability for interrupt delivery (up to 4G-1 cores)

    ICHx: LAN, AHCI, manageability (AMT)

    CPUID: led to a 4.5% SPECint performance improvement on C2D

    MONITOR/MWAIT: replacing halt leads to 1.2x in certain microbenchmarks

    Join us at http://opensolaris.org/os/project/intel-platform/

    17/36

    Performance Enhancements

    Goal: Use Intel current and future technologies to improve Solaris performance

    Libc optimizations
      memcpy(), memmove(), and memset()
        Optimized to use SSE2 and/or SSSE3 instructions
        Significant performance improvements as measured by libMicro
        Available soon in OpenSolaris
      str(n)cpy(), str(n)cmp() and strlen()
        Optimized to use SSE2 and/or SSSE3 instructions
        Significant performance improvements as measured by libMicro
        Available soon in OpenSolaris

    Kernel optimizations in progress
      bzero(), bcopy(), kcopy(), etc.
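
    A minimal timing harness in the spirit of a memory-copy microbenchmark is sketched below. It is not libMicro and does not call the Solaris libc routines directly; it times whole-buffer bytearray copies, which CPython typically services with the platform's bulk copy routine. The function name, buffer sizes, and repeat count are assumptions made for illustration.

```python
import time


def copy_throughput(size_bytes, repeats=200):
    """Return bytes copied per second for repeated whole-buffer copies."""
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    start = time.perf_counter()
    for _ in range(repeats):
        dst[:] = src  # bulk copy of the entire buffer
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from a coarse timer.
    return size_bytes * repeats / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Hypothetical sizes chosen to span small and large copies.
    for size in (64, 4096, 1 << 20):
        print(f"{size:>8} bytes: {copy_throughput(size) / 1e6:.1f} MB/s")
```

    Running the same harness before and after a library change, across several buffer sizes, is one way to see where an SSE2/SSSE3 code path starts to pay off.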

    18/36

    Power Management

    P-states: Active Power Management
      Performance states. Different P-states are at different frequency and voltage. You actually save energy.

    C-states: Idle Power Management
      C0: you execute code, no other mode executes code
      C1: HALT instruction, no instructions get executed
      C2: like C1, no code executed. Clock stopped
      C3: FSB shut down. No snooping and caches can be shut off

    T-state: Emergency brake

    Parts of ACPI:
      Static tables that the BIOS creates
      Captures the platform power capabilities (how many P/C-states, power, switch latency, etc.)

    Stay as long as you can in deeper C-states
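
    Why a lower P-state saves energy and not only power can be sketched with the usual CMOS approximation P = C * V^2 * f. This is a simplification that ignores static (leakage) power and the time spent idle afterwards. The voltages, frequencies, and capacitance below are hypothetical and not taken from any Intel datasheet.

```python
def dynamic_power_watts(c_eff_farads, volts, freq_hz):
    """CMOS dynamic power approximation: P = C * V^2 * f."""
    return c_eff_farads * volts ** 2 * freq_hz


def energy_joules(c_eff_farads, volts, freq_hz, cycles):
    """Energy needed to retire a fixed number of cycles at one P-state."""
    seconds = cycles / freq_hz
    return dynamic_power_watts(c_eff_farads, volts, freq_hz) * seconds


# Hypothetical P-states as (voltage, frequency); illustrative values only.
P0 = (1.20, 3.0e9)
P1 = (1.00, 2.0e9)
C_EFF = 1.0e-9  # assumed effective switched capacitance in farads

if __name__ == "__main__":
    work_cycles = 6.0e9
    for name, (volts, freq) in (("P0", P0), ("P1", P1)):
        power = dynamic_power_watts(C_EFF, volts, freq)
        energy = energy_joules(C_EFF, volts, freq, work_cycles)
        print(f"{name}: {power:.2f} W, {energy:.2f} J for the same work")
```

    For a fixed amount of work the frequency cancels out, so in this model energy scales with V^2: the lower-voltage state takes longer but spends less energy overall.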

    19/36

    Power Management Development Areas

    PowerTOP available for Solaris

    Shows what wakes your system up from power-saving mode
    Uses DTrace

    Download at http://www.opensolaris.org/os/project/tesla/work/powertop

    [Screenshot callouts: C-State residency, ACPI info, Top causes for wakeup, P-State residency]

    20/36

    Power Management Development Areas

    Lots of kernel improvement areas
      Tickless kernel
      Power-friendly scheduling
      P-State improvement
      C-state support
      HPET timer
      Interrupt binding

    Join us at http://opensolaris.org/os/project/tesla/
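
    Why a tickless kernel helps a CPU stay in deep C-states can be illustrated with a toy wakeup count. This is a simplified model, not Solaris code: it ignores device interrupts and assumes a periodic-tick kernel wakes on every tick while a tickless kernel wakes only for pending timer deadlines. The tick rate and deadlines are hypothetical.

```python
def periodic_wakeups(duration_s, tick_hz):
    """A periodic-tick kernel wakes the CPU on every tick, busy or idle."""
    return int(duration_s * tick_hz)


def tickless_wakeups(duration_s, timer_deadlines_s):
    """A tickless kernel wakes only for distinct deadlines inside the window."""
    return sum(1 for t in set(timer_deadlines_s) if 0 <= t < duration_s)


if __name__ == "__main__":
    # Hypothetical idle system: 100 Hz tick versus three pending timers.
    deadlines = [0.25, 0.5, 0.5, 2.0]
    print("periodic:", periodic_wakeups(1.0, 100))
    print("tickless:", tickless_wakeups(1.0, deadlines))
```

    Fewer wakeups mean longer uninterrupted idle periods, which is what lets the processor remain in the deeper C-states described earlier.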

    21/36

    OpenSolaris vs best in class power use

    [Chart: power use by platform component, OpenSolaris SNV b87 vs. Best in Class OS. Components: CPU, GMH, ICH, Memory, PCI x16 slot, LAN, Backlight, PS2, Serial I/O, CLK, SATA, USB]

    We have more work to do for Solaris to be best-in-class

    22/36

    IO Acceleration Technologies

    Low Latency Interrupt

    Direct Cache Access

    Message Signaled Interrupts

    LAN stateless offloads

    Intel QuickData Technology

    Supported Features

    Header-splitting / replication

    TCP segmentation

    TX/RX checksum offload

    Receive Side Scaling

    Header/data split

    Next gen Gigabit Controller, Next gen 10GbE Controller

    Intel 82575 Gigabit Controller, Intel 82598 10GbE Controller

    Intel GbE Controller (Gilgal, Ophir)

    MSI-X, MSI-X, MSI

    (Next gen 10GbE Controller) (Intel 82598 10GbE Controller)

    Receive Side Coalescing

    Intel I/O Acceleration Technology

    IOAT v1 and v2 in progress
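
    Receive Side Scaling, listed among the stateless offloads above, spreads incoming flows across receive queues so that each flow stays on one CPU. The sketch below shows only the idea: real RSS hardware computes a Toeplitz hash and looks the result up in an indirection table, whereas this example swaps in CRC32 and a plain modulo. The function name and addresses are hypothetical.

```python
import zlib


def rss_queue(src_ip, src_port, dst_ip, dst_port, num_queues):
    """Pick a receive queue for a flow so all of its packets land together."""
    if num_queues < 1:
        raise ValueError("num_queues must be at least 1")
    flow = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode("ascii")
    # CRC32 is a stand-in here; actual RSS uses a Toeplitz hash in hardware.
    return zlib.crc32(flow) % num_queues


if __name__ == "__main__":
    # Hypothetical flows using documentation-range addresses.
    for port in (40000, 40001, 40002):
        print(port, rss_queue("192.0.2.10", port, "192.0.2.20", 80, 4))
```

    Because the queue depends only on the flow identifiers, packets of one connection are never reordered across CPUs while different connections can be processed in parallel.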

    23/36

    24/36

    Without hardware support

    What the VMM Does
      Emulates a complete hardware environment for every Virtual Machine
      Allocates platform resources
      Isolates execution in each virtual machine

    [Diagram: VM 1 through VM n, each running Apps on an OS, on top of a Virtual Machine Monitor, on top of Shared Physical Hardware: Memory, KY/MS, Graphics, Storage, Network, Processors]

    Virtualization solutions without hardware support work, but they have limitations and require frequent software intervention

    25/36

    26/36

    Unlocking Virtualization on Xeon

    Intel Virtualization Technology
    Interoperability
    Performance optimizations
    Manageability at scale
    Availability
    Security and compliance

    27/36

    Intel Virtualization Technology Evolution

    VMM software evolution over time, with hardware support (timeline: Past, 2005, 2010)

    VMM Software Evolution
      Software-only VMMs: binary translation, paravirtualization, device emulations
      Simpler and more secure VMM through use of hardware VT support
      Better IO/CPU perf and functionality via hardware-mediated access to memory
      Richer IO-device functionality and IO resource sharing

    Vector 1: Processor Focus (VT-x, VT-x2, VT-x3)
      Close basic processor virtualization holes in Intel 64 & Itanium CPUs
      Perf improvements for interrupt intensive env, faster VM boot
      Richer/faster: Intel VT FlexPriority, FlexMigration, EPT, VPID, ECRR, APIC-V

    Vector 2: Chipset Focus (VT-d, VT-d2)
      Core support for IO robustness & performance via DMA remapping
      Interrupt filtering & remapping
      VT-d extensions to track PCI-SIG IOV

    Vector 3: IO Device Focus (VT-c)
      Assists for IO sharing:
        PCI IOV compliant devs
        VMDq: Multi-context IO
        End-point DMA translation caching
        IO virtualization assists

    We are adding VT-d, VT-d2, VT-x, and VT-x2 into Solaris xVM

    28/36

    xVM Server Enabling in Solaris

    Join us at http://opensolaris.org/os/community/xen/

    Device assignment
    Intel Architecture support

    xVM Ops Center

    Blazing fast on Intel Architecture
    VT-x, good performance

    xVM VirtualBox

    Future version supports VT-d and VT-d2 device assignment and interrupt remapping for higher performance
    V1.0 will support VT-x, extended page tables, VTPR, WBINVD for better performance, reliability

    xVM Server

    Tomorrow / Today

    29/36

    Fault Management Architecture

    Fault: a defect that may produce errors
      The output of the diagnosis of errors
      Something we can associate with an impact and a corrective action
      Diagnosis software automates the steps

    Error: an incorrect signal, datum, result
      Observation that is a symptom of a fault
      Old systems only know how to report errors
      Diagnosis left to humans
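
    The error/fault distinction above can be sketched as a toy diagnosis step: errors are observations, and a fault is the diagnosed defect with an impact and a corrective action attached. This is an illustration only, not the Solaris FMA API; the class names, the threshold rule, and the component labels are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorReport:
    component: str  # where the symptom was observed, e.g. "DIMM3"
    kind: str       # what was observed, e.g. "correctable_ecc"


@dataclass(frozen=True)
class Fault:
    component: str
    impact: str
    action: str


def diagnose(reports, threshold=3):
    """Turn repeated correctable-error observations into diagnosed faults."""
    counts = Counter(r.component for r in reports if r.kind == "correctable_ecc")
    return [
        Fault(component, "risk of uncorrectable memory errors",
              f"schedule replacement of {component}")
        for component, seen in sorted(counts.items())
        if seen >= threshold  # assumed rule: N repeats imply a defect
    ]


if __name__ == "__main__":
    observed = [ErrorReport("DIMM3", "correctable_ecc")] * 3
    observed.append(ErrorReport("DIMM1", "correctable_ecc"))
    for fault in diagnose(observed):
        print(fault)
```

    An error-only system would stop at printing the four reports; the diagnosis step is what names the failing part and tells the operator what to do about it.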

    30/36

    FMA and Intel Xeon processors

    Fault Management Architecture in Solaris saves millions in service costs

    Intel platform support: Bensley and Caneland platforms

    Error injection: ensures that FMA code paths work correctly

    Reporting of physical location of failed DIMMs

    Future processors: new RAS features in Nehalem

    [Diagram: Intel server platform block diagram showing CPU1, CPU2, North Bridge, ESB2, PXH-V, DDR2 FBD 16GB, PCI-X and PCI-e slots, SCSI, SATA x6, LAN (Zoar), IDE, and VRM. Callouts: Location of failed DIMMs, Error injection, Intel platform, FMA model]

    RAS support is great for 2 and 4 socket servers

    31/36

    Developer Tools

    Sun Studio 12 Compiler (released June 2007) with Xeon-specific optimizations

    Sun Studio Performance Analyzer: latest Intel Architecture performance counters

    Threading Building Blocks for Solaris: threadingbuildingblocks.org

    Transitive QuickTransit - Run Solaris/SPARC binaries on Solaris/Xeon

    32/36

    Sun Studio Compiler Optimization Flags

    Aggressive: For large projects
      -fast -xtarget=woodcrest -m64 -xvector=simd,lib -xipo -xprofile=collect/use -Wu,-sched_first_pass=1
      -xtarget=woodcrest expands to -xarch=ssse3 -xchip=core2 -xcache=32/64/8:4096/64/16
      SSSE3 code generation, core2 architecture optimization and cache configuration selections, O5 level optimization, and inter-procedural optimization
      Enable instruction scheduler for FP calculation on IA
      Profile guided optimization

    Medium: For most applications
      -fast -xtarget=woodcrest -m64 -Wu,-sched_first_pass=1
      All aggressive optimization but no IPO and profile guidance

    Low: For extra precise floating point calculations
      -O -xtarget=woodcrest -m64
      O3 (medium) optimization level
      Quickest compilation

    Use Sun Studio to optimize your application on Solaris/IA
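
    The three flag sets above can be assembled programmatically, for example in a build script. The sketch below uses only the flags listed on this slide and assumes that -xprofile=collect/use stands for two separate builds, one with collect and one with use. The function name, the cc driver name, and the source file app.c are placeholders.

```python
COMMON = ["-xtarget=woodcrest", "-m64"]
SCHEDULER = ["-Wu,-sched_first_pass=1"]


def sun_studio_flags(level, profile_phase=None):
    """Return the flag list for one of the slide's three optimization levels."""
    if level == "aggressive":
        # Assumption: profile-guided builds run twice, collect first, then use.
        if profile_phase not in ("collect", "use"):
            raise ValueError("aggressive builds need profile_phase 'collect' or 'use'")
        return (["-fast"] + COMMON
                + ["-xvector=simd,lib", "-xipo", f"-xprofile={profile_phase}"]
                + SCHEDULER)
    if level == "medium":
        return ["-fast"] + COMMON + SCHEDULER
    if level == "low":
        return ["-O"] + COMMON
    raise ValueError(f"unknown level: {level!r}")


if __name__ == "__main__":
    # "cc" and "app.c" are placeholders, not names taken from the slide.
    for level, phase in (("aggressive", "collect"), ("aggressive", "use"),
                         ("medium", None), ("low", None)):
        print(" ".join(["cc"] + sun_studio_flags(level, phase) + ["app.c"]))
```

    Keeping the levels in one place makes it easy to start at Low, confirm correct results, and then move up one level at a time.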

    33/36

    Solaris system-level tuning

    Tuning is critical for best performance
      Solaris is designed for safe handling of heavy, mixed workloads out-of-the-box; tune for optimal handling of specific workload characteristics

    Processor binding/scheduling
      Monitor application for threads that dominate CPU
      Tie these to CPUs in dedicated processor set to guarantee resource without contention
      Shield application CPUs from interrupts
      Use Fixed-Priority scheduling class for critical processes

    Network stack tuning
      Update driver: tuning as new NICs appear
      Size Solaris buffers to avoid retransmissions without consuming too much memory
      Look at applications that communicate with the app
        Analyse with DTrace
        Infiniband for lowest-latency interconnection
        Running on the same box using containers for ultimate low-latency
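
    For the buffer-sizing point under network stack tuning, one common rule of thumb (not stated on the slide) is to start from the bandwidth-delay product of the path: the number of bytes that can be in flight at once. The sketch below assumes that rule; the function name, link speed, round-trip time, and headroom figure are hypothetical.

```python
def socket_buffer_bytes(bandwidth_bits_per_s, rtt_ms, headroom_percent=0):
    """Bandwidth-delay product in bytes, plus optional headroom."""
    # bits/s * ms / 1000 gives bits in flight; / 8 converts to bytes.
    bdp = bandwidth_bits_per_s * rtt_ms // 8000
    return bdp + bdp * headroom_percent // 100


if __name__ == "__main__":
    # Hypothetical path: 1 Gbit/s link with a 1 ms round-trip time.
    print(socket_buffer_bytes(1_000_000_000, 1))
    print(socket_buffer_bytes(1_000_000_000, 1, headroom_percent=20))
```

    A buffer well below this value can stall the sender, while one far above it only consumes memory, which is the trade-off the slide points at.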

    34/36

    Desktop/Mobile Driver Support - Wireless Driver

    www.opensourcewireless.org

    Focus on 4965 and future Wifi planned

    Downloadable uCode and dual licensed header files

    Phase 1 completed
      802.11 A/B/G
      Infrastructure mode
      power/temperature calibration (FCC regulatory)
      Rx sensitivity calibration
      WEP

    Phase 2 -- expected completion Jun
      802.11 A/N
      WPA

    4965 wireless driver is working and improvement on the way

    35/36

    Device Drivers Support - Others

    Graphics
      All Intel graphic silicon is supported

    AMT
      HECI driver and LMS service are available for AMT 3.0
      AMT 4/5 are under planning.

    NICs

    Others
      Audio codec, USB, etc.

    Intel platform laptop/desktop is supported.

    36/36

    Summary/Call to Action

    Intel platform and Solaris bring the best technology to end user

    Intel and Sun teams at full strength through the community

    Result is significant in various kernel areas
      Performance, drivers, FMA, virtualization, etc.

    Call to action
      Run OpenSolaris/Solaris on latest Intel server platforms
      Joint development with us at OpenSolaris projects