

Delivering Performance on Sun: System Tuning

    Technical White Paper

Greg Schmitz and Allan Esclamado



© 1999 Sun Microsystems, Inc.

    901 San Antonio Road, Palo Alto, California 94303 U.S.A

RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the government is subject to restrictions as set forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013 and FAR 52.227-19.

    The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications.

    TRADEMARKS

Sun, Sun Microsystems, the Sun logo, Solaris, NFS, Ultra, UltraComputing, Ultra Enterprise, XGL, XIL, Creator, Creator3D, SunVTS, and OpenWindows are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.

    THIS PUBLICATION IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,

    INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A

    PARTICULAR PURPOSE, OR NON-INFRINGEMENT.

    THIS PUBLICATION COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES

    ARE PERIODICALLY ADDED TO THE INFORMATION HEREIN; THESE CHANGES WILL BE INCORPORATED IN

    NEW EDITIONS OF THE PUBLICATION. SUN MICROSYSTEMS, INC. MAY MAKE IMPROVEMENTS AND/OR

    CHANGES IN THE PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THIS PUBLICATION AT ANY TIME.


    Contents

    1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    2. Tuning the Solaris Kernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    Reconfiguring the Solaris Kernel. . . . . . . . . . . . . . . . . . . . . . . . . 4

    3. Memory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

    Memory Paging and Swapping . . . . . . . . . . . . . . . . . . . . . . . . . . 9

    Processes and Paging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Managing Swap Space: The swap Command . . . . . . . . . . . . . 13

    Process Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Address Space Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

    4. Tuning Disk Subsystems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

    Disk Performance Specifications . . . . . . . . . . . . . . . . . . . . . . . . . 17

Disk Interface Standards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

    RAID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

Disk Activity and Performance: iostat . . . . . . . . . . . . . . . . . . . 32

    Managing Disk Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33


Volume Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

    5. File System Tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

    Unix File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

    Tmpfs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

    VERITAS File System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

Tuning File Systems: tunefs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

    6. Networking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

    Selecting the Right Network Technology . . . . . . . . . . . . . . . . . . 44

    Ensuring Data Integrity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

    Evaluating System Performance . . . . . . . . . . . . . . . . . . . . . . . . . 51

Reducing Network Congestion . . . . . . . . . . . . . . . . . . . . . . . . . . 51

    7. Summary of Guidelines for System Performance Evaluation 61

    A. Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77


1. Introduction

This document focuses on techniques for tuning the Sun computing environment for maximum performance. It is aimed at system administrators and sophisticated end users who do their own system administration. Each chapter concentrates on a different subsystem of the computing environment and the specific things that can be done to increase performance. The subsystem topics are presented in the order recommended for tuning analysis and optimization:

Chapter 2, Tuning the Solaris Kernel, identifies the parameters that can be tuned to ensure the Solaris operating environment operates at peak efficiency.

Chapter 3, Memory, discusses the impact of paging and swapping schemes, and identifies tools administrators can use to effectively monitor and manage memory and swap space.

Chapter 4, Tuning Disk Subsystems, focuses on the key performance criteria, important disk technologies, and data layout schemes that ensure high performance I/O. Sun tools that can aid in the detection, isolation, and correction of performance-related issues are also discussed.

Chapter 5, File System Tuning, looks at the high performance file systems available from Sun, as well as the tools and techniques that can be employed to ensure that these file systems continue to operate at peak efficiency.


Chapter 6 focuses on the networking technologies available today, as well as the issues surrounding choosing a networking technology, ensuring data integrity, evaluating network performance, and reducing network congestion. Tools available to aid these efforts on Sun systems are also discussed.

Chapter 7 highlights the major functions of the Solaris kernel, and discusses the key tables and environment variables that need to be tuned in order to ensure it operates efficiently.

Other Performance White Papers and Sun Documentation

This paper, as well as companion documents, can be found at http://www.sun.com on the World Wide Web. These documents include:

Delivering Performance on Sun: Optimizing Applications for Solaris helps software developers construct high performance applications with the use of proper compiler flags. It also discusses the development of high performance multithreaded applications.

Delivering Performance on Sun: Tools for High Performance Distributed Computing addresses performance issues for distributed computing platforms and the compute-intensive applications they run.

In addition, all Sun documentation can be found at http://docs.sun.com.


2. Tuning the Solaris Kernel

The Solaris operating system, called the kernel, is loaded into memory when the system boots. The kernel manages all of the physical resources of the computing system. Applications interface with the kernel through a series of system calls.

Key kernel functions include:

Implementing many types of file systems (UFS, NFS, tmpfs, etc.) and permitting processes to create, read, write, and access these files.

Managing daemons, system processes which manage system resources.

Moving large groups of information, called pages, to and from the disk into main memory as needed. This concept is known as virtual memory.

Keeping track of all active processes and deciding which one gets to run next.

Managing device drivers, the programs that control physical devices like disk drive subsystems.

Managing network services and IPC facilities.

Providing system management functions, such as booting and error handling.

Providing other functions that make system resources available to processes.


Reconfiguring the Solaris Kernel

The standard Solaris kernel is dynamic, supporting all devices, pseudo-devices, and options by automatically loading modules when they are needed. Unlike previous versions of SunOS, there is no need to manually configure the Solaris kernel.

After new hardware has been installed, however, the kernel needs to be reconfigured and a reconfiguration reboot (boot -r) performed before the new hardware can be recognized and used by the system. For some configuration changes it is sufficient to run the drvconfig command followed by the disks or tapes command to recognize devices recently added to the system.
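A minimal sketch of the two approaches described above (the disk name used for verification is hypothetical):

    # Full reconfiguration reboot: create /reconfigure and reboot,
    # or type 'boot -r' at the OpenBoot ok prompt.
    touch /reconfigure
    init 6

    # Lighter-weight alternative for some changes: rebuild the device
    # tree and create /dev links for newly attached disks or tapes.
    drvconfig
    disks
    tapes

    # Verify that the new disk (for example, c1t2d0) is now visible.
    format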

The /etc/system File

The /etc/system file houses a wide variety of system configuration and tuning parameters. Administrators can tune many of the parameters contained in /etc/system to increase system performance, including buffer cache, file system, STREAMS, interprocess communication, memory management, and other parameters:

bufhwm, the maximum size for buffer cache memory usage, in KB

ufs_ninode, the maximum size of the inode table

ncsize, the size of the directory name lookup cache

nstrpush, the maximum number of STREAMS pushes allowed

strmsgsz, the maximum size for a STREAMS message a user can create

strctlsz, the maximum size of the ctl part of a message

strthresh, the maximum amount of dynamic memory the STREAMS subsystem can consume, in bytes

sadcnt, the number of sad devices

msginfo_msgmap, the number of entries in the message map

msginfo_msgmax, the maximum message size

msginfo_msgmnb, the maximum bytes on a queue

msginfo_msgmni, the number of message queue identifiers

msginfo_msgssz, the segment size of a message (should be a multiple of the word size)

msginfo_msgtql, the number of system message headers

msginfo_msgseg, the number of message segments (must be < 32768)

seminfo_semmap, the number of entries in the semaphore map

seminfo_semmni, the number of semaphore identifiers

seminfo_semmns, the number of semaphores in the system

seminfo_semmnu, the number of processes using the undo facility

seminfo_semmsl, the maximum number of semaphores, per id

seminfo_semopm, the maximum number of operations, per semaphore call

seminfo_semume, the maximum number of undo structures per process

seminfo_semvmx, the semaphore maximum value

seminfo_semaem, the maximum value for adjustment on exit

shminfo_shmmax, the maximum shared memory segment size (set shmsys:shminfo_shmmax = 0xffffffff, or set shmsys:shminfo_shmmax = 4294967295)

shminfo_shmmin, the minimum shared memory segment size

shminfo_shmmni, the number of shared memory identifiers

shminfo_shmseg, the number of segments, per process

lotsfree, if freemem drops below lotsfree, the system starts to steal pages from processes

tune_t_fsflushr, the rate at which fsflush is run, in seconds

tune_t_minarmem, the minimum available resident (not swappable) memory needed to avoid deadlock, in pages

tune_t_minasmem, the minimum available swappable memory needed to avoid deadlock, in pages

tune_t_flckrec, the maximum number of active frlocks

lwp_default_stksize, the size of the kernel stack for lwps. This value should not be adjusted unless there is a kernel stack overflow. lwp_default_stksize is expressed in bytes and must be a multiple of PAGESIZE bytes.

npty, the total number of 4.0 or 4.1 pseudo-ttys configured

pt_cnt, the total number of 5.7 pseudo-ttys configured

priority_paging, enables the system to place a boundary around the file cache, ensuring that file system I/O does not cause application paging

consistent_coloring, identifies the method used to improve hit ratios for the CPU external cache: 0 (uses various virtual address bits), 1 (physical address is set to virtual address), 2 (bin hopping), 6 (SBSD page coloring scheme)

Administrators can modify these kernel parameters while the system is running via the adb command, or by editing the /etc/system file and rebooting the system.
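As a sketch of what such entries look like, the following /etc/system fragment sets a few of the parameters listed above; the values are illustrative assumptions, not recommendations, and should be derived from the guidelines and references in this chapter:

    * /etc/system -- sample tuning entries (illustrative values only)
    * Buffer cache, directory name lookup cache, and inode cache sizes
    set bufhwm = 8000
    set ncsize = 8000
    set ufs_ninode = 8000
    * Fence the file cache off from application memory
    set priority_paging = 1
    * System V IPC limits use the module:variable form
    set shmsys:shminfo_shmmax = 0xffffffff
    set msgsys:msginfo_msgmni = 100
    set semsys:seminfo_semmni = 100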

More information on /etc/system tunable parameters can be found in the System Administration Guide, Volume II manual located at http://docs.sun.com on the World Wide Web.


Kernel Tables

A variety of kernel tables can be tuned by system administrators to enhance the performance of the Solaris operating environment. Table 2-1 identifies these tables and the relevant environment variables they control, as well as suggested settings for each.

Table 2-1 System administrators can tune key Solaris kernel tables

    Kernel Table                  Variable     Default Setting
    Process Table                 max_nprocs   20 + 16 * maxusers
    Per-User Process Limit        maxuprc      max_nprocs - 5
    Inode Cache                   ufs_ninode   max_nprocs + 16 + maxusers + 64
    Quota Table                   ndquot       (maxusers * NMOUNT) / 4 + max_nprocs
    Directory Name Lookup Cache   ncsize       max_nprocs + 16 + maxusers + 64

Many of the traditional Unix kernel limits have been removed over the years. Solaris now dynamically sets limits that increase on large memory configurations. Traditional tables that no longer exist or need to be tuned include the text table, open file table, callout table, and clist table.

More information on configuring kernel tables can be found in Sun Performance and Tuning, Java and the Internet, Chapter 14: Kernel Algorithms and Tuning.

maxusers Kernel Parameter

Over time, the Solaris kernel has been modified to enable it to scale according to the hardware capabilities and the workload it is running. By self-configuring and tuning, Solaris reduces administrative burdens and dynamically adapts to changing business needs. While earlier versions needed to be hand-tuned, each successive version of Solaris 2 has converted hand-adjusted values into adaptively changing limits.

One kernel parameter that remains an important factor in system performance is the maxusers environment variable. As evidenced in Table 2-1, many Solaris kernel parameters are defined relative to maxusers, rendering it perhaps the most critical variable administrators can tune. Indeed, maxusers



determines whether or not the Solaris kernel has sufficient room to accommodate users comfortably. (It is important to note that maxusers is not a limit on the number of simultaneous users supported by the system.)

Increasing the maxusers parameter increases the size of the Solaris kernel, and consequently decreases the amount of memory available. Administrators should, therefore, follow a few simple guidelines to ensure the kernel is tuned effectively for users of the system:

The maxusers variable is automatically set by Solaris according to the amount of physical memory available in the system. The automatic minimum value for maxusers is 8, and the automatic maximum is 1024.

The maxusers parameter is set to a little less than the amount of RAM in the system, in megabytes.

For NFS servers with small amounts of memory (under 512 MB), it can be beneficial to increase the DNLC and inode cache limits.

There is no need to tune in most cases, and only a small impact on performance can be expected from any kernel tuning.
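As a sketch, the current value of maxusers can be read from the live kernel with adb (a technique covered in the reference below), and overridden, if truly necessary, with an /etc/system entry; the value 256 is an arbitrary illustration:

    # Read the current maxusers value from the running kernel
    echo "maxusers/D" | adb -k /dev/ksyms /dev/mem

    * /etc/system override (normally left to automatic sizing)
    set maxusers = 256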

Additional information on the impact of the maxusers kernel parameter can be found in Sun Performance and Tuning, Java and the Internet, Chapter 14: Kernel Algorithms and Tuning.


3. Memory

Memory Paging and Swapping

When memory becomes tight, the kernel begins to employ techniques known as paging and swapping. Paging occurs when the kernel puts part of an executing program out to disk, enabling the pages it occupied to be used for other purposes. When the process needs to access the page of memory that was moved to disk, a page fault occurs, forcing the process to wait until the page is copied back into memory. This scheme fosters the efficient use of memory, as only the parts of a program that are in use at the time the program is running need to be in memory and available for use.

Swapping occurs when the kernel stores an entire currently inactive process to disk, enabling the pages it occupied to be used by other executing programs. To continue, the process is paged back in gradually. Swapping is a part of the normal housekeeping performed by the Solaris kernel.

Paging and Swapping Parameters

The Solaris kernel utilizes several values stored in the /etc/system file to determine when to start paging and swapping, actions that begin and end when the memory system crosses a threshold set as part of the kernel paging algorithm:

lotsfree, the amount of memory required by the page daemon. If memory drops below this limit, the page daemon starts looking for pages to reuse.

desfree, the level at which swapping begins.


minfree, the minimum memory level acceptable to the system.

slowscan, the number of pages scanned per second when lotsfree is reached.

fastscan, the number of pages scanned per second when minfree is reached.

handspreadpages, the distance between the hand that sets a reference flag and the hand that clears it. The two hands keep track of which pages of memory have been accessed recently. If the pages have not been accessed in a certain amount of time, they are paged out.

maxpgio, the maximum number of page I/O operations per second the system will schedule. maxpgio is typically set to 40, assuming there is little swap space on multiple devices. If system swap space is on fast and/or multiple disks, maxpgio should be increased to take advantage of the higher performance devices.
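By way of illustration, these parameters are set in /etc/system like any other tunable; the figures below are assumptions for a system whose swap space is spread across several fast disks, not general recommendations:

    * /etc/system -- example paging tunables (illustrative values only)
    * Allow more page-out I/O per second when swap is on multiple fast disks
    set maxpgio = 120
    * Raise the free-memory threshold (in pages) at which scanning begins
    set lotsfree = 1024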

More information on paging in the Solaris kernel can be found in Sun Performance and Tuning, Java and the Internet, Chapter 13: RAM and Virtual Memory. Information on the /etc/system file can be found in the System Administration Guide, Volume II located at http://docs.sun.com on the World Wide Web.

Processes and Paging

Understanding the number of processes running on a system and how they are affected by paging is essential to overall performance. System administrators should routinely monitor processes and learn how they are utilizing system resources. To aid this effort, Sun has provided several tools for administrators as part of the Solaris operating environment: vmstat, swap, sar, ps, and Find Process. Each tool can be used in a variety of ways, including determining memory, paging, and swapping measurements.

Memory Measurements

The first step in evaluating processes is determining the paging and swapping activity occurring on the system. Both the vmstat and sar commands can be used to monitor paging activity. While sar is better for information logging purposes, vmstat tends to provide more detailed information. Correlations exist between the output of these tools, as discussed below.


Swap space (vmstat swap, sar -r freeswap, and swap -s)

Each command describes the amount of available swap space, although in different forms. The vmstat swap command identifies the amount of available swap space in KB, while sar -r freeswap reports it in 512-byte blocks. The swap -s command shows available swap space in addition to other metrics. Administrators should note that when available swap space is exhausted the system will be unable to use additional memory.

Free memory (vmstat free and sar -r freemem)

Administrators can determine the amount of free memory: the pages of RAM that are immediately ready to be used when a process starts execution or needs more memory. The vmstat free command reports free memory in KB, while the sar -r freemem command reports it in pages.

Reclaims (vmstat re)

The number of pages reclaimed from the free memory list are called reclaims. Pages that have been taken from a process but not yet reused by another can be reclaimed, thereby avoiding the full page fault that would typically result. The vmstat re command can be used to identify reclaimed memory.
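A short usage sketch of the three commands discussed above; the intervals and counts are arbitrary:

    # Report virtual memory statistics every 5 seconds (swap and free
    # columns are in KB; re is reclaims; sr is the page scan rate).
    vmstat 5

    # Log free memory (pages) and free swap (512-byte blocks)
    # every 5 seconds, 12 times.
    sar -r 5 12

    # Print a one-line summary of allocated, reserved, and available swap.
    swap -s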

Page Fault Measurements

Identifying when page faults are occurring is an important step in evaluating system performance.

Attaches to existing pages (vmstat at and sar -p atch)

The vmstat at and sar -p atch commands measure the number of attaches to shared memory pages.

Pages paged in (vmstat pi and sar -p pgin, ppgin)

The vmstat pi command reports the number of KB/second paged in, while the sar command reports the number of page faults and the number of pages paged in from swap space or by file system read operations. Administrators should note that the file system block size is 8 KB. Consequently, there may be two pages, or 8 KB, paged in per page fault on machines with 4 KB pages. Also noteworthy is the fact that UltraSPARC machines employ 8 KB pages; almost all other CPUs utilize 4 KB pages.


Minor faults (vmstat mf and sar -p vflt)

Minor page faults are caused by address space or hardware address translation faults that can be resolved without performing a page-in operation. Such operations are fast and require little CPU utilization, eliminating the need for a process to have to stop and wait.

Other fault types (sar -p pflt, slock, vmstat -s copy-on-write, zero fill)

Many other types of page fault conditions are possible, and can be identified using the above commands. More information on these and other fault conditions can be found in the corresponding man pages, as well as Sun Performance and Tuning, Java and the Internet, Chapter 13: RAM and Virtual Memory.

Page-out Measurements

The vmstat and sar commands provide information on the amount of data paged out to the swap space or file system:

Pages paged out (vmstat po and sar -g pgout, ppgout)

The vmstat po command reports the number of KB/second paged out, while sar reports the number of page-outs and the number of pages paged out to swap or file system space. Administrators should note that the clustering of swap space writes can lead to a large number of pages written per page-out.

Pages freed (vmstat fr and sar -g pgfree)

The Solaris operating environment keeps track of the pages freed: the rate at which memory is returned to the free list. The vmstat fr command reports the amount of memory freed in KB per second, while the sar command reports the number of pages freed per second.

Short-term memory deficit (vmstat de)

Deficit is a paging parameter that provides a measure of hysteresis for the page scanner when there is a period of high memory demand. If the value returned by the vmstat de command is non-zero, then memory was being consumed quickly and extra free memory will be reclaimed in anticipation of its impending use.


Page daemon scanning rate (vmstat sr and sar -g pgscan)

The vmstat sr and sar -g pgscan commands can be used to determine the number of pages scanned by the page daemon as it looks for pages used infrequently. If the page daemon scanning rate stays above 200 pages per second for long periods of time, then a memory shortage is likely.

Pages freed by the inode cache

Statistics related to the inode cache used by the Unix file system (UFS) are also indicators of system performance. In particular, ufs_ipf measures UFS inode cache reuse, a feature that can cause pages to be freed when an inactive inode is freed. ufs_ipf is the number of inodes with pages freed as a percentage of the total number of inodes freed.

Swapping Measurements

The sar and vmstat commands also supply system swapping information.

Pages swapped in (vmstat -S si and sar -w swpin, bswin)

The vmstat -S si command reports the number of KB/second swapped in, while the sar -w swpin command reports the number of swap-in operations. The sar -w bswin command reports the number of 512-byte blocks swapped in.

Pages swapped out (vmstat -S so and sar -w swpot, bswot)

The vmstat -S so command reports the number of KB/second swapped out, while the sar -w swpot command reports the number of swap-out operations and sar -w bswot reports the number of 512-byte blocks swapped out.
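For completeness, a brief sketch of the swapping-specific forms of these commands (interval and count again arbitrary):

    # Include swap-in and swap-out (si/so) columns, sampling every 5 seconds.
    vmstat -S 5

    # Report swap-in/swap-out operations and 512-byte blocks transferred.
    sar -w 5 12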

More information on analyzing paging statistics can be found in Sun Performance and Tuning, Java and the Internet, Chapter 13: RAM and Virtual Memory.

Managing Swap Space: The swap Command

The efficient operation of a system relies on the effective utilization of both memory and swap space. System administrators must ensure the system has sufficient swap resources so the paging and swapping activities performed by the kernel can operate smoothly. Sun provides the swap utility in the Solaris operating environment to facilitate the addition, deletion, and monitoring of the swap areas used by the memory manager.


Adding a swap area

The -a option of the swap command can be used by system administrators to add a swap area to the system. A swap file must be specified, such as /dev/dsk/c0t0d0s1, as well as a length for the swap area in 512-byte blocks. A swap space must be at least 16 blocks in length.

Deleting a swap area

The -d option of the swap command can be used by system administrators to delete a swap area from the system. Once deleted, blocks will no longer be allocated from this area. In addition, all swap blocks previously in use in this swap area will have been moved to other swap areas.

Listing swap area status

The -l option of the swap command can be used by system administrators to list the status of all swap areas on the system. Several pertinent pieces of information are reported, including: the path name for the swap area, the major/minor device number if a block special device is employed, the length of the swap area, and the available free space.

Printing summary information

The -s option of the swap command can be used by system administrators to print summary information for all swap areas on the system. Total swap space usage and availability is provided. System administrators can determine the total amount of swap space allocated, the amount not currently allocated but reserved for future use, the total amount that is either allocated or reserved, and the amount available for future reservation and allocation.
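A usage sketch tying the four options together; the swap file path and size are hypothetical:

    # Create and register a 512 MB swap file (1048576 512-byte blocks).
    mkfile 512m /export/swapfile
    swap -a /export/swapfile

    # Later, remove it from use; its blocks migrate to other swap areas.
    swap -d /export/swapfile

    # List each swap area, then print a system-wide summary.
    swap -l
    swap -s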

Distributing Swap Space

To achieve peak system performance, system administrators must distribute swap space so it can be utilized effectively. Following a few simple guidelines can dramatically improve system performance:

Place swap partitions or swap files on as many different disks as possible.

Place swap partitions or swap files on the fastest disks in the system.

If more than one disk is available per controller, limit the swap space to only one of those disks.


Although swapping to remote (NFS) file systems is permitted, this practice should be avoided if at all possible. Swapping to remote file systems not only degrades swap performance, it also increases network traffic.

Avoid placing swap files on file systems that are suspected of excess fragmentation or are almost full. If possible, swap files should be placed in a dedicated disk partition to eliminate fragmentation concerns.

More information on managing swap space can be found in the swap(1M) man page.

Process Manager

The new Process Manager in Solaris 7 enables administrators to quickly obtain process-related information in the CDE environment (Figure 3-1). The Process Manager enables administrators to:

Show user, system, or all processes

Sort processes by resource consumption

Suspend, resume, or kill processes

Diagnose sick processes and contact their owning user via Address Manager

More information on the Process Manager can be found in the Solaris operating environment administrative guides located at http://docs.sun.com on the World Wide Web.

Obtaining Process Information: ps

System administrators need to obtain information about active processes in the system. The ps command aids this effort by reporting the process ID, terminal identifier, cumulative execution time, and the command name by default. More information, including processes owned by a specific user, can be obtained by utilizing the options specified in the ps(1) man page.
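Two common invocations, shown as a sketch (the user name is a placeholder):

    # Full listing of every process on the system.
    ps -ef

    # Processes owned by a particular user, with full details.
    ps -fu alice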

Address Space Map

The /usr/proc/bin/pmap command can be used to print the address space map for each running process. Furthermore, the new -x option available in the Solaris 7 operating environment prints resident/shared/private mapping details, providing more information than ever before.
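For example (the process ID 1234 is an arbitrary placeholder):

    # Print the address space map of process 1234, including resident,
    # shared, and private sizes for each mapping (Solaris 7 -x option).
    /usr/proc/bin/pmap -x 1234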


More information on the pmap command can be found in the proc(1M) man page.


4. Tuning Disk Subsystems

This chapter first describes options for selecting disk subsystems which may improve system performance. Following is a description of the tools and utilities for analyzing the performance of these disk subsystems.

It is not unusual for storage capacity requirements to grow by as much as 100% per year. Coincident with this growth is an increasing need to ensure ready access to data whenever it is needed. System administrators know that raw capacity and performance do not stand alone: both scalability and high availability are important practical requirements for ensuring economical and reliable operation and information access. Indeed, disk subsystems must be configured and tuned to match the performance, availability, and cost requirements of users.

Focusing on disk subsystems, this chapter describes key performance criteria, introduces important disk technologies, and describes Sun tools that can aid in the detection, isolation, and correction of performance-related issues. References for more detailed tuning information are provided throughout.

Disk Performance Specifications

A number of factors are helpful in determining the relative performance of disk subsystems:

Disk capacity, the amount of data that can be stored on the disk or in the disk subsystem.


Seek time, the amount of time it takes to move the disk drive's heads from one track to another, and is a product of the performance of the actuator. Disk drives typically spend ten times more time seeking than transferring data, making seek time perhaps the most important indicator of a disk's performance. Some disk vendors quote the time it takes to traverse the entire disk; others cite the time it takes to traverse one-third of the disk.

Rotational speed, the speed at which the disk spins. After the head has moved to the appropriate track, the disk must rotate until the data is underneath the head. The disk's rotational speed determines its latency: the amount of time it takes to set up for a data transfer.

Data transfer rate, the amount of data that can be transferred from the disk to the host per unit time. Both theoretical maximum and typical transfer rates are often quoted. Data transfer rate is quoted in Kilobytes/second, Megabytes/second, or Gigabytes/second.

Zone bit recording, a technique which enables data to be recorded at variable densities throughout the disk, thereby providing higher capacities. This method breaks the surface of the disk into different zones, each with a different number of bytes per track, enabling the bit density to remain constant across the disk's surface. Drives with this feature offer improved transfer rate and capacity.

Bus bandwidth, the amount of data that can be handled by the system bus. SCSI bus bandwidth can be made faster and wider.

Controller, a device which coordinates read and write requests between the host and the disk drive.

Buffer size, the amount of cache employed on the disk controller. Some drives utilize separate read and write buffers; others use a unified cache.

Mean Time Between Failure (MTBF), the average time between failures.

Mean Time to Data Loss (MTDL), the average time a disk subsystem can operate before another failure causes data to be lost.

More information on these and related topics can be found in Sun Performance and Tuning, Java and the Internet, Chapter 8: Disks, and Configuration and Capacity Planning for Solaris Servers, Chapter 7: Storage Subsystem Architecture.


Table 4-1 highlights the key characteristics of the various SCSI standards.

Table 4-1 The SCSI protocols provide a wide variety of performance characteristics

    Characteristic             SCSI              Fast SCSI            Fast and Wide SCSI   Ultra SCSI
    Asynchronous/Synchronous   Asynchronous or   Synchronous,         Synchronous,         Synchronous,
                               Synchronous       Single-ended or      Differential         Differential
                                                 Differential
    Transfer Rate              5 MB/second       10 MB/second         20 MB/second         40 MB/second
    Number of Pins             50 pins           50 pins              68 pins              68 pins
    Width                      8 bits            8 bits               16 bits              16 bits
    Maximum Cable Length       6 m               3 m synchronous,     3 m synchronous,     3 m synchronous,
                                                 25 m differential    25 m differential    12 m differential
    Number of Targets          7                 7                    15                   15

More information on SCSI can be found in Configuration and Capacity Planning for Solaris Servers, Chapter 7: Storage Subsystem Architecture.

Fibre Channel

Fibre Channel is a high performance interconnect standard designed for bidirectional, point-to-point communications between high performance workstations, peripherals, and large host systems. It offers a variety of benefits over other link-level protocols:

Efficiency and high performance

Scalability

Operation over long distances

Easy to cable

Supports popular high level protocols

Supports switched networks

Secure from RF snooping

Invulnerable to RF interference



Theory of Operation

Channels are a generic class of communications technology designed to move data from one point to another with high speed and minimum latency. For speed, functions such as error correction and retry-on-busy are typically implemented in hardware. Because they do not have sophisticated software-based routing and switching capabilities, channels typically operate in environments where every device on the channel is defined and known in advance.

At its most basic, Fibre Channel is just a high speed serial connection between two buffers. Indifferent to the format of the data, Fibre Channel only provides the error checking and necessary signaling to ensure the reliable movement of data between the source and destination buffers.

Fibre Channel employs a mechanism known as a fabric to establish connections between ports. The fabric, like the telephone network, has all the needed intelligence to make routing decisions. The only concern of a port is to ensure its proper connection to the fabric. Each fabric has an address space that allows up to 16 million addresses. A fabric can be simple, or a mix of switches and hubs that connect a complex and widely dispersed network of ports.

Reliability

There are no limits on the size of a transfer between applications using Fibre Channel. Each sequence is broken into frames, the frame size being negotiated by the ports and the fabric. A 32-bit cyclic redundancy check ensures that transmission errors can be reliably detected.

Fibre Channel Arbitrated Loop

An important enhancement to Fibre Channel has been the development of Fibre Channel Arbitrated Loop (FC-AL), developed specifically to meet the needs of storage interconnects. Employing a simple loop topology, FC-AL can support both simple configurations and sophisticated arrangements of hubs, switches, servers, and storage systems (Figure 4-1). Furthermore, by using SCSI protocols over the much faster, more robust Fibre Channel link, FC-AL provides higher levels of performance without requiring expensive and complex changes to existing device drivers.


Figure 4-1 FC-AL's loop topology can support both simple and complex configurations.

Key Benefits of FC-AL

FC-AL offers a host of benefits over other protocols, enabling storage systems based on it to foster the creation of new applications that take full advantage of some impressive specifications:

Gigabit bandwidth

FC-AL is capable of data transfer rates of up to 100 MB/second (dual ported), with 400 MB/second envisioned for the future, far faster than SCSI, Serial Storage Architecture (SSA), or P1394 (Firewire).

Suitability for networks

In addition to performance, Fibre Channel is ideal for building networks. Employing hubs and switches similar to those used in networks, Fibre Channel enables complex arrangements of storage and systems to be connected together in highly scalable, highly available networks, or fabrics.

Use of existing SCSI protocols

FC-AL allows SCSI command packets to be sent over a high speed physical medium, reducing software and firmware costs and minimizing impact on existing software.


Node addressability far better than SCSI

With the ability to support up to 126 FC-AL devices on a single host adaptor, cost and implementation complexity are greatly reduced. Using optical fiber media, a single FC-AL loop can support nodes with a separation of up to ten kilometers from the host.

Unique device naming convention

FC-AL employs unique, 64-bit world wide names for devices. Disk drives, host adapters, and interface boards are identified by a world wide name, enabling improved diagnostics and software support.

Greatly simplified wiring and cabling requirements

Because Fibre Channel is a largely optical, simple serial protocol, electrical interference and expensive cabling are much less of an issue than with the complex parallel data paths used by SCSI.

In addition, FC-AL implementations support redundant data paths, hot pluggable components, multiple host connections, and dual ported drives, features that 15-year-old SCSI technology was never intended to support.

Configuration Flexibility

Fibre Channel Arbitrated Loop enables storage systems to implement several loop configurations, each designed to address varying levels of performance and availability. Utilizing bypass circuits and multiplexors, FC-AL based storage devices like the Sun StorEdge A5000 permit the construction of multiple loops within a single enclosure, improving both performance and availability (Figure 4-2).

More information on Fibre Channel technology can be found in Sun Performance and Tuning, Java and the Internet, Chapter 8: Disks, and Configuration and Capacity Planning for Solaris Servers, Chapter 7: Storage Subsystem Architecture, and in the white papers listed in the References section of this document.


Figure 4-2 FC-AL's loop topology supports both single and dual loop configurations.

RAID

Redundant Arrays of Inexpensive Disks, or RAID systems, were first formally defined by researchers at Berkeley in 1987. The most important part of their idea was to combine the power of small, inexpensive drives to achieve performance equal to a single large, expensive disk. Because RAID required multiple spindles, features were also needed that could protect against the data loss that could be caused by the failure of any one of the drives. These original safeguards subsequently became an important feature in their own right, with many users purchasing RAID systems just for their availability.

Today, disk arrays are sold as an intelligently managed collection of disk drives organized to optimize performance for a particular set of tasks. All RAID systems achieve higher performance and/or reliability by replicating or spanning data across multiple disks. Exactly how this is done has profound effects on subsystem performance, cost, and reliability.

Two metrics are used in assessing the performance of disk subsystems:

Transfer Rate

Transfer rate is the speed (measured in MB/second) with which a subsystem can move data through its controller. In RAID systems, read and write transfer rate performance can vary considerably, and is particularly valuable for applications that have to move large amounts of data quickly, like document imaging or digital video.


I/O Operations per Second (IOPS)

IOPS are a measure of the ability of a storage system to handle multiple, independent I/O requests in a certain period of time. RAID systems with high transfer rates do not always have good transaction rate performance. Database and transaction processing systems are examples of applications that typically require high I/O rate performance.

Recovery After a Failure

When a disk in a protected RAID system does fail, the array continues to operate in reduced mode. In reduced mode, array performance may be lowered, depending on the RAID level being used and whether reads or writes predominate. Upon encountering a failure, the simplest RAID systems inform the administrator of the problem, leaving them to organize the replacement of failed components and to begin the restoration of normal operations. The most sophisticated systems can automatically begin reconstructing data to a reserve drive called a hot spare. Hot sparing allows the array to automatically return to full performance quickly, and permits administrators to defer maintenance to a more convenient time. Some systems even allow defective drives to be replaced without interrupting processing or removing power, but such hot swapping is sometimes less useful than a system that can instantly begin recovery using a hot spare.

Popular RAID Levels

Can RAID subsystems actually eliminate the I/O bottleneck and provide very high levels of performance? In some circumstances, RAID configurations can dramatically improve performance, and in others may actually impair it. To accommodate the various kinds of demands put on storage systems, different levels of RAID have been defined, each with their own special characteristics.

Five RAID levels, numbered 1 through 5, were defined by the Berkeley researchers. Since that time, a few more have been added, some simply variations of the original five. The four RAID levels offered in the Sun StorEdge array product family, levels 0, 1, 5, and 0+1, are discussed here.


    Level 0 - Striping

Striping is an industry term for breaking a data stream up and placing it across multiple disks in equal-sized chunks or stripe blocks. Each chunk is sequentially written to successive drives in the array. The set of sequential chunks that begins with the first drive and ends with the last drive forms the stripe (Figure 4-3). Array management software running on the host, on a special controller inside the array, or some combination of both, is responsible for making the array look like a normal, or virtual, disk to the operating system and applications.

Figure 4-3 RAID 0 breaks data into equal sized chunks of data in sets that form a stripe. In this figure, the stripe width is three, with chunks 1, 2, and 3 forming a single stripe.

Each disk in a stripe is generally assumed to be on its own independent data channel, allowing the transfer rate of a RAID 0 implementation to approach the sum of the transfer rates of each of the drives (presuming that the intervening software and hardware has sufficient throughput).

RAID 0 can also provide excellent IOPS performance, because data is spread across many independent spindles and actuators, helping to balance the I/O load. Optimizing RAID 0 for transfer rate or IOPS performance requires administrators to make configuration decisions about the relationship of the sizes of the chunks in a stripe and the average size of disk transactions.


Level 1 - Mirroring

Used for a long time in fault tolerant applications, this type of array is also called shadowing or mirroring. Designed to optimize data availability rather than speed, RAID 1 duplicates each write transaction to one or more mirror disks (Figure 4-4).

Mirrored systems can be used to improve the IOPS performance of read operations, because the least busy drive can be selected to service requests. Because both drives must be involved in write operations, write performance may be slightly worse than an independent drive.

Figure 4-4 RAID 1, or mirroring, provides data redundancy by recording data multiple times on independent spindles.

Level 5 - Striping with Interleaved Parity

RAID 5 exploits the performance gains of striping while mitigating its vulnerability to data loss by adding error correction information (called parity) to the data.

RAID 5 allows independent access to individual drives in a group with striped data. If the data is interleaved across the group of drives in blocks equal in size or larger than most I/O requests, then drives can respond independently to requests, and even to multiple requests simultaneously. This has the effect of allowing RAID 5 to have very good random read performance. Conversely, if the block sizes in a stripe are smaller than the average size of an I/O request, forcing multiple drives to work together, RAID 5 systems can also enjoy good sequential performance.


    Figure 4-5 RAID 5 interleaves parity and data across the stripe to improve

    performance.

With RAID 5, both the parity information and data are interleaved across the array in a cyclic pattern (Figure 4-5). If one of the disks fails, a simple algorithm allows the missing data to be recovered from the remaining disks. RAID 5 is a good compromise for those needing good random performance without incurring the expense of mirroring.

RAID 5's failing comes from its poor performance on small writes. When a single-block write command is issued to a RAID 5 configuration, the system must first complete four I/O operations and two parity calculations in a read-modify-write sequence, degrading random write performance.

Cached RAID 5

Using a non-volatile cache in the disk controller, write operations can be optimized to use full-width stripes, resulting in very good write performance. This is often known as hardware RAID 5 or controller-based RAID. Sun's StorEdge A1000, A3000, and A7000 subsystems perform high speed cached RAID 5 operations for write-intensive workloads.


RAID 0+1 - Striping plus Mirroring

Useful configurations have been created by combining two existing RAID levels. One of the more popular is known as RAID 0+1, which combines the reliability of mirroring (RAID 1) with the performance of striping (RAID 0) (Figure 4-6).

The reliability of a striped and mirrored system is excellent because of the high redundancy afforded by mirroring. RAID 0+1 systems can tolerate the failure of one or more disks, and continue to deliver data with virtually no performance degradation, unlike RAID 5. RAID 0+1 systems do carry, however, the higher cost of mirrored systems, as data requiring protection needs twice the disk space of simple independent spindles.

Some subsystems also support RAID 1+0, in which data is first mirrored and then striped. This scheme can withstand more individual disk failures and maintain data availability.

Figure 4-6 RAID 0+1 combines mirroring and striping to achieve excellent availability and performance.


Optimizing the RAID Configuration

The various RAID levels have distinctive strengths, and therefore tend to be best for certain kinds of applications (Table 4-2). Database management, for example, requires good random performance, and independent spindles are the best solution. If reliability is a concern, however, mirroring (RAID 1) can give excellent performance, but is expensive for large data sets. RAID 5 offers slightly worse throughput, but with good redundancy and economy. Clearly, finding the right solution requires a good understanding of the I/O load.

Table 4-2 RAID levels provide varying degrees of availability with corresponding trade-offs in performance and cost

    RAID 0 (Striping)
      Strengths:  Improves I/O performance
      Weaknesses: No redundancy; a single disk failure causes all data to be lost

    RAID 1 (Mirroring)
      Strengths:  Improves data availability; may improve read performance
      Weaknesses: Expensive; potentially decreased write performance

    RAID 0+1 (Striping Plus Mirroring)
      Strengths:  Improves data availability without sacrificing performance
      Weaknesses: Expensive

    RAID 5 (Striping Plus Distributed Parity)
      Strengths:  Improves data availability; error correction information; lower cost than mirroring
      Weaknesses: Survives only a single disk failure; poor small write performance

Interlace Size

The interlace size specifies the chunk size: the amount of data written to each disk before moving on to the next component in a stripe. Optimally setting the interlace size can help ensure the RAID configuration performs well for both read and write operations. RAID 0 and RAID 5 configurations require different settings for the interlace size, depending on the type of I/O:

In striped configurations experiencing a high volume of random access I/O, the interlace size should be set large relative to the size of a typical I/O request.

In striped configurations subject to mostly sequential access I/O, the interlace size should be set small relative to the size of the typical I/O request. One typical strategy to determine the interlace size is to divide the typical I/O size by the number of disks in the stripe (Figure 4-7).



Figure 4-7 Setting the interlace size in striped configurations fosters higher sequential performance (Interlace_Size = I/O_Size / Number_of_components).

RAID 5 configurations perform best when write requests align with the stripe and the I/O is a multiple of the interlace size. Indeed, full stripe writes can be utilized if the size of a typical I/O request equals the interlace size multiplied by one less than the number of components in the stripe (Figure 4-8).

Figure 4-8 Setting the interlace size in RAID 5 configurations promotes faster write operations (I/O_Size = Interlace_Size * (Number_of_components - 1)).
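For instance, in the hypothetical five-column RAID 5 device sketched below, a 64 KB interlace means that full-stripe writes become possible for I/O requests of 64 KB x (5 - 1) = 256 KB:

    # Hypothetical Solstice DiskSuite RAID 5 metadevice with five columns
    # and a 64 KB interlace; aligned 256 KB writes fill a whole stripe.
    metainit d30 -r c1t1d0s2 c2t1d0s2 c3t1d0s2 c4t1d0s2 c5t1d0s2 -i 64k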

    More information on configuring and optimizing RAID systems can be found

    in Sun Performance and Tuning, Java and the Internet, Chapter 8: Disks, and

    Configuration and Capacity Planning for Solaris Servers, Chapter 7: Storage

    Subsystem Architecture.

Host-Based RAID versus Controller-Based RAID

In recent years, much has been said about the implementation options available for RAID subsystems, and contention has arisen over the perceived benefits and drawbacks of both host-based RAID and controller-based RAID devices. While traditional controller-based RAID subsystems employing SCSI technology were long seen as the answer to higher performance by minimizing the amount of traffic flowing between the host and peripheral, the advent of 100 MB/second Fibre Channel is bridging the gap. Utilizing a faster transport medium like Fibre Channel, the performance of host-based RAID storage systems competes with, and in many cases surpasses, that of their controller-based RAID counterparts. In addition, the lower cost and higher degree of flexibility offered with host-based RAID solutions like the Sun StorEdge A5000 ensures that users have access to the best price/performance configurations available.



Disk Activity and Performance – iostat

Two factors that impact overall system throughput are CPU utilization and disk I/O activity. If there are insufficient CPU resources to process data and execute tasks, data will not be streamed to devices and performance will suffer. In addition, it is possible for disks to become saturated, rendering them incapable of handling higher loads of traffic.

The iostat utility bundled in the Solaris operating environment helps system administrators determine disk activity patterns and performance. In order to iteratively report disk I/O activity as well as CPU utilization, the Solaris kernel maintains a series of counters:

The number of reads and writes per disk
The number of bytes read and bytes written per disk
Time stamps
The residence time and cumulative residence length product for each queue

These statistics enable iostat to produce highly accurate measures of throughput, utilization, queue lengths, transaction rates, and service times (Figure 4-9).

Figure 4-9 The iostat utility reports I/O statistics which can be analyzed to determine CPU and disk bottlenecks.

More information on iostat can be found in the iostat(1M) man page.

# iostat -xp
                          extended device statistics
device    r/s   w/s  kr/s  kw/s  wait  actv  svc_t  %w  %b
dad0      0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
dad0,a    0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
dad0,b    0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
dad0,c    0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
dad1      0.0   0.2   0.1   1.4   0.0   0.0   61.7   0   0
dad1,a    0.0   0.2   0.1   1.4   0.0   0.0   61.7   0   0
dad1,b    0.0   0.0   0.0   0.0   0.0   0.0    7.9   0   0
dad1,c    0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
fd0       0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
sd2       0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
nfs1      0.0   0.0   0.0   0.0   0.0   0.0    0.0   0   0
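A typical way to use iostat is to sample at a fixed interval and watch for disks that stay close to 100 percent busy (%b) or show persistently high service times (svc_t); such disks are candidates for load balancing. A simple sketch:

    # Report extended device statistics every 5 seconds, 10 times.
    iostat -x 5 10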


Managing Disk Space

Data availability depends on disk drives being able to read and write data at any given time. System administrators must ensure that adequate disk and file system space exists for users and applications. Quotas can be used to constrain users from utilizing too much disk space.
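As a minimal sketch of enabling UFS quotas (the file system and user name are illustrative), the quotas file is created at the root of the file system, limits are assigned, and quota checking is turned on:

    # Create the quotas file at the file system root.
    touch /export/home/quotas
    chmod 600 /export/home/quotas
    # Set block and inode limits for a hypothetical user, then enable quotas.
    edquota jdoe
    quotaon /export/home
    # Report current usage against the configured limits.
    repquota /export/home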

Determining Free Disk Blocks and Files – df

To determine the number of free disk blocks and files on the disks in the system, administrators can employ the df command included in the Solaris operating environment. The df command displays the amount of disk space occupied by currently mounted file systems, the amount of used and available space, as well as the amount of each file system's capacity that has been used (Figure 4-10).

Figure 4-10 The df utility reports free disk blocks and files, enabling administrators to quickly isolate capacity problems that can lead to decreased performance and availability.

More information on df can be found in the df(1M) man page.

# df -ak
Filesystem              kbytes     used    avail  capacity  Mounted on
/proc                        0        0        0      0%    /proc
/dev/dsk/c0t1d0s0      3875742   816409  3020576     22%    /
fd                           0        0        0      0%    /dev/fd
swap                    637832    55312   582520      9%    /tmp
auto.shared                  0        0        0      0%    /shared
auto.scde                    0        0        0      0%    /scde
auto.src                     0        0        0      0%    /src
-hosts                       0        0        0      0%    /net
auto.doe                     0        0        0      0%    /doe
auto.ws                      0        0        0      0%    /ws
auto_home                    0        0        0      0%    /home
-xfn                         0        0        0      0%    /xfn
nevermore:vold(pid236)       0        0        0      0%    /vol


Determining Disk Usage – du

The ability to ensure that disks do not completely fill aids both performance and availability. System administrators can obtain a quick gauge of disk usage by executing the du command. The du command recursively describes the number of kilobytes contained in all files and directories (Figure 4-11). The size of the file space allocated to a directory is reported as the sum of the total space allocated to the files contained in the directory, as well as the space allocated to the directory itself. All file sizes are reported in, and rounded to, the nearest 512-byte unit.

Figure 4-11 The du utility describes the amount of disk space allocated to files and directories.

# du -as /tmp
306128  /tmp

More information on the du command can be found in the du(1M) man page.
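Combined with sort, du also gives a quick ranking of the largest directories; the path below is illustrative:

    # Per-directory usage (in kilobytes) under /export/home, largest last.
    du -sk /export/home/* | sort -n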

System Activity Reporter – sar

The Solaris operating environment includes a system activity reporter (sar) that samples cumulative activity counters at specified intervals. The sar command can also be used to determine storage related information, including:

Disk activity levels for system disks (-d option)
Buffer cache activity (-b option)
File access system routine usage (-a option)
Process status (-v option)
Average queue length (-q option)

More information on the sar command can be found in the sar(1M) man page.
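A brief usage sketch, sampling at five-second intervals:

    # Disk activity for 12 intervals.
    sar -d 5 12
    # Buffer cache read and write hit rates (%rcache, %wcache).
    sar -b 5 12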



Volume Management

The combination of disk drives and intelligent software to manage them is commonly referred to as a storage management system. With increasing numbers of disks and a higher demand for availability in mission-critical environments, storage management systems have a number of requirements:

Reliability
Organizations need assurance that the data they depend on is accurate. Storage management systems must implement the features necessary to ensure the accurate storage and retrieval of data.

Data and system availability
Many organizations run 7x24. In these environments, it is imperative for data to be available on demand. For many, unavailable data can mean a significant loss in income. When a system fails, data must be accessible to an alternate host. In the event of a disk failure, some means must exist for the data stored on that disk to be made available to users.

Performance
Storage management systems must provide the performance required by today's demanding applications and environments.

Management of large amounts of data
Users are generating more data than ever before. Organizations need a method for organizing disks and the data on them such that data reliability and availability are maintained without sacrificing performance.

Simplified administration
Managing a large number of disks can be daunting and challenging. Storage management systems must provide tools to ease this task.

Application integration
A storage management system must fit seamlessly into existing customer server, database and OLTP environments.

Sun provides two tools to aid these efforts: the Sun StorEdge Volume Manager and RAID Manager. More information on Sun volume management solutions can be found at http://www.sun.com/storage/products.html on the World Wide Web.


5. File System Tuning

The Solaris operating environment supports several file system types, each designed to provide high performance to users. These file systems include the standard Unix File System (UFS), a RAM disk file system (Tmpfs), and the VERITAS file system (VxFS) discussed below. Two additional file systems, CacheFS and NFS, are geared toward enhancing the performance of network operations and are discussed in chapter 7 of this document.

Unix File System

The Unix File System (UFS) is the standard file system used to manage disk drives in the Solaris operating environment. Based on the Berkeley Fast File System, UFS is designed for high performance. Its tree-like structure, managed by a structure known as an inode, enables fast access to directories and files. With UFS, all files that are read are stored in RAM. If RAM becomes full, files are flushed to disk and retrieved when needed. Consequently, the more RAM in the system, the better the overall file system performance.

UFS and Transaction Logging

When unscheduled system downtime occurs, such as a system panic or power failure, partially completed updates to the file system are possible. If partial updates exist, the file system can be left in an inconsistent state. To fix these inconsistencies, UFS file systems are checked at boot time (with fsck and quotacheck) and repaired, if possible. Because multiple and complex


directory hierarchies often exist, and because fsck must traverse each and every hierarchy to ensure its integrity and consistency, the fsck process is often time consuming.

The combination of Solaris 2.4 and the Solstice DiskSuite storage management tool offers a new feature called UFS Logging, sometimes referred to as journalling. UFS logging takes the (logically) synchronous nature of updating a file system and makes it asynchronous. Updates to a file system are not made directly to the disk or partition containing the file system. Instead, user data is written to the device containing the file system, but the file system disk structures are not modified; they are logged instead.

Updates are applied to the file system structure at key times to ensure consistency. Changes made to the file system by unfinished system calls are discarded, ensuring that the file system is in a consistent state. This means that logging file systems do not have to be checked at boot time, speeding the reboot process.

System administrators should consider the following when configuring UFS file systems with transaction logging:

The absolute minimum size for a transaction log is 1 MB of disk space. A recommended minimum is 1 MB of log space for every 1 GB of file system space.

The larger the log, the more simultaneous file system operations per second. As a general rule, administrators should plan on 1 MB of log space for every 100 MB of file system space being logged.

Small file systems, or primarily read-intensive file systems, will not benefit from transaction logging. However, file systems experiencing a significant number of updates should be logged.

More information about UFS transaction logging can be found in the Solstice DiskSuite User's Guide and Solstice DiskSuite Reference Guide available at http://www.docs.sun.com on the World Wide Web.
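A minimal Solstice DiskSuite sketch of enabling UFS logging is shown below; the metadevice and slice names are illustrative, with the master device holding the existing file system and a small dedicated slice holding the log:

    # Master and log metadevices, then the trans (logging) metadevice.
    metainit d11 1 1 c1t0d0s6      # master: holds the UFS being logged
    metainit d12 1 1 c2t0d0s3      # dedicated log slice
    metainit d10 -t d11 d12        # trans metadevice combining the two
    mount /dev/md/dsk/d10 /export  # mount the logged file system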

Configuring UFS File Systems for Maximum Performance

The advent of very large disk drives and RAID subsystems is leading to ever increasing file system sizes. Users of large file systems need to reset a number of UFS parameters to ensure efficient disk utilization and I/O performance:


inode density
When UFS was introduced, most files were very small, averaging only 1 to 2 KB in size. Today, files are approaching 64 KB or larger. To ensure the file system utilizes space efficiently, system administrators should set the inode density to one inode per 8 KB of file space.

minfree
Disks have grown tremendously since the minfree=10% rule was set back in 1980. At that time, 10 percent of a disk ranged from 2.5 MB to 5 MB. Today, 10 percent of a typical disk is several hundred megabytes. For large file systems, or those using disk drives larger than 2 GB, minfree should be set to 1 percent to reduce file system overhead and wasted disk space. On large file systems, minfree is automatically reduced (Solaris 2.6 and later).

maxcontig
The file system cluster size, or maxcontig, specifies the maximum number of blocks belonging to a single file that can be allocated contiguously before inserting a rotational delay of the disk. Performance may be improved if the file system I/O cluster size is an integral multiple of the stripe width of file systems on RAID 0 or RAID 5 volumes.

cylinders per cylinder group
Volumes and file systems that utilize more than 8 GB of disk space may need to increase the size of a cylinder group from 16 cylinders per cylinder group to as many as 256 cylinders per cylinder group.

    More information on configuring UFS file systems can be found in...?
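These parameters map onto newfs and tunefs options; a hedged sketch for a hypothetical large file system (the device name is illustrative) might look like:

    # One inode per 8 KB of file space, 1 percent minfree, and
    # 256 cylinders per cylinder group at file system creation time.
    newfs -i 8192 -m 1 -c 256 /dev/rdsk/c1t2d0s6
    # Raise maxcontig (in file system blocks) on an existing file system.
    tunefs -a 128 /dev/rdsk/c1t2d0s6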

Tmpfs

The Solaris operating environment includes support for Tmpfs, a temporary file system that utilizes memory to cache writes scheduled to go to disk. System administrators can improve system performance by making extensive use of Tmpfs to store short-lived files. This scheme enables temporary files to be written and accessed without the typical performance impact associated with writing information to disk.


A system is a candidate for Tmpfs usage if:

It runs applications that create temporary, short-lived files. (Systems that run compilers are a good example of this type of environment.)
It has a lot of memory and swap space.
It employs slow disks.

Since many applications already employ the /tmp directory on Solaris systems for temporary files, it is a natural candidate for Tmpfs. System administrators should note, however, that any directory can utilize Tmpfs.

    More information on Tmpfs can be found in...?
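On most Solaris systems, /tmp is already mounted as Tmpfs through an /etc/vfstab entry similar to the sketch below; a comparable entry can be added for any other directory that holds short-lived files:

    #device    device    mount   FS      fsck    mount    mount
    #to mount  to fsck   point   type    pass    at boot  options
    swap       -         /tmp    tmpfs   -       yes      -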

VERITAS File System

The VERITAS File System is a high-performance, fast recovery, enterprise file system. Typically employed with very large mass storage subsystems from Sun, the VERITAS File System offers a host of features and benefits to users, including:

Enterprise file system
The VERITAS File System is designed as an enterprise file system, providing superior file system performance in certain applications and increased availability over the standard UNIX file system (UFS).

Availability
High availability is an essential requirement of enterprises running mission-critical applications. The VERITAS File System provides fast recovery after system crashes, decreasing system downtime and increasing data and application availability. Administrative operations such as file system backup, resizing, and defragmentation can occur on-line. Administrators no longer need to take the file system off-line to perform these common operations, increasing file system availability and making administration more convenient.

High performance
Using a more efficient extent-based allocation scheme, hashed directories, as well as layout and caching modifications, the VERITAS File System provides the high I/O performance demanded by large enterprise databases and other applications.


Extent-Based Allocation

The VERITAS File System allocates space in contiguous segments, or extents, instead of small fixed-size disk blocks. This can accelerate sequential I/O by reducing seek and disk rotation times. Performance is also improved as fewer I/O operations are needed to read and write large amounts of data.

Extent sizes can be chosen automatically by the VERITAS File System based on the I/O pattern of the file, or can be explicitly selected by the administrator.
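As a rough sketch (exact options vary by VxFS release; the device and mount point are illustrative), a VERITAS file system is created and mounted with the standard Solaris commands and the vxfs file system type:

    # Build a VxFS file system on a raw device, then mount it.
    mkfs -F vxfs /dev/rdsk/c1t3d0s4
    mount -F vxfs /dev/dsk/c1t3d0s4 /data01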

    More information on the VERITAS File System can be found at

    http://www.sun.com/storage/software/veritas.html on the World Wide Web.

Tuning File Systems – tunefs

The layout of a file system can have a dramatic effect on I/O performance. When a file system is over 90 percent full, system administrators should optimize it for time, ensuring higher throughput. The tunefs utility is designed to change the dynamic parameters of a file system that affect layout policies.

More information on tunefs can be found in the tunefs(1M) man page.
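A short sketch of the optimization described above (the device name is illustrative):

    # Favor throughput (time) over space on a nearly full file system.
    tunefs -o time /dev/rdsk/c0t1d0s7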


6. Networking

The last decade has seen a ten-thousand-fold increase in computer performance. This increase in power, along with advances in operating systems and graphical desktop environments, has spurred the development of an array of new software tools and technologies. These new capabilities in software have in turn pushed computing resources to their limit, causing users to demand higher performance hardware. Networks are developing through a similar technology and demand-driven cycle.

Networks must not only keep pace with technological advancements, they must ensure that sufficient resources are available to meet the demands of increasingly sophisticated applications. To meet these needs, networks must perform well and respond to client requests as needs dictate.

When a network server is unable to respond to a client's request, the client typically retransmits the request a number of times. Each retransmission induces additional system overhead and generates more network traffic. System administrators can mitigate the extent of excessive retransmissions by improving three key conditions: data integrity, system performance, and network congestion.
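One quick way to gauge whether excessive retransmissions are occurring is to inspect the protocol statistics the kernel already keeps; on Solaris, netstat reports cumulative TCP counters, including retransmitted segments:

    # Display per-protocol statistics; the TCP section includes
    # retransmission counters worth tracking over time.
    netstat -s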


    Selecting the Right Network Technology

    Local Area Networks

Local Area Networks, or LANs, provide a means to connect systems within a limited distance, facilitating information and resource sharing in a multivendor environment. While adequate for small, centralized environments, existing LAN technologies are quickly becoming a bottleneck in larger distributed computing environments.

In the future, with resources as likely to be remote as local, networks must be able to accommodate the rapid movement of large amounts of data. With escalating needs for collaboration, organizations are demanding advanced functionality, such as video multicasting, time-synchronous applications, and integration with wide area networks. Enterprises require high speed networks (100 Megabits/second or above) but will not accept expensive changes to their existing hardware, software and training. To keep up with the expanding demands of computing, Sun believes that three technologies have promise: Ethernet, FDDI, and ATM.

    Ethernet

Conventional Ethernet was designed when it was ambitious for an affordable departmental system to deliver 1 SPECmark. Provided as a standard feature or low-cost option with nearly every workstation, 10 Megabit/second Ethernet (Ethernet or Ethernet 10) is a ubiquitous, inexpensive alternative which is simple to deploy. Its popularity has made possible the economies of scale that come from volume manufacturing, which in turn has made Ethernet inexpensive.

Although it has extensive capabilities, such as ease of connectivity, and a stable installed base, Ethernet is inadequate for today's ambitious requirements (Figure 7-2). Its inability to scale to more than a few hundred users, or easily accommodate bandwidth-intensive applications, makes it difficult to use in many environments.

Because Ethernet is collision-based, only one device can transmit at a given time. Should two devices attempt to transmit simultaneously, their data packets will collide, resulting in lower aggregate throughput. A 10BaseT Ethernet network is considered saturated at 35 percent utilization.


Consequently, approximately 500 KB per second of traffic can saturate a 10BaseT LAN. The use of switches and routers can minimize collisions and improve LAN performance by breaking the network into pieces and isolating traffic to smaller segments.
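Collision levels on a shared Ethernet segment can be checked from any attached host; a steadily rising collision count relative to output packets suggests the segment is approaching saturation:

    # Per-interface packet, error, and collision counts.
    netstat -i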

Figure 7-1 Today, networking technology is characterized by inadequate performance, complex interconnections, and multiple protocols: a typical campus backbone runs at 10 Mbps, with bridges/routers and servers feeding 10BaseT hubs that each serve 10-20 workstations in a workgroup.

Fiber Distributed Data Interface

Primarily used to satisfy the needs of data communication users, the Fiber Distributed Data Interface (FDDI) has become a standard for networks requiring 100 Megabit/second transmission speeds. Providing media access control, data encoding and decoding, as well as defining drivers for fiber-optic components and a multilayered network management scheme, FDDI often is used as the backbone for lower speed networks such as Ethernet 10.

Although a good investment when speed is required, FDDI has limitations. It is more expensive to deploy than Ethernet, it offers limited bandwidth, and it is less popular, and therefore more expensive, than other networking technologies. Nevertheless, for users that require standardization, a high degree of interoperability, flexibility and robustness, FDDI is one of the most mature high speed network technologies available today.



    Fast Ethernet

For users needing a lower-cost alternative to FDDI, Fast Ethernet (100 Megabit/second Ethernet) may be the networking technology of choice. Fast Ethernet is an extension to the Ethernet 10 standard, supporting a wide range of applications with increased bandwidth. Fast Ethernet maintains compatibility with the installed base of wiring currently employed for Ethernet. It has the ability to automatically sense 10 Megabits/second or 100 Megabits/second operation and can adjust to either speed. Fast Ethernet ensures maximum compatibility with existing environments and applications, providing an investment-preserving migration path from conventional Ethernet.

Gigabit Ethernet

The transition of faster interfaces, such as Fast Ethernet, to a widespread desktop technology has created another problem for network administrators. As the number of devices entering the network at 100 Mbps grows, the demand for higher speed at the backbone and server level also increases. Organizations are finding they need to increase the level of network performance to support computing power demands.

Expected to become the solution of choice for network administrators, Gigabit Ethernet provides a raw data bandwidth of 1000 Mbps (or 1 Gigabit per second) that aims to relieve the increased network congestion experienced at the backbone and server levels. Not just a high performance networking technology, Gigabit Ethernet also gives administrators the flexibility they need: new full-duplex operating modes for switch-to-switch and switch-to-server connections, and half-duplex modes for shared connections using repeaters and the carrier sense multiple access/collision detection (CSMA/CD) access method.

More information on Gigabit Ethernet and its application can be found in the Sun and Gigabit Ethernet white paper located on the World Wide Web at http://www.sun.com/products-n-solutions/hw/networking/whitepapers/gigabitwp.html.

Asynchronous Transfer Mode

Perhaps the most promising new network technology is Asynchronous Transfer Mode (ATM). Based on cell switching technology, ATM uses small, fixed-sized cells and inexpensive, high-bandwidth, low-latency switches. With data rates from 51 Megabits per second to over 2 Gigabits per second, ATM offers the performance needed by the most demanding interactive applications (Figure 7-2).

Figure 7-2 Advanced networking technologies have performance far exceeding that of Ethernet, the most commonly used network today (Ethernet: 10 Mbps; Fast Ethernet: 100 Mbps; FDDI: 100 Mbps; ATM: 51 Mbps to 2.5 Gbps).

ATM is independent of the speed of the line on which it runs, making it adaptable to most wiring technologies. ATM can run full duplex, permitting the simultaneous transmission and receipt of data without interruption. Its flexibility and low latency allow real-time applications like video and audio to be supported along with data services.

A scalable technology offering a deterministic quality of service, ATM allows the network to be tailored to the application. A global, mobile, scalable, and application transparent protocol, ATM allows the mixture of data, voice and video. With its increased performance and ability to support advanced applications, ATM holds the promise of being the premier network solution for the rest of the decade and beyond (Figure 7-3).



Figure 7-3 ATM has many characteristics that make it the preferred solution for most future networking requirements: data segmented in fixed 53-byte cells, circuit orientation for easy routing, transparency to data type, multimedia support, real-time capability, low latency, dynamic bandwidth, a star topology with full bandwidth to each user, easy network maintenance, seamless integration between local and wide-area networks, scalable bandwidth, large volume and low cost, and wide vendor acceptance.

    Wide Area Networks

Wide Area Networks, or WANs, provide a means to connect systems which are distributed over a large geographic area, facilitating information and resource sharing. WANs must be low-cost, high speed, and easily integrated with LANs. Three technologies are emerging as leading WAN contenders: Frame Relay, Integrated Services Digital Network (ISDN) and Asynchronous Transfer Mode (ATM).

Frame Relay

Frame Relay is employed to efficiently transfer information with minimal delay. Designed to handle data communication with large, bursty data packets, it uses a transmission channel only when there is data present. This offers significant advantages over LAN technologies that tie up lines for the entire duration of a session. Frame Relay's ability to efficiently provide such on-demand transmission has garnered considerable attention. However, its inability to provide consistent performance, its imprecise specifications for congestion avoidance and recovery mechanisms, and its inefficiency in transmitting voice and video preclude Frame Relay from being a multimedia WAN solution.

    Integrated Services Digital Network

Integrated Services Digital Network (ISDN) emerged through the need for a single integrated structure that efficiently accommodated a variety of services. The most widely used network technology, ISDN has addressed the growing needs of digital data communications in a variety of industries. The ability to transport information digitally allows a single line to transmit up to 1.5 Megabits per second while providing superior quality of service over analog

    tr