1. Campus infrastructure model

The simplest Ethernet network infrastructure is composed of a single collision and broadcast domain. This type of network is referred to as a "flat" network because any traffic that is transmitted within this network is seen by all of the interconnected devices even if they are not the intended destination of the transmission. The benefit of this type of network is that it is very simple to install and configure, so it is a good fit for home networking and small offices. The downside of a flat network infrastructure is that it does not scale well as demands on the network increase. Some issues with nonhierarchical networks include:

* Traffic collisions on the network increase as devices are added, impeding traffic flow on the network.

* Broadcast traffic increases as devices are added to the network, causing overutilization of network resources.

* Problem isolation on a large flat network can be difficult.

The table of Network Devices shows the key network hardware devices in a nonhierarchical network and the function of each.

Layer 2 network issues

Layer 2 switches can significantly improve performance in a CSMA/CD network when used in place of hubs. This is because each switch port represents a single collision domain and the device connected to that port does not have to compete with other devices to access the media. Ideally, every host on a given network segment is connected to its own switch port eliminating all media contention as the switch manages network traffic at Layer 2. An additional benefit of Layer 2 switching is that large broadcast domains can be broken up into smaller segments by assigning switch ports to different VLAN segments.

For all their benefits, some drawbacks still exist in a nonhierarchical-switched network:

* If switches are not configured with VLANs, very large broadcast domains may be created.

* If VLANs are created, traffic cannot move between VLANs using only Layer 2 devices.

* As the Layer 2 network grows, the potential for bridge loops increases. Therefore, the need to use a Spanning Tree Protocol becomes imperative.

Multilayer switching

Multilayer switching is hardware-based switching and routing, integrated into a single platform. In some cases, the frame and packet forwarding operation is handled by the same specialized hardware ASIC and other specialized circuitry. A multilayer switch does everything to a frame and packet that a traditional switch or router does, including the following:

* Provides multiple simultaneous switching paths

* Segments broadcast and failure domains

* Provides destination-specific frame forwarding based on Layer 2 information

* Determines the forwarding path based on Layer 3 information

* Validates the integrity of the Layer 2 frame and Layer 3 packet via checksums and other methods

* Verifies packet expiration and updates accordingly

* Processes and responds to any option information

* Updates forwarding statistics in the Management Information Base (MIB)

* Applies security and policy controls, if required

* Provides optimal path determination

* The more expensive or sophisticated multilayer switches are modular and support a wide variety of media types and port densities.

* Has the ability to support QoS

* Has the ability to support VoIP and in-line power requirements

Because it is designed to handle high-performance LAN traffic, a multilayer switch can be placed anywhere within the network, cost-effectively replacing the traditional switches and routers. Generally, however, a multilayer switch may be more than is required to provide end systems access to network resources.

Enterprise Composite Network model:

Building Access layer – The Building Access layer is used to grant user access to network devices. In a network campus, the Building Access layer generally incorporates switched LAN devices with ports that provide connectivity to workstations and servers. In the WAN environment, the Building Access layer at remote sites may provide access to the corporate network across WAN technology.

Building Distribution layer – The Building Distribution layer aggregates the wiring closets and uses switches to segment workgroups and isolate network problems. Routing and packet manipulation occur in the Building Distribution layer.

Building Core layer – The Building Core layer is a high-speed backbone and is designed to switch packets as fast as possible. Because the core is critical for connectivity, it must provide a high level of availability and adapt to changes very quickly. Routing and packet manipulation above Layer 2 should be avoided in the Core, if possible.

Management Module – The primary goal of the management module is to facilitate the secure management of all devices and hosts within the enterprise architecture.

Core Module – The core module in the network architecture is nearly identical to the core module of any other network architecture. It merely routes and switches traffic as fast as possible from one network to another.

Building Distribution Module – This module provides distribution layer services to the building switches. These include routing, quality of service (QoS), and access control. Requests for data flow into these switches and onto the core, and responses follow the identical path in reverse.

Building Access Module – This module is described as the extensive network portion that contains end-user workstations, phones, and their associated Layer 2 access points. Its primary goal is to provide services to end users.

Server Module – The server module's primary goal is to provide application services to end users and devices. Traffic flows on the server module are inspected by on-board intrusion detection within the Layer 3 switches.

Edge Distribution Module – This module aggregates the connectivity from the various elements at the edge. Traffic is filtered and routed from the edge modules and routed into the core.

The Enterprise Composite Network model contains these three major functional areas:

Enterprise Campus – The Enterprise Campus functional area contains the modules required to build a hierarchical, highly robust campus network that offers performance, scalability, and availability. This area contains the network elements required for independent operation within a single campus, such as access from all locations to central servers. The Enterprise Campus functional area does not offer remote connections or Internet access.

Enterprise Edge – The Enterprise Edge aggregates connectivity from the various resources external to the enterprise network. As traffic comes into the campus, this area filters traffic from the external resources and routes it into the Enterprise Campus functional area. It contains all of the network elements for efficient and secure communication between the Enterprise Campus and remote locations, remote users, and the Internet. The Enterprise Edge would replace the "DMZ" area of most networks.

Service Provider Edge – This functional area represents connections to resources external to the campus. This area facilitates communication with WAN and Internet service provider (ISP) technologies.

Enterprise Composite Network model benefits:

* It defines a deterministic network with clearly defined boundaries between modules. The model also has clear demarcation points, so that the designer knows exactly where traffic is located.

* It increases network scalability and eases the design task by making each module discrete.

* It provides scalability by allowing enterprises to add modules easily. As network complexity grows, designers can add new functional modules.

* It offers more network integrity in network design, allowing the designer to add services and solutions without changing the underlying network design.

Modules of the enterprise campus:

Campus Infrastructure module – comprised of a Building Access and Distribution submodule. Connects users within the campus to the Server Farm and Edge Distribution modules. The Campus Infrastructure module is composed of one or more floors or buildings connected to the Campus Backbone submodule.

Network Management module – performs system logging and authentication as well as network monitoring and general configuration management functions.

Server Farm module – contains e-mail and corporate servers providing application, file, print, e-mail, and Domain Name System (DNS) services to internal users.

Edge Distribution module – aggregates the connectivity from the various elements at the Enterprise Edge functional area and routes the traffic into the Campus Backbone submodule.

A Campus Infrastructure module includes these submodules:

Building Access submodule (also known as Building Access layer) – Contains end-user workstations, IP Phones, and Layer 2 access switches that connect devices to the Building Distribution submodule. The Building Access submodule performs services such as support for multiple VLANs, Private VLANs and establishment of trunk links to Building Distribution layer and IP Phones. Each Building Access switch has connections to redundant switches in the Building Distribution submodule.

Building Distribution submodule (also known as Building Distribution layer) – Provides aggregation of Building Access devices, often using Layer 3 switching. The Building Distribution submodule performs routing, QoS, and access control. Traffic generally flows through the Building Distribution switches and onto the campus core or backbone. This submodule provides fast failure recovery because each Building Distribution switch maintains two equal-cost paths in the routing table to every Layer 3 network number. Each Building Distribution switch has connections to redundant switches in the core.

Campus Backbone submodule (also known as Building Core layer) – Provides redundant and fast-converging connectivity between buildings, the Server Farm, and the Edge Distribution modules. The purpose of the Building Core submodule is to switch traffic as fast as possible from one module to another. Forwarding decisions should be made at the ASIC level whenever possible. Routing, ACLs, and processor-based forwarding decisions should be avoided at the core and implemented at Building Distribution devices whenever possible. High-end Layer 2 or Layer 3 switches are used at the Core for high throughput, with optimal routing, QoS, and security capabilities available when needed.

Hierarchical Campus Model

Enterprise Composite Model Function Areas

Issues that stem from a poorly designed network:

Failure domains – One of the most important reasons to implement an effective design is to minimize how far reaching a network problem is when it occurs. When Layer 2 and Layer 3 boundaries are not clearly defined, failure in one network area can have a far-reaching effect.

Broadcast domains – Broadcasts exist in every network. Many applications and many network operations require broadcasts to function properly; therefore, it is not possible to completely eliminate broadcasts. Just as with failure domains, in order to minimize the negative impact of broadcasts, broadcast domains should have clear boundaries and include an optimal number of devices.

Large amount of unknown MAC unicast traffic – Catalyst switches limit unicast frame forwarding to ports associated with the specific unicast address. However, frames arriving for a destination MAC address not recorded in the MAC table are flooded out all switch ports; this is known as "unknown MAC unicast flooding." Because this causes excessive traffic on switch ports, NICs have to attend to a larger number of frames on the wire, and security can be compromised as data is propagated on a wire for which it was not intended.

Multicast traffic on ports where not intended – IP multicast is a technique that allows IP traffic to be propagated from one source to a multicast group identified by a single IP and MAC destination group address pair. Similar to unicast flooding and broadcasting, multicast frames will be flooded out all ports on the same VLAN where they were received. A proper design allows for containment of multicast frames while allowing them to be functional.

Difficulty in management and support – Because a poorly designed network may be disorganized, poorly documented, and lacking easily identified traffic flows, support, maintenance, and problem resolution become time-consuming and arduous tasks.

Possible security vulnerabilities – A poorly designed switched network with little thought to security requirements at the access layer can compromise the integrity of the entire network.
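The unknown MAC unicast flooding described in the list above can be illustrated with a toy model of a learning switch. This is a minimal sketch, not how any Catalyst platform is implemented: the `LearningSwitch` class, its port numbers, and the shortened MAC strings are all hypothetical, and VLANs and table aging are ignored.

```python
class LearningSwitch:
    """Toy Layer 2 switch: learns source MACs, floods unknown destinations."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port where it was last seen

    def receive(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port  # learn where the sender lives
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown unicast: flood to every port except the ingress port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination sits on the ingress port; nothing to forward
        return [out_port]


switch = LearningSwitch(ports=[1, 2, 3, 4])
flooded = switch.receive(1, "aa:aa", "bb:bb")  # bb:bb has not been learned yet
reply = switch.receive(2, "bb:bb", "aa:aa")    # aa:aa was learned on port 1
direct = switch.receive(1, "aa:aa", "bb:bb")   # bb:bb is now in the table
print(flooded, reply, direct)  # [2, 3, 4] [1] [2]
```

The first frame reaches ports 3 and 4 even though no intended receiver is attached there, which is the excess traffic and exposure the text warns about.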

Here are some benefits of hierarchical addressing:

Ease of Management and Troubleshooting – Hierarchical addressing groups network addresses contiguously. Network management and troubleshooting are more efficient because a well-known IP addressing scheme makes problem components easier to locate.

Minimize Error – Orderly network address assignment can minimize error and duplicate address assignment.

Reduced number of routing table entries – In a hierarchical addressing plan, routing protocols are able to invoke route summarization which allows a single routing table entry to represent a collection of IP network numbers. Route summarization makes routing table entries manageable and provides the following benefits:

o Reduced number of CPU cycles when recalculating a routing table or sorting through the routing table entries to find a match

o Reduced router memory requirements

o Faster convergence after a change in the network

o Easier troubleshooting
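Route summarization can be demonstrated with Python's standard `ipaddress` module. The sketch below assumes eight contiguous /24 networks in a made-up 10.1.16.0 to 10.1.23.0 range; the addresses are illustrative only and do not come from the source.

```python
import ipaddress

# Eight contiguous /24 networks, as a hierarchical plan might assign
# to one switch block (the address range is an assumption).
subnets = [ipaddress.ip_network(f"10.1.{i}.0/24") for i in range(16, 24)]

# collapse_addresses merges adjacent networks into the fewest covering prefixes.
summary = list(ipaddress.collapse_addresses(subnets))
print(len(subnets), "routes summarized into", summary)
```

Because the block starts on a multiple of eight and is contiguous, the eight entries collapse into the single prefix 10.1.16.0/21. A non-contiguous or misaligned assignment would collapse into several prefixes instead.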

Guidelines for Applying IP Address Space in the Enterprise Network

The Enterprise Composite Network Model provides a modular framework for designing and deploying networks. It also provides the ideal structure for overlaying a hierarchical IP addressing scheme. Some guidelines to follow are:

Design the IP addressing scheme so that blocks of 4, 8, 16, 32, or 64 contiguous network numbers can be assigned to the subnets in a given Building Distribution and Access switch block.

At the Building Distribution layer, continue to assign network numbers contiguously out toward the Access layer devices.

Have a single IP subnet correspond with a single VLAN.

Subnet at the same binary value on all network numbers, avoiding variable-length subnet masks when possible, in order to minimize error and confusion when troubleshooting or configuring new devices and segments.
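The guidelines above can be sketched as a small address plan. Everything specific here is an assumption made for illustration: the 10.1.0.0/16 campus range, the switch-block names, and the VLAN numbering scheme. The sketch gives each switch block a contiguous block of eight /24 networks, uses one subnet per VLAN, and applies the same mask everywhere.

```python
import ipaddress

# Hypothetical campus range and switch-block names, chosen only for illustration.
campus = ipaddress.ip_network("10.1.0.0/16")
switch_blocks = ["building-A", "building-B"]

# Carve the campus range into /21 blocks: each holds 8 contiguous /24 networks.
blocks = campus.subnets(new_prefix=21)

plan = {}
for index, (name, block) in enumerate(zip(switch_blocks, blocks)):
    first_vlan = 100 * (index + 1)  # arbitrary VLAN numbering for the sketch
    vlan_to_subnet = {
        first_vlan + offset: subnet
        for offset, subnet in enumerate(block.subnets(new_prefix=24))
    }
    plan[name] = {"summary": block, "vlans": vlan_to_subnet}

for name, entry in plan.items():
    print(name, entry["summary"], len(entry["vlans"]), "subnets")
```

Each switch block can then be advertised toward the core as its single /21 summary.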

Interconnection technologies:

Fast Ethernet (100-Mbps Ethernet) – This LAN specification (IEEE 802.3u) operates at 100 Mbps over twisted-pair cable. The Fast Ethernet standard raises the speed of Ethernet from 10 Mbps to 100 Mbps with only minimal changes to the existing cable structure. A switch with ports functioning at both 10 and 100 Mbps can move frames between ports without Layer 2 protocol translation.

Gigabit Ethernet – An extension of the IEEE 802.3 Ethernet standard, Gigabit Ethernet increases speed tenfold over Fast Ethernet, to 1000 Mbps, or 1 gigabit per second (Gbps). IEEE 802.3z specifies operations over fiber optics and IEEE 802.3ab specifies operations over twisted-pair cable.

10 Gigabit Ethernet – 10 Gigabit Ethernet was formally ratified as an IEEE 802.3 Ethernet standard in June 2002. This technology is the next step for scaling the performance and functionality of the enterprise. With the deployment of Gigabit Ethernet becoming more common, 10 Gigabit Ethernet will become the norm for uplinks.

EtherChannel – This feature provides link aggregation to aggregate bandwidth over Layer 2 links between two switches. EtherChannel bundles individual Ethernet ports into a single logical port or link, providing aggregate bandwidth of 1600 Mbps (eight 100-Mbps links, full duplex) or 16 Gbps (eight Gigabit links, full duplex) between two Catalyst switches. All interfaces in each EtherChannel bundle must be configured with similar speed, duplex, and VLAN memberships.
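The aggregate figures quoted for EtherChannel follow from simple arithmetic: the number of links, times the link speed, doubled for full duplex. The helper below is a hypothetical function written only to check those figures.

```python
def etherchannel_bandwidth_mbps(links, link_speed_mbps, full_duplex=True):
    """Aggregate bandwidth of a bundle, counting both directions if full duplex."""
    per_link = link_speed_mbps * (2 if full_duplex else 1)
    return links * per_link


fast_ethernet_bundle = etherchannel_bandwidth_mbps(8, 100)
gigabit_bundle = etherchannel_bandwidth_mbps(8, 1000)
print(fast_ethernet_bundle)  # 1600
print(gigabit_bundle)        # 16000
```

Note that this is the nominal capacity of the bundle as a whole; how traffic is distributed across the member links is not modeled here.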

Cisco suggests oversubscription ratios that can be used to plan bandwidth requirements between key devices on a network with average traffic flows.

Access to distribution layer links – The oversubscription ratio can be 20:1. That is, the link can be 1/20 of the total bandwidth available cumulatively to all end devices using that access to distribution layer link.

Distribution to Core links – The ratio should be 4:1.

Between Core Devices – There should be little to no oversubscription planning. That is, the links between core devices should be able to carry traffic at the speed represented by the aggregate bandwidth of all the Distribution uplinks into the core.

CAUTION:

These ratios are appropriate for estimating average traffic from access layer, end-user devices. They are not accurate for planning oversubscription from the Server Farm or Edge Distribution modules. They are also not accurate for planning bandwidth needed on access switches hosting atypical user applications with high bandwidth consumption (for example, non-client-server databases or multimedia flows to unicast addresses). Applying QoS end to end prioritizes traffic and determines which traffic is dropped in the event of congestion.
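As a rough worked example of the ratios above, the sketch below estimates the uplink bandwidth implied by a given oversubscription ratio. The port counts and speeds are assumptions chosen for illustration, and the caution above still applies: this is an estimate for average end-user traffic only.

```python
def min_uplink_mbps(port_count, port_speed_mbps, ratio):
    """Uplink bandwidth implied by an N:1 oversubscription ratio."""
    return port_count * port_speed_mbps / ratio


# Hypothetical access switch: 48 ports at 100 Mbps, planned at 20:1.
access_uplink = min_uplink_mbps(48, 100, 20)

# Hypothetical distribution switch: four 1000-Mbps access uplinks, planned at 4:1.
distribution_uplink = min_uplink_mbps(4, 1000, 4)

print(access_uplink)        # 240.0
print(distribution_uplink)  # 1000.0
```

In practice the result would be rounded up to the next available link speed or bundle size.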

When mapping VLANs onto the new hierarchical network design, keep these parameters in mind:

1. Examine the subnetting scheme that has been applied to the network and associate a VLAN to each subnet.

2. Configure routing between VLANs at the distribution layer. Routing always occurs at the distribution layer switch.

3. Make end-user VLANs and subnets local to a specific switch block.

4. Ideally, limit a VLAN to one access switch or switch stack. However, it may be necessary to span a VLAN across multiple access switches within a switch block to support, for example, wireless mobility.

Considering traffic source to destination path

The network design and the types of applications supported will determine where certain traffic sources are located. In the case of multicast and IP telephony applications, they do share some common traffic types. Specifically, if a Cisco CallManager is providing music on hold, it may need to multicast that traffic stream. Similarly, if there is an IPTV broadcast server on the network it will also be sending information via multicast to a specific set of devices. Use these considerations when determining where to place the servers:

1. IP multicast servers may exist within a server farm, or be distributed through the network at appropriately designed locations. Select distribution layer switches to act as rendezvous points and which are central to the location of the largest distribution of receiving nodes.

2. Cisco CallManager servers must be accessible throughout the network at all times. Ensure redundant NICs in the publisher and subscriber servers and redundant connections between those NICs and the upstream switch from the server. It is recommended that voice traffic be configured on its own VLAN.

3. VLAN trunks must be configured appropriately to carry IP telephony traffic throughout the network, or to specific destinations.

Configuration Interface Available on Various Catalyst Platforms

Some widely used Catalyst switch platforms that support the IOS interface are 2950, 3500, 3700, 4500*, 6500*, 8500. (*These platforms have an option to use the IOS or Catalyst software for Layer 2 configuration.)

The Catalyst software interface exists on several modular Catalyst platforms including: 4000, 4500, 5500, 6000 and 6500.

For example on the Catalyst 6500 you have the option of using the Catalyst software, Catalyst software + IOS or IOS software functionality.

2. VLANs

End-to-end VLANs:

The term end-to-end VLAN refers to a single VLAN associated with switch ports that are widely dispersed throughout an enterprise network. Traffic for this VLAN is carried throughout the switched network. If many VLANs in a network are end-to-end, special links are required between switches to carry traffic from multiple VLANs.

An end-to-end VLAN has these characteristics:

* The VLAN is geographically dispersed throughout the network.

* Users are grouped into the VLAN regardless of physical location.

* As a user moves throughout a campus, VLAN membership of that user remains the same.

* Users are typically associated with a given VLAN for network management reasons.

* All devices on a given VLAN typically have addresses on the same IP subnet.

Because a VLAN represents a Layer 3 segment, end-to-end VLANs allow a single Layer 3 segment to be geographically dispersed throughout the network. Reasons for implementing this design might include:

Grouping Users – Users can be grouped on a common IP segment even though they are geographically dispersed.

Security – A VLAN may contain resources that should not be accessible to all users on the network, or there may be a reason to confine certain traffic to a particular VLAN.

Applying QoS – Traffic from a given VLAN can be given higher or lower access priority to network resources.

Routing Avoidance – If much of the VLAN user traffic is destined for devices on that same VLAN and routing to those devices is not desirable, users can access resources on their VLAN without their traffic being routed off the VLAN even though the traffic may traverse multiple switches.

Special Purpose VLAN – Sometimes a VLAN is provisioned to carry a single type of traffic that must be dispersed throughout the campus (for example, Multicast, Voice or Visitor VLANs).

Poor Design – For no clear purpose, users are placed in VLANs that span the campus or even WAN networks.

There are some items that should be considered when implementing end-to-end VLANs. Switch ports are provisioned for each user and associated with a given VLAN. Because users on an end-to-end VLAN may be anywhere in the network, all switches must be aware of that VLAN. This means that all switches carrying traffic for end-to-end VLANs are required to have identical VLAN databases. Also, flooded traffic for the VLAN is, by default, passed to every switch even if it does not currently have any active ports in the particular end-to-end VLAN. Finally, troubleshooting devices on a campus with end-to-end VLANs can be challenging, as the traffic for a single VLAN can traverse multiple switches in a large area of the campus.

Local VLANs

In the past, network designers attempted to implement the 80/20 rule when designing networks. The rule was based on the observation that, in general, 80 percent of the traffic on a network segment went between local devices, and only 20 percent of the traffic was destined for remote network segments. Network designers now consolidate servers in central locations on the network, and provide access to external resources such as the Internet through one or two paths on the network, as the bulk of traffic now traverses a number of network segments. Therefore, the paradigm has shifted toward a 20/80 model, in which the greater flow of traffic leaves the local segment.

Additionally, the concept of end-to-end VLANs was very attractive when IP address configuration was a manually administered and burdensome process; therefore, anything that reduced this burden as users moved between networks was a good thing. But given the ubiquity of DHCP, the process of configuring IP at each desktop is no longer a significant issue. As a result there are few benefits to extending a VLAN throughout an enterprise. It is often more efficient to group all users on a set of geographically common switches into a single VLAN regardless of the organizational function of those users, especially from a troubleshooting perspective. VLANs that have boundaries based upon campus geography rather than organizational function are called "local VLANs."

Here are some local VLAN characteristics and usage guidelines:

Local VLANs should be created with physical boundaries in mind, rather than job functions of the users on the end devices.

Traffic from a local VLAN is routed to reach destinations on other networks.

A single VLAN does not extend beyond the Building Distribution submodule.

VLANs on a given access switch should not be advertised to all other switches in the network.

Global configuration mode can be used to configure VLANs in the range 1-1005 and must be used to configure extended-range VLANs (VLAN IDs 1006 to 4094). The VTP configuration revision number is incremented each time a VLAN is created or changed.
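The two VLAN ID ranges mentioned above can be captured in a small validation helper. The function name `vlan_range` and the labels "normal" and "extended" are assumptions for this sketch, not names taken from any switch software.

```python
def vlan_range(vlan_id):
    """Classify a VLAN ID using the ranges given in the text."""
    if 1 <= vlan_id <= 1005:
        return "normal"
    if 1006 <= vlan_id <= 4094:
        return "extended"
    raise ValueError(f"VLAN ID {vlan_id} is outside the range 1-4094")


print(vlan_range(10))    # normal
print(vlan_range(2000))  # extended
```

A provisioning script could use a check like this to reject invalid IDs before any configuration is attempted.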

Attributes and characteristics of access ports are:

* An access port is associated with a single VLAN.

* The VLAN to which the access port is assigned must exist in the VLAN database of the switch or the port will be associated with an inactive VLAN that does NOT forward frames.

* Because an access switch port is part of a VLAN or Layer 2 domain, that port will receive broadcasts, multicasts, unicast floods, and so forth that are sent to all ports in the VLAN.

* The end device will typically have an IP address that is common to all other devices on the access VLAN.

Implementing the Enterprise Composite Model using local VLANs provides the following benefits:

Deterministic traffic flow – The simple layout provides a predictable Layer 2 and 3 traffic path. In the event of a failure, which was not mitigated by the redundancy features, the simplicity of the model facilitates expedient problem isolation and resolution within the switch block.

Finite failure domain – If VLANs are local to a switch block and the number of devices on each VLAN is kept small, failures at Layer 2 are isolated to a small subset of users.

High availability – Redundant paths exist at all infrastructure levels. Local VLAN traffic on Access switches can be passed to the Building Distribution switches across an alternate Layer 2 path in the event of primary path failure. Redundant Layer 3 protocols can provide failover should the default gateway for the access VLAN fail. When both the STP instance and VLAN are contained to a specific Access and Distribution block, then Layer 2 and Layer 3 redundancy measures and protocols can be configured to failover in a coordinated manner.

Ease of Management – Local VLANs, typically confined to the Building Access submodule, are easier to plan and manage than VLANs spanning various switches and network areas. Also, local VLANs, when used in combination with dynamically assigned IP addresses, allow workstations to move from one VLAN to another with limited administrative overhead.

A trunk link may exist between these devices:

* Two switches

* A switch and a router

* A switch and a trunk-capable NIC in a node such as a server

Default Trunk Configuration

The default trunk configuration on Ethernet ports is as follows:

Feature – Default Setting

Trunking state – Disabled

Encapsulation – Negotiate (if both ports are set to negotiate mode, the trunk uses ISL encapsulation)

Trunk mode – Auto

Allowed VLANs – All VLANs are allowed for trunking

VLAN IDs are only associated with frames traversing a trunk link. When a frame enters or exits the switch on an access link, no VLAN ID is present. The ASIC on the switch port assigns the VLAN ID to a frame as it is placed on a trunk link and also strips off the VLAN ID if the frame exits an access switch port.

Trunk links should be managed so that they carry only traffic for intended VLANs. This practice keeps unwanted VLAN data traffic from traversing links unnecessarily. Trunk links are used between the Access and Distribution layers of the Campus Switch Block.
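The behavior described above (a VLAN ID is carried only on trunk links, is stripped on access ports, and a trunk should carry only intended VLANs) can be sketched with a toy port model. The `Port` class and `egress` function are hypothetical names for this illustration, and native VLAN handling is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class Port:
    mode: str                 # "access" or "trunk"
    access_vlan: int = 1      # used only in access mode
    # Default mirrors the text: all VLANs are allowed on a trunk.
    allowed_vlans: Set[int] = field(default_factory=lambda: set(range(1, 4095)))


def egress(port: Port, vlan_id: int, payload: bytes) -> Optional[Tuple[Optional[int], bytes]]:
    """Return (tag, payload) as sent on the wire, or None if not forwarded."""
    if port.mode == "access":
        if vlan_id != port.access_vlan:
            return None               # frame belongs to a different VLAN
        return (None, payload)        # VLAN ID is stripped on an access port
    if vlan_id not in port.allowed_vlans:
        return None                   # trunk is pruned to intended VLANs only
    return (vlan_id, payload)         # VLAN ID travels with the frame


uplink = Port(mode="trunk", allowed_vlans={10, 20})
desk = Port(mode="access", access_vlan=10)

print(egress(uplink, 10, b"data"))  # (10, b'data')
print(egress(uplink, 30, b"data"))  # None
print(egress(desk, 10, b"data"))    # (None, b'data')
```

Restricting `allowed_vlans` on the uplink is the model's equivalent of managing a trunk so that unwanted VLAN traffic does not traverse it.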

802.1Q is not proprietary and can be deployed in any Ethernet, standards-based Layer 2 device. It is specific to a single Layer 2 protocol (Ethernet) because it modifies the Layer 2 frame by inserting a tag between two specific fields of the frame and therefore must be aware of the frame header details.

ISL is Layer 2 protocol independent. Because the original Layer 2 frame is fully encapsulated and not altered, ISL can transport data frames from various Layer 2 media types.

The following are some benefits of the ISL protocol:

* It supports multiple Layer 2 protocols (Ethernet, Token Ring, FDDI and ATM).

* It supports Per VLAN Spanning Tree Protocol.

* The encapsulation process leaves original frames unmodified, making it less prone to error and more secure.

* It has a large installation base.

The ISL Encapsulation Process

When a switch port is configured as an ISL trunk port, the entire original Layer 2 frame, including header and FCS trailer, will be encapsulated before it traverses the trunk link. Encapsulation is the process of placing an additional header in the front and a trailer at the end of the original Layer 2 frame. The ISL header will contain the VLAN ID of the VLAN where the frame originated. At the receiving end, the VLAN ID is read, the header and trailer are removed and the original frame is forwarded as any regular Layer 2 frame on that VLAN.

Only ISL trunk ports can properly receive ISL encapsulated frames. A non-ISL port receiving an ISL frame may consider the frame size to be invalid or not recognize the fields in the header. The frame will likely be dropped and counted as a transmission error when received by a non-ISL port.

ISL Header

The ISL header contains various fields with values that define attributes of the original Layer 2 data within the encapsulated frame. This information is used for forwarding, media identification, and VLAN identification. The population of the fields within the ISL header varies based on the type of VLAN and the media of the link. The ASIC on an Ethernet port encapsulates the frames with a 26-byte ISL header and a 4-byte FCS. This 30-byte ISL encapsulation overhead is consistent among the Layer 2 protocols supported on Catalyst switches, but the overall size of the frame will vary and be limited by the MTU of the original Layer 2 protocol.

The ISL Ethernet frame header contains these information fields:

DA (destination address) – 40-bit destination address. This is a multicast address and is set at 0x01-00-0C-00-00 or 0x03-00-0c-00-00. The first 40 bits of the DA field signal the receiver that the packet is in ISL format.

TYPE – Four-bit descriptor of the encapsulated frame types: Ethernet (0000), Token Ring (0001), FDDI (0010), and ATM (0011).

USER – Four-bit descriptor used as the TYPE field extension or to define Ethernet priorities; it is a binary value from 0, the lowest priority, to 3, the highest priority. The default USER field value is "0000." For Ethernet frames, the USER field bits "0" and "1" indicate the priority of the packet as it passes through the switch.

SA (source address) – 48-bit source MAC address of the transmitting Catalyst switch port.

LEN (length) – 16-bit frame-length descriptor minus DA, TYPE, USER, SA, LEN, and CRC.

AAAA03 – Standard Subnetwork Access Protocol (SNAP) 802.2 logical link control (LLC) header.

HSA (high bits of source address) – First three bytes of the SA (manufacturer or unique organizational ID).

VLAN ID – 15-bit VID. Only the lower 10 bits are used for 1024 VLANs.

BPDU (bridge protocol data unit) – One-bit descriptor identifying whether the frame is a spanning tree BPDU. It also identifies if the encapsulated frame is a CDP or VTP frame and indicates if the frame should be sent to the control plane of the switch.

INDX (index) – Indicates the port index of the source of the packet as it exits the switch. It is used for diagnostic purposes only and may be set to any value by other devices. It is a 16-bit value and is ignored in received packets.

ENCAP FRAME – Encapsulated data packet, including its own CRC value, completely unmodified. The internal frame must have a CRC value that is valid when the ISL encapsulation fields are removed. A receiving switch may strip off the ISL encapsulation fields and use this ENCAP FRAME field as the frame is received (associating the appropriate VLAN and other values with the received frame as indicated for switching purposes).

ISL Trailer

The trailer portion of the ISL encapsulation is an FCS, which carries a CRC value calculated on the original frame plus the ISL header as the ISL frame was placed onto the trunk link. The receiving ISL port recalculates this value. If the CRC values do not match, the frame is discarded. If the values match, the switch discards the FCS as a part of removing the ISL encapsulation so that the original frame can be processed. The ISL Trailer consists of these fields:

FCS (frame check sequence) – Consists of four bytes. This sequence contains a 32-bit CRC value, which is created by the sending MAC and is recalculated by the receiving MAC to check for damaged frames. The FCS is generated over the DA, SA, LEN, TYPE, and Data fields. When an ISL header is attached, a new FCS is calculated for the entire ISL packet and added to the end of the frame.

RES (reserved) – 16-bit reserved field used for additional information, such as the FDDI frame control field.

Some additional benefits of the 802.1Q protocol are:

Support for Ethernet and Token Ring

Support for 4096 VLANs

Support for Common Spanning Tree (CST), Multiple Spanning Tree (MST) and Rapid Spanning Tree (RST)

Point-to-multipoint topology support

Support for untagged traffic over the trunk link via Native VLAN

Extended quality of service (QoS) support

Growing standard for IP Telephony links

The 802.1Q Tagging Process

To identify a frame with a given VLAN, the 802.1Q protocol adds a tag, or a field, to the standard Layer 2 Ethernet data frame. Because inserting the tag alters the original frame, the switch must recalculate and alter the CRC value for the original frame before sending it out the 802.1Q trunk port. In comparison, ISL does not modify the original frame at all.

The new 802.1Q Tag field has the following components:

PRI – 3 bits; carries priority information for the frame.

Token Ring Field – Indicates the canonical interpretation of the frame if it is passed from Ethernet to Token Ring.

VLAN ID – VLAN association of the frame. By default, all normal and extended range VLANs are supported.
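As a rough illustration of how these three components fit into the tag, the sketch below packs and unpacks a 4-byte tag. The field widths other than the 3-bit PRI (a 1-bit canonical/Token Ring flag and a 12-bit VLAN ID) and the 0x8100 tag protocol identifier come from the IEEE 802.1Q standard rather than from the text above, and the function names are hypothetical.

```python
import struct
from typing import Tuple

# Assumption: 0x8100 is the tag protocol identifier defined by IEEE 802.1Q.
TPID_8021Q = 0x8100


def build_dot1q_tag(priority: int, cfi: int, vlan_id: int) -> bytes:
    """Pack PRI (3 bits), the Token Ring/canonical flag (1 bit) and VLAN ID (12 bits)."""
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    if cfi not in (0, 1):
        raise ValueError("canonical flag must be 0 or 1")
    if not 0 <= vlan_id <= 0x0FFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (priority << 13) | (cfi << 12) | vlan_id
    # Network byte order: 2-byte TPID followed by 2-byte tag control information.
    return struct.pack("!HH", TPID_8021Q, tci)


def parse_dot1q_tag(tag: bytes) -> Tuple[int, int, int]:
    """Return (priority, canonical flag, VLAN ID) from a 4-byte 802.1Q tag."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return tci >> 13, (tci >> 12) & 0x1, tci & 0x0FFF
```

For example, build_dot1q_tag(5, 0, 100) describes a frame in VLAN 100 marked with priority 5, and parse_dot1q_tag recovers the same three values from the resulting bytes.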

If a non-802.1Q-enabled device or an access port receives an 802.1Q frame, the tag data is ignored, and the packet is switched at Layer 2 as a standard Ethernet frame. This allows for the placement of Layer 2 intermediate devices, such as other switches or bridges, along the 802.1Q trunk path. To process an 802.1Q tagged frame, a device must allow an MTU of 1522 or higher.

NOTE:

An Ethernet frame that is larger than the expected MTU (1518 bytes by default for Ethernet) but no larger than 1600 bytes will register as a Layer 2 error frame called a "baby giant." For ISL, the original frame plus ISL encapsulation can be as large as 1548 bytes; an 802.1Q tagged frame can be as large as 1522 bytes.
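The frame sizes in this note can be checked with a little arithmetic. The sketch below assumes a 26-byte ISL header plus a 4-byte ISL FCS and a 4-byte 802.1Q tag; those overhead figures are not stated in this section and are included here as assumptions that reproduce the 1548 and 1522 byte totals. The constant and function names are illustrative only.

```python
MAX_ETHERNET_FRAME = 1518   # default maximum Ethernet frame size, from the note
BABY_GIANT_LIMIT = 1600     # upper bound for a "baby giant", from the note
ISL_HEADER_BYTES = 26       # assumption: ISL header length
ISL_FCS_BYTES = 4           # assumption: ISL trailer (FCS) length
DOT1Q_TAG_BYTES = 4         # assumption: 802.1Q tag length


def max_isl_frame() -> int:
    """Largest ISL frame: original frame plus ISL header and trailer."""
    return MAX_ETHERNET_FRAME + ISL_HEADER_BYTES + ISL_FCS_BYTES


def max_dot1q_frame() -> int:
    """Largest 802.1Q frame: original frame plus the inserted tag."""
    return MAX_ETHERNET_FRAME + DOT1Q_TAG_BYTES


def is_baby_giant(frame_bytes: int) -> bool:
    """True when a frame is over the default maximum but within 1600 bytes."""
    return MAX_ETHERNET_FRAME < frame_bytes <= BABY_GIANT_LIMIT
```

Under these assumptions, both a maximum-size ISL frame and a maximum-size 802.1Q frame fall in the baby giant range on a device that still expects 1518 bytes.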

When configuring an 802.1Q trunk, a matching, native VLAN must be defined on each end of the trunk link. A trunk link is inherently associated with tagging each frame with a VLAN ID. The purpose of the native VLAN is to allow frames not tagged with a VLAN ID to traverse the trunk link. An 802.1Q Native VLAN is defined as one of the following:

The VLAN that a port is associated with when not in trunking operational mode

The VLAN that is associated with untagged frames that are received on a switch port

The VLAN to which Layer 2 frames will be forwarded if received untagged on an 802.1Q trunk port

Compare this to ISL, where no frame may be transported on the trunk link without encapsulation and any frames received on a trunk port that are un-encapsulated are immediately dropped.

Each physical port has a parameter called a Port VLAN identifier (PVID). Every 802.1Q port is assigned a PVID value equal to the native VLAN ID (VID). When a port receives a tagged frame that is to traverse the trunk link, the tag is respected. For all untagged frames the PVID is considered the tag. This allows the frames to traverse devices that may be unable to read VLAN tag information.
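The PVID rule can be sketched in a few lines. The function names below are hypothetical, and the egress rule (frames in the native VLAN leave the trunk untagged) is an assumption based on common 802.1Q behavior rather than something stated above.

```python
from typing import Optional


def classify_ingress(frame_vid: Optional[int], pvid: int) -> int:
    """Return the VLAN for a frame arriving on an 802.1Q port.

    frame_vid is None for an untagged frame; the PVID then acts as the tag.
    """
    return pvid if frame_vid is None else frame_vid


def egress_tag(vlan: int, native_vlan: int) -> Optional[int]:
    """Return the VID to tag with on egress, or None to send untagged.

    Assumption: traffic in the native VLAN crosses the trunk without a tag.
    """
    return None if vlan == native_vlan else vlan
```

With a PVID of 10, an untagged frame is treated as VLAN 10, while a frame already tagged with VLAN 110 keeps that association.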

Native VLANs have the following attributes:

A trunk port will support only one native, active VLAN per operational mode. The modes are Access and Trunk.

By default on Catalyst switches, all switch ports and native VLANs for 802.1Q are assigned to VLAN1.

The 802.1Q trunk ports connected to each other via physical or logical segments must all have the same native VLAN configured to operate correctly.

If the native VLAN is misconfigured for trunk ports on the same trunk link, Layer 2 loops can occur due to diverting STP BPDUs from their correct VLAN.

Example: Native VLAN Implementation; Two End Devices on the Same Switch Port

A standard place where the Native VLAN of 802.1Q might be used is when a single switch port supports traffic to an IP Phone that then provides a connection to a PC. The port must be configured as 802.1Q so that the Layer 2 header allows the QoS marking to populate the priority (PRI) bits for the telephony traffic. A standard Ethernet packet provides no field for this marking.

The traffic arriving on the switch port from the IP phone will be tagged with VLAN information. The PC traffic arriving on the same switch port will not be tagged. The VLAN ID for the telephony traffic arriving on the 802.1Q trunk port will be respected. The PC traffic arriving with no tag will traverse the Native VLAN.

About Issues with 802.1Q Native VLANs

The following issues need to be considered when configuring the native VLAN on an 802.1Q trunk link:

The native VLAN interface configurations must match at both ends of the link or the trunk may not form.

By default, the native VLAN will be VLAN1. For the purpose of security, the native VLAN on a trunk should be set to a specific VLAN ID that is not used for normal operations elsewhere on the network.

If there is a native VLAN mismatch on an 802.1Q link, CDP, if used and functioning, will issue a "VLAN mismatch" error.

On select versions of Cisco IOS software, CDP may not be transmitted or will be automatically turned off if VLAN1 is disabled on the trunk.

If there is a native VLAN mismatch on either side of an 802.1Q link, Layer 2 loops may occur.

When troubleshooting VLANs, note that a link can have one native VLAN association when in access mode, and another native VLAN association when in trunk mode.

Each VLAN on the network must have a unique VID. The valid range of user-configurable ISL VLANs is 1 to 1024. The valid range of VLANs specified in the IEEE 802.1Q standard is 0 to 4094. This table describes VLAN ranges and their usage.

In a network environment with non-Cisco devices connected to Cisco switches through 802.1Q trunks, you must map 802.1Q VLAN numbers greater than 1000 to ISL VLAN numbers on the Cisco switches. 802.1Q VLANs in the range 1 to 1001 are automatically mapped by VTP to the corresponding ISL VLAN. 802.1Q VLAN numbers greater than 1006 must be mapped to an ISL VLAN to be recognized and forwarded by VTP. Alternatively, configure VTP in transparent mode, allow the extended system ID, and manage VLANs manually.

As a best practice, assign extended VLANs beginning with 4094 and work downward, because some switches reserve extended-range VLAN IDs for internal use starting at the low end of the extended range.

DTP Modes:

It is best practice to shut down an interface while configuring trunking attributes so that premature autonegotiation cannot occur.

Careful design and consideration should be taken when implementing VLAN trunks because they can add to overall network congestion and can also present security challenges. These are general best practices for trunk implementation in the Campus Infrastructure module:

VLAN1 should be removed from the trunks to ensure that no user data propagates among the switches on VLAN1. While each Catalyst switch requires VLAN1 on the actual switch and it is not possible to remove, it is possible to remove VLAN1 from trunk links.

Limit the trunk link to only the intended VLANs required for Layer 2 access and connectivity. This improves bandwidth utilization by restricting unwanted VLAN traffic from the link. Explicitly permitting or denying VLANs to a specific trunk link creates a simple, deterministic Layer 2 switched domain with fewer variables to complicate troubleshooting. This also facilitates correct operation of VLAN interfaces.

DTP should not be required. Trunk links, encapsulation types, and access ports should be statically configured across specific links according to the network design and requirements.

Cisco is now migrating to 802.1Q as the recommended trunking protocol because of the interoperability and compatibility between the Layer 2 and Layer 3 prioritization methods. The IEEE 802.1Q/p standard provides architectural advantages over ISL; these include widely accepted QoS classification and marking standards and the ability to carry frames that are not tagged with a VID.

If a problem exists with a trunk link, or if a trunk link cannot be established, check the following:

Verify that the interface mode configured on both ends of the link is identical or valid for negotiated links. The interface mode should be trunk, dynamic, or nonegotiate.

Verify that the trunk encapsulation type configured on both ends of the link is valid and compatible.

For 802.1Q links, verify that the native VLAN is the same on both ends of the trunk.

NOTE:

If the trunk appears to be configured correctly on both ends, or if the configuration has changed and no trunk forms, shut down the interfaces on both ends of the trunk, check for a match in the configuration, and then bring the interfaces up to see if the trunk forms correctly.

Only "global" VLAN information regarding VLAN number, name and description is exchanged. Information on how ports are assigned to VLANs on a given switch is kept local to the switch and is not part of a VTP advertisement.

These are the attributes of a VTP Domain:

A switch may be in only one VTP domain.

A VTP domain may be as small as only one switch.

VTP updates will be exchanged only with other switches in the same domain.

The way VLAN information is exchanged between switches in the same domain depends upon the VTP mode of the switch.

By default, a Catalyst switch is in the no-management-domain state until it receives an advertisement for a domain over a trunk link, or until a management domain is configured.

These are the attributes of VTP:

VTP is a Cisco proprietary protocol.

VTP will advertise VLANs 1-1005.

VTP updates are exchanged only across trunk links.

Each switch operates in a given VTP "mode" which determines how VTP updates are sent from and received by that switch.

There are three VTP versions that support different features.

VTP in the Campus Infrastructure Module

There are some benefits to using VTP within the guidelines of the Campus Infrastructure module:

VTP domain is restricted to building switch blocks.

VTP keeps VLAN information consistent between Building Distribution and Building Access switches.

VLAN configuration errors or failures will be contained to a switch block.

However, knowledge of all VLANs does not need to exist on all switches within the Campus Infrastructure module.

Usage of VTP is optional, and in high-availability environments it is best practice to set all switches to ignore VTP updates (for example, by placing them in transparent mode).

Switches within a VTP management domain synchronize their VLAN databases by sending and receiving VTP advertisements over trunk links. VTP advertisements are flooded throughout a management domain by switches running in specific modes of operation. Advertisements are sent every five minutes or whenever there is a change in VLAN configuration. VTP advertisements are transmitted using a Layer 2 multicast frame. VLAN advertisements are not propagated from a switch until a management domain name is specified or learned.

VTP Advertisement Types

There are three types of VTP advertisements exchanged between switches:

Summary Advertisements – an update sent by VTP servers every 300 seconds or when a VLAN database change occurs. Among other things, this advertisement lists the management domain, VTP version, domain name, configuration revision number, timestamp, and number of subset advertisements. When the advertisement results from a VLAN database change, one or more subset advertisements will follow.

Subset Advertisements – an update that follows a summary advertisement that results from a change in the VLAN database. A subset advertisement cites the specific change that was made to a specific VLAN entry in the VLAN database. One subset advertisement will be sent for each VLAN ID that encountered a change.

Advertisement Requests from Clients – an update sent by a switch requesting information in order to update its VLAN database. If a client hears a VTP summary advertisement with a configuration revision number higher than its own, the switch may send an Advertisement Request. A switch operating in VTP server mode then responds with summary and subset advertisements.

NOTE:

VTP advertisements are associated with VLAN database information only, not VLAN information configured on specific switch ports. Likewise, on a receiving switch, the receipt of new VLAN information does not change the VLAN associations of trunk or access ports on that switch.

VTP Modes

VTP Versions

Currently Catalyst switches run VTP versions 1, 2, or 3. Version 2 is the most prevalent, and provides these features:

Forwarding of VTP updates from transparent mode switches without checking the version number

Consistency checks on new VTP and VLAN configuration parameters

Support for Token Ring switches

Propagation of VTP updates that have an unrecognized type, length or value

VTP version 3 is now available on some switches running the CatOS operating system.

The process of VLAN synchronization over VTP follows this general order:

Step 1 Configure the VTP domain, VTP mode, and VTP password (optional) on each switch. This proactively determines which switches will send updates.

Step 2 Switches running VTP server mode then send VTP updates across trunk links.

Step 3 A device that receives a VTP advertisement will check that the VTP management domain name and password in the advertisement match those configured in the local switch.

Step 4 If a match is found, a switch further inspects the VTP update to see the configuration revision number.

Step 5 If the configuration revision number of the message is greater than the number currently in use and the switch is running in VTP server or client mode, the switch overwrites its current VLAN information with that in the received update.

Step 6 The switch may also request more information.
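Steps 3 through 5 amount to a short decision procedure, sketched below. This is a simplified model and not the actual VTP implementation: the class and function names are hypothetical, the password is compared as plain text, and learning a domain name from an update is not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VtpSwitch:
    domain: Optional[str]
    password: Optional[str]
    mode: str                      # "server", "client", or "transparent"
    revision: int = 0
    vlans: Dict[int, str] = field(default_factory=dict)


@dataclass
class VtpAdvertisement:
    domain: str
    password: Optional[str]
    revision: int
    vlans: Dict[int, str]


def process_advertisement(switch: VtpSwitch, adv: VtpAdvertisement) -> bool:
    """Apply an advertisement to a switch; return True if the database changed."""
    # Step 3: domain name and password must match the local configuration.
    if switch.domain != adv.domain or switch.password != adv.password:
        return False
    # Step 5: only server and client mode switches synchronize their database.
    if switch.mode not in ("server", "client"):
        return False
    # Steps 4 and 5: accept only a strictly higher configuration revision.
    if adv.revision <= switch.revision:
        return False
    switch.vlans = dict(adv.vlans)
    switch.revision = adv.revision
    return True
```

In this model a client at revision 3 that receives a matching advertisement at revision 5 replaces its whole VLAN database, while a transparent mode switch leaves its database unchanged.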

Default VTP configuration values depend on the switch model and the software version. The default values for the Catalyst 2900, 4000 and 6000 series switches are as follows:

VTP domain name: None

VTP mode: Server

VTP password: None

VTP trap: Disabled (SNMP traps communicating VTP status)

The VTP domain name can be specified or learned from VTP updates seen from other switches. By default, the domain name is not set.

NOTE:

In the following example, VTP version 2 is available (as shown by the "VTP Version" line of the output), but not enabled (as shown by the "VTP V2 Mode" line of the output).

In the example above, "Configuration last modified by 0.0.0.0" specifies the IP address of the switch that updated the VLAN database of this switch.

Below are some general best practices with regard to configuring VTP in the Enterprise Composite Model:

Plan boundaries for the VTP domain; not all switches in the network need information on all VLANs in the network. In the Enterprise Composite model the VTP domain should be isolated to redundant distribution switches and the access switches they serve.

Have only one or two switches specifically configured as VTP servers and the remainder as clients.

Manually configure VTP on all switches installed in the network so the mode can be specified and the default mode of server on all switches can be overwritten.

Configure a password so that no switch can join the VTP domain with domain name only (which can be derived dynamically).

When setting up a new domain, configure VTP client switches first so they participate passively, then configure servers to update client devices.

In an existing domain, if performing VTP cleanup, configure passwords on servers first. Clients may need to maintain current VLAN information until the server contains a complete VLAN database. Once the VLAN database on the server is verified as complete, client passwords can be configured to be the same as the servers. Clients will then accept updates from the server.

The configuration revision number is the only criterion used when determining if a switch should keep its existing VLAN database or overwrite it with the VTP update sent by another switch in the same domain with the same password. Therefore, when a switch is added to a network it is important that it does not inject spurious information into the domain.

CAUTION:

If a new switch was at one time attached to another network, it is feasible that it contains a vlan.dat file in Flash and that its configuration revision number is higher than that of other VTP servers in the VTP domain to which it is being added. If no VTP domain was explicitly configured on the switch, when connected to the network, the new switch is able to derive the VTP domain name from any VTP update it sees. If there is no password on the domain and if the new switch is in server mode (default), its VLAN information can overwrite the VLAN database on other switches in the VTP domain.

3. STP Implementation

By definition, a transparent bridge has these characteristics:

It must not modify the frames that are forwarded.

It learns addresses by "listening" on a port for the source address of a device. If a source MAC address is read in frames coming in a specific port, the bridge assumes that frames destined for that MAC address can be sent out of that port. The bridge then builds a table that records what source addresses are seen on what port. A bridge is always listening and learning MAC addresses in this manner.

It must forward all broadcasts out all ports, except for the port that initially received the broadcast.

If a destination address is unknown to the bridge, it forwards the frame out all ports except for the port that initially received the frame. This is known as unicast flooding.

Transparent bridging, by definition, must be transparent to the devices on the network. End stations require no configuration. The existence of the bridging protocol operation is not directly visible to them, hence the term transparent bridging.
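The learning and flooding behavior described above can be modeled with a small table keyed by source MAC address. The sketch below is illustrative only; the class name is hypothetical, and the rule that a frame is not sent back out the port it arrived on when the destination is known to be on that same port is an assumption drawn from common bridge behavior.

```python
from typing import Dict, List

BROADCAST = "ff:ff:ff:ff:ff:ff"


class TransparentBridge:
    def __init__(self, ports: List[int]) -> None:
        self.ports = list(ports)
        self.mac_table: Dict[str, int] = {}   # source MAC -> port it was seen on

    def receive(self, in_port: int, src: str, dst: str) -> List[int]:
        """Learn the source address and return the ports to forward out of."""
        # Learning: the source MAC is reachable through the ingress port.
        self.mac_table[src] = in_port
        # Known unicast: forward only out of the learned port.
        if dst != BROADCAST and dst in self.mac_table:
            out_port = self.mac_table[dst]
            return [] if out_port == in_port else [out_port]
        # Broadcast or unknown unicast: flood out all ports except the ingress port.
        return [p for p in self.ports if p != in_port]
```

On a three-port bridge, the first frame from host A to an unknown host B is flooded out of the other two ports; once B replies, later frames from A to B leave only through the port where B was learned.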

BPDU Structure:

STP prevents loops using the following mechanisms:

STP is implemented through the exchange of BPDU messages between adjacent switches.

A single "root bridge" is elected to serve as the reference point from which a loop-free topology is built for all switches exchanging BPDUs.

Each switch determines a "root port" that provides the best path to the root bridge.

On a link between two non-root switch ports, a port on one switch will become a designated port and the port on the other switch will be in a blocking state, not forwarding frames. This effectively breaks any loop. Typically, the designated port will be on the switch with the best path to the root bridge.

Any port state change on any switch (for example, a port going up or down) is considered a network topology change, and the STP algorithm must be run on all switches to adapt to the new topology.

The information provided in a BPDU includes the following:

Root ID – The lowest BID in the topology

Cost of Path – Cost of all links from the transmitting switch to the root bridge

Bridge ID – BID of the transmitting switch

Port ID – Transmitting switch port ID

STP timer values – Max_Age, Hello Time, Forward Delay

The switch compares the BPDUs received on all ports to its own values to determine what role the receiving switch and its ports will play in the STP topology.

STP uses the concepts of root bridge, root ports, and designated ports to establish a loop-free path through the network. The first step in creating the loop-free spanning tree is to elect a root bridge. The root bridge is the reference point that all switches use to establish forwarding paths that will avoid loops in the Layer 2 network.

When a topology change occurs as a result of switch link state changes, the root will send messages throughout the tree regarding the topology change. This allows the CAM tables to adjust and to provide for a new path that may be used toward end host devices.

Timer information is also sent by the root bridge to non-root bridges, informing them of the intervals to use as the ports transition through the spanning tree port states.

The root bridge maintains the stability of the forwarding paths between all switches for a single STP instance. A spanning tree instance is when all switches exchanging BPDUs and participating in spanning tree negotiation are associated with a single root. If this is done for all VLANs, it is called a Common Spanning Tree instance. There is also a Per VLAN Spanning Tree (PVST) implementation that provides one instance, and therefore one root bridge, for each VLAN.

Per-VLAN Spanning Tree

One of the things that must be considered with VLANs is the function of the Spanning Tree Protocol (STP). STP is designed to prevent loops in a switch/bridged topology to eliminate the endless propagation of broadcast around the loop. With VLANs, there are multiple broadcast domains to be considered. Because each broadcast domain is like a unique bridged internetwork, you must consider how STP will operate.

The 802.1Q standard defines one unique Spanning Tree instance to be used by all VLANs in the network. STP runs on the Native VLAN so that it can communicate with both 802.1Q and non-802.1Q compatible switches. This single instance of STP is often referred to as 802.1Q Mono Spanning Tree or Common Spanning Tree (CST). A single spanning tree lacks flexibility in how the links are used in the network topology. Cisco implements a protocol known as Per-VLAN Spanning Tree Plus (PVST+) that is compatible with 802.1Q CST but allows a separate spanning tree to be constructed for each VLAN. There is only one active path for each spanning tree; however, in a Cisco network, the active path can be different for each VLAN.

If your network currently uses PVST+ and you plan to use MISTP on any switch, you must first enable MISTP-PVST+ on the switch and configure an MISTP instance to avoid causing loops in the network.

PVST+ Mode

PVST+ is the default spanning tree protocol used on all Ethernet, Fast Ethernet, and Gigabit Ethernet port-based VLANs on Catalyst 6000 family switches. PVST+ runs on each VLAN on the switch, ensuring that each VLAN has a loop-free path through the network.

PVST+ provides Layer 2 load balancing for the VLAN on which it runs; you can create different logical topologies using the VLANs on your network to ensure that all of your links will be used but no one link will be oversubscribed.

Each instance of PVST+ on a VLAN has a single root switch. This root switch propagates the spanning tree information associated with that VLAN to all other switches in the network. This process ensures that the network topology is maintained because each switch has the same knowledge about the network.

The IEEE 802.1D standard requires that each switch have a unique bridge identifier (bridge ID), which determines the selection of the root switch. Because each VLAN is considered a different logical bridge with PVST+, the same switch must have as many different bridge IDs as VLANs configured on it. Each VLAN on the switch has a unique 8-byte bridge ID.

MISTP Mode

MISTP is an optional spanning tree protocol that runs on Catalyst 6000 family switches. MISTP allows you to group multiple VLANs under a single instance of spanning tree (an MISTP instance). MISTP combines the Layer 2 load-balancing benefits of PVST+ with the lower CPU load of IEEE 802.1Q.

An MISTP instance is a virtual logical topology defined by a set of bridge and port parameters; an MISTP instance becomes a real topology when VLANs are mapped to it. Each MISTP instance has its own root switch and a different set of forwarding links, that is, different bridge and port parameters.

Each instance of MISTP has a single root switch. This root switch propagates the information associated with that instance of MISTP to all other switches in the network. This process ensures that the network topology is maintained because each switch has the same knowledge about the network.

MISTP builds MISTP instances by exchanging MISTP BPDUs with peer entities in the network. There is only one BPDU for each MISTP instance, rather than for each VLAN as in PVST+. Because there are fewer BPDUs in an MISTP network, there is less overhead in the network. MISTP discards any PVST+ BPDUs that it sees.

An MISTP instance can have any number of VLANs mapped to it, but a VLAN can only be mapped to a single MISTP instance. You can easily move a VLAN (or VLANs) in an MISTP topology to another MISTP instance if it has converged. (However, if ports are added at the same time the VLAN is moved, convergence time is required.)

MISTP-PVST+ Mode

MISTP-PVST+ is a transition spanning tree mode that allows you to use the MISTP functionality on Catalyst 6000 family switches while continuing to communicate with the older Catalyst 5000 and 6000 switches in your network that use PVST+. A switch using PVST+ mode and a switch using MISTP mode connected together cannot see the BPDUs of the other switch, a condition that can cause loops in the network. MISTP-PVST+ allows interoperability between PVST+ and pure MISTP because it sees the BPDUs of both modes. If you wish to convert your network to MISTP, you can use MISTP-PVST+ to transition the network from PVST+ to MISTP in order to avoid problems.

MISTP-PVST+ conforms to the limits of PVST+; for example, you can configure only as many VLAN ports on your MISTP-PVST+ switches as you can on your PVST+ switches.

NOTE

The term Mono Spanning Tree is typically not used anymore because the IEEE 802.1s standard has now defined a Multiple Spanning Tree (MST) protocol that uses the same acronym.

Because a trunk link carries traffic for more than one broadcast domain and switches are typically connected together via trunk links, it is possible to define multiple Spanning Tree topologies for a given network. With PVST+, a root bridge and STP topology can be defined for each VLAN. This is accomplished by exchanging BPDUs for each VLAN operating on the switches. By configuring a different root or port cost based on VLANs, switches could utilize all the links to pass traffic without creating a bridge loop. Using PVST+, administrators can use ISL or 802.1Q to maintain redundant links and load balance traffic between parallel links using the Spanning Tree Protocol. Figure 3-15 shows an example of load balancing using PVST+.

PVST Load Balancing

Cisco developed PVST+ to allow running several STP instances, even over an 802.1Q network, by using a tunneling mechanism. PVST+ uses Cisco devices to connect a Mono Spanning Tree zone, typically another vendor's 802.1Q-based network, to a PVST+ zone, typically a Cisco ISL-based network. No specific configuration is needed to achieve this. PVST+ provides support for 802.1Q trunks and the mapping of multiple spanning trees to the single spanning tree of standard 802.1Q switches running Mono Spanning Tree.

The PVST+ architecture distinguishes three types of regions:

A PVST region (PVST switches using ISL only)

A PVST+ region (PVST+ using ISL and/or 802.1Q between Cisco switches)

A Mono Spanning Tree region (Common or Mono Spanning Tree using 802.1Q and exchanging BPDUs on the Native VLAN only, between Cisco and non-Cisco switches using 802.1Q)

Each region consists of a homogeneous type of switch. You can connect a PVST region to a PVST+ region using ISL ports. You can also connect a PVST+ region to a Mono Spanning Tree region using 802.1Q ports.

At the boundary between a PVST region and a PVST+ region, the mapping of Spanning Tree is one-to-one. At the boundary between a Mono Spanning Tree region and a PVST+ region, the Spanning Tree in the Mono Spanning Tree region maps to one PVST in the PVST+ region. The one it maps to is the CST. The CST is the PVST of the Native VLAN (VLAN 1 by default).

On an 802.1Q trunk, BPDUs can be sent or received only by the Native VLAN. Using PVST+, Cisco can send its PVST BPDUs as tagged frames using a Cisco multicast address as the destination. When a non-Cisco switch receives the multicast, it is flooded (but not interpreted as a BPDU, thus maintaining the integrity of CST). Because it is flooded, it will eventually reach Cisco switches on the other side of the CST domain. This allows the PVST frames to be tunneled through the Mono Spanning Tree region. Tunneling means that the BPDUs are flooded through the Mono Spanning Tree region along the single spanning tree present in that region.

PVST+ networks must be in a tree-like structure for proper STP operation.

BPDU Fields Associated with Root Bridge Election

The Bridge ID (BID) and Root ID are each 8-byte fields carried in a BPDU. These values are used to complete the root bridge election process. A switch identifies the root bridge by evaluating the Root ID field in the BPDUs it receives. The unique Bridge ID of the root bridge will be carried in the Root ID field of the BPDUs sent by each switch in the tree.

When a switch first boots and begins sending BPDUs, it has no knowledge of a Root ID so it will populate the Root ID field of outbound BPDUs with its own Bridge ID.

The switch with the lowest numerical BID will assume the role of root bridge for that spanning tree instance. Upon receipt of BPDUs with a lower bridge ID than its own, a switch will place the lowest value seen in all BPDUs into the Root ID field information of its outbound BPDUs.
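The election reduces to finding the numerically lowest BID. The sketch below treats a BID as a 16-bit priority followed by a 48-bit MAC address and compares the combined values; the function names and the sample addresses are hypothetical.

```python
from typing import List, Tuple

BridgeId = Tuple[int, str]   # (priority, MAC address such as "00:1b:54:aa:00:01")


def bid_value(bid: BridgeId) -> int:
    """Combine priority (high 16 bits) and MAC address (low 48 bits) into one integer."""
    priority, mac = bid
    return (priority << 48) | int(mac.replace(":", ""), 16)


def update_root_id(current_root: BridgeId, received_root: BridgeId) -> BridgeId:
    """Keep the lower of the known Root ID and the Root ID seen in a received BPDU."""
    return min(current_root, received_root, key=bid_value)


def elect_root(bridge_ids: List[BridgeId]) -> BridgeId:
    """Each switch starts by claiming itself as root; the lowest BID wins overall."""
    root = bridge_ids[0]
    for bid in bridge_ids[1:]:
        root = update_root_id(root, bid)
    return root
```

If every switch keeps the same priority, the switch with the lowest MAC address wins; lowering the priority on one switch makes it the root regardless of its MAC address.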

The Bridge ID Field in the BPDU

Spanning tree operation requires that each switch have a unique Bridge ID (BID). In the original 802.1D standard, the bridge ID was composed of the Priority Field and the MAC address of the switch, and all VLANs were represented by a Common Spanning Tree. Because PVST requires that a separate instance of spanning tree run for each VLAN, the bridge ID field is required to carry VLAN ID (VID) information. This is accomplished by reusing a portion of the priority field as the Extended System ID to carry a VID.

To accommodate the Extended System ID, the original 802.1D 16-bit Bridge Priority field is split into 2 fields resulting in these components in the Bridge ID:

Bridge Priority – a 4-bit field still used to carry Bridge Priority. Because of the limited bit count, priority is now conveyed in discrete values in increments of 4096 rather than in increments of 1, as it would be with the full 16-bit field available. The default priority, in accordance with IEEE 802.1D, is 32,768, which is the midrange value.

Extended System ID – a 12-bit field carrying, in this case, the VID for PVST.

MAC address – a 6-byte field with the MAC address of a single switch.

By virtue of the MAC address, a bridge ID is always unique. When the Priority and Extended System ID are appended to the switch MAC address, each VLAN on the switch can be represented by a unique bridge identifier or Bridge ID.
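A minimal sketch of how the three components combine into the 8-byte Bridge ID follows. The function name and the sample MAC address are hypothetical; the layout (4 priority bits, then 12 Extended System ID bits, then the 6-byte MAC address) follows the description above.

```python
def make_bridge_id(priority: int, vlan_id: int, mac: str) -> bytes:
    """Build an 8-byte Bridge ID from priority, Extended System ID (VID) and MAC."""
    # Priority occupies the top 4 bits of the 16-bit field, so it moves in steps of 4096.
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 between 0 and 61440")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("Extended System ID must fit in 12 bits")
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    # The low 12 bits of the former priority field carry the VID.
    prefix = priority | vlan_id
    return prefix.to_bytes(2, "big") + mac_bytes
```

With the default priority of 32768, the same switch produces a different Bridge ID for VLAN 10 than for VLAN 20, because only the Extended System ID bits differ.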

If no priority has been configured, every switch will have the same default priority and the election of the root for each VLAN will be based on MAC address. This is a fairly random means of selecting the ideal root bridge and for this reason, it is advisable to assign a lower priority to the switch that should serve as root bridge.

802.1D features:

802.1D port roles:

Root port – This port exists on non-root bridges and is the switch port with the best path to the root bridge. Root ports forward traffic toward the root bridge, and the source MAC addresses of frames received on the root port can populate the MAC table. Only one root port is allowed per bridge.

Designated port – This port exists on root and non-root bridges. For root bridges, all switch ports are designated ports. For non-root bridges, a designated port is the switch port that will receive and will forward frames toward the root bridge as needed. Only one designated port is allowed per segment. If multiple switches exist on the same segment, an election process determines the designated switch, and the corresponding switch port begins forwarding frames for the segment. Designated ports are capable of populating the MAC table.

Nondesignated port – This is a switch port that is blocking: it does not forward data frames and does not populate the MAC address table with the source addresses of frames seen on that segment.

Disabled port – This is a switch port that is shut down.

Comparison between STP and RSTP port roles:

Root bridge election process:

Non-root bridges place various ports into their proper roles by listening to BPDUs as they come in on all ports. Receiving BPDUs on multiple ports indicates a redundant path to the root bridge.

The switch looks at these components in the BPDU to determine which switch ports will forward data and which switch ports will block data:

Lowest path cost

Lowest Sender BID

Lowest Local port ID

The switch looks at the path cost first to determine which port is receiving the lowest cost path. The path is calculated based on the link speed and the number of links the BPDU traversed. If a port has the lowest cost, that port is eligible to be placed in forwarding mode. All other ports that are receiving BPDUs continue in blocking mode.

If the path cost and Sender BID are equal, as with parallel links between two switches, the switch goes to the port ID as a "tiebreaker." The port with the lowest port ID forwards data frames, and all other ports continue to block data frames.
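The tiebreaker order can be sketched in Python as a tuple comparison. The Candidate type, its field names, and the sample port names are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str            # label for the local port (illustrative only)
    root_path_cost: int  # lower is better
    sender_bid: int      # lower is better
    port_id: int         # lower is better


def best_port(candidates):
    """Pick the port to forward on: lowest cost, then sender BID, then port ID."""
    return min(candidates, key=lambda c: (c.root_path_cost, c.sender_bid, c.port_id))


# Parallel links to the same neighbor: cost and sender BID tie, port ID decides.
parallel = [
    Candidate("Fa0/2", 19, 0x8001001122334455, 2),
    Candidate("Fa0/1", 19, 0x8001001122334455, 1),
]
print(best_port(parallel).name)
```

All ports other than the one returned would stay in the blocking state for this comparison.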

Path Cost

The spanning tree path cost is a value advertised in the BPDU by each bridge. The receiving switch uses the path cost value to determine the best path to the root bridge. The lowest cost is considered to be the best path.

Port cost values per link are shown above under the Revised IEEE Spec, with the lower values being associated with higher bandwidth and therefore being the more desirable paths. This new specification uses a nonlinear scale with port cost values as shown. In the previous IEEE specification, the cost value was calculated based on Gigabit Ethernet being the maximum Ethernet bandwidth, with an associated value of 1, from which all other values were derived in a linear manner.

Selecting the Root Port

Switch Y receives a BPDU from the root bridge (switch X) on its switch port on the Fast Ethernet segment and another BPDU on its switch port on the Ethernet segment. The root path cost in both cases is zero. The local path cost on the Fast Ethernet switch port is 19, while the local path cost on the Ethernet switch port is 100. As a result, the switch port on the Fast Ethernet segment has the lowest path cost to the root bridge and is elected the root port for switch Y.
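The arithmetic in this example can be sketched as follows. The function and port names are hypothetical; only the two port costs stated in the text (19 for Fast Ethernet, 100 for Ethernet) are used.

```python
# Revised IEEE port costs for the two link speeds named in the text (Mbps -> cost).
PORT_COST = {10: 100, 100: 19}


def select_root_port(received):
    """received maps port name -> (advertised root path cost, link speed in Mbps)."""
    totals = {
        port: advertised + PORT_COST[speed]
        for port, (advertised, speed) in received.items()
    }
    # The port with the lowest total cost to the root becomes the root port.
    root_port = min(totals, key=totals.get)
    return root_port, totals


# Switch Y hears the root bridge (advertised cost 0) on both segments.
root_port, totals = select_root_port({
    "fast_ethernet": (0, 100),
    "ethernet": (0, 10),
})
print(root_port, totals)
```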

Selecting the Designated Port

STP selects one designated port per segment to forward traffic. Other switch ports on the segment become nondesignated ports and continue blocking. The nondesignated ports receive BPDUs but do not forward data traffic to prevent loops. The switch port on the segment with the lowest path cost to the root bridge is elected the designated port. If multiple switch ports on the same segment have the same cost, the port on the switch with the lowest bridge ID becomes the designated port; if the ports belong to the same switch, the port with the lowest port ID is chosen.

Because ports on the root bridge all have a root path cost of zero, all ports on the root bridge are designated ports.

Each Layer 2 port on a switch running STP exists in one of these five port states:

Blocking – In this state, the Layer 2 port is a nondesignated port and does not participate in frame forwarding. The port receives BPDUs to determine the location and Root ID of the root switch and what port roles (root, designated, or nondesignated) each switch port should assume in the final active STP topology.

Listening – In this state, spanning tree has determined that the port can participate in frame forwarding according to the BPDUs that the switch has received thus far. At this point, the switch port is not only receiving BPDUs, it is also transmitting its own BPDUs and informing adjacent switches that the switch port is preparing to participate in the active topology.

Learning – In this state, the Layer 2 port prepares to participate in frame forwarding and begins to populate the CAM table.

Forwarding – In this state, the Layer 2 port is considered part of the active topology and forwards frames and also sends and receives BPDUs.

Disabled – In this state, the Layer 2 port does not participate in spanning tree and does not forward frames.

Spanning Tree Timers

The amount of time that a port stays in the various port states is dependent upon the BPDU timers. Only the switch in the role of root bridge may send information through the tree to adjust the timers. The following three timers affect STP performance and state changes:

hello time – The hello time is the time between each BPDU that is sent on a port. This is equal to 2 seconds by default, but can be tuned to be between 1 and 10 seconds.

forward delay – The forward delay is the time spent in each of the listening and learning states. This is 15 seconds per state by default, but can be tuned to between 4 and 30 seconds. The total time spent in the listening and learning states is therefore twice the forward delay value.

max_age – The max age timer controls the maximum length of time a switch port saves configuration BPDU information. This is 20 seconds by default, but can be tuned to be between 6 and 40 seconds.

When STP is enabled, every switch port in the network goes through the blocking state and the transitory states of listening and learning at power up. The ports then stabilize to the forwarding or blocking state. During a topology change, a port temporarily implements the listening and learning states for a specified period called the "forward delay interval."

These default values allow adequate time for convergence in a network in which the two most distant switches are up to seven Layer 2 hops apart. This hop count is referred to as the STP diameter, and a maximum of seven is permitted. The STP diameter can be set to a lower value, which automatically adjusts the forward delay and max age timers proportionally for the new diameter.

CAUTION:

Best practice suggests not to individually alter the spanning tree timers but to adjust them indirectly by configuring the diameter to reflect the actual network topology.
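As a rough illustration of how the diameter drives the timers, the sketch below uses the formulas commonly quoted in Cisco's timer-tuning guidance. Treat the formulas, the rounding, and the lower bound of 2 as assumptions rather than as a statement of what any particular switch computes; the function name is hypothetical.

```python
import math


def timers_for_diameter(diameter, hello=2):
    """Estimate max_age and forward_delay (seconds) for a given STP diameter."""
    if not 2 <= diameter <= 7:
        raise ValueError("diameter is expected to be between 2 and 7")
    # Assumed formulas: max_age = 4*hello + 2*dia - 2
    #                   forward_delay = (4*hello + 3*dia) / 2, rounded up
    max_age = 4 * hello + 2 * diameter - 2
    forward_delay = math.ceil((4 * hello + 3 * diameter) / 2)
    return {"hello": hello, "max_age": max_age, "forward_delay": forward_delay}


print(timers_for_diameter(7))  # reproduces the defaults of 20 and 15 seconds
print(timers_for_diameter(3))
```

With a diameter of 7 and a hello time of 2 seconds, the sketch reproduces the default max age of 20 seconds and forward delay of 15 seconds given earlier.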

Each configuration BPDU contains these three parameters. In addition, each configuration BPDU contains another time-related parameter that is known as the message age. The message age is not a fixed value. The message age contains the length of time that has passed since the root bridge initially originated the BPDU. The root bridge sends all its BPDUs with a message age value of 0, and all subsequent switches add 1 to this value. Effectively, this value contains the information on how far you are from the root bridge when you receive a BPDU. This diagram illustrates the concept:

The steps that occur in a topology change are as follows:

Step 1 Switch D notices that a change to a link has occurred.

Step 2 Switch D sends a TCN BPDU out the root port destined ultimately for the root bridge. The switch will send out the TCN BPDU until the designated switch responds with a topology change acknowledgement.

Step 3 Switch B, the designated switch, sends out a topology change acknowledgement to the originating switch D. The designated switch also sends a TCN BPDU out the root port destined for either the designated switch or the root bridge. (This is a propagation TCN.)

Step 4 When the root bridge receives the topology change message, the root bridge changes the Flag portion of outbound BPDUs to indicate that a topology change is occurring. The root bridge sets the topology change flag in its configuration BPDUs for a period of time equal to the sum of the forward delay and max_age parameters, which is 35 seconds with the default timers (15 + 20).

Step 5 A switch receiving the topology change configuration message from the root bridge uses the forward delay timer to age out entries in the MAC address table. This time specification allows the switch to age out MAC address, switch port, and VLAN mapping entries faster than the normal five-minute default. The bridge continues this process until it no longer receives topology change configuration messages from the root bridge.

Step 6 The backup link, if there is one, is enabled and the address table is repopulated.

A backup (or secondary) root bridge is a switch that is preferentially configured to assume the role of the root bridge in the event that the primary root bridge fails.

If no backup root bridge is configured and the root bridge fails, some other switch will be automatically chosen as the root bridge by the STP. However, it is likely that this automatic choice will not be optimal for network performance and stability.

The backup root bridge is configured to have a priority value set lower than the default, but higher than the primary root bridge. In normal operation, when the primary root bridge is functioning, the backup root bridge behaves like any other non-root bridge. When the primary root bridge fails, the backup root bridge then has the lowest priority in the network and so is selected to be the root bridge. The switch with the lowest MAC address will, in a case of a tie, be elected root bridge.
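The failover behavior can be sketched as a simple election over (priority, MAC) pairs. The switch names, the priority values 4096 and 8192, and the MAC values are hypothetical; the only rule taken from the text is that the lowest priority wins and the lowest MAC address breaks ties.

```python
def elect_root(switches):
    """switches maps name -> (priority, mac_as_int); the lowest pair wins."""
    return min(switches, key=lambda name: switches[name])


switches = {
    "primary": (4096, 0x0011223344AA),   # configured lowest priority
    "backup": (8192, 0x0011223344BB),    # lower than default, higher than primary
    "access": (32768, 0x001122334401),   # default priority, lowest MAC
}

before = elect_root(switches)

# Simulate failure of the primary root bridge.
remaining = {name: bid for name, bid in switches.items() if name != "primary"}
after = elect_root(remaining)
print(before, after)
```

Without the backup's lowered priority, the access switch would win the second election purely because of its MAC address.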

Configuration of an appropriate backup root bridge assures optimal network forwarding and stability in the event of the primary root bridge failure.

NOTE:

The root bridge for each instance of spanning tree should be a Building Distribution switch. On a network with a collapsed backbone and Building Distribution layer, one of the backbone switches should be the root bridge.

PVST is fully compatible with the 802.1Q trunking protocol and with ISL. PVST runs the same STA as 802.1D and provides the same functionality to prevent Layer 2 loops. The difference is that PVST is a Cisco proprietary protocol and runs a separate instance of the STA for each VLAN. This means that for every VLAN created, a separate root bridge, a separate set of designated switches, and associated port roles and states are calculated.

Spanning Tree PortFast causes an interface configured as a Layer 2 access port to transition from blocking to forwarding state immediately, bypassing the listening and learning states. You can use PortFast on Layer 2 access ports, which are connected to a single workstation or to a server, to allow those devices to connect to the network immediately, rather than waiting for spanning tree to converge. If an interface configured with PortFast receives a BPDU, then spanning tree can put the port into the blocking state utilizing a feature called BPDU guard.

CAUTION:

Because the purpose of PortFast is to minimize the time that access ports must wait for spanning tree to converge, it should be used only on access ports. If you enable PortFast on a port connecting to another switch, you risk creating a spanning tree loop.

Cisco provides two features to protect Spanning Tree from loops being created on ports where PortFast has been enabled. In a proper configuration, PortFast would be enabled only on ports supporting end devices such as servers and workstations. BPDUs from a switch should therefore never be received on a PortFast interface. BPDU guard and BPDU filtering provide protection in case BPDUs are received on a PortFast interface. Both BPDU guard and BPDU filtering can be configured globally on all PortFast-configured ports or on individual ports.

STP Load Sharing

Load Sharing Using STP Port Priorities

When two ports on the same switch form a loop, the STP port priority setting determines which port is enabled and which port is in a blocking state. You can set the priorities on a parallel trunk port so that the port carries all the traffic for a given VLAN. The trunk port with the higher priority (lower values) for a VLAN is forwarding traffic for that VLAN. The trunk port with the lower priority (higher values) for the same VLAN remains in a blocking state for that VLAN. One trunk port sends or receives all traffic for the VLAN.

The next figure shows two trunks connecting supported switches. In this example, the switches are configured as follows:

VLANs 8 through 10 are assigned a port priority of 10 on Trunk 1.

VLANs 3 through 6 retain the default port priority of 128 on Trunk 1.

VLANs 3 through 6 are assigned a port priority of 10 on Trunk 2.

VLANs 8 through 10 retain the default port priority of 128 on Trunk 2.

In this way, Trunk 1 carries traffic for VLANs 8 through 10, and Trunk 2 carries traffic for VLANs 3 through 6. If the active trunk fails, the trunk with the lower priority takes over and carries the traffic for all of the VLANs. No duplication of traffic occurs over any trunk port.
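The outcome of this configuration can be checked with a short sketch. The names CONFIGURED and forwarding_trunk are hypothetical, and breaking a priority tie by trunk name is an assumption standing in for the port number.

```python
DEFAULT_PORT_PRIORITY = 128

# Per-VLAN port priorities from the example; unlisted VLANs keep the default.
CONFIGURED = {
    "trunk1": {vlan: 10 for vlan in range(8, 11)},
    "trunk2": {vlan: 10 for vlan in range(3, 7)},
}


def forwarding_trunk(vlan, available=("trunk1", "trunk2")):
    """Return the trunk that forwards for a VLAN: the lowest priority value wins."""
    return min(
        available,
        key=lambda trunk: (CONFIGURED[trunk].get(vlan, DEFAULT_PORT_PRIORITY), trunk),
    )


print({vlan: forwarding_trunk(vlan) for vlan in (3, 6, 8, 10)})
# If Trunk 1 fails, Trunk 2 takes over its VLANs as well.
print(forwarding_trunk(8, available=("trunk2",)))
```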

Load Sharing Using STP Path Cost

You can configure parallel trunks to share VLAN traffic by setting different path costs on a trunk and associating the path costs with different sets of VLANs. The VLANs keep the traffic separate. Because no loops exist, STP does not disable the ports, and redundancy is maintained in the event of a lost link.

In the next figure, Trunk ports 1 and 2 are 100BASE-T ports. The path costs for the VLANs are assigned as follows:

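VLANs 2 through 4 are assigned a path cost of 30 on Trunk port 1.

VLANs 8 through 10 retain the default 100BASE-T path cost of 19 on Trunk port 1.

VLANs 8 through 10 are assigned a path cost of 30 on Trunk port 2.

VLANs 2 through 4 retain the default 100BASE-T path cost of 19 on Trunk port 2.

As a result, VLANs 8 through 10 normally use Trunk port 1 and VLANs 2 through 4 normally use Trunk port 2, because each VLAN prefers the trunk with the lower path cost.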

BPDU Guard

BPDU Guard is used to protect the switched network from the problems that may be caused by the receipt of BPDUs on ports which have been identified as ports that should not be receiving them. The receipt of unexpected BPDUs may be accidental or may be part of an unauthorized attempt to add a switch to the network.

BPDU Filtering

PortFast BPDU filtering affects how the switch acknowledges BPDUs seen on PortFast-configured ports. Its functionality differs depending on whether it is configured globally or on a per-port basis.

BPDU Guard

BPDU Guard protects against a switch outside the designated network attempting to become the root bridge. Its access is blocked until the receipt of its BPDUs ceases.

BPDU guard protects the network from loops that might form if BPDUs are received on a PortFast enabled switch port.

NOTE:

When the BPDU guard feature is enabled, spanning tree applies BPDU guard to all PortFast-configured interfaces.

BPDU Guard Applied Globally versus Per-Port

At the global level, you can enable BPDU guard on PortFast-enabled ports by using the spanning-tree portfast bpduguard default global configuration command. In a valid configuration, PortFast-enabled ports do not receive BPDUs. Receiving a BPDU on a PortFast-enabled port signals an invalid configuration, such as the connection of an unauthorized device, and the BPDU guard feature puts the port in the error-disabled state.

At the interface level, you can enable BPDU guard on any port by using the spanning-tree bpduguard enable interface configuration command without also enabling the PortFast feature. When the port receives a BPDU, it is put in the error-disabled state.

BPDU Filtering Applied Globally versus Per-Port

BPDU filtering can be configured globally or on individual PortFast-enabled ports.

When enabled globally, it has these attributes:

It affects all operational PortFast ports on a switch that do not have BPDU filtering configured on the individual port.

If BPDUs are seen, the port loses its PortFast status, BPDU filtering is disabled, and STP sends and receives BPDUs on the port as on any other STP port on the switch.

Upon startup, the port transmits ten BPDUs. If this port receives any BPDUs during that time, PortFast and PortFast BPDU filtering are disabled.

When enabled on an individual port, it has these attributes:

Ignores all BPDUs received

Sends no BPDUs

CAUTION:

Explicit configuration of PortFast BPDU Filtering on a port not connected to a host station can result in bridging loops. The port ignores any incoming BPDU and changes to the forwarding state. This does not occur when PortFast BPDU filtering is enabled globally.
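The different reactions to a BPDU arriving on a PortFast port can be summarized in a small sketch. The Protection enum and on_bpdu_received function are hypothetical names; the sketch models one protection feature at a time and makes no claim about how the features interact when combined.

```python
from enum import Enum


class Protection(Enum):
    NONE = "none"
    BPDU_GUARD = "bpdu guard"
    FILTER_GLOBAL = "bpdu filter (global)"
    FILTER_PORT = "bpdu filter (per port)"


def on_bpdu_received(protection):
    """Describe what a PortFast port does when a BPDU arrives."""
    if protection is Protection.BPDU_GUARD:
        # BPDU guard places the port in the error-disabled state.
        return "err-disabled"
    if protection is Protection.FILTER_GLOBAL:
        # Global filtering backs off: PortFast and filtering are dropped.
        return "portfast and filtering removed, normal STP"
    if protection is Protection.FILTER_PORT:
        # Per-port filtering ignores the BPDU, which is why it can cause loops.
        return "bpdu ignored, port keeps forwarding"
    return "normal STP processing"


for mode in Protection:
    print(mode.value, "->", on_bpdu_received(mode))
```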

BPDU filtering results:

Example output:

2005 May 12 15:13:32 %SPANTREE-2-RX_PORTFAST:Received BPDU on PortFast enable port. Disabling 2/1
2005 May 12 15:13:32 %PAGP-5-PORTFROMSTP:Port 2/1 left bridge port 2/1

Root Guard - Root guard limits the switch ports out of which the root bridge may be negotiated. If a root guard-enabled port receives BPDUs that are superior to those being sent by the current root bridge, then that port will be moved to a root-inconsistent state, which is effectively equal to an STP listening state. No data traffic will be forwarded across this port.

Root guard is configured on a per-port basis. If a superior BPDU is received on the port, root guard does not take the BPDU into account and instead puts the port into the root-inconsistent state. Once switch D stops sending superior BPDUs, the port will be unblocked again and will transition through STP states as any other port. Recovery requires no intervention. Root guard should be enabled on all ports where the root bridge is not anticipated. A root guard-enabled port is in an STP designated port state.

The following console message appears when root guard blocks a port:

%SPANTREE-2-ROOTGUARDBLOCK: Port 1/1 tried to become non-designated in VLAN 77. Moved to root-inconsistent state

In the example, root guard should be enabled on the ports listed below so that an access layer switch (C or D) cannot become the root. If C or D became the root, the A-B link would be blocked and all traffic at the distribution layer (A and B) would flow through C:

* Switch A – port connecting to switch C

* Switch B – port connecting to switch C

* Switch C – port connecting to switch D

A link fault can occur as a result of the following:

* The media toward the root switch is physically disconnected

* The media toward the root fails

* The switch port at the other end of the root port fails, is disabled or is shutdown

* Any event in software or hardware that causes BPDUs not to arrive from the root switch on the port that a switch considers its root port

UplinkFast - Cisco’s Spanning Tree UplinkFast provides fast convergence after a direct link failure. This immediate convergence is facilitated through the creation of an uplink group; a set of Layer 2 interfaces on a single switch, only one of which is forwarding at any given time. An uplink group consists of the root port (which is forwarding) and a set of blocked ports. The uplink group provides alternate failover paths in event that the root port link fails.

UplinkFast can failover to a backup link very quickly, therefore the MAC address tables of other network switches must in turn be updated quickly to account for data traffic that should now traverse the backup path. To accomplish this, the UplinkFast switch will begin flooding frames with a source MAC address of all the entries in its CAM table to a destination Cisco proprietary multicast MAC Address. These frames will be sent out the backup port. This will in turn populate the CAM table of switches on the backup path with MAC addresses that were previously learned through the failed link.

UplinkFast unblocks a blocked port (connected to the backup root) on a switch that has lost its direct link to the root switch and transitions it to the forwarding state without going through the listening and learning states. This switchover occurs within 5 seconds. UplinkFast is implemented on an access switch with at least one forwarding port and one blocked port toward the root.

UplinkFast is enabled on a switch rather than on a port. When enabled, it increases the bridge priority to 49,152 and adds a value of 3000 to the spanning tree port cost of all interfaces on the switch, which makes it unlikely that the switch will become the root switch. If bridge priority or port cost has been manually configured on a switch, UplinkFast will not alter those spanning tree values; it will only alter default values.
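The rule that UplinkFast only alters default values can be sketched as follows. The function name, the port names, and the way defaults are passed in are hypothetical; the numbers 49,152 and 3000 come from the text.

```python
DEFAULT_PRIORITY = 32768
UPLINKFAST_PRIORITY = 49152
UPLINKFAST_COST_INCREMENT = 3000


def apply_uplinkfast(priority, port_costs, default_costs):
    """Return (priority, port_costs) after enabling UplinkFast.

    Values that were manually configured (not equal to their default) are kept.
    """
    new_priority = UPLINKFAST_PRIORITY if priority == DEFAULT_PRIORITY else priority
    new_costs = {
        port: (cost + UPLINKFAST_COST_INCREMENT if cost == default_costs[port] else cost)
        for port, cost in port_costs.items()
    }
    return new_priority, new_costs


# Fa0/1 is at its default cost; Fa0/2 was manually set to 50 and is left alone.
result = apply_uplinkfast(
    32768,
    {"Fa0/1": 19, "Fa0/2": 50},
    {"Fa0/1": 19, "Fa0/2": 19},
)
print(result)
```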

Enabling UplinkFast affects all VLANs on the switch. You cannot configure UplinkFast on an individual VLAN.

NOTE:

UplinkFast should only be configured on access switches.

The figure shows an example of a topology in which switch A is deployed in the Building Access submodule with uplink connections to the root switch over link 2 and the backup root switch over link 3. Initially, the port on switch A connected to link 2 is in the forwarding state, and the port connected to link 3 is in the blocking state.

When switch A detects a link failure on the currently active link 2 on the root port (a direct link failure), UplinkFast unblocks the blocked port on switch A and transitions it to the forwarding state without going through the listening and learning states. This switchover occurs within 5 seconds. UplinkFast is implemented on an access switch with at least one forwarding port and one blocked port toward the root.

The UplinkFast feature is based on the definition of an uplink group. On a given switch, the uplink group consists of the root port and all the ports that provide an alternate connection to the root bridge. If the root port fails (that is, if the primary uplink fails), the port with the next lowest cost in the uplink group is selected to immediately replace it.

The following diagram helps to explain what the UplinkFast feature is based on:

In this diagram, root ports are represented with a blue R and designated ports are represented with a green d. The green arrows represent the BPDUs generated by the root bridge and retransmitted by the bridges on their designated ports. Without entering a formal demonstration, we can determine the following about BPDUs and ports in a stable network:

1. When a port is receiving a BPDU, it has a path to the root bridge. This is because BPDUs are originated from the root bridge. In this diagram, check switch A: three of its ports are receiving BPDUs, and three of its ports lead to the root bridge. The port on A that is sending BPDU is designated and not leading to the root bridge.

2. On any given bridge, all ports receiving BPDUs are blocking, except the root port. A port receiving a BPDU is leading to the root bridge. If we had a bridge with two ports leading to the root bridge, we would have a bridging loop.

3. A self-looped port does not provide an alternate path to the root bridge. See switch B in the diagram. Switch B's blocked port is self-looped, which means that it cannot receive its own BPDUs. In this case, the blocked port is not providing an alternate path to the root.

On a given bridge, the root port and all blocked ports that are not self-looped form the uplink group. The following section describes step-by-step how UplinkFast achieves fast convergence using an alternate port from this uplink group.

Note: UplinkFast only works when the switch has blocked ports. The feature is typically designed for an access switch that has redundant blocked uplinks. When you enable UplinkFast, it is enabled for the entire switch and cannot be enabled for individual VLANs.

BackboneFast:

BackboneFast addresses the situation where an indirect failure causes a topology change and therefore a switch must find an alternative path through an intermediate switch. BackboneFast is initiated when a root port or blocked port on a switch receives inferior BPDUs from its designated bridge. Under normal spanning-tree rules, the switch ignores inferior BPDUs for the configured maximum aging time, as specified by the agingtime variable of the set spantree maxage command.

Example: BackboneFast Operation

BackboneFast operation is best illustrated by the failure of the link between the root and the backup root bridge. The backup root bridge does not assume that the root bridge is still available. The backup switch will immediately block all previously forwarding ports and then transmit configuration BPDUs claiming root responsibility.

When the access switch receives the BPDU of the backup root bridge, the access switch views the BPDU as inferior because its own root port is still active, and the last indication it has is that the backup root bridge is the designated root bridge. If configured for BackboneFast, the access switch then transmits a special root query message to explicitly determine if the root bridge is still active. Upon receipt of a response to the root query message, the access switch sends a BPDU using its known root bridge parameters to the backup root bridge and cycles the port connected to the backup root bridge through the listening and learning states.

This differs from standard 802.1D spanning tree operation in that normally the blocked port does not process the received BPDUs until the maxage interval has expired. By using the BackboneFast feature, the network recovers from an indirect failure in two times the forward delay time, which is 30 seconds by default, rather than max_age plus two times forward delay time, which is 50 seconds by default.
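The time saving can be expressed as a one-line calculation. The function name below is hypothetical; the defaults are the max_age and forward delay values given earlier.

```python
def indirect_failure_recovery(max_age=20, forward_delay=15, backbonefast=False):
    """Seconds to recover from an indirect link failure."""
    # Without BackboneFast the switch first waits for max_age to expire;
    # with it, only the listening and learning states remain.
    wait = 0 if backbonefast else max_age
    return wait + 2 * forward_delay


print(indirect_failure_recovery())                   # standard 802.1D
print(indirect_failure_recovery(backbonefast=True))  # with BackboneFast
```

With the default timers, BackboneFast removes the 20-second max_age wait and leaves only the two forward delay intervals.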

All links up

Link between root and backup root fails

BackboneFast must be enabled on all switches in the Spanning Tree instance. BackboneFast will interoperate with third-party switches. However, it is a Cisco proprietary feature, and it is not supported on Token Ring VLANs.

EtherChannel - EtherChannel bundles individual Ethernet links into a single logical link that provides bandwidth up to 1600 Mbps (Fast EtherChannel full duplex) or 16 Gbps (Gigabit EtherChannel) between two Catalyst switches. All interfaces in each EtherChannel must be the same speed and duplex and must all be configured as either Layer 2 or Layer 3 interfaces.

If a link within the EtherChannel bundle fails, traffic previously carried over the failed link will be carried over the remaining links within the EtherChannel.

The configuration applied to the individual physical interfaces that are to be aggregated by EtherChannel affects only those interfaces. Each EtherChannel has a logical port channel interface. A configuration applied to the port channel interface affects all physical interfaces assigned to that interface. (Such commands can be STP commands or commands to configure a Layer 2 EtherChannel as a trunk.)

EtherChannel Features and Benefits
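The failover behavior can be illustrated with a toy link-selection function. The hash shown here (XOR of the low byte of the source and destination MAC addresses, modulo the number of active links) is purely an assumption for illustration; real switches use platform-specific load-balancing methods. The interface names and MAC addresses are hypothetical.

```python
def pick_link(src_mac, dst_mac, active_links):
    """Choose a member link for a flow using a toy XOR hash (assumed, not Cisco's)."""
    if not active_links:
        raise ValueError("no active links left in the bundle")
    index = ((src_mac ^ dst_mac) & 0xFF) % len(active_links)
    return active_links[index]


links = ["Gi0/1", "Gi0/2", "Gi0/3", "Gi0/4"]
src, dst = 0x00112233440A, 0x00AABBCCDD07

chosen = pick_link(src, dst, links)

# Simulate failure of the chosen link: the flow moves to a surviving member.
survivors = [link for link in links if link != chosen]
rehashed = pick_link(src, dst, survivors)
print(chosen, rehashed)
```

The point of the sketch is only that a flow is always mapped onto whichever links remain active, so the logical link stays up as long as one member survives.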

* Allows for the creation of a very-high-bandwidth logical link

* Load balances amongst the physical links involved

* Provides automatic failover

* Simplifies subsequent logical configuration (configuration is per logical link instead of per physical link)

The Port Aggregation Protocol (PAgP) aids in the automatic creation of Fast EtherChannel links. PAgP packets are sent between Fast EtherChannel-capable ports in order to negotiate the forming of a channel. When PAgP identifies matched Ethernet links, PAgP groups the links into an EtherChannel. The EtherChannel is then added to the spanning tree as a single bridge port.

Link Aggregation Control Protocol (LACP) is part of an IEEE specification (802.3ad) that allows several physical ports to be bundled together to form a single logical channel. LACP allows a switch to negotiate an automatic bundle by sending LACP packets to the peer. It performs a similar function as Port Aggregation Protocol (PAgP) with Cisco EtherChannel. Because LACP is an IEEE standard, it can be used to facilitate EtherChannels in mixed switch environments.

Interface Modes

Interfaces can be set in any of several modes to control EtherChannel formation.

These modes enable or disable EtherChannel without negotiation:

On: The link aggregation is forced to be formed without any LACP negotiation. In other words, the switch will neither send the LACP packet nor process any incoming LACP packet. This is similar to the on state for PAgP.

Off: The link aggregation will not be formed. We do not send or understand the LACP packet. This is similar to the off state for PAgP.

The next two modes enable PAgP:

* Auto – This PAgP mode places an interface in a passive negotiating state in which the interface responds to the PAgP packets it receives, but it does not initiate PAgP negotiation.

* Desirable – This PAgP mode places an interface in an active negotiating state in which the interface initiates negotiations with other interfaces by sending PAgP packets.

Interfaces configured in the on mode do not exchange PAgP packets. The default mode for PAgP is auto mode.

The EtherChannel modes that use LACP are as follows:

Passive: The switch does not initiate the channel, but does understand incoming LACP packets. The peer (in active state) initiates negotiation (by sending out an LACP packet) which we receive and reply to, eventually forming the aggregation channel with the peer. This is similar to the auto mode in PAgP.

Active: We are willing to form an aggregate link, and initiate the negotiation. The link aggregate will be formed if the other end is running in LACP active or passive mode. This is similar to the desirable mode of PAgP.

There are only three valid combinations to run the LACP link aggregate, as follows:

Switch    Switch    Comments
active    active    Recommended.
active    passive   Link aggregation occurs if negotiation is successful.
on        on        Link aggregation occurs without LACP. Although this works, it is not recommended.

Note: By default, when an LACP channel is configured, the LACP channel mode is passive.
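The valid combinations can be captured in a small compatibility check. The function name channel_forms is hypothetical; the rules encode only what the mode descriptions and the table state for LACP.

```python
LACP_MODES = {"on", "off", "active", "passive"}


def channel_forms(local, remote):
    """Return True if two LACP-side modes result in a link aggregate."""
    if local not in LACP_MODES or remote not in LACP_MODES:
        raise ValueError("unknown mode")
    pair = {local, remote}
    if pair == {"on"}:
        # Both ends forced on: the channel forms without any LACP exchange.
        return True
    if "on" in pair or "off" in pair:
        # A forced or disabled end never negotiates with the other side.
        return False
    # Both ends negotiate; at least one must initiate.
    return "active" in pair


for combo in [("active", "active"), ("active", "passive"),
              ("passive", "passive"), ("on", "on"), ("on", "active")]:
    print(combo, channel_forms(*combo))
```

Two passive ends never form a channel because neither side sends the first LACP packet.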

LACP Parameters

The following parameters are used in configuring LACP:

* System priority – Each switch running LACP must have a system priority. The system priority can be specified automatically or through the command-line interface (CLI). The switch uses the MAC address and the system priority to form the system ID.

* Port priority – Each port in the switch must have a port priority. The port priority can be specified automatically or through the CLI. The port priority and the port number form the port identifier. The switch uses the port priority to decide which ports to put in standby mode when a hardware limitation prevents all compatible ports from aggregating.

* Administrative key – Each port in the switch must have an administrative key value, which can be specified automatically or through the CLI. The administrative key defines the ability of a port to aggregate with other ports, determined by the following:

o The port physical characteristics, such as data rate, duplex capability, and point-to-point or shared medium

o The configuration constraints that you establish

When enabled, LACP attempts to configure the maximum number of compatible ports in a channel. In some instances LACP is not able to aggregate all the ports that are compatible; for example, the remote system might have more restrictive hardware limitations. When this occurs, all the ports that cannot be actively included in the channel are put in hot standby state and used only if one of the channeled ports fails.
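The standby selection can be sketched as a sort on the port identifier. The function name and the max_active parameter (standing in for the hardware limit) are hypothetical, and the sketch assumes that a numerically lower port priority is preferred.

```python
def split_active_standby(ports, max_active):
    """ports is a list of (port_priority, port_number) identifiers.

    Returns (active, standby), preferring the lowest port identifiers.
    """
    ordered = sorted(ports)  # compares priority first, then port number
    return ordered[:max_active], ordered[max_active:]


ports = [(32768, 3), (100, 4), (32768, 1), (32768, 2)]
active, standby = split_active_standby(ports, max_active=2)
print(active, standby)
```

If an active port fails, rerunning the selection over the remaining ports promotes the best standby port into the channel.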

EtherChannel Layer 2/3 configuration:

http://www.cisco.com/en/US/products/hw/switches/ps5528/products_configuration_guide_chapter09186a00801e85ce.html#1033981

EtherChannel configuration tips:

* EtherChannel Support – All Ethernet interfaces on all modules support EtherChannel (maximum of eight interfaces), with no requirement that interfaces be physically contiguous or on the same module.
* Speed and Duplex – Configure all interfaces in an EtherChannel to operate at the same speed and in the same duplex mode. Also, if one interface in the bundle is shut down, it is treated as a link failure and traffic traverses the other links in the bundle.
* SPAN and EtherChannel – An EtherChannel will not form if one of the interfaces is a Switched Port Analyzer (SPAN) destination port.
* Layer 3 EtherChannels – Assign Layer 3 addresses to the port-channel logical interface, not to the physical interfaces in the channel.
* VLAN match – All interfaces in the EtherChannel bundle must be assigned to the same VLAN or be configured as a trunk.
* Range of VLANs – An EtherChannel supports the same allowed range of VLANs on all the interfaces in a trunking Layer 2 EtherChannel. If the allowed range of VLANs is not the same, the interfaces do not form an EtherChannel, even when set to the auto or desirable mode. For Layer 2 EtherChannels, either assign all interfaces in the EtherChannel to the same VLAN or configure them as trunks.
* STP Path Cost – Interfaces with different STP port path costs can form an EtherChannel as long as they are otherwise compatibly configured. Setting different STP port path costs does not, by itself, make interfaces incompatible for the formation of an EtherChannel.
* Port Channel vs. Interface configuration – After configuring an EtherChannel, any configuration you apply to the port-channel interface affects the EtherChannel. Any configuration you apply to the physical interfaces affects only the specific interface you configured.

4. Spanning-tree enhancements

STP Troubleshooting - STP problems are most often evidenced by the existence of a bridge loop. Troubleshooting STP involves the identification and prevention of such loops.

The primary function of the spanning-tree algorithm (STA) is to remove loops created by redundant links in bridged networks. STP operates at Layer 2 of the OSI model, exchanging BPDUs between bridges and selecting the ports that will eventually forward or block traffic. If BPDUs are not being sent or received over a link between switches, the protocol can fail to prevent loops. Troubleshooting the resulting problems can be difficult in a complex network.



Any condition that prevents BPDUs from being sent or received can result in a bridge loop.

Here is an explanation of how those conditions may occur.

Duplex Mismatch

Duplex mismatch on a point-to-point link is a common configuration error and can have specific implications for STP. The results of the mismatch will vary some by platform.

There are two common mismatch scenarios between switches and their resulting STP problems:

* Switch configured for full duplex connected to a host in autonegotiation mode – The rule of autonegotiation is that upon negotiation failure, a port is required to assume half-duplex operation. This creates a situation where there is either no connectivity or inconsistent connectivity between the two devices, because one side of the connection defaults to half-duplex mode and the other side is set to full-duplex operation. In many cases this condition allows traffic to flow at low data rates, but as the traffic level on the link increases, the half-duplex side is overwhelmed, causing data and link integrity errors. As the error rate goes up, BPDUs may not successfully cross the link.
* Switch configured for half duplex on a link, with the peer switch configured for full duplex – In the example, the duplex mismatch on the link between bridge A and bridge B can lead to a bridge loop. Because B is configured for full duplex, it does not perform carrier sense when accessing the link. B will then start sending frames even if A is already using the link. This is a problem for A, which detects a collision and runs the backoff algorithm before attempting another transmission of its frame. The result is that frames, including BPDUs, sent by A may be deferred or collide and eventually be dropped. Because it does not receive BPDUs from A, bridge B may lose its connection to the root. This will cause B to unblock its port to bridge C, thereby creating the loop.

Unidirectional Link Failure

A unidirectional link is one that stays up while providing only one-way communication. Unidirectional links cause specific STP problems. In the example, the link between bridge A and bridge B is unidirectional and drops traffic from A to B while transmitting traffic from B to A. Suppose the port on bridge B was blocking. A port blocks only if it receives superior BPDUs, that is, BPDUs from a bridge with a higher priority. In this case, all the BPDUs coming from bridge A are lost, so bridge B never sees the superior BPDUs. B will unblock the port and eventually forward traffic, potentially creating a loop when other switches are in the scenario. If the unidirectional failure exists at startup, the STP will not converge correctly.

Frame Corruption

Frame corruption can result from duplex mismatch, bad cables, or incorrect cable length, and can lead to an STP failure. If a link is receiving a high number of frame errors, BPDUs can be lost. This may cause a port in blocking state to transition to forwarding. In 802.1D, if a blocking port does not see any BPDUs for 50 seconds (with the default timers), it transitions to the forwarding state. If a single BPDU was successfully transmitted it would break the loop. This problem would be most likely if STP timing parameters, such as the max_age value setting, had been adjusted too low.
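The 50-second figure follows from the 802.1D timers: the stored BPDU ages out after max_age, and the port then spends one forward-delay interval in listening and another in learning. A minimal sketch of that arithmetic, using the standard default values of 20 and 15 seconds (the function name is hypothetical):

```python
def blocking_to_forwarding_seconds(max_age: int = 20, forward_delay: int = 15) -> int:
    """Time for a blocking port that stops hearing BPDUs to begin forwarding.

    max_age       : how long the last stored BPDU remains valid
    forward_delay : time spent in each of the listening and learning states
    """
    return max_age + 2 * forward_delay


if __name__ == "__main__":
    print("default timers:", blocking_to_forwarding_seconds(), "seconds")
    print("aggressively tuned timers:", blocking_to_forwarding_seconds(6, 4), "seconds")
```

With lowered timers the window in which a BPDU must get through shrinks considerably, which is why tuned-down values make this failure more likely on a noisy link.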

Resource Errors

STP is implemented in software. This means that if the CPU of the bridge is overutilized, the switch may lack the resources to send out or to receive BPDUs in a timely manner. Lack of BPDUs can cause ports to transition from blocking to forwarding when they should not transition. This can result in loops forming in the network.

The STA, however, is not processor-intensive and has priority over other processes. Therefore, a CPU utilization problem is unlikely on current Catalyst switch platforms.

PortFast Configuration Error

PortFast is a feature that is intended for configuration on a port connected to a single host. When the link comes up on such a port, the first stages of the STA are skipped and the port directly transitions to the forwarding state. If a switch is inadvertently attached to a PortFast port, a loop may occur or this rogue switch may be elected as the STP root bridge. Furthermore, if a hub is attached to a PortFast port with redundant connections to the switch, a loop will be introduced that will go unchecked by STP.

In the example, A is a bridge with port P1 forwarding and port P2 configured for PortFast. B is a hub. As soon as the second cable is plugged into A, port P2 goes to the forwarding state and creates a loop between P1 and P2 given that both ports are in forwarding state. As soon as P1 or P2 receives a BPDU, one of these two ports will transition to a blocking state. The traffic generated by this kind of loop may occur at such a high rate that the bridge may have trouble successfully sending the BPDU to stop the loop. Implementing BPDU guard will prevent this problem.

EtherChannel Issues

The challenges for EtherChannel can be divided into two main areas: troubleshooting during the configuration phase and troubleshooting during the execution phase. Configuration errors usually occur because of mismatched parameters on the ports involved (different speeds, different duplex, different spanning tree port values, mismatched native VLAN settings, etc.). But you can also generate errors during the configuration by setting the channel on one side to on and waiting too long before configuring the channel on the other side. This causes temporary spanning tree loops, which generate an error and shut down the port.

Depending on the version of operating system and platform being configured for EtherChannel, the ports on one side of the link may remain in disabled state even after the configuration issue has been resolved. Be sure to verify that both sides of the link are operational after changing any port parameters on an EtherChannel link.

Debugging spanning-tree

CAUTION:

Keep in mind that enabling debugging, especially to a console or auxiliary port, can cause excessive processor utilization on an infrastructure device. If debugging commands must be issued, consider disabling console logging and sending the output to a terminal session. Additionally, remember to turn off all debugging when the appropriate data has been collected.

Troubleshooting steps:

Reference a Network Diagram

Collect the following network information before troubleshooting a bridging loop. Knowledge of the following items in your environment is critical:

* The physical and logical topology of the bridged network
* Where the root bridge is located (for all VLANs if PVST is in use)
* Where the redundant links and blocked ports are located

Identify Issues

This knowledge is essential at least for the following two reasons:

* To identify a problem, you need to know how the STP network should be laid out when it is operating correctly.
* The STP troubleshooting steps use show commands to display error conditions. Knowledge of the network helps focus your attention on the critical portions of these displays.

Identify a Bridge Loop

The best way to identify a bridge loop is to capture the traffic on a saturated link and check whether identical frames are traversing multiple links. Bridge loops often result in high port utilization due to excessive frames. Check the port utilization on your devices and look for abnormal values.

You can monitor STP operations using the debug spanning-tree command. This command is helpful in verifying correct bridging operation as well as identifying loops.

Restore Connectivity vs. Resolve Issues

Bridge loops have severe consequences in a switched network. When one occurs, administrators generally do not have time to identify the reason for the loop during working hours and will often take temporary measures to stabilize the network but never resolve the actual problem that occurred. It is important to recreate and correct the original problem at a planned network down time.

Break the Loop by Disabling Ports

A simple troubleshooting approach is to manually disable ports providing Layer 2 redundancy. Begin by disabling ports that should be blocking. Each time you disable a port, check to see if connectivity is restored in the network. If you know which port stopped the loop after being disabled, it is a good indication that the failure was located on a redundant path where this port was located.

Log STP Events on Devices Hosting Blocked Ports

If you cannot identify precisely the source of an STP problem, or if the problem is only transient, enable logging of STP events on the bridges and the switches of the network which are experiencing the failure. At a minimum, enable logging on devices hosting blocked ports, because it is typically the transition of a blocked port to forwarding that creates a loop.

Use the command debug spanning-tree events to enable STP debugging. Use the command logging buffered from global configuration mode to capture this debug information into the buffers of the device.

Check Ports

The ports to be investigated first are the blocking ports. Here is a list of what to check for on the various ports, with a brief description of the commands to enter.

Check That Blocked Ports Receive BPDUs

Check that BPDUs are being received periodically, especially on blocked and root ports.

If you are running Cisco IOS Release 12.0 or a later release, the command show spanning-tree <bridge-group #> displays a field named BPDU, which shows the number of BPDUs received on each interface. Issuing the command several times indicates whether the device is receiving BPDUs.

Check for Duplex Mismatch

To look for a duplex mismatch, check each side of a point-to-point link. Use the show interface command to check the speed and duplex status of the specified ports.

Check Port Utilization

An overloaded interface can fail to transmit vital BPDUs. An overloaded link is also an indication of a possible bridging loop.

Use the command show interface to determine interface utilization. Check the output for load and packet input and output.

Check Frame Corruption

Look for increases in the input errors field of the show interface command.

Look for Resource Errors

High CPU utilization can be dangerous for a system running the STA. Use the show processes cpu command to check whether the CPU utilization is approaching 100 percent.

Disable Unneeded Features

Disabling as many features as possible helps simplify the network structure and eases the troubleshooting process. EtherChannel, for example, is an advanced feature that needs STP to logically bundle several different links into a single logical port. It can be helpful to disable this feature during troubleshooting. In general, simplifying the network configuration reduces the troubleshooting effort.

The STP debug Command

The command debug spanning-tree is very useful for troubleshooting STP issues. It accepts a variety of arguments to limit output to events that are specific to a certain STP feature. This example shows output regarding all events while interface GigabitEthernet 0/1 went down.

CAUTION:

As with all debug commands, be very careful with debug spanning-tree. This command is extremely resource-intensive and will interfere with normal network traffic processing.

General Recommendations

In general, it is difficult to troubleshoot spanning tree problems in a very large, flat, switched network. If the network is being restructured, it is advisable to implement a hierarchical network structure that is designed around the Campus Infrastructure module. This would create manageable failure domains and reduce the overall network complexity.

UDLD

A unidirectional link occurs when traffic is transmitted between neighbors in one direction only. Unidirectional links can cause Spanning Tree topology loops. UDLD allows devices to detect when a unidirectional link exists, and also to shut down the affected interface.

UDLD is a Layer 2 protocol that works with the Layer 1 mechanisms to determine the physical status of a link. If one fiber strand in a pair is disconnected, autonegotiation would not allow the link to become active or stay up. If both fiber strands are operational from a Layer 1 perspective, UDLD determines whether traffic is flowing bidirectionally between the correct neighbors.

The switch periodically transmits UDLD packets on an interface with UDLD enabled. If the packets are not echoed back within a specific time frame, the link is flagged as unidirectional and the interface is shut down. Devices on both ends of the link must support UDLD for the protocol to successfully identify and disable unidirectional links.

LoopGuard

Like UDLD, loop guard provides protection for STP when a link is unidirectional and BPDUs are being sent, but not received, on a link that is considered operational. Without loop guard, a blocking port will transition to forwarding if it stops receiving BPDUs. If loop guard is enabled, and the link is not receiving BPDUs, the interface will move into the STP loop-inconsistent blocking state (the loop-inconsistent state is effectively equal to the blocking state). Without loop guard, the STP blocking port which encountered unidirectional link failure will transition to the STP listening state upon max_age timer expiration and then to the forwarding state in two times the forward delay time. A loop will be created. When loop guard blocks a port, this message is generated to the console or log file if allowed:

SPANTREE-2-LOOPGUARDBLOCK: No BPDUs were received on port 3/2 in vlan 3. Moved to loop-inconsistent state.

Once a BPDU is received on a loop guard port that is in a loop-inconsistent state, the port will transition to the appropriate state as determined by the normal functioning of Spanning Tree. The recovery requires no user intervention. After the recovery, this message is logged:

SPANTREE-2-LOOPGUARDUNBLOCK: port 3/2 restored in vlan 3.

Enabling loop guard will disable root guard, if root guard is currently enabled on the ports.
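The loop guard behavior described above can be summarized as a tiny state model for a single blocking port. This is a simplified sketch rather than the switch implementation: the class and method names are hypothetical, and it assumes the port should return to blocking once BPDUs arrive again.

```python
class BlockingPort:
    """Models what happens to a blocking port when BPDUs stop arriving."""

    def __init__(self, loop_guard: bool) -> None:
        self.loop_guard = loop_guard
        self.state = "blocking"

    def on_max_age_expired(self) -> str:
        # No BPDU was received before the max_age timer ran out.
        if self.loop_guard:
            self.state = "loop-inconsistent"   # effectively still blocking
        else:
            self.state = "listening"           # on its way to forwarding
        return self.state

    def on_bpdu_received(self) -> str:
        # Recovery needs no user intervention; normal STP operation decides
        # the state again. Assumption: the port belongs in blocking.
        if self.state == "loop-inconsistent":
            self.state = "blocking"
        return self.state


if __name__ == "__main__":
    for guarded in (False, True):
        port = BlockingPort(loop_guard=guarded)
        print("loop guard", guarded, "->", port.on_max_age_expired())
```

Without loop guard the port would reach forwarding two forward-delay intervals after entering listening, which is the point at which the loop forms.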

LoopGuard vs. UDLD

The functions of UDLD and loop guard partially overlap in that both protect against STP failures caused by unidirectional links. These two features are different in their approach to the problem and also in how they function. The figure identifies key differences as well as how to implement both features.

Depending on various design considerations, you can choose either UDLD or loop guard. UDLD provides no protection against STP failures caused by software problems that result in the designated switch not sending BPDUs. This type of failure, however, is less common than those caused by hardware failure.

On an EtherChannel bundle, UDLD will disable individual failed links. The channel itself remains functional if there are other links available. Loop guard will put the entire channel in loop-inconsistent state if any physical link in the bundle fails.

Loop guard does not work on shared links or if the link has been unidirectional since its initial setup. Enabling both UDLD and loop guard provides the highest level of protection.

UDLD and LoopGuard configuration

Configuring UDLD

UDLD is used when a link should be shut down because of a hardware failure that is causing unidirectional communication. In an EtherChannel bundle, UDLD will shut down only the physical link that has failed.

UDLD can be enabled globally for all fiber interfaces or on a per interface basis.

Enable UDLD on an Interface

To enable UDLD on an interface, use the following command:

Switch(config-if)#udld enable

Enable UDLD Globally

To enable UDLD globally on all fiber-optic interfaces, use the following command:

Switch(config)#udld enable

Verifying and Resetting UDLD

UDLD shuts down an interface when it detects a unidirectional link. To reset all interfaces that have been shut down by UDLD, enter this command:

Switch#udld reset

To verify the UDLD configuration for an interface, enter this command:

Switch#show udld interface

Example: Displaying the UDLD State

This example shows how to display the UDLD state for a single interface.

Switch#show udld GigabitEthernet2/2

Configuring Loop Guard

Loop guard is enabled on a per-port basis. When loop guard is enabled, it is automatically applied to all of the active VLAN instances to which that port belongs. When you disable loop guard, it is disabled for the specified ports. Disabling loop guard moves all loop-inconsistent ports to the listening state. If loop guard is enabled on an EtherChannel interface, the entire channel will be blocked for a particular VLAN. This is because EtherChannel is regarded as one logical port from an STP point of view.

Loop guard should be enabled on the root port and the alternate ports on access switches.

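Enable Loop Guard on an Interface

To enable loop guard on a specific interface, issue this command in interface configuration mode (on CatOS, the equivalent is set spantree guard loop mod/port):

Switch(config-if)#spanning-tree guard loop

To disable loop guard, issue this command (on CatOS, set spantree guard none mod/port):

Switch(config-if)#spanning-tree guard none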

Enabling loop guard will disable root guard, if root guard is currently enabled on the ports.

Enable Loop Guard Globally

Loop guard can be enabled globally on a switch for all point-to-point links. A full-duplex link is considered to be a point-to-point link. The status of loop guard can be changed on an interface even if the feature has been enabled globally.

To enable loop guard globally, issue this command (on CatOS, the equivalent is set spantree global-default loopguard enable):

Switch(config)#spanning-tree loopguard default

To globally disable loop guard, issue this command (on CatOS, set spantree global-default loopguard disable):

Switch(config)#no spanning-tree loopguard default

Verifying the Loop Guard Status

To verify the loop guard status, issue this command (shown in CatOS syntax):

Console> (enable) show spantree guard mod/port | vlan

For example:

Console> (enable) show spantree guard 3/13

Port     VLAN     Port-State           Guard Type
-------  -------  -------------------  ----------------
3/13     2        forwarding           loop

IEEE 802.1D Overview

STP is a Layer 2 link management protocol that provides path redundancy while preventing loops in the network. For a Layer 2 Ethernet network to function properly, only one active path can exist between any two stations. Multiple active paths among end stations cause loops in the network. If a loop exists in the network, end stations might receive duplicate messages. Switches might also learn end-station MAC addresses on multiple Layer 2 interfaces. These conditions result in an unstable network. Spanning-tree operation is transparent to end stations, which cannot detect whether they are connected to a single LAN segment or a switched LAN of multiple segments.

The STP uses a spanning-tree algorithm to select one switch of a redundantly connected network as the root of the spanning tree. The algorithm calculates the best loop-free path through a switched Layer 2 network by assigning a role to each port based on the role of the port in the active topology:

• Root – A forwarding port elected for the spanning-tree topology
• Designated – A forwarding port elected for every switched LAN segment
• Alternate – A blocked port providing an alternate path to the root bridge in the spanning tree
• Backup – A blocked port in a loopback configuration

The switch that has all of its ports as the designated role or as the backup role is the root switch. The switch that has at least one of its ports in the designated role is called the designated switch.

Spanning tree forces redundant data paths into a standby (blocked) state. If a network segment in the spanning tree fails and a redundant path exists, the spanning-tree algorithm recalculates the spanning-tree topology and activates the standby path. Switches send and receive spanning-tree frames, called bridge protocol data units (BPDUs), at regular intervals. The switches do not forward these frames but use them to construct a loop-free path. BPDUs contain information about the sending switch and its ports, including switch and MAC addresses, switch priority, port priority, and path cost. Spanning tree uses this information to elect the root switch and root port for the switched network and the root port and designated port for each switched segment.

When two interfaces on a switch are part of a loop, the spanning-tree port priority and path cost settings determine which interface is put in the forwarding state and which is put in the blocking state. The spanning-tree port priority value represents the location of an interface in the network topology and how well it is located to pass traffic. The path cost value represents the media speed.

Spanning-Tree Topology and BPDUs

The stable, active spanning-tree topology of a switched network is determined by these elements:

• The unique bridge ID (switch priority and MAC address) associated with each VLAN on each switch
• The spanning-tree path cost to the root switch
• The port identifier (port priority and port number) associated with each Layer 2 interface

When the switches in a network are powered up, each functions as the root switch. Each switch sends a configuration BPDU through all of its ports. The BPDUs communicate and compute the spanning-tree topology. Each configuration BPDU contains this information:

• The unique bridge ID of the switch that the sending switch identifies as the root switch
• The spanning-tree path cost to the root
• The bridge ID of the sending switch
• Message age
• The identifier of the sending interface
• Values for the hello, forward-delay, and max-age protocol timers

When a switch receives a configuration BPDU that contains superior information (lower bridge ID, lower path cost, and so forth), it stores the information for that port. If this BPDU is received on the root port of the switch, the switch also forwards it with an updated message to all attached LANs for which it is the designated switch.

If a switch receives a configuration BPDU that contains inferior information to that currently stored for that port, it discards the BPDU. If the switch is a designated switch for the LAN from which the inferior BPDU was received, it sends that LAN a BPDU containing the up-to-date information stored for that port. In this way, inferior information is discarded, and superior information is propagated on the network.
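The superior/inferior test described above amounts to an ordered comparison of a few BPDU fields, where the lower value wins at each step. The sketch below is an illustration only: the type and function names are hypothetical, MAC addresses are held as integers for easy comparison, and the field order (root ID, path cost, sender bridge ID, sender port ID) is the conventional 802.1D ordering rather than something this text spells out.

```python
from typing import NamedTuple


class BridgeId(NamedTuple):
    priority: int   # switch priority (lower is better)
    mac: int        # MAC address as an integer (lower is better)


class ConfigBpdu(NamedTuple):
    root_id: BridgeId        # bridge the sender believes is the root
    root_path_cost: int      # sender's path cost to that root
    sender_id: BridgeId      # bridge ID of the sending switch
    sender_port_id: int      # identifier of the sending interface


def is_superior(received: ConfigBpdu, stored: ConfigBpdu) -> bool:
    """True if the received BPDU carries better information than the stored one.

    Tuples compare field by field, so this checks root ID first, then
    path cost, then sender bridge ID, then sender port ID.
    """
    return received < stored


if __name__ == "__main__":
    root = BridgeId(32768, 0x00000C111111)
    stored = ConfigBpdu(root, 19, BridgeId(32768, 0x00000C222222), 0x8001)
    better_cost = ConfigBpdu(root, 4, BridgeId(32768, 0x00000C333333), 0x8001)
    print("lower path cost is superior:", is_superior(better_cost, stored))
```

A switch would store the received BPDU when is_superior returns True and discard it otherwise, answering with its own stored information if it is the designated switch for that LAN.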

A BPDU exchange results in these actions:

• One switch in the network is elected as the root switch (the logical center of the spanning-tree topology in a switched network).
  For each VLAN, the switch with the highest switch priority (the lowest numerical priority value) is elected as the root switch. If all switches are configured with the default priority (32768), the switch with the lowest MAC address in the VLAN becomes the root switch. The switch priority value occupies the most significant bits of the bridge ID.
• A root port is selected for each switch (except the root switch). This port provides the best path (lowest cost) when the switch forwards packets to the root switch.
• The shortest distance to the root switch is calculated for each switch based on the path cost.
• A designated switch for each LAN segment is selected. The designated switch incurs the lowest path cost when forwarding packets from that LAN to the root switch. The port through which the designated switch is attached to the LAN is called the designated port.
• Interfaces included in the spanning-tree instance are selected. Root ports and designated ports are put in the forwarding state.
• All paths that are not needed to reach the root switch from anywhere in the switched network are placed in the spanning-tree blocking mode.

Bridge ID, Switch Priority, and Extended System ID

The IEEE 802.1D standard requires that each switch has a unique bridge identifier (bridge ID), which determines the selection of the root switch. Because each VLAN is considered as a different logical bridge with PVST+ and rapid PVST+, the same switch must have as many different bridge IDs as VLANs configured on it. Each VLAN on the switch has a unique 8-byte bridge ID; the two most-significant bytes are used for the switch priority, and the remaining six bytes are derived from the switch MAC address.
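Because the priority occupies the most significant bytes, comparing whole bridge IDs as numbers automatically prefers the lower priority and falls back to the lower MAC address on a tie. A minimal sketch of that packing and of root election follows; the function names and the sample addresses are hypothetical.

```python
from typing import Dict, Tuple


def bridge_id(priority: int, mac: str) -> int:
    """Pack a 2-byte switch priority and a 6-byte MAC address into an 8-byte bridge ID."""
    if not 0 <= priority <= 0xFFFF:
        raise ValueError("priority must fit in two bytes")
    mac_value = int(mac.replace(":", "").replace("-", "").replace(".", ""), 16)
    if mac_value > 0xFFFFFFFFFFFF:
        raise ValueError("MAC address must fit in six bytes")
    return (priority << 48) | mac_value


def elect_root(switches: Dict[str, Tuple[int, str]]) -> str:
    """Return the name of the switch with the numerically lowest bridge ID."""
    return min(switches, key=lambda name: bridge_id(*switches[name]))


if __name__ == "__main__":
    switches = {
        "A": (32768, "00:0c:00:00:00:0a"),
        "B": (32768, "00:0c:00:00:00:01"),
        "C": (24576, "00:0c:ff:ff:ff:ff"),
    }
    print("root switch:", elect_root(switches))
    print("bridge ID of C:", format(bridge_id(*switches["C"]), "016x"))
```

In this sample, C wins on priority even though it has the highest MAC address; if C were removed, B would win the tie between the two default-priority switches on its lower MAC address.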

Spanning-Tree Interface States

Propagation delays can occur when protocol information passes through a switched LAN. As a result, topology changes can take place at different times and at different places in a switched network. When an interface transitions directly from nonparticipation in the spanning-tree topology to the forwarding state, it can create temporary data loops. Interfaces must wait for new topology information to propagate through the switched LAN before starting to forward frames. They must allow the frame lifetime to expire for forwarded frames that have used the old topology.

Each Layer 2 interface on a switch using spanning tree exists in one of these states:

• Blocking – The interface does not participate in frame forwarding.
• Listening – The first transitional state after the blocking state when the spanning tree determines that the interface should participate in frame forwarding.
• Learning – The interface prepares to participate in frame forwarding.
• Forwarding – The interface forwards frames.
• Disabled – The interface is not participating in spanning tree because of a shutdown port, no link on the port, or no spanning-tree instance running on the port.

An interface moves through these states:

• From initialization to blocking
• From blocking to listening or to disabled
• From listening to learning or to disabled
• From learning to forwarding or to disabled
• From forwarding to disabled

When you power up the switch, spanning tree is enabled by default, and every interface in the switch, VLAN, or network goes through the blocking state and the transitory states of listening and learning. Spanning tree stabilizes each interface at the forwarding or blocking state.

When the spanning-tree algorithm places a Layer 2 interface in the forwarding state, this process occurs:

1. The interface is in the listening state while spanning tree waits for protocol information to transition the interface to the blocking state.
2. While the spanning tree waits for the forward-delay timer to expire, it moves the interface to the learning state and resets the forward-delay timer.
3. In the learning state, the interface continues to block frame forwarding as the switch learns end-station location information for the forwarding database.
4. When the forward-delay timer expires, spanning tree moves the interface to the forwarding state, where both learning and frame forwarding are enabled.
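The transition list above can be captured as a lookup table and used to check whether a given sequence of states is legal. This is a sketch for illustration; the names ALLOWED_TRANSITIONS and is_valid_path are hypothetical, and only the transitions listed in the text are modeled.

```python
from typing import Dict, List, Set

# Each state maps to the set of states it may move to, as listed in the text.
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "initialization": {"blocking"},
    "blocking": {"listening", "disabled"},
    "listening": {"learning", "disabled"},
    "learning": {"forwarding", "disabled"},
    "forwarding": {"disabled"},
}


def is_valid_path(states: List[str]) -> bool:
    """Check that every consecutive pair of states is an allowed transition."""
    return all(
        nxt in ALLOWED_TRANSITIONS.get(cur, set())
        for cur, nxt in zip(states, states[1:])
    )


if __name__ == "__main__":
    normal = ["initialization", "blocking", "listening", "learning", "forwarding"]
    shortcut = ["blocking", "forwarding"]
    print("normal start-up path valid:", is_valid_path(normal))
    print("blocking straight to forwarding valid:", is_valid_path(shortcut))
```

The rejected shortcut is exactly the jump that the listening and learning states exist to prevent, since it would forward frames before new topology information has propagated.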

Blocking State

A Layer 2 interface in the blocking state does not participate in frame forwarding. After initialization, a BPDU is sent to each interface in the switch. A switch initially functions as the root until it exchanges BPDUs with other switches. This exchange establishes which switch in the network is the root or root switch. If there is only one switch in the network, no exchange occurs, the forward-delay timer expires, and the interfaces move to the listening state. An interface always enters the blocking state after switch initialization.

An interface in the blocking state performs as follows:

• Discards frames received on the port
• Discards frames switched from another interface for forwarding
• Does not learn addresses
• Receives BPDUs

Listening State

The listening state is the first state a Layer 2 interface enters after the blocking state. The interface enters this state when the spanning tree determines that the interface should participate in frame forwarding.

An interface in the listening state performs as follows:

• Discards frames received on the port
• Discards frames switched from another interface for forwarding
• Does not learn addresses
• Receives BPDUs

Learning State

A Layer 2 interface in the learning state prepares to participate in frame forwarding. The interface enters the learning state from the listening state.

An interface in the learning state performs as follows:

• Discards frames received on the port
• Discards frames switched from another interface for forwarding
• Learns addresses
• Receives BPDUs

Forwarding State

A Layer 2 interface in the forwarding state forwards frames. The interface enters the forwarding state from the learning state.

An interface in the forwarding state performs as follows:

• Receives and forwards frames received on the port
• Forwards frames switched from another port
• Learns addresses
• Receives BPDUs

Disabled State

A Layer 2 interface in the disabled state does not participate in frame forwarding or in the spanning tree. An interface in the disabled state is nonoperational.

A disabled interface performs as follows:

• Discards frames received on the port
• Discards frames switched from another interface for forwarding
• Does not learn addresses
• Does not receive BPDUs
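The per-state behavior lists above differ in only three respects: whether frames are forwarded, whether addresses are learned, and whether BPDUs are received. The sketch below puts them side by side; the Behavior type and the states_where helper are hypothetical names used only for this summary.

```python
from typing import Dict, List, NamedTuple


class Behavior(NamedTuple):
    forwards_frames: bool
    learns_addresses: bool
    receives_bpdus: bool


# One row per interface state, taken from the behavior lists in the text.
STATE_BEHAVIOR: Dict[str, Behavior] = {
    "blocking":   Behavior(forwards_frames=False, learns_addresses=False, receives_bpdus=True),
    "listening":  Behavior(forwards_frames=False, learns_addresses=False, receives_bpdus=True),
    "learning":   Behavior(forwards_frames=False, learns_addresses=True,  receives_bpdus=True),
    "forwarding": Behavior(forwards_frames=True,  learns_addresses=True,  receives_bpdus=True),
    "disabled":   Behavior(forwards_frames=False, learns_addresses=False, receives_bpdus=False),
}


def states_where(attribute: str) -> List[str]:
    """List the states for which the named Behavior attribute is True."""
    return [name for name, b in STATE_BEHAVIOR.items() if getattr(b, attribute)]


if __name__ == "__main__":
    for attribute in Behavior._fields:
        print(attribute, "->", states_where(attribute))
```

Seen this way, blocking and listening behave identically toward data frames; they differ only in where the interface is headed next.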

Setting the root switch

To configure a switch to become the root for the specified VLAN, use the spanning-tree vlan vlan-id root global configuration command to modify the switch priority from the default value (32768) to a significantly lower value. When you enter this command, the switch checks the switch priority of the root switches for each VLAN. Because of the extended system ID support, the switch sets its own priority for the specified VLAN to 24576 if this value will cause this switch to become the root for the specified VLAN.

If any root switch for the specified VLAN has a switch priority lower than 24576, the switch sets its own priority for the specified VLAN to 4096 less than the lowest switch priority. (4096 is the value of the least-significant bit of a 4-bit switch priority value, as shown in the table.)
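The priority selection described above reduces to a short calculation. The sketch below is an approximation under stated assumptions: the function name is hypothetical, a current root priority of exactly 24576 is treated like the higher case (the text leaves that tie, which would be decided by MAC address, unspecified), and raising an error when no lower value exists is a choice made for this sketch.

```python
ROOT_TARGET_PRIORITY = 24576   # value the switch tries first
PRIORITY_STEP = 4096           # least-significant bit of the 4-bit priority field


def priority_for_root(lowest_existing_priority: int) -> int:
    """Priority a switch would take to become root for a VLAN.

    lowest_existing_priority: the lowest switch priority currently
    present in the VLAN (the current root's priority).
    """
    if lowest_existing_priority >= ROOT_TARGET_PRIORITY:
        return ROOT_TARGET_PRIORITY
    candidate = lowest_existing_priority - PRIORITY_STEP
    if candidate < 0:
        # Sketch-only behavior: there is no priority below zero to move to.
        raise ValueError("no lower priority value is available")
    return candidate


if __name__ == "__main__":
    for current in (32768, 20480, 4096):
        print("current root priority", current, "-> new priority", priority_for_root(current))
```

Stepping down in units of 4096 keeps the result on a value the 4-bit priority field can represent.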

IEEE 802.1w RSTP Overview

RSTP significantly reduces the time to reconfigure the active topology of the network when changes occur to the physical topology or its configuration parameters. RSTP selects one switch as the root of a spanning tree-connected active topology and assigns port roles to individual ports of the switch, depending on whether that port is part of the active topology.

RSTP provides rapid connectivity following the failure of a switch, switch port, or a LAN. A new root port and the designated port on the other side of the bridge transition to forwarding using an explicit handshake between them. RSTP allows switch port configuration so that the ports can transition to forwarding directly when the switch reinitializes.

RSTP as specified in 802.1w supersedes STP specified in 802.1D, but remains compatible with STP.

RSTP provides backward compatibility with 802.1D bridges as follows:

• RSTP selectively sends 802.1D-configured BPDUs and topology change notification (TCN) BPDUs on a per-port basis.
• When a port initializes, the migration-delay timer starts and RSTP BPDUs are transmitted. While the migration-delay timer is active, the bridge processes all BPDUs received on that port.
• If the bridge receives an 802.1D BPDU after a port’s migration-delay timer expires, the bridge assumes it is connected to an 802.1D bridge and starts using only 802.1D BPDUs.
• When RSTP uses 802.1D BPDUs on a port and receives an RSTP BPDU after the migration-delay expires, RSTP restarts the migration-delay timer and begins using RSTP BPDUs on that port.

In a switched domain, there can be only one forwarding path toward a single reference point; this is the root bridge. The RSTP spanning-tree algorithm (STA) elects a root bridge in exactly the same way as 802.1D elects a root.

However, there are critical differences that make RSTP the preferred protocol for preventing Layer 2 loops in a switched network environment. Many of the differences stem from the Cisco proprietary enhancements. The Cisco-based RSTP enhancements have these characteristics:

* They are integrated into the protocol at a low level.
* They are transparent.
* They require no additional configuration.
* They generally perform better than the Cisco-proprietary 802.1D enhancements.
* BPDU carries information about port roles and is sent to neighbor switches only.

Because the RSTP and the Cisco-proprietary enhancements are functionally similar, features such as UplinkFast and BackboneFast are not compatible with RSTP.

Note RSTP is available as a standalone protocol in Rapid-Per-VLAN-Spanning Tree (Rapid-PVST) mode. In this mode, the switch runs an RSTP instance on each VLAN, which follows the usual PVST+ approach.

Rapid-PVST

Rapid-PVST uses the existing configuration for PVST+; however, Rapid-PVST uses RSTP to provide faster convergence. Independent VLANs run their own RSTP instance.

Dynamic entries are flushed immediately on a per-port basis upon receiving a topology change.

UplinkFast and BackboneFast configurations are ignored in Rapid-PVST mode; both features are included in RSTP.

RSTP Port States:

RSTP provides rapid convergence following the failure or re-establishment of a switch, switch port, or link. An RSTP topology change will cause a transition in the appropriate switch ports to the forwarding state through either explicit handshakes or a proposal and agreement process and synchronization.

With RSTP, the role of a port is separated from the state of a port. For example, a designated port could be in the discarding state temporarily, even though its final state is to be forwarding.

The RSTP port states correspond to the three basic operations of a switch port: discarding, learning, and forwarding.

The port states have these characteristics:

* Discarding – This state is seen in both a stable active topology and during topology synchronization and changes. The discarding state prevents the forwarding of data frames, thus "breaking" the continuity of a Layer 2 loop.
* Learning – This state is seen in both a stable active topology and during topology synchronization and changes. The learning state accepts data frames to populate the MAC table in an effort to limit flooding of unknown unicast frames.
* Forwarding – This state is seen only in stable active topologies. The forwarding switch ports determine the topology. Following a topology change, or during synchronization, the forwarding of data frames occurs only after a proposal and agreement process.

In all port states, a port will accept and process BPDU frames.

RSTP Port Roles:

The port role defines the ultimate purpose of a switch port and how it handles data frames. Port roles and port states are able to transition independently of each other. RSTP uses these definitions for port roles:

* Root port – This is the switch port on every non-root bridge that is the chosen path to the root bridge. There can only be one root port on every switch. The root port assumes the forwarding state in a stable active topology.
* Designated port – Each segment has one switch port that is the designated port for that segment. In a stable, active topology, the switch with the designated port will receive frames on the segment that are destined for the root bridge. There can only be one designated port per segment. The designated port assumes the forwarding state. All switches connected to a given segment listen to all BPDUs and determine the switch that will be the designated switch for a particular segment.
* Alternate port – This is a switch port that offers an alternate path toward the root bridge. The alternate port assumes a discarding state in a stable, active topology. An alternate port will be present on nondesignated switches and will make a transition to a designated port if the current designated path fails.
* Backup port – This is an additional switch port on the designated switch with a redundant link to the segment for which the switch is designated. A backup port has a higher port ID than the designated port on the designated switch. The backup port assumes the discarding state in a stable active topology.
* Disabled port – This is a port that has no role within the operation of spanning tree.

Establishing the additional port roles allows RSTP to define a standby switch port before a failure or topology change. The alternate port moves to the forwarding state if there is a failure on the designated port for the segment.

Edge port:

An RSTP edge port is a switch port that is never intended to be connected to another switch device. It immediately transitions to the forwarding state when enabled.

The edge port concept is well known to Cisco spanning tree users because it corresponds to the PortFast feature. Ports directly connected to end stations, which expect no switch device to be attached, transition immediately to the STP forwarding state, skipping the time-consuming listening and learning stages. Neither edge ports nor PortFast-enabled ports generate topology changes when the port transitions to a disabled or enabled status.

Unlike PortFast, an edge port that receives a BPDU immediately loses its edge port status and becomes a normal spanning tree port. A switch whose edge port receives a BPDU generates a Topology Change Notification (TCN).

Cisco’s RSTP implementation maintains the PortFast keyword for edge port configuration, thus making an overall network transition to RSTP more seamless. Configuring an edge port where the port will be attached to another switch can have negative implications for RSTP when it is in the "sync" state.

RSTP Link types:

Link type provides a categorization for each port participating in RSTP. The link type can predetermine the active role that the port plays as it stands by for immediate transition to a forwarding state, if certain parameters are met. These parameters are different for edge ports and non-edge ports. Non-edge ports are categorized into two link types. Link type is automatically determined but can be overwritten with an explicit port configuration.

Edge ports, the equivalent of PortFast enabled ports, and Point-to-Point Links are candidates for rapid transition to a forwarding state. Before the link type parameter can be considered for the purpose of expedient port transition, RSTP must determine the port role.

Root ports do not use the link type parameter. Root ports are able to make a rapid transition to the forwarding state as soon as the port is in "sync."

Alternate and backup ports do not use the link type parameter in most cases.

Designated ports are those that make the most use of the link type parameter. Rapid transition to the forwarding state for the designated port occurs only if the link type parameter indicates a point-to-point link.

RSTP BPDUs:

RSTP (802.1w) uses type 2, version 2 BPDUs, so an RSTP bridge can communicate using 802.1D on any shared link or with any switch running 802.1D. RSTP sends BPDUs and populates the flag byte in a slightly different manner than in 802.1D:

* An RSTP bridge sends a BPDU with its current information every hello time period (2 seconds by default), even if it does not receive any BPDUs from the root bridge.
* Protocol information can be immediately aged on a port if hellos are not received for three consecutive hello times or if the max_age timer expires (when 3 BPDUs are missed the neighbor is declared down).
* Because BPDUs are now used as a "keepalive" mechanism, three consecutively missed BPDUs indicate lost connectivity between a bridge and its neighboring root or designated bridge. This fast aging of the information allows quick failure detection.

RSTP uses the flag byte of version 2 BPDU as shown in the figure:

* Bits 0 and 7 are used for topology change notification and acknowledgement as they are in 802.1D.
* Bits 1 and 6 are used for the Proposal Agreement process.
* Bits 2-5 encode the role and state of the port originating the BPDU.
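As an illustration of the flag byte layout, the following Python sketch decodes a version 2 BPDU flag byte into its fields. The split of bits 2-5 into a two-bit port role (bits 2-3) followed by learning (bit 4) and forwarding (bit 5) follows the 802.1w encoding and goes slightly beyond what the text above states; the class and function names are hypothetical.

```python
from dataclasses import dataclass

# Two-bit port role values carried in bits 2-3 of the flag byte.
PORT_ROLES = {0: "unknown", 1: "alternate/backup", 2: "root", 3: "designated"}


@dataclass(frozen=True)
class RstpFlags:
    topology_change: bool       # bit 0
    proposal: bool              # bit 1
    port_role: str              # bits 2-3
    learning: bool              # bit 4
    forwarding: bool            # bit 5
    agreement: bool             # bit 6
    topology_change_ack: bool   # bit 7


def decode_rstp_flags(flags: int) -> RstpFlags:
    """Decode the one-byte flags field of an RSTP BPDU."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("flags must fit in one byte")
    return RstpFlags(
        topology_change=bool(flags & 0x01),
        proposal=bool(flags & 0x02),
        port_role=PORT_ROLES[(flags >> 2) & 0x03],
        learning=bool(flags & 0x10),
        forwarding=bool(flags & 0x20),
        agreement=bool(flags & 0x40),
        topology_change_ack=bool(flags & 0x80),
    )
```

A flag byte of 0x0E, for instance, decodes as a designated port that is neither learning nor forwarding and has the proposal bit set, which is the situation at the start of the proposal and agreement exchange.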

RSTP Proposal and Agreement process:

In 802.1D, when a port has been selected by spanning tree to become a designated port, it must wait twice the Forward Delay before transitioning the port to a forwarding state. RSTP significantly speeds up the recalculation process after a topology change occurs in the network as it converges on a link-by-link basis and does not rely on timers expiring before ports can transition. Rapid transition to forwarding state can only be achieved on edge ports and on point-to-point links. In RSTP, this condition corresponds to a port with a designated role that is in a blocking state. The figure illustrates how rapid transition is achieved step-by-step.

1. A new link is created between the root and Switch A and both ports are in designated blocking state until they receive a BPDU from their counterpart. When a designated port is in a discarding or learning state (and only in this case), it sets the proposal bit on the BPDUs it sends out. This is what happens for port p0 of the root bridge.
2. Switch A sees the proposal BPDU with a superior Root ID. It blocks all non-edge designated ports other than the one over which the proposal and agreement process is occurring. This operation is called "sync" and prevents switches below A from causing a loop during the proposal agreement process. Edge ports need not be blocked and remain unchanged during sync.
3. Bridge A explicitly sends an Agreement, which allows the root bridge to put its port p0 in the forwarding state. Port 1 becomes the root port for A.

Downstream RSTP Proposal Process

Once Switch A and the root bridge are synchronized, the proposal or agreement process continues on Switch A out of all of its downstream, designated, non-edge ports as shown in the figure.

1. Switch B on P5 will see that Switch A is discarding and will also transition to the designated discarding state. Switch A then sends its proposal BPDU down to B with the Root ID of the root bridge.
2. Switch B sees a proposal with the superior BPDU from A and blocks all non-edge designated ports other than the one over which the proposal and agreement process is occurring.
3. Switch B sends a BPDU with the Agreement bit set and Switch A P3 transitions to forwarding state. The synchronization process continues with switches downstream from B.
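The "sync" step in the walk-through above can be modelled with a small Python sketch. It is a simplified illustration under stated assumptions, not the 802.1w state machine: the Port class, the handle_proposal function and the port names are hypothetical, and timers, BPDU comparison and the agreement message itself are left out.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Port:
    name: str
    role: str           # "root", "designated", "alternate", ...
    state: str          # "discarding", "learning" or "forwarding"
    edge: bool = False  # edge ports are left untouched during sync


def handle_proposal(ports: Dict[str, Port], rx_port: str) -> List[str]:
    """Apply sync after a superior proposal arrives on rx_port.

    Returns the names of the ports that now have to send their own
    proposal downstream (the blocked non-edge designated ports).
    """
    must_propose = []
    for port in ports.values():
        if port.name == rx_port:
            continue
        # Block every non-edge designated port so no loop can form.
        if port.role == "designated" and not port.edge:
            port.state = "discarding"
            must_propose.append(port.name)
    # With everything else in sync, the receiving port can become the
    # root port and forward; at this point the agreement would be sent.
    ports[rx_port].role = "root"
    ports[rx_port].state = "forwarding"
    return must_propose
```

Called for Switch A with the proposal arriving on its uplink, the sketch blocks the downstream designated port, leaves any edge port forwarding, and reports the downstream port as the next one that must run its own proposal and agreement exchange.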

TCN process:

In 802.1D, any port state change generates a TCN. When an 802.1D bridge detects a topology change (TC), it sends TCNs toward the root bridge. The root bridge sets the TC flag on the outbound BPDUs that are relayed to switches down from the root. When a bridge receives a BPDU with the TC flag bit set, the bridge reduces its bridge-table aging time to forward delay seconds. This ensures a relatively quick flushing of the MAC address table.

In RSTP, only non-edge ports moving to the forwarding state cause a topology change. Loss of connectivity is not considered to be a topology change, and, under these conditions, a port moving to the blocking state does not generate a TC BPDU.

When an RSTP bridge detects a TC, it performs these actions.

The TCN is flooded across the entire network one switch at a time from the switch that is the source of the change rather than from the root bridge. The topology change propagation is now a one-step process. There is no need for each switch port to wait for the root bridge to be notified and then maintain the TC state for the value of the max_age plus forward delay seconds.

If the port consistently keeps receiving BPDUs that do not correspond to the current operating mode for two periods of hello time, the port switches to the mode indicated by the BPDUs.

MST

The main purpose of MST is to reduce the total number of spanning tree instances to match the physical topology of the network and thus reduce the CPU loading of a switch. The instances of spanning tree are reduced to the number of links (i.e. active paths) that are available. If the example in the diagram were implemented via PVST+, there could potentially be 4094 instances of spanning tree, each with their respective BPDU conversations, root bridge election and path selections.

In this example, the goal is to achieve load distribution with VLANs 1-500 using one path and VLANs 501-1000 using the other path, with only two instances of spanning tree. The two ranges of VLANs are mapped to two MST instances (MSTI) respectively. Rather than maintaining 1000 spanning trees, each switch needs to maintain only two. Implemented in this fashion, MST converges faster than Per VLAN Spanning Tree+ (PVST+) and is backward compatible with 802.1D STP, 802.1w (RSTP), and the Cisco PVST+ architecture. Implementation of MST is not required if the Enterprise Composite Model is being employed, as the number of active VLAN instances, and hence the STP instances, would be small in number and very stable due to the design.

MST allows you to build multiple spanning trees over trunks by grouping and associating VLANs to spanning tree instances. Each instance can have a topology independent of other spanning tree instances. This architecture provides multiple active forwarding paths for data traffic and enables load balancing. Network fault tolerance is improved over CST because a failure in one instance (forwarding path) does not necessarily affect other instances. This VLAN to MST grouping must be consistent across all bridges within an MST region.

In large networks, you can more easily administer the network and use redundant paths by locating different VLAN and spanning tree assignments in different parts of the network. A spanning tree instance can exist only on bridges that have compatible VLAN instance assignments.

You must configure a set of bridges with the same MST configuration information, which allows them to participate in a specific set of spanning tree instances. Interconnected bridges that have the same MST configuration are referred to as an "MST region." Bridges with different MST configurations or legacy bridges running 802.1D are considered separate MST regions. A region can have one or multiple members with the same MST configuration; each member must be capable of processing RSTP bridge protocol data units (BPDUs). There is no limit to the number of MST regions in a network, but each region can support up to 65 spanning-tree instances.

In a Cisco PVST+ environment, the spanning tree parameters are tuned so that half of the VLANs are forwarding on each uplink trunk. This is easily achieved by electing bridge D1 to be the root for VLANs 501–1000, and bridge D2 to be the root for VLANs 1–500. In this configuration, the following is true:

* Optimum load balancing is achieved.
* One spanning tree instance for each VLAN is maintained, which means 1000 instances for only two different logical topologies. This consumes resources for all the switches in the network (in addition to the bandwidth used by each instance sending its own BPDUs).

MST (IEEE 802.1s) combines the best aspects from both PVST+ and 802.1Q. The idea is that several VLANs can be mapped to a reduced number of spanning-tree instances because most networks do not need more than a few logical topologies.

MST differs from the other spanning tree implementations in that it combines some, but not necessarily all, VLANs into logical spanning tree instances. This raises the problem of determining which VLAN is to be associated with which instance. More precisely, BPDUs must be tagged so that receiving devices can identify the instances and the VLANs to which they apply.

The issue is irrelevant in the case of the 802.1Q standard, where all instances are mapped to a unique and common instance, the Common Spanning Tree (CST). In the PVST+ implementation, different VLANs carry the BPDUs for their respective instance (one BPDU per VLAN) based on the VLAN tagging information.

To provide this logical assignment of VLANs to spanning trees, each switch running MST in the network has a single MST configuration that consists of three attributes:

* An alphanumeric configuration name (32 bytes)
* A configuration revision number (two bytes)
* A 4096-element table that associates each of the potential 4096 VLANs supported on the chassis to a given instance

To be part of a common MST region, a group of switches must share the same configuration attributes. It is up to the network administrator to properly propagate the configuration throughout the region.

NOTE:

If two switches differ on one or more configuration attributes, they are part of different regions.

To ensure a consistent VLAN-to-instance mapping, it is necessary for the protocol to be able to exactly identify the boundaries of the regions. For that purpose, the characteristics of the region are included in BPDUs. The exact VLANs-to-instance mapping is not propagated in the BPDU, because the switches only need to know whether they are in the same region as a neighbor.

Therefore, only a digest of the VLANs-to-instance mapping table is sent, along with the revision number and the name. Once a switch receives a BPDU, it extracts the digest (a numerical value derived from the VLAN-to-instance mapping table through a mathematical function) and compares it with its own computed digest. If the digests differ, the mapping must be different, so the port on which the BPDU was received is at the boundary of a region.
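The region check described above can be sketched in Python as follows. This is a minimal illustration, not the protocol's exact encoding: the MstConfig class, the map_range helper and the region name "campus" are hypothetical, and a plain MD5 hash of the table stands in for the digest function (the standard defines an HMAC-MD5 signature with a fixed key). VLANs that are not explicitly mapped are assumed to belong to instance 0.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MstConfig:
    name: str                 # alphanumeric configuration name
    revision: int             # configuration revision number
    vlan_to_instance: Dict[int, int] = field(default_factory=dict)

    def table(self) -> bytes:
        """4096-element table, two bytes per VLAN, unmapped VLANs -> 0."""
        return b"".join(
            self.vlan_to_instance.get(vlan, 0).to_bytes(2, "big")
            for vlan in range(4096)
        )

    def digest(self) -> bytes:
        # Stand-in digest: plain MD5 instead of the standard's HMAC-MD5.
        return hashlib.md5(self.table()).digest()


def map_range(cfg: MstConfig, first: int, last: int, instance: int) -> None:
    """Map the inclusive VLAN range first..last to one MST instance."""
    for vlan in range(first, last + 1):
        cfg.vlan_to_instance[vlan] = instance


def same_region(a: MstConfig, b: MstConfig) -> bool:
    """Compare only what a BPDU carries: name, revision and digest."""
    return (a.name, a.revision, a.digest()) == (b.name, b.revision, b.digest())
```

Two switches that both map VLANs 1-500 to instance 1 and VLANs 501-1000 to instance 2 under the same name and revision compare as one region; moving a single VLAN to another instance changes the digest and places the port at a region boundary.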

In generic terms, a port is at the boundary of a region if the designated bridge on its segment is in a different region or if it receives legacy 802.1D BPDUs.

IST, CIST, and CST

Unlike PVST+ and rapid PVST+ in which all the spanning-tree instances are independent, the MSTP establishes and maintains two types of spanning trees:

• An internal spanning tree (IST), which is the spanning tree that runs in an MST region.

Within each MST region, the MSTP maintains multiple spanning-tree instances. Instance 0 is a special instance for a region, known as the internal spanning tree (IST). All other MST instances are numbered from 1 to 4094.

The IST is the only spanning-tree instance that sends and receives BPDUs. All of the other spanning-tree instance information is contained in M-records, which are encapsulated within MSTP BPDUs. Because the MSTP BPDU carries information for all instances, the number of BPDUs that need to be processed to support multiple spanning-tree instances is significantly reduced.

All MST instances within the same region share the same protocol timers, but each MST instance has its own topology parameters, such as root switch ID, root path cost, and so forth. By default, all VLANs are assigned to the IST.

An MST instance is local to the region; for example, MST instance 1 in region A is independent of MST instance 1 in region B, even if regions A and B are interconnected.

• A common and internal spanning tree (CIST), which is a collection of the ISTs in each MST region, and the common spanning tree (CST) that interconnects the MST regions and single spanning trees.

The spanning tree computed in a region appears as a subtree in the CST that encompasses the entire switched domain. The CIST is formed by the spanning-tree algorithm running among switches that support the IEEE 802.1w, IEEE 802.1s, and IEEE 802.1D standards. The CIST inside an MST region is the same as the CST outside a region.

Extended System ID

As with PVST, the 12-bit Extended System ID field is used in MST. In MST this field carries the MST instance number.

The MAC address is a single address representing a single switch. When the Extended System ID carrying the VLAN ID is appended to the switch MAC address, each VLAN on the switch is represented by a unique Bridge Identifier.
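As a rough sketch of how these fields combine, the Python function below builds a 64-bit bridge identifier from a 4-bit priority (expressed as a multiple of 4096), the 12-bit Extended System ID and the 48-bit MAC address. The function name and the sample MAC address are hypothetical; the bit layout is the commonly documented one and is stated here as an assumption rather than taken from the text.

```python
def bridge_id(priority: int, ext_sys_id: int, mac: str) -> int:
    """Build a 64-bit bridge ID: priority | extended system ID | MAC."""
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 between 0 and 61440")
    if not 0 <= ext_sys_id <= 4095:
        raise ValueError("extended system ID must fit in 12 bits")
    digits = mac.replace(":", "")
    if len(digits) != 12:
        raise ValueError("MAC address must contain 12 hex digits")
    mac_value = int(digits, 16)
    # The priority fills the top 4 bits of the 16-bit field and the
    # extended system ID (VLAN ID or MST instance) fills the low 12 bits.
    return ((priority + ext_sys_id) << 48) | mac_value
```

Because the priority occupies the most significant bits, a switch with priority 24576 always compares lower (better) than one with 32768 for the same VLAN or instance, regardless of MAC address. This is also why priorities can only be set in steps of 4096.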

Port Role Naming Change

The boundary role is no longer in the final MST standard, but this boundary concept is maintained in Cisco’s implementation. However, an MST instance port at a boundary of the region might not follow the state of the corresponding CIST port. Two cases exist now:

• The boundary port is the root port of the CIST regional root: When the CIST instance port is proposed and is in sync, it can send back an agreement and move to the forwarding state only after all the corresponding MSTI ports are in sync (and thus forwarding). The MSTI ports now have a special master role.

• The boundary port is not the root port of the CIST regional root: The MSTI ports follow the state and role of the CIST port. The standard provides less information, and it might be difficult to understand why an MSTI port can be alternately blocking when it receives no BPDUs (MRecords). In this case, although the boundary role no longer exists, the show commands identify a port as boundary in the type column of the output.

MST and 802.1Q interaction:

One of the issues that arise from MST design is interoperability with the CST implementation in 802.1Q. According to the IEEE 802.1s specification, an MST switch must be able to handle at least one Internal Spanning Tree (IST).

The IST (instance 0) runs on all bridges within an MST region. An important characteristic of the IST instance is that it provides interaction at the boundary of the MST region with other MST regions and, more importantly, it is responsible for providing compatibility between the MST regions and the spanning tree of 802.1D, 802.1Q (CST) and PVST+ networks connected to the region.

The IST instance receives and sends BPDUs to the CST for compatibility with 802.1Q. The IST is capable of representing the entire MST region as a CST virtual bridge to switched networks outside the MST region.

* An MST region appears as a single virtual bridge to the adjacent CST and MST regions. The MST region uses 802.1w port roles and operation.
* MST switches run IST, which augments CST information with internal information about the MST region.
* IST connects all the MST switches in the region and any CST switched domain.
* MST establishes and maintains additional spanning trees within each MST region. These spanning trees are termed MST instances (MSTIs). The IST is numbered 0, and the MSTIs are numbered 1, 2, 3, and so on, up to 15. Any MSTI is local to the MST region and is independent of MSTIs in another region, even if the MST regions are interconnected.
* The M-Record is a sub-field, within the BPDU of MSTIs, that contains enough information (root bridge and sender bridge priority parameters) for the corresponding instance to calculate the final topology. It does not contain any timer-related parameters (such as hello time, forward delay, and max_age) that are typically found in a regular IEEE 802.1D BPDU, as these timers are derived from the IST BPDU timers. It is important to note that within an MST region, all spanning tree instances use the same parameters as the IST.
* MST instances combine with the IST at the boundary of MST regions to become the CST as follows:
  o M-records are always encapsulated within MST BPDUs. The original spanning trees are called "M-trees," which are active only within the MST region. M-trees merge with the IST at the boundary of the MST region and form the CST.
* MST supports some of the PVST extensions as follows:
  o UplinkFast and BackboneFast are not available in MST mode; they are part of RSTP.
  o PortFast is supported.
  o BPDU filter and BPDU guard are supported in MST mode.
  o Loop guard and root guard are supported in MST.
  o For PVLANs, you must map a secondary VLAN to the same instance as the primary.

Restarting Protocol Migration

A switch running both MSTP and RSTP supports a built-in protocol migration mechanism that enables the switch to interoperate with legacy 802.1D switches. If this switch receives a legacy 802.1D configuration BPDU (a BPDU with the protocol version set to 0), it sends only 802.1D BPDUs on that port. An MSTP switch can also detect that a port is at the boundary of a region when it receives a legacy BPDU, an MST BPDU (version 3) associated with a different region, or an RST BPDU (version 2).

However, the switch does not automatically revert to the MSTP mode if it no longer receives 802.1D BPDUs because it cannot determine whether the legacy switch has been removed from the link unless the legacy switch is the designated switch. A switch also might continue to assign a boundary role to a port when the switch to which it is connected has joined the region.

To restart the protocol migration process (force the renegotiation with neighboring switches) on the entire switch, you can use the clear spanning-tree detected-protocols privileged EXEC command. Use the clear spanning-tree detected-protocols interface interface-id privileged EXEC command to restart the protocol migration process on a specific interface.

This example shows how to restart protocol migration:

Switch# clear spanning-tree detected-protocols interface fastEthernet 4/4

5. Multilayer Switching Implementation

Layer 2 switching

Layer 2 switching forwards frames based on information in the Layer 2 Frame header as shown in the figure. Layer 2 switching occurs in hardware thereby decreasing latency introduced by software switching typically found in original bridge platforms. Switch hardware utilizes specialized chips, called application-specific integrated circuits (ASICs), to handle frame manipulation and forwarding. Because the majority of frame manipulation and forwarding decisions occur in hardware, Layer 2 switching can provide wire-speed performance in ideal circumstances.

A Layer 2 switch builds a forwarding table as it records the source MAC address and the inbound port number of received frames. Because the switch simply moves frames from one port to another, based on the information in the forwarding table, operation is said to be transparent; the sending end station is unaware of the switch path traversed by the frame.

Additionally, the frame can be checked against access control list (ACL) and quality of service (QoS) criteria that originate in Layer 3 software, but are stored in tables in switch hardware, to facilitate wire-speed lookups. This process provides frame forwarding at wire-speed while still qualifying the forwarding based on upper layer criteria.

What are Layer 2 Switching Tables?

Routing, switching, ACL and QoS tables are stored in a high-speed table memory so that forwarding decisions and restrictions can be made in high-speed hardware. Cisco Catalysts have two primary table architectures:

* CAM Table – content addressable memory table. This is the primary table used to make Layer 2 forwarding decisions. The table is built by recording the source address and inbound port of all frames. When a frame arrives at the switch with a destination MAC address of an entry in the CAM table, the frame is forwarded out only the port associated with that specific MAC address.
* TCAM Table – ternary CAM table. This table stores ACL, QoS and other information generally associated with upper layer processing.

Table lookups are done with efficient search algorithms. A "key" is created to compare the frame to the table content. For example, the destination MAC address and VLAN ID (VID) of a frame would constitute the key for Layer 2 table lookup. This key is fed into a hashing algorithm, which produces a pointer into the table. The system uses the pointer to access a smaller specific area of the table without requiring searching the entire table.

In a Layer 2 table, all bits of all information are significant for frame forwarding (for example, VLANs, destination MAC addresses, and destination protocol types). However, in more complicated tables associated with upper layer forwarding criteria, some bits of information may be inconsequential to analyze. For example, an ACL may require a match on the first 24 bits of an IP address but the last 8 bits are insignificant information.
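The ACL example above, where only the first 24 bits of an address matter, can be sketched as a ternary match in Python. Each entry has a pattern and a mask, and a mask bit of 0 marks a "don't care" position. The function names and the 192.168.1.0 addresses are illustrative assumptions; real TCAM hardware compares all entries in parallel rather than calling a function.

```python
import ipaddress


def to_int(address: str) -> int:
    """Convert a dotted-decimal IPv4 address to a 32-bit integer."""
    return int(ipaddress.IPv4Address(address))


def ternary_match(value: int, pattern: int, mask: int) -> bool:
    """Return True when value equals pattern on every bit set in mask.

    Bits where mask is 0 are "don't care" and never affect the result.
    """
    return (value & mask) == (pattern & mask)


# Entry that matches on the first 24 bits and ignores the last 8.
PATTERN = to_int("192.168.1.0")
MASK = to_int("255.255.255.0")
```

With this entry, 192.168.1.77 matches because its first 24 bits equal the pattern, while 192.168.2.77 does not.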

Identifying the Layer 2 Switch Forwarding Process

Layer 2 forwarding in hardware is based on the destination MAC address. The Layer 2 switch learns the address based on the source MAC address. The MAC address table lists MAC and VLAN pairs with associated interfaces.
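A minimal Python model of this learn-and-forward behaviour is shown below. It is a sketch under simplifying assumptions: the Layer2Switch class is hypothetical, every port is treated as a member of every VLAN, entries never age out, and a Python dictionary (itself a hash table) stands in for the hashed CAM lookup keyed on VLAN and MAC address.

```python
from typing import Dict, Iterable, List, Tuple


class Layer2Switch:
    def __init__(self, ports: Iterable[int]) -> None:
        self.ports = set(ports)
        # MAC address table: (VLAN, MAC) pair -> interface
        self.mac_table: Dict[Tuple[int, str], int] = {}

    def receive(self, in_port: int, vlan: int, src: str, dst: str) -> List[int]:
        """Process one frame and return the list of output ports."""
        # Learn: record the source MAC against the inbound port.
        self.mac_table[(vlan, src)] = in_port

        # Forward: look up the destination MAC in the same VLAN.
        out_port = self.mac_table.get((vlan, dst))
        if out_port is None:
            # Unknown unicast: flood out every port except the inbound one.
            return sorted(self.ports - {in_port})
        if out_port == in_port:
            # Destination sits on the inbound port: nothing to forward.
            return []
        return [out_port]
```

The first frame toward an unknown destination is flooded; once the destination has sent a frame of its own, later frames are forwarded out only the learned port.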

Multilayer switching

Multilayer switching includes the ability to switch data based on information at multiple layers. Multilayer switching also refers to a class of high-performance routers that provide Layer 3 services and simultaneously forward packets at wire-speed through switching hardware. A Layer 3 switch performs packet switching, route processing and intelligent network services.

Layer 3 switch processing forwards packets at wire-speed by using ASIC hardware instead of microprocessor-based engines as might be found on a traditional router. Specific Layer 3 components such as routing tables or ACLs are cached into hardware. The Layer 3 packet headers of data traffic will be analyzed and packets forwarded at line speeds based upon that cached information.

Layer 3 switching can occur at two different locations on the switch:

* Centralized switching – Switching decisions are made on the route processor by a central forwarding table, typically controlled by an ASIC.
* Distributed switching – Switching decisions can be made on a port or line card level rather than on a central route processor. Cached tables are distributed and synchronized to various hardware components so processing can be distributed throughout the switch chassis.

Layer 3 switching takes place using one of these methods, which are platform dependent:

* Route caching – Also known as flow-based or demand-based switching, a Layer 3 route cache is built in hardware as the switch sees traffic flows into the switch.
* Topology-based switching – Information from the routing table is used to populate the route cache regardless of traffic flow. The populated route cache is called the Forwarding Information Base. Cisco Express Forwarding is the facility that builds the FIB.
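The difference between the two methods can be illustrated with a short Python sketch. The Fib class is built in full from the routing information before any traffic arrives and answers with a longest-prefix match, while the RouteCache class only gains an entry when the first packet for a destination has been handled by the route processor. Both classes, the route_processor callback and the sample prefixes are hypothetical simplifications and say nothing about how the hardware tables are organized.

```python
import ipaddress
from typing import Callable, Dict, Optional


class Fib:
    """Topology-based table: fully populated up front from the routes."""

    def __init__(self, routes: Dict[str, str]) -> None:
        # Sort longest prefix first so the first hit is the best match.
        self.entries = sorted(
            ((ipaddress.ip_network(prefix), hop) for prefix, hop in routes.items()),
            key=lambda entry: entry[0].prefixlen,
            reverse=True,
        )

    def lookup(self, destination: str) -> Optional[str]:
        address = ipaddress.ip_address(destination)
        for network, next_hop in self.entries:
            if address in network:
                return next_hop
        return None


class RouteCache:
    """Demand-based cache: an entry appears only after the first packet."""

    def __init__(self, route_processor: Callable[[str], Optional[str]]) -> None:
        self.route_processor = route_processor
        self.cache: Dict[str, Optional[str]] = {}
        self.misses = 0

    def lookup(self, destination: str) -> Optional[str]:
        if destination not in self.cache:
            # First packet of the flow: ask the route processor.
            self.misses += 1
            self.cache[destination] = self.route_processor(destination)
        return self.cache[destination]
```

In the route cache, only the first lookup for a destination reaches the route processor; the topology-based table never needs that first slow lookup because it is populated regardless of traffic flow.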

Traditional routers typically perform two main functions: route processing calculation and packet switching based on a routing table (Media Access Control [MAC] Address rewrite, redo checksum, Time To Live [TTL] decrement, and so forth). The major difference between a router and an L3 switch is that packet switching in a router is done in software by microprocessor-based engines, whereas packet switching in an L3 switch is done in hardware by specific Application Specific Integrated Circuits (ASICs).

MLS requires the following components:

MultiLayer Switching Engine (MLS-SE)—Responsible for packet switching and rewrite functions in custom ASICs, and capable of identifying L3 flows.

MultiLayer Switching Route Processor (MLS-RP)—Informs the MLS-SE of MLS configuration, and runs Routing Protocols (RPs) for route calculation.

MultiLayer Switching Protocol (MLSP)—Multicast Protocol messages sent by the MLS-RP to inform the MLS-SE of the MAC address used by MLS-RP, routing and access list changes, and so forth. The MLS-SE uses that information to program the custom ASICs.

Key MLS features:

Feature – Description

Ease of Use – Is autoconfigurable and autonomously sets up its Layer 3 flow cache. Its plug-and-play design eliminates the need for you to learn new IP switching technologies.

Transparency – Requires no end-system changes and no renumbering of subnets. It works with DHCP and requires no new routing protocols.

Standards Based – Uses IETF standard routing protocols such as OSPF and RIP for route determination. You can deploy MLS in a multivendor network.

Investment Protection – Provides a simple feature-card upgrade on the Catalyst 5000 series switches. You can use MLS with your existing chassis and modules. MLS also allows you to use either an integrated RSM or an external router for route processing and Cisco IOS services.

Fast Convergence – Allows you to respond to route failures and routing topology changes by performing hardware-assisted invalidation of flow entries.

Resilience – Provides the benefits of HSRP without additional configuration. This feature enables the switches to transparently switch over to the hot standby backup router when the primary router goes offline, eliminating a single point of failure in the network.

Access Lists – Allows you to set up access lists to filter or prevent traffic between members of different subnets. MLS enforces multiple security levels on every packet of the flow at wire speed. It allows you to configure and enforce access control rules on the RSM. Because MLS parses the packet up to the transport layer, it enables access lists to be validated. By providing multiple security levels, MLS enables you to set up rules and control traffic based on IP addresses as well as transport-layer application port numbers.

Accounting and Traffic Management – Allows you to see data flows as they are switched for troubleshooting, traffic management, and accounting purposes. MLS uses NDE to export the flow statistics. Data collection of flow statistics is maintained in hardware with no impact on switching performance. The records for expired and purged flows are grouped together and exported to applications such as NetSys for network planning, RMON2 traffic management and monitoring, and accounting applications.

Network Design Simplification – Enables you to speed up your network while retaining the existing subnet structure. It makes the number of Layer 3 hops irrelevant in campus design, enabling you to cope with increases in any-to-any traffic.

Media Speed Access to Server Farms – You do not have to centralize servers in multiple VLANs to get direct connections. By providing security on a per-flow basis, you can control access to the servers and filter traffic based on subnet numbers and transport-layer application ports without compromising Layer 3 switching performance.

Faster Interworkgroup Connectivity – Addresses the need for higher-performance interworkgroup connectivity by intranet and multimedia applications. By deploying MLS, you gain the benefits of both switching and routing on the same platform.

CEF

Cisco Layer 3 devices can use a variety of methods to switch packets from one port to another. The most basic method of switching packets between interfaces is called process switching. Process switching moves packets between interfaces, based on information in the routing table and the ARP cache, on a scheduled basis. In other words, as packets arrive, they will be moved into a queue to wait for further processing. When the scheduler runs, the outbound interface will be determined and the packet will be switched. Waiting for the scheduler introduces latency.

To speed the switching process, strategies exist to switch packets on demand as they arrive on an interface, and to cache the information necessary to make packet-forwarding decisions.

CEF uses these strategies to switch data packets to their destination quickly. It caches information generated by the Layer 3 routing engine. CEF caches routing information in one table (the Forwarding Information Base, or FIB), and caches Layer 2 next-hop addresses for all FIB entries in an adjacency table. Because CEF maintains multiple tables for forwarding information, parallel paths can exist, which enables CEF to load balance per packet.

CEF operates in one of two modes:

* Central CEF mode – The CEF FIB and adjacency tables reside on the route processor, and the route processor performs the express forwarding. Use this CEF mode when line cards are not available for CEF switching, or when you need features that are not compatible with distributed CEF.
* Distributed CEF (dCEF) mode – When dCEF is enabled, line cards maintain identical copies of the FIB and adjacency tables. The line cards can perform the express forwarding by themselves, relieving the main processor of involvement in the switching operation. dCEF uses an Inter-Process Communication (IPC) mechanism to ensure synchronization of FIBs and adjacency tables on the route processor and line cards.

CEF separates the control plane hardware from the data plane hardware and switching. ASICs in switches are used to separate the control plane and data plane, thereby achieving higher data throughput. The control plane is responsible for building the Forwarding Information Base or FIB table and adjacency tables in software. The data plane is responsible for forwarding IP unicast traffic using hardware.

When traffic cannot be processed in hardware, it must receive processing in software by the Layer 3 Engine, thereby not receiving the benefit of expedited hardware-based forwarding. There are a number of different packet types that may force the Layer 3 engine to process them. Some examples of IP exception packets are:

* IP packets that use IP header options (packets that use TCP header options are switched in hardware because they do not affect the forwarding decision)
* Packets that have an expiring IP TTL counter
* Packets that are forwarded to a tunnel interface
* Packets that arrive with nonsupported encapsulation types
* Packets that are routed to an interface with nonsupported encapsulation types
* Packets that exceed the maximum transmission unit (MTU) of an output interface and must be fragmented
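The exception conditions above amount to a simple classification step before hardware forwarding. The following is a hedged Python sketch of such a check. The `Packet` fields, the function name `software_path_reasons`, and the reason strings are hypothetical and only mirror the list in the text.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    ttl: int
    length: int                    # total packet length in bytes
    has_ip_options: bool = False
    has_tcp_options: bool = False  # does not affect the forwarding decision


def software_path_reasons(pkt, out_mtu, out_is_tunnel=False,
                          in_encap_supported=True, out_encap_supported=True):
    """Return the reasons a packet must go to the Layer 3 engine (empty = hardware)."""
    reasons = []
    if pkt.has_ip_options:
        reasons.append("ip-options")
    if pkt.ttl <= 1:
        reasons.append("ttl-expiring")
    if out_is_tunnel:
        reasons.append("tunnel-interface")
    if not in_encap_supported:
        reasons.append("unsupported-ingress-encapsulation")
    if not out_encap_supported:
        reasons.append("unsupported-egress-encapsulation")
    if pkt.length > out_mtu:
        reasons.append("needs-fragmentation")
    return reasons
```

An empty list means the packet can stay on the hardware path. Note that TCP options alone never add a reason, matching the parenthetical in the first bullet.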

CEF Based Tables

CEF-based tables are initially populated and used as follows:

* The FIB is derived from the IP routing table and is arranged for maximum lookup throughput.
* The adjacency table is derived from the Address Resolution Protocol (ARP) table, and it contains Layer 2 rewrite (MAC) information for the next hop.
* CEF IP destination prefixes are stored in the TCAM table from the most specific to the least specific entry.
* When the CEF TCAM table is full, a wildcard entry redirects to the Layer 3 engine.
* When the adjacency table is full, a CEF TCAM table entry points to the Layer 3 engine to redirect the adjacency.
* The FIB lookup is based on the Layer 3 destination address prefix (longest match).
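As a rough illustration of how these tables work together, the Python sketch below keeps FIB entries ordered from most specific to least specific, so the first hit is the longest match, and then resolves the next hop through an adjacency table. The class `Fib`, the function `cef_forward`, and all addresses are assumptions for illustration. Real hardware performs this lookup in TCAM rather than by iterating.

```python
import ipaddress


class Fib:
    """Prefix table kept sorted from most specific to least specific."""

    def __init__(self):
        self.entries = []  # list of (network, next_hop)

    def add(self, prefix, next_hop):
        self.entries.append((ipaddress.ip_network(prefix), next_hop))
        # Longest prefixes first, so the first match is the longest match.
        self.entries.sort(key=lambda e: e[0].prefixlen, reverse=True)

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        for network, next_hop in self.entries:
            if addr in network:
                return next_hop
        return None


def cef_forward(fib, adjacency, dst):
    """Return ("rewrite", mac) on a full hit, otherwise ("punt", next_hop)."""
    next_hop = fib.lookup(dst)
    if next_hop is None:
        return ("punt", None)       # no FIB entry: Layer 3 engine decides
    mac = adjacency.get(next_hop)
    if mac is None:
        return ("punt", next_hop)   # adjacency incomplete: ARP still needed
    return ("rewrite", mac)
```

For example, with `10.1.0.0/16` and `10.1.20.0/24` both installed, a packet for `10.1.20.5` matches the `/24` entry because it is checked first.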

FIB Table Updates

The FIB table is updated when the following occurs:

* An ARP entry for the destination next hop changes, ages out, or is removed.
* The routing table entry for a prefix changes.
* The routing table entry for the next hop changes.

These are the basic steps that occur to initially populate the adjacency table:

Step 1 The Layer 3 engine queries the switch for a physical MAC address.

Step 2 The switch selects a MAC address from the chassis MAC range and assigns it to the Layer 3 engine. This MAC address is assigned by the Layer 3 engine as a burned-in address for all VLANs and is used by the switch to initiate Layer 3 packet lookups.

Step 3 The switch installs wildcard CEF entries, which point to drop adjacencies (for handling CEF table lookup misses).

Step 4 The Layer 3 engine informs the switch of its interfaces participating in MLS (MAC address and associated VLAN). The switch creates the (MAC, VLAN) Layer 2 CAM entry for the Layer 3 engine.

Step 5 The Layer 3 engine informs the switch about features for interfaces participating in MLS.

Step 6 The Layer 3 engine informs the switch about all CEF entries related to its interfaces and connected networks. The switch populates the CEF entries and points them to Layer 3 engine redirect adjacencies.

Ternary Content Addressable Memory Table - TCAM

The Ternary Content Addressable Memory (TCAM) is a specialized piece of memory designed for rapid, hardware-based table lookups of Layer 3 and Layer 4 information. In the TCAM, a single lookup provides all Layer 2 and Layer 3 forwarding information for frames, including CAM and ACL information. The following platforms use TCAMs for Layer 3 switching: Catalyst 6500, 4500, 4000, and 3550.

TCAM matching is based on three values: 0, 1, or x (where x matches either 0 or 1, a "don't care" bit), hence the term ternary. The memory structure is broken into a series of patterns and masks. Masks are shared among a specific number of patterns and are used to wildcard some content fields.

These two access control entries (ACEs) are referenced in the figure as it shows how their values would be stored in the TCAM:

access-list 101 permit ip host 10.1.1.1 any
access-list 101 deny ip 10.1.1.0 0.0.0.255 any

The TCAM table entries in the figure consist of two types of regions:

Longest-match region – Each longest-match region consists of groups of Layer 3 address entries ("buckets") organized in decreasing order by mask length. All entries within a bucket share the same mask value and key size. The buckets can change their size dynamically by borrowing address entries from neighboring buckets. Although the size of the whole protocol region is fixed, you can reconfigure it. The reconfigured size of the protocol region is effective only after the next system reboot.

First-match region – The first-match region consists of ACL entries. Lookup stops after first match of the entry.
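To make the pattern-and-mask idea concrete, here is a small Python sketch of first-match lookup for the two ACEs in access list 101. It is only a software analogy: a real TCAM compares all entries in parallel, and the tuple layout and the names `ace`, `TCAM`, and `first_match` are assumptions for illustration. Only the source address is modeled, because both ACEs use `any` as the destination.

```python
import ipaddress


def ip_to_int(ip):
    return int(ipaddress.IPv4Address(ip))


def ace(addr, wildcard, action):
    """Build a (pattern, care_mask, action) entry from ACL address/wildcard form.

    A care-mask bit of 1 means the bit must match; 0 is the ternary "x".
    The ACL wildcard mask is the bitwise inverse of the care mask.
    """
    care = ~ip_to_int(wildcard) & 0xFFFFFFFF
    return (ip_to_int(addr) & care, care, action)


# Entries in ACL order; "host 10.1.1.1" is wildcard 0.0.0.0 (all bits care).
TCAM = [
    ace("10.1.1.1", "0.0.0.0", "permit"),
    ace("10.1.1.0", "0.0.0.255", "deny"),
]


def first_match(src_ip):
    """Return the action of the first matching entry; lookup stops there."""
    key = ip_to_int(src_ip)
    for pattern, care, action in TCAM:
        if key & care == pattern:
            return action
    return "deny"  # implicit deny at the end of every access list
```

Here `first_match("10.1.1.1")` hits the host entry and is permitted, while any other address in 10.1.1.0/24 falls through to the second entry and is denied.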

ARP throttling

Only the first few packets for a connected destination reach the Layer 3 engine so that the Layer 3 engine can use Address Resolution Protocol (ARP) to locate the host. A throttling adjacency is installed so that subsequent packets to that host are dropped in hardware until an ARP response is received. The throttling adjacency is removed when an ARP reply is received (and a complete rewrite adjacency is installed for the host). If no ARP reply is seen within 2 seconds, the switch removes the throttling adjacency to allow more packets through to reinitiate ARP. This relieves the Layer 3 engine from excessive ARP processing and protects it from ARP-based denial of service attacks.
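The throttling behavior can be pictured as a small state machine per destination. The Python sketch below is a simplification under stated assumptions: it lets exactly one packet through per throttle interval (the text says "the first few"), it takes the current time as a parameter instead of reading a clock, and the class and method names are hypothetical.

```python
class ArpThrottle:
    """Toy model of ARP throttling for directly connected hosts."""

    THROTTLE_SECONDS = 2.0  # interval given in the text

    def __init__(self):
        self.resolved = {}    # ip -> MAC (complete rewrite adjacency)
        self.throttled = {}   # ip -> time the throttling adjacency was installed
        self.arp_requests = 0

    def packet_in(self, dst_ip, now):
        if dst_ip in self.resolved:
            return "forward"                  # rewritten and switched in hardware
        installed = self.throttled.get(dst_ip)
        if installed is not None and now - installed < self.THROTTLE_SECONDS:
            return "drop"                     # dropped in hardware, engine not bothered
        # Glean: punt to the Layer 3 engine, send ARP, (re)install throttling.
        self.throttled[dst_ip] = now
        self.arp_requests += 1
        return "punt"

    def arp_reply(self, ip, mac):
        self.throttled.pop(ip, None)          # remove the throttling adjacency
        self.resolved[ip] = mac               # install the resolved adjacency
```

A burst of packets toward an unresolved host therefore triggers at most one ARP request per two-second window in this model, which is the protection the paragraph describes.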

The figure provides an example of ARP throttling, which consists of these steps:

Step 1 Host A sends a packet to host B.

Step 2 The switch forwards the packet to the Layer 3 engine based on the "glean" entry in the FIB.

Step 3 The Layer 3 engine sends an ARP request for host B and installs the drop adjacency for host B.

Step 4 Host B responds to the ARP request.

Step 5 The Layer 3 engine installs the adjacency for host B and removes the drop adjacency.

The adjacency table is populated as adjacencies are discovered. Each time an adjacency entry is created (such as through ARP), a link-layer header for that adjacent node is precomputed and stored in the adjacency table. After a route is determined, it points to a next hop and corresponding adjacency entry. The adjacency is subsequently used for encapsulation during CEF switching of packets.

A route might have several paths to a destination prefix, such as when a router is configured for simultaneous load balancing and redundancy. For each resolved path, a pointer is added for the adjacency corresponding to the next-hop interface for that path. This mechanism is used for load balancing across several paths.

In addition to adjacencies associated with next-hop interfaces (host-route adjacencies), other types of adjacencies are used to expedite switching when certain exception conditions exist. When the prefix is defined, prefixes requiring exception processing are cached with one of the following special adjacencies:

* Null adjacency – Packets destined for a "Null0" interface are dropped. This can be used as an effective form of access filtering.
* Glean adjacency – When a router is connected directly to several hosts, the FIB table on the router maintains a prefix for the subnet rather than for the individual host prefixes. The subnet prefix points to a glean adjacency. When packets need to be forwarded to a specific host, the adjacency database is gleaned for the specific prefix.
* Punt adjacency – Features that require special handling, or features that are not yet supported in conjunction with CEF switching paths, are forwarded to the next higher switching level for handling; for example, the packet may require CPU processing.
* Discard adjacency – Packets are discarded.
* Drop adjacency – Packets are dropped, but the prefix is checked.

When a link-layer header is prepended to packets, the FIB requires the prepended header to point to an adjacency corresponding to the next hop. If an adjacency was created by the FIB and not discovered through a mechanism such as ARP, the Layer 2 addressing information is not known and the adjacency is considered incomplete. Until the Layer 2 information is known, packets are forwarded to the route processor, and the adjacency is determined through ARP.

CEF-Based MLS Operation

These are the steps that would occur when using CEF to forward frames between host A and host B on different VLANs:

Step 1 Host A sends a packet to host B. The switch recognizes the frame as a Layer 3 packet because the destination MAC (MAC-M) matches the Layer 3 engine MAC.

Step 2 The switch performs a CEF lookup based on the destination IP address (IP-B). The packet hits the CEF entry for the connected (VLAN20) network and is redirected to the Layer 3 engine using a "glean" adjacency.

Step 3 The Layer 3 engine installs an ARP throttling adjacency in the switch for the host B IP address.

Step 4 The Layer 3 engine sends ARP requests for host B on VLAN20.

Step 5 Host B sends an ARP response to the Layer 3 engine.

Step 6 The Layer 3 engine installs the resolved adjacency in the switch (removing the ARP throttling adjacency).

Step 7 The switch forwards the packet to host B.

Step 8 The switch receives a subsequent packet for host B (IP-B).

Step 9 The switch performs a Layer 3 lookup and finds a CEF entry for host B. The entry points to the adjacency with rewrite information for host B.

Step 10 The switch rewrites packets per the adjacency information and forwards the packet to host B on VLAN20.

Frame Rewrite Using CEF

IP unicast packets are rewritten on the output interface as follows:

* The source MAC address changes from the sender MAC address to the router MAC address.
* The destination MAC address changes from the router MAC to the next-hop MAC address.
* The Time to Live (TTL) is decremented by one.
* The IP header and frame checksums are recalculated.
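The rewrite steps can be sketched in software as follows. This Python example assumes an Ethernet II frame carrying an IPv4 header without options (20 bytes). The function names are hypothetical, and the Ethernet frame check sequence is left out because interface hardware normally regenerates it.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over 16-bit words of an IPv4 header.

    Pass the header with its checksum field zeroed to compute the value;
    a header with a correct checksum in place yields 0.
    """
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def rewrite_frame(frame: bytes, router_mac: bytes, next_hop_mac: bytes) -> bytes:
    """Apply the Layer 3 rewrite: new MACs, TTL - 1, new IP header checksum."""
    ethertype = frame[12:14]
    ip = bytearray(frame[14:34])   # 20-byte IPv4 header, no options assumed
    payload = frame[34:]
    if ip[8] <= 1:
        raise ValueError("TTL expiring: exception packet for the Layer 3 engine")
    ip[8] -= 1                     # decrement TTL
    ip[10:12] = b"\x00\x00"        # zero the checksum field before recomputing
    struct.pack_into("!H", ip, 10, ipv4_checksum(bytes(ip)))
    # Destination MAC becomes the next hop, source MAC becomes the router.
    return next_hop_mac + router_mac + ethertype + bytes(ip) + payload
```

Because the TTL changes on every routed hop, the IP header checksum has to change with it, which is why the last bullet follows from the third.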

Benefits

CEF offers the following benefits:

* Improved performance – CEF is less CPU-intensive than fast-switching route caching. More CPU processing power can be dedicated to Layer 3 services such as quality of service (QoS) and encryption.

* Scalability – CEF offers full switching capacity at each line card when distributed CEF (dCEF) mode is active.

* Resilience – CEF offers an unprecedented level of switching consistency and stability in large dynamic networks. In dynamic networks, fast-switching cache entries are frequently invalidated due to routing changes. These changes can cause traffic to be process switched using the routing table, rather than fast switched using the route cache. Because the Forwarding Information Base (FIB) lookup table contains all known routes that exist in the routing table, it eliminates route cache maintenance and the fast switch/process switch forwarding scenario. CEF can switch traffic more efficiently than typical demand caching schemes.

Restrictions

* The Cisco 12000 series Gigabit Switch Routers operate only in distributed CEF mode.

* Distributed CEF switching cannot be configured on the same VIP card as distributed fast switching.

* Distributed CEF is not supported on Cisco 7200 series routers.

* If you enable CEF and then create an access list that uses the log keyword, the packets that match the access list are not CEF switched. They are fast switched. Logging disables CEF.

CEF Components

Information conventionally stored in a route cache is stored in several data structures for CEF switching. The data structures provide optimized lookup for efficient packet forwarding. The two main components of CEF operation are the following:

Forwarding Information Base

Adjacency Tables

Forwarding Information Base

CEF uses a FIB to make IP destination prefix-based switching decisions. The FIB is conceptually similar to a routing table or information base. It maintains a mirror image of the forwarding information contained in the IP routing table. When routing or topology changes occur in the network, the IP routing table is updated, and those changes are reflected in the FIB. The FIB maintains next-hop address information based on the information in the IP routing table.

Because there is a one-to-one correlation between FIB entries and routing table entries, the FIB contains all known routes and eliminates the need for route cache maintenance that is associated with switching paths such as fast switching and optimum switching.

Adjacency Tables

Nodes in the network are said to be adjacent if they can reach each other with a single hop across a link layer. In addition to the FIB, CEF uses adjacency tables to prepend Layer 2 addressing information. The adjacency table maintains Layer 2 next-hop addresses for all FIB entries.

Adjacency Discovery

The adjacency table is populated as adjacencies are discovered. Each time an adjacency entry is created (such as through the ARP protocol), a link-layer header for that adjacent node is precomputed and stored in the adjacency table. Once a route is determined, it points to a next hop and corresponding adjacency entry. It is subsequently used for encapsulation during CEF switching of packets.

Adjacency Resolution

A route might have several paths to a destination prefix, such as when a router is configured for simultaneous load balancing and redundancy. For each resolved path, a pointer is added for the adjacency corresponding to the next-hop interface for that path. This mechanism is used for load balancing across several paths.

Adjacency Types That Require Special Handling

In addition to adjacencies associated with next-hop interfaces (host-route adjacencies), other types of adjacencies are used to expedite switching when certain exception conditions exist. When the prefix is defined, prefixes requiring exception processing are cached with one of the special adjacencies listed in Table 4.

Table 4: Adjacency Types for Exception Processing

This adjacency type... Receives this processing...

Null adjacency

Packets destined for a Null0 interface are dropped. This can be used as an effective form of access filtering.

Glean adjacency

When a router is connected directly to several hosts, the FIB table on the router maintains a prefix for the subnet rather than for the individual host prefixes. The subnet prefix points to a glean adjacency. When packets need to be forwarded to a specific host, the adjacency database is gleaned for the specific prefix.

Punt adjacency

Features that require special handling or features that are not yet supported in conjunction with CEF switching paths are forwarded to the next switching layer for handling. Features that are not supported are forwarded to the next higher switching level.

Discard adjacency

Packets are discarded.

Drop adjacency

Packets are dropped, but the prefix is checked.

Unresolved Adjacency

When a link-layer header is prepended to packets, the FIB requires the prepended header to point to an adjacency corresponding to the next hop. If an adjacency was created by the FIB and not discovered through a mechanism such as ARP, the Layer 2 addressing information is not known and the adjacency is considered incomplete. Until the Layer 2 information is known, packets are forwarded to the route processor, and the adjacency is determined through ARP.

Supported Media

CEF currently supports ATM/AAL5snap, ATM/AAL5mux, ATM/AAL5nlpid, Frame Relay, Ethernet, FDDI, PPP, HDLC, and tunnels.

CEF Operation Modes

CEF can be enabled in one of two modes:

Central CEF Mode

Distributed CEF Mode

Central CEF Mode

When CEF mode is enabled, the CEF FIB and adjacency tables reside on the route processor, and the route processor performs the express forwarding. You can use CEF mode when line cards are not available for CEF switching or when you need to use features not compatible with distributed CEF switching.

Figure 8 shows the relationship between the routing table, FIB, and adjacency table during CEF mode. The Cisco Catalyst switches forward traffic from workgroup LANs to a Cisco 7500 series router on the enterprise backbone running CEF. The route processor performs the express forwarding.

Figure 8: CEF Mode

Distributed CEF Mode

When dCEF is enabled, line cards, such as VIP line cards or GSR line cards, maintain an identical copy of the FIB and adjacency tables. The line cards perform the express forwarding between port adapters, relieving the RSP of involvement in the switching operation.

dCEF uses an Inter Process Communication (IPC) mechanism to ensure synchronization of FIBs and adjacency tables on the route processor and line cards.
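A rough way to picture this synchronization is a route processor that pushes a copy of its tables to every line card after each change. The Python sketch below is only an analogy: the classes, method names, and the full-copy approach are assumptions, and real IPC messaging is considerably more involved.

```python
class LineCard:
    """Holds local copies of the tables so it can forward without the RP."""

    def __init__(self, slot):
        self.slot = slot
        self.fib = {}
        self.adjacency = {}


class RouteProcessor:
    """Owns the master tables and keeps every line card in sync."""

    def __init__(self, line_cards):
        self.fib = {}          # prefix -> next hop
        self.adjacency = {}    # next hop -> MAC
        self.line_cards = line_cards

    def _sync(self):
        # Stand-in for IPC messages: each card receives an identical copy.
        for card in self.line_cards:
            card.fib = dict(self.fib)
            card.adjacency = dict(self.adjacency)

    def install_route(self, prefix, next_hop, mac):
        self.fib[prefix] = next_hop
        self.adjacency[next_hop] = mac
        self._sync()

    def withdraw_route(self, prefix):
        self.fib.pop(prefix, None)
        self._sync()
```

After any `install_route` or `withdraw_route` call, every line card's tables equal the route processor's, which is the property that lets the cards forward on their own.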

Figure 9 shows the relationship between the route processor and line cards when dCEF mode is active.

Figure 9: dCEF Mode

In this Cisco 12000 series router, the line cards perform the switching. In other routers where you can mix various types of cards in the same router, it is possible that not all of the cards you are using support CEF. When a line card that does not support CEF receives a packet, the line card forwards the packet to the next higher switching layer (the route processor) or forwards the packet to the next hop for processing. This structure allows legacy interface processors to exist in the router with newer interface processors.

Note The Cisco 12000 series Gigabit Switch Routers operate only in dCEF mode; dCEF switching cannot be configured on the same VIP card as distributed fast switching, and dCEF is not supported on Cisco 7200 series routers.

Additional Capabilities

In addition to configuring CEF and dCEF, you can also configure the following features:

* Distributed CEF switching using access lists
* Distributed CEF switching of Frame Relay packets
* Distributed CEF switching during packet fragmentation
* Load balancing on a per destination-source host pair or per packet basis
* Network accounting to gather byte and packet statistics
* Distributed CEF switching across IP tunnels

Configuring Distributed Tunnel Switching for CEF

CEF supports distributed tunnel switching, such as GRE tunnels. Distributed tunnel switching is enabled automatically when you enable CEF or dCEF. You do not perform any additional tasks to enable distributed tunnel switching once you enable CEF or dCEF.

Configuration and Troubleshooting

Hardware Layer 3 switching is permanently enabled on the Catalyst 6500 series Supervisor Engine 720 with Policy Feature Card 3 (PFC3), Multilayer Switch Feature Card 3 (MSFC3), and Distributed Forwarding Card (DFC). No configuration is required, and CEF cannot be disabled.

To disable CEF, use the no ip cef command on the Catalyst 4000, or the no ip route-cache cef command on a Catalyst 3550 interface.

If CEF is enabled globally, it is automatically enabled on all interfaces as long as IP routing is enabled on the device. It can then be enabled or disabled on a per-interface basis. Cisco recommends enabling CEF on all Layer 3 interfaces. If CEF is disabled on an interface, you can enable CEF as follows:

* On the Catalyst 3550 switch, use the ip route-cache cef interface configuration command to enable CEF on an interface.
* On the Catalyst 4000 switch, use the ip cef interface configuration command to enable CEF on an interface after it has been disabled.

Per-destination load balancing allows the router to use multiple paths to achieve load sharing. Packets for a given source-destination host pair are guaranteed to take the same path, even if multiple paths are available. This ensures that packets for a given host pair arrive in order. Per-destination load balancing is enabled by default when you enable CEF, and it is the load-balancing method of choice for most situations.

Because per-destination load balancing depends on the statistical distribution of traffic, load sharing becomes more effective as the number of source-destination pairs increases.
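The contrast between per-destination and per-packet load balancing can be sketched as below. This Python example uses a CRC32 of the address pair purely as a stand-in hash; the actual hash used by CEF is not described in this text, and the function and class names are hypothetical.

```python
import itertools
import zlib


def per_destination_path(src_ip, dst_ip, paths):
    """Pick a path from a hash of the host pair: same pair, same path."""
    key = f"{src_ip}->{dst_ip}".encode()
    return paths[zlib.crc32(key) % len(paths)]   # illustrative hash only


class PerPacketBalancer:
    """Round-robin over the paths, one packet at a time."""

    def __init__(self, paths):
        self._cycle = itertools.cycle(paths)

    def next_path(self):
        return next(self._cycle)
```

Per-destination selection keeps each host pair on one path, so packets stay in order, but it only spreads load well when there are many pairs. Per-packet selection spreads load evenly at the cost of possible reordering.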

Verifying Layer 3 Switching

The show ip cef detail command indicates whether CEF is running globally. Specify an interface to verify CEF operation on that interface.

Display CEF Statistics

Use the show interfaces command with the | include switch argument to show switching statistics at each layer for the interface. Verify that Layer 3 packets are being switched.

Displaying Detailed Adjacency Information

Each time an adjacency entry is created, a Layer 2 data link-layer header for that adjacent node is precomputed and stored in the adjacency table. This information is subsequently used for encapsulation during CEF switching of packets.

Output from the show adjacency detail command displays the information to be used during this Layer 2 encapsulation. Verify that the displayed header matches the header that would be expected from ordinary Layer 2 operation, that is, one built without the precomputed encapsulation from the adjacency table. Adjacency statistics are updated approximately every 60 seconds.

Also, the show cef drops command displays whether packets are being dropped due to adjacencies that are either incomplete or nonexistent. There are two known reasons for incomplete or nonexistent adjacencies:

* The router cannot use ARP successfully for the next-hop interface.
* After a clear ip arp or a clear adjacency command, the router marks the adjacency as incomplete, and then it fails to clear the entry.

The symptoms of an incomplete adjacency include random packet drops during a ping test. Use the debug ip cef command to view CEF drops due to an incomplete adjacency.

Debugging CEF Operations

Use the debug ip cef arguments to limit the debug output, thereby reducing the overhead of the debug command and providing focus on a specific CEF operation:

debug ip cef {drops [access-list] | receive [access-list] | events [access-list] | prefix-ipc [access-list] | table [access-list]}

Adding an argument to the debug command limits the debug output as follows:

* drops – Records dropped packets
* access-list (optional) – Controls collection of debugging information from specified lists
* receive – Records packets that are not switched using information from the FIB table, but that are received and sent to the next switching layer
* events – Records general CEF events
* prefix-ipc – Records updates related to IP prefix information, including the following:
  o Debugging of IP routing updates in a line card
  o Reloading of a line card with a new table
  o Adding a route update from the route processor to the line card exceeds the maximum number of routes
  o Control messages related to FIB table prefixes
* table – Produces a table showing events related to the FIB table. Possible types of events include the following:
  o Routing updates that populate the FIB table
  o Flushing of the FIB table
  o Adding or removing of entries to the FIB table
  o Table reloading process

CEF Troubleshooting steps:

CEF is the fastest means of switching Layer 3 packets in hardware. The CEF tables stored in hardware are populated from information gathered by the route processor. Troubleshooting CEF operations therefore has two primary steps:

* Ensure that the normal Layer 3 operations on the route processor are functioning properly so that the switch tables will be populated with accurate and complete information.
* Verify that information from the route processor has properly populated the FIB and adjacency table, and is being used by CEF to switch Layer 3 packets in hardware.

Troubleshooting CEF is, in essence, verifying that packets are indeed receiving the full benefit of CEF switching and not being "punted" to a slower packet switching or processing method. The Cisco term "punt" describes an action of sending a packet "down" to the next fastest switching level. The following list defines the order of preferred Cisco IOS switching methods, from fastest to slowest:

* Distributed CEF
* CEF
* Fast switching
* Process switching

A punt occurs when the preferred switching method does not produce a valid path or, in the case of CEF, a valid adjacency. If the CEF lookup process fails to find a valid entry in the FIB, CEF installs a punt adjacency to the less preferred system. CEF then punts all packets with that adjacency to the next best switching mode, so that all packets are forwarded by some means even if that means is less efficient.
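The punt behavior can be thought of as a fallback chain: each switching method is tried in order of preference, and a miss hands the packet to the next slower method. The Python sketch below illustrates only that control flow. The function name, the per-method lookup tables, and the ordering used in the example are assumptions for illustration.

```python
def switch_packet(dst, methods):
    """Try each switching method in order, fastest first.

    methods is an ordered list of (name, lookup) pairs, where lookup(dst)
    returns an egress interface or None when that method has no valid entry.
    """
    for name, lookup in methods:
        egress = lookup(dst)
        if egress is not None:
            return name, egress
        # No valid entry here: punt to the next, slower method.
    return "dropped", None


# Hypothetical per-method tables: destination -> egress interface.
cef_table = {"10.1.20.5": "Gi0/1"}
fast_cache = {"10.1.30.7": "Gi0/2"}
routing_table = {"10.1.40.9": "Gi0/3"}

METHODS = [
    ("CEF", cef_table.get),
    ("fast switching", fast_cache.get),
    ("process switching", routing_table.get),
]
```

When troubleshooting, the interesting question is which name comes back: anything other than the first method means the packet was punted at least once.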

SVI

A switched virtual interface (SVI) is a virtual Layer 3 interface that can be configured for any VLAN that exists on a Layer 3 switch. It is virtual in that there is no physical interface for the VLAN and yet it can accept configuration parameters applied to any Layer 3 router interface. The SVI for the VLAN provides Layer 3 processing for packets from all switch ports associated with that VLAN. Only one SVI can be associated with a VLAN. You configure an SVI for a VLAN for these reasons:

* To provide a default gateway for a VLAN so traffic can be routed between VLANs
* To provide fallback bridging if it is required for nonroutable protocols
* To provide Layer 3 IP connectivity to the switch

By default, an SVI is created for the default VLAN (VLAN1) to permit remote switch administration. You must explicitly configure additional SVIs.

SVIs are created the first time interface configuration mode is entered for a particular VLAN SVI interface. The VLAN corresponds to the VLAN tag associated with data frames on an ISL or 802.1Q encapsulated trunk or to the VLAN ID configured for an access port. Configure and assign an IP address to each VLAN SVI that is to route traffic off of and onto the local VLAN.

SVIs support routing protocol and bridging configurations.

A routed switch port is a physical switch port on a multilayer switch that is capable of Layer 3 packet processing. A routed port is not associated with a particular VLAN, as is an access port or SVI. A routed port behaves like a regular router interface, except that it does not support VLAN subinterfaces. Routed switch ports can be configured using most commands applied to a physical router interface, including the assignment of an IP address and the configuration of Layer 3 routing protocols.

A routed switch port is similar to an SVI in that it is a switch port that provides Layer 3 packet processing. SVIs generally provide Layer 3 services for devices connected to the ports of the switch where the SVI is configured. Routed switch ports can provide a Layer 3 path into the switch for a number of devices on a specific subnet, all of which are located out a single switch port.

The number of routed ports and SVIs that can be configured on a switch is not limited by software. However, the interrelationship between these interfaces and other features configured on the switch may overload the CPU due to hardware limitations.

Routed switch ports are typically configured by removing the Layer 2 switchport ability of the switch port. On most switches the ports are Layer 2 ports by default. On some switches, the ports are Layer 3 ports by default. The layer at which the port functions determines the commands that can be configured on the port.

The ping command will return one of these responses:

* Success rate is 100 percent or ip-address is alive – This response occurs in one to ten milliseconds, depending on network traffic and the number of Internet Control Message Protocol (ICMP) packets sent.

* Destination does not respond – No answer message is returned if the host does not respond.

* Unknown host – This response occurs if the targeted host cannot be resolved.

* Destination unreachable – This response occurs if the default gateway cannot reach the specified network or is being blocked.

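* Network or host unreachable – This response occurs if there is no entry in the route table for the host or network. Note that the TTL is a hop count rather than a time value; the 2-second default applies to the ping reply timeout.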

6. Router redundancy protocols

Routing issues:

Using Default Gateways

When configuring a default gateway on most devices, there is no means by which to configure a secondary gateway, even if a second route exists to carry packets off the local segment.

For example, primary and secondary paths between the Building Access submodule and the Building Distribution submodule provide continual access in the event of a link failure at the Building Access layer. Primary and secondary paths between the Building Distribution layer and the Building Core layer provide continual operations should a link fail at the Building Distribution layer.

In this example, router A is responsible for routing packets for subnet A, and router B is responsible for handling packets for subnet B. If router A becomes unavailable, routing protocols can quickly and dynamically converge and determine that router B will now transfer packets that would otherwise have gone through router A. Most workstations, servers, and printers, however, do not receive this dynamic routing information.

End devices are typically configured with a single default gateway IP address that does not change when network topology changes occur. If the router whose IP address is configured as the default gateway fails, the local device will be unable to send packets off the local network segment, effectively disconnecting it from the rest of the network. Even if a redundant router exists that could serve as a default gateway for that segment, there is no dynamic method by which these devices can determine the address of a new default gateway.

Using Proxy ARP

Cisco IOS software runs proxy ARP to enable hosts that have no knowledge of routing options to obtain the MAC address of a gateway that is able to forward packets off the local subnet. For example, if the proxy ARP router receives an ARP request for an IP address that it knows is not on the same interface as the ARP request sender, it generates an ARP reply packet giving its own local MAC address as the destination MAC address of the IP address being resolved. The host that sent the ARP request then sends all packets destined for the resolved IP address to the MAC address of the router. The router then forwards the packets toward the intended host, perhaps repeating this process along the way. Proxy ARP is enabled by default.

With proxy ARP, the end-user station behaves as if the destination device were connected to the same network segment. If the responsible router fails, the source end station continues to send packets for that IP destination to the MAC address of the failed router, and the packets are therefore discarded.

Eventually the proxy ARP MAC address will age out of the workstation's ARP cache. The workstation may eventually acquire the address of another proxy ARP failover router, but the workstation cannot send packets off the local segment during this failover time.

HSRP

The Need for HSRP

Because of the way that IP hosts determine where to forward their packets, there needs to be a way to enable quick and efficient redundancy or fail-over. Without HSRP the failure of a single gateway could isolate hosts. The ways in which IP hosts can determine where to forward packets are listed below:

1. The configuration of a default gateway. This is the simplest and most often used method. Using this method a host is statically configured to know the IP address of its default router. However, if that router should become unavailable, the host will no longer be able to communicate with devices off of the local LAN segment even if there is another router available.

2. The use of Proxy ARP. A host can discover a router by using ARP to find the MAC address of any hosts which are not on its directly connected LAN. A router which is configured to support Proxy ARP will answer ARP requests with its own MAC address when it has a specific route for these addresses. Unfortunately, many hosts have long time out values (or no timeout values at all) on their ARP caches. As a result, if a router should become unavailable, the host will continue to attempt to send traffic for these hosts to the router which originally sent the proxy ARP reply.

3. ICMP Router Discovery Protocol (IRDP). IP hosts can use the Router Discovery Protocol to listen to router hellos. This allows a host to quickly adapt to changes in network topology. However, only a very small number of hosts have implementations of IRDP.

4. RIP. Some IP hosts use RIP to discover routers. These hosts will adapt to topology changes as RIP converges. Very few hosts fall into the category of devices which run RIP, and very few network administrators would like to deal with the overhead of another routing protocol in their networks or the hassle of having all of their hosts relying on a routing protocol for connectivity. Even if they will take this hit, the convergence of RIP can take over a minute.

Since none of these mechanisms are satisfactory in the majority of networking situations, Cisco has developed the Hot Standby Router Protocol to allow hosts to adapt to network topology changes almost immediately without requiring hosts to run any special software. HSRP is used in conjunction with the configuration of a default gateway in the host devices. This makes the protocol easy to use in any networking environment and provides redundancy for the critical first hop, or gateway.

HSRP defines a standby group of routers, with one router as the active router. HSRP provides gateway redundancy by sharing IP and MAC addresses between redundant gateways. The protocol consists of a virtual MAC address and IP address that are shared between two routers that belong to the same HSRP standby group. HSRP can optionally monitor both LAN and serial interfaces via a multicast protocol.

An HSRP standby group comprises these entities:

* One active router

* One standby router

* One virtual router

* Other routers

HSRP Operation

The Virtual HSRP Router

The virtual router is simply an IP and MAC address pair that end devices have configured as their default gateway. The active router will process all packets and frames sent to the virtual router address. The virtual router processes no physical frames.

The Active HSRP Router Within an HSRP standby group, one router is elected to be the active router. The active router physically forwards packets sent to the MAC address of the virtual router.

The active router responds to traffic for the virtual router. If an end station sends a packet to the virtual router MAC address, the active router receives and processes that packet. If an end station sends an ARP request with the virtual router IP address, the active router replies with the virtual router MAC address.

In this example, router A assumes the active router role and forwards all frames addressed to the well-known MAC address of 0000.0c07.acxx, where xx is the HSRP group identifier.

ARP Resolution with HSRP ARP establishes correspondences between network addresses, such as an IP address and a hardware Ethernet address. All devices sending packets over IP maintain a table of resolved addresses, including routers.

The IP address and corresponding MAC address of the virtual router are maintained in the ARP table of each router in an HSRP standby group.

In the example, the output displays an ARP entry for a router that is a member of HSRP standby group 47 in VLAN10. The virtual router for VLAN10 is identified as 172.16.10.110. The well-known MAC address that corresponds to this IP address is 0000.0c07.ac2f, where 2f is the HSRP group identifier for standby group 47. The HSRP group number is the standby group number (47) converted to hexadecimal (2f).
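The mapping from standby group number to well-known MAC address described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the 0-255 range check is an assumption based on the two-hex-digit xx field.

```python
def hsrp_virtual_mac(group: int) -> str:
    """Return the HSRP well-known virtual MAC (Cisco dotted notation) for a group."""
    # Assumption: the group identifier occupies one byte (the "xx" in 0000.0c07.acxx).
    if not 0 <= group <= 255:
        raise ValueError("group number must be between 0 and 255")
    # The last byte of the MAC is the group number written in hexadecimal.
    return f"0000.0c07.ac{group:02x}"


print(hsrp_virtual_mac(47))  # 0000.0c07.ac2f
```

Group 47 yields the suffix 2f, matching the ARP entry discussed in the example.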

The Standby and Other HSRP Routers in the Group

The function of the HSRP standby router is to monitor the operational status of the HSRP group and quickly assume packet-forwarding responsibility if the active router becomes inoperable. Both the active and standby routers transmit hello messages to inform all other routers in the group of their role and status.

An HSRP standby group may contain other routers that are group members but are not in an active or standby state. These routers monitor the hello messages sent by the active and standby routers to ensure that an active and standby router exist for the HSRP group of which they are a member. These routers do forward any packets addressed to their own specific IP addresses, but they do not forward packets addressed to the virtual router. These routers issue speak messages at every hello interval time.

HSRP Active and Standby Router Interaction When the active router fails, the other HSRP routers stop seeing hello messages from the active router. The standby router will then assume the role of the active router. If there are other routers participating in the group, those routers then contend to be the new standby router.

In the event that both the active and standby routers fail, all routers in the group contend for the active and standby router roles.

Because the new active router assumes both the IP and MAC addresses of the virtual router, the end stations see no disruption in service. The end-user stations continue to send packets to the virtual router MAC address, and the new active router delivers the packets to the destination.

HSRP States

When a router exists in one of these states, the router performs the necessary actions required for that state. Not all HSRP routers will transition through all states. For example, a router that is not the standby or active router will not enter the standby or active states.

HSRP Initial State

All routers begin in the initial state. This is the starting state and indicates that HSRP is not running. This state is entered via a configuration change or when an interface first comes up.

HSRP Listen State In the listen state, the router knows the IP address of the virtual router, but is neither the active router nor the standby router. The router listens for hello messages from those routers for a duration called the hold time which can be configured. The purpose of this listening interval is to determine if there are active or standby routers for the group. Then this router will join the HSRP group, based on its configuration.

HSRP Speak State In the speak state, the router sends periodic hello messages and is actively participating in the election of the active router or standby router or both. A router cannot enter the speak state unless the router has the IP address of the virtual router. The router will remain in the speak state unless it becomes an active or standby router.

HSRP Standby State In the standby state, because the router is a candidate to become the next active router, it sends periodic hello messages. It will also listen for hello messages from the active router. There will be only one standby router in the HSRP group.

HSRP Active State

In the active state, the router is currently forwarding packets that are sent to the virtual MAC address of the group. It also replies to ARP requests for the virtual router's IP address. The active router sends periodic hello messages. There must be one active router in each HSRP group.

State Definition

* Initial – This is the state at the start. This state indicates that HSRP does not run. This state is entered via a configuration change or when an interface first comes up.

* Learn – The router has not determined the virtual IP address and has not yet seen an authenticated hello message from the active router. In this state, the router still waits to hear from the active router.

* Listen – The router knows the virtual IP address, but the router is neither the active router nor the standby router. It listens for hello messages from those routers.

* Speak – The router sends periodic hello messages and actively participates in the election of the active and/or standby router. A router cannot enter speak state unless the router has the virtual IP address.

* Standby – The router is a candidate to become the next active router and sends periodic hello messages. With the exclusion of transient conditions, there is, at most, one router in the group in standby state.

* Active – The router currently forwards packets that are sent to the group virtual MAC address. The router sends periodic hello messages. With the exclusion of transient conditions, there must be, at most, one router in active state in the group.

HSRP Configuration and Verification

Configure HSRP Group on an Interface

While running HSRP, it is important that the end-user stations do not discover the actual MAC addresses of the routers in the standby group. Any protocol that informs a host of a router's actual address must be disabled. To ensure that the actual addresses of the participating HSRP routers are not discovered, enabling HSRP on a Cisco router interface automatically disables ICMP redirects on that interface.

After the standby ip command is issued, the interface changes to the appropriate state. When the router successfully executes the command, the router issues an HSRP message.

To remove an interface from an HSRP group, enter the no standby group-number ip command.

Verifying HSRP Configuration The following example states that interface VLAN10 is a member of the HSRP standby group 47, the virtual router IP address for that group is 172.16.10.110, and that ICMP redirects are disabled:

Switch#show running-config

Another means of verifying the HSRP configuration is with the command:

Switch#show standby brief

This command displays abbreviated information about the current state of all HSRP operations on this device.

Establish HSRP Priorities Each standby group has its own active and standby routers. The network administrator can assign a priority value to each router in a standby group, allowing the administrator to control the order in which active routers for that group are selected.

To set the priority value of a router, enter this command in interface configuration mode:

Switch(config-if)#standby group-number priority priority-value

During the election process, the router in an HSRP group with the highest priority becomes the forwarding router. In the case of a tie, the router with the highest configured IP address will become active. The default priority is 100.
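The election rule above (highest priority wins, ties broken by the highest IP address) can be sketched as follows. This is a simplified model with hypothetical router names and addresses; it ignores preemption and the hello exchange, and only shows the comparison itself.

```python
from ipaddress import IPv4Address


def elect_active(routers):
    """Pick the active router from (name, priority, interface_ip) tuples."""
    # Compare by priority first, then by IP address as a number (not as a string).
    name, _, _ = max(routers, key=lambda r: (r[1], IPv4Address(r[2])))
    return name


# Hypothetical standby group members.
candidates = [
    ("A", 100, "172.16.10.2"),  # default priority
    ("B", 150, "172.16.10.3"),
    ("C", 150, "172.16.10.4"),  # ties with B on priority, higher IP address
]
print(elect_active(candidates))  # C
```

Comparing addresses as IPv4Address objects matters: as plain strings, "10.0.0.9" would sort above "10.0.0.10".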

To reinstate the default standby priority value, enter the no standby priority command.

To display the status of the HSRP router, enter one of these commands:

Switch#show standby [interface [group]] [active | init | listen | standby][brief]

Switch#show standby delay [type-number]

If the optional interface parameters are not indicated, the show standby command displays HSRP information for all interfaces.

Verify all HSRP Operations This example shows the output of the show standby command:

Switch#show standby Vlan10 47

This is an example of the output resulting when you specify the brief parameter:

Switch#show standby brief

Load-sharing

To facilitate load sharing, a single router may be a member of multiple HSRP standby groups on a single segment. Multiple standby groups further enable redundancy and load sharing within networks. While a router is actively forwarding traffic for one HSRP group, the router can be in standby or listen state for another group. Each standby group emulates a single virtual router. There can be up to 255 standby groups on any LAN.

CAUTION:

Increasing the number of groups in which a router participates increases the load on the router. This can have an impact on the performance of the router.

Addressing HSRP Groups Across Trunk Links

Routers can simultaneously provide redundant backup and perform load sharing across different IP subnets.

For each standby group, an IP address and a single well-known MAC address with a unique group identifier are allocated to the group.

The IP address of a group is in the range of addresses belonging to the subnet that is in use on the LAN. However, the IP address of the group must differ from the addresses allocated as interface addresses on all routers and hosts on the LAN, including virtual IP addresses assigned to other HSRP groups.

NOTE: A Route Processor (RP) theoretically can support up to 32,650 subinterfaces; however, the actual number of supported interfaces is limited by the capacity of the RP and the number of VLANs.

Supporting Multiple Subnets with Multiple HSRP Groups Routers can belong to multiple groups within multiple VLANs. As members of multiple hot standby groups, routers can simultaneously provide redundant backup and perform load sharing across different IP subnets.

Although multiple routers can exist in an HSRP group, only the active router forwards the packets sent to the virtual router.

A single group can also be configured to host the virtual router of multiple IP subnets. For example, the default gateway virtual addresses of 172.16.10.1 and 172.16.20.1 can both be associated with a single HSRP group number.

Timers

HSRP Timers can be adjusted to tune the performance of HSRP on the Distribution devices, thereby increasing their resilience and reliability in routing packets off the local VLAN.

Sub-second Failover

The HSRP hello and holdtime can be set to millisecond values so that HSRP failover occurs in less than 1 second. For example:

Switch(config-if)#standby 1 timers msec 200 msec 750

Preempt Time Aligned with Router Boot Time

Preempt is an important feature of HSRP which allows the primary router to re-assume the active role when it comes back online after a failure or maintenance event. Preemption is a desired behavior as it forces a predictable routing path for the VLAN during normal operations and ensures that the Layer 3 forwarding path for a VLAN parallels the Layer 2 STP forwarding path whenever possible.

When a preempting device is rebooted, HSRP preempt communication should not begin until the distribution switch has established full connectivity to the rest of the network. This allows routing protocol convergence to occur more quickly once the preferred router is in an active state. To accomplish this, measure the system boot time and set the HSRP preempt delay to a value 50 percent greater than the boot time. This ensures that the primary distribution switch establishes full connectivity to the network before HSRP communication occurs.

Timer Description

* Active timer – This timer is used to monitor the active router. This timer starts any time an active router receives a hello packet. This timer expires in accordance with the hold time value that is set in the corresponding field of the HSRP hello message.

* Standby timer – This timer is used to monitor the standby router. The timer starts any time the standby router receives a hello packet. This timer expires in accordance with the hold time value that is set in the respective hello packet.

* Hello timer – This timer is used to clock hello packets. All HSRP routers in any HSRP state generate a hello packet when this hello timer expires.

HSRP Priority and Preemption

Preemption enables the HSRP router with the highest priority to immediately become the active router. Priority is determined first by the configured priority value, and then by the IP address. In case of ties, the primary IP addresses are compared, and the higher IP address has priority. In each case, a higher value is of greater priority.

If you do not use the standby preempt interface configuration command in the configuration for a router, that router will not become the active router, even if its priority is higher than all other routers. A standby router with equal priority but a higher IP address will not preempt the active router.

When a router first comes up, it does not have a complete routing table. You can set a preemption delay that allows preemption to be delayed for a configurable time period. This delay period allows the router to populate its routing table before becoming the active router.

When the standby preempt command is issued, the interface changes to the appropriate state.

Hello Message Timers

An HSRP-enabled router sends hello messages to indicate that the router is running and is capable of becoming either the active or standby router.

The hello message contains the priority of the router and also hellotime and holdtime parameter values. The hellotime parameter value indicates the interval between the hello messages that the router sends. The holdtime parameter value indicates the amount of time that the current hello message is considered valid. The HSRP hellotime timer defaults to 3 seconds and the holdtime timer defaults to 10 seconds.

If an active router sends a hello message, receiving routers consider that hello message to be valid for one holdtime. The holdtime value should be at least three times the value of the hellotime. The holdtime value must be greater than the value of the hellotime.
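The two timer rules above (holdtime must exceed hellotime, and should be at least three times hellotime) can be checked with a short helper. This is an illustrative sketch, not router behavior; the function name and the returned strings are hypothetical.

```python
def check_hsrp_timers(hello_ms: int, hold_ms: int) -> str:
    """Classify a hellotime/holdtime pair, both given in milliseconds."""
    if hold_ms <= hello_ms:
        return "invalid"            # holdtime must be greater than hellotime
    if hold_ms < 3 * hello_ms:
        return "below recommended"  # allowed, but under the 3x guideline
    return "ok"


print(check_hsrp_timers(3000, 10000))  # defaults: ok
print(check_hsrp_timers(200, 750))     # sub-second example: ok
print(check_hsrp_timers(200, 500))     # below recommended
```

Both the default timers and the sub-second values shown earlier (200 ms and 750 ms) satisfy the three-times guideline.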

HSRP Interface Tracking

In some situations, the status of an interface directly affects which router needs to become the active router. This is particularly true when each of the routers in an HSRP group has a different path to resources within the campus network.

Interface tracking enables the priority of a standby group router to be automatically adjusted, based on availability of the interfaces of that router. When a tracked interface becomes unavailable, the HSRP priority of the router is decreased. The HSRP tracking feature reduces the likelihood that a router with an unavailable key interface will remain the active router.
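The effect of tracking on priority can be modeled as a simple subtraction. The sketch below is an assumption-laden illustration: the interface names, priorities, and decrement values are hypothetical, and the floor at zero is a modeling choice rather than documented behavior.

```python
def effective_priority(configured: int, tracked: dict, link_up: dict) -> int:
    """Return the HSRP priority after subtracting decrements for down interfaces.

    tracked maps interface name -> decrement; link_up maps interface name -> bool.
    """
    lost = sum(dec for ifname, dec in tracked.items() if not link_up.get(ifname, False))
    return max(configured - lost, 0)


# Hypothetical setup: router A (priority 110) tracks its uplink with a decrement
# of 20; its peer, router B, keeps the default priority of 100.
tracked = {"uplink": 20}
print(effective_priority(110, tracked, {"uplink": True}))   # 110, A stays active
print(effective_priority(110, tracked, {"uplink": False}))  # 90, now below B's 100
```

For the peer to actually take over once the priority drops, preemption must be configured on it, as described in the priority and preemption discussion.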

VRRP

Like HSRP, Virtual Router Redundancy Protocol (VRRP) allows a group of routers to form a single virtual router. The LAN workstations are then configured with the address of the virtual router as their default gateway. VRRP differs from HSRP in the following ways:

* VRRP is an IETF standard for router redundancy; HSRP is Cisco proprietary.

* The virtual router, representing a group of routers, is known as a VRRP group.

* The active router is referred to as the master virtual router.

* The master virtual router may have the same IP address as the virtual router group.

* Multiple routers can function as backup routers.

In the example, routers A, B, and C are members of a VRRP group. The IP address of the virtual router is the same as that of the LAN interface of router A (10.0.0.1). Router A is responsible for forwarding packets sent to this IP address.

The clients have a gateway address of 10.0.0.1. Routers B and C are backup routers. If the master router fails, the backup router with the highest priority becomes the master router. When router A recovers, it resumes the role of master router. The default priority is 100; if priorities are equal, the highest IP address wins. The master router can be configured to advertise its timers, and the backup routers can be configured to learn them.

VRRP offers these redundancy features:

* VRRP provides redundancy for the real IP address of a router, or for a virtual IP address shared among the VRRP group members.

* If a real IP address is used, the owning router becomes the master. If a virtual IP address is used, the master is the router with the highest priority.

* A VRRP group has one master router and one or more backup routers. The master router uses VRRP messages to inform group members of the IP addresses of the backup routers.

The master sends advertisements to the multicast address 224.0.0.18 at a default interval of 1 second. A VRRP flow message is similar in concept to an HSRP coup message. A master that advertises a priority of zero triggers a transition to a backup router. The result is similar to an HSRP resign message.

The dynamic failover, when the active (master) becomes unavailable, uses two timers within VRRP: the advertisement interval and the master-down interval. The advertisement interval is the time interval between advertisements (seconds). The default interval is 1 second. The master-down interval is the time interval for backup to declare the master down (seconds).
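The text above does not say how the master-down interval is derived. The sketch below uses the formula from RFC 3768 (three advertisement intervals plus a skew time that depends on priority), which is brought in here as outside context rather than taken from these notes; the function name is hypothetical.

```python
def master_down_interval(priority: int, advertisement_interval: float = 1.0) -> float:
    """Seconds a backup waits without advertisements before declaring the master down."""
    # Per RFC 3768: skew time = (256 - priority) / 256, so higher-priority
    # backups time out slightly sooner and take over first.
    skew_time = (256 - priority) / 256
    return 3 * advertisement_interval + skew_time


print(master_down_interval(100))  # 3.609375 with the default 1-second interval
print(master_down_interval(254))  # 3.0078125
```

The skew term is what lets the highest-priority backup win the race to become master without an explicit election exchange.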

GLBP

While HSRP and VRRP provide gateway resiliency, the standby members of the redundancy group, along with their upstream bandwidth, are not used while the devices are in standby mode. Only the active router for the HSRP or VRRP group forwards traffic for the virtual MAC. Resources associated with the standby router are not fully utilized. Some load balancing can be accomplished with these protocols through the creation of multiple groups and through the assignment of multiple default gateways, but this configuration creates administrative burden.

Cisco designed Gateway Load Balancing Protocol (GLBP) to allow automatic selection and simultaneous use of multiple, available gateways, as well as automatic failover between those gateways. Multiple routers share the load of frames that, from a client perspective, are sent to a single default gateway address. With GLBP, resources can be fully utilized without the administrative burden of configuring multiple groups and managing multiple default gateway configurations as is required with HSRP and VRRP.

GLBP allows automatic selection and simultaneous use of all available gateways in the group. The members of a GLBP group elect one gateway to be the Active Virtual Gateway (AVG) for that group. Other members of the group provide backup for the AVG should it become unavailable. The AVG assigns a virtual MAC address to each member of the GLBP group. Each member becomes the active virtual forwarder (AVF) for frames addressed to its assigned virtual MAC address. As clients send ARP requests for the address of the default gateway, the AVG returns these virtual MAC addresses in the ARP replies. A GLBP group can have up to four group members. The AVG is the member with the highest priority; if priorities are equal, the member with the highest IP address is selected.

GLBP supports these operational modes for load balancing traffic across multiple default routers servicing the same default gateway IP address:

* Weighted load-balancing algorithm – The amount of load directed to a router is dependent upon the weighting value advertised by that router.

* Host-dependent load-balancing algorithm – A host is guaranteed to use the same virtual MAC address as long as that virtual MAC address is participating in the GLBP group.

* Round-robin load-balancing algorithm – As clients send ARP requests to resolve the MAC address of the default gateway, the reply to each client contains the MAC address of the next possible router in round-robin fashion. Each router's MAC address takes turns being included in address-resolution replies for the default gateway IP address.

Host Dependent is required when an application requires traffic flows to be tracked (for example, when using NAT). Round Robin is the recommended default and is suitable for all other requirements. Weighted can be used if there are disparities in the capabilities of gateways in the GLBP group. None provides no load balancing.

GLBP automatically manages the virtual MAC address assignment, determines who handles the forwarding, and ensures that each station has a forwarding path in the event of failures to gateways or tracked interfaces. If failures occur, the load-balancing ratio is adjusted among the remaining active virtual forwarders so that resources are used in the most efficient way. Like HSRP, GLBP can be configured to track interfaces. GLBP members communicate with each other through hello messages sent every 3 seconds to the multicast address 224.0.0.102, User Datagram Protocol (UDP) port 3222 (source and destination).

GLBP Gateway Weighting and Tracking

GLBP uses a weighting scheme to determine the forwarding capacity of each router in the GLBP group. The weighting assigned to a router in the GLBP group determines whether it will forward packets and, if so, the proportion of hosts in the LAN for which it will forward packets. Thresholds can be set to disable forwarding when the weighting falls below a certain value, and when it rises above another threshold, forwarding is automatically reenabled.

The GLBP group weighting can be automatically adjusted by tracking the state of an interface within the router. If a tracked interface goes down, the GLBP group weighting is reduced by a specified value. Different interfaces can be tracked to decrement the GLBP weighting by varying amounts.
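The interaction between tracked interfaces and the two weighting thresholds can be modeled as a small state holder. This is a sketch under stated assumptions: the class, the interface names, and the numeric values are hypothetical, and treating a weight equal to the upper threshold as sufficient to resume forwarding is an assumption.

```python
class GlbpWeighting:
    """Toy model of GLBP weighting with lower/upper forwarding thresholds."""

    def __init__(self, maximum: int, lower: int, upper: int):
        self.maximum, self.lower, self.upper = maximum, lower, upper
        self.forwarding = True

    def update(self, tracked: dict, link_up: dict) -> int:
        """Recompute the weight; tracked maps interface -> decrement."""
        weight = self.maximum - sum(
            dec for ifname, dec in tracked.items() if not link_up[ifname]
        )
        if self.forwarding and weight < self.lower:
            self.forwarding = False   # fell below the lower threshold
        elif not self.forwarding and weight >= self.upper:
            self.forwarding = True    # recovered to the upper threshold
        return weight


gw = GlbpWeighting(maximum=100, lower=85, upper=95)
tracked = {"uplink1": 10, "uplink2": 10}
print(gw.update(tracked, {"uplink1": False, "uplink2": False}), gw.forwarding)  # 80 False
print(gw.update(tracked, {"uplink1": True, "uplink2": False}), gw.forwarding)   # 90 False
print(gw.update(tracked, {"uplink1": True, "uplink2": True}), gw.forwarding)    # 100 True
```

The gap between the two thresholds provides hysteresis: with one uplink restored the weight is 90, which is above the lower threshold but not enough to resume forwarding.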

Load Balancing

HOST DEPENDENT

The MAC address of a host is used to determine which VF MAC address the host is directed towards. This ensures that a host will be guaranteed to use the same virtual MAC address as long as the number of VFs in the GLBP group is constant.

Host dependent load balancing must be used with stateful Network Address Translation (NAT) because stateful NAT requires each host to be returned the same virtual MAC address each time it sends an ARP request for the virtual IP address.

Host dependent load balancing is not recommended for situations where there are a small number of end hosts, for example fewer than 20, unless there is also a requirement that individual hosts must always use the same forwarder. The larger the number of hosts, the less likely it is to have an imbalance in distribution across forwarders.

This method uses an algorithm designed to equally distribute hosts among forwarders. This distribution changes only when the number of forwarders permanently changes.

WEIGHTED

This is the ability of GLBP to place a weight on each device when calculating the amount of load sharing that will occur through MAC assignment. Each GLBP router in the group will advertise its weighting and assignment; the AVG will act based on that value.

For example, if there are two routers in a group and router A has double the forwarding capacity of router B, the weighting value of router A should be configured to be double the amount of router B.
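The proportional split implied by this example can be computed directly. The sketch below is an illustration only: the weighting values 200 and 100 are hypothetical, and it models the expected share of hosts rather than the AVG's actual assignment procedure.

```python
from fractions import Fraction


def expected_share(weights: dict) -> dict:
    """Return each router's expected fraction of hosts, proportional to its weighting."""
    total = sum(weights.values())
    return {name: Fraction(w, total) for name, w in weights.items()}


# Router A has double the forwarding capacity of router B.
shares = expected_share({"A": 200, "B": 100})
print(shares["A"], shares["B"])  # 2/3 1/3
```

With a two-to-one weighting, router A is expected to serve about two thirds of the hosts on the LAN.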

ROUND ROBIN

With Round Robin each VF MAC address is used sequentially in ARP replies for the virtual IP address. Round Robin load balancing is suitable for any number of end hosts.
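Sequential use of the VF MAC addresses can be sketched with a cycling iterator. The names below are hypothetical, and the MAC values are placeholders rather than real GLBP virtual MAC addresses.

```python
from itertools import cycle


def make_arp_responder(vf_macs):
    """Return a function that answers each ARP request with the next VF MAC in turn."""
    ring = cycle(vf_macs)

    def reply(_client):
        # The requesting client does not influence the choice in round robin.
        return next(ring)

    return reply


reply = make_arp_responder(["vmac-1", "vmac-2", "vmac-3"])
print([reply(host) for host in ("h1", "h2", "h3", "h4")])
# ['vmac-1', 'vmac-2', 'vmac-3', 'vmac-1']
```

Because the answer ignores the requesting host, the same host may receive a different virtual MAC address on a later ARP request, which is why host dependent balancing is preferred when a host must keep the same forwarder.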

GLBP Benefits

Load Sharing

You can configure GLBP in such a way that traffic from LAN clients can be shared by multiple routers, thereby sharing the traffic load more equitably among available routers.

Multiple Virtual Routers

GLBP supports up to 1024 virtual routers (GLBP groups) on each physical interface of a router, and up to 4 virtual forwarders per group.

Preemption

The redundancy scheme of GLBP enables you to preempt an active virtual gateway with a higher priority backup virtual gateway that has become available. Forwarder preemption works in a similar way, except that forwarder preemption uses weighting instead of priority and is enabled by default.

Authentication

You can use a simple text password authentication scheme between GLBP group members to detect configuration errors. A router within a GLBP group with a different authentication string than other routers will be ignored by other group members.

RPR/RPR+

A Catalyst switch can allow a standby supervisor engine to take over if the primary supervisor engine fails. This allows the switch to resume operation quickly and efficiently in the event of a supervisor engine failure. This capability is called supervisor engine redundancy. In software, this capability is enabled by a feature called Route Processor Redundancy (RPR).

When RPR+ mode is used, the redundant Supervisor Engine is fully initialized and configured, and the MSFC and the PFCs are fully operational. This provides a faster failover than RPR, in which the redundant Supervisor Engine is only partially booted.

The active Supervisor Engine checks the IOS version of the redundant Supervisor Engine when it boots. If the image on the redundant Supervisor Engine does not match the image on the active Supervisor Engine, RPR redundancy mode is used rather than RPR+.

The differences between the two RPR modes are:

* RPR leaves the standby MSFC and PFC nonoperational until a failover occurs.

* RPR+ places the standby MSFC and PFC in an operational mode upon boot, thereby providing faster failover.

* RPR+ maintains synchronization of the running-configuration file between the two Supervisor Engines.

* Both RPR and RPR+ maintain synchronization of the startup-configuration file between the two Supervisor Engines.
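
The mode fallback and the synchronization differences can be summarized in a short model. This is only a sketch of the rules stated above; the mode labels and the image names in the usage line are hypothetical placeholders, not real IOS output.

```python
# Files kept in sync between the two Supervisor Engines in each mode.
SYNCED_FILES = {
    "rpr": {"startup-config"},
    "rpr-plus": {"startup-config", "running-config"},
}


def effective_mode(configured, active_image, redundant_image):
    """Return the redundancy mode actually used.

    RPR+ requires matching images; on a mismatch the switch uses RPR.
    """
    if configured == "rpr-plus" and active_image != redundant_image:
        return "rpr"
    return configured


mode = effective_mode("rpr-plus", "image-a", "image-b")  # hypothetical image names
print(mode, sorted(SYNCED_FILES[mode]))
```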

Catalyst 6500

The Catalyst 6500 platform provides Layer 3 functionality through a Multilayer Switch Feature Card (MSFC) residing on the Supervisor Engine module. As of this writing, the current iteration is the MSFC3, which is an integral part of the Supervisor Engine 720. The MSFC3 adds high performance, multilayer switching, and routing intelligence to the Catalyst. Equipped with a high-performance processor, the MSFC3 runs Layer 2 protocols on one CPU and Layer 3 protocols on the second CPU. These protocols include VLAN Trunking Protocol, routing protocols, multimedia services, security services, and nearly any other protocol capable of running on the high-end Cisco routing platforms.

The MSFC builds the Cisco Express Forwarding (CEF) Forwarding Information Base (FIB) table in software and downloads this table to the hardware (ASICs) on the Policy Feature Card (PFC) and any installed Distributed Forwarding Card (DFC).

An MSFC3 with PFC3 on a Supervisor Engine 720 adds Stateful Switchover (SSO) and Nonstop Forwarding (NSF) to the arsenal of Catalyst fault tolerance features.

SSO

When a redundant supervisor engine runs in SSO mode, the redundant supervisor engine starts up in a fully-initialized state and synchronizes with the persistent configuration and the running configuration of the active supervisor engine. It subsequently maintains the state of the Layer 2 protocols, and all changes in hardware and software states for features that support stateful switchover are kept in sync. Consequently, it offers zero interruption to Layer 2 sessions in a redundant supervisor engine configuration. SSO is supported in 12.2(20)EWA and later releases.

Because the redundant supervisor engine recognizes the hardware link status of every link, ports that were active before the switchover will remain active, including the uplink ports. However, because uplink ports are physically on the supervisor engine, they will be disconnected ONLY if the supervisor engine is removed.

If the active supervisor engine fails, the redundant supervisor engine becomes active. This newly active supervisor engine uses existing Layer 2 switching information to continue forwarding traffic. Layer 3 forwarding will be delayed until the routing tables have been repopulated in the newly active supervisor engine.

SRM

In SRM redundancy, only the designated router (MSFC) is visible to the network at any given time. Dual Router Mode (DRM) had both MSFCs active and used HSRP to maintain an active and secondary relationship. DRM had the problem of extra complexity and routing protocol peering, which is overcome by using SRM. The non-designated router is booted up completely and participates in configuration synchronization, which is automatically enabled when entering SRM. The configuration of the non-designated router is exactly the same as that of the designated router, but its interfaces are kept in a "line down" state and are not visible to the network. Processes, such as routing protocols, are created on both the non-designated router and the designated router, but because all non-designated router interfaces are in a "line down" state, they do not send or receive updates from the network.

When the designated router fails, the non-designated router changes its state to become the designated router and its interface state changes to "link up." It builds its routing table while the existing Supervisor engine switch processor entries are used to forward Layer 3 traffic. After the newly designated router builds its routing table, the entries in the switch processor are updated.

Because only one MSFC is visible to the network at a given time, multiple BGP peering sessions do not have to exist between two MSFCs. In the event of a failure of the designated MSFC, the non-designated MSFC re-establishes BGP peering. Therefore, it always appears as a single BGP peer to the network and simplifies the network design, but it gives the same level of redundancy in case an MSFC has a failure.

Failure occurrence

When the switch is powered on, SRM with SSO runs between the two Supervisor Engines. The Supervisor Engine that boots first becomes the active Supervisor Engine. Its Multilayer Switch Feature Card 3 (MSFC3) and Policy Feature Card 3 (PFC3) become fully operational.

If the active Supervisor Engine 720 or MSFC3 fails, the redundant Supervisor Engine 720 and MSFC3 become active. The newly active Supervisor Engine 720 uses the existing PFC3 Layer 3 switching information to forward traffic while the newly active MSFC3 builds its routing table.

The routing protocols have to establish connectivity with their neighbor or peers and the Routing Information Base is built. During this time packet forwarding cannot take place.

CAUTION:

Before going from Dual Router Mode (DRM) to SRM redundancy, Cisco recommends that you use the copy running-config command on the MSFCs to save the non-SRM configuration to boot flash memory. When going to SRM redundancy, the alternative configuration (the configuration following the alt keyword) is lost. Therefore, before enabling SRM redundancy, save the DRM configuration to boot flash memory by entering the following command on both MSFCs: copy running-config bootflash:nosrm_dual_router_config.

NSF

Cisco NSF always runs with SSO and provides redundancy for Layer 3 traffic. NSF works with SSO to minimize the amount of time that a network is unavailable to its users following a switchover. The main purpose of NSF is to continue forwarding IP packets following a supervisor engine switchover and during the subsequent reestablishment of the routing protocol peering relationships.

Cisco NSF is supported by the BGP, OSPF, IS-IS, LDP, RSVP, AToM and EIGRP protocols for routing and is supported by Cisco Express Forwarding (CEF) for forwarding. The routing protocols have been enhanced with NSF capability and awareness, which means that remote routers running these protocols and configured for NSF can detect a switchover, recover route information from the peer devices, and take the necessary actions to continue forwarding network traffic. The IS-IS protocol can be configured to use state information that has been synchronized between the active and the redundant supervisor engine to recover route information following a switchover, rather than using information received from peer devices.

A networking device is NSF-aware if it is running NSF-compatible software. A device is NSF-capable if it has been configured to support NSF; it will rebuild routing information from NSF-aware or NSF-capable neighbors.

Each protocol depends on CEF to continue forwarding packets during switchover while the routing protocols rebuild the Routing Information Base (RIB) tables. After the routing protocols have converged, CEF updates the FIB table and removes stale route entries. CEF then updates the line cards with the new FIB information.
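
The life cycle of FIB entries across a switchover can be sketched as follows. This is a simplified model of the behavior described above, not the CEF data structures; the function names, the "stale"/"fresh" labels, and the prefixes are assumptions chosen for illustration.

```python
def forward_during_switchover(preserved_fib):
    """Keep forwarding on every preserved FIB entry, but mark it stale."""
    return {prefix: (next_hop, "stale") for prefix, next_hop in preserved_fib.items()}


def after_convergence(stale_fib, rib):
    """Once the routing protocols have rebuilt the RIB, install its routes
    as fresh entries and report the stale prefixes that were removed."""
    fresh_fib = {prefix: (next_hop, "fresh") for prefix, next_hop in rib.items()}
    removed = sorted(set(stale_fib) - set(rib))
    return fresh_fib, removed


fib = {"10.1.0.0/16": "A", "10.2.0.0/16": "B"}
stale = forward_during_switchover(fib)
# After convergence the route to 10.2.0.0/16 is gone and a new one appears.
new_fib, removed = after_convergence(stale, {"10.1.0.0/16": "A", "10.3.0.0/16": "C"})
print(new_fib, removed)
```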

Cisco NSF provides these benefits:

* Improved network availability.

* Network stability may be improved with the reduction in the number of route flaps.

* Because the interfaces remain up throughout a switchover, neighboring routers do not detect a link flap (the link does not go down and come back up).

* User sessions established before the switchover are maintained.

The routing protocols run only on the MSFC of the active supervisor engine, and they receive routing updates from their neighbor routers. Routing protocols do not run on the MSFC of the redundant supervisor engine. Following a switchover, the routing protocols request that the NSF-aware neighbor devices send state information to help rebuild the routing tables. Alternatively, the IS-IS protocol can be configured to synchronize state information from the active to the redundant supervisor engine to help rebuild the routing table on the NSF-capable device in environments where neighbor devices are not NSF-aware. Cisco NSF supports the BGP, OSPF, IS-IS, and EIGRP protocols.

NOTE:

For NSF operation, the routing protocols depend on CEF to continue forwarding packets while the routing protocols rebuild the routing information.

EIGRP Operation

When an EIGRP NSF-capable router initially comes back up from an NSF restart, it has no neighbors and its topology table is empty. The router is notified by the redundant (now active) supervisor engine when it needs to bring up the interfaces, reacquire neighbors, and rebuild the topology and routing tables. The restarting router and its peers must accomplish these tasks without interrupting the data traffic directed toward the restarting router. EIGRP peer routers maintain the routes learned from the restarting router and continue forwarding traffic through the NSF restart process.

BGP Operation

When an NSF-capable router begins a BGP session with a BGP peer, it sends an OPEN message to the peer. Included in the message is a statement that the NSF-capable device has "graceful restart" capability. Graceful restart is the mechanism by which BGP routing peers avoid a routing flap following a switchover. If the BGP peer has received this capability, it is aware that the device sending the message is NSF-capable. Both the NSF-capable router and its BGP peers need to exchange the graceful restart capability in their OPEN messages at the time of session establishment. Unless both peers exchange the graceful restart capability, the session will not be graceful-restart-capable.

OSPF Operation

When an OSPF NSF-capable router performs a supervisor engine switchover, it must perform the following tasks in order to resynchronize its link state database with its OSPF neighbors:

* Relearn the available OSPF neighbors on the network without causing a reset of the neighbor relationship

* Reacquire the contents of the link state database for the network

As quickly as possible after a supervisor engine switchover, the NSF-capable router sends an OSPF NSF signal to neighboring NSF-aware devices. Neighbor networking devices recognize this signal as an indicator that the neighbor relationship with this router should not be reset. As the NSF-capable router receives signals from other routers on the network, it can begin to rebuild its neighbor list.

IS-IS Operation

When an IS-IS NSF-capable router performs a supervisor engine switchover, it must perform the following tasks in order to resynchronize its link state database with its IS-IS neighbors:

* Relearn the available IS-IS neighbors on the network without causing a reset of the neighbor relationship

* Reacquire the contents of the link state database for the network

The IS-IS NSF feature offers two options when you configure NSF:

* Internet Engineering Task Force (IETF) IS-IS

* Cisco IS-IS

The Catalyst 4500 and 6500 switches support fault resistance by allowing a redundant supervisor engine to take over if the primary supervisor engine fails. Cisco NSF works with SSO to minimize the amount of time the routing protocols require to rebuild their tables following a switchover. Catalyst 4500 and 6500 series switches also support route processor redundancy (RPR), route processor redundancy plus (RPR+), and single router mode with stateful switchover (SRM with SSO) for redundancy.

The following events cause a switchover:

* A hardware failure on the active supervisor engine

* Clock synchronization failure between supervisor engines

* A manual switchover

NSF Configuration

NSF Benefits and Restrictions

Cisco NSF provides these benefits:

• Improved network availability

NSF continues forwarding network traffic and application state information so that user session information is maintained after a switchover.

• Overall network stability

Network stability may be improved with the reduction in the number of route flaps that had been created when routers in the network failed and lost their routing tables.

• Neighboring routers do not detect a link flap

Because the interfaces remain up throughout a switchover, neighboring routers do not detect a link flap (the link does not go down and come back up).

• Prevents routing flaps

Because SSO continues forwarding network traffic in the event of a switchover, routing flaps are avoided.

• No loss of user sessions

User sessions established before the switchover are maintained.

Cisco NSF with SSO has these restrictions:

• For NSF operation, you must have SSO configured on the device.

• NSF with SSO supports IP Version 4 traffic and protocols only.

• The Hot Standby Router Protocol (HSRP) is not SSO-aware, meaning state information is not maintained between the active and standby supervisor engine during normal operation. HSRP and SSO can coexist but both features work independently. Traffic that relies on HSRP may switch to the HSRP standby in the event of a supervisor switchover.

• The Gateway Load Balancing Protocol (GLBP) is not SSO-aware, meaning state information is not maintained between the active and standby supervisor engine during normal operation. GLBP and SSO can coexist but both features work independently. Traffic that relies on GLBP may switch to the GLBP standby in the event of a supervisor switchover.

• The Virtual Router Redundancy Protocol (VRRP) is not SSO-aware, meaning state information is not maintained between the active and standby supervisor engine during normal operation. VRRP and SSO can coexist but both features work independently. Traffic that relies on VRRP may switch to the VRRP standby in the event of a supervisor switchover.

• Multiprotocol Label Switching (MPLS) is not supported with Cisco NSF with SSO; however, MPLS and NSF with SSO can coexist. If NSF with SSO is configured in the same chassis with MPLS, the failover performance of MPLS protocols will be at least equivalent to RPR+ while the supported NSF with SSO protocols still retain the additional benefits of NSF with SSO.

• All neighboring devices participating in BGP NSF must be NSF-capable and configured for BGP graceful restart.

• OSPF NSF for virtual links is not supported.

• All OSPF networking devices on the same network segment must be NSF-aware (running an NSF software image).

• For IETF IS-IS, all neighboring devices must be running an NSF-aware software image.

• Multicast NSF with SSO is supported by the Supervisor Engine 720 only.

• The underlying unicast protocols must be NSF-aware in order to use multicast NSF with SSO.

• Bidirectional forwarding detection (BFD) is not SSO-aware and is not supported by NSF with SSO.

RPS

Redundant Power Supplies can work in 2 modes:

Combined mode disables redundancy. The power available to the system is the combined power capability of both power supplies. The system powers up as many modules as the combined capacity allows. If one supply should fail and there is not enough power for all previously powered-up modules, the system powers down those modules for which there is not enough power.

Redundant mode enables redundancy. In a redundant configuration, the total power drawn is at no time greater than the capability of one supply. If one supply malfunctions, the other supply can take over the entire system load. During normal operation with two power supplies, each provides approximately half of the required power to the system. Load sharing and redundancy are enabled automatically; no software configuration is required.
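
The difference between the two modes comes down to the power budget the system allows itself. The following sketch models that budget; the wattage figures, the module names, and the slot-order policy for deciding which modules stay powered are assumptions for illustration and do not describe any specific chassis.

```python
def powered_modules(mode, supply_watts, working_supplies, modules):
    """Return the names of the modules that can be powered up.

    modules is a list of (name, watts) in slot order.
    """
    if working_supplies < 1:
        return []
    if mode == "combined":
        # Combined mode: the budget is the sum of all working supplies.
        budget = supply_watts * working_supplies
    else:
        # Redundant mode: never draw more than a single supply can deliver.
        budget = supply_watts
    powered = []
    for name, watts in modules:
        if watts <= budget:
            budget -= watts
            powered.append(name)
    return powered


modules = [("sup", 300), ("lc1", 400), ("lc2", 400), ("lc3", 400)]
print(powered_modules("combined", 1000, 2, modules))   # both supplies healthy
print(powered_modules("combined", 1000, 1, modules))   # one supply has failed
print(powered_modules("redundant", 1000, 1, modules))  # one supply has failed
```

In this model a supply failure in combined mode powers down modules, while in redundant mode the set of powered modules is the same before and after the failure.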

The power to a module can be cycled (reset). The power cycle will turn the module power off for 5 seconds and then back on. To power cycle a module, issue this command:

Network redundancy

Providing hardware redundancy in a switched network can be accomplished by implementing redundant modules within devices or by deploying redundant devices.

To achieve network availability as close to 100 percent of the time as possible, these network components are required:

* Reliable, fault-tolerant network devices – Hardware and software reliability to automatically identify and overcome failures.

* Device and link redundancy – Entire devices may be redundant or modules within devices can be redundant. Links may also be redundant.

* Resilient network technologies – Intelligence that ensures fast recovery around any device or link failure.

* Optimized network design – Well-defined network topologies and configurations designed to ensure there is no single point of failure.

* Best practices – Documented procedures for deploying and maintaining a robust e-commerce network infrastructure.

Network fault tolerance indicates the ability of a device or network to recover from the failure of a component or device. Achieving high availability relies on eliminating any single point of failure and on distributing intelligence throughout the architecture. You can increase availability by adding redundant components, including redundant network devices and connections to redundant Internet services. With the proper design, no single point of failure will have an impact on the availability of the overall system.
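
The effect of removing a single point of failure can be estimated with standard availability arithmetic. The sketch below assumes that component failures are independent, which real networks only approximate, and the 99 percent figure is a hypothetical example value.

```python
def series(*availabilities):
    """Availability of components that must all work (a chain)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(*availabilities):
    """Availability of redundant components where any one is sufficient."""
    all_fail = 1.0
    for a in availabilities:
        all_fail *= 1.0 - a
    return 1.0 - all_fail


single = 0.99                                # one device, 99 percent available
print(series(single, single))                # two such devices in a chain
print(parallel(single, single))              # the same two devices in parallel
```

Chaining two devices lowers availability, while duplicating a device raises it, which is why redundancy is applied at every point where a single failure would otherwise stop the system.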

Benefits and drawbacks of device-level fault tolerance

One approach to building highly available networks is to replicate all devices to create a fault-tolerant network. To achieve high end-to-end availability, each key network infrastructure device exists in duplicate. Fault tolerance through device replication offers these benefits:

* Minimizes time periods during which the system is non-responsive to requests (for example, while the system is being reconfigured because of a component failure or recovery)

* Eliminates all single points of failure that would cause the system to stop

* Provides disaster protection by allowing the major system components to be separated geographically

Trying to achieve high network availability solely through device-level fault tolerance has a number of drawbacks:

* Massive redundancy within each device adds significantly to its cost. Massive redundancy also reduces physical capacity of each device by consuming slots that could otherwise house network interfaces or provide useful network services.

* Redundant subsystems within devices are often maintained in a hot-standby mode. In hot standby mode, such redundant subsystems cannot contribute additional performance because they are only fully activated when the primary component fails.

* Focusing on device-level hardware reliability may result in a number of other failure mechanisms being overlooked. Network elements are not standalone devices; they are components of a network system whose internal operations and system-level interactions are governed by software and configuration parameters.

Benefits and Drawbacks of redundant network topology

A complementary way to build highly available networks is to provide redundancy in the links between devices in the network topology. In the campus network design shown in the figure, there is a backup for every link and for every network device in the path between the client and server. Using network links to supplement device fault tolerance has these advantages:

* The network elements providing redundancy can be geographically disparate. This reduces the probability that problems with the physical environment will interrupt service.

* Software errors and changes can be dealt with separately in the primary and secondary forwarding paths without completely interrupting service.

* Device-level fault tolerance can be concentrated in the Building Core and Building Distribution layers of the network where a hardware failure would affect a larger number of users. By partially relaxing the requirement for device-level fault tolerance, the cost per network device is reduced. To some degree, this offsets the requirement for more devices.

* Redundant links provided for fault tolerance can be used to balance the traffic load between the respective layers of the network topology (that is, Building Access to Building Distribution and Building Distribution to Building Core) during times of normal operation. Therefore, network-level redundancy can also provide increased aggregate performance and capacity.

* Redundant resources can be configured to fail over from primary to secondary facilities automatically. Failover times can be as low as sub-second, depending on the failure mode.

* Fast EtherChannel and Gigabit EtherChannel provide both fault-tolerance and high-speed links between switches with minimal convergence times in the event of link loss.

Redundancy with stacked switches

Stacking access switches has become commonplace. Behavior of a switch stack in the event of a failure depends on its application. Stacked switches are sometimes used to implement hardware redundancy and high port density at the access layer.

Rather than having redundant uplinks between each access and distribution device, the stack as a whole represents a single logical switch with redundant links between the stack and the distribution layer.

Layer 3 Failure with Stacked Switches

Consider a failure of either a middle switch or cable in the switch stack.

The stack maintains the Layer 2 connectivity between the distribution switches. When a link between switches in the stack fails, HSRP packets are no longer sent between the two distribution switches. This causes the standby HSRP router to transition to active and advertise itself as the default gateway. Traffic from SW1 uses distribution SWA as the active gateway but now traffic from SW2 and SW3 uses distribution SWB as the default gateway.

As the distribution switches announce routes into the core, the VLAN interfaces on both distribution switches will advertise reachability to the IP subnet of the switch stack. These VLAN interfaces will present themselves as equal-cost paths to the subnet. When return-path traffic, potentially load balanced, arrives at each distribution VLAN interface, some percentage will not be able to reach the originating end system because it is on the wrong side of the failure.

Loopback Cable to Maintain Layer 2 Path

By installing a loopback cable between the end switches of the switch stack, the Layer 2 path in the segment has redundancy that can be maintained by STP. HSRP communication can now be maintained between the distribution switches if a Layer 2 link failure occurs. Ideally, the HSRP and STP failover times would be closely aligned so that there is little time when connectivity is compromised. Rapid spanning tree should be implemented if at all possible.

NOTE:

Stack redundancy software and hardware solutions, such as StackWise in the Catalyst 3750, can avoid most of the issues associated with stacked switches that have no common backplane.

High availability: access layer best practices

When deploying the Campus Infrastructure module, adopting best practice recommendations at the access layer means providing a highly available and deterministic Layer 2 network. It is generally assumed that high availability in the access layer will be accomplished through link redundancy between the access and distribution layers, with STP managing the use of those links. Redundant links to individual user devices are not typical.

These are best practices to follow when establishing highly available access layer devices.

* Limit VLANs to a single access switch or switch stack. Spanning VLANs across switches may be necessary in some instances but should be avoided if at all possible.

* Leave the Spanning Tree Protocol active even if there are no redundant Layer 2 links in the network. This will guard against the attachment of rogue switches.

* Rapid spanning tree is preferred to keep convergence times to 1-2 seconds.

* Set trunks permanently on to avoid auto-negotiation and security issues.

* If two different versions of Cisco software exist at either end of a trunk link, ensure that trunk parameters are manually set to match one another.

* Disable VTP or run it in transparent mode only.

NOTE:

Consider using multilayer switches and routing at the access layer to avoid the use of spanning tree and minimize convergence time.

High availability: distribution layer best practices

Adopting these best practice recommendations at the distribution layer will support the intent of providing a highly available and deterministic network.

* Connect distribution switches with a Layer 3 EtherChannel link.

* Use equal-cost redundant connections between the distribution and core for fastest convergence and to avoid black holes.

* Summarization is required to facilitate optimum EIGRP or OSPF convergence. If summarization is implemented at the distribution layer, the distribution nodes must be linked or routing black holes occur.

* Utilize GLBP/HSRP millisecond timers. Convergence around a link or node failure in the L2/L3 distribution boundary model depends on default gateway redundancy and failover. Millisecond timers can reliably be implemented to achieve sub-second (800 ms) convergence based on HSRP/GLBP failover.

* Tune GLBP/HSRP preempt delay to avoid black holes. Delay preemption of the HSRP/GLBP standby peer so that traffic is not dropped while connectivity to the core is established. The delay should be adjusted to ensure that the node is ready to forward traffic before it preempts.

The hierarchical campus model implements multiple L3 equal-cost paths and traffic should be load balanced across these paths from the access layer across the distribution and core. The CEF hashing algorithm should be tuned at the core and distribution layers to vary decision input and avoid CEF polarization which can result in under-utilization of redundant paths. Use the default L3 information for the core nodes and use L3 with L4 information for the distribution nodes.
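
CEF polarization can be demonstrated with a toy model. In the sketch below a deliberately simple hash (the sum of the chosen fields modulo the number of paths) stands in for the real CEF hashing algorithm, which it does not reproduce; the flow values and function names are also assumptions made for illustration.

```python
def pick(fields, n_paths=2):
    """Toy stand-in for a CEF hash: sum of the fields modulo the path count."""
    return sum(fields) % n_paths


def core_links_used(flows, dist_key, core_key):
    """Return the set of uplinks that core node 0 selects for the flows
    which the distribution node hashed onto its path 0."""
    used = set()
    for flow in flows:
        if pick(dist_key(flow)) == 0:        # flow is sent to core node 0
            used.add(pick(core_key(flow)))   # core node 0 picks its own uplink
    return used


def l3(flow):
    """Hash input: Layer 3 information only."""
    return (flow["src"], flow["dst"])


def l3_l4(flow):
    """Hash input: Layer 3 plus Layer 4 information."""
    return (flow["src"], flow["dst"], flow["sport"], flow["dport"])


flows = [{"src": s, "dst": 100, "sport": p, "dport": 80}
         for s in range(8) for p in range(1024, 1032)]

# Same input at both layers: every flow reaching core node 0 picks the same uplink.
print(core_links_used(flows, l3, l3))
# L3 with L4 at distribution, L3 at core: both core uplinks carry traffic.
print(core_links_used(flows, l3_l4, l3))
```

When both layers hash on identical input, the core node repeats the decision the distribution node already made, so one of its equal-cost uplinks stays idle. Varying the input between layers removes that correlation.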

VTP Pruning

VTP pruning enhances network bandwidth use by reducing unnecessary flooded traffic, such as broadcast, multicast, unknown, and flooded unicast packets. VTP pruning increases available bandwidth by restricting flooded traffic to those trunk links that the traffic must use to access the appropriate network devices. By default, VTP pruning is disabled.

Make sure that all devices in the management domain support VTP pruning before enabling it.

Figure 9-1 shows a switched network with VTP pruning disabled. Port 1 on Switch 1 and port 2 on Switch 4 are assigned to the Red VLAN. A broadcast is sent from the host connected to Switch 1. Switch 1 floods the broadcast and every switch in the network receives it, even though Switches 3, 5, and 6 have no ports in the Red VLAN.

Figure 9-1 Flooding Traffic without VTP Pruning

Figure 9-2 shows the same switched network with VTP pruning enabled. The broadcast traffic from Switch 1 is not forwarded to Switches 3, 5, and 6 because traffic for the Red VLAN has been pruned on the links indicated (port 5 on Switch 2 and port 4 on Switch 4).

Figure 9-2 Flooding Traffic with VTP Pruning

Enabling VTP pruning on a VTP server enables pruning for the entire management domain. VTP pruning takes effect several seconds after you enable it. By default, VLANs 2 through 1001 are pruning eligible. VTP pruning does not prune traffic from VLANs that are pruning ineligible. VLAN 1 is always pruning ineligible; traffic from VLAN 1 cannot be pruned.
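
The pruning decision for flooded traffic on a single trunk can be sketched as follows. This is a simplified model of the rules above, assuming the default pruning-eligible range; the function and parameter names are hypothetical.

```python
PRUNING_ELIGIBLE = range(2, 1002)  # default: VLANs 2 through 1001


def flood_on_trunk(vlan, downstream_has_ports, pruning_enabled=True):
    """Decide whether flooded traffic for a VLAN is sent over a trunk.

    downstream_has_ports is True when a switch reachable through the
    trunk has at least one port in the VLAN.
    """
    if not pruning_enabled:
        return True                  # without pruning, everything is flooded
    if vlan not in PRUNING_ELIGIBLE:
        return True                  # ineligible VLANs (such as VLAN 1) are never pruned
    return downstream_has_ports


print(flood_on_trunk(10, downstream_has_ports=False))  # pruned
print(flood_on_trunk(1, downstream_has_ports=False))   # VLAN 1 is always flooded
```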

Layer 2 and 3 redundancy alignment

When implementing strategies for failover at the access and distribution layers, it is important that the failover paths and timers are aligned between the Layer 2 failover protocol (STP) and the Layer 3 failover protocol (HSRP or GLBP). This would be most significant if the link between the distribution switches was a Layer 2 link and therefore hosting a redundant Layer 2 path for the VLANs in the access layer. Even when it is a Layer 3 link, alignment of the protocols is still a best practice in the event that a rogue switch is placed on the network.

In the example, one distribution switch is configured as the HSRP active router for VLANs 12 and 120 and is also configured as the STP primary root for the same VLANs. The second distribution switch serves as the HSRP standby and STP secondary root for those VLANs.

Likewise, the other distribution switch is configured as the HSRP active router for VLANs 11 and 110 and is also configured as the STP primary root for the same VLANs. The first distribution switch serves as the HSRP standby and STP secondary root for VLANs 11 and 110.

It is important that the STP and HSRP timers agree, providing failover and recovery at nearly the same time. This would require the implementation of RSTP on all access and distribution switches.
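
A design like this can be checked mechanically by comparing the two role assignments per VLAN. The sketch below uses the VLAN numbers from the example; the switch names DSW1 and DSW2 and the function name are hypothetical.

```python
def misaligned_vlans(hsrp_active, stp_root):
    """Return the VLANs whose HSRP active router is not also the STP root."""
    return sorted(vlan for vlan, switch in hsrp_active.items()
                  if stp_root.get(vlan) != switch)


hsrp_active = {12: "DSW1", 120: "DSW1", 11: "DSW2", 110: "DSW2"}
stp_root = {12: "DSW1", 120: "DSW1", 11: "DSW2", 110: "DSW2"}
print(misaligned_vlans(hsrp_active, stp_root))  # aligned design: no VLANs listed

# If the STP root for VLAN 110 were moved to DSW1, the check would flag it.
print(misaligned_vlans(hsrp_active, {**stp_root, 110: "DSW1"}))
```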

Optimizing System Resources for User-Selected Features

By using Switch Database Management (SDM) templates, you can configure memory resources in the switch to optimize support for specific features, depending on how the switch is used in your network. You can select one of four templates to specify how system resources are allocated. You can then approximate the maximum number of unicast MAC addresses, Internet Group Management Protocol (IGMP) groups, quality of service (QoS) access control entries (ACEs), security ACEs, unicast routes, multicast routes, subnet VLANs (routed interfaces), and Layer 2 VLANs that can be configured on the switch.

The four templates prioritize system memory to optimize support for these types of features:

• QoS and security ACEs: The access template might typically be used in an access switch at the network edge where the route table sizes might not be substantial. Filtering and QoS might be more important because an access switch is the entry to the whole network.

• Routing: The routing template maximizes system resources for unicast routing, typically required for a router or aggregator in the center of a network.

• VLANs: The VLAN template disables routing and supports the maximum number of unicast MAC addresses. It would typically be selected for a switch used as a Layer 2 switch.

• Default: The default template gives balance to all functionalities (QoS, ACLs, unicast routing, multicast routing, VLANs and MAC addresses).

You can also enable the switch to support 144-bit Layer 3 TCAM, allowing extra fields in the stored routing tables, by reformatting the routing table memory allocation. Using the extended-match keyword with the default, access, or routing templates reformats the allocated TCAM by reducing the number of allowed unicast routes, and storing extra routing information in the lower 72 bits of the Layer 3 TCAM. The 144-bit Layer 3 TCAM is required when running the Web Cache Communication Protocol (WCCP) or multiple VPN routing/forwarding (multi-VRF) instances in customer edge (CE) devices (multi-VRF CE) on the switch.

The first six rows in the tables (unicast MAC addresses through multicast routes) represent approximate hardware boundaries set when a template is selected. If a section of a hardware resource is full, all processing overflow is sent to the CPU, seriously impacting switch performance.

The last two rows, the total number of routed ports and SVIs and the number of Layer 2 VLANs, are guidelines used to calculate hardware resource consumption related to the other resource parameters.

The number of subnet VLANs (routed ports and SVIs) is not limited by software and can be set to a number higher than indicated in the tables. If the number of subnet VLANs configured is lower than or equal to the number in the tables, the number of entries in each category (unicast addresses, IGMP groups, and so on) for each template will be as shown. As the number of subnet VLANs increases, CPU utilization typically increases. If the number of subnet VLANs increases beyond the number shown in the tables, the number of supported entries in each category could decrease depending on features that are enabled. For example, if PIM-DVMRP is enabled with more than 16 subnet VLANs, the number of entries for multicast routes will be in the range of 1K-5K entries for the access template.

Using the Templates

Follow these guidelines when using the SDM templates:

• The maximum number of resources allowed in each template is an approximation and depends upon the actual number of other features configured. For example, in the default template for the Catalyst 3550-12T, if your switch has more than 16 routed interfaces configured, the number of multicast or unicast routes that can be accommodated by hardware might be fewer than shown.

• Using the sdm prefer vlan global configuration command disables routing capability in the switch. Any routing configurations are rejected after the reload, and previously configured routing options might be lost. Use the sdm prefer vlan global configuration command only on switches intended for Layer 2 switching with no routing.

• Do not use the routing template if you are not enabling routing on your switch. Entering the sdm prefer routing global configuration command on a switch does not enable routing, but it would prevent other features from using the memory allocated to unicast and multicast routing in the routing template, which could be up to 30 K in Gigabit Ethernet switches and 17 K in Fast Ethernet switches.

• You must use the extended-match keyword to support 144-bit Layer 3 TCAM when WCCP or multi-VRF CE is enabled on the switch. This keyword is not supported on the VLAN template.

This procedure shows how to change the SDM template from the default. The switch must reload before the configuration takes effect. If you use the show sdm prefer privileged EXEC command before the switch reloads, the previous configuration (in this case, the default) appears.

Autostate Layer 3 Convergence during Layer 2 Failure

The autostate feature notifies a switch or routing module VLAN interface (Layer 3 interface) to transition to up/up status when at least one Layer 2 port becomes active in that VLAN.

Autostate also senses the STP forwarding state of the ports associated with the VLAN ID. This prevents routing protocols and other features from using the VLAN interface as if it were fully operational.

For autostate to operate correctly, every local port that carries a given VLAN ID should offer a direct connection to an access switch on which that VLAN is configured.

* Trunk links that carry the VLAN ID are assumed to provide a path to the VLAN and will keep the VLAN interface up.

* Access ports with the VLAN ID will also keep the VLAN interface up.

An example of a problem would be a trunk link to an access switch that has only VLANs 12 and 14 associated with it, but whose trunk is configured to carry all VLANs. To the autostate process, this trunk appears to provide a path to every active VLAN, so local VLAN interfaces for VLANs other than 12 and 14 would never be shut down while the trunk is up, even when their real access-layer connections have failed.
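The autostate rule and the over-wide trunk problem can be sketched as a small model. This is a simplified illustration, not switch code: the Port class, the port names, and the use of VLAN 11 as the affected VLAN are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    vlans: set          # VLAN IDs the port carries (one for access, many for a trunk)
    forwarding: bool    # True when the port is up and in the STP forwarding state


def svi_is_up(vlan_id, ports):
    """Autostate rule: the VLAN interface is up if any port carrying it forwards."""
    return any(p.forwarding and vlan_id in p.vlans for p in ports)


# Hypothetical distribution switch: VLAN 11 lives on access switch C,
# VLANs 12 and 14 live on access switch D.
all_vlans = set(range(1, 4095))
trunk_to_c = Port("trunk-to-C", {11}, forwarding=False)        # link to C has failed
trunk_to_d_pruned = Port("trunk-to-D", {12, 14}, forwarding=True)
trunk_to_d_all = Port("trunk-to-D", all_vlans, forwarding=True)

# Trunk to D restricted to its own VLANs: the VLAN 11 interface goes down.
print(svi_is_up(11, [trunk_to_c, trunk_to_d_pruned]))  # False
# Trunk to D carrying all VLANs: the VLAN 11 interface stays up with no real path.
print(svi_is_up(11, [trunk_to_c, trunk_to_d_all]))     # True
```

Restricting each trunk to the VLANs that actually exist on the far switch is what lets the rule give the right answer.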

Autostate modes

Normal Autostate Mode

Autostate shuts down (or brings up) the Layer 3 interfaces/subinterfaces on the MSFC and the Multilayer Switch Module (MSM) when the following port configuration changes occur on the switch:

• When the last port on a VLAN goes down, all the Layer 3 interfaces/subinterfaces on that VLAN shut down (are autostated) unless sc0 is on the VLAN or another router is in the chassis with an interface/subinterface in the VLAN.

• When the first port on the VLAN is brought back up, all the Layer 3 interfaces on that VLAN that were previously shut down are brought up.

The Catalyst 6500 series switch does not have knowledge of, or control over, the MSM or MSFC configuration (just as the switch does not have knowledge of, or control over, external router configurations). Autostate does not work on MSM or MSFC interfaces if the MSM or MSFC is not properly configured. For example, consider this MSM trunk configuration:

interface GigabitEthernet0/0/0.200
 encap isl 200

In the example, the GigabitEthernet0/0/0.200 interface is not autostated if any of these configuration errors are made:

• VLAN 200 is not configured on the switch.

• Trunking is not configured on the corresponding Gigabit Ethernet switch port.

• Trunking is configured but VLAN 200 is not an allowed VLAN on that trunk.

Autostate Exclude Mode

Autostate exclude mode allows you to specify the ports to exclude from autostate. In normal autostate mode, the Layer 3 interfaces remain up if at least one port in the VLAN remains up. If you have appliances, such as load balancers or firewall servers that are connected to the ports in the VLAN, you can configure these ports to be excluded from the autostate feature to make sure that the forwarding SVI does not go down if these ports become inactive.

Autostate exclude mode affects all VLANs to which the port belongs and is supported on Ethernet, Fast Ethernet, and Gigabit Ethernet ports only.

Note You cannot configure both autostate exclude mode and autostate track mode on the same port.

Autostate Track Mode

You can use autostate track mode to track key VLAN or port connections to the MSFC. When you configure the autostate track mode, the SVI stays up if any tracked connections remain up in the VLAN. Track mode requires that you define a global tracked VLAN group. The VLANs in this group will be tracked by MSFC autostate whether or not you define a member port to be tracked.

When you configure a VLAN and the ports to be tracked by autostate, the tracked SVIs remain down until at least one tracked Ethernet port in the VLAN moves to the Spanning Tree Protocol (STP) forwarding state. Conversely, tracked SVIs remain up if at least one tracked Ethernet port stays in the STP forwarding state.

Autostate track mode is supported on Ethernet, Fast Ethernet, and Gigabit Ethernet ports only.

Effect of Layer 3 Failure with Autostate

Using the trunk range command will ensure that the VLAN interface reacts appropriately to a loss of physical connectivity. Having discussed the process of autostate, we can now discuss the effects of a failure on IP traffic. For the following discussion we will assume that the distribution nodes are summarizing.

When the Layer 2 trunk between SW A and SW C fails, physical connectivity to VLAN 11 is lost on SW A. Because the trunks are properly configured, autostate detects that there are no longer any active ports for VLAN 11. The VLAN 11 interface shuts down on SW A, and the directly connected route to VLAN 11 is removed from the routing table.

This has the following benefits:

The distribution switch will replace its directly connected route to VLAN 11 with the route to VLAN 11 being advertised by SW B across the Layer 3 link.

When return path traffic arrives on the distribution switch SW A destined for VLAN 11, it will be routed toward the access layer through SW B.

Because summarization is taking place, no external network routing update has been propagated into the core.

If the VLAN interface had not shut down, the IP return path traffic would have been lost at SW A. This is sometimes referred to as being ‘black holed’.

High availability: core layer best practices

These best practices are recommended for optimum core layer convergence.

* Build triangles, not squares, to take advantage of equal-cost redundant paths for the best deterministic convergence.

* Design the core layer as a high-speed, Layer 3 switching environment utilizing only hardware-accelerated services. Layer 3 core designs are superior to Layer 2 and other alternatives because they provide:

o Faster convergence around a link or node failure.

o Increased scalability because neighbor relationships and meshing are reduced.

o More efficient bandwidth utilization.

* When considering high availability in the core, it is assumed that point-to-point links such as direct Ethernet links exist between the core and distribution and between core devices. Link up or down topology changes can then be propagated almost immediately. With topologies that rely on indirect notification and timer-based detection, such as SONET, convergence is non-deterministic and is measured in seconds.

7. Minimizing service loss and data theft in a switched network

Overview

Much industry attention surrounds security attacks from outside the walls of an organization and at the upper OSI layers. Network security coverage often focuses on edge-routing devices and the filtering of packets based upon Layer 3 and 4 headers, ports, stateful packet inspection, etc. This includes all issues surrounding Layer 3 and above as traffic makes its way into the campus network from the Internet. Campus Access devices and Layer 2 communication are left largely unconsidered in most security discussions.

The default state of networking equipment highlights this focus on external protection and internal open communication. Firewalls, placed at the organizational borders, arrive in a secure operational mode and allow no communication, until configured to do so. Routers and switches placed internal to an organization and designed to accommodate communication, delivering needful campus traffic, have a default operational mode that forwards all traffic unless configured otherwise. Their function as devices to facilitate communication often results in minimal security configuration and renders them as targets for malicious attacks. If an attack is launched at Layer 2 on an internal campus device, the rest of the network can be quickly compromised, often without detection.

Switches and routers have many security features available, but they must be enabled to be effective. Just as security had to be tightened on Layer 3 devices within the campus when malicious activity compromising that layer increased, security measures must now be taken to guard against malicious activity at Layer 2. A new area of security focus centers on attacks launched by maliciously leveraging normal Layer 2 switch operations. Security features exist to protect switches and Layer 2 operations but, as with ACLs for upper layer security, a policy must be established, and appropriate features configured, to protect against the potential of malicious acts while maintaining daily network operations.

Switch attack categories

Layer 2 malicious attacks are typically launched by a device connected to the campus network. This can be a physical rogue device placed on the network for malicious purposes or an external intrusion that takes control of and launches attacks from a trusted device. In either case, the network sees all traffic as originating from a legitimate connected device. Each attack method is associated with a standard measure that should be taken to mitigate the associated known security compromise.

Attacks launched against switches and at Layer 2 can be grouped as follows:

* MAC Layer Attacks

* VLAN Attacks

* Spoof Attacks

* Attacks on Switch Devices

MAC Flooding Attack

A common Layer 2/switch attack as of this writing is MAC Flooding, resulting in CAM table overflow that causes flooding of regular data frames out all switch ports. This can be launched for the malicious purpose of collecting a broad sample of traffic or as a DoS attack.

CAM tables are limited in size and therefore the number of entries they can contain at any one time. A network intruder can maliciously flood a switch with a large number of frames from a range of invalid source MAC addresses. If enough new entries are made before old entries expire, new, valid entries will not be accepted. Then, when traffic arrives at the switch for a legitimate device that is located on one of the switch ports that was not able to create a CAM table entry, the switch must flood frames to that address out all ports. This has two adverse effects:

* The switch traffic forwarding is inefficient and voluminous.

* An intruding device can be connected to any switch port and capture traffic not normally seen on that port.

If the attack is launched before the beginning of the day, the CAM table will already be full when the majority of devices are powered on. Frames from those legitimate devices are then unable to create CAM table entries as they power on. If this represents a large number of network devices, the number of MAC addresses for which traffic will be flooded will be high and any switch port will carry flooded frames from a large number of devices.

If the initial flood of invalid CAM table entries is a one-time event, over time the switch will age out older, invalid CAM table entries, allowing new legitimate devices to create an entry. Traffic flooding will eventually cease, and the attack may never have been detected, even though the intruder captured a significant amount of data from the network.
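The overflow behavior can be illustrated with a toy CAM table. This is a minimal sketch under simplifying assumptions: the CamTable class is hypothetical, the capacity of 8 is far smaller than any real switch, and entry aging is left out.

```python
class CamTable:
    """Toy CAM table: fixed capacity, no aging, learns source MACs per port."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # MAC address -> port number

    def learn(self, mac, port):
        # A known MAC can always be refreshed; a new MAC needs a free slot.
        if mac in self.entries or len(self.entries) < self.capacity:
            self.entries[mac] = port
            return True
        return False

    def egress_ports(self, dst_mac, ingress_port, all_ports):
        # Known destination: forward out one port. Unknown destination: flood.
        if dst_mac in self.entries:
            return [self.entries[dst_mac]]
        return [p for p in all_ports if p != ingress_port]


ports = [1, 2, 3, 4]
cam = CamTable(capacity=8)

# Attacker on port 4 fills the table with bogus source addresses.
for i in range(8):
    cam.learn(f"de:ad:00:00:00:{i:02x}", 4)

# A legitimate host on port 2 can no longer be learned...
learned = cam.learn("00:11:22:33:44:55", 2)
# ...so traffic sent to it from port 1 is flooded, including to the attacker on port 4.
flooded_to = cam.egress_ports("00:11:22:33:44:55", 1, ports)
print(learned, flooded_to)  # False [2, 3, 4]
```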

Suggested Mitigation for MAC Flood Attacks

Configure Port Security to define the number of MAC addresses that are allowed on a given port. Port security can also specify what MAC address is allowed on a given port.

Port Security

Port security is a feature supported on Cisco Catalyst switches that restricts a switch port to a specific set and/or number of MAC addresses. Those addresses can be learned dynamically or configured statically. The port will then provide access to frames from only those addresses. If, however, the number of addresses is limited to four, but no specific MAC addresses are configured, then the port will allow any four MAC addresses to be learned dynamically and port access will then be limited to those four dynamically learned addresses.

There is a port security feature called "sticky learning" available on some switch platforms that combines the features of dynamically learned and statically configured addresses. When configured on an interface, the interface converts dynamically learned addresses to "sticky secure" addresses. This adds them to the running configuration as if they were configured using the switchport port-security mac-address command.

NOTE:

Port security cannot be applied to trunk ports where addresses might change frequently. Implementations of port security vary by Catalyst platform.

Sticky MAC addresses

Port Security can be used to mitigate spoof attacks by limiting access through each switch port to a single MAC address. This prevents intruders from using multiple MAC addresses over a short period of time but does not limit port access to a specific MAC address. The most restrictive Port Security implementation would then specify the exact MAC address of the single device that is to gain access through each port. Implementing this level of security, however, requires considerable administrative overhead.

Port Security has a feature called "sticky MAC addresses" that can limit switch port access to a single, specific MAC address without the network administrator having to gather and manually associate the MAC address of every legitimate device with a particular switch port.

When sticky MAC addresses are used, the switch port will convert dynamically learned MAC addresses to sticky MAC addresses and subsequently add them to the running configuration as if they were static entries for a single MAC address to be allowed by Port Security. Sticky secure MAC addresses will be added to the running configuration but will not become part of the startup configuration file unless the running configuration is copied to the startup configuration after addresses have been learned. If they are saved in the startup configuration, they will not have to be relearned upon switch reboot, and this provides a higher level of network security.
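The interaction between the address limit and sticky learning can be sketched as follows. This is an illustrative model only: the SecurePort class and its attribute names are assumptions, and a frame from an address beyond the limit is simply refused rather than triggering any particular violation action.

```python
class SecurePort:
    """Toy port-security model: a MAC limit plus optional sticky learning."""

    def __init__(self, maximum=1, sticky=False):
        self.maximum = maximum
        self.sticky = sticky
        self.secure_macs = set()   # addresses currently allowed on the port
        self.running_config = []   # stands in for the running configuration

    def frame_allowed(self, src_mac):
        if src_mac in self.secure_macs:
            return True
        if len(self.secure_macs) >= self.maximum:
            return False           # limit reached: frame from a new address is refused
        self.secure_macs.add(src_mac)
        if self.sticky:
            # Sticky learning records the address as if it had been configured.
            self.running_config.append(
                f"switchport port-security mac-address sticky {src_mac}"
            )
        return True


port = SecurePort(maximum=1, sticky=True)
print(port.frame_allowed("0011.2233.4455"))  # True, first address is learned
print(port.frame_allowed("00aa.bbcc.ddee"))  # False, limit of one already reached
print(port.running_config)
```

In this model the learned address survives only as long as the running configuration does, which mirrors the point that it must be saved to the startup configuration to persist across a reboot.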

Port Security Compatibility

VLAN Hopping

VLAN hopping is a network attack whereby an end system sends packets to, or collects them from, a VLAN that should not be accessible to that end system. This is accomplished by tagging the invasive traffic with a specific VLAN ID or by negotiating a trunk link in order to send or receive traffic on penetrated VLANs. VLAN Hopping can be accomplished by Switch Spoofing or Double Tagging.

Switch Spoofing

In a Switch Spoofing attack, the network attacker configures a system to spoof itself as a switch by emulating ISL or 802.1Q signaling along with Dynamic Trunking Protocol (DTP) signaling in an attempt to establish a trunk connection to the switch. Any switch port configured as DTP auto, upon receipt of a DTP packet generated by the attacking device, may become a trunk port and thereby accept traffic destined for any VLAN supported on that trunk. The malicious device can then send packets to, or collect packets from, any VLAN carried on the negotiated trunk.

Double Tagging

Another method of VLAN Hopping is for any workstation to generate frames with two 802.1Q headers in order to get the switch to forward the frames onto a VLAN that would be inaccessible to the attacker through legitimate means.

The first switch to encounter the double-tagged frame strips the first tag off the frame as it enters the switch because it matches the access port's native VLAN and then forwards the frame. The result is that the frame is forwarded with the inner 802.1Q tag out all the switch ports including trunk ports configured with the native VLAN of the network attacker. The second switch then forwards the packet to the destination based on the VLAN identifier in the second 802.1Q header. Should the trunk not match the native VLAN of the attacker, the frame would be untagged and flooded only to the original VLAN.
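The two-switch sequence can be traced with a small model of the tag stack. This is a simplified sketch: the function names and VLAN numbers are hypothetical, and the model tracks only which VLAN each switch assigns to the frame.

```python
def first_switch(tags, access_vlan, trunk_native_vlan):
    """Return the tag stack placed on the trunk, or None if the frame is dropped.

    tags is the 802.1Q tag stack sent by the host, outermost tag first.
    """
    if tags and tags[0] != access_vlan:
        return None                   # outer tag does not match the access VLAN
    remaining = tags[1:]              # outer tag stripped on ingress
    if access_vlan == trunk_native_vlan:
        return remaining              # native VLAN leaves the trunk untagged
    return [access_vlan] + remaining  # otherwise the trunk tags the frame again


def second_switch(tags, trunk_native_vlan):
    """Return the VLAN the second switch assigns to the frame."""
    return tags[0] if tags else trunk_native_vlan


# Attacker sits in VLAN 1 and targets VLAN 20.
on_trunk = first_switch([1, 20], access_vlan=1, trunk_native_vlan=1)
print(second_switch(on_trunk, trunk_native_vlan=1))    # 20, the hop succeeds

# Same frame, but the trunk uses a dedicated native VLAN (999).
on_trunk = first_switch([1, 20], access_vlan=1, trunk_native_vlan=999)
print(second_switch(on_trunk, trunk_native_vlan=999))  # 1, frame stays in its own VLAN
```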

VLAN Hopping Mitigation

The measures to defend the network from VLAN Hopping are a series of Best Practices for all switch ports and parameters to follow when establishing a trunk port.

* Configure all unused ports as access ports so that trunking cannot be negotiated across those links

* Place all unused ports in the shutdown state and associate with a VLAN designed only for unused ports, carrying no user data traffic

* When establishing a trunk link, purposefully configure:

o the Native VLAN to be different from any data VLANs

o trunking as on, rather than negotiated

o the specific VLAN range to be carried on the trunk
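The checklist above can be expressed as a simple audit function. This is only a sketch: the dictionary keys (in_use, mode, shutdown, access_vlan, negotiate, native_vlan, allowed_vlans) are hypothetical names chosen for the example and do not correspond to any switch API.

```python
def audit_port(port, data_vlans, unused_vlan):
    """Return a list of findings for one port description (a plain dict)."""
    findings = []
    if not port["in_use"]:
        if port["mode"] != "access":
            findings.append("unused port is not an access port")
        if not port["shutdown"]:
            findings.append("unused port is not shut down")
        if port["access_vlan"] != unused_vlan:
            findings.append("unused port is not in the unused-port VLAN")
    elif port["mode"] == "trunk":
        if port["negotiate"]:
            findings.append("trunking is negotiated rather than on")
        if port["native_vlan"] in data_vlans:
            findings.append("native VLAN is also a data VLAN")
        if port["allowed_vlans"] is None:
            findings.append("no explicit VLAN range on the trunk")
    return findings


trunk = {"in_use": True, "mode": "trunk", "negotiate": True,
         "native_vlan": 10, "allowed_vlans": None}
spare = {"in_use": False, "mode": "access", "shutdown": True,
         "access_vlan": 999}

print(audit_port(trunk, data_vlans={10, 20}, unused_vlan=999))  # three findings
print(audit_port(spare, data_vlans={10, 20}, unused_vlan=999))  # []
```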

Private VLANs

Service providers often have devices from multiple clients, as well as their own servers, on a single DMZ segment or VLAN. As security issues proliferate, it becomes necessary to provide traffic isolation between devices even though they may exist on the same Layer 3 segment and VLAN. Catalyst 6500/4500 switches implement Private VLANs (PVLANs) to keep some switch ports shared and some switch ports isolated, although all ports exist on the same VLAN. The 2950 and 3550 support "protected ports", which provide functionality similar to PVLANs on a per-switch basis.

Configuring Protected Ports

Some applications require that no traffic be forwarded at Layer 2 between ports on the same switch so that one neighbor does not see the traffic generated by another neighbor. In such an environment, the use of protected ports ensures that there is no exchange of unicast, broadcast, or multicast traffic between these ports on the switch.

Protected ports have these features:

• A protected port does not forward any traffic (unicast, multicast, or broadcast) to any other port that is also a protected port. Data traffic cannot be forwarded between protected ports at Layer 2; only control traffic, such as PIM packets, is forwarded because these packets are processed by the CPU and forwarded in software. All data traffic passing between protected ports must be forwarded through a Layer 3 device.

• Forwarding behavior between a protected port and a nonprotected port proceeds as usual.

Because a switch stack represents a single logical switch, Layer 2 traffic is not forwarded between any protected ports in the switch stack, whether they are on the same or different switches in the stack.

Protected Port Configuration Guidelines

You can configure protected ports on a physical interface (for example, Gigabit Ethernet port 1) or an EtherChannel group (for example, port-channel 5). When you enable protected ports for a port channel, it is enabled for all ports in the port-channel group.

Do not configure a private-VLAN port as a protected port. Do not configure a protected port as a private-VLAN port. A private-VLAN isolated port does not forward traffic to other isolated ports or community ports.

Port Blocking

By default, the switch floods packets with unknown destination MAC addresses out of all ports. If unknown unicast and multicast traffic is forwarded to a protected port, there could be security issues. To prevent unknown unicast or multicast traffic from being forwarded from one port to another, you can block a port (protected or nonprotected) from flooding unknown unicast or multicast packets to other ports.

PVLANs

The traditional solution to address these ISP requirements is to provide one VLAN per customer, with each VLAN having its own IP subnet. A Layer 3 device then provides interconnectivity between VLANs and Internet destinations.

Challenges with this traditional solution are:

Supporting a separate VLAN per customer may require a high number of interfaces on service provider network devices.

Spanning tree becomes more complicated with many VLAN iterations.

Network address space must be divided into many subnets, which wastes space and increases management complexity.

Multiple ACL applications are required to maintain security on multiple VLANs, resulting in increased management complexity.

PVLANs provide Layer 2 isolation between ports within the same VLAN. This isolation eliminates the need for a separate VLAN and IP subnet per customer.

A port in a PVLAN can be one of three types:

Isolated – An isolated port has complete Layer 2 separation from other ports within the same PVLAN except for the promiscuous port. PVLANs block all traffic to isolated ports, except the traffic from promiscuous ports. Traffic received from an isolated port is forwarded only to promiscuous ports.

Promiscuous – A promiscuous port can communicate with all ports within the PVLAN, including the community and isolated ports. The default gateway for the segment would likely be hosted on a promiscuous port, given that all devices in the PVLAN will need to communicate with that port.

Community – Community ports communicate among themselves and with their promiscuous ports. These interfaces are isolated at Layer 2 from all interfaces in other communities and from isolated ports within their PVLAN.
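The three port types reduce to a short forwarding rule. The sketch below is an illustration of that rule, assuming both ports belong to the same primary VLAN; the function name and the (type, community) tuple encoding are hypothetical.

```python
def pvlan_can_forward(src, dst):
    """Decide Layer 2 reachability between two ports in one private VLAN.

    src and dst are (port_type, community_id) tuples; community_id is only
    meaningful for community ports and is None otherwise.
    """
    src_type, src_comm = src
    dst_type, dst_comm = dst
    # Promiscuous ports talk to everything in the PVLAN.
    if src_type == "promiscuous" or dst_type == "promiscuous":
        return True
    # Community ports talk only within their own community.
    if src_type == "community" and dst_type == "community":
        return src_comm == dst_comm
    # Isolated ports reach nothing except promiscuous ports.
    return False


gateway = ("promiscuous", None)
host_a = ("isolated", None)
host_b = ("isolated", None)
web1 = ("community", 1)
web2 = ("community", 1)
db1 = ("community", 2)

print(pvlan_can_forward(host_a, gateway))  # True
print(pvlan_can_forward(host_a, host_b))   # False
print(pvlan_can_forward(web1, web2))       # True
print(pvlan_can_forward(web1, db1))        # False
```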

NOTE:

Because trunks can support the VLANs carrying traffic between isolated, community, and promiscuous ports, isolated and community port traffic might enter or leave the switch through a trunk interface.

Secondary and Primary VLAN Configuration Guidelines:

• Set VTP to transparent mode. After you configure a private VLAN, you should not change the VTP mode to client or server. You must use VLAN configuration (config-vlan) mode to configure private VLANs. You cannot configure private VLANs in VLAN database configuration mode.

• After you have configured private VLANs, use the copy running-config startup-config privileged EXEC command to save the VTP transparent mode configuration and private-VLAN configuration in the switch startup configuration file. Otherwise, if the switch resets, it defaults to VTP server mode, which does not support private VLANs.

• VTP does not propagate private-VLAN configuration. You must configure private VLANs on each device where you want private-VLAN ports.

• You cannot configure VLAN 1 or VLANs 1002 to 1005 as primary or secondary VLANs. Extended VLANs (VLAN IDs 1006 to 4094) can belong to private VLANs.

• A primary VLAN can have one isolated VLAN and multiple community VLANs associated with it. An isolated or community VLAN can have only one primary VLAN associated with it.

• Although a private VLAN contains more than one VLAN, only one Spanning Tree Protocol (STP) instance runs for the entire private VLAN. When a secondary VLAN is associated with the primary VLAN, the STP parameters of the primary VLAN are propagated to the secondary VLAN.

• You can enable DHCP snooping on private VLANs. When you enable DHCP snooping on the primary VLAN, it is propagated to the secondary VLANs. If you configure DHCP on a secondary VLAN, the configuration does not take effect if the primary VLAN is already configured.

• When you enable IP source guard on private-VLAN ports, you must enable DHCP snooping on the primary VLAN.

• We recommend that you prune the private VLANs from the trunks on devices that carry no traffic in the private VLANs.

• You can apply different quality of service (QoS) configurations to primary, isolated, and community VLANs.

• When you configure private VLANs, sticky Address Resolution Protocol (ARP) is enabled by default, and ARP entries learned on Layer 3 private VLAN interfaces are sticky ARP entries. For security reasons, private VLAN port sticky ARP entries do not age out.

Note We recommend that you display and verify private-VLAN interface ARP entries.

Connecting a device with a different MAC address but with the same IP address generates a message and the ARP entry is not created. Because the private-VLAN port sticky ARP entries do not age out, you must manually remove private-VLAN port ARP entries if a MAC address changes.

– You can remove a private-VLAN ARP entry by using the no arp ip-address global configuration command.

– You can add a private-VLAN ARP entry by using the arp ip-address hardware-address type global configuration command.

• You can configure VLAN maps on primary and secondary VLANs. However, we recommend that you configure the same VLAN maps on private-VLAN primary and secondary VLANs.

• When a frame is Layer-2 forwarded within a private VLAN, the same VLAN map is applied at the ingress side and at the egress side. When a frame is routed from inside a private VLAN to an external port, the private-VLAN map is applied at the ingress side.

– For frames going upstream from a host port to a promiscuous port, the VLAN map configured on the secondary VLAN is applied.

– For frames going downstream from a promiscuous port to a host port, the VLAN map configured on the primary VLAN is applied.

To filter out specific IP traffic for a private VLAN, you should apply the VLAN map to both the primary and secondary VLANs.

• You can apply router ACLs only on the primary-VLAN SVIs. The ACL is applied to both primary and secondary VLAN Layer 3 traffic.

• Although private VLANs provide host isolation at Layer 2, hosts can communicate with each other at Layer 3.

• Private VLANs support these Switched Port Analyzer (SPAN) features:

– You can configure a private-VLAN port as a SPAN source port.

– You can use VLAN-based SPAN (VSPAN) on primary, isolated, and community VLANs or use SPAN on only one VLAN to separately monitor egress or ingress traffic.

Private-VLAN Port Configuration

Follow these guidelines when configuring private-VLAN ports:

• Use only the private-VLAN configuration commands to assign ports to primary, isolated, or community VLANs. Layer 2 access ports assigned to the VLANs that you configure as primary, isolated, or community VLANs are inactive while the VLAN is part of the private-VLAN configuration. Layer 2 trunk interfaces remain in the STP forwarding state.

• Do not configure ports that belong to a PAgP or LACP EtherChannel as private-VLAN ports. While a port is part of the private-VLAN configuration, any EtherChannel configuration for it is inactive.

• Enable Port Fast and BPDU guard on isolated and community host ports to prevent STP loops due to misconfigurations and to speed up STP convergence. When enabled, STP applies the BPDU guard feature to all PortFast-configured Layer 2 LAN ports. Do not enable Port Fast and BPDU guard on promiscuous ports.

• If you delete a VLAN used in the private-VLAN configuration, the private-VLAN ports associated with the VLAN become inactive.

• Private-VLAN ports can be on different network devices if the devices are trunk-connected and the primary and secondary VLANs have not been removed from the trunk.

Limitations with Other Features

When configuring private VLANs, remember these limitations with other features:

Note In some cases, the configuration is accepted with no error messages, but the commands have no effect.

• Do not configure fallback bridging on switches with private VLANs.

• When IGMP snooping is enabled on the switch (the default), the switch stack supports no more than 20 private-VLAN domains.

• Do not configure a remote SPAN (RSPAN) VLAN as a private-VLAN primary or secondary VLAN.

• Do not configure private-VLAN ports on interfaces configured for these other features:

– dynamic-access port VLAN membership

– Dynamic Trunking Protocol (DTP)

– Port Aggregation Protocol (PAgP)

– Link Aggregation Control Protocol (LACP)

– Multicast VLAN Registration (MVR)

– voice VLAN

• A private-VLAN port cannot be a secure port and should not be configured as a protected port.

• You can configure IEEE 802.1x port-based authentication on a private-VLAN port, but do not configure 802.1x with port security, voice VLAN, or per-user ACL on private-VLAN ports.

• A private-VLAN host or promiscuous port cannot be a SPAN destination port. If you configure a SPAN destination port as a private-VLAN port, the port becomes inactive.

• If you configure a static MAC address on a promiscuous port in the primary VLAN, you must add the same static address to all associated secondary VLANs. If you configure a static MAC address on a host port in a secondary VLAN, you must add the same static MAC address to the associated primary VLAN. When you delete a static MAC address from a private-VLAN port, you must remove all instances of the configured MAC address from the private VLAN.

Note Dynamic MAC addresses learned in one VLAN of a private VLAN are replicated in the associated VLANs. For example, a MAC address learned in a secondary VLAN is replicated in the primary VLAN. When the original dynamic MAC address is deleted or aged out, the replicated addresses are removed from the MAC address table.

• Configure Layer 3 VLAN interfaces (SVIs) only for primary VLANs.

Configuring Layer-2 interface as a private-VLAN host port:

Step 1 interface interface-id

Enter interface configuration mode for the Layer 2 interface to be configured.

Step 2 switchport mode private-vlan host

Configure the Layer 2 port as a private-VLAN host port.

Step 3 switchport private-vlan host-association primary_vlan_id secondary_vlan_id

Associate the Layer 2 port with a private VLAN.

Configuring Layer-2 interface as a private-VLAN promiscuous port:

Step 1 interface interface-id

Enter interface configuration mode for the Layer 2 interface to be configured.

Step 2 switchport mode private-vlan promiscuous

Configure the Layer 2 port as a private-VLAN promiscuous port.

Step 3 switchport private-vlan mapping primary_vlan_id {add | remove} secondary_vlan_list

Map the private-VLAN promiscuous port to a primary VLAN and to selected secondary VLANs.

Mapping Secondary VLANs to a Primary VLAN Layer 3 VLAN Interface

If the private VLAN will be used for inter-VLAN routing, you configure an SVI for the primary VLAN and map secondary VLANs to the SVI.

Note Isolated and community VLANs are both secondary VLANs.

Step 1 interface vlan primary_vlan_id

Enter interface configuration mode for the primary VLAN, and configure the VLAN as an SVI. The VLAN ID range is 2 to 1001 and 1006 to 4094.

Step 2 private-vlan mapping [add | remove] secondary_vlan_list

Map the secondary VLANs to the Layer 3 VLAN interface of a primary VLAN to allow Layer 3 switching of private-VLAN ingress traffic.

DHCP Spoofing

One of the ways for an attacker to gain access to network traffic is to spoof responses that would be sent by a valid DHCP server. The DHCP spoofing device replies to client DHCP requests. The legitimate server may reply as well, but if the spoofing device is on the same segment as the client, its reply to the client may arrive first. The intruder’s DHCP reply offers an IP address and supporting information that designates the intruder as the default gateway or DNS server. In the case of a gateway, the clients will then forward all packets to the attacking device, which will in turn send them to the desired destination. This is referred to as a "man-in-the-middle" attack and it may go entirely undetected as the intruder intercepts the data flow through the network.

DHCP Snooping

DHCP Snooping is a Catalyst feature that determines which switch ports can respond to DHCP requests. Ports are identified as trusted and untrusted. Trusted ports can source all DHCP messages while untrusted ports can source requests only. Trusted ports host a DHCP server or can be an uplink toward the DHCP server. If a rogue device on an untrusted port attempts to send a DHCP response packet into the network, the port is shut down. This feature can be coupled with DHCP Option 82, where switch information, such as the port ID of the DHCP request, can be inserted into the DHCP request packet.

Untrusted ports are those not explicitly configured as trusted. A DHCP Binding Table is built for untrusted ports. Each entry contains client MAC address, IP address, lease time, binding type, VLAN number and Port ID recorded as clients make DHCP requests. The table is then used to filter subsequent DHCP traffic. From a DHCP Snooping perspective, untrusted access ports should not send any DHCP server responses, such as DHCPOffer, DHCPAck, or DHCPNak.

When a router receives a packet on an untrusted interface and the interface belongs to a VLAN in which DHCP snooping is enabled, the router compares the source MAC address and the DHCP client hardware address. If the addresses match (the default), the router forwards the packet. If the addresses do not match, the router drops the packet.

The router drops DHCP packets when any of these situations occur:

• The router receives a packet from a DHCP server, such as a DHCPOFFER, DHCPACK, DHCPNAK, or DHCPLEASEQUERY packet, from outside the network or firewall.

• The router receives a packet on an untrusted interface, and the source MAC address and the DHCP client hardware address do not match.

• The router receives a DHCPRELEASE or DHCPDECLINE message that contains a MAC address in the DHCP snooping binding table, but the interface information in the binding table does not match the interface on which the message was received.

• The router receives a DHCP packet that includes a relay agent IP address that is not 0.0.0.0.

With the DHCP option 82 on untrusted port feature enabled, use dynamic ARP inspection on the aggregation router to protect untrusted input interfaces.
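The drop conditions listed above can be sketched as a single decision function. This is only an illustrative model, not switch software: the DhcpPacket class, the should_drop function, and the port names are hypothetical, and the sketch assumes that "from outside the network or firewall" corresponds to arrival on an untrusted port and that the relay agent check applies to untrusted ports only.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# DHCP message types that only a server should send.
SERVER_MESSAGES = {"DHCPOFFER", "DHCPACK", "DHCPNAK", "DHCPLEASEQUERY"}


@dataclass
class DhcpPacket:
    msg_type: str            # e.g. "DHCPDISCOVER", "DHCPOFFER"
    src_mac: str             # source MAC address in the Ethernet header
    chaddr: str              # client hardware address in the DHCP payload
    giaddr: str = "0.0.0.0"  # relay agent IP address


def should_drop(pkt: DhcpPacket, ingress_port: str, trusted: bool,
                bindings: Dict[str, str]) -> Optional[str]:
    """Return the drop reason, or None if the packet is forwarded.

    `bindings` is a simplified snooping table: client MAC -> learned port.
    """
    if trusted:
        return None  # trusted ports may source any DHCP message
    if pkt.msg_type in SERVER_MESSAGES:
        return "server message on untrusted port"
    if pkt.src_mac != pkt.chaddr:
        return "source MAC does not match client hardware address"
    if pkt.msg_type in {"DHCPRELEASE", "DHCPDECLINE"}:
        bound_port = bindings.get(pkt.chaddr)
        if bound_port is not None and bound_port != ingress_port:
            return "release/decline received on wrong interface"
    if pkt.giaddr != "0.0.0.0":
        return "relay agent address is set"
    return None
```

For example, a DHCPOFFER arriving on an untrusted access port is rejected by the first check, while the same packet arriving on a trusted uplink is forwarded.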

DHCP Snooping Option-82 Data Insertion

In residential, metropolitan Ethernet-access environments, DHCP can centrally manage the IP address assignments for a large number of subscribers. When the DHCP snooping option-82 feature is enabled on the router, a subscriber device is identified by the router port through which it connects to the network (in addition to its MAC address). Multiple hosts on the subscriber LAN can be connected to the same port on the access router and are uniquely identified.

Figure 38-1 is an example of a metropolitan Ethernet network in which a centralized DHCP server assigns IP addresses to subscribers connected to the router at the access layer. Because the DHCP clients and their associated DHCP server do not reside on the same IP network or subnet, a DHCP relay agent is configured with a helper address to enable broadcast forwarding and to transfer DHCP messages between the clients and the server.

When you enable the DHCP snooping information option 82 on the router, this sequence of events occurs:

• The host (DHCP client) generates a DHCP request and broadcasts it on the network.

• When the router receives the DHCP request, it adds the option-82 information in the packet. The option-82 information contains the router MAC address (the remote ID suboption) and the port identifier, vlan-mod-port, from which the packet is received (the circuit ID suboption).

• If the IP address of the relay agent is configured, the router adds the IP address in the DHCP packet.

• The router forwards the DHCP request that includes the option-82 field to the DHCP server.

• The DHCP server receives the packet. If the server is option-82 capable, it can use the remote ID, or the circuit ID, or both to assign IP addresses and implement policies, such as restricting the number of IP addresses that can be assigned to a single remote ID or circuit ID. The DHCP server then echoes the option-82 field in the DHCP reply.

• The DHCP server unicasts the reply to the router if the request was relayed to the server by the router. When the client and server are on the same subnet, the server broadcasts the reply. The router verifies that it originally inserted the option-82 data by inspecting the remote ID and possibly the circuit ID fields. The router removes the option-82 field and forwards the packet to the router port that connects to the DHCP client that sent the DHCP request.
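The insert-and-verify steps above can be illustrated with a small encoder. The outer layout (option code 82 with suboption 1 for the circuit ID and suboption 2 for the remote ID) follows the DHCP Relay Agent Information option; the byte layout chosen here for the circuit ID contents (2-byte VLAN, 1-byte module, 1-byte port) is only an assumption for the sketch, and the function names are hypothetical.

```python
import struct
from typing import Dict


def circuit_id(vlan: int, module: int, port: int) -> bytes:
    # Assumed layout for the vlan-mod-port identifier: VLAN (2 bytes),
    # module (1 byte), port (1 byte). Real devices may encode it differently.
    return struct.pack("!HBB", vlan, module, port)


def build_option82(circuit: bytes, remote: bytes) -> bytes:
    """Encode option 82 with suboption 1 (circuit ID) and 2 (remote ID)."""
    payload = bytes([1, len(circuit)]) + circuit + bytes([2, len(remote)]) + remote
    return bytes([82, len(payload)]) + payload


def parse_option82(option: bytes) -> Dict[int, bytes]:
    """Decode an option-82 field into {suboption code: value}."""
    if option[0] != 82 or option[1] != len(option) - 2:
        raise ValueError("not a well-formed option-82 field")
    subs: Dict[int, bytes] = {}
    i = 2
    while i < len(option):
        code, length = option[i], option[i + 1]
        subs[code] = option[i + 2:i + 2 + length]
        i += 2 + length
    return subs


def inserted_by(option: bytes, own_mac: bytes) -> bool:
    # The device recognizes its own data by the remote ID (its MAC address).
    return parse_option82(option).get(2) == own_mac
```

A reply whose remote ID does not match the device's own MAC address would fail the inserted_by check, which models the verification step before the option-82 field is removed.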

When DHCP snooping is enabled, these Cisco IOS DHCP commands are not available on the router:

– ip dhcp relay information check global configuration command

– ip dhcp relay information policy global configuration command

– ip dhcp relay information trust-all global configuration command

– ip dhcp relay information option global configuration command

– ip dhcp relay information trusted interface configuration command

If these commands are entered, the router returns an error message, and the configuration is not applied.

• To use any DHCP snooping features, you must globally enable DHCP snooping on the router.

• DHCP snooping is not active until DHCP snooping is enabled on a VLAN.

• If a Layer 2 LAN port is connected to a DHCP server, configure the port as trusted by entering the ip dhcp snooping trust interface configuration command.

• If a Layer 2 LAN port is connected to a DHCP client, configure the port as untrusted by entering the no ip dhcp snooping trust interface configuration command.

• DHCP snooping can be enabled on private VLANs:

– If DHCP snooping is enabled, any primary VLAN configuration is propagated to its associated secondary VLANs.

– If DHCP snooping is configured on the primary VLAN and you configure DHCP snooping with different settings on an associated secondary VLAN, the configuration on the secondary VLAN does not take effect.

– If DHCP snooping is not configured on the primary VLAN and you configure DHCP snooping on a secondary VLAN, the configuration takes effect only on the secondary VLAN.

– When you manually configure DHCP snooping on a secondary VLAN, this message appears: DHCP Snooping configuration may not take effect on secondary vlan XXX

– The show ip dhcp snooping command displays all VLANs (both primary and secondary) that have DHCP snooping enabled.

When a TFTP server is used for database storage, create a temporary file in the TFTP server daemon directory with the touch command. With some UNIX implementations, the file should have full read and write access permissions (777).

MAC Spoofing

MAC Spoofing attacks occur when a device spoofs the MAC address of a valid network device to gain access to frames not normally forwarded out the switch port of the attacker. The attacker generates a single frame with a source MAC address of the valid device. The switch overwrites the valid CAM table entry with an entry for the same MAC address out the port of the attacking device. This causes the switch to forward frames destined for the valid MAC address out the port of the network attacker. Once the valid host sends additional frames, the spoofed CAM table entry is overwritten so forwarding to that MAC address resumes on the legitimate port.
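The overwrite behavior described above can be modeled with a dictionary that maps each source MAC address to the last port it was seen on. This is a simplified sketch; the CamTable class, the MAC address, and the port names are hypothetical.

```python
from typing import Dict, Optional


class CamTable:
    """Toy CAM table: the most recent source MAC sighting wins."""

    def __init__(self) -> None:
        self.entries: Dict[str, str] = {}

    def learn(self, src_mac: str, port: str) -> None:
        # Learning a frame overwrites any earlier entry for the same MAC.
        self.entries[src_mac] = port

    def lookup(self, dst_mac: str) -> Optional[str]:
        # Returns the egress port, or None if the MAC is unknown (flooded).
        return self.entries.get(dst_mac)


cam = CamTable()
victim = "00:11:22:33:44:55"
cam.learn(victim, "Gi0/1")   # legitimate host sends a frame
cam.learn(victim, "Gi0/9")   # attacker sends one frame with the spoofed source
hijacked_port = cam.lookup(victim)
cam.learn(victim, "Gi0/1")   # legitimate host sends again
restored_port = cam.lookup(victim)
```

Between the attacker's frame and the next frame from the legitimate host, traffic for the victim's MAC address leaves through the attacker's port.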

ARP Spoofing

In normal ARP operation, a host sends a broadcast to determine the MAC address of a host with a particular IP address. The device at that IP address replies with its MAC address. The originating host caches the ARP response, using it to populate the destination Layer 2 header of packets sent to that IP address. By spoofing an ARP reply from a legitimate device, an attacking device appears to be the destination host sought by the senders. The ARP reply from the attacker causes the sender to store the MAC address of the attacking system in its ARP cache. All packets destined for that IP address are then forwarded through the attacker's system.

Dynamic ARP Inspection

To prevent ARP spoofing or "poisoning", a switch must ensure that only valid ARP requests and responses are relayed. Dynamic ARP Inspection (DAI) prevents these attacks by intercepting and validating all ARP requests and responses. Each intercepted ARP reply is verified for valid MAC address to IP address bindings before it is forwarded to a PC to update the ARP cache. ARP replies coming from invalid devices are dropped.

DAI validates ARP replies coming from statically configured IP addresses or from a set of MAC addresses defined in a VLAN access control list. DAI can also determine the validity of an ARP reply based on bindings stored in a DHCP snooping database. To ensure that only valid ARP requests and responses are relayed, DAI takes the following actions:

* Forwards ARP packets received on a trusted interface without any checks.

* Intercepts all ARP packets on untrusted ports.

* Verifies that each intercepted packet has a valid IP-to-MAC address binding before forwarding packets that can update the local ARP cache.

* Drops and/or logs ARP packets with invalid IP-to-MAC address bindings.
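The four actions above amount to a short check per ARP packet. The following is a minimal sketch that assumes the valid bindings are available as a simple IP-to-MAC dictionary; the inspect_arp function and its arguments are hypothetical names, not part of any switch API.

```python
from typing import Dict, List, Tuple


def inspect_arp(sender_ip: str, sender_mac: str, trusted: bool,
                bindings: Dict[str, str],
                log: List[Tuple[str, str]]) -> bool:
    """Return True if the ARP packet is forwarded, False if it is dropped."""
    if trusted:
        return True  # trusted interface: forwarded without any checks
    # Untrusted interface: the packet is intercepted and validated.
    if bindings.get(sender_ip, "").lower() == sender_mac.lower():
        return True  # valid IP-to-MAC binding
    log.append((sender_ip, sender_mac))  # invalid binding: log and drop
    return False
```

With a binding of 10.1.1.20 to one MAC address, an ARP reply on an untrusted port that claims 10.1.1.20 with a different MAC address is logged and dropped, while the same reply on a trusted port passes unchecked.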

Configure all Access switch ports as untrusted and all switch ports connected to other switches as trusted. In this case, all ARP packets entering the network would be from an upstream Distribution or Core switch, bypassing the security check and requiring no further validation.

Rate Limiting of ARP Packets

The router performs DAI validation checks and rate limits incoming ARP packets to prevent a denial-of-service attack. By default, the rate for untrusted interfaces is 15 packets per second (pps). Trusted interfaces are not rate limited. DAI uses the DHCP snooping binding database for the list of valid IP-to-MAC address bindings. ARP ACLs take precedence over entries in the DHCP snooping binding database.

Dynamic ARP Inspection Configuration Guidelines

These are the dynamic ARP inspection configuration guidelines:

• Dynamic ARP inspection is an ingress security feature; it does not perform any egress checking.

• Dynamic ARP inspection is not effective for hosts connected to switches that do not support dynamic ARP inspection or that do not have this feature enabled. Because man-in-the-middle attacks are limited to a single Layer 2 broadcast domain, separate the domain with dynamic ARP inspection checks from the one with no checking. This action secures the ARP caches of hosts in the domain enabled for dynamic ARP inspection.

• Dynamic ARP inspection depends on the entries in the DHCP snooping binding database to verify IP-to-MAC address bindings in incoming ARP requests and ARP responses. Make sure to enable DHCP snooping to permit ARP packets that have dynamically assigned IP addresses. When DHCP snooping is disabled or in non-DHCP environments, use ARP ACLs to permit or to deny packets.

• Dynamic ARP inspection is supported on access ports, trunk ports, EtherChannel ports, and private VLAN ports.

• A physical port can join an EtherChannel port channel only when the trust state of the physical port and the channel port match. Otherwise, the physical port remains suspended in the port channel. A port channel inherits its trust state from the first physical port that joins the channel. Consequently, the trust state of the first physical port need not match the trust state of the channel. Conversely, when you change the trust state on the port channel, the switch configures a new trust state on all the physical ports that comprise the channel.

• The rate limit is calculated separately on each switch in a switch stack. For a cross-stack EtherChannel, this means that the actual rate limit might be higher than the configured value. For example, if you set the rate limit to 30 pps on an EtherChannel that has one port on switch 1 and one port on switch 2, each port can receive packets at 29 pps without causing the EtherChannel to become error-disabled.

• The operating rate for the port channel is cumulative across all the physical ports within the channel. For example, if you configure the port channel with an ARP rate-limit of 400 pps, all the interfaces combined on the channel receive an aggregate 400 pps. The rate of incoming ARP packets on EtherChannel ports is equal to the sum of the incoming rate of packets from all the channel members. Configure the rate limit for EtherChannel ports only after examining the rate of incoming ARP packets on the channel-port members.

The rate of incoming packets on a physical port is checked against the port-channel configuration rather than the physical-ports configuration. The rate-limit configuration on a port channel is independent of the configuration on its physical ports.

If the EtherChannel receives more ARP packets than the configured rate, the channel (including all physical ports) is placed in the error-disabled state.

• Make sure to limit the rate of ARP packets on incoming trunk ports. Configure trunk ports with higher rates to reflect their aggregation and to handle packets across multiple dynamic ARP inspection-enabled VLANs. You also can use the ip arp inspection limit none interface configuration command to make the rate unlimited. A high rate-limit on one VLAN can cause a denial-of-service attack to other VLANs when the software places the port in the error-disabled state.
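The cross-stack EtherChannel guideline above is easier to see with the arithmetic written out. The sketch below assumes a simplified model in which each stack member sums the ARP rate of its own channel ports and compares that sum with the configured limit; the function name and port names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def switches_over_limit(member_rates: Dict[str, Tuple[int, int]],
                        limit_pps: int) -> List[int]:
    """Return the stack members whose local ARP rate exceeds the limit.

    `member_rates` maps a channel member port to (switch number, ARP pps).
    """
    per_switch: Dict[int, int] = defaultdict(int)
    for switch_id, pps in member_rates.values():
        per_switch[switch_id] += pps  # the rate is summed per stack member
    return sorted(s for s, total in per_switch.items() if total > limit_pps)


# One member port on switch 1 and one on switch 2, each receiving 29 pps.
cross_stack = {"Gi1/0/1": (1, 29), "Gi2/0/1": (2, 29)}
# The same two ports, both on switch 1.
single_switch = {"Gi1/0/1": (1, 29), "Gi1/0/2": (1, 29)}
```

With a 30 pps limit, the cross-stack channel carries 58 pps in total without any member exceeding the limit, whereas the same load on a single switch exceeds it.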

AAA

Authentication, authorization, and accounting (AAA) network security services provide the primary framework through which access control is set up on a switch. AAA is an architectural framework for configuring a set of three independent security functions in a consistent manner. AAA provides a modular way of performing these services:

Authentication – Provides the method of identifying users, including login and password dialog, challenge and response, messaging support and, depending on the security protocol, encryption.

Authentication is the way in which a user is identified prior to being allowed access to the network and network services. AAA authentication is configured by defining a named list of authentication methods, and then applying that list to various interfaces. The method list defines the types of authentication to be performed and the sequence in which they will be performed; it must be applied to a specific interface before any of the defined authentication methods will be performed. The only exception is the default method list (which is named "default"). The default method list is automatically applied to all interfaces if no other method list is defined. A defined method list overrides the default method list.

All authentication methods must be defined through AAA, with the exception of local, line password, and enable authentication.

Authorization – Provides the method for remote access control, including one-time authorization, or authorization for each service, per-user account list and profile, user group support, and support of IP, Internetwork Packet Exchange (IPX), AppleTalk Remote Access (ARA), and Telnet.

AAA authorization works by assembling a set of attributes that describe what the user is authorized to perform, such as access to different parts of the network. These attributes are compared to the information contained in a database for a given user, and the result is returned to AAA to determine the actual capabilities and restrictions of the user. The database can be located locally on the multilayer switch, or it can be hosted remotely on a RADIUS or TACACS+ security server. Remote security servers, such as RADIUS and TACACS+, authorize users for specific rights by associating attribute-value pairs, which associate those rights with the appropriate user. All authorization methods must be defined through AAA.

As with authentication, configure AAA authorization by defining a named list of authorization methods, and then applying that list to various interfaces.

Accounting – Provides a method for collecting and sending security server information used for billing, auditing, and reporting. This is information such as user identities, start and stop times, executed commands (such as PPP), number of packets, and number of bytes. Security experts can use the information gained from accounting to audit and improve security.

In many circumstances, AAA uses protocols such as RADIUS, TACACS+, or 802.1X to administer its security functions. If the switch is acting as a network access server, AAA is the means through which a switch establishes communication between the network access server and the RADIUS, TACACS+, or 802.1X security server.

About Authorization Methods

AAA authorization enables the limitation of the services available to a user. When AAA authorization is enabled, the multilayer switch uses information retrieved from the user profile, which is located either in the local user database on the switch or on the security server, to configure the user session. When this task is done, the user will be granted access to a requested service only if the information in the user profile allows it. Just as with AAA authentication, authorization creates method lists to define the ways that authorization will be performed and the sequence in which these methods will be performed.

Method lists are specific to the authorization type requested:

* Auth-proxy – Applies specific security policies on a per-user basis.

* Commands – Applies to the EXEC mode commands that a user issues. Command authorization attempts authorization for all EXEC mode commands, including global configuration commands, associated with a specific privilege level.

* EXEC – Applies to the attributes associated with a user EXEC terminal session.

* Network – Applies to network connections. These connections can include a PPP, Serial Line Internet Protocol (SLIP), or AppleTalk Remote Access Protocol (ARAP) connection.

* Reverse access – Applies to reverse Telnet sessions.

These are the possible authorization methods:

AAA Authentication methods

AAA Accounting methods

Accounting is the process of keeping track of the activity of each user who is accessing the network resources, including the amount of time spent in the network, the services accessed while there, and the amount of data transferred during the session. Accounting data is used for trend analysis, capacity planning, billing, auditing, and cost allocation.

AAA supports six different accounting types:

Network accounting – Provides information for all PPP, SLIP, or ARAP sessions, including packet and byte counts

Connection accounting – Provides information about all outbound connections made from the network, such as Telnet and remote login (rlogin)

EXEC accounting – Provides information about user EXEC terminal sessions (user shells) on the network access server, including username, date, start and stop times, the access server IP address, and (for dial-in users) the telephone number the call originated from

System accounting – Provides information about all system-level events (for example, when the system reboots or when accounting is turned on or off)

Command accounting – Provides information about the EXEC shell commands for a specified privilege level that are being executed on a network access server

Resource accounting – Provides start and stop record support for calls that have passed user authentication

AAA Process

AAA enables dynamic configuration of the type of authentication and authorization on a per-line (per-user) or per-service (for example, IP, IPX, or VPDN) basis. Define the type of authentication and authorization by creating method lists, and then applying those method lists to specific services or interfaces.

A method list is a sequential list that defines the authentication methods used to authenticate a user. Method lists enable designation of one or more security protocols to be used for authentication, thus ensuring a backup system for authentication in case the initial method fails. Cisco IOS software uses the first method listed to authenticate users; if that method does not respond, Cisco IOS software selects the next authentication method in the method list. This process continues until there is successful communication with a listed authentication method, or until the authentication method list is exhausted, in which case authentication fails.

NOTE:

Cisco IOS software attempts authentication with the next listed authentication method only when there is no response from the previous method. If any device denies authentication, the authentication process stops; no other authentication methods are attempted.
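The fall-through rule in the note above can be expressed in a few lines. In this sketch, each method is modeled as a function that returns True (accept), False (deny), or None (no response); the function and method names are hypothetical and only illustrate the ordering logic.

```python
from typing import Callable, Optional, Sequence

# A method answers True (accept), False (deny), or None (no response).
Method = Callable[[str, str], Optional[bool]]


def authenticate(methods: Sequence[Method], username: str, password: str) -> bool:
    """Walk a method list in order and return the authentication result."""
    for method in methods:
        result = method(username, password)
        if result is None:
            continue       # no response: try the next method in the list
        return result      # accept or deny: the process stops here
    return False           # list exhausted without a response: fail


def unreachable_server(user: str, pw: str) -> Optional[bool]:
    return None            # models a security server that does not respond


def denying_server(user: str, pw: str) -> Optional[bool]:
    return False           # models a server that rejects the user


def local_database(user: str, pw: str) -> Optional[bool]:
    return (user, pw) == ("admin", "example-password")
```

A list of unreachable_server followed by local_database falls through to the local database, but a list of denying_server followed by local_database stops at the denial even though the local database would have accepted the user.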

First, decide what kind of security solution should be implemented. Assess the security risks in the particular network and decide on the appropriate means to prevent unauthorized entry and attack.

Comprehensive AAA

This is a configuration of a Cisco access device for AAA services to be provided by the RADIUS server for an access server with dialup links. If the RADIUS server fails to respond, then the local database will be queried for authentication and authorization information, and accounting services will be handled by a TACACS+ server.

TACACS+

TACACS+ is a security application that provides centralized validation of users attempting to gain access to your switch. TACACS+ services are maintained in a database on a TACACS+ daemon typically running on a UNIX or Windows NT workstation. You should have access to and should configure a TACACS+ server before configuring TACACS+ features on your switch.

Note We recommend a redundant connection between a switch stack and the TACACS+ server. This is to help ensure that the TACACS+ server remains accessible in case one of the connected stack members is removed from the switch stack.

TACACS+ provides for separate and modular authentication, authorization, and accounting facilities. TACACS+ allows for a single access control server (the TACACS+ daemon) to provide each service (authentication, authorization, and accounting) independently. Each service can be tied into its own database to take advantage of other services available on that server or on the network, depending on the capabilities of the daemon.

TACACS+, administered through the AAA security services, can provide these services:

• Authentication – Provides complete control of authentication through login and password dialog, challenge and response, and messaging support.

The authentication facility can conduct a dialog with the user (for example, after a username and password are provided, to challenge a user with several questions, such as home address, mother’s maiden name, service type, and social security number). The TACACS+ authentication service can also send messages to user screens. For example, a message could notify users that their passwords must be changed because of the company’s password aging policy.

• Authorization – Provides fine-grained control over user capabilities for the duration of the user’s session, including but not limited to setting autocommands, access control, session duration, or protocol support. You can also enforce restrictions on what commands a user can execute with the TACACS+ authorization feature.

• Accounting – Collects and sends information used for billing, auditing, and reporting to the TACACS+ daemon. Network managers can use the accounting facility to track user activity for a security audit or to provide information for user billing. Accounting records include user identities, start and stop times, executed commands (such as PPP), number of packets, and number of bytes.

The TACACS+ protocol provides authentication between the switch and the TACACS+ daemon, and it ensures confidentiality because all protocol exchanges between the switch and the TACACS+ daemon are encrypted.

TACACS+ Operation

When a user attempts a simple ASCII login by authenticating to a switch using TACACS+, this process occurs:

1. When the connection is established, the switch contacts the TACACS+ daemon to obtain a username prompt to show to the user. The user enters a username, and the switch then contacts the TACACS+ daemon to obtain a password prompt. The switch displays the password prompt to the user, the user enters a password, and the password is then sent to the TACACS+ daemon.

TACACS+ allows a dialog between the daemon and the user until the daemon receives enough information to authenticate the user. The daemon prompts for a username and password combination, but can include other items, such as the user’s mother’s maiden name.

2. The switch eventually receives one of these responses from the TACACS+ daemon:

• ACCEPT – The user is authenticated and service can begin. If the switch is configured to require authorization, authorization begins at this time.

• REJECT – The user is not authenticated. The user can be denied access or is prompted to retry the login sequence, depending on the TACACS+ daemon.

• ERROR – An error occurred at some time during authentication with the daemon or in the network connection between the daemon and the switch. If an ERROR response is received, the switch typically tries to use an alternative method for authenticating the user.

• CONTINUE – The user is prompted for additional authentication information.

After authentication, the user undergoes an additional authorization phase if authorization has been enabled on the switch. Users must first successfully complete TACACS+ authentication before proceeding to TACACS+ authorization.

3. If TACACS+ authorization is required, the TACACS+ daemon is again contacted, and it returns an ACCEPT or REJECT authorization response. If an ACCEPT response is returned, the response contains data in the form of attributes that direct the EXEC or NETWORK session for that user and the services that the user can access:

• Telnet, Secure Shell (SSH), rlogin, or privileged EXEC services

• Connection parameters, including the host or client IP address, access list, and user timeouts

Identifying the TACACS+ Server Host and Setting the Authentication Key

Configuring TACACS+ Login Authentication

To configure AAA authentication, you define a named list of authentication methods and then apply that list to various ports. The method list defines the types of authentication to be performed and the sequence in which they are performed; it must be applied to a specific port before any of the defined authentication methods are performed. The only exception is the default method list (which, by coincidence, is named default). The default method list is automatically applied to all ports except those that have a named method list explicitly defined. A defined method list overrides the default method list.

A method list describes the sequence and authentication methods to be queried to authenticate a user. You can designate one or more security protocols to be used for authentication, thus ensuring a backup system for authentication in case the initial method fails. The software uses the first method listed to authenticate users; if that method fails to respond, the software selects the next authentication method in the method list. This process continues until there is successful communication with a listed authentication method or until all defined methods are exhausted. If authentication fails at any point in this cycle, meaning that the security server or local username database responds by denying the user access, the authentication process stops, and no other authentication methods are attempted.

Configuring TACACS+ Authorization for Privileged EXEC Access and Network Services

AAA authorization limits the services available to a user. When AAA authorization is enabled, the switch uses information retrieved from the user’s profile, which is located either in the local user database or on the security server, to configure the user’s session. The user is granted access to a requested service only if the information in the user profile allows it.

You can use the aaa authorization global configuration command with the tacacs+ keyword to set parameters that restrict a user’s network access to privileged EXEC mode.

The aaa authorization exec tacacs+ local command sets these authorization parameters:

• Use TACACS+ for privileged EXEC access authorization if authentication was performed by using TACACS+.

• Use the local database if authentication was not performed by using TACACS+.

Note Authorization is bypassed for authenticated users who log in through the CLI even if authorization has been configured.

aaa authorization network tacacs+ Configure the switch for user TACACS+ authorization for all network-related service requests.

aaa authorization exec tacacs+ Configure the switch for user TACACS+ authorization if the user has privileged EXEC access. The exec keyword might return user profile information (such as autocommand information).

Starting TACACS+ Accounting

The AAA accounting feature tracks the services that users are accessing and the amount of network resources that they are consuming. When AAA accounting is enabled, the switch reports user activity to the TACACS+ security server in the form of accounting records. Each accounting record contains accounting attribute-value (AV) pairs and is stored on the security server. This data can then be analyzed for network management, client billing, or auditing.

aaa accounting network start-stop tacacs+ Enable TACACS+ accounting for all network-related service requests.

aaa accounting exec start-stop tacacs+ Enable TACACS+ accounting to send a start-record accounting notice at the beginning of a privileged EXEC process and a stop-record at the end.

RADIUS

Understanding RADIUS

RADIUS is a distributed client/server system that secures networks against unauthorized access. RADIUS clients run on supported Cisco routers and switches. Clients send authentication requests to a central RADIUS server, which contains all user authentication and network service access information. The RADIUS host is normally a multiuser system running RADIUS server software from Cisco (Cisco Secure Access Control Server version 3.0), Livingston, Merit, Microsoft, or another software provider.

Use RADIUS in these network environments that require access security:

• Networks with multiple-vendor access servers, each supporting RADIUS. For example, access servers from several vendors use a single RADIUS server-based security database. In an IP-based network with multiple vendors’ access servers, dial-in users are authenticated through a RADIUS server that has been customized to work with the Kerberos security system.

• Turnkey network security environments in which applications support the RADIUS protocol, such as in an access environment that uses a smart card access control system. In one case, RADIUS has been used with Enigma’s security cards to validate users and to grant access to network resources.

• Networks already using RADIUS. You can add a Cisco switch containing a RADIUS client to the network. This might be the first step when you make a transition to a TACACS+ server.

• Networks in which the user must only access a single service. Using RADIUS, you can control user access to a single host, to a single utility such as Telnet, or to the network through a protocol such as IEEE 802.1X.

• Networks that require resource accounting. You can use RADIUS accounting independently of RADIUS authentication or authorization. The RADIUS accounting functions allow data to be sent at the start and end of services, showing the amount of resources (such as time, packets, bytes, and so forth) used during the session. An Internet service provider might use a freeware-based version of RADIUS access control and accounting software to meet special security and billing needs.

RADIUS is not suitable in these network security situations:

• Multiprotocol access environments. RADIUS does not support AppleTalk Remote Access (ARA), NetBIOS Frame Control Protocol (NBFCP), NetWare Asynchronous Services Interface (NASI), or X.25 PAD connections.

• Switch-to-switch or router-to-router situations. RADIUS does not provide two-way authentication. RADIUS can be used to authenticate from one device to a non-Cisco device if the non-Cisco device requires authentication.

• Networks using a variety of services. RADIUS generally binds a user to one service model.

RADIUS Authentication Configuration

Switch-to-RADIUS-server communication involves several components:

• Host name or IP address
• Authentication destination port
• Accounting destination port
• Key string
• Timeout period
• Retransmission value

You identify RADIUS security servers by their host name or IP address, host name and specific UDP port numbers, or their IP address and specific UDP port numbers. The combination of the IP address and the UDP port number creates a unique identifier, allowing different ports to be individually defined as RADIUS hosts providing a specific AAA service. This unique identifier enables RADIUS requests to be sent to multiple UDP ports on a server at the same IP address.

If two different host entries on the same RADIUS server are configured for the same service (for example, accounting), the second host entry configured acts as a fail-over backup to the first one. Using this example, if the first host entry fails to provide accounting services, the switch tries the second host entry configured on the same device for accounting services.
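
The fail-over behavior described above can be modeled in a few lines of Python. This is a hedged sketch, not switch software: the RadiusHost class, the pick_host function, and the sample address and port numbers are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RadiusHost:
    """One configured host entry (hypothetical model, not a Cisco API)."""
    address: str
    port: int      # UDP destination port
    service: str   # "auth" or "acct"


def pick_host(hosts, service, is_up) -> Optional[RadiusHost]:
    """Return the first reachable entry for a service, in configured order."""
    for host in hosts:
        if host.service == service and is_up(host):
            return host
    return None


# Two accounting entries on the same server differ only by UDP port,
# so the (address, port) pair keeps them distinct.
hosts = [
    RadiusHost("172.20.0.1", 1645, "auth"),
    RadiusHost("172.20.0.1", 1646, "acct"),
    RadiusHost("172.20.0.1", 2000, "acct"),  # fail-over backup for accounting
]

# Pretend the first accounting entry has stopped answering.
down = {("172.20.0.1", 1646)}
chosen = pick_host(hosts, "acct", lambda h: (h.address, h.port) not in down)
print(chosen.port)  # the second accounting entry takes over
```

Because entries are matched on the address and port pair, the two accounting entries stay distinct even though they share one IP address.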

A RADIUS server and the switch use a shared secret text string to encrypt passwords and exchange responses. To configure RADIUS to use the AAA security commands, you must specify the host running the RADIUS server daemon and a secret text (key) string that it shares with the switch.

The timeout, retransmission, and encryption key values can be configured globally for all RADIUS servers, on a per-server basis, or in some combination of global and per-server settings. To apply these settings globally to all RADIUS servers communicating with the switch, use the three unique global configuration commands: radius-server timeout, radius-server retransmit, and radius-server key. To apply these values on a specific RADIUS server, use the radius-server host global configuration command.

radius-server host {hostname | ip-address} [auth-port port-number] [acct-port port-number] [timeout seconds] [retransmit retries] [key string]

Specify the IP address or host name of the remote RADIUS server host.

• (Optional) For auth-port port-number, specify the UDP destination port for authentication requests.

• (Optional) For acct-port port-number, specify the UDP destination port for accounting requests.

• (Optional) For timeout seconds, specify the time interval that the switch waits for the RADIUS server to reply before resending. The range is 1 to 1000. This setting overrides the radius-server timeout global configuration command setting. If no timeout is set with the radius-server host command, the setting of the radius-server timeout command is used.

• (Optional) For retransmit retries, specify the number of times a RADIUS request is resent to a server if that server is not responding or responding slowly. The range is 1 to 1000. If no retransmit value is set with the radius-server host command, the setting of the radius-server retransmit global configuration command is used.

• (Optional) For key string, specify the authentication and encryption key used between the switch and the RADIUS daemon running on the RADIUS server.

Note The key is a text string that must match the encryption key used on the RADIUS server. Always configure the key as the last item in the radius-server host command. Leading spaces are ignored, but spaces within and at the end of the key are used. If you use spaces in your key, do not enclose the key in quotation marks unless the quotation marks are part of the key.

To configure the switch to recognize more than one host entry associated with a single IP address, enter this command as many times as necessary, making sure that each UDP port number is different. The switch software searches for hosts in the order in which you specify them. Set the timeout, retransmit, and encryption key values to use with the specific RADIUS host.
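
The precedence between global and per-server settings can be sketched as follows. The class names, the effective function, and the sample values are assumptions made for this example; they are not part of Cisco IOS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlobalSettings:
    """Values set by the global timeout, retransmit, and key commands."""
    timeout: int
    retransmit: int
    key: str


@dataclass
class HostEntry:
    """One radius-server host entry; None means the option was not given."""
    address: str
    timeout: Optional[int] = None
    retransmit: Optional[int] = None
    key: Optional[str] = None


def effective(host: HostEntry, glob: GlobalSettings) -> dict:
    """Per-server values win; anything unset falls back to the global value."""
    def choose(per_server, global_value):
        return global_value if per_server is None else per_server

    return {
        "timeout": choose(host.timeout, glob.timeout),
        "retransmit": choose(host.retransmit, glob.retransmit),
        "key": choose(host.key, glob.key),
    }


# Sample values only; they are not documented defaults.
glob = GlobalSettings(timeout=5, retransmit=3, key="global-secret")
server_a = HostEntry("172.20.0.1", timeout=10, key="secret-a")
server_b = HostEntry("172.20.0.2")

print(effective(server_a, glob))  # per-server timeout and key, global retransmit
print(effective(server_b, glob))  # all three values come from the global settings
```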

aaa group server radius group-name

Define the AAA server-group with a group name.

server ip-address

Associate a particular RADIUS server with the defined server group. Repeat this step for each RADIUS server in the AAA server group.

RADIUS Authorization Configuration

aaa authorization network radius

Configure the switch for user RADIUS authorization for all network-related service requests.

aaa authorization exec radius

Configure the switch for user RADIUS authorization if the user has privileged EXEC access. The exec keyword might return user profile information (such as autocommand information).

RADIUS Accounting Configuration

aaa accounting network start-stop radius

Enable RADIUS accounting for all network-related service requests.

aaa accounting exec start-stop radius

Enable RADIUS accounting to send a start-record accounting notice at the beginning of a privileged EXEC process and a stop-record at the end.

Kerberos

Kerberos is a secret-key network authentication protocol, which was developed at the Massachusetts Institute of Technology (MIT). It uses the Data Encryption Standard (DES) cryptographic algorithm for encryption and authentication and authenticates requests for network resources. Kerberos uses the concept of a trusted third party to perform secure verification of users and services. This trusted third party is called the key distribution center (KDC). The trusted third party can be a Catalyst 3550 switch that supports Kerberos, that is configured as a network security server, and that can authenticate users by using the Kerberos protocol.

The main purpose of Kerberos is to verify that users are who they claim to be and the network services that they use are what the services claim to be. To do this, a KDC or trusted Kerberos server issues tickets to users. These tickets, which have a limited lifespan, are stored in user credential caches. The Kerberos server uses the tickets instead of usernames and passwords to authenticate users and network services. The Kerberos credential scheme uses a process called single logon. This process authenticates a user once and then allows secure authentication (without encrypting another password) wherever that user credential is accepted.

The current software release supports Kerberos 5, which allows organizations that are already using Kerberos 5 to use the same Kerberos authentication database on the KDC that they are already using on their other network hosts (such as UNIX servers and PCs).

In this software release, Kerberos supports these network services:

• Telnet
• rlogin
• rsh (Remote Shell Protocol)

Kerberos Operation

This section describes how Kerberos operates with a switch that is configured as a network security server. Although you can customize Kerberos in a number of ways, remote users must pass through three layers of security before they can access network services. To authenticate to network services by using a switch as a Kerberos server, remote users must follow these steps:

1. Authenticating to a Boundary Switch
2. Obtaining a TGT from a KDC
3. Authenticating to Network Services

Authenticating to a Boundary Switch

This section describes the first layer of security through which a remote user must pass. The user must first authenticate to the boundary switch. When the remote user authenticates to a boundary switch, this process occurs:

1. The user opens an un-Kerberized Telnet connection to the boundary switch.
2. The switch prompts the user for a username and password.
3. The switch requests a ticket-granting ticket (TGT) from the KDC for this user.
4. The KDC sends an encrypted TGT to the switch that includes the user identity.
5. The switch attempts to decrypt the TGT by using the password that the user entered.

– If the decryption is successful, the user is authenticated to the switch.

– If the decryption is not successful, the user repeats Step 2 by re-entering the username and password (noting if Caps Lock or Num Lock is on or off) or by entering a different username and password.

A remote user who initiates an un-Kerberized Telnet session and authenticates to a boundary switch is inside the firewall, but the user must still authenticate directly to the KDC before getting access to the network services. The user must authenticate to the KDC because the TGT that the KDC issues is stored on the switch and cannot be used for additional authentication until the user logs on to the switch.

Obtaining a TGT from a KDC

This section describes the second layer of security through which a remote user must pass. The user must now authenticate to a KDC and obtain a TGT from the KDC to access network services.

http://www.cisco.com/univercd/cc/td/doc/product/software/ios122/122cgcr/fsecur_c/fsecsp/scfkerb.htm#1000999

Authenticating to Network Services

This section describes the third layer of security through which a remote user must pass. The user with a TGT must now authenticate to the network services in a Kerberos realm.


Configuring Kerberos

So that remote users can authenticate to network services, you must configure the hosts and the KDC in the Kerberos realm to communicate and mutually authenticate users and network services. To do this, you must identify them to each other. You add entries for the hosts to the Kerberos database on the KDC and add KEYTAB files generated by the KDC to all hosts in the Kerberos realm. You also create entries for the users in the KDC database.

When you add or create entries for the hosts and users, follow these guidelines:

• The Kerberos principal name must be in all lowercase characters.
• The Kerberos instance name must be in all lowercase characters.
• The Kerberos realm name must be in all uppercase characters.
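
A small Python check can make these naming guidelines concrete. This is only a sketch: the function name and the sample entries are hypothetical, and the three rules are the ones listed above.

```python
def check_kerberos_name(principal: str, instance: str, realm: str) -> list:
    """Return a list of guideline violations (empty if the entry is fine)."""
    problems = []
    if principal != principal.lower():
        problems.append("principal name must be all lowercase")
    if instance != instance.lower():
        problems.append("instance name must be all lowercase")
    if realm != realm.upper():
        problems.append("realm name must be all uppercase")
    return problems


# Hypothetical entries used only to exercise the check.
print(check_kerberos_name("host", "switch1.example.com", "EXAMPLE.COM"))
print(check_kerberos_name("Host", "switch1.example.com", "example.com"))
```

The first call reports no problems; the second reports the mixed-case principal and the lowercase realm.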

To set up a Kerberos-authenticated server-client system, follow these steps:

• Configure the KDC by using Kerberos commands.
• Configure the switch to use the Kerberos protocol.








Certificate Authority Trustpoints

Certificate authorities (CAs) manage certificate requests and issue certificates to participating network devices. These services provide centralized security key and certificate management for the participating devices. Specific CA servers are referred to as trustpoints.

When a connection attempt is made, the HTTPS server provides a secure connection by issuing a certified X.509v3 certificate, obtained from a specified CA trustpoint, to the client. The client (usually a Web browser), in turn, has a public key that allows it to authenticate the certificate.

For secure HTTP connections, we highly recommend that you configure a CA trustpoint. If a CA trustpoint is not configured for the device running the HTTPS server, the server certifies itself and generates the needed RSA key pair. Because a self-certified (self-signed) certificate does not provide adequate security, the connecting client generates a notification that the certificate is self-certified, and the user has the opportunity to accept or reject the connection. This option is useful for internal network topologies (such as testing).

If you do not configure a CA trustpoint, when you enable a secure HTTP connection, either a temporary or a persistent self-signed certificate for the secure HTTP server (or client) is automatically generated.

• If the device is not configured with a hostname and a domain name, a temporary self-signed certificate is generated. If the device reboots, any temporary self-signed certificate is lost, and a new temporary self-signed certificate is assigned.

• If the device has been configured with a host and domain name, a persistent self-signed certificate is generated. This certificate remains active if you reboot the device or if you disable the secure HTTP server so that it will be there the next time you re-enable a secure HTTP connection.

802.1x Port-Based Authentication

The IEEE 802.1x standard defines a client-server-based access control and authentication protocol that prevents unauthorized clients from connecting to a LAN through publicly accessible ports unless they are properly authenticated. The authentication server authenticates each client connected to a switch port before making available any services offered by the switch or the LAN.

Until the client is authenticated, IEEE 802.1x access control allows only Extensible Authentication Protocol over LAN (EAPOL), Cisco Discovery Protocol (CDP), and Spanning Tree Protocol (STP) traffic through the port to which the client is connected. After authentication is successful, normal traffic can pass through the port.

With 802.1x port-based authentication, the devices in the network have specific roles as follows:

Client – The device (workstation) that requests access to the LAN and switch services and responds to requests from the switch. The workstation must be running 802.1x-compliant client software such as that offered in the Microsoft Windows XP operating system. (The client is the supplicant in the IEEE 802.1x specification.)

Authentication server – Performs the actual authentication of the client. The authentication server validates the identity of the client and notifies the switch whether or not the client is authorized to access the LAN and switch services. Because the switch acts as the proxy, the authentication service is transparent to the client. The RADIUS security system with Extensible Authentication Protocol (EAP) extensions is the only supported authentication server.

Switch (also called the authenticator) – Controls the physical access to the network based on the authentication status of the client. The switch acts as an intermediary (proxy) between the client (supplicant) and the authentication server, requesting identity information from the client, verifying that information with the authentication server, and relaying a response to the client. The switch uses a RADIUS software agent, which is responsible for encapsulating and decapsulating the EAP frames and interacting with the authentication server.

When the switch receives EAPOL frames and relays them to the authentication server, the Ethernet header is stripped and the remaining EAP frame is re-encapsulated in the RADIUS format. The EAP frames are not modified during encapsulation, and the authentication server must support EAP within the native frame format. When the switch receives frames from the authentication server, the server’s frame header is removed, leaving the EAP frame, which is then encapsulated for Ethernet and sent to the client.

The devices that can act as intermediaries include the Catalyst 3750, Catalyst 3560, Catalyst 3550, Catalyst 2970, Catalyst 2955, Catalyst 2950, and Catalyst 2940 switches, or a wireless access point. These devices must be running software that supports the RADIUS client and IEEE 802.1x.
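
The re-encapsulation step performed by the switch can be illustrated with a short Python sketch. The text above does not give frame layouts, so the following are assumptions drawn from the public standards: an untagged 14-byte Ethernet header, the 4-byte EAPOL header defined by IEEE 802.1X, and the RADIUS EAP-Message attribute (type 79) from RFC 3579. The function names and the sample user "alice" are hypothetical.

```python
import struct

EAPOL_ETHERTYPE = 0x888E       # EtherType for EAP over LAN
EAPOL_TYPE_EAP_PACKET = 0      # EAPOL packet type that carries an EAP packet
RADIUS_ATTR_EAP_MESSAGE = 79   # EAP-Message attribute (RFC 3579)
MAX_ATTR_VALUE = 253           # 255-byte attribute limit minus 2 header bytes


def extract_eap(frame: bytes) -> bytes:
    """Strip the Ethernet and EAPOL headers; return the EAP packet unchanged."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != EAPOL_ETHERTYPE:
        raise ValueError("not an EAPOL frame")
    _version, pkt_type, body_len = struct.unpack("!BBH", frame[14:18])
    if pkt_type != EAPOL_TYPE_EAP_PACKET:
        raise ValueError("EAPOL frame does not carry an EAP packet")
    return frame[18:18 + body_len]


def to_radius_attributes(eap: bytes) -> list:
    """Wrap the EAP packet in one or more EAP-Message attributes."""
    attrs = []
    for i in range(0, len(eap), MAX_ATTR_VALUE):
        chunk = eap[i:i + MAX_ATTR_VALUE]
        # Attribute layout: type (1 byte), length (1 byte), value.
        attrs.append(bytes([RADIUS_ATTR_EAP_MESSAGE, len(chunk) + 2]) + chunk)
    return attrs


# Build a sample EAP-Response/Identity frame (code 2, identifier 1, type 1).
identity = b"alice"
eap = struct.pack("!BBHB", 2, 1, 5 + len(identity), 1) + identity
eapol = struct.pack("!BBH", 1, EAPOL_TYPE_EAP_PACKET, len(eap)) + eap
ethernet = (bytes.fromhex("0180c2000003")      # destination MAC (PAE group address)
            + bytes.fromhex("001122334455")    # source MAC (made up)
            + struct.pack("!H", EAPOL_ETHERTYPE))
frame = ethernet + eapol

attrs = to_radius_attributes(extract_eap(frame))
print(len(attrs), attrs[0].hex())
```

The EAP bytes pass through untouched; only the outer Ethernet and EAPOL wrapping is replaced, which matches the statement that EAP frames are not modified during encapsulation.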

Authentication Initiation and Message Exchange

The switch port state determines whether or not the client is granted access to the network. The port starts in the unauthorized state. While in this state, the port disallows all ingress and egress traffic except for 802.1x protocol packets. When a client is successfully authenticated, the port transitions to the authorized state, allowing all traffic for the client to flow normally.

If the switch requests the client identity (authenticator initiation) and the client does not support 802.1x, the port remains in the unauthorized state and the client is not granted access to the network.

The switch or the client can initiate authentication. If you enable authentication on a port by using the dot1x port-control auto interface configuration command, the switch initiates authentication when the link state changes from down to up or periodically as long as the port remains up and unauthenticated. The switch sends an EAP-request/identity frame to the client to request its identity. Upon receipt of the frame, the client responds with an EAP-response/identity frame.

However, if during bootup, the client does not receive an EAP-request/identity frame from the switch, the client can initiate authentication by sending an EAPOL-start frame, which prompts the switch to request the client’s identity.

Note If IEEE 802.1x is not enabled or supported on the network access device, any EAPOL frames from the client are dropped. If the client does not receive an EAP-request/identity frame after three attempts to start authentication, the client sends frames as if the port is in the authorized state. A port in the authorized state effectively means that the client has been successfully authenticated.

You control the port authorization state by using the dot1x port-control interface configuration command and these keywords:

force-authorized – Disables 802.1x port-based authentication and causes the port to transition to the authorized state without any authentication exchange required. The port transmits and receives normal traffic without 802.1x-based authentication of the client. This is the default setting.

force-unauthorized – Causes the port to remain in the unauthorized state, ignoring all attempts by the client to authenticate. The switch cannot provide authentication services to the client through the interface.

auto – Enables 802.1x port-based authentication and causes the port to begin in the unauthorized state, allowing only EAPOL frames to be sent and received through the port. The authentication process begins when the link state of the port transitions from down to up (authenticator initiation) or when an EAPOL-start frame is received (supplicant initiation). The switch requests the identity of the client and begins relaying authentication messages between the client and the authentication server. The switch uniquely identifies each client attempting to access the network by using the client MAC address.

If the client is successfully authenticated (receives an Accept frame from the authentication server), the port state changes to authorized, and all frames from the authenticated client are allowed through the port. If the authentication fails, the port remains in the unauthorized state, but authentication can be retried. If the authentication server cannot be reached, the switch can retransmit the request. If no response is received from the server after the specified number of attempts, authentication fails, and network access is not granted.

When a client logs off, it sends an EAPOL-logoff message, causing the switch port to transition to the unauthorized state.
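
The port-control keywords and state transitions described in this section can be summarized as a toy state machine. This Python sketch is a simplification under stated assumptions: the Dot1xPort class and its method names are hypothetical, and retries, timers, and re-authentication are left out.

```python
class Dot1xPort:
    """Toy model of the dot1x port-control keywords (not switch code)."""

    def __init__(self, mode="force-authorized"):
        self.mode = mode
        # force-authorized ports skip the authentication exchange entirely.
        self.authorized = (mode == "force-authorized")

    def link_up(self):
        # In auto mode the port always starts out unauthorized.
        if self.mode == "auto":
            self.authorized = False

    def server_reply(self, accepted: bool):
        # Only auto mode reacts to the authentication server's answer.
        if self.mode == "auto":
            self.authorized = accepted

    def eapol_logoff(self):
        # A client logoff returns an auto-mode port to unauthorized.
        if self.mode == "auto":
            self.authorized = False

    def permits(self, frame_type: str) -> bool:
        # An unauthorized port passes only 802.1x protocol (EAPOL) frames.
        return self.authorized or frame_type == "eapol"


port = Dot1xPort("auto")
port.link_up()
print(port.permits("data"))   # False: not yet authenticated
port.server_reply(accepted=True)
print(port.permits("data"))   # True: port is now authorized
port.eapol_logoff()
print(port.permits("data"))   # False: back to unauthorized
```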

The specific exchange of EAP frames depends on the authentication method being used. Here is an example message exchange initiated by the client using the One-Time-Password (OTP) authentication method with a RADIUS server.

IEEE 802.1x Accounting

The IEEE 802.1x standard defines how users are authorized and authenticated for network access but does not keep track of network usage. IEEE 802.1x accounting is disabled by default. You can enable IEEE 802.1x accounting to monitor this activity on IEEE 802.1x-enabled ports:

• User successfully authenticates.
• User logs off.
• Link-down occurs.
• Re-authentication successfully occurs.
• Re-authentication fails.

The switch does not log IEEE 802.1x accounting information. Instead, it sends this information to the RADIUS server, which must be configured to log accounting messages.

IEEE 802.1x Accounting Attribute-Value Pairs

The information sent to the RADIUS server is represented in the form of Attribute-Value (AV) pairs. These AV pairs provide data for different applications. (For example, a billing application might require information that is in the Acct-Input-Octets or the Acct-Output-Octets attributes of a RADIUS packet.)

AV pairs are automatically sent by a switch that is configured for IEEE 802.1x accounting. Three types of RADIUS accounting packets are sent by a switch:

• START – sent when a new user session starts
• INTERIM – sent during an existing session for updates
• STOP – sent when a session terminates

IEEE 802.1x Host Mode

You can configure an IEEE 802.1x port for single-host or for multiple-hosts mode. In single-host mode, only one client can be connected to the IEEE 802.1x-enabled switch port. The switch detects the client by sending an EAPOL frame when the port link state changes to the up state. If a client leaves or is replaced with another client, the switch changes the port link state to down, and the port returns to the unauthorized state.

In multiple-hosts mode, you can attach multiple hosts to a single IEEE 802.1x-enabled port. In this mode, only one of the attached clients must be authorized for all clients to be granted network access. If the port becomes unauthorized (re-authentication fails or an EAPOL-logoff message is received), the switch denies network access to all of the attached clients. A typical example is IEEE 802.1x port-based authentication in a wireless LAN. In this topology, the wireless access point is responsible for authenticating the clients attached to it, and it also acts as a client to the switch.

With the multiple-hosts mode enabled, you can use IEEE 802.1x to authenticate the port and port security to manage network access for all MAC addresses, including that of the client.

Using IEEE 802.1x with Port Security

You can configure an IEEE 802.1x port with port security in either single-host or multiple-hosts mode. (You also must configure port security on the port by using the switchport port-security interface configuration command.) When you enable port security and IEEE 802.1x on a port, IEEE 802.1x authenticates the port, and port security manages network access for all MAC addresses, including that of the client. You can then limit the number or group of clients that can access the network through an IEEE 802.1x port.

These are some examples of the interaction between IEEE 802.1x and port security on the switch:

• When a client is authenticated, and the port security table is not full, the client MAC address is added to the port security list of secure hosts. The port then proceeds to come up normally.

When a client is authenticated and manually configured for port security, it is guaranteed an entry in the secure host table (unless port security static aging has been enabled).

A security violation occurs if the client is authenticated, but the port security table is full. This can happen if the maximum number of secure hosts has been statically configured or if the client ages out of the secure host table. If the client address is aged, its place in the secure host table can be taken by another host.

If the security violation is caused by the first authenticated host, the port becomes error-disabled and immediately shuts down.

The port security violation modes determine the action for security violations.

• When you manually remove an IEEE 802.1x client address from the port security table by using the no switchport port-security mac-address mac-address interface configuration command, you should re-authenticate the IEEE 802.1x client by using the dot1x re-authenticate interface interface-id privileged EXEC command.

• When an IEEE 802.1x client logs off, the port changes to an unauthenticated state, and all dynamic entries in the secure host table are cleared, including the entry for the client. Normal authentication then takes place.

• If the port is administratively shut down, the port becomes unauthenticated, and all dynamic entries are removed from the secure host table.

• Port security and a voice VLAN can be configured simultaneously on an IEEE 802.1x port that is in either single-host or multiple-hosts mode. Port security applies to both the voice VLAN identifier (VVID) and the port VLAN identifier (PVID).
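
The interaction between IEEE 802.1x and the secure host table in the examples above can be approximated with a small model. This Python sketch makes simplifying assumptions: the SecurePort class and its method names are hypothetical, aging and violation modes are not modeled, and a violation is reported as a string rather than acted on.

```python
class SecurePort:
    """Toy secure host table for an IEEE 802.1x port (illustrative only)."""

    def __init__(self, max_hosts: int):
        self.max_hosts = max_hosts
        self.static = set()    # manually configured secure addresses
        self.dynamic = set()   # addresses learned after authentication

    def add_static(self, mac: str):
        self.static.add(mac)

    def client_authenticated(self, mac: str) -> str:
        # A known address (static or already learned) keeps its entry.
        if mac in self.static or mac in self.dynamic:
            return "secured"
        # An authenticated client with no free slot is a security violation.
        if len(self.static) + len(self.dynamic) >= self.max_hosts:
            return "violation"
        self.dynamic.add(mac)
        return "secured"

    def client_logoff(self):
        # Logoff (or an administrative shutdown) clears all dynamic entries.
        self.dynamic.clear()


port = SecurePort(max_hosts=2)
port.add_static("0000.1111.aaaa")
print(port.client_authenticated("0000.1111.aaaa"))  # secured (static entry)
print(port.client_authenticated("0000.1111.bbbb"))  # secured (table had room)
print(port.client_authenticated("0000.1111.cccc"))  # violation (table full)
port.client_logoff()
print(port.client_authenticated("0000.1111.cccc"))  # secured (dynamic entries cleared)
```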

Using IEEE 802.1x with Voice VLAN Ports

A voice VLAN port is a special access port associated with two VLAN identifiers:

• VVID to carry voice traffic to and from the IP phone. The VVID is used to configure the IP phone connected to the port.

• PVID to carry the data traffic to and from the workstation connected to the switch through the IP phone. The PVID is the native VLAN of the port.

Before Cisco IOS Release 12.1(14)EA1, a switch in single-host mode accepted traffic from a single host, and voice traffic was not allowed. In multiple-hosts mode, the switch did not accept voice traffic until the client was authenticated on the primary VLAN, thus making the IP phone inoperable until the user logged in.

With Cisco IOS Release 12.1(14)EA1 and later, the IP phone uses the VVID for its voice traffic, regardless of the authorization state of the port. This allows the phone to work independently of IEEE 802.1x authentication.

In single-host mode, only the IP phone is allowed on the voice VLAN. In multiple-hosts mode, additional clients can send traffic on the voice VLAN after a supplicant is authenticated on the PVID. When multiple-hosts mode is enabled, the supplicant authentication affects both the PVID and the VVID.

A voice VLAN port becomes active when there is a link, and the device MAC address appears after the first CDP message from the IP phone. Cisco IP phones do not relay CDP messages from other devices. As a result, if several IP phones are connected in series, the switch recognizes only the one directly connected to it. When IEEE 802.1x is enabled on a voice VLAN port, the switch drops packets from unrecognized IP phones more than one hop away.

When IEEE 802.1x is enabled on a port, you cannot configure a port VLAN that is equal to a voice VLAN.

Note If you enable IEEE 802.1x on an access port on which a voice VLAN is configured and to which a Cisco IP Phone is connected, the Cisco IP phone loses connectivity to the switch for up to 30 seconds.

Using IEEE 802.1x with VLAN Assignment

Before Cisco IOS Release 12.1(14)EA1, when an IEEE 802.1x port was authenticated, it was authorized to be in the access VLAN configured on the port even if the RADIUS server returned an authorized VLAN from its database. Recall that an access VLAN is a VLAN assigned to an access port. All packets sent from or received on this port belong to this VLAN.

However, with Cisco IOS Release 12.1(14)EA1 and later releases, the switch supports IEEE 802.1x with VLAN assignment. After successful IEEE 802.1x authentication of a port, the RADIUS server sends the VLAN assignment to configure the switch port. The RADIUS server database maintains the username-to-VLAN mappings, assigning the VLAN based on the username of the client connected to the switch port. You can use this feature to limit network access for certain users.

When configured on the switch and the RADIUS server, IEEE 802.1x with VLAN assignment has these characteristics:

• If no VLAN is supplied by the RADIUS server or if IEEE 802.1x authorization is disabled, the port is configured in its access VLAN after successful authentication.

• If IEEE 802.1x authorization is enabled but the VLAN information from the RADIUS server is not valid, the port returns to the unauthorized state and remains in the configured access VLAN. This prevents ports from appearing unexpectedly in an inappropriate VLAN because of a configuration error. Configuration errors could include specifying a VLAN for a routed port, a malformed VLAN ID, a nonexistent or internal (routed port) VLAN ID, or an attempted assignment to a voice VLAN ID.

• If IEEE 802.1x authorization is enabled and all information from the RADIUS server is valid, the port is placed in the specified VLAN after authentication.

• If the multiple-hosts mode is enabled on an IEEE 802.1x port, all hosts are placed in the same VLAN (specified by the RADIUS server) as the first authenticated host.

• If IEEE 802.1x and port security are enabled on a port, the port is placed in the RADIUS server-assigned VLAN.

• If IEEE 802.1x is disabled on the port, it is returned to the configured access VLAN.

When the port is in the force authorized, force unauthorized, unauthorized, or shutdown state, it is put into the configured access VLAN.

If an IEEE 802.1x port is authenticated and put in the RADIUS server-assigned VLAN, any change to the port access VLAN configuration does not take effect.

The IEEE 802.1x with VLAN assignment feature is not supported on trunk ports, dynamic ports, or with dynamic-access port assignment through a VLAN Membership Policy Server (VMPS).

To configure VLAN assignment you need to perform these tasks:

• Enable AAA authorization by using the network keyword to allow interface configuration from the RADIUS server.

• Enable IEEE 802.1x. (The VLAN assignment feature is automatically enabled when you configure IEEE 802.1x on an access port.)

• Assign vendor-specific tunnel attributes in the RADIUS server. The RADIUS server must return these attributes to the switch:

– [64] Tunnel-Type = VLAN
– [65] Tunnel-Medium-Type = 802
– [81] Tunnel-Private-Group-ID = VLAN name or VLAN ID

Attribute [64] must contain the value VLAN (type 13). Attribute [65] must contain the value 802 (type 6). Attribute [81] specifies the VLAN name or VLAN ID assigned to the IEEE 802.1x-authenticated user.
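
The VLAN assignment rules in this section can be condensed into one decision function. The following Python sketch is illustrative only: the assigned_vlan function, its parameters, and the dictionary representation of the RADIUS attributes are assumptions, and valid_vlans stands in for whatever set of VLANs the switch would accept (for example, excluding voice and internal VLANs).

```python
def assigned_vlan(access_vlan, authorization_enabled, attrs, valid_vlans):
    """Return (authorized, vlan) for a port that has just passed authentication."""
    vlan = attrs.get("Tunnel-Private-Group-ID")

    # No VLAN supplied, or authorization disabled: use the access VLAN.
    if not authorization_enabled or vlan is None:
        return True, access_vlan

    # All three tunnel attributes must be present and acceptable.
    well_formed = (
        attrs.get("Tunnel-Type") == "VLAN"
        and attrs.get("Tunnel-Medium-Type") == "802"
        and vlan in valid_vlans
    )
    if not well_formed:
        # Invalid information: port goes unauthorized, stays in the access VLAN.
        return False, access_vlan

    return True, vlan


good = {"Tunnel-Type": "VLAN", "Tunnel-Medium-Type": "802",
        "Tunnel-Private-Group-ID": 20}
bad = dict(good, **{"Tunnel-Private-Group-ID": 99})  # nonexistent VLAN

print(assigned_vlan(10, True, good, {10, 20}))   # (True, 20)
print(assigned_vlan(10, True, {}, {10, 20}))     # (True, 10)
print(assigned_vlan(10, True, bad, {10, 20}))    # (False, 10)
```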

Using IEEE 802.1x with Guest VLAN

You can configure a guest VLAN for each IEEE 802.1x port on the switch to provide limited services to clients, such as downloading the IEEE 802.1x client. These clients might be upgrading their system for IEEE 802.1x authentication, and some hosts, such as Windows 98 systems, might not be IEEE 802.1x-capable.

When you enable a guest VLAN on an IEEE 802.1x port, the switch assigns clients to a guest VLAN when the switch does not receive a response to its EAP-request/identity frame or when EAPOL packets are not sent by the client.

Before Cisco IOS Release 12.2(25)SE, the switch did not maintain the EAPOL packet history and allowed clients that failed authentication access to the guest VLAN, regardless of whether EAPOL packets had been detected on the interface. You can enable this optional behavior by using the dot1x guest-vlan supplicant global configuration command.

With Cisco IOS Release 12.2(25)SE and later, the switch maintains the EAPOL packet history. If another EAPOL packet is detected on the interface during the lifetime of the link, network access is denied. The EAPOL history is reset upon loss of the link.

Any number of IEEE 802.1x-incapable clients are allowed access when the switch port is moved to the guest VLAN. If an IEEE 802.1x-capable client joins the same port on which the guest VLAN is configured, the port is put into the unauthorized state in the user-configured access VLAN, and authentication is restarted.

Guest VLANs are supported on IEEE 802.1x ports in single-host or multiple-hosts mode. You can configure any active VLAN except an RSPAN VLAN, private VLAN, or a voice VLAN as an IEEE 802.1x guest VLAN. The guest VLAN feature is not supported on internal VLANs (routed ports) or trunk ports; it is supported only on access ports.
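
The guest VLAN decision, including the EAPOL packet history introduced in Cisco IOS Release 12.2(25)SE, can be sketched as a single predicate. This Python fragment is a simplified model: the function name and parameters are hypothetical, and supplicant_option stands for the older behavior enabled by the dot1x guest-vlan supplicant command.

```python
def guest_vlan_applies(guest_vlan, eapol_seen, supplicant_option=False):
    """Decide whether an unauthenticated port may fall back to the guest VLAN."""
    if guest_vlan is None:
        return False          # no guest VLAN configured on this port
    if supplicant_option:
        return True           # older behavior: EAPOL history is ignored
    # Newer behavior: any EAPOL packet seen during the lifetime of the link
    # marks the client as 802.1x-capable, so guest access is denied.
    return not eapol_seen


print(guest_vlan_applies(guest_vlan=50, eapol_seen=False))  # True
print(guest_vlan_applies(guest_vlan=50, eapol_seen=True))   # False
print(guest_vlan_applies(guest_vlan=50, eapol_seen=True,
                         supplicant_option=True))           # True
```

Because the history is reset when the link is lost, a real implementation would clear eapol_seen on link-down; that step is omitted here.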

Using IEEE 802.1x with Wake-on-LAN

The IEEE 802.1x wake-on-LAN (WoL) feature allows dormant PCs to be powered on when the switch receives a specific Ethernet frame, known as the magic packet. You can use this feature in environments where administrators need to connect to systems that have been powered down.

When hosts that use WoL are attached through IEEE 802.1x ports and the host powers down, the IEEE 802.1x port becomes unauthorized. In this state, the port can only receive and send EAPOL packets, and WoL magic packets cannot reach the host. When the PC is powered down, it is not authenticated, and the switch port is not opened.

When the switch uses IEEE 802.1x with WoL, the switch sends packets to unauthorized IEEE 802.1x ports. This feature is also known as the Unidirectional Controlled Port in the IEEE 802.1x specification.

Note If PortFast is not enabled on the port, the port is forced to the bidirectional state.

Unidirectional State

When you configure a port as unidirectional by using the dot1x control-direction in interface configuration command, the port changes to the spanning-tree forwarding state.

When WoL is enabled, the connected host is in the sleeping mode or power-down state. The host does not exchange traffic with other devices in the network. A host connected to the unidirectional port cannot send traffic to the network; it can only receive traffic from other devices in the network.

If the unidirectional port receives incoming traffic, the port returns to the default bidirectional state, and the port changes to the spanning-tree blocking state. When the port changes to the initialize state, no traffic other than EAPOL packets is allowed. When the port returns to the bidirectional state, the switch starts a 5-minute timer. If the port is not authenticated before the timer expires, the port becomes a unidirectional port.

Bidirectional State

When you configure a port as bidirectional by using the dot1x control-direction both interface configuration command, the port is access-controlled in both directions. In this state, the switch port does not receive or send packets.
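The transitions between the unidirectional and bidirectional states can be summarized as a small state model. This is a minimal Python sketch under stated assumptions: the class WolPort, its attribute names, and the use of the strings "in" and "both" for the two states are choices made for this illustration; only the PortFast rule, the fallback on incoming traffic, and the 5-minute timer come from the text.

```python
class WolPort:
    """Toy model of control direction on an IEEE 802.1x port used with WoL."""

    REAUTH_WINDOW_S = 300  # the 5-minute timer described in the text

    def __init__(self, portfast_enabled=True):
        self.portfast_enabled = portfast_enabled
        self.direction = "both"   # default bidirectional state
        self.authenticated = False
        self.timer_s = None       # None means no timer is running

    def set_unidirectional(self):
        # Without PortFast the port is forced to the bidirectional state.
        self.direction = "in" if self.portfast_enabled else "both"

    def on_ingress_traffic(self):
        # Incoming traffic returns a unidirectional port to bidirectional
        # and starts the timer.
        if self.direction == "in":
            self.direction = "both"
            self.timer_s = self.REAUTH_WINDOW_S

    def tick(self, seconds):
        # Advance the timer; fall back to unidirectional if the port is
        # still unauthenticated when the timer expires.
        if self.timer_s is None:
            return
        self.timer_s -= seconds
        if self.timer_s <= 0:
            self.timer_s = None
            if not self.authenticated:
                self.direction = "in"

    def can_send_to_host(self):
        # Magic packets reach the host only if the port is authorized
        # or controlled in the inbound direction only.
        return self.authenticated or self.direction == "in"
```

The model keeps a single timer per port, which is enough to show why a sleeping host stays reachable by magic packets after a failed wake-up attempt.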

Using IEEE 802.1x with Per-User ACLs

You can enable per-user access control lists (ACLs) to provide different levels of network access and service to an IEEE 802.1x-authenticated user. When the RADIUS server authenticates a user connected to an IEEE 802.1x port, it retrieves the ACL attributes based on the user identity and sends them to the switch. The switch applies the attributes to the IEEE 802.1x port for the duration of the user session. The switch removes the per-user ACL configuration when the session is over, if authentication fails, or if a link-down condition occurs. The switch does not save RADIUS-specified ACLs in the running configuration. When the port is unauthorized, the switch removes the ACL from the port.

You can configure router ACLs and input port ACLs on the same switch. However, a port ACL takes precedence over a router ACL. If you apply an input port ACL to an interface that belongs to a VLAN, the port ACL takes precedence over an input router ACL applied to the VLAN interface. Incoming packets received on the port to which a port ACL is applied are filtered by the port ACL. Incoming routed packets received on other ports are filtered by the router ACL. Outgoing routed packets are filtered by the router ACL. To avoid configuration conflicts, you should carefully plan the user profiles stored on the RADIUS server.

RADIUS supports per-user attributes, including vendor-specific attributes. These vendor-specific attributes (VSAs) are in octet-string format and are passed to the switch during the authentication process. The VSAs used for per-user ACLs are inacl#<n> for the ingress direction and outacl#<n> for the egress direction. MAC ACLs are supported only in the ingress direction. The switch supports VSAs only in the ingress direction. It does not support port ACLs in the egress direction on Layer 2 ports.

Use only the extended ACL syntax style to define the per-user configuration stored on the RADIUS server. When the definitions are passed from the RADIUS server, they are created by using the extended naming convention. However, if you use the Filter-Id attribute, it can point to a standard ACL.

You can use the Filter-Id attribute to specify an inbound or outbound ACL that is already configured on the switch. The attribute contains the ACL number followed by .in for ingress filtering or .out for egress filtering. If the RADIUS server does not allow the .in or .out syntax, the access list is applied to the outbound ACL by default. Because of limited support of Cisco IOS access lists on the switch, the Filter-Id attribute is supported only for IP ACLs numbered 1 to 199 and 1300 to 2699 (IP standard and IP extended ACLs).
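The Filter-Id rules above (ACL number, optional .in or .out suffix, outbound by default, and the supported number ranges) can be checked mechanically. The following Python sketch is an illustration, not switch code: the function name parse_filter_id and its return format are assumptions made here.

```python
import re

# ACL number, optionally followed by ".in" or ".out".
_FILTER_ID = re.compile(r"^(\d+)(?:\.(in|out))?$")


def parse_filter_id(value):
    """Return (acl_number, direction) for a Filter-Id such as '101.in'."""
    match = _FILTER_ID.match(value.strip())
    if match is None:
        raise ValueError(f"not a numbered Filter-Id: {value!r}")
    number = int(match.group(1))
    # Only IP standard and extended ACL numbers are supported.
    if not (1 <= number <= 199 or 1300 <= number <= 2699):
        raise ValueError(f"ACL {number} is outside 1-199 and 1300-2699")
    # Without a suffix, the list is applied as the outbound ACL.
    direction = match.group(2) or "out"
    return number, direction
```

With this sketch, parse_filter_id("101.in") yields (101, "in") and parse_filter_id("1300") yields (1300, "out"), while a value such as "200.in" is rejected.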

Only one IEEE 802.1x-authenticated user is supported on a port. If the multiple-hosts mode is enabled on the port, the per-user ACL attribute is disabled for the associated port.

The maximum size of the per-user ACL is 4000 ASCII characters.

To configure per-user ACLs, you need to perform these tasks:

• Enable AAA authentication.

• Enable AAA authorization by using the network keyword to allow interface configuration from the RADIUS server.

• Enable IEEE 802.1x.

• Configure the user profile and VSAs on the RADIUS server.

• Configure the IEEE 802.1x port for single-host mode.

IEEE 802.1x and Switch Stacks

If a switch is added to or removed from a switch stack, IEEE 802.1x authentication is not affected as long as the IP connectivity between the RADIUS server and the stack remains intact. This statement also applies if the stack master is removed from the switch stack. Note that if the stack master fails, a stack member becomes the new stack master by using the election process and the IEEE 802.1x authentication process continues as usual.

If IP connectivity to the RADIUS server is interrupted because the switch that was connected to the server is removed or fails, these events occur:

• Ports that are already authenticated and that do not have periodic re-authentication enabled remain in the authenticated state. Communication with the RADIUS server is not required.

• Ports that are already authenticated and that have periodic re-authentication enabled (with the dot1x re-authentication global configuration command) fail the authentication process when the re-authentication occurs. Ports return to the unauthenticated state during the re-authentication process. Communication with the RADIUS server is required.

For an ongoing authentication, the authentication fails immediately because there is no server connectivity.

If the switch that failed comes up and rejoins the switch stack, the authentications might or might not fail depending on the boot-up time and whether the connectivity to the RADIUS server is re-established by the time the authentication is attempted.

To avoid loss of connectivity to the RADIUS server, you should ensure that there is a redundant connection to it. For example, you can have a redundant connection to the stack master and another to a stack member, and if the stack master fails, the switch stack still has connectivity to the RADIUS server.

IEEE 802.1x Configuration Guidelines

These are the IEEE 802.1x authentication configuration guidelines:

• When IEEE 802.1x is enabled, ports are authenticated before any other Layer 2 or Layer 3 features are enabled.

• The IEEE 802.1x protocol is supported on Layer 2 static-access ports, voice VLAN ports, and Layer 3 routed ports, but it is not supported on these port types:

– Trunk port: If you try to enable IEEE 802.1x on a trunk port, an error message appears, and IEEE 802.1x is not enabled. If you try to change the mode of an IEEE 802.1x-enabled port to trunk, an error message appears, and the port mode is not changed.

– Dynamic ports: A port in dynamic mode can negotiate with its neighbor to become a trunk port. If you try to enable IEEE 802.1x on a dynamic port, an error message appears, and IEEE 802.1x is not enabled. If you try to change the mode of an IEEE 802.1x-enabled port to dynamic, an error message appears, and the port mode is not changed.

– Dynamic-access ports: If you try to enable IEEE 802.1x on a dynamic-access (VLAN Query Protocol [VQP]) port, an error message appears, and IEEE 802.1x is not enabled. If you try to change an IEEE 802.1x-enabled port to dynamic VLAN assignment, an error message appears, and the VLAN configuration is not changed.

– EtherChannel port: Do not configure a port that is an active or a not-yet-active member of an EtherChannel as an IEEE 802.1x port. If you try to enable IEEE 802.1x on an EtherChannel port, an error message appears, and IEEE 802.1x is not enabled.

Note In software releases earlier than Cisco IOS Release 12.2(18)SE, if IEEE 802.1x is enabled on a not-yet-active port of an EtherChannel, the port does not join the EtherChannel.

– Switched Port Analyzer (SPAN) and Remote SPAN (RSPAN) destination ports: You can enable IEEE 802.1x on a port that is a SPAN or RSPAN destination port. However, IEEE 802.1x is disabled until the port is removed as a SPAN or RSPAN destination port. You can enable IEEE 802.1x on a SPAN or RSPAN source port.

• You can configure any VLAN except an RSPAN VLAN, private VLAN, or a voice VLAN as an IEEE 802.1x guest VLAN. The guest VLAN feature is not supported on internal VLANs (routed ports) or trunk ports; it is supported only on access ports.

• When IEEE 802.1x is enabled on a port, you cannot configure a port VLAN that is equal to a voice VLAN.

• The IEEE 802.1x with VLAN assignment feature is not supported on private-VLAN ports, trunk ports, dynamic ports, or with dynamic-access port assignment through a VMPS.

• You can configure IEEE 802.1x on a private-VLAN port, but do not configure IEEE 802.1x with port security, voice VLAN, guest VLAN, or a per-user ACL on private-VLAN ports.

• Before globally enabling IEEE 802.1x on a switch by entering the dot1x system-auth-control global configuration command, remove the EtherChannel configuration from the interfaces on which IEEE 802.1x and EtherChannel are configured.

• If you are using a device running the Cisco Access Control Server (ACS) application for IEEE 802.1x authentication with EAP-Transport Layer Security (TLS) and EAP-MD5 and your switch is running Cisco IOS Release 12.1(14)EA1, make sure that the device is running ACS Version 3.2.1 or later.

• After you configure a guest VLAN for an IEEE 802.1x port to which a DHCP client is connected, you might need to get a host IP address from a DHCP server. You can also change the settings for restarting the IEEE 802.1x authentication process on the switch before the DHCP process on the client times out and tries to get a host IP address from the DHCP server. Decrease the settings for the IEEE 802.1x authentication process (IEEE 802.1x quiet period and switch-to-client transmission time).
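The port-type restrictions in the guidelines above amount to a lookup from port type to outcome. The Python sketch below is one way to express that lookup; the enum Dot1xResult, the function enable_dot1x, and the port-type labels are names invented for this illustration and are not IOS keywords.

```python
from enum import Enum


class Dot1xResult(Enum):
    ACTIVE = "enabled and active"
    INACTIVE = "configured, but disabled while the port is a destination"
    REJECTED = "error message, IEEE 802.1x not enabled"


# Port types on which enabling IEEE 802.1x produces an error message.
_REJECTED = {"trunk", "dynamic", "dynamic-access", "etherchannel"}
# Port types on which IEEE 802.1x is supported.
_SUPPORTED = {"static-access", "voice-vlan", "routed", "span-source"}


def enable_dot1x(port_type):
    """Return the expected outcome of enabling IEEE 802.1x on a port type."""
    if port_type in _REJECTED:
        return Dot1xResult.REJECTED
    if port_type == "span-destination":
        # Allowed, but inactive until the port stops being a destination.
        return Dot1xResult.INACTIVE
    if port_type in _SUPPORTED:
        return Dot1xResult.ACTIVE
    raise ValueError(f"unknown port type: {port_type!r}")
```

A table like this is convenient in a pre-deployment check, because SPAN and RSPAN destination ports are the one case that is accepted without taking effect.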

802.1X Configuration

Default configuration

Example configuration:

802.1X Guest VLAN Configuration

CDP security issues

Attackers with knowledge of how Cisco Discovery Protocol (CDP) works could find ways to take advantage of the clear-text CDP packets to gain knowledge of the edge of the network. CDP runs at Layer 2 and allows Cisco devices to identify themselves to other Cisco devices. However, the information sent through CDP is transmitted in clear text and is unauthenticated. Using a packet analyzer, attackers could glean information about the network device from CDP advertisements.

CDP is necessary for management applications and cannot be disabled without impairing some network-management applications. However, CDP can be selectively disabled on interfaces where management is not being performed.

Telnet vulnerabilities

Known Telnet Vulnerabilities:

* All usernames, passwords and data that are sent over the public network in clear text are vulnerable.

* A user with an account on the system could gain elevated privileges.

* A remote attacker could crash the Telnet service, preventing legitimate use of that service.

* A remote attacker could find an enabled guest account that may be present anywhere within the trusted domains of the server.

Switch security considerations and security policies

Network security vulnerabilities include loss of privacy, data theft, impersonation, and loss of integrity. Basic security measures should be taken on every network to mitigate adverse effects of user negligence or acts of malicious intent.

As a best practice, follow these general steps whenever placing new equipment in service.

1. Consider or establish organizational security policies.

2. Secure switch devices.

3. Secure switch protocols.

4. Mitigate compromises launched through a switch.

Organizational Security Policies

It is important to consider the policies of an organization when determining what level of security and what type of security should be implemented. There is a need to balance the goal of reasonable network security against the administrative overhead that is clearly associated with extremely restrictive security measures.

A well-established security policy has these characteristics:

• Provides a process for auditing existing network security.

• Provides a general security framework for implementing network security.

• Defines behaviors toward electronic data that are disallowed.

• Determines which tools and procedures are needed for the organization.

• Communicates consensus among a group of key decision makers and defines responsibilities of users and administrators.

• Defines a process for handling network security incidents.

• Enables an enterprise-wide, all-site security implementation and enforcement plan.

Securing switch access

Follow these Best Practices for secure switch access:

Set system passwords – Use the enable secret command to set the password that grants enabled access to the IOS system. Because the enable secret command simply implements a Message Digest 5 (MD5) hash on the configured password, that password still remains vulnerable to dictionary attacks. Therefore, apply standard practices in selecting a strong password. Try to pick passwords that contain both letters and numbers as well as special characters, for example, $pecia1$ instead of "specials," where the "s" has been replaced with "$" and the "l" has been replaced with "1" (one).

Secure access to the console – Console access requires a minimum level of security both physically and logically. An individual who gains console access to a system will gain the ability to recover or reset the system-enable password, thus giving that person the ability to bypass all other security implemented on that system. Consequently, it is imperative to secure access to the console.

Secure access to VTY lines – The minimum recommended security for Telnet access is:

o Apply the basic ACL for in-band access to all VTY lines.

o Configure a line password for all configured VTY lines.

o If the installed IOS image permits, use Secure Shell Protocol (SSH) to access the device remotely, instead of Telnet.

Use Secure Shell Protocol (SSH) – The protocol and application provide a secure, remote connection to a router. Two versions of SSH are available, SSH Version 1 and SSH Version 2. SSH Version 1 is implemented in IOS software. It encrypts all traffic, including passwords, between a remote console and a network router across a Telnet session. Because SSH sends no traffic in clear text, network administrators can conduct remote access sessions that casual observers will not be able to view. The SSH server in IOS software will work with publicly and commercially available SSH clients.

Configure system-warning banners – For both legal and administrative purposes, configuring a system-warning banner to display prior to login is a convenient and effective way of reinforcing security and general usage policies. By clearly stating the ownership, usage, access, and protection policies prior to a login, future potential prosecution becomes more solidly backed.

Disable unneeded services – By default, Cisco devices implement multiple TCP and UDP servers to facilitate management and integration into existing environments. For most installations these services are typically not required, and disabling them can greatly reduce overall security exposure. These commands will disable the services not typically used:

o no service tcp-small-servers

o no service udp-small-servers

o no service finger

o no service config

Disable the integrated HTTP daemon if not in use – Although IOS software provides an integrated HTTP server for management, it is highly recommended that it be disabled to minimize overall exposure. If HTTP access to the switch is absolutely required, use basic ACLs to isolate access from only trusted subnets.

Configure basic logging – To assist and simplify both problem troubleshooting and security investigations, monitor switch subsystem information received from the logging facility. View the output in the on-system logging buffer memory. To render the on-system logging useful, increase the default buffer size.

Secure SNMP – Whenever possible, avoid using SNMP read-write features. SNMP v2c authentication consists of simple text strings communicated between devices in clear, unencrypted text. In most cases, a read-only community string may be configured. In doing so, apply a basic access list that allows SNMP traffic to trusted hosts only.
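The four commands listed under "Disable unneeded services" above lend themselves to an automated check of a configuration template. The Python sketch below is a minimal illustration: the function name missing_hardening_lines is hypothetical, and it assumes the commands appear verbatim in the text being checked. Depending on the software release, services that are already off by default may not be shown in the running configuration, so the check is best applied to a hardening template rather than to device output.

```python
# Commands named in the text for disabling services that are rarely needed.
REQUIRED_LINES = (
    "no service tcp-small-servers",
    "no service udp-small-servers",
    "no service finger",
    "no service config",
)


def missing_hardening_lines(config_text):
    """Return the required commands that do not appear in config_text."""
    present = {line.strip() for line in config_text.splitlines()}
    return [cmd for cmd in REQUIRED_LINES if cmd not in present]
```

An empty result means every listed command is present; anything else is the list of commands still to be added.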

Securing switch protocols

Follow these Best Practices for securing switch protocols:

CDP – CDP does not reveal security-specific information, but it is possible for an attacker to exploit this information in a reconnaissance attack, whereby an attacker gains knowledge of device and IP address information for the purpose of launching other types of attacks. Two practical guidelines should be followed for CDP.

o If CDP is not required, or the device is located in an unsecure environment, disable CDP globally on the device.

o If it is required, disable CDP on a per-interface basis on ports connected to untrusted networks. Because CDP is a link-level protocol, it is not forwarded across a network (unless a Layer 2 tunneling mechanism is in place). Limit it to run only between trusted devices, disabling it everywhere else. However, CDP is required on any access port to which a Cisco phone is attached, in order to establish a trust relationship.

Secure the spanning tree topology – It is important to protect the STP process of the switches composing the infrastructure. Inadvertent or malicious introduction of STP BPDUs could potentially overwhelm a device or pose a DoS attack. The first step in stabilizing a spanning tree installation is to positively identify the intended root bridge in the design, and to hard set the STP bridge priority of that bridge to an acceptable root value. Do the same for the designated backup root bridge. These actions will protect against inadvertent shifts in STP due to an uncontrolled introduction of a new switch.

In addition to taking these steps, on some platforms the BPDU guard feature may be available. If this feature is available for the platform, enable it on access ports in conjunction with the PortFast feature to protect the network from unwanted BPDU traffic injection. Upon receipt of a BPDU, the feature will automatically disable the port.

Mitigating compromises launched through a switch

Follow these Best Practices to mitigate compromises through a switch:

Proactively configure unused router and switch ports:

o Execute the shut command on all unused ports and interfaces.

o Place all unused ports into a "parking-lot" VLAN used specifically to group unused ports until they are proactively placed into service.

o Configure all unused ports as access ports, disallowing automatic trunk negotiation.

Considerations for trunk links – By default, Catalyst switches running IOS software are configured to automatically negotiate trunking capabilities. This situation poses a serious hazard to the infrastructure. It allows the possibility of an unsecured third party to be introduced into the infrastructure, as part of the infrastructure. Potential attacks include interception of traffic, redirection of traffic, denial of service (DoS), and more. To avoid this risk, disable automatic negotiation of trunking, and manually enable it on links that will require it. Ensure that trunks use a native VLAN dedicated ONLY to trunk links.

Physical device access – Physical access to the switch should be closely monitored to avoid rogue device placement in wiring closets with direct access to switch ports.

Access port-based security – Specific measures should be taken on every access port of any switch placed into service. A policy should be in place that outlines the configuration of unused switch ports as well as those that are in use.
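The three steps for unused ports (shut down, move to a parking-lot VLAN, force access mode without trunk negotiation) can be generated per interface rather than typed by hand. The following Python sketch is an assumption-laden illustration: the function name park_unused_port, the interface name, and VLAN 999 in the usage note are hypothetical, and the emitted lines use common Catalyst IOS interface commands whose exact syntax can vary by platform and release.

```python
def park_unused_port(interface, parking_vlan):
    """Return configuration lines that park one unused switch port."""
    if not 1 <= parking_vlan <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    return [
        f"interface {interface}",
        " switchport mode access",                 # never become a trunk
        f" switchport access vlan {parking_vlan}",  # parking-lot VLAN
        " switchport nonegotiate",                 # no trunk negotiation
        " shutdown",                               # administratively down
    ]
```

For example, park_unused_port("GigabitEthernet1/0/24", 999) returns five lines that can be reviewed and pasted into a configuration session.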

SPAN/RSPAN/VSPAN/ERSPAN

Cisco switches allow one or more ports to be configured as Switch Port Analyzer (SPAN) ports. SPAN sends a copy of frames generated on one port or an entire VLAN to another switch port hosting a network analyzer. The concept of SPAN is also referred to as Port Mirroring or Port Monitoring.

Commands Used to Facilitate Capturing Network Traffic

Various commands are used across Catalyst platforms to inform the switch which port carries the traffic of interest and to which port the network analyzer will be attached.

Monitoring Performance with RSPAN

Remote SPAN (RSPAN) is a variation of SPAN. Rather than sending traffic directly to the traffic analyzer located on the same switch as the port being monitored, RSPAN sends traffic from a monitored port through an intermediate switch network to a traffic analyzer on another switch. RSPAN supports source ports, source VLANs, and destination ports on different switches. RSPAN provides remote monitoring of ports on multiple switches across the network. The traffic for each RSPAN session is carried over a user-specified RSPAN VLAN that is dedicated for that RSPAN session in all participating switches.

RSPAN consists of an RSPAN source session, an RSPAN VLAN, and an RSPAN destination session. The RSPAN source session must be configured separately from the destination sessions given that the two are on different network devices. To configure an RSPAN source session on one network device, associate a set of source ports and VLANs with an RSPAN VLAN. To configure an RSPAN destination session on another device, you associate the destination port with the RSPAN VLAN. The intermediate switches need only have the RSPAN VLAN carried over source to destination switch links.

RSPAN Configuration Guidelines

In addition to the guidelines and restrictions that apply to SPAN, these guidelines apply to RSPAN:

• Networks impose no limit on the number of RSPAN VLANs that the networks carry.

• Intermediate switches might impose limits on the number of RSPAN VLANs that they can support, based on their capacity.

• The RSPAN VLANs must be configured in all source, intermediate, and destination network switches.

• RSPAN VLANs can be used only for RSPAN traffic.

• Access ports must not be assigned to RSPAN VLANs.

• Any ports in an RSPAN VLAN, except those selected to carry RSPAN traffic, should not be configured.

• MAC address learning is disabled on the RSPAN VLAN.

• RSPAN source ports and destination ports must be on different network devices.

• RSPAN VLANs cannot be configured as sources in VSPAN sessions.

• Any VLAN can be configured as an RSPAN VLAN.

• For RSPAN configuration, you can distribute the source ports and the destination ports across multiple switches in your network.

• A port cannot serve as an RSPAN source port or RSPAN destination port while designated as an RSPAN reflector port.

• When you configure a switch port as a reflector port, it is no longer a normal switch port; only looped-back traffic passes through the reflector port.

• RSPAN does not support BPDU packet monitoring or other Layer 2 switch protocols.

• The RSPAN VLAN is configured only on trunk ports and not on access ports. To avoid unwanted traffic in RSPAN VLANs, make sure that the VLAN remote-span feature is supported in all the participating switches. Access ports on the RSPAN VLAN are silently disabled.

• RSPAN VLANs are included as sources for port-based RSPAN sessions when source trunk ports have active RSPAN VLANs. RSPAN VLANs can also be sources in SPAN sessions.

• You can configure any VLAN as an RSPAN VLAN as long as these conditions are met:

– No access port is configured in the RSPAN VLAN.

– The same RSPAN VLAN is used for an RSPAN session in all the switches.

– All participating switches support RSPAN.

• You should create an RSPAN VLAN before configuring an RSPAN source or destination session.

• If you enable VTP and VTP pruning, RSPAN traffic is pruned in the trunks to prevent the unwanted flooding of RSPAN traffic across the network for VLAN-IDs that are lower than 1005.

• Because RSPAN traffic is carried across a network on an RSPAN VLAN, the original VLAN association of the mirrored packets is lost. Therefore, RSPAN can only support forwarding of traffic from an IDS device onto a single user-specified VLAN.

As RSPAN VLANs have special properties, you should reserve a few VLANs across your network for use as RSPAN VLANs; do not assign access ports to these VLANs.
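The three conditions for an RSPAN VLAN (no access port in the VLAN, the same VLAN on every switch, and RSPAN support everywhere) can be checked against an inventory before a session is configured. The Python sketch below is illustrative: the SwitchView record, its fields, and the function rspan_vlan_problems are assumptions made for this example and do not reflect any management API.

```python
from dataclasses import dataclass, field


@dataclass
class SwitchView:
    """Hypothetical inventory record for one participating switch."""
    name: str
    supports_rspan: bool
    rspan_vlan: int
    access_port_vlans: dict = field(default_factory=dict)  # port -> VLAN ID


def rspan_vlan_problems(switches):
    """Return human-readable violations of the RSPAN VLAN conditions."""
    problems = []
    vlan_ids = sorted({s.rspan_vlan for s in switches})
    if len(vlan_ids) > 1:
        problems.append(f"RSPAN VLAN differs between switches: {vlan_ids}")
    for s in switches:
        if not s.supports_rspan:
            problems.append(f"{s.name} does not support RSPAN")
        for port, vlan in s.access_port_vlans.items():
            if vlan == s.rspan_vlan:
                problems.append(f"{s.name} {port} is an access port in the RSPAN VLAN")
    return problems
```

An empty list means the inventory satisfies all three conditions; each returned string names one switch or port that has to be fixed first.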

You can apply an output access control list (ACL) to RSPAN traffic to selectively filter or monitor specific packets. Specify these ACLs on the RSPAN VLAN in the RSPAN source switches.

Only traffic that enters or leaves source ports or traffic that enters source VLANs can be monitored by using SPAN; traffic that gets routed to ingress source ports or source VLANs cannot be monitored. For example, if incoming traffic is being monitored, traffic that gets routed from another VLAN to the source VLAN is not monitored; however, traffic that is received on the source VLAN and routed to another VLAN is monitored.

RSPAN extends SPAN by enabling remote monitoring of multiple switches across your network. The traffic for each RSPAN session is carried over a user-specified RSPAN VLAN that is dedicated for that RSPAN session in all participating switches. The SPAN traffic from the sources is copied onto the RSPAN VLAN through a reflector port and then forwarded over trunk ports that are carrying the RSPAN VLAN to any RSPAN destination sessions monitoring the RSPAN VLAN.

SPAN and RSPAN do not affect the switching of network traffic on source ports or source VLANs; a copy of the packets received or sent by the source interfaces is sent to the destination interface. You can use the SPAN or RSPAN destination port to inject traffic from a network security device. For example, if you connect a Cisco Intrusion Detection System (IDS) Sensor Appliance to a destination port, the IDS device can send TCP Reset packets to close down the TCP session of a suspected attacker.

SPAN Session

A local SPAN session is an association of a destination port with source ports and source VLANs. An RSPAN session is an association of source ports and source VLANs across your network with an RSPAN VLAN. The destination is the RSPAN VLAN.

You configure SPAN sessions by using parameters that specify the source of network traffic to monitor.

Traffic monitoring in a SPAN session has these restrictions:

• You can monitor incoming traffic on a series or range of ports and VLANs.

• You can monitor outgoing traffic on a single port; you cannot monitor outgoing traffic on multiple ports.

• You cannot monitor outgoing traffic on VLANs.
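The three restrictions above can be expressed as a validation step that runs before a session is configured. This Python sketch is an illustration only: the function span_session_errors and its parameter names are hypothetical, and the interface names in the usage note are placeholders.

```python
def span_session_errors(rx_ports=(), rx_vlans=(), tx_ports=(), tx_vlans=()):
    """Return the restrictions that a proposed SPAN session would violate."""
    errors = []
    # Any number of ingress ports and VLANs is acceptable, so rx_ports and
    # rx_vlans are not checked.
    if len(tx_ports) > 1:
        errors.append("outgoing traffic can be monitored on a single port only")
    if tx_vlans:
        errors.append("outgoing traffic on VLANs cannot be monitored")
    return errors
```

A session that monitors ingress on several ports and one VLAN plus egress on a single port, for example span_session_errors(rx_ports=["Gi0/1", "Gi0/2"], rx_vlans=[10], tx_ports=["Gi0/3"]), passes with an empty list.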

You can configure two separate SPAN or RSPAN sessions with separate or overlapping sets of SPAN source ports and VLANs. Both switched and routed ports can be configured as SPAN sources and destinations.

SPAN sessions do not interfere with the normal operation of the switch. However, an oversubscribed SPAN destination, for example, a 10-Mbps port monitoring a 100-Mbps port, results in dropped or lost packets.

You can configure SPAN sessions on disabled ports; however, a SPAN session does not become active unless you enable the destination port and at least one source port or VLAN for that session. A SPAN session remains inactive after system power-on until the destination port is operational.

Traffic Types

SPAN sessions include these traffic types:

• Receive (Rx) SPAN—the goal of receive (or ingress) SPAN is to monitor as much as possible all the packets received by the source interface or VLAN before any modification or processing is performed by the switch. A copy of each packet received by the source is sent to the destination port for that SPAN session. You can monitor a series or range of ingress ports or VLANs in a SPAN session.

On tagged packets (Inter-Switch Link [ISL] or IEEE 802.1Q), the tagging is removed at the ingress port. At the destination port, if tagging is enabled, the packets appear with the ISL or IEEE 802.1Q headers. If no tagging is specified, packets appear in the native format.

Packets that are modified because of routing are copied without modification for Rx SPAN; that is, the original packet is copied. Packets that are modified because of quality of service (QoS)—for example, modified Differentiated Services Code Point (DSCP)—are copied with modification for Rx SPAN.

Some features that can cause a packet to be dropped during receive processing have no effect on SPAN; the destination port receives a copy of the packet even if the actual incoming packet is dropped. These features include IP standard and extended input access control lists (ACLs), IP standard and extended output ACLs for unicast, VLAN maps, ingress QoS policing, and policy-based routing. Switch congestion that causes packets to be dropped also has no effect on SPAN.

• Transmit (Tx) SPAN—The goal of transmit (or egress) SPAN is to monitor as much as possible all the packets sent by the source interface after all modification and processing is performed by the switch. A copy of each packet sent by the source is sent to the destination port for that SPAN session. The copy is provided after the packet is modified. Only one egress source port is allowed per SPAN session. VLAN monitoring is not supported in the egress direction.

Packets that are modified because of routing—for example, with a time-to-live (TTL) or MAC-address modification—are duplicated at the destination port. On packets that are modified because of QoS, the modified packet might not have the same DSCP (IP packet) or CoS (non-IP packet) as the SPAN source.

Some features that can cause a packet to be dropped during transmit processing might also affect the duplicated copy for SPAN. These features include VLAN maps, IP standard and extended output ACLs on multicast packets, and egress QoS policing. In the case of output ACLs, if the SPAN source drops the packet, the SPAN destination would also drop the packet. In the case of egress QoS policing, if the SPAN source drops the packet, the SPAN destination might not drop it. If the source port is oversubscribed, the destination ports will have different dropping behavior.

• Both—In a SPAN session, you can monitor a single port for both received and sent packets.

Source Port

A source port (also called a monitored port) is a switched or routed port that you monitor for network traffic analysis. In a single local SPAN session or RSPAN source session, you can monitor source port traffic such as received (Rx), transmitted (Tx), or bidirectional (both); however, on a VLAN, you can monitor only received traffic. The switch supports any number of source ports (up to the maximum number of available ports on the switch) and any number of source ingress VLANs (up to the maximum number of VLANs supported).

A source port has these characteristics:

• It can be any port type (for example, EtherChannel, Fast Ethernet, Gigabit Ethernet, and so forth).• It can be monitored in multiple SPAN sessions.• It cannot be a destination port.• Each source port can be configured with a direction (ingress, egress, or both) to monitor. ForEtherChannel sources, the monitored direction would apply to all the physical ports in the group.• Source ports can be in the same or different VLANs.• For VLAN SPAN sources, all active ports in the source VLAN are included as source ports.

You can configure a trunk port as a source port. By default, all VLANs active on the trunk are monitored. You can limit SPAN traffic monitoring on trunk source ports to specific VLANs by using

VLAN filtering. Only switched traffic in the selected VLANs is sent to the destination port. This feature affects only traffic forwarded to the destination SPAN port and does not affect the switching of normal traffic. This feature is not allowed in sessions with VLAN sources.

Destination Port

Each local SPAN session or RSPAN destination session must have a destination port (also called a monitoring port) that receives a copy of traffic from the source ports and VLANs.

The destination port has these characteristics:

• It must reside on the same switch as the source port (for a local SPAN session).
• It can be any Ethernet physical port.
• It can participate in only one SPAN session at a time (a destination port in one SPAN session cannot be a destination port for a second SPAN session).
• It cannot be a source port or a reflector port.
• It cannot be an EtherChannel group or a VLAN.
• It can be a physical port that is assigned to an EtherChannel group, even if the EtherChannel group has been specified as a SPAN source. The port is removed from the group while it is configured as a SPAN destination port.
• The port does not transmit any traffic except that required for the SPAN session.
• If ingress traffic forwarding is enabled for a network security device, the destination port forwards traffic at Layer 2.
• It does not participate in spanning tree while the SPAN session is active.
• When it is a destination port, it does not participate in any of the Layer 2 protocols—Cisco Discovery Protocol (CDP), VLAN Trunk Protocol (VTP), Dynamic Trunking Protocol (DTP), Spanning Tree Protocol (STP), Port Aggregation Protocol (PAgP), and Link Aggregation Control Protocol (LACP).
• A destination port that belongs to a source VLAN of any SPAN session is excluded from the source list and is not monitored.
• No address learning occurs on the destination port.

Reflector Port

The reflector port is the mechanism that copies packets onto an RSPAN VLAN. The reflector port forwards only the traffic from the RSPAN source session with which it is affiliated. Any device connected to a port set as a reflector port loses connectivity until the RSPAN source session is disabled.

The reflector port has these characteristics:

• It is a port set to loopback.
• It cannot be an EtherChannel group, it does not trunk, and it cannot do protocol filtering.
• It can be a physical port that is assigned to an EtherChannel group, even if the EtherChannel group is specified as a SPAN source. The port is removed from the group while it is configured as a reflector port.
• A port used as a reflector port cannot be a SPAN source or destination port, nor can a port be a reflector port for more than one session at a time.
• It is invisible to all VLANs.
• The native VLAN for looped-back traffic on a reflector port is the RSPAN VLAN.
• The reflector port loops back untagged traffic to the switch. The traffic is then placed on the RSPAN VLAN and flooded to any trunk ports that carry the RSPAN VLAN.
• Spanning tree is automatically disabled on a reflector port.
• The reflector port is not used when (R/V/ER)SPAN is enabled on router routed ports.

If the bandwidth of the reflector port is not sufficient for the traffic volume from the corresponding source ports and VLANs, the excess packets are dropped. A 10/100 port reflects at 100 Mbps. A Gigabit port reflects at 1 Gbps.

VSPAN

VLAN-based SPAN (VSPAN) is the monitoring of the network traffic in one or more VLANs. You can configure VSPAN to monitor only received (Rx) traffic, which applies to all the ports for that VLAN.

Use these guidelines for VSPAN sessions:

• Only traffic on the monitored VLAN is sent to the destination port.
• If a destination port belongs to a source VLAN, it is excluded from the source list and is not monitored.
• If ports are added to or removed from the source VLANs, the traffic on the source VLAN received by those ports is added to or removed from the sources being monitored.
• VLAN pruning and the VLAN allowed list have no effect on SPAN monitoring.
• VSPAN only monitors traffic that enters the switch, not traffic that is routed between VLANs. For example, if a VLAN is being Rx-monitored and the multilayer switch routes traffic from another VLAN to the monitored VLAN, that traffic is not monitored and is not received on the SPAN destination port.
• You cannot use filter VLANs in the same session with VLAN sources.
• You can monitor only Ethernet VLANs.

SPAN Traffic

You can use local SPAN to monitor all network traffic, including multicast and bridge protocol data unit (BPDU) packets, and CDP, VTP, DTP, STP, PAgP, and LACP packets. You cannot use RSPAN to monitor Layer 2 protocols.

In some SPAN configurations, multiple copies of the same source packet are sent to the SPAN destination port. For example, a bidirectional (both Rx and Tx) SPAN session is configured for the sources a1 Rx monitor and the a2 Rx and Tx monitor to destination port d1. If a packet enters the switch through a1 and is switched to a2, both incoming and outgoing packets are sent to destination port d1. Both packets are the same (unless a Layer 3 rewrite occurs, in which case the packets are different because of the added Layer 3 information).

Monitored Traffic Direction

You can configure local SPAN sessions, RSPAN source sessions, and ERSPAN source sessions to monitor ingress traffic (called ingress SPAN), or to monitor egress traffic (called egress SPAN), or to monitor traffic flowing in both directions.

Ingress SPAN copies traffic received by the source ports and VLANs for analysis at the destination port.

Egress SPAN copies traffic transmitted from the source ports and VLANs. When you enter the both keyword, SPAN copies the traffic received and transmitted by the source ports and VLANs to the destination port.

Monitored Traffic

By default, local SPAN and ERSPAN monitor all traffic, including multicast and bridge protocol data unit (BPDU) frames. RSPAN does not support BPDU monitoring.

Duplicate Traffic

In some configurations, SPAN sends multiple copies of the same source traffic to the destination port. For example, in a configuration with a bidirectional SPAN session (both ingress and egress) for two SPAN sources, called s1 and s2, to a SPAN destination port, called d1, if a packet enters the router through s1 and is sent for egress from the switch to s2, ingress SPAN at s1 sends a copy of the packet to SPAN destination d1 and egress SPAN at s2 sends a copy of the packet to SPAN destination d1. If the packet was Layer 2 switched from s1 to s2, both SPAN packets would be the same. If the packet was Layer 3 switched from s1 to s2, the Layer 3 rewrite would alter the source and destination Layer 2 addresses, in which case the SPAN packets would be different.
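The duplicate-copy behavior described above can be illustrated with a small model. This is only a sketch, not switch code: the `Frame` class, the `span_copies` function, and the MAC address strings are hypothetical names invented for the illustration. It assumes a Layer 3 rewrite changes only the source and destination Layer 2 addresses, as the paragraph states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Minimal stand-in for an Ethernet frame (hypothetical model)."""
    src_mac: str
    dst_mac: str
    payload: str


def span_copies(frame, ingress_port, egress_port, monitored, routed=False,
                router_mac="router-mac", next_hop_mac="next-hop-mac"):
    """Return the copies a SPAN destination would receive for one frame.

    monitored maps a source port name to the set of monitored
    directions ("rx", "tx") for that port.
    """
    copies = []
    # Ingress SPAN copies the frame as it was received.
    if "rx" in monitored.get(ingress_port, set()):
        copies.append(frame)
    # A Layer 3 switched frame leaves with rewritten Layer 2 addresses.
    outgoing = Frame(router_mac, next_hop_mac, frame.payload) if routed else frame
    # Egress SPAN copies the frame as it is transmitted.
    if "tx" in monitored.get(egress_port, set()):
        copies.append(outgoing)
    return copies


# Bidirectional monitoring on both sources, as in the s1/s2 example.
monitored = {"s1": {"rx", "tx"}, "s2": {"rx", "tx"}}
frame = Frame("host-a-mac", "host-b-mac", "data")

l2_copies = span_copies(frame, "s1", "s2", monitored, routed=False)
l3_copies = span_copies(frame, "s1", "s2", monitored, routed=True)
```

In this model both calls yield two copies at d1; the two copies are identical in the Layer 2 switched case and differ only in their MAC addresses in the Layer 3 switched case.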

SPAN and RSPAN Interaction with Other Features

SPAN interacts with these features:

• Routing—Ingress SPAN does not monitor routed traffic. VSPAN only monitors traffic that enters the switch, not traffic that is routed between VLANs. For example, if a VLAN is being Rx-monitored and the multilayer switch routes traffic from another VLAN to the monitored VLAN, that traffic is not monitored and not received on the SPAN destination port.

• Spanning Tree Protocol (STP)—A destination port or a reflector port does not participate in STP while its SPAN or RSPAN session is active. The destination or reflector port can participate in STP after the SPAN or RSPAN session is disabled. On a source port, SPAN does not affect the STP status. STP can be active on trunk ports carrying an RSPAN VLAN.

• Cisco Discovery Protocol (CDP)—A SPAN destination port does not participate in CDP while the SPAN session is active. After the SPAN session is disabled, the port again participates in CDP.

• VLAN Trunking Protocol (VTP)—You can use VTP to prune an RSPAN VLAN between switches.

• VLAN and trunking—You can modify VLAN membership or trunk settings for source, destination, or reflector ports at any time. However, changes in VLAN membership or trunk settings for a destination or reflector port do not take effect until you disable the SPAN or RSPAN session. Changes in VLAN membership or trunk settings for a source port immediately take effect, and the respective SPAN sessions automatically adjust accordingly.

• EtherChannel—You can configure an EtherChannel group as a source port but not as a SPAN destination port. When a group is configured as a SPAN source, the entire group is monitored. If a port is added to a monitored EtherChannel group, the new port is added to the SPAN source port list. If a port is removed from a monitored EtherChannel group, it is automatically removed from the source port list. If the port is the only port in the EtherChannel group, the EtherChannel group is removed from SPAN.

If a physical port that belongs to an EtherChannel group is configured as a SPAN source, destination, or reflector port, it is removed from the group. After the port is removed from the SPAN session, it rejoins the EtherChannel group. Ports removed from an EtherChannel group remain members of the group, but they are in the down or standalone state.

If a physical port that belongs to an EtherChannel group is a destination or reflector port and the EtherChannel group is a source, the port is removed from the EtherChannel group and from the list of monitored ports.

• QoS—For ingress monitoring, the packets sent to the SPAN destination port might be different from the packets actually received at the SPAN source port because the packets are forwarded after ingress QoS classification and policing. The packet DSCP might not be the same as the received packet.

For egress monitoring, the packets sent out the SPAN destination port might not be the same as the packets sent out of SPAN source ports because the egress QoS policing at the SPAN source port might change the packet classification. QoS policing is not applied at SPAN destination ports.

• Multicast traffic can be monitored. For egress and ingress port monitoring, only a single unedited packet is sent to the SPAN destination port. It does not reflect the number of times the multicast packet is sent.

• Port security—A secure port cannot be a SPAN destination port.

For SPAN sessions, do not enable port security on ports that are egress monitored when ingress forwarding is enabled on the destination port. For RSPAN source sessions, do not enable port security on any ports that are egress monitored.

• 802.1x—You can enable 802.1x on a port that is a SPAN destination or reflector port; however, 802.1x is disabled until the port is removed as a SPAN destination or reflector port. You can enable 802.1x on a SPAN source port.

For SPAN sessions, do not enable 802.1x on ports that are egress monitored when ingress forwarding is enabled on the destination port. For RSPAN source sessions, do not enable 802.1x on any ports that are egress monitored.

ERSPAN Overview

ERSPAN (Encapsulated RSPAN) supports source ports, source VLANs, and destination ports on different routers, which provides remote monitoring of multiple routers across your network. ERSPAN consists of an ERSPAN source session, routable ERSPAN GRE-encapsulated traffic, and an ERSPAN destination session. You separately configure ERSPAN source sessions and destination sessions on different routers. Instead of using Layer 2 trunks on intermediate switches as in RSPAN, GRE encapsulation is used.

To configure an ERSPAN source session on one router, you associate a set of source ports or VLANs with a destination IP address, ERSPAN ID number, and optionally with a VRF name. To configure an ERSPAN destination session on another router, you associate the destination ports with the source IP address, ERSPAN ID number, and optionally with a VRF name. ERSPAN source sessions do not copy locally sourced RSPAN VLAN traffic from source trunk ports that carry RSPAN VLANs. ERSPAN source sessions do not copy locally sourced ERSPAN GRE-encapsulated traffic from source ports.

Each ERSPAN source session can have either ports or VLANs as sources, but not both. The ERSPAN source session copies traffic from the source ports or source VLANs and forwards the traffic using routable GRE-encapsulated packets to the ERSPAN destination session. The ERSPAN destination session switches the traffic to the destination ports.

Feature Incompatibilities

These feature incompatibilities exist with local SPAN, RSPAN, and ERSPAN:

• With a PFC3, EoMPLS ports cannot be SPAN sources. (CSCed51245)

• A port-channel interface (an EtherChannel) can be a SPAN source, but you cannot configure active member ports of an EtherChannel as SPAN source ports. Inactive member ports of an EtherChannel can be configured as SPAN sources but they are put into the suspended state and carry no traffic.
• A port-channel interface (an EtherChannel) cannot be a SPAN destination.
• You cannot configure active member ports of an EtherChannel as SPAN destination ports. Inactive member ports of an EtherChannel can be configured as SPAN destination ports but they are put into the suspended state and carry no traffic.
• Because SPAN destination ports drop ingress traffic, these features are incompatible with SPAN destination ports:
– Private VLANs
– IEEE 802.1X port-based authentication
– Port security
– Spanning tree protocol (STP) and related features (PortFast, PortFast BPDU Filtering, BPDU Guard, UplinkFast, BackboneFast, EtherChannel Guard, Root Guard, Loop Guard)
– VLAN trunk protocol (VTP)
– Dynamic trunking protocol (DTP)
– IEEE 802.1Q tunneling

Note SPAN destination ports can participate in IEEE 802.3Z Flow Control.

Local SPAN, RSPAN, and ERSPAN Guidelines and Restrictions

These guidelines and restrictions apply to local SPAN, RSPAN, and ERSPAN:

• A SPAN destination port that is copying traffic from a single egress SPAN source port sends only egress traffic to the network analyzer. However, in Release 12.2(18)SXE and later releases, if you configure more than one egress SPAN source port, the traffic that is sent to the network analyzer also includes these types of ingress traffic that were received from the egress SPAN source ports:

– Any unicast traffic that is flooded on the VLAN
– Broadcast and multicast traffic

This situation occurs because an egress SPAN source port receives these types of traffic from the VLAN but then recognizes itself as the source of the traffic and drops it instead of sending it back to the source from which it was received. Before the traffic is dropped, SPAN copies the traffic and sends it to the SPAN destination port. (CSCds22021)

• Entering additional monitor session commands does not clear previously configured SPAN parameters. You must enter the no monitor session command to clear configured SPAN parameters.
• Connect a network analyzer to the SPAN destination ports.
• All the SPAN destination ports receive all of the traffic from all the SPAN sources.

Note With Release 12.2(18)SXD and later releases, you can configure destination trunk port VLAN filtering using allowed VLAN lists.

With Release 12.2(18)SXE and later releases, for local SPAN and RSPAN, you can configure Source VLAN Filtering.

• You can configure both Layer 2 LAN ports (LAN ports configured with the switchport command) and Layer 3 LAN ports (LAN ports not configured with the switchport command) as sources or destinations.
• You cannot mix individual source ports and source VLANs within a single session.
• If you specify multiple ingress source ports, the ports can belong to different VLANs.
• You cannot mix source VLANs and filter VLANs within a session. You can have source VLANs or filter VLANs, but not both at the same time.
• When enabled, local SPAN, RSPAN, and ERSPAN use any previously entered configuration.
• When you specify sources and do not specify a traffic direction (ingress, egress, or both), “both” is used by default.
• SPAN copies Layer 2 Ethernet frames, but SPAN does not copy source trunk port ISL or 802.1Q tags. You can configure destination ports as trunks to send locally tagged traffic to the traffic analyzer.

Note A destination port configured as a trunk tags traffic from a Layer 3 LAN source port with the internal VLAN used by the Layer 3 LAN port.

• Local SPAN sessions, RSPAN source sessions, and ERSPAN source sessions do not copy locally sourced RSPAN VLAN traffic from source trunk ports that carry RSPAN VLANs.
• Local SPAN sessions, RSPAN source sessions, and ERSPAN source sessions do not copy locally sourced ERSPAN GRE-encapsulated traffic from source ports.
• A port specified as a destination port in one SPAN session cannot be a destination port for another SPAN session.
• A port configured as a destination port cannot be configured as a source port.
• Destination ports never participate in any spanning tree instance. Local SPAN includes BPDUs in the monitored traffic, so any BPDUs seen on the destination port are from the source port. RSPAN does not support BPDU monitoring.
• All packets sent through the router for transmission from a port configured as an egress source are copied to the destination port, including packets that do not exit the router through the port because STP has put the port into the blocking state, or on a trunk port because STP has put the VLAN into the blocking state on the trunk port.
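Several of the guidelines above are simple consistency rules, so they can be checked before a session is ever entered on a switch. The following is a minimal sketch of such a pre-check. The function name `validate_session` and its parameters are hypothetical and are not part of any Cisco tool; only the rules themselves come from the guidelines.

```python
def validate_session(source_ports=(), source_vlans=(), filter_vlans=(),
                     destination_ports=(), direction=None):
    """Check a planned SPAN session against a few of the stated guidelines.

    Returns (effective_direction, errors). Hypothetical helper for
    illustration only.
    """
    errors = []
    # Individual source ports and source VLANs cannot share a session.
    if source_ports and source_vlans:
        errors.append("cannot mix source ports and source VLANs")
    # Source VLANs and filter VLANs cannot share a session.
    if source_vlans and filter_vlans:
        errors.append("cannot mix source VLANs and filter VLANs")
    # A destination port cannot also be configured as a source port.
    overlap = sorted(set(source_ports) & set(destination_ports))
    if overlap:
        errors.append("destination port is also a source: " + ", ".join(overlap))
    # With no direction specified, "both" is used by default.
    effective_direction = direction or "both"
    return effective_direction, errors


ok_direction, ok_errors = validate_session(
    source_ports=["gi1/1", "gi1/2"], destination_ports=["gi1/3"])
bad_direction, bad_errors = validate_session(
    source_ports=["gi1/1"], source_vlans=[10], filter_vlans=[20],
    destination_ports=["gi1/1"], direction="ingress")
```

The check covers only the rules that can be evaluated from the session parameters alone; platform and release restrictions still have to be verified against the target switch.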

VSPAN Guidelines and Restrictions

Note Local SPAN, RSPAN, and ERSPAN all support VSPAN.

These are VSPAN guidelines and restrictions:

• For VSPAN sessions with both ingress and egress configured, two packets are forwarded from the destination port if the packets get switched on the same VLAN (one as ingress traffic from the ingress port and one as egress traffic from the egress port).

• VSPAN only monitors traffic that leaves or enters Layer 2 ports in the VLAN.

– If you configure a VLAN as an ingress source and traffic gets routed into the monitored VLAN, the routed traffic is not monitored because it never appears as ingress traffic entering a Layer 2 port in the VLAN.

– If you configure a VLAN as an egress source and traffic gets routed out of the monitored VLAN, the routed traffic is not monitored because it never appears as egress traffic leaving a Layer 2 port in the VLAN.

RSPAN Guidelines and Restrictions

These are RSPAN guidelines and restrictions:

• Supervisor Engine 2 does not support RSPAN if you configure an egress SPAN source for a local SPAN session.
• Supervisor Engine 2 does not support egress SPAN sources for local SPAN if you configure RSPAN.
• All participating routers must be trunk-connected at Layer 2.
• Any network device that supports RSPAN VLANs can be an RSPAN intermediate device.
• Networks impose no limit on the number of RSPAN VLANs that the networks carry.
• Intermediate network devices might impose limits on the number of RSPAN VLANs that they can support.
• You must configure the RSPAN VLANs in all source, intermediate, and destination network devices. If enabled, the VLAN Trunking Protocol (VTP) can propagate configuration of VLANs numbered 1 through 1024 as RSPAN VLANs. You must manually configure VLANs numbered higher than 1024 as RSPAN VLANs on all source, intermediate, and destination network devices.
• If you enable VTP and VTP pruning, RSPAN traffic is pruned in the trunks to prevent the unwanted flooding of RSPAN traffic across the network.
• RSPAN VLANs can be used only for RSPAN traffic.
• Do not configure a VLAN used to carry management traffic as an RSPAN VLAN.
• Do not assign access ports to RSPAN VLANs. RSPAN puts access ports in an RSPAN VLAN into the suspended state.
• Do not configure any ports in an RSPAN VLAN except trunk ports selected to carry RSPAN traffic.
• MAC address learning is disabled in the RSPAN VLAN.
• You can use output access control lists (ACLs) on the RSPAN VLAN in the RSPAN source router to filter the traffic sent to an RSPAN destination.
• RSPAN does not support BPDU monitoring.
• Do not configure RSPAN VLANs as sources in VSPAN sessions.
• You can configure any VLAN as an RSPAN VLAN as long as all participating network devices support configuration of RSPAN VLANs and you use the same RSPAN VLAN for each RSPAN session in all participating network devices.

ERSPAN Guidelines and Restrictions

These are ERSPAN guidelines and restrictions:

• ERSPAN is supported only when the router is operating in the compact switching mode: all modules must be fabric-enabled.
• Release 12.2(18)SXE and later releases support ERSPAN.
• All versions of WS-SUP720-3B (Supervisor Engine 720 with PFC3B) and WS-SUP720-3BXL (Supervisor Engine 720 with PFC3BXL) support ERSPAN.
• WS-SUP720 (Supervisor Engine 720 with PFC3A), hardware version 3.2 or higher, supports ERSPAN. Enter the show module version | include WS-SUP720-BASE command to display the hardware version. For example:

Router# show module version | include WS-SUP720-BASE
7 2 WS-SUP720-BASE SAD075301SZ Hw :3.2

• Supervisor Engine 2 does not support ERSPAN.
• For ERSPAN packets, the “protocol type” field value in the GRE header is 0x88BE.
• The payload of a Layer 3 ERSPAN packet is a copied Layer 2 Ethernet frame, excluding any ISL or 802.1Q tags.
• ERSPAN adds a 50-byte header to each copied Layer 2 Ethernet frame and replaces the 4-byte cyclic redundancy check (CRC) trailer.
• ERSPAN supports jumbo frames that contain Layer 3 packets of up to 9,202 bytes. If the length of the copied Layer 2 Ethernet frame is greater than 9,170 bytes (9,152-byte Layer 3 packet), ERSPAN truncates the copied Layer 2 Ethernet frame to create a 9,202-byte ERSPAN Layer 3 packet.
• Regardless of any configured MTU size, ERSPAN creates Layer 3 packets that can be as long as 9,202 bytes. ERSPAN traffic might be dropped by any interface in the network that enforces an MTU size smaller than 9,202 bytes.
• With the default MTU size (1,500 bytes), if the length of the copied Layer 2 Ethernet frame is greater than 1,468 bytes (1,450-byte Layer 3 packet), the ERSPAN traffic is dropped by any interface in the network that enforces the 1,500-byte MTU size.

Note The mtu interface command and the system jumbomtu command set the maximum Layer 3 packet size (default is 1,500 bytes, maximum is 9,216 bytes).

• All participating routers must be connected at Layer 3 and the network path must support the size of the ERSPAN traffic.
• ERSPAN does not support packet fragmentation. The “do not fragment” bit is set in the IP header of ERSPAN packets. ERSPAN destination sessions cannot reassemble fragmented ERSPAN packets.
• ERSPAN traffic is subject to the traffic load conditions of the network. You can set the ERSPAN packet IP precedence or DSCP value to prioritize ERSPAN traffic for QoS.
• The only supported destination for ERSPAN traffic is an ERSPAN destination session on a Supervisor Engine 720.
• All ERSPAN source sessions on a router must use the same origin IP address, configured with the origin ip address command.
• All ERSPAN destination sessions on a switch must use the same IP address on the same destination interface. You enter the destination interface IP address with the ip address command.
• The ERSPAN source session’s destination IP address, which must be configured on an interface on the destination router, is the source of traffic that an ERSPAN destination session sends to the destination ports. You configure the same address in both the source and destination sessions with the ip address command.
• The ERSPAN ID differentiates the ERSPAN traffic arriving at the same destination IP address from different ERSPAN source sessions.
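The size figures in the guidelines above can be worked through numerically. The sketch below is an illustration only, and the 32-byte difference it uses is inferred from the two number pairs given in the text (1,468 to 1,500 and 9,170 to 9,202) rather than stated directly; how the 50-byte header and the replaced 4-byte CRC combine to produce it is not spelled out here. The constant and function names are hypothetical.

```python
# Inferred from the text: 1,500 - 1,468 = 32 and 9,202 - 9,170 = 32.
ERSPAN_L3_GROWTH = 32
# Largest copied Layer 2 frame that is carried without truncation.
MAX_COPIED_FRAME = 9170


def erspan_l3_length(frame_len):
    """Length of the ERSPAN Layer 3 packet for a copied Layer 2 frame.

    Frames longer than 9,170 bytes are truncated, so the result never
    exceeds 9,202 bytes.
    """
    return min(frame_len, MAX_COPIED_FRAME) + ERSPAN_L3_GROWTH


def survives_path(frame_len, path_mtu):
    """True if the ERSPAN packet fits the smallest MTU enforced on the path.

    ERSPAN packets are never fragmented, so an oversized packet is dropped.
    """
    return erspan_l3_length(frame_len) <= path_mtu


largest_on_default_mtu = erspan_l3_length(1468)   # 1500
truncated_jumbo = erspan_l3_length(12000)         # 9202
```

With a 1,500-byte path MTU, `survives_path(1468, 1500)` is true and `survives_path(1469, 1500)` is false, which matches the 1,468-byte limit quoted above.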

ERSPAN Configuration

ERSPAN is configured slightly differently from (R/V)SPAN. Separate source and destination sessions are configured: the source session on the source router and the destination session on the destination router.

Source session:

Destination session:

When configuring monitor sessions, note the following information:

• session_description can be up to 240 characters and cannot contain special characters or spaces.

Note You can enter 240 characters after the description command.

• ERSPAN_source_span_session_number can range from 1 to 66.
• single_interface is interface type slot/port; type is ethernet, fastethernet, gigabitethernet, or tengigabitethernet.
• interface_list is single_interface , single_interface , single_interface ...

Note In lists, you must enter a space before and after the comma. In ranges, you must enter a space before and after the dash.

• interface_range is interface type slot/first_port - last_port.
• mixed_interface_list is, in any order, single_interface , interface_range , ...
• single_vlan is the ID number of a single VLAN.
• vlan_list is single_vlan , single_vlan , single_vlan ...
• vlan_range is first_vlan_ID - last_vlan_ID.
• mixed_vlan_list is, in any order, single_vlan , vlan_range , ...
• ERSPAN_flow_id can range from 1 to 1023.
• All ERSPAN source sessions on a switch must use the same source IP address. Enter the origin ip address ip_address force command to change the origin IP address configured in all ERSPAN source sessions on the router.
• ttl_value can range from 1 to 255.
• ipp_value can range from 0 to 7.
• dscp_value can range from 0 to 63.
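The numeric ranges and the list and range spacing rules above lend themselves to a quick sanity check before values are typed into a session. The following is a rough sketch under stated assumptions: the dictionary keys and the helper names (`out_of_range`, `format_list`, `format_range`) are hypothetical and only mirror the placeholders listed above; they are not switch commands.

```python
# Allowed ranges taken from the parameter list above (inclusive bounds).
RANGES = {
    "session_number": (1, 66),
    "erspan_flow_id": (1, 1023),
    "ttl_value": (1, 255),
    "ipp_value": (0, 7),
    "dscp_value": (0, 63),
}


def out_of_range(**params):
    """Return the names of supplied parameters that fall outside their range."""
    bad = []
    for name, value in params.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            bad.append(name)
    return bad


def format_list(items):
    """Join list entries with a space before and after each comma."""
    return " , ".join(str(item) for item in items)


def format_range(first, last):
    """Write a range with a space before and after the dash."""
    return f"{first} - {last}"


vlan_text = format_list([10, format_range(20, 25), 30])
problems = out_of_range(session_number=67, erspan_flow_id=101, dscp_value=64)
```

Here `vlan_text` is a mixed_vlan_list in the spacing the notes require, and `problems` names the two values that fall outside their ranges.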

8. Configuring switches for voice and video

Voice traffic on a Cisco infrastructure

Cisco’s converged end-to-end network solution offers the strengths of the Cisco data networking components, such as routers, switches, and firewalls, which have infrastructure security and reliability as a foundation. An IP Telephony solution can then be implemented over that network.

The power of this approach is that each new application such as video, Web, or telephony represents just another media type over the same infrastructure medium rather than creating a different communication medium for each media type. Intelligent devices are automatically given rights and priorities and the applications themselves can intelligently communicate with the infrastructure to meet the constantly changing needs of the system as specified by the organization. As the figure indicates, this unity of infrastructure and applications is what distinguishes Cisco IP Telephony solutions from those of its competitors.

Benefits of IP Telephony on a Cisco Infrastructure

Cisco IP phones are able to use the Ethernet switches in the network as the "voice call switch matrix." Calls are managed differently, and the inherent time slot and bandwidth limitations of traditional TDM architectures are removed. Switching of a call is done only between the devices required to switch the call: the IP phones, voice gateways and Ethernet switches. Calls do not have to be routed back to a traditional TDM switching matrix to complete the call.

Cisco IP phones are also able to receive call-processing capability directly from the Cisco IOS® Software running on the access router for remote or small office locations. The tight integration with the IP network infrastructure provides customers with the flexibility to design their IP networks to meet their individual voice and data needs.

Beyond network efficiency and scalability, the tight integration of IP telephony and Cisco infrastructure also delivers other benefits including:

• Speedier, lower-cost moves, adds, and changes
• Automatically updated E911 System
• Quicker deployment of quality of service (QoS) settings
• Security common to all network devices
• Built-in resiliency
• Power over Ethernet and intelligent power management to reduce power costs
• New planning and management tools to ensure voice quality
• A full range of IP Communications solutions
• Revenue-generating and productivity-enhancing Extensible Markup Language (XML) applications

Voice VLAN

Some Cisco Catalyst switches offer a "voice VLAN" feature. The voice VLAN, also known as an auxiliary VLAN, provides automatic VLAN association for IP phones. By associating the phones, and therefore the phone traffic, with a specific VLAN, the phone traffic will be on different IP subnets even though voice and data co-exist on the same physical infrastructure.

When a phone is connected to the switch, the switch sends necessary voice VLAN information to the IP phone, placing it into the voice VLAN without end-user intervention. Placing phone traffic onto a distinct VLAN allows the phone traffic to be segmented from the data traffic; this facilitates better network management and troubleshooting. Additionally, QoS or security policies can be enforced specifically for the traffic traversing the phone VLANs without affecting the data traffic. If the phone is moved, the voice VLAN association occurs again. The voice VLAN information may change if the phone is moved.

In an implementation where a PC, or other IP device, is connected to the switch through the IP phone, and the phone is in an Auxiliary VLAN, Layer 2 frame type incompatibility may keep the phone and device from communicating. The IP phone and device cannot communicate if they are in the same VLAN and subnet but each is using a different frame type. Because the traffic between the two takes place on the same subnet, it will not be routed and therefore, the Layer 2 headers will not be altered. Also, switch commands cannot be used to configure the frame type being used by a device on the other side of the phone that is not directly attached to the switch.

The voice VLAN feature enables access ports to carry IP voice traffic from an IP phone. When the switch is connected to a Cisco 7960 IP Phone, the IP Phone sends voice traffic with Layer 3 IP precedence and Layer 2 class of service (CoS) values, which are both set to 5 by default. Because the sound quality of an IP phone call can deteriorate if the data is unevenly sent, the switch supports quality of service (QoS) based on IEEE 802.1p CoS. QoS uses classification and scheduling to send network traffic from the switch in a predictable manner.

In order for the device and the phone to communicate, one of the following must be true:

• They both use the same Layer 2 frame type.
• The phone uses 802.1p frames and the device uses untagged frames.
• The phone uses untagged frames and the device uses 802.1p frames.
• The phone uses 802.1Q frames, and the voice VLAN equals the native VLAN.
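The four conditions above amount to a small decision rule. The sketch below encodes them directly; it is an illustration only, and the function name `can_communicate`, the frame type strings, and the VLAN parameters are hypothetical labels chosen for this example.

```python
def can_communicate(phone_frame, device_frame, voice_vlan=None, native_vlan=None):
    """Apply the four stated conditions for phone and attached-device traffic.

    Frame types are given as "untagged", "802.1p", or "802.1Q".
    """
    # Both use the same Layer 2 frame type.
    if phone_frame == device_frame:
        return True
    # One side uses 802.1p frames and the other uses untagged frames.
    if {phone_frame, device_frame} == {"802.1p", "untagged"}:
        return True
    # The phone uses 802.1Q frames and the voice VLAN equals the native VLAN.
    if phone_frame == "802.1Q" and voice_vlan is not None and voice_vlan == native_vlan:
        return True
    return False


same_type = can_communicate("untagged", "untagged")
mixed_ok = can_communicate("802.1p", "untagged")
tagged_mismatch = can_communicate("802.1Q", "untagged", voice_vlan=110, native_vlan=1)
tagged_native = can_communicate("802.1Q", "untagged", voice_vlan=1, native_vlan=1)
```

Only the `tagged_mismatch` case evaluates to false: the phone tags its frames for a voice VLAN that differs from the native VLAN while the device sends untagged frames.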

The Cisco 7960 IP Phone is a configurable device, and you can configure it to forward traffic with an IEEE 802.1p priority. You can configure the switch to trust or override the traffic priority assigned by an IP Phone. The Cisco IP Phone contains an integrated three-port 10/100 switch. The ports provide dedicated connections to these devices:

• Port 1 connects to the switch or other voice-over-IP (VoIP) device.
• Port 2 is an internal 10/100 interface that carries the IP phone traffic.
• Port 3 (access port) connects to a PC or other device.

Cisco IP Phone Voice Traffic

You can configure an access port with an attached Cisco IP Phone to use one VLAN for voice traffic and another VLAN for data traffic from a device attached to the phone. You can configure access ports on the switch to send Cisco Discovery Protocol (CDP) packets that instruct an attached Cisco IP Phone to send voice traffic to the switch in any of these ways:

• In the voice VLAN tagged with a Layer 2 CoS priority value
• In the access VLAN tagged with a Layer 2 CoS priority value
• In the access VLAN, untagged (no Layer 2 CoS priority value)

Note In all configurations, the voice traffic carries a Layer 3 IP precedence value (the default is 5 for voice traffic and 3 for voice control traffic).

Cisco IP Phone Data Traffic

The switch can also process tagged data traffic (traffic in IEEE 802.1Q or IEEE 802.1p frame types) from the device attached to the access port on the Cisco IP Phone. You can configure Layer 2 access ports on the switch to send CDP packets that instruct the attached Cisco IP Phone to configure the IP phone access port in one of these modes:

• In trusted mode, all traffic received through the access port on the Cisco IP Phone passes through the IP phone unchanged.

• In untrusted mode, all traffic in IEEE 802.1Q or IEEE 802.1p frames received through the access port on the IP phone receives a configured Layer 2 CoS value. The default Layer 2 CoS value is 0. Untrusted mode is the default.

Note Untagged traffic from the device attached to the Cisco IP Phone passes through the IP phone unchanged, regardless of the trust state of the access port on the IP phone.
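The two access port modes can be sketched in a few lines of Python. This is only an illustrative model of the behavior described above, not phone firmware; the function name cos_after_phone and the use of None to represent an untagged frame are assumptions made for the example.

```python
from typing import Optional

DEFAULT_UNTRUSTED_COS = 0  # default Layer 2 CoS applied in untrusted mode


def cos_after_phone(frame_cos: Optional[int], trusted: bool,
                    override_cos: int = DEFAULT_UNTRUSTED_COS) -> Optional[int]:
    """Return the CoS a PC frame carries after it crosses the IP phone.

    frame_cos is None for an untagged frame (hypothetical convention),
    otherwise the 802.1p value (0-7) set by the attached device.
    """
    if frame_cos is None:
        return None              # untagged traffic passes unchanged in either mode
    if not 0 <= frame_cos <= 7:
        raise ValueError("CoS must be between 0 and 7")
    if trusted:
        return frame_cos         # trusted mode: tagged traffic passes unchanged
    return override_cos          # untrusted mode: rewrite to the configured CoS


print(cos_after_phone(5, trusted=True))      # 5
print(cos_after_phone(5, trusted=False))     # 0
print(cos_after_phone(None, trusted=False))  # None
```

In untrusted mode a PC that marks its own frames with CoS 5 cannot compete with voice traffic, because the phone rewrites the value before the frame reaches the switch.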

Configuring the Priority of Incoming Data Frames

You can connect a PC or other data device to a Cisco IP Phone port. To process tagged data traffic (in IEEE 802.1Q or IEEE 802.1p frames), you can configure the switch to send CDP packets to instruct the IP phone how to send data packets from the device attached to the access port on the Cisco IP Phone. The PC can generate packets with an assigned CoS value. You can configure the Cisco IP Phone to not change (trust) or to override (not trust) the priority of frames arriving on the IP phone port from connected devices. This is achieved through the switchport priority extend command.

Default Voice VLAN Configuration

The voice VLAN feature is disabled by default. When the voice VLAN feature is enabled, all untagged traffic is sent according to the default CoS priority of the port. The CoS value is not trusted for IEEE 802.1p or IEEE 802.1Q tagged traffic.

CCM Ports

Protocol | Ports (Remote Source / CallManager Destination / CallManager Source / Remote Device Destination) | Remote Devices; Notes

DTC | TCP 135 | CallManagers in the same cluster
SSH | TCP 22 | Secure Shell Client
Telnet | TCP 23 | Telnet Client
DNS | UDP 53 | DNS Servers
DHCP | UDP 68, UDP 67 | DHCP Server
DHCP | UDP 68, UDP 67 | DHCP Client
TFTP | UDP 69 | Dynamic Ports used after initial connect
HTTP | TCP 80 | Administrator / User Web browsers; CCMAdmin and CCMUser pages
OSI (DAP, DSP, DISP) | TCP or UDP 120 | DCD Directory
NTP | UDP 123 |
WINS | UDP 137-139 | WINS Server; Windows Internet Name Service
SNMP | UDP 161 |
SNMP Trap | UDP 162 |
LDAP | TCP 389, TCP 389 | Directory Services; When integrated with Corporate Directory
HTTPS / SSL | TCP 443 |
SMB | TCP 445, TCP 445 | CallManagers in the same cluster
Syslog | TCP 514, UDP 514 | Syslog service
RMI | TCP 1099-1129 | RMI Service. Attendant Console
MS SQL | TCP 1433 |
H.323 RAS | UDP 1718 | Gatekeeper discovery
H.323 RAS | UDP 1719 | Gatekeeper RAS; CallManager prior to 3.3. Cisco Conference Connection
H.323 RAS | UDP 1024-4999, UDP 1719 | Gatekeeper RAS; CallManager 3.3
H.323 H.225 | TCP 1720, TCP 1720 | H.323 Gateways / Anonymous Device Cisco Conference Connection / Non-Gatekeeper Controlled H.323 Trunk
H.323 H.225/ICT | TCP 1024-4999 | CallManager Gatekeeper Controlled H.323 Trunks; CallManager 3.3
H.323 H.245 | TCP 1024-4999, TCP 1024-4999 | CallManager H.323 Gateways / Anonymous Device / H.323 Trunks
H.323 H.245 | TCP 11000-65535 | IOS H.323 Gateways. Cisco Conference Connection
SCCP | TCP 2000 | Skinny Clients (IP Phones)
Skinny Gateway (Analogue) | TCP 2001 | Analogue Skinny Gateway; Obsolete
Skinny Gateway (Digital) | TCP 2002 | Digital Skinny Gateway; Obsolete
MGCP Control | UDP 2427 | MGCP Gateway Control
MGCP Backhaul | TCP 2428 | MGCP Gateways Backhaul
RTS Serv | 2500 |
Cisco Extended Service | TCP 2551 | Active / Backup Determination
Cisco Extended Service | TCP 2552 | DB Change Notification
RIS Data Collector | TCP 2555 | Inter RIS communication
RIS Data Collector | TCP 2556 | Used by clients (IIS) to communicate with RIS
CTI/QBE | TCP 2748 | TAPI/JTAPI Applications; Connects with CTI Manager. Used by IVR, CCC, PA, Cisco Softphone, CRS, ICD, IPCC, IPMA, Attendant Console and any other application that utilizes the TAPI or J/TAPI plugin / TSP.
IPMA Service | TCP 2912 | IPMA Assistant Console
Media Streaming Application | UDP 3001 | Change Notification
SCCP | TCP 3224 | Media Resources; Conference Bridges / Xcoders
MS Terminal Services | TCP 3389 | Windows Terminal Services
Entercept HID Agent | TCP 5000 | Host Intrusion Detection Console
CallManager SIP | TCP/UDP 5060, TCP 5060 | SIP Trunk Default Port; Can use TCP 1024 - 65535
VNC http helper | TCP 580x | Remote Control
VNC Display | TCP 690x | Virtual Network Computer Display; Remote Control
CallManager Change Notification | TCP 7727 | CallManager Change Notification. Cisco database layer monitor, Cisco TFTP, Cisco IP media streaming, Cisco TCD, Cisco MOH; RealTime Change Notification
IPMA Service | TCP 8001 | IP Manager Assistant; Change Notification
ICCS | TCP 8002 | Intra Cluster Communication
CTIM | TCP 8003 |
Cisco Tomcat | TCP 8007 | Web Requests
Cisco Tomcat | TCP 8009 | Web Requests
Cisco Tomcat | TCP 8111 | IIS, Web Requests to IPMA worker thread
Cisco Tomcat | TCP 8222 | IIS, Web Requests to EM application worker thread
Cisco Tomcat | TCP 8333 | IIS, Web Requests to WebDialer application worker thread
DC Directory | TCP 8404 | Embedded Directory Services; Used for Directory services. Application Authentication / configuration. SoftPhone Directory. User Directory
Cisco Tomcat | TCP 8444 | IIS, Web Requests to EM Service worker thread
Cisco Tomcat | TCP 8555 | IIS, Web Requests to Apache SOAP worker thread
Cisco Tomcat | TCP 8998 | Web Requests
Cisco Tomcat | TCP 9007 | IIS, Web Requests to CAR worker thread
RTP | UDP 16384-32767, UDP 16384-32767 | Voice Media; IP IVR Media. CCC IVR Media, Cisco SoftPhone, Media Streaming Application
Cisco SNMP Trap Agent | UDP 61441 | Cisco Alarm Interface; Receives some SNMP alarm in XML format.

Voice considerations in campus submodules

Building Access Submodule

Within the Building Access submodule, these features support IP telephony:

• Voice VLANs
• 802.1p/Q
• Hardware support for multiple output queues
• Hardware support for in-line power to IP phones
• PortFast
• Root Guard
• Unidirectional Link Detection (UDLD)
• UplinkFast

Building Distribution Submodule

Within the Building Distribution submodule, these features support IP telephony:

• Passive interfaces
• Layer 3 redundancy with Hot Standby Router Protocol (HSRP), HSRP track, and HSRP preempt
• OSPF or Enhanced Interior Gateway Routing Protocol (EIGRP) routing with adjusted timers, summary addresses, and path costs

Network design considerations for voice

IP telephony places strict requirements on the network infrastructure. The network must provide sufficient bandwidth and quick convergence after network failures or network changes. Most IP telephony installations are built on an existing network infrastructure, so that infrastructure typically requires enhancement, with priority given to voice traffic.

General Design Considerations

To determine if an infrastructure can support the addition of voice, evaluate these considerations:

Features required for each device in the campus network – IP phones require power and most enterprises put IP telephony applications on a separate VLAN with priority handling.

Physical plant capable of supporting IP telephony – The wiring and cabling plant must be adequate for IP telephony needs. At a minimum, Category 5 cabling is required, and the plan should account for the additional wall jacks and switch ports needed to support phone and PC connections.

Provision switches with inline power to support IP phones – Within a wiring closet, deploy a Catalyst Inline Power Patch Panel or use inline power from the switch to power the IP phones. This may increase the power requirements of the switch itself.

Network bandwidth adequate for data, voice and call control traffic – Along with data traffic, consider both voice and call control traffic loads. Bandwidth provisioning requires careful planning of the LAN infrastructure so that the available bandwidth is always considerably higher than the load. There should be no steady-state congestion or latency over the LAN links. This is critical for voice operations over a LAN infrastructure.

Bandwidth Provisioning

Properly provisioning the network bandwidth is a major component of designing a successful IP telephony network. The required bandwidth can be calculated by adding the bandwidth requirements for each major application, including voice, video, and data. This sum then represents the minimum bandwidth requirement for any given link, and it should not exceed approximately 75 percent of the total available bandwidth for the link.
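As a rough illustration of the provisioning rule above, the following Python sketch sums per-application loads and checks the total against 75 percent of the link capacity. The function name, the dictionary keys, and the traffic figures are hypothetical values chosen for the example, not measurements.

```python
def link_can_carry(app_loads_bps, link_capacity_bps, max_utilization=0.75):
    """Return (required_bps, fits) for a link.

    required_bps is the sum of the per-application requirements; fits is True
    when that sum stays within max_utilization of the link capacity.
    """
    required_bps = sum(app_loads_bps.values())
    return required_bps, required_bps <= max_utilization * link_capacity_bps


# Hypothetical loads on a 10-Mbps link.
loads = {"voice": 2_000_000, "video": 3_000_000, "data": 2_000_000}
print(link_can_carry(loads, 10_000_000))  # (7000000, True)
```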

From a traffic standpoint, an IP telephony call consists of two traffic types:

Voice carrier stream – This consists of Real-Time Transport Protocol (RTP) packets that contain the actual voice samples.

Call control signaling – This consists of packets belonging to one of several protocols; those used to set up, to maintain, to tear down, or to redirect a call depending upon call endpoints. Examples are H.323 or Media Gateway Control Protocol (MGCP).

A Voice over IP (VoIP) packet consists of the voice payload, IP header, UDP header, RTP header, and Layer 2 link header. Coder-decoder (codec) type (G.711, G.729, etc.) is configurable by device. However, G.729 does not support fax or modem traffic. The IP header is 20 bytes, the UDP header is 8 bytes, and the RTP header is 12 bytes. The link header varies in size according to the Layer 2 media used; Ethernet requires 14 bytes of header. The voice payload size and the packetization period are device-dependent.

To calculate the bandwidth that voice streams consume, use this formula:

(Packet payload + all headers in bits) * Packet rate per second; for example, 50 packets per second (pps) when using a 20-ms packet period
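The formula can be worked through in Python as follows. The header sizes are the ones given above; the payload sizes are assumptions based on the standard codec bit rates at a 20-ms packet period (G.711 at 64 kbps gives 160 bytes, G.729 at 8 kbps gives 20 bytes), and actual payload sizes are device-dependent.

```python
IP_HEADER_BYTES = 20
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12
ETHERNET_HEADER_BYTES = 14  # link header size used in the text for Ethernet


def voice_stream_bps(payload_bytes, packet_period_ms=20,
                     link_header_bytes=ETHERNET_HEADER_BYTES):
    """Bandwidth of one voice stream in one direction, in bits per second."""
    packets_per_second = 1000 / packet_period_ms  # 20 ms -> 50 pps
    packet_bytes = (payload_bytes + IP_HEADER_BYTES + UDP_HEADER_BYTES
                    + RTP_HEADER_BYTES + link_header_bytes)
    return packet_bytes * 8 * packets_per_second


print(voice_stream_bps(160))  # G.711 payload at 20 ms: 85600.0
print(voice_stream_bps(20))   # G.729 payload at 20 ms: 29600.0
```

The headers add 54 bytes to every packet, so a low-bit-rate codec saves less bandwidth on the wire than its nominal rate suggests.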

Power Considerations

Accurate calculations of power requirements are critical for an effective IP telephony solution. Power can be supplied to the IP phones directly from Catalyst switches with inline power capabilities or by inserting a Catalyst Inline Power Patch Panel. In addition to IP phones, failover power and total load must be considered for all devices in the IP telephony availability definition, including Distribution and Backbone submodules, gateways, CallManager and other servers and devices. Power calculations, therefore, must be network rather than device based.

Providing highly available power protection requires an uninterruptible power supply (UPS) with enough battery life to support at least one hour of operation and a four-hour service response for power system failures, or a generator with an onsite service contract. This solution must include UPS or generator backup for all devices associated with the IP telephony network. In addition, consider UPS systems that have auto-restart capability and a service contract for four-hour support response.

IP telephony high-availability power and environment include these recommendations:

• UPS and generator backup
• UPS systems with auto-restart capability
• UPS system monitoring
• A 4-hour service response contract for UPS system problems
• Maintain recommended equipment operating temperatures 24/7

Intelligent Network Services

Network management, high availability, security, and quality of service (QoS) intelligent network services must extend to incorporate voice-specific attributes.

Network Management – The merging of network management tasks associated with both voice and data networks is one of the key benefits of using a converged network as opposed to a voice-only network. However, it is still necessary to understand the traditional voice-only management concepts to relate the features available in that technology to the converged network management techniques.

High Availability – As with any network capability, plan redundancy for critical voice network components such as the Cisco CallManager and the associated gateway and infrastructure devices.

Security – The subject of securing voice communications has received more visibility recently as network convergence becomes an accepted design model. With the advent of IP telephony traffic traversing the LAN infrastructure, the potential exists for malicious attacks on call-processing components and telephony applications. As with all network devices, there should be a predefined security policy for all devices, applications, and users associated with the voice network that is appropriate for the level of caution required. Consider security measures for voice call-processing platforms, applications and telephony traffic.

QoS – The goal of QoS is to provide critical applications a higher priority for service so that they are the least likely to be delayed or to be dropped in times of congestion. When a network becomes congested, some traffic will be delayed or lost. Voice traffic has strict requirements concerning delay and delay variation (also known as "jitter") and compared to most data traffic, voice traffic is relatively intolerant of loss. To establish priority processing for voice traffic, a wide range of IP QoS features can be employed, such as classification, queuing, congestion detection, traffic shaping, and compression.

Voice VLAN Configuration Guidelines

These are the voice VLAN configuration guidelines:

• You should configure voice VLAN on switch access ports; voice VLAN is not supported on trunk ports. You can configure a voice VLAN only on Layer 2 ports.

Note Voice VLAN is only supported on access ports and not on trunk ports, even though the configuration is allowed.

• The voice VLAN should be present and active on the switch for the IP phone to correctly communicate on the voice VLAN. Use the show vlan privileged EXEC command to see if the VLAN is present (listed in the display).

• Do not configure voice VLAN on private VLAN ports.

• The Power over Ethernet (PoE) switches are capable of automatically providing power to Cisco pre-standard and IEEE 802.3af-compliant powered devices if they are not being powered by an AC power source.

• Before you enable voice VLAN, we recommend that you enable QoS on the switch by entering the mls qos global configuration command and configure the port trust state to trust by entering the mls qos trust cos interface configuration command. If you use the auto-QoS feature, these settings are automatically configured.

• You must enable CDP on the switch port connected to the Cisco IP Phone to send the configuration to the Cisco IP Phone. (CDP is globally enabled by default on all switch interfaces.)

• The Port Fast feature is automatically enabled when voice VLAN is configured. When you disable voice VLAN, the Port Fast feature is not automatically disabled.

• If the Cisco IP Phone and a device attached to the Cisco IP Phone are in the same VLAN, they must be in the same IP subnet. These conditions indicate that they are in the same VLAN:

– They both use IEEE 802.1p or untagged frames.
– The Cisco IP Phone uses IEEE 802.1p frames, and the device uses untagged frames.
– The Cisco IP Phone uses untagged frames, and the device uses IEEE 802.1p frames.
– The Cisco IP Phone uses IEEE 802.1Q frames, and the voice VLAN is the same as the access VLAN.

• The Cisco IP Phone and a device attached to the phone cannot communicate if they are in the same VLAN and subnet but use different frame types because traffic in the same subnet is not routed (routing would eliminate the frame type difference).

• You cannot configure static secure MAC addresses in the voice VLAN.

• Voice VLAN ports can also be these port types:

– Dynamic access port.
– IEEE 802.1x authenticated port.

Note If you enable IEEE 802.1x on an access port on which a voice VLAN is configured and to which a Cisco IP Phone is connected, the Cisco IP Phone loses connectivity to the switch for up to 30 seconds.

– Protected port.
– A source or destination port for a SPAN or RSPAN session.
– Secure port.

Note When you enable port security on an interface that is also configured with a voice VLAN, you must set the maximum allowed secure addresses on the port to two plus the maximum number of secure addresses allowed on the access VLAN. When the port is connected to a Cisco IP phone, the IP Phone requires up to two MAC addresses. The IP phone address is learned on the voice VLAN and might also be learned on the access VLAN. Connecting a PC to the IP phone requires additional MAC addresses.

VoIP Configuration Guides

Priority of Incoming Data Frames

You can connect a PC or other data device to a Cisco IP Phone port. To process tagged data traffic (in IEEE 802.1Q or IEEE 802.1p frames), you can configure the switch to send CDP packets to instruct the IP phone how to send data packets from the device attached to the access port on the Cisco IP Phone. The PC can generate packets with an assigned CoS value. You can configure the Cisco IP Phone to not change (trust) or to override (not trust) the priority of frames arriving on the IP phone port from connected devices.

QoS Basics

Network managers must be prepared for increasing amounts of traffic, requiring more bandwidth than is currently available. This is especially important when dealing with Voice traffic. Almost any network can take advantage of QoS for optimum efficiency, whether it is a small corporate network, an Internet service provider (ISP), or an enterprise network. QoS is the application of features and functionality required to actively manage and satisfy networking requirements of applications sensitive to loss, delay, and delay variation (jitter). QoS allows preference to be given to critical application flows for the available bandwidth. QoS tools enable manageability and predictable service for a variety of networked applications and traffic types in a complex network.

The Cisco IOS implementation of QoS software provides these benefits:

Priority access to resources – QoS allows administrators to control which traffic is allowed to access specific network resources such as bandwidth, equipment and WAN links. Critical traffic may take possession of a resource by dropping low-priority packets.

Efficient management of network resources – If network management and accounting tools indicate that specific traffic is experiencing latency, jitter, and packet loss, then QoS tools can be used to adjust how that traffic is handled.

Tailored services – The control provided by QoS enables ISPs to offer carefully tailored grades of service differentiation to their customers. For example, a service provider can offer one service level agreement (SLA) to a customer website that receives 3000 to 4000 hits per day and another to a site that receives only 200 to 300 hits per day.

Coexistence of mission-critical applications – QoS technologies ensure that mission-critical business applications receive priority access to network resources while providing adequate processing for applications that are not delay sensitive. Multimedia and voice applications tolerate little latency and require priority access to resources. Other delay-tolerant traffic traversing the same link, such as SMTP over TCP, can still be adequately serviced.

QoS and voice traffic in the campus module

Regardless of the speed of individual switches or links, speed mismatches, many-to-one switching fabrics, and aggregation may cause a device to experience congestion, which can result in latency. If congestion occurs and congestion management features are not in place, some packets will be dropped, causing retransmissions that inevitably increase overall network load. QoS can mitigate latency caused by congestion on campus devices.

QoS is implemented by classifying and marking traffic at one device while allowing other devices to prioritize or to queue the traffic according to those marks applied to individual frames or packets.

Network Availability Problem Areas

An enterprise network may experience any of these network availability problems:

Delay – Delay (or latency) is the amount of time that it takes a packet to reach the receiving endpoint after being transmitted from the sending endpoint. This time period is termed the "end-to-end delay," and can be broken into two areas: fixed network delay and variable network delay. Fixed network delay includes encoding and decoding time (for voice and video), as well as the amount of time required for the electrical and optical pulses to traverse the media en route to their destination. Variable network delay generally refers to network conditions, such as congestion, that may affect the overall time required for transit. In data networks, for example, these types of delay occur:

Packetization delay – The amount of time that it takes to segment data (if necessary), sample and encode signals (if necessary), process data, and turn the data into packets

Serialization delay – The amount of time that it takes to place the bits of a packet, encapsulated in a frame, onto the physical media

Propagation delay – The amount of time that it takes to transmit the bits of a frame across the physical wire

Processing delay – The amount of time that it takes for a network device to take the frame from an input interface, place it into a receive queue, and then place it into the output queue of the output interface

Queuing delay – The amount of time that a packet resides in the output queue of an interface
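To make the delay components concrete, the Python sketch below computes serialization delay from frame size and link speed and then adds the components into an end-to-end figure. The function names and the sample values for the other components are hypothetical and are chosen only to show the arithmetic.

```python
def serialization_delay_ms(frame_bytes, link_bps):
    """Time to place every bit of a frame onto the link, in milliseconds."""
    return frame_bytes * 8 / link_bps * 1000


def end_to_end_delay_ms(components_ms):
    """Sum the individual delay components (all in milliseconds)."""
    return sum(components_ms.values())


# A 214-byte voice frame on a 10-Mbps link versus a 64-kbps link.
fast = serialization_delay_ms(214, 10_000_000)   # about 0.17 ms
slow = serialization_delay_ms(214, 64_000)       # about 26.75 ms

# Hypothetical values for the remaining components.
components = {
    "packetization": 20.0,
    "serialization": slow,
    "propagation": 5.0,
    "processing": 1.0,
    "queuing": 8.0,
}
print(round(end_to_end_delay_ms(components), 2))
```

The same frame takes far longer to serialize on a slow link, which is why serialization delay matters most at low-speed edges of the network.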

Delay variation – Delay variation (or jitter) is the difference in the end-to-end delay between packets. For example, if one packet requires 100 ms to traverse the network from the source endpoint to the destination endpoint, and the following packet requires 125 ms to make the same trip, then the delay variation is calculated as 25 ms.
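The calculation in this example generalizes to a sequence of packets: the delay variation between consecutive packets is the absolute difference of their end-to-end delays. A minimal Python sketch, with a hypothetical function name, follows.

```python
def delay_variations_ms(delays_ms):
    """Delay variation between each pair of consecutive packets, in ms."""
    return [abs(later - earlier)
            for earlier, later in zip(delays_ms, delays_ms[1:])]


print(delay_variations_ms([100, 125]))       # [25]
print(delay_variations_ms([100, 125, 110]))  # [25, 15]
```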

Each end station and Cisco network device in a voice or video conversation has a jitter buffer. Jitter buffers are used to smooth out changes in arrival times of data packets containing voice and video. A jitter buffer is dynamic and can adjust for changes in arrival times of packets. If you have instantaneous changes in arrival times of packets that are outside of the capabilities of a jitter buffer to compensate, you will have one of these situations:

A jitter buffer underrun, when arrival times between packets containing voice or video increase to the point where the jitter buffer has been exhausted and contains no packets to process the signal for the next piece of voice or video.

A jitter buffer overrun, when arrival times between packets containing voice or video decrease to the point where the jitter buffer cannot dynamically resize itself quickly enough to accommodate. When an overrun occurs, packets are dropped and voice quality is degraded.
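The two failure cases can be illustrated with a small simulation. This is a deliberately simplified model and not how any Cisco device implements its jitter buffer: the buffer has a fixed capacity instead of resizing dynamically, one packet is played out every packet period, and the function name and parameters are assumptions made for the example.

```python
def simulate_jitter_buffer(arrival_ms, period_ms=20, capacity=3,
                           start_delay_ms=40):
    """Return (underruns, overruns) for a fixed-size playout buffer.

    arrival_ms lists packet arrival times. Playout starts start_delay_ms
    after the first arrival and consumes one packet every period_ms.
    """
    arrivals = sorted(arrival_ms)
    total = len(arrivals)
    if total == 0:
        return 0, 0
    buffered = played = underruns = overruns = 0
    next_arrival = 0
    playout_time = arrivals[0] + start_delay_ms
    while played + overruns < total:
        # Enqueue every packet that has arrived by this playout tick.
        while next_arrival < total and arrivals[next_arrival] <= playout_time:
            if buffered == capacity:
                overruns += 1      # buffer full: the packet is dropped
            else:
                buffered += 1
            next_arrival += 1
        if buffered:
            buffered -= 1          # play one packet
            played += 1
        else:
            underruns += 1         # nothing to play at this tick
        playout_time += period_ms
    return underruns, overruns


print(simulate_jitter_buffer([0, 20, 40, 60, 80]))          # (0, 0)
print(simulate_jitter_buffer([0, 20, 40, 200, 220]))        # (5, 0)
print(simulate_jitter_buffer([0, 20, 40, 41, 42, 43, 44]))  # (0, 3)
```

The second call models a long gap between packets (underrun), and the third models a burst that arrives faster than the buffer can absorb (overrun).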

Packet loss – Packet loss is a measure of the packets that were transmitted but not received, compared to the total number that were transmitted. Loss is expressed as the percentage of packets that were dropped. Tail drops occur when the output queue is full. These are the most common drops that can occur when a link is congested. Other types of drops (input, ignore, overrun, no buffer) are not as common but may require a hardware upgrade because they are usually a result of network device congestion.
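Expressed as arithmetic, loss is the share of transmitted packets that never arrived. A minimal Python sketch, with a hypothetical function name and sample counters:

```python
def packet_loss_percent(sent, received):
    """Percentage of transmitted packets that were dropped."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return (sent - received) / sent * 100


print(packet_loss_percent(1000, 990))  # 1.0
```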

QoS Trust Boundaries

In a campus QoS implementation, boundaries are defined where the existing QoS values attached to frames and to packets are to be accepted or altered. These "trust boundaries" are established by configuring trust levels on the ports of key peripheral network devices where QoS policies will be enforced as traffic makes its way into the network. At these boundaries, traffic will be allowed to retain its original QoS marking or have new marking ascribed as a result of policies associated with its entry point into the network.

Trust boundaries establish a border for traffic entering the campus network. As traffic traverses the switches of the campus network, it is handled and is prioritized according to the marks received or trusted when the traffic originally entered the network at the trust boundary.

At the trust boundary device, QoS values are trusted if they are considered to accurately represent the type of traffic and the precedence processing the traffic should receive as it enters the campus network. If untrusted, the traffic will be marked with a new QoS value appropriate for the policy in place at the point where the traffic entered the campus network. Ideally, the trust boundary exists at the first switch receiving traffic from a device or IP Phone. It is also acceptable to establish the trust boundary where the traffic from an Access switch enters a Distribution layer port.

NOTE:

Best Practices suggest classifying and marking traffic as close to the traffic source as possible.

QoS traffic classification and marking

Classification and marking is the process of identifying traffic for proper prioritization as that traffic traverses the campus network. Traffic is classified by examining information at various layers of the OSI model. All traffic classified in a certain manner will receive an associated mark or QoS value. IP traffic can be classified according to any values configurable in an ACL or any of the following criteria:

Layer 2 parameters – MAC address, Multiprotocol Label Switching (MPLS), ATM cell loss priority (CLP) bit, Frame Relay discard eligible (DE) bit, ingress interface

Layer 3 parameters – IP precedence, DSCP, QoS group, IP address, ingress interface

Layer 4 parameters – TCP or UDP ports, ingress interface

Layer 7 parameters – application signatures, ingress interface

All traffic classified or grouped according to the criteria above, will be marked according to that classification. QoS marks or values establish priority levels or priority classes of service for network traffic as it is processed by each switch. Once traffic is marked with a QoS value, then QoS policies on switches and interfaces will handle traffic according to the QoS values contained in individual frames and packets. As a result of classification and marking, traffic will be prioritized accordingly at each switch to ensure that delay sensitive traffic receives priority processing as the switch manages congestion, delay and bandwidth allocation.

Layer 2 QoS Marking

QoS Layer 2 classification occurs by examining information in the Ethernet or 802.1Q header such as destination MAC address or VLAN ID. QoS Layer 2 marking occurs in the Priority field of 802.1Q header. LAN Layer 2 headers have no means of carrying a QoS value so 802.1Q encapsulation is required if Layer 2 QoS marking is to occur. The Priority field is 3 bits in length and is also known as the 802.1p User Priority or Class of Service (CoS) value.

This 3-bit field holds CoS values ranging from 0 to 7, with the lowest values associated with delay-tolerant traffic such as ordinary TCP/IP data. Voice traffic, which by nature is not delay tolerant, receives higher default CoS values, such as 3 for call signaling. A CoS value of 5 is given to voice bearer traffic, which is the phone conversation itself, where voice quality is impaired if any packets are dropped or delayed.
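The position of the Priority field can be shown with a little bit arithmetic. In the 16-bit 802.1Q Tag Control Information (TCI) field, the three most-significant bits carry the CoS value, the next bit is the drop eligible (formerly canonical format) indicator, and the low 12 bits carry the VLAN ID. The Python function names and the VLAN number below are hypothetical.

```python
def build_tci(cos, vlan_id, dei=0):
    """Pack CoS (3 bits), DEI (1 bit), and VLAN ID (12 bits) into a TCI."""
    if not 0 <= cos <= 7:
        raise ValueError("CoS must be between 0 and 7")
    if not 0 <= vlan_id <= 0x0FFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    return (cos << 13) | (dei << 12) | vlan_id


def parse_tci(tci):
    """Return (cos, dei, vlan_id) from a 16-bit TCI value."""
    if not 0 <= tci <= 0xFFFF:
        raise ValueError("TCI must fit in 16 bits")
    return tci >> 13, (tci >> 12) & 1, tci & 0x0FFF


tci = build_tci(cos=5, vlan_id=110)  # voice bearer frame in hypothetical VLAN 110
print(hex(tci))                      # 0xa06e
print(parse_tci(tci))                # (5, 0, 110)
```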

As a result of Layer 2 classification and marking, the following QoS operations can occur:

Input queue scheduling – When a frame enters a port, it can be assigned to one of a number of port-based queues prior to being scheduled for switching to an egress port. Typically, multiple queues are used where traffic requires different service levels.

Policing – Policing is the process of inspecting a frame to see if it has exceeded a predefined rate of traffic within a certain time frame that is typically a fixed number internal to the switch. If a frame is determined to be in excess of the predefined rate limit, it can either be dropped or the CoS value can be marked down.

Output queue scheduling – The switch will place the frame into an appropriate outbound (egress) queue for switching. The switch will perform buffer management on this queue by ensuring that the buffer does not overflow.

Layer 3 QoS Marking

QoS Layer 3 classification results from the examination of header values such as Destination IP address or Protocol. QoS Layer 3 marking occurs in the Type of Service (ToS) byte in the IP Header. The first 3 bits of the ToS byte are occupied by IP Precedence, which correlates to the 3 CoS Bits carried in the Layer 2 header.

The ToS Byte can also be used for Differentiated Services Code Point (DSCP) marking. DSCP allows prioritization hop by hop as packets are processed on each switch and interface. The ToS bits are used by DSCP values as shown below. The first 3 DSCP bits, correlating to Precedence and CoS, identify the DSCP Class of Service for the packet.

The next three DS bits establish a drop precedence for the packet; in the Assured Forwarding classes, the first two of these bits carry the drop precedence and the last bit is 0. Packets with a high DSCP drop precedence value will be dropped before those with a low value if a device or a queue becomes overloaded and must drop packets. Voice traffic will be marked with a low DSCP drop precedence value to minimize voice packet drop.

Each 6 bit DSCP value is also given a DSCP Codepoint name. DSCP classes 1-4 are Assured Forwarding classes (AF). Therefore, if the DSCP class value was 3 and the Drop Precedence was 1, the DSCP Codepoint would be AF31.
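The AF naming rule can be checked with a short Python sketch. It assumes the standard Assured Forwarding bit layout, in which the class occupies the first three DSCP bits and the drop precedence the next two; the function names are hypothetical.

```python
def af_dscp(af_class, drop_precedence):
    """Return the 6-bit DSCP value for Assured Forwarding class/drop."""
    if af_class not in (1, 2, 3, 4):
        raise ValueError("AF class must be 1-4")
    if drop_precedence not in (1, 2, 3):
        raise ValueError("drop precedence must be 1-3")
    return (af_class << 3) | (drop_precedence << 1)


def af_name(dscp):
    """Return the AF codepoint name for a DSCP value, or None if not AF."""
    af_class = dscp >> 3
    drop_precedence = (dscp >> 1) & 0b11
    if dscp & 1 or af_class not in (1, 2, 3, 4) or drop_precedence == 0:
        return None
    return f"AF{af_class}{drop_precedence}"


def tos_byte(dscp):
    """Place a DSCP value in the upper 6 bits of the ToS byte."""
    return dscp << 2


dscp = af_dscp(3, 1)
print(dscp, af_name(dscp), hex(tos_byte(dscp)))  # 26 AF31 0x68
```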

Understanding QoS

Typically, networks operate on a best-effort delivery basis, which means that all traffic has equal priority and an equal chance of being delivered in a timely manner. When congestion occurs, all traffic has an equal chance of being dropped.

When you configure QoS, you can select specific network traffic, prioritize it according to its relative importance, and use congestion-management and congestion-avoidance techniques to give preferential treatment. Implementing QoS in your network makes network performance more predictable and bandwidth utilization more effective.

The QoS implementation is based on the DiffServ architecture, an emerging standard from the Internet Engineering Task Force (IETF). This architecture specifies that each packet is classified upon entry into the network. The classification is carried in the IP packet header, using 6 bits from the deprecated IP type of service (ToS) field to carry the classification (class) information. Classification can also be carried in the Layer 2 frame.

These special bits in the Layer 2 frame or in the Layer 3 packet are described here and shown in the next figure:

• Prioritization bits in Layer 2 frames:

Layer 2 Inter-Switch Link (ISL) frame headers have a 1-byte User field that carries an IEEE 802.1p class of service (CoS) value in the three least-significant bits. On interfaces configured as Layer 2 ISL trunks, all traffic is in ISL frames.

Layer 2 IEEE 802.1Q frame headers have a 2-byte Tag Control Information field that carries the CoS value in the three most-significant bits, which are called the User Priority bits. On interfaces configured as Layer 2 IEEE 802.1Q trunks, all traffic is in IEEE 802.1Q frames except for traffic in the native VLAN.

Other frame types cannot carry Layer 2 CoS values.

Layer 2 CoS values range from 0 for low priority to 7 for high priority.

• Prioritization bits in Layer 3 packets:

Layer 3 IP packets can carry either an IP precedence value or a Differentiated Services Code Point (DSCP) value. QoS supports the use of either value because DSCP values are backward-compatible with IP precedence values.

IP precedence values range from 0 to 7.

DSCP values range from 0 to 63.

IP Precedence Values

Number Bit Value Name

0 000 Routine

1 001 Priority

2 010 Immediate

3 011 Flash

4 100 Flash Override

5 101 CRITIC/ECP

6 110 Internetwork Control

7 111 Network Control
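The backward compatibility between DSCP and IP precedence follows from the bit layout: the three most-significant bits of the ToS byte are the IP precedence, and the six most-significant bits are the DSCP. The Python sketch below, with a hypothetical function name, decodes a ToS byte using the names from the table above. The sample value 0xB8 corresponds to DSCP 46, the Expedited Forwarding codepoint.

```python
PRECEDENCE_NAMES = [
    "Routine", "Priority", "Immediate", "Flash",
    "Flash Override", "CRITIC/ECP", "Internetwork Control", "Network Control",
]


def describe_tos(tos):
    """Return (precedence, precedence name, dscp) for a ToS byte."""
    if not 0 <= tos <= 0xFF:
        raise ValueError("ToS must fit in one byte")
    precedence = tos >> 5   # top 3 bits
    dscp = tos >> 2         # top 6 bits
    return precedence, PRECEDENCE_NAMES[precedence], dscp


print(describe_tos(0xB8))  # (5, 'CRITIC/ECP', 46)
```

A device that understands only IP precedence reads DSCP 46 as precedence 5, because the top three bits of the two values are the same.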

Note Layer 3 IPv6 packets are treated as non-IP packets and are bridged by the switch.

To give the same forwarding treatment to packets with the same class information and different treatment to packets with different class information, all switches and routers that access the Internet rely on class information. Class information in the packet can be assigned by end hosts or by switches or routers along the way, based on a configured policy, detailed examination of the packet, or both. Detailed examination of the packet is expected to happen closer to the network edge so that core switches and routers are not overloaded.

Switches and routers along the path can use class information to limit the amount of resources allocated per traffic class. The behavior of an individual device when handling traffic in the DiffServ architecture is called per-hop behavior. If all devices along a path have a consistent per-hop behavior, you can construct an end-to-end QoS solution.

Implementing QoS in your network can be a simple or complex task and depends on the QoS features offered by your internetworking devices, the traffic types and patterns in your network, and the granularity of control that you need over incoming and outgoing traffic.

Basic QoS model

Actions at the ingress interface include classifying traffic, policing, and marking:

• Classifying distinguishes one kind of traffic from another. The process generates an internal DSCP for a packet, which identifies all the future QoS actions to be performed on this packet.

• Policing determines whether a packet is in or out of profile by comparing the internal DSCP to the configured policer. The policer limits the bandwidth consumed by a flow of traffic. The result of this determination is passed to the marker.

• Marking evaluates the policer and the configuration information for the action to be taken when a packet is out of profile and decides what to do with the packet (pass through a packet without modification, mark down the DSCP value in the packet, or drop the packet).

Actions at the egress interface include queueing and scheduling:

• Queueing evaluates the internal DSCP and determines in which of the four egress queues to place the packet. The DSCP value is mapped to a CoS value, which selects one of the queues.

• Scheduling services the four egress queues based on their configured weighted round robin (WRR) weights and thresholds. One of the queues can be the expedite queue, which is serviced until empty before the other queues are serviced. Congestion avoidance techniques include tail drop and Weighted Random Early Detection (WRED) on Gigabit-capable Ethernet ports and tail drop (with only one threshold) on 10/100 Ethernet ports.

Note Policing and marking also can occur on egress interfaces.

Classification

Classification is the process of distinguishing one kind of traffic from another by examining the fields in the packet. Classification is enabled only if QoS is globally enabled on the switch. By default, QoS is globally disabled, so no classification occurs.

Note Classification occurs on a physical interface or on a per-port per-VLAN basis. No support exists for classifying packets at the switch virtual interface level. You specify which fields in the frame or packet you want to use to classify incoming traffic.

The priorities for these CoS/DSCP values are as follows:

• CoS 5 (voice data): Highest priority (priority queue if present, otherwise high queue)

• CoS 6, 7 (routing protocols): Second priority (high queue)

• CoS 3, 4 (call signal and video stream): Third priority (high queue)

• CoS 1, 2 (streaming and mission critical): Fourth priority (high queue)

• CoS 0: Low priority (low queue)

For non-IP traffic, these are the classification options:

• Use the port default. If the frame does not contain a CoS value, the switch assigns the default port CoS value to the incoming frame. Then, the switch uses the configurable CoS-to-DSCP map to generate the internal DSCP value. The switch uses the internal DSCP value to generate a CoS value representing the priority of the traffic.

• Trust the CoS value in the incoming frame (configure the port to trust CoS). Then, the switch uses the configurable CoS-to-DSCP map to generate the internal DSCP value. Layer 2 ISL frame headers carry the CoS value in the three least-significant bits of the 1-byte User field. Layer 2 IEEE 802.1Q frame headers carry the CoS value in the three most-significant bits of the Tag Control Information field. CoS values range from 0 for low priority to 7 for high priority.

• The trust DSCP and trust IP precedence configurations are meaningless for non-IP traffic. If you configure a port with either of these options and non-IP traffic is received, the switch assigns the default port CoS value and generates the internal DSCP from the CoS-to-DSCP map.

• Perform the classification based on the configured Layer 2 MAC access control list (ACL), which can examine the MAC source address, the MAC destination address, and the Ethertype field. If no ACL is configured, the packet is assigned the default DSCP of 0, which means best-effort traffic; otherwise, the policy map specifies the DSCP to assign to the incoming frame.

For IP traffic, these are the classification options:

• Trust the IP DSCP in the incoming packet (configure the port to trust DSCP), and assign the same DSCP to the packet for internal use. The IETF defines the 6 most-significant bits of the 1-byte ToS field as the DSCP. The priority represented by a particular DSCP value is configurable. DSCP values range from 0 to 63.

For ports that are on the boundary between two QoS administrative domains, you can modify the DSCP to another value by using the configurable DSCP-to-DSCP-mutation map.

• Trust the IP precedence in the incoming packet (configure the port to trust IP precedence), and generate a DSCP by using the configurable IP-precedence-to-DSCP map. The IP Version 4 specification defines the three most-significant bits of the 1-byte ToS field as the IP precedence. IP precedence values range from 0 for low priority to 7 for high priority.

• Trust the CoS value (if present) in the incoming packet, and generate the DSCP by using the CoS-to-DSCP map.

• Perform the classification based on a configured IP standard or an extended ACL, which examines various fields in the IP header. If no ACL is configured, the packet is assigned the default DSCP of 0, which means best-effort traffic; otherwise, the policy map specifies the DSCP to assign to the incoming frame.

Classification Based on QoS ACLs

You can use IP standard, IP extended, and Layer 2 MAC ACLs to define a group of packets with the same characteristics (class). In the QoS context, the permit and deny actions in the access control entries (ACEs) have different meanings than with security ACLs:

• If a match with a permit action is encountered (first-match principle), the specified QoS-related action is taken.

• If a match with a deny action is encountered, the ACL being processed is skipped, and the next ACL is processed.

• If no match with a permit action is encountered and all the ACEs have been examined, no QoS processing occurs on the packet, and the switch offers best-effort service to the packet.

• If multiple ACLs are configured on an interface, the lookup stops after the packet matches the first ACL with a permit action, and QoS processing begins.
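The lookup rules above can be sketched in Python as a first-match search over an ordered list of ACLs. This is only a model of the described behavior, not switch code: the data layout (an ACL as a name plus a list of action/predicate pairs) and the packet fields `src` and `dport` are assumptions made for the illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Packet = Dict[str, object]
Ace = Tuple[str, Callable[[Packet], bool]]   # ("permit" or "deny", match test)
Acl = Tuple[str, List[Ace]]                  # (ACL name, ordered ACEs)

def classify(packet: Packet, acls: List[Acl]) -> Optional[str]:
    """Return the name of the first ACL that permits the packet, else None.

    None models the case where no permit matches: no QoS processing
    occurs and the packet receives best-effort service.
    """
    for name, aces in acls:
        for action, matches in aces:
            if matches(packet):
                if action == "permit":
                    return name      # first match wins; QoS processing begins
                break                # deny: skip this ACL, try the next one
    return None

# Hypothetical ACLs for illustration.
acls = [
    ("voice", [("deny", lambda p: p["src"] == "10.0.0.9"),
               ("permit", lambda p: p["dport"] == 5060)]),
    ("bulk", [("permit", lambda p: True)]),
]

print(classify({"src": "10.0.0.1", "dport": 5060}, acls))  # voice
print(classify({"src": "10.0.0.9", "dport": 5060}, acls))  # bulk
```

In the second call the packet hits the deny entry of the first ACL, so that ACL is skipped and the packet falls through to the next one, which is the difference from a security ACL where deny would discard the packet.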

After a traffic class has been defined with the ACL, you can attach a policy to it. A policy might contain multiple classes with actions specified for each one of them. A policy might include commands to classify the class as a particular aggregate (for example, assign a DSCP) or rate-limit the class. This policy is then attached to a particular port on which it becomes effective.

You implement IP ACLs to classify IP traffic by using the access-list global configuration command; you implement Layer 2 MAC ACLs to classify non-IP traffic by using the mac access-list extended global configuration command.

Classification Based on Class Maps and Policy Maps

A class map is a mechanism that you use to name and to isolate a specific traffic flow (or class) from all other traffic. The class map defines the criteria used to match against a specific traffic flow to further classify it; the criteria can include matching the access group defined by the ACL, matching a specific list of DSCP or IP precedence values, or matching a specific list of VLAN IDs associated with another class map that defines the actual criteria (for example, to match a standard or extended ACL). If you have more than one type of traffic that you want to classify, you can create another class map and use a different name. After a packet is matched against the class-map criteria, you further classify it through the use of a policy map.

A policy map specifies which traffic class to act on. Actions can include trusting the CoS, DSCP, or IP precedence values in the traffic class; setting a specific DSCP or IP precedence value in the traffic class; or specifying the traffic bandwidth limitations and the action to take when the traffic is out of profile. Before a policy map can be effective, you must attach it to an interface.

You create a class map by using the class-map global configuration command or the class policy-map configuration command; you should use the class-map command when the map is shared among many ports. When you enter the class-map command, the switch enters the class-map configuration mode. In this mode, you define the match criterion for the traffic by using the match class-map configuration command.

You create and name a policy map by using the policy-map global configuration command. When you enter this command, the switch enters the policy-map configuration mode. In this mode, you specify the actions to take on a specific traffic class by using the class, trust, or set policy-map configuration and policy-map class configuration commands. To make the policy map effective, you attach it to an interface by using the service-policy interface configuration command. The policy map also can contain commands that define the policer, the bandwidth limitations of the traffic, and the action to take if the limits are exceeded.

A policy map has these characteristics:

• A policy map can contain multiple class statements.

• A separate policy-map class can exist for each type of traffic received through an interface.

• The policy-map trust state and an interface trust state are mutually exclusive, and whichever is configured last takes effect.

Policing and Marking

After a packet is classified and has an internal DSCP value assigned to it, the policing and marking process can begin.

Policing involves creating a policer that specifies the bandwidth limits for the traffic. Packets that exceed the limits are out of profile or nonconforming. Each policer specifies the action to take for packets that are in or out of profile. These actions, carried out by the marker, include passing through the packet without modification, dropping the packet, or marking down the packet with a new DSCP value that is obtained from the configurable policed-DSCP map.

You can create these types of policers:

• Individual

QoS applies the bandwidth limits specified in the policer separately to each matched traffic class. You configure this type of policer within a policy map by using the police policy-map configuration command.

• Aggregate

QoS applies the bandwidth limits specified in an aggregate policer cumulatively to all matched traffic flows. You configure this type of policer by specifying the aggregate policer name within a policy map by using the police aggregate policy-map configuration command. You specify the bandwidth limits of the policer by using the mls qos aggregate-policer global configuration command. In this way, the aggregate policer is shared by multiple classes of traffic within a policy map.

Policing uses a token bucket algorithm. As each frame is received by the switch, a token is added to the bucket. The bucket has a hole in it and leaks at a rate that you specify as the average traffic rate in bits per second. Each time a token is added to the bucket, the switch performs a check to determine if there is enough room in the bucket. If there is not enough room, the packet is marked as nonconforming, and the specified policer action is taken (dropped or marked down).

How quickly the bucket fills is a function of the bucket depth (burst-byte), the rate at which the tokens are removed (rate-bps), and the duration of the burst above the average rate. The size of the bucket imposes an upper limit on the burst length and determines the number of frames that can be sent back-to-back. If the burst is short, the bucket does not overflow, and no action is taken against the traffic flow. However, if a burst is long and at a higher rate, the bucket overflows, and the policing actions are taken against the frames in that burst.

You configure the bucket depth (the maximum burst that is tolerated before the bucket overflows) by using the burst-byte option of the police policy-map class configuration command or the mls qos aggregate-policer global configuration command. You configure how fast (the average rate) that the tokens are removed from the bucket by using the rate-bps option of the police policy-map class configuration command or the mls qos aggregate-policer global configuration command.
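The bucket behavior described above can be modeled with a few lines of Python. This is a minimal sketch under stated assumptions, not the switch implementation: the class name `Policer` is hypothetical, tokens are counted in bytes in proportion to frame size, and a nonconforming frame is assumed not to add tokens to the bucket. The parameters mirror the rate-bps and burst-byte options.

```python
class Policer:
    """Leaky-bucket style policer: frames add tokens, the bucket drains at rate_bps."""

    def __init__(self, rate_bps: int, burst_bytes: int) -> None:
        self.drain_bytes_per_s = rate_bps / 8   # average rate, converted to bytes/s
        self.burst_bytes = burst_bytes          # bucket depth
        self.level = 0.0                        # tokens currently in the bucket
        self.last_time = 0.0

    def conforms(self, now: float, frame_bytes: int) -> bool:
        """Return True if the frame is in profile, False if it overflows the bucket."""
        # Drain the bucket for the time elapsed since the previous frame.
        elapsed = now - self.last_time
        self.level = max(0.0, self.level - elapsed * self.drain_bytes_per_s)
        self.last_time = now
        # Check whether there is room for this frame's tokens.
        if self.level + frame_bytes > self.burst_bytes:
            return False    # nonconforming: caller drops or marks down
        self.level += frame_bytes
        return True

# 8000 bps average rate (1000 bytes/s) with a 3000-byte bucket.
policer = Policer(rate_bps=8000, burst_bytes=3000)
burst = [policer.conforms(0.0, 1000) for _ in range(4)]
later = policer.conforms(1.0, 1000)
print(burst, later)
```

In this run the first three back-to-back 1000-byte frames fit in the bucket, the fourth overflows it, and one second later enough tokens have drained for another frame to conform.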

When configuring policing and policers, keep these items in mind:

• By default, no policers are configured.

• Policers can be configured only on a physical port or on a per-port per-VLAN basis (specifies the bandwidth limits for the traffic on a per-VLAN basis, for a given port). Per-port per-VLAN policing is not supported on routed ports or on virtual (logical) interfaces. It is supported only on an ingress port configured as a trunk or as a static-access port.

• Only one policer can be applied to a packet per direction.

• Only the average rate and committed burst parameters are configurable.

• Policing can occur on ingress and egress interfaces:

Note Per-port per-VLAN policing is supported only on ingress interfaces.

– 128 policers are supported on ingress Gigabit-capable Ethernet ports.

– 8 policers are supported on ingress 10/100 Ethernet ports.

– 8 policers are supported on all egress ports.

– Ingress policers can be individual or aggregate.

• On an interface configured for QoS, all traffic received through the interface is classified, policed, and marked according to the policy map attached to the interface. On a trunk interface configured for QoS, traffic in all VLANs received through the interface is classified, policed, and marked according to the policy map attached to the interface.

After you configure the policy map and policing actions, attach the policy to an ingress or egress interface by using the service-policy interface configuration command.

Note The 10-Gigabit Ethernet interfaces do not support policing.

The next figure shows the policing and marking process when these types of policy maps are configured:

• A nonhierarchical policy map on a physical port.

• The interface level of a hierarchical policy map attached to an SVI. The physical ports are specified in this secondary policy map.

Policing on Physical Ports

In policy maps on physical ports, you can create these types of policers:

• Individual

QoS applies the bandwidth limits specified in the policer separately to each matched traffic class. You configure this type of policer within a policy map by using the police policy-map class configuration command.

• Aggregate

QoS applies the bandwidth limits specified in an aggregate policer cumulatively to all matched traffic flows. You configure this type of policer by specifying the aggregate policer name within a policy map by using the police aggregate policy-map class configuration command. You specify the bandwidth limits of the policer by using the mls qos aggregate-policer global configuration command. In this way, the aggregate policer is shared by multiple classes of traffic within a policy map.

Note In Cisco IOS Release 12.2(25)SE or later, you can configure only individual policers on an SVI.

Policing uses a token-bucket algorithm. As each frame is received by the switch, a token is added to the bucket. The bucket has a hole in it and leaks at a rate that you specify as the average traffic rate in bits per second. Each time a token is added to the bucket, the switch verifies that there is enough room in the bucket. If there is not enough room, the packet is marked as nonconforming, and the specified policer action is taken (dropped or marked down).

How quickly the bucket fills is a function of the bucket depth (burst-byte), the rate at which the tokens are removed (rate-bps), and the duration of the burst above the average rate. The size of the bucket imposes an upper limit on the burst length and limits the number of frames that can be transmitted back-to-back. If the burst is short, the bucket does not overflow, and no action is taken against the traffic flow. However, if a burst is long and at a higher rate, the bucket overflows, and the policing actions are taken against the frames in that burst.

You configure the bucket depth (the maximum burst that is tolerated before the bucket overflows) by using the burst-byte option of the police policy-map class configuration command or the mls qos aggregate-policer global configuration command. You configure how fast (the average rate) that the tokens are removed from the bucket by using the rate-bps option of the police policy-map class configuration command or the mls qos aggregate-policer global configuration command.

Policing on SVIs

Note Before configuring a hierarchical policy map with individual policers on an SVI, you must enable VLAN-based QoS on the physical ports that belong to the SVI. Though a policy map is attached to the SVI, the individual policers only affect traffic on the physical ports specified in the secondary interface level of the hierarchical policy map.

A hierarchical policy map has two levels. The first level, the VLAN level, specifies the actions to be taken against a traffic flow on an SVI. The second level, the interface level, specifies the actions to be taken against the traffic on the physical ports that belong to the SVI and are specified in the interface-level policy map.

When configuring policing on an SVI, you can create and configure a hierarchical policy map with these two levels:

• VLAN level: Create this primary level by configuring class maps and classes that specify the port trust state or set a new DSCP or IP precedence value in the packet. The VLAN-level policy map applies only to the VLAN in an SVI and does not support policers.

• Interface level: Create this secondary level by configuring class maps and classes that specify the individual policers on physical ports that belong to the SVI. The interface-level policy map only supports individual policers and does not support aggregate policers.

Mapping Tables on Catalyst 3550

During QoS processing, the switch represents the priority of all traffic (including non-IP traffic) with an internal DSCP value:

• During classification, QoS uses configurable mapping tables to derive the internal DSCP (a 6-bit value) from received CoS or IP precedence (3-bit) values. These maps include the CoS-to-DSCP map and the IP-precedence-to-DSCP map.

On an ingress interface configured in the DSCP-trusted state, if the DSCP values are different between the QoS domains, you can apply the configurable DSCP-to-DSCP-mutation map to the interface that is on the boundary between the two QoS domains.

• During policing, QoS can assign another DSCP value to an IP or non-IP packet (if the packet is out of profile and the policer specifies a marked-down DSCP value). This configurable map is called the policed-DSCP map.

• Before the traffic reaches the scheduling stage, QoS uses the configurable DSCP-to-CoS map to derive a CoS value from the internal DSCP value. Through the CoS-to-egress-queue map, the CoS values select one of the four egress queues for output processing.

The CoS-to-DSCP, DSCP-to-CoS, and the IP-precedence-to-DSCP map have default values that might or might not be appropriate for your network.

The default DSCP-to-DSCP-mutation map and the default policed-DSCP map are null maps; they map an incoming DSCP value to the same DSCP value. The DSCP-to-DSCP-mutation map is the only map you apply to a specific Gigabit-capable Ethernet port or to a group of 10/100 Ethernet ports. All other maps apply to the entire switch.
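The chain of maps described above can be sketched as simple lookup tables in Python. The table contents are assumptions for illustration: the text states only that default values exist, so the values below (CoS multiplied by 8 for the CoS-to-DSCP and IP-precedence-to-DSCP maps, DSCP divided by 8 for the DSCP-to-CoS map, and pairs of CoS values per egress queue) should be checked against the defaults of your platform. The null policed-DSCP map follows the description in the text.

```python
# Assumed example values; verify against the actual defaults on your switch.
COS_TO_DSCP = [0, 8, 16, 24, 32, 40, 48, 56]          # index = CoS 0..7
IP_PREC_TO_DSCP = [0, 8, 16, 24, 32, 40, 48, 56]      # index = IP precedence 0..7
DSCP_TO_COS = [dscp // 8 for dscp in range(64)]       # index = DSCP 0..63
COS_TO_EGRESS_QUEUE = {0: 1, 1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, 7: 4}

# Null map, as described in the text: every DSCP maps to itself.
POLICED_DSCP = list(range(64))

def egress_queue_for_trusted_cos(cos: int) -> int:
    """Follow a trusted CoS value through the maps to an egress queue."""
    internal_dscp = COS_TO_DSCP[cos]            # classification stage
    egress_cos = DSCP_TO_COS[internal_dscp]     # before scheduling
    return COS_TO_EGRESS_QUEUE[egress_cos]      # queue selection

def egress_queue_for_trusted_dscp(dscp: int, out_of_profile: bool = False) -> int:
    """Follow a trusted DSCP value, optionally marked down by the policer."""
    internal_dscp = POLICED_DSCP[dscp] if out_of_profile else dscp
    return COS_TO_EGRESS_QUEUE[DSCP_TO_COS[internal_dscp]]

print(egress_queue_for_trusted_cos(5))      # CoS 5 -> DSCP 40 -> CoS 5 -> queue 3
print(egress_queue_for_trusted_dscp(46))    # DSCP 46 -> CoS 5 -> queue 3
```

Changing an entry in `POLICED_DSCP` (for example, mapping 46 to 0) models a policer that marks down out-of-profile traffic so that it lands in a lower-priority queue.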

Mapping Tables on Catalyst 3750

During QoS processing, the switch represents the priority of all traffic (including non-IP traffic) with a QoS label based on the DSCP or CoS value from the classification stage:

• During classification, QoS uses configurable mapping tables to derive a corresponding DSCP or CoS value from a received CoS, DSCP, or IP precedence value. These maps include the CoS-to-DSCP map and the IP-precedence-to-DSCP map. You configure these maps by using the mls qos map cos-dscp and the mls qos map ip-prec-dscp global configuration commands.

On an ingress port configured in the DSCP-trusted state, if the DSCP values are different between the QoS domains, you can apply the configurable DSCP-to-DSCP-mutation map to the port that is on the boundary between the two QoS domains. You configure this map by using the mls qos map dscp-mutation global configuration command.

• During policing, QoS can assign another DSCP value to an IP or a non-IP packet (if the packet is out of profile and the policer specifies a marked-down value). This configurable map is called the policed-DSCP map. You configure this map by using the mls qos map policed-dscp global configuration command.

• Before the traffic reaches the scheduling stage, QoS stores the packet in an ingress and an egress queue according to the QoS label. The QoS label is based on the DSCP or the CoS value in the packet and selects the queue through the DSCP input and output queue threshold maps or through the CoS input and output queue threshold maps. You configure these maps by using the mls qos srr-queue {input | output} dscp-map and the mls qos srr-queue {input | output} cos-map global configuration commands.

The CoS-to-DSCP, DSCP-to-CoS, and the IP-precedence-to-DSCP maps have default values that might or might not be appropriate for your network.

The default DSCP-to-DSCP-mutation map and the default policed-DSCP map are null maps; they map an incoming DSCP value to the same DSCP value. The DSCP-to-DSCP-mutation map is the only map you apply to a specific port. All other maps apply to the entire switch.

Queueing and Scheduling

After a packet is policed and marked, the queueing and scheduling process begins as described in the next sections.

Queueing and Scheduling on Gigabit-Capable Ports on Catalyst 3550

Note If the expedite queue is enabled, WRR services it until it is empty before servicing the other three queues.

During the queueing and scheduling process, the switch uses egress queues and WRR for congestion management, and tail drop or WRED algorithms for congestion avoidance on Gigabit-capable Ethernet ports.

Each Gigabit-capable Ethernet port has four egress queues, one of which can be the egress expedite queue. You can configure the buffer space allocated to each queue as a ratio of weights by using the wrr-queue queue-limit interface configuration command, where the relative size differences in the numbers show the relative differences in the queue sizes. To display the absolute value of the queue size, use the show mls qos interface interface-id statistics privileged EXEC command, and examine the FreeQ information.

You assign two drop thresholds to each queue, map DSCPs to the thresholds through the DSCP-to-threshold map, and enable either tail drop or WRED on the interface. The queue size, drop thresholds, tail-drop or WRED algorithm, and the DSCP-to-threshold map work together to determine when and which packets are dropped when the thresholds are exceeded. You configure the drop percentage thresholds by using either the wrr-queue threshold interface configuration command for tail drop or the wrr-queue random-detect max-threshold interface configuration command for WRED; in either case, you map DSCP values to the thresholds (DSCP-to-threshold map) by using the wrr-queue dscp-map interface configuration command.

The available bandwidth of the egress link is divided among the queues. You configure the queues to be serviced according to the ratio of WRR weights by using the wrr-queue bandwidth interface configuration command. The ratio represents the importance (weight) of a queue relative to the other queues. WRR scheduling prevents low-priority queues from being completely neglected during periods of high-priority traffic by sending some packets from each queue in turn. The number of packets sent corresponds to the relative importance of the queue. For example, if one queue has a weight of 3 and another has a weight of 4, three packets are sent from the first queue for every four that are sent from the second queue. By using this scheduling, low-priority queues can send packets even though the high-priority queues are not empty. Queues are selected by the CoS value that is mapped to an egress queue (CoS-to-egress-queue map) through the wrr-queue cos-map interface configuration command. All four queues participate in the WRR unless the expedite queue is enabled, in which case, the fourth bandwidth weight is ignored and not used in the ratio calculation. The expedite queue is a priority queue, and it is serviced until empty before the other queues are serviced. You enable the expedite queue by using the priority-queue out interface configuration command.
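The WRR behavior described above, including the expedite queue, can be sketched in Python. This is a simplified packet-count model, assuming each weight is the number of packets a queue may send per round and that queue 4 (index 3) is the expedite queue; the function name `wrr_schedule` is hypothetical, and real hardware scheduling is more involved.

```python
from collections import deque
from typing import Deque, List

def wrr_schedule(queues: List[Deque[str]], weights: List[int],
                 expedite: bool = False, rounds: int = 1) -> List[str]:
    """Return packets in transmit order for a four-queue WRR scheduler.

    queues[0] is queue 1 and queues[3] is queue 4. When expedite is True,
    queue 4 is drained before the other queues in every round, and its
    weight is ignored.
    """
    sent: List[str] = []
    for _round in range(rounds):
        if expedite:
            while queues[3]:
                sent.append(queues[3].popleft())
        wrr_indexes = range(3) if expedite else range(4)
        for i in wrr_indexes:
            for _slot in range(weights[i]):
                if not queues[i]:
                    break               # an empty queue gives up its turn
                sent.append(queues[i].popleft())
    return sent

queues = [deque(["a1", "a2", "a3", "a4"]),
          deque(["b1", "b2", "b3", "b4", "b5"]),
          deque(),
          deque(["v1", "v2"])]
order = wrr_schedule(queues, weights=[3, 4, 1, 1], expedite=True, rounds=1)
print(order)
```

With weights of 3 and 4 on the first two queues, one round sends three packets from queue 1 for every four from queue 2, after the expedite queue has been emptied.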

You can combine the commands described in this section to prioritize traffic by placing packets with particular DSCPs into certain queues, allocate a larger queue size or service the particular queue more frequently, and adjust queue thresholds so that packets with lower priorities are dropped.

Tail Drop

Tail drop is the default congestion-avoidance technique on Gigabit-capable Ethernet ports in Catalyst 3550. With tail drop, packets are queued until the thresholds are exceeded. Specifically, all packets with DSCPs assigned to the first threshold are dropped until the threshold is no longer exceeded. However, packets assigned to the second threshold continue to be queued and sent as long as the second threshold is not exceeded. You can modify the two tail-drop threshold percentages assigned to the four egress queues by using the wrr-queue threshold interface configuration command. Each threshold value is a percentage of the total number of allocated queue descriptors for the queue. The default threshold is 100 percent for thresholds 1 and 2.

You modify the DSCP-to-threshold map to determine which DSCPs are mapped to which threshold ID by using the wrr-queue dscp-map interface configuration command. By default, all DSCPs are mapped to threshold 1, and when this threshold is exceeded, all the packets are dropped.

If you use tail-drop thresholds, you cannot use WRED, and vice versa. If tail drop is disabled, WRED is automatically enabled with the previous configuration (or the default if it was not previously configured).

Weighted Tail Drop

Both the ingress and egress queues use an enhanced version of the tail-drop congestion-avoidance mechanism called weighted tail drop (WTD) in Catalyst 3750. WTD is implemented on queues to manage the queue lengths and to provide drop precedences for different traffic classifications.

As a frame is enqueued to a particular queue, WTD uses the frame’s assigned QoS label to subject it to different thresholds. If the threshold is exceeded for that QoS label (the space available in the destination queue is less than the size of the frame), the switch drops the frame.

Here is an example of WTD operating on a queue whose size is 1000 frames. Three drop percentages are configured: 40 percent (400 frames), 60 percent (600 frames), and 100 percent (1000 frames). These percentages mean that up to 400 frames can be queued at the 40-percent threshold, up to 600 frames at the 60-percent threshold, and up to 1000 frames at the 100-percent threshold.

In this example, CoS values 6 and 7 have a greater importance than the other CoS values, and they are assigned to the 100-percent drop threshold (queue-full state). CoS values 4 and 5 are assigned to the 60-percent threshold, and CoS values 0 to 3 are assigned to the 40-percent threshold.

Suppose the queue is already filled with 600 frames, and a new frame arrives. It carries a CoS value of 4 or 5 and is subjected to the 60-percent threshold. If this frame is added to the queue, the threshold will be exceeded, so the switch drops it.
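The WTD example above can be expressed as a short Python check. The constants come directly from the example (a 1000-frame queue with 40, 60, and 100 percent thresholds and the stated CoS assignments); the function name `wtd_accepts` is hypothetical, and the model counts frames rather than buffer bytes for simplicity.

```python
QUEUE_SIZE = 1000                                   # frames, from the example
THRESHOLD_PERCENT = {1: 40, 2: 60, 3: 100}          # threshold ID -> percent
COS_TO_THRESHOLD = {0: 1, 1: 1, 2: 1, 3: 1,         # CoS 0-3 -> 40 percent
                    4: 2, 5: 2,                     # CoS 4-5 -> 60 percent
                    6: 3, 7: 3}                     # CoS 6-7 -> 100 percent

def wtd_accepts(queue_depth: int, cos: int) -> bool:
    """Return True if a frame with this CoS may be enqueued at this depth."""
    limit = QUEUE_SIZE * THRESHOLD_PERCENT[COS_TO_THRESHOLD[cos]] // 100
    # Enqueuing the frame must not push the queue past the frame's threshold.
    return queue_depth + 1 <= limit

print(wtd_accepts(600, 4))   # False: the 60-percent threshold would be exceeded
print(wtd_accepts(600, 6))   # True: CoS 6 may use the queue up to 1000 frames
```

At a depth of 600 frames the queue still accepts CoS 6 and 7 traffic while dropping everything else, which is how WTD provides different drop precedences within a single queue.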

WRED

Cisco’s implementation of Random Early Detection (RED), called Weighted Random Early Detection (WRED), differs from other congestion-avoidance techniques because it attempts to anticipate and avoid congestion, rather than controlling congestion when it occurs.

WRED takes advantage of the Transmission Control Protocol (TCP) congestion control to try to control the average queue size by indicating to end hosts when they should temporarily stop sending packets. By randomly dropping packets before periods of high congestion, it tells the packet source to decrease its transmission rate. Assuming the packet source is using TCP, WRED tells it to decrease its transmission rate until all the packets reach their destination, meaning that the congestion is cleared.

WRED reduces the chances of tail drop by selectively dropping packets when the output interface begins to show signs of congestion. By dropping some packets early rather than waiting until the queue is full, WRED avoids dropping large numbers of packets at once. Thus, WRED allows the transmission line to be fully used at all times. WRED also drops more packets from large users than small. Therefore, sources that generate the most traffic are more likely to be slowed down versus sources that generate little traffic.

You can enable WRED and configure the two threshold percentages assigned to the four egress queues on a Gigabit-capable Ethernet port by using the wrr-queue random-detect max-threshold interface configuration command. Each threshold percentage represents where WRED starts to randomly drop packets. After a threshold is exceeded, WRED randomly begins to drop packets assigned to this threshold. As the queue limit is approached, WRED continues to drop more and more packets. When the queue limit is reached, WRED drops all packets assigned to the threshold. By default, WRED is disabled.

You modify the DSCP-to-threshold map to determine which DSCPs are mapped to which threshold ID by using the wrr-queue dscp-map interface configuration command. By default, all DSCPs are mapped to threshold 1, and when this threshold is exceeded, all the packets are randomly dropped. If you use WRED thresholds, you cannot use tail drop, and vice versa. If WRED is disabled, tail drop is automatically enabled with the previous configuration (or the default if it was not previously configured).
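The threshold behavior described above can be sketched as a drop-probability function in Python. This is an illustrative model only: the text says that drops begin at the threshold, increase as the queue limit is approached, and become total at the queue limit, but it does not give the curve, so the linear ramp below is an assumption. The model also uses the instantaneous queue fill level rather than an averaged one, and the function names are hypothetical.

```python
import random
from typing import Callable

def wred_drop_probability(fill_percent: float, threshold_percent: float) -> float:
    """Drop probability for a packet mapped to the given threshold.

    Assumes a linear ramp from 0 at the threshold to 1 at a full queue.
    """
    if fill_percent < threshold_percent:
        return 0.0                      # below the threshold: never drop
    if fill_percent >= 100:
        return 1.0                      # queue limit reached: drop everything
    return (fill_percent - threshold_percent) / (100 - threshold_percent)

def wred_should_drop(fill_percent: float, threshold_percent: float,
                     rng: Callable[[], float] = random.random) -> bool:
    """Randomly decide whether to drop, using rng() in the range [0, 1)."""
    return rng() < wred_drop_probability(fill_percent, threshold_percent)

for fill in (30, 70, 100):
    print(fill, wred_drop_probability(fill, threshold_percent=40))
```

Mapping high-priority DSCPs to a higher threshold than low-priority DSCPs means that, as the queue fills, low-priority packets start to be dropped earlier and more often.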

SRR (Shaped Round Robin) Shaping and Sharing

Both the ingress and egress queues in Catalyst 3750 are serviced by SRR, which controls the rate at which packets are sent. On the ingress queues, SRR sends packets to the stack ring. On the egress queues, SRR sends packets to the egress port.

You can configure SRR on egress queues for sharing or for shaping. However, for ingress queues, sharing is the default mode, and it is the only mode supported.

In shaped mode, the egress queues are guaranteed a percentage of the bandwidth, and they are rate-limited to that amount. Shaped traffic does not use more than the allocated bandwidth even if the link is idle. Shaping provides a more even flow of traffic over time and reduces the peaks and valleys of bursty traffic. With shaping, the absolute value of each weight is used to compute the bandwidth available for the queues.

In shared mode, the queues share the bandwidth among them according to the configured weights. The bandwidth is guaranteed at this level but not limited to it. For example, if a queue is empty and no longer requires a share of the link, the remaining queues can expand into the unused bandwidth and share it among them. With sharing, the ratio of the weights controls the frequency of dequeuing; the absolute values are meaningless.
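The difference between the two modes can be illustrated with a small Python calculation. The function names are hypothetical. The shared-mode function follows the text directly (bandwidth in proportion to the weights, with idle queues releasing their share). The shaped-mode function assumes that the bandwidth of a queue is the link rate divided by its weight; the text says only that the absolute value of the weight is used, so treat that formula as an assumption to verify against your platform documentation.

```python
from typing import List, Optional

def shared_bandwidth(link_bps: float, weights: List[int],
                     active: Optional[List[bool]] = None) -> List[float]:
    """Split the link among active queues in proportion to their weights."""
    if active is None:
        active = [True] * len(weights)
    total = sum(w for w, is_active in zip(weights, active) if is_active)
    if total == 0:
        return [0.0] * len(weights)
    return [link_bps * w / total if is_active else 0.0
            for w, is_active in zip(weights, active)]

def shaped_bandwidth(link_bps: float, weight: int) -> float:
    """Rate limit of a shaped queue, assuming bandwidth = link rate / weight."""
    return link_bps / weight

link = 100.0  # Mbps, example value
print(shared_bandwidth(link, [1, 1, 1, 1]))
print(shared_bandwidth(link, [25, 25, 25, 25]))
print(shared_bandwidth(link, [1, 1, 1, 1], active=[True, True, False, False]))
print(shaped_bandwidth(link, 8))
```

The first two calls give the same result because only the ratio of the weights matters in shared mode, and the third shows two busy queues expanding into the bandwidth left by two idle ones. A shaped queue, by contrast, stays at its computed limit regardless of what the other queues are doing.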

Queueing and Scheduling on 10/100 Ethernet Ports on Catalyst 3550

Note If the expedite queue is enabled, WRR services it until it is empty before servicing the other three queues.

During the queueing and scheduling process, the switch uses egress queues (to select the level and buffer size) and WRR for congestion management. Each 10/100 Ethernet port has four egress queues, one of which can be the egress expedite queue. Each queue can access one of eight minimum-reserve levels; each level has 100 packets of buffer space by default for queueing packets. When the buffer specified for the minimum-reserve level is full, packets are dropped until space is available.

The next figure is an example of the 10/100 Ethernet port queue assignments, minimum-reserve levels, and buffer sizes. The figure shows four egress queues per port, with each queue assigned to a minimum-reserve level. For example, for Fast Ethernet port 0/1, queue 1 is assigned to minimum-reserve level 1, queue 2 is assigned to minimum-reserve level 3, queue 3 is assigned to minimum-reserve level 5, and queue 4 is assigned to minimum-reserve level 7. You assign the minimum-reserve level to a queue by using the wrr-queue min-reserve interface configuration command.

Each minimum-reserve level is configured with a buffer size. As shown in the figure, queue 4 of Fast Ethernet port 1 has a buffer size of 70 packets, queue 4 of Fast Ethernet port 2 has a buffer size of 80 packets, queue 4 of Fast Ethernet port 3 has a buffer size of 40 packets, and queue 4 of Fast Ethernet port 4 has a buffer size of 80 packets. You configure the buffer size by using the mls qos min-reserve global configuration command.

The available bandwidth of the egress link is divided among the queues. You configure the queues to be serviced according to the ratio of WRR weights by using the wrr-queue bandwidth interface configuration command. The ratio represents the importance (weight) of a queue relative to the other queues. WRR scheduling prevents low-priority queues from being completely neglected during periods of high-priority traffic by sending some packets from each queue in turn. The number of packets sent corresponds to the relative importance of the queue. For example, if one queue has a weight of 3 and another has a weight of 4, three packets are sent from the first queue for every four that are sent from the second queue. By using this scheduling, low-priority queues can send packets even though the high-priority queues are not empty. Queues are selected by the CoS value that is mapped to an egress queue (CoS-to-egress-queue map) through the wrr-queue cos-map interface configuration command.
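The 3-to-4 example above can be reproduced with a small simulation. This Python sketch models WRR as a packet-count round robin, which is a simplification of what the hardware does; the function name wrr_schedule and the packet labels are hypothetical.

```python
from collections import deque


def wrr_schedule(queues, weights):
    """Return the order in which packets leave under a simple WRR model.

    queues  -- list of deques holding packets (any objects)
    weights -- number of packets each queue may send per round
    """
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive in this sketch")
    sent = []
    while any(queues):                      # loop until every queue is empty
        for q, weight in zip(queues, weights):
            for _ in range(weight):
                if not q:
                    break                   # queue drained early this round
                sent.append(q.popleft())
    return sent


q1 = deque(f"a{i}" for i in range(6))       # lower-weight queue
q2 = deque(f"b{i}" for i in range(8))       # higher-weight queue
order = wrr_schedule([q1, q2], weights=[3, 4])
print(order[:7])
```

In each round the first queue sends three packets and the second sends four, so neither queue is starved while the other still has traffic.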

All four queues participate in the WRR unless the egress expedite queue is enabled, in which case the fourth bandwidth weight is ignored and not used in the ratio calculation. The expedite queue is a priority queue, and it is serviced until empty before the other queues are serviced. You enable the expedite queue by using the priority-queue out interface configuration command.

You can combine the commands described in this section to prioritize traffic by placing packets with particular DSCPs into certain queues, allocating a larger minimum-reserve buffer size, and servicing a particular queue more frequently.

Ingress and egress queue allocation in Catalyst 3750

Because the total ingress bandwidth of all ports can exceed the bandwidth of the stack ring, ingress queues are located after the packet is classified, policed, and marked and before packets are forwarded into the switch fabric. Because multiple ingress ports can simultaneously send packets to an egress port and cause congestion, egress queues are located after the stack ring.

Queueing and Scheduling on Ingress Queues in Catalyst 3750

Note SRR services the priority queue for its configured share before servicing the other queue.

You assign each packet that flows through the switch to a queue and to a threshold. Specifically, you map DSCP or CoS values to an ingress queue and map DSCP or CoS values to a threshold ID. You use the mls qos srr-queue input dscp-map queue queue-id {dscp1...dscp8 | threshold threshold-id dscp1...dscp8} or the mls qos srr-queue input cos-map queue queue-id {cos1...cos8 | threshold threshold-id cos1...cos8} global configuration command. You can display the DSCP input queue threshold map and the CoS input queue threshold map by using the show mls qos maps privileged EXEC command.

WTD Thresholds

The queues use WTD to support distinct drop percentages for different traffic classes. Each queue has three drop thresholds: two configurable (explicit) WTD thresholds and one nonconfigurable (implicit) threshold preset to the queue-full state. You assign the two explicit WTD threshold percentages for threshold ID 1 and ID 2 to the ingress queues by using the mls qos srr-queue input threshold queue-id threshold-percentage1 threshold-percentage2 global configuration command. Each threshold value is a percentage of the total number of allocated buffers for the queue. The drop threshold for threshold ID 3 is preset to the queue-full state, and you cannot modify it.
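The three thresholds can be read as three different fill levels at which frames of a given class stop being accepted. The Python sketch below is a simplified model under stated assumptions: one frame occupies one buffer, and the name wtd_accepts, the queue size of 1000 buffers, and the 40 and 60 percent thresholds are illustrative values, not defaults taken from the switch.

```python
def wtd_accepts(queue_depth, queue_size, threshold_id, thresholds_pct):
    """Decide whether a frame is queued under a simple WTD model.

    queue_depth    -- buffers already in use in the queue
    queue_size     -- buffers allocated to the queue
    threshold_id   -- 1, 2, or 3, as assigned by the DSCP or CoS threshold map
    thresholds_pct -- (pct1, pct2): the two configurable thresholds
    Threshold ID 3 is fixed at the queue-full state (100 percent).
    Assumes one frame uses one buffer.
    """
    limits_pct = (thresholds_pct[0], thresholds_pct[1], 100)
    limit = queue_size * limits_pct[threshold_id - 1] / 100
    # Accept the frame only if it still fits under its own threshold.
    return queue_depth + 1 <= limit


# Queue half full: threshold-1 traffic is dropped, the rest is still queued.
for tid in (1, 2, 3):
    print(tid, wtd_accepts(500, 1000, tid, (40, 60)))
```

Mapping lower-priority DSCP or CoS values to threshold ID 1 therefore causes them to be dropped first as the queue fills.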

Buffer and Bandwidth Allocation

You define the ratio (allocate the amount of space) with which to divide the ingress buffers between the two queues by using the mls qos srr-queue input buffers percentage1 percentage2 global configuration command. The buffer allocation together with the bandwidth allocation controls how much data can be buffered and sent before packets are dropped. You allocate bandwidth as a percentage by using the mls qos srr-queue input bandwidth weight1 weight2 global configuration command. The ratio of the weights is the ratio of the frequency with which the SRR scheduler sends packets from each queue.

Priority Queueing

You can configure one ingress queue as the priority queue by using the mls qos srr-queue input priority-queue queue-id bandwidth weight global configuration command. The priority queue should be used for traffic (such as voice) that requires guaranteed delivery because this queue is guaranteed part of the bandwidth regardless of the load on the stack ring.

SRR services the priority queue for its configured weight as specified by the bandwidth keyword in the mls qos srr-queue input priority-queue queue-id bandwidth weight global configuration command. Then, SRR shares the remaining bandwidth with both ingress queues and services them as specified by the weights configured with the mls qos srr-queue input bandwidth weight1 weight2 global configuration command.
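The two-step servicing described above amounts to simple arithmetic. The following Python sketch works through it, assuming that the priority bandwidth weight is read as a percentage of the stack-ring bandwidth; the function name ingress_bandwidth_split and the sample values (priority queue 1 at 10 percent, weights 4 and 4) are chosen for illustration.

```python
def ingress_bandwidth_split(priority_queue, priority_pct, weights):
    """Percentage of bandwidth each of the two ingress queues receives.

    priority_queue -- 1 or 2, the queue configured as the priority queue
    priority_pct   -- percentage serviced first (the bandwidth keyword)
    weights        -- (weight1, weight2) from the input bandwidth command
    """
    remaining = 100.0 - priority_pct
    total = sum(weights)
    # Step 2: the remaining bandwidth is shared by BOTH queues by weight.
    shares = [remaining * w / total for w in weights]
    # Step 1: the priority queue also keeps the part serviced first.
    shares[priority_queue - 1] += priority_pct
    return shares


print(ingress_bandwidth_split(priority_queue=1, priority_pct=10, weights=(4, 4)))
```

In this case queue 1 ends up with 55 percent and queue 2 with 45 percent, because the priority queue takes part in the sharing of the remainder as well.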

You can combine the commands described in this section to prioritize traffic by placing packets with particular DSCPs or CoSs into certain queues, by allocating a large queue size or by servicing the queue more frequently, and by adjusting queue thresholds so that packets with lower priorities are dropped.

Queueing and Scheduling on Egress Queues in Catalyst 3750

Each port supports four egress queues, one of which (queue 1) can be the egress expedite queue. These queues are assigned to a queue-set. All traffic exiting the switch flows through one of these four queues and is subjected to a threshold based on the QoS label assigned to the packet.

The next figure shows the egress queue buffer. The buffer space is divided between the common pool and the reserved pool. The switch uses a buffer allocation scheme to reserve a minimum amount of buffers for each egress queue, to prevent any queue or port from consuming all the buffers and depriving other queues, and to control whether to grant buffer space to a requesting queue. The switch detects whether the target queue has not consumed more buffers than its reserved amount (under-limit), whether it has consumed all of its maximum buffers (over limit), and whether the common pool is empty (no free buffers) or not empty (free buffers). If the queue is not over-limit, the switch can allocate buffer space from the reserved pool or from the common pool (if it is not empty). If there are no free buffers in the common pool or if the queue is over-limit, the switch drops the frame.
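The allocation decision in the preceding paragraph can be written as a short decision function. This Python sketch is one possible reading of that text: it assumes the reserved buffers are tried before the common pool, and the name admit_frame and its parameters are hypothetical.

```python
def admit_frame(used, reserved, maximum, common_free):
    """Return the pool a new frame's buffer comes from, or None if dropped.

    used        -- buffers the target queue currently holds
    reserved    -- buffers reserved for the queue
    maximum     -- most buffers the queue is ever allowed to hold
    common_free -- free buffers left in the common pool
    """
    if used >= maximum:
        return None            # over-limit: the frame is dropped
    if used < reserved:
        return "reserved"      # under-limit: reserved buffers remain
    if common_free > 0:
        return "common"        # borrow a buffer from the common pool
    return None                # common pool empty: the frame is dropped


print(admit_frame(used=10, reserved=50, maximum=200, common_free=0))
print(admit_frame(used=50, reserved=50, maximum=200, common_free=5))
print(admit_frame(used=50, reserved=50, maximum=200, common_free=0))
print(admit_frame(used=200, reserved=50, maximum=200, common_free=5))
```

The reserved amount protects a queue from being starved by others, while the maximum keeps any one queue from draining the common pool.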

Buffer and Memory Allocation

You guarantee the availability of buffers, set drop thresholds, and configure the maximum memory allocation for a queue-set by using the mls qos queue-set output qset-id threshold queue-id drop-threshold1 drop-threshold2 reserved-threshold maximum-threshold global configuration command. Each threshold value is a percentage of the queue’s allocated memory, which you specify by using the mls qos queue-set output qset-id buffers allocation1 ... allocation4 global configuration command. The sum of all the allocated buffers represents the reserved pool, and the remaining buffers are part of the common pool.

Through buffer allocation, you can ensure that high-priority traffic is buffered. For example, if the buffer space is 400, you can allocate 70 percent of it to queue 1 and 10 percent to queues 2 through 4. Queue 1 then has 280 buffers allocated to it, and queues 2 through 4 each have 40 buffers allocated to them.

You can guarantee that the allocated buffers are reserved for a specific queue in a queue-set. For example, if there are 100 buffers for a queue, you can reserve 50 percent (50 buffers). The switch returns the remaining 50 buffers to the common pool. You also can enable a queue in the full condition to obtain more buffers than are reserved for it by setting a maximum threshold. The switch can allocate the needed buffers from the common pool if the common pool is not empty.
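The buffer figures in the two examples above follow directly from the percentages. This Python sketch reproduces the arithmetic; the function names allocate_buffers and split_reserved are hypothetical helpers, and integer division is used on the assumption that buffers are whole units.

```python
def allocate_buffers(total_buffers, allocation_pct):
    """Buffers given to each queue for a percentage allocation per queue."""
    return [total_buffers * pct // 100 for pct in allocation_pct]


def split_reserved(queue_buffers, reserved_pct):
    """Return (reserved, returned_to_common_pool) for one queue."""
    reserved = queue_buffers * reserved_pct // 100
    return reserved, queue_buffers - reserved


# 400 buffers: 70 percent to queue 1, 10 percent each to queues 2 through 4.
print(allocate_buffers(400, [70, 10, 10, 10]))
# 100 buffers for a queue with 50 percent reserved.
print(split_reserved(100, 50))
```

The first call yields 280, 40, 40, and 40 buffers; the second yields 50 reserved buffers and 50 returned to the common pool.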

WTD Thresholds

You can assign each packet that flows through the switch to a queue and to a threshold. Specifically, you map DSCP or CoS values to an egress queue and map DSCP or CoS values to a threshold ID. You use the mls qos srr-queue output dscp-map queue queue-id {dscp1...dscp8 | threshold threshold-id dscp1...dscp8} or the mls qos srr-queue output cos-map queue queue-id {cos1...cos8 | threshold threshold-id cos1...cos8} global configuration command. You can display the DSCP output queue threshold map and the CoS output queue threshold map by using the show mls qos maps privileged EXEC command.

The queues use WTD to support distinct drop percentages for different traffic classes. Each queue has three drop thresholds: two configurable (explicit) WTD thresholds and one nonconfigurable (implicit) threshold preset to the queue-full state. You assign the two WTD threshold percentages for threshold ID 1 and ID 2. The drop threshold for threshold ID 3 is preset to the queue-full state, and you cannot modify it.

Shaped or Shared Mode

SRR services each queue-set in shared or shaped mode. You map a port to a queue-set by using the queue-set qset-id interface configuration command. You assign shared or shaped weights to the port by using the srr-queue bandwidth share weight1 weight2 weight3 weight4 or the srr-queue bandwidth shape weight1 weight2 weight3 weight4 interface configuration command.

Note You cannot assign shaped weights on 10-Gigabit interfaces.

The buffer allocation together with the SRR weight ratios controls how much data can be buffered and sent before packets are dropped. The weight ratio is the ratio of the frequency with which the SRR scheduler sends packets from each queue.

All four queues participate in the SRR unless the expedite queue is enabled, in which case the first bandwidth weight is ignored and is not used in the ratio calculation. The expedite queue is a priority queue, and it is serviced until empty before the other queues are serviced. You enable the expedite queue by using the priority-queue out interface configuration command.

You can combine the commands described in this section to prioritize traffic by placing packets with particular DSCPs or CoSs into certain queues, by allocating a large queue size or by servicing the queue more frequently, and by adjusting queue thresholds so that packets with lower priorities are dropped.

Note The egress queue default settings are suitable for most situations. You should change them only when you have a thorough understanding of the egress queues and if these settings do not meet the needs of your QoS solution.

Threshold maps for Catalyst 3750

Packet Modification in Catalyst 3550

A packet is classified, policed, and queued for QoS. Packet modifications can occur during this process:

• For IP packets, classification involves assigning a DSCP to the packet. However, the packet is not modified at this stage; only an indication of the assigned DSCP is carried along. The reason for this is that QoS classification and ACL lookup occur in parallel, and it is possible that the ACL specifies that the packet should be denied and logged. In this situation, the packet is forwarded with its original DSCP to the CPU, where it is again processed through ACL software. However, route lookup is performed based on classified DSCPs.

• For non-IP packets, classification involves assigning an internal DSCP to the packet, but because there is no DSCP in the non-IP packet, no overwrite occurs. Instead, the internal DSCP is translated to the CoS and is used both for queueing and scheduling decisions and for writing the CoS priority value in the tag if the packet is being sent on either an ISL or IEEE 802.1Q trunk port. Because the CoS priority is written in the tag, Catalyst 3500 series XL switches that use the IEEE 802.1p priority can interoperate with the QoS implementation on the Catalyst 3550 switches.

• During policing, IP and non-IP packets can have another DSCP assigned to them (if they are out of profile and the policer specifies a markdown DSCP). For IP packets, the packet modification occurs at a later stage; for non-IP packets the DSCP is converted to CoS and used for queueing and scheduling decisions.

Packet Modification in Catalyst 3750

A packet is classified, policed, and queued to provide QoS. Packet modifications can occur during this process:

• For IP and non-IP packets, classification involves assigning a QoS label to a packet based on the DSCP or CoS of the received packet. However, the packet is not modified at this stage; only an indication of the assigned DSCP or CoS value is carried along. The reason for this is that QoS classification and forwarding lookups occur in parallel, and it is possible that the packet is forwarded with its original DSCP to the CPU where it is again processed through software.

• During policing, IP and non-IP packets can have another DSCP assigned to them (if they are out of profile and the policer specifies a markdown DSCP). Once again, the DSCP in the packet is not modified, but an indication of the marked-down value is carried along. For IP packets, the packet modification occurs at a later stage; for non-IP packets the DSCP is converted to CoS and used for queueing and scheduling decisions.

• Depending on the QoS label assigned to a frame and the mutation chosen, the DSCP and CoS values of the frame are rewritten. If you do not configure the mutation map and if you configure the port to trust the DSCP of the incoming frame, the DSCP value in the frame is not changed, but the CoS is rewritten according to the DSCP-to-CoS map. If you configure the port to trust the CoS of the incoming frame and it is an IP packet, the CoS value in the frame is not changed, but the DSCP might be changed according to the CoS-to-DSCP map.

The input mutation causes the DSCP to be rewritten depending on the new value of DSCP chosen. The set action in a policy map also causes the DSCP to be rewritten.

Configuring a Trusted Boundary to Ensure Port Security

For most Cisco IP Phone configurations, the traffic sent from the telephone to the switch should be trusted to ensure that voice traffic is properly prioritized over other types of traffic in the network. By using the mls qos trust cos interface configuration command, you configure the switch port to which the telephone is connected to trust the CoS labels of all traffic received on that port. Use the mls qos trust dscp interface configuration command to configure a routed port to which the telephone is connected to trust the DSCP labels of all traffic received on that port.

With the trusted setting, you also can use the trusted boundary feature to prevent misuse of a high-priority queue if a user bypasses the telephone and connects the PC directly to the switch. Without trusted boundary, the CoS labels generated by the PC are trusted by the switch (because of the trusted CoS setting). By contrast, trusted boundary uses CDP to detect the presence of a Cisco IP Phone (such as the Cisco IP Phone 7910, 7935, 7940, and 7960) on a switch port. If the telephone is not detected, the trusted boundary feature disables the trusted setting on the switch port and prevents misuse of a high-priority queue. Note that the trusted boundary feature is not effective if the PC and Cisco IP Phone are connected to a hub that is connected to the switch.

In some situations, you can prevent a PC connected to the Cisco IP Phone from taking advantage of a high-priority data queue. You can use the switchport priority extend cos interface configuration command to configure the telephone through the switch CLI to override the priority of the traffic received from the PC.

QoS ACL Guidelines

These are the guidelines for configuring QoS with access control lists (ACLs):

• It is not possible to match IP fragments against configured IP extended ACLs to enforce QoS. IP fragments are sent as best-effort. IP fragments are denoted by fields in the IP header.

• Only one ACL per class map and only one match class-map configuration command per class map are supported. The ACL can have multiple ACEs, which match fields against the contents of the packet.

Applying QoS on Interfaces

These are the guidelines for configuring QoS on physical ports. This section also applies to SVIs (Layer 3 interfaces):

• You can configure QoS on physical ports and SVIs. When configuring QoS on physical ports, you create and apply nonhierarchical policy maps. When configuring QoS on SVIs, you can create and apply nonhierarchical and hierarchical policy maps.

• Incoming traffic is classified, policed, and marked down (if configured) regardless of whether the traffic is bridged, routed, or sent to the CPU. It is possible for bridged frames to be dropped or to have their DSCP and CoS values modified.

• In Cisco IOS Release 12.2(25)SE or later, follow these guidelines when configuring policy maps on physical ports or SVIs:

– You cannot apply the same policy map to a physical port and to an SVI.

– If VLAN-based QoS is configured on a physical port, the switch removes all the port-based policy maps on the port. The traffic on this physical port is now affected by the policy map attached to the SVI to which the physical port belongs.

– In a hierarchical policy map attached to an SVI, you can only configure an individual policer at the interface level on a physical port to specify the bandwidth limits for the traffic on the port. The ingress port must be configured as a trunk or as a static-access port. You cannot configure policers at the VLAN level of the hierarchical policy map.

– The switch does not support aggregate policers in hierarchical policy maps.

– After the hierarchical policy map is attached to an SVI, the interface-level policy map cannot be modified or removed from the hierarchical policy map. A new interface-level policy map also cannot be added to the hierarchical policy map. If you want these changes to occur, the hierarchical policy map must first be removed from the SVI. You also cannot add or remove a class map specified in the hierarchical policy map.

Policing Guidelines

These are the policing guidelines:

• The port ASIC device, which controls more than one physical port, supports 256 policers (255 policers plus 1 no policer). The maximum number of policers supported per port is 64. For example, you could configure 32 policers on a Gigabit Ethernet port and 8 policers on a Fast Ethernet port, or you could configure 64 policers on a Gigabit Ethernet port and 5 policers on a Fast Ethernet port. Policers are allocated on demand by the software and are constrained by the hardware and ASIC boundaries. You cannot reserve policers per port; there is no guarantee that a port will be assigned to any policer.

• Only one policer is applied to a packet on an ingress port. Only the average rate and committed burst parameters are configurable.

• You cannot configure policing on the 10-Gigabit interfaces.

• You can create an aggregate policer that is shared by multiple traffic classes within the same nonhierarchical policy map. However, you cannot use the aggregate policer across different policy maps.

• On a port configured for QoS, all traffic received through the port is classified, policed, and marked according to the policy map attached to the port. On a trunk port configured for QoS, traffic in all VLANs received through the port is classified, policed, and marked according to the policy map attached to the port.

• If you have EtherChannel ports configured on your switch, you must configure QoS classification, policing, mapping, and queueing on the individual physical ports that comprise the EtherChannel. You must decide whether the QoS configuration should match on all ports in the EtherChannel.

General QoS Guidelines

These are general QoS guidelines:

• Control traffic (such as spanning-tree bridge protocol data units [BPDUs] and routing update packets) received by the switch is subject to all ingress QoS processing.

• You are likely to lose data when you change queue settings; therefore, try to make changes when traffic is at a minimum.

Enabling DSCP Transparency Mode

In software releases earlier than Cisco IOS Release 12.2(25)SE, if QoS is disabled, the DSCP value of the incoming IP packet is not modified. If QoS is enabled and you configure the interface to trust DSCP, the switch does not modify the DSCP value. If you configure the interface to trust CoS, the switch modifies the DSCP value according to the CoS-to-DSCP map.

In Cisco IOS Release 12.2(25)SE or later, the switch supports the DSCP transparency feature. It affects only the DSCP field of a packet at egress. By default, DSCP transparency is disabled. The switch modifies the DSCP field in an incoming packet, and the DSCP field in the outgoing packet is based on the quality of service (QoS) configuration, including the port trust setting, policing and marking, and the DSCP-to-DSCP mutation map.

If DSCP transparency is enabled by using the no mls qos rewrite ip dscp command, the switch does not modify the DSCP field in the incoming packet, and the DSCP field in the outgoing packet is the same as that in the incoming packet.

Note Enabling DSCP transparency does not affect the port trust settings on IEEE 802.1Q tunneling ports.

Regardless of the DSCP transparency configuration, the switch modifies the internal DSCP value of the packet, which the switch uses to generate a class of service (CoS) value that represents the priority of the traffic. The switch also uses the internal DSCP value to select an egress queue and threshold.

If you enter the no mls qos rewrite ip dscp global configuration command to enable DSCP transparency and then enter the mls qos trust [cos | dscp] interface configuration command, DSCP transparency is still enabled.

Configuring the DSCP Trust State on a Port Bordering Another QoS Domain

If you are administering two separate QoS domains between which you want to implement QoS features for IP traffic, you can configure the switch ports bordering the domains to a DSCP-trusted state as shown in the next figure. Then the receiving port accepts the DSCP-trusted value and avoids the classification stage of QoS. If the two domains use different DSCP values, you can configure the DSCP-to-DSCP-mutation map to translate a set of DSCP values to match the definition in the other domain.

Classifying, Policing, and Marking Traffic on Physical Ports by Using Policy Maps

You can configure a nonhierarchical policy map on a physical port that specifies which traffic class to act on. Actions can include trusting the CoS, DSCP, or IP precedence values in the traffic class; setting a specific DSCP or IP precedence value in the traffic class; and specifying the traffic bandwidth limitations for each matched traffic class (policer) and the action to take when the traffic is out of profile (marking).

A policy map also has these characteristics:

• A policy map can contain multiple class statements, each with different match criteria and policers.

• A separate policy-map class can exist for each type of traffic received through a port.

• A policy-map trust state and a port trust state are mutually exclusive, and whichever is configured last takes effect.

Follow these guidelines when configuring policy maps on physical ports:

• You can attach only one policy map per ingress port.

• If you configure the IP-precedence-to-DSCP map by using the mls qos map ip-prec-dscp dscp1...dscp8 global configuration command, the settings only affect packets on ingress interfaces that are configured to trust the IP precedence value. In a policy map, if you set the packet IP precedence value to a new value by using the set precedence new-precedence policy-map class configuration command, the egress DSCP value is not affected by the IP-precedence-to-DSCP map. If you want the egress DSCP value to be different than the ingress value, use the set dscp new-dscp policy-map class configuration command.

• In Cisco IOS Release 12.2(25)SE or later, if you have used the set ip dscp command, the switch changes this command to set dscp in the switch configuration. If you enter the set ip dscp command, this setting appears as set dscp in the switch configuration.

• In Cisco IOS Release 12.2(25)SEC or later, you can use the set ip precedence or the set precedence command to set the packet IP precedence value to a new value. This setting appears as set ip precedence in the switch configuration.

Note The 10-Gigabit interfaces do not support policing by using a policy map.

Classifying, Policing, and Marking Traffic on SVIs by Using Hierarchical Policy Maps

In Cisco IOS Release 12.2(25)SE or later, you can configure hierarchical policy maps on SVIs.

Hierarchical policing combines the VLAN- and interface-level policy maps to create a single policy map. On an SVI, the VLAN-level policy map specifies which traffic class to act on. Actions can include trusting the CoS, DSCP, or IP precedence values or setting a specific DSCP or IP precedence value in the traffic class. Use the interface-level policy map to specify the physical ports that are affected by individual policers.

Follow these guidelines when configuring hierarchical policy maps:

• Before configuring a hierarchical policy map, you must enable VLAN-based QoS on the physical ports that are to be specified at the interface level of the policy map.

• You can attach only one policy map per ingress port or SVI.

• A policy map can contain multiple class statements, each with different match criteria and actions.

• A separate policy-map class can exist for each type of traffic received on the SVI.

• A policy-map trust state and a port trust state are mutually exclusive, and whichever is configured last takes effect.

• If you configure the IP-precedence-to-DSCP map by using the mls qos map ip-prec-dscp dscp1...dscp8 global configuration command, the settings only affect packets on ingress interfaces that are configured to trust the IP precedence value. In a policy map, if you set the packet IP precedence value to a new value by using the set precedence new-precedence policy-map class configuration command, the egress DSCP value is not affected by the IP-precedence-to-DSCP map. If you want the egress DSCP value to be different than the ingress value, use the set dscp new-dscp policy-map class configuration command.

• In Cisco IOS Release 12.2(25)SE or later, if you have used the set ip dscp command, the switch changes this command to set dscp in the switch configuration. If you enter the set ip dscp command, this setting appears as set dscp in the switch configuration.

• In Cisco IOS Release 12.2(25)SEC or later, you can use the set ip precedence or the set precedence command to set the packet IP precedence value to a new value. This setting appears as set ip precedence in the switch configuration.

• If VLAN-based QoS is enabled, the hierarchical policy map supersedes the previously configured port-based policy map.

• The hierarchical policy map is attached to the SVI and affects all traffic belonging to the VLAN. The individual policer in the interface-level traffic classification only affects the traffic on the physical ports specified in that classification. The actions specified in the VLAN-level policy map affect the traffic belonging to the SVI.

• When configuring a hierarchical policy map on trunk ports, the VLAN ranges must not overlap. If the ranges overlap, the actions specified in the policy map affect the incoming and outgoing traffic on the overlapped VLANs.

• Aggregate policers are not supported in hierarchical policy maps.

• When VLAN-based QoS is enabled, the switch supports VLAN-based features, such as the VLAN map.

• You can configure a hierarchical policy map only on the primary VLAN of a private VLAN.

• When you enable VLAN-based QoS and configure a hierarchical policy map in a switch stack, these automatic actions occur when the stack configuration changes:

– When a new stack master is selected, the stack master re-enables and reconfigures these features on all applicable interfaces on the stack master.

– When a stack member is added, the stack master re-enables and reconfigures these features on all applicable ports on the stack member.

– When you merge switch stacks, the new stack master re-enables and reconfigures these features on the switches in the new stack.

– When the switch stack divides into two or more switch stacks, the stack master in each switch stack re-enables and reconfigures these features on all applicable interfaces on the stack members, including the stack master.

Configuration example:

Switch(config)# access-list 101 permit ip any any
Switch(config)# class-map match-all cm-1
Switch(config-cmap)# match access-group 101
Switch(config-cmap)# exit
Switch(config)# class-map match-all cm-interface-1
Switch(config-cmap)# match input-interface gigabitethernet2/0/1 gigabitethernet2/0/2
Switch(config-cmap)# exit
Switch(config)# policy-map port-plcmap
Switch(config-pmap)# class cm-interface-1
Switch(config-pmap-c)# police 9000000 9000 exceed-action policed-dscp-transmit
Switch(config-pmap-c)# exit
Switch(config-pmap)# exit
Switch(config)# policy-map vlan-plcmap
Switch(config-pmap)# class cm-1
Switch(config-pmap-c)# set dscp 7
Switch(config-pmap-c)# service-policy port-plcmap
Switch(config-pmap-c)# exit
Switch(config-pmap)# exit
Switch(config)# interface vlan 10
Switch(config-if)# service-policy input vlan-plcmap

Classifying, Policing, and Marking Traffic by Using Aggregate Policers

By using an aggregate policer, you can create a policer that is shared by multiple traffic classes within the same policy map. However, you cannot use the aggregate policer across different policy maps or ports.

You can configure aggregate policers only in nonhierarchical policy maps on physical ports.

This example shows how to create an aggregate policer and attach it to multiple classes within a policy map. In the configuration, the IP ACLs permit traffic from network 10.1.0.0 and from host 11.3.1.1. For traffic coming from network 10.1.0.0, the DSCP in the incoming packets is trusted. For traffic coming from host 11.3.1.1, the DSCP in the packet is changed to 56. The traffic rate from the 10.1.0.0 network and from host 11.3.1.1 is policed. If the traffic exceeds an average rate of 48000 bps and a normal burst size of 8000 bytes, its DSCP is marked down (based on the policed-DSCP map) and sent. The policy map is attached to an ingress port.

Switch(config)# access-list 1 permit 10.1.0.0 0.0.255.255
Switch(config)# access-list 2 permit 11.3.1.1
Switch(config)# mls qos aggregate-police transmit1 48000 8000 exceed-action policed-dscp-transmit
Switch(config)# class-map ipclass1
Switch(config-cmap)# match access-group 1
Switch(config-cmap)# exit
Switch(config)# class-map ipclass2
Switch(config-cmap)# match access-group 2
Switch(config-cmap)# exit
Switch(config)# policy-map aggflow1
Switch(config-pmap)# class ipclass1
Switch(config-pmap-c)# trust dscp
Switch(config-pmap-c)# police aggregate transmit1
Switch(config-pmap-c)# exit
Switch(config-pmap)# class ipclass2
Switch(config-pmap-c)# set dscp 56
Switch(config-pmap-c)# police aggregate transmit1
Switch(config-pmap-c)# exit
Switch(config-pmap)# exit
Switch(config)# interface gigabitethernet2/0/1
Switch(config-if)# service-policy input aggflow1
Switch(config-if)# exit
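To see why sharing one policer between classes matters, it can help to model the policer as a token bucket, a common way to describe an average rate with a burst allowance. The Python sketch below is that generic model, not a description of the switch hardware; the class name AggregatePolicer, the 1500-byte packets, and the explicit timestamps are assumptions for illustration. The rate and burst values are taken from the example above.

```python
class AggregatePolicer:
    """Single-rate token bucket shared by every class that references it."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate_bytes_per_s = rate_bps / 8     # tokens are counted in bytes
        self.burst = burst_bytes
        self.tokens = float(burst_bytes)         # the bucket starts full
        self.last = 0.0                          # time of the previous packet

    def conforms(self, now, packet_bytes):
        """True if in profile; False means exceed (mark down and send)."""
        elapsed = now - self.last
        self.last = now
        refill = elapsed * self.rate_bytes_per_s
        self.tokens = min(self.burst, self.tokens + refill)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


transmit1 = AggregatePolicer(rate_bps=48000, burst_bytes=8000)
# Packets from ipclass1 and ipclass2 draw on the same bucket, so a burst
# in one class uses up the allowance of the other.
results = [transmit1.conforms(now=0.0, packet_bytes=1500) for _ in range(6)]
print(results)
```

Five back-to-back 1500-byte packets fit within the 8000-byte burst; the sixth exceeds it and, in the configuration above, would have its DSCP marked down rather than being dropped.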

Mapping tables definitions

You use the CoS-to-DSCP map to map CoS values in incoming packets to a DSCP value that QoS uses internally to represent the priority of the traffic.

You use the IP-precedence-to-DSCP map to map IP precedence values in incoming packets to a DSCP value that QoS uses internally to represent the priority of the traffic.

You use the policed-DSCP map to mark down a DSCP value to a new value as the result of a policing and marking action. The default policed-DSCP map is a null map, which maps an incoming DSCP value to the same DSCP value.

You use the DSCP-to-CoS map to generate a CoS value, which is used to select one of the four egress queues.

If two QoS domains have different DSCP definitions, use the DSCP-to-DSCP-mutation map to translate one set of DSCP values to match the definition of another domain. You apply the DSCP-to-DSCP-mutation map to the receiving port (ingress mutation) at the boundary of a QoS administrative domain.

With ingress mutation, the new DSCP value overwrites the one in the packet, and QoS treats the packet with this new value. The switch sends the packet out the port with the new DSCP value.

You can configure multiple DSCP-to-DSCP-mutation maps on an ingress port. The default DSCP-to-DSCP-mutation map is a null map, which maps an incoming DSCP value to the same DSCP value.

Configuring Ingress Queues on Catalyst 3750

Depending on the complexity of your network and your QoS solution, you might need to perform all of the tasks in the next sections. You will need to make decisions about these characteristics:

• Which packets are assigned (by DSCP or CoS value) to each queue?

• What drop percentage thresholds apply to each queue, and which CoS or DSCP values map to each threshold?

• How much of the available buffer space is allocated between the queues?

• How much of the available bandwidth is allocated between the queues?

• Is there traffic (such as voice) that should be given high priority?

Mapping DSCP or CoS Values to an Ingress Queue and Setting WTD Thresholds

You can prioritize traffic by placing packets with particular DSCPs or CoSs into certain queues and adjusting the queue thresholds so that packets with lower priorities are dropped.

This example shows how to map DSCP values 0 to 6 to ingress queue 1 and to threshold 1 with a drop threshold of 50 percent. It maps DSCP values 20 to 26 to ingress queue 1 and to threshold 2 with a drop threshold of 70 percent:

Switch(config)# mls qos srr-queue input dscp-map queue 1 threshold 1 0 1 2 3 4 5 6
Switch(config)# mls qos srr-queue input dscp-map queue 1 threshold 2 20 21 22 23 24 25 26
Switch(config)# mls qos srr-queue input threshold 1 50 70

In this example, the DSCP values (0 to 6) are assigned the WTD threshold of 50 percent and will be dropped sooner than the DSCP values (20 to 26) assigned to the WTD threshold of 70 percent.
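The drop decision described in this example can be modeled in a few lines of Python. This is a minimal sketch, not the switch's implementation: the names `DSCP_THRESHOLD_PERCENT` and `wtd_should_drop` are hypothetical, queue occupancy is assumed to be available as a percentage, and DSCP values that the example does not map are assumed to fall under the queue-full (100 percent) threshold.

```python
# Simplified model of weighted tail drop (WTD) on ingress queue 1.
# Thresholds mirror the example: DSCP 0-6 -> 50 percent, DSCP 20-26 -> 70 percent.
# All names are illustrative and are not part of Cisco IOS.

DSCP_THRESHOLD_PERCENT = {}
for dscp in range(0, 7):
    DSCP_THRESHOLD_PERCENT[dscp] = 50   # threshold 1
for dscp in range(20, 27):
    DSCP_THRESHOLD_PERCENT[dscp] = 70   # threshold 2


def wtd_should_drop(dscp: int, queue_fill_percent: float) -> bool:
    """Return True if a frame with this DSCP is dropped at the given queue fill level."""
    # Assumption: unmapped DSCP values use the queue-full threshold (100 percent).
    threshold = DSCP_THRESHOLD_PERCENT.get(dscp, 100)
    # Modeling choice: a frame is dropped once the fill level reaches its threshold.
    return queue_fill_percent >= threshold
```

With the queue 60 percent full, `wtd_should_drop(3, 60)` reports a drop while `wtd_should_drop(22, 60)` does not, which is the ordering the example describes.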

Allocating Buffer Space Between the Ingress Queues

You define the ratio (allocate the amount of space) with which to divide the ingress buffers between the two queues. The buffer and the bandwidth allocation control how much data can be buffered before packets are dropped.

This example shows how to allocate 60 percent of the buffer space to ingress queue 1 and 40 percent of the buffer space to ingress queue 2:

Switch(config)# mls qos srr-queue input buffers 60 40

Allocating Bandwidth Between the Ingress Queues

You need to specify how much of the available bandwidth is allocated between the ingress queues. The ratio of the weights is the ratio of the frequency in which the SRR scheduler sends packets from each queue. The bandwidth and the buffer allocation control how much data can be buffered before packets are dropped. On ingress queues, SRR operates only in shared mode.

This example shows how to assign the ingress bandwidth to the queues. Priority queueing is disabled, and the shared bandwidth ratio allocated to queue 1 is 25/(25+75) and to queue 2 is 75/(25+75):

Switch(config)# mls qos srr-queue input priority-queue 2 bandwidth 0
Switch(config)# mls qos srr-queue input bandwidth 25 75

Configuring the Ingress Priority Queue

You should use the priority queue only for traffic that needs to be expedited (for example, voice traffic, which needs minimum delay and jitter).

The priority queue is guaranteed part of the bandwidth to reduce the delay and jitter under heavy network traffic on an oversubscribed ring (when there is more traffic than the backplane can carry, and the queues are full and dropping frames).

SRR services the priority queue for its configured weight as specified by the bandwidth keyword in the mls qos srr-queue input priority-queue queue-id bandwidth weight global configuration command. Then, SRR shares the remaining bandwidth with both ingress queues and services them as specified by the weights configured with the mls qos srr-queue input bandwidth weight1 weight2 global configuration command.

This example shows how to assign the ingress bandwidths to the queues. Queue 1 is the priority queue with 10 percent of the bandwidth allocated to it. The bandwidth ratio allocated to each of queues 1 and 2 is 4/(4+4). SRR services queue 1 (the priority queue) first for its configured 10 percent bandwidth. Then SRR equally shares the remaining 90 percent of the bandwidth between queues 1 and 2 by allocating 45 percent to each queue:

Switch(config)# mls qos srr-queue input priority-queue 1 bandwidth 10
Switch(config)# mls qos srr-queue input bandwidth 4 4
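The arithmetic behind the two ingress bandwidth examples can be checked with a short Python sketch. The function name `ingress_shares` is hypothetical; it only reproduces the percentages stated in the text and does not model how the scheduler actually dequeues packets.

```python
def ingress_shares(weight1: int, weight2: int, priority_percent: float = 0.0):
    """Return (queue1_percent, queue2_percent) of the bandwidth that SRR shares.

    priority_percent is the portion served first for the priority queue;
    the remainder is divided according to the two shared weights.
    """
    total = weight1 + weight2
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    remaining = 100.0 - priority_percent
    return (remaining * weight1 / total, remaining * weight2 / total)


# Priority queueing disabled, weights 25 and 75.
no_priority = ingress_shares(25, 75)

# Priority queue takes 10 percent first, then weights 4 and 4 split the rest.
with_priority = ingress_shares(4, 4, priority_percent=10)
```

Here `no_priority` is `(25.0, 75.0)` and `with_priority` is `(45.0, 45.0)`, matching the figures in the two examples.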

Configuring Egress Queues on Catalyst 3750

Depending on the complexity of your network and your QoS solution, you might need to perform all of the tasks in the next sections. You will need to make decisions about these characteristics:

• Which packets are mapped by DSCP or CoS value to each queue and threshold ID?

• What drop percentage thresholds apply to the queue-set (four egress queues per port), and how much reserved and maximum memory is needed for the traffic type?

• How much of the fixed buffer space is allocated to the queue-set?

• Does the bandwidth of the port need to be rate limited?

• How often should the egress queues be serviced and which technique (shaped, shared, or both) should be used?

Configuration Guidelines

Follow these guidelines when the expedite queue is enabled or the egress queues are serviced based on their SRR weights:

• If the egress expedite queue is enabled, it overrides the SRR shaped and shared weights for queue 1.

• If the egress expedite queue is disabled and the SRR shaped and shared weights are configured, the shaped mode overrides the shared mode for queue 1, and SRR services this queue in shaped mode.

• If the egress expedite queue is disabled and the SRR shaped weights are not configured, SRR services this queue in shared mode.

Allocating Buffer Space to and Setting WTD Thresholds for an Egress Queue-Set

You can guarantee the availability of buffers, set WTD thresholds, and configure the maximum memory allocation for a queue-set by using the mls qos queue-set output qset-id threshold queue-id drop-threshold1 drop-threshold2 reserved-threshold maximum-threshold global configuration command.

Each threshold value is a percentage of the queue’s allocated memory, which you specify by using the mls qos queue-set output qset-id buffers allocation1 ... allocation4 global configuration command. The queues use WTD to support distinct drop percentages for different traffic classes.

Note The egress queue default settings are suitable for most situations. You should change them only when you have a thorough understanding of the egress queues and if these settings do not meet your QoS solution.

This example shows how to map a port to queue-set 2. It allocates 40 percent of the buffer space to egress queue 1 and 20 percent to egress queues 2, 3, and 4. It configures the drop thresholds for queue 2 to 40 and 60 percent of the allocated memory, guarantees (reserves) 100 percent of the allocated memory, and configures 200 percent as the maximum memory that this queue can have before packets are dropped:

Switch(config)# mls qos queue-set output 2 buffers 40 20 20 20
Switch(config)# mls qos queue-set output 2 threshold 2 40 60 100 200
Switch(config)# interface gigabitethernet1/0/1
Switch(config-if)# queue-set 2

Mapping DSCP or CoS Values to an Egress Queue and to a Threshold ID

You can prioritize traffic by placing packets with particular DSCP or CoS values into certain queues and adjusting the queue thresholds so that packets with lower priorities are dropped.

Note The egress queue default settings are suitable for most situations. You should change them only when you have a thorough understanding of the egress queues and if these settings do not meet your QoS solution.

This example shows how to map DSCP values 10 and 11 to egress queue 1 and to threshold 2:

Switch(config)# mls qos srr-queue output dscp-map queue 1 threshold 2 10 11

Configuring SRR Shaped Weights on Egress Queues

Note You cannot configure SRR shaped weights on the 10-Gigabit interfaces.

You can specify how much of the available bandwidth is allocated to each queue. The ratio of the weights is the ratio of frequency in which the SRR scheduler sends packets from each queue.

You can configure the egress queues for shaped or shared weights, or both. Use shaping to smooth bursty traffic or to provide a smoother output over time.

This example shows how to configure bandwidth shaping on queue 1. Because the weight ratios for queues 2, 3, and 4 are set to 0, these queues operate in shared mode. The bandwidth weight for queue 1 is 1/8, which is 12.5 percent:

Switch(config)# interface gigabitethernet2/0/1

Switch(config-if)# srr-queue bandwidth shape 8 0 0 0
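In shaped mode the inverse of the weight gives the share of port bandwidth, which is where the 1/8 figure comes from. A minimal Python sketch of that calculation follows; the function names are hypothetical, and the 1-Gbps link speed in the usage line is an assumption made for illustration.

```python
from typing import Optional


def shaped_percent(weight: int) -> Optional[float]:
    """Return the shaped bandwidth as a percentage of the port, or None for weight 0.

    A weight of 0 means the queue is not shaped and operates in shared mode.
    """
    if weight < 0:
        raise ValueError("weight must not be negative")
    if weight == 0:
        return None
    return 100.0 / weight


def shaped_rate_bps(weight: int, link_bps: int) -> Optional[float]:
    """Return the shaped rate in bits per second for a given link speed."""
    percent = shaped_percent(weight)
    return None if percent is None else link_bps * percent / 100.0


# Weights from the example: queue 1 is shaped, queues 2 to 4 are shared.
percents = [shaped_percent(w) for w in (8, 0, 0, 0)]
queue1_rate = shaped_rate_bps(8, 1_000_000_000)  # assumed 1-Gbps port
```

For the example weights, `percents` is `[12.5, None, None, None]`, and on the assumed 1-Gbps port queue 1 would be shaped to 125 Mbps.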

Configuring SRR Shared Weights on Egress Queues

In shared mode, the queues share the bandwidth among them according to the configured weights. The bandwidth is guaranteed at this level but not limited to it. For example, if a queue empties and does not require a share of the link, the remaining queues can expand into the unused bandwidth and share it among them. With sharing, the ratio of the weights controls the frequency of dequeuing; the absolute values are meaningless.

This example shows how to configure the weight ratio of the SRR scheduler running on an egress port. Four queues are used, and the bandwidth ratio allocated for each queue in shared mode is 1/(1+2+3+4), 2/(1+2+3+4), 3/(1+2+3+4), and 4/(1+2+3+4), which is 10 percent, 20 percent, 30 percent, and 40 percent for queues 1, 2, 3, and 4. This means that queue 4 has four times the bandwidth of queue 1, twice the bandwidth of queue 2, and one-and-a-third times the bandwidth of queue 3.

Switch(config)# interface gigabitethernet2/0/1
Switch(config-if)# srr-queue bandwidth share 1 2 3 4
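The shared-mode percentages can be reproduced with a short Python sketch. The function name `shared_percentages` is hypothetical; the code only normalizes the weights and does not model how unused bandwidth is redistributed when a queue is empty.

```python
def shared_percentages(weights):
    """Return each queue's guaranteed share of the port bandwidth in percent."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [100.0 * w / total for w in weights]


# Weights from the example.
shares = shared_percentages([1, 2, 3, 4])

# Only the ratio matters: scaling every weight by 10 gives the same shares.
scaled = shared_percentages([10, 20, 30, 40])
```

Both calls yield `[10.0, 20.0, 30.0, 40.0]`, which illustrates why the absolute weight values carry no meaning in shared mode.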

Configuring the Egress Expedite Queue

Beginning in Cisco IOS Release 12.1(19)EA1, you can ensure that certain packets have priority over all others by queuing them in the egress expedite queue. SRR services this queue until it is empty before servicing the other queues.

This example shows how to enable the egress expedite queue when the SRR weights are configured. The egress expedite queue overrides the configured SRR weights.

Switch(config)# interface gigabitethernet1/0/1
Switch(config-if)# srr-queue bandwidth shape 25 0 0 0
Switch(config-if)# srr-queue bandwidth share 30 20 25 25
Switch(config-if)# priority-queue out

Limiting the Bandwidth on an Egress Interface

Note You cannot configure SRR shaped weights on the 10-Gigabit interfaces.

You can limit the bandwidth on an egress port. For example, if a customer pays only for a small percentage of a high-speed link, you can limit the bandwidth to that amount.

This example shows how to limit the bandwidth on a port to 80 percent:

Switch(config)# interface gigabitethernet2/0/1
Switch(config-if)# srr-queue bandwidth limit 80

Configuring Auto-QoS

You can use the auto-QoS feature to simplify the deployment of existing QoS features. Auto-QoS makes assumptions about the network design, and as a result, the switch can prioritize different traffic flows and appropriately use the egress queues instead of using the default QoS behavior. (The default is that QoS is disabled. The switch then offers best-effort service to each packet, regardless of the packet contents or size, and sends it from a single queue.)

When you enable auto-QoS, it automatically classifies traffic based on the traffic type and ingress packet label. The switch uses the resulting classification to choose the appropriate egress queue.

You use auto-QoS commands to identify ports connected to Cisco IP Phones and to devices running the Cisco SoftPhone application. You also use the commands to identify ports that receive trusted traffic through an uplink. Auto-QoS then performs these functions:

• Detects the presence or absence of Cisco IP Phones

• Configures QoS classification

• Configures egress queues

Generated Auto-QoS Configuration in Catalyst 3550

By default, auto-QoS is disabled on all interfaces.

When auto-QoS is enabled, it uses the ingress packet label to categorize traffic and to configure the egress queues as shown:

When you enable the auto-QoS feature on the first interface, these automatic actions occur:

• QoS is globally enabled (mls qos global configuration command).

• When you enter the auto qos voip cisco-phone interface configuration command on a port at the edge of the network that is connected to a Cisco IP Phone, the switch enables the trusted boundary feature. The switch uses the Cisco Discovery Protocol (CDP) to detect the presence or absence of a Cisco IP Phone. When a Cisco IP Phone is detected, the ingress classification on the interface is set to trust the QoS label received in the packet. When a Cisco IP Phone is absent, the ingress classification is set to not trust the QoS label in the packet. The switch configures egress queues on the port according to the settings in the following table.

• When you enter the auto qos voip cisco-softphone interface configuration command on a port at the edge of the network that is connected to a device running the Cisco SoftPhone, the switch uses policing to determine whether a packet is in or out of profile and to specify the action on the packet. If the packet does not have a DSCP value of 24, 26, or 46 or is out of profile, the switch changes the DSCP value to 0. The switch configures egress queues on the port according to the settings in the following table.

• When you enter the auto qos voip trust interface configuration command on a port connected to the interior of the network, the switch trusts the CoS value for nonrouted ports or the DSCP value for routed ports in ingress packets (the assumption is that traffic has already been classified by other edge devices). The switch configures egress queues on the port according to the settings in the following table.

Generated Auto-QoS Configuration in Catalyst 3750

Preliminary classification for queues:

Ingress queue:

Egress queue:

When you enable the auto-QoS feature on the first port, the switch configures its ingress and egress queues in accordance with the previous tables, and these automatic actions occur:

• QoS is globally enabled (mls qos global configuration command), and other global configuration commands are added.

• When you enter the auto qos voip cisco-phone interface configuration command on a port at the edge of the network that is connected to a Cisco IP Phone, the switch enables the trusted boundary feature. The switch uses the Cisco Discovery Protocol (CDP) to detect the presence or absence of a Cisco IP Phone. When a Cisco IP Phone is detected, the ingress classification on the port is set to trust the QoS label received in the packet. When a Cisco IP Phone is absent, the ingress classification is set to not trust the QoS label in the packet.

• When you enter the auto qos voip cisco-softphone interface configuration command on a port at the edge of the network that is connected to a device running the Cisco SoftPhone, the switch uses policing to determine whether a packet is in or out of profile and to specify the action on the packet. If the packet does not have a DSCP value of 24, 26, or 46 or is out of profile, the switch changes the DSCP value to 0.

• When you enter the auto qos voip trust interface configuration command on a port connected to the interior of the network, the switch trusts the CoS value for nonrouted ports or the DSCP value for routed ports in ingress packets (the assumption is that traffic has already been classified by other edge devices).

When you enable auto-QoS by using the auto qos voip cisco-phone, the auto qos voip cisco-softphone, or the auto qos voip trust interface configuration command, the switch automatically generates a QoS configuration based on the traffic type and ingress packet label and applies the commands listed in the next table to the port.

Egress queues on Catalyst 3550:

Egress queues on Catalyst 3750:

AutoQoS configuration details for Catalyst 3750

When you enable auto-QoS by using the auto qos voip cisco-phone, the auto qos voip cisco-softphone, or the auto qos voip trust interface configuration command, the switch automatically generates a QoS configuration based on the traffic type and ingress packet label and applies the commands listed in the next table to the interface.

AutoQoS configuration details for Catalyst 3550

Effects of Auto-QoS on the Configuration

When auto-QoS is enabled, the auto qos voip interface configuration command and the generated configuration are added to the running configuration.

The switch applies the auto-QoS-generated commands as if the commands were entered from the CLI.

An existing user configuration can cause the application of the generated commands to fail or to be overridden by the generated commands. These actions occur without warning. If all the generated commands are successfully applied, any user-entered configuration that was not overridden remains in the running configuration. Any user-entered configuration that was overridden can be retrieved by reloading the switch without saving the current configuration to memory. If the generated commands fail to be applied, the previous running configuration is restored.

Configuration Guidelines

Before configuring auto-QoS, you should be aware of this information:

• In releases earlier than Cisco IOS Release 12.1(20)EA2, auto-QoS configures the switch for VoIP only with Cisco IP Phones on nonrouted ports.

• In Cisco IOS Release 12.1(20)EA2 or later, auto-QoS configures the switch for VoIP with Cisco IP Phones on nonrouted and routed ports. Auto-QoS also configures the switch for VoIP with devices running the Cisco SoftPhone application.

Note When a device running Cisco SoftPhone is connected to a nonrouted or routed port, the switch supports only one Cisco SoftPhone application per port.

• To take advantage of the auto-QoS defaults, you should enable auto-QoS before you configure other QoS commands. If necessary, you can fine-tune the QoS configuration, but we recommend that you do so only after the auto-QoS configuration is completed.

• After auto-QoS is enabled, do not modify a policy map or aggregate policer that includes AutoQoS in its name. If you need to modify the policy map or aggregate policer, make a copy of it, and change the copied policy map or policer. To use this new policy map instead of the generated one, remove the generated policy map from the interface, and apply the new policy map to the interface.

• You can enable auto-QoS on static, dynamic-access, voice VLAN access, and trunk ports.

• By default, the CDP is enabled on all interfaces. For auto-QoS to function properly, do not disable the CDP.

• When enabling auto-QoS with a Cisco IP Phone on a routed port, you must assign a static IP address to the IP phone.

• This release supports only Cisco IP SoftPhone Version 1.3(3) or later.

• Connected devices must use Cisco Call Manager Version 4 or later.

IP Multicast

Multimedia applications offer the integration of sound, graphics, animation, text, and video, which are all delivered over IP and an existing campus infrastructure. These types of applications have become an effective means of corporate communication; however, sending combined media over a campus data network brings with it the potential for high bandwidth consumption. IP Multicast is an efficient means of delivering media to many hosts over a single IP flow.

Multimedia traffic can work its way through the network in one of several ways.

Unicast

An application sends one copy of each packet to every client unicast address requiring the multimedia flow. If the group is large, the same information is carried multiple times from sender to receiver. Unicast transmission has significant scaling restrictions.

Broadcast

An application sends one copy of each packet to a destination broadcast address. The network interface card of all hosts seeing the broadcast packets must process all broadcast packets. This is inefficient if only a small group in the network actually needs the data flow. Multimedia broadcast is rarely implemented.

Multicast

A multimedia server sends one copy of each packet to a single destination IP address that can be received by many end stations if they choose to "listen" on that address. For example, if a video server transmits a single 1.5-Mbps video stream to a set of host devices listening to a specific multicast address, only 1.5 Mbps of server-to-network bandwidth is utilized regardless of the number of receiving hosts.

Multicasting conserves bandwidth by replicating packets only onto segments where listening devices exist, allowing an arbitrary number of clients to subscribe to the multicast address. Multicast flows minimize processing required by network devices and non-listening hosts.

IP Multicast is the transmission of an IP data frame to a host group that is defined by a single IP Multicast address. IP Multicasting has these characteristics:

Delivers a multicast datagram to a destination multicast address (also known as a multicast group) with the same best-effort reliability as a regular unicast IP datagram

Allows group members to join and leave dynamically

Supports all host groups regardless of the location or number of members

Supports the membership of a single host in one or more multicast groups

Can carry multiple data streams to a single group address

Can use a single group address for multiple host applications

Multicast server does not keep track of the number of recipients

In IP multicasting, the variability in delivery time is limited to the differences in end-to-end network delay along the complete server-to-client path. In a unicast scenario, the server must sequence transmission of multiple copies of the data to multiple unicast hosts, so variability in delivery time is large, especially for large transmissions or large distribution lists.

Multicast traffic is handled at the transport layer using the User Datagram Protocol (UDP). Unlike the Transmission Control Protocol (TCP), UDP adds no reliability, flow control, or error recovery functions to IP. Because of the simplicity of UDP, data packet headers contain fewer bytes and consume less network overhead than TCP. Therefore, reliability in multicast is managed by observing receivers and monitoring the network's ability to deliver the multicast packets in sequence without delay.

IP Multicast Group Membership

IP multicast relies on the concept of group members and a group address. The group address is a single IP Multicast address that is the destination address of all packets sent from a source. Receiving devices join that group and listen for packets with the destination IP address of the group. In normal TCP/IP operations, packets arrive at a destination address by traversing routers that forward the packets on a hop-by-hop basis based on the IP destination of the packet and entries in the routing table. Routers do not forward traffic in this manner to multicast destination addresses.

By default, multicast traffic is blocked at the Layer 3 devices, as is the case with broadcast traffic. Routers between a multicast source and its receiving hosts must be configured with a multicast protocol that will determine where the receivers exist so that the flow is sent only onto segments with receivers and is not replicated onto segments with no group members listening for that flow. Receivers join a group by registering with a local router. The routers between the source and receiver must be configured to deliver the multicast stream from the source to all segments hosting devices that have joined the group. Once a receiver joins a group, packets addressed to that group are placed onto the segment where the receivers are located and multiple group members can receive a single flow. PIM and IGMP are multicast protocols that keep multicast traffic isolated to portions of the network where group members are located.

IP Multicast Address Structure

Multicast uses Class D IP address space. A Class D address consists of 1110 as the high-order bits in the first octet, followed by a 28-bit group address. The range of IP addresses is divided into classes based on the high-order bits of a 32-bit IP address.

The remaining 28 bits of the IP address identify the multicast group ID. This multicast group ID is a single address typically written as decimal numbers in the range 224.0.0.0 through 239.255.255.255. The high-order bits in the first octet identify this 224-base address.
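The bit layout described above can be verified with a small Python sketch that uses the standard `ipaddress` module. The helper names `is_class_d` and `group_id` are hypothetical and serve only to illustrate the 4-bit prefix and the 28-bit group ID.

```python
import ipaddress


def is_class_d(address: str) -> bool:
    """Return True if the four high-order bits of the IPv4 address are 1110."""
    value = int(ipaddress.IPv4Address(address))
    return (value >> 28) == 0b1110


def group_id(address: str) -> int:
    """Return the 28-bit multicast group ID (the address without the 1110 prefix)."""
    if not is_class_d(address):
        raise ValueError(f"{address} is not a Class D (multicast) address")
    return int(ipaddress.IPv4Address(address)) & 0x0FFFFFFF
```

The boundaries of the range follow directly: 224.0.0.0 is the first address whose top four bits are 1110, and 239.255.255.255 is the last.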

Multicast addresses may be dynamically or statically allocated. Dynamic multicast addressing provides applications with a group address on demand. Dynamic multicast addresses have a specific lifetime and applications must request and use the address as needed. Static addresses are used at all times. As with IP addressing, there is the concept of private address space that may be used for local, organization wide traffic and public or Internet wide multicast addresses. There are also addresses reserved for specific protocols that require well-known addresses. The Internet Assigned Numbers Authority (IANA) manages the assignment of multicast addresses that are called permanent host groups and are similar in concept to well known TCP and UDP port numbers.

IP Multicast to MAC Address Mapping

Due to decisions taken early in the development of multicasting, only the MAC address range from 0100.5e00.0000 through 0100.5e7f.ffff is available for carrying multicast frames.

This makes the first 25 bits of the MAC address fixed and allows for the last 23 bits of the MAC address to correspond to the last 23 bits in the IP multicast group address.

Because the upper five bits of the IP multicast address are dropped in this mapping, the result is that two different IP Multicast addresses may map to the same MAC multicast address. For example, 224.1.1.1 and 225.1.1.1 map to the same multicast MAC address. If one user subscribed to Group A (as designated by 224.1.1.1) and the other users subscribed to Group B (as designated by 225.1.1.1), they would both receive both A and B streams at Layer 2. At Layer 3, however, only the packets associated with the IP address of the selected multicast group would be viewable because the port ranges used within the address will be different between aliased streams. Network administrators should consider this when assigning IP multicast addresses.
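The mapping and the resulting address overlap can be demonstrated with a short Python sketch. The function name `multicast_mac` is hypothetical; the code copies the low 23 bits of the group address into the fixed 0100.5e00.0000 prefix as described above.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group address to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF           # keep only the last 23 bits of the IP address
    mac = 0x01005E000000 | low23           # fixed 25-bit prefix plus the 23 mapped bits
    digits = f"{mac:012x}"
    # Format in the dotted style used in the text, for example 0100.5e00.0000.
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


mac_a = multicast_mac("224.1.1.1")
mac_b = multicast_mac("225.1.1.1")
```

Both `mac_a` and `mac_b` come out as `0100.5e01.0101`, which shows why hosts subscribed to either group receive frames for both at Layer 2.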

IP multicast Address Ranges

The IANA controls the assignment of IP multicast addresses. IANA has assigned the IPv4 Class D address space to be used for IP multicast. Therefore, all IP multicast group addresses fall in the range from 224.0.0.0 through 239.255.255.255. The Class D address is used for the destination IP address of multicast traffic for a specific group. The source address of a multicast datagram is the unicast address of the device sourcing the multicast flow to the destination multicast address.

Reserved Link-Local Addresses

The IANA has reserved addresses in the range 224.0.0.0 to 224.0.0.255 to be used by network protocols on a local network segment. A router should never forward packets with these addresses. Packets with link-local destination addresses are typically sent with a Time to Live (TTL) value of 1 and are not forwarded by a router. Network protocols use these addresses for automatic router discovery and to communicate important routing information. For example, Open Shortest Path First (OSPF) Protocol uses the IP addresses 224.0.0.5 and 224.0.0.6 to exchange link-state information.

Address 224.0.0.1 identifies the all-hosts group. Every multicast-capable host must join this group. If a ping command is issued using this address, all multicast-capable hosts on the network must answer the ping request.

Address 224.0.0.2 identifies the all-routers group. Multicast routers must join that group on all multicast-capable interfaces.

Globally Scoped Addresses

Multicast addresses in the range from 224.0.1.0 through 238.255.255.255 are called "globally scoped addresses." They can be used to multicast data between organizations and across the Internet. Some of these addresses have been registered with IANA; for example, IP address 224.0.1.1 has been reserved for Network Time Protocol (NTP).

Source Specific Multicast Addresses

Addresses in the 232.0.0.0 to 232.255.255.255 range are reserved for Source Specific Multicast (SSM). SSM is an extension of Protocol Independent Multicast (PIM), which allows for an efficient data delivery mechanism in one-to-many communications.

GLOP Addresses

Multicast GLOP (a word, not an acronym) addresses in the range 233.0.0.0 to 233.255.255.255 can be used statically by organizations that have an Autonomous System (AS) number registered by a network registry and listed in the RWhois database. The second and third octets of the domain multicast address are represented by the AS number. For example, AS 62010 is written in hexadecimal format as "F23A." This value is separated into two parts, F2 and 3A, which convert to decimal as 242 and 58. This yields a multicast GLOP address of 233.242.58.0/24. Multicast group addresses in that address space can be used by the organization with AS 62010 and routed throughout the Internet Multicast Backbone.
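The AS-to-GLOP conversion is simple enough to express as a Python sketch. The function name `glop_prefix` is hypothetical, and the code assumes a 16-bit AS number, since only two octets are available to hold it.

```python
import ipaddress


def glop_prefix(as_number: int) -> ipaddress.IPv4Network:
    """Return the 233.x.y.0/24 GLOP block for a 16-bit AS number."""
    if not 0 <= as_number <= 0xFFFF:
        raise ValueError("GLOP addressing in this sketch assumes a 16-bit AS number")
    high_octet, low_octet = divmod(as_number, 256)   # 62010 -> (242, 58), hex F2 and 3A
    return ipaddress.IPv4Network(f"233.{high_octet}.{low_octet}.0/24")


block = glop_prefix(62010)
```

For AS 62010, `str(block)` is `233.242.58.0/24`, matching the worked example.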

Limited Scope Addresses

Like private IP address space that is used within the boundaries of a single organization, "limited" or "administratively scoped" addresses in the range 239.0.0.0 to 239.255.255.255 are constrained to a local group or organization. Companies, universities, or other organizations can use limited scope addresses to have local multicast applications that will not be forwarded over the Internet. Typically, routers are configured with filters to prevent multicast traffic in this address range from flowing outside of an AS. Within an autonomous system or domain, the limited scope address range can be further subdivided so that local multicast boundaries can be defined. This subdivision is called "address scoping" and allows for address reuse between smaller domains. These addresses are described in RFC 2365, Administratively Scoped IP Multicast.
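The address ranges described in the preceding sections can be summarized in a small Python classifier. This is an illustrative sketch: the names `SPECIAL_RANGES` and `classify_multicast` are hypothetical, and the SSM and GLOP blocks are checked before the broader globally scoped range because they sit inside it.

```python
import ipaddress

# More specific blocks first; anything else in Class D is treated as globally scoped.
SPECIAL_RANGES = [
    ("reserved link-local", ipaddress.IPv4Network("224.0.0.0/24")),
    ("source-specific multicast", ipaddress.IPv4Network("232.0.0.0/8")),
    ("GLOP", ipaddress.IPv4Network("233.0.0.0/8")),
    ("limited scope", ipaddress.IPv4Network("239.0.0.0/8")),
]


def classify_multicast(address: str) -> str:
    """Return the name of the multicast range that contains the address."""
    addr = ipaddress.IPv4Address(address)
    if not addr.is_multicast:
        return "not multicast"
    for name, network in SPECIAL_RANGES:
        if addr in network:
            return name
    return "globally scoped"
```

For example, the OSPF address 224.0.0.5 is classified as reserved link-local, the NTP address 224.0.1.1 as globally scoped, and 239.1.1.1 as limited scope.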

Reverse Path Forwarding

Multicast-capable routers create distribution trees that control the path that IP multicast traffic takes through the network as it is forwarded from the source toward all receivers. Unlike typical IP traffic, multicast traffic is forwarded away from the source, rather than toward the receiver. Because it is the reverse of typical IP packet processing, the process is called Reverse Path Forwarding (RPF). Multicast distribution trees fall into two categories, source-based trees and shared trees, and the type of tree depends on the multicast protocol in use.
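One common way routers keep traffic flowing away from the source is an RPF check on the arrival interface: a multicast packet is accepted only if it arrived on the interface the router would itself use to reach the source. The Python sketch below models that check under simplifying assumptions. The route table format, the interface names, and the functions `rpf_interface` and `rpf_check` are all hypothetical, and a real router would consult its own unicast or multicast routing information instead.

```python
import ipaddress
from typing import Dict, Optional


def rpf_interface(unicast_routes: Dict[str, str], source: str) -> Optional[str]:
    """Return the interface on the longest-prefix route back toward the source."""
    src = ipaddress.IPv4Address(source)
    best_len = -1
    best_iface = None
    for prefix, iface in unicast_routes.items():
        network = ipaddress.IPv4Network(prefix)
        if src in network and network.prefixlen > best_len:
            best_len = network.prefixlen
            best_iface = iface
    return best_iface


def rpf_check(unicast_routes: Dict[str, str], source: str, arrival_iface: str) -> bool:
    """Accept the packet only if it arrived on the interface that leads back to the source."""
    expected = rpf_interface(unicast_routes, source)
    return expected is not None and expected == arrival_iface


# Hypothetical routing table: prefix -> outgoing interface.
routes = {"192.168.1.0/24": "Gi1/0/1", "0.0.0.0/0": "Gi1/0/2"}
accepted = rpf_check(routes, "192.168.1.1", "Gi1/0/1")
rejected = rpf_check(routes, "192.168.1.1", "Gi1/0/2")
```

In this model a packet from 192.168.1.1 passes the check on Gi1/0/1 and fails it on Gi1/0/2, so a copy that looped back through another path would be discarded.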

Source Distribution Trees

A source tree is the simplest form of a multicast distribution tree, with its root at the source and branches forming a tree through the network toward the receivers. This type of tree uses the shortest path through the network and is therefore also called a "shortest path tree (SPT)."

An SPT is identified by a special notation of (S, G), pronounced "S comma G," where S is the IP address of the source and G is the multicast group address to which receivers belong. Using this notation, the SPT for the example shown in the figure would be (192.168.1.1, 224.1.1.1). The unicast IP address of the receivers is irrelevant.

The (S, G) notation implies that a separate SPT exists for each individual source sending to each group. For example, if host B is also sending traffic to group 224.1.1.1 and hosts A and C are receivers, a separate (S, G) SPT would exist with a notation of (192.168.2.2, 224.1.1.1).

Shared Distribution Trees

Unlike source trees that have their root at the source, shared trees use a single common root placed at a chosen point in the network. This shared root is called a rendezvous point (RP). The next figure shows a shared unidirectional tree for group 224.2.2.2, with the root located at router D. The source traffic is sent toward the RP. The traffic is then forwarded from the RP to reach all of the receivers. If the receiver is located between the source and the RP, it will be serviced directly.

In this example, multicast traffic from the sources (hosts A and D) travels to the RP (router D) and then down the tree to the two receivers (hosts B and C). Because all sources in the multicast group use a common shared tree, a notation written as (*, G), pronounced "star comma G," represents the tree. In this case, "*" means all sources, and G represents the multicast group. Therefore, the shared tree shown in the figure would be written as (*, 224.2.2.2).

Source Trees vs. Shared Trees

Members of multicast groups can join or leave at any time; therefore, the distribution trees must be dynamically updated. When all the active receivers on the Layer 3 segments which are associated with a particular branch stop requesting traffic for a multicast group, the routers prune that branch from the distribution tree, stopping traffic flow to that branch. If one receiver on that branch becomes active and requests the multicast traffic, the router will dynamically modify the distribution tree and will again start forwarding traffic.

Source trees have the advantage of creating the optimal path between the source and the receivers. This advantage guarantees the minimum amount of network latency for forwarding multicast traffic. However, this optimization comes at a cost: the routers must maintain path information for each source. In a network that has thousands of sources and thousands of groups, this overhead can quickly become a resource issue on the routers. Memory consumption from the size of the multicast routing table is a factor that network designers must take into consideration.

Shared trees have the advantage of requiring the minimum amount of network state information in each router. This advantage lowers the overall memory requirements for a network that allows shared trees only. The disadvantage of shared trees is that under certain circumstances the paths between the source and receivers might not be the optimal paths, which might introduce some latency in packet delivery. For example, in the figure, the shortest path between host A (source 1) and host B (a receiver) would be router A and router C. Because router D is used as the RP for a shared tree, the traffic must traverse routers A, B, D, and then C. Network designers must carefully consider the placement of the RP when implementing a shared tree-only environment.
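
The state trade-off between the two tree types can be illustrated with a small Python sketch. It only counts route entries; the flow list is hypothetical (it reuses the example addresses from the sections above), and tree_state is a name invented for this illustration.

```python
def tree_state(flows, shared_only):
    """Return the set of multicast route entries needed for the given flows.

    flows is an iterable of (source_ip, group_ip) pairs.
    """
    if shared_only:
        # One (*, G) entry per group, no matter how many sources send to it.
        return {("*", group) for _source, group in flows}
    # One (S, G) entry per distinct source/group pair.
    return {(source, group) for source, group in flows}


# Hypothetical flows: two sources for 224.1.1.1, one source for 224.2.2.2.
flows = [
    ("192.168.1.1", "224.1.1.1"),
    ("192.168.2.2", "224.1.1.1"),
    ("192.168.1.1", "224.2.2.2"),
]
print(len(tree_state(flows, shared_only=False)))  # 3
print(len(tree_state(flows, shared_only=True)))   # 2
```

With thousands of sources per group, the source-tree count grows with every source while the shared-tree count stays at one entry per group.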

Reverse Path Forwarding Check

With unicast traffic, packets are routed from the source to the destination by considering the packet's destination IP address. The router looks up the destination address in its routing table and forwards a single copy of the unicast packet out the correct interface in the direction of the destination.

In multicast forwarding, the source sends traffic to a group of hosts represented by a multicast group address. The multicast router determines which direction is upstream (toward the source) and which is downstream (toward the listeners). If there are multiple downstream paths, the router replicates the packet down all appropriate downstream paths.

RPF uses the existing unicast routing table to determine the interface on which multicast traffic from a given source should arrive. When a multicast packet arrives at a router, the router performs an RPF check on the packet. If the check is successful, the packet is forwarded; otherwise it is dropped. This RPF check helps to guarantee that the distribution tree is loop-free.

For packets flowing down a source tree, the RPF check procedure follows this sequence:

Step 1 The router looks up the source address in the unicast routing table to determine whether the packet has arrived on the interface located on the reverse path back to the source.

Step 2 If a packet has arrived on the interface leading back to the source, the RPF check is successful and the packet will be forwarded.

Step 3 If the RPF check in Step 2 fails, the packet is quietly dropped.

At the top of the figure the RPF check fails. A multicast packet from source 151.10.3.21 is received on interface S0. A check of the unicast route table shows that S1 is the interface this router would expect to see unicast data from 151.10.3.21. Because the packet has arrived on S0, the packet will be discarded.

In the bottom of the figure the multicast packet arrived on S1. The router checks the unicast routing table to find that S1 is the correct interface. The RPF check passes. The packet is forwarded.
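
The RPF check in the two cases above can be sketched in Python. The routing table below is hypothetical: the only fact taken from the example is that the route back to 151.10.3.21 points out S1. The prefixes, the other entries, and the function names are assumptions made for illustration.

```python
import ipaddress

# Hypothetical unicast routing table: (prefix, interface toward that prefix).
UNICAST_ROUTES = [
    (ipaddress.ip_network("151.10.0.0/16"), "S1"),
    (ipaddress.ip_network("198.14.32.0/24"), "S0"),
    (ipaddress.ip_network("204.1.16.0/24"), "E0"),
]


def rpf_interface(source):
    """Return the interface on the reverse path to source, or None."""
    addr = ipaddress.ip_address(source)
    matches = [(net, ifc) for net, ifc in UNICAST_ROUTES if addr in net]
    if not matches:
        return None
    # Longest-prefix match, as in ordinary unicast forwarding.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_check(source, arrival_interface):
    """Pass only if the packet arrived on the interface leading to source."""
    expected = rpf_interface(source)
    return expected is not None and expected == arrival_interface


print(rpf_check("151.10.3.21", "S0"))  # False: packet is dropped
print(rpf_check("151.10.3.21", "S1"))  # True: packet is forwarded
```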

The Cisco IOS software supports these protocols to implement IP multicast routing:

• Internet Group Management Protocol (IGMP) is used among hosts on a LAN and the routers (and multilayer switches) on that LAN to track the multicast groups of which hosts are members.
• Protocol-Independent Multicast (PIM) protocol is used among routers and multilayer switches to track which multicast packets to forward to each other and to their directly connected LANs.
• Distance Vector Multicast Routing Protocol (DVMRP) is used on the multicast backbone of the Internet (MBONE). The software supports PIM-to-DVMRP interaction.
• Cisco Group Management Protocol (CGMP) is used on Cisco routers and multilayer switches connected to Layer 2 Catalyst switches to perform tasks similar to those performed by IGMP.

IGMP

Internet Group Management Protocol (IGMP) is used to register individual hosts with a multicast group. The host sends a join message (a membership report), which the local multicast router receives. If the router is running a multicast routing protocol, it accepts the join and then forwards the multicast stream for that group onto the segment where the registering host is present. IGMP messages are IP datagrams with a protocol value of 2 and a TTL of 1; the destination address depends on the message type (for example, IGMPv2 leave messages are sent to the all-routers address 224.0.0.2).

In addition to listening to IGMP join messages, multicast routers also periodically send out queries to discover which groups are active or inactive on a particular subnet. The query is sent to the all-hosts address 224.0.0.1 with a TTL of 1. Any end station that is part of a multicast group receives this IGMP query and responds with a host membership report for each group to which it belongs.

As of this writing, version 3 is the most current iteration of IGMP and is covered in more detail. Previous versions had attributes and limitations as listed in the next figure.

To participate in IP multicasting, multicast hosts, routers, and multilayer switches must have IGMP operating. This protocol defines the querier and host roles:

• A querier is a network device that sends query messages to discover which network devices are members of a given multicast group.
• A host is a receiver that sends report messages (in response to query messages) to inform a querier of a host membership.

A set of queriers and hosts that receive multicast data streams from the same source is called a multicast group. Queriers and hosts use IGMP messages to join and leave multicast groups.

Any host, regardless of whether it is a member of a group, can send to a group. However, only the members of a group receive the message. Membership in a multicast group is dynamic; hosts can join and leave at any time. There is no restriction on the location or number of members in a multicast group.

A host can be a member of more than one multicast group at a time. How active a multicast group is and what members it has can vary from group to group and from time to time. A multicast group can be active for a long time, or it can be very short-lived. Membership in a group can constantly change. A group that has members can have no activity.

IP multicast traffic uses group addresses, which are Class D addresses. The high-order bits of a Class D address are 1110. Therefore, host group addresses can be in the range 224.0.0.0 through 239.255.255.255. Multicast addresses in the range 224.0.0.0 to 224.0.0.255 are reserved for use by routing protocols and other network control traffic. The address 224.0.0.0 is guaranteed not to be assigned to any group.
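
As a rough illustration of the address ranges, the Python sketch below tests the 1110 high-order bits and then labels an address using the reserved, GLOP, and administratively scoped ranges described in these notes. The labels and function names are invented for this example, and other assigned ranges are deliberately left out.

```python
import ipaddress

# Ranges taken from the notes; the labels are informal.
RANGES = [
    ("reserved for network control", ipaddress.ip_network("224.0.0.0/24")),
    ("GLOP", ipaddress.ip_network("233.0.0.0/8")),
    ("administratively scoped", ipaddress.ip_network("239.0.0.0/8")),
]


def is_class_d(address):
    """True when the four high-order bits of the address are 1110."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    return first_octet >> 4 == 0b1110


def classify(address):
    """Return an informal label for the address."""
    if not is_class_d(address):
        return "not multicast"
    addr = ipaddress.IPv4Address(address)
    for label, network in RANGES:
        if addr in network:
            return label
    return "other multicast"


print(classify("224.0.0.1"))    # reserved for network control
print(classify("239.1.1.1"))    # administratively scoped
print(classify("192.168.1.1"))  # not multicast
```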

IGMP packets are sent using these IP multicast group addresses:

• IGMP general queries are destined to the address 224.0.0.1 (all systems on a subnet).
• IGMP group-specific queries are destined to the group IP address for which the switch is querying.
• IGMP group membership reports are destined to the group IP address for which the switch is reporting.
• IGMP Version 2 (IGMPv2) leave messages are destined to the address 224.0.0.2 (all-multicast-routers on a subnet). In some old host IP stacks, leave messages might be destined to the group IP address rather than to the all-routers address.

IGMP Version 1

IGMP Version 1 (IGMPv1) primarily uses a query-response model that enables the multicast router and multilayer switch to find which multicast groups are active (have one or more hosts interested in a multicast group) on the local subnet. IGMPv1 has other processes that enable a host to join and leave a multicast group. For more information, see RFC 1112.

IGMP Version 2

IGMPv2 extends IGMP functionality by providing such features as the IGMP leave process to reduce leave latency, group-specific queries, and an explicit maximum query response time. IGMPv2 also adds the capability for routers to elect the IGMP querier without depending on the multicast protocol to perform this task. For more information, see RFC 2236.

IGMP Message Format

IGMP Version 3 (IGMPv3), the next step in the evolution of IGMP, adds support for source filtering, multiple group memberships, joins and leaves. This enables a multicast receiving host to indicate to the router the groups from which it wants to receive multicast traffic, as well as the source unicast addresses from which this traffic is expected. This membership information enables Cisco IOS software to forward traffic from only those sources requested by the receiver. IGMPv3 Report and Query messages have different packet structures.

IGMPv3 Report Message Format

With IGMP v3, receivers signal membership to a multicast host group in these two modes:

INCLUDE mode – The receiver announces membership to a host group and provides a list of source addresses (the INCLUDE list) from which it does want to receive traffic.

EXCLUDE mode – The receiver announces membership to a multicast group and provides a list of source addresses (the EXCLUDE list) from which it does not want to receive traffic. To receive traffic from all sources, which is the behavior of IGMP v2, a host uses EXCLUDE mode membership with an empty EXCLUDE list.
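
The two membership modes can be modeled with a short Python sketch. It is only a model of the filtering decision, not of the IGMPv3 packet format; the Membership class and wants_traffic function are names invented for this illustration, and the addresses reuse examples from earlier in these notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Membership:
    group: str
    mode: str                        # "INCLUDE" or "EXCLUDE"
    sources: frozenset = frozenset()  # the INCLUDE or EXCLUDE list


def wants_traffic(membership, source, group):
    """Decide whether a receiver wants traffic from source to group."""
    if group != membership.group:
        return False
    if membership.mode == "INCLUDE":
        return source in membership.sources
    if membership.mode == "EXCLUDE":
        return source not in membership.sources
    raise ValueError(f"unknown filter mode: {membership.mode}")


# INCLUDE list with one source: only that source is accepted.
only_one = Membership("224.1.1.1", "INCLUDE", frozenset({"192.168.1.1"}))
# EXCLUDE mode with an empty list: every source is accepted (IGMPv2 behavior).
any_source = Membership("224.1.1.1", "EXCLUDE")

print(wants_traffic(only_one, "192.168.2.2", "224.1.1.1"))    # False
print(wants_traffic(any_source, "192.168.2.2", "224.1.1.1"))  # True
```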

IGMPv3 Query Message Format

The IGMP query message sent from the multicast router to the all hosts address 224.0.0.1 has a different format than the report or join message.

PIM

PIM is called protocol-independent: regardless of the unicast routing protocols used to populate the unicast routing table, PIM uses this information to perform multicast forwarding instead of maintaining a separate multicast routing table.

PIM is defined in RFC 2362, Protocol-Independent Multicast-Sparse Mode (PIM-SM): Protocol Specification. PIM is defined in these Internet Engineering Task Force (IETF) Internet drafts:

• Protocol Independent Multicast (PIM): Motivation and Architecture
• Protocol Independent Multicast (PIM), Dense Mode Protocol Specification
• Protocol Independent Multicast (PIM), Sparse Mode Protocol Specification
• draft-ietf-idmr-igmp-v2-06.txt, Internet Group Management Protocol, Version 2
• draft-ietf-pim-v2-dm-03.txt, PIM Version 2 Dense Mode

PIM Versions

PIM Version 1 (PIMv1) was Cisco proprietary. In addition to being an IETF standards-track protocol, PIM Version 2 (PIMv2) includes the following improvements:

• A single, active rendezvous point (RP) exists per multicast group, with multiple backup RPs. This single RP compares to multiple active RPs for the same group in PIMv1.
• A bootstrap router (BSR) provides a fault-tolerant, automated RP discovery and distribution mechanism that enables routers and multilayer switches to dynamically learn the group-to-RP mappings.
• Sparse mode and dense mode are properties of a group, as opposed to an interface. We strongly recommend sparse-dense mode, as opposed to either sparse mode or dense mode only.
• PIM join and prune messages have more flexible encoding for multiple address families.
• A more flexible hello packet format replaces the query packet to encode current and future capability options.
• Register messages to an RP specify whether they are sent by a border router or a designated router.
• PIM packets are no longer inside IGMP packets; they are standalone packets.

PIM Modes

PIM is IP routing protocol-independent and can leverage whichever unicast routing protocols are used to populate the unicast routing table. PIM dense mode operation is based on the assumption that multicast group members are densely distributed throughout the network, meaning that almost all hosts on the network belong to the group, and that bandwidth is plentiful. The PIM dense-mode multicast routing protocol relies on periodic flooding of the network with multicast traffic to set up and maintain the distribution tree.

PIM DM

PIM dense mode works best when there are numerous members belonging to each multicast group. PIM floods the multicast packet out to all routers in the network and then prunes routers that do not support members of that particular multicast group.

PIM dense mode is most useful under the following circumstances:

• Senders and receivers are in close proximity to one another.
• There are few senders and many receivers.
• The volume of multicast traffic is high.
• The stream of multicast traffic is constant.

PIM DM builds source-based multicast distribution trees. In dense mode, a PIM DM router or multilayer switch assumes that all other routers or multilayer switches forward multicast packets for a group. If a PIM DM device receives a multicast packet and has no directly connected members or PIM neighbors present, a prune message is sent back to the source to stop unwanted multicast traffic. Subsequent multicast packets are not flooded to this router or switch on this pruned branch because branches without receivers are pruned from the distribution tree, leaving only branches that contain receivers.

When a new receiver on a previously pruned branch of the tree joins a multicast group, the PIM DM device detects the new receiver and immediately sends a graft message up the distribution tree toward the source. When the upstream PIM DM device receives the graft message, it immediately puts the interface on which the graft was received into the forwarding state so that the multicast traffic begins flowing to the receiver.

PIM SM

The second approach to multicast routing is based on the assumption that the multicast group members are sparsely distributed throughout the network and bandwidth is not necessarily widely available.

It is important to note that sparse mode does not imply that the group has few members, just that they are widely dispersed. In this case, flooding would unnecessarily waste network resources. Sparse-mode multicast routing protocols rely on more selective techniques to set up and maintain multicast trees. Sparse-mode protocols begin with an empty distribution tree and add branches only as the result of explicit requests to join the distribution.

Sparse-mode PIM is optimized for environments where there are many multipoint data streams. Sparse multicast is most useful when:

• There are few receivers in a group.
• The type of traffic is intermittent.

In sparse mode, each data stream goes to a relatively small number of segments in the campus network. Instead of flooding the network to determine the status of multicast members, sparse-mode PIM defines a rendezvous point. When a source begins to generate a flow, it is directed to a rendezvous point. When a router determines that it has receivers out its interfaces, it registers with the rendezvous point. The routers in the path will optimize the path automatically to remove any unnecessary hops. Sparse-mode PIM assumes that no hosts want the multicast traffic unless they specifically request it.

PIM is able to simultaneously support dense mode for some multicast groups and sparse mode for others. Cisco has implemented an alternative to choosing just dense mode or just sparse mode on a router interface. PIM sparse-dense mode allows the network to determine which IP Multicast groups should use sparse mode and which groups should use dense mode. PIM sparse mode and sparse-dense mode require the use of a rendezvous point.

PIM SM uses shared trees and shortest-path-trees (SPTs) to distribute multicast traffic to multicast receivers in the network. In PIM SM, a router or multilayer switch assumes that other routers or switches do not forward multicast packets for a group, unless there is an explicit request for the traffic (join message). When a host joins a multicast group using IGMP, its directly connected PIM SM device sends PIM join messages toward the root, also known as the RP. This join message travels router-by-router toward the root, constructing a branch of the shared tree as it goes.

The RP keeps track of multicast receivers. It also registers sources through register messages received from the source’s first-hop router (designated router [DR]) to complete the shared tree path from the source to the receiver. When using a shared tree, sources must send their traffic to the RP so that the traffic reaches all receivers.

Prune messages are sent up the distribution tree to prune multicast group traffic. This action permits branches of the shared tree or SPT that were created with explicit join messages to be torn down when they are no longer needed.

PIM (DM-SM) Sparse-dense Mode

This mode allows individual groups to be run in either sparse or dense mode depending on whether RP information is available for that group. If the router gleans RP information for a particular group, it will be treated as sparse mode; otherwise that group will be treated as dense mode.
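
The per-group decision can be sketched in Python as a lookup in the router's group-to-RP information. The mapping contents and the RP address are hypothetical, group_mode is a name invented for this example, and preferring the most specific matching range is an assumption of the sketch rather than something stated in these notes.

```python
import ipaddress


def group_mode(group, group_to_rp):
    """Return ("sparse", rp) if an RP is known for group, else ("dense", None).

    group_to_rp maps a group-range prefix (string) to an RP address.
    """
    addr = ipaddress.ip_address(group)
    candidates = [
        (ipaddress.ip_network(prefix), rp)
        for prefix, rp in group_to_rp.items()
        if addr in ipaddress.ip_network(prefix)
    ]
    if not candidates:
        return ("dense", None)   # no RP information: treat as dense mode
    # Assumption: the most specific matching range wins.
    _network, rp = max(candidates, key=lambda c: c[0].prefixlen)
    return ("sparse", rp)


learned = {"239.1.0.0/16": "10.0.0.1"}   # hypothetical RP mapping
print(group_mode("239.1.2.3", learned))  # ('sparse', '10.0.0.1')
print(group_mode("224.5.5.5", learned))  # ('dense', None)
```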

Auto-RP

This proprietary feature eliminates the need to manually configure the RP information in every router and multilayer switch in the network. For Auto-RP to work, you configure a Cisco router or multilayer switch as the mapping agent. It uses IP multicast to learn which routers or switches in the network are possible candidate RPs to receive candidate RP announcements. Candidate RPs periodically send multicast RP-announce messages to a particular group or group range to announce their availability. Mapping agents listen to these candidate RP announcements and use the information to create entries in their Group-to-RP mapping caches. Only one mapping cache entry is created for any Group-to-RP range received, even if multiple candidate RPs are sending RP announcements for the same range. As the RP-announce messages arrive, the mapping agent selects the router or switch with the highest IP address as the active RP and stores this RP address in the Group-to-RP mapping cache.
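
The mapping agent's selection rule (one cache entry per group range, highest candidate IP address wins) can be sketched in Python. The announcements are hypothetical and build_mapping_cache is a name invented for this example; timers and message formats are ignored.

```python
import ipaddress


def build_mapping_cache(announcements):
    """Build a Group-to-RP cache from (candidate_rp, group_range) pairs."""
    cache = {}
    for rp, group_range in announcements:
        current = cache.get(group_range)
        # Compare addresses numerically, not as strings.
        if current is None or ipaddress.ip_address(rp) > ipaddress.ip_address(current):
            cache[group_range] = rp
    return cache


# Hypothetical candidate RP announcements.
announcements = [
    ("10.1.1.9", "224.0.0.0/4"),
    ("10.1.1.10", "224.0.0.0/4"),
    ("10.1.1.5", "239.0.0.0/8"),
]
print(build_mapping_cache(announcements))
# {'224.0.0.0/4': '10.1.1.10', '239.0.0.0/8': '10.1.1.5'}
```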

Mapping agents periodically multicast the contents of their Group-to-RP mapping cache. Thus, all routers and switches automatically discover which RP to use for the groups they support. If a router or switch fails to receive RP-discovery messages and the Group-to-RP mapping information expires, it switches to a statically configured RP that was defined with the ip pim rp-address global configuration command. If no statically configured RP exists, the router or switch changes the group to dense-mode operation.

Multiple RPs serve different group ranges or serve as hot backups of each other.

Bootstrap Router

PIMv2 BSR is another method to distribute group-to-RP mapping information to all PIM routers and multilayer switches in the network. It eliminates the need to manually configure RP information in every router and switch in the network. However, instead of using IP multicast to distribute group-to-RP mapping information, BSR uses hop-by-hop flooding of special BSR messages to distribute the mapping information.

The BSR is elected from a set of candidate routers and switches in the domain that have been configured to function as BSRs. The election mechanism is similar to the root-bridge election mechanism used in bridged LANs. The BSR election is based on the BSR priority of the device contained in the BSR messages that are sent hop-by-hop through the network. Each BSR device examines the message and forwards out all interfaces only the message that has either a higher BSR priority than its BSR priority or the same BSR priority, but with a higher BSR IP address. Using this method, the BSR is elected.
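
The comparison used in the election (higher BSR priority wins, with the higher IP address breaking ties) can be sketched in Python. The candidate list is hypothetical and the function names are invented for this example; the hop-by-hop message exchange itself is not modeled.

```python
import ipaddress


def bsr_rank(priority, address):
    """Rank used to compare candidates: priority first, then IP address."""
    return (priority, int(ipaddress.IPv4Address(address)))


def should_forward(own, received):
    """A candidate forwards only BSR messages that outrank its own values."""
    return bsr_rank(*received) > bsr_rank(*own)


def elect_bsr(candidates):
    """Return the (priority, address) pair that wins the election."""
    return max(candidates, key=lambda c: bsr_rank(*c))


# Hypothetical candidate BSRs as (priority, address) pairs.
candidates = [(10, "10.0.0.9"), (20, "10.0.0.2"), (20, "10.0.0.10")]
print(elect_bsr(candidates))  # (20, '10.0.0.10')
```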

The elected BSR sends BSR messages with a TTL of 1. Neighboring PIMv2 routers or multilayer switches receive the BSR message and multicast it out all other interfaces (except the one on which it was received) with a TTL of 1. In this way, BSR messages travel hop-by-hop throughout the PIM domain. Because BSR messages contain the IP address of the current BSR, the flooding mechanism enables candidate RPs to automatically learn which device is the elected BSR.

Candidate RPs send candidate RP advertisements showing the group range for which they are responsible to the BSR, which stores this information in its local candidate-RP cache. The BSR periodically advertises the contents of this cache in BSR messages to all other PIM devices in the domain. These messages travel hop-by-hop through the network to all routers and switches, which store the RP information in the BSR message in their local RP cache. The routers and switches select the same RP for a given group because they all use a common RP hashing algorithm.

Multicast Forwarding and Reverse Path Check

With unicast routing, routers and multilayer switches forward traffic through the network along a single path from the source to the destination host whose IP address appears in the destination address field of the IP packet. Each router and switch along the way makes a unicast forwarding decision, using the destination IP address in the packet, by looking up the destination address in the unicast routing table and forwarding the packet through the specified interface to the next hop toward the destination.

With multicasting, the source is sending traffic to an arbitrary group of hosts represented by a multicast group address in the destination address field of the IP packet. To decide whether to forward or drop an incoming multicast packet, the router or multilayer switch uses a reverse path forwarding (RPF) check on the packet as follows:

1. The router or multilayer switch examines the source address of the arriving multicast packet to decide whether the packet arrived on an interface that is on the reverse path back to the source.
2. If the packet arrives on the interface leading back to the source, the RPF check is successful and the packet is forwarded to all interfaces in the outgoing interface list (which might not be all interfaces on the router).
3. If the RPF check fails, the packet is discarded.

Some multicast routing protocols, such as DVMRP, maintain a separate multicast routing table and use it for the RPF check. However, PIM uses the unicast routing table to perform the RPF check.

PIM uses both source trees and RP-rooted shared trees to forward datagrams. The RPF check is performed differently for each:

• If a PIM router or multilayer switch has a source-tree state (that is, an (S,G) entry is present in the multicast routing table), it performs the RPF check against the IP address of the source of the multicast packet.
• If a PIM router or multilayer switch has a shared-tree state (and no explicit source-tree state), it performs the RPF check on the RP address (which is known when members join the group).

Sparse-mode PIM uses the RPF lookup function to decide where it needs to send joins and prunes:

• (S,G) joins (which are source-tree states) are sent toward the source.
• (*,G) joins (which are shared-tree states) are sent toward the RP.
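
The rule for choosing the RPF lookup address can be written as a small Python sketch: source-tree state looks toward the source, shared-tree state looks toward the RP. The RP address below is hypothetical and rpf_lookup_address is a name invented for this example.

```python
def rpf_lookup_address(entry, rp_for_group):
    """Return the address whose unicast route decides the RPF interface.

    entry is a (source, group) tuple; source is "*" for shared-tree state.
    rp_for_group maps a group address to its RP address.
    """
    source, group = entry
    if source == "*":
        return rp_for_group[group]   # shared tree: look toward the RP
    return source                    # source tree: look toward the source


rp_for_group = {"224.2.2.2": "10.0.0.4"}   # hypothetical RP address
print(rpf_lookup_address(("*", "224.2.2.2"), rp_for_group))            # 10.0.0.4
print(rpf_lookup_address(("192.168.1.1", "224.1.1.1"), rp_for_group))  # 192.168.1.1
```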

DVMRP and dense-mode PIM use only source trees and use RPF as previously described.

DVMRP

Distance Vector Multicast Routing Protocol is implemented in the equipment of many vendors and is based on the public-domain mrouted program. This protocol has been deployed in the MBONE and in other intradomain multicast networks.

Cisco routers and multilayer switches run PIM and can forward multicast packets to and receive from a DVMRP neighbor. It is also possible to propagate DVMRP routes into and through a PIM cloud. The software propagates DVMRP routes and builds a separate database for these routes on each router and multilayer switch, but PIM uses this routing information to make the packet-forwarding decision. The software does not implement the complete DVMRP. However, it supports dynamic discovery of DVMRP routers and can interoperate with them over traditional media (such as Ethernet and FDDI) or over DVMRP-specific tunnels.

DVMRP neighbors build a route table by periodically exchanging source network routing information in route-report messages. The routing information stored in the DVMRP routing table is separate from the unicast routing table and is used to build a source distribution tree and to perform multicast forwarding using RPF.

DVMRP is a dense-mode protocol and builds a parent-child database using a constrained multicast model to build a forwarding tree rooted at the source of the multicast packets. Multicast packets are initially flooded down this source tree. If redundant paths are on the source tree, packets are not forwarded along those paths. Forwarding occurs until prune messages are received on those parent-child links, which further constrain the broadcast of multicast packets.

CGMP

The current software release provides CGMP-server support on your switch; no client-side functionality is provided. The switch can serve as a CGMP server for devices that do not support IGMP snooping but have CGMP-client functionality.

CGMP is a protocol used on Cisco routers and multilayer switches connected to Layer 2 Catalyst switches to perform tasks similar to those performed by IGMP. CGMP permits Layer 2 group membership information to be communicated from the CGMP server to the switch. The switch can then learn on which interfaces multicast members reside instead of flooding multicast traffic to all switch interfaces. (IGMP snooping is another method to constrain the flooding of multicast packets.)

CGMP is necessary because the Layer 2 switch cannot distinguish between IP multicast data packets and IGMP report messages, which are both at the MAC-level and are addressed to the same group address.

Multicast Routing and Switch Stacks

For all multicast routing protocols, the entire stack appears as a single router to the network and operates as a single multicast router.

In a Catalyst 3750 switch stack, the routing master (stack master) performs these functions:

• It is responsible for completing the IP multicast routing functions of the stack. It fully initializes and runs the IP multicast routing protocols.
• It builds and maintains the multicast routing table for the entire stack.
• It is responsible for distributing the multicast routing table to all stack members.

The stack members perform these functions:

• They act as multicast routing standby devices and are ready to take over if there is a stack master failure. If the stack master fails, all stack members delete their multicast routing tables. The newly elected stack master starts building the routing tables and distributes them to the stack members.
• They do not build multicast routing tables. Instead, they use the multicast routing table that is distributed by the stack master.

Note If a stack master running the IP services image fails and if the newly elected stack master is running the IP base image (formerly known as the standard multilayer image [SMI]), the switch stack will lose its multicast routing capability.

IGMP snooping process

The default behavior for a Layer 2 switch is to forward multicast traffic to every port in the VLAN on which the traffic was received. Therefore, a switch between a requesting host and a multicast router will forward a multicast flow intended for a single host out all switch ports on the same VLAN as the receiving host. IGMP snooping is an IP multicast constraining mechanism for switches. It examines IGMP frames so that multicast traffic is not forwarded out all VLAN ports but only out those ports over which hosts sent IGMP messages toward the router.

IGMP snooping runs on a Layer 2 switch. The switch snoops the content of the IGMP join and leave messages sent between the hosts and the router. When the switch sees an IGMP report from a host to join a particular multicast group, the switch creates a CAM table entry associating the port number where that message was seen to the Layer 2 multicast address for the group that the host joined. When the frames of the multicast flow arrive at the switch with the destination multicast MAC address, they are forwarded down only those ports where the IGMP messages were snooped, and associated CAM table entries were created. When the switch snoops the IGMP leave group message from a host, the switch removes the table entry.
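
The table maintenance described above can be modeled with a short Python sketch. The SnoopingTable class is invented for this illustration and leaves out router ports, timers, and VLANs. The group-to-MAC mapping (prefix 01:00:5e followed by the low-order 23 bits of the group address) is the standard IPv4 mapping and is not spelled out in these notes.

```python
import ipaddress
from collections import defaultdict


def multicast_mac(group):
    """Map an IPv4 group address to its Layer 2 multicast MAC address."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    mac = 0x01005E000000 | low23
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -1, -8))


class SnoopingTable:
    """Toy CAM table: multicast MAC address -> set of member ports."""

    def __init__(self):
        self.ports_by_mac = defaultdict(set)

    def on_report(self, port, group):
        # A snooped IGMP report adds the port to the group's entry.
        self.ports_by_mac[multicast_mac(group)].add(port)

    def on_leave(self, port, group):
        # A snooped leave removes the port; empty entries are deleted.
        mac = multicast_mac(group)
        self.ports_by_mac[mac].discard(port)
        if not self.ports_by_mac[mac]:
            del self.ports_by_mac[mac]

    def egress_ports(self, dst_mac):
        # Ports that should receive a frame sent to dst_mac.
        return set(self.ports_by_mac.get(dst_mac, set()))


table = SnoopingTable()
table.on_report("Fa0/1", "224.1.1.1")
table.on_report("Fa0/3", "224.1.1.1")
print(multicast_mac("224.1.1.1"))                      # 01:00:5e:01:01:01
print(sorted(table.egress_ports("01:00:5e:01:01:01")))  # ['Fa0/1', 'Fa0/3']
table.on_leave("Fa0/1", "224.1.1.1")
print(sorted(table.egress_ports("01:00:5e:01:01:01")))  # ['Fa0/3']
```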

PIM Configuration

Preventing Join Messages to False RPs

Find whether the ip pim accept-rp command was previously configured throughout the network by using the show running-config privileged EXEC command. If the ip pim accept-rp command is not configured on any device, this problem can be addressed later. In those routers or multilayer switches already configured with the ip pim accept-rp command, you must enter the command again to accept the newly advertised RP.

To accept all RPs advertised with Auto-RP and reject all other RPs by default, use the ip pim accept-rp auto-rp global configuration command. This procedure is optional.

If all interfaces are in sparse mode, use a default-configured RP to support the two well-known groups 224.0.1.39 and 224.0.1.40. Auto-RP uses these two well-known groups to collect and distribute RP-mapping information. When this is the case and the ip pim accept-rp auto-rp command is configured, another ip pim accept-rp command accepting the RP must be configured as follows:

Switch(config)# ip pim accept-rp 172.10.20.1 1
Switch(config)# access-list 1 permit 224.0.1.39
Switch(config)# access-list 1 permit 224.0.1.40

Filtering Incoming RP Announcement Messages

You can add configuration commands to the mapping agents to prevent a maliciously configured router from masquerading as a candidate RP and causing problems.

This example shows a sample configuration on an Auto-RP mapping agent that is used to prevent candidate RP announcements from being accepted from unauthorized candidate RPs:

Switch(config)# ip pim rp-announce-filter rp-list 10 group-list 20
Switch(config)# access-list 10 permit host 172.16.5.1
Switch(config)# access-list 10 permit host 172.16.2.1
Switch(config)# access-list 20 deny 239.0.0.0 0.255.255.255
Switch(config)# access-list 20 permit 224.0.0.0 15.255.255.255

In this example, the mapping agent accepts candidate RP announcements from only two devices, 172.16.5.1 and 172.16.2.1. The mapping agent accepts candidate RP announcements from these two devices only for multicast groups that fall in the group range of 224.0.0.0 to 239.255.255.255. The mapping agent does not accept candidate RP announcements from any other devices in the network. Furthermore, the mapping agent does not accept candidate RP announcements from 172.16.5.1 or 172.16.2.1 if the announcements are for any groups in the 239.0.0.0 through 239.255.255.255 range. This range is the administratively scoped address range.
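
How a group-list ACL with wildcard masks decides which groups are accepted can be checked with a small Python sketch. It models only standard ACL matching (first match wins, implicit deny at the end); the function names are invented for this example. Note that a deny entry for 239.0.0.0 covers the whole administratively scoped range only with the wildcard 0.255.255.255; the wildcard 0.0.255.255 would cover just 239.0.0.0 through 239.0.255.255.

```python
import ipaddress


def wildcard_match(address, base, wildcard):
    """Return True if address matches base under an IOS-style wildcard mask.

    Bits set to 1 in the wildcard are ignored; bits set to 0 must match.
    """
    addr = int(ipaddress.IPv4Address(address))
    ref = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care) == (ref & care)


def acl_permits(entries, address):
    """Evaluate (action, base, wildcard) entries top-down; first match wins."""
    for action, base, wildcard in entries:
        if wildcard_match(address, base, wildcard):
            return action == "permit"
    return False  # implicit deny when nothing matches


group_list = [
    ("deny", "239.0.0.0", "0.255.255.255"),
    ("permit", "224.0.0.0", "15.255.255.255"),
]
print(acl_permits(group_list, "224.1.1.1"))    # True
print(acl_permits(group_list, "239.200.1.1"))  # False

# A narrower wildcard covers only 239.0.0.0 through 239.0.255.255.
print(wildcard_match("239.200.1.1", "239.0.0.0", "0.0.255.255"))  # False
```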

TO FINISH:

IGMP Snooping and MVR, Configuration commands on QoS for 3550,

TO CHECK:

Double-tagging prevention, Private VLAN configuration GUIDES!!!

Layer2/3 etherchannel configuration, VRRP, SLB (Server Load Balancing)

VACLs on 6500, expedite queue on 3550 value – 0 or 1,

Kerberos, SCP/SFTP, HTTPS server, DNS server, SSL, CA, NTP, reverse Telnet,

Ingress, egress, NBAR, IGMPv3,

VLAN filtering on trunk ports – only RX direction?

ROI – Return on Investment

Modular QoS:

http://www.cisco.com/univercd/cc/td/doc/product/software/ios122/122cgcr/fqos_c/fqcprt8/qcfmdcli.htm#89799

QoS Classification and Policing Using CAR

Configuring Low Latency Queuing (LLQ)

Configuring Link Fragmentation and Interleaving

QoS Compressed Real Time Protocol


