Architecture Tutorial
Alan Goodrum
Chairman, PCI-X Workgroup
Staff Fellow, Compaq Computer Corporation
May 23, 2000
Applied Computing Conference 2000

Purpose of this tutorial
• This tutorial is
  – Introduction to PCI-X: key features, key benefits
  – Aimed primarily at digital designers
  – Limited by time
• This tutorial is NOT
  – Detailed study of every protocol feature
  – Detailed study of electrical requirements
  – Detailed study of bridge requirements
  – Detailed study of new configuration registers

Agenda
• Key Features
• Card and System Interoperability
• Protocol
• Software Aspects
• Electricals
• Performance
• Summary

Key PCI-X Features

I/O Bandwidth vs. Time
[Chart: bandwidth (MB/s, log scale from 1 to 10,000) versus year (1986 to 2002, with "Today 2000" marked). Bus labels: ISA, EISA, 6 MHz 16-bit, 10 MHz 32-bit, 33 MHz 32/64-bit, 66 MHz 32/64-bit, 66/133 MHz 32/64-bit. I/O demand labels: Ethernet (1 Gbit/s, 10 Gbit/s), SCSI, Internet backbone (T3, OC 192).]

Key PCI-X Features
• PCI-X Systems
  – 32- or 64-bit
  – 3.3 Volt I/O
  – Trade off speed for slots: 1 slot @ 133 MHz; 2 slots @ 100 MHz; 4 slots @ 66 MHz
• PCI-X Devices
  – 32- or 64-bit
  – 66, 133 MHz
  – 3.3 Volt I/O or Universal
• Bus runs in PCI-X or conventional mode (similar to 33/66 MHz modes in PCI 2.2)
  – PCIXCAP mode pin similar to M66EN
• Integrates well with emerging switched fabric protocols like InfiniBand

Key PCI-X Features - cont’d
• Attribute phase for each transaction
  – Byte count
  – Initiator ID
  – Handling instructions
  – Tag
• More intelligent use of wait states
  – Only target initial wait states supported
• Standard block size movements (I/O cache line)
  – Fixed transaction disconnection points on 128-byte boundaries
• Split Transactions replace Delayed Transactions
• Makes multi-threaded operation practical
• Electrical design for PCI-X easier than conventional 66 MHz

Key PCI-X Features - cont’d
• Relaxed Transaction Ordering
  – Optional function for removing unnecessary blocking cases
• Support for non-cache-coherent transaction
• New configuration registers accessed via Capabilities data structure
  – No S/W initialization required
  – Default values always functional
• Improved error handling
  – Allows cards an increased range of options for handling data parity errors

Building on the Foundation of PCI 2.1 and 2.2
• Compatibility mode for 33 MHz PCI 2.2 (3.3v)
  – PCI-X systems can accept standard PCI cards
  – PCI-X cards can work in current PCI systems
• Requires no device driver or OS modification
• New features designed for easy migration
• Similarity to 2.1/2.2 protects development infrastructure
• Required support for Message-Signaled Interrupts and PCI Power Management (D0 & D3)
• Designed to support PCI Hot-Plug (new Hot-Plug System Driver)

PCI-X System Flexibility -- Speed vs. Slot Tradeoff

Bus Width   Bus Frequency   Bus Bandwidth   PCI Slots   PCI-X Slots
32-bit      33 MHz          133 MB/s        pictured    N/A
64-bit      66 MHz          533 MB/s        pictured    pictured
64-bit      100 MHz         800 MB/s        N/A         pictured
64-bit      133 MHz         1066 MB/s       N/A         pictured

• 33 MHz per bus segment is the most common implementation of PCI today.
• PCI-X doubles the number of slots per bus segment at 66 MHz.
• PCI-X at 100 MHz provides enterprise-class I/O bandwidth.
• PCI-X at 133 MHz is the first interconnect to exceed 1 Gbyte/s.
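
The bandwidth column can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming one full-width transfer per clock and the exact clock rates behind the nominal names (33 1/3, 66 2/3, 100, and 133 1/3 MHz). The names `NOMINAL_MHZ` and `peak_bandwidth_mb_s` are illustrative only, and results are truncated to match the figures quoted on the slides.

```python
from fractions import Fraction

# Exact clock rates (MHz) assumed for the nominal bus speeds.
NOMINAL_MHZ = {
    33: Fraction(100, 3),
    66: Fraction(200, 3),
    100: Fraction(100),
    133: Fraction(400, 3),
}


def peak_bandwidth_mb_s(bus_width_bits: int, nominal_mhz: int) -> int:
    """Peak rate in MB/s, assuming one bus-width transfer on every clock."""
    bytes_per_clock = bus_width_bits // 8
    # int() truncates the fraction, which is how the slide figures are rounded.
    return int(bytes_per_clock * NOMINAL_MHZ[nominal_mhz])


if __name__ == "__main__":
    for width, mhz in [(32, 33), (64, 66), (64, 100), (64, 133)]:
        print(f"{width}-bit @ {mhz} MHz: {peak_bandwidth_mb_s(width, mhz)} MB/s")
```

These are peak numbers; wait states, arbitration, and the address and attribute phases all reduce the throughput a real adapter sees.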

PCI-X System Flexibility -- Hierarchical Structure
[Figure: a chipset on a 133-MHz PCI-X bus with three PCI-X to PCI-X bridges and bus segments #1 through #4. Segment labels: 64-bit 66-MHz (533 MB/s), 64-bit 100-MHz (800 MB/s), 64-bit 133-MHz (1066 MB/s).]
• PCI-X bridges allow PCI-X to link up to 256 separate bus segments.
• Smarter protocol makes high-performance PCI-X bridges practical.

Card and System Interoperability

The Million-Dollar Question
• What is PCI’s I/O voltage migration plan?
• The correct answer is...
  a. All 3.3V PCI cards must be 5V tolerant.
  b. Universal slots accept both 5V and 3.3V keyed cards.
  c. PCI slots have been mostly 3.3V I/O for years.
  d. Universal cards plug into both 5V and 3.3V keyed slots.

Hardware Compatibility -- PCI 5v to 3.3v I/O Migration Story
Universal PCI adapter cards are keyed for both 5v and 3.3v slots.
[Figure: a 5v keyed slot and adapter card and a 3.3v keyed slot and adapter card, each shown with the 64-bit extension.]

Adapter Card Selection
• Cost-sensitive 32-bit cards
  – PCI-X cards will work in current PCI systems just like 66 MHz conventional cards
  – Conventional speeds and bandwidth: 33 MHz, 133 MB/sec; 66 MHz, 266 MB/sec (optional)
  – PCI-X speeds and bandwidth: 66 MHz, 266 MB/sec; 133 MHz, 533 MB/sec (optional)
  – 3.3v or Universal
• High-performance 64-bit cards
  – PCI-X cards will work in current PCI systems just like 66 MHz conventional cards
  – Conventional speeds and bandwidth: 33 MHz, 266 MB/sec; 66 MHz, 533 MB/sec (optional)
  – PCI-X speeds and bandwidth: 66 MHz, 533 MB/sec; 133 MHz, 1066 MB/sec (optional)
  – 3.3v or Universal
[Figure: card edge connector with the 64-bit extension.]

PCI-X System Configuration
• PCI-X slots accept both PCI and PCI-X adapters
• For best performance, group cards by speed and type (PCI-X or PCI) per bus segment:
  – Conventional 33 MHz cards
  – Conventional 66 MHz card
  – PCI-X 66 MHz cards
  – PCI-X 133 MHz card
[Figure: four CPUs and memory attached to a memory controller, with PCI-X host bridges.]

Interoperability Matrix

Each entry is the nominal clock frequency in MHz at which the system and card operate together. Card columns, left to right:
  (1) conventional PCI card, 33 MHz (5V I/O)
  (2) conventional PCI card, 33 MHz (3.3V I/O or Universal)
  (3) conventional PCI card, 66 MHz (3.3V I/O or Universal)
  (4) PCI-X card, 66 MHz (3.3V I/O or Universal)
  (5) PCI-X card, 133 MHz (3.3V I/O or Universal)

System                           (1)   (2)            (3)              (4)              (5)
Conventional, 33 MHz (5V I/O)    33    33 (5V I/O)    33 (5V I/O)      33 (5V I/O)      33 (5V I/O)
Conventional, 66 MHz                   33             66               a) 33 [1] b) 66  a) 33 [1] b) 66
PCI-X, 66 MHz                          33             33               66               66
PCI-X, 100 MHz                         33             a) 33 [1] b) 66  66               100
PCI-X, 133 MHz                         33             a) 33 [1] b) 66  66               133

Legend
• A PCI-X system with a PCI-X expansion card operates in PCI-X mode.
• Every other combination is a conventional PCI system or expansion card operating in conventional mode.
• The original slide also highlights the most popular cases.

Note:
1. PCI-X systems and devices must support conventional 33 MHz timing, and may optionally support conventional 66 MHz timing.

System Initialization & Interoperability
• Device and Expansion Card Requirement
  – PCI-X expansion card identifies PCI-X capability via PCIXCAP pin (open = PCI-X 133, pulldown = PCI-X 66, GND = conv. PCI)
  – PCI-X devices enter PCI-X mode when RST# deasserts with TRDY#, STOP#, or DEVSEL# asserted on an idle bus
• System Requirements (M66EN & PCIXCAP)
  – The system is required to determine the proper operating mode for the bus and to apply the appropriate PCI-X initialization pattern to the bus before the rising edge of RST#
• Mode and Frequency Initialization Sequence
  – Bus mode (conventional PCI or PCI-X)
  – If PCI-X: “133”, “100”, “66” modes
  – If conventional PCI: “66”, “33” modes
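
The mode decision described above can be sketched as a small function. This is a simplified illustration, not the procedure from the PCI-X specification: the names `PcixCap` and `select_bus_mode`, the `(PcixCap, m66en)` card tuples, and the treatment of an empty bus are all assumptions. It only encodes the rule that the bus runs at the capability of its least capable card, bounded by what the system supports.

```python
from enum import Enum


class PcixCap(Enum):
    """PCIXCAP pin states from the slide, valued by maximum PCI-X frequency."""
    GROUND = 0      # conventional PCI card
    PULLDOWN = 66   # PCI-X card, up to 66 MHz
    OPEN = 133      # PCI-X card, up to 133 MHz


def select_bus_mode(cards, system_max_pcix_mhz=133, system_supports_conv66=False):
    """Pick (mode, MHz) for one bus segment.

    cards is a list of (PcixCap, m66en) tuples, one per installed card.
    An empty segment is assumed to come up in PCI-X mode at the system maximum.
    """
    if all(cap is not PcixCap.GROUND for cap, _ in cards):
        slowest_card = min((cap.value for cap, _ in cards), default=133)
        return ("PCI-X", min(slowest_card, system_max_pcix_mhz))
    # At least one conventional card: the whole segment runs conventional PCI.
    if system_supports_conv66 and all(m66en for _, m66en in cards):
        return ("conventional", 66)
    return ("conventional", 33)


if __name__ == "__main__":
    print(select_bus_mode([(PcixCap.OPEN, True), (PcixCap.PULLDOWN, True)]))
    print(select_bus_mode([(PcixCap.OPEN, True), (PcixCap.GROUND, False)]))
```

A single conventional card drops the whole segment to conventional mode, which is why the tutorial recommends grouping cards by speed and type per bus segment.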

Reset & Initialization Sequence
[Timing diagram: PCI_CLK, RST#, FRAME#, IRDY#, TRDY#, STOP#, and DEVSEL# around the rising edge of RST#, with timing parameters trlcx, trst_clk (ref), tprsu, tprh, trst (ref), and trhff (ref).]
[Logic diagram: a PCI-X initialization pattern decode of TRDY#, STOP#, DEVSEL#, FRAME#, and IRDY# drives the D input of a transparent latch enabled by RST#; the latch output Q is PCI-X_mode_en.]
• Conventional vs. PCI-X mode selected at rising edge of RST#
• PCI-X bus frequency range encoded at rising edge of RST#

PCI-X Protocol

PCI 2.2 vs. PCI-X Comparison
• Write transaction, same device select timing, same wait states, 6 data phases
• Conventional PCI bus requires 9 clocks, PCI-X bus requires 10 clocks
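
A quick worked example puts the one extra clock in perspective. This sketch only multiplies the clock counts above by the clock periods quoted elsewhere in this tutorial (30 ns at 33 MHz, 15 ns at 66 MHz, 7.5 ns at 133 MHz); the names `PERIOD_NS` and `transaction_time_ns` are illustrative.

```python
# Clock periods in nanoseconds, as quoted in this tutorial.
PERIOD_NS = {
    "PCI 33": 30.0,
    "PCI 66": 15.0,
    "PCI-X 66": 15.0,
    "PCI-X 133": 7.5,
}


def transaction_time_ns(clocks: int, bus: str) -> float:
    """Wall-clock duration of a transaction that occupies `clocks` bus clocks."""
    return clocks * PERIOD_NS[bus]


if __name__ == "__main__":
    # The six-data-phase write from the comparison: 9 clocks on PCI, 10 on PCI-X.
    print("PCI 66    :", transaction_time_ns(9, "PCI 66"), "ns")
    print("PCI-X 66  :", transaction_time_ns(10, "PCI-X 66"), "ns")
    print("PCI-X 133 :", transaction_time_ns(10, "PCI-X 133"), "ns")
```

At the same 66 MHz clock the attribute phase costs one period for this short transfer; at 133 MHz the same write finishes in a little over half the conventional 66 MHz time.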
[Timing diagrams, clocks 1 to 12, for PCI 2.2 and PCI-X: the same six-data-phase write shown on PCI_CLK, AD, C/BE#, FRAME#, IRDY#, TRDY#, and DEVSEL#. AD carries ADDRESS then DATA-0 to DATA-5 and C/BE# carries BUS CMD then BE#'s-0 to BE#'s-5; the PCI-X waveform adds an ATTR phase after the address phase.]

Transaction Phases
[Timing diagram, clocks 1 to 12: a PCI-X burst write with six data transfers on PCI_CLK, AD, C/BE#, FRAME#, IRDY#, TRDY#, and DEVSEL#, annotated with the Address Phase, Attribute Phase, Target Response Phase, six Data Phases, Initiator Termination, and Turn Around.]

Behind the Curtain
How can the bus be faster but the timing easier?
• Register-to-register design allows maximum flight time
[Figure: block diagrams of the PCI 2.1/2.2 and PCI-X I/O paths, showing logic, registers (Reg), I/O buffers with boundary scan, and the PCI clock.]

Behind the Curtain
[Timing diagrams for PCI clock at 33 MHz, PCI clock at 66 MHz, and the PCI-X clock. In each, the sender asserts a signal, the signal propagates across the bus, the receiver decodes logic, and the receiver responds; in PCI-X the receiver registers the signal before decoding it.]
• PCI @ 33 MHz
  – 30 ns period
  – 7 ns setup time
• PCI @ 66 MHz
  – 15 ns period
  – 3 ns setup time
• PCI-X registered protocol allocates a full clock period for logic decision
  – @ 66 MHz: 15 ns
  – @ 133 MHz: 7.5 ns

PCI-X Terms
• Allowable Disconnect Boundary (ADB): The initiator and target are permitted to disconnect byte-count transactions only on naturally aligned 128-byte boundaries.
• Requester: Initiator that first introduces a transaction into the PCI-X domain.
• Completer: The device addressed by a transaction (other than a Split Completion).
• Sequence: One or more transactions associated with carrying out a single logical transfer by a requester.
• Attributes: Byte count, requester or completer ID, bus number, sequence number, and other transaction handling instructions.
• Split Transaction: A single logical transfer containing an initial transaction (the Split Request) that the target (the completer) terminates with Split Response, followed by one or more transactions (the Split Completions) initiated by the completer to send the read data (if a read) or a completion message back to the requester.
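
To make the Allowable Disconnect Boundary concrete, the sketch below lists the addresses at which a byte-count transaction may be disconnected. It assumes only what the definition states, naturally aligned 128-byte boundaries; the names `ADB_SIZE` and `adb_crossings` are illustrative.

```python
from typing import List

ADB_SIZE = 128  # bytes between Allowable Disconnect Boundaries


def adb_crossings(start_addr: int, byte_count: int) -> List[int]:
    """Addresses strictly inside the transfer where a disconnect is permitted."""
    end_addr = start_addr + byte_count               # first byte past the transfer
    first_adb = (start_addr // ADB_SIZE + 1) * ADB_SIZE
    return list(range(first_adb, end_addr, ADB_SIZE))


if __name__ == "__main__":
    # A 300-byte transfer starting at 0x1050 crosses two boundaries.
    print([hex(a) for a in adb_crossings(0x1050, 300)])
```

A transfer that fits between two boundaries returns an empty list: once started, it can only end at its byte count.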

PCI-X Peer Transaction Flow
[Flow diagram, PCI-X mode: a requester (e.g. adapter card) and a completer (e.g. host bridge), each with an initiator interface and a target interface. The requester starts a sequence (I/O, CFG, Int. Ack., memory, Special Cycle). The completer answers with an Immediate Response (posted memory writes, I/O, CFG, Int. Ack, Mem Read), a Split Response (I/O, CFG, Int. Ack, Mem Read), a Retry Response (transaction rescheduled), or an error termination (sequence ends). After a Split Response the completer acts as Split Completion initiator toward the requester's target interface.]

PCI-X Features
• AD bus specifies starting byte address (including AD[2:0])
• Byte Enable bus is reserved (driven high) for all transactions except Memory Write
• Wait states are not allowed, except target initial wait states, which always come in pairs for Memory Write and Split Completion
• A PCI-X “DWORD transaction” is like a conventional PCI “single data phase” transaction (Config, I/O, Special Cycle)
• DEVSEL# decode speed:

Decode Speed             PCI-X           Conventional PCI
1 clock after address    Not Supported   Fast
2 clocks after address   Decode A        Medium
3 clocks after address   Decode B        Slow
4 clocks after address   Decode C        SUB
6 clocks after address   SUB             N/A

PCI-X Burst and DWORD Transactions

Burst Transactions
• Commands: Memory Read Block, Memory Write Block, Memory Write, Alias to Memory Read Block, Alias to Memory Write Block, Split Completion
• 64- or 32-bit data transfers.
• Starting address specified on AD bus down to a byte address (includes all AD bus).
• Supports one or more data phases and always in address order.
• During the data phases the C/BE# bus is reserved and driven high by the initiator for all transactions except Memory Write.
• The C/BE# bus contains valid byte enables for Memory Write transactions. Any byte enable pattern is permitted (between the starting and ending address, inclusive), including no byte enables asserted.

DWORD Transactions
• Commands: Interrupt Acknowledge, Special Cycle, I/O Read, I/O Write, Configuration Read, Configuration Write, Memory Read DWORD
• 32-bit data transfers only.
• Starting address specified on AD bus down to a byte address (includes all AD bus), except for configuration transactions, which are DWORD aligned (AD[1:0] indicate configuration transaction type).
• Supports only single data phase.
• During the attribute phase the Requester Attributes contain valid byte enables. Any byte enable pattern is permitted, including no byte enables asserted.
• During the data phase the C/BE# bus is reserved and driven high by the initiator.

PCI-X Command Encoding

C/BE[3:0]# or C/BE[7:4]#   Conventional PCI Command (reference)   PCI-X Command                 Length
0000b                      Interrupt Acknowledge                  Interrupt Acknowledge         DWORD
0001b                      Special Cycles                         Special Cycles                DWORD
0010b                      I/O Read                               I/O Read                      DWORD
0011b                      I/O Write                              I/O Write                     DWORD
0100b                      Reserved                               Reserved                      n/a
0101b                      Reserved                               Reserved                      n/a
0110b                      Memory Read                            Memory Read DWORD             DWORD
0111b                      Memory Write                           Memory Write                  Burst
1000b                      Reserved                               Alias to Memory Read Block    Burst
1001b                      Reserved                               Alias to Memory Write Block   Burst
1010b                      Configuration Read                     Configuration Read            DWORD
1011b                      Configuration Write                    Configuration Write           DWORD
1100b                      Memory Read Multiple                   Split Completion              Burst
1101b                      Dual Address Cycle                     Dual Address Cycle            n/a
1110b                      Memory Read Line                       Memory Read Block             Burst
1111b                      Memory Write and Invalidate            Memory Write Block            Burst
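
The encoding table maps directly onto a lookup. The sketch below is one way a bus monitor or simulation model might decode the command nibble; the dictionary contents come from the table, while the names `PCIX_COMMANDS` and `decode_command` are illustrative.

```python
# C/BE[3:0]# value during the address phase -> (PCI-X command, length class).
PCIX_COMMANDS = {
    0b0000: ("Interrupt Acknowledge", "DWORD"),
    0b0001: ("Special Cycles", "DWORD"),
    0b0010: ("I/O Read", "DWORD"),
    0b0011: ("I/O Write", "DWORD"),
    0b0100: ("Reserved", "n/a"),
    0b0101: ("Reserved", "n/a"),
    0b0110: ("Memory Read DWORD", "DWORD"),
    0b0111: ("Memory Write", "Burst"),
    0b1000: ("Alias to Memory Read Block", "Burst"),
    0b1001: ("Alias to Memory Write Block", "Burst"),
    0b1010: ("Configuration Read", "DWORD"),
    0b1011: ("Configuration Write", "DWORD"),
    0b1100: ("Split Completion", "Burst"),
    0b1101: ("Dual Address Cycle", "n/a"),
    0b1110: ("Memory Read Block", "Burst"),
    0b1111: ("Memory Write Block", "Burst"),
}


def decode_command(cbe: int):
    """Return (command name, length class) for a 4-bit command value."""
    return PCIX_COMMANDS[cbe & 0xF]


def is_burst(cbe: int) -> bool:
    """True when the command is a burst (byte-count) transaction."""
    return decode_command(cbe)[1] == "Burst"


if __name__ == "__main__":
    for value in (0b0110, 0b1100, 0b1110):
        print(f"{value:04b}b ->", decode_command(value))
```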

DWORD Transactions

DWORD Write Transaction with No Wait States
[Timing diagram, clocks 1 to 10: PCI_CLK, AD[31::00] (ADDRESS, ATTR, DATA-0), C/BE[3::0]# (BUS CMD, ATTR, BE#'s = Fh), FRAME#, IRDY#, TRDY#, STOP#, DEVSEL#.]
Notice the initiator continues driving the bus, and IRDY# remains asserted in clock 7, even though this is one clock past the single clock in which data was transferred (clock 6). In PCI-X the initiator requires two clocks to respond to the assertion of TRDY#.

DWORD Read Transaction with Two Target Initial Wait States
[Timing diagram, clocks 1 to 12: PCI_CLK, AD[31::00] (ADDRESS, ATTR, DATA-0), C/BE[3::0]# (BUS CMD, ATTR, BE#'s = Fh), FRAME#, IRDY#, TRDY#, DEVSEL#.]
Notice the BE# bus is reserved and driven high, because the byte enables are in the requester attributes.

PCI-X Protocol--Attributes and Split Transactions

Transaction Attributes

Requester Attributes for Burst Transactions (most memory read/write)
  C/BE[3:0]# (bits 35:32)   Upper Byte Count
  AD[31]                    R (reserved)
  AD[30]                    NS (No Snoop)
  AD[29]                    RO (Relax ordering)
  AD[28:24]                 Tag
  AD[23:16]                 Requester Bus Number
  AD[15:11]                 Requester Device Number
  AD[10:8]                  Requester Function Number
  AD[7:0]                   Lower Byte Count

Requester Attributes for DWORD Transactions (Memory Read DWORD, I/O read/write, Special Cycle, Int Ack)
  C/BE[3:0]# (bits 35:32)   Byte Enables
  AD[31]                    R (reserved)
  AD[30]                    NS (No Snoop)
  AD[29]                    RO (Relax ordering)
  AD[28:24]                 Tag
  AD[23:16]                 Requester Bus Number
  AD[15:11]                 Requester Device Number
  AD[10:8]                  Requester Function Number
  AD[7:0]                   Reserved
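
The burst requester attribute layout can be exercised with a small pack/unpack pair. This is a sketch under stated assumptions: the bit positions follow the field boundaries marked on the slide (bits 35:32 on C/BE[3:0]#, bits 31:0 on AD), and the convention that a 12-bit byte count of 0 stands for 4096 bytes is an assumption drawn from the 4K maximum byte count quoted later in the deck. The function names are illustrative.

```python
def pack_burst_attributes(byte_count, bus, device, function, tag,
                          relaxed_ordering=False, no_snoop=False):
    """Return (cbe, ad) for the attribute phase of a burst transaction."""
    if not 1 <= byte_count <= 4096:
        raise ValueError("byte count must be 1..4096")
    count = byte_count % 4096          # assumption: 4096 is encoded as 0
    ad = (
        (count & 0xFF)                 # AD[7:0]   lower byte count
        | ((function & 0x7) << 8)      # AD[10:8]  requester function number
        | ((device & 0x1F) << 11)      # AD[15:11] requester device number
        | ((bus & 0xFF) << 16)         # AD[23:16] requester bus number
        | ((tag & 0x1F) << 24)         # AD[28:24] tag
        | (int(relaxed_ordering) << 29)  # AD[29]  RO
        | (int(no_snoop) << 30)        # AD[30]    NS
    )
    cbe = (count >> 8) & 0xF           # C/BE[3:0]# upper byte count
    return cbe, ad


def unpack_burst_attributes(cbe, ad):
    """Inverse of pack_burst_attributes; returns a dict of fields."""
    count = ((cbe & 0xF) << 8) | (ad & 0xFF)
    return {
        "byte_count": count or 4096,
        "function": (ad >> 8) & 0x7,
        "device": (ad >> 11) & 0x1F,
        "bus": (ad >> 16) & 0xFF,
        "tag": (ad >> 24) & 0x1F,
        "relaxed_ordering": bool((ad >> 29) & 1),
        "no_snoop": bool((ad >> 30) & 1),
    }


if __name__ == "__main__":
    cbe, ad = pack_burst_attributes(300, bus=2, device=5, function=1, tag=9,
                                    relaxed_ordering=True)
    print(hex(cbe), hex(ad), unpack_burst_attributes(cbe, ad))
```

The requester bus, device, and function numbers together with the tag identify the sequence, which is what lets a completer address a later Split Completion back to the right requester.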

Transaction Attributes

Split Completion Address
  C/BE[3:0]#   BUS CMD
  AD[31:30]    R (reserved)
  AD[29]       RO (Relax ordering)
  AD[28:24]    Tag
  AD[23:16]    Requester Bus Number
  AD[15:11]    Requester Device Number
  AD[10:8]     Requester Function Number
  AD[7]        R (reserved)
  AD[6:0]      Lower Address [6:0]

Completer Attributes
  C/BE[3:0]#   Upper Byte Count
  AD[31]       BCM (Byte Count Modified)
  AD[30]       SCE (Split Completion Error)
  AD[29]       SCM (Split Completion Message)
  AD[28:24]    R (reserved)
  AD[23:16]    Completer Bus Number
  AD[15:11]    Completer Device Number
  AD[10:8]     Completer Function Number
  AD[7:0]      Lower Byte Count

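Split Transactions
[Sequence diagram on the PCI-X bus between Requester A (the Split Transaction requester) and Completer B (the Split Transaction completer), each with initiator and target interfaces. Requester A arbitrates (REQ/GNT) and drives the address, Memory Read command, and requester attributes. Completer B answers with an Immediate Response (data) or a Split Response. For a Split Response, Completer B later arbitrates (REQ/GNT) and sends a Split Completion carrying the requester's attributes, the completer attributes, and the data.]
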
PCI-X Protocol--Configuration Transactions

Config Address & Attributes

Configuration Attributes
  C/BE[3:0]# (bits 35:32)   Byte Enables
  AD[31:29]                 R (reserved)
  AD[28:24]                 Tag
  AD[23:16]                 Requester Bus Number
  AD[15:11]                 Requester Device Number
  AD[10:8]                  Requester Function Number
  AD[7:0]                   Secondary Bus Number

PCI-X Type 1 to Type 0 Configuration Address
  Type 1:  C/BE[3:0]# = BUS CMD; AD[31:24] Reserved; AD[23:16] Bus Number; AD[15:11] Dev Number; AD[10:8] Func Number; AD[7:2] Register Number; AD[1:0] = 01
  Type 0:  C/BE[3:0]# = BUS CMD; AD[31:16] Reserved; AD[15:11] Dev Number; AD[10:8] Func Number; AD[7:2] Register Number; AD[1:0] = 00

PCI 2.2 Type 1 to Type 0 Configuration Address (ref)
  Type 1:  C/BE[3:0]# = BUS CMD; AD[31:24] Reserved; AD[23:16] Bus Number; AD[15:11] Dev Number; AD[10:8] Func Number; AD[7:2] Register Number; AD[1:0] = 01
  Type 0:  C/BE[3:0]# = BUS CMD; AD[31:11] See PCI 2.2 Specification; AD[10:8] Func Number; AD[7:2] Register Number; AD[1:0] = 00

Configuration Transactions
4 clocks of valid address before asserting FRAME#
[Timing diagram, clocks 1 to 14: PCI_CLK, IDSEL, AD[31:0] (ADDRESS, ATTR, DATA-0), C/BE[3:0]# (BUS CMD, ATTR, BE#'s = Fh), FRAME#, IRDY#, TRDY#, DEVSEL#.]

PCI-X Protocol--PCI-X Bridges

PCI-X Bridge Design: “This is not your father’s bridge”
• Split Transactions make consistently high-performance bridges possible
  – No speculative prefetch penalty
  – Forwards read completion data similar to posted memory writes
• Converts between PCI-X and conventional PCI, if necessary
• Frequency and mode of secondary bus independent of primary bus
• Transaction ordering rules similar to conventional PCI
• Performance-tuning registers to enable adjustment of request rate to match completion rate

PCI-X Bridge Transaction Flow
[Flow diagram: Bridge X sits between a requester (e.g. adapter card) and a completer (e.g. host bridge), with initiator and target interfaces on both its primary and secondary buses. The bridge forwards sequences (I/O, CFG, Int. Ack., Mem Read, posted memory writes) from one bus to the other and forwards completions (completion data, read SCM, write ack, error status) back. On each bus the possible target responses are Immediate Response, Split Response, Retry Response (transaction rescheduled), and error termination (sequence ends). The bridge can also act as Split Completion Exception Message initiator.]

PCI-X Bridge Design--Summary
• Great system performance
• More complex than conventional PCI bridges
• See the PCI-X spec for full details

PCI-X Protocol--Parity Generation and Checking

Burst Write Parity Operation
[Timing diagram, clocks 1 to 14: Burst Write or Split Completion Transaction Parity Operation. Signals: PCI_CLK, AD[31:0], C/BE[3:0]#, PAR, PERR#, FRAME#, IRDY#, TRDY#, DEVSEL#, across two back-to-back transactions.]

Burst Read Parity Operation
[Timing diagram, clocks 1 to 14: Burst Read Transaction (Immediate Completion) Parity Operation. Signals: PCI_CLK, AD, C/BE#, PAR, PERR#, FRAME#, IRDY#, TRDY#, DEVSEL#.]

DWORD Read Parity Operation
[Timing diagram, clocks 1 to 12: DWORD Read Parity Operation with Decode A and No Initial Wait States. Signals: PCI_CLK, AD[31::00], C/BE[3::0]#, PAR, PERR#, FRAME#, IRDY#, TRDY#, DEVSEL#.]

DWORD Read Parity Operation
[Timing diagram, clocks 1 to 13: DWORD Read Parity Operation with Decode B and No Initial Wait States. Signals: PCI_CLK, AD[31::00], C/BE[3::0]#, PAR, PERR#, FRAME#, IRDY#, TRDY#, DEVSEL#.]

PCI-X Protocol--Exception Handling
• All PCI-X devices must provide one of the following levels of support for data parity error recovery:
  – Notify the device driver of the problem. The driver attempts recovery or resets the device or system.
  – Assert SERR#.
• Split Transaction exception rules
  – Parity errors on Split Completion
  – Master-Abort and Target-Abort messages
• Status registers in configuration space can only be cleared by the system-specific software after logging the exceptions

PCI-X Protocol--Arbitration Rules
• Relevant clock for GNT# one clock earlier than conventional PCI (registered bus)
  – In general, GNT# asserted two clocks prior to start of transaction
  – Initiator permitted to start transaction one clock after GNT# deasserted
  – If the arbiter deasserts GNT# to one device, it cannot assert GNT# to another device until the next clock
• Fair opportunities for all devices to execute configuration transactions
• In PCI Hot-Plug systems the arbiter must coordinate with the Hot-Plug Controller
• The default Latency Timer value for initiators in PCI-X mode is 31

PCI Hot Plug Support
• Hardware Impact
  – The Hot-Plug Controller provides the Hot-Plug System Driver with the means to check the PCIXCAP pin to identify PCI-X adapters
  – The Hot-Plug Controller drives the PCI-X initialization pattern on the bus with the proper timing prior to the rising edge of RST# for that slot
  – The Hot-Plug Controller coordinates with the arbiter for bus ownership during hot insertion
  – PCI-X devices ignore TRDY#, STOP# and DEVSEL# on an idle bus
• Software Impact
  – The Hot-Plug System Driver must read the inserted card’s M66EN and PCIXCAP pins to ensure that the inserted adapter supports the bus frequency and operating mode of the bus

Software Aspects of PCI-X

Software Compatibility
• No OS or driver change required
  – New config registers default to functional values
  – Optional performance tuning registers
  – Other config registers unchanged
  – No device programming model changes required
• Optional improved error handling
  – Enables smart device and new driver to recover from PERR# event
• Updated Hot-Plug System Driver

Configuration Space
• PCI-X devices use the standard PCI config header
• New PCI-X registers use the Capability List
• The PCI-X list item includes
  – An 8-bit PCI-X Capability ID (standard reg)
  – An 8-bit pointer to next list item (standard reg)
  – A 32-bit PCI-X Command -- controls various modes and features of the PCI-X device
  – A 24-bit PCI-X Status -- identifies the capabilities and current operating mode of the device
• PCI-X bridges have different registers
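
Finding the PCI-X list item means walking the standard PCI Capability List. The sketch below does this over an in-memory copy of a device's 256-byte configuration space. The offsets and IDs it uses (Status register at 06h with the Capabilities List bit 10h, Capabilities Pointer at 34h, capability ID 07h for PCI-X, 01h for Power Management) are taken from the PCI specifications rather than from this tutorial, and the function names and the sample layout in `example_config` are hypothetical.

```python
PCI_STATUS = 0x06          # Status register offset in the config header
STATUS_CAP_LIST = 0x10     # Status bit 4: Capabilities List present
CAP_POINTER = 0x34         # Capabilities Pointer offset
PCIX_CAP_ID = 0x07         # Capability ID assigned to PCI-X (assumed, see lead-in)


def find_capability(config: bytes, cap_id: int):
    """Return the config-space offset of the capability, or None if absent."""
    status = int.from_bytes(config[PCI_STATUS:PCI_STATUS + 2], "little")
    if not status & STATUS_CAP_LIST:
        return None
    pointer = config[CAP_POINTER] & 0xFC   # list items are DWORD aligned
    seen = set()                           # guard against a looping list
    while pointer and pointer not in seen:
        seen.add(pointer)
        if config[pointer] == cap_id:      # byte 0 of an item: capability ID
            return pointer
        pointer = config[pointer + 1] & 0xFC   # byte 1: pointer to next item
    return None


def example_config() -> bytes:
    """Hypothetical device: Power Management item at 40h, PCI-X item at 50h."""
    cfg = bytearray(256)
    cfg[PCI_STATUS] = STATUS_CAP_LIST
    cfg[CAP_POINTER] = 0x40
    cfg[0x40], cfg[0x41] = 0x01, 0x50
    cfg[0x50], cfg[0x51] = PCIX_CAP_ID, 0x00
    return bytes(cfg)


if __name__ == "__main__":
    print(hex(find_capability(example_config(), PCIX_CAP_ID)))
```

Because the new registers hang off the existing list mechanism, software that does not know about PCI-X simply skips the item, which is consistent with the "no OS or driver change required" claim.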

Software Summary--“It Just Works”
Up to 8 times I/O throughput improvement without changing your OS
[Figure: operating-system logos, including OpenVMS and 3.x/9x/NT/2000 Server.]

PCI-X Electrical Design

A Word to the Wise
• 33 MHz PCI spec was forgiving
• High-frequency designs are more complex
  – PCI 66, PCI-X 66, PCI-X 133
  – Use high-frequency design techniques and principles
    E.g., Electrical Design Considerations for PCI-X and 66-MHz PCI Cards, http://www.compaq.com/support/techpubs/whitepapers/WhitePapers_Industry_Technology.html, Doc # tc000301tb
  – Simulate system topology
  – Must follow PCI-X spec precisely
• If you don’t understand every word of Chap 9, find someone who does.

Signal Quality Guidelines for PCI-X
• Report of the PCI-X Electrical Subgroup to be available mid-year 2000
• Includes discussion of:
  – I/O buffer design
  – Overshoot
  – Ringback
  – Settling time
  – Inter-symbol interference
  – Secondary effects: ground bounce, cross talk, DC offset
• Summary results of almost 2 million SPICE simulations

Timing Budget

Setup Time Budget (all values in ns)
Parameter     133 MHz PCI-X   100 MHz PCI-X   66 MHz PCI-X   66 MHz Conventional PCI (ref)   33 MHz Conventional PCI (ref)
Tval (max)    3.8             3.8             3.8            6                               11
Tprop (max)   2.0             4.5             9.0            5                               10
Tskew (max)   0.5             0.5             0.5            1                               2
Tsu (max)     1.2             1.2             1.7            3                               7
Tcyc          7.5             10              15             15                              30

Hold Time Budget (all values in ns)
Parameter     PCI-X   66 MHz Conventional PCI (ref)   33 MHz Conventional PCI (ref)
Tval (min)    0.7     2                               2
Tprop (min)   0.3     0                               0
Tskew (max)   0.5     1                               2
Th (min)      0.5     0                               0
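
The two budgets can be checked mechanically. The sketch below assumes the usual synchronous-bus relations, which the slide implies but does not spell out: for setup, Tval + Tprop + Tskew + Tsu must fit within Tcyc; for hold, Tval(min) + Tprop(min) must cover Tskew + Th. The dictionaries hold the values from the tables; the names `SETUP`, `HOLD`, `setup_slack_ns`, and `hold_slack_ns` are illustrative. `Decimal` is used so that sums such as 3.8 + 2.0 + 0.5 + 1.2 compare exactly.

```python
from decimal import Decimal as D

# Setup budget columns: (Tval max, Tprop max, Tskew max, Tsu, Tcyc), all in ns.
SETUP = {
    "PCI-X 133": (D("3.8"), D("2.0"), D("0.5"), D("1.2"), D("7.5")),
    "PCI-X 100": (D("3.8"), D("4.5"), D("0.5"), D("1.2"), D("10")),
    "PCI-X 66": (D("3.8"), D("9.0"), D("0.5"), D("1.7"), D("15")),
    "PCI 66": (D("6"), D("5"), D("1"), D("3"), D("15")),
    "PCI 33": (D("11"), D("10"), D("2"), D("7"), D("30")),
}

# Hold budget columns: (Tval min, Tprop min, Tskew max, Th min), all in ns.
HOLD = {
    "PCI-X": (D("0.7"), D("0.3"), D("0.5"), D("0.5")),
    "PCI 66": (D("2"), D("0"), D("1"), D("0")),
    "PCI 33": (D("2"), D("0"), D("2"), D("0")),
}


def setup_slack_ns(bus: str) -> D:
    """Cycle time left over after output delay, flight time, skew, and setup."""
    tval, tprop, tskew, tsu, tcyc = SETUP[bus]
    return tcyc - (tval + tprop + tskew + tsu)


def hold_slack_ns(bus: str) -> D:
    """Margin between the earliest data change and the required hold window."""
    tval_min, tprop_min, tskew, th = HOLD[bus]
    return tval_min + tprop_min - tskew - th


if __name__ == "__main__":
    for bus in SETUP:
        print(f"{bus:10s} setup slack {setup_slack_ns(bus)} ns")
    for bus in HOLD:
        print(f"{bus:10s} hold slack  {hold_slack_ns(bus)} ns")
```

Every setup column sums exactly to its cycle time, so the budget has no spare margin: the shorter Tval and Tsu in PCI-X are what leave 2.0 ns of propagation time at 133 MHz and 9.0 ns at 66 MHz.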

PCI-X V/I Curves vs. PCI 2.2 V/I Curves
[Two plots, each comparing PCI-X and PCI output buffers: PCI-X Pull-Up Output Buffer V/I Curves and PCI-X Pull-Down Output Buffer V/I Curves. Axes: Vout (volts, as a multiple of Vcc from 0 to 1) versus Iout (mA, ticks at 0, 20, 40, 60, 80).]

PCI-X Mechanical Requirements

PCI-X Mechanical Requirements
• Same card and slot mechanical requirements as conventional PCI (3.3V I/O)
• New labeling requirement (ECR in progress)
  – Systems must identify slot capability (method not specified)
  – Cards must be marked 66 or 133
[Figure: standard-length, short-length (variable height), and low-profile PCI add-in cards, each marked "66".]

PCI-X Performance

PCI-X Demonstration Using Compaq ProLiant 8500 Server
• Cable-less, tool-less design
• Serviceable in minutes
• Reduce training & downtime
• Improved product availability
Key Customer Benefit - Easy configuration, installation, upgrade, and service.

Compaq & Intel Industry Standard 8-Way Architecture
[Block diagram: eight Pentium III Xeon processors on two 100 MHz AGTL+ buses (bus 1 and bus 2) connect to the ProFusion chipset, which has a cache coherency accelerator for each bus, even and odd memory ports to 2-way interleaved SDRAM memory, and a 100 MHz GTL I/O bus. I/O buses are labeled 33 MHz, 33 MHz, and 66 MHz.]
PCI-X vs. conventional PCI protocol comparison with only an I/O subsystem upgrade.

PCI-X Advanced Protocol Gives You More Usable Bandwidth at Any Speed
• PCI 64-bit/33 MHz: 140 MByte/s
• PCI-X 64-bit/33 MHz: 230 MByte/s
• 6 conventional PCI adapters pulling data vs. 2 PCI-X adapters pulling data
With “real hardware” PCI-X is over 60% faster than conventional PCI.
(33 MHz for demo purposes only)
(your mileage may vary)
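
The "over 60%" figure follows directly from the two measured rates. A one-function check, with the illustrative name `percent_faster`:

```python
def percent_faster(new_mb_s: float, old_mb_s: float) -> float:
    """Relative throughput gain of new over old, in percent."""
    return (new_mb_s - old_mb_s) / old_mb_s * 100.0


if __name__ == "__main__":
    # Demo figures from the slide: 230 MByte/s (PCI-X) vs. 140 MByte/s (PCI).
    print(f"{percent_faster(230, 140):.1f}% faster")
```

With both buses held at 64-bit/33 MHz for the demo, the gain comes from the protocol rather than from clock rate.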

PCI-X vs. PCI: 4K-Byte Read Performance
[Chart: throughput (MB/sec, 100 to 700) versus number of requests (1 to 7) for PCI (33 MHz, MRM), PCI (66 MHz, MRM), PCI-X (66 MHz), and PCI-X (100 MHz). Inset: SCSI, Enet, and FC adapters behind a PCI-X bridge, PCI vs. PCI-X.]
66 MHz PCI-X up to 33% faster than 66 MHz PCI 2.2.
Note: Assumes ideal memory controller with 32-byte CPU cache line and ideal 64-bit PCI adapters.

Prognostications for PCI-X
• Rapid deployment in server market beginning 2H2000
  – Server vendors on Workgroup: Compaq, Dell, HP, IBM, Intel
  – Peripheral vendors on Workgroup
    Disk: Adaptec, Mylex (IBM), LSI
    NIC: 3Com, Intel
  – Si and IP vendors on Workgroup: InSilicon (Phoenix), Intel, LSI, ServerWorks
• Migration into workstation, desktop, and embedded markets
  – Replaces conventional PCI as devices move to 3.3V-only in next 2 years
• Continue to dominate local I/O applications even after InfiniBand begins solving new problems in distributed computing applications in 2001 and 2002
• Extend the life of PCI 5-10 years

Summary
• PCI-X compatibility
  – Fully interoperable with conventional PCI
  – Easy design migration from conventional PCI: your conventional PCI experience prepares you for PCI-X
  – Easier electrical design than 66 MHz conventional PCI
  – No OS or driver changes required (only Hot-Plug System Driver)
• PCI-X performance
  – Over 1 Gbyte/s
  – Byte counts, Split Transactions make whole system work smarter
  – PCI-X adapter cards are good citizens: no more bus hogs
  – Efficient P2P bridges for hierarchy of buses
  – Increases system flexibility: 4 slots per bus at 66 MHz
  – Integrates well with distributed I/O standards like InfiniBand

Take away
PCI-X is THE logical next step for your I/O designs
• Hardware and Software Compatibility
• Better Performance
• Need I say more?

For More Information
• PCI SIG: www.pcisig.com
  – PCI-X 1.0 specification: http://www.pcisig.com/members/index.html
  – PCI-X compliance checklist: http://www.pcisig.com/tech/docs.html
  – PCI-X spec errata: http://www.pcisig.com/tech/ecn_ecr.html
  – Technical support: techsupp@pcisig.com
  – General information: pci-x@pcisig.com

For More Information
• Compaq
  – PCI-X enablement site (sample core, docs, Golden Master Program, links to other vendors): www.compaq.com/PCI-X
  – Information: PCI-X@compaq.com
  – Electrical Design Considerations for PCI-X and 66-MHz PCI Cards, http://www.compaq.com/support/techpubs/whitepapers/WhitePapers_Industry_Technology.html, Doc # tc000301tb

Backup

PCI-X Comparison

Feature                      PCI     AGP 1.0       AGP 2.0        PCI-X
PCI slot compatibility       Yes     No            No             Yes
100 MHz bus speed            No      No            No             Yes
133 MHz bus speed            No      66 MHz DDR    66 MHz DDR     Yes
266 MHz bus speed            No      No            66 MHz QDR *   No
Data bus width               32/64   32            32             32/64
Address bus width            32/64   32/36/64      32/47/64       64
Max bus bandwidth (MB/sec)   533     533           1064           1064
Multiple slots               Yes     No            No             Yes
Hierarchical bus topology    Yes     No            No             Yes
Split Transactions           Open    Yes           Yes            Yes
Transaction byte count       No      32 (256 **)   64 (256 **)    4K
Non-coherent transactions    No      Yes           Yes            Yes
No/Relax ordering rules      No      Yes           Yes            Yes
Device & Bus # ID ***        No      No, N/A       No, N/A        Yes

* Quad data rate (4X)
** Special long read command
*** Allows system to be tuned more precisely for optimal performance

Background--The need for a faster PCI bus

Changing Systems Architectures
[Figure: system diagrams showing memory, PCI-32 and PCI-64 buses, Compaq’s Alpha, and a SAN.]
• Current PCI-based systems need a refresh
• Faster CPU requires more I/O bandwidth
• Separating CPU-plexes and I/O by fast thin bus segments
• Emergence of distributed systems architectures, SAN and STAN

I/O Bandwidth vs. Time
[Chart: bandwidth (MB/s, log scale from 1 to 10,000) versus year (1986 to 2002). Bus labels: ISA, EISA, 6 MHz 16-bit, 10 MHz 32-bit, 33 MHz 32/64-bit, 66 MHz 32/64-bit. I/O demand labels: Ethernet (1 Gbit/s, 10 Gbit/s), SCSI, Internet backbone (T3, OC 192).]

PCI-X Objectives
• Address customer requirements for greater I/O performance
• Evolutionary I/O upgrade
  – Investment protection
  – Leverage PCI prevalence -- “The most successful interconnect”
  – Maintain compatibility with installed base
• More slots @ 66 MHz
• Ease of design
• Strong industry support

PCI-X Burst Write with Target Wait States
[Timing diagram, clocks 1 to 15, in three views: the PCI bus, the initiator's view of the PCI bus, and the target's view of the PCI bus (registered signals prefixed s1_). Signals: PCI_CLK, AD, C/BE#, FRAME#, IRDY#, TRDY#, DEVSEL#. Annotations: Decode Speed A, a target initial wait state pair, six data transfers (DATA-0 and DATA-1 are repeated during the wait state pair), and the initiator signaling termination 2 clocks before the end of the transaction.]

PCI-X Burst Read with Target Wait States
[Timing diagram, clocks 1 to 15, in three views: the PCI bus, the initiator's view of the PCI bus, and the target's view of the PCI bus (registered signals prefixed s1_). Signals: PCI_CLK, AD, C/BE# (BE#'s = FF during data phases), FRAME#, IRDY#, TRDY#, DEVSEL#. Annotations: Decode Speed A, a target initial wait state, six data transfers, and the initiator signaling termination 2 clocks before the end of the transaction.]

Split Transactions
• Bus efficiency of Read almost as good as Write
• Split Transaction components
  Step 1. Requester requests bus and arbiter grants bus
  Step 2. Requester initiates transaction
  Step 3. Target (completer) communicates intent with new target termination, Split Response
  Step 4. Completer executes transaction internally
  Step 5. Completer requests bus and arbiter grants bus
  Step 6. Completer initiates Split Completion
• Split Completion routed back to requester across bridges using requester’s bus number and device number
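
The routing rule in the last bullet can be sketched as follows. This assumes that each bridge compares the requester bus number carried in the Split Completion against the range of bus numbers behind it, using the Secondary and Subordinate Bus Number registers of a conventional PCI-to-PCI bridge; that register-based mechanism is an assumption of this sketch, as are the names `Bridge` and `route_completion`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Bridge:
    """Bus-number window of one bridge (hypothetical model)."""
    secondary_bus: int      # bus directly behind the bridge
    subordinate_bus: int    # highest-numbered bus reachable through it

    def claims(self, requester_bus: int) -> bool:
        """True if the requester sits somewhere behind this bridge."""
        return self.secondary_bus <= requester_bus <= self.subordinate_bus


def route_completion(requester_bus: int, bridges: List[Bridge]) -> Optional[Bridge]:
    """Pick the bridge on the current bus that forwards the Split Completion.

    Returns None when no bridge claims it, meaning the requester is expected
    on the current bus (or upstream of it).
    """
    for bridge in bridges:
        if bridge.claims(requester_bus):
            return bridge
    return None


if __name__ == "__main__":
    bridges = [Bridge(1, 3), Bridge(4, 4)]
    print(route_completion(2, bridges))   # forwarded by the bridge covering buses 1..3
    print(route_completion(4, bridges))   # forwarded by the bridge for bus 4
    print(route_completion(0, bridges))   # not behind either bridge
```

Once the completion reaches the requester's bus, the device number in the Split Completion address selects the requester itself.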

Logic Block Diagram for Bypassing Source Sampling
[Logic diagram: PCI logic (data_in, data_out, data_output_enable) connects through logic gates and I/O buffer IOB1 (output enable, PCI_in) to the PCI bus. Flip-flops F1, F2, F5, F6, and F7 and multiplexers M1, M2, and M3 select between the sourcing PCI, sampled PCI, and sourcing PCI-X paths, controlled by PCI-X / PCI Mode, PCI-X Feedback Enable, and PCI Feedback Enable. All flip-flops are assumed rising edge triggered.]

Device Internal Timing Example
[Logic diagram: PCI logic (data_in, data_out, data_output_enable) with flip-flops F1, F2, and F3, wires W1A, W1B, W2A, W2B, and W3A, multiplexers M1, M2, and M3 with boundary-scan inputs bs_in_1 to bs_in_3, and I/O buffer IOB1 (oe) driving the PCI bus through the package. The PCI clock enters through clock buffer CB2 and a PLL clock driver (P1, P2), and the ASIC internal clock distribution feeds points A, B, and C. All flip-flops are assumed to be rising edge triggered.]

PCI-X vs PCI: 8x512-Byte Read Simulation
[Chart: throughput (MB/sec, 100 to 700) versus number of requests (1 to 7) for PCI (33 MHz, MRM), PCI (66 MHz, MRM), PCI-X (66 MHz), and PCI-X (100 MHz). Inset: SCSI, Enet, and FC adapters behind a PCI-X bridge, PCI vs. PCI-X.]
66 MHz PCI-X up to 19% faster than 66 MHz PCI 2.2.
Note: Assumes ideal memory controller with 32-byte CPU cache line and ideal 64-bit PCI adapters.