thaker q3 2008

Verification Strategy for PCI-Express. Presenter: Pradip Thaker. July 4th, 2008

Upload: obsidian-software

Post on 11-May-2015


Page 1: Thaker q3 2008

Verification Strategy for PCI-Express

Presenter: Pradip Thaker

July 4th, 2008

Page 2: Thaker q3 2008

Outline

- PCI-Express Protocol Overview
- Verification Paradigm
- Design-for-Verification (well-aligned implementation and verification architectures): a key ingredient for timely verification closure

Page 3: Thaker q3 2008

PCI to PCI Express

Limitations of PCI
- Not enough bandwidth
  - 32-bit/33 MHz (132 MB/s)
  - 64-bit/66 MHz (528 MB/s)
- Shared bus bandwidth
- No support for isochronous applications (TDM or synchronous traffic applications)
- Cost of hardware for parallel buses

Evolution Path
- Growing faster is the only possibility (not wider)
- Point-to-point communication (shared-bus connectivity impossible above 100/150 MHz)
- CDR architecture (speed limitation of a synchronous bus above a few hundred MHz)
- Backward compatibility: a must

Fast forward to the future: PCI Express (PCIe)
- Packet-level data units over high-speed SERDES-based connectivity
- Layered architecture, much like networking protocols
  - Mechanical, Physical, Data-link, Transaction, Software and System layers
- Compatible with existing PCI software infrastructure
- Weird wedding of two distinct architectural and business practices: networking and computer
- Creation of a nightmarish scenario for chip verification (details on later slides)
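The PCI bandwidth figures quoted above follow from bus width times clock rate. A minimal worked sketch, assuming one transfer per clock and the rounded 33/66 MHz clocks used on the slide (the function name is illustrative, not from the presentation):

```python
def parallel_bus_bandwidth_mb_s(width_bits: int, clock_mhz: float) -> float:
    """Peak bandwidth in MB/s for a bus that moves width_bits once per clock."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz

# The two PCI configurations listed on the slide.
for width_bits, clock_mhz in [(32, 33), (64, 66)]:
    rate = parallel_bus_bandwidth_mb_s(width_bits, clock_mhz)
    print(f"{width_bits}-bit/{clock_mhz} MHz: {rate:.0f} MB/s")
```

These are peak numbers for the whole shared bus, which is why "shared bus bandwidth" appears as a separate limitation.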

Page 4: Thaker q3 2008

PCI-Express Protocol Overview - Terminology

- Dual Simplex: a related set of two differential pairs (Tx and Rx)
- Lane: a “Dual Simplex” when PCI-Express compliant
- Port: a group of Txs and Rxs within a single device that represents a single connection to the PCI-Express fabric
- Link: two ports and the collection of lanes that interconnect them
- x1, x4, x8, xN: number of lanes within a port or a link
- Upstream: flow of traffic towards the CPU, or a port that establishes a link in that direction within the hierarchy
- Downstream: flow of traffic away from the CPU, or a port that establishes a link in that direction within the hierarchy
- Ingress Port: the portion of a PCIe port that receives the incoming traffic
- Egress Port: the portion of a PCIe port that transmits outgoing traffic
- Root Complex: the combination of a PCIe host bridge and one or more downstream ports
- Endpoint: a device that terminates a path within the hierarchy
- Bridge: a device that physically and electrically connects PCIe to another protocol
- Switch: a device that provides a physical connection between two or more PCIe ports

Page 5: Thaker q3 2008

PCI-Express Hierarchy

[Diagram: PCI-Express hierarchy with the nodes CPU, Root Complex, Endpoint, Bridge, PCI Bus, two PCI Devices, Switch, and two further Endpoints]

Page 6: Thaker q3 2008

PCI-Express Protocol Overview: Physical Layer

Logical Functions
- 8B/10B encoding and decoding
- Scrambling
- Reset, initialization, multi-lane de-skew
- Lane mapping
- Adjustments of bit-transmission order for various throughput options (x1 through x32)
- Logical idle behavior and transition to active state as per protocol
- TLP and DLLP transmission and reception: insertion and processing of special symbols per protocol conditions
- Link initialization (recovery from link errors, transition from low-power states)
- Link negotiations
  - Width
  - Data rate
  - Lane reversal
  - Polarity inversion
- Link synchronization
  - Bit-wise per lane
  - Symbol-wise per lane
  - Lane-to-lane de-skew
- Ordered-set (TS and Skip) handling and processing
- Fast training sequence
- Link power management
- Delay insertions as per protocol
- ...and more that could not fit here

Electrical Functions
- Link within 600 ppm at all times
- Spread-spectrum clocking
- AC coupling
- Interconnect parasitic capacitance adherence
- Receiver DC common-mode voltage of 0 V
- Transmitter DC common mode established during “Detect”
- Receiver Detect under various scenarios
- Total jitter
- Maximum loss budget
- De-emphasis
- Maximum BER
- Beacon
- ...and more that could not fit here
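As an illustration of the lane-mapping and transmission-order items above, the following is a minimal sketch of byte striping across a multi-lane link. It is a simplification under stated assumptions: bytes are dealt round-robin to lanes, and framing symbols, padding, scrambling and 8B/10B encoding are all left out. The function names are hypothetical.

```python
from typing import List

def stripe(data: bytes, lanes: int) -> List[bytes]:
    """Deal bytes round-robin across lanes: byte i goes to lane i % lanes."""
    return [data[lane::lanes] for lane in range(lanes)]

def unstripe(lane_data: List[bytes]) -> bytes:
    """Reassemble the original byte order from per-lane streams."""
    lanes = len(lane_data)
    out = bytearray(sum(len(stream) for stream in lane_data))
    for lane, stream in enumerate(lane_data):
        out[lane::lanes] = stream  # put each lane's bytes back in their slots
    return bytes(out)

packet = bytes(range(8))
per_lane = stripe(packet, lanes=4)
assert unstripe(per_lane) == packet
```

A receiver can only run the `unstripe` step after lane-to-lane de-skew, which is one reason the de-skew logic needs dedicated verification for every supported link width.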

Page 7: Thaker q3 2008

PCI-Express Protocol Overview : Data-link Layer

Link management
- DL_UP, DL_Down, DL_Inactive, DL_Active, DL_Init state transitions
- Slot power limit handling
- Propagation of link reset downstream

Point-to-point reliable data exchange
- Error detection and retry, as well as error logging and reporting
- Power Management message decoding, state transitions for activation and de-activation
- TLP sequence number generation and tracking
- LCRC computation and decoding
- DLLP integrity encoding and decoding
- ACK/NAK generation and processing
- ACK time-out notification and handling
- Flow control computation, tracking and processing: credit-based flow control
- Data poisoning
- Completion time-out
- Re-transmission of packets
- Packet storage for retry/replay
- DLLP generation, processing and actuation based on current status
  - ACK DLLP
  - NAK DLLP
  - InitFC1
  - InitFC2
  - UpdateFC
  - Power Management
  - Vendor specific
- Cut-through routing
- TLP/DLLP ordering permutations per protocol
- TLP integrity check insertion and processing
- ACK/NAK latency timer rules processing a limit-triggered response
- ...and more that could not fit here
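The sequence-number, ACK/NAK and replay items above can be sketched from the transmitter's side. This is a minimal model under stated assumptions: 12-bit sequence numbers, an unbounded buffer, and no replay timer or replay counter. The class and method names are hypothetical, not from the presentation or any vendor IP.

```python
from collections import OrderedDict

SEQ_MOD = 4096  # TLP sequence numbers are 12 bits wide

class ReplayBuffer:
    """Holds transmitted TLPs until the link partner acknowledges them."""

    def __init__(self):
        self.next_seq = 0
        self.pending = OrderedDict()  # seq -> TLP, in transmit order

    def send(self, tlp):
        """Assign the next sequence number and keep a copy for replay."""
        seq = self.next_seq
        self.pending[seq] = tlp
        self.next_seq = (seq + 1) % SEQ_MOD
        return seq

    def _release_through(self, seq):
        """Drop every stored TLP up to and including seq."""
        if seq not in self.pending:
            return  # nothing stored is acknowledged by this DLLP
        while self.pending:
            oldest, _ = self.pending.popitem(last=False)
            if oldest == seq:
                break

    def on_ack(self, seq):
        """ACK DLLP: everything up to seq was received correctly."""
        self._release_through(seq)

    def on_nak(self, seq):
        """NAK DLLP: seq was the last good TLP; replay everything after it."""
        self._release_through(seq)
        return list(self.pending.items())

buf = ReplayBuffer()
for tlp in ["a", "b", "c", "d"]:
    buf.send(tlp)
buf.on_ack(1)            # releases sequence numbers 0 and 1
replay = buf.on_nak(2)   # releases 2, schedules 3 for replay
```

A reference model of this kind stays packet-accurate: it tracks which TLPs must be stored and replayed without modeling the cycle-level timing of the replay logic.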

Page 8: Thaker q3 2008

PCI-Express Protocol Overview : Transaction Layer

Flow control management
- TL manages, DL executes
- Point-to-point, not end-to-end
- Independent for each VC ID
- Mechanism presumes “Ideal” conditions
- Credit types: PH, PD, NPH, NPD, CPLH, CPLD

Data transactions
- TLP storage and processing for transmission or consumption
- TLP generation: Header, Payload and Digest
- TLP generation and handling of various lengths (4 Bytes to 4096 Bytes)
- Transaction types
  - Memory (32-bit and 64-bit addressing)
  - I/O
  - Configuration
  - Message
    - INTx
    - PME
    - ERR
    - Unlock
    - Slot Power
    - Hot Plug
    - Vendor-defined

Transaction completion
- Reads and non-posted writes
- Completion routing is by ID
- Provide completion status

Transaction ordering
Routing rules
Arbitration
- Port arbitration
- VC arbitration

Virtual channels
Traffic classes
Locked transactions support
Isochronous support
Advanced error processing and reporting
...and more that could not fit here
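The credit types listed above (a header pool and a data pool for each of posted, non-posted and completion traffic) gate every TLP transmission. Below is a minimal sketch of that gating for a single header/data pool pair, assuming one header credit per TLP and one data credit per 16 bytes (4 DW) of payload. Real implementations track credits with modulo counters per virtual channel and support infinite credits, which this sketch omits. All names are illustrative.

```python
import math
from dataclasses import dataclass

DATA_CREDIT_BYTES = 16  # one data credit covers 4 DW of payload

@dataclass
class CreditPool:
    """Credits advertised by the receiver for one TLP class (e.g. PH/PD)."""
    header: int
    data: int

    def data_credits_needed(self, payload_bytes: int) -> int:
        return math.ceil(payload_bytes / DATA_CREDIT_BYTES)

    def can_send(self, payload_bytes: int) -> bool:
        """A TLP needs one header credit plus enough data credits."""
        return self.header >= 1 and self.data >= self.data_credits_needed(payload_bytes)

    def consume(self, payload_bytes: int) -> None:
        if not self.can_send(payload_bytes):
            raise ValueError("insufficient credits: transmitter must stall")
        self.header -= 1
        self.data -= self.data_credits_needed(payload_bytes)

    def replenish(self, header: int, data: int) -> None:
        """Model an UpdateFC DLLP returning credits from the receiver."""
        self.header += header
        self.data += data

posted = CreditPool(header=1, data=4)
posted.consume(64)                 # uses 1 header credit and 4 data credits
blocked = not posted.can_send(0)   # even a zero-payload TLP needs a header credit
```

Because flow control is point-to-point and independent per VC, a verification environment would typically instantiate one such tracker per credit type, per VC, per link direction.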

Page 9: Thaker q3 2008

PCI-Express Protocol Overview: Summary

- Open standard containing over 500 pages
  - Many more pages of supporting literature
- Each line of each page in the standards document is a cryptic edict dictating a specific behavior for each condition, not a detailed explanation of behavior or implementation
- Much room for misinterpretation of protocol details, resulting in malfunction or non-compliance
- Hundreds of configuration bits, each controlling a complex behavior within the chip, with strict adherence to the standard required to guarantee backward software compatibility
- No wiggle room to claim a bug as a feature!

Page 10: Thaker q3 2008

Verification Paradigm

Chips based on Open Standard: Pressure Points
- Technology/feature differentiator: marginal or non-existent
  - Commodity product: power, performance and price
- Time-to-market: very critical
  - First product: to establish credible presence
  - Subsequent products with various flavors: to capture market share
    - Bridges: PCI-to-PCIe, SATA-to-PCIe, 1394-to-PCIe, USB-to-PCIe, etc.
    - Switches: 4-port x1 throughput, 4-port x4 throughput, 8-port x4 throughput, etc.
    - Root Complex: x1 throughput, x4 throughput, etc.
- Quality of first silicon: critical

Verification Plays a Major Role in the Success of Chips Based on Open Standard
- Addresses two key aspects: TTM and quality of silicon

Verification Execution: Focal Points
- Functionality
- Performance
- Interoperability (compliance and compatibility)

Verification Platform Architecture and Methodology: Focal Points
- Re-usability
- Scalability (modularity)
- Comprehensiveness (with leveraging of automation)

Page 11: Thaker q3 2008

Verification Strategy: A Broader Definition

Verification: a vehicle to deliver chips with “Zero Bugs(!)”, compliance and superior performance
- Performance Modeling (C/C++/SystemC)
  - Architecture and micro-architecture of key data and control paths
- RTL Verification
- FPGA-based Emulation
  - Compliance and compatibility testing
  - PCI-SIG certification to be on Integrator’s List
  - Performance verification
- 3rd-party Compliance Checkers and Vectors
- Mixed-signal Simulations

Page 12: Thaker q3 2008

Functional Verification: Four Pillars

Coverage-driven constrained-random testing with reference models (HVLs)
- Reference Model (RFM)
- Temporal Checkers
- Protocol Monitors
- Sequence Generators
- Constraints
- Functional Coverage
- Test-plan

Assertion-based verification for key building blocks
- Detects design errors at the source: increases observability and decreases debug time
- Can identify subtle bugs that may be hard to reach with SBV
- Black-box assertions: protocol oriented
- Effective for size/complexity to an extent (memory-size and run-time limitations)
  - Suitable for block-level deployment rather than as an end-to-end, chip-level, stand-alone verification method
  - Complex properties are verified through bounded proof (neither proven nor falsified)
- Effective for control-path oriented logic (state space exploration rather than data-path logic) verification
- Assertions, when written by an engineer other than the designer, can help detect the specification (interpretation) class of errors

Asynchronous clock-domain simulations

Power-domain simulations
- Power Management Compliance Check-list
- Improper buffer insertion, missing level shifters, missing power good, power sequencing tests
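To make the temporal-checker and protocol-oriented black-box assertion ideas above concrete, here is a minimal trace-based sketch of a checker for the property "every transmitted TLP is acknowledged within a bounded number of cycles". It is a hedged stand-in written in Python rather than an HVL or assertion language; the trace format, the cycle limit and the function name are all assumptions for illustration, and sequence-number wrap-around is ignored.

```python
def check_ack_latency(trace, limit):
    """Return sequence numbers whose ACK arrived late or never arrived.

    trace: iterable of (cycle, event, seq) with event "tlp" or "ack".
    An ACK for seq also acknowledges every earlier outstanding TLP.
    """
    outstanding = {}   # seq -> cycle at which the TLP was sent
    violations = []
    for cycle, event, seq in trace:
        if event == "tlp":
            outstanding[seq] = cycle
        elif event == "ack":
            for acked in sorted(s for s in outstanding if s <= seq):
                if cycle - outstanding.pop(acked) > limit:
                    violations.append(acked)
    # Anything still outstanding at end of trace never got its ACK.
    violations.extend(sorted(outstanding))
    return violations

trace = [(0, "tlp", 0), (1, "tlp", 1), (5, "ack", 0), (20, "ack", 1)]
late = check_ack_latency(trace, limit=10)
```

The checker observes only interface-level events, so it can be written by someone other than the block designer, which is the point the slide makes about catching interpretation errors.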

Page 13: Thaker q3 2008

Functional Verification: CDV (Re-usability and Scalability)

[Diagram: CDV environment with the blocks Test-Plan, Constraints, Sequence Generation, BFM (Driver), DUV, RFM, Functional Coverage, Temporal Checkers, Protocol Monitors]
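The blocks named in the CDV environment can be sketched as one small loop: constrained-random sequence generation, a DUV, an independently written reference model, a comparison, and functional coverage bins. This is a toy illustration, not the presenter's environment. The "DUV" here is a Python stand-in for RTL, and the only behavior checked is the encoding of the 10-bit TLP length field, where a length of 1024 DW is encoded as 0. All names are hypothetical.

```python
import random
from collections import Counter

MAX_LEN_DW = 1024  # largest TLP payload in DW (4096 bytes)

def generate(rng):
    """Sequence generator: random TLPs, constrained to hit corner lengths often."""
    kind = rng.choice(["MWr", "MRd"])
    length_dw = rng.choice([1, MAX_LEN_DW, rng.randint(1, MAX_LEN_DW)])
    return {"kind": kind, "length_dw": length_dw}

def duv_encode_length(length_dw):
    """Stand-in for the design under verification: truncate to 10 bits."""
    return length_dw & 0x3FF

def rfm_encode_length(length_dw):
    """Reference model, written independently of the DUV's bit-level trick."""
    return 0 if length_dw == MAX_LEN_DW else length_dw

def run(seed, count):
    """Drive count random TLPs, compare DUV against RFM, collect coverage."""
    rng = random.Random(seed)
    coverage = Counter()
    mismatches = []
    for _ in range(count):
        txn = generate(rng)
        length = txn["length_dw"]
        if duv_encode_length(length) != rfm_encode_length(length):
            mismatches.append(txn)
        size_bin = "min" if length == 1 else "max" if length == MAX_LEN_DW else "mid"
        coverage[(txn["kind"], size_bin)] += 1
    return coverage, mismatches

coverage, mismatches = run(seed=1, count=500)
print("coverage bins hit:", len(coverage), "of 6; mismatches:", len(mismatches))
```

The test plan corresponds to the list of coverage bins: a run is complete when every planned bin has been hit with no mismatches, and the same generator, reference model and coverage code can be reused from block level to chip level.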

Page 14: Thaker q3 2008

Functional Verification: Golden Rules for RFM

- Reference Model shall be independent of the DUT implementation
  - Reference Model is to be created by an engineer other than the designer of the block
  - Reference Model is created in a high-level language, so it has no low-level mechanics analogous to the RTL implementation to realize functionality
- Reference Model shall support co-simulation with the DUT in order to predict and verify run-time behavior
- Reference Model for each block shall be created such that it can be integrated into the chip-level verification environment seamlessly
- Hybrid Modeling
  - Control paths: cycle-accurate modeling
  - Data paths: packet-accurate or data-unit-accurate modeling
  - A fully cycle-accurate model is a maintenance nightmare as well as a cumbersome task, without significant value-add to verification quality
- Comprehensiveness (with leveraging of automation)
  - CDV is only as powerful as the comprehensiveness of the automated checking features of the reference model and monitors
  - Can run millions of RTG cycles with a comprehensive reference model and monitors, without much manual overhead

Page 15: Thaker q3 2008

Performance Verification

Performance parameters (to be supported with variable-sized packets across mixed traffic types, across all traffic patterns, mixed VCs and mixed packet sizes)
- Aggregate throughput
- Latency (to be balanced against power dissipation)
- Jitter in latency
- Availability/blocking: internal back-pressure
- N+1 performance limitation (small TLPs back-to-back)
- Flow-control credits
- Load distribution and balancing (peer-to-peer as well as vertical traffic flows with a mix of traffic types, VCs and packet sizes)
- Link utilization: no bubbles within or between TLPs (really challenging for cut-through mode)
- Zero tolerance for packet loss
- Zero tolerance for wrong packet routing
- 20% overhead lost in 8B/10B coding
- Small TLPs with header as well as DL layer overhead impacting transaction layer efficiency even with 100% link utilization
- Traffic-aware flow-control credit updates (large and small TLPs)

Performance Modeling (C/C++/SystemC)
- Architecture and micro-architecture of key data and control paths

FPGA-based Emulation

RTL Verification: not an adequate method for performance testing for PCIe development
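The 8B/10B and small-TLP overhead points above can be quantified with a short worked calculation. The sketch assumes 8B/10B signaling and a per-TLP overhead of 2 framing symbols, a 2-byte sequence number, a 12- or 16-byte header, an optional 4-byte ECRC and a 4-byte LCRC. It ignores DLLP traffic (ACK/NAK, UpdateFC) and Skip ordered sets, so real links achieve somewhat less. Function names are illustrative.

```python
ENCODING_EFFICIENCY = 8 / 10  # 8B/10B: 10 line bits carry 8 data bits

def tlp_efficiency(payload_bytes, header_bytes=12, ecrc=False):
    """Fraction of transmitted TLP bytes that are payload."""
    framing = 2    # start and end framing symbols
    sequence = 2   # data-link layer sequence number
    lcrc = 4       # data-link layer CRC
    digest = 4 if ecrc else 0
    overhead = framing + sequence + header_bytes + digest + lcrc
    return payload_bytes / (payload_bytes + overhead)

def payload_rate_gbps(line_rate_gtps, payload_bytes, **kwargs):
    """Payload throughput per lane and direction at 100% link utilization."""
    return line_rate_gtps * ENCODING_EFFICIENCY * tlp_efficiency(payload_bytes, **kwargs)

for payload in (4, 64, 256, 4096):
    eff = tlp_efficiency(payload)
    rate = payload_rate_gbps(2.5, payload)
    print(f"{payload:5d} B payload: {eff:6.1%} efficient, {rate:.2f} Gb/s per lane")
```

With a 4-byte payload only one sixth of the TLP bytes are payload under these assumptions, which is why back-to-back small TLPs are called out as a distinct performance scenario.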

Page 16: Thaker q3 2008

Compliance Verification

Electrical Compliance Check-list
- Signal quality analysis
  - Eye pattern, jitter and BER analysis
  - Signaling for upstream and downstream
- Jitter analysis
  - DLL
  - Clock recovery
  - Interpolation
  - Transition/non-transition eye points

Data-Link Layer Compliance Check-list
- Reserved fields testing
- NAK response
- Replay timer
- Replay count
- Link retrain
- Replay TLP order
- Bad CRC
- Undefined packet
- Bad sequence number
- Duplicate TLP

Transaction Layer Compliance Check-list
- Completion request, completion time-out, read-data
- Messaging: legacy interrupts, native power management, hot-plug, error signaling
- Flow control: initialization, transmit and receive states, negotiated link width
- Virtual channel

System Architecture/Platform-configuration Check-list
- Capability registers testing
- Default values
- Stress test
- Slot reporting
- Hot plug event reporting
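Several data-link checklist items above (bad CRC, bad sequence number, duplicate TLP, NAK response) exercise one receiver decision. The following is a minimal sketch of that decision as a pure function, suitable as a reference-model fragment. It assumes 12-bit sequence numbers and the modulo-4096 duplicate window, and it omits ACK coalescing, the ACK latency timer and the rule that only one NAK is outstanding at a time. The function name and return format are hypothetical.

```python
SEQ_MOD = 4096  # 12-bit TLP sequence numbers

def receive(next_expected, seq, lcrc_ok):
    """Classify an incoming TLP.

    Returns (action, dllp, new_next_expected), where dllp is
    ("ACK" or "NAK", sequence number of the last good TLP).
    """
    last_good = (next_expected - 1) % SEQ_MOD
    if not lcrc_ok:
        return ("discard", ("NAK", last_good), next_expected)
    if seq == next_expected:
        return ("accept", ("ACK", seq), (next_expected + 1) % SEQ_MOD)
    if (next_expected - seq) % SEQ_MOD <= SEQ_MOD // 2:
        # Already received earlier: drop silently but re-acknowledge.
        return ("discard", ("ACK", last_good), next_expected)
    # Sequence number ahead of what is expected: a TLP was lost.
    return ("discard", ("NAK", last_good), next_expected)

in_order = receive(5, 5, lcrc_ok=True)
duplicate = receive(5, 4, lcrc_ok=True)
skipped = receive(5, 7, lcrc_ok=True)
corrupted = receive(0, 0, lcrc_ok=False)
```

Each branch maps to at least one checklist entry, so a directed or constrained-random test can be judged complete for this logic when all four outcomes have been observed.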

Page 17: Thaker q3 2008

Compliance Verification

- Separate compliance check-list with some overlap for RC, Endpoints and Switches
- Integrated PHY in the silicon
- FPGA platforms with discrete PHY and digital logic
- FPGA-based emulation (native or 3rd party)
  - Compliance testing with Agilent PTC and PCI-SIG Golden Suite
  - Compatibility testing with over 80% of the systems during PlugFest
  - PCI-SIG certification to be on Integrator’s List
- Native protocol checkers: static and temporal
- 3rd-party compliance checkers and vectors
  - Synopsys, Denali, nSys and others

Page 18: Thaker q3 2008

Design-for-Verification

- Cafeteria Architecture: modular and scalable
  - For rapid deployment of various flavors of bridges and switches based on the flagship platform part
  - Speed of capturing market share is as critical as first product deployment to establish credible presence
- Modular architecture to enable thorough block-level or sub-system-level simulations
- Functional partitioning to reduce the scope of chip-level verification effort and complexity
  - Push vs. pull inter-block data threads
  - Distributed vs. centralized control processing
- Standardized block interface
  - Reduces scope of “Error of Specification” and “Error of Omission”
  - Promotes verification component re-use (BFMs, sequences, etc.)
  - Minimum number as well as flavors of physical interconnects between blocks (may use in-band signaling where applicable)
- Emphasis on correct-by-construction practices during the design-creation phase
  - Otherwise the TTM window will be missed due to prolonged verification or multiple re-spins (PCIe is unforgiving of bugs that hamper compliance or compatibility)

Page 19: Thaker q3 2008


Thank You!