
T13/D98109R3

Ultra ATA Implementation Guide -- Annex C -- (Informative)

To: T13 Technical committee
From: Mark Evans

Quantum Corporation
500 McCarthy Boulevard
Milpitas, CA USA 95035
Phone: 408-894-4019
Fax: 408-952-3620
Email: [email protected]

Date: 22 June 1999
Subj: Ultra ATA implementation guide

Introduction: The following proposal is for a replacement for Annex C in the ATA/ATAPI-5 standard. This annex is intended to provide details for implementation for Ultra ATA modes 0, 1, 2, 3, and 4 having maximum transfer rates from 16.7 through 66.7 megabytes per second.

Clarification of some aspects of the protocol and details not specifically stated in the standard has been included for the benefit of component, PCB, and device driver engineers. This annex is not intended to be comprehensive but rather informative on subjects that have caused design questions. Included are warnings about proper interpretation of protocol where interpretation errors seem possible.

T13/D98109 Revision 3

Table of Contents

C.1 Signal Integrity
  C.1.1 Skew
  C.1.2 Source-terminated bus
  C.1.3 Timing measurements on the 80-conductor cable assembly
  C.1.4 Simulations of the 80-conductor cable assembly
  C.1.5 Crosstalk in the 80-conductor cable assembly
  C.1.6 Ground Bounce
  C.1.7 Measuring crosstalk in an ATA system
  C.1.8 System design considerations to minimize crosstalk
  C.1.9 Ringing and Data Settling Time
    C.1.9.1 Controlling ringing on the 40-conductor cable assembly
    C.1.9.2 Strobe lines on the 40-conductor cable
  C.1.10 System Guidelines for Ultra DMA
C.2 Ultra DMA protocol
  C.2.1 tSR, tRFS, and the number of additional transfers
  C.2.2 Reasons for tSR
  C.2.3 Reason why tZIORDY longer than tENV is not a problem
  C.2.4 Recipient pauses and implications for data handling and CRC calculation
  C.2.5 CRC calculation and comparison
  C.2.6 IDENTIFY DEVICE command
  C.2.7 Strobe minimums and maximums
  C.2.8 Typical strobe cycle timing
  C.2.9 Holding data to meet Setup and Hold Times
  C.2.10 Reasons for tACK timings
  C.2.11 Host chances to delay a burst and reasons for them
  C.2.12 Maximums on all control signals from the device
  C.2.13 Bus turnaround responsibilities
C.3 Timing derivations
  C.3.1 Fundamental timings, skews and delays
  C.3.2 IC and PCB timings, delays, and skews
  C.3.3 System timing parameters
    C.3.3.1 tCYC
    C.3.3.2 t2CYC
    C.3.3.3 tDS
    C.3.3.4 tDH
    C.3.3.5 tDVH
    C.3.3.6 tDVS
    C.3.3.7 tFS
    C.3.3.8 tLI
    C.3.3.9 tMLI
    C.3.3.10 tUI
    C.3.3.11 tAZ
    C.3.3.12 tZAH
    C.3.3.13 tZAD
    C.3.3.14 tENV
    C.3.3.15 tSR
    C.3.3.16 tRFS
    C.3.3.17 tRP
    C.3.3.18 tIORDYZ, tZIORDY, tACK, tSS



Table of Figures

Figure 1 – A transmission line with perfect source termination
Figure 2 – Waveforms on a source-terminated bus with rise time less than Tprop
Figure 3 – Waveforms on a source-terminated bus with rise time greater than Tprop
Figure 4 – Waveforms on a source-terminated bus with R_source less than cable Z0
Figure 5 – Waveforms on a source-terminated bus with R_source greater than cable Z0
Figure 6 – Typical step voltage seen in systems using an 80-conductor cable assembly
Figure 7 – Typical step voltage seen in systems using an 80-conductor cable assembly
Figure 8 – Measured waveform of falling edge forward crosstalk
Figure 9 – Reverse crosstalk waveform
Figure 10 – Model of capacitive coupling
Figure 11 – Waveforms resulting from capacitive coupling
Figure 12 – Model of inductive coupling
Figure 13 – Waveforms resulting from inductive coupling
Figure 14 – Model of mixed capacitive and inductive coupling
Figure 15 – Waveforms resulting from mixed capacitive and inductive coupling
Figure 16 – Forward and reverse crosstalk in a distributed system
Figure 17 – Model of ground bounce in IC package
Figure 18 – Waveforms resulting from ground bounce
Figure 19 – Simple RLC model of 40-conductor cable assembly with all data lines switching
Figure 20 – Output of simple RLC model: waveforms at source and receiving connectors
Figure 21 – DST measurement for a line held low while all others are switching high
Figure 22 – DST measurement for all lines switching
Figure 23 – Improved model of 40-conductor cable assembly ringing
Figure 24 – Results of improved 40-conductor model with termination at IC versus connector
Figure 25 – Results of improved 40-conductor model with source rise time of one, five, and ten ns



C.1 Signal Integrity

The evolution of Ultra DMA has continued the trend of speed increases on the ATA interface to match increases in disk drive performance. The Ultra DMA modes defined in this standard are:

Mode    Maximum transfer rate
0       16.67 megabytes per second
1       25.00 megabytes per second
2       33.33 megabytes per second
3       44.44 megabytes per second
4       66.66 megabytes per second

(1 MB = 10^6 bytes)

Ultra DMA features such as increased frequencies, double-edged clocking, and non-interlocked signaling require improved signal integrity on the bus relative to that required by PIO and Multiword DMA modes. For Ultra DMA modes 0, 1 and 2 this is achieved by the use of partial series termination and controlled slew rates. For modes 3 and 4 an 80-conductor cable assembly is required. This cable assembly has ground lines interspersed between all signal lines on the bus in order to control impedance and reduce crosstalk, eliminating many of the signal integrity problems inherent to the 40-conductor cable assembly. However, many of the design considerations and measurement techniques required for the 80-conductor cable assembly are different from those used for the 40-conductor assembly. Hosts and devices intended to be used with 40 or 80-conductor cables must be designed to meet all requirements for operation with both types.

For operation in Ultra DMA modes 0, 1 and 2 with a 40-conductor cable assembly, concerns include (in order of importance): crosstalk between signals, ringing, and timing.

For operation in Ultra DMA modes 3 and 4 with an 80-conductor cable assembly, major concerns are: timing, crosstalk, and ground bounce.

Two of the features Ultra DMA introduced to the ATA bus are double-edged clocking and non-interlocked (a.k.a. source-synchronous) signaling. Double-edged clocking allows a word of data to be transferred on each edge of the strobe signal, doubling the data rate without increasing the fundamental frequency of signaling on the bus. Non-interlocked signaling means that DATA and STROBE are both generated by the sender during a data transfer. In addition to previous signal integrity issues such as double clocking on strobes due to ringing and delay-limited interlock timings on the bus, non-interlocked signaling makes settling time and skew between different signals on the bus critical for proper Ultra DMA operation.

C.1.1 Skew

Skew is defined as the difference in total propagation delay between two signals as they transit the bus. Skew will be positive or negative depending on which signal is chosen as the reference delay. All skews in the Ultra DMA timing derivations are defined as strobe delay minus data delay. Therefore a positive skew is where strobe is delayed more than data.

Skew corresponds to the reduction in setup and hold times that occurs between the source IC and the receiving IC. In order to guarantee that data is clocked correctly, the maximum skew possible in each direction in a system must be less than the difference between the setup or hold time produced by the source and required by the receiver. Skew between signals will increase as they transit the bus based on differences in the electrical characteristics of the paths followed by each signal. An understanding of the origins of skew and its importance to Ultra DMA requires an explanation of the nature of signal propagation on a ground-signal-ground (G-S-G) cable.
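The relationship between skew and receiver margins can be illustrated with a short sketch. It assumes the sign convention defined above (skew is strobe delay minus data delay), so that a positive skew lengthens the setup time seen at the receiver and shortens the hold time by the same amount. The function name and all numeric values are hypothetical and are chosen only for illustration; they are not timing values from this standard.

```python
def receiver_margins(setup_sent_ns, hold_sent_ns, setup_req_ns, hold_req_ns, skew_ns):
    """Return (setup_margin, hold_margin) in ns at the receiving IC.

    skew_ns follows the convention in the text: strobe delay minus data delay.
    """
    setup_at_receiver = setup_sent_ns + skew_ns  # a later strobe lengthens setup
    hold_at_receiver = hold_sent_ns - skew_ns    # a later strobe shortens hold
    return (setup_at_receiver - setup_req_ns, hold_at_receiver - hold_req_ns)


# Hypothetical numbers for illustration only; not values from the standard.
for skew in (-3.0, 0.0, 3.0, 4.0):
    setup_margin, hold_margin = receiver_margins(10.0, 8.0, 6.0, 5.0, skew)
    ok = setup_margin >= 0.0 and hold_margin >= 0.0
    print(f"skew {skew:+.1f} ns: setup margin {setup_margin:+.1f} ns, "
          f"hold margin {hold_margin:+.1f} ns, {'ok' if ok else 'violation'}")
```

In this sketch the source provides 4 ns more setup and 3 ns more hold than the receiver requires, so the system tolerates skew between -4 ns and +3 ns; the +4 ns case fails on hold time.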

The 80-conductor cable assembly defined in this standard adds 40 ground lines to the cable interspersed between the 40 signal lines defined for the 40-conductor cable assembly. These added ground lines are connected inside each connector on the cable assembly to the seven ground lines defined for the 40-conductor cable assembly. These additional ground lines allow the return current for each signal line to follow a much closer path to the outgoing current than was allowed by the grounding in the 40-conductor cable assembly. This results in a lower characteristic impedance and greatly reduced crosstalk on the data bus. The controlled impedance and reduced crosstalk of the 80-conductor cable assembly result in much improved behavior of electrical signals on the bus and reduce the data settling time to zero regardless of switching conditions. Reducing the time allowed for data settling from greater than 25 ns in Ultra DMA mode 2 to zero ns with the 80-conductor cable assembly allows nominal cycle time to be reduced from 60 ns for mode 2 to 30 ns for mode 4. However, the 80-conductor cable assembly introduces an additional factor in the timing calculation due to its reduced impedance.

C.1.2 Source-terminated bus

The ATA bus operates as a “source-terminated” bus, meaning that the only low-impedance connection to ground is via the source impedance of the driver.

[Figure 1 schematic: V_source drives the cable through R_source = 100 ohms; cable Z0 = 100 ohms, TD = 4 ns; receiving end loaded by R_rec = 1 megohm]

Figure 1 – A transmission line with perfect source termination

On a source-terminated transmission line, the initial voltage level produced at the source propagates through the system until it reaches the receiving end, that by definition must be an open circuit or at least have high impedance relative to the characteristic impedance. This open circuit produces a reflection of the original step with the same polarity and amplitude but travelling in the opposite direction. The reflected step adds to the first step to raise the voltage throughout the system to two times the original step voltage. In a perfectly terminated system, R_source matches the cable impedance resulting in an initial step voltage on the transmission line equal to fifty percent of V_source, and the entire system has reached a steady state at V_source once the reflection returns to the source.
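The arithmetic in the paragraph above can be sketched as follows. The sketch treats the launch of the edge as a resistive divider between R_source and Z0 and treats the receiving end as an ideal open circuit, which approximates the 1 megohm load of Figure 1. The 5 V value for V_source and the function names are assumptions made for illustration.

```python
def initial_step(v_source, r_source, z0):
    """Voltage launched onto the line: divider between R_source and Z0."""
    return v_source * z0 / (r_source + z0)


def open_end_voltage(v_step):
    """At an ideal open end the reflection equals the incident step, so the two add."""
    return 2.0 * v_step


v_source = 5.0    # volts; assumed value, not taken from the figure
r_source = 100.0  # ohms, as in Figure 1
z0 = 100.0        # ohms, as in Figure 1

step = initial_step(v_source, r_source, z0)
print(f"initial step: {step:.2f} V ({100.0 * step / v_source:.0f} percent of V_source)")
print(f"voltage at receiving end after reflection: {open_end_voltage(step):.2f} V")
```

With matched impedances the initial step is half of V_source, and the doubled voltage at the receiving end equals V_source, so no further reflection is needed to reach steady state.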

The waveforms that will be measured on the bus as a result of this behavior depend on the ratio of the signal rise time to the propagation delay of the system. If the rise time is shorter than the one-way propagation delay, the initial voltage step will be visible at the source or at any point in the system except at the receiving end, where the incoming voltage step is instantaneously doubled as it reflects back to the source.



[Plot: voltage (-4 V to 8 V) versus time (0 ns to 80 ns)]

Figure 2 – Waveforms on a source-terminated bus with rise time less than Tprop

If the rise time is longer than the propagation delay, the source waveform changes, but the same behavior still occurs: the reflected step adds to the initial step at the source while the receiving end sees a delayed doubling of the initial step. Because the rising edges of the two steps overlap when measured at the source, there is a temporary increase in slew rate instead of a step seen at the source while the rising edge of the reflection adds to the edge still being generated by the source.

[Plot: voltage (-1 V to 6 V) versus time (0 ns to 80 ns)]

Figure 3 – Waveforms on a source-terminated bus with rise time greater than Tprop

In the waveforms above the source impedance is perfectly matched to the cable impedance, with the result that, after the first reflection returns to the source, there are no further reflections, and the system is at a steady state. In a system that is not perfectly terminated, there are two possibilities: If the source impedance is less than the characteristic impedance of the transmission line, the initial step is greater than fifty percent of VoH, and the system is at a higher voltage (double the initial step) when the first reflection returns to the source. In this case another reflection occurs at the source to reduce the system to a voltage below VoH but closer to VoH than the initial peak. Reflections continue but are further reduced in amplitude each time they reflect from the termination at the source.

Figure 4 – Waveforms on a source-terminated bus with R_source less than cable Z0

If the source impedance is higher than the characteristic impedance, the initial step will be less than fifty percent of VoH, and multiple reflections back and forth on the bus will be required to bring the whole system up to a steady state at VoH.
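The two mismatched cases can be illustrated with a simple reflection (lattice) calculation. This is a minimal sketch under stated assumptions: the line is lossless, the receiving end is an ideal open circuit, edges are ideal steps, and the driver is a fixed resistance R_source. The function name and the numeric values (VoH of 5 V, Z0 of 100 ohms, R_source of 50, 100, and 200 ohms) are illustrative assumptions.

```python
def receiver_voltage_steps(v_oh, r_source, z0, n_arrivals=6):
    """Return the receiving-end voltage after each arrival of the wave at the open end."""
    gamma_source = (r_source - z0) / (r_source + z0)  # reflection coefficient at the source
    gamma_load = 1.0                                  # ideal open circuit at the receiver
    incident = v_oh * z0 / (r_source + z0)            # initial step launched by the driver
    v_receiver = 0.0
    levels = []
    for _ in range(n_arrivals):
        v_receiver += incident * (1.0 + gamma_load)      # incident wave plus its reflection
        levels.append(v_receiver)
        incident = incident * gamma_load * gamma_source  # wave re-launched from the source
    return levels


for r_source in (50.0, 100.0, 200.0):
    levels = receiver_voltage_steps(5.0, r_source, 100.0)
    print(f"R_source = {r_source:.0f} ohms:", ", ".join(f"{v:.2f}" for v in levels))
```

With R_source below Z0 the receiving end overshoots VoH and then alternates around it with decreasing amplitude; with R_source above Z0 it climbs toward VoH in successively smaller steps.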



Figure 5 – Waveforms on a source-terminated bus with R_source greater than cable Z0

Note that falling edges exhibit the same transmission line behavior as rising edges. The only difference between the edges is that VoH and VoL are reversed. Also, output impedance and slew rate of I/O cells are often different between rising and falling edges, resulting in different step voltages and waveform shapes.

For typical ATA implementations using 33 ohm series termination, the effective driving impedance of a source viewed from the cable connector ranges from 40 to 90 ohms. The initial voltage step produced when a rising edge is driven onto the cable is the I/O cell’s open-circuit VoH divided between the effective output impedance and the typical 82 ohm input impedance of the cable (or a 50 to 60 ohm PCB trace in the case of hosts). For a source using 33 ohm termination this step voltage cannot be greater than 100*(82/(33+82)) = 71.3 percent of VoH, and will fall in the range from 50 to 70 percent. Because the threshold of TTL logic is not centered with respect to the high and low voltages, the initial voltage step produced by a driver will cross the TTL threshold on a rising edge but not on a falling edge. However, since the signal received at the end of the bus is a doubled version of the initial output at the source, the main voltage step only affects skew and delay for signals received at devices that are not at the end of the cable. The greater the distance a device is from the device end of the cable (i.e., closer to the host), the longer the duration of the step observed.
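The step-voltage percentages quoted above follow from a resistive divider into the 82 ohm cable impedance. The sketch below evaluates that divider for the 33 ohm limiting case and for the 40 and 90 ohm ends of the effective impedance range; the function and constant names are illustrative.

```python
CABLE_Z0 = 82.0  # ohms, typical cable input impedance quoted in the text


def step_percent(r_effective, z_load=CABLE_Z0):
    """Initial step as a percentage of open-circuit VoH for a given source impedance."""
    return 100.0 * z_load / (r_effective + z_load)


for r in (33.0, 40.0, 90.0):
    print(f"{r:.0f} ohm source: {step_percent(r):.1f} percent of VoH")
# prints 71.3, 67.2, and 47.7 percent respectively
```

The 40 to 90 ohm range therefore gives steps of roughly 48 to 67 percent, which is approximately the 50 to 70 percent range stated in the text.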



[Oscilloscope capture: -20 ns to 80 ns, 10 ns/div, 16 averages, repetitive]

Figure 6 – Typical step voltage seen in systems using an 80-conductor cable assembly (measured at device and host connectors during a READ)

[Oscilloscope capture: -20 ns to 80 ns, 10 ns/div, 16 averages, repetitive]

Figure 7 – Typical step voltage seen in systems using an 80-conductor cable assembly (measured at host and drive connectors during a WRITE)

In addition to the step produced by the initial voltage driven onto the bus and the subsequent reflection, smaller steps are produced each time the propagating signal encounters a change in the bus impedance. The major impedance changes that occur in a system are at the connections between the cable and PCBs, along the traces of the PCBs due to dropping to a different layer, connection between a motherboard and a backplane, etc. In Figure 7 above, the initial step voltage produced at the host connector is less than the usual range of 50 to 70 percent of VoH because it is based on the host IC driving the PCB trace impedance. A second step with a higher voltage follows after the reflection from the host connector has completed the round trip from the connector to the IC and back, adding to the reflection back from the device end of the cable. Steps such as this are produced by reflections at various points throughout the system and significantly change the shapes of edges and ringing.



The transmission line behavior of the 80-conductor cable assembly adds skew to the received signal in two ways: First, different impedance changes along one line versus another will result in different amounts of delay and attenuation due to reflections on the bus. This produces a time difference between the two signals’ threshold crossings at the receiver. Secondly, signals received at the middle device on the cable may cross the threshold during the initial voltage step or after the reflection from the end of the cable is received (depending on the supply voltage, series termination, output impedance, VoH, and PCB trace characteristics of the host).

Factors other than cable characteristics also contribute to skew. Differences in the capacitive loading between the STROBE and DATA lines on devices attached to the bus will delay propagating signals by varying amounts. Differences in slew rate or output impedance between sources when driving the 82 ohm cable load will result in skew being generated as the signal is sent at the source. Differences between the input RC delays on STROBE and DATA lines will add skew at the receiver.

The fundamental requirement for minimizing skew in the entire system is to make the STROBE and DATA lines as uniform as possible throughout the system. Methods of achieving this are described in the clause System Guidelines for Ultra DMA.

C.1.3 Timing measurements on the 80-conductor cable assembly

Since reflections will be present in the signal anywhere besides the receiving endpoint of the bus, it is difficult to measure skew and delays accurately in a system using an 80-conductor cable assembly. For the received signal at a device connector, the propagation delay from the connector to the IC pin is about 300 ps for typical PCBs and trace lengths. This introduces an error of plus or minus 300 ps in timing measurements made at the connector since rising edges and falling edges will be measured before and after the step respectively. When comparing two signals, this can result in an error in measured skew of plus or minus 600 ps due to the measurement position. This error is small enough relative to the total timing margin of an Ultra DMA system that it can be ignored.

The situation is different at a host connector where propagation time to the IC pin can be as high as one or two ns. This can result in plus or minus two ns accuracy in the measurement of a single signal and plus or minus four ns accuracy for skew between two signals. These errors cannot be removed by adding or subtracting an allowance for PCB propagation delay depending on rising or falling edges, because characteristics of the PCB and termination will affect the skew that actually occurs at the IC pin. As a result of this, accurate measurements of skew in signals received at the host should be made either at pins of the host IC, or at points on the PCB traces as close to the IC pins as possible. Test pads, headers, or unconnected vias in PCB layouts may be created allowing connection to DATA, STROBE, and ground for this purpose.

An alternate method of measuring skew at any point in the system is to break the continuity of the system at a point by disconnecting the signal line (in the case of DATA lines) or inserting a buffer with known input characteristics for a STROBE line. A capacitance should be connected on the DATA line to match the load created by the buffer on the STROBE line. Breaking the continuity of the system results in a waveform at that point matching what would be seen at the receiving end of the bus into the same load as the one produced at the break if no further skew was introduced along the line.



C.1.4 Simulations of the 80-conductor cable assembly

The difficult nature of measuring skew in actual systems makes simulations a more important tool in determining the effect on skew of design decisions regarding I/O cells, PCB layout, cable lengths, and other aspects of system design. Because of the well-controlled impedance of the 80-conductor cable assembly, single line transmission line models can provide accurate predictions of the delay through the bus based on a given design choice for a given set of conditions on the bus. Unfortunately, the worst-case signaling situations for skew are more varied and complex than the situations that create worst case data settling times on a 40-conductor cable assembly. As a result, a large number of simulations must be run encompassing many different combinations of parameters in order to be certain of the system-wide consequences of a particular design choice. Simulations such as these were used to determine the timing specifications for Ultra DMA mode 4, and are the basis of the guidelines given below.

One aspect of skew that can be easily measured or simulated (and should be minimized in all implementations) is output skew into a defined load. This is measured at the connector of the device into capacitive loads to ground of 15 pF and 40 pF. An alternate loading arrangement is to measure the signal produced at the end of an 18-inch 80-conductor cable assembly into typical device and host loads of 20 pF or 25 pF that are held uniform across STROBE and DATA lines. Skew is measured at the crossing of the 1.5 volt threshold. Skew should always be measured for all combinations of rising and falling edges on the signals involved.

Minimizing output skew is the best assurance of reliable signaling across the full range of cable loading and receiver termination conditions that will occur in systems.

C.1.5 Crosstalk in the 80-conductor cable assembly

Although the ground-signal-ground configuration of the 80-conductor cable assembly greatly reduces coupling between wires on the cable, the connectors generate a significant amount of crosstalk because they still use the original ATA ground configuration with no ground lines separating the 16 pins of the data bus. In addition, crosstalk between traces on the PCB can reach high levels in systems with long traces or with tight spacing between traces. Cumulative crosstalk plus ground bounce measured at the connector of the receiver in typical systems using the 80-conductor cable ranges from 400 mV to 1.5 V peak, in short pulses with frequency content in the range of XXX–XXX MHz depending on the rise time of the drivers used in the system. Although this level of total crosstalk may seem like a significant hazard to reliable signaling, crosstalk exceeding 800 mV normally occurs only during the interval when other signals are switching, and as a result it does not affect the setup or hold times seen by the receiver.

Figure 8 – Measured waveform of falling edge forward crosstalk

A larger signal integrity hazard exists when crosstalk occurs during the period when data is stable at the receiver and could be clocked. This can result from a high level of reverse crosstalk seen at the receiver as the reflected signal propagates from the receiver back to the source in the switching lines.



Figure 9 – Reverse crosstalk waveform

Reducing systems’ creation of and susceptibility to forward and reverse crosstalk requires an understanding of how crosstalk is generated and propagates through the system. Crosstalk results from coupling between signals in the form of either a capacitance from one signal conductor to another or inductors in the path of each signal with overlapping magnetic fields. The capacitive and inductive coupling operate differently and are easiest to understand if treated as separate effects.

Capacitive coupling in its simplest form consists of a capacitor connecting together two transmission lines somewhere along their length. When a change in voltage occurs on one line (called the aggressor line), a pulse on the non-switching signal (called the victim line) is produced proportional to the rate of change of voltage (dV/dt) on the aggressor line. The pulse on the victim line propagates both forward and backward from the point of coupling and has the same sign in both directions. Forward and backward are defined relative to the direction that the aggressor signal was propagating: forward means that the propagation is in the same direction as that of the aggressor signal; backward means that the propagation is in the direction opposite to that of the aggressor signal.

Figure 10 – Model of capacitive coupling

Figure 11 – Waveforms resulting from capacitive coupling (at source and receiver of aggressor and victim lines)

Inductive coupling can be modeled as an inductor in series with each signal, with some coupling factor K representing the extent to which their magnetic fields overlap. In effect these two inductors constitute a transformer, creating a stepped-down version of the aggressor signal on the victim line. The amplitude of the signal produced on the victim line is proportional to the rate of change in current (dI/dt) on the aggressor line. Since the impedance of a transmission line is resistive, for points in the middle of a transmission line dI/dt will be proportional to dV/dt. Because the crosstalk signal produced across the inductance in the victim line is in series with the transmission line, it has a different sign at each end of the inductor. Because the induced current always opposes the change in magnetic field that produced it, the polarity of the crosstalk signal is reversed from the polarity of the dI/dt on the aggressor line that produced it. As a result of these two facts, inductive crosstalk creates a pulse of forward crosstalk with polarity opposite the edge on the aggressor, and a pulse of reverse crosstalk with the same polarity as the aggressor.



Figure 12 – Model of inductive coupling

Figure 13 – Waveforms resulting from inductive coupling (at source and receiver of aggressor and victim lines)

Most actual occurrences of electromagnetic coupling involve both capacitive and inductive coupling together. In this case the forward and reverse crosstalk contributions of the capacitance and inductance add together. Because the forward inductive crosstalk and the forward capacitive crosstalk have opposite signs, they tend to cancel, while the reverse crosstalk contributions from both effects have the same sign and add together. Depending on the ratio of inductive to capacitive coupling, the forward crosstalk may sum to zero when both effects are added together.
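The sign bookkeeping for mixed coupling can be sketched as follows. The sketch assumes a rising aggressor edge and takes positive values to mean the same polarity as that edge, so the capacitive pulse is positive in both directions while the inductive pulse is negative in the forward direction and positive in the reverse direction. The function name and the normalized amplitudes are hypothetical.

```python
def mixed_crosstalk(capacitive, inductive):
    """Combine the two coupling mechanisms for a rising aggressor edge.

    capacitive and inductive are the pulse amplitudes (same arbitrary units)
    that each mechanism alone would produce on the victim line.
    Returns (forward, reverse); positive means same polarity as the edge.
    """
    forward = capacitive - inductive  # opposite signs: contributions tend to cancel
    reverse = capacitive + inductive  # same sign: contributions add
    return forward, reverse


# Hypothetical normalized amplitudes, for illustration only.
for cap, ind in ((1.0, 0.25), (1.0, 1.0), (0.5, 1.0)):
    fwd, rev = mixed_crosstalk(cap, ind)
    print(f"capacitive {cap:.2f}, inductive {ind:.2f}: forward {fwd:+.2f}, reverse {rev:+.2f}")
```

When the two amplitudes are equal the forward crosstalk sums to zero, and when inductive coupling dominates the forward pulse takes the polarity opposite to the aggressor edge.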



Figure 14 – Model of mixed capacitive and inductive coupling

Figure 15 – Waveforms resulting from mixed capacitive and inductive coupling (at source and receiver of aggressor and victim lines)

When transmission lines are placed parallel with and in close proximity to each other, such as PCB traces, wires in a ribbon cable, etc., the coupling that occurs is continuous along the length of the transmission line. To find the crosstalk waveforms that result from this at the source and receiver, divide the transmission line into segments and treat each segment as an instance of capacitive and inductive coupling as shown above, producing forward and reverse crosstalk as the aggressor edge goes by. Then sum the contributions from each of these segments, delaying their arrival at the ends according to their position along the transmission line. Doing this shows that the forward crosstalk contributions all add together and arrive simultaneously with the aggressor edge, while the reverse crosstalk is spread out along the length of the cable and produces a long flat pulse travelling back toward the source.
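The segment-by-segment summation described above can be sketched numerically. The following is a minimal illustration only: the function name, the segment count, and the 2.5 ns one-way line delay are assumptions chosen for the example, and the sketch tracks arrival times rather than amplitudes.

```python
def crosstalk_arrival_times(n_segments, line_delay_ns):
    """Return (forward, reverse) arrival times in ns, one per coupling segment.

    The aggressor edge is launched at the source at t = 0. Forward crosstalk
    from each segment travels on to the receiver; reverse crosstalk travels
    back to the source.
    """
    seg_delay = line_delay_ns / n_segments
    forward, reverse = [], []
    for i in range(n_segments):
        t_edge = (i + 0.5) * seg_delay          # edge reaches the segment centre
        t_to_receiver = line_delay_ns - t_edge  # remaining delay to the receiver
        forward.append(t_edge + t_to_receiver)  # arrival time at the receiver
        reverse.append(t_edge + t_edge)         # arrival time back at the source
    return forward, reverse


# Illustrative values only (assumed, not taken from the standard).
forward, reverse = crosstalk_arrival_times(n_segments=10, line_delay_ns=2.5)
print("forward spread (ns):", max(forward) - min(forward))
print("reverse spread (ns):", max(reverse) - min(reverse))
```

In this sketch every forward contribution reaches the receiver at the one-way line delay, together with the aggressor edge, while the reverse contributions are spread over an interval that approaches twice the line delay as the number of segments grows.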

Figure 16 – Forward and reverse crosstalk in a distributed system

In the examples above the waveforms are simplified by the fact that all transmission lines are perfectly terminated at both ends. In the case of the ATA bus, only the source end of the bus has a low-impedance termination to ground, and this termination is rarely perfect. This has a number of consequences that must be taken into account to understand crosstalk in an ATA system.

1) Crosstalk is produced by an aggressor signal (or signals) on both the initial and reflected edges. Forward crosstalk produced by the initial edge as it propagates from the source to the receiver arrives at the same time as the edge that produced it. The edge on the aggressor signal reflects from the high impedance at the receiver (or at the end of the cable) and returns back to the source. Reverse crosstalk produced as this reflected edge propagates back to the source is seen on the victim line at the receiver.

2) Reverse crosstalk from the initial edge that is not perfectly terminated at the source will be reflected (with reduced amplitude) back towards the receiving end of the system. The quality of the source termination depends on the instantaneous output impedance of drivers as they are switching, as well as the on resistance of the drivers in the high or low state once they have completed switching. Since the source impedance is made up of the driver output impedance in series with the termination resistors, the most accurate source termination can be achieved by using drivers with low output impedance combined with high value series resistors.

3) Crosstalk is seen with doubled amplitude at the high-impedance endpoint of the system (host IC during reads and end device during writes) due to the reflection. Since crosstalk occurs as a pulse rather than a step, the initial and reflected portions of the pulse only sum at the endpoint while the pulse is reflecting, and not at other points along the bus.

4) Series termination resistors at the receiving end of the bus serve to attenuate the amplitude of crosstalk seen at the pin of the receiving IC. Because the IC input impedance is predominantly capacitive, its impedance decreases at high frequencies. At the frequency where the impedance of the IC input, determined by $Z = 1/(2\pi f C)$, equals the impedance of the series termination resistor, the crosstalk pulse amplitude seen at the IC input will be about half of the amplitude measured at the connector. As a result of this, in systems where crosstalk levels are high enough to be a serious concern, measurements should be taken at the IC pin or on the IC side of the termination resistor. In design of systems, this filtering effect can be used to reduce a system’s susceptibility to crosstalk by increasing the value of series termination resistors and placing them close to the connector to maximize the amount of capacitance on the IC side of the resistor.
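The filtering effect in item 4 can be estimated with a first-order RC calculation. This is a rough sketch, not a method given by the standard: the function names are hypothetical, and the 33 ohm and 10 pf values are borrowed from the model in Figure 23 purely as an example. The attenuation function assumes a sinusoid in steady state, for which an ideal divider gives 1/√2 at the corner frequency; a crosstalk pulse contains a range of frequencies, so the result is only a guide.

```python
import math


def rc_corner_frequency_hz(r_ohms, c_farads):
    """Frequency at which 1/(2*pi*f*C) equals the series resistance."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


def sine_attenuation(r_ohms, c_farads, f_hz):
    """Magnitude of Vout/Vin for a series R feeding a shunt C (sinusoid)."""
    x_c = 1.0 / (2.0 * math.pi * f_hz * c_farads)
    return x_c / math.hypot(r_ohms, x_c)


# Example values (assumed): 33 ohm series resistor, 10 pF on the IC side.
f_c = rc_corner_frequency_hz(33.0, 10e-12)
print(f"corner frequency: {f_c / 1e6:.0f} MHz")
for r in (33.0, 47.0, 68.0):
    print(r, "ohms ->", round(sine_attenuation(r, 10e-12, 200e6), 3))
```

Raising the resistor value or the capacitance on the IC side lowers the corner frequency, which is the direction of change the text recommends for reducing susceptibility to crosstalk.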

In systems using the 80-conductor cable the largest single contributors to crosstalk are the connectors, followed by the PCB traces in systems with long traces or a large amount of coupling between traces. Crosstalk in the connectors is almost entirely inductive. It is produced in both directions from the connector but not necessarily in equal amplitudes. Because the highest amplitude crosstalk occurs when many lines are switching and only a small number are not, the effective source impedance of the crosstalk voltage is low, approximating a voltage source. This voltage source is in series with the transmission line impedance on each side of the connector. As a result, the crosstalk voltage is divided between the two lines proportional to their impedances.

Because of its polarity and directional characteristics, there is only one point in the cycle when connector crosstalk creates a serious hazard to signal integrity. After a rising edge has reflected off of the receiving end of the system it passes back through the connectors at each end of the cable. The reverse crosstalk this creates is positive and will propagate back to the receiver and be seen during the middle of the current cycle, creating the possibility of clocking bad data if the amplitude is high enough. At all other points during the cycle, connector crosstalk is either negative (forward crosstalk on a rising edge or reverse crosstalk during the reflection of a falling edge) or in the case of forward crosstalk on a falling edge, it occurs only during the time when all signals are switching and as a result does not affect the setup and hold times at the receiver.

In the worst case of reverse crosstalk from a reflected rising edge, the amplitude of crosstalk at the receiver will depend on the impedance on each side of the two connectors the signal passes through (propagating past a connector on the cable does not introduce a significant amount of crosstalk). If the host PCB trace impedance is low, a larger portion of the crosstalk created in the host connector will travel back towards the receiver.
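The division of connector crosstalk between the two sides of a connector can be illustrated with a simple voltage divider. This is a sketch under the voltage-source approximation described above; the function name and all impedance values are assumptions for illustration.

```python
def connector_crosstalk_split(v_crosstalk, z_cable_side, z_pcb_side):
    """Split a series crosstalk voltage between the two line impedances.

    Returns (voltage launched into the cable, voltage launched into the PCB
    trace), each proportional to the impedance on that side.
    """
    total = z_cable_side + z_pcb_side
    v_cable = v_crosstalk * z_cable_side / total
    v_pcb = v_crosstalk * z_pcb_side / total
    return v_cable, v_pcb


# Illustrative values: 1.0 V of connector crosstalk, 82 ohm cable,
# and two candidate host PCB trace impedances.
for z_pcb in (50.0, 75.0):
    v_cable, v_pcb = connector_crosstalk_split(1.0, 82.0, z_pcb)
    print(f"PCB {z_pcb:.0f} ohms: {v_cable:.2f} V toward cable, {v_pcb:.2f} V toward PCB")
```

With the lower PCB trace impedance a larger share of the crosstalk voltage appears on the cable side, which matches the statement that a low host trace impedance sends more of the host connector crosstalk back towards the receiver.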

C.1.6 Ground Bounce

Ground bounce is a form of crosstalk that arises from the resistance and inductance of the power and ground pins of IC packages. For single-ended drivers such as those used in ATA, the return current for all signals flows through the power and ground leads, with the result that voltage drops across these pins are imposed on all signals equally. Voltage drops across these pins occur due to both resistance and inductance, although in general inductance has the greatest effect. In terms of the voltage seen at the receiver, crosstalk due to ground bounce is indistinguishable from inductive crosstalk, with a sign opposite the polarity of the edge on the aggressor signal(s).

Figure 17 – Model of ground bounce in IC package



Figure 18 – Waveforms resulting from ground bounce (at source and receiver of aggressor and victim signals)

In order to measure ground bounce in a functioning system, it is necessary to remove all other sources of crosstalk to the greatest extent possible, ideally by disconnecting the IC pin on which the measurement is being taken and measuring directly at the package. To measure the maximum ground bounce, a line in the middle of the data bus pins on the IC should be held low and measured while all other data lines are switching in the same direction at the same time.

C.1.7 Measuring crosstalk in an ATA system

Measuring the total crosstalk in a system is simple: set up a data pattern in which one line in the middle of the data bus is held low, while all other lines switch simultaneously. Measure the low line at the receiver connector or IC. This measurement includes ground bounce at the source IC as well as the contributions to crosstalk of the PCBs, connectors, and cables. Determining the exact sources of the different features of the crosstalk measured by this technique can be difficult. The best method to isolate the crosstalk produced in a given portion of the system is to break the line before and after that feature and terminate it to ground at the breaks with a resistor equivalent to the transmission line impedance that it normally sees at that point. Measuring the crosstalk voltage across the termination resistors will indicate the raw quantity of crosstalk produced by that feature, independent of reflections due to impedance mismatches and attenuation due to capacitance along the bus. Adjusting for impedance mismatches (if they are known) and delays will allow the crosstalk from that feature to be identified in the total crosstalk of the system, and adjusting the impedance changes through the system may allow the impact of that crosstalk to be minimized.

C.1.8 System design considerations to minimize crosstalk

Because all crosstalk throughout the system is proportional to edge rate, the number one factor in controlling crosstalk is controlling the output slew rate of drivers. The second major factor is the impedance match of sources to the cable. This is important in order to prevent reverse crosstalk from the initial edge from reflecting off the source and being seen at the receiver, and to control the amount of reverse crosstalk generated by the reflected edge. Drivers, PCB layout, and resistors should be selected to provide a good source termination for crosstalk and the reflected signal edge. At each connector to the cable the impedance seen looking back towards the source when the device is driving should match the cable impedance that is the load in the forward direction. For devices this means that the sum of I/O cell output impedance and termination resistance should match the cable impedance (typically 80 to 85 ohms), minus 5-10% to allow for capacitive loading due to other devices on the cable. Because the PCB traces on a drive are short in comparison to the electrical length of edges on the bus, they have little effect on the drive’s output impedance. For hosts, PCB traces are often long enough that for high-frequency crosstalk, the impedance at the host connector is determined by the PCB trace impedance and termination resistors (if they are located at the connector), rather than by the I/O cell output impedance. Because of this, there are two options for hosts to ensure an ideal source termination:

1) Place the termination resistors near the IC and use a PCB trace impedance that matches the source impedance of the IC plus termination resistor, ideally slightly less than the cable impedance. In this case trace impedance should be high, approximately 70-75 ohms, and a large enough trace spacing should be maintained to keep crosstalk between PCB traces at a reasonable level.

2) Place the termination resistors near the connector and select PCB trace impedance and termination resistance to sum to the cable impedance or slightly less. In this case the IC source impedance should match the PCB trace impedance rather than the cable impedance, since that is the load that it is immediately driving.

Option 2 is desirable for legacy compatibility with the 40-conductor cable because placing the resistor near the connector helps to damp the ringing that occurs with that cable. In addition, 50 to 60 ohm traces are easier to implement and produce less crosstalk than higher impedance traces.

In either case, the total output impedance should maintain a close match to the cable regardless of whether the drivers are at steady-state or switching, rising or falling, or experiencing over or undershoot conditions.
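The quality of a source termination can be checked with the standard transmission line reflection coefficient. The sketch below is illustrative only: the function names are hypothetical, and the driver, resistor, and cable values are assumed examples rather than recommended values.

```python
def reflection_coefficient(z_termination, z_line):
    """Fraction of an arriving wave reflected by a termination on a line."""
    return (z_termination - z_line) / (z_termination + z_line)


def target_source_impedance(z_cable, derate_fraction):
    """Cable impedance reduced by a fraction to allow for capacitive loading."""
    return z_cable * (1.0 - derate_fraction)


# Assumed example: 82 ohm cable, 10 percent allowance for loading,
# and a driver whose output impedance plus series resistor totals 73 ohms.
z_cable = 82.0
z_target = target_source_impedance(z_cable, 0.10)
z_source = 40.0 + 33.0  # I/O cell output impedance + termination resistor
print("target source impedance (ohms):", round(z_target, 1))
print("reflection at source vs. target:", round(reflection_coefficient(z_source, z_target), 3))
```

A coefficient near zero means that reverse crosstalk and the reflected edge arriving at the source are mostly absorbed rather than sent back toward the receiver. Because the text asks for a close match in every driver state, the same check would be repeated for the switching, high, and low output impedances.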

C.1.9 Ringing and Data Settling Time

High amplitude ringing can occur for some data patterns in systems using the 40-conductor cable assembly. The sixteen signal lines forming the data bus of the ATA cable have only two ground lines adjacent to them (one on each side of the data signals), and only seven ground lines are present in the entire cable assembly. This lack of ground return paths has three negative effects on data signal integrity:

1) Crosstalk between data lines is very high due to inductive coupling.

2) Center conductors of the data bus exhibit very high inductance because the distance from these signal lines to the current return path is large and the ground return path is shared with many other signal lines.

3) Center conductors of the data bus are shielded from ground by the other data lines around them. When these lines are switching in the same direction there is no potential difference and therefore no effective capacitance between lines.

This combination of factors results in the impedance of the center data lines rising from 110 to 150 ohms (measured when a single line switches) to an almost purely inductive 300 to 600 ohms when all lines switch simultaneously in the same direction. Measured impedance varies with data pattern, cable length, loading, and distance from chassis ground.

In a simplified model of the 40-conductor cable assembly with all data lines switching, a middle data line can be described as a pure inductor, forming a series RLC resonant circuit with the capacitance of the IC and PCB traces, and the combined resistance of the driver source impedance and source series termination resistor. The voltage across C will ring sinusoidally in response to an input pulse at V_source, exponentially decaying over time towards a steady state value. The approximate frequency of the ringing can be calculated as $f = 1/(2\pi\sqrt{LC})$. The rate of decay is proportional to R/L.



Figure 19 – Simple RLC model of 40-conductor cable assembly with all data lines switching (V_source driving R = 40 ohms, L = 0.8 µH, and C = 25 pf in series)

Figure 20 – Output of simple RLC model: waveforms at source and receiving connectors (voltage from -5 v to 10 v versus time from 0 ns to 150 ns; traces: source end, receiving end)
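The ringing frequency and decay of the simple series RLC model can be estimated directly. The sketch below uses the standard series RLC relations and the component values shown in Figure 19, read here as 40 ohms, 0.8 µH, and 25 pF (the units are an assumption, since the figure text gives only the numbers). The function name is hypothetical.

```python
import math


def series_rlc_ringing(r_ohms, l_henries, c_farads):
    """Return (undamped frequency Hz, ringing frequency Hz, decay rate 1/s).

    The decay rate is R/(2L); the ringing envelope falls as exp(-rate * t).
    """
    omega0 = 1.0 / math.sqrt(l_henries * c_farads)
    alpha = r_ohms / (2.0 * l_henries)
    if alpha >= omega0:
        raise ValueError("circuit is not underdamped, so it does not ring")
    omega_d = math.sqrt(omega0 ** 2 - alpha ** 2)
    return omega0 / (2.0 * math.pi), omega_d / (2.0 * math.pi), alpha


f0, f_ring, alpha = series_rlc_ringing(40.0, 0.8e-6, 25e-12)
print(f"f0 = {f0 / 1e6:.1f} MHz, ringing = {f_ring / 1e6:.1f} MHz")
print(f"envelope time constant = {1e9 / alpha:.0f} ns")
```

Under these assumed values the estimate falls in the tens of megahertz, the same range as the ringing measured on the 40-conductor cable elsewhere in this annex. Increasing R or decreasing L speeds up the decay, which is why source termination and a low-inductance return path both reduce settling time.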

Data settling time (DST) is defined as the portion of cycle time required for ringing to reduce in amplitude until it will not cause an incorrect level to be detected on the bus based on the ATA thresholds of 2.0 volts (ViH) and 800 mv (ViL). There are a small number of signaling situations that cause the maximum DST for a system. The worst-case situation for most systems occurs when all data lines on the bus are switching except for one line near the middle of the bus that is being held low.



Figure 21 – DST measurement for a line held low while all others are switching high (channel 1 is measuring DD3 at the receiver, channel 2 is measuring DD11 at the receiver; traces: driven line, victim line; 10 ns/div; cursors: x1 = -1.00 ns, x2 = 17.600 ns, delta x = 18.600 ns, y1 = 1.5 v, y2 = 800 mv)

In this situation crosstalk creates a pulse on the signal line being held low that rings with a frequency and damping determined by the effective RLC parameters of the system. The DST value is the duration of time (measured at the receiver) between the nominal beginning of the cycle (when the switching lines cross the 1.5 volt threshold) and the time when the ringing on the line drops below ViL for the last time. The same situation can also occur with reversed signal polarity (one line staying high while others are switching), but due to the asymmetric threshold range of TTL logic, ringing rarely crosses ViH in this case. Another case arises when all lines are switching simultaneously and the voltage on lines near the middle of the bus rings back across the switching threshold. This is normally only a problem in the high state as low side ringing is greatly reduced by the substrate diode clamp to ground that is inherent in CMOS logic.
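The DST definition for a line held low can be applied to sampled waveforms as follows. This is a minimal sketch, not a measurement procedure from the standard: the function names are hypothetical and the waveforms are synthetic (a linear ramp for the driven line and a damped sinusoid for the victim line), generated only to exercise the code.

```python
import math

V_IL = 0.8   # volts, ATA low input threshold
V_REF = 1.5  # volts, nominal start-of-cycle threshold on the switching lines


def first_rising_crossing(times, volts, level):
    """Time of the first upward crossing of level, by linear interpolation."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 < level <= v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    return None


def last_time_above(times, volts, level):
    """Time of the last sample that is above level, or None."""
    last = None
    for t, v in zip(times, volts):
        if v > level:
            last = t
    return last


def data_settling_time(times, driven, victim):
    """DST for a victim held low: cycle start to last excursion above ViL."""
    t_start = first_rising_crossing(times, driven, V_REF)
    if t_start is None:
        raise ValueError("driven line never crosses the 1.5 V threshold")
    t_end = last_time_above(times, victim, V_IL)
    if t_end is None or t_end <= t_start:
        return 0.0
    return t_end - t_start


# Synthetic example data (assumed shapes, not measurements).
times = [i * 0.1 for i in range(1001)]  # 0 to 100 ns
driven = [min(3.3, max(0.0, (t - 10.0) * 3.3 / 5.0)) for t in times]
victim = [
    1.5 * math.exp(-(t - 10.0) / 20.0) * math.sin(2 * math.pi * 0.035 * (t - 10.0))
    if t >= 10.0 else 0.0
    for t in times
]
print("DST (ns):", round(data_settling_time(times, driven, victim), 1))
```

The result is limited by the sample spacing, so a real measurement would use the oscilloscope's own resolution and the waveforms captured at the receiver.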



Figure 22 – DST measurement for all lines switching (channel 1 is measured at the source, channel 2 is measured at the receiver; traces: source end, receiving end; 10 ns/div; cursors: x1 = 5.000 ns, x2 = 31.800 ns, delta x = 26.800 ns, y1 = 1.5 v, y2 = 2.0 v)

As the measurement above shows, use of 3.3 volt signaling by many hosts removes the high side voltage margin provided by the asymmetric TTL threshold. Consequently it is important for these hosts to use reduced slew rate I/O cells to control ringing.

C.1.9.1 Controlling ringing on the 40-conductor cable assembly

An improved RLC model allows comparison between different termination schemes. This model includes separate capacitors to represent trace and IC capacitance at the receiver, as well as a clamping diode, representing the substrate diode in CMOS logic. Because this single-line simplified model does not include crosstalk between lines in the data bus, it cannot be used to predict DST for a particular design and combination of parameters. However, it can indicate the direction of changes in ringing frequency and damping in response to changes in system parameters.

Figure 23 – Improved model of 40-conductor cable assembly ringing (V_source; R_source = 7 ohms; R_series_src = 33 ohms; Cable = 1 µH; R_series_rec = 33 ohms; C_trace = 15 pf; C_ICpin = 10 pf; D1 = D1N914)

Comparing the results given by this model for receiver termination resistors located at the IC versus the connector shows that greater damping is provided when termination is near the connector. In the schematic above this corresponds to R_series_rec being connected on the left side of C_trace.



Figure 24 – Results of improved 40-conductor model with termination at IC versus connector (voltage from -5 v to 10 v versus time from 0 ns to 150 ns; traces: waveform at source connector, waveform at receiver IC, waveform at receiver connector)

The same model can be used in a similar way to determine the effects of changing slew rate, termination resistor value, output impedance, PCB trace length, or the length of the cable.

Figure 25 – Results of improved 40-conductor model with source rise time of one, five, and ten ns

As the results in Figure 25 show, increasing the rise time to above five ns results in a significant decrease in the amplitude of the ringing. Drivers with control over the shape of rising and falling edges can be used to reduce ringing even more.



The two graphs above show that, although the diode clamps the voltage at the receiver at one diode drop below ground, a “ringback” pulse appears at around 100ns. This pulse occurs because the combined series resistance of the termination resistor and diode is much lower than the impedance of the LC circuit that is ringing. In addition the diode only clamps the voltage across part of the capacitance involved in the ringing. A higher-resistance clamping diode would be more effective at dissipating energy from the resonant circuit but would be less effective at clamping the input voltage.

C.1.9.2 Strobe lines on the 40-conductor cable

Although the data bus on the 40-conductor cable has such a high level of crosstalk that transmission line effects are barely perceptible, the strobe lines on the 40-conductor cable have a much more controlled impedance of approximately 115 ohms because they are in a ground-signal-ground configuration. Although the strobe lines are well shielded from crosstalk from each other and from the data bus, in the past some devices with fast edge rates and no source termination resistors experienced problems with overshoot and ringback on the strobe lines due to the large impedance mismatch between unterminated drivers and the 115 ohm transmission line. If the ringback exceeded 800 mV, the strobe would sometimes see its threshold crossed multiple times and cause extra words to be clocked at the receiver. After these problems were experienced, almost all device and host manufacturers began using series termination resistors on the strobe lines at both the source and the receiver. In addition, many manufacturers elected to use hysteresis on strobe inputs, and many also implemented deglitching schemes to ignore spurious edges resulting from ringing.

With current drive technology and the ATA requirement for series termination resistors, ringing on the strobe lines is rarely if ever a problem for current ATA devices. However, it is important to keep in mind that these are high speed edge triggered signals, and the possibility of double crossing of input thresholds due to noise, ringing, or transmission line reflections still exists. Because of this it is important that all ATA hosts and devices implement some amount of hysteresis on strobe inputs, possibly in addition to glitch filtering by digital logic after the inputs.

C.1.10 System Guidelines for Ultra DMA

This summary is a collection of reminders for device, system, and chipset designers. These guidelines are not strict mandates, but are intended as tools for developing compatible, reliable, high-performance systems.

Systems should meet the requirement for capacitance measured at the connector (25 pf host, 20 pf device). With typical interface IC and PCB manufacturing technology this limits host trace length to four to six inches. Capacitance should be measured at 20MHz as this is representative of typical ringing frequencies on the 40-conductor cable assembly.

In systems designed to be used exclusively with an 80-conductor cable assembly, PCB traces up to 12 inches long may be used as long as the following conditions are met:

1) The host chipset uses 3.3 volt signaling,

2) The host chipset allows timing margin for the additional propagation delay in all delay-limited interlocks,

3) Termination resistors are chosen to minimize input and output skew and are placed near the connector, and

4) Total capacitance of traces, additional components, and host input pins is held to the minimum possible.

In this case capacitance at the connector will exceed the value specified in this standard. As a result these systems may not operate reliably with a 40-conductor cable assembly in any Ultra DMA mode above mode 1 (25 megabytes per second) and should use one of the specified cable type detection methods to ensure that mode 2 and above are not set by the host without an 80-conductor cable assembly installed in the system.
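For a host of this kind, the restriction above amounts to a simple guard in the mode selection logic. The sketch below is hypothetical: the function name is invented for illustration, and `has_80_conductor_cable` stands for the result of whichever specified cable detection method the host uses, which is not shown here.

```python
def highest_allowed_udma_mode(requested_mode, has_80_conductor_cable):
    """Limit the Ultra DMA mode on a long-trace host without an 80-conductor cable.

    Applies only to the case described above, where capacitance at the
    connector exceeds the specified value.
    """
    if not 0 <= requested_mode <= 4:
        raise ValueError("Ultra DMA modes covered here are 0 through 4")
    if has_80_conductor_cable:
        return requested_mode
    return min(requested_mode, 1)  # mode 2 and above are not set


print(highest_allowed_udma_mode(4, has_80_conductor_cable=False))
print(highest_allowed_udma_mode(4, has_80_conductor_cable=True))
```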



The values for pull up and pull down resistors specified in this standard should be used. A value below the minimum given in the proposal will increase skew. Use of a higher resistor value on IORDY (such as 2.0 Kohm or 3.3 Kohm) will reduce skew and increase noise margin when IORDY is driven low.

Pull up and pull down resistors should be placed on the connector side of the series termination to minimize loss of DC margin due to pull up/pull down current through the series termination resistors.

Do not exceed the 18 inch maximum cable length required by this standard.

Ideal spacing for device connectors is six inches apart on 40 and 80-conductor cable assemblies from twelve to eighteen inches in length. For cable assemblies shorter than twelve inches, the middle device connector should be centered on the cable.

NOTE – Exceeding a spacing of six inches between device connectors on an 80-conductor cable will cause increased skew when signaling to or from the middle device. As spacing between the devices decreases on a 40-conductor cable assembly the capacitance of the two devices (or middle device and host) act in parallel, resulting in decreased ringing frequency and increased DST.

In systems using a 40-conductor cable assembly, provide a continuous electrical connection from ground on the device chassis through the system chassis to the ground plane on the host PCB. The cable should be routed as close to the chassis as possible to minimize inductance and reduce data settling time.

Total output impedance of hosts and devices should be designed to be as close as possible to the cable impedance to minimize reflections and reverse crosstalk due to the impedance mismatch between the PCB and cable. The impedance of the 80-conductor cable is specified to fall within the range of 70 to 90 ohms and is between 80 and 85 ohms for typical cables with solid wire and PVC insulation.

The ratio of PCB trace spacing to height above ground plane should be kept high to control crosstalk between traces. Using high impedance traces has the added benefit that, due to lower capacitance, longer traces can be used while still allowing use of 40-conductor cable assemblies.

PCB trace characteristics should be controlled to minimize differences in propagation delay between STROBE and DATA lines. Factors that affect the delay are:

1) Trace length,

2) Additional capacitance due to stubs, routing on inner layers, pads for unplaced components, and external components such as pull up resistors and clamping diodes, and

3) Additional inductance due to vias, series components such as termination resistors, and routing across a break in the ground plane, over areas with no ground plane, or at a larger height above the ground plane.

The 33 ohm series termination resistors recommended by the standard should be placed as close as possible to the cable header or connector.

Series termination values should be chosen to equalize input RC delays for the STROBE and DATA lines. For typical host IC implementations the same I/O cell is used on all signals and therefore all terminations should be the same value.

Sufficient ground and power pins should be used on interface ICs to control ground bounce when many lines are switching at the same time.

NOTE – The 80-conductor cable assembly impedance is less than half that of the typical 40-conductor cable assembly impedance when multiple lines are switching at the same time. For some I/O cells this will result in more than double the current draw during switching and as a consequence the amplitude of ground bounce will also double.

I/O cells should be designed to have rise and fall times of five ns or longer across the full range of loading conditions, process, and temperature.



I/O cells should be designed to produce the output setup and hold times at the connector as specified in this standard across the full range of loading conditions, process, and temperature. Margin should be provided to allow for skew introduced between the IC and the connector.

Device PCB traces and I/O cells should be designed to present similar loading between STROBE and DATA at the connector to minimize additional skew added to signaling between other devices on the bus.

The following loading conditions should be used to test I/O cells as well as host and device output characteristics at the connector:

1) zero pf to ground (open circuit, minimize test fixture capacitance)

2) 15 pf to ground

3) 40 pf to ground

4) 470 ohms to ground, switching low to high

5) 470 ohms to Vcc, switching high to low

All tests (except open circuit) should be conducted with the intended series termination resistance in place. Output skew and slew rates should be measured between the series termination and the load. Rise and fall times measured between ten and 90 percent of VoH should be five ns or longer into all loads.

C.2 Ultra DMA protocol

C.2.1 tSR, tRFS, and the number of additional transfers

The standard states that if the recipient does not meet the tSR maximum value, then the Ultra DMA burst may be paused with zero, one, or two additional data transfers for modes 0, 1 and 2, and up to three additional transfers for modes 3 or 4. This does not imply that the sender is allowed to send up to two or three more strobes after it detects the negation of DMARDY–. In most cases it would be a violation of tRFS to do so. The tRFS time is less than or equal to one transfer cycle time for modes 0, 1 and 2, and less than or equal to the time for two transfer cycles for modes 3 and 4. Sending two or three more strobes once DMARDY– transitions at the sender’s end of the cable would always be a violation of tRFS for modes 0, 1, and 3. In many cases sending two or three more STROBE edges would be a violation of the tRFS timing for modes 2, 3, and 4. Under all conditions, the sender must meet tRFS: “The sender shall honor the recipient’s negation of DMARDY– within tRFS ns (by not sending any more strobes).”

In most cases it would be a violation for the sender to generate the maximum number of STROBE edges the receiver must be ready to receive after negating DMARDY–. In those same cases, it is still possible for the recipient that is attempting to pause to see two or three more STROBE edges after it negates DMARDY– without any violation of the protocol. This is due to the delay of the signals through the cable. Take a case in mode 2 where the strobe time is 60 ns and signal delays add up to six ns:

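(Timing diagram showing STROBE at the sender, DMARDY– at the sender, DMARDY– at the recipient, and STROBE at the recipient; labeled intervals: 60 ns strobe time at the sender, 49 ns from the DMARDY– negation at the sender to the next STROBE edge, 6 ns cable delay on DMARDY–, 6 ns cable delay on STROBE, and 5 ns at the recipient)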



In this case both STROBE from sender to recipient and DMARDY– from recipient to sender experience a cable delay of six ns. While the recipient negates DMARDY– after the instant that the sender toggles STROBE, it does not see the STROBE transition until after the DMARDY– negation. This would account for the first word received. By the time the sender sees the DMARDY– negation, there are only 49 ns until the next strobe. This STROBE is within tRFS so the sender may send the strobe without violating the protocol. To the recipient, this would be the second transfer after it negates DMARDY–, but to the sender it would be the first and only allowable STROBE transition after seeing the DMARDY– negation.
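The arithmetic in this example can be reproduced with a short timeline calculation. This is a sketch under stated assumptions: the function name is hypothetical, the 5 ns offset between the sender's STROBE toggle and the recipient's DMARDY– negation is inferred from the 49 ns figure (60 - 6 - 5 = 49), and the tRFS value of 60 ns is a placeholder that should be replaced with the limit from the timing table in the standard.

```python
import math


def pause_timeline(cycle_ns, cable_delay_ns, negate_at_ns, t_rfs_ns):
    """Model a sender that keeps strobing for as long as tRFS permits.

    The sender toggles STROBE at t = 0 and every cycle_ns thereafter.
    The recipient negates DMARDY- at negate_at_ns (t >= 0).
    Returns (ns from the sender seeing the negation to its next STROBE edge,
             arrival times of STROBE edges the recipient sees after negating).
    """
    seen_by_sender_ns = negate_at_ns + cable_delay_ns
    last_allowed_ns = seen_by_sender_ns + t_rfs_ns
    next_edge_ns = (math.floor(seen_by_sender_ns / cycle_ns) + 1) * cycle_ns
    edges_after_negation = []
    k = 0
    while k * cycle_ns <= last_allowed_ns:
        arrival_ns = k * cycle_ns + cable_delay_ns
        if arrival_ns > negate_at_ns:
            edges_after_negation.append(arrival_ns)
        k += 1
    return next_edge_ns - seen_by_sender_ns, edges_after_negation


# Mode 2 example from the text; t_rfs_ns is an assumed placeholder.
margin, edges = pause_timeline(cycle_ns=60.0, cable_delay_ns=6.0,
                               negate_at_ns=5.0, t_rfs_ns=60.0)
print("ns from negation at sender to next STROBE:", margin)
print("STROBE edges seen by recipient after negation:", len(edges))
```

The recipient counts two edges after its negation, while the sender sends only one edge after it has seen DMARDY– negated, which is the distinction the paragraph above draws.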

It can be calculated that in the mode 2 corner cases where the cycle time is the minimum and the delays are maximized, any time that the recipient negates DMARDY– longer than tSR after it receives a strobe edge, it may receive up to two more transfers. tSR is the only timing that is not required to be met. This defines the boundary between cases where it is possible for up to one word to be received after the negation of DMARDY- (named a “synchronous pause”) and the cases where it is possible for up to two words to be received (named an “asynchronous pause”). By the same type of analysis used to show that up to two words may be received in cases similar to the one shown above, it can also be proven that when tSR timing is met by the recipient, it can only receive up to one more word without the sender being in violation of the protocol.

It is important to note that these values are specified to be at the connector and not inside the IC. There will be some output delay of DMARDY– from inside the IC to the connector, and there will be input delay of STROBE from the connector to inside the IC. Even when tSR is met at the connector, two more words may be received inside the IC after the device system clock edge that generates the negation of DMARDY– without any part of the protocol or timing being violated. The first word received inside the IC would be an edge that transitions at the connector before the negation of DMARDY– gets there due to output delays, and the second edge would be the single STROBE (at the connector) that is allowed in the tSR case.

Additionally, a recipient cannot expect a fixed number of words after negating DMARDY–. Every time a recipient begins a pause, it must be ready to accept zero more words, one more word, two more words, or three more words (for modes 3 and 4) at random. In addition, the recipient has to be capable of receiving STROBE edges until tRFS after it negates DMARDY–. The recipient should not use the receipt of two or three words after a pause has been initiated as an indication that the sender has paused. The recipient waits until tRP after the pause was initiated before taking any other action (e.g., terminating the burst). This is to allow the sender time to complete its process of transitioning to a paused state. This may take additional system clocks after the sender has sent its last STROBE transition.

It is impossible for an Ultra DMA recipient to stop a data transfer at an exact, predetermined boundary. Even by meeting tRP timing, the recipient cannot avoid cases where the sender may toggle STROBE for one additional word. Please see the clause XX on recipient pauses for additional implications of the tRFS timing.

C.2.2 Reasons for tSR

tSR defines a boundary between different pause cases. tSR could have been omitted from this standard, and it could have been required that, for all recipient generated pauses, the recipient should be able to receive up to two more words for modes 0, 1, and 2, and up to three more words for modes 3 and 4. However, a design could be produced in such a way as to always meet tSR through synchronizing the outgoing DMARDY– negation with the incoming STROBE signal from the sender. With this design a recipient would only be required to receive up to one more transfer. Even though this kind of design adds complexity and provides little advantage, tSR was included for completeness. Other than for this unlikely architecture, tSR has no other design implications and for most designs should be ignored.

A system where the DMARDY– is negated asynchronously with respect to the incoming STROBE would not be in violation of the protocol and would be the preferred implementation. In this implementation, the negation of DMARDY– for pauses would be controlled by the state of the FIFO. Once a near-full condition occurred, DMARDY– could be negated. There is no advantage toward FIFO size in trying to meet tSR since synchronizing the outgoing DMARDY– signal with the incoming STROBE requires an additional STROBE to occur after a FIFO near-full condition is detected before the DMARDY– can be negated. If the asynchronous method is selected as recommended, then the receiver will always be ready for the maximum number of words allowed in the standard after it negates DMARDY– and will work under any of the described conditions.

C.2.3 Reason why tZIORDY longer than tENV is not a problem

tZIORDY does not have a maximum value, while tENV has both a minimum and a maximum bound. In the initiation of a data in burst, this means that STOP may be negated and HDMARDY– asserted (i.e., the host is ready for the first DSTROBE edge transferring data from the device) while DSTROBE is released (in the high-impedance state). For the initiation of a data out burst, STOP may be negated (the host is ready to transfer data when the device is ready) while DDMARDY– is released. In either case, there is no problem if IORDY:DDMARDY–:DSTROBE is released. If IORDY:DDMARDY–:DSTROBE is released by the device, it will be detected as electrically high at the host because the host is required to have a pull-up resistor on this signal line. PIO and DMA protocols rely on this pull-up to maintain an electrically high level on IORDY. These protocols only require this signal to be negated when a device is not ready.

For Ultra DMA, IORDY:DDMARDY–:DSTROBE is only driven during a data burst. At the initiation of a data in burst, the device may wait until the first data transfer to negate DSTROBE. If the device does not use this implementation, it waits tZIORDY then asserts DSTROBE. Then, for the first data transfer, the device negates DSTROBE. In both cases the host sees a negation for the first DSTROBE. The first STROBE of a burst is never a low-to-high transition. At the initiation of a data out burst, the device may wait until a ready signal is required before driving DDMARDY–. If the device does not use this implementation, it waits tZIORDY then negates DDMARDY– (i.e., drives it electrically high). Then, to signal that the device is ready to receive data, the device asserts DDMARDY–. Both implementations are equivalent since the negated state of this signal will appear the same to the host as the released state.

C.2.4 Recipient pauses and implications for data handling and CRC calculation

The Ultra DMA protocol allows the recipient to pause and then terminate a burst at any time regardless of the state of STROBE or the data on the bus. Except for the first two words of a burst, there is never a guarantee that data on the bus will be transferred. Since a sender must stop toggling STROBE in less than one transfer cycle time after DMARDY– negates at its input, it is impossible to avoid cases where data will be gated or latched to the bus but never strobed because the data is latched before DMARDY– is synchronized to the sender’s clock. For example, one possible Ultra DMA mode 2 design implementation would be with a 33 MHz system clock and two flip-flops to synchronize the DMARDY– signal. The first flip-flop would be on an active clock edge and the second on the normally unused clock edge. In this case tRFS is only long enough for the sender to synchronize the DMARDY– signal and then stop toggling STROBE. Any data placed on the bus but not yet strobed when DMARDY– is internally synchronized is not to be strobed.

There is no minimum cycle time for DMARDY–. The recipient does not have to wait for additional words or for tRP from the time it negates DMARDY– until it re-asserts DMARDY–. If, after negating DMARDY–, the device becomes ready, it may reassert DMARDY–. Based on the implementation of the sender, a negation and immediate re-assertion of DMARDY– may cause a subsequent STROBE timing to be delayed. It is recommended that some hysteresis be used in the FIFO trigger points for assertion and negation of DMARDY– to avoid oscillation in the transfer (DMARDY– being negated after every word or two).
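The hysteresis recommendation can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name `RecipientFifo`, the `resume_gap` parameter, and the choice of placing the high-water mark exactly the worst-case overrun below the FIFO depth are hypothetical and are not taken from the standard. A real design may need additional margin for words already in flight inside the IC.

```python
# Hypothetical behavioral model of a recipient FIFO with hysteresis.
# Names and thresholds are illustrative assumptions, not part of the standard.

def max_words_after_pause(mode):
    """Worst-case words that may still arrive after DMARDY- is negated:
    two for modes 0-2, three for modes 3-4 (per the text above)."""
    return 3 if mode >= 3 else 2


class RecipientFifo:
    def __init__(self, depth, mode, resume_gap=4):
        self.depth = depth
        self.count = 0
        # Negate DMARDY- while room remains for the worst-case overrun.
        self.high_mark = depth - max_words_after_pause(mode)
        # Re-assert only after the FIFO has drained further (hysteresis).
        self.low_mark = self.high_mark - resume_gap
        self.dmardy_asserted = True

    def _update_dmardy(self):
        if self.dmardy_asserted and self.count >= self.high_mark:
            self.dmardy_asserted = False
        elif not self.dmardy_asserted and self.count <= self.low_mark:
            self.dmardy_asserted = True

    def push(self, words=1):
        """A word is strobed in by the sender."""
        if self.count + words > self.depth:
            raise OverflowError("FIFO overrun: data would be lost")
        self.count += words
        self._update_dmardy()

    def pop(self, words=1):
        """A word is drained toward the media or host memory."""
        self.count = max(0, self.count - words)
        self._update_dmardy()
```

With the two trigger points separated, draining a single word does not immediately re-assert DMARDY–, so the transfer does not oscillate between paused and running after every word or two.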

The above information on recipient pauses has two major implications: the first is with output data handling, and the second with CRC calculation. If an output register is used where data is transferred from memory to the register for presentation on the bus, no assumption should be made that the data has been or will be transferred. If a pointer in memory is incremented or the data is cleared from memory when it is sent to the output register, then that data may be lost unless some recovery mechanism is present to decrement the pointer or restore the data if it is never strobed due to a burst termination after a pause. During a pause, other bus activity (like a status register read) might occur between when a burst is paused and its resumption. A design using an output register would have any data in that register overwritten during this other activity. Other designs may involve similar considerations. It is most important to remember that data on the bus is not sent and should not be treated as sent until there is a valid STROBE edge.

Besides careful data handling to avoid the loss of a word, it is important to consider which data is used for calculating the CRC. This standard states, “For each STROBE transition used for data transfer, both the host and device shall each calculate a new CRC value”. Only words successfully transferred in the transfer phase of the burst are used to calculate CRC. This includes words legally transferred after a pause has been requested. Words put on the bus but never strobed are not to be used for CRC calculation. In addition, if STROBE is negated at the end of a pause and then the burst is terminated, the protocol requires STROBE to be re-asserted after DMARQ is negated or STOP is asserted, depending on the case (both conditions may be true when STROBE is re-asserted). As stated in the Burst Termination steps in this standard, no data is transferred on this STROBE edge and any data on the bus that was not strobed during the transfer phase of the burst must not be used in the CRC calculation on this re-assertion of STROBE.

C.2.5 CRC calculation and comparison

As stated in the clause on recipient pauses and implications for data handling and CRC calculation (see XX), CRC is calculated on successfully transferred data only. As explained above, there is no guarantee that data placed on the bus will be strobed. CRC is only calculated on words that are properly strobed.

The CRC generator is not to be clocked on the unsynchronized STROBE edge. A synchronized version of STROBE is required to clock data from an input latch to a FIFO. This same synchronized version of STROBE is used to strobe data to and clock the CRC generator. By directly clocking the CRC generator with the unsynchronized STROBE input, two problems could occur. Noise on the edge of STROBE that causes the input I/O cell to trigger more than once could cause the CRC generator to clock twice, but the synchronized versions of the STROBE would not have a glitch. In this case, the correct data could be strobed into the FIFO but the incorrect CRC value generated. Second, if there is an error in synchronization that causes the wrong data to be strobed into the FIFO, there is a possibility that the wrong data would be strobed to the FIFO while the correct CRC value is determined. The use of this incorrect structure makes the CRC value unreliable and eliminates its advantage.

While CRC generation in its most basic form is a bit-by-bit serial shifting process, data on the bus is transferred one word at a time, making a serial implementation difficult. For Ultra DMA, short of having an internal clock with a period 16 times shorter than the minimum transfer cycle time tCYC, a clock with a longer period and a parallel equivalent to the serial process is to be used. This standard includes the equations that define the XOR manipulations to make on each bit and the structure required to perform this calculation using a clock generated directly from STROBE. Through the given equations, the correct CRC can be calculated by using a small number of XOR gates, a single 16-bit latch, and a word clock (one clock per strobe edge). The equations define the value and order of each bit, and the order of each bit must be mapped directly to the same order lines of the bus. The CRC register must be pre-set to 4ABAh. This requires pre-setting the latch (CRCOUT) to 4ABAh before the first word clock occurs. After that, CRCIN15 to the latch is tied through to CRCOUT15. When the burst is terminated, CRCOUT15 is the final CRC bit 15 that is sent or received on DD15. This direct matching of bit order is true for all CRC bits. The proper use of the data sent on the bus bits DD0 through DD15 during the burst transfer is defined in the equations. The DD15 on the bus has the same value as bit DD15 in the equations to calculate CRC. This direct mapping is true for all bits strobed on the bus during a burst.
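The equivalence between a bit-serial CRC and a one-clock-per-word parallel form can be illustrated as follows. This is a sketch only: the generator polynomial (x^16 + x^12 + x^5 + 1) and the bit ordering used here are assumptions made for illustration, and the function names are hypothetical. The XOR equations in the standard remain the authority for the actual bit mapping. Only the 4ABAh preset is taken from the text above.

```python
POLY = 0x1021   # assumed generator: x^16 + x^12 + x^5 + 1
SEED = 0x4ABA   # preset value given in the text


def crc_serial(crc, word):
    """Shift one 16-bit word through the CRC one bit at a time (16 bit clocks)."""
    for i in range(15, -1, -1):           # assumed order: bit 15 first
        feedback = ((crc >> 15) ^ (word >> i)) & 1
        crc = (crc << 1) & 0xFFFF
        if feedback:
            crc ^= POLY
    return crc


# The update is linear over GF(2), so the contribution of each state bit and
# each data bit to the next CRC value can be precomputed once. In hardware
# these columns correspond to a fixed network of XOR gates.
STATE_COLS = [crc_serial(1 << b, 0) for b in range(16)]
DATA_COLS = [crc_serial(0, 1 << b) for b in range(16)]


def crc_parallel(crc, word):
    """One word clock: XOR together the precomputed contributions."""
    out = 0
    for b in range(16):
        if (crc >> b) & 1:
            out ^= STATE_COLS[b]
        if (word >> b) & 1:
            out ^= DATA_COLS[b]
    return out


def burst_crc(strobed_words, step=crc_parallel):
    """CRC over the words actually strobed during a burst."""
    crc = SEED
    for w in strobed_words:
        crc = step(crc, w)
    return crc
```

Only words that were actually strobed should be passed to `burst_crc`; a word placed on the bus but never strobed is simply not in the list.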

Once the burst is terminated and the host sends the CRC data to the device (the host always sends the CRC independent of whether the burst was a data in or data out transfer), the device compares this to the CRC it has calculated. While other CRC validation implementations may be possible, a CRC input register may be used on the device in combination with a digital comparator to verify that the CRC value in the input register matches the value in its own CRC calculation register.

C.2.6 IDENTIFY DEVICE command

A device communicates its Ultra DMA capabilities and current settings to the host in the data returned by the device as a result of an IDENTIFY DEVICE command.

Bit 2 in the Field validity word (word 53) is used to indicate that the bits in word 88 are valid. Word 88 defines the Ultra DMA modes of which the device is capable and the mode that is currently set. The bit in word 53 and the bits in word 88 are the only bits required for Ultra DMA in the ID information. For an Ultra DMA capable drive, word 53 bit 2 is always set. In addition, the device sets the bits in word 88 as appropriate.
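A host-side check of these fields might look like the sketch below. The function name and return structure are hypothetical. The bit positions within word 88 (bits 4:0 for supported modes 0 through 4, bits 12:8 for the selected mode) are an assumption about the IDENTIFY DEVICE layout and should be confirmed against the IDENTIFY DEVICE clause of the standard.

```python
def ultra_dma_info(identify_words):
    """Decode Ultra DMA capability from IDENTIFY DEVICE data.

    identify_words: sequence of 256 16-bit words returned by the command.
    Returns None when word 53 bit 2 is clear (word 88 not valid).
    """
    if not (identify_words[53] >> 2) & 1:
        return None
    word88 = identify_words[88]
    # Assumed layout: bits 4:0 = modes supported, bits 12:8 = mode selected.
    supported = [m for m in range(5) if (word88 >> m) & 1]
    selected = [m for m in range(5) if (word88 >> (8 + m)) & 1]
    return {
        "supported": supported,
        "selected": selected[0] if selected else None,
    }
```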



For the PIO and Multiword DMA protocols, only the host generates data strobes, so the minimum cycle times reported for those protocols in the IDENTIFY DEVICE data are used by the host for both data in and data out transfers to ensure that the device’s capabilities are not exceeded. For the Ultra DMA protocol, both the host and device strobe data depending on the direction of that data. The host determines a mode setting based on both the device’s capabilities and its own. The standard states that, “The sender may send data (toggle STROBE) at a minimum period of tCYC…A recipient must be able to receive data at the minimum tCYC for the currently active mode.” Defining the minimum cycle time at which the device is able to receive data would be redundant. If the device indicates that it is capable of an Ultra DMA mode, it must be able to receive at the minimum time for that mode; no additional cycle time information is required.

C.2.7 Strobe minimums and maximums

The Ultra DMA protocol does not define a maximum strobe time. The sender may strobe as slowly as it chooses independent of the mode that has been set. The limit on the maximum strobe time can be determined by the Ultra DMA device driver or BIOS time-out. This time-out should be at least on the order of a few seconds. If, for example, a device begins to strobe once every ten seconds during a data in burst, this would not be in violation of the protocol. However, this could cause a driver to assume the device is hung. Whatever recovery mechanism the driver chooses to use will then be performed. The recovery will most likely be a reset to the device.

In addition to not being required to send at the minimum transfer cycle time, the sender is also not required to maintain a consistent cycle time throughout the burst. It would not be a violation of protocol for the cycle time to change on every cycle so long as all cycles are longer than or equal to the minimum cycle time for the mode that is set. A recipient must not use an upper timing bound or PLL to qualify the strobe signal. According to the Ultra DMA proposal, the sender may consider the burst paused as soon as it meets the data hold time tDVH. For every word, after the sender has met the hold time, the sender may consider the burst to be paused. The other implication to this is that data to the recipient can stop on any word. After each word, the recipient must wait (with exception of the case where it chooses to pause or stop) but never require an additional word before allowing the burst to be terminated.

While the sender can strobe data as slowly as it wishes, the recipient must always be capable of receiving data at the minimum cycle time of the mode that has been set. The host controls the minimum cycle used by the device to send data for data in bursts by using the SET FEATURES command to set the transfer mode. The device may send at the fastest cycle time tCYC for the mode to which it has been set. The host must be capable of receiving at that cycle time.

C.2.8 Typical strobe cycle timing

The typical cycle times (in order to meet the minimum cycle times based on some reasonable clock variation and signal slew rates) are 120, 80, 60, 45, and 30 ns for 16.7, 25.0, 33.3, 44.4, and 66.7 megabytes per second, respectively.

Using a common system clock rate of 66.7 MHz, the achievable typical cycle times are 120, 90, 60, 45, and 30 ns. A STROBE cycle time of 90 ns for mode 1 is not a violation of the specification as discussed above. A typical cycle time of 90 ns reflects 22.2 megabytes per second. The reason that Ultra DMA mode 1 cycle timing was specified for 80 ns typical instead of 90 ns was for better support of systems that use 25 MHz clocks (40 ns period). A system running at 25 MHz may not be able to meet the minimum mode 2 timing (receive data at a minimum of 55 ns cycle time) but it should be able to meet the typical mode 1 timing. If the mode 1 timing was changed to 90 ns, a 25 MHz system would have to use an even slower cycle time of 120 ns (its next slower cycle time without using both edges) since 80 ns would be too short for the protocol.

While not a common clock rate, 50 MHz can meet all mode 0, 1, and 2 timings since its clock period is 20 ns, and it might be used for mode 4 where the data is sent at 40 ns typical instead of the 30 ns typical.
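The cycle times quoted in this clause follow from rounding each typical cycle time up to a whole number of system clock periods. A minimal sketch of that arithmetic is shown here, assuming only whole clock periods are used (no half-cycle timing) and that one 16-bit word is transferred per cycle. The function names are hypothetical.

```python
# Typical cycle times per Ultra DMA mode, in ns (from the text above).
TYPICAL_CYCLE_NS = {0: 120, 1: 80, 2: 60, 3: 45, 4: 30}


def achievable_cycle_ns(mode, clock_period_ns):
    """Shortest multiple of the clock period not below the typical cycle time."""
    cycles = -(-TYPICAL_CYCLE_NS[mode] // clock_period_ns)   # ceiling division
    return cycles * clock_period_ns


def rate_mb_per_s(cycle_ns):
    """Transfer rate for one 2-byte word per cycle."""
    return 2 * 1000 / cycle_ns


# 66.7 MHz clock (15 ns period): 120, 90, 60, 45, 30 ns
cycles_66 = [achievable_cycle_ns(m, 15) for m in range(5)]
# 50 MHz clock (20 ns period): mode 4 stretches to 40 ns
cycles_50 = [achievable_cycle_ns(m, 20) for m in range(5)]
```

For the 15 ns clock, mode 1 lands on 90 ns, which corresponds to about 22.2 megabytes per second, matching the figure given above.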



C.2.9 Holding data to meet Setup and Hold Times

There are three possible methods of holding data in an attempt to meet the setup and hold times. The first method would be to use the same clock edge to change data and the STROBE but delay the data through some gates. The second method would be to use one edge of the clock to change the STROBE and then the next opposite edge to change data (half cycle). The final method would be to use one active edge of the clock to change STROBE and then the next to change data.

For Ultra DMA mode 4, the cycle time is 30 ns, and the clock period is 15 ns. Using the values given in the timing derivations clause (see XX), the sum of all skews from the output flip-flop clock to the input flip-flop (most of which are out of the control of the sender) is shown to be just under plus or minus 14 ns. The skew between STROBE and DATA is just as likely to be in either direction, and the required setup and hold times of two CMOS flip-flops should be within 200 ps of each other and both less than 1 ns. This results in the requirement that the minimum setup and hold times that the sender ASIC must generate internally are both just greater than 14 ns. Using a single 66.7 MHz clock period between the STROBE and DATA is an ideal way of holding data. Since the setup and hold time margins are stringent for mode 4, it can be shown that either of the other two methods would fail due to gate delay variations and clock asymmetry. If the DATA transitions are not at the middle of a 30 ns mode 4 cycle, either the setup or hold time margin will be reduced.

C.2.10 Reasons for tACK timings

The tACK value is defined for the setup and hold times before assertion and after negation of DMACK– (a host generated signal). It is applied to all control signals generated by the host related to an Ultra DMA burst. These signals are STOP, HDMARDY–, HSTROBE and the address lines. To the device, the burst begins with the assertion of DMACK– and ends with the negation of DMACK–. For this burst period, all control signals must start, remain, and end in specific states as defined by the protocol. Since there may be some signal skew between signals from the host to the device due to transmission and I/O circuitry effects, the host is required to set up all the control signals before asserting DMACK–. This guarantees that by the time all the signals reach the device, they will all be in the proper state at the instant that DMACK– is asserted. A critical signal is the host STROBE signal. If the strobe does not meet the tACK timing before the assertion of DMACK–, there is a possibility that the device would see a spurious STROBE transition immediately after DMACK– is asserted. If at this point the drive has asserted its DMARDY– signal, data would incorrectly be transferred and the burst would fail. Using tACK as the hold time for the signals after the negation of DMACK– guarantees that at the end of the burst, the control signals as seen by the device end in the states they are supposed to end in. This avoids any device state machine confusion.

C.2.11 Host chances to delay a burst and reasons for them

After a device has asserted DMARQ, there is one chance that the host has to delay the start of the burst indefinitely for a data in burst and two chances for a data out burst. All other timings until the first strobe must proceed within specified maximum timings. For both a data in and a data out burst, the first chance that the host has to delay the burst is by delaying the assertion of DMACK–. This delay has no maximum limit. This is the only chance the host has to delay a data in burst and is necessary for cases where overlap in PCI bus access may cause a delay in the time it takes for the host to become ready to receive data from a device after sending a data in command. This one case could have been left as the only chance to delay a data out burst, but a second delay was included for optimization in the data out case. For a data out burst, the host may delay indefinitely the first strobe signal. Since the data transfer cannot start until DMACK– is asserted, the host overhead to begin a data out burst will be shorter if the host chooses not to use a delay in DMACK– and immediately asserts it when it sees DMARQ asserted. The host may then wait as long as necessary before strobing the first data. If the assertion of DMACK– is delayed until data is ready, there will be additional time necessary to assert it and start the burst. The difference in overhead may seem small but can still be used to optimize for a faster overall transfer rate. The device as a sender may not delay its STROBE indefinitely as the host can since the device controls the signal that starts the transfer process (DMARQ) and it should need no more chances to delay a burst.



Note that it is a violation of the protocol to terminate the burst unless at least one word has been transferred. If, after asserting DMACK–, the host needs to do something else on the bus, then it must send or receive at least one word of data before terminating.

C.2.12 Maximums on all control signals from the device

All timings for signals from the device except for the device strobe signal during a transfer include maximums. The proposal was written in this way to bound the time it takes to perform burst initiation, pause, and termination so the host can always know in advance how long tasks performed by the device can take. For instance, the longest the initiation of a data in burst may take from the host assertion of DMACK– to the first strobe is tENV max plus tRFS max. In all cases except for strobing during a data in burst, where a device can pause indefinitely (unless the host terminates the burst), the host can set time-outs for functions performed by the device. Rather than waiting a few seconds for a command or burst time-out, the host can determine that a problem exists if activity is not detected within the specified maximums. Another reason for these maximums is to create a minimum performance level for Ultra DMA devices. Also, the host may require a burst to terminate in a timely manner in order to service some other device on the bus or the system depending on the chip set design. These maximum timings allow this to occur.

For both data in and data out bursts, the host is capable and allowed to terminate the burst at any time. No matter what the device attempts to do to delay the termination of the burst, there is no way for it to indefinitely delay the burst if the host chooses to terminate it. For a device terminating a data in burst, once the device negates DMARQ, all the timings for which the device is responsible are limited with maximum times. The device can attempt to delay the negation of DMARQ after it has toggled DSTROBE for the last word of the burst or command. For the end of a command, it is likely that the host will assert STOP to terminate the command if the device delays its negation of DMARQ. Once STOP is asserted, the device can no longer delay the termination of the burst. It is required to negate DMARQ within tLI from the assertion of STOP. Similar timings apply for the termination of a data out burst: once the host asserts STOP, the device is required to respond on certain signals within tLI timings for the termination of the burst.

C.2.13 Bus turnaround responsibilities

In Ultra DMA, there are some timings that must be met in relation to the driving of the data bus DD(15:0) to avoid device and host bus contention. This particularly applies to data in bursts where the device drives DD(15:0) until the end of a burst and then there is a turnaround so that the host can drive the CRC data on the bus for comparison.

At the initiation of a data in burst the host may be driving the data bus. DMARQ and DMACK– bound Ultra DMA and most of the bus turnaround timings are taken from these signals. At the assertion of DMACK– the host must release the bus within tAZ. The host may have already released the bus long before the Ultra DMA burst started. Since there may be some system failure that causes the host to be in Multiword DMA mode and the device in Ultra DMA mode, there are two additional signals that must be in the proper state before the device drives data onto the bus. These signals are the negation of STOP and assertion of HDMARDY–. STOP is the same pin as DIOW– and HDMARDY– is the same pin as DIOR–, and Multiword DMA mode never asserts both DIOW– and DIOR– at the same time. The negation of STOP and assertion of HDMARDY– is equivalent to both DIOW– and DIOR– being asserted. Since the device requires both to be in this state before driving the bus, it guarantees that the host is in Ultra DMA mode and not Multiword DMA and is off the bus. Devices on the bus must monitor the STOP and HDMARDY– signals and, once both are in the proper state, must wait tZAD before driving the bus. The tZAD timing must be met from the last of the two signals to switch to the proper state. Once the device starts driving the data bus, it must continue driving the data until the end of the burst.
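The device-side qualification described above reduces to two small checks, sketched here. The function names are hypothetical and the tZAD value is left as a parameter because it is not given in this clause.

```python
def host_in_ultra_dma(stop_asserted, hdmardy_asserted):
    """True when STOP is negated and HDMARDY- is asserted, a combination
    that Multiword DMA never produces on the shared DIOW-/DIOR- pins."""
    return (not stop_asserted) and hdmardy_asserted


def earliest_device_drive_ns(t_stop_negated_ns, t_hdmardy_asserted_ns, t_zad_ns):
    """Earliest time the device may begin driving DD(15:0) in a data in burst.

    tZAD is measured from whichever of the two host signals reached its
    proper state last.
    """
    return max(t_stop_negated_ns, t_hdmardy_asserted_ns) + t_zad_ns
```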

At the end of a data in burst, there is the second bus turnaround. Unlike the start of the burst, both host and device are known to be in Ultra DMA mode. Since the host must drive data onto the bus before DMACK– is negated, the bus turnaround must occur before this. Unlike Multiword DMA, for the Ultra DMA protocol, all data for a burst is guaranteed to be sent by the time DMARQ is negated. In light of this, the timings for bus turnaround at the end of the burst are taken from the negation of DMARQ. After negating DMARQ, the device must release the bus within tAZ. This can best be achieved by using the same system clock edge to do both functions. After seeing the negation of DMARQ, the host begins to drive data no sooner than tZAH. Waiting tZAH guarantees that there will be no bus contention.

Note that during a data in burst, the host must not attempt to drive the bus at any time after the tAZ timing discussed above at the initiation of the burst or before the tZAH timing discussed above at the end of the burst. In this time period, the device has full control of the data bus.

C.3 Timing derivations

While the values for skew and delay in systems may change slightly as additional simulations and measurements are made, the process of deriving the timing values and assumptions about hardware will remain unchanged.

C.3.1 Fundamental timings, skews and delays

These timings are not under the control of the IC and PCB designer.

Typical Cycle Times

Mode 0 = 120 ns (16.7 megabytes per second) = eight 66.7 MHz clock cycles
Mode 1 = 80 ns (25 megabytes per second), or 90 ns with a 66.7 MHz clock (22.2 megabytes per second) = six 66.7 MHz clock cycles
Mode 2 = 60 ns (33.3 megabytes per second) = four 66.7 MHz clock cycles
Mode 3 = 45 ns (44.4 megabytes per second) = three 66.7 MHz clock cycles
Mode 4 = 30 ns (66.7 megabytes per second) = two 66.7 MHz clock cycles

Output Termination Resistor Delays:

Rising transition delay = 0.34 ns minimum, 1.96 ns maximum
Falling transition delay = 0.23 ns minimum, 2.61 ns maximum

Input Termination Resistor Delays

Data delay = –0.53 ns minimum, 0.76 ns maximum
Control signal delay = –0.18 ns minimum, 0.12 ns maximum

Cable and System skews and delays

Output IC pin to input ATA connector maximum negative skew = –3.37 ns (minimum strobe delay minus maximum data delay)
Output IC pin to input ATA connector maximum positive skew = 2.63 ns (maximum strobe delay minus minimum data delay)
Output IC pin to input ATA connector delay = 6.0 ns maximum
Output IC pin to input IC pin maximum negative skew = –3.52 ns (minimum strobe delay minus maximum data delay)
Output IC pin to input IC pin maximum positive skew = 2.73 ns (maximum strobe delay minus minimum data delay)
Output IC pin to input IC pin delay = 6.2 ns maximum

C.3.2 IC and PCB timings, delays, and skews

The timings listed here are within the control of the IC and PCB designer. While it is recommended that these timings be met, they are not requirements. Meeting tighter timings in some areas will allow looser timings in others. A designer should take all the timings that are achieved for that design and re-derive the worst case timings for the protocol to determine if the timings for the protocol are met.

Possible Clocks and characteristics



All frequencies are assumed to have 60/40% asymmetry

25 MHz (supports modes 0 and 1)
Typical Period = 40 ns
Clock variation = 1 %

33 MHz clock (supports modes 0, 1, and 2)
Typical Period = 30 ns
Clock variation = 1 %

33 / 30 MHz PCI clock (supports modes 0, 1, and 2)
Typical Period = 30 / 33.3 ns
Clock variation = 1 %
Minimum high or low time = 11.3 ns

50 MHz (supports modes 0, 1, 2, and 3)
Typical Period = 20 ns
Clock variation = 3.5 %

66 MHz (supports modes 0, 1, 2, 3 and 4)
Typical Period = 15 ns
Clock variation = 3.5 %

Note that if 33 MHz or a multiple of 33 MHz is used, the typical cycle time for mode 1 will be 90 ns instead of 80 ns as is achievable with 25 MHz or multiples thereof.

PCB Traces

Delay = 1.0 ns maximum
Skew between signals due to traces = 0.1 ns maximum

IC inputs

Input delay from I/O pin to internal FF, includes input buffer and routing = 5.5 ns maximum
Input skew from I/O pin to internal FF, (+/–) between strobe and data = 4.3 ns maximum

IC outputs

Output delay from internal system clock edge to I/O (including output buffer) = 18 ns maximum. In order to meet some interlock timings with a 33MHz or PCI clock, the output delay can be no more than 14 ns.

Output skew from internal system clock edge to I/O (+/–) between strobe and data. Strobe edge may be rising or falling and data edge may be rising or falling. Must be met with any falling edge starting at I/O cell’s VoH level or Vcc5 of system. Including skew due to possible ground bounce during switching.

With 25 or 33 MHz clock = 5.0 ns maximum
With 50 or 66 MHz clock = 5.5 ns maximum

Output rising vs. falling skew for a single buffer = 2.8 ns maximum

Three more ns of delay is required on data than strobe for 33 MHz and PCI cases only. This is because, with these clocks, the data is held by a half cycle, and a minimum half cycle is not sufficient to meet the output hold time given the output skews listed above. The strobe and data must be skewed so that the typical data delay is longer than the typical strobe delay. The only way to reduce this required skew and meet the hold time would be to reduce the total output skew listed above. The reduction in required data delay is equal to the reduction in total output skew.



IC flip-flops

Flip-flop setup time (internal) = 0.7 ns minimum
Flip-flop hold time (internal) = 0.5 ns maximum

C.3.3 System timing parameters

All System timings for Ultra DMA (including tRP) are referenced to the connector of the agent responsible for the timing. Internally the IC must account for input and output delays and skews associated with all signals getting from the connector to the internal flip-flop of the IC and from the flip-flop of the IC to the connector.

All of the values presented in the system timing derivations below use the minimum and maximum timing characteristics listed above. While values are given for each possible clock frequency for some parameters, it is important to remember that each is only an example of what system timings will be when the above listed timing characteristics are met. An IC designer should re-derive all listed applicable timings based on the characteristics of the available system clock, IC and PCB that are used, to confirm that all system timing requirements are met.

C.3.3.1 tCYC

This minimum timing must account for STROBE asymmetry and clock variation. The worst case for minimum tCYC would be generated by using the maximum output buffer skew for signals switching in opposite directions. The formula for the minimum values is:

+ Number of clock cycles to meet minimum typical cycle time @ minimum cycle time due to clock variation

– Maximum skew for switching in opposite directions on same buffer
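As a worked instance of this formula, the sketch below plugs in values listed earlier in this annex. It assumes that the "maximum skew for switching in opposite directions on same buffer" is the 2.8 ns rising vs. falling output skew for a single buffer, and that the 3.5 % clock variation of the 50 and 66 MHz clocks applies. Both choices are assumptions made for illustration, and the results are not the normative tCYC values.

```python
# Typical cycle times per mode in ns (from the "Typical Cycle Times" list).
TYPICAL_CYCLE_NS = {0: 120, 1: 80, 2: 60, 3: 45, 4: 30}

# Assumed to be the "Output rising vs. falling skew for a single buffer".
RISE_FALL_SKEW_NS = 2.8


def min_tcyc_ns(mode, clock_variation=0.035):
    """Minimum cycle: shortened clock periods minus opposite-direction skew."""
    shortest_cycle = TYPICAL_CYCLE_NS[mode] * (1 - clock_variation)
    return shortest_cycle - RISE_FALL_SKEW_NS
```

Under these assumptions mode 0 works out to about 113.0 ns and mode 4 to about 26.15 ns. A designer would substitute the skew and clock variation actually achieved by the design.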

C.3.3.2 t2CYC

Since this timing is taken from falling edge to falling edge or rising edge to rising edge of STROBE, asymmetry in rise and fall time has no effect on the timing. Clock variation is the only significant contributor to t2CYC variation. The formula for the minimum values is:

+ 2 X (Number of clock cycles to meet minimum typical cycle time @ minimum cycle time due to clock variation %)

The minimum t2CYC timings for modes 0, 1, and 2 in ATA/ATAPI-5 are 235, 156 and 117 ns respectively. The existing numbers are based on a 2% clock frequency variation, whereas actual clock frequency variation on some devices is higher. For a 3.5% clock variation as listed above, the t2CYC times for modes 0, 1, and 2 must be changed to 230, 154, and 115 ns respectively. As with tCYC, no manufacturers have raised this timing as an issue; the standard is being changed to reflect actual achievable values.
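The t2CYC arithmetic can be checked with a short sketch. It assumes the formula is applied to the typical cycle times of 120, 80, and 60 ns and that the result is rounded down to a whole nanosecond. The function name is hypothetical.

```python
import math

# Typical cycle times for modes 0, 1, and 2 in ns.
TYPICAL_CYCLE_NS = {0: 120, 1: 80, 2: 60}


def min_t2cyc_ns(mode, clock_variation):
    """Two consecutive cycles at the shortest clock period."""
    return 2 * TYPICAL_CYCLE_NS[mode] * (1 - clock_variation)


# 2 % variation: 235.2, 156.8, 117.6 ns
at_2_percent = [math.floor(min_t2cyc_ns(m, 0.02)) for m in range(3)]
# 3.5 % variation: 231.6, 154.4, 115.8 ns
at_3_5_percent = [math.floor(min_t2cyc_ns(m, 0.035)) for m in range(3)]
```

At 2 % this reproduces 235, 156, and 117 ns. At 3.5 % it gives 231, 154, and 115 ns, so under these assumptions the 230 ns figure quoted for mode 0 is slightly more conservative than the arithmetic alone requires.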

C.3.3.3 tDS

This is the data set up time at the receiver. Since timings are taken at the connector and not at the ASIC, the effect of the termination resistors and traces must be considered when generating this number. Depending on the direction of the data signal and STROBE transitions, the skew between the two can change in both the positive and negative direction. A longer data signal delay will reduce the setup time, and a longer STROBE delay will increase the setup time.

In order to meet the required input skews given above, the number of buffers or amount of logic between the incoming signals and the input latch must be minimized. It may require the data input buffers to be routed directly to the input latch with no delay elements and the STROBE signal routed directly from its input buffer to the input latch clock with no delay elements.



The internal latch/flip-flop has a non-zero setup and hold time. tDS must be sufficient to guarantee that the setup time of the flip-flop is met. The minimum setup required at the IC pin is:

+ Maximum input skew

+ Minimum flip-flop setup time

Setup at IC for all modes = 5.0 ns minimum.

What is achievable to the receiver connector without considering data settle time can be determined as follows:


– Number of clock cycles used to hold data @ minimum cycle time due to clock variation or @ minimum cycle symmetry if a half cycle is used

– Absolute value of maximum skew from sender IC to receiver connector between STROBE and DATA where DATA delay is longer.

– Maximum skew through sender trace (not included in IC to receiver connector skew value)

The proposal for Ultra DMA adds margin for settle time for modes 0, 1, 2, and 3 by setting the minimum tDS values to 15, 10, 7, and 7 ns respectively. The minimum working (and achievable) setup time tDS at the receiver connector is five ns.

C.3.3.4 tDH

As with the setup time tDS above, the hold time at the connector of the receiver must be of sufficient time to guarantee that the hold time of the internal flip-flop is met. The longest STROBE delay and shortest data delay is the worst case for hold time. The analysis is similar to the one for tDS above. The minimum hold required at the IC pin is:

+ Maximum input skew

+ Minimum flip-flop hold time

Hold at IC for all modes = 4.8 ns minimum.

What is achievable to the receiver connector without considering data settle time can be determined as follows:

+ Number of clock cycles used to hold data @ minimum cycle time due to clock variation or minimum half cycle time given worst asymmetry if a half cycle is used

– Absolute value of maximum skew from sender IC to receiver connector between STROBE and DATA where STROBE delay is longer.

– Maximum skew through sender trace (not included in IC to receiver connector skew value)

The minimum working (and achievable) hold time tDH at the receiver connector is five ns. This assumes that one 50 or 66.7 MHz clock or half of a 33 MHz or slower clock has been used to hold data within the sender IC.

C.3.3.5 tDVH

This is the hold time required at the sender. The determination of this parameter can be approached from many perspectives. One approach is to determine what value must be met in order to meet tDH. The minimum hold time required by the receiver has already been determined above (tDH) and the first approach will be based on this.

Hold time is reduced in a system where the STROBE delay is longer than the data delay; this is represented by the maximum positive skew above. The hold time required at the output IC is therefore:



+ tDH

– Maximum positive IC to IC skew

– Maximum trace skew

IC requirement = 7.63 ns

Given this requirement at the IC pin, the requirement at the connector pin can be determined. Strobe delay reduces the hold time and data delay increases it so the worst case would be as follows:

+ IC requirement just determined

– Maximum trace skew

– Maximum output falling delay through series termination

+ Minimum output rising delay through series termination

tDVH all modes = 5.21 ns minimum

This value is rounded up to the nearest nanosecond to add a little margin so the tDVH specification for all modes is set to 6.0 ns.

After a connector timing requirement is determined, the IC pin requirement to meet this must be determined. A straightforward way to do this is to determine the extra margin added to tDVH by rounding up to the nearest nanosecond and add that same margin to the IC requirement already determined. The same result is found by taking the tDVH specification value and adding back the trace and termination resistor skew maximums as follows:

+ tDVH specification

+ Maximum trace skew

+ Maximum output falling delay through series termination

– Minimum output rising delay through series termination

Actual IC requirement to meet tDVH = 8.42 ns
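The two ways of arriving at the IC requirement can be cross-checked with the figures quoted in this clause. The following Python sketch uses only the 7.63 ns and 5.21 ns values from the text; the variable names are illustrative, and the combined trace and termination contribution is inferred from the difference between those two values rather than taken from the timing characteristics.

```python
import math

IC_REQUIREMENT_NS = 7.63    # hold required at the IC pin (from the text)
TDVH_CALCULATED_NS = 5.21   # resulting hold at the connector (from the text)

# Net trace and termination contribution implied by the two figures above.
trace_and_termination_ns = IC_REQUIREMENT_NS - TDVH_CALCULATED_NS

# The specification rounds the connector value up to a whole nanosecond.
tdvh_spec_ns = float(math.ceil(TDVH_CALCULATED_NS))

# Method 1: add the rounding margin to the original IC requirement.
rounding_margin_ns = tdvh_spec_ns - TDVH_CALCULATED_NS
ic_req_method1 = IC_REQUIREMENT_NS + rounding_margin_ns

# Method 2: add the trace and termination contribution back to the spec value.
ic_req_method2 = tdvh_spec_ns + trace_and_termination_ns

print(f"{ic_req_method1:.2f} ns, {ic_req_method2:.2f} ns")
```

Both methods give the same 8.42 ns requirement, since each adds the same 0.79 ns of rounding margin to the same starting point.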

Either the achievable tDVH or the hold at the IC pin can be determined and verified against the appropriate value listed above. The achievable tDVH can be determined as follows:

+ Minimum internal hold time. This will be a minimum half clock cycle time for a 25 or 33 MHz clock (clock at the minimum cycle time due to clock variation and the minimum asymmetry). For the PCI clock it will be the minimum PCI high or low time. For a 50 or 66 MHz clock it will be a minimum full clock cycle time (clock at the minimum cycle time due to clock variation).

– Maximum IC output skew

– Maximum trace skew

– Maximum output falling delay through series termination

+ Minimum output rising delay through series termination

+ Extra IC data delay over strobe delay for 33 MHz and PCI clocks only

If all timing characteristics given above in this document are met, the hold time requirement at the IC in order to meet the tDVH specification will be met. As mentioned in the IC timing clause (see XX), the required extra delay on the data over the STROBE can be reduced or eliminated by reducing the output skew for the IC. No other output delays or skews are under the control of the IC, so this is the only way the required delay can be reduced. A full cycle time must not be used to hold data with a 33 MHz or PCI clock because this would be a maximum of over 30 ns internal hold time, making it impossible to meet the required tDVS time. Simple gate delays cannot be used to meet the hold time. Take, for example, an IC where the output buffer skews alone are reduced to only two ns. In order to meet tDVH, the internal hold time would have to be a minimum of 10.4 ns. Delays from transistor to transistor in a single IC can be well matched, but over process, temperature, and voltage, transistor delay can vary by over 3X. Even if the process were controlled well enough to guarantee a delay variation of 3X, the maximum internal hold would be over 30 ns and the tDVS time would not be met. While not recommended, it may be possible to use a calibrated delay path instead of the clock to produce the desired hold time.
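The gate-delay argument above reduces to a single multiplication, sketched here in Python. The 10.4 ns minimum hold and the 3X spread come from the text; the function name is illustrative, and the model simply assumes that a delay chain sized to guarantee the minimum can be up to three times slower at the opposite corner.

```python
def max_gate_delay_ns(min_delay_ns, variation_ratio):
    """Worst-case delay of a gate chain sized to guarantee min_delay_ns."""
    return min_delay_ns * variation_ratio

MIN_INTERNAL_HOLD_NS = 10.4   # from the text: hold needed with 2 ns output skew
VARIATION_RATIO = 3.0         # from the text: process, temperature, voltage spread

worst_case = max_gate_delay_ns(MIN_INTERNAL_HOLD_NS, VARIATION_RATIO)
print(f"{worst_case:.1f} ns")
```

The result, 31.2 ns, is over the 30 ns figure cited above, which is why a delay built from uncalibrated gates cannot satisfy both tDVH and tDVS.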



C.3.3.6 tDVS

This is the data signal setup time that must be met at the sender in order to meet the setup time at the receiver. In the case of Ultra DMA modes 0, 1, and 2, the data settle time can be long due to coupling in the cable and on the PCB, and loading. For the fastest modes (those above mode 2), there is little or no achievable margin for extra ringing on the cable. For these modes, the 80-conductor cable assembly that reduces the coupling between signals and eliminates any ringing on the cable that crosses the threshold is required. Using the formulas given above it can be demonstrated that the setup time required at the sender is met if both ICs meet the parameters listed in this document. Identical parameters are therefore used here to calculate achievable setup time at the sender I/O cell. Meeting these values will therefore generate functional setup times at the receiver as long as data settle times are sufficiently short:


– Number of clock cycles used to hold data @ minimum cycle time due to clock variation or @ minimum cycle symmetry if a half cycle is used

– Maximum skew through sender trace

The mode 4 value leaves no settle time with a 66 MHz clock. There should be about five ns worth of settle time available if mode 4 data is sent at a slower rate with a 50 MHz clock. In mode 3 there is about 14 ns of margin for data settle time. Given that all of these settle times are less than worst-case settle times with a 40-conductor cable assembly, an 80-conductor cable assembly is required for both modes 3 and 4.

The value specified for Ultra DMA mode 2 for tDVS is 34 ns in ATA/ATAPI-4. The only way this value may be achieved along with the required hold time is to reduce the output skew. If the total output skew is reduced by two ns, then the required data delay can also be reduced by two ns, making the achievable tDVS four ns longer, which would meet the specification. For a 50 MHz clock the total output skew would have to be reduced by four ns, which will be difficult or impossible. The Ultra DMA 66 proposal changes the mode 2 value to 30 ns.

C.3.3.7 tFS

This timing is used only for the beginning of a read command from the STOP negation and/or HDMARDY– assertion to first DSTROBE (all falling edges). The device is required to “sense” that these two control signals from the host have changed. In general, synchronization is done with two flip-flops. After synchronization is achieved, data must be driven on to the bus and clock cycles counted off to meet the minimum setup time before the first STROBE is driven. In order for an IC based on a 25 MHz, 33 MHz, or PCI clock to meet tFS, data must be driven onto the bus no later than about 2.5 clock cycles after the control signal transitions. This could be done by synchronizing with both the active and inactive edge of the system clock or by using only active edges to synchronize but then driving data onto the bus on the next inactive edge of the clock after the signals are detected at the output of the second synchronization flip-flop. With a 50 MHz clock, the first word of data must be driven out no later than three cycles after the control transitions and with a 66 MHz clock, it may be four cycles. The maximum tFS timing is the sum of the following:



+ Maximum input STROBE falling edge delay through termination resistor

+ Maximum PCB trace delay

+ Maximum IC input delay to flip-flop

+ Minimum flip-flop setup time

+ Two, three, or four clock cycles at the maximum period due to frequency variation to synchronize the control signals and start the data transfer cycle. For 25 MHz, 33 MHz, and PCI based systems, the data would be driven out ½ cycle after this; for other clock frequencies, data must be driven out no later than the three or four cycles allowed for here.

+ As many cycles as required to meet the tDVS minimum timing for the first word of data. The worst case for tFS is with these cycles at the maximum period due to frequency variation. For 25 MHz, 33 MHz, and PCI based systems, the number of cycles would be whatever is required to meet tCYC time. For 50 and 66 MHz clocks, it would be one less.

+ Maximum output buffer delay

+ Maximum falling edge output termination resistor (33 ohm) delay

C.3.3.8 tLI

The value of tLI needs to be large enough to give one agent enough time to respond to an input signal from the other. The derivation of tLI is similar to that of tFS since both involve one agent responding to the control signal of another. As with tFS, the number of clock cycles that an IC may take to respond is dependent on the frequency of the clock being used. For a 25 MHz or PCI clock, the maximum time to respond is three cycles, for a 33 MHz clock it is four, for a 50 MHz clock it is five, and for a 66 MHz clock it is seven cycles maximum for modes 0 through 2. Modes 3 and 4 require a faster response time. For a 50 MHz clock it is three and for a 66 MHz clock it is four clock cycles maximum. The achievable values of tLI are derived as follows:

+ Maximum input delay through series terminations

+ Maximum PCB trace delay

+ Maximum IC input delay to flip-flop

+ Maximum flip-flop setup time

+ Three, four, five, or seven clock periods (depending on clock used and modes supported) at the maximum period due to frequency variation to synchronize the signals to the internal clock and respond appropriately

+ Maximum output buffer delay

+ Maximum output delay through series termination (falling)
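The tLI sum above can be sketched in Python as follows. The table of response cycles is taken from the text of this clause. Everything else is hypothetical: the function name, the 15 ns nominal period used for a 66 MHz clock, the 3.5% variation applied to it, and the six fixed path delays are placeholders rather than values from the timing characteristics.

```python
# Maximum response cycles from the text, keyed by mode group and clock.
RESPONSE_CYCLES = {
    "modes 0-2": {"25 MHz": 3, "PCI": 3, "33 MHz": 4, "50 MHz": 5, "66 MHz": 7},
    "modes 3-4": {"50 MHz": 3, "66 MHz": 4},
}

def achievable_tli_ns(cycles, nominal_period_ns, clock_variation, fixed_delays_ns):
    """Fixed path delays plus the response cycles at the longest clock period."""
    longest_period = nominal_period_ns * (1.0 + clock_variation)
    return sum(fixed_delays_ns) + cycles * longest_period

# Hypothetical fixed delays in ns: input termination, PCB trace, IC input,
# flip-flop setup, output buffer, output termination.
FIXED_DELAYS_NS = [1.0, 0.5, 3.0, 1.0, 8.0, 1.5]

cycles = RESPONSE_CYCLES["modes 3-4"]["66 MHz"]
tli = achievable_tli_ns(cycles, 15.0, 0.035, FIXED_DELAYS_NS)
print(f"{tli:.1f} ns")
```

The clock-cycle term dominates the total, which is why the permitted number of response cycles drops for modes 3 and 4.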

C.3.3.9 tMLI

This timing ensures that some control signals are in their proper state before DMACK– is negated. It is important that STROBE and the control lines are in their proper states because all signals revert to their non-Ultra DMA definitions at the negation of DMACK–. If the signals are not in their proper state, the active device or another device may see a false read or write strobe or data request. All control signals must be in their proper state and detectable at the device ASIC pins before DMACK– is negated, so tMLI must overcome the following:

+ Maximum IC to IC delay

+ Maximum IC input delay to flip-flop

+ Minimum flip-flop setup time

The value determined for tMLI for all modes is just over 12 ns. The 20 ns specified for tMLI in this standard is more than enough to meet the requirement.

C.3.3.10 tUI

This timing is always measured from an action of a device to a reaction by the host. In order to allow the host to indefinitely delay the start of a read or write transfer, this value has no maximum. The minimum value is left at 0 for modes 3 and 4.



C.3.3.11 tAZ

During data bus direction turnaround, the current driver of the bus is required to release the data on the same clock cycle as another action it is taking (at the latest). For the beginning of a read burst, the host must release the bus before or on the same clock cycle that it asserts DMACK–; for the end of a read burst, the device must release the bus before or on the same clock cycle that it negates DMARQ. If the same clock is used, the maximum delay can be calculated using the following formula:

+ Maximum total IC output skew

+ Maximum output (33 ohm) termination resistor delay

The minimum required value determined for tAZ for all modes is under six ns. The specification for this value in this standard allows for additional margin and, thus, is ten ns.

C.3.3.12 tZAH

This timing is used only for the termination of a read where the actions taken by both the host and device to change the direction of the data bus are measured from the same control signal (DMARQ). In this case the device is allowed to continue driving the bus for a maximum of tAZ after the DMARQ negation. The device is driving both DMARQ and the data bus to start. The host must wait tZAH after the DMARQ negation to drive the data. Skew on the cable is the major factor to consider here and a longer data delay than DMARQ delay (referred to here as maximum negative skew) is the worst case. To avoid bus contention, this value is calculated using the following formula:

+ Maximum tAZ

– Maximum negative I/O to I/O (overestimated by one set of termination resistors)

The minimum value calculated for tZAH is just under 14 ns. The specification for this value in this standard allows for additional margin and, thus, is 20 ns.

C.3.3.13 tZAD

This timing is used only for the initiation of a read operation where the direction of the data bus is changed. Unlike the termination of a read operation where tZAH is used, the bus high impedance time (bus released) and bus driven time are measured from two different control signals. Since these control signals must meet tENV timing, which is a minimum of 20 ns, no additional delay is necessary based on the tZAH evaluation. The device must wait for the correct conditions to be present and then may immediately start driving the bus with no possibility of having bus contention. In practice, the device will require two flip-flop delays to synchronize the control signals before it begins driving the bus. The specification for this value in this standard, however, is 0 ns minimum.

C.3.3.14 tENV

This time is from the host’s assertion of the DMACK– signal (falling edge) at the beginning of a burst to the assertion or negation of other control signals from the host (all falling edges). Since tENV only applies to outputs from the host, the timings are synchronous with the host clock. Based on an argument similar to the one for tMLI, the minimum for tENV is set to 20 ns. This guarantees that all control signals at all the devices are in their proper (non-Ultra DMA mode) states before DMACK– is asserted and are sensed as changing only after DMACK– has been asserted. The 20 ns accounts for cable and gate skew between DMACK– and the control signals on device inputs. Since tENV involves synchronous events only and an increase in tENV reduces the performance of the specification, a maximum is applied.

Enough clock cycles must be used between the assertion of DMACK– and the other control signals to ensure the tENV minimum is met. For a 25 MHz, 33 MHz, or PCI clock this is a single cycle; for 50 or 66 MHz clocks this must be two cycles. The minimum tENV value is calculated using the following formula:

+ one or two system clock cycles (depending on frequency used) at the minimum period due to frequency variation to delay control signals inside the IC



– Maximum total IC output skew

– PCB trace skew

– Maximum output falling delay through series termination

+ Minimum output falling delay through series termination

Using the number of clock cycles specified above for each possible frequency, the minimum is met. The tENV maximum must also be met. For a 25 MHz or PCI clock a single cycle must still be used. For a 33 or 50 MHz clock a maximum of two cycles may be used, and with a 66 MHz clock a maximum of three clock cycles may be used. The maximum tENV can be determined using the following formula:

+ one, two, three, or four clock cycles (depending on frequency used) at the maximum period due to frequency variation to delay control signals inside the IC

+ Maximum total IC output skew

+ PCB trace skew

+ Maximum output falling delay through series termination

– Minimum output falling delay through series termination

Using the number of clock cycles specified above, the maximum is met. Note that all the minimum and maximum number of clock cycles to be used are based on the timing characteristics given in this annex and fewer or more clock cycles may be used with some frequencies given reduced output skew. If the timing characteristics here are just met, then the internal IC delay must use the following number of clock cycles to be within tENV minimum and maximum values.

1) with 25 MHz, delay is one cycle

2) with PCI (30 or 33 MHz), delay is one cycle

3) with 33 MHz, delay is one or two cycles

4) with 50 MHz, delay is two cycles

5) with 66 MHz, delay is two or three cycles
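The minimum cycle counts listed above can be sanity-checked against the 20 ns tENV minimum with a short Python sketch. The 20 ns limit and the 3.5% clock variation come from this annex. The function name and the 6 ns skew budget (standing in for the combined output skew, trace skew, and termination delay difference) are hypothetical placeholders.

```python
TENV_MIN_NS = 20.0  # minimum tENV from the text

def min_tenv_ns(cycles, nominal_period_ns, clock_variation, skew_budget_ns):
    """Shortest DMACK- to control-signal spacing after subtracting skew."""
    shortest_period = nominal_period_ns * (1.0 - clock_variation)
    return cycles * shortest_period - skew_budget_ns

# Hypothetical skew budget of 6 ns; nominal periods of 30 ns (33 MHz)
# and 15 ns (66 MHz); 3.5% clock variation.
for label, period, cycles in [("33 MHz", 30.0, 1), ("66 MHz", 15.0, 1), ("66 MHz", 15.0, 2)]:
    value = min_tenv_ns(cycles, period, 0.035, 6.0)
    print(label, cycles, "cycle(s):", value >= TENV_MIN_NS)
```

With these placeholder values, one cycle is enough at 33 MHz, while a 66 MHz clock needs two cycles, which is the pattern the list describes.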

C.3.3.15 tSR

The value of tSR is determined in such a way as to guarantee that the receiver will get a maximum of one additional STROBE after it negates its DMARDY– signal. As is stated later in this annex, there is no advantage to using synchronous pauses, so the derivation of the values is not included here.

C.3.3.16 tRFS

This timing gives the sender time to sense the negation of DMARDY– and respond by not sending any more STROBES. Unlike all other interlock timings where a delay in the timing does not affect the number of words transferred, a delay in tRFS timing does affect the number of words transferred. Since tRFS involves a response to a request to pause, it should be as short as possible. The shortest possible asynchronous input synchronization method is to use two flip-flops where the first is clocked on the active edge of the clock and the second on the normally unused (inactive) edge of the clock. The action to stop the STROBE signal would be taken on the next active clock edge (if there had been a STROBE scheduled for that edge it would not be sent). The hardware configuration just described is required for Ultra DMA operating with a 25, 33, or 50 MHz, or PCI clock. A half cycle of any of these clocks gives adequate time to avoid metastability while synchronizing the signal. The following timing diagram shows possible cases:



[Timing diagram: Clock, STROBE, DMARDY–, FF1, and FF2 waveforms, marking the tRFS range and the point where the next STROBE would have been.]

The diagram above shows the range of possible STROBE to DMARDY– transition relationships and the possible synchronization flip-flop responses. When a 66 MHz or higher clock frequency is used, two clock periods may be used to synchronize the data as long as no STROBE edge is sent on the subsequent clock edges until the transfer is resumed.
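The two-flip-flop scheme described for tRFS can be modeled with a small Python sketch. This is a simplified behavioral model, not a description of any particular IC: it assumes active clock edges fall at whole multiples of the period, treats a missed setup time as a capture on the following active edge, and ignores metastability resolution. The function name and the 30 ns period and 1 ns setup time in the example are hypothetical.

```python
import math

def stop_action_time_ns(transition_ns, period_ns, setup_ns):
    """Time of the active clock edge on which the sender stops STROBE.

    FF1 samples on the first active edge that the input precedes by at
    least setup_ns, FF2 samples half a period later on the inactive edge,
    and the stop action is taken on the following active edge.
    """
    ff1_edge = math.ceil((transition_ns + setup_ns) / period_ns) * period_ns
    ff2_edge = ff1_edge + period_ns / 2.0
    action_edge = ff1_edge + period_ns
    assert ff2_edge < action_edge  # FF2 has settled before the action edge
    return action_edge

# A transition that meets setup for the edge at 30 ns, and one that misses it.
early = stop_action_time_ns(0.5, 30.0, 1.0)
late = stop_action_time_ns(29.5, 30.0, 1.0)
print(early - 0.5, late - 29.5)
```

In this model the internal response time ranges from about one clock period to about two, with the longest case occurring when the transition just misses the setup window of FF1.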

The case where the tRFS time may be the longest is where the DMARDY– transition occurs before a clock cycle, but, due to skews and missed setup time, the transition is not clocked into the first flip-flop until the next clock (the dotted line transition on FF1 and later on FF2). When this happens one clock cycle before a STROBE transition is generated (as shown by the left tRFS range marker near the middle of the DMARDY– transition range in the diagram above), the next STROBE transition will occur (as shown in dotted lines). For all other cases, the tRFS time will be shorter. The maximum tRFS is calculated using the following formula:

+ Maximum input falling delay through series termination

+ Maximum trace delay

+ Maximum total IC input delay to flip-flop

+ Minimum setup time for flip-flop

+ One or two clock cycles at the maximum system clock period due to frequency variation for synchronization

+ Maximum total IC output delay from clock to output pin

+ Maximum trace delay on STROBE

+ Maximum output delay through series resistor (falling)

ATA/ATAPI-4 specifies a minimum tRFS of 60 ns for mode 1 and 50 ns for mode 2. A system using a 25 MHz clock cannot meet the tRFS minimum time of 60 ns. Only a system using a 50 MHz clock can meet that minimum for mode 2. All systems and devices with clock frequencies other than 50 MHz cannot meet this time for the worst case condition. In this standard the values for this parameter are 75, 70, and 60 ns for modes 0, 1, and 2 respectively.

C.3.3.17 tRP

This is the time from the receiver’s negation of DMARDY– until no more STROBES will be received. STROBE edges may arrive at the sender until after this time period. Since this time parameter applies to the receiver only (as the receiver waits for STROBES), the parameter is referenced at the receiver connector. Because of this, the output delay of DMARDY– from inside the IC to the connector and the input delay of a STROBE edge from the connector to the associated internal IC flip-flop must be considered.

There are two ways to determine the tRP minimum. One method is to consider how long it will take from the negation of DMARDY– at the receiver for the sender to see the negation and become paused. This would involve synchronizing DMARDY– as it is done for tRFS, and then taking one more system clock cycle to change the state of the state machine to a paused state. Using this method, the minimum time is calculated using the following formula:



+ Maximum IC to IC delay (overestimates the delay by one termination resistor)

+ Maximum total IC input delay to flip-flop

+ Minimum setup time for flip-flop

+ Two or three clock cycles (depending on clock used) at the maximum period due to clock frequency variation

A second method to calculate this value is to consider how long it might be for the last STROBE to be detected after negating DMARDY–, and make sure tRP is long enough so that the internal assertion of STOP occurs after the last STROBE has latched the last word of data. This method is applied in the following formula:

+ Maximum IC to IC delay (this overestimates the delay by one termination resistor)

+ Maximum tRFS for mode

+ Maximum IC to IC delay (this overestimates the delay by one termination resistor)

+ Maximum total IC input delay to flip-flop

+ Minimum flip-flop setup time

Using both of the above, it can be shown that tRP is met given the tRFS requirement and is sufficient to receive the last STROBE for all modes with all clock frequencies. All of the numbers are referenced at the connector, and the time to wait internal to the IC must be longer than the value of tRP. For higher frequency clocks, the internal delay may need to be more than one clock cycle longer than the value of tRP in order to account for total output and input delays.

C.3.3.18 tIORDYZ, tZIORDY, tACK, tSS

The derivation of these values is not described in this annex.

