MODELING AND SIMULATION OF MIXED ANALOG-DIGITAL SYSTEMS
THE KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE
ANALOG CIRCUITS AND SIGNAL PROCESSING Consulting Editor
Mohammed Ismail, Ohio State University
Related Titles:
CHARACTERIZATION METHODS FOR SUBMICRON MOSFETs, edited by Hisham Haddara
ISBN: 0-7923-9695-2
LOW-VOLTAGE LOW-POWER ANALOG INTEGRATED CIRCUITS, edited by Wouter Serdijn
ISBN: 0-7923-9608-1
INTEGRATED VIDEO-FREQUENCY CONTINUOUS-TIME FILTERS: High-Performance Realizations in BiCMOS, Scott D. Willingham, Ken Martin
ISBN: 0-7923-9595-6
FEED-FORWARD NEURAL NETWORKS: Vector Decomposition Analysis, Modelling and Analog Implementation, Anne-Johan Annema
ISBN: 0-7923-9567-0
FREQUENCY COMPENSATION TECHNIQUES FOR LOW-POWER OPERATIONAL AMPLIFIERS, Ruud Eschauzier, Johan Huijsing
ISBN: 0-7923-9565-4
ANALOG SIGNAL GENERATION FOR BIST OF MIXED-SIGNAL INTEGRATED CIRCUITS, Gordon W. Roberts, Albert K. Lu
ISBN: 0-7923-9564-6
INTEGRATED FIBER-OPTIC RECEIVERS, Aaron Buchwald, Kenneth W. Martin
ISBN: 0-7923-9549-2
MODELING WITH AN ANALOG HARDWARE DESCRIPTION LANGUAGE, H. Alan Mantooth, Mike Fiegenbaum
ISBN: 0-7923-9516-6
LOW-VOLTAGE CMOS OPERATIONAL AMPLIFIERS: Theory, Design and Implementation, Satoshi Sakurai, Mohammed Ismail
ISBN: 0-7923-9507-7
ANALYSIS AND SYNTHESIS OF MOS TRANSLINEAR CIRCUITS, Remco J. Wiegerink
ISBN: 0-7923-9390-2
COMPUTER-AIDED DESIGN OF ANALOG CIRCUITS AND SYSTEMS, L. Richard Carley, Ronald S. Gyurcsik
ISBN: 0-7923-9351-1
HIGH-PERFORMANCE CMOS CONTINUOUS-TIME FILTERS, Jose Silva-Martinez, Michiel Steyaert, Willy Sansen
ISBN: 0-7923-9339-2
SYMBOLIC ANALYSIS OF ANALOG CIRCUITS: Techniques and Applications, Lawrence P. Huelsman, Georges G. E. Gielen
ISBN: 0-7923-9324-4
DESIGN OF LOW-VOLTAGE BIPOLAR OPERATIONAL AMPLIFIERS, M. Jeroen Fonderie, Johan H. Huijsing
ISBN: 0-7923-9317-1
STATISTICAL MODELING FOR COMPUTER-AIDED DESIGN OF MOS VLSI CIRCUITS, Christopher Michael, Mohammed Ismail
ISBN: 0-7923-9299-X
SELECTIVE LINEAR-PHASE SWITCHED-CAPACITOR AND DIGITAL FILTERS, Hussein Baher
ISBN: 0-7923-9298-1
ANALOG CMOS FILTERS FOR VERY HIGH FREQUENCIES, Bram Nauta
ISBN: 0-7923-9272-8
ANALOG VLSI NEURAL NETWORKS, Yoshiyasu Takefuji
ISBN: 0-7923-9273-6
ANALOG VLSI IMPLEMENTATION OF NEURAL NETWORKS, Carver A. Mead, Mohammed Ismail
ISBN: 0-7923-9049-7
AN INTRODUCTION TO ANALOG VLSI DESIGN AUTOMATION, Mohammed Ismail, Jose Franca
ISBN: 0-7923-9071-7
MODELING AND SIMULATION OF MIXED ANALOG-DIGITAL SYSTEMS
edited by
Brian Antao, Motorola, Inc.
A Special Issue of ANALOG INTEGRATED CIRCUITS
AND SIGNAL PROCESSING An International Journal Volume 10, No. 1/2 (1996)
KLUWER ACADEMIC PUBLISHERS Boston / Dordrecht / London
Distributors for North America: Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061 USA
Distributors for all other countries: Kluwer Academic Publishers Group, Distribution Centre, Post Office Box 322, 3300 AH Dordrecht, THE NETHERLANDS
Library of Congress Cataloging-in-Publication Data
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN-13: 978-1-4612-8609-7 DOI: 10.1007/978-1-4613-1405-9
e-ISBN-13: 978-1-4613-1405-9
Copyright © 1996 by Kluwer Academic Publishers Softcover reprint of the hardcover 1st edition 1996 All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, mechanical, photo-copying, recording, or otherwise, without the prior written permission of the publisher, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, Massachusetts 02061
Printed on acid-free paper.
Analog Integrated Circuits and Signal Processing, 10, 5-6 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Guest Editorial Introduction to the Special Issue on Modeling and Simulation of Mixed Analog-Digital Systems
Integrated circuit technology is perhaps one of the fastest growing technology sectors, with rapid progress being made in design and fabrication processes. The design of more complex and increasingly dense circuits continues at a frantic pace, and rapid scaling of device sizes now enables complete systems to be integrated on a single chip. The newer generations of integrated circuits are no longer purely analog or digital but combine different functionalities. Mixed-signal system design, which largely combines analog and digital sections, has evolved into a field of its own to keep pace with this technology trend. We devote this special issue to reporting on the leading-edge developments in modeling and simulation that are needed to support and fuel the mixed-signal design trend. We have collected an excellent, representative set of papers covering different aspects of this area.
One of the key bottlenecks recognized in the design of mixed-signal integrated circuits, as well as the newer generation of other integrated circuits, is the modeling of the substrate. Noise that creeps from the high-activity digital sections tends to adversely affect the analog portions of a mixed-signal chip. Efficient modeling of the substrate is therefore essential to account for substrate-induced effects early in the verification phase. The first paper in this issue, by K. Kerns et al., describes a novel approach for efficiently modeling the substrate. The authors use a non-rectangular substrate discretization method based on Voronoi tessellations and Delaunay triangulation to generate mesh representations of the substrate, and then apply a congruence-transform-based reduction to make the substrate mesh representation more tractable for simulation.
The paper by R. Harjani and J. Shao describes feasibility and performance region macromodeling of analog-digital circuits. Hierarchy is important in structuring the design of large mixed-signal systems. As a design evolves through this hierarchical structure, various analysis and modeling tools are needed that can be integrated into the design flow and address the design issues that arise from the hierarchy. This paper presents an interesting approach for exploring the feasibility and performance modeling of the various sub-blocks in a hierarchical design process. A vertical binary search technique is used to generate feasibility macromodels, and a layered volume-slicing methodology with radial basis functions is used to generate the performance macromodels.
Phase-locked loops (PLLs) are widely used in different applications, in purely digital, mixed analog-digital, and purely analog realizations. PLLs are also recognized as a class of circuits that are very difficult to simulate, as they operate with widely varying time constants. The paper by B. Antao, F. El-Turky and R. Leonowich describes the use of behavioral modeling of PLLs as a technique for making the simulation and analysis of these circuits more tractable. These behavioral models can then be used in the simulation of a larger system. The authors present examples of different PLL configurations.
Continuing in the line of behavioral modeling is the next paper by W. Kruiskamp and D. Leenaerts on behavioral and macromodeling using piecewise linear techniques. The authors advocate the use of piecewise linear techniques as a consistent approach for modeling components across the analog and digital boundaries.
Emerging out of the efforts to find the ultimate processing machine that can mimic some of the functionality of a human brain is a computational paradigm called neural networks, based on the principles of biological information processing systems. Early successes of these systems have been in various pattern recognition tasks. In the next paper, the authors T. Wu, B. Sheu and E. Chou describe the behavioral modeling and simulation techniques for such densely connected analog cellular array processors, with pattern recognition examples.
We next digress a little and explore the use of hierarchical modeling for fault simulation of analog circuits. The design and realization of mixed-signal integrated circuits also raises the issue of testability and fault diagnosis of these circuits. While there has been an enormous amount of research on the digital side, the analog
part of the equation is largely unaddressed, and this paper offers some perspectives.
The paper by S. Donnay et al., "Using top-down CAD tools for mixed analog/digital ASICs: A practical design case", presents a practical design case and serves as a unifying thread for the various approaches presented in this issue, demonstrating how behavioral modeling, simulation, and the various CAD techniques are applied to a practical design process.
In keeping with the trend of large interdisciplinary system integration, electro-optical devices provide an additional set of challenges. The last paper, by V. Liberali, F. Maloberti and A. Regini, nicely rounds off the big picture and adds breadth by discussing the modeling and simulation of electro-optical devices.
As the guest editor, it was a sheer pleasure compiling this special issue, which gave me the opportunity to interact with a diverse range of people working in this leading-edge field. There was a great deal of interest, and we received an ample number of submissions. I would like to thank all the reviewers for their excellent, detailed reviews of the various manuscripts and for their timely responses, which enabled the prompt completion of this special issue. I would especially like to thank Carolyn Genzel, the administrative assistant of the analog and digital circuits group in the Coordinated Science Laboratory at the University of Illinois, for her efficient handling of the manuscripts and coordination of the reviews. Finally, a special note of regret to those whose papers did not make it into this special issue for various reasons; I hope this will not in any way discourage you. I sincerely hope this special issue will be of great interest and benefit to the exciting integrated circuit design and CAD community at large.
Brian A. A. Antao, Guest-Editor
Analog Integrated Circuits and Signal Processing, 10, 7-21 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Efficient Parasitic Substrate Modeling for Monolithic Mixed-A/D Circuit Design and Verification
KEVIN J. KERNS, IVAN L. WEMPLE, AND ANDREW T. YANG, MEMBER, IEEE Department of Electrical Engineering, University of Washington, Seattle, WA 98195
Abstract. Parasitic analog-digital noise coupling has been identified as a key issue facing designers of mixed-signal integrated circuits. In particular, signal crosstalk through the common chip substrate has become increasingly problematic. This paper demonstrates a methodology for developing simulation, synthesis, and verification models to analyze the global electrical behavior of the non-ideal semiconductor substrate. First, a triangular discretization method is employed to generate RC equivalent-circuit substrate models which are far less complex than those formulated by conventional techniques. The networks are then accurately approximated for subsequent analysis by an efficient reduction algorithm which uses a well-conditioned Lanczos moment-matching process. Through congruence transformations, the network admittance matrices are transformed to reduced equivalents which are easily post-processed to derive passive, SPICE-compatible netlist representations of the reduced models. The pure-RC properties of the extracted substrate networks are fully exploited to formulate an efficient overall algorithm. For validation, the strategy has been successfully applied to several mixed-signal circuit examples.
Introduction
Industry trends aimed at integrating higher levels of circuit functionality have triggered a proliferation of analog and digital subsystems fabricated side-by-side on the same die. The combined requirements for both high speed digital and high precision analog circuitry produce unique challenges to mixed-A/D circuit designers. Specifically, monolithic mixed-signal ICs are often characterized by parasitic analog-digital interactions which can cripple the operation of high-performance designs. Noise coupling through the common chip substrate has been identified as a significant contributor to this important problem [1], [2].
Modeling the electrical behavior of non-ideal semiconductor substrates is of key interest to the mixed-signal design community. For state-of-the-art circuits, chip-level verification which excludes the effects of substrate coupling may be of questionable validity. As a result, substrate modeling for circuit simulation has been the focus of much research in recent years. Early
work in this area [3] used a box integration technique to construct 3-D rectangular RC mesh networks as equivalent circuit representations of the modeled substrates. The mesh topology could be correlated to the circuit's physical design by distributing grid points according to the layout features on relevant fabrication photomasks [4]. Unfortunately, layout-driven rectangular grid generation is prone to substrate "overpartitioning", which yields unnecessarily dense grid crowding in many regions of the chip. The strategy produces enormous circuit networks, even for moderately sized layouts. Since a primary objective of equivalent circuit macromodeling is to build simulation-ready networks, the inordinate complexity of the generated models is self-defeating: subsequent simulation on conventional CAD workstations becomes virtually impossible.
To address the complexity issues, intermediate processing is required to approximate the generated linear RC networks by smaller circuits which exhibit similar electrical properties. Since a typical mesh is dense
and three-dimensional, only a small percentage of network nodes, called ports, are physically connected to the external circuit (at the top surface of the modeled substrate). In theory, an "equivalent" network can be formulated by eliminating a substantial fraction of the internal nodes. The resulting network is appropriate for simulation if its port characteristics remain consistent with those of the original mesh. This technique is generally referred to as network reduction.
To accurately accommodate general, lumped-element substrate models, Asymptotic Waveform Evaluation (AWE) [5] has been proposed as a method to reduce mesh networks for mixed-signal switching noise analysis [6]. The AWE algorithm approximates a network's multiport behavior by recursively calculating the moments of the port characteristics and then fitting these moments to pole-residue functions via the Padé approximation. A well-known problem with this technique is that calculation of the higher moments is inherently ill-conditioned: increasing the number of poles used to model a given network does not guarantee a better approximation. Heuristic methods have been developed to address this issue (e.g., see the references in [7]), but only at the cost of increased computational complexity. Another problem with AWE relates to the stability of the network approximation. While asymptotic stability is maintained by eliminating positive poles, absolute stability is not easily ensured. Consequently, non-physical, artificial oscillations may appear during subsequent transient simulations.
Network models detailed enough to accurately predict the chip-level impact of substrate coupling are, by necessity, very complex. It is not surprising that tangible simulation results have been obtained only for small device-scale examples, as existing methods possess inherent limitations which render them impractical for circuits of reasonable size. The mixed-signal design process can be greatly enhanced by the development of software tools which can efficiently extract accurate chip-level simulation substrate models directly from a physical design specification. Reliable verification of circuit functionality obviously reduces the length of the overall design cycle and promotes the likelihood of first-time silicon success. Perhaps more importantly, robust new noise reduction techniques can be developed more rapidly if capabilities exist to accurately assess and analyze the impact of switching noise in proposed design methodologies.
We propose a substrate modeling strategy which addresses the mesh complexity issue at both the model
generation and network reduction levels. For initial mesh generation, a well-known geometric construct can be efficiently applied to overcome the single most important drawback of rectangular mesh formulation methods, namely, that localized mesh refinement, often required in regions of dense switching activity, is propagated to distant layout regions where a coarser mesh might otherwise be adequate. By using a non-rectangular gridding method, we extract a mesh which automatically and locally adjusts itself to the density of substrate features as inferred from the layout specification. Mesh extraction based on this approach generates substrate circuit networks containing orders of magnitude fewer circuit nodes than those of conventional gridding techniques. A brief description of the substrate model formulation, first presented in [8], is provided in Section II.
In spite of the improved model generation technique, extracted full-chip substrate networks still promise to be exceedingly complex. For model reduction, we demonstrate a new multiport algorithm which fully exploits the pure-RC property of our formulated networks and directly generates reduced equivalent circuit models in a well-conditioned manner. Using congruence transformations, full-network conductance and susceptance matrices are transformed to reduced equivalents which can be directly realized with resistors and capacitors. The approximated networks are guaranteed to be passive, and thus well-behaved in subsequent simulations. Proper formulation of the transformation ensures that the networks possess a minimal number of internal nodes and branches, and yield a specified accuracy from DC to a specified maximum frequency of interest. The requisite transforms are generated using a symmetric Lanczos method which exploits the specialized structure of the extracted substrate networks. Required matrix inversions are performed using efficient methods which also profit from the problem symmetry. Since matrix inversion often accounts for a substantial network reduction bottleneck, this strategy can be significantly faster than general AWE methods, which employ nonsymmetric techniques. Section III provides the theoretical details and an implementation description of the proposed substrate model network reduction methodology. The non-rectangular gridding strategy and the congruence-transformation-based network reduction algorithm combine to form a unified, efficient strategy for developing parasitic substrate models for mixed-signal circuit simulation and design verification. The overall approach has been applied to several
mixed-signal design examples, which we present in Section IV.
Model Extraction Using Non-Rectangular Substrate Discretization
A popular and physically-based approach to parasitic substrate modeling employs an equivalent circuit mesh representation of the modeled substrate [3], [4], [6], [9]. A common drawback of previously reported modeling strategies, however, is that the derived networks ultimately contain circuit nodes in substrate regions where they are not required to obtain accurate simulation results. As emphasized in our introduction, subsequent mesh processing typically involves network reduction, and, ultimately, simulation. Since the computational efficiency of these procedures is directly impacted by the complexity of the generated network, it is of enormous advantage to constrain the size of the original mesh by adopting efficient techniques for model extraction. For mesh generation, we employ a non-rectangular substrate discretization based on geometric constructs known as the Voronoi tessellation and the Delaunay triangulation [10]. The derived mesh efficiently conforms to the substrate feature topology as dictated by the physical layout of the circuit. This section summarizes our modeling approach. For greater detail, the reader is referred to [8].
The network formulation strategy is based on the observation that typical ICs contain areas of intricate complexity surrounded by comparatively large regions with little structural detail. In gate array, standard-cell, and most custom designs, large chip areas contain no active devices but are dedicated to routing channels. Since transistors and contacts are the primary sources and collectors of noise current, it makes sense to partition the chip according to the "localized" densities of relevant substrate features.
Our approach is demonstrated qualitatively in Fig. 1, which shows a mixed-signal circuit and a progression of illustrations depicting the sequence of procedures used to discretize the underlying substrate. During layout extraction, polygons representing relevant substrate features are converted to equivalent internal point representations. In Fig. 1a, for example, derived layout information from the chip inset yields point locations for the enclosed transistors and substrate tie-downs.
Voronoi tessellation is the procedure we use to subdivide the Euclidean plane according to the distribution of point sites. The tessellation assigns every location in
the plane to the closest member in the set of point sites. As a result, the locations associated with each member form a convex polygon. Together, the polygons partition the substrate surface into a non-overlapping, collectively exhaustive set of regions called the Voronoi diagram, a portion of which is shown in Fig. 1b.
Connecting each pair of sites that share a common edge in the Voronoi diagram is known as Delaunay triangulation (Fig. 1c). For reasons we discuss below, a key property of the triangulation is that each edge connecting adjacent sites is perpendicular to the common Voronoi polygon boundary between the sites. The line segments which constitute the triangulation can be used as a basis for the branch topology of a representative electrical network. Assuming for now that we wish to model the resistive properties of the substrate, the triangulation-based network topology is shown in Fig. 1d.
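As an illustration of how a branch list can be derived from a site distribution, the following sketch collects the unique Delaunay edges, each of which becomes a branch in the equivalent-circuit mesh. This is not the authors' extraction code; it assumes SciPy's `scipy.spatial.Delaunay` (a Qhull wrapper) is available, and the site coordinates are hypothetical.

```python
import numpy as np
from scipy.spatial import Delaunay

def triangulation_branches(sites):
    """Return the set of Delaunay edges (i, j) connecting adjacent sites.

    Each edge becomes a branch of the equivalent-circuit mesh; the dual
    Voronoi edge crossed by (i, j) determines the branch element value.
    """
    tri = Delaunay(sites)
    edges = set()
    for simplex in tri.simplices:          # each simplex is a triangle of 3 site indices
        for a in range(3):
            for b in range(a + 1, 3):
                i, j = sorted((int(simplex[a]), int(simplex[b])))
                edges.add((i, j))
    return edges

# Four sites forming a convex quadrilateral: the triangulation yields two
# triangles, i.e., the four perimeter edges plus one diagonal (5 branches).
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.1, 1.0]])
branches = triangulation_branches(sites)
```

Because the Voronoi diagram and the Delaunay triangulation are mathematical duals, the same call yields both the branch topology and, implicitly, the polygon boundaries used later for element values.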
In actuality, the substrate properties are not homogeneous, and a two-dimensional network model (like that shown in Fig. 1) is not adequate to capture the electrical behavior of the modeled substrate. To account for doping non-uniformities, the mesh is extended to three dimensions by stacking structurally identical, triangulated (x, y) planes on top of one another, and interconnecting them site-to-site using appropriately valued interplane branch circuit elements. Owing to the one-to-one correspondence between tessellation sites and network circuit nodes, each (x, y) plane is termed a nodeplane.
Excepting those regions which comprise the well boundaries (e.g., in CMOS circuits), linear resistors typically suffice for modeling the bulk electrical properties of the substrate [4]. To accommodate the depletion capacitance associated with the well junctions, special tessellation sites are introduced in pairs which straddle the well boundaries. If adjacent pairs are properly spaced, the corresponding triangulation segments are always perpendicular to the well edges, and modeling the junction capacitance is simplified.
The model extraction combines substrate technology data and the geometries of the individual Voronoi polygons and associated triangulation edges to derive the values of the linear resistors and capacitors which comprise the network. To demonstrate the procedure for determining the resistor values, we refer to the portion of the arbitrary triangulation shown in Fig. 2. The tessellation sites i and j represent circuit nodes, and polygons Pi and Pj are the Voronoi polygons enclosing each site. Our objective is to formulate a resistance,
Fig. 1. Overview of the non-rectangular substrate discretization strategy applied to a mixed-signal IC. The insets show (a) site representations of the layout-derived substrate features, (b) the corresponding Voronoi tessellation, (c) the Delaunay triangulation, and (d) the resultant electrical network topology.
$R_{ij}$, which models the network branch between nodes i and j. The voltage drop between the nodes is the line integral of the electric field, E, between i and j, i.e.,

$$V_i - V_j = -\int_j^i \mathbf{E} \cdot d\mathbf{l}. \qquad (1)$$
The resistor current is the normal flux crossing the polygon edge common to $P_i$ and $P_j$, segment ab:

$$I_{ij} = \sigma_s \int_a^b \mathbf{E} \cdot \mathbf{n} \, dw, \qquad (2)$$
Fig. 2. A portion of a layout-derived tessellation showing the edges of the Delaunay triangulation. The labelled dimensions are used to compute the corresponding network branch element values.
where n is the unit vector normal to dw. The formulation relies on the assumption that the sheet conductivity, $\sigma_s$, is uniform over $P_i$ and $P_j$, and that the instantaneous electric field between the points can be approximated by the first term in its Taylor series expansion about m, the point midway between i and j. Since the polygon boundary ab is perpendicular to the triangulation edge connecting i and j, the projection of E on dl is the same as the projection of E on n and, therefore,
$$R_{ij} = \frac{V_i - V_j}{I_{ij}}. \qquad (3)$$
If not for this unique property exhibited by the dual Voronoi and Delaunay constructs, the resistor values would depend on the local orientation of E, which is non-constant.
To account for the physical separation between adjacent nodeplanes, the tessellated polygons are assigned a finite thickness, t, and the associated resistance becomes
$$R_{ij} = \frac{l_{ij}}{\sigma t w_{ab}}, \qquad (4)$$
where $\sigma$ is the bulk conductivity of the polygon "tile". The choice of t may vary from nodeplane to nodeplane, and largely depends on the substrate doping profiles. It can be shown by similar geometrical arguments that interplane resistors (say, for example, between sites i and k) can be formulated by
$$R_{ik} = \frac{t}{\sigma A}, \qquad (5)$$
where A is the cross-sectional area of the Voronoi polygons enclosing sites i and k and, again, t is the plane-to-plane spacing.
To model the well capacitors, well boundary straddle sites are inserted in pairs, each separated by a distance chosen to approximate the junction depletion width. A simple abrupt p-n junction approximation is employed to determine the capacitor values. While a more rigorous formulation is certainly possible, it is well-known that substrate coupling across well boundaries is far less significant than that observed directly through the common substrate. Consequently, the exact method of capacitance formulation has a secondary impact on the overall model accuracy.
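The abrupt-junction approximation mentioned above can be sketched with the standard textbook formulas (depletion width and parallel-plate capacitance). This is an illustrative model, not the authors' extraction code; the material constants and doping values below are assumed for the example.

```python
import math

EPS_SI = 1.04e-10   # permittivity of silicon [F/m]
Q_E = 1.602e-19     # elementary charge [C]
VT = 0.0259         # thermal voltage at room temperature [V]
NI = 1.5e16         # intrinsic carrier concentration of Si [m^-3] (assumed)

def abrupt_junction_capacitance(Na, Nd, Vr, area):
    """Depletion capacitance of an abrupt p-n junction (textbook model).

    Na, Nd: acceptor/donor concentrations [m^-3]; Vr: reverse bias [V];
    area: junction area [m^2]. Returns capacitance in farads.
    """
    Vbi = VT * math.log(Na * Nd / NI**2)                       # built-in potential
    Wd = math.sqrt(2 * EPS_SI * (Vbi + Vr) * (Na + Nd) / (Q_E * Na * Nd))
    return EPS_SI * area / Wd                                  # parallel-plate form

# Example: an n-well (1e22 m^-3) in a p-substrate (1e23 m^-3) at zero bias,
# with 1 um^2 of well boundary area (all values hypothetical).
c_well = abrupt_junction_capacitance(1e23, 1e22, 0.0, 1e-12)
```

Since coupling across well boundaries is secondary to coupling through the bulk, this level of modeling detail is consistent with the accuracy argument made in the text.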
The core tessellation algorithm is computationally efficient, with an O(N log N) time complexity and O(N) memory requirement [11]. Since the (x, y) mesh topology is duplicated plane-to-plane, N is the number of sites (i.e., circuit nodes) in a single nodeplane. Strictly speaking, it is the Delaunay triangulation which forms the basis for the network topology. In any case, since the tessellation and triangulation form mathematical duals, the triangulation is derived at no additional expense. In [8] it was shown that the quantity of circuit nodes in models generated by Voronoi tessellation is orders of magnitude fewer than in networks derived by rectangular discretization techniques. The accuracy of the approach has been verified by both detailed 2-D device simulation, and, for small circuits, by SPICE circuit-level simulation. Extensive comparisons were made to networks obtained by more conventional gridding methods, which have been shown to faithfully model coupling effects in simple, fabricated test circuits [12], [13].
Efficient RC Network Reduction Using Congruence Transformations
In spite of the reduced complexity achieved by improved model generation, the extracted substrate networks are still too large for conventional circuit simulation. Some method of mesh reduction is required to approximate the networks using simpler equivalent circuit models. To maintain accuracy, efficiency, and flexibility, an acceptable network reduction algorithm should meet the following requirements:
• The simulation macromodels must accurately approximate the multiport admittance of the original RC network from DC to a specified maximum frequency.
• To obtain accurate, physically-based simulation results, the reduced substrate networks must be passive so that absolute stability is preserved.
• For efficient simulation, the reduced network models must contain a near-minimum quantity of internal nodes and branches.
• To maintain compatibility with an assortment of circuit simulation and timing analysis software, the reduced models must be realizable with standard SPICE-compatible circuit elements.
As mentioned in Section I, Asymptotic Waveform Evaluation has been proposed as a technique for reducing the computational expense incurred by the simulation of substrate model circuit networks. Some fundamental drawbacks of AWE (ill-conditioning, no guarantee of absolute stability, etc.) were discussed earlier. Feldmann and Freund recently introduced a linear circuit approximation technique called Padé Via Lanczos (PVL) [14]. In contrast to AWE, this method avoids using ill-conditioned recursive moment calculations for pole-residue formulation by employing the "look-ahead Lanczos process" [15]. In [16], a multiport implementation of PVL was demonstrated. However, the issue of reduced network stability and passivity has not been addressed in [14] or [16]. And because PVL is a general technique, the algorithm does not capitalize on the unique properties of pure RC networks; as a result, relatively inefficient matrix techniques must be used for the required matrix inversions.
In this work, we propose a multiport algorithm which generates passive circuit approximations of the extracted substrate networks. The network poles are retained from DC to a specified upper frequency limit, within a specified error bound. Using congruence transformations, the full-network admittance matrices are transformed to simpler equivalents, which are used to generate SPICE-compatible RC netlist representations of the reduced circuits. Proper formulation of the transformation ensures that the networks possess a minimal number of internal nodes. The approach capitalizes fully on the inherent properties of the extracted substrate models. This section summarizes the network reduction strategy.
A. Multiport Admittance Matrix Formulation
The admittance of an RC network with m ports and n internal nodes can be represented by the conductance and susceptance matrices, G and C. These matrices
have $l = m + n$ rows and columns and relate voltage and current in the frequency domain by
$$(G + sC)x = b. \qquad (6)$$
Here, x and b are column vectors with $l$ rows representing nodal voltages and injected currents, respectively, and s is the complex frequency. Since the network contains only resistors and capacitors, both G and C are symmetric. If the resistors and capacitors are real and positive, then G and C are non-negative definite, which means that no eigenvalue of either matrix is negative.
A logical partitioning of the voltage vector, x, in (6), orders the entries which correspond to the port nodes first, followed by the internal node entries, i.e.,
$$x = \begin{bmatrix} x' \\ x'' \end{bmatrix}. \qquad (7)$$
Accordingly, G and C can be partitioned as follows:

$$G = \begin{bmatrix} A & Q^T \\ Q & D \end{bmatrix}, \qquad C = \begin{bmatrix} B & R^T \\ R & E \end{bmatrix}, \qquad (8)$$

where A and B are $m \times m$, Q and R are $n \times m$, and D and E are $n \times n$. In this formulation, D and E are diagonally dominant and thus non-negative definite. Also, D is positive definite provided each internal node has a dc path to either a port node or the common node of the network (i.e., the existence of a dc solution for $x''$ implies that D is non-singular; therefore, D is positive definite).
To formulate Y(s), the multiport admittance of the network, we first use (7) and (8) to re-write (6) as
[ (A + sB)   (Q^T + sR^T) ] [ x'  ]   [ b' ]
[ (Q + sR)   (D + sE)     ] [ x'' ] = [ 0  ].   (9)
The last n elements of the column vector b are identically zero because, by definition, no external current can be directly injected into the internal nodes. Eq. (9) represents two equations with two unknowns, namely x' and x'', which, according to the partitioning, are the port and internal node voltage vectors, respectively. Using the definition Y(s)x' = b', and eliminating x'' in (9) gives
Y(s) = (A + sB) − (Q^T + sR^T)(D + sE)^{-1}(Q + sR).   (10)
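Equation (10) is the Schur complement of the internal-node block of (G + sC), and its equivalence to a direct solve of (6) can be checked numerically. The following sketch uses arbitrary small random matrices (not extracted substrate data; numpy assumed) in place of the partitioned blocks:

```python
import numpy as np

# Sketch: evaluate the multiport admittance (10) by eliminating the
# internal nodes, then check it against a direct solve of the full
# system (6). The matrices are an arbitrary small example.
m, n = 2, 3
rng = np.random.default_rng(0)
M = rng.standard_normal((m + n, m + n))
G = M @ M.T + np.eye(m + n)          # symmetric positive definite
M = rng.standard_normal((m + n, m + n))
C = M @ M.T                          # symmetric non-negative definite

A, Qt = G[:m, :m], G[:m, m:]         # partition per Eq. (8)
Q, D = G[m:, :m], G[m:, m:]
B, Rt = C[:m, :m], C[:m, m:]
R, E = C[m:, :m], C[m:, m:]

s = 0.3 + 0.7j                       # an arbitrary complex frequency
Y = (A + s * B) - (Qt + s * Rt) @ np.linalg.solve(D + s * E, Q + s * R)

# Reference: solve (G + sC)x = b with a unit current at each port in
# turn and no injection into internal nodes, per Eq. (9).
b = np.zeros((m + n, m), dtype=complex)
b[:m, :m] = np.eye(m)
x = np.linalg.solve(G + s * C, b)
Y_ref = np.linalg.inv(x[:m, :m])     # port voltages -> port admittance
```

Since Y(s)x' = b' with b' the identity here, the inverse of the port-voltage block recovers the same Y(s) as the elimination in (10).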
Efficient Parasitic Substrate Modeling for Monolithic Mixed-A/D Circuit Design and Verification
The poles of Y(s) occur where (D + sE) is singular, and are equal to −λ^{-1}, where λ is a solution to the generalized symmetric eigenvalue problem

det(E − λD) = 0.   (11)

Since E is symmetric non-negative definite and D is symmetric positive definite, the poles of Y(s) are real and negative.
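The pole computation of (11) can be sketched with a standard generalized symmetric eigensolver (random stand-in matrices with the stated definiteness properties; scipy assumed):

```python
import numpy as np
from scipy.linalg import eigh

# Sketch: poles of Y(s) from the generalized symmetric eigenproblem (11).
# Random matrices with the required properties stand in for D and E.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
D = M @ M.T + np.eye(4)               # symmetric positive definite
M = rng.standard_normal((4, 4))
E = M @ M.T                           # symmetric non-negative definite

lam = eigh(E, D, eigvals_only=True)   # solves det(E - lam*D) = 0
poles = -1.0 / lam[lam > 1e-12]       # s = -1/lam; lam = 0 gives no finite pole
```

With D positive definite and E non-negative definite, every λ is real and non-negative, so every finite pole is real and negative, as the text states.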
The moments, Y_k, defined such that Y(s) = Y_0 + sY_1 + s^2 Y_2 + ..., are found by first expanding (D + sE)^{-1} of (10) as a Taylor series about s = 0, i.e.,

(D + sE)^{-1} = [I + s(−D^{-1}E) + s^2 (−D^{-1}E)^2 + ...] D^{-1}.   (12)
Substituting (12) into (10) and equating coefficients of like powers of s yields
Y_0 = A − Q^T D^{-1} Q   (13)

Y_1 = B + Q^T D^{-1} P − R^T D^{-1} Q   (14)

Y_k = −P^T (−D^{-1}E)^{k−2} D^{-1} P   (k ≥ 2),   (15)

where

P = E D^{-1} Q − R.   (16)
Note that (13)-(16) can be used to recursively generate the moments of Y(s) in a manner similar to AWE. However, due to symmetry, two moments can be matched for each set of m forward and backward substitutions. In contrast, AWE matches only one moment for the same amount of computational effort. Also, the matrix to be inverted, D, is symmetric positive definite with negative off-diagonal elements. This property is significant because techniques more efficient than sparse LU factorization (e.g., Cholesky factorization [17]) can be used for the inversion. Finally, we do not apply (13)-(16) directly, as this would suffer from the same ill-conditioning problem as AWE. Instead, we utilize the congruence transformation, which can be used to efficiently reduce G and C in a well-conditioned manner and, at the same time, preserve the moments Y_k of (13)-(15).
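The recursion (13)-(16) with a single Cholesky factorization of D can be sketched as follows (random stand-in matrices, not extracted data or the authors' code; numpy/scipy assumed). The moments are checked against a truncated Taylor expansion of the exact Y(s) of (10):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Sketch of the moment recursion (13)-(16): factor D once by Cholesky
# and reuse the factor for every solve, as suggested by [17].
rng = np.random.default_rng(2)
m, n = 2, 4
def spd(k):
    M = rng.standard_normal((k, k))
    return M @ M.T + np.eye(k)       # symmetric positive definite stand-in
A, B, D, E = spd(m), spd(m), spd(n), spd(n)
Q = rng.standard_normal((n, m))
R = rng.standard_normal((n, m))

cf = cho_factor(D)                   # single factorization of D
DinvQ = cho_solve(cf, Q)
P = E @ DinvQ - R                    # Eq. (16)
DinvP = cho_solve(cf, P)
Y0 = A - Q.T @ DinvQ                 # Eq. (13)
Y1 = B + Q.T @ DinvP - R.T @ DinvQ   # Eq. (14)
Y2 = -P.T @ DinvP                    # Eq. (15) with k = 2

def Y(s):                            # exact multiport admittance, Eq. (10)
    return (A + s*B) - (Q.T + s*R.T) @ np.linalg.solve(D + s*E, Q + s*R)
```

Each additional pair of moments reuses the same factor of D, which is where the claimed factor-of-two advantage over a non-symmetric formulation comes from.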
B. Accurate Moment Matching by Congruence Transformations
A congruence transformation applied to A is defined as the transformation B = V^T A V, where V is square and non-singular. V is referred to as the congruence transform, and the matrices A and B are said to be congruent. A fundamental property of congruence transformations [18] is that they preserve the eigenvalues of the generalized eigenvalue problem demonstrated in (11) above. The eigenvalues of

det(V^T E V − λ V^T D V) = 0   (17)

(where V is square and non-singular) are identical to those of (11). This property is useful since, if the eigenvalues are preserved, the poles of Y(s) are also retained. In this work, we make use of network transformations using the transform, V, which has n rows and k linearly independent columns, with k ≤ n. If k < n the transform is an incomplete congruence transform, but we henceforth refer to all cases as congruence transforms and identify the special cases where V is square.
In (10), inserting VV^{-1} on the left side of (D + sE)^{-1} and (VV^{-1})^T on the right side gives
Y'(s) = (A + sB) − (Q^T + sR^T) V (V^T D V + s V^T E V)^{-1} V^T (Q + sR).   (18)
We define the following simplifying relationships:
Q' = V^T Q,   (19)
R' = V^T R,   (20)
D' = V^T D V, and   (21)
E' = V^T E V.   (22)
Substituting (19)-(22) in (18) yields
Y'(s) = (A + sB) − (Q'^T + sR'^T)(D' + sE')^{-1}(Q' + sR').   (23)
The transformations using V, shown in (19)-(22), can be represented as congruence transformations on G and C, i.e.,
G' = U^T G U and C' = U^T C U,   (24)
and thus
(G' + sC') = U^T (G + sC) U,   (25)
where

U = [ I   0 ]
    [ 0   V ],   (26)

with I the m × m identity matrix.
K. J. Kerns, I. L. Wemple, and A. T. Yang
We later show that (25) has important implications related to the stability of the reduced network.
If k = n then VV^{-1} = I and Y'(s) = Y(s). If k < n, G' and C' are reduced in size (relative to G and C, respectively), which is our primary objective. In this case, however, VV^{-1} ≠ I, so that Y'(s) ≠ Y(s) and the transformed network has port characteristics which differ from those of the original network: non-square congruence transforms do not retain all the eigenvalues of (11), so the poles are not preserved. (Here, V^{-1} is defined as the k × n matrix formulated such that V^{-1}V = I.)
To circumvent this difficulty, we note that, even if k < n, a proper choice of the transform V will result in a congruence transformation which preserves the lower-order moments of Y(s) given by (13)-(15). For example, if span{D^{-1}Q} ⊆ span{V}, it can be shown that the transform U in (26) preserves Y_0 and Y_1 of (13) and (14). The span requirement means that each column of the product matrix D^{-1}Q is representable as some linear combination of the columns of V. Additionally, if
span{D^{-1}Q, D^{-1}P, D^{-1}E D^{-1}P, ..., (D^{-1}E)^{q−2} D^{-1}P} ⊆ span{V},   (27)
then the transformation matches the first 2q moments of Y(s). In general, V must contain m columns for each pair of moments to be preserved, and the size of G' and C' is (q+1)m × (q+1)m when 2q moments are matched. If the columns of V are linearly independent, then as q is increased, the poles of Y'(s) converge to those of Y(s); it is easily seen that Y'(s) = Y(s) when q is large enough to make V a non-singular square matrix. By choosing the columns of V to match the moments expanded about s = 0, as in (13)-(15), the low-frequency poles converge first.
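The span condition (27) can be demonstrated directly. The sketch below (random stand-in matrices; numpy assumed; a QR factorization is used to orthonormalize span{V} rather than the Lanczos process described next) builds V from D^{-1}Q and D^{-1}P, i.e., q = 2, and checks that the first four moments of the reduced network match:

```python
import numpy as np

# Sketch: choosing span{V} to contain D^{-1}Q and D^{-1}P (q = 2 in
# Eq. (27)) matches the first four moments Y_0..Y_3 of Y(s).
rng = np.random.default_rng(4)
m, n = 2, 8
def spd(k):
    M = rng.standard_normal((k, k))
    return M @ M.T + np.eye(k)
A, B, D, E = spd(m), spd(m), spd(n), spd(n)
Q = rng.standard_normal((n, m))
R = rng.standard_normal((n, m))

def moments(A, B, Q, R, D, E, kmax):
    """Moments of Y(s) about s = 0 via the recursion (13)-(16)."""
    DinvQ = np.linalg.solve(D, Q)
    P = E @ DinvQ - R                     # Eq. (16)
    X = np.linalg.solve(D, P)             # X = D^{-1} P
    Y = [A - Q.T @ DinvQ,                 # Eq. (13)
         B + Q.T @ X - R.T @ DinvQ]       # Eq. (14)
    for _ in range(2, kmax + 1):
        Y.append(-P.T @ X)                # Eq. (15)
        X = -np.linalg.solve(D, E @ X)    # X <- (-D^{-1}E) X
    return Y

P = E @ np.linalg.solve(D, Q) - R
V, _ = np.linalg.qr(np.hstack([np.linalg.solve(D, Q),
                               np.linalg.solve(D, P)]))  # n x 2m
Qp, Rp = V.T @ Q, V.T @ R                                # Eqs. (19)-(22)
Dp, Ep = V.T @ D @ V, V.T @ E @ V

full = moments(A, B, Q, R, D, E, 3)       # Y_0..Y_3, original network
red = moments(A, B, Qp, Rp, Dp, Ep, 3)    # Y_0..Y_3, reduced network
```

The reduced matrices here are only 2m × 2m, yet the four matched moments agree with the full n × n network to machine precision.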
The symmetric Lanczos process [19] for the generalized symmetric eigenvalue problem is used to generate the columns of V in a well-conditioned manner. The process requires a single factorization of D, and each block of m columns of V requires m forward and backward substitutions and matches two moments. Both AWE and PVL require 2m substitutions to match the same number of moments because the general algorithms do not exploit symmetry. The properties of V ensure that D' is diagonal and E' is banded with a bandwidth of m + 2. Blocks are iteratively generated until the poles of Y'(s) have converged to those of Y(s) in a specified frequency range. Each iteration increases the
size of the reduced network by m nodes and matches two additional moments. The reduced networks generally contain non-converged high frequency poles in addition to those which have converged. The extra poles increase the size of the reduced network, but in no way impact its electrical behavior in the specified frequency range of interest. The extraneous poles can be removed by a post-processing step, which we describe in Section C.
Congruence transformations preserve the symmetry and non-negative definiteness of G and C, even if k < n. The consequence of this is of great significance with respect to the stability of the reduced network. A passive substrate network is incapable of providing energy to the overall circuit. For the network represented by the immittance matrix W(s) = G' + sC' to be passive, it is necessary that [20]:
• All elements of W(s) are analytic for σ > 0, where σ = Re(s);
• W(s*) = W*(s), where * denotes the complex conjugate; and
• W*^T(s) + W(s) is non-negative definite for σ > 0.
We first note that the original network is passive, and G and C are real, non-negative definite, and symmetric. The first requirement is met because each element of G' and C' is a scalar. The second requirement is met because the transformation does not introduce complex numbers. The third requirement is satisfied because
W*^T(s) + W(s) = 2(G' + σC').   (28)
Since σC' is non-negative definite (for σ ≥ 0), and since the sum of two real, symmetric, non-negative definite matrices is non-negative definite, W*^T(s) + W(s) is non-negative definite for σ > 0. Consequently, when applied for network reduction, any square or non-square congruence transformation on (G + sC) preserves passivity, and thus ensures absolute stability.
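The preservation of symmetry and non-negative definiteness under an arbitrary (even non-square) congruence transform follows from U^T M U = (M^{1/2} U)^T (M^{1/2} U) and can be checked numerically (random stand-in matrices; numpy assumed):

```python
import numpy as np

# Sketch: any congruence transform U, square or not, keeps
# G' = U^T G U and C' = U^T C U symmetric and non-negative definite,
# which is what guarantees passivity of the reduced network.
rng = np.random.default_rng(5)
l, k = 7, 3
M = rng.standard_normal((l, l))
G = M @ M.T                       # symmetric non-negative definite
M = rng.standard_normal((l, l))
C = M @ M.T
U = rng.standard_normal((l, k))   # non-square: a genuine size reduction

Gp = U.T @ G @ U                  # reduced conductance matrix
Cp = U.T @ C @ U                  # reduced susceptance matrix
```

Because G = MM^T, the transformed matrix U^T G U = (M^T U)^T (M^T U) is again a Gram matrix, hence non-negative definite regardless of the shape of U.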
C. Implementation of the Reduction Algorithm
A flow diagram for the network reduction process is shown in Fig. 3. Inputs are the raw extracted RC substrate network, a specified upper frequency of interest, and an error tolerance. The network data is used to build the conductance and susceptance matrices, G and C. The last n rows and columns of G form the
Fig. 3. Flow diagram for the substrate mesh reduction process (inputs: extracted RC substrate network, maximum frequency, and error tolerance; output: reduced SPICE-compatible RC netlist).
D matrix, which is factorized. The congruence transform, V, is then built iteratively using a symmetric Lanczos procedure. The process terminates when the "dominant" poles of the reduced network multiport admittance converge within the specified error tolerance. All pole frequencies less than the specified maximum frequency are automatically included. Program output is a SPICE-compatible RC netlist containing the reduced network.
In general, the derived G' and C' matrices contain unwanted (and non-converged) high-frequency poles. To ensure that the final network is as simulation-efficient as possible, the unnecessary outlying poles are removed in a post-processing step. A square congruence transform is formulated and applied to E' and D'. The transformation is chosen to diagonalize the resulting congruent matrices without affecting the behavioral properties of Y'(s). Since the two matrices are diagonal, each internal node is associated with a single pole of Y'(s). The undesired poles and their associated network internal nodes can then be removed via transformation with a non-square congruence transform. The poles within the range of interest are unaffected, and the first two moments of Y(s) are preserved. Since every post-processing transform is representable as a congruence transformation on (G' + sC'), network passivity is retained.
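The post-processing step can be sketched with a generalized symmetric eigendecomposition, which simultaneously diagonalizes D' and E'; dropping columns then removes individual poles. The fragment below uses random stand-in matrices and a hypothetical cutoff frequency (not the authors' code; scipy assumed):

```python
import numpy as np
from scipy.linalg import eigh

# Sketch of the post-processing step: simultaneously diagonalize D' and
# E' with a square congruence transform, then drop the internal nodes
# whose poles lie above the frequency range of interest.
rng = np.random.default_rng(6)
k = 6
M = rng.standard_normal((k, k))
Dp = M @ M.T + np.eye(k)          # D': symmetric positive definite
M = rng.standard_normal((k, k))
Ep = M @ M.T + 1e-3 * np.eye(k)   # E': symmetric positive definite here

lam, W = eigh(Ep, Dp)             # W^T Dp W = I, W^T Ep W = diag(lam)
poles = -1.0 / lam                # one pole per internal node
f_max = np.median(np.abs(poles)) / (2 * np.pi)   # hypothetical cutoff
keep = np.abs(poles) <= 2 * np.pi * f_max

T = W[:, keep]                    # non-square congruence transform
D_red = T.T @ Dp @ T              # identity on the kept subspace
E_red = T.T @ Ep @ T              # diagonal with the kept eigenvalues
```

Dropping a column of W removes exactly one pole while leaving the remaining poles untouched, and since the removal is itself a congruence transformation, passivity is retained as argued above.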
A few remarks are in order regarding the similarities and differences between our approach and other linear network approximation techniques. On several counts, the congruence transformation method offers substantial advantages over both AWE and PVL. The first relates to performance. Our algorithm is specifically tailored for purely RC networks, whereas the AWE and PVL methods are general (i.e., they are able to accommodate any linear network). With our strategy, problem formulation ensures that the D matrix is positive definite and, consequently, more efficient specialized sparse matrix techniques are applied to determine D^{-1}. It is important to note that we do not forsake accuracy for speed. The ill-conditioned higher-order moment matching problems associated with AWE are well known. Like PVL, we overcome this pitfall by employing the Lanczos method. However, our formulation exploits symmetry in a way that published PVL methods do not. Since the matrices D and E are symmetric, the span constraint of (27) is related to the generalized symmetric eigenvalue problem. Consequently, we employ a symmetric Lanczos algorithm. PVL employs the non-symmetric look-ahead Lanczos process, which requires twice as many forward/backward substitutions. As mentioned above, our execution profiles revealed that these seemingly benign computational operations are responsible for a significant portion of the total CPU time. Finally, we highlight the more fundamental differences. Our method directly generates admittance matrices which contain a minimal number of poles and branches. The G' and C' matrices are easily "un-stamped" to directly realize the reduced RC network in a SPICE-compatible netlist format. More importantly, the reduced network is guaranteed to be passive, so subsequent circuit simulations are well-behaved.
Applications to Mixed-Signal Circuits
In this section we report on the results of the proposed modeling strategy applied to four mixed-signal circuit layouts:
1. A three-stage CMOS ring oscillator adjacent to a sensitive "analog" NMOS transistor (ro3+FET). The ring oscillator circuitry injects noise into the
substrate, and we monitor the simulated voltage signal at the body terminal of the analog transistor.
2. A digital frequency divider and nearby analog current source associated with a CMOS PLL (FreqDiv+Isrc). The subcircuits are separated by approximately 150 µm. The divider circuitry switching transients are coupled through the substrate to degrade the operation of the current source. We monitor the simulated current source signal to study the impact of the switching noise on its designed value.
3. A CMOS operational amplifier completely surrounded by a ring oscillator (ro+OpAmp). We examine the impact of substrate coupling by monitoring the simulated op amp output signal resulting from a small-signal sinusoidal input.
4. The same analog op amp shielded by an ohmic guard ring (ro+OpAmp_w/Guard). We monitor the simulated noise on the amplifier output signal relative to the noise generated with no shielding (i.e., the result using ro+OpAmp, circuit (3)).
All substrate models were extracted directly from layout data using the process technology and mesh properties listed in Table 1. Level 3 transistor parameters derived from MOSIS parametric tests were used for the circuit simulations. The extracted and reduced network parameters for the four circuits are tabulated in Table 2, which also specifies the computational cost and network pole retention data associated with the mesh reduction runs. Mesh generation times are also listed. The frequency below which all poles are retained is determined by the error bound at the specified maximum frequency. We use a tolerance of 5% at 1 GHz. All benchmarking was performed on a SUN Sparc20 workstation with 96 MBytes RAM. HSPICE 95.1 [21] was utilized for the circuit simulations.
Circuit ro3+FET is simple, but useful, since its limited complexity permits SPICE simulation of the substrate mesh with and without network reduction. For each switching transient, the ring-oscillator injects substrate current across one of its output source/drain junction capacitors. The n-type "analog" MOSFET lies between the oscillator and the nearest substrate tie-down, which forces noise current to flow underneath the device. The signal at the transistor body terminal, Fig. 4, reflects the localized fluctuations in the substrate potential modeled by the underlying mesh network. The results obtained using the original extracted model and its transformed equivalent demonstrate excellent agreement.
Transient substrate voltage fluctuations impair the performance of precision analog circuitry through manifestations of the well-known body effect. To assess the circuit impact of noise coupling, it is insightful to monitor the effects of the voltage fluctuations on the performance of the sensitive circuitry. Circuit FreqDiv+Isrc contains a CMOS (digital) frequency divider and (analog) source-coupled current source which represent nearby cells on a monolithic PLL chip. The divider accepts a 25 MHz input clock and generates signals at 12.5 and 6.25 MHz. The digital circuitry is characterized by full-swing signal transitions every 20 nanoseconds. The transients are coupled through the substrate to the current source, and, as a result, the source signal deviates from its DC value by more than 20%. By modeling the chip substrate, we are able to simulate the transient-induced noise which impairs the analog signal. The simulated current source waveform is shown in Fig. 5.
The magnitude of the noise waveform illustrated in Fig. 5 is artificially high because no effort was made to protect the sensitive circuitry from the effects of the digital subcircuit. Our final examples, ro+OpAmp and ro+OpAmp_w/Guard, employ a linear amplifier circuit completely surrounded by a ring oscillator. In the latter case, the amplifier is shielded by an ohmic guard ring. Figures 6 and 7 show four simulation waveforms, all representing the output signal of the amplifier with a 5-MHz small-signal sinusoidal input. For the first simulation (Fig. 6, curve A), we eliminated the source of noise by de-activating the ring oscillator. In the next simulation, we induced substrate-coupled switching noise by allowing the oscillator to run freely. The impact on the amplifier output is readily apparent in Fig. 6, curve B. Figure 7 shows the results for the same simulations, except the substrate coupling is inhibited by the presence of the guard ring.
Simulation CPU measurements for our test circuits with coupled substrate models are shown in Table 3. Excepting ro3+FET, the non-reduced substrate networks are far too complex for post-layout circuit analysis. For that single circuit, however, the speedup obtained with the reduced network is substantial. The simulations for the amplifier circuits are inordinately lengthy since we enforced small transient timesteps to obtain the level of waveform detail shown in Figs. 6 and 7.
Finally, we have emphasized throughout this paper that our algorithm exploits the matrix symmetries which characterize our formulation. Table 4 shows
Table 1. Substrate parameters for model generation.

bulk doping:          p-type, N_sub = 0.90 × 10^14 cm^-3
substrate thickness:  100 µm
well doping:          n-type, Z_j = 3.5 µm, N_o = 5 × 10^16 cm^-3
channel stop doping:  Z_j = 0.2 µm, N_o = 8.0 × 10^17 cm^-3
node planes:          10
Table 2. Substrate modeling CPU data for the 4 mixed-A/D circuit examples. (Columns: FETs; port nodes; mesh generation CPU (s); extracted mesh total nodes; branches; reduction CPU (s); poles retained: number, min freq (GHz), max freq (GHz); reduced network: total nodes, branches.)

ro3+FET:             7;  10;  0.6;   767;  2540;    1.3;   1, 2.30, 2.30;   11,   100
FreqDiv+Isrc:       56;  58;  2.2;  3438; 11916;   25.6;   5, 0.99, 3.02;   63,  3596
ro+OpAmp:          100; 102;  5.2;  9149; 31273;  245.1;  21, 0.48, 3.01;  123, 12445
ro+OpAmp_w/Guard:  100; 102;  5.2;  9461; 32606;  225.2;  21, 0.49, 3.01;  123, 12445

Fig. 4. HSPICE simulation results for the ro3+FET circuit (non-reduced mesh vs. reduced mesh; body-terminal voltage vs. time in nanoseconds).
Fig. 5. HSPICE simulation results for the FreqDiv+Isrc circuit (with substrate mesh vs. without mesh; current source output vs. time in nanoseconds).
Fig. 6. HSPICE simulation results for the ro+OpAmp circuit (output voltage vs. time in nanoseconds).

Fig. 7. HSPICE simulation results for the ro+OpAmp_w/Guard circuit (output voltage vs. time in nanoseconds).
Table 3. HSPICE simulation data.

Circuit             HSPICE CPU (s),   HSPICE CPU (s),   Speedup
                    full mesh         reduced mesh
ro3+FET                 1071.7               4.5         238×
FreqDiv+Isrc               †               543.3
ro+OpAmp                  ††              3026.2
ro+OpAmp_w/Guard                          2015.7

† memory requirement exceeded 96 MBytes
†† terminated during setup matrix re-ordering after approximately 20 hours
Table 4. CPU requirements for D matrix factorization.

Circuit             Internal nodes   Sparse Cholesky (CPU s)   Sparse LU (CPU s)
ro3+FET                   757                 0.5                     3.8
FreqDiv+Isrc             3380                11.8                   134.2
ro+OpAmp                 9047               122.6                   929.0
ro+OpAmp_w/Guard         9359               133.9                  1163.1

empirically-obtained data for the CPU times required to factorize the D matrix for each substrate mesh network discussed in this section. We first employed sparse Cholesky factorization, which is standard for our system. Additionally, we implemented a version of the mesh reduction program which called the non-symmetric sparse LU factorization routines in Sparse [22]. The symmetric methods are faster by an order of magnitude. By comparing the Table 4 data to the mesh reduction CPU times in Table 2, it is also evident that the D factorization accounts for a substantial fraction of the overall run time. Tailoring the reduction algorithm to exploit the pure-RC properties of the extracted substrate networks improves the computational performance significantly.

Conclusion

We report on techniques that greatly reduce the computational limitations associated with equivalent-circuit-based substrate coupling analysis in mixed-signal integrated circuits. RC substrate mesh networks are generated by applying Voronoi tessellation to layout-derived substrate feature data. The networks are efficiently approximated to any level of accuracy using congruent admittance matrices in conjunction with a well-conditioned Lanczos moment-matching process. Due to their enormous complexity requirements, previous approaches have been rigorously demonstrated only on small device-scale examples. With the aim of applying global substrate modeling to realistically large circuits, the proposed techniques make significant strides towards the development of a software methodology capable of analyzing mixed-signal noise behavior in a cognizant, quantitative fashion.

References
1. J. A. Olmstead and S. Vulih, "Noise problems in mixed analog-digital integrated circuits," in Proceedings of the IEEE Custom Integrated Circuits Conference, 1987, pp. 659-662.
2. T. H. Lee, K. S. Donnelly, J. T. C. Ho, J. Zerbe, M. G. Johnson, and T. Ishikawa, "A 2.5V CMOS delay-locked loop for an 18 Mbit, 500 Megabyte/s DRAM." IEEE Journal of Solid-State Circuits 29(12), pp. 1491-1496, December 1994.
3. S. Kumashiro, R. A. Rohrer, and A. J. Strojwas, "A new efficient method for the transient simulation of three-dimensional interconnect structures," in Proceedings of the IEEE International Electron Devices Meeting, 1990, pp. 193-196.
4. F. J. R. Clement, E. Zysman, M. Kayal, M. Declercq, "LAYIN: toward a global solution for parasitic coupling modeling and visualization," in Proceedings of the IEEE Custom Integrated Circuits Conference, 1994, pp. 24.4.1-24.4.4.
5. L. T. Pillage and R. A. Rohrer, "Asymptotic waveform evaluation for timing analysis." IEEE Transactions on Computer-Aided Design 9(4), pp. 352-366, April 1990.
6. N. K. Verghese, D. J. Allstot, and S. Masui, "Rapid simulation of substrate coupling effects in mixed-mode ICs," in Proceedings of the IEEE Custom Integrated Circuits Conference, 1993, pp. 18.3.1-18.3.4.
7. V. Raghavan, R. A. Rohrer, L. T. Pillage, J. Y. Lee, J. E. Bracken, and M. M. Alaybeyi, "AWE-inspired," in Proceedings of the IEEE Custom Integrated Circuits Conference, 1993, pp. 18.1.1-18.1.8.
8. I. L. Wemple and A. T. Yang, "Mixed-signal switching noise analysis using Voronoi-tessellated substrate macromodels," in Proceedings of the 32nd ACM/IEEE Design Automation Conference, 1995, pp. 439-444.
9. T. A. Johnson, R. W. Knepper, V. Marcello, and W. Wang, "Chip substrate resistance modeling technique for integrated circuit design." IEEE Transactions on Computer-Aided Design CAD-3(2), pp. 126-134, April 1984.
10. A. Okabe, B. Boots, and K. Sugihara, Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. John Wiley and Sons: Chichester, England, 1992.
11. S. Fortune, "A sweepline algorithm for Voronoi diagrams." Algorithmica 2, pp. 153-174, 1987.
12. N. K. Verghese, S.-S. Lee, and D. J. Allstot, "A unified approach to simulating electrical and thermal substrate coupling interactions in ICs," in Proceedings of the IEEE International Conference on Computer-Aided Design, 1993, pp. 422-426.
20 K. J. Kerns, l. L. Wemple, and A. T. Yang
13. D. K. Su, M. J. Loinaz, S. Masui, and B. A. Wooley, "Experimental results and modeling techniques for substrate noise in mixed-signal integrated circuits." IEEE Journal of Solid-State Circuits 28(4), pp. 420-430, April 1993.
14. P. Feldmann and R. W. Freund, "Efficient linear circuit analysis by Padé approximation via the Lanczos process," in Proceedings of the European Conference on Design Automation, 1994, pp. 170-175.
15. R. W. Freund, M. H. Gutknecht, and N. M. Nachtigal, "An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices." SIAM Journal on Scientific Computing 14(1), pp. 137-158, 1993.
16. P. Feldmann and R. W. Freund, "Reduced-order modeling of large linear subcircuits via a block Lanczos algorithm," in Proceedings of the 32nd ACM/IEEE Design Automation Conference, 1995, pp. 474-479.
17. A. George and J. W.-H. Liu, Computer Solution of Large Sparse Positive Definite Systems. Prentice-Hall: Englewood Cliffs, NJ, 1981.
18. G. H. Golub and C. F. Van Loan, Matrix Computations, 2nd ed. Johns Hopkins University Press: Baltimore, MD, 1993.
19. J. K. Cullum and R. A. Willoughby, Lanczos Algorithms for Large Symmetric Eigenvalue Computations, Vol. I: Theory. Birkhauser: Boston, MA, 1985.
20. M. R. Wohlers, Lumped and Distributed Passive Networks: A Generalized and Advanced Viewpoint. Academic Press: New York, NY, 1969.
21. HSPICE. Version H95.1. Meta-Software Inc.: Campbell, CA, 1995.
22. K. S. Kundert and A. Sangiovanni-Vincentelli, User's Guide for Sparse: A Sparse Linear Equation Solver, Version 1.3a. Department of EECS, University of California: Berkeley, CA, 1988.
Kevin J. Kerns received the B.S. degree in physics from the United States Air Force Academy, Colorado Springs, CO, in 1988. He served for five years in the U.S. Air Force at the Phillips Laboratory, Hanscom AFB, MA as a space research physicist. He left the Air Force in 1993, and is currently a Ph.D. pre-candidate in the Department of Electrical Engineering at the University of Washington, Seattle.
Ivan L. Wemple received the B.S.E.E. degree from the University of Pennsylvania, Philadelphia, in 1985 and the M.S.E.E. degree from the University of California, Berkeley, in 1987.
In 1987, he held a temporary position with Integrated Device Technology in Santa Clara, California. From 1988 to 1991 he worked as a semiconductor process engineer in the Large Area Electronic Systems Laboratory at General Electric's Corporate Research and Development Center in Schenectady, New York. He interned at National Semiconductor during the summer of 1992.
He is currently a Ph.D. student and research assistant in the Department of Electrical Engineering at the University of Washington, Seattle. His research primarily focuses on the modeling and simulation of parasitic substrate coupling in mixed-signal ICs. Additional research interests include timing simulation and physical design automation for VLSI.
Andrew T. Yang received the B.S. degree in electrical engineering and computer science from the University of California, Berkeley, in 1983, and the M.S. and Ph.D. degrees from the University of Illinois, Urbana-Champaign, in 1986 and 1989, respectively. From 1983 to 1984, he was with Advanced Micro Devices in California. Since 1989 he has been with the University
of Washington, Seattle, where he is currently an Associate Professor of Electrical Engineering. His current research interests include simulation of mixed analog-digital circuits, timing simulation with emphasis on analog modeling, and modeling of semiconductor devices.
Dr. Yang has served as a member of the technical program committee of the IEEE International Conference on Computer-Aided Design. In 1992, he received the NSF Young Investigator Award.
Analog Integrated Circuits and Signal Processing, 10, 23-43 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Feasibility and Performance Region Modeling of Analog and Digital Circuits*
RAMESH HARJANI¹ AND JIANFENG SHAO²
¹University of Minnesota, Department of Electrical Engineering, Minneapolis, Minnesota 55455; ²Intel Corporation, Hillsboro, OR 97124
Abstract. Hierarchy plays a significant role in the design of digital and analog circuits. At each level of the hierarchy it becomes essential to evaluate if a sub-block design is feasible and, if so, which design style is the best candidate for the particular problem. This paper proposes a general methodology for evaluating the feasibility and the performance of sub-blocks at all levels of the hierarchy. A vertical binary search technique is used to generate the feasibility macromodel, and a layered volume-slicing methodology with radial basis functions is used to generate the performance macromodel. Macromodels have been developed and verified for both analog and digital blocks. Analog macromodels have been developed at three different levels of hierarchy (current mirror, opamp, and A/D converter). The impact of different fabrication processes on the performance of analog circuits has also been explored. Though the modeling technique has been fine-tuned to handle analog circuits, the approach is general and is applicable to both analog and digital circuits. This feature makes it particularly suitable for mixed-signal designs.
Keywords: Macromodeling, hierarchical design, analog circuit design, feasibility, performance, modeling
1. Introduction
As feature sizes shrink even further, an increasing percentage of ICs will have analog circuit designs in them, stressing the need for analog design automation. Hierarchy has played a significant role in the design of digital and, more recently, analog circuits [1]. In digital design, extremely large designs are routinely designed by breaking up the task into smaller and smaller sub-tasks, i.e., divide and conquer. Hierarchy helps to hide the lower-level details and also helps to focus attention on more tractable sub-tasks. Even though it may be informal and less well accepted, hierarchy is used in the practice of analog circuit design. For example, an analog-to-digital (A/D) converter is not designed at the transistor level right from the start.
An example hierarchy for an A/D converter is shown in Fig. 1. There are a number of different kinds of A/D converters, e.g., flash, successive approximation, sigma-delta, etc. Each of these is called a design style [1]. The hierarchy for a first-order sigma-delta is shown in this figure. The converter has as subcomponents an integrator, a 1-bit A/D,
* This research was supported in part by a grant from NSF (MIP-9110719)
a 1-bit D/A, and a digital low-pass filter. There are different design styles, or ways of implementing, each one of these components. For example, one of many possible choices for the integrator is shown. One of the subcomponents within the integrator is an opamp. Once again, there are many different design styles for the opamp, e.g., simple one-stage, Miller-compensated two-stage, folded-cascode, etc. [1].
Design proceeds down such a hierarchy. At each level of the hierarchy we are presented with a number of candidate design styles for each functional block. Each of these design styles provides the same functionality but provides different performance tradeoffs. As part of the design we need to select the best candidate for the job at each level of the hierarchy. Two
Fig. 1. A/D converter hierarchical decomposition.
Harjani, et al.
Fig. 2. Feasibility/performance relationship (mapping from the input specification space to the output performance space).

Fig. 3. Design styles for Block A.
decisions need to be made during this selection process. First, we need to evaluate which design styles are feasible, i.e., can be designed to meet input specifications. Second, we need to evaluate which design style provides the best performance. A second aspect of design, i.e., the translation of specifications from one level of the hierarchy to the next lower level, is fairly well understood and has been described elsewhere [1], [2].
One possible solution to the selection problem is to exhaustively try out all the options and then select the best candidate among them. However, the design time increases exponentially as the number of levels in the hierarchy and/or the branching factor increases. Unfortunately, the performance of analog circuits is strongly tied to the bottom-level transistor behavior. Therefore, without an appropriate macromodel, the performance or feasibility of a design at any level cannot be evaluated without traveling to the very bottom of the hierarchy. Typical branching factors in a real design situation can be fairly large at all except the lowest level of the hierarchy. For example, there are probably twenty different design styles for operational amplifiers, i.e., a branching factor of twenty. Therefore, because of the large number of designs that need to be attempted, the costs can be extremely large. As a solution to this problem we propose a numerical macromodeling approach that generates macromodels a priori. To accurately predict performance and feasibility, these macromodels are built bottom-up throughout the hierarchy.
Fig. 4. Translation of specifications
To ensure that the methodology is general, i.e., can be applied at all levels of the hierarchy and to any circuit block, it is essential that the methodology be abstract and not depend on the implementation details of any of the functional blocks. To this end, it is essential that the methodology use a set of general basis functions to perform macromodeling and general techniques for experimental design. Two kinds of macromodels are necessary. One model is necessary to check if a design is feasible, the feasibility macromodel, and the second model is necessary to predict the behavior, the performance macromodel. The relationship between the feasibility macromodel and performance macromodel is shown in Fig. 2. Performance macromodels are mappings from the feasible input specification space to the realizable performance space. At each level of the hierarchy each design style has a single feasibility macromodel and
a set of performance macromodels, one for each performance metric.
In a hierarchical system the performance and feasibility macromodels are built bottom-up and design proceeds top-down as shown in Fig. 1. The feasibility macromodel for each sub-block is generated by ORing the feasibility macromodels for all the design styles that exist for the sub-block. For example, in Fig. 3 Block A can be designed in one of many design styles (e.g., Design Style 1, Design Style 2, Design Style 3, ...). It is possible to generate a design for Block A if any one of the design styles is feasible. However, the feasibility macromodel for each design style at each level of the hierarchy is generated by effectively ANDing the translated feasibility regions of each of the sub-blocks involved. For example, Design Style 1, in Fig. 3, is composed of sub-blocks α and β. During the translation process [1] the design specifications for Block A are transformed into specifications for sub-blocks α and β as shown in Fig. 4. Let us define these transformations as t_α and t_β for the two sub-blocks. Therefore, the feasibility region for Design Style 1 of Block A is defined by ANDing t_α⁻¹ (feasibility region of sub-block α) with t_β⁻¹ (feasibility region of sub-block β). These transforms, t_α and t_β, are usually nonlinear and not easily inverted; therefore the actual process, though conceptually similar, is far more complex.
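The ORing/ANDing composition described above can be sketched in a few lines. The predicates, the two-element specification format, and the translation functions below are illustrative assumptions for the sketch, not the actual OASYS models.

```python
# Sketch of the ORing/ANDing composition of feasibility macromodels.
# The predicate signatures and the toy specifications are illustrative
# assumptions, not the paper's implementation.

def feasible_style1(spec):
    """Toy feasibility macromodel for Design Style 1."""
    gain, bandwidth = spec
    return gain + 0.5 * bandwidth <= 10.0

def feasible_style2(spec):
    """Toy feasibility macromodel for Design Style 2."""
    gain, bandwidth = spec
    return gain <= 8.0 and bandwidth <= 6.0

def feasible_block_a(spec, style_models):
    """A block is feasible if ANY of its design styles is feasible (ORing)."""
    return any(model(spec) for model in style_models)

def feasible_style_from_subblocks(spec, translate_alpha, translate_beta,
                                  feasible_alpha, feasible_beta):
    """A design style is feasible only if ALL of its sub-blocks are feasible
    (ANDing), after translating the block specs into sub-block specs."""
    return (feasible_alpha(translate_alpha(spec)) and
            feasible_beta(translate_beta(spec)))

spec = (6.0, 5.0)
print(feasible_block_a(spec, [feasible_style1, feasible_style2]))  # True
```

In a real system the translation functions would be the nonlinear t_α and t_β above; here, as in the simple example of Fig. 9, identities would be passed in.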
The ANDing and ORing processes are illustrated in Figures 5, 6, 7, 8, and 9. Fig. 5 shows the feasibility macromodels for two design styles. In this figure the x, y and z axes are the three input specifications for this block. For example, let the lower surface define the feasibility region for Design Style 1 and the upper surface define the feasibility region for Design Style 2. For each of the surfaces, all points below the surface are designable. When considering only these two design styles, the feasibility region for the circuit block designable as either Design Style 1 or Design Style 2 is shown in Fig. 6. This illustrates the ORing process. The ANDing process is illustrated in Figures 7, 8 and 9. Let the surface in Fig. 7 define the feasibility region for sub-block α and let the surface in Fig. 8 define the feasibility region for sub-block β. The feasibility region for Design Style 1 is defined where both sub-blocks α and β are feasible. The final feasibility surface for Design Style 1 is shown in Fig. 9. Note the warping due to the translation process. For this simple example the translation functions
Feasibility and Performance Modeling 25
t_α and t_β are selected to be unity. The feasibility region surface shown in Fig. 9 for Design Style 1 would then be one of the surfaces that is ORed while developing the feasibility region for Block A in Fig. 3.
The primary advantage of developing hierarchical macromodels is that the macromodels need only be built once. They can then be used during design time to accommodate a complete top-down design. Therefore, the performance or feasibility of a block at any level can be determined by evaluating the macromodels at that level rather than having to travel to the very bottom of the hierarchy. Only when we need to complete the circuit design do we travel down the hierarchy. However, at each level of the hierarchy, since we have feasibility macromodels, we know which design styles can be designed to meet specifications and which cannot. Likewise, since we have performance macromodels for each of these feasible design styles, we can choose the best candidate.
OASYS [1] is an example of an analog synthesis system that uses similar abstractions to efficiently design analog circuits. We shall use the OASYS system to evaluate our macromodeling approach. However, the methodology being described is general and can be used with any other hierarchical design system. The modeling technique has been fine-tuned to handle analog circuits; however, the approach is general and is applicable to both analog and digital circuits. This feature makes it particularly suitable for mixed-signal designs.
This paper proposes a general methodology for macromodeling using radial basis functions. Since the number of simulations required is large and increases exponentially in higher dimensions, systematic design plans are used to reduce the number of experiments. In our methodology, the significance of the input variables is evaluated using experimental runs. Next, variable screening and variable grouping are performed based on the significance of each input on the individual models. To further minimize the cost of simulations we have developed an adaptive volume slicing technique that dynamically performs experimental runs during regression analysis.
The paper is organized as follows. Section 2 reviews the previous work in this area. Section 3 presents the macromodeling methodology that has been developed. In Section 4 we provide example macromodels for both analog and digital circuit
Fig. 5. Feasibility macromodels for two styles
Fig. 7. Feasibility macromodel for sub-block α
blocks to illustrate the viability of our approach. Finally, in Section 5 we provide some conclusions.
2. Review of Previous Work
In an attempt to evaluate past research it is instructive to provide a more formal definition of the macromodel. In general, the relationship between the ith response and the input variables, and its macromodel, can be represented by
y_i = f_i(x_1, x_2, ..., x_n)  (1)
ŷ_i = f̂_i(x_1, x_2, ..., x_n)  (2)
where x_i (1 ≤ i ≤ n) is the ith input variable, y_i is the ith response, ŷ_i is the approximation of the ith response, f_i is the unknown relationship between y_i and {x_1, x_2, ..., x_n}, f̂_i is the macromodel of f_i, and n is the number of input variables in the ith macromodel. As we shall see later, the number of variables in y_i and ŷ_i does not have to be the same.
A number of macromodeling techniques have been developed for design centering of integrated circuits [3], [4], [5], [6]. Design centering deals with choosing a nominal design which maximizes the fabrication yield. The design approach to improve yield seeks to inscribe the largest norm body, usually a hyperellipsoid, in the boundary by moving the center of the norm body, called the nominal point [3], [4]. In [3], [4], [5] an approximation of the feasibility
Fig. 6. Global macromodel for both styles
region is generated using simplicial approximation. We use a variation of this technique to generate a macromodel for the feasibility region [7]. In [6] an approximation of the feasibility region is generated using function forms that are similar to radial basis functions. The sequential experimental design strategy in [6] is similar to our volume slicing strategy that is discussed later. However, it does not include variable grouping and differs significantly in the details. Additionally, the techniques developed in [6] are geared towards design centering while our techniques have been fine tuned for a hierarchical design system.
In [8], [9], [10] macromodels are built by performing regression analysis on empirical polynomials with the aid of well-designed experiments. The approach in [11] employs a quasi-physical model form for a MOS process using a predetermined set of variables and is thus not sufficiently general for our macromodeling requirements. In the MULREG program [12], a multiple-layered regression scheme is used with a large number of variables. In this approach, the polynomial regression model is synthesized layer by layer using a number of low-order polynomials. The data points are generated randomly, which, unfortunately, leads to extremely large and non-optimal experimental runs. The approach in [5] builds a second-order polynomial macromodel using regression analysis with a fractional factorial experiment plan. However, as with other polynomial-based approaches, the number of experimental runs required increases exponentially with the number of input variables. Additionally, factorial design plans require that the maximum complexity of the polynomial be known before experiments can be designed. However, to provide a general solution, we cannot make any a priori assumptions about the complexity of the response surface.
In the next section, we develop our solution, which uses radial basis functions in combination with a dynamic experimental design strategy to provide general macromodels efficiently.
Fig. 8. Feasibility macromodel for sub-block β
3. Macromodeling Approach
Macromodeling is desirable when the computational cost of generating data points is unacceptably high. As we have seen before, the complexity of the design space can grow rapidly. Both of these stress the need for efficient macromodeling techniques. Macromodel construction requires a number of simulations to gather the data points. Therefore, the number of experimental runs can be used as a metric of the cost of macromodeling. However, if too few experimental runs are performed, the accuracy of the resultant macromodel is sacrificed. Hence, there is a direct tradeoff between the cost of generating the model and its accuracy. Experiments can be generated statically or dynamically. Static experiments are either designed using factorial design techniques [13] or done manually using previous knowledge of the target space. On the other hand, dynamic techniques use no previous knowledge of the target space, but adapt to provide the best tradeoff between cost and accuracy.
In the following paragraphs, we present the four primary components of our macromodeling approach. These are:
• Feasibility region definition: The circuit design problem is defined by specifying input variables and output responses. The specifications include the domain of the input variables and constraints on the output responses. This in effect defines the feasibility region for the design tool. To find the boundary points of the feasibility surface we use one of two search algorithms. Once an adequate number of data points are gathered, a macromodel can be built for the feasibility region. The two methods that are used to perform the search are
Fig. 9. Global model of a design with two components
radial binary search and vertical binary search [7]. Radial binary search requires that the feasibility surface be convex in all dimensions while the vertical binary search requires that the surface be convex in only one dimension.
• Experiment design: Experiment design techniques are employed to build an appropriate experiment plan. The use of an appropriate experiment plan results in substantial savings in the number of experimental runs. In our approach, we use both static and dynamic experimental design techniques. We use a static factorial design technique to measure variable significance and we use a dynamic technique called volume slicing to generate the data points for regression. During volume slicing, we dynamically increase the number of experimental runs where the response surface is more complex and decrease the number of experiments where the response surface is smooth.
• Variable screening and grouping: Through systematic experimental runs the significance of each variable, i.e., its effect on the output response, is estimated. The variables below a certain threshold level are neglected. Selected variables are further grouped into layers with the more significant variables in the upper layers. This classification reflects the varied influences of different variables on the output response and has a significant impact on the complexity of the modeling technique for high dimensions.
• Regression analysis: Having performed the necessary experimental runs and gathered the data points, we use these data points to calculate the coefficients of the macromodel. Additionally, the accuracy of the resultant macromodel also needs to be verified. To dynamically collect the data points, we perform the regression analysis at two levels. A local regression analysis procedure is called recursively for a local area until a certain accuracy is obtained. After all the data points have been obtained, a global regression analysis is then performed to obtain an approximation of the entire surface. The adaptive volume slicing technique was developed in conjunction with local regression analysis to dynamically generate the necessary experiments.
Fig. 10. The simplicial approximation technique applied to a feasibility curve
3.1. Feasibility region definition
The feasibility region is explored by using our macromodeling methodology. In general, the feasibility macromodel generation procedure consists of:
1. Boundary point generation using vertical binary search.
2. Regression analysis using radial basis functions.
The feasibility surface in the input variable space defines the region in which a circuit is designable. Once a feasibility surface for a design style is established then for any set of input specifications we can easily establish if a circuit in that topology can be designed to meet specifications. This provides substantial savings in comparison to a methodology that establishes the feasibility of any topology only by completing the design to the very lowest level of the hierarchy. Hence, it proves highly beneficial to establish the feasibility region for each topology at each level of the hierarchy.
Ideally, we would like to develop a completely general methodology that works under all circumstances. Unfortunately, since we have no a priori knowledge about the convexity of the surface, this is not feasible. Considering this limitation, we have developed a method that works in the majority of circumstances and produces a good approximation of the feasibility surface. Since one of our primary goals is to reduce the cost of generating these macromodels we take particular care to reduce the total number of experimental runs.
Though we use a different equation form and a different search algorithm to locate boundary points, the simplicial approximation technique [3], [14] forms the basis of our technique and will be discussed briefly. For a given region R in an n-dimensional space and its boundary ∂R, the simplicial approximation is based on approximating the boundary ∂R by a polyhedron, which consists of a set of n-dimensional hyperplanes which lie inside ∂R or on it. The procedure starts by determining a set of m ≥ n + 1 points on the boundary ∂R. Usually m is taken to be either n + 1 or 2n. One way to find 2n boundary points is to first locate a point inside R and then perform one-dimensional line searches in the positive and negative directions for each coordinate. A convex polyhedron is constructed using the set of boundary points. This polyhedron is the first approximation of ∂R. Next, the largest inscribed hypersphere is found in this polyhedron. Among the hyperplanes which are tangent to the hypersphere, the largest tangential face is then found. A line search is now performed along the direction from the center of the hypersphere to the tangential point, p_0, on the largest face. It results in a boundary point, p_1. If the distance between p_0 and p_1 is small enough, the polyhedron is a good approximation of ∂R; otherwise, p_1 is added to the set of boundary points and a new polyhedron is constructed. The above procedure is repeated until a good approximation is obtained.
Fig. 10 shows the feasibility surface in two dimensions for a two-stage opamp. The numbered points and the polygon show the steps of the simplicial approximation technique. The simplicial approximation technique is not well suited to this surface. The primary problem is created by the coordinate-axis boundary: it is not possible to find a well-inscribed circle inside the polyhedron that approximates the curve well. This is a fairly typical case because input variables usually take positive values. In addition, we note that the curve is not convex.
The key aspect of any methodology is to find a sufficient set of boundary points efficiently. There are two aspects to finding these points: the direction of the search and the method used to perform the search. In simplicial approximation the search direction is generated by drawing a line from the center of the hypersphere to the tangential point on the largest face. The search is performed using a simple line search. To reduce the number of experiments necessary we perform a binary search instead of a line search. To ascertain the direction of the search we propose two approaches: radial binary search and vertical binary search.
Fig. 11 shows the radial binary search and the vertical binary search strategies. For radial binary search, the search direction starts from the origin of the coordinate system. Here, as with simplicial approximation, the surface needs to be convex. Additionally, the origin needs to lie inside the surface. The second requirement ensures the existence of a solution (a boundary point) of the search. The first requirement ensures the uniqueness of the solution. As shown on the left half of Fig. 11, there exists at least one solution along any direction from the origin. Furthermore, we observe that, along line 1, there are three solutions because the curve is nonconvex in that direction. The radial search strategy is similar to simplicial approximation and therefore the uniqueness of a solution is essential. Furthermore, there is an additional disadvantage to the radial search strategy from a code implementation point of view. Though an implementation can be visualized in two or three dimensions, it is extremely difficult to visualize even the definition of the spherical coordinates in higher dimensions. Most circuit designs have dimensions greater than three. Radial search, though, has the advantage that as long as the two requirements are met, the shape of the surface is irrelevant.
The vertical binary search procedure shown on the right half of Fig. 11 is not without problems either. The problem is illustrated by line 2 in this figure. Any search that is performed along this line will not result in a solution. All the experimental runs along a direction in which there is no solution are completely wasted. Since we have no a priori knowledge of the surface, the search procedure is performed on a regularly shaped region. This is also a requirement of the recursive procedure in the case of higher dimensions. However, it is usually the case that the boundary of the feasibility region in the hyperplane perpendicular to the "vertical" direction is irregular. Therefore, we face the problem of either exploring an incomplete surface or performing the search outside the boundary. However, the advantage of the vertical search procedure is that the requirement on convexity is relaxed. The surface need only be convex in at least one dimension.
For our results we use the vertical binary search procedure. We find that it is not unusual for a feasibility surface to be non-convex. However, our experience suggests that most feasibility surfaces are convex in at least one direction. Hence, vertical search still allows for a general methodology. Additionally, the vertical search procedure is better suited to high-dimensional situations. We also note that since we use radial basis functions to represent the feasibility surface, it is easier to use Cartesian coordinates throughout our entire procedure.
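The vertical binary search can be sketched as follows, assuming a black-box feasibility oracle in place of an actual design attempt; the oracle, interval bounds, and tolerance below are illustrative.

```python
# Sketch of the vertical binary search for a boundary point, assuming a
# black-box oracle `designable(x1, y)` that reports feasibility (the real
# oracle would be a circuit design/simulation run). The interval bounds
# and tolerance are illustrative.

def vertical_binary_search(designable, x1, y_lo, y_hi, tol=1e-3):
    """Bisect along the 'vertical' axis y at fixed x1.

    Requires designable(x1, y_lo) == True and designable(x1, y_hi) == False,
    i.e., the surface is convex (a single crossing) in this one direction.
    Returns the approximate boundary value of y, or None if no crossing.
    """
    if not designable(x1, y_lo) or designable(x1, y_hi):
        return None  # search line misses the feasibility region (cf. line 2)
    while y_hi - y_lo > tol:
        mid = 0.5 * (y_lo + y_hi)
        if designable(x1, mid):
            y_lo = mid  # still inside: boundary is above
        else:
            y_hi = mid  # outside: boundary is below
    return 0.5 * (y_lo + y_hi)

# Toy feasibility region: y <= 4 - x1**2, convex in the y direction.
oracle = lambda x1, y: y <= 4.0 - x1 ** 2
boundary = vertical_binary_search(oracle, x1=1.0, y_lo=0.0, y_hi=10.0)
print(round(boundary, 2))  # ~3.0
```

Each bisection step costs one experimental run, so the boundary is located to tolerance tol in O(log((y_hi − y_lo)/tol)) runs instead of a full line search.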
3.2. Static experiment design
Experiment design is a sampling strategy. Properly designed experiments can result in substantial savings in the number of experimental runs. Since one of our primary concerns is to minimize the number of experimental runs, systematic experiment design techniques are employed to achieve this goal. The resultant experiment plan is used for variable screening and variable grouping. So as not to waste experimental runs, we reuse the results of these experiments for regression analysis as well. To maintain consistency, we use a two-level fractional factorial plan to design experiments for variable screening and for variable grouping.
In a two-level fractional factorial plan, each variable takes two values. In our application, they are selected to be within the feasible input domain. Assuming that each variable has been normalized such that its low and high values are −1 and +1 respectively, a full two-level factorial plan with n variables requires 2^n experimental runs for all possible combinations [13]. The number of experimental runs can be reduced substantially by using a 2^(n−p) fractional factorial plan. Interested readers are referred to [5], [13] for more details.
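A two-level plan and a simple half fraction can be generated as sketched below. The generator used (the last column as the product of the others) is one common choice and is an illustrative assumption, not the specific plan used in the paper.

```python
# Sketch of two-level factorial experiment plans in coded units (-1/+1).
# The half-fraction below uses the defining relation x_n = x_1 * ... * x_(n-1);
# the variable count and generator are illustrative.

from itertools import product

def full_factorial(n):
    """All 2**n combinations of -1/+1 levels for n variables."""
    return [list(run) for run in product((-1, +1), repeat=n)]

def half_fraction(n):
    """2**(n-1) fractional plan: the last column is generated as the
    product of the first n-1 columns, halving the number of runs."""
    base = full_factorial(n - 1)
    plan = []
    for run in base:
        generated = 1
        for level in run:
            generated *= level
        plan.append(run + [generated])
    return plan

print(len(full_factorial(3)))  # 8 runs
print(len(half_fraction(3)))   # 4 runs
print(half_fraction(3))
```

For larger n, repeated application of such generators yields the 2^(n−p) plans mentioned above, at the cost of aliasing higher-order interactions with main effects.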
3.3. Variable screening and grouping
As discussed earlier, not all the input variables have the same effect on the output response. Therefore, it is possible to identify a subset of the input variables which are more significant than the others. These significant variables are then used in macromodel construction while the other variables are discarded from consideration. Even among the selected significant
Fig. 11. Radial and vertical binary search procedures
variables, the degree of influence on the response is different. To further reduce the complexity of the regression analysis, we group the significant input variables into layers.
A slightly modified 2^(n−p) fractional factorial plan is used for variable screening. For each input variable x_i we define the following quantities:
• Main effect, v_i = (1/2){ȳ_i+ − ȳ_i−} = (1/n_r) Σ_{k=1}^{n_r} s_ik · y_k
• High deviation, d_i^h, from the nominal (i.e., the center-point) response, d_i^h = (2/n_r) Σ_{k∈K_i^+} y_k − y_c
• Low deviation, d_i^l, from the nominal response, d_i^l = (2/n_r) Σ_{k∈K_i^−} y_k − y_c
where n_r is the number of experimental runs, y_k is the response value of the kth run, s_ik is the sign of x_i on the kth run, K_i^+ is the set of run indices for which x_i is +1, K_i^− is the set of run indices for which x_i is −1, and y_c is the response value of the center point.
Using these quantities, a statistical significance test is performed to determine if the corresponding input variable is significant. Further, we can define
δ_i^v = v_i / σ̂_v,  δ_i^h = d_i^h / σ̂_h,  δ_i^l = d_i^l / σ̂_l  (3)
where σ̂_v, σ̂_h, and σ̂_l are the estimates of the standard deviation of v_i, d_i^h, and d_i^l, respectively.
A variable is considered to be significant if |δ_i| > t_{α/2,n−1} holds for at least two of the three statistics, δ_i^v, δ_i^h, and δ_i^l. Here α is the desired level of significance and t_{α/2,n−1} is obtained from the t-distribution table. Variables that are considered insignificant are dropped from further consideration. Moreover, we rank the remaining variables by their significance in terms of δ_i^v, δ_i^h, and δ_i^l, and group them into layers according to their ranks. The more significant a variable is, the higher the layer in which it is placed.
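The main-effect computation and the screening/ranking step can be sketched as follows. The toy response, the fixed threshold used in place of the t-test, and the simple ranking are illustrative simplifications.

```python
# Sketch of the main-effect computation used for variable screening.
# The response function and threshold are illustrative; the paper's test
# compares |delta_i| against a t-distribution critical value.

from itertools import product

def main_effects(plan, responses):
    """v_i = (1/n_r) * sum_k s_ik * y_k over a -1/+1 factorial plan."""
    n_r = len(plan)
    n = len(plan[0])
    return [sum(run[i] * y for run, y in zip(plan, responses)) / n_r
            for i in range(n)]

# Toy response: strongly driven by x1, weakly by x2; x3 has no effect.
plan = [list(run) for run in product((-1, +1), repeat=3)]
responses = [10.0 * run[0] + 1.0 * run[1] for run in plan]

effects = main_effects(plan, responses)
print([round(v, 2) for v in effects])  # [10.0, 1.0, 0.0]

# Rank by |effect| and screen against an illustrative threshold.
ranked = sorted(range(3), key=lambda i: -abs(effects[i]))
significant = [i for i in ranked if abs(effects[i]) > 0.5]
print(significant)  # x1 then x2; x3 is screened out
```

The ranked list is then cut into layers of at most three variables, with the most significant variables assigned to the top layer.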
3.4. Regression analysis
Having identified the set of significant input variables and grouped them into layers, we now proceed to construct the macromodel for the response in terms of these variables. In general, a macromodel of a response y is given as described in eq. (2), and the actual response y can be written as y = ŷ + ε, where ε is the random error due to the effects of the insignificant variables we neglected earlier.
The form of the function f̂(·) in eq. (2) can be selected from some known function classes. For example, most previous macromodeling efforts use the polynomial function form. However, models with a large number of variables require excessive computational work, and the resulting model is too complicated to be used. Additionally, polynomial function forms require some a priori knowledge of the system in order to limit the order of the polynomial. Therefore, the polynomial function form is not well suited to a general macromodeling methodology. We employ radial basis functions (RBFs) as the model form in our approach. The linear-in-the-parameters structure of RBFs provides the generality we require, and the dimensionality of the problem space has little effect on the complexity of the resulting macromodel. Furthermore, no a priori knowledge of the problem domain is required for complete model development.
Rather than sampling randomly, we have developed an adaptive volume slicing technique to perform the experimental runs and gather data points. We perform additional experimental runs only in areas where more details are required. To this end, we dynamically adapt the spacing to minimize the total number of experimental runs. However, before discussing volume slicing we present some details about radial basis functions.
Fig. 12. Before and after volume slicing
Radial basis functions
Radial basis functions have been used extensively to approximate multidimensional spaces [15]. The form of the radial basis function model for an n-variable input space with a scalar output response y is given by,
y = λ_0 + Σ_{i=1}^{n_r} λ_i φ(‖x − x_i‖)  (4)
where φ(·) is a function from R^n to R, ‖·‖ denotes the Euclidean norm, x ∈ R^n, λ_i (0 ≤ i ≤ n_r) are the weights or parameters, x_i ∈ R^n (1 ≤ i ≤ n_r) are the RBF centers, and n_r is the number of centers. The function form φ(·) is selected beforehand. The choice of the function form does not affect the average performance, but particular forms are better suited to different conditions. The centers x_i are points in the n-dimensional space where experimental runs are performed. The centers could potentially be distributed uniformly within the input domain. However, substantial savings in terms of experimental runs can be garnered by selecting appropriate centers. These centers can be selected using a priori knowledge of the design space or by using the knowledge garnered from previous experimental runs. The second of these two approaches, dynamic experiment design, was selected because of its generality. We call this experiment design technique adaptive volume slicing and use it to select the RBF centers dynamically.
We note, from eq. (4), that the dimension n of the input space has little influence on the function, because the Euclidean norm ‖·‖ is a scalar. This gives the RBF an advantage over other model forms where the number of terms in the function depends upon the dimension of the input space.
Typical choices for φ(·) are the inverse multiquadric function φ(r) = (r² + β²)^(−1/2) and the Gaussian function φ(r) = exp(−r²/β²). The approximation capability of the RBF model is directly related to its localization properties. The localization property implies that the contribution of points x in the input domain D which are far away from the center of the function φ_i is much less than that of points in the vicinity of the center. For a set of given centers, the global function formed from these radial basis functions, with a contribution from each center, is given by eq. (4). It is easy to see that φ(r) → 0 as r → ∞; for these choices, the RBF approximation has good localization properties. A judicious choice of the RBF centers, or experiments, is extremely important and leads to our adaptive volume slicing strategy.
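A minimal fit of the eq. (4) model form can be sketched as follows, assuming a Gaussian basis and illustrative centers; since eq. (4) has one more parameter (λ_0) than interpolation conditions, this sketch resolves the slack with least squares.

```python
# Sketch of fitting an RBF model y = lam0 + sum_i lam_i * phi(||x - x_i||)
# with a Gaussian basis, by solving the linear system at the centers.
# The centers, beta, and the toy response are illustrative.

import numpy as np

def gaussian(r, beta=1.0):
    return np.exp(-(r / beta) ** 2)

def fit_rbf(centers, values, beta=1.0):
    """Solve for [lam0, lam1..lam_nr] so the model matches `values`."""
    nr = len(centers)
    A = np.ones((nr, nr + 1))          # first column carries lam0
    for k, x in enumerate(centers):
        A[k, 1:] = gaussian(np.linalg.norm(centers - x, axis=1), beta)
    # One more unknown than equations; use least squares for the sketch.
    lam, *_ = np.linalg.lstsq(A, values, rcond=None)
    return lam

def eval_rbf(lam, centers, x, beta=1.0):
    phi = gaussian(np.linalg.norm(centers - x, axis=1), beta)
    return lam[0] + phi @ lam[1:]

centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([c[0] + 2.0 * c[1] for c in centers])  # toy response
lam = fit_rbf(centers, values)
print(round(float(eval_rbf(lam, centers, centers[3])), 3))  # ~3.0
```

Note that the model size is set by the number of centers n_r, not by the input dimension n, which is the advantage claimed for RBFs above.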
Adaptive volume slicing
The allocation of the RBF centers has a direct impact on the performance of the RBF approximation. Additionally, we also wish to minimize the number of experimental runs. Both of these constraints require an efficient method to determine when and where to place the RBF centers. We call our method of locating the RBF centers adaptive volume slicing. In adaptive volume slicing we choose the intersection points of surfaces in the input domain as the RBF centers, and slice the volume only when the accuracy of the approximation is not sufficient. To illustrate this methodology, we present an example in two dimensions.
The adaptive volume slicing method is compatible with the experiment design techniques used for variable screening and grouping. Therefore, we are able to reuse the experimental runs generated for variable screening and grouping. To maintain generality we adopt a regular structure for the data points, i.e., the method is recursive. The left side of Fig. 12 shows a unit of the input domain D_0, with input variables x_1 and x_2 and the output response y. To start with, the intersection points c_i, 1 ≤ i ≤ 4 (the four nodes of the square), are chosen as the RBF centers. From eq. (4), we have
y = Σ_{j=1}^{4} λ_j φ_j(‖x − c_j‖)  (5)
which is the approximation of the response over the input domain D_0. The RBF coefficients λ_j are solved for using the values at the c_j's. To check if the accuracy of this approximation is good enough, we perform an experimental run at the center point c_0 and generate
Fig. 13. Volume slicing in 3 dimensions (original and new RBF centers)
the response value y_c. From eq. (5), we have the estimated value of the response at c_0, ŷ_c. If the criterion in eq. (6) is satisfied, then the RBF model of the four centers is a good approximation for the domain D_0. Otherwise, we slice the square, resulting in the figure on the right side of Fig. 12. For each subarea D_i, 1 ≤ i ≤ 4, we repeat the above procedure for D_0. The recursive procedure is terminated when the criterion given in eq. (6) is satisfied for each subarea. The parameter ε in eq. (6) can be varied for the desired level of accuracy.
| (y_c − ŷ_c) / y_c | ≤ ε  (6)
The simple volume slicing technique described here can easily be extended to three dimensions. However, for higher dimensions (n > 3) the basic methodology has to be modified due to the complexity of the data structures that are generated. This modification and generalization of the volume slicing approach to n dimensions is described next.
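The two-dimensional recursion can be sketched as follows. For brevity the local model is the corner average rather than a local RBF fit, duplicate samples at shared corners are not merged, and the response is assumed nonzero at each center; only the recursion and the eq. (6)-style relative-error test carry over from the method above.

```python
# Sketch of adaptive volume slicing in two dimensions. `simulate` stands
# in for an experimental run; the corner-average local model and the
# tolerance are illustrative simplifications.

def slice_domain(simulate, x_lo, x_hi, y_lo, y_hi, eps=0.05, depth=0,
                 max_depth=6, points=None):
    """Recursively collect sample points until the center of each cell is
    predicted within relative error eps (cf. eq. (6))."""
    if points is None:
        points = []
    corners = [(x, y) for x in (x_lo, x_hi) for y in (y_lo, y_hi)]
    corner_vals = [simulate(x, y) for x, y in corners]
    points.extend(zip(corners, corner_vals))

    xc, yc = 0.5 * (x_lo + x_hi), 0.5 * (y_lo + y_hi)
    y_true = simulate(xc, yc)              # extra run at the center
    y_est = sum(corner_vals) / 4.0         # stand-in local model
    points.append(((xc, yc), y_true))

    if abs((y_true - y_est) / y_true) > eps and depth < max_depth:
        # Not accurate enough: slice into four sub-squares and recurse.
        for sx_lo, sx_hi in ((x_lo, xc), (xc, x_hi)):
            for sy_lo, sy_hi in ((y_lo, yc), (yc, y_hi)):
                slice_domain(simulate, sx_lo, sx_hi, sy_lo, sy_hi,
                             eps, depth + 1, max_depth, points)
    return points

# Smooth response: no slicing needed; a sharper response would trigger more.
pts = slice_domain(lambda x, y: 2.0 + x + y, 0.0, 1.0, 0.0, 1.0)
print(len(pts))  # 5: four corners plus the center
```

Runs concentrate where the response surface is complex and stay sparse where it is smooth, which is the cost/accuracy tradeoff the method targets.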
Layered volume slicing
To extend the volume slicing method to 3-dimensions, a cubic unit is used instead of a square unit as shown in Fig. 13. When slicing a cube, we add 19 additional center points, instead of 5 as in the 2-dimensional case. The number of RBF centers that have to be added dynamically grows exponentially for higher dimensions. To mitigate this problem, we make use of the significance of the input variables. We group the variables into layers, as discussed in subsection 3.3. Each layer has at most three variables. The most significant variables are assigned the highest layers. The layering for a single unit cube is shown in Fig. 14.
Fig. 14. The layered structure for input variables: Layer 0 (x01, x02, x03), Layer 1 (x11, x12, x13), Layer 2 (x21, x22, x23), ..., Layer N (xN1, xN2, xN3)
Fig. 15. Original performance data for the two-stage opamp (area in μm² vs. voltage gain in dB)
The volume slicing method for each layer is the same as that described in the previous subsection. Let the top layer be L0; then the procedure for layered volume slicing is as follows:
1. Call the procedure for layer L0.
2. For layer L_i, perform the volume slicing procedure in the input domain of x_i1, x_i2, and x_i3: obtain the value of the response at each node, derive the local RBF model for the local domain, and run an extra simulation at the center of the domain to check whether eq. (6) is satisfied. If eq. (6) is not satisfied, slice the local domain as described in the previous subsection. For each RBF center, to obtain the value of the response, set the values of x_i1, x_i2, and x_i3, and call the volume slicing procedure for layer L_{i+1}. Return the average response value of the RBF center points.
We make the approximation that the response value of an RBF center in layer L_i is equal to the average of the response values of layer L_{i+1} with the input variables in layer L_i set to the values at that center. Since we have grouped the most significant variables in the top layer, the response values of
Fig. 16. Macromodel using 1st-order polynomials (area in μm² vs. voltage gain in dB)
RBF centers in a lower layer have less influence, so the resulting error from making this approximation is small. It is important to note that this approximation only affects the experiment design phase; it has no effect on the final regression, in which all the input variables are included at the same level. We have found that this variable grouping strategy works well in practice. We show examples of this in the next section.
4. Results
In this section we provide results to evaluate the validity of our methodology. We first provide some results that confirm our choice of basis functions, by comparing radial basis functions with the ubiquitous polynomial basis functions. We then compare our results with some other available techniques. We provide detailed macromodels for a number of analog and digital modules at different levels of the hierarchy. Finally, we evaluate the effect of altering the fabrication process used to realize the analog circuits.
4.1. RBFs vs. polynomials
We now compare the effects of using RBFs instead of polynomials as the function form in the macromodel. The number of RBF centers is determined by the size of our data set. Increasing the dimension of the input space does not directly affect the number of regressors (RBF centers). On the other hand, for polynomials of n variables, the number of regressors is m = Σ_{i=0}^{l} m_i, where l is the polynomial degree, m_0 = 1, and m_i is the number of monomials of degree i in the n variables.
Feasibility and Performance Modeling 33
Fig. 17. Macromodel using 2nd-order polynomials (area in μm² vs. voltage gain in dB)
It is obvious that the number of regressors m increases exponentially as l increases. Therefore, in practice, l must be restricted. This is the reason that, in most polynomial approximations, only second-order polynomials are considered. Although second-order polynomial approximations work well in some cases, they usually require some knowledge of the response surface. Moreover, second-order polynomials do not provide a general solution and cannot be used for higher-order surfaces. The RBF approach provides a more general solution and can be used for response surfaces of any dimension.
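The growth of m can be checked numerically. The closed form C(n+l, l) for the number of monomials of degree at most l in n variables is a standard combinatorial identity, used here purely for illustration:

```python
from math import comb

def poly_regressors(n, l):
    """Number of monomials of degree <= l in n variables: C(n + l, l)."""
    return comb(n + l, l)

# Growth with polynomial order for a 7-variable opamp-style input space:
counts = {l: poly_regressors(7, l) for l in (1, 2, 3, 4)}
# e.g. poly_regressors(7, 2) == 36 regressors for a quadratic fit
```

Even at modest orders the regressor count (and hence the number of required experimental runs) grows quickly, which is why l is restricted in practice.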
Some results that illustrate this claim are shown in Figures 15-19. Fig. 15 shows the data generated from experimental design for a two-stage opamp using our adaptive volume slicing methodology. The performance metric being evaluated is the area consumed with respect to gain and bandwidth. Figures 16, 17 and 18 show the macromodels created for this data set using 1st-, 2nd- and 3rd-order polynomial functions. Fig. 19 shows the macromodel created for this data set using RBFs. Clearly, the RBF macromodel provides a better fit than any of the polynomial macromodels. For the polynomial macromodels the quality of the fit can be improved by increasing the polynomial order; however, as mentioned earlier, the complexity of the model then increases exponentially as the number of input variables increases. It should also be mentioned that the abrupt drop in the area at 70 dB in the original data is caused in the OASYS system by a topology change in one of the sub-blocks used in this opamp. Similar results can be expected from other synthesis systems. Therefore, a general methodology must be able to adapt to sharp local changes. In addition to the quality of the fit, the order of the polynomial determines the number of regressors necessary to generate the macromodel.
Fig. 18. Macromodel using 3rd-order polynomials (area in μm² vs. voltage gain in dB)
The number of regressors for a polynomial fit can be determined a priori; however, to do so we need to specify the order of the polynomial before running the experiments, which in turn implies that the user has some a priori knowledge of the surface, i.e., it is not a general solution.
Polynomial basis functions have the advantage that once the order of the polynomial is fixed, the number of regressors and their locations can be generated ahead of time using well-established experimental design techniques [13]. However, as mentioned earlier, this implies some a priori domain knowledge, i.e., not a general solution. Unfortunately, with RBFs as basis functions no such well-established experimental design techniques exist. With polynomials the number of regressors can be selected ahead of time because once the order of the polynomial is selected, the maximum curvature of the surface is also established. This is not true for RBFs, which exploit their locality property to generate better local fits. Therefore, for RBFs the regressors and their locations need to be decided dynamically. Overall, RBFs provide a more general solution and require no a priori knowledge of the surface, but unfortunately they cannot exploit the vast experimental design knowledge that is available for polynomial basis functions.
4.2. Other regression methods
Next, we compare our procedure with other available regression procedures. In particular, we compare the quality of fits produced by methods such as multivariate adaptive regression splines (MARS), k-nearest neighbors (K-NN), and projection pursuit (PRPR) [16].
Fig. 19. Macromodel generated using RBFs (area in μm² vs. voltage gain in dB)
Two different sets of data points were generated. One set of data points (training) was used to build the macromodels, and the second set (evaluation) was used to evaluate the quality of fit for all the different procedures. Using these macromodels, an estimated value for each test data point was computed for the different methods. The performance of each regression method was measured using the normalized RMS error (NRMS). The NRMS value for each procedure is calculated by dividing the RMS error by the estimated standard deviation of the test set. The results of this experiment are shown in Table 1. The NRMS value for our methodology is significantly lower than for the other procedures. The training set and the evaluation set are the same for all procedures and contain 182 and 25 points, respectively. The CPU times for the various procedures on a Sun SPARCstation 2 are also listed. However, these numbers are not necessarily reflective of the different algorithms, as the implementations were written by people with different levels of expertise. Additionally, the different algorithms provide significantly different debug trace information during normal operation. In addition to providing a significantly lower NRMS value, our procedure does not require a user to adjust parameters to tune its performance. In general, either domain knowledge is used for parameter setting, or the procedure has to be repeated several times to obtain improved performance. From the results shown in Table 1, we observe that our methodology is favored: the performance is no worse than the other methods, and it provides a more general solution.
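The NRMS metric described above can be written as a short sketch; interpreting "estimated standard deviation" as the sample standard deviation (ddof=1) is our assumption:

```python
import numpy as np

def nrms(y_pred, y_true):
    """Normalized RMS error: the RMS prediction error over the evaluation set
    divided by the sample standard deviation of the measured responses."""
    y_pred, y_true = np.asarray(y_pred, float), np.asarray(y_true, float)
    rms = np.sqrt(np.mean((y_pred - y_true) ** 2))
    return rms / np.std(y_true, ddof=1)
```

A model that does no better than predicting the mean of the test responses scores an NRMS close to 1, which puts the MARS, K-NN, and PRPR values of about 1.10 in Table 1 in context against our 0.1459.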
4.3. Macromodeling results
In this subsection, we apply our macromodeling methodology to a few circuit examples. We illustrate the versatility of the approach by applying the modeling technique to both analog and digital circuits at different levels of the hierarchy. As mentioned earlier, we treat each of these designs as black boxes; only the interfaces are visible. Some of the feasibility macromodeling results presented here have appeared previously in [7], and some of the performance macromodeling results presented here have appeared previously in [17].

Table 1. Normalized RMS error for different methods

            Our      MARS     K-NN     PRPR
NRMS        0.1459   1.1022   1.0989   1.1026
Training    182      182      182      182
Evaluation  25       25       25       25
Time (sec)  0.7      0.5      1.0      0.1

Table 2. The inputs considered for the current mirror

Variable   Definition             Range
R_o^min    min. output impedance  0.4-200 MΩ
v_o^min    min. voltage swing     0.25-1.0 V
Example I: Current Mirror Macromodel Our first example is a simple CMOS current mirror. The current mirror is a common functional block in CMOS analog circuit design. In the design hierarchy, it is at the second layer, just above the transistors. The design styles available in this design example are the simple current mirror, the cascode current mirror, and the Wilson current mirror [1], [18]. A simplified set of input variables for the current mirror is shown in Table 2. The metric being monitored in this performance macromodel is the 3 dB bandwidth at the output node of the designed current mirror. The two input variables are grouped into a single layer. A 2-d volume slicing procedure is called to generate the regressors (RBF centers). A macromodel is built using the RBF approximation given in eq. (4). To verify the accuracy of the approximation, we randomly choose additional data points in the input domain, different from the RBF centers, and compare the approximated value BW_e of the response with the actual value BW_o. The tabulated bandwidths are in MHz. These results and the normalized error in the approximation are shown in Table 3. The total number of
Table 3. Current mirror macromodeling results

R_o^min (MΩ)   v_o^min (V)   BW_e       BW_o       ε
12.0           0.41333       0.050985   0.049578   0.0284
12.0           0.57333       0.098020   0.095328   0.0282
18.0           0.41333       0.033419   0.033320   0.0029
18.0           0.57333       0.064187   0.064149   0.0006
24.0           0.41333       0.018647   0.018717   0.0037
24.0           0.57333       0.036012   0.036029   0.0005
30.0           0.41333       0.011959   0.011969   0.0009
30.0           0.57333       0.023040   0.023038   0.0001
36.0           0.41333       0.008320   0.008307   0.0015
36.0           0.57333       0.015989   0.015989   0.0000
RBF centers that were generated for this example is equal to 146 and the evaluation set size that was used to validate this model is equal to 10. We note that the errors are fairly insignificant. The maximum error is less than 3 percent.
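The normalized errors in Table 3 appear to be |BW_e − BW_o| / BW_o; a quick sketch reproduces the first two entries:

```python
def norm_error(bw_e, bw_o):
    """Normalized approximation error, assumed to be |BW_e - BW_o| / BW_o."""
    return abs(bw_e - bw_o) / bw_o

# First two (BW_e, BW_o) pairs from Table 3, in MHz.
first_rows = [(0.050985, 0.049578), (0.098020, 0.095328)]
errors = [round(norm_error(e, o), 4) for e, o in first_rows]
# errors -> [0.0284, 0.0282], matching the first two entries of Table 3
```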
To illustrate the distribution of the regressors and the response surface, we show the response surface for I_in = 25.0 μA in Fig. 20. The grid on the surface demonstrates that the regressors are not uniformly distributed. This is due to our adaptive volume slicing method, in which regressors are generated only when necessary. For the same accuracy, the adaptive technique always results in fewer experimental runs than a uniformly distributed regressor method. All factorial design techniques distribute regressors uniformly.
Example II: Opamp Macromodels The opamp uses the current mirror and other functional blocks in its design. We investigate two design styles: a two-stage opamp and a one-stage OTA. The subset of input variables considered for our example is listed in Table 4. We have selected the active area as the response to be monitored.
For our experiments we set v_o^max and v_o^min to have the same magnitude but opposite sign. We therefore only have seven input variables. Using a 2_III^(7-4) design plan, we obtain the values of V_i, d_i^l, and d_i^h shown in Table 5. We use the significance criterion developed in subsection 3.3 to discard and group the input variables. The variable power is discarded, and the variables gain, ugf, and slew are grouped
Fig. 20. The performance surface for the current mirror (3 dB bandwidth in MHz vs. R_o^min and v_o^min)
Table 4. The input variables for the opamp

Variable   Definition       Range
Gain       voltage gain     40-100 dB
UGF        bandwidth        0.2-30 MHz
Slew       slew rate        0.5-30 V/μs
C_ld       load             0.1-50 pF
Power      supply current   1.0-5.0 mA
v_o^max    max. V_out       0.5-2.25 V
v_o^min    min. V_out       -2.25 to -0.5 V
Phase      phase margin     30°-75°
into Layer 0, and the variables C_ld, phase, and v_o^max are grouped into Layer 1. Using the layered volume slicing procedure we obtain the response surface for the selected input domain. Fig. 21 shows the response surface with all variables except the gain and the ugf fixed. The variables are not fixed in the model; they were fixed for the figure because it is not possible to display data for more than three dimensions. Table 7 shows the results of the macromodel for the opamp. For the results shown in this table, C_ld, v_o^max, and phase are fixed. The errors here, though slightly larger than for the current mirror, are still small (less than 9%). The performance surface for the one-stage OTA is modeled in a similar manner and is shown in Fig. 22.
In Table 7 the performance metric being monitored is different from the input performance specifications. However, it is also possible to develop a realizable performance macromodel corresponding to each input performance specification. For example, for an opamp such a macromodel would monitor the actual designed voltage gain corresponding to the input voltage gain specification.

Table 5. Two-stage opamp variable significance

Variable   Definition     V_i      d_i^l    d_i^h
Gain       x1             29.157   30.560   27.754
UGF        x2             12.270   10.867   13.672
Slew       x3             14.312   12.910   15.715
C_ld       x1 × x2        4.825    6.227    3.422
Power      x1 × x3        1.785    0.382    3.187
Phase      x2 × x3        8.377    9.779    6.974
v_o^max    x1 × x2 × x3   10.920   12.322   9.517

Such a macromodel for a two-stage opamp was developed and is shown in Table 6. In this table the first column corresponds to the minimum input gain specification provided to the synthesis tool. The second and third columns correspond to the input bandwidth and slew-rate specifications. The fourth column shows the realizable gain as estimated by the macromodel, Gain_e, and the final column shows the actual gain realized, Gain_o, as estimated by the synthesis tool. First, we note that the estimated gain, Gain_e, and the actual realized gain, Gain_o, match fairly well; the worst-case error is slightly over 6%. Next, as in the first row, the input specification is approximately 60.7 dB, yet the realized gain is approximately 71.2 dB. The realizable gain is also affected by other performance parameters. For example, in row 7 the realized gain increases slightly, to approximately 72.1 dB. This is a result of the input bandwidth specification having increased from 6 to 8.5 MHz. Though not as dramatic here, the interactions between parameters are substantially more apparent for the feasibility macromodels discussed next.
Fig. 23 shows the feasibility region for a two-stage opamp, and Fig. 24 shows the feasibility region for a single-stage OTA opamp. Each macromodel was generated by finding the maximum designable bandwidth corresponding to the different gain and slew-rate specifications. The surfaces in both figures correspond to the feasibility macromodels; all circuits below the surface are designable. Both macromodels were generated using the vertical binary search procedure. Note the substantial difference in the shape of the two graphs, i.e., different design styles provide different performance tradeoffs. Also, note
Table 6. Two-stage opamp macromodeling results: gain response
Gain (dB)   UGF (MHz)   Slew (V/μs)   Gain_e (dB)   Gain_o (dB)   ε
60.692822 6.069282 6.069282 71.199997 71.806931 0.008524
72.692818 6.069282 6.069282 82.620003 83.352333 0.008864
84.692818 6.069282 6.069282 91.760002 90.990631 0.008385
60.692822 7.269282 6.069282 71.550003 72.085754 0.007488
72.692818 7.269282 6.069282 82.650002 81.988686 0.008001
84.692818 7.269282 6.069282 97.699997 91.475945 0.063706
60.692822 8.469282 6.069282 72.099998 71.666290 0.006015
72.692818 8.469282 6.069282 82.690002 77.822746 0.058861
84.692818 8.469282 6.069282 91.639999 91.969749 0.003598
60.692822 6.069282 7.269282 71.199997 71.810371 0.008573
72.692818 6.069282 7.269282 82.620003 83.337723 0.008687
84.692818 6.069282 7.269282 91.760002 90.882866 0.009559
60.692822 7.269282 7.269282 71.550003 72.105286 0.007761
72.692818 7.269282 7.269282 82.650002 81.970200 0.008225
84.692818 7.269282 7.269282 97.699997 91.493668 0.063524
60.692822 8.469282 7.269282 72.099998 71.647484 0.006276
72.692818 8.469282 7.269282 82.690002 77.641830 0.061049
84.692818 8.469282 7.269282 91.639999 92.124039 0.005282
60.692822 6.069282 8.469282 71.199997 71.910202 0.009975
72.692818 6.069282 8.469282 82.620003 83.316154 0.008426
84.692818 6.069282 8.469282 91.760002 90.928818 0.009058
60.692822 7.269282 8.469282 71.550003 72.119339 0.007957
72.692818 7.269282 8.469282 82.650002 81.944679 0.008534
84.692818 7.269282 8.469282 97.699997 91.511993 0.063337
60.692822 8.469282 8.469282 72.099998 71.637985 0.006408
72.692818 8.469282 8.469282 82.690002 77.659004 0.060842
84.692818 8.469282 8.469282 91.639999 92.145599 0.005517
Table 7. Two-stage opamp macromodeling results: area
Gain (dB)   UGF (MHz)   Slew (V/μs)   Area_e (μm²)   Area_o (μm²)   ε
50.866 5.086 6.069 11700 11390 0.0257
50.866 8.086 6.069 13100 12660 0.0330
65.866 8.086 6.069 13700 14890 0.0872
50.866 6.586 7.269 12300 12320 0.0017
65.866 6.586 7.269 12900 11980 0.0706
50.866 5.086 8.469 11600 11610 0.0010
65.866 5.086 8.469 12100 12070 0.0019
50.866 8.086 8.469 13100 12550 0.0413
65.866 8.086 8.469 13700 14890 0.0872
the interaction between the three variables. For example, as the input gain specification increases, the maximum realizable bandwidth decreases: it is much harder to design amplifiers with both high gain and high bandwidth. Additionally, we note the non-convex nature of the surface, particularly for the two-stage amplifier. One reason the surface is non-convex is that OASYS uses locally optimal strategies to design sub-blocks, and such locally optimal strategies do not necessarily result in globally optimal solutions. We think that other synthesis systems, as shown later, are also likely to produce non-convex response surfaces; we have therefore developed a methodology that is general and able to cope with such surfaces. The CPU times on a Sun SPARCstation 2 to generate the two plots are 13108 seconds for the two-stage design and 6702 seconds for the OTA design. This is significantly larger than the CPU time necessary for the performance macromodels. During the vertical binary search process the feasibility region is defined by the boundary between feasible and infeasible designs, with the result that a large number of infeasible designs must also be attempted. For the two-stage design 970 RBF centers were generated, while for the OTA design 496 RBF centers were generated. However, the total number of experiments attempted was significantly larger, i.e., 17512 for the two-stage and 8951 for the OTA. This is reflected in the large CPU times.
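The vertical binary search along one (gain, slew-rate) line can be sketched as follows; the monotonicity assumption, the tolerance value, and the hypothetical 12.3 MHz limit are our own illustrative choices:

```python
def max_feasible(feasible, lo, hi, tol=0.1):
    """Vertical binary search (sketch): find the largest bandwidth in [lo, hi]
    for which `feasible(bw)` -- one synthesis attempt -- succeeds.  Assumes
    feasibility is monotonic along this vertical line of the design space."""
    if not feasible(lo):
        return None                      # nothing designable at this point
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid):
            lo = mid                     # designable: raise the floor
        else:
            hi = mid                     # infeasible attempt: lower the ceiling
    return lo

# Hypothetical limit: designs up to 12.3 MHz succeed at this (gain, slew) point.
bw = max_feasible(lambda b: b <= 12.3, 0.2, 30.0, tol=0.01)
```

Note that roughly half of the probes land on the infeasible side of the boundary, which is exactly why the feasibility macromodels above required many more attempted experiments (17512 and 8951) than stored RBF centers.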
Fig. 21. The two-stage opamp performance surface (area in μm² vs. voltage gain in dB and UGF in MHz)

Fig. 25 shows the global feasibility macromodel for the two opamp design styles. As mentioned in Section 1, this global macromodel was generated by ORing the feasibility regions of the design styles concerned, i.e., the two-stage opamp and the OTA opamp. The total CPU time on a Sun SPARCstation 2 to perform the ORing operation is 0.9 seconds, which is significantly less than the time required to generate either of the two macromodels. The top left half of the figure corresponds to the OTA design style and the bottom right half corresponds to the two-stage design. It is clear that the two-stage design is better suited for higher gains, because it has two stages with which to obtain this gain. We also note that for the same power the one-stage OTA is better suited for higher slew rates. One reason we see the abrupt drop in the feasibility surface of the two-stage design for slew rates greater than 40 V/μs is that the design exceeds the power constraint.
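The ORing operation is cheap because a point is globally feasible if either design style can realize it, so the combined maximum-bandwidth surface is simply the pointwise maximum of the two stored style surfaces. The surfaces below are made-up stand-ins for the real macromodels:

```python
import numpy as np

# Grid over the gain and slew-rate specification ranges of Table 4.
gain = np.linspace(40, 100, 61)
slew = np.linspace(0.5, 50, 50)
G, S = np.meshgrid(gain, slew)

# Hypothetical per-style feasibility surfaces (max designable UGF in MHz).
bw_two_stage = np.maximum(0.0, 35 - 0.3 * G - 0.2 * S)
bw_ota       = np.maximum(0.0, 40 - 0.5 * G + 0.1 * S)

# The ORed (global) feasibility macromodel: pointwise maximum of the two.
bw_global = np.maximum(bw_two_stage, bw_ota)
```

Evaluating a pointwise maximum over two already-built models is why the ORing takes 0.9 seconds, while building either underlying macromodel takes hours of synthesis runs.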
In the next two figures we show the effectiveness of the adaptive volume slicing strategy for experimental design. We use our volume slicing method to generate RBF centers for an opamp circuit design. Both figures show the number of experimental runs necessary to explore the design space for the two-stage opamp. In these figures each circle represents an experimental run and the lines represent domain edges. We slice the input domain units and add new centers (i.e., experimental runs) only when it is necessary to further explore the details of the response surface. By doing so we not only improve the performance of the RBF approximation but also minimize the number of necessary experimental runs. Since these improvements are made without any prior knowledge about the shape of the response surface, our methodology proves to be extremely general. Compared to a uniform regressor distribution, the first example results in a savings of 30% in the number of experiments and the second example results in a savings of 56%. Both models were generated with the same error criterion, i.e., the same level of accuracy. The difference between the two experiments is the value of the other input specifications. Clearly, the savings depend on the shape of the surface: a "flatter" surface requires fewer experimental runs. The savings shown in these figures are fairly representative and usually vary between 5% and 70%.

Fig. 22. The OTA performance surface (area in μm²)
Example III: ΣΔ Converter Macromodel As shown in Fig. 1, the sigma-delta converter includes an integrator, a comparator, and a digital LPF. The feasibility macromodel for the first-order sigma-delta converter can be built through the design hierarchy. Given a set of specifications for the sigma-delta converter, resolution n and bandwidth f_0, the design is feasible if and only if the specifications translated for each sub-block lie in their feasible design regions. Macromodels for the integrator, the comparator, and the LPF can be built using a similar procedure. Using the feasibility macromodels for the two-stage opamp and the one-stage OTA, the comparator, and the digital LPF, we obtain the 2-d feasibility curve for the sigma-delta converter in Fig. 28. This corresponds to the ANDing, after translation, of the feasibility macromodels of the various subcomponents of the data converter, i.e., the integrator (opamp), the comparator, and the digital LPF. However, unlike the simple example discussed in Section 1, it is not possible to make a direct visual comparison of the feasibility macromodels of the subcomponents and the data converter because of the nonlinear mapping of the design specifications during the translation process. For this experiment, simplified first-order models were assumed for the sigma-delta converter. A number of second-order effects, including flicker and switching noise, slewing nonlinearities, tones, supply rails, etc., were neglected. The feasibility region including all these second-order effects is likely to be more complex. For this simple example, the performance of the data converter
Fig. 23. Feasibility region for a two-stage opamp
is primarily limited by the performance of the integrator, which in turn is limited by the performance of the opamp. This is also likely to be true of a more complex analysis [19], [20].
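The ANDing-after-translation idea can be sketched as follows. Every detail here is invented for illustration: the spec-translation rule and all numeric limits are hypothetical stand-ins, not the chapter's first-order models.

```python
def translate(n_bits, f0):
    """Hypothetical first-order translation of converter specs (n bits, f0 in Hz)
    into sub-block specs.  The resolution-to-OSR rule below is made up."""
    osr = 2 ** (n_bits / 1.5)
    return {"integrator": {"ugf_mhz": 2 * osr * f0 / 1e6},
            "comparator": {"speed_mhz": osr * f0 / 1e6},
            "lpf": {"taps": int(4 * osr)}}

# Stand-ins for the sub-block feasibility macromodels (limits are invented).
feas = {"integrator": lambda s: s["ugf_mhz"] <= 30.0,
        "comparator": lambda s: s["speed_mhz"] <= 100.0,
        "lpf": lambda s: s["taps"] <= 4096}

def converter_feasible(n_bits, f0):
    """ANDing after translation: feasible iff every translated sub-block spec
    lies inside that block's feasible region."""
    specs = translate(n_bits, f0)
    return all(feas[b](specs[b]) for b in specs)
```

The nonlinear `translate` step is why the converter's feasibility region cannot be compared visually against the sub-block regions, as noted above.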
Example IV: 16-bit Digital Adder Macromodel Though our methodology has been explicitly fine-tuned for analog circuits, it is equally applicable to digital circuit blocks, as no knowledge of the underlying circuit is assumed. In this experiment we have generated a macromodel for a 16-bit digital adder. The results are shown in Fig. 29. The x-axis on this graph shows the number of bits of lookahead and the y-axis shows the total gate count. The data for this experiment was generated using the Mentor Graphics Autologic tool set. Rather than simply varying the number of bits of the adder, we have chosen to vary the number of bits of lookahead; as shown in Fig. 29, this results in a more interesting graph. The adder was initially designed in VHDL and then optimized equally for both area and power. Since these optimization transforms are not necessarily linear, we note the non-monotonic variation in the gate count with the increased number of bits of lookahead. We also note that the surface is not convex.
Example V: Effects of Process Parameters In this last set of examples we explore the effect of different process parameters on the realizable performance of analog circuits. All the previous results were generated assuming that the circuit was to be fabricated using the MOSIS 2 μm N-WELL process. In the next few figures we show the effect on the performance of analog circuits if they were instead designed using the MOSIS 1.2 μm N-WELL process. Figures 30 and 31 show the area performance macromodel generated for a two-stage opamp using the MOSIS 2 μm and 1.2 μm processes respectively. Here we note that
Fig. 24. Feasibility region for an OTA opamp (UGF in MHz vs. gain and slew rate)
both the actual area values and the shape of the surface are different, clearly showing that the performance of analog circuits is highly process dependent. Because the performance of analog circuits is highly process dependent, placing hard limits to perform topology selection is extremely limiting. A design tool that uses such hard limits can only be used for a single fabrication process; if it is to be used for a different process, a new set of limits needs to be generated. Furthermore, there is a strong interaction among the input variables. For example, it is much easier to design either a high-gain or a high-bandwidth amplifier than it is to design a high-gain and high-bandwidth amplifier. It is not possible to capture the many subtleties of these interactions with hard limits.
The next two figures show the feasibility macromodels for the two-stage and one-stage opamps using the MOSIS 1.2 μm process. Recall that the feasibility macromodels for the same amplifiers using the MOSIS 2 μm process were shown in Figures 23 and 24. Note the substantial difference in the shape and the maximum realizable performance for the two processes. It is much easier to generate higher-bandwidth amplifiers using the 1.2 μm process; however, it is much harder to realize large gain. Conversely, with the 2 μm process it is easier to realize large gain but much harder to realize large bandwidths.
Both these experiments further stress the need to use process dependent macromodels to perform topology selection in the design of analog circuits. Next, we provide some concluding remarks that evaluate the performance of our methodology.
Fig. 25. The global macromodel for the two opamp styles
4.4. Methodology performance
The rationale for developing hierarchical macromodels was to substantially reduce the design time. The savings in design time for a simple design may not be significant because of the limited number of design styles at each level included in the OASYS system, i.e., the small branching factor. Recall that there are over twenty design styles for an opamp, while only two of them have been implemented in OASYS. However, even with this limited branching factor, the savings can be substantial when running multiple experiments, as may be necessary for design space exploration [1]. Additionally, the rationale for developing a systematic methodology to select experimental runs was to reduce the effort necessary to develop the macromodel. The savings in design time and in the number of experiments are illustrated by the following examples.

1. The macromodeling methodology results in a significant savings in design time. For example, to develop the performance macromodel for the OTA, we performed 196 experiments in 524.34 seconds of CPU time (2.675 seconds/data point). Having built the macromodel, we were able to perform 667 experiments in 20.46 seconds of CPU time (0.031 seconds/data point). This is a savings of 98.84%. Similar savings also result in the case of generating the combined feasibility macromodel for the two opamp styles. If the combined macromodel were generated from scratch it would require 13108 + 6703 = 19811 CPU seconds. However, once the individual macromodels for the two design styles were generated, the ORing required only 0.9 CPU seconds, a savings of more than 99.99%. Other savings are design dependent; however, these examples are indicative of the savings that can result from using our macromodeling methodology.

Fig. 26. Adaptive volume slicing (1) (voltage gain in dB on the horizontal axis)
2. We were able to generate extremely high accuracy macromodels using our methodology. In general, the error between predicted values and measured values varied from 0% to 9%; the majority of errors were less than 1%. This is true even for surfaces that are highly nonlinear (see Fig. 25), making this an extremely general methodology.
3. In macromodeling, traditional experimental design techniques use no knowledge of the statistical distribution of previous regressors and thus distribute the regressors uniformly. To obtain the same accuracy for the two-stage opamp performance macromodel, the best traditional method would generate 289 data points, while the volume slicing method required only 204 data points (see Fig. 26). This is a savings of approximately 30%. For a different set of circumstances the savings was even larger, i.e., 56% (see Fig. 27). However, care should be taken when interpreting these numbers. Since no a priori knowledge can be assumed, a designer would normally either overestimate the number of experimental runs, in which case our savings would be substantially greater, or underestimate the number of experimental runs, yielding a less accurate model.
4. Design space exploration without a macromodel requires performing a design run for each point on the design surface. With the macromodel, the design surface is already approximated, so design space exploration only involves evaluating the macromodel a number of times. As illustrated earlier, the savings can be substantial. More importantly, the substantial reduction in real time means that design space exploration becomes practical even for systems that take substantial design time [2]. For example, the real time required to generate the feasibility macromodel for the OTA opamp design discussed in Section 4 was over eight and a half hours. However, once the macromodel was generated it required less than five minutes of real time to evaluate and display it.

Fig. 27. Adaptive volume slicing (2) (voltage gain in dB on the horizontal axis)
5. Conclusions
In this paper, we have presented a general macromodeling approach for hierarchical circuit design. The validity of our approach was tested by generating macromodels at different hierarchical levels. Fractional factorial experiment design techniques were used to measure the significance of input variables. Variable screening and grouping techniques were employed to select and organize the input variables based upon their influence on the output response. An adaptive volume slicing technique was used during regression analysis to dynamically distribute regressors such that the number of experimental runs is minimized. The RBF approximation is well suited to our methodology because of its locality and linear-in-parameter structure. Our methodology produces extremely accurate macromodels in an efficient manner. Additionally, once generated, these models are easy to evaluate and provide substantial savings in design time. We have found that our methodology is extremely general and works well for both analog and digital circuit blocks.
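The linear-in-parameter structure mentioned above is what makes RBF macromodels cheap to fit: with fixed centers, training reduces to a single linear least-squares solve. A minimal sketch under assumed center placement, width, and test function (none of which are the paper's settings):

```python
# Minimal sketch of a linear-in-parameters radial basis function (RBF)
# fit: Gaussian bumps on fixed centers, solved by linear least squares.
# Centers, width, and the test surface are our illustrative choices.
import numpy as np

def rbf_features(x, centers, width):
    # One Gaussian bump per center; "locality" means each basis function
    # responds only near its own center.
    return np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)          # regressor values ("experiments")
y = np.sin(2 * np.pi * x)               # measured response surface

centers = np.linspace(0.0, 1.0, 15)
Phi = rbf_features(x, centers, width=0.1)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # the linear-in-parameters fit

# Evaluating the macromodel is now just a matrix-vector product.
x_test = np.linspace(0.05, 0.95, 50)
y_hat = rbf_features(x_test, centers, 0.1) @ w
print(float(np.max(np.abs(y_hat - np.sin(2 * np.pi * x_test)))))
```

Because the parameters enter linearly, refitting after new experiments is a cheap re-solve rather than a nonlinear optimization.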
References
1. R. Harjani, R. A. Rutenbar, and L. R. Carley, "OASYS: A framework for analog circuit synthesis," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, December 1989.
42 Harjani, et al.
2. E. S. Ochotta, R. A. Rutenbar, and L. R. Carley, "ASTRX/OBLX: Tools for rapid synthesis of high-performance analog circuits," in ACM/IEEE Design Automation Conference, 1994.
3. S. W. Director and G. D. Hachtel, "The simplicial approach to design centering," IEEE Transactions on Circuits and Systems, July 1977.
4. R. K. Brayton, G. D. Hachtel, and A. S. Vincentelli, "A survey of optimization techniques for integrated-circuit design," Proceedings of IEEE, October 1981.
5. K. K. Low, A Methodology for Statistical Integrated Circuit Design. PhD thesis, Carnegie Mellon University, Pittsburgh, Pennsylvania, 1989.
6. M. C. Bernardo, R. Buck, L. Liu, W. A. Nazaret, J. Sacks, and W. J. Welch, "Integrated circuit design optimization using a sequential strategy," IEEE Transactions on Computer-Aided Design, vol. 11, pp. 361-372, March 1992.
7. J. Shao and R. Harjani, "Feasibility region modeling of analog circuits for hierarchical circuit design," in IEEE Midwest Symposium on Circuits and Systems, 1994.
8. Y. Aoki, H. Masuda, S. Shimada, and S. Sato, "A new design centering methodology for VLSI device development," IEEE Transactions on Computer-Aided Design of Integrated Circuits, vol. CAD-6, pp. 452-461, May 1987.
9. A. R. Alvarez, B. Abdi, D. Young, H. Meed, J. Teplik, and E. Herald, "Application of statistical design and response surface methods to computer-aided VLSI device design," IEEE Transactions on Computer-Aided Design of Integrated Circuits, vol. CAD-7, pp. 272-288, February 1988.
10. T. Yu, S. Kang, I. Hajj, and T. Trick, "Statistical performance modeling and parametric yield estimation of MOS VLSI," IEEE Transactions on Computer-Aided Design of Integrated Circuits, vol. CAD-6, pp. 1013-1022, November 1987.
11. P. Cox, P. Yang, S. Mahant-Shetti, and P. Chatterjee, "Statistical modeling for efficient parametric yield estimation of MOS VLSI circuits," IEEE Transactions on Electron Devices, vol. ED-32, pp. 471-478, February 1985.
12. C. Shyamsundar, "Mulreg - user's manual," technical report, Carnegie-Mellon University, 1986.
13. G. Box, W. Hunter, and J. Hunter, Statistics for Experimenters: an Introduction to Design Data Analysis and Model Building. John Wiley, 1978.
14. L. M. Vidigal and S. W. Director, "A design centering algorithm for nonconvex regions of acceptability," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. CAD-1, pp. 13-24, January 1982.
15. M. Powell, "Radial basis functions for multivariable interpolation: A review," in IMA Conference on Algorithms for Approximation of Functions and Data, RMCS, 1985.
16. V. Cherkassky, D. Gehring, and F. Mulier, "Pragmatic comparison between statistical and neural network methods for function estimation," in Proc. World Congress on Neural Networks WCNN-95, July 1995.
17. J. Shao and R. Harjani, "Macromodelling of analog circuits for hierarchical circuit design," in IEEE International Conference on Computer Aided Design, 1994.
18. R. Gregorian and G. C. Temes, Analog MOS Integrated Circuits for Signal Processing. Wiley and Sons, 1986.
19. J. C. Candy and G. C. Temes, eds., Oversampling Methods for A/D and D/A Conversion, pp. 1-25. IEEE Press, 1992.
20. R. Harjani, The Circuits and Filters Handbook, ch. Analog-to-Digital Converters. CRC Press, 1995.
Ramesh Harjani received the B.Tech., M.Tech. and Ph.D. degrees in electrical engineering in 1982, 1984, and 1989 from the Birla Institute of Technology and Science, Pilani, India, the Indian Institute of Technology, New Delhi, India, and Carnegie Mellon University, Pittsburgh, PA, respectively. He was with Mentor Graphics Corporation, San Jose, CA, until he joined the Department of Electrical Engineering at the University of Minnesota, Minneapolis, MN in 1990, where he is currently employed. His research interests include analog CAD techniques, low-power analog circuit design, disk drive electronics, and analog and mixed-signal circuit test.
Dr. Harjani received the National Science Foundation Research Initiation Award in 1991, and a Best Paper Award at the 1987 IEEE/ACM Design Automation Conference. Dr. Harjani is a member of IEEE and ACM and is currently an Associate Editor of the IEEE Transactions on Circuits and Systems II.
Jianfeng Shao received the B.S. degree in physics from the University of Science and Technology of China in 1989, and the M.S.E.E. and M.S.C.S. degrees from the University of Minnesota, Minneapolis, MN in 1994. He is currently at Intel Corporation, Oregon, and specializes in networking and database application development.
Fig. 28. Feasibility region for a sigma-delta converter. [Plot removed in transcription; x-axis: Resolution (bits), 8-20; y-axis: 0-70000.]

Fig. 29. Feasibility macromodel for a 16-bit digital adder. [Plot removed in transcription; x-axis: Number of bits of carry lookahead, 10-16; y-axis: 200-550.]

Fig. 30. Area performance macromodel for a 2u process. [Plot removed in transcription; axes: Area (µm²) vs. Voltage Gain (dB).]

Fig. 31. Area performance macromodel for a 1.2u process. [Plot removed in transcription; axes: Area (µm²) vs. Voltage Gain (dB), 50-70.]

Fig. 32. Two-Stage feasibility macromodel (1.2u). [Plot removed in transcription; axes: UGF (MHz) vs. Voltage Gain (dB), 50-100.]

Fig. 33. OTA feasibility macromodel (1.2u). [Plot removed in transcription; x-axis: Voltage Gain (dB), 25-35.]
Analog Integrated Circuits and Signal Processing, 10,45-65 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Behavioral Modeling Phase-locked Loops for Mixed-Mode Simulation
BRIAN A. A. ANTAO¹, FATEHY M. EL-TURKY², AND ROBERT H. LEONOWICH²
¹Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL 61801, and ²AT&T Bell Laboratories, Allentown, PA 18103
Abstract. Phase-locked loops (PLLs) are a class of feedback systems with a wide range of applications. A PLL in its entirety can be viewed as a closed-loop servosystem, comprised of three major functional subsystems: 1) phase detectors, 2) loop filters and 3) voltage/current controlled oscillators. The overall characteristics of the phase-locked loop are dependent on the realization of the individual subsystems, which have mixed analog-digital implementations. In simulating a PLL, one has to deal with the mixed-signal nature of most implementations, as well as the problem of simulating the PLL over a large number of signal cycles. Long simulation run times plague the simulation of a PLL using a conventional simulator, sometimes making such simulation impractical. In the methodology described in this paper, these drawbacks are overcome by the use of behavioral models and a mixed-signal simulation platform. This paper presents a general mixed-mode behavioral simulation methodology and the derivation of behavioral simulation models for various kinds of PLLs. The top-down and bottom-up modeling paradigms are illustrated through examples of actual PLL designs. The simulation models are generated for the AT&T Bell Laboratories mixed analog-digital simulator, ATTSIM.
Keywords: Phase-locked loops, behavioral modeling, mixed-mode simulation, modeling language description
I. Introduction
Phase-locked loops are frequently used in a wide range of applications, ranging from data recovery in communication systems to clock synthesizers in digital systems. Some of the general areas in which the PLL finds applications are 1) tracking, 2) synchronization, 3) linear demodulation, 4) phase or frequency demodulation of analog or digital signals and 5) amplitude detection [10], [39]. Phase-locked loops are also used for frequency tuning in the implementation of continuous-time analog filtering applications [20]. The essential components of the PLL are 1) the phase detector, 2) the loop filter and 3) the voltage or current controlled oscillator. Each of these components can be realized in analog or digital circuit technology. Based on the implementation of its components and the type of signals operated on, PLLs may be classified as analog, discrete, digital or hybrid in nature [18].
The design of a PLL involves considering the overall system characteristics, such as stability, and constructing the constituent components. The component parameters are chosen so that the complete closed-loop PLL circuit exhibits behavior that complies with the desired characteristics. The system parameters are usually verified with a linearized model, which is valid once locking occurs, since the behavior of the PLL prior to locking is highly non-linear or stochastic in nature. Simulation of the PLL design is essential to verify its functionality. Simulation of the PLL usually requires that implementations for the specific components exist. The PLL can then be simulated using a conventional simulator. Simulation of a PLL is however plagued by two major bottlenecks: 1) the PLL system and the signals typically are mixed analog-digital in nature, often requiring a circuit-level simulator; 2) a large number of clock cycles have to be simulated to measure the PLL performance characteristics. Various approaches have been taken to address these bottlenecks [6], [25], [36], [38], ranging from custom simulators that usually cater to a fixed PLL implementation to SPICE-type macromodels; however none offer a comprehensive solution.
A PLL usually has a mix of analog and digital components. A digital PLL can have digital reference and output signals with intermediate loop signals being analog in nature. Thus, due to the hybrid nature of the PLL, a designer has to resort to conventional circuit-level simulation using a SPICE-like simulator [30] with device-level models or macromodels [12], [36]. Detailed circuit-level simulation of the entire PLL is plagued by long simulation run times and does not really provide quick estimates of the PLL performance. The other approach has been to use customized simulators to overcome the problem of extended simulation run times [6], [25], [38]. These approaches are restricted to simulating a specific PLL implementation. The models are tightly coupled to the simulator and cannot be used in simulating a larger system in which the PLL may function. Also, it is desirable to be able to simulate a PLL during the design process itself, i.e. to simulate a partial design to verify the functionality of the entire PLL loop without having to design the whole implementation. This paper addresses these problems by the use of a general purpose mixed-signal and multi-level simulator, and behavioral models at varying degrees of abstraction.
In a typical application, a PLL forms a subsystem in a larger system configuration. In such cases device level PLL models cannot be used effectively to simulate an entire system. The AT&T Bell Laboratories mixed-mode simulator, ATTSIM, described in this paper provides the capability of simulating large systems. The simulation can be obtained in reasonable runtime by utilizing system-level behavioral models for each of the individual subsystems. The simulator provides a framework for 1) Multi-level simulation, i.e. mixing behavioral level models and circuit level models within the same simulation and 2) Mixed-Mode simulation mixing analog and digital signals and models. MOTIS [8], SAMSON [33] and SPLICE [1] are examples of other multi-level tools geared towards simulating systems in the digital domain, by mixing gate-level and electrical-level models. The simulator M3 incorporates analog models in the MOTIS framework [7]. In this context, SPECS [37] is another digital simulator that exploits circuit latency and the use of varying
accuracy device models to speed up simulation. Using piecewise approximate models, the granularity of the simulation models can be varied by trading off accuracy and simulation speed. These tools however are restricted to the digital domain. Whereas, ATTSIM provides the capability of multi-level simulation in the analog domain, as well as incorporating digital models for mixed analog-digital simulation.
A system such as the PLL can be quickly simulated by utilizing behavioral models for the constituent subsystems, thus providing the designer with the ability to simulate partial designs by mixing circuit-level models and behavioral models as the design progresses. Using behavioral simulation, a designer can focus on the high-level characteristics of the PLL and design the loop by adjusting the various gain parameters, such as the phase detector gain. Experiments can be made with various types of components and high-level parameters to fine-tune the PLL configuration in a quick and efficient manner prior to detailed circuit-level design. Behavioral models provide a means by which the PLL system can be tested for its overall loop performance characteristics at a higher level, prior to generating circuit-level realizations for the individual subsystems.
Besides integrated circuit simulation, the behavioral modeling capability also provides for board-level simulation of large systems. Board-level designs or circuit packs include off-the-shelf components, i.e. existing integrated circuits. In order to simulate board-level designs, simulation models are required for the various components. Knowing the characteristics of a component, behavioral models can be written to model it in a hierarchical fashion. For example, a behavioral model for a PLL as a whole, composed of models for its components, can be incorporated in the simulation of an entire board-level design. The behavioral characteristics of off-the-shelf components are usually specified in the manufacturer's data sheet. Behavioral models written in a bottom-up fashion from data sheets are used to simulate large board-level designs to verify the overall functionality.
In this paper we describe a behavioral simulation methodology for multi-level and mixed-mode simulation of Phase-Locked Loops using a general purpose simulation engine. Behavioral models for PLL components are derived in both the top-down and the bottomup modes. The models developed in the top-down approach are more general in nature, and by supplying specific parameter values can be tuned to desired specifications. The bottom-up models are useful in speeding
up the simulation of specific PLL implementations, and can be later used in the design of larger systems. Our modeling and simulation approach is directed toward developing system-level behavioral models that can be incorporated in the simulation of a larger system where the PLL is one of the constituent subsystems. The paper is organized as follows: Sections II and III present an overview of phase-locked loop fundamentals and the behavioral simulation methodology. Section IV outlines the PLL behavioral modeling methodology. Behavioral models for various phase detectors along with some simulation results are described in Section V. In Section VI, typical PLL loop filters are described, Section VII covers behavioral models for controlled oscillators, and miscellaneous PLL components are described in Section VIII. Section IX describes the simulation of an all-analog PLL modeled in the top-down paradigm. Section X presents results on modeling and simulation of a novel high-speed CMOS PLL design. Section XI describes the simulation of an off-the-shelf PLL component.
II. Phase-locked loop fundamentals
Figure 1 shows a typical PLL configuration. During operation, the output signal of the voltage controlled oscillator (VCO) is made to track the input or reference signal by a phase-locking process. The phase detector output is the phase/frequency difference between the reference signal and the VCO-generated signal. The output signal of the phase detector is filtered and applied to the control input of the VCO. The effect of the control signal applied to the VCO is such that it causes the VCO output to track the reference signal, eventually locking in phase, i.e. the VCO is synchronized with the reference signal. Analysis of a PLL involves estimating the locking characteristics, which include the acquisition process during which the PLL attains lock, and the stability characteristics, where a stable output is maintained under deviations in the input and noise. Figure 1 shows a typical PLL configuration linearized to obtain a tractable analysis. Let the phase of the input or reference signal be θi, and the phase of the VCO output or the generated signal be θo. The phase detector produces an output signal proportional to the phase difference between the two applied signals. If vd is the phase detector output voltage, then,

vd(t) = Kd (θi − θo)    (1)

Fig. 1. Typical PLL configuration. [Block diagram removed in transcription; the phase detector output vd(t) feeds the loop filter, which drives the VCO.]

Kd being the phase detector gain in volts/radian. Let vc be the output of the loop filter; it is the voltage used to control the output of the VCO. With Δω the deviation of the VCO output frequency from the center frequency,

Δω = Ko vc    (2)
Frequency is the derivative of the phase, and the VCO operation can be expressed as

dθo/dt = Ko vc    (3)

Taking the Laplace transform,

s θo(s) = Ko Vc(s)

therefore,

θo(s)/Vc(s) = Ko/s    (4)

Equation (4) is the transfer function of the VCO, expressing the fact that the phase of the VCO is linearly related to the integral of the control voltage. Proceeding with the analysis of the entire loop,

Vd(s) = Kd [θi(s) − θo(s)]    (5)
Vc(s) = F(s) Vd(s)    (6)
θo(s) = (Ko/s) Vc(s)    (7)

Now,

θo(s) = (Ko/s) Vc(s)
      = (Ko/s) F(s) Vd(s)
      = (Ko/s) F(s) Kd [θi(s) − θo(s)]    (8)
therefore,

s θo(s) = Ko Kd F(s) [θi(s) − θo(s)]

θo(s) [s + Ko Kd F(s)] = Ko Kd F(s) θi(s)

H(s) = θo(s)/θi(s) = Ko Kd F(s) / (s + Ko Kd F(s))    (9)

Equation (9) is the linearized loop transfer function. The error transfer function He(s) is obtained as

He(s) = θe(s)/θi(s) = [θi(s) − θo(s)]/θi(s) = s / (s + Ko Kd F(s))    (10)
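The closed-loop dynamics of equation (9) can be sanity-checked numerically. A sketch for the simplest case F(s) = 1, so H(s) = K/(s + K) with K = KoKd; the gain values and step size are illustrative, not taken from the paper:

```python
# Euler integration of the linearized loop for the simplest loop filter
# F(s) = 1: a 1 rad phase step on the reference should decay to zero
# phase error. Gain values and step size are illustrative.

Ko = 2e3       # VCO gain, rad/s per volt (assumed)
Kd = 1.5       # phase detector gain, volts/radian (assumed)
K = Ko * Kd    # closed-loop bandwidth, rad/s

dt = 1e-6      # time step, well below the loop time constant 1/K
theta_i = 1.0  # 1 rad phase step applied to the reference at t = 0
theta_o = 0.0  # VCO phase starts unlocked

for _ in range(20000):             # 20 ms of simulated time
    vd = Kd * (theta_i - theta_o)  # phase detector, eq. (1)
    vc = vd                        # F(s) = 1: the loop filter is a wire
    theta_o += Ko * vc * dt        # VCO integrates its control voltage, eq. (3)

print(theta_i - theta_o)           # residual phase error; ~0 once locked
```

After many loop time constants (1/K ≈ 0.33 ms here), the error is numerically zero, matching the unity DC gain of H(s).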
The operation of the PLL can be categorized into two modes, the acquisition mode and the tracking mode. In the acquisition mode, the PLL is in the process of synchronizing its generator, the VCO, with the input signal. Once lock-in occurs, the PLL enters the tracking mode where it tracks the input continuously with minimal phase error. In the tracking mode of operation the ability of the PLL to maintain its lock state is dependent on the stability of the input signal. Large steps of deviation in the input signal can drive the PLL out of lock. Based on these modes of operation, the performance characteristics of the PLL are broadly categorized as 1) acquisition characteristics, 2) tracking characteristics and 3) frequency stability.
The signals of interest in a PLL system in the context of the simulation are:
• Reference signal v1(t).
• Angular frequency ωi of the reference signal.
• Output signal v2(t) of the VCO.
• Angular frequency ωo of the output signal.
• Output signal vd(t) of the phase detector.
• Output signal vc(t) of the loop filter.
• Phase error θe between the reference and output signals.
A. Acquisition characteristics
The acquisition characteristics are associated with the lock-in process of the PLL. An important metric in this context is the lock-acquisition time or the lock-in time. The lock-in time can be defined as the time taken by the PLL to synchronize with or lock onto the reference signal. However, this metric is statistical in nature, partly due to the fact that the linear PLL model is based on the assumption that the PLL is in the locked state [10]. This metric is usually modeled by a probability distribution function for a particular loop configuration. This characteristic is typically a time-domain function, and can be estimated by transient simulation of the particular loop in consideration. In later sections of this paper, it is demonstrated that acquisition characteristics such as the lock-in time can be quickly determined by a transient behavioral simulation of the entire loop. Estimates for the desired characteristics can be obtained by assigning various values to the different gain parameters of the loop configuration being modeled.
The acquisition process can be a self acquisition process whereby the loop acquires lock by itself, or an aided acquisition process with the help of auxiliary circuits [13]. Acquisition can be phase acquisition or frequency acquisition, the latter being more generally referred to simply as acquisition. The acquisition process occurs in two stages [10]. The first is the frequency pull-in stage, where the loop adjusts the VCO frequency to match the input frequency. After pull-in occurs, the loop adjusts the VCO phase to match the input phase in the phase lock-in stage.
B. Tracking characteristics
The tracking characteristics deal with how closely the PLL can track deviations in the reference signal and remain in a locked state. The effects of noise in the reference signal on the locked state of the PLL also make up part of this characteristic. Once the PLL attains lock with the input signal it enters a tracking mode of operation, where it continuously tracks the input with a minimal phase error. This condition represents a steady-state where the output is maintained constant for small fluctuations in the input signal. In this mode of operation two performance measures are of interest that characterize the loop performance: 1) Phase error jitter, and 2) Cycle slips. Phase error jitter results in the form of fluctuations or phase variations in the output signal about its nominal value. Cycle slipping occurs when the VCO drops or adds one cycle of oscillation relative to the incoming signal. Cycle slipping is characterized by the metric average cycle-slipping rate, which is the total average number of cycles slipped per second [10].
C. Frequency stability
Stability characteristics measure the limits over which the PLL will remain in lock and will not pull out of the tracking mode due to fluctuations in the frequency of the input signal. The metrics of interest are the hold range and the pull-out range [4]. The hold range is the frequency range over which the PLL can maintain phase tracking of the input. In the hold range the PLL is conditionally stable for gradual variations in the input frequency. If the input frequency fluctuation occurs as a large step change, the PLL could momentarily go out of lock but would reattain phase tracking only if the step change in input frequency is within the pull-out range. The hold range is also referred to as the static limit of stability, whereas the pull-out range is the dynamic limit of stability [4].
III. Behavioral modeling and simulation
Significant speedup in the simulation runtime, especially in hard-to-simulate systems such as the PLL, can be obtained by using behavioral models. Rather than use the approach of writing a customized PLL simulator, more realistic simulation results are obtained by using a mixed-signal behavioral simulator. Thus the behavioral models can not only be used in verifying the functionality of the PLL alone, but can also be incorporated in the simulation of a larger system. Conceptually, the simulation models represent the behavior of the PLL components in general, and should be adaptable to specific simulators by writing the models in a specific simulation language. The behavioral simulation models for the AT&T Bell Laboratories mixed analog-digital simulator are written in ABCDL (Analog Behavioral Circuit Description Language) [2], which borrows its syntax from the widely used C programming language [19]. ABCDL provides a set of predefined data structures with which a user encodes the behavioral models. Other analog hardware description languages are in the standardization process [34]. The behavioral simulator allows a system designer to verify the functional characteristics of large system designs within reasonable simulation run times by utilizing behavioral models. These behavioral models can be of varying degrees of complexity depending on the number of behavioral characteristics that are modeled. At the behavioral level, system designs can be represented as an interconnection of the major functional blocks that comprise the system. For example, a PLL is represented by models of its major functional blocks: the phase detector, loop filter, and VCO.

Fig. 2. Interface of a behavioral C-model. [Diagram removed in transcription; it shows analog pin inputs/outputs, digital pin inputs/outputs, and circuit pins around a behavioral C-model definition.]
Behavioral models differ from macromodels [5] in that macromodels are essentially device-level models with reduced non-linearities and device count. Behavioral models can be written in a top-down or a bottom-up fashion. In the top-down approach, the behavior of the system is known in terms of one of the behavioral-level models, such as s-domain or z-domain transfer functions, or algebraic or differential/difference equations. This known behavior can then be transformed into an ABCDL representation. For this purpose, automated model generation tools such as gensims or gensimz can be utilized [3]. These model generation tools generate ABCDL simulation models from s- and z-domain transfer function specifications. In the bottom-up approach, the behavioral model is abstracted from an existing circuit utilizing the known response characteristics. Modeling the charge-pump or specific implementations of phase detectors from available circuits is an example of the bottom-up modeling approach. In either case, the choice of the functional characteristics to be included in a behavioral model is at the discretion of the designer writing the models.
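The kind of model a z-domain generator like gensimz emits can be sketched by hand for a first-order transfer function; the coefficients, helper names, and closure-based state below are ours, not the tool's actual output format:

```python
# Sketch of turning a first-order z-domain transfer function
#   H(z) = b0 / (1 + a1 * z^-1)
# into the difference equation y[k] = -a1*y[k-1] + b0*u[k], the kind of
# executable model a generator like gensimz would emit. Coefficients,
# names, and structure are illustrative only.

def make_first_order_model(b0: float, a1: float):
    state = {"y_prev": 0.0}          # one delay element of internal state
    def step(u: float) -> float:
        y = -a1 * state["y_prev"] + b0 * u
        state["y_prev"] = y
        return y
    return step

# A lossy discrete integrator: pole at z = 0.9, DC gain b0/(1 + a1) = 1.
model = make_first_order_model(b0=0.1, a1=-0.9)
y = 0.0
for _ in range(200):
    y = model(1.0)                   # drive with a unit step
print(y)                             # settles near the DC gain of 1.0
```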
An ABCDL model consists of the behavioral description with interfaces to external models. Figure 2 shows the interface structure of an ABCDL model. Externally, the model interfaces to other simulation entities through I/O pins. A model can have any of, or a combination of, analog pins (inputs/outputs), digital pins (inputs/outputs) and circuit pins. Analog pins carry a single analog signal, voltage or current, and digital pins carry logic signals. A circuit pin carries analog signals and has an associated V-I characteristic, used to model a pin connection that carries both voltage and current signals. The distinction between an analog pin and a circuit pin is that the analog pin carries a single analog quantity such as the voltage with no loading effects, whereas the circuit pin has an associated voltage and current. The analog and digital models can be connected directly together. The analog-digital interface is handled internally by the simulation engine through the use of threshold functions. This is a significant advantage over the use of conversion models at the analog-digital interface [15]. A typical ATTSIM simulation description of a system includes 1) individual behavioral models written in ABCDL, as C function definitions; 2) connectivity and simulation control parameters specific to a model, defined in an FPDL description; 3) a system connectivity definition (LSL) that links the individual models according to the system configuration, and defines external connectivity to the system as a whole; and 4) test signal definitions and overall simulation control parameters, such as the run time, specified in the ATTSIM command file. Figure 3 is a schematic illustration of a typical ATTSIM description. Central to the ATTSIM behavioral simulation paradigm is the behavioral modeling capability that allows the user to incorporate customized behavioral models in the simulation. Depending on the nature of the functional block being modeled, the model definition can be written using one or a combination of the behavioral-level modeling constructs: 1) algebraic expressions, 2) differential equations, 3) difference equations and 4) non-linear equations. The state-variable formulation serves as a basis for generating behavioral models from high-level transfer function specifications [3]. Other mixed-mode simulation tools [1], [33], [37] use piecewise linear models [11] or table look-up models. Additionally, the ATTSIM behavioral models can also be written using table look-up methods. The state-variable based models utilize differential equations in the continuous domain and difference equations in the discrete domain.

Fig. 3. A typical ATTSIM simulation description. [Diagram removed in transcription; it shows FPDL (pin definitions, model parameters), C models or an ADVICE subcircuit, and LSL (inter-model connectivity, external pin definition).]
These models encode the state equations in the form given by the general expressions (11) and (12) [9], [32]. In the continuous case the state equations are of the form,

ẋ(t) = A(t)x(t) + B(t)u(t)
y(t) = C(t)x(t) + D(t)u(t)    (11)

and for the discrete case,

x(k + 1) = A x(k) + B u(k)
y(k) = C x(k) + D u(k)    (12)
Here x is the nth order state vector, u is the input vector and y is the output vector. Models for purely digital subsystems can be written in terms of boolean expressions or state-tables. An elaborate model example is illustrated in the appendix.
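The discrete-time form (12) maps directly onto an evaluate-and-update routine: compute the output from the current state, then advance the state. A sketch with an illustrative accumulator model (the class name and matrix values are ours):

```python
# The discrete state equations (12) as executable code: a model object
# holds its state vector x and updates it on every evaluation.
# The accumulator matrices at the bottom are illustrative.
import numpy as np

class DiscreteStateSpace:
    def __init__(self, A, B, C, D, x0):
        self.A, self.B, self.C, self.D = (np.asarray(m, float) for m in (A, B, C, D))
        self.x = np.asarray(x0, float)

    def step(self, u):
        u = np.atleast_1d(u)
        y = self.C @ self.x + self.D @ u       # output: y(k) = C x(k) + D u(k)
        self.x = self.A @ self.x + self.B @ u  # advance: x(k+1) = A x(k) + B u(k)
        return y

# Accumulator: x(k+1) = x(k) + u(k), y(k) = x(k).
acc = DiscreteStateSpace(A=[[1.0]], B=[[1.0]], C=[[1.0]], D=[[0.0]], x0=[0.0])
outputs = [float(acc.step(1.0)[0]) for _ in range(5)]
print(outputs)   # [0.0, 1.0, 2.0, 3.0, 4.0]
```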
The ATTSIM simulation engine operates on a chain of simulation models linked together to make up the system configuration being simulated. The simulation is event driven, i.e. a model is evaluated whenever it generates or receives an event. An event occurs whenever a signal changes its current state by more than the minimum specified range. This range can be varied for each model in the FPDL description. The occurrence of an event triggers the evaluation of the associated models. The simulation engine passes a set of parameters to the ABCDL model, which is evaluated and the outputs of the model updated. The resulting output changes are propagated to other models, which are then in turn evaluated.
The event-driven nature of the simulation helps in speeding up simulation runtimes by exploiting the effect of latency in the various subcircuits¹. At each time step only those models which are active are evaluated. This is in contrast to conventional device-level analog simulators, which evaluate the entire circuit at each simulation time point. In event-driven simulation, the user can control the granularity of the events by varying the minimum value of the signal change that needs to occur in order to trigger an event. The event granularity reflects on the accuracy of the simulation results, and represents a tradeoff between simulation accuracy and run time. Additionally, the user also has control over the individual ABCDL model latency by being able to specify an explicit evaluation time interval. These issues pertaining to the simulation platform enable more efficient models to be written.
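The evaluation policy just described can be caricatured in a few lines: re-evaluate a model only when its input has moved by more than the event granularity since the last evaluation. The function names, threshold values, and toy model below are ours, not ATTSIM's:

```python
# Toy event-driven evaluation: a model is re-evaluated only when its
# input has changed by more than a user-set granularity since the last
# evaluation. Names and values are illustrative, not from ATTSIM.
import math

def simulate(granularity: float, n_steps: int = 1000) -> int:
    evaluations = 0
    last_seen = None       # input value at the last evaluation
    output = 0.0
    for k in range(n_steps):
        u = math.sin(2 * math.pi * k / n_steps)   # slowly varying input
        # Event test: has the input moved past the granularity threshold?
        if last_seen is None or abs(u - last_seen) > granularity:
            output = 2.0 * u                      # (re)evaluate the model
            evaluations += 1
            last_seen = u
    return evaluations

coarse = simulate(granularity=0.1)
fine = simulate(granularity=0.001)
print(coarse, fine)   # coarser event granularity -> far fewer evaluations
```

This is the accuracy/runtime tradeoff in miniature: the coarse run tracks the input less faithfully but evaluates the model far less often.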
Behavioral Modeling Phase-locked Loops for Mixed-Mode Simulation 51
IV. Modeling PLL functional blocks
The PLL is modeled by a functional decomposition of the loop configuration. Behavioral models are then developed for each of the individual functional blocks. The generalized models are developed in a top-down fashion, knowing the general behavior of each component. These models are parameterized and can be used in specific applications by assigning appropriate values to the adjustable parameters. Behavioral models for each of the PLL components are derived in the following subsections of this paper. These models encode the general behavior of each functional block. In a later section, behavioral level models are also developed for the functional blocks from a manufacturer's data sheet. Models are developed for both analog and digital realizations. A typical PLL configuration can be all analog, mixed analog-digital with digital input and output and intermediate analog signals, or all digital. The most widely used PLL type has the mixed analog-digital configuration. Besides the basic components, models are also developed for auxiliary circuit blocks, such as the charge pump used along with a sequential phase-frequency detector. Charge-pump based PLLs are yet another category of PLLs [14].
V. Behavioral models for phase detectors
The function of the phase detector is to compare two signals and produce an output signal that reflects the difference in phase and/or frequency between the reference signal and the generated VCO signal. A phase detector can have an analog or a digital implementation. A linear model for a phase detector is

vd = Kd θe + Vdo

Here, Kd is the phase detector gain in volts/radian; θe is the phase error of the VCO output signal relative to the reference input signal; Vdo is the free running voltage.
Various types of phase detectors commonly used in PLL configurations are modeled, these are:
1. Analog multiplier phase detector.
2. Switching analog phase detector.
3. Exclusive-OR (XOR) phase detector.
4. JK type phase detector.
5. Sequential phase/frequency detector.
6. Transmission gate mixer.
Fig. 4. Analog multiplier phase detector.
A. Analog phase detectors
The most common form of an analog phase detector is the multiplier. A typical implementation of an analog phase detector is a four quadrant multiplier, that produces an output proportional to the product of the input signals. A four quadrant multiplier can operate on both positive and negative values of the input signals, i.e. the range of operation covers the four quadrants. The Gilbert multiplier circuit is a commonly used circuit configuration of a four quadrant multiplier [16], [17].
Let vi be the input signal and vo the VCO output signal such that

vi = Vi sin(ωit + θ1)    (13)
vo = Vo cos(ωit + θ2)    (14)

The output of the multiplier is

vd = K vi vo    (15)
vd = 0.5 K Vi Vo sin(θ1 − θ2) + 0.5 K Vi Vo sin(2ωit + θ1 + θ2)    (16)

The second term in equation (16) is a high-frequency ac component which is filtered out by the loop filter; excluding this term gives the average component of the multiplier output:

vd = Km sin(θe)    (17)

with θe = θ1 − θ2 and Km = 0.5 K Vi Vo. For small values of θe near the lock condition, sin(θe) ≈ θe, and the phase detector output is proportional to the phase error.
The behavioral model for a multiplier shown in figure 4 is composed of the following equations,

i1 = 0
i2 = 0
vd = Km vi vo
52 B. A. A. Antao, F. M. El-Turky and R. H. Leonowich
The above equations define the circuit pin (bidirectional pin) characteristics of an ATTSIM behavioral simulation model. Alternatively, the same behavioral model can be implemented using analog pins (unidirectional pins).
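The multiplier model and the averaging result of equation (17) can be checked numerically with a short sketch. Python is used only for illustration (the actual model is written in ABCDL); the frequency and phase values are arbitrary.

```python
import math

def multiplier_pd(vi, vo, Km):
    """Behavioral multiplier phase detector of figure 4: the output is the
    product of the two inputs, and the input pins draw no current."""
    i1, i2 = 0.0, 0.0
    vd = Km * vi * vo
    return i1, i2, vd

# Numerical check of equation (17): averaging vd over one period of
# vi = sin(wt + th1), vo = cos(wt + th2) leaves 0.5*Km*sin(th1 - th2).
w, th1, th2, N = 2 * math.pi * 1e6, 0.3, 0.1, 1000
T = 2 * math.pi / w
avg = sum(multiplier_pd(math.sin(w * k * T / N + th1),
                        math.cos(w * k * T / N + th2), 1.0)[2]
          for k in range(N)) / N
# avg is close to 0.5 * sin(0.2)
```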
Another commonly used analog phase detector is the switching phase detector [13], where the output of the VCO is a square wave instead of a sinusoidal signal. The VCO output signal is now a square wave of the form

vo(t) = sgn[cos(ωit + θo)]
sgn is the signum function that generates a square wave from the periodic analog signal and is defined as

sgn(x) = 1 for x(t) > 0
sgn(x) = −1 for x(t) < 0

The VCO square wave, being periodic, can be expressed as a Fourier series:

vo(t) = (4/π)[cos(ωit + θo) − (1/3) cos 3(ωit + θo) + (1/5) cos 5(ωit + θo) − ...]    (18)
The output of the switching phase detector is the sum of each term of the Fourier series multiplied by the input sinusoidal signal. The dominant low-frequency output term from the phase detector is

vd(t) = K Vi sin(ωit + θi) · (4/π) cos(ωit + θo)    (19)
      = (2/π) K Vi sin(θi − θo)    (20)
Equation (20) indicates that the output of the switching analog phase detector is similar to that obtained with an analog multiplier phase detector. Figure 5 shows the simulation results for a switching phase detector. R is the reference or input signal, V is the VCO square wave output, and OUT is the phase detector output.
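The switching detector and the average of equation (20) can be verified with a brief numerical sketch (illustrative Python; the amplitudes and phases are arbitrary):

```python
import math

def sgn(x):
    """Signum of the periodic analog signal, as defined in the text."""
    return 1.0 if x > 0 else -1.0

def switching_pd(t, K, Vi, wi, th_i, th_o):
    """Switching phase detector: sinusoidal input multiplied by the VCO
    square wave sgn(cos(wi*t + th_o))."""
    return K * Vi * math.sin(wi * t + th_i) * sgn(math.cos(wi * t + th_o))

# Numerical check of equation (20): the average over one period is
# (2/pi) * K * Vi * sin(th_i - th_o).
K, Vi, wi, th_i, th_o, N = 1.0, 1.0, 2 * math.pi, 0.4, 0.1, 100000
avg = sum(switching_pd(k / N, K, Vi, wi, th_i, th_o) for k in range(N)) / N
# avg is close to (2/pi) * sin(0.3)
```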
B. Digital phase detectors
Three types of digital phase detectors are commonly used in PLL implementations [4]. These are the Exclusive-OR (XOR) phase detector, edge triggered JK phase detector, and sequential phase/frequency detector. A fourth type of phase detector is the transmission gate mixer phase detector that was used in a novel high-speed PLL design [21].
Fig. 5. Simulation results of a switching phase detector.
Fig. 6. Exclusive-OR phase detector.
The XOR phase detector functions as an overdriven multiplier where the output is saturated between a positive and a negative value. In the XOR phase detector the saturation levels correspond to logic levels "high" and "low". The multiplier output, which is an analog signal, is positive when both inputs are positive or both are negative, and negative when one of the inputs is negative. The XOR/XNOR exhibits similar output characteristics. Figure 6 shows an XOR phase detector with complementary outputs. The average value of the outputs corresponds to the free-running voltage of the phase detector. To obtain an output that is both positive and negative, a balanced output vd = vb − va can be used. The linear range of the XOR phase detector is −0.5π to 0.5π [39], i.e. the range in which vd rises linearly as a function of θe. In case of the XOR phase detector, the two input signals will lock with a constant phase difference of 90°, resulting in the stable output free-running voltage.
Edge triggered JK phase detector (inputs J, K; outputs UP, DN):

J          K          UP         DN
↑          LOGIC_1/0  LOGIC_1    LOGIC_0
LOGIC_1/0  ↑          LOGIC_0    LOGIC_1
LOGIC_1/0  LOGIC_1    No change  No change
LOGIC_1/0  LOGIC_0    No change  No change
Fig. 7. Edge triggered JK phase detector.
Fig. 8. Sequential phase/frequency detector.
The edge-triggered JK phase detector provides improved characteristics over the XOR phase detector in terms of the linear range [4], [39]. The linear range is −π to π. The XOR phase detector is also sensitive to the duty cycle of the signals and should be used with symmetrical square waves [4]. The JK type phase detector, being edge triggered, can be used with asymmetrical square waves as well. Figure 7 shows the JK phase detector along with the state table that models the behavior.
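The state table of figure 7 can be captured in a small edge-triggered model. This is an illustrative Python sketch (the class and its encoding of edges as 0-to-1 transitions are assumptions for the example):

```python
class JKPhaseDetector:
    """Edge-triggered JK phase detector per the state table of figure 7:
    a rising edge on J drives UP high / DN low, a rising edge on K drives
    UP low / DN high; static input levels leave the outputs unchanged."""
    def __init__(self):
        self.j = self.k = 0
        self.up, self.dn = 0, 1

    def evaluate(self, j, k):
        if j == 1 and self.j == 0:        # rising edge on J
            self.up, self.dn = 1, 0
        elif k == 1 and self.k == 0:      # rising edge on K
            self.up, self.dn = 0, 1
        # otherwise: no change
        self.j, self.k = j, k
        return self.up, self.dn

pd = JKPhaseDetector()
pd.evaluate(1, 0)           # J edge: UP asserted
out = pd.evaluate(1, 1)     # K edge: DN asserted
```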
The sequential phase/frequency detector is a frequently used digital phase detector, with improved characteristics over the XOR or JK phase detectors. The linear range of the sequential phase detector extends from −2π to +2π. The extended linear range also enables frequency as well as phase detection. Figure 8 shows the sequential phase/frequency detector [4], [28] with the state diagram. The behavior of the phase/frequency detector is also expressed using the state transition table shown in figure 9 [28]. In the transition table, a number in parentheses indicates a stable condition. To interpret this table, consider the stable condition (1) when both inputs are 0, under column R-V = 0-0; the corresponding output is U1 = 0 and
R-V     R-V     R-V     R-V     U1  D1
0-0     0-1     1-1     1-0
(1)      2       3      (4)      0   1
 5      (2)     (3)      8       0   1
(5)      6       7       8       1   1
 9      (6)      7      12       1   1
 5       2      (7)     12       1   1
 5       2       7      (8)      1   1
(9)    (10)     11      12       1   0
 5       6     (11)    (12)      1   0
Fig. 9. Sequential phase/frequency detector state transition table.
D1 = 1. If the next input is R-V = 0-1, moving horizontally from the stable condition (1) under R-V = 0-0 to the R-V = 0-1 column results in condition 2, which is unstable. The circuit will assume a stable condition by moving vertically in the R-V = 0-1 column to the stable condition (2), which results in the corresponding output in this case remaining unchanged at U1 = 0, D1 = 1. This phase detector is also commonly referred to as a three-state phase detector [39], as evident from the three states exhibited by the outputs, each corresponding to the condition when the output is 1) lagging (state 1), 2) locked (state 2), or 3) leading (state 3) with respect to the reference signal. Figure 10 shows the simulation results of the sequential phase detector for the three states. The output U1 is pulsed proportional to the phase error when the output signal (V) is lagging behind the reference signal (R) (state 1), the two outputs are high when both signals are locked in phase and frequency (state 2), and D1 is pulsed when the output signal is leading the reference signal (state 3). Similarly, U1 is pulsed if the frequency of the output signal is less than that of the reference signal, and D1 is pulsed if the frequency of the output signal is greater than that of the reference signal. In the PLL operation, signal U1 is used to modulate the VCO frequency upwards, whereas D1 is used to decrease the VCO frequency momentarily. Thus the VCO frequency is corrected up and down until the two signals lock in phase and frequency. The sequential phase/frequency detector thus ensures that the two signals lock in phase as well as frequency, with zero phase difference. With the other two phase detectors, false locking can occur, where the output signal locks on to a harmonic frequency of the reference signal.
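The three-state behavior described above can be sketched as a compact state machine. This is an illustrative Python model, not the 12-state table itself: the table collapses to three output states, and active-low U1/D1 pulses are assumed here.

```python
class SequentialPFD:
    """Behavioral three-state phase/frequency detector sketch: a rising
    edge on R moves toward the 'pump up' state, a rising edge on V toward
    'pump down'; in the locked state both outputs sit high."""
    def __init__(self):
        self.r = self.v = 0
        self.state = 0               # -1 = pump up, 0 = locked, +1 = pump down

    def evaluate(self, r, v):
        if r == 1 and self.r == 0:   # rising edge on the reference
            self.state = max(self.state - 1, -1)
        if v == 1 and self.v == 0:   # rising edge on the VCO feedback
            self.state = min(self.state + 1, 1)
        self.r, self.v = r, v
        u1 = 0 if self.state < 0 else 1   # U1 pulses low: raise VCO frequency
        d1 = 0 if self.state > 0 else 1   # D1 pulses low: lower VCO frequency
        return u1, d1

pfd = SequentialPFD()
lag = pfd.evaluate(1, 0)       # R edge arrives first: U1 pulses
locked = pfd.evaluate(1, 1)    # V edge arrives: back to locked, both high
```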
Fig. 10. Sequential phase/frequency detector simulation results.
(Passive filters: Type 1 and Type 2. Active filters: Type 3 and Type 4.)
Fig. 11. PLL Loop filters.
VI. Modeling the PLL loop filter
The phase detector output contains a signal component proportional to the phase or frequency difference of the two input signals being compared. This output also contains higher order harmonics. The function of the loop filter is to filter out the higher order harmonics and provide a clean control signal that can be used to modulate the output frequency of the voltage controlled oscillator. Typical loop filters used in PLL configurations are first order low pass filters, with either an active or passive realization. The order of the loop filter determines the order of the PLL system, and this relation is expressed as [4]

Order of PLL = order of loop filter + 1    (21)
In this section we describe the most commonly used first order loop filters. Only first order filters are
Filter   Transfer function
Type 1   F(jω) = 1 / (1 + jωτ1)
Type 2   F(jω) = (1 + jωτ2) / (1 + jω(τ1 + τ2))
Type 3   F(jω) = (1 + jωτ2) / (jωτ1)
Type 4   F(jω) = 1 / (jωτ1)

Fig. 12. Loop filter transfer functions.
discussed as the typical PLL is a second order system. Higher order PLLs have complicated configurations which need to include stability compensation [4]. Figure 11 shows four commonly used PLL loop filters. The s-domain transfer functions are shown in figure 12.
Depending on the overall structure of the PLL, either a behavioral model that directly implements the transfer function, or discrete RC components may be used. In PLL configurations that use a charge pump or current controlled oscillators, discrete component models have to be used. In case of active filters, ADVICE/SPICE subcircuits that model the R and C elements are used in conjunction with one of the ideal or non-ideal operational amplifier behavioral models.
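As an illustration, a type 1 filter can be simulated directly from its transfer function F(jω) = 1/(1 + jωτ1) with a simple backward-Euler discretization. This is a sketch, not the ADVICE/SPICE subcircuit approach; the time constant and step size are arbitrary.

```python
def type1_filter(v_in, tau, dt):
    """Discrete-time simulation of the type 1 loop filter
    F(jw) = 1/(1 + jw*tau), using a backward-Euler step:
    v[k] = (v[k-1] + (dt/tau) * v_in[k]) / (1 + dt/tau)."""
    v, out, a = 0.0, [], dt / tau
    for u in v_in:
        v = (v + a * u) / (1 + a)
        out.append(v)
    return out

# A unit step settles to the filter's DC gain of 1.
tau, dt = 1e-6, 1e-8
y = type1_filter([1.0] * 2000, tau, dt)
```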
VII. Behavioral model for voltage controlled oscillators
Essentially two kinds of signal generators are used in PLLs, analog with sinusoidal outputs and digital with square wave output. The signal generators are in the form of controlled oscillators, with the controlling signal being voltage or current. Recent frequency synthesizer applications make use of the current controlled oscillator [35]. General behavioral models are developed for both analog and digital controlled oscillators. By default these models operate with voltage as the controlling signal. These models can be modified to handle current as the controlling signal by changing the input pin to a circuit pin, and replacing the voltage variable in the model equations by the current variable.
A. Analog voltage controlled oscillator
The analog voltage controlled oscillator produces an output signal whose frequency is proportional to the controlling input signal. The VCO characteristic is a linear function of the output frequency with respect to the controlling voltage. When the PLL is in lock, the VCO output frequency ωo equals the input frequency ωi. The control voltage at which locking occurs is called the static control voltage [39]. The frequency ωc is called the center frequency of the VCO and Vco is the corresponding control voltage. The linear VCO characteristic can be expressed in terms of the center frequency as

ωo(t) = ωc + Ko(vc(t) − Vco)    (22)

or in terms of a minimum frequency (the output frequency when vc = 0) as

ωo(t) = ωmin + Ko vc(t)    (23)
Here, Ko is the VCO gain in radians/s/volt. In order to compute the phase angle of the output signal we integrate equation (22):

θo(t) = ∫ ωo(t) dt
      = ∫ [ωc + Ko(vc(t) − Vco)] dt
      = ωct + Ko ∫ (vc(t) − Vco) dt    (24)
The VCO output signal is

vo(t) = Vo sin(θo(t))    (25)
Equation (24) is modeled in ATTSIM by using a continuous state to implement the integration. Nonideal effects can be modeled in the VCO by augmenting equation (24) to include a phase error and jitter term ψ(t), resulting in

θo(t) = ωct + Ko ∫ (vc(t) − Vco) dt + ψ(t)    (26)
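Equations (24)-(25) translate directly into a sketch that integrates the control voltage numerically. This is illustrative Python: rectangular integration stands in for the continuous state used by ATTSIM, and the parameter values are arbitrary.

```python
import math

def analog_vco(vc, wc, Ko, Vco, dt, Vo=1.0):
    """Analog VCO per equations (24)-(25): the output phase is the running
    integral of wc + Ko*(vc(t) - Vco), and the output is Vo*sin(theta)."""
    theta, out = 0.0, []
    for v in vc:
        theta += (wc + Ko * (v - Vco)) * dt    # rectangular integration
        out.append(Vo * math.sin(theta))
    return out

# With vc held at Vco the VCO free-runs at its center frequency: after
# exactly one period (1 us at a 1 MHz center) the phase returns to 2*pi.
wc, Ko, Vco, dt, N = 2 * math.pi * 1e6, 2 * math.pi * 30e3, 0.0, 1e-9, 1000
y = analog_vco([Vco] * N, wc, Ko, Vco, dt)
```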
Sometimes the VCO characteristic is not available in the form of the linear relation (4) but as a table relating the discrete control voltage steps to corresponding output frequencies. In such a situation the linear relation of the VCO model would be replaced by a table model,
and the instantaneous frequency would be computed by looking up the table and interpolating between regions. Besides the linear relation, the VCO can also be modeled to handle a general non-linear VCO characteristic expressed as

ωo(t) = f(vc(t))    (27)
The non-linear model is implemented by replacing equation (22) with the above non-linear relation in the derivation of the phase angle relation (24).
B. Digital voltage controlled oscillator
The behavior of the digital VCO is in many ways similar to its analog counterpart, with the output being a square wave instead of a sinusoid. Two approaches can be taken to model the digital VCO. The first approach is to utilize equations (24) and (25) to generate a periodic signal, from which the square wave output is derived by using the signum function.
sgn(x) = LOGIC_1 for x(t) > 0
sgn(x) = LOGIC_0 for x(t) < 0    (28)
The second approach is to compute the period of the square wave which is modulated instantaneously in response to the control voltage.
The instantaneous output frequency is given by the linear relation

ωo(t) = ωc + Ko(vc(t) − Vco)

The period of the square wave is

period = 2π / ωo(t)
Let tc be the time elapsed from the start of a square wave cycle (tc is reset to 0 at the end of each cycle); the output is as follows:

if (tc < period/2) Vout = LOGIC_1
else if (tc ≥ period/2 && tc < period) Vout = LOGIC_0
The period is updated and hence the output frequency changes whenever the input controlling voltage changes.
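The second (period-modulation) approach can be sketched as follows. This is illustrative Python, with LOGIC_1/LOGIC_0 represented as 1/0 and arbitrary parameter values.

```python
import math

def digital_vco(vc, wc, Ko, Vco, dt):
    """Digital VCO using the period-modulation approach: the square-wave
    period 2*pi/wo is recomputed from the control voltage, and tc tracks
    elapsed time within the current cycle."""
    tc, out = 0.0, []
    for v in vc:
        wo = wc + Ko * (v - Vco)              # instantaneous frequency
        period = 2 * math.pi / wo
        out.append(1 if tc < period / 2 else 0)
        tc += dt
        if tc >= period:                      # end of cycle: reset tc
            tc -= period
    return out

# Held at Vco, a 1 MHz-center VCO outputs a square wave that is high for
# the first 0.5 us of each 1 us cycle.
wc, Ko = 2 * math.pi * 1e6, 2 * math.pi * 47.5e6
y = digital_vco([0.0] * 1000, wc, Ko, 0.0, 1e-9)
```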
Fig. 13. Ideal current charge pump.
VIII. Behavioral models for other PLL components
The sequential phase/frequency detector (PFD) has three output states in the form of logic levels. These logic levels have to be converted to analog quantities to drive the VCO. The charge pump [14] is a device used in conjunction with a phase/frequency detector to convert the logic levels to appropriate analog signals. In essence the charge pump is a three-position switch that delivers a pump voltage ±Vp or a pump current ±Ip to the loop filter. As discussed earlier, the U1 output of the PFD implies a pump-up signal, whereas D1 implies a pump-down signal, which are used to modulate the VCO frequency. In the third state, when the two signals are locked, the switch is open. Figure 13 shows an ideal current charge pump driving a passive filter. Similar configurations with constant voltage sources replacing the current sources may be used.
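The ideal three-position current charge pump reduces to a small switch function. This is an illustrative sketch; active-low U1/D1 pulses are assumed, matching the phase/frequency detector outputs described earlier.

```python
def charge_pump(u1, d1, Ip):
    """Ideal three-position current charge pump: pump-up sources +Ip into
    the loop filter, pump-down sinks Ip, and in the locked state the
    switch is open (zero current). U1/D1 are taken as active-low."""
    if u1 == 0 and d1 == 1:
        return +Ip          # pump up: raise the VCO frequency
    if d1 == 0 and u1 == 1:
        return -Ip          # pump down: lower the VCO frequency
    return 0.0              # locked / idle: switch open

i = charge_pump(0, 1, 1e-4)    # pump-up state sources +100 uA
```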
Another component often used in PLLs is the divide-by-N counter. The divide-by-N counter is used in frequency synthesis applications where the VCO output is a square wave at a frequency that is a multiple of the reference frequency. The VCO output has to be divided down to the reference frequency before the signal is fed back to the phase detector. A parameterized behavioral model is implemented that accepts a variable parameter N for use in PLL applications. The model can accept integral values for N, which is specified in the FPDL file. The model simply counts down N input digital pulses before it generates an output whose frequency is 1/N times the VCO frequency.
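The divide-by-N behavior can be sketched as an edge counter. This is illustrative Python; the pulse-per-N-edges form shown here yields an output pulse rate of 1/N times the input frequency.

```python
class DivideByN:
    """Divide-by-N counter sketch: emits one output pulse for every N
    rising edges of the input, so the output pulse rate is 1/N times the
    input frequency. N corresponds to the FPDL-file parameter."""
    def __init__(self, n):
        self.n, self.count, self.prev = n, 0, 0

    def evaluate(self, vin):
        pulse = 0
        if vin == 1 and self.prev == 0:     # rising edge on the input
            self.count += 1
            if self.count == self.n:        # every N-th edge: output pulse
                pulse, self.count = 1, 0
        self.prev = vin
        return pulse

div = DivideByN(4)
# 16 input cycles produce 4 output pulses
pulses = sum(div.evaluate(v) for v in [1, 0] * 16)
```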
(Block diagram: analog multiplier phase detector, low-pass filter, and analog VCO; fin = 7 MHz, fc = 7 MHz.)
Fig. 14. Analog Phase Locked Loop.
IX. Behavioral simulation of an analog PLL
This section describes the simulation of an analog PLL. The parameters for this PLL were adapted from a design presented in [26]. This example illustrates the top-down modeling methodology, where the overall system characteristics of the PLL are first designed and verified by simulation. The behavioral models allow the system characteristics to be verified in a quick and efficient manner, without having to actually design the detailed circuit implementation. Figure 14 shows the configuration. This PLL has three components: an analog multiplier phase detector, a first order low pass loop filter, and an analog VCO. The parameterized behavioral models described earlier were used to simulate this PLL. The key parameter values are as follows:
Analog multiplier gain Kd = 3.72.
Low pass filter pole at 7 × 10^5 rad/sec.
VCO center frequency = 7 MHz.
VCO gain = 30 kHz/volt.
VCO control voltage at center frequency, Vco = 0 V.
In addition to using separate behavioral models for each of the PLL components, a compact behavioral model was written for the entire PLL. The compact model couples each block of the behavioral model corresponding to a component internally within the model. The signals are made available to the external observer, or to interface to other parts of the system, through external pins. The simulation results with the compact model and the separate models were identical, except that the compact model simulated faster.
The first simulation was performed to check if the PLL would lock with this set of parameters, and to determine the lock-in time. The output of the loop filter is indicative of the PLL performance. The simulation response is shown in figure 15, which displays the output of the loop filter. Lock-in occurs when the loop filter output, which is the VCO control voltage, settles to a steady state value. In this case, since the reference input is at the VCO center frequency, at lock-in the control voltage Vc = 0, and the lock-in time is approximately 12 µs. The input and reference were visually observed to have locked 90° in phase. The CPU time for this simulation, which was carried out for 35 µs or 245 cycles, was 1 min 56.41 sec. All the PLL simulations were carried out on a Sun SPARCstation 1+.
The next simulation was to determine the step response of this PLL. For the step response simulation, an identical VCO was connected to the reference frequency input. A step control voltage of 1.0 volt was applied to the VCO driving the reference. The loop filter output, or the VCO control voltage, is the step response of the PLL. Figure 16 shows the step response simulation of this PLL. The CPU time for simulating 70 µs was 5 min 7.41 sec.
Finally, the same configuration that was used to measure the step response was used to simulate the tracking range of the PLL. A staircase input from −4.0 volts to 3.0 volts, in steps of 1 volt and 35 µs width, was applied to the input VCO, and the entire PLL was simulated over this range for 240 µs. The CPU time for this simulation was 21 min 3.91 sec. Figure 17 shows the simulation results. The PLL fails to lock or track at either end of the input range, and oscillates without reaching a steady state.
X. Behavioral simulation of a high-speed PLL
Fig. 15. Acquisition characteristics of analog PLL.
Fig. 16. Step response of analog PLL.
Fig. 17. Tracking performance of analog PLL.
A mixed analog-digital configuration of a high-speed PLL was modeled and simulated in ATTSIM. This PLL is shown in figure 18 [21]. This example illustrates the bottom-up behavioral modeling methodology. The behavioral models were derived from evaluating the performance and behavior of a custom inte-
(Block diagram: differential transmission-gate mixer phase detector, high-speed VCO with 3× multiplier; CK at fc = 243.75 MHz.)
Fig. 18. High speed PLL design.
(High-speed VCO with 3× multiplier; VCO gain Ko = 47.5 MHz/V.)
Fig. 19. High speed VCO and VCO transfer characteristics.
grated PLL design. This PLL is composed of a transmission gate mixer phase detector, a first order passive RC filter, a high-speed digital VCO, and a 3× frequency multiplier.
The transmission gate phase detector operates as a balanced multiplier with complementary outputs. The phase detector is composed of complementary transmission gates, and the behavioral model is composed of logic expressions that represent the operation of the gates. The characteristics of this phase detector in terms of the linear range are similar to those of the XOR phase detector. The reference and the VCO signal lock with a phase difference of 90°. The average value of the output when the signals are locked at the center frequency corresponds to the free-running output voltage.
The complementary loop filters were simulated using discrete RC components. Though the filter resembles a type 1 loop filter, in actual operation the output impedance of the transmission gates contributes an additional series impedance to the loop filter, making it in effect a type 2 loop filter as described in the section on modeling PLL loop filters. The series impedance was modeled by using a series-connected resistor in the loop filter configuration. The output impedance was estimated by a circuit-level (ADVICE) simulation of a single transmission gate.
The high speed VCO produces three phase output clocks which are combinatorally multiplied to produce a complementary output at three times the VCO output frequency. Figure 19 shows the VCO along with
Fig. 20. Simulation response of the High speed yeo.
Fig. 21. Simulation results for lock-in process of high-speed PLL.
the transfer characteristics. The transfer characteristics were obtained from measurements of the VCO implementation [21]. The VCO was modeled using the second approach outlined in the section on digital VCO models. The multiplier was modeled using behavioral expressions that represent the combinatorial logic implementation. Figure 20 shows the stand-alone simulation response of the VCO; a ramp signal was applied at the control input to verify that the simulated output frequency of the VCO matched the measured VCO characteristics of the actual implementation.
The complete PLL was simulated using the behavioral models at a reference frequency equal to the VCO center frequency of 243.75 MHz. Note that the input frequency is three times the actual center frequency of the VCO, as the VCO-generated signal frequency is increased by a factor of 3 by the multiplier. Figure 21 shows the VCO control voltage, which stabilizes at 2.5 volts, corresponding to the VCO center frequency. The lock-in time is about 4 µs, and the CPU time for the simulation was 3 min 15.21 sec. Figure 22 shows the reference and feedback signals locked 90° in phase. DATA and DATAN are the complementary reference input signals, CK and CKB are the signals fed back to the phase detector, i.e. the output of the multiplier, and VCO1, VCO2, VCO3 are the three phase clock outputs of the VCO.
Since the device-level implementation was available for this design, a mixed-level simulation of the PLL was undertaken. The behavioral model of the phase detector was replaced with the MOS device-level subcircuit of the actual implementation, which consisted of 8 MOS transistors and substrate capacitances. The version of ATTSIM used in this simulation is capable of simulating MOS devices using the AT&T Bell Labs CSIM model with nominal process parameters. Figure 23 shows the simulation results of the lock-in process using a full device-level model for the phase detector compared with an all-behavioral simulation. VCP1 is the response with an all-behavioral model and VCP is the response with mixed behavioral and
Fig. 22. Simulation results oflock-in process of high-speed PLL showing locked signals.
device-level models. The slight deviation in the characteristics is due to the approximate fixed resistor value used to model the output impedance of the transmission gate, which contributes to the pole frequency of the loop filter. In the actual operation of the phase detector the output impedance is slightly modulated by the switching frequency of the transmission gates. The present estimated resistance value provided satisfactory results. Both simulation configurations locked in to the right frequency, the small deviation being in the lock-in time. Parasitics in the device-level models also contribute to this deviation. However, a closer fit may be obtained by iterating the simulation at the behavioral level with various resistance values. The CPU time for the mixed-level simulation for 10 µs was 5 min 17.00 sec, as compared to 3 min 15.21 sec for the all-behavioral model.
To determine the range of the PLL operation, the PLL was simulated with the input frequency at half the center frequency and at 1.5 times the center frequency. The PLL was able to successfully lock on to the lower input frequency; the simulation results are shown in figure 24. However, when driven at the higher frequency of 365 MHz, the PLL displayed the phenomenon of false locking, where it locked on to a lower harmonic frequency of 222.86 MHz. This represents a harmonic lock of 5 input cycles to every 3 VCO cycles. False locking is partially due to the characteristics of the type
Fig. 23. Mixed-level simulation results for lock-in process of highspeed PLL.
of phase detector used, which has a limited useful linear range of operation. False locking could be avoided altogether by using the phase/frequency detector. Figures 25 and 26 show the false locking phenomenon detected by the all-behavioral level simulation.
Behavioral Modeling Phase-locked Loops for Mixed-Mode Simulation 61
Fig. 24. Simulation results for lock-in process of high-speed PLL at half center frequency.
Fig. 25. Simulation results for detecting false lock in high-speed PLL.
XI. Behavioral simulation of a frequency synthesizer using the MC4044
The MC4044 is a standard off-the-shelf type PLL component manufactured by Motorola Inc. [28]. This standard component is used in board-level designs. This example illustrates the use of behavioral modeling in board-level design. In this case the approach is based on the bottom-up modeling methodology, where behavioral models are developed from the characteristics of a standard component.
The MC4044 has two on-chip phase detectors, a charge pump, and a gain amplifier, which are used along with an external VCO to implement various PLL applications. The two on-chip phase detectors are the sequential phase/frequency detector that was modeled previously and a combinatorial phase detector similar to the XOR phase detector. The on-chip charge pump is a voltage-based charge pump that delivers a pump voltage of 1.5 ± 0.75 volts. The charge pump has a mean no-pump value of 1.5 volts. For the pump-up and pump-down signals to have equal effects, the on-chip filter amplifier should be biased to a threshold of 1.5 volts. With the MC4044, the type 3 active filter configuration is used. Figure 27 shows the behavioral model for the voltage-based charge pump along with the active filter configuration. Vn = 1.5V is the no-pump voltage, and Vp = 0.75V. The pump-up voltage is Vu = Vn + Vp = 2.25V and the pump-down voltage is Vd = Vn - Vp = 0.75V.
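As a rough illustration of the voltage-based pump described above, the three output levels can be sketched as a small behavioral function (a hypothetical helper for illustration, not the actual ATTSIM model):

```python
def charge_pump_voltage(pump_up: bool, pump_down: bool,
                        vn: float = 1.5, vp: float = 0.75) -> float:
    """Behavioral sketch of an MC4044-style voltage-based charge pump.

    vn is the no-pump (mid) voltage and vp the pump swing, so the
    pump-up level is vn + vp = 2.25 V and the pump-down level
    vn - vp = 0.75 V.
    """
    if pump_up and not pump_down:
        return vn + vp   # pump up
    if pump_down and not pump_up:
        return vn - vp   # pump down
    return vn            # no net pump
```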
The MC4044 is widely used in frequency synthesis applications, where a higher-order multiple of a reference frequency is desired. Figure 28 shows a frequency synthesizer configuration that uses this standard component. The output frequency is 12 times the input reference frequency. This frequency synthesizer was simulated at the behavioral level with the input reference frequency at 1.0833 MHz. Figure 29 shows the lock-in characteristics. B is the loop filter output, which settles at 1.5 volts, corresponding to the center frequency of operation. The lock-in time is about 100 µs. CPU time for simulating 200 µs was 15 min 45.43 sec. Figure 30 shows the locked signals and the VCO output FOUT at 13 MHz. Here R is the reference input, and V is the signal fed back to the phase detector after dividing the VCO output frequency by 12. Since this configuration uses a sequential phase/frequency detector, the two signals lock perfectly in phase and frequency with zero phase difference.
XII. Conclusions
Phase-locked loops are an important class of systems used in a wide range of applications. Traditional PLL simulation has been plagued by bottlenecks such as the mixed-signal nature of most implementations and long, impractical simulation run times. These traditional bottlenecks were overcome by the use of the methodologies presented in this paper. Behavioral models were developed, using both the bottom-up and top-down modeling paradigms, to provide simulation speedup. The modeling methodology is also
62 B. A. A. Antao, F. M. El-Turky and R. H. Leonowich
Fig. 26. Simulation results of high-speed PLL showing false locked signals.
Fig. 27. Behavioral model for the MC4044 charge pump.
more widely applicable, since a general-purpose multilevel mixed-mode simulator was used as the simulation framework. The various simulation results presented successfully demonstrate the utility of this approach. We were able to undertake complex simulations, such as determination of the tracking range of a PLL, within reasonable CPU time, which is otherwise impractical with conventional simulators. The simulation speedup makes it possible to undertake a number of simulations to optimize PLL characteristics. The multilevel modeling and simulation capability would also serve the top-down synthesis process.
Fig. 28. Frequency synthesizer using the MC4044.
Fig. 29. Simulation results of frequency synthesizer.
Notes
1. Event-driven simulation and exploitation of circuit latency are also utilized by the mixed-mode tools in the digital domain mentioned earlier in this paper.
2. Equation (16) is obtained by applying the trigonometric identity sin(A) cos(B) = 1/2sin(A - B) + 1/2sin(A + B).
References
1. E. L. Acuna, J. P. Dervenis, A. J. Pagones, F. L. Yang and R. A. Saleh, "Simulation techniques for mixed analog/digital circuits." IEEE Journal of Solid-State Circuits 25(2), pp. 353-362, April 1990.
2. ATTSIM Team, "The ABCDL: a robust environment for analog circuit behavioral modeling." AT&T Bell Laboratories internal technical memorandum, March 1991.
3. B. A. A. Antao and F. M. El-Turky, "Automatic analog model generation for behavioral simulation." IEEE Custom Integrated Circuits Conference, May 1992.
4. R. E. Best, Phase-Locked Loops: Theory, Design and Applications. McGraw-Hill Book Co.: NY, 1984.
5. G. R. Boyle, B. M. Cohn, D. O. Pederson, and J. E. Solomon, "Macromodeling of integrated circuit operational amplifiers." IEEE Journal of Solid-State Circuits SC-9(6), pp. 353-364, December 1974.
6. S. Can, and Y. E. Sahinkaya, "Modeling and simulation of an analog charge-pump phase locked loop." Simulation 50, pp. 155-160, April 1988.
7. R. Chadha, C. Visweswariah, and C. Chen, "M3: A multilevel mixed-mode mixed A/D simulator." IEEE Transactions on Computer-Aided Design 11(5), pp. 575-584, May 1992.
8. B. R. Chawla, H. K. Gummel, and P. Kozak, "MOTIS-A MOS timing simulator." IEEE Trans. on Circuits and Systems CAS-22(12), pp. 901-910, December 1975.
9. C. T. Chen, Linear System Theory and Design. Holt, Rinehart and Winston Inc.: NY, 1984.
10. C. M. Chie and W. C. Lindsey, "Phase-locked loops: Applications, performance measures and summary of results," in Phase-Locked Loops. W. C. Lindsey and C. M. Chie eds, IEEE Press: NY, 1986.
11. L. O. Chua and A. Deng, "Canonical piecewise-linear modeling." IEEE Transactions on circuits and systems CAS-33(5) pp. 511-525, May 1986.
12. J. A. Connelley and P. Choi, Macromodeling with SPICE. Prentice-Hall Inc.: NJ, 1992.
13. F. M. Gardner, Phaselock Techniques. John Wiley & Sons Inc.: NY, 1979.
14. F. M. Gardner, "Charge-pump phase-lock loops." IEEE Transactions on Communications COM-28, pp. 1849-1858, Nov. 1980.
15. I. E. Getreu, "Behavioral modeling of analog blocks using the SABER simulator," in Proc. of Midwest Symposium on Circuits and Systems, August 1989, pp. 977-980.
16. B. Gilbert, "A precise four-quadrant multiplier with subnanosecond response." IEEE Journal of Solid-State Circuits SC-3(4), December 1968.
17. P. R. Gray and R. G. Meyer, Analysis and Design of Analog Integrated Circuits. John Wiley & Sons: New York, 1984.
18. S. C. Gupta, "Phase-locked loops," in Proceedings of the IEEE 63(2), Feb. 1975, pp. 291-306.
19. B. W. Kernighan and D. M. Ritchie, The C Programming Language. Prentice-Hall Inc.: NJ, 1978.
20. J. M. Khoury, "Design of a 15-MHz CMOS continuous-time filter with on-chip tuning." IEEE Journal of Solid-State Circuits 26(12), pp. 1988-1997, December 1991.
21. R. H. Leonowich, "A high speed, wide tuning range, monolithic CMOS voltage controlled oscillator utilizing coupled ring oscillators," AT&T Bell Laboratories internal design document, in preparation.
22. W. C. Lindsey and C. M. Chie, "A survey of digital phaselocked loops," in Proceedings of the IEEE 69(4) pp. 410-431, April 1981.
23. W. C. Lindsey and C. M. Chie, Phase-Locked Loops. IEEE Press: NY, 1986.
24. E. Liu, A. L. Sangiovanni-Vincentelli, G. Gielen, and P. R. Gray, "A behavioral representation for Nyquist-rate A/D converters," in Proceedings of ICCAD, 1991, pp. 386-389.
25. E. Liu and A. L. Sangiovanni-Vincentelli, "Behavioral representations for VCO and detectors in Phase-Lock systems." IEEE Custom Integrated Circuits Conference, May 1992.
26. V. Manassewitsch, Frequency Synthesizers: Theory and Design, 3rd edition, John Wiley & Sons Inc.: New York, 1987.
27. H. Meyr and L. Popken, "Phase acquisition statistics for phaselocked loops." IEEE Transactions on Communications COM-28, pp. 1365-1372, Aug. 1980.
28. Motorola Inc., MECL Device Data book, section 7, 1985.
29. L. W. Nagel, "ADVICE for circuit simulation," in Proceedings of the International Symposium on Circuits and Systems, 1980.
30. L. W. Nagel, SPICE2: A Computer Program to Simulate Semiconductor Circuits. University of California, Berkeley, Memorandum no. UCB/ERL M520, May 1975.
31. J. F. Oberst, "Generalized phase comparators for improved phase-locked loop acquisition." IEEE Transactions on Communication Technology COM-19(6) pp. 1142-1148, December 1971.
32. J. G. Reid, Linear System Fundamentals: Continuous and Discrete, Classic and Modern. McGraw-Hill Publishing Company: NY, 1983.
33. K. A. Sakallah and S. W. Director, "SAMSON2: An event driven VLSI circuit simulator." IEEE Transactions on Computer-Aided Design CAD-4(4) pp. 668-684, October 1985.
Fig. 30. Simulation results of frequency synthesizer showing locked signals.
34. R. A. Saleh, D. L. Rhodes, E. Christen and B. A. A. Antao, "Analog hardware description languages." IEEE Custom Integrated Circuits Conference, May 1994.
35. R. Shariatdoust et al., "A low jitter 5MHz to 180MHz clock synthesizer for video graphics," in Proceedings of the IEEE Custom Integrated Circuits Conference (CICC), May 1992, pp. 24.2.1-24.2.5.
36. M. Sitkowski, "The macro modeling of phase-locked loops for the SPICE simulator." IEEE Circuits and Devices, pp. 11-15, March 1991.
37. C. Visweswariah and R. A. Rohrer, "Piecewise approximate circuit simulation." IEEE Transactions on Computer AidedDesign 10(7) pp. 861-870, July 1991.
38. K. M. Ware, H. S. Lee, and C. G. Sodini, "A 200-MHz CMOS phase-locked loop with dual phase detectors," IEEE Journal of Solid-State Circuits 24(6), pp. 1560-1568, December 1989.
39. D. H. Wolaver, Phase-Locked Loop Circuit Design. Prentice Hall Inc.: NJ, 1991.
Brian A. A. Antao received the B.E. (honors) in Electrical Engineering from the University of Bombay (V.J.T.I.) in 1986, and the M.S. and Ph.D. in Electrical Engineering from Vanderbilt University, in 1988 and 1993. Currently he is a member of the research faculty in the Coordinated Science Laboratory at the University of Illinois at Urbana-Champaign, and will be joining the Semiconductor Systems Design Technology group of Motorola Inc. in Austin, Texas. In addition he has held summer research positions at AT&T Bell Laboratories in 1991 and 1992, working on behavioral modeling and mixed-mode simulation. He is also a member of the Technical Program Committee of the IEEE Custom Integrated Circuits Conference, and is a member of the IEEE, ACM, and Tau Beta Pi.
Dr. Antao's research is in the design and synthesis of high-performance analog and mixed analog-digital integrated circuits and systems through an interdisciplinary effort combining various aspects of computer-aided design, circuit design, and architectural design, and the development of methodologies for efficient design and verification of integrated circuits and systems. Specific areas of focus at present include high-level analog synthesis and optimization, modeling, and mixed-mode simulation. Some of the problems he is currently working on include high-level synthesis techniques for analog circuits and systems, new techniques for simulation of multi-domain analog circuits and mixed analog-digital circuits, and behavioral modeling.
Robert H. Leonowich is a Technical Manager for a LAN design group with AT&T Bell Labs in Allentown Pennsylvania. He worked as a 10 and 100 Mb/s LAN transceiver designer for 10 years with AT&T Bell Labs in Reading, Pennsylvania previously. Bob holds a BSEE degree from the University of Pennsylvania and MSEE degree from Lehigh University.
Fatehy M. El-Turky received the B.Sc. degrees in electrical engineering and mathematics in 1971 and 1974, respectively, from the University of Alexandria, Egypt. He received the M.A.Sc. degree in 1976, and the Ph.D. degree in 1980, both in electrical engineering, from the University of Waterloo, Ontario, Canada.
From 1971 to 1974 he was an instructor with the University of Alexandria. From 1974 to 1979 he was a research and teaching assistant in the Department of Electrical Engineering, University of Waterloo. In 1980, he joined the Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan, as an assistant professor. In 1981, he was promoted to associate professor. Since 1983 he has been with AT&T Bell Laboratories and AT&T Design Automation as a member of technical staff in the Design Automation Laboratory, where he is responsible for the mixed-mode simulator ATTSIM. He is also responsible for analog design automation and analog synthesis tools. He has pioneered the area of analog design synthesis and the application of artificial intelligence in design automation, and was the principal developer of AT&T's analog design expert system BLADES.
While at Wayne State University, he was named the outstanding professor of the year in the Electrical and Computer Engineering Department for three consecutive years. Dr. El-Turky is a member of IEEE and Sigma Xi.
Analog Integrated Circuits and Signal Processing, 10, 67-76 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Behavioral and Macro Modeling using Piecewise Linear Techniques
WIM KRUISKAMP AND DOMINE LEENAERTS Eindhoven University of Technology, Dept. of Electrical Engineering, P.O. Box 513, 5600 MB Eindhoven, The Netherlands
Abstract. In this paper we will demonstrate that most digital and analog components, as well as behavioral models, can be described using piecewise linear approximations of their real behavior. This leads to several advantages from the viewpoint of simulation. We will also give a method to store the resulting linear segments in a compact way, in order to avoid storage problems.
1. Introduction
Due to the ever-decreasing feature sizes in modern technology processes, large mixed analog-digital systems are nowadays integrated in a single integrated circuit (IC). This high integration rate makes it more and more difficult to verify the behavior of an IC by means of simulations [1]. The large number of transistors often makes it impossible to perform the entire simulation at the transistor level. Therefore, behavioral or macro models have to be used for the major part of the IC, and only the most critical parts can be modeled at the transistor level. Another problem is the interaction between the analog and digital subsystems. Usually these subsystems are simulated by different simulators that are controlled by some kind of shell. When the system has several loops in which both digital and analog circuits are involved, this strategy may give rise to severe convergence and timing problems.
A possible way to overcome these problems is to model all elements and circuits in a Piecewise Linear (PL) way. Then all nonlinear functions are approximated by linear segments, as depicted in the example of Fig. 1.
Many analog circuits and subsystems are designed such that the relations between the inputs and outputs are, to first order, linear with a clipping behavior. That kind of behavior is very suitable for modeling with PL techniques. Digital circuits also appear to be very suitable for modeling with linear segments. For digital circuits, only the regions where the output is low or high are important, while the exact transfer function in the take-over region is often of minor importance and can be approximated by a linear mapping.
When an entire system is modeled with PL techniques, this leads to several advantages during simulation. At any time during the simulation, we have a linear description of the behavior, including the region in which that mapping is valid. This means that one always solves a set of linear equations instead of a set of nonlinear equations, which is far more difficult. Furthermore, digital and analog as well as behavioral and component models use the same kind of data format. It is therefore no longer necessary to use different simulators for each different class of models.
The organization of the remainder of the paper is as follows: In section 2, various examples of PL models will be given to show the usefulness of this concept for analog, digital, and behavioral modeling. We will give a method to find the piecewise linear approximation of a scalar function in section 3. To ensure compact data storage and efficient simulation, a data format for the PL models will be proposed in section 4. In section 5, advantages of PL simulators will be discussed, illustrated with a simulation example. Concluding remarks are given in section 6.

Fig. 1. PL approximation (x² and its PL approximation PL(x²)).

68 W. Kruiskamp and D. Leenaerts
2. PL Models
In this section, we will give some examples of PL models of commonly used electronic components and circuits. The models are chosen from various hierarchical levels and from both the digital and the analog domain. The list of described models is far from complete, but it illustrates how various blocks, circuits, and devices can be modeled with PL techniques.
2.1. Digital Functions
Modeling of digital functions with PL techniques is done using threshold logic. The basic idea of threshold logic is that the output can be treated as a threshold function of the weighted sum of the binary inputs. For an AND-gate this is described in (1), where it can be seen that three regions (polytopes) are required. The location of the two boundary planes can be seen in the logic diagram (Fig. 2), where threshold logic implies that the zeros must be separated from the ones. This can be done with the planes x1 + x2 = 5/4 and x1 + x2 = 7/4. The slope of the segment in the transition area is equal to 2. This results in the following model description, valid for binary inputs:
y = 0,                     x1 + x2 ≤ 5/4
y = 2·(x1 + x2) - 5/2,     5/4 ≤ x1 + x2 ≤ 7/4        (1)
y = 1,                     x1 + x2 ≥ 7/4
In this way all linearly separable logic functions can be modeled with PL functions. In the case of non-linearly separable logic functions, the problem can always be solved by adding the outputs of several linearly separable logic functions. An automatic method to find a suitable PL model for an arbitrary logic function is described in [2]. A truth table is enough to derive a PL model, even if it is not yet known how that logic function will be realized with logic gates.
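The three-polytope threshold-logic description of the AND-gate translates directly into code; the sketch below (function name hypothetical) evaluates it:

```python
def pl_and(x1: float, x2: float) -> float:
    """Piecewise linear AND gate via threshold logic, following (1)."""
    s = x1 + x2
    if s <= 1.25:          # below the plane x1 + x2 = 5/4
        return 0.0
    if s >= 1.75:          # above the plane x1 + x2 = 7/4
        return 1.0
    return 2.0 * s - 2.5   # transition segment with slope 2
```

For binary inputs the transition segment is never entered, so the function reproduces the truth table of the AND gate while remaining a continuous PL mapping in between.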
Fig. 2. AND-gate with the threshold function (left) and the logic diagram (right).
Fig. 3. Transfer function of an example 3-bit ADC.
2.2. Analog/Digital Converter (ADC)
In mixed analog-digital systems, ADCs are frequently used circuits. The ADCs used differ in their conversion time, resolution, nonlinearity, input voltage range, etc. All these different converters can be made from a basic ADC that is modeled with PL techniques. This basic ADC is a model with the following parameters: the number of bits, the input voltage range, the digital voltage levels, and the take-over region. Suppose an ADC of 3 bits with the further specifications: '0' = 0V, '1' = 5V, 2V < Vin < 3V, and a take-over region of 0.4 LSB; then the output signals are as depicted in Fig. 3. This basic model is the straightforward relation between the analog input and its digital output word, resulting in the well-known "staircase" curve.
The staircase curve of Fig. 3 has non-vertical edges, in contrast to most ADCs. However, the slope of the edges can be made as steep as desired, which will be sufficient for most simulations. In the case that truly vertical edges are required, several options are available to achieve this property. The first option is to add clocked comparators, as depicted in Fig. 4, behind the output of the ADC. The positive feedback in phase φ prevents the output signal from staying in an undefined state between Vmax and Vmin. Another option to achieve vertical edges is to use the three-segment hysteresis function of Fig. 5. When the input voltage is rising, the output voltage will make a step from Vmin to Vmax when the input becomes higher than VH. When
Behavioral and Macro Modeling using Piecewise Linear Techniques 69
Fig. 4. Clocked comparator macro model.
the input voltage is falling, the output will make a step from Vmax to Vmin when the input becomes lower than VL. The way PL simulators handle this kind of function is discussed in detail in [3]. The advantage of this method compared to the previous one lies in the fact that the ADC is now static and therefore no transient analysis has to be performed.
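A minimal sketch of the basic PL ADC described above (parameter names are assumptions): the output code is a sum of ramps, one per code transition, each of width equal to the take-over region, which yields the finite-slope staircase of Fig. 3.

```python
def pl_adc_code(vin: float, nbits: int = 3, vlo: float = 2.0,
                vhi: float = 3.0, takeover: float = 0.4) -> float:
    """PL staircase from 0 to 2**nbits - 1 with linear edges of width
    `takeover` (in LSB units) instead of ideal vertical steps."""
    lsb = (vhi - vlo) / 2 ** nbits
    c = (vin - vlo) / lsb              # input position in LSB units
    code = 0.0
    for m in range(1, 2 ** nbits):     # one ramp per code transition
        r = (c - m) / takeover + 0.5   # rises 0 -> 1 across the take-over region
        code += min(1.0, max(0.0, r))
    return code
```

With the example specifications (3 bits, 2V < Vin < 3V, take-over 0.4 LSB), the code is flat at each integer level and ramps linearly over 0.4 LSB around every threshold.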
2.3. OPAMP
Probably the most widely used analog circuit is the opamp. With PL-controlled sources and linear passive components, it is possible to construct a macro model of an opamp. An example of such a PL model is depicted in Fig. 6.
Fig. 5. Hysteresis function.
Fig. 6. Opamp macro model.
The left-hand current source is controlled by the PL function f(Vin, Va). Due to this PL function, the output current of the source and its output voltage are limited. With this PL-controlled source, the slewing and clipping behavior of the opamp is modeled. All other components are linear elements. With the following set of equations, the opamp specifications can be mapped onto the model parameters:
Imin = SR⁻ · Ca
Imax = SR⁺ · Ca
Ra = DCgain / Ga,   Ga = UGF · 2π · Ca        (2)
Rb = tan(90° - phase margin) / (UGF · 2π · Cb)
As depicted in the above example, only a few or even just a single PL function is required to construct useful macromodels of circuits like opamps.
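The mapping (2) from specifications to macro-model parameters can be sketched as follows (function and argument names are assumptions; UGF in Hz, slew rates in V/s, capacitances in F):

```python
import math

def opamp_macro_params(dc_gain, ugf, sr_pos, sr_neg, pm_deg, ca, cb):
    """Map opamp specifications onto the macro-model parameters of (2)."""
    ga = ugf * 2 * math.pi * ca                 # first-stage transconductance
    return {
        "Ra": dc_gain / ga,                     # sets the DC gain
        "Imax": sr_pos * ca,                    # positive slew-rate limit
        "Imin": sr_neg * ca,                    # negative slew-rate limit
        "Rb": math.tan(math.radians(90 - pm_deg)) / (ugf * 2 * math.pi * cb),
    }
```

For example, a 45° phase margin makes the tangent term unity, so Rb reduces to 1/(2π · UGF · Cb).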
2.4. MOS Transistor
To verify the behavior of macro- or behavioral models and to simulate critical parts of a system, transistor models are required. In this section, we will describe a simple PL transistor model, which models several nonideal effects.
The basic operation of a nMOS transistor (Fig. 7) can be described by the model of Sah [4]:
Ids = (K/2) · ( f(Vgs - VT) - f(Vgd - VT) )

x ≤ 0 ⇒ f(x) = 0,   x ≥ 0 ⇒ f(x) = x²        (3)

VT = threshold voltage,   K = μ·Cox·W/L
Fig. 7. nMOST.
When we rewrite (3) explicitly for the 4 operating regions, we get the following well-known equations:

Ids = 0                                        Vgs ≤ VT ∧ Vgd ≤ VT
Ids = (K/2)·(Vgs - VT)²                        Vgs ≥ VT ∧ Vgd ≤ VT
Ids = (K/2)·((Vgs - VT)² - (Vgd - VT)²)        Vgs ≥ VT ∧ Vgd ≥ VT        (4)
Ids = -(K/2)·(Vgd - VT)²                       Vgs ≤ VT ∧ Vgd ≥ VT
To make a PL model of this relation, we only have to approximate the nonlinear function f in (3) by a PL function. With six segments, a PL approximation of the function f can be made which has a maximal relative error of 10% for 0.1 < x < 4. A possible way to find such a PL approximation will be described in section 3.
The Early effect is usually modeled by multiplying the current, as defined by (3), by a drain-source voltage dependent factor:
Ids = Ids,Sah · (1 + |λ · Vds|).        (5)
In PL models, a multiplication of two variables has to be approximated by means of linear mappings. We can do this by using the PL mapping of f(x).
Ids = (K/2) · ( f(Vgs - VT + ½·λ·|Vds|·Vref1)
             - f(Vgd - VT - ½·λ·|Vds|·Vref2) ),   with f(x) as in (3)        (6)
When we take Vref1 in (6) equal to the average value of (Vgs - VT) in each polytope of f(x), and Vref2 equal to the average value of (Vgd - VT), we do not have to increase the number of linear segments. Expression (6) describes to first order the static transistor behavior, including the Early effect.
The dynamic behavior of a MOST can be modeled by adding (PL) capacitors between the terminals of a static transistor. The bulk effect can be modeled by making (6) linearly dependent on the bulk voltage. Fig. 8 depicts the characteristics of the resulting PL nMOS transistor. In many situations, this simple transistor model will be sufficient to analyze the effects of nonideal transistor behavior.
2.5. Differential Equations
Time-dependent models can also be described using PL techniques. We simply separate the time-dependent but linear behavior from the time-independent, possibly nonlinear behavior. For a nonlinear capacitor this yields the following equations:
i = dq/dt
q = q(v)        (7)
When q(v) in (7) is a nonlinear function, it is approximated by a PL function. In this way, all kinds of nonlinear differential equations can be approximated.
2.6. Remarks
In the previous sections, we gave an overview of several frequently used circuit components and circuits, with their behavior modeled using PL techniques. Of course, other approximation or modeling techniques could also be used. However, later on we will demonstrate
Fig. 8. Characteristics of a minimal-size nMOST (PL model; 2.4µ CMOS, W/L = 2.4/2.4, Vgs = 0, 0.5, ..., 5V).
Fig. 9. Function approximation, alternation theorem.
the advantage of using PL techniques compared to other methods for simulation purposes.
3. Modeling Technique
Piecewise linear techniques rely on the approximation of a nonlinear function by a set of linear functions. To obtain the best set of linear functions, one can use the theorems developed in approximation theory in mathematics. For PL approximation, the Chebyshev approximation theory can be used, which is based on the alternation theorem [5]. Suppose that a function f(x) has to be approximated by l(x) with a maximum relative error δ,

|f(x) - l(x)| / f(x) ≤ δ

on each interval x_n ≤ x ≤ x_{n+1}, for which l(x) = a_n·x + b_n. The alternation theorem states that for a linear approximation the function f(x) must alternate twice around the approximating function in each interval, as depicted in Fig. 9.
Furthermore, the approximation will only be optimal when the relative error in the points x_n, ξ_n, and x_{n+1} is equal to δ and the relative error in all other points is smaller than or equal to δ. For the function f(x) = x², x > 0.1, the following relations can then be derived:
a_n = (1 - δ)·(x_n + x_{n+1})
b_n = -(1 - δ)·x_n·x_{n+1}        (8)
x_{n+1} = x_n · (1 + 3δ + 2·√(2δ·(1 + δ))) / (1 - δ)
With (8), all linear segments can be found for a certain maximal relative error. In this way we also derived the six segments in the transistor model (4).
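The construction behind (8) can be checked numerically. The sketch below builds the segments from the equioscillation conditions as reconstructed above (the relative error is δ at each breakpoint and at one interior point per segment; all names are illustrative); for δ = 0.1 on 0.1 < x < 4 it yields a handful of segments, in line with the six quoted for the transistor model.

```python
import math

def pl_square_segments(x_start, x_end, delta):
    """PL segments l(x) = a*x + b approximating f(x) = x**2 on
    [x_start, x_end] with relative error <= delta (a reconstruction
    of (8): error -delta at the breakpoints, +delta in between)."""
    kappa = (1 + delta) / (1 - delta)
    ratio = 2 * kappa - 1 + 2 * math.sqrt(kappa * (kappa - 1))  # x_{n+1}/x_n
    segments, xn = [], x_start
    while xn < x_end:
        xn1 = min(xn * ratio, x_end)
        a = (1 - delta) * (xn + xn1)    # chord of (1 - delta) * x**2
        b = -(1 - delta) * xn * xn1
        segments.append((xn, xn1, a, b))
        xn = xn1
    return segments
```

Because consecutive chords meet at (1 - δ)·x_n², the resulting PL function is continuous at every breakpoint.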
For scalar functions, the routines to find a mapping with constant relative error or constant absolute error can be automated, with the PL model description as output. However, finding PL descriptions of multi-dimensional functions is a more difficult task. For the model description of Chua, such an algorithm does exist for a limited set of problems [6]. Fortunately, as depicted in section 2, many PL models can be constructed from combinations of scalar functions.
4. Data Format
From the viewpoint of compactness and data storage, it is not convenient to store the linear mappings of each region in a list. This would cost an enormous amount of memory. A better method is to store the minimal amount of data necessary to retrieve the complete model. This canonical model then contains all the information. In the literature, many suitable model descriptions have been presented [7]. We will treat only one, the implicit piecewise linear model description of van Bokhoven [8], because of its convenience for simulation purposes.
A piecewise linear resistor can be made from a network of ideal diodes, linear resistors, and constant voltage sources. An example of such a piecewise linear resistor is depicted in Fig. 10. The network of Fig. 10 has two states: one in which the diode is conducting and one in which the diode is blocked. Due to the ideality of the diode, the current j through the diode or the voltage u over the diode has to be zero at all times. This constraint can be expressed as:
u ≥ 0,   j ≥ 0,   u · j = 0        (9)
Therefore the currents in the network of Fig. 10 are equal to:
I = (1/3)·V + 1·(V + u - 1)        (10)
j = V + u - 1
The second equation of (10), together with (9), defines the state of the diode. When V is at least 1 volt, u has to be equal to zero. The current I can then be obtained from the first equation of (10). When V is less than 1 volt, j is equal to zero. We can therefore calculate the value of u from the second equation of (10). When we insert that value for u in the first equation, we obtain the relation between I and V, i.e., I = V/3. By adding more diodes, resistors, and voltage sources to the network, we can make a PL resistor with more segments.
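The two-state solve just described can be written out directly (a sketch; variable names follow (9) and (10)):

```python
def pl_resistor_current(v: float) -> float:
    """Port current I of the single-diode PL resistor network, obtained
    by solving the complementarity conditions (9)."""
    if v >= 1.0:
        u, j = 0.0, v - 1.0    # diode conducting: u = 0, j = V - 1
    else:
        u, j = 1.0 - v, 0.0    # diode blocked: j = 0, u = 1 - V
    return v / 3.0 + 1.0 * j   # first equation of (10)
```

Below 1 volt this reduces to I = V/3, as derived in the text; above 1 volt the second segment with its steeper slope takes over.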
Fig. 10. Piecewise linear resistor.
In a more abstract way of thinking, a PL electrical network represents a PL function. With each ideal diode one can associate a hyperplane in the domain space (like the V = 1V plane in Fig. 10). Each hyperplane divides the domain space into two regions. Then the situation j ≥ 0 or u ≥ 0, but not both positive, defines which half-space of the domain space is valid. In a one-dimensional situation, this looks like (see also the second part of (10))

j = e·x + u + g.        (11)

Expression (11) defines the hyperplane e·x + g = 0. For x ≥ -g/e (the right-hand side of the hyperplane), j becomes positive and thus u zero. For the left half-space, the situation is reversed. For each half-space a linear function description is valid, and the two must be continuous at the boundary e·x + g = 0. Suppose that for the right-hand side of the hyperplane (x ≥ -g/e) the function is defined as
y = a1·x + f1    (12)
and for the other side
y = a2·x + f2    (13)
then at the boundary the following relation must hold
a1·x + f1 = a2·x + f2 for e·x + g = 0    (14)
To ensure (14), define the function description as
y = a·x + b·u + l    (15)
For the right side, u = 0 holds and thus a = a1, l = f1 from (12). For the other side j = 0 holds and thus from (11) u = −e·x − g. This together with (13) and (15) gives
y = a1·x + f1 + b·(−e·x − g) = a2·x + f2    (16)
From (16), we can compute the value for b in terms of the mappings:

b = (a1 − a2)/e ( = (f1 − f2)/g )    (17)
The restriction on the parameters a1, a2, f1, and f2 due to (17) is just property (14). The complete model becomes
y = a1·x + ((a1 − a2)/e)·u + f1
j = e·x + u + g
u·j = 0, u ≥ 0, j ≥ 0    (18)
To find the value for y for a given x, we first solve the second and third relation of (18) to find the value of u. With this value, we can compute y from the first relation of (18).
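This evaluation order can be sketched in a few lines; the function name and the two-segment example below are ours, not from the paper:

```python
# Scalar PL model (18): y = a1*x + b*u + f1 with b = (a1 - a2)/e, and
# j = e*x + u + g subject to u >= 0, j >= 0, u*j = 0.
def pl_eval(x, a1, a2, f1, e, g):
    b = (a1 - a2) / e
    if e * x + g >= 0.0:      # right half-space: u = 0, j = e*x + g
        u = 0.0
    else:                     # left half-space: j = 0, so u = -(e*x + g)
        u = -(e * x + g)
    return a1 * x + b * u + f1

# Segments y = 2x (x >= 1) and y = x + 1 (x <= 1) meeting at the
# hyperplane x - 1 = 0, i.e. a1 = 2, f1 = 0, a2 = 1, e = 1, g = -1.
print(pl_eval(3.0, 2.0, 1.0, 0.0, 1.0, -1.0))   # 6.0
print(pl_eval(0.0, 2.0, 1.0, 0.0, 1.0, -1.0))   # 1.0
```

Note that at the boundary x = 1 both branches return the same value, which is exactly the continuity property (14).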
A model based on the above considerations was proposed by van Bokhoven in [8] for a mapping from R^n to R^m:
0 = I·y + A·x + B·u + f
j = D·y + C·x + I·u + g,
u ≥ 0, j ≥ 0, u^T·j = 0    (19)

with A ∈ R^(m×n), B ∈ R^(m×k), C ∈ R^(k×n), D ∈ R^(k×m),
f ∈ R^m, g ∈ R^k, u ∈ R^k, j ∈ R^k
The first equation in (19) defines the linear mapping y + A·x + f = 0 between x and y only if u equals zero. For this situation, the second and third equation define the region in the domain space for which this property holds. Going from one linear segment to another will make a certain u-entry ui unequal to zero. Substituting the expression for ui, derived from the second equation, into the first equation and exchanging the i-th entries of u and j results in a new linear mapping. The likewise updated second equation now defines the new region in which this mapping is valid. As an example, consider the linear mappings of the 2-input AND gate
Behavioral and Macro Modeling using Piecewise Linear Techniques 73
as described in (1) and depicted in Fig. 2. In the format of (19), the PL mapping is equal to:
0 = y + (0 0)·x + (−2 2)·u + 0
j = [−1 −1; −1 −1]·x + [1 0; 0 1]·u + [5/4; 7/4]    (20)
u ≥ 0, j ≥ 0, u^T·j = 0
The second part of this equation clearly defines the planes as shown in Fig. 2. For u equal to zero, we have the linear mapping y = 0, valid for x1 + x2 ≤ 5/4. Because the vector x is binary, the output y = 0 holds for the inputs (0,0), (0,1), and (1,0), and none of the u-entries becomes positive. When x1 + x2 becomes larger than 1.25, which is the case for the binary input (1,1), both entries of u will become positive. The above described updating procedure applied to the first entry of u will result in the following PL mapping:
0 = y + (−2 −2)·x + (−2 2)·u + 5/2
j = [1 1; −1 −1]·x + [1 0; 0 1]·u + [−5/4; 7/4]    (21)
u ≥ 0, j ≥ 0, u^T·j = 0.
Because for the input (1,1) also x1 + x2 > 7/4 holds, the second entry of u must be updated as well. Performing this update yields
0 = y + (0 0)·x + (−2 2)·u − 1
j = [1 1; 1 1]·x + [1 0; 0 1]·u + [−5/4; −7/4]    (22)
u ≥ 0, j ≥ 0, u^T·j = 0,
for which y = 1, valid for x1 + x2 ≥ 7/4. We see that indeed the truth table of an AND gate can be modeled in the format of (19). As an example of an analog PL model, the static nMOST model as described in section 2.4 is, in the format of (19), equal to:
[Id; Ig; Is] + [0 0 0; 0 0 0; 0 0 0]·[Vd; Vg; Vs] + B·u + 0 = 0
j = C·[Vd; Vg; Vs] + I·u + [hyp1; hyp2; hyp3; hyp4; hyp5]    (23)
u^T·j = 0, u ≥ 0, j ≥ 0
with the constants equal to:
B = [−c1 −c2 −c5 c1 c2; 0 0 0 0 0; c1 c2 c5 −c1 −c2]
C = [−a1 −b λ+a1+b; −a2 −b λ+a2+b; −a3 −b λ+a3+b; −a4 −b λ+a4+b; −a5 −b λ+a5+b]

b = μ·Cox·W/(2·L), λ = γ/(2·√φ),
c1 = 0.09, c2 = 0.2238, c3 = 0.4666, c4 = 1.1605, c5 = 2.8863,
a1 = b·λ·0.025, a2 = b·λ·0.0872, a3 = b·λ·0.2168, a4 = b·λ·0.5392, a5 = b·λ·1.341,
hyp1 = b·VT, hyp2 = b·(VT + 0.1), hyp3 = b·(VT + 0.2487), hyp4 = b·(VT + 0.6815), hyp5 = b·(VT + 1.5383)
With the geometry parameters W and L and the technology parameters μ, VT, Cox, γ, φ, and λ [9], the PL model of an arbitrary nMOST is defined. With the above presented updating procedure, all linear mappings can be derived from (23).
From the two model descriptions (20) and (23), it is clear that from a data point of view there is no difference between a digital, analog, or behavioral component. This leads to an advantage in analyzing a mixture of analog and digital components.
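As a concrete check of the format, the AND-gate model (20) can also be evaluated by brute-force enumeration of the complementarity states of u, instead of performing the pivoting steps explicitly; the following sketch (ours, not part of any simulator mentioned in the paper) reproduces the AND truth table:

```python
import itertools

# Evaluating the AND-gate PL model of (20) by enumerating the complementarity
# states of u. g holds the hyperplane constants 5/4 and 7/4.
def and_gate(x1, x2):
    s = x1 + x2
    g = (5/4, 7/4)
    for state in itertools.product((0, 1), repeat=2):
        u, j, ok = [0.0, 0.0], [0.0, 0.0], True
        for i in (0, 1):
            if state[i] == 0:          # u_i = 0, so j_i = -s + g_i >= 0
                j[i] = -s + g[i]
                ok = ok and j[i] >= 0
            else:                      # j_i = 0, so u_i = s - g_i > 0
                u[i] = s - g[i]
                ok = ok and u[i] > 0
        if ok:
            return 2*u[0] - 2*u[1]     # from 0 = y + (-2 2).u
    raise RuntimeError("no consistent complementarity state")

for x in itertools.product((0, 1), repeat=2):
    print(x, and_gate(*x))   # reproduces the AND truth table
```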
5. Mixed Signal PL Simulation
When the behavioral components are described using PL techniques and modeled in a way like (19), one
needs a simulator in order to analyze a network built up from such components. When all components are PL modeled, one can use standard simulators like SPICE to solve the network equations. A small modification will be necessary to ensure that the proper linear segments are used. It is, however, more convenient to use dedicated PL simulators, which are specially developed to solve PL described equations. Not only do they need just a single solution algorithm to solve the equations, the algorithm itself also has better convergence properties than Newton-Raphson iteration algorithms.
For that purpose, several PL simulators have been developed [2], [3], [10]. As already stated, the model format for a digital, analog, or other component is always the same. Therefore, the simulator is not aware of what kind of component or network it is analyzing. Only a single solution algorithm is required to solve the complete network. This is an important advantage, since often a combination of digital and analog simulators is necessary to solve a large mixed signal network. Both simulators are often combined into a single simulation tool kit, but internally there are two solution algorithms: one for the analog parts and another for the digital parts. See for instance the latest versions of SPICE, where the digital circuit is separated from the analog circuit by a conversion block. The digital circuit is analyzed using event driven logic algorithms, while the analog circuit is analyzed using numerical methods. Another advantage is the convergence property of PL simulators. In SPICE-like simulators a Newton-Raphson (NR) iteration scheme is applied in order to solve the nonlinear equations. Such an iteration scheme converges only locally [11]: the initial point must be close enough to the real solution in order to find the solution. In the example of Fig. 11, Newton-Raphson is not able to find the solution of f(x) = 0 when we start with x = 0. In PL simulators, the algorithm can always follow the curve. The only problem is to find the linear mapping corresponding to the current part of the domain space. This problem is directly related to the linear complementarity problem (LCP) [12]. The solution algorithms for this LCP (e.g., the Katzenelson algorithm [13]) are globally convergent. They are more robust to the initial guess of the solution than NR algorithms. Therefore PL simulators have fewer problems in finding the operating point of mixed signal networks and are more stable.
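The failure of Fig. 11 is easy to reproduce; for f(x) = −x^3 + 3x^2 − x + 1, the Newton-Raphson iterates started at x = 0 cycle between 0 and 1 and never approach the root (a short sketch with our own function names):

```python
# Newton-Raphson on f(x) = -x^3 + 3x^2 - x + 1, started from x = 0: the
# iterates cycle between 0 and 1 and never approach the root near x = 2.77.
def f(x):
    return -x**3 + 3*x**2 - x + 1

def df(x):
    return -3*x**2 + 6*x - 1

x, iterates = 0.0, []
for _ in range(6):
    x = x - f(x) / df(x)
    iterates.append(x)
print(iterates)   # [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```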
A last advantage of PL simulators is that they are also truly mixed level simulators. Again, because each component is described in the
Fig. 11. Newton-Raphson convergence problem (f(x) = -x^3 + 3x^2 - x + 1).
same format, there is no difference between a resistor, transistor or an operational amplifier.
Although it is possible to analyze complex networks at the transistor level, the results are not as accurate as with SPICE-like simulators. This is due to the PL approximation of the nonlinear behavior of the transistors. Therefore it is not worthwhile to compare PL simulators with SPICE-like simulators at the transistor level. The latter have the advantages of accuracy and speed at the transistor level, while PL simulators have those advantages at the behavioral and macro model level. In the same way as it can be advantageous to model parts of a system at the behavioral level in SPICE, it can be advantageous to model parts of a system at the transistor level in PL simulators.
As an example, the Phase Locked Loop (PLL) of Fig. 12 is simulated with the PL simulator PLANET [2], [3]. The phase detector in the example is modeled by a digital EXOR-gate, described using the techniques outlined in section 2.1. The loop filter is analog and consists of an opamp, a capacitor, and two resistors. The opamp in the filter is modeled as described in section 2.3. The voltage controlled oscillator (VCO) is modeled by a PL hysteresis function, a voltage controlled current source, and a capacitor.
The waveforms of the input signal, the output signal, and the output of the loop filter are depicted in Figs. 13, 14, and 15. It can be seen that the output of the PLL initially runs at the free-run frequency of the VCO (1 kHz). After 20 ms, the frequency is adjusted to the input frequency, which is 1.5 kHz.
Fig. 12. Digital Phase-Locked Loop with analog loop filter (phase detector, loop filter with an opamp of A0 = 60 dB and 60 degrees phase margin, and VCO).
Fig. 13. Input signal of the PLL.
Fig. 14. Output signal of the loop filter in the PLL.
Fig. 15. Output signal of the PLL.
Although the above presented example is relatively small, many SPICE-like simulators will have severe problems in simulating such loops. In mixed analog-digital ICs, several such loops may occur, with more complex analog and digital parts.
6. Conclusions
In this paper we discussed the possibility of using piecewise linear techniques to model the behavior of circuit components. We have presented various examples of PL models of devices, circuits, and subsystems. All of these models, whether digital, analog, device level, or circuit level, can be described using sets of linear equations. We also showed a model description that stores the model information with a minimal amount of data. Because, using this PL model description, all components have a similar data format, mixed signal mixed level simulation can be performed by applying a single simulation algorithm. This is in contrast to SPICE-like simulators, which use a different solution algorithm for the digital domain than for the analog domain. Furthermore, the algorithms to solve PL functions are more robust in finding a solution than Newton-Raphson based algorithms applied to the nonlinear functions.
References
1. S. Donnay, G. Gielen, W. Sansen, W. Kruiskamp, D. Leenaerts, S. Buytaert, K. Marent, M. Buckens and C. Das, "Using top-down CAD tools for mixed analog/digital ASICs: a practical design case," in this journal.
2. T. A. M. Kevenaar, PLANET: A hierarchical network simulator, Ph.D. dissertation, Eindhoven University of Technology, 1992.
3. T. A. M. Kevenaar and D. M. W. Leenaerts, "A flexible hierarchical piecewise linear simulator." Integration, the VLSI Journal 12, pp. 211-235, 1991.
4. C. T. Sah, "Characteristics of the metal-oxide-semiconductor transistor." IEEE Trans. on Electr. Devices ED-11, pp. 324-345, July 1964.
5. E. W. Cheney, Introduction to Approximation Theory. McGraw Hill: London, 1966.
6. L. O. Chua and A. Deng, "Canonical piecewise linear modeling." IEEE Trans. Circuits and Syst. CAS-33, pp. 511-525, May 1986.
7. T. A. M. Kevenaar and D. M. W. Leenaerts, "A comparison of piecewise linear model descriptions." IEEE Trans. Circ. and Syst. Part I CAS-39, pp. 996-1004, Dec 1992.
8. W. M. G. van Bokhoven, "Piecewise linear analysis and simulation," in Circuit Analysis, Simulation and Design. A. E. Ruehli (Ed.), Elsevier: Amsterdam, 1986, Ch. 9.
9. S. M. Sze, Semiconductor Devices, Physics and Technology. J. Wiley and Sons: New York, 1985.
10. L. O. Chua, "Canonical piecewise linear analysis: part II, tracing driving-point and transfer characteristics." IEEE Trans. on Circuits and Syst. CAS-32, pp. 417-433, May 1985.
11. J. Vlach and K. Singhal, Computer Methods for Circuit Analysis and Design. Van Nostrand Reinhold: New York, 1983.
12. C. E. Lemke, "On the complementary pivot-theory," in Mathematics of Decision Sciences, Part I. G. B. Dantzig and A. F. Veinott Jr. (Eds.), Academic Press: New York, 1970.
13. J. Katzenelson, "An algorithm for solving nonlinear resistive networks." Bell Syst. Tech. J. 44, pp. 1605-1620, Oct. 1965.
Wim Kruiskamp was born in Arnhem, The Netherlands on March 31, 1966. He received the M.S. degree in electrical engineering from the University of Twente, Enschede, The Netherlands, in 1990. In 1992, after his military service, he joined the Eindhoven University of Technology, The Netherlands, where he is currently working towards his Ph.D. degree. His main research interests are analog and mixed analog/digital design automation.
Domine M. W. Leenaerts received the Ir. and the Ph.D. degrees, both in electrical engineering, from the Eindhoven University of Technology in 1987 and 1992 respectively. Since 1992 he has been with this university as an assistant professor in the micro-electronic circuit design group. In 1995, he was a Visiting Scholar at the Department of Electrical Engineering and Computer Science of the University of California, Berkeley and at the Electronic Research Laboratory of the same department. His research interests include nonlinear dynamic system theory, chaotic behavior in circuits, and analog design automation. He has published several papers in scientific and technical journals and conference proceedings.
Analog Integrated Circuits and Signal Processing, 10, 77-88 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Behavioral Simulation of Densely-Connected Analog Cellular Array Processors for High-Performance Computing
TONY H. WU, BING J. SHEU, AND ERIC Y. CHOU Department of Electrical Engineering, Integrated Multimedia Systems Center, University of Southern California, Los Angeles, CA 90089-0271
Abstract. The analog cellular neural network (CNN) model is a powerful parallel processing paradigm for solving many scientific and engineering problems. The network consists of densely-connected analog computing cells. Various applications can be accomplished by changing the local interconnection strengths, which are also called coefficient templates. The behavioral simulator can help designers not only gain insight into the system operations, but also optimize the hardware-software co-design characteristics. A unique feature of this simulator is the hardware annealing capability, which provides an efficient method of finding globally optimal solutions. This paper first gives an overview of the cellular network paradigm, and then discusses the nonlinear integration techniques and related partition issues, previous work on simulators, and our own simulation environment. Selected simulation results are presented at the end.
1. Introduction
The cellular neural network paradigm was first proposed by Chua and Yang in 1988 [1], [2]. It is an important advance over the silicon retina model of Mead [3], adding extensive programming capability. Fundamental ingredients of this paradigm include [4]:
• the use of analog processing cells with continuous signal values,
• local interaction within a finite radius, and
• binary or gray-level output values.
The cellular network can be a 1-, 2-, 3-, or higher-dimensional array consisting of many identical analog computing cells. Each computing cell has a simple structure and interconnects directly with its neighboring cells; the cells work together to achieve many global effects. A coefficient template specifies the interaction strengths from one cell to its neighbors in terms of the relationship among the input, state, and
output variables. The coefficient template may be a linear or a nonlinear function of the state, input, and output variables in each computing cell. It could contain time-delay or time-varying values. The dynamic system may also be perturbed by some noise sources of desirable statistics to facilitate searching of better solutions.
Since its introduction, the cellular neural network has gained widespread interest from the scientific community. Cellular networks can be used in many applications, such as image processing [5], [6], artificial vision [7], solving partial differential equations [1], and modeling biological systems [8], [9]. The cellular network architecture not only is a powerful paradigm, but also exhibits a unique topological property that is suitable for special-purpose hardware implementation. Due to its regular structure, local interconnection, and parallelism, one microchip with 100 x 100 cells can achieve the equivalence of more than 1 tera operations per second. It could possibly be the paradigm that provides efficient solutions to many scientific problems.
78 T. H. Wu, B. J. Sheu, and E. Y. Chou
2. Overview of the CNN paradigm
Assume the network is in the form of a 2-dimensional n-by-m rectangular-grid array where n and m are the numbers of rows and columns, respectively. Figure 1(a) shows this n-by-m network with neighborhood size r = 1, where the darkened cells represent the neighborhood cells N1(i, j) of C(i, j), including C(i, j) itself. The circuit schematic diagram of one computing cell is shown in Figure 1(b).
The cell C(i, j) has direct interconnection with its neighborhood cells through two kinds of weights, i.e., the feedback weights A(k, l; i, j) and A(i, j; k, l), and the feedforward weights B(k, l; i, j) and B(i, j; k, l). Here the index pair (k, l; i, j) represents the direction of signal flow from C(i, j) to C(k, l). The cell C(i, j) communicates directly with its neighborhood cells C(k, l) ∈ Nr(i, j). Since its neighborhood cells C(k, l) have their own neighbors, C(i, j) also communicates with all other cells in the whole array in multiple steps. Therefore, even with local interconnection, this architecture is still able to ripple its effects across the whole network.
The dynamic system can be described using a set of
Fig. 1. Cellular neural network. (a) An n-by-m network on a rectangular grid. Shaded squares are the neighboring cells of C(i,j) and itself. (b) Functional block diagram of neuron cell.
Fig. 2. Two widely-used transfer functions: the piecewise-linear function and the sigmoid function.
differential equations [1]:
Cx·(dvxij(t)/dt) = −(1/Rx)·vxij(t) + Σ A(i, j; k, l)·vykl(t) + Σ B(i, j; k, l)·vukl(t) + Ib;
1 ≤ i ≤ n, 1 ≤ j ≤ m,    (1)

with both sums taken over C(k, l) ∈ Nr(i, j),
where Rx and Cx are the equivalent resistance and capacitance of the computing cell and Ib is the bias current. Shift-invariant cellular networks have interconnections that do not depend on the position of the cells in the array. This is the most desirable feature when implementing a large-size electronic network on a very large-scale integration (VLSI) chip.
Each computing cell contains a nonlinearity between the state node and the output; its input-output relationship is represented by vyij(t) = f(vxij(t)). Two widely used nonlinearities are the piecewise-linear and sigmoid functions as given by [10], [11]

y = f(x) = (1/2)·(|x + 1| − |x − 1|)   (piecewise-linear function)    (2)
y = f(x) = (1 − e^(−λx))/(1 + e^(−λx))   (sigmoid function)

and plotted in Fig. 2. Here the parameter λ represents the gain factor of the sigmoid function. For a unity gain at x = 0, λ = 2 is used for the sigmoid function.
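Both nonlinearities of (2) can be sketched in a few lines (function names are ours); for λ = 2 the sigmoid indeed has unity gain at the origin, matching the piecewise-linear function there:

```python
import math

# The two cell nonlinearities of (2).
def pwl(x):
    return 0.5 * (abs(x + 1) - abs(x - 1))

def sigmoid(x, lam=2.0):
    return (1 - math.exp(-lam * x)) / (1 + math.exp(-lam * x))

print(pwl(0.5), pwl(2.0))        # 0.5 1.0 (saturates at +/-1)
eps = 1e-6
slope = (sigmoid(eps) - sigmoid(-eps)) / (2 * eps)
print(round(slope, 3))           # 1.0 (gain lambda/2 at x = 0)
```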
The programmability of these array processors lies in the coefficient templates A(i, j; k, l) and B(i, j; k, l).
Behavioral Simulation of Densely-Connected Analog Cellular Array Processors 79
A good behavioral simulator for the networks will not only help designers gain insight on the system operations, but also can be used for optimizing the electronic implementation.
3. Computer simulation methods
In order to solve the system of 2-dimensional n x m cellular networks using digital computers, the governing equation (1) can be re-written in the matrix form [11] as

Cx·(dVx(t)/dt) = −(1/Rx)·Vx(t) + TA·f(Vx(t)) + TB·Vu + Ib·w    (3)

where Vx and Vu are the mn-dimensional state and input vectors and w is the all-ones vector.
The Rx and Cx are used for integration purposes and will affect the speed of the VLSI hardware. For behavioral simulation, these two items can be normalized to 1. Since the inputs are kept constant during each operation, the last two terms can also be lumped together as another constant term. Once the output transfer function is included, the overall system can be expressed as a set of differential equations to be solved as
dVx(t)/dt = −Vx(t) + TA·f(Vx(t)) + Lc    (4)
where f(·) is the output transfer function and Lc is the lumped constant vector whose value is equal to TB·Vu + Ib·w. By using highly optimized differential equation solver subroutines provided by many software vendors [12], the whole system dynamics can be simulated and analyzed.
One severe drawback of this simulation method is that, for an n x m network, the feedback matrix TA and control matrix TB have dimensions mn x mn, which increase on the order of O(n^4). When a large system is to be simulated, very large storage resources are required to hold the data for these two matrices. Only a small portion of the entries is non-zero; the others are zero. Precious storage space and computing resources are not efficiently utilized. Besides, it will be quite challenging to partition the computing jobs for multi-processor or multi-computer systems because of the synchronization requirement, the non-regularity of the network, and the inefficiency in the routing and use of the communication bandwidth.
An alternative solution is to apply the "convolution" idea popularly used in the digital image processing.
The basic approach is to consider a sub-image block whose size is the same as the feedback matrix TA (or TB). It moves from the top-left corner of the image to the bottom-right corner. The output state is only updated after the whole image has been processed in each iteration. Only new state information on the border has to be passed to the neighboring block. The simulation stops after all outputs have saturated to either 1 or -1 and no longer change their values.
The updating equation in each cell can be written in the form as
dvxij(t)/dt = −vxij(t) + Σ A(i, j; k, l)·vykl(t) + Σ B(i, j; k, l)·vukl(t) + Ib,    (5)

with the sums taken over C(k, l) ∈ Nr(i, j).
Now the problem is treated as solving the set of differential equations with initial values Vx(0) in each sub-image block. One-step algorithms, such as the simplest Euler's algorithm or the more elaborate fourth-order Runge-Kutta algorithm, can be used for the integration. The latter costs more in terms of computation time because it evaluates four derivatives per iteration. However, its higher cost is compensated by its accuracy in transient behavior analysis, and thus it is usually favored. The fourth-order Runge-Kutta one-step integration is given by [13]
vx(tn+1) = vx(tn) + (1/6)·(F1 + 2·F2 + 2·F3 + F4)    (6)

where δt is the integration time step and F1, F2, F3, F4 are the four intermediate terms F1 = δt·g(vx(tn)), F2 = δt·g(vx(tn) + F1/2), F3 = δt·g(vx(tn) + F2/2), F4 = δt·g(vx(tn) + F3), with g(·) the right-hand side of the differential equation.
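A single Runge-Kutta step for the lumped system (4) can be sketched as follows; the template matrix TA, constant Lc, step size, and initial state are illustrative values of ours, not taken from the paper:

```python
import numpy as np

# One fourth-order Runge-Kutta step for dVx/dt = -Vx + TA.f(Vx) + Lc, eq. (4).
def deriv(vx, TA, Lc):
    vy = np.clip(vx, -1.0, 1.0)   # piecewise-linear output function of (2)
    return -vx + TA @ vy + Lc

def rk4_step(vx, TA, Lc, dt):
    F1 = dt * deriv(vx, TA, Lc)
    F2 = dt * deriv(vx + F1 / 2, TA, Lc)
    F3 = dt * deriv(vx + F2 / 2, TA, Lc)
    F4 = dt * deriv(vx + F3, TA, Lc)
    return vx + (F1 + 2 * F2 + 2 * F3 + F4) / 6

TA = np.array([[2.0, -1.0], [0.0, 2.0]])   # illustrative 2-cell feedback
Lc = np.zeros(2)
vx = np.array([0.3, -0.2])
for _ in range(400):
    vx = rk4_step(vx, TA, Lc, 0.05)
print(vx)   # both cells have settled; the outputs f(vx) saturate at +/-1
```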
One advantage to solve the cellular network in this scheme is that the whole system can be easily partitioned for parallel computation using multi-processor or multi-computer systems. As shown in Figure 3, assuming that the 2-dimensional cellular network is partitioned into P x Q subblocks for parallel computation, the overall execution time Te will be equal to
Te = (n·m/(P·Q))·Tu + 2·Be·(n/P + m/Q)·Ti,    (7)
where Tu is the computing time for each cell and Ti represents the time for inter-block communication. Be is the communication bandwidth factor between adjacent subblocks. It is clear that the more subblocks are divided, the less computing time will be needed. The penalty is the larger amount of computing resources required, the massive synchronization among the blocks, and the communication burden among the blocks.
The total execution time versus the number of blocks divided per dimension is plotted in Figure 4 for different Ti/Tu ratios. In the experiment the input is assumed to have the same size in both dimensions (256 x 256) and the number of blocks divided in each dimension is assumed to be equal. The communication bandwidth factor Be is set to 1 to facilitate the discussion. When a multi-computer system is used to simulate the network, the Ti/Tu ratio tends to be large due to the extra memory access time and bus arbitration time. In contrast, if the system is composed of less-powerful CPUs but an efficient interconnection scheme, a low Ti/Tu ratio will be achieved. When the number of divided subblocks increases, the execution time difference among different Ti/Tu ratios will also increase. This implies that a fine-grained system needs better communication to improve its system performance.
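The trade-off of (7) is easy to tabulate; the following sketch (function name and sampled P values are ours, matching the 256 x 256 experiment with Be = 1) evaluates Te for a few Ti/Tu ratios:

```python
# Execution-time model (7): Te = (n*m/(P*Q))*Tu + 2*Be*(n/P + m/Q)*Ti,
# evaluated for the 256 x 256 case with P = Q and Be = 1.
def exec_time(P, Ti_over_Tu, n=256, m=256, Be=1.0, Tu=1.0):
    Ti = Ti_over_Tu * Tu
    return (n * m / (P * P)) * Tu + 2 * Be * (n / P + m / P) * Ti

for ratio in (0.1, 1.0, 10.0):
    print(ratio, [exec_time(P, ratio) for P in (1, 8, 64)])
```

With a large Ti/Tu ratio the communication term dominates for fine partitions, reproducing the growing spread between the curves of Fig. 4.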
A compact analog VLSI implementation will be an extremely powerful approach for P → n, Q → m. Here Tu is equal to k·RC, where k is a scaling factor depending on the coefficient template and was proved to have an upper-bound limit for each template [1]. Therefore saturated binary results will be achieved after
Fig. 3. Partition of an n x m network into P x Q subblocks.
Fig. 4. Total execution time versus number of blocks divided in each dimension.
a certain amount of execution time. If the network can be realized on one silicon microchip, the interconnection updates will occur simultaneously with the state changes. Ti becomes negligibly small and the second term in equation (7) can be dropped. The overall execution time will be equal to just k·RC. With modern VLSI fabrication techniques, the RC constant will be in the range of 10 ns to 1 μs. The achieved speed is enormously fast when dedicated microelectronic hardware is built.
4. Related work on cellular neural network simulators
Several research versions of Cellular Neural Network simulators have been announced. The CNN Workstation [14], the XCNN simulator [15], the SIRENA environment [16] or the Neurobasic simulator [17], are representative examples.
The CNN Workstation, developed by the Dual and Neural Computing Systems Laboratory, Budapest, Hungary, provides a simple experimental tool for studying cellular neural networks. Transients of CNNs with linear, nonlinear, and delay-type templates can be monitored graphically. A basic menu-driven user interface provides the control mechanism of the system.
Another software package, the XCNN simulator from Texas A&M University, focuses on a multi-layer CNN structure performing color image processing applications. An additional post-processor is used to perform pixel-wise logical operations among different layers.
The commands are issued using a specialized BNF-like language.
Researchers at the Universidad de Sevilla, Spain, developed the SIRENA environment, which is a general framework for artificial neural networks, with emphasis on CNNs. The focus is on the simulation and modeling of the non-ideal effects of VLSI implementations, though without better efficiency than the SPICE circuit simulator. A graphics interface is applied for simulation supervision and image visualization.
Neurobasic, from the Swiss Federal Institute of Technology, Zürich, Switzerland, is another simulation environment for neural networks, which uses the Basic programming language as the development tool. A special feature is that it is also designed to execute on the MUSIC parallel computer. The neuron functions can execute very fast because of the massive parallelism.
5. The behavioral simulator CNNA
The cellular neural network annealing simulator CNNA was constructed based on the Runge-Kutta method described above. It is developed using the portable C language and consists of more than 1,500 lines of code for text-mode simulation. It runs under either the Unix operating system or DOS with a suitable C compiler. The input and output images can be visualized using companion software subroutines. The behavioral simulator provides valuable information and a good tool to characterize the behavior of the system.
A template library is supported which accommodates more than 50 useful templates. A list of selected known templates for cellular nonlinear networks can be found in [11]. Additional coefficient templates were also reported in [18], [19]. Many templates for new applications are continuously reported in the literature and the number of working templates is still growing. Several examples are used in the next section for demonstration.
The strength of the cellular networks lies in the programmability by changing the coefficient templates. That is to say, the content of the templates can be viewed as instruction sets used in the conventional digital microprocessors. The input images and initial states can be handled as the operands. The system will only need information for new templates for different applications. Therefore a general-purpose simulation environment can be built.
The simulator reads the commands from a text file and properly establishes the configuration for the network operation. The user can program the command files for different applications. A sample of the command file that is used to simulate the connected-components detection operation described in the next section is listed in the following:
METHOD RK4
ISIZE 20 20
INPUT ccd.20
INITIAL ccd.20
OUTPUT ccd20ns.out
TRAN 0.01 10.0
TSIZE 3 3
TEMPA 0 1 0 0 2 0 0 -1 0
TEMPB ALL 0
ANNEAL N
BIAS 0.0
TOL 0.0001
BOUNDARY 0
There are several features that could be incorporated in this behavioral simulator to help us study the system behavior before the actual design of the hardware. It provides valuable information on the effects of the nonlinear network and of non-ideal microelectronic fabrication. Those effects can be summarized as:
Internal state limitation: Although the internal states will be bounded to a certain value as proved in [1], it is not desirable to have such a large dynamic range for the actual circuit implementation. With a limited swing voltage (or current) range, a large dynamic range will sacrifice resolution. The simulator can help to decide which range is appropriate.
Non-ideal output function: The output function described in section 2 has characteristics such as passing through the origin, skew-symmetry with respect to the origin, and saturating at a fixed output when the input is large. However, the output function implemented in hardware will not be so perfect. The symmetric characteristic is not always achieved and the output might keep growing even after the saturation point is reached. To simulate this effect, a look-up table for the output function can be used to study the effect of the desired output function.
Crosstalk in the interconnection: There are heavy communication activities among the computing cells during execution, which might cause crosstalk. Especially for the analog implementation, burst noise might randomly strengthen
Fig. 5. Annealing process with changing gain.
or weaken the template weights and thus lead to a different solution. By imposing randomly generated noise with a pre-specified strength, the simulator can provide useful information.
Limited resolution: For digital simulation, the resolution can be as high as desired, with the penalty of longer execution time. For analog implementation, the resolution supported by the analog circuits is typically limited to about 7 to 8 bits. Using an integer data type instead of a floating point data type can accurately simulate this effect.
A unique feature of this simulator is the annealing capability. Hardware annealing [10], [20], which is effectively the parallel version of the popular mean-field annealing used in analog arrays, provides an efficient method of finding globally optimal solutions. It is performed by changing the gain value of the input-output transfer function f(·), which can be described by
Vy = f(gr · Vx)
   = 1,         if Vx > 1/gr
     gr · Vx,   if -1/gr ≤ Vx ≤ 1/gr        (8)
     -1,        if Vx < -1/gr
for the piecewise-linear function. At the beginning of the annealing process, the initial gain can be set to a very small, positive value. During the annealing process the gain keeps increasing, and the final gain gr = 1 for the piecewise-linear function is maintained until the next operation. Notice that the new current-mode circuit scheme is used, and the maximum gain value in the cellular network is only 1. Figure 5 shows the transfer characteristics of the nonlinearity for several values of the gain control parameter gr.
The simulator takes much longer CPU time for the annealing process because the gain gr needs to be changed during the simulation. This is due to the use of a low neuron gain at the beginning, in contrast to the constant high gain for simulations without annealing. The states change in order to deterministically search for the optimal solution in the solution space.
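The gain-controlled output function of Eq. (8) and an annealing schedule can be sketched as follows (the particular schedule values are arbitrary):

```python
import numpy as np

def f_gain(vx, g):
    # Eq. (8): piecewise-linear output with gain g, saturating at +/-1.
    return np.clip(g * vx, -1.0, 1.0)

# Hardware-annealing gain schedule: start from a very small positive gain
# and raise it to the final value g = 1 used by the current-mode scheme.
for g in [0.05, 0.2, 0.5, 1.0]:
    print(g, f_gain(0.6, g))   # transfer curve steepens as in Fig. 5
```

During an annealed simulation, each integration step would evaluate the cell outputs through `f_gain` with the current value of the schedule instead of the fixed unity-gain nonlinearity.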
The framework for a universal cellular neural network simulator is shown in Figure 6. Instead of requiring detailed commands to be specified every time, a software library stores the known template information. The desired operation sequences can be entered from the user interface either in a high-level programming language or through a graphics-oriented approach. The compilation process is necessary, and the linkage to the known templates can be assisted by the software library manager.
6. Simulation results
The connected-components detection operation [21] can count the number of connected components in each row (column) of the input image. This operation is performed by using the appropriate template
A(i, j; k, l) = ( 0  0  0
                  1  2 -1
                  0  0  0 ),
B(i, j; k, l) = 0,
h = 0. (9)
The input image to be processed is entered as the initial state values Vx(O). The output will be saturated to
Fig. 6. The framework of the array-processor simulation environment.
Behavioral Simulation of Densely-Connected Analog Cellular Array Processors 83
Table 1. Simulation summary for connected-component detection: time steps required and CPU time (sec) for each normalized time step Δt and each image size.
either 1 or -1, and the number of cells with positive outputs in each row is the number of connected components. These positive output values are separated by one negative pixel. Figure 7 shows the final results using this template. The execution time for this specific application grows linearly with the image size because of the one-dimensional dependency of the data.
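The CCD dynamics can be sketched with a simple forward-Euler solver; the template values follow (9), while the -1 (background) padding at the row boundary and the step-size/step-count choices are assumptions of this sketch:

```python
import numpy as np

def f(x):
    # standard CNN output nonlinearity, f(x) = 0.5*(|x+1| - |x-1|)
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def count_components(row, dt=0.05, steps=4000):
    # dx/dt = -x + 1*y(left) + 2*y(self) - 1*y(right); B = 0, h = 0.
    # The input row enters as the initial state Vx(0); the boundary is
    # padded with the background level -1 (an assumption of this sketch).
    x = np.asarray(row, dtype=float).copy()
    for _ in range(steps):
        y = f(x)
        yl = np.concatenate(([-1.0], y[:-1]))   # left neighbors
        yr = np.concatenate((y[1:], [-1.0]))    # right neighbors
        x += dt * (-x + yl + 2.0 * y - yr)
    return int(np.sum(f(x) > 0))   # isolated +1 pixels = component count

print(count_components([1, 1, -1, 1, -1, -1, 1, 1, 1, -1]))
```

The components drift toward the row boundary and compress into single +1 pixels separated by single -1 pixels, so counting positive outputs yields the component count, as described above.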
Another example is the hole-filling operation [22], which can fill the holes within the edge of objects in the input image. The edge of the hole has to be at least eight-connected for proper filling. A pixel X is said to be eight-connected to its neighbors if it is a logical one and at least one of its east, west, north, south, northeast, northwest, southeast, or southwest neighbors is also a logical one. The operation can be realized by the template
A(i, j; k, l) = ( 0 1 0
                  1 2 1
                  0 1 0 ),
B(i, j; k, l) = ( 0 0 0
                  0 4 0
                  0 0 0 ),
h = -1. (10)
The image to be processed is entered as the input, and the initial states are all set to 1. As time evolves, the pixels to be filled stay at 1, while those pixels which will not be filled keep decreasing their values and finally saturate at -1. Figure 9 shows the hole-filling capability applied to two simple spiral images.

Table 1 (data):

Image Size                      5 x 5     8 x 8    20 x 20    64 x 64
Δt = 0.1    Time Steps Required    90       126        296        763
            CPU Time (sec)       0.18      1.05     24.033    752.067
Δt = 0.05   Time Steps Required   155       225        549       1449
            CPU Time (sec)      0.317      1.95      45.51    1461.15
Δt = 0.01   Time Steps Required   571       899       2470       6827
            CPU Time (sec)       1.45     9.133     218.98    7056.03
Δt = 0.005  Time Steps Required   992      1645       4770      13453
            CPU Time (sec)      3.217    18.117     436.07      14113
Δt = 0.001  Time Steps Required  3307      6557      22131      65390
            CPU Time (sec)      14.70     90.15    2176.40      70482

Figure 10 shows the transient characteristics of the state variable at different locations in the second example of the previous figure. Cell (0, 0) is the upper-left corner cell which will not be filled, and Cell (1, 1) is a cell that is to be filled. Cell (0, 1) represents a cell that is already at logical 1.
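The hole-filling dynamics of template (10) can be sketched in the same way; grounded (y = 0) virtual boundary cells and the integration settings are assumptions of this sketch:

```python
import numpy as np

def f(x):
    # standard CNN output nonlinearity
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def hole_fill(u, dt=0.05, steps=3000):
    # Template (10): A = [[0,1,0],[1,2,1],[0,1,0]], B = 4 (center only),
    # h = -1. The image is the input u; all initial states are set to +1.
    # Virtual boundary cells are assumed grounded (y = 0) in this sketch.
    u = np.asarray(u, dtype=float)
    x = np.ones_like(u)
    for _ in range(steps):
        y = f(x)
        yp = np.pad(y, 1)                       # zero (grounded) boundary
        s = (yp[:-2, 1:-1] + yp[2:, 1:-1] +     # north + south neighbors
             yp[1:-1, :-2] + yp[1:-1, 2:])      # west + east neighbors
        x += dt * (-x + s + 2.0 * y + 4.0 * u - 1.0)
    return f(x)

# 5x5 test image: a closed 3x3 ring of +1 pixels with a -1 hole at the center
img = -np.ones((5, 5))
img[1:4, 1:4] = 1.0
img[2, 2] = -1.0
out = hole_fill(img)
print(out[2, 2] > 0)   # the enclosed hole pixel stays at +1, i.e. is filled
```

A "melting" wave started from the boundary pulls unprotected pixels down to -1; pixels enclosed by a complete object edge keep their initial +1 state, which is exactly the filling behavior described above.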
Both applications described above use binary input images. For some applications, gray-level images are used. In the edge-detection operation, the required templates are
A(i, j; k, l) = ( 0 0 0
                  0 2 0
                  0 0 0 ),
B(i, j; k, l) = ( -0.25 -0.25 -0.25
                  -0.25   2   -0.25
                  -0.25 -0.25 -0.25 ),
h = -0.3. (11)
An example that shows the superiority of hardware annealing for finding the globally optimal solution is given in Figure 11. The original Mickey image, a gray-level 187 x 294 image, is plotted in Figure 11(a). After we apply the conventional digital image processing operation with
M = ( -0.25 -0.25 -0.25
      -0.25   2   -0.25
      -0.25 -0.25 -0.25 ) (12)
Fig. 7. (a) Input artificial image. (b) Output result along the rows after CCD operations. (c) Output result along the columns.
and the threshold set to 0, an edge image is generated in Figure 11(b). The simulation result shown in Figure 11(c) is obtained by (11) without hardware annealing. In Figure 11(d), an improved edge result is shown when the annealing is applied. The annealing speed and the threshold h are the parameters for the annealing operation. The whole simulation takes about 15 minutes to complete on a Sun SPARCstation-20.
The cellular network can be enhanced by incorporating chaotic neurons into the array to explore the rich
10' ".,.,.".",.,.,."..,.".c,.,.,om",p""ari""SO"," ".,.of,.,.,eX""eC",.utC!io".",t",im",e m.,.,.a""de,.,.,by.",.,.,diff""e'ce"",t ,."imca9""e S"."iz""es,.".,.,.",..,.".".,..,
10'
10·
10-' L:-_--'_-'--'--'--'-'....i....i-'-:-__ -'----'_'--'-'-.......... ..i..l 10" 10'" 10"
iteration time step
Fig. 8. Execution time of connected-components detection operation using different integration step sizes for multiple image sizes.
spatio-temporal relationships. Such complex networks are an important model for physical systems and biological signal processing with many degrees of freedom [23]. Chua's circuits can be used as standard chaotic cells [24]. Due to the high dimensionality of complex cellular chaotic networks, accurate simulation will be a challenging task and will be carefully addressed in our future study. The proposed behavioral simulator can be used in conjunction with the mixed-mode circuit simulator iSPLICE [25] for the development of application-specific array-processing VLSI chips for pattern recognition.
7. Conclusion
In this work a behavioral simulation methodology for densely-connected analog cellular networks is presented. The proposed method is based on a differential-equation solver engine which can efficiently simulate the system dynamic behavior. System partitioning techniques can provide valuable information about the influence of partitioning when multiple chips must be used to construct a large system. Random noise added to the system simulates the crosstalk noise generated during fast data switching and tests the robustness of the cloning templates. The effect of the finite precision of the analog system can also be analyzed using the simulator.
Fig. 9. (a) Input image without an 8-connected pattern. (b) Output image, which has no change. (c) Input image with a 4-connected object. (d) Hole-filling output image where the enclosed pixels are filled.
Fig. 10. The transient of the state variables at different locations (normalized time, RC = 1).
References
1. L. O. Chua and L. Yang, "Cellular neural networks: Theory." IEEE Trans. on Circuits and Systems 35, pp. 1257-1272, Oct. 1988.
2. L. O. Chua and L. Yang, "Cellular neural networks: Applications." IEEE Trans. on Circuits and Systems 35, pp. 1273-1290, Oct. 1988.
3. C. Mead, Analog VLSI and Neural Systems. Addison Wesley, 1989.
4. L. O. Chua and T. Roska, "The CNN paradigm." IEEE Trans. on Circuits and Systems I 40, pp. 147-156, Mar. 1993.
5. K. R. Crounse, T. Roska and L. O. Chua, "Image halftoning with cellular neural networks." IEEE Trans. on Circuits and Systems II 40, pp. 267-283, Apr. 1993.
6. T. Sziranyi and J. Csicsvari, "High-speed character recognition using a dual cellular neural network architecture." IEEE Trans. on Circuits and Systems II 40, pp. 223-231, Mar. 1993.
7. A. G. Radvanyi, "A dual CNN model of cyclopean perception and its application potentials in artificial stereopsis," in IEEE Proc. of Workshop on Cellular Neural Networks and Applications, Munich, Germany, Oct. 1992, pp. 222-227.
8. T. W. Berger, B. J. Sheu and R. H.-K. Tsai, "Analog VLSI implementation of a nonlinear systems model of the Hippocampal brain region," in Proceedings of the Third IEEE International Workshop on Cellular Neural Networks and their Applications (CNNA-94), December, 1994, pp. 47-51.
9. A. Jacobs, T. Roska and F. Werblin, "Techniques for constructing physiologically motivated neuromorphic models in CNN," in Proceedings of the Third IEEE International Workshop on Cellular Neural Networks and their Applications (CNNA-94), December, 1994, pp. 53-58.
10. S. H. Bang, "Performance optimization in cellular neural network and associated VLSI architectures," SIPI Technical Report #268, Dept. of EE, University of Southern California, 1994.
11. B. J. Sheu and J. Choi, Neural Information Processing and VLSI. Kluwer Academic Publishers: Boston, MA, 1995.
(a)
(b)
(c)
(d)
Fig. 11. Demonstration of the hardware annealing effects. (a) Original Mickey image. (b) The result obtained by using the traditional digital image processing. (c) The result after applying CNN edge detection template. (d) The result when both edge detection template and hardware annealing are applied.
12. W. H. Press, B. P. Flannery, S. A. Teukolsky and W. T. Vetterling, Numerical Recipes in C. Cambridge University Press, 1988.
13. J. M. Ortega and W. G. Poole, Jr., An Introduction to Numerical Methods for Differential Equations. Pitman Publishing Inc., 1981.
14. "Cellular Neural Network Simulator User's Manual, ver. 3.6," in Cellular Neural Networks, edited by T. Roska and J. Vandewalle, Wiley, 1993.
15. Jose Pineda de Gyvez, "XCNN: A software package for color image processing," in Proceedings of the Third IEEE International Workshop on Cellular Neural Networks and their Applications (CNNA-94), December, 1994, pp. 219-234.
16. R. Dominguez-Castro, S. Espejo, A. Rodriguez-Vazquez, I. Garcia-Vargas, J. F. Ramos and R. Carmona, "SIRENA: A simulation environment for CNNs," in Proceedings of the Third IEEE International Workshop on Cellular Neural Networks and their Applications (CNNA-94), December, 1994, pp.417-422.
17. The simulator is written by J. A. Osuna. Additional information can be found at ftp://ife.ethz.ch/pub/NeuroBasic.
18. S. Espejo, VLSI Design and Modeling of CNNs. Ph.D. Dissertation, University of Sevilla, Spain, Apr. 1994.
19. T. Roska, "CNN analogic (dual) software library." Internal Report DNS-I-1993, Computer and Automation Institute, Hungarian Academy of Science, Jan. 1993.
20. S. Bang, B. J. Sheu and T. H. Wu "Optimal solutions for cellular neural networks by paralleled hardware annealing," accepted by IEEE Trans. on Neural Networks.
21. T. Matsumoto, L. O. Chua and H. Suzuki, "CNN cloning template: Connected component detector." IEEE Trans. on Circuits and Systems 37, pp. 633-635, May 1990.
22. T. Matsumoto, L. O. Chua and R. Furukawa, "CNN cloning template: Hole filler." IEEE Trans. on Circuits and Systems 37, pp. 635-638, May 1990.
23. M. J. Ogorzalek, A. Dabrowski and W. Dabrowski, "Hyperchaos, clustering and cooperative phenomena in CNN arrays composed of chaotic circuits," in Proceedings of the Third IEEE International Workshop on Cellular Neural Networks and their Applications (CNNA-94), December, 1994, pp. 315-320.
24. L. O. Chua and G.-N. Lin, "Canonical realization of Chua's circuit family." IEEE Trans. on Circuits and Systems 37(7), pp. 885-902, 1990.
25. R. A. Saleh, iSPLICE3 Version 3 User's Guide. Dept. of Electrical and Computer Engineering, University of Illinois, Urbana-Champaign.
Tony H. Wu was born in Taiwan in 1967. He received the B.S. degree in electrical engineering from National Taiwan University, Taipei, in 1989, and M.S., and Ph.D. degrees in electrical engineering from University of Southern California in 1992 and 1995, respectively.
At USC, Mr. Wu has been a teaching assistant for two graduate-level courses in image/video processing
technology and the digital information superhighway. He worked as a graduate research assistant in the VLSI Signal Processing Laboratory, where he also managed the computing facility and equipment. He has participated in many research topics including VLSI image processing and signal transmission, neural networks, and intelligent systems. During June-August 1995, he worked on programmable video processor design at AT&T Bell Labs in Holmdel, NJ. He joined Cirrus Logic Corp. in December 1995.
He has been an active participant in IEEE activities. He serves on the Technical Program Committee of the 1995 International Conference on Computer Design in the Architectures-and-Algorithm Track. He also serves as a co-editor of the book, Microsystems Technology for Multimedia Applications, from IEEE press, 1995. He is a member of the IEEE.
Bing J. Sheu was born in Taiwan in 1955. He received the B.S.E.E. degree (Honors) in 1978 from the National Taiwan University, the M.S. and Ph.D. degrees in electrical engineering from the University of California, Berkeley, in 1983 and 1985, respectively.
At National Taiwan University, he was the recipient of the Distinguished Book-Coupon Award seven times. In 1981, he was involved in custom VLSI design for a speech recognition system at Threshold Technology Inc., Cupertino, CA. From 1981 to 1982, he was a Teaching Assistant in the EECS Department, UC Berkeley. From 1982 to 1985, he was a Research Assistant in the Electronics Research Laboratory, UC Berkeley, working on digital and analog VLSI circuits for signal processing. In 1985, he joined the faculty of the Electrical Engineering Department at the University of Southern California and is currently an Associate Professor with a joint appointment in the Biomedical Engineering Department. He has been an active researcher in several research organizations at USC including the Signal and Image Processing Institute (SIPI), the Center for Neural Engineering (CNE), the Institute for Robotics and Intelligent Systems (IRIS), and the Center for Photonic Technology (CPT). He
serves as the Director of the VLSI and Signal Processing Laboratory. Since 1983, he has served as a consultant to the microelectronics and information processing industry. His research interests include VLSI chips and systems, massively paralleled neural networks and image processing, and high-speed interconnects and computing. He is an Honorary Consulting Professor at National Chiao Tung University, Hsin-Chu, Taiwan.
Dr. Sheu was a recipient of the 1987 NSF Engineering Initiation Award and, at UC Berkeley, the Tse-Wei Liu Memorial Fellowship and the Stanley M. Tasheira Scholarship Award. He was also a recipient of the Best Presenter Award at the IEEE International Conf. on Computer Design in both 1990 and 1991. He is a recipient of the 1995 Best Paper Award of the IEEE Transactions on VLSI Systems, and of the 1995 Best Poster Paper Award of the World Congress on Neural Networks from the International Neural Network Society. He has published more than 170 papers in international scientific and technical journals and conferences, and is a coauthor of the book Hardware Annealing in Analog VLSI Neurocomputing (1991) and the book Neural Information Processing and VLSI (1995), both from Kluwer Press, and co-editor of Microsystems Technology for Multimedia Applications (1995) from IEEE Press. He served on the Technical Program Committee for the March 1992 and 1993 Special Issues of the IEEE Journal of Solid-State Circuits; as a Guest Editor on computer technologies for the June 1993 Special Issue of the IEEE Transactions on VLSI Systems; and as an Associate Editor of the IEEE Transactions on Neural Networks. He is on the Technical Program Committees of the IEEE Int'l Conf. on Neural Networks, the Int'l Conf. on Computer Design, and the Int'l Symposium on Circuits and Systems. At present, he serves as an Associate Editor of the IEEE Transactions on VLSI Systems; an Associate Editor of the IEEE Transactions on Circuits and Systems, Part I and Part II; and the CAS Editor of IEEE Circuits and Devices Magazine. He also serves on the editorial board, and as a guest editor for the intelligent microsystems special issue, of the Journal of Analog Integrated Circuits and Signal Processing, Kluwer Press; and on the editorial board of the Neurocomputing Journal, Elsevier Press. He serves as the Tutorials Chair of the 1995 IEEE Int'l Symposium on Circuits and Systems, and as the Technical Program Chair of the 1996 IEEE Int'l Conf. on Neural Networks.
He is among the key contributors of the widely used BSIM model in the SPICE circuit simulator. He is a Fellow of IEEE, a member of International Neural Networks Society, Eta Kappa Nu, and Phi Tau Phi Honorary Scholastic Society.
Eric Y. Chou was born in Hsinchu, Taiwan, in 1968. He received the B.Sc. degree in computer science and information engineering from National Taiwan University in 1990, and M.Sc. degree in electrical engineering
from the University of Southern California, in 1993. He is currently a Ph.D. candidate at USC.
From 1990 to 1992, he served as an ROTC officer in the Taiwan Navy. He joined the VLSI Signal Processing Laboratory at USC in Fall 1992. During 1993-1994, he worked on a low-power microelectronics project at the USC Information Sciences Institute in Marina Del Rey, CA. He returned to full-time study on the USC campus in January 1995. He has participated in research projects on VLSI system architecture and software for compact supercomputing for signal/image processing and machine intelligence. He is a student member of IEEE, ACM, and the Tau Beta Pi honorary society.
Analog Integrated Circuits and Signal Processing, 10, 89-99 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Hierarchical Fault Modeling for Linear Analog Circuits
NAVEENA NAGI1 AND JACOB A. ABRAHAM2    nagi@logicvision.com
1 LogicVision, 101 Metro Drive, Third Floor, San Jose, CA 95110; 2 Computer Engineering Research Center, University of Texas at Austin, ENS 424, Austin, TX 78712-1084
Abstract. This paper presents a hierarchical fault modeling approach for catastrophic as well as out-of-specification (parametric) faults in analog circuits. These include both ac and dc faults in passive as well as active components. The fault models are based on functional error characterization. Case studies based on CMOS and nMOS operational amplifiers are discussed, and a full listing of derived behavioral fault models is presented. These fault models are then mapped to the faulty behavior at the macro-circuit level. Application of these fault models in an efficient fault simulator for analog circuits is also described.
1. Introduction
The aim of test generation is to minimize production testing costs and improve test quality by choosing an optimal set of test patterns. This task is well understood for digital circuits, for which Automatic Test Pattern Generators (ATPGs) assume a fault model (stuck-at, stuck-open, delay faults, etc.) and generate tests based on it. However, because of the complex nature of analog circuits, a direct application of digital fault models proves to be inadequate in capturing the faulty behavior. Hence analog test selection has been approached in a rather ad hoc way. Sometimes circuits tend to be overtested to avoid shipping a faulty product, while, at other times, the tests may be inadequate. A first step towards developing an analog testing methodology is to develop comprehensive analog fault models.
With the increasing demand for low defect levels, it is imperative that realistic and most probable physical manifestations from the non-idealities of the process be modeled and tested for. The fault model should be based on the physical defects caused by random process instabilities and contaminations present during the fabrication process. These faults are highly dependent on the type of process, and their effect on the overall circuit behavior depends on the design style and the layout. Moreover, some faults are more probable than others. To reduce test costs, faults should be graded according to their probability of occurrence to aid in
the trade-off decision of accuracy versus complexity of the fault models.
The first step in developing meaningful fault models is to understand the various types of failures, their cause and effect which we briefly enumerate along with some issues that are imperative to a successful approach to the problem:
There are two sources of faults: local process defects and global process defects. Catastrophic faults are random defects that are caused by local structural deformations, including spot defects like oxide pinholes or extra metal (which result in dead/resistive opens/shorts and capacitive couplings), or by large variations in design parameters. These cause a complete malfunction of the circuit. Parametric faults are caused by global statistical fluctuations in the process parameters, including oxide thickness, linewidth variations, and mask misalignment. These result in out-of-specification performance of some circuit parameters like gain or bandwidth. Catastrophic faults usually occur as single faults, whereas parametric variations often occur as multiple faults affecting several parameters/components simultaneously. The issue is further complicated since faults may be either independent or dependent (correlated). In this paper we will not address intermittent faults but focus only on permanent faults.
We will first review some of the previous approaches to analog fault modeling and summarize the progress that is being made by recent research efforts. We then
90 N. Nagi and J. A. Abraham
propose a hierarchical fault modeling approach which is aided by two case studies and illustrate the use of the behavioral fault models in an efficient fault simulation and test generation framework. Finally, we discuss the limitations and outline the future work required towards a comprehensive solution to this key problem.
2. Review of previous approaches
Several analog fault models have been proposed in [1], [2], but they only model catastrophic failures in analog circuits, i.e., failures that result in a completely malfunctioning circuit. For digital circuits catastrophic faults dominate, but yield losses in analog circuits are caused by catastrophic as well as parametric faults.
Most of the earlier work on analog fault modeling and diagnosis [1], [3], [4] focuses on the theoretical aspects and relies extensively on the characteristic matrix of the circuit and sensitivity analysis. The approach in [4] is based on a linear coefficient matrix model that relates device response to a small set of parameters. This involves a high computational complexity and may not be feasible for large circuits. In addition, the linear error models are not suited for non-linear devices. In [5] faults are based on tolerances in process parameters, and are modeled by statistical distributions. This approach relies on the designer to identify the critical parameters and supply a parameter model of process fluctuations.
More recent publications [2], [7] approach the problem of analog testing from a more experimental perspective. However, the fault models are limited to resistive faults [2], or faults in passive components [7]. Some work has recently been initiated in the area of fault modeling for active analog devices. Faults are injected in the layout and translated to circuit faults using a defect simulator. A fault model for operational amplifiers (op amps) has been proposed in [6] but is limited to dc faults. A fault macromodel for op amps that includes both dc and ac faults has been presented in [8]. It also incorporates accurate I/O parameters responsible for interfacing errors due to loading effects. However, the fault model is developed only for dead short/bridging faults. In reality, shorts inside transistors can have a wide range of resistance because they are embedded in gate oxide or substrate materials, and can have very different effects [9].
This effect has been considered in [10], where a range of resistances is considered for the short/bridging faults. Each circuit block is replaced by its behavioral
model equivalent, except the subcircuit in which faults are injected, which is described by its layout extracted netlist. This alleviates the simulation complexity to a certain extent, but nevertheless necessitates numerous simulations involving a faulty circuit level block. Behavioral models of the defect-free subblocks can be generated using a macromodeling approach such as that presented in [11]. Performance macromodels of subblocks are generated using a layered volume-slicing methodology with radial basis functions. The approach is very general and does not require any a priori knowledge of the system, thereby generating accurate macromodels.
However, there is a trade-off between the accuracy and computational complexity of macromodels. For the purpose of fault simulation and test generation, which are by nature compute-intensive because they must be performed for each of the numerous faults, it is imperative that the complexity be reduced as much as possible; only then will the approach be feasible. In addition, the aim is mainly to differentiate a good circuit from a faulty one, rather than to reproduce the response with exact accuracy. For these two reasons, fault models must be as simple as possible as well as amenable to efficient simulation techniques.
In this paper, a hierarchical methodology is proposed for developing behavioral fault models of macro blocks based on [12], and the practical application of these fault models in an efficient fault simulator. This work addresses both dc and ac faults in op amps. Op amps are considered since they form an intrinsic part of almost all analog circuits. In addition, op amps occupy a much larger silicon area in monolithic ICs than passive components, and hence are prone to more manufacturing defects. In the next section, a general fault modeling approach is presented. Section 3 considers two case studies of CMOS and nMOS op amps. These fault modeling concepts are then extended to model faults at the higher macro-circuit level. This is not a direct mapping, since a large class of faults, particularly the out-of-specification faults, are based on the acceptable region of operation of the device, which depends on the circuit configuration. For example, although the open loop characteristics of an op amp may differ widely from the specifications, the inverter circuit in which it is embedded may not exhibit any faulty behavior. This leads to the concept of macro-circuit fault models which is discussed in Section 4. Finally, the application of these fault models in an efficient fault simulator is described and some future research directions outlined.
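The masking effect mentioned above can be illustrated numerically with a single-pole op-amp macromodel in an inverting configuration; all component values and gain figures below are hypothetical, chosen only to show how feedback hides a large open-loop deviation:

```python
# Illustrative dc calculation: a 10x drop in open-loop gain (a gross
# open-loop fault) barely moves the closed-loop gain of an inverting
# amplifier, so the fault is masked at the macro-circuit level.

def closed_loop_gain(A, R1=1e3, R2=1e4):
    # Inverting amplifier with finite open-loop gain A (evaluated at dc):
    # G = -(R2/R1) / (1 + (1 + R2/R1)/A)
    ideal = -(R2 / R1)
    return ideal / (1.0 + (1.0 + R2 / R1) / A)

good = closed_loop_gain(A=1e5)
faulty = closed_loop_gain(A=1e4)          # parametric fault: dc gain down 10x
rel = abs(faulty - good) / abs(good)
print(rel)                                # fractional change, well under 1%
```

This is why the macro-circuit fault models of Section 4 must be derived from the acceptable region of operation in the embedding circuit, not from the open-loop specifications alone.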
3. Fault modeling approach
The derivation of fault models must take into consideration the effect of circuit malfunctions at the required level of abstraction. The decision at what level the faults should be modeled is based on a trade-off between accuracy and cost of simulation. Fault models can be categorized as structural or behavioral. Structural fault models, in turn, can be either at the device level or the circuit level. In digital circuits structural fault models can be abstracted to the higher gate level, at which level it is feasible to fault simulate practical circuits using gate-level fault simulators. However, for analog circuits, structural circuit-level fault models are useful only if the size of the circuit is small, due to the computationally intensive nature of circuit simulators. For circuits with more than 3-4 op amps, higher level fault models are required.
One approach is to generate fault models based on macromodels of analog functional blocks. This reduces the complexity to a certain extent, but for large circuits and especially for mixed-signal circuits, even this level of complexity may be impractical. The second approach which is at a higher level of abstraction, is to develop behavioral fault models. These not only reduce the computational complexity of fault simulation which can be performed by a behavioral simulator, but also enable the development of hierarchical fault models for analog macro-circuits. These analog fault models can be used along with the existing fault models for digital circuits to develop a mixed-signal test generation technique. The advantage of developing fault models based on functional error characterization is that they can be directly used to generate test programs for analog and mixed-signal circuits.
In this paper we propose a hierarchical fault model-ing approach that entails the following steps:
1. Fault injection
2. Circuit simulation
3. Formation of the fault model
4. Verification of the fault model
5. Macro-circuit fault modeling
3.1. Fault injection
The key to ensuring valid fault models is that they should be derived as closely as possible from the underlying physical processing defects. A viable approach is to apply inductive fault analysis [13] to physical layouts of the circuit. This requires a description of the
manufacturing defect statistics, which, when mapped onto the layout, provide a list of possible electrical faults. For catastrophic faults a yield simulator, e.g. VLASIC [14], may be used. For parametric failures a process simulator, e.g. FABRICS [15], is used. These extract the faults to the circuit level. At the circuit level, the faultlist consists of shorts, opens, breaks in lines, and parameter variations in both active and passive components, e.g. resistors, capacitors, and transistor transconductances. As with digital circuits, there can be either single or multiple faults. Multiple faults may either mask out the effects of each other or have a cumulative effect.
3.2. Circuit simulation
Having derived a list of the possible electrical faults, the next step is to abstract these into behavioral descriptions of their effects. This is done by performing electrical simulations of the defective circuits using a circuit simulator like SPICE. In order to take into account process variations and tolerances, several Monte Carlo simulations must be performed, in addition to the nominal values, similar to those in [16]. Separate distributions are used for local mismatches as opposed to global drifts. In addition, two types of correlations must be considered for intra-cell and inter-cell parameters.
After performing the simulations, it is found that some faults do not have any observable effect on the performance specifications of the circuit, and can be dropped from the faultlist. Despite this, the size of the faultlist to be considered for test generation is huge, and it is imperative to model the faults at a higher level, e.g. at the op amp level. This results in an implicit compaction in the large number of faults that would need to be considered at the circuit level. This model is then used in the next phase of developing fault models for macro-circuits, e.g. active filters, sample-and-hold circuits, phase-locked-loops, etc.
3.3. Formation of the fault model
The faulty behavior of a circuit can be modeled at different levels of abstraction depending on a tradeoff between the required accuracy and computational complexity. Although circuit simulation is very accurate, it is impractical and extremely time consuming for
92 N. Nagi and J. A. Abraham
complex analog and mixed-signal VLSI circuits. The complexity becomes further daunting when performing multiple simulations for each one of the numerous faults possible.
From circuit simulation results, the behavior of the good and faulty circuits is analyzed to develop higher level fault models. Fault models at the structural level are derived by forming macromodels of the defect-free circuit and then either modifying some parameter values or adding extra components like resistors, switches, voltage or current sources, etc. This reduces the complexity of a circuit from hundreds of (possibly nonlinear) components to a much fewer number of (mostly linear) components.
Structural fault models, however, still restrict simulation to the circuit level, which may not be feasible for numerous fault simulations of complex circuits. Behavioral fault models are derived at yet another level of abstraction and can be used for efficient behavioral fault simulation. The output response of the good and faulty circuits can be modeled by performing regression analysis on polynomials [17], or by a more general technique using radial basis functions [11]. For linear circuits, however, several efficient approaches exist. One such approach, used to illustrate the examples in this work, is based on Asymptotic Waveform Evaluation (AWE) [18].
AWE is a moment-matching technique that approximates the response by matching the initial conditions and the first 2q - 1 moments of the exact response to a lower order q-pole model. Let the exact response of the circuit be represented by x(t). Then the approximating function is of the form
\[
\hat{x}(t) = \sum_{i=1}^{q} k_i e^{p_i t}
\]
where p_i are the approximating poles and k_i the residues. The moments are computed from the circuit response and matched to obtain the required number of poles and residues for the approximate model. This model is then used for fault modeling at the next level of hierarchy.
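To make the moment-matching idea concrete, the sketch below implements a simplified AWE-style reduction in Python. It is not the RICE implementation used by the authors, and it matches the first 2q moments directly (a common simplified variant of matching initial conditions plus 2q - 1 moments): a Hankel solve yields the characteristic polynomial of the reduced model, then a Vandermonde solve yields the residues. The 3-pole "exact" system is invented for illustration.

```python
import numpy as np

def moments_from_model(poles, residues, n):
    # Taylor coefficients of H(s) = sum_i k_i/(s - p_i) about s = 0:
    # m_l = -sum_i k_i / p_i**(l + 1)
    return np.array([-np.sum(residues / poles ** (l + 1)) for l in range(n)])

def awe_reduce(m, q):
    """Fit a q-pole model to the first 2q moments m[0..2q-1]."""
    # Moments of a q-pole model satisfy a length-q linear recurrence;
    # solving this Hankel system gives the characteristic polynomial.
    H = np.array([[m[l + j] for j in range(q)] for l in range(q)])
    a = np.linalg.solve(H, -m[q:2 * q])
    # Reciprocal poles x_i = 1/p_i are roots of x^q + a[q-1]x^(q-1)+...+a[0]
    x = np.roots(np.concatenate(([1.0], a[::-1])))
    poles = 1.0 / x
    # Residues from the first q moment equations (Vandermonde solve)
    V = np.array([x ** (l + 1) for l in range(q)])
    residues = np.linalg.solve(V, -m[:q])
    return poles, residues

# Invented exact 3-pole response, reduced to a 2-pole AWE model
p_true = np.array([-1.0, -10.0, -100.0])
k_true = np.array([1.0, 5.0, 50.0])
m = moments_from_model(p_true, k_true, 4)      # first 2q = 4 moments
p2, k2 = awe_reduce(m, q=2)

# The reduced model reproduces all four matched moments
assert np.allclose(moments_from_model(p2, k2, 4), m)
```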
3.4. Fault model verification
The fault models are verified against the simulation response of the transistor-level circuit. The verification criterion is, however, different from that for regular macromodels, where the absolute accuracy is important. In the case of fault models, the goal is to differentiate a fault-free circuit from a faulty one with manufacturing defects. In order to have a feasible fault-based approach for either fault simulation or test generation, the fault models must be as simple as possible and lend themselves to an efficient simulation technique. Thus, for a q-pole approximation we start with a low-order model; only if the verification step results in a higher percentage of aliasing than is acceptable is the model further refined. Aliasing will occur for those faults whose (approximate) response falls within a certain specified tolerance of that of the fault-free circuit.
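The aliasing criterion can be stated very concretely, as in the minimal sketch below. The 5% tolerance and the stand-in response curves are invented for illustration; the paper does not specify either.

```python
import numpy as np

def aliases(fault_resp, good_resp, rel_tol=0.05):
    """A fault aliases if its response stays within rel_tol of the
    fault-free response at every tested point (tolerance is illustrative)."""
    return np.all(np.abs(fault_resp - good_resp) <= rel_tol * np.abs(good_resp))

freqs = np.logspace(0, 6, 50)
good = 1.0 / np.sqrt(1.0 + (freqs / 1e5) ** 2)     # stand-in good response
fault_a = good * 0.99                              # within tolerance: aliases
fault_b = 1.0 / np.sqrt(1.0 + (freqs / 1e3) ** 2)  # shifted pole: detected

print(aliases(fault_a, good), aliases(fault_b, good))  # → True False
```

A high aliasing fraction over the whole faultlist would trigger refinement of the q-pole model.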
3.5. Macro-circuit fault modeling
The effects of parametric faults have different manifestations depending on the circuit configuration and, in some cases, it may not be possible to model the effect at the macro-circuit level based on the op amp fault model. In these cases it is necessary to simulate the flat circuit for the electrical fault. Circuit level simulation is feasible for a circuit with a few op amps, but becomes impossible for larger circuits. A hierarchical approach may be used to deal with such circuits. The fault-free op amps are replaced by their models which are well developed, while the faulty op amp is simulated at the component level.
An interesting point to note is that the presence of a fault in an op amp may cause some characteristics of the op amp to be out of specification, but may have no effect on the functioning of the macro-circuit. This is equivalent to the case of redundant faults in digital circuits. Since it is not possible to simulate over the entire range of operation of the device at the circuit level, there is a need for a hierarchical procedure for mapping faults in order to characterize this effect. This would define the regions of operation where the fault effect is masked and those where it appears as a measurable error. The aim of test generation would then be to generate inputs that operate the circuit in the faulty mode.
The fault modeling problem is aggravated by compounding effects of distributed faults. Consider, for example, a 3-stage filter with a global feedback. The phase shift of each stage may be within the tolerance, but the total phase shift of the feedback path may cause the circuit to be unstable and break into oscillations. This is similar to the path-delay faults in digital circuits.
Hierarchical Fault Modeling for Linear Analog Circuits 93
Fig. 1. CMOS operational amplifier.
We summarize the hierarchical procedure for developing fault models for macro-circuits based on functional error characterization of their constituent primitive components. The fault models are developed at different levels of abstraction:
• Structural level
- Circuit level
- Functional block level
• Behavioral level
- Error characterization of functional blocks and macro-circuits
Once the behavioral fault models for the primitives, e.g. op amps, have been developed, they are used to inject faults in the macro-circuit which, in turn, is used to develop fault models for the macro-circuit such that the fault-free and faulty circuits can be simulated using an efficient behavioral simulator.
4. Case study
The general approach to fault modeling, outlined above, has been applied to two op amp circuits, an unbuffered CMOS op amp with a p-channel input pair, and an nMOS op amp consisting of enhancement and depletion mode transistors [19]. Figs. 1 and 2 show the circuit schematics of the op amps. Table 1 gives the specifications of the fault-free op amps.
4.1. Fault injection
The first step of fault injection requires a faultlist at the desired level, in this case the circuit level, which is extracted from the layout and defect statistics. From the
defect simulation results published in the literature, a list of the most likely faults is compiled. The faultlist at the circuit level consists of shorts, opens, bridging faults, variations in the characteristics of active components, e.g. transistor transconductance, and variations in the values of passive components. The op amps considered in this study were small enough (fewer than 20 transistors) to simulate faults in all the components. Both dead and resistive shorts are considered and are modeled by a small resistor between the two nodes, while a break in a line is injected by inserting a large resistor at that point. Breaks producing floating gates, however, cannot be modeled as a large resistor in series with the gate, as this does not capture the effect of voltages induced by neighboring nodes. Renovell and Cambon [20] provide a detailed
Fig. 2. nMOS operational amplifier.
Table 1. Performance specifications of the Op Amps as predicted by SPICE.
                        nMOS Op Amp      CMOS Op Amp
Power supply            VDD = 5 V        VDD = 5 V
                        VSS = -5 V       VSS = -5 V
                        VBB = -10 V
Power consumption       2.8 mW           2.46 mW
Open-loop gain          68 dB            74 dB
Unity-gain bandwidth    2 MHz            1 MHz
Input voltage range     ±3.5 V           -4 to +3.8 V
CMRR                    60 dB            60 dB
PSRR for VDD            82 dB            81.76 dB
     for VSS            60 dB            95.23 dB
Output voltage swing    ±3.7 V           ±4.75 V
Slew rate               ±4 V/µs          +10 V/µs, -7 V/µs
Fig. 3. DC characteristics of faulty CMOS OP Amp.
Fig. 4. AC characteristics of faulty CMOS OP Amp.
electrical analysis and modeling of floating-gate faults. The change in transistor transconductances is incorporated by changing their W/L ratios.
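The injection rules above (shorts as small resistors, breaks as large resistors, transconductance faults as W/L scaling) amount to mechanical netlist edits. The sketch below is a hedged illustration: the SPICE card format is simplified, the element and node names are invented, and a real injector would parse device cards properly rather than string-splitting on "W=".

```python
R_SHORT = 10.0   # ohms: dead or resistive short between two nodes
R_OPEN = 1e9     # ohms: break in a line

def inject_short(netlist, node_a, node_b, r=R_SHORT):
    # A short is injected as a small resistor between the two nodes.
    return netlist + [f"Rfault {node_a} {node_b} {r}"]

def inject_open(netlist, node_a, node_b, r=R_OPEN):
    # A break is injected as a large series resistor at that point.
    return netlist + [f"Rbreak {node_a} {node_b} {r}"]

def inject_wl_change(netlist, device, scale):
    # Transconductance faults are injected by scaling the W/L ratio.
    out = []
    for line in netlist:
        if line.startswith(device + " ") and "W=" in line:
            head, w = line.rsplit("W=", 1)   # assumes W= is the last field
            line = f"{head}W={float(w) * scale:g}"
        out.append(line)
    return out

nl = ["M1 out inp vdd vdd PMOS L=2e-6 W=2e-5"]
nl = inject_wl_change(nl, "M1", 0.8)   # 20% W/L reduction
nl = inject_short(nl, "d1", "g1")      # drain-gate resistive short
print(nl)
```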
4.2. Fault simulation
The op amps were simulated using SPICE for each fault that was injected. The simulations involved both the dc and ac response of the op amps. The results of the simulations for a demonstrative sample of faults are shown in Figs. 3 and 4 and summarized in Tables 2 and 3.
As seen in Table 1, an op amp is characterized by a large number of specifications, some of which are easy to test for, while others involve a large number of measurements and take long testing times. To detect a fault, it is not necessary to measure all the specifications. Instead, the goal would be to try to identify a subset of tests that would detect all the faults. In the sample fault set considered, all the faults reflected a
Table 2. Selected faults in the nMOS Op Amp.

Fault              Faulty behavior in open-loop response (Vin = 0.4 mV)
Inputs shorted     Noisy output about Vio
M1 D-S short       Output stuck-at -4 V
M1 D-G short       Reduced gain
M1 G-S short       Vio = 1.5 V
M5 D-S short       Vio = -3 V
M5 D-G short       Reduced gain
M5 G-S short       Output stuck-at -3.7 V
M12 D-S short      Non-linear DC characteristics
M9 D-G short       Non-linear DC characteristics
M10 D-G short      Inverted DC characteristics
M11 W/L change     Vio = 0.5 V
M12 W/L change     Non-linear DC characteristics
Table 3. Selected faults in the CMOS Op Amp.

Fault               Faulty behavior in open-loop response
M1 D-S short        Output stuck-at -5 V
M1 D-G short        Output stuck-at -5 V
M1 G-S short        Vio = 0.5 V
M2 D-G short        Inverted DC characteristics
M4 G-S short        Reduced output voltage swing
M3 G-S open         Output stuck-at -5 V
M8 floating gate    Reduced output voltage swing
M5 floating gate    Output stuck-at -4.7 V
faulty behavior in the open loop gain measurements, either the dc response, or frequency response, or both. An important observation from the faulty behavior that affects the fault model and test generation follows.
4.3. Nonlinear behavior under fault
Certain faults, e.g. the drain-gate short in transistor M7 of the CMOS op amp in Fig. 3, result in nonlinear dc characteristics. This faulty behavior is normally not taken into consideration while testing, since dc characteristics are extrapolated by performing measurements at only a few voltages (typically min, max and mid). This is a case of undertesting, as the nonlinear fault may remain undetected. However, once a fault model incorporating this nonlinear behavior is developed, a fault-targeted test generator can generate a test for it. The fault models developed in this paper do not address these nonlinear faults. However, the fault modeling approach described here can be used to extend the models
Hierarchical Fault Modeling for Linear Analog Circuits 95
Fig. 5. Nominal Op Amp model.
Fig. 6. Fault model with external resistors.
to account for these faults.
4.4. Formation of the fault model
Structural and behavioral op amp fault models, based on the results of the entire study of op amp faults, are now developed. A structural fault model can be incorporated in the nominal op amp macromodel of Fig. 5 by the use of external resistors as shown in Fig. 6 [6]. The external resistors are connected to fixed supplies or ground depending on what value the output is stuck at, hence modeling only the catastrophic faults. The fault model shown in Fig. 7 models parametric faults. A voltage source and a current source are introduced in the nominal model to account for the changes in the input offset voltage and the gain of the faulty op amp. Table 4 lists the internal current source values for the faulty op amp model that yield the same output gain as the faulty op amps. The importance of developing these fault models stems from the fact that circuit simulation of large flat circuits is not feasible, but may be made tractable by modeling op amps for their nominal and faulty behavior. This is useful in the fault simulation step while generating automatic test sequences for analog circuits. However, for large circuits even this may not be practical, since circuit simulation may take enormous time. Hence we need to develop behavioral fault models for op amps.
The faulty behavior can be categorized as catastrophic and out-of-specification. Catastrophic faults
Fig. 7. Fault model with internal source, F.
Table 4. Internal current source for fault model.

Fault             Op Amp gain    Model (F)
Good              68 dB          F = 0
Inputs shorted    41 dB          F = -1.3643E-4
M1 W/L change     66.27 dB       F = -2.5243E-5
result in a complete malfunction of the op amp and are usually caused by dead shorts or faults in critical components. Out-of-specification faults are caused by some resistive shorts or parameter variations of transistors that are not so critical. The op amp still continues to function, but with a degraded performance. The op amp behavioral fault model is summarized in Table 5.
Behavioral fault models of the good and faulty op amps were obtained by fitting their response to a linearized rational function model using the moment-matching technique of AWE as described earlier. The RICE [21] software library was used to generate the poles and residues. For this case study, a 2-pole model
Table 5. Op Amp behavioral fault model.

Catastrophic faults:
  Stuck-at Vdd
  Stuck-at Vss
  Stuck-at Vpos (positive saturation voltage)
  Stuck-at Vneg (negative saturation voltage)
  Shorted inputs: noisy output centered around 0 or Vio (input offset voltage)

Out-of-specification faults:
  Reduced open-loop gain
  Increased input offset voltage
  Reduced output voltage swing
  Nonlinear dc characteristics
  Shift in dominant pole
Fig. 8. Integrator.
Fig. 9. Good Op Amp and model response.
proved to be a good approximation. The behavioral model for the good op amp and its fault model are given by

\[
X(s) = \frac{1.40}{s + 4.55 \times 10^{-7}} + \frac{1.15 \times 10^{2}}{s + 2.55 \times 10^{-5}}
\]

\[
X_f(s) = \frac{5.94 \times 10^{-2}}{s + 4.33 \times 10^{-8}} + \frac{8.17}{s + 5.37 \times 10^{-6}}
\]
The response for the good and faulty (change in the W/L ratio of transistor M6) CMOS op amp and the models are shown in Figs. 9 and 10. The models developed for an op amp can now be used to develop fault models for higher-level macro-circuits.
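Pole-residue models of this form are cheap to evaluate directly. The sketch below takes the printed pole/residue values at face value (units exactly as printed, which may be scaled) and compares the good and faulty magnitude responses; the frequency range is chosen only to bracket the printed pole values.

```python
import numpy as np

def pole_residue(s, poles, residues):
    # X(s) = sum_i k_i / (s + a_i), with (a_i, k_i) read off the
    # printed models above
    return sum(k / (s + a) for k, a in zip(residues, poles))

good   = lambda s: pole_residue(s, [4.55e-7, 2.55e-5], [1.40, 1.15e2])
faulty = lambda s: pole_residue(s, [4.33e-8, 5.37e-6], [5.94e-2, 8.17])

s = 1j * np.logspace(-9, -3, 200)      # sweep in the printed units
good_db = 20 * np.log10(np.abs(good(s)))
fault_db = 20 * np.log10(np.abs(faulty(s)))

# The faulty model shows the reduced gain and shifted dominant pole.
print(good_db[0] > fault_db[0], good_db[0] > good_db[-1])
```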
Fig. 10. Faulty Op Amp and model response.
Fig. 11. Good integrator and model response.
4.5. Macro-circuit fault modeling for an integrator
This section illustrates the hierarchical fault modeling procedure for analog macro-circuits by developing fault models for an integrator. This in turn can be used to model faults in filters, etc. Analog faults that are responsible for performance outside the specified margins are difficult to identify, since the fault effect at the system level depends not only on the fault itself, but also on the circuit in which it is embedded. For example, consider the fault in the op amp described in the previous section. It causes the op amp open loop dc characteristics to become non-linear, and the gain is degraded too. But when this fault occurs in an op amp used in an inverting amplifier configuration, the fault has no effect on the performance of the amplifier and is masked. On the other hand, a circuit which consists of a number of stages could display faulty behavior due to distributed fault effects. Each individual stage may differ from the nominal response within its specified tolerance, but the combined effect may result in the circuit malfunction.
In this section we illustrate the hierarchical fault modeling approach by using the op amp fault model to study the faulty behavior of an integrator, shown in Fig. 8. The op amp in the integrator is replaced by the good and faulty models, and this modeled response is compared with the actual response of the integrator, as shown in Figs. 11 and 12 respectively. It is observed that the hierarchical approach using the behavioral model of the op amp produces a response which closely matches the actual behavior of the circuit. This will hold as long as the op amp model is accurate over at least the frequency range of interest of the macro-circuit.
The model approximation approach using AWE can
Fig. 12. Faulty integrator and model response.
Fig. 13. Lossy integrator.
be used to generate the fault model for the integrator too. As an example, consider the lossy integrator shown in Figure 13, whose fault-free transfer function can be represented as
\[
H(s) = \frac{-\omega_1}{s + \omega_2}
\]

where ω1 = 1/(R1 C1) and ω2 = 1/(R2 C1). The faulty response of the integrator is obtained by replacing the op amp with its fault model and using RICE to generate the poles and residues for the integrator fault model. Here, R1 = R2 = 10 kΩ and C1 = 0.02 µF. For a sample fault, a 2-pole fault model for the integrator is given by

\[
H_f(s) = \frac{-2.237 \times 10^{3}}{s + 2.107 \times 10^{3}} + \frac{2.237 \times 10^{3}}{s + 3.407 \times 10^{4}}
\]
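With the printed element values, ω1 = ω2 = 1/(10 kΩ × 0.02 µF) = 5000 rad/s, so the fault-free DC gain magnitude is exactly 1. The sketch below is a quick numerical sanity check of the two transfer functions (not the authors' verification flow): the faulty 2-pole model still gives a DC gain magnitude near 1 but with shifted poles.

```python
R1 = R2 = 10e3          # ohms
C1 = 0.02e-6            # farads
w1 = 1.0 / (R1 * C1)    # 5000 rad/s
w2 = 1.0 / (R2 * C1)    # 5000 rad/s

H_good = lambda s: -w1 / (s + w2)
H_fault = lambda s: -2.237e3 / (s + 2.107e3) + 2.237e3 / (s + 3.407e4)

print(abs(H_good(0)))                  # → 1.0
print(round(abs(H_fault(0)), 3))       # → 0.996
```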
4.6. Application of fault models
We now illustrate the use of the fault models developed so far to enable the fault simulation of an analog filter. The biquadratic filter is shown in Figure 14 and consists of three stages: inverter, integrator and a lossy integrator. The signal flow graph of the fault free filter is shown in Figure 15 and can be used to compute
Fig. 14. Biquadratic filter.
Fig. 15. Signal flow graph of biquadratic filter.
the state equations for efficient behavioral simulations. The faults in this circuit can be classified into three different categories:
• Connectivity faults (capacitive/resistive shorts and opens)
• Parameter variations in the passive components (resistors, capacitors)
• Faults within the op amps

Connectivity faults will change the topology of the circuit, and a new signal flow graph will need to be constructed. Parameter variations in the passive components, on the other hand, will only change the weights of the arcs in the signal flow graph. In order to address faults within the op amps, they must first be modeled as described earlier. For example, the 2-pole model for the fault in the lossy integrator that was developed in the previous section can be directly plugged into the signal flow graph as shown in Figure 16. An efficient behavioral fault simulator based on this technique has been developed and is described in [22].
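The "plug the fault model into the graph" step can be sketched by representing each block as an s-domain function and solving the loop algebraically at each frequency. The topology below is a generic two-integrator feedback loop invented for illustration, not the exact biquad of Fig. 14, and w0 is arbitrary; the point is only that the faulty lossy stage drops in without changing the simulation code.

```python
import numpy as np

# Blocks as s-domain transfer functions (illustrative topology)
w0 = 5000.0
inverter = lambda s: -1.0
integrator = lambda s: -w0 / s
lossy_good = lambda s: -w0 / (s + w0)
# The 2-pole fault model of the lossy stage is plugged in unchanged:
lossy_fault = lambda s: -2.237e3 / (s + 2.107e3) + 2.237e3 / (s + 3.407e4)

def closed_loop(s, lossy):
    # Solve the flow graph algebraically: y = K1(s)*K2(s)*(u + inv*y)
    fwd = integrator(s) * lossy(s)
    return fwd / (1.0 - fwd * inverter(s))

w = np.logspace(1, 5, 200)
good = np.abs([closed_loop(1j * x, lossy_good) for x in w])
faulty = np.abs([closed_loop(1j * x, lossy_fault) for x in w])
print(f"low-frequency gain: good {good[0]:.3f}, faulty {faulty[0]:.3f}")
```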
5. Conclusions and future work
A hierarchical fault modeling approach for analog circuits has been proposed. To illustrate this, comprehensive fault models for op amps and an integrator have been developed. The results validate the approach for analog fault modeling based on error responses, and show promise in developing fault models for other macro-circuits, including sample-and-hold
Fig. 16. Signal flow graph with faulty integrator.
circuits, phase-locked loops, switched capacitor filters, etc. The lack of suitable analog fault models has been the prime reason for restricting the problem of analog test to the functional domain. The development of comprehensive fault models for catastrophic as well as parametric ac and dc faults in passive and active circuits opens up the ground for a more quantitative fault-based approach to analog testing. These fault models can enable the implementation of a behavioral fault simulator and a structured framework for test generation for analog and mixed-signal circuits.
Future work should extend the fault models developed here to include non-linear effects due to faults as well as for non-linear circuits. A possible application of the macromodeling technique described in [11] could be considered for this. At the same time, an efficient simulation approach for these nonlinear models and circuits would be required. Another area for future work is in fault compaction. We are looking at this in terms of equivalence, dominance and redundancy in the analog context.
Acknowledgements
The authors would like to thank the reviewers for their valuable suggestions that helped improve the quality of this paper.
This research was supported in part by General Electric Company and in part by the National Science Foundation under grant MIP-9222481. This work was performed when the first author was with the University of Texas at Austin.
References
1. P. M. Lin and Y. S. Elcherif, "Analogue circuits fault dictionary: New approaches and implementation." Circuit Theory and Applications 12, pp. 149-172, John Wiley & Sons, 1985.
2. M. J. Marlett and J. A. Abraham, "DC-IATP: An iterative analog circuit test generation program for generating DC single pattern tests," in Proc. IEEE International Test Conference, pp. 839-845, 1988.
3. L. Rapisarda and R. DeCarlo, "Analog multifrequency fault diagnosis." IEEE Trans. Circuits Syst. CAS-30, pp. 223-234, April 1983.
4. T. M. Souders and G. N. Stenbakken, "A comprehensive approach for modeling and testing analog and mixed-signal devices," in Proc. IEEE International Test Conference, pp. 169-176, 1990.
5. L. Milor and A. Sangiovanni-Vincentelli, "Optimal test set design for analog circuits," in Proc. IEEE ICCAD, 1990.
6. A. Meixner and W. Maly, "Fault modeling for the testing of mixed integrated circuits," Research report No. CMUCAD-91-6, Feb. 1991.
7. M. Soma, "A design-for-test methodology for active analog filters," in Proc. IEEE International Test Conference, pp. 183-192, 1990.
8. C-Y. Pan, K-T. Cheng, and S. Gupta, "A comprehensive fault macromodel for opamps," in Proc. ICCAD, pp. 344-348, 1994.
9. H. Hao and E. J. McCluskey, "'Resistive shorts' within CMOS gates," in Proc. IEEE International Test Conference, pp. 292-301, 1991.
10. R. Harvey, A. Richardson, E. Bruls, and K. Baker, "Analog fault simulation based on layout dependent fault models," in Proc. IEEE International Test Conference, pp. 641-649, 1994.
11. J. Shao and R. Harjani, "Macromodeling of analog circuits for hierarchical circuit design," in Proc. ICCAD, pp. 656-663, 1994.
12. N. Nagi and J. A. Abraham, "Hierarchical fault modeling for analog and mixed-signal circuits," in Proc. IEEE VLSI Test Symposium, pp. 96-101, 1992.
13. J. P. Shen, W. Maly, and F. J. Ferguson, "Inductive fault analysis of MOS integrated circuits." IEEE Design and Test 2, pp. 13-26, Dec. 1985.
14. D. M. H. Walker and S. W. Director, "VLASIC: A catastrophic fault yield simulator for integrated circuits." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 541-556, Oct. 1986.
15. S. R. Nassif, A. J. Strojwas, and S. W. Director, "FABRICS II: A statistically based IC fabrication process simulator." IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 40-46, Jan. 1984.
16. B. R. Epstein, M. Czigler, and R. Miller, "Fault detection and classification in linear integrated circuits: An application of discrimination analysis and hypothesis testing." IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, pp. 102-113, Jan. 1993.
17. T. Yu, S. Kang, I. Hajj, and T. Trick, "Statistical performance modeling and parametric yield estimation of MOS VLSI circuits." IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, Nov. 1987.
18. L. T. Pillage and R. A. Rohrer, "Asymptotic waveform evaluation for timing analysis." IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, Apr. 1990.
19. P. A. Allen and E. Sanchez-Sinencio, Switched Capacitor Circuits. Van Nostrand Reinhold Company, 1984.
20. M. Renovell and G. Cambon, "Topology dependence of floating gate faults in MOS integrated circuits." Electronics Letters 22(3), pp. 152-153.
21. C. L. Ratzlaff, N. Gopal, and L. T. Pillage, "RICE: Rapid Interconnect Circuit Evaluator," in Proc. ACM/IEEE Design Automation Conference, 1991.
Hierarchical Fault Modeling for Linear Analog Circuits 99
22. N. Nagi, A. Chatterjee, and J. A. Abraham, "Fault simulation of linear analog circuits." Journal of Electronic Testing: Theory and Applications 4, pp. 345-360, 1993.
Naveena Nagi received her B.E. in Electronics and Communication Engineering from the University of Roorkee, India, in 1986, her M.S. in Computer Engineering from the University of Southern California in 1989, and her Ph.D. in Electrical Engineering from the University of Texas at Austin in 1994. She is currently at LogicVision, San Jose, CA.
Her research interests are in the fields of testing, fault modeling and fault simulation of digital, analog and mixed-signal circuits.
Jacob A. Abraham is a Professor in the Department of Electrical and Computer Engineering at the
University of Texas at Austin. He is also director of the Computer Engineering Research Center and holds a Cockrell Family Regents Chair in Engineering. He received the Bachelor's degree in Electrical Engineering from the University of Kerala, India, in 1970. His M.S. degree, in Electrical Engineering, and Ph.D., in Electrical Engineering and Computer Science, were received from Stanford University, Stanford, California, in 1971 and 1974, respectively. From 1975 to 1988 he was on the faculty of the University of Illinois, Urbana, Illinois.
Professor Abraham's research interests include VLSI design and test, formal verification, and faulttolerant computing. He is the principal investigator of several contracts and grants in these areas, and a consultant to the industry and government on testing and fault-tolerant computing. He has over 200 publications, and has supervised over 30 Ph.D. dissertations. He was elected Fellow of IEEE in 1985, and is also a member of ACM and Sigma Xi. He has served as an associate editor of the IEEE Transactions on VLSI Systems, and as a chair of the IEEE Computer Society Technical Committee of Fault-Tolerant Computing.
Analog Integrated Circuits and Signal Processing, 10, 101-117 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Using Top-Down CAD Tools for Mixed Analog/Digital ASICs: a Practical Design Case
STEPHANE DONNAY1, GEORGES GIELEN1*, WILLY SANSEN1, WIM KRUISKAMP2, DOMINE LEENAERTS2, STEVEN BUYTAERT3, KATRIEN MARENT3, MARC BUCKENS3 AND CARL DAS3

1 Katholieke Universiteit Leuven, Dept. Elektrotechniek, ESAT-MICAS, Kardinaal Mercierlaan 94, B-3001 Heverlee, Belgium; 2 Eindhoven University of Technology, Dept. of Electrical Engineering, P.O. Box 513, 5600 MB Eindhoven, The Netherlands; 3 IMEC, INVOMEC division, Kapeldreef 75, B-3001 Heverlee, Belgium
Abstract. A mixed analog/digital ASIC from a real satellite application (a radiation detector front-end) has been designed, simulated and processed according to a hierarchical top-down design methodology. CAD tools (commercial and academic) have been used as much as possible. The top-down methodology is discussed and illustrated by going through the different steps of the ASIC design. At each level the different choices and tradeoffs are briefly discussed and practical difficulties of top-down design are pointed out. One of the most important problems in top-down mixed-signal ASIC design-modeling and verification-is highlighted and discussed in detail.
1. Introduction
Advances in integrated circuit processing technologies offer designers the possibility of integrating complex systems onto a single application specific integrated circuit (ASIC). An increasing part of these integrated systems contain analog as well as digital circuits. The drive towards mixed-signal analog/digital ASICs is posing a serious problem concerning design tools. The complexity of the systems that can be integrated on a single ASIC can only be mastered by using advanced computer-aided design (CAD) tools. For the digital circuits, simulation and synthesis tools (logic, high-level) have been around for some years and a considerable part of the digital design flow has been automated.
The growing interest in mixed-signal ASICs however is now exposing the lack of mature analog CAD tools, particularly at levels higher than the opamp level. In addition, the specific problems posed by the integration of analog and digital circuits on the same chip (such as substrate couplings or crosstalk) are not sufficiently handled by existing digital-oriented tools. As a result, several important tools are missing today to cover the design flow of a mixed-signal ASIC.
* research associate of the Belgian National Fund of Scientific Research
In section 2 a complete top-down design methodology for mixed-signal ASICs is discussed, with emphasis on the computer-aided synthesis of the analog part. An essential part of top-down design is the aspect of modeling and simulation. Therefore the different simulation needs are identified, and the modeling and simulation methodology employed for the practical design case is discussed in detail. The top-down design flow described in section 2, which is now being adopted more and more in industry, has been followed as much as possible during the design of a radiation detector interface ASIC. This practical design case is discussed in detail in section 3. The different design steps are discussed, the advantages and drawbacks of the methodology are indicated, and the choices and trade-offs at each level are identified. Finally, in section 8 some conclusions are presented.
2. Top-Down Design Methodology for Mixed Analog/Digital ASICs
2.1. Top-Down Design Flow
This section first describes briefly the complete topdown design methodology for mixed-signal ASICs, shown schematically in figure 1. The different steps in this methodology are:
102 Donnay et al.
Fig. 1. Outline of the design flow of a mixed-signal chip.
• system-level specification - First the functionality of the complete ASIC should be described in some formal way independent of the way of implementation. The system-level specification is then verified with system-level simulation to check the functionality of the chip within its system context. Due to the lack of mixed-signal HDLs at this level, ad hoc approaches are usually followed. Often "conceptual simulation" of the complete system is performed by means of a dedicated program in some general purpose programming language or with mathematical software tools such as MATLAB/Simulink or Mathcad.
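A "conceptual simulation" of the kind described can be as simple as a short script. The toy sketch below models a detector-front-end-like signal chain (charge pulse, first-order shaper, quantized output); every constant and stage model here is invented for illustration and does not describe the actual ASIC.

```python
import numpy as np

# Toy conceptual simulation of a front-end signal chain:
# charge pulse -> first-order RC shaper -> sampled, 8-bit quantized output.
fs = 1e6                        # sample rate, Hz (illustrative)
t = np.arange(0, 200e-6, 1 / fs)
pulse = (t > 20e-6) * np.exp(-(t - 20e-6) / 5e-6)   # injected charge pulse

tau = 10e-6                     # shaper time constant (illustrative)
alpha = (1 / fs) / tau
shaped = np.zeros_like(pulse)
for i in range(1, len(pulse)):                      # discrete RC shaper
    shaped[i] = shaped[i - 1] + alpha * (pulse[i] - shaped[i - 1])

adc = np.round(shaped / shaped.max() * 255)         # 8-bit quantization
print(int(adc.max()))  # → 255
```

Such a script checks functionality only; it says nothing about the analog implementation, which is exactly why the later, lower-level verification steps are still needed.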
• system-level design - The next step is the design of a system-level architecture, including the partitioning into digital, DSP and analog parts, as well as issues like package selection, pin placement, initial floorplanning and test strategy development. Each of the different parts (digital, DSP and analog) is then described separately by behavioral models. A mixed-mode simulator is used to verify the correctness and consistency of this description with the system-level specification [1], [2].
• synthesis - Starting from the corresponding behavioral descriptions, the digital, DSP and analog parts are then synthesized separately. Afterwards, the complete floor-plan can be generated and the wiring parasitics estimated. Testability is taken into account here as well. The complete chip can now be simulated at transistor level to verify the chip performance. When the specifications are not met, the design has to be restarted: a new iteration with a different partitioning, different specifications for the blocks or a different floor plan has to be tried.
• layout - If the chip performance is within the specifications, the layout can be generated. Different tools are used to generate the layouts of the analog and digital parts separately. These layouts then have to be assembled by mixed-signal block place&route, maintaining timing constraints and signal integrity, and the real parasitics can be extracted. If the simulated chip performance is still within the specifications, mask production can be started after ERC and DRC. If not, new iterations are necessary as shown in figure 1.

With growing integration complexities and demands for optimization (e.g. for low power), combined with tightening times-to-market, top-down design is more and more being adopted in industry nowadays. Some of the advantages of this methodology are: reduction of design time and of design errors (redesign runs) by careful simulations, explorations and architectural optimizations before continuing detailed implementation down the hierarchical design tree. Test issues can also be included right from the beginning. Challenges of the approach are the hierarchical partitioning and the propagation of specifications down the hierarchy, as will be pointed out for the specific design case presented in this paper. Emphasis will also be put on the simulation and verification aspects of top-down design. The following sections now highlight the major steps in the design flow in more detail.
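As a concrete illustration of the "conceptual simulation" mentioned for the system-level specification step, the behavior of one acquisition channel can be modeled in a few lines of a general-purpose language. The sketch below is our own illustration (the 20 mV/fC gain and 8-bit resolution come from tables 1 and 2; all other names and values are illustrative assumptions): it models a channel as an ideal gain stage with additive noise followed by an 8-bit quantizer and checks the worst-case deviation in LSBs.

```python
import numpy as np

def conceptual_chain(charge_fc, gain_mv_per_fc=20.0, noise_mv_rms=1.0,
                     full_scale_mv=1000.0, bits=8, rng=None):
    """Behavioral model of one acquisition channel (illustrative sketch):
    charge -> voltage (ideal gain) -> additive noise -> 8-bit quantizer."""
    rng = rng or np.random.default_rng(0)
    v = gain_mv_per_fc * np.asarray(charge_fc, dtype=float)
    v_noisy = v + rng.normal(0.0, noise_mv_rms, size=v.shape)
    lsb = full_scale_mv / 2 ** bits
    codes = np.clip(np.round(v_noisy / lsb), 0, 2 ** bits - 1).astype(int)
    return codes, lsb

# "Simulate" a ramp of input charges and measure the worst-case error in LSBs.
charge = np.linspace(1.0, 45.0, 200)          # input charge in fC
codes, lsb = conceptual_chain(charge)
v_ideal = 20.0 * charge                        # ideal front-end output in mV
err_lsb = np.max(np.abs(codes * lsb - v_ideal)) / lsb
```

A script of this kind serves the same purpose as the MATLAB/Mathcad "conceptual simulations" mentioned above: it checks, before any implementation choice, whether the noise budget is compatible with the required resolution.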
2.2. System-Level Design
Ideally a mixed-signal silicon compiler should be able to synthesize the complete mixed analog/digital ASIC starting from the system-level formal specification, which is independent of the implementation methodology. This would imply an automatic partitioning of the functionality of the system into a part that will be implemented in software and a part that will be implemented in hardware. The latter will be further partitioned into parts that will be implemented using digital or DSP techniques and parts that will be implemented with analog circuitry, resulting in a system-level block architecture for the ASIC. Such automatic partitioning is not likely to be feasible in the near future. Therefore, the designer today has to partition the chip manually. The designer then uses different HDLs to describe the different hardware parts separately: e.g. VHDL [3] for the digital part, VHDL or Silage [4] for the DSP part and VHDL-A [5], [6] for the analog part.
For the synthesis of the digital part and the DSP part several commercial tools exist nowadays. Therefore, we will focus on the synthesis of the analog part in the next section.
2.3. Analog Synthesis
Hierarchical Structural Refinement A hierarchical design methodology is required to master the complexity of a complete ASIC. In the analog domain,
Fig. 2. Hierarchical structural refinement strategy for the analog part. [Figure: specifications at level i are translated by the design step at level i into specifications at level i+1; the layout at level i+1 is assembled bottom-up into the layout at level i.]
however, hierarchical levels are not strictly defined and are mainly used to structurally decompose the global, complex design task into smaller, more manageable subtasks.
Typical levels that are considered are:
• sub-chip level - where the analog part is a subsystem of the complexity of an analog signal acquisition chain, like the radiation detector interface front-end discussed in section 3.
• module level - analog modules like an analog-to-digital converter (ADC).
• circuit level - analog circuits like an opamp, a comparator or a voltage reference.
• device level - transistors, resistors and capacitors.

In recent years a number of academic tools have tried to automate the synthesis of the analog circuitry. An excellent overview is given in [7]. Most of these tools implement the automatic transition from the circuit level to the device level (e.g. the design of operational amplifiers). Some have tried to automate the synthesis of analog modules ("analog module generation") like ADCs or analog filters. An overview of the state of the art in CAD tools for data converters, for instance, is given in [8].
K.U.Leuven has recently developed an analog module generator [9] that is able to design analog functional modules like amplifiers, filters, etc., either in an automatic or in an interactive way, starting from performance specifications all the way down to layout. The analog module generator is technology-process tolerant, easily extendible with new topologies and integrated in a commercial EDA framework.
The design strategy employed in the analog module generator is hierarchical and performance-driven [10], [11], [12], as is being used in most analog synthesis approaches nowadays. It is schematically shown in figure 2. In between any two hierarchical levels i and i + 1, the following steps are performed: • top-down:
- topology selection
- specification translation (or "sizing and optimization")
- verification after sizing
• bottom-up:
- layout generation and extraction
- verification after layout
Analog High-level Synthesis For some mixed A/D ASICs an analog module generator like the one described above [9] may be all that is needed for the analog part. This is the case when the ASIC consists of a large digital core and only a small and not too complex analog part (e.g. only an ADC, a DAC, some amplifiers). A digital synthesis tool with a link to an analog module generator or an analog cell library may be adequate for the design of such an ASIC. However, the complexity of the analog part on mixed A/D ASICs keeps on increasing. For the design case of the
Fig. 3. High-level design of a radiation detector interface circuit. [Figure: an AHDL description of the radiation detector interface plus its performance specification is refined, iterating while the specifications are not met (NOK), into a radiation detector interface architecture plus a full set of module specifications (speed, accuracy, power, gain, ...).]
radiation detector interface described in section 3, an analog module generator was not sufficient anymore. In this case the transition from the analog sub-chip level to the module level was not trivial: an architecture consisting of modules had to be generated that can meet the specifications for the analog part on the ASIC. This step is what we call analog 'high-level' synthesis. At the moment this step is always performed manually by the designer.

The hierarchical structural refinement strategy discussed above can be applied to the analog high-level synthesis as well [13]. This is illustrated in figure 3, where we can again recognize the three basic design tasks, applied to the radiation detector interface ASIC: topology selection (called "architecture generation" at this level), specification translation and verification.

These three steps of the analog high-level synthesis and the module generation will be illustrated in detail for the practical design case in section 3.
2.4. Simulation Needs in the Mixed-Signal Design Methodology
Simulation is a crucial part of the design methodology. After each step in the top-down synthesis / bottom-up assembly methodology, simulations are required to verify the correctness of the previous steps, in order to reduce the overall number of design iterations in the design flow. The most difficult simulations are those where large and/or mixed-signal circuits have to be simulated. In the design flow of figure 1 the most demanding simulations are:
• verification of the system-level design and partitioning,
• global verification after synthesis (before layout),
• final verification after layout assembly and extraction.
All three are mixed-signal simulations of the complete chip. For the first one the exact implementation of the different parts is not yet known. Behavioral models are therefore required to simulate the complete chip. The latter two verifications ideally are transistor-level simulations. However, for practical reasons of CPU time, behavioral or macro-models
often have to be used here as well. We will now investigate some of the problems encountered in these simulations.
Simulation Bottlenecks in Large Mixed-Signal Circuits
• DC convergence problems - The simulation of a huge netlist is not trivial. Simulators such as SPICE use a Newton-Raphson iteration algorithm to find the operating point. Unfortunately, these kinds of algorithms are only locally convergent. When the initial situation is too far away from the real solution, the algorithm may not be able to converge. For small circuits, such as a single opamp, this problem can easily be overcome by setting the initial node voltages manually to suitable values. However, for a large and complex system such as the radiation detector interface chip, the initial voltages cannot easily be estimated anymore by the designer. Most simulators will therefore have problems finding the right operating point, required to start the simulation.
• long simulation times - Another problem with simulating the complete netlist is the simulation time. When the entire chip is modeled at the transistor level, even a short simulation (in simulated time scale) may take several days in real time. This is of course always much faster and cheaper than real processing and testing, but still not optimal. Especially in an automated design process where several iterations are possible and this final verification might have to be done several times, simulation times really have to be decreased. This can be done by using macro-models or behavioral models. However, these models must be developed with great care, in order not to simplify the chip behavior too much and miss important effects. This will be illustrated for our practical design example in section 3.6.
• interfacing problems between analog and digital - The description of a large mixed-signal chip will often be a diverse mix of all kinds of models: analog models as well as digital models, both at several hierarchical levels. Usually, such a mixed-signal chip is simulated by a combination of a digital event-driven simulator and an analog SPICE-like simulator [2]. A control program is used to make sure that each simulator is provided with the required data from the other simulator at the right moment. This approach works perfectly when there is little interaction between the analog and the digital parts. However, when there is strong interaction between the analog and the digital parts, and especially when there are loops consisting of analog and digital parts, the control of the two simulators becomes a difficult task.
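The local nature of Newton-Raphson convergence can be demonstrated on the smallest possible example, a diode with a series resistor (an illustrative sketch of the mathematics, not one of the simulators discussed here): from an initial guess near the operating point the iteration settles in a handful of steps, while from a poor guess each Newton step moves the diode voltage by only about one thermal voltage, so a normal iteration limit is exhausted long before the operating point is reached.

```python
import math

def newton_diode(v0, vs=5.0, r=1e3, i_s=1e-12, vt=0.025,
                 tol=1e-9, max_iter=50):
    """Newton-Raphson on the KCL residual of a diode with series resistor:
    f(v) = (vs - v)/r - i_s*(exp(v/vt) - 1).
    Returns (v, iterations, converged)."""
    v = v0
    for k in range(1, max_iter + 1):
        e = math.exp(v / vt)
        f = (vs - v) / r - i_s * (e - 1.0)
        df = -1.0 / r - (i_s / vt) * e          # f'(v)
        step = f / df
        v -= step
        if abs(step) < tol:
            return v, k, True
    return v, max_iter, False

v_good, n_good, ok_good = newton_diode(0.65)   # guess near the solution
v_bad, n_bad, ok_bad = newton_diode(3.0)       # poor guess: no convergence
```

The second call illustrates why large netlists, where hundreds of node voltages would all have to be guessed well, give DC convergence trouble in practice.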
Piecewise Linear Simulation An alternative approach to overcome some of the above-mentioned difficulties is to describe all models in a piecewise linear (PL) format and use a single PL simulator to solve the entire network. The basic idea of PL is that all nonlinear functions are approximated by piecewise linear functions, as shown in figure 4 for the output current of a differential pair. It is beyond the scope of this paper to give a complete description of PL techniques (see e.g. [14], [15], [16]). Instead we will present the main advantages and drawbacks of PL, as well as practical problems encountered during the design of the radiation detector interface.
advantages of PL:
• One of the main advantages of PL techniques is that Newton-Raphson iteration algorithms no longer have to be used to solve a set of nonlinear equations. Since the nonlinear functions are described by a set of linear segments, one can exactly move along the curves to find the solution [16], instead of doing this in steps with the risk of missing the solution, as with Newton-Raphson algorithms. PL-based algorithms are therefore globally convergent in finding the solution of a set of nonlinear equations.
• A second advantage is that PL (like other commercial mixed-signal simulators nowadays) allows the use of behavioral or macro-models (see section 3.6). Simulation time can significantly be decreased by using such models. An additional advantage of PL with respect to behavioral modeling is the fact that digital models as well as behavioral models can often be constructed with only a few linear segments. For digital gates like AND, OR, etc. this is clear: the transfer function has a segment in which the output is low, a segment in which the output is high, and a segment which connects the two other segments. Analog circuits, like for instance opamps and OTAs, are most often used together with a negative feedback loop.
Fig. 4. Piecewise linear model of the output current of a differential pair.
Under normal conditions, the transfer function of the circuit can be approximated by a linear function, as long as the input signal is not rising too fast (slewing behavior) and the output voltage stays within a certain region (clipping behavior). These non-ideal effects can be modeled with only a few linear segments of a current source and a voltage source, respectively.
• In PL techniques, digital, analog and behavioral models can all be described in the same unified representation format. This allows the simulator to simulate a mixed-level, mixed-mode network with one single algorithm, which eliminates the interfacing problems between analog and digital simulation kernels encountered in glued mixed-signal simulators. This makes PL techniques very suitable for simulating a complete mixed analog/digital chip such as the radiation detector interface.
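The three-segment characteristic of figure 4 can be written down directly in this representation; the tail current and clipping limits below are illustrative assumptions, not extracted device values.

```python
def pl_diffpair_current(vd, i_tail=1.0, v_lim=1.0):
    """Three-segment piecewise linear model of a differential pair's
    output current (cf. figure 4): linear around vd = 0, clipped to
    +/- i_tail outside [-v_lim, +v_lim].  Units are illustrative."""
    gm = i_tail / v_lim          # slope chosen so the segments join continuously
    if vd <= -v_lim:
        return -i_tail           # left clipping segment
    if vd >= v_lim:
        return i_tail            # right clipping segment
    return gm * vd               # linear middle segment

samples = [pl_diffpair_current(v) for v in (-2.0, -1.0, -0.5, 0.0, 0.5, 2.0)]
```

Within each segment the model is exactly linear, which is what allows a PL solver to walk along the segments to a solution rather than iterate toward it.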
drawbacks of PL:
• There will always be some loss of accuracy due to the PL approximation. In fact one has to make a trade-off between the number of linear segments (which largely determines the simulation speed) and the accuracy.

Table 1. System-level specifications for the radiation detector interface ASIC.

Specification                 Value          Unit
Detector capacitance          80             pF
Number of input channels      4
Technology                    0.7 µm CMOS
Positive power supply         2.5            V
Negative power supply         -2.5           V
Noise of total system         ≤ 1            LSB
Frame time                    ≤ 5            µs
Output resolution             8              bits
Overall power consumption     ≤ 500          mW
Operating temperature         -30 / +50      deg C
Radiation tolerance           ≥ 50           kRad
• Another important disadvantage for transistor-level PL simulations is the lack of foundry support. Low-level transistor parameters are usually only available and optimized for certain (SPICE) transistor models. The PL transistor models therefore have to be constructed based on simulation results obtained with SPICE. This extra step from device to model makes the PL device models less accurate than foundry-supported models. In the case that an entire circuit has to be simulated at the device level, PL will not be a good choice. It will also be slower than SPICE for the same accuracy at the device level, since the CPU time increases drastically with the number of linear segments considered in the models. Nevertheless, when only some critical parts are modeled at the device level, the advantages of PL can compensate for the less accurate transistor models.
• The construction of macro-models or behavioral models (see section 3.6) can be a difficult task, because no general and systematic procedure exists to generate a PL macro-model. Some non-idealities are crucial for a proper functioning of the system; the inclusion of too many second-order effects, on the other hand, will result in inefficient simulations. The parasitic capacitances in a flip-flop, for example, cannot be omitted in the macro-model of this flip-flop, otherwise the PL simulator will get problems in the transition from one state to the other. For most other digital gates, however, parasitic capacitances are not required for a proper functioning. The selection of suitable macro-models (accurate enough, but nevertheless not too complex) will therefore always be a difficult compromise between efficiency and accuracy. This problem, however, is implicit to the use of macro-models and behavioral models in general.
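The trade-off between the number of linear segments and the accuracy can be made concrete by fitting uniform-breakpoint PL approximations to a smooth tanh-shaped device curve. The uniform-breakpoint interpolation scheme below is our own illustration, not the procedure used by any of the tools mentioned.

```python
import math

def pl_max_error(n_segments, x_min=-2.0, x_max=2.0, n_test=401):
    """Maximum error of a piecewise linear interpolant of tanh(x) built on
    n_segments uniform segments (n_segments + 1 breakpoints)."""
    xs = [x_min + (x_max - x_min) * i / n_segments for i in range(n_segments + 1)]
    ys = [math.tanh(x) for x in xs]

    def interp(x):
        # locate the segment containing x and interpolate linearly within it
        i = min(int((x - x_min) / (x_max - x_min) * n_segments), n_segments - 1)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    worst = 0.0
    for j in range(n_test):
        x = x_min + (x_max - x_min) * j / (n_test - 1)
        worst = max(worst, abs(interp(x) - math.tanh(x)))
    return worst

# Max PL approximation error versus the number of segments used
errors = {n: pl_max_error(n) for n in (2, 4, 8, 16, 32)}
```

The error shrinks roughly quadratically with the segment count, while the simulator's CPU time grows with it: this is exactly the accuracy/speed compromise discussed above.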
3. Practical Design Case: Radiation Detector Interface ASIC
The top-down methodology described in the previous sections will now be illustrated by going through the different steps of a practical ASIC design case. At each level the different choices, trade-offs and problems are briefly discussed.
3.1. System-Level Specification
The mixed-signal ASIC that was designed is based on the WIND project [17], which was realized as a breadboard by the European Space Agency (ESA) for on-satellite radiation measurements (in the Solid State Telescope experiment). The Solid State Telescope measures the energy of incoming particles. Radiation detector sensors are used to generate small charge pulses that are equivalent to the energy of the incoming particles. The radiation detector interface provides the link between the sensors and a digital signal processing core. Typically different sensors are processed in parallel. Each time a particle hits one of the detectors, charge packets are generated and have to be collected and transformed into a voltage that is equivalent to the total charge generated in the detector. When that voltage exceeds a user-programmable threshold, the "event" has to be recorded: the voltage is converted into a digital word which can further be processed by the DSP core. In our design case, the complete sensor interface circuitry for 4 sensors has been integrated on a single chip in 0.7 µm CMOS (without the sensor and the DSP core). The system-level performance specifications are summarized in table 1.
In today's design practice a description in natural language like the one above (but more detailed) will typically be used for a system-level specification (unlike the formal specification of the ideal design flow of figure 1). There clearly is a need for more formal methods of specifying system performance, because natural language may lead to misunderstandings. From this it is also clear that the functional description is not formally verified at the system level today. This problem of system-level specification for mixed-signal applications is an open research field.
3.2. System-Level Design
The system-level partitioning was very straightforward in this design. The boundary between analog and digital signal processing was quite clear from the start. As much signal processing as possible is done in the DSP domain. Only the high-speed processing of the weak and very noise-sensitive charge signals coming from the detectors has to be done in the analog domain (mainly filtering to increase the signal-to-noise ratio).
Also the boundary between the analog signal processing data-path and the digital controller is very obvious. Since the analog data-path and the controller are very tightly coupled, and critical timing loops exist over the analog-digital boundary, the controller will be designed as a digital module within the analog front-end. The further design of the analog front-end will be discussed in the next section.
3.3. Analog High-Level Synthesis
Architecture Generation The architecture cannot yet be generated automatically by CAD tools. An expert designer has to describe the architecture as an interconnection of analog modules and one digital module (the data-path controller). Once the architecture is known, however, the designer can use a number of tools for specification translation and behavioral verification (see below) that allow him to quickly explore different possible architectures at a high level.
Four sensor readout channels are put in parallel on the ASIC. The overall architecture of such a channel is well known for this application domain [18]. The charge packets coming from the detectors are integrated on a capacitor in the charge sensitive amplifier (CSA) and then shaped into a semi-Gaussian pulse in a pulse shaping amplifier (PSA). This PSA is a bandpass filter consisting of a differentiator and a number of integrators. Its purpose is to increase the signal-to-noise ratio by filtering out a large part of the white noise spectrum. After shaping, the signal is shifted to a lower DC offset voltage level by means of a summing amplifier. In this summing amplifier an external offset compensation voltage is mixed in, to compensate for any remaining offset voltage.
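The CSA-plus-PSA behavior described above can be sketched with a simple behavioral model: a step at the CSA output, differentiated once (CR) and integrated n times (RC^n), yields a semi-Gaussian pulse peaking at roughly n time constants. The discrete-time first-order filters below are an illustrative stand-in for the actual circuits, with all parameter values assumed.

```python
def semi_gaussian_shaper(n_int=4, tau=1.0, dt=0.01, t_stop=12.0):
    """Behavioral CR-RC^n shaper driven by a unit step (the CSA output
    for one charge packet).  Returns (times, output) lists."""
    a = dt / (tau + dt)                      # first-order filter coefficient
    steps = int(t_stop / dt)
    t, y = [], []
    hp_y, hp_x = 0.0, 0.0                    # high-pass (CR differentiator) state
    lp = [0.0] * n_int                       # low-pass (RC integrator) states
    for k in range(steps):
        x = 1.0                              # unit step input from the CSA
        hp_y = (1 - a) * (hp_y + x - hp_x)   # CR differentiator
        hp_x = x
        s = hp_y
        for i in range(n_int):               # n cascaded RC integrators
            lp[i] += a * (s - lp[i])
            s = lp[i]
        t.append(k * dt)
        y.append(s)
    return t, y

t, y = semi_gaussian_shaper()
peak = max(y)
t_peak = t[y.index(peak)]                    # peaking time, about n_int * tau
```

With equal time constants the pulse peaks near t = 4τ for a 4th-order shaper and then decays back toward zero, which is the shape the PDSH downstream has to capture.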
The output of this summing amplifier is then fed to a peak detect sample and hold (PDSH) circuit, which
has to hold the peak value of the pulse until it has been converted by an ADC. The output of the summing amplifier is also fed to a comparator, together with a user-programmable threshold, which is usually set equal to the noise level. When the peak value of the pulse exceeds this threshold level, a logic pulse is sent to the digital controller (finite state machine) to indicate that a detector has been hit.
A number of trade-offs have to be made during the architecture generation. A straightforward solution is to give each sensor readout channel its own PDSH and ADC. But then 4 (slow) ADCs are necessary, which consume quite a lot of area. By multiplexing the four analog channels to the same analog-to-digital converter, only one (faster) ADC is necessary. However, by introducing analog memories between the analog channels and the ADC, the required speed for this ADC can be reduced significantly. If enough memories are available, the sampling rate of the ADC can be reduced to the average rate of incoming particles. When during a certain time interval more particles than average enter the detectors, the ADC cannot convert them all immediately and the analog memories fill up. During below-average particle activity the memory cells can be read out and converted. Of course in this case there is a (small) chance that some particles will be missed when there is heavy particle activity during a long period of time. The acceptable "error rate" determines the optimal number of memory cells.
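The relation between the number of memory cells and the fraction of missed events can be estimated with a small Monte Carlo queueing model: Poisson particle arrivals, one ADC with a fixed conversion time, and a cell that is freed only when its sample has been converted. All rates below are illustrative assumptions, not the project's actual numbers.

```python
import random

def missed_event_rate(n_cells, arrival_rate=1.0, conv_time=0.8,
                      n_events=20000, seed=42):
    """Fraction of particle events lost because every analog memory cell
    is occupied (an M/D/1/K-style model: Poisson arrivals, one ADC with
    deterministic conversion time, cell freed when its sample is done)."""
    rng = random.Random(seed)
    t = 0.0
    in_system = 0                  # stored samples (one may be converting)
    next_done = float('inf')       # completion time of the current conversion
    missed = 0
    for _ in range(n_events):
        t += rng.expovariate(arrival_rate)
        # finish all conversions completed before this arrival
        while in_system > 0 and next_done <= t:
            in_system -= 1                         # conversion done, cell freed
            next_done = next_done + conv_time if in_system > 0 else float('inf')
        if in_system >= n_cells:
            missed += 1                            # all cells full: event lost
        else:
            if in_system == 0:
                next_done = t + conv_time          # idle ADC starts at once
            in_system += 1
    return missed / n_events

rates = {n: missed_event_rate(n) for n in (1, 2, 4, 16)}
```

Running the model shows the loss rate dropping steeply with the number of cells, which is the quantitative basis for picking a cell count from an acceptable error rate.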
The expert designer can try different architectures. The specification translation tool automatically calculates specifications for the modules and gives a first-order estimate of the implementation cost of the architecture. The user can then choose the best solution. In our design example the optimal solution (in terms of area and power) was an architecture with 4 analog channels, 16 memories and 1 ADC.
The output of the summing amplifier of each channel is now fed to the input multiplexer of the analog memory (which consists of 16 cells). Each analog memory cell consists of two PDSHs: one to store positive pulses and one to store negative pulses.
After it receives the event pulse from one of the four channels, the finite state machine (FSM) will select one of the analog memories that is available, by means of the input multiplexer of the analog memory. This part of the FSM functions independently of the output part. While the input processor part of the
FSM is constantly selecting the correct analog memory cell for storing the results of the readout channel, the back-end processor is constantly moving the analog information of the memory cells to the analog-to-digital converter (ADC). It will select the appropriate memory cell by means of the output multiplexer of each channel and route the analog value to the ADC. After conversion, it will generate a signal at the output pin to notify that there is a digital sample at the output bus. After converting the value to an 8-bit word, the FSM resets the analog memory so that it becomes available again to be selected by the front-end processor of the FSM. All these operations can happen in parallel for all four channels.
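The cell bookkeeping performed by the two FSM parts can be abstracted as a small free-list model (our own illustrative abstraction, not the actual FSM implementation): the front-end processor claims a free cell per event, and the back-end processor converts the oldest pending cell and releases it.

```python
class MemoryController:
    """Illustrative sketch of the FSM's cell bookkeeping: the front-end
    part claims a free analog memory cell per event, the back-end part
    converts pending cells in arrival order and releases them."""

    def __init__(self, n_cells=16):
        self.free = list(range(n_cells))   # available cell indices
        self.pending = []                  # cells holding unconverted samples

    def on_event(self):
        """Front-end processor: claim a cell for a new event (None = lost)."""
        if not self.free:
            return None
        cell = self.free.pop(0)
        self.pending.append(cell)
        return cell

    def convert_next(self):
        """Back-end processor: send the oldest pending cell to the ADC,
        then reset the cell so it becomes available again."""
        if not self.pending:
            return None
        cell = self.pending.pop(0)
        self.free.append(cell)
        return cell

ctrl = MemoryController(n_cells=2)
a = ctrl.on_event()      # claims cell 0
b = ctrl.on_event()      # claims cell 1
lost = ctrl.on_event()   # all cells full: event lost (None)
c = ctrl.convert_next()  # cell 0 converted and freed
d = ctrl.on_event()      # cell 0 reused
```

The two methods are deliberately independent, mirroring the statement that the input and output parts of the FSM operate concurrently.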
The complete final high-level architecture (block diagram) of the mixed-signal ASIC is shown in figure 5.
Specification Translation Once the designer had provided the architecture, the specifications for the selected architecture were translated into specifications for all the sub-blocks within that architecture. This means that a complete list of specifications was derived for each of the modules used in the radiation detector interface architecture, like the ADC, the CSA and the PSA.
The specification translation was performed by means of a simulated annealing optimization loop. Therefore we needed a way to evaluate the performance (including the implementation cost, e.g. the power or area consumption) of the complete architecture when the module performances are given. We have implemented this performance evaluation procedure by a set of equations. One equation specifies the accuracy of the complete architecture as a function of the different error and noise sources of the modules, and another equation gives the speed of the complete system as a function of the different speed-related specifications (GBW, sampling rate, slew rate, etc.) of the modules. For each module we also have a heuristic equation that gives an estimate of the power consumption as a function of the other major performance specifications of that module. For an amplifier, for instance, we have an equation relating power consumption to the product of gain, gain-bandwidth and signal-to-noise ratio. The proportionality factor, which depends on the amplifier topology, has been determined by making a number of designs with the analog module generator.
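The specification-translation loop can be sketched as a simulated annealing search over module specifications, with a cost that sums an estimated power term and penalty terms for missing the system-level speed and accuracy targets. The cost equations below are deliberately toy stand-ins for the actual performance and power estimation equations, which are not reproduced here.

```python
import math
import random

def anneal_specs(seed=1, n_iter=4000, t0=10.0):
    """Toy simulated-annealing specification translation: choose GBW (MHz)
    and SNR (dB) for one module, minimizing estimated power plus penalty
    terms for missing the system targets (all equations illustrative)."""
    rng = random.Random(seed)

    def cost(gbw, snr):
        power = 0.01 * gbw * 10 ** (snr / 20.0) / 100.0   # heuristic power model
        speed_pen = max(0.0, 50.0 - gbw) ** 2             # system needs gbw >= 50
        acc_pen = max(0.0, 48.0 - snr) ** 2               # system needs snr >= 48
        return power + speed_pen + acc_pen

    gbw, snr = 200.0, 80.0                                # overdesigned start point
    cur = cost(gbw, snr)
    best = cur
    for k in range(n_iter):
        temp = t0 * (1.0 - k / n_iter) + 1e-6             # cooling schedule
        g2 = max(1.0, gbw + rng.gauss(0.0, 5.0))          # perturb the specs
        s2 = max(1.0, snr + rng.gauss(0.0, 2.0))
        c2 = cost(g2, s2)
        # accept improvements always, worsenings with Boltzmann probability
        if c2 < cur or rng.random() < math.exp((cur - c2) / temp):
            gbw, snr, cur = g2, s2, c2
            best = min(best, cur)
    return gbw, snr, best

gbw, snr, best = anneal_specs()
```

The annealer drives the specs down from the overdesigned start toward the cheapest point that still satisfies the penalized targets, which is the same mechanism, at toy scale, as the power/area-driven translation described above.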
Fig. 5. Top-level architecture of the radiation detector interface. [Figure: four channels (CSA + resistor multiplier, differentiator + pole-zero cancellation, 4th-order shaper, trigger-level comparator), each with demux/reset/mux switching, feed a bank of 16 analog memories and a flash ADC; the event and trigger signals are sequenced by the finite state machine.]
Table 2. Module specifications for the CSA-PSA.

Specification             Value        Unit
Detector capacitance      80           pF
Maximum peaking time      1.5          µs
Frame time                ≤ 5          µs
Noise                     ≤ 1000       RMS electrons
Gain                      20           mV per fC
Output voltage range      -1 / +1      V
Power consumption         ≤ 40         mW
Optimization variables are the specifications of the different modules in the architecture. At each iteration the performance of the complete architecture was calculated by evaluating the equations and the power consumption was estimated. In the cost function there was a term related to the system specifications and a term related to the implementation cost (power consumption).
An alternative would have been to use behavioral simulations to evaluate the overall system performance at each iteration. We did not choose this alternative because it would require a lot more CPU time, although it may be more accurate if good macromodels are used.
From the specifications of table 1, specifications have been derived for all the modules used in the architecture. The specifications derived in this way for the CSA-PSA module are shown in table 2.
Verification Before going down in the hierarchy and in order to reduce the number of iterations, the correctness of the previous design steps had to be verified which at this stage could only be done by means of behavioral simulation. The architecture was simulated in terms of the behavioral models (or macromodels) of the composing sub-blocks and their specifications as determined in the previous step. If the specifications of the block under design are not met by the architecture of sub-blocks, then the specification translation or the architecture can be changed. If the specifications are met, then top-down synthesis can proceed with the design of the different modules, following the same top-down strategy in the developed analog module generator [9].
In section 2.4 the problem of behavioral modeling and simulation was discussed and piecewise linear simulation was introduced. In the context of this project, the capabilities of the piecewise linear simulator PLANET [16] have been explored for the verification of the high-level architecture of the radiation detector interface. The finite state machine was replaced by high-level digital PL models. The analog circuits which are connected directly to the detectors had to be modeled with much more precision. This is because the detectors generate extremely small and short current pulses of only several thousands of electrons each. The charge sensitive amplifiers (CSA), which are directly connected to the detectors, will therefore be critical modules. All other analog circuits are less critical since the signals are much stronger than in the CSA. Nevertheless, their non-ideal behavior still affects the performance of the chip and therefore has to be included in the behavioral or macro-models. In section 3.6 one of the PL models used for the design case will be discussed in detail.

With this PL model of the entire system, it was possible to verify the behavior of the system with different specifications for the modules in the chip. When a certain module is then designed at the transistor level, the behavioral model can be replaced by a transistor model. The rest of the chip can still be modeled at a behavioral level. This results in acceptable simulation times, while at the same time allowing verification of the effects of a certain transistor implementation of a module on the chip behavior. The behavioral model of the rest of the chip ensures that the transistor-level circuit is provided with realistic input and control signals. The simulation results of the entire radiation detector front-end architecture of figure 5 are shown in figure 6.

[Fig. 6. Simulation results of the high-level verification of the complete radiation detector front-end: detector inputs, CSA and PSA outputs, analog chains, finite state machine event signals, and output bits 0 (MSB) through 7 (LSB) versus time in µs.]

Fig. 7. Sized transistor schematic of the maximum value memory circuit.

3.4. Module Generation

After high-level verification the different modules were synthesized separately for the specifications that were derived during the high-level specification translation.
Digital Modules The digital modules (the FSM) were generated with commercial logic synthesis tools. A VHDL description was written by the expert designer and simulated with QuickVHDL from Mentor Graphics. The controller was synthesized from this description with Synopsys tools. FastScan from Mentor Graphics was used for test pattern generation.
Analog Modules The analog modules have been generated by means of the analog module generator (AMG) developed at K.U.Leuven [9]. The AMG contains a cell library with fixed cells, parameterized cells and full-custom cells. The CSA-PSA combination for instance was synthesized as a full-custom cell for the specifications of table 2. An optimization loop around an equation solver is used to calculate the optimal transistor sizes.
For the ADC a fixed cell from the library was used. The ADC from the library has a higher sampling rate than strictly required and is therefore not optimal. However, this solution was preferred because of the savings in design time. The cells that were not yet included in the library of the AMG at the time that this design was performed have been designed manually.
The output of the synthesis part of the analog module generator is a sized transistor schematic for each module as shown in figure 7 for the maximum value memory. At this level the analog modules have been simulated separately without any problems with the classical circuit simulator HSPICE.
After module generation the complete ASIC was verified again through simulation. With HSPICE, it was not possible to simulate the entire chip at the transistor level, due to convergence problems. Even with simplified transistor models and with a piecewise linear simulator, computation times were too high to be useful. Therefore the only remaining option was to use (piecewise linear) behavioral or macro-models. In fact more or less the same simulation model was used as for the high-level verification, only now with more accurate values, because data extracted from device-level simulations were used to tune the PL macro-models. The finite state machine is now modeled as a set of logic gates like NANDs, which are approximated with three linear segments.

Using Top-Down CAD Tools for Mixed Analog 113

Fig. 8. Micro-photograph of the radiation detector interface chip.
3.5. Layout and Verification
Next, the layout of the entire ASIC was generated. The layout of the digital part was generated with IC Station of Mentor Graphics (standard cell place&route). Simulation after layout of the digital part was done with QuickSim, which is a multilevel (gate level and VHDL) simulator. The layout of the analog modules was also performed by means of the analog module generator (AMG) developed at K.U.Leuven [9]. Assembly of all module layouts (analog and digital) was done manually. Special attention was paid to avoid coupling problems between the analog and the digital parts: separate analog and digital supplies and ground, shielding, etc. Final verification after layout extraction was again performed with the PL simulator in a hierarchical way. The data used to tune the PL models are extracted from device-level simulations after layout extraction. This will be described in further detail in the next section.
The micro-photograph of the realized and fabricated mixed analog/digital ASIC is shown in figure 8. Its area is 19 mm².
3.6. PL Modeling

The way in which a piecewise linear model is constructed is not fundamentally different from the way a 'normal' macro-model is derived. The first step is to construct a macro-model that models the main functionality of the circuit. For example, in the case of the maximum value memory, depicted in figure 7, this can be done by charging a capacitor through an ideal diode. Since the current in the diode can only flow in one direction, the voltage over the capacitor can only increase and not decrease. This property makes the capacitor voltage follow increasing input voltages and hold the maximal value when the input voltage decreases. The other basic functionality of the maximum value memory is of course the possibility to reset the memory. This can easily be modeled by an ideal switch over the diode. When the switch is closed, the voltage over the capacitor (which is the output voltage of the macro-model) becomes the same as the input voltage. The required ideal diode is very easy to obtain with piecewise linear techniques. It consists of two segments: one in which the current through the diode is zero, and one in which the voltage over the diode is zero. Usually, this basic macro-model is not sufficient and should be improved. The resulting PL macro-model is depicted in figure 9. In order to make the model independent of input and output impedances of other circuits, ideal voltage buffers are placed at both the input and the output of the macro-model.
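The two-segment ideal diode with a reset switch can be mimicked in a few lines. The following is a discrete-time Python sketch of our own, not the PLANET macro-model: the diode is represented by its two linear segments (current zero, or diode voltage zero), so the capacitor voltage tracks rising inputs and holds the maximum, and the reset switch forces the output back to the input.

```python
def max_value_memory(vin, reset, v0=0.0):
    """Discrete-time sketch of the PL maximum value memory:
    an ideal diode (two segments: I = 0 or V_diode = 0) charging
    a capacitor, plus a reset switch across the diode."""
    vc = v0
    out = []
    for v, r in zip(vin, reset):
        if r:            # switch closed: output follows the input
            vc = v
        elif v > vc:     # diode segment with V_diode = 0: track the input
            vc = v
        # else: diode segment with I = 0: hold the maximum
        out.append(vc)
    return out

# A rising-then-falling input: the memory holds the peak until reset.
vin = [0.0, 0.5, 1.0, 0.7, 0.3]
reset = [False, False, False, False, True]
print(max_value_memory(vin, reset))  # -> [0.0, 0.5, 1.0, 1.0, 0.3]
```

The same two-segment idea is what the PL simulator exploits: within each segment the circuit is linear, and only segment boundaries trigger a model change.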
Several non-ideal effects will also occur in real-life maximum value memories. These effects have to be modeled either by extra components in the macro-model or by adjusting the piecewise linear behavior of the components. In the macro-model of the maximum value memory both options were used. The diode has a non-zero on-resistance in order to model a limited bandwidth. By means of the product of the on-resistance and the capacitor value, we can adjust the bandwidth of the macro-model. Furthermore, the input buffer has a limited voltage range. With the three segments in the input-output relation of the buffer, the clipping behavior of the maximum value memory is modeled. The last modification of the ideal macro-model is the extra constant current source. This current source discharges the capacitor, and therefore models a droop rate when the memory operates in the 'hold mode'. With the following simple expressions, the parameters in the macro-model can be derived from the specifications of the circuit:

R_diode,on = 1 / (2π · BW · C)
I_leakage = C · DroopRate

114 Donnay et al.

Fig. 9. Piecewise linear model of the maximum value memory circuit.

Fig. 10. Comparison between SPICE and PLANET simulation results of the maximum value memory circuit.

The specifications Vmax, Vmin, DroopRate, and BW of the real maximum value circuit (as depicted in figure 7) can be estimated by means of SPICE simulations. Those SPICE simulations can afterwards be used as a reference for the piecewise linear macro-model. When simulations of the macro-model and simulations of the transistor-level circuit differ too much, small changes in the macro-model parameters might be required.
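The two expressions above translate directly into a small helper. The function name and units below are our own choices (BW in Hz, C in farads, droop rate in V/s):

```python
import math

def pl_macro_params(bw_hz, c_farad, droop_rate_v_per_s):
    """Derive PL macro-model parameters from circuit specifications:
    the diode on-resistance sets the bandwidth together with C, and
    the leakage current reproduces the droop rate in hold mode."""
    r_diode_on = 1.0 / (2.0 * math.pi * bw_hz * c_farad)
    i_leakage = c_farad * droop_rate_v_per_s
    return r_diode_on, i_leakage

# Example: 1 MHz bandwidth with a 10 pF capacitor and a 1 V/ms droop rate.
r_on, i_leak = pl_macro_params(1e6, 10e-12, 1e3)
```

For these illustrative numbers the on-resistance comes out near 16 kΩ and the leakage current near 10 nA.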
In order to demonstrate the accuracy that can be achieved with PL models, the maximum value memory as used in the analog memory circuit has been simulated both as a PL model with the simulator PLANET and as a transistor netlist with HSPICE. The results of both simulations are depicted in figure 10. Just before the second input pulse, the circuit is reset. As can be seen from figure 10, both simulations correspond rather well.

4. Conclusions
The top-down design of a mixed-signal ASIC for a real satellite application was discussed in detail. The ASIC, a radiation detector interface, has been designed with CAD tools according to a hierarchical top-down methodology. At each design step a brief discussion is given of the design trade-offs, decisions taken, tools used, problems encountered, etc. Special focus was on the modeling and verification aspects in this top-down methodology. Detailed simulations are needed at the different levels during top-down synthesis as well as bottom-up verification. This requires mixed-level (circuit/behavioral) and mixed analog/digital simulation capabilities. The usefulness of piecewise linear simulation for this purpose has been explored.
Acknowledgements
This research was performed in part under projects with ESA-ESTEC, ESPRIT-ADMIRE and the Belgian IUAP-20.
References
1. R. Harjani, "Designing Mixed-Signal ICs," IEEE Spectrum, November 1992, pp. 49-51.
2. "A/D Simulators: an Expanding Array of Choices," Electronic Design, December 1994, Vol. 42, No. 25, pp. 95-102.
3. R. Lipsett, C. Schaefer, C. Ussery, VHDL: Hardware Description and Design, Kluwer Academic Publishers, 1989.
4. P. Hilfinger, "A High-Level Language and Silicon Compiler for Digital Signal Processing," Proc. of the Custom Integrated Circuits Conference, 1985, pp. 213-216.
5. R. Saleh, D. Rhodes, E. Christen, B. Antao, "Analog Hardware Description Languages," Proc. of the Custom Integrated Circuits Conference, 1994, pp. 15.1.1-15.1.8.
6. VHDL-A, Design Objectives Document, IEEE PAR 1076.1.
7. R. Rutenbar, "Analog Design Automation: Where Are We? Where Are We Going?," Proc. of the Custom Integrated Circuits Conference, 1993, pp. 13.1.1-13.1.8.
8. G. Gielen, J. Franca, "CAD Tools for Data Converters: Overview," IEEE Transactions on Circuits and Systems, accepted for publication.
9. G. Gielen, et al., "An Analog Module Generator for Mixed Analog/Digital Design," International Journal of Circuit Theory and Applications, July-August 1995, pp. 263-283.
10. K. Swings, W. Sansen, "Ariadne: a Constraint-Based Approach to Computer-Aided Synthesis and Modeling of Analog Integrated Circuits," Analog Integrated Circuits and Signal Processing, Kluwer, May 1993, pp. 197-215.
11. G. Gielen, W. Sansen, Symbolic Analysis for Automated Design of Analog Integrated Circuits, Kluwer Academic Publishers, 1991.
12. G. Gielen, K. Swings, W. Sansen, "Open Analog Synthesis System Based on Declarative Models," in J. Huijsing, R. van der Plassche, W. Sansen (editors), Analog Circuit Design, Kluwer Academic Publishers, 1993.
13. S. Donnay, K. Swings, G. Gielen, W. Sansen, "A Methodology for Analog High-Level Synthesis," Proc. of the Custom Integrated Circuits Conference, 1994, pp. 15.6.1-15.6.4.
14. W. van Bokhoven, "Piecewise Linear Analysis and Simulation," in Circuit Analysis, Simulation and Design, A. E. Ruehli (Ed.), Amsterdam: North-Holland, 1986, Ch. 9.
15. T. Kevenaar, D. Leenaerts, "A Comparison of Piecewise Linear Model Descriptions," IEEE Transactions on Circuits and Systems, Part I, Vol. 39, December 1992, pp. 996-1004.
16. T. Kevenaar, D. Leenaerts, "A Flexible Hierarchical Piecewise Linear Simulator," Integration, the VLSI Journal, Vol. 12, 1991, pp. 211-235.
17. WIND-SST Project, ESA-ESTEC, Noordwijk, The Netherlands.
18. Z. Chang, W. Sansen, Low-Noise, Wide-Band Amplifiers in Bipolar and CMOS Technologies, Kluwer Academic Publishers, 1991.
Stephane Donnay was born in Elsene, Belgium, on May 1st, 1967. He received the M.S. degree in electrical engineering from the Katholieke Universiteit Leuven, Belgium, in 1990. He is currently working toward the Ph.D. degree in electrical engineering at the ESAT laboratory of the same university. His research interests are in analog and mixed-signal circuit design and CAD methodologies.
Georges G.E. Gielen received the M.S. and Ph.D. degrees in Electrical Engineering from the Katholieke Universiteit Leuven, Belgium, in 1986 and 1990, respectively. From 1986 to 1990, he was appointed as a research assistant by the Belgian National Fund of Scientific Research. In 1990, he was appointed as a postdoctoral research assistant and visiting lecturer at the Department of Electrical Engineering and Computer Science of the University of California, Berkeley. From 1991 to 1993, he was a postdoctoral research assistant of the Belgian National Fund of Scientific Research at the ESAT laboratory of the Katholieke Universiteit Leuven. In 1993, he was appointed as a tenure research associate of the Belgian National Fund of Scientific Research and as a professor at the Katholieke Universiteit Leuven, where he is now associate professor.
Dr. Gielen serves regularly on the Program Committee of international conferences and he currently is Associate Editor of the IEEE Transactions on Circuits and Systems, part I, responsible for Fundamentals of CAD. His current research interests are in the design of analog and mixed- signal integrated circuits, and especially in analog and mixed-signal CAD (numerical and symbolic simulation, synthesis, layout, design for manufacturability) and test. He is technical coordinator of several industrial research projects in this area. He has authored the book Symbolic Analysis for Automated Analog Design (Kluwer Academic Publishers, 1991) and has published more than 50 papers in edited books, international journals and conference proceedings. He is a Member of the IEEE.
Willy Sansen received the M.S. degree in Electrical Engineering from the Katholieke Universiteit Leuven in 1967 and the Ph.D. degree in Electronics from the University of California, Berkeley in 1972. Since 1981 he has been a full professor at the ESAT laboratory of the K.U. Leuven. During the period 1984-1990 he was the head of the Electrical Engineering Department. He was a visiting
professor at Stanford University in 1978, at the Federal Technical University Lausanne in 1983, at the University of Pennsylvania, Philadelphia in 1985 and at the Technical University Ulm in 1994.
He has been involved in design automation and in numerous analog integrated circuit designs for telecom, consumer electronics, medical applications and sensors. He has been supervisor of 30 PhD theses in that field and has authored and co-authored more than 300 papers in international journals and conference proceedings and six books, among which the textbook (with K. Laker) on "Design of Analog Integrated Circuits and Systems".
Wim Kruiskamp was born in Arnhem, The Netherlands on March 31, 1966. He received the M.S. degree in electrical engineering from the University of Twente, Enschede, The Netherlands, in 1990. In 1992, after his military service, he joined the Eindhoven University of Technology, The Netherlands, where he is currently working towards his Ph.D. degree. His main research interests are analog and mixed analog/digital design automation.
Domine M.W. Leenaerts received the Ir. and the Ph.D. degrees, both in electrical engineering, from the Eindhoven University of Technology in 1987 and 1992 respectively. Since 1992 he has been with this university as an assistant professor in the micro-electronic circuit design group. In 1995, he was a Visiting Scholar at the Department of Electrical Engineering and Computer Science of the University of California, Berkeley and at the Electronic Research Laboratory of the same department. His research interests include nonlinear dynamic system theory, chaotic behavior in circuits and analog design automation. He has published several papers in scientific and technical journals and conference proceedings.
Steven Buytaert was born in Kruibeke, Belgium on April 3, 1965. He received the degree of Industrial Engineer Electro-Mechanics, option micro-electronics, from the Katholieke Industriele Hogeschool Limburg in Diepenbeek in 1987. Since 1987 he has been working at IMEC as an analog design engineer. He was mainly involved in the development of several detector readout electronics for high-energy physics experiments. His other activities are in the area of analog design automation and the development of high-performance analog ASICs.
Katrien Marent was born in Poperinge, Belgium on November 8, 1970. She received the degree of Industrial Engineer Electricity, option micro-electronics, from the Katholieke Industriele Hogeschool West-Vlaanderen in Oostende in 1992. Since 1992 she has been working at IMEC as an analog design engineer. She was mainly involved in the development of detector readout electronics for high-energy physics experiments. Her other activities are in the area of analog design automation.
Marc Buckens was born in Blankenberge, Belgium on May 26, 1967. He received the degree of Industrial Engineer Electricity, option telecommunications, from the Industriele Hogeschool van het Rijk BME Gent in 1989. Since October 1990 he has been working at IMEC as an ASIC Design Engineer. He was involved in the design of a Local Time Management System ASIC. Since October 1991 he has been involved in analog design activities. He was engaged in analog design automation projects and the development of both analog and mixed-signal ASICs in CMOS and BiCMOS processes.
Carl Das was born in Diest, Belgium on August 13, 1954. He received the degree of electronic engineer from the Katholieke Universiteit Leuven in 1977. From 1978 until 1983 he was an assistant at the Katholieke Universiteit Leuven, where he received his Ph.D. degree in February 1984 on the realization and modeling of ion-implanted junction field-effect transistors in compatible JFET-bipolar and JFET-CMOS processes. From 1983 until 1985 he was employed at the Katholieke Universiteit Leuven as a logistic engineer in charge of the set-up of a clean room and prototype line for CMOS technology. Since 1986 he has been employed by IMEC in the INVOMEC division, where he is in charge of the Multi Project Wafer prototyping service and the analog design group. His main interest in analog design is the design of high-performance, low-noise circuits for readout electronics in high-energy physics experiments and radiation monitoring systems on board spacecraft.
Analog Integrated Circuits and Signal Processing, 10, 119-132 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Electro-Optical Device Models for Electrical Simulators
VALENTINO LIBERALI, FRANCO MALOBERTI, AND ALBERTO REGINI [email protected]
Department of Electronics, University of Pavia, Via Abbiategrasso 209, 27100 Pavia, Italy
Abstract. This paper describes the modeling for the analysis of electro-optical devices using a conventional electrical simulator. The proposed approach is intended for the analysis of optical sensor systems, which have optical and electronic devices integrated on the same silicon chip. Models have been developed using a hardware description language for the following devices: the LED, the photodiode and the absorbing medium. Suitable approximations allow the models to be accurate with a limited number of parameters while the computation time is kept sufficiently short. Simulation results show good agreement between numerical analysis and experimental data previously reported in the literature.
Introduction
The present trend in solid-state electronics is from integrated circuits towards integrated systems. The silicon technology has already reached a mature stage, and compatibility between integration of both microelectronics and sensors has been demonstrated for a wide variety of structures [1,2].
Over the past years evolution was mainly focused on mixed analog/digital (A/D) systems, which include analog conditioning, an A/D interface, digital signal processing and sometimes D/A conversion. New data conversion and signal processing methods can reduce technological requirements for analog component precision and matching.
Starting from these premises, circuit and system designers are now exploring new research fields, aiming at ever higher levels of integration of functional capabilities into a single silicon chip. Intelligent sensors are a new branch of mixed integrated systems. They include the sensor, which converts a physical quantity into an electrical variable, and some electronic functions, which can be either in the analog or in the mixed analog/digital domains. Intelligent sensors cover a wide variety of functions, spanning from pressure and temperature sensors to gas flow metering systems, and finally to electro-optical integrated systems.
Optical systems have an enormous number of possible applications, and they are expected to have a strong commercial impact in the near future. Smoke and fog detectors, air pollution metering stations, security alarms, blood analyzers, pH measurement systems are just a few examples of appliances which would benefit from intelligent sensors, in terms of cost, size and power consumption.
One important part of the design of an integrated optical system is the simulation of electrical and optical components. Nowadays electrical simulators that allow the description of non-electrical elements are available. Therefore the bottleneck for mixed optical-electrical simulations is the lack of suitable device models. For this reason a part of the activities should concentrate on the modeling of electro-optical devices. This paper describes the physics and the implementation, for a popular electrical simulator, of photodiode and LED models. The level of complexity involved for most practical integrated optical sensor systems allows simulations with good accuracy.

120 V. Liberali, F. Maloberti, and A. Regini
System Simulation
Analog simulation and digital simulation were two separate worlds until the appearance of mixed A/D integrated systems. Analog simulators analyze electrical variables (current and voltage) in each circuit element. On the other hand, digital simulators consider only transitions of logical levels from one to zero and vice versa, transmission delays, fan-out capacitance loads, etc. Their level of analysis is more abstract, since the elementary unit they handle is the logical gate, not the single transistor.

Since time to market is a tight requirement for industrial designers, the coming of mixed A/D circuits produced a strong demand for software tools capable of analyzing them. Mixed A/D simulation tools, including analog hardware description languages, are now supplied by many CAD vendors, and a remarkable effort is put on the standardisation of the language.
Hardware description languages (HDLs) are based on behavioral simulation, which has proven to be a viable solution for the analysis of complex circuits because it speeds up computation while maintaining a good degree of accuracy. Under this approach, the system is split into a number of blocks, each of which is described through a set of relationships between input, output and state variables. Such relations can be either in implicit or in explicit form. Explicit relations can be written for "one-way blocks," since their outputs depend on the inputs, but not vice versa [3]. The possibility of writing explicit relationships for blocks plays a relevant role in the computation speed-up.
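The computational advantage of one-way blocks can be sketched with a toy Python illustration of our own, not tied to any particular simulator: when every output depends only on the inputs, a cascade of blocks is solved by a single forward pass, with no iterative implicit solve.

```python
def evaluate_chain(blocks, x):
    """Evaluate a cascade of one-way blocks: each block is an explicit
    function of its input, so no iterative (implicit) solving is needed."""
    for block in blocks:
        x = block(x)
    return x

# Toy signal chain: a gain stage, a clipping stage and an offset stage.
gain = lambda v: 2.0 * v
clip = lambda v: max(-1.0, min(1.0, v))
offset = lambda v: v + 0.1
print(evaluate_chain([gain, clip, offset], 0.8))  # -> 1.1
```

An implicit block (for example one with feedback from output to input) would instead require a nonlinear equation solver at every time point, which is exactly the cost the one-way formulation avoids.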
The ELDO simulator [4] was used for this kind of analysis. Besides standard electronic devices, it allows us to describe components with behavioral models written in the FAS language [5]. However, the proposed approach is not language-dependent, since it can be adopted with any conventional analog simulator which provides a hardware description language or has an interface with user-written routines.
The use of HDL models with a conventional simulator has another advantage: models can be written at different abstraction levels, from high level considering only a simplified input-output relationship, to circuit level including accurate physical details to account for parasitic elements. High level models sacrifice accuracy to speed, thus reducing computation time. On
the other hand, low level models are as accurate as the device models implemented in analog simulators; the accuracy is obtained at the expense of CPU time. Both high level and low level models are behavioral, in the sense that they describe the component behavior using a dedicated language. The user can choose the appropriate simulation level to have the best trade-off between accuracy and computation time. Moreover, when using a mixed analog-digital simulator, a high-level analysis of the whole processing system is possible within the same simulation environment, including the optical sensor, the analog-digital interface and the digital signal processing.
The transmission of a signal can be described as an exchange of suitable physical quantities between component terminals. In electrical circuits signals are expressed using currents and voltages, while optical elements require other quantities accounting for light radiation. Such terms are presented in Table I, together with their symbols and SI units.
In conventional electrical simulators, the signal transmission is modeled by the exchange of electrical power between circuit elements, power being the product of current and voltage (P = I · V). In optical components we must describe an exchange of radiant flux Φe. It can be considered either as the product of radiant intensity Ie and solid angle Ω (Φe = Ie · Ω), or as the product of irradiance Ee and area A (Φe = Ee · A). The former relationship will be used for the LED, while the latter will be applied to the model of the photodiode. To model the dependence on wavelength λ, we approximated the spectral distributions of radiant flux, radiant intensity and irradiance with staircase functions. The wavelength range is divided into intervals of 5 nm, and the spectral distributions are assumed to be constant within each interval. Such an approximation is useful in simplifying numerical calculations by reducing the integrals to finite summations. Each wavelength interval [λn, λn+1] is associated to an "optical terminal," which is used to describe the connections between optical components. The simulator determines the amount of radiant flux within [λn, λn+1] that is exchanged by two components. An output optical terminal is considered as a source of radiant intensity (or irradiance), which enters the input terminal of another optical component. The choice of 5 nm as the wavelength interval arises from a trade-off between accuracy and the number of optical terminals.
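The 5 nm staircase discretization can be sketched as follows. The helper names are our own; the interval edges λn play the role of the optical terminals, and the integral over wavelength reduces to a finite summation:

```python
def staircase_bins(lambda_min_nm, n_bins, step_nm=5.0):
    """Wavelength interval edges [lambda_n, lambda_n+1]; each interval
    corresponds to one optical terminal."""
    return [lambda_min_nm + i * step_nm for i in range(n_bins + 1)]

def total_flux(spectral_flux_w_per_nm, edges):
    """Integral of the spectral flux reduced to a finite summation: the
    spectral density is taken constant within each 5 nm interval."""
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mid = 0.5 * (lo + hi)                       # interval midpoint
        total += spectral_flux_w_per_nm(mid) * (hi - lo)
    return total

# A flat density of 1 mW/nm over 25 bins of 5 nm: 125 nm * 1 mW/nm.
edges = staircase_bins(895.0, 25)
print(total_flux(lambda lam: 1e-3, edges))  # ≈ 0.125 W
```

Narrower bins would raise the accuracy at the cost of more optical terminals, which is exactly the trade-off mentioned above.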
The following sections contain the description of the models implemented for the LED, for the photodiode
and for the transmission medium, respectively. Finally, some simulation results will be presented.

Table I. Radiometric units used in this paper.

name                             symbol             unit                           abbrev.
radiant flux                     Φe                 watts                          W
spectral radiant flux            Φeλ(λ)             watts per meter                W/m
relative spectral radiant flux   Φe,rel(λ)          pure number
radiant intensity                Ie(Ω) or Ie(θ)     watts per steradian            W/sr
spectral radiant intensity       Ieλ(λ, θ)          watts per meter per steradian  W/(m·sr)
relative radiant intensity       Ie,rel(θ)          pure number
irradiance                       Ee                 watts per square meter         W/m²
spectral irradiance              Eeλ(λ)             watts per cubic meter          W/m³
spectral component of current    Iphλ(λ)            amperes per meter              A/m
spectral sensibility             S(λ)               amperes per watt               A/W
absorption coefficient           α or α(λ)          meters⁻¹                       m⁻¹

Fig. 1. One-way block for the behavioural model of the LED.

Model for the LED

The model implemented to describe the behavior of a LED consists of an electrical input section and an optical output section (Fig. 1).

The nodes of the electrical section (anode and cathode) can be connected to standard electrical devices. The simulator will solve the circuit using Kirchhoff's laws. The model of the electrical section is shown in Fig. 2. It consists of a diode (voltage-dependent current source I_LED) with depletion capacitance Cj and diffusion capacitance CD, plus a series resistance Rs which takes into account bulk resistivity and ohmic contacts. As proposed in [6], the series resistance has a typical dependence on the current I:

Rs = Rs0                      if I ≤ IRs0
Rs = Rs0 · (IRs0 / I)^x       if I > IRs0        (1)

where IRs0 is the upper bound of the linear region, Rs0 is the low-current resistance and the resistance exponent x ranges between 0.4 and 0.6.

Fig. 2. Electrical section of the LED model.
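This current dependence is easy to transcribe into a few lines of Python as a sanity check. The parameter values below (Rs0 = 1 Ω, IRs0 = 1 mA, x = 0.5) are illustrative defaults of our own, not taken from the paper:

```python
def series_resistance(i_amp, rs0=1.0, i_rs0=1e-3, x=0.5):
    """Current-dependent LED series resistance, eq. (1): constant Rs0
    in the linear region, then decreasing as Rs0 * (I_Rs0 / I)**x."""
    if i_amp <= i_rs0:
        return rs0
    return rs0 * (i_rs0 / i_amp) ** x

print(series_resistance(1e-4))  # linear region: Rs0
print(series_resistance(1e-1))  # high current: Rs0 * (1e-3 / 1e-1) ** 0.5
```

The resistance is continuous at I = IRs0, since the power-law branch evaluates to Rs0 there.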
The I-V dc characteristic of the LED can be modeled by considering two effects: the diffusion current ID, which is responsible for light emission, and the space-charge recombination current IR, which is assumed to be non-radiative. The relationship between the LED current I_LED and the voltage V_LED is [7]:

I_LED = ID + IR
      = ID0 · (exp(q·V_LED / (n1·k·T)) - 1) + IR0 · (exp(q·V_LED / (n2·k·T)) - 1)        (2)

where q is the electron charge, k is Boltzmann's constant, T is the temperature in kelvin, and n1 and n2 are the emission coefficients of the two current components (their default values being 1 and 2, respectively). The saturation currents ID0 and IR0 depend on the temperature T through the following relationships [8]:
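Equation (2) maps directly into code. The sketch below uses the physical constants and illustrative saturation currents; the default values ID0 = 1e-14 A and IR0 = 1e-10 A are our assumptions:

```python
import math

Q = 1.602176634e-19  # electron charge [C]
K = 1.380649e-23     # Boltzmann constant [J/K]

def led_current(v_led, i_d0=1e-14, i_r0=1e-10, n1=1.0, n2=2.0, t=300.0):
    """LED dc current components, eq. (2): radiative diffusion current ID
    and non-radiative space-charge recombination current IR."""
    vt = K * t / Q  # thermal voltage kT/q
    i_d = i_d0 * (math.exp(v_led / (n1 * vt)) - 1.0)
    i_r = i_r0 * (math.exp(v_led / (n2 * vt)) - 1.0)
    return i_d, i_r

# At a forward bias of 0.8 V the radiative component dominates.
i_d, i_r = led_current(0.8)
```

Only ID contributes to the emitted flux, so the ratio ID / (ID + IR) gives the radiative fraction of the total current.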
ID0(T) = ID0(T0) · (T/T0)^XT · exp( (q·EG / (n1·k·T)) · (T - T0)/T0 )        (3)

IR0(T) = IR0(T0) · (T/T0)^XT · exp( (q·EG / (n2·k·T)) · (T - T0)/T0 )        (4)
where T0 is the reference temperature (with a default value of 298 K = 25 °C), EG is the activation energy (1.6 eV for GaAs devices) and XT is the saturation current temperature exponent.
If the applied voltage V_LED is small, the depletion capacitance can be modeled with the classical relationship [9]:

Cj = Cj0 / (1 - V_LED/Vj)^M        (5)

where Cj0 is the zero-bias capacitance, Vj is the built-in voltage, and M is the grading coefficient of the junction (M = 1/2 for an abrupt junction and M = 1/3 for a linear gradient junction) [8].
Equation (5) is no longer valid when the direct voltage across the diode becomes comparable with the junction built-in voltage Vj. According to the formula implemented in the electrical simulator [10], the depletion capacitance is modeled as follows:

Cj = Cj0 / (1 - V_LED/Vj)^M                                      if V_LED < γ·Vj
Cj = (Cj0 / (1 - γ)^(1+M)) · (1 - γ·(1+M) + M·V_LED/Vj)          if V_LED > γ·Vj        (6)

where γ is an adimensional parameter used to define the upper boundary of (5). For direct voltages above γ·Vj the capacitance is modeled with a linear extrapolation. The default value is γ = 0.5.
The diffusion capacitance is expressed by the relationship proposed in [11]:

CD = τ · q·ID / (2·k·T)        (7)

where τ is the transit time of the junction.
Each of the N output terminals out1, ..., outN of the LED in Fig. 1 corresponds to the spectral radiant intensity in a wavelength interval of 5 nm, thus covering a wavelength range of 5·N nm.
The total output radiant flux is assumed to be proportional to the LED diffusion current [6], until it reaches an upper bound Φe,max:

Φe = min(a · ID, Φe,max)        (8)

where the constant a is the radiant power coefficient, which depends on the particular device and has a typical value ranging from 0.01 W/A to 0.3 W/A.
As proposed in [12], a linear temperature dependence is assumed:

Φe(T) = Φe(T0) · (1 + KT · (T - T0))        (9)

where KT is the temperature coefficient.

The spectral distribution of the total radiant flux (i.e. its dependence on the wavelength) is given by the spectral radiant flux Φeλ(λ):

Φeλ(λ) = Φeλ(λp) · Φe,rel(λ)        (10)

where Φeλ(λp) is the spectral radiant flux at the peak wavelength λp.
The relative spectral radiant flux Φe,rel(λ) can be modeled with a Gauss function:

Φe,rel(λ) = exp( -(4 ln 2) · ((λ - λp) / Δλ)² )        (11)

where Δλ is the spectral bandwidth between half power points. Usually only the parameters λp and Δλ are available in data books. When the complete spectral response of the LED is available, a better approximation of the relative spectral radiant flux Φe,rel(λ) can be obtained with a linear combination of three Gauss functions:

Φe,rel(λ) = exp( -(4 ln 2) · ((λ - λp) / Δλ)² )
          - C1 · exp( -(4 ln 2) · ((λ - λA) / ΔλA)² )
          + C2 · exp( -(4 ln 2) · ((λ - λB) / ΔλB)² )        (12)
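The single-Gaussian approximation (11) is straightforward to evaluate. The sketch below uses data-book style parameters λp = 950 nm and Δλ = 50 nm (illustrative values matching the commercial device discussed later):

```python
import math

def relative_spectral_flux(lam_nm, lam_peak_nm=950.0, fwhm_nm=50.0):
    """Single-Gaussian relative spectral radiant flux, eq. (11):
    unity at the peak, 0.5 at the half-power wavelengths."""
    u = (lam_nm - lam_peak_nm) / fwhm_nm
    return math.exp(-4.0 * math.log(2.0) * u * u)

print(relative_spectral_flux(950.0))  # -> 1.0 at the peak
print(relative_spectral_flux(975.0))  # ≈ 0.5 (half-power point)
```

The factor 4 ln 2 is what makes Δλ the full width at half maximum: at λ = λp ± Δλ/2 the exponent equals -ln 2, so the flux halves.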
The total radiant flux is the integral of the spectral radiant flux over the wavelength domain:

Φe = ∫ Φeλ(λ) dλ        (13)
Fig. 3. Staircase approximation of the output radiant flux of the LED.
From (10) and (13), we can express the spectral radiant flux at the peak wavelength through the relationship:

Φeλ(λp) = Φe / ∫[λmin, λmax] Φe,rel(λ) dλ        (14)

assuming that the spectral radiant flux is negligible outside the range [λmin, λmax].
The peak wavelength is assumed to have a linear spectral shift with temperature:

λp(T) = λp(T0) · (1 + aT · (T - T0))        (15)

where aT is the peak wavelength temperature coefficient.
Fig. 3 illustrates the staircase approximation of the spectral radiant flux in a commercial device. The number of wavelength intervals is N = 25, thus covering a range λmax - λmin = 125 nm.
Equations (8) to (14) give us the radiant flux emitted in all directions and do not provide information about the emission angle. To account for the view axis, the radiant intensity Ie has to be considered. The relationship between the radiant flux Φe (in W) and the radiant intensity Ie (in W/sr) is:

Φe = ∫ Ie dΩ        (16)
We assume that the light emission has a maximum at the axis of the LED, has radial symmetry, and is limited to the half space above the device, as proposed in [12]. Since the differential solid angle has the value dΩ = 2π sin θ dθ, the radiant intensity can be expressed as a function of the view angle θ, with θ ranging from 0 to π/2. Therefore (16) becomes:

Φe = ∫[0, π/2] Ie(θ) · 2π sin θ dθ        (17)
According to [12], we introduce the relative radiant intensity Ie,rel(IJ), which is an adimensional quantity related to the radiant intensity through the relationship:
(18)
where Ie(O) is radiant intensity on the LED axis. The relative radiant intensity Ie,rel(IJ) can be specified in data books through a diagram (the angular response curve) or through an approximate analytic relationship [12]:
Ie,rel(θ) = (cos θ)^c    (19)
The radiant intensity cosine exponent c accounts for the angular response. For a Lambertian source, c = 1 [7]. Narrow-beam devices have larger values of the radiant intensity cosine exponent [12].
From (14), (17) and (18) we obtain the radiant intensity on the axis:

Ie(0) = Φeλ(λp) ∫_{λmin}^{λmax} Φe,rel(λ) dλ / (2π ∫_0^{π/2} Ie,rel(θ) sin θ dθ)    (20)
Under the assumption that the spectral distribution of the radiant intensity is the same as that of the radiant flux, we can finally obtain the values of the spectral radiant intensity on the axis at different wavelengths:

Ieλ(λ, 0) = Φeλ(λp) Φe,rel(λ) / (2π ∫_0^{π/2} Ie,rel(θ) sin θ dθ)    (21)
If the view point does not lie on the axis, from (18) we obtain:

Ieλ(λ, θ) = Ieλ(λ, 0) · Ie,rel(θ)    (22)
In the practical implementation of the model, the integral at the denominator of (21) is evaluated with a staircase approximation of Ie,rel(θ) sin θ, in intervals of 5 degrees. The error introduced by this approximation has proved to be less than 2% [12].
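As a quick check of this staircase scheme, the denominator integral of (21) can be evaluated numerically and compared with its closed form, which is 2π/(c + 1) for the cosine model of (19). A hedged Python sketch (our function names, not the authors' implementation):

```python
import math

def solid_angle_integral(c=3.0, step_deg=5.0):
    """Staircase evaluation of 2*pi * integral_0^{pi/2} (cos t)^c * sin t dt.

    The integrand is held constant over 5-degree intervals (midpoint
    value), mirroring the staircase approximation used in the model.
    The exact value for the cosine model of eq. (19) is 2*pi / (c + 1).
    """
    step = math.radians(step_deg)
    n_int = int(90.0 / step_deg)
    total = 0.0
    for n in range(n_int):
        t_mid = (n + 0.5) * step
        total += (math.cos(t_mid) ** c) * math.sin(t_mid) * step
    return 2.0 * math.pi * total
```

For c = 3 (the LED default of Table II) the staircase value stays well within the 2% error bound quoted from [12].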
Table II contains a list of the parameters of the LED. Each parameter is listed with the symbol used in the text, the name used in the model file, its physical dimension and default value. Parameters are divided into three groups, corresponding to three different levels of complexity. The first level contains only the most significant parameters, which can be found in data books.
Table II. Parameters of the LED.

model symbol name level parameter description physical dimension default value

a POWC radiant power coefficient power/current 0.1 W/A
λp PWAVE peak wavelength of emission spectrum wavelength 950 nm
Δλ FWHM full width at half maximum of emission spectrum wavelength 50 nm
λmin LEDSRG beginning of LED spectral field wavelength 895 nm
c INC radiant intensity cosine exponent pure number 3
θ VIEW view angle angle 0°
ID0 IDO diffusion current at saturation current 10⁻²⁶ A
IR0 ISRO space-charge recombination current at saturation current 50 fA
T0 TNOM reference temperature temperature 25°C
T TEMP device temperature temperature =TNOM
kT TEMPC radiant power temperature coefficient temperature⁻¹ −1.8·10⁻³ °C⁻¹
αT TEMPPW peak wavelength temperature coefficient wavelength/temperature 0.2 nm/°C
Rs0 RS0 2 series ohmic resistance resistance 2 Ω
IRs0 IRS0 2 current parameter of the series ohmic resistance current 10 mA
x X 2 resistance coefficient pure number 0.5
Cj0 CJO 2 zero-bias junction capacitance capacitance 0 F
Vj VJ 2 junction potential voltage 1.5 V
γ FC 2 coefficient for forward bias depletion capacitance pure number 0.5
M M 2 grading coefficient pure number 0.5
τ TT 2 transit time time 1 ns
EG EG 3 activation energy voltage 1.6 eV
n1 NIDO 3 emission coefficient of diffusion current pure number 1
n2 NISRO 3 emission coefficient of space-charge region recombination current at saturation pure number 2
XT XTI 3 saturation current temperature exponent pure number 3
Fig. 4. One-way block for the behavioural model of the photodiode (optical inputs opt1, ..., optN, opt(N+1); electrical terminals Anode and Cathode).
The second and third model levels use additional parameters, which require knowledge of the fabrication process and are similar to those of the diode model. The second level considers series resistance and capacitances, while the third level also models the dependence of the saturation currents on the temperature.
Model for the Photodiode
The model describing the behavior of the photodiode has an optical input section and an electrical output section (Fig. 4).
In the input section, the optical terminals (opt1, ..., optN) collect the incident light at different wavelength intervals of 5 nm, as explained in the previous sections. Since in most applications the photodiode is illuminated not only by the LED, but also by other light sources, we model the "environmental" light with a constant white source applied to the input opt(N+1) in Fig. 4. Another important difference with respect to the LED is that the physical quantity at the input of a photodiode is the irradiance Ee (in W/m²), while a radiant intensity Ie (in W/sr) is the output of a LED. The transmission medium between source and detector, described in the next section, takes into account the conversion between the above quantities. As we did in the previous section for radiant flux and radiant intensity, we also introduce the spectral irradiance Eeλ(λ), in W/m³.
To simplify calculations, we assume that the irradiance is constant over the whole area of the photodiode. This approximation is valid when the distance from the source to the photodiode is much larger than the size of the light detector [12]. The spectral radiant flux on the photodiode is expressed by:

Φeλ(λ) = A · Eeλ(λ)    (23)

where A is the area of the photosensitive surface.
Fig. 5. Electrical section of the photodiode model.
At a wavelength λ, the spectral component of the photo-generated current Iphλ(λ) is proportional to the spectral radiant flux Φeλ(λ), and consequently to the input spectral irradiance Eeλ(λ), through the relationship:

Iphλ(λ) = S(λ) · Φeλ(λ) = A · S(λ) · Eeλ(λ)    (24)
where S(λ) is the spectral sensibility, which is measured in A/W and is a function of the wavelength. We found that an approximation of S(λ) with a fifth-degree polynomial is satisfactory. Using this approximation, we can write S(λ) as:
S(λ) = Σ_{i=0}^{5} si · (λ − λ0)^i    (25)

where λ0 is the peak wavelength of the spectral sensibility and si are the fitting coefficients. To be consistent with the staircase approximation of Φeλ(λ) and Eeλ(λ), S(λ) has also been considered to be constant over wavelength intervals of 5 nm.
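A sketch of the polynomial evaluation of (25) in Python (illustrative only; the coefficients are the fitting values that appear in the model listing of Table V, with the wavelength offset expressed in nm and λ0 = 900 nm as implied by that listing):

```python
def spectral_sensibility(lam_nm, lam0_nm=900.0,
                         coeffs=(0.55281, -1.3276e-4, -6.4478e-6,
                                 -2.4447e-8, -5.0586e-11, -3.8131e-14)):
    """Fifth-degree polynomial approximation of S(lambda), eq. (25).

    coeffs[i] multiplies (lambda - lambda0)^i, with the offset in nm;
    the returned value is the spectral sensibility in A/W.
    """
    dl = lam_nm - lam0_nm
    s, p = 0.0, 1.0
    for c in coeffs:
        s += c * p
        p *= dl
    return s
```

At λ = λ0 the polynomial reduces to its constant term, close to the peak sensibility of 0.55 A/W quoted in Table III; away from the peak the value decreases, as expected for a silicon detector.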
When the incident light is not perpendicular to the photosensitive surface, the dependence of the spectral sensibility on the incidence angle can be modeled through a relationship similar to (19):

S(λ, θ) = S(λ, 0) · (cos θ)^c    (26)
where c is the spectral sensibility exponent.
The photo-generated current Iph is calculated by integrating (24) over the wavelength range of the photodiode [λph1, λph2]:

Iph = A ∫_{λph1}^{λph2} S(λ, θ) Eeλ(λ) dλ    (27)
It is worth noting that silicon planar photodiodes have a wavelength range [λph1, λph2] larger than the range [λmin, λmax] of LEDs [13].
126 V. Liberali, F. Maloberti, and A. Regini
Since the input spectral irradiance Eeλ(λ) also includes the environmental light, it is worth splitting Iph into the sum of two contributions: one from the signal irradiance Eeλ,signal(λ) and the other from the environmental irradiance Eeλ,env, as follows:

Iph = Iph,signal + Iph,env
    = A ∫_{λph1}^{λph2} S(λ, θ) Eeλ,signal(λ) dλ + A Eeλ,env ∫_{λph1}^{λph2} S(λ, 0) dλ    (28)
To account for the worst possible situation, in the second term the environmental irradiance has been assumed to be perpendicular to the photosensitive area.
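In the staircase implementation, the two integrals of (28) reduce to sums over the 5 nm intervals. A hedged Python sketch (our names and calling convention, not the FAS model of Table V; S and the irradiances are supplied as per-interval staircase values):

```python
import math

def photo_current(area_m2, sens, irr_signal, irr_env,
                  step_nm=5.0, theta_deg=0.0, c=1.3):
    """Staircase evaluation of the photo-generated current, eq. (28).

    sens        : list of S(lambda, 0) staircase values, in A/W
    irr_signal  : list of signal spectral irradiance values, in W/m^3
    irr_env     : constant environmental spectral irradiance, in W/m^3
    The signal term is scaled by (cos theta)^c as in eq. (26); the
    environmental term assumes normal incidence (worst case).
    """
    step_m = step_nm * 1e-9                      # interval width in metres
    ang = math.cos(math.radians(theta_deg)) ** c
    i_signal = area_m2 * ang * sum(s * e for s, e in zip(sens, irr_signal)) * step_m
    i_env = area_m2 * irr_env * sum(sens) * step_m
    return i_signal + i_env
```

The 1.3 default for c follows the spectral sensibility exponent of Table III; a two-interval toy spectrum is enough to see the scaling of the signal term with area and irradiance.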
The electrical section of the photodiode was modeled as shown in Fig. 5. The current of the generator IPHD is:

IPHD = I0 · [exp(q VPHD / (n k T)) − 1] − Iph    (29)
where I0 is the saturation dark current, VPHD is the bias voltage across the photodiode and n is the emission coefficient of the dark current.
Since photodiodes are used with inverse bias voltage, the depletion capacitance is given by:

Cj = Cj0 / (1 − VPHD / Vj)^M    (30)
As for the LED, the series resistance Rs models bulk resistivity and ohmic contacts. Rs is assumed to be constant.
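A minimal sketch of (30) with the level-2 defaults of Table III (illustrative Python, not simulator code):

```python
def depletion_capacitance(v_phd, cj0=75e-12, vj=0.75, m=0.5):
    """Reverse-bias depletion capacitance of the photodiode, eq. (30).

    Defaults are the level-2 parameters of Table III (CJO = 75 pF,
    VJ = 0.75 V, M = 0.5); v_phd is the bias voltage across the
    junction, negative in normal (reverse-biased) operation.
    """
    return cj0 / (1.0 - v_phd / vj) ** m
```

At zero bias the capacitance equals Cj0; at VPHD = −2.25 V (three times Vj) it is halved, which matches the square-root law of an abrupt junction (M = 0.5).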
Table III lists the parameters of the photodiode. Three different levels were considered. The first one contains the main parameters. The second level considers series resistance and capacitances, while the third one also models the temperature dependence of the saturation current.
Model for the Transmission Medium
The measurement principle of optical sensor systems is based on absorption differences when the light passes through transmitting media with different characteristics. Therefore it is necessary to model the optical transmission through an absorbing medium.
It is also necessary to convert the radiant intensity generated by the LED into the irradiance which is the input variable of the photodiode. Of course, the relationship between the above quantities involves the optical length L of the medium.

Fig. 6. One-way block for the behavioural model of the transmission medium (inputs inopt1, ..., inoptN; outputs outopt1, ..., outoptN).
The schematic diagram of the absorbing medium is shown in Fig. 6. The block considered has N input nodes and N output nodes, each of which is associated with a wavelength interval of 5 nm. When the distance from the source is much greater than the detector size, under the assumption that the absorption coefficient α is independent of the wavelength, the output spectral irradiance Eeλ(λ) is related to the input spectral radiant intensity Ieλ(λ) by the relationship:

Eeλ(λ) = Ieλ(λ) · exp(−αL) / L²    (31)
where α is the absorption coefficient. If the absorption coefficient depends on the wavelength, we approximate α(λ) with a staircase function using wavelength intervals equal to 5 nm, as explained in the previous sections of this paper.
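The level-1 (constant α) and level-2 (per-interval α) behaviour of the medium can be sketched as follows (illustrative Python; names are ours):

```python
import math

def transmitted_irradiance(ie_spectral, alpha, length_m):
    """Output spectral irradiance of the absorbing medium, eq. (31).

    ie_spectral : list of staircase values of the input spectral
                  radiant intensity, one per 5 nm interval
    alpha       : constant absorption coefficient in 1/m (level 1),
                  or a list of per-interval values (level 2)
    length_m    : optical length L of the medium, in metres
    """
    n = len(ie_spectral)
    if not isinstance(alpha, (list, tuple)):
        alpha = [alpha] * n                      # level 1: constant alpha
    return [ie * math.exp(-a * length_m) / length_m ** 2
            for ie, a in zip(ie_spectral, alpha)]
```

With α = 0 the block reduces to the pure 1/L² geometric conversion from radiant intensity to irradiance; a non-zero α adds the exponential absorption term.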
Table IV shows the parameters of the transmission medium. At level one, the absorption coefficient is considered constant, while it is a function of the wavelength at level two.
Simulation Results and Discussion
To show the potential of our approach, in this section we present the results of some analyses employing the models described, using the simulator ELDO [4] on a SUN SPARCstation 2. An example of a behavioral model written in the FAS language is shown in Table V. The examples presented in this section have been simulated using the most detailed models (level 3 for LED and photodiode; level 2 for the transmission medium).
The first simulation example concerns the analysis of a LED in the time domain, to verify the capability of
Table III. Parameters of the photodiode.
model symbol name level parameter description physical dimension default value

I0 IDRO dark current at saturation current 2 nA
n PHDN emission coefficient of dark current pure number 1.5
A AREA photosensible area area 7.5 mm²
θ AINC incidence angle angle 0°
c INC spectral sensibility exponent pure number 1.3
λmin LEDSRG beginning of LED spectral field wavelength 895 nm
λph1 PHDSRG beginning of photodiode spectral field wavelength 445 nm
si FITC(n) coefficients for spectral sensibility current/power 0.55 A/W (peak)
T0 TNOM reference temperature temperature 25°C
T TEMP device temperature temperature =TNOM
Rs RS 2 series ohmic resistance resistance 20 Ω
Cj0 CJO 2 zero-bias junction capacitance capacitance 75 pF
Vj VJ 2 junction potential voltage 0.75 V
γ FC 2 coefficient for forward bias depletion capacitance pure number 0.5
M M 2 grading coefficient pure number 0.5
EG EG 3 activation energy voltage 1.1 eV
XT XTI 3 saturation current temperature exponent pure number 3
Table IV. Parameters of the transmission medium.
symbol name level parameter description physical dimension default value

α ABSC absorption coefficient length⁻¹ 1 m⁻¹
— ABSCDB flag for changing ABSC dimensions into dB/m boolean 0 (NO)
L LENGTH length of light pathway length 1 m
α(λ) ABSC(n) 2 absorption coefficient vector length⁻¹ 1 m⁻¹
Table V. Behavioral model of the photodiode in FAS language.
* * * PHOTODIODE LEVEL 2 * * *
amodel phdiode2(in1, in2, in3, in4, in5, in6, in7, in8, in9,
       in10, in11, in12, in13, in14, in15, in16, in17, in18,
       in19, in20, in21, in22, in23, in24, in25, in26, aphd, cphd);
declare pin in1, in2, in3, in4, in5, in6, in7:    electrical;
declare pin in8, in9, in10, in11, in12, in13:     electrical;
declare pin in14, in15, in16, in17, in18, in19:   electrical;
declare pin in20, in21, in22, in23, in24, in25:   electrical;
declare pin in26, aphd, cphd:                     electrical;
declare param ir0, area, phdsrg, ledsrg, phdn:    real;
declare local spectrsens(1:120), a(1:6):          real;
declare local irradiance(1:25), irramb:           real;
declare local bgcurr, ispsens, signcurr:          real;
declare local revvolt, totcurr, photocurr:        real;
declare local vt, k, q, dummy, m, ir0temp, sp:    real;
declare local n, firstch:                         integer;
initialize
make ir0=2.0*(10.0 power -9.0)
make area=7.5*(10.0 power -6.0)
make phdsrg=445.0
make ledsrg=895.0
make k=1.3806*(10.0 power -23.0)
make q=1.602*(10.0 power -19.0)
make m=0.035
make phdn=1.5
endinitialize
analog
make a(1)=0.55281
make a(2)=-1.3276*(10.0 power -4.0)
make a(3)=-6.4478*(10.0 power -6.0)
make a(4)=-2.4447*(10.0 power -8.0)
make a(5)=-5.0586*(10.0 power -11.0)
make a(6)=-3.8131*(10.0 power -14.0)
make irradiance(1)=volt.value(in1)
make irradiance(2)=volt.value(in2)
make irradiance(3)=volt.value(in3)
make irradiance(4)=volt.value(in4)
make irradiance(5)=volt.value(in5)
make irradiance(6)=volt.value(in6)
make irradiance(7)=volt.value(in7)
make irradiance(8)=volt.value(in8)
make irradiance(9)=volt.value(in9)
make irradiance(10)=volt.value(in10)
make irradiance(11)=volt.value(in11)
make irradiance(12)=volt.value(in12)
make irradiance(13)=volt.value(in13)
make irradiance(14)=volt.value(in14)
make irradiance(15)=volt.value(in15)
make irradiance(16)=volt.value(in16)
make irradiance(17)=volt.value(in17)
make irradiance(18)=volt.value(in18)
make irradiance(19)=volt.value(in19)
make irradiance(20)=volt.value(in20)
make irradiance(21)=volt.value(in21)
make irradiance(22)=volt.value(in22)
make irradiance(23)=volt.value(in23)
make irradiance(24)=volt.value(in24)
make irradiance(25)=volt.value(in25)
make irramb=volt.value(in26)
for n=1 upto 120 do
make sp=phdsrg+real(n)*5.0-900.0
make spectrsens(n)=a(1)+a(2)*sp+a(3)*(sp power 2.0)+
     a(4)*(sp power 3.0)+a(5)*(sp power 4.0)+a(6)*(sp power 5.0)
endfor
make firstch=integer((ledsrg-phdsrg)/5)
make signcurr=0.0
for n=1 upto 25 do
make sp=irradiance(n)*spectrsens(firstch+n)*area+signcurr
make signcurr=sp
endfor
make ispsens=0.0
for n=1 upto 120 do
make sp=spectrsens(n)+ispsens
make ispsens=sp
endfor
make bgcurr=irramb*ispsens*area
make photocurr=signcurr+bgcurr
make dummy=m*(temp-298.0)
make ir0temp=ir0*(10.0 power dummy)
make revvolt=volt.diff(aphd, cphd)
make vt=k*temp/q
make totcurr=ir0temp*(expo(revvolt/(vt*phdn))-1) - photocurr
make curr.on(aphd)=totcurr
make curr.on(cphd)=-totcurr
endanalog
endmodel
amodel phdscap(inc, outc);
declare pin inc, outc:          electrical;
declare param phdcj0, phdvj:    real;
declare param phdm:             real;
declare local iicap, voltcap:   real;
declare local phdcs, vcoeff:    real;
declare state phdcharge:        real;
initialize
make phdvj=0.375
make phdcj0=75.0*(10.0 power -12.0)
make phdm=0.5
make iicap=0.0
endinitialize
analog
make voltcap=volt.diff(inc, outc)
make vcoeff=1.0-voltcap/phdvj
make phdcs=phdcj0/(vcoeff power phdm)
make phdcharge=phdcs*voltcap
if mode=trans then
make iicap=state.dt(phdcharge)
endif
make curr.on(inc)=iicap
make curr.on(outc)=-iicap
endanalog
endmodel
macro phd2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
ycapphd1 phdscap pin: 27 28
yphd1 phdiode2 pin: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
      17 18 19 20 21 22 23 24 25 26 27 28
endmacro
Fig. 7. Voltage and current simulated waveforms for the LED.
accounting for parasitics during transient simulations. The LED is connected to a voltage source which generates a 2.5 V rectangular pulse with a duration of 125 ns. The voltage drop across the LED is 1.5 V and a 10 Ω series resistance is used to limit the maximum current to 100 mA. Fig. 7 shows the voltage and current waveforms of the LED in the time domain. Effects of capacitances, such as the rise time and the inverse recovery time, are evident from the plot.
The time evolution of the spectral radiant intensity has also been simulated, assuming a radiant power coefficient a = 0.1 W/A, a Gaussian spectral distribution with peak wavelength λp = 950 nm and FWHM Δλ = 50 nm, a radiant intensity cosine exponent c = 3, and a view angle θ = 0. The spectral radiant intensity at the peak wavelength is represented in Fig. 8. To obtain a high degree of accuracy, the analysis was performed over 1500 time intervals. The total CPU time was 20 min, which corresponds to an average simulation time of 0.8 s for a single time interval.

Fig. 8. Time-domain simulation of the spectral radiant intensity at the peak wavelength of the LED.
In the second simulation, an optical system made of a LED, a transmission medium and a photodiode was analyzed in the time domain. The length of the light pathway was L = 0.1 m. The photodetector was assumed to have an area A = 7.5 mm², and its peak spectral sensibility was S(λ0) = 0.55 A/W. The environmental irradiance was set equal to zero. The inverse current of the photodiode is plotted in Fig. 9. Its main component is the photo-generated current; the dark current is negligible. This analysis required 2 h 13 min of CPU
Fig. 9. Time-domain simulation of the photodiode inverse output current.
Fig. 10. Simulated temperature dependence of the photodiode inverse output current.
time (5.3 s per time interval). When using less detailed models for the LED and photodiode (level 1 in Tables II and III), the CPU time is reduced by a factor of 5 to 10, but the capability of evaluating the rise time and the inverse recovery time is lost.
The last example is a dc analysis at different temperatures. The optical system is the same as the one in the previous example, but with a shorter optical length of the transmission medium (L = 0.03 m). We can see that the total radiant flux and consequently the output current of the photodiode (shown in Fig. 10) decrease exponentially as the temperature increases, as reported in data books [13]. To perform the dc analysis over 100 different temperatures, 11 min of CPU time were spent.
Conclusion
This paper presents an approach to the behavioral simulation of mixed optical-electrical systems. The models
developed for the LED, for the photodiode and for the transmission medium allow us to analyze these components together with standard microelectronic devices using a conventional simulator. The results obtained agree with data reported in optical component data books, thus demonstrating that it is possible to analyze optical sensor systems using general-purpose simulation tools.
Acknowledgements
This work was supported by the Commission of the European Communities, under the ESPRIT project 7101 (MInOSS).
Notes
1. The spectral sensibility was modeled with the following approximation: S(λ) = 0.55 − 0.133·10⁶ Δλ − 6.45·10¹² Δλ² − 24.4·10¹⁸ Δλ³ − 50.6·10²⁴ Δλ⁴ − 38.1·10³⁰ Δλ⁵, where Δλ = λ − λ0.
References
1. H. Baltes, T. Boltshauser, O. Brand, R. Lenggenhager and D. Jaeggi, "Silicon microsensors and microstructures," Proc. 1992 IEEE Int. Symp. on Circ. and Syst., San Diego, CA, May 1992, pp. 1820-1823.
2. B. J. Hosticka, "Circuit and system design for silicon microsensors," Proc. 1992 IEEE Int. Symp. on Circ. and Syst., San Diego, CA, May 1992, pp. 1824-1827.
3. A. E. Ruehli, A. L. Sangiovanni-Vincentelli and G. Rabbat, "Time analysis of large-scale circuits containing one-way macromodels," IEEE Trans. Circ. and Syst. 29, pp. 185-190, March 1982.
4. ELDO electrical circuit simulator, ANACAD Computer Systems GmbH, Ulm, Germany, 1992.
5. ELDO-FAS dynamical system modeling, ANACAD Computer Systems GmbH, Ulm, Germany, 1991.
6. A. Descombes and W. Guggenbühl, "Large signal circuit model for LEDs used in optical communication," IEEE Trans. Electr. Devices 28, pp. 395-404, Apr. 1981.
7. C. H. Gooch, Injection Electroluminescent Devices. J. Wiley: London, UK, 1973.
8. P. Antognetti (Ed.), Semiconductor Device Models for CAD Applications. McGraw-Hill: New York, NY, 1986.
9. R. S. Muller and T. I. Kamins, Device Electronics for Integrated Circuits (2nd ed.). J. Wiley: New York, NY, 1986.
10. ELDO model equations, ANACAD Computer Systems GmbH, Ulm, Germany, 1991.
11. S. Nakamura, J. Umeda and O. Nakada, "Response times of light-emitting diodes," IEEE Trans. Electr. Devices 19, pp. 995-997, Aug. 1972.
12. Hewlett-Packard Company, Optoelectronics applications manual. McGraw-Hill: New York, NY, 1977.
13. Infrared Detectors and Emitters, Laser Devices Data Book. Telefunken Electronic, Heilbronn, Germany, 1986.
Valentino Liberali was born in Broni, Italy, in 1959. He graduated with the Laurea degree in Electronic Engineering from the University of Pavia in 1986. In the same year he received a one-year scholarship from SGS (now SGS-Thomson Microelectronics), within the frame of the Piano Nazionale Microelettronica. From 1987 to 1990 he was with the Italian Nuclear Physics Institute (INFN), working on the development and characterisation of low-noise electronics for particle detectors. In 1990 he joined the Department of Electronics of the University of Pavia, where he worked on the development of the simulator TOSCA for sigma-delta A/D converters. His main research interests are the design of analog/digital interfaces and the development of CAD tools for analog and mixed integrated circuits.
Franco Maloberti was born in Parma, Italy, in 1945. He received the Laurea degree in Physics (summa cum laude) from the University of Parma, Italy, in 1968. He joined the University of L'Aquila, Italy, then the University of Pavia. From 1975 to 1979 he was technical co-ordinator of the Engineering School at the University of Mogadishu, Somalia. He is currently Professor of Microelectronics at the University of Pavia, and is also head of the Integrated Microsystems Group. His professional expertise is in the design, analysis and characterisation of integrated circuits and analogue-digital applications, mainly in the areas of switched-capacitor circuits, data converters, interfaces for telecommunication and sensor systems, and CAD for analogue and mixed A-D design. Dr. Maloberti has written more than 140 published papers and 2 books, and holds 12 patents. He was the recipient of the XII Pedriali Prize (1992) for his technical and scientific contributions to national industrial production. He is a member of the AEI (Italian Electrotechnical and Electronic Society), a Senior Member of the IEEE, and Vice-President for Region 8 of the IEEE Circuits and Systems Society.
CALL FOR PAPERS CALL FOR PAPERS CALL FOR PAPERS
ANALOG INTEGRATED CIRCUITS AND SIGNAL PROCESSING An International Journal
Special issue on Analog Implementations of Cellular Neural Networks and Analog VLSI
Massively-parallel analog hardware has been used for cellular neural networks (CNNs), artificial neural networks (ANNs), early vision processors, and other applications, offering speed, power consumption and cost advantages over digital hardware. To address various aspects of massively-parallel analog hardware, a special issue will be published on all aspects of CNN, ANN, and other specialized analog circuits implementation. Topics of interest include, but are not limited to:
* Analog implementations of CNNs * CNN universal machine architecture and design * Analog implementations of ANNs and other massively-parallel analog hardware * Field-programmable analog array (FPAA) implementations of CNNs, ANNs and
other massively-parallel systems * Applications of analog CNN, ANN, and other massively-parallel analog hardware
A call for papers for a companion special issue on field-programmable analog arrays and their applications is being simultaneously announced. Please see the journal or contact one of the Guest Editors for more details.
All manuscripts are subject to review. To be considered for this special issue of Analog Integrated Circuits and Signal Processing, prospective authors should submit six copies of their complete manuscript describing original contributions, and specify this issue, by July 1, 1996 to: Ms. Karen S. Cullen, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, MA 02061, Tel: (617) 871-6300, Fax: (617) 878-0449, E-mail: [email protected] or to one of the four Guest Editors:
Dr. Leon O. Chua Dept. of Electrical Engineering
and Computer Sciences University of California Berkeley, CA 94720 Phone: 510-848-7652 Fax: 510-845-4267 Email: [email protected]
Edmund Pierzchala Dept. of Electrical Engr. Portland State University P.O. Box 751 Portland, OR 97207-0751 Phone: 503-725-3806 Fax: 503-725-3807 Email: [email protected]
Dr. Glenn Gulak Dept. of Electrical & Computer Engr. University of Toronto 10 King's College Road Pratt Building, Room 484C Toronto, Ontario, Canada, M5S lA4 Phone: 416-978-8671 Fax: 416-971-2286 Email: [email protected]
Dr. Angel Rodriguez-Vazquez Centro Nacional de Microelectronica (CSIC) Analog Design Department Avda. Reina Mercedes s/n. Universidad de Sevilla 41012-Sevilla, Spain Phone: +34-5-423-99-23 Fax: +34-5-462-45-06/423-18-32 Email: [email protected]
Instructions for Authors are regularly published in the Journal and may also be obtained from Ms. Karen Cullen. Analog Integrated Circuits and Signal Processing is an archival, peer reviewed journal publishing research papers on design and applications of analog integrated circuits and signal processing circuits and systems. It is published bi-monthly with worldwide distribution to engineers, researchers, educators and libraries. Readers interested in subscribing to this journal should contact Kluwer Academic Publishers, P.O. Box 358, Accord Station, Hingham, MA 02018-0358, USA; Tel: 617-871-6600; Fax: 617-871-6528; Email: [email protected].
CALL FOR PAPERS CALL FOR PAPERS CALL FOR PAPERS
ANALOG INTEGRATED CIRCUITS AND SIGNAL PROCESSING An International Journal
Special issue on Field-Programmable Analog Arrays and Their Applications
Field-programmable analog arrays (FPAAs) are analog circuits of changeable configuration and parameters, intended to provide a medium for the implementation of various analog and mixed-signal (analog-digital) processing functions. Several FPAA architectures, combining analog signal-processing "data path" and digital control circuits, have been presented in the literature along with numerous applications. To address the FPAA technology and its applications, a special issue will be published on all aspects of programmable analog hardware, and especially FPAAs and analog signal processing in the context of cellular neural networks (CNNs), artificial neural networks (ANNs), analog early vision processors and other specialized programmable analog circuits. Topics of interest include, but are not limited to:
* Architectures for programmable analog and mixed-signal hardware * Circuit implementations of programmable analog and mixed-signal hardware * FPAAs as a medium for implementing CNNs, ANNs, and other massively-parallel systems * FPAAs as a medium for implementing multi-valued logic (MVL) and fuzzy logic circuits * Specialized programmable analog hardware (e.g. programmable and adaptive filters, control systems) * Commercial applications of programmable analog hardware
A call for papers for a companion special issue on analog implementations of cellular neural networks and analog VLSI is being simultaneously announced. Please see the journal or contact one of the Guest Editors for more details.
All manuscripts are subject to review. To be considered for this special issue of Analog Integrated Circuits and Signal Processing, prospective authors should submit six copies of their complete manuscript describing original contributions, and specify this issue, by July 1, 1996 to: Ms. Karen S. Cullen, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, MA 02061, Tel: (617) 871-6300, Fax: (617) 878-0449, E-mail: [email protected] or to one of the four Guest Editors:
Dr. Leon O. Chua Dept. of Electrical Engineering
and Computer Sciences University of California Berkeley, CA 94720 Phone: 510-848-7652 Fax: 510-845-4267 Email: [email protected]
Edmund Pierzchala Dept. of Electrical Engr. Portland State University P.O. Box 751 Portland, OR 97207-0751 Phone: 503-725-3806 Fax: 503-725-3807 Email: [email protected]
Dr. Glenn Gulak Dept. of Electrical & Computer Engr. University of Toronto 10 King's College Road Pratt Building, Room 484C Toronto, Ontario, Canada, M5S lA4 Phone: 416-978-8671 Fax: 416-971-2286 Email: [email protected]
Dr. Angel Rodriguez-Vazquez Centro Nacional de Microelectronica (CSIC) Analog Design Department Avda. Reina Mercedes s/n. Universidad de Sevilla 41012-Sevilla, Spain Phone: +34-5-423-99-23 Fax: +34-5-462-45-06/423-18-32 Email: [email protected]
Instructions for Authors are regularly published in the Journal and may also be obtained from Ms. Karen Cullen. Analog Integrated Circuits and Signal Processing is an archival, peer reviewed journal publishing research papers on design and applications of analog integrated circuits and signal processing circuits and systems. It is published bi-monthly with worldwide distribution to engineers, researchers, educators and libraries. Readers interested in subscribing to this journal should contact Kluwer Academic Publishers, P.O. Box 358, Accord Station, Hingham, MA 02018-0358, USA; Tel: 617-871-6600; Fax: 617-871-6528; Email: [email protected].
CALL FOR PAPERS
ANALOG INTEGRATED CIRCUITS AND SIGNAL PROCESSING
Special Issue on Analog VHDL
VHDL was defined and is used mainly for digital systems. For this domain, VHDL is a powerful and now widely accepted standard (for modeling, simulation and synthesis). But there is also an urgent need for analog designers to have such modeling, simulation and synthesis tools.
One approach relies on the structure of VHDL and uses leaf cells as basic blocks. These leaf cells could be handled by some analog simulator (SPICE). This is the most economical approach, but the main problem in this case is to standardize it and the results will depend strongly on the kind of simulator used.
Another approach is to develop analog modeling and simulation in pure VHDL applying one of the digital systems, which makes it possible to get analog representation of results.
The most important way is to extend the semantics and syntax of VHDL and standardize it. VHDL'93 with the currently developing VHDL-A seems to be a good approach to overcome many problems concerning analog and mixed-analog design. The first VHDL-A Language Reference Manual is expected by mid '95, and the standard is expected to be approved at the beginning of '96.
The goal of this Special Issue is to present state-of-the-art in analog VHDL domain, taking into account new approaches to be used in VHDL-A.
The topics of this Special Issue include, but are not limited to:
* Use VHDL for analog design * Analog and mixed-analog design methodologies * VHDL-A standardization * Simulation
* Component modeling * Macromodeling * Behavior modeling * Synthesis
All manuscripts are subject to review. To be considered for this Special Issue of Analog Integrated Circuits and Signal Processing, prospective authors should submit six copies of their complete manuscript, and specify this issue, by August 1, 1996 to: Ms. Karen S. Cullen, Kluwer Academic Publishers, 101 Philip Drive, Assinippi Park, Norwell, MA 02061, Tel: (617) 871-6300, Fax: (617) 878-0449, E-mail: [email protected]. The Guest Editors are:
Professor Andrzej T. Rosinski Department of Microelectronics Institute of Electron Technology Al. Lotnikow 32/46 02-668 Warsaw, POLAND Email: [email protected] Phone: +48 22 47 22 61 Fax: +48 22 47 06 31
Dr. Alain Vachoux Integrated Systems Center Swiss Federal Institute of Technology CH-1015 Lausanne SWITZERLAND Email: [email protected] Phone: +41-21-693-69-84 Fax: +41-21-693-46-63
Instructions for Authors are regularly published in the Journal and may also be obtained from Ms. Karen Cullen. Analog Integrated Circuits and Signal Processing is an archival, peer reviewed journal publishing research papers on design and applications of analog integrated circuits and signal processing circuits and systems.