


VLSI DESIGN
1999, Vol. 10, No. 2, pp. 203-215
Reprints available directly from the publisher
Photocopying permitted by license only

(C) 1999 OPA (Overseas Publishers Association) N.V.
Published by license under the Gordon and Breach Science Publishers imprint.
Printed in Malaysia.

Memory Chips with Adjustable Configurations*

LIZY KURIAN JOHN

Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712

(Received 10 June 1997; In final form 20 October 1998)

In this paper, we present the concept of Field Programmable Memory Cell Arrays (FPMCAs) as the memory counterpart to Field Programmable Gate Arrays, which have proved their utility in design and rapid prototyping. Principles of dynamic reconfigurability using programmable logic and programmable interconnect are incorporated into random access memories to achieve this flexibility. We first present the design of a variable width RAM (VaWiRAM), which is a simple example of a Field Programmable Memory Cell Array. The configuration of VaWiRAMs can be adjusted by setting a few configuration pins on the memory chip. A VaWiRAM reconfigurable between widths 1 and Wmax can be constructed with the extra cost of Wmax pass gates, (Wmax/2) 2-to-1 multiplexers, and $\lceil \log_2(\log_2 k + 1) \rceil$ mode pins. A novel scheme to overlap the address pins with mode control pins and achieve the mode control with only one extra pin is also presented. The paper discusses the architecture of the proposed VaWiRAMs in detail, analyzes the design tradeoffs and introduces the concept of FPMCAs.

Keywords: Dynamically alterable memory configurations, flexible memory, random access memory, reconfigurable memory

1. INTRODUCTION

Designing efficient memory systems for modern microprocessors is increasingly becoming a challenge [10, 12, 15, 18, 19, 21]. The width of DRAMs available in the market is lower than the bus-width of most modern high performance processors, and hence most computers use single-in-line memory modules (SIMMs), which are small printed circuit boards stocked with multiple DRAMs arranged in parallel to yield a width of 32 to 144 bits, matching the memory width to the computer’s bus-width. Granularity, the minimum increment of memory that may be added to a system [11], increases as the width of the memory chip becomes narrower with respect to the memory bus. For instance, on a 32-bit wide bus that uses 4-Mbit DRAMs, if the chips are organized as 1M × 4 bits, then it takes eight of them to fill out the 32-bit bus, with a minimum granularity of 4 MBytes of memory. If, however, the memory chip is organized as 4M × 1 bit, then 32 of them are needed to fill out a 32-bit bus, with a minimum granularity of 16 MBytes of memory. If the processor bus-width is 64 bits, the granularity increases even further. The organization of the chip thus affects the expandability of the memory system. It would be desirable for a user to be able to configure the memory chip to any desired width, without being forced to adopt a certain granularity. In this paper, we present the design of a variable width RAM (VaWiRAM) that can be programmed to configure itself to variable widths. In essence, if a memory chip of N × b bits is available, our design makes it possible to configure the chip to 2N × b/2 bits, or 4N × b/4 bits, or in general kN × b/k bits, where k is a power of two.

*This paper is based on "VaWiRAM: A Variable Width Random Access Memory Module", by Lizy Kurian John, which appeared in the Proceedings of the 9th International Conference on VLSI Design, January 1996, pages 219-224, (C) 1995 IEEE. U.S. Patent 5,867,422 has been granted to the author on FPMCAs. Lizy John’s research is supported by grants from the United States National Science Foundation, the State of Texas Advanced Technology Program, IBM, DELL, AMD, and Intel.
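The granularity figures quoted in the introduction's DRAM example can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function name and its parameters are hypothetical and do not come from the paper.

```python
MBIT = 1 << 20    # bits in one Mbit
MBYTE = 1 << 20   # bytes in one MByte

def chips_and_granularity(bus_width_bits, chip_capacity_bits, chip_width_bits):
    """Return (chips needed to fill the bus, minimum memory increment in bytes)."""
    if bus_width_bits % chip_width_bits != 0:
        raise ValueError("bus width must be a multiple of the chip width")
    chips = bus_width_bits // chip_width_bits
    # One full rank of chips is the smallest amount of memory that can be added.
    granularity_bytes = chips * chip_capacity_bits // 8
    return chips, granularity_bytes

if __name__ == "__main__":
    for width in (4, 1):
        chips, gran = chips_and_granularity(32, 4 * MBIT, width)
        print(f"4-Mbit chip organized x{width}: {chips} chips, "
              f"{gran // MBYTE} MBytes minimum increment")
```

Calling the same function with a 64-bit bus shows the further growth in granularity that the text mentions.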

Currently, each SIMM can be used only with processors that have exactly the same bus-width. Even a configuration with half the bus-width cannot efficiently use the same SIMM. With the availability of variable-width RAMs, programmable SIMMs can easily be constructed. By changing the values on a few control pins, the chips can be reprogrammed to different configurations.

Another fact motivating the research in this paper is that different processors need different interface logic in order to be able to use the same memory chip. Currently, memory vendors manufacture separate cache modules for different processors. For instance, the Motorola MCM72BA32 and MCM72BA64 BurstRAMs [17] are designed to be secondary cache modules for the Intel Pentium processor. For the PowerPC and MC68040 processors, Motorola provides the 64K × 18 bit BurstRAM, MCM67M618. It would be desirable to embed programmable logic on a memory chip so that it may be easily interfaced to various processors.

The primary objective of this paper is to present the design of a variable width RAM (VaWiRAM) which can be configured to different widths. A second objective is to present the design of a fully programmable memory cell array (FPMCA) with programmable logic on the chip, enabling the memory module to be interfaced to various processors with a reduced parts count.

In Section 2, we describe the related research in the past. In Section 3, the design of the VaWiRAM is described. The design of the programmable output data control circuitry and the programmable column decoder unit are described in detail. Various trade-offs involved in the design are examined. In Section 4, extensions to the VaWiRAM, especially Field Programmable Memory Cell Arrays (FPMCAs), are explained. In Section 5, we offer concluding remarks.

2. RELATED RESEARCH

Some of the look-up table (LUT) based FPGAs from manufacturers such as Xilinx [22] and Altera [2] offer a method to realize reconfigurable RAM [20]. For instance, the look-up tables on the Xilinx XC4000 series of FPGAs can be used to construct RAMs, and it is possible to realize RAMs of different sizes using these FPGAs. However, since FPGAs are primarily intended for programmable logic, it is inefficient to use them as RAMs. For instance, the Xilinx XC4020 (a 20,000 gate chip with more than 240 pins) [22] can yield only 28.8k RAM (not really surprising, because the XC4020 is not intended to be a memory chip). What we present in this paper are dedicated memory resources with reconfigurability. Such FPMCAs were first discussed in [14]. Programmable memories have been discussed by Bertin et al. [4, 5]; however, the programmability there was for functionality rather than for configuration.

Some reconfigurability at the factory is available in most RAMs for fault-tolerance purposes. For instance, many RAMs incorporate spare rows and columns to replace faulty cells. In many cases, the programming technology is laser-interconnect and reconfigurability is limited to one time. This paper presents the design of a user-programmable memory cell array which can be adjusted to different configurations multiple times.


3. ARCHITECTURE

Conventional memory chips contain a matrix of memory cells, row and column address decoders, input and output buffers and control circuitry. In dynamic RAMs, the I/O circuits also include sense amplifiers and related circuitry. Figure 1 illustrates the architecture of the proposed VaWiRAM chip, which is similar to conventional memory chips except that programmability has been incorporated into the various blocks. For instance, the column address decoder (PCADU), output data control unit (PODCU), and input data control unit (PIDCU) are programmable. The row decoder and the memory cells are not shown to be reconfigurable, but they also could be made reconfigurable. We have restricted the programmability to a few selected units in order to reduce the cost of reconfigurability. (We chose the column decoder to be programmable, but one could have chosen the row decoder instead.) The architecture also includes an extra block, the Mode Control Unit (MCU), which receives mode information from the user and delivers appropriate control signals to the various programmable units.

In order to incorporate programmability into the VaWiRAM, we use pass gates and multiplexers. A pass transistor can be used to realize a programmable interconnection between two wires. The pass transistor turns on or off depending on the control value. When off, the pass gate presents a very high resistance and thereby an open connection between the two wires. When the pass gate is turned on, it forms a relatively low resistance and thereby a closed connection between the two wires. Thus the connection between the two wires can be turned on or off by changing the control values. Similarly, a bidirectional pass gate can be constructed with two pass transistors and an inverter. A multiplexer may be used if there are several input wires. The control signals for the pass gates and multiplexers are applied at special mode input pins (here, the MCU input pins).

FIGURE 1 Architecture of VaWiRAM.


The details of the various blocks inside the chip are presented below. The design uses a hardwired row decoder and a programmable column decoder. The job of the programmable circuitry is to configure the data width to the desired width and correspondingly configure the column decoder, column address pins and chip data pins. At this juncture, we define a few terms.

DEFINITIONS

Wmax: The maximum allowed width of the chip.
Wmax configuration: A VaWiRAM in its maximum width.
Wmin: The minimum allowed width of the chip.
Wmin configuration: A VaWiRAM in its minimum width.
Width variability factor (k): The ratio of the highest desired width to the lowest desired width.

For illustration purposes, we use a 1M × 1 chip which is reconfigurable to 512k × 2 bits, or 256k × 4 bits, or 128k × 8 bits, or 64k × 16 bits. Since we keep the row decoder fixed, the 1M × 1 chip can be viewed as one thousand 1k × 1 blocks as in Figure 2. For this chip, the maximum width is Wmax = 16, the minimum width is Wmin = 1, and the width-variability factor is k = 16.

FIGURE 2 A 1M-bit chip can be viewed as 1000 blocks of 1k bit size. (Figure label: programmable logic to realize column decoders, I/O buffers, and other control circuitry.)

3.1. Programmable Column Address Decoder Unit (PCADU)

When the data-width of a chip increases by a factor k, the number of address lines is reduced by log2(k); hence, if there are n address bits in the column address in the minimum width case, there will be n - log2(k) address bits in the maximum width case. For instance, the chip under design will need 20, 19, 18, 17, or 16 address pins and 1, 2, 4, 8, or 16 data pins respectively in the five different configurations.

The PCADU is set up in two stages, with a layer of multiplexers constituting the first stage as in Figure 3a, and a layer of decoders in the second stage as in Figure 3b.

THEOREM 1 If there are n bits in the column address in the minimum width case and n - log2(k) bits in the maximum width case, the programmable column decoder can be set up as k/2 multiplexers, each of size 2-to-1, and k decoders, each of size $(n - \log_2 k)$-to-$2^{\,n - \log_2 k}$.
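As a quick check of the sizing in Theorem 1 and of the pin counts listed above, the following minimal sketch tabulates both for a given n and k. The function names are hypothetical, and k is assumed to be a power of two as in the paper.

```python
def pcadu_structure(n, k):
    """Component sizes of the programmable column decoder per Theorem 1.

    n: column address bits in the minimum width case
    k: width variability factor (assumed to be a power of two)
    """
    levels = k.bit_length() - 1              # log2(k) for a power of two
    if k < 1 or (1 << levels) != k:
        raise ValueError("k must be a power of two")
    decoder_inputs = n - levels
    return {
        "muxes_2to1": k // 2,
        "decoders": k,
        "decoder_inputs": decoder_inputs,
        "decoder_outputs": 2 ** decoder_inputs,
    }

def pin_counts(total_address_bits, k):
    """List (width, address pins, data pins) for each power-of-two width up to k."""
    levels = k.bit_length() - 1
    return [(2 ** i, total_address_bits - i, 2 ** i) for i in range(levels + 1)]

if __name__ == "__main__":
    print(pcadu_structure(10, 16))
    for width, addr, data in pin_counts(20, 16):
        print(f"{width:2d} bit wide: {addr} address pins, {data} data pins")
```

For the example chip (n = 10, k = 16) this gives eight 2-to-1 multiplexers and sixteen 6-to-64 decoders, matching the numbers used in the running example.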

In this example, we set up the first stage as eight 2-to-1 muxes as in Figure 3a, and the second stage as sixteen 6-to-64 decoders as illustrated in Figure 3b. Let us assume that A0, A1, A2, ..., A19 denote the twenty address lines. In the 1M × 1 configuration, the lower 10 address pins A0, A1, ..., A9 will be connected to the row decoder, and the higher 10 address bits, A10, A11, ..., A19, are connected to the column decoder. The column decoder will use 10, 9, 8, 7, and 6 bits when the width of the chip is 1, 2, 4, 8, and 16 bits respectively. The 6 input lines to all of the sixteen 6-to-64 decoders in Figure 3b are connected to address lines A10, A11, ..., A15. Only one of the sixteen 6-to-64 decoders will be enabled if the chip width is 1. Two, four, eight or all sixteen of the decoders will be enabled simultaneously if the desired chip-width is 2, 4, 8, or 16 respectively. The circuitry that accomplishes the enabling of different numbers of decoders is shown in Figure 3a. The inputs and outputs of the circuitry in Figure 3a are illustrated in Table I.

FIGURE 3 Programmable column address decoders. (a) 2-to-1 multiplexers, each selecting between one of the address lines A19, A18, A17, A16 and a high level H. (b) Sixteen 6-to-64 decoders, each enabled by one combination of the signals a3/b3, a2/b2, a1/b1 and a0/b0.
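The decoder-enabling behaviour described for Figure 3 can be modelled in software. The sketch below is a behavioural model under stated assumptions, not the paper's circuit: it assumes that each b signal is the complement of the corresponding address line unless forced high, and that decoder d is gated by a_i where bit i of d is 1 and by b_i where it is 0. All function names are hypothetical.

```python
H = 1  # logic high, the constant input of each first-stage multiplexer

def first_stage(width, a19, a18, a17, a16):
    """Return ([a0, a1, a2, a3], [b0, b1, b2, b3]) for a given chip width."""
    if width not in (1, 2, 4, 8, 16):
        raise ValueError("width must be 1, 2, 4, 8 or 16")
    forced_high = width.bit_length() - 1     # log2(width) address lines are freed
    address = [a16, a17, a18, a19]           # index i feeds a_i and b_i
    a, b = [], []
    for i, bit in enumerate(address):
        if i >= 4 - forced_high:             # highest lines first: A19, then A18, ...
            a.append(H)
            b.append(H)
        else:
            a.append(bit)                    # a_i follows the address line
            b.append(1 - bit)                # b_i is its complement (assumption)
    return a, b

def enabled_decoders(width, a19, a18, a17, a16):
    """Indices (0..15) of the second-stage decoders enabled for this address."""
    a, b = first_stage(width, a19, a18, a17, a16)
    enabled = []
    for d in range(16):
        # Decoder d is enabled only when all four of its gating signals are high.
        gates = [a[i] if (d >> i) & 1 else b[i] for i in range(4)]
        if all(gates):
            enabled.append(d)
    return enabled

if __name__ == "__main__":
    for width in (1, 2, 4, 8, 16):
        print(width, enabled_decoders(width, 0, 1, 0, 1))
```

In this model the number of enabled decoders always equals the configured width, which is the property the text requires of the first stage.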

3.2. Programmable Output Data Control Unit (PODCU)

The PODCU in VaWiRAM incorporates pass gates in order to realize width variability. Pass gates are placed between the data lines as in Figure 4, so that the lines can be connected together or not, depending on the desired width.

THEOREM 2 To achieve a width-variability factor of k, log2(k) levels of pass gates are required.

Proof By connecting together pairs of data lines in the Wmax configuration, a width of Wmax/2 can be realized. Since the width-variability factor is k, Wmax/Wmin = k. The width gets divided by two for each level of pass gates; so log2(k) levels of pass gates will be required to achieve a width variability factor of k.

TABLE I Inputs and outputs of circuitry in Figure 3a

Configuration   a3    a2    a1    a0    b3    b2    b1    b0
1 bit wide      A19   A18   A17   A16   A19'  A18'  A17'  A16'
2 bit wide      H     A18   A17   A16   H     A18'  A17'  A16'
4 bit wide      H     H     A17   A16   H     H     A17'  A16'
8 bit wide      H     H     H     A16   H     H     H     A16'
16 bit wide     H     H     H     H     H     H     H     H

(A prime denotes the complemented address line. The input columns MC1, MC2, MC3 and MC4 are not legible in this copy apart from the entry 0000.)

FIGURE 4 Circuitry for reconfiguring data width: four levels of pass gates (level 1 to level 4) between the data lines D0 to D15, driven by the control voltages. A simplified symbol is used to represent the pass gate.

THEOREM 3 The total number of pass gates required to achieve a width-variability factor of k is

$$\sum_{i=1}^{\log_2 k} \frac{W_{\max}}{2^{i}}.$$

Proof In the Wmax configuration, (Wmax/2) pass gates are required to connect together pairs of data lines and obtain a width of (Wmax/2). (Wmax/4) pass gates will be required for the next level, (Wmax/8) pass gates for the next level, and so on. Hence the total number of pass gates will be

$$\frac{W_{\max}}{2} + \frac{W_{\max}}{4} + \frac{W_{\max}}{8} + \cdots + \frac{W_{\max}}{k}.$$

THEOREM 4 If Wmin = 1, the total number of pass gates required to achieve a width-variability factor of k is

$$W_{\max} - 1.$$

Proof If Wmin = 1, then k = Wmax and

$$\frac{W_{\max}}{2} + \frac{W_{\max}}{4} + \frac{W_{\max}}{8} + \cdots + \frac{W_{\max}}{k} = W_{\max} - 1.$$
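The counts in Theorems 2 to 4 are easy to verify numerically. The following sketch sums the per-level pass gate counts; the function names are hypothetical, and Wmax and k are assumed to be powers of two with k no larger than Wmax.

```python
def pass_gate_levels(k):
    """Number of pass gate levels for width-variability factor k (Theorem 2)."""
    levels = k.bit_length() - 1
    if k < 1 or (1 << levels) != k:
        raise ValueError("k must be a power of two")
    return levels

def pass_gates(w_max, k):
    """Total pass gates: Wmax/2 + Wmax/4 + ... + Wmax/k (Theorem 3)."""
    if k > w_max:
        raise ValueError("k cannot exceed w_max")
    return sum(w_max // (2 ** i) for i in range(1, pass_gate_levels(k) + 1))

if __name__ == "__main__":
    # Example chip: Wmax = 16, Wmin = 1, so k = 16.
    print(pass_gate_levels(16), pass_gates(16, 16))
```

With Wmin = 1 (so k = Wmax) the sum collapses to Wmax - 1, as Theorem 4 states; with a larger Wmin, fewer levels are needed and the total is correspondingly smaller.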

THEOREM 5 To achieve a width-variability factor of k, log2(k) control values are sufficient.

Proof To achieve a width-variability factor of k, log2(k) levels of pass gates are sufficient. All pass gates in the same level share the same control value. Hence log2(k) control values are sufficient. In fact, the number of control signals can be reduced further, as explained in the subsection on the Mode Control Unit (MCU).

For illustration, from Theorem 2, four levels of pass gates are required to change the width between 1 and 16. Figure 4 illustrates how 16 data lines can be controlled by 4 mode select lines to yield any width from 16 to 1 that is a power of two. From Theorem 3, the number of pass gates required can be verified to be 15. Figure 5 illustrates the values required to control the pass gates.

A similar circuit has to be incorporated into the input circuitry (PIDCU) also. Because of the similarity, the design of the PIDCU is not described here. The mode select lines controlling the pass transistors in the input data lines can be the same as those in the output circuitry.

FIGURE 5 Values for the select signals (MC4, MC3, MC2, MC1) for the 5 different configurations: 16, 8, 4, 2 and 1 bit wide.

3.3. Mode Control Unit (MCU)

The control values required for the PCADU, PODCU and PIDCU are identical and are generated by the MCU. Not all combinations of the log2(k) control values are used, and hence we can further minimize the number of control signals required for the operation of the chip.

THEOREM 6 If the width-variability factor is k, $\lceil \log_2(\log_2 k + 1) \rceil$ bits of information are sufficient for the mode information.

Proof If the width-variability factor is k, there are log2(k) + 1 distinct widths, considering the restriction that only powers of two are allowed. If there are log2(k) + 1 distinct widths, $\lceil \log_2(\log_2 k + 1) \rceil$ bits of information will be sufficient to indicate the desired width.

Illustration In our example, the width-variability factor is 16. Hence there will be log2(16) + 1, or 5, distinct widths that may be desired (namely 1, 2, 4, 8, and 16). Since there are 5 different modes to be supported, $\lceil \log_2 5 \rceil$, or 3, bits are required to represent the mode information. Even for programming from a 1 bit to a 1024 bit width, only four lines are sufficient.
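The bit count of Theorem 6 can be evaluated for any power-of-two k with integer arithmetic alone. This is a small illustrative helper with a hypothetical name, not code from the paper.

```python
def mode_bits(k):
    """Bits needed to encode the mode: ceil(log2(log2(k) + 1)) for power-of-two k."""
    if k < 1 or k & (k - 1) != 0:
        raise ValueError("k must be a power of two")
    distinct_widths = k.bit_length()          # log2(k) + 1 widths: 1, 2, 4, ..., k
    # For a positive integer m, (m - 1).bit_length() equals ceil(log2(m)).
    return (distinct_widths - 1).bit_length()

if __name__ == "__main__":
    for k in (16, 256, 1024):
        print(f"k = {k}: {k.bit_length()} widths, {mode_bits(k)} mode bits")
```

Using `bit_length` instead of floating-point logarithms avoids rounding surprises near exact powers of two.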

For our example design, the mode control unit has three input bits (M3, M2, M1), which are used to generate the select signals (MC4, MC3, MC2, MC1) for the programmable column decoder and the input and output data control units, according to Figure 6. The assignment of input combinations to output combinations is rather arbitrary, and different choices can lead to different numbers of gates in an implementation. The input combinations shown in Figure 6 are the trivial choice but not the optimal choice. There are ways to overlap the functions of some of the mode control pins with the address pins, which are discussed later.

FIGURE 6 MCU inputs (M3, M2, M1) and outputs (MC4, MC3, MC2, MC1).

3.4. Tradeoffs

3.4.1. Pin Limitations

The most serious problem associated with VaWiRAMs is the requirement for a large number of pins for the chip. VaWiRAM chips should be able to support the number of data pins corresponding to the largest width to be supported, the number of address pins corresponding to the smallest width, and at least $\lceil \log_2(\log_2 k + 1) \rceil$ mode pins. Wider configurations naturally require a larger number of data pins, and hence the increase in the number of data pins should not be considered a disadvantage. The need to have mode control pins is a disadvantage; however, we can overlap some of the mode control pins with the address pins as described below.

While increasing the data width by k times, the number of address lines required reduces by log2(k), and those lines can be used to deliver the mode information to the chip. Figure 7 illustrates the manner in which the address lines and the mode control bits can be shared. All mode control pins except one can be time-shared with the address pins. That line (M), if 0, commands the chip to be in the narrowest possible configuration. If this mode control line is 1, the chip is in a wider mode. Hence at least the highest address line (here, A19) is not required for addressing. So the highest address line can be used for mode information. If that line is 0, the chip is twice as wide as the narrowest configuration. If A19 is a 1, the chip is even wider. So A18 is not required for addressing and will be used for mode information. If A18 is 0, the chip is 4 times as wide as the narrowest configuration. If A18 is 1, the chip is wider than 4 times the original width. So A17 will not be required for addressing and can be used for mode information. If A17 is a 0, the chip is 8 times the original width, and if A17 is a 1, the chip is 16 times the original width. Thus mode control can be accomplished by just one extra pin.

FIGURE 7 Combining address pins with mode pins (x denotes a don't-care value).

Inputs                      Outputs                 Configuration
M   A19  A18  A17  A16      MC4  MC3  MC2  MC1
0   x    x    x    x        1    1    1    1        1 bit wide
1   0    x    x    x        1    1    1    0        2 bit wide
1   1    0    x    x        1    1    0    0        4 bit wide
1   1    1    0    x        1    0    0    0        8 bit wide
1   1    1    1    x        0    0    0    0        16 bit wide
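The pin-sharing scheme amounts to a priority decode: the dedicated mode pin is examined first, and each freed address line is examined only if all earlier ones were 1. The sketch below is a behavioural illustration under that reading of the text and of Figure 7; the function name and argument layout are hypothetical.

```python
def decode_width_factor(m, freed_address_bits):
    """Return the width as a multiple of the narrowest configuration.

    m: the single dedicated mode pin (0 selects the narrowest configuration)
    freed_address_bits: address lines reused as mode bits, highest line first;
        later entries are don't-cares once a 0 has been seen.
    """
    if m == 0:
        return 1
    factor = 2                      # m == 1: at least twice as wide
    for bit in freed_address_bits:
        if bit == 0:
            return factor           # this line marks the end of the mode code
        factor *= 2                 # a 1 means "wider still"; check the next line
    return factor

if __name__ == "__main__":
    cases = [(0, (0, 0, 0)), (1, (0, 0, 0)), (1, (1, 0, 0)),
             (1, (1, 1, 0)), (1, (1, 1, 1))]
    for m, bits in cases:
        print(m, bits, "->", decode_width_factor(m, bits), "x narrowest width")
```

With three shared address lines this distinguishes the five widths of the example chip, which is why only one pin beyond the address pins is needed.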

3.4.2. Area Overhead

The VaWiRAM requires more hardware than a conventional RAM. The input and output data control units will need (Wmax - 1) extra pass gates each. Similarly, the programmable address decoder unit requires k/2 extra multiplexers, each of size 2-to-1. The mode control unit also consumes a few gates.

Table II illustrates the area overhead for achieving the programmability for a 64k bit memory array. The overhead is computed as the ratio of the number of transistors utilized for reconfigurability to the number of transistors utilized for the actual memory cells. Static RAM cells with 6 transistors are assumed for the RAM. A pass gate is assumed to contain 4 transistors and a 2-input mux is assumed to contain 6 transistors including the required inverters. The mode control unit is assumed to be built as a compound logic function implemented at transistor level. Figure 8 illustrates the percentage overhead for programmability up to a width variability factor of 256, for 16k, 64k and 1M RAMs.

TABLE II Area overhead for VaWiRAMs

Size of RAM (cells)   Width (bits)   Overhead area (transistor count)   % Overhead
64k                   1-16           208                                0.053%
64k                   1-32           394                                0.100%
64k                   1-64           756                                0.192%
64k                   1-128          1470                               0.372%
64k                   1-256          2888                               0.729%

Compared to the flexibility yielded by the design, the incurred overhead is negligible. For example, for a 1M RAM, the area expended in achieving a programmable width from 1 bit to 256 bits is only 0.046%. For smaller RAMs, the percentage overhead is higher. Overall, the area overhead increases with the width variability factor k and reduces with increasing array sizes.
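The percentages in Table II can be recomputed from the transistor counts. One observation, which is an inference from the numbers and not something the text states: the printed values appear to be reproduced when the overhead is divided by the total transistor count (memory plus overhead), which also agrees with the later remark that 50% overhead means the overhead equals the usable memory area. The sketch below uses that assumption; the function names are hypothetical, and the mode control unit's transistor count is not given in the text, so it is left out of the datapath estimate.

```python
CELL_TRANSISTORS = 6        # static RAM cell, as assumed in the text
PASS_GATE_TRANSISTORS = 4   # per pass gate, as assumed in the text
MUX_TRANSISTORS = 6         # per 2-input mux including inverters, as assumed in the text

def overhead_percent(cells, overhead_transistors):
    """Overhead as a share of all transistors (assumed denominator: memory + overhead)."""
    memory = cells * CELL_TRANSISTORS
    return 100.0 * overhead_transistors / (memory + overhead_transistors)

def datapath_transistors(k):
    """Pass gates of the input and output units plus first-stage muxes, for Wmin = 1.

    Excludes the mode control unit, whose size the text does not state.
    """
    pass_gates = 2 * (k - 1)        # (Wmax - 1) each for input and output, Wmax = k
    muxes = k // 2
    return pass_gates * PASS_GATE_TRANSISTORS + muxes * MUX_TRANSISTORS

# Transistor counts copied from Table II (64k cells).
TABLE_II = {16: 208, 32: 394, 64: 756, 128: 1470, 256: 2888}

if __name__ == "__main__":
    for k, transistors in TABLE_II.items():
        pct = overhead_percent(64 * 1024, transistors)
        print(f"1-{k}: {transistors} transistors, {pct:.3f}% "
              f"(datapath alone: {datapath_transistors(k)})")
```

The difference between each Table II count and the datapath estimate would then be attributable to the mode control unit, but that attribution is a guess and should be checked against the original design.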

FIGURE 8 Area overhead of VaWiRAMs of sizes 16k, 64k and 1M bits, plotted against the variability factor (k).


3.4.3. Time Considerations

The circuitry incorporated for the reconfigurability of VaWiRAMs can contribute to longer line delays and slower switching. The various programmable units, namely the PCADU, PODCU, and MCU, will add a few gate delays to the RAM access process unless carefully implemented. A naive design will slow down the chip; however, on careful observation, it was clear that many of these delays could be eliminated by appropriate circuit design. The delays can be reduced or eliminated by overlapping them with operations in the conventional RAM. It is difficult to obtain a precise estimate of the delays without detailed simulations of the whole circuit, and such an analysis is beyond the scope of this paper. A preliminary evaluation shows that the PODCU will contribute one transistor delay. Similarly, the MCU will add two transistor delays.

4. EXTENSIONS TO THE VaWiRAM ARCHITECTURE

In VaWiRAM, we restricted reconfigurability to the column decoder and the input-output data control. We could remove this restriction to yield a chip with a programmable memory array as well as programmable row and column decoders. The most general view of such a programmable memory cell array would be as in Figure 9. There will be a sea of memory blocks (individual memory cells or large groups of cells) which can be interconnected in ways desired by the user, using programmable interconnect present in various locations within the chip. The row and column decoders, I/O circuits and interconnections between the memory blocks will be configured by the user from the programmable logic present in the chip. The chip will also have I/O blocks which can be configured as input or output buffers. The pins on the chip are not hard-wired to any address or data lines and will be connected by the user to yield the desired configuration. These programmable logic blocks may be similar to the configurable logic blocks in commercial FPGAs, or they can be tailored to constitute elements specific to the interface logic needed by memory chips. The programmable logic can be used to dynamically alter the data width of the RAM, and to configure any control logic required to interface the memory to the processor.

The voltages controlling the various programmable points in the chip reside in static RAM cells constituting the 'configuration RAM'. The chip configuration can be altered by loading a different set of bits into the configuration RAM. Smaller memory blocks will yield increased flexibility during reconfiguration but will increase the overhead associated with reprogramming. A larger amount of routing resources will ease routing but will increase the overhead. Partitioning, placement and routing algorithms developed in connection with FPGAs may be used for FPMCAs as well.

Every programmable point consumes silicon and reduces the real estate that can be used for memory cells. Fine-grain programmability, in which every memory block is a single cell, will be very inefficient for this reason. A whole range of devices, from less efficient and highly flexible to highly efficient and less flexible, can be designed, as in Figure 10. The key to designing a successful programmable device is striking the right balance between the overhead associated with reprogrammability and the flexibility gained by it.

The FPMCA concept could be implemented in several ways. Depending on the amount of programmability provided by the chip, we categorize FPMCAs into various levels as indicated in Table III.

(i) Level 0 is conventional RAM with no reconfig-urability of organization. There is no extraflexibility or overhead. Level 0 is defined onlyto form a baseline configuration for compar-ison purposes. It may be noticed that level 0FPMCAs have an overhead of 0%.

(ii) Level 1 or Coarse Grain ReconfigurableFPMCA is in fact VaWiRAM; i.e., RAMwith very minimal programmability. A chip

Page 10: Memory Chips with Adjustable Configurations*downloads.hindawi.com/journals/vlsi/1999/062801.pdf · [20]. Forinstance, the lookuptables ontheXilinx XC4000series ofFPGAscanbeusedto

212 L.K. JOHN

I/0 Block

MemoryBlock

Programmable Logic Block Programmable Interconnect

FIGURE 9 A field programmable memory cell array (FPMCA) with a sea of memory blocks.

Ftexibility,Chip cost

andoverheadincreaseas we

progressin this

direction

Level 0 FPMCA (conventional RAM)

FPMCAs (VaWiRAMs)

FIGURE 10 The spectrum of programmable memory cellarrays.

reconfigurable between widths and Wma canbe constructed using Wma pass gates,(Wmax/2) 2-to-1 multiplexers, and Ilog2[log2(k) + 1] mode pins. Level FPMCAs have anoverhead typically smaller than 10%. The areaoverhead of a few FPMCAs with program-mability equal to the square root of the totalarray size is presented in Table IV.The area overhead is seen to be less than 6.5%

for RAMs of size lk to 64k bits. As the size ofthe RAM reduces, the overhead can go beyond

TABLE III Classification of FPMCAs

Classification Reconfigurability Basic memory grain Typical overhead

Level 0 (Conventional RAM) None The whole chip noneLevel (VaWiRAM) Coarse grain or more columns Less than 10%Level 2 Medium grain Several cells 10-50%Level 3 Fine-grain cell Greater than 50%

Page 11: Memory Chips with Adjustable Configurations*downloads.hindawi.com/journals/vlsi/1999/062801.pdf · [20]. Forinstance, the lookuptables ontheXilinx XC4000series ofFPGAscanbeusedto

MEMORY CHIPS 213

TABLE IV Area overhead for VaWiRAMs with a widthvariability factor equal to the square root of the size of thearray

Size of RAM Width variability factor % Overhead(cells) (k)

256 16 11.894%1024 32 6.029%4096 64 2.984%16384 128 1.473%65536 256 0.729%

TABLE V Area overhead for level 3 FPMCAs

Size of RAM Width variabilityfactor

(cells) (k)

% Overhead

256 256 65.3%512 512 65.0%1024 1024 64.9%2048 2048 64.8%

10% (for instance, in the case of a 256 bitarray with a width variability factor of 16,the overhead is 11%). However, it should beremembered that the classification is beingused only to illustrate the general trend thatreconfigurability at a fine-grain costs morethan coarse-grain reconfigurability.

(iii) Level 2 or Medium Grain Reconfigurable FPMCAs contain two-dimensional programmability rather than only programmability of width as in VaWiRAMs. Each of the memory blocks is not a single cell, but a macrocell consisting of a two-dimensional array of individual cells. Level 2 FPMCAs have overhead typically ranging from 10% to 50%. It may be noticed that according to our definition, 50% overhead means that the overhead is equal to the usable memory area. Overhead decreases as the size of the macro-block increases.

(iv) Level 3 or Fine Grain Reconfigurable FPMCAs contain a highly programmable memory array in which each memory block is an individual cell. As expected, the area overhead is extremely high, and is higher than the real estate spent on memory. (This is not too surprising: in many FPGAs, more silicon is spent on interconnections than on programmable logic.) Level 3 FPMCAs have an overhead higher than 50%. An evaluation of the overhead for level 3 FPMCAs of sizes 256 bits to 2k bits is illustrated in Table V. It may be noticed that this computation considers only the hardware required to yield a programmable array; it does not consider any programmable interface logic on the chip intended to interface the chip to processors or other devices.
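The overhead bands used in this classification can be illustrated with a short Python sketch. It assumes, based on the remark that 50% overhead means the overhead area equals the usable memory area, that overhead is expressed as a fraction of the total (memory plus overhead) area. The function names `overhead_percent` and `typical_level` are hypothetical and are not taken from the paper.

```python
def overhead_percent(overhead_area: float, memory_area: float) -> float:
    """Overhead as a percentage of total area (assumed definition).

    With this definition, overhead_area == memory_area gives 50%.
    """
    total = overhead_area + memory_area
    if total <= 0:
        raise ValueError("total area must be positive")
    return 100.0 * overhead_area / total


def typical_level(overhead_pct: float) -> int:
    """Return the FPMCA level whose typical-overhead band (Table III)
    contains the given overhead percentage.

    This is only a lookup of the typical bands; the real classification
    is by reconfiguration grain, and the bands are not rigid.
    """
    if overhead_pct < 0 or overhead_pct > 100:
        raise ValueError("overhead must be between 0 and 100 percent")
    if overhead_pct == 0:
        return 0          # conventional RAM: no overhead
    if overhead_pct < 10:
        return 1          # coarse grain: less than 10%
    if overhead_pct <= 50:
        return 2          # medium grain: 10-50%
    return 3              # fine grain: greater than 50%


if __name__ == "__main__":
    # Percentages taken from Tables IV and V.
    for pct in (11.894, 6.029, 0.729, 65.3):
        print(f"{pct:7.3f}% overhead falls in the Level {typical_level(pct)} band")
```

Note that the 256-cell VaWiRAM of Table IV (11.894%) falls in the Level 2 band even though it is a Level 1 device, which is an instance of the classification not being rigid.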

The distinction between different classes of devices is not rigid. As mentioned before, the classification is intended only to illustrate the general trends. FPMCAs are a new concept and the treatment provided here is rather preliminary. However, it is an important step towards understanding the tradeoffs in their design.

The dynamic reconfigurability of FPGAs [2, 6, 22] has been employed in several interesting applications [3, 8, 9, 13]. The availability of FPMCAs can lead to interesting applications requiring large amounts of flexible memory. They will be useful for developing adaptive cache architectures or branch predictors that dynamically reconfigure themselves to suit the application being run on the processor. As in FPGA-based systems, where hardware upgrades become a software procedure that can be accomplished by sending a different configuration pattern on a diskette or via an electronic transmission [16], systems with FPMCAs can easily be upgraded or reconfigured. The proposed FPMCA will add more flexibility for all users of memory chips. Memory manufacturers need not make separate ×1, ×4, ×16 and ×32 chips; the proposed generic memory array can be programmed to the desired configuration easily.

The FPMCAs will also be extremely useful for prototyping large memory systems. Field Programmable Interconnect Devices (FPIDs) [1] are often combined with FPGAs to realize field programmable boards [7]. The FPMCA will add more flexibility to such field programmable boards and programmable backplanes. Currently, different memory chips have to be interconnected through FPIDs in order to realize flexible memory organizations. With the availability of FPMCAs, the programmable boards and backplanes will become more efficient. FPMCAs will be a powerful addition to the suite of programmable hardware consisting of FPGAs, FPIDs, and field programmable boards and backplanes.
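The point that one generic array could stand in for separate ×1, ×4, ×16 and ×32 parts can be made concrete with a small Python sketch. It simply enumerates, for a fixed capacity, the depth and address-bit count of each organization. The function name `organizations` and the choice of a 4-Mbit capacity are illustrative assumptions; the sketch says nothing about how the paper's hardware selects a width.

```python
def organizations(total_bits: int, widths=(1, 4, 16, 32)):
    """List (width, depth, address_bits) for each width of a fixed-capacity array.

    depth is the number of addressable words; address_bits is the number
    of address lines needed to select one word.
    """
    rows = []
    for width in widths:
        if width <= 0 or total_bits % width != 0:
            raise ValueError(f"capacity {total_bits} is not divisible by width {width}")
        depth = total_bits // width
        # ceil(log2(depth)) computed with integer arithmetic
        address_bits = (depth - 1).bit_length()
        rows.append((width, depth, address_bits))
    return rows


if __name__ == "__main__":
    four_mbit = 4 * 2**20  # 4-Mbit part, used here only as an example
    for width, depth, address_bits in organizations(four_mbit):
        print(f"x{width:<2} : {depth:>8} words, {address_bits} address bits")
```

Each narrower organization needs more address bits, which is consistent with the idea of overlapping address pins with mode control pins: the widest configurations leave some address pins unused.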

5. SUMMARY AND CONCLUDING REMARKS

In this paper, we presented the design of a variable width RAM called VaWiRAM, which allows the user to configure the RAM to the desired width. A VaWiRAM chip reconfigurable between widths 1 and Wmax can be constructed at an additional cost of Wmax pass gates, (Wmax/2) 2-to-1 multiplexers, and ⌈log2[log2(k) + 1]⌉ mode pins. This paper discussed the architecture of the proposed Variable Width RAMs (VaWiRAMs), and analyzed the trade-offs in their design. Although a naive implementation requires ⌈log2[log2(k) + 1]⌉ mode pins, a novel scheme requiring only a single mode pin was presented. We also discussed a powerful extension to the VaWiRAM, the Field Programmable Memory Cell Arrays (FPMCAs), which are reconfigurable memory chips with embedded logic to realize the interface logic to other system components with reduced parts count. The FPMCAs illustrate the potential for a fully reconfigurable memory component, whereas the VaWiRAM illustrates that limited versions of such systems can be constructed with extremely low overhead. Although the concept can be applied to static and dynamic RAMs, the design of adjustable dynamic RAMs is more complex (compared to SRAMs) due to the need for refreshing.
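The hardware cost quoted above can be tabulated with a short Python sketch. It rests on two assumptions that are not spelled out in this section: that the supported widths are powers of two, so that a width variability factor k gives log2(k) + 1 selectable widths, and that the mode-pin expression is rounded up to a whole number of pins. The function name `vawiram_extra_hardware` is hypothetical.

```python
def vawiram_extra_hardware(w_max: int, k: int) -> dict:
    """Extra hardware for a VaWiRAM with maximum width w_max and
    width variability factor k (both assumed to be powers of two).
    """
    for name, value in (("w_max", w_max), ("k", k)):
        if value < 1 or value & (value - 1):
            raise ValueError(f"{name} must be a positive power of two")

    # For a power of two, log2(k) == k.bit_length() - 1,
    # so log2(k) + 1 selectable widths is k.bit_length().
    num_widths = k.bit_length()
    # ceil(log2(num_widths)) with integer arithmetic (naive mode-pin scheme)
    mode_pins = (num_widths - 1).bit_length()

    return {
        "pass_gates": w_max,
        "mux_2to1": w_max // 2,
        "selectable_widths": num_widths,
        "mode_pins": mode_pins,
    }


if __name__ == "__main__":
    # Width variability factors from Table IV, taking w_max equal to k.
    for k in (16, 32, 64, 128, 256):
        print(k, vawiram_extra_hardware(w_max=k, k=k))
```

For k = 16, for example, there are five selectable widths, so the naive scheme needs three mode pins; the single-pin scheme mentioned in the paper is not modeled here.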

References

[1] Aptix, The Programmable Interconnect Company Data Book, February 1993.

[2] http://www.altera.com

[3] Athanas, P. M. and Silverman, H. F., "Processor Reconfiguration Through Instruction Set Metamorphosis", Computer, March 1993, pp. 11-18.

[4] Bertin, P., Roncin, D. and Vuillemin, J. (1989). "Introduction to Programmable Active Memories", PRL Report 3, Digital Equipment Corporation, Paris Research Lab.

[5] Bertin, P., Roncin, D. and Vuillemin, J. (1992). "Programmable Active Memories: Performance Measurements", ACM/SIGDA Workshop on Field Programmable Gate Arrays, pp. 57-59.

[6] Brown, S. D., Francis, R. J., Rose, J. and Vranesic, Z. G. (1992). Field Programmable Gate Arrays, Kluwer Academic Publishers.

[7] Chan, P. K., Schlag, M. D. F. and Martin, M. (1992). "BORG: A Reconfigurable Prototyping Board using Field-Programmable Gate Arrays", ACM/SIGDA Workshop on Field Programmable Gate Arrays, pp. 47-51.

[8] Cox, C. E. and Blanz, W. E., "GANGLION - A Fast Field-Programmable Gate Array Implementation of a Connectionist Classifier", IEEE Journal of Solid State Circuits, 27(3), March 1992.

[9] Dobbelaere, I., Gamal, A. E., How, D. and Kleveland, B. (1992). "Field Programmable MCM Systems - Design of an Interconnection Frame", ACM/SIGDA Workshop on Field Programmable Gate Arrays, pp. 52-56.

[10] Katayama, Y., "Trends in Semiconductor Memories", IEEE Micro, November/December 1997.

[11] Prince, B., "Memory in the fast lane", IEEE Spectrum, Feb. 1994, pp. 38-41.

[12] Prince, B. (1991). Semiconductor Memories, John Wiley & Sons.

[13] Gokhale, M., Holmes, W., Kosper, A., Lucas, S., Minnich, R., Sweely, D. and Lopresti, D., "Building and Using a Highly Parallel Programmable Logic Array", IEEE Computer, January 1991, pp. 81-89.

[14] John, L. K., "VaWiRAM: A Variable Width Random Access Memory Module", Proceedings of the Ninth International Conference on VLSI Design, January 1996.

[15] Special Report on Memory Systems, IEEE Spectrum, October 1992, pp. 34-57.

[16] Hsieh, W. J. (1993). "Reconfigurable Hardware", Computer Design, March 1993, pp. 27-30.

[17] Motorola Fast SRAM data book, 1993.

[18] Przybylski, S. (1990). Cache and Memory Hierarchy Design: A Performance-Directed Approach, Morgan Kaufmann.

[19] Quinnell, R. A., "High Speed Bus Interfaces", Electronic Design News, Sept. 30, 1993, p. 42.

[20] Salcic, Z. and Smailagic, A. (1997). Digital Systems Design and Prototyping Using Field Programmable Logic, Kluwer Academic Publishers, p. 75.

[21] Tuite, D., "Cache architectures under pressure to match CPU performance", Computer Design, March 1993, pp. 91-97.

[22] http://www.xilinx.com/

Author’s Biography

Lizy Kurian John is an assistant professor in the Department of Electrical and Computer Engineering at the University of Texas at Austin. She received her Ph.D. in Computer Engineering from Penn State in August 1993. Her research interests include high performance processor and memory architectures, program behavior studies, compiler optimization for high performance processors, and rapid prototyping. Currently she is supported by grants from the National Science Foundation, including a CAREER Award, and the State of Texas Advanced Technology Program. She received a Junior Faculty Enhancement Award from Oak Ridge Associated Universities in 1996-1997. She is a senior member of IEEE and a member of ACM and ACM SIGARCH. She is also a member of Eta Kappa Nu, Tau Beta Pi and Phi Kappa Phi.
