Expedited-Compact Architecture for Average Scan Power Reduction
Samah Mohamed Ahmed Saeed
New York University-Polytechnic Institute
Ozgur Sinanoglu
New York University-Abu Dhabi
EXCESSIVE SWITCHING ACTIVITY during
scan operations endangers the reliability of the
chip under test. Elevated levels of peak power, which
is the maximum instantaneous power throughout
the entire test process, may result in yield loss, while
high levels of average power (that is, the total power
dissipation averaged over the duration of the test
application process) lead to overheating of the
chip under test [2]. As the shift operations dominate
the test application process, average power mostly
depends on scan power, and thus, the impact of
capture power on average power is negligible.
Capture power is more of a concern when the target
is a reduction in peak power. Researchers have
proposed numerous scan power reduction methodologies,
ranging from test generation and x-filling to
scan chain segmentation via clock gating; various
papers [2]–[7] outline these techniques in detail.
A recent trend has been
low-power test solutions in
the context of compression-
based scan architectures
where filling of x’s for higher
compression and for lower
power are two conflicting
objectives. Test generation
and/or x-filling solutions
for addressing shift and/or capture power have
attained reductions at the expense of an increase in
pattern count, and thus in test costs. An ideal solution
is one that retains cost-quality metrics (pattern count,
compression level, fault/defect coverage) intact
without interfering with the design flow via intrusive
techniques such as clock gating.
Very recently, Chandra et al. proposed a Design-for-Testability
(DfT)-based approach [8], which we shall
refer to as Deferred-Broadcast (DB), for reducing scan-in
power in the Illinois scan architecture. In this scheme,
only one reference chain receives and subsequently
broadcasts the stimulus into the other chains during the
final small fragment of the shift process, thus allowing
all-but-one chains to receive constant-0’s for the
majority of shift cycles. Lower scan-in power is the
end result, while the scan chains eventually receive
the intended stimulus intact prior to capture; this
technique also works without clock gating.
The shortcoming of the DB architecture [8] is that it
only targets scan-in power reduction and overlooks
scan-out power. While each stimulus and response
transition equally contributes to switching activity
during test, scan-out power typically dominates test
power; a DfT engineer can always fill the stimulus
“don’t care” bits (x’s) that remain post-compression
properly (0-fill or repeat-fill) to limit the scan-in power,
while such direct control over response transitions,
with the exception of probabilistic and inexact
simulations, does not exist. Thus, although the DB
architecture [8] may attain significant savings in scan-in
power, these savings may correspond to only a small
fraction of the overall scan power.

Editor’s notes: In expedited-compact scan, the output response of the STUMPS channels is smartly compacted without the overhead of the full-scan chain-shift operation, thereby reducing the scan mode power. The authors also propose suitable integration with other scan compression methods.
Dr. Rubin Parekhji, Texas Instruments

1 We have presented a preliminary version [1] of this work at the VLSI Test Symposium 2011 in Dana Point, CA, USA, and received the best paper award.

2168-2356/12 © 2012 IEEE. May/June 2013. Copublished by the IEEE CEDA, IEEE CASS, IEEE SSCS, and TTTC.
Digital Object Identifier 10.1109/MDT.2012.2213793
Date of publication: 17 August 2012; date of current version: 23 September 2013.
In this work, we propose a complementary
solution, Expedited-Compact (EC), that targets scan-
out power reduction, reducing average scan power.
The expedited-compact feature in the proposed
architecture enables the collection of the com-
pacted responses in a few chains by utilizing them
as buffer. Overwriting of the captured response
(upon its expedited compaction) in all the other
scan chains with shifted constant-0 values in turn
delivers reductions in scan-out power. For industrial
cases that employ 0-fill so as to eliminate transitions
in stimuli, the proposed technique is 5 to 66 times
more effective than DB [8] in reducing average test
power. The proposed features incur a very minor
area cost, yielding significant power savings cost-
effectively. Furthermore, as the proposed EC and the
previous DB approaches are complementary and
orthogonal, their joint application delivers both
scan-in and scan-out power reduction.
The EC architecture does not require design-flow
intrusive hardware such as clock gating logic,
retaining the clock tree intact, which differentiates
the proposed solution from the traditional scan
chain segmentation techniques [9]. It retains test
development (test generation, x-fill) and application
(test data, pattern count, fault/defect quality) intact.
EC can deliver 70–85% average scan power reduc-
tion at a projected area cost of less than 0.1% for
large-sized industrial circuits.
Proposed Expedited-Compact (EC) architecture
Assume that a given scan architecture has four
scan chains feeding a 4 × 1 compactor; Figure 1
shows the proposed EC architecture for this
configuration. As the compactor has a single output,
only one chain (topmost) is designated as the
reference chain (R), while the other (three) chains
are the shadow chains (S). The additional compactor
(shaded color) introduced in between the regions
performs the expedited compaction operation. The
new compactor feeds the reference chain of Region 2
with the compressed response of all the chains of
Region 1, while simultaneously the original compac-
tor propagates the compressed response of Region 2
to the scan-out channel. Also during the first half of the
shift operations, constant-0 stimulus feeds the shadow
chains of Region 2. By the end of the first half of shift
cycles, the chains in Region 1 consist of inserted
stimulus, the reference chain in Region 2 consists of
the compacted response, and the shadow chains of
Region 2 consist of all 0’s.
Figure 1. Expedited-Compact (EC): 2 regions.
IEEE Design & Test26
Average Scan Power Reduction Architecture
In the second half of the shift cycles, the stimulus
feed into all the chains in Region 1 continues,
while the compacted response in the reference
chain of Region 2 passes on to the scan-out channel.
Simultaneously, the stimulus in Region 1 passes on
to Region 2. A simple counter-based controller,
similar to the one in [8], can control the select lines
of the multiplexer, eliminating the need for any
dedicated external pins or additional control data.
Note that EC does not require physical partition-
ing of the chip but rather inserts, on the test path,
multiplexer and compaction logic, which is typically
slow and can be distributed physically to ease
routing. As the associated delay can already be
afforded in the conventional scan (between the last
scan cell and the output channel), it is reasonable to
expect that the same delay can be tolerated in
between the scan cells; if not, scan pipelining or
balancing registers [10] can be utilized at the
expense of additional area.
While Figure 1 illustrates the proposed EC
architecture for only two regions, a larger number
of regions can increase the scan-out power savings.
EC with r regions enables the filling of all the shadow
chains, except for those in the leftmost region, with
0’s subsequent to one r-th of the shift cycles,
collecting all the compacted responses in the
reference chain at this time. Thus, during the
remainder of the shift cycles (the last $(r - 1)/r$ portion),
the scan-out power dissipation occurs only in the
reference chain. We will show in the next section that
EC attains a reduction factor of $(r \cdot c)/(r + c - 1)$ in
average scan-out power for c chains.
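The shift behavior described above can be checked with a small behavioral model. The following Python sketch is only an illustration under simplifying assumptions that are not part of the original text: all chains have equal length, the length is divisible by the number of regions, chain 0 plays the role of the reference chain, the compactor is a single-output XOR tree, and every name (`shift_with_ec`, `responses`, `stimuli`) is hypothetical.

```python
from functools import reduce
from operator import xor


def shift_with_ec(responses, stimuli, regions=1):
    """Shift out captured responses while shifting in new stimuli.

    responses[i][j]: captured bit in cell j of chain i (cell 0 is nearest scan-in).
    stimuli[i][t]:   bit fed into chain i during shift cycle t.
    Chain 0 is treated as the reference chain; regions=1 models plain scan.
    Returns (observed compactor output, final chain contents, total cell toggles).
    """
    n_cells = len(responses[0])
    if n_cells % regions:
        raise ValueError("this sketch assumes equal-length regions")
    frag = n_cells // regions
    state = [list(chain) for chain in responses]
    observed, toggles = [], 0
    for cycle in range(n_cells):
        # Original compactor: XOR of the last cell of every chain.
        observed.append(reduce(xor, (chain[-1] for chain in state)))
        # Expedited compaction is active only during the first 1/r of the cycles.
        expedite = regions > 1 and cycle < frag
        next_state = []
        for i, chain in enumerate(state):
            new_chain = [stimuli[i][cycle]]
            for j in range(1, n_cells):
                if expedite and j % frag == 0:
                    # Region boundary: the reference chain receives the compacted
                    # output of the region to its left, shadow chains receive 0.
                    if i == 0:
                        new_chain.append(reduce(xor, (c[j - 1] for c in state)))
                    else:
                        new_chain.append(0)
                else:
                    new_chain.append(chain[j - 1])  # ordinary shift
            toggles += sum(a != b for a, b in zip(chain, new_chain))
            next_state.append(new_chain)
        state = next_state
    return observed, state, toggles
```

Under these assumptions, calling `shift_with_ec(responses, stimuli, regions=r)` for any r that divides the chain length returns the same observed stream and the same final chain contents as `regions=1`, which mirrors the claim that EC leaves the compacted response and the applied stimulus unchanged; only the toggle count differs.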
It is important to differentiate EC from another
architectural solution that breaks chains into shorter
ones, utilizes multiple compactors and reduces the
shift speed. Such a solution delivers average power
savings while retaining the test time similar to EC; yet
key features of the scan architecture, such as the
number of scan chains, the number of output
channels and thus the tester interface and/or
compactor characteristics (if the number of chan-
nels is reduced) and shift speed are changed. As the
power savings of EC stem from the switching activity
reduction due to the constant-0 shift-in enabled by
the use of multiple compactors, these key features
are retained in a scan architecture with EC.
As a multiplexer driven by a constant-0 on one of
the data inputs simplifies down to an AND gate, the
cost of EC per chain, assuming a simple XOR tree as
the compactor, for instance, is approximately $r - 1$
XOR gates and $r - 1$ AND gates. Based on the area
constraints and targeted power reduction levels, we
can appropriately adjust r, enabling a cost-effective
tradeoff between area and power; larger values for r
deliver larger savings in scan-out power yet at the
expense of higher area cost.
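To make the area tradeoff concrete, the sketch below tallies the added gates for a single-output XOR-tree compactor. The breakdown is an assumption rather than something the text spells out: each of the r − 1 extra compactors is taken to be a tree of c − 1 two-input XOR gates, the reference chain keeps a true multiplexer at each region boundary, and each shadow chain uses the AND-gate simplification. The function name `ec_gate_count` is hypothetical.

```python
def ec_gate_count(regions, chains):
    """Approximate added gates for EC with a single-output XOR tree (assumed breakdown)."""
    extra_compactors = regions - 1
    return {
        # A c-input XOR tree built from 2-input gates needs c - 1 of them.
        "xor2": extra_compactors * (chains - 1),
        # The reference chain selects between compacted data and normal shift data.
        "mux2": extra_compactors,
        # Shadow chains: a multiplexer with a constant-0 input reduces to an AND gate.
        "and2": extra_compactors * (chains - 1),
    }


print(ec_gate_count(3, 8))  # {'xor2': 14, 'mux2': 2, 'and2': 14}
```

With 3 regions and 8 chains this assumed breakdown gives 14 XORs, 2 multiplexers, and 14 AND gates, the same figures quoted for the 3-region EC configuration in the experimental results.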
Expected power reductions
In our expected power saving analysis, we will
refer to the basic scan architecture with a response
compactor as the base case. We pursue a simplified
power model wherein the number of transitions in
scan cells defines the power value, as the two
strongly correlate [11]; we validate the accuracy of
this model in the Experimental Results Section.
$P_s$ ($P_r$) denotes the expected number of transitions
induced by only stimulus (response) transitions in a
fragment of $l$ scan cells over a shift period of $l$ cycles;
$P_s = t_s \cdot l^2$ and $P_r = t_r \cdot l^2$, where $t_s$ and $t_r$ denote the
transition probability between consecutive stimulus
bits and consecutive response bits, respectively.
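The transition-count model can be illustrated by directly counting cell toggles while a bit vector is shifted through a fragment. This is a minimal sketch with hypothetical helper names; it counts toggles exactly for one concrete vector, whereas $P_s$ and $P_r$ above are expected values.

```python
def count_shift_toggles(initial, incoming):
    """Count scan-cell toggles while `incoming` is shifted serially into a fragment.

    initial[j] is the starting value of cell j (cell 0 is nearest the input);
    incoming[0] is the first bit to enter.
    """
    cells = list(initial)
    toggles = 0
    for bit in incoming:
        shifted = [bit] + cells[:-1]
        toggles += sum(a != b for a, b in zip(cells, shifted))
        cells = shifted
    return toggles


def transition_probability(bits):
    """Fraction of adjacent bit pairs that differ (t_s or t_r in the text)."""
    pairs = list(zip(bits, bits[1:]))
    return sum(a != b for a, b in pairs) / len(pairs)


# Alternating data has transition probability 1, so P = t * l^2 = 16 for l = 4.
print(transition_probability([1, 0, 1, 0]))                # 1.0
print(count_shift_toggles([1, 0, 1, 0], [0, 1, 0, 1]))     # 16: data passing through
print(count_shift_toggles([0, 0, 0, 0], [1, 0, 1, 0]))     # 10: shift-in over constant-0
print(count_shift_toggles([0, 0, 0, 0], [0, 1, 0, 1]))     # 6:  same, opposite phase
print(count_shift_toggles([0, 0, 0, 0], [0, 0, 0, 0]))     # 0:  constant-0 shift
```

The two shift-in cases average to 8, that is, half of the 16 toggles seen when the same data passes through an already filled fragment, which is the kind of P/2 difference the analysis attributes to replacing a data fragment with constant-0's.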
In Table 1, we present the expected power
dissipation levels for different scenarios for an l-bit
scan chain fragment, which vary in the bit vector
that the fragment receives serially and the bit vector
that the fragment initially contains. We express the
scan-in and scan-out power components separately.
Replacing a stimulus (response) fragment shift-in
with a constant-0 shift-in, and replacing a stimulus
(response) fragment shift-out with a constant-0
shift-out yields a power saving of $P_s/2$ ($P_r/2$) each.

Table 1 Power dissipation scenarios.

Figure 2 provides the power dissipation savings of
the EC technique with respect to traditional scan in
every scan chain fragment during different intervals
of shift operations. From this figure, we observe
savings in the scan-out power component, while the
scan-in power component remains intact. We can
express the expected power dissipation level for the
traditional (base) and EC architecture with c chains,
each with r regions ($r = 4$ in the example), as
$$P_{base} = \frac{P_s \cdot r^2 \cdot c}{2} + \frac{P_r \cdot r^2 \cdot c}{2} \tag{1}$$

$$P_{EC} = \frac{P_s \cdot r^2 \cdot c}{2} + \frac{P_r \cdot r \cdot (r + c - 1)}{2}. \tag{2}$$
We can see that EC attains a reduction factor of
$(r \cdot c)/(r + c - 1)$ in scan-out power only; for our
example above ($r = 4$ and $c = 4$), this reduction
factor is $16/7 \approx 2.3\times$. Apparently, the larger values of
c and r deliver higher savings in scan-out power.
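Equations (1) and (2) are easy to evaluate numerically. The following sketch uses exact rational arithmetic; the function names are hypothetical, and the values chosen for $P_s$ and $P_r$ are arbitrary inputs for illustration, not measured data.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def base_power(p_s, p_r, r, c):
    """Equation (1): expected transitions for traditional scan."""
    return HALF * (p_s * r**2 * c + p_r * r**2 * c)


def ec_power(p_s, p_r, r, c):
    """Equation (2): expected transitions with r-region EC."""
    return HALF * (p_s * r**2 * c + p_r * r * (r + c - 1))


# Scan-out component alone (P_s = 0): the ratio equals r*c / (r + c - 1).
print(base_power(0, 1, 4, 4) / ec_power(0, 1, 4, 4))   # 16/7

# With equal stimulus and response activity the overall gain is smaller.
print(base_power(1, 1, 4, 4), ec_power(1, 1, 4, 4))    # 64 46
```

The second call shows why the overall benefit depends on the balance between $P_s$ and $P_r$: EC leaves the scan-in term untouched, so the overall ratio (64/46 here) is lower than the scan-out-only factor of 16/7.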
Deferred-broadcast (DB) [8] and DB+EC architectures
We illustrate DB with a single-input fanout-based
decompressor, which is how [8] originally defined
this scheme, while we later on discuss the extension
of DB for other basic combinational decompressors,
an aspect missing in [8]. For brevity purposes, we
illustrate the two orthogonal techniques DB and EC
together, which we refer to as DB+EC.
Figure 3 provides the DB+EC architecture for a
single scan-in channel fanning out to four scan
chains. Also, in this example, the DB technique
decomposes every scan chain into four blocks.
Simultaneous to the expedited compact operations,
in the first three quadrants of the shift cycles, only
the reference chain receives the broadcast stimulus,
filling in the first three blocks of the reference chain,
while simultaneously the shadow chains receive
constant-0’s. In the last (fourth) quadrant of the shift
cycles, the deferred broadcast operation takes
place; the $R_i$ and $S_{ij}$ blocks receive the broadcast
stimulus in $R_{i-1}$, while the scan-in channel broadcasts
stimulus into the $R_1$ and $S_{1j}$ blocks. By the end of
the last quadrant of shift cycles, all the chains will
have received the intended broadcast stimulus.
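The deferred-broadcast mechanism can be modeled in the same behavioral style. The sketch below is an illustration under assumptions not stated in the text: chains start out holding all 0's instead of captured responses, chain 0 is the reference chain, the chain length is divisible by the number of blocks, and the name `shift_in_with_db` is hypothetical. Setting `blocks=1` models a plain broadcast.

```python
def shift_in_with_db(stimulus, n_chains, blocks=1):
    """Broadcast `stimulus` (first bit first) into n_chains chains using DB.

    Returns (final chain contents, total cell toggles).
    """
    n_cells = len(stimulus)
    if n_cells % blocks:
        raise ValueError("this sketch assumes equal-length blocks")
    frag = n_cells // blocks
    state = [[0] * n_cells for _ in range(n_chains)]
    toggles = 0
    for cycle, bit in enumerate(stimulus):
        # The deferred broadcast happens only in the last 1/b of the shift cycles.
        deferred = cycle >= n_cells - frag
        reference = state[0]
        next_state = []
        for i, chain in enumerate(state):
            is_shadow = i != 0
            # Shadow chains are fed constant-0 until the deferred broadcast starts.
            new_chain = [bit if (deferred or not is_shadow) else 0]
            for j in range(1, n_cells):
                if deferred and is_shadow and j % frag == 0:
                    # Block boundary: copy from the preceding reference block.
                    new_chain.append(reference[j - 1])
                else:
                    new_chain.append(chain[j - 1])
            toggles += sum(a != b for a, b in zip(chain, new_chain))
            next_state.append(new_chain)
        state = next_state
    return state, toggles


stim = [1, 0] * 8
plain_state, plain_toggles = shift_in_with_db(stim, n_chains=4, blocks=1)
db_state, db_toggles = shift_in_with_db(stim, n_chains=4, blocks=4)
print(db_state == plain_state)     # True: every chain ends up with the same stimulus
print(db_toggles < plain_toggles)  # True: fewer scan-in toggles with DB
```

In this model all chains end up holding the intended stimulus whether or not DB is used, while the shadow chains stay silent until the final block's worth of shift cycles.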
Power reduction in the DB architecture (with no
EC) stems solely from the constant-0 stimuli that we
pump into the shadow chains, delivering scan-in
power reductions. As the DB scheme shifts out the
responses intact, however, scan-out power remains
the same. A similar analysis to the one in the previous
section can show that DB attains a reduction
factor of $(b \cdot c)/(b + c - 1)$ in scan-in power, where b
and c denote the number of blocks and chains,
respectively, as

$$P_{DB} = \frac{P_s \cdot b \cdot (b + c - 1)}{2} + \frac{P_r \cdot b^2 \cdot c}{2}. \tag{3}$$
The cost of DB per scan chain is approximately 1
AND gate and $b - 1$ multiplexers.

Figure 2. Power dissipation savings of 4-region EC with respect to traditional scan.

DB with b blocks together with EC with r regions
results in a reduction factor of $(b \cdot c)/(b + c - 1)$ in
scan-in power and a reduction factor of
$(r \cdot c)/(r + c - 1)$ in scan-out power; in this DB+EC
architecture, b and r can have distinct values. The
expected power dissipation level is

$$P_{DB+EC} = \frac{P_s \cdot b \cdot (b + c - 1)}{2} + \frac{P_r \cdot r \cdot (r + c - 1)}{2}. \tag{4}$$
For very large values of b (r) and c, the overall
reduction ratio of the DB architecture approaches
$1 + (P_s/P_r)$ while EC delivers an overall power
reduction ratio of $1 + (P_r/P_s)$. When $P_s$ and $P_r$ are
comparable, both reduction ratios asymptotically
approach 2x. The typical expectation, however, is
that $P_r$ is much larger than $P_s$, as proper x-fill
techniques enable reductions in $P_s$ while no such
direct control exists over $P_r$. In such cases, the
reduction by the DB architecture barely exceeds 1x,
while EC can deliver very high reduction ratios with
large values of b (r) and c.
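The two asymptotic ratios follow from dividing equation (1) by equations (3) and (2), respectively. The derivation below is a sketch that assumes the base case is written with the same number of fragments as the architecture it is compared against (b for DB, r for EC), and then lets both that number and c grow without bound:

```latex
\begin{align*}
\frac{P_{base}}{P_{DB}}
  &= \frac{(P_s + P_r)\, b^2 c}{P_s\, b\,(b + c - 1) + P_r\, b^2 c}
   = \frac{P_s + P_r}{P_s \left(\frac{1}{b} + \frac{1}{c} - \frac{1}{bc}\right) + P_r}
   \;\longrightarrow\; 1 + \frac{P_s}{P_r}, \\[4pt]
\frac{P_{base}}{P_{EC}}
  &= \frac{(P_s + P_r)\, r^2 c}{P_s\, r^2 c + P_r\, r\,(r + c - 1)}
   = \frac{P_s + P_r}{P_s + P_r \left(\frac{1}{r} + \frac{1}{c} - \frac{1}{rc}\right)}
   \;\longrightarrow\; 1 + \frac{P_r}{P_s}.
\end{align*}
```

As a purely illustrative case, if $P_r = 10\,P_s$, the DB limit is $1.1\times$ while the EC limit is $11\times$.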
Application domain and extensions
The proposed EC technique is applied with a given
type of compactor chosen by the DfT designer. The
conventional response compaction applies the same
compaction operation by a single hardware unit
sequentially on numerous regions, as the data of
these regions pass by; the proposed EC technique
applies the same operation concurrently on each
region by multiple of the same hardware units
operating in parallel. In EC, the compacted responses
that have been collected in the reference chains
bypass all the compactors on the way to the output
channels, producing the same compacted response
with respect to the conventional case; aliasing,
masking, fault/defect coverage and diagnostic prop-
erties of the given scan architecture are perfectly
retained. Furthermore, the patterns are applied in an
identical manner; pattern count, test time and data
volume are also retained. We also note that the
proposed technique copes with the more challenging
problem of reducing power in the compression mode.
In the case of multiple compression/compaction
modes [12], the same reconfigurable/dual compactor
needs to be repeated to enable the EC operations;
furthermore, the multiplexing logic that enables
multiple compression modes can be reused to lower
the cost of EC. Power dissipation in the serial top-up
mode can always be lowered by properly filling the
don’t cares, which constitute the majority of the bits of
uncompressed patterns.

Figure 3. DB + EC: 4 blocks, 2 regions.
Uneven scan chain lengths
The proposed EC architecture can accommodate
uneven scan chain lengths. As we utilize the
reference chain fragments as buffers for the com-
pacted responses, one constraint is that the refer-
ence chain fragments in a region should be longer
than or equal to the longest chain fragment in the
neighboring region to its left. We can ensure this by
inserting the EC logic in such a way that all fragments
in all regions except for the leftmost one are identical
in length, which are longer than or equal to the
longest chain fragment in the leftmost region. Similar
constraints apply to the DB architecture.
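The buffer constraint can be expressed as a small planning and checking routine. This is a hypothetical sketch, not a procedure given in the text: it assumes every non-leftmost region uses one common fragment length, chosen as the smallest value that satisfies the constraint, and it treats chain 0 as the reference chain.

```python
def plan_fragments(chain_lengths, regions):
    """Split each chain into `regions` fragments, leftmost region first.

    All non-leftmost fragments share one length; the leftmost fragment of each
    chain absorbs whatever cells remain.
    """
    # Smallest common length that is >= the longest leftmost fragment.
    frag = (max(chain_lengths) + regions - 1) // regions
    plan = []
    for length in chain_lengths:
        leftmost = length - (regions - 1) * frag
        if leftmost < 0:
            raise ValueError("chain too short for this number of regions")
        plan.append([leftmost] + [frag] * (regions - 1))
    return plan


def reference_fragments_fit(plan, reference=0):
    """Check that each reference fragment can buffer the region to its left."""
    for k in range(1, len(plan[0])):
        longest_left = max(chain[k - 1] for chain in plan)
        if plan[reference][k] < longest_left:
            return False
    return True


plan = plan_fragments([10, 12, 11], regions=3)
print(plan)                           # [[2, 4, 4], [4, 4, 4], [3, 4, 4]]
print(reference_fragments_fit(plan))  # True
```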
EC with clock gating
As we mentioned earlier, one important and beneficial
aspect of the proposed EC architecture is its
capability to deliver power savings without resorting
to design-intrusive clock-gating. However, we also
note that power dissipation in clock trees can be
significant. If clock-gating is indeed permissible, EC
can work with clock-gating also. In such an imple-
mentation, we can shut off the clock of the shadow
chains in a region from the time of completion of the
expedited compaction operation (reference chain
has collected the compressed responses and shad-
ow chains have received constant-0’s) until the
chains receive their stimulus. During this period,
dynamic power dissipation in the corresponding
clock trees disappears. In the DB+EC architecture
with clock gating, we can extend the shut off of the
clock of the shadow chains until the beginning of
the deferred broadcast operation, providing a wider
window where we can further reduce the power
dissipation in the clock trees.
Response unknowns
Every response compactor bears a particular unknown
(x) mitigation characteristic. An x-clean design
can benefit from the use of a MISR given that a
serial scan mode can be enabled for diagnostics.
The proposed EC architecture can also accommo-
date MISRs by inserting multiple copies of the MISR
in between the regions in order to expedite the
response compression, yet without the need for any
reference chains (buffers). The scan-out power re-
duction ratio is improved to r, the number of regions.
The presence of unknown x’s in the design
necessitates the use of masking in conjunction with
the MISR. This is a challenge for the proposed EC
scheme; as the expedited compaction operations
should finish by the end of the first r-th of the shift
cycles, so should the load of the entire mask data. In
an effort to retain the number of mask channels
intact, an r-bit buffer can help distribute r bits of
mask data in every cycle to r MISRs, necessitating
that the mask channels be operated r times faster.
Another approach to mitigate x’s while retaining
some diagnostic capabilities is the use of multi-
output XOR compactors [13], rather than simple
single-output XOR tree. Implementing EC in this case
necessitates the use of multiple reference chains: as
many reference chains (n) as the number of scan-out
channels. For a 4 × 2 XOR-based compactor, for
instance, the example in Figure 1 can be slightly
modified by having the top two chains as the
reference chains, and having constant-0 shift operations
in the bottom two chains only. Apparently, the
power reduction benefit will be reduced compared
to the single-output XOR tree; the scan-out power
reduction factor becomes $(r \cdot c)/(n \cdot r + c - n)$ for a
c by n response compactor (n scan-out channels).
This general formula can be used to derive the
power reduction factor for any case; for a single-output
XOR tree ($n = 1$), the scan-out power
reduction ratio is $(r \cdot c)/(1 \cdot r + c - 1) = (r \cdot c)/(r + c - 1)$,
while for a MISR ($n = 0$), this ratio
degenerates to r.
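The general formula is simple enough to tabulate for the compactor types discussed here. The sketch below uses hypothetical function names and evaluates the scan-out component only, so the percentages are not directly comparable with the overall power figures reported in the experimental results.

```python
from fractions import Fraction


def scan_out_reduction(regions, chains, outputs=1):
    """Scan-out reduction factor (r*c)/(n*r + c - n); outputs=0 models a MISR."""
    return Fraction(regions * chains, outputs * regions + chains - outputs)


def scan_out_saving_percent(regions, chains, outputs=1):
    """Percentage of scan-out transitions removed, i.e. 100 * (1 - 1/factor)."""
    return float(100 * (1 - 1 / scan_out_reduction(regions, chains, outputs)))


print(scan_out_reduction(4, 4))             # 16/7 (single-output XOR tree)
print(scan_out_reduction(4, 4, outputs=0))  # 4    (MISR: the factor equals r)
print(scan_out_reduction(2, 4, outputs=2))  # 4/3  (2 regions with a 4 x 2 compactor)
print(scan_out_saving_percent(4, 4))        # 56.25
```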
Extensions for DB
Similar extensions can be foreseen for the DB
architecture as well, although [8] presented the
original idea for a particular decompressor, namely,
a single-input broadcast (fanout) decompressor. A
generalized ‘‘deferred decompress’’ scheme can
save scan-in power with other basic types of com-
binational decompressors, such as multi-input fan-
out decompressors or combinational XOR-based
decompressors; such a scheme can use b copies
of an n by c decompressor (n scan-in channels)
along with n designated reference chains. The end
result would be a scan-in power reduction factor of
$(b \cdot c)/(n \cdot b + c - n)$.
Experimental results
We have computed the power reduction results
of DB by assuming an Illinois architecture, and the
results of EC by assuming various compactors (XOR-based
and MISR). We utilize a few ISCAS89 benchmark
circuits (test data generated with ATLANTA
ATPG tool) and the industrial test data that we
obtained from Cadence, which consists of 100 fully
specified (x’s remaining post-compression 0-filled)
stuck-at patterns and their responses for three indus-
trial designs.
Table 2 provides the average power reduction
comparisons where the underlying scan architec-
ture is assumed to be a single scan-in channel feed-
ing eight scan chains that drive a single-output XOR
tree. The proposed scheme also assumes 0-filling of
don’t cares that remain post-compression in the
stuck-at patterns. All the techniques deliver perfect
stuck-at fault coverage levels of 89.9%, 99.5%, and
95.9%, respectively. Columns 2 and 3 compare the
scan power reductions by the proposed scheme
with respect to the scan cell switching model [11]
and a more elaborate timing-based model (via run-
ning ModelSim and creating a VCD file that captures
all the switching activity in the circuit); the results
closely correlate, validating the accuracy of the
simple scan cell switching model. The proposed
3-region EC approach delivers 40–50% scan power
reductions at the expense of 14 XORs, 2 multiplexers,
and 14 AND gates (0.17% area cost). The DB ap-
proach [8] delivers 10–15% scan power reduction,
while the x-fill approach provides around 30%
power savings at no area cost; a potential disadvan-
tage of the x-fill techniques is the degradation in
defect coverage and/or pattern count inflation. The
scan-out gating approach may possibly incur timing
penalties in addition to 0.1–0.3% area cost; the
approach in [15] is applied at the RT-level to prevent
timing penalties. Most importantly, all the four
schemes compared in this table are orthogonal tech-
niques and can be applied in conjunction to minimize
scan power.
Table 3 provides the average power reduction
results of the proposed EC technique that we ap-
plied on the test data of three industrial designs. For
the largest circuit C, for instance, DB delivers almost
no reduction, while the full-capacity 12-region EC
delivers a reduction around 90%. On the other
extremal point, the proposed EC delivers 35–50%
reductions in scan power for these designs with only
a single replication of the compactor (2 regions)
cost-effectively. In between these two extremal
points, the cost-effective 3-region EC delivers 45–65%
reductions; for the largest design (C), 3-region
EC delivers an overall scan power reduction of 63%
for a single-output compactor, and 54% for a five-output
compactor, mimicking the end-result of designer’s
choices in enhancing x-mitigation capabilities. We
can obtain higher levels of reductions in the case of
a MISR due to the absence of a reference chain that
collects the compacted responses.
Table 2 Average scan power reduction comparisons.
Table 3 Average scan power reductions (%) for industrial circuits.
As only the test data was available to us, we can
gauge the area cost of DB and EC architectures with
respect to the scan overhead (the area cost due to
scan multiplexers). Per-chain cost of DB with 12
blocks is 11 MUXes and 1 AND gate, while per-chain
cost of EC (with XOR tree as the compactor) with 2, 3,
and 12 regions is 1 XOR + 1 AND gate, 2 XOR + 2 AND
gates, and 11 XOR + 11 AND gates, respectively. For
Design C that has 61 K registers, for instance, as each
scan chain has more than 2 K scan cells, the per-
chain scan overhead is more than 2000 MUXes. The
area costs of DB, EC, and DB+EC correspond to a
small fraction of scan overhead. We can therefore
project the cost of DB, EC and DB+EC architec-
tures to be less than 0.1% of the die area for even
larger industrial designs.
We also provide a switching activity plot for a
duration that spans a little more than the shift and
capture operations of three test patterns for various
EC architectures in Figure 4. All six plots (corre-
sponding to EC with varying number of regions)
present a similar behavior; peak switching activity
occurs during the capture operations where roughly
half of the 61 K flip-flops toggle, and this activity
decays as shift operations proceed. The underlying
reason for this behavior is that the responses embed
more transitions compared to the stimuli; as more
stimuli enter the scan chains and as the responses
exit the system, switching activity reduces. EC
architectures with a larger number of regions deliver
a quicker silencing of the switching activity.
IN THIS PAPER, we propose a DfT-based solution
that can reduce average test power significantly in a
cost-effective manner without resorting to any x-
filling techniques. The proposed solution is simple,
scalable, and retains test data and quality intact, as
observed responses are the same with or without EC.
Furthermore, EC is non-intrusive for design flow, as it
does not require clock gating for power savings. The
proposed EC architecture advances the response
compaction operations, ensuring that only the
reference chain holds the compacted response
during the majority of shift cycles, thus enabling a
constant-0 feed into all the other chains. The
proposed EC architecture also offers a power-area
co-optimization for designs with a very tight area
budget. It can still deliver significant reductions in
test power at reduced area costs. For industrial test
cases we have experimented with, we observe 70–
90% reductions in test power, boding well for even
larger-sized circuits.
Figure 4. Switching activity (y-axis: number of toggles) vs. time (x-axis: the cycle number) plot for EC on Design C: plots from top to bottom correspond to EC with 1, 2, 3, 4, 6, and 12 regions, respectively.
References
[1] S. M. Saeed and O. Sinanoglu, ‘‘Expedited response
compaction for scan power reduction,’’ in Proc. VLSI
Test Symp., 2011, pp. 40–45.
[2] P. Girard, ‘‘Survey of low-power testing of VLSI
circuits,’’ IEEE Design Test, vol. 19, no. 3, pp. 82–92,
2002.
[3] J. Saxena, K. M. Butler, V. B. Jayaram, S. Kundu,
N. V. Arvind, P. Sreeprakash, and M. Hachinger,
‘‘A case study of IR-drop in structured at-speed
testing,’’ in Proc. Int. Test Conf., 2003, pp. 1098–1104.
[4] S. Ravi, ‘‘Power-aware test: Challenges and solutions,’’
in Proc. IEEE Int. Test Conf., 2007, pp. 1–10.
[5] S. Ravi, R. Parekhji, and J. Saxena, ‘‘Low power test for
nanometer system-on-chips (socs),’’ J. Low Power
Electron., vol. 4, pp. 81–100, 2008.
[6] C. P. Ravikumar, M. Hirech, and X. Wen, ‘‘Test
strategies for low-power devices,’’ J. Low Power
Electron., vol. 4, pp. 127–138, 2008.
[7] D. Czysz, M. Kassab, X. Lin, G. Mrugalski, J. Rajski,
and J. Tyszer, ‘‘Low-power scan operation in test
compression environment,’’ IEEE Trans.
Computer-Aided Design Integr. Circuits, vol. 28, no. 11,
pp. 1742–1755, 2009.
[8] A. Chandra, F. Ng, and R. Kapur, ‘‘Low power Illinois
scan architecture for simultaneous power and test
data volume reduction,’’ in Proc. Design, Automation
and Test in Europe Conf., 2008, pp. 462–467.
[9] L. Whetsel, ‘‘Adapting scan architectures for low
power operation,’’ in Proc. Int. Test Conf., 2000,
pp. 863–872.
[10] Z. Qi, H. Liu, X. Li, D. Wang, Y. Han, H. Li, and W. Hu,
‘‘A scalable scan architecture for godson-3 multicore
microprocessor,’’ in ATS, 2009, pp. 219–224.
[11] R. Sankaralingam, N. A. Touba, and B. Pouya,
‘‘Reducing power dissipation during test using scan
chain disable,’’ in Proc. VLSI Test Symp., 2001,
pp. 319–324.
[12] A. Chandra, Y. Haihua, and R. Kapur, ‘‘Multimode
illinois scan architecture for test application time and
test data volume reduction,’’ in Proc. VLSI Test Symp.,
2007, pp. 84–92.
[13] S. Mitra and K. S. Kim, ‘‘X-compact: An efficient
response compaction technique for test cost
reduction,’’ in Proc. IEEE Int. Test Conf., 2002,
pp. 311–320.
[14] X. Liu and Q. Xu, ‘‘On simultaneous shift- and
capture-power reduction in linear decompressor-based
test compression environment,’’ in Proc. IEEE Int. Test
Conf., 2009, pp. 9.3.
[15] E. Alpaslan, Y. Huang, X. Lin, W.-T. Cheng, and
J. Dworak, ‘‘On reducing scan shift activity at RTL,’’
IEEE Trans. Computer-Aided Design Integr. Circuits,
vol. 29, no. 7, pp. 1110–1120, 2010.
Samah Mohamed Ahmed Saeed has BS and MS degrees from the Computer Science Department of Kuwait University and graduated at the top of her classes in 2008 and 2010, respectively. She worked as a teaching assistant in the Department of Information Science, College of Women, Kuwait University and as a Research Assistant in the Computer Engineering Department, College of Engineering and Petroleum, Kuwait University while working towards her degrees. Upon receiving her MS degree in 2010, she worked as an instructor in the Department of Information Technology and Computing, Arab Open University, Kuwait. Since fall 2011, she has been a PhD student in the Computer Science Department of NYU-Poly. Her primary field of research is computer-aided design and reliability of VLSI circuits, specifically design-for-testability. She published five papers in prestigious VLSI test conferences and received acknowledgement for her contribution in the implementation and experimentation in two other conference papers.
Ozgur Sinanoglu obtained his PhD in computer science and engineering from the University of California, San Diego, in 2004. He worked for two years at Qualcomm in San Diego as a Senior Design-for-Testability engineer, primarily responsible for developing cost-effective test solutions for low-power SOCs. After a four-year academic experience at Kuwait University, he joined, in fall 2010, New York University in Abu Dhabi. His primary field of research is the reliability and security of integrated circuits, mostly focusing on design-for-testability and design-for-trust. He has more than 100 conference and journal papers, three patents issued, and several patents pending. He is the recipient of the Best Paper Award of VLSI Test Symposium 2011.
Direct questions and comments about this article to Ozgur Sinanoglu, Computer Engineering Department, New York University-Abu Dhabi.