test architecture for next-generation nand flash - · pdf file1 test architecture for...

11
1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND Flash By Ken Hanh Lai, Advantest America, Inc., product marketing manager Introduction Over the next four years, market research firm, IC Insights, forecasts the NAND Flash market to have the third-highest revenue-growth rate among all semiconductor segments as well as stronger bit growth than DRAM. Mobile phones and tablets are driving eMMC volume, pushing faster speeds and transitioning into UFS and other higher speed interfaces. Other applications such as Ultrabooks and enterprise storage solutions are spurring SSD growth, demanding even greater quality and higher speeds for both ONFi and Toggle NAND interfaces. To keep up with burgeoning market demands, NAND technology is rapidly evolving – in fact, intrinsically shifting. All major NAND manufacturers have begun sub-20nm process migration, further shrinking the basic 2-D NAND memory cell design. Also, more bits are being squeezed into each Flash cell as manufacturers increase from 2-bits-per-cell to 3-bits-per-cell. Recently, Samsung Electronics announced that it is now fabricating 3-bits-per-cell NAND Flash memory devices with 10nm class process technology. Many industry experts believe that 2-D NAND scaling would eventually reach its limit at 10nm process technology. As a result, the industry is adopting novel 3-D NAND lithography processes such as Samsung TCAT, Toshiba BiCS, SK Hynix DC-SF and Macronix BE-SONOS which drastically change the basic memory cell design. These advances in NAND Flash technology have raised quality and reliability challenges for NAND manufacturers, renewing the need for increased test coverage without hiking up the cost of test and dampening market growth. To deliver the needed performance economically, NAND manufacturers need a test solution with architecture capable of increasing yield and throughput. This paper examines NAND Flash test requirements at the package level and shows how the right ATE architecture can meet them most efficiently. NAND Flash Needs a Dedicated Test Solution As with other commodity products, NAND Flash per-bit ASP has been declining dramatically. According to Gartner Dataquest, ASP per Gigabyte of NAND Flash declined from $7,870 in 1997 to just $0.25 in 2012. With the recent industry consolidation, per-bit ASP has stabilized somewhat. Still, Gartner 2Q13 forecast shows a CAGR of -18.7% during 2012-2017. As a result, NAND Flash manufacturers are under constant pressure to reduce cost. One of the ways to reduce cost is to minimize the cost of test. This places the burden on IC manufacturers to utilize the most economical NAND test solution that delivers high throughput and maximizes yield. A dedicated tester is necessary because NAND Flash requires different test capabilities that are not commonly found on other memory testers. Most importantly, the tester needs to be optimized for NAND Flash testing and should not contain extraneous capabilities which would increase cost. Table

Upload: nguyennhi

Post on 05-Feb-2018

233 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

1

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1

Test Architecture for Next-Generation NAND Flash

By Ken Hanh Lai, Advantest America, Inc., product marketing manager

Introduction

Over the next four years, market research firm, IC Insights, forecasts the NAND Flash market to have the third-highest revenue-growth rate among all semiconductor segments as well as stronger bit growth than DRAM. Mobile phones and tablets are driving eMMC volume, pushing faster speeds and transitioning into UFS and other higher speed interfaces. Other applications such as Ultrabooks and enterprise storage solutions are spurring SSD growth, demanding even greater quality and higher speeds for both ONFi and Toggle NAND interfaces.

To keep up with burgeoning market demands, NAND technology is rapidly evolving – in fact, intrinsically shifting. All major NAND manufacturers have begun sub-20nm process migration, further shrinking the basic 2-D NAND memory cell design. Also, more bits are being squeezed into each Flash cell as manufacturers increase from 2-bits-per-cell to 3-bits-per-cell. Recently, Samsung Electronics announced that it is now fabricating 3-bits-per-cell NAND Flash memory devices with 10nm class process technology. Many industry experts believe that 2-D NAND scaling would eventually reach its limit at 10nm process technology. As a result, the industry is adopting novel 3-D NAND lithography processes such as Samsung TCAT, Toshiba BiCS, SK Hynix DC-SF and Macronix BE-SONOS which drastically change the basic memory cell design.

These advances in NAND Flash technology have raised quality and reliability challenges for NAND manufacturers, renewing the need for increased test coverage without hiking up the cost of test and dampening market growth. To deliver the needed performance economically, NAND manufacturers need a test solution with architecture capable of increasing yield and throughput. This paper examines NAND Flash test requirements at the package level and shows how the right ATE architecture can meet them most efficiently.

NAND Flash Needs a Dedicated Test Solution

As with other commodity products, NAND Flash per-bit ASP has been declining dramatically. According to Gartner Dataquest, ASP per Gigabyte of NAND Flash declined from $7,870 in 1997 to just $0.25 in 2012. With the recent industry consolidation, per-bit ASP has stabilized somewhat. Still, Gartner 2Q13 forecast shows a CAGR of -18.7% during 2012-2017. As a result, NAND Flash manufacturers are under constant pressure to reduce cost. One of the ways to reduce cost is to minimize the cost of test. This places the burden on IC manufacturers to utilize the most economical NAND test solution that delivers high throughput and maximizes yield.

A dedicated tester is necessary because NAND Flash requires different test capabilities that are not commonly found on other memory testers. Most importantly, the tester needs to be optimized for NAND Flash testing and should not contain extraneous capabilities which would increase cost. Table

Page 2: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

2

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 2

1 summarizes a list of NAND Flash test functions and benefits. Some of these functions will be covered in more detail in this paper.

ID NAND Flash Test Functions Benefits

1 Tester-per-site architecture —architecture in which there are many test sites in a system, each with its own independent test resources, such as: Test processor, algorithmic pattern generator, parametric measurement units, buffer memory, fail memory, and others.

Enables fastest test time for NAND Flash because each test site only handles a small number of DUTs, independent of other test sites in the system. Although more test electronics increases cost, this test architecture still lowers the overall cost of test by delivering much faster test time.

2 Site-chain support—to enable two or more test sites to be combined into a single test site with higher pin count.

Enables flexibility to test various NAND Flash interfaces, from low-pin-count devices such as eMMC to high-pin-count devices such as ONFi/Toggle NAND with one, two, or more ports.

3 AC performance—accuracy and speed for next generation ONFi/Toggle NAND and managed NAND such as eMMC. A tester optimized for NAND Flash needs to support at-speed testing up to 800Mbps + 10% margin with overall timing accuracy of < 200ps to maximize yield.

Enables at-speed testing to guarantee quality while maximizing yield with high accuracy at low cost. Higher AC performance directly results in higher tester cost; therefore, it’s necessary to optimize AC performance of the tester specifically for NAND Flash testing to minimize cost.

4 Real-time source synchronous function--capable of automatic, cycle-by-cycle adjustments to compensate for timing drift and jitter of critical timing parameters such as tDQSRE and tDQSQ.

Enables maximum yield and faster test time, especially when test speeds higher than 400Mbps.

5 Scalable, high-current PPS architecture—allows for flexible & high PPS pin-count per DUT.

Enables faster test time through concurrent, multi-die operation, such as NAND Flash array program and erase.

6 Flexible ECC analysis function—capable of on-the-fly analysis and ECC grading.

Enables maximum yield, flexibility and faster test time.

7 Bad-block management—allows for easy management of bad blocks in NAND devices, including the ability to mask bad blocks from being tested.

Enables faster test time and simpler test program generation for managing bad blocks.

8 Fail Memory optimized for NAND—NAND Flash package is already approaching 1 Terabit and increasing. It is not cost effective to have a large Fail Memory in the tester to store fail bitmaps of every DUT. Furthermore, it is even more costly when error logging is required during at-speed tests. Therefore, it’s necessary to have a new Fail Memory design which is capable of providing the necessary functions as well as lowering cost.

Enables flexibility at lower cost.

9 Buffer Memory optimized for NAND—as is the case with Fail Memory, it is not cost effective to have a large Buffer Memory to store custom data pattern to be used for programming and verifying the NAND Flash array, especially when it has to support at-speed tests. Therefore, it’s also necessary to have a new Buffer Memory design which is capable of providing the necessary functions as well as lowering cost.

Enables flexibility at a lower cost.

10 eMMC test support—capable of supporting eMMC protocol and simple multi-DUT support.

Enables flexibility and simpler test program generation for faster TTM.

Table 1 Key test features for NAND Flash

Figure 1 shows NAND market surpassed the DRAM market in revenue for the first time in 2012, and the gap is expected to widen for the foreseeable future. This market growth provides sufficient justification for developing a dedicated NAND Flash test solution capable of handling evolving technologies, increasing test coverage, reducing the cost of test, and lowering ASPs.

Page 3: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

3

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 3

Figure 1 Flash and DRAM market comparison

Tester-per-Site Architecture

Many papers have documented the advantages of a tester-per-site architecture over shared-resource designs for testing NAND Flash. Instead of covering that ground again, this paper will examine some key differences among tester-per-site architectures to determine which are best suited for NAND Flash.

The architecture with the best throughput for NAND Flash is one that is based on a true tester-per-site architecture where one DUT is tested per test site. In this architecture, there are many test sites in a system to support the required parallelism. Each test site has its own independent test resources, such as, test processor, algorithmic pattern generator, parametric measurement units, buffer memory, fail memory and others. However, this tester design is very expensive and may not offer the lowest cost of test overall. Typically, a tester-per-site architecture testing between two and eight DUTs per site presents the optimal balance of tester cost and throughput, resulting in the lowest overall cost of test.

Figure 2 illustrates the point by comparing two types of tester-per-site architectures in terms of test time and cost of test. Tester A can accommodate smaller number of DUTs per test site than Tester B; therefore, Tester A offers lower test time and lower cost of test.

Page 4: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

4

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 4

Figure 2 Comparison of various tester-per-site architectures

Depending on the test flow, normalized test times can differ, as shown by the NAND FT and eMMC FT curves. For a typical tester-per-site architecture, test time increases as a test site handles higher and higher number of DUTs. The rate of test time increase is highest between one to four DUTs per site, but it gradually decreases beyond that. This behavior is quite typical for NAND Flash test flows, mainly due to:

Test time for NAND Flash operation, especially array program, varies per Flash cell per DUT – some will have longer or shorter test times depending on process differences.

Per-DUT, unique data generation is necessary for test items such as trimming of references and marking of bad blocks in the array. When combined with testing multiple DUTs per site, it is typical for a tester to serially test each DUT in the test site resulting in long test times. A more efficient tester design would have the capability to test all DUTs in the test site in parallel with minimum serialization so that the overall test time is reduced.

To reduce tester cost, it is necessary to share some of the test resources in a test site among DUTs. These test resources such as ADC and PMU, required for test items involving measurement of voltages and currents on DUT pins, are typically shared among DUTs per test site. They need to be applied serially per-DUT for such test items resulting in slightly longer test times, but the hardware cost savings would result in an overall lower cost of test.

Page 5: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

5

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 5

Another test resource that is typically shared to save cost is the test processor. It is responsible for executing the test program which sometimes includes intensive data processing for every DUT in the test site. For example, if a tester does not have the hardware to support real-time source synchronous testing or on-the-fly ECC analysis, the test processor has to perform significant post-processing of data for every DUT in the test site resulting in increased test time. However, for a tester with the necessary hardware supports, all the related post-processing of data can be eliminated resulting in shorter test time.

Testing of eMMC devices requires protocol support which is based on a series of command-response exchanges between tester and DUT. The problem is that the response time varies per DUT, from two to 64 clock cycles according to specification. Combined with testing multiple DUTs per test site, which is typical for such low-pin-count devices, the tester must poll all DUTs in the test site for responses in parallel. If one of the DUTs is ready to provide its response before the others, it must wait until all other DUTs in the test site are ready to provide their responses in parallel. Therefore, test time is longer because it is determined by the DUTs with the slowest response times.

Real-time Source-Synchronous Testing

To meet higher data-transfer rate demand by NAND Flash applications, the industry has introduced DDR NAND Flash and managed NAND Flash such as eMMC. These devices are already supporting 200 MT/s and 400 MT/s data-transfer rates, and they will need to support even higher rates in the future. As data-transfer rates increase, source synchronous capability is added to the NAND interface with the addition of bidirectional DQS signal to better control the data interface timing. This is another test requirement which best can be addressed by selecting the right tester architecture to improve yield and throughput.

There are two major DDR NAND interface designs in the market. One design, Toggle Mode, is developed and supported by Toshiba and Samsung. The other, ONFi, is developed by the Open NAND Flash Interface Working Group and is supported by NAND manufacturers including SK Hynix, Micron, Intel and SanDisk. To enable the design of interoperable systems for these two designs, JEDEC and ONFi have developed and published a document, called JESD230 NAND Flash Interface Interoperability Standard, which focuses on NAND Flash packages.

Tables 2 and 3 show that both ONFi and Toggle Mode DDR NAND Flash have added bidirectional data strobes at data-transfer speeds of 133 MT/s and higher. For next-generation devices supporting data-transfer speeds higher than 400 MT/s, source-synchronous is even more critical because timing margins are significantly reduced.

Page 6: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

6

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 6

Feature ONFi 1.0 ONFi 2.x ONFi 3.0 ONFi NG (projection)

Interface SDR NV-DDR NV-DDR2 NV-DDRx?

Maximum transfer speed

50 MT/s 200 MT/s 400 MT/s 533 MT/s ~ 800 MT/s?

Data strobe No Yes Yes Yes?

Table 2 ONFi NAND Flash interface and projected roadmap (Source: JEDEC & independent projection)

Feature Legacy Toggle 1 Toggle 2 Toggle NG (projection)

Interface SDR DDR DDR DDR?

Maximum transfer speed

40 MT/s 133 MT/s 400 MT/s 533 MT/s ~ 800 MT/s?

Data strobe No Yes Yes Yes?

Table 3 Toggle Mode NAND Flash interface and projected roadmap (Source: JEDEC & independent projection)

One of the key challenges in supporting source-synchronous functionality arises during device data output operations. According to ONFi 3.0 specifications shown in Figures 3 and 4, there are two critical timing parameters: tDQSRE and tDQSQ. These timing parameters have very wide ranges and are extremely sensitive to process-voltage-temperature (PVT). Most existing testers do not have real-time source-synchronous support. Therefore, users typically rely on a process called “training” to determine these parameters. Training is a time-consuming operation in which a simplified read pattern is looped while strobe-timing edges are finely adjusted over a wide range until the desired result is found. Because both of these timing parameters are sensitive to PVT, training needs to be done for every DUT whenever voltage conditions are modified in the test flow. As a result, significant test-time overhead can be incurred. Furthermore, device operation can change the device temperature, and together with jitter, they affect the timing parameters during the test which could not be resolved with training. If a tester does not have a source-synchronous function to automatically compensate real-time, then yield loss is possible, especially at data-transfer rates higher than 400 MT/s where data eye is significantly reduced.

Page 7: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

7

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 7

Figure 3 NV-DDR2 Data Output Cycle Timing (Source: JEDEC ONFi 3.0 specification)

Condition Min. Max. Unit

tDQSRE 3 25 ns

tDQSQ ~100 MHz (tRC = 10 ns)

-- 0.8 ns

~133 MHz (tRC = 7.5 ns)

-- 0.6 ns

~166 MHz (tRC = 6 ns)

-- 0.5 ns

~200 MHz (tRC = 5 ns)

-- 0.4 ns

Table 4 tDQSRE and tDQSQ specifications (Source: JEDEC ONFi 3.0 specifications)

For a tester with real-time source-synchronous function similar to Figure 4, it would be able to automatically compensate for the tDQSRE and tDQSQ dispersions across multiple devices on a cycle-by-cycle basis, despite changing PVT. The net result is test execution with guaranteed data eye which

Page 8: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

8

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 8

maximizes yield when testing at data rates higher than 400 MT/s. Furthermore, the tester would be able to eliminate all related training operations significantly reducing test time.

Figure 4 Real-time source synchronous function

Managed NAND Flash – such as eMMC typically used for mobile applications – also is increasing its data-transfer rates to 400 MT/s and possibly higher. As with ONFi and Toggle NAND interfaces, the eMMC interface is moving to DDR and adding source-synchronous capability to better support higher data-transfer rates. The real-time source-synchronous function shown above should be able to support eMMC 5.0 requirement as shown in Table 5.

eMMC4.41 eMMC4.51 eMMC5.0

Clock frequency 0~52 MHz 0~200 MHz 0~200 MHz

Maximum bandwidth 104 MB/s 200 MB/s 400 MB/s

Interface SDR-104 SDR-200 DDR-200

Data strobe No No Yes

Table 5 eMMC interface and projected roadmap (Source: JEDEC & independent projection)

Flexible, On-the-Fly ECC Analysis with Grading Capability

To meet the ever-increasing demand for higher bit growth by the NAND market, manufacturers have advanced to sub-20nm process lithography, raising the number of bits per cell. At 10nm process technology, a Floating-Gate NAND memory cell holds as few as 10 electrons according to the trend. With 3-bits-per-cell enabled, the data stored in such memory cell changes by just adding or removing one to two electrons in the floating gate. Also, they are adopting novel 3-D lithography processes. All of these changes affect the quality and reliability of NAND Flash, which must meet the performance demands for SSDs in Ultrabooks and enterprise storage solutions. As a result, NAND

Page 9: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

9

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 9

manufacturers are more dependent upon increasing ECC capabilities, which can be addressed by the tester architecture.

Most of today’s testers do not offer real-time ECC support. Therefore, users typically are required to perform a post-processing operation to determine ECC solutions for every DUT. This operation is time consuming and results in significant test-time overhead, especially, for test flows that perform many read operations with ECC analysis enabled. Also, it requires significant tester resources including a large error-capture RAM to store fail bitmap of every DUT and high-performance test processors to analyze those fail bitmaps and come up with ECC solutions. The net result is significant increase in cost of test for supporting ECC analysis.

However, it is possible to implement a low-cost ECC analysis function with on-the-fly analysis to achieve lower cost of test. This ECC analysis function would have the following attributes:

On-the-fly ECC analysis: ECC analysis is done on-the-fly across all DUTs, eliminating all related test-time overhead. It also reduces tester cost because it does not require expensive tester resources such as a large error-capture RAM and many high-performance test processors.

Flexibility: NAND manufacturers sell their raw NAND to various customers, each with its own proprietary controller and ECC algorithm supporting different data organization (refer to figure 6). The solution must be flexible to support different ECC algorithms.

Concurrent execution of multiple ECC algorithms: For the same reason as above, it is beneficial to support concurrent execution of multiple ECC algorithms to save test time.

Support of ECC grading: Besides providing pass/fail status, it also is advantageous to have the capability to store fail counts per ECC sector so that the data can be analyzed, graded and binned as necessary. This gives NAND manufacturers the flexibility to sell their devices to customers that could support different thresholds of fail-bits per ECC sector, thus, allowing manufacturers to improve profit by selling devices that would otherwise be rejected or binned with lower densities.

Figure 5 Data/ECC organizations

Page 10: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

10

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 10

High-Current PPS for Higher Throughput

To achieve higher densities and integration needed by mobile applications, NAND manufacturers are stacking as many as 16 dies per package. Testing all the bits in a package with capacity nearing 1 Terabit or higher in the future requires very long test time which directly increases the cost of test. One of the techniques to reduce test time is to perform concurrent testing of all dies in a package, especially, for long test time items such as array program and erase; however, it requires a significant amount of power supply current per DUT. Using a tester with scalable power supply architecture which enables flexible PPS pin-count per DUT can reduce test time as well as cost of test.

As shown in Table 6, typical NAND Flash array program and erase operations consume about 100mA of ICC, assuming that programming or erasing is done sequentially per die. To reduce test time, all dies in the package could be enabled to do these operations concurrently. The ICC current required is directly proportional to the number of dies participating in the operation. Figure 7 shows test time impact for array program/erase operation among testers with different PPS pin-count per DUT—test time saving is significant for testers with high PPS pin-count per DUT especially as more dies are stacked in a package.

Parameter # of die in parallel Max. Unit

Array program/erase current (ICC at VCC = 3.7 V)

1 100 mA

2 200 mA

4 400 mA

8 800 mA

12 1200 mA

16 1600 mA

Table 6 Example of array program and erase current requirement

Page 11: Test Architecture for Next-Generation NAND Flash - · PDF file1 Test Architecture for Next-General NAND Flash June 2013 Advantest Page 1 Test Architecture for Next-Generation NAND

11

Test Architecture for Next-General NAND Flash June 2013 Advantest Page 11

Figure 6 Test time for parallel NAND program/erase operations versus number of PPS per DUT

Many existing testers do not have PPS resource with the required voltage and ICC current to support concurrent array program/erase operation for up to 16 dies per package. In such cases, some testers allow ganging of two or more PPS channels to increase the ICC current. Unfortunately, the PPS pin-count is fixed per system on most existing testers. As a result, either test time must be extended to allow for serialization of the array program and erase operations for all dies in the package, or parallelism must be reduced to allow for more PPS channels per DUT. Either way, it results in lower throughput which increases cost of test.

On the other hand, a tester with a scalable PPS architecture can allow for flexible numbers of PPS channels per DUT enabling a significantly lower test time as well as cost of test.

Conclusion

The NAND Flash market is expected to continue growing for the foreseeable future, driven by expanding use in mobile phones, tablets and SSDs. These applications will require continued increases in density and performance while demanding higher quality and reliability. To keep pace, NAND manufacturers are developing innovative, new NAND Flash technologies, requiring more test coverage while driving up the cost of test. To manage testing costs, NAND manufacturers need a dedicated NAND Flash test solution with a scalable tester-per-site architecture, optimized AC performance, and other NAND-specific features to increase yield and improve throughput.

Ken Hanh Lai is a manager of Product Marketing at Advantest America Inc. with responsibility for memory test systems. Prior to joining Advantest, Ken worked as manager and engineer of Product Marketing and Applications, supporting memory ATEs at several companies, including Verigy, Agilent, Hewlett-Packard and Versatest. Notable is his 22 years of experience in the industry supporting and defining memory test systems for non-volatile memory products, including NAND and NOR Flash.

This article was published in the July 2013 issue of the SEMI (www.semi.org) Global Update e-newsletter.