# Scalable Approach for Power Droop Reduction During Scan-Based Logic BIST

1. RELA RAGHU RAM, 2.DONE SRIDHAR.

 M.TECH (VLSI) FROM RAMACHANDRA COLLEGE OF ENGINEERING, VATLURU, ANDHRA PRADESH 534007.
ASSOCIATE PROFESSOR IN ELECTRONICS AND COMMUNICATION ENGINEERING, RAMACHANDRA COLLEGE OF ENGINEERING, VATLURU, ANDHRA PRADESH 534007.

# Abstract:

The generation of significant power droop (PD) during at-speed test performed by Logic Built-In Self-Test (LBIST) is a serious concern for modern ICs. The PD originated during test may delay signal transitions of the circuit under test (CUT), an effect that may be erroneously recognized as a delay fault, with consequent generation of false test fails and an increase in yield loss. In this paper, we propose a novel scalable approach to reduce the PD during at-speed test of sequential circuits with scan-based LBIST using the launch-on-capture scheme. This is achieved by reducing the activity factor (AF) of the CUT through proper modification of the test vectors generated by the LBIST of sequential ICs. Our scalable solution allows us to reduce PD to a value similar to that occurring during the CUT in-field operation, without increasing the number of test vectors required to achieve a target fault coverage (FC). We present a hardware implementation of our approach that requires limited area overhead. Finally, we show that, compared with recent alternative solutions providing a similar PD reduction, our approach enables a significant reduction in the number of test vectors (by more than 50%), and thus in the test time, to achieve a target FC. The logic size, area, and power consumption of the proposed architecture are analyzed using Xilinx 14.5.

Index Terms— Logic BIST (LBIST), microprocessor, power droop (PD), test.

#### **1. INTRODUCTION**

With current advances in very-large-scale integration (VLSI) technology, the sensitivity of today's chips to deep-submicrometer effects is increasing. Along with technology scaling, the increase in operating frequency and in the functional density of today's digital designs has led to new challenges for designers and test engineers. Furthermore, dynamic power consumption and IR drop due to excessive switching activity are critical challenges. As a result, power reduction techniques have been extensively studied by both industry and academia with respect to both design and test. Scan-based test remains one of the most widely accepted design-for-test techniques because it significantly improves the controllability and the observability of the circuit's internal nodes with an insignificant area and performance overhead. Switching activity during scan-based test is often much higher than that during normal operation. There are multiple reasons for this phenomenon. First, the test vectors applied consecutively are not correlated. Second, nonfunctional states may be traversed during scan test. Furthermore, test compaction and testing multiple cores simultaneously contribute toward high switching activity. In addition, as patterns are shifted into and out of the scan chains, multiple changes of the flip-flop values can propagate into the combinational logic and cause massive amounts of switching.

Scan-based tests might cause excessive circuit switching activity compared to a circuit's normal operation. Higher switching activity causes higher peak supply currents and higher power dissipation. High power dissipation during test can cause many problems, which are generally addressed in terms of average power and peak power. Average power is the total distribution of power over a time period, and is calculated as the ratio of consumed energy to test time. Peak power is the highest power value at any given instant. When peak power is beyond the design limit, a chip cannot be guaranteed to function properly due to additional gate delays caused by the supply voltage drop. The power consumption within one clock cycle may not be large enough to elevate the temperature over the chip's thermal capacity limit. To damage the chip, high power consumption must last for a sufficient number of clock cycles. The test power consumed during scan shifting and capture cycles is referred to as shift power and capture power, respectively. A typical scan chain in industrial designs consists of at least hundreds of scan cells, whereas the capture window only lasts one or a few clock cycles. Clearly, the average power consumption is determined by the shift power. Excessive shift power accumulation may make a good chip fail during test even if the peak capture power is low. Inserting no-operation cycles between the end of scan shifting and the beginning of capture cycles can reduce the chance of rejecting good chips during test.
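The two metrics defined above can be illustrated with a minimal sketch. The per-cycle energy values, the clock period, and the function names are hypothetical, and peak power is approximated here as the highest per-cycle power rather than a true instantaneous value.

```python
def average_power(energy_per_cycle_j, clock_period_s):
    """Average power = total consumed energy / total test time."""
    total_energy = sum(energy_per_cycle_j)
    test_time = len(energy_per_cycle_j) * clock_period_s
    return total_energy / test_time


def peak_power(energy_per_cycle_j, clock_period_s):
    """Peak power, approximated as the highest per-cycle power (assumption)."""
    return max(energy_per_cycle_j) / clock_period_s


# Hypothetical trace: 8 shift cycles at 2 nJ each, then 2 capture cycles at 5 nJ each.
CLOCK_PERIOD_S = 10e-9
TRACE_J = [2e-9] * 8 + [5e-9] * 2

AVG_W = average_power(TRACE_J, CLOCK_PERIOD_S)
PEAK_W = peak_power(TRACE_J, CLOCK_PERIOD_S)
```

In this made-up trace the shift cycles outnumber the capture cycles, so they contribute most of the energy behind the average, while the peak is set by a capture cycle.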

#### **Existing System:**

At-speed test of logic blocks is nowadays frequently performed using Logic BIST (LBIST), which can take the form of either combinational LBIST or scan-based LBIST, depending on whether the CUT is a combinational circuit or a sequential one with scan. In the case of scan-based LBIST, two basic capture-clocking schemes exist: 1) the launch-on-shift (LOS) scheme and 2) the launch-on-capture (LOC) scheme. In LOS schemes, test vectors are applied to the CUT at the last clock (CK) of the shift phase, and the CUT response is sampled on the scan chains at the following capture CK. In the LOC scheme, instead, test vectors are first loaded into the scan chains during the shift phase; then, in a following capture phase, they are first applied to the CUT at a launch CK, and the CUT response is captured on the scan chains at a following capture CK. In this paper, we consider the case of sequential CUTs with scan-based LBIST adopting an LOC scheme, which is frequently adopted for high-performance microprocessors. They suffer from the PD problems discussed above, especially during the capture phase, due to the high activity factor (AF) of the CUT induced by the applied test patterns. We consider the conventional scan-based LBIST (Conv-LBIST) architecture shown in Fig. 1.

The state flip-flops (FFs) of the CUT are scan FFs, arranged into many scan chains (s scan chains in Fig. 1). The pseudorandom pattern generator is implemented by a linear feedback shift register (LFSR). The phase shifter (PS), which reduces the correlation among the test vectors applied to adjacent scan chains, is composed of an XOR network expanding the number of outputs of the LFSR to match the number of scan chains. At each shift CK, the PS provides at its outputs the current LFSR output configuration, together with future/past configurations.
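As a rough illustration of how an LFSR and an XOR-based phase shifter interact, the following sketch models a 4-stage LFSR driving three PS outputs. The feedback taps, the seed, and the tap sets of the PS outputs are assumptions chosen for illustration, not the configuration of Fig. 1.

```python
def lfsr_step(state):
    """Advance a 4-stage LFSR by one shift clock.

    state = (x1, x2, x3, x4); the feedback x1 ^ x4 is an assumed tap choice.
    """
    x1, x2, x3, x4 = state
    return (x1 ^ x4, x1, x2, x3)


def phase_shifter(state, taps_per_output):
    """Each PS output is the XOR of the LFSR stages in its tap set (1-based)."""
    outputs = []
    for taps in taps_per_output:
        bit = 0
        for t in taps:
            bit ^= state[t - 1]
        outputs.append(bit)
    return outputs


def generate(seed, taps_per_output, n_clocks):
    """Return the PS output row produced at each of n_clocks shift clocks."""
    state = seed
    rows = []
    for _ in range(n_clocks):
        rows.append(phase_shifter(state, taps_per_output))
        state = lfsr_step(state)
    return rows


# Illustrative tap sets: output 0 = x4, output 1 = x1, output 2 = x1^x2^x3^x4.
TAPS = [(4,), (1,), (1, 2, 3, 4)]
ROWS = generate((1, 0, 0, 0), TAPS, 30)
```

With these assumed taps, each PS output carries the same pseudorandom sequence at a different phase: output 1 at clock t equals output 0 at clock t+3, which is the kind of past/future relation between PS outputs that the text refers to.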



Figure 1: Schematic of the considered scan-based LBIST architecture.

The Space Compactor compacts the outputs of these scan chains to match the number of inputs of the Multiple-Input Signature Register (MISR). The MISR, the test response analyzer, and the BIST Controller are the same as in combinational scan-based LBIST.
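The response-compaction path can be sketched as follows. The round-robin grouping in the space compactor, the MISR width, the feedback taps, and the sample responses are all assumptions made for illustration only.

```python
def space_compact(chain_outputs, misr_width):
    """XOR scan-chain outputs into misr_width groups (assumed round-robin grouping)."""
    compacted = [0] * misr_width
    for idx, bit in enumerate(chain_outputs):
        compacted[idx % misr_width] ^= bit
    return compacted


def misr_step(state, inputs, taps=(0, 3)):
    """One MISR clock: LFSR-style shift with feedback, then XOR the parallel inputs."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]
    shifted = [feedback] + state[:-1]
    return [s ^ i for s, i in zip(shifted, inputs)]


def signature(responses, misr_width=4):
    """Compact a sequence of scan-chain output rows into a final signature."""
    state = [0] * misr_width
    for chain_outputs in responses:
        state = misr_step(state, space_compact(chain_outputs, misr_width))
    return state


# Hypothetical responses from 6 scan chains over 3 clocks; BAD has one flipped bit.
GOOD = [[1, 0, 1, 1, 0, 0], [0, 1, 1, 0, 1, 0], [1, 1, 0, 0, 0, 1]]
BAD = [[1, 0, 1, 1, 0, 1], [0, 1, 1, 0, 1, 0], [1, 1, 0, 0, 0, 1]]

SIG_GOOD = signature(GOOD)
SIG_BAD = signature(BAD)
```

Comparing `SIG_GOOD` against `SIG_BAD` mimics the role of the test response analyzer: a mismatch between the observed and the expected signature flags a failing CUT.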

As for the scan FFs, our approach requires that, during shift phases, they maintain at their outputs the last test vector applied to the CUT. This is guaranteed by the scan FF considered here, which is frequently employed in microprocessors and is taken as a significant example. However, this can also be achieved with other scan FFs. The internal structure of this FF is shown in Fig. 2. It consists of two sub-blocks, namely, the scan portion and the system portion, each consisting of a master-slave FF composed of two latches (latches LA and LB for the scan portion, and latches PH2 and PH1 for the system portion). The latches have two clocks, and sample one out of two input data lines, depending on which clock is active.



Figure 2: Considered scan FF and signals' timing

The clocking scheme adopted to implement an LOC strategy is also reported in Fig. 2. It consists of a shift phase [scan enable (SE) = 1] and a capture phase (SE = 0). During the shift phase, a new test vector is loaded in the scan chains after n shift CKs, where n is the number of scan FFs of the longest scan chain. At each shift CK, a new bit of the test vector present at the scan-in of latch LA is shifted to the scan-out of latch LB.
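A behavioral sketch of this shift/capture behavior is given below. It models each scan FF as a scan portion plus a system portion and is a simplification under stated assumptions: the class and method names are hypothetical, individual latches and clock phases are not modeled, and the transfer to the system portion is lumped into a single update step.

```python
class ScanChain:
    """Behavioral model of one scan chain of n scan FFs (simplified, hypothetical)."""

    def __init__(self, n):
        self.scan = [0] * n    # scan portion (latches LA/LB), loaded by shifting
        self.system = [0] * n  # system portion outputs, which drive the CUT

    def shift(self, scan_in_bit):
        """One shift CK: a new bit enters at scan-in, all scan bits move by one."""
        self.scan = [scan_in_bit] + self.scan[:-1]

    def update(self):
        """Launch: transfer the loaded vector to the outputs driving the CUT."""
        self.system = list(self.scan)


def load_vector(chain, vector):
    """Load a full test vector in n shift CKs (last bit is shifted in first)."""
    for bit in reversed(vector):
        chain.shift(bit)


CHAIN = ScanChain(4)
load_vector(CHAIN, [1, 0, 1, 1])
CHAIN.update()
OLD = list(CHAIN.system)       # vector currently applied to the CUT

load_vector(CHAIN, [0, 1, 1, 0])
HELD = list(CHAIN.system)      # outputs seen by the CUT during the shift phase

CHAIN.update()
NEW = list(CHAIN.system)       # outputs after the next launch
```

In this model the outputs seen by the CUT do not toggle while the next vector is shifted in, which is the property the approach requires of the scan FFs.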

#### **Disadvantages:**

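Performance is low: the high AF induced in the CUT during capture phases produces significant PD, which may cause false test fails.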

#### **Proposed System:**

We propose a novel, scalable approach to reduce PD during capture phases of scan-based LBIST, thus reducing the probability of generating false test fails during test. Similar to existing solutions, our approach reduces the AF of the CUT compared with conventional scan-based LBIST, by properly modifying the test vectors generated by the Linear Feedback Shift Register (LFSR). Our approach is somewhat similar to reseeding techniques, to the extent that the sequence of test vectors is properly modified in order to fulfill a given requirement. That requirement, however, is not to increase FC (as is usually the case for reseeding), but to reduce PD.

#### Proposed scalable approach:

As we introduced in Section I, the goal of our approach is to reduce the PD that may generate false test fails during at-speed test with scan-based LBIST. Such a PD occurs after the application of a new test vector to the CUT. This occurs at the launch CK (Update pulse in Fig. 2) within capture phases. The generated PD is proportional to the CUT AF induced by the application of a new test vector, which in turn depends on the AF of the scan FFs' outputs. For the considered scan FFs (Fig. 2), such an AF depends on the number of FFs' outputs switching when the new test vector is applied. Therefore, the target of our approach is to reduce the number of FFs' output transitions occurring after the application of a new test vector to the CUT.
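The quantity targeted here can be made concrete with a short sketch: the number of scan-FF outputs that switch at the launch CK is the number of bit positions in which the previous and the new test vector differ. The function names and the example vectors are illustrative assumptions.

```python
def output_transitions(prev_vector, new_vector):
    """Count scan-FF outputs that switch when new_vector replaces prev_vector."""
    return sum(1 for a, b in zip(prev_vector, new_vector) if a != b)


def scan_output_af(prev_vector, new_vector):
    """Fraction of scan-FF outputs that toggle at the launch CK."""
    return output_transitions(prev_vector, new_vector) / len(new_vector)


# Hypothetical consecutive test vectors on an 8-FF scan chain.
T_PREV = [0, 1, 1, 0, 1, 0, 0, 1]
T_NEW = [1, 0, 1, 1, 0, 0, 1, 0]

TRANSITIONS = output_transitions(T_PREV, T_NEW)
AF = scan_output_af(T_PREV, T_NEW)
```

For these two example vectors, six of the eight outputs switch, giving an AF of 0.75 at the scan-FF outputs.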

#### Approach with 1 Substitute Test Vector:

For each scan chain $m$ ($m = 1 \ldots s$), one substitute test (ST) vector $ST^m_i$ replaces the original test vector $T^m_i$ to be applied to the CUT at the $i$th capture phase according to Conv-LBIST (Fig. 3). It will be shown that this enables a 50% AF reduction compared with Conv-LBIST. In our approach, the ST vector $ST^m_i$ to be loaded in scan chain (SC) $m$ and applied to the CUT at the $i$th capture phase is constructed based on the structure of the test vectors $T^m_{i-1}$ and $T^m_{i+1}$ to be applied at the $(i-1)$th and $(i+1)$th capture phases. Assuming the presence of a generic PS, our solution exploits the fact that, during the shift phase preceding the $i$th capture phase, test vectors $T^m_{i-1}$ and $T^m_{i+1}$ are given at proper outputs of the PS. Should some test vectors not be produced at the PS outputs, the PS could be easily modified to generate them.
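The exact rule used to build the ST vector is not spelled out in this section, so the following is only a hypothetical construction that is consistent with the description: bits on which $T^m_{i-1}$ and $T^m_{i+1}$ agree are copied, and bits on which they differ are taken alternately from one or the other. Both the alternating fill rule and the example vectors are assumptions.

```python
def hamming(a, b):
    """Number of bit positions in which vectors a and b differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def substitute_vector(t_prev, t_next):
    """Build a hypothetical ST vector from the previous and following test vectors.

    Assumed rule: copy the common value where t_prev and t_next agree;
    where they differ, alternate between the t_prev bit and the t_next bit
    so that the unavoidable transitions are split across the two capture phases.
    """
    st = []
    take_prev = True
    for p, n in zip(t_prev, t_next):
        if p == n:
            st.append(p)
        else:
            st.append(p if take_prev else n)
            take_prev = not take_prev
    return st


# Hypothetical vectors for capture phases i-1, i, and i+1 of one scan chain.
T_PREV = [0, 1, 1, 0, 1, 0, 0, 1]
T_ORIG = [1, 0, 1, 1, 0, 0, 1, 0]
T_NEXT = [0, 1, 0, 0, 1, 1, 0, 1]

ST = substitute_vector(T_PREV, T_NEXT)
CONV_TRANSITIONS = hamming(T_PREV, T_ORIG) + hamming(T_ORIG, T_NEXT)
ST_TRANSITIONS = hamming(T_PREV, ST) + hamming(ST, T_NEXT)
```

With any rule of this kind, the total number of output transitions over the two capture phases equals the number of positions in which $T^m_{i-1}$ and $T^m_{i+1}$ differ, which can never exceed the total obtained by passing through the original $T^m_i$. The size of the saving in this toy example depends on the chosen vectors and is not meant to reproduce the 50% figure.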



Figure 3: Schematic of (a) the sequence of test vectors filling each scan chain $m$; (b) bits in the ST vector $ST^m_i$ and in the test vectors applied/to be applied at the previous/following capture phase.


PS PERFORMED FUNCTION AND GENERATED OUTPUTS

| $O^m(\xi) = T^m_i(j)$ | PS function | $O^m(\xi-3) = T^m_{i-1}(j) = O^k(\xi)$ | $O^m(\xi+3) = T^m_{i+1}(j) = O^p(\xi)$ |
|-----------------------|-------------|----------------------------------------|----------------------------------------|
| $O^{1}(\xi)$ | $X_4(\xi)$ | $O^{10}(\xi)$ | $O^{2}(\xi)$ |
| $O^{2}(\xi)$ | $X_1(\xi)$ | $O^{1}(\xi)$ | $O^{3}(\xi)$ |
| $O^{3}(\xi)$ | $X_1(\xi) \oplus X_2(\xi) \oplus X_3(\xi) \oplus X_4(\xi)$ | $O^{2}(\xi)$ | $O^{1}(\xi) \oplus O^{?}(\xi)$ |
| $O^{4}(\xi)$ | $X_3(\xi) \oplus X_2(\xi)$ | $O^{4}(\xi) \oplus O^{11}(\xi)$ | $O^{5}(\xi)$ |
| $O^{5}(\xi)$ | $X_2(\xi)$ | $O^{4}(\xi)$ | $O^{6}(\xi)$ |
| $O^{6}(\xi)$ | $X_1(\xi) \oplus X_2(\xi) \oplus X_4(\xi)$ | $O^{5}(\xi)$ | $O^{2}(\xi) \oplus O^{8}(\xi)$ |
| $O^{7}(\xi)$ | $X_3(\xi) \oplus X_3(\xi)$ | $O^{?}(\xi) \oplus O^{7}(\xi)$ | $O^{8}(\xi)$ |
| $O^{8}(\xi)$ | $X_0(\xi)$ | $O^{7}(\xi)$ | $O^{?}(\xi)$ |
| $O^{9}(\xi)$ | $X_{?}(\xi) \oplus X_{?}(\xi)$ | $O^{2}(\xi)$ | $O^{?}(\xi) \oplus O^{?}(\xi)$ |
| $O^{10}(\xi)$ | $X_3(\xi) \oplus X_4(\xi)$ | $O^{7}(\xi) \oplus O^{10}(\xi)$ | $O^{11}(\xi)$ |
| $O^{11}(\xi)$ | $X_4(\xi)$ | $O^{10}(\xi)$ | $O^{12}(\xi)$ |
| $O^{12}(\xi)$ | $X_{?}(\xi)$ | $O^{11}(\xi)$ | $O^{3}(\xi)$ |

(A ? marks an index that is illegible in the source.)



Fig. 3. Schematic of a possible implementation of our approach with N ST vectors.

Applications: 1) Built-In Self-Test 2) Testing.

Advantages: 1) High speed 2) Area and delay reduced.

# VLSI AND SYSTEMS

These advantages of integrated circuits translate into advantages at the system level:

- **Smaller physical size:** Smallness is often an advantage in itself; consider portable televisions or handheld cellular telephones.
- **Lower power consumption:** Replacing a handful of standard parts with a single chip reduces total power consumption. Reducing power consumption has a ripple effect on the rest of the system: a smaller, cheaper power supply can be used; since less power consumption means less heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic shielding may be feasible, too.
- **Reduced cost:** Reducing the number of components, the power supply requirements, cabinet costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such that the cost of a system built from custom ICs can be less, even though the individual ICs cost more than the standard parts they replace. Understanding why integrated circuit technology has such profound influence on the design of digital systems requires understanding both the technology of IC manufacturing and the economics of ICs and digital systems.

# **Experimental Results:**



# **RTL DIAGRAM**



#### Internal block diagram



#### **Simulation results**

# **Parameters**

| Property Name                          | Value               |
|----------------------------------------|---------------------|
| Evaluation Development Board           | None Specified      |
| Product Category                       | All                 |
| Family                                 | Spartan3E           |
| Device                                 | XC3S500E            |
| Package                                | FG320               |
| Speed                                  | -5                  |
| Top-Level Source Type                  | HDL                 |
| Synthesis Tool                         | XST (VHDL/Verilog)  |
| Simulator                              | ISim (VHDL/Verilog) |
| Preferred Language                     | Verilog             |
| Property Specification in Project File | Store all values    |
| Manual Compile Order                   |                     |
| VHDL Source Analysis Standard          | VHDL-93             |
| Enable Message Filtering               |                     |

**FPGA** 

# <u>Area</u>

**Device Utilization Summary**

| Logic Utilization                              | Used | Available | Utilization |
|------------------------------------------------|------|-----------|-------------|
| Number of Slice Latches                        | 56   | 9,312     | 1%          |
| Number of 4 input LUTs                         | 130  | 9,312     | 1%          |
| Number of occupied Slices                      | 66   | 4,656     | 1%          |
| Number of Slices containing only related logic | 66   | 66        | 100%        |
| Number of Slices containing unrelated logic    | 0    | 66        | 0%          |
| Total Number of 4 input LUTs                   | 130  | 9,312     | 1%          |
| Number of bonded IOBs                          | 94   | 232       | 40%         |
| IOB Latches                                    | 24   |           |             |
| Number of BUFGMUXs                             | 2    | 24        | 8%          |
| Average Fanout of Non-Clock Nets               | 3.58 |           |             |

# <u>Delay</u>



# **Power**

**Device and environment settings**

| Setting          | Value                      |
|------------------|----------------------------|
| Family           | Spartan3e                  |
| Part             | xc3s500e                   |
| Package          | fg320                      |
| Temp Grade       | Commercial                 |
| Process          | Typical                    |
| Speed Grade      | -5                         |
| Ambient Temp (C) | 25.0                       |
| Use custom TJA?  | No                         |
| Custom TJA (C/W) | NA                         |
| Airflow (LFM)    | 0                          |
| Characterization | PRODUCTION, v1.2, 06-23-09 |

**On-chip power**

| On-Chip | Power (W) | Used | Available | Utilization (%) |
|---------|-----------|------|-----------|-----------------|
| Clocks  | 0.000     | 4    | -         |                 |
| Logic   | 0.000     | 130  | 9312      | 1               |
| Signals | 0.000     | 147  | -         |                 |
| IOs     | 0.000     | 94   | 232       | 41              |
| Leakage | 0.081     |      |           |                 |
| Total   | 0.081     |      |           |                 |

**Supply summary**

| Source | Voltage (V) | Total Current (A) | Dynamic Current (A) | Quiescent Current (A) |
|--------|-------------|-------------------|---------------------|-----------------------|
| Vccint | 1.200       | 0.026             | 0.000               | 0.026                 |
| Vccaux | 2.500       | 0.018             | 0.000               | 0.018                 |
| Vcco25 | 2.500       | 0.002             | 0.000               | 0.002                 |

|                  | Total | Dynamic | Quiescent |
|------------------|-------|---------|-----------|
| Supply Power (W) | 0.081 | 0.000   | 0.081     |

**Thermal properties**

| Effective TJA (C/W) | Max Ambient (C) | Junction Temp (C) |
|---------------------|-----------------|-------------------|
| 26.1                | 82.9            | 27.1              |
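As a quick consistency check on the reported figures, the total supply power can be recomputed from the per-rail voltages and currents, and the utilization percentages from the used and available resource counts. The variable and function names below are illustrative; the numbers are taken from the report above.

```python
# Per-rail (voltage in V, total current in A) as listed in the supply summary.
SUPPLIES = {
    "Vccint": (1.200, 0.026),
    "Vccaux": (2.500, 0.018),
    "Vcco25": (2.500, 0.002),
}


def supply_power_w(supplies):
    """Total supply power as the sum of voltage * current over all rails."""
    return sum(voltage * current for voltage, current in supplies.values())


TOTAL_W = supply_power_w(SUPPLIES)

# Utilization percentages from the used / available resource counts.
IO_UTIL_PCT = 100 * 94 / 232
LUT_UTIL_PCT = 100 * 130 / 9312
```

Rounded to three decimals, the recomputed total matches the reported 0.081 W. The I/O utilization works out to roughly 40.5%, which is consistent with both the 40% shown in the utilization summary and the 41% shown in the power report, depending on rounding.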

#### **CONCLUSION**

We have presented a novel approach to reduce PD during at-speed test of sequential circuits with scan-based LBIST using the LOC scheme. The proposed solution enables designers to reduce the probability that the delay induced by PD exhibited during at-speed test is erroneously interpreted as a delay fault, with consequent generation of a false test fail. This is achieved by reducing the AF of the CUT compared with conventional scan-based LBIST, by proper modification of the test vectors generated by the LFSR. We have shown that, compared with conventional scan-based LBIST, our approach allows us to achieve a scalable PD reduction (ranging from 50% to 89%), with no drawback on the number of test vectors required to achieve a target FC and with limited costs in terms of area overhead (AO, ranging from 1.5% to 14%). We have also shown that, compared with the solutions in [9] and [21], our solution allows us to reduce significantly (by more than 50%) the number of test vectors, and thus the test time (TT), needed to achieve the same target FC.



Student: RELA RAGHU RAM

RAMACHANDRA COLLEGE OF ENGINEERING

Vatluru (V), NH-5 Bypass Road, Eluru, West Godavari Dist., AP

Guide: D. SRIDHAR, M.Tech., (Ph.D), Associate Professor, Ramachandra College of Engineering, Eluru. Ph: +91-9866612112



# REFERENCES

[1] J. Rajski, J. Tyszer, G. Mrugalski, and B. Nadeau-Dostie, "Test generator with preselected toggling for low power built-in self-test," in Proc. Eur. Test Symp., May 2012, pp. 1–6.

[2] Y. Sato, S. Wang, T. Kato, K. Miyase, and S. Kajihara, "Low power BIST for scan-shift and capture power," in Proc. IEEE 21st Asian Test Symp., Nov. 2012, pp. 173–178.

[3] E. K. Moghaddam, J. Rajski, M. Kassab, and S. M. Reddy, "At-speed scan test with low switching activity," in Proc. IEEE VLSI Test Symp., Apr. 2010, pp. 177–182.

[4] S. Balatsouka, V. Tenentes, X. Kavousianos, and K. Chakrabarty, "Defect aware X-filling for low-power scan testing," in Proc. Design, Autom. Test Eur. Conf. Exhibit., Mar. 2010, pp. 873–878.

[5] I. Polian, A. Czutro, S. Kundu, and B. Becker, "Power droop testing," IEEE Design Test Comput., vol. 24, no. 3, pp. 276–284, May/Jun. 2007.

[6] X. Wen et al., "On pinpoint capture power management in at-speed scan test generation," in Proc. IEEE Int. Test Conf., Nov. 2012, pp. 1–10.

[7] S. Kiamehr, F. Firouzi, and M. B. Tahoori, "A layout-aware X-filling approach for dynamic power supply noise reduction in at-speed scan testing," in Proc. IEEE Eur. Test Symp., May 2013, pp. 1–6.

[8] M. Nourani, M. Tehranipoor, and N. Ahmed, "Low-transition test pattern generation for BIST-based applications," IEEE Trans. Comput., vol. 57, no. 3, pp. 303–315, Mar. 2008.

[9] N. Z. Basturkmen, S. M. Reddy, and I. Pomeranz, "A low power pseudo-random BIST technique," in Proc. 8th IEEE Int. On-Line Test. Workshop, Jul. 2002, pp. 140–144.

[10] J. Rajski, N. Tamarapalli, and J. Tyszer, "Automated synthesis of large phase shifters for built-in self-test," in Proc. Int. Test Conf., Oct. 1998, pp. 1047–1056.

[11] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, and H. J. Wunderlich, "A modified clock scheme for a low power BIST test pattern generator," in Proc. IEEE VLSI Test Symp., Apr./May 2001, pp. 306–311.

