# Compatible VLSI Architecture of Multiplier and Accumulator Based on Modified Booth Algorithm

# Pratibhadevi Tapashetti<sup>1</sup>, Dr. Praveen Kumar<sup>2</sup>

<sup>1</sup>PhD Scholar, NIMS University, Jaipur, Rajasthan, India

<sup>2</sup> Associate Professor, Department of Computer Science and Engineering, NIMS Institute of Engineering and Technology, NIMS University, Jaipur, Rajasthan, India

*Abstract* – This study proposes a new, efficient VLSI architecture for a parallel multiplier-and-accumulator (MAC) that uses a hybrid approach for high-speed arithmetic operations. Because the accumulator has the largest delay in a MAC, it is merged into the carry save adder (CSA), and CSA and compressor techniques are used in the processing element to improve the overall performance.

The proposed MAC accumulates the intermediate results in the form of sum and carry bits instead of the output of the final adder, which makes it possible to optimize the pipeline scheme and improve performance.

This study presents an efficient implementation of a high-speed multiplier based on the radix-8 modified Booth multiplier algorithm. Parallel multipliers such as radix-2 and radix-4 modified Booth multipliers perform the computation using fewer adders and fewer iterative steps. However, area and speed are two conflicting performance constraints, so increasing speed generally results in a larger area. In this work, we arrive at a better trade-off between the two by realizing a marginally increased multiplier speed. The proposed MAC shows better properties than the existing standard design in many ways, with performance twice that of previous research at a similar clock frequency.

*Keywords:* Booth Multiplier, Carry Save Adder–CSA, Computer Arithmetic, Digital Signal Processing-DSP, Multiplier & Accumulator-MAC, Booth Algorithm etc.

#### I. INTRODUCTION

The demand for high speed and efficient processing has been mounting as a result of growing computer and signal processing applications.

The core of every processing system is its data path. Available statistics [3] indicate that more than 70% of instructions perform arithmetic and logical operations, consisting mainly of addition and multiplication, in the data path of RISC and CISC machines. Multiplication-based computations, such as multiply-and-accumulate and inner product, are among the most intensive arithmetic functions. They are currently implemented in many signal processing applications, such as convolution, fast Fourier transform, and filtering, and in the arithmetic and logic units of microprocessors. Since multiplication dominates the execution time of most signal and instruction processing algorithms, there is a need for a speed-efficient multiplier.

Low power consumption and area efficiency are also among the most important criteria for the fabrication of any processing or high-performance system. Optimizing the speed and area of the multiplier is a major design issue. However, area and speed are usually conflicting constraints, so improving speed mostly results in a larger area.

In this study, we address this problem by introducing a new, efficient VLSI architecture for a parallel multiplier-and-accumulator (MAC) that uses a hybrid approach for high-speed arithmetic operations. Because the accumulator has the largest delay in a MAC, carry save adder (CSA) and compressor techniques are used in the processing element to improve the overall performance.

In past work, the general Booth algorithm and carry look-ahead adders (CLAs) were used in the MAC operation to obtain efficient outputs through a pipeline scheme. However, the general Booth algorithm and CLAs degrade the performance of that pipeline scheme. To remedy these issues, the proposed method introduces the modified Booth algorithm and the CSA.

This study presents an efficient implementation of a high-speed multiplier based on the radix-8 modified Booth multiplier algorithm. Parallel multipliers such as radix-2 and radix-4 modified Booth multipliers perform the computation using fewer adders and fewer iterative steps.

However, area and speed are two conflicting performance constraints, so increasing speed generally results in a larger area. In this work, we arrive at a better trade-off between the two by realizing a marginally increased speed. The proposed MAC shows better properties than the existing standard design in many ways, with performance twice that of previous research at a similar clock frequency.

Power reduction techniques are also adopted in this work, and we expect that the proposed MAC can be adapted to various fields requiring high performance, such as signal processing.

### II. LITERATURE SURVEY

Sathya, A.; Fathimabee, S.; Divya, S., "Parallel multiplier-accumulator based on radix-2 modified Booth algorithm by using a VLSI architecture," Electronics and Communication Systems (ICECS), 2014 International Conference on , vol., no., pp.1,5, 13-14 Feb. 2014

In this paper, a new architecture of multiplier-and-accumulator (MAC) for high-speed arithmetic was proposed. By combining multiplication with accumulation and devising a hybrid type of carry save adder (CSA), the performance was improved. The proposed CSA tree uses the 1's-complement-based radix-2 modified Booth's algorithm (MBA) and has a modified array for the sign extension in order to increase the bit density of the operands. The CSA propagates the carries to the least significant bits of the partial products and generates the least significant bits in advance to decrease the number of input bits of the final adder.

Suriya, T.S.U.; Rani, A.A., "Low power analysis of MAC using modified booth algorithm," Computing, Communications and Networking Technologies (ICCCNT),2013 Fourth International Conference on , vol., no., pp.1,5, 4-6 July 2013

High-speed multipliers are fundamental to digital applications such as signal processing. A new architecture of multiplier-and-accumulator (MAC) was proposed for high-speed arithmetic. By combining multiplication with accumulation, the performance was improved. In the modified Booth algorithm technique, the modified Booth encoder reduces the number of partial products. Even in general-purpose processors, high-speed multipliers are needed to provide a physically compact, fast, and low-power chip. To save significant power in a VLSI design, a good direction is to reduce its dynamic power. This paper proposes that the spurious power suppression technique (SPST) in VLSI reduces the power consumption of the system significantly.

Naveen Kumar, Manu Bansal, Amandeep Kaur, "Speed Power and Area Efficient VLSI Architectures of Multiplier and Accumulator," International Journal of Scientific & Engineering Research Volume 4, Issue 1, January-2013

This paper proposed high-speed, low-power, and area-efficient architectures for a modified Booth Wallace MAC. The carry select adder (CSLA) has a comparatively short critical path and hence less combinational path delay, but it has a higher leaf cell count and combinational path area. It also has higher dynamic power than the carry look-ahead adder (CLA) and the carry skip adder (CSKA). CLA and CSKA architectures can therefore be used for low-power applications, as they have low dynamic and cell leakage power.

J. Y. Yahwanth Babu, P. Dinesh Kumar, "A new multiplier Accumulator architecture based on high accuracy modified Booth algorithm," International journal of advanced research in computer Engineering & Technology, vol.2,no.3,pp.1036-1040,Mar 2013.

In this paper [19], a new MAC architecture is developed for high-speed performance. The performance improvement is achieved by merging the CSA and the accumulator. The MAC architecture is synthesized with a 180 nm standard CMOS library using Cadence SoC Encounter.

Rashmi Ranjan et al., "A New VLSI Architecture Of Parallel Multiplier Based On Radix-8 Modified Booth Algorithm Using VHDL," International Journal of Computer Science & Engineering Technology (IJCSET), Vol. 3 No. 4 April, 2012

This paper presented an efficient implementation of a high-speed multiplier using the shift-and-add method and the radix-8 modified Booth multiplier algorithm. The architecture includes a final adder with the size of 2 to perform a multiplication. This means that an operational bottleneck is induced in the final adder regardless of how much delay is reduced elsewhere.

Adalanki Purna Ramesh, Dr. A. V. N. Tilak, Dr. A. M. Prasad, "Efficient implementation of 16 bit multiplier, Accumulator using Radix – 2 modified Booth algorithm and SPST adder using Verilog," International journal of VLSI design and communication systems, vol. 3, no.3, pp.107-118, Jun. 2012.

This work targets high speed and low power. To improve speed and reduce dynamic power, glitches (1-0 transitions) and spikes (0-1 transitions) must be reduced. The adder is designed using the spurious power suppression technique (SPST), which avoids unwanted glitches and spikes.

S. Jagadeesh, S. Venkata Chary, "Design of Parallel Multiplier–Accumulator Based on Radix-8 Modified Booth Algorithm with SPST," International Journal Of Engineering Research And Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, issue 5, September-October 2012, pp.425-431

A new MAC architecture was proposed to execute the multiplication-accumulation operation, which is the key operation in digital signal processing and multimedia information processing, efficiently. By removing the independent accumulation process, which has the largest delay, and merging it into the compression process of the partial products, the overall MAC performance was improved to twice that of the previous work.

B. Jyothirmai, M. Premalatha, "Design and implementation of parallel MAC by Booth algorithm," International Journals of scientific Research, vol.1, no.6, pp.78-79, Nov.2012.

This paper presents a new architecture for a high-speed MAC in which the computations of multiplication and accumulation are combined and a hybrid-type CSA structure is used to reduce the critical path and improve the output rate.

Harilal, Durga Prasad, "High speed arithmetic Architecture of parallel multiplier Accumulator based on Radix-2 modified Booth algorithm," International journals of computational Engineering research, vol.2, no.8, pp.28-38, Dec.2012.

This paper presents an efficient implementation of a high-speed multiplier using the shift-and-add modified Booth algorithm. The adder used is a carry look-ahead adder. The compression tree, together with the carry look-ahead adder, reduces the hardware overhead and power consumption.

# III. PROPOSED RESEARCH METHODOLOGY

In this work, an efficient architecture is proposed for high-speed arithmetic multiplication. The MAC combines the radix-8 Booth algorithm, a modified Booth multiplier, and a 32-bit Wallace tree multiplier to improve speed.

The Wallace tree basically multiplies two unsigned integers. The conventional Wallace tree multiplier architecture comprises an AND array for computing the partial products, a carry save adder for adding the partial products so obtained, and a carry propagate adder in the final stage of addition. In the proposed architecture, partial product generation and reduction are accomplished using the Booth algorithm and 3:2, 4:2, and 5:2 compressor structures.
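The conventional Wallace tree flow described above (AND array, carry save reduction, final carry propagate addition) can be illustrated with a small behavioral model. This is only a sketch for unsigned operands, not the proposed hardware; the function names `carry_save_add`, `and_array`, `wallace_reduce`, and `wallace_multiply` are hypothetical, and Python integers stand in for bit vectors.

```python
def carry_save_add(a, b, c):
    """3:2 compression: three rows in, a sum row and a carry row out."""
    s = a ^ b ^ c                                 # per-bit sum, no carry ripple
    carry = ((a & b) | (b & c) | (a & c)) << 1    # per-bit majority, weight x2
    return s, carry


def and_array(x, y, n):
    """Partial products of an n-bit unsigned multiply: x AND-ed with each bit of y."""
    return [(x << i) if (y >> i) & 1 else 0 for i in range(n)]


def wallace_reduce(rows):
    """Apply layers of 3:2 compressors until only two rows remain."""
    rows = list(rows)
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        nxt.extend(rows[full:])                   # leftover rows pass through
        rows = nxt
    while len(rows) < 2:
        rows.append(0)
    return rows[0], rows[1]


def wallace_multiply(x, y, n=8):
    """Unsigned multiply: AND array -> carry save tree -> carry propagate add."""
    s, c = wallace_reduce(and_array(x, y, n))
    return s + c                                  # the only carry-propagating step


print(wallace_multiply(13, 11))  # 143
```

Each 3:2 layer preserves the total of the rows, so carry propagation is deferred to the single final addition, which is the property the compressor tree exploits.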

## Proposed Implementation:

The proposed MAC is implemented and analyzed, and it is then compared with some previous research. First, the amount of hardware resources used in the implementation is analyzed theoretically and experimentally, and then the delay of the hardware is analyzed. Finally, the pipeline stages are defined and the performance is analyzed based on the pipelining scheme. The results from each section are compared with the standard design [4] and Elguibaly's design [3], each of which has the most representative parallel MBA architecture.

## Overview of MAC:

In this MAC operation, the multiplier can be divided into three parts. The first is radix-8 Booth encoding, in which the partial products are generated from the multiplicand (X) and the multiplier (Y). The second is the adder array, or partial product compression. The last is the final addition, in which the final multiplication result is produced by adding the sum and the carry.

The general hardware architecture of this MAC is shown in Fig. 1.

Fig. 1. General hardware architecture of MAC

The general hardware architecture of the MAC executes the multiplication operation by multiplying the input multiplicand X by the multiplier Y. The product is added to the previous multiplication result Z as the accumulation step, if accumulation is needed.
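As a point of reference, the multiply-then-optionally-accumulate behavior can be written as a short golden model. This is a sketch under stated assumptions: the names `to_signed` and `mac_step` are hypothetical, operands are taken as N-bit 2's complement values, and the accumulator is assumed to be 2N bits wide.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a 2's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac_step(x, y, z, n=16, accumulate=True):
    """One MAC operation: X * Y, plus the previous result Z when accumulating."""
    product = to_signed(x, n) * to_signed(y, n)
    total = product + (z if accumulate else 0)
    return to_signed(total, 2 * n)                # wrap to the assumed 2N-bit width


# Inner product of two short vectors, accumulated one MAC step at a time.
z = 0
for x, y in [(1, 2), (3, 4), (-5, 6)]:
    z = mac_step(x, y, z)
print(z)  # -16
```

A model like this is mainly useful for checking a hardware simulation against expected values.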

An N-bit 2's complement binary number X can be expressed as

$$X = -2^{N-1}x_{N-1} + \sum_{i=0}^{N-2} x_i 2^i, \qquad x_i \in \{0, 1\}. \tag{1}$$

If (1) is expressed in base-4 redundant signed-digit form in order to apply the modified Booth's algorithm, it becomes

$$X = \sum_{i=0}^{N/2-1} d_i 4^i \tag{2}$$

where

$$d_i = -2x_{2i+1} + x_{2i} + x_{2i-1}, \qquad x_{-1} = 0. \tag{3}$$

If (2) is used, multiplication can be expressed as

$$X \times Y = \sum_{i=0}^{N/2-1} d_i 2^{2i} Y. \tag{4}$$

The multiplication-accumulation result is then

$$P = X \times Y + Z = \sum_{i=0}^{N/2-1} d_i 2^{2i} Y + \sum_{j=0}^{2N-1} z_j 2^j. \tag{5}$$

The MAC architecture implemented by (5) is called the standard design [4].
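The base-4 signed-digit expansion behind the standard design can be checked numerically. The sketch below assumes the usual Booth digit definition $d_i = -2x_{2i+1} + x_{2i} + x_{2i-1}$ with $x_{-1} = 0$ and an even word length N; the function names `booth_base4_digits` and `mac_from_digits` are hypothetical.

```python
def booth_base4_digits(x, n):
    """Digits d_i in {-2,...,2} with x == sum(d_i * 4**i), for an n-bit
    2's complement x (n even)."""
    bits = [(x >> i) & 1 for i in range(n)]       # 2's complement bits of x
    digits = []
    prev = 0                                      # x_{-1} = 0
    for i in range(n // 2):
        digits.append(-2 * bits[2 * i + 1] + bits[2 * i] + prev)
        prev = bits[2 * i + 1]
    return digits


def mac_from_digits(x, y, z, n=8):
    """P = X*Y + Z evaluated as a sum of digit-weighted multiples of Y, plus Z."""
    return sum(d * 4 ** i * y for i, d in enumerate(booth_base4_digits(x, n))) + z


print(booth_base4_digits(-3, 4))   # [1, -1]  since 1 - 4 = -3
print(mac_from_digits(-3, 7, 10))  # -11
```

Each digit selects one of 0, ±Y, or ±2Y, so an N-bit multiplier yields N/2 partial products rather than N.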

#### Booth Algorithm for Partial Products Generation:

To generate and reduce the number of partial products of the multiplier, the proposed modified Booth algorithm is used. The multiplier is divided into groups of 4 bits, and each group is processed according to the modified Booth algorithm to generate one of the partial products $0, \pm 1A, \pm 2A, \pm 3A, \pm 4A$. These partial products are summed using compressors in a Wallace tree structure.

In the radix-8 Booth algorithm, the multiplier operand B is partitioned into 11 overlapping groups of 4 bits each (for a 32-bit operand, ⌊32/3⌋ + 1 = 11). In the first group, the first bit is taken as zero and the other bits are the three least significant bits of the multiplier operand. In the second group, the first bit is the most significant bit of the first group and the other bits are the next three bits of the multiplier operand. In the third group, the first bit is the most significant bit of the second group and the other bits are the next three bits of the multiplier operand. This process continues for the remaining groups. For each group, a partial product is generated using the multiplicand operand A. For an n-bit multiplier, there are n/3 or ⌊n/3⌋ + 1 groups, and as many partial products, in the proposed radix-8 modified Booth algorithm.
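The grouping just described can be sketched in a few lines. The code assumes the standard radix-8 Booth digit weights (−4, +2, +1 for the three new bits and +1 for the bit shared with the previous group) and a 2's complement operand; `booth_radix8_digits` and `booth_radix8_multiply` are hypothetical names, not taken from the paper.

```python
def booth_radix8_digits(b, n):
    """Recode an n-bit 2's complement multiplier b into radix-8 Booth digits.

    Group i covers bits (3i+2, 3i+1, 3i, 3i-1); the bit below bit 0 is zero,
    and bits at or above position n repeat the sign bit.
    """
    groups = -(-n // 3)                           # ceil(n / 3)

    def bit(k):
        # Python's >> sign-extends, so k >= n returns the sign bit.
        return 0 if k < 0 else (b >> k) & 1

    return [-4 * bit(3 * i + 2) + 2 * bit(3 * i + 1) + bit(3 * i) + bit(3 * i - 1)
            for i in range(groups)]


def booth_radix8_multiply(a, b, n):
    """Sum the partial products d_i * A, each weighted by 8**i."""
    return sum(d * a * 8 ** i for i, d in enumerate(booth_radix8_digits(b, n)))


print(len(booth_radix8_digits(123456789, 32)))  # 11 groups for a 32-bit operand
print(booth_radix8_digits(-3, 8))               # [-3, 0, 0]
print(booth_radix8_multiply(25, -3, 8))         # -75
```

Every digit falls in the range −4 to +4, so the only multiple of A that is not a plain shift or negation is ±3A, which needs one extra addition to precompute.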

#### Compressor for Partial Products Reduction:

In the proposed architecture, a compressor technique is used for the partial product reduction stage. To minimize delay, the XOR blocks in the compressor are replaced with multiplexer (MUX) blocks. The adders of the conventional architecture are replaced by compressors.
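A bit-level model of a 4:2 compressor with MUX-built XORs makes the idea concrete. This is one common textbook formulation and is not claimed to be the exact cell used in the proposed design; `mux`, `xor_mux`, and `compressor_4_2` are hypothetical names.

```python
def mux(sel, d1, d0):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def xor_mux(a, b):
    """XOR built from a multiplexer: select NOT b when a is 1, else b."""
    return mux(a, 1 - b, b)


def compressor_4_2(x1, x2, x3, x4, cin):
    """4:2 compressor: x1+x2+x3+x4+cin == sum + 2*(carry + cout)."""
    p = xor_mux(x1, x2)
    q = xor_mux(x3, x4)
    r = xor_mux(p, q)                             # x1 ^ x2 ^ x3 ^ x4
    s = xor_mux(r, cin)
    cout = mux(p, x3, x1)                         # majority(x1, x2, x3)
    carry = mux(r, cin, x4)                       # majority(x1^x2^x3, x4, cin)
    return s, carry, cout


print(compressor_4_2(1, 1, 1, 0, 1))  # (0, 1, 1): 4 ones in -> 0 + 2*(1+1)
```

Because `cout` does not depend on `cin`, a row of these cells can be chained without a carry rippling along the row.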

#### IV. PROPOSED IMPLEMENTATION

A multiplier design consists of three operational steps. The first is Booth encoding, in which a partial product is generated from the multiplicand X and the multiplier Y. The second is the adder array, or partial product compression, which adds all partial products and converts them into the form of sum and carry. The last is the final addition, in which the final multiplication result is produced by adding the sum and the carry. When the multiplier results are to be accumulated, an additional step is required, as shown in figure 6.1.



Fig. 6.1. Booth's multiplier steps

In our design, we use more advanced features to enhance parallelism. The block diagram of the multiplier is shown in figure 6.2. The modified Booth encoder reduces the number of partial products by half, and these partial products must then be summed. Compressors are used here to reduce the number of partial product summation stages.



Fig. 6.2. Block diagram of Multiplier





Fig. 6.3. Block diagram of Multiplier with pipelined architecture
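The idea of keeping the accumulated value as separate sum and carry words, with the final addition performed only when a result is read, can be modeled as follows. This is a behavioral sketch, not the pipelined hardware: the class name `CarrySaveMac`, its 32-bit default width, and the use of Python's `*` as a stand-in for the Booth encoder and compressor tree are all assumptions made for illustration.

```python
class CarrySaveMac:
    """Accumulator held as (sum, carry) words; carries propagate only on read."""

    def __init__(self, width=32):
        self.width = width
        self.mask = (1 << width) - 1
        self.sum = 0
        self.carry = 0

    def step(self, x, y):
        # Stand-in for Booth encoding and the compressor tree.
        product = (x * y) & self.mask
        # Fold the product into the accumulator with one 3:2 compression.
        a, b, c = self.sum, self.carry, product
        self.sum = (a ^ b ^ c) & self.mask
        self.carry = (((a & b) | (b & c) | (a & c)) << 1) & self.mask

    def result(self):
        # The final adder: the only place where carries ripple.
        total = (self.sum + self.carry) & self.mask
        return total - (1 << self.width) if total >> (self.width - 1) else total


mac = CarrySaveMac()
for x, y in [(1, 2), (3, 4), (-5, 6)]:
    mac.step(x, y)
print(mac.result())  # -16
```

Since `step` contains no carry-propagating addition, its delay does not grow with the word length the way a full adder in the accumulation loop would.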

# V. RESULTS

The simulation results of the MAC using Booth's algorithm are shown in figure 6.4.



Fig. 6.4. MAC Simulation results using Xilinx

# VI. CONCLUSION

The proposed MAC implementation demonstrates better properties than the present standard design, with performance twice that of previous research at a similar clock frequency.

Power reduction techniques are also adopted in this work, and we expect that the proposed MAC can be adapted to various fields requiring high performance, such as signal processing.

# REFERENCES

- [1] S. Waser and M. J. Flynn, "Introduction to arithmetic for Digital system designers," New York: Holt, Rinehart and Winston, 1982.
- [2] J. J. F. Cavanagh, Digital computer arithmetic. New York: McGraw-Hill, 1984.
- [3] F. Elguibaly, "A fast parallel multiplier-accumulator using the modified Booth algorithm," IEEE Trans. Circuits Syst., vol. 27, no. 9, pp. 902–908, Sep. 2000.
- [4] A. R. Omondi, Computer arithmetic systems. Englewood Cliffs, NJ: Prentice Hall, 1994.
- [5] J.Fadavi Ardekani, "Booth encoded multiplier generator using optimize Wallace tree," IEEE Transactions on VLSI Systems, vol.1, pp.120-125, 1993.
- [6] A. Fayed and M. Bayoumi, "A merged multiplier-accumulator for high speed signal processing applications," Proc. ICASSP, vol. 3, pp. 3212–3215, 2002.
- [7] P. Zicari, S. Perri, P. Corsonello, and G. Cocorullo, "An optimized adder accumulator for high speed MACs," Proc. ASICON 2005, vol. 2, pp. 757–760, 2005.
- [8] Young-Ho Seo, Dong-Wook Kim, "A new VLSI Architecture of parallel multiplier Accumulator based on Radix 2 Modified Booth algorithm," IEEE transactions on very large scale integration (VLSI) systems, vol. 18, no.2, pp. 201-208, Feb.2010.
- [9] M.V. Sathish, M. Sailaja, "Vlsi Architecture Of Parallel Multiplier– Accumulator Based On Radix-2 Modified Booth Algorithm," International Journal of Electrical and Electronics Engineering (IJEEE), vol.1, Issue-1, 2011
- [10] P. Sasi Bala, S. Raghuvendra, "A new VLSI architecture of parallel multiplier – Accumulator based on Radix – 2 modified Booth algorithm," International Journal of Instrumentation, control and Automation, vol.1, no.2, pp.91-97, 2011.
- [11] Sankey Goel, R.K. Sharma, "Parallel MAC based on Radix 4 & Radix 8 Booth encodings," International Journals of Engineering & Technology, vol.3, no.8, pp.6692-6697, 2011.
- [12] K. Hima Bindu, K. Bala Souri, K. V. Ramana Rao, "MAC architecture Accumulator based on Booth encoding parallel multipliers,"International journals of soft computing and Engineering, vol.1, no.5, pp. 385–388, Nov.2011.

- [13] Rashmi Ranjan et al., "A New Vlsi Architecture Of Parallel Multiplier Based On Radix-8 Modified Booth Algorithm Using Vhdl," International Journal of Computer Science & Engineering Technology (IJCSET), Vol. 3 No. 4 April,2012
- [14] Adalanki Purna Ramesh, Dr. A. V. N. Tilak, Dr. A. M. Prasad, "Efficient implementation of 16 bit multiplier, Accumulator using Radix – 2 modified Booth algorithm and SPST adder using Verilog," International journal of VLSI design and communication systems, vol. 3, no.3, pp.107-118, Jun. 2012.
- [15] S. Jagadish, S. Venkatachary, "Design of parallel multiplier Accumulator based on Radix-8 modified Booth algorithm with SPST," International journal of Engineering Research and application, vol.2, no.5, pp.425-435, Sept.Oct. 2012.
- [16] B. Jyothirmai, M. Premalatha, "Design and implementation of parallel MAC by Booth algorithm," International Journals of scientific Research, vol.1, no.6, pp.78-79, Nov.2012.
- [17] Harilal, Durga Prasad, "High speed arithmetic Architecture of parallel multiplier Accumulator based on Radix-2 modified Booth algorithm," International journals of computational Engineering research, vol.2, no.8, pp.28-38, Dec.2012.
- [18] Naveen Kumar, Manu Bansal, Amandeep Kaur, "Speed Power and Area Efficient VLSI Architectures of Multiplier and Accumulator," International Journal of Scientific & Engineering Research, vol.4, Issue 1, Jan-2013
- [19] J. Y. Yahwanth Babu, P. Dinesh Kumar, "A new multiplier Accumulator architecture based on high accuracy modified Booth algorithm," International journal of advanced research in computer Engineering & Technology, vol.2,no.3,pp.1036-1040,Mar 2013.
- [20] Pratibhadevi Tapashetti, Dr. Rajkumar B Kulkarni, "Implementation of High Speed MAC VLSI Architectures, Based on High Radix Modified Booth Algorithm", International conference on advances in electronics, computer and mathematical sciences (ICIREMPS-2016), Sagar Institute of Research and Technology (SIRT), Bhopal, MP, India, 26 to 28 Feb. 2016.
- [21] Pratibhadevi Tapashetti, Dr. Rajkumar B Kulkarni, "Efficient And Compatible VLSI Architecture Of Parallel Mac Based On High Radix Modified Booth Algorithm", 3rd International Conference on IET-IEEE-Electrical, Electronics, Engineering Trends, Communication, Optimization and Sciences (EEECOS)-2016, SASI Institute of Technology and Engineering, Tadepalligudem, Andhra Pradesh, India, 2016/5.