# CDMA BASED NETWORK ON CHIP WITH CROSSBAR SWITCHES

<sup>1</sup>M.Koteswara rao,<sup>2</sup>D.Sree Lakshmi <sup>1</sup>PG Scholar,<sup>2</sup>Assistant Professor, Dept. Of E.C.E <sup>1,2</sup>Narayana Engineering College, Nellore, Andhra Pradesh.

Abstract— On-chip interconnects are the performance bottleneck in modern system-on-chips. Code-division multiple access (CDMA) has been proposed to implement on-chip crossbars due to its fixed latency, reduced arbitration overhead, and higher bandwidth. In CDMA, medium sharing is enabled in the code space by assigning a limited number of N-chip length orthogonal spreading codes to the processing elements sharing the interconnect. In this paper, we advance overloaded CDMA interconnect (OCI) to enhance the capacity of CDMA network-on-chip (NoC) crossbars by increasing the number of usable spreading codes. Serial and parallel OCI architecture variants are presented to adhere to different area, delay, and power requirements. Compared with the conventional CDMA crossbar, on a Xilinx Artix-7 AC701 FPGA kit, the serial OCI crossbar achieves 100% higher bandwidth, 31% less resource utilization, and 45% power saving, while the parallel OCI crossbar achieves N times higher bandwidth compared with the serial OCI crossbar at the expense of increased area and power consumption.

## **I. INTRODUCTION:**

A system on chip (SoC) is a complex interconnection of different functional components. Its bus-based architecture creates a communication bottleneck in gigabit communication. There was therefore a need for a system that exhibits modularity and parallelism. The network on chip (NoC) has many such attractive properties and solves the communication bottleneck problem. It is based on the idea of interconnecting cores through an on-chip network.

Communication on a network on chip is carried out by means of routers, so implementing a better NoC requires an efficiently designed router. This router supports four parallel connections at the same time. It uses store-and-forward flow control and deterministic routing driven by an FSM controller, which improves the performance of the router. The switching mechanism used here is packet switching, which is generally used on networks on chip. In packet switching, the data are transferred as packets between cooperating routers, and an independent routing decision is taken at each router. The store-and-forward flow mechanism is preferred because it does not reserve channels and thus does not lead to idle physical channels. The arbiter uses a rotating priority scheme so that every channel gets a chance to transfer its data. In this router, both input and output buffering are used so that congestion can be avoided at both sides.

A router is a device that forwards data packets across computer networks. Routers perform the data "traffic direction" functions on the Internet. A router is a microprocessor-controlled device that is connected to two or more data lines from different networks. When a data packet comes in on one of the lines, the router reads the address information in the packet to determine its ultimate destination. Then, using information in its routing table, it directs the packet to the next network on its journey.

The router considered here has one input port, through which a packet enters, and three output ports, through which the packet is driven out. A packet contains three parts: the header, the data, and the frame check sequence. The packet width is 8 bits, and the length of the packet can be between 1 byte and 63 bytes. The packet header contains the destination address (DA) and length fields. The destination address of the packet is 8 bits wide. The switch drives the packet to the respective port based on this destination address. Each output port has a unique 8-bit port address. If the destination address of the packet matches the port address, the switch drives the packet to that output port. The length field is 8 bits wide and takes values from 0 to 63. The length is measured in bytes. The data are organized in bytes and can take any value. The frame check sequence contains the integrity check of the packet. It is calculated over the header and data.
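The packet format and address-matching rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `Packet` and `route` are hypothetical, and because the text does not specify how the frame check sequence is computed, a simple XOR parity over the header and data is assumed here.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    da: int          # 8-bit destination address
    payload: bytes   # data bytes; the length field allows 0..63

    def __post_init__(self):
        if not 0 <= self.da <= 0xFF:
            raise ValueError("DA must fit in 8 bits")
        if len(self.payload) > 63:
            raise ValueError("payload is limited to 63 bytes")

    def header(self) -> bytes:
        # Header fields: destination address, then length in bytes
        return bytes([self.da, len(self.payload)])

    def fcs(self) -> int:
        # ASSUMPTION: XOR parity over header and data (the source does not
        # name the actual check algorithm)
        value = 0
        for byte in self.header() + self.payload:
            value ^= byte
        return value


def route(packet: Packet, port_addresses: list[int]):
    """Return the index of the output port whose address equals DA, else None."""
    for index, address in enumerate(port_addresses):
        if address == packet.da:
            return index
    return None


if __name__ == "__main__":
    pkt = Packet(da=0x02, payload=b"hi")
    print(route(pkt, [0x01, 0x02, 0x03]), pkt.fcs())
```

A packet whose DA matches none of the three port addresses is simply not forwarded in this sketch; the text does not say how the router handles that case.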

A data packet is typically passed from router to router through the networks of the Internet until it reaches its destination computer. Routers also perform other tasks, such as translating the data transmission protocol of the packet to the appropriate protocol of the next network.

#### © 2019 JETIR June 2019, Volume 6, Issue 6

#### **II. LITERATURE REVIEW:**

Using CDMA as a medium access scheme in crossbar switches provides favorable qualities such as fixed transaction latency and low arbitration overhead. Nikolic et al. [16] have proposed a scalable CDMA-based peripheral bus to decrease the number of parallel transfer lines and point-to-point (PTP) buses and to avoid the overhead of TDMA arbiters. This approach reduces the pin count when used at the interface of multiple peripherals to multiple PEs since the data from the peripherals are added and transmitted on fewer lines. The increase in the transaction latency due to data spreading is acceptable because peripherals usually operate at lower frequencies than the master PEs. A master-slave bus wrapper has been presented in [17] and [18], where the data are bundled and spread using orthogonal CDMA codes to decrease the number of parallel transfer lines. The control signals are not encoded to facilitate interconnection to other TDMA buses. Another CDMA bus implementation has been compared with a TDMA split transaction bus in [11]. The results show that the CDMA bus outperforms the split transaction bus as the number of PEs increases since the CDMA bus avoids bus contention and queuing delays, which hinder the scalability of a TDMA bus. A multilevel 2-bit CDMA bus has been utilized in [19] as an input/output (I/O) reconfiguration scheme that also demonstrates a reduction in the bus contention over the TDMA bus. CDMA and TDMA have been combined in the CT-Bus, where data are multiplexed over both the time and code domains [12]. The CT-Bus shows that the communication overhead of CDMA is lower than that of TDMA, as the CDMA bus controller is required to assign only spreading codes, while the TDMA controller must perform arbitration every clock cycle. The CT-Bus performance surpasses its TDMA counterpart for heterogeneous traffic since it combines the TDMA bus scalability with the CDMA channel continuity.

A CDMA-based NoC has been compared with a PTP bidirectional ring-based NoC in [20], and the comparison shows that the CDMA NoC's fixed data transfer latency is equal to the best-case latency of the PTP NoC of the same channel width. The fixed data transfer latency of the CDMA NoC is attributed to concurrent interconnect sharing by the network nodes. A hierarchical CDMA star NoC router has been presented in [21] and [22]. The CDMA router is connected in a star-star topology and a star-mesh topology and compared with pure mesh and fat tree topologies. The CDMA star NoC requires fewer resources and lower routing complexity than its rivals. The maximum hop count of the CDMA star NoC router is lower than that of the compared topologies due to the concurrent transmission of packets through the router. The CDMA interconnect topology presented in [21] and [22] is made scalable either by doubling the number of chips in the Walsh code set to double the number of ports that can be connected to the router or by using more routers in a star or mesh fashion. The CDMA encoding and decoding operations are local to the router, and therefore, the same Walsh codes can be reused in each NoC router.

#### www.jetir.org (ISSN-2349-5162)

#### **III. OVERLOADED CDMA INTERCONNECT**

Fig. 1(a) illustrates the high-level architecture of the CDMA-based NoC router. The CDMA router has M transmit/receive ports. The main difference between the overloaded and classical CDMA routers is that M > N - 1 for the former due to channel overloading. Each PE is connected to two network interfaces (NIs): the transmit and receive NI modules. During packet transmission from a PE, the packet is divided into flits that are stored in the transmit NI first-in first-out (FIFO) buffer. The router arbiter then selects at most M winning flits from the heads of the NI FIFOs to be transmitted during the current transaction. The selected flits must all have distinct destination addresses to prevent conflicts, and a winner between two conflicting flits is selected according to the router's priority scheme. The employed priority scheme is a fixed winner-takes-all scheme: only one of the conflicting transmitters is given a spreading code and is acknowledged to start encoding. Once arbitration is done, the router assigns CDMA codes to each transmit and receive NI. NIs with empty FIFOs or conflicting destinations are assigned all-zero CDMA codes such that they do not contribute MAI to the CDMA channel sum. Afterward, flits from each NI are spread by the CDMA codes in the encoder module.
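The arbitration and code assignment step can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name `assign_codes` is hypothetical, the fixed priority is assumed to favor the lowest port index, and code index `k + 1` is assumed to be tied to receive port `k`, with index 0 standing for the all-zero code.

```python
def assign_codes(requests):
    """Assign spreading codes for one crossbar transaction.

    requests[i] is the destination port of the flit at the head of transmit
    NI i, or None when that FIFO is empty. The result holds one code index
    per transmit NI; index 0 denotes the all-zero code.
    """
    taken = set()
    codes = []
    for destination in requests:        # ASSUMPTION: lower index wins conflicts
        if destination is None or destination in taken:
            codes.append(0)             # idle or losing port: all-zero code
        else:
            taken.add(destination)
            codes.append(destination + 1)   # ASSUMPTION: code k+1 <-> receive port k
    return codes


if __name__ == "__main__":
    # Ports 0 and 2 both address receive port 2; port 1 has an empty FIFO.
    print(assign_codes([2, None, 2, 0]))
```

In this example port 0 wins the conflict over receive port 2, so port 2 of the transmit side is suspended until a later transaction.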

The data are spread into N chips, where N is the CDMA code length that equals the number of clock cycles in a single crossbar transaction. Spread data chips from all encoders are summed by the CDMA crossbar adder and the sum is sent out serially to all decoders. The encoding/decoding process lasts for N clock cycles synchronized via a counter. At each decoder, the assigned code is cross correlated with the received sum to decode the data from the summed chips. The decoded flits are stored in the receive NI FIFOs until they are read by the PEs. In this paper, we focus on the high-level architecture and implementation details of the overloaded CDMA crossbar represented by the gray block in Fig. 1(a).
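The spread, sum, and correlate sequence can be sketched as follows. The sketch assumes Walsh codes built by the Sylvester construction, XOR spreading, and an up-down accumulator in which a positive result decodes as "1"; the function names are illustrative and the code models behavior only, not the hardware.

```python
def walsh_codes(n):
    """Rows of the n x n Sylvester-Hadamard matrix in {0, 1} form (n a power of two)."""
    codes = [[0]]
    while len(codes) < n:
        codes = [r + r for r in codes] + [r + [1 - c for c in r] for r in codes]
    return codes


def encode(bit, code):
    # XOR spreading: one chip is produced per clock cycle
    return [bit ^ chip for chip in code]


def crossbar_sum(chip_streams):
    # Arithmetic sum of the chips from all encoders, cycle by cycle
    return [sum(chips) for chips in zip(*chip_streams)]


def decode(sums, code):
    # Up-down accumulator: add the channel sum when the code chip is 0,
    # subtract it when the chip is 1; the sign gives the data bit
    accumulator = 0
    for value, chip in zip(sums, code):
        accumulator += value if chip == 0 else -value
    return 1 if accumulator > 0 else 0


if __name__ == "__main__":
    n = 4
    codes = walsh_codes(n)[1:]            # skip the all-zero code
    data = [1, 0, 1]
    sums = crossbar_sum([encode(d, c) for d, c in zip(data, codes)])
    print(sums, [decode(sums, c) for c in codes])
```

Because distinct nonzero Walsh codes are mutually orthogonal, the contributions of the other ports cancel in each accumulator, leaving plus or minus N/2 depending on the transmitted bit.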



Fig. 1. (a) CDMA NoC router architecture. (b) Classical CDMA crossbar.

A store and forward flow control and a deterministic routing algorithm are employed in the OCI router. The routing algorithm lies at the network layer, which is a higher layer than the physical layer containing the crossbar switch. According to the OSI model design principles, each layer of the model exists as an independent layer. Theoretically, one can substitute one protocol for another at any given layer without affecting the operation of layers above or below. Thus, using the same flow control protocol and routing algorithm enables comparing the OCI-based router with SDMA- and TDMA-based routers.

#### A. OCI Crossbar High-Level Architecture

The main objective of this paper is to increase the number of ports sharing the ordinary CDMA crossbar presented in [17] while keeping the system complexity unchanged, using simple encoding circuitry and relying on the accumulator decoder with minimal changes. To achieve this goal, some modifications to the classical CDMA crossbar are advanced. Fig. 2 depicts the high-level architecture of the OCI crossbar for a single-bit interconnection. The same architecture is replicated for a multibit CDMA router. M TX-RX ports share the CDMA router, where spread data from the transmit ports are added using an arithmetic binary adder having M binary inputs and an m-bit output, where m = log2 M. The adder is implemented in both the reference and pipelined architectures. A controller block is used for code assignment and arbitration tasks. Each PE is interfaced to an encoder/decoder wrapper enabling data spreading/despreading.



Fig. 2. High-level architecture and building blocks of the OCI crossbar. (a) T-OCI/P-OCI hybrid encoder. (b) T-OCI nonorthogonal decoder. (c) P-OCI nonorthogonal decoder. (d) T-OCI pipelined crossbar tree adder, in which the adder is replicated N times for P-OCI crossbar. (e) P-OCI orthogonal decoder. (f) T-OCI orthogonal decoder

#### **B. OCI Code Design**

The Walsh-Hadamard spreading code family has a property that enables overloading of the CDMA interconnect. The difference between any two consecutive channel sums of data spread by the orthogonal spreading codes for an odd number of TX-RX pairs M is always even, regardless of the spread data. This property means that for the N - 1 TX-RX pairs using the Walsh orthogonal codes, one can encode an additional N - 1 data bits in the consecutive differences between the N chips composing the orthogonal code. Thus, exploiting this property enables adding 100% nonorthogonal spreading codes, which can double the capacity of the ordinary CDMA crossbar.
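The even-difference property can be checked exhaustively for small code lengths. The sketch below assumes Sylvester-constructed Walsh codes, XOR spreading, and that all N - 1 nonzero codes are active at once; `consecutive_differences_even` is a hypothetical helper name.

```python
from itertools import product


def walsh_codes(n):
    """Rows of the n x n Sylvester-Hadamard matrix in {0, 1} form (n a power of two)."""
    codes = [[0]]
    while len(codes) < n:
        codes = [r + r for r in codes] + [r + [1 - c for c in r] for r in codes]
    return codes


def channel_sums(data_bits, codes):
    # Per-cycle sum of the XOR-spread chips of every active port
    return [sum(d ^ c[j] for d, c in zip(data_bits, codes))
            for j in range(len(codes[0]))]


def consecutive_differences_even(n):
    """True if S(j+1) - S(j) is even for every data pattern of the N-1 ports."""
    codes = walsh_codes(n)[1:]            # the N-1 nonzero orthogonal codes
    for data in product((0, 1), repeat=n - 1):
        sums = channel_sums(data, codes)
        if any((sums[j + 1] - sums[j]) % 2 for j in range(n - 1)):
            return False
    return True


if __name__ == "__main__":
    print(consecutive_differences_even(8))
```

The reason is that any two chip positions of the nonzero Walsh codes differ in exactly N/2 of the codes, which is an even count for N of 4 or more, so the parity of the channel sum never changes between cycles.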

In this section, the code design methodology, mathematical foundations, and the decoding details of the OCI codes are provided. The notations used throughout this paper are listed in Table I. An AND gate encoder is used to encode data with nonorthogonal spreading codes, as shown in Fig. 2(a). Therefore, for a nonorthogonal encoder, if the data bit to transmit is one, a single spreading chip at a specific time slot in the spreading cycle is added to the channel sum, which causes the consecutive sum difference to deviate. The nonorthogonal codes imitate the TDMA signaling scheme, as each code is composed of a single chip of "1" sent in a specific time slot. The encoding/decoding scheme presented in this paper provides a novel approach that enables coexistence between CDMA and TDMA signals in the same shared medium. Therefore, the developed encoder is called TDMA overloaded on CDMA interconnect (T-OCI). Fig. 3 shows an encoding/decoding example of two T-OCI codes for a spreading code of length N = 8. An odd number of orthogonal codes must be used simultaneously to




Fig. 3. Encoding/decoding of three orthogonal codes and two T-OCI codes
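A behavioral sketch of the T-OCI idea follows: single-chip TDMA-like codes are overlaid on the orthogonal channel sum and recovered from parity deviations. Several points are assumptions rather than details taken from the text: slot 0 is assumed to carry no nonorthogonal chip, the nonorthogonal bits are read from the parity of each sum relative to the first one, and their contribution is subtracted before the orthogonal correlators run. The decoder circuits of Fig. 2 may be organized differently.

```python
def walsh_codes(n):
    """Rows of the n x n Sylvester-Hadamard matrix in {0, 1} form (n a power of two)."""
    codes = [[0]]
    while len(codes) < n:
        codes = [r + r for r in codes] + [r + [1 - c for c in r] for r in codes]
    return codes


def t_oci_channel(orth_bits, tdma_bits, n):
    """Channel sums for N-1 XOR-spread bits plus N-1 AND-spread single-chip bits."""
    orth_codes = walsh_codes(n)[1:]
    sums = []
    for j in range(n):
        total = sum(d ^ c[j] for d, c in zip(orth_bits, orth_codes))
        if j >= 1:
            # ASSUMPTION: nonorthogonal code j-1 has its only "1" chip in slot j,
            # so the AND encoder passes the data bit in that slot alone
            total += tdma_bits[j - 1] & 1
        sums.append(total)
    return sums


def decode_tdma(sums):
    # Orthogonal sums all share one parity, so a parity flip relative to
    # slot 0 reveals a nonorthogonal "1" chip
    reference = sums[0] & 1
    return [(s & 1) ^ reference for s in sums[1:]]


def decode_orthogonal(sums, tdma_bits, n):
    # ASSUMPTION: remove the recovered nonorthogonal chips, then correlate
    cleaned = [sums[0]] + [s - t for s, t in zip(sums[1:], tdma_bits)]
    decoded = []
    for code in walsh_codes(n)[1:]:
        acc = sum(s if chip == 0 else -s for s, chip in zip(cleaned, code))
        decoded.append(1 if acc > 0 else 0)
    return decoded


if __name__ == "__main__":
    n = 8
    orth = [1, 0, 1, 1, 0, 0, 1]
    tdma = [0, 1, 1, 0, 1, 0, 0]
    sums = t_oci_channel(orth, tdma, n)
    recovered_tdma = decode_tdma(sums)
    print(recovered_tdma, decode_orthogonal(sums, recovered_tdma, n))
```

With N = 8 this carries 14 bits per transaction instead of 7, which is the doubling of capacity that overloading aims for.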

## TABLE I

### **DEFINITION OF NOTATIONS**

| Notation   | Description                                    |
|------------|------------------------------------------------|
| N          | Orthogonal spreading code length               |
| M          | Number of interconnected ports                 |
| m          | Number of crossbar adder wires                 |
| $S$        | Sum of CDMA chips carried by the channel       |
| $d_C$      | Data bit encoded by an orthogonal CDMA code    |
| $d_T$      | Data bit encoded by a non-orthogonal TDMA code |
| $C_{o}(j)$ | The jth chip of the orthogonal CDMA code       |
| T(j)       | The jth chip of the non-orthogonal TDMA code   |
| $C_n$      | TDMA MAI code (non-orthogonal spread data)     |
| R(k)       | Output of the kth correlator decoder           |

## **C. OCI Crossbar Building Blocks**

Two variants are realized for each OCI crossbar, reference and pipelined architectures. The pipelined architecture is implemented to increase the crossbar operating frequency, and consequently, bandwidth by adding nonfunctional pipelining registers to reduce the crossbar critical path. The OCI crossbar shown in Fig. 2 is basically composed of three main building blocks: 1) the encoder wrappers; 2) the decoder wrappers; and 3) the crossbar adder blocks, which are described in the following.

1) Crossbar Controller: At the beginning of each crossbar transaction, the controller assigns spreading codes to different encoders. The assignment of orthogonal despreading codes to receive ports is fixed, i.e., it does not change between crossbar transactions. Therefore, for a transmit port to initiate communication with the receive port it addresses, its encoder must be assigned a spreading code that matches the destined decoder. If two different ports request to address the same decoder, the controller allows one access and suspends the other according to a predefined arbitration scheme. This code assignment scheme is called the receiver-based protocol [20]. In this paper, a static allocation scheme that allocates fixed spreading codes to all encoders is used. To interconnect a large number of PEs, a torus, star, or hybrid NoC topology can be realized. The crossbar controller issues handshake signals to the transmit and receive ports with matching spreading codes to enable the transmitter encoders and receiver decoders.

2) Hybrid Encoder: The encoder is hybrid: it can encode both orthogonal and nonorthogonal data. A transmitted data bit is XORed with the spreading code to produce orthogonal spread data or ANDed with it to produce nonorthogonal spread data. A multiplexer chooses between the orthogonal and nonorthogonal outputs according to the code type assigned to the encoder, as depicted in Fig. 2(a). The encoder is replicated N times for the P-OCI crossbar.
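A bit-level model of the hybrid encoder is small enough to write out. The function names are illustrative, and the sketch models only the logic (XOR path, AND path, multiplexer), not timing or the N-fold replication used in the P-OCI crossbar.

```python
def hybrid_encode_chip(data_bit, code_chip, orthogonal):
    """One encoder cycle: XOR path for orthogonal codes, AND path otherwise."""
    xor_out = data_bit ^ code_chip      # orthogonal spreading
    and_out = data_bit & code_chip      # nonorthogonal spreading
    return xor_out if orthogonal else and_out   # multiplexer on the code type


def hybrid_encode(data_bit, code, orthogonal):
    # Serial operation: one chip of the code is consumed per clock cycle
    return [hybrid_encode_chip(data_bit, chip, orthogonal) for chip in code]


if __name__ == "__main__":
    print(hybrid_encode(1, [0, 1, 1, 0], orthogonal=True))
    print(hybrid_encode(1, [0, 0, 1, 0], orthogonal=False))
```

Note that a nonorthogonal encoder sending a "0" contributes nothing to the channel, while an orthogonal encoder sending a "0" still transmits its code unchanged.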

3) Crossbar Adder: For a spreading code set of length N, the number of crossbar TX-RX ports is equal to M = 2(N - 1). In the T-OCI crossbar, sending a "1" chip to the adder is mutually exclusive between nonorthogonal transmit ports according to the T-OCI encoding scheme. This indicates that among the 2(N - 1) inputs to the adder, there are guaranteed (N - 2) zeros, while the maximum number of "1" chips is N. Therefore, a multiplexer is instantiated to select only a single input of the nonorthogonal TDMA encoded data bits and discard the remaining bits that are guaranteed to be "0." Thus, the adder has only N one-bit inputs, N - 1 from orthogonal encoders and 1 from the multiplexer, as shown in Fig. 2(d). The sum produced by the adder circuit needs (log2 N) wires. The number of needed stages of registers to pipeline the adder is (log2 N), as depicted in Fig. 2(d). N replicas of the crossbar adder are instantiated for the parallel encoding adopted in the P-OCI crossbar.
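The input reduction can be checked with a small model that compares a full 2(N - 1)-input adder with the multiplexed N-input version. The slot assignment used here (nonorthogonal encoder k active only in cycle k + 1) is an assumption made for illustration, as are the function names.

```python
def tdma_chips_at(cycle, tdma_bits):
    # Chip outputs of all N-1 nonorthogonal encoders in one cycle.
    # ASSUMPTION: encoder k has its single "1" chip in cycle k + 1.
    return [bit if cycle == k + 1 else 0 for k, bit in enumerate(tdma_bits)]


def adder_full(orth_chips, tdma_chips):
    # Reference: add all 2(N-1) one-bit inputs
    return sum(orth_chips) + sum(tdma_chips)


def adder_reduced(orth_chips, tdma_chips, cycle):
    # Multiplexer keeps only the input that may be nonzero in this cycle,
    # so the adder sees N-1 orthogonal chips plus one selected chip
    selected = tdma_chips[cycle - 1] if cycle >= 1 else 0
    return sum(orth_chips) + selected


if __name__ == "__main__":
    n = 8
    orth_chips = [1, 0, 1, 1, 0, 1, 1]
    tdma_bits = [1, 1, 0, 1, 0, 0, 1]
    for cycle in range(n):
        chips = tdma_chips_at(cycle, tdma_bits)
        print(cycle, adder_full(orth_chips, chips),
              adder_reduced(orth_chips, chips, cycle))
```

Both adders produce the same sum in every cycle, which is why the discarded inputs cost nothing in accuracy while shrinking the adder tree.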

4) Custom Decoder: There are four decoder types for different CDMA decoding techniques: the orthogonal T-OCI and P-OCI decoders and the overloaded T-OCI and P-OCI decoders. The orthogonal T-OCI decoder is an accumulator implementation of the correlator receiver. N - 1 accumulator decoders are instantiated in all CDMA crossbar types for orthogonal data despreading. Instead of implementing two different accumulators (the zero and one accumulators), an up-down accumulator is implemented, and the accumulated result is the difference between the two accumulators of the conventional CDMA decoder, as shown in Fig. 2(f). The accumulator adds or subtracts the crossbar sum values according to the despreading code chip and resets every N cycles. The sign bit of the accumulated value directly indicates the decoded data bit, where the positive sign is decoded as "1," while the negative sign is decoded as "0." The P-OCI orthogonal decoder shown in Fig. 2(e) differs from the T-OCI orthogonal decoder in receiving the adder sum values concurrently rather than sequentially; therefore, the accumulator loop is unrolled into a parallel adder.

#### **IV. SIMULATION RESULTS:**

Fig. 4. Simulation result of the existing system (CDMA1).

Fig. 5. Simulation result of the existing system (CDMA2).




## TABLE II

#### COMPARISON TABLE

| Parameter   | Existing system | Proposed system |
|-------------|-----------------|-----------------|
| No. of LUTs | 280             | 224             |
| Delay       | 64.546 ns       | 12.102 ns       |
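The relative savings implied by Table II can be worked out directly from the tabulated values. The helper name `percent_reduction` is illustrative, and the calculation uses only the four numbers in the table.

```python
def percent_reduction(existing, proposed):
    """Relative saving of the proposed system over the existing one, in percent."""
    return 100.0 * (existing - proposed) / existing


lut_saving = percent_reduction(280, 224)          # LUT counts from Table II
delay_saving = percent_reduction(64.546, 12.102)  # delays in ns from Table II

if __name__ == "__main__":
    print(f"LUT reduction:   {lut_saving:.1f}%")
    print(f"Delay reduction: {delay_saving:.1f}%")
```

The figures correspond to roughly one fifth fewer LUTs and a delay that is a little over five times shorter.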

### **V. CONCLUSION**

In this paper, we introduced the concept of overloaded CDMA crossbars as the physical layer enabler of NoC routers. In overloaded CDMA, the communication channel is overloaded with nonorthogonal codes to increase the channel capacity. Two crossbar architectures that leverage the overloaded CDMA concept, namely, T-OCI and P-OCI, are advanced to increase the CDMA crossbar capacity by 100% and $2N \times 100\%$, respectively, where N is the spreading code length. We exploited properties of the Walsh spreading code family employed in the classical CDMA crossbar to increase the number of router ports sharing the crossbar without altering the simple accumulator decoder architecture of the conventional CDMA crossbar. Generation procedures for nonorthogonal spreading codes are presented along with the reference and pipelined architectures for each crossbar variant.

#### REFERENCES

[1] K. Asanovic et al., "The landscape of parallel computing research: A view from Berkeley," Dept. EECS, Univ. California, Berkeley, CA, USA, Tech. Rep. UCB/EECS-2006-183, 2006.

[2] P. Bogdan, "Mathematical modeling and control of multifractal workloads for data-center-on-a-chip optimization," in Proc. 9th Int. Symp. Netw.-Chip, New York, NY, USA, 2015, pp. 21:1–21:8.

[3] Z. Qian, P. Bogdan, G. Wei, C.-Y. Tsui, and R. Marculescu, "A traffic-aware adaptive routing algorithm on a highly reconfigurable network-on-chip architecture," in Proc. 8th IEEE/ACM/IFIP Int. Conf. Hardw./Softw. Codesign, Syst. Synth., New York, NY, USA, Oct. 2012, pp. 161–170.

[4] Y. Xue and P. Bogdan, "User cooperation network coding approach for NoC performance improvement," in Proc. 9th Int. Symp. Netw.-Chip, New York, NY, USA, Sep. 2015, pp. 17:1–17:8.

[5] T. Majumder, X. Li, P. Bogdan, and P. Pande, "NoC-enabled multicore architectures for stochastic analysis of biomolecular reactions," in Proc. Design, Autom. Test Eur. Conf. Exhibit. (DATE), San Jose, CA, USA, Mar. 2015, pp. 1102–1107.

[6] S. J. Hollis, C. Jackson, P. Bogdan, and R. Marculescu, "Exploiting emergence in on-chip interconnects," IEEE Trans. Comput., vol. 63, no. 3, pp. 570–582, Mar. 2014.

[7] S. Kumar et al., "A network on chip architecture and design methodology," in Proc. IEEE Comput. Soc. Annu. Symp. (VLSI), Apr. 2002, pp. 105–112.

[8] T. Bjerregaard and S. Mahadevan, "A survey of research and practices of network-on-chip," ACM Comput. Surv., vol. 38, no. 1, 2006, Art. no. 1.

[9] Y. Xue, Z. Qian, G. Wei, P. Bogdan, C. Y. Tsui, and R. Marculescu, "An efficient network-on-chip (NoC) based multicore platform for hierarchical parallel genetic algorithms," in Proc. 8th IEEE/ACM Int. Symp. Netw.-Chip (NoCS), Sep. 2014, pp. 17–24.

[10] D. Kim, K. Lee, S.-J. Lee, and H.-J. Yoo, "A reconfigurable crossbar switch with adaptive bandwidth control for networks-on-chip," in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), May 2005, pp. 2369–2372.

[11] R. H. Bell, C. Y. Kang, L. John, and E. E. Swartzlander, "CDMA as a multiprocessor interconnect strategy," in Proc. Conf. Rec. 35th Asilomar Conf. Signals, Syst. Comput., vol. 2. Nov. 2001, pp. 1246–1250.

[12] B. C. C. Lai, P. Schaumont, and I. Verbauwhede, "CT-bus: A heterogeneous CDMA/TDMA bus for future SOC," in Proc. Conf. Rec. 38th Asilomar Conf. Signals, Syst. Comput., vol. 2. Nov. 2004, pp. 1868–1872.

[13] S. A. Hosseini, O. Javidbakht, P. Pad, and F. Marvasti, "A review on synchronous CDMA systems: Optimum overloaded codes, channel capacity, and power control," EURASIP J. Wireless Commun. Netw., vol. 1, pp. 1–22, Dec. 2011.


[14] K. E. Ahmed and M. M. Farag, "Overloaded CDMA bus topology for MPSoC interconnect," in Proc. Int. Conf. ReConFigurable Comput. FPGAs (ReConFig), Dec. 2014, pp. 1–7.

[15] K. E. Ahmed and M. M. Farag, "Enhanced overloaded CDMA interconnect (OCI) bus architecture for on-chip communication," in Proc. IEEE 23rd Annu. Symp. High-Perform. Interconnects (HOTI), Aug. 2015, pp. 78–87.

