UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014
Volume 13 | Issue 3 | March 2026


Published in: Volume 12, Issue 11, November-2025
eISSN: 2349-5162

UGC and ISSN approved 7.95 impact factor UGC Approved Journal no 63975

7.95 impact factor calculated by Google scholar

Published Paper ID: JETIR2511308
Registration ID: 571810
Page Number: d66-d72


Title

MAXIMIZING THE INPUT REUSE FOR A DEPTHWISE CONVOLUTION USING BIRD(BIDIRECTIONAL INPUT REUSE DATAFLOW)

Abstract

Depthwise convolution is widely used in lightweight CNNs (e.g., MobileNet, EfficientNet) because it sharply reduces multiply-accumulate (MAC) counts by decoupling spatial from cross-channel processing. However, naively mapping depthwise kernels onto conventional systolic arrays yields poor processing-element (PE) utilization: each channel’s spatial kernel is applied independently, so many PEs sit idle while a single channel streams through the array. The Bi-Directional Input Reuse Dataflow (BiRD) mitigates this inefficiency by enabling both vertical and horizontal reuse of input activations along the systolic chain. BiRD reduces redundant off-chip/on-chip transfers, increases arithmetic intensity, and substantially raises average PE utilization. In this work, we extend BiRD to support 3×3 depthwise kernels while retaining the original five-PE systolic chain developed for 2×2 operations. Rather than increasing the PE count, the design time-multiplexes the five-PE strip by dynamically classifying PEs as active or inactive and using local registers to preserve partial sums across cycles. To accelerate the arithmetic path without significantly enlarging the datapath footprint, the MAC macro integrates a Radix-4 Booth multiplier with local accumulation. The complete design was implemented in synthesizable Verilog and validated in Vivado: RTL simulation and post-synthesis functional checks confirm correct convolution outputs for 3×3 depthwise kernels, and synthesis reports show only a small area overhead relative to the 2×2 baseline. Overall, the architecture attains high on-chip data reuse and improved PE utilization for larger kernels, making it an attractive, resource-efficient choice for edge-AI accelerators that require compact, high-throughput depthwise convolution support.
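
As a rough illustration of the arithmetic described in the abstract, the following Python sketch models a 3×3 depthwise convolution whose MAC step uses radix-4 Booth recoding with a local accumulator. It is only a functional reference of the kind that could be used to check simulation outputs. It does not model the five-PE chain, the active/inactive PE scheduling, or the BiRD reuse pattern, and it is not the authors' Verilog. The function names, the 8-bit weight width, the choice of the weight as the Booth-recoded operand, and the stride-1, no-padding setup are all assumptions made for the example.

```python
def booth_radix4_multiply(x, y, bits=8):
    """Multiply x by a signed `bits`-wide y using radix-4 Booth recoding.

    The operand width (8 bits) is an assumption for this sketch.
    """
    assert bits % 2 == 0, "radix-4 recoding consumes two bits per step"
    assert -(1 << (bits - 1)) <= y < (1 << (bits - 1)), "y out of range"
    # Two's-complement bits of y with the implicit y[-1] = 0 appended below bit 0.
    y_ext = (y & ((1 << bits) - 1)) << 1
    product = 0
    for i in range(bits // 2):
        triplet = (y_ext >> (2 * i)) & 0b111   # bits y[2i+1], y[2i], y[2i-1]
        low = triplet & 1
        mid = (triplet >> 1) & 1
        high = (triplet >> 2) & 1
        digit = low + mid - 2 * high           # Booth digit in {-2, -1, 0, 1, 2}
        product += (digit * x) << (2 * i)      # partial product weighted by 4**i
    return product


def depthwise_conv3x3(inputs, kernels):
    """Reference depthwise convolution (stride 1, no padding).

    inputs:  list of C channels, each an H x W list of ints
    kernels: list of C kernels, each a 3 x 3 list of signed 8-bit ints
    Returns C output channels of size (H-2) x (W-2).
    """
    outputs = []
    for channel, kernel in zip(inputs, kernels):  # each channel uses its own kernel
        height, width = len(channel), len(channel[0])
        out = [[0] * (width - 2) for _ in range(height - 2)]
        for r in range(height - 2):
            for c in range(width - 2):
                acc = 0  # local accumulator, standing in for a PE's partial-sum register
                for kr in range(3):
                    for kc in range(3):
                        acc += booth_radix4_multiply(channel[r + kr][c + kc],
                                                     kernel[kr][kc])
                out[r][c] = acc
        outputs.append(out)
    return outputs


if __name__ == "__main__":
    feature_map = [[[1, 2, 3, 4],
                    [5, 6, 7, 8],
                    [9, 10, 11, 12],
                    [13, 14, 15, 16]]]
    edge_kernel = [[[1, 0, -1],
                    [1, 0, -1],
                    [1, 0, -1]]]
    print(depthwise_conv3x3(feature_map, edge_kernel))  # [[[-6, -6], [-6, -6]]]
```

Because each channel is processed with its own kernel and no cross-channel summation, the inner loops show why a single channel offers little parallel work to a wide array, which is the utilization problem the abstract attributes to naive mappings.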

Key Words

Depthwise Convolution, BiRD Dataflow, Systolic Array, Radix-4 Booth Multiplier, Multiply-Accumulate (MAC), Verilog HDL, FPGA Acceleration, Low-Power Edge Computing.

Cite This Article

"MAXIMIZING THE INPUT REUSE FOR A DEPTHWISE CONVOLUTION USING BIRD(BIDIRECTIONAL INPUT REUSE DATAFLOW)", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 12, Issue 11, page no. d66-d72, November-2025, Available: http://www.jetir.org/papers/JETIR2511308.pdf



Publication Details

Published Paper ID: JETIR2511308
Registration ID: 571810
Published In: Volume 12 | Issue 11 | Year November-2025
DOI (Digital Object Identifier):
Page No: d66-d72
Country: DR B R A KONASEEMA, ANDHRA PRADESH, India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

