UGC Approved Journal no 63975(19)

ISSN: 2349-5162 | ESTD Year : 2014
Volume 13 | Issue 1 | January 2026

7.95 impact factor, calculated by Google Scholar

Title

Bias Detection And Mitigation In Text Datasets Using Natural Language Processing: Comprehensive And Comparative Review

Abstract

Text datasets form the foundation of Natural Language Processing (NLP), powering applications from online search and recommendation engines to decision-critical systems in healthcare, law, and finance. However, these datasets often encode social, cultural, historical, and annotation-driven biases that models trained on them can propagate and amplify, producing unfair outcomes. This paper presents a comprehensive and comparative review of bias detection and mitigation in text datasets. It surveys the major forms of bias (representational, stereotypical, sampling, annotation, cultural, and algorithmic) and analyzes state-of-the-art detection methods, including statistical audits, embedding association tests (e.g., WEAT, SEAT), explainable AI (XAI), and behavioral probing. It further examines mitigation strategies across the NLP pipeline: pre-processing (data balancing, counterfactual augmentation), in-processing (adversarial training, fairness-aware objectives), and post-processing (threshold calibration, output rewriting). The review critiques benchmark datasets such as StereoSet, CrowS-Pairs, Jigsaw Toxicity, and BEADS, as well as domain-specific corpora. Persistent gaps include intersectionality, multilingual fairness, dataset documentation, annotation bias, and the need for dynamic, real-time monitoring of deployed systems. The paper concludes with future research directions emphasizing living benchmarks, participatory AI, explainable fairness, and continuous bias auditing.
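As an illustration of the embedding association tests named in the abstract, the following is a minimal sketch of the WEAT effect size as it is commonly defined: the difference in mean attribute association between two target word sets, divided by the standard deviation of the associations over both sets. It is not taken from the paper. The function names and the two-dimensional toy vectors are hypothetical stand-ins for real word embeddings, and the use of the sample standard deviation is an assumption that varies between implementations.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    sim_a = mean(cosine(w, a) for a in attrs_a)
    sim_b = mean(cosine(w, b) for b in attrs_b)
    return sim_a - sim_b


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Difference of mean associations of X and Y, scaled by the spread over X and Y."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    # Assumption: sample standard deviation (n - 1 denominator).
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy vectors, not real embeddings: X leans toward A, Y toward B.
targets_x = [[1.0, 0.1], [0.9, 0.2]]
targets_y = [[0.1, 1.0], [0.2, 0.9]]
attrs_a = [[1.0, 0.0]]
attrs_b = [[0.0, 1.0]]

d = weat_effect_size(targets_x, targets_y, attrs_a, attrs_b)
d_swapped = weat_effect_size(targets_y, targets_x, attrs_a, attrs_b)
```

A positive `d` indicates that the first target set is more strongly associated with attribute set A than the second target set is, and swapping the target sets flips the sign. A full test would also report a permutation-based significance value, which this sketch omits.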

Key Words

Bias Detection And Mitigation In Text Datasets Using Natural Language Processing: Comprehensive And Comparative Review

Cite This Article

"Bias Detection And Mitigation In Text Datasets Using Natural Language Processing: Comprehensive And Comparative Review", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN: 2349-5162, Vol. 13, Issue 1, pp. 146-156, January 2026. Available: http://www.jetir.org/papers/JETIRHG06015.pdf

Publication Details

Published Paper ID: JETIRHG06015
Registration ID: 573643
Published In: Volume 13 | Issue 1 | January 2026
DOI (Digital Object Identifier):
Page No: 146-156
Country: India
Area: Engineering
ISSN Number: 2349-5162
Publisher: IJ Publication

