Abstract
Water is a nearly universal solvent: it dissolves a wide range of compounds, both polar and nonpolar, even if only at extremely low concentrations. Such dissolved contaminants are often invisible and tasteless, yet they can pose health risks to consumers. A thorough understanding of water quality is therefore crucial for informed decisions on protection and management. Horton introduced the concept of the Water Quality Index (WQI)[1], which provides a single numerical representation of water quality at a specific location. Environmental scientists, water resource managers, and policymakers widely use this tool to communicate water quality information to the public. Assessing water quality relies on physical and chemical parameters chosen according to the water's intended use, and an acceptable value must be established for each parameter. Water that fails to meet these standards must be treated before use. This project applies machine learning techniques, namely Logistic Regression, Decision Tree, Random Forest, Naïve Bayes, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and AdaBoost, to the assessment of water quality. Using a dataset of samples from various water bodies with the parameters Trihalomethanes, pH, Solids, Chloramines, Sulphate, Hardness, Conductivity, Organic Carbon, and Turbidity, the study predicts water potability with reasonable accuracy.
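To illustrate how an index of this kind condenses several parameters into one number, the following is a minimal sketch of a generic weighted-arithmetic index. It is not necessarily Horton's exact formulation [1]. The function names, the ideal values, the permissible limits, the weights, and the sample readings are all hypothetical and chosen only for illustration.

```python
# Sketch of a generic weighted-arithmetic water quality index.
# All ideal values, limits, weights, and readings are illustrative
# assumptions, not values taken from this study or from any standard.

def sub_index(value, ideal, limit):
    """Rate one parameter: 0 at the ideal value, 100 at the permissible limit."""
    return 100.0 * abs(value - ideal) / abs(limit - ideal)


def water_quality_index(sample, spec):
    """Weighted mean of the per-parameter ratings (lower means cleaner water).

    sample: {parameter name: measured value}
    spec:   {parameter name: (ideal value, permissible limit, weight)}
    """
    total_weight = sum(weight for _, _, weight in spec.values())
    weighted_sum = sum(
        weight * sub_index(sample[name], ideal, limit)
        for name, (ideal, limit, weight) in spec.items()
    )
    return weighted_sum / total_weight


# Hypothetical specification: (ideal, permissible limit, weight).
spec = {
    "pH": (7.0, 8.5, 3),
    "Turbidity": (0.0, 5.0, 2),
}

# Hypothetical measurements for a single sample.
sample = {"pH": 7.75, "Turbidity": 2.5}

wqi = water_quality_index(sample, spec)
print(f"WQI = {wqi:.1f}")
```

In this form, each parameter is first rated by how far it sits between its ideal value and its permissible limit, and the ratings are then averaged by weight, so a parameter judged more important for the intended use moves the index more.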