Abstract
The increasing complexity of cyber threats has made them harder to detect accurately
with traditional Intrusion Detection Systems (IDS). Machine Learning (ML)-based IDS
have gained prominence due to their ability to analyze vast amounts of network
traffic, detect anomalies, and classify cyber threats with high accuracy. However, challenges
such as data imbalance, high-dimensional feature spaces, and high false positive rates remain.
This paper presents a comprehensive analysis of ML techniques for IDS, covering supervised,
unsupervised, and hybrid approaches. Feature selection and dimensionality reduction methods,
such as Principal Component Analysis (PCA) and clustering-based Stacking Feature Embedding,
are explored to enhance model efficiency. The study evaluates various ML algorithms, including
Decision Trees (DT), Random Forest (RF), and Extra Trees (ET), using benchmark datasets
such as UNSW-NB15, CIC-IDS-2017, and CIC-IDS-2018. Experimental results show that
deep learning models and ensemble techniques can achieve up to 99.99% accuracy, a
substantial improvement over traditional IDS methods. Additionally, the study discusses key
challenges, including adversarial attacks, scalability concerns, and interpretability issues. It
suggests future research directions, such as Explainable AI (XAI), federated learning, and
blockchain-based IDS solutions. The findings underscore the potential of ML-driven IDS in
enhancing cybersecurity resilience and mitigating emerging cyber threats.
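
The abstract names data imbalance and false positive rates as open challenges while also reporting headline accuracy. The following minimal sketch shows how these quantities relate on an imbalanced traffic sample. It is an illustration only, not the evaluation code used in the study: the `ConfusionCounts` type, the `ids_metrics` helper, and all counts are hypothetical and chosen purely for the example.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Hypothetical confusion-matrix counts for a binary IDS (attack vs. benign)."""
    tp: int  # attacks correctly flagged
    fp: int  # benign flows incorrectly flagged as attacks
    tn: int  # benign flows correctly passed
    fn: int  # attacks missed


def ids_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, false positive rate, recall, and precision."""
    total = c.tp + c.fp + c.tn + c.fn
    negatives = c.fp + c.tn      # all benign flows
    positives = c.tp + c.fn      # all attack flows
    flagged = c.tp + c.fp        # everything the IDS raised an alert on
    return {
        "accuracy": (c.tp + c.tn) / total if total else 0.0,
        "false_positive_rate": c.fp / negatives if negatives else 0.0,
        "recall": c.tp / positives if positives else 0.0,
        "precision": c.tp / flagged if flagged else 0.0,
    }


# Made-up example: 10,000 flows, of which only 100 (1%) are attacks.
counts = ConfusionCounts(tp=50, fp=10, tn=9890, fn=50)
metrics = ids_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these invented counts, accuracy is 99.4% even though half of the attacks are missed (recall of 0.5). This is why accuracy figures on imbalanced IDS datasets are usually read alongside recall, precision, and the false positive rate.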