This project presents a comparative analysis of several widely used machine learning models for classification tasks. The goal is to evaluate the performance of each model using a breast cancer dataset and determine which one yields the highest accuracy and predictive reliability.
The following models were implemented and evaluated:
- K-Nearest Neighbors (KNN)
- Kernel Support Vector Machine (Kernel SVM)
- Logistic Regression
- Naive Bayes
- Support Vector Machine (SVM)
- Decision Tree
- Random Forest
Each model was evaluated using a confusion matrix and its accuracy score on the test data.
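As a rough illustration of this evaluation step, the sketch below fits one of the compared classifiers and reports both metrics with scikit-learn. It loads scikit-learn's built-in breast cancer dataset as a stand-in for the repository's `Data.csv`; the 75/25 split, `random_state`, and the choice of a Decision Tree here are illustrative assumptions, not values taken from the notebooks.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix, accuracy_score

# Stand-in data: scikit-learn's bundled breast cancer dataset (the project uses Data.csv).
X, y = load_breast_cancer(return_X_y=True)

# Hold out a test set; the 75/25 split and random_state are illustrative choices.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Fit one of the compared models; any other scikit-learn classifier can be swapped in.
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(X_train, y_train)

# Evaluate with the two metrics used in this project.
y_pred = classifier.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print(f"Accuracy: {accuracy_score(y_test, y_pred):.4f}")
```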
| Model | Accuracy Score |
|---|---|
| Decision Tree | 95.90% |
| Kernel SVM | 95.32% |
| K-Nearest Neighbors | 94.73% |
| Logistic Regression | 94.73% |
| Naive Bayes | 94.15% |
| Support Vector Machine | 94.15% |
| Random Forest | 93.56% |
- 🥇 Best Model: Decision Tree
- 🥈 Runner-Up: Kernel SVM
The dataset used for this project is a publicly available Breast Cancer Dataset, commonly used for classification problems.
- `Data.csv` – Dataset file
- `*.ipynb` – Jupyter notebooks for each individual model
- `Presentation - Machine Learning Model Comparison.pdf` – Summary presentation of the project
- `README.md` – This file
- `LICENSE` – MIT License
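For reference, a minimal sketch of loading the dataset file with pandas is shown below. The assumption that the class label sits in the last column (with all preceding columns as features) follows a common layout for this dataset but is not confirmed by this README and may differ from the notebooks.

```python
import pandas as pd

# Load the dataset shipped with the repository.
dataset = pd.read_csv("Data.csv")

# Assumed layout: every column except the last is a feature, the last is the class label.
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

print(X.shape, y.shape)
```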
- Clone this repository: `git clone https://github.com/nadhif-royal/ModelComparisonML.git`
- Open the Jupyter notebooks in your preferred IDE or environment.
- Run each notebook to view model training, evaluation, and comparison.
The Decision Tree model achieved the highest accuracy (95.90%), making it the best performer in this project, while Random Forest had the lowest (93.56%). Kernel SVM came close to matching the Decision Tree, making it a solid alternative.