This repository contains a data analysis of the iris dataset using multiple machine learning approaches in Python and Jupyter Notebook. The iris dataset is a multivariate dataset introduced by Ronald Fisher in 1936. It consists of 150 samples from three species of Iris (setosa, virginica, and versicolor) with four measured features: sepal length, sepal width, petal length, and petal width. The dataset is commonly used for data mining, classification, clustering, and algorithm testing purposes.
- iris.csv: The dataset used for the analysis
- iris_data_analysis.ipynb: The Jupyter Notebook containing the analysis
- data_analysis.py: The Python script containing the analysis
The analysis covers the following machine learning approaches:
- Decision Trees
- Logistic Regression
- K-Nearest Neighbors
- Support Vector Machines
- Random Forests
- K-Means
- Hierarchical Clustering
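The approaches listed above can be tried side by side in a few lines of scikit-learn. The following is a minimal sketch, not the code from the notebook: it assumes the copy of the iris data bundled with scikit-learn (`load_iris`) instead of `iris.csv`, and the hyperparameters, the 5-fold cross-validation, and the adjusted Rand index used to score the clusterings are illustrative choices.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: use scikit-learn's bundled iris data rather than iris.csv.
X, y = load_iris(return_X_y=True)

# Supervised classifiers; scale-sensitive models get a StandardScaler step.
classifiers = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "K-Nearest Neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Mean accuracy over 5 stratified folds for each classifier.
cv_accuracy = {}
for name, model in classifiers.items():
    scores = cross_val_score(model, X, y, cv=5)
    cv_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy {cv_accuracy[name]:.3f}")

# Unsupervised methods; the species labels are used only to score the result.
clusterers = {
    "K-Means": KMeans(n_clusters=3, n_init=10, random_state=0),
    "Hierarchical Clustering": AgglomerativeClustering(n_clusters=3),
}

cluster_ari = {}
for name, model in clusterers.items():
    labels = model.fit_predict(X)
    cluster_ari[name] = adjusted_rand_score(y, labels)
    print(f"{name}: adjusted Rand index {cluster_ari[name]:.3f}")
```

Cross-validation is used here instead of a single train/test split because 150 samples is small enough that one split can give a noisy accuracy estimate.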
Libraries used:
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn
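As a rough illustration of how these libraries fit together for a first look at the data, the sketch below builds a DataFrame and computes per-species summaries. It assumes scikit-learn's bundled iris data and its column names (for example `petal length (cm)`); the columns in `iris.csv` may be named differently, and the `species` column is added here only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: load the bundled dataset as a pandas DataFrame.
iris = load_iris(as_frame=True)
df = iris.frame.copy()  # four feature columns plus an integer 'target' column

# Map the integer target codes (0, 1, 2) to species names.
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))

# Mean of each feature per species.
summary = df.groupby("species")[iris.feature_names].mean()
print(summary)

# Pearson correlation between petal length and petal width.
petal_corr = np.corrcoef(df["petal length (cm)"], df["petal width (cm)"])[0, 1]
print(f"petal length vs. width correlation: {petal_corr:.3f}")
```

From here, Matplotlib or Seaborn can plot the same DataFrame, for example as pairwise scatter plots colored by `species`.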
This project requires Python v3.6+ to run.
Install the dependencies:
pip install -r requirements.txt
- DataCamp - For the course Supervised Learning with scikit-learn
- Kaggle - For the dataset Iris Species
- UCI Machine Learning Repository - For the dataset Iris Data Set
This project is licensed under the MIT License. See the LICENSE.md file for details.