Wine Chromatic Profile Classification

Docker
Machine Learning
Logistic Regression
pipeline
Published

December 15, 2024

About

In this project, we built a classification model that predicts the type of wine (red or white) from its physicochemical properties, such as acidity, sulfates, and citric acid. We chose logistic regression for its simplicity and effectiveness in binary classification tasks. The classifier performed well on unseen data, achieving a test accuracy of 0.99: out of 1,950 instances, only 15 were misclassified. This suggests the model is reliable and could be useful for practical wine categorization and quality assessment.
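As a quick sanity check, the reported accuracy can be recomputed from the two counts given above. This is only a minimal sketch: the variable names are illustrative, and it assumes the 1,950 instances are the full test set.

```python
# Counts quoted in the summary above (assumed to describe the whole test set).
n_test = 1950          # instances the model was evaluated on
n_misclassified = 15   # predictions that did not match the true wine type

n_correct = n_test - n_misclassified
accuracy = n_correct / n_test
error_rate = n_misclassified / n_test

print(f"correct predictions: {n_correct}")   # 1935
print(f"accuracy: {accuracy:.4f}")           # 0.9923
print(f"error rate: {error_rate:.4f}")       # 0.0077
```

The result, about 0.9923, rounds to the 0.99 quoted above.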

The data set used in this project was created by P. Cortez, A. Cerdeira, Fernando Almeida, Telmo Matos, and J. Reis (2009) as part of a publication in Decision Support Systems, and is available from the UCI Machine Learning Repository here.

Important note:

The main focus of this project was to build a reproducible data science pipeline using Docker and a Makefile. Please check out the GitHub repo (linked above) for details on how to run the Docker container.

Also, check out my article on building intuition for composing a Dockerfile.

Report

The final report can be found here.

Contributors

Daria Khon, Farhan Bin Faisal, Adrian Leung, Zhiwei Zhang