IDS 570 project presentation

Date post: 14-Apr-2017
Upload: university-of-illinoischicago
Transcript
Page 1: IDS 570 project presentation

Analysis of wine quality

Presented By: Aadhish Chopra, Abhilekh Das, Gopal Bhutada, Parichay Jain

Page 2: IDS 570 project presentation

Steps:

● Data Exploration
● Data Cleaning
● Examining Relationship
● Modeling and Prediction

Page 3: IDS 570 project presentation

Data Exploration

Dataset Source Link: https://archive.ics.uci.edu/ml/datasets/Wine+Quality

Predictors (Variables) in dataset: fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol

Output variable: quality
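As a rough illustration of how the predictors and the output variable could be read in, here is a minimal Python sketch. It assumes the semicolon-delimited layout with a quoted header row used by the CSV files behind the link above. The function name load_wine and the three sample rows are hypothetical; the values are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Illustrative rows only: the values are invented, not copied from the dataset.
SAMPLE = """\
"fixed acidity";"volatile acidity";"citric acid";"residual sugar";"chlorides";"free sulfur dioxide";"total sulfur dioxide";"density";"pH";"sulphates";"alcohol";"quality"
7.0;0.60;0.10;2.0;0.080;12;40;0.9970;3.40;0.55;9.5;5
8.1;0.35;0.30;2.4;0.070;15;50;0.9965;3.30;0.70;10.8;6
6.5;0.45;0.20;1.8;0.060;20;60;0.9950;3.50;0.65;11.5;7
"""

def load_wine(text):
    """Parse semicolon-delimited wine data into a list of dicts of floats."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [{name: float(value) for name, value in row.items()} for row in reader]

rows = load_wine(SAMPLE)
# Every column except the output variable is treated as a predictor.
predictors = [name for name in rows[0] if name != "quality"]
print(len(rows), "rows,", len(predictors), "predictors")
```

To read a downloaded file instead, the same reader could be pointed at an open file handle in place of io.StringIO.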

Page 4: IDS 570 project presentation

Understanding Data

Page 5: IDS 570 project presentation

Understanding Data

Page 6: IDS 570 project presentation

Understanding Data

Page 7: IDS 570 project presentation

Exploratory Analysis

Page 8: IDS 570 project presentation

Data Cleaning

The data is cleaned in three steps:

Imputation of missing values

Removal of outliers

Scaling of all the Quantitative variables
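The slides do not say which imputation method was used. As one common possibility, the sketch below fills missing entries in a column with the median of the observed values. The function name impute_median, the use of None to mark a missing value, and the sample pH values are all assumptions made for illustration.

```python
from statistics import median

def impute_median(rows, column):
    """Return new rows where None in `column` is replaced by the column median."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]

# Made-up values; None marks a missing measurement.
rows = [{"pH": 3.2}, {"pH": None}, {"pH": 3.4}, {"pH": 3.6}]
cleaned = impute_median(rows, "pH")
print(cleaned)
```

The median is often preferred over the mean here because it is less sensitive to the outliers that are removed in the next step.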

Page 9: IDS 570 project presentation

Removing Outliers

Boxplot from Original Data

Boxplot after removing outliers
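The slides do not state the outlier rule. A plausible reading, given the boxplots, is the usual whisker convention: drop points more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below assumes that rule; the function name remove_outliers, the factor k, and the sample values are hypothetical.

```python
from statistics import quantiles

def remove_outliers(values, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Made-up residual sugar readings with one extreme value.
residual_sugar = [1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 15.5]
kept = remove_outliers(residual_sugar)
print(kept)
```

In a full dataset the same bounds would be used to drop whole rows, so that every column stays aligned.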

Page 10: IDS 570 project presentation

Scaling the Variables
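The scaling method is not named on the slide. One standard choice is z-score standardization, which shifts each quantitative variable to mean 0 and rescales it to standard deviation 1. The sketch below assumes that choice and uses the sample standard deviation; the function name standardize and the alcohol values are made up for illustration.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(v - mu) / sigma for v in values]

# Made-up alcohol percentages.
alcohol = [9.4, 9.8, 10.5, 11.2, 12.1]
scaled = standardize(alcohol)
print([round(z, 3) for z in scaled])
```

After scaling, variables measured in very different units (for example density and total sulfur dioxide) become directly comparable.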

Page 11: IDS 570 project presentation

Examining Relationship

Correlation between the variables

● We examine how the attributes relate to one another and to our output variable, quality.

● The correlation coefficient lies between -1 and +1.

● The accompanying chart shows the strength of correlation between each pair of attributes.
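To make the -1 to +1 range concrete, here is a minimal sketch of the Pearson correlation coefficient, which is assumed to be the measure behind the chart (the slides do not name it). The function name pearson and the sample values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Made-up values for one predictor and the output variable.
alcohol = [9.4, 9.8, 10.5, 11.2, 12.1]
quality = [5, 5, 6, 6, 7]
r = pearson(alcohol, quality)
print(round(r, 3))
```

A value near +1 means the two variables rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is little linear relationship.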

Page 12: IDS 570 project presentation

Regression

Divide the data into training and test sets

Fit a regression model on the training data

Based on the output of the regression analysis, we identify the parameters that have a statistically significant effect on wine quality, rather than an effect that could be due to random chance

We compare models built from various combinations of predictors and select the one with the lowest residual standard error (RSE) and the best adjusted R-squared value and F-statistic

Decide whether to reject or fail to reject each hypothesis based on its p-value
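The steps above can be sketched in simplified form. This example uses a single predictor (simple linear regression) rather than the full multi-predictor model from the presentation, and it computes only RSE and adjusted R-squared, not the F-statistic or p-values. The function names, the 70/30 split, the seed, and the (alcohol, quality) pairs are assumptions made for illustration.

```python
import random
from math import sqrt

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and split off a test portion."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - round(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]

def fit_simple(rows):
    """Least-squares fit of y = intercept + slope * x on (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    return my - slope * mx, slope

def fit_stats(rows, intercept, slope, p=1):
    """Return (RSE, adjusted R-squared) for a model with p predictors."""
    n = len(rows)
    my = sum(y for _, y in rows) / n
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in rows)
    tss = sum((y - my) ** 2 for _, y in rows)
    rse = sqrt(rss / (n - p - 1))
    r2 = 1 - rss / tss
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return rse, adj_r2

# Made-up (alcohol, quality) pairs.
data = [(9.0, 5), (9.4, 5), (9.8, 5), (10.2, 6), (10.6, 5),
        (11.0, 6), (11.4, 6), (11.8, 7), (12.2, 6), (12.6, 7)]
train, test = train_test_split(data)
intercept, slope = fit_simple(train)
rse, adj_r2 = fit_stats(train, intercept, slope)
print(len(train), len(test), round(rse, 3), round(adj_r2, 3))
```

A lower RSE and a higher adjusted R-squared both point to a better fit. Adjusted R-squared also penalizes extra predictors, which is why it is suited to comparing models of different sizes.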

Page 13: IDS 570 project presentation

Regression

Interpretation from Regression

Page 14: IDS 570 project presentation

Prediction

Using the model fitted on the training data, we predict quality for the test data; that is, we apply the model to the test set
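As a small illustration of applying a fitted model to held-out data, the sketch below predicts quality for a few test rows and summarizes the error with the root mean squared error (RMSE). RMSE is an assumed metric, since the slides do not say how predictions were scored. The coefficients, function names, and test rows are hypothetical stand-ins, not results from the presentation.

```python
from math import sqrt

def predict(intercept, slope, x):
    """Apply a fitted one-predictor linear model to a new observation."""
    return intercept + slope * x

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical coefficients standing in for a model fitted on the training data.
intercept, slope = 0.5, 0.5

# Made-up (alcohol, quality) test rows that the model has not seen.
test_rows = [(10.0, 5), (11.0, 6), (12.0, 7)]
predicted = [predict(intercept, slope, x) for x, _ in test_rows]
actual = [y for _, y in test_rows]
error = rmse(actual, predicted)
print(predicted, round(error, 3))
```

Because quality is recorded as a whole number, the predictions could also be rounded before being compared with the actual scores.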

Page 15: IDS 570 project presentation

Thank You!

Questions?

