Date post: | 14-Apr-2017 |
Upload: | university-of-illinoischicago |
Analysis of Wine Quality
Presented By: Aadhish Chopra, Abhilekh Das, Gopal Bhutada, Parichay Jain
Steps:
● Data Exploration
● Data Cleaning
● Examining Relationship
● Modeling and Prediction
Data Exploration
Dataset Source Link: https://archive.ics.uci.edu/ml/datasets/Wine+Quality
Predictors (Variables) in dataset: fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol
Output variable: quality
Understanding Data
Exploratory Analysis
Data Cleaning
The data is cleaned in three steps:
● Imputation of missing values
● Removal of outliers
● Scaling of all the quantitative variables
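The slides do not say which imputation method was used. As a minimal sketch, assuming median imputation on a single column, the first cleaning step could look like the following. The function name `impute_median` and the pH readings are made up for illustration.

```python
from statistics import median


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Illustrative pH readings with two gaps (made-up numbers, not from the dataset).
ph = [3.51, None, 3.26, 3.16, None, 3.30]
ph_filled = impute_median(ph)
```

The median is used here because it is less sensitive to extreme values than the mean; mean imputation would be an equally plausible choice.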
Removing Outliers
[Figures: boxplot from the original data; boxplot after removing outliers]
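The rule used to flag outliers is not stated in the slides. One common choice that matches what a boxplot draws is the 1.5 x IQR rule, sketched below for a single column. The function name `remove_outliers_iqr`, the multiplier `k`, and the sample values are assumptions for illustration.

```python
from statistics import quantiles


def remove_outliers_iqr(values, k=1.5):
    """Keep only values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Illustrative residual sugar values with one extreme reading (made-up numbers).
residual_sugar = [1.9, 2.6, 2.3, 1.9, 1.8, 2.0, 2.2, 15.5]
cleaned = remove_outliers_iqr(residual_sugar)
```

On a full dataset the same bounds would be used to drop whole rows, so that all predictors and the quality score stay aligned.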
Scaling the Variables
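The scaling method is not specified in the slides. A minimal sketch, assuming z-score standardization (subtract the mean, divide by the sample standard deviation), is shown below. The function name `standardize` and the alcohol values are illustrative.

```python
from statistics import mean, stdev


def standardize(values):
    """Rescale values to mean 0 and sample standard deviation 1."""
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


# Illustrative alcohol percentages (made-up numbers).
alcohol = [9.4, 9.8, 10.5, 11.2, 12.0, 12.8]
alcohol_scaled = standardize(alcohol)
```

Scaling puts variables measured in different units, such as density and total sulfur dioxide, on a comparable footing before modeling.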
Examining Relationship
Correlation between the variables
● We look for relationships among the attributes, and between each attribute and the output variable, quality
● The correlation coefficient lies between -1 and +1
● The accompanying chart shows the strength of correlation between the attributes
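The correlation measure described above can be sketched as a Pearson correlation coefficient, assuming that is the statistic behind the chart (the slides do not name it). The function name `pearson` and the sample values are illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Illustrative alcohol and quality values (made-up numbers).
alcohol = [9.4, 9.8, 10.5, 11.2, 12.0, 12.8]
quality = [5, 5, 6, 6, 7, 7]
r = pearson(alcohol, quality)
```

A value near +1 means the two variables rise together, a value near -1 means one falls as the other rises, and a value near 0 means no linear relationship.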
Regression
Divide the data into training and test sets
Train a regression model on the training data
From the regression output, identify the predictors that have a statistically significant effect on wine quality, that is, an effect unlikely to be due to random chance
Compare models built from various combinations of predictors and select the one with the lowest residual standard error (RSE) and a higher adjusted R-squared and F-statistic
Decide whether to reject the null hypothesis for each predictor based on its p-value
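The model-comparison criteria above can be made concrete with a small sketch. It assumes a one-predictor linear model (quality explained by alcohol) rather than the full multi-predictor model in the slides, and computes RSE, R-squared, adjusted R-squared, and the F-statistic from their standard definitions. The function names and the data are illustrative; p-values are omitted because they need the t and F distributions, which the Python standard library does not provide.

```python
from math import sqrt


def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_statistics(x, y, intercept, slope, p=1):
    """Summary statistics for a fitted model with p predictors."""
    n = len(x)
    mean_y = sum(y) / n
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    tss = sum((b - mean_y) ** 2 for b in y)
    dof = n - p - 1  # residual degrees of freedom
    r2 = 1 - rss / tss
    return {
        "rse": sqrt(rss / dof),
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / dof,
        "f_stat": ((tss - rss) / p) / (rss / dof),
    }


# Illustrative training data (made-up numbers).
alcohol = [9.4, 9.8, 10.5, 11.2, 12.0, 12.8, 9.5, 11.0]
quality = [5, 5, 6, 6, 7, 7, 5, 6]
intercept, slope = fit_simple_ols(alcohol, quality)
summary = fit_statistics(alcohol, quality, intercept, slope)
```

When comparing candidate models, adjusted R-squared is the fairer yardstick because plain R-squared never decreases when another predictor is added.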
Interpretation from Regression
Prediction
Using the model fitted on the training data, we predict wine quality for the test data; that is, we apply the model to the test set
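The split-fit-predict workflow can be sketched as follows, again assuming a single predictor (alcohol) for brevity. The 75/25 split, the fixed seed, the helper names, and the data are all assumptions for illustration; the slides do not state the split ratio or the error metric, so root mean squared error (RMSE) is used here as one reasonable choice.

```python
import random
from math import sqrt


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and return (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(rows):
    """Least-squares fit of quality = intercept + slope * alcohol."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Illustrative (alcohol, quality) pairs (made-up numbers).
rows = [(9.4, 5), (9.8, 5), (10.5, 6), (11.2, 6),
        (12.0, 7), (12.8, 7), (9.5, 5), (11.0, 6)]
train, test = train_test_split(rows)
intercept, slope = fit_line(train)           # fit on training data only
predictions = [intercept + slope * x for x, _ in test]
test_rmse = rmse([y for _, y in test], predictions)
```

Because the model never sees the test rows during fitting, the test RMSE gives an honest estimate of how well it would predict quality for new wines.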
Thank You!
Questions?