IDS 570 project presentation

Post on 14-Apr-2017

Analysis of wine quality

Presented By: Aadhish Chopra, Abhilekh Das, Gopal Bhutada, Parichay Jain

Steps:

● Data Exploration
● Data Cleaning
● Examining Relationship
● Modeling and Prediction

Data Exploration

Dataset Source Link: https://archive.ics.uci.edu/ml/datasets/Wine+Quality

Predictors (Variables) in dataset: fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol

Output variable: quality
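As a rough illustration of how the predictors and the output variable could be separated once the data is loaded, here is a minimal Python sketch. It assumes the semicolon-delimited layout of the UCI wine quality CSV files; the two sample rows, the reduced column set, and the helper name `load_rows` are made up for illustration and are not taken from the presentation.

```python
import csv
import io

# Illustrative sample only: two made-up rows and a reduced set of columns,
# in a semicolon-delimited layout like the UCI wine quality CSV files.
SAMPLE = '''"fixed acidity";"volatile acidity";"alcohol";"quality"
7.4;0.70;9.4;5
7.8;0.88;9.8;6
'''

def load_rows(text):
    """Parse semicolon-delimited text into a list of dicts of floats."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    return [{name: float(value) for name, value in row.items()} for row in reader]

rows = load_rows(SAMPLE)

# Every column except "quality" is treated as a predictor.
predictors = [name for name in rows[0] if name != "quality"]
```

With the real file, the same function could be fed the file contents instead of `SAMPLE`.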

Understanding Data

Exploratory Analysis

Data Cleaning

The data is cleaned in three steps:

● Imputation of missing values
● Removal of outliers
● Scaling of all the quantitative variables
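The slides do not say which imputation method was used. As one common possibility, the sketch below fills missing entries of a column with the mean of the observed entries. The function name `impute_mean` and the sample pH values are hypothetical.

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

# Made-up pH column with one missing value (None).
ph = [3.0, None, 3.5, 3.25]
ph_filled = impute_mean(ph)
```

Median imputation would be an equally reasonable choice when a column is skewed.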

Removing Outliers

Figures: boxplot of the original data; boxplot after removing outliers
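The exact outlier rule is not given in the slides. Since boxplots are used, a plausible assumption is the usual whisker rule, which drops values more than 1.5 times the interquartile range (IQR) beyond the quartiles. The sketch below applies that assumed rule; `remove_outliers_iqr`, the multiplier `k`, and the sample values are illustrative.

```python
import statistics

def remove_outliers_iqr(values, k=1.5):
    """Keep only values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Made-up residual sugar values with one extreme point.
residual_sugar = [1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 15.5]
cleaned = remove_outliers_iqr(residual_sugar)
```

Here the extreme value 15.5 falls outside the upper fence and is dropped, while the other six values are kept.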

Scaling the Variables
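The slides do not state the scaling method. One common choice, assumed here, is standardization: subtract the column mean and divide by the sample standard deviation, so that each variable ends up with mean 0 and standard deviation 1. The name `standardize` and the sample values are hypothetical.

```python
import statistics

def standardize(values):
    """Center a column on its mean and divide by its sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / sd for v in values]

# Made-up alcohol values.
alcohol = [9.0, 10.0, 11.0, 12.0]
scaled = standardize(alcohol)
```

Scaling puts variables measured in different units (for example density and total sulfur dioxide) on a comparable footing.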

Examining Relationship

Correlation between the variables

● We look for relationships among the attributes, and between each attribute and our output variable, quality

● The correlation coefficient lies between -1 and +1

● The accompanying chart shows the correlation between the various attributes.
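For reference, the correlation described above can be computed as the Pearson correlation coefficient. The slides do not name the coefficient, so Pearson is an assumption here, and the helper `pearson` and the sample values are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)

# Made-up values: y rises exactly with x, z falls exactly with x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
z = [8.0, 6.0, 4.0, 2.0]
r_up = pearson(x, y)
r_down = pearson(x, z)
```

A value near +1 indicates that two attributes rise together, a value near -1 that one falls as the other rises, and a value near 0 that there is little linear relationship.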

Regression

Divide the data into training and test sets

Train a regression model on the training data

Based on the regression output, we identify the parameters that have a statistically significant effect on wine quality, rather than an apparent effect due to random chance

The model analysis compares various combinations of predictors and settles on the one with the minimum RSE (residual standard error) and the better adjusted R-squared value and F-statistic

Decide whether to accept or reject each hypothesis based on its p-value
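The steps above can be sketched in a simplified form. The version below uses a single predictor and synthetic data, whereas the presentation fits several predictors, so it only illustrates how the split, the fit, and the RSE, adjusted R-squared, and F-statistic fit together. The function names, the 80/20 split, the seeds, and the generated data are all assumptions; p-values are omitted because they need a t-distribution, which the standard library does not provide.

```python
import math
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_simple_ols(pairs):
    """Least-squares fit of y = intercept + slope * x, with summary statistics."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in pairs)
    tss = sum((y - mean_y) ** 2 for _, y in pairs)
    df = n - 2  # residual degrees of freedom with one predictor
    r2 = 1 - rss / tss
    return {
        "intercept": intercept,
        "slope": slope,
        "rse": math.sqrt(rss / df),           # residual standard error
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / df,
        "f_stat": (tss - rss) / (rss / df),   # one predictor, so numerator df is 1
    }

# Synthetic (alcohol, quality) pairs, generated only for this illustration.
rng = random.Random(42)
data = []
for _ in range(100):
    alcohol = rng.uniform(8.0, 14.0)
    quality = 2.0 + 0.35 * alcohol + rng.gauss(0.0, 0.5)
    data.append((alcohol, quality))

train, test = train_test_split(data)
fit = fit_simple_ols(train)
```

Comparing `fit["rse"]` and `fit["adj_r2"]` across candidate sets of predictors is the kind of model comparison the slide describes.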

Interpretation from Regression

Prediction

Using the model fitted on the training data, we predict the quality for the test data; that is, we apply our model to the test data
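A minimal sketch of this step, assuming a one-predictor linear model: apply the fitted coefficients to the held-out rows and summarize the error with the root mean squared error (RMSE). The coefficients, the test rows, and the choice of RMSE as the error measure are hypothetical and not taken from the presentation.

```python
import math

def predict(model, x):
    """Apply a fitted one-predictor linear model to a single value."""
    return model["intercept"] + model["slope"] * x

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical coefficients, standing in for a model fitted on training data.
model = {"intercept": 2.0, "slope": 0.35}

# Made-up held-out rows of (alcohol, quality).
test_rows = [(9.0, 5.0), (10.0, 6.0), (12.0, 6.0)]

predicted = [predict(model, x) for x, _ in test_rows]
error = rmse([y for _, y in test_rows], predicted)
```

A lower RMSE on the test data suggests the model generalizes better to wines it has not seen.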

Thank You!

Questions?