Date posted: 23-Jan-2018
Category: Data & Analytics
Uploaded by: anubhav-dhiman
Data (#590): unlabeled features, NaNs
Names (#2): Class, Timestamp
Features (#590): V1 to V590
Class (#1): -1 or 1
Timestamp (#6), e.g. 19/07/2008 11:55:00: Year, Month, Day of Month, Day of Week, Hour, Minute
NAs: 4.54%
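A minimal sketch of splitting the timestamp into six features, in Python. It assumes the day/month/year layout of the sample value 19/07/2008 11:55:00. The function name, the dictionary keys, and the day-of-week numbering (1 = Monday) are illustrative choices, not taken from the slides.

```python
from datetime import datetime

def timestamp_features(stamp: str) -> dict:
    """Split a 'dd/mm/YYYY HH:MM:SS' timestamp into six calendar features."""
    t = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
    return {
        "year": t.year,
        "month": t.month,
        "day_of_month": t.day,
        "day_of_week": t.isoweekday(),  # assumption: 1 = Monday ... 7 = Sunday
        "hour": t.hour,
        "minute": t.minute,
    }

features = timestamp_features("19/07/2008 11:55:00")
```

Seconds are dropped, which matches the six features listed above.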
Today’s focus
Input data frame: Categorical #20, Continuous #454, Removed #116
Missing values per feature:
Clean: #52
Up to 1% missing: #324
2 - 3%: #46
17%: #20
46%: #4
51 - 91%: #28 (removed)
BACKUP
Imputation Method    % Error
Mean                 9.5%
kNN (+39%)           5.9%
Rpart (+30%)         4.1%
PMM (+48%)           2.1%
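Imputation by share of missing values (Continuous Variables / Categorical Variables)
Clean (#52): #48 continuous, #4 categorical
Up to 1% (#324): 5% trimmed mean for #308 continuous, mode for #16 categorical
2 - 3% (#46), 17% (#20), 46% (#4): kNN imputation*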
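A minimal Python sketch of the two simple fills named on this slide: the 5% trimmed mean for continuous variables and the mode for categorical ones. The function names are hypothetical, and missing values are represented as None. The trimming rule (drop the lowest and highest 5% of observed values, rounded down) is an assumption.

```python
from collections import Counter

def trimmed_mean(values, trim=0.05):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)  # number of values cut from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def impute_column(column, categorical=False):
    """Fill None entries with the mode (categorical) or 5% trimmed mean (continuous)."""
    observed = [v for v in column if v is not None]
    if categorical:
        fill = Counter(observed).most_common(1)[0][0]
    else:
        fill = trimmed_mean(observed, trim=0.05)
    return [fill if v is None else v for v in column]

# Twenty observed values with one outlier, plus one missing entry.
continuous = list(range(1, 20)) + [1000, None]
filled_continuous = impute_column(continuous)

categorical = ["a", "b", "a", None]
filled_categorical = impute_column(categorical, categorical=True)
```

Trimming keeps a single extreme value such as 1000 from pulling the fill value away from the bulk of the data.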
* kNN imputation: for each record, identify the missing features. For each missing feature, find the k nearest neighbors that have that feature. Impute the missing value by applying the imputation function to the k-length vector of values found from those neighbors. (Source: R package imputation v2.0.3 by Jeffrey Wong)
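The procedure described above can be sketched in plain Python. This is an illustration under stated assumptions, not the code of the R package. Missing values are None, distance is the root mean squared difference over features both records have observed, and the default imputation function is the mean. The names pair_distance, knn_impute, and combine are hypothetical.

```python
import math

def pair_distance(a, b):
    """Root mean squared difference over the features both rows have observed."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=3, combine=lambda vals: sum(vals) / len(vals)):
    """Return a copy of `rows` with None entries filled from the k nearest neighbours."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Neighbours must have feature j observed; distances use the original rows.
            candidates = [
                (pair_distance(row, other), other[j])
                for other in rows
                if other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] < math.inf]
            candidates.sort(key=lambda c: c[0])
            neighbours = [v for _, v in candidates[:k]]
            if neighbours:  # leave the gap if no comparable neighbour exists
                filled[i][j] = combine(neighbours)
    return filled

rows = [
    [1.0, 2.0, None],
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [9.0, 9.0, 100.0],
]
result = knn_impute(rows, k=2)
```

With k=2, the missing value in the first record is filled from the two closest records and the distant fourth record is ignored. On real data the features would normally be scaled first so that no single feature dominates the distance.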
46% NA: kNN impute (chart; recovered values: 10, 11566, 1545)