
Disguise Adversarial Networks for Imbalanced Click-through Rate Prediction

Derek Zhao, James Xue, Chaitanya Dara

Sai Pavan Kumar Unnam, Daniel Gomez

Industry Mentors: Prasad Chalasani, Aravind Sadagopan

Faculty Mentor: Eleni Drinea

I. THE PROBLEM OF IMBALANCED DATA

A common issue in using machine learning to predict ad conversions in click-through rate datasets is the imbalanced classification problem: binary classifiers struggle to train effectively due to the lack of sufficient exposure to positive (minority) class samples. We explore the effectiveness of a novel neural architecture, the Disguise Adversarial Network (DAN) [1], a synthetic oversampling technique that transforms negative samples into positive samples, thus rebalancing the dataset.

[1] Deng, Yue, Shen, Yilin & Jin, Hongxia, ‘Disguise Adversarial Networks for Click-through Rate Prediction’, in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017.

III. MIXED RESULTS

The DAN does not appear to offer benefits in tasks where the data is linearly separable or where base models already predict with high recall. To test it under harsher imbalance, we further skew MediaMath’s training data from 11% positive to 1% positive while leaving the validation and test sets unchanged. Under these conditions, the DAN approximates the performance of models trained on the original dataset: recall recovers from 0.3768 without the disguise network to 0.9041 with it.
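The skewing step can be sketched as follows. This is a minimal illustration that assumes the skew is produced by randomly dropping positive training rows while keeping every negative; the poster does not say how the 1% split was built, and `skew_positive_rate` and the toy arrays are hypothetical names and data.

```python
import numpy as np

def skew_positive_rate(X, y, target_rate, seed=0):
    """Randomly drop positive rows until positives make up about `target_rate`."""
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    # Solve n_pos / (n_pos + n_neg) = target_rate for n_pos, keeping all negatives.
    n_pos = int(round(target_rate * len(neg_idx) / (1.0 - target_rate)))
    n_pos = min(n_pos, len(pos_idx))
    keep_pos = rng.choice(pos_idx, size=n_pos, replace=False)
    keep = np.sort(np.concatenate([neg_idx, keep_pos]))
    return X[keep], y[keep]

# Toy stand-in for a training split that is roughly 11% positive.
rng = np.random.default_rng(1)
y_train = (rng.random(100_000) < 0.11).astype(int)
X_train = rng.normal(size=(100_000, 5))

# Only the training split is skewed; validation and test data would be left alone.
X_skewed, y_skewed = skew_positive_rate(X_train, y_train, target_rate=0.01)
```

Keeping all negatives means the skewed split differs from the original only in how many positive examples the classifier sees.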

Data Science Capstone Project


[Figure: heatmap of scaled transformation magnitudes. Axes: Features, Samples. Color scale: 0 to 1.]

The rightmost figure shows a heatmap of scaled transformation magnitudes across 400 random samples of negative data. Certain features are noticeably more relevant for performing successful transformations than others, which suggests that the disguise network can serve as a tool for inferring sample-specific feature importance.
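One way such a heatmap matrix could be computed is sketched below. It assumes the magnitude is the absolute per-feature change between a negative sample and its disguised version, min-max scaled over the whole matrix; the poster does not state its exact scaling. The function name and the stand-in "disguise" (random edits concentrated on two features) are hypothetical.

```python
import numpy as np

def transformation_magnitudes(x_neg, x_disguised):
    """Absolute per-feature change, min-max scaled to [0, 1] over the whole matrix."""
    delta = np.abs(x_disguised - x_neg)
    span = delta.max() - delta.min()
    return (delta - delta.min()) / span if span > 0 else np.zeros_like(delta)

rng = np.random.default_rng(0)
x_neg = rng.normal(size=(400, 6))  # 400 negative samples, 6 toy features

# Stand-in for a trained disguise network: it mostly edits features 1 and 4.
edit_strength = np.array([0.05, 1.0, 0.05, 0.05, 0.6, 0.05])
x_disguised = x_neg + edit_strength * rng.normal(size=x_neg.shape)

heat = transformation_magnitudes(x_neg, x_disguised)  # rows: samples, columns: features
feature_rank = np.argsort(heat.mean(axis=0))[::-1]    # most-edited features first
```

Each row of `heat` is a sample-specific importance profile, and averaging over rows gives a rough global ranking of features. The matrix could be drawn with a plotting routine such as matplotlib's `imshow`.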

Positive class       Disguise    Discriminator   Accuracy   AUROC    Precision   Recall
proportion (train)   network
0.11                 No          Logistic        0.9173     0.9844   0.5795      0.9789
0.11                 No          64 32           0.9173     0.9843   0.5800      0.9754
0.11                 256 x 4     64 32           0.9168     0.9844   0.5794      0.9653
0.5                  No          Logistic        0.9155     0.9650   0.5732      0.9901
0.5                  No          64 32           0.9155     0.9668   0.5728      0.9933
0.01                 No          64 32           0.9073     0.9809   0.6568      0.3768
0.01                 256 x 4     64 32           0.9181     0.9838   0.5899      0.9041
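The table shows accuracy staying near 0.91 even when recall collapses to 0.3768. The following sketch shows how that can happen, using entirely made-up labels and predictions rather than the project's data; `binary_metrics` is a hypothetical helper.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, and recall from binary labels and predictions."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)    # positives correctly flagged
    fp = np.sum(~y_true & y_pred)   # negatives wrongly flagged
    fn = np.sum(y_true & ~y_pred)   # positives missed
    tn = np.sum(~y_true & ~y_pred)  # negatives correctly ignored
    return {
        "accuracy": (tp + tn) / y_true.size,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Toy evaluation set: 11 positives among 100 samples.
y_true = np.array([1] * 11 + [0] * 89)
# A timid classifier: it catches 4 positives and raises 2 false alarms.
y_pred = np.array([1] * 4 + [0] * 7 + [1] * 2 + [0] * 87)

m = binary_metrics(y_true, y_pred)
# accuracy = 0.91, precision = 4/6 (about 0.667), recall = 4/11 (about 0.364)
```

Because negatives dominate, missing most positives barely moves accuracy, which is why recall is the more informative column for comparing the 1% rows.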

II. VISUAL INTUITIONS

The disguise network attempts to learn a transformation on negative class data that satisfies two properties: 1) negative samples are transformed to look like positive samples and 2) the transformation is not too drastic. The hyperparameter 𝜆 balances these two competing interests, with higher values encouraging the disguise network to learn an identity transformation and lower values affording the network greater flexibility at the cost of reduced disguise diversity. This effect can be seen on the left using MediaMath’s advertising data projected onto two dimensions via PCA.
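The trade-off controlled by 𝜆 can be illustrated with a toy objective. The sketch below assumes the loss is an adversarial log-loss term plus 𝜆 times the mean squared distance between each negative sample and its disguise. This is one plausible form, not necessarily the exact loss used in this project or in [1]. The fixed logistic `toy_discriminator` and the constant shift are hypothetical stand-ins for trained networks.

```python
import numpy as np

def disguise_loss(x_neg, x_disguised, d_prob_positive, lam):
    """Hypothetical disguise objective: fool the discriminator, but stay near the input."""
    eps = 1e-12
    # Adversarial term: small when disguised samples are scored as positive.
    adversarial = -np.mean(np.log(d_prob_positive + eps))
    # Distortion term: mean squared L2 distance between a sample and its disguise.
    distortion = np.mean(np.sum((x_disguised - x_neg) ** 2, axis=1))
    return adversarial + lam * distortion

def toy_discriminator(x):
    """Fixed logistic scorer; only the first feature carries signal in this toy setup."""
    return 1.0 / (1.0 + np.exp(-3.0 * x[:, 0]))

rng = np.random.default_rng(0)
x_neg = rng.normal(loc=-1.0, size=(200, 2))  # negatives sit on the low side of feature 0
shift = np.array([3.0, 0.0])                 # candidate disguise: push feature 0 upward

results = {}
for lam in (0.01, 1.0):
    identity = disguise_loss(x_neg, x_neg, toy_discriminator(x_neg), lam)
    shifted = disguise_loss(x_neg, x_neg + shift, toy_discriminator(x_neg + shift), lam)
    results[lam] = (identity, shifted)
    print(f"lambda={lam}: identity loss={identity:.3f}, shifted loss={shifted:.3f}")
```

With the small 𝜆 the shifted disguise scores a lower loss than leaving the samples untouched, while with the large 𝜆 the distortion penalty dominates and the identity transformation wins, matching the behavior described above.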
