On Computing Probabilities of Dismissal of 10b-5 Securities Class-Action Cases

Sumanta Singha, Steve Hillmer, Prakash P. Shenoy

School of Business, University of Kansas

Capitol Federal Hall, 1654 Naismith Drive, Lawrence, KS 66045 USA

October 21, 2016

S.Singha, S.Hillmer, PP. Shenoy (KU) Computing Probabilities October 21, 2016 1 / 26


Outline

1 Introduction

2 Objective and Related Works

3 Data

4 Feature Selection and Analysis

5 Results

6 Conclusions


Introduction

Securities class actions are lawsuits filed by investors or shareholders against corporations.

Some common allegations are fraudulent disclosure, misleading forecasts, violation of securities laws, insider trading, and financial restatements.

Some statistics:

Number of securities class-action filings since 1996: 4,100

1 in every 18 S&P 500 companies faces class-action litigation.

$87 billion has been dispensed in settlements.


Class-action Litigation Process


Objective

To identify features that are significant for dismissal or non-dismissal of 10b-5 securities class-action lawsuits.

To propose a model that predicts the probability of dismissal based on these features.


Related Works

Two strands of literature:

Baker et al. [2007, 2009]; Cox et al. [2006, 2008]; Johnson et al. [2007]: Make qualitative arguments about which features are important from a legal viewpoint.

Pritchard et al. [2005]; McShane et al. [2012]: Focus on predicting the probability of dismissal/settlement in a class-action case.

In this paper, we propose a hybrid model of Naïve Bayes (NB) and Logistic Regression (LR).


Assumptions of NB, LR, and Hybrid LR-NB method

NB Model:

Conditional independence of the predictors given the class variable.

Cannot incorporate non-parametric continuous features.

LR Model:

Log odds of the class variable is a linear function of the features.

Cannot incorporate features with missing values.

Hybrid Model:

Features in the LR part are independent of features in the NB part given the class.

Features in the NB part are conditionally independent given the class.

Can simultaneously handle missing values and continuous predictors.

Inference method: The prior of the naïve Bayes model is replaced by the posterior of the logistic regression.


Hybrid Model

\[
O(C = d \mid \mathbf{f}, \mathbf{e}) = e^{\beta_0 + \sum_{i=1}^{m} \beta_i f_i} \prod_{j=1}^{n} L(C = d, e_j) \tag{1}
\]
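A minimal sketch of how Equation (1) could be evaluated, assuming that each L(C = d, e_j) is the likelihood ratio P(e_j | dismissed) / P(e_j | not dismissed) and that a missing NB feature simply contributes a factor of 1. The function names, coefficients, and evidence values are hypothetical and are not taken from the slides.

```python
import math

def hybrid_odds(beta0, betas, f, likelihood_ratios):
    """Odds of dismissal: logistic-regression odds times NB likelihood ratios.

    A likelihood ratio of None stands for a missing NB feature and is skipped
    (an assumption made for this sketch).
    """
    lr_odds = math.exp(beta0 + sum(b * x for b, x in zip(betas, f)))
    observed = (lr for lr in likelihood_ratios if lr is not None)
    return lr_odds * math.prod(observed)

def odds_to_probability(odds):
    """Convert odds O into a probability O / (1 + O)."""
    return odds / (1.0 + odds)

# Hypothetical coefficients and evidence, chosen only for illustration.
odds = hybrid_odds(beta0=-0.5, betas=[0.8, -0.3], f=[1, 0],
                   likelihood_ratios=[1.5, None])
print(round(odds_to_probability(odds), 3))
```

The LR part plays the role of the prior odds, and each observed NB feature multiplies those odds by its likelihood ratio.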


Data Preparation: A New Notion of Dismissal

Total instances: 925 (# Dismissed: 414; # Not dismissed: 511)

Training to Test Ratio: 90:10

Source: SCAC, Stanford Law School; between 2002 and 2010.


Features

Nine important features were selected for the analysis:

1. GP = GAAP violations (1) or not (0)

2. SI = SEC investigation (1) or not (0)

3. II = lead plaintiff is an institutional investor (1) or not (0)

4. BR = defendant filed for bankruptcy (1) or not (0)

5. IS = insider selling (1) or not (0)

6. IC = lack of internal control (1) or not (0)

7. S11 = Section-11 violations (1) or not (0)

8. RF = company restated its financials (1) or not (0)

9. STD = sudden short-term (one to five working days) drop in share price


5-Step Procedure

1. Markov blanket estimation for the class ‘dismissed’ as the initial step for feature selection.

2. Supervised discretization of any continuous feature, necessary only for the NB model.

3. Searching for the best LR and best NB models from the set of features in Step 1, using 8-fold CV on the training set and RMSE as the performance metric.

4. Searching for the best hybrid model using a heuristic. This reduces the search space from 602 models to 30 models.

5. Estimating the out-of-sample error and confidence interval for the best hybrid model using 1000 non-parametric bootstrap re-samples of the test set.


What Is a Markov Blanket?

A variable’s Markov blanket contains its parents, its children, and the other parents of its children (co-parents). It can be shown that a node is conditionally independent of all other nodes in the network given its Markov blanket.

Figure: Markov Blanket
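The definition above can be sketched in a few lines of Python. The graph below is a hypothetical DAG invented for illustration, stored as a mapping from each node to the set of its parents; it is not the network from the slides.

```python
def markov_blanket(parents, node):
    """Return the parents, children, and co-parents of `node` in a DAG.

    `parents` maps every node to the set of its parent nodes.
    """
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = set()
    for child in children:
        co_parents |= parents[child]
    blanket = set(parents.get(node, set())) | children | co_parents
    blanket.discard(node)  # a node is not part of its own blanket
    return blanket

# Hypothetical DAG: A -> T, T -> C, B -> C, C -> D
dag = {"A": set(), "B": set(), "T": {"A"}, "C": {"T", "B"}, "D": {"C"}}
print(sorted(markov_blanket(dag, "T")))  # ['A', 'B', 'C']
```

Node D is a grandchild of T, so it falls outside the blanket: given A, B, and C, it carries no further information about T.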


Step 1: Markov Blanket Estimation

40 Markov blankets (MBs) are estimated using 4 constraint-based algorithms and 10 different conditional-independence (CI) tests.

The union of all 40 MBs is taken as the MB for the class ‘dismissed’, which is equivalent to removing only those features that every MB treats as irrelevant.

The Markov blanket contains 6 features: (i) GP, (ii) RF, (iii) IC, (iv) S11, (v) BR, and (vi) STD.

In the next step, the best LR and best NB models are searched for within this set of 6 features.
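A small sketch of the union rule, using three made-up blanket estimates in place of the 40 produced by the algorithm and CI-test combinations. The individual estimates are hypothetical; only the feature names come from the slides.

```python
def union_of_blankets(blankets):
    """Combine several Markov blanket estimates by taking their union."""
    result = set()
    for mb in blankets:
        result |= set(mb)
    return result

all_features = {"GP", "SI", "II", "BR", "IS", "IC", "S11", "RF", "STD"}

# Hypothetical estimates (3 here instead of 40), for illustration only.
estimates = [{"GP", "STD"}, {"GP", "RF", "IC"}, {"S11", "BR", "STD"}]

selected = union_of_blankets(estimates)
dropped = all_features - selected  # features that no estimate included
print(sorted(selected), sorted(dropped))
```

Taking the union is the conservative choice: a feature is discarded only if it is absent from every estimated blanket.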


Step 2: Discretization of Continuous Feature STD

Figure: Frequency of "Dismissed" and "Not Dismissed" cases by short-term drop (STD), with the breakpoint marked at 42.2%.

The ratio of dismissed to non-dismissed cases (the likelihood ratio) changes below and above this breakpoint.
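The slides do not say which supervised discretization method was used. As one possible illustration, the sketch below finds a single breakpoint by minimizing the weighted class entropy of the two resulting intervals, a common entropy-based criterion. The function names and data are hypothetical.

```python
import math

def entropy(labels):
    """Binary entropy (in bits) of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def best_breakpoint(values, labels):
    """Return the cut point that minimizes the weighted entropy of both sides."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut between identical values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if score < best_score:
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_cut

# Hypothetical share-price drops (%) and class labels (1 = dismissed).
drops = [5, 10, 15, 40, 50, 60]
dismissed = [1, 1, 1, 0, 0, 0]
print(best_breakpoint(drops, dismissed))  # 27.5
```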


Step 3: Selection of Best LR and Best NB Model

There are a total of (2^6 − 1) = 63 candidate models to search in order to find the best LR and best NB models.

We use 8-fold cross-validation on the training set to find the training-set error. We repeat this process 100 times and take the average CV error.

We propose to use RMSE as the performance metric, not classification error, because we aim to predict probabilities rather than classify.

The best model is the one that produces the lowest average training-set error. The best LR model contains 5 features and the best NB model contains 2 features.
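The count of 63 is the number of non-empty subsets of the 6 Markov blanket features, which can be checked by enumeration. In the sketch below, `select_best` and its scoring function are hypothetical stand-ins for the repeated cross-validated RMSE described above.

```python
from itertools import combinations

features = ["GP", "RF", "IC", "S11", "BR", "STD"]

def nonempty_subsets(items):
    """Yield every non-empty subset of `items` as a tuple."""
    for k in range(1, len(items) + 1):
        yield from combinations(items, k)

def select_best(candidates, score):
    """Return the candidate with the lowest score (e.g. average CV RMSE)."""
    return min(candidates, key=score)

candidates = list(nonempty_subsets(features))
print(len(candidates))  # 63
```

An exhaustive search is feasible here only because the Markov blanket step has already reduced the feature set to 6.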


Best Naïve Bayes Model

Figure: Best naïve Bayes model based on RMSE


Best Logistic Regression Model

Figure: Best logistic regression model based on RMSE


Step 3: Selection of Best LR and Best NB Model

Computation of RMSE:

Compute the probability of dismissal for all 832 cases in the training set.

Sort the instances by predicted probability and partition the training set into 8 bins.

For each bin, compute the average predicted probability and the actual probability (the observed proportion of dismissed cases).

The difference between the predicted average and the actual average is the prediction error for that bin.

Compute the SSE and RMSE over the bins.
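The steps above can be sketched as follows, assuming bins of (nearly) equal size and an RMSE that averages the squared errors over the number of bins. Both points are assumptions; the function name is hypothetical.

```python
import math

def binned_rmse(predicted, actual, n_bins=8):
    """RMSE between average predicted and observed dismissal rates per bin.

    `predicted` holds predicted probabilities; `actual` holds 0/1 outcomes
    (1 = dismissed). Cases are sorted by predicted probability and split
    into `n_bins` consecutive bins of nearly equal size.
    """
    n = len(predicted)
    if n < n_bins:
        raise ValueError("need at least one case per bin")
    pairs = sorted(zip(predicted, actual), key=lambda pair: pair[0])
    sse = 0.0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        avg_pred = sum(p for p, _ in chunk) / len(chunk)
        avg_actual = sum(y for _, y in chunk) / len(chunk)
        sse += (avg_pred - avg_actual) ** 2
    return math.sqrt(sse / n_bins)

# Hypothetical predictions and outcomes, two bins of four cases each.
pred = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.9, 0.9]
obs = [0, 0, 0, 1, 1, 1, 1, 0]
print(round(binned_rmse(pred, obs, n_bins=2), 4))  # 0.15
```

Because errors are measured on bin averages rather than on individual 0/1 outcomes, the metric rewards well-calibrated probabilities instead of correct class labels.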


Step 3: Selection of Best LR and Best NB Model

An example of computation of bin probability and RMSE:

        Avg. predicted prob.   Actual prob.   Sq. error
Bin 1   0.270                  0.288          0.0003
Bin 2   0.320                  0.355          0.0012
Bin 3   0.340                  0.336          0.0000
Bin 4   0.385                  0.365          0.0004
Bin 5   0.438                  0.442          0.0000
Bin 6   0.482                  0.432          0.0024
Bin 7   0.598                  0.644          0.0021
Bin 8   0.654                  0.625          0.0008

Sum of squared errors: 0.0072
RMSE: 0.0300
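As a quick check of the arithmetic, the reported RMSE follows from the squared-error column if the sum is divided by the number of bins before taking the square root. That divisor is inferred from the numbers rather than stated in the slides.

```python
import math

# Squared errors as reported for Bins 1 to 8 in the example above.
squared_errors = [0.0003, 0.0012, 0.0000, 0.0004, 0.0000, 0.0024, 0.0021, 0.0008]

sse = sum(squared_errors)
rmse = math.sqrt(sse / len(squared_errors))
print(f"SSE = {sse:.4f}, RMSE = {rmse:.4f}")  # SSE = 0.0072, RMSE = 0.0300
```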


Step 3: Selection of Best LR and Best NB Model

Figure: Graphical representation of bin probabilities


Step 4: Searching the Best Hybrid Model

There are a total of 602 possible candidate models to search.

The heuristic proceeds in two steps.

First, it considers only features that are present in either the best LR or the best NB model. We have 5 such features. The number of candidate models reduces from 602 to 180.

Second, it requires that all 5 features appear in the best hybrid model, in some arrangement between the LR and NB parts. The number of candidate models reduces from 180 to 30.

The best hybrid model has 4 features in the LR part and 1 feature in the NB part.
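The slides do not spell out how the candidates are counted, but the figures 602, 180, and 30 are consistent with assigning each feature to the LR part, the NB part, or neither, while requiring both parts to be non-empty. The sketch below enumerates the assignments under that assumption; `count_hybrids` is a hypothetical helper.

```python
from itertools import product

def count_hybrids(n_features, require_all=False):
    """Count hybrid models with a non-empty LR part and a non-empty NB part.

    Each feature goes to the LR part, the NB part, or is left out.
    With require_all=True, no feature may be left out.
    """
    count = 0
    for assignment in product(("LR", "NB", "out"), repeat=n_features):
        if "LR" not in assignment or "NB" not in assignment:
            continue
        if require_all and "out" in assignment:
            continue
        count += 1
    return count

print(count_hybrids(6))                    # 602
print(count_hybrids(5))                    # 180
print(count_hybrids(5, require_all=True))  # 30
```

In closed form these are 3^6 − 2·2^6 + 1 = 602, 3^5 − 2·2^5 + 1 = 180, and 2^5 − 2 = 30.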


Best Hybrid Model

Figure: Best hybrid model based on RMSE


Step 5: Finding Test Set Error of Best Hybrid Model

1000 re-samples of the same size as the test set are generated using non-parametric bootstrapping.

The RMSE is computed for each bootstrap re-sample.

The average test-set error and its confidence interval are computed from these values.
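A minimal sketch of this procedure, assuming a percentile confidence interval (the slides do not say how the interval was formed). The function name and the `metric` argument are hypothetical; in the setting above, `metric` would compute the RMSE of the hybrid model on a re-sampled test set.

```python
import random
import statistics

def bootstrap_metric(cases, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Non-parametric bootstrap of `metric` over `cases`.

    Returns (mean, standard error, percentile confidence interval).
    """
    rng = random.Random(seed)
    n = len(cases)
    values = []
    for _ in range(n_resamples):
        # Draw n cases with replacement from the original set.
        resample = [cases[rng.randrange(n)] for _ in range(n)]
        values.append(metric(resample))
    values.sort()
    lower = values[int((alpha / 2) * n_resamples)]
    upper = values[int((1 - alpha / 2) * n_resamples) - 1]
    return statistics.mean(values), statistics.stdev(values), (lower, upper)

# Hypothetical usage: bootstrap the mean of twenty 0/1 outcomes.
outcomes = [0, 1] * 10
mean, std_error, ci = bootstrap_metric(outcomes, statistics.mean)
print(mean, std_error, ci)
```

A fixed seed keeps the re-samples reproducible.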


Results

Table: Model Selection Results

Method               Predictors                  Avg. RMSE   Std. Dev.
Naïve Bayes          GP, STD                     0.0488      0.0012
Logistic regression  GP, IC, STD, BR, S11        0.0436      0.0010
Hybrid               LR part: GP, IC, BR, S11;   0.0412*     0.0011
                     NB part: STD

* Significant in a paired t-test at the 5% significance level.

Table: Test Set Errors

Model              RMSE     Bootstrap Std. Error   Bootstrap CI
Best hybrid model  0.0930   0.0444                 [0.0110, 0.1843]


Comparing Results with McShane et al. [2012]

Method         # Predictors   Training Set Error   Test Set Error
LR Model       18             6.01%                11.17%
Hybrid LR-NB   5              4.12%                9.30%

McShane et al. use all 18 features to predict both dismissal and settlement. We do not know which features affect dismissal and which affect settlement.


Conclusions

The hybrid model combines the best aspects of logistic regression and naïve Bayes.

Retains the simplicity (small number of parameters) of LR and NB.

The method for learning parameters remains the same as for LR/NB.

Easy to make inferences.

For the dataset on 10b-5 class-action cases, the hybrid model performs better than pure LR and pure NB.

Features for predicting dismissal are (i) GP, (ii) IC, (iii) BR, (iv) S11, and (v) STD.

Our hybrid model performs better than the LR model proposed by McShane et al. [2012].
