Risk and Financial Portfolio Analytics: A Technical Introduction
Oleksandr Romanko, Ph.D.
Senior Research Analyst, Risk Analytics – Business Analytics, IBM
Adjunct Professor, University of Toronto
Toronto SMAC Meetup, January 15, 2015
© 2015 IBM Corporation 2
Please note:
IBM Risk Analytics statements regarding its plans, directions, and intent are subject to
change or withdrawal without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general product
direction and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment,
promise, or legal obligation to deliver any material, code or functionality. Information about
potential future products may not be incorporated into any contract. The development,
release, and timing of any future features or functionality described for our products
remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks
in a controlled environment. The actual throughput or performance that any user will
experience will vary depending upon many factors, including considerations such as the
amount of multiprogramming in the user's job stream, the I/O configuration, the storage
configuration, and the workload processed. Therefore, no assurance can be given that an
individual user will achieve results similar to those stated here.
About me
Dr. Oleksandr Romanko
Senior Research Analyst, Quantitative Research at Risk Analytics, Business Analytics, IBM, with the company since 2010
Ph.D. in Computer Science from McMaster University
Author of over 10 papers and reports
Adjunct professor at University of Toronto and lecturer at McMaster University
Research areas:
business analytics, operational research, optimization, finance
portfolio optimization, multi-objective optimization
market and credit risk modeling and optimization
numerical methods for risk management
design of numerical algorithms and their software implementation
Making the world work better – pioneering the science
[Figure: images labelled 1969, 1973, 1981 and 2008]
IBM Centennial: 100 Years of Innovation
Analytics Jobs
Created by: Dennis Buttera
Data science
Business Analytics
What is analytics?
Analytics is the scientific process of deriving insights from data in order to make decisions
• Descriptive Analytics: What has happened?
• Predictive Analytics: What will happen?
• Prescriptive Analytics: What should we do?
[Diagram: Data → Insight → Action, linked by Analyze and Decide; axis label: Business Value]
History of analytics
Movies
Applications of big data analytics
• Homeland Security
• Finance
• Smarter Healthcare
• Multi-channel sales
• Telecom
• Manufacturing
• Traffic Control
• Trading Analytics
• Fraud and Risk
• Log Analysis
• Search Quality
• Retail: Churn, NBO
Cloud
Bluemix
www.bluemix.net
Applied Statistics
What kind of data are we dealing with?
Types of data
• Quantitative
• Categorical (ordered, unordered)
Data collection
• Independent observations (one observation per subject)
• Dependent observations (repeated observation of the same subject, relationships
within groups, relationships over time or space)
Type of data drives the direction of your analysis
• How to plot
• How to summarize
• How to draw inferences and conclusions
• How to issue predictions
Quantitative data
Examples: financial return, temperature, age, income
Quick check: “Does it make sense to calculate an average?”
Appropriate summary statistics:
– Mean and Median
– Standard Deviation
– Percentiles
More advanced predictive methods: Regression, Time Series Analysis, …
Plot your data!
Summarizing quantitative data
One-number summaries
– Mean
Average, obtained by summing all observations and dividing by the number of obs.
– Median
The center value, below and above which you will find 50% of the observations.
Summarizing your data with one number may not tell the whole story:
[Figure: three distributions, labelled Median = 19.8, Median = 19.8 and Median = 10.5]
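The point that one number may hide the shape of the data can be checked with a small Matlab sketch. The two samples below are hypothetical (they are not the data behind the plots on the slide); they share the same median but differ in mean and spread.

```matlab
% Two hypothetical samples with the same median but different shapes
a = [18 19 19.8 21 22];          % tightly clustered around the centre
b = [2 5 19.8 60 150];           % widely spread, with large values on the right

medA  = median(a);  medB  = median(b);   % both medians are 19.8
meanA = mean(a);    meanB = mean(b);     % the large values pull meanB upwards
stdA  = std(a);     stdB  = std(b);      % the spread differs as well

fprintf('Sample a: median %.1f, mean %.2f, std %.2f\n', medA, meanA, stdA);
fprintf('Sample b: median %.1f, mean %.2f, std %.2f\n', medB, meanB, stdB);
```

Reporting the median together with the mean and a measure of spread makes the difference between the two samples visible.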
Flaw of averages
“Plans based on average assumptions are wrong on average”
Average depth 3 ft
Standard deviation
“Most observations fall within ±2 standard deviations of the mean.”
If the data is normally distributed, about 95% of observations fall within ±2 standard deviations of the mean.
Example: with Standard Deviation = 4.2, ~95% of observations lie between 11.4 and 28.2 (that is, 19.8 ± 2 × 4.2)
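The ±2 standard deviation rule can be checked by simulation. The sketch below assumes Normal data with mean 19.8 and standard deviation 4.2 (the values implied by the interval 11.4 to 28.2); the sample size is an arbitrary choice.

```matlab
% Empirical check of the +/- 2 standard deviation rule for Normal data
mu = 19.8;  sigma = 4.2;          % values implied by the interval 11.4 to 28.2
N = 100000;                       % number of simulated observations (arbitrary)
x = mu + sigma * randn(N, 1);     % Normal sample using base Matlab only

lower = mu - 2*sigma;             % 11.4
upper = mu + 2*sigma;             % 28.2
fracInside = mean(x >= lower & x <= upper);

fprintf('Share of observations in [%.1f, %.1f]: %.4f\n', lower, upper, fracInside);
```

For an exactly Normal distribution the share is about 95.45%, so the simulated value should land close to that.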
Descriptive statistics - example
Random sample of 5000 customers of a credit card company

                  Amount spent on primary   Debt to income
                  card last month           ratio (x100)
N (valid)         5000                      5000
N (missing)       0                         0
Mean              1683.7340                 9.9578
Median            1690.0670                 8.8000
Std. Deviation    210.26680                 6.42317
Minimum           0.00                      0.00
Maximum           2482.72                   43.10
Percentiles
Generalizations of the median (50th percentile).
The pth percentile is the data point below which p percent of the observations fall.
Often used to compare a single observation to a general population.
Examples:
– Standardized test scores
If you scored in the 93rd percentile, your score was higher than that of 93% of test
takers.
– Finance and risk management
If your portfolio value-at-risk 95% is $10M, your portfolio loss will not exceed $10M
with probability 95%.
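A percentile such as the 95% value-at-risk can be read off a sorted sample. The sketch below uses hypothetical, Normally distributed losses (in $M) and the nearest-rank rule; other percentile conventions interpolate between neighbouring observations and give slightly different values.

```matlab
% Empirical percentile via sorting (nearest-rank rule)
losses = 5 + 3 * randn(10000, 1);       % hypothetical portfolio losses in $M

p = 95;                                 % percentile level
sortedLosses = sort(losses);            % ascending order
k = ceil(p/100 * numel(sortedLosses));  % rank of the p-th percentile
VaR95 = sortedLosses(k);                % loss not exceeded in about 95% of scenarios

shareBelow = mean(losses <= VaR95);     % should be close to 0.95
fprintf('95%% VaR: $%.2fM (share of losses at or below it: %.3f)\n', VaR95, shareBelow);
```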
Percentiles - example
Percentiles can be another way of describing how spread out data values are.
Example: 5-Number Summary
Minimum – 25th percentile – Median (50th percentile) – 75th percentile – Maximum

                  Amount spent on primary   Debt to income
                  card last month           ratio (x100)
Minimum           0.00                      0.00
25th percentile   1567.4658                 5.1250
50th percentile   1690.0670                 8.8000
75th percentile   1814.5430                 13.5000
Maximum           2482.72                   43.10
Distributions: Normal distribution
Distributions
Estimate of the probability distribution of global mean temperature resulting
from a doubling of CO2 relative to its pre-industrial value, made from
100000 simulations
Simulation Modeling
Sums of random variables
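For any random variable \( X \) and a constant \( a \): \( E[aX] = a\,E[X] \)
Expectation of the sum of two random variables is equal to the sum of
expectations: \( E[X + Y] = E[X] + E[Y] \)
and, therefore, \( E[aX + bY] = a\,E[X] + b\,E[Y] \)
Example: expected return of a portfolio with proportions \( x_i \) in assets with returns \( r_i \): \( E[r_p] = \sum_i x_i\,E[r_i] \)
For the variance: \( \mathrm{Var}(aX + bY) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y) \)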
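The expectation and variance rules for a weighted sum can be verified numerically. The sketch below uses hypothetical constants and two correlated random variables built from independent Normal draws; none of the numbers come from the slides.

```matlab
% Monte Carlo check of E[aX+bY] and Var(aX+bY) for correlated X and Y
N = 200000;                            % number of simulated pairs (arbitrary)
a = 0.5;  b = 0.5;                     % hypothetical constants, e.g. portfolio weights
z1 = randn(N, 1);  z2 = randn(N, 1);   % independent standard Normal draws
X = 0.08 + 0.15 * z1;                  % hypothetical random variable
Y = 0.04 + 0.07 * (0.6*z1 + 0.8*z2);   % correlated with X (0.6^2 + 0.8^2 = 1)

S = a*X + b*Y;                         % the weighted sum
C = cov([X Y]);                        % 2x2 sample covariance matrix

meanFormula = a*mean(X) + b*mean(Y);
varFormula  = a^2*C(1,1) + b^2*C(2,2) + 2*a*b*C(1,2);

fprintf('Mean:     simulated %.5f, formula %.5f\n', mean(S), meanFormula);
fprintf('Variance: simulated %.6f, formula %.6f\n', var(S), varFormula);
```

Because both sides are computed from the same sample, they agree up to rounding error; the covariance term is what keeps the variance from being a simple weighted sum.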
Sums of random variables
How to compute the probability distribution of the sum of random variables?
We cannot add PDFs or PMFs
The formula involves non-trivial integration and is known as convolution; for independent \( X \) and \( Y \): \( f_{X+Y}(z) = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx \)
Use simulation to evaluate such complex integrals
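A discrete case shows the idea in a few lines. The sketch below, an illustration not taken from the slides, compares the exact PMF of the sum of two independent fair dice (obtained with Matlab's `conv`) with relative frequencies from simulation.

```matlab
% PMF of the sum of two independent fair dice: exact convolution vs simulation
pDie = ones(1, 6) / 6;              % PMF of one die on outcomes 1..6
pSum = conv(pDie, pDie);            % PMF of the sum on outcomes 2..12 (11 values)
outcomes = 2:12;

N = 100000;                         % number of simulated rolls (arbitrary)
rolls = randi(6, N, 2);             % two dice per row, values 1..6
simSum = sum(rolls, 2);
pSim = zeros(1, 11);
for k = 1:11
    pSim(k) = mean(simSum == outcomes(k));   % simulated relative frequency
end

disp([outcomes; pSum; pSim]');      % columns: outcome, exact, simulated
```

For continuous distributions the convolution integral rarely has such a convenient form, which is where simulation becomes the practical route.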
Simulation modeling – example 1
We want to invest $1000 in the US stock market for 1 year: \( V_0 = 1000 \)
Invest into the S&P 500 market index (index fund)
Value of investment at the end of year 1: \( V_1 = (1 + r_{0,1})\, V_0 \)
Market return over the time period [0,1) is \( r_{0,1} \)
Generate scenarios for the market return over the year and compute \( V_1 \):
• decide on the number of scenarios and the set of scenarios for \( r_{0,1} \)
• generate scenarios
  – use historic scenarios
  – draw randomly from historic scenarios (bootstrapping)
  – draw random numbers from the assumed distribution (Monte Carlo)
• visualize and analyze the approximate probability distribution of \( V_1 \)
In our example we assume that the return of the market over the next year
follows a Normal distribution
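The bootstrapping option in the list above can be sketched as follows. The vector `histReturns` holds made-up illustrative values, not actual S&P 500 data, and the scenario count is an arbitrary choice.

```matlab
% Bootstrapping: draw yearly returns at random, with replacement, from history
% NOTE: histReturns holds made-up illustrative values, not actual S&P 500 data
histReturns = [0.12; -0.05; 0.21; 0.03; -0.14; 0.18; 0.09; 0.26; -0.02; 0.07];

v0 = 1000;                               % initial capital
Ns = 100;                                % number of scenarios
idx = randi(numel(histReturns), Ns, 1);  % random indices, sampled with replacement
r01 = histReturns(idx);                  % bootstrapped one-year return scenarios
v1  = (1 + r01) * v0;                    % value at the end of year 1 per scenario

fprintf('Bootstrapped V1: mean %.2f, min %.2f, max %.2f\n', mean(v1), min(v1), max(v1));
```

Unlike Monte Carlo sampling from an assumed distribution, bootstrapping can only reproduce returns that were actually observed.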
Simulation modeling – example 1
Between 1977 and 2007, S&P 500 returned 8.79% per year on average with a
standard deviation of 14.65%
Generate 100 scenarios for the market return over the next year (draw
100 random numbers from a Normal distribution with mean 8.79% and standard
deviation of 14.65%)
Compute and plot the value at the end of year 1, \( V_1 \):

Number of values   100
Mean               $1,087.90
Std Deviation      $146.15
Skewness           0.0034442
Kurtosis           2.871695
Mode               $1,118.96
5% Percentile      $837.40
95% Percentile     $1,324.00
Minimum            $708.81
Maximum            $1,458.52
Simulation modeling – example 1 in Matlab
[Figure: histogram of Value at time 1 (y-axis: Frequency) and plot of Simulated Value Paths (x-axis: Time, y-axis: Value)]
v0 = 1000; % initial capital
Ns = 100; % number of scenarios
% Generate Normal random variables (normrnd requires the Statistics Toolbox)
r01 = normrnd(0.0879, 0.1465, Ns, 1);
% Distribution of value at the end of year 1
v1 = (1 + r01) * v0;
% Plot a histogram of the distribution of outcomes for v1
[frequencyCounts, binLocations] = hist(v1, 10);
bar(binLocations, frequencyCounts);
xlabel('Value at time 1'); ylabel('Frequency');
% Plot simulated paths over time in a new figure (keeps the histogram visible)
figure;
time = 0:1:1;
plot(time, [v0*ones(Ns,1) v1], 'Linewidth', 2);
xlabel('Time'); ylabel('Value'); title('Simulated Value Paths');
Why use simulation?
Example 1 illustrates a very basic Monte Carlo simulation system
Simulation allows us to evaluate (approximately) a function of a random variable
in example 1 the function is simple: \( V_1 = (1 + r_{0,1})\, V_0 \)
given the distribution of the market return \( r_{0,1} \), in some cases we can compute the distribution of \( V_1 \) in closed
form, e.g., if \( r_{0,1} \) followed a Normal distribution with mean \( \mu \) and standard deviation \( \sigma \), then \( V_1 \) also follows a Normal
distribution with mean \( (1 + \mu)\, V_0 \) and standard deviation \( \sigma V_0 \)
if \( r_{0,1} \) was not Normally distributed, or if the output variable \( V_1 \) were a more complex
function of the input variable \( r_{0,1} \), it would be difficult or practically impossible to
derive the probability distribution of \( V_1 \) from the probability distribution of \( r_{0,1} \)
Other advantages of simulation:
simulation enables visualizing probability distribution resulting from compounding
probability distributions of multiple input variables (example 2)
simulation allows incorporating correlations between input variables (example 3)
simulation is a low-cost tool for checking the effect of changing a strategy on an output
variable of interest (example 4)
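For example 1 the closed-form result and the simulation can be put side by side. The sketch below uses the inputs of example 1; the larger scenario count is an assumption made to reduce sampling noise, and `randn` replaces `normrnd` so that no toolbox is needed.

```matlab
% Closed-form Normal distribution of V1 versus a Monte Carlo estimate
v0 = 1000;  mu = 0.0879;  sigma = 0.1465;   % inputs from example 1
Ns = 100000;                                % more scenarios than on the slide

r01 = mu + sigma * randn(Ns, 1);            % Normal returns using base Matlab
v1  = (1 + r01) * v0;                       % simulated end-of-year values

meanExact = (1 + mu) * v0;                  % closed-form mean: 1087.90
stdExact  = sigma * v0;                     % closed-form standard deviation: 146.50
fprintf('Mean: exact %.2f, simulated %.2f\n', meanExact, mean(v1));
fprintf('Std:  exact %.2f, simulated %.2f\n', stdExact,  std(v1));
```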
Next, we extend example 1 to illustrate such situations
Simulation modeling – example 2
You are planning for retirement and decide to invest in the market for the next
30 years (instead of only the next year as in example 1). Your initial capital is
still $1000 (\( V_0 = 1000 \))
Assume that every year your investment returns from investing into the
S&P 500 will follow a Normal distribution with the mean and standard deviation
as in example 1.
Value of investment after 30 years: \( V_{30} = V_0 \prod_{t=1}^{30} (1 + r_{t-1,t}) \)
The return over 30 years will depend on the realization of 30 random variables
Observations:
• the sum of Normal random variables is Normal
• here we have multiplication of Normal random variables; is it Normal?
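One way to probe the question is to compare the skewness of a sum and of a product of the same simulated returns, since a Normal distribution has zero skewness. The sketch below is an illustration under the example's assumptions; the scenario count and the hand-written skewness formula are choices made here.

```matlab
% Is the product of 30 Normal growth factors Normal? Check skewness by simulation.
v0 = 1000;  mu = 0.0879;  sigma = 0.1465;    % inputs from examples 1 and 2
Ns = 50000;                                  % number of scenarios (arbitrary)

r = mu + sigma * randn(Ns, 30);              % 30 yearly returns per scenario
sumR = sum(r, 2);                            % sum of Normals: symmetric
v30  = v0 * prod(1 + r, 2);                  % product of growth factors

% Sample skewness = third central moment / variance^1.5 (zero for a Normal)
skew = @(x) mean((x - mean(x)).^3) / mean((x - mean(x)).^2)^1.5;

fprintf('Skewness of the sum of returns: %.3f\n', skew(sumR));
fprintf('Skewness of the 30-year value:  %.3f\n', skew(v30));
```

The sum stays close to symmetric, while the 30-year value is strongly right-skewed, so the product is not Normal.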
Simulation modeling – example 2
Between 1977 and 2007, S&P 500 returned 8.79% per year on average with a
standard deviation of 14.65%
Simulate 30 columns of 100 observations each of single period returns (the summary below is based on 5000 scenarios)
Compute and plot the value after 30 years, \( V_{30} \):

Number of values   5000
Mean               $12,587.62
Std Deviation      $10,948.39
Skewness           3.349066
Kurtosis           28.24214
Mode               $4,458.97
5% Percentile      $2,655.55
95% Percentile     $32,481.38
Minimum            $609.75
Maximum            $194,355.00
Simulation modeling – example 2 in Matlab
[Figure: histogram of Value after 30 years (y-axis: Frequency) and plot of Simulated Value Paths (x-axis: Time, y-axis: Value)]
v0 = 1000; % initial capital
Ns = 100; % number of scenarios
% Generate Normal random variables: one row per scenario, one column per year
r_speriod30 = normrnd(0.0879, 0.1465, Ns, 30);
% Distribution of value at the end of year 30
v30 = v0 * prod(1 + r_speriod30, 2);
% Plot a histogram of the distribution of outcomes for v30
[frequencyCounts, binLocations] = hist(v30, 10);
bar(binLocations, frequencyCounts);
% Plot simulated paths over time in a new figure
figure;
time = 0:1:30;
% Cumulative product along each row gives the value path of each scenario
v_t = v0 * [ones(Ns,1) cumprod(1 + r_speriod30, 2)];
plot(time, v_t', 'Linewidth', 2);
Simulation modeling – example 2 in Matlab
[Figure: histogram of Value after 30 years (y-axis: Frequency) and plot of Simulated Value Paths (x-axis: Time, y-axis: Value)]
v0 = 1000; % initial capital
Ns = 5000; % number of scenarios
% Generate Normal random variables: one row per scenario, one column per year
r_speriod30 = normrnd(0.0879, 0.1465, Ns, 30);
% Distribution of value at the end of year 30
v30 = v0 * prod(1 + r_speriod30, 2);
% Plot a histogram of the distribution of outcomes for v30 (100 bins)
[frequencyCounts, binLocations] = hist(v30, 100);
bar(binLocations, frequencyCounts);
% Plot simulated paths over time in a new figure
figure;
time = 0:1:30;
% Cumulative product along each row gives the value path of each scenario
v_t = v0 * [ones(Ns,1) cumprod(1 + r_speriod30, 2)];
plot(time, v_t', 'Linewidth', 2);
Simulation modeling – example 3
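You are planning for retirement and decide to invest in the market for the next
30 years. Your initial capital is $1000 (\( V_0 = 1000 \))
You have an opportunity to invest in stocks and Treasury bonds:
• allocate 50% of your capital to the stock market (S&P 500 index fund) today
• allocate 50% of your capital to bonds today
Assume that every year your investment returns from investing into the
S&P 500 and Treasury bonds will follow a Normal distribution with the mean
and standard deviation as in example 2 (for S&P 500), and mean 4% and standard
deviation 7% for bonds. Assume correlation -0.2 between the stock market and
the Treasury bond market.
Covariance matrix: \( \begin{pmatrix} 0.1465^2 & -0.2 \cdot 0.1465 \cdot 0.07 \\ -0.2 \cdot 0.1465 \cdot 0.07 & 0.07^2 \end{pmatrix} \approx \begin{pmatrix} 0.0215 & -0.0021 \\ -0.0021 & 0.0049 \end{pmatrix} \)
Value of investment after 30 years: \( V_{30} = 0.5\, V_0 \prod_{t=1}^{30} (1 + r^{S}_{t-1,t}) + 0.5\, V_0 \prod_{t=1}^{30} (1 + r^{B}_{t-1,t}) \), where \( r^{S} \) and \( r^{B} \) are the yearly stock and bond returns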
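The covariance matrix follows from the two standard deviations and the correlation. The sketch below builds it and draws correlated Normal returns with a Cholesky factor, which is one common alternative to `mvnrnd` that needs only base Matlab; the scenario count is an arbitrary choice.

```matlab
% Build the covariance matrix from volatilities and correlation, then
% generate correlated Normal returns with a Cholesky factor (base Matlab only)
mu    = [0.0879, 0.04];                  % expected returns: stocks, bonds
vol   = [0.1465, 0.07];                  % standard deviations: stocks, bonds
rho   = -0.2;                            % stock-bond correlation
Rmat  = [1, rho; rho, 1];                % correlation matrix
Sigma = diag(vol) * Rmat * diag(vol);    % covariance matrix

Ns = 100000;                             % number of scenarios (arbitrary)
L  = chol(Sigma, 'lower');               % Sigma = L * L'
scenarios = repmat(mu, Ns, 1) + randn(Ns, 2) * L';   % rows ~ N(mu, Sigma)

sampleCorr = corrcoef(scenarios);        % 2x2 sample correlation matrix
fprintf('Off-diagonal covariance: %.6f\n', Sigma(1,2));
fprintf('Sample correlation:      %.3f\n', sampleCorr(1,2));
```

The off-diagonal entry is -0.2 × 0.1465 × 0.07 ≈ -0.00205, which rounds to -0.0021.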
Simulation modeling – example 3
Simulate 30 years of single period correlated returns (the summary below is based on 5000 scenarios)
Compute and plot the value after 30 years, \( V_{30} \):

Number of values   5000
Mean               $7,892.80
Std Deviation      $5,233.10
Skewness           2.921482
Kurtosis           20.48869
Mode               $5,050.96
5% Percentile      $2,951.82
95% Percentile     $17,457.43
Minimum            $1,408.63
Maximum            $79,729.34
Simulation modeling – example 3 in Matlab
[Figure: histogram of Value after 30 years (y-axis: Frequency) and plot of Simulated Value Paths (x-axis: Time, y-axis: Value)]
v0 = 1000; % initial capital
Ns = 5000; % number of scenarios
mu = [0.0879, 0.04]; % expected returns (stocks, bonds) as a row vector
sigma = [0.1465^2, -0.0021; -0.0021, 0.07^2]; % covariance matrix; off-diagonal is -0.2*0.1465*0.07, rounded
% Generate correlated Normal random variables (mvnrnd requires the Statistics Toolbox)
stockRet = ones(Ns,1); % cumulative growth factor of stocks
bondsRet = ones(Ns,1); % cumulative growth factor of bonds
for iYear = 1:30
scenarios = mvnrnd(mu, sigma, Ns); % one year of joint returns, Ns-by-2
stockRet = stockRet .* (1 + scenarios(:,1));
bondsRet = bondsRet .* (1 + scenarios(:,2));
end
% Distribution of value at the end of year 30
v30 = 0.5*v0*stockRet + 0.5*v0*bondsRet;
Simulation modeling – example 4
Using the scenario generation procedure from example 3 for decision-making
Compare portfolios:
• 50-50 portfolio allocation in stocks and bonds (Strategy A)
• 30-70 portfolio allocation in stocks and bonds (Strategy B)
Compute and plot the difference in value after 30 years between the two strategies, \( V_{30}^{A} - V_{30}^{B} \):

Number of values   5000
Mean               $1,865.13
Std Deviation      $2,214.87
Skewness           3.506451
Kurtosis           40.18968
Mode               $687.75
5% Percentile      -$254.41
95% Percentile     $6,027.23
Minimum            -$1,829.78
Maximum            $45,972.08
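The comparison can be sketched in Matlab by evaluating both allocations on the same simulated scenarios. This is a minimal sketch, not the code behind the table: it assumes buy-and-hold allocations as in example 3 and uses a Cholesky factor with `randn` in place of `mvnrnd`, so individual numbers will differ from run to run.

```matlab
% Example 4 sketch: compare two allocations on the same simulated scenarios
v0 = 1000;  Ns = 5000;                       % initial capital, number of scenarios
mu    = [0.0879, 0.04];                      % expected returns: stocks, bonds
Sigma = [0.1465^2, -0.2*0.1465*0.07; -0.2*0.1465*0.07, 0.07^2];
L     = chol(Sigma, 'lower');                % Cholesky factor, Sigma = L * L'

stockRet = ones(Ns, 1);  bondsRet = ones(Ns, 1);
for iYear = 1:30
    scen = repmat(mu, Ns, 1) + randn(Ns, 2) * L';   % correlated yearly returns
    stockRet = stockRet .* (1 + scen(:,1));
    bondsRet = bondsRet .* (1 + scen(:,2));
end

v30A = v0 * (0.5*stockRet + 0.5*bondsRet);   % Strategy A: 50-50
v30B = v0 * (0.3*stockRet + 0.7*bondsRet);   % Strategy B: 30-70
diffAB = v30A - v30B;                        % positive when A ends ahead of B

fprintf('Mean difference A - B: %.2f\n', mean(diffAB));
fprintf('Share of scenarios where A beats B: %.3f\n', mean(diffAB > 0));
```

Using common scenarios for both strategies means the difference reflects the allocation choice and not sampling noise between two separate simulations.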
Mean-Variance Portfolio Selection
Measuring risk and portfolio selection
Consider n assets with random returns:
• proportion invested in asset i
• exp. return and standard dev. of the return of asset i
• variance-covariance matrix
Portfolio expected return and variance:
Set of admissible portfolios:
[Figure: probability density of the portfolio return, with the mean return and the variance (standard deviation) marked]
Portfolio return distribution is assumed to be Gaussian (Normal)
Portfolio selection
Consider n assets with random returns:
• \( x_i \): proportion of total funds invested in asset i
• \( \mu_i, \sigma_i \): expected return and standard deviation of the return of asset i
• \( \rho_{ij} \): correlation coefficient of i’s and j’s returns
• \( \mu \): vector of expected returns
• variance-covariance matrix (PSD), with entries \( \rho_{ij}\,\sigma_i\,\sigma_j \)
Expected return and variance of the resulting portfolio: \( \sum_i \mu_i x_i \) and \( \sum_i \sum_j \rho_{ij}\,\sigma_i\,\sigma_j\, x_i x_j \)
Set of admissible portfolios:
Portfolio selection
A feasible portfolio x is efficient if it has:
maximal expected return among all portfolios with the same variance,
minimum variance among all portfolios with the same expected return.
Mean-variance optimization (Markowitz, 1952):
Alternative formulations
Solving for all the values of V, R, or gives efficient portfolios:
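The formulations on this slide did not survive extraction. A standard way to write three equivalent mean-variance problems is sketched below. The symbols are assumptions chosen for illustration and are not taken from the slides: \( Q \) for the variance-covariance matrix, \( \mathcal{X} \) for the set of admissible portfolios, \( R \) for a target expected return, \( V \) for a variance budget, and \( \lambda \ge 0 \) for a risk-aversion weight.

```latex
\begin{align*}
\text{(1)}\quad & \min_{x \in \mathcal{X}} \; x^{T} Q x
   && \text{s.t. } \mu^{T} x \ge R \\
\text{(2)}\quad & \max_{x \in \mathcal{X}} \; \mu^{T} x
   && \text{s.t. } x^{T} Q x \le V \\
\text{(3)}\quad & \max_{x \in \mathcal{X}} \; \mu^{T} x - \lambda \, x^{T} Q x
\end{align*}
```

A common choice for the admissible set, again an assumption here, is \( \mathcal{X} = \{ x : \sum_i x_i = 1,\ x \ge 0 \} \), i.e. fully invested with no short selling.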
Portfolio selection
Portfolio optimization problem – efficient frontier:
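With only two assets the frontier can be traced without an optimization solver. The sketch below reuses the stock and bond inputs of example 3 and assumes long-only, fully invested portfolios on a 1% grid; the names `Q`, `w` and `efficient` are choices made for this illustration.

```matlab
% Two-asset mean-variance frontier using the stock/bond inputs of example 3
mu = [0.0879; 0.04];                     % expected returns: stocks, bonds
Q  = [0.1465^2, -0.2*0.1465*0.07; -0.2*0.1465*0.07, 0.07^2];  % covariance matrix

w = (0:0.01:1)';                         % share in stocks; the rest goes to bonds
X = [w, 1 - w];                          % each row is a long-only portfolio
portRet = X * mu;                        % expected return of each portfolio
portVar = sum((X * Q) .* X, 2);          % x' * Q * x for each row x of X
portStd = sqrt(portVar);

[minVar, iMin] = min(portVar);           % minimum-variance portfolio on the grid
efficient = portRet >= portRet(iMin);    % upper branch = efficient frontier

fprintf('Minimum-variance stock share: %.2f (std %.4f, return %.4f)\n', ...
        w(iMin), portStd(iMin), portRet(iMin));
% plot(portStd(efficient), portRet(efficient)) would draw the efficient frontier
```

Because of the negative correlation, the minimum-variance mix has a lower standard deviation than bonds alone; portfolios with less stock than that mix are dominated and lie off the efficient frontier.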
Portfolio selection
Extensions of mean-variance model: introduce transaction costs
Mean-variance portfolio optimization problem – two
objectives:
Portfolio selection
Mean-variance portfolio optimization problem – efficient
frontier and portfolio composition:
Questions
Legal Disclaimer
• © IBM Corporation 2013. All Rights Reserved.
• The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information
contained in this publication, it is provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM’s current product plans and strategy,
which are subject to change by IBM without notice. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other
materials. Nothing contained in this publication is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering
the terms and conditions of the applicable license agreement governing the use of IBM software.
• References in this presentation to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or
capabilities referenced in this presentation may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment
to future product or feature availability in any way. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken
by you will result in any specific sales, revenue growth or other results.