Supervised Learning - University of Massachusetts

Supervised Learning

David Kazmer, Guest Lecturer 16.711 Computational Data Modeling

November 13, 2013

This session supplements

Chapter 3 of Bishop text

Agenda

• Introduction
  – Topology
  – Linear regression
  – Transforms
• Multivariate Modeling
  – Overview (and error norms)
  – Univariate Optimization: R's optimize
  – Multivariate Optimization: R's L-BFGS-B
• The Big Picture: Two Real Examples
  – High Fidelity Simulation & Polymer Rheology
  – Process Optimization of DVD Manufacturing

Topology

• The study of properties that are preserved under continuous deformations, including stretching and bending, but not tearing or gluing

– The term "Topologie" was first introduced in German in 1847 by Johann Benedict Listing

– The English term "topology" was first used in 1883 in Listing's obituary in Nature, which defined it as "qualitative geometry", distinct from "the ordinary geometry in which quantitative relations chiefly are treated"

– The term has since been formalized by mathematicians

Geometric Topology

• Can a function or mapping be defined to transform a set X to a different set, τ?

– If so, then a topology exists

• Consider this mug example:

– What would the function look like?

– What are the modeling issues with respect to function complexity & surface flexibility?


Linear Regression

• Given data and a set of coefficients b, an output is predicted for each row i:

  ŷ_i = b0 + b1·x_{i,1} + b2·x_{i,2} + b3·x_{i,3}

• Given the observation y_i, the error is defined as e_i = y_i − ŷ_i

• The sum of squared error is SSE = Σ_i (y_i − ŷ_i)²

X1 X2 X3 X4 X5 X6 X7 X8 X9 Y1 Y2 Y3
0.2301 3.0098 2.9 7.9902 4.8 15.262 2.2427 13.846 67.304 204.59 113.26 23.891

0.2301 3.0098 2.9 7.9902 2.98 15.262 2.2427 13.846 67.304 204.59 113.26 23.891

0.2301 3.0098 2.9 7.9902 7.04 15.262 2.2427 13.846 67.304 204.59 113.26 23.891

0.2301 3.0098 2.9 7.9902 3.9807 15.262 2.2427 13.846 67.304 204.59 113.26 23.891

0.22 3 3.01 8 16.94 14.88 2.1218 14.202 68.295 204.85 112.14 23.689

0.24 3 2.8739 7.99 14.981 15.343 2.2427 13.94 66.695 204.4 116.36 23.971

0.2301 3.0099 2.8825 8.0045 14.98 15.356 2.2226 13.812 67.427 205.43 109.51 24.106

0.24 3 2.8946 8.0007 14.99 15.545 2.1688 13.812 66.469 204.98 112.88 24.146

0.24 3 2.8946 8.0007 5.4001 15.545 2.1688 13.812 66.469 204.98 112.88 24.146

0.23 3 3.06 8 61.23 15.403 2.4979 14.329 67.488 205.08 108.53 24.018

0.2303 2.9998 4.14 8 14.96 15.457 2.3099 16.424 67.929 157.27 85.214 24.018

0.23 3 4.1899 8 14.95 15.437 2.3703 16.746 67.377 155.16 83.196 24.065

0.23 3 4.1301 8 14.96 15.511 2.4643 16.525 68.437 154.55 86.228 23.992

0.2301 3 4.23 7.9899 14.94 15.471 2.4509 16.626 68.57 153.02 86.269 23.931

0.2299 3 4.2712 7.99 14.95 15.639 2.7396 16.733 68.747 153.05 86.28 23.891

0.2299 3 4.27 8 14.95 15.45 2.6724 16.706 68.161 153.67 83.859 23.985

0.23 3 4.2296 7.99 14.95 15.477 2.3971 16.733 68.383 154.12 85.86 23.985

0.22 2.99 4.3 8 14.94 15.048 2.4979 16.901 69.146 153.61 84.211 23.797

0.23 3 4.22 8 14.93 15.142 2.565 16.76 66.609 154.58 83.758 23.689

0.2301 3 4.1811 8 14.94 15.377 2.5986 16.78 67.341 155.65 83.683 23.985

0.2199 2.99 4.2298 8 14.94 15.021 2.6657 16.746 68.698 154.32 84.859 23.757

0.23 3 4.24 8 14.939 15.511 2.5784 16.901 68.114 154.22 83.632 24.065

0.22 2.9899 4.2899 8 14.93 15.115 2.753 16.928 68.687 152.5 83.377 23.824

0.2201 2.99 4.22 8.0013 14.94 15.095 2.3971 16.572 68.694 155.36 85.732 23.736

0.2301 3 4.22 8 14.93 15.672 2.5314 16.767 68.586 152.11 83.909 24.065

0.23 3 4.2899 7.99 14.929 15.538 2.5986 16.847 68.892 152.11 84.709 24.039

0.23 3 4.19 7.99 14.92 15.545 2.8135 16.753 68.947 151.98 85.457 23.938

0.2301 3 4.2401 8 14.93 15.565 2.3971 16.639 68.38 154.94 84.542 24.112

0.2226 3 4.2099 8 14.93 15.182 2.4576 16.699 70.01 154.45 82.752 23.79

0.2199 2.99 4.2201 8 14.92 15.061 2.471 16.733 68.967 153.28 84.236 23.824

0.23 3 4.2199 8 14.9 15.618 2.4173 16.532 68.346 153.8 85.182 24.039

0.22 2.9909 4.1121 7.999 14.9 15.027 2.4777 16.505 68.481 154.64 84.766 23.81

0.2301 3 4.2 8 14.9 15.491 2.4106 16.572 67.728 155.36 83.656 24.086

0.23 3 4.2201 8 14.89 15.397 2.3434 16.646 68.059 154.22 83.718 23.998

0.2301 3 4.2188 8 14.9 15.491 2.3434 16.599 67.999 153.57 84.614 23.931

0.2189 2.99 4.23 8 14.88 15.021 2.5449 16.874 68.475 153.02 83.474 23.736

0.2301 3 4.1901 8 14.89 15.558 2.424 16.726 68.01 155.36 83.998 24.126

0.2301 3 4.22 7.9899 14.89 15.618 2.6993 16.679 69.336 151.59 85.384 24.039

0.2289 3.0005 4.25 7.9995 14.88 15.417 2.5046 16.679 68.602 152.89 83.963 23.945

0.23 3 4.21 8 33.899 15.256 2.4979 16.854 66.536 152.11 83.387 23.703

0.2312 3 4.02 7.9889 15.2 15.41 2.0077 15.363 68.25 159.61 87.377 23.918

0.2199 2.99 4.03 8.0006 15.2 15.007 2.0278 15.585 68.149 159.32 91.89 23.797

0.2313 2.9987 3.9901 8 15.2 15.585 1.8331 15.477 67.983 160.59 83.546 24.099

0.2299 3 3.99 7.9901 15.19 15.518 1.9741 15.37 69.078 159.84 86.549 23.985

0.23 3 3.94 8 15.19 15.618 2.001 15.236 68.636 162.5 85.074 24.133

0.23 3.0001 3.9801 7.9999 15.19 15.518 2.0816 15.471 68.467 159.84 85.356 23.998

0.22 3 3.94 8 15.19 15.007 2.095 15.417 68.544 160.88 87.048 23.642

0.23 3.0001 3.92 8 15.18 15.511 2.0144 15.43 67.689 160.72 84.372 24.052

0.2199 2.99 3.93 8 15.17 15.021 1.8667 15.336 68.138 161.59 86.599 23.736

0.22 2.9901 4 8 15.169 14.98 2.0816 15.524 68.151 160.29 85.931 23.783

0.23 3 4.19 7.99 14.9 15.457 2.4643 16.545 69.009 154.19 85.542 23.992

0.2301 3 4.25 7.9999 14.9 15.397 2.5046 16.854 67.612 154.06 82.797 23.992

0.23 2.9999 4.2 8.0001 14.9 15.571 2.4106 16.639 68.254 153.93 83.536 23.998

0.23 3 4.17 8 14.89 15.592 2.3233 16.364 67.741 155.1 85.905 23.931

0.23 3.0019 4.19 7.9881 14.9 15.558 2.7127 17.082 68.883 151.07 84.844 23.998

0.23 2.9913 4.1801 7.999 14.89 15.383 2.2561 16.505 68.338 154.12 87.706 23.871

0.2189 3 4.18 7.9999 14.89 14.974 2.236 16.458 68.685 155.98 86.645 23.582

0.23 3 4.3 8 14.88 15.551 2.659 16.82 68.551 152.92 81.783 24.092

0.2199 3 4.2 8 14.88 14.987 2.3837 16.746 68.552 154.58 86.244 23.609

0.22 2.99 4.19 8 14.88 15.054 2.424 16.713 68.92 155.26 86.354 23.743

0.2299 3 3.95 8 15.02 15.471 1.8533 15.296 67.673 163.8 85.35 24.079

0.2289 3 3.9201 7.99 15.03 15.491 2.283 15.356 68.544 161.24 87.288 23.965

0.2301 2.9999 3.86 7.9901 15.03 15.383 1.9741 15.202 68.402 162.28 89.308 23.864

0.22 2.9899 3.9399 8 15.03 15.068 2.1218 15.659 68.528 161.37 86.248 23.877

0.2299 3 4.0601 8 15.03 15.518 1.9674 15.645 67.882 162.11 84.448 24.146

0.23 3 3.9799 7.99 15.01 15.484 1.954 15.565 68.539 159.42 85.833 23.951

0.2301 3 3.9389 7.9899 15.02 15.41 1.9741 15.182 68.438 161.72 88.76 23.904

0.2301 3 3.9888 7.9899 15.02 15.484 1.9943 15.323 68.565 160.46 86.678 23.945

0.2199 2.99 3.9591 8 15.02 14.987 1.907 15.679 68.362 160.88 86.136 23.676

0.2299 3 3.95 7.99 28.96 15.444 2.0614 15.363 68.788 160.17 87.228 23.918

0.2 4.2602 4.13 10.49 18.68 15.82 3.9482 18.593 80.456 164.26 92.08 24.233

0.2 4.2509 4.2811 10.49 18.651 15.806 4.3914 19.07 79.843 163.54 90.397 24.26

0.19 4.25 4.23 10.5 18.67 15.256 4.4854 19.493 80.315 162.24 90.512 23.891

0.2 4.26 4.2501 10.49 18.661 15.86 4.5324 19.043 80.805 163.96 91.379 24.314

0.2002 4.2498 4.21 10.5 18.651 15.847 4.331 19.103 80.69 161.33 92.089 24.133

0.2 4.25 4.3487 10.49 18.649 15.867 4.8346 19.466 80.302 162.41 89.754 24.28

0.2 4.241 4.3 10.499 18.65 15.827 4.566 19.439 80.653 160.49 90.829 24.206

0.2 4.26 4.1899 10.49 18.65 15.941 4.2235 18.962 79.92 165.23 89.103 24.394

0.2 4.25 4.1999 10.5 18.65 15.8 4.0624 19.123 79.733 165.17 89.135 24.415

0.2 4.26 4.3199 10.491 18.64 15.84 4.4921 19.137 80.729 162.24 91.125 24.24

0.3101 4.9899 3.5699 4.0101 13.02 14.9 4.7339 19.05 48.806 149.25 84.548 23.475

0.3101 5 3.5788 4.0001 13.019 14.819 4.8883 18.902 49.245 149.45 84.31 23.495

0.31 5.0001 3.7399 4 13.02 14.866 5.271 19.251 49.025 149.51 84.209 23.535

0.3099 5 3.71 4 13.01 14.907 5.2509 19.452 48.49 147.24 82.002 23.569

0.3 4.99 3.55 4 13.03 14.564 4.7271 18.915 48.952 151.4 85.439 23.32

0.31 5 3.6699 4 13.01 14.846 4.9823 19.144 48.491 149.38 82.829 23.555

0.3112 4.9889 3.6289 4.0112 13.01 14.866 4.8413 19.238 48.66 150.42 85.175 23.441

0.3 4.9901 3.76 3.9999 13.008 14.578 5.13 19.553 48.98 147.82 82.377 23.394

0.3111 5 3.7589 4 13 14.947 5.1502 19.432 48.782 147.6 83.279 23.495

0.3101 4.9999 3.7999 4 12.999 14.907 5.7075 19.943 48.596 145.26 80.35 23.569

0.2199 3 4.23 7.99 14.94 14.987 2.9343 16.807 68.277 152.5 83.631 23.763

0.2304 3 4.1211 7.9995 14.95 15.545 2.3569 16.417 68.368 154.71 85.204 24.039

0.2199 2.99 4.3011 8.0001 14.93 15.048 2.4844 16.901 68.831 151.43 81.975 23.824

0.23 3.0008 4.2297 7.9892 14.92 15.518 2.4039 16.787 68.805 151.36 84.882 23.918

0.2299 3 4.1799 8 14.92 15.538 2.3233 16.699 67.615 153.25 81.357 24.133

0.2199 2.99 4.16 8 14.93 15.128 2.4374 16.558 68.798 154.12 85.143 23.763

0.2301 3 4.191 8 14.91 15.565 2.3434 16.471 67.748 154.81 82.191 24.146

0.2301 3 4.27 7.9899 14.92 15.35 2.6389 16.753 68.477 151.2 84.862 23.851

e_i = y_i − ŷ_i

SSE = Σ_{i=1}^{n} (y_i − ŷ_i)²

Linear Regression

• We often seek the model that provides the least squared error:

  min_b Σ_{i=1}^{n} (y_i − ŷ_i)²

• This sum of squared error is minimized with the model coefficients:

  b = (XᵀX)⁻¹ Xᵀ y


So there are at least two reasons that regression is so popular:

1) Intuitive
2) Fast
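The closed-form solve b = (XᵀX)⁻¹Xᵀy above can be sketched directly. Here is a minimal Python version (the lecture uses R); the dataset, seed, and coefficient values below are illustrative assumptions, not the slide's data:

```python
import numpy as np

# Synthetic stand-in for the slide's dataset (values are illustrative assumptions)
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.uniform(-1, 1, (n, 3))])  # intercept + 3 inputs
b_true = np.array([1.0, 2.0, -0.5, 0.3])
y = X @ b_true + 0.01 * rng.standard_normal(n)

# Normal-equation solution b = (X'X)^-1 X'y; solve() avoids forming the inverse
b_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

In practice `np.linalg.lstsq` (or R's `lm()`) is preferred: it factorizes X directly, which is better conditioned than forming XᵀX.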

Coefficient of Determination, R²

• The analytical solution minimizes the sum of squared error:

  SSE = Σ_i (y_i − ŷ_i)²

which maximizes the coefficient of determination:

  R² = 1 − SSE/SSY

where

  SSY = Σ_{i=1}^{n} (y_i − ȳ)²

[Plot: linear fit to the data, y vs. x on [−1, 1]; R² = 0.907]

Example 1: Linear regression of a noisy, non-linear function

• R: source("Kazmer_1.R")

– f = exp(x) + 0.1·rnorm(), where x ∈ [−1, 1]

– Fit with linear regression, R's lm()

• Modeling results:

– R² ≈ 0.91: Not too bad…

• % of explained behavior

… but incorrect topology

• Poor decision making

• Both interpolation & extrapolation

• y(x = 1) = 2.718
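A Python sketch of this experiment (the lecture's Kazmer_1.R is not reproduced here, so the sample size and seed are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 200)                     # x in [-1, 1]; n = 200 assumed
f = np.exp(x) + 0.1 * rng.standard_normal(x.size)   # noisy exp(x), as on the slide

# Straight-line fit f_hat = b0 + b1*x, the analogue of R's lm(f ~ x)
b1, b0 = np.polyfit(x, f, 1)
f_hat = b0 + b1 * x

r2 = 1.0 - np.sum((f - f_hat) ** 2) / np.sum((f - f.mean()) ** 2)
```

R² lands near 0.9, yet the straight line has the wrong topology: it systematically underpredicts near the ends of the interval, so extrapolation toward y(1) = e ≈ 2.718 is poor.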

Two Strategies for Higher Fidelity Models


Common Transforms, t: Example, for y = exp(x)

[Panels: transformed response t vs. x for t = y², t = y^0.3, t = log(y), t = e^y, and the standardization t = (y − μ)/σ]

More Common Transforms, t: Example, for y = exp(x)

[Panels: transformed response vs. x for sine(y), FFT(y), diff(y), diffinv(y), and logistic(y)]

[Plot: y vs. exp(x) with linear fit to the data; R² = 0.981]

Example 2: Linear regression of transformed input

• R: source("Kazmer_2.R")

– f = exp(x) + 0.1·rnorm(), where x ∈ [−1, 1]

– Fit with linear regression to t = exp(x) instead of x

• Modeling results:

– R² ≈ 0.98

– Correct topology

– Noise is significant but likely does not affect the "true" model coefficients
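The same sketch with the transformed input t = exp(x), again in Python instead of the lecture's R (seed and sample size are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, 200)
f = np.exp(x) + 0.1 * rng.standard_normal(x.size)

# Regress on the transformed input t = exp(x) instead of on x itself
t = np.exp(x)
b1, b0 = np.polyfit(t, f, 1)
f_hat = b0 + b1 * t

r2 = 1.0 - np.sum((f - f_hat) ** 2) / np.sum((f - f.mean()) ** 2)
```

Because the model is now linear in a basis that matches the true topology, b1 ≈ 1 and b0 ≈ 0, and the residual is essentially the injected noise.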

Notation in Chapter 3 of Bishop Text

• The linear function y is:

  y(x, w) = wᵀφ(x)

• The sum of squared error is:

  E_D(w) = ½ Σ_{n=1}^{N} (t_n − wᵀφ(x_n))²

• The coefficients are solved as:

  w_ML = (ΦᵀΦ)⁻¹ Φᵀ t


Multivariate Modeling: Overview

• The preceding transforms and example were univariate

• In the “real world”, all data is multivariate

– Where there are univariate experiments, other (unmodeled) factors will appear as noise

• A note on controllability & observability:

– How controlled are the factors?

– What is the fidelity of the observations?

Multivariate Modeling: Potential Challenges

• Difficult to visualize

– Multiple dimensions

– Complex curvature

• Confounding of behaviors

– Misconceptions of true behaviors

• Curse of dimensionality

– The number of candidate models increases as n^r (n: # of modes/levels per variable, r: # of variables)!

• (n=4, r=4) → 4⁴ = 256 models to investigate

• (n=5, r=8) → 5⁸ = 390,625 models to investigate
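The count above is just n**r; a two-line check reproduces the slide's numbers:

```python
# Candidate-model count grows as n**r (n: modes/levels per variable, r: variables)
def model_count(n, r):
    return n ** r

print(model_count(4, 4), model_count(5, 8))  # 256 390625
```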

Multivariate Modeling: Strategy

• We may have knowledge of expected behavior:

– Practical experience, e.g. monotonicity or ranges

– Theory/physics, e.g. laws that drive topology

• Strategy: Supervised Learning

– Let our knowledge of the behavior guide the use of transforms & multivariate fitting

– Impact:

• Reduce # variables to consider

• Constrain the topological behavior

Regression Alternative: Optimization

• Numerical optimization can be used to find the set of model coefficients to best fit a dataset

• Optimization algorithms are typically numerical techniques that iteratively search for the model coefficients, b, to minimize an objective function, g(b, x)

• System identification, e.g., (Ljung) relies on these techniques

Optimization

• The objective function, g, is often an error norm between the observed response, y_observed, and the model predictions, y_predicted

– Any model, including numerical simulations, can be implemented and its best coefficients determined

– Constraints may be defined for the coefficients, b:

• To ensure reasonable coefficients, and

• To ensure the model can be evaluated

  min_b g(b, y_observed, y_predicted), where y_predicted = f(b, x)

Numerical Methods

• There are many, many optimization methods

• As an intuitive example, the commonly used steepest descent method iteratively evaluates the objective function g(b,x), and then computes the gradient to determine the direction in which to move

– Because the true optimal solution is unknown and the objective function may be complex, algorithms may adaptively change the step size to balance convergence rate and numerical stability
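The idea above can be sketched in a few lines; this is a minimal Python illustration of steepest descent with a halving (adaptive) step, not any particular library's implementation, and the quadratic test function is an illustrative assumption:

```python
import numpy as np

def steepest_descent(g, grad, b0, step=1.0, tol=1e-8, max_iter=500):
    """Minimize g by stepping along -grad, halving the step until it improves."""
    b = np.asarray(b0, dtype=float)
    for _ in range(max_iter):
        d = grad(b)
        if np.linalg.norm(d) < tol:   # gradient ~ 0: converged
            break
        s = step
        while g(b - s * d) >= g(b) and s > 1e-12:
            s *= 0.5                  # adaptive step size for stability
        b = b - s * d
    return b

# Quadratic bowl with its minimum at (1, -2)
g = lambda b: (b[0] - 1.0) ** 2 + 2.0 * (b[1] + 2.0) ** 2
grad = lambda b: np.array([2.0 * (b[0] - 1.0), 4.0 * (b[1] + 2.0)])
b_min = steepest_descent(g, grad, [0.0, 0.0])
```

The halving loop is the "adaptively change the step size" trade-off in miniature: a large trial step favors fast convergence, and shrinking it on failure preserves stability.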

[Surface plot: objective g over x1 ∈ [−1, 2], x2 ∈ [−1, 2]]

Example: Global Maximum

• Here's a function of two variables

– Starting at point 1, the optimization took 82 iterations to find the global maximum

– The solution is automatic once the objective function and constraints are defined

Example: Local Maximum

• Unfortunately, starting at point 2, the same algorithm found a local maximum and stopped

• It is important to understand the behavior of the objective function and to verify the quality of the solution


Common Error Norms

• The most common error norm is the sum of squared error, SSE:

  SSE = Σ_i (y_i − ŷ_i)²

– where y_i is the observation and ŷ_i is the prediction

– This norm is consistent with regression techniques

• A common alternative is the mean absolute percentage error (MAPE):

  MAPE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i| / y_i

Why?
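Both norms are one-liners; a Python sketch with a toy check:

```python
import numpy as np

def sse(y, y_hat):
    """Sum of squared error."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sum((y - y_hat) ** 2))

def mape(y, y_hat):
    """Mean absolute percentage error (assumes no observation y_i is zero)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean(np.abs((y - y_hat) / y)))

# e.g. errors (0, 0, 2) give sse = 4.0, while mape weights each error
# relative to its observation, which is why it suits wide-ranging data
```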


R's optimize() function (Brent, 1973)

Example 3: Model Fitting by Optimization

• source("Kazmer_3.R")

– A repeat of the prior regression, but by optimization:

• Define the model function

• Define the objective function, SSE or MAPE

• Call R's optimize()

• Compute statistics

• Modeling results:

– R² ≈ 0.98

[Plot: y vs. ŷ with fit to the data; R² = 0.978]
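R's optimize() is a black box here; to see the idea, the sketch below uses a golden-section search (a simpler relative of the Brent method that optimize() actually uses) to fit a single coefficient by SSE. The data generation and the one-parameter model are illustrative assumptions:

```python
import numpy as np

def golden_section(f, lo, hi, tol=1e-8):
    """Minimize univariate f on [lo, hi] by golden-section search."""
    phi = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - phi * (b - a), a + phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):            # minimum lies in [a, d]
            b, d = d, c
            c = b - phi * (b - a)
        else:                      # minimum lies in [c, b]
            a, c = c, d
            d = a + phi * (b - a)
    return 0.5 * (a + b)

# Fit the single coefficient b1 in f_hat = b1*exp(x) by minimizing SSE
rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, 200)
y = np.exp(x) + 0.1 * rng.standard_normal(x.size)
b1_opt = golden_section(lambda b1: np.sum((y - b1 * np.exp(x)) ** 2), 0.0, 5.0)
```

Each iteration shrinks the bracketing interval by the golden ratio, so the search needs no derivatives, only the assumption of a single minimum inside the bracket.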


Constrained Optimization with R's optim(): L-BFGS-B (1995)

Example 4: Multiple Regression vs. Nonlinear Fitting by Optimization

• Consider this circuit:

– We wish to develop a model relating input to output voltages as a function of the potentiometers

– Thevenin's Theorem yields the governing relation

• Inspect & compare multiple regression & nonlinear fitting:

– source("Kazmer_4_MulReg.R")

– source("Kazmer_4_NL_Fit.R")

Example 5: Reverse Engineering – "Mystery Box" RC Circuit

• Here's a black box, a potentiometer, and a square wave generator

– How can we determine R & C in the black box?

Governing Analog Circuit Theory

• The resistors act in series to limit the current to the capacitor, so the system should follow:

  V_C = −V0 + 2·V0·(1 − exp(−t / ((R_box + R_pot)·C)))
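The charging curve is easy to sandbox; a Python sketch of the model above (the R_box and C values here are hypothetical, since estimating them is the point of the exercise):

```python
import numpy as np

def vc(t, v0, r_pot, r_box, c):
    """Capacitor voltage after a square-wave edge from -V0 to +V0:
    V_C(t) = -V0 + 2*V0*(1 - exp(-t / ((R_box + R_pot)*C)))."""
    tau = (r_box + r_pot) * c
    return -v0 + 2.0 * v0 * (1.0 - np.exp(-t / tau))

# Hypothetical R_box = 50 Ohm and C = 1e-3 F; starts at -V0, rises toward +V0
v = vc(np.array([0.0, 0.05, 1.0]), v0=1.5, r_pot=10.0, r_box=50.0, c=1e-3)
```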

System Data

• A six-run design of experiments is performed:

– Two voltages, V0:

• 1.5 V

• 12V

– 3 Potentiometer values, Rpot:

• 10 Ohms

• 100 Ohms

• 1000 Ohms

• Data is gathered every 0.01 s

– Resulting in 606 rows of data

t      V0   Rpot (Ohm)  Vc
0      1.5  10          -1.5
0.01   1.5  10          -0.68433711
0.02   1.5  10          -0.09044287
0.03   1.5  10          0.341978876
0.04   1.5  10          0.656830496
0.05   1.5  10          0.886077854
0.06   1.5  10          1.052995692
0.07   1.5  10          1.174530634
…      …    …           …
0.88   12   1000        11.99999999
0.89   12   1000        11.99999999
0.9    12   1000        11.99999999
0.91   12   1000        11.99999999
0.92   12   1000        11.99999999
0.93   12   1000        12
0.94   12   1000        12
0.95   12   1000        12
0.96   12   1000        12
0.97   12   1000        12
0.98   12   1000        12
0.99   12   1000        12
1      12   1000        12

Why?

Two Problems – 30 minutes & break

• Perform multiple regression with lm() to find a function of the form:

– V_C = a0 + a1*t + a2*V0 + a3*Rs

– Also try out some different basis functions & combinations with regression

• Define a model and perform constrained optimization to find R_box & C:

  V_C = −V0 + 2·V0·(1 − exp(−t / ((R_box + R_pot)·C)))
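For the second problem, a Python sketch using SciPy's L-BFGS-B (the counterpart of R's optim) on synthetic data; the true R_box and C, the units (kΩ·mF = s, chosen to keep both parameters well scaled), the noise level, and the bounds are all assumptions:

```python
import numpy as np
from scipy.optimize import minimize

def vc_model(t, v0, r_pot, r_box, c):
    # Units: kOhm and mF, so kOhm * mF = s and both parameters are O(0.1-1)
    tau = (r_box + r_pot) * c
    return -v0 + 2.0 * v0 * (1.0 - np.exp(-t / tau))

# Synthetic runs; the true R_box = 0.47 kOhm and C = 0.1 mF are hypothetical
rng = np.random.default_rng(3)
t = np.arange(0.0, 1.01, 0.01)
runs = [(1.5, 0.01), (1.5, 0.1), (12.0, 1.0)]  # (V0, R_pot in kOhm)
data = [(v0, rp, vc_model(t, v0, rp, 0.47, 0.1) + 0.01 * rng.standard_normal(t.size))
        for v0, rp in runs]

def sse(p):
    r_box, c = p
    return sum(np.sum((v - vc_model(t, v0, rp, r_box, c)) ** 2) for v0, rp, v in data)

# Bounded search, the counterpart of R's optim(..., method = "L-BFGS-B")
fit = minimize(sse, x0=[1.0, 0.5], method="L-BFGS-B",
               bounds=[(1e-3, 10.0), (1e-3, 10.0)])
r_box_hat, c_hat = fit.x
```

The varied potentiometer settings are what make R_box and C separately identifiable: a single run only pins down the product (R_box + R_pot)·C.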

Solutions

• source(“Kazmer_5_MulReg.R”)

• source(“Kazmer_5_LinReg_Basis.R”)

– How "good" is the linear model with 4 coefficients?

– How much "better" is the more complex model?

– What do these models tell us about R_box & C?

• source(“Kazmer_5_NL_Fit.R”)

– Strengths of this approach?

– Weaknesses of this approach?


Simulation: Modeling Reality

• Many different types of simulations:

– Circuits: Multisim & SPICE

– Finite element analysis

– Process simulation

– And others

• Common Inputs:

– Physical layout/connectivity

– Initial/boundary conditions

– Material/device properties

– Control parameters

[Capillary rheometer (TA Instruments): apparent viscosity (Pa·s) vs. apparent shear rate (1/s) at T = 180 °C, T = 200 °C (reference), and T = 220 °C]

Capillary Rheometer Data: High Impact Polystyrene

• DOW Styron® 478, MFI of 6 g/10 min

• Raw data:

– Transient

– Oscillatory

– Intermediate values

• What can be done with all this data?

[Plot: viscosity (Pa·s) vs. shear rate (1/s) at T = 180, 200, 220 °C]

Cross-WLF Model

• Standard models used with constrained optimization to estimate:

– Temperature sensitivity, A1

– Reference viscosity, D1

– Critical shear stress, τ*

– Power law index, n

– T* assumed 373.15 K

– A2 assumed 51.6 K

• Results:

– R² ≈ 0.97 (acceptable)

Standard "Cross-WLF" Models for Viscosity vs. Shear Rate

  η(γ̇, T) = η0(T) / (1 + (η0·γ̇ / τ*)^(1−n))

  η0(T) = D1·exp(−A1·(T − T*) / (A2 + (T − T*)))

  log(a_T) = −A1·(T − T*) / (A2 + (T − T*))
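The Cross-WLF pair above is straightforward to code; a Python sketch, evaluated with the fitted standard-model values reported later in the lecture:

```python
import numpy as np

def cross_wlf(gdot, T, n, tau_star, D1, A1, A2=51.6, T_star=373.15):
    """Cross-WLF viscosity: eta = eta0 / (1 + (eta0*gdot/tau_star)**(1 - n)),
    with eta0(T) = D1*exp(-A1*(T - T_star) / (A2 + (T - T_star)))."""
    eta0 = D1 * np.exp(-A1 * (T - T_star) / (A2 + (T - T_star)))
    return eta0 / (1.0 + (eta0 * gdot / tau_star) ** (1.0 - n))

# The lecture's fitted standard-model values, evaluated at 200 C (473.15 K)
eta = cross_wlf(gdot=100.0, T=473.15, n=0.281, tau_star=23.5e3, D1=9.0e11, A1=31.3)
```

The model captures shear thinning: viscosity falls monotonically as the shear rate grows past the critical stress τ*.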

Transient Data

• Raw data, sampled at 0.2 second increments, describes the shear stress history of the material

• The transient data very clearly exhibits viscoelasticity

[Plot: apparent shear stress (Pa) vs. time (s) at T = 180, 200, 220 °C, with the stress model overlaid]

Standard Viscosity Model Results (Cross-WLF Model)

• R² is 0.97, but

– There is some bias in the model predictions

– No transient behavior

Parameter  Value
n          0.281
τ*         23.5·10³ Pa
D1         9.0·10¹¹ Pa·s
A1         31.3
A2         51.6 K (assumed)

[Plot: viscosity (Pa·s) vs. shear rate (1/s) at T = 180, 200, 220 °C with Cross-WLF model fit]

Note step response and transients

[Panels: measured shear stress (Pa) vs. polymer pseudo time (s) with first-order model fits: T = 180 °C, γ̇ = 257 s⁻¹, θ = 1.83 s (fit: 91.36%); γ̇ = 32 s⁻¹, θ = 3.92 s (fit: 91.21%); γ̇ = 2048 s⁻¹, θ = 0.415 s (fit: 95.82%)]

• Viscoelastic response estimated at each shear rate

– First order model: pole and integrator

– Minimizes the prediction error for an autoregressive moving-average (ARMA) model

• K is the steady-state value

• θ is the time constant

Viscoelastic Modeling, Approach 1: System Identification

  G(s) = K / (s·(θs + 1))

Viscoelastic Modeling, Approach 1: System Identification Results

[Plot: relaxation time (s) vs. stress (Pa); SysID at T = 180, 200, 220 °C (MF = 6 g/10 min) and Isayev, 2003 (MF = 14 g/10 min)]

• Models agree with expectations

Viscoelastic Modeling, Approach 2: Proposed Constitutive Model

• Fitting loop: rheology data → calculate stress error → converged? If no, optimize A1, D1, n, τ* (with A2 = 51.6 and T* = 373.2 fixed) and repeat

• λ is the relaxation time at the critical shear stress (55.7 s for this PS at 200 °C)

  a_T(T) = exp(−A1·(T − T*) / (A2 + (T − T*)))

  η0(T) = D1·a_T(T)

  η(T, γ̇, t) = η0(T) / (1 + η0(T)·γ̇(t) / τ*)^(1−n)

  σ(t) = η·γ̇(t) + ∫_{−∞}^{t} M(t − τ) dτ

  M(t − τ) = (σ(t) − η·γ̇(t)) / (a_T(T)·θ(t)) · exp(−(t − τ) / θ(t))

  τ = ∫_0^t ds / a_T(T)

  θ(t) = λ·τ* / σ(t)

Viscoelastic Stress Constitutive Model: Results Comparison

[Plot: relaxation time (s) vs. shifted shear stress (Pa); SysID at T = 180, 200, 220 °C (MF = 6 g/10 min), dual fit, constitutive stress model, characteristic relaxation, and Isayev, 2003 (MF = 14 g/10 min)]

[Plot: viscosity (Pa·s) vs. shear rate (1/s) at T = 180, 200, 220 °C; Cross-WLF model vs. proposed model]

Proposed Model: Results

[Plot: apparent shear stress (Pa) vs. time (s) at T = 180, 200, 220 °C with the stress model; parity plot of predicted vs. observed shear stress (Pa): y = 0.9979·x, R² = 0.9943]

Parameter  Viscous
n          0.277
τ*         24.7·10³ Pa
D1         2.28·10¹² Pa·s
A1         32.8
λ          55.7 s

Result Comparisons

Parameter  Standard Model  Proposed Model
n          0.281           0.277
τ*         23.5·10³ Pa     24.7·10³ Pa
D1         9.0·10¹¹ Pa·s   2.28·10¹² Pa·s
A1         31.3            32.8
λ          N/A             55.7 s
Slope, m   0.9851          0.9979
R²         0.9718          0.9943
MAPE(τ)    15.6%           7.0%

MAPE(τ): mean absolute percentage error of the apparent shear stress

λ is the added relaxation parameter

Conclusions

• The standard model has 4 coefficients & R² ≈ 0.97

• Adding a fifth coefficient allows the modeling of the transient behavior

– Allowing the model to better predict complex behavior, R² ≈ 0.998

• Better models enable us to push the limits of the material & designs

– Saving many millions or billions of dollars


Motivation

• Manufacturing process optimization objectives:

– Maximize robustness or yield, and

– Minimize costs

• While there are typically multiple decision variables, the optimizations so far sought process settings that minimize or maximize a single objective

• However, sometimes it is useful to explicitly consider the trade-off between two objectives, such as cost and quality, weight and stiffness, or other decision criteria

Pareto Optimality

• A solution is said to be Pareto optimal when no improvement can be made in one of the objectives without degrading the performance of a different objective

• For example, we can find the maximum quality at different costs to define the Pareto optimal boundary or “efficient frontier”

[Figure: quality vs. cost, with the efficient frontier marked]
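The definition above can be made concrete with a dominance test. The Python sketch below (illustrative; the points are made-up, with cost minimized and quality maximized) filters a point set down to its Pareto optimal subset:

```python
def dominates(a, b):
    """a, b are (cost, quality) pairs. a dominates b if a is no worse in
    both objectives (lower cost, higher quality) and strictly better in one."""
    return (a[0] <= b[0] and a[1] >= b[1]) and (a[0] < b[0] or a[1] > b[1])

def pareto_front(points):
    """Keep only points not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

pts = [(1.0, 0.2), (2.0, 0.5), (2.5, 0.4), (4.0, 0.9)]
front = pareto_front(pts)
# (2.5, 0.4) is inferior: it costs more than (2.0, 0.5) yet delivers less quality
```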

Infeasible & Inferior Points

• The Pareto optimal boundary shows the best that is possible for a given system

– Points outside the boundary are desired but infeasible

– Points inside the boundary are inferior since:

• The cost can be reduced while providing the same quality, or

• The quality can be improved while reducing the cost

[Figure: quality vs. cost, annotating an inferior point (quality can be increased at the same cost, or cost decreased at the same quality) and a desirable but infeasible point]

Decision Maker Preferences

• We know we want maximum quality and minimum cost, but how do we trade off?

• Which of the three points, or other location on the boundary, is best?

– Well, it depends on the application and the decision maker’s preferences

• If you’re making straws, then choose blue

• If you’re making iPhones, then choose red

[Figure: quality vs. cost with three candidate points on the efficient frontier]

Pareto Optimization Method

• The solution of the Pareto optimal set can be obtained by performing a series of optimizations to minimize the function:

– where there are two objective functions, f1 and f2

– w is a weighting coefficient that trades off f1 and f2

– For this formulation to be effective, both objective functions need to be minimized and of similar magnitude

min_x  w·f1 + (1 − w)·f2
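The weight sweep can be sketched in Python with scipy’s L-BFGS-B, which plays the same role as R’s optim (L-BFGS-B) and Matlab’s fmincon used in the DVD example. The two quadratic objectives are toy assumptions for illustration:

```python
import numpy as np
from scipy.optimize import minimize

f1 = lambda x: float(x[0] ** 2)           # first objective (minimize)
f2 = lambda x: float((x[0] - 1.0) ** 2)   # second objective (minimize)

front = []
for w in np.arange(0.05, 1.0, 0.05):      # sweep the weighting coefficient
    res = minimize(lambda x: w * f1(x) + (1 - w) * f2(x),
                   x0=[0.5], method="L-BFGS-B", bounds=[(-2.0, 2.0)])
    front.append((f1(res.x), f2(res.x)))  # one Pareto point per weight
```

Plotting f1 against f2 over the sweep traces out the efficient frontier: as w grows, f1 improves at the expense of f2.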

Example: DVD Pareto Optimality

• Find the Pareto optimal set to trade off process time, t, and rolled up process capability, CP

– The example uses quadratic response surfaces derived from a 45 run design of experiments with over 2,000 DVDs with 15 quality attributes

min_x  w·t − (1 − w)·C_P

Example: DVD Pareto Optimality (Matlab code)

• The implementation iterates across different weights to scale the time & CP in the objective

– fmincon plays the same role here as R’s optim with method L-BFGS-B (box-constrained minimization)

– The results are stored in global variables t & cp

function find_Pareto

global B LSL USL ystd cpk iOpt y g k w t cp % Define global variables

data=csvread('DVD_All_Data.csv'); % Load 575 cycle data set

… % Same as previous example

k=0; % Set results counter to 0

for w=0.05:0.05:0.95 % Loop on weightings

k=k+1; % Increment result counter

fmincon(@Pareto_time_cp,x0,[],[],[],[],Xmin,Xmax,[]); % Optimize for w

end

plot(t,cp,'ko','LineWidth',2); % Plot results

xlabel('Processing Time (s)');ylabel('Rolled-Up Process Capability')

end

function f=Pareto_time_cp(xp)

global B LSL USL ystd cpk iOpt y g k w t cp % Define global variables

x(1)=1; % Define intercept

for i=1:9 % Loop on each setting, x_i

x(i+1)=xp(i); % Define linear terms

x(9+i+1)=xp(i)^2; % Define quadratic terms

end

for j=1:8 % Loop on all outputs, y_j

y(j)=x*B(:,j); % Evaluate process output

cpk(j)=min([(USL(j)-y(j))/(3*ystd(j))... % Evaluate Cpk from eq 12-4

(y(j)-LSL(j))/(3*ystd(j))]);

end

DPMO=1e6*(1-erf(3*cpk/2^0.5)); % Calculate defect rates

yield=1-sum(DPMO)/1e6;yield=max([1e-9, yield]);yield=min([1,yield]);

t(k)=(2+xp(3)+xp(5)+xp(7)); % Calculate processing time

cp(k)=2^0.5/3*erfinv(yield); % Calculate rolled up Cp

f=w*t(k)+(1-w)*(-cp(k)); % Calculate weighted objective

end
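The Cpk → DPMO → yield → rolled-up capability chain in the Matlab code above can be mirrored in Python. One substitution is assumed: the Python stdlib has no erfinv, so the sketch uses the equivalent normal quantile, since erfinv(y)·√2 = Φ⁻¹((1 + y)/2):

```python
import math
from statistics import NormalDist

def rolled_up_cp(cpk_values):
    """Roll individual Cpk values into one capability index, as in the
    Matlab code: Cpk -> DPMO -> overall yield -> rolled-up Cp.
    NormalDist().inv_cdf((1+y)/2) stands in for Matlab's erfinv(y)*sqrt(2)."""
    dpmo = [1e6 * (1 - math.erf(3 * c / math.sqrt(2))) for c in cpk_values]
    yield_ = 1 - sum(dpmo) / 1e6                 # overall first-pass yield
    yield_ = min(1.0, max(1e-9, yield_))         # clamp, as in the Matlab code
    return NormalDist().inv_cdf((1 + yield_) / 2) / 3

cp = rolled_up_cp([1.5, 1.8, 2.0])  # rolled-up Cp sits below the worst Cpk
```

Because defect rates add across the quality attributes, the rolled-up capability is always at or below the weakest individual Cpk.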

Example: DVD Pareto Optimality

• The objective function is just a weighted sum of time & CP


function f=Pareto_time_cp(xp)

global B LSL USL ystd cpk iOpt y g k w t cp % Define global variables

x(1)=1; % Define intercept

for i=1:9 % Loop on each setting, x_i

x(i+1)=xp(i); % Define linear terms

x(9+i+1)=xp(i)^2; % Define quadratic terms

end

for j=1:8 % Loop on all outputs, y_j

y(j)=x*B(:,j); % Evaluate process output

cpk(j)=min([(USL(j)-y(j))/(3*ystd(j))... % Evaluate Cpk from eq 12-4

(y(j)-LSL(j))/(3*ystd(j))]);

end

DPMO=1e6*(1-erf(3*cpk/2^0.5)); % Calculate defect rates

yield=1-sum(DPMO)/1e6;yield=max([1e-9, yield]);yield=min([1,yield]);

t(k)=(2+xp(3)+xp(5)+xp(7)); % Calculate processing time

cp(k)=2^0.5/3*erfinv(yield); % Calculate rolled up Cp

f=w*t(k)+(1-w)*(-cp(k)); % Calculate weighted objective

end

DVD Pareto Optimality Results

• The results: Higher process capability requires increased processing time & cost

– In this case, I would use:

• A time of 4.1 s with a CP of 2.15, or

• A time of 3.4 s with a CP of 1.75

[Figure: rolled-up process capability (1.6–2.4) vs. processing time (3.2–5 s); circles trace the efficient frontier, with “better” toward shorter time and higher capability]

Discussion

• Pareto optimal sets are effectively the result of multiple optimizations with different weightings of included objective functions

• Of course, it is possible to define a single objective for a given set of weightings

– However, this would defeat the purpose since the Pareto set provides a very insightful trade-off between different objective functions

• We, the decision makers, can then understand the process and make better decisions

Uncertainty

• The Achilles heel of optimization is, without a doubt, the model uncertainty

• Modeling uncertainty is a huge issue

– If the models are wrong, then all this work is meaningless…

Modeling Uncertainty

• Remember the example in which the transformed factor exp(x) increased the correlation coefficient from 0.907 to 0.98?

• Well, the transformed model still doesn’t fit the data perfectly

– The data has variation

– The model has uncertainty

[Figure: scatter of y vs. x with a linear fit, R² = 0.907]

Sources of Uncertainty

• Uncertainty is often caused by a mismatch of the model topology relative to the behavior of the system being modeled

– Can we ever know the “true” system behavior?

• Another source of uncertainty is the validity of the sample to the larger population

– To some extent, this source of uncertainty can be reduced by increasing the number of samples

• A third source is the capability of the measurement systems (validate with gage R&R)

Handling Uncertainty

• In the face of uncertainty, engineers can artificially tighten the system specifications to guarantee satisfactory performance:

– where σ_error is the standard deviation of the residual errors from the model, and k is a factor, often between 1 and 3, by which the specs are tightened

min_x f(x), subject to:

LPL_i ≤ x_i ≤ UPL_i

LSL_j + k·σ_error ≤ y_j ≤ USL_j − k·σ_error
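A minimal Python sketch of the spec-tightening step (illustrative; the residuals and limits are made-up, and the function name is an assumption):

```python
import statistics

def tightened_limits(lsl, usl, residuals, k=2.0):
    """Tighten spec limits by k times the standard deviation of the model
    residuals, per the constraint LSL + k*sigma_err <= y <= USL - k*sigma_err.
    Raises if the tightened window closes (the infeasible case)."""
    s_err = statistics.stdev(residuals)          # sigma_error from model residuals
    lsl_t, usl_t = lsl + k * s_err, usl - k * s_err
    if lsl_t >= usl_t:
        raise ValueError("specs over-tightened: no feasible window")
    return lsl_t, usl_t

lo, hi = tightened_limits(10.0, 20.0, residuals=[-0.4, 0.1, 0.5, -0.2, 0.3], k=2.0)
```

With a larger k (or noisier residuals) the window narrows until the ValueError fires, which corresponds to the “no solution at all” case on the next slide.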

Handling Uncertainty

• The tightening of the specifications forces the optimization to move to a more robust, but often more expensive, region of the process

– Note that if the specifications are overly tightened, then the process becomes infeasible

• In the example below, tightening 1 is feasible, but 2 will lead to a low yield, and 3 will have no solution at all

[Figure: response y1 over settings x1 and x2, with USL1 and LSL1 drawn at three levels of tightening]

Example: DVD Uncertainty

• Find the Pareto optimal set to trade off process time and rolled up process capability while tightening the specification limits

– The example uses the same 45 run CCD DOE with quadratic models for eight quality attributes as a function of six process settings

min_x  w·t − (1 − w)·C_P

Example: DVD Uncertainty

• The code is nearly identical to the last example

– The only difference is the calculation of the standard error and tightening of the spec limits

function opt_max_yield

global B LSL USL ystd cpk iOpt y g k w t cp % Define global variables

data=csvread('DVD_All_Data.csv'); % Load 575 cycle data set

… % Same as previous example

for j=1:8 % Loop on all outputs, y_j

[B(:,j),BINT,R,RINT,STATS]=regress(y(:,j),X); % Perform regression models

y_pre=X*B(:,j); % Predicted outputs at the DOE settings

s_error=std(y_pre-y(:,j)); % Standard deviation of residuals

LSL(j)=LSL(j)+s_error; % Tighten lower spec limit

USL(j)=USL(j)-s_error; % Tighten upper spec limit

end

k=0;

for w=0.05:0.05:0.95

k=k+1;

fmincon(@Pareto_time_cp,x0,[],[],[],[],Xmin,Xmax,[]);

end

plot(t,cp,'kx','LineWidth',2);

xlabel('Processing Time (s)');ylabel('Rolled-Up Process Capability')

end

DVD Pareto Optimality Results

• When model uncertainty is included, the process capability drops significantly

– I’d suggest trying the 4.75 s cycle time, then reducing it once we verify OK yields

[Figure: rolled-up process capability vs. processing time (s), comparing the frontier with no uncertainty (circles) to the lower frontier with uncertainty (crosses); “better” is toward shorter time and higher capability]

Discussion

• Model-based optimization is serious business

– Productivity can often be improved by 20% or more without added investment just by changing the design parameters!

• The limiting issue in application is the characterization and development of high fidelity models

– Many companies lack the human resources and commitment to make it happen, and settle for suboptimal systems

Agenda

• Introduction
– Topology
– Linear regression
– Transforms

• Multivariate Modeling
– Overview (and error norms)
– Univariate Optimization: R's optimize
– Multivariate Optimization: R’s L-BFGS-B

• The Big Picture: Two Real Examples
– High Fidelity Simulation & Polymer Rheology
– Process Optimization of DVD Manufacturing

• Conclusions

• Regularization acts as a type of filter

– Filters can be vital in application, but can significantly affect model fidelity/behavior

– Be careful: the model can regress to the filter

Reflection: Bishop Chapter 3

• Increased model complexity tends to provide better “fits” given more degrees of freedom

• However, complex models can be less robust when their parameters do not reflect reality

– Referred to as overfitting, but really a topology issue

Reflection: Bishop Chapter 3

• Not all model terms are equally valuable

– Consider the yellow curve with varying # of parameters

• Many statistics packages, such as Minitab, perform best-subset regression to select the most valuable combination of terms

– An automated form of supervised learning that does not guarantee correct topology

Reflection: Bishop Chapter 3

• Estimation of confidence intervals or prediction error is crucial to decision support

– Computers don’t make decisions, people do.

Major Take-Away

• Supervised learning is limited by the topology of the model being fitted…

– Does the model form/bases reflect reality?

• Human insight can guide selection of basis functions

• Model coefficients can inform us about the system dynamics

– It is not about model complexity (and number of coefficients) but about having the right topology

• There are huge opportunities for those who can glean meaning from big data


Recommended