Course Logistics
A Statistical View of Deep Learning Part 1 | Jennifer Hoeting, Colorado State University 102 / 150
Course logistics
• Lectures are in Zoom.
• If something goes very wrong (Zoom dies), check Slack for messages.
• When possible, please turn on your camera.
• Skills lecture: I’m giving a skills lecture in 2 days. It will be highlights of this course.
Questions
• Ask questions on Slack in the short-course-deeplearning channel.
• Problems with Zoom: on Slack, ask “Darwinia Elf”
• Lecture content questions: Jennifer Hoeting
• Computing questions: ask any Course Assistant
  1. Tess Hamzeh: R, installation issues for Windows
  2. Winston Hilton: R, installation issues for Windows [day 1 only]
  3. Rachael Krawczyk: R, Python, installation issues for Macs
Course materials
• Course materials: see the class website tinyurl.com/HoetingShortCourse or google Jennifer Hoeting
• We’ll start with slides part 1 today.
• Slide numbers:
  • 320/350 means 20th out of 50 slides in packet 3
• URLs are in blue (click to see full link)
Schedule
Breakout rooms in Zoom
• Use for coding questions and getting to know your fellow students during breaks.
• If you want to switch groups, on Slack ask “Darwinia Elf” or a Course Assistant (Tess, Winston, Rachael) to move you.
Approximate schedule
• Hour 1: lecture (Zoom)
• Hour 2: computing (breakout rooms in Zoom)
• Hour 3: lecture (Zoom)
A Statistical View of Deep Learning in Ecology
Jennifer Hoeting
Colorado State University
June 2020
Traditional approach: deep learning as a black box algorithm
Image source: expoundai.wordpress.com
Our goal: open the black box of deep learning
A Statistical View of Deep Learning in Ecology
Part 1: Introduction
• Introduction to machine learning
• Introduction to deep learning
Part 2: Going deeper
• Neural networks from 3 viewpoints
• Mathematics of deep learning
• Model fitting
• Types of deep learning models
Part 3: Deep learning in practice
• Ethics in deep learning
• Deep learning in ecology
Introduction to machine learning
Deep learning is a small part of Machine Learning
Image source: www.edureka.co/blog
Machine learning versus statistics
Goals:
• Statistics: inference
• Machine learning: prediction
Machine learning: brief overview
Types of machine learning
• Supervised learning
• Unsupervised learning
• Semi-supervised learning
• Reinforcement learning
Machine learning: Supervised learning
Observe: response y and predictors x1, ..., xp
Goal: fit a model that relates the response to the predictors
Image source: blogs.nvidia.com
Machine learning in R
• caret package
  • One-stop solution for machine learning in R
  • Over 235 models available
  • Try different models using one package and one syntax
• tidymodels package
  • Sort of caret in the tidyverse world
  • New package (version 0.1.0) so fewer resources available
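Caret package
• Streamlines the process of creating predictive models
• train function

Example: Fit a Generalized Boosted Regression model.

    library(caret)
    # Assumption: `training` is a data frame with a factor column Class;
    # the 10-fold setting below is illustrative, not taken from the slide.
    fitControl <- trainControl(method = "cv", number = 10)
    gbmFit1 <- train(Class ~ ., data = training,
                     method = "gbm",          # model
                     trControl = fitControl,  # sets up cross validation
                     verbose = FALSE)         # gbm function option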
Max Kuhn’s caret package guide
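The trControl argument above sets up cross validation. As a rough sketch of the idea only: the observations are shuffled into k folds, and each fold is held out once for testing while the remaining folds are used for fitting. This is plain Python rather than R and is not how caret implements resampling; kfold_indices, cross_validate and fit_and_score are hypothetical names introduced for illustration.

```python
import random


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(n, k, fit_and_score, seed=0):
    """Average a score over k train/test splits.

    fit_and_score(train_idx, test_idx) is a user-supplied (hypothetical)
    callback that fits a model on the training rows and returns a score
    computed on the held-out rows.
    """
    folds = kfold_indices(n, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except fold i is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(train_idx, test_idx))
    return sum(scores) / k


# Dummy callback: report the size of each held-out fold.
mean_test_size = cross_validate(10, 5, lambda train, test: len(test))
```

With 10 observations and 5 folds, each held-out fold has 2 observations, and no observation appears in both the training and test sets of the same split.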
Observe: response y and predictors x1, ..., xp
Goal: fit a model that relates the response to the predictors
Examples:
• Linear classifiers (logistic regression)
• Support vector machines
• Tree-based methods (random forests, xgboost)
• Neural networks and deep learning
Regression:
• Response is continuous
• Ex: body temperature
Classification:
• Response is categorical
• Ex: CSU student (1) vs non-student (0)
Image source: hatbotsmagazine.com
Regression:
• Response y is continuous
• Predictors x1, ..., xp are numeric
• Caret: lm, ridge regression, LASSO, and many more
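To make the regression setting concrete, here is a minimal least-squares sketch for a continuous response and a single numeric predictor. It is plain Python under simplifying assumptions (one predictor, made-up data); fit_simple_ols is a hypothetical helper, not a caret function.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for the model y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope = sample covariance of (x, y) divided by sample variance of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data lying exactly on the line y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]
b0, b1 = fit_simple_ols(x, y)
```

Because the made-up data lie exactly on a line, the fit recovers an intercept of 1 and a slope of 2.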
Logistic Regression model:
$$y \sim \mathrm{Bernoulli}(p)$$
$$\log\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p$$
• Response y is binary (0, 1)
• Predictors x1, ..., xp are numeric
• Binary classification
• Caret: offers 11 models
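The logistic regression model can be fit by maximizing the Bernoulli log-likelihood. The sketch below does this with plain gradient ascent for a single predictor; that fitting method, the learning rate, the step count and the data are all assumptions made for illustration, and fit_logistic is a hypothetical helper rather than anything caret provides.

```python
import math


def sigmoid(z):
    """Inverse of the logit link: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, lr=0.1, steps=5000):
    """Gradient ascent on the mean Bernoulli log-likelihood (one predictor)."""
    b0, b1 = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            resid = yi - sigmoid(b0 + b1 * xi)  # observed minus fitted probability
            g0 += resid
            g1 += resid * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Made-up binary data: larger x makes y = 1 more likely, with some overlap.
x = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(x, y)
p_low = sigmoid(b0 + b1 * -2.0)   # fitted P(y = 1) at x = -2
p_high = sigmoid(b0 + b1 * 3.0)   # fitted P(y = 1) at x = 3
```

The fitted slope is positive, so the predicted probability of y = 1 increases with x, and taking log(p / (1 - p)) of any fitted probability returns the linear predictor b0 + b1 * x.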
Tree-based Models
• Use hierarchy of binary splits to partition the predictor space
• Classification: uses frequency of response in each node
• Regression: uses average response in each terminal node
• Caret: boosted trees, eXtreme Gradient Boosting, CART, ...
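A single binary split is the building block of these models. As a minimal sketch for the regression case, the code below searches one predictor for the threshold that minimizes the summed squared error around the average response on each side. It is plain Python with made-up data; best_split is a hypothetical helper, and real tree software adds recursion, stopping rules and pruning.

```python
def best_split(x, y):
    """Return (threshold, sse) for the best single binary split on x."""

    def sse(values):
        # Squared error around the node average, which is the node's prediction.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_threshold, best_error = None, float("inf")
    xs = sorted(set(x))
    # Candidate thresholds are midpoints between consecutive distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        error = sse(left) + sse(right)
        if error < best_error:
            best_threshold, best_error = t, error
    return best_threshold, best_error


# Made-up data with an obvious jump in the response between x = 3 and x = 10.
x = [1, 2, 3, 10, 11, 12]
y = [5.0, 6.0, 5.5, 20.0, 21.0, 19.0]
threshold, error = best_split(x, y)
```

Here the chosen threshold is 6.5, the midpoint of the gap, and the two terminal nodes predict the averages 5.5 and 20.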
Image source: James et al (2015) Introduction to Statistical Learning
Decision Trees
• Regression trees find boundaries based on splits in the predictors
• Work best when the true boundary follows axis-aligned splits of the predictors
• Caret: boosted trees, eXtreme Gradient Boosting, CART, ...
Image source: James et al (2015) Introduction to Statistical Learning
Random Forests
• Random forests are an ensemble of decision trees
• Each split considers only a random subset of the predictors
• Uses: classification and regression
• Averaging over many trees adds robustness
• Caret: Random Forest, Random Ferns, ...
Image source: James et al (2015) Introduction to Statistical Learning
Support Vector Machines
• A hyperplane separates the classes
• Kernels can produce a nonlinear decision boundary
• Uses: classification and regression
• Caret: Support Vector Machines with ... kernel
Image source: James et al (2015) Introduction to Statistical Learning
Machine learning: Unsupervised learning
Observe: predictors x1, ..., xp only (no response)
Goal: Understand the relationship between the variables
• Most big datasets do not come with labels
• Statisticians call this multivariate analysis
Examples:
• Cluster analysis (K-means, hierarchical)
• Principal components analysis
Cluster analysis
• K-means, hierarchical clustering, mixture models
• Goal: Find natural groupings in the data. Groups are not known in advance.
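As an illustration of finding groups that are not known in advance, here is a minimal K-means sketch (Lloyd's algorithm) in plain Python. The data, the seed and the function name kmeans are assumptions made for this example; library implementations add better initialization and multiple restarts.

```python
import random


def kmeans(points, k, iters=100, seed=0):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:  # assignments have stopped changing
            break
        centers = new_centers
    return centers, clusters


# Made-up data: two well-separated groups of three points, with no labels given.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, k=2)
```

On this made-up data the two recovered clusters are the three points near the origin and the three points near (10, 10).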
Image source: James et al (2015) Introduction to Statistical Learning
Principal components analysis (PCA)
• Finds the linear combinations of the predictors that explain the most variation in the data
• Goal: reduce a large number of variables to a smaller number while losing as little information as possible
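To illustrate the first of those linear combinations, the sketch below computes the first principal component of a small made-up data set. It uses power iteration on the sample covariance matrix, a technique chosen here only to keep the example dependency-free; statistical software typically uses a singular value or eigen decomposition instead. The function name first_principal_component is hypothetical.

```python
import math


def first_principal_component(rows, iters=200):
    """Return (direction, share): the unit vector of largest variance and
    the share of total variance it explains."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the predictors.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        # Power iteration: repeatedly multiply by cov and renormalize.
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(p) for j in range(p))
    total = sum(cov[i][i] for i in range(p))
    return v, variance / total


# Made-up data: two strongly correlated variables.
rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
direction, share = first_principal_component(rows)
```

Because the two made-up variables move together, a single component pointing roughly along the diagonal explains more than 99% of the total variance, so two variables can be reduced to one with little loss of information.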
Image source: Lapolla et al. 2009, Journal of Mass Spectrometry
Scikit-learn algorithm cheat sheet
Useful but only includes algorithms in the scikit-learn machine learning library in Python.
Image sources for previous two slides:
• scikit-learn algorithm cheat sheet
• extended version by Chris Bour
Introduction to Deep Learning
What is deep learning?
Computers: solve problems that are hard or tedious for humans
Humans: solve problems that are intuitive but difficult to describe in a formal set of rules
Deep learning:
• Allows computers to learn from experience
• Hierarchy of concepts with each concept defined through its relation to simpler concepts
• The goal of deep learning is prediction
• Learned representations (transformations)
  • Classic statistical and machine learning approaches assume a pre-specified representation
  • Deep learning learns a representation
• Deep learning uses a hierarchy of layers which allows the algorithm to learn abstract concepts
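A small sketch of what a hierarchy of layers buys: the two-layer network below computes XOR, which no single linear layer can represent. The hidden layer transforms the inputs into a representation in which the output is a simple linear function. The weights are hand-picked for illustration rather than learned, and dense and tiny_network are hypothetical names; in practice the weights would be estimated from data.

```python
def relu(z):
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation):
    """One layer: a weighted sum per unit followed by an activation."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def tiny_network(x):
    # Layer 1: a new representation of the inputs (weights fixed by hand).
    hidden = dense(x, [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    # Layer 2: a linear function of that hidden representation.
    output = dense(hidden, [[1.0, -2.0]], [0.0], lambda z: z)
    return output[0]


xor_outputs = [tiny_network(x) for x in ([0, 0], [0, 1], [1, 0], [1, 1])]
```

The four outputs are 0, 1, 1 and 0, matching XOR of the two inputs.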
• Hierarchy of concepts is learned and builds upon itself, resulting in a deep graph of related concepts
• Each layer contains increasingly complex concepts
Examples of deep learning
• Speech recognition and natural language processing
  • Virtual assistants (Alexa, Siri, Cortana)
  • Translation (natural language processing)
  • Chatbots
• Image analysis and categorization
  • Image categorization
  • Facial recognition
  • Driverless vehicles
• Big messy data
  • Medicine and pharmaceuticals
  • Recommender systems (Netflix recommendations)
  • Biology/ecology
You use deep learning every day
Example: Google automatically sorts your photos into albums
Deep learning in Medicine
Source: https://jamanetwork.com/journals/jama/fullarticle/2588763
Deep learning in Medicine
A deep learning algorithm using CT images to screen for Corona Virus Disease (COVID-19), Shuai Wang et al., posted Feb 17, 2020
Source: www.medrxiv.org/content/medrxiv/early/2020/02/17/2020.02.14.20023028.full.pdf
Deep learning can even be used to create art
Image source: blog.udacity.com
Advantages of deep learning algorithms
• Can produce excellent predictions (when tuned correctly)
• Standard regression model assumptions are not required
• Influence of outliers can be dampened
• Good for large complex data sets
Caution: Deep Learning isn’t useful for every problem!
Disadvantages of deep learning algorithms
• Not useful for understanding the system being studied
  • Ecology: theoretical vs deterministic models
• Easy to overfit, leading to poor predictions
• Require large datasets
• Not based on a probability model
  • Little statistical theory for inference, diagnostics or model selection
  • No uncertainty estimates
• Hard to keep up with the field
A Statistical View of Deep Learning in Ecology
Part 1: Introduction
• Introduction to machine learning
• Introduction to deep learning
Part 2: Going deeper
• Neural networks from 3 viewpoints
• Mathematics of deep learning
• Model fitting
• Types of deep learning models
Part 3: Deep learning in practice
• Ethics in deep learning
• Deep learning in ecology
Some useful references on deep learning
• Deep Learning by I. Goodfellow, Y. Bengio and A. Courville (2016), MIT Press
• Deep Learning with R by F. Chollet and J.J. Allaire (2018), Manning Publications
  • Chapters 1-3 available online
  • Focus is on Keras
Additional references that I used to develop this presentation
Books:
• Computational Statistics by Givens and Hoeting (2nd edition, Wiley)
• The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman
• Introduction to Statistical Learning by James, Witten, Hastie, Tibshirani
Course materials by: Ian Goodfellow, Darren Homrighausen, Ander Wilson, Asa Ben-Hur
Thank you to
• ISEC2020 organizers, especially David Warton and Gordana Popovic
• Course assistants: Tess Hamzeh, Winston Hilton, Rachael Krawczyk
• Alison Ketz and Dan Walsh
Acknowledgment of funding support
This material is based upon work supported by the National Science Foundation (NSF) Grant No. AGS-1419558, the US Geological Survey (USGS) (G17AC00409), and Colorado State University (CSU). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the NSF, USGS or CSU.