Recitation
Lyle Ungar
Computer and Information Science
Learning Objectives
- Generalized linear models and RBFs
- Loss functions for non-parametric methods
- Selection of loss functions
- Selection of regression penalties
Non-parametric loss function
- When doing k-NN with y a real number, what is the loss function L(y, ŷ) being minimized?
- When doing decision trees with y a Boolean, what is the loss function being minimized?
Breakout
Non-parametric loss function
- When doing k-NN with y a real number, what is the loss function L(y, ŷ) being minimized?
  - k-NN doesn’t really have a loss function that it is minimizing. It is just an algorithm.
  - There is no learning/optimization, so no gradient descent.
- When doing decision trees with y a Boolean, what is the loss function being minimized?
  - The conditional entropy of y given the features x
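As a rough illustration of the decision-tree answer, the sketch below computes the conditional entropy H(y | x) for a single Boolean feature. The data and the function names `entropy` and `conditional_entropy` are made up for this example; a real tree learner would evaluate this quantity for every candidate split and pick the one that lowers it most.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def conditional_entropy(feature, labels):
    """H(y | x) = sum over feature values v of P(x = v) * H(y | x = v)."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature, labels):
        groups.setdefault(v, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())

# Hypothetical toy data: one Boolean feature x and a Boolean label y.
x = [True, True, True, False, False, False]
y = [True, True, False, False, False, False]

h_y = entropy(y)                          # uncertainty in y before splitting
h_y_given_x = conditional_entropy(x, y)   # uncertainty left after splitting on x
info_gain = h_y - h_y_given_x             # reduction achieved by the split
print(h_y, h_y_given_x, info_gain)
```

Minimizing the conditional entropy is the same as maximizing the information gain, since H(y) does not depend on the split.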
Model complexity
- Increasing k in k-NN yields a better-fitting, more complex model.
False; it gives a simpler model.
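One way to see why a larger k means a simpler model is to compare the two extremes. This is a minimal sketch on invented 1-D data, with `knn_predict` a hypothetical helper: with k = 1 the prediction at each training point is its own label, while with k equal to the number of training points every query gets the same global mean.

```python
def knn_predict(train_x, train_y, query, k):
    """Predict y at `query` as the mean y of the k nearest training points (1-D x)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - query))[:k]
    return sum(y for _, y in nearest) / k

# Hypothetical training data.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.0, 1.2, 3.9, 9.1, 15.8, 25.3]

# k = 1: each training point is its own nearest neighbor (most flexible fit).
fit_k1 = [knn_predict(train_x, train_y, x, 1) for x in train_x]

# k = n: every prediction is the global mean of y (simplest fit, a constant).
fit_kn = [knn_predict(train_x, train_y, x, len(train_x)) for x in train_x]

print(fit_k1)
print(fit_kn)
```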
Which model to use?
$y = x^\top w$
Predict income based on age, sex, and country you were born in.
What exactly are x and y?
y: income
x: age, sex, and a “one-hot” vector indicating birth country
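A minimal sketch of how such an x could be built and plugged into the linear model. The country list, the 0/1 coding of sex, and the weights are all invented for illustration, and `encode` and `predict` are hypothetical helpers.

```python
# Hypothetical vocabulary of birth countries; a real one would be much longer.
COUNTRIES = ["US", "India", "Brazil"]

def encode(age, sex, country):
    """Build x = [age, sex indicator, one-hot birth country]."""
    one_hot = [1.0 if country == c else 0.0 for c in COUNTRIES]
    # Assumption: sex is coded as a single 0/1 indicator.
    return [float(age), 1.0 if sex == "F" else 0.0] + one_hot

def predict(x, w):
    """Linear model: y_hat = x^T w."""
    return sum(xi * wi for xi, wi in zip(x, w))

x = encode(30, "F", "India")
w = [1000.0, 500.0, 20000.0, 8000.0, 9000.0]  # made-up weights, not fitted
y_hat = predict(x, w)
print(x, y_hat)
```

With a one-hot vector, exactly one country weight contributes to each prediction, so each country effectively gets its own offset.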
Which loss function to use?
$\|y - Xw\|_p$
a) p=0
b) p=1
c) p=2
L1: data not Gaussian?
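To make the effect of p concrete, the sketch below fits the simplest possible model, a single constant, to made-up income data with one extreme value. It uses the sum of |r|^p rather than the norm itself, which has the same minimizer for p = 1 and p = 2. The L2-optimal constant is the mean and the L1-optimal constant is the median, so the L1 fit is pulled far less by the outlier.

```python
import statistics

def lp_loss(residuals, p):
    """Sum of |r|^p over residuals; for p = 0, count the nonzero residuals."""
    if p == 0:
        return sum(1 for r in residuals if r != 0)
    return sum(abs(r) ** p for r in residuals)

# Hypothetical incomes (in thousands) with one extreme value.
incomes = [20, 30, 40, 50, 1000]

def loss_of_constant(c, p):
    """Loss of predicting the same constant c for every person."""
    return lp_loss([y - c for y in incomes], p)

mean, median = statistics.mean(incomes), statistics.median(incomes)
print("mean:", mean, "median:", median)
print("L2 loss at mean, median:", loss_of_constant(mean, 2), loss_of_constant(median, 2))
print("L1 loss at mean, median:", loss_of_constant(mean, 1), loss_of_constant(median, 1))
```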
Which loss function to use?
You are building a model to estimate the cost, y, of a software project that you are bidding on as a contractor (as a function of lots of features of the project, including estimates of lines of code, hours of meetings, complexity of specifications).
a) p=0
b) p=1
c) p=2
L1? The true cost is linear, not quadratic
Which loss function to use?
You are writing a search algorithm that returns web pages as a function of the search query, the words on the web page the person is searching from, and the search history of that user.
You only care about getting a right answer among the top few. We’ll cover this later in the course
Which regression penalty to use?
$\text{Error} + \lambda_2 \|w\|_2^2 + \lambda_1 \|w\|_1 + \lambda_0 \|w\|_0$
- If you want the model to be scale invariant?
- If you want to have a small model?
- If you want a convex optimization problem?
a) p=0
b) p=1
c) p=2
Which regression penalty to use?
$\text{Error} + \lambda_2 \|w\|_2^2 + \lambda_1 \|w\|_1 + \lambda_0 \|w\|_0$
- If you want the model to be scale invariant? L0
- If you want to have a small model? L0 or L1
- If you want a convex optimization problem? L1 and/or L2
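The scale-invariance point can be checked numerically. In this sketch (weights invented for illustration), the first feature is re-expressed in units ten times smaller, so the weight that gives identical predictions is ten times smaller too. The L0 penalty only counts nonzero weights and so does not change, while the L1 and L2 penalties do.

```python
def l0(w):
    """Number of nonzero weights."""
    return sum(1 for wi in w if wi != 0)

def l1(w):
    """Sum of absolute weights."""
    return sum(abs(wi) for wi in w)

def l2_squared(w):
    """Sum of squared weights."""
    return sum(wi * wi for wi in w)

# Hypothetical weight vector with one feature already dropped (weight 0).
w = [2.0, 0.0, -0.5]

# Multiply feature 1 by 10 (a change of units); the equivalent weight is w1 / 10.
w_rescaled = [w[0] / 10, w[1], w[2]]

for name, penalty in [("L0", l0), ("L1", l1), ("L2^2", l2_squared)]:
    print(name, penalty(w), penalty(w_rescaled))
```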
- Your training error for ridge regression is substantially lower than your testing error.
- You should
a) increase λ
b) decrease λ
c) no change in λ
a)
- Your training error for ridge regression is the same as your testing error.
- You should
a) increase λ
b) decrease λ
c) no change in λ
c)
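The effect of the ridge penalty strength can be sketched in one dimension, where the solution has a simple closed form. The data are invented, and the model has a single weight and no intercept (an assumption made to keep the algebra short). Increasing the penalty shrinks the weight toward zero and raises the training error, which is the trade one accepts to narrow a large gap between training and testing error.

```python
def ridge_weight(xs, ys, lam):
    """Closed-form 1-D ridge solution (no intercept): w = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(xs, ys, w):
    """Mean squared error of the model y_hat = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical toy data, roughly y = 2x plus noise.
train_x, train_y = [1.0, 2.0, 3.0], [2.3, 3.7, 6.4]

for lam in [0.0, 1.0, 10.0, 100.0]:
    w = ridge_weight(train_x, train_y, lam)
    print(f"lambda={lam:6.1f}  w={w:.3f}  train MSE={mse(train_x, train_y, w):.3f}")
```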
What you should know
- Loss functions depend on the problem
- Basis functions allow one to fit a nonlinear function using linear regression
- Link functions give a nonlinear regression
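The basis-function point can be sketched as follows: map a scalar x to Gaussian RBF features, then fit the weights by ordinary least squares, so the model is nonlinear in x but still linear in w. The centers, width, target function, and the tiny ridge term added for numerical stability are all assumptions made for this example. The small hand-written solver is used only to keep the block free of dependencies.

```python
import math

def rbf_features(x, centers, width=1.0):
    """Map scalar x to [1, exp(-(x - c)^2 / (2 * width^2)) for each center c]."""
    return [1.0] + [math.exp(-((x - c) ** 2) / (2 * width ** 2)) for c in centers]

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w

def fit_least_squares(Phi, y, ridge=1e-8):
    """Solve the normal equations (Phi^T Phi + ridge * I) w = Phi^T y."""
    d = len(Phi[0])
    A = [[sum(row[i] * row[j] for row in Phi) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(row[i] * yi for row, yi in zip(Phi, y)) for i in range(d)]
    return solve(A, b)

xs = [i * 0.5 for i in range(13)]       # inputs 0.0, 0.5, ..., 6.0
ys = [math.sin(x) for x in xs]          # a nonlinear target, chosen for illustration
centers = [0.0, 1.5, 3.0, 4.5, 6.0]     # hypothetical RBF centers

Phi = [rbf_features(x, centers) for x in xs]   # nonlinear in x
w = fit_least_squares(Phi, ys)                 # linear regression in w
preds = [sum(f * wi for f, wi in zip(row, w)) for row in Phi]
max_err = max(abs(p - y) for p, y in zip(preds, ys))
print("largest training residual:", max_err)
```

Only the feature map changes; the fitting step is the same least-squares problem as for $y = x^\top w$.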
Gather.town
- https://gather.town/aQMGI0l1R8DP0Ovv/penn-cis