Date post: | 20-Dec-2015 |
The K Nearest Neighbor Algorithm (kNN)
Erik Zeitler
Uppsala Database Laboratory
23-04-18 Erik Zeitler 2
Examination
Examination is split into two parts
• Solve the assignment
• Oral examination
During the oral examination
• The instructor validates your program using a script
• Non-working program: the examination ends immediately (“fail” grade is given); you may re-do the examination later
• The instructor will ask questions on your implementation and on the method itself
• All group members must take part in the solution. Group members can get different grades on the same assignment.
Grades
Fail Pass Complete
Before end of semester
Examination
Why do we have the oral part? Are we out to get you?
• The assignments cover a good part of the course; understanding them will help you.
• If you have problems solving the assignment, please ask during office hours. The only way asking will affect your grade is that you might learn more.
Different things!
• Solving assignments
• Understanding your own solution
What you need to do
Sign up for oral exam
• Groups of 2 to 3 students
• Forms are on my office door, P1320
Implement a solution
• Deadline: Submit by e-mail 24h before your oral exam
• 1, 2: [email protected]
• 3, 4: [email protected]
Answer the questions on the form
• Bring one form per student
Prepare for oral exam:
• Study the theory behind
K Nearest Neighbor
Basic idea:
• If it walks like a duck and it quacks like a duck, then it must be a duck
So how do we know how a duck walks and talks?
• Either we ask the other ducks
• Or, if they are unavailable, look at who else is walking and talking this way.
Duck walking and talking
Assume that a duck
• has average step length 5…15 cm
• quacks at a frequency 600…700 Hz
On the other hand, consider a cow:
• step length is 30…60 cm
• a cow moos at 100…200 Hz
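The duck and cow ranges above can be turned into a small kNN sketch. The sample points are invented for illustration and only respect the stated ranges; the name `knn_classify` and the choice of Euclidean distance are assumptions, not something the slides prescribe.

```python
import math
from collections import Counter

# Hypothetical training points: (step length in cm, call frequency in Hz).
# The values are made up; they only fall inside the ranges given above.
TRAINING = [
    ((5.0, 600.0), "duck"),
    ((10.0, 650.0), "duck"),
    ((15.0, 700.0), "duck"),
    ((30.0, 100.0), "cow"),
    ((45.0, 150.0), "cow"),
    ((60.0, 200.0), "cow"),
]

def knn_classify(query, training, k):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(training, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_classify((12.0, 640.0), TRAINING, k=3))
print(knn_classify((40.0, 180.0), TRAINING, k=3))
```

Note that the frequency axis spans hundreds of Hz while step length spans tens of cm, so in this raw form the frequency dominates the Euclidean distance.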
Cows and Ducks in a Plot
Enter the Chicken
Classifying you using kNN
Each of you belongs to a group:
• [F|STS|Int Masters|Exchange Students|Other]
Let’s classify each one using 1-NN and 3-NN
How do we select our distance measure?
How do we decide which of 1-NN and 3-NN is best?
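The choice of distance measure matters because it can change who the nearest neighbor is. The sketch below compares two common candidates, Euclidean and Manhattan distance; the points `A` and `B` and the helper `nearest` are hypothetical and chosen only so that the two measures disagree.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest(query, points, distance):
    """Return the name of the point closest to query under the given distance."""
    return min(points, key=lambda name: distance(query, points[name]))

# Hypothetical points, picked so that the two measures rank them differently.
POINTS = {"A": (3.0, 3.0), "B": (0.0, 5.0)}
query = (0.0, 0.0)

print(nearest(query, POINTS, euclidean))  # A: sqrt(18) is less than 5
print(nearest(query, POINTS, manhattan))  # B: 5 is less than 6
```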
Things to Consider for the Assignment
Preprocessing
• What are the ranges of the different measurements?
• Is one characteristic more important than another? If so, how can we reflect this? If not, do we need to do something else?
• You can assume: no missing points, no noise
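One possible way to approach the preprocessing questions is sketched below: min-max scaling puts every measurement on the same [0, 1] range, and per-feature weights let one characteristic count more than another. Both techniques, the function names, and the sample rows are assumptions made for illustration; the assignment may call for something different.

```python
import math

def min_max_scale(rows):
    """Rescale every column to [0, 1]; also return the per-column (low, high) used."""
    columns = list(zip(*rows))
    bounds = [(min(col), max(col)) for col in columns]
    scaled = [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, (lo, hi) in zip(row, bounds))
        for row in rows
    ]
    return scaled, bounds

def weighted_distance(a, b, weights):
    """Euclidean distance where each squared difference is multiplied by a weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

# Hypothetical measurements: (step length in cm, call frequency in Hz).
RAW = [(5.0, 600.0), (15.0, 700.0), (30.0, 100.0), (60.0, 200.0)]
scaled, bounds = min_max_scale(RAW)

print(bounds)
print(scaled)
# Equal weights versus making the first feature count four times as much.
print(weighted_distance(scaled[0], scaled[3], weights=(1.0, 1.0)))
print(weighted_distance(scaled[0], scaled[3], weights=(4.0, 1.0)))
```

Returning `bounds` makes it possible to apply the same scaling to points that were not part of the original rows.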
Selecting training and testing data and choosing K
• Is the data sorted in any way? If so, is this good or bad?
• Are there different ways of subdividing the known data?
• How do we know if the value of K is good or bad?
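A minimal sketch of one way to explore these questions: shuffle the known data (so that a sorted file does not put one class entirely in the test set), hold out part of it, and measure accuracy for several values of K. The data set, the function names, and the hold-out size are hypothetical; cross-validation with several different splits is another way of subdividing the data.

```python
import math
import random
from collections import Counter

def knn_classify(query, training, k):
    """Majority label among the k training points nearest to query."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def holdout_accuracy(data, k, test_size, seed=0):
    """Shuffle a copy of data, hold out test_size points, return accuracy on them."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    test, train = shuffled[:test_size], shuffled[test_size:]
    correct = sum(knn_classify(x, train, k) == label for x, label in test)
    return correct / len(test)

# Hypothetical data, deliberately sorted by class to mimic a sorted input file.
DATA = ([((5.0 + i, 600.0 + 10 * i), "duck") for i in range(10)]
        + [((30.0 + 3 * i, 100.0 + 10 * i), "cow") for i in range(10)])

for k in (1, 3, 5):
    print(k, holdout_accuracy(DATA, k, test_size=6))
```

Because the two invented classes are far apart, every K scores perfectly here; on real data the accuracies usually differ, and comparing them is what indicates whether a value of K is good or bad.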
Things to Consider for the Assignment
Classifying unknown data
• Do we need to preprocess the unknown data?
• Which data set should we use to classify the unknown data?
Complexity
• What is the offline part of kNN and what is the online part?
• What is the complexity for the offline and online parts of kNN?
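To make the offline/online split concrete, here is a sketch of a brute-force kNN, assuming n stored points of dimension d and no index structure. The class name `BruteForceKNN` and its methods are hypothetical. The costs in the comments apply to this brute-force variant only; implementations that build a search structure offline (a k-d tree, for example) trade more offline work for cheaper queries.

```python
import heapq
import math
from collections import Counter

class BruteForceKNN:
    def __init__(self, k):
        self.k = k
        self.points = []
        self.labels = []

    def fit(self, points, labels):
        # Offline part: only store the n training points, O(n * d) to copy.
        self.points = [tuple(p) for p in points]
        self.labels = list(labels)
        return self

    def predict(self, query):
        # Online part: n distance computations of O(d) each, so O(n * d),
        # plus O(n log k) to select the k smallest distances with a heap.
        distances = (
            (math.dist(query, p), label)
            for p, label in zip(self.points, self.labels)
        )
        nearest = heapq.nsmallest(self.k, distances, key=lambda pair: pair[0])
        return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical training data: (step length in cm, call frequency in Hz).
model = BruteForceKNN(k=3).fit(
    [(5, 600), (10, 650), (15, 700), (30, 100), (45, 150), (60, 200)],
    ["duck", "duck", "duck", "cow", "cow", "cow"],
)
print(model.predict((12, 640)))
```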