Structured programming 4 Day 34 LING 681.02 Computational Linguistics Harry Howard Tulane University

Structured programming 4
Day 34

LING 681.02
Computational Linguistics

Harry Howard
Tulane University

16-Nov-2009 LING 681.02, Prof. Howard, Tulane University

Course organization

http://www.tulane.edu/~ling/NLP/

Structured programming

NLPP §4

Today's topics

Defensive programming
Debugging
Algorithm design

Defensive programming

Brainstorm with pseudo-code
Careful naming conventions
Bottom-up construction
  Functional decomposition
Comment, comment, comment
Regression testing

Brainstorm with pseudo-code

Before you write the first line of Python code, write what your program does as pseudocode.

That is to say, before writing a program that NLTK understands, write it in a way that people understand.

An example of pseudo-code

SPOT, move forward about 10 inches, turn left 90 degrees, and start moving forward, then start looking for a black object with your ultrasonic sensor, because I want you to stop when you find a black object, then turn right 90 degrees, and move backward 2 feet, OK?

What is good or bad about this example?

A different phrasing of the example

SPOT, move forward about 10 inches and stop.
Now turn left 90 degrees.
Start moving forward, and turn on your ultrasonic sensor.
Stop when you find a black object.
Turn right 90 degrees and stop.
Move backward 2 feet and stop.

What is good or bad about this example?

Pseudo and real code

The main advantage of the second phrasing is that we can match up the commands in each line to elements in the programming language.
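
As a rough illustration of that matching, the sketch below pairs each line of the second phrasing with one statement. The Robot class and its method names are hypothetical stand-ins, not a real robotics API; each method only records the command it would carry out.

```python
class Robot:
    """Hypothetical stand-in for a robot; it only logs the commands it receives."""

    def __init__(self):
        self.log = []

    def forward(self, inches):
        self.log.append(("forward", inches))

    def backward(self, inches):
        self.log.append(("backward", inches))

    def turn(self, direction, degrees):
        self.log.append(("turn", direction, degrees))

    def forward_until(self, condition):
        self.log.append(("forward_until", condition))


spot = Robot()
spot.forward(10)                    # SPOT, move forward about 10 inches and stop.
spot.turn("left", 90)               # Now turn left 90 degrees.
spot.forward_until("black object")  # Start moving forward; stop when you find a black object.
spot.turn("right", 90)              # Turn right 90 degrees and stop.
spot.backward(24)                   # Move backward 2 feet (24 inches) and stop.
```

The first phrasing is one long sentence, so it offers no such line-by-line correspondence.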

Careful naming conventions

Choose meaningful variable and function names.
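
A small comparison may help here. Both functions below do the same thing; only the names differ. The function and variable names are invented for this illustration.

```python
# Hard to read: the names say nothing about the data.
def f(l):
    d = {}
    for x in l:
        d[x] = d.get(x, 0) + 1
    return d


# Easier to read: the names describe what each value holds.
def count_words(words):
    frequency_of = {}
    for word in words:
        frequency_of[word] = frequency_of.get(word, 0) + 1
    return frequency_of
```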

Bottom-up construction

Instead of writing a 20-line program and then testing it, build and test smaller units, and then combine them.

In general, these smaller units should be functions.
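
One possible way to work bottom-up is sketched below: each small function is checked on its own before the next one is built on top of it. The function names and the punctuation set are assumptions chosen for the example.

```python
def normalize(token):
    """Lowercase a token and strip surrounding punctuation."""
    return token.lower().strip(".,;:!?\"'")


def tokenize(text):
    """Split text on whitespace and normalize each piece, dropping empty results."""
    return [t for t in (normalize(piece) for piece in text.split()) if t]


def vocabulary(text):
    """Return the sorted list of distinct tokens in text."""
    return sorted(set(tokenize(text)))


# Test each unit on its own before trusting the combination.
assert normalize("Hello,") == "hello"
assert tokenize("The cat. The dog!") == ["the", "cat", "the", "dog"]
assert vocabulary("The cat. The dog!") == ["cat", "dog", "the"]
```

If vocabulary() ever gives a wrong answer, the earlier checks narrow down which unit is at fault.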

NLP pipeline

Fig. 3.1

Commenting

Add comments to every line, unless what a line does is so obvious that a comment would get in the way.

Your pseudo-code could become the comments on your real code.
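
A minimal sketch of that idea: the comments below were written first, as pseudo-code, and the Python was then filled in under each one. The function name is hypothetical.

```python
def longest_word(text):
    # Split the text into words.
    words = text.split()
    # Start with no candidate.
    longest = ""
    # Look at each word in turn.
    for word in words:
        # Keep the word if it is longer than the current candidate.
        if len(word) > len(longest):
            longest = word
    # Report the longest word found.
    return longest
```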

Regression testing

Keep a suite of test cases.
As your program gets bigger, it should still work on previous test cases.
If it stops working, it has 'regressed'.
  A change in code has the (unintended) side effect of breaking something that used to work.
The doctest module does testing.
  It runs the examples written in docstrings as if they were typed in interactive mode, and checks the output. See doctest documentation.
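
As an illustration, the docstring below holds three test cases in the form doctest expects. The plural() function and its rules are invented for this example and are far from a complete account of English plurals.

```python
def plural(noun):
    """Return a rough English plural of noun.

    >>> plural("cat")
    'cats'
    >>> plural("fox")
    'foxes'
    >>> plural("city")
    'cities'
    """
    if noun.endswith(("s", "x", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and noun[-2:-1] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"
```

Saved to a file (the name plural.py is hypothetical), the examples can be re-run after every change with python -m doctest plural.py; no output means every example still passes.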

Debugging topics

Check your assumptions
Exception > stack trace
Interactive debugging
Python's debugger
Prediction

Debugging

"Most code errors result from the programmer making incorrect assumptions." (NLPP:158)

When you find an error, first check your assumptions.

Add print statements to show values of variables and how far the program progresses.

Reduce input to smallest amount needed to cause the error.
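
These steps might look as follows in practice. This is only a sketch: the function is hypothetical, the assertions state the assumptions explicitly, and the print call is a temporary trace to be removed once the bug is found.

```python
def average_word_length(words):
    # Assumption made explicit: the input is a non-empty list of strings.
    assert isinstance(words, list), "expected a list, got %r" % type(words)
    assert words, "expected at least one word"
    total = 0
    for word in words:
        # Temporary trace: shows each value and how far the loop has progressed.
        print("word=%r running total=%d" % (word, total))
        total += len(word)
    return total / len(words)


# Smallest input that still exercises the code.
print(average_word_length(["to", "be"]))
```

Passing a plain string such as "to be" instead of a list now fails at the first assertion, right where the wrong assumption is made.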

Stack trace

A runtime error (Python exception) gives a stack trace that pinpoints the location of program execution at the time of the error.

But the error may actually be upstream.
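
A small sketch of an upstream error, with hypothetical function names: the exception is raised inside total_length(), but the faulty value was produced earlier, in load_words().

```python
import traceback


def load_words():
    # The real mistake is here: one entry is None instead of a string.
    return ["colorless", None, "ideas"]


def total_length(words):
    total = 0
    for word in words:
        total += len(word)  # raises TypeError when word is None
    return total


try:
    total_length(load_words())
except TypeError:
    # Capture the stack trace as text so it can be inspected or logged.
    trace = traceback.format_exc()
    print(trace)
```

The innermost frame in the trace is total_length(), so the trace says where the program stopped, not where the bad data came from.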

Python's debugger

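Invoke it (mymodule and myfunction are placeholders for your own module and function):
    import pdb
    import mymodule
    pdb.run('mymodule.myfunction()')

It lets you
  monitor execution of the program,
  specify line numbers where the program should stop (breakpoints), and
  step through sections of code, inspecting the values of variables.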

Prediction

Try to predict the effect of a potential bugfix before re-running the program.

"If the bug isn't fixed, don't fall into the trap of blindly changing the code in the hope that it will magically start working again." (NLPP:159)

For each change, try to articulate what is wrong and how the change will fix the problem.
Undo the change if it doesn't work.

"Programs don't magically work; they magically don't work." (Robert Goldman)

Algorithm design

NLPP §4.7

Algorithms

Divide and conquer
Start with something that works
Iteration
Recursion

Divide and conquer

Divide a problem of size n into two problems of size n/2.

Binary search - dictionary example.
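
A minimal sketch of binary search over a sorted word list, in the spirit of looking a word up in a dictionary. The function name and the word list are made up for the example. Note that after each split only one of the two halves needs further work.

```python
def binary_search(sorted_words, target):
    """Return the index of target in sorted_words, or -1 if it is absent."""
    low, high = 0, len(sorted_words) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_words[middle] == target:
            return middle
        if sorted_words[middle] < target:
            low = middle + 1   # target can only be in the upper half
        else:
            high = middle - 1  # target can only be in the lower half
    return -1


dictionary = ["apple", "banana", "cherry", "grape", "melon", "peach"]
print(binary_search(dictionary, "grape"))  # found at index 3
print(binary_search(dictionary, "kiwi"))   # absent, so -1
```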

Start with known

Transform task into something that already works.
To find duplicates in a list, first sort the list, then check for identity of adjacent pairs.
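
The duplicate-finding recipe can be sketched as below. The function name is hypothetical; sorting is delegated to Python's built-in sorted(), which is the part that "already works".

```python
def find_duplicates(items):
    """Return the sorted list of values that occur more than once in items."""
    ordered = sorted(items)  # the known, working step
    duplicates = []
    for previous, current in zip(ordered, ordered[1:]):
        # After sorting, equal values sit next to each other.
        if previous == current and (not duplicates or duplicates[-1] != current):
            duplicates.append(current)
    return duplicates
```

The extra check on duplicates[-1] keeps a value that occurs three or more times from being reported more than once.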

Iteration vs. recursion

For some function ƒ…
Iteration
  Repeat ƒ some number of times.
  Calling ƒ in a for loop.
Recursion
  ƒ calls itself some number of times:
  NP → the N PP.
  PP → P NP.
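
The contrast can be sketched with factorial written both ways, followed by a recursive expansion of the two grammar rules. The function names are hypothetical, and the depth limit with its base case NP → the N is an assumption added here so that the recursion stops; the rules on the slide have no base case.

```python
def factorial_iterative(n):
    result = 1
    for i in range(1, n + 1):  # repeat the multiplication step n times
        result *= i
    return result


def factorial_recursive(n):
    if n <= 1:                 # base case stops the recursion
        return 1
    return n * factorial_recursive(n - 1)


def expand_np(depth):
    """Expand NP -> the N PP, falling back to NP -> the N at depth 0 (assumed base case)."""
    if depth == 0:
        return ["the", "N"]
    return ["the", "N"] + expand_pp(depth)


def expand_pp(depth):
    """Expand PP -> P NP; the embedded NP gets one less level of depth."""
    return ["P"] + expand_np(depth - 1)


print(factorial_iterative(5), factorial_recursive(5))  # 120 120
print(" ".join(expand_np(2)))                          # the N P the N P the N
```

Here expand_np() and expand_pp() call each other, so the recursion is indirect, just as NP and PP refer to each other in the rules.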

Next time

Start NLPP §6

Learning to classify text

