Date post: | 29-Jun-2018 |
Category: |
Documents |
Upload: | nguyenphuc |
Why Python?
• Productivity
• Reduced development time
• Code is extremely readable (I’ll show you today)
• Easy to learn
• Open-Source & Mature (over 20 years)
• Supportive community: lots of resources available online!
• Works with R: you can call R from Python or Python from R (Python is often much faster than R)
• Object-oriented programming
• Fun!
• It is named after the BBC comedy series Monty Python’s Flying Circus
Wednesday, April 25, 2012
Popularity
http://blog.revolutionanalytics.com
Using Python
• From a file (e.g. from the command line)
• Interactively
• Many editors and development environments are available.
• I use Eclipse.
• Available for Windows, Mac and Linux.
• Here are instructions for the configuration:
• http://www.youtube.com/watch?v=j4hido-FZKg
• http://www.vogella.com/articles/Python/article.html
• http://www.rose-hulman.edu/class/csse/resources/Eclipse/eclipse-python-configuration.htm
Python is already installed on Mac computers!
Today
1. Python Types
2. Conditional Flow Statements: What if?
3. Iteration Statements: For Loop
4. Functions
5. Example 1: Saving webpages
6. Example 2: Word Frequency
7. What is NLTK?
Python Types
• Numbers
• Strings
• Lists and Tuples
• Dictionaries
Watch the Google Class!
Numbers
• Integers: int
• Real numbers: float
• Numeric operators:
    2+5      Addition
    2*5      Multiplication
    5**2     Exponentiation
    5//2     Integer division
    5%2      Remainder of the division
    5/2.0    Division with a float operand: the result is always a float!
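The operators listed above can be tried out in a few lines. This is a minimal sketch; the variable names are made up for illustration, and it assumes Python 3 syntax (print as a function).

```python
# One line per operator from the list; the names are illustrative only.
total = 2 + 5        # addition -> 7
product = 2 * 5      # multiplication -> 10
power = 5 ** 2       # exponentiation -> 25
quotient = 5 // 2    # integer division drops the fractional part -> 2
remainder = 5 % 2    # remainder of the division -> 1
ratio = 5 / 2.0      # a float operand gives a float result -> 2.5

print(total, product, power, quotient, remainder, ratio)
print(type(quotient), type(ratio))
```

Note that quotient is an int while ratio is a float, which is the point of writing 2.0 instead of 2.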
Number Comparison
• Comparisons return a bool
    ==, !=    equal / not equal
    <, <=     less than / less than or equal
    >=, >     greater than or equal / greater than
bool(2 == 2)
bool(2 != 3)
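As a quick check of the comparison operators, the sketch below stores each result in a variable (the names are illustrative). It also shows why equality needs two equals signs: a single = is assignment, so 2 = 2 is a syntax error.

```python
# Every comparison evaluates to a bool: True or False.
same = (2 == 2)          # equality -> True
different = (2 != 3)     # inequality -> True
smaller = (2 < 3)        # less than -> True
at_least = (2 >= 3)      # greater than or equal -> False

print(same, different, smaller, at_least)
print(type(same))        # the type of a comparison result is bool
```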
Strings
• Text stored in a data structure
• str
• Delimited by (") or (')
hello = "Hello"
• Substrings
hello[0]
hello[1:4]
• Operators for Strings
Operators for Strings: Example
first = 'John'
last = "Doe"
full = first + ' ' + last
print(full)
len(full)
Prints John Doe; the length of full is 8 (the space counts).
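The substring and operator slides can be combined into one short script. This is only a sketch: the extra names (first_letter, middle, shout, has_doe) are invented for illustration, and the syntax assumes Python 3.

```python
hello = "Hello"
first_letter = hello[0]    # indexing starts at 0 -> 'H'
middle = hello[1:4]        # slice from index 1 up to, but not including, 4 -> 'ell'
last_letter = hello[-1]    # negative indices count from the end -> 'o'

first = 'John'
last = "Doe"
full = first + ' ' + last  # + concatenates strings
shout = hello * 2          # * repeats a string -> 'HelloHello'
has_doe = "Doe" in full    # in checks for a substring -> True

print(full, len(full))
```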
List
• A list is an ordered collection of zero or more elements
• Lists are mutable
numbers = [1, 2, 3, 4]    # sum, max and min work on it
strings = ['do', 're', 'mi']
lists = [[1, 2], [3, 4], [5, 6]]
mylist = list(range(start, end, step))
fruits = ['banana', 'kiwi', 'apple', 'mango']
fruits.sort()
print(fruits)
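A runnable version of the list slide might look like the sketch below. The range arguments (0, 10, 2) and the appended 'cherry' are arbitrary values chosen for illustration; the syntax assumes Python 3.

```python
numbers = [1, 2, 3, 4]
total = sum(numbers)       # 10
largest = max(numbers)     # 4
smallest = min(numbers)    # 1

# range(start, end, step) stops before end; list() turns it into a list.
evens = list(range(0, 10, 2))

fruits = ['banana', 'kiwi', 'apple', 'mango']
fruits.sort()              # sorts in place: the list itself changes
fruits.append('cherry')    # lists are mutable, so they can grow

print(evens)
print(fruits)
```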
Tuple
• A tuple is an object that bundles several related objects, which may have different types and meanings.
• What are the differences between a tuple and a list?
• tuples have their own methods
• tuples are immutable, lists are mutable
person = ("Doe", ["John", "J."], 30)
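The difference between immutable and mutable can be seen by trying to change an element. In this sketch the names last_name, given_names, age, changed and as_list are made up for illustration; the syntax assumes Python 3.

```python
person = ("Doe", ["John", "J."], 30)

# A tuple can be unpacked into one name per element.
last_name, given_names, age = person

# Assigning to a tuple element raises TypeError because tuples are immutable.
try:
    person[2] = 31
    changed = True
except TypeError:
    changed = False

# A list built from the same elements can be modified.
as_list = list(person)
as_list[2] = 31

print(changed, person[2], as_list[2])
```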
Dictionary
• A dictionary (dict) is an associative array in which each element is accessed by a key (a name).
word_frequencies = {
    "good": 1,
    "news": 2,
    "everyone": 1,
    "I": 3,
    "have": 1}
word_frequencies["news"]
'news' in word_frequencies
• Named element access with [ ]
• Containment check with in
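The two dictionary operations above, plus updating an entry, can be sketched as follows. The keys "bad" and "today" are invented for illustration, and the syntax assumes Python 3.

```python
word_frequencies = {"good": 1, "news": 2, "everyone": 1, "I": 3, "have": 1}

news_count = word_frequencies["news"]    # named element access -> 2
has_news = "news" in word_frequencies    # containment check on the keys -> True
has_bad = "bad" in word_frequencies      # -> False

word_frequencies["news"] += 1            # update an existing entry
word_frequencies["bad"] = 1              # add a new entry

# get() returns a default instead of raising KeyError for a missing key.
today_count = word_frequencies.get("today", 0)

print(news_count, has_news, has_bad, today_count)
```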
What if?
if a != 0:
    print("a != 0")
else:
    print("a == 0")
Indentation is mandatory.
for Loops
• For each element in the container, execute a given suite of statements.
fruits = ['banana', 'kiwi', 'apple', 'mango']
fruits.sort()
length = 0  ## we want to know the total length of all the strings
for fruit in fruits:
    length = length + len(fruit)
print(length)  ## not indenting this line is important: it prints the total once, after the loop
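To see what the indentation changes, the sketch below records the running total inside the loop and prints the final total outside it. The running_totals list is an addition made for illustration; the syntax assumes Python 3.

```python
fruits = ['banana', 'kiwi', 'apple', 'mango']
fruits.sort()

length = 0
running_totals = []
for fruit in fruits:
    length = length + len(fruit)     # indented: runs once per fruit
    running_totals.append(length)    # indented: records every intermediate total

print(running_totals)   # not indented: runs once, after the loop has finished
print(length)           # the total length of all the strings
```

If print(length) were indented under the for line, it would run on every pass and print each intermediate total instead of just the final one.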
Functions: Motivation
• Reusable code:
• We don’t want to write a complete algorithm every time.
• We can find functions written by other people and use them!
• Helps avoid indentation problems when your code gets long.
Example: Computing the Average
numbers = [2, 5, 8, 10]
type(numbers)  # it is a list!
def avg_f(values):  # function definition
    # the body of the function is indented
    s = sum(values)
    l = len(values)
    avg = float(s) / l  # float() avoids integer division in Python 2
    return avg  # output
avg_f(numbers)
Digression: Using Eclipse
• Eclipse tells you when you have an error:
• It gives you the error ‘before’ running the code.
• It specifies the line.
• It may give you some information about it.
• ALWAYS save as you are writing the code. You need to save before running the code. If next to the name of the module you see * , you have not saved your work.
• If you want to see intermediate results while the code runs, just print the object. Very useful!
Example 1: Web
• Take advantage of the HTML code.
• It is useful in for loops.
• It is useful to locate words: You can identify words in italics or bold, titles, etc.
• It is useful to locate sections of the text.
• Take a look at the Beautiful Soup module.
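As a rough illustration of using the HTML code to locate words in bold or italics, here is a sketch that relies on the standard library's html.parser rather than Beautiful Soup, so that it runs without installing anything. The EmphasisFinder class and the sample page string are hypothetical and invented for this example; the syntax assumes Python 3.

```python
from html.parser import HTMLParser


class EmphasisFinder(HTMLParser):
    """Collect the text found inside bold or italic tags (hypothetical helper)."""

    EMPHASIS_TAGS = {"b", "strong", "i", "em"}

    def __init__(self):
        super().__init__()
        self.depth = 0     # number of emphasis tags currently open
        self.found = []    # text pieces seen while depth > 0

    def handle_starttag(self, tag, attrs):
        if tag in self.EMPHASIS_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.EMPHASIS_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits inside an emphasis tag.
        if self.depth > 0 and data.strip():
            self.found.append(data.strip())


# A made-up page; in practice this string would come from a saved webpage.
page = "<html><body><h1>Title</h1><p>Python is <b>fun</b> and <i>readable</i>.</p></body></html>"

finder = EmphasisFinder()
finder.feed(page)
finder.close()
print(finder.found)
```

Beautiful Soup offers a more convenient interface for the same kind of task; the standard-library parser is used here only to keep the sketch self-contained.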
Example 2: Word Count
• There are built-in functions for this
• Look at NLTK
• Installing NLTK can be challenging. Look for a YouTube video on how to do it. It requires some special steps.
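Before installing NLTK, a basic word count can be sketched with the standard library alone. This is not the NLTK approach: it swaps in collections.Counter and a simple regular expression as the tokenizer. The sample text is made up for illustration, and the syntax assumes Python 3.

```python
import re
from collections import Counter

# Made-up sample text; a real run would read it from a file or a saved webpage.
text = "Good news everyone! I have news. I think I have good news."

# Lowercase the text and keep runs of letters (and apostrophes) as words.
words = re.findall(r"[a-z']+", text.lower())

frequencies = Counter(words)              # maps each word to its count
top_three = frequencies.most_common(3)    # the three most frequent words

print(frequencies["news"], frequencies["i"], frequencies["good"])
print(top_three)
```

A Counter behaves like the dictionary shown earlier, so frequencies["news"] and "news" in frequencies work the same way.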
NLTK
• NLTK stands for Natural Language Toolkit
• A collection of Python modules and objects tailored for NLP subtasks
Task                       Modules                      Functionality
Accessing corpora          nltk.corpus                  Standardized interfaces to corpora and lexicons
String processing          nltk.tokenize, nltk.stem     Tokenizers, sentence tokenizers, stemmers
POS tagging                nltk.tag                     Regular expressions, n-gram
Parsing                    nltk.parse                   Chart, feature-based, unification, probabilistic, dependency
Semantic interpretation    nltk.sem, nltk.inference     Model checking, lambda calculus
...                        ...                          ...
(Some) Resources
www.python.org
wiki.python.org
Google Class (video): http://code.google.com/edu/languages/google-python-class/
Post questions or read answers to questions: http://stackoverflow.com/