Date post: 07-Oct-2020
Page 1: Lecture 9 - CEProfsLecture 9 Files and Input Processing What are we going to cover today? •Reading from and writing to files •Processing strings, particularly to handle input Input/Output

8/24/18

Lecture 9: Files and Input Processing

What are we going to cover today?

• Reading from and writing to files

• Processing strings, particularly to handle input

Input/Output

• To this point, we’ve read input from and written output to the “console”
  • The interactive window that is the default
• We have the print command to print to the console
• We have the input command to read from the console
  • And, to also print a message to the console, first
• But, how do we deal with files?

Files

• Recall that files are a way of storing information outside of main memory
• We need to access that information differently than we would main memory
• We treat it much like console input/output

Memory hierarchy (top: faster to access, less total data, less permanent; bottom: slower to access, more total data, more long-lasting):
  1. CPU registers
  2. Cache (near CPU)
  3. Main Memory
  4. Secondary Memory (Files)
  5. Offline Memory (e.g. Cloud)

File Extensions

• Most file names have an “extension”: a period followed by a designation describing the type of file it is.
  • .pdf, .docx, .jpg, .mov, .mp3, .xlsx, .csv, etc.
• The “extension” is just part of the name; it does not necessarily mean anything about the file's contents.
  • You could rename any file you want with a different extension; that doesn’t change the file itself.
  • Likewise, you could pick any file name you wish for some file.
• The operating system (and some programs) use the extension as a strong hint about how the data in the file is organized.
  • But, it is not a guarantee.
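To make the point about extensions concrete, here is a small sketch. It uses open and the with statement, both covered later in this lecture, and the file names notes.txt and notes.jpg are made up for illustration. It writes text to a file, renames it with a misleading extension, and shows that the data is unchanged.

```python
import os
import tempfile

# Work in a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "notes.txt")  # hypothetical file name
    renamed = os.path.join(folder, "notes.jpg")   # same data, misleading extension

    with open(original, "w") as f:
        f.write("plain text, not an image\n")

    # Renaming changes only the name, not the bytes stored in the file.
    os.rename(original, renamed)

    with open(renamed, "r") as f:
        contents = f.read()

    # splitext only looks at the name; it never inspects the data.
    name, extension = os.path.splitext(renamed)

print(extension)
print(contents, end="")
```

The file still holds plain text even though its name now ends in .jpg; an image viewer that trusts the extension would simply fail to display it.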

The basics of dealing with files

• Lots of files will be in secondary storage
• We need to set up a way of designating a file as the particular one we are working with:
  • A file identifier (fileID)
  • Will be a variable
• Then, we will use that identifier to refer to the file when we want to do something with it.


File Basics

• We first “open” the file
  • At this time, we associate an identifier with the file
  • We need to specify how we will work with the file
• Then, we work with the file contents
  • Reading/Writing
• Finally, we “close” the file

Opening Files

• The basic format is like this:
    <fileID> = open("<File Name>", "<designator>")
• <fileID>: We start with a file ID, just a variable name that we will use to be able to refer to this specific file.
• =: Then the equals sign, which as usual is an assignment operation. We will be assigning the file that we open to the variable that is the file identifier.
• open: Next, we have the “open” command. The “open” command will designate a particular file and make it ready to work with.
• "<File Name>": Inside of the parentheses is first the file name. This is a string giving the name of the file to use. It’s the name you would see if looking at it in a file browser/explorer on your computer.
• "<designator>": Then, after a comma, is the designator, which says how we are going to be using the file. It is sometimes referred to as the mode for working with the file.


Designators

• r: Reading (we will read data from the existing file)
• w: Writing (we will write data to a new file)
• a: Appending (we will append data to an existing file)
• rb, wb, ab: We will read/write/append BINARY data
  • (We use this when we are not writing text)
• r+: We will read from AND write to the file
• <nothing>: If there is no mode designator, then ‘r’ is assumed
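The difference between the w and a designators is easiest to see by running them in sequence. The sketch below assumes a scratch file named log.txt (a made-up name) in a temporary directory, and reads the file back with the default mode after each step.

```python
import os
import tempfile

# A scratch location, so no real file is touched (log.txt is a made-up name).
path = os.path.join(tempfile.mkdtemp(), "log.txt")

f = open(path, "w")      # 'w' creates the file
f.write("first\n")
f.close()

f = open(path, "a")      # 'a' adds to the end of the existing contents
f.write("second\n")
f.close()

f = open(path)           # no designator, so 'r' is assumed
after_append = f.read()
f.close()

f = open(path, "w")      # 'w' again: the old contents are discarded
f.write("third\n")
f.close()

f = open(path, "r")
after_rewrite = f.read()
f.close()

print(repr(after_append))
print(repr(after_rewrite))
```

After the append step the file holds both lines; after reopening with w only the newly written line remains.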

Examples

• To open a file named Measurements.dat so that it can be read, and assign the file identifier to a variable called myfile:
    myfile = open('Measurements.dat', 'r')
• To open a file named Results.out so that it can be written to, and assign the file identifier to a variable called output_file:
    output_file = open('Results.out', 'w')
• To open a file named data so that it can be both read from and written to, in binary, and assign the file identifier to a variable df:
    df = open('data', 'rb+')
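One practical consequence of the designators: opening for reading requires the file to exist already, while opening for writing creates it. A minimal sketch, assuming the two file names from the examples above live in an empty temporary directory:

```python
import os
import tempfile

folder = tempfile.mkdtemp()  # an empty scratch directory
missing = os.path.join(folder, "Measurements.dat")
results = os.path.join(folder, "Results.out")

# 'r' needs an existing file; here there is none, so open raises an error.
try:
    myfile = open(missing, "r")
    opened = True
    myfile.close()
except FileNotFoundError:
    opened = False

# 'w' creates the file if it does not exist yet.
output_file = open(results, "w")
output_file.close()
created = os.path.exists(results)

print(opened, created)
```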

File Name

• Remember, you can name a file whatever you want, with any extension you want.
• If you want to make a file where the data inside matches the hint given by its extension, you need to write the data to the file in the right way.
  • e.g. for a .pdf file, you would need to write binary data in the exact form expected for an Adobe Acrobat file
• Many file extensions are associated with proprietary programs, or are very complicated to read/write
  • There are sometimes libraries/modules you can use to help with this

When finished

• When we are finished with a file, we need to close it.
  • This ensures that the file is left in a valid condition.
  • After closing, the file can’t be used (no reading/writing)
• Format:
    <fileID>.close()
  • First there is a file identifier.
  • Then, there is a period, the word “close”, and parentheses.
• Examples:
    myfile.close()
    output_file.close()
    df.close()
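The following sketch shows what "can't be used after closing" looks like in practice. The file name scratch.txt is made up; closed is a standard attribute of Python file objects.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "scratch.txt")  # made-up file name

myfile = open(path, "w")
myfile.write("some text\n")
open_before = myfile.closed   # False: the file is still open

myfile.close()
closed_after = myfile.closed  # True: close() has run

# Any further reading or writing on a closed file raises ValueError.
try:
    myfile.write("more text\n")
    write_failed = False
except ValueError:
    write_failed = True

print(open_before, closed_after, write_failed)
```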


An alternative

• An alternative to opening with an assignment and later closing:
    with <open command> as <fileID>:
• Start with the command “with”
• Then write the open command, the same way as before
• Then the word “as”
• Then the variable name for the file identifier
• And finally a colon, after which we indent the subsequent lines.
• When you finish with the indented portion, then the file is automatically closed.
• The fileID variable can be used to refer to the file within the indented portion of the code

The two alternatives:

    ### OPTION 1
    myfile = open("data.dat", "r+")
    #Do stuff with myfile - read/write
    myfile.close()

    ### OPTION 2
    with open("data.dat", "r+") as myfile:
        #Do stuff with myfile - read/write

Which version to use?

• Separate open and close commands are best for:
  • When you will have multiple files open at once or for a long time
    • The with…as formulation would result in excessive indentation
  • When the open/close commands are not nested:
    • e.g. Open 1 -> Open 2 -> Close 1 -> Close 2
    • This is not possible for with…as
• Using the with...as formulation is better for:
  • Ensuring that your file is always closed correctly
    • Separate statements could have problems if there is an error before closing
  • Clearly delineating which part of the code is doing file operations
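The "error before closing" concern can be illustrated with a short sketch. It is not part of the original slides: the try/finally form shown for Option 1 is one common way to make separate open/close statements safe, and data.dat is placed in a temporary directory here.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.dat")

# OPTION 1, made safer: try/finally guarantees that close() runs.
myfile = open(path, "w")
try:
    myfile.write("header\n")
finally:
    myfile.close()

# OPTION 2: the with block closes the file even when an error interrupts it.
error_seen = False
try:
    with open(path, "a") as myfile2:
        myfile2.write("partial results\n")
        int("not a number")  # deliberately raises ValueError
except ValueError:
    error_seen = True

print(myfile.closed, myfile2.closed, error_seen)
```

Both files end up closed, even though the second block was interrupted partway through.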

File operation: writing

• We’ll assume we are not using binary (just standard read/write)
• To write, we will use the write command:
    <fileID>.write(<string to write>)
  • Start with the file identifier (variable name) for a file that was opened for writing or appending
  • Then, there is a period, followed by the word “write” and then parentheses.
  • Inside the parentheses is the string to write
• Examples:
    myfile.write("First Line")
    output_string = 'Second Line'
    myfile.write(output_string)

The write command vs. the print statement

• write will write only a single string
  • You cannot output multiple strings in a single call
  • So, there is no “space” inserted between strings written by separate calls
• write will write only strings
  • You must first convert numbers to a string before writing
• write will not put a carriage return/new line after writing
  • You will need to explicitly put in a newline character if you want a newline
  • Or, you will need to create strings (with triple quotes) that have newlines in them.
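The differences listed above can be checked directly. The sketch below writes the same line three ways; the file name compare.txt is made up, and print's file= keyword argument (standard Python, though not covered on these slides) is included only for comparison.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "compare.txt")  # made-up file name
x = 987

with open(path, "w") as outfile:
    # write needs one string: convert the number and add the newline yourself.
    outfile.write("x is " + str(x) + "\n")
    # An f-string does the conversion inside the string.
    outfile.write(f"x is {x}\n")
    # print accepts several items, separates them with spaces, and adds the newline.
    print("x is", x, file=outfile)

    # Passing a number straight to write is an error.
    try:
        outfile.write(x)
        needs_string = False
    except TypeError:
        needs_string = True

with open(path, "r") as infile:
    lines = infile.readlines()

print(lines)
```

All three approaches produce the same line in the file.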

Example of writing

    outfile = open("MyOutput.txt", 'w')
    outfile.write("Testing the write command.\n")
    x = 987
    outfile.write("Here's a number: "+str(x)+'\n')
    outfile.write("And another number:")
    outfile.write(str(21))
    outfile.write("\n")
    outfile.close()

Step by step:
• open: The file MyOutput.txt is created and opened for writing. The next thing written will be at the beginning of the file.
• First write: The first line is written to the file. The newline at the end means the next write will begin on the next line of the file.
• x = 987: x is created in memory, holding the value 987.
• Second write: The next line is written, with a newline. Notice that the variable was converted to a string.
• Third write: Another string is written, but not a newline, so the next thing will appear right afterward.
• Fourth write: The number 21 is converted to a string and output, again with no newline.
• Fifth write: Now a newline is written.
• close: The file is closed; nothing more can be written.

MyOutput.txt after the program finishes:
    Testing the write command.
    Here's a number: 987
    And another number:21


File location

• The file that was created can be found in the same directory as the .py file (more precisely, in the current working directory, which is normally that directory when you run the program from there).
• If you want to use a file in a different directory, you must give a directory location as part of the file name when opening.
    #Mac OS X
    infile = open('data/data.txt', 'r')
    #Windows
    infile = open('data\\data.txt', 'r')
  • The double backslash (\\) is how we represent a single backslash inside a string, since a single backslash is used for designating special characters like \n
• Note: if you open a file for writing, a new file will be created with that name.
  • It will overwrite any existing file of that name!
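As a side note to the path examples, the sketch below shows two standard-library helpers that are not on the slides: os.getcwd, which reports the directory relative file names are resolved against, and os.path.join, which builds a path with the separator of the current operating system. It also confirms that the double backslash stands for a single character.

```python
import os

# Relative file names are looked up starting from this directory.
working_directory = os.getcwd()

# os.path.join inserts the right separator for the current system.
portable = os.path.join("data", "data.txt")

# "\\" in a normal string is one backslash; a raw string (r"...") avoids the doubling.
escaped = "data\\data.txt"
raw = r"data\data.txt"

print(working_directory)
print(portable)
print(escaped, len(escaped))
```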

Reading from a file

• Again, assume we have a text file (not a binary file)
• Several options, each reading from the file as strings.
• The most common option is to read one line of the file at a time.
  • One line is everything up to and including the next newline character (\n).
• Format:
    <string variable> = <fileID>.readline()
  • We start with the file identifier.
  • Followed by .readline()
  • This gives us a string
  • Which we assign to a string variable
• Example:
    next_line = myfile.readline()
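A short sketch of readline in action, using a two-line scratch file (the name lines.txt and its contents are made up). It shows that each returned string keeps its newline, and that an empty string signals the end of the file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # made-up file name

# Create a small file to read back.
with open(path, "w") as outfile:
    outfile.write("alpha\nbeta\n")

myfile = open(path, "r")
first = myfile.readline()   # includes the trailing newline
second = myfile.readline()
third = myfile.readline()   # nothing left to read: empty string
myfile.close()

print(repr(first), repr(second), repr(third))
```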

Reading multiple lines

• It is common that we’ll want to process an entire file, and will want to read all (or many) lines, each of which has the same format.
  • e.g. a data file, with one set of data per line
• For this, we can use a version of the for loop:
    for <lineID> in <fileID>:
        #Do stuff with the string lineID
• The loop is structured just like looping through a list, but in this case we are looping through lines in a file.

Reading multiple lines

• Two versions that essentially work the same way:

    ### OPTION 1
    for next_line in myfile:
        #Do stuff with the string next_line

    ### OPTION 2
    next_line = myfile.readline()
    while next_line != '':
        #Do stuff with the string next_line
        next_line = myfile.readline()

Reading in and printing a file:

    myfile = open("Test.dat",'r')
    for next_line in myfile:
        print(next_line,end='')
    myfile.close()

• Note: each line read from the file already ends with a newline, so we need to suppress the new line that print adds; otherwise an extra blank line appears in our output!
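An alternative to suppressing print's newline is to remove the newline from each line as it is read. This is a sketch, not the slide's method: it uses the standard string method rstrip and creates its own small Test.dat in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Test.dat")

# Create a small file to read back.
with open(path, "w") as outfile:
    outfile.write("line one\nline two\n")

cleaned_lines = []
myfile = open(path, "r")
for next_line in myfile:
    # rstrip("\n") drops the newline that came from the file,
    # so print can add its own without doubling up.
    cleaned = next_line.rstrip("\n")
    cleaned_lines.append(cleaned)
    print(cleaned)
myfile.close()
```

Stripping the newline is usually the better choice when the line will be processed further rather than just printed.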

Alternative ways to read from files

• <string variable> = <fileID>.read()
  • Will read the entire file into one single string
  • Could be a REALLY large string!
• <list variable> = <fileID>.readlines()
• <list variable> = list(<fileID>)
  • Both of these will convert all the lines in the file into a list of strings
  • Each element of the list is a string, giving one line of the file
• Examples (alternatives; use one of them on a freshly opened file, since each reads everything that remains):
    whole_file = myfile.read()
    all_lines = myfile.readlines()
    all_lines = list(myfile)
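The sketch below compares the three forms on the same small file (the name three.txt is made up). It also shows why they are alternatives rather than steps to run in sequence: once read() has consumed the file, a later readlines() on the same open file finds nothing left.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "three.txt")  # made-up file name

with open(path, "w") as outfile:
    outfile.write("a\nb\nc\n")

with open(path, "r") as myfile:
    whole_file = myfile.read()      # one string holding the whole file
    leftover = myfile.readlines()   # already at the end, so an empty list

with open(path, "r") as myfile:
    all_lines = myfile.readlines()  # list with one string per line

with open(path, "r") as myfile:
    all_lines_again = list(myfile)  # same result as readlines()

print(repr(whole_file))
print(leftover)
print(all_lines)
```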

String Processing

• We often read in lines of files into individual strings
• We usually want to “break up” these strings into parts.
• There are many operations that can be performed on strings, but one of the most useful is the “split” method.
• The split method will convert a string into a list of strings, based on a separator that is specified.
  • Everything that comes between separators becomes a new element in the list.
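Putting file reading and splitting together, here is a minimal sketch that processes a data file with one set of comma-separated values per line. The file name measurements.csv and its contents are invented for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "measurements.csv")  # made-up file name

# Create a small data file: one set of values per line.
with open(path, "w") as outfile:
    outfile.write("1.5,2.5,3.0\n")
    outfile.write("4.0,5.0,6.5\n")

row_totals = []
with open(path, "r") as infile:
    for next_line in infile:
        # strip() removes the trailing newline before splitting on commas.
        parts = next_line.strip().split(",")
        total = 0.0
        for part in parts:
            total += float(part)  # each part is a string until converted
        row_totals.append(total)

print(row_totals)
```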


String splitting

• Format:
    <list variable> = <string variable>.split(<thing to split on>)
  • We start with a string variable
  • Then we put .split()
  • Inside the parentheses is what we want to use to decide how to split up the string. This is a string.
  • The result is a list of strings

Example

    s = "1,2,3,4"
    elems = s.split(',')
    print(elems)

Console:
    ['1', '2', '3', '4']
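A few edge cases of split are worth seeing once. The sketch below relies on standard behavior of Python's str.split; the input strings are made up.

```python
# Two separators in a row produce an empty string between them.
with_gap = "1,2,,4".split(",")

# With no argument, split uses any run of whitespace (including the newline).
on_whitespace = "10 20   30\n".split()

# Splitting on a single space treats every space as its own separator.
on_single_space = "10 20   30".split(" ")

# If the separator never appears, the result is a one-element list.
no_separator = "a-b".split(",")

print(with_gap)
print(on_whitespace)
print(on_single_space)
print(no_separator)
```

Calling split() with no argument is often the most convenient form for lines of whitespace-separated data read from a file.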

Exercise

• Say we have a date, in a string of the form: month/day/year, and we want to get three variables, one with the month, one with the day, one with the year. How would we do that?


Exercise

    date = "10/21/2018"
    parts = date.split('/')
    month = parts[0]
    day = parts[1]
    year = parts[2]
    print("Day:",day,"Month:",month,"Year:",year)

Console:
    Day: 21 Month: 10 Year: 2018
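A possible variation on the exercise, offered as a sketch rather than the expected answer: the three parts can be assigned in one statement, and they remain strings until converted with int.

```python
date = "10/21/2018"

# Assign all three list elements to variables in one statement.
month, day, year = date.split("/")

# The parts are strings; convert them before doing arithmetic.
next_year = int(year) + 1
is_string = isinstance(month, str)

print("Day:", day, "Month:", month, "Year:", year)
print("Next year:", next_year)
```

The one-statement form only works when the number of variables matches the number of parts exactly.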

A final note

• We have seen lots of examples of the form:
    <some variable>.<something>
• This format, where you have a variable of some sort, then a period, then something else afterward, is very common.
• The dot is a way of saying that the “something” belongs to the variable.
  • For example: myfile.close() means that we are closing myfile (not some other file). The close() operation goes with the variable myfile.
• It is related to object-oriented design, which we won’t really cover in detail here, but we will continue to see examples of the dot being used to identify things that belong to something else.

