Opening a file in Python is simple. It follows this general form:
# general form for loading a file.
file_variable = open(filename, mode)
The file_variable is the variable we will use to reference the file in the Python code.
The filename is the full name of the file, including its extension and, if the file is not in the current folder, its path.
The mode specifies whether you are going to read or write the file. There are several modes that you can use.
Mode | Description |
'r' | Reading. The file cannot be modified, and you start at the top of the file by default. If the file does not exist, Python raises an error. |
'w' | Writing. If the file already exists, it will be overwritten and its contents erased. If the file does not exist, it will be created. |
'a' | Appending, which is a form of writing. If the file exists, its contents are kept and new data is written after the end of the existing data. As with writing, if the file doesn’t exist, Python creates it. Because existing data is not deleted, appending is often preferred, but that depends on the application. |
Let’s look at an example that opens a couple of files.
#open customer file
customers = open('customers.txt', 'r')
#open sales for appending data
sales = open('sales.txt', 'a')
Here we open two files: customers.txt for reading and sales.txt for appending, which, remember, is a form of writing. A case like this might come up when we look up a customer’s ID in the customer file and then write it to the sales file, recording a new sale without overwriting previous sales data.
The example also assumes that both files are in the current working folder, which is usually the folder that contains the Python file. Let’s look at an example of how we open a file that is not in the same folder.
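To make the difference between 'w' and 'a' concrete, here is a minimal sketch. It works in a temporary folder so that no real data is touched; the file name sales_demo.txt and the record text are made up for illustration, and write() is used only to put some text into the file.

```python
import os
import tempfile

# Work in a throwaway folder so no real files are touched.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'sales_demo.txt')  # hypothetical file name

# 'w' creates the file (or erases it if it already exists).
sales = open(path, 'w')
sales.write('sale 1\n')
sales.close()

# 'a' keeps what is already there and adds to the end.
sales = open(path, 'a')
sales.write('sale 2\n')
sales.close()

sales = open(path, 'r')
after_append = sales.read()
sales.close()

# Opening with 'w' again erases both earlier records.
sales = open(path, 'w')
sales.write('sale 3\n')
sales.close()

sales = open(path, 'r')
after_overwrite = sales.read()
sales.close()

print(repr(after_append))
print(repr(after_overwrite))
```

After the append, the file holds both records; after the second 'w', only the last one remains.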
#open sales for appending data
sales = open(r'c:\data\sales.txt', 'a')
The previous example has a couple of interesting changes. The first is the letter r before the string that holds our file name. There is a simple but important reason for it. In Windows, the backslash separates folders and subfolders in a path. But in a Python string, the backslash starts an escape sequence, such as \n for a new line. Putting r before the string makes it a raw string: Python reads it as is and treats every backslash as an ordinary character.
The second is that we specify the entire folder path, including the drive. This could of course be different for every user, especially if the data is in their personal folder. So this is a little risky, and we may need to tweak that code later on when we learn more about reading system data.
Until then, we’ll keep using files in our project folder.
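The effect of the r prefix is easiest to see with a path whose folder names happen to start with letters that form escape sequences. The path below is made up for illustration; this is a small sketch and nothing is opened.

```python
# A raw string keeps every backslash as an ordinary character.
raw_path = r'c:\temp\notes.txt'

# Without the r, each backslash must be doubled to get the same text.
escaped_path = 'c:\\temp\\notes.txt'

# Without either, \t becomes a tab and \n becomes a newline.
broken_path = 'c:\temp\notes.txt'

print(raw_path == escaped_path)   # the two spellings are the same string
print(repr(broken_path))          # shows the tab and newline inside the path
```

The third string is not a usable path, which is why the raw string (or doubled backslashes) is needed for Windows paths.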
Reading Data from a File
Let’s look at how to read data from a file.
There are a couple of different ways. The first is the read() method, which reads the entire contents of the file into a variable.
def main():
    # open the file for reading
    infile = open('data.txt', 'r')
    # read all of the contents of the file into the variable contents
    contents = infile.read()
    # close the file
    infile.close()
    # print out the contents of the file
    print(contents)
main()
Notice that the file does not have to be open when we print contents. That’s because read() copies all of the data in the file into the variable we assigned it to, contents in this case.
Reading a whole file is fine, but what if we need to read just one line of the file at a time? Well, that’s easy too. Python provides a readline() method. It reads from the current position up to and including the next new line character, or to the end of the file if no new line character is left. In Python, a new line character is written \n. It marks the end of a line, so whatever follows it starts at the beginning of the next line.
Let’s look at a simple example that uses readline().
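As a quick check of that point, the sketch below reads a file, closes it, and only then uses the data. It creates its own small data.txt in a temporary folder (the sample text is made up), so it can run anywhere.

```python
import os
import tempfile

# Create a small sample file; write() is used only to set up the demo.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'data.txt')
outfile = open(path, 'w')
outfile.write('first line\nsecond line\n')
outfile.close()

infile = open(path, 'r')
contents = infile.read()
infile.close()

print(infile.closed)  # the file object reports that it is closed
print(contents)       # the text is still available in the variable
```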
def main():
    # open the file for reading
    infile = open('data.txt', 'r')
    # read the first three lines of the file, one at a time
    line1 = infile.readline()
    line2 = infile.readline()
    line3 = infile.readline()
    # close the file
    infile.close()
    # print out the three lines
    print(line1)
    print(line2)
    print(line3)
main()
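It helps to see exactly what readline() hands back. The sketch below uses a made-up two-line file in a temporary folder and calls readline() three times; repr() is used so that the new line characters are visible.

```python
import os
import tempfile

# Create a two-line sample file; write() is used only to set up the demo.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'data.txt')
outfile = open(path, 'w')
outfile.write('alpha\nbeta\n')
outfile.close()

infile = open(path, 'r')
line1 = infile.readline()
line2 = infile.readline()
line3 = infile.readline()  # nothing is left to read at this point
infile.close()

print(repr(line1))
print(repr(line2))
print(repr(line3))
```

Each line comes back with its \n still attached, and once the end of the file is reached readline() returns an empty string.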
Of course, the problem with this is: how many lines are in our text file? Unless we know for sure, it is easy to read too far, in which case readline() simply returns an empty string, or not far enough, and miss data. Let’s see a simple fix for that.
def main():
    # open the file for reading
    infile = open('data.txt', 'r')
    # read the first line, then keep going until readline() returns ''
    line = infile.readline()
    while line != '':
        print(line)
        line = infile.readline()
    # close the file
    infile.close()
main()
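Because each line returned by readline() still ends with \n, and print() adds a new line of its own, the loop above prints an extra blank line between lines. One possible way to avoid that, shown here as a sketch with a made-up sample file, is to strip the trailing \n with rstrip('\n') before printing; print(line, end='') would be another option.

```python
import os
import tempfile

# Create a three-line sample file; write() is used only to set up the demo.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'data.txt')
outfile = open(path, 'w')
outfile.write('alpha\nbeta\ngamma\n')
outfile.close()

cleaned = []
infile = open(path, 'r')
line = infile.readline()
while line != '':
    # remove only the trailing new line character
    line = line.rstrip('\n')
    print(line)
    cleaned.append(line)
    line = infile.readline()
infile.close()
```

The empty-string test still works here because line is replaced by a fresh readline() result before the while condition is checked again.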
We can do the same thing with a for loop, which is a little simpler. A file object can be looped over directly, and each pass through the loop gives us the next line of the file.
def main():
    # open the file for reading
    infile = open('data.txt', 'r')
    # loop over the file one line at a time
    for line in infile:
        print(line)
    # close the file
    infile.close()
main()
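A variant that is common in Python code, though not used in this article, is the with statement, which closes the file automatically when its block ends. The sketch below combines it with the for loop; the sample file and its contents are made up for illustration.

```python
import os
import tempfile

# Create a three-line sample file; write() is used only to set up the demo.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'data.txt')
outfile = open(path, 'w')
outfile.write('alpha\nbeta\ngamma\n')
outfile.close()

lines = []
# The file is closed automatically when the with block ends.
with open(path, 'r') as infile:
    for line in infile:
        lines.append(line.rstrip('\n'))

print(lines)
print(infile.closed)
```

This does the same work as the explicit open() and close() calls; it just removes the chance of forgetting to close the file.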
Reading Files with Python was originally found on Access 2 Learn