Friday, April 22, 2022

File processing

Most programming languages have built-in facilities for opening and processing files. In this next set of posts, I'll highlight Python's simple file input/output methods for handling text files.

Most of these facilities can also process input from other sources, such as a URL, and we will do that at the end of this set of posts. For now, let's just get started with a simple set of exercises.

We'll be using the following file:


It's a plain text file, so it can be read by almost any program that reads files. This particular file has 14 lines. In the vi editor, you can display line numbers, and so confirm the line count, by typing the following:

:set number

or

:set nu

Make sure that you're not in INSERT mode (i.e., the word INSERT does not appear at the bottom left corner of the editor window).

Now onto programming.

Our file is named test.txt. In Python, we assign a file variable using the open() function. We will then use this file variable to work with the contents of the file, such as reading from or writing to it.

Start your Python editor and type in the following. Make sure that you are in the same folder as your file.


The file is opened using the following code:

f = open('test.txt', 'r')

f is the file handle, or variable, that we will use to manipulate the file contents.

open() is the function used to open the file. open() takes two arguments. The first is the name of the file you want to open, test.txt. The second is the mode in which you want to open the file. The mode can be:

r = read only
w = write only (the file is created if it does not exist; if it exists, it's truncated)
a = append (the file is created if it does not exist)
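
To see the three modes side by side, here is a minimal sketch. It assumes a hypothetical scratch file, modes-demo.txt, created in a temporary folder so that none of your own files are touched; the variable names are mine.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder, so no real file is touched.
path = os.path.join(tempfile.mkdtemp(), 'modes-demo.txt')

# 'w' creates the file (or truncates an existing one) for writing.
f = open(path, 'w')
f.write('first\n')
f.close()

# 'a' keeps the existing contents and adds new text at the end.
f = open(path, 'a')
f.write('second\n')
f.close()

# 'r' opens the file for reading only.
f = open(path, 'r')
contents = f.read()
f.close()
print(contents)

# Opening with 'w' again discards whatever was in the file.
f = open(path, 'w')
f.close()
size_after_truncate = os.path.getsize(path)
print(size_after_truncate)
```

The last step is the one to watch out for: simply opening an existing file with 'w' is enough to empty it.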

Now that we've opened the file, let's read the first line and display its contents.

We used the readline() function to read from the file handle f.

We put the contents returned by readline() into the variable line.

Then we used the print() function to display the contents of the variable line.

Note the format for reading from the file into the variable.

line = f.readline()
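
Here is a small sketch of this step that runs on its own. Instead of test.txt, it assumes a hypothetical three-line file, sample.txt, which it creates in a temporary folder first.

```python
import os
import tempfile

# Hypothetical stand-in for test.txt: a three-line file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f = open(path, 'w')
f.write('line one\nline two\nline three\n')
f.close()

# Open the file for reading and fetch the first line.
f = open(path, 'r')
line = f.readline()   # reads up to and including the first newline
print(line)
f.close()
```

Note that the value in line still carries the newline character from the end of the line in the file.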

Now let's read the next line and display its contents.

We call readline() again to read the next line.

We reuse the same variable, line, to hold the contents of the next line.
Python automatically moved to the next line after we read the first one.
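
A minimal sketch of two readline() calls in a row, again assuming a hypothetical sample.txt created in a temporary folder rather than the test.txt used in this post:

```python
import os
import tempfile

# Hypothetical stand-in for test.txt: a three-line file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f = open(path, 'w')
f.write('line one\nline two\nline three\n')
f.close()

f = open(path, 'r')

line = f.readline()   # first call returns the first line
print(line)

line = f.readline()   # second call picks up where the first one stopped
print(line)

f.close()
```

The file handle remembers its position, so each call continues from the end of the previous read.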

Now let's go to the beginning of the file and use a for loop to read all the lines and display them.

To go to the start of the file, we use the seek() function. The argument to the seek() function is the position in the file that we want to go to. In our case, we use the argument 0, which means the beginning of the file.
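
The rewind-and-loop step can be sketched as follows. The file sample.txt and the list shown are assumptions of mine, added so that the example runs on its own and the lines can be inspected afterwards.

```python
import os
import tempfile

# Hypothetical stand-in for test.txt: a three-line file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f = open(path, 'w')
f.write('line one\nline two\nline three\n')
f.close()

f = open(path, 'r')
f.readline()          # move past the first two lines, as in the steps above
f.readline()

f.seek(0)             # go back to the beginning of the file

shown = []            # hypothetical list, kept only to inspect the lines later
for line in f:        # looping over the handle yields one line at a time
    print(line)
    shown.append(line)

f.close()
```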

Notice that the output looks longer than the file itself. This is because the print() function automatically adds a newline after it writes the line to the screen, but each line read from the file already ends with a newline character. So the output looks double-spaced.

We can remove the newline character from the line before we print it to the screen like this.

The line that removes the newline character from each line is:

line = line.strip()

This removes leading and trailing whitespace: any newline characters, spaces, or tabs at the front or end of the line are stripped.
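
Here is a short sketch of strip() inside the loop. The hypothetical sample.txt used here deliberately includes a line with extra spaces and a line that starts with a tab, to show that strip() removes more than just the newline.

```python
import os
import tempfile

# Hypothetical file with extra whitespace around some of its lines.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
f = open(path, 'w')
f.write('line one\n  indented line  \n\ttabbed line\n')
f.close()

f = open(path, 'r')
cleaned = []              # hypothetical list, kept to inspect the results
for line in f:
    line = line.strip()   # drop whitespace from both ends, newline included
    print(line)
    cleaned.append(line)
f.close()
```

If you want to keep leading spaces or tabs and remove only the newline at the end, line.rstrip('\n') is an alternative.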

Now let's close the file and do some writing exercises.

To open a file for writing, which will also create the file, do the following:

If the file writing-file.txt already exists, Python will truncate it. So be careful not to use the name of a file that you want to keep. For example, if we had used the name test.txt, the contents of the file from our reading exercise would have been erased.

Now let's write a few lines into the new file.
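
A minimal sketch of the writing step. It assumes a hypothetical file, writing-demo.txt, in a temporary folder, and the four lines of text are placeholders of mine.

```python
import os
import tempfile

# Hypothetical output file in a temporary folder, so nothing real is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'writing-demo.txt')

f = open(path, 'w')

# write() does not add a newline for us, so each string ends with '\n'.
# It returns the number of characters written.
written = []
written.append(f.write('This is line 1\n'))
written.append(f.write('This is line 2\n'))
written.append(f.write('This is line 3\n'))
written.append(f.write('This is line 4\n'))
print(written)

# Closing the file makes sure the text is flushed to disk.
f.close()
```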

We've written four lines into our new file. Now let's read them.

Wow! What happened? The error message indicates that the file was not opened for reading. There's a way to open a file for both reading and writing, but the mode we used, 'w', is for writing only.
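
The failure can be reproduced safely with a sketch like the one below. It assumes a hypothetical scratch file and catches the exception, io.UnsupportedOperation in Python 3, so the script keeps running instead of stopping with a traceback.

```python
import io
import os
import tempfile

# Hypothetical scratch file in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'writing-demo.txt')

f = open(path, 'w')       # write-only mode
f.write('This is line 1\n')

caught = None
try:
    f.readline()          # reading is not allowed on a write-only handle
except io.UnsupportedOperation as err:
    caught = err
    print('Cannot read from this file handle:', err)
finally:
    f.close()
```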

Let's close the file and open it for both reading and writing.
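
One way to do this, sketched under my own assumptions, is the 'r+' mode, which opens an existing file for both reading and writing without truncating it. The file name and contents below are hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file, created first because 'r+' needs an existing file.
path = os.path.join(tempfile.mkdtemp(), 'writing-demo.txt')
f = open(path, 'w')
f.write('line one\nline two\n')
f.close()

# 'r+' allows both reading and writing and keeps the existing contents.
f = open(path, 'r+')
before = f.read()             # reading leaves the position at the end of the file
f.write('line three\n')       # so this write lands after the existing text
f.seek(0)                     # go back to the start to read everything again
after = f.read()
f.close()

print(after)
```

A related mode, 'w+', also allows reading and writing, but like 'w' it truncates the file when it is opened.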

