Sunday, November 13, 2011

Lists

Python has a built-in type called a list. A Python list is an ordered sequence of objects (numbers, strings, or other objects) enclosed in a pair of square brackets, [].


The Python range() function is a built-in function that returns a list containing a sequence of numbers. In our case, we asked for the range 1 to 5, which excludes the last number, 5. If you don't specify the first number, the range starts at zero (0).


See that range(3,3) returns an empty list. That's because the end of the range isn't included and since we requested a range from 3 to 3, it's empty. The second range request (range(3,4)) returns a list of only one element, the number 3.
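Here is a minimal sketch of the range() calls described above. These posts target Python 2, where range() returns a list directly; the list() wrapper is an addition of mine so that the same lines also print lists under Python 3.

```python
# list() simply copies the list on Python 2 and builds one on Python 3.
first = list(range(1, 5))    # start at 1, stop before 5
second = list(range(5))      # no start given, so it begins at 0
empty = list(range(3, 3))    # the end is excluded, so nothing is left
single = list(range(3, 4))   # only the number 3

print(first)    # [1, 2, 3, 4]
print(second)   # [0, 1, 2, 3, 4]
print(empty)    # []
print(single)   # [3]
```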

The range() function is commonly found in for loops.


We go around the loop five times, with item taking the values 0, 1, 2, 3 and 4. The variable value starts at 1, and each pass replaces it with (value + 1) * value:

item = 0
value = (1 + 1) * 1 = 2

item = 1
value = (2 + 1) * 2 = 6

item = 2
value = (6 + 1) * 6 = 42

item = 3
value = (42 + 1) * 42 = 1806

item = 4
value = (1806 + 1) * 1806 = 3263442
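The loop behind these numbers can be sketched roughly as follows. The starting value of 1 and the variable names are assumptions read off the first step of the arithmetic above.

```python
value = 1                     # assumed starting value
for item in range(5):         # item takes 0, 1, 2, 3, 4
    value = (value + 1) * value
    print(value)              # 2, 6, 42, 1806, 3263442
```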

One thing to remember about the range() function is that it can become inefficient with very large numbers. When used in a loop the way I've written it above, Python constructs and allocates memory for the entire list before the loop starts. We'll discuss this further later.

To find out how many items are in the list, you use the len() function, similar to a string.


Just as with the string object, you can iterate through the list using a for loop.


We added the numbers in the list using a for loop.
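As a small sketch of both ideas, counting the items with len() and adding them up with a for loop (the list contents here are hypothetical):

```python
a = [2, 4, 6, 8]          # hypothetical list of numbers
count = len(a)            # number of items in the list

total = 0
for number in a:          # visit each element in turn
    total = total + number

print(count)   # 4
print(total)   # 20
```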

You can join two lists together using the "+", concatenation, operator. This is like the string concatenation operator. And just like the string repeat operator, "*", the same operator is used to make copies of lists.


The list "e" is the concatenation of lists "c" and "d". The list "f" is the list "c" repeated twice. This is similar to what we saw with strings.
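A sketch of what such a session might look like. The names c, d, e and f come from the text; the list contents are made up for illustration.

```python
c = [1, 2, 3]
d = [4, 5, 6]

e = c + d     # concatenation builds a new list
f = c * 2     # repetition makes the list twice over

print(e)   # [1, 2, 3, 4, 5, 6]
print(f)   # [1, 2, 3, 1, 2, 3]
```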

And just as we can take slices of strings, we can also take slices of lists. For a comprehensive discussion of what the start and end indices mean, see the discussion on strings. For now, it's enough to say that the first element in the list has the index zero (0), and like strings, we can also count from the end of the list, where the last element has the index -1.


a = [1, 2, 3, 4, 5, 6]

The number "1" has the index zero (0). The number "2" has the index 1. And so on. So the slice:

b = a[0:2]

Takes elements from index zero (0) -- the "1", up to and not including the index 2 -- the "3". So the slice becomes [1, 2].
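Put together as runnable lines, with a couple of negative-index lookups added as an illustration of counting from the end:

```python
a = [1, 2, 3, 4, 5, 6]

b = a[0:2]       # indices 0 and 1; index 2 is not included
last = a[-1]     # counting from the end
tail = a[-2:]    # the last two elements

print(b)      # [1, 2]
print(last)   # 6
print(tail)   # [5, 6]
```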

But guess what: unlike strings, lists are mutable! You can change the list elements in place.


This wasn't possible with a string. In the case of strings, trying to change an element in the string by assigning it as:

s[index] = value

would give you an error message.
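A short sketch of the difference. The values are hypothetical, and the try/except is only there so the script keeps running after the string assignment fails.

```python
a = [1, 2, 3]
a[0] = 10            # lists allow item assignment
print(a)             # [10, 2, 3]

s = "jack"
try:
    s[0] = "m"       # strings do not
    changed = True
except TypeError as err:
    changed = False
    print(err)       # Python reports that str does not support item assignment
```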

You can also delete and insert elements into the list. There are two ways to delete elements. The first is to assign an empty list in the position of the elements you want deleted. For example, say you have the list:

a = [1, 2, 3, 4, 5, 6]

and you want to delete the elements, [2, 3]. These are represented by slice a[1:3]. Here's how you do that.


Python also provides a del statement, which makes the deletion easier to see than empty list assignment.
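Both approaches side by side, as a sketch. The second list, b, is just a fresh copy of the same starting values so the two deletions can be compared.

```python
a = [1, 2, 3, 4, 5, 6]
a[1:3] = []          # assign an empty list over the slice [2, 3]
print(a)             # [1, 4, 5, 6]

b = [1, 2, 3, 4, 5, 6]
del b[1:3]           # same effect, spelled with del
print(b)             # [1, 4, 5, 6]
```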


Lastly, let's discuss inserting elements into the list. To insert an element at a position in the list, you can use slice notation. If you assign to a plain index instead, you will overwrite the value at that index.

For example, say you have the list:

a = [1, 2, 3, 4, 5, 6]

and you'd like to insert the value zero (0) right after the 3, you'd do the following:


Notice that slice [3:3] selects elements starting at position 3 (currently occupied by the number 4), but doesn't include that position. So, in effect what happens is that the element in that position is shifted over.

What if you wanted to replace the slice [2, 3, 4] with the single zero?


On the left side of the assignment, we selected the slice that includes [2, 3, 4], that's slice [1:4]. We replace that slice with the list [0].
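The insert and the replacement can be sketched like this. Each operation starts from a fresh copy of the list so the results are easy to compare; the names b and c are only for illustration.

```python
a = [1, 2, 3, 4, 5, 6]
a[3:3] = [0]         # empty slice at position 3: pure insertion
print(a)             # [1, 2, 3, 0, 4, 5, 6]

b = [1, 2, 3, 4, 5, 6]
b[1:4] = [0]         # replace [2, 3, 4] with a single 0
print(b)             # [1, 0, 5, 6]

c = [1, 2, 3, 4, 5, 6]
c[3] = 0             # plain index: overwrites the 4 instead of inserting
print(c)             # [1, 2, 3, 0, 5, 6]
```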

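One thing you have to be careful about with lists is copying them. When you assign a list to another variable, an alias is made, not a copy. The same is true of strings, but because strings can't be changed, you never notice the difference.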

Look at the following example.


Notice what happened. A list "a" was created with three elements, [1, 2, 3]. The variable "b" was set to "a". At this point, a copy of the list wasn't made. "b" is a reference, or an alias, to the same list that "a" refers to.

So, when the element at index zero (0) is modified by the statement:

b[0] = 4

It's the same element that's referenced by "a". So when we print out the value of "a", we see that the first element, at index zero, has been changed.
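The aliasing described here, as a few runnable lines. The "is" check at the end is an extra I've added to confirm both names refer to one list.

```python
a = [1, 2, 3]
b = a            # no copy: b is another name for the same list

b[0] = 4         # change the list through b
print(a)         # [4, 2, 3]
print(a is b)    # True
```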

How do you make copies?

Take an entire slice of the list.


In this example, we used the slice operator to make a copy of "a". Remember that when you omit the first index in the slice, it's assumed that you're starting from the first element, and when you omit the last index, it's assumed that you're slicing all the way to the end. So, in this case, we're slicing from the first element all the way to the end: essentially a full copy of the list.

When we make assignments to the list "b" it doesn't affect the list "a" since we now have two separate lists.
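A sketch of the copy, using the same values as the aliasing discussion:

```python
a = [1, 2, 3]
b = a[:]         # full slice: a new list with the same elements

b[0] = 4         # only the copy changes
print(a)         # [1, 2, 3]
print(b)         # [4, 2, 3]
```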

Knowing when to slice and when to assign is very important, because when you pass a list as an argument to a function, a reference to the list is passed. If the function modifies the elements of the list, the modifications are visible to the caller.

Here's an example.


In this example we have a list "a" with three elements, [1, 2, 3]. We then define a function delh() that takes a single argument. We can tell it expects a list because the function body deletes the first element and then prints what remains.

In our example, the first call to the function is:

delh(a[:])

The argument a[:] is a copy of the list "a". The delh() function deletes the first element, and then prints the remaining elements.

Later, out of the function, when we print the values in "a" we see that it's unaffected. It's not changed.

In the second function call to delh() we do the following:

delh(a)

In this case, we're passing a reference to the list. The delh() function deletes the first element and prints the remaining ones. Later, out of the function, when we inspect our original list, "a", we find out that the first element was deleted.
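Here is one way the delh() example could be written. The function name comes from the text; the parameter name items is my own choice.

```python
def delh(items):
    del items[0]     # modifies whichever list was passed in
    print(items)

a = [1, 2, 3]

delh(a[:])   # works on a copy; prints [2, 3]
print(a)     # [1, 2, 3], the original is untouched

delh(a)      # works on the original; prints [2, 3]
print(a)     # [2, 3], the first element is gone
```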

A last word on lists. A couple of useful functions exist to convert strings to lists, and lists to strings.

The first is split. This is a string function that will split a string, based on whitespace, or any string, into a list.

Example:


In the first example, the list "l" is created using the following line:

l = string.split(s)

Because we haven't specified how to split the string, Python splits it based on whitespace. So we get a list of the words.

In the second split command,

ll = string.split(s, 's')

We instruct Python to split the string at each occurrence of the string, or letter, "s". The "s" won't be included in the result, but spaces will.
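A sketch of both splits on a hypothetical sentence. It uses the string's own split() method, which does the same job as string.split(s) and also runs on Python 3, where the module-level function no longer exists.

```python
s = "this is a test string"   # hypothetical input

l = s.split()       # no separator: split on whitespace
ll = s.split("s")   # split at every "s"; the "s" itself is dropped

print(l)    # ['this', 'is', 'a', 'test', 'string']
print(ll)   # ['thi', ' i', ' a te', 't ', 'tring']
```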

Now, if you have a list of elements, you can create a string using the "join" function.


In the first join statement, since we haven't instructed Python how we'd want the list join performed, Python joins the elements of the list using a single space. However, in the second example, we ask Python to join the elements using the string "::", two colons.
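The two joins, sketched with the join() method of the separator string. Unlike string.join(), the method has no default separator, so the single space is spelled out. The word list is hypothetical.

```python
words = ["this", "is", "a", "test"]   # hypothetical list

spaced = " ".join(words)     # what string.join(words) produces by default
coloned = "::".join(words)   # join with two colons instead

print(spaced)    # this is a test
print(coloned)   # this::is::a::test
```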

This has been a fairly long section on lists. The next one is very short, because most of the material covering lists also pertains to that special type of a list, called a Tuple.

String operations - Part 3 of 3

Python strings are immutable, which means that once you create one, you can't change it. In part 2 of our discussion on strings, we saw how to extract a character from a string.


However, it's illegal to try and replace a character inline. The string "a" in the example above cannot be changed.


You get the error message displayed above: the 'str' object does not support item assignment. If you want to change the contents of the string, the only thing to do is create another one. Using slices, you can do the following:


What we did is take the letter "m", add, or concatenate, it to the rest of the original string "a" starting at position 1 ("ack"), and then assign the result to the name "a". It looks like we modified the original string "a", but what Python actually does is create a new string and make "a" refer to it; the old string is discarded. This allows us to do the following:


In each of the assignments above, a new string is created from the characters on the right side and assigned to the variable "a"; the previous string is discarded. So a="jack" is a different string from a="jill".
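A sketch of the rebuilding and reassignment just described, assuming the string starts out as "jack":

```python
a = "jack"
a = "m" + a[1:]    # new string: "m" plus "ack"
print(a)           # mack

a = "jack"         # each assignment binds the name to a different string
a = "jill"
print(a)           # jill
```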

We've already seen how to find the length of a string using the Python function len().


We can also use the for loop in Python to count the characters.


The string module has a find() function. This function allows you to find a character, or another string, inside a string. In order to use the find() function, we have to import the string module.


Notice how we call the find() function through the string module name, as string.find(). find() is a function defined in the string module. It requires two arguments: the string that you're searching and the substring to look for.

In our example:

string.find(a, "n")

This instructs the find() function to look for the "n" string inside the "a" string. The "a" variable points to the string "canada". In that string, "n" is at position 2, remembering that the first character is at position 0.

If the find() function does not find the string that we're looking for, it returns the number -1.


See that the value of "i", in the second find() is -1 because the string "canada" does not contain the letter "q" that we're looking for.
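The two searches can be sketched as follows. I've used the find() method on the string itself, which returns the same values as string.find(a, ...) and also runs on Python 3; the name j for the second result is my own.

```python
a = "canada"

i = a.find("n")    # "n" is the third character, index 2
j = a.find("q")    # not present, so -1

print(i)   # 2
print(j)   # -1
```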

The find() function is not limited to looking for simple characters, it can also find entire words.


In the find() function call above, the word "canada" is at position 36 in the string "a". The number 36 is the starting position of the word "canada". It's the position where the "c" in "canada" is.


In the example above, after finding the position of the string "canada" we extracted it. Not a very useful example but illustrative of the fact that you can use an example of this sort to pull out words, based on whitespace.
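A sketch of finding a word and slicing it out. The sentence here is hypothetical and shorter than the one discussed above, so the word sits at position 25 rather than 36.

```python
a = "ottawa is the capital of canada"   # hypothetical sentence
word = "canada"

pos = a.find(word)                 # index of the "c" in "canada"
found = a[pos:pos + len(word)]     # slice from there to the end of the word

print(pos)     # 25
print(found)   # canada
```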

The string module contains some useful functions that will help when analysing text.


How would you use these? Well, you can use the string.lowercase variable to test if a letter in your particular string is lowercase. Here's an example that counts lowercase and uppercase characters in a string.


We have our string with upper and lowercase characters, "s". We then use a for loop to loop through each character. If it's a lowercase character, we increment the value of the lowercount variable. If it's an uppercase character, we increment the value of the uppercount variable. Notice how we use the find() function.

string.find(string.lowercase, item)

"item" is the character we're looking for.
string.lowercase is the string we're searching in. string.lowercase contains all the lowercase characters. If we can't find the character, the find() function returns the value -1. Otherwise it returns the character's index, a value from zero (0) to the length of the string minus one.

There is another, more elegant, way to do something like this. That's using the string "in" operator. The "in" operator checks to see if one string is "in"side another. For example, if the string is "canada" and we issue the following statement:

a = "can" in "canada"

The value of "a" will be True, because the string "can" is inside "canada".

Here's the counting program, written using the "in" operator.
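What follows is a sketch, not the original listing. The variable names follow the text, the input string is hypothetical, and I've used string.ascii_lowercase and string.ascii_uppercase because they exist in both Python 2 and Python 3 (string.lowercase is Python 2 only).

```python
import string

s = "Hello World From Canada"   # hypothetical input
lowercount = 0
uppercount = 0

for item in s:
    if item in string.ascii_lowercase:
        lowercount = lowercount + 1
    elif item in string.ascii_uppercase:
        uppercount = uppercount + 1
    # spaces and punctuation fall through and are not counted

print(lowercount)   # 16
print(uppercount)   # 4
```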



The string module has a number of interesting functions that you can use to manipulate strings: functions to convert strings from uppercase to lowercase and vice versa, to split strings at whitespace into words, and to convert strings to numbers.

Saturday, November 12, 2011

String operations - Part 2 of 3

I introduced Python strings in an earlier post. A string is a sequence of characters.


In the example above, "a", "b" and "c" are strings. "a" contains the value "jack"

You can extract each character in the string using the [] operator. The first item is at an index value of zero (0). The letter "j" in the string "jack" is at index value zero (0).


The word "jack" has four (4) letters. "j", "a", "c" and "k". The first letter is in position zero (0), the second at position 1, the third at position 2 and the fourth letter is at position 3. When we try to access a letter beyond the length of the string, we get an error message.

The Python len() function will tell you the length of a string. The last character of a string is at the position indicated by len() minus one.
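Indexing and len() together, as a sketch. The try/except is only there so the script can show the out-of-range error without stopping.

```python
a = "jack"

first = a[0]              # 'j'
length = len(a)           # 4
last = a[len(a) - 1]      # 'k', the character at index 3

try:
    a[4]                  # one past the end
    out_of_range = False
except IndexError:
    out_of_range = True

print(first)    # j
print(length)   # 4
print(last)     # k
```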


We can iterate through this string, using the for loop and print out each letter.


Notice the comma at the end of the print statement so that we keep the letters on the same line. So that Python does not print the carriage return (or newline).

String slices

Just as you can extract a single character from a string by specifying its index, you can also slice out a sequence of characters by specifying two indices. The starting index and the index after the last character you want.

For example, in "jack", if you want to extract "ja" you'd specify zero (0) as the first index and 2 (the index of "c") as the index after the last one. This is counterintuitive if you come from programming languages where the second index is the index of the last character you want. In Python, the second index is the index of the first character you don't want.


In the example above, the letter "o" is at index zero (0). The letter "a" is at position 3. So we take everything from the letter "o" and stop before we get to "a". So we get "ont".

The second print statement has a[0:4]. Once again, the letter "o" is at position zero (0). The letter "r" is at position 4. So we'll take everything from the letter "o" and stop before we get to "r". We get "onta".

The third print statement has a[1:3]. The letter "n" is in position 1. The letter "a" is in position 3. So we'll take everything from "n" and stop before we get to "a". We get "nt"
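The three slices discussed above, written out so they can be run. The names for the results are my own.

```python
a = "ontario"

first = a[0:3]     # from "o", stopping before the "a" at index 3
second = a[0:4]    # from "o", stopping before the "r" at index 4
third = a[1:3]     # from "n", stopping before the "a" at index 3

print(first)    # ont
print(second)   # onta
print(third)    # nt
```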

Strings are also numbered from the end. The character at the last position is at index -1. The way to remember this is to think of the fact that index zero (0) is already taken by the first character. So numbering from the end of the string starts at -1. The second-last character is at index -2. And so on.

Each character therefore has two indexes. One index defines its position from the front of the string and a second index that defines its position from the back.


So now it's easy to print a string backwards using a loop.


In the loop above, we iterate backwards from -1, down to -4 printing each character in the string.
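One way to write such a loop, assuming the string is "jack" and using a while loop. Collecting the characters into a second string is my own addition, so the result can be printed on one line.

```python
a = "jack"
backwards = ""

i = -1
while i >= -len(a):               # i takes -1, -2, -3, -4
    backwards = backwards + a[i]
    i = i - 1

print(backwards)   # kcaj
```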

Slices using negative indices

Slices are always taken forward. You specify a starting index and a second index at which the slice stops. The character at the second index is not included.

You can therefore slice using negative indices as long as you remember this. Here are the same three slices we performed on the string "ontario" using negative slices.


We're using the same slice positions, but this time from the back of the string. To get the slice "ont" in the first example we used positions 0:3. In this example we used positions -7:-4. They represent the same characters. The "o" is represented by zero (0), or -7.

One last word about slices. You can omit any of the indices. If you omit the first index, it's assumed that you want to slice from the beginning of the string. If you omit the last one, it's assumed that you want to slice to the end of the string.
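A sketch covering both points: the same three slices taken with negative indices, and slices with an index left out. The choice of 3 as the split point in the last group is arbitrary.

```python
a = "ontario"

# Negative indices: -7 is the first "o", -4 is the "a", -3 is the "r"
print(a[-7:-4])   # ont
print(a[-7:-3])   # onta
print(a[-6:-4])   # nt

# Omitted indices
start = a[:3]     # from the beginning up to index 3
rest = a[3:]      # from index 3 to the end
whole = a[:]      # the entire string

print(start)   # ont
print(rest)    # ario
print(whole)   # ontario
```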


Iteration

To iterate means to "do again." In programming, iteration performs a series of actions, over and over, until a specific condition is met.

There are two main types of iterative statements in Python.

  • while loops
  • for loops

In this example, we set a variable "i" to the value 5.

We then start the while loop. The statement reads, while the value of "i" is greater than zero (0), perform the following statements. The statements to be performed are indented under the while loop. There are two things to do. First, print the value of "i". Secondly, set "i" to the value of "i" less one (1). The second statement will reduce the value of "i" successively until it reaches zero. Until that happens Python will start the loop again.

So the first time around, "i" is 5. So the loop prints "5". Then "i" becomes "4". This is still greater than zero so the loop prints "4". Then "i" becomes "3". And so on.
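The countdown described above, written out as a sketch:

```python
i = 5
while i > 0:       # keep going while i is greater than zero
    print(i)       # prints 5, 4, 3, 2, 1 on separate lines
    i = i - 1      # move one step closer to zero
```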

Here's another example.


In this example, we set a variable "i" to 5. Once again, we're going to repeat two statements until "i" becomes zero. The statement "while i > 0" instructs the loop to keep iterating for as long as that condition holds. The first of the two statements assigns the value "c + 'a'" to the variable "c". Essentially this means: concatenate the letter "a" to the existing value of "c" and then assign the result to a variable "c". It's not the same string, as we'll see later. Python takes the existing value of "c", which is blank initially, does the concatenation, and then assigns the result to "c". The second statement reduces the value of "i" by one.

Once the loop is completed, we print out the value of "c". As you can see, it has five "a"'s. 
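A sketch of that loop, assuming "c" starts out as an empty string:

```python
i = 5
c = ""             # blank to begin with

while i > 0:
    c = c + "a"    # build a new string with one more "a"
    i = i - 1      # without this line the loop would never end

print(c)   # aaaaa
```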

If you forget to include the statement "i = i - 1", then the value of "i" never gets reduced. You'll run into what is called an "infinite loop." This is when the loop continues on forever. Infinite loops are an example of a common programming error with both novice and experienced programmers.

The other type of loop is a for loop. For loops are particularly interesting in Python because they operate on a list of items. The list has to be predefined. For example, the following is a list of numbers that the for loop will print. We haven't discussed the list object in Python yet. This particular list is a special type of list called a tuple, and it's enclosed in parentheses.


Quite simply, what this does is assign each element of the list to the variable "i" at each iteration of the loop.
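For instance, a loop of this kind might look like the following; the numbers in the tuple are made up.

```python
for i in (1, 2, 3, 4, 5):   # a tuple: like a list, but in parentheses
    print(i)                # i is 1, then 2, and so on up to 5
```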

Here's another one, using a string. Yes, Python iterates strings!


The print statement in Python adds a carriage return to each output. This is normally OK. However, if you don't want each letter to print on its own line, add a comma at the end of the print statement.
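A sketch of iterating over a string and keeping the output on one line. This version is written for Python 3, where print is a function and the trailing comma is replaced by the end argument; in the Python 2 used in these posts the loop body would be "print letter," instead. The string "jack" is just an example.

```python
for letter in "jack":
    print(letter, end=" ")   # stay on the same line, separated by spaces
print()                      # finish the line once the loop is done
```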


That's it for loops for now.

Having fun with keyboard input

Time to take a small break, a quick detour. How do you interact with Python? Up to this point we've written instructions in the Python interpreter and Python has displayed the results immediately. We'll soon get to the point where we need to write programs, more than a single or a few lines, that will require that we write the programs outside the Python interpreter and ask Python to load them from there.

But in this note, I'd like to concentrate on Python's ability to accept input from the user. You.

Look at the following example.


We use the Python function raw_input(). raw_input() takes a single argument, a prompt that you wish to display. The argument isn't necessary. If you don't provide one, then Python won't display a prompt.

In our example, we call raw_input() and provide the prompt "Type something: ".

We then store the result in a variable, "a".

Here's an example where we get two numbers from the user, and then add them.


A few things to note from this example.

The return value of the raw_input() function is a string. Even though we typed in numbers, 12 and 5, they were stored in the string variables a and b. So, in order to add them, we have to use the built-in int() function to convert them to integers, so that we can do the integer math.

Notice what happens when we try to add "a" and "b": instead of addition, we get string concatenation. The string "12" is concatenated with the string "5", which is not what we want. When you use the type() function to see what data type the variable "a" is, you can see that it's a string.
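The behaviour can be sketched without a keyboard by using string literals in place of the values raw_input() would return. That substitution is mine, made so the lines run unattended.

```python
a = "12"    # stands in for what raw_input() returns when the user types 12
b = "5"     # likewise for 5

joined = a + b             # string concatenation
total = int(a) + int(b)    # convert first, then add

print(joined)              # 125
print(total)               # 17
print(isinstance(a, str))  # True: a is still a string
```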

How would you permanently convert them to integers?


And the entire program, reading and converting to integers in one swoop.


Note how we use the return value of the raw_input() function as the argument for the int() conversion function, finally assigning that conversion to the variable "a".

Logical operators

This is a short section to introduce the comparison operators in Python, which produce the logical values True and False. If you've programmed in C, you'll recognise these immediately.


==    Equals
!=    Not equal
<     Less than
>     Greater than
<=    Less than or equal to
>=    Greater than or equal to


Here are a few examples of their usage.


Mostly self explanatory. You can store the value of the test in a variable by assigning it with the equals sign. As in:

c = x > y

If you examine the value of c, it will contain the boolean answer "False" because x = 1 and y = 2 and therefore x > y is False.
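A few of these tests as runnable lines, using the values x = 1 and y = 2 mentioned above:

```python
x = 1
y = 2

print(x == y)   # False
print(x != y)   # True
print(x < y)    # True
print(x >= y)   # False

c = x > y       # store the result of the test
print(c)        # False
```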


Functions

In computer programming, a function is a named set of operations that performs a specific task. A function consistently does the same thing.

Python has functions, and we've seen one of them already, that perform type conversion. Converting from one type of data to another. For example, we saw that if we want to perform floating point division, and we have two integers, Python will automatically perform integer division, truncating anything on the right side of the decimal point, unless we tell Python that we'd like to perform floating point division. We tell Python that we'd like to do floating point division by converting one, or both, of the numbers to a floating point number.


In the example above, the answer to 20/3 would result in the number 6. However, if we use the function float(), then we get the right answer.

float() is a Python function that takes an argument and returns a floating point number, if it can.


The float() function normally works to convert integers to floating point, or to extract floating point numbers from strings. In the example above, the integer 7 is converted to 7.0. The string "7" is converted to the floating point number 7.0. Two things happened to the string. First, the number was extracted from the sequence of characters. And secondly, the number was converted.

The float() function will also convert floating point numbers. However, look at the last conversion. When trying to convert the string "jack" to a number, we get an error message. The string "jack" doesn't contain a valid number that can be converted.
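A sketch of these conversions. The variable names are my own, and the try/except is there only so the failed conversion doesn't stop the script. Note that the plain 20/3 yields 6 only on Python 2; the float() form below gives the fractional answer on either version.

```python
ratio = float(20) / 3      # floating point division, roughly 6.67

seven_a = float(7)         # integer to float: 7.0
seven_b = float("7")       # string to float: 7.0
same = float(7.5)          # already a float: 7.5

try:
    float("jack")          # not a valid number
    converted = True
except ValueError:
    converted = False

print(seven_a)     # 7.0
print(seven_b)     # 7.0
print(same)        # 7.5
print(converted)   # False
```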

The str() function converts its argument to a string.


In the examples above, the str() function takes the integer, 1, and returns the string '1'. Similarly, it takes the floating point number 1.1 and returns the string '1.1'

The str() function will also convert strings, back to strings. This might be useful in the case where you're writing data to a file and don't want to test each type before writing it out. You can be sure that the str() function will faithfully take your strings and write them unchanged.

Notice the statement:

'jack' + 1

This results in a Python error. Python does not know what to do. In the next statement:

'jack' + str(1)

Python converts the integer, 1, to a string first. Now Python knows that we need to use the string concatenation operator, the "+", to concatenate the string "jack" to the new string "1".
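The str() conversions and the concatenation, as a sketch. The variable names are mine, and the try/except only keeps the script running past the failing line.

```python
one = str(1)           # '1'
decimal = str(1.1)     # '1.1'
same = str("jack")     # a string comes back unchanged

try:
    "jack" + 1         # a string plus an integer is an error
    worked = True
except TypeError:
    worked = False

joined = "jack" + str(1)   # convert first, then concatenate

print(one)      # 1
print(decimal)  # 1.1
print(worked)   # False
print(joined)   # jack1
```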

Python allows you to create your own functions. To create your own function, you define it as follows:

def function_name(parameter_list):
    body of function

Here's an example where we define a function.


The function name is sayHi. This function has no parameters, but the parentheses are important. The function body has a single line of code. The line prints a string, "Hello World!".

When we call the function, we simply use its name with the arguments it requires.

Let's redefine the function so that it takes an argument. In this case, an argument that tells it how many times to say hello.


Now when we call the function sayHi(), we need to pass an argument. The argument will be used to repeat the string "Hello World! " a number of times. Note the space after the exclamation mark in the string "Hello World! ".
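A sketch of the redefined function. The name sayHi and the parameter n come from the text; the body is my reading of the description, using string repetition.

```python
def sayHi(n):
    # Repeat the greeting n times; note the space after the "!"
    print("Hello World! " * n)

sayHi(3)   # Hello World! Hello World! Hello World!
sayHi(2)   # Hello World! Hello World!
sayHi(0)   # prints an empty line
```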

Arguments versus Parameters

In most programming language texts, the words argument and parameter are used interchangeably, but there is a difference. A parameter is a name listed in the function definition. In our example above, "n" is a parameter in the definition:

def sayHi(n):

Each time we call the function sayHi() we pass an argument. In our example above, the numbers, 3, 2 and zero (0) are arguments.