Thinking Like a Computer Scientist: A Journey through My Aches and Pains in CSC148: February 2014

Thursday, 27 February 2014

Week 7: Running with Recursion

The goal of this week is to provide an intro into recursion: this includes what it is, when we use it and why we use it. We will look at what recursion is first. Recursion is a programming technique in which a function or method repeatedly calls itself in its body so that it can break down a large, complex task into smaller, simpler instances of the same task. We will introduce two ways to understand recursion. One is by its components.

(1) Base Case: This component determines if the recursive function is given a trivial task to complete. If the task is trivial, then the function will not call itself. If a function is given a complex task, then this component lets it know when it should stop calling itself.

(2) Recursive Step: This component is where function calls itself. Every time the function calls itself, it gives a portion of a complex task to another instance of itself to complete. The function continues dividing the work until it reaches the base case.

(3) Work: This component uses the solution from a simpler instance of the task to help it complete a current, more complex instance of the same task.

We can illustrate what the components look like with an example. Assume that we have to return every number in a possibly nested list in a new empty list. Then the recursive function, with every one of the components commented in the body of the code, looks like:

def collect_num(L: list) -> list:

‘’’

Return a new list in which every element of new list is a number in L.

‘’’

new_L = []

for x in L:

if isinstance(x, int) or isinstance(x, float):

new_L.append(x)

# The base case is the if statement. If x is a number, then the function can append it to new_L without any additional work.

elif isinstance(x, list):

new_L = new_L + collect_num(x)

# The recursive step is collect_num(x). This step executes if x is a list because there exists a chance that there are numbers in x.

# The work component is when new_L is concatenated with the return value of collect_num(x).

return new_L

In addition to understanding recursion by its components, another way to understand it is by comparing its design to that of a mathematical proof by induction. Mathematical statements that are provable by induction are generally in the form of for every n that is an element of the set of natural numbers, S(n) where the proof of S(1) is called the base case. We can think of the base case in induction as parallel to that of the base case in recursion because of their easiness to work out. To us as human beings, the base case in induction is generally the most trivial instance of the statement to prove. To the recursive function, the base case represents the most trivial task that it can complete. Likewise, the inductive step of mathematical induction also has its parallel in recursion. When we assume that for every k that is an element of the set of natural numbers, S(k) is true, we use that assumption to prove that S(k + 1) is true. Similarly, when the recursive function has a solution for a simpler instance of a task, it uses that solution to complete a more complex instance of the same task. As a result, there are two possible ways to understand recursion; one is through an analysis of its components and another is through its comparison to that of a mathematical proof by induction.

The reason why we use recursion is that it is a more efficient for completing complex tasks. The alternative to a recursive function is a function that only uses iteration. Iteration with for loops and while loops is best for functions that complete simpler tasks. For example, assume that we have to multiply the elements in a list of numbers. Compare the function that only uses iteration to the function that uses recursion:

def multiply_elements(L: list) -> None:

‘’’

Return the result of every element in L multiplied with each other.

‘’’

m = 1

for x in L:

m = m * x

return m

def recursive_multiply(L: list) -> None:

‘’’

Return the result of every element in L multiplied with each other.

‘’’

if len(L) == 0:

return 1

return L[0] * recursive_multiply(L[1:])

Recursion for this task is unnecessary because the task is already at its most trivial form. For the task of collecting every number in a possibly nested list however, recursion is the most natural solution. In that scenario, an iterative solution will most likely result in a hideously complicated function. Therefore, as a general rule of thumb, recursion should be used in complex tasks that can be broken down into simpler instances of the same task. If the task is already at its simplest form, then iteration is the best solution.

Sunday, 16 February 2014

Week 6: Trolling for Trees

After our tour of Python’s memory model, we arrive at a data structure called a tree. We will go over what is a tree and we will introduce some terms that come with trees. So what is a tree? A tree is set of nodes such that every node is an object with a value and a link to other nodes. We can represent the appearance of a tree horizontally from left-to-right with this diagram.

My apologies that I cannot get the vertical to work. I could just steal one from the internet but they look very strange next to this background…

There are terms to describe the nodes by their positions. Here are the ones that come from the analogy of the tree as a plant:

(1) Root: This is the node with the value of 2 at the top of the tree (or the left of the in sideways representation).

(2) Leaves: These are nodes with values 2, 5, 11 and 4 at the bottom of the tree (or the right of the tree in sideways representation). They have no links to other nodes.

Here are the terms that come from the analogy of a tree as a family tree:

(1) Parents: These are the nodes with values 2, 7, 6, 5, and 9. They have links to other nodes.

(2) Children. Every node in a tree, except the root, is a child because they are linked by other nodes.

Here are some other terms:

(1) Arity: If we have a set that contains the number of children for every parent in a tree, then the arity of that tree is the maximum number in that set. The arity for the tree above is two because every parent either has one child or two children.

(2) Length of path: The length of the path between two nodes is number of nodes in the sequence of every node from the first node to the last.

(3) Depth of a node: The depth of a node is the length of the path from the root to the node.

(4) Tree traversal: The act of visiting every node in the tree. There are three ways of traversing a tree: preorder, postorder and inorder.

(5) Preorder: Preorder traversals start from the root. They visit the parents first and then the left and right children. The preorder traversal for the tree above is 2, 7, 2, 5, 6, 11, 5, 9, 4.

(6) Postorder: Postorder traversals start from the leftmost node. They visit the left and right children first and then the parent. The postorder traversal for the tree above is 2, 5, 11, 6, 7, 4, 9, 5, 2.

(7) Inorder: Inorder traversals also start from the leftmost node. They visit the left child, then the parent and then the right child. The preorder traversal for the tree above is 2, 7, 6, 5, 11, 2, 5, 9, 4.

One important thing to note is that trees are not cyclical. Nodes of the left subtree cannot link to nodes in the right subtree. If the node with value 4 in the tree above links to the node with value 11, then it is not a tree.

When my professor introduced the concept of a tree to us, I did not find it terribly difficult to understand. However, my problem with implementing classes from the post about Object-Oriented Programming are still present three weeks later. This is mostly related to the syntax. However, I believe I am getting better at it. Probably a little more work should do it.

Sunday, 9 February 2014

Week 5: Remembering Python's Memory Model

The goal of this week is to appreciate how Python’s memory model brings two interesting advantages to programmers. A memory model in the context of programming languages is the structure of how a programming language stores information about its objects during program execution. In Python, when a name is defined, it refers to an id, which stores an object. The link between the name and the id is stored in a dictionary-like data structure called a namespace. What makes Python’s memory model interesting is that a separate namespace is created for every module and function call during program execution; when a module or function exits, its namespace is deleted. We can then introduce two basic rules in Python's memory model:

(1) If there exists two same names in two separate namespaces, then the names do not interfere with each other during program execution.

(2) If a function needs to work with a name that is defined in the global namespace of a module, then the function will a create an identical name with the same id in its local namespace.

The creation of separate namespaces is the first interesting advantage of Python's memory model. We can illustrate why this is an advantage with a thought experiment: assume that the negation of rules (1) and (2) are true for Python's memory model. The result is that programming in Python becomes very difficult because we can never use the same name twice in a program. Additionally, if we run multiple programs at a time, then we have to ensure that there are no overlapping names between the programs. Therefore, the inability to alter the ids of names in a separate namespace is one advantage of Python’s memory model.

Another advantage of Python’s memory model is its flexibility. To illustrate this advantage, assume that in the animated solver of the Towers of Anne Hoy game, function min_moves(n) has to find an integer i between zero and the number of cheeses n, such that the solver can solve the game in the minimum moves. The problem is that the function returns the minimum moves for the chosen i but it does not return i itself. So how can we access the chosen i? A possible solution is to define i in the global namespace of the module. Then use the global assignment in function min_moves to assign global i another id:

def min_moves(n: int) -> int:

…

for something in something:

…min_moves(n - i)

if something:

global i

…

i = something

…

return fewest_moves

Every name i after ‘global i’ in the function refers to the i in the global namespace of the module. Since we can access global i from anywhere in the module, then the global assignment is a solution to the inability of function min_moves(n) to return i. Therefore, the existence of a global assignment that can change the id of a name in the global namespace is another advantage of Python’s memory model; it demonstrates its flexibility under special conditions. Combined with a function's inability to alter the ids of names in a separate namespace under normal conditions, these are two interesting advantages of Python's memory model.