11.1 Python

We could implement a Charme interpreter using Scheme or any other universal programming language, but implement it using the programming language Python. Python is a popular programming language initially designed by Guido van Rossum in 1991.1 Python is freely available from http://www.python.org.

We use Python instead of Scheme to implement our Charme interpreter for a few reasons. The first reason is pedagogical: it is instructive to learn new languages. As Dijkstra's quote at the beginning of this chapter observes, the languages we use have a profound effect on how we think. This is true for natural languages, but also true for programming languages. Different languages make different styles of programming more convenient, and it is important for every programmer to be familiar with several different styles of programming. All of the major concepts we have covered so far apply to Python nearly identically to how they apply to Scheme, but seeing them in the context of a different language should make it clearer what the fundamental concepts are and what are artifacts of a particular programming language.

Another reason for using Python is that it provides some features that enhance expressiveness that are not available in Scheme. These include built-in support for objects and imperative control structures. Python is also well-supported by most web servers (including Apache), and is widely used to develop dynamic web applications.

The grammar for Python is quite different from the Scheme grammar, so Python programs look very different from Scheme programs. The evaluation rules, however, are quite similar to the evaluation rules for Scheme. This chapter does not describe the entire Python language, but introduces the grammar rules and evaluation rules for the most important Python constructs as we use them to implement the Charme interpreter.

Like Scheme, Python is a universal programming language. Both languages can express all mechanical computations. For any computation we can express in Scheme, there is a Python program that defines the same computation. Conversely, every Python program has an equivalent Scheme program.

One piece of evidence that every Scheme program has an equivalent Python program is the interpreter we develop in this chapter. Since we can implement an interpreter for a Scheme-like language in Python, we know we can express every computation that can be expressed by a program in that language with an equivalent Python program: the Charme interpreter with the Charme program as its input.

Tokenizing. We introduce Python using one of the procedures in our interpreter implementation. We divide the job of parsing into two procedures that are combined to solve the problem of transforming an input string into a list describing the input program's structure. The first part is the tokenizer. It takes as input a string representing a Charme program, and outputs a list of the tokens in that string.

A token is an indivisible syntactic unit. For example, the Charme expression, (define square (lambda (x) (* x x))), contains 15 tokens: (, define, square, (, lambda, (, x, ), (, *, x, x, ), ), and ). Tokens are separated by whitespace (spaces, tabs, and newlines). Punctuation marks such as the left and right parentheses are tokens by themselves.

The tokenize procedure below takes as input a string s in the Charme target language, and produces as output a list of the tokens in s. We describe the Python language constructs it uses next.

def tokenize(s):   # # starts a comment until the end of the line
    current = ' '  # initialize current to the empty string (two single quotes)
    tokens = []    # initialize tokens to the empty list
    for c in s:    # for each character, c, in the string s
        if c.isspace():   # if c is a whitespace
            if len(current) > 0:   # if the current token is non-empty
                tokens.append(current)   # add it to the list}
                current = ''   # reset current token to empty string
        elif c in '()':   # otherwise, if c is a parenthesis
            if len(current) > 0:   # end the current token
                tokens.append(current)   # add it to the tokens list
                current = ' '  # and reset current to the empty string
            tokens.append(c)   # add the parenthesis to the token list
        else:   # otherwise (it is an alphanumeric)
            current = current + c   # add the character to the current token
    # end of the for loop reached the end of s
    if len(current) > 0:   # if there is a current token
        tokens.append(current)   # add it to the token list
    return tokens   # the result is the list of tokens

  1. The name Python alludes to Monty Python's Flying Circus.