3.1 Problems with natural languages

Natural languages, such as English, work adequately (most, but certainly not all, of the time) for human-human communication, but are not well-suited for human-computer or computer-computer communication. Why can’t we use natural languages to program computers?

Next, we survey several of the reasons for this. We use specifics from English, although all natural languages suffer from these problems to varying degrees.

  1. Complexity. Although English may seem simple to you now, it took many years of intense effort (most of it subconscious) for you to learn it. Despite using it for most of their waking hours for many years, native English speakers know only a small fraction of the entire language. The Oxford English Dictionary contains 615,000 words, of which a typical native English speaker knows about 40,000.

  2. Ambiguity. Not only do natural languages have huge numbers of words, but most words also have many different meanings. Understanding the intended meaning of an utterance requires knowing the context, and sometimes requires pure guesswork.

    For example, what does it mean to be paid biweekly? According to the American Heritage Dictionary [1], biweekly has two definitions:

    1. Happening every two weeks.

    2. Happening twice a week; semiweekly.

    Merriam-Webster’s Dictionary [2] lists the same two meanings in the opposite order:

    1. occurring twice a week

    2. occurring every two weeks : fortnightly

    So, depending on which definition is intended, someone who is paid biweekly could be paid either once or four times every two weeks! The behavior of a payroll management program had better not depend on how biweekly is interpreted.

    Even if we can agree on the definition of every word, the meaning of a sentence is often ambiguous. This particularly difficult example is taken from the instructions with a shipment of ballistic missiles from the British Admiralty [3]:

    It is necessary for technical reasons that these warheads be stored upside down, that is, with the top at the bottom and the bottom at the top. In order that there be no doubt as to which is the bottom and which is the top, for storage purposes, it will be seen that the bottom of each warhead has been labeled ’TOP’.
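
    The biweekly example above can be made concrete with a small sketch in Scheme. The procedure name paychecks-in-two-weeks and the two reading labels are hypothetical, introduced here only for illustration. The point is that the same word yields two different answers, so a program must be told which reading is meant:

    ```scheme
    ; Illustrative sketch only: these names are assumptions, not from the text.
    ; Returns how many paychecks arrive in a two-week period under a given
    ; reading of the word "biweekly".
    (define (paychecks-in-two-weeks reading)
      (cond ((eq? reading 'every-two-weeks) 1)       ; one payment per two weeks
            ((eq? reading 'twice-a-week) (* 2 2))    ; two per week, for two weeks
            (else (error "unknown reading of biweekly"))))

    (paychecks-in-two-weeks 'every-two-weeks) ; evaluates to 1
    (paychecks-in-two-weeks 'twice-a-week)    ; evaluates to 4
    ```

    Nothing in the word biweekly itself selects between 1 and 4; the choice has to be stated explicitly.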

  3. Irregularity. Because natural languages evolve over time as different cultures interact and speakers misspeak and listeners mishear, natural languages end up a morass of irregularity. Nearly all grammar rules have exceptions. For example, English has a rule that we can make a word plural by appending an s. The new word means “more than one of the original word’s meaning”. This rule works for most words: word $\mapsto$ words, language $\mapsto$ languages, person $\mapsto$ persons [4].

    It does not work for all words, however. The plural of goose is geese (and gooses is not an English word), the plural of deer is deer (and deers is not an English word), and the plural of beer is controversial (and may depend on whether you speak American English or Canadian English).

    These irregularities can be charming for a natural language, but they are a constant source of difficulty for non-native speakers attempting to learn a language. There is no sure way to predict when the rule can be applied, and it is necessary to memorize each of the irregular forms.
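
    One way to picture the cost of irregularity is a sketch of a pluralizing procedure in Scheme. The names irregular-plurals and plural are assumptions made for this illustration, and the table holds only the two exceptions mentioned above. A realistic version would need an entry for every irregular word:

    ```scheme
    ; Illustrative sketch only: these names are assumptions, not from the text.
    ; Irregular forms cannot be predicted, so each one is listed explicitly.
    (define irregular-plurals
      '(("goose" . "geese")
        ("deer" . "deer")))

    (define (plural word)
      (let ((entry (assoc word irregular-plurals)))
        (if entry
            (cdr entry)                    ; a memorized exception
            (string-append word "s"))))    ; the regular rule: append an s

    (plural "word")   ; evaluates to "words"
    (plural "goose")  ; evaluates to "geese"
    (plural "deer")   ; evaluates to "deer"
    ```

    The regular rule fits on one line, while the exceptions can only be handled by a table that grows with every irregular word.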

  4. Uneconomic.

    I have made this letter longer than usual, only because I have not had the time to make it shorter. (Blaise Pascal, 1657)

    It requires a lot of space to express a complex idea in a natural language. Many superfluous words are needed for grammatical correctness, even though they do not contribute to the desired meaning. Since natural languages evolved for everyday communication, they are not well suited to describing the precise steps and decisions needed in a computer program.

    As an example, consider a procedure for finding the maximum of two numbers. In English, we could describe it like this:

    To find the maximum of two numbers, compare them. If the first number is greater than the second number, the maximum is the first number. Otherwise, the maximum is the second number.

    Perhaps shorter descriptions are possible, but any much shorter description probably assumes the reader already knows a lot. By contrast, we can express the same steps in the Scheme programming language in a very concise way (don’t worry if this doesn’t make sense yet; it should by the end of this chapter):

    (define (bigger a b) (if (> a b) a b))
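
    As a quick check, the definition can be tried on a few inputs. The comments give the values these expressions should evaluate to in a standard Scheme interpreter:

    ```scheme
    ; The definition from the text, repeated so this block runs on its own.
    (define (bigger a b) (if (> a b) a b))

    (bigger 3 7)  ; evaluates to 7
    (bigger 7 3)  ; evaluates to 7
    (bigger 5 5)  ; evaluates to 5 (neither is greater, so the second is returned)
    ```

    Each part of the English description maps to one piece of the expression: the comparison is (> a b), the first outcome is a, and the "otherwise" outcome is b.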

  5. Limited means of abstraction. Natural languages provide small, fixed sets of pronouns to use as means of abstraction, and the rules for binding pronouns to meanings are often unclear. Since programming often involves using simple names to refer to complex things, we need more powerful means of abstraction than natural languages provide.
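
    As a small sketch of what a more powerful means of abstraction can look like, Scheme's define lets us attach a name of our choosing to any value or computation. The names seconds-per-day and days->seconds below are hypothetical, chosen only for this illustration:

    ```scheme
    ; Illustrative sketch only: these names are assumptions, not from the text.
    (define seconds-per-day (* 24 60 60))   ; a simple name for a computed value

    (define (days->seconds days)            ; a simple name for a computation
      (* days seconds-per-day))

    (days->seconds 7)  ; evaluates to 604800
    ```

    Unlike a pronoun, each name here is bound to exactly one meaning, and the rule for finding that meaning is unambiguous.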

  1. American Heritage, Dictionary of the English Language (Fourth Edition), Houghton Mifflin Company, 2007 (http://www.answers.com/biweekly).

  2. Merriam-Webster Online, Merriam-Webster, 2008 (http://www.merriam-webster.com/dictionary/biweekly).

  3. Carl C. Gaither and Alma E. Cavazos-Gaither, Practically Speaking: A Dictionary of Quotations on Engineering, Technology and Architecture, Taylor & Francis, 1998. 

  4. Or is it people? What is the singular of people? What about peeps? Can you only have one peep?