A value is one of the basic things a program works with, like a letter or a number. The values we have seen so far are 1, 2, and 'Hello, World!'.
These values belong to different types: 2 is an integer, and 'Hello, World!' is a string, so-called because it contains a “string” of letters. You (and the interpreter) can identify strings because they are enclosed in quotation marks.
The print statement also works for integers.
>>> print 4
4
If you are not sure what type a value has, the interpreter can tell you.
>>> type('Hello, World!')
<type 'str'>
>>> type(17)
<type 'int'>
Not surprisingly, strings belong to the type str and integers belong to the type int. Less obviously, numbers with a decimal point belong to a type called float, because these numbers are represented in a format called floating-point.
>>> type(3.2)
<type 'float'>
What about values like '17' and '3.2'? They look like numbers, but they are in quotation marks like strings.
>>> type('17')
<type 'str'>
>>> type('3.2')
<type 'str'>
They're strings.
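As a quick check, here is a small sketch that collects the type name of each value discussed above. It is written so that it runs under both Python 2 (used in this chapter) and Python 3. The names values and type_names, the for loop, and the __name__ attribute are illustrative additions that the text has not introduced.

```python
# A handful of values, including two strings that merely look like numbers.
values = [17, 3.2, 'Hello, World!', '17', '3.2']

# type(v).__name__ is the short name of the type, such as 'int' or 'str'.
type_names = []
for v in values:
    type_names.append(type(v).__name__)

print(type_names)
```

The last two entries come out as str, not int or float, because the quotation marks decide the type, not the characters inside them.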
When you type a large integer, you might be tempted to use commas between groups of three digits, as in 1,000,000. This is not a legal integer in Python, but it is a legal expression:
>>> print 1,000,000
1 0 0
Well, that's not what we expected at all! Python interprets 1,000,000 as a comma-separated sequence of integers which it prints with spaces between.
This is the first example we have seen of a semantic error: the code runs without producing an error message, but it doesn't do the “right” thing.
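To see what Python actually builds from 1,000,000, you can assign it to a variable and look at the result. This is only a sketch: the names million and intended are made up for the example, and print is called with parentheses around a single value so the code runs under both Python 2 and Python 3.

```python
# The commas make this a tuple of three integers, not one large integer.
million = 1,000,000
print(million)

# Writing the digits without commas gives the intended integer.
intended = 1000000
print(intended)
```

The first variable holds three separate integers (1, 0, and 0), which is why no error message appears: the code is legal, it just means something else.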
One of the most powerful features of a programming language is the ability to manipulate variables. A variable is a name that refers to a value.
The assignment statement creates new variables and gives them values:
>>> message = 'And now for something completely different'
>>> n = 17
>>> pi = 3.1415926535897931
This example makes three assignments. The first assigns a string to a new variable named message; the second gives the integer 17 to n; the third assigns the (approximate) value of π to pi.
A common way to represent variables on paper is to write the name with an arrow pointing to the variable's value. This kind of figure is called a state diagram because it shows what state each of the variables is in (think of it as the variable's state of mind). This diagram shows the result of the assignment statements:
message --> 'And now for something completely different'
n --> 17
pi --> 3.1415926535897931
The print statement displays the value of a variable:
>>> print n
17
>>> print pi
3.14159265359
The type of a variable is the type of the value it refers to.
>>> type(message)
<type 'str'>
>>> type(n)
<type 'int'>
>>> type(pi)
<type 'float'>
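Because the type belongs to the value and not to the name, the same variable can report a different type after it is reassigned. The sketch below illustrates this. The names type_before and type_after are hypothetical, and reassignment itself is something the text has not yet discussed.

```python
n = 17
type_before = type(n).__name__   # 'int', because n refers to an integer

n = 'seventeen'
type_after = type(n).__name__    # 'str', because n now refers to a string

print(type_before)
print(type_after)
```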
Programmers generally choose names for their variables that are meaningful—they document what the variable is used for.
Variable names can be arbitrarily long. They can contain both letters and numbers, but they cannot begin with a number. Although it is legal to use uppercase letters, by convention we don't. If you do, remember that case matters. Bruce and bruce are different variables.
The underscore character (_) can appear in a name. It is often used in names with multiple words, such as my_name or airspeed_of_unladen_swallow.
If you give a variable an illegal name, you get a syntax error:
>>> 76trombones = 'big parade'
SyntaxError: invalid syntax
>>> more@ = 1000000
SyntaxError: invalid syntax
>>> class = 'Advanced Theoretical Herpetology'
SyntaxError: invalid syntax
76trombones is illegal because it begins with a number. more@ is illegal because it contains an illegal character, @. But what's wrong with class?
It turns out that class is one of Python's keywords. The interpreter uses keywords to recognize the structure of the program, and they cannot be used as variable names.
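If you would rather test a name without triggering the error at the prompt, one possible approach is to ask Python to compile the statement and report whether it succeeds. This is a sketch that goes beyond what the chapter has covered: the helper is_legal_assignment is a made-up name, and it relies on the built-in compile function and on try/except, neither of which has been introduced yet.

```python
def is_legal_assignment(source):
    """Return True if the statement in source compiles without a syntax error."""
    try:
        compile(source, '<example>', 'exec')
        return True
    except SyntaxError:
        return False

statements = [
    "my_name = 'Bruce'",
    "76trombones = 'big parade'",
    "more@ = 1000000",
    "class = 'Advanced Theoretical Herpetology'",
]

# Check each statement in turn; only the first one uses a legal name.
results = []
for statement in statements:
    results.append(is_legal_assignment(statement))

print(results)
```

Compiling a statement does not run it, so none of these variables is actually created.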
Python 2 has 31 keywords:
and       del       from      not       while
as        elif      global    or        with
assert    else      if        pass      yield
break     except    import    print
class     exec      in        raise
continue  finally   is        return
def       for       lambda    try
You might want to keep this list handy. If the interpreter complains about one of your variable names and you don't know why, see if it is on this list.
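Instead of consulting a printed list, you can also ask Python itself. The standard library's keyword module provides iskeyword and kwlist. The sketch below uses them with hypothetical variable names, and print is called with parentheses so the code runs under both Python 2 and Python 3.

```python
import keyword

# iskeyword reports whether a string is a reserved word.
is_class_keyword = keyword.iskeyword('class')
is_message_keyword = keyword.iskeyword('message')

# kwlist holds every keyword of the interpreter that is running the code.
keyword_count = len(keyword.kwlist)

print(is_class_keyword)
print(is_message_keyword)
print(keyword_count)
```

The count depends on the version of Python you run, so it may differ from the number given above.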
A statement is an instruction that the Python interpreter can execute. We have seen two kinds of statements: print and assignment.
When you type a statement on the command line, Python executes it and displays the result, if there is one.
A script usually contains a sequence of statements. If there is more than one statement, the results appear one at a time as the statements execute.
For example, the script
print 1
x = 2
print x
produces the output
1
2
The assignment statement produces no output itself.
Operators are special symbols that represent computations like addition and multiplication. The values the operator is applied to are called operands.
The following examples demonstrate the arithmetic operators:
20+32 hour-1 hour*60+minute minute/60 5**2 (5+9)*(15-7)
The symbols +, -, and /, and the use of parentheses for grouping, mean in Python what they mean in mathematics. The asterisk (*) is the symbol for multiplication, and ** is the symbol for exponentiation.
When a variable name appears in the place of an operand, it is replaced with its value before the operation is performed.
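Here is a short sketch of several of these operators at work. The values assigned to hour and minute are made up for the example, and the result names are hypothetical.

```python
hour = 11
minute = 59

# hour and minute are replaced by 11 and 59 before the arithmetic happens.
total_minutes = hour * 60 + minute   # 11*60 + 59 = 719
previous_hour = hour - 1             # 10
squared = 5 ** 2                     # 25
grouped = (5 + 9) * (15 - 7)         # 14 * 8 = 112

print(total_minutes)
print(grouped)
```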
Addition, subtraction, multiplication, and exponentiation all do what you expect, but you might be surprised by division. The following operation has an unexpected result:
>>> minute = 59
>>> minute/60
0
The value of minute is 59, and in conventional arithmetic 59 divided by 60 is 0.98333, not 0. The reason for the discrepancy is that Python is performing floor division.
When both of the operands are integers, the result must also be an integer; floor division discards the fraction part by rounding down (toward negative infinity), so in this example the result is zero.
If either of the operands is a floating-point number, Python performs floating-point division, and the result is a float:
>>> minute/60.0
0.98333333333333328
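The two kinds of division can be compared side by side. This sketch uses the // operator, which the text above does not introduce. It performs floor division in both Python 2 and Python 3, so the results do not depend on which version you run. The variable names are hypothetical.

```python
minute = 59

# Floor division of two integers: the fraction part is discarded.
floor_result = minute // 60      # 0

# A floating-point operand forces floating-point division.
float_result = minute / 60.0     # about 0.9833

# Floor division rounds down, so a negative quotient moves away from zero.
negative_floor = -59 // 60       # -1

print(floor_result)
print(float_result)
print(negative_floor)
```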
An expression is a combination of values, variables, and operators. If you type an expression on the command line, the interpreter evaluates it and displays the result:
>>> 1 + 1
2
Although expressions can contain values, variables, and operators, not every expression contains all of these elements. A value all by itself is considered an expression, and so is a variable.
>>> 17
17
>>> x
2
In a script, an expression all by itself is a legal statement, but it doesn't do anything. The following script produces no output at all:
17
3.2
'Hello, World!'
1 + 1
If you want the script to display the values of these expressions, you have to use print statements.
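The difference shows up when you run something like the following as a script. It is a minimal sketch: the name result is hypothetical, and print is called with parentheses around a single value so the script runs under both Python 2 and Python 3.

```python
# These expressions are evaluated, but in a script nothing is displayed.
17
1 + 1

# Storing the value and printing it is what produces output.
result = 1 + 1
print(result)
```

Run as a script, this displays a single line, the one produced by print.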
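When more than one operator appears in an expression, the order of evaluation depends on the rules of precedence. For mathematical operators, Python follows the mathematical rules. The acronym PEMDAS is a useful way to remember them: Parentheses have the highest precedence, followed by Exponentiation, then Multiplication and Division (which share the same precedence), then Addition and Subtraction (which also share the same precedence). Operators with the same precedence are evaluated from left to right, except exponentiation, which is evaluated from right to left.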
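A few expressions make the PEMDAS rules concrete. The variable names below are hypothetical, and each comment shows the value you get when the rules are applied in order.

```python
a = 2 * (3 - 1)     # parentheses first: 2 * 2 = 4
b = 2 ** 1 + 1      # exponentiation before addition: 2 + 1 = 3
c = 3 * 1 ** 3      # exponentiation before multiplication: 3 * 1 = 3
d = 2 * 3 - 1       # multiplication before subtraction: 6 - 1 = 5
e = 6 + 4 * 2       # multiplication before addition: 6 + 8 = 14
f = 10 - 4 - 3      # same precedence, left to right: 6 - 3 = 3
g = 2 ** 3 ** 2     # exponentiation groups right to left: 2 ** 9 = 512

print([a, b, c, d, e, f, g])
```

When in doubt, adding parentheses costs nothing and makes the intended order explicit.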
In general, you cannot perform mathematical operations on strings, even if the strings look like numbers, so the following are illegal:
'2'-'1' 'eggs'/'easy' 'third'*'a charm'
The + operator does work with strings, but it might not do exactly what you expect: it performs concatenation, which means joining the strings by linking them end-to-end. For example:
first = 'throat'
second = 'warbler'
print first + second
The output of this program is throatwarbler.
The * operator also works on strings; it performs repetition. For example, 'Spam'*3 is 'SpamSpamSpam'. If one of the operands is a string, the other has to be an integer.
On one hand, this use of + and * makes sense by analogy with addition and multiplication. Just as 4*3 is equivalent to 4+4+4, we expect 'Spam'*3 to be the same as 'Spam'+'Spam'+'Spam', and it is. On the other hand, there is a significant way in which string concatenation and repetition are different from integer addition and multiplication. Can you think of a property that addition and multiplication have that string concatenation and repetition do not?
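The legal and illegal string operations can be tried together. In this sketch the illegal subtraction is wrapped in try/except, which the text has not introduced, so that the script keeps running after the error. The names joined, repeated, and outcome are hypothetical.

```python
first = 'throat'
second = 'warbler'

joined = first + second      # concatenation: 'throatwarbler'
repeated = 'Spam' * 3        # repetition: 'SpamSpamSpam'

# Subtraction is not defined for strings, even ones that look like numbers.
try:
    '2' - '1'
    outcome = 'no error'
except TypeError:
    outcome = 'TypeError'

print(joined)
print(repeated)
print(outcome)
```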
As programs get bigger and more complicated, they get more difficult to read. Formal languages are dense, and it is often difficult to look at a piece of code and figure out what it is doing, or why.
For this reason, it is a good idea to add notes to your programs to explain in natural language what the program is doing. These notes are called comments, and they are marked with the # symbol:
# compute the percentage of the hour that has elapsed
percentage = (minute * 100) / 60
In this case, the comment appears on a line by itself. You can also put comments at the end of a line:
percentage = (minute * 100) / 60 # percentage of an hour
Everything from the # to the end of the line is ignored—it has no effect on the program.
Comments are most useful when they document non-obvious features of the code. It is reasonable to assume that the reader can figure out what the code does; it is much more useful to explain why.
This comment is redundant with the code and useless:
v = 5 # assign 5 to v
This comment contains useful information that is not in the code:
v = 5 # velocity in meters/second.
Good variable names can reduce the need for comments, but long names can make complex expressions hard to read, so there is a tradeoff.
At this point the syntax error you are most likely to make is an illegal variable name, like class and yield, which are keywords, or odd~job and US$, which contain illegal characters.
If you put a space in a variable name, Python thinks it is two operands without an operator:
>>> bad name = 5
SyntaxError: invalid syntax
For syntax errors, the error messages don't help much. The most common messages are SyntaxError: invalid syntax and SyntaxError: invalid token, neither of which is very informative.
The run-time error you are most likely to make is a “use before def;” that is, trying to use a variable before you have assigned a value. This can happen if you spell a variable name wrong:
>>> principal = 327.68
>>> interest = principle * rate
NameError: name 'principle' is not defined
Variable names are case sensitive, so Bob is not the same as bob.
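The misspelling can be reproduced in a script without stopping it. This is a sketch under a few assumptions: the value of rate is made up, the name error_text is hypothetical, and try/except is used only to capture the message so the corrected line can follow.

```python
principal = 327.68
rate = 0.05   # hypothetical interest rate, not taken from the text

# The misspelled name has never been assigned, so Python raises NameError.
try:
    interest = principle * rate
except NameError as err:
    error_text = str(err)

print(error_text)

# With the name spelled correctly, the computation goes through.
interest = principal * rate
print(interest)
```

The message names the variable Python could not find, which is usually enough to spot the typo.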
At this point the most likely cause of a semantic error is the order of operations. For example, to evaluate 1/(2π), you might be tempted to write
>>> import math
>>> 1.0 / 2.0 * math.pi
But the division happens first, so you would get π / 2, which is not the same thing! Unfortunately, there is no way for Python to know what you intended to write, so in this case you don't get an error message; you just get the wrong answer.
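Comparing the tempting expression with a parenthesized one shows the size of the mistake. In this sketch the names wrong and right are hypothetical, and floating-point operands are used so that the result is the same under Python 2 and Python 3.

```python
import math

# Division and multiplication share a precedence level, so this is
# evaluated left to right as (1.0 / 2.0) * pi.
wrong = 1.0 / 2.0 * math.pi      # about 1.5708, which is pi / 2

# Parentheses force the multiplication to happen first.
right = 1.0 / (2.0 * math.pi)    # about 0.1592

print(wrong)
print(right)
```

Both lines run without complaint, which is exactly what makes this kind of error hard to find.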
And that brings us to the Second Theorem of Debugging:
The only thing worse than a bad error message is no error message.
Practice using the Python interpreter as a calculator: