Tuesday, January 11, 2005

Teaching an old Python keyword new tricks

In my last post, I provided some additional thoughts about maintaining Python's pseudocode appearance while adding static typing information. The notation I suggested also allowed for easy inclusion of pre-conditions and post-conditions. However, this came at the cost of adding where as a new keyword to Python. After giving this idea some more thought, I realise that a new keyword is not needed at all. This can be seen starting from the following example, which uses the keyword where as introduced before:
def gcd(a, b):
    where:
        assert isinstance(a, int)
        assert isinstance(b, int)
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.
    Input arguments must be integers.'''
    while a:
        a, b = b%a, a
    return b
If we remove the line with where: and change the indentation level, we have essentially valid (today!) Python syntax, but with many (two in this case) lines starting with the keyword assert. If we allow this keyword to introduce a block of statements, we could then write:
def gcd(a, b):
    assert:
        isinstance(a, int)
        isinstance(b, int)
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.
    Input arguments must be integers.'''
    while a:
        a, b = b%a, a
    return b
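For comparison, here is a minimal sketch of what the assert: block above would presumably expand to in Python as it exists today. This is my own reading of the proposal, not a specification: I assume each line of the block becomes one ordinary assert statement, and I have moved the docstring to the top of the body, since that is the only place where Python recognises it as a docstring.

```python
def gcd(a, b):
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.
    Input arguments must be integers.'''
    # Assumed expansion: one plain assert statement per line of the block.
    assert isinstance(a, int)
    assert isinstance(b, int)
    while a:
        a, b = b % a, a
    return b
```

Calling gcd(12, 18) returns 6, while a call such as gcd(1.5, 3) fails with an AssertionError before the loop starts.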
The idea of a keyword that allows both a block structure with a colon and a single-line expression is not new: it was introduced in Python with the addition of list comprehensions, where the keyword for plays both roles. For example, we can have
for i in range(10):
    ...
    ...
as well as
a = [i for i in range(10)]

With this suggested syntax, pre-conditions can easily be added.
def gcd(a, b):
    assert:
        isinstance(a, int)
        isinstance(b, int)
        a != 0 and b != 0  # fake pre-condition
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.
    Input arguments must be integers.'''
    while a:
        a, b = b%a, a
    return b
Keeping this block structure in mind, we can add type information on return values as follows:
def gcd(a, b):
    assert:
        isinstance(a, int)
        isinstance(b, int)
        return c assert:
            isinstance(c, int)
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.
    Input arguments must be integers;
    return value is an integer.'''
    while a:
        a, b = b%a, a
    return b
as well as adding pre- and post-conditions:
def gcd(a, b):
    assert:
        isinstance(a, int)
        isinstance(b, int)
        a != 0 and b != 0  # fake pre-condition
        return c assert:
            isinstance(c, int)
            c != 0  # fake post-condition
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.
    Input arguments must be integers;
    return value is an integer.'''
    while a:
        a, b = b%a, a
    return b
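The same pre- and post-conditions can be approximated in today's Python with a decorator. The following is only a sketch under my own assumptions: the name contract and its pre and post parameters are hypothetical, not part of Python or of any library, and each condition is written as a small function that returns True when the condition holds.

```python
import functools


def contract(pre=(), post=()):
    """Hypothetical helper: assert pre-conditions on the arguments
    and post-conditions on the return value."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            for check in pre:
                assert check(*args), "pre-condition failed"
            result = func(*args)
            for check in post:
                assert check(result), "post-condition failed"
            return result
        return wrapper
    return decorate


@contract(
    pre=[
        lambda a, b: isinstance(a, int),
        lambda a, b: isinstance(b, int),
        lambda a, b: a != 0 and b != 0,  # fake pre-condition
    ],
    post=[
        lambda c: isinstance(c, int),
        lambda c: c != 0,  # fake post-condition
    ],
)
def gcd(a, b):
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.
    Input arguments must be integers;
    return value is an integer.'''
    while a:
        a, b = b % a, a
    return b
```

This keeps the conditions out of the function body, as the proposed syntax does, but the lambdas are clearly noisier than the block notation, which is part of the argument for the latter.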
To move further towards something similar to what I suggested in my previous post, we need to identify, as Guido van Rossum had himself suggested, the statement
isinstance(object, class-or-type-or-tuple)

with
object: class-or-type-or-tuple

Doing so allows us to rewrite the last example as
def gcd(a, b):
    assert:
        a: int
        b: int
        a != 0 and b != 0  # fake pre-condition
        return c assert:
            c: int
            c != 0  # fake post-condition
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.
    Input arguments must be integers;
    return value is an integer.'''
    while a:
        a, b = b%a, a
    return b
which is essentially what was written in the previous post, but with the replacement of a new keyword (where) by an old one (assert). The general form would be:
def foo(...):
    assert:
        type information and/or
        pre-conditions
        return [...] assert:
            type information and/or
            post-conditions
    '''docstring'''
    ...body...
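The identification of "object: type" with an isinstance check can be tried out using the function annotations that Python 3 provides. The sketch below is an assumption-laden illustration, not the proposal itself: annotations sit in the def line rather than in an assert: block, the decorator name check_types is hypothetical, and only plain classes (or tuples of classes) are supported as annotations.

```python
import functools
import inspect


def check_types(func):
    """Hypothetical decorator: read "name: cls" as isinstance(name, cls)."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if expected is not inspect.Parameter.empty:
                assert isinstance(value, expected), f"{name} must be {expected!r}"
        result = func(*args, **kwargs)
        if sig.return_annotation is not inspect.Signature.empty:
            assert isinstance(result, sig.return_annotation), "bad return type"
        return result
    return wrapper


@check_types
def gcd(a: int, b: int) -> int:
    '''Returns the Greatest Common Divisor,
    implementing Euclid's algorithm.'''
    while a:
        a, b = b % a, a
    return b
```

With this in place, gcd(12, 18) returns 6 and gcd("12", 18) raises an AssertionError naming the offending argument.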
As this post is already long enough, I will not discuss the interface issue here; the ideas introduced in the previous two posts are easily translated by replacing where by assert everywhere.

13 comments:

Ian Bicking said...

I like the block syntax, but where does the error message go? Sometimes that's not necessary, but for a lot of contract issues it's not entirely clear why the restriction exists without an explanation, and it's nice to suggest proper usage.

André said...

Hmmm to be honest, I didn't think about error messages. Good point... perhaps one needs an additional keyword (like "where") after all.. I'll have to think about that one!

Hamish said...

Regarding where an error message might go, could each line of the assertion block not have an optional error message separated by a comma, similar to what the existing assert statement does?
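As a point of reference for this suggestion, the existing assert statement already accepts an optional message after a comma. A short sketch in current Python (the message texts are made up for illustration):

```python
def gcd(a, b):
    # The expression after the comma becomes the AssertionError message.
    assert isinstance(a, int), "a must be an integer"
    assert isinstance(b, int), "b must be an integer"
    while a:
        a, b = b % a, a
    return b


try:
    gcd(2.5, 10)
except AssertionError as err:
    print(err)  # prints: a must be an integer
```

In the block form, each line would then read as "condition, message", for instance isinstance(a, int), "a must be an integer".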

Anonymous said...

Interesting how having a more pythonic syntax makes the static typing look more optional, more acceptable, and opens towards a different design perspective.
Alex

Anonymous said...

Alex? By tone and content of the previous comment I have the nagging suspicion this might be Alex Martelli speaking... In this case, use your power and draw Guido's attention to this well-balanced syntax proposal, Alex!

Anonymous said...

Sorry, but I'm only another (anonymous French) Alex, and quite new to the Python language.
Alex

Anonymous said...

I *very* much like the "isinstance(a, int)" form of this. It is, I feel, *far* more pythonic in nature: very explicit, and when read aloud does not require any innate knowledge of the language. It does what it says.

It is also probably wordy enough to encourage people to *not* use static typing, which *should* be one of the design goals. There are few cases where static typing is required, ergo it should be less easy to do than leaving things dynamic.

Anonymous said...

I don't like the way that your syntax requires a double set of indentation though I can see how it arises from the need to specify a name for the return value. I would find this easier to read (with dots to represent the indentation as the spaces seem to be stripped out :-/):

assert:
....isinstance(a, int)
....isinstance(b, int)
....a != b # or whatever
assert return c:
....isinstance(c, int)
....c != 0

But I'm still not sure that I like it!

hathawsh said...

I really like this suggestion. May I suggest the following minor changes:

def gcd(a, b):
    """Return the greatest common divisor."""
    assert:
        "Both inputs must be integers"
        a: int
        b: int
        "Both inputs must be nonzero integers"
        a != 0 and b != 0
    assert return c:
        "This function should return a nonzero integer"
        c: int
        c != 0
    while a:
        a, b = b%a, a
    return b

In this version, you can mix docstrings in pre-conditions and post-conditions. Conditions that fail should include the preceding docstring in the exception. Also, the post-conditions are not embedded inside the pre-conditions and the post-conditions are less indented.

Gheorghe said...

regarding error description maybe:
isinstance(a, int) or raise "Must be a number"
a: int or raise "Must be a number"
a<0 or raise "Must be less than zero"
But then will this look ugly ?:
((a>0 and b<5) or a < -2) or raise "whatever business logic description"

Gheorghe said...

which way?

assert:
....a:int or raise "must be a number"
....((a > 0 and b < 5) or a < -2) or raise "some business logic description"

or ?

assert:
....a:int
....(a > 0 and b < 5) or a < -2
except TypeError:
....raise "must be a number"
except AssertionError:
....raise "some business logic description"

Anonymous said...

What's really missing in python today?

01 def gcd(a, b):
02 ____'''Returns the Greatest Common Divisor,
03 ____implementing Euclid's algorithm.
04 ____Input arguments must be integers;
05 ____return value is an integer.'''
06 ____assert isinstance(a, types.IntType)
07 ____assert isinstance(b, types.IntType)
08 ____while a:
09 ________a, b = b%a, a
10 ____assert isinstance(b, types.IntType)
11 ____return b

That keeps the return value assertion close to the return statement, where it belongs. If you want some explanatory text in the exception object then raise one explicitly as in:

06 ____if not (isinstance(a, types.IntType)
07 _______and isinstance(b, types.IntType)):
08 ________raise TypeError("GCD applies to pairs of integers.")

or

10 ____if not isinstance(b, types.IntType):
11 ________raise TypeError("GCD was supposed to produce an integer.")

Already, I tend to liberally sprinkle my code with assert statements, often simply as a reminder to myself but also as a form of test-driven development or design by contract (I write the assertions first, then fill in the program logic).

André said...

To Anonymous who wrote:
What's really missing in python today?

01 def gcd(a, b):
02 ____'''Returns the Greatest Common Divisor,
03 ____implementing Euclid's algorithm.
04 ____Input arguments must be integers;
05 ____return value is an integer.'''
06 ____assert isinstance(a, types.IntType)
07 ____assert isinstance(b, types.IntType)
====
This series of 3 posts was prompted by Guido van Rossum's musings about adding optional static typing and related features, with a proposed syntax that many people, including myself, found "non-pythonic". Of course, the simple example I used can be done, as you've shown explicitly, with Python's current syntax. However, the features suggested by GvR require some fundamental change. I was just trying to contribute in my own way by suggesting a "more pythonic" syntax that could act as a bridge between Python as it is today and what it could become with static typing, interfaces, etc.
By all means, if you can show how GvR's suggestions could be implemented "nicely" using today's Python, I am sure that many people would be interested. (Let me emphasize that I do NOT write this to be sarcastic.) I am trying to generate a constructive "debate" with all those interested.