Friday, April 08, 2005

Computing derivatives using Python

In a recent thread on Python edu-sig, Kirby Urner started a discussion about the use of python's decorators in mathematics. Some of the examples included numerically computing derivatives (and integrals) of functions. This generated a fair bit of discussion and the general conclusion was that this was not a good use of decorators. Some information contributed by yours truly required the reader to do a graph to properly visualise exactly what was going on. I thought it appropriate to present a summary that included the required graphical information here as a permanent record.

The mathematical definition of the derivative of a function f(x) at point x is to take a limit as "h" goes to zero of the following expression:

( f(x+h) - f(x) ) / h

An example is given below: the red curve is the function f(x) and the green curve is the tangent line at x (with same slope as the derivative). The blue curve is a line going through x whose slope equals that of the above formula with a non-zero value of h.



As you can see, the slopes of the two straight lines are quite different. A better way of numerically computing the derivative for the same value of "h" is by taking a symmetrical interval around x as follows:

( f(x + h/2) - f(x - h/2) ) / h

This is illustrated in the next picture. As you can see, the two straight lines have much more similar slopes - hence, the value computed for the derivative is going to be more accurate.




The corresponding python code is as follows:

def derivative(f):
...."""
....Computes the numerical derivative of a function.
...."""
....def df(x, h=0.1e-5):
........return ( f(x+h/2) - f(x-h/2) )/h
....return df

And we use it as follows:

# sample function
def g(x): return x*x*x

# first derivative
dg = derivative(g)

# second derivative
d2g = derivative(dg) # == derivative(derivative(g))

# printing the value computed at a given point:

print dg(3)
print dg(3, 0.001)
print dg(3, 1e-10) # smaller h is not always more precise

Sunday, March 27, 2005

Changing blog's name

Originally, it was my intention to have a blog to post alternate viewpoints (hence the former name Un autre point de vue) as well as some Python musing. Since this blog has been listed in both
Planet Python and
Daily Python-URL, I have felt that non-Python related stuff didn't really belong here. I may revive the old title in a separate blog later...

James Tauber : Simple Algorithm for Recurring Tasks

James Tauber : Simple Algorithm for Recurring Tasks talks about a product (sciral) for keeping track of recurring tasks with priorities roughly based on how late they are. Sciral is now also available for Windows, which Tauber didn't mention. This looks like a simple thing to do in python; I'll have to try to implement it using wxPython.

Thursday, March 24, 2005

First looks: Text Processing in Python

I just received an autographed copy of
Text Processing in Python (a book) from David Mertz (Thanks David; great service!). While the content of this book is available (in ascii format) on the author's website, I find that there is nothing like having a copy in hand, especially when it is a good quality paperback (as expected from Addison Wesley).

There is a lot of material in that book; while I am an avid reader, I suspect that it will take me quite a while to get through it. If you can't wait to learn more about this book, you might want to have a look at
this review.

Wednesday, January 26, 2005

Useful wxPython info

In an old blog, that I just stumbled upon, Golden spud gives some interesting information about wxPython. Perhaps it is possible to use his technique to change the language used in a GUI without having to restart the application under Windows.

Tuesday, January 11, 2005

Teaching an old Python keyword new tricks

In my last post, I provided some additional thoughts about maintaining Python's pseudocode appearance while adding static typing information. The notation I suggested also allowed for easy inclusion of pre-conditions and post-conditions. However, this came as a cost of adding where as a new keyword to Python. After giving some more thoughts to this idea I realise that a new keyword was not needed at all. This can be seen starting from the following example using the keyword where as introduced before:
01    def gcd(a, b):

02 where:
03 assert isinstance(a, int)
04 assert isinstance(b, int)
05 '''Returns the Greatest Common Divisor,
06 implementing Euclid's algorithm.
07 Input arguments must be integers.'''
08 while a:
09 a, b = b%a, a
10 return b
If we remove the line with where: and change the indentation level, we have essentially valid (today!) Python syntax, but with many (two in this case) lines starting with the keyword assert. If we allow this keyword to introduce the beginning of a bloc of statements, we could then write:
01    def gcd(a, b):

02 assert:
03 isinstance(a, int)
04 isinstance(b, int)
05 '''Returns the Greatest Common Divisor,
06 implementing Euclid's algorithm.
07 Input arguments must be integers.'''
08 while a:
09 a, b = b%a, a
10 return b
The idea of a keyword that allows both a bloc structure with a colon, as well as a single line expression is not new: it has been introduced in Python with the addition of list comprehensions. For example, we can have
01    for i in range(10):

02 ...
03 ...
as well as
a = [i for i in range(10)]

With this suggested syntax, pre-conditions can easily be added.
01    def gcd(a, b):

02 assert:
03 isinstance(a, int)
04 isinstance(b, int)
05 a != 0 and b != 0 # fake pre-condition
06 '''Returns the Greatest Common Divisor,
07 implementing Euclid's algorithm.
08 Input arguments must be integers.'''
09 while a:
10 a, b = b%a, a
11 return b
Keeping this block structure in mind, we can add type information on return values as follows:
01    def gcd(a, b):

02 assert:
03 isinstance(a, int)
04 isinstance(b, int)
05 return c assert:
06 isinstance(c, int)
07 '''Returns the Greatest Common Divisor,
08 implementing Euclid's algorithm.
09 Input arguments must be integers;
10 return value is an integer.'''
11 while a:
12 a, b = b%a, a
13 return b
as well as adding pre- and post-conditions:
01    def gcd(a, b):

02 assert:
03 isinstance(a, int)
04 isinstance(b, int)
05 a != 0 and b != 0 # fake pre-condition
06 return c assert:
07 isinstance(c, int)
08 c != 0 # fake post-condition
09 '''Returns the Greatest Common Divisor,
10 implementing Euclid's algorithm.
11 Input arguments must be integers;
12 return value is an integer.'''
13 while a:
14 a, b = b%a, a
15 return b
To move further towards something similar to what I suggested in my previous post, we need to identify, as Guido van Rossum had himself suggested, the statement
isinstance(object, class-or-type-or-tuple)

with
object: class-or-type-or-tuple

Doing so allows us to rewrite the last example as
01    def gcd(a, b):

02 assert:
03 a: int
04 b: int
05 a != 0 and b != 0 # fake pre-condition
06 return c assert:
07 c: int
08 c != 0 # fake post-condition
09 '''Returns the Greatest Common Divisor,
10 implementing Euclid's algorithm.
11 Input arguments must be integers;
12 return value is an integer.'''
13 while a:
14 a, b = b%a, a
15 return b
which is essentially what was written in the previous post, but with the replacement of a new keyword (where) by an old one (assert). The general form would be:
01    def foo(...):

02 assert:
. type information and/or
. pre-conditions
. return [...] assert:
. type information and/or
. post-conditions
. '''docstring'''
. ...body...
As this post is already long enough, I will not discuss the interface issue here; the ideas introduced in the previous two posts are easily translated by replacing where by assert everywhere.

Sunday, January 09, 2005

Python as pseudo-code and Optional Static Typing

In my last post, I gave a series of examples intended to show how Python's pseudocode syntax could be preserved while adding optional static typing. While I felt that the syntax was self-explanatory (as pseudocode should be), I thought it might be useful to examine in more details the why's and the how's of my suggestion.

Before proceeding further, it might be argued that this is a case of premature optimization: Python does not currently have (optional) static typing so the concept should be discussed first, before implementation details and syntax can be looked at objectively. Yet, one of the strength of Python is its syntax, and many objections (NIMPY as GvR described it) to the proposed addition(s) to the language came from an unease with the suggested syntax.

1. Functions and optional typing information

I will use GvR's version of a Greatest Common Divisor finding function as an example and build from it; the complexity of the examples will increase as we go in an attempt to fully illustrate the suggested syntax. This function is defined as follows:

01 def gcd(a, b):
02 '''Returns the Greatest Common Divisor,
03 implementing Euclid's algorithm.'''
04 while a:
05 a, b = b%a, a
06 return b
I have included an optional docstring which, following Python's normal syntax, is indented at the same level as the function's body. The docstring is intended to be easily extractable for documentation purpose and is intended for human readers only. Suppose that it is required to provide static typing information. Using the notation I used in my previous post, here's how one might write additional syntax information.
01    def gcd(a, b):

02 where:
03 a: int, b: int
04 return c where:
05 c: int
06 '''Returns the Greatest Common Divisor,
07 implementing Euclid's algorithm.
08 Input arguments must be integers;
09 return value is an integer.'''
10 while a:
11 a, b = b%a, a
12 return b
The suggested where keyword, followed by a mandatory colon, follows the usual Python syntax of introducing a bloc of code identified by its indentation level. So we now have three structural items at the same indentation level: the static typing information, the docstring, and
the function body.

The docstring is between the typing information and the actual code, allowing to better separate visually the two of them. The docstring has been updated as well to inform a potential user
reading the documention as to the particular of this function. I have given some thoughts about including the typing information within the docstring (à la doctest) in order to avoid writing about the type information twice, as in the following:
01    def gcd(a, b):

02 '''Returns the Greatest Common Divisor,
03 implementing Euclid's algorithm.
04 where:
05 a: int, b: int
06 return c where:
07 c: int'''
08 while a:
09 a, b = b%a, a
10 return b
However, I find 1) that it is not easier to read; 2) that it would complicate extraction of the docstring for inclusion in a standard documentation (i.e. spaces are significant for only part
of the docstring); 3) it doesn't "scale" well with added features like pre-conditions and post-conditions.

2. Pre- and post-conditions

Building on the previous examples, here is how pre- and post-conditions might might be added within the suggested new syntax with, in this case, somewhat artificial conditions (not required by this algorithm).
01    def gcd(a, b):

02 where:
03 a: int, b: int
04 assert a != 0 and b != 0
05 return c where:
06 c: int
07 assert c <= abs(a) and c <= abs(b)
08 '''Returns the Greatest Common Divisor,
09 implementing Euclid's algorithm.
10 Input arguments must be integers,
11 and can not be both zero;
12 return value is an integer.'''
13 while a:
14 a, b = b%a, a
15 return b

The post-condition assertion

07 assert c <= abs(a) and c <= abs(b)

is an internal verification that the code is correct. It is irrelevant to the documentation which is one more reason to separate it from the docstring, even though such information is sometimes including in docstring. This notation, using the standard assert statement is different from what I suggested in my previous post and respects Python's current syntax.

One can easily imagine that such typing information and conditions might be shared by many functions. Rather than replicating code, it might be possible to define abstract functions, which are functions that have no body, only type information and/or conditions. For example, we could have
01    def two_integer_input(i, j):  # abstract function: no body

02 where:
03 i: int, j: int
04 '''Accepts only two integers as input.'''
05
06
07 def _gcd_return_info(a, b): # abstract function: no body
08 where:
09 return c where:
10 c: int
11 assert c <= abs(a) and c <= abs(b)
12
13
14 def gcd(a, b):
15 where:
16 two_integer_input(a, b)
17 assert a != 0 and b != 0
18 _gcd_return_info(a, b)
19 '''Returns the Greatest Common Divisor,
20 implementing Euclid's algorithm.
21 Input arguments must be integers,
22 and can not be both zero;
23 return value is an integer.'''
24 while a:
25 a, b = b%a, a
26 return b

I realise that this might be starting to look like a return to past suggestions during the decorator debate. However, given that the optional static typing proposal is new, it should be looked at objectively, without rejecting off-hand proposals that were rejected in a different context.

3. Interfaces

Instead of dealing with functions, in oop we have classes and methods. Classes can implement interfaces, which share some similarities with what we called abstract functions above. Here is how, using the notation we have suggested, we might introduce interfaces in the language. Once again I am using examples suggested by Python's BDFL in his blog.
01    interface I:

02 def foo(x):
03 where:
04 x: int
05 x > 0
06 return r where:
07 r: str
08
09 class C(object; I): # derived from object; implements I
10 def foo(self, x):
11 return str(x)

I suggest using a semi-colon to separate based classes from interfaces for clarity. Thus we could have:
01    interface I1(I2, I3):   # derived from I2 and I3, both interfaces

02 def foo(a: t1, b: t2):
03 where:
04 a: t1, b: t2
05 return c where:
06 c: t3
07
08 class C(C1, C2; I1): # derived from C1, C2; implements I1
09 def foo(a, b):
10 return a+b
For a class that does not implement interfaces, the following statements would be equivalent
class C(object):  # preferred

# and
class C(object;): # allowed

Classes that do not derived from a base class (aka 'old-style') might still be accommodated via
class C(; I1, I2)

although presumably they will have disappeared from the language by the time static typing information will have been implemented.