Thursday, December 27, 2007

Crunchy and Python 3.0a2

Continuing my experiment of adapting Crunchy to Python 3.0, I managed to get Crunchy to start under Python 3.0a2 and to run some code from the editor - but not from the interpreter, nor from the doctest. Most of the problems I have involve bytes-to-string and string-to-bytes conversion. As mentioned by Guido van Rossum last June:
  • We're switching to a model known from Java: (immutable) text strings are Unicode, and binary data is represented by a separate mutable "bytes" data type. In addition, the parser will be more Unicode-friendly: the default source encoding will be UTF-8, and non-ASCII letters can be used in identifiers
Later on, in a comment from that post, we find:
  • > In your presentation last night you had one slide which
    > talked about the "str" vs "bytes" types in Python 3000. On
    > the bottom of that slide was something like:
    >
    > str(b"asdf") == "b'asdf'"
    >
    > However, in discussing this slide (very briefly) you said
    > that a type constructors like "str" could be used to do
    > conversion. It seems like "str" is behaving more like
    > "repr" in this case, which seems unusual and less useful
    > to me. Was this a typo, or is this actually the way it's
    > supposed to work? What's the rationale?

    To be honest, this is an open issue. The slide was wrong compared to the current implementation; but the implementation currently defaults to utf8 (so str(b'a') == 'a'), which is not right either. The problem is that there are conflicting requirements: str() of any object should ideally always return something, but we don't want str() to assume a specific default encoding.

    To be continued...
This change seems innocuous enough...

As a web server, Crunchy sends to and receives information from the browser as "binary data" or "bytes". As a generalized Python interpreter, Crunchy manipulates the information as "strings". It appears that the "bytes" implementation is done much more completely in Python 3.0a2 than it was in Python 3.0a1. And this is the source of many problems.

For example, the browser sends Crunchy the path to which a Python file should be saved, together with the file's content, as follows:

'/Users/andre/.crunchy/temp.py_::EOF::_from Tkinter import *\nroot = Tk()\nw = Label(root, text="Crunchy!")\nw.pack()\nroot.mainloop()'

This is sent as a binary stream which needs to be converted to the string written above. This conversion is done via str(...). Using Python 3.0a1 (and 2.4 and 2.5), the result was as above; splitting the string gave the following:

['/Users/andre/.crunchy/temp.py', 'from Tkinter import *\nroot = Tk()\nw = Label(root, text="Crunchy!")\nw.pack()\nroot.mainloop()']

Now, with Python 3.0a2, it gets slightly more complicated. The converted string acquires a "b" prefix (as described in the comment on Guido's blog quoted above). After splitting, the result is

["b'/Users/andre/.crunchy/temp.py", 'from Tkinter import *\\nroot = Tk()\\nw = Label(root, text="Crunchy!")\\nw.pack()\\nroot.mainloop()\'']

So, we now have a first string with a "b'" prefix embedded in it, and a second one without. It seems that each case will have to be handled carefully on its own. And I suspect more problems will show up as we get closer to the final 3.0 release.
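A minimal sketch of the difference, written for a current Python 3 interpreter rather than 3.0a2 (whose behaviour was still in flux at the time). The function name `split_save_request` is hypothetical and is not Crunchy's actual code, and UTF-8 is an assumed encoding:

```python
SEPARATOR = "_::EOF::_"

def split_save_request(raw, encoding="utf-8"):
    """Decode the bytes received from the browser, then split path from content.

    Hypothetical helper: the name and the utf-8 default are assumptions.
    """
    # Explicit decoding yields plain text: no "b'" prefix, no escaped newlines.
    text = raw.decode(encoding)
    path, content = text.split(SEPARATOR, 1)
    return path, content

raw = b'/Users/andre/.crunchy/temp.py_::EOF::_from Tkinter import *\nroot = Tk()'

# str() on a bytes object returns its repr, which is where the stray prefix comes from.
as_repr = str(raw)

path, content = split_save_request(raw)
```

Decoding explicitly at the boundary where data arrives from the browser keeps everything past that point working with ordinary strings.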

I know, I know, I'm really not following the "recommended" practice, as quoted on Guido's blog. I should probably wait first for Python 2.6 to come out. Then, I should have complete unit test coverage and use the conversion tool to create a Python 3.0 version .... However, I am not convinced that the conversion tool will be smart enough to know when a function (that I write) expects a "str" object and when it expects a "bytes" one. Furthermore, the few unit tests I had worked fine under both Python 2.5 and 3.0 ... but some functions that I had written with the expectation that they would receive string arguments did not work in "production code", as they were getting bytes arguments. And this failed completely silently...

If I had to give some advice to someone about creating Python programs that can work with both Python 2.x and Python 3.x, I would say like Guido: don't. :-) Unless of course you are like me and are doing this for fun and to get to learn about the differences between Python 2.x and 3.x along the way. But then, "be prepared for the unexpected" like the following: turning on a few print statements (via a "debug flag") can result in breaking some code; turn them off and the code works again... Yes, it did happen to me - I still have to figure out how...

Crunchy and Python 3.0a1

[Screenshot: Crunchy running under Python 3.0a1]

It is often said that a picture is worth a thousand words...

I have managed to make Crunchy run under Python 3.0a1. Some of the features are not working but the interpreter and the editor work. The turtle module I have been working on also works "nicely" (read: as slow as before) with this new Python version. Unfortunately, when it is run under Python 3.0a1, Crunchy cannot load most pages - including those of the official Python 3.0 tutorial. The reason is that it uses ElementTree to parse pages, and ElementTree is unforgiving when it comes to unclosed tags (as in <link> and <meta...> for example); it also seems unable to handle the <script>s that are included on the page. I have not yet found a way to reliably "clean" the pages before parsing them with ElementTree. While I believe that I should be able to do so with a bit more work, there is a bigger problem...
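To illustrate the unclosed-tag problem, here is a small sketch using the standard library's `xml.etree.ElementTree` on a current Python 3. The helper `parses_cleanly` and the two sample pages are made up for illustration and are not taken from Crunchy:

```python
import xml.etree.ElementTree as ET

def parses_cleanly(markup):
    """Return True if ElementTree accepts the markup as well-formed XML."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True

# Well-formed: the <meta> element is explicitly closed.
closed = "<html><head><meta name='a' content='b' /></head><body /></html>"

# Typical HTML: <meta> is left open, so </head> arrives as a mismatched tag.
unclosed = "<html><head><meta name='a' content='b'></head><body></body></html>"

closed_ok = parses_cleanly(closed)
unclosed_ok = parses_cleanly(unclosed)
```

Any page-cleaning step would have to turn markup like the second sample into something like the first before handing it to the parser.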

Unfortunately, Crunchy does not run under Python 3.0a2, and the error messages I get have not been too helpful in figuring out the error. However, perhaps this is due to a faulty installation. What makes me think so is that when I start a 3.0a2 session at a terminal, I get an error message when I use exit(). This is most unexpected.

In any event, the next release should include the new Crunchy turtle module and be usable with 3.0a1. Perhaps Johannes, or some curious user, will be able to figure out how to make it run under 3.0a2 as well.

Tuesday, December 25, 2007

Slow turtle ... in time for Xmas

One of the tasks assigned in Google's HOP contest was to design a simple turtle graphics module for Crunchy. This was done successfully by a student as a prototype. This prototype had some unfortunate limitations in terms of the number of turtles and of graphics canvases that could exist simultaneously on the same page, but it did give me the impetus to use the student code as a proof-of-concept and implement a more complete turtle module for Crunchy.

Playing with turtles, and trying to draw fairly complex shapes, made me realize that the combination of an html canvas and the Crunchy comet communication makes for an extremely slow turtle. It would be really nice to find a better (faster) way.

The next Crunchy release should include that turtle module ... and an additional bonus: Crunchy can now be launched successfully using either Python 2.5 (or 2.4) or Python 3.0a1. And the turtle module works with both.

At the moment, not all of Crunchy's features are supported when using Python 3.0.  However, this should no longer be the case by the time version 1.0 comes out.

And, for those who might be tempted to point out Guido's blog entry about not making programs compatible with both Python 2.x and 3.x, please don't bother. I realize that it is not wise in general to try to do so. However, given Crunchy's design philosophy of making it as easy as possible for students/teachers/tutorial writers to use, it just does make sense: download, unzip, double-click; nothing else should be needed to start having fun with Python - no matter what new Python version gets installed.


Tuesday, December 18, 2007

(NOT) Bitten by PEP 3113

UPDATE: The comments left on this post (1 and 3 in particular) corrected my misreading of PEP 3113. There is no such wart as I describe in Python 3.0. I should have known better than to question GvR and friends. :-) I'm leaving this post as a reference.

In trying to make Crunchy useful & interesting for beginning programmers to learn Python, I designed a small graphics library following some "natural" notation. As an aside, Johannes Woolard is the one who made sure that this library could be easily used interactively within Crunchy. I mention his name since too many people seem to assume that I am the only one involved in Crunchy's design. Anyway, back to the library...

In that library, the function used to draw a line between two points uses the syntax

line((x1, y1), (x2, y2))

for example: line((100, 100), (200, 200))


which should be familiar to everyone. Unfortunately, following the implementation of PEP 3113 in Python 3.0, this syntax is no longer allowed. This is ... annoying! There are two alternatives I can use:

line(x1, y1, x2, y2)

for example: line(100, 100, 200, 200)


or

line(point_1, point_2)

where point_a = (x_a, y_a). Update: with this second definition, it will be possible to invoke the function as
line((100, 100), (200, 200))

Of course, either of these two options is easy to implement (and is going to be backward compatible with Python 2k). However, I don't find either one of them particularly clear for beginners (who might be familiar with the normal mathematical notation) and do consider this a (small) wart of Python 3k.
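As the update at the top of this post notes, PEP 3113 only removes tuple parameters from function definitions (as in `def line((x1, y1), (x2, y2))`); the call syntax is unaffected. A minimal sketch of the second alternative follows. The body is a stand-in that returns the coordinates instead of drawing, since the real drawing code is not shown here:

```python
def line(point_1, point_2):
    """Accept two (x, y) points; unpack them inside the body.

    Stand-in for the real drawing function: it only returns the
    coordinates it would draw between.
    """
    # Unpacking in the body replaces the tuple parameters removed by PEP 3113.
    x1, y1 = point_1
    x2, y2 = point_2
    return (x1, y1, x2, y2)

# The familiar notation still works at the call site.
coords = line((100, 100), (200, 200))
```

This form runs unchanged under both Python 2.x and Python 3.x.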

reStructuredText files and Crunchy

Crunchy can now handle reStructuredText (.rst) files in the same way it can process plain html ones! This requires the user to have docutils installed - which is normally the case for anyone that writes .rst files.

The test coverage for Crunchy is slowly improving. Currently, 10 modules are mostly covered by doctest-based unit tests, out of approximately 40. Since I make use of .rst files to keep the unit tests, these can now be browsed "pleasantly" using Crunchy itself.
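For readers unfamiliar with the approach, here is a rough sketch of running doctests kept in a reStructuredText file, using the standard `doctest.testfile` function on a current Python 3. The file name, its content, and the helper `run_rst_doctests` are invented for the example and are not Crunchy's actual test files:

```python
import doctest
import os
import tempfile

# Invented .rst content: prose plus one interactive example.
RST_SOURCE = """\
Addition
========

A doctest embedded in reStructuredText::

    >>> 1 + 1
    2
"""

def run_rst_doctests(text):
    """Write the text to a temporary .rst file and run its doctests."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "example.rst")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        # module_relative=False lets us pass an ordinary filesystem path.
        return doctest.testfile(path, module_relative=False)

results = run_rst_doctests(RST_SOURCE)
```

The returned object reports how many examples were attempted and how many failed, while the same file remains readable as documentation.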

Furthermore ... all the unit tests written so far work under Python 2.4, Python 2.5, and ... Python 3.0a1! This required some tedious rewriting of some parts of the code but the end result is well worth it - if only to really learn about differences between Python 2.5 and Python 3.0.

One thing that I found, which will be no surprise to TDD aficionados, is that code written without testing in mind can be quite tricky to write comprehensive tests for. Add to this the extra complication of making that code run under two incompatible Python versions, and you are on your way to major headaches. It's a good thing I am doing this only for fun!

Saturday, December 08, 2007

Launching Python 3.0 program from Crunchy running under Python 2.5

As part of Google's Highly Open Participation contest, Michele Mazzoni completed the task of creating a new option for Crunchy: one can now launch (starting with the next release of Crunchy - 0.9.8.5) a program using a different version of Python than the one used by Crunchy itself. While I had suggested that the alternate Python version could be set via the configuration options for Crunchy (usually accessible from a Python interpreter), Michele had the brilliant idea to add a simple input box where one can specify the path (or 'alias') of the Python version to use, right on the page from which the program is launched. This makes it extremely easy to change the interpreter version used to launch a user-written program.
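The general idea of running a program under an interpreter chosen by path can be sketched with the standard `subprocess` module. This is not Michele's implementation; the function `run_with_interpreter` is hypothetical, and `sys.executable` stands in for whatever path or alias the user would type into the input box:

```python
import subprocess
import sys

def run_with_interpreter(interpreter, source):
    """Run source code under the given interpreter and collect its output.

    Hypothetical helper: 'interpreter' is a path or alias such as the one
    a user might enter on the page.
    """
    completed = subprocess.run(
        [interpreter, "-c", source],
        capture_output=True,  # gather stdout and stderr instead of printing them
        text=True,            # decode the output to str
    )
    return completed.returncode, completed.stdout, completed.stderr

# sys.executable is only a placeholder for a different Python installation.
code, out, err = run_with_interpreter(sys.executable, "print('Crunchy!')")
```

Because the child process is independent, the launched program can use syntax that the interpreter running the server would reject.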

Michele has prepared a screencast demonstrating this, which should hopefully appear on ShowMeDo soon.

Thank you Michele - and thank you Google!

Tuesday, December 04, 2007

More results from GHOP

Google's Highly Open Participation (GHOP) contest is attracting a lot of attention from the right people: pre-university students. The PSF is one of ten organizations mentoring students working on Python-related projects. Since I submitted task suggestions early on and volunteered to help following a call for volunteers from Titus Brown, Crunchy has benefited from many students' contributions. Crunchy's messages have been translated into Estonian, Macedonian, Polish and Italian with, hopefully, more translations to come. Some new unit tests have been added with more to come. There may be a couple of nice surprises coming out soon too :-)

While other projects have also benefited from GHOP students' contributions, there could be more. If you have some good ideas for mini-projects (doable in 3-5 days, at a couple of hours per day with perhaps one full day), your suggestions would most likely be very welcome. Just check out the GHOP Python discussion group. And, if you would like to join the (too small) ranks of Python mentors, please do; we need all the help we can get.