Friday, December 19, 2008

Plugins - part 4: Crunchy-style plugin

In this 4th post in the plugins series, I will explain the approach we used in Crunchy. While explaining the main features, I will also compare it with the simple class-based plugin framework introduced in the third post in this series.

Crunchy's approach does not require a plugin to be class-based. In fact, most plugins used in Crunchy only make use of simple functions inside modules. While the class-based framework introduced in the third post used the fact that Python allowed automatic discovery of subclasses, the approach used in Crunchy requires an explicit registration of plugins. Using the same example as before, this means that op_2.py would contain the following code:

def register(OPERATORS):
    OPERATORS['**'] = operator_pow_token

class operator_pow_token(object):
    lbp = 30
    def led(self, left):
        return left ** expression(30-1)


Note that the class (operator_pow_token) is unchanged from the original application.
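
Because registration is explicit, nothing forces a plugin to define a class at all. The following is a hypothetical sketch (the HANDLERS registry and the shout/reverse functions are invented for illustration and are not taken from Crunchy) of a plugin made only of plain functions and a register() hook:

```python
# Hypothetical sketch: a plugin built from plain functions only.
# HANDLERS, shout and reverse are illustrative names, not Crunchy's API.
HANDLERS = {}

def register(handlers):
    """Explicit registration hook, called once by the host application."""
    handlers['shout'] = shout
    handlers['reverse'] = reverse

def shout(text):
    return text.upper() + '!'

def reverse(text):
    return text[::-1]

# The host application calls the hook and then looks handlers up by name.
register(HANDLERS)
print(HANDLERS['shout']('hello'))
print(HANDLERS['reverse']('abc'))
```

The host only needs to know the name of the hook (register) and the shape of the registry it passes in.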

The method used to find plugins is similar to that introduced previously. The entire code required is as follows:

def init_plugins(expression):
    plugin_dir = (os.path.dirname(os.path.realpath(__file__)))
    plugin_files = [x[:-3] for x in os.listdir(plugin_dir) if x.endswith(".py")]
    sys.path.insert(0, plugin_dir)
    for plugin in plugin_files:
        mod = __import__(plugin)
        if hasattr(mod, "register"):
            mod.expression = expression
            mod.register(OPERATORS)

By comparison, the code used in the class-based plugin could have been written as:

def init_plugins(expression):
    plugin_dir = os.path.dirname(os.path.realpath(__file__))
    plugin_files = [x[:-3] for x in os.listdir(plugin_dir) if x.endswith(".py")]
    sys.path.insert(0, plugin_dir)
    for plugin in plugin_files:
        mod = __import__(plugin)
        mod.expression = expression
    for plugin in Plugin.__subclasses__():
        OPERATORS[plugin.symbol] = plugin

So, in one case (Crunchy-style), we have an explicit registration process with no need to create a sample base class (and, in fact, no need to work with classes at all), while in the other we have an automatic registration based on identifying subclasses. This does not mean that the Crunchy-style is better - just different. Both are equally good for this type of simple application. While we have not found the approach used in Crunchy to be limiting us in any way when extending Crunchy, something must be said for the fact that all the other Python examples of plugin-based applications I have found have been based on using classes.
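
To try the explicit registration style end to end without the rest of the calculator, here is a minimal sketch written for Python 3 (the listings in these posts are Python 2). It makes several assumptions that are not in the original code: the plugin directory is passed in as a parameter, the plugin file is written to a temporary directory, the module name demo_op_pow is invented, and fake_expression is a stand-in for the real expression() function.

```python
import importlib
import os
import shutil
import sys
import tempfile

OPERATORS = {}

# Source of a tiny plugin, mirroring op_2.py from the post.
PLUGIN_SOURCE = '''
def register(operators):
    operators['**'] = operator_pow_token

class operator_pow_token(object):
    lbp = 30
    def led(self, left):
        return left ** expression(30 - 1)
'''

def init_plugins(plugin_dir, expression):
    """Import every .py file in plugin_dir and call its register() hook, if any."""
    names = sorted(f[:-3] for f in os.listdir(plugin_dir) if f.endswith('.py'))
    sys.path.insert(0, plugin_dir)
    importlib.invalidate_caches()
    try:
        for name in names:
            mod = importlib.import_module(name)
            if hasattr(mod, 'register'):
                mod.expression = expression  # same injection "wart" as in the post
                mod.register(OPERATORS)
    finally:
        sys.path.remove(plugin_dir)

def fake_expression(rbp=0):
    # Assumption: stand-in for the calculator's expression(); always yields 3.
    return 3

plugin_dir = tempfile.mkdtemp()
with open(os.path.join(plugin_dir, 'demo_op_pow.py'), 'w') as f:
    f.write(PLUGIN_SOURCE)

init_plugins(plugin_dir, fake_expression)
result = OPERATORS['**']().led(2)   # 2 ** fake_expression(29)
print(result)                       # 8
shutil.rmtree(plugin_dir)
```

A module without a register() function is imported but otherwise ignored, which is the only behavioural difference from the subclass-based version.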

I can now give another motivation for having chosen the small expression calculator as a candidate for a plugin-based application: since all mathematical operations were already implemented as classes, it was readily suitable for the class-based approach (and the Zope Component Architecture one, etc.) whereas all my existing code samples that used plugins (from Crunchy, docpicture, etc.) had mostly functions rather than classes in plugins.

Plugins - part 3: Simple class-based plugin

In the first post of this series, I introduced a simple application to be used as a demonstration of a plugin-based application. The chosen application was an expression calculator contained in a single file. In the second post, I modularized the original file so that the new file structure would become a good representative of a plugin-based application. In this post, I will explain how to make use of a simple class-based plugin framework. The model I have chosen follows fairly closely the tutorial written by Armin Ronacher. Another tutorial demonstrating a simple class-based plugin framework has been written by Marty Alchin.

The first step is to define a base Plugin class. All we need is to include the following in base.py:

class Plugin(object):
    pass


Next, we ensure that classes used in plugins derive from this base class. We only give one explicit example, that of the class included in op_2.py since the 4 classes included in op_1.py would be treated in exactly the same way.

from plugins.base import Plugin

class operator_pow_token(Plugin):
    symbol = '**'
    lbp = 30
    def led(self, left):
        return left ** expression(30-1)

Note that we added one more line of code to the class definition. We are now ready to deal with the plugin discovery and registration.

Rather than hard-coding the information about which plugin files to import as we did when we simply modularized the application, we give a way for our program to automatically find plugins. With the file structure that we have created, this can be accomplished as follows:

def find_plugins(expression):
    '''find all files in the plugin directory and imports them'''
    plugin_dir = os.path.dirname(os.path.realpath(__file__))
    plugin_files = [x[:-3] for x in os.listdir(plugin_dir) if x.endswith(".py")]
    sys.path.insert(0, plugin_dir)
    for plugin in plugin_files:
        mod = __import__(plugin)
        mod.expression = expression

Note that the last line of code is included because of the "wart" mentioned in the previous post and would not usually be included. To be safe, we should probably have ensured that expression was not already defined in the modules to be imported since, in theory, Python files other than plugins (such as __init__.py) might be present in the plugin directory. In this tutorial series we will often ignore the need to insert try/except clauses to simplify the code.
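
For readers who do want the safety checks mentioned above, the following is one possible sketch, written for Python 3 and not part of the original code. The choices it makes are assumptions: files whose names start with an underscore (such as __init__.py) are skipped, a failing import is recorded rather than raised, expression is only injected when the module does not already define that name, and the directory is passed in as a parameter. The file names in the small demo are invented.

```python
import importlib
import os
import shutil
import sys
import tempfile

def find_plugins(plugin_dir, expression):
    """Import plugin modules from plugin_dir; return (loaded, failed)."""
    loaded, failed = [], []
    names = sorted(f[:-3] for f in os.listdir(plugin_dir)
                   if f.endswith('.py') and not f.startswith('_'))
    sys.path.insert(0, plugin_dir)
    importlib.invalidate_caches()
    try:
        for name in names:
            try:
                mod = importlib.import_module(name)
            except Exception as exc:      # a broken plugin should not stop the others
                failed.append((name, exc))
                continue
            if not hasattr(mod, 'expression'):
                mod.expression = expression   # inject only if the name is free
            loaded.append(mod)
    finally:
        sys.path.remove(plugin_dir)
    return loaded, failed

# Small demo with one good plugin, one broken plugin and an __init__.py.
demo_dir = tempfile.mkdtemp()
for filename, source in [('__init__.py', ''),
                         ('demo_good.py', 'VALUE = 1\n'),
                         ('demo_broken.py', 'def oops(:\n')]:
    with open(os.path.join(demo_dir, filename), 'w') as f:
        f.write(source)

loaded, failed = find_plugins(demo_dir, lambda rbp=0: 0)
print([m.__name__ for m in loaded])
print([name for name, exc in failed])
shutil.rmtree(demo_dir)
```

Whether a failed import should be silently collected, logged, or treated as fatal is a design decision for the host application; the sketch only shows where that decision would go.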

While we have imported the modules containing the plugins, they are not yet known in a useful form by the main application. To do so is very simple in this class-based approach, thanks to Python's treatment of (sub-)classes. Here's the code to do this:

def register_plugins():
    '''Register all class based plugins.

    Uses the fact that a class knows about all of its subclasses
    to automatically initialize the relevant plugins
    '''
    for plugin in Plugin.__subclasses__():
        OPERATORS[plugin.symbol] = plugin


That's it! It is hard to imagine anything simpler. With this last definition, the entire base.py module can be written as:

import os
import sys

OPERATORS = {}

class Plugin(object):
    pass

def init_plugins(expression):
    '''simple plugin initializer
    '''
    find_plugins(expression)
    register_plugins()

def find_plugins(expression):
    '''find all files in the plugin directory and imports them'''
    plugin_dir = os.path.dirname(os.path.realpath(__file__))
    plugin_files = [x[:-3] for x in os.listdir(plugin_dir) if x.endswith(".py")]
    sys.path.insert(0, plugin_dir)
    for plugin in plugin_files:
        mod = __import__(plugin)
        mod.expression = expression

def register_plugins():
    '''Register all class based plugins.

    Uses the fact that a class knows about all of its subclasses
    to automatically initialize the relevant plugins
    '''
    for plugin in Plugin.__subclasses__():
        OPERATORS[plugin.symbol] = plugin
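
One detail of register_plugins() is worth keeping in mind: __subclasses__() returns only the direct subclasses of a class. That is enough here because every operator class derives straight from Plugin. The sketch below shows the difference, using a hypothetical operator_fancy_pow_token class and an all_subclasses() helper, neither of which is part of the calculator.

```python
class Plugin(object):
    pass

class operator_pow_token(Plugin):
    symbol = '**'

# Hypothetical plugin that derives from another plugin, not from Plugin itself.
class operator_fancy_pow_token(operator_pow_token):
    symbol = '^'

def all_subclasses(cls):
    """Recursively collect direct and indirect subclasses of cls."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found

direct = [c.__name__ for c in Plugin.__subclasses__()]
everything = [c.__name__ for c in all_subclasses(Plugin)]
print(direct)       # ['operator_pow_token']
print(everything)   # ['operator_pow_token', 'operator_fancy_pow_token']
```

If plugins were ever allowed to extend other plugins, the registration loop would need something like all_subclasses() to see them.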

In the next post, I will show another simple alternative approach similar to the one used in Crunchy.

Plugins - part 2: modularization

In the first post on the Plugins series, I introduced the small application used to demonstrate how one could modularize applications using a plugin architecture. The digital ink was barely dry on that post when two people rose to the challenge and presented their solutions, one using the standard method with the Zope Component Architecture, the other a modified method using grok. I will comment on these two solutions later in this series.

With apologies to the more advanced users, I have decided to proceed fairly slowly and cover many simple concepts with this series on plugins. Thus, this second post will not yet discuss plugins, but simply lay the groundwork for future posts. By the way, for those interested, and as pointed out by Lennart Regebro in his post, all the code samples that I will use can be browsed at, or retrieved from, my py-fun google code repository.

As a first step before comparing different approaches to dealing with plugins, I will take the sample application introduced in the first post and modularize it.

The core application (calculator.py) is as follows:

import re

from plugins.base import OPERATORS, init_plugins

class literal_token(object):
    def __init__(self, value):
        self.value = value
    def nud(self):
        return self.value

class end_token(object):
    lbp = 0

def tokenize(program):
    for number, operator in re.findall("\s*(?:(\d+)|(\*\*|.))", program):
        if number:
            yield literal_token(int(number))
        elif operator in OPERATORS:
            yield OPERATORS[operator]()
        else:
            raise SyntaxError("unknown operator: %r" % operator)
    yield end_token()

def expression(rbp=0):
    global token
    t = token
    token = next()
    left = t.nud()
    while rbp < token.lbp:
        t = token
        token = next()
        left = t.led(left)
    return left

def calculate(program):
    global token, next
    next = tokenize(program).next
    token = next()
    return expression()

if __name__ == "__main__":
    init_plugins(expression)
    assert calculate("+1") == 1
    assert calculate("-1") == -1
    assert calculate("10") == 10
    assert calculate("1+2") == 3
    assert calculate("1+2+3") == 6
    assert calculate("1+2-3") == 0
    assert calculate("1+2*3") == 7
    assert calculate("1*2+3") == 5
    assert calculate("6*2/3") == 4
    assert calculate("2**3") == 8
    assert calculate("2*2**3") == 16
    print "Done!"


For the next few posts, when I demonstrate some very simple plugin approaches, this core application will remain untouched. This is one important characteristic of plugin-based application: in a well-designed application, plugin writers should not have to modify a single line of the core modules to ensure that their plugins can be used.

Communication between plugins and the core application is ensured via an Application Programming Interface (API) unique to that application. In our example, the API is a simple Python dict (OPERATORS) written in capital letters only to make it stand out.
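
The role of OPERATORS as the API can be seen in a compact, single-file sketch. It is a Python 3 adaptation of the core above, trimmed to two operators, and it is only an illustration: the tokens global and the use of the built-in next() replace the Python 2 idiom of the listing, and registering operator_sub_token afterwards simulates what a plugin would do.

```python
import re

OPERATORS = {}   # the whole "API": symbol -> token class

class literal_token(object):
    def __init__(self, value):
        self.value = value
    def nud(self):
        return self.value

class end_token(object):
    lbp = 0

def tokenize(program):
    for number, operator in re.findall(r"\s*(?:(\d+)|(\*\*|.))", program):
        if number:
            yield literal_token(int(number))
        elif operator in OPERATORS:
            yield OPERATORS[operator]()      # dispatch through the dict
        else:
            raise SyntaxError("unknown operator: %r" % operator)
    yield end_token()

def expression(rbp=0):
    global token
    t = token
    token = next(tokens)
    left = t.nud()
    while rbp < token.lbp:
        t = token
        token = next(tokens)
        left = t.led(left)
    return left

def calculate(program):
    global token, tokens
    tokens = tokenize(program)
    token = next(tokens)
    return expression()

class operator_add_token(object):
    lbp = 10
    def led(self, left):
        return left + expression(10)

class operator_mul_token(object):
    lbp = 20
    def led(self, left):
        return left * expression(20)

OPERATORS['+'] = operator_add_token
OPERATORS['*'] = operator_mul_token
print(calculate("1+2*3"))    # 7

# A new operator is added by touching only the dict, not the core functions.
class operator_sub_token(object):
    lbp = 10
    def led(self, left):
        return left - expression(10)

OPERATORS['-'] = operator_sub_token
print(calculate("10-4"))     # 6
```

Before the '-' entry is added, calculate("10-4") raises SyntaxError; afterwards it works, and neither tokenize() nor expression() had to change.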

In a sub-directory (plugins), in addition to an empty __init__.py file, we include the following three files:

1. base.py

OPERATORS = {}

def init_plugins(expression):
    '''simulated plugin initializer'''
    from plugins import op_1, op_2

    op_1.expression = expression
    op_2.expression = expression

    OPERATORS['+'] = op_1.operator_add_token
    OPERATORS['-'] = op_1.operator_sub_token
    OPERATORS['*'] = op_1.operator_mul_token
    OPERATORS['/'] = op_1.operator_div_token
    OPERATORS['**'] = op_2.operator_pow_token

2. op_1.py

class operator_add_token(object):
    lbp = 10
    def nud(self):
        return expression(100)
    def led(self, left):
        return left + expression(10)

class operator_sub_token(object):
    lbp = 10
    def nud(self):
        return -expression(100)
    def led(self, left):
        return left - expression(10)

class operator_mul_token(object):
    lbp = 20
    def led(self, left):
        return left * expression(20)

class operator_div_token(object):
    lbp = 20
    def led(self, left):
        return left / expression(20)


and 3. op_2.py

class operator_pow_token(object):
    lbp = 30
    def led(self, left):
        return left ** expression(30-1)


The last two files have been simply extracted with no modification from the original application. Instead of having 2 such files containing classes of the form operator_xxx_token, I could have included them all in one file, or split into 5 different files. The number of files is irrelevant here: they are only introduced to play the role of plugins in this application.

The file base.py plays the role here of a plugin initialization module: it ensures that plugins are properly registered and made available to the core program.

Since I wanted to change the original code as little as possible, a "wart" is present in the code as written since it was never intended to be a plugin-based application: the function expression() was accessible to all objects in the initial single-file application. It is now needed in a number of modules. The file base.py takes care of ensuring that "plugin" modules have access to that function in a transparent way. This will need to be changed when using some standard plugin frameworks, as was done in the zca example or the grok one.
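
Why assigning op_2.expression = expression is enough can be shown in isolation. A method looks up global names in the namespace of the module where it was defined, so adding an attribute to that module after import makes the name visible to code already defined there. The sketch below (Python 3) builds a throwaway module in memory instead of importing a file; the module name op_2_demo and the lambda standing in for expression() are assumptions made for the demonstration.

```python
import types

# Same class body as op_2.py; note that expression is not defined here.
PLUGIN_SOURCE = '''
class operator_pow_token(object):
    lbp = 30
    def led(self, left):
        return left ** expression(30 - 1)
'''

# Build an in-memory module so the example needs no files on disk.
plugin = types.ModuleType('op_2_demo')
exec(PLUGIN_SOURCE, plugin.__dict__)

pow_token = plugin.operator_pow_token()

# Before the injection, the global lookup of expression fails.
try:
    pow_token.led(2)
    error_before = None
except NameError as exc:
    error_before = str(exc)
print(error_before)

# After the injection, the same method finds the name in its module's globals.
plugin.expression = lambda rbp=0: 5   # stand-in for the real expression()
value_after = pow_token.led(2)        # 2 ** 5
print(value_after)                    # 32
```

This is convenient but implicit: nothing in the plugin file says where expression comes from, which is why frameworks such as the ZCA or grok examples handle it differently.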

In the next post, I will show how to take this now modularized application and transform it into a proper plugin-based one.

Thursday, December 18, 2008

Plugins - part 1: the application

My interest in plugins started two years ago listening to Ivan Krstić talk about the OLPC. Following his talk, I wrote the following on edu-sig:
One open issue (as I understand it) is that of finding the "best practice" for plugins. The idea is that the core programs should be as small as possible but easy to extend via plugins. I thought that there already was a "well known and best way" to design plugins - and it was on my list of things to learn about (to eventually incorporate rur-ple within crunchy).
After discussing this off-list with Johannes Woolard, I concluded that we should try to redesign Crunchy to make use of plugins. While I was thinking about how we might proceed to do this, Johannes went ahead and implemented a simple plugin framework which we eventually adopted for Crunchy.

While there are a few agreed-upon "standards" when it comes to dealing with plugins in Python (such as setuptools and Zope Component Architecture), I tend to agree with Ivan Krstić's observation that there is no "best practice" for plugins - at least, none that I have seen documented. As what might be considered to be a first step in determining the "best practice" for writing plugin-based applications with Python, I will take a sample application, small enough so that it can be completely included and described in a blog post, and not written with plugins in mind. I thought it would be a more representative example to use an arbitrary sample application, rather than trying to come up with one specifically written for the purpose of this series of posts.

The application I have chosen is a small modification of an expression calculator written and described by Fredrik Lundh, aka effbot, a truly outstanding pythonista. The entire code is as follows:

""" A simple expression calculator entirely contained in a single file.

See http://effbot.org/zone/simple-top-down-parsing.htm for detailed explanations
as to how it works.

This is the basic application used to demonstrate various plugin frameworks.
"""

import re

class literal_token(object):
    def __init__(self, value):
        self.value = value
    def nud(self):
        return self.value

class operator_add_token(object):
    lbp = 10
    def nud(self):
        return expression(100)
    def led(self, left):
        return left + expression(10)

class operator_sub_token(object):
    lbp = 10
    def nud(self):
        return -expression(100)
    def led(self, left):
        return left - expression(10)

class operator_mul_token(object):
    lbp = 20
    def led(self, left):
        return left * expression(20)

class operator_div_token(object):
    lbp = 20
    def led(self, left):
        return left / expression(20)

class operator_pow_token(object):
    lbp = 30
    def led(self, left):
        return left ** expression(30-1)

class end_token(object):
    lbp = 0

def tokenize(program):
    for number, operator in re.findall("\s*(?:(\d+)|(\*\*|.))", program):
        if number:
            yield literal_token(int(number))
        elif operator == "+":
            yield operator_add_token()
        elif operator == "-":
            yield operator_sub_token()
        elif operator == "*":
            yield operator_mul_token()
        elif operator == "/":
            yield operator_div_token()
        elif operator == "**":
            yield operator_pow_token()
        else:
            raise SyntaxError("unknown operator: %r" % operator)
    yield end_token()

def expression(rbp=0): # note that expression is a global object in this module
    global token
    t = token
    token = next()
    left = t.nud()
    while rbp < token.lbp:
        t = token
        token = next()
        left = t.led(left)
    return left

def calculate(program):
    global token, next
    next = tokenize(program).next
    token = next()
    return expression()

if __name__ == "__main__":
    assert calculate("+1") == 1
    assert calculate("-1") == -1
    assert calculate("10") == 10
    assert calculate("1+2") == 3
    assert calculate("1+2+3") == 6
    assert calculate("1+2-3") == 0
    assert calculate("1+2*3") == 7
    assert calculate("1*2+3") == 5
    assert calculate("6*2/3") == 4
    assert calculate("2**3") == 8
    assert calculate("2*2**3") == 16
    print "Done!"


The latest version used can be found online.

In the above code, I have highlighted in red classes that will be transformed into plugins. I have also highlighted in green hard-coded if/elif choices that will become indirect references to the plugin components.

In the next post in this series, I will break up this single file into a set of different modules as a required preliminary step before transforming the whole application into a plugin-based one, with a small core. In subsequent posts, I will keep the core constant and compare various approaches that one can use to link the plugins with the core.

Wednesday, December 17, 2008

Seeing double at Pycon 2009

Jesse Noller is going to give two talks at Pycon 2009. So is Tarek Ziadé. And Mike Fletcher is as well. And Brett Cannon has a talk and a panel. So far I have not seen any post on Planet Python about someone giving just one talk.

I would hate to be the one breaking the streak. So, I might as well announce that I will be giving two talks as well. :-)

Not surprisingly, the first one is about Crunchy. The title of the talk is Learning and Teaching Python Programming: The Crunchy Way, and the abstract reads as follows:
Crunchy (http://code.google.com/p/crunchy) is a program that transforms a static Python tutorial into an interactive session within a browser. In this talk, I will present Crunchy, focusing on the features that are specifically designed to be helpful in a formal teaching setting.

Not exactly Earth-shattering but hopefully of interest to anyone that has to teach programming in a formal setting or who would just be interested in showing off Python to anyone. This Crunchy talk is, of course, not going to be your traditional slide-based talk but rather more like an interactive demo using Crunchy. I am hoping to have a few surprises by the time the conference occurs.

My other talk is going to be very different. I doubt very much that I will be using Crunchy for it. The title is Plugins and Monkeypatching: increasing flexibility, dealing with inflexibility, and the abstract reads as follows:

By using plugins, one can create software that is easily extensible by others, thereby promoting collaborative development. The flip side of extensible software occurs when dealing with some standard framework whose interface is closed but which does not do exactly what is desired. In this case, monkeypatching may be worth considering.
In this talk, I'll give concrete examples of both plugin design and using monkeypatching, using small code samples from existing projects, and discuss the advantages and the shortcomings of the methods used. I will also include the design of a tiny, but flexible module for generating svg code - and compare it with other existing approaches.
I can not pretend to even come close to being an expert about designing plugin based applications. Still, I felt that I have had some potentially useful experiences to share about these topics which motivated my talk proposal. Now that it has been accepted, I have started working on fleshing out the original outline.

In preparation for the actual talk, which will not go into much code details due to time constraints, I plan to start a short series of posts about plugins. In the first post I will give an overview of a simple application (a calculator) that is written as a single file. In the second post, I will reorganize the code so as to use multiple files, with a number of modules located in a "plugins" directory, laying out the groundwork for working with actual plugins. Subsequent posts will be used to demonstrate different approaches used to transform the application into a truly plugin-based one.

Of course, the plugin model used in Crunchy will be one approach showcased. A second one (which I have already implemented) is a simple class-based one inspired by a tutorial written by Armin Ronacher. I also plan to demonstrate how to use the Zope component architecture approach as well as the setuptools based method (and possibly others depending on suggestions I might receive).

Since I have never actually written any code using the Zope component architecture or the setuptools based approach, I thought it would be interesting to do this in a truly open-source spirit. Therefore, once I have written the first two or three posts in this series, I would like to invite anyone interested to contribute their own code demonstrating their favourite framework. This way, experts could make sure that their favourite framework is properly showcased, and not misrepresented by me. Interested parties can contribute either by sending me the code directly or by blogging about it. (If your blog appears on either planet.python.org or planetpython.org, I will most likely read it.)

Anyone who contributes in this way to my talk will be mentioned at Pycon AND receive half of the stipend I get as a presenter. ;-)

Friday, November 28, 2008

Thwarted by lack of speed

I was hoping to make an announcement of a new cool app based on Google's App Engine but unfortunately I have been thwarted by Python's relative lack of speed.

I have started working on a new version of Crunchy that would run as a web app on Google's servers. While the current version of Crunchy fetches existing html pages, processes them and displays them in the browser, this new version would retrieve html page content (in reStructuredText format) from Google's datastore, transform it into html, process it to add interactive elements, and then display it.

This new app was going to be usable as a wiki to create new material. This was my starting point, greatly helped by an already existing wiki example that I adapted to use reStructuredText. When requesting a page, the following was supposed to happen:

1. reStructuredText content (for the body of the html page) is fetched from the datastore.
2. said content is transformed (by docutils) into html
3. html content is further processed by modified "crunchy engine" to add interactive elements.
4. modified html content is inserted in page template and made available.

The user would then be able to enter some Python code which could be sent back to the App Engine using Ajax for processing and updating the page display.

A normal user would only be able to interact with already existing pages. Special users ("editors") only would have been able to add pages. I was hoping that people teaching Python would be interested in writing doctest-based exercises and that a useful collection could be implemented over time.

Unfortunately, this approach can not work, at least not using Google's App Engine on Google's own servers. :-(

Just playing with small pages, steps 1 and 2 are long enough that I get warnings logged mentioning that requests are taking too long. I know from experience that step 3 (which I have not started to implement/port from the standard Crunchy) can take even longer for reasonably sized pages. So, this does not appear to be feasible ... which is unfortunate.

I think I will continue to develop this app to be used as a local one and perhaps write a second wiki-based app that would take html code with no further processing. I could use the first one to create a page, have it processed and use the "view source" feature of Firefox to cut and paste the content into the online app. This would remove the need for any processing of pages on Google's servers - only Python code execution would need to be taken care of. (Of course, a user could enter some code sample that would take too long to execute and hit Google's time limit ...)

If anyone has a better idea, feel free to leave it as a comment.

Saturday, November 01, 2008

docpicture progress

For those interested, docpicture can now display images from the web. There's also a somewhat silly example where I embedded the code for a matplotlib example inside a docstring and have it displayed as a plot when viewing the docstring via docpicture inside a web browser. In order to do so I had to exec the code which is not exactly good practice ... but it serves to highlight the need to either only allow "parsers" from the standard distribution or require the user to give permission to a parser to be able to register itself with docpicture while it is running. I chose this second approach, although if you run the demo, you will not be given the opportunity to approve or not the parser - it will be done for you. This may need to be revisited...

I just announced a new release on the Python list. You can get docpicture 0.2 from here.