Sunday, January 10, 2010

Python + cython: faster than C?

[Note added on January 15, 2013: I am amazed that this post, clearly written tongue-in-cheek 3 years ago, can still spur people into doing their own test, and feeling the urge to comment.]

Now that I have your attention... ;-)

I've been playing around with cython and will write more about it soon.  But I could not resist doing a quick test when I read this post, comparing C and Java on some toy micro benchmark.

First, the result:

andre$ gcc -o fib -O3 fib.c
andre$ time ./fib
433494437

real    0m3.765s
user    0m3.463s
sys     0m0.028s

andre$ time python fib_test.py
433494437

real    0m2.953s
user    0m2.452s
sys     0m0.380s

Yes, Python+cython is faster than C!  :-)

Now, the code.  First, the C program taken straight from this post, except for a different, more meaningful value for "num" ;-)

#include 

double fib(int num)
{
   if (num <= 1)
       return 1;
   else
       return fib(num-1) + fib(num-2);
 }
 int main(void)
{
     printf("%.0f\n", fib(42));
    return 0;
 }

Next, the cython code ...

# fib.pyx

cdef inline int fib(int num):
    if (num <= 1):
        return 1
    else:
        return fib(num-1) + fib(num-2)

def fib_import (int num):
    return fib(num)

... and the Python file that calls it

# fib_test.py

import pyximport
pyximport.install()

from fib import fib_import

print fib_import(42)


  1. I know, I declared fib() to be of type "double" in the C code (like what was done in the original post) and "int" in the cython code; however, if I declare it to be of type "int" in the C code, I get -0 as the answer instead of 433494437.  I could declare it to be of type "unsigned int" ... but even when I did that, the Python code was marginally faster.
  2. If I declared fib to be of type "double" in the cython code, it was slightly slower than the corresponding C code.  However, what Python user would ever think of an integer variable as a floating point number! ;-)
  3. What is most impressive is that the cython code is NOT pre-compiled, unlike the C code.  However, to be fair ... I did run it a few times and took the best result, and it was always slower the first time.
  4. Yes, it is silly to compare such micro-benchmarks ... but it can be fun, no? ;-)
More seriously: using cython is a relatively painless way to significantly increase the speed of Python applications ... without having to learn a totally different language.

15 comments:

Joseph said...

I know this is just a fun toy test, but your single trial brought this article to mind:
http://www.zedshaw.com/essays/programmer_stats.html

Anonymous said...

What is most impressive is that the cython code is NOT pre-compiled, unlike the C code.

Huh? What do you think the pyximport stuff does? Of course, the cython code is compiled. It just happens auto-magically in the background when you run the python script first.

André Roberge said...

@Joseph: I completely agree with what Zed Shaw wrote; my post was written partly tongue in cheek.

@anonymous: I know that ... but think about it: the C code is compiled first, then run. The execution time does not include the compilation time. The python/cython code is compiled on the fly AND run; its execution time includes the time to compile and yet it is faster (or, let's just say... comparable). That's rather impressive, wouldn't you agree?

Unknown said...

the function in cython is defined as inline, tried inlining the function in C?

Unknown said...

Sorry, forget my comment :-) Better refresh my C knowledge first :-P

Sancho said...

The behind-the-scenes compilation caused by pyximport only happens on the first import after the last modification to the .pyx file. On subsequent runs/imports, compilation isn't done.

Jeff Kaufman said...

When you say "what is most impressive is that the cython code is NOT pre-compiled, unlike the C code. However, to be fair ... I did run it a few times and took the best result, and it was always slower the first time" you make no sense. The cython code is only compiled the first time you run it, which is why it is slower the first time. If you only take the best timing result as your benchmark, you're comparing precompiled C to precompiled cython.

André Roberge said...

@Sancho and @cbr: Could you tell me where the compiled cython file resides?

andre$ ls -l f*
-rwxr-xr-x 1 andre andre 12604 10 Jan 17:20 fib
-rw-r--r-- 1 andre andre 192 10 Jan 17:20 fib.c
-rw-r--r-- 1 andre andre 176 10 Jan 17:33 fib.pyx
-rw-r--r-- 1 andre andre 102 10 Jan 17:15 fib_test.py

Anonymous said...

Taking aside the complete nonsense of comparing C function declared as double with Cython code operating on ints (just google for the code if you need a working example..), I include below the code from cython transformation (debug comments removed for clarity):

static INLINE int __pyx_f_3fib_fib(int __pyx_v_num) {
int __pyx_r;
int __pyx_t_1;
__Pyx_RefNannySetupContext("fib");

__pyx_t_1 = (__pyx_v_num <= 1);
if (__pyx_t_1) {
__pyx_r = 1;
goto __pyx_L0;
goto __pyx_L3;
}
__pyx_r = (__pyx_f_3fib_fib((__pyx_v_num - 1)) + __pyx_f_3fib_fib((__pyx_v_num - 2)));
goto __pyx_L0;
}
__pyx_L3:;

__pyx_r = 0;
__pyx_L0:;
__Pyx_RefNannyFinishContext();
return __pyx_r;
}

Do you really think it will be faster if you compile it implicitly using pyrex and gcc than when doing it manually with gcc?

Anonymous said...
This comment has been removed by a blog administrator.
Unknown said...

Aside from the obvious bias with types and keyworlds your compiler fails... mine produces a constant and printf call - thats just not possible to beat without invalidating the experiment by not using the same library call for printf.

Its probably not inlining etc. for the sake of easy debugging by default.

Christopher Eads said...

Out of curiosity, are you a C programmer? If you are getting -0 as the result when you return an int from fib() instead of a double, I have a feeling your printf is still using %f. Change your printf line to:

printf("%i\n", fib(42));

and observe that it will print the correct result from an int returning function. Actually, on my system I get 0 if I use %.0f instead of %i, and my compiler warns me that I'm doing it wrong anyway. It's more than merely correct by the way:

chris@Eureka ~ $ time ./fib
433494437

real 0m2.630s
user 0m2.629s
sys 0m0.001s
chris@Eureka ~ $ time ./fib_test.py
433494437

real 0m11.252s
user 0m11.225s
sys 0m0.019s


This is with the python code exactly as you wrote it. I don't know why the python is consistently this much slower, but seeing as the C is faster than yours and much faster than the python, I'd say all you've really learned is that floating point math is slower than int math, which is fairly common knowledge.

And let's be clear, I like python, this is just meaningless.

Anonymous said...

On my MacBook Pro:

fib.c (the integer version: int fib(int num))
==
gcc -O3 fib.c
time ./a.out
433494437

real 0m1.806s
user 0m1.773s
sys 0m0.004s
==

and Pyrex:
==
time /opt/local/bin/python2.7 fib_test.py
433494437

real 0m2.196s
user 0m2.030s
sys 0m0.129s
==


btw: the C floating point version:

==
gcc -O3 fib.c
bash-3.2$ time ./a.out
433494437

real 0m1.575s
user 0m1.557s
sys 0m0.004s
==

Anonymous said...

If it's not obvious to you why this old post is getting comments now -- I think it might have to do with it appearing on the front-page of Planet Python ;)

Oh and about this question:
> @Sancho and @cbr: Could you tell me where the compiled cython file resides?

You'll find the files in ~/.pyxbld

André Roberge said...

This post appeared in Planet Python only after I wrote the update (due to a Blogger setting); comments were appearing before that.