Sunday, November 25, 2007

Sprint Discussions: Wrapping External Libraries

A more technical discussion during the sprint was about the next steps for the external module problem (minutes). One of PyPy's biggest problems in becoming more generally useful are C extension modules, which can't work with PyPy's Python interpreter. We already reimplemented many of the more commonly used extension modules in CPython's standard library in Python or RPython. However, there are more missing and there is no way to implement all the extension modules that other people have written.

Whiteboard after the discussion.

Therefore we need a different approach to this problem. Extension modules are commonly written for two different reasons, one being speed, the other being wrapping non-Python libraries. At the moment we want mostly to approach a solution for the latter problem, because we hope that the JIT will eventually make it possible to not have to write extension modules for speed reasons any more.

There are two rough ideas to approach this problem in the near future (there are other, more long-term ideas that I am not describing now): One of them is to add the ctypes module to PyPy's Python interpreter, which would mean re-implementing it since the existing implementation is written in C.

The other way would be to work on the existing way to get extensions in that PyPy provides, which are "mixed modules". Mixed modules are written in a combination of RPython and normal Python code. To then wrap C libraries you would use rffi, which is the foreign function interface of RPython.

The discussion round: Maciek Fijalkowski, Armin Rigo, Richard Emslie, Alexander Schremmer.

Both approaches have problems: With ctypes you have no built-in way to query C header files for structure layouts and constants which requires you to hard-wire them, which is highly platform dependant. Mixed modules are not really fun to write, since they need to be RPython and we currently don't have a way to do separate compilation, so you always need to translate PyPy's whole Python interpreter to see whether your module is correct.

In the meeting it was decided to first go for a ctypes replacement. The replacement would be written in pure Python, we already have a very thin wrapper around libffi which the new ctypes implementation would use. The goal to reach would be to get the pygame implementation in ctypes to run on PyPy.

To make ctypes more useful in general to write this kind of wrappers, we will probably extract some code that we have already written for PyPy's own usage: it gives a way to write "imprecise" declarations ("a structure with at least fields called x and y which are of some kind of integer type") and turn them into exact ctypes declarations, internally using the C compiler to inspect the platform headers.

After this is done we should approach separate compilation so that developing modules in RPython has a quicker turnaround time. This is somewhat involved to implement for technical reasons. There are ideas how to implement it quickly to make it usable for prototyping, but it's still a lot of work.

3 comments:

Simon Burton said...

Is it not possibe to test rpython extension modules for pypy on top of cpython ? (ie. without compilation)

Maciej Fijalkowski said...

Yop, sure it is. PyPy extension modules runs through ctypes on top of CPython.

Anonymous said...

I guess that Simon meant to ask why easier module testing requires separate compilation. The fact is that even if a module runs fine on top of CPython, there will be some RPython issues that are only visible when you try to translate it.

Armin