For some reason or another, I’m getting a lot of long-term hits to this post that I wrote a few months ago on parallel programming with Python. Thus, I thought I might share the news with people who have been looking for solutions to this problem: Python 2.6, which was just released, contains a new multiprocessing library which at first glance seems to formalize the procedure I outlined in my previous post and make it more robust. From the description:
multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads. Due to this, the multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. It runs on both Unix and Windows.
If you’re interested in that, go grab 2.6 and check out the documentation for the package here!
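For a rough feel of what the description above means in practice, here is a minimal sketch. The names `square` and `report` are made up for illustration and are not from the documentation; `multiprocessing.Process` and `multiprocessing.Pool` are the real classes. The `if __name__ == "__main__":` guard matters on Windows, where child processes re-import the main module.

```python
import multiprocessing
import os


def square(n):
    """Stand-in for CPU-bound work (hypothetical example function)."""
    return n * n


def report(name):
    """Print the current process id to show the work runs in a separate process."""
    print("%s running in pid %d" % (name, os.getpid()))


if __name__ == "__main__":
    # Process mirrors threading.Thread: construct it, start() it, join() it.
    p = multiprocessing.Process(target=report, args=("worker",))
    p.start()
    p.join()

    # Pool spreads calls to square() across a fixed number of worker processes.
    pool = multiprocessing.Pool(processes=2)
    try:
        print(pool.map(square, range(5)))
    finally:
        pool.close()
        pool.join()
```

Because each worker is its own interpreter process with its own Global Interpreter Lock, the calls to `square` can run on separate cores.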




October 3, 2008 at 10:33 pm |
Pretty cool, it sounds like they’re using microthreads like in Stackless Python.
October 3, 2008 at 11:08 pm |
Nope, multiprocessing uses processes, as the name says.
October 23, 2009 at 12:30 pm |
You said “For some reason or another…” and I think the reason is, in fact, the release of Python 2.6 and people’s use of the multiprocessing module of which you speak.
The trick is that processes are hard to abstract over in a platform-agnostic way, and the code seems (maybe?) to have lurking bugs due to a (possibly?) rushed delivery.
I think “people” are running into problems using multiprocessing and then googling around for solutions or bug reports or *something* to explain what’s going wrong. Anyway, that’s how I found your other article and subsequently this one by way of trackback.
For reference, the problem I’m dealing with is that my script fails to die gracefully. When I ^C out of it, it is very frequently in multiprocessing.forking.poll(), sitting on the line “pid, sts = os.waitpid(self.pid, flag)”. I found your previous post when I started broadening my Google searches around Python fork() topics based on this observation.
October 23, 2009 at 5:27 pm |
To help anyone else who stumbles across this blog post, the thing that ended up solving my problem was using multiprocessing.Manager() and getting a manager.Queue() from it, instead of using a plain old multiprocessing.Queue().
The manager objects are rarely in any online examples I’ve found. They tend to demo the basic functionality with toy problems like having a process print its id or some such. The value of the Manager shows up in preventing some sort of deadlocking problem that pops up sporadically when passing data between processes.
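Since examples with the Manager are hard to find, here is a minimal sketch of the pattern, not the actual script from the comment above. The function names `worker` and `collect` are invented for illustration. `manager.Queue()` returns a proxy to a queue that lives in a separate manager process. One documented pitfall with a plain `multiprocessing.Queue()` is that a child which has put items on it may not exit until those items are consumed, so joining the child before draining the queue can hang. That may or may not be the deadlock described above.

```python
import multiprocessing


def worker(task_id, results):
    """Put one (task_id, square) pair on the shared queue; runs in a child process."""
    results.put((task_id, task_id * task_id))


def collect(results, count):
    """Read exactly `count` items from the queue and return them sorted."""
    return sorted(results.get() for _ in range(count))


if __name__ == "__main__":
    manager = multiprocessing.Manager()
    results = manager.Queue()  # proxy object; the real queue lives in the manager process

    procs = [
        multiprocessing.Process(target=worker, args=(i, results))
        for i in range(4)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()  # the children hold no buffered queue data, so they can exit cleanly

    print(collect(results, len(procs)))
    manager.shutdown()
```

The trade-off is speed: every `put` and `get` on the proxy is a round trip to the manager process, so this is slower than a plain `multiprocessing.Queue()`.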