MPI Pool #38
Conversation
Hi @joezuntz, First of all, thanks for this! A lot of people have been asking for something like this, and it's a very satisfying solution. Sorry for taking a million years, but I've finally started merging this and have added your changes. A few things:
Thanks again, Dan
Cool - good to hear. I'm using emcee as part of the nascent Dark Energy Survey parameter estimation code, and it's been great, thanks.
Since I think there is some kind of clash going on at some point here, I managed to solve it by having the master wait for all the tasks to be received before demanding results. This shouldn't slow anything down too much, I think. See patch (couldn't face setting up a whole new fork+branch+pull request):
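For concreteness, here is a minimal sketch of the pattern that patch describes, assuming an mpi4py setup where rank 0 dispatches tasks to the other ranks; the function name and round-robin scheme are illustrative, not taken from the actual patch:

```python
# Hypothetical sketch of the "wait for all tasks before collecting"
# fix described above -- not the actual patch. Assumes mpi4py; rank 0
# is the master, every other rank a worker that echoes the task's tag
# back with its result.
from mpi4py import MPI

def dispatch_and_collect(comm, tasks):
    nworkers = comm.Get_size() - 1
    # Post all the task sends without blocking.
    requests = [comm.isend(task, dest=i % nworkers + 1, tag=i)
                for i, task in enumerate(tasks)]

    # The fix: block until every task has been received by its worker
    # before asking any worker for a result.
    MPI.Request.Waitall(requests)

    # Only now demand the results, in task order.
    return [comm.recv(source=i % nworkers + 1, tag=i)
            for i in range(len(tasks))]
```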
I thought the pickle/unpickle infrastructure kept some caching, so that sending the same function repeatedly wouldn't be expensive. Or, as an alternative to changing the pool, couldn't you just use a global? --dstn
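If I read the suggestion right, the global alternative would look something like this sketch (illustrative names, not code from the thread): the objective function lives at module level on every process, so only the arguments are ever pickled and sent.

```python
# Illustrative sketch of the "global" alternative: every MPI process
# imports this module, so lnprob itself never has to be pickled and
# shipped -- only the walker positions do.
from mpi4py import MPI

def lnprob(theta):
    # Defined identically on all processes at import time.
    return -0.5 * sum(t * t for t in theta)

def worker_loop(comm=MPI.COMM_WORLD):
    # Workers receive positions, apply the module-level lnprob, and
    # send back results until a None sentinel arrives.
    while True:
        task = comm.recv(source=0)
        if task is None:
            break
        tag, theta = task
        comm.send((tag, lnprob(theta)), dest=0)
```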
Thanks! That fix worked. I'll finish up the documentation and then send you a link so that you can find whatever I mess up. I'd love to know the answer to @dstndstn's question, because (I think) my friend @jonathansick found that he could significantly speed up parallelization (using ZeroMQ). @dstndstn: I'm not sure I quite understand your second comment. Can you explain a bit more what this would look like? Thanks!
@dfm @dstndstn My "solution" to the function-pickling overhead problem was to set up a bunch of work servers so that the objective function is set up/initialized only once per node. This is really great for objective functions that carry a lot of data (say, a stellar population synthesis code). The worker cluster is also persistent between emcee runs.

My zmq package is at https://github.com/jonathansick/mapscale and there's half-decent documentation at http://mapscale.jonathansick.ca. My branch of emcee (https://github.com/jonathansick/emcee) contains some mods to make the mapper object work as a replacement for Python's multiprocessing pool.

That said, MapScale is a completely alpha thing right now: no unit tests, no inter-machine networking yet, no real robustness against network hiccups/job loss. The mpi4py integration from @joezuntz is certainly the way to go right now if it can cache the objective function. I'll want to check it out with (python-wrapped) FSPS pop synth problems, since that's a case where there's a huge overhead in initializing the objective function that should only be done once.
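Reduced to a sketch, the work-server idea looks like this (this is not MapScale's actual API; the setup function and port are placeholders): each worker pays its initialization cost once at startup and then serves likelihood requests indefinitely, across emcee runs.

```python
# Rough sketch of a persistent work server in the spirit of the
# MapScale design described above (not its actual API). Assumes pyzmq.
import zmq

def expensive_setup():
    # Placeholder for one-time initialization, e.g. loading the data
    # and spectral libraries for a pop synthesis likelihood.
    return lambda theta: -0.5 * sum(t * t for t in theta)

def serve(port=5555):
    lnprob = expensive_setup()  # paid once per node, not per call
    sock = zmq.Context().socket(zmq.REP)
    sock.bind("tcp://*:%d" % port)
    while True:
        theta = sock.recv_pyobj()       # one walker position in...
        sock.send_pyobj(lnprob(theta))  # ...one likelihood value out
```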
Thanks for your comments, @jonathansick. Perhaps it's worth taking this to a new thread... maybe when mapscale-emcee is ready for prime time. @joezuntz: I've pushed the documentation for your patch. Besides the fact that the source code link gets a 404 (this will be fixed when we push to master), I think that this looks about right. Thoughts? Thanks again for your contribution.
I should have seen this before, but it's pretty trivial to cache the applied function - you just have to have the master check whether the function has changed and only re-send it to the workers if it has. That way you don't lose any generality, but you get the efficiency of sending the function only once. Patch here:
The docs look good - not sure if it's worth mentioning that you need to close the pool at the end? (Actually you could put this in the master's destructor, though I think that would potentially be confusing.)
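As a sketch of the check being described (not the actual patch; the tag and names are made up), the master can remember the pickled form of the last function it broadcast and skip the re-send when nothing has changed:

```python
# Hypothetical sketch of the function-caching check -- not the actual
# patch. Assumes mpi4py; rank 0 is the master.
import pickle
from mpi4py import MPI

FUNCTION_TAG = 0  # illustrative tag reserved for function updates

class CachingMaster(object):
    def __init__(self, comm=MPI.COMM_WORLD):
        self.comm = comm
        self._sent = None  # pickled form of the last function broadcast

    def broadcast_function(self, function):
        pickled = pickle.dumps(function)
        if pickled != self._sent:
            # The function changed (or this is the first call): ship it
            # to every worker. Otherwise send nothing at all, so
            # repeated maps of the same function cost nothing extra.
            for worker in range(1, self.comm.Get_size()):
                self.comm.send(function, dest=worker, tag=FUNCTION_TAG)
            self._sent = pickled
```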
For problems where a single likelihood calculation is significantly slow, you very quickly end up needing a distributed-memory machine. This is also a really good way of using the parallelism that the emcee algorithm offers.
I've added two example files here containing an MPI pool, so you can scale the algorithm up to hundreds of cores without changing the emcee core at all. I also have another variant which is a little more efficient, but is pretty misleading and less future-proof should you change how pools are used in emcee. (It assumes that the function being mapped never changes, so it never has to be sent.)
The implementation uses the mpi4py package.
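In outline, such a pool can look like the following simplified sketch (in the spirit of the example files, not the files themselves): the master farms tasks out over MPI while the workers sit in a receive loop.

```python
# Simplified sketch of an MPI pool exposing the map() interface that
# emcee expects from a pool -- in the spirit of this PR's example
# files, not the actual contributed code. Rank 0 is the master.
from mpi4py import MPI

class MPIPool(object):
    def __init__(self, comm=None):
        self.comm = MPI.COMM_WORLD if comm is None else comm
        self.rank = self.comm.Get_rank()

    def is_master(self):
        return self.rank == 0

    def wait(self):
        # Worker loop: apply whatever function the master sends, until
        # a None sentinel from close() says to stop.
        while True:
            task = self.comm.recv(source=0)
            if task is None:
                break
            index, function, args = task
            self.comm.send((index, function(args)), dest=0)

    def map(self, function, tasks):
        # Master: round-robin the tasks over the workers with
        # non-blocking sends, complete them all before collecting
        # (the fix discussed above), then restore task order. Note
        # that the function is pickled with every task -- exactly the
        # overhead the caching variant avoids.
        nworkers = self.comm.Get_size() - 1
        requests = [self.comm.isend((i, function, args),
                                    dest=i % nworkers + 1)
                    for i, args in enumerate(tasks)]
        MPI.Request.Waitall(requests)
        results = [None] * len(tasks)
        for _ in tasks:
            index, value = self.comm.recv(source=MPI.ANY_SOURCE)
            results[index] = value
        return results

    def close(self):
        # Tell every worker to fall out of its wait() loop.
        for worker in range(1, self.comm.Get_size()):
            self.comm.send(None, dest=worker)
```

In a script launched with something like `mpirun -n 16 python script.py`, the worker ranks call `pool.wait()` and block there serving map requests, while the master constructs the sampler with `pool=pool` and calls `pool.close()` when sampling is done.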