On Sat, May 26, 2012 at 2:52 PM, Fernando Perez wrote:
> On Sat, May 26, 2012 at 2:01 PM, Charles Cadieu wrote:
> > Thanks for pointing this out. Keeping copies of every piece of data
> > passed through certainly seems like it could be the cause of the
> > memory growth.
> > I just upgraded to the development source and tried this db class:
> > c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'
> > but I still experience the same behavior, although it appears to run
> > for more iterations when I use the NoDB class.
> > I can give the MongoDB option a try next week.
> Is it possible for you to do some memory profiling of your machine
> while running, to see *which* process is the one growing in size?
> That would help a lot in finding a good strategy to improve here.
There are two objects with growing caches in IPython.parallel when you are
sending and receiving lots of data:
1. The Hub remembers everything you send and receive, for resubmission and
delayed retrieval. NoDB disables this entirely, and MongoDB/SQLiteDB write
the data to disk rather than keeping it in memory, as the default DictDB
does. If you aren't using task resubmission or delayed retrieval (i.e.
requesting results of computations *not* submitted by your own Client
object), then I recommend NoDB, as that whole mechanism is useless to you.
The more conservative approach is to periodically instruct the DB to
forget past results with `Client.purge_results()`.
2. The Client object also caches its own results (in Client.results,
logically enough). This is a simple dict, so you can make the client
forget them with `client.results.clear()` and `view.results.clear()`.
Managing this cache automatically is an open Issue, yet to be addressed.
In short (a sketch of both follows):
* clear Hub storage with `Client.purge_results()`, or disable Hub storage
entirely with NoDB
* clear the Client-side cache with `Client.results.clear()` and
`View.results.clear()`
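Here is a minimal sketch of both approaches, assuming a running
ipcontroller and the default profile (the db_class line belongs in your
ipcontroller_config.py, not in client code):

    # in ipcontroller_config.py -- disable Hub storage entirely:
    # c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'

    from IPython import parallel

    rc = parallel.Client()
    view = rc.load_balanced_view()

    # ... submit work and fetch the results you actually need ...

    # ask the Hub to forget stored results ('all' is the bluntest
    # option; you can also pass specific msg_ids)
    rc.purge_results('all')

    # drop the client-side result caches
    rc.results.clear()
    view.results.clear()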