From: MinRK <benjaminrk <at> gmail.com>
Subject: Re: memory problem with passing numpy arrays in IPython parallel
Newsgroups: gmane.comp.python.ipython.user
Date: Saturday 26th May 2012 22:28:59 UTC
On Sat, May 26, 2012 at 2:52 PM, Fernando Perez wrote:

> On Sat, May 26, 2012 at 2:01 PM, Charles Cadieu wrote:
> > Thanks for pointing this out. Keeping copies of every piece of data that
> > is passed through certainly seems like it could be the cause of the
> > problem.
> >
> > I just upgraded to source and tried this db:
> > c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'
> >
> > but still experience the same behavior, although it appears to run
> > smoothly for more iterations when I use the NoDB class.
> >
> > I can give the mongodb option a try next week.
>
> Is it possible for you to do some memory profiling of your machine
> while running, to see *which* process is the one growing in size?
> That would help a lot in finding a good strategy to improve here.
>

There are two objects with growing caches in IPython.parallel when you are
sending and receiving lots of data:

1. The Hub remembers everything you send and receive, for resubmission and
delayed retrieval.  NoDB disables this entirely, and MongoDB/SQLiteDB write
the data to disk instead of holding it in memory like the default DictDB.
If you aren't using task resubmission or delayed retrieval (i.e. requesting
results of computations *not* submitted by your Client object), then I
recommend using NoDB, as that whole mechanism is useless to you.  The more
conservative approach is to periodically instruct the DB to forget past
results with the Client.purge_results() method.

2. The Client object also caches its own results (in Client.results,
logically enough).  This is a simple dict, so you can cause the client to
forget its cache with a simple `client.results.clear()` and
`view.results.clear()`.  This one is an open Issue to be addressed:
https://github.com/ipython/ipython/issues/1131.
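
The two cleanup steps above can be wrapped in one small helper.  This is only
a sketch: the function name `clear_parallel_caches` and its arguments are
made up for illustration, and it assumes `purge_results('all')` is accepted
by your IPython version as the way to ask the Hub to forget everything.  It
is best called once you have already collected the results you need.

```python
def clear_parallel_caches(client, views=(), purge_hub=True):
    """Drop cached results so memory does not keep growing.

    client    -- an IPython.parallel Client (or any object with the same
                 `results` dict and `purge_results` method)
    views     -- iterable of View objects whose caches should be cleared too
    purge_hub -- also ask the Hub to forget its stored results
    """
    if purge_hub:
        # Hub-side storage (item 1): assumed form of the purge call.
        client.purge_results('all')
    # Client-side cache (item 2): a plain dict of results.
    client.results.clear()
    for view in views:
        view.results.clear()
```

In a long-running loop you might call `clear_parallel_caches(rc, [view])`
every few iterations, where `rc` and `view` are your own Client and View.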

Summary:

* clear Hub storage with `client.purge_results()` or disable Hub storage by
starting the controller with --nodb
* clear Client-side cache with `client.results.clear()` and
`view.results.clear()`
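
If you would rather not pass --nodb on every start, the same setting can live
in the controller's config file.  A minimal sketch, assuming your profile
already has an ipcontroller_config.py (the file location depends on your
setup; the db_class value is the one quoted earlier in this thread):

```python
# ipcontroller_config.py, inside the IPython profile directory
c = get_config()  # provided by IPython when it loads this config file

# Dummy backend: the Hub stores no requests or results at all, so task
# resubmission and delayed retrieval are unavailable with this setting.
c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'
```

This only affects the Hub; the Client-side cache still has to be cleared
from your own code.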

-MinRK




>
> Cheers,
>
> f