From: Serge Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA <at> public.gmane.org>
Subject: Re: cgroup: status-quo and userland efforts
Newsgroups: gmane.linux.kernel.cgroups
Date: Thursday 27th June 2013 13:22:06 UTC
Quoting Mike Galbraith ([email protected]):
> On Wed, 2013-06-26 at 14:20 -0700, Tejun Heo wrote: 
> > Hello, Tim.
> > 
> > On Mon, Jun 24, 2013 at 09:07:47PM -0700, Tim Hockin wrote:
> > > I really want to understand why this is SO IMPORTANT that you have to
> > > break userspace compatibility?  I mean, isn't Linux supposed to be
> > > the OS with the stable kernel interface?  I've seen Linus rant time and
> > > time again about this - why is it OK now?
> > 
> > What the hell are you talking about?  Nobody is breaking userland
> > interface.  A new version of interface is being phased in and the old
> > one will stay there for the foreseeable future.  It will be phased out
> > eventually but that's gonna take a long time and it will have to be
> > something hardly noticeable.  Of course new features will only be
> > available with the new interface and there will be efforts to nudge
> > people away from the old one but the existing interface will keep
> > working as it does.
> I can understand some alarm.  When I saw the below I started frothing at
> the face and howling at the moon, and I don't even use the things much.
> http://lists.freedesktop.org/archives/systemd-devel/2013-June/011521.html
> Hierarchy layout aside, that "private property" bit says that the folks
> who currently own and use the cgroups interface will lose direct access
> to it.  I can imagine folks who have become dependent upon on-the-fly
> management agents of their own design becoming a tad alarmed.

FWIW, the code is still too embarrassing to see daylight, but I'm playing
with a very low-level cgroup manager which itself supports nesting.
Access in this POC is low-level ("set freezer.state to THAWED for cgroup
/c1/c2", "Create /c3"), but the key feature is that it can run in two
modes: native mode, in which it uses cgroupfs, and child mode, in which it
talks to a parent manager to make the changes.
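
To make the two modes concrete, here is a minimal sketch and not the actual POC. The class names, method names, and the use of a temporary directory in place of a mounted cgroup hierarchy are all assumptions made for illustration; the parent manager is reached through a plain Python object rather than a real transport.

```python
import os
import tempfile


class NativeManager:
    """Native mode: apply requests directly to a cgroupfs-like directory tree."""

    def __init__(self, root):
        self.root = os.path.abspath(root)

    def _resolve(self, cgroup):
        # Reject paths that would escape the managed tree.
        path = os.path.normpath(os.path.join(self.root, cgroup.lstrip("/")))
        if path != self.root and not path.startswith(self.root + os.sep):
            raise ValueError("cgroup path escapes root: %r" % cgroup)
        return path

    def create(self, cgroup):
        os.makedirs(self._resolve(cgroup), exist_ok=True)

    def set_value(self, cgroup, key, value):
        # On a real cgroupfs the control file already exists; here open() creates it.
        with open(os.path.join(self._resolve(cgroup), key), "w") as f:
            f.write(value)


class ChildManager:
    """Child mode: same interface, but every request is handed to a parent manager."""

    def __init__(self, parent, prefix):
        self.parent = parent              # hypothetical: any object with the same methods
        self.prefix = prefix.rstrip("/")  # where the parent placed this child

    def _nested(self, cgroup):
        return self.prefix + "/" + cgroup.lstrip("/")

    def create(self, cgroup):
        self.parent.create(self._nested(cgroup))

    def set_value(self, cgroup, key, value):
        self.parent.set_value(self._nested(cgroup), key, value)


def demo():
    root = tempfile.mkdtemp()          # stand-in for a mounted cgroup hierarchy
    host = NativeManager(root)
    host.create("/c1")
    child = ChildManager(host, "/c1")  # child manager whose requests land under /c1
    child.create("/c2")
    child.set_value("/c2", "freezer.state", "THAWED")
    with open(os.path.join(root, "c1", "c2", "freezer.state")) as f:
        return f.read()
```

Because both classes expose the same two calls, a client does not need to know which mode its manager runs in, and child managers can be stacked on top of each other.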

So then the idea would be that userspace (like libvirt and lxc) would
talk over /dev/cgroup to its manager.  Userspace inside a container
(which can't actually mount cgroups itself) would talk to its own
manager, which talks over a passed-in socket to the host manager,
which in turn runs natively (it uses cgroupfs and nests "create /c1"
under the requestor's cgroup).
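
The nesting step on the host side might look roughly like the sketch below. The JSON wire format, the function names, and the example cgroup /lxc/web are assumptions, not anything the POC defines; a socketpair stands in for the passed-in socket, and the host records the path instead of touching cgroupfs.

```python
import json
import posixpath
import socket


def nest(requestor_cgroup, requested):
    """Map a path as the requestor sees it onto the host's hierarchy."""
    # Normalising against "/" collapses any ".." so the result stays
    # underneath the requestor's own cgroup.
    rel = posixpath.normpath("/" + requested.lstrip("/")).lstrip("/")
    return posixpath.join(requestor_cgroup, rel) if rel else requestor_cgroup


def serve_one(conn, requestor_cgroup, created):
    """Host manager: handle a single request arriving on the passed-in socket."""
    req = json.loads(conn.recv(4096).decode())
    if req.get("op") == "create":
        # A native-mode manager would mkdir under cgroupfs here.
        created.append(nest(requestor_cgroup, req["cgroup"]))
        reply = {"ok": True}
    else:
        reply = {"ok": False, "error": "unsupported op"}
    conn.sendall(json.dumps(reply).encode())


def demo():
    host_end, container_end = socket.socketpair()
    created = []
    try:
        # The container's manager asks for "/c1" in its own view of the tree.
        container_end.sendall(json.dumps({"op": "create", "cgroup": "/c1"}).encode())
        serve_one(host_end, "/lxc/web", created)
        reply = json.loads(container_end.recv(4096).decode())
    finally:
        host_end.close()
        container_end.close()
    return reply, created
```

The container only ever names paths relative to its own root; the host decides where they really live.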

At some point (probably soon) we might want to talk about a standard API
for these things.  However, I think it will have to come in the form of
a standard library, which knows to either send requests over dbus to
systemd, or over the /dev/cgroup socket to the manager.
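
As a rough illustration of what such a library's dispatch could look like, the sketch below only picks a transport and does not speak either protocol. The backend names and the order of preference are assumptions; the /run/systemd/system check is the same test sd_booted() uses to detect a systemd-booted system, and /dev/cgroup is the socket path mentioned above.

```python
import os

SYSTEMD_RUNTIME_DIR = "/run/systemd/system"  # present when systemd is the init system
MANAGER_SOCKET = "/dev/cgroup"               # manager socket path from this thread


def pick_backend(exists=os.path.exists):
    """Return which transport the library would use for cgroup requests.

    `exists` is injectable so the choice can be exercised without a real system.
    """
    if exists(SYSTEMD_RUNTIME_DIR):
        return "systemd-dbus"      # hypothetical name: send requests over dbus
    if exists(MANAGER_SOCKET):
        return "manager-socket"    # hypothetical name: talk to the cgroup manager
    raise RuntimeError("no cgroup manager available")
```

Callers would then use one API regardless of which backend was chosen.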
