On Thu, 26 Mar 2009 18:03:15 -0700 (PDT) Linus Torvalds wrote:
> On Thu, 26 Mar 2009, Andrew Morton wrote:
> > userspace can get closer than the kernel can.
> Andrew, that's SIMPLY NOT TRUE.
> You state that without any amount of data to back it up, as if it was
> some kind of truism. It's not.
I've seen you repeatedly fiddle the in-kernel defaults based on
in-field experience. That could just as easily have been done in
initscripts by distros, and much more effectively because it doesn't
need a new kernel. That's data.
The fact that this hasn't even been _attempted_ (afaik) is deplorable.
Why does everyone just sit around waiting for the kernel to put a new
value into two magic numbers which userspace scripts could have set?
My /etc/rc.local has been tweaking dirty_ratio, dirty_background_ratio
and swappiness for many years. I guess I'm just incredibly advanced.
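For illustration, a minimal sketch of that kind of boot-time tweak, as a script an initscript or /etc/rc.local could call. The function name apply_vm_tunables and the numeric values are hypothetical placeholders, not recommendations and not what my rc.local actually contains; the only thing assumed about the system is that the three knobs live under /proc/sys/vm.

```python
#!/usr/bin/env python3
"""Sketch: set VM writeback tunables at boot, e.g. when called from /etc/rc.local."""
import os

# Placeholder values for illustration only; tune them from in-field experience.
VM_TUNABLES = {
    "dirty_ratio": 10,
    "dirty_background_ratio": 5,
    "swappiness": 60,
}


def apply_vm_tunables(tunables, sysctl_root="/proc/sys/vm"):
    """Write each value to <sysctl_root>/<name>; return the names actually written.

    Knobs that do not exist under sysctl_root are skipped rather than created.
    Writing to the real /proc/sys/vm requires root.
    """
    written = []
    for name, value in tunables.items():
        path = os.path.join(sysctl_root, name)
        if not os.path.exists(path):
            continue  # this kernel does not expose the knob
        with open(path, "w") as f:
            f.write(f"{value}\n")
        written.append(name)
    return written
```

Calling apply_vm_tunables(VM_TUNABLES) as root at boot would apply the settings. The sysctl_root parameter is there only so the function can be pointed at a scratch directory for a dry run.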
> Everybody accepts that if you've written a 20MB file and then call
> "fsync()" on it, it's going to take a while. But when you've written a
> file, and "fsync()" takes 20 seconds, because somebody else is just
> writing normally, _that_ is a bug. And it is actually almost totally
> unrelated to the whole 'dirty_limit' thing.
> At least it _should_ be.
That's different. It's inherent JBD/ext3-ordered brain damage.
Unfixable without turning the fs into something which just isn't jbd/ext3
any more. data=writeback is a workaround, with the obvious integrity issues.
The JBD journal is a massive designed-in contention point. It's why
for several years I've been telling anyone who will listen that we need
a new fs. Hopefully our response to all these problems will soon be
"did you try btrfs?".