Subject: Re: Questions about XFS
Newsgroups: gmane.comp.file-systems.xfs.general
Date: Tuesday 11th June 2013 17:28:34 UTC

On 6/11/13 11:12 AM, Steve Bergman wrote:

> In #5 I was specifically talking about ext4. After the 2009 brouhaha
> over zero-length files in ext4 with delayed allocation turned on, Ted
> merged some patches into vanilla kernel 2.6.30 which mitigated the
> problem by recognizing certain common idioms and automatically
> forcing an fsync. I'd heard that the XFS team modeled a set of XFS
> patches from them.

Assuming we're talking about the same behaviors, XFS resolved this issue
in May 2007, in the 2.6.22 kernel, commit ba87ea6, over a year before
ext4 even had delayed allocation working. ext4 added the flush-on-close
heuristic in 2009, commit 7d8f9f7.

> Regarding #4, I have 12 years experience with my workloads on ext3 and
> 3 yrs on ext4 and know what I have observed. As a practical matter,
> there are large differences between filesystem behaviors which aren't
> up for debate since I know my workloads' behavior in the real world
> far better than anyone else possibly could. (In fact, I'm not sure how
> anyone else could presume to know how my workloads and filesystems
> interact.) But if I understand correctly, ext4 at default settings
> journals metadata and commits it every 5s, while flushing data every
> 30s. Ext3 journals metadata, and commits it every 5 seconds, while
> effectively flushing data, *immediately before the metadata*, every 5
> seconds, so the window in which data and metadata are not in sync is
> vanishingly small. Are you saying that with XFS there is no periodic
> flushing mechanism at all? And that unless there's an
> fsync/fdatasync/sync or the memory needs to be reclaimed, that it can
> sit in the page cache forever?

No. By and large, buffered IO in a filesystem is flushed out by the vm,
due to either age or memory pressure.
The filesystem then responds to these requests by the VM, writing data
as requested. You can read all about it in Documentation/sysctl/vm.txt
but see dirty_expire_centisecs and dirty_writeback_centisecs - flushers
wake up every 5s and push on data more than 30s old, by default. ext3 is
somewhat unique in data=ordered metadata logging driving data flushing,
IMHO.

> One thing is puzzling me. Everyone is telling me that I must ensure
> that fsync/fdatasync is used, even in environments where the concept
> doesn't exist. So I've gone to find good examples of how it is used.
> Since RHEL6 has been shipping with ext4 as the default for over 2.5
> years, I figured it would be a great place to find examples. However,
> I've been unable to find examples of fsync or fdatasync being used,
> when using "strace -o file.out -f" on various system programs which
> one would very much expect to use it.

Whether or not an application *uses* fsync is orthogonal to whether or
not it's *needed* to ensure persistence. Obviously you don't need to
fsync every IO as soon as it's issued. And there are buggy applications,
yes. It's up to the app to decide what needs to be persistent and when.
See the "When Should You Fsync?" section in the URL below.

> We talked about some Python
> config utilities the other day. But now I've moved on to C and C++
> code. e.g. "cupsd" copy/truncate/writes the config file
> "/etc/cups/printers.conf" quite frequently, all day long. But there is
> no sign whatsoever of any fsync or fdatasync when I grep the strace
> output file for those strings case insensitively. (And indeed, a
> complex printers.conf file turned up zero-length on one of my RHEL6.4
> boxes last week.)

I'd file a bug against cups, then.

> So I figured that when rpm installs a new vmlinuz, builds a new
> initramfs and puts it into place, and modifies grub.conf, that surely
> proper sync'ing must be done in this particularly critical case.
> But while I do see rpm fsync'ing its own database files, it never
> seems to fsync/fdatasync the critical system files it just installed
> and/or modified. Surely, after over 2-1/2 years of Red Hat shipping
> RHEL6 to customers, I must be mistaken in some way. Could you point me
> to an example in RHEL6.4 where I can see clearly how fsync is being
> properly used? In the meantime, I'll keep looking.

Database packages would get it right, I hope.

See also http://lwn.net/Articles/457667/, "Ensuring data reaches disk"

-Eric

> Thanks,
> Steve
>
>
>
> On Tue, Jun 11, 2013 at 8:59 AM, Ric Wheeler