From: Patrick Eaton <patrick.eaton-Re5JQEeQqe8AvxtiuMwx3w <at> public.gmane.org>
Subject: Fuse, direct_io, mmap, and executables...
Newsgroups: gmane.comp.file-systems.fuse.devel
Date: Tuesday 23rd October 2007 15:34:58 UTC

Recently, there have been a number of questions raised about how
direct_io and mmap interact.  I have been following these threads (and
scouring the archives for related threads)  with interest since I have
been facing the same issues.

I understand that, as things are currently implemented, direct_io and
mmap are mutually exclusive.  Furthermore, the kernel relies on mmap
to run executables, so direct_io also precludes executables.
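
To check whether a given file can be mapped at all, a small probe like the one below may help.  It is only a sketch: probe_mmap is a made-up helper name, and it requests a private read-only mapping, which is roughly what the loader needs minus PROT_EXEC.  On a direct_io Fuse file I would expect it to report an error such as ENODEV, but that is my assumption and not something the sketch verifies.

```python
import errno
import mmap
import os


def probe_mmap(path):
    """Try to map `path` read-only.

    Returns None if the mapping succeeds, otherwise the symbolic errno
    name (for example "ENODEV") reported by the kernel.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            # Length 0 maps the whole file; an empty file raises
            # ValueError, which is deliberately not handled here.
            m = mmap.mmap(fd, 0, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)
        except OSError as exc:
            return errno.errorcode.get(exc.errno, str(exc.errno))
        m.close()
        return None
    finally:
        os.close(fd)
```

Running probe_mmap on the same file under a mount with and without direct_io should show whether the option is what blocks the mapping.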

It seems to me that Fuse file system developers use the
direct_io option for one (or more) of three different reasons.  For
two of those reasons, there seem to be reasonable answers to the mmap
problem.  Supporting mmap with the third use of direct_io is
still an open problem.  The reasons I have seen cited for use of
direct_io are given below.  Are there others?  Please correct any
misunderstandings, and please suggest workarounds for the third reason,
if you know of any.

Reasons to use direct_io, and interaction with mmap:
1.  The file system provides access to data that is truly volatile,
like readings from sensors, and should not be cached in the kernel.  I
think it is clear that such file systems need not be concerned about
supporting mmap and executables.
2.  The developer wants the file system to handle data in blocks
larger than 4KB.  Currently, to access blocks larger than 4KB through
Fuse, the file system must use direct_io.  However, according to a
message posted in the forum by Miklos on October 19, 2007, this
restriction will be lifted soon once some patches make it into the
kernel.  So it seems to me that this reason to use direct_io will soon
be invalidated and will no longer be an obstacle to supporting mmap
and executables.
3.  Only the user-level file system can ensure the consistency of the
data.  (This is the situation that I find myself in.)  Most commonly,
this is because the Fuse file system is a distributed file system.
This seems to be a legitimate use of direct_io, but it prevents users
from running executables from the Fuse file system.  This seems
unfortunate.  I have not heard any good solutions to this problem.
Has anybody come up with anything?  Are there any interfaces that
allow a user-level application to invalidate pages in the kernel
cache?  That would allow the user-level file system to use the kernel
cache and still provide the cache consistency that it requires.  Would
it be possible to require that every time the kernel mmaps a page, it
requests it from the file system (even if it is already in the cache
from a previous use)?  That would at least provide a sort of
close-to-open consistency, ensuring that each time an executable was
run, it would run the latest copy of the file.
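
One interface that comes close is posix_fadvise with POSIX_FADV_DONTNEED, which lets a process ask the kernel to drop cached pages for a file it has open.  The sketch below shows the call; drop_cached_pages is a made-up helper name.  It is only an advisory hint issued from the application side, not a Fuse interface, and the kernel may keep pages that are dirty or currently mapped.  Whether it is honored for files on a Fuse mount is an assumption I have not verified.

```python
import os


def drop_cached_pages(path):
    """Ask the kernel to discard cached pages for `path`.

    Returns True if the hint was accepted and False if the kernel
    rejected it.  Acceptance does not guarantee that every page was
    actually dropped.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # Offset 0 with length 0 covers the file from start to end.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

Because the hint has to be issued by some process with access to the file, this would at best be a workaround driven from outside the file system daemon.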

Any suggestions for how to support mmap in a system like that
described in #3 above would be appreciated.
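
To make the close-to-open idea in #3 concrete, here is a toy model of the behaviour I have in mind.  Nothing in it is Fuse or kernel API: CloseToOpenCache, the store dictionary, and the version numbers are all hypothetical stand-ins.  The cache is revalidated against the backing store only at open time, so a file that changes remotely is served stale until the next open.

```python
class CloseToOpenCache:
    """Toy page cache that revalidates a file on every open."""

    def __init__(self, store):
        # Hypothetical backing store: path -> (version, data).
        self.store = store
        # What the "kernel" currently has cached: path -> (version, data).
        self.cached = {}

    def open(self, path):
        """Refresh the cached copy if the store holds a different version."""
        version, data = self.store[path]
        entry = self.cached.get(path)
        if entry is None or entry[0] != version:
            self.cached[path] = (version, data)
        return path

    def read(self, path):
        """Serve data from the cache without consulting the store."""
        return self.cached[path][1]


store = {"/mnt/fuse/tool": (1, b"old binary")}
cache = CloseToOpenCache(store)
cache.open("/mnt/fuse/tool")
first = cache.read("/mnt/fuse/tool")

# Another node updates the file; the cached copy is stale until reopened.
store["/mnt/fuse/tool"] = (2, b"new binary")
stale = cache.read("/mnt/fuse/tool")
cache.open("/mnt/fuse/tool")
fresh = cache.read("/mnt/fuse/tool")
```

In this model each run of an executable corresponds to an open, so each run would see the latest version even though reads in between are served from the cache.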

