From: Mike Shal <marfey-Re5JQEeQqe8AvxtiuMwx3w <at> public.gmane.org>
Subject: Re: bypassing read/write for mirror fs
Newsgroups: gmane.comp.file-systems.fuse.devel
Date: Tuesday 9th April 2013 15:21:05 UTC
Hi Goswin, thanks for the feedback. My results are below:

On Tue, Apr 9, 2013 at 7:22 AM, Goswin von Brederlow
<[email protected]> wrote:

> On Tue, Apr 02, 2013 at 10:58:32AM -0400, Mike Shal wrote:
> > Hi Goswin,
> >
> > On Tue, Apr 2, 2013 at 9:41 AM, Goswin von Brederlow
> > <[email protected]> wrote:
> >
> > > On Sun, Mar 31, 2013 at 02:44:10PM -0400, Mike Shal wrote:
> > > > Hello again, hope you don't mind revisiting this topic, but I have
> > > > an example patch and some more benchmarks...
> > > >
> > > > Here are a few other examples:
> > > >
> > > > 1) Large ~3GB read (cat bigfile.txt > /dev/null)
> > > > native fs: 0.279s
> > > > fuse: 1.392s (~5x slower)
> > > > fuse passthrough: 0.279s (no difference!)
> > > >
> > > > 2) Large (100MB) write (dd bs=1M count=100 if=/dev/zero of=outfile)
> > > > native fs: 0.048s
> > > > fuse: 0.609s (~12x slower)
> > > > fuse passthrough: 0.048s (no difference!)
> > > >
> > > > Note that in all cases, the speed of the underlying disk is
> > > > irrelevant since everything is cached.
> > > >
> > > > I think this is significant enough to warrant adding the
> > > > functionality to FUSE.
> > > >
> > > >
> > > >
> > > > > And the performance of fuse can be improved further.  For example
> > > > > Pavel Emelyanov is working on a patchset that allows the kernel to
> > > > > cache writes, just like any other filesystem, bringing the cached
> > > > > write performance up to the baseline you measured.
> > > > >
> > > >
> > > > I'd be happy to perform other tests if you can provide some details
> > > > on how to run them (changes to fusexmp_fh). I don't see how caching
> > > > writes would help for cases like this though - read performance is
> > > > also a major concern.
> > >
> > > So how much faster does fuse get with big writes (and I mean 128k or
> > > more here) and with splice operations for the same tests?
> > >
> > >
> > Here are my results:
> >
> > A) ./fusexmp_fh -obig_writes
> > 1) link test: 45.149s (~2 second improvement, still 137% longer than
> > native)
> > 2) read test: no change
> > 3) write test: 0.173s (now 3.5x slower, rather than 12x slower)
> >
> > So it seems for the case I really care about (the end-to-end linking
> > time), writing is a small portion of the total time. However, it does
> > speed up the write-only test significantly using a 128k buffer instead
> > of the default 4k buffer. It is still 3.5x slower, whereas with the
> > passthrough implementation it achieves native speeds.
>
> Obviously -obig_writes only affects big writes and not links (or
> reads). No surprise there.
>
> But you can see that 128k buffers help a lot. Even bigger buffers help
> even more.
>

The 128k buffer only helps a little bit for the link time, which is the
case I care about the most. It helps more for the write-only case, but that
is a simple benchmark, not a real-world test case.


>
> > B) ./fusexmp_fh -osplice_write -osplice_read
> > 1) link test: 47.339s (no real change over the default fuse)
> > 2) read test: 0.656s (twice as fast as default fuse, but still twice as
> > slow as native)
> > 3) write test: 0.545s (slightly better than default fuse, but still 11x
> > slower than native)
>
> And ./fusexmp_fh -osplice_write -osplice_read -obig_writes?
>

With -osplice_write -osplice_read -obig_writes I get:

1) link test: 45.622s
2) read test: 0.700s
3) write test: 0.155s
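In case it helps anyone reproduce, the read/write micro-benchmarks boil down to something like the sketch below. TARGET and the file names are placeholders (not my actual paths), and the write here is smaller than my ~3GB read test - point TARGET at a fusexmp_fh mountpoint to measure FUSE, or at a local directory for the native baseline:

```shell
# Rough harness for the write/read micro-benchmarks; TARGET is a placeholder.
TARGET=${TARGET:-/tmp/fusetest}
mkdir -p "$TARGET"
# Write test: dd from /dev/zero into the target (same as the mail's test).
time dd bs=1M count=100 if=/dev/zero of="$TARGET/outfile" 2>/dev/null
# Read test: cat the file back out; cache-warm, so disk speed is irrelevant.
time sh -c "cat '$TARGET/outfile' > /dev/null"
rm -f "$TARGET/outfile"
```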

> Task switching to fuse and back will always be an overhead and
> passthrough will always be a bit faster. What surprises me is that it
> still is that much overhead.
>
> How large are the read requests? Maybe those can be tuned more? Bigger
> read-ahead or larger requests?
>

For the link test, read sizes (as measured by printing out the 'size'
argument in read_buf()) are anywhere from 4k to 128k. Maybe the variation
is because of how the linker is reading the data - it probably doesn't read
the whole file in at once, but seeks around and reads the parts it needs.
Just a guess, though.

For reference, there are 124072 calls to read_buf() and 225079 calls to
write_buf() in the link test (measured with fusexmp_fh -osplice_write
-osplice_read -obig_writes). Making these numbers smaller by using
different buffer sizes may help somewhat, as shown by the small improvement
using -obig_writes. However, with a passthrough implementation, these
numbers are 0.
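An alternative to patching read_buf(), if anyone wants to collect similar counts, is to run the example with -d and count request lines in the debug stream. The mount command below is shown as a comment, and the log line format is only a sketch - adjust the grep pattern to whatever your libfuse version actually prints:

```shell
# Count FUSE requests from the -d debug stream instead of patching the code.
# Mount step (commented out here), then run the workload:
#   ./fusexmp_fh -d /mnt/fuse 2> fuse.log
# Fake log stands in for real -d output; the line format is only a sketch.
printf 'unique: 1, opcode: READ\nunique: 2, opcode: WRITE\nunique: 3, opcode: READ\n' > fuse.log
grep -c 'opcode: READ' fuse.log    # number of read requests; prints 2 here
rm -f fuse.log
```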


>
> For writes wasn't there recently a patch to improve caching and page
> writeback for fuse? Combined with larger (even larger than 128k)
> writes fuse should get nearer to the passthrough performance.
>
>
This would not do anything for read performance though, correct? In my link
test, I can temporarily ignore the write side of the problem by specifying
/dev/null as the output library. In this case, there are only 52 calls to
write_buf() (there is a temporary file written listing the object files),
so we can see how much just using passthrough on read() requests will help.
Here are my numbers:

native: 16.807s
default fuse: 27.471s
splice_read/write and big_writes: 27.059s
fuse passthrough: 22.597s

Here is a summary of the benchmarks so far for the link test (my real-world
use case) from best to worst:

native: 18.986s
passthrough: 24.754s
-obig_writes: 45.149s
-osplice_write -osplice_read -obig_writes: 45.622s
fusexmp_fh defaults: 47.232s
-osplice_write -osplice_read: 47.339s
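To put those in relative terms, here's a quick sanity check of the slowdown versus native (the times are copied from the list above; awk is just doing the arithmetic):

```shell
# Slowdown relative to the native link time, from the summary above.
awk 'BEGIN {
  native = 18.986
  printf "passthrough:  %.0f%% slower\n", (24.754/native - 1)*100  # 30% slower
  printf "-obig_writes: %.0f%% slower\n", (45.149/native - 1)*100  # 138% slower
  printf "defaults:     %.0f%% slower\n", (47.232/native - 1)*100  # 149% slower
}'
```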

-Mike
 