From: Zach Brown <zach.brown <at> oracle.com>
Subject: Re: Syslets, Threadlets, generic AIO support, v6
Newsgroups: gmane.linux.kernel
Date: Tuesday 29th May 2007 22:49:16 UTC
> .. so don't keep us in suspense. Do you have any numbers for anything 
> (like Oracle, to pick a random thing out of thin air ;) that might 
> actually indicate whether this actually works or not?

I haven't gotten to running Oracle's database against it.  It is going
to be Very Cranky if O_DIRECT writes aren't concurrent, and that's going
to take a bit of work in fs/direct-io.c.

I've done initial micro-benchmarking runs for basic sanity testing with
fio.  They haven't wildly regressed; that's about as much as can be said
with confidence so far :).

Take a streaming O_DIRECT read.  1meg requests, 64 in flight.

str: (g=0): rw=read, bs=1M-1M/1M-1M, ioengine=libaio, iodepth=64
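(For reference, a job file along these lines should reproduce that setup; the job name and device path here are placeholders, not the ones actually used:)

```ini
[str]
rw=read
bs=1m
ioengine=libaio
iodepth=64
direct=1
# placeholder device, substitute the FC array's block device
filename=/dev/sdX
```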

mainline:

	  read : io=3,405MiB, bw=97,996KiB/s, iops=93, runt= 36434msec

aio+syslets:

	  read : io=3,452MiB, bw=99,115KiB/s, iops=94, runt= 36520msec

That's on an old gigabit copper FC array with 10 drives behind a, no
seriously, qla2100.

The real test is the change in memory and cpu consumption, and I haven't
modified fio to take reasonably precise measurements of those yet.  Once
I get O_DIRECT writes concurrent that'll be the next step. 

I was pleased to see my motivation for the patches (avoiding having to
add operation-specific support to fs/aio.c) work out.

Take the case of 4k random buffered reads from a block device with a
cold cache:

read: (g=0): rw=randread, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=64

mainline:

  read : io=16,116KiB, bw=457KiB/s, iops=111, runt= 36047msec
    slat (msec): min=    4, max=  629, avg=563.17, stdev=71.92
    clat (msec): min=    0, max=    0, avg= 0.00, stdev= 0.00

aio+syslets:

  read : io=125MiB, bw=3,634KiB/s, iops=887, runt= 36147msec
    slat (msec): min=    0, max=    3, avg= 0.00, stdev= 0.08
    clat (msec): min=    2, max=  643, avg=71.59, stdev=74.25

aio+syslets w/o cfq:

  read : io=208MiB, bw=6,057KiB/s, iops=1,478, runt= 36071msec
    slat (msec): min=    0, max=   15, avg= 0.00, stdev= 0.09
    clat (msec): min=    2, max=  758, avg=42.75, stdev=37.33

Everyone step back and thank Jens for writing a tool that gives us
interesting data without us always having to craft some stupid specific
test each and every time.  Thanks, Jens!

In the mainline numbers fio clearly shows the buffered read submissions
being handled synchronously.  The mainline buffered IO paths don't
know to identify and work with iocbs, so requests are handled in series.

In the +syslet numbers we see __async_schedule() catching
the blocking buffered read, letting the submission proceed
asynchronously.  We get async behaviour without having to touch any of
the buffered IO paths.

Then we turn off cfq and we actually start to saturate the (relatively
ancient) drives :).

I need to mail Jens about that cfq behaviour, but I'm guessing it's
expected behaviour of a sort -- each syslet thread gets its own
io_context instead of inheriting it from its parent.

- z
 