linux.git/fs/aio.c, branch v2.6.35

get rid of the magic around f_count in aio

2010-05-28T02:03:07+00:00

__aio_put_req() plays sick games with file refcount.  What
it wants is fput() from atomic context; it's almost always
done with f_count > 1, so they only have to deal with delayed
work in rare cases when their reference happens to be the
last one.  Current code decrements f_count and if it hasn't
hit 0, everything is fine.  Otherwise it keeps a pointer
to struct file (with zero f_count!) around and has delayed
work do __fput() on it.

Better way to do it: use atomic_long_add_unless( , -1, 1)
instead of !atomic_long_dec_and_test().  IOW, decrement it
only if it's not the last reference, leave refcount alone
if it was.  And use normal fput() in delayed work.

I've made that atomic_long_add_unless call a new helper -
fput_atomic().  Drops a reference to file if it's safe to
do in atomic (i.e. if that's not the last one), tells if
it had been able to do that.  aio.c converted to it, __fput()
use is gone.  req->ki_file *always* contributes to refcount
now.  And __fput() became static.
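
For illustration, the "decrement unless it is the last reference" idea
can be sketched in userspace C11.  This is not the kernel code:
struct file_sketch, add_unless() and fput_atomic_sketch() are
hypothetical names, and add_unless() only imitates the semantics of
atomic_long_add_unless() with a compare-exchange loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only the refcount matters here. */
struct file_sketch {
	atomic_long f_count;
};

/*
 * Add 'a' to *v unless *v == u.  Returns true if the add happened.
 * Userspace imitation of the kernel's atomic_long_add_unless().
 */
static bool add_unless(atomic_long *v, long a, long u)
{
	long old = atomic_load(v);

	while (old != u) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

/*
 * Drop a reference only if it is not the last one.  A false return
 * means the count was left alone and the caller must hand the final
 * put to a context that is allowed to sleep.
 */
static bool fput_atomic_sketch(struct file_sketch *file)
{
	return add_unless(&file->f_count, -1, 1);
}
```

With a count of 2 the first call succeeds and leaves 1; a second call
returns false and the count stays at 1, so the object never sits
around with a zero refcount.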

Signed-off-by: Al Viro

aio: fix the compat vectored operations

2010-05-27T16:12:53+00:00

The aio compat code was not converting the struct iovecs from 32bit to
64bit pointers, causing either EINVAL to be returned from io_getevents, or
EFAULT as the result of the I/O.  This patch passes a compat flag to
io_submit to signal that pointer conversion is necessary for a given iocb
array.
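
A rough userspace sketch of the conversion in question, under the
assumption that a 32-bit task lays out each iovec as two 32-bit
fields.  struct compat_iovec_sketch and widen_iovecs() are
hypothetical names for illustration; the kernel uses its own compat
helpers, and this is not that code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Layout a 32-bit task uses for an iovec: both fields are 32 bits. */
struct compat_iovec_sketch {
	uint32_t iov_base;
	uint32_t iov_len;
};

/*
 * Widen 'n' 32-bit iovec entries into native ones.
 * Returns 0 on success, -1 if either array pointer is NULL.
 */
static int widen_iovecs(const struct compat_iovec_sketch *src,
			struct iovec *dst, size_t n)
{
	if (!src || !dst)
		return -1;
	for (size_t i = 0; i < n; i++) {
		/* Zero-extend the 32-bit address into a native pointer. */
		dst[i].iov_base = (void *)(uintptr_t)src[i].iov_base;
		dst[i].iov_len = src[i].iov_len;
	}
	return 0;
}
```

Without this step a 64-bit kernel would read each 8-byte compat entry
as half of a 16-byte native entry, which is one way to end up with the
bogus addresses and lengths described above.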

A variant of this was tested by Michael Tokarev.  I have also updated the
libaio test harness to exercise this code path with good success.
Further, I grabbed a copy of ltp and ran the
testcases/kernel/syscall/readv and writev tests there (compiled with -m32
on my 64bit system).  All seems happy, but extra eyes on this would be
welcome.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix CONFIG_COMPAT=n build]
Signed-off-by: Jeff Moyer 
Reported-by: Michael Tokarev 
Cc: Zach Brown 
Cc: [2.6.35.1]
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

aio: remove unused field

2009-12-16T15:20:13+00:00

Don't know the reason, but it appears the ki_wait field of the iocb never gets used.

Signed-off-by: Shaohua Li 
Cc: Jeff Moyer 
Cc: Benjamin LaHaise 
Cc: Zach Brown 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

block: move bdi/address_space unplug functions to backing-dev.h

2009-10-29T12:59:26+00:00

There's nothing block related about them, the backing device
is used by things like NFS etc as well. This gets rid of the
need to protect such calls by CONFIG_BLOCK.

Signed-off-by: Jens Axboe

aio: implement request batching

2009-10-28T08:29:25+00:00

Hi,

Some workloads issue batches of small I/O, and the performance is poor
due to the call to blk_run_address_space for every single iocb.  Nathan
Roberts pointed this out, and suggested that by deferring this call
until all I/Os in the iocb array are submitted to the block layer, we
can realize some impressive performance gains (up to 30% for sequential
4k reads in batches of 16).
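
The deferral can be pictured with a small userspace sketch.  All names
here (mapping_sketch, batch_sketch, batch_add, batch_run, submit_all)
are hypothetical, and a fixed array stands in for whatever bookkeeping
the real patch uses; the point is only that each distinct mapping is
kicked once per batch instead of once per iocb.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_MAX 16

/* Hypothetical stand-in for an address_space / backing device. */
struct mapping_sketch {
	int unplug_calls;	/* how many times the queue was kicked */
};

struct batch_sketch {
	struct mapping_sketch *seen[BATCH_MAX];
	size_t nr;
};

/* Remember 'm' for later; duplicates are stored only once. */
static void batch_add(struct batch_sketch *b, struct mapping_sketch *m)
{
	for (size_t i = 0; i < b->nr; i++)
		if (b->seen[i] == m)
			return;
	if (b->nr < BATCH_MAX)
		b->seen[b->nr++] = m;
	else
		m->unplug_calls++;	/* table full: kick immediately */
}

/* Kick every remembered mapping once, then empty the batch. */
static void batch_run(struct batch_sketch *b)
{
	for (size_t i = 0; i < b->nr; i++)
		b->seen[i]->unplug_calls++;
	b->nr = 0;
}

/* Submit n requests against 'targets', deferring the kick to the end. */
static void submit_all(struct mapping_sketch **targets, size_t n)
{
	struct batch_sketch b = { .nr = 0 };

	for (size_t i = 0; i < n; i++)
		batch_add(&b, targets[i]);
	batch_run(&b);
}
```

Submitting four requests that touch two mappings results in one kick
per mapping rather than four kicks in total.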

Signed-off-by: Jeff Moyer 
Signed-off-by: Jens Axboe

aio.c: move EXPORT* macros to line after function

2009-09-23T14:39:29+00:00

As mentioned in Documentation/CodingStyle, move EXPORT* macros
to the line immediately after the closing function brace line.

Also, move the __initcall() similarly.

Signed-off-by: H Hartley Sweeten 
Cc: Zach Brown 
Cc: Benjamin LaHaise 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

mm: move use_mm/unuse_mm from aio.c to mm/

2009-09-22T14:17:42+00:00

Anyone who wants to copy to/from user space from a kernel thread needs
use_mm (like what fs/aio has).  Move that into mm/, to make reusing and
exporting easier down the line, and make aio use it.  Next intended user,
besides aio, will be vhost-net.

Acked-by: Andrea Arcangeli 
Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

eventfd: revised interface and cleanups

2009-07-01T01:55:58+00:00

Change the eventfd interface to decouple the eventfd memory context from
the file pointer instance.

Without such a change, there is no clean, race-free way to handle the
POLLHUP event sent when the last instance of the file* goes away.  Also,
now the internal eventfd APIs are using the eventfd context instead of the
file*.

This patch is required by KVM's IRQfd code, which is still under
development.

Signed-off-by: Davide Libenzi 
Cc: Gregory Haskins 
Cc: Rusty Russell 
Cc: Benjamin LaHaise 
Cc: Avi Kivity 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

aio: lookup_ioctx can return the wrong value when looking up a bogus context

2009-03-19T22:57:18+00:00

The libaio test harness turned up a problem whereby lookup_ioctx on a
bogus io context was returning the 1 valid io context from the list
(harness/cases/3.p).

Because of that, an extra put_iocontext was done, and when the process
exited, it hit a BUG_ON in the put_iocontext macro called from exit_aio
(since we expect a users count of 1 and instead get 0).

The problem was introduced by "aio: make the lookup_ioctx() lockless"
(commit abf137dd7712132ee56d5b3143c2ff61a72a5faa).

Thanks to Zach for pointing out that hlist_for_each_entry_rcu will not
return with a NULL tpos at the end of the loop, even if the entry was
not found.
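
The pitfall can be reproduced with a toy iterator in userspace C.  The
macro below is a hypothetical simplification written in the spirit of
hlist_for_each_entry_rcu (no RCU, no hlist), and ctx_sketch,
lookup_buggy and lookup_fixed are illustrative names, not the code in
fs/aio.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kioctx on a singly linked list. */
struct ctx_sketch {
	unsigned long id;
	struct ctx_sketch *next;
};

/*
 * 'pos' walks the list and 'tpos' is assigned the current entry on
 * each pass.  When the loop ends without a break, 'tpos' still holds
 * the last entry visited; it is not reset to NULL.
 */
#define for_each_ctx(tpos, pos, head) \
	for ((pos) = (head); (pos) && ((tpos) = (pos), 1); (pos) = (pos)->next)

/* Buggy pattern: trusts the cursor after the loop. */
static struct ctx_sketch *lookup_buggy(struct ctx_sketch *head, unsigned long id)
{
	struct ctx_sketch *ctx = NULL, *pos;

	for_each_ctx(ctx, pos, head)
		if (ctx->id == id)
			break;
	return ctx;	/* non-NULL even when nothing matched */
}

/* Fixed pattern: only a real match is ever stored in 'ret'. */
static struct ctx_sketch *lookup_fixed(struct ctx_sketch *head, unsigned long id)
{
	struct ctx_sketch *ctx, *pos, *ret = NULL;

	for_each_ctx(ctx, pos, head)
		if (ctx->id == id) {
			ret = ctx;
			break;
		}
	return ret;
}
```

With a single valid context on the list, lookup_buggy() hands that
context back for any bogus id, while lookup_fixed() returns NULL.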

Signed-off-by: Jeff Moyer 
Acked-by: Zach Brown 
Acked-by: Jens Axboe 
Cc: Benjamin LaHaise 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds

eventfd: remove fput() call from possible IRQ context

2009-03-19T22:57:18+00:00

Remove a source of fput() calls from inside IRQ context.  Like Eric, I
wasn't able to reproduce an fput() call from IRQ context, but Jeff said he was
able to, with the attached test program.  Independently from this, the bug is
conceptually there, so we might be better off fixing it.  This patch adds an
optimization similar to the one we already do on ->ki_filp, on ->ki_eventfd.
Playing with ->f_count directly is not pretty in general, but the alternative
here would be to add a brand new delayed fput() infrastructure, that I'm not
sure is worth it.

Signed-off-by: Davide Libenzi 
Cc: Benjamin LaHaise 
Cc: Trond Myklebust 
Cc: Eric Dumazet 
Signed-off-by: Jeff Moyer 
Cc: Zach Brown 
Cc: 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds