linux-stable.git/fs, branch linux-2.6.16.y

fix SMP ordering hole in fcntl_setlk() (CVE-2008-1669)

2008-07-14T18:09:23+00:00

fcntl_setlk()/close() race prevention has a subtle hole - we need to
make sure that if we *do* have an fcntl/close race on SMP box, the
access to descriptor table and inode->i_flock won't get reordered.

As it is, we get STORE inode->i_flock, LOAD descriptor table entry vs.
STORE descriptor table entry, LOAD inode->i_flock with not a single
lock in common on both sides.  We do have BKL around the first STORE,
but check in locks_remove_posix() is outside of BKL and for a good
reason - we don't want BKL on common path of close(2).

Solution is to hold ->file_lock around fcheck() in there; that orders
us wrt removal from descriptor table that preceded locks_remove_posix()
on close path and we either come first (in which case eviction will be
handled by the close side) or we'll see the effect of close and do
eviction ourselves.  Note that even though it's read-only access,
we do need ->file_lock here - rcu_read_lock() won't be enough to
order the things.
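
The ordering argument above can be modelled in userspace. This is a minimal sketch, not the kernel code: model_file, model_files, model_close() and model_setlk() are hypothetical names, a pthread mutex stands in for ->file_lock, and a single atomic flag stands in for the lock list on inode->i_flock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_file {
    atomic_bool has_posix_lock;   /* stands in for an entry on inode->i_flock */
};

struct model_files {
    pthread_mutex_t file_lock;    /* stands in for ->file_lock */
    struct model_file *fd_slot;   /* a one-entry "descriptor table" */
};

/* close() side: clear the slot under file_lock, then drop any lock. */
static void model_close(struct model_files *files)
{
    struct model_file *filp;

    pthread_mutex_lock(&files->file_lock);
    filp = files->fd_slot;
    files->fd_slot = NULL;
    pthread_mutex_unlock(&files->file_lock);

    if (filp)
        atomic_store(&filp->has_posix_lock, false);  /* eviction by close */
}

/*
 * fcntl(F_SETLK) side: returns true if the lock stays in place, false if
 * the descriptor was already closed and we evicted the lock ourselves.
 */
static bool model_setlk(struct model_files *files, struct model_file *filp)
{
    struct model_file *seen;

    atomic_store(&filp->has_posix_lock, true);       /* STORE lock */

    /* Taking file_lock orders this LOAD against the slot removal. */
    pthread_mutex_lock(&files->file_lock);
    seen = files->fd_slot;                           /* LOAD slot */
    pthread_mutex_unlock(&files->file_lock);

    if (seen != filp) {
        atomic_store(&filp->has_posix_lock, false);  /* we saw the close */
        return false;
    }
    return true;
}
```

In this model either model_setlk() takes the mutex first, in which case model_close() clears the flag afterwards, or it sees the cleared slot and clears the flag itself. In both orders no lock outlives the descriptor.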

Signed-off-by: Al Viro 
Signed-off-by: Adrian Bunk

asn1: additional sanity checking during BER decoding (CVE-2008-1673)

2008-07-14T18:09:23+00:00

- Don't trust a length which is greater than the working buffer.
  An invalid length could cause overflow when calculating buffer size
  for decoding oid.

- An oid length of zero is invalid and allows for an off-by-one error when
  decoding oid because the first subid actually encodes first 2 subids.

- A primitive encoding may not have an indefinite length.
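
The three checks listed above can be sketched as one validation function. This is an illustration under assumptions, not the kernel's decoder: struct ber_header and ber_header_ok() are hypothetical names, and the header fields are assumed to have been parsed already.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of an already-parsed BER header. */
struct ber_header {
    bool constructed;   /* false means primitive encoding */
    bool indefinite;    /* true if the length was given as indefinite */
    size_t length;      /* definite length; ignored when indefinite */
};

/*
 * Returns true if the header passes the sanity checks.
 * `remaining` is the number of bytes left in the working buffer;
 * `is_oid` says whether the element is an object identifier.
 */
static bool ber_header_ok(const struct ber_header *h, size_t remaining,
                          bool is_oid)
{
    /* A primitive encoding may not have an indefinite length. */
    if (!h->constructed && h->indefinite)
        return false;

    /* Don't trust a length greater than the working buffer. */
    if (!h->indefinite && h->length > remaining)
        return false;

    /* An oid length of zero is invalid. */
    if (is_oid && !h->indefinite && h->length == 0)
        return false;

    return true;
}
```

Rejecting the header before any buffer size is computed from it keeps an attacker-supplied length out of the size arithmetic.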

Thanks to Wei Wang from McAfee for report.

Acked-by: Patrick McHardy 
Signed-off-by: Chris Wright 
Signed-off-by: Adrian Bunk

NFS: call nfs_wb_all() only on regular files

2008-01-21T19:04:16+00:00

It looks like nfs_setattr() and nfs_rename() also need to test whether the
target is a regular file before calling nfs_wb_all()...

It isn't technically needed since the version of nfs_wb_all() that exists
on 2.6.16 should be safe to call on non-regular files (it will be a no-op).
However it is a useful optimisation.

Signed-off-by: Trond Myklebust 
Signed-off-by: Adrian Bunk

NFS: writes should not clobber utimes() calls

2008-01-21T19:02:11+00:00

Ensure that we flush out writes in the case when someone calls utimes() in
order to set the file times.

Signed-off-by: Trond Myklebust 
Signed-off-by: Adrian Bunk

vfs: coredumping fix (CVE-2007-6206)

2008-01-21T00:20:19+00:00

fix: http://bugzilla.kernel.org/show_bug.cgi?id=3043

only allow coredumping to the same uid that the coredumping
task runs under.
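
As a rough userspace illustration of that rule (core_owner_ok() is a hypothetical helper, and the uid passed in is assumed to be the dumping task's filesystem uid), the check amounts to comparing the owner of the target file with the uid of the task:

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: allow the dump only if the file that would
 * receive it is owned by the uid the dumping task runs under.
 */
static bool core_owner_ok(const struct stat *st, uid_t task_uid)
{
    return st->st_uid == task_uid;
}
```

A caller would fill the struct stat with fstat(2) on the already-opened core file, so that the check applies to the file actually being written rather than to a path that could be swapped.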

Signed-off-by: Ingo Molnar 
Signed-off-by: Adrian Bunk

limit minixfs printks on corrupted dir i_size (CVE-2006-6058)

2008-01-16T21:36:44+00:00

First reported at http://projects.info-pull.com/mokb/MOKB-17-11-2006.html

Essentially a corrupted minix dir inode reporting a very large
i_size will loop for a very long time in minix_readdir, minix_find_entry,
etc, because on EIO they just move on to try the next page.  This is
under the BKL, printk-storming as well.  This can lock up the machine
for a very long time.  Simply ratelimiting the printks gets things back
under control.  Make the message a bit more informative while we're here.
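
The idea of ratelimiting can be sketched with a simple fixed-window limiter. This is a userspace model under assumptions, not the kernel's ratelimit code: struct ratelimit and ratelimit_ok() are hypothetical names, and time is passed in as a caller-defined tick count.

```c
#include <stdbool.h>

/* Hypothetical limiter: at most `burst` messages per `interval` ticks. */
struct ratelimit {
    long interval;      /* window length in caller-defined ticks */
    int burst;          /* messages allowed per window */
    long window_start;  /* tick at which the current window began */
    int printed;        /* messages emitted in the current window */
};

/* Returns true if a message may be printed at tick `now`. */
static bool ratelimit_ok(struct ratelimit *rl, long now)
{
    /* Start a fresh window once the old one has expired. */
    if (now - rl->window_start >= rl->interval) {
        rl->window_start = now;
        rl->printed = 0;
    }
    if (rl->printed >= rl->burst)
        return false;   /* over budget: suppress this message */
    rl->printed++;
    return true;
}
```

An error path would call ratelimit_ok() before each message, so a directory with a huge corrupted i_size produces a bounded number of lines per window instead of one per failed page.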

Adrian Bunk:
Backported to 2.6.16.

Signed-off-by: Eric Sandeen 
Signed-off-by: Adrian Bunk

fix messages in fs/minix

2008-01-16T21:25:08+00:00

Believe it or not, but in fs/minix/*, the oldest filesystem in the kernel,
something still can be fixed:

    printk("new_inode: bit already set");

"\n" is missing!

While at it, I also removed periods from the end of error messages and made
capitalization uniform.  Also s/i-node/inode/, s/printk (/printk(/

Signed-off-by: Denis Vlasenko 
Signed-off-by: Adrian Bunk

Use access mode instead of open flags to determine needed permissions (CVE-2008-0001)

2008-01-15T23:48:15+00:00

patch 974a9f0b47da74e28f68b9c8645c3786aa5ace1a in mainline

Way back when (in commit 834f2a4a1554dc5b2598038b3fe8703defcbe467, aka
"VFS: Allow the filesystem to return a full file pointer on open intent"
to be exact), Trond changed the open logic to keep track of the original
flags to a file open, in order to pass down the intent of a dentry
lookup to the low-level filesystem.

However, that reorganization changed the meaning of namei_flags, and
thus inadvertently changed the test of access mode for directories (and
read-only filesystems) to use the wrong flag.  So fix those tests back
to use access mode ("acc_mode") rather than the open flag ("flag").
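
The difference between the two variables can be illustrated with a simplified model. Everything here is an assumption made for illustration: MODEL_MAY_READ, MODEL_MAY_WRITE, model_acc_mode() and model_may_open() are hypothetical, and the model assumes that O_TRUNC counts as needing write access.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical permission bits for this model only. */
#define MODEL_MAY_WRITE 2
#define MODEL_MAY_READ  4

/* Derive the needed permissions from open(2) flags. */
static int model_acc_mode(int open_flags)
{
    int acc = 0;

    switch (open_flags & O_ACCMODE) {
    case O_RDONLY: acc = MODEL_MAY_READ; break;
    case O_WRONLY: acc = MODEL_MAY_WRITE; break;
    case O_RDWR:   acc = MODEL_MAY_READ | MODEL_MAY_WRITE; break;
    }
    /* Assumption of this model: truncation modifies the file. */
    if (open_flags & O_TRUNC)
        acc |= MODEL_MAY_WRITE;
    return acc;
}

/* Test the derived access mode, not the raw open flags. */
static int model_may_open(bool is_dir, bool read_only_fs, int open_flags)
{
    int acc_mode = model_acc_mode(open_flags);

    if (is_dir && (acc_mode & MODEL_MAY_WRITE))
        return -EISDIR;
    if (read_only_fs && (acc_mode & MODEL_MAY_WRITE))
        return -EROFS;
    return 0;
}
```

In this model, O_RDONLY | O_TRUNC on a directory is refused because the derived access mode includes write, which a test on the access-mode bits of the raw flag word alone would miss.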

Issue noticed by Bill Roman at Datalight.

Reported-and-tested-by: Bill Roman 
Acked-by: Trond Myklebust 
Acked-by: Al Viro 
Signed-off-by: Linus Torvalds 
Signed-off-by: Adrian Bunk

knfsd: allow nfsd READDIR to return 64bit cookies

2007-11-02T23:56:46+00:00

->readdir passes loff_t offsets (used as nfs cookies) to
nfs3svc_encode_entry{,_plus}, but when they pass it on to encode_entry it
becomes an 'off_t', which isn't good.

So filesystems that returned 64bit offsets would lose.
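
The loss can be shown with a small userspace sketch. The names are hypothetical: narrow_off_t stands in for a 32-bit off_t, and encode_hyper() is an illustrative big-endian 64-bit encoder, not the kernel's XDR helper. Unsigned types are used so that the truncation is well defined in C.

```c
#include <stdint.h>

/* Hypothetical 32-bit offset type, standing in for a 32-bit off_t. */
typedef uint32_t narrow_off_t;

/* Write a 64-bit value as eight big-endian bytes. */
static void encode_hyper(uint8_t out[8], uint64_t value)
{
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(value >> (56 - 8 * i));
}

/* Buggy path: the cookie passes through a 32-bit type first. */
static void encode_cookie_narrow(uint8_t out[8], uint64_t cookie)
{
    narrow_off_t offset = (narrow_off_t)cookie;  /* upper 32 bits dropped */
    encode_hyper(out, offset);
}

/* Fixed path: the cookie stays 64 bits wide all the way down. */
static void encode_cookie_wide(uint8_t out[8], uint64_t cookie)
{
    encode_hyper(out, cookie);
}
```

With a cookie of 0x0000000100000002, the narrow path encodes the same bytes as cookie 2, so a client resuming a directory read with that cookie would restart at the wrong position.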

Signed-off-by: Neil Brown 
Signed-off-by: Adrian Bunk

buffer: memorder fix

2007-11-02T23:56:45+00:00

unlock_buffer(), like unlock_page(), must not clear the lock without
ensuring that the critical section is closed.

Mingming later sent the same patch, saying:

We are running the SDET benchmark and saw a double free issue for the ext3
extended attributes block, which complains that the same xattr block is
already being freed (in ext3_xattr_release_block()).  The problem could also
be triggered by multiple threads looping untar/rm of a kernel tree.

The race is caused by a missing memory barrier in unlock_buffer() before the
lock bit is cleared, resulting in a possible concurrent h_refcount update.
That causes a reference counter leak, which later leads to the double free
that we have seen.

Inside unlock_buffer(), a memory barrier is placed *after* the lock bit is
cleared; however, there is no memory barrier *before* the bit is cleared.
On some architectures the h_refcount update instruction and the clear bit
instruction could be reordered, allowing the critical section to be
re-entered.

The race is like this: For example, if the h_refcount is initialized as 1,

cpu 0:                                   cpu 1:
--------------------------------------   -----------------------------------
lock_buffer() /* test_and_set_bit */
clear_buffer_locked(bh);
                                         lock_buffer() /* test_and_set_bit */
h_refcount = h_refcount + 1; /* = 2 */   h_refcount = h_refcount + 1; /* = 2 */
                                         clear_buffer_locked(bh);
....                                     ......

We lose an h_refcount update here.  We need a memory barrier before the
buffer head lock bit is cleared to force the order of the two writes.
Please apply.
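
The required ordering can be expressed in portable C11 as a release operation on the lock bit. This is a userspace analogue, not the kernel's barrier primitives: struct model_bh, model_trylock_buffer() and model_unlock_buffer() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_BH_LOCK 1u

/* Hypothetical stand-in for a buffer head guarded by a lock bit. */
struct model_bh {
    atomic_uint state;   /* bit 0 is the lock bit */
    int h_refcount;      /* data protected by the lock bit */
};

/* Try to take the lock bit; returns true if we now own it. */
static bool model_trylock_buffer(struct model_bh *bh)
{
    unsigned int old = atomic_fetch_or_explicit(&bh->state, MODEL_BH_LOCK,
                                                memory_order_acquire);
    return !(old & MODEL_BH_LOCK);
}

/*
 * Clear the lock bit with release ordering, so that every write made
 * inside the critical section (such as the h_refcount update) is
 * ordered before the bit is seen as clear by the next owner.
 */
static void model_unlock_buffer(struct model_bh *bh)
{
    atomic_fetch_and_explicit(&bh->state, ~MODEL_BH_LOCK,
                              memory_order_release);
}
```

Pairing the release in the unlock with the acquire in the lock is what prevents the next owner from observing a stale h_refcount.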

Signed-off-by: Nick Piggin 
Signed-off-by: Adrian Bunk