linux-stable.git/net/sunrpc, branch v3.4.16

SUNRPC: Fix a UDP transport regression

2012-10-28T17:14:13+00:00

commit f39c1bfb5a03e2d255451bff05be0d7255298fa4 and
commit 84e28a307e376f271505af65a7b7e212dd6f61f4 upstream.

Commit 43cedbf0e8dfb9c5610eb7985d5f21263e313802 (SUNRPC: Ensure that
we grab the XPRT_LOCK before calling xprt_alloc_slot) is causing
hangs in the case of NFS over UDP mounts.

Since neither the UDP or the RDMA transport mechanism use dynamic slot
allocation, we can skip grabbing the socket lock for those transports.
Add a new rpc_xprt_op to allow switching between the TCP and UDP/RDMA
case.

Note that the NFSv4.1 back channel assigns the slot directly
through rpc_run_bc_task, so we can ignore that case.

Reported-by: Dick Streefland 
Signed-off-by: Bryan Schumaker 
Signed-off-by: Trond Myklebust 
Signed-off-by: Greg Kroah-Hartman

SUNRPC: Prevent kernel stack corruption on long values of flush

2012-10-28T17:14:13+00:00

commit 212ba90696ab4884e2025b0b13726d67aadc2cd4 upstream.

The buffer size in read_flush() is too small for the longest possible values
for it. This can lead to a kernel stack corruption:

[   43.047329] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff833e64b4
[   43.047329]
[   43.049030] Pid: 6015, comm: trinity-child18 Tainted: G        W    3.5.0-rc7-next-20120716-sasha #221
[   43.050038] Call Trace:
[   43.050435]  [] panic+0xcd/0x1f4
[   43.050931]  [] ? read_flush.isra.7+0xe4/0x100
[   43.051602]  [] __stack_chk_fail+0x16/0x20
[   43.052206]  [] read_flush.isra.7+0xe4/0x100
[   43.052951]  [] ? read_flush_pipefs+0x30/0x30
[   43.053594]  [] read_flush_procfs+0x2c/0x30
[   43.053596]  [] proc_reg_read+0x9c/0xd0
[   43.053596]  [] ? proc_reg_write+0xd0/0xd0
[   43.053596]  [] do_loop_readv_writev+0x4b/0x90
[   43.053596]  [] do_readv_writev+0xf6/0x1d0
[   43.053596]  [] vfs_readv+0x3e/0x60
[   43.053596]  [] sys_readv+0x48/0xb0
[   43.053596]  [] system_call_fastpath+0x1a/0x1f

Signed-off-by: Sasha Levin 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman

SUNRPC: Ensure that the TCP socket is closed when in CLOSE_WAIT

2012-10-21T16:27:58+00:00

commit a519fc7a70d1a918574bb826cc6905b87b482eb9 upstream.

Instead of doing a shutdown() call, we need to do an actual close().
Ditto if/when the server is sending us junk RPC headers.

Signed-off-by: Trond Myklebust 
Tested-by: Simon Kirby 
Signed-off-by: Greg Kroah-Hartman

svcrpc: sends on closed socket should stop immediately

2012-09-14T17:00:19+00:00

commit f06f00a24d76e168ecb38d352126fd203937b601 upstream.

svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply.
However, the XPT_CLOSE won't be acted on immediately.  Meanwhile other
threads could send further replies before the socket is really shut
down.  This can manifest as data corruption: for example, if a truncated
read reply is followed by another rpc reply, that second reply will look
to the client like further read data.

Symptoms were data corruption preceded by svc_tcp_sendto logging
something like

	kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket

Reported-by: Malahal Naineni 
Tested-by: Malahal Naineni 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman

svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping

2012-09-14T17:00:19+00:00

commit d10f27a750312ed5638c876e4bd6aa83664cccd8 upstream.

The rpc server tries to ensure that there will be room to send a reply
before it receives a request.

It does this by tracking, in xpt_reserved, an upper bound on the total
size of the replies that is has already committed to for the socket.

Currently it is adding in the estimate for a new reply *before* it
checks whether there is space available.  If it finds that there is not
space, it then subtracts the estimate back out.

This may lead the subsequent svc_xprt_enqueue to decide that there is
space after all.

The results is a svc_recv() that will repeatedly return -EAGAIN, causing
server threads to loop without doing any actual work.

Reported-by: Michael Tokarev 
Tested-by: Michael Tokarev 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman

svcrpc: fix BUG() in svc_tcp_clear_pages

2012-09-14T17:00:19+00:00

commit be1e44441a560c43c136a562d49a1c9623c91197 upstream.

Examination of svc_tcp_clear_pages shows that it assumes sk_tcplen is
consistent with sk_pages[] (in particular, sk_pages[n] can't be NULL if
sk_tcplen would lead us to expect n pages of data).

svc_tcp_restore_pages zeroes out sk_pages[] while leaving sk_tcplen.
This is OK, since both functions are serialized by XPT_BUSY.  However,
that means the inconsistency must be repaired before dropping XPT_BUSY.

Therefore we should be ensuring that svc_tcp_save_pages repairs the
problem before exiting svc_tcp_recv_record on error.

Symptoms were a BUG() in svc_tcp_clear_pages.

Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman

SUNRPC: return negative value in case rpcbind client creation error

2012-08-15T15:10:05+00:00

commit caea33da898e4e14f0ba58173e3b7689981d2c0b upstream.

Without this patch kernel will panic on LockD start, because lockd_up() checks
lockd_up_net() result for negative value.
From my pow it's better to return negative value from rpcbind routines instead
of replacing all such checks like in lockd_up().

Signed-off-by: Stanislav Kinsbursky 
Signed-off-by: Trond Myklebust 
Signed-off-by: Greg Kroah-Hartman

sunrpc: clnt: Add missing braces

2012-08-15T15:10:04+00:00

commit cac5d07e3ca696dcacfb123553cf6c722111cfd3 upstream.

Add a missing set of braces that commit 4e0038b6b24
("SUNRPC: Move clnt->cl_server into struct rpc_xprt")
forgot.

Signed-off-by: Joe Perches 
Signed-off-by: Trond Myklebust 
Signed-off-by: Greg Kroah-Hartman

nfs: skip commit in releasepage if we're freeing memory for fs-related reasons

2012-08-09T15:31:40+00:00

commit 5cf02d09b50b1ee1c2d536c9cf64af5a7d433f56 upstream.

We've had some reports of a deadlock where rpciod ends up with a stack
trace like this:

    PID: 2507   TASK: ffff88103691ab40  CPU: 14  COMMAND: "rpciod/14"
     #0 [ffff8810343bf2f0] schedule at ffffffff814dabd9
     #1 [ffff8810343bf3b8] nfs_wait_bit_killable at ffffffffa038fc04 [nfs]
     #2 [ffff8810343bf3c8] __wait_on_bit at ffffffff814dbc2f
     #3 [ffff8810343bf418] out_of_line_wait_on_bit at ffffffff814dbcd8
     #4 [ffff8810343bf488] nfs_commit_inode at ffffffffa039e0c1 [nfs]
     #5 [ffff8810343bf4f8] nfs_release_page at ffffffffa038bef6 [nfs]
     #6 [ffff8810343bf528] try_to_release_page at ffffffff8110c670
     #7 [ffff8810343bf538] shrink_page_list.clone.0 at ffffffff81126271
     #8 [ffff8810343bf668] shrink_inactive_list at ffffffff81126638
     #9 [ffff8810343bf818] shrink_zone at ffffffff8112788f
    #10 [ffff8810343bf8c8] do_try_to_free_pages at ffffffff81127b1e
    #11 [ffff8810343bf958] try_to_free_pages at ffffffff8112812f
    #12 [ffff8810343bfa08] __alloc_pages_nodemask at ffffffff8111fdad
    #13 [ffff8810343bfb28] kmem_getpages at ffffffff81159942
    #14 [ffff8810343bfb58] fallback_alloc at ffffffff8115a55a
    #15 [ffff8810343bfbd8] ____cache_alloc_node at ffffffff8115a2d9
    #16 [ffff8810343bfc38] kmem_cache_alloc at ffffffff8115b09b
    #17 [ffff8810343bfc78] sk_prot_alloc at ffffffff81411808
    #18 [ffff8810343bfcb8] sk_alloc at ffffffff8141197c
    #19 [ffff8810343bfce8] inet_create at ffffffff81483ba6
    #20 [ffff8810343bfd38] __sock_create at ffffffff8140b4a7
    #21 [ffff8810343bfd98] xs_create_sock at ffffffffa01f649b [sunrpc]
    #22 [ffff8810343bfdd8] xs_tcp_setup_socket at ffffffffa01f6965 [sunrpc]
    #23 [ffff8810343bfe38] worker_thread at ffffffff810887d0
    #24 [ffff8810343bfee8] kthread at ffffffff8108dd96
    #25 [ffff8810343bff48] kernel_thread at ffffffff8100c1ca

rpciod is trying to allocate memory for a new socket to talk to the
server. The VM ends up calling ->releasepage to get more memory, and it
tries to do a blocking commit. That commit can't succeed however without
a connected socket, so we deadlock.

Fix this by setting PF_FSTRANS on the workqueue task prior to doing the
socket allocation, and having nfs_release_page check for that flag when
deciding whether to do a commit call. Also, set PF_FSTRANS
unconditionally in rpc_async_schedule since that function can also do
allocations sometimes.

Signed-off-by: Jeff Layton 
Signed-off-by: Trond Myklebust 
Signed-off-by: Greg Kroah-Hartman

SUNRPC: move per-net operations from svc_destroy()

2012-07-16T16:04:39+00:00

upstream commit 786185b5f8abefa6a8a16695bb4a59c164d5a071.

The idea is to separate service destruction and per-net operations,
because these are two different things and the mix looks ugly.

Notes:

1) For NFS server this patch looks ugly (sorry for that). But these
place will be rewritten soon during NFSd containerization.

2) LockD per-net counter increase int lockd_up() was moved prior to
make_socks() to make lockd_down_net() call safe in case of error.

Signed-off-by: Stanislav Kinsbursky 
Signed-off-by: J. Bruce Fields 
Signed-off-by: Greg Kroah-Hartman