<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/nfs/write.c, branch v4.8</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs</title>
<updated>2016-07-30T23:33:25+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-07-30T23:33:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7f155c702677d057d03b192ce652311de5434697'/>
<id>7f155c702677d057d03b192ce652311de5434697</id>
<content type='text'>
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable bugfixes:
   - nfs: don't create zero-length requests

   - several LAYOUTGET bugfixes

  Features:
   - several performance related features

   - more aggressive caching when we can rely on close-to-open
     cache consistency

   - remove serialisation of O_DIRECT reads and writes

   - optimise several code paths to not flush to disk unnecessarily.

     However allow for the idiosyncracies of pNFS for those layout
     types that need to issue a LAYOUTCOMMIT before the metadata can
     be updated on the server.

   - SUNRPC updates to the client data receive path

   - pNFS/SCSI support RH/Fedora dm-mpath device nodes

   - pNFS files/flexfiles can now use unprivileged ports when
     the generic NFS mount options allow it.

  Bugfixes:
   - Don't use RDMA direct data placement together with data
     integrity or privacy security flavours

   - Remove the RDMA ALLPHYSICAL memory registration mode as
     it has potential security holes.

   - Several layout recall fixes to improve NFSv4.1 protocol
     compliance.

   - Fix an Oops in the pNFS files and flexfiles connection
     setup to the DS

   - Allow retry of operations that used a returned delegation
      stateid

   - Don't mark the inode as revalidated if a LAYOUTCOMMIT is
     outstanding

   - Fix writeback races in nfs4_copy_range() and
     nfs42_proc_deallocate()"

* tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (104 commits)
  pNFS: Actively set attributes as invalid if LAYOUTCOMMIT is outstanding
  NFSv4: Clean up lookup of SECINFO_NO_NAME
  NFSv4.2: Fix warning "variable ‘stateids’ set but not used"
  NFSv4: Fix warning "no previous prototype for ‘nfs4_listxattr’"
  SUNRPC: Fix a compiler warning in fs/nfs/clnt.c
  pNFS: Remove redundant smp_mb() from pnfs_init_lseg()
  pNFS: Cleanup - do layout segment initialisation in one place
  pNFS: Remove redundant stateid invalidation
  pNFS: Remove redundant pnfs_mark_layout_returned_if_empty()
  pNFS: Clear the layout metadata if the server changed the layout stateid
  pNFS: Cleanup - don't open code pnfs_mark_layout_stateid_invalid()
  NFS: pnfs_mark_matching_lsegs_return() should match the layout sequence id
  pNFS: Do not set plh_return_seq for non-callback related layoutreturns
  pNFS: Ensure layoutreturn acts as a completion for layout callbacks
  pNFS: Fix CB_LAYOUTRECALL stateid verification
  pNFS: Always update the layout barrier seqid on LAYOUTGET
  pNFS: Always update the layout stateid if NFS_LAYOUT_INVALID_STID is set
  pNFS: Clear the layout return tracking on layout reinitialisation
  pNFS: LAYOUTRETURN should only update the stateid if the layout is valid
  nfs: don't create zero-length requests
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull NFS client updates from Trond Myklebust:
 "Highlights include:

  Stable bugfixes:
   - nfs: don't create zero-length requests

   - several LAYOUTGET bugfixes

  Features:
   - several performance related features

   - more aggressive caching when we can rely on close-to-open
     cache consistency

   - remove serialisation of O_DIRECT reads and writes

   - optimise several code paths to not flush to disk unnecessarily.

     However allow for the idiosyncracies of pNFS for those layout
     types that need to issue a LAYOUTCOMMIT before the metadata can
     be updated on the server.

   - SUNRPC updates to the client data receive path

   - pNFS/SCSI support RH/Fedora dm-mpath device nodes

   - pNFS files/flexfiles can now use unprivileged ports when
     the generic NFS mount options allow it.

  Bugfixes:
   - Don't use RDMA direct data placement together with data
     integrity or privacy security flavours

   - Remove the RDMA ALLPHYSICAL memory registration mode as
     it has potential security holes.

   - Several layout recall fixes to improve NFSv4.1 protocol
     compliance.

   - Fix an Oops in the pNFS files and flexfiles connection
     setup to the DS

   - Allow retry of operations that used a returned delegation
      stateid

   - Don't mark the inode as revalidated if a LAYOUTCOMMIT is
     outstanding

   - Fix writeback races in nfs4_copy_range() and
     nfs42_proc_deallocate()"

* tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (104 commits)
  pNFS: Actively set attributes as invalid if LAYOUTCOMMIT is outstanding
  NFSv4: Clean up lookup of SECINFO_NO_NAME
  NFSv4.2: Fix warning "variable ‘stateids’ set but not used"
  NFSv4: Fix warning "no previous prototype for ‘nfs4_listxattr’"
  SUNRPC: Fix a compiler warning in fs/nfs/clnt.c
  pNFS: Remove redundant smp_mb() from pnfs_init_lseg()
  pNFS: Cleanup - do layout segment initialisation in one place
  pNFS: Remove redundant stateid invalidation
  pNFS: Remove redundant pnfs_mark_layout_returned_if_empty()
  pNFS: Clear the layout metadata if the server changed the layout stateid
  pNFS: Cleanup - don't open code pnfs_mark_layout_stateid_invalid()
  NFS: pnfs_mark_matching_lsegs_return() should match the layout sequence id
  pNFS: Do not set plh_return_seq for non-callback related layoutreturns
  pNFS: Ensure layoutreturn acts as a completion for layout callbacks
  pNFS: Fix CB_LAYOUTRECALL stateid verification
  pNFS: Always update the layout barrier seqid on LAYOUTGET
  pNFS: Always update the layout stateid if NFS_LAYOUT_INVALID_STID is set
  pNFS: Clear the layout return tracking on layout reinitialisation
  pNFS: LAYOUTRETURN should only update the stateid if the layout is valid
  nfs: don't create zero-length requests
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: move most file-based accounting to the node</title>
<updated>2016-07-28T23:07:41+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2016-07-28T22:46:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=11fb998986a72aa7e997d96d63d52582a01228c5'/>
<id>11fb998986a72aa7e997d96d63d52582a01228c5</id>
<content type='text'>
There are now a number of accounting oddities such as mapped file pages
being accounted for on the node while the total number of file pages are
accounted on the zone.  This can be coped with to some extent but it's
confusing so this patch moves the relevant file-based accounted.  Due to
throttling logic in the page allocator for reliable OOM detection, it is
still necessary to track dirty and writeback pages on a per-zone basis.

[mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
  Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Hillf Danton &lt;hillf.zj@alibaba-inc.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are now a number of accounting oddities such as mapped file pages
being accounted for on the node while the total number of file pages are
accounted on the zone.  This can be coped with to some extent but it's
confusing so this patch moves the relevant file-based accounted.  Due to
throttling logic in the page allocator for reliable OOM detection, it is
still necessary to track dirty and writeback pages on a per-zone basis.

[mgorman@techsingularity.net: fix NR_ZONE_WRITE_PENDING accounting]
  Link: http://lkml.kernel.org/r/1468404004-5085-5-git-send-email-mgorman@techsingularity.net
Link: http://lkml.kernel.org/r/1467970510-21195-20-git-send-email-mgorman@techsingularity.net
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Hillf Danton &lt;hillf.zj@alibaba-inc.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'writeback'</title>
<updated>2016-07-24T21:08:31+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@primarydata.com</email>
</author>
<published>2016-07-24T21:08:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=362745268ce119c473952b30f57d947bdede7f7a'/>
<id>362745268ce119c473952b30f57d947bdede7f7a</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>nfs: don't create zero-length requests</title>
<updated>2016-07-22T19:15:16+00:00</updated>
<author>
<name>Benjamin Coddington</name>
<email>bcodding@redhat.com</email>
</author>
<published>2016-07-18T14:41:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=149a4fddd0a72d526abbeac0c8deaab03559836a'/>
<id>149a4fddd0a72d526abbeac0c8deaab03559836a</id>
<content type='text'>
NFS doesn't expect requests with wb_bytes set to zero and may make
unexpected decisions about how to handle that request at the page IO layer.
Skip request creation if we won't have any wb_bytes in the request.

Signed-off-by: Benjamin Coddington &lt;bcodding@redhat.com&gt;
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Reviewed-by: Weston Andros Adamson &lt;dros@primarydata.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
NFS doesn't expect requests with wb_bytes set to zero and may make
unexpected decisions about how to handle that request at the page IO layer.
Skip request creation if we won't have any wb_bytes in the request.

Signed-off-by: Benjamin Coddington &lt;bcodding@redhat.com&gt;
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Reviewed-by: Weston Andros Adamson &lt;dros@primarydata.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>sunrpc: move NO_CRKEY_TIMEOUT to the auth-&gt;au_flags</title>
<updated>2016-07-19T20:23:24+00:00</updated>
<author>
<name>Scott Mayhew</name>
<email>smayhew@redhat.com</email>
</author>
<published>2016-06-07T19:14:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ce52914eb76efd62aa48d738cf845b37852bf920'/>
<id>ce52914eb76efd62aa48d738cf845b37852bf920</id>
<content type='text'>
A generic_cred can be used to look up a unx_cred or a gss_cred, so it's
not really safe to use the the generic_cred-&gt;acred-&gt;ac_flags to store
the NO_CRKEY_TIMEOUT flag.  A lookup for a unx_cred triggered while the
KEY_EXPIRE_SOON flag is already set will cause both NO_CRKEY_TIMEOUT and
KEY_EXPIRE_SOON to be set in the ac_flags, leaving the user associated
with the auth_cred to be in a state where they're perpetually doing 4K
NFS_FILE_SYNC writes.

This can be reproduced as follows:

1. Mount two NFS filesystems, one with sec=krb5 and one with sec=sys.
They do not need to be the same export, nor do they even need to be from
the same NFS server.  Also, v3 is fine.
$ sudo mount -o v3,sec=krb5 server1:/export /mnt/krb5
$ sudo mount -o v3,sec=sys server2:/export /mnt/sys

2. As the normal user, before accessing the kerberized mount, kinit with
a short lifetime (but not so short that renewing the ticket would leave
you within the 4-minute window again by the time the original ticket
expires), e.g.
$ kinit -l 10m -r 60m

3. Do some I/O to the kerberized mount and verify that the writes are
wsize, UNSTABLE:
$ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1

4. Wait until you're within 4 minutes of key expiry, then do some more
I/O to the kerberized mount to ensure that RPC_CRED_KEY_EXPIRE_SOON gets
set.  Verify that the writes are 4K, FILE_SYNC:
$ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1

5. Now do some I/O to the sec=sys mount.  This will cause
RPC_CRED_NO_CRKEY_TIMEOUT to be set:
$ dd if=/dev/zero of=/mnt/sys/file bs=1M count=1

6. Writes for that user will now be permanently 4K, FILE_SYNC for that
user, regardless of which mount is being written to, until you reboot
the client.  Renewing the kerberos ticket (assuming it hasn't already
expired) will have no effect.  Grabbing a new kerberos ticket at this
point will have no effect either.

Move the flag to the auth-&gt;au_flags field (which is currently unused)
and rename it slightly to reflect that it's no longer associated with
the auth_cred-&gt;ac_flags.  Add the rpc_auth to the arg list of
rpcauth_cred_key_to_expire and check the au_flags there too.  Finally,
add the inode to the arg list of nfs_ctx_key_to_expire so we can
determine the rpc_auth to pass to rpcauth_cred_key_to_expire.

Signed-off-by: Scott Mayhew &lt;smayhew@redhat.com&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A generic_cred can be used to look up a unx_cred or a gss_cred, so it's
not really safe to use the the generic_cred-&gt;acred-&gt;ac_flags to store
the NO_CRKEY_TIMEOUT flag.  A lookup for a unx_cred triggered while the
KEY_EXPIRE_SOON flag is already set will cause both NO_CRKEY_TIMEOUT and
KEY_EXPIRE_SOON to be set in the ac_flags, leaving the user associated
with the auth_cred to be in a state where they're perpetually doing 4K
NFS_FILE_SYNC writes.

This can be reproduced as follows:

1. Mount two NFS filesystems, one with sec=krb5 and one with sec=sys.
They do not need to be the same export, nor do they even need to be from
the same NFS server.  Also, v3 is fine.
$ sudo mount -o v3,sec=krb5 server1:/export /mnt/krb5
$ sudo mount -o v3,sec=sys server2:/export /mnt/sys

2. As the normal user, before accessing the kerberized mount, kinit with
a short lifetime (but not so short that renewing the ticket would leave
you within the 4-minute window again by the time the original ticket
expires), e.g.
$ kinit -l 10m -r 60m

3. Do some I/O to the kerberized mount and verify that the writes are
wsize, UNSTABLE:
$ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1

4. Wait until you're within 4 minutes of key expiry, then do some more
I/O to the kerberized mount to ensure that RPC_CRED_KEY_EXPIRE_SOON gets
set.  Verify that the writes are 4K, FILE_SYNC:
$ dd if=/dev/zero of=/mnt/krb5/file bs=1M count=1

5. Now do some I/O to the sec=sys mount.  This will cause
RPC_CRED_NO_CRKEY_TIMEOUT to be set:
$ dd if=/dev/zero of=/mnt/sys/file bs=1M count=1

6. Writes for that user will now be permanently 4K, FILE_SYNC for that
user, regardless of which mount is being written to, until you reboot
the client.  Renewing the kerberos ticket (assuming it hasn't already
expired) will have no effect.  Grabbing a new kerberos ticket at this
point will have no effect either.

Move the flag to the auth-&gt;au_flags field (which is currently unused)
and rename it slightly to reflect that it's no longer associated with
the auth_cred-&gt;ac_flags.  Add the rpc_auth to the arg list of
rpcauth_cred_key_to_expire and check the au_flags there too.  Finally,
add the inode to the arg list of nfs_ctx_key_to_expire so we can
determine the rpc_auth to pass to rpcauth_cred_key_to_expire.

Signed-off-by: Scott Mayhew &lt;smayhew@redhat.com&gt;
Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>NFSv4.2: Fix writeback races in nfs4_copy_file_range</title>
<updated>2016-07-05T23:11:07+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@primarydata.com</email>
</author>
<published>2016-06-25T22:12:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=837bb1d752d92ea4d870877ffbd6ec5cf76624b3'/>
<id>837bb1d752d92ea4d870877ffbd6ec5cf76624b3</id>
<content type='text'>
We need to ensure that any writes to the destination file are serialised
with the copy, meaning that the writeback has to occur under the inode lock.

Also relax the writeback requirement on the source, and rely on the
stateid checking to tell us if the source rebooted. Add the helper
nfs_filemap_write_and_wait_range() to call pnfs_sync_inode() as
is appropriate for pNFS servers that may need a layoutcommit.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We need to ensure that any writes to the destination file are serialised
with the copy, meaning that the writeback has to occur under the inode lock.

Also relax the writeback requirement on the source, and rely on the
stateid checking to tell us if the source rebooted. Add the helper
nfs_filemap_write_and_wait_range() to call pnfs_sync_inode() as
is appropriate for pNFS servers that may need a layoutcommit.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>NFS: Fix O_DIRECT verifier problems</title>
<updated>2016-07-05T23:11:01+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@primarydata.com</email>
</author>
<published>2016-06-02T01:32:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8fc3c3862728373e0d0f5abccc6afc56c69e0c63'/>
<id>8fc3c3862728373e0d0f5abccc6afc56c69e0c63</id>
<content type='text'>
We should not be interested in looking at the value of the stable field,
since that could take any value.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We should not be interested in looking at the value of the stable field,
since that could take any value.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>NFS: writepage of a single page should not be synchronous</title>
<updated>2016-06-22T13:59:43+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@primarydata.com</email>
</author>
<published>2016-06-01T22:25:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=811ed92ecc9f47eee90beabcf5c2133f2a6d2440'/>
<id>811ed92ecc9f47eee90beabcf5c2133f2a6d2440</id>
<content type='text'>
It is almost always better to wait for more so that we can issue a
bulk commit.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is almost always better to wait for more so that we can issue a
bulk commit.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>NFS: Kill NFS_INO_NFS_INO_FLUSHING: it is a performance killer</title>
<updated>2016-06-22T13:59:42+00:00</updated>
<author>
<name>Trond Myklebust</name>
<email>trond.myklebust@primarydata.com</email>
</author>
<published>2016-06-01T22:23:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6b56a89833fa7903595c8d138bb4927187315cba'/>
<id>6b56a89833fa7903595c8d138bb4927187315cba</id>
<content type='text'>
filemap_datawrite() and friends already deal just fine with livelock.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
filemap_datawrite() and friends already deal just fine with livelock.

Signed-off-by: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfs: avoid race that crashes nfs_init_commit</title>
<updated>2016-05-25T16:59:01+00:00</updated>
<author>
<name>Weston Andros Adamson</name>
<email>dros@monkey.org</email>
</author>
<published>2016-05-25T14:07:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ade8febde0271513360bac44883dbebad44276c3'/>
<id>ade8febde0271513360bac44883dbebad44276c3</id>
<content type='text'>
Since the patch "NFS: Allow multiple commit requests in flight per file"
we can run multiple simultaneous commits on the same inode.  This
introduced a race over collecting pages to commit that made it possible
to call nfs_init_commit() with an empty list - which causes crashes like
the one below.

The fix is to catch this race and avoid calling nfs_init_commit and
initiate_commit when there is no work to do.

Here is the crash:

[600522.076832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[600522.078475] IP: [&lt;ffffffffa0479e72&gt;] nfs_init_commit+0x22/0x130 [nfs]
[600522.078745] PGD 4272b1067 PUD 4272cb067 PMD 0
[600522.078972] Oops: 0000 [#1] SMP
[600522.079204] Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vmw_vsock_vmci_transport vsock bonding ipmi_devintf ipmi_msghandler coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev vmw_balloon parport_pc parport acpi_cpufreq vmw_vmci i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmxnet3
[600522.081380]  vmw_pvscsi ata_generic pata_acpi
[600522.081809] CPU: 3 PID: 15667 Comm: /usr/bin/python Not tainted 4.1.9-100.pd.88.el7.x86_64 #1
[600522.082281] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
[600522.082814] task: ffff8800bbbfa780 ti: ffff88042ae84000 task.ti: ffff88042ae84000
[600522.083378] RIP: 0010:[&lt;ffffffffa0479e72&gt;]  [&lt;ffffffffa0479e72&gt;] nfs_init_commit+0x22/0x130 [nfs]
[600522.083973] RSP: 0018:ffff88042ae87438  EFLAGS: 00010246
[600522.084571] RAX: 0000000000000000 RBX: ffff880003485e40 RCX: ffff88042ae87588
[600522.085188] RDX: 0000000000000000 RSI: ffff88042ae874b0 RDI: ffff880003485e40
[600522.085756] RBP: ffff88042ae87448 R08: ffff880003486010 R09: ffff88042ae874b0
[600522.086332] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88042ae872d0
[600522.086905] R13: ffff88042ae874b0 R14: ffff880003485e40 R15: ffff88042704c840
[600522.087484] FS:  00007f4728ff2740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000
[600522.088070] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[600522.088663] CR2: 0000000000000040 CR3: 000000042b6aa000 CR4: 00000000001406e0
[600522.089327] Stack:
[600522.089926]  0000000000000001 ffff88042ae87588 ffff88042ae874f8 ffffffffa04f09fa
[600522.090549]  0000000000017840 0000000000017840 ffff88042ae87588 ffff8803258d9930
[600522.091169]  ffff88042ae87578 ffffffffa0563d80 0000000000000000 ffff88042704c840
[600522.091789] Call Trace:
[600522.092420]  [&lt;ffffffffa04f09fa&gt;] pnfs_generic_commit_pagelist+0x1da/0x320 [nfsv4]
[600522.093052]  [&lt;ffffffffa0563d80&gt;] ? ff_layout_commit_prepare_v3+0x30/0x30 [nfs_layout_flexfiles]
[600522.093696]  [&lt;ffffffffa0562645&gt;] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles]
[600522.094359]  [&lt;ffffffffa047bc78&gt;] nfs_generic_commit_list+0xe8/0x120 [nfs]
[600522.095032]  [&lt;ffffffffa047bd6a&gt;] nfs_commit_inode+0xba/0x110 [nfs]
[600522.095719]  [&lt;ffffffffa046ac54&gt;] nfs_release_page+0x44/0xd0 [nfs]
[600522.096410]  [&lt;ffffffff811a8122&gt;] try_to_release_page+0x32/0x50
[600522.097109]  [&lt;ffffffff811bd4f1&gt;] shrink_page_list+0x961/0xb30
[600522.097812]  [&lt;ffffffff811bdced&gt;] shrink_inactive_list+0x1cd/0x550
[600522.098530]  [&lt;ffffffff811bea65&gt;] shrink_lruvec+0x635/0x840
[600522.099250]  [&lt;ffffffff811bed60&gt;] shrink_zone+0xf0/0x2f0
[600522.099974]  [&lt;ffffffff811bf312&gt;] do_try_to_free_pages+0x192/0x470
[600522.100709]  [&lt;ffffffff811bf6ca&gt;] try_to_free_pages+0xda/0x170
[600522.101464]  [&lt;ffffffff811b2198&gt;] __alloc_pages_nodemask+0x588/0x970
[600522.102235]  [&lt;ffffffff811fbbd5&gt;] alloc_pages_vma+0xb5/0x230
[600522.103000]  [&lt;ffffffff813a1589&gt;] ? cpumask_any_but+0x39/0x50
[600522.103774]  [&lt;ffffffff811d6115&gt;] wp_page_copy.isra.55+0x95/0x490
[600522.104558]  [&lt;ffffffff810e3438&gt;] ? __wake_up+0x48/0x60
[600522.105357]  [&lt;ffffffff811d7d3b&gt;] do_wp_page+0xab/0x4f0
[600522.106137]  [&lt;ffffffff810a1bbb&gt;] ? release_task+0x36b/0x470
[600522.106902]  [&lt;ffffffff8126dbd7&gt;] ? eventfd_ctx_read+0x67/0x1c0
[600522.107659]  [&lt;ffffffff811da2a8&gt;] handle_mm_fault+0xc78/0x1900
[600522.108431]  [&lt;ffffffff81067ef1&gt;] __do_page_fault+0x181/0x420
[600522.109173]  [&lt;ffffffff811446a6&gt;] ? __audit_syscall_exit+0x1e6/0x280
[600522.109893]  [&lt;ffffffff810681c0&gt;] do_page_fault+0x30/0x80
[600522.110594]  [&lt;ffffffff81024f36&gt;] ? syscall_trace_leave+0xc6/0x120
[600522.111288]  [&lt;ffffffff81790a58&gt;] page_fault+0x28/0x30
[600522.111947] Code: 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 4c 8d 87 d0 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce &lt;48&gt; 8b 40 40 4c 8b 50 30 74 24 48 8b 87 d0 01 00 00 48 8b 7e 08
[600522.113343] RIP  [&lt;ffffffffa0479e72&gt;] nfs_init_commit+0x22/0x130 [nfs]
[600522.114003]  RSP &lt;ffff88042ae87438&gt;
[600522.114636] CR2: 0000000000000040

Fixes: af7cf057 (NFS: Allow multiple commit requests in flight per file)
CC: stable@vger.kernel.org
Signed-off-by: Weston Andros Adamson &lt;dros@primarydata.com&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since the patch "NFS: Allow multiple commit requests in flight per file"
we can run multiple simultaneous commits on the same inode.  This
introduced a race over collecting pages to commit that made it possible
to call nfs_init_commit() with an empty list - which causes crashes like
the one below.

The fix is to catch this race and avoid calling nfs_init_commit and
initiate_commit when there is no work to do.

Here is the crash:

[600522.076832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[600522.078475] IP: [&lt;ffffffffa0479e72&gt;] nfs_init_commit+0x22/0x130 [nfs]
[600522.078745] PGD 4272b1067 PUD 4272cb067 PMD 0
[600522.078972] Oops: 0000 [#1] SMP
[600522.079204] Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dcdbas ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vmw_vsock_vmci_transport vsock bonding ipmi_devintf ipmi_msghandler coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ppdev vmw_balloon parport_pc parport acpi_cpufreq vmw_vmci i2c_piix4 shpchp nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c vmwgfx drm_kms_helper ttm drm crc32c_intel serio_raw vmxnet3
[600522.081380]  vmw_pvscsi ata_generic pata_acpi
[600522.081809] CPU: 3 PID: 15667 Comm: /usr/bin/python Not tainted 4.1.9-100.pd.88.el7.x86_64 #1
[600522.082281] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
[600522.082814] task: ffff8800bbbfa780 ti: ffff88042ae84000 task.ti: ffff88042ae84000
[600522.083378] RIP: 0010:[&lt;ffffffffa0479e72&gt;]  [&lt;ffffffffa0479e72&gt;] nfs_init_commit+0x22/0x130 [nfs]
[600522.083973] RSP: 0018:ffff88042ae87438  EFLAGS: 00010246
[600522.084571] RAX: 0000000000000000 RBX: ffff880003485e40 RCX: ffff88042ae87588
[600522.085188] RDX: 0000000000000000 RSI: ffff88042ae874b0 RDI: ffff880003485e40
[600522.085756] RBP: ffff88042ae87448 R08: ffff880003486010 R09: ffff88042ae874b0
[600522.086332] R10: 0000000000000000 R11: 0000000000000005 R12: ffff88042ae872d0
[600522.086905] R13: ffff88042ae874b0 R14: ffff880003485e40 R15: ffff88042704c840
[600522.087484] FS:  00007f4728ff2740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000
[600522.088070] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[600522.088663] CR2: 0000000000000040 CR3: 000000042b6aa000 CR4: 00000000001406e0
[600522.089327] Stack:
[600522.089926]  0000000000000001 ffff88042ae87588 ffff88042ae874f8 ffffffffa04f09fa
[600522.090549]  0000000000017840 0000000000017840 ffff88042ae87588 ffff8803258d9930
[600522.091169]  ffff88042ae87578 ffffffffa0563d80 0000000000000000 ffff88042704c840
[600522.091789] Call Trace:
[600522.092420]  [&lt;ffffffffa04f09fa&gt;] pnfs_generic_commit_pagelist+0x1da/0x320 [nfsv4]
[600522.093052]  [&lt;ffffffffa0563d80&gt;] ? ff_layout_commit_prepare_v3+0x30/0x30 [nfs_layout_flexfiles]
[600522.093696]  [&lt;ffffffffa0562645&gt;] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles]
[600522.094359]  [&lt;ffffffffa047bc78&gt;] nfs_generic_commit_list+0xe8/0x120 [nfs]
[600522.095032]  [&lt;ffffffffa047bd6a&gt;] nfs_commit_inode+0xba/0x110 [nfs]
[600522.095719]  [&lt;ffffffffa046ac54&gt;] nfs_release_page+0x44/0xd0 [nfs]
[600522.096410]  [&lt;ffffffff811a8122&gt;] try_to_release_page+0x32/0x50
[600522.097109]  [&lt;ffffffff811bd4f1&gt;] shrink_page_list+0x961/0xb30
[600522.097812]  [&lt;ffffffff811bdced&gt;] shrink_inactive_list+0x1cd/0x550
[600522.098530]  [&lt;ffffffff811bea65&gt;] shrink_lruvec+0x635/0x840
[600522.099250]  [&lt;ffffffff811bed60&gt;] shrink_zone+0xf0/0x2f0
[600522.099974]  [&lt;ffffffff811bf312&gt;] do_try_to_free_pages+0x192/0x470
[600522.100709]  [&lt;ffffffff811bf6ca&gt;] try_to_free_pages+0xda/0x170
[600522.101464]  [&lt;ffffffff811b2198&gt;] __alloc_pages_nodemask+0x588/0x970
[600522.102235]  [&lt;ffffffff811fbbd5&gt;] alloc_pages_vma+0xb5/0x230
[600522.103000]  [&lt;ffffffff813a1589&gt;] ? cpumask_any_but+0x39/0x50
[600522.103774]  [&lt;ffffffff811d6115&gt;] wp_page_copy.isra.55+0x95/0x490
[600522.104558]  [&lt;ffffffff810e3438&gt;] ? __wake_up+0x48/0x60
[600522.105357]  [&lt;ffffffff811d7d3b&gt;] do_wp_page+0xab/0x4f0
[600522.106137]  [&lt;ffffffff810a1bbb&gt;] ? release_task+0x36b/0x470
[600522.106902]  [&lt;ffffffff8126dbd7&gt;] ? eventfd_ctx_read+0x67/0x1c0
[600522.107659]  [&lt;ffffffff811da2a8&gt;] handle_mm_fault+0xc78/0x1900
[600522.108431]  [&lt;ffffffff81067ef1&gt;] __do_page_fault+0x181/0x420
[600522.109173]  [&lt;ffffffff811446a6&gt;] ? __audit_syscall_exit+0x1e6/0x280
[600522.109893]  [&lt;ffffffff810681c0&gt;] do_page_fault+0x30/0x80
[600522.110594]  [&lt;ffffffff81024f36&gt;] ? syscall_trace_leave+0xc6/0x120
[600522.111288]  [&lt;ffffffff81790a58&gt;] page_fault+0x28/0x30
[600522.111947] Code: 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 4c 8d 87 d0 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce &lt;48&gt; 8b 40 40 4c 8b 50 30 74 24 48 8b 87 d0 01 00 00 48 8b 7e 08
[600522.113343] RIP  [&lt;ffffffffa0479e72&gt;] nfs_init_commit+0x22/0x130 [nfs]
[600522.114003]  RSP &lt;ffff88042ae87438&gt;
[600522.114636] CR2: 0000000000000040

Fixes: af7cf057 (NFS: Allow multiple commit requests in flight per file)
CC: stable@vger.kernel.org
Signed-off-by: Weston Andros Adamson &lt;dros@primarydata.com&gt;
Signed-off-by: Anna Schumaker &lt;Anna.Schumaker@Netapp.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
