<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/fs/nfsd, branch v4.17</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>nfsd: vfs_mkdir() might succeed leaving dentry negative unhashed</title>
<updated>2018-05-21T18:30:10+00:00</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2018-05-11T21:03:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3819bb0d79f50b05910db5bdc6d9ef512184e3b1'/>
<id>3819bb0d79f50b05910db5bdc6d9ef512184e3b1</id>
<content type='text'>
That can (and does, on some filesystems) happen - -&gt;mkdir() (and thus
vfs_mkdir()) can legitimately leave its argument negative and just
unhash it, counting upon the lookup to pick the object we'd created
next time we try to look at that name.

Some vfs_mkdir() callers forget about that possibility...

Acked-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
That can (and does, on some filesystems) happen - -&gt;mkdir() (and thus
vfs_mkdir()) can legitimately leave its argument negative and just
unhash it, counting upon the lookup to pick the object we'd created
next time we try to look at that name.

Some vfs_mkdir() callers forget about that possibility...

Acked-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfsd: fix incorrect umasks</title>
<updated>2018-04-03T20:27:08+00:00</updated>
<author>
<name>J. Bruce Fields</name>
<email>bfields@redhat.com</email>
</author>
<published>2018-03-21T21:19:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=880a3a5325489a143269a8e172e7563ebf9897bc'/>
<id>880a3a5325489a143269a8e172e7563ebf9897bc</id>
<content type='text'>
We're neglecting to clear the umask after it's set, which can cause a
later unrelated rpc to (incorrectly) use the same umask if it happens to
be processed by the same thread.

There's a more subtle problem here too:

An NFSv4 compound request is decoded all in one pass before any
operations are executed.

Currently we're setting current-&gt;fs-&gt;umask at the time we decode the
compound.  In theory a single compound could contain multiple creates
each setting a umask.  In that case we'd end up using whichever umask
was passed in the *last* operation as the umask for all the creates,
whether that was correct or not.

So, we should just be saving the umask at decode time and waiting to set
it until we actually process the corresponding operation.

In practice it's unlikely any client would do multiple creates in a
single compound.  And even if it did they'd likely be from the same
process (hence carry the same umask).  So this is a little academic, but
we should get it right anyway.

Fixes: 47057abde515 (nfsd: add support for the umask attribute)
Cc: stable@vger.kernel.org
Reported-by: Lucash Stach &lt;l.stach@pengutronix.de&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We're neglecting to clear the umask after it's set, which can cause a
later unrelated rpc to (incorrectly) use the same umask if it happens to
be processed by the same thread.

There's a more subtle problem here too:

An NFSv4 compound request is decoded all in one pass before any
operations are executed.

Currently we're setting current-&gt;fs-&gt;umask at the time we decode the
compound.  In theory a single compound could contain multiple creates
each setting a umask.  In that case we'd end up using whichever umask
was passed in the *last* operation as the umask for all the creates,
whether that was correct or not.

So, we should just be saving the umask at decode time and waiting to set
it until we actually process the corresponding operation.

In practice it's unlikely any client would do multiple creates in a
single compound.  And even if it did they'd likely be from the same
process (hence carry the same umask).  So this is a little academic, but
we should get it right anyway.

Fixes: 47057abde515 (nfsd: add support for the umask attribute)
Cc: stable@vger.kernel.org
Reported-by: Lucash Stach &lt;l.stach@pengutronix.de&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>NFSD: Clean up legacy NFS SYMLINK argument XDR decoders</title>
<updated>2018-04-03T19:08:16+00:00</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2018-03-27T14:54:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=38a70315599dedacd9ff3bd1016f9048c9d0ad12'/>
<id>38a70315599dedacd9ff3bd1016f9048c9d0ad12</id>
<content type='text'>
Move common code in NFSD's legacy SYMLINK decoders into a helper.
The immediate benefits include:

 - one fewer data copies on transports that support DDP
 - consistent error checking across all versions
 - reduction of code duplication
 - support for both legal forms of SYMLINK requests on RDMA
   transports for all versions of NFS (in particular, NFSv2, for
   completeness)

In the long term, this helper is an appropriate spot to perform a
per-transport call-out to fill the pathname argument using, say,
RDMA Reads.

Filling the pathname in the proc function also means that eventually
the incoming filehandle can be interpreted so that filesystem-
specific memory can be allocated as a sink for the pathname
argument, rather than using anonymous pages.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move common code in NFSD's legacy SYMLINK decoders into a helper.
The immediate benefits include:

 - one fewer data copies on transports that support DDP
 - consistent error checking across all versions
 - reduction of code duplication
 - support for both legal forms of SYMLINK requests on RDMA
   transports for all versions of NFS (in particular, NFSv2, for
   completeness)

In the long term, this helper is an appropriate spot to perform a
per-transport call-out to fill the pathname argument using, say,
RDMA Reads.

Filling the pathname in the proc function also means that eventually
the incoming filehandle can be interpreted so that filesystem-
specific memory can be allocated as a sink for the pathname
argument, rather than using anonymous pages.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>NFSD: Clean up legacy NFS WRITE argument XDR decoders</title>
<updated>2018-04-03T19:08:16+00:00</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2018-03-27T14:54:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=8154ef2776aa512a3eaa0e7db030dc4803354d61'/>
<id>8154ef2776aa512a3eaa0e7db030dc4803354d61</id>
<content type='text'>
Move common code in NFSD's legacy NFS WRITE decoders into a helper.
The immediate benefit is reduction of code duplication and some nice
micro-optimizations (see below).

In the long term, this helper can perform a per-transport call-out
to fill the rq_vec (say, using RDMA Reads).

The legacy WRITE decoders and procs are changed to work like NFSv4,
which constructs the rq_vec just before it is about to call
vfs_writev.

Why? Calling a transport call-out from the proc instead of the XDR
decoder means that the incoming FH can be resolved to a particular
filesystem and file. This would allow pages from the backing file to
be presented to the transport to be filled, rather than presenting
anonymous pages and copying or flipping them into the file's page
cache later.

I also prefer using the pages in rq_arg.pages, instead of pulling
the data pages directly out of the rqstp::rq_pages array. This is
currently the way the NFSv3 write decoder works, but the other two
do not seem to take this approach. Fixing this removes the only
reference to rq_pages found in NFSD, eliminating an NFSD assumption
about how transports use the pages in rq_pages.

Lastly, avoid setting up the first element of rq_vec as a zero-
length buffer. This happens with an RDMA transport when a normal
Read chunk is present because the data payload is in rq_arg's
page list (none of it is in the head buffer).

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Move common code in NFSD's legacy NFS WRITE decoders into a helper.
The immediate benefit is reduction of code duplication and some nice
micro-optimizations (see below).

In the long term, this helper can perform a per-transport call-out
to fill the rq_vec (say, using RDMA Reads).

The legacy WRITE decoders and procs are changed to work like NFSv4,
which constructs the rq_vec just before it is about to call
vfs_writev.

Why? Calling a transport call-out from the proc instead of the XDR
decoder means that the incoming FH can be resolved to a particular
filesystem and file. This would allow pages from the backing file to
be presented to the transport to be filled, rather than presenting
anonymous pages and copying or flipping them into the file's page
cache later.

I also prefer using the pages in rq_arg.pages, instead of pulling
the data pages directly out of the rqstp::rq_pages array. This is
currently the way the NFSv3 write decoder works, but the other two
do not seem to take this approach. Fixing this removes the only
reference to rq_pages found in NFSD, eliminating an NFSD assumption
about how transports use the pages in rq_pages.

Lastly, avoid setting up the first element of rq_vec as a zero-
length buffer. This happens with an RDMA transport when a normal
Read chunk is present because the data payload is in rq_arg's
page list (none of it is in the head buffer).

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfsd: Trace NFSv4 COMPOUND execution</title>
<updated>2018-04-03T19:08:15+00:00</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2018-03-27T14:53:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fff4080b2f1389ecf3028d72eb70e1837df48b01'/>
<id>fff4080b2f1389ecf3028d72eb70e1837df48b01</id>
<content type='text'>
This helps record the identity and timing of the ops in each NFSv4
COMPOUND, replacing dprintk calls that did much the same thing.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This helps record the identity and timing of the ops in each NFSv4
COMPOUND, replacing dprintk calls that did much the same thing.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfsd: Add I/O trace points in the NFSv4 read proc</title>
<updated>2018-04-03T19:08:15+00:00</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2018-03-28T17:29:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=87c5942e8fae81ac296267654a047eb2db81592b'/>
<id>87c5942e8fae81ac296267654a047eb2db81592b</id>
<content type='text'>
NFSv4 read compound processing invokes nfsd_splice_read and
nfs_readv directly, so the trace points currently in nfsd_read are
not invoked for NFSv4 reads.

Move the NFSD READ trace points to common helpers so that NFSv4
reads are captured.

Also, record any local I/O error that occurs, the total count of
bytes that were actually returned, and whether splice or vectored
read was used.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
NFSv4 read compound processing invokes nfsd_splice_read and
nfs_readv directly, so the trace points currently in nfsd_read are
not invoked for NFSv4 reads.

Move the NFSD READ trace points to common helpers so that NFSv4
reads are captured.

Also, record any local I/O error that occurs, the total count of
bytes that were actually returned, and whether splice or vectored
read was used.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfsd: Add I/O trace points in the NFSv4 write path</title>
<updated>2018-04-03T19:08:15+00:00</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2018-03-27T14:53:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d890be159a71395bd4ceca09233a8a56d7da69ee'/>
<id>d890be159a71395bd4ceca09233a8a56d7da69ee</id>
<content type='text'>
NFSv4 write compound processing invokes nfsd_vfs_write directly. The
trace points currently in nfsd_write are not effective for NFSv4
writes.

Move the trace points into the shared nfsd_vfs_write() helper.

After the I/O, we also want to record any local I/O error that
might have occurred, and the total count of bytes that were actually
moved (rather than the requested number).

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
NFSv4 write compound processing invokes nfsd_vfs_write directly. The
trace points currently in nfsd_write are not effective for NFSv4
writes.

Move the trace points into the shared nfsd_vfs_write() helper.

After the I/O, we also want to record any local I/O error that
might have occurred, and the total count of bytes that were actually
moved (rather than the requested number).

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfsd: Add "nfsd_" to trace point names</title>
<updated>2018-04-03T19:08:14+00:00</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2018-03-27T14:53:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f394b62b7b0dd24757e9813e19f4b6091c5a28f6'/>
<id>f394b62b7b0dd24757e9813e19f4b6091c5a28f6</id>
<content type='text'>
Follow naming convention used in client and in sunrpc layers.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Follow naming convention used in client and in sunrpc layers.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfsd: Record request byte count, not count of vectors</title>
<updated>2018-04-03T19:08:14+00:00</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2018-03-27T14:52:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=79e0b4e247b3d1787ade451cc0187e8d3b0e6e53'/>
<id>79e0b4e247b3d1787ade451cc0187e8d3b0e6e53</id>
<content type='text'>
Byte count is more helpful to know than vector count.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Byte count is more helpful to know than vector count.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nfsd: Fix NFSD trace points</title>
<updated>2018-04-03T19:08:13+00:00</updated>
<author>
<name>Chuck Lever</name>
<email>chuck.lever@oracle.com</email>
</author>
<published>2018-03-27T14:52:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=afa720a0910bd0fe4a714ab9b902ff6e845900c3'/>
<id>afa720a0910bd0fe4a714ab9b902ff6e845900c3</id>
<content type='text'>
nfsd-1915  [003] 77915.780959: write_opened:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1
nfsd-1915  [003] 77915.780960: write_io_done:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1
nfsd-1915  [003] 77915.780964: write_done:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1

Byte swapping and knfsd_fh_hash() are not available in "trace-cmd
report", where the print format string is actually used. These
data transformations have to be done during the TP_fast_assign step.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
nfsd-1915  [003] 77915.780959: write_opened:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1
nfsd-1915  [003] 77915.780960: write_io_done:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1
nfsd-1915  [003] 77915.780964: write_done:
	[FAILED TO PARSE] xid=3286130958 fh=0 offset=154624 len=1

Byte swapping and knfsd_fh_hash() are not available in "trace-cmd
report", where the print format string is actually used. These
data transformations have to be done during the TP_fast_assign step.

Signed-off-by: Chuck Lever &lt;chuck.lever@oracle.com&gt;
Signed-off-by: J. Bruce Fields &lt;bfields@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
