<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/fs, branch v3.18.82</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Revert "ceph: unlock dangling spinlock in try_flush_caps()"</title>
<updated>2017-11-18T10:06:28+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2017-11-15T17:26:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=41f612ae707e301e334872416582460674d03ef6'/>
<id>41f612ae707e301e334872416582460674d03ef6</id>
<content type='text'>
This reverts commit 55d4aa12af57ea7782f0c8bbc3b01e44673b05ba which is
commit 6c2838fbdedb9b72a81c931d49e56b229b6cdbca upstream.

The locking issue was not a problem in 3.18, and now sparse rightly
complains about this being an issue, so go back to the "correct" code.

Cc: Jeff Layton &lt;jlayton@redhat.com&gt;
Cc: "Yan, Zheng" &lt;zyan@redhat.com&gt;
Cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reported-by: kbuild test robot &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit 55d4aa12af57ea7782f0c8bbc3b01e44673b05ba which is
commit 6c2838fbdedb9b72a81c931d49e56b229b6cdbca upstream.

The locking issue was not a problem in 3.18, and now sparse rightly
complains about this being an issue, so go back to the "correct" code.

Cc: Jeff Layton &lt;jlayton@redhat.com&gt;
Cc: "Yan, Zheng" &lt;zyan@redhat.com&gt;
Cc: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reported-by: kbuild test robot &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: do not use stripe_width if it is not set</title>
<updated>2017-11-08T09:03:49+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-10-07T22:37:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=52f145db2870583eeb6afcf414578f155b85179e'/>
<id>52f145db2870583eeb6afcf414578f155b85179e</id>
<content type='text'>
[ Upstream commit 5469d7c3087ecaf760f54b447f11af6061b7c897 ]

Avoid using stripe_width for sbi-&gt;s_stripe value if it is not actually
set. It prevents using the stride for sbi-&gt;s_stripe.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5469d7c3087ecaf760f54b447f11af6061b7c897 ]

Avoid using stripe_width for sbi-&gt;s_stripe value if it is not actually
set. It prevents using the stride for sbi-&gt;s_stripe.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ext4: fix stripe-unaligned allocations</title>
<updated>2017-11-08T09:03:49+00:00</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-10-07T22:37:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f59fbdd3621693b36e4deb086f2d83031ec770a5'/>
<id>f59fbdd3621693b36e4deb086f2d83031ec770a5</id>
<content type='text'>
[ Upstream commit d9b22cf9f5466a057f2a4f1e642b469fa9d73117 ]

When a filesystem is created using:

	mkfs.ext4 -b 4096 -E stride=512 &lt;dev&gt;

and we try to allocate 64MB extent, we will end up directly in
ext4_mb_complex_scan_group(). This is because the request is detected
as power-of-two allocation (so we start in ext4_mb_regular_allocator()
with ac_criteria == 0) however the check before
ext4_mb_simple_scan_group() refuses the direct buddy scan because the
allocation request is too large. Since cr == 0, the check whether we
should use ext4_mb_scan_aligned() fails as well and we fall back to
ext4_mb_complex_scan_group().

Fix the problem by checking for upper limit on power-of-two requests
directly when detecting them.

Reported-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit d9b22cf9f5466a057f2a4f1e642b469fa9d73117 ]

When a filesystem is created using:

	mkfs.ext4 -b 4096 -E stride=512 &lt;dev&gt;

and we try to allocate 64MB extent, we will end up directly in
ext4_mb_complex_scan_group(). This is because the request is detected
as power-of-two allocation (so we start in ext4_mb_regular_allocator()
with ac_criteria == 0) however the check before
ext4_mb_simple_scan_group() refuses the direct buddy scan because the
allocation request is too large. Since cr == 0, the check whether we
should use ext4_mb_scan_aligned() fails as well and we fall back to
ext4_mb_complex_scan_group().

Fix the problem by checking for upper limit on power-of-two requests
directly when detecting them.

Reported-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ocfs2: fstrim: Fix start offset of first cluster group during fstrim</title>
<updated>2017-11-08T09:03:48+00:00</updated>
<author>
<name>Ashish Samant</name>
<email>ashish.samant@oracle.com</email>
</author>
<published>2017-11-02T22:59:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=163e8355bbe9af2bb3ce7c4cc7a11ed6124b76bf'/>
<id>163e8355bbe9af2bb3ce7c4cc7a11ed6124b76bf</id>
<content type='text'>
commit 105ddc93f06ebe3e553f58563d11ed63dbcd59f0 upstream.

The first cluster group descriptor is not stored at the start of the
group but at an offset from the start.  We need to take this into
account while doing fstrim on the first cluster group.  Otherwise we
will wrongly start fstrim a few blocks after the desired start block and
the range can cross over into the next cluster group and zero out the
group descriptor there.  This can cause filesytem corruption that cannot
be fixed by fsck.

Link: http://lkml.kernel.org/r/1507835579-7308-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant &lt;ashish.samant@oracle.com&gt;
Reviewed-by: Junxiao Bi &lt;junxiao.bi@oracle.com&gt;
Reviewed-by: Joseph Qi &lt;jiangqi903@gmail.com&gt;
Cc: Mark Fasheh &lt;mfasheh@versity.com&gt;
Cc: Joel Becker &lt;jlbec@evilplan.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 105ddc93f06ebe3e553f58563d11ed63dbcd59f0 upstream.

The first cluster group descriptor is not stored at the start of the
group but at an offset from the start.  We need to take this into
account while doing fstrim on the first cluster group.  Otherwise we
will wrongly start fstrim a few blocks after the desired start block and
the range can cross over into the next cluster group and zero out the
group descriptor there.  This can cause filesytem corruption that cannot
be fixed by fsck.

Link: http://lkml.kernel.org/r/1507835579-7308-1-git-send-email-ashish.samant@oracle.com
Signed-off-by: Ashish Samant &lt;ashish.samant@oracle.com&gt;
Reviewed-by: Junxiao Bi &lt;junxiao.bi@oracle.com&gt;
Reviewed-by: Joseph Qi &lt;jiangqi903@gmail.com&gt;
Cc: Mark Fasheh &lt;mfasheh@versity.com&gt;
Cc: Joel Becker &lt;jlbec@evilplan.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>cifs: check MaxPathNameComponentLength != 0 before using it</title>
<updated>2017-11-08T09:03:48+00:00</updated>
<author>
<name>Ronnie Sahlberg</name>
<email>lsahlber@redhat.com</email>
</author>
<published>2017-10-30T02:28:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e5c2a548f826c9fb1c8b0d41576fc01f81c523ad'/>
<id>e5c2a548f826c9fb1c8b0d41576fc01f81c523ad</id>
<content type='text'>
commit f74bc7c6679200a4a83156bb89cbf6c229fe8ec0 upstream.

And fix tcon leak in error path.

Signed-off-by: Ronnie Sahlberg &lt;lsahlber@redhat.com&gt;
Signed-off-by: Steve French &lt;smfrench@gmail.com&gt;
Reviewed-by: David Disseldorp &lt;ddiss@samba.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f74bc7c6679200a4a83156bb89cbf6c229fe8ec0 upstream.

And fix tcon leak in error path.

Signed-off-by: Ronnie Sahlberg &lt;lsahlber@redhat.com&gt;
Signed-off-by: Steve French &lt;smfrench@gmail.com&gt;
Reviewed-by: David Disseldorp &lt;ddiss@samba.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ecryptfs: fix dereference of NULL user_key_payload</title>
<updated>2017-11-02T08:36:48+00:00</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2017-10-09T19:51:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5560a1cac11138fe3df5643c7819fd3c0d3755eb'/>
<id>5560a1cac11138fe3df5643c7819fd3c0d3755eb</id>
<content type='text'>
commit f66665c09ab489a11ca490d6a82df57cfc1bea3e upstream.

In eCryptfs, we failed to verify that the authentication token keys are
not revoked before dereferencing their payloads, which is problematic
because the payload of a revoked key is NULL.  request_key() *does* skip
revoked keys, but there is still a window where the key can be revoked
before we acquire the key semaphore.

Fix it by updating ecryptfs_get_key_payload_data() to return
-EKEYREVOKED if the key payload is NULL.  For completeness we check this
for "encrypted" keys as well as "user" keys, although encrypted keys
cannot be revoked currently.

Alternatively we could use key_validate(), but since we'll also need to
fix ecryptfs_get_key_payload_data() to validate the payload length, it
seems appropriate to just check the payload pointer.

Fixes: 237fead61998 ("[PATCH] ecryptfs: fs/Makefile and fs/Kconfig")
Reviewed-by: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: Michael Halcrow &lt;mhalcrow@google.com&gt;
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f66665c09ab489a11ca490d6a82df57cfc1bea3e upstream.

In eCryptfs, we failed to verify that the authentication token keys are
not revoked before dereferencing their payloads, which is problematic
because the payload of a revoked key is NULL.  request_key() *does* skip
revoked keys, but there is still a window where the key can be revoked
before we acquire the key semaphore.

Fix it by updating ecryptfs_get_key_payload_data() to return
-EKEYREVOKED if the key payload is NULL.  For completeness we check this
for "encrypted" keys as well as "user" keys, although encrypted keys
cannot be revoked currently.

Alternatively we could use key_validate(), but since we'll also need to
fix ecryptfs_get_key_payload_data() to validate the payload length, it
seems appropriate to just check the payload pointer.

Fixes: 237fead61998 ("[PATCH] ecryptfs: fs/Makefile and fs/Kconfig")
Reviewed-by: James Morris &lt;james.l.morris@oracle.com&gt;
Cc: Michael Halcrow &lt;mhalcrow@google.com&gt;
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</pre>
</div>
</content>
</entry>
<entry>
<title>fuse: fix READDIRPLUS skipping an entry</title>
<updated>2017-11-02T08:36:48+00:00</updated>
<author>
<name>Miklos Szeredi</name>
<email>mszeredi@redhat.com</email>
</author>
<published>2017-10-25T14:34:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b7be20b0175f4d148a06224df0324b5810aadb99'/>
<id>b7be20b0175f4d148a06224df0324b5810aadb99</id>
<content type='text'>
commit c6cdd51404b7ac12dd95173ddfc548c59ecf037f upstream.

Marios Titas running a Haskell program noticed a problem with fuse's
readdirplus: when it is interrupted by a signal, it skips one directory
entry.

The reason is that fuse erronously updates ctx-&gt;pos after a failed
dir_emit().

The issue originates from the patch adding readdirplus support.

Reported-by: Jakob Unterwurzacher &lt;jakobunt@gmail.com&gt;
Tested-by: Marios Titas &lt;redneb@gmx.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c6cdd51404b7ac12dd95173ddfc548c59ecf037f upstream.

Marios Titas running a Haskell program noticed a problem with fuse's
readdirplus: when it is interrupted by a signal, it skips one directory
entry.

The reason is that fuse erronously updates ctx-&gt;pos after a failed
dir_emit().

The issue originates from the patch adding readdirplus support.

Reported-by: Jakob Unterwurzacher &lt;jakobunt@gmail.com&gt;
Tested-by: Marios Titas &lt;redneb@gmx.com&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ceph: unlock dangling spinlock in try_flush_caps()</title>
<updated>2017-11-02T08:36:47+00:00</updated>
<author>
<name>Jeff Layton</name>
<email>jlayton@redhat.com</email>
</author>
<published>2017-10-19T12:52:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=55d4aa12af57ea7782f0c8bbc3b01e44673b05ba'/>
<id>55d4aa12af57ea7782f0c8bbc3b01e44673b05ba</id>
<content type='text'>
commit 6c2838fbdedb9b72a81c931d49e56b229b6cdbca upstream.

sparse warns:

  fs/ceph/caps.c:2042:9: warning: context imbalance in 'try_flush_caps' - wrong count at exit

We need to exit this function with the lock unlocked, but a couple of
cases leave it locked.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Reviewed-by: "Yan, Zheng" &lt;zyan@redhat.com&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6c2838fbdedb9b72a81c931d49e56b229b6cdbca upstream.

sparse warns:

  fs/ceph/caps.c:2042:9: warning: context imbalance in 'try_flush_caps' - wrong count at exit

We need to exit this function with the lock unlocked, but a couple of
cases leave it locked.

Signed-off-by: Jeff Layton &lt;jlayton@redhat.com&gt;
Reviewed-by: "Yan, Zheng" &lt;zyan@redhat.com&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>FS-Cache: fix dereference of NULL user_key_payload</title>
<updated>2017-10-27T08:17:24+00:00</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2017-10-09T19:40:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=114dff8ef64e4b72cb97a613179db64096bc79dd'/>
<id>114dff8ef64e4b72cb97a613179db64096bc79dd</id>
<content type='text'>
commit d124b2c53c7bee6569d2a2d0b18b4a1afde00134 upstream.

When the file /proc/fs/fscache/objects (available with
CONFIG_FSCACHE_OBJECT_LIST=y) is opened, we request a user key with
description "fscache:objlist", then access its payload.  However, a
revoked key has a NULL payload, and we failed to check for this.
request_key() *does* skip revoked keys, but there is still a window
where the key can be revoked before we access its payload.

Fix it by checking for a NULL payload, treating it like a key which was
already revoked at the time it was requested.

Fixes: 4fbf4291aa15 ("FS-Cache: Allow the current state of all objects to be dumped")
Reviewed-by: James Morris &lt;james.l.morris@oracle.com&gt;
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d124b2c53c7bee6569d2a2d0b18b4a1afde00134 upstream.

When the file /proc/fs/fscache/objects (available with
CONFIG_FSCACHE_OBJECT_LIST=y) is opened, we request a user key with
description "fscache:objlist", then access its payload.  However, a
revoked key has a NULL payload, and we failed to check for this.
request_key() *does* skip revoked keys, but there is still a window
where the key can be revoked before we access its payload.

Fix it by checking for a NULL payload, treating it like a key which was
already revoked at the time it was requested.

Fixes: 4fbf4291aa15 ("FS-Cache: Allow the current state of all objects to be dumped")
Reviewed-by: James Morris &lt;james.l.morris@oracle.com&gt;
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock</title>
<updated>2017-10-21T15:07:26+00:00</updated>
<author>
<name>Eric Ren</name>
<email>zren@suse.com</email>
</author>
<published>2017-02-22T23:40:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b370a0378091e31f8c6a4eb708dfd22be314f7c0'/>
<id>b370a0378091e31f8c6a4eb708dfd22be314f7c0</id>
<content type='text'>
[ Upstream commit 439a36b8ef38657f765b80b775e2885338d72451 ]

We are in the situation that we have to avoid recursive cluster locking,
but there is no way to check if a cluster lock has been taken by a precess
already.

Mostly, we can avoid recursive locking by writing code carefully.
However, we found that it's very hard to handle the routines that are
invoked directly by vfs code.  For instance:

  const struct inode_operations ocfs2_file_iops = {
      .permission     = ocfs2_permission,
      .get_acl        = ocfs2_iop_get_acl,
      .set_acl        = ocfs2_iop_set_acl,
  };

Both ocfs2_permission() and ocfs2_iop_get_acl() call ocfs2_inode_lock(PR):

  do_sys_open
   may_open
    inode_permission
     ocfs2_permission
      ocfs2_inode_lock() &lt;=== first time
       generic_permission
        get_acl
         ocfs2_iop_get_acl
  	ocfs2_inode_lock() &lt;=== recursive one

A deadlock will occur if a remote EX request comes in between two of
ocfs2_inode_lock().  Briefly describe how the deadlock is formed:

On one hand, OCFS2_LOCK_BLOCKED flag of this lockres is set in
BAST(ocfs2_generic_handle_bast) when downconvert is started on behalf of
the remote EX lock request.  Another hand, the recursive cluster lock
(the second one) will be blocked in in __ocfs2_cluster_lock() because of
OCFS2_LOCK_BLOCKED.  But, the downconvert never complete, why? because
there is no chance for the first cluster lock on this node to be
unlocked - we block ourselves in the code path.

The idea to fix this issue is mostly taken from gfs2 code.

1. introduce a new field: struct ocfs2_lock_res.l_holders, to keep track
   of the processes' pid who has taken the cluster lock of this lock
   resource;

2. introduce a new flag for ocfs2_inode_lock_full:
   OCFS2_META_LOCK_GETBH; it means just getting back disk inode bh for
   us if we've got cluster lock.

3. export a helper: ocfs2_is_locked_by_me() is used to check if we have
   got the cluster lock in the upper code path.

The tracking logic should be used by some of the ocfs2 vfs's callbacks,
to solve the recursive locking issue cuased by the fact that vfs
routines can call into each other.

The performance penalty of processing the holder list should only be
seen at a few cases where the tracking logic is used, such as get/set
acl.

You may ask what if the first time we got a PR lock, and the second time
we want a EX lock? fortunately, this case never happens in the real
world, as far as I can see, including permission check,
(get|set)_(acl|attr), and the gfs2 code also do so.

[sfr@canb.auug.org.au remove some inlines]
Link: http://lkml.kernel.org/r/20170117100948.11657-2-zren@suse.com
Signed-off-by: Eric Ren &lt;zren@suse.com&gt;
Reviewed-by: Junxiao Bi &lt;junxiao.bi@oracle.com&gt;
Reviewed-by: Joseph Qi &lt;jiangqi903@gmail.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Mark Fasheh &lt;mfasheh@versity.com&gt;
Cc: Joel Becker &lt;jlbec@evilplan.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 439a36b8ef38657f765b80b775e2885338d72451 ]

We are in the situation that we have to avoid recursive cluster locking,
but there is no way to check if a cluster lock has been taken by a precess
already.

Mostly, we can avoid recursive locking by writing code carefully.
However, we found that it's very hard to handle the routines that are
invoked directly by vfs code.  For instance:

  const struct inode_operations ocfs2_file_iops = {
      .permission     = ocfs2_permission,
      .get_acl        = ocfs2_iop_get_acl,
      .set_acl        = ocfs2_iop_set_acl,
  };

Both ocfs2_permission() and ocfs2_iop_get_acl() call ocfs2_inode_lock(PR):

  do_sys_open
   may_open
    inode_permission
     ocfs2_permission
      ocfs2_inode_lock() &lt;=== first time
       generic_permission
        get_acl
         ocfs2_iop_get_acl
  	ocfs2_inode_lock() &lt;=== recursive one

A deadlock will occur if a remote EX request comes in between two of
ocfs2_inode_lock().  Briefly describe how the deadlock is formed:

On one hand, OCFS2_LOCK_BLOCKED flag of this lockres is set in
BAST(ocfs2_generic_handle_bast) when downconvert is started on behalf of
the remote EX lock request.  Another hand, the recursive cluster lock
(the second one) will be blocked in in __ocfs2_cluster_lock() because of
OCFS2_LOCK_BLOCKED.  But, the downconvert never complete, why? because
there is no chance for the first cluster lock on this node to be
unlocked - we block ourselves in the code path.

The idea to fix this issue is mostly taken from gfs2 code.

1. introduce a new field: struct ocfs2_lock_res.l_holders, to keep track
   of the processes' pid who has taken the cluster lock of this lock
   resource;

2. introduce a new flag for ocfs2_inode_lock_full:
   OCFS2_META_LOCK_GETBH; it means just getting back disk inode bh for
   us if we've got cluster lock.

3. export a helper: ocfs2_is_locked_by_me() is used to check if we have
   got the cluster lock in the upper code path.

The tracking logic should be used by some of the ocfs2 vfs's callbacks,
to solve the recursive locking issue cuased by the fact that vfs
routines can call into each other.

The performance penalty of processing the holder list should only be
seen at a few cases where the tracking logic is used, such as get/set
acl.

You may ask what if the first time we got a PR lock, and the second time
we want a EX lock? fortunately, this case never happens in the real
world, as far as I can see, including permission check,
(get|set)_(acl|attr), and the gfs2 code also do so.

[sfr@canb.auug.org.au remove some inlines]
Link: http://lkml.kernel.org/r/20170117100948.11657-2-zren@suse.com
Signed-off-by: Eric Ren &lt;zren@suse.com&gt;
Reviewed-by: Junxiao Bi &lt;junxiao.bi@oracle.com&gt;
Reviewed-by: Joseph Qi &lt;jiangqi903@gmail.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Mark Fasheh &lt;mfasheh@versity.com&gt;
Cc: Joel Becker &lt;jlbec@evilplan.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
