<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git, branch v6.4.8</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Linux 6.4.8</title>
<updated>2023-08-03T08:26:15+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2023-08-03T08:26:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=714a286bf9ee3740260c61471ed72d10bd17336a'/>
<id>714a286bf9ee3740260c61471ed72d10bd17336a</id>
<content type='text'>
Link: https://lore.kernel.org/r/20230801091925.659598007@linuxfoundation.org
Tested-by: Ronald Warsow &lt;rwarsow@gmx.de&gt;
Tested-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Tested-by: Conor Dooley &lt;conor.dooley@microchip.com&gt;
Tested-by: SeongJae Park &lt;sj@kernel.org&gt;
Tested-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
Tested-by: Florian Fainelli &lt;florian.fainelli@broadcom.com&gt;
Tested-by: Ron Economos &lt;re@w6rz.net&gt;
Tested-by: Bagas Sanjaya &lt;bagasdotme@gmail.com&gt;
Link: https://lore.kernel.org/r/20230802065501.780725463@linuxfoundation.org
Tested-by: Chris Paterson (CIP) &lt;chris.paterson2@renesas.com&gt;
Tested-by: Ronald Warsow &lt;rwarsow@gmx.de&gt;
Tested-by: SeongJae Park &lt;sj@kernel.org&gt;
Tested-by: Ron Economos &lt;re@w6rz.net&gt;
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Tested-by: Florian Fainelli &lt;florian.fainelli@broadcom.com&gt;
Tested-by: Bagas Sanjaya &lt;bagasdotme@gmail.com&gt;
Tested-by: Linux Kernel Functional Testing &lt;lkft@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Link: https://lore.kernel.org/r/20230801091925.659598007@linuxfoundation.org
Tested-by: Ronald Warsow &lt;rwarsow@gmx.de&gt;
Tested-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Tested-by: Conor Dooley &lt;conor.dooley@microchip.com&gt;
Tested-by: SeongJae Park &lt;sj@kernel.org&gt;
Tested-by: Shuah Khan &lt;skhan@linuxfoundation.org&gt;
Tested-by: Florian Fainelli &lt;florian.fainelli@broadcom.com&gt;
Tested-by: Ron Economos &lt;re@w6rz.net&gt;
Tested-by: Bagas Sanjaya &lt;bagasdotme@gmail.com&gt;
Link: https://lore.kernel.org/r/20230802065501.780725463@linuxfoundation.org
Tested-by: Chris Paterson (CIP) &lt;chris.paterson2@renesas.com&gt;
Tested-by: Ronald Warsow &lt;rwarsow@gmx.de&gt;
Tested-by: SeongJae Park &lt;sj@kernel.org&gt;
Tested-by: Ron Economos &lt;re@w6rz.net&gt;
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Tested-by: Florian Fainelli &lt;florian.fainelli@broadcom.com&gt;
Tested-by: Bagas Sanjaya &lt;bagasdotme@gmail.com&gt;
Tested-by: Linux Kernel Functional Testing &lt;lkft@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-buf: fix an error pointer vs NULL bug</title>
<updated>2023-08-03T08:26:14+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>dan.carpenter@linaro.org</email>
</author>
<published>2023-07-06T12:37:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3d308a163115e275eb18366beb45b8c7c513a822'/>
<id>3d308a163115e275eb18366beb45b8c7c513a822</id>
<content type='text'>
commit 00ae1491f970acc454be0df63f50942d94825860 upstream.

Smatch detected potential error pointer dereference.

    drivers/gpu/drm/drm_syncobj.c:888 drm_syncobj_transfer_to_timeline()
    error: 'fence' dereferencing possible ERR_PTR()

The error pointer comes from dma_fence_allocate_private_stub().  One
caller expected error pointers and one expected NULL pointers.  Change
it to return NULL and update the caller which expected error pointers,
drm_syncobj_assign_null_handle(), to check for NULL instead.

Fixes: f781f661e8c9 ("dma-buf: keep the signaling time of merged fences v3")
Signed-off-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Signed-off-by: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/b09f1996-3838-4fa2-9193-832b68262e43@moroto.mountain
Cc: Jindong Yue &lt;jindong.yue@nxp.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 00ae1491f970acc454be0df63f50942d94825860 upstream.

Smatch detected potential error pointer dereference.

    drivers/gpu/drm/drm_syncobj.c:888 drm_syncobj_transfer_to_timeline()
    error: 'fence' dereferencing possible ERR_PTR()

The error pointer comes from dma_fence_allocate_private_stub().  One
caller expected error pointers and one expected NULL pointers.  Change
it to return NULL and update the caller which expected error pointers,
drm_syncobj_assign_null_handle(), to check for NULL instead.

Fixes: f781f661e8c9 ("dma-buf: keep the signaling time of merged fences v3")
Signed-off-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Signed-off-by: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/b09f1996-3838-4fa2-9193-832b68262e43@moroto.mountain
Cc: Jindong Yue &lt;jindong.yue@nxp.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-buf: keep the signaling time of merged fences v3</title>
<updated>2023-08-03T08:26:14+00:00</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2023-06-13T08:09:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6bd9952eade5bbcd099083ca782ad674f6b32637'/>
<id>6bd9952eade5bbcd099083ca782ad674f6b32637</id>
<content type='text'>
commit f781f661e8c99b0cb34129f2e374234d61864e77 upstream.

Some Android CTS is testing if the signaling time keeps consistent
during merges.

v2: use the current time if the fence is still in the signaling path and
the timestamp not yet available.
v3: improve comment, fix one more case to use the correct timestamp

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20230630120041.109216-1-christian.koenig@amd.com
Cc: Jindong Yue &lt;jindong.yue@nxp.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f781f661e8c99b0cb34129f2e374234d61864e77 upstream.

Some Android CTS is testing if the signaling time keeps consistent
during merges.

v2: use the current time if the fence is still in the signaling path and
the timestamp not yet available.
v3: improve comment, fix one more case to use the correct timestamp

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20230630120041.109216-1-christian.koenig@amd.com
Cc: Jindong Yue &lt;jindong.yue@nxp.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/mempolicy: Take VMA lock before replacing policy</title>
<updated>2023-08-03T08:26:14+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2023-07-28T04:13:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e872d6b6ea4947fb87f0d6ea1ef814019dbed89e'/>
<id>e872d6b6ea4947fb87f0d6ea1ef814019dbed89e</id>
<content type='text'>
commit 6c21e066f9256ea1df6f88768f6ae1080b7cf509 upstream.

mbind() calls down into vma_replace_policy() without taking the per-VMA
locks, replaces the VMA's vma-&gt;vm_policy pointer, and frees the old
policy.  That's bad; a concurrent page fault might still be using the
old policy (in vma_alloc_folio()), resulting in use-after-free.

Normally this will manifest as a use-after-free read first, but it can
result in memory corruption, including because vma_alloc_folio() can
call mpol_cond_put() on the freed policy, which conditionally changes
the policy's refcount member.

This bug is specific to CONFIG_NUMA, but it does also affect non-NUMA
systems as long as the kernel was built with CONFIG_NUMA.

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6c21e066f9256ea1df6f88768f6ae1080b7cf509 upstream.

mbind() calls down into vma_replace_policy() without taking the per-VMA
locks, replaces the VMA's vma-&gt;vm_policy pointer, and frees the old
policy.  That's bad; a concurrent page fault might still be using the
old policy (in vma_alloc_folio()), resulting in use-after-free.

Normally this will manifest as a use-after-free read first, but it can
result in memory corruption, including because vma_alloc_folio() can
call mpol_cond_put() on the freed policy, which conditionally changes
the policy's refcount member.

This bug is specific to CONFIG_NUMA, but it does also affect non-NUMA
systems as long as the kernel was built with CONFIG_NUMA.

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/memory-failure: fix hardware poison check in unpoison_memory()</title>
<updated>2023-08-03T08:26:14+00:00</updated>
<author>
<name>Sidhartha Kumar</name>
<email>sidhartha.kumar@oracle.com</email>
</author>
<published>2023-07-17T18:18:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=da84cd9b5e03464c177cabf9692853708626456f'/>
<id>da84cd9b5e03464c177cabf9692853708626456f</id>
<content type='text'>
commit 6c54312f9689fbe27c70db5d42eebd29d04b672e upstream.

It was pointed out[1] that using folio_test_hwpoison() is wrong as we need
to check the indiviual page that has poison.  folio_test_hwpoison() only
checks the head page so go back to using PageHWPoison().

User-visible effects include existing hwpoison-inject tests possibly
failing as unpoisoning a single subpage could lead to unpoisoning an
entire folio.  Memory unpoisoning could also not work as expected as
the function will break early due to only checking the head page and
not the actually poisoned subpage.

[1]: https://lore.kernel.org/lkml/ZLIbZygG7LqSI9xe@casper.infradead.org/

Link: https://lkml.kernel.org/r/20230717181812.167757-1-sidhartha.kumar@oracle.com
Fixes: a6fddef49eef ("mm/memory-failure: convert unpoison_memory() to folios")
Signed-off-by: Sidhartha Kumar &lt;sidhartha.kumar@oracle.com&gt;
Reported-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Acked-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Reviewed-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 6c54312f9689fbe27c70db5d42eebd29d04b672e upstream.

It was pointed out[1] that using folio_test_hwpoison() is wrong as we need
to check the indiviual page that has poison.  folio_test_hwpoison() only
checks the head page so go back to using PageHWPoison().

User-visible effects include existing hwpoison-inject tests possibly
failing as unpoisoning a single subpage could lead to unpoisoning an
entire folio.  Memory unpoisoning could also not work as expected as
the function will break early due to only checking the head page and
not the actually poisoned subpage.

[1]: https://lore.kernel.org/lkml/ZLIbZygG7LqSI9xe@casper.infradead.org/

Link: https://lkml.kernel.org/r/20230717181812.167757-1-sidhartha.kumar@oracle.com
Fixes: a6fddef49eef ("mm/memory-failure: convert unpoison_memory() to folios")
Signed-off-by: Sidhartha Kumar &lt;sidhartha.kumar@oracle.com&gt;
Reported-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Acked-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Reviewed-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: fix memory ordering for mm_lock_seq and vm_lock_seq</title>
<updated>2023-08-03T08:26:14+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2023-07-21T22:51:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bdb3106af2b26e8f10dee7353f0ef63bf84fd6de'/>
<id>bdb3106af2b26e8f10dee7353f0ef63bf84fd6de</id>
<content type='text'>
commit b1f02b95758d05b799731d939e76a0bd6da312db upstream.

mm-&gt;mm_lock_seq effectively functions as a read/write lock; therefore it
must be used with acquire/release semantics.

A specific example is the interaction between userfaultfd_register() and
lock_vma_under_rcu().

userfaultfd_register() does the following from the point where it changes
a VMA's flags to the point where concurrent readers are permitted again
(in a simple scenario where only a single private VMA is accessed and no
merging/splitting is involved):

userfaultfd_register
  userfaultfd_set_vm_flags
    vm_flags_reset
      vma_start_write
        down_write(&amp;vma-&gt;vm_lock-&gt;lock)
        vma-&gt;vm_lock_seq = mm_lock_seq [marks VMA as busy]
        up_write(&amp;vma-&gt;vm_lock-&gt;lock)
      vm_flags_init
        [sets VM_UFFD_* in __vm_flags]
  vma-&gt;vm_userfaultfd_ctx.ctx = ctx
  mmap_write_unlock
    vma_end_write_all
      WRITE_ONCE(mm-&gt;mm_lock_seq, mm-&gt;mm_lock_seq + 1) [unlocks VMA]

There are no memory barriers in between the __vm_flags update and the
mm-&gt;mm_lock_seq update that unlocks the VMA, so the unlock can be
reordered to above the `vm_flags_init()` call, which means from the
perspective of a concurrent reader, a VMA can be marked as a userfaultfd
VMA while it is not VMA-locked.  That's bad, we definitely need a
store-release for the unlock operation.

The non-atomic write to vma-&gt;vm_lock_seq in vma_start_write() is mostly
fine because all accesses to vma-&gt;vm_lock_seq that matter are always
protected by the VMA lock.  There is a racy read in vma_start_read()
though that can tolerate false-positives, so we should be using
WRITE_ONCE() to keep things tidy and data-race-free (including for KCSAN).

On the other side, lock_vma_under_rcu() works as follows in the relevant
region for locking and userfaultfd check:

lock_vma_under_rcu
  vma_start_read
    vma-&gt;vm_lock_seq == READ_ONCE(vma-&gt;vm_mm-&gt;mm_lock_seq) [early bailout]
    down_read_trylock(&amp;vma-&gt;vm_lock-&gt;lock)
    vma-&gt;vm_lock_seq == READ_ONCE(vma-&gt;vm_mm-&gt;mm_lock_seq) [main check]
  userfaultfd_armed
    checks vma-&gt;vm_flags &amp; __VM_UFFD_FLAGS

Here, the interesting aspect is how far down the mm-&gt;mm_lock_seq read can
be reordered - if this read is reordered down below the vma-&gt;vm_flags
access, this could cause lock_vma_under_rcu() to partly operate on
information that was read while the VMA was supposed to be locked.  To
prevent this kind of downwards bleeding of the mm-&gt;mm_lock_seq read, we
need to read it with a load-acquire.

Some of the comment wording is based on suggestions by Suren.

BACKPORT WARNING: One of the functions changed by this patch (which I've
written against Linus' tree) is vma_try_start_write(), but this function
no longer exists in mm/mm-everything.  I don't know whether the merged
version of this patch will be ordered before or after the patch that
removes vma_try_start_write().  If you're backporting this patch to a tree
with vma_try_start_write(), make sure this patch changes that function.

Link: https://lkml.kernel.org/r/20230721225107.942336-1-jannh@google.com
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit b1f02b95758d05b799731d939e76a0bd6da312db upstream.

mm-&gt;mm_lock_seq effectively functions as a read/write lock; therefore it
must be used with acquire/release semantics.

A specific example is the interaction between userfaultfd_register() and
lock_vma_under_rcu().

userfaultfd_register() does the following from the point where it changes
a VMA's flags to the point where concurrent readers are permitted again
(in a simple scenario where only a single private VMA is accessed and no
merging/splitting is involved):

userfaultfd_register
  userfaultfd_set_vm_flags
    vm_flags_reset
      vma_start_write
        down_write(&amp;vma-&gt;vm_lock-&gt;lock)
        vma-&gt;vm_lock_seq = mm_lock_seq [marks VMA as busy]
        up_write(&amp;vma-&gt;vm_lock-&gt;lock)
      vm_flags_init
        [sets VM_UFFD_* in __vm_flags]
  vma-&gt;vm_userfaultfd_ctx.ctx = ctx
  mmap_write_unlock
    vma_end_write_all
      WRITE_ONCE(mm-&gt;mm_lock_seq, mm-&gt;mm_lock_seq + 1) [unlocks VMA]

There are no memory barriers in between the __vm_flags update and the
mm-&gt;mm_lock_seq update that unlocks the VMA, so the unlock can be
reordered to above the `vm_flags_init()` call, which means from the
perspective of a concurrent reader, a VMA can be marked as a userfaultfd
VMA while it is not VMA-locked.  That's bad, we definitely need a
store-release for the unlock operation.

The non-atomic write to vma-&gt;vm_lock_seq in vma_start_write() is mostly
fine because all accesses to vma-&gt;vm_lock_seq that matter are always
protected by the VMA lock.  There is a racy read in vma_start_read()
though that can tolerate false-positives, so we should be using
WRITE_ONCE() to keep things tidy and data-race-free (including for KCSAN).

On the other side, lock_vma_under_rcu() works as follows in the relevant
region for locking and userfaultfd check:

lock_vma_under_rcu
  vma_start_read
    vma-&gt;vm_lock_seq == READ_ONCE(vma-&gt;vm_mm-&gt;mm_lock_seq) [early bailout]
    down_read_trylock(&amp;vma-&gt;vm_lock-&gt;lock)
    vma-&gt;vm_lock_seq == READ_ONCE(vma-&gt;vm_mm-&gt;mm_lock_seq) [main check]
  userfaultfd_armed
    checks vma-&gt;vm_flags &amp; __VM_UFFD_FLAGS

Here, the interesting aspect is how far down the mm-&gt;mm_lock_seq read can
be reordered - if this read is reordered down below the vma-&gt;vm_flags
access, this could cause lock_vma_under_rcu() to partly operate on
information that was read while the VMA was supposed to be locked.  To
prevent this kind of downwards bleeding of the mm-&gt;mm_lock_seq read, we
need to read it with a load-acquire.

Some of the comment wording is based on suggestions by Suren.

BACKPORT WARNING: One of the functions changed by this patch (which I've
written against Linus' tree) is vma_try_start_write(), but this function
no longer exists in mm/mm-everything.  I don't know whether the merged
version of this patch will be ordered before or after the patch that
removes vma_try_start_write().  If you're backporting this patch to a tree
with vma_try_start_write(), make sure this patch changes that function.

Link: https://lkml.kernel.org/r/20230721225107.942336-1-jannh@google.com
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: lock VMA in dup_anon_vma() before setting -&gt;anon_vma</title>
<updated>2023-08-03T08:26:14+00:00</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2023-07-21T03:46:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b53745bdb03e0cd45765ec4f7613c0069dce47e5'/>
<id>b53745bdb03e0cd45765ec4f7613c0069dce47e5</id>
<content type='text'>
commit d8ab9f7b644a2c9b64de405c1953c905ff219dc9 upstream.

When VMAs are merged, dup_anon_vma() is called with `dst` pointing to the
VMA that is being expanded to cover the area previously occupied by
another VMA.  This currently happens while `dst` is not write-locked.

This means that, in the `src-&gt;anon_vma &amp;&amp; !dst-&gt;anon_vma` case, as soon as
the assignment `dst-&gt;anon_vma = src-&gt;anon_vma` has happened, concurrent
page faults can happen on `dst` under the per-VMA lock.  This is already
icky in itself, since such page faults can now install pages into `dst`
that are attached to an `anon_vma` that is not yet tied back to the
`anon_vma` with an `anon_vma_chain`.  But if `anon_vma_clone()` fails due
to an out-of-memory error, things get much worse: `anon_vma_clone()` then
reverts `dst-&gt;anon_vma` back to NULL, and `dst` remains completely
unconnected to the `anon_vma`, even though we can have pages in the area
covered by `dst` that point to the `anon_vma`.

This means the `anon_vma` of such pages can be freed while the pages are
still mapped into userspace, which leads to UAF when a helper like
folio_lock_anon_vma_read() tries to look up the anon_vma of such a page.

This theoretically is a security bug, but I believe it is really hard to
actually trigger as an unprivileged user because it requires that you can
make an order-0 GFP_KERNEL allocation fail, and the page allocator tries
pretty hard to prevent that.

I think doing the vma_start_write() call inside dup_anon_vma() is the most
straightforward fix for now.

For a kernel-assisted reproducer, see the notes section of the patch mail.

Link: https://lkml.kernel.org/r/20230721034643.616851-1-jannh@google.com
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit d8ab9f7b644a2c9b64de405c1953c905ff219dc9 upstream.

When VMAs are merged, dup_anon_vma() is called with `dst` pointing to the
VMA that is being expanded to cover the area previously occupied by
another VMA.  This currently happens while `dst` is not write-locked.

This means that, in the `src-&gt;anon_vma &amp;&amp; !dst-&gt;anon_vma` case, as soon as
the assignment `dst-&gt;anon_vma = src-&gt;anon_vma` has happened, concurrent
page faults can happen on `dst` under the per-VMA lock.  This is already
icky in itself, since such page faults can now install pages into `dst`
that are attached to an `anon_vma` that is not yet tied back to the
`anon_vma` with an `anon_vma_chain`.  But if `anon_vma_clone()` fails due
to an out-of-memory error, things get much worse: `anon_vma_clone()` then
reverts `dst-&gt;anon_vma` back to NULL, and `dst` remains completely
unconnected to the `anon_vma`, even though we can have pages in the area
covered by `dst` that point to the `anon_vma`.

This means the `anon_vma` of such pages can be freed while the pages are
still mapped into userspace, which leads to UAF when a helper like
folio_lock_anon_vma_read() tries to look up the anon_vma of such a page.

This theoretically is a security bug, but I believe it is really hard to
actually trigger as an unprivileged user because it requires that you can
make an order-0 GFP_KERNEL allocation fail, and the page allocator tries
pretty hard to prevent that.

I think doing the vma_start_write() call inside dup_anon_vma() is the most
straightforward fix for now.

For a kernel-assisted reproducer, see the notes section of the patch mail.

Link: https://lkml.kernel.org/r/20230721034643.616851-1-jannh@google.com
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: retrieve and check lock owner twice before blocklisting</title>
<updated>2023-08-03T08:26:14+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2023-07-22T18:28:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=58f4cbe0cc874fe0025352019238c6b4d59d0eba'/>
<id>58f4cbe0cc874fe0025352019238c6b4d59d0eba</id>
<content type='text'>
commit 588159009d5b7a09c3e5904cffddbe4a4e170301 upstream.

An attempt to acquire exclusive lock can race with the current lock
owner closing the image:

1. lock is held by client123, rbd_lock() returns -EBUSY
2. get_lock_owner_info() returns client123 instance details
3. client123 closes the image, lock is released
4. find_watcher() returns 0 as there is no matching watcher anymore
5. client123 instance gets erroneously blocklisted

Particularly impacted is mirror snapshot scheduler in snapshot-based
mirroring since it happens to open and close images a lot (images are
opened only for as long as it takes to take the next mirror snapshot,
the same client instance is used for all images).

To reduce the potential for erroneous blocklisting, retrieve the lock
owner again after find_watcher() returns 0.  If it's still there, make
sure it matches the previously detected lock owner.

Cc: stable@vger.kernel.org # f38cb9d9c204: rbd: make get_lock_owner_info() return a single locker or NULL
Cc: stable@vger.kernel.org # 8ff2c64c9765: rbd: harden get_lock_owner_info() a bit
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Dongsheng Yang &lt;dongsheng.yang@easystack.cn&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 588159009d5b7a09c3e5904cffddbe4a4e170301 upstream.

An attempt to acquire exclusive lock can race with the current lock
owner closing the image:

1. lock is held by client123, rbd_lock() returns -EBUSY
2. get_lock_owner_info() returns client123 instance details
3. client123 closes the image, lock is released
4. find_watcher() returns 0 as there is no matching watcher anymore
5. client123 instance gets erroneously blocklisted

Particularly impacted is mirror snapshot scheduler in snapshot-based
mirroring since it happens to open and close images a lot (images are
opened only for as long as it takes to take the next mirror snapshot,
the same client instance is used for all images).

To reduce the potential for erroneous blocklisting, retrieve the lock
owner again after find_watcher() returns 0.  If it's still there, make
sure it matches the previously detected lock owner.

Cc: stable@vger.kernel.org # f38cb9d9c204: rbd: make get_lock_owner_info() return a single locker or NULL
Cc: stable@vger.kernel.org # 8ff2c64c9765: rbd: harden get_lock_owner_info() a bit
Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Dongsheng Yang &lt;dongsheng.yang@easystack.cn&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: harden get_lock_owner_info() a bit</title>
<updated>2023-08-03T08:26:13+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2023-07-08T14:16:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=44f442aa19f0c992728c5af79f6aa9e6df2ad01a'/>
<id>44f442aa19f0c992728c5af79f6aa9e6df2ad01a</id>
<content type='text'>
commit 8ff2c64c9765446c3cef804fb99da04916603e27 upstream.

- we want the exclusive lock type, so test for it directly
- use sscanf() to actually parse the lock cookie and avoid admitting
  invalid handles
- bail if locker has a blank address

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Dongsheng Yang &lt;dongsheng.yang@easystack.cn&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 8ff2c64c9765446c3cef804fb99da04916603e27 upstream.

- we want the exclusive lock type, so test for it directly
- use sscanf() to actually parse the lock cookie and avoid admitting
  invalid handles
- bail if locker has a blank address

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Dongsheng Yang &lt;dongsheng.yang@easystack.cn&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rbd: make get_lock_owner_info() return a single locker or NULL</title>
<updated>2023-08-03T08:26:13+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2023-06-30T11:52:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0a83f816a8f1ed381907c99107652832965308bd'/>
<id>0a83f816a8f1ed381907c99107652832965308bd</id>
<content type='text'>
commit f38cb9d9c2045dad16eead4a2e1aedfddd94603b upstream.

Make the "num_lockers can be only 0 or 1" assumption explicit and
simplify the API by getting rid of output parameters in preparation
for calling get_lock_owner_info() twice before blocklisting.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Dongsheng Yang &lt;dongsheng.yang@easystack.cn&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f38cb9d9c2045dad16eead4a2e1aedfddd94603b upstream.

Make the "num_lockers can be only 0 or 1" assumption explicit and
simplify the API by getting rid of output parameters in preparation
for calling get_lock_owner_info() twice before blocklisting.

Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Dongsheng Yang &lt;dongsheng.yang@easystack.cn&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
