<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/mm/filemap.c, branch v6.2.5</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>mm/filemap: fix page end in filemap_get_read_batch</title>
<updated>2023-02-17T02:11:58+00:00</updated>
<author>
<name>Qian Yingjin</name>
<email>qian@ddn.com</email>
</author>
<published>2023-02-08T02:24:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5956592ce337330cdff0399a6f8b6a5aea397a8e'/>
<id>5956592ce337330cdff0399a6f8b6a5aea397a8e</id>
<content type='text'>
I was running traces of the read code against an RAID storage system to
understand why read requests were being misaligned against the underlying
RAID strips.  I found that the page end offset calculation in
filemap_get_read_batch() was off by one.

When a read is submitted with end offset 1048575, then it calculates the
end page for read of 256 when it should be 255.  "last_index" is the index
of the page beyond the end of the read and it should be skipped when get a
batch of pages for read in @filemap_get_read_batch().

The below simple patch fixes the problem.  This code was introduced in
kernel 5.12.

Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com
Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read")
Signed-off-by: Qian Yingjin &lt;qian@ddn.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
I was running traces of the read code against an RAID storage system to
understand why read requests were being misaligned against the underlying
RAID strips.  I found that the page end offset calculation in
filemap_get_read_batch() was off by one.

When a read is submitted with end offset 1048575, then it calculates the
end page for read of 256 when it should be 255.  "last_index" is the index
of the page beyond the end of the read and it should be skipped when get a
batch of pages for read in @filemap_get_read_batch().

The below simple patch fixes the problem.  This code was introduced in
kernel 5.12.

Link: https://lkml.kernel.org/r/20230208022400.28962-1-coolqyj@163.com
Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read")
Signed-off-by: Qian Yingjin &lt;qian@ddn.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>filemap: convert replace_page_cache_page() to replace_page_cache_folio()</title>
<updated>2022-12-12T02:12:12+00:00</updated>
<author>
<name>Vishal Moola (Oracle)</name>
<email>vishal.moola@gmail.com</email>
</author>
<published>2022-11-01T17:53:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3720dd6dcac38d03424d6ba38107f39af5318bcf'/>
<id>3720dd6dcac38d03424d6ba38107f39af5318bcf</id>
<content type='text'>
Patch series "Removing the lru_cache_add() wrapper".

This patchset replaces all calls of lru_cache_add() with the folio
equivalent: folio_add_lru().  This is allows us to get rid of the wrapper
The series passes xfstests and the userfaultfd selftests.


This patch (of 5):

Eliminates 7 calls to compound_head().

Link: https://lkml.kernel.org/r/20221101175326.13265-1-vishal.moola@gmail.com
Link: https://lkml.kernel.org/r/20221101175326.13265-2-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "Removing the lru_cache_add() wrapper".

This patchset replaces all calls of lru_cache_add() with the folio
equivalent: folio_add_lru().  This is allows us to get rid of the wrapper
The series passes xfstests and the userfaultfd selftests.


This patch (of 5):

Eliminates 7 calls to compound_head().

Link: https://lkml.kernel.org/r/20221101175326.13265-1-vishal.moola@gmail.com
Link: https://lkml.kernel.org/r/20221101175326.13265-2-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>filemap: skip write and wait if end offset precedes start</title>
<updated>2022-12-12T02:12:10+00:00</updated>
<author>
<name>Brian Foster</name>
<email>bfoster@redhat.com</email>
</author>
<published>2022-11-28T15:56:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=feeb9b26952367bc1171592ee476f95aa81ee588'/>
<id>feeb9b26952367bc1171592ee476f95aa81ee588</id>
<content type='text'>
Patch series "filemap: skip write and wait if end offset precedes start",
v2.

A fix for the odd write and wait behavior described in the patch 1 commit
log.  Technically patch 1 could simply remove the check rather than lift
it into the callers, but this seemed a bit more user friendly to me. 
Patch 2 is appended after observation that fadvise() interacted poorly
with the v1 patch.  This is no longer a problem with v2, making patch 2
purely a cleanup.

This series survived both fstests and ltp regression runs without
observable problems.  I had (end &lt; start) warning checks in each relevant
function, with fadvise() being the only caller that triggered them.  That
said, I dropped the warnings after testing because there seemed to much
potential for noise from the various other callers.


This patch (of 2):

A call to file[map]_write_and_wait_range() with an end offset that
precedes the start offset but happens to land in the same page can trigger
writeback submission but fails to wait on the submitted page.  Writeback
submission occurs because __filemap_fdatawrite_range() passes both offsets
down into write_cache_pages(), which rounds down to page indexes before it
starts processing writeback.  However, __filemap_fdatawait_range()
immediately returns if the byte-granular end offset precedes the start
offset.

This behavior was observed in the form of unpredictable latency from a
frequent write and wait call with incorrect parameters.  The behavior gave
the impression that the fdatawait path might occasionally fail to wait on
writeback, but further investigation showed the latency was from
write_cache_pages() waiting on writeback state to clear for a page already
under writeback.  Therefore, this indicated that fdatawait actually never
waits on writeback in this particular situation.

The byte granular check in __filemap_fdatawait_range() goes all the way
back to the old wait_on_page_writeback() helper.  It originally used page
offsets and so would have waited in this problematic case.  That changed
to byte granularity file offsets in commit 94004ed726f3 ("kill
wait_on_page_writeback_range"), which subtly changed this behavior.  The
check itself has become somewhat redundant since the error checking code
that used to follow the wait loop (at the time of the aforementioned
commit) has now been removed and lifted into the higher level callers.

Therefore, we can restore historical fdatawait behavior by simply removing
the check.  Since the current fdatawait behavior has been in place for
quite some time and is consistent with other interfaces that use file
offsets, instead lift the check into the file[map]_write_and_wait_range()
helpers to provide consistent behavior between the write and wait.

Link: https://lkml.kernel.org/r/20221128155632.3950447-1-bfoster@redhat.com
Link: https://lkml.kernel.org/r/20221128155632.3950447-2-bfoster@redhat.com
Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "filemap: skip write and wait if end offset precedes start",
v2.

A fix for the odd write and wait behavior described in the patch 1 commit
log.  Technically patch 1 could simply remove the check rather than lift
it into the callers, but this seemed a bit more user friendly to me. 
Patch 2 is appended after observation that fadvise() interacted poorly
with the v1 patch.  This is no longer a problem with v2, making patch 2
purely a cleanup.

This series survived both fstests and ltp regression runs without
observable problems.  I had (end &lt; start) warning checks in each relevant
function, with fadvise() being the only caller that triggered them.  That
said, I dropped the warnings after testing because there seemed to much
potential for noise from the various other callers.


This patch (of 2):

A call to file[map]_write_and_wait_range() with an end offset that
precedes the start offset but happens to land in the same page can trigger
writeback submission but fails to wait on the submitted page.  Writeback
submission occurs because __filemap_fdatawrite_range() passes both offsets
down into write_cache_pages(), which rounds down to page indexes before it
starts processing writeback.  However, __filemap_fdatawait_range()
immediately returns if the byte-granular end offset precedes the start
offset.

This behavior was observed in the form of unpredictable latency from a
frequent write and wait call with incorrect parameters.  The behavior gave
the impression that the fdatawait path might occasionally fail to wait on
writeback, but further investigation showed the latency was from
write_cache_pages() waiting on writeback state to clear for a page already
under writeback.  Therefore, this indicated that fdatawait actually never
waits on writeback in this particular situation.

The byte granular check in __filemap_fdatawait_range() goes all the way
back to the old wait_on_page_writeback() helper.  It originally used page
offsets and so would have waited in this problematic case.  That changed
to byte granularity file offsets in commit 94004ed726f3 ("kill
wait_on_page_writeback_range"), which subtly changed this behavior.  The
check itself has become somewhat redundant since the error checking code
that used to follow the wait loop (at the time of the aforementioned
commit) has now been removed and lifted into the higher level callers.

Therefore, we can restore historical fdatawait behavior by simply removing
the check.  Since the current fdatawait behavior has been in place for
quite some time and is consistent with other interfaces that use file
offsets, instead lift the check into the file[map]_write_and_wait_range()
helpers to provide consistent behavior between the write and wait.

Link: https://lkml.kernel.org/r/20221128155632.3950447-1-bfoster@redhat.com
Link: https://lkml.kernel.org/r/20221128155632.3950447-2-bfoster@redhat.com
Signed-off-by: Brian Foster &lt;bfoster@redhat.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>filemap: find_get_entries() now updates start offset</title>
<updated>2022-11-09T01:37:12+00:00</updated>
<author>
<name>Vishal Moola (Oracle)</name>
<email>vishal.moola@gmail.com</email>
</author>
<published>2022-10-17T16:18:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9fb6beea79c6e7c959adf4fb7b94cf9a6028b941'/>
<id>9fb6beea79c6e7c959adf4fb7b94cf9a6028b941</id>
<content type='text'>
Initially, find_get_entries() was being passed in the start offset as a
value.  That left the calculation of the offset to the callers.  This led
to complexity in the callers trying to keep track of the index.

Now find_get_entries() takes in a pointer to the start offset and updates
the value to be directly after the last entry found.  If no entry is
found, the offset is not changed.  This gets rid of multiple hacky
calculations that kept track of the start offset.

Link: https://lkml.kernel.org/r/20221017161800.2003-3-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Initially, find_get_entries() was being passed in the start offset as a
value.  That left the calculation of the offset to the callers.  This led
to complexity in the callers trying to keep track of the index.

Now find_get_entries() takes in a pointer to the start offset and updates
the value to be directly after the last entry found.  If no entry is
found, the offset is not changed.  This gets rid of multiple hacky
calculations that kept track of the start offset.

Link: https://lkml.kernel.org/r/20221017161800.2003-3-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>filemap: find_lock_entries() now updates start offset</title>
<updated>2022-11-09T01:37:12+00:00</updated>
<author>
<name>Vishal Moola (Oracle)</name>
<email>vishal.moola@gmail.com</email>
</author>
<published>2022-10-17T16:17:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3392ca121872dd8c33015c7703d4981c78819be3'/>
<id>3392ca121872dd8c33015c7703d4981c78819be3</id>
<content type='text'>
Patch series "Rework find_get_entries() and find_lock_entries()", v3.

Originally the callers of find_get_entries() and find_lock_entries() were
keeping track of the start index themselves as they traverse the search
range.

This resulted in hacky code such as in shmem_undo_range():

			index = folio-&gt;index + folio_nr_pages(folio) - 1;

where the - 1 is only present to stay in the right spot after incrementing
index later.  This sort of calculation was also being done on every folio
despite not even using index later within that function.

These patches change find_get_entries() and find_lock_entries() to
calculate the new index instead of leaving it to the callers so we can
avoid all these complications.


This patch (of 2):

Initially, find_lock_entries() was being passed in the start offset as a
value.  That left the calculation of the offset to the callers.  This led
to complexity in the callers trying to keep track of the index.

Now find_lock_entries() takes in a pointer to the start offset and updates
the value to be directly after the last entry found.  If no entry is
found, the offset is not changed.  This gets rid of multiple hacky
calculations that kept track of the start offset.

Link: https://lkml.kernel.org/r/20221017161800.2003-1-vishal.moola@gmail.com
Link: https://lkml.kernel.org/r/20221017161800.2003-2-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "Rework find_get_entries() and find_lock_entries()", v3.

Originally the callers of find_get_entries() and find_lock_entries() were
keeping track of the start index themselves as they traverse the search
range.

This resulted in hacky code such as in shmem_undo_range():

			index = folio-&gt;index + folio_nr_pages(folio) - 1;

where the - 1 is only present to stay in the right spot after incrementing
index later.  This sort of calculation was also being done on every folio
despite not even using index later within that function.

These patches change find_get_entries() and find_lock_entries() to
calculate the new index instead of leaving it to the callers so we can
avoid all these complications.


This patch (of 2):

Initially, find_lock_entries() was being passed in the start offset as a
value.  That left the calculation of the offset to the callers.  This led
to complexity in the callers trying to keep track of the index.

Now find_lock_entries() takes in a pointer to the start offset and updates
the value to be directly after the last entry found.  If no entry is
found, the offset is not changed.  This gets rid of multiple hacky
calculations that kept track of the start offset.

Link: https://lkml.kernel.org/r/20221017161800.2003-1-vishal.moola@gmail.com
Link: https://lkml.kernel.org/r/20221017161800.2003-2-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2022-10-11T00:53:04+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-11T00:53:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=27bc50fc90647bbf7b734c3fc306a5e61350da53'/>
<id>27bc50fc90647bbf7b734c3fc306a5e61350da53</id>
<content type='text'>
Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP &amp; KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock-&gt;vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP &amp; KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock-&gt;vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: fs: initialize fsdata passed to write_begin/write_end interface</title>
<updated>2022-10-03T21:03:25+00:00</updated>
<author>
<name>Alexander Potapenko</name>
<email>glider@google.com</email>
</author>
<published>2022-09-15T15:04:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1468c6f4558b1bcd92aa0400f2920f9dc7588402'/>
<id>1468c6f4558b1bcd92aa0400f2920f9dc7588402</id>
<content type='text'>
Functions implementing the a_ops-&gt;write_end() interface accept the `void
*fsdata` parameter that is supposed to be initialized by the corresponding
a_ops-&gt;write_begin() (which accepts `void **fsdata`).

However not all a_ops-&gt;write_begin() implementations initialize `fsdata`
unconditionally, so it may get passed uninitialized to a_ops-&gt;write_end(),
resulting in undefined behavior.

Fix this by initializing fsdata with NULL before the call to
write_begin(), rather than doing so in all possible a_ops implementations.

This patch covers only the following cases found by running x86 KMSAN
under syzkaller:

 - generic_perform_write()
 - cont_expand_zero() and generic_cont_expand_simple()
 - page_symlink()

Other cases of passing uninitialized fsdata may persist in the codebase.

Link: https://lkml.kernel.org/r/20220915150417.722975-43-glider@google.com
Signed-off-by: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Andrey Konovalov &lt;andreyknvl@gmail.com&gt;
Cc: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Eric Biggers &lt;ebiggers@google.com&gt;
Cc: Eric Biggers &lt;ebiggers@kernel.org&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Cc: Ilya Leoshkevich &lt;iii@linux.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Marco Elver &lt;elver@google.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Functions implementing the a_ops-&gt;write_end() interface accept the `void
*fsdata` parameter that is supposed to be initialized by the corresponding
a_ops-&gt;write_begin() (which accepts `void **fsdata`).

However not all a_ops-&gt;write_begin() implementations initialize `fsdata`
unconditionally, so it may get passed uninitialized to a_ops-&gt;write_end(),
resulting in undefined behavior.

Fix this by initializing fsdata with NULL before the call to
write_begin(), rather than doing so in all possible a_ops implementations.

This patch covers only the following cases found by running x86 KMSAN
under syzkaller:

 - generic_perform_write()
 - cont_expand_zero() and generic_cont_expand_simple()
 - page_symlink()

Other cases of passing uninitialized fsdata may persist in the codebase.

Link: https://lkml.kernel.org/r/20220915150417.722975-43-glider@google.com
Signed-off-by: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Andrey Konovalov &lt;andreyknvl@gmail.com&gt;
Cc: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Eric Biggers &lt;ebiggers@google.com&gt;
Cc: Eric Biggers &lt;ebiggers@kernel.org&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Cc: Ilya Leoshkevich &lt;iii@linux.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Marco Elver &lt;elver@google.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Cc: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/filemap: make folio_put_wait_locked static</title>
<updated>2022-10-03T21:03:15+00:00</updated>
<author>
<name>Ke Sun</name>
<email>sunke@kylinos.cn</email>
</author>
<published>2022-09-14T02:17:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c195c3215741746b1eb7ab7980b926ddc37a4be3'/>
<id>c195c3215741746b1eb7ab7980b926ddc37a4be3</id>
<content type='text'>
It's only used in mm/filemap.c, since commit &lt;ffa65753c431&gt;
("mm/migrate.c: rework migration_entry_wait() to not take a pageref").

Make it static.

Link: https://lkml.kernel.org/r/20220914021738.3228011-1-sunke@kylinos.cn
Signed-off-by: Ke Sun &lt;sunke@kylinos.cn&gt;
Reported-by: k2ci &lt;kernel-bot@kylinos.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's only used in mm/filemap.c, since commit &lt;ffa65753c431&gt;
("mm/migrate.c: rework migration_entry_wait() to not take a pageref").

Make it static.

Link: https://lkml.kernel.org/r/20220914021738.3228011-1-sunke@kylinos.cn
Signed-off-by: Ke Sun &lt;sunke@kylinos.cn&gt;
Reported-by: k2ci &lt;kernel-bot@kylinos.cn&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>filemap: convert filemap_range_has_writeback() to use folios</title>
<updated>2022-10-03T21:02:56+00:00</updated>
<author>
<name>Vishal Moola (Oracle)</name>
<email>vishal.moola@gmail.com</email>
</author>
<published>2022-09-05T21:45:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b05f41a1aa56fd646f2aa048ee446b6a2edb80d3'/>
<id>b05f41a1aa56fd646f2aa048ee446b6a2edb80d3</id>
<content type='text'>
Removes 3 calls to compound_head().

Link: https://lkml.kernel.org/r/20220905214557.868606-1-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Removes 3 calls to compound_head().

Link: https://lkml.kernel.org/r/20220905214557.868606-1-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) &lt;vishal.moola@gmail.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>delayacct: support re-entrance detection of thrashing accounting</title>
<updated>2022-09-27T02:46:07+00:00</updated>
<author>
<name>Yang Yang</name>
<email>yang.yang29@zte.com.cn</email>
</author>
<published>2022-08-15T07:11:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=aa1cf99b87e934e761b46ce2b925335a398980da'/>
<id>aa1cf99b87e934e761b46ce2b925335a398980da</id>
<content type='text'>
Once upon a time, we only support accounting thrashing of page cache. 
Then Joonsoo introduced workingset detection for anonymous pages and we
gained the ability to account thrashing of them[1].

For page cache thrashing accounting, there is no suitable place to do it
in fs level likes swap_readpage().  So we have to do it in
folio_wait_bit_common().

Then for anonymous pages thrashing accounting, we have to do it in both
swap_readpage() and folio_wait_bit_common().  This likes PSI, so we should
let thrashing accounting supports re-entrance detection.

This patch is to prepare complete thrashing accounting, and is based on
patch "filemap: make the accounting of thrashing more consistent".

[1] commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")

Link: https://lkml.kernel.org/r/20220815071134.74551-1-yang.yang29@zte.com.cn
Signed-off-by: Yang Yang &lt;yang.yang29@zte.com.cn&gt;
Signed-off-by: CGEL ZTE &lt;cgel.zte@gmail.com&gt;
Reviewed-by: Ran Xiaokai &lt;ran.xiaokai@zte.com.cn&gt;
Reviewed-by: wangyong &lt;wang.yong12@zte.com.cn&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Once upon a time, we only support accounting thrashing of page cache. 
Then Joonsoo introduced workingset detection for anonymous pages and we
gained the ability to account thrashing of them[1].

For page cache thrashing accounting, there is no suitable place to do it
in fs level likes swap_readpage().  So we have to do it in
folio_wait_bit_common().

Then for anonymous pages thrashing accounting, we have to do it in both
swap_readpage() and folio_wait_bit_common().  This likes PSI, so we should
let thrashing accounting supports re-entrance detection.

This patch is to prepare complete thrashing accounting, and is based on
patch "filemap: make the accounting of thrashing more consistent".

[1] commit aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")

Link: https://lkml.kernel.org/r/20220815071134.74551-1-yang.yang29@zte.com.cn
Signed-off-by: Yang Yang &lt;yang.yang29@zte.com.cn&gt;
Signed-off-by: CGEL ZTE &lt;cgel.zte@gmail.com&gt;
Reviewed-by: Ran Xiaokai &lt;ran.xiaokai@zte.com.cn&gt;
Reviewed-by: wangyong &lt;wang.yong12@zte.com.cn&gt;
Acked-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
