<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/mm/oom_kill.c, branch v6.5</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>mm, oom: do not check 0 mask in out_of_memory()</title>
<updated>2023-06-09T23:25:20+00:00</updated>
<author>
<name>Haifeng Xu</name>
<email>haifeng.xu@shopee.com</email>
</author>
<published>2023-05-08T07:35:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4822acb1369637938c1252d534d3356c5e313cde'/>
<id>4822acb1369637938c1252d534d3356c5e313cde</id>
<content type='text'>
Since commit 60e2793d440a ("mm, oom: do not trigger out_of_memory from the
#PF"), no user sets gfp_mask to 0.  Remove the 0 mask check and update the
comments.

Link: https://lkml.kernel.org/r/20230508073538.1168-1-haifeng.xu@shopee.com
Signed-off-by: Haifeng Xu &lt;haifeng.xu@shopee.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since commit 60e2793d440a ("mm, oom: do not trigger out_of_memory from the
#PF"), no user sets gfp_mask to 0.  Remove the 0 mask check and update the
comments.

Link: https://lkml.kernel.org/r/20230508073538.1168-1-haifeng.xu@shopee.com
Signed-off-by: Haifeng Xu &lt;haifeng.xu@shopee.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/mmu_notifier: remove unused mmu_notifier_range_update_to_read_only export</title>
<updated>2023-02-03T06:32:54+00:00</updated>
<author>
<name>Alistair Popple</name>
<email>apopple@nvidia.com</email>
</author>
<published>2023-01-10T02:57:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7d4a8be0c4b2b7ffb367929d2b352651f083806b'/>
<id>7d4a8be0c4b2b7ffb367929d2b352651f083806b</id>
<content type='text'>
mmu_notifier_range_update_to_read_only() was originally introduced in
commit c6d23413f81b ("mm/mmu_notifier:
mmu_notifier_range_update_to_read_only() helper") as an optimisation for
device drivers that know a range has only been mapped read-only.  However
there are no users of this feature so remove it.  As it is the only user
of the struct mmu_notifier_range.vma field remove that also.

Link: https://lkml.kernel.org/r/20230110025722.600912-1-apopple@nvidia.com
Signed-off-by: Alistair Popple &lt;apopple@nvidia.com&gt;
Acked-by: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
mmu_notifier_range_update_to_read_only() was originally introduced in
commit c6d23413f81b ("mm/mmu_notifier:
mmu_notifier_range_update_to_read_only() helper") as an optimisation for
device drivers that know a range has only been mapped read-only.  However
there are no users of this feature so remove it.  As it is the only user
of the struct mmu_notifier_range.vma field remove that also.

Link: https://lkml.kernel.org/r/20230110025722.600912-1-apopple@nvidia.com
Signed-off-by: Alistair Popple &lt;apopple@nvidia.com&gt;
Acked-by: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: reduce noise in show_mem for lowmem allocations</title>
<updated>2022-09-27T02:46:29+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2022-08-23T09:22:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=974f4367dd315acc15ad4a6453f8304aea60dfbd'/>
<id>974f4367dd315acc15ad4a6453f8304aea60dfbd</id>
<content type='text'>
While discussing early DMA pool pre-allocation failure with Christoph [1]
I have realized that the allocation failure warning is rather noisy for
constrained allocations like GFP_DMA{32}.  Those zones are usually not
populated on all nodes very often as their memory ranges are constrained.

This is an attempt to reduce the ballast that doesn't provide any relevant
information for those allocation failures investigation.  Please note that
I have only compile tested it (in my default config setup) and I am
throwing it mostly to see what people think about it.

[1] http://lkml.kernel.org/r/20220817060647.1032426-1-hch@lst.de

[mhocko@suse.com: update]
  Link: https://lkml.kernel.org/r/Yw29bmJTIkKogTiW@dhcp22.suse.cz
[mhocko@suse.com: fix build]
[akpm@linux-foundation.org: fix it for mapletree]
[akpm@linux-foundation.org: update it for Michal's update]
[mhocko@suse.com: fix arch/powerpc/xmon/xmon.c]
  Link: https://lkml.kernel.org/r/Ywh3C4dKB9B93jIy@dhcp22.suse.cz
[akpm@linux-foundation.org: fix arch/sparc/kernel/setup_32.c]
Link: https://lkml.kernel.org/r/YwScVmVofIZkopkF@dhcp22.suse.cz
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
While discussing early DMA pool pre-allocation failure with Christoph [1]
I have realized that the allocation failure warning is rather noisy for
constrained allocations like GFP_DMA{32}.  Those zones are usually not
populated on all nodes very often as their memory ranges are constrained.

This is an attempt to reduce the ballast that doesn't provide any relevant
information for those allocation failures investigation.  Please note that
I have only compile tested it (in my default config setup) and I am
throwing it mostly to see what people think about it.

[1] http://lkml.kernel.org/r/20220817060647.1032426-1-hch@lst.de

[mhocko@suse.com: update]
  Link: https://lkml.kernel.org/r/Yw29bmJTIkKogTiW@dhcp22.suse.cz
[mhocko@suse.com: fix build]
[akpm@linux-foundation.org: fix it for mapletree]
[akpm@linux-foundation.org: update it for Michal's update]
[mhocko@suse.com: fix arch/powerpc/xmon/xmon.c]
  Link: https://lkml.kernel.org/r/Ywh3C4dKB9B93jIy@dhcp22.suse.cz
[akpm@linux-foundation.org: fix arch/sparc/kernel/setup_32.c]
Link: https://lkml.kernel.org/r/YwScVmVofIZkopkF@dhcp22.suse.cz
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: delete unused MMF_OOM_VICTIM flag</title>
<updated>2022-09-27T02:46:27+00:00</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2022-05-31T22:31:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b3541d912a84dc40cabb516f2deeac9ae6fa30da'/>
<id>b3541d912a84dc40cabb516f2deeac9ae6fa30da</id>
<content type='text'>
With the last usage of MMF_OOM_VICTIM in exit_mmap gone, this flag is now
unused and can be removed.

[akpm@linux-foundation.org: remove comment about now-removed mm_is_oom_victim()]
Link: https://lkml.kernel.org/r/20220531223100.510392-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;

Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With the last usage of MMF_OOM_VICTIM in exit_mmap gone, this flag is now
unused and can be removed.

[akpm@linux-foundation.org: remove comment about now-removed mm_is_oom_victim()]
Link: https://lkml.kernel.org/r/20220531223100.510392-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;

Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: drop oom code from exit_mmap</title>
<updated>2022-09-27T02:46:27+00:00</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2022-05-31T22:30:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bf3980c85212fc71512d27a46f5aab66f46ca284'/>
<id>bf3980c85212fc71512d27a46f5aab66f46ca284</id>
<content type='text'>
The primary reason to invoke the oom reaper from the exit_mmap path used
to be a prevention of an excessive oom killing if the oom victim exit
races with the oom reaper (see [1] for more details).  The invocation has
moved around since then because of the interaction with the munlock logic
but the underlying reason has remained the same (see [2]).

Munlock code is no longer a problem since [3] and there shouldn't be any
blocking operation before the memory is unmapped by exit_mmap so the oom
reaper invocation can be dropped.  The unmapping part can be done with the
non-exclusive mmap_sem and the exclusive one is only required when page
tables are freed.

Remove the oom_reaper from exit_mmap which will make the code easier to
read.  This is really unlikely to make any observable difference although
some microbenchmarks could benefit from one less branch that needs to be
evaluated even though it almost never is true.

[1] 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
[2] 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")
[3] a213e5cf71cb ("mm/munlock: delete munlock_vma_pages_all(), allow oomreap")

[akpm@linux-foundation.org: restore Suren's mmap_read_lock() optimization]
Link: https://lkml.kernel.org/r/20220531223100.510392-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The primary reason to invoke the oom reaper from the exit_mmap path used
to be a prevention of an excessive oom killing if the oom victim exit
races with the oom reaper (see [1] for more details).  The invocation has
moved around since then because of the interaction with the munlock logic
but the underlying reason has remained the same (see [2]).

Munlock code is no longer a problem since [3] and there shouldn't be any
blocking operation before the memory is unmapped by exit_mmap so the oom
reaper invocation can be dropped.  The unmapping part can be done with the
non-exclusive mmap_sem and the exclusive one is only required when page
tables are freed.

Remove the oom_reaper from exit_mmap which will make the code easier to
read.  This is really unlikely to make any observable difference although
some microbenchmarks could benefit from one less branch that needs to be
evaluated even though it almost never is true.

[1] 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
[2] 27ae357fa82b ("mm, oom: fix concurrent munlock and oom reaper unmap, v3")
[3] a213e5cf71cb ("mm/munlock: delete munlock_vma_pages_all(), allow oomreap")

[akpm@linux-foundation.org: restore Suren's mmap_read_lock() optimization]
Link: https://lkml.kernel.org/r/20220531223100.510392-1-surenb@google.com
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Liam Howlett &lt;liam.howlett@oracle.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/oom_kill: use vma iterators instead of vma linked list</title>
<updated>2022-09-27T02:46:25+00:00</updated>
<author>
<name>Liam R. Howlett</name>
<email>Liam.Howlett@Oracle.com</email>
</author>
<published>2022-09-06T19:49:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e1c2c775d448be0503a3ac90681d86980919bad0'/>
<id>e1c2c775d448be0503a3ac90681d86980919bad0</id>
<content type='text'>
Use vma iterator in preparation of removing the linked list.

Link: https://lkml.kernel.org/r/20220906194824.2110408-62-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Tested-by: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: "Matthew Wilcox (Oracle)" &lt;willy@infradead.org&gt;
Cc: SeongJae Park &lt;sj@kernel.org&gt;
Cc: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use vma iterator in preparation of removing the linked list.

Link: https://lkml.kernel.org/r/20220906194824.2110408-62-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett &lt;Liam.Howlett@Oracle.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Tested-by: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: "Matthew Wilcox (Oracle)" &lt;willy@infradead.org&gt;
Cc: SeongJae Park &lt;sj@kernel.org&gt;
Cc: Sven Schnelle &lt;svens@linux.ibm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/oom_kill.c: fix vm_oom_kill_table[] ifdeffery</title>
<updated>2022-06-01T22:57:16+00:00</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2022-06-01T22:57:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a19cad0691597eb79c123b8a19a9faba5ab7d90e'/>
<id>a19cad0691597eb79c123b8a19a9faba5ab7d90e</id>
<content type='text'>
arm allnoconfig:

mm/oom_kill.c:60:25: warning: 'vm_oom_kill_table' defined but not used [-Wunused-variable]
   60 | static struct ctl_table vm_oom_kill_table[] = {
      |                         ^~~~~~~~~~~~~~~~~

Cc: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
arm allnoconfig:

mm/oom_kill.c:60:25: warning: 'vm_oom_kill_table' defined but not used [-Wunused-variable]
   60 | static struct ctl_table vm_oom_kill_table[] = {
      |                         ^~~~~~~~~~~~~~~~~

Cc: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sysctl-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux</title>
<updated>2022-05-26T23:57:20+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-26T23:57:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=44d35720c9a660074b77ab9de37abf2c01c5b44f'/>
<id>44d35720c9a660074b77ab9de37abf2c01c5b44f</id>
<content type='text'>
Pull sysctl updates from Luis Chamberlain:
 "For two kernel releases now kernel/sysctl.c has been being cleaned up
  slowly, since the tables were grossly long, sprinkled with tons of
  #ifdefs and all this caused merge conflicts with one susbystem or
  another.

  This tree was put together to help try to avoid conflicts with these
  cleanups going on different trees at time. So nothing exciting on this
  pull request, just cleanups.

  Thanks a lot to the Uniontech and Huawei folks for doing some of this
  nasty work"

* tag 'sysctl-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (28 commits)
  sched: Fix build warning without CONFIG_SYSCTL
  reboot: Fix build warning without CONFIG_SYSCTL
  kernel/kexec_core: move kexec_core sysctls into its own file
  sysctl: minor cleanup in new_dir()
  ftrace: fix building with SYSCTL=y but DYNAMIC_FTRACE=n
  fs/proc: Introduce list_for_each_table_entry for proc sysctl
  mm: fix unused variable kernel warning when SYSCTL=n
  latencytop: move sysctl to its own file
  ftrace: fix building with SYSCTL=n but DYNAMIC_FTRACE=y
  ftrace: Fix build warning
  ftrace: move sysctl_ftrace_enabled to ftrace.c
  kernel/do_mount_initrd: move real_root_dev sysctls to its own file
  kernel/delayacct: move delayacct sysctls to its own file
  kernel/acct: move acct sysctls to its own file
  kernel/panic: move panic sysctls to its own file
  kernel/lockdep: move lockdep sysctls to its own file
  mm: move page-writeback sysctls to their own file
  mm: move oom_kill sysctls to their own file
  kernel/reboot: move reboot sysctls to its own file
  sched: Move energy_aware sysctls to topology.c
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull sysctl updates from Luis Chamberlain:
 "For two kernel releases now kernel/sysctl.c has been being cleaned up
  slowly, since the tables were grossly long, sprinkled with tons of
  #ifdefs and all this caused merge conflicts with one susbystem or
  another.

  This tree was put together to help try to avoid conflicts with these
  cleanups going on different trees at time. So nothing exciting on this
  pull request, just cleanups.

  Thanks a lot to the Uniontech and Huawei folks for doing some of this
  nasty work"

* tag 'sysctl-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (28 commits)
  sched: Fix build warning without CONFIG_SYSCTL
  reboot: Fix build warning without CONFIG_SYSCTL
  kernel/kexec_core: move kexec_core sysctls into its own file
  sysctl: minor cleanup in new_dir()
  ftrace: fix building with SYSCTL=y but DYNAMIC_FTRACE=n
  fs/proc: Introduce list_for_each_table_entry for proc sysctl
  mm: fix unused variable kernel warning when SYSCTL=n
  latencytop: move sysctl to its own file
  ftrace: fix building with SYSCTL=n but DYNAMIC_FTRACE=y
  ftrace: Fix build warning
  ftrace: move sysctl_ftrace_enabled to ftrace.c
  kernel/do_mount_initrd: move real_root_dev sysctls to its own file
  kernel/delayacct: move delayacct sysctls to its own file
  kernel/acct: move acct sysctls to its own file
  kernel/panic: move panic sysctls to its own file
  kernel/lockdep: move lockdep sysctls to its own file
  mm: move page-writeback sysctls to their own file
  mm: move oom_kill sysctls to their own file
  kernel/reboot: move reboot sysctls to its own file
  sched: Move energy_aware sysctls to topology.c
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup</title>
<updated>2022-04-22T03:01:10+00:00</updated>
<author>
<name>Nico Pache</name>
<email>npache@redhat.com</email>
</author>
<published>2022-04-21T23:36:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e4a38402c36e42df28eb1a5394be87e6571fb48a'/>
<id>e4a38402c36e42df28eb1a5394be87e6571fb48a</id>
<content type='text'>
The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which
can be targeted by the oom reaper.  This mapping is used to store the
futex robust list head; the kernel does not keep a copy of the robust
list and instead references a userspace address to maintain the
robustness during a process death.

A race can occur between exit_mm and the oom reaper that allows the oom
reaper to free the memory of the futex robust list before the exit path
has handled the futex death:

    CPU1                               CPU2
    --------------------------------------------------------------------
    page_fault
    do_exit "signal"
    wake_oom_reaper
                                        oom_reaper
                                        oom_reap_task_mm (invalidates mm)
    exit_mm
    exit_mm_release
    futex_exit_release
    futex_cleanup
    exit_robust_list
    get_user (EFAULT- can't access memory)

If the get_user EFAULT's, the kernel will be unable to recover the
waiters on the robust_list, leaving userspace mutexes hung indefinitely.

Delay the OOM reaper, allowing more time for the exit path to perform
the futex cleanup.

Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer

Based on a patch by Michal Hocko.

Link: https://elixir.bootlin.com/glibc/glibc-2.35/source/nptl/allocatestack.c#L370 [1]
Link: https://lkml.kernel.org/r/20220414144042.677008-1-npache@redhat.com
Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: Joel Savitz &lt;jsavitz@redhat.com&gt;
Signed-off-by: Nico Pache &lt;npache@redhat.com&gt;
Co-developed-by: Joel Savitz &lt;jsavitz@redhat.com&gt;
Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Rafael Aquini &lt;aquini@redhat.com&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: Herton R. Krzesinski &lt;herton@redhat.com&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Ben Segall &lt;bsegall@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Joel Savitz &lt;jsavitz@redhat.com&gt;
Cc: Darren Hart &lt;dvhart@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The pthread struct is allocated on PRIVATE|ANONYMOUS memory [1] which
can be targeted by the oom reaper.  This mapping is used to store the
futex robust list head; the kernel does not keep a copy of the robust
list and instead references a userspace address to maintain the
robustness during a process death.

A race can occur between exit_mm and the oom reaper that allows the oom
reaper to free the memory of the futex robust list before the exit path
has handled the futex death:

    CPU1                               CPU2
    --------------------------------------------------------------------
    page_fault
    do_exit "signal"
    wake_oom_reaper
                                        oom_reaper
                                        oom_reap_task_mm (invalidates mm)
    exit_mm
    exit_mm_release
    futex_exit_release
    futex_cleanup
    exit_robust_list
    get_user (EFAULT- can't access memory)

If the get_user EFAULT's, the kernel will be unable to recover the
waiters on the robust_list, leaving userspace mutexes hung indefinitely.

Delay the OOM reaper, allowing more time for the exit path to perform
the futex cleanup.

Reproducer: https://gitlab.com/jsavitz/oom_futex_reproducer

Based on a patch by Michal Hocko.

Link: https://elixir.bootlin.com/glibc/glibc-2.35/source/nptl/allocatestack.c#L370 [1]
Link: https://lkml.kernel.org/r/20220414144042.677008-1-npache@redhat.com
Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: Joel Savitz &lt;jsavitz@redhat.com&gt;
Signed-off-by: Nico Pache &lt;npache@redhat.com&gt;
Co-developed-by: Joel Savitz &lt;jsavitz@redhat.com&gt;
Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Rafael Aquini &lt;aquini@redhat.com&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: Herton R. Krzesinski &lt;herton@redhat.com&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Ben Segall &lt;bsegall@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Joel Savitz &lt;jsavitz@redhat.com&gt;
Cc: Darren Hart &lt;dvhart@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: move oom_kill sysctls to their own file</title>
<updated>2022-04-06T20:43:44+00:00</updated>
<author>
<name>sujiaxun</name>
<email>sujiaxun@uniontech.com</email>
</author>
<published>2022-02-18T02:51:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=43fe219aa56a2fdd8f0623c9470a32b14b0617a5'/>
<id>43fe219aa56a2fdd8f0623c9470a32b14b0617a5</id>
<content type='text'>
kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to places
where they actually belong.  The proc sysctl maintainers do not want to
know what sysctl knobs you wish to add for your own piece of code, we just
care about the core logic.

So move the oom_kill sysctls to their own file, mm/oom_kill.c

[sfr@canb.auug.org.au: null-terminate the array]
  Link: https://lkml.kernel.org/r/20220216193202.28838626@canb.auug.org.au

Link: https://lkml.kernel.org/r/20220215093203.31032-1-sujiaxun@uniontech.com
Signed-off-by: sujiaxun &lt;sujiaxun@uniontech.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Iurii Zaikin &lt;yzaikin@google.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
kernel/sysctl.c is a kitchen sink where everyone leaves their dirty
dishes, this makes it very difficult to maintain.

To help with this maintenance let's start by moving sysctls to places
where they actually belong.  The proc sysctl maintainers do not want to
know what sysctl knobs you wish to add for your own piece of code, we just
care about the core logic.

So move the oom_kill sysctls to their own file, mm/oom_kill.c

[sfr@canb.auug.org.au: null-terminate the array]
  Link: https://lkml.kernel.org/r/20220216193202.28838626@canb.auug.org.au

Link: https://lkml.kernel.org/r/20220215093203.31032-1-sujiaxun@uniontech.com
Signed-off-by: sujiaxun &lt;sujiaxun@uniontech.com&gt;
Signed-off-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Iurii Zaikin &lt;yzaikin@google.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
