<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/linux/mmzone.h, branch v5.10.78</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>mm: make zone_to_nid() and zone_set_nid() available for DISCONTIGMEM</title>
<updated>2021-08-15T12:00:25+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.ibm.com</email>
</author>
<published>2021-08-11T13:41:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=eaa7feecd366577148fc17958fd708cea4874a97'/>
<id>eaa7feecd366577148fc17958fd708cea4874a97</id>
<content type='text'>
Since the commit ce6ee46e0f39 ("mm/page_alloc: fix memory map
initialization for descending nodes") initialization of the memory map
relies on availability of zone_to_nid() and zone_set_nid methods to link
struct page to a node.

But in 5.10 zone_to_nid() is only defined for NUMA, but not for
DISCONTIGMEM which causes crashes on m68k systems with two memory banks.

For instance on ARAnyM with both ST-RAM and FastRAM atari_defconfig build
produces the following crash:

Unable to handle kernel access at virtual address (ptrval)
Oops: 00000000
Modules linked in:
PC: [&lt;0005fbbc&gt;] bpf_prog_alloc_no_stats+0x5c/0xba
SR: 2200  SP: (ptrval)  a2: 016daa90
d0: 0000000c    d1: 00000200    d2: 00000001    d3: 00000cc0
d4: 016d1f80    d5: 00034da6    a0: 305c2800    a1: 305c2a00
Process swapper (pid: 1, task=(ptrval))
Frame format=7 eff addr=31800000 ssw=0445 faddr=31800000
wb 1 stat/addr/data: 0000 00000000 00000000
wb 2 stat/addr/data: 0000 00000000 00000000
wb 3 stat/addr/data: 00c5 31800000 00000001
push data: 00000000 00000000 00000000 00000000
Stack from 3058fec8:
        00000dc0 00000000 004addc2 3058ff16 0005fc34 00000238 00000000 00000210
        004addc2 3058ff16 00281ae0 00000238 00000000 00000000 004addc2 004bc7ec
        004aea9e 0048b0c0 3058ff16 00460042 004ba4d2 3058ff8c 004ade6a 0000007e
        0000210e 0000007e 00000002 016d1f80 00034da6 000020b4 00000000 004b4764
        004bc7ec 00000000 004b4760 004bc7c0 004b4744 001e4cb2 00010001 016d1fe5
        016d1ff0 004994d2 003e1589 016d1f80 00412b8c 0000007e 00000001 00000001
Call Trace: [&lt;004addc2&gt;] sock_init+0x0/0xaa
 [&lt;0005fc34&gt;] bpf_prog_alloc+0x1a/0x66
 [&lt;004addc2&gt;] sock_init+0x0/0xaa
 [&lt;00281ae0&gt;] bpf_prog_create+0x2e/0x7c
 [&lt;004addc2&gt;] sock_init+0x0/0xaa
 [&lt;004aea9e&gt;] ptp_classifier_init+0x22/0x44
 [&lt;004ade6a&gt;] sock_init+0xa8/0xaa
 [&lt;0000210e&gt;] do_one_initcall+0x5a/0x150
 [&lt;00034da6&gt;] parse_args+0x0/0x208
 [&lt;000020b4&gt;] do_one_initcall+0x0/0x150
 [&lt;001e4cb2&gt;] strcpy+0x0/0x1c
 [&lt;00010001&gt;] stwotoxd+0x5/0x1c
 [&lt;004994d2&gt;] kernel_init_freeable+0x154/0x1a6
 [&lt;001e4cb2&gt;] strcpy+0x0/0x1c
 [&lt;0049951a&gt;] kernel_init_freeable+0x19c/0x1a6
 [&lt;004addc2&gt;] sock_init+0x0/0xaa
 [&lt;00321510&gt;] kernel_init+0x0/0xd8
 [&lt;00321518&gt;] kernel_init+0x8/0xd8
 [&lt;00321510&gt;] kernel_init+0x0/0xd8
 [&lt;00002890&gt;] ret_from_kernel_thread+0xc/0x14

Code: 204b 200b 4cdf 180c 4e75 700c e0aa 3682 &lt;2748&gt; 001c 214b 0140 022b
ffbf 0002 206b 001c 2008 0680 0000 0108 2140 0108 2140
Disabling lock debugging due to kernel taint
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Using CONFIG_NEED_MULTIPLE_NODES rather than CONFIG_NUMA to guard
definitions of zone_to_nid() and zone_set_nid() fixes the issue.

Reported-by: Mikael Pettersson &lt;mikpelinux@gmail.com&gt;
Fixes: ce6ee46e0f39 ("mm/page_alloc: fix memory map initialization for descending nodes")
Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Tested-by: Mikael Pettersson &lt;mikpelinux@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Since the commit ce6ee46e0f39 ("mm/page_alloc: fix memory map
initialization for descending nodes") initialization of the memory map
relies on availability of zone_to_nid() and zone_set_nid methods to link
struct page to a node.

But in 5.10 zone_to_nid() is only defined for NUMA, but not for
DISCONTIGMEM which causes crashes on m68k systems with two memory banks.

For instance on ARAnyM with both ST-RAM and FastRAM atari_defconfig build
produces the following crash:

Unable to handle kernel access at virtual address (ptrval)
Oops: 00000000
Modules linked in:
PC: [&lt;0005fbbc&gt;] bpf_prog_alloc_no_stats+0x5c/0xba
SR: 2200  SP: (ptrval)  a2: 016daa90
d0: 0000000c    d1: 00000200    d2: 00000001    d3: 00000cc0
d4: 016d1f80    d5: 00034da6    a0: 305c2800    a1: 305c2a00
Process swapper (pid: 1, task=(ptrval))
Frame format=7 eff addr=31800000 ssw=0445 faddr=31800000
wb 1 stat/addr/data: 0000 00000000 00000000
wb 2 stat/addr/data: 0000 00000000 00000000
wb 3 stat/addr/data: 00c5 31800000 00000001
push data: 00000000 00000000 00000000 00000000
Stack from 3058fec8:
        00000dc0 00000000 004addc2 3058ff16 0005fc34 00000238 00000000 00000210
        004addc2 3058ff16 00281ae0 00000238 00000000 00000000 004addc2 004bc7ec
        004aea9e 0048b0c0 3058ff16 00460042 004ba4d2 3058ff8c 004ade6a 0000007e
        0000210e 0000007e 00000002 016d1f80 00034da6 000020b4 00000000 004b4764
        004bc7ec 00000000 004b4760 004bc7c0 004b4744 001e4cb2 00010001 016d1fe5
        016d1ff0 004994d2 003e1589 016d1f80 00412b8c 0000007e 00000001 00000001
Call Trace: [&lt;004addc2&gt;] sock_init+0x0/0xaa
 [&lt;0005fc34&gt;] bpf_prog_alloc+0x1a/0x66
 [&lt;004addc2&gt;] sock_init+0x0/0xaa
 [&lt;00281ae0&gt;] bpf_prog_create+0x2e/0x7c
 [&lt;004addc2&gt;] sock_init+0x0/0xaa
 [&lt;004aea9e&gt;] ptp_classifier_init+0x22/0x44
 [&lt;004ade6a&gt;] sock_init+0xa8/0xaa
 [&lt;0000210e&gt;] do_one_initcall+0x5a/0x150
 [&lt;00034da6&gt;] parse_args+0x0/0x208
 [&lt;000020b4&gt;] do_one_initcall+0x0/0x150
 [&lt;001e4cb2&gt;] strcpy+0x0/0x1c
 [&lt;00010001&gt;] stwotoxd+0x5/0x1c
 [&lt;004994d2&gt;] kernel_init_freeable+0x154/0x1a6
 [&lt;001e4cb2&gt;] strcpy+0x0/0x1c
 [&lt;0049951a&gt;] kernel_init_freeable+0x19c/0x1a6
 [&lt;004addc2&gt;] sock_init+0x0/0xaa
 [&lt;00321510&gt;] kernel_init+0x0/0xd8
 [&lt;00321518&gt;] kernel_init+0x8/0xd8
 [&lt;00321510&gt;] kernel_init+0x0/0xd8
 [&lt;00002890&gt;] ret_from_kernel_thread+0xc/0x14

Code: 204b 200b 4cdf 180c 4e75 700c e0aa 3682 &lt;2748&gt; 001c 214b 0140 022b
ffbf 0002 206b 001c 2008 0680 0000 0108 2140 0108 2140
Disabling lock debugging due to kernel taint
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

Using CONFIG_NEED_MULTIPLE_NODES rather than CONFIG_NUMA to guard
definitions of zone_to_nid() and zone_set_nid() fixes the issue.

Reported-by: Mikael Pettersson &lt;mikpelinux@gmail.com&gt;
Fixes: ce6ee46e0f39 ("mm/page_alloc: fix memory map initialization for descending nodes")
Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Tested-by: Mikael Pettersson &lt;mikpelinux@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: Remove examples from enum zone_type comment</title>
<updated>2021-03-09T10:11:14+00:00</updated>
<author>
<name>Nicolas Saenz Julienne</name>
<email>nsaenzjulienne@suse.de</email>
</author>
<published>2021-03-03T07:33:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4d7ed9a49b0ce079af2255b0c49b0d4b5a5ceff0'/>
<id>4d7ed9a49b0ce079af2255b0c49b0d4b5a5ceff0</id>
<content type='text'>
commit 04435217f96869ac3a8f055ff68c5237a60bcd7e upstream

We can't really list every setup in common code. On top of that they are
unlikely to stay true for long as things change in the arch trees
independently of this comment.

Suggested-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Nicolas Saenz Julienne &lt;nsaenzjulienne@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20201119175400.9995-8-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Jing Xiangfeng &lt;jingxiangfeng@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 04435217f96869ac3a8f055ff68c5237a60bcd7e upstream

We can't really list every setup in common code. On top of that they are
unlikely to stay true for long as things change in the arch trees
independently of this comment.

Suggested-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Nicolas Saenz Julienne &lt;nsaenzjulienne@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20201119175400.9995-8-nsaenzjulienne@suse.de
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Jing Xiangfeng &lt;jingxiangfeng@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>include/linux/mmzone.h: remove unused early_pfn_valid()</title>
<updated>2020-10-16T18:11:19+00:00</updated>
<author>
<name>Mike Rapoport</name>
<email>rppt@linux.ibm.com</email>
</author>
<published>2020-10-16T03:10:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1f0f8c0de09066d23760c1f5fac2cd53b32f1127'/>
<id>1f0f8c0de09066d23760c1f5fac2cd53b32f1127</id>
<content type='text'>
The early_pfn_valid() macro is defined but it is never used.  Remove it.

Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200923162915.26935-1-rppt@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The early_pfn_valid() macro is defined but it is never used.  Remove it.

Signed-off-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Link: https://lkml.kernel.org/r/20200923162915.26935-1-rppt@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: use self-explanatory macros rather than "2"</title>
<updated>2020-10-16T18:11:19+00:00</updated>
<author>
<name>Yu Zhao</name>
<email>yuzhao@google.com</email>
</author>
<published>2020-10-16T03:09:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ed0173733dd468883198c3136284394320b8fad6'/>
<id>ed0173733dd468883198c3136284394320b8fad6</id>
<content type='text'>
Signed-off-by: Yu Zhao &lt;yuzhao@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Alex Shi &lt;alex.shi@linux.alibaba.com&gt;
Link: http://lkml.kernel.org/r/20200831175042.3527153-2-yuzhao@google.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Yu Zhao &lt;yuzhao@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Alex Shi &lt;alex.shi@linux.alibaba.com&gt;
Link: http://lkml.kernel.org/r/20200831175042.3527153-2-yuzhao@google.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mmzone: clean code by removing unused macro parameter</title>
<updated>2020-10-14T01:38:33+00:00</updated>
<author>
<name>Mateusz Nosek</name>
<email>mateusznosek0@gmail.com</email>
</author>
<published>2020-10-13T23:55:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=30d8ec73e8772b32a7eae626d14004bd37d8f13c'/>
<id>30d8ec73e8772b32a7eae626d14004bd37d8f13c</id>
<content type='text'>
Previously 'for_next_zone_zonelist_nodemask' macro parameter 'zlist' was
unused so this patch removes it.

Signed-off-by: Mateusz Nosek &lt;mateusznosek0@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lkml.kernel.org/r/20200917211906.30059-1-mateusznosek0@gmail.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Previously 'for_next_zone_zonelist_nodemask' macro parameter 'zlist' was
unused so this patch removes it.

Signed-off-by: Mateusz Nosek &lt;mateusznosek0@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lkml.kernel.org/r/20200917211906.30059-1-mateusznosek0@gmail.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: document semantics of ZONE_MOVABLE</title>
<updated>2020-10-14T01:38:33+00:00</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2020-10-13T23:55:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9181a980625a45425085ccec0fc38074a16470a5'/>
<id>9181a980625a45425085ccec0fc38074a16470a5</id>
<content type='text'>
Let's document what ZONE_MOVABLE means, how it's used, and which special
cases we have regarding unmovable pages (memory offlining vs.  migration /
allocations).

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Qian Cai &lt;cai@lca.pw&gt;
Link: http://lkml.kernel.org/r/20200816125333.7434-7-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Let's document what ZONE_MOVABLE means, how it's used, and which special
cases we have regarding unmovable pages (memory offlining vs.  migration /
allocations).

Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Jason Wang &lt;jasowang@redhat.com&gt;
Cc: Qian Cai &lt;cai@lca.pw&gt;
Link: http://lkml.kernel.org/r/20200816125333.7434-7-david@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: replace memmap_context by meminit_context</title>
<updated>2020-09-26T17:33:57+00:00</updated>
<author>
<name>Laurent Dufour</name>
<email>ldufour@linux.ibm.com</email>
</author>
<published>2020-09-26T04:19:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c1d0da83358a2316d9be7f229f26126dbaa07468'/>
<id>c1d0da83358a2316d9be7f229f26126dbaa07468</id>
<content type='text'>
Patch series "mm: fix memory to node bad links in sysfs", v3.

Sometimes, firmware may expose interleaved memory layout like this:

 Early memory node ranges
   node   1: [mem 0x0000000000000000-0x000000011fffffff]
   node   2: [mem 0x0000000120000000-0x000000014fffffff]
   node   1: [mem 0x0000000150000000-0x00000001ffffffff]
   node   0: [mem 0x0000000200000000-0x000000048fffffff]
   node   2: [mem 0x0000000490000000-0x00000007ffffffff]

In that case, we can see memory blocks assigned to multiple nodes in
sysfs:

  $ ls -l /sys/devices/system/memory/memory21
  total 0
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node1 -&gt; ../../node/node1
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node2 -&gt; ../../node/node2
  -rw-r--r-- 1 root root 65536 Aug 24 05:27 online
  -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
  -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
  drwxr-xr-x 2 root root     0 Aug 24 05:27 power
  -r--r--r-- 1 root root 65536 Aug 24 05:27 removable
  -rw-r--r-- 1 root root 65536 Aug 24 05:27 state
  lrwxrwxrwx 1 root root     0 Aug 24 05:25 subsystem -&gt; ../../../../bus/memory
  -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
  -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones

The same applies in the node's directory with a memory21 link in both
the node1 and node2's directory.

This is wrong but doesn't prevent the system to run.  However when
later, one of these memory blocks is hot-unplugged and then hot-plugged,
the system is detecting an inconsistency in the sysfs layout and a
BUG_ON() is raised:

  kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
  CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
  Call Trace:
    add_memory_resource+0x23c/0x340 (unreliable)
    __add_memory+0x5c/0xf0
    dlpar_add_lmb+0x1b4/0x500
    dlpar_memory+0x1f8/0xb80
    handle_dlpar_errorlog+0xc0/0x190
    dlpar_store+0x198/0x4a0
    kobj_attr_store+0x30/0x50
    sysfs_kf_write+0x64/0x90
    kernfs_fop_write+0x1b0/0x290
    vfs_write+0xe8/0x290
    ksys_write+0xdc/0x130
    system_call_exception+0x160/0x270
    system_call_common+0xf0/0x27c

This has been seen on PowerPC LPAR.

The root cause of this issue is that when node's memory is registered,
the range used can overlap another node's range, thus the memory block
is registered to multiple nodes in sysfs.

There are two issues here:

 (a) The sysfs memory and node's layouts are broken due to these
     multiple links

 (b) The link errors in link_mem_sections() should not lead to a system
     panic.

To address (a) register_mem_sect_under_node should not rely on the
system state to detect whether the link operation is triggered by a hot
plug operation or not.  This is addressed by the patches 1 and 2 of this
series.

Issue (b) will be addressed separately.

This patch (of 2):

The memmap_context enum is used to detect whether a memory operation is
due to a hot-add operation or happening at boot time.

Make it general to the hotplug operation and rename it as
meminit_context.

There is no functional change introduced by this patch

Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Laurent Dufour &lt;ldufour@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: "Rafael J . Wysocki" &lt;rafael@kernel.org&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: Scott Cheloha &lt;cheloha@linux.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Patch series "mm: fix memory to node bad links in sysfs", v3.

Sometimes, firmware may expose interleaved memory layout like this:

 Early memory node ranges
   node   1: [mem 0x0000000000000000-0x000000011fffffff]
   node   2: [mem 0x0000000120000000-0x000000014fffffff]
   node   1: [mem 0x0000000150000000-0x00000001ffffffff]
   node   0: [mem 0x0000000200000000-0x000000048fffffff]
   node   2: [mem 0x0000000490000000-0x00000007ffffffff]

In that case, we can see memory blocks assigned to multiple nodes in
sysfs:

  $ ls -l /sys/devices/system/memory/memory21
  total 0
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node1 -&gt; ../../node/node1
  lrwxrwxrwx 1 root root     0 Aug 24 05:27 node2 -&gt; ../../node/node2
  -rw-r--r-- 1 root root 65536 Aug 24 05:27 online
  -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device
  -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index
  drwxr-xr-x 2 root root     0 Aug 24 05:27 power
  -r--r--r-- 1 root root 65536 Aug 24 05:27 removable
  -rw-r--r-- 1 root root 65536 Aug 24 05:27 state
  lrwxrwxrwx 1 root root     0 Aug 24 05:25 subsystem -&gt; ../../../../bus/memory
  -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent
  -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones

The same applies in the node's directory with a memory21 link in both
the node1 and node2's directory.

This is wrong but doesn't prevent the system to run.  However when
later, one of these memory blocks is hot-unplugged and then hot-plugged,
the system is detecting an inconsistency in the sysfs layout and a
BUG_ON() is raised:

  kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084!
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4
  CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25
  Call Trace:
    add_memory_resource+0x23c/0x340 (unreliable)
    __add_memory+0x5c/0xf0
    dlpar_add_lmb+0x1b4/0x500
    dlpar_memory+0x1f8/0xb80
    handle_dlpar_errorlog+0xc0/0x190
    dlpar_store+0x198/0x4a0
    kobj_attr_store+0x30/0x50
    sysfs_kf_write+0x64/0x90
    kernfs_fop_write+0x1b0/0x290
    vfs_write+0xe8/0x290
    ksys_write+0xdc/0x130
    system_call_exception+0x160/0x270
    system_call_common+0xf0/0x27c

This has been seen on PowerPC LPAR.

The root cause of this issue is that when node's memory is registered,
the range used can overlap another node's range, thus the memory block
is registered to multiple nodes in sysfs.

There are two issues here:

 (a) The sysfs memory and node's layouts are broken due to these
     multiple links

 (b) The link errors in link_mem_sections() should not lead to a system
     panic.

To address (a) register_mem_sect_under_node should not rely on the
system state to detect whether the link operation is triggered by a hot
plug operation or not.  This is addressed by the patches 1 and 2 of this
series.

Issue (b) will be addressed separately.

This patch (of 2):

The memmap_context enum is used to detect whether a memory operation is
due to a hot-add operation or happening at boot time.

Make it general to the hotplug operation and rename it as
meminit_context.

There is no functional change introduced by this patch

Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Laurent Dufour &lt;ldufour@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: "Rafael J . Wysocki" &lt;rafael@kernel.org&gt;
Cc: Nathan Lynch &lt;nathanl@linux.ibm.com&gt;
Cc: Scott Cheloha &lt;cheloha@linux.ibm.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com
Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/compaction: correct the comments of compact_defer_shift</title>
<updated>2020-08-12T17:57:56+00:00</updated>
<author>
<name>Alex Shi</name>
<email>alex.shi@linux.alibaba.com</email>
</author>
<published>2020-08-12T01:31:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=860b32729a21e0565585a9e2ecea2e5244d65acd'/>
<id>860b32729a21e0565585a9e2ecea2e5244d65acd</id>
<content type='text'>
There is no compact_defer_limit. It should be compact_defer_shift in
use. and add compact_order_failed explanation.

Signed-off-by: Alex Shi &lt;alex.shi@linux.alibaba.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Alexander Duyck &lt;alexander.h.duyck@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/3bd60e1b-a74e-050d-ade4-6e8f54e00b92@linux.alibaba.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is no compact_defer_limit. It should be compact_defer_shift in
use. and add compact_order_failed explanation.

Signed-off-by: Alex Shi &lt;alex.shi@linux.alibaba.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Alexander Duyck &lt;alexander.h.duyck@linux.intel.com&gt;
Link: http://lkml.kernel.org/r/3bd60e1b-a74e-050d-ade4-6e8f54e00b92@linux.alibaba.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/workingset: prepare the workingset detection infrastructure for anon LRU</title>
<updated>2020-08-12T17:57:55+00:00</updated>
<author>
<name>Joonsoo Kim</name>
<email>iamjoonsoo.kim@lge.com</email>
</author>
<published>2020-08-12T01:30:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=170b04b7ae49634df103810dad67b22cf8a99aa6'/>
<id>170b04b7ae49634df103810dad67b22cf8a99aa6</id>
<content type='text'>
To prepare the workingset detection for anon LRU, this patch splits
workingset event counters for refault, activate and restore into anon and
file variants, as well as the refaults counter in struct lruvec.

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Link: http://lkml.kernel.org/r/1595490560-15117-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To prepare the workingset detection for anon LRU, this patch splits
workingset event counters for refault, activate and restore into anon and
file variants, as well as the refaults counter in struct lruvec.

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Link: http://lkml.kernel.org/r/1595490560-15117-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/page_alloc.c: remove unnecessary end_bitidx for [set|get]_pfnblock_flags_mask()</title>
<updated>2020-08-07T18:33:29+00:00</updated>
<author>
<name>Wei Yang</name>
<email>richard.weiyang@linux.alibaba.com</email>
</author>
<published>2020-08-07T06:25:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=535b81e209219e03f815379746bfd1eeb82d68e5'/>
<id>535b81e209219e03f815379746bfd1eeb82d68e5</id>
<content type='text'>
After previous cleanup, the end_bitidx is not necessary any more.

Signed-off-by: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Link: http://lkml.kernel.org/r/20200623124201.8199-4-richard.weiyang@linux.alibaba.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After previous cleanup, the end_bitidx is not necessary any more.

Signed-off-by: Wei Yang &lt;richard.weiyang@linux.alibaba.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Link: http://lkml.kernel.org/r/20200623124201.8199-4-richard.weiyang@linux.alibaba.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
