<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/mm/page_alloc.c, branch v4.2.1</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>mm: make page pfmemalloc check more robust</title>
<updated>2015-08-21T21:30:10+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2015-08-21T21:11:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2f064f3485cd29633ad1b3cfb00cc519509a3d72'/>
<id>2f064f3485cd29633ad1b3cfb00cc519509a3d72</id>
<content type='text'>
Commit c48a11c7ad26 ("netvm: propagate page-&gt;pfmemalloc to skb") added
checks for page-&gt;pfmemalloc to __skb_fill_page_desc():

        if (page-&gt;pfmemalloc &amp;&amp; !page-&gt;mapping)
                skb-&gt;pfmemalloc = true;

It assumes page-&gt;mapping == NULL implies that page-&gt;pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page-&gt;mapping
to NULL and leave page-&gt;index value alone.  Due to being in union, a
non-zero page-&gt;index will be interpreted as true page-&gt;pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page-&gt;pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page-&gt;index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad26 ("netvm: propagate page-&gt;pfmemalloc to skb")
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Debugged-by: Vlastimil Babka &lt;vbabka@suse.com&gt;
Debugged-by: Jiri Bohac &lt;jbohac@suse.com&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[3.6+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit c48a11c7ad26 ("netvm: propagate page-&gt;pfmemalloc to skb") added
checks for page-&gt;pfmemalloc to __skb_fill_page_desc():

        if (page-&gt;pfmemalloc &amp;&amp; !page-&gt;mapping)
                skb-&gt;pfmemalloc = true;

It assumes page-&gt;mapping == NULL implies that page-&gt;pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page-&gt;mapping
to NULL and leave page-&gt;index value alone.  Due to being in union, a
non-zero page-&gt;index will be interpreted as true page-&gt;pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page-&gt;pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page-&gt;index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad26 ("netvm: propagate page-&gt;pfmemalloc to skb")
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Debugged-by: Vlastimil Babka &lt;vbabka@suse.com&gt;
Debugged-by: Jiri Bohac &lt;jbohac@suse.com&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[3.6+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memory-hotplug: fix wrong edge when hot add a new node</title>
<updated>2015-08-14T22:56:32+00:00</updated>
<author>
<name>Xishi Qiu</name>
<email>qiuxishi@huawei.com</email>
</author>
<published>2015-08-14T22:35:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f9126ab9241f66562debf69c2c9d8fee32ddcc53'/>
<id>f9126ab9241f66562debf69c2c9d8fee32ddcc53</id>
<content type='text'>
When we add a new node, the edge of memory may be wrong.

e.g. system has 4 nodes, and node3 is movable, node3 mem:[24G-32G],

1. hotremove the node3,
2. then hotadd node3 with a part of memory, mem:[26G-30G],
3. call hotadd_new_pgdat()
        free_area_init_node()
                get_pfn_range_for_nid()
4. it will return wrong start_pfn and end_pfn, because we have not
update the memblock.

This patch also fixes a BUG_ON during hot-addition, please see
http://marc.info/?l=linux-kernel&amp;m=142961156129456&amp;w=2

Signed-off-by: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Cc: Kamezawa Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Taku Izumi &lt;izumi.taku@jp.fujitsu.com&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Gu Zheng &lt;guz.fnst@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When we add a new node, the edge of memory may be wrong.

e.g. system has 4 nodes, and node3 is movable, node3 mem:[24G-32G],

1. hotremove the node3,
2. then hotadd node3 with a part of memory, mem:[26G-30G],
3. call hotadd_new_pgdat()
        free_area_init_node()
                get_pfn_range_for_nid()
4. it will return wrong start_pfn and end_pfn, because we have not
update the memblock.

This patch also fixes a BUG_ON during hot-addition, please see
http://marc.info/?l=linux-kernel&amp;m=142961156129456&amp;w=2

Signed-off-by: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Cc: Kamezawa Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Taku Izumi &lt;izumi.taku@jp.fujitsu.com&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Gu Zheng &lt;guz.fnst@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: check __PG_HWPOISON separately from PAGE_FLAGS_CHECK_AT_*</title>
<updated>2015-08-07T01:39:42+00:00</updated>
<author>
<name>Naoya Horiguchi</name>
<email>n-horiguchi@ah.jp.nec.com</email>
</author>
<published>2015-08-06T22:47:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f4c18e6f7b5bbb5b528b3334115806b0d76f50f9'/>
<id>f4c18e6f7b5bbb5b528b3334115806b0d76f50f9</id>
<content type='text'>
The race condition addressed in commit add05cecef80 ("mm: soft-offline:
don't free target page in successful page migration") was not closed
completely, because that can happen not only for soft-offline, but also
for hard-offline.  Consider that a slab page is about to be freed into
buddy pool, and then an uncorrected memory error hits the page just
after entering __free_one_page(), then VM_BUG_ON_PAGE(page-&gt;flags &amp;
PAGE_FLAGS_CHECK_AT_PREP) is triggered, despite the fact that it's not
necessary because the data on the affected page is not consumed.

To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time.  I think it's justified because __PG_HWPOISON
flags is defined to prevent the page from being reused, and setting it
outside the page's alloc-free cycle is a designed behavior (not a bug.)

For recent months, I was annoyed about BUG_ON when soft-offlined page
remains on lru cache list for a while, which is avoided by calling
put_page() instead of putback_lru_page() in page migration's success
path.  This means that this patch reverts a major change from commit
add05cecef80 about the new refcounting rule of soft-offlined pages, so
"reuse window" revives.  This will be closed by a subsequent patch.

Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Dean Nelson &lt;dnelson@redhat.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The race condition addressed in commit add05cecef80 ("mm: soft-offline:
don't free target page in successful page migration") was not closed
completely, because that can happen not only for soft-offline, but also
for hard-offline.  Consider that a slab page is about to be freed into
buddy pool, and then an uncorrected memory error hits the page just
after entering __free_one_page(), then VM_BUG_ON_PAGE(page-&gt;flags &amp;
PAGE_FLAGS_CHECK_AT_PREP) is triggered, despite the fact that it's not
necessary because the data on the affected page is not consumed.

To solve it, this patch drops __PG_HWPOISON from page flag checks at
allocation/free time.  I think it's justified because __PG_HWPOISON
flags is defined to prevent the page from being reused, and setting it
outside the page's alloc-free cycle is a designed behavior (not a bug.)

For recent months, I was annoyed about BUG_ON when soft-offlined page
remains on lru cache list for a while, which is avoided by calling
put_page() instead of putback_lru_page() in page migration's success
path.  This means that this patch reverts a major change from commit
add05cecef80 about the new refcounting rule of soft-offlined pages, so
"reuse window" revives.  This will be closed by a subsequent patch.

Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Dean Nelson &lt;dnelson@redhat.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>fs, file table: reinit files_stat.max_files after deferred memory initialisation</title>
<updated>2015-08-07T01:39:40+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-08-06T22:46:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4248b0da460839e30eaaad78992b9a1dd3e63e21'/>
<id>4248b0da460839e30eaaad78992b9a1dd3e63e21</id>
<content type='text'>
Dave Hansen reported the following;

	My laptop has been behaving strangely with 4.2-rc2.  Once I log
	in to my X session, I start getting all kinds of strange errors
	from applications and see this in my dmesg:

        	VFS: file-max limit 8192 reached

The problem is that the file-max is calculated before memory is fully
initialised and miscalculates how much memory the kernel is using.  This
patch recalculates file-max after deferred memory initialisation.  Note
that using memory hotplug infrastructure would not have avoided this
problem as the value is not recalculated after memory hot-add.

4.1:             files_stat.max_files = 6582781
4.2-rc2:         files_stat.max_files = 8192
4.2-rc2 patched: files_stat.max_files = 6562467

Small differences with the patch applied and 4.1 but not enough to matter.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reported-by: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Nicolai Stange &lt;nicstange@gmail.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Alex Ng &lt;alexng@microsoft.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Dave Hansen reported the following;

	My laptop has been behaving strangely with 4.2-rc2.  Once I log
	in to my X session, I start getting all kinds of strange errors
	from applications and see this in my dmesg:

        	VFS: file-max limit 8192 reached

The problem is that the file-max is calculated before memory is fully
initialised and miscalculates how much memory the kernel is using.  This
patch recalculates file-max after deferred memory initialisation.  Note
that using memory hotplug infrastructure would not have avoided this
problem as the value is not recalculated after memory hot-add.

4.1:             files_stat.max_files = 6582781
4.2-rc2:         files_stat.max_files = 8192
4.2-rc2 patched: files_stat.max_files = 6562467

Small differences with the patch applied and 4.1 but not enough to matter.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reported-by: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Nicolai Stange &lt;nicstange@gmail.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Alex Ng &lt;alexng@microsoft.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, meminit: replace rwsem with completion</title>
<updated>2015-08-07T01:39:40+00:00</updated>
<author>
<name>Nicolai Stange</name>
<email>nicstange@gmail.com</email>
</author>
<published>2015-08-06T22:46:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d3cd131d935ab3bab700491edbbd7cad4040ce50'/>
<id>d3cd131d935ab3bab700491edbbd7cad4040ce50</id>
<content type='text'>
Commit 0e1cc95b4cc7 ("mm: meminit: finish initialisation of struct pages
before basic setup") introduced a rwsem to signal completion of the
initialization workers.

Lockdep complains about possible recursive locking:
  =============================================
  [ INFO: possible recursive locking detected ]
  4.1.0-12802-g1dc51b8 #3 Not tainted
  ---------------------------------------------
  swapper/0/1 is trying to acquire lock:
  (pgdat_init_rwsem){++++.+},
    at: [&lt;ffffffff8424c7fb&gt;] page_alloc_init_late+0xc7/0xe6

  but task is already holding lock:
  (pgdat_init_rwsem){++++.+},
    at: [&lt;ffffffff8424c772&gt;] page_alloc_init_late+0x3e/0xe6

Replace the rwsem by a completion together with an atomic
"outstanding work counter".

[peterz@infradead.org: Barrier removal on the grounds of being pointless]
[mgorman@suse.de: Applied review feedback]
Signed-off-by: Nicolai Stange &lt;nicstange@gmail.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Alex Ng &lt;alexng@microsoft.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 0e1cc95b4cc7 ("mm: meminit: finish initialisation of struct pages
before basic setup") introduced a rwsem to signal completion of the
initialization workers.

Lockdep complains about possible recursive locking:
  =============================================
  [ INFO: possible recursive locking detected ]
  4.1.0-12802-g1dc51b8 #3 Not tainted
  ---------------------------------------------
  swapper/0/1 is trying to acquire lock:
  (pgdat_init_rwsem){++++.+},
    at: [&lt;ffffffff8424c7fb&gt;] page_alloc_init_late+0xc7/0xe6

  but task is already holding lock:
  (pgdat_init_rwsem){++++.+},
    at: [&lt;ffffffff8424c772&gt;] page_alloc_init_late+0x3e/0xe6

Replace the rwsem by a completion together with an atomic
"outstanding work counter".

[peterz@infradead.org: Barrier removal on the grounds of being pointless]
[mgorman@suse.de: Applied review feedback]
Signed-off-by: Nicolai Stange &lt;nicstange@gmail.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Alex Ng &lt;alexng@microsoft.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, meminit: allow early_pfn_to_nid to be used during runtime</title>
<updated>2015-08-07T01:39:40+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-08-06T22:46:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7ace99170789bc53cbb7e9e352d7a3851208fbcf'/>
<id>7ace99170789bc53cbb7e9e352d7a3851208fbcf</id>
<content type='text'>
early_pfn_to_nid() historically was inherently not SMP safe but only
used during boot which is inherently single threaded or during hotplug
which is protected by a giant mutex.

With deferred memory initialisation there was a thread-safe version
introduced and the early_pfn_to_nid would trigger a BUG_ON if used
unsafely.  Memory hotplug hit that check.  This patch makes
early_pfn_to_nid introduces a lock to make it safe to use during
hotplug.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reported-by: Alex Ng &lt;alexng@microsoft.com&gt;
Tested-by: Alex Ng &lt;alexng@microsoft.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Nicolai Stange &lt;nicstange@gmail.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
early_pfn_to_nid() historically was inherently not SMP safe but only
used during boot which is inherently single threaded or during hotplug
which is protected by a giant mutex.

With deferred memory initialisation there was a thread-safe version
introduced and the early_pfn_to_nid would trigger a BUG_ON if used
unsafely.  Memory hotplug hit that check.  This patch makes
early_pfn_to_nid introduces a lock to make it safe to use during
hotplug.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reported-by: Alex Ng &lt;alexng@microsoft.com&gt;
Tested-by: Alex Ng &lt;alexng@microsoft.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Nicolai Stange &lt;nicstange@gmail.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/page_owner: set correct gfp_mask on page_owner</title>
<updated>2015-07-17T23:39:54+00:00</updated>
<author>
<name>Joonsoo Kim</name>
<email>js1304@gmail.com</email>
</author>
<published>2015-07-17T23:24:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e2cfc91120fa01e3458167054af993fb83d7d0ec'/>
<id>e2cfc91120fa01e3458167054af993fb83d7d0ec</id>
<content type='text'>
Currently, we set wrong gfp_mask to page_owner info in case of isolated
freepage by compaction and split page.  It causes incorrect mixed
pageblock report that we can get from '/proc/pagetypeinfo'.  This metric
is really useful to measure fragmentation effect so should be accurate.
This patch fixes it by setting correct information.

Without this patch, after kernel build workload is finished, number of
mixed pageblock is 112 among roughly 210 movable pageblocks.

But, with this fix, output shows that mixed pageblock is just 57.

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, we set wrong gfp_mask to page_owner info in case of isolated
freepage by compaction and split page.  It causes incorrect mixed
pageblock report that we can get from '/proc/pagetypeinfo'.  This metric
is really useful to measure fragmentation effect so should be accurate.
This patch fixes it by setting correct information.

Without this patch, after kernel build workload is finished, number of
mixed pageblock is 112 among roughly 210 movable pageblocks.

But, with this fix, output shows that mixed pageblock is just 57.

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/page_owner: fix possible access violation</title>
<updated>2015-07-17T23:39:54+00:00</updated>
<author>
<name>Joonsoo Kim</name>
<email>js1304@gmail.com</email>
</author>
<published>2015-07-17T23:24:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f3a14ced32513d103a3ed0ce89c4e713fac01461'/>
<id>f3a14ced32513d103a3ed0ce89c4e713fac01461</id>
<content type='text'>
When I tested my new patches, I found that page pointer which is used
for setting page_owner information is changed.  This is because page
pointer is used to set new migratetype in loop.  After this work, page
pointer could be out of bound.  If this wrong pointer is used for
page_owner, access violation happens.  Below is error message that I
got.

  BUG: unable to handle kernel paging request at 0000000000b00018
  IP: [&lt;ffffffff81025f30&gt;] save_stack_address+0x30/0x40
  PGD 1af2d067 PUD 166e0067 PMD 0
  Oops: 0002 [#1] SMP
  ...snip...
  Call Trace:
    print_context_stack+0xcf/0x100
    dump_trace+0x15f/0x320
    save_stack_trace+0x2f/0x50
    __set_page_owner+0x46/0x70
    __isolate_free_page+0x1f7/0x210
    split_free_page+0x21/0xb0
    isolate_freepages_block+0x1e2/0x410
    compaction_alloc+0x22d/0x2d0
    migrate_pages+0x289/0x8b0
    compact_zone+0x409/0x880
    compact_zone_order+0x6d/0x90
    try_to_compact_pages+0x110/0x210
    __alloc_pages_direct_compact+0x3d/0xe6
    __alloc_pages_nodemask+0x6cd/0x9a0
    alloc_pages_current+0x91/0x100
    runtest_store+0x296/0xa50
    simple_attr_write+0xbd/0xe0
    __vfs_write+0x28/0xf0
    vfs_write+0xa9/0x1b0
    SyS_write+0x46/0xb0
    system_call_fastpath+0x16/0x75

This patch fixes this error by moving up set_page_owner().

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When I tested my new patches, I found that page pointer which is used
for setting page_owner information is changed.  This is because page
pointer is used to set new migratetype in loop.  After this work, page
pointer could be out of bound.  If this wrong pointer is used for
page_owner, access violation happens.  Below is error message that I
got.

  BUG: unable to handle kernel paging request at 0000000000b00018
  IP: [&lt;ffffffff81025f30&gt;] save_stack_address+0x30/0x40
  PGD 1af2d067 PUD 166e0067 PMD 0
  Oops: 0002 [#1] SMP
  ...snip...
  Call Trace:
    print_context_stack+0xcf/0x100
    dump_trace+0x15f/0x320
    save_stack_trace+0x2f/0x50
    __set_page_owner+0x46/0x70
    __isolate_free_page+0x1f7/0x210
    split_free_page+0x21/0xb0
    isolate_freepages_block+0x1e2/0x410
    compaction_alloc+0x22d/0x2d0
    migrate_pages+0x289/0x8b0
    compact_zone+0x409/0x880
    compact_zone_order+0x6d/0x90
    try_to_compact_pages+0x110/0x210
    __alloc_pages_direct_compact+0x3d/0xe6
    __alloc_pages_nodemask+0x6cd/0x9a0
    alloc_pages_current+0x91/0x100
    runtest_store+0x296/0xa50
    simple_attr_write+0xbd/0xe0
    __vfs_write+0x28/0xf0
    vfs_write+0xa9/0x1b0
    SyS_write+0x46/0xb0
    system_call_fastpath+0x16/0x75

This patch fixes this error by moving up set_page_owner().

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, meminit: suppress unused memory variable warning</title>
<updated>2015-07-17T23:39:53+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-07-17T23:23:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ae026b2aa19350f3c863df2dce7e0511dd78ff49'/>
<id>ae026b2aa19350f3c863df2dce7e0511dd78ff49</id>
<content type='text'>
The kbuild test robot reported the following

  tree:   git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
  head:   14a6f1989dae9445d4532941bdd6bbad84f4c8da
  commit: 3b242c66ccbd60cf47ab0e8992119d9617548c23 x86: mm: enable deferred struct page initialisation on x86-64
  date:   3 days ago
  config: x86_64-randconfig-x006-201527 (attached as .config)
  reproduce:
    git checkout 3b242c66ccbd60cf47ab0e8992119d9617548c23
    # save the attached .config to linux build tree
    make ARCH=x86_64

  All warnings (new ones prefixed by &gt;&gt;):

     mm/page_alloc.c: In function 'early_page_uninitialised':
  &gt;&gt; mm/page_alloc.c:247:6: warning: unused variable 'nid' [-Wunused-variable]
       int nid = early_pfn_to_nid(pfn);

It's due to the NODE_DATA macro ignoring the nid parameter on !NUMA
configurations.  This patch avoids the warning by not declaring nid.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reported-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kbuild test robot reported the following

  tree:   git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
  head:   14a6f1989dae9445d4532941bdd6bbad84f4c8da
  commit: 3b242c66ccbd60cf47ab0e8992119d9617548c23 x86: mm: enable deferred struct page initialisation on x86-64
  date:   3 days ago
  config: x86_64-randconfig-x006-201527 (attached as .config)
  reproduce:
    git checkout 3b242c66ccbd60cf47ab0e8992119d9617548c23
    # save the attached .config to linux build tree
    make ARCH=x86_64

  All warnings (new ones prefixed by &gt;&gt;):

     mm/page_alloc.c: In function 'early_page_uninitialised':
  &gt;&gt; mm/page_alloc.c:247:6: warning: unused variable 'nid' [-Wunused-variable]
       int nid = early_pfn_to_nid(pfn);

It's due to the NODE_DATA macro ignoring the nid parameter on !NUMA
configurations.  This patch avoids the warning by not declaring nid.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reported-by: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: meminit: finish initialisation of struct pages before basic setup</title>
<updated>2015-07-01T02:44:56+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-06-30T21:57:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0e1cc95b4cc7293bb7b39175035e7f7e45c90977'/>
<id>0e1cc95b4cc7293bb7b39175035e7f7e45c90977</id>
<content type='text'>
Waiman Long reported that 24TB machines hit OOM during basic setup when
struct page initialisation was deferred.  One approach is to initialise
memory on demand but it interferes with page allocator paths.  This patch
creates dedicated threads to initialise memory before basic setup.  It
then blocks on a rw_semaphore until completion as a wait_queue and counter
is overkill.  This may be slower to boot but it's simplier overall and
also gets rid of a section mangling which existed so kswapd could do the
initialisation.

[akpm@linux-foundation.org: include rwsem.h, use DECLARE_RWSEM, fix comment, remove unneeded cast]
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Waiman Long &lt;waiman.long@hp.com
Cc: Nathan Zimmer &lt;nzimmer@sgi.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Scott Norton &lt;scott.norton@hp.com&gt;
Tested-by: Daniel J Blueman &lt;daniel@numascale.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Waiman Long reported that 24TB machines hit OOM during basic setup when
struct page initialisation was deferred.  One approach is to initialise
memory on demand but it interferes with page allocator paths.  This patch
creates dedicated threads to initialise memory before basic setup.  It
then blocks on a rw_semaphore until completion as a wait_queue and counter
is overkill.  This may be slower to boot but it's simplier overall and
also gets rid of a section mangling which existed so kswapd could do the
initialisation.

[akpm@linux-foundation.org: include rwsem.h, use DECLARE_RWSEM, fix comment, remove unneeded cast]
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Waiman Long &lt;waiman.long@hp.com
Cc: Nathan Zimmer &lt;nzimmer@sgi.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Scott Norton &lt;scott.norton@hp.com&gt;
Tested-by: Daniel J Blueman &lt;daniel@numascale.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
