<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/mm, branch v3.5-rc4</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>mm, mempolicy: fix mbind() to do synchronous migration</title>
<updated>2012-06-21T05:10:42+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2012-06-21T01:00:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c4c0e9e544a0eb640798cc66e68f394fa4a561bf'/>
<id>c4c0e9e544a0eb640798cc66e68f394fa4a561bf</id>
<content type='text'>
If the range passed to mbind() is not allocated on nodes set in the
nodemask, it migrates the pages to respect the constraint.

The final formal parameter of migrate_pages() is a mode of type enum
migrate_mode, not a boolean.  do_mbind() is currently passing "true",
which is the equivalent of MIGRATE_SYNC_LIGHT.  This should instead be
MIGRATE_SYNC for synchronous page migration.
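
As a rough illustration (a Python sketch, not kernel code), the enum below
mirrors the assumed declaration order of enum migrate_mode in this tree
(MIGRATE_ASYNC, MIGRATE_SYNC_LIGHT, MIGRATE_SYNC) and shows why a boolean
"true" selects the middle mode rather than full synchronous migration.
The helper name mode_from_argument is hypothetical.

```python
from enum import IntEnum

class MigrateMode(IntEnum):
    # Assumed to match the declaration order of enum migrate_mode.
    MIGRATE_ASYNC = 0
    MIGRATE_SYNC_LIGHT = 1
    MIGRATE_SYNC = 2

def mode_from_argument(arg):
    """Interpret an argument by its integer value, as an enum parameter would."""
    return MigrateMode(int(arg))

old_mode = mode_from_argument(True)                      # what do_mbind() passed
new_mode = mode_from_argument(MigrateMode.MIGRATE_SYNC)  # what it should pass
print(old_mode.name, new_mode.name)
```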

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the range passed to mbind() is not allocated on nodes set in the
nodemask, it migrates the pages to respect the constraint.

The final formal parameter of migrate_pages() is a mode of type enum
migrate_mode, not a boolean.  do_mbind() is currently passing "true",
which is the equivalent of MIGRATE_SYNC_LIGHT.  This should instead be
MIGRATE_SYNC for synchronous page migration.

Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/memblock: fix overlapping allocation when doubling reserved array</title>
<updated>2012-06-20T21:39:36+00:00</updated>
<author>
<name>Greg Pearson</name>
<email>greg.pearson@hp.com</email>
</author>
<published>2012-06-20T19:53:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=48c3b583bbddad2220ca4c22319ca5d1f78b2090'/>
<id>48c3b583bbddad2220ca4c22319ca5d1f78b2090</id>
<content type='text'>
__alloc_memory_core_early() asks memblock for a range of memory, then tries
to reserve it.  If the reserved region array lacks space for the new
range, memblock_double_array() is called to allocate more space for the
array.  If memblock is used to allocate memory for the new array it can
end up using a range that overlaps with the range originally allocated in
__alloc_memory_core_early(), leading to possible data corruption.

With this patch memblock_double_array() now calls memblock_find_in_range()
with a narrowed candidate range (in cases where the reserved.regions array
is being doubled) so any memory allocated will not overlap with the
original range that was being reserved.  The range is narrowed by passing
in the starting address and size of the previously allocated range.  Then
the range above the ending address is searched and if a candidate is not
found, the range below the starting address is searched.
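
The search order described above can be sketched as follows.  This is a
simplified Python model, not the kernel implementation: free memory is
assumed to be a sorted list of (start, end) pairs, and the helper names
find_free() and find_avoiding() are hypothetical.

```python
def find_free(free, lo, hi, size):
    """Return the start of the first free block of `size` inside [lo, hi), or None."""
    for start, end in free:
        s = max(start, lo)
        e = min(end, hi)
        if e - s >= size:
            return s
    return None

def find_avoiding(free, size, excl_start, excl_size, limit):
    """Find `size` bytes that cannot overlap [excl_start, excl_start + excl_size)."""
    excl_end = excl_start + excl_size
    # First search above the end of the range that is being reserved.
    found = find_free(free, excl_end, limit, size)
    if found is None:
        # Nothing above: fall back to the range below its starting address.
        found = find_free(free, 0, excl_start, size)
    return found

# The range [40, 60) was just allocated but is not yet recorded as reserved,
# so the allocator still sees [0, 100) as entirely free.
free = [(0, 100)]
above = find_avoiding(free, 30, 40, 20, 100)
below = find_avoiding(free, 35, 40, 20, 90)
print(above, below)
```

In this model a plain search over [0, 100) could hand back memory inside
[40, 60); narrowing the candidate range rules that out by construction.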

Signed-off-by: Greg Pearson &lt;greg.pearson@hp.com&gt;
Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
__alloc_memory_core_early() asks memblock for a range of memory, then tries
to reserve it.  If the reserved region array lacks space for the new
range, memblock_double_array() is called to allocate more space for the
array.  If memblock is used to allocate memory for the new array it can
end up using a range that overlaps with the range originally allocated in
__alloc_memory_core_early(), leading to possible data corruption.

With this patch memblock_double_array() now calls memblock_find_in_range()
with a narrowed candidate range (in cases where the reserved.regions array
is being doubled) so any memory allocated will not overlap with the
original range that was being reserved.  The range is narrowed by passing
in the starting address and size of the previously allocated range.  Then
the range above the ending address is searched and if a candidate is not
found, the range below the starting address is searched.

Signed-off-by: Greg Pearson &lt;greg.pearson@hp.com&gt;
Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm/memory.c: fix kernel-doc warnings</title>
<updated>2012-06-20T21:39:36+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>rdunlap@xenotime.net</email>
</author>
<published>2012-06-20T19:53:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=eb4546bbbdb160aff084d50511165f385756af18'/>
<id>eb4546bbbdb160aff084d50511165f385756af18</id>
<content type='text'>
Fix kernel-doc warnings in mm/memory.c:

  Warning(mm/memory.c:1377): No description found for parameter 'start'
  Warning(mm/memory.c:1377): Excess function parameter 'address' description in 'zap_page_range'

Signed-off-by: Randy Dunlap &lt;rdunlap@xenotime.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix kernel-doc warnings in mm/memory.c:

  Warning(mm/memory.c:1377): No description found for parameter 'start'
  Warning(mm/memory.c:1377): Excess function parameter 'address' description in 'zap_page_range'

Signed-off-by: Randy Dunlap &lt;rdunlap@xenotime.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: fix kernel-doc warnings</title>
<updated>2012-06-20T21:39:36+00:00</updated>
<author>
<name>Wanpeng Li</name>
<email>liwp@linux.vnet.ibm.com</email>
</author>
<published>2012-06-20T19:53:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dad7557eb705688040aac134efa5418b66d5ed92'/>
<id>dad7557eb705688040aac134efa5418b66d5ed92</id>
<content type='text'>
Fix kernel-doc warnings such as

  Warning(../mm/page_cgroup.c:432): No description found for parameter 'id'
  Warning(../mm/page_cgroup.c:432): Excess function parameter 'mem' description in 'swap_cgroup_record'

Signed-off-by: Wanpeng Li &lt;liwp@linux.vnet.ibm.com&gt;
Cc: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix kernel-doc warnings such as

  Warning(../mm/page_cgroup.c:432): No description found for parameter 'id'
  Warning(../mm/page_cgroup.c:432): Excess function parameter 'mem' description in 'swap_cgroup_record'

Signed-off-by: Wanpeng Li &lt;liwp@linux.vnet.ibm.com&gt;
Cc: Randy Dunlap &lt;randy.dunlap@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, thp: print useful information when mmap_sem is unlocked in zap_pmd_range</title>
<updated>2012-06-20T21:39:35+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2012-06-20T19:53:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e0897d75f0b22e8c3a7287a48548c5686ef73447'/>
<id>e0897d75f0b22e8c3a7287a48548c5686ef73447</id>
<content type='text'>
Andrea asked for addr, end, vma-&gt;vm_start, and vma-&gt;vm_end to be emitted
when !rwsem_is_locked(&amp;tlb-&gt;mm-&gt;mmap_sem).  Otherwise, debugging the
underlying issue is more difficult.

Suggested-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Andrea asked for addr, end, vma-&gt;vm_start, and vma-&gt;vm_end to be emitted
when !rwsem_is_locked(&amp;tlb-&gt;mm-&gt;mmap_sem).  Otherwise, debugging the
underlying issue is more difficult.

Suggested-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>memcg: fix use_hierarchy css_is_ancestor oops regression</title>
<updated>2012-06-20T21:39:35+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-06-20T19:52:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3a981f482cc29f7d0aeab509e51ea15519a6e961'/>
<id>3a981f482cc29f7d0aeab509e51ea15519a6e961</id>
<content type='text'>
If use_hierarchy is set, reclaim testing soon oopses in css_is_ancestor()
called from __mem_cgroup_same_or_subtree() called from page_referenced():
when processes are exiting, it's easy for mm_match_cgroup() to pass along
a NULL memcg coming from a NULL mm-&gt;owner.

Check for that in __mem_cgroup_same_or_subtree().  Return true or false?
False because we cannot know if it was in the hierarchy, but also false
because it's better not to count a reference from an exiting process.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If use_hierarchy is set, reclaim testing soon oopses in css_is_ancestor()
called from __mem_cgroup_same_or_subtree() called from page_referenced():
when processes are exiting, it's easy for mm_match_cgroup() to pass along
a NULL memcg coming from a NULL mm-&gt;owner.

Check for that in __mem_cgroup_same_or_subtree().  Return true or false?
False because we cannot know if it was in the hierarchy, but also false
because it's better not to count a reference from an exiting process.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Acked-by: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, oom: fix and cleanup oom score calculations</title>
<updated>2012-06-20T21:39:35+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2012-06-20T19:52:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=61eafb00d55dfbccdfce543c6b60e369ff4f8f18'/>
<id>61eafb00d55dfbccdfce543c6b60e369ff4f8f18</id>
<content type='text'>
The divide in p-&gt;signal-&gt;oom_score_adj * totalpages / 1000 within
oom_badness() was causing an overflow of the signed long data type.

This adds both the root bias and p-&gt;signal-&gt;oom_score_adj before doing the
normalization which fixes the issue and also cleans up the calculation.
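
A minimal Python model of the arithmetic, under stated assumptions: a
64-bit unsigned long, totalpages being unsigned so that the old product is
evaluated in unsigned arithmetic, and a root bias of 30 on the
oom_score_adj scale (the 3% mentioned for root threads).  The function
names old_adjust() and new_adjust() are hypothetical.

```python
WORD = 2**64  # assumed width of unsigned long

def old_adjust(oom_score_adj, totalpages):
    """Model of oom_score_adj * totalpages / 1000 in wrapping unsigned arithmetic."""
    product = (oom_score_adj * totalpages) % WORD  # a negative adj wraps around
    return product // 1000

def new_adjust(oom_score_adj, totalpages, is_root=False):
    """Combine the root bias and oom_score_adj first, then normalize once."""
    adj = oom_score_adj
    if is_root:
        adj -= 30
    return adj * (totalpages // 1000)

pages = 1_000_000
wrapped = old_adjust(-500, pages)
fixed = new_adjust(-500, pages)
print(wrapped, fixed)
```

With a negative oom_score_adj the modelled old expression yields a huge
positive adjustment, while the reordered form stays a small signed value.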

Tested-by: Dave Jones &lt;davej@redhat.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The divide in p-&gt;signal-&gt;oom_score_adj * totalpages / 1000 within
oom_badness() was causing an overflow of the signed long data type.

This adds both the root bias and p-&gt;signal-&gt;oom_score_adj before doing the
normalization which fixes the issue and also cleans up the calculation.

Tested-by: Dave Jones &lt;davej@redhat.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>swap: fix shmem swapping when more than 8 areas</title>
<updated>2012-06-16T04:48:14+00:00</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-06-16T00:55:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9b15b817f3d62409290fd56fe3cbb076a931bb0a'/>
<id>9b15b817f3d62409290fd56fe3cbb076a931bb0a</id>
<content type='text'>
Minchan Kim reports that when a system has many swap areas, and tmpfs
swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
back the page cannot locate it, and the read fails with -ENOMEM.

Whoops.  Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
swp_entry_to_pte()) technique for determining maximum usable swap
offset, without stopping to realize that that actually depends upon the
pte swap encoding shifting swap offset to the higher bits and truncating
it there.  Whereas our radix_tree swap encoding leaves offset in the
lower bits: it's swap "type" (that is, index of swap area) that was
truncated.
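
A small Python model of the truncation, for illustration only.  It assumes
a 64-bit word, a 5-bit swap type stored in the topmost bits, and a radix
tree encoding that shifts the entry left by 2 bits; those constants and
all function names here are assumptions, not taken from this commit.
Shifts are written as multiplication and division by powers of two.

```python
WORD_BITS = 64    # assumed word size
TYPE_BITS = 5     # assumed width of the swap type field
RADIX_SHIFT = 2   # assumed left shift applied by the radix tree encoding

def make_entry(swap_type, offset, type_shift):
    """Pack type above offset, as a swp_entry_t value would."""
    return swap_type * 2**type_shift + offset

def radix_round_trip(val):
    """Encode into a radix tree slot and decode again; high bits fall off."""
    radix = (val * 2**RADIX_SHIFT) % 2**WORD_BITS
    return radix // 2**RADIX_SHIFT

def entry_type(val, type_shift):
    return val // 2**type_shift

old_shift = WORD_BITS - TYPE_BITS                # type occupies the top bits
new_shift = WORD_BITS - TYPE_BITS - RADIX_SHIFT  # reduced, leaving headroom

ninth = 8  # index of the ninth swap area
old_type = entry_type(radix_round_trip(make_entry(ninth, 123, old_shift)), old_shift)
new_type = entry_type(radix_round_trip(make_entry(ninth, 123, new_shift)), new_shift)
print(old_type, new_type)
```

In this model the first eight areas (types 0 to 7) survive the round trip
with the old shift, and the ninth is the first to be misread.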

Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().

This does not reduce the usable size of a swap area any further, it
leaves it as claimed when making the original commit: no change from 3.0
on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
per swapfile on i386 with PAE.  It's not a change I would have risked
five years ago, but with x86_64 supported for ten years, I believe it's
appropriate now.

Hmm, and what if some architecture implements its swap pte with offset
encoded below type? That would equally break the maximum usable swap
offset check.  Happily, they all follow the same tradition of encoding
offset above type, but I'll prepare a check on that for next.

Reported-and-Reviewed-and-Tested-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: stable@vger.kernel.org [3.1, 3.2, 3.3, 3.4]
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Minchan Kim reports that when a system has many swap areas, and tmpfs
swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
back the page cannot locate it, and the read fails with -ENOMEM.

Whoops.  Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
swp_entry_to_pte()) technique for determining maximum usable swap
offset, without stopping to realize that that actually depends upon the
pte swap encoding shifting swap offset to the higher bits and truncating
it there.  Whereas our radix_tree swap encoding leaves offset in the
lower bits: it's swap "type" (that is, index of swap area) that was
truncated.

Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().

This does not reduce the usable size of a swap area any further, it
leaves it as claimed when making the original commit: no change from 3.0
on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
per swapfile on i386 with PAE.  It's not a change I would have risked
five years ago, but with x86_64 supported for ten years, I believe it's
appropriate now.

Hmm, and what if some architecture implements its swap pte with offset
encoded below type? That would equally break the maximum usable swap
offset check.  Happily, they all follow the same tradition of encoding
offset above type, but I'll prepare a check on that for next.

Reported-and-Reviewed-and-Tested-by: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: stable@vger.kernel.org [3.1, 3.2, 3.3, 3.4]
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-06-15T23:52:35+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-06-15T23:52:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a95f9b6e092ed862278e08266207c7ab231076b4'/>
<id>a95f9b6e092ed862278e08266207c7ab231076b4</id>
<content type='text'>
Pull core updates (RCU and locking) from Ingo Molnar:
 "Most of the diffstat comes from the RCU slow boot regression fixes,
  but there are also debuggability improvements/fixes."

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  memblock: Document memblock_is_region_{memory,reserved}()
  rcu: Precompute RCU_FAST_NO_HZ timer offsets
  rcu: Move RCU_FAST_NO_HZ per-CPU variables to rcu_dynticks structure
  rcu: Update RCU_FAST_NO_HZ tracing for lazy callbacks
  rcu: RCU_FAST_NO_HZ detection of callback adoption
  spinlock: Indicate that a lockup is only suspected
  kdump: Execute kmsg_dump(KMSG_DUMP_PANIC) after smp_send_stop()
  panic: Make panic_on_oops configurable
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull core updates (RCU and locking) from Ingo Molnar:
 "Most of the diffstat comes from the RCU slow boot regression fixes,
  but there are also debuggability improvements/fixes."

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  memblock: Document memblock_is_region_{memory,reserved}()
  rcu: Precompute RCU_FAST_NO_HZ timer offsets
  rcu: Move RCU_FAST_NO_HZ per-CPU variables to rcu_dynticks structure
  rcu: Update RCU_FAST_NO_HZ tracing for lazy callbacks
  rcu: RCU_FAST_NO_HZ detection of callback adoption
  spinlock: Indicate that a lockup is only suspected
  kdump: Execute kmsg_dump(KMSG_DUMP_PANIC) after smp_send_stop()
  panic: Make panic_on_oops configurable
</pre>
</div>
</content>
</entry>
<entry>
<title>mm, oom: fix badness score underflow</title>
<updated>2012-06-08T22:07:35+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2012-06-08T20:21:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1e11ad8dc42975d5c2bab7d478f6cd875602eda4'/>
<id>1e11ad8dc42975d5c2bab7d478f6cd875602eda4</id>
<content type='text'>
If the privileges given to root threads (3% of allowable memory) or a
negative value of /proc/pid/oom_score_adj happen to exceed the amount of
rss of a thread, its badness score underflows as a result of commit
a7f638f999ff ("mm, oom: normalize oom scores to oom_score_adj scale only
for userspace").

Fix this by making the type signed and returning 1, meaning the thread is
still eligible for kill, if the value is negative.

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the privileges given to root threads (3% of allowable memory) or a
negative value of /proc/pid/oom_score_adj happen to exceed the amount of
rss of a thread, its badness score underflows as a result of commit
a7f638f999ff ("mm, oom: normalize oom scores to oom_score_adj scale only
for userspace").

Fix this by making the type signed and returning 1, meaning the thread is
still eligible for kill, if the value is negative.

Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
