<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/linux/mm_types.h, branch v3.10.78</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>mm: numa: guarantee that tlb_flush_pending updates are visible before page table updates</title>
<updated>2014-01-09T20:24:23+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2013-12-19T01:08:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5d8e03b2544c3cc962106f63c9d60578ad8a4c91'/>
<id>5d8e03b2544c3cc962106f63c9d60578ad8a4c91</id>
<content type='text'>
commit af2c1401e6f9177483be4fad876d0073669df9df upstream.

According to documentation on barriers, stores issued before a LOCK can
complete after the lock implying that it's possible tlb_flush_pending
can be visible after a page table update.  As per revised documentation,
this patch adds a smp_mb__before_spinlock to guarantee the correct
ordering.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit af2c1401e6f9177483be4fad876d0073669df9df upstream.

According to documentation on barriers, stores issued before a LOCK can
complete after the lock implying that it's possible tlb_flush_pending
can be visible after a page table update.  As per revised documentation,
this patch adds a smp_mb__before_spinlock to guarantee the correct
ordering.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mm: fix TLB flush race between migration, and change_protection_range</title>
<updated>2014-01-09T20:24:23+00:00</updated>
<author>
<name>Rik van Riel</name>
<email>riel@redhat.com</email>
</author>
<published>2013-12-19T01:08:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d303cf4624824971d94b4e2c7c95df052d14aa81'/>
<id>d303cf4624824971d94b4e2c7c95df052d14aa81</id>
<content type='text'>
commit 20841405940e7be0617612d521e206e4b6b325db upstream.

There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.

The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.

During that time, a CPU may continue writing to the page.

This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.

When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.

This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).

The basic race looks like this:

CPU A			CPU B			CPU C

						load TLB entry
make entry PTE/PMD_NUMA
			fault on entry
						read/write old page
			start migrating page
			change PTE/PMD to new page
						read/write old page [*]
flush TLB
						reload TLB from new entry
						read/write new page
						lose data

[*] the old page may belong to a new user at this point!

The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.

This should fix both NUMA migration and compaction.

[mgorman@suse.de: fix build]
Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Alex Thorlton &lt;athorlton@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 20841405940e7be0617612d521e206e4b6b325db upstream.

There are a few subtle races, between change_protection_range (used by
mprotect and change_prot_numa) on one side, and NUMA page migration and
compaction on the other side.

The basic race is that there is a time window between when the PTE gets
made non-present (PROT_NONE or NUMA), and the TLB is flushed.

During that time, a CPU may continue writing to the page.

This is fine most of the time, however compaction or the NUMA migration
code may come in, and migrate the page away.

When that happens, the CPU may continue writing, through the cached
translation, to what is no longer the current memory location of the
process.

This only affects x86, which has a somewhat optimistic pte_accessible.
All other architectures appear to be safe, and will either always flush,
or flush whenever there is a valid mapping, even with no permissions
(SPARC).

The basic race looks like this:

CPU A			CPU B			CPU C

						load TLB entry
make entry PTE/PMD_NUMA
			fault on entry
						read/write old page
			start migrating page
			change PTE/PMD to new page
						read/write old page [*]
flush TLB
						reload TLB from new entry
						read/write new page
						lose data

[*] the old page may belong to a new user at this point!

The obvious fix is to flush remote TLB entries, by making sure that
pte_accessible aware of the fact that PROT_NONE and PROT_NUMA memory may
still be accessible if there is a TLB flush pending for the mm.

This should fix both NUMA migration and compaction.

[mgorman@suse.de: fix build]
Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Alex Thorlton &lt;athorlton@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>x86 get_unmapped_area: Access mmap_legacy_base through mm_struct member</title>
<updated>2013-08-29T16:47:39+00:00</updated>
<author>
<name>Radu Caragea</name>
<email>sinaelgl@gmail.com</email>
</author>
<published>2013-08-21T17:55:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ff1a668bc01c811ffcf19a3d43af58d7bd658986'/>
<id>ff1a668bc01c811ffcf19a3d43af58d7bd658986</id>
<content type='text'>
commit 41aacc1eea645c99edbe8fbcf78a97dc9b862adc upstream.

This is the updated version of df54d6fa5427 ("x86 get_unmapped_area():
use proper mmap base for bottom-up direction") that only randomizes the
mmap base address once.

Signed-off-by: Radu Caragea &lt;sinaelgl@gmail.com&gt;
Reported-and-tested-by: Jeff Shorey &lt;shoreyjeff@gmail.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Adrian Sendroiu &lt;molecula2788@gmail.com&gt;
Cc: Kamal Mostafa &lt;kamal@canonical.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 41aacc1eea645c99edbe8fbcf78a97dc9b862adc upstream.

This is the updated version of df54d6fa5427 ("x86 get_unmapped_area():
use proper mmap base for bottom-up direction") that only randomizes the
mmap base address once.

Signed-off-by: Radu Caragea &lt;sinaelgl@gmail.com&gt;
Reported-and-tested-by: Jeff Shorey &lt;shoreyjeff@gmail.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Adrian Sendroiu &lt;molecula2788@gmail.com&gt;
Cc: Kamal Mostafa &lt;kamal@canonical.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>mm: fold page-&gt;_last_nid into page-&gt;flags where possible</title>
<updated>2013-02-24T01:50:17+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2013-02-23T00:34:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=75980e97daccfc6babbac7e180ff118537955f5d'/>
<id>75980e97daccfc6babbac7e180ff118537955f5d</id>
<content type='text'>
page-&gt;_last_nid fits into page-&gt;flags on 64-bit.  The unlikely 32-bit
NUMA configuration with NUMA Balancing will still need an extra page
field.  As Peter notes "Completely dropping 32bit support for
CONFIG_NUMA_BALANCING would simplify things, but it would also remove
the warning if we grow enough 64bit only page-flags to push the last-cpu
out."

[mgorman@suse.de: minor modifications]
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Cc: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
page-&gt;_last_nid fits into page-&gt;flags on 64-bit.  The unlikely 32-bit
NUMA configuration with NUMA Balancing will still need an extra page
field.  As Peter notes "Completely dropping 32bit support for
CONFIG_NUMA_BALANCING would simplify things, but it would also remove
the warning if we grow enough 64bit only page-flags to push the last-cpu
out."

[mgorman@suse.de: minor modifications]
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Cc: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: move page flags layout to separate header</title>
<updated>2013-02-24T01:50:17+00:00</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2013-02-23T00:34:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bbeae5b05ef6e40bf54db05ceb8635824153b9e2'/>
<id>bbeae5b05ef6e40bf54db05ceb8635824153b9e2</id>
<content type='text'>
This is a preparation patch for moving page-&gt;_last_nid into page-&gt;flags
that moves page flag layout information to a separate header.  This
patch is necessary because otherwise there would be a circular
dependency between mm_types.h and mm.h.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Cc: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a preparation patch for moving page-&gt;_last_nid into page-&gt;flags
that moves page flag layout information to a separate header.  This
patch is necessary because otherwise there would be a circular
dependency between mm_types.h and mm.h.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Cc: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: numa: fix minor typo in numa_next_scan</title>
<updated>2013-02-24T01:50:16+00:00</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2013-02-23T00:34:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=34f0315adb58af3b01f59d05b2bce267474e71cb'/>
<id>34f0315adb58af3b01f59d05b2bce267474e71cb</id>
<content type='text'>
s/me/be/ and clarify the comment a bit when we're changing it anyway.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Suggested-by: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
s/me/be/ and clarify the comment a bit when we're changing it anyway.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Suggested-by: Simon Jeons &lt;simon.jeons@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Wanpeng Li &lt;liwanp@linux.vnet.ibm.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux</title>
<updated>2012-12-18T18:56:07+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-18T18:56:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ae664dba2724e59ddd66291b895f7370e28b9a7a'/>
<id>ae664dba2724e59ddd66291b895f7370e28b9a7a</id>
<content type='text'>
Pull SLAB changes from Pekka Enberg:
 "This contains preparational work from Christoph Lameter and Glauber
  Costa for SLAB memcg and cleanups and improvements from Ezequiel
  Garcia and Joonsoo Kim.

  Please note that the SLOB cleanup commit from Arnd Bergmann already
  appears in your tree but I had also merged it myself which is why it
  shows up in the shortlog."

* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  mm/sl[aou]b: Common alignment code
  slab: Use the new create_boot_cache function to simplify bootstrap
  slub: Use statically allocated kmem_cache boot structure for bootstrap
  mm, sl[au]b: create common functions for boot slab creation
  slab: Simplify bootstrap
  slub: Use correct cpu_slab on dead cpu
  mm: fix slab.c kernel-doc warnings
  mm/slob: use min_t() to compare ARCH_SLAB_MINALIGN
  slab: Ignore internal flags in cache creation
  mm/slob: Use free_page instead of put_page for page-size kmalloc allocations
  mm/sl[aou]b: Move common kmem_cache_size() to slab.h
  mm/slob: Use object_size field in kmem_cache_size()
  mm/slob: Drop usage of page-&gt;private for storing page-sized allocations
  slub: Commonize slab_cache field in struct page
  sl[au]b: Process slabinfo_show in common code
  mm/sl[au]b: Move print_slabinfo_header to slab_common.c
  mm/sl[au]b: Move slabinfo processing to slab_common.c
  slub: remove one code path and reduce lock contention in __slab_free()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull SLAB changes from Pekka Enberg:
 "This contains preparational work from Christoph Lameter and Glauber
  Costa for SLAB memcg and cleanups and improvements from Ezequiel
  Garcia and Joonsoo Kim.

  Please note that the SLOB cleanup commit from Arnd Bergmann already
  appears in your tree but I had also merged it myself which is why it
  shows up in the shortlog."

* 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
  mm/sl[aou]b: Common alignment code
  slab: Use the new create_boot_cache function to simplify bootstrap
  slub: Use statically allocated kmem_cache boot structure for bootstrap
  mm, sl[au]b: create common functions for boot slab creation
  slab: Simplify bootstrap
  slub: Use correct cpu_slab on dead cpu
  mm: fix slab.c kernel-doc warnings
  mm/slob: use min_t() to compare ARCH_SLAB_MINALIGN
  slab: Ignore internal flags in cache creation
  mm/slob: Use free_page instead of put_page for page-size kmalloc allocations
  mm/sl[aou]b: Move common kmem_cache_size() to slab.h
  mm/slob: Use object_size field in kmem_cache_size()
  mm/slob: Drop usage of page-&gt;private for storing page-sized allocations
  slub: Commonize slab_cache field in struct page
  sl[au]b: Process slabinfo_show in common code
  mm/sl[au]b: Move print_slabinfo_header to slab_common.c
  mm/sl[au]b: Move slabinfo processing to slab_common.c
  slub: remove one code path and reduce lock contention in __slab_free()
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma</title>
<updated>2012-12-16T23:18:08+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-16T22:33:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3d59eebc5e137bd89c6351e4c70e90ba1d0dc234'/>
<id>3d59eebc5e137bd89c6351e4c70e90ba1d0dc234</id>
<content type='text'>
Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling.  In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag.  In my own tests, balancenuma does
  reasonably well.  It's dumb as rocks and does not regress against
  mainline.  On the other hand, Ingo's tests shows that balancenuma is
  incapable of converging for this workloads driven by perf which is bad
  but is potentially explained by the lack of scheduler smarts.  Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma.  Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses.  Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers which is not reported by
  the tool by default and sometimes missed in treports.  Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions but currently it's unclear what the source of
  this problem is.  Initially I thought it was in how numacore batch
  handles PTEs but I'm no longer think this is the case.  It's possible
  numacore is just able to trigger it due to higher rates of migration.

  These reports were quite late in the cycle so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task&lt;-&gt;node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling.  In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag.  In my own tests, balancenuma does
  reasonably well.  It's dumb as rocks and does not regress against
  mainline.  On the other hand, Ingo's tests shows that balancenuma is
  incapable of converging for this workloads driven by perf which is bad
  but is potentially explained by the lack of scheduler smarts.  Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma.  Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses.  Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers which is not reported by
  the tool by default and sometimes missed in treports.  Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions but currently it's unclear what the source of
  this problem is.  Initially I thought it was in how numacore batch
  handles PTEs but I'm no longer think this is the case.  It's possible
  numacore is just able to trigger it due to higher rates of migration.

  These reports were quite late in the cycle so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task&lt;-&gt;node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: rearrange vm_area_struct for fewer cache misses</title>
<updated>2012-12-12T01:22:25+00:00</updated>
<author>
<name>Rik van Riel</name>
<email>riel@surriel.com</email>
</author>
<published>2012-12-12T00:01:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e4c6bfd2d79d063017ab19a18915f0bc759f32d9'/>
<id>e4c6bfd2d79d063017ab19a18915f0bc759f32d9</id>
<content type='text'>
The kernel walks the VMA rbtree in various places, including the page
fault path.  However, the vm_rb node spanned two cache lines, on 64 bit
systems with 64 byte cache lines (most x86 systems).

Rearrange vm_area_struct a little, so all the information we need to do a
VMA tree walk is in the first cache line.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michel Lespinasse &lt;walken@google.com&gt;
Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The kernel walks the VMA rbtree in various places, including the page
fault path.  However, the vm_rb node spanned two cache lines, on 64 bit
systems with 64 byte cache lines (most x86 systems).

Rearrange vm_area_struct a little, so all the information we need to do a
VMA tree walk is in the first cache line.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Michel Lespinasse &lt;walken@google.com&gt;
Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>mm: augment vma rbtree with rb_subtree_gap</title>
<updated>2012-12-12T01:22:25+00:00</updated>
<author>
<name>Michel Lespinasse</name>
<email>walken@google.com</email>
</author>
<published>2012-12-12T00:01:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d37371870ceb1d2165397dc36114725b6dca946c'/>
<id>d37371870ceb1d2165397dc36114725b6dca946c</id>
<content type='text'>
Define vma-&gt;rb_subtree_gap as the largest gap between any vma in the
subtree rooted at that vma, and their predecessor.  Or, for a recursive
definition, vma-&gt;rb_subtree_gap is the max of:

 - vma-&gt;vm_start - vma-&gt;vm_prev-&gt;vm_end
 - rb_subtree_gap fields of the vmas pointed by vma-&gt;rb.rb_left and
   vma-&gt;rb.rb_right

This will allow get_unmapped_area_* to find a free area of the right
size in O(log(N)) time, instead of potentially having to do a linear
walk across all the VMAs.

Also define mm-&gt;highest_vm_end as the vm_end field of the highest vma,
so that we can easily check if the following gap is suitable.

This does have the potential to make unmapping VMAs more expensive,
especially for processes with very large numbers of VMAs, where the VMA
rbtree can grow quite deep.

Signed-off-by: Michel Lespinasse &lt;walken@google.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Define vma-&gt;rb_subtree_gap as the largest gap between any vma in the
subtree rooted at that vma, and their predecessor.  Or, for a recursive
definition, vma-&gt;rb_subtree_gap is the max of:

 - vma-&gt;vm_start - vma-&gt;vm_prev-&gt;vm_end
 - rb_subtree_gap fields of the vmas pointed by vma-&gt;rb.rb_left and
   vma-&gt;rb.rb_right

This will allow get_unmapped_area_* to find a free area of the right
size in O(log(N)) time, instead of potentially having to do a linear
walk across all the VMAs.

Also define mm-&gt;highest_vm_end as the vm_end field of the highest vma,
so that we can easily check if the following gap is suitable.

This does have the potential to make unmapping VMAs more expensive,
especially for processes with very large numbers of VMAs, where the VMA
rbtree can grow quite deep.

Signed-off-by: Michel Lespinasse &lt;walken@google.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
