<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/powerpc/include/asm/processor.h, branch v5.0.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>treewide: remove current_text_addr</title>
<updated>2018-10-31T15:54:12+00:00</updated>
<author>
<name>Nick Desaulniers</name>
<email>ndesaulniers@google.com</email>
</author>
<published>2018-10-30T22:04:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=de0d22e50cd3d57277f073ccf65d57aa519d6888'/>
<id>de0d22e50cd3d57277f073ccf65d57aa519d6888</id>
<content type='text'>
Prefer _THIS_IP_ defined in linux/kernel.h.

Most definitions of current_text_addr were the same as _THIS_IP_, but
a few archs had inline assembly instead.

This patch removes the final call site of current_text_addr, making all
of the definitions dead code.

[akpm@linux-foundation.org: fix arch/csky/include/asm/processor.h]
Link: http://lkml.kernel.org/r/20180911182413.180715-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Prefer _THIS_IP_ defined in linux/kernel.h.

Most definitions of current_text_addr were the same as _THIS_IP_, but
a few archs had inline assembly instead.

This patch removes the final call site of current_text_addr, making all
of the definitions dead code.

[akpm@linux-foundation.org: fix arch/csky/include/asm/processor.h]
Link: http://lkml.kernel.org/r/20180911182413.180715-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64s/hash: Add a SLB preload cache</title>
<updated>2018-10-14T07:04:09+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2018-09-14T15:30:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5434ae74629af58ad0fc27143a9ea435f7734410'/>
<id>5434ae74629af58ad0fc27143a9ea435f7734410</id>
<content type='text'>
When switching processes, currently all user SLBEs are cleared, and a
few (exec_base, pc, and stack) are preloaded. In trivial testing with
small apps, this tends to miss the heap and low 256MB segments, and it
will also miss commonly accessed segments on large memory workloads.

Add a simple round-robin preload cache that just inserts the last SLB
miss into the head of the cache and preloads those at context switch
time. Every 256 context switches, the oldest entry is removed from the
cache to shrink the cache and require fewer slbmte if they are unused.

Much more could go into this, including into the SLB entry reclaim
side to track some LRU information etc, which would require a study of
large memory workloads. But this is a simple thing we can do now that
is an obvious win for common workloads.

With the full series, process switching speed on the context_switch
benchmark on POWER9/hash (with kernel speculation security masures
disabled) increases from 140K/s to 178K/s (27%).

POWER8 does not change much (within 1%), it's unclear why it does not
see a big gain like POWER9.

Booting to busybox init with 256MB segments has SLB misses go down
from 945 to 69, and with 1T segments 900 to 21. These could almost all
be eliminated by preloading a bit more carefully with ELF binary
loading.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When switching processes, currently all user SLBEs are cleared, and a
few (exec_base, pc, and stack) are preloaded. In trivial testing with
small apps, this tends to miss the heap and low 256MB segments, and it
will also miss commonly accessed segments on large memory workloads.

Add a simple round-robin preload cache that just inserts the last SLB
miss into the head of the cache and preloads those at context switch
time. Every 256 context switches, the oldest entry is removed from the
cache to shrink the cache and require fewer slbmte if they are unused.

Much more could go into this, including into the SLB entry reclaim
side to track some LRU information etc, which would require a study of
large memory workloads. But this is a simple thing we can do now that
is an obvious win for common workloads.

With the full series, process switching speed on the context_switch
benchmark on POWER9/hash (with kernel speculation security masures
disabled) increases from 140K/s to 178K/s (27%).

POWER8 does not change much (within 1%), it's unclear why it does not
see a big gain like POWER9.

Booting to busybox init with 256MB segments has SLB misses go down
from 945 to 69, and with 1T segments 900 to 21. These could almost all
be eliminated by preloading a bit more carefully with ELF binary
loading.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64: Interrupts save PPR on stack rather than thread_struct</title>
<updated>2018-10-14T07:04:09+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2018-10-12T13:15:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4c2de74cc8696154b283f241d74ec0bb24438e22'/>
<id>4c2de74cc8696154b283f241d74ec0bb24438e22</id>
<content type='text'>
PPR is the odd register out when it comes to interrupt handling, it is
saved in current-&gt;thread.ppr while all others are saved on the stack.

The difficulty with this is that accessing thread.ppr can cause a SLB
fault, but the SLB fault handler implementation in C change had
assumed the normal exception entry handlers would not cause an SLB
fault.

Fix this by allocating room in the interrupt stack to save PPR.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
PPR is the odd register out when it comes to interrupt handling, it is
saved in current-&gt;thread.ppr while all others are saved on the stack.

The difficulty with this is that accessing thread.ppr can cause a SLB
fault, but the SLB fault handler implementation in C change had
assumed the normal exception entry handlers would not cause an SLB
fault.

Fix this by allocating room in the interrupt stack to save PPR.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "convert SLB miss handlers to C" and subsequent commits</title>
<updated>2018-10-03T05:32:49+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2018-10-02T13:56:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=54be0b9c7c9888ebe63b89a31a17ee3df6a68d61'/>
<id>54be0b9c7c9888ebe63b89a31a17ee3df6a68d61</id>
<content type='text'>
This reverts commits:
  5e46e29e6a97 ("powerpc/64s/hash: convert SLB miss handlers to C")
  8fed04d0f6ae ("powerpc/64s/hash: remove user SLB data from the paca")
  655deecf67b2 ("powerpc/64s/hash: SLB allocation status bitmaps")
  2e1626744e8d ("powerpc/64s/hash: provide arch_setup_exec hooks for hash slice setup")
  89ca4e126a3f ("powerpc/64s/hash: Add a SLB preload cache")

This series had a few bugs, and the fixes are not all trivial. So
revert most of it for now.

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commits:
  5e46e29e6a97 ("powerpc/64s/hash: convert SLB miss handlers to C")
  8fed04d0f6ae ("powerpc/64s/hash: remove user SLB data from the paca")
  655deecf67b2 ("powerpc/64s/hash: SLB allocation status bitmaps")
  2e1626744e8d ("powerpc/64s/hash: provide arch_setup_exec hooks for hash slice setup")
  89ca4e126a3f ("powerpc/64s/hash: Add a SLB preload cache")

This series had a few bugs, and the fixes are not all trivial. So
revert most of it for now.

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64s/hash: Add a SLB preload cache</title>
<updated>2018-09-19T12:01:56+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2018-09-14T15:30:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=89ca4e126a3f519ccbd42670b38d78700802c10b'/>
<id>89ca4e126a3f519ccbd42670b38d78700802c10b</id>
<content type='text'>
When switching processes, currently all user SLBEs are cleared, and a
few (exec_base, pc, and stack) are preloaded. In trivial testing with
small apps, this tends to miss the heap and low 256MB segments, and it
will also miss commonly accessed segments on large memory workloads.

Add a simple round-robin preload cache that just inserts the last SLB
miss into the head of the cache and preloads those at context switch
time. Every 256 context switches, the oldest entry is removed from the
cache to shrink the cache and require fewer slbmte if they are unused.

Much more could go into this, including into the SLB entry reclaim
side to track some LRU information etc, which would require a study of
large memory workloads. But this is a simple thing we can do now that
is an obvious win for common workloads.

With the full series, process switching speed on the context_switch
benchmark on POWER9/hash (with kernel speculation security masures
disabled) increases from 140K/s to 178K/s (27%).

POWER8 does not change much (within 1%), it's unclear why it does not
see a big gain like POWER9.

Booting to busybox init with 256MB segments has SLB misses go down
from 945 to 69, and with 1T segments 900 to 21. These could almost all
be eliminated by preloading a bit more carefully with ELF binary
loading.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When switching processes, currently all user SLBEs are cleared, and a
few (exec_base, pc, and stack) are preloaded. In trivial testing with
small apps, this tends to miss the heap and low 256MB segments, and it
will also miss commonly accessed segments on large memory workloads.

Add a simple round-robin preload cache that just inserts the last SLB
miss into the head of the cache and preloads those at context switch
time. Every 256 context switches, the oldest entry is removed from the
cache to shrink the cache and require fewer slbmte if they are unused.

Much more could go into this, including into the SLB entry reclaim
side to track some LRU information etc, which would require a study of
large memory workloads. But this is a simple thing we can do now that
is an obvious win for common workloads.

With the full series, process switching speed on the context_switch
benchmark on POWER9/hash (with kernel speculation security masures
disabled) increases from 140K/s to 178K/s (27%).

POWER8 does not change much (within 1%), it's unclear why it does not
see a big gain like POWER9.

Booting to busybox init with 256MB segments has SLB misses go down
from 945 to 69, and with 1T segments 900 to 21. These could almost all
be eliminated by preloading a bit more carefully with ELF binary
loading.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: fix includes in asm/processor.h</title>
<updated>2018-07-30T12:48:19+00:00</updated>
<author>
<name>Christophe Leroy</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2018-07-05T16:25:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=62b8426578c414c918468ab4cc7517da7adc31d5'/>
<id>62b8426578c414c918468ab4cc7517da7adc31d5</id>
<content type='text'>
Remove superflous includes and add missing ones

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove superflous includes and add missing ones

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Rename thread_struct.fs to addr_limit</title>
<updated>2018-06-03T10:43:42+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>mpe@ellerman.id.au</email>
</author>
<published>2018-05-14T13:03:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ba0635fcbe8c1ce83523c1ec79753868ce57f7a8'/>
<id>ba0635fcbe8c1ce83523c1ec79753868ce57f7a8</id>
<content type='text'>
It's called 'fs' for historical reasons, it's named after the x86 'FS'
register. But we don't have to use that name for the member of
thread_struct, and in fact arch/x86 doesn't even call it 'fs' anymore.

So rename it to 'addr_limit', which better reflects what it's used
for, and is also the name used on other arches.

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's called 'fs' for historical reasons, it's named after the x86 'FS'
register. But we don't have to use that name for the member of
thread_struct, and in fact arch/x86 doesn't even call it 'fs' anymore.

So rename it to 'addr_limit', which better reflects what it's used
for, and is also the name used on other arches.

Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64: remove start_tb and accum_tb from thread_struct</title>
<updated>2018-06-03T10:40:26+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2018-05-04T17:19:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=36d632ea831fd2fa3cb62599a465825f59076f64'/>
<id>36d632ea831fd2fa3cb62599a465825f59076f64</id>
<content type='text'>
These fields are only written to.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These fields are only written to.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/64s/idle: POWER9 implement a separate idle stop function for hotplug</title>
<updated>2018-03-31T13:47:46+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2017-11-17T14:08:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3d4fbffdd703d2b968db443911f2147c732a4a48'/>
<id>3d4fbffdd703d2b968db443911f2147c732a4a48</id>
<content type='text'>
Implement a new function to invoke stop, power9_offline_stop, which is
like power9_idle_stop but used by the cpu hotplug code.

Move KVM secondary state manipulation code to the offline case.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Reviewed-by: Vaidyanathan Srinivasan &lt;svaidy@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement a new function to invoke stop, power9_offline_stop, which is
like power9_idle_stop but used by the cpu hotplug code.

Move KVM secondary state manipulation code to the offline case.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Reviewed-by: Vaidyanathan Srinivasan &lt;svaidy@linux.vnet.ibm.com&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/mm/hash64: Increase the VA range</title>
<updated>2018-03-30T13:10:38+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2018-03-26T10:04:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c2b4d8b7417a59b7f9a52d0d8402f5257cbbd398'/>
<id>c2b4d8b7417a59b7f9a52d0d8402f5257cbbd398</id>
<content type='text'>
This patch increases the max virtual (effective) address value to 4PB.
With 4K page size config we continue to limit ourself to 64TB.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
[mpe: Keep the H_PGTABLE_RANGE test, update it to work]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch increases the max virtual (effective) address value to 4PB.
With 4K page size config we continue to limit ourself to 64TB.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
[mpe: Keep the H_PGTABLE_RANGE test, update it to work]
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
</feed>
