<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/powerpc/kernel/entry_64.S, branch v2.6.30</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>powerpc64, ftrace: save toc only on modules for function graph</title>
<updated>2009-02-22T23:48:54+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>srostedt@redhat.com</email>
</author>
<published>2009-02-11T20:45:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bb7253403f7a4670a128e4c080fd8ea1bd4d5029'/>
<id>bb7253403f7a4670a128e4c080fd8ea1bd4d5029</id>
<content type='text'>
The TOCS used by modules are different than the one used by
the core kernel code. The function graph tracer must save and
restore the TOC whenever it traces a module call. But this
is an added overhead to burden the majority of core kernel
code being traced.

Benjamin Herrenschmidt suggested in testing the entry of
the call to tell if it is a core kernel function or a module.
He recommended using the REGION_ID() macro to perform this test.

This patch implements Benjamin's idea, and uses a different
return_to_handler routine dependent on if the entry is a core
kernel function or not. The module version saves the TOC, where as
the core kernel version does not.

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand &lt;geoffrey.levand@am.sony.com&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The TOCS used by modules are different than the one used by
the core kernel code. The function graph tracer must save and
restore the TOC whenever it traces a module call. But this
is an added overhead to burden the majority of core kernel
code being traced.

Benjamin Herrenschmidt suggested in testing the entry of
the call to tell if it is a core kernel function or a module.
He recommended using the REGION_ID() macro to perform this test.

This patch implements Benjamin's idea, and uses a different
return_to_handler routine dependent on if the entry is a core
kernel function or not. The module version saves the TOC, where as
the core kernel version does not.

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand &lt;geoffrey.levand@am.sony.com&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc64, tracing: add function graph tracer with dynamic tracing</title>
<updated>2009-02-22T23:48:54+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>srostedt@redhat.com</email>
</author>
<published>2009-02-11T06:19:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=465428884765b43d642a967915e16c6c7cacbe8e'/>
<id>465428884765b43d642a967915e16c6c7cacbe8e</id>
<content type='text'>
This is the port of the function graph tracer to PowerPC with
dynamic tracing.

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand &lt;geoffrey.levand@am.sony.com&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the port of the function graph tracer to PowerPC with
dynamic tracing.

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand &lt;geoffrey.levand@am.sony.com&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc64: port of the function graph tracer</title>
<updated>2009-02-22T23:48:53+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>srostedt@redhat.com</email>
</author>
<published>2009-02-10T05:10:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6794c78243bfda020ab184d6d578944f8e90d26c'/>
<id>6794c78243bfda020ab184d6d578944f8e90d26c</id>
<content type='text'>
This is a port of the function graph tracer that was written by
Frederic Weisbecker for the x86.

This only works for PPC64 at the moment and only for static tracing.
PPC32 and dynamic function graph tracing support will come later.

The trace produces a visual calling of functions:

 # tracer: function_graph
 #
 # CPU  DURATION                  FUNCTION CALLS
 # |     |   |                     |   |   |   |
  0)   2.224 us    |                        }
  0) ! 271.024 us  |                      }
  0) ! 320.080 us  |                    }
  0) ! 324.656 us  |                  }
  0) ! 329.136 us  |                }
  0)               |                .put_prev_task_fair() {
  0)               |                  .update_curr() {
  0)   2.240 us    |                    .update_min_vruntime();
  0)   6.512 us    |                  }
  0)   2.528 us    |                  .__enqueue_entity();
  0) + 15.536 us   |                }
  0)               |                .pick_next_task_fair() {
  0)   2.032 us    |                  .__pick_next_entity();
  0)   2.064 us    |                  .__clear_buddies();
  0)               |                  .set_next_entity() {
  0)   2.672 us    |                    .__dequeue_entity();
  0)   6.864 us    |                  }

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand &lt;geoffrey.levand@am.sony.com&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is a port of the function graph tracer that was written by
Frederic Weisbecker for the x86.

This only works for PPC64 at the moment and only for static tracing.
PPC32 and dynamic function graph tracing support will come later.

The trace produces a visual calling of functions:

 # tracer: function_graph
 #
 # CPU  DURATION                  FUNCTION CALLS
 # |     |   |                     |   |   |   |
  0)   2.224 us    |                        }
  0) ! 271.024 us  |                      }
  0) ! 320.080 us  |                    }
  0) ! 324.656 us  |                  }
  0) ! 329.136 us  |                }
  0)               |                .put_prev_task_fair() {
  0)               |                  .update_curr() {
  0)   2.240 us    |                    .update_min_vruntime();
  0)   6.512 us    |                  }
  0)   2.528 us    |                  .__enqueue_entity();
  0) + 15.536 us   |                }
  0)               |                .pick_next_task_fair() {
  0)   2.032 us    |                  .__pick_next_entity();
  0)   2.064 us    |                  .__clear_buddies();
  0)               |                  .set_next_entity() {
  0)   2.672 us    |                    .__dequeue_entity();
  0)   6.864 us    |                  }

Geoff Lavand tested on PS3.

Tested-by: Geoff Levand &lt;geoffrey.levand@am.sony.com&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge commit 'v2.6.28-rc7' into tracing/core</title>
<updated>2008-12-04T08:07:19+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2008-12-04T08:07:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b8307db2477f9c551e54e0c7b643ea349a3349cd'/>
<id>b8307db2477f9c551e54e0c7b643ea349a3349cd</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Fix system calls on Cell entered with XER.SO=1</title>
<updated>2008-11-30T22:40:19+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2008-11-30T11:49:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ab598b6680f1e74c267d1547ee352f3e1e530f89'/>
<id>ab598b6680f1e74c267d1547ee352f3e1e530f89</id>
<content type='text'>
It turns out that on Cell, on a kernel with CONFIG_VIRT_CPU_ACCOUNTING
= y, if a program sets the SO (summary overflow) bit in the XER and
then does a system call, the SO bit in CR0 will be set on return
regardless of whether the system call detected an error.  Since CR0.SO
is used as the error indication from the system call, this means that
all system calls appear to fail.

The reason is that the workaround for the timebase bug on Cell uses a
compare instruction.  With CONFIG_VIRT_CPU_ACCOUNTING = y, the
ACCOUNT_CPU_USER_ENTRY macro reads the timebase, so we end up doing a
compare instruction, which copies XER.SO to CR0.SO.  Since we were
doing this in the system call entry patch after clearing CR0.SO but
before saving the CR, this meant that the saved CR image had CR0.SO
set if XER.SO was set on entry.

This fixes it by moving the clearing of CR0.SO to after the
ACCOUNT_CPU_USER_ENTRY call in the system call entry path.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It turns out that on Cell, on a kernel with CONFIG_VIRT_CPU_ACCOUNTING
= y, if a program sets the SO (summary overflow) bit in the XER and
then does a system call, the SO bit in CR0 will be set on return
regardless of whether the system call detected an error.  Since CR0.SO
is used as the error indication from the system call, this means that
all system calls appear to fail.

The reason is that the workaround for the timebase bug on Cell uses a
compare instruction.  With CONFIG_VIRT_CPU_ACCOUNTING = y, the
ACCOUNT_CPU_USER_ENTRY macro reads the timebase, so we end up doing a
compare instruction, which copies XER.SO to CR0.SO.  Since we were
doing this in the system call entry patch after clearing CR0.SO but
before saving the CR, this meant that the saved CR image had CR0.SO
set if XER.SO was set on entry.

This fixes it by moving the clearing of CR0.SO to after the
ACCOUNT_CPU_USER_ENTRY call in the system call entry path.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: ftrace, do nothing in mcount call for dyn ftrace</title>
<updated>2008-11-28T13:07:45+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>srostedt@redhat.com</email>
</author>
<published>2008-11-20T21:18:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c7b0d17366d6e04a11470fc8d85f9fbac02671b9'/>
<id>c7b0d17366d6e04a11470fc8d85f9fbac02671b9</id>
<content type='text'>
Impact: quicken mcount calls that are not replaced by dyn ftrace

Dynamic ftrace no longer does on the fly recording of mcount locations.
The mcount locations are now found at compile time. The mcount
function no longer needs to store registers and call a stub function.
It can now just simply return.

Since there are some functions that do not get converted to a nop
(.init sections and other code that may disappear), this patch should
help speed up that code.

Also, the stub for mcount on PowerPC 32 can not be a simple branch
link register like it is on PowerPC 64. According to the ABI specification:

"The _mcount routine is required to restore the link register from
 the stack so that the profiling code can be inserted transparently,
 whether or not the profiled function saves the link register itself."

This means that we must restore the link register that was used
to make the call to mcount.  The minimal mcount function for PPC32
ends up being:

 mcount:
        mflr    r0
        mtctr   r0
        lwz     r0, 4(r1)
        mtlr    r0
        bctr

Where we move the link register used to call mcount into the
ctr register, and then restore the link register from the stack.
Then we use the ctr register to jump back to the mcount caller.
The r0 register is free for us to use.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Impact: quicken mcount calls that are not replaced by dyn ftrace

Dynamic ftrace no longer does on the fly recording of mcount locations.
The mcount locations are now found at compile time. The mcount
function no longer needs to store registers and call a stub function.
It can now just simply return.

Since there are some functions that do not get converted to a nop
(.init sections and other code that may disappear), this patch should
help speed up that code.

Also, the stub for mcount on PowerPC 32 can not be a simple branch
link register like it is on PowerPC 64. According to the ABI specification:

"The _mcount routine is required to restore the link register from
 the stack so that the profiling code can be inserted transparently,
 whether or not the profiled function saves the link register itself."

This means that we must restore the link register that was used
to make the call to mcount.  The minimal mcount function for PPC32
ends up being:

 mcount:
        mflr    r0
        mtctr   r0
        lwz     r0, 4(r1)
        mtlr    r0
        bctr

Where we move the link register used to call mcount into the
ctr register, and then restore the link register from the stack.
Then we use the ctr register to jump back to the mcount caller.
The r0 register is free for us to use.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ftrace: rename FTRACE to FUNCTION_TRACER</title>
<updated>2008-10-20T16:27:03+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2008-10-06T23:06:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=606576ce816603d9fe1fb453a88bc6eea16ca709'/>
<id>606576ce816603d9fe1fb453a88bc6eea16ca709</id>
<content type='text'>
Due to confusion between the ftrace infrastructure and the gcc profiling
tracer "ftrace", this patch renames the config options from FTRACE to
FUNCTION_TRACER.  The other two names that are offspring from FTRACE
DYNAMIC_FTRACE and FTRACE_MCOUNT_RECORD will stay the same.

This patch was generated mostly by script, and partially by hand.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Due to confusion between the ftrace infrastructure and the gcc profiling
tracer "ftrace", this patch renames the config options from FTRACE to
FUNCTION_TRACER.  The other two names that are offspring from FTRACE
DYNAMIC_FTRACE and FTRACE_MCOUNT_RECORD will stay the same.

This patch was generated mostly by script, and partially by hand.

Signed-off-by: Steven Rostedt &lt;srostedt@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Use LOAD_REG_IMMEDIATE only for constants on 64-bit</title>
<updated>2008-09-15T18:08:35+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2008-08-30T01:41:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e31aa453bbc4886a7bd33e5c2afa526d6f55bd7a'/>
<id>e31aa453bbc4886a7bd33e5c2afa526d6f55bd7a</id>
<content type='text'>
Using LOAD_REG_IMMEDIATE to get the address of kernel symbols
generates 5 instructions where LOAD_REG_ADDR can do it in one,
and will generate R_PPC64_ADDR16_* relocations in the output when
we get to making the kernel as a position-independent executable,
which we'd rather not have to handle.  This changes various bits
of assembly code to use LOAD_REG_ADDR when we need to get the
address of a symbol, or to use suitable position-independent code
for cases where we can't access the TOC for various reasons, or
if we're not running at the address we were linked at.

It also cleans up a few minor things; there's no reason to save and
restore SRR0/1 around RTAS calls, __mmu_off can get the return
address from LR more conveniently than the caller can supply it in
R4 (and we already assume elsewhere that EA == RA if the MMU is on
in early boot), and enable_64b_mode was using 5 instructions where
2 would do.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Using LOAD_REG_IMMEDIATE to get the address of kernel symbols
generates 5 instructions where LOAD_REG_ADDR can do it in one,
and will generate R_PPC64_ADDR16_* relocations in the output when
we get to making the kernel as a position-independent executable,
which we'd rather not have to handle.  This changes various bits
of assembly code to use LOAD_REG_ADDR when we need to get the
address of a symbol, or to use suitable position-independent code
for cases where we can't access the TOC for various reasons, or
if we're not running at the address we were linked at.

It also cleans up a few minor things; there's no reason to save and
restore SRR0/1 around RTAS calls, __mmu_off can get the return
address from LR more conveniently than the caller can supply it in
R4 (and we already assume elsewhere that EA == RA if the MMU is on
in early boot), and enable_64b_mode was using 5 instructions where
2 would do.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Streamline ret_from_except_lite for non-iSeries platforms</title>
<updated>2008-08-20T06:34:57+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2008-07-16T04:21:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=01f3880dd8a7fa78c419da2db740cba511ca7798'/>
<id>01f3880dd8a7fa78c419da2db740cba511ca7798</id>
<content type='text'>
There is a small passage of code in ret_from_except_lite which is
only required on iSeries.  For a multi-platform kernel on non-iSeries
machines this means we end up executing ~15 nops in ret_from_except_lite.

It would be nicer if non-iSeries could skip the code entirely, and on
iSeries we can jump out of line to execute the code.

I have no performance numbers to justify this, other than the assertion
that executing 15 nops takes longer than executing 0.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a small passage of code in ret_from_except_lite which is
only required on iSeries.  For a multi-platform kernel on non-iSeries
machines this means we end up executing ~15 nops in ret_from_except_lite.

It would be nicer if non-iSeries could skip the code entirely, and on
iSeries we can jump out of line to execute the code.

I have no performance numbers to justify this, other than the assertion
that executing 15 nops takes longer than executing 0.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Add TIF_NOTIFY_RESUME support for tracehook</title>
<updated>2008-07-28T06:30:50+00:00</updated>
<author>
<name>Roland McGrath</name>
<email>roland@redhat.com</email>
</author>
<published>2008-07-27T06:52:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7d6d637dac2050f30a1b57b0a3dc5de4a10616ba'/>
<id>7d6d637dac2050f30a1b57b0a3dc5de4a10616ba</id>
<content type='text'>
This adds TIF_NOTIFY_RESUME support for powerpc.  When set,
we call tracehook_notify_resume() on the way to user mode.
This overloads do_signal() to do the work, but changes its
arguments to it has the TIF_* bits handy in a register and
drops the useless first argument that was always zero.

Signed-off-by: Roland McGrath &lt;roland@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds TIF_NOTIFY_RESUME support for powerpc.  When set,
we call tracehook_notify_resume() on the way to user mode.
This overloads do_signal() to do the work, but changes its
arguments to it has the TIF_* bits handy in a register and
drops the useless first argument that was always zero.

Signed-off-by: Roland McGrath &lt;roland@redhat.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
