<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/powerpc/kernel/setup_64.c, branch v2.6.16</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>[PATCH] powerpc: Don't start secondary CPUs in a UP &amp;&amp; KEXEC kernel</title>
<updated>2006-02-20T01:03:34+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2006-02-16T03:13:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f018b36f3e1f21318066de8d01740d30e38b03d5'/>
<id>f018b36f3e1f21318066de8d01740d30e38b03d5</id>
<content type='text'>
Because smp_release_cpus() is built for SMP || KEXEC, it's not safe to
unconditionally call it from setup_system(). On a UP &amp;&amp; KEXEC kernel we'll
start up the secondary CPUs which will then go beserk and we die.

Simple fix is to conditionally call smp_release_cpus() in setup_system(). With
that in place we don't need the dummy definition of smp_release_cpus() because
all call sites are #ifdef'ed either SMP or KEXEC.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Because smp_release_cpus() is built for SMP || KEXEC, it's not safe to
unconditionally call it from setup_system(). On a UP &amp;&amp; KEXEC kernel we'll
start up the secondary CPUs which will then go beserk and we die.

Simple fix is to conditionally call smp_release_cpus() in setup_system(). With
that in place we don't need the dummy definition of smp_release_cpus() because
all call sites are #ifdef'ed either SMP or KEXEC.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] powerpc: Don't overwrite flat device tree with kdump kernel</title>
<updated>2006-02-07T10:32:44+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2006-02-03T08:05:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b68239ee746760bd99a68692f4c97a28f08a5d01'/>
<id>b68239ee746760bd99a68692f4c97a28f08a5d01</id>
<content type='text'>
It's possible for prom_init to allocate the flat device tree inside the
kdump crash kernel region. If this happens, when we load the kdump kernel we
overwrite the flattened device tree, which is bad.

We could make prom_init try and avoid allocating inside the crash kernel
region, but then we run into issues if the crash kernel region uses all the
space inside the RMO. The easiest solution is to move the flat device tree
once we're running in the kernel.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It's possible for prom_init to allocate the flat device tree inside the
kdump crash kernel region. If this happens, when we load the kdump kernel we
overwrite the flattened device tree, which is bad.

We could make prom_init try and avoid allocating inside the crash kernel
region, but then we run into issues if the crash kernel region uses all the
space inside the RMO. The easiest solution is to move the flat device tree
once we're running in the kernel.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] powerpc/64: per cpu data optimisations</title>
<updated>2006-01-11T03:49:45+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2006-01-11T02:16:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7a0268fa1a3613f2c526a9b3058701b277f6abe1'/>
<id>7a0268fa1a3613f2c526a9b3058701b277f6abe1</id>
<content type='text'>
The current ppc64 per cpu data implementation is quite slow. eg:

        lhz 11,18(13)           /* smp_processor_id() */
        ld 9,.LC63-.LCTOC1(30)  /* per_cpu__variable_name */
        ld 8,.LC61-.LCTOC1(30)  /* __per_cpu_offset */
        sldi 11,11,3            /* form index into __per_cpu_offset */
        mr 10,9
        ldx 9,11,8              /* __per_cpu_offset[smp_processor_id()] */
        ldx 0,10,9              /* load per cpu data */

5 loads for something that is supposed to be fast, pretty awful. One
reason for the large number of loads is that we have to synthesize 2
64bit constants (per_cpu__variable_name and __per_cpu_offset).

By putting __per_cpu_offset into the paca we can avoid the 2 loads
associated with it:

        ld 11,56(13)            /* paca-&gt;data_offset */
        ld 9,.LC59-.LCTOC1(30)  /* per_cpu__variable_name */
        ldx 0,9,11              /* load per cpu data

Longer term we can should be able to do even better than 3 loads.
If per_cpu__variable_name wasnt a 64bit constant and paca-&gt;data_offset
was in a register we could cut it down to one load. A suggestion from
Rusty is to use gcc's __thread extension here. In order to do this we
would need to free up r13 (the __thread register and where the paca
currently is). So far Ive had a few unsuccessful attempts at doing that :)

The patch also allocates per cpu memory node local on NUMA machines.
This patch from Rusty has been sitting in my queue _forever_ but stalled
when I hit the compiler bug. Sorry about that.

Finally I also only allocate per cpu data for possible cpus, which comes
straight out of the x86-64 port. On a pseries kernel (with NR_CPUS == 128)
and 4 possible cpus we see some nice gains:

             total       used       free     shared    buffers cached
Mem:       4012228     212860    3799368          0          0 162424

             total       used       free     shared    buffers cached
Mem:       4016200     212984    3803216          0          0 162424

A saving of 3.75MB. Quite nice for smaller machines. Note: we now have
to be careful of per cpu users that touch data for !possible cpus.

At this stage it might be worth making the NUMA and possible cpu
optimisations generic, but per cpu init is done so early we have to be
careful that all architectures have their possible map setup correctly.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current ppc64 per cpu data implementation is quite slow. eg:

        lhz 11,18(13)           /* smp_processor_id() */
        ld 9,.LC63-.LCTOC1(30)  /* per_cpu__variable_name */
        ld 8,.LC61-.LCTOC1(30)  /* __per_cpu_offset */
        sldi 11,11,3            /* form index into __per_cpu_offset */
        mr 10,9
        ldx 9,11,8              /* __per_cpu_offset[smp_processor_id()] */
        ldx 0,10,9              /* load per cpu data */

5 loads for something that is supposed to be fast, pretty awful. One
reason for the large number of loads is that we have to synthesize 2
64bit constants (per_cpu__variable_name and __per_cpu_offset).

By putting __per_cpu_offset into the paca we can avoid the 2 loads
associated with it:

        ld 11,56(13)            /* paca-&gt;data_offset */
        ld 9,.LC59-.LCTOC1(30)  /* per_cpu__variable_name */
        ldx 0,9,11              /* load per cpu data

Longer term we can should be able to do even better than 3 loads.
If per_cpu__variable_name wasnt a 64bit constant and paca-&gt;data_offset
was in a register we could cut it down to one load. A suggestion from
Rusty is to use gcc's __thread extension here. In order to do this we
would need to free up r13 (the __thread register and where the paca
currently is). So far Ive had a few unsuccessful attempts at doing that :)

The patch also allocates per cpu memory node local on NUMA machines.
This patch from Rusty has been sitting in my queue _forever_ but stalled
when I hit the compiler bug. Sorry about that.

Finally I also only allocate per cpu data for possible cpus, which comes
straight out of the x86-64 port. On a pseries kernel (with NR_CPUS == 128)
and 4 possible cpus we see some nice gains:

             total       used       free     shared    buffers cached
Mem:       4012228     212860    3799368          0          0 162424

             total       used       free     shared    buffers cached
Mem:       4016200     212984    3803216          0          0 162424

A saving of 3.75MB. Quite nice for smaller machines. Note: we now have
to be careful of per cpu users that touch data for !possible cpus.

At this stage it might be worth making the NUMA and possible cpu
optimisations generic, but per cpu init is done so early we have to be
careful that all architectures have their possible map setup correctly.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] powerpc: Make early debugging configurable via Kconfig</title>
<updated>2006-01-11T03:48:26+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2006-01-11T00:54:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=296167ae1799815b9ed2d135a847436502f2ee91'/>
<id>296167ae1799815b9ed2d135a847436502f2ee91</id>
<content type='text'>
This patch adds Kconfig entries to control the early debugging options,
currently in setup_64.c.

Doing this via Kconfig rather than #defines means you can have one source tree,
which is buildable for multiple platforms - and you can enable the correct
early debug option for each platform via .config.

I made udbg_early_init() a static inline because otherwise GCC is to daft to
optimise it away when debugging is off.

Now that we have udbg_init_rtas() we can make call_rtas_display_status* static.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds Kconfig entries to control the early debugging options,
currently in setup_64.c.

Doing this via Kconfig rather than #defines means you can have one source tree,
which is buildable for multiple platforms - and you can enable the correct
early debug option for each platform via .config.

I made udbg_early_init() a static inline because otherwise GCC is to daft to
optimise it away when debugging is off.

Now that we have udbg_init_rtas() we can make call_rtas_display_status* static.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] powerpc: Early debugging support for iSeries</title>
<updated>2006-01-11T03:48:13+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2006-01-11T00:54:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=bf6a7112bda99aadd6675526423a96be6b356a3d'/>
<id>bf6a7112bda99aadd6675526423a96be6b356a3d</id>
<content type='text'>
Connect iSeries up to the standard early debugging infrastructure.

To actually use this you need to enable the iSeries early debugging
in setup_64.c. Then after the messages are logged hit Ctrl-x Ctrl-x on
your console to dump the Hypervisor console buffer.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Connect iSeries up to the standard early debugging infrastructure.

To actually use this you need to enable the iSeries early debugging
in setup_64.c. Then after the messages are logged hit Ctrl-x Ctrl-x on
your console to dump the Hypervisor console buffer.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Introduce a new config symbol to control 16550 early debug code</title>
<updated>2006-01-10T05:19:05+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2006-01-10T05:19:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=13b8a272297b29870d5bf5f8db7a381dd9e82382'/>
<id>13b8a272297b29870d5bf5f8db7a381dd9e82382</id>
<content type='text'>
The previous change by Kumar Gala in this area led to legacy_serial.c
and udbg_16550.c being built as modules when CONFIG_SERIAL_8250=m.
Fix this by introducing a new symbol, CONFIG_PPC_UDBG_16550, to
control whether these files get built, and arrange for it to be selected
for those platforms that need it.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The previous change by Kumar Gala in this area led to legacy_serial.c
and udbg_16550.c being built as modules when CONFIG_SERIAL_8250=m.
Fix this by introducing a new symbol, CONFIG_PPC_UDBG_16550, to
control whether these files get built, and arrange for it to be selected
for those platforms that need it.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>spelling: s/retreive/retrieve/</title>
<updated>2006-01-09T23:10:13+00:00</updated>
<author>
<name>Adrian Bunk</name>
<email>bunk@stusta.de</email>
</author>
<published>2006-01-09T23:10:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=943ffb587cfdf3b2adfe52a6db08573f4ecf3284'/>
<id>943ffb587cfdf3b2adfe52a6db08573f4ecf3284</id>
<content type='text'>
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Signed-off-by: Adrian Bunk &lt;bunk@stusta.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] powerpc: Call find_legacy_serial_ports() if we enable CONFIG_SERIAL_8250</title>
<updated>2006-01-09T04:34:31+00:00</updated>
<author>
<name>Kumar Gala</name>
<email>galak@gate.crashing.org</email>
</author>
<published>2005-12-21T15:27:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=79e7bac0d6ad56d62e2364313b5e5e5950c7385d'/>
<id>79e7bac0d6ad56d62e2364313b5e5e5950c7385d</id>
<content type='text'>
In setup_arch and setup_system call find_legacy_serial_ports() if we
build in support for 8250 serial ports instead of basing it on PPC_MULTIPLATFORM.

Signed-off-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In setup_arch and setup_system call find_legacy_serial_ports() if we
build in support for 8250 serial ports instead of basing it on PPC_MULTIPLATFORM.

Signed-off-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] powerpc: Fixups for kernel linked at 32 MB</title>
<updated>2006-01-09T03:52:25+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2005-12-05T21:49:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=758438a7b8da593c9116e95cc7fdff6e9e0b0c40'/>
<id>758438a7b8da593c9116e95cc7fdff6e9e0b0c40</id>
<content type='text'>
There's a few places where we need to fix things up for the kernel to work
if it's linked at 32MB:

 - platforms/powermac/smp.c
   To start secondary cpus on pmac we patch the reset vector, which is fine.
   Except if we're above 32MB we don't have enough bits for an absolute branch,
   it needs to relative.
 - kernel/head_64.s
    - A few branches in the cpu hold code need to load the full target address
      and do a bctr.
    - after_prom_start needs to load PHYSICAL_START as the dest address, not 0.
    - The exception prolog needs to load the low word of the target adddress,
      not just the low halfword.
    - Fixup handling of the initial stab address.
 - kernel/setup_64.c
   smp_release_cpus() needs to write 1 to the spinloop flag near 0, not 32 MB.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There's a few places where we need to fix things up for the kernel to work
if it's linked at 32MB:

 - platforms/powermac/smp.c
   To start secondary cpus on pmac we patch the reset vector, which is fine.
   Except if we're above 32MB we don't have enough bits for an absolute branch,
   it needs to relative.
 - kernel/head_64.s
    - A few branches in the cpu hold code need to load the full target address
      and do a bctr.
    - after_prom_start needs to load PHYSICAL_START as the dest address, not 0.
    - The exception prolog needs to load the low word of the target adddress,
      not just the low halfword.
    - Fixup handling of the initial stab address.
 - kernel/setup_64.c
   smp_release_cpus() needs to write 1 to the spinloop flag near 0, not 32 MB.

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>[PATCH] powerpc: Reroute interrupts from 0 + offset to PHYSICAL_START + offset</title>
<updated>2006-01-09T03:52:21+00:00</updated>
<author>
<name>Michael Ellerman</name>
<email>michael@ellerman.id.au</email>
</author>
<published>2005-12-04T07:39:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0cc4746cadda16826a1b3214c042a2f75445b71c'/>
<id>0cc4746cadda16826a1b3214c042a2f75445b71c</id>
<content type='text'>
Regardless of where the kernel's linked we always get interrupts at low
addresses. This patch creates a trampoline in the first 3 pages of memory,
where interrupts land, and patches those addresses to jump into the real
kernel code at PHYSICAL_START.

We also need to reserve the trampoline code and a bit more in prom.c

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Regardless of where the kernel's linked we always get interrupts at low
addresses. This patch creates a trampoline in the first 3 pages of memory,
where interrupts land, and patches those addresses to jump into the real
kernel code at PHYSICAL_START.

We also need to reserve the trampoline code and a bit more in prom.c

Signed-off-by: Michael Ellerman &lt;michael@ellerman.id.au&gt;
Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
