<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/powerpc/include/asm/ppc-opcode.h, branch v2.6.39</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>powerpc: Hardcode popcnt instructions for old assemblers</title>
<updated>2010-12-09T04:35:30+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2010-12-07T19:58:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b5f9b6665b70b4c844bed5452f6a14743eefa57c'/>
<id>b5f9b6665b70b4c844bed5452f6a14743eefa57c</id>
<content type='text'>
The popcnt instructions went into binutils relatively recently. As with a
number of other instructions, create macros and hardcode them.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The popcnt instructions went into binutils relatively recently. As with a
number of other instructions, create macros and hardcode them.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Emulate most Book I instructions in emulate_step()</title>
<updated>2010-06-22T09:40:29+00:00</updated>
<author>
<name>Paul Mackerras</name>
<email>paulus@samba.org</email>
</author>
<published>2010-06-15T04:48:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0016a4cf5582415849fafbf9f019dd9530824789'/>
<id>0016a4cf5582415849fafbf9f019dd9530824789</id>
<content type='text'>
This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This extends the emulate_step() function to handle a large proportion
of the Book I instructions implemented on current 64-bit server
processors.  The aim is to handle all the load and store instructions
used in the kernel, plus all of the instructions that appear between
l[wd]arx and st[wd]cx., so this handles the Altivec/VMX lvx and stvx
and the VSX lxv2dx and stxv2dx instructions (implemented in POWER7).

The new code can emulate user mode instructions, and checks the
effective address for a load or store if the saved state is for
user mode.  It doesn't handle little-endian mode at present.

For floating-point, Altivec/VMX and VSX instructions, it checks
that the saved MSR has the enable bit for the relevant facility
set, and if so, assumes that the FP/VMX/VSX registers contain
valid state, and does loads or stores directly to/from the
FP/VMX/VSX registers, using assembly helpers in ldstfp.S.

Instructions supported now include:
* Loads and stores, including some but not all VMX and VSX instructions,
  and lmw/stmw
* Atomic loads and stores (l[dw]arx, st[dw]cx.)
* Arithmetic instructions (add, subtract, multiply, divide, etc.)
* Compare instructions
* Rotate and mask instructions
* Shift instructions
* Logical instructions (and, or, xor, etc.)
* Condition register logical instructions
* mtcrf, cntlz[wd], exts[bhw]
* isync, sync, lwsync, ptesync, eieio
* Cache operations (dcbf, dcbst, dcbt, dcbtst)

The overflow-checking arithmetic instructions are not included, but
they appear not to be ever used in C code.

This uses decimal values for the minor opcodes in the switch statements
because that is what appears in the Power ISA specification, thus it is
easier to check that they are correct if they are in decimal.

If this is used to single-step an instruction where a data breakpoint
interrupt occurred, then there is the possibility that the instruction
is a lwarx or ldarx.  In that case we have to be careful not to lose the
reservation until we get to the matching st[wd]cx., or we'll never make
forward progress.  One alternative is to try to arrange that we can
return from interrupts and handle data breakpoint interrupts without
losing the reservation, which means not using any spinlocks, mutexes,
or atomic ops (including bitops).  That seems rather fragile.  The
other alternative is to emulate the larx/stcx and all the instructions
in between.  This is why this commit adds support for a wide range
of integer instructions.

Signed-off-by: Paul Mackerras &lt;paulus@samba.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/85xx: Make sure lwarx hint isn't set on ppc32</title>
<updated>2010-03-17T04:24:06+00:00</updated>
<author>
<name>Kumar Gala</name>
<email>galak@kernel.crashing.org</email>
</author>
<published>2010-03-11T05:33:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d6ccb1f55ddf5146219707c0e71b85e3a52179b4'/>
<id>d6ccb1f55ddf5146219707c0e71b85e3a52179b4</id>
<content type='text'>
e500v1/v2 based chips will treat any reserved field being set in an
opcode as illegal.  Thus always setting the hint in the opcode is
a bad idea.

Anton should be kept away from the powerpc opcode map.

Signed-off-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
e500v1/v2 based chips will treat any reserved field being set in an
opcode as illegal.  Thus always setting the hint in the opcode is
a bad idea.

Anton should be kept away from the powerpc opcode map.

Signed-off-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Use lwarx/ldarx hint in bit locks</title>
<updated>2010-02-17T03:03:15+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2010-02-10T01:02:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=864b9e6fd76489aab422bac62162f57c52e06ed8'/>
<id>864b9e6fd76489aab422bac62162f57c52e06ed8</id>
<content type='text'>
This patch implements the lwarx/ldarx hint bit for bit locks.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch implements the lwarx/ldarx hint bit for bit locks.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Use lwarx hint in spinlocks</title>
<updated>2010-02-17T03:03:14+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2010-02-10T00:57:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=4e14a4d17a8cd66ccab180d32c977091922cfbed'/>
<id>4e14a4d17a8cd66ccab180d32c977091922cfbed</id>
<content type='text'>
Recent versions of the PowerPC architecture added a hint bit to the larx
instructions to differentiate between an atomic operation and a lock operation:

&gt; 0 Other programs might attempt to modify the word in storage addressed by EA
&gt; even if the subsequent Store Conditional succeeds.
&gt;
&gt; 1 Other programs will not attempt to modify the word in storage addressed by
&gt; EA until the program that has acquired the lock performs a subsequent store
&gt; releasing the lock.

To avoid a binutils dependency this patch create macros for the extended lwarx
format and uses it in the spinlock code. To test this change I used a simple
test case that acquires and releases a global pthread mutex:

	pthread_mutex_lock(&amp;mutex);
	pthread_mutex_unlock(&amp;mutex);

On a 32 core POWER6, running 32 test threads we spend almost all our time in
the futex spinlock code:

    94.37%     perf  [kernel]                     [k] ._raw_spin_lock
               |
               |--99.95%-- ._raw_spin_lock
               |          |
               |          |--63.29%-- .futex_wake
               |          |
               |          |--36.64%-- .futex_wait_setup

Which is a good test for this patch. The results (in lock/unlock operations per
second) are:

before: 1538203 ops/sec
after:  2189219 ops/sec

An improvement of 42%

A 32 core POWER7 improves even more:

before: 1279529 ops/sec
after:  2282076 ops/sec

An improvement of 78%

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Recent versions of the PowerPC architecture added a hint bit to the larx
instructions to differentiate between an atomic operation and a lock operation:

&gt; 0 Other programs might attempt to modify the word in storage addressed by EA
&gt; even if the subsequent Store Conditional succeeds.
&gt;
&gt; 1 Other programs will not attempt to modify the word in storage addressed by
&gt; EA until the program that has acquired the lock performs a subsequent store
&gt; releasing the lock.

To avoid a binutils dependency this patch create macros for the extended lwarx
format and uses it in the spinlock code. To test this change I used a simple
test case that acquires and releases a global pthread mutex:

	pthread_mutex_lock(&amp;mutex);
	pthread_mutex_unlock(&amp;mutex);

On a 32 core POWER6, running 32 test threads we spend almost all our time in
the futex spinlock code:

    94.37%     perf  [kernel]                     [k] ._raw_spin_lock
               |
               |--99.95%-- ._raw_spin_lock
               |          |
               |          |--63.29%-- .futex_wake
               |          |
               |          |--36.64%-- .futex_wait_setup

Which is a good test for this patch. The results (in lock/unlock operations per
second) are:

before: 1538203 ops/sec
after:  2189219 ops/sec

An improvement of 42%

A 32 core POWER7 improves even more:

before: 1279529 ops/sec
after:  2282076 ops/sec

An improvement of 78%

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/mm: Add opcode definitions for tlbivax and tlbsrx.</title>
<updated>2009-08-20T00:12:38+00:00</updated>
<author>
<name>Benjamin Herrenschmidt</name>
<email>benh@kernel.crashing.org</email>
</author>
<published>2009-07-23T23:15:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=29c09e8fbaf65698c51aeffe34acc284a454a38f'/>
<id>29c09e8fbaf65698c51aeffe34acc284a454a38f</id>
<content type='text'>
This adds the opcode definitions to ppc-opcode.h for the two instructions
tlbivax and tlbsrx. as defined by Book3E 2.06

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds the opcode definitions to ppc-opcode.h for the two instructions
tlbivax and tlbsrx. as defined by Book3E 2.06

Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Add 2.06 tlbie mnemonics</title>
<updated>2009-05-21T05:44:21+00:00</updated>
<author>
<name>Milton Miller</name>
<email>miltonm@bga.com</email>
</author>
<published>2009-04-29T20:58:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=60dbf4385130136847ea73657da329f8e7dbe16e'/>
<id>60dbf4385130136847ea73657da329f8e7dbe16e</id>
<content type='text'>
This adds the PowerPC 2.06 tlbie mnemonics and keeps backwards
compatibilty for CPUs before 2.06.

Only useful for bare metal systems.

Signed-off-by: Milton Miller &lt;miltonm@bga.com&gt;
Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This adds the PowerPC 2.06 tlbie mnemonics and keeps backwards
compatibilty for CPUs before 2.06.

Only useful for bare metal systems.

Signed-off-by: Milton Miller &lt;miltonm@bga.com&gt;
Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Move VSX load/stores into ppc-opcode.h</title>
<updated>2009-05-21T05:44:21+00:00</updated>
<author>
<name>Michael Neuling</name>
<email>mikey@neuling.org</email>
</author>
<published>2009-04-29T20:58:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=dfb432cb960bfcbdf668d0a228bc909897156b31'/>
<id>dfb432cb960bfcbdf668d0a228bc909897156b31</id>
<content type='text'>
Cleans up the VSX load/store instructions by moving them into
ppc-opcode.h.

Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Acked-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cleans up the VSX load/store instructions by moving them into
ppc-opcode.h.

Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Acked-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Cleanup macros in ppc-opcode.h</title>
<updated>2009-05-21T05:44:21+00:00</updated>
<author>
<name>Michael Neuling</name>
<email>mikey@neuling.org</email>
</author>
<published>2009-04-29T20:58:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=da6b43c833f885d8604aee7f9711bb457a40d863'/>
<id>da6b43c833f885d8604aee7f9711bb457a40d863</id>
<content type='text'>
Make macros more braces happy.

Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Acked-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Make macros more braces happy.

Signed-off-by: Michael Neuling &lt;mikey@neuling.org&gt;
Acked-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "powerpc: Add support for early tlbilx opcode"</title>
<updated>2009-04-23T13:51:22+00:00</updated>
<author>
<name>Kumar Gala</name>
<email>galak@kernel.crashing.org</email>
</author>
<published>2009-04-23T13:51:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=323d23aeac4918c7a540b597a26fa7a67645593a'/>
<id>323d23aeac4918c7a540b597a26fa7a67645593a</id>
<content type='text'>
This reverts commit e9965577406a2148ade97b5e0ce7c448b4ba4ef6.  Our HW
guys were able to fix this so it never sees the light of day.

Signed-off-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit e9965577406a2148ade97b5e0ce7c448b4ba4ef6.  Our HW
guys were able to fix this so it never sees the light of day.

Signed-off-by: Kumar Gala &lt;galak@kernel.crashing.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
