<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/arch/powerpc/lib, branch linux-4.3.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>powerpc32: memset: only use dcbz once cache is enabled</title>
<updated>2015-09-17T00:36:53+00:00</updated>
<author>
<name>LEROY Christophe</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-09-16T10:04:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=400c47d81ca383fc87d5a3937b234e23e26909fb'/>
<id>400c47d81ca383fc87d5a3937b234e23e26909fb</id>
<content type='text'>
memset() uses instruction dcbz to speed up clearing by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memset(). Allthough no part of the code
explicitly uses memset(), GCC may make calls to it.

This patch modifies memset() such that at startup, memset()
unconditionally skip the optimised bloc that uses dcbz instruction.

Once the initial MMU is set up, in machine_init() we patch memset()
by replacing this inconditional jump by a NOP

Tested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
memset() uses instruction dcbz to speed up clearing by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memset(). Allthough no part of the code
explicitly uses memset(), GCC may make calls to it.

This patch modifies memset() such that at startup, memset()
unconditionally skip the optimised bloc that uses dcbz instruction.

Once the initial MMU is set up, in machine_init() we patch memset()
by replacing this inconditional jump by a NOP

Tested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc32: memcpy: only use dcbz once cache is enabled</title>
<updated>2015-09-17T00:36:44+00:00</updated>
<author>
<name>LEROY Christophe</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-09-16T10:04:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1cd03890ea64795e53f17a94928cca22495acb2a'/>
<id>1cd03890ea64795e53f17a94928cca22495acb2a</id>
<content type='text'>
memcpy() uses instruction dcbz to speed up copy by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memcpy(). Allthough no part of the code
explicitly uses memcpy(), GCC makes calls to it.

This patch modifies memcpy() such that at startup, memcpy()
unconditionally jumps to generic_memcpy() which doesn't use
the dcbz instruction.

Once the initial MMU is set up, in machine_init() we patch memcpy()
by replacing this inconditional jump by a NOP

Reported-by: Michal Sojka &lt;sojkam1@fel.cvut.cz&gt;
Tested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
memcpy() uses instruction dcbz to speed up copy by not wasting time
loading cache line with data that will be overwritten.
Some platform like mpc52xx do no have cache active at startup and
can therefore not use memcpy(). Allthough no part of the code
explicitly uses memcpy(), GCC makes calls to it.

This patch modifies memcpy() such that at startup, memcpy()
unconditionally jumps to generic_memcpy() which doesn't use
the dcbz instruction.

Once the initial MMU is set up, in machine_init() we patch memcpy()
by replacing this inconditional jump by a NOP

Reported-by: Michal Sojka &lt;sojkam1@fel.cvut.cz&gt;
Tested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/32: Few optimisations in memcpy</title>
<updated>2015-08-08T03:59:29+00:00</updated>
<author>
<name>LEROY Christophe</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-05-19T10:07:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=295ffb41896f566d4bd6f93fc230f5d411d373cb'/>
<id>295ffb41896f566d4bd6f93fc230f5d411d373cb</id>
<content type='text'>
This patch adds a few optimisations in memcpy functions by using
lbzu/stbu instead of lxb/stb and by re-ordering insn inside a loop
to reduce latency due to loading

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch adds a few optimisations in memcpy functions by using
lbzu/stbu instead of lxb/stb and by re-ordering insn inside a loop
to reduce latency due to loading

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/32: cacheable_memcpy becomes memcpy</title>
<updated>2015-08-08T03:59:27+00:00</updated>
<author>
<name>LEROY Christophe</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-05-19T10:07:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0b05e2d671c40cfb57e66e4e402320d6e056b2f8'/>
<id>0b05e2d671c40cfb57e66e4e402320d6e056b2f8</id>
<content type='text'>
cacheable_memcpy uses dcbz instruction and is more efficient than
memcpy when the destination is in RAM. If the destination is in an
io area, memcpy_toio() is normally used, not memcpy

This patch renames memcpy as generic_memcpy, and renames
cacheable_memcpy as memcpy

On MPC885, we get approximatly 7% increase of the transfer rate
on an FTP reception

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cacheable_memcpy uses dcbz instruction and is more efficient than
memcpy when the destination is in RAM. If the destination is in an
io area, memcpy_toio() is normally used, not memcpy

This patch renames memcpy as generic_memcpy, and renames
cacheable_memcpy as memcpy

On MPC885, we get approximatly 7% increase of the transfer rate
on an FTP reception

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/32: Merge the new memset() with the old one</title>
<updated>2015-08-08T03:59:24+00:00</updated>
<author>
<name>LEROY Christophe</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-05-19T10:07:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c152f149ce8e47bc04061a9e9e5f53e219ee5b95'/>
<id>c152f149ce8e47bc04061a9e9e5f53e219ee5b95</id>
<content type='text'>
cacheable_memzero() which has become the new memset() and the old
memset() are quite similar, so just merge them.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cacheable_memzero() which has become the new memset() and the old
memset() are quite similar, so just merge them.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc/32: memset(0): use cacheable_memzero</title>
<updated>2015-08-08T03:59:21+00:00</updated>
<author>
<name>LEROY Christophe</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-05-19T10:07:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5b2a32e806342f237f68435a50e1071f7f32b5c5'/>
<id>5b2a32e806342f237f68435a50e1071f7f32b5c5</id>
<content type='text'>
cacheable_memzero uses dcbz instruction and is more efficient than
memset(0) when the destination is in RAM

This patch renames memset as generic_memset, and defines memset
as a prolog to cacheable_memzero. This prolog checks if the byte
to set is 0. If not, it falls back to generic_memcpy()

cacheable_memzero disappears as it is not referenced anywhere anymore

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cacheable_memzero uses dcbz instruction and is more efficient than
memset(0) when the destination is in RAM

This patch renames memset as generic_memset, and defines memset
as a prolog to cacheable_memzero. This prolog checks if the byte
to set is 0. If not, it falls back to generic_memcpy()

cacheable_memzero disappears as it is not referenced anywhere anymore

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Partially revert "powerpc: Remove duplicate cacheable_memcpy/memzero functions"</title>
<updated>2015-08-08T03:59:21+00:00</updated>
<author>
<name>LEROY Christophe</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-05-19T10:07:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=df087e450d7ddc0b15bd8824206d964720b4f5e4'/>
<id>df087e450d7ddc0b15bd8824206d964720b4f5e4</id>
<content type='text'>
This partially reverts
commit 'powerpc: Remove duplicate cacheable_memcpy/memzero functions
("b05ae4ee602b7dc90771408ccf0972e1b3801a35")'

Functions cacheable_memcpy/memzero are more efficient than
memcpy/memset as they use the dcbz instruction which avoids refill
of the cacheline with the data that we will overwrite.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This partially reverts
commit 'powerpc: Remove duplicate cacheable_memcpy/memzero functions
("b05ae4ee602b7dc90771408ccf0972e1b3801a35")'

Functions cacheable_memcpy/memzero are more efficient than
memcpy/memset as they use the dcbz instruction which avoids refill
of the cacheline with the data that we will overwrite.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: put csum_tcpudp_magic inline</title>
<updated>2015-08-08T03:59:19+00:00</updated>
<author>
<name>LEROY Christophe</name>
<email>christophe.leroy@c-s.fr</email>
</author>
<published>2015-05-19T15:18:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=92c985f1d76c228abe202cb3004f03c9b45860d1'/>
<id>92c985f1d76c228abe202cb3004f03c9b45860d1</id>
<content type='text'>
csum_tcpudp_magic() is only a few instructions, and does modify
really few registers. So it is not worth having it as a separate
function and suffer function branching and saving of volatile
registers.

This patch makes it inline by use of the already existing
csum_tcpudp_nofold() function.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
csum_tcpudp_magic() is only a few instructions, and does modify
really few registers. So it is not worth having it as a separate
function and suffer function branching and saving of volatile
registers.

This patch makes it inline by use of the already existing
csum_tcpudp_nofold() function.

Signed-off-by: Christophe Leroy &lt;christophe.leroy@c-s.fr&gt;
Signed-off-by: Scott Wood &lt;scottwood@freescale.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux</title>
<updated>2015-06-24T15:46:32+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-06-24T15:46:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=08d183e3c1f650b4db1d07d764502116861542fa'/>
<id>08d183e3c1f650b4db1d07d764502116861542fa</id>
<content type='text'>
Pull powerpc updates from Michael Ellerman:

 - disable the 32-bit vdso when building LE, so we can build with a
   64-bit only toolchain.

 - EEH fixes from Gavin &amp; Richard.

 - enable the sys_kcmp syscall from Laurent.

 - sysfs control for fastsleep workaround from Shreyas.

 - expose OPAL events as an irq chip by Alistair.

 - MSI ops moved to pci_controller_ops by Daniel.

 - fix for kernel to userspace backtraces for perf from Anton.

 - merge pseries and pseries_le defconfigs from Cyril.

 - CXL in-kernel API from Mikey.

 - OPAL prd driver from Jeremy.

 - fix for DSCR handling &amp; tests from Anshuman.

 - Powernv flash mtd driver from Cyril.

 - dynamic DMA Window support on powernv from Alexey.

 - LLVM clang fixes &amp; workarounds from Anton.

 - reworked version of the patch to abort syscalls when transactional.

 - fix the swap encoding to support 4TB, from Aneesh.

 - various fixes as usual.

 - Freescale updates from Scott: Highlights include more 8xx
   optimizations, an e6500 hugetlb optimization, QMan device tree nodes,
   t1024/t1023 support, and various fixes and cleanup.

* tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (180 commits)
  cxl: Fix typo in debug print
  cxl: Add CXL_KERNEL_API config option
  powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()
  powerpc/mm: Change the swap encoding in pte.
  powerpc/mm: PTE_RPN_MAX is not used, remove the same
  powerpc/tm: Abort syscalls in active transactions
  powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off
  powerpc/include: Add opal-prd to installed uapi headers
  powerpc/powernv: fix construction of opal PRD messages
  powerpc/powernv: Increase opal-irqchip initcall priority
  powerpc: Make doorbell check preemption safe
  powerpc/powernv: pnv_init_idle_states() should only run on powernv
  macintosh/nvram: Remove as unused
  powerpc: Don't use gcc specific options on clang
  powerpc: Don't use -mno-strict-align on clang
  powerpc: Only use -mtraceback=no, -mno-string and -msoft-float if toolchain supports it
  powerpc: Only use -mabi=altivec if toolchain supports it
  powerpc: Fix duplicate const clang warning in user access code
  vfio: powerpc/spapr: Support Dynamic DMA windows
  vfio: powerpc/spapr: Register memory and define IOMMU v2
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull powerpc updates from Michael Ellerman:

 - disable the 32-bit vdso when building LE, so we can build with a
   64-bit only toolchain.

 - EEH fixes from Gavin &amp; Richard.

 - enable the sys_kcmp syscall from Laurent.

 - sysfs control for fastsleep workaround from Shreyas.

 - expose OPAL events as an irq chip by Alistair.

 - MSI ops moved to pci_controller_ops by Daniel.

 - fix for kernel to userspace backtraces for perf from Anton.

 - merge pseries and pseries_le defconfigs from Cyril.

 - CXL in-kernel API from Mikey.

 - OPAL prd driver from Jeremy.

 - fix for DSCR handling &amp; tests from Anshuman.

 - Powernv flash mtd driver from Cyril.

 - dynamic DMA Window support on powernv from Alexey.

 - LLVM clang fixes &amp; workarounds from Anton.

 - reworked version of the patch to abort syscalls when transactional.

 - fix the swap encoding to support 4TB, from Aneesh.

 - various fixes as usual.

 - Freescale updates from Scott: Highlights include more 8xx
   optimizations, an e6500 hugetlb optimization, QMan device tree nodes,
   t1024/t1023 support, and various fixes and cleanup.

* tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (180 commits)
  cxl: Fix typo in debug print
  cxl: Add CXL_KERNEL_API config option
  powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()
  powerpc/mm: Change the swap encoding in pte.
  powerpc/mm: PTE_RPN_MAX is not used, remove the same
  powerpc/tm: Abort syscalls in active transactions
  powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off
  powerpc/include: Add opal-prd to installed uapi headers
  powerpc/powernv: fix construction of opal PRD messages
  powerpc/powernv: Increase opal-irqchip initcall priority
  powerpc: Make doorbell check preemption safe
  powerpc/powernv: pnv_init_idle_states() should only run on powernv
  macintosh/nvram: Remove as unused
  powerpc: Don't use gcc specific options on clang
  powerpc: Don't use -mno-strict-align on clang
  powerpc: Only use -mtraceback=no, -mno-string and -msoft-float if toolchain supports it
  powerpc: Only use -mabi=altivec if toolchain supports it
  powerpc: Fix duplicate const clang warning in user access code
  vfio: powerpc/spapr: Support Dynamic DMA windows
  vfio: powerpc/spapr: Register memory and define IOMMU v2
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>powerpc: Only use -mabi=altivec if toolchain supports it</title>
<updated>2015-06-11T07:33:05+00:00</updated>
<author>
<name>Anton Blanchard</name>
<email>anton@samba.org</email>
</author>
<published>2015-05-25T22:53:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1fb3f5a7ca599f322e6bf21272ad215301159aa0'/>
<id>1fb3f5a7ca599f322e6bf21272ad215301159aa0</id>
<content type='text'>
The -mabi=altivec option is not recognised on LLVM, so use call cc-option
to check for support.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The -mabi=altivec option is not recognised on LLVM, so use call cc-option
to check for support.

Signed-off-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
</pre>
</div>
</content>
</entry>
</feed>
