linux-stable.git/arch, branch v3.2.23

ARM: fix rcu stalls on SMP platforms

2012-07-12T03:32:11+00:00

commit 7deabca0acfe02b8e18f59a4c95676012f49a304 upstream.

We can stall RCU processing on SMP platforms if a CPU sits in its idle
loop for a long time.  This happens because we don't call irq_enter()
and irq_exit() around generic_smp_call_function_interrupt() and
friends.  Add the necessary calls, and remove the one from within
ipi_timer(), so that they're all in a common place.

Signed-off-by: Russell King 
Signed-off-by: Ben Hutchings

powerpc/kvm: sldi should be sld

2012-07-12T03:32:04+00:00

commit 2f584a146a2965b82fce89b8d2f95dc5cfe468d0 upstream.

Since we are taking a registers, this should never have been an sldi.
Talking to paulus offline, this is the correct fix.

Was introduced by:
 commit 19ccb76a1938ab364a412253daec64613acbf3df
 Author: Paul Mackerras 
 Date:   Sat Jul 23 17:42:46 2011 +1000

Talking to paulus, this shouldn't be a literal.

Signed-off-by: Michael Neuling 
Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Ben Hutchings

powerpc/xmon: Use cpumask iterator to avoid warning

2012-07-12T03:32:04+00:00

commit bc1d7702910c7c7e88eb60b58429dbfe293683ce upstream.

We have a bug report where the kernel hits a warning in the cpumask
code:

WARNING: at include/linux/cpumask.h:107

Which is:
        WARN_ON_ONCE(cpu >= nr_cpumask_bits);

The backtrace is:
        cpu_cmd
        cmds
        xmon_core
        xmon
        die

xmon is iterating through 0 to NR_CPUS. I'm not sure why we are still
open coding this but iterating above nr_cpu_ids is definitely a bug.

This patch iterates through all possible cpus, in case we issue a
system reset and CPUs in an offline state call in.

Perhaps the old code was trying to handle CPUs that were in the
partition but were never started (eg kexec into a kernel with an
nr_cpus= boot option). They are going to die way before we get into
xmon since we haven't set any kernel state up for them.

Signed-off-by: Anton Blanchard 
Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Ben Hutchings

x86, cpufeature: Rename X86_FEATURE_DTS to X86_FEATURE_DTHERM

2012-07-04T04:44:28+00:00

commit 4ad33411308596f2f918603509729922a1ec4411 upstream.

It makes sense to label "Digital Thermal Sensor" as "DTS", but
unfortunately the string "dts" was already used for "Debug Store", and
/proc/cpuinfo is a user space ABI.

Therefore, rename this to "dtherm".

This conflict went into mainline via the hwmon tree without any x86
maintainer ack, and without any kind of hint in the subject.

    a4659053 x86/hwmon: fix initialization of coretemp

Reported-by: Jean Delvare 
Link: http://lkml.kernel.org/r/4FE34BCB.5050305@linux.intel.com
Cc: Jan Beulich 
Signed-off-by: H. Peter Anvin 
[bwh: Backported to 3.2: drop the coretemp device table change]
Signed-off-by: Ben Hutchings

ARM: SAMSUNG: Fix for S3C2412 EBI memory mapping

2012-07-04T04:44:24+00:00

commit 3dca938656c7b0ff6b0717a5dde0f5f45e592be5 upstream.

While upgrading the kernel on a S3C2412 based board I've noted
that it was impossible to boot the board with a 2.6.32 or upper
kernel. I've tracked down the problem to the EBI virtual memory
mapping that is in conflict with the IO mapping definition in
arch/arm/mach-s3c24xx/s3c2412.c.

Signed-off-by: Jose Miguel Goncalves 
Signed-off-by: Kukjin Kim 
Signed-off-by: Ben Hutchings

ARM: SAMSUNG: Should check for IS_ERR(clk) instead of NULL

2012-07-04T04:44:22+00:00

commit a5d8f4765f0e92ef027492a8cb979c5b8d45f2c3 upstream.

On the error condition clk_get() returns ERR_PTR().

Signed-off-by: Jonghwan Choi 
Signed-off-by: Kukjin Kim 
Signed-off-by: Ben Hutchings

thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE

2012-07-04T04:44:09+00:00

commit e4eed03fd06578571c01d4f1478c874bb432c815 upstream.

In the x86 32bit PAE CONFIG_TRANSPARENT_HUGEPAGE=y case while holding the
mmap_sem for reading, cmpxchg8b cannot be used to read pmd contents under
Xen.

So instead of dealing only with "consistent" pmdvals in
pmd_none_or_trans_huge_or_clear_bad() (which would be conceptually
simpler) we let pmd_none_or_trans_huge_or_clear_bad() deal with pmdvals
where the low 32bit and high 32bit could be inconsistent (to avoid having
to use cmpxchg8b).

The only guarantee we get from pmd_read_atomic is that if the low part of
the pmd was found null, the high part will be null too (so the pmd will be
considered unstable).  And if the low part of the pmd is found "stable"
later, then it means the whole pmd was read atomically (because after a
pmd is stable, neither MADV_DONTNEED nor page faults can alter it anymore,
and we read the high part after the low part).

In the 32bit PAE x86 case, it is enough to read the low part of the pmdval
atomically to declare the pmd as "stable" and that's true for THP and no
THP, furthermore in the THP case we also have a barrier() that will
prevent any inconsistent pmdvals to be cached by a later re-read of the
*pmd.

Signed-off-by: Andrea Arcangeli 
Cc: Jonathan Nieder 
Cc: Ulrich Obergfell 
Cc: Mel Gorman 
Cc: Hugh Dickins 
Cc: Larry Woodman 
Cc: Petr Matousek 
Cc: Rik van Riel 
Cc: Jan Beulich 
Cc: KOSAKI Motohiro 
Tested-by: Andrew Jones 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Ben Hutchings

mm: pmd_read_atomic: fix 32bit PAE pmd walk vs pmd_populate SMP race condition

2012-07-04T04:44:09+00:00

commit 26c191788f18129af0eb32a358cdaea0c7479626 upstream.

When holding the mmap_sem for reading, pmd_offset_map_lock should only
run on a pmd_t that has been read atomically from the pmdp pointer,
otherwise we may read only half of it leading to this crash.

PID: 11679  TASK: f06e8000  CPU: 3   COMMAND: "do_race_2_panic"
 #0 [f06a9dd8] crash_kexec at c049b5ec
 #1 [f06a9e2c] oops_end at c083d1c2
 #2 [f06a9e40] no_context at c0433ded
 #3 [f06a9e64] bad_area_nosemaphore at c043401a
 #4 [f06a9e6c] __do_page_fault at c0434493
 #5 [f06a9eec] do_page_fault at c083eb45
 #6 [f06a9f04] error_code (via page_fault) at c083c5d5
    EAX: 01fb470c EBX: fff35000 ECX: 00000003 EDX: 00000100 EBP:
    00000000
    DS:  007b     ESI: 9e201000 ES:  007b     EDI: 01fb4700 GS:  00e0
    CS:  0060     EIP: c083bc14 ERR: ffffffff EFLAGS: 00010246
 #7 [f06a9f38] _spin_lock at c083bc14
 #8 [f06a9f44] sys_mincore at c0507b7d
 #9 [f06a9fb0] system_call at c083becd
                         start           len
    EAX: ffffffda  EBX: 9e200000  ECX: 00001000  EDX: 6228537f
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 003d0f00
    SS:  007b      ESP: 62285354  EBP: 62285388  GS:  0033
    CS:  0073      EIP: 00291416  ERR: 000000da  EFLAGS: 00000286

This should be a longstanding bug affecting x86 32bit PAE without THP.
Only archs with 64bit large pmd_t and 32bit unsigned long should be
affected.

With THP enabled the barrier() in pmd_none_or_trans_huge_or_clear_bad()
would partly hide the bug when the pmd transition from none to stable,
by forcing a re-read of the *pmd in pmd_offset_map_lock, but when THP is
enabled a new set of problem arises by the fact could then transition
freely in any of the none, pmd_trans_huge or pmd_trans_stable states.
So making the barrier in pmd_none_or_trans_huge_or_clear_bad()
unconditional isn't good idea and it would be a flakey solution.

This should be fully fixed by introducing a pmd_read_atomic that reads
the pmd in order with THP disabled, or by reading the pmd atomically
with cmpxchg8b with THP enabled.

Luckily this new race condition only triggers in the places that must
already be covered by pmd_none_or_trans_huge_or_clear_bad() so the fix
is localized there but this bug is not related to THP.

NOTE: this can trigger on x86 32bit systems with PAE enabled with more
than 4G of ram, otherwise the high part of the pmd will never risk to be
truncated because it would be zero at all times, in turn so hiding the
SMP race.

This bug was discovered and fully debugged by Ulrich, quote:

----
[..]
pmd_none_or_trans_huge_or_clear_bad() loads the content of edx and
eax.

    496 static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t
    *pmd)
    497 {
    498         /* depend on compiler for an atomic pmd read */
    499         pmd_t pmdval = *pmd;

                                // edi = pmd pointer
0xc0507a74 :   mov    0x8(%esp),%edi
...
                                // edx = PTE page table high address
0xc0507a84 :   mov    0x4(%edi),%edx
...
                                // eax = PTE page table low address
0xc0507a8e :   mov    (%edi),%eax

[..]

Please note that the PMD is not read atomically. These are two "mov"
instructions where the high order bits of the PMD entry are fetched
first. Hence, the above machine code is prone to the following race.

-  The PMD entry {high|low} is 0x0000000000000000.
   The "mov" at 0xc0507a84 loads 0x00000000 into edx.

-  A page fault (on another CPU) sneaks in between the two "mov"
   instructions and instantiates the PMD.

-  The PMD entry {high|low} is now 0x00000003fda38067.
   The "mov" at 0xc0507a8e loads 0xfda38067 into eax.
----

Reported-by: Ulrich Obergfell 
Signed-off-by: Andrea Arcangeli 
Cc: Mel Gorman 
Cc: Hugh Dickins 
Cc: Larry Woodman 
Cc: Petr Matousek 
Cc: Rik van Riel 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 
Signed-off-by: Ben Hutchings

ARM i.MX imx21ads: Fix overlapping static i/o mappings

2012-06-19T22:18:16+00:00

commit 350ab15bb2ffe7103bc6bf6c634f3c5b286eaf2a upstream.

The statically defined I/O memory regions for the i.MX21 on chip
peripherals and the on board I/O peripherals of the i.MX21ADS board
overlap. This results in a kernel crash during startup. This is fixed
by reducing the memory range for the on board I/O peripherals to the
actually required range.

Signed-off-by: Jaccon Bastiaansen 
Signed-off-by: Sascha Hauer 
Signed-off-by: Ben Hutchings

ARM: imx6: exit coherency when shutting down a cpu

2012-06-19T22:18:15+00:00

commit 602bf40971d7f9a1ec0b7ba2b7e6427849828651 upstream.

There is a system hang issue on imx6q which can easily be seen with
running a cpu hotplug stress testing (hotplug secondary cores from
user space via sysfs interface for thousands iterations).

It turns out that the issue is caused by coherency of the cpu that
is being shut down.  When shutting down a cpu, we need to have the
cpu exit coherency to prevent it from receiving cache, TLB, or BTB
maintenance operations broadcast by other CPUs in the cluster.

Copy cpu_enter_lowpower() and cpu_leave_lowpower() from mach-vexpress
to have coherency properly handled in platform_cpu_die(), thus fix
the issue.

Signed-off-by: Shawn Guo 
Signed-off-by: Ben Hutchings