linux-stable.git/arch/x86/kernel/setup_percpu.c, branch linux-4.9.y

x86/boot/32: Fix UP boot on Quark and possibly other platforms

2018-03-22T08:17:58+00:00

commit d2b6dc61a8dd3c429609b993778cb54e75a5c5f0 upstream.

This partially reverts commit:

  23b2a4ddebdd17f ("x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up")

That commit had one definite bug and one potential bug.  The
definite bug is that setup_per_cpu_areas() uses a differnet generic
implementation on UP kernels, so initial_page_table never got
resynced.  This was fine for access to percpu data (it's in the
identity map on UP), but it breaks other users of
initial_page_table.  The potential bug is that helpers like
efi_init() would be called before the tables were synced.

Avoid both problems by just syncing the page tables in setup_arch()
*and* setup_per_cpu_areas().

Reported-by: Jan Kiszka 
Signed-off-by: Andy Lutomirski 
Cc: Andy Shevchenko 
Cc: Ard Biesheuvel 
Cc: Boris Ostrovsky 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Garnier 
Cc: Thomas Gleixner 
Cc: linux-efi@vger.kernel.org
Signed-off-by: Ingo Molnar 
Signed-off-by: Greg Kroah-Hartman

x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up

2018-03-22T08:17:41+00:00

[ Upstream commit 23b2a4ddebdd17fad265b4bb77256c2e4ec37dee ]

The x86 smpboot trampoline expects initial_page_table to have the
GDT mapped.  If the GDT ends up in a virtually mapped per-cpu page,
then it won't be in the page tables at all until perc-pu areas are
set up.  The result will be a triple fault the first time that the
CPU attempts to access the GDT after LGDT loads the perc-pu GDT.

This appears to be an old bug, but somehow the GDT fixmap rework
is triggering it.  This seems to have something to do with the
memory layout.

Signed-off-by: Andy Lutomirski 
Cc: Ard Biesheuvel 
Cc: Boris Ostrovsky 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Linus Torvalds 
Cc: Matt Fleming 
Cc: Peter Zijlstra 
Cc: Thomas Garnier 
Cc: Thomas Gleixner 
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/a553264a5972c6a86f9b5caac237470a0c74a720.1490218061.git.luto@kernel.org
Signed-off-by: Ingo Molnar 
Signed-off-by: Sasha Levin 
Signed-off-by: Greg Kroah-Hartman

Merge branch 'x86/mm' into x86/asm, to unify the two branches for simplicity

2016-09-08T06:41:52+00:00

Signed-off-by: Ingo Molnar

x86/dumpstack: Remove 64-byte gap at end of irq stack

2016-08-18T16:41:33+00:00

There has been a 64-byte gap at the end of the irq stack for at least 12
years.  It predates git history, and I can't find any good reason for
it.  Remove it.  What's the worst that could happen?

Signed-off-by: Josh Poimboeuf 
Cc: Andy Lutomirski 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Byungchul Park 
Cc: Denys Vlasenko 
Cc: Frederic Weisbecker 
Cc: H. Peter Anvin 
Cc: Kees Cook 
Cc: Linus Torvalds 
Cc: Nilay Vaish 
Cc: Peter Zijlstra 
Cc: Steven Rostedt 
Cc: Thomas Gleixner 
Link: http://lkml.kernel.org/r/14f9281c5475cc44af95945ea7546bff2e3836db.1471535549.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar

x86: Apply more __ro_after_init and const

2016-08-10T12:55:05+00:00

Guided by grsecurity's analogous __read_only markings in arch/x86,
this applies several uses of __ro_after_init to structures that are
only updated during __init, and const for some structures that are
never updated.  Additionally extends __init markings to some functions
that are only used during __init, and cleans up some missing C99 style
static initializers.

Signed-off-by: Kees Cook 
Cc: Andy Lutomirski 
Cc: Andy Lutomirski 
Cc: Arnd Bergmann 
Cc: Borislav Petkov 
Cc: Brad Spengler 
Cc: Brian Gerst 
Cc: David Brown 
Cc: Denys Vlasenko 
Cc: Emese Revfy 
Cc: H. Peter Anvin 
Cc: Josh Poimboeuf 
Cc: Linus Torvalds 
Cc: Mathias Krause 
Cc: Michael Ellerman 
Cc: PaX Team 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: kernel-hardening@lists.openwall.com
Link: http://lkml.kernel.org/r/20160808232906.GA29731@www.outflux.net
Signed-off-by: Ingo Molnar

x86/acpi: store ACPI ids from MADT for future usage

2016-07-25T12:30:53+00:00

Currently we don't save ACPI ids (unlike LAPIC ids which go to
x86_cpu_to_apicid) from MADT and we may need this information later.
Particularly, ACPI ids is the only existent way for a PVHVM Xen guest
to figure out Xen's idea of its vCPUs ids before these CPUs boot and
in some cases these ids diverge from Linux's cpu ids.

Signed-off-by: Vitaly Kuznetsov 
Signed-off-by: David Vrabel

x86: Convert a few more per-CPU items to read-mostly ones

2014-11-04T19:13:28+00:00

Both this_cpu_off and cpu_info aren't getting modified post boot, yet
are being accessed on enough code paths that grouping them with other
frequently read items seems desirable. For cpu_info this at the same
time implies removing the cache line alignment (which afaict became
pointless when it got converted to per-CPU data years ago).

Signed-off-by: Jan Beulich 
Link: http://lkml.kernel.org/r/54589BD20200007800044A84@mail.emea.novell.com
Signed-off-by: Thomas Gleixner

x86: Add read_mostly declaration/definition to variables from smp.h

2012-06-14T10:42:11+00:00

Add "read-mostly" qualifier to the following variables in
smp.h:

 - cpu_sibling_map
 - cpu_core_map
 - cpu_llc_shared_map
 - cpu_llc_id
 - cpu_number
 - x86_cpu_to_apicid
 - x86_bios_cpu_apicid
 - x86_cpu_to_logical_apicid

As long as all the variables above are only written during the
initialization, this change is meant to prevent the false
sharing. More specifically, on vSMP Foundation platform
x86_cpu_to_apicid shared the same internode_cache_line with
frequently written lapic_events.

From the analysis of the first 33 per_cpu variables out of 219
(memories they describe, to be more specific) the 8 have read_mostly
nature (tlb_vector_offset, cpu_loops_per_jiffy, xen_debug_irq, etc.)
and 25 are frequently written (irq_stack_union, gdt_page,
exception_stacks, idt_desc, etc.).

Assuming that the spread of the rest of the per_cpu variables is
similar, identifying the read mostly memories will make more sense
in terms of long-term code maintenance comparing to identifying
frequently written memories.

Signed-off-by: Vlad Zolotarov 
Acked-by: Shai Fultheim 
Cc: Shai Fultheim (Shai@ScaleMP.com) 
Cc: ido@wizery.com
Cc: Linus Torvalds 
Cc: Andrew Morton 
Cc: Peter Zijlstra 
Link: http://lkml.kernel.org/r/1719258.EYKzE4Zbq5@vlad
Signed-off-by: Ingo Molnar

percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit

2012-05-08T16:42:18+00:00

With the embed percpu first chunk allocator, x86 uses either PAGE_SIZE
or PMD_SIZE for atom_size.  PMD_SIZE is used when CPU supports PSE so
that percpu areas are aligned to PMD mappings and possibly allow using
PMD mappings in vmalloc areas in the future.  Using larger atom_size
doesn't waste actual memory; however, it does require larger vmalloc
space allocation later on for !first chunks.

With reasonably sized vmalloc area, PMD_SIZE shouldn't be a problem
but x86_32 at this point is anything but reasonable in terms of
address space and using larger atom_size reportedly leads to frequent
percpu allocation failures on certain setups.

As there is no reason to not use PMD_SIZE on x86_64 as vmalloc space
is aplenty and most x86_64 configurations support PSE, fix the issue
by always using PMD_SIZE on x86_64 and PAGE_SIZE on x86_32.

v2: drop cpu_has_pse test and make x86_64 always use PMD_SIZE and
    x86_32 PAGE_SIZE as suggested by hpa.

Signed-off-by: Tejun Heo 
Reported-by: Yanmin Zhang 
Reported-by: ShuoX Liu 
Acked-by: H. Peter Anvin 
LKML-Reference: <4F97BA98.6010001@intel.com>
Cc: stable@vger.kernel.org

x86: Unify CPU -> NUMA node mapping between 32 and 64bit

2011-01-28T13:54:09+00:00

Unlike 64bit, 32bit has been using its own cpu_to_node_map[] for
CPU -> NUMA node mapping.  Replace it with early_percpu variable
x86_cpu_to_node_map and share the mapping code with 64bit.

* USE_PERCPU_NUMA_NODE_ID is now enabled for 32bit too.

* x86_cpu_to_node_map and numa_set/clear_node() are moved from
  numa_64 to numa.  For now, on 32bit, x86_cpu_to_node_map is initialized
  with 0 instead of NUMA_NO_NODE.  This is to avoid introducing unexpected
  behavior change and will be updated once init path is unified.

* srat_detect_node() is now enabled for x86_32 too.  It calls
  numa_set_node() and initializes the mapping making explicit
  cpu_to_node_map[] updates from map/unmap_cpu_to_node() unnecessary.

Signed-off-by: Tejun Heo 
Cc: eric.dumazet@gmail.com
Cc: yinghai@kernel.org
Cc: brgerst@gmail.com
Cc: gorcunov@gmail.com
Cc: penberg@kernel.org
Cc: shaohui.zheng@intel.com
Cc: rientjes@google.com
LKML-Reference: <1295789862-25482-15-git-send-email-tj@kernel.org>
Signed-off-by: Ingo Molnar 
Cc: David Rientjes