linux.git/arch/x86/kernel/cpu/microcode/internal.h, branch v6.8

x86/microcode: Rework early revisions reporting

2023-11-21T15:35:48+00:00

The AMD side of the loader issues the microcode revision for each
logical thread on the system, which can become really noisy on huge
machines. And doing that doesn't make a whole lot of sense - the
microcode revision is already in /proc/cpuinfo.

So in case one is interested in the theoretical support of mixed silicon
steppings on AMD, one can check there.

What is also missing on the AMD side - something which people have
requested before - is showing the microcode revision the CPU had
*before* the early update.

So abstract that up in the main code and have the BSP on each vendor
provide those revision numbers.

Then, dump them only once on driver init.

On Intel, do not dump the patch date - it is not needed.

Reported-by: Linus Torvalds 
Signed-off-by: Borislav Petkov (AMD) 
Reviewed-by: Thomas Gleixner 
Link: https://lore.kernel.org/r/CAHk-=wg=%2B8rceshMkB4VnKxmRccVLtBLPBawnewZuuqyx5U=3A@mail.gmail.com

x86/microcode: Prepare for minimal revision check

2023-10-24T13:05:55+00:00

Applying microcode late can be fatal for the running kernel when the
update changes functionality which is in use already in a non-compatible
way, e.g. by removing a CPUID bit.

There is no way for admins which do not have access to the vendors deep
technical support to decide whether late loading of such a microcode is
safe or not.

Intel has added a new field to the microcode header which tells the
minimal microcode revision which is required to be active in the CPU in
order to be safe.

Provide infrastructure for handling this in the core code and a command
line switch which allows to enforce it.

If the update is considered safe the kernel is not tainted and the annoying
warning message not emitted. If it's enforced and the currently loaded
microcode revision is not safe for late loading then the load is aborted.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231017211724.079611170@linutronix.de

x86/microcode: Handle "offline" CPUs correctly

2023-10-24T13:05:55+00:00

Offline CPUs need to be parked in a safe loop when microcode update is
in progress on the primary CPU. Currently, offline CPUs are parked in
mwait_play_dead(), and for Intel CPUs, its not a safe instruction,
because the MWAIT instruction can be patched in the new microcode update
that can cause instability.

  - Add a new microcode state 'UCODE_OFFLINE' to report status on per-CPU
  basis.
  - Force NMI on the offline CPUs.

Wake up offline CPUs while the update is in progress and then return
them back to mwait_play_dead() after microcode update is complete.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231002115903.660850472@linutronix.de

x86/microcode: Rendezvous and load in NMI

2023-10-24T13:05:55+00:00

stop_machine() does not prevent the spin-waiting sibling from handling
an NMI, which is obviously violating the whole concept of rendezvous.

Implement a static branch right in the beginning of the NMI handler
which is nopped out except when enabled by the late loading mechanism.

The late loader enables the static branch before stop_machine() is
invoked. Each CPU has an nmi_enable in its control structure which
indicates whether the CPU should go into the update routine.

This is required to bridge the gap between enabling the branch and
actually being at the point where it is required to enter the loader
wait loop.

Each CPU which arrives in the stopper thread function sets that flag and
issues a self NMI right after that. If the NMI function sees the flag
clear, it returns. If it's set it clears the flag and enters the
rendezvous.

This is safe against a real NMI which hits in between setting the flag
and sending the NMI to itself. The real NMI will be swallowed by the
microcode update and the self NMI will then let stuff continue.
Otherwise this would end up with a spurious NMI.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231002115903.489900814@linutronix.de

x86/microcode: Add per CPU result state

2023-10-24T13:05:54+00:00

The microcode rendezvous is purely acting on global state, which does
not allow to analyze fails in a coherent way.

Introduce per CPU state where the results are written into, which allows to
analyze the return codes of the individual CPUs.

Initialize the state when walking the cpu_present_mask in the online
check to avoid another for_each_cpu() loop.

Enhance the result print out with that.

The structure is intentionally named ucode_ctrl as it will gain control
fields in subsequent changes.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231017211723.632681010@linutronix.de

x86/microcode: Handle "nosmt" correctly

2023-10-24T13:05:54+00:00

On CPUs where microcode loading is not NMI-safe the SMT siblings which
are parked in one of the play_dead() variants still react to NMIs.

So if an NMI hits while the primary thread updates the microcode the
resulting behaviour is undefined. The default play_dead() implementation on
modern CPUs is using MWAIT which is not guaranteed to be safe against
a microcode update which affects MWAIT.

Take the cpus_booted_once_mask into account to detect this case and
refuse to load late if the vendor specific driver does not advertise
that late loading is NMI safe.

AMD stated that this is safe, so mark the AMD driver accordingly.

This requirement will be partially lifted in later changes.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231002115903.087472735@linutronix.de

x86/microcode: Mop up early loading leftovers

2023-10-24T13:05:54+00:00

Get rid of the initrd_gone hack which was required to keep
find_microcode_in_initrd() functional after init.

As find_microcode_in_initrd() is now only used during init, mark it
accordingly.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231017211723.298854846@linutronix.de

x86/microcode/amd: Use cached microcode for AP load

2023-10-24T13:05:54+00:00

Now that the microcode cache is initialized before the APs are brought
up, there is no point in scanning builtin/initrd microcode during AP
loading.

Convert the AP loader to utilize the cache, which in turn makes the CPU
hotplug callback which applies the microcode after initrd/builtin is
gone, obsolete as the early loading during late hotplug operations
including the resume path depends now only on the cache.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231017211723.243426023@linutronix.de

x86/microcode/intel: Save the microcode only after a successful late-load

2023-10-24T13:05:53+00:00

There are situations where the late microcode is loaded into memory but
is not applied:

  1) The rendezvous fails
  2) The microcode is rejected by the CPUs

If any of this happens then the pointer which was updated at firmware
load time is stale and subsequent CPU hotplug operations either fail to
update or create inconsistent microcode state.

Save the loaded microcode in a separate pointer before the late load is
attempted and when successful, update the hotplug pointer accordingly
via a new microcode_ops callback.

Remove the pointless fallback in the loader to a microcode pointer which
is never populated.

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231002115902.505491309@linutronix.de

x86/microcode/intel: Simplify early loading

2023-10-24T13:02:36+00:00

The early loading code is overly complicated:

  - It scans the builtin/initrd for microcode not only on the BSP, but also
    on all APs during early boot and then later in the boot process it
    scans again to duplicate and save the microcode before initrd goes
    away.

    That's a pointless exercise because this can be simply done before
    bringing up the APs when the memory allocator is up and running.

 - Saving the microcode from within the scan loop is completely
   non-obvious and a left over of the microcode cache.

   This can be done at the call site now which makes it obvious.

Rework the code so that only the BSP scans the builtin/initrd microcode
once during early boot and save it away in an early initcall for later
use.

  [ bp: Test and fold in a fix from tglx ontop which handles the need to
    distinguish what save_microcode() does depending on when it is
    called:

     - when on the BSP during early load, it needs to find a newer
       revision than the one currently loaded on the BSP

     - later, before SMP init, it still runs on the BSP and gets the BSP
       revision just loaded and uses that revision to know which patch
       to save for the APs. For that it needs to find the exact one as
       on the BSP.
   ]

Signed-off-by: Thomas Gleixner 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20231017211722.629085215@linutronix.de