linux-stable.git/arch/x86/kernel/cpu/mce, branch linux-6.3.y

x86/MCE/AMD: Use an u64 for bank_map

2023-05-11T14:17:02+00:00

[ Upstream commit 4c1cdec319b9aadb65737c3eb1f5cb74bd6aa156 ]

Thee maximum number of MCA banks is 64 (MAX_NR_BANKS), see

  a0bc32b3cacf ("x86/mce: Increase maximum number of banks to 64").

However, the bank_map which contains a bitfield of which banks to
initialize is of type unsigned int and that overflows when those bit
numbers are >= 32, leading to UBSAN complaining correctly:

  UBSAN: shift-out-of-bounds in arch/x86/kernel/cpu/mce/amd.c:1365:38
  shift exponent 32 is too large for 32-bit type 'int'

Change the bank_map to a u64 and use the proper BIT_ULL() macro when
modifying bits in there.

  [ bp: Rewrite commit message. ]

Fixes: a0bc32b3cacf ("x86/mce: Increase maximum number of banks to 64")
Signed-off-by: Muralidhara M K 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20230127151601.1068324-1-muralimk@amd.com
Signed-off-by: Sasha Levin

x86/mce: Make sure logged MCEs are processed after sysfs update

2023-03-12T20:12:21+00:00

A recent change introduced a flag to queue up errors found during
boot-time polling. These errors will be processed during late init once
the MCE subsystem is fully set up.

A number of sysfs updates call mce_restart() which goes through a subset
of the CPU init flow. This includes polling MCA banks and logging any
errors found. Since the same function is used as boot-time polling,
errors will be queued. However, the system is now past late init, so the
errors will remain queued until another error is found and the workqueue
is triggered.

Call mce_schedule_work() at the end of mce_restart() so that queued
errors are processed.

Fixes: 3bff147b187d ("x86/mce: Defer processing of early errors")
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov (AMD) 
Reviewed-by: Tony Luck 
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230301221420.2203184-1-yazen.ghannam@amd.com

Merge tag 'ras_core_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

2023-02-21T16:04:51+00:00

Pull RAS updates from Borislav Petkov:

 - Add support for reporting more bits of the physical address on error,
   on newer AMD CPUs

 - Mask out bits which don't belong to the address of the error being
   reported

* tag 'ras_core_for_v6.3_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Mask out non-address bits from machine check bank
  x86/mce: Add support for Extended Physical Address MCA changes
  x86/mce: Define a function to extract ErrorAddr from MCA_ADDR

x86/mce: Mask out non-address bits from machine check bank

2023-01-10T10:47:07+00:00

Systems that support various memory encryption schemes (MKTME, TDX, SEV)
use high order physical address bits to indicate which key should be
used for a specific memory location.

When a memory error is reported, some systems may report those key
bits in the IA32_MCi_ADDR machine check MSR.

The Intel SDM has a footnote for the contents of the address register
that says: "Useful bits in this field depend on the address methodology
in use when the register state is saved."

AMD Processor Programming Reference has a more explicit description
of the MCA_ADDR register:

 "For physical addresses, the most significant bit is given by
  Core::X86::Cpuid::LongModeInfo[PhysAddrSize]."

Add a new #define MCI_ADDR_PHYSADDR for the mask of valid physical
address bits within the machine check bank address register. Use this
mask for recoverable machine check handling and in the EDAC driver to
ignore any key bits that may be present.

  [ Tony: Based on independent fixes proposed by Fan Du and Isaku Yamahata ]

Reported-by: Isaku Yamahata 
Reported-by: Fan Du 
Signed-off-by: Tony Luck 
Signed-off-by: Borislav Petkov (AMD) 
Reviewed-by: Yazen Ghannam 
Link: https://lore.kernel.org/r/20230109152936.397862-1-tony.luck@intel.com

x86/mce/dev-mcelog: use strscpy() to instead of strncpy()

2023-01-07T10:47:35+00:00

The implementation of strscpy() is more robust and safer.
That's now the recommended way to copy NUL terminated strings.

Signed-off-by: Xu Panda 
Signed-off-by: Yang Yang 
Signed-off-by: Ingo Molnar 
Reviewed-by: Tony Luck 
Link: https://lore.kernel.org/r/202212031419324523731@zte.com.cn

x86/mce: Add support for Extended Physical Address MCA changes

2022-12-28T21:37:37+00:00

Newer AMD CPUs support more physical address bits.

That is, the MCA_ADDR registers on Scalable MCA systems contain the
ErrorAddr in bits [56:0] instead of [55:0]. Hence, the existing LSB field
from bits [61:56] in MCA_ADDR must be moved around to accommodate the
larger ErrorAddr size.

MCA_CONFIG[McaLsbInStatusSupported] indicates this change. If set, the
LSB field will be found in MCA_STATUS rather than MCA_ADDR.

Each logical CPU has unique MCA bank in hardware and is not shared with
other logical CPUs. Additionally, on SMCA systems, each feature bit may
be different for each bank within same logical CPU.

Check for MCA_CONFIG[McaLsbInStatusSupported] for each MCA bank and for
each CPU.

Additionally, all MCA banks do not support maximum ErrorAddr bits in
MCA_ADDR. Some banks might support fewer bits but the remaining bits are
marked as reserved.

  [ Yazen: Rebased and fixed up formatting.
    bp: Massage comments. ]

Signed-off-by: Smita Koralahalli 
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov (AMD) 
Link: https://lore.kernel.org/r/20221206173607.1185907-5-yazen.ghannam@amd.com

x86/mce: Define a function to extract ErrorAddr from MCA_ADDR

2022-12-28T21:11:48+00:00

Move MCA_ADDR[ErrorAddr] extraction into a separate helper function. This
will be further refactored to support extended ErrorAddr bits in MCA_ADDR
in newer AMD CPUs.

  [ bp: Massage. ]

Signed-off-by: Smita Koralahalli 
Signed-off-by: Borislav Petkov 
Signed-off-by: Borislav Petkov (AMD) 
Reviewed-by: Yazen Ghannam 
Link: https://lore.kernel.org/all/20220225193342.215780-3-Smita.KoralahalliChannabasappa@amd.com/

x86/mce: Use severity table to handle uncorrected errors in kernel

2022-10-31T16:01:19+00:00

mce_severity_intel() has a special case to promote UC and AR errors
in kernel context to PANIC severity.

The "AR" case is already handled with separate entries in the severity
table for all instruction fetch errors, and those data fetch errors that
are not in a recoverable area of the kernel (i.e. have an extable fixup
entry).

Add an entry to the severity table for UC errors in kernel context that
reports severity = PANIC. Delete the special case code from
mce_severity_intel().

Signed-off-by: Tony Luck 
Signed-off-by: Borislav Petkov 
Link: https://lore.kernel.org/r/20220922195136.54575-2-tony.luck@intel.com

x86/MCE/AMD: Clear DFR errors found in THR handler

2022-10-27T15:01:25+00:00

AMD's MCA Thresholding feature counts errors of all severity levels, not
just correctable errors. If a deferred error causes the threshold limit
to be reached (it was the error that caused the overflow), then both a
deferred error interrupt and a thresholding interrupt will be triggered.

The order of the interrupts is not guaranteed. If the threshold
interrupt handler is executed first, then it will clear MCA_STATUS for
the error. It will not check or clear MCA_DESTAT which also holds a copy
of the deferred error. When the deferred error interrupt handler runs it
will not find an error in MCA_STATUS, but it will find the error in
MCA_DESTAT. This will cause two errors to be logged.

Check for deferred errors when handling a threshold interrupt. If a bank
contains a deferred error, then clear the bank's MCA_DESTAT register.

Define a new helper function to do the deferred error check and clearing
of MCA_DESTAT.

  [ bp: Simplify, convert comment to passive voice. ]

Fixes: 37d43acfd79f ("x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers")
Signed-off-by: Yazen Ghannam 
Signed-off-by: Borislav Petkov 
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220621155943.33623-1-yazen.ghannam@amd.com

x86/mce: Retrieve poison range from hardware

2022-08-29T07:33:42+00:00

When memory poison consumption machine checks fire, MCE notifier
handlers like nfit_handle_mce() record the impacted physical address
range which is reported by the hardware in the MCi_MISC MSR. The error
information includes data about blast radius, i.e. how many cachelines
did the hardware determine are impacted. A recent change

  7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine poison granularity")

updated nfit_handle_mce() to stop hard coding the blast radius value of
1 cacheline, and instead rely on the blast radius reported in 'struct
mce' which can be up to 4K (64 cachelines).

It turns out that apei_mce_report_mem_error() had a similar problem in
that it hard coded a blast radius of 4K rather than reading the blast
radius from the error information. Fix apei_mce_report_mem_error() to
convey the proper poison granularity.

Signed-off-by: Jane Chu 
Signed-off-by: Borislav Petkov 
Reviewed-by: Dan Williams 
Reviewed-by: Ingo Molnar 
Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
Link: https://lore.kernel.org/r/20220826233851.1319100-1-jane.chu@oracle.com