summaryrefslogtreecommitdiff
path: root/drivers/cxl
AgeCommit message (Collapse)Author
10 dayscxl: Check for invalid addresses returned from translation functions on errorsRobert Richter
Translation functions may return an invalid address in case of errors. If the address is not checked the further use of the invalid value will cause an address corruption. Consistently check for a valid address returned by translation functions. Use RESOURCE_SIZE_MAX to indicate an invalid address for type resource_size_t. Depending on the type either RESOURCE_SIZE_MAX or ULLONG_MAX is used to indicate an address error. Propagating an invalid address from a failed translation may cause userspace to think it has received a valid SPA, when in fact it is wrong. The CXL userspace API, using trace events, expects ULLONG_MAX to indicate a translation failure. If ULLONG_MAX is not returned immediately, subsequent calculations can transform that bad address into a different value (!ULLONG_MAX), and an invalid SPA may be returned to userspace. This can lead to incorrect diagnostics and erroneous corrective actions. [ dj: Added user impact statement from Alison. ] [ dj: Fixed checkpatch tab alignment issue. ] Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset") Fixes: b78b9e7b7979 ("cxl/region: Refactor address translation funcs for testing") Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260107120544.410993-1-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
11 dayscxl/hdm: Fix potential infinite loop in __cxl_dpa_reserve()Li Ming
In __cxl_dpa_reserve(), it will check if the new resource range is included in one of paritions of the cxl memory device. cxlds->nr_paritions is used to represent how many partitions information the cxl memory device has. In the loop, if driver cannot find a partition including the new resource range, it will be an infinite loop. [ dj: Removed incorrect fixes tag ] Fixes: 991d98f17d31 ("cxl: Make cxl_dpa_alloc() DPA partition number agnostic") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260112120526.530232-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
13 dayscxl/acpi: Restore HBIW check before dereferencing platform_dataAlison Schofield
Commit 4fe516d2ad1a ("cxl/acpi: Make the XOR calculations available for testing") split xormap handling code to create a reusable helper function but inadvertently dropped the check of HBIW values before dereferencing cxlrd->platform_data. When HBIW is 1 or 3, no xormaps are needed and platform_data may be NULL, leading to a potential NULL pointer dereference. Affects platform configs using XOR Arithmetic with HBIWs of 1 or 3, when performing DPA->HPA address translation for CXL events. Those events would be any of poison ops, general media, or dram. Restore the early return check for HBIW values of 1 and 3 before dereferencing platform_data. Fixes: 4fe516d2ad1a ("cxl/acpi: Make the XOR calculations available for testing") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260109194946.431083-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
14 dayscxl/port: Fix target list setup for multiple decoders sharing the same dportRobert Richter
If a switch port has more than one decoder that is using the same downstream port, the enumeration of the target lists may fail with: # dmesg | grep target.list update_decoder_targets: cxl decoder1.0: dport3 found in target list, index 3 update_decoder_targets: cxl decoder1.0: dport2 found in target list, index 2 update_decoder_targets: cxl decoder1.0: dport0 found in target list, index 0 update_decoder_targets: cxl decoder2.0: dport3 found in target list, index 1 update_decoder_targets: cxl decoder4.0: dport3 found in target list, index 1 cxl_mem mem6: failed to find endpoint12:0000:00:01.4 in target list of decoder2.1 cxl_mem mem8: failed to find endpoint13:0000:20:01.4 in target list of decoder4.1 The case, that the same downstream port can be used in multiple target lists, is allowed and possible. Fix the update of the target list. Enumerate all children of the switch port and do not stop the iteration after the first matching target was found. With the fix applied: # dmesg | grep target.list update_decoder_targets: cxl decoder1.0: dport2 found in target list, index 2 update_decoder_targets: cxl decoder1.0: dport0 found in target list, index 0 update_decoder_targets: cxl decoder1.0: dport3 found in target list, index 3 update_decoder_targets: cxl decoder2.0: dport3 found in target list, index 1 update_decoder_targets: cxl decoder2.1: dport3 found in target list, index 1 update_decoder_targets: cxl decoder4.0: dport3 found in target list, index 1 update_decoder_targets: cxl decoder4.1: dport3 found in target list, index 1 Analyzing the conditions when this happens: 1) A dport is shared by multiple decoders. 2) The decoders have interleaving configured (ways > 1). The configuration above has the following hierarchy details (fixed version): root0 |_ | | | decoder0.1 | ways: 2 | target_list: 0,1 |_______________________________________ | | | dport0 | dport1 | | port2 port4 | | |___________________ |_____________________ | | | | | | | decoder2.0 decoder2.1 | decoder4.0 decoder4.1 | ways: 2 ways: 2 | ways: 2 ways: 2 | target_list: 2,3 target_list: 2,3 | target_list: 2,3 target_list: 2,3 |___________________ |___________________ | | | | | dport2 | dport3 | dport2 | dport3 | | | | endpoint7 endpoint12 endpoint9 endpoint13 |_ |_ |_ |_ | | | | | | | | | decoder7.0 | decoder12.0 | decoder9.0 | decoder13.0 | decoder7.2 | decoder12.2 | decoder9.2 | decoder13.2 | | | | mem3 mem5 mem6 mem8 Note: Device numbers vary for every boot. Current kernel fails to enumerate endpoint12 and endpoint13 as the target list is not updated for the second decoder. Fixes: 4f06d81e7c6a ("cxl: Defer dport allocation for switch ports") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260108101324.509667-1-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-05cxl/region: fix format string for resource_size_tArnd Bergmann
The size of this type is architecture specific, and the recommended way to print it portably is through the custom %pap format string. Fixes: d6602e25819d ("cxl/region: Add support to indicate region has extended linear cache") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>> --- Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20251204095237.1032528-1-arnd@kernel.org Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-12-05Merge tag 'soc-drivers-6.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull more SoC driver updates from Arnd Bergmann: "These updates came a little late, or were based on a later 6.18-rc tag than the others: - A new driver for cache management on cxl devices with memory shared in a coherent cluster. This is part of the drivers/cache/ tree, but unlike the other drivers that back the dma-mapping interfaces, this one is needed only during CPU hotplug. - A shared branch for reset controllers using swnode infrastructure - Added support for new SoC variants in the Amlogic soc_device identification - Minor updates in Freescale, Microchip, Samsung, and Apple SoC drivers" * tag 'soc-drivers-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits) soc: samsung: exynos-pmu: fix device leak on regmap lookup soc: samsung: exynos-pmu: Fix structure initialization soc: fsl: qbman: use kmalloc_array() instead of kmalloc() soc: fsl: qbman: add WQ_PERCPU to alloc_workqueue users MAINTAINERS: Update email address for Christophe Leroy MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent cache: Make top level Kconfig menu a boolean dependent on RISCV MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header arm64: Select GENERIC_CPU_CACHE_MAINTENANCE lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION soc: amlogic: meson-gx-socinfo: add new SoCs id dt-bindings: arm: amlogic: meson-gx-ao-secure: support more SoCs memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion() memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion() dt-bindings: cache: sifive,ccache0: add a pic64gx compatible MAINTAINERS: rename Microchip RISC-V entry MAINTAINERS: add new soc drivers to Microchip RISC-V entry soc: microchip: add mfd drivers for two syscon regions on PolarFire SoC dt-bindings: soc: microchip: document the simple-mfd syscon on PolarFire SoC ...
2025-12-04Merge tag 'cxl-for-6.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull compute express link (CXL) updates from Dave Jiang: "The additions of note are adding CXL region remove support for locked CXL decoders, adding unit testing support for XOR address translation, and adding unit testing support for extended linear cache. Misc: - Remove incorrect page-allocator quirk section in documentation - Remove unused devm_cxl_port_enumerate_dports() function - Fix typo in cdat.c code comment - Replace use of system_wq with system_percpu_wq - Add locked CXL decoder support for region removal - Return when generic target updated - Rename region_res_match_cxl_range() to spa_maps_hpa() - Clarify comment in spa_maps_hpa() Enable unit testing for XOR address translation of SPA to DPA and vice versa: - Refactor address translation funcs for testing in cxl_region - Make the XOR calculations available for testing - Add cxl_translate module for address translation testing in cxl_test Extended Linear Cache changes: - Add extended linear cache size sysfs attribute - Adjust failure emission of extended linear cache detection in cxl_acpi - Added extended linear cache unit testing support in cxl_test Preparation refactor patches for PRM translation support: - Simplify cxl_rd_ops allocation and handling - Group xor arithmetric setup code in a single block - Remove local variable @inc in cxl_port_setup_targets()" * tag 'cxl-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (22 commits) cxl/test: Assign overflow_err_count from log->nr_overflow cxl/test: Remove ret_limit race condition in mock_get_event() cxl/test: remove unused mock function for cxl_rcd_component_reg_phys() cxl/test: Add support for acpi extended linear cache cxl/test: Add cxl_test CFMWS support for extended linear cache cxl/test: Standardize CXL auto region size cxl/region: Remove local variable @inc in cxl_port_setup_targets() cxl/acpi: Group xor arithmetric setup code in a single block cxl: Simplify cxl_rd_ops allocation and handling cxl: Clarify comment in spa_maps_hpa() cxl: Rename region_res_match_cxl_range() to spa_maps_hpa() acpi/hmat: Return when generic target is updated cxl: Add handling of locked CXL decoder cxl/region: Add support to indicate region has extended linear cache cxl: Adjust extended linear cache failure emission in cxl_acpi cxl/test: Add cxl_translate module for address translation testing cxl/acpi: Make the XOR calculations available for testing cxl/region: Refactor address translation funcs for testing cxl/pci: replace use of system_wq with system_percpu_wq cxl: fix typos in cdat.c comments ...
2025-11-27Merge tag 'cache-for-v6.19' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/drivers-late standalone cache drivers for v6.19 ccache: Add a compatible for the pic64gx SoC. No driver change needed, as it falls back to the PolarFire SoC. hisi hha/generic cpu cache maintenance: Add support for a non-architectural mechanism for invalidating memory regions, needed for some cxl implementations on arm64 (and probably elsewhere in the future). The HiSilicon Hydra Home Agent is the first driver to provide this support. Signed-off-by: Conor Dooley <conor.dooley@microchip.com> * tag 'cache-for-v6.19' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux: MAINTAINERS: refer to intended file in STANDALONE CACHE CONTROLLER DRIVERS cache: Support cache maintenance for HiSilicon SoC Hydra Home Agent cache: Make top level Kconfig menu a boolean dependent on RISCV MAINTAINERS: Add Jonathan Cameron to drivers/cache and add lib/cache_maint.c + header arm64: Select GENERIC_CPU_CACHE_MAINTENANCE lib: Support ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion() memregion: Drop unused IORES_DESC_* parameter from cpu_cache_invalidate_memregion() dt-bindings: cache: sifive,ccache0: add a pic64gx compatible Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-11-17memregion: Support fine grained invalidate by cpu_cache_invalidate_memregion()Yicong Yang
Extend cpu_cache_invalidate_memregion() to support invalidating a particular range of memory by introducing start and length parameters. Control of types of invalidation is left for when use cases turn up. For now everything is Clean and Invalidate. Where the range is unknown, use the provided cpu_cache_invalidate_all() helper to act as documentation of intent in a fashion that is clearer than passing (0, -1) to cpu_cache_invalidate_memregion(). Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-11-17memregion: Drop unused IORES_DESC_* parameter from ↵Jonathan Cameron
cpu_cache_invalidate_memregion() The res_desc parameter was originally introduced for documentation purposes and with the idea that with HDM-DB CXL invalidation could be triggered from the device. That has not come to pass and the continued existence of the option is confusing when we add a range in the following patch which might not be a strict subset of the res_desc. So avoid that confusion by dropping the parameter. Link: https://lore.kernel.org/linux-mm/686eedb25ed02_24471002e@dwillia2-xfh.jf.intel.com.notmuch/ Reviewed-by: Dan Williams <dan.j.williams@intel.com> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2025-11-14Merge branch 'for-6.19/cxl-prm' into cxl-for-nextDave Jiang
- Simplify cxl_rd_ops allocation - Group xor arithmetric setup code - Remove local variable @inc in cxl_port_setup_targets()
2025-11-14cxl/region: Remove local variable @inc in cxl_port_setup_targets()Robert Richter
Simplify the code by removing local variable @inc. The variable is not used elsewhere, remove it and directly increment the target number. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20251114075844.1315805-4-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-14cxl/acpi: Group xor arithmetric setup code in a single blockRobert Richter
Simplify the xor arithmetric setup code by grouping it in a single block. No need to split the block for QoS setup. It is safe to reorder the call of cxl_setup_extended_linear_cache() because there are no dependencies. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Robert Richter <rrichter@amd.com> Tested-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20251114075844.1315805-3-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-14cxl: Simplify cxl_rd_ops allocation and handlingRobert Richter
A root decoder's callback handlers are collected in struct cxl_rd_ops. The structure is dynamically allocated, though it contains only a few pointers in it. This also requires to check two pointes to check for the existence of a callback. Simplify the allocation, release and handler check by embedding the ops statically in struct cxl_root_decoder. Implementation is equivalent to how struct cxl_root_ops handles the callbacks. [ dj: Fix spelling error in commit log. ] Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20251114075844.1315805-2-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-13Merge branch 'for-6.19/cxl-elc' into cxl-for-nextDave Jiang
- Add extended linear cache size sysfs attribute. - Adjust failure emission of extended linear cache detection in cxl_acpi.
2025-11-13Merge branch 'for-6.19/cxl-addr-xlat' into cxl-for-nextDave Jiang
Enable unit testing for XOR address translation of SPA to DPA and vice versa.
2025-11-12cxl: Clarify comment in spa_maps_hpa()Dave Jiang
Update the comment in spa_maps_hpa() to clearly convey the construction of extended linear cache. Suggested-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/linux-cxl/68eea19c7e67e_2f899100a8@dwillia2-mobl4.notmuch/ Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20251106170108.1468304-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-12cxl: Rename region_res_match_cxl_range() to spa_maps_hpa()Dave Jiang
The function name region_res_match_cxl_range() does not accurately convey the operation of address comparison with cache size. Rename to spa_maps_hpa() to provide a better function name. Suggested-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/linux-cxl/68eea19c7e67e_2f899100a8@dwillia2-mobl4.notmuch/ Reviewed-by: Jonathan Cameron <jonathan.cameron@huwei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Link: https://patch.msgid.link/20251106170108.1468304-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-12cxl: Add handling of locked CXL decoderDave Jiang
When a decoder is locked, it means that its configuration cannot be changed. CXL spec r3.2 8.2.4.20.13 discusses the details regarding locked decoders. Locking happens when bit 8 of the decoder control register is set and then the decoder is committed afterwards (CXL spec r3.2 8.2.4.20.7). Given that the driver creates a virtual decoder for each CFMWS, the Fixed Device Configuration (bit 4) of the Window Restriction field is considered as locking for the virtual decoder by the driver. The current driver code disregards the locked status and a region can be destroyed regardless of the locking state. Add a region flag to indicate the region is in a locked configuration. The driver will considered a region locked if the CFMWS or any decoder is configured as locked. The consideration is all or nothing regarding the locked state. It is reasonable to determine the region "locked" status while the region is being assembled based on the decoders. Add a check in region commit_store() to intercept when a 0 is written to the commit sysfs attribute in order to prevent the destruction of a region when in locked state. This should be the only entry point from user space to destroy a region. Add a check is added to cxl_decoder_reset() to prevent resetting a locked decoder within the kernel driver. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251105201826.2901915-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-07cxl: Adjust offset calculation for poison injectionDave Jiang
The HPA to DPA translation for poison injection assumes that the base address starts from where the CXL region begins. When the extended linear cache is active, the offset can be within the DRAM region. Adjust the offset so that it correctly reflects the offset within the CXL region. [ dj: Add fixes tag from Alison ] Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset") Link: https://patch.msgid.link/20251031173224.3537030-5-dave.jiang@intel.com Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-03cxl/region: Add support to indicate region has extended linear cacheDave Jiang
Add a region sysfs attribute to show the size of the extended linear cache if there is any. The attribute is invisible when the cache size is 0, which indicates it does not exist. Moved the cxl_region_visible() location in order to pick up the new sysfs attribute definition. [ dj: Fixed spelling errors noted by Benjamin ] Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251022203052.4078527-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-03cxl: Adjust extended linear cache failure emission in cxl_acpiDave Jiang
The cxl_acpi module spams "Extended linear cache calculation failed" when the hmat memory target is not found for a node. This is normal when the memory target does not contain extended linear cache attributes. Adjust cxl_acpi_set_cache_size() to just return 0 if error is returned from hmat_get_extended_linear_cache_size(). That is the only error returned from hmat_get_extended_linear_cache_size() as -ENOENT. Also remove the check for -EOPNOTSUPP in cxl_setup_extended_linear_cache() since that errno is never returned by cxl_acpi_set_cache_size(). [dj: Flipped minor return logic suggested by Jonathan ] Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20251003185509.3215900-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-03cxl/acpi: Make the XOR calculations available for testingAlison Schofield
In preparation for adding a test module that can exercise the address translation functions performed by the CXL Driver, refactor the XOR implementation like this: - Extract the core calculation into a standalone helper function, - Export the new function for use by test module cxl_translate only, - Enhance the parameter validation since this new function will be called from a test module with no guarantee of valid parameters, - Move the define of struct cxl_cxims_data to cxl.h so the test module can build xormaps. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-03cxl/region: Refactor address translation funcs for testingAlison Schofield
In preparation for adding a test module that exercises the address translation calculations, extract the core calculations into stand- alone functions that operate on base parameters without dependencies on struct cxl_region. Perform additional parameter validation to protect against a test module sending bad parameters. Export the validation function, as well as the three core translation functions for use by test module cxl_translate only. This refactoring enables unit testing of the address translation logic with controlled inputs, while preserving identical functionality in the existing code paths. Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-03cxl/pci: replace use of system_wq with system_percpu_wqMarco Crivellari
Currently if a user enqueue a work item using schedule_delayed_work() the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work() that is using system_wq and queue_work(), that makes use again of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. system_wq should be the per-cpu workqueue, yet in this name nothing makes that clear, so replace system_wq with system_percpu_wq. The old wq (system_wq) will be kept for a few release cycles. See 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") for cause of changes. [ dj: Add reference to commit that initiated the change. ] Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Acked-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>> --- Link: https://patch.msgid.link/20251030163839.307752-1-marco.crivellari@suse.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-03cxl: fix typos in cdat.c commentsAlok Tiwari
- Corrected spelling of "bandwdith" -> "bandwidth" - Fixed "wht" -> "with" Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-11-03cxl/port: Remove devm_cxl_port_enumerate_dports()Li Ming
devm_cxl_port_enumerate_dports() is not longer used after below commit commit 4f06d81e7c6a ("cxl: Defer dport allocation for switch ports") Delete it and the relevant interface implemented in cxl_test. Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-14cxl/trace: Subtract to find an hpa_alias0 in cxl_poison eventsAlison Schofield
Traces of cxl_poison events include an hpa_alias0 field if the poison address is in a region configured with an ELC, Extended Linear Cache. Since the ELC always comes first in the region, the calculation needs to subtract the ELC size from the calculated HPA address. Fixes: 8c520c5f1e76 ("cxl: Add extended linear cache address alias emission for cxl events") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-14cxl/region: Use %pa printk format to emit resource_size_tAlison Schofield
KASAN reports a stack-out-of-bounds access in validate_region_offset() while running the cxl-poison.sh unit test because the printk format specifier, %pr format, is not a match for the resource_size_t type of the variables. %pr expects struct resource pointers and attempts to dereference the structure fields, reading beyond the bounds of the stack variables. Since these messages emit an 'A exceeds B' type of message, keep the resource_size_t's and use the %pa specifier to be architecture safe. BUG: KASAN: stack-out-of-bounds in resource_string.isra.0+0xe9a/0x1690 [] Read of size 8 at addr ffff88800a7afb40 by task bash/1397 ... [] The buggy address belongs to stack of task bash/1397 [] and is located at offset 56 in frame: [] validate_region_offset+0x0/0x1c0 [cxl_core] Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-14cxl: Fix match_region_by_range() to use region_res_match_cxl_range()Dave Jiang
match_region_by_range() is not using the helper function that also takes extended linear cache size into account when comparing regions. This causes a x2 region to show up as 2 partial incomplete regions rather than a single CXL region with extended linear cache support. Replace the open coded compare logic with the proper helper function for comparison. User visible impact is that when 'cxl list' is issued, no activa CXL region(s) are shown. There may be multiple idle regions present. No actual active CXL region is present in the kernel. [dj: Fix stable address] Fixes: 0ec9849b6333 ("acpi/hmat / cxl: Add extended linear cache support for CXL") Cc: stable@vger.kernel.org Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-14cxl: Set range param for region_res_match_cxl_range() as constDave Jiang
The function takes two parameters and compares them. The second parameter should be const since no modification should be done to it. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-14cxl/acpi: Fix setup of memory resource in cxl_acpi_set_cache_size()Dave Jiang
In order to compare the resource against the HMAT memory target, the resource needs to be memory type. Change the DEFINE_RES() macro to DEFINE_RES_MEM() in order to set the correct resource type. hmat_get_extended_linear_cache_size() uses resource_contains() internally. This causes a regression for platforms with the extended linear cache enabled as the comparison always fails and the cache size is not set. User visible impact is that when 'cxl list' is issued, a CXL region with extended linear cache support will only report half the size of the actual size. And this also breaks MCE reporting of the memory region due to incorrect offset calculation for the memory. [dj: Fixup commit log suggested by djbw] [dj: Fixup stable address for cc] Fixes: 12b3d697c812 ("cxl: Remove core/acpi.c and cxl core dependency on ACPI") Cc: stable@vger.kernel.org Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-13cxl/features: Add check for no entries in cxl_feature_infoDave Jiang
cxl EDAC calls cxl_feature_info() to get the feature information and if the hardware has no Features support, cxlfs may be passed in as NULL. [ 51.957498] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 51.965571] #PF: supervisor read access in kernel mode [ 51.971559] #PF: error_code(0x0000) - not-present page [ 51.977542] PGD 17e4f6067 P4D 0 [ 51.981384] Oops: Oops: 0000 [#1] SMP NOPTI [ 51.986300] CPU: 49 UID: 0 PID: 3782 Comm: systemd-udevd Not tainted 6.17.0dj test+ #64 PREEMPT(voluntary) [ 51.997355] Hardware name: <removed> [ 52.009790] RIP: 0010:cxl_feature_info+0xa/0x80 [cxl_core] Add a check for cxlfs before dereferencing it and return -EOPNOTSUPP if there is no cxlfs created due to no hardware support. Fixes: eb5dfcb9e36d ("cxl: Add support to handle user feature commands for set feature") Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-10-13cxl/port: Avoid missing port component registers setupLi Ming
port->nr_dports is used to represent how many dports added to the cxl port, it will increase in add_dport() when a new dport is being added to the cxl port, but it will not be reduced when a dport is removed from the cxl port. Currently, when the first dport is added to a cxl port, it will trigger component registers setup on the cxl port, the implementation is using port->nr_dports to confirm if the dport is the first dport. A corner case here is that adding dport could fail after port->nr_dports updating and before checking port->nr_dports for component registers setup. If the failure happens during the first dport attaching, it will cause that CXL subsystem has not chance to execute component registers setup for the cxl port. the failure flow like below: port->nr_dports = 0 dport 1 adding to the port: add_dport() # port->nr_dports: 1 failed on devm_add_action_or_reset() or sysfs_create_link() return error # port->nr_dports: 1 dport 2 adding to the port: add_dport() # port->nr_dports: 2 no failure skip component registers setup because of port->nr_dports is 2 The solution here is that moving component registers setup closer to add_dport(), so if add_dport() is executed correctly for the first dport, component registers setup on the port will be executed immediately after that. Fixes: f6ee24913de2 ("cxl: Move port register setup to when first dport appear") Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18Merge branch 'for-6.18/cxl-delay-dport' into cxl-for-nextDave Jiang
Add changes to delay the allocation and setup of dports until when the endpoint device is being probed. At this point, the CXL link is established from endpoint to host bridge. Addresses issues seen on some platforms when dports are probed earlier. Link: https://lore.kernel.org/linux-cxl/20250829180928.842707-1-dave.jiang@intel.com/
2025-09-18cxl: Move port register setup to when first dport appearDave Jiang
This patch moves the port register setup to when the first dport appears via the memdev probe path. At this point, the CXL link should be established and the register access is expected to succeed. This change addresses an error message observed when PCIe hotplug is enabled on an Intel platform. The error messages "cxl portN: Couldn't locate the CXL.cache and CXL.mem capability array header" is observed for the host bridge (CHBCR) during cxl_acpi driver probe. If the cxl_acpi module probe is running before the CXL link between the endpoint device and the RP is established, then the platform may not have exposed DVSEC ID 3 and/or DVSEC ID 7 blocks which will trigger the error message. This behavior is defined by the CXL spec r3.2 9.12.3 for RPs and DSPs, however the Intel platform also added this behavior to the host bridge. This change also needs the dport enumeration to be moved to the memdev probe path in order to address the issue. This change is not a wholly contained solution by itself. [dj: Add missing var init during port alloc] Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18cxl: Change sslbis handler to only handle single dportDave Jiang
While cxl_switch_parse_cdat() is harmless to be run multiple times, it is not efficient in the current scheme where one dport is being updated at a time by the memdev probe path. Change the input parameter to the specific dport being updated to pick up the SSLBIS information for just that dport. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18cxl/test: Adjust the mock version of devm_cxl_switch_port_decoders_setup()Dave Jiang
With devm_cxl_switch_port_decoders_setup() being called within cxl_core instead of by the port driver probe, adjustments are needed to deal with circular symbol dependency when this function is being mock'd. Add the appropriate changes to get around the circular dependency. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18cxl/test: Add mock version of devm_cxl_add_dport_by_dev()Dave Jiang
devm_cxl_add_dport_by_dev() outside of cxl_test is done through PCI hierarchy. However with cxl_test, it needs to be done through the platform device hierarchy. Add the mock function for devm_cxl_add_dport_by_dev(). When cxl_core calls a cxl_core exported function and that function is mocked by cxl_test, the call chain causes a circular dependency issue. Dan provided a workaround to avoid this issue. Apply the method to changes from the late dport allocation changes in order to enable cxl-test. In cxl_core they are defined with "__" added in front of the function. A macro is used to define the original function names for when non-test version of the kernel is built. A bit of macros and typedefs are used to allow mocking of those functions in cxl_test. Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18cxl: Defer dport allocation for switch portsDave Jiang
The current implementation enumerates the dports during the cxl_port driver probe. Without an endpoint connected, the dport may not be active during port probe. This scheme may prevent a valid hardware dport id to be retrieved and MMIO registers to be read when an endpoint is hot-plugged. Move the dport allocation and setup to behind memdev probe so the endpoint is guaranteed to be connected. In the original enumeration behavior, there are 3 phases (or 2 if no CXL switches) for port creation. cxl_acpi() creates a Root Port (RP) from the ACPI0017.N device. Through that it enumerates downstream ports composed of ACPI0016.N devices through add_host_bridge_dport(). Once done, it uses add_host_bridge_uport() to create the ports that enumerate the PCI RPs as the dports of these ports. Every time a port is created, the port driver is attached, cxl_switch_porbe_probe() is called and devm_cxl_port_enumerate_dports() is invoked to enumerate and probe the dports. The second phase is if there are any CXL switches. When the pci endpoint device driver (cxl_pci) calls probe, it will add a mem device and triggers the cxl_mem_probe(). cxl_mem_probe() calls devm_cxl_enumerate_ports() and attempts to discovery and create all the ports represent CXL switches. During this phase, a port is created per switch and the attached dports are also enumerated and probed. The last phase is creating endpoint port which happens for all endpoint devices. The new sequence is instead of creating all possible dports at initial port creation, defer port instantiation until a memdev beneath that dport arrives. Introduce devm_cxl_create_or_extend_port() to centralize the creation and extension of ports with new dports as memory devices arrive. As part of this rework, switch decoder target list is amended at runtime as dports show up. While the decoders are allocated during the port driver probe, The decoders must also be updated since previously they were setup when all the dports are setup. Now every time a dport is setup per endpoint, the switch target listing need to be updated with new dport. A guard(rwsem_write) is used to update decoder targets. This is similar to when decoder_populate_target() is called and the decoder programming must be protected. Also the port registers are probed the first time when the first dport shows up. This ensures that the CXL link is established when the port registers are probed. [dj] Use ERR_CAST() (Jonathan) Link: https://lore.kernel.org/linux-cxl/20250305100123.3077031-1-rrichter@amd.com/ Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-18cxl/test: Refactor decoder setup to reduce cxl_test burdenDave Jiang
Group the decoder setup code in switch and endpoint port probe into a single function for each to reduce the number of functions to be mocked in cxl_test. Introduce devm_cxl_switch_port_decoders_setup() and devm_cxl_endpoint_decoders_setup(). These two functions will be mocked instead with some functions optimized out since the mock version does not do anything. Remove devm_cxl_setup_hdm(), devm_cxl_add_passthrough_decoder(), and devm_cxl_enumerate_decoders() in cxl_test mock code. In turn, mock_cxl_add_passthrough_decoder() can be removed since cxl_test does not setup passthrough decoders. __wrap_cxl_hdm_decode_init() and __wrap_cxl_dvsec_rr_decode() can be removed as well since they only return 0 when called. [dj: drop 'struct cxl_port' forward declaration (Robert)] Suggested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-17cxl: Add a cached copy of target_map to cxl_decoderDave Jiang
Add a cached copy of the hardware port-id list that is available at init before all @dport objects have been instantiated. Change is in preparation of delayed dport instantiation. Reviewed-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-17cxl: Add helper to delete dportDave Jiang
Refactor the code in reap_dports() out to provide a helper function that reaps a single dport. This will be used later in the cleanup path for allocating a dport. Renaming to del_port() and del_dports() to mirror devm_cxl_add_dport(). [dj] Fixed up subject per Robert Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-17cxl: Add helper to detect top of CXL device topologyDave Jiang
Add a helper to replace the open code detection of CXL device hierarchy root, or the host bridge. The helper will be used for delayed downstream port (dport) creation. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Tested-by: Robert Richter <rrichter@amd.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-10cxl/acpi: Rename CFMW coherency restrictionsDavidlohr Bueso
ACPICA commit 710745713ad3a2543dbfb70e84764f31f0e46bdc This has been renamed in more recent CXL specs, as type3 (memory expanders) can also use HDM-DB for device coherent memory. Link: https://github.com/acpica/acpica/commit/710745713ad3a2543dbfb70e84764f31f0e46bdc Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20250908160034.86471-1-dave@stgolabs.net Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-10Merge branch 'for-6.18/cxl-update-access-coordinates' into cxl-for-nextDave Jiang
Update the CXL memory hotplug notifier to update the NUMA node access coordinates directly rather than go through the HMAT memory hotplug notifier.
2025-09-02acpi/hmat: Remove now unused hmat_update_target_coordinates()Dave Jiang
Remove deadcode since CXL no longer calls hmat_update_target_coordinates(). Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20250829222907.1290912-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-02cxl, acpi/hmat: Update CXL access coordinates directly instead of through HMATDave Jiang
The current implementation of CXL memory hotplug notifier gets called before the HMAT memory hotplug notifier. The CXL driver calculates the access coordinates (bandwidth and latency values) for the CXL end to end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL memory hotplug notifier writes the access coordinates to the HMAT target structs. Then the HMAT memory hotplug notifier is called and it creates the access coordinates for the node sysfs attributes. During testing on an Intel platform, it was found that although the newly calculated coordinates were pushed to sysfs, the sysfs attributes for the access coordinates showed up with the wrong initiator. The system has 4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are CXL nodes. The expectation is that node 2 would show up as a target to node 0: /sys/devices/system/node/node2/access0/initiators/node0 However it was observed that node 2 showed up as a target under node 1: /sys/devices/system/node/node2/access0/initiators/node1 The original intent of the 'ext_updated' flag in HMAT handling code was to stop HMAT memory hotplug callback from clobbering the access coordinates after CXL has injected its calculated coordinates and replaced the generic target access coordinates provided by the HMAT table in the HMAT target structs. However the flag is hacky at best and blocks the updates from other CXL regions that are onlined in the same node later on. Remove the 'ext_updated' flag usage and just update the access coordinates for the nodes directly without touching HMAT target data. The hotplug memory callback ordering is changed. Instead of changing CXL, move HMAT back so there's room for the levels rather than have CXL share the same level as SLAB_CALLBACK_PRI. The change will resulting in the CXL callback to be executed after the HMAT callback. With the change, the CXL hotplug memory notifier runs after the HMAT callback. The HMAT callback will create the node sysfs attributes for access coordinates. The CXL callback will write the access coordinates to the now created node sysfs attributes directly and will not pollute the HMAT target values. A nodemask is introduced to keep track if a node has been updated and prevents further updates. Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region") Cc: stable@vger.kernel.org Tested-by: Marc Herbert <marc.herbert@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20250829222907.1290912-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-08-18cxl: Fix emit of type resource_size_t argument for validate_region_offset()Dave Jiang
0day reported warnings of: drivers/cxl/core/region.c:3664:25: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] drivers/cxl/core/region.c:3671:37: warning: format '%llx' expects argument of type 'long long unsigned int', but argument 4 has type 'resource_size_t' {aka 'unsigned int'} [-Wformat=] Replace %#llx with %pr to emit resource_size_t arguments. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508160513.NAZ9i9rQ-lkp@intel.com/ Cc: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20250818153953.3658952-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-08-12Merge branch 'for-6.18/cxl-poison-inject' into cxl-for-nextDave Jiang
Add support to allow expert users to inject and clear poison for the CXL subsystem by writing a System Physical Address (SPA) to a debugfs file.