summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-09-02cxl, acpi/hmat: Update CXL access coordinates directly instead of through HMATDave Jiang
The current implementation of CXL memory hotplug notifier gets called before the HMAT memory hotplug notifier. The CXL driver calculates the access coordinates (bandwidth and latency values) for the CXL end to end path (i.e. CPU to endpoint). When the CXL region is onlined, the CXL memory hotplug notifier writes the access coordinates to the HMAT target structs. Then the HMAT memory hotplug notifier is called and it creates the access coordinates for the node sysfs attributes. During testing on an Intel platform, it was found that although the newly calculated coordinates were pushed to sysfs, the sysfs attributes for the access coordinates showed up with the wrong initiator. The system has 4 nodes (0, 1, 2, 3) where node 0 and 1 are CPU nodes and node 2 and 3 are CXL nodes. The expectation is that node 2 would show up as a target to node 0: /sys/devices/system/node/node2/access0/initiators/node0 However it was observed that node 2 showed up as a target under node 1: /sys/devices/system/node/node2/access0/initiators/node1 The original intent of the 'ext_updated' flag in HMAT handling code was to stop HMAT memory hotplug callback from clobbering the access coordinates after CXL has injected its calculated coordinates and replaced the generic target access coordinates provided by the HMAT table in the HMAT target structs. However the flag is hacky at best and blocks the updates from other CXL regions that are onlined in the same node later on. Remove the 'ext_updated' flag usage and just update the access coordinates for the nodes directly without touching HMAT target data. The hotplug memory callback ordering is changed. Instead of changing CXL, move HMAT back so there's room for the levels rather than have CXL share the same level as SLAB_CALLBACK_PRI. The change will resulting in the CXL callback to be executed after the HMAT callback. With the change, the CXL hotplug memory notifier runs after the HMAT callback. The HMAT callback will create the node sysfs attributes for access coordinates. The CXL callback will write the access coordinates to the now created node sysfs attributes directly and will not pollute the HMAT target values. A nodemask is introduced to keep track if a node has been updated and prevents further updates. Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region") Cc: stable@vger.kernel.org Tested-by: Marc Herbert <marc.herbert@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20250829222907.1290912-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-02drivers/base/node: Add a helper function node_update_perf_attrs()Dave Jiang
Add helper function node_update_perf_attrs() to allow update of node access coordinates computed by an external agent such as CXL. The helper allows updating of coordinates after the attribute being created by HMAT. Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20250829222907.1290912-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-09-02drm/amdgpu/amdkfd: Avoid a couple hundred -Wflex-array-member-not-at-end ↵Gustavo A. R. Silva
warnings -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Move the conflicting declarations to the end of the corresponding structures. Notice that `struct dev_pagemap` is a flexible structure, this is a structure that contains a flexible-array member. struct dev_pagemap always has room for at least one range. amdgpu only uses a single range. Therefore no change are needed to the allocation of struct amdgpu_device. Fix 283 of the following type of warnings: 283 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h:111:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02net: mvpp2: add xlg pcs inband capabilitiesRussell King (Oracle)
Add PCS inband capabilities for XLG in the Marvell PP2 driver, so phylink knows that 5G and 10G speeds have no inband capabilities. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/E1uslR9-00000001OxL-44CD@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02net: sfp: add quirk for FLYPRO copper SFP+ moduleAleksander Jan Bajkowski
Add quirk for a copper SFP that identifies itself as "FLYPRO" "SFP-10GT-CS-30M". It uses RollBall protocol to talk to the PHY. Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250831105910.3174-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02bonding: Remove support for use_carrierJay Vosburgh
Remove the implementation of use_carrier, the link monitoring method that utilizes ethtool or ioctl to determine the link state of an interface in a bond. Bonding will always behaves as if use_carrier=1, which relies on netif_carrier_ok() to determine the link state of interfaces. To avoid acquiring RTNL many times per second, bonding inspects link state under RCU, but not under RTNL. However, ethtool implementations in drivers may sleep, and therefore this strategy is unsuitable for use with calls into driver ethtool functions. The use_carrier option was introduced in 2003, to provide backwards compatibility for network device drivers that did not support the then-new netif_carrier_ok/on/off system. Device drivers are now expected to support netif_carrier_*, and the use_carrier backwards compatibility logic is no longer necessary. The option itself remains, but when queried always returns 1, and may only be set to 1. Link: https://lore.kernel.org/000000000000eb54bf061cfd666a@google.com Link: https://lore.kernel.org/20240718122017.d2e33aaac43a.I10ab9c9ded97163aef4e4de10985cd8f7de60d28@changeid Signed-off-by: Jay Vosburgh <jv@jvosburgh.net> Reported-by: syzbot+b8c48ea38ca27d150063@syzkaller.appspotmail.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/2029487.1756512517@famine Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-02drm/xe/guc: Increase GuC crash dump buffer sizeZhanjun Dong
There are platforms already have a maximum dump size of 12KB, to avoid data truncating, increase GuC crash dump buffer size to 16KB. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250829160427.1245732-1-zhanjun.dong@intel.com
2025-09-02Merge tag 'mm-hotfixes-stable-2025-09-01-17-20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "17 hotfixes. 13 are cc:stable and the remainder address post-6.16 issues or aren't considered necessary for -stable kernels. 11 of these fixes are for MM. This includes a three-patch series from Harry Yoo which fixes an intermittent boot failure which can occur on x86 systems. And a two-patch series from Alexander Gordeev which fixes a KASAN crash on S390 systems" * tag 'mm-hotfixes-stable-2025-09-01-17-20' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: fix possible deadlock in kmemleak x86/mm/64: define ARCH_PAGE_TABLE_SYNC_MASK and arch_sync_kernel_mappings() mm: introduce and use {pgd,p4d}_populate_kernel() mm: move page table sync declarations to linux/pgtable.h proc: fix missing pde_set_flags() for net proc files mm: fix accounting of memmap pages mm/damon/core: prevent unnecessary overflow in damos_set_effective_quota() kexec: add KEXEC_FILE_NO_CMA as a legal flag kasan: fix GCC mem-intrinsic prefix with sw tags mm/kasan: avoid lazy MMU mode hazards mm/kasan: fix vmalloc shadow memory (de-)population races kunit: kasan_test: disable fortify string checker on kasan_strings() test selftests/mm: fix FORCE_READ to read input value correctly mm/userfaultfd: fix kmap_local LIFO ordering for CONFIG_HIGHPTE ocfs2: prevent release journal inode after journal shutdown rust: mm: mark VmaNew as transparent of_numa: fix uninitialized memory nodes causing kernel panic
2025-09-02drm/amd/amdgpu: Fix missing error return on kzalloc failureColin Ian King
Currently the kzalloc failure check just sets reports the failure and sets the variable ret to -ENOMEM, which is not checked later for this specific error. Fix this by just returning -ENOMEM rather than setting ret. Fixes: 4fb930715468 ("drm/amd/amdgpu: remove redundant host to psp cmd buf allocations") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Print VCE clocks too in si_dpm (v3)Timur Kristóf
They are part of a power state too and should be printed alongside the rest of the data from the power state. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Remove wm_low and wm_high fields from amdgpu_crtc (v2)Timur Kristóf
These fields were only used by si_dpm and are not necessary anymore. They also may have been incorrect because: - wm_high was set to the LOW_WATERMARK field of watermark A. - wm_low was not set on DCE 6 and was always zero. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Disable SCLK switching on Oland with high pixel clocks (v3)Timur Kristóf
Port of commit 227545b9a08c ("drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected") This is an ad-hoc DPM fix, necessary because we don't have proper bandwidth calculation for DCE 6. We define "high pixelclock" for SI as higher than necessary for 4K 30Hz. For example, 4K 60Hz and 1080p 144Hz fall into this category. When two high pixel clock displays are connected to Oland, additionally disable shader clock switching, which results in a higher voltage, thereby addressing some visible flickering. v2: Add more comments. v3: Split into two commits for easier review. Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Disable MCLK switching with non-DC at 120 Hz+ (v2)Timur Kristóf
According to pp_pm_compute_clocks the non-DC display code has "issues with mclk switching with refresh rates over 120 hz". The workaround is to disable MCLK switching in this case. Do the same for legacy DPM. Fixes: 6ddbd37f1074 ("drm/amd/pm: optimize the amdgpu_pm_compute_clocks() implementations") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3)Timur Kristóf
Some parts of the code base expect that MCLK switching is turned off when the vblank time is set to zero. According to pp_pm_compute_clocks the non-DC code has issues with MCLK switching with refresh rates over 120 Hz. v3: Add code comment to explain this better. Add an if statement instead of changing the switch_limit. Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Adjust si_upload_smc_data register programming (v3)Timur Kristóf
Based on some comments in dm_pp_display_configuration above the crtc_index and line_time fields, these values are programmed to the SMC to work around an SMC hang when it switches MCLK. According to Alex, the Windows driver programs them to: mclk_change_block_cp_min = 200 / line_time mclk_change_block_cp_max = 100 / line_time Let's use the same for the sake of consistency. Previously we used the watermark values, but it seemed buggy as the code was mixing up low/high and A/B watermarks, and was not saving a low watermark value on DCE 6, so mclk_change_block_cp_max would be always zero previously. Split this change off from the previous si_upload_smc_data to make it easier to bisect, in case it causes any issues. Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Fix si_upload_smc_data (v3)Timur Kristóf
The si_upload_smc_data function uses si_write_smc_soft_register to set some register values in the SMC, and expects the result to be PPSMC_Result_OK which is 1. The PPSMC_Result_OK / PPSMC_Result_Failed values are used for checking the result of a command sent to the SMC. However, the si_write_smc_soft_register actually doesn't send any commands to the SMC and returns zero on success, so this check was incorrect. Fix that by not checking the return value, just like other calls to si_write_smc_soft_register. v3: Additionally, when no display is plugged in, there is no need to restrict MCLK switching, so program the registers to zero. Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Increase SMC timeout on SI and warn (v3)Timur Kristóf
The SMC can take an excessive amount of time to process some messages under some conditions. Background: Sending a message to the SMC works by writing the message into the mmSMC_MESSAGE_0 register and its optional parameter into the mmSMC_SCRATCH0, and then polling mmSMC_RESP_0. Previously the timeout was AMDGPU_MAX_USEC_TIMEOUT, ie. 100 ms. Increase the timeout to 200 ms for all messages and to 1 sec for a few messages which I've observed to be especially slow: PPSMC_MSG_NoForcedLevel PPSMC_MSG_SetEnabledLevels PPSMC_MSG_SetForcedLevels PPSMC_MSG_DisableULV PPSMC_MSG_SwitchToSwState This fixes the following problems on Tahiti when switching from a lower clock power state to a higher clock state, such as when DC turns on a display which was previously turned off. * si_restrict_performance_levels_before_switch would fail (if the user previously forced high clocks using sysfs) * si_set_sw_state would fail (always) It turns out that both of those failures were SMC timeouts and that the SMC actually didn't fail or hang, just needs more time to process those. Add a warning when there is an SMC timeout to make it easier to identify this type of problem in the future. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amd/pm: Disable ULV even if unsupported (v3)Timur Kristóf
Always send PPSMC_MSG_DisableULV to the SMC, even if ULV mode is unsupported, to make sure it is properly turned off. v3: Simplify si_disable_ulv further. Always check the return value of amdgpu_si_send_msg_to_smc. Fixes: 841686df9f7d ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amdgpu: Power up UVD 3 for FW validation (v2)Timur Kristóf
Unlike later versions, UVD 3 has firmware validation. For this to work, the UVD should be powered up correctly. When DPM is enabled and the display clock is off, the SMU may choose a power state which doesn't power the UVD, which can result in failure to initialize UVD. v2: Add code comments to explain about the UVD power state and how UVD clock is turned on/off. Fixes: b38f3e80ecec ("drm amdgpu: SI UVD v3_1 (v2)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amdgpu: Allow kfd CRIU with no buffer objectsDavid Francis
The kfd CRIU checkpoint ioctl would return an error if trying to checkpoint a process with no kfd buffer objects. This is a normal case and should not be an error. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amdgpu: Add mapping info option for GEM_OP ioctlDavid Francis
Add new GEM_OP_IOCTL option GET_MAPPING_INFO, which returns a list of mappings associated with a given bo, along with their positions and offsets. Userspace for this and the previous change can be found at: https://github.com/checkpoint-restore/criu/pull/2613 Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amdgpu: Add ioctl to get all gem handles for a processDavid Francis
Add new ioctl DRM_IOCTL_AMDGPU_GEM_LIST_HANDLES. This ioctl returns a list of bos with their handles, sizes, and flags and domains. This ioctl is meant to be used during CRIU checkpoint and provide information needed to reconstruct the bos in CRIU restore. Userspace for this and the next change can be found at https://github.com/checkpoint-restore/criu/pull/2613 Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02drm/amdgpu: Allow more flags to be set on gem create.David Francis
The GEM create flag AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE specifies that gem memory contains sensitive information and should be cleared to prevent snooping. The COHERENT and UNCACHED gem create flags enable memory features related to sharing memory across devices. For CRIU we need to re-create KFD BOs through the GEM_CREATE IOCTL, so allow those KFD specific flags here as well. This will also aid us in the future and allows to move the KFD components over using the render node for allocations. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-09-02nvme: fix PI insert on writeChristoph Hellwig
I recently ran into an issue where the PI generated using the block layer integrity code differs from that from a kernel using the PRACT fallback when the block layer integrity code is disabled, and I tracked this down to us using PRACT incorrectly. The NVM Command Set Specification (section 5.33 in 1.2, similar in older versions) specifies the PRACT insert behavior as: Inserted protection information consists of the computed CRC for the protection information format (refer to section 5.3.1) in the Guard field, the LBAT field value in the Application Tag field, the LBST field value in the Storage Tag field, if defined, and the computed reference tag in the Logical Block Reference Tag. Where the computed reference tag is defined as following for type 1 and type 2 using the text below that is duplicated in the respective bullet points: the value of the computed reference tag for the first logical block of the command is the value contained in the Initial Logical Block Reference Tag (ILBRT) or Expected Initial Logical Block Reference Tag (EILBRT) field in the command, and the computed reference tag is incremented for each subsequent logical block. So we need to set ILBRT field, but we currently don't. Interestingly this works fine on my older type 1 formatted SSD, but Qemu trips up on this. We already set ILBRT for Write Same since commit aeb7bb061be5 ("nvme: set the PRACT bit when using Write Zeroes with T10 PI"). To ease this, move the PI type check into nvme_set_ref_tag. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-09-02e1000e: fix heap overflow in e1000_set_eepromVitaly Lifshits
Fix a possible heap overflow in e1000_set_eeprom function by adding input validation for the requested length of the change in the EEPROM. In addition, change the variable type from int to size_t for better code practices and rearrange declarations to RCT. Cc: stable@vger.kernel.org Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)") Co-developed-by: Mikael Wessel <post@mikaelkw.online> Signed-off-by: Mikael Wessel <post@mikaelkw.online> Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02ixgbe: fix incorrect map used in eee linkmodeAlok Tiwari
incorrectly used ixgbe_lp_map in loops intended to populate the supported and advertised EEE linkmode bitmaps based on ixgbe_ls_map. This results in incorrect bit setting and potential out-of-bounds access, since ixgbe_lp_map and ixgbe_ls_map have different sizes and purposes. ixgbe_lp_map[i] -> ixgbe_ls_map[i] Use ixgbe_ls_map for supported and advertised linkmodes, and keep ixgbe_lp_map usage only for link partner (lp_advertised) mapping. Fixes: 9356b6db9d05 ("net: ethernet: ixgbe: Convert EEE to use linkmodes") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02i40e: Fix potential invalid access when MAC list is emptyZhen Ni
list_first_entry() never returns NULL - if the list is empty, it still returns a pointer to an invalid object, leading to potential invalid memory access when dereferenced. Fix this by using list_first_entry_or_null instead of list_first_entry. Fixes: e3219ce6a775 ("i40e: Add support for client interface for IWARP driver") Signed-off-by: Zhen Ni <zhen.ni@easystack.cn> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02i40e: remove read access to debugfs filesJacob Keller
The 'command' and 'netdev_ops' debugfs files are a legacy debugging interface supported by the i40e driver since its early days by commit 02e9c290814c ("i40e: debugfs interface"). Both of these debugfs files provide a read handler which is mostly useless, and which is implemented with questionable logic. They both use a static 256 byte buffer which is initialized to the empty string. In the case of the 'command' file this buffer is literally never used and simply wastes space. In the case of the 'netdev_ops' file, the last command written is saved here. On read, the files contents are presented as the name of the device followed by a colon and then the contents of their respective static buffer. For 'command' this will always be "<device>: ". For 'netdev_ops', this will be "<device>: <last command written>". But note the buffer is shared between all devices operated by this module. At best, it is mostly meaningless information, and at worse it could be accessed simultaneously as there doesn't appear to be any locking mechanism. We have also recently received multiple reports for both read functions about their use of snprintf and potential overflow that could result in reading arbitrary kernel memory. For the 'command' file, this is definitely impossible, since the static buffer is always zero and never written to. For the 'netdev_ops' file, it does appear to be possible, if the user carefully crafts the command input, it will be copied into the buffer, which could be large enough to cause snprintf to truncate, which then causes the copy_to_user to read beyond the length of the buffer allocated by kzalloc. A minimal fix would be to replace snprintf() with scnprintf() which would cap the return to the number of bytes written, preventing an overflow. A more involved fix would be to drop the mostly useless static buffers, saving 512 bytes and modifying the read functions to stop needing those as input. Instead, lets just completely drop the read access to these files. These are debug interfaces exposed as part of debugfs, and I don't believe that dropping read access will break any script, as the provided output is pretty useless. You can find the netdev name through other more standard interfaces, and the 'netdev_ops' interface can easily result in garbage if you issue simultaneous writes to multiple devices at once. In order to properly remove the i40e_dbg_netdev_ops_buf, we need to refactor its write function to avoid using the static buffer. Instead, use the same logic as the i40e_dbg_command_write, with an allocated buffer. Update the code to use this instead of the static buffer, and ensure we free the buffer on exit. This fixes simultaneous writes to 'netdev_ops' on multiple devices, and allows us to remove the now unused static buffer along with removing the read access. Fixes: 02e9c290814c ("i40e: debugfs interface") Reported-by: Kunwu Chan <chentao@kylinos.cn> Closes: https://lore.kernel.org/intel-wired-lan/20231208031950.47410-1-chentao@kylinos.cn/ Reported-by: Wang Haoran <haoranwangsec@gmail.com> Closes: https://lore.kernel.org/all/CANZ3JQRRiOdtfQJoP9QM=6LS1Jto8PGBGw6y7-TL=BcnzHQn1Q@mail.gmail.com/ Reported-by: Amir Mohammad Jahangirzad <a.jahangirzad@gmail.com> Closes: https://lore.kernel.org/all/20250722115017.206969-1-a.jahangirzad@gmail.com/ Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kunwu Chan <kunwu.chan@linux.dev> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02idpf: set mac type when adding and removing MAC filtersEmil Tantilov
On control planes that allow changing the MAC address of the interface, the driver must provide a MAC type to avoid errors such as: idpf 0000:0a:00.0: Transaction failed (op 535) idpf 0000:0a:00.0: Received invalid MAC filter payload (op 535) (len 0) idpf 0000:0a:00.0: Transaction failed (op 536) These errors occur during driver load or when changing the MAC via: ip link set <iface> address <mac> Add logic to set the MAC type when sending ADD/DEL (opcodes 535/536) to the control plane. Since only one primary MAC is supported per vport, the driver only needs to send an ADD opcode when setting it. Remove the old address by calling __idpf_del_mac_filter(), which skips the message and just clears the entry from the internal list. This avoids an error on DEL as it attempts to remove an address already cleared by the preceding ADD opcode. Fixes: ce1b75d0635c ("idpf: add ptypes and MAC filter support") Reported-by: Jian Liu <jianliu@redhat.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02idpf: fix UAF in RDMA core aux dev deinitializationJoshua Hay
Free the adev->id before auxiliary_device_uninit. The call to uninit triggers the release callback, which frees the iadev memory containing the adev. The previous flow results in a UAF during rmmod due to the adev->id access. [264939.604077] ================================================================== [264939.604093] BUG: KASAN: slab-use-after-free in idpf_idc_deinit_core_aux_device+0xe4/0x100 [idpf] [264939.604134] Read of size 4 at addr ff1100109eb6eaf8 by task rmmod/17842 ... [264939.604635] Allocated by task 17597: [264939.604643] kasan_save_stack+0x20/0x40 [264939.604654] kasan_save_track+0x14/0x30 [264939.604663] __kasan_kmalloc+0x8f/0xa0 [264939.604672] idpf_idc_init_aux_core_dev+0x4bd/0xb60 [idpf] [264939.604700] idpf_idc_init+0x55/0xd0 [idpf] [264939.604726] process_one_work+0x658/0xfe0 [264939.604742] worker_thread+0x6e1/0xf10 [264939.604750] kthread+0x382/0x740 [264939.604762] ret_from_fork+0x23a/0x310 [264939.604772] ret_from_fork_asm+0x1a/0x30 [264939.604785] Freed by task 17842: [264939.604790] kasan_save_stack+0x20/0x40 [264939.604799] kasan_save_track+0x14/0x30 [264939.604808] kasan_save_free_info+0x3b/0x60 [264939.604820] __kasan_slab_free+0x37/0x50 [264939.604830] kfree+0xf1/0x420 [264939.604840] device_release+0x9c/0x210 [264939.604850] kobject_put+0x17c/0x4b0 [264939.604860] idpf_idc_deinit_core_aux_device+0x4f/0x100 [idpf] [264939.604886] idpf_vc_core_deinit+0xba/0x3a0 [idpf] [264939.604915] idpf_remove+0xb0/0x7c0 [idpf] [264939.604944] pci_device_remove+0xab/0x1e0 [264939.604955] device_release_driver_internal+0x371/0x530 [264939.604969] driver_detach+0xbf/0x180 [264939.604981] bus_remove_driver+0x11b/0x2a0 [264939.604991] pci_unregister_driver+0x2a/0x250 [264939.605005] __do_sys_delete_module.constprop.0+0x2eb/0x540 [264939.605014] do_syscall_64+0x64/0x2c0 [264939.605024] entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: f4312e6bfa2a ("idpf: implement core RDMA auxiliary dev create, init, and destroy") Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Tested-by: Samuel Salin <Samuel.salin@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02ice: fix NULL access of tx->in_use in ice_ll_ts_intrJacob Keller
Recent versions of the E810 firmware have support for an extra interrupt to handle report of the "low latency" Tx timestamps coming from the specialized low latency firmware interface. Instead of polling the registers, software can wait until the low latency interrupt is fired. This logic makes use of the Tx timestamp tracking structure, ice_ptp_tx, as it uses the same "ready" bitmap to track which Tx timestamps complete. Unfortunately, the ice_ll_ts_intr() function does not check if the tracker is initialized before its first access. This results in NULL dereference or use-after-free bugs similar to the issues fixed in the ice_ptp_ts_irq() function. Fix this by only checking the in_use bitmap (and other fields) if the tracker is marked as initialized. The reset flow will clear the init field under lock before it tears the tracker down, thus preventing any use-after-free or NULL access. Fixes: 82e71b226e0e ("ice: Enable SW interrupt from FW for LL TS") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02ice: fix NULL access of tx->in_use in ice_ptp_ts_irqJacob Keller
The E810 device has support for a "low latency" firmware interface to access and read the Tx timestamps. This interface does not use the standard Tx timestamp logic, due to the latency overhead of proxying sideband command requests over the firmware AdminQ. The logic still makes use of the Tx timestamp tracking structure, ice_ptp_tx, as it uses the same "ready" bitmap to track which Tx timestamps complete. Unfortunately, the ice_ptp_ts_irq() function does not check if the tracker is initialized before its first access. This results in NULL dereference or use-after-free bugs similar to the following: [245977.278756] BUG: kernel NULL pointer dereference, address: 0000000000000000 [245977.278774] RIP: 0010:_find_first_bit+0x19/0x40 [245977.278796] Call Trace: [245977.278809] ? ice_misc_intr+0x364/0x380 [ice] This can occur if a Tx timestamp interrupt races with the driver reset logic. Fix this by only checking the in_use bitmap (and other fields) if the tracker is marked as initialized. The reset flow will clear the init field under lock before it tears the tracker down, thus preventing any use-after-free or NULL access. Fixes: f9472aaabd1f ("ice: Process TSYN IRQ in a separate function") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-09-02drm/xe/vf: Enable CCS save/restore only on supported GUC versionsSatyanarayana K V P
CCS save/restore is supported starting with GuC 70.48.0 (compatibility version 1.23.0). Gate the feature on the GuC firmware version and keep it disabled on older or unsupported versions. Fixes: f3009272ff2e ("drm/xe/vf: Create contexts for CCS read write") Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Andi Shyti <andi.shyti@kernel.org> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://lore.kernel.org/r/20250902103256.21658-2-satyanarayana.k.v.p@intel.com
2025-09-02drm/bridge: ti-sn65dsi86: fix REFCLK settingMichael Walle
The bridge has three bootstrap pins which are sampled to determine the frequency of the external reference clock. The driver will also (over)write that setting. But it seems this is racy after the bridge is enabled. It was observed that although the driver write the correct value (by sniffing on the I2C bus), the register has the wrong value. The datasheet states that the GPIO lines have to be stable for at least 5us after asserting the EN signal. Thus, there seems to be some logic which samples the GPIO lines and this logic appears to overwrite the register value which was set by the driver. Waiting 20us after asserting the EN line resolves this issue. Fixes: a095f15c00e2 ("drm/bridge: add support for sn65dsi86 bridge driver") Signed-off-by: Michael Walle <mwalle@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250821122341.1257286-1-mwalle@kernel.org
2025-09-02i2c: i801: Hide Intel Birch Stream SoC TCO WDTChiasheng Lee
Hide the Intel Birch Stream SoC TCO WDT feature since it was removed. On platforms with PCH TCO WDT, this redundant device might be rendering errors like this: [ 28.144542] sysfs: cannot create duplicate filename '/bus/platform/devices/iTCO_wdt' Fixes: 8c56f9ef25a3 ("i2c: i801: Add support for Intel Birch Stream SoC") Link: https://bugzilla.kernel.org/show_bug.cgi?id=220320 Signed-off-by: Chiasheng Lee <chiasheng.lee@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.7+ Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250901125943.916522-1-chiasheng.lee@linux.intel.com
2025-09-02ACPI: processor: idle: Eliminate static variable flat_state_cntRafael J. Wysocki
Instead of using static variable flat_state_cnt to pass data between functions involved in the _LPI information processing, pass the current number of "flattened" idle states to flatten_lpi_states() and make it return the updated number of those states. At the same time, use a local variable called state_count to store the number of "flattened" idle states found so far in acpi_processor_get_lpi_info(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Tested-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: lihuisong@huawei.com Link: https://patch.msgid.link/10715991.nUPlyArG6x@rafael.j.wysocki
2025-09-02ACPI: processor: idle: Add module import namespaceRafael J. Wysocki
Add a new module import namespace called ACPI_PROCESSOR_IDLE for functions exported from the non-modular part of the ACPI processor driver to the modular part of it. Export acpi_processor_claim_cst_control() and acpi_processor_evaluate_cst() in that namespace to hide them from unrelated modules. They are also used by the intel_idle driver, but it is non-modular, so it can call them regardless of the way the symbols are exported. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/3376499.aeNJFYEL58@rafael.j.wysocki
2025-09-02ACPI: processor: idle: Optimize ACPI idle driver registrationHuisong Li
Currently, the ACPI idle driver is registered from within a CPU hotplug callback. Although this didn't cause any functional issues, this is questionable and confusing. And it is better to register the cpuidle driver when all of the CPUs have been brought up. So add a new function to initialize acpi_idle_driver based on the power management information of an available CPU and register cpuidle driver in acpi_processor_driver_init(). Signed-off-by: Huisong Li <lihuisong@huawei.com> Link: https://patch.msgid.link/20250728070612.1260859-3-lihuisong@huawei.com [ rjw: Added missing inline modifiers ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-09-02drm/msm/dsi/phy_7nm: Fix missing initial VCO rateKrzysztof Kozlowski
Driver unconditionally saves current state on first init in dsi_pll_7nm_init(), but does not save the VCO rate, only some of the divider registers. The state is then restored during probe/enable via msm_dsi_phy_enable() -> msm_dsi_phy_pll_restore_state() -> dsi_7nm_pll_restore_state(). Restoring calls dsi_pll_7nm_vco_set_rate() with pll_7nm->vco_current_rate=0, which basically overwrites existing rate of VCO and messes with clock hierarchy, by setting frequency to 0 to clock tree. This makes anyway little sense - VCO rate was not saved, so should not be restored. If PLL was not configured configure it to minimum rate to avoid glitches and configuring entire in clock hierarchy to 0 Hz. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/657827/ Link: https://lore.kernel.org/r/20250610-b4-sm8750-display-v6-9-ee633e3ddbff@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-09-02drm/msm/dsi/phy: Define PHY_CMN_CTRL_0 bitfieldsKrzysztof Kozlowski
Add bitfields for PHY_CMN_CTRL_0 registers to avoid hard-coding bit masks and shifts and make the code a bit more readable. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/657818/ Link: https://lore.kernel.org/r/20250610-b4-sm8750-display-v6-7-ee633e3ddbff@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-09-02drm/msm/dsi/phy: Toggle back buffer resync after preparing PLLKrzysztof Kozlowski
According to Hardware Programming Guide for DSI PHY, the retime buffer resync should be done after PLL clock users (byte_clk and intf_byte_clk) are enabled. Downstream also does it as part of configuring the PLL. Driver was only turning off the resync FIFO buffer, but never bringing it on again. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/657823/ Link: https://lore.kernel.org/r/20250610-b4-sm8750-display-v6-6-ee633e3ddbff@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-09-02net/mlx5: {DR,HWS}, Use the cached vhca_id for this deviceSaeed Mahameed
The mlx5 driver caches many capabilities to be used by mlx5 layers. In SW and HW steering we can use the cached vhca_id instead of invoking FW commands. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250829223722.900629-8-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-02net/mlx5: E-switch, Set representor attributes for adjacent VFsAdithya Jayachandran
Adjacent vfs get their devlink port information from firmware, use the information (pfnum, function id) from FW when populating the devlink port attributes. Before: $ devlink port show pci/0000:00:03.0/180225: type eth netdev eth0 flavour pcivf controller 0 pfnum 0 vfnum 49152 external false splittable false function: hw_addr 00:00:00:00:00:00 After: $ devlink port show pci/0000:00:03.0/180225: type eth netdev enp0s3npf0vf2 flavour pcivf controller 0 pfnum 0 vfnum 2 external false splittable false function: hw_addr 00:00:00:00:00:00 Signed-off-by: Adithya Jayachandran <ajayachandra@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250829223722.900629-7-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-02net/mlx5: E-Switch, Register representors for adjacent vportsSaeed Mahameed
Register representors for adjacent vports dynamically when they are discovered. Dynamically added representors state will now be set to 'REGISTERED' when the representor type was already registered, otherwise they won't be loaded. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250829223722.900629-6-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-02net/mlx5: E-Switch, Create acls root namespace for adjacent vportsSaeed Mahameed
Use the new vport acl root namespace add/remove API to create the missing acl root name spaces per each new adjacent function vport. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250829223722.900629-5-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-02net/mlx5: E-Switch, Add support for adjacent functions vports discoveryAdithya Jayachandran
Adding driver support to query adjacent functions vports, AKA delegated vports. Adjacent functions can delegate their sriov vfs to other sibling PF in the system, to be managed by the eswitch capable sibling PF. E.g, ECPF to Host PF, multi host PF between each other, etc. Only supported in switchdev mode. Signed-off-by: Adithya Jayachandran <ajayachandra@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250829223722.900629-4-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-02net/mlx5: E-Switch, Move vport acls root namespaces creation to eswitchSaeed Mahameed
Move the loop that creates the vports ACLs root name spaces to eswitch, since it is the eswitch responsibility to decide when and how many vports ACLs root namespaces to create, in the next patches we will use the fs_core vport ACL root namespace APIs to create/remove root ns ACLs dynamically for dynamically created vports. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250829223722.900629-3-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-02net/mlx5: FS, Convert vport acls root namespaces to xarraySaeed Mahameed
Before this patch it was a linear array and could only support a certain number of vports, in the next patches, vport numbers are not bound to a well known limit, thus convert acl root name space storage to xarray. In addition create fs_core public API to add/remove vport acl namespaces as it is the eswitch responsibility to create the vports and their root name spaces for acls, in the next patch we will move mlx5_fs_ingress_acls_{init,cleanup} to eswitch and will use the individual mlx5_fs_vport_{egress,ingresS}_acl_ns_{add,remove} APIs for dynamically create vports. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250829223722.900629-2-saeed@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-09-02drm/xe: Fix incorrect migration of backed-up object to VRAMThomas Hellström
If an object is backed up to shmem it is incorrectly identified as not having valid data by the move code. This means moving to VRAM skips the -EMULTIHOP step and the bo is cleared. This causes all sorts of weird behaviour on DGFX if an already evicted object is targeted by the shrinker. Fix this by using ttm_tt_is_swapped() to identify backed-up objects. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5996 Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.15+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250828134837.5709-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 1047bd82794a1eab64d643f196d09171ce983f44) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-09-02net: ethernet: ti: am65-cpsw-nuss: Fix null pointer dereference for ndevNishanth Menon
In the TX completion packet stage of TI SoCs with CPSW2G instance, which has single external ethernet port, ndev is accessed without being initialized if no TX packets have been processed. It results into null pointer dereference, causing kernel to crash. Fix this by having a check on the number of TX packets which have been processed. Fixes: 9a369ae3d143 ("net: ethernet: ti: am65-cpsw: remove am65_cpsw_nuss_tx_compl_packets_2g()") Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Chintan Vankar <c-vankar@ti.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250829121051.2031832-1-c-vankar@ti.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>