linux.git - Linux kernel source tree

Age	Commit message	Author
9 days	drm/amdgpu/userq: use array instead of list for userq_vas	Sunil Khatri
	Use an array instead of a list for userq_vas, since we have a fixed number of BOs. Also, we don't have to worry about freeing that memory later, since the array is freed along with the queue. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ef7dc711a664b0c548ecfdf13a00436b7446b8e7)
9 days	drm/amdgpu/userq: move mqd_destroy to later stage to keep core obj valid	Sunil Khatri
	mqd_destroy cleans up queue core objects like mqd and fw_object which are needed for any pending fence to signal properly. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4ad65d610096498c8e265615aba42b3c47441bb5)
9 days	drm/amdkfd: fix a vulnerability of integer overflow in kfd debugger	Eric Huang
	get_queue_ids() computes array_size = num_queues * sizeof(uint32_t), which could overflow on builds with a 32-bit size_t. Use array_size() instead, which saturates to SIZE_MAX on overflow. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2d57a0475f085c08b49312dfd8edcb461845f285) Cc: stable@vger.kernel.org
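The saturating behaviour described above can be sketched in plain userspace C. This is only an illustration: `sat_array_size()` is a hypothetical stand-in written for this example, not the kernel's `array_size()` helper.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for a saturating "count * element size"
 * product. On overflow it returns SIZE_MAX, so a later allocation with
 * that size fails instead of silently allocating a too-small buffer.
 */
size_t sat_array_size(size_t count, size_t elem_size)
{
	/* count * elem_size would wrap if count exceeds SIZE_MAX / elem_size. */
	if (elem_size != 0 && count > SIZE_MAX / elem_size)
		return SIZE_MAX;

	return count * elem_size;
}
```

A caller would pass the result straight to its allocator; a saturated size is expected to be rejected there rather than wrapped around to a small value.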
9 days	drm/amdgpu/userq: remove amdgpu_userq_create/destroy_object wrapper	Sunil Khatri
	Remove the amdgpu_userq_create/destroy_object wrappers and use directly the kernel bo allocation function which does all the things which are done in wrapper. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit deb02080ca5d3f015cf71e56067a39ef2f141998)
9 days	drm/amd/pm/si: Disregard vblank time when no displays are connected	Timur Kristóf
	When no displays are connected, there is no vblank happening so the power management code shouldn't worry about it. This fixes a regression that caused the memory clock to be stuck at maximum when there were no displays connected to a SI GPU. Fixes: 9003a0746864 ("drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3)") Fixes: 9d73b107a61b ("drm/amd/pm: Use pm_display_cfg in legacy DPM (v2)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Jeremy Klarenbeek <jeremy.klarenbeek99@gmail.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6d87e0199f7b83735b56e422d59f170a201897a8) Cc: stable@vger.kernel.org
9 days	drm/amdkfd: Check for pdd drm file first in CRIU restore path	David Francis
	CRIU restore ioctls are meant to be called by CRIU with no existing drm file. There's an error path for if the drm file unexpectedly exists. It was positioned so it was missing a fput(drm_file). Do that check earlier, as soon as we have the pdd. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2bab781dac78916c5cc8de76345a4102449267d7) Cc: stable@vger.kernel.org
9 days	drm/amdgpu: fix potential overflow in fs_info.debugfs_name	Stanley.Yang
	Use snprintf() with sizeof(fs_info.debugfs_name) so a long RAS block name plus the "_err_inject" suffix cannot overflow the 32-byte buffer. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1a58070fda26857a8f6acc0ab05428e60d5c6844)
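A minimal userspace sketch of the bounded-format pattern, assuming a 32-byte destination as stated in the message. The function name `build_err_inject_name()` and its return convention are hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <stdio.h>

#define DEBUGFS_NAME_LEN 32 /* buffer size taken from the commit message */

/*
 * Format "<block_name>_err_inject" into buf without writing past len bytes.
 * Returns 0 if the whole name fit, -1 if snprintf() had to truncate it.
 */
int build_err_inject_name(char *buf, size_t len, const char *block_name)
{
	int n = snprintf(buf, len, "%s_err_inject", block_name);

	/* snprintf() returns the length it wanted to write, excluding the NUL. */
	if (n < 0 || (size_t)n >= len)
		return -1;

	return 0;
}
```

Passing `sizeof(buf)` at the call site, as the commit does with `sizeof(fs_info.debugfs_name)`, keeps the bound tied to the actual buffer rather than to a separately maintained constant.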
9 days	drm/amdgpu/userq: make sure queue is valid in the hang_detect_work	Sunil Khatri
	Thread 1: Running amdgpu_userq_destroy, which eventually removes the queue from the doorbell and sets userq_mgr = NULL. Thread 2: An interrupt might have scheduled hang_detect_work, which still needs userq_mgr to be valid but could get a NULL pointer. To fix that, make sure we cancel hang_detect_work again before setting userq_mgr to NULL. Along with that, all the queue VAs need to remain valid for as long as anything could be running on the queue, hence the userq_va cleanup is moved to after the hang_detect handler is cancelled. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1a66ceb98b137d18d303b9889f0e7d8c4db73943)
9 days	drm/amdgpu/userq: reserve root bo without interruption	Sunil Khatri
	Fix the code to make it an uninterruptible reservation for root bo. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d409ab4e387d94b2e593d558b54b7bfd315e0e75)
9 days	drm/amdgpu/userq: add amdgpu_bo_unpin when amdgpu_ttm_alloc_gart fails	Sunil Khatri
	Unpin the wptr_obj->obj when amdgpu_ttm_alloc_gart fails. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d8145c437ccdc2d91c579787290f82788172bea0)
9 days	drm/amdgpu: simplify return value in amdgpu_userq_get_doorbell_index	Sunil Khatri
	amdgpu_userq_get_doorbell_index returns a uint64 index as well as int failure values. Simplify this by returning an int and passing the index back through an input pointer of type uint64. Also, since it is used in only one place, make it static. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e947ec9d0529d5f93dbdb33cd197347f6a7b2922)
9 days	drm/amdkfd: fix NULL pointer bug in svm_range_set_attr	Eric Huang
	The process_info could be NULL if user doesn't call kfd_ioctl_acquire_vm before calling kfd_ioctl_svm. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 83a26c812e0529eb040d31a76f73e33e637243d4) Cc: stable@vger.kernel.org
9 days	drm/amd/display: Write REFCLK to 48MHz on DCN21	Ivan Lipski
	[Why&How] dccg21_init() calls dccg2_init() which hardcodes 100MHz refclk values for MICROSECOND_TIME_BASE_DIV and MILLISECOND_TIME_BASE_DIV. DCN21 uses 48MHz refclk, so the wrong values corrupt DCCG timing and cause eDP link training failure on cold boot. Write the correct 48MHz values directly instead of calling dccg2_init(). v2: Fixed typo Fixes: e6e2b956fc81 ("drm/amd/display: Add missing DCCG register entries for DCN20-DCN316") Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5272 Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5311 Reported-by: Max Chernoff <git@maxchernoff.ca> Tested-by: Max Chernoff <git@maxchernoff.ca> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 08236c3ef284cd2d110e5e3d51fc9615e551f9dc) Cc: stable@vger.kernel.org
9 days	drm/amdgpu/userq: Fix the mutex_init cleanup for fence_drv_lock	Sunil Khatri
	The fence_drv_lock mutex is destroyed in amdgpu_userq_fence_driver_free, and mutex_destroy is also called in one of the jump conditions, leading to a double mutex_destroy. Rearrange the code so that amdgpu_userq_fence_driver_free takes care of the cleanup along with mutex_destroy. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 384dbef269d101e5b671fc7b942c56734cd1d186)
9 days	drm/amdgpu/userq: Fix doorbell object cleanup of queue	Sunil Khatri
	Unpin and unref the door bell obj if queue creation fails before initialization is complete. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8c7506f7ba945f21e5abe7f8eac0a3acca6b5330)
9 days	drm/amdgpu: check num_entries in GEM_OP GET_MAPPING_INFO	Ziyi Guo
	kvcalloc(args->num_entries, sizeof(*vm_entries), GFP_KERNEL) at amdgpu_gem.c:1050 uses the user-supplied num_entries directly without any upper bounds check. Since num_entries is a __u32 and sizeof(drm_amdgpu_gem_vm_entry) is 32 bytes, a large num_entries produces an allocation exceeding INT_MAX, triggering WARNING in __kvmalloc_node_noprof(), causing a kernel WARNING, TAINT_WARN, and panic on CONFIG_PANIC_ON_WARN=y systems. Add a size bounds check before we invoke the kvzalloc() to reject oversized num_entries early with -EINVAL. Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl") Signed-off-by: Ziyi Guo <n7l8m4@u.northwestern.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1fe7bf5457f6efd7be60b17e23163ba54341d73d) Cc: stable@vger.kernel.org
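The kind of early rejection described above might look like the following userspace sketch. The helper name `check_num_entries()` and the exact bound are assumptions made for illustration; the message only says that oversized values are rejected with -EINVAL before the allocation.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical bounds check: reject a user-supplied element count whose
 * total allocation size would exceed INT_MAX bytes. The INT_MAX limit is
 * an assumption based on the allocator warning quoted in the message.
 */
int check_num_entries(uint32_t num_entries, size_t entry_size)
{
	if (entry_size == 0)
		return -EINVAL;

	/* Dividing instead of multiplying avoids overflow in the check itself. */
	if (num_entries > INT_MAX / entry_size)
		return -EINVAL;

	return 0;
}
```

With the 32-byte entry size quoted in the message, the largest count this sketch accepts is 67108863, since one more entry would push the total past INT_MAX.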
9 days	drm/amdgpu: fix lock leak on ENOMEM in AMDGPU_GEM_OP_GET_MAPPING_INFO	Michael Bommarito
	The AMDGPU_GEM_OP_GET_MAPPING_INFO branch of amdgpu_gem_op_ioctl() holds three cleanup-tracked resources before calling kvcalloc(): the drm_gem_object reference from drm_gem_object_lookup(), the drm_exec lock on the looked-up GEM via drm_exec_lock_obj(), and the drm_exec lock on the per-process VM root page directory via amdgpu_vm_lock_pd(). All three are released by the out_exec label that every other error path in this function jumps to. The kvcalloc() failure path returns -ENOMEM directly, skipping out_exec and leaking all three. The leaked per-process VM root PD dma_resv lock is the load-bearing leak: any subsequent operation on the same VM (further GEM ops, command-submission, eviction, TTM shrinker callbacks) blocks on the held lock. DRM_IOCTL_AMDGPU_GEM_OP is DRM_AUTH | DRM_RENDER_ALLOW, so this is an unprivileged-local denial of service against the caller's GPU context, reachable by any process with /dev/dri/renderD* access. Route the failure through out_exec so drm_exec_fini() and drm_gem_object_put() run. Reproduced on stock 7.0.0-10, Ryzen 7 5700U / Radeon Vega (Lucienne): the failing ioctl returns -ENOMEM and a second GET_MAPPING_INFO on the same fd then blocks in drm_exec_lock_obj() on the leaked dma_resv. SIGKILL on the caller does not reap the task; the fd-release path during process exit goes through amdgpu_gem_object_close() -> drm_exec_prepare_obj() on the same lock, leaving the task in D state until the box is rebooted. The patched kernel was not rebuilt and re-tested on this hardware; the fix is mechanical. Tested on a single Lucienne / Vega box only. Ziyi Guo posted an independent INT_MAX-bound check for args->num_entries in the same branch [1]; the two patches are complementary and can land in either order.
	Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl") Link: https://lore.kernel.org/all/20260208000255.4073363-1-n7l8m4@u.northwestern.edu/ # [1] Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b69d3256d79de15f54c322986ff4da68f1d65b0a) Cc: stable@vger.kernel.org
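The single-exit cleanup pattern behind this fix can be modelled in userspace as below. Everything here is a simplified stand-in: `struct ctx`, `get_mapping_info()`, and the `simulate_enomem` flag are hypothetical, and the counter merely represents the GEM reference and the two drm_exec locks named in the message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical context: counts resources that must be released on exit. */
struct ctx {
	int resources_held;
};

int get_mapping_info(struct ctx *c, size_t num_entries, int simulate_enomem)
{
	void *entries;
	int ret = 0;

	/* Stands in for the object reference plus the two locks. */
	c->resources_held = 3;

	entries = simulate_enomem ? NULL : calloc(num_entries, 32);
	if (!entries) {
		ret = -ENOMEM;
		goto out_exec; /* the leaky variant returned directly here */
	}

	free(entries);

out_exec:
	/* Every exit path, success or failure, releases what was taken. */
	c->resources_held = 0;
	return ret;
}
```

Routing the allocation failure through the same label as every other error path means there is only one place where the release logic has to be correct.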
9 days	drm/xe: Restore IDLEDLY register on engine reset	Balasubramani Vivekanandan
	Wa_16023105232 programs the register IDLEDLY. The register is reset whenever the engine is reset. Therefore it should be added to the GuC save-restore register list for it to be restored after reset. Fixes: 7c53ff050ba8 ("drm/xe: Apply Wa_16023105232") Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260522163531.1365540-2-balasubramani.vivekanandan@intel.com Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> (cherry picked from commit df1cfe24743a93b71eab27687e148ab8ae9b69e3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
10 days	KVM: arm64: Fix memory leak in hyp_trace_unload()	Vincent Donnefort
	During trace remote loading, hyp_trace_load() allocates the descriptor pages but fails to store the allocated size in trace_buffer->desc_size. As a result, when unloading the trace buffer, hyp_trace_unload() calls free_pages_exact() with a size of 0 which fails to free the memory. Fix this by updating the descriptor size in trace_buffer->desc_size. Fixes: 3aed038aac8d ("KVM: arm64: Add trace remote for the nVHE/pKVM hyp") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Link: https://patch.msgid.link/20260521124613.911067-4-vdonnefort@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
10 days	KVM: arm64: Fix rollback in hyp_trace_buffer_share_hyp()	Vincent Donnefort
	When sharing the trace buffer with the hypervisor, if sharing a page fails, the rollback path in hyp_trace_buffer_share_hyp() misses unsharing the metadata page (meta_va) which was successfully shared before entering the page sharing loop. Additionally, if a failure occurs, the cleanup calls hyp_trace_buffer_unshare_hyp() with an incorrect CPU index. Since that CPU's pages were already rolled back locally in the loop, this leads to duplicate unsharing attempts. Fix both issues affecting the rollback. Fixes: 3aed038aac8d ("KVM: arm64: Add trace remote for the nVHE/pKVM hyp") Reported-by: Sashiko <sashiko-bot@kernel.org> Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Link: https://patch.msgid.link/20260521124613.911067-3-vdonnefort@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
10 days	KVM: arm64: Fix meta-page unsharing in pKVM hyp tracing	Vincent Donnefort
	As the hyp_trace_buffer_unshare_hyp() function name suggests we should unshare all the previously shared pages, otherwise we leak hyp-shared pages which won't be reusable for hyp memory. Fix the typo by calling __unshare_page() on the meta-page, ensuring all previously shared pages are correctly unshared. Fixes: 3aed038aac8d ("KVM: arm64: Add trace remote for the nVHE/pKVM hyp") Signed-off-by: Vincent Donnefort <vdonnefort@google.com> Link: https://patch.msgid.link/20260521124613.911067-2-vdonnefort@google.com Signed-off-by: Marc Zyngier <maz@kernel.org>
10 days	ASoC: codecs: simple-mux: Fix enum control bounds check	Cássio Gabriel
	simple_mux_control_put() rejects values greater than e->items, but enum control values are zero based. For the two-entry mux used by this driver, valid values are 0 and 1, so value 2 must be rejected as well. Accepting e->items can store an invalid mux state, pass it to the GPIO setter, and pass it on to the DAPM mux update path where it is used as an index into the enum text array. Use the same >= e->items check used by the ASoC enum helpers. Fixes: 342fbb7578d1 ("ASoC: add simple-mux") Signed-off-by: Cássio Gabriel <cassiogabrielcontato@gmail.com> Link: https://patch.msgid.link/20260527-asoc-simple-mux-enum-bounds-v1-1-3f805b9fc671@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
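The off-by-one is easy to see in isolation. In this sketch `mux_check_value()` is a hypothetical helper, not a function from the driver; it only demonstrates the `>=` comparison the message refers to.

```c
#include <errno.h>

/*
 * Enum control values are zero based, so a control with `items` entries
 * accepts 0 .. items - 1. Using `>` instead of `>=` would let `items`
 * itself through and index one past the end of the text array.
 */
int mux_check_value(unsigned int value, unsigned int items)
{
	if (value >= items)
		return -EINVAL;

	return 0;
}
```

For the two-entry mux mentioned above, values 0 and 1 pass and value 2 is rejected.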
10 days	KVM: arm64: PMU: Preserve AArch32 counter low bits	Qiang Ma
	AArch32 writes to PMU event counters cannot update the top 32 bits, even when PMUv3p5 makes the counters 64-bit. KVM therefore needs to preserve the existing high half and only update the low half written by the guest, unless the caller explicitly forces a full reset through PMCR.P. The current code masks @val down to the old high half before taking lower_32_bits(val), which means the low half is always zero. As a result, AArch32 writes to event counters discard the guest-provided low 32 bits instead of storing them. Build the new value from the old high 32 bits and the low 32 bits of the value supplied by the guest. Fixes: 26d2d0594d70 ("KVM: arm64: PMU: Do not let AArch32 change the counters' top 32 bits") Signed-off-by: Qiang Ma <maqianga@uniontech.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://patch.msgid.link/20260526074640.791991-1-maqianga@uniontech.com Cc: stable@vger.kernel.org
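The arithmetic can be sketched with two small functions. Both are hypothetical models written from the description above, not copies of the KVM code: the first mimics the reported bug, the second the intended combination of old high half and new low half.

```c
#include <stdint.h>

#define HIGH_32_MASK 0xffffffff00000000ULL

/* Model of the bug: val is overwritten before its low half is taken. */
uint64_t aarch32_counter_write_buggy(uint64_t old, uint64_t val)
{
	val = old & HIGH_32_MASK;           /* guest value is lost here */
	return (old & HIGH_32_MASK) | (uint32_t)val; /* low half is always 0 */
}

/* Model of the fix: keep the old high half, store the guest's low half. */
uint64_t aarch32_counter_write_fixed(uint64_t old, uint64_t val)
{
	return (old & HIGH_32_MASK) | (uint32_t)val;
}
```

With an old counter of 0x1111111122222222 and a guest write of 0xdeadbeef, the first model yields 0x1111111100000000 while the second yields 0x11111111deadbeef.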
10 days	ALSA: usb-audio: Add iface reset and delay quirk for TAE1160 USB Audio	Lianqin Hu
	Setting up the interface on suspend/resume fails on this card. Adding a reset and delay quirk eliminates this problem. usb 1-1: new full-speed USB device number 2 using xhci-hcd usb 1-1: New USB device found, idVendor=25aa, idProduct=600b usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3 usb 1-1: Product: TAE1159 usb 1-1: Manufacturer: Generic usb 1-1: SerialNumber: 20210726905926 Signed-off-by: Lianqin Hu <hulianqin@vivo.com> Link: https://patch.msgid.link/TYUPR06MB621736D7C85D43200E54E740D2082@TYUPR06MB6217.apcprd06.prod.outlook.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 days	ALSA: hda/cs420x: Add CS4208 fixup for iMac16,1	Jakub Pisarczyk
	The 21.5" Retina 4K iMac (Late 2015, DMI product name "iMac16,1") ships with a Cirrus Logic CS4208 codec wired to an external speaker amplifier enabled through codec GPIO0 -- the same arrangement as the late-2013 MacBookPro 11,x. Without a matching entry in cs4208_mac_fixup_tbl[] the fixup picker logs: snd_hda_codec_cs420x hdaudioC1D0: CS4208: picked fixup for codec SSID 106b:0000 i.e. an empty fixup name, GPIO0 stays low, the external amp is never powered up, and the internal speakers are silent on a stock kernel. The codec SSID reported by hardware is 0x106b:0x7f00. Reusing CS4208_MBP11 (GPIO0 + SPDIF switch fixup) makes the internal speakers and S/PDIF output work out of the box, removing the need for users to set `options snd_hda_intel model=mbp11` via /etc/modprobe.d/. Tested on iMac16,1 (kernel 6.17.0): four internal drivers (Left tweeter, Left woofer, Right tweeter, Right woofer, exposed as the 4 channels of the analog-surround-40 ALSA profile) produce audio after the fixup is applied. Signed-off-by: Jakub Pisarczyk <pisarz77@gmail.com> Link: https://patch.msgid.link/20260526201830.34097-1-pisarz77@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 days	ALSA: hda/realtek: add quirk for HP Dragonfly Folio G3 2-in-1	Fabian Lippold
	Add PCI quirk for HP Dragonfly Folio G3 (PCI ID 103c:8a06) to select the CS35L41 SPI4 & GPIO LED fixup variant. Signed-off-by: Fabian Lippold <fabianlippold1184@gmail.com> Link: https://patch.msgid.link/20260526154418.1850568-3-fabianlippold1184@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
10 days	ipv6: validate extension header length before copying to cmsg	Qi Tang
	ip6_datagram_recv_specific_ctl() builds IPV6_{HOPOPTS,DSTOPTS,RTHDR} cmsgs (and their IPV6_2292* legacy counterparts) by trusting the on-wire hdrlen byte (ptr[1]) when computing the put_cmsg() length. The length was validated only at parse time (ipv6_parse_hopopts(), etc.). An nftables payload-write expression can rewrite hdrlen after parsing and before the skb reaches recvmsg; the write itself is in-bounds but put_cmsg() then reads up to ((hdrlen+1) << 3) = 2040 bytes from an 8-byte header. nftables is reachable from an unprivileged user namespace, so this is an unprivileged slab-out-of-bounds read: BUG: KASAN: slab-out-of-bounds in put_cmsg+0x3ac/0x540 put_cmsg+0x3ac/0x540 udpv6_recvmsg+0xca0/0x1250 sock_recvmsg+0xdf/0x190 ____sys_recvmsg+0x1b1/0x620 Add ipv6_get_exthdr_len() which validates that at least two bytes are accessible before reading the hdrlen field, then checks the computed length against skb_tail_pointer(skb), returning 0 on failure. Extension headers are kept in the linear skb area by pskb_may_pull() during input, so skb_tail_pointer() is the correct bound. Use ipv6_get_exthdr_len() at all non-AH call sites: the five standalone cmsg blocks (HbH, 2292HbH, 2292DSTOPTS x2, 2292RTHDR) and the three standard cases in the extension-header walk loop (DSTOPTS, ROUTING, default). AH retains an inline bounds check because its length formula differs ((ptr[1]+2)<<2). The walk loop also gets a pre-read bounds check at the top to validate ptr before any case accesses ptr[0] or ptr[1]. When the walk loop detects a corrupted header, return from the function instead of continuing to process later socket options. Cc: stable@vger.kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Qi Tang <tpluszz77@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260523143245.2281415-1-tpluszz77@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
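A userspace sketch of the two-step validation described above, using raw byte pointers in place of an skb. The name `exthdr_len()` and its signature are hypothetical; only the length formula `(hdrlen + 1) << 3` and the "return 0 on failure" convention come from the message.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Validate an extension header that starts at ptr against the end of the
 * readable data (tail). Returns the header length in bytes, or 0 if the
 * hdrlen byte is not readable or the computed length runs past tail.
 */
size_t exthdr_len(const uint8_t *ptr, const uint8_t *tail)
{
	size_t avail, len;

	if (ptr > tail)
		return 0;

	/* Step 1: at least two bytes are needed to read ptr[1] (hdrlen). */
	avail = (size_t)(tail - ptr);
	if (avail < 2)
		return 0;

	/* Step 2: hdrlen counts 8-byte units beyond the first 8 bytes. */
	len = ((size_t)ptr[1] + 1) << 3;
	if (len > avail)
		return 0;

	return len;
}
```

An 8-byte header whose hdrlen byte was rewritten to 254 claims 2040 bytes, which this check rejects because only 8 bytes are available.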
10 days	ksmbd: fix FSCTL permission bypass by adding a permission check for FSCTL_SET_SPARSE	Sean Shen
	FSCTL_SET_SPARSE in fsctl_set_sparse() modifies the file's sparse attribute and saves it through xattr without any permission checks. This exposes two issues: 1) A client on a read-only share can change the sparse attribute on files it opened, even though the share is read-only. Other FSCTL write operations already check test_tree_conn_flag(work->tcon, KSMBD_TREE_CONN_FLAG_WRITABLE), but FSCTL_SET_SPARSE does not. 2) Even on writable shares, clients without FILE_WRITE_DATA or FILE_WRITE_ATTRIBUTES access should not modify the sparse attribute. Similar handle-level checks exist in other functions but are missing here. Add both share-level writable check and per-handle access check. Use goto out on error to avoid leaking file references. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Steve French <smfrench@gmail.com> Signed-off-by: Sean Shen <grayhat@foxmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
10 days	ksmbd: release ksmbd_inode ref via ksmbd_inode_put on lookup paths	Aleksandr Golovnya
	ksmbd_query_inode_status() and ksmbd_lookup_fd_inode() both take a reference on a ksmbd_inode via __ksmbd_inode_lookup() (which performs atomic_inc_not_zero()) and later release it using a bare atomic_dec(&ci->m_count). Unlike ksmbd_inode_put(), a bare atomic_dec() does not check whether the reference count has reached zero, so if the caller happens to drop the last reference, the ksmbd_inode is leaked: it stays in the global inode hash table with m_count == 0, future __ksmbd_inode_lookup() calls reject it via atomic_inc_not_zero(), and ksmbd_inode_free() is never invoked. The race is: T1: __ksmbd_inode_lookup() -> atomic_inc_not_zero(): m_count = 2 T2: ksmbd_inode_put() -> atomic_dec_and_test(): m_count = 1 (not freed) T1: atomic_dec(&ci->m_count) -> m_count = 0 return (LEAK) In ksmbd_lookup_fd_inode() the matched-fp path (which now also uses ksmbd_inode_put()) cannot currently reach m_count == 0 because the matched ksmbd_file holds its own reference on ci, but converting it to the proper API keeps the three call sites consistent and avoids future regressions if the locking changes. Because ksmbd_inode_put() may free the ksmbd_inode if this drops the last reference, the call must happen after up_read(&ci->m_lock) on the two affected paths in ksmbd_lookup_fd_inode(). On the no-match path this is a pure reordering; on the matched path ksmbd_fp_get() is moved above the unlock so that the returned ksmbd_file is pinned before the inode reference is released. Signed-off-by: Aleksandr Golovnya <cofedish@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
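The difference between a bare decrement and a proper put can be modelled with C11 atomics. `struct obj`, `obj_put()`, and `obj_bare_dec()` are hypothetical names for this sketch, and the `freed` flag stands in for the real teardown.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; `freed` records whether teardown ran. */
struct obj {
	atomic_int refs;
	bool freed;
};

/* Proper put: whoever drops the count to zero performs the teardown. */
void obj_put(struct obj *o)
{
	/* atomic_fetch_sub() returns the value before the subtraction. */
	if (atomic_fetch_sub(&o->refs, 1) == 1)
		o->freed = true;
}

/* Bare decrement: the count can hit zero with nobody noticing. */
void obj_bare_dec(struct obj *o)
{
	atomic_fetch_sub(&o->refs, 1);
}
```

If the bare decrement happens to drop the last reference, the object is left with a count of zero and no teardown, which corresponds to the leak described in the race above.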
10 days	ksmbd: OOB read regression in smb_check_perm_dacl() ACE-walk loops	Ali Ganiyev
	Commit d07b26f39246 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()") introduced a transposed bounds check: if (offsetof(struct smb_ace, sid) + aces_size < CIFS_SID_BASE_SIZE) Since offsetof(..sid) is 8 and CIFS_SID_BASE_SIZE is 8, this evaluates to `aces_size < 0`. Because `aces_size` is always non-negative, this check becomes dead code and never breaks the loop. Worse, that commit removed the old 4-byte guard, meaning the loop now reads `ace->size` (offset 2) even when `aces_size` is 0-3 bytes. This re-opens a 2-byte heap out-of-bounds (OOB) read past the pntsd allocation during subsequent SMB2_CREATE operations. Fix this by properly transposing the comparison to require at least 16 bytes (8-byte offset + 8-byte SID base), matching the correct form used in smb_inherit_dacl(). Fixes: d07b26f39246 ("ksmbd: require minimum ACE size in smb_check_perm_dacl()") Cc: stable@vger.kernel.org Signed-off-by: Ali Ganiyev <ali.qaniyev@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
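The two forms of the comparison can be put side by side. The constants below use the values quoted in the message (8 and 8); the function names are hypothetical, and plain `int` arithmetic is used so the sketch matches the `aces_size < 0` reading given above.

```c
#include <stdbool.h>

#define ACE_SID_OFFSET 8 /* offsetof(struct smb_ace, sid), per the message */
#define SID_BASE_SIZE  8 /* CIFS_SID_BASE_SIZE, per the message */

/* Transposed check: reduces to aces_size < 0, so it never fires. */
bool ace_too_small_buggy(int aces_size)
{
	return ACE_SID_OFFSET + aces_size < SID_BASE_SIZE;
}

/* Intended check: require at least 16 bytes before touching the ACE. */
bool ace_too_small_fixed(int aces_size)
{
	return aces_size < ACE_SID_OFFSET + SID_BASE_SIZE;
}
```

For any remaining size from 0 to 15 the first form lets the loop continue, while the second form stops it.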
10 days	Merge tag 'nfc-7.1-rc6' of https://codeberg.org/linux-nfc/linux	Jakub Kicinski
	David Heidelberg says: ==================== nfc pull request for net: Code improvements - llcp: Fix use-after-free in llcp_sock_release() - llcp: Fix use-after-free race in nfc_llcp_recv_cc() - hci: fix out-of-bounds read in HCP header parsing Regression fixes: - nxp-nci: i2c: use rising-edge IRQ on ACPI systems Signed-off-by: David Heidelberg <david@ixit.cz> * tag 'nfc-7.1-rc6' of https://codeberg.org/linux-nfc/linux: nfc: nxp-nci: i2c: use rising-edge IRQ on ACPI systems nfc: hci: fix out-of-bounds read in HCP header parsing nfc: llcp: Fix use-after-free race in nfc_llcp_recv_cc() nfc: llcp: Fix use-after-free in llcp_sock_release() ==================== Link: https://patch.msgid.link/217c0646-8a30-4037-b613-580c2b189729@ixit.cz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days	tunnels: do not assume transport header in iptunnel_pmtud_check_icmp()	Eric Dumazet
	In some cases, iptunnel_pmtud_check_icmp() can be called while skb transport header is not set. This triggers an out-of-bounds access, because (typeof(skb->transport_header))~0U is 65535. Access the icmp header based on IPv4 network header, after making sure icmp->type is present in skb linear part. Note that iptunnel_pmtud_check_icmpv6() is fine. Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Reported-by: Damiano Melotti <melotti@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260522115512.1519110-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days	vxlan: do not reuse cached ip_hdr() value after skb_tunnel_check_pmtu()	Eric Dumazet
	skb_tunnel_check_pmtu() can change skb->head. Reusing old_iph after skb_tunnel_check_pmtu() can cause a UAF. Use instead ip_hdr(skb) as done in drivers/net/bareudp.c and drivers/net/geneve.c. Found by Sashiko. Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Link: https://patch.msgid.link/20260525203642.2389723-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days	tunnels: load network headers after skb_cow() in iptunnel_pmtud_build_icmp[v6]()	Eric Dumazet
	Sashiko found that iptunnel_pmtud_build_icmp() and iptunnel_pmtud_build_icmpv6() were caching ip_hdr() and ipv6_hdr() before an skb_cow() call which can reallocate skb->head. Fix this possible UAF by initializing the local variables after the skb_cow() call. Remove skb_reset_network_header() calls which were not needed. Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Stefano Brivio <sbrivio@redhat.com> Link: https://patch.msgid.link/20260525201335.2361845-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
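The stale-pointer hazard has a direct userspace analogue with `realloc()`. The sketch below is an assumption-laden model: `struct pkt` and its helpers are hypothetical, with `pkt_grow()` playing the role of a call that may move the buffer.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packet: a heap buffer plus the network header offset. */
struct pkt {
	unsigned char *head;
	size_t len;
	size_t nh_off;
};

/* Derive the header pointer from the current head every time. */
unsigned char *pkt_network_header(struct pkt *p)
{
	return p->head + p->nh_off;
}

/* May move the buffer, like a call that reallocates the packet head. */
int pkt_grow(struct pkt *p, size_t extra)
{
	unsigned char *n = realloc(p->head, p->len + extra);

	if (!n)
		return -1;
	p->head = n;
	p->len += extra;
	return 0;
}

/* Returns the IP version nibble, or -1 if growing the buffer failed. */
int pkt_read_version(struct pkt *p, size_t extra)
{
	unsigned char *iph;

	if (pkt_grow(p, extra))
		return -1;

	/* Load the header pointer only after the buffer can no longer move. */
	iph = pkt_network_header(p);
	return iph[0] >> 4;
}
```

Had `iph` been computed before `pkt_grow()`, it could point into memory that `realloc()` has already released.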
10 days	net: phy: air_en8811h: add AN8811HB MCU assert/deassert support	Lucien.Jheng
	AN8811HB needs a MCU soft-reset cycle before firmware loading begins. Assert the MCU (hold it in reset) and immediately deassert (release) via a dedicated PBUS register pair (0x5cf9f8 / 0x5cf9fc), accessed through a registered mdio_device at PHY-addr+8. Add __air_pbus_reg_write() as a low-level helper taking a struct mdio_device *, create and register the PBUS mdio_device in an8811hb_probe() and store it in priv->pbusdev, then implement an8811hb_mcu_assert() / _deassert() on top of it. Add an8811hb_remove() to unregister the PBUS device on teardown. Wire both calls into an8811hb_load_firmware() and en8811h_restart_mcu() so every firmware load or MCU restart on AN8811HB correctly sequences the reset control registers. Fixes: 5afda1d734ed ("net: phy: air_en8811h: add Airoha AN8811HB support") Signed-off-by: Lucien Jheng <lucienzx159@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260524063915.47961-1-lucienzx159@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days	l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname	Michael Bommarito
	A reader in l2tp_session_get_by_ifname() can return a pointer to a session whose refcount has reached zero. The getter takes its reference with plain refcount_inc(), but every other session getter in the same file (l2tp_v2_session_get, l2tp_v3_session_get, and the corresponding _get_next variants) uses refcount_inc_not_zero() because the IDR/RCU lookup can race with refcount_dec_and_test() -> l2tp_session_free() -> kfree_rcu(). The ifname getter is the only outlier; the inconsistency was raised on-list after 979c017803c4 ("l2tp: use list_del_rcu in l2tp_session_unhash"). A reader inside rcu_read_lock_bh() that matches session->ifname can be preempted between the strcmp() and the refcount_inc(). If the last reference drops on another CPU in that window, the reader's refcount_inc() runs on a counter that has reached zero. refcount_t catches the addition-on-zero, prints "refcount_t: addition on 0; use-after-free", saturates the counter, and returns the saturated pointer to the caller. Session memory is held live by the in-flight RCU read section, but the kfree_rcu() callback queued from l2tp_session_free() will free it once the grace period closes; a caller that dereferences the returned session past that point hits a slab-use-after-free. On PREEMPT_RT local_bh_disable() is a per-CPU sleeping lock and the preemption window is real; on stock PREEMPT kernels local_bh_disable() is a preempt_count increment that closes the cross-CPU race in practice (see below). Use refcount_inc_not_zero() and continue the list walk on failure, matching the other session getters in the file. The ifname getter is the only session getter in net/l2tp/ that still uses the bare refcount_inc() pattern; this change restores file-internal consistency. The success path is unchanged. 
	Fixes: abe7a1a7d0b6 ("l2tp: improve tunnel/session refcount helpers") Cc: stable@vger.kernel.org Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Reviewed-by: James Chapman <jchapman@katalix.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260523023423.2568972-1-michael.bommarito@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
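The "increment unless already zero" operation can be approximated with a C11 compare-and-swap loop. `ref_inc_not_zero()` is a hypothetical userspace helper written for this sketch; it is not the kernel's `refcount_inc_not_zero()` and omits that API's saturation handling.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the object is still live. Returns false when
 * the count has already reached zero, so the caller can skip the entry
 * and continue its list walk instead of reviving a dying object.
 */
bool ref_inc_not_zero(atomic_uint *refs)
{
	unsigned int old = atomic_load(refs);

	do {
		if (old == 0)
			return false;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(refs, &old, old + 1));

	return true;
}
```

A plain increment would succeed unconditionally, which is the behaviour the commit replaces in the ifname lookup.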
10 days	i2c: davinci: fix division by zero on missing clock-frequency	Chaitanya Sabnis
	When the 'clock-frequency' property is missing from the device tree, the driver falls back to DAVINCI_I2C_DEFAULT_BUS_FREQ. However, this macro was defined in kHz (100), whereas the device tree property is expected in Hz. The probe function divided the fallback value by 1000, causing integer truncation that resulted in dev->bus_freq = 0. This triggered a deterministic division-by-zero kernel panic when calculating clock dividers later in the probe sequence. Fix this by redefining DAVINCI_I2C_DEFAULT_BUS_FREQ in Hz (100000) to match the expected device tree property unit, allowing the existing division logic to work correctly for both cases. Fixes: b04ce6385979 ("i2c: davinci: kill platform data") Reported-by: Sashiko <sashiko-bot@kernel.org> Closes: https://lore.kernel.org/all/20260514044726.57297C2BCB7@smtp.kernel.org/ Signed-off-by: Chaitanya Sabnis <chaitanya.msabnis@gmail.com> Cc: <stable@vger.kernel.org> # v6.14+ Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20260526102240.4949-1-chaitanya.msabnis@gmail.com
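The truncation is plain integer arithmetic. In this sketch `bus_freq_khz()` is a hypothetical helper that mirrors the "divide by 1000" step described above.

```c
/*
 * Convert a bus frequency given in Hz to kHz using integer division.
 * A value that is already in kHz, such as 100, truncates to 0 here.
 */
unsigned int bus_freq_khz(unsigned int freq_hz)
{
	return freq_hz / 1000;
}
```

Passing the old fallback of 100 gives 0, which later becomes the divisor; passing 100000 gives the intended 100.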
10 days	blk-mq: reinsert cached request to the list	Keith Busch
	A previous commit removed an optimization out of caution for a scenario that turns out not to be real: all the "queue_exit" goto's are safe to reinsert the request into the cached_rq's plug list as they are either from a non-blocking path, or a successful merge that already holds the queue reference. This optimization is most needed for small sequential workloads that successfully merge into larger requests. Fixes: dc278e9bf2b9 ("blk-mq: pop cached request if it is usable") Suggested-by: Ming Lei <tom.leiming@gmail.com> Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://patch.msgid.link/20260526153531.2365935-1-kbusch@meta.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 days	cxl/test: Update mock dev array before calling platform_device_add()	Li Ming
	The CXL test environment sometimes hits the following error: cxl_mem mem9: endpoint7 failed probe All mock memdevs are platform firmware devices added by the cxl_test module, which also provides a platform device driver that creates a memdev device in the CXL subsystem for each of them. The cxl_test module uses the cxl_rcd/mem_single/mem arrays to store the different types of mock memdevs. CXL drivers call the registered mock functions for a mock memdev by checking whether a given memdev is in these arrays. When the cxl_test module adds these mock memdevs, it always calls platform_device_add() before adding them to the appropriate mock memdev array. This leaves a small window in which CXL drivers can look up an added memdev before it has been added to a mock memdev array. In that case, the cxl endpoint driver concludes that the added memdev is not a mock memdev and calls devm_cxl_endpoint_decoders_setup() for it rather than mock_endpoint_decoders_setup(). The fix is to add a new mock device to its mock device array before calling platform_device_add() for it, which guarantees that the new mock device is visible to the CXL subsystem. This patch introduces a new helper called cxl_mock_platform_device_add() to handle the issue, and uses it for all mock device additions. Fixes: 3a2b97b3210b ("cxl/test: Improve init-order fidelity relative to real-world systems") Signed-off-by: Li Ming <ming.li@zohomail.com> Tested-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20260520121457.234404-1-ming.li@zohomail.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
10 days	Merge tag 'nfsd-7.1-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "Regressions: - Tighten bounds checking for sunrpc cache hash tables - Don't report key material in the ftrace log Stable fix: - Fix lockd's implementation of the NLM TEST procedure" * tag 'nfsd-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: lockd: fix TEST handling when not all permissions are available. NFSD: Report whether fh_key was actually updated sunrpc: prevent out-of-bounds read in __cache_seq_start()
10 days	Merge tag 'linux_kselftest-kunit-fixes-7.1-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit fix from Shuah Khan: "Fix a use-after-free in kunit debugfs when using kunit.filter when the executor frees dynamically allocated resources after running boot-time tests. This resulted in fatal hardware exception due to invalidation of capability flags on the reclaimed memory on some architectures such as CHERI RISC-V that support the feature, and silent memory corruption on others. The fix for this couples the lifetime of the filtered suite memory allocation to the lifetime of the kunit subsystem and its associated VFS nodes. Ownership of the boot-time suite_set is now transferred to a global tracker ('kunit_boot_suites'), and the memory is cleanly released in kunit_exit() during module teardown" * tag 'linux_kselftest-kunit-fixes-7.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: fix use-after-free in debugfs when using kunit.filter
10 days	x86/microcode: Do not access MSR_IA32_PLATFORM_ID when running as a guest	Borislav Petkov
	Patch in Fixes: causes the usual: unchecked MSR access error: RDMSR from 0x17 at ... (intel_get_platform_id) Call Trace: early_init_intel early_cpu_init setup_arch _printk start_kernel x86_64_start_reservations x86_64_start_kernel common_startup_64 because the kernel is booted in a guest. In order to avoid it, this MSR access needs to be prevented when running virtualized. That is usually done by checking X86_FEATURE_HYPERVISOR but for this particular case it is too early yet. The platform ID needs to be read as early as when microcode is loaded on the BSP: load_ucode_bsp ... -> get_microcode_blob ... -> intel_find_matching_signature and by that time, CPUID leaves haven't been parsed yet. The microcode loader already has logic to check early whether the kernel is running virtualized so make that globally available to arch/x86/. The query whether running virtualized is getting more and more prominent in recent times so might as well make it an arch-global var which the rest of the code can use. Fixes: d8630b67ca1ed ("x86/cpu: Add platform ID to CPU info structure") Reported-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com> Tested-by: Binbin Wu <binbin.wu@linux.intel.com> Link: https://lore.kernel.org/all/20260430020953.1405535-1-binbin.wu@linux.intel.com
10 days	spi: dt-bindings: spi-qpic-snand: Add ipq5210 compatible	Varadarajan Narayanan
	Since the QPIC-SPI-NAND flash controller present in ipq5210 is the same as the one found in ipq9574, document the ipq5210 compatible with ipq9574 as the fallback. Signed-off-by: Varadarajan Narayanan <varadarajan.narayanan@oss.qualcomm.com> Link: https://patch.msgid.link/20260514-ipq5210-nand-v1-1-cbdd7492e826@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
10 days	Merge tag 'mm-hotfixes-stable-2026-05-25-16-22' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "13 hotfixes. 9 are for MM. 9 are cc:stable and the remaining 4 address post-7.1 issues or aren't considered suitable for backporting. All patches are singletons - please see the individual changelogs for details" * tag 'mm-hotfixes-stable-2026-05-25-16-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: Revert "mm: introduce a new page type for page pool in page type" mm/vmalloc: do not trigger BUG() on BH disabled context MAINTAINERS, mailmap: change email for Eugen Hristev mm/migrate_device: fix pgtable leak in migrate_vma_insert_huge_pmd_page kernel/fork: validate exit_signal in kernel_clone() mm: memcontrol: propagate NMI slab stats to memcg vmstats mm/damon/sysfs-schemes: delete tried region in regions_rmdirs() mm/rmap: initialize nr_pages to 1 at loop start in try_to_unmap_one zram: fix use-after-free in zram_writeback_endio memfd: deny writeable mappings when implying SEAL_WRITE ipc: limit next_id allocation to the valid ID range Revert "mm/hugetlbfs: update hugetlbfs to use mmap_prepare" MAINTAINERS: .mailmap: update after GEHC spin-off
10 days	Merge branch 'ethtool-module-fix-a-handful-of-small-bugs'	Jakub Kicinski
	Jakub Kicinski says: ==================== ethtool: module: fix a handful of small bugs I've been poking at the locking in ethtool and it appears that the FW flashing is not currently taking the ops lock. Existing drivers which implement module FW flashing seem to have their own locking, so this series doesn't actually add the ops lock (I'll add it in net-next). But a number of other errors have been surfaced in the process. ==================== Link: https://patch.msgid.link/20260522231312.1710836-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days	ethtool: cmis: validate fw->size against start_cmd_payload_size	Jakub Kicinski
	cmis_fw_update_start_download() copies start_cmd_payload_size bytes from the firmware blob into the CDB LPL vendor_data[] payload without validating that the FW has enough data. Since the start_cmd_payload_size can only be ~120B an image too short is most likely corrupted, so reject it. Fixes: c4f78134d45c ("ethtool: cmis_fw_update: add a layer for supporting firmware update using CDB") Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Link: https://patch.msgid.link/20260522231312.1710836-10-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days	ethtool: cmis: validate start_cmd_payload_size from module	Jakub Kicinski
	The CMIS firmware update code reads start_cmd_payload_size from the module's FW Management Features CDB reply and uses it directly as the byte count for memcpy. The destination buffer is 112 bytes (ETHTOOL_CMIS_CDB_LPL_MAX_PL_LENGTH - 8). So a malicious module (or corrupted response) can cause an OOB write later on in cmis_fw_update_start_download(). Let's error out. If modules that expect longer LPL writes actually exist we should revisit. struct cmis_cdb_start_fw_download_pl's definition has to move, no change there. Fixes: c4f78134d45c ("ethtool: cmis_fw_update: add a layer for supporting firmware update using CDB") Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Link: https://patch.msgid.link/20260522231312.1710836-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
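The two start_cmd_payload_size checks described above (the module-reported size must fit the 112-byte destination, and the firmware image must contain at least that many bytes) can be sketched as a single guarded copy. This is an illustration under stated assumptions, not the kernel code: every `demo_` name is hypothetical, the 8-byte header layout is assumed, the value 120 is derived from "112 = max payload length - 8", and returning -EINVAL is an assumed error convention.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed sizes: 120 is derived from the 112-byte figure plus 8. */
#define DEMO_LPL_MAX_PL_LENGTH 120
#define DEMO_VENDOR_DATA_LEN   (DEMO_LPL_MAX_PL_LENGTH - 8)

/* Hypothetical payload layout: 8 header bytes, then vendor data. */
struct demo_start_pl {
	uint8_t hdr[8];
	uint8_t vendor_data[DEMO_VENDOR_DATA_LEN];
};

/*
 * Copy start_cmd_payload_size bytes of the firmware image into the
 * payload, rejecting sizes that would overflow the destination or
 * read past the end of the image.
 */
static int demo_fill_vendor_data(struct demo_start_pl *pl,
				 const uint8_t *fw, size_t fw_size,
				 size_t start_cmd_payload_size)
{
	/* Size reported by the module must fit the destination buffer. */
	if (start_cmd_payload_size > sizeof(pl->vendor_data))
		return -EINVAL;

	/* The image must actually contain that many bytes. */
	if (fw_size < start_cmd_payload_size)
		return -EINVAL;

	memcpy(pl->vendor_data, fw, start_cmd_payload_size);
	return 0;
}
```

Doing both checks before the memcpy() means neither an oversized value from the module nor a truncated image can reach the copy.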
10 days	ethtool: cmis: fix u16-to-u8 truncation of msleep_pre_rpl	Jakub Kicinski
	ethtool_cmis_cdb_compose_args() accepts msleep_pre_rpl as u16 but stores it into the u8 field ethtool_cmis_cdb_cmd_args::msleep_pre_rpl, silently truncating values >= 256. Seven of the nine call sites pass 1000 ms (it's the third argument from the end). Fixes: a39c84d79625 ("ethtool: cmis_cdb: Add a layer for supporting CDB commands") Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Danielle Ratson <danieller@nvidia.com> Link: https://patch.msgid.link/20260522231312.1710836-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
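The truncation described above is ordinary C narrowing: storing a 16-bit value in an 8-bit field keeps only the low 8 bits. The sketch below uses hypothetical `demo_` structs to show the effect on the 1000 ms value mentioned in the commit; the wide variant is shown only for contrast and is not a claim about how the patch itself resolves the issue.

```c
#include <stdint.h>

/* Hypothetical structs: one with a narrow field, one with a wide field. */
struct demo_args_narrow {
	uint8_t msleep_pre_rpl;
};

struct demo_args_wide {
	uint16_t msleep_pre_rpl;
};

/* Store a u16 delay in a u8 field; values >= 256 lose their high bits. */
static unsigned int demo_store_narrow(uint16_t msleep_pre_rpl)
{
	struct demo_args_narrow args;

	args.msleep_pre_rpl = (uint8_t)msleep_pre_rpl;
	return args.msleep_pre_rpl;
}

/* Store the same delay in a field of matching width; nothing is lost. */
static unsigned int demo_store_wide(uint16_t msleep_pre_rpl)
{
	struct demo_args_wide args;

	args.msleep_pre_rpl = msleep_pre_rpl;
	return args.msleep_pre_rpl;
}
```

With the narrow field, a requested delay of 1000 ms is stored as 1000 mod 256 = 232 ms, so a caller asking for a one-second wait would get less than a quarter of it.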
10 days	ethtool: cmis: require exact CDB reply length	Jakub Kicinski
	A malicious SFP module could respond with rpl_len longer than what cmis_cdb_process_reply() expected, leading to OOB writes. Malicious HW is a bit theoretical but some modules may just be buggy and/or the reads may occasionally get corrupted, so let's protect the kernel. The existing check protects from short replies. We need to protect from long ones, too. All callers that pass a non-zero rpl_exp_len cast the reply payload to a fixed-layout struct and read fields at fixed offsets, with no version negotiation or short-reply handling: - cmis_cdb_validate_password() - cmis_cdb_module_features_get() - cmis_fw_update_fw_mng_features_get() so let's assume that responses longer than expected do not have to be handled gracefully here. Add a warning message to make the debug easier in case my understanding is wrong... Note that page_data->length (argument of kmalloc) comes from last arg to ethtool_cmis_page_init() which is rpl_exp_len. Note also that AIs like to point out overflows in args->req.payload itself (which is a fixed-size 120 B buffer, on the stack), but callers should be reading structs defined by the standard, so protecting from requests for more data than max seems like defensive programming. Fixes: a39c84d79625 ("ethtool: cmis_cdb: Add a layer for supporting CDB commands") Reviewed-by: Danielle Ratson <danieller@nvidia.com> Link: https://patch.msgid.link/20260522231312.1710836-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 days	ethtool: module: fix cleanup if socket used for flashing multiple devices	Jakub Kicinski
	When a single Netlink socket issues MODULE_FW_FLASH_ACT against multiple devices, ethnl_sock_priv_set() overwrites sk_priv->dev on each call, retaining only the last one. The socket priv is used on socket close, to walk the global work list and mark the uncompleted flashing work as "orphaned". Otherwise if another socket reuses the PID it will unexpectedly receive the flashing notifications. Don't record the device, record net pointer instead. The purpose of the dev is to scope the work to a netns, anyway. If we store netns the overrides are safe/a nop since all flashed devices must be in the same netns as the socket. Fixes: 32b4c8b53ee7 ("ethtool: Add ability to flash transceiver modules' firmware") Reviewed-by: Danielle Ratson <danieller@nvidia.com> Link: https://patch.msgid.link/20260522231312.1710836-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>