| Age | Commit message (Collapse) | Author |
|
The main purpose of this cmd is to be able to associate a
non-psp-capable device (e.g. veth or netkit) with a psp device.
One use case is if we create a pair of veth/netkit, and assign 1 end
inside a netns, while leaving the other end within the default netns,
with a real PSP device, e.g. netdevsim or a physical PSP-capable NIC.
With this command, we could associate the veth/netkit inside the netns
with PSP device, so the virtual device could act as PSP-capable device
to initiate PSP connections, and performs PSP encryption/decryption on
the real PSP device.
Signed-off-by: Wei Wang <weibunny@fb.com>
Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com>
Link: https://patch.msgid.link/20260608233118.2694144-3-weibunny.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Blamed commit converted the untracked dev_hold()/dev_put() calls
in the watchdog code to use the tracked dev_hold_track()/dev_put_track()
(which were later renamed/interfaced to netdev_hold() and netdev_put()).
By introducing dev->watchdog_dev_tracker to store the
reference tracking information without adding synchronization
between netdev_watchdog_up() and dev_watchdog(), it enabled the
race condition where this pointer could be overwritten or freed
concurrently, leading to the list corruption crash syzbot reported:
list_del corruption, ffff888114a18c00->next is NULL
kernel BUG at lib/list_debug.c:52 !
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 91 Comm: kworker/u8:5 Not tainted syzkaller #0 PREEMPT(lazy)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/09/2026
Workqueue: events_unbound linkwatch_event
RIP: 0010:__list_del_entry_valid_or_report.cold+0x22/0x2a lib/list_debug.c:52
Call Trace:
<TASK>
__list_del_entry_valid include/linux/list.h:132 [inline]
__list_del_entry include/linux/list.h:246 [inline]
list_move_tail include/linux/list.h:341 [inline]
ref_tracker_free+0x1a7/0x6c0 lib/ref_tracker.c:329
netdev_tracker_free include/linux/netdevice.h:4491 [inline]
netdev_put include/linux/netdevice.h:4508 [inline]
netdev_put include/linux/netdevice.h:4504 [inline]
netdev_watchdog_down net/sched/sch_generic.c:600 [inline]
dev_deactivate_many+0x28c/0xfe0 net/sched/sch_generic.c:1363
dev_deactivate+0x109/0x1d0 net/sched/sch_generic.c:1397
linkwatch_do_dev net/core/link_watch.c:184 [inline]
linkwatch_do_dev+0xd3/0x120 net/core/link_watch.c:166
__linkwatch_run_queue+0x3a5/0x810 net/core/link_watch.c:240
linkwatch_event+0x8f/0xc0 net/core/link_watch.c:314
process_one_work+0xa0e/0x1980 kernel/workqueue.c:3314
process_scheduled_works kernel/workqueue.c:3397 [inline]
worker_thread+0x5ef/0xe50 kernel/workqueue.c:3478
kthread+0x370/0x450 kernel/kthread.c:436
ret_from_fork+0x69a/0xc80 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
This patch has three coordinated parts:
1) Add dev->watchdog_lock and dev->watchdog_ref_held to serialize watchdog operations.
2) Remove netdev_watchdog_up() call from netif_carrier_on():
This ensures netdev_watchdog_up() is only called from process/BH context
(via linkwatch workqueue dev_activate()), allowing us to use
spin_lock_bh() for synchronization.
3) Synchronize watchdog up and watchdog timer:
Protect netdev_watchdog_up() with tx_global_lock and watchdog_lock.
Only allocate a new tracker in netdev_watchdog_up() if one is
not already present.
In dev_watchdog(), ensure we don't release the tracker if the
timer was rescheduled either by dev_watchdog() itself or concurrently
by netdev_watchdog_up().
Fixes: f12bf6f3f942 ("net: watchdog: add net device refcount tracker")
Reported-by: syzbot+381d82bbf0253710b35d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a26b751.c25708ab.1b19ef.0013.GAE@google.com/T/#u
Tested-by: syzbot+3479efbc2821cb2a79f2@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260611152737.2580480-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The tls_toe feature and its single user (chelsio chtls) have been
unmaintained for multiple years. It also hooks into the core of the
TCP implementation, and bypasses most of the networking stack.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/1f30e73275c07bf879f547589872d0916025a52e.1781165969.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A child socket inherits the listener's bpf_sock_ops_cb_flags via
sk_clone_lock(). If its setup fails in tcp_v4_syn_recv_sock() /
tcp_v6_syn_recv_sock(), the child is freed through put_and_exit, where
inet_csk_prepare_forced_close() drops the socket lock and tcp_done() runs
without it.
If BPF_SOCK_OPS_STATE_CB_FLAG was inherited, tcp_done() -> tcp_set_state()
calls tcp_call_bpf(), which expects the lock and trips sock_owned_by_me():
WARNING: include/net/sock.h:1799 at tcp_set_state+0x433/0x550
RIP: 0010:tcp_set_state+0x433/0x550 include/net/sock.h:1799
Call Trace:
<IRQ>
tcp_done+0xba/0x250 net/ipv4/tcp.c:5095
tcp_v4_syn_recv_sock+0x850/0xa50 net/ipv4/tcp_ipv4.c:1787
tcp_check_req+0xf30/0x1360 net/ipv4/tcp_minisocks.c:926
tcp_v4_rcv+0x1047/0x1b50 net/ipv4/tcp_ipv4.c:2164
</IRQ>
The child is freed before it is ever established, so it should run no
sock_ops callback. Clear its cb flags in inet_csk_prepare_for_destroy_sock(),
the common point for the IPv4, IPv6 and chtls forced-close paths and for the
MPTCP ->syn_recv_sock() failure path (dispose_child), which reaches tcp_done()
on a child that was never established too.
Suggested-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Fixes: d44874910a26 ("bpf: Add BPF_SOCK_OPS_STATE_CB")
Signed-off-by: Sechang Lim <rhkrqnwk98@gmail.com>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260611092923.1895982-1-rhkrqnwk98@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull drm fixes from Dave Airlie:
"Looks like it's settled down a bit more thankfully. Small changes
across the board, amdgpu/xe leading with some colorop changes in the
core/amd. Otherwise some misc driver fixes.
colorop:
- make lut interpolation mutable
- track colorop updates correctly
amdgpu:
- UserQ fix
- Userptr fix
- MCCS freesync fix
- track colorop changes correctly
amdkfd:
- Fix an event information leak
- Events bounds check fix
- Trap cleanup fix
i915:
- Check supported link rates DPCD read
- Fix phys BO pread/pwrite with offset
xe:
- fix oops in suspend/shutdown without display
- RAS fixes
- Use HW_ERR prefix in log
- include all registered queues in TLB invalidation
- Fix refcount leak in xe_range_tree in error paths
- fix job timeout recovery for unstarted jobs and kernel queues
amdxdna:
- fix possible leak of mm_struct
ivpu:
- fix integer truncation
vc4:
- fix leak in krealloc() error handling
virtio:
- fix dma_fence ref-count leak"
* tag 'drm-fixes-2026-06-13' of https://gitlab.freedesktop.org/drm/kernel: (24 commits)
accel/amdxdna: Fix mm_struct reference leak in aie2_populate_range()
drm/xe: fix job timeout recovery for unstarted jobs and kernel queues
drm/xe: fix refcount leak in xe_range_fence_insert()
drm/xe: include all registered queues in TLB invalidation
drm/xe/hw_error: Use HW_ERR prefix in log
drm/xe/drm_ras: Add per node cleanup action
drm/xe/drm_ras: Make counter allocation drm managed
drm/xe/display: fix oops in suspend/shutdown without display
drm/amd/display: use plane color_mgmt_changed to track colorop changes
drm/atomic: track individual colorop updates
drm/colorop: make lut(1/3)d_interpolation props correctly behave as mutable
drm/colorop: Remove read-only comments from interpolation fields
drm/i915/gem: Fix phys BO pread/pwrite with offset
drm/vc4: fix krealloc() memory leak
drm/virtio: Fix driver removal with disabled KMS
drm/i915/edp: Check supported link rates DPCD read
accel/ivpu: Fix signed integer truncation in IPC receive
drm/virtio: fix dma_fence refcount leak on error in virtio_gpu_dma_fence_wait()
drm/amd/display: Consult MCCS FreeSync cap only if requested & supported
drm/amdkfd: Unwind debug trap enable on copy_to_user failure
...
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
amd:
- track colorop changes correctly
amdxdna:
- fix possible leak of mm_struct
colorop:
- make lut interpolation mutable
- track colorop updates correctly
ivpu:
- fix integer truncation
vc4:
- fix leak in krealloc() error handling
virtio:
- fix dma_fence ref-count leak
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260612081418.GA17001@2a02-2455-9062-2500-e496-5a17-62ba-545e.dyn6.pyur.net
|
|
To date, platform firmware maps accelerator memory and accelerator drivers
simply want an address range that they can map themselves. This typically
results in a single region being auto-assembled upon registration of a
memory device. Use the @attach mechanism of devm_cxl_add_memdev()
parameter to retrieve that region while also adhering to CXL subsystem
locking and lifetime rules. As part of adhering to current object lifetime
rules, if the region or the CXL port topology is invalidated, the CXL core
arranges for the accelertor driver to be detached as well.
The locking and lifetime rules were validated with Dave's work-in-progress
cxl-type-2 support for cxl_test.
devm_cxl_add_classdev() supports the general memory expansion flow where
region assembly is optional, dynamic, and user controlled.
Cc: Alejandro Lucero <alucerop@amd.com>
Signed-off-by: Dan Williams <djbw@kernel.org>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Tested-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20260519210158.1499795-6-djbw@kernel.org
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Current DAI supports auto format selection. It allow to have array like
below.
(X) static u64 xxx_auto_formats[] = {
(A) /* First Priority */
SND_SOC_POSSIBLE_DAIFMT_I2S |
SND_SOC_POSSIBLE_DAIFMT_LEFT_J,
/* Second Priority */
(B) SND_SOC_POSSIBLE_DAIFMT_DSP_A |
SND_SOC_POSSIBLE_DAIFMT_DSP_B,
};
It try to find available format from I2S/LEFT_J first (A).
Then, try to find from I2S/LEFT_J/DSP_A/DSP_B if couldn't find (A)+(B).
(OR:ed)
In this method, it can't handle if there is format combination.
For example, some driver has pattern.
Pattern1
I2S/RIFHT_J/LEFT_J (FORMAT) and NB_NF/IB_IF/IB_NF/NB_IF (INV)_
Pattern2
DSP_A/DSP_B (FORMAT) and NB_NF/ IB_NF
Because it will try to OR Pattern1 and Pattern2, un-supported
pattern might be selected.
This patch update method not to use OR, and assumes full format array.
Above sample (X) need to be
static u64 xxx_auto_formats[] = {
/* First Priority */
SND_SOC_POSSIBLE_DAIFMT_I2S |
SND_SOC_POSSIBLE_DAIFMT_LEFT_J,
/* Second Priority */
SND_SOC_POSSIBLE_DAIFMT_I2S |
SND_SOC_POSSIBLE_DAIFMT_LEFT_J |
SND_SOC_POSSIBLE_DAIFMT_DSP_A |
SND_SOC_POSSIBLE_DAIFMT_DSP_B,
};
Note: It doesn't support Multi CPU/Codec for now
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87jys836k8.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Clock provider / consumer selection is based on board, we can't select
automatically from software. Let's remove SND_SOC_POSSIBLE_xBx_xFx.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://patch.msgid.link/87tsrc36li.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Add a new block error injection interface that allows to inject specific
status code for specific ranges.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Reviewed-by: Md Haris Iqbal <haris.iqbal@linux.dev>
Link: https://patch.msgid.link/20260611140703.2401204-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
So far our parsing of {iommu,msi}-map properties has always blindly
assumed that the output specifiers will always have exactly 1 cell.
This typically does happen to be the case, but is not actually enforced
(and the PCI msi-map binding even explicitly states support for 0 or 1
cells) - as a result we've now ended up with dodgy DTs out in the field
which depend on this behaviour to map a 1-cell specifier for a 2-cell
provider, despite that being bogus per the bindings themselves.
Since there is some potential use in being able to map at least single
input IDs to multi-cell output specifiers (and properly support 0-cell
outputs as well), add support for properly parsing and using the target
nodes' #cells values, albeit with the unfortunate complication of still
having to work around expectations of the old behaviour too.
Since there are multi-cell output specifiers, the callers of of_map_id()
may need to get the exact cell output value for further processing.
Update of_map_id() to set args_count in the output to reflect the actual
number of output specifier cells.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Charan Teja Kalla <charan.kalla@oss.qualcomm.com>
Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>
Link: https://patch.msgid.link/20260603-parse_iommu_cells-v16-3-dc509dacb19a@oss.qualcomm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Change of_map_id() to take a pointer to struct of_phandle_args
instead of passing target device node and translated IDs separately.
Update all callers accordingly.
Add an explicit filter_np parameter to of_map_id() and of_map_msi_id()
to separate the filter input from the output. Previously, the target
parameter served dual purpose: as an input filter (if non-NULL, only
match entries targeting that node) and as an output (receiving the
matched node with a reference held). Now filter_np is the explicit
input filter and arg->np is the pure output.
Previously, of_map_id() would call of_node_put() on the matched node
when a filter was provided, making reference ownership inconsistent.
Remove this internal of_node_put() call so that of_map_id() now always
transfers ownership of the matched node reference to the caller via
arg->np. Callers are now consistently responsible for releasing this
reference with of_node_put(arg->np) when done.
Acked-by: Frank Li <Frank.Li@nxp.com>
Suggested-by: Rob Herring (Arm) <robh@kernel.org>
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Charan Teja Kalla <charan.kalla@oss.qualcomm.com>
Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>
Link: https://patch.msgid.link/20260603-parse_iommu_cells-v16-2-dc509dacb19a@oss.qualcomm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Since we now have quite a few users parsing "iommu-map" and "msi-map"
properties, give them some wrappers to conveniently encapsulate the
appropriate sets of property names. This will also make it easier to
then change of_map_id() to correctly account for specifier cells.
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vijayanand Jitta <vijayanand.jitta@oss.qualcomm.com>
Link: https://patch.msgid.link/20260603-parse_iommu_cells-v16-1-dc509dacb19a@oss.qualcomm.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
iommu_copy_struct_from_full_user_array() copies a whole user array into a
kernel buffer. In the common case, where user entry_len equals destination
entry size, it takes a fast path and copies the whole array with a single
copy_from_user().
That fast path does not return, so it falls through into the item-by-item
copy_struct_from_user() loop and copies every entry a second time. For an
equal entry_len that loop is just a copy_from_user() of the same bytes, so
the whole array is copied twice for no benefit.
Return right after the bulk copy. The per-item loop then runs only on the
slow path, where entry_len differs and each entry needs size adaption.
Fixes: 4f2e59ccb698 ("iommu: Add iommu_copy_struct_from_full_user_array helper")
Link: https://patch.msgid.link/r/6c9eca4ff584cb977661e97799ac6fe934e7f51c.1780521606.git.nicolinc@nvidia.com
Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
IOMMU_IOAS_MAP_FILE is documented as mapping a memfd, but the
implementation first tries to resolve the fd as a dma-buf and has a
special path for supported dma-buf exporters. In particular, VFIO PCI
dma-bufs exported through VFIO_DEVICE_FEATURE_DMA_BUF can be mapped when
they describe a single DMA range.
Update the UAPI comment so userspace understands that certain kinds of
dma-buf are supported in addition to memfd.
Fixes: 44ebaa1744fd ("iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE")
Link: https://patch.msgid.link/r/20260610-tmp-v1-1-b8ccbf557391@fb.com
Signed-off-by: Alex Mastro <amastro@fb.com>
Assisted-by: Codex:gpt-5.5-high
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The secondary_op_tmpl kdoc line has been removed accidentally, add it
back.
Reported-by: Michael Walle <mwalle@kernel.org>
Closes: https://lore.kernel.org/linux-mtd/DJ56CDMRVFQ6.FOZRIQTF3VDW@kernel.org/T/#u
Fixes: 38fbe4b3f66e ("spi: spi-mem: Add a no_cs_assertion capability")
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://patch.msgid.link/20260612-perso-fix-no-cs-assertion-kdoc-v1-1-626b2d6d0d9b@bootlin.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Merge tag 'mtd/spi-mem-cont-read-for-7.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux into spi-7.2
Miquel Raynal <miquel.raynal@bootlin.com> says:
Aside from preparation changes in the SPI NAND core, the changes carried
here focus on the shared spi-mem layer which is enhanced in order to
bring two new features:
- The possibility to fill a primary and a secondary operation template
in the direct mapping structure in order to support continuous reads
in SPI NAND, which may require two different read operations.
- SPI controllers may indicate possible CS instabilities over long
transfers by setting a boolean. This capability is related to the
previous one, the need for it has arised while testing SPI NAND
continuous reads with the Cadence QSPI controller which cannot, under
certain conditions, keep the CS asserted for the length of
an eraseblock-large transfer.
|
|
'rockchip', 'verisilicon', 'riscv', 'intel/vt-d', 'amd/amd-vi' and 'core' into next
|
|
Merge series "slab: support for compiler-assisted type-based slab cache
partitioning" from Marco Elver. From the cover letter [6]:
Rework the general infrastructure around RANDOM_KMALLOC_CACHES into more
flexible KMALLOC_PARTITION_CACHES, with the former being a partitioning
mode of the latter.
Introduce a new mode, KMALLOC_PARTITION_TYPED, which leverages a feature
available in Clang 22 and later, called "allocation tokens" via
__builtin_infer_alloc_token() [1]. Unlike KMALLOC_PARTITION_RANDOM
(formerly RANDOM_KMALLOC_CACHES), this mode deterministically assigns a
slab cache to an allocation of type T, regardless of allocation site.
The builtin __builtin_infer_alloc_token(<malloc-args>, ...) instructs
the compiler to infer an allocation type from arguments commonly passed
to memory-allocating functions and returns a type-derived token ID. The
implementation passes kmalloc-args to the builtin: the compiler performs
best-effort type inference, and then recognizes common patterns such as
`kmalloc(sizeof(T), ...)`, `kmalloc(sizeof(T) * n, ...)`, but also
`(T *)kmalloc(...)`. Where the compiler fails to infer a type the
fallback token (default: 0) is chosen.
Note: kmalloc_obj(..) APIs fix the pattern how size and result type are
expressed, and therefore ensures there's not much drift in which
patterns the compiler needs to recognize. Specifically, kmalloc_obj()
and friends expand to `(TYPE *)KMALLOC(__obj_size, GFP)`, which the
compiler recognizes via the cast to TYPE*.
Clang's default token ID calculation is described as [1]:
typehashpointersplit: This mode assigns a token ID based on the hash
of the allocated type's name, where the top half ID-space is reserved
for types that contain pointers and the bottom half for types that do
not contain pointers.
Separating pointer-containing objects from pointerless objects and data
allocations can help mitigate certain classes of memory corruption
exploits [2]: attackers who gains a buffer overflow on a primitive
buffer cannot use it to directly corrupt pointers or other critical
metadata in an object residing in a different, isolated heap region.
It is important to note that heap isolation strategies offer a
best-effort approach, and do not provide a 100% security guarantee,
albeit achievable at relatively low performance cost. Note that this
also does not prevent cross-cache attacks: while waiting for future
features like SLAB_VIRTUAL [3] to provide physical page isolation, this
feature should be deployed alongside SHUFFLE_PAGE_ALLOCATOR and
init_on_free=1 to mitigate cross-cache attacks and page-reuse attacks as
much as possible today.
With all that, my kernel (x86 defconfig) shows me a histogram of slab
cache object distribution per /proc/slabinfo (after boot):
<slab cache> <objs> <hist>
kmalloc-part-15 1465 ++++++++++++++
kmalloc-part-14 2988 +++++++++++++++++++++++++++++
kmalloc-part-13 1656 ++++++++++++++++
kmalloc-part-12 1045 ++++++++++
kmalloc-part-11 1697 ++++++++++++++++
kmalloc-part-10 1489 ++++++++++++++
kmalloc-part-09 965 +++++++++
kmalloc-part-08 710 +++++++
kmalloc-part-07 100 +
kmalloc-part-06 217 ++
kmalloc-part-05 105 +
kmalloc-part-04 4047 ++++++++++++++++++++++++++++++++++++++++
kmalloc-part-03 183 +
kmalloc-part-02 283 ++
kmalloc-part-01 316 +++
kmalloc 1422 ++++++++++++++
The above /proc/slabinfo snapshot shows me there are 6673 allocated
objects (slabs 00 - 07) that the compiler claims contain no pointers or
it was unable to infer the type of, and 12015 objects that contain
pointers (slabs 08 - 15). On a whole, this looks relatively sane.
Additionally, when I compile my kernel with -Rpass=alloc-token, which
provides diagnostics where (after dead-code elimination) type inference
failed, I see 186 allocation sites where the compiler failed to identify
a type (down from 966 when I sent the RFC [4]). Some initial review
confirms these are mostly variable sized buffers, but also include
structs with trailing flexible length arrays.
Link: https://clang.llvm.org/docs/AllocToken.html [1]
Link: https://blog.dfsec.com/ios/2025/05/30/blasting-past-ios-18/ [2]
Link: https://lwn.net/Articles/944647/ [3]
Link: https://lore.kernel.org/all/20250825154505.1558444-1-elver@google.com/ [4]
Link: https://discourse.llvm.org/t/rfc-a-framework-for-allocator-partitioning-hints/87434 [5]
Link: https://lore.kernel.org/all/20260511200136.3201646-1-elver@google.com/ [6]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 7.2
* New features:
- None. Zilch. Nada. Que dalle.
* Fixes and other improvements:
- Significant cleanup of the vgic-v5 PPI support which was merged in
7.1. This makes the code more maintainable, and squashes a couple
of bugs in the meantime.
- Set of fixes for the handling of the MMU in an NV context,
particularly VNCR-triggered faults. S1POE support is fixed
as well.
- Large set of pKVM fixes, mostly addressing recurring issues
around hypervisor tracking of donated pages in obscure cases
where the donation could fail and leave things in a bizarre
state.
- Fixes for the so-called "lazy vgic init", which resulted in
sleeping operations in non-preemptible sections. This turned
out to be far more invasive than initially expected...
- Reduce the overhead of L1/L2 context switch by not touching
the FP registers.
- Fix the way non-implemented page sizes are dealt with when
a guest insist on using them for S2 translation.
- The usual set of low-impact fixes and cleanups all over the map.
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next
Mika writes:
thunderbolt: Changes for v7.2 merge window
This includes following USB4/Thunderbolt changes for the v7.2 merge
window:
- Make the driver more compliant with the connection manager guide.
- Improvements over Thunderbolt XDomain service handling.
- USB4STREAM driver.
- Split out PCIe bits into pci.c to allow the driver to work on
non-PCIe hosts as well.
- Various fixes and improvements.
All these have been in linux-next with no reported issues.
* tag 'thunderbolt-for-v7.2-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: (41 commits)
thunderbolt: debugfs: Fix sideband write size check
thunderbolt: debugfs: Fix margining error counter buffer leak
thunderbolt: test: Release third DP tunnel
thunderbolt: Prevent XDomain delayed work use-after-free on disconnect
thunderbolt: test: Add KUnit tests for property parser bounds checks
thunderbolt: Add some more descriptive probe error messages
thunderbolt: Require nhi->ops be valid
thunderbolt: Separate out common NHI bits
thunderbolt: Move pci_device out of tb_nhi
thunderbolt: Increase Notification Timeout to 255 ms for USB4 routers
thunderbolt: Increase timeout for Configuration Ready bit
thunderbolt: Verify Router Ready bit is set after router enumeration
thunderbolt: Verify PCIe adapter in detect state before tunnel setup
thunderbolt: Activate path hops from source to destination
thunderbolt: Fix lane bonding log when bonding not possible
thunderbolt: Don't access path config space on Lane 1 adapters in tb_switch_reset_host()
thunderbolt: Improve multi-display DisplayPort tunnel allocation
docs: admin-guide: thunderbolt: Add instructions how to use USB4STREAM
thunderbolt: Add support for USB4STREAM
thunderbolt: Add support for ConfigFS
...
|
|
KVM SEV changes for 7.2
- Don't advertise support for unusuable VM types, and account for VM types
that are disabled by firmware, e.g. to mitigate security vulnerabilities.
- Rewrite the SEV {en,de}crypt debug ioctls as they were riddle with bugs and
unnecessarily complicated, and add comprehensive tests.
- Clean up and deduplicate the SEV page pinning code.
- Fix minor goofs related to writing back CPUID information after firmware
rejects a CPUID page for an SNP vCPU.
|
|
* kvm-arm64/vgic-v5-PPI-fixes:
: .
: Substantial cleanup of the vgic-v5 PPI support. From the original
: cover letter:
:
: "With the GICv5 PPi support merged in, it has become obvious that a few
: things could be improved, both from the correctness and maintainability
: angles."
: .
KVM: arm64: Fix arch timer interrupts for GICv3-on-GICv5 guests
irqchip/gic-v5: Immediately exec priority drop following activate
Documentation: KVM: Clarify that PMU_V3_IRQ IntID requirements for GICv5
Documentation: KVM: Fix typos in VGICv5 documentation
KVM: arm64: selftests: Improve error handling for GICv5 PPI selftest
KVM: arm64: selftests: Cleanup unused vars in GICv5 PPI selftest
KVM: arm64: selftests: Add missing GIC CDEN to no-vgic-v5 selftest
KVM: arm64: vgic-v5: Atomically assign bits to PPI DVI bitmap
KVM: arm64: vgic-v5: Add missing trap handing for NV triage
KVM: arm64: vgic-v5: Limit support to 64 PPIs
KVM: arm64: vgic: Rationalise per-CPU irq accessor
KVM: arm64: vgic-v5: Drop defensive checks from vgic_v5_ppi_queue_irq_unlock()
KVM: arm64: vgic: Consolidate vgic_allocate_private_irqs_locked()
KVM: arm64: vgic: Constify struct irq_ops usage
KVM: arm64: vgic-v5: Drop pointless ARM64_HAS_GICV5_CPUIF check
KVM: arm64: vgic-v5: Remove use of __assign_bit() with a constant
KVM: arm64: vgic-v5: Move PPI caps into kvm_vgic_global_state
KVM: arm64: vgic-v5: Add for_each_visible_v5_ppi() iterator
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
* kvm-arm64/no-lazy-vgic-init:
: .
: Fix an ugly situation where the vgic lazy init could happen in
: non-preemtible contexts such as vcpu reset, resulting in lockdep
: splats.
:
: This requires revamping the way in-kernel emulation of devices
: (timers, PMU) are presenting their interrupt to the vgic, and
: make sure there is no need to init the vgic on the back of that.
: .
KVM: arm64: vgic-v2: Don't init the vgic on in-kernel interrupt injection
KVM: arm64: vgic-v2: Force vgic init on injection outside the run loop
KVM: arm64: pmu: Kill the PMU interrupt level cache
KVM: arm64: timer: Kill the per-timer irq level cache
KVM: arm64: Simplify userspace notification of interrupt state
KVM: arm64: timer: Repaint kvm_timer_{should,irq_can}_fire() to kvm_timer_{pending,enabled}()
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
KVM generic changes for 7.2
- Rename invalidate_begin() to invalidate_start() throughout KVM to follow
the kernel's nomenclature, e.g. for mmu_notifiers.
- Minor cleanups.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
core:
- hci_sync: Add support for HCI_LE_Set_Host_Feature [v2]
- SMP: Use AES-CMAC library API
- sockets: convert to getsockopt_iter
- Add SPDX id lines to some source files
drivers:
- btintel_pcie: Support Product level reset
- btintel_pcie: Add support for smart trigger dump
- btintel_pcie: Add 50 ms delay before MAC init on BlazarIW
- btintel_pcie: Separate coredump work from RX work
- btmtk: add event filter to filter specific event
- btrtl: fix RTL8761B/BU broken LE extended scan
- btusb: Add Realtek RTL8922AE VID/PID 0bda/d922
- btusb: Add Realtek RTL8922AE VID/PID 0bda/d923
- btusb: MT7922: Add VID/PID 0e8d/223c
- btusb: MT7925: Add VID/PID 0e8d/8c38
- btusb: Add support for TP-Link TL-UB250
- btusb: Add Mercusys MA530 for Realtek RTL8761BUV
- btusb: Add TP-Link UB600 for Realtek 8761BUV
- btusb: Add support for Intel Lizard Peak 2 (0x8087:0x0040)
- btusb: Add USB ID 2c4e:0128 for Mercusys MA60XNB
- btusb: MT7925: Add VID/PID 13d3/3609
* tag 'for-net-next-2026-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (49 commits)
Bluetooth: btintel_pcie: Separate coredump work from RX work
Bluetooth: btmtksdio: fix infinite loop in btmtksdio_txrx_work()
Bluetooth: qca: Add BT FW build version to kernel log
Bluetooth: vhci: validate devcoredump state before side effects
Bluetooth: L2CAP: validate connectionless PSM length
Bluetooth: hci: validate codec capability element length
Bluetooth: L2CAP: Fix UAF in channel timeout by holding conn ref
Bluetooth: btintel_pcie: Load IOSF debug regs by controller variant
Bluetooth: btintel_pcie: Add 50 ms delay before MAC init on BlazarIW
Bluetooth: Add SPDX id lines to some source files
Bluetooth: btintel_pcie: Add support for smart trigger dump
Bluetooth: hci_h5: reset hci_uart::priv in the close() method
Bluetooth: btusb: clean up probe error handling
Bluetooth: btusb: fix wakeup irq devres lifetime
Bluetooth: btusb: fix wakeup source leak on probe failure
Bluetooth: btusb: fix use-after-free on marvell probe failure
Bluetooth: btusb: fix use-after-free on registration failure
Bluetooth: btmtk: fix URB leak in alloc_mtk_intr_urb error path
Bluetooth: hci_core: Fix UAF in hci_unregister_dev()
Bluetooth: hci_event: fix simultaneous discovery stuck in FINDING
...
====================
Link: https://patch.msgid.link/20260611183358.176776-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce vendor-specific PHY tunable identifiers to control the
KSZ87xx low-loss cable erratum handling through the ethtool PHY
tunable interface.
The following tunables are added:
- a boolean "short-cable" tunable, applying a documented and
conservative preset intended for short or low-loss Ethernet cables;
- an integer LPF bandwidth tunable, allowing advanced adjustment of the
receiver low-pass filter bandwidth;
- an integer DSP EQ initial value tunable, allowing advanced tuning of
the PHY equalizer initialization.
The actual behavior is implemented by the corresponding PHY and switch
drivers.
Reviewed-by: Marek Vasut <marex@nabladev.com>
Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de>
Signed-off-by: Fidelio Lawson <fidelio.lawson@exotec.com>
Link: https://patch.msgid.link/20260609-ksz87xx_errata_low_loss_connections-v10-2-9ba4418cf3db@exotec.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With TCP-timestamps (padded) taking 12 bytes and ADD_ADDR IPv6 + port
taking 30 bytes, the 40-byte limit for the TCP options is reached. In
this case, it is then not possible to send the address signal.
The idea is to let MPTCP dropping the TCP-timestamps option for some
specific packets, to be able to send some specific pure ACK carrying >28
bytes of MPTCP options, like with this specific ADD_ADDR. A new
parameter is passed from tcp_established_options to the MPTCP side to
indicate if the TCP TS option is used, and if it should be dropped. The
next commit implements the part on MPTCP side, but split into two
patches to help TCP maintainers to identify the modifications on TCP
side. This feature will be controlled by a new add_addr_v6_port_drop_ts
MPTCP sysctl knob.
It is important to keep in mind that dropping the TCP timestamps option
for one packet of the connection could eventually disrupt some
middleboxes: even if it should be unlikely, they could drop the packet
or even block the connection. That's why this new feature will be
controlled by a sysctl knob.
Note that it would be technically possible to squeeze both options into
the header if the ADD_ADDR is first written, and then the TCP timestamps
without the NOPs preceding it. But this means more modifications on TCP
side, plus some middleboxes could still be disrupted by that.
In this implementation, an unused bit is used in mptcp_out_options
structure to avoid passing an address to a local variable. Reading and
setting it needs CONFIG_MPTCP, so the whole block now has this #if
condition: mptcp_established_options() is then no longer used without
CONFIG_MPTCP.
About alternatives, instead of passing a new boolean (has_ts), another
option would be to pass the whole option structure (opts), but
'struct tcp_out_options' is currently defined in tcp_output.c, and it
would need to be exported. Plus that means the removal of the TCP TS
option would be done on the MPTCP side, and not here on the TCP side.
It feels clearer to remove other TCP options from the TCP side, than
hiding that from the MPTCP side.
Yet an other alternative would be to pass the size already taken by the
other TCP options, and have a way to drop them all when needed. But this
feels better to target only the timestamps option where dropping it
should be safe, even if it is currently the only option that would be
set before MPTCP, when MPTCP is used.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260605-net-next-mptcp-add-addr6-port-ts-v2-5-758e7ca73f4d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mr_table.cache_resolve_queue_len is always updated under
spin_lock_bh(&mfc_unres_lock).
Let's convert it to u32.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20260609222013.1550355-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
rocker_router_fib_event() calls fib_rule_get() during RCU dump.
If the fib_rule is dying, refcount_inc() will complain about it.
Let's call refcount_inc_not_zero() in fib_rules_dump().
Fixes: 5d7bfd141924 ("ipv4: fib_rules: Dump FIB rules when registering FIB notifier")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260610061744.2030996-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported use-after-free in nsim_fib4_prepare_event(). [0]
The problem is that the following functions call fib_info_hold() /
refcount_inc() while dumping fib_info under RCU, which is unsafe.
* mlxsw_sp_router_fib4_event()
* rocker_router_fib_event()
* nsim_fib4_prepare_event()
refcount_inc_not_zero() must be used, but it would be too late
there.
Let's guarantee the lifetime of fib_info in fib_leaf_notify().
Note that IPv6 does not need the corresponding change since
fib6_table_dump() holds fib6_table.tb6_lock.
[0]:
refcount_t: addition on 0; use-after-free.
WARNING: lib/refcount.c:25 at refcount_warn_saturate+0x9f/0x110 lib/refcount.c:25, CPU#0: kworker/u8:15/3420
Modules linked in:
CPU: 0 UID: 0 PID: 3420 Comm: kworker/u8:15 Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/18/2026
Workqueue: netns cleanup_net
RIP: 0010:refcount_warn_saturate+0x9f/0x110 lib/refcount.c:25
Code: eb 66 85 db 74 3e 83 fb 01 75 4c e8 1b f1 22 fd 48 8d 3d 84 cb f1 0a 67 48 0f b9 3a eb 4a e8 08 f1 22 fd 48 8d 3d 81 cb f1 0a <67> 48 0f b9 3a eb 37 e8 f5 f0 22 fd 48 8d 3d 7e cb f1 0a 67 48 0f
RSP: 0018:ffffc9000f2c7270 EFLAGS: 00010293
RAX: ffffffff84a18858 RBX: 0000000000000002 RCX: ffff888032ff9ec0
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8f9353e0
RBP: 0000000000000000 R08: ffff888032ff9ec0 R09: 0000000000000005
R10: 0000000000000100 R11: 0000000000000004 R12: ffff8880570cc000
R13: dffffc0000000000 R14: ffff88802b40563c R15: ffff8880570cc000
FS: 0000000000000000(0000) GS:ffff888126173000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb1f4d5d000 CR3: 000000006072a000 CR4: 00000000003526f0
Call Trace:
<TASK>
__refcount_add include/linux/refcount.h:-1 [inline]
__refcount_inc include/linux/refcount.h:366 [inline]
refcount_inc include/linux/refcount.h:383 [inline]
fib_info_hold include/net/ip_fib.h:629 [inline]
nsim_fib4_prepare_event drivers/net/netdevsim/fib.c:930 [inline]
nsim_fib_event_schedule_work drivers/net/netdevsim/fib.c:1000 [inline]
nsim_fib_event_nb+0x1055/0x1240 drivers/net/netdevsim/fib.c:1043
call_fib_notifier+0x45/0x80 net/core/fib_notifier.c:25
call_fib_entry_notifier net/ipv4/fib_trie.c:90 [inline]
fib_leaf_notify net/ipv4/fib_trie.c:2176 [inline]
fib_table_notify net/ipv4/fib_trie.c:2194 [inline]
fib_notify+0x36b/0x5e0 net/ipv4/fib_trie.c:2217
fib_net_dump net/core/fib_notifier.c:70 [inline]
register_fib_notifier+0x184/0x360 net/core/fib_notifier.c:108
nsim_fib_create+0x85d/0x9f0 drivers/net/netdevsim/fib.c:1596
nsim_dev_reload_create drivers/net/netdevsim/dev.c:1604 [inline]
nsim_dev_reload_up+0x374/0x7c0 drivers/net/netdevsim/dev.c:1058
devlink_reload+0x501/0x8d0 net/devlink/dev.c:475
devlink_pernet_pre_exit+0x1ff/0x420 net/devlink/core.c:558
ops_pre_exit_list net/core/net_namespace.c:161 [inline]
ops_undo_list+0x187/0x940 net/core/net_namespace.c:234
cleanup_net+0x56e/0x800 net/core/net_namespace.c:702
process_one_work kernel/workqueue.c:3314 [inline]
process_scheduled_works+0xb5d/0x1860 kernel/workqueue.c:3397
worker_thread+0xa53/0xfc0 kernel/workqueue.c:3478
kthread+0x388/0x470 kernel/kthread.c:436
ret_from_fork+0x514/0xb70 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Fixes: 0ae3eb7b4611 ("netdevsim: fib: Perform the route programming in a non-atomic context")
Fixes: c3852ef7f2f8 ("ipv4: fib: Replay events when registering FIB notifier")
Reported-by: syzbot+cb2aa2390ac024e25f5c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/6a290011.39669fcc.33b062.00b1.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260610061744.2030996-2-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-7.1-rc8).
Conflicts:
drivers/net/ethernet/wangxun/txgbe/txgbe_aml.c
f67aead16e85 ("net: txgbe: rework service event handling")
57d39faed4c9 ("net: txgbe: improve functions of AML 40G devices")
net/rds/info.c
512db8267b73 ("rds: mark snapshot pages dirty in rds_info_getsockopt()")
6e94eeb2a2a6 ("rds: convert to getsockopt_iter")
Adjacent changes:
include/net/sock.h
1ee90b77b727 ("net: guard timestamp cmsgs to real error queue skbs")
f0de88303d5e ("net: make is_skb_wmem() available to modules")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Merge cpufreq updates for 7.2:
- Fix a race between cpufreq suspend and CPU hotplug during system
shutdown (Tianxiang Chen)
- Avoid redundant target() calls for unchanged limits and fix a typo
in a comment in the cpufreq core (Viresh Kumar)
- Fix concurrency issues related to sysfs attributes access that affect
cpufreq governors using the common governor code (Zhongqiu Han)
- Simplify frequency limit handling in the conservative cpufreq
governor (Lifeng Zheng)
- Fix descriptions of the conservative governor freq_step tunable and
the ondemand governor sampling_down_factor tunable in the cpufreq
documentation (Pengjie Zhang)
- Fix use-after-free and double free during _OSC evaluation in the PCC
cpufreq driver (Yuho Choi)
- Rework the handling of policy min and max frequency values in the
cpufreq core to allow drivers to specify special initial values for
the scaling_min_freq and scaling_max_freq sysfs attributes (Pierre
Gondois)
- Add cpufreq scaling support for Qualcomm Shikra SoC (Taniya Das,
Imran Shaik).
- Improve the warning message on HWP-disabled hybrid processors printed
by the intel_pstate driver and sync policy->cur during CPU offline in
it (Yohei Kojima, Fushuai Wang)
- Drop cpufreq support for AMD Elan SC4* (Sean Young)
- Minor fixes for cpufreq drivers (Krzysztof Kozlowski, Akashdeep Kaur,
Hans Zhang, Guangshuo Li, Xueqin Luo)
- Clean up dead dependencies on X86 in the cpufreq Kconfig (Julian
Braha)
* pm-cpufreq: (25 commits)
cpufreq: Use policy->min/max init as QoS request
cpufreq: Remove driver default policy->min/max init
cpufreq: Set default policy->min/max values for all drivers
cpufreq: Extract cpufreq_policy_init_qos() function
cpufreq: Documentation: fix conservative governor freq_step description
cpufreq: ti: Add EPROBE_DEFER for K3 SoCs
cpufreq: qcom: Add cpufreq scaling support for Qualcomm Shikra SoC
dt-bindings: cpufreq: Document Qualcomm Shikra SoC EPSS
cpufreq: governor: Fix stale prev_cpu_nice spike when enabling ignore_nice_load
cpufreq: governor: Fix data races on per-CPU idle/nice baselines
cpufreq: intel_pstate: Improve warning message on HWP-disabled hybrid CPUs
cpufreq: elanfreq: Drop support for AMD Elan SC4*
cpufreq: clean up dead dependencies on X86 in Kconfig
cpufreq: conservative: Simplify frequency limit handling
cpufreq: Avoid redundant target() calls for unchanged limits
cpufreq: Fix typo in comment
cpufreq: intel_pstate: Sync policy->cur during CPU offline
cpufreq: Documentation: fix sampling_down_factor range
cpufreq: Fix hotplug-suspend race during reboot
cpufreq: pcc: fix use-after-free and double free in _OSC evaluation
...
|
|
A driver that has popped a handle from an FRMR pool can hit failures
that leave the handle in a state where it can't safely be returned
for reuse. The driver destroys the handle itself, but the pool has
no way to learn about it, so the in_use counter drifts upward.
Add ib_frmr_pool_drop to balance the pool's accounting in this case.
Every pop is now balanced by exactly one push or drop.
Fixes: 36680ef7bceb ("RDMA/mlx5: Switch from MR cache to FRMR pools")
Link: https://patch.msgid.link/r/20260610000145.820592-9-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Failure to push a handle to the pool, caused by ENOMEM on queue page
allocation, will trigger missing in_use counter update, skewing pool
state indefinitely.
Fix that by moving the handling of handle destruction in such case
into the FRMR code, ensuring the handle is either pushed to the pool
or destroyed inside the same function.
Adjust mlx5_ib call site accordingly.
Fixes: ce5df0b891ed ("IB/core: Introduce FRMR pools")
Link: https://patch.msgid.link/r/20260610000145.820592-8-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
l2cap_chan_timeout() runs asynchronously and accesses chan->conn. If
the connection is torn down while the timer is running or pending,
chan->conn can be freed, leading to a use-after-free when the timer
worker attempts to lock conn->lock:
| BUG: KASAN: slab-use-after-free in instrument_atomic_read_write include/linux/instrumented.h:112 [inline]
| BUG: KASAN: slab-use-after-free in atomic_long_try_cmpxchg_acquire include/linux/atomic/atomic-instrumented.h:4456 [inline]
| BUG: KASAN: slab-use-after-free in __mutex_trylock_fast kernel/locking/mutex.c:161 [inline]
| BUG: KASAN: slab-use-after-free in mutex_lock+0x4f/0xa0 kernel/locking/mutex.c:318
| Write of size 8 at addr ffff8881298d9550 by task kworker/2:1/83
|
| CPU: 2 UID: 0 PID: 83 Comm: kworker/2:1 Not tainted 7.1.0-rc6-next-20260601-dirty #6 PREEMPT(full)
| Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-debian-1.17.0-1 04/01/2014
| Workqueue: events l2cap_chan_timeout
| Call Trace:
| <TASK>
| instrument_atomic_read_write include/linux/instrumented.h:112 [inline]
| atomic_long_try_cmpxchg_acquire include/linux/atomic/atomic-instrumented.h:4456 [inline]
| __mutex_trylock_fast kernel/locking/mutex.c:161 [inline]
| mutex_lock+0x4f/0xa0 kernel/locking/mutex.c:318
| l2cap_chan_timeout+0x5d/0x1b0 net/bluetooth/l2cap_core.c:422
| process_one_work kernel/workqueue.c:3326 [inline]
| process_scheduled_works+0x7c8/0xfb0 kernel/workqueue.c:3409
| worker_thread+0x8a9/0xcf0 kernel/workqueue.c:3490
| kthread+0x346/0x430 kernel/kthread.c:436
| ret_from_fork+0x1a3/0x470 arch/x86/kernel/process.c:158
| ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
| </TASK>
|
| Allocated by task 320:
| l2cap_conn_add+0xa7/0x820 net/bluetooth/l2cap_core.c:7075
| l2cap_connect_cfm+0xdb/0xd70 net/bluetooth/l2cap_core.c:7452
| hci_connect_cfm include/net/bluetooth/hci_core.h:2139 [inline]
| hci_remote_features_evt+0x52f/0x9f0 net/bluetooth/hci_event.c:3760
| hci_event_func net/bluetooth/hci_event.c:7796 [inline]
| hci_event_packet+0x561/0xa70 net/bluetooth/hci_event.c:7847
| hci_rx_work+0x370/0x890 net/bluetooth/hci_core.c:4040
| process_one_work kernel/workqueue.c:3326 [inline]
| process_scheduled_works+0x7c8/0xfb0 kernel/workqueue.c:3409
| worker_thread+0x8a9/0xcf0 kernel/workqueue.c:3490
| kthread+0x346/0x430 kernel/kthread.c:436
| ret_from_fork+0x1a3/0x470 arch/x86/kernel/process.c:158
| ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
|
| Freed by task 322:
| hci_disconn_cfm include/net/bluetooth/hci_core.h:2154 [inline]
| hci_conn_hash_flush+0x101/0x1f0 net/bluetooth/hci_conn.c:2736
| hci_dev_close_sync+0x889/0xde0 net/bluetooth/hci_sync.c:5405
| hci_dev_do_close net/bluetooth/hci_core.c:502 [inline]
| hci_unregister_dev+0x1f7/0x370 net/bluetooth/hci_core.c:2679
| vhci_release+0x12a/0x180 drivers/bluetooth/hci_vhci.c:690
| __fput+0x369/0x890 fs/file_table.c:510
| task_work_run+0x160/0x1d0 kernel/task_work.c:233
| get_signal+0xf5b/0x1120 kernel/signal.c:2810
| arch_do_signal_or_restart+0x4d/0x600 arch/x86/kernel/signal.c:337
| __exit_to_user_mode_loop kernel/entry/common.c:64 [inline]
| exit_to_user_mode_loop+0x85/0x510 kernel/entry/common.c:98
| do_syscall_64+0x263/0x3d0 arch/x86/entry/syscall_64.c:100
| entry_SYSCALL_64_after_hwframe+0x77/0x7f
|
| The buggy address belongs to the object at ffff8881298d9400
| which belongs to the cache kmalloc-512 of size 512
| The buggy address is located 336 bytes inside of
| freed 512-byte region [ffff8881298d9400, ffff8881298d9600)
Fix it by having chan->conn hold a reference to l2cap_conn (via
l2cap_conn_get) when the channel is added to the connection, and
releasing it in the channel destructor. This ensures the l2cap_conn
remains alive as long as the channel exists.
A new FLAG_DEL channel flag is introduced to indicate that the channel
has been deleted from its connection. l2cap_chan_del() atomically sets
this flag using test_and_set_bit() instead of setting chan->conn to
NULL. All asynchronous workers (l2cap_chan_timeout, l2cap_ack_timeout,
l2cap_monitor_timeout, l2cap_retrans_timeout) and l2cap_chan_send()
check FLAG_DEL to determine whether the channel has been torn down,
rather than testing chan->conn for NULL.
Fixes: 8c8e620467a7 ("Bluetooth: L2CAP: use chan timer to close channels in cleanup_listen()")
Cc: <stable@vger.kernel.org>
Cc: Siwei Zhang <oss@fourdim.xyz>
Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Assisted-by: Gemini:gemini-3.1-pro-preview
Reported-by: https://sashiko.dev/#/patchset/20260521021249.3258069-1-oss%40fourdim.xyz
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Many bluetooth source files are missing SPDX-License-Identifier
lines. Add appropriate IDs to these files, and remove other
license lines from the headers.
Leave the warranty disclaimer in files where the license ID is
GPL-2.0 but the wording of the disclaimer is slightly different
from that of the GPL v2 disclaimer.
It is not different enough to cause licensing conflicts, but is
kept to honor the original contributors' legal intent.
Signed-off-by: Tim Bird <tim.bird@sony.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This adds support for using HCI_LE_Set_Host_Feature [v2] instead of v1
if LL Extented Features is supported and the controller supports the
command.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from IPsec and netfilter.
This is relatively small, mostly because we are a bit behind our PW
queue. I'm not aware of any pending regression.
Current release - regressions:
- netfilter: nf_tables_offload: drop device refcount on error
Previous releases - regressions:
- core: add pskb_may_pull() to skb_gro_receive_list()
- xfrm: iptfs: preserve shared-frag marker in iptfs_consume_frags()
- ipv6: fix a potential NPD in cleanup_prefix_route()
- ipv4: fix use-after-free caused by the fqdir_pre_exit() flush
- eth:
- bnxt_en: fix NULL pointer dereference
- emac: fix use-after-free during device removal
- octeontx2-af: fix memory leak in rvu_setup_hw_resources()
- tun: zero the whole vnet header in tun_put_user()
- sit: reload inner IPv6 header after GSO offloads
Previous releases - always broken:
- core: fix double-free in netdev_nl_bind_rx_doit()
- netfilter: nf_log: validate MAC header was set before dumping it
- xfrm: iptfs: fix ABBA deadlock in iptfs_destroy_state()
- tcp: restrict SO_ATTACH_FILTER to priv users
- mctp: usb: fix race between urb completion and rx_retry
cancellation
- eth:
- mlx5: fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list
- mvpp2: sync RX data at the hardware packet offset"
* tag 'net-7.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (64 commits)
octeontx2-af: fix IP fragment flag corruption on custom KPU profile load
ipv6: Fix a potential NPD in cleanup_prefix_route()
net: txgbe: initialize PHY interface to 0
net: txgbe: distinguish module types by checking identifier
net: txgbe: initialize module info buffer
net: mvpp2: build skb from XDP-adjusted data on XDP_PASS
net: mvpp2: refill RX buffers before XDP or skb use
net: mvpp2: limit XDP frame size to the RX buffer
net: mvpp2: sync RX data at the hardware packet offset
netfilter: nft_meta_bridge: fix stale stack leak via IIFHWADDR register
netfilter: nft_fib: fix stale stack leak via the OIFNAME register
netfilter: nft_exthdr: fix register tracking for F_PRESENT flag
netfilter: nf_log: validate MAC header was set before dumping it
netfilter: x_tables: avoid leaking percpu counter pointers
netfilter: nf_conntrack: destroy stale expectfn expectations on unregister
netfilter: nf_tables_offload: drop device refcount on error
netfilter: revalidate bridge ports
rds: mark snapshot pages dirty in rds_info_getsockopt()
ip6_vti: fix incorrect tunnel matching in vti6_tnl_lookup()
ptp: ocp: fix resource freeing order
...
|
|
|
|
Merge an ACPI processor driver update, two ACPI CPPC library updates
and ACPI PCI/CXL support updates for 7.2-rc1:
- Add cpuidle driver check in acpi_processor_register_idle_driver() to
avoid evaluating _CST unnecessarily (Tony W Wang-oc)
- Suppress UBSAN warning caused by field misuse during PCC-based
register access in the ACPI CPPC library (Jeremy Linton)
- Add support for CPPC v4 to the ACPI CPPC library (Sumit Gupta)
- Update the ACPI device enumeration code to honor _DEP for ACPI0016
PCI/CXL host bridges and make the ACPI PCI root driver clear _DEP
dependencies for PCI roots that have become operational (Chen Pei)
* acpi-processor:
ACPI: processor: Add cpuidle driver check in acpi_processor_register_idle_driver()
* acpi-cppc:
ACPI: CPPC: Suppress UBSAN warning caused by field misuse
ACPI: CPPC: Add support for CPPC v4
* acpi-pci:
ACPI: scan: Honor _DEP for ACPI0016 PCI/CXL host bridge
ACPI: PCI: Clear _DEP dependencies after PCI root bridge attach
|
|
Merge updates of core ACPI device drivers for 7.2-rc1:
- Fix multiple issues related to probe, removal and missing NVDIMM
device notifications in the ACPI NFIT driver (Rafael Wysocki)
- Add support for devres-based management of ACPI notify handlers to
the ACPI core (Rafael Wysocki)
- Switch multiple core ACPI device drivers (including the ACPI PAD,
ACPI video bus, ACPI HED, ACPI thermal zone, ACPI AC, ACPI battery,
and ACPI NFIT drivers) over to using devres-based resource management
during probe (Rafael Wysocki)
- Replace mutex_lock/unlock() with guard()/scoped_guard() in the ACPI
PMIC driver (Maxwell Doose)
- Fix message kref handling in the dead device path of the ACPI IPMI
address space handler (Yuho Choi)
- Use sysfs_emit() in idlecpus_show() in the ACPI processor aggregator
device (PAD) driver (Yury Norov)
- Clean up device_id_scheme initialization in the ACPI video bus driver
(Jean-Ralph Aviles)
* acpi-driver: (26 commits)
ACPI: IPMI: Fix message kref handling on dead device
ACPI: NFIT: core: Fix possible deadlock and missing notifications
ACPI: NFIT: core: Eliminate redundant local variable
ACPI: NFIT: core: Fix acpi_nfit_init() error cleanup
ACPI: NFIT: core: Fix possible NULL pointer dereference
ACPI: bus: Clean up devm_acpi_install_notify_handler()
ACPI: PAD: Use sysfs_emit() in idlecpus_show()
ACPI: video: Do not initialise device_id_scheme directly
ACPI: video: Switch over to devres-based resource management
ACPI: video: Use devm for video->entry and backlight cleanup
ACPI: video: Use devm action for freeing video devices
ACPI: video: Use devm action for video bus object cleanup
ACPI: video: Rearrange probe and remove code
ACPI: video: Reduce the number of auxiliary device dereferences
ACPI: PAD: Switch over to devres-based resource management
ACPI: PAD: Fix teardown ordering in acpi_pad_remove()
ACPI: PAD: Pass struct device pointer to acpi_pad_notify()
ACPI: PAD: Rearrange acpi_pad_notify()
ACPI: thermal: Switch over to devres-based resource management
ACPI: HED: Switch over to devres-based resource management
...
|
|
Merge ACPICA updates for 7.2-rc1 including the following changes:
- Add support for the Legacy Virtual Register (LVR) field in I2C serial
bus resource descriptors to ACPICA (Akhil R)
- Fix multiple issues related to bounds checks, input validation,
use-after-free, and integer overflow checks in the AML interpreter
in ACPICA (ikaros)
- Update the copyright year to 2026 in ACPICA files and make minor
changes related to ACPI 6.6 support (Pawel Chmielewski)
- Remove spurious precision from format used to dump parse trees in
ACPICA (David Laight)
- Add modern standby DSM GUIDs to ACPICA header files (Daniel Schaefer)
- Fix FADT 32/64X length mismatch warning in ACPICA (Abdelkader Boudih)
- Update D3hot/cold device power states definitions in ACPICA header
files (Aymeric Wibo)
- Fix NULL pointer dereference in acpi_ns_custom_package() (Weiming
Shi)
- Update ACPICA version to 20260408 (Saket Dumbre)
* acpica: (27 commits)
ACPICA: add boundary checks in two places
ACPICA: Add package limit checks in parser functions
ACPICA: Update version to 20260408
ACPICA: Update the copyright year to 2026
ACPICA: Remove spurious precision from format used to dump parse trees
ACPICA: Enhance OEM ID and Table ID validation in acpi_ex_load_table_op()
ACPICA: Fix NULL pointer dereference in acpi_ns_custom_package()
ACPICA: Enhance buffer validation in acpi_ut_walk_aml_resources()
ACPICA: Add validation for node in acpi_ns_build_normalized_path()
ACPICA: validate handler object type in two places
ACPICA: Improve argument parsing in acpi_ps_get_next_simple_arg()
ACPICA: Fix integer overflow in acpi_ex_opcode_3A_1T_1R() (mid_op)
ACPICA: Prevent adding invalid references
ACPICA: add boundary checks in acpi_ps_get_next_field()
ACPICA: validate byte_count in acpi_ps_get_next_package_length()
ACPICA: Fix use-after-free in acpi_ds_terminate_control_method()
ACPICA: fix I2C LVR item count in the conversion table
ACPICA: Mention the LVR bits
ACPICA: Change LVR to 8 bit value
ACPICA: Fetch LVR I2C resource descriptor
...
|
|
Signed-off-by: Tarun Sahu <tarunsahu@google.com>
Reviewed-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Link: https://patch.msgid.link/faec5df2a3e90240cde5897bae2250cd7d44aeac.1781170056.git.tarunsahu@google.com
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
|
|
The ASUS Keystone is a physical NFC-like dongle that slots into supported
ASUS laptops. The EC fires WMI notify code 0xB4 on insert/remove events.
Expose the current insert state via a sysfs attribute by querying WMI
device ID 0x00120091 (DSTS). This devid does not follow the standard DSTS
convention: PRESENCE_BIT (0x00010000) encodes the insert state rather than
feature presence, and STATUS_BIT is never set. Presence of a keystone slot
is detected by a successful DSTS call.
Reviewed-by: Denis Benato <denis.benato@linux.dev>
Signed-off-by: Dariusz Figzał <dariuszfigzal@gmail.com>
Link: https://patch.msgid.link/20260610164942.74956-1-dariuszfigzal@gmail.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Add the contended_release trace event. This tracepoint fires on the
holder side when a contended lock is released, complementing the
existing contention_begin/contention_end tracepoints which fire on the
waiter side.
This enables correlating lock hold time under contention with waiter
events by lock address.
Add trace_contended_release()/trace_call__contended_release() calls to
the slowpath unlock paths of sleepable locks: mutex, rtmutex, semaphore,
rwsem, percpu-rwsem, and RT-specific rwbase locks.
Where possible, trace_contended_release() fires before the lock is
released and before the waiter is woken. For some lock types, the
tracepoint fires after the release but before the wake. Making the
placement consistent across all lock types is not worth the added
complexity.
For reader/writer locks, the tracepoint fires for every reader releasing
while a writer is waiting, not only for the last reader.
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Usama Arif <usama.arif@linux.dev>
Link: https://patch.msgid.link/02f4f6c5ce6761e7f6587cf0ff2289d962ecddd4.1780506267.git.d@ilvokhin.com
|
|
Move the percpu_up_read() slowpath out of the inline function into a new
__percpu_up_read() to avoid binary size increase from adding a
tracepoint to an inlined function.
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Usama Arif <usama.arif@linux.dev>
Link: https://patch.msgid.link/3dd2a1b9ab4f469e1892766cb63f41d6b0f53d29.1780506267.git.d@ilvokhin.com
|
|
None of the trace events in lock.h reference anything from
linux/sched.h. Remove the unnecessary include.
Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Usama Arif <usama.arif@linux.dev>
Link: https://patch.msgid.link/85e14a0685c1d46c3e80de2dc01358a48169b263.1780506267.git.d@ilvokhin.com
|
|
Rate limiting is currently supported only for raw packet QPs, where
the packet pacing index is programmed into the SQC during SQ modify.
Extend rate limit support to UD and UC QPs by setting the pacing
index in the QPC during RTR2RTS and RTS2RTS transitions.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260524-packet-pacing-v1-3-3d79439f8d08@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Add the needed capabilities in mlx5_ifc to support packet pacing for UC
and UD QPs.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Edward Srouji <edwards@nvidia.com>
Link: https://patch.msgid.link/20260524-packet-pacing-v1-1-3d79439f8d08@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|