| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix kerneldocs for gpio-timberdale and gpio-nomadik
- clear the "requested" flag in error path in gpiod_request_commit()
- call of_xlate() if provided when setting up shared GPIOs
- handle pins shared by child firmware nodes of consumer devices
- fix return value check in gpio-qixis-fpga
- fix suspend on gpio-mxc
- fix gpio-microchip DT bindings
* tag 'gpio-fixes-for-v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
dt-bindings: gpio: fix microchip #interrupt-cells
gpio: shared: shorten the critical section in gpiochip_setup_shared()
gpio: mxc: map Both Edge pad wakeup to Rising Edge
gpio: qixis-fpga: Fix error handling for devm_regmap_init_mmio()
gpio: shared: handle pins shared by child nodes of devices
gpio: shared: call gpio_chip::of_xlate() if set
gpiolib: clear requested flag if line is invalid
gpio: nomadik: repair some kernel-doc comments
gpio: timberdale: repair kernel-doc comments
gpio: Fix resource leaks on errors in gpiochip_add_data_with_key()
|
|
Pull bpf fixes from Alexei Starovoitov:
- Fix register equivalence for pointers to packet (Alexei Starovoitov)
- Fix incorrect pruning due to atomic fetch precision tracking (Daniel
Borkmann)
- Fix grace period wait for bpf_link-ed tracepoints (Kumar Kartikeya
Dwivedi)
- Fix use-after-free of sockmap's sk->sk_socket (Kuniyuki Iwashima)
- Reject direct access to nullable PTR_TO_BUF pointers (Qi Tang)
- Reject sleepable kprobe_multi programs at attach time (Varun R
Mallya)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add more precision tracking tests for atomics
bpf: Fix incorrect pruning due to atomic fetch precision tracking
bpf: Reject sleepable kprobe_multi programs at attach time
bpf: reject direct access to nullable PTR_TO_BUF pointers
bpf: sockmap: Fix use-after-free of sk->sk_socket in sk_psock_verdict_data_ready().
bpf: Fix grace period wait for tracepoint bpf_link
bpf: Fix regsafe() for pointers to packet
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Add missing async markers to tegra
- Fix long hmac key DMA handling in caam
- Fix spurious ENOSPC errors in deflate
- Fix SG chaining in af_alg
- Do not use in-place process in algif_aead
- Fix out-of-place destination overflow in authencesn
* tag 'v7.0-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: authencesn - Do not place hiseq at end of dst for out-of-place decryption
crypto: algif_aead - Revert to operating out-of-place
crypto: af-alg - fix NULL pointer dereference in scatterwalk
crypto: deflate - fix spurious -ENOSPC
crypto: caam - fix overflow on long hmac keys
crypto: caam - fix DMA corruption on long hmac keys
crypto: tegra - Add missing CRYPTO_ALG_ASYNC
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"With fixes from wireless, bluetooth and netfilter included we're back
to each PR carrying 30%+ more fixes than in previous era.
The good news is that so far none of the "extra" fixes are themselves
causing real regressions. Not sure how much comfort that is.
Current release - fix to a fix:
- netdevsim: fix build if SKB_EXTENSIONS=n
- eth: stmmac: skip VLAN restore when VLAN hash ops are missing
Previous releases - regressions:
- wifi: iwlwifi: mvm: don't send a 6E related command when
not supported
Previous releases - always broken:
- some info leak fixes
- add missing clearing of skb->cb[] on ICMP paths from tunnels
- ipv6:
- flowlabel: defer exclusive option free until RCU teardown
- avoid overflows in ip6_datagram_send_ctl()
- mpls: add seqcount to protect platform_labels from OOB access
- bridge: improve safety of parsing ND options
- bluetooth: fix leaks, overflows and races in hci_sync
- netfilter: add more input validation, some to address bugs directly
some to prevent exploits from cooking up broken configurations
- wifi:
- ath: avoid poor performance due to stopping the wrong
aggregation session
- virt_wifi: remove SET_NETDEV_DEV to avoid use-after-free
- eth:
- fec: fix the PTP periodic output sysfs interface
- enetc: safely reinitialize TX BD ring when it has unsent frames"
* tag 'net-7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (95 commits)
eth: fbnic: Increase FBNIC_QUEUE_SIZE_MIN to 64
ipv6: avoid overflows in ip6_datagram_send_ctl()
net: hsr: fix VLAN add unwind on slave errors
net: hsr: serialize seq_blocks merge across nodes
vsock: initialize child_ns_mode_locked in vsock_net_init()
selftests/tc-testing: add tests for cls_fw and cls_flow on shared blocks
net/sched: cls_flow: fix NULL pointer dereference on shared blocks
net/sched: cls_fw: fix NULL pointer dereference on shared blocks
net/x25: Fix overflow when accumulating packets
net/x25: Fix potential double free of skb
bnxt_en: Restore default stat ctxs for ULP when resource is available
bnxt_en: Don't assume XDP is never enabled in bnxt_init_dflt_ring_mode()
bnxt_en: Refactor some basic ring setup and adjustment logic
net/mlx5: Fix switchdev mode rollback in case of failure
net/mlx5: Avoid "No data available" when FW version queries fail
net/mlx5: lag: Check for LAG device before creating debugfs
net: macb: properly unregister fixed rate clocks
net: macb: fix clk handling on PCI glue driver removal
virtio_net: clamp rss_max_key_size to NETDEV_RSS_KEY_LEN
net/sched: sch_netem: fix out-of-bounds access in packet corruption
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu fixes from Joerg Roedel:
- IOMMU-PT related compile breakage in for AMD driver
- IOTLB flushing behavior when unmapped region is larger than requested
due to page-sizes
- Fix IOTLB flush behavior with empty gathers
* tag 'iommu-fixes-v7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommupt/amdv1: mark amdv1pt_install_leaf_entry as __always_inline
iommupt: Fix short gather if the unmap goes into a large mapping
iommu: Do not call drivers for empty gathers
|
|
IPSET_ATTR_NAME and IPSET_ATTR_NAMEREF are of NLA_STRING type, they
cannot be treated like a c-string.
They either have to be switched to NLA_NUL_STRING, or the compare
operations need to use the nla functions.
Fixes: f830837f0eed ("netfilter: ipset: list:set set type support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Recently, tracepoints were switched from using disabled preemption
(which acts as RCU read section) to SRCU-fast when they are not
faultable. This means that to do a proper grace period wait for programs
running in such tracepoints, we must use SRCU's grace period wait.
This is only for non-faultable tracepoints, faultable ones continue
using RCU Tasks Trace.
However, bpf_link_free() currently does call_rcu() for all cases when
the link is non-sleepable (hence, for tracepoints, non-faultable). Fix
this by doing a call_srcu() grace period wait.
As far RCU Tasks Trace gp -> RCU gp chaining is concerned, it is deemed
unnecessary for tracepoint programs. The link and program are either
accessed under RCU Tasks Trace protection, or SRCU-fast protection now.
The earlier logic of chaining both RCU Tasks Trace and RCU gp waits was
to generalize the logic, even if it conceded an extra RCU gp wait,
however that is unnecessary for tracepoints even before this change.
In practice no cost was paid since rcu_trace_implies_rcu_gp() was always
true. Hence we need not chaining any RCU gp after the SRCU gp.
For instance, in the non-faultable raw tracepoint, the RCU read section
of the program in __bpf_trace_run() is enclosed in the SRCU gp, likewise
for faultable raw tracepoint, the program is under the RCU Tasks Trace
protection. Hence, the outermost scope can be waited upon to ensure
correctness.
Also, sleepable programs cannot be attached to non-faultable
tracepoints, so whenever program or link is sleepable, only RCU Tasks
Trace protection is being used for the link and prog.
Fixes: a46023d5616e ("tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast")
Reviewed-by: Sun Jian <sun.jian.kdev@gmail.com>
Reviewed-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20260331211021.1632902-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix cgroup rmdir racing with dying tasks.
Deferred task cgroup unlink introduced a window where cgroup.procs
is empty but the cgroup is still populated, causing rmdir to fail
with -EBUSY and selftest failures.
Make rmdir wait for dying tasks to fully leave and fix selftests to
not depend on synchronous populated updates.
- Fix cpuset v1 task migration failure from empty cpusets under strict
security policies.
When CPU hotplug removes the last CPU from a v1 cpuset, tasks must be
migrated to an ancestor without a security_task_setscheduler() check
that would block the migration.
* tag 'cgroup-for-7.0-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/cpuset: Skip security check for hotplug induced v1 task migration
cgroup/cpuset: Simplify setsched decision check in task iteration loop of cpuset_can_attach()
cgroup: Fix cgroup_drain_dying() testing the wrong condition
selftests/cgroup: Don't require synchronous populated update on task exit
cgroup: Wait for dying tasks to leave on rmdir
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull udf fix from Jan Kara:
"Fix for a race in UDF that can lead to memory corruption"
* tag 'fs_for_v7.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Fix race between file type conversion and writeback
mpage: Provide variant of mpage_writepages() with own optional folio handler
|
|
This mostly reverts commit 72548b093ee3 except for the copying of
the associated data.
There is no benefit in operating in-place in algif_aead since the
source and destination come from different mappings. Get rid of
all the complexity added for in-place operation and just copy the
AD directly.
Fixes: 72548b093ee3 ("crypto: algif_aead - copy AAD from src to dst")
Reported-by: Taeyang Lee <0wn@theori.io>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix netfs_limit_iter() hitting BUG() when an ITER_KVEC iterator
reaches it via core dump writes to 9P filesystems. Add ITER_KVEC
handling following the same pattern as the existing ITER_BVEC code.
- Fix a NULL pointer dereference in the netfs unbuffered write retry
path when the filesystem (e.g., 9P) doesn't set the prepare_write
operation.
- Clear I_DIRTY_TIME in sync_lazytime for filesystems implementing
->sync_lazytime. Without this the flag stays set and may cause
additional unnecessary calls during inode deactivation.
- Increase tmpfs size in mount_setattr selftests. A recent commit
bumped the ext4 image size to 2 GB but didn't adjust the tmpfs
backing store, so mkfs.ext4 fails with ENOSPC writing metadata.
- Fix an invalid folio access in iomap when i_blkbits matches the folio
size but differs from the I/O granularity. The cur_folio pointer
would not get invalidated and iomap_read_end() would still be called
on it despite the IO helper owning it.
- Fix hash_name() docstring.
- Fix read abandonment during netfs retry where the subreq variable
used for abandonment could be uninitialized on the first pass or
point to a deleted subrequest on later passes.
- Don't block sync for filesystems with no data integrity guarantees.
Add a SB_I_NO_DATA_INTEGRITY superblock flag replacing the per-inode
AS_NO_DATA_INTEGRITY mapping flag so sync kicks off writeback but
doesn't wait for flusher threads. This fixes a suspend-to-RAM hang on
fuse-overlayfs where the flusher thread blocks when the fuse daemon
is frozen.
- Fix a lockdep splat in iomap when reads fail. iomap_read_end_io()
invokes fserror_report() which calls igrab() taking i_lock in hardirq
context while i_lock is normally held with interrupts enabled. Kick
failed read handling to a workqueue.
- Remove the redundant netfs_io_stream::front member and use
stream->subrequests.next instead, fixing a potential issue in the
direct write code path.
* tag 'vfs-7.0-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
netfs: Fix the handling of stream->front by removing it
iomap: fix lockdep complaint when reads fail
writeback: don't block sync for filesystems with no data integrity guarantees
netfs: Fix read abandonment during retry
vfs: fix docstring of hash_name()
iomap: fix invalid folio access when i_blkbits differs from I/O granularity
selftests/mount_setattr: increase tmpfs size for idmapped mount tests
fs: clear I_DIRTY_TIME in sync_lazytime
netfs: Fix NULL pointer dereference in netfs_unbuffered_write() on retry
netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull futex fixes from Ingo Molnar:
- Tighten up the sys_futex_requeue() ABI a bit, to disallow dissimilar
futex flags and potential UaF access (Peter Zijlstra)
- Fix UaF between futex_key_to_node_opt() and vma_replace_policy()
(Hao-Yu Yang)
- Clear stale exiting pointer in futex_lock_pi() retry path, which
triggered a warning (and potential misbehavior) in stress-testing
(Davidlohr Bueso)
* tag 'locking-urgent-2026-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Clear stale exiting pointer in futex_lock_pi() retry path
futex: Fix UaF between futex_key_to_node_opt() and vma_replace_policy()
futex: Require sys_futex_requeue() to have identical flags
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes. There's one that stands out in size as it fixes an
edge case in fsync.
- fix issue on fsync where file with zero size appears as a non-zero
after log replay
- in zlib compression, handle a crash when data alignment causes
folio reference issues
- fix possible crash with enabled tracepoints on a overlayfs mount
- handle device stats update error
- on zoned filesystems, fix kobject leak on sub-block groups
- fix super block offset in an error message in validation"
* tag 'for-7.0-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix lost error when running device stats on multiple devices fs
btrfs: tracepoints: get correct superblock from dentry in event btrfs_sync_file()
btrfs: zlib: handle page aligned compressed size correctly
btrfs: fix leak of kobject name for sub-group space_info
btrfs: fix zero size inode with non-zero size after log replay
btrfs: fix super block offset in error message in btrfs_validate_super()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"10 hotfixes. 8 are cc:stable. 9 are for MM.
There's a 3-patch series of DAMON fixes from Josh Law and SeongJae
Park. The rest are singletons - please see the changelogs for details"
* tag 'mm-hotfixes-stable-2026-03-28-10-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/mseal: update VMA end correctly on merge
bug: avoid format attribute warning for clang as well
mm/pagewalk: fix race between concurrent split and refault
mm/memory: fix PMD/PUD checks in follow_pfnmap_start()
mm/damon/sysfs: check contexts->nr in repeat_call_fn
mm/damon/sysfs: check contexts->nr before accessing contexts_arr[0]
mm/damon/sysfs: fix param_ctx leak on damon_sysfs_new_test_ctx() failure
mm/swap: fix swap cache memcg accounting
MAINTAINERS, mailmap: update email address for Harry Yoo
mm/huge_memory: fix folio isn't locked in softleaf_to_folio()
|
|
On arm64 server, we found folio that get from migration entry isn't locked
in softleaf_to_folio(). This issue triggers when mTHP splitting and
zap_nonpresent_ptes() races, and the root cause is lack of memory barrier
in softleaf_to_folio(). The race is as follows:
CPU0 CPU1
deferred_split_scan() zap_nonpresent_ptes()
lock folio
split_folio()
unmap_folio()
change ptes to migration entries
__split_folio_to_order() softleaf_to_folio()
set flags(including PG_locked) for tail pages folio = pfn_folio(softleaf_to_pfn(entry))
smp_wmb() VM_WARN_ON_ONCE(!folio_test_locked(folio))
prep_compound_page() for tail pages
In __split_folio_to_order(), smp_wmb() guarantees page flags of tail pages
are visible before the tail page becomes non-compound. smp_wmb() should
be paired with smp_rmb() in softleaf_to_folio(), which is missed. As a
result, if zap_nonpresent_ptes() accesses migration entry that stores tail
pfn, softleaf_to_folio() may see the updated compound_head of tail page
before page->flags.
This issue will trigger VM_WARN_ON_ONCE() in pfn_swap_entry_folio()
because of the race between folio split and zap_nonpresent_ptes()
leading to a folio incorrectly undergoing modification without a folio
lock being held.
This is a BUG_ON() before commit 93976a20345b ("mm: eliminate further
swapops predicates"), which in merged in v6.19-rc1.
To fix it, add missing smp_rmb() if the softleaf entry is migration entry
in softleaf_to_folio() and softleaf_to_page().
[tujinjiang@huawei.com: update function name and comments]
Link: https://lkml.kernel.org/r/20260321075214.3305564-1-tujinjiang@huawei.com
Link: https://lkml.kernel.org/r/20260319012541.4158561-1-tujinjiang@huawei.com
Fixes: e9b61f19858a ("thp: reintroduce split_huge_page()")
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nanyong Sun <sunnanyong@huawei.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"There are two core fixes here. One is from Johan dealing with an issue
introduced by a devm_ API usage update causing things to be freed
earlier than they had earlier when we fail to register a device,
another from Danilo avoids unlocked acccess to data by converting to
use a driver core API.
We also have a few relatively minor driver specific fixes"
* tag 'spi-fix-v7.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: spi-fsl-lpspi: fix teardown order issue (UAF)
spi: fix use-after-free on managed registration failure
spi: use generic driver_override infrastructure
spi: meson-spicc: Fix double-put in remove path
spi: sn-f-ospi: Use devm_mutex_init() to simplify code
spi: sn-f-ospi: Fix resource leak in f_ospi_probe()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This became slightly big partly due to my time off in the last week.
But all changes are about device-specific fixes, so it should be
safely applicable.
ASoC:
- Fix double free in sma1307
- Fix uninitialized variables in simple-card-utils/imx-card
- Address clock leaks and error propagation in ADAU1372
- Add DMI quirks and ACP/SDW support for ASUS
- Fix Intel CATPT DMA mask
- Fix SOF topology parsing
- Fix DT bindings for RK3576 SPDIF, STM32 SAI and WCD934x
HD-audio:
- Quirks for Lenovo, ASUS, and various HP models, as well as
a speaker pop fix on Star Labs StarFighter
- Revert MSI X870E Tomahawk denylist again
USB-Audio:
- Fix distorted audio on Focusrite Scarlett 2i2/2i4 1st Gen
- Add iface reset quirk for AB17X
- Update Qualcomm USB audio Kconfig dependencies and license
Misc:
- Fix minor compile warnings for firewire and asihpi drivers"
* tag 'sound-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
Revert "ALSA: hda/intel: Add MSI X870E Tomahawk to denylist"
ALSA: usb-audio: Add iface reset and delay quirk for AB17X USB Audio
ALSA: hda/realtek: add HP Laptop 15-fd0xxx mute LED quirk
ALSA: usb-audio: Exclude Scarlett 2i4 1st Gen from SKIP_IFACE_SETUP
ALSA: hda/realtek: Add mute LED quirk for HP Pavilion 15-eg0xxx
ALSA: hda/realtek - Fixed Speaker Mute LED for HP EliteBoard G1a platform
ASoC: SOF: ipc4-topology: Allow bytes controls without initial payload
ASoC: adau1372: Fix clock leak on PLL lock failure
ASoC: adau1372: Fix unchecked clk_prepare_enable() return value
ASoC: SDCA: fix finding wrong entity
ASoC: SDCA: remove the max count of initialization table
ASoC: codecs: wcd934x: fix typo in dt parsing
ASoC: dt-bindings: stm32: Fix incorrect compatible string in stm32h7-sai match
ASoC: Intel: catpt: Fix the device initialization
ASoC: amd: acp: add ASUS HN7306EA quirk for legacy SDW machine
ASoC: SOF: topology: reject invalid vendor array size in token parser
ASoC: tas2781: Add null check for calibration data
ALSA: asihpi: avoid write overflow check warning
ASoC: fsl: imx-card: initialize playback_only and capture_only
ASoC: simple-card-utils: Check value of is_playback_only and is_capture_only
...
|
|
Some filesystems need to treat some folios specially (for example for
inodes with inline data). Doing the handling in their .writepages method
in a race-free manner results in duplicating some of the writeback
internals. So provide generalized version of mpage_writepages() that
allows filesystem to provide a handler called for each folio which can
handle the folio in a special way.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260326140635.15895-3-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
An empty gather is coded with start=U64_MAX, end=0 and several drivers go
on to convert that to a size with:
end - start + 1
Which gives 2 for an empty gather. This then causes Weird Stuff to
happen (for example an UBSAN splat in VT-d) that is hopefully harmless,
but maybe not.
Prevent drivers from being called right in iommu_iotlb_sync().
Auditing shows that AMD, Intel, Mediatek and RSIC-V drivers all do things
on these empty gathers.
Further, there are several callers that can trigger empty gathers,
especially in unusual conditions. For example iommu_map_nosync() will call
a 0 size unmap on some error paths. Also in VFIO, iommupt and other
places.
Cc: stable@vger.kernel.org
Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Closes: https://lore.kernel.org/r/11145826.aFP6jjVeTY@jkrzyszt-mobl2.ger.corp.intel.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
__skb_ext_put() is not declared if SKB_EXTENSIONS is not enabled, which
causes a build error:
drivers/net/netdevsim/netdev.c: In function 'nsim_forward_skb':
drivers/net/netdevsim/netdev.c:114:25: error: implicit declaration of function '__skb_ext_put'; did you mean 'skb_ext_put'? [-Werror=implicit-function-declaration]
114 | __skb_ext_put(psp_ext);
| ^~~~~~~~~~~~~
| skb_ext_put
cc1: some warnings being treated as errors
Add a stub to fix the build.
Fixes: 7d9351435ebb ("netdevsim: drop PSP ext ref on forward failure")
Signed-off-by: Qingfang Deng <dqfext@gmail.com>
Link: https://patch.msgid.link/20260324140857.783-1-dqfext@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The RCU-protected codepaths (mpls_forward, mpls_dump_routes) can have
an inconsistent view of platform_labels vs platform_label in case of a
concurrent resize (resize_platform_label_table, under
platform_mutex). This can lead to OOB accesses.
This patch adds a seqcount, so that we get a consistent snapshot.
Note that mpls_label_ok is also susceptible to this, so the check
against RTA_DST in rtm_to_route_config, done outside platform_mutex,
is not sufficient. This value gets passed to mpls_label_ok once more
in both mpls_route_add and mpls_route_del, so there is no issue, but
that additional check must not be removed.
Reported-by: Yuan Tan <tanyuan98@outlook.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Fixes: 7720c01f3f590 ("mpls: Add a sysctl to control the size of the mpls label table")
Fixes: dde1b38e873c ("mpls: Convert mpls_dump_routes() to RCU.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://patch.msgid.link/cd8fca15e3eb7e212b094064cd83652e20fd9d31.1774284088.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from Bluetooth, CAN, IPsec and Netfilter.
Notably, this includes the fix for the Bluetooth regression that you
were notified about. I'm not aware of any other pending regressions.
Current release - regressions:
- bluetooth:
- fix stack-out-of-bounds read in l2cap_ecred_conn_req
- fix regressions caused by reusing ident
- netfilter: revisit array resize logic
- eth: ice: set max queues in alloc_etherdev_mqs()
Previous releases - regressions:
- core: correctly handle tunneled traffic on IPV6_CSUM GSO fallback
- bluetooth:
- fix dangling pointer on mgmt_add_adv_patterns_monitor_complete
- fix deadlock in l2cap_conn_del()
- sched: codel: fix stale state for empty flows in fq_codel
- ipv6: remove permanent routes from tb6_gc_hlist when all exceptions expire.
- xfrm: fix skb_put() panic on non-linear skb during reassembly
- openvswitch:
- avoid releasing netdev before teardown completes
- validate MPLS set/set_masked payload length
- eth: iavf: fix out-of-bounds writes in iavf_get_ethtool_stats()
Previous releases - always broken:
- bluetooth: fix null-ptr-deref on l2cap_sock_ready_cb
- udp: fix wildcard bind conflict check when using hash2
- netfilter: fix use of uninitialized rtp_addr in process_sdp
- tls: Purge async_hold in tls_decrypt_async_wait()
- xfrm:
- prevent policy_hthresh.work from racing with netns teardown
- fix skb leak with espintcp and async crypto
- smc: fix double-free of smc_spd_priv when tee() duplicates splice pipe buffer
- can:
- add missing error handling to call can_ctrlmode_changelink()
- fix OOB heap access in cgw_csum_crc8_rel()
- eth:
- mana: fix use-after-free in add_adev() error path
- virtio-net: fix for VIRTIO_NET_F_GUEST_HDRLEN
- bcmasp: fix double free of WoL irq"
* tag 'net-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (90 commits)
net: macb: use the current queue number for stats
netfilter: ctnetlink: use netlink policy range checks
netfilter: nf_conntrack_sip: fix use of uninitialized rtp_addr in process_sdp
netfilter: nf_conntrack_expect: skip expectations in other netns via proc
netfilter: nf_conntrack_expect: store netns and zone in expectation
netfilter: ctnetlink: ensure safe access to master conntrack
netfilter: nf_conntrack_expect: use expect->helper
netfilter: nf_conntrack_expect: honor expectation helper field
netfilter: nft_set_rbtree: revisit array resize logic
netfilter: ip6t_rt: reject oversized addrnr in rt_mt6_check()
netfilter: nfnetlink_log: fix uninitialized padding leak in NFULA_PAYLOAD
tls: Purge async_hold in tls_decrypt_async_wait()
selftests: netfilter: nft_concat_range.sh: add check for flush+reload bug
netfilter: nft_set_pipapo_avx2: don't return non-matching entry on expiry
Bluetooth: btusb: clamp SCO altsetting table indices
Bluetooth: L2CAP: Fix ERTM re-init and zero pdu_len infinite loop
Bluetooth: L2CAP: Fix deadlock in l2cap_conn_del()
Bluetooth: btintel: serialize btintel_hw_error() with hci_req_sync_lock
Bluetooth: L2CAP: Fix send LE flow credits in ACL link
net: mana: fix use-after-free in add_adev() error path
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
"A set of fixes for DMA-mapping subsystem, which resolve false-
positive warnings from KMSAN and DMA-API debug (Shigeru Yoshida
and Leon Romanovsky) as well as a simple build fix (Miguel Ojeda)"
* tag 'dma-mapping-7.0-2026-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-mapping: add missing `inline` for `dma_free_attrs`
mm/hmm: Indicate that HMM requires DMA coherency
RDMA/umem: Tell DMA mapping that UMEM requires coherency
iommu/dma: add support for DMA_ATTR_REQUIRE_COHERENT attribute
dma-direct: prevent SWIOTLB path when DMA_ATTR_REQUIRE_COHERENT is set
dma-mapping: Introduce DMA require coherency attribute
dma-mapping: Clarify valid conditions for CPU cache line overlap
dma-mapping: handle DMA_ATTR_CPU_CACHE_CLEAN in trace output
dma-debug: Allow multiple invocations of overlapping entries
dma: swiotlb: add KMSAN annotations to swiotlb_bounce()
|
|
During futex_key_to_node_opt() execution, vma->vm_policy is read under
speculative mmap lock and RCU. Concurrently, mbind() may call
vma_replace_policy() which frees the old mempolicy immediately via
kmem_cache_free().
This creates a race where __futex_key_to_node() dereferences a freed
mempolicy pointer, causing a use-after-free read of mpol->mode.
[ 151.412631] BUG: KASAN: slab-use-after-free in __futex_key_to_node (kernel/futex/core.c:349)
[ 151.414046] Read of size 2 at addr ffff888001c49634 by task e/87
[ 151.415969] Call Trace:
[ 151.416732] __asan_load2 (mm/kasan/generic.c:271)
[ 151.416777] __futex_key_to_node (kernel/futex/core.c:349)
[ 151.416822] get_futex_key (kernel/futex/core.c:374 kernel/futex/core.c:386 kernel/futex/core.c:593)
Fix by adding rcu to __mpol_put().
Fixes: c042c505210d ("futex: Implement FUTEX2_MPOL")
Reported-by: Hao-Yu Yang <naup96721@gmail.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hao-Yu Yang <naup96721@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Link: https://patch.msgid.link/20260324174418.GB1850007@noisy.programming.kicks-ass.net
|
|
The netfs_io_stream::front member is meant to point to the subrequest
currently being collected on a stream, but it isn't actually used this way
by direct write (which mostly ignores it). However, there's a tracepoint
which looks at it. Further, stream->front is actually redundant with
stream->subrequests.next.
Fix the potential problem in the direct code by just removing the member
and using stream->subrequests.next instead, thereby also simplifying the
code.
Fixes: a0b4c7a49137 ("netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence")
Reported-by: Paulo Alcantara <pc@manguebit.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/4158599.1774426817@warthog.procyon.org.uk
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Replace manual range and mask validations with netlink policy
annotations in ctnetlink code paths, so that the netlink core rejects
invalid values early and can generate extack errors.
- CTA_PROTOINFO_TCP_STATE: reject values > TCP_CONNTRACK_SYN_SENT2 at
policy level, removing the manual >= TCP_CONNTRACK_MAX check.
- CTA_PROTOINFO_TCP_WSCALE_ORIGINAL/REPLY: reject values > TCP_MAX_WSCALE
(14). The normal TCP option parsing path already clamps to this value,
but the ctnetlink path accepted 0-255, causing undefined behavior when
used as a u32 shift count.
- CTA_FILTER_ORIG_FLAGS/REPLY_FLAGS: use NLA_POLICY_MASK with
CTA_FILTER_F_ALL, removing the manual mask checks.
- CTA_EXPECT_FLAGS: use NLA_POLICY_MASK with NF_CT_EXPECT_MASK, adding
a new mask define grouping all valid expect flags.
Extracted from a broader nf-next patch by Florian Westphal, scoped to
ctnetlink for the fixes tree.
Fixes: c8e2078cfe41 ("[NETFILTER]: ctnetlink: add support for internal tcp connection tracking flags handling")
Signed-off-by: David Carlier <devnexen@gmail.com>
Co-developed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
__nf_ct_expect_find() and nf_ct_expect_find_get() are called under
rcu_read_lock() but they dereference the master conntrack via
exp->master.
Since the expectation does not hold a reference on the master conntrack,
this could be dying conntrack or different recycled conntrack than the
real master due to SLAB_TYPESAFE_RCU.
Store the netns, the master_tuple and the zone in struct
nf_conntrack_expect as a safety measure.
This patch is required by the follow up fix not to dump expectations
that do not belong to this netns.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Holding reference on the expectation is not sufficient, the master
conntrack object can just go away, making exp->master invalid.
To access exp->master safely:
- Grab the nf_conntrack_expect_lock, this gets serialized with
clean_from_lists() which also holds this lock when the master
conntrack goes away.
- Hold reference on master conntrack via nf_conntrack_find_get().
Not so easy since the master tuple to look up for the master conntrack
is not available in the existing problematic paths.
This patch goes for extending the nf_conntrack_expect_lock section
to address this issue for simplicity, in the cases that are described
below this is just slightly extending the lock section.
The add expectation command already holds a reference to the master
conntrack from ctnetlink_create_expect().
However, the delete expectation command needs to grab the spinlock
before looking up for the expectation. Expand the existing spinlock
section to address this to cover the expectation lookup. Note that,
the nf_ct_expect_iterate_net() calls already grabs the spinlock while
iterating over the expectation table, which is correct.
The get expectation command needs to grab the spinlock to ensure master
conntrack does not go away. This also expands the existing spinlock
section to cover the expectation lookup too. I needed to move the
netlink skb allocation out of the spinlock to keep it GFP_KERNEL.
For the expectation events, the IPEXP_DESTROY event is already delivered
under the spinlock, just move the delivery of IPEXP_NEW under the
spinlock too because the master conntrack event cache is reached through
exp->master.
While at it, add lockdep notations to help identify what codepaths need
to grab the spinlock.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The expectation helper field is mostly unused. As a result, the
netfilter codebase relies on accessing the helper through exp->master.
Always set on the expectation helper field so it can be used to reach
the helper.
nf_ct_expect_init() is called from packet path where the skb owns
the ct object, therefore accessing exp->master for the newly created
expectation is safe. This saves a lot of updates in all callsites
to pass the ct object as parameter to nf_ct_expect_init().
This is a preparation patches for follow up fixes.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Jihed Chaibi <jihed.chaibi.dev@gmail.com> says:
adau1372_set_power() had two related error handling issues in its enable
path: clk_prepare_enable() was called but its return value discarded, and
adau1372_enable_pll() was a void function that silently swallowed lock
failures, leaving mclk enabled and adau1372->enabled set to true despite
the device being in a broken state.
Patch 1 fixes the unchecked clk_prepare_enable() by making
adau1372_set_power() return int and propagating the error.
Patch 2 converts adau1372_enable_pll() to return int and adds a full
unwind in adau1372_set_power() if PLL lock fails, reversing the regcache,
GPIO power-down, and clock state.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU fixes from Boqun Feng:
"Fix a regression introduced by commit c27cea4416a3 ("rcu: Re-implement
RCU Tasks Trace in terms of SRCU-fast"): BPF contexts can run with
preemption disabled or scheduler locks held, so call_srcu() must work
in all such contexts.
Fix this by converting SRCU's spinlocks to raw spinlocks and avoiding
scheduler lock acquisition in call_srcu() by deferring to an irq_work
(similar to call_rcu_tasks_generic()), for both tree SRCU and tiny
SRCU.
Also fix a follow-on lockdep splat caused by srcu_node allocation
under the newly introduced raw spinlock by deferring the allocation to
grace-period worker context"
* tag 'rcu-fixes.v7.0-20260325a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
srcu: Use irq_work to start GP in tiny SRCU
rcu: Use an intermediate irq_work to start process_srcu()
srcu: Push srcu_node allocation to GP when non-preemptible
srcu: Use raw spinlocks so call_srcu() can be used under preempt_disable()
|
|
Tiny SRCU's srcu_gp_start_if_needed() directly calls schedule_work(),
which acquires the workqueue pool->lock.
This causes a lockdep splat when call_srcu() is called with a scheduler
lock held, due to:
call_srcu() [holding pi_lock]
srcu_gp_start_if_needed()
schedule_work() -> pool->lock
workqueue_init() / create_worker() [holding pool->lock]
wake_up_process() -> try_to_wake_up() -> pi_lock
Also add irq_work_sync() to cleanup_srcu_struct() to prevent a
use-after-free if a queued irq_work fires after cleanup begins.
Tested with rcutorture SRCU-T and no lockdep warnings.
[ Thanks to Boqun for similar fix in patch "rcu: Use an intermediate irq_work
to start process_srcu()" ]
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun@kernel.org>
|
|
Since commit c27cea4416a3 ("rcu: Re-implement RCU Tasks Trace in terms
of SRCU-fast") we switched to SRCU in BPF. However as BPF instrument can
happen basically everywhere (including where a scheduler lock is held),
call_srcu() now needs to avoid acquiring scheduler lock because
otherwise it could cause deadlock [1]. Fix this by following what the
previous RCU Tasks Trace did: using an irq_work to delay the queuing of
the work to start process_srcu().
[boqun: Apply Joel's feedback]
[boqun: Apply Andrea's test feedback]
Reported-by: Andrea Righi <arighi@nvidia.com>
Closes: https://lore.kernel.org/all/abjzvz_tL_siV17s@gpd4/
Fixes: commit c27cea4416a3 ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast")
Link: https://lore.kernel.org/rcu/3c4c5a29-24ea-492d-aeee-e0d9605b4183@nvidia.com/ [1]
Suggested-by: Zqiang <qiang.zhang@linux.dev>
Tested-by: Andrea Righi <arighi@nvidia.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Tested-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Boqun Feng <boqun@kernel.org>
|
|
Tree SRCU has used non-raw spinlocks for many years, motivated by a desire
to avoid unnecessary real-time latency and the absence of any reason to
use raw spinlocks. However, the recent use of SRCU in tracing as the
underlying implementation of RCU Tasks Trace means that call_srcu()
is invoked from preemption-disabled regions of code, which in turn
requires that any locks acquired by call_srcu() or its callees must be
raw spinlocks.
This commit therefore converts SRCU's spinlocks to raw spinlocks.
[boqun: Add Fixes tag]
Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Fixes: c27cea4416a3 ("rcu: Re-implement RCU Tasks Trace in terms of SRCU-fast")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Boqun Feng <boqun@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
The number of the initialization table may exceed 2048.
Therefore, this patch removes the limitation and allows the driver to
allocate memory dynamically based on the size of the initialization table.
Signed-off-by: Shuming Fan <shumingf@realtek.com>
Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Link: https://patch.msgid.link/20260325092017.3221640-1-shumingf@realtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Under an UML build for an upcoming series [1], I got `-Wstatic-in-inline`
for `dma_free_attrs`:
BINDGEN rust/bindings/bindings_generated.rs - due to target missing
In file included from rust/helpers/helpers.c:59:
rust/helpers/dma.c:17:2: warning: static function 'dma_free_attrs' is used in an inline function with external linkage [-Wstatic-in-inline]
17 | dma_free_attrs(dev, size, cpu_addr, dma_handle, attrs);
| ^
rust/helpers/dma.c:12:1: note: use 'static' to give inline function 'rust_helper_dma_free_attrs' internal linkage
12 | __rust_helper void rust_helper_dma_free_attrs(struct device *dev, size_t size,
| ^
| static
The issue is that `dma_free_attrs` was not marked `inline` when it was
introduced alongside the rest of the stubs.
Thus mark it.
Fixes: ed6ccf10f24b ("dma-mapping: properly stub out the DMA API for !CONFIG_HAS_DMA")
Closes: https://lore.kernel.org/rust-for-linux/20260322194616.89847-1-ojeda@kernel.org/ [1]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20260325015548.70912-1-ojeda@kernel.org
|
|
When codel_dequeue() finds an empty queue, it resets vars->dropping
but does not reset vars->first_above_time. The reference CoDel
algorithm (Nichols & Jacobson, ACM Queue 2012) resets both:
dodeque_result codel_queue_t::dodeque(time_t now) {
...
if (r.p == NULL) {
first_above_time = 0; // <-- Linux omits this
}
...
}
Note that codel_should_drop() does reset first_above_time when called
with a NULL skb, but codel_dequeue() returns early before ever calling
codel_should_drop() in the empty-queue case. The post-drop code paths
do reach codel_should_drop(NULL) and correctly reset the timer, so a
dropped packet breaks the cycle -- but the next delivered packet
re-arms first_above_time and the cycle repeats.
For sparse flows such as ICMP ping (one packet every 200ms-1s), the
first packet arms first_above_time, the flow goes empty, and the
second packet arrives after the interval has elapsed and gets dropped.
The pattern repeats, producing sustained loss on flows that are not
actually congested.
Test: veth pair, fq_codel, BQL disabled, 30000 iptables rules in the
consumer namespace (NAPI-64 cycle ~14ms, well above fq_codel's 5ms
target), ping at 5 pps under UDP flood:
Before fix: 26% ping packet loss
After fix: 0% ping packet loss
Fix by resetting first_above_time to zero in the empty-queue path
of codel_dequeue(), matching the reference algorithm.
Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM")
Fixes: d068ca2ae2e6 ("codel: split into multiple files")
Co-developed-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Reported-by: Chris Arges <carges@cloudflare.com>
Tested-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/all/20260318134826.1281205-7-hawk@kernel.org/
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260323174920.253526-1-hawk@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup") hid PF_EXITING
tasks from cgroup.procs so that systemd doesn't see tasks that have already
been reaped via waitpid(). However, the populated counter (nr_populated_csets)
is only decremented when the task later passes through cgroup_task_dead() in
finish_task_switch(). This means cgroup.procs can appear empty while the
cgroup is still populated, causing rmdir to fail with -EBUSY.
Fix this by making cgroup_rmdir() wait for dying tasks to fully leave. If the
cgroup is populated but all remaining tasks have PF_EXITING set (the task
iterator returns none due to the existing filter), wait for a kick from
cgroup_task_dead() and retry. The wait is brief as tasks are removed from the
cgroup's css_set between PF_EXITING assertion in do_exit() and
cgroup_task_dead() in finish_task_switch().
v2: cgroup_is_populated() true to false transition happens under css_set_lock
not cgroup_mutex, so retest under css_set_lock before sleeping to avoid
missed wakeups (Sebastian).
Fixes: a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202603222104.2c81684e-lkp@intel.com
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Bert Karwatzki <spasswolf@web.de>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: cgroups@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM fixes from Andrew Morton:
"6 hotfixes. 2 are cc:stable. All are for MM.
All are singletons - please see the changelogs for details"
* tag 'mm-hotfixes-stable-2026-03-23-17-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/damon/stat: monitor all System RAM resources
mm/zswap: add missing kunmap_local()
mailmap: update email address for Muhammad Usama Anjum
zram: do not slot_free() written-back slots
mm/damon/core: avoid use of half-online-committed context
mm/rmap: clear vma->anon_vma on error
|
|
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]
Also note that we do not enable the driver_override feature of struct
bus_type, as SPI - in contrast to most other buses - passes "" to
sysfs_emit() when the driver_override pointer is NULL. Thus, printing
"\n" instead of "(null)\n".
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 5039563e7c25 ("spi: Add driver_override SPI device attribute")
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260324005919.2408620-12-dakr@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2026-03-23
1) Add missing extack for XFRMA_SA_PCPU in add_acquire and allocspi.
From Sabrina Dubroca.
2) Fix the condition on x->pcpu_num in xfrm_sa_len by using the
proper check. From Sabrina Dubroca.
3) Call xdo_dev_state_delete during state update to properly cleanup
the xdo device state. From Sabrina Dubroca.
4) Fix a potential skb leak in espintcp when async crypto is used.
From Sabrina Dubroca.
5) Validate inner IPv4 header length in IPTFS payload to avoid
parsing malformed packets. From Roshan Kumar.
6) Fix skb_put() panic on non-linear skb during IPTFS reassembly.
From Fernando Fernandez Mancera.
7) Silence various sparse warnings related to RCU, state, and policy
handling. From Sabrina Dubroca.
8) Fix work re-schedule race after cancel in xfrm_nat_keepalive_net_fini().
From Hyunwoo Kim.
9) Prevent policy_hthresh.work from racing with netns teardown by using
a proper cleanup mechanism. From Minwoo Ra.
10) Validate that the family of the source and destination addresses match
in pfkey_send_migrate(). From Eric Dumazet.
11) Only publish mode_data after the clone is setup in the IPTFS receive path.
This prevents leaving x->mode_data pointing at freed memory on error.
From Paul Moses.
Please pull or let me know if there are problems.
ipsec-2026-03-23
* tag 'ipsec-2026-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: iptfs: only publish mode_data after clone setup
af_key: validate families in pfkey_send_migrate()
xfrm: prevent policy_hthresh.work from racing with netns teardown
xfrm: Fix work re-schedule after cancel in xfrm_nat_keepalive_net_fini()
xfrm: avoid RCU warnings around the per-netns netlink socket
xfrm: add rcu_access_pointer to silence sparse warning for xfrm_input_afinfo
xfrm: policy: silence sparse warning in xfrm_policy_unregister_afinfo
xfrm: policy: fix sparse warnings in xfrm_policy_{init,fini}
xfrm: state: silence sparse warnings during netns exit
xfrm: remove rcu/state_hold from xfrm_state_lookup_spi_proto
xfrm: state: add xfrm_state_deref_prot to state_by* walk under lock
xfrm: state: fix sparse warnings around XFRM_STATE_INSERT
xfrm: state: fix sparse warnings in xfrm_state_init
xfrm: state: fix sparse warnings on xfrm_state_hold_rcu
xfrm: iptfs: fix skb_put() panic on non-linear skb during reassembly
xfrm: iptfs: validate inner IPv4 header length in IPTFS payload
esp: fix skb leak with espintcp and async crypto
xfrm: call xdo_dev_state_delete during state update
xfrm: fix the condition on x->pcpu_num in xfrm_sa_len
xfrm: add missing extack for XFRMA_SA_PCPU in add_acquire and allocspi
====================
Link: https://patch.msgid.link/20260323083440.2741292-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The commit a2fb4bc4e2a6a03 ("net: implement virtio helpers to handle UDP
GSO tunneling.") introduces support for the UDP GSO tunnel feature in
virtio-net.
The virtio spec says:
If the \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV4 bit or
VIRTIO_NET_HDR_GSO_UDP_TUNNEL_IPV6 bit set, \field{hdr_len} accounts for
all the headers up to and including the inner transport.
The commit did not update the hdr_len to include the inner transport.
I observed that the "hdr_len" is 116 for this packet:
17:36:18.241105 52:55:00:d1:27:0a > 2e:2c:df:46:a9:e1, ethertype IPv4 (0x0800), length 2912: (tos 0x0, ttl 64, id 45197, offset 0, flags [none], proto UDP (17), length 2898)
192.168.122.100.50613 > 192.168.122.1.4789: [bad udp cksum 0x8106 -> 0x26a0!] VXLAN, flags [I] (0x08), vni 1
fa:c3:ba:82:05:ee > ce:85:0c:31:77:e5, ethertype IPv4 (0x0800), length 2862: (tos 0x0, ttl 64, id 14678, offset 0, flags [DF], proto TCP (6), length 2848)
192.168.3.1.49880 > 192.168.3.2.9898: Flags [P.], cksum 0x9266 (incorrect -> 0xaa20), seq 515667:518463, ack 1, win 64, options [nop,nop,TS val 2990048824 ecr 2798801412], length 2796
116 = 14(mac) + 20(ip) + 8(udp) + 8(vxlan) + 14(inner mac) + 20(inner ip) + 32(innner tcp)
Fixes: a2fb4bc4e2a6a03 ("net: implement virtio helpers to handle UDP GSO tunneling.")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20260320021818.111741-3-xuanzhuo@linux.alibaba.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The commit be50da3e9d4a ("net: virtio_net: implement exact header length
guest feature") introduces support for the VIRTIO_NET_F_GUEST_HDRLEN
feature in virtio-net.
This feature requires virtio-net to set hdr_len to the actual header
length of the packet when transmitting, the number of
bytes from the start of the packet to the beginning of the
transport-layer payload.
However, in practice, hdr_len was being set using skb_headlen(skb),
which is clearly incorrect. This commit fixes that issue.
Fixes: be50da3e9d4a ("net: virtio_net: implement exact header length guest feature")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Link: https://patch.msgid.link/20260320021818.111741-2-xuanzhuo@linux.alibaba.com
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Restrict the xen privcmd driver in unprivileged domU to only allow
hypercalls to target domain when using secure boot"
* tag 'xsa482-7.0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/privcmd: add boot control for restricted usage in domU
xen/privcmd: restrict usage in unprivileged domU
|
|
When binding a udp_sock to a local address and port, UDP uses
two hashes (udptable->hash and udptable->hash2) for collision
detection. The current code switches to "hash2" when
hslot->count > 10.
"hash2" is keyed by local address and local port.
"hash" is keyed by local port only.
The issue can be shown in the following bind sequence (pseudo code):
bind(fd1, "[fd00::1]:8888")
bind(fd2, "[fd00::2]:8888")
bind(fd3, "[fd00::3]:8888")
bind(fd4, "[fd00::4]:8888")
bind(fd5, "[fd00::5]:8888")
bind(fd6, "[fd00::6]:8888")
bind(fd7, "[fd00::7]:8888")
bind(fd8, "[fd00::8]:8888")
bind(fd9, "[fd00::9]:8888")
bind(fd10, "[fd00::10]:8888")
/* Correctly return -EADDRINUSE because "hash" is used
* instead of "hash2". udp_lib_lport_inuse() detects the
* conflict.
*/
bind(fail_fd, "[::]:8888")
/* After one more socket is bound to "[fd00::11]:8888",
* hslot->count exceeds 10 and "hash2" is used instead.
*/
bind(fd11, "[fd00::11]:8888")
bind(fail_fd, "[::]:8888") /* succeeds unexpectedly */
The same issue applies to the IPv4 wildcard address "0.0.0.0"
and the IPv4-mapped wildcard address "::ffff:0.0.0.0". For
example, if there are existing sockets bound to
"192.168.1.[1-11]:8888", then binding "0.0.0.0:8888" or
"[::ffff:0.0.0.0]:8888" can also miss the conflict when
hslot->count > 10.
TCP inet_csk_get_port() already has the correct check in
inet_use_bhash2_on_bind(). Rename it to
inet_use_hash2_on_bind() and move it to inet_hashtables.h
so udp.c can reuse it in this fix.
Fixes: 30fff9231fad ("udp: bind() optimisation")
Reported-by: Andrew Onyshchuk <oandrew@meta.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260319181817.1901357-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cited commit mechanically put fib6_remove_gc_list()
just after every fib6_clean_expires() call.
When a temporary route is promoted to a permanent route,
there may already be exception routes tied to it.
If fib6_remove_gc_list() removes the route from tb6_gc_hlist,
such exception routes will no longer be aged.
Let's replace fib6_remove_gc_list() with a new helper
fib6_may_remove_gc_list() and use fib6_age_exceptions() there.
Note that net->ipv6 is only compiled when CONFIG_IPV6 is
enabled, so fib6_{add,remove,may_remove}_gc_list() are guarded.
Fixes: 5eb902b8e719 ("net/ipv6: Remove expired routes with a separated list of routes.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260320072317.2561779-3-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
btrfs_sync_file()
If overlay is used on top of btrfs, dentry->d_sb translates to overlay's
super block and fsid assignment will lead to a crash.
Use file_inode(file)->i_sb to always get btrfs_sb.
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Revert "tracing: Remove pid in task_rename tracing output"
A change was made to remove the pid field from the task_rename event
because it was thought that it was always done for the current task
and recording the pid would be redundant. This turned out to be
incorrect and there are a few corner case where this is not true and
caused some regressions in tooling.
- Fix the reading from user space for migration
The reading of user space uses a seq lock type of logic where it uses
a per-cpu temporary buffer and disables migration, then enables
preemption, does the copy from user space, disables preemption,
enables migration and checks if there was any schedule switches while
preemption was enabled. If there was a context switch, then it is
considered that the per-cpu buffer could be corrupted and it tries
again. There's a protection check that tests if it takes a hundred
tries, it issues a warning and exits out to prevent a live lock.
This was triggered because the task was selected by the load balancer
to be migrated to another CPU, every time preemption is enabled the
migration task would schedule in try to migrate the task but can't
because migration is disabled and let it run again. This caused the
scheduler to schedule out the task every time it enabled preemption
and made the loop never exit (until the 100 iteration test
triggered).
Fix this by enabling and disabling preemption and keeping migration
enabled if the reading from user space needs to be done again. This
will let the migration thread migrate the task and the copy from user
space will likely pass on the next iteration.
- Fix trace_marker copy option freeing
The "copy_trace_marker" option allows a tracing instance to get a
copy of a write to the trace_marker file of the top level instance.
This is managed by a link list protected by RCU. When an instance is
removed, a check is made if the option is set, and if so
synchronized_rcu() is called.
The problem is that an iteration is made to reset all the flags to
what they were when the instance was created (to perform clean ups)
was done before the check of the copy_trace_marker option and that
option was cleared, so the synchronize_rcu() was never called.
Move the clearing of all the flags after the check of
copy_trace_marker to do synchronize_rcu() so that the option is still
set if it was before and the synchronization is performed.
- Fix entries setting when validating the persistent ring buffer
When validating the persistent ring buffer on boot up, the number of
events per sub-buffer is added to the sub-buffer meta page. The
validator was updating cpu_buffer->head_page (the first sub-buffer of
the per-cpu buffer) and not the "head_page" variable that was
iterating the sub-buffers. This was causing the first sub-buffer to
be assigned the entries for each sub-buffer and not the sub-buffer
that was supposed to be updated.
- Use "hash" value to update the direct callers
When updating the ftrace direct callers, it assigned a temporary
callback to all the callback functions of the ftrace ops and not just
the functions represented by the passed in hash. This causes an
unnecessary slow down of the functions of the ftrace_ops that is not
being modified. Only update the functions that are going to be
modified to call the ftrace loop function so that the update can be
made on those functions.
* tag 'trace-v7.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ftrace: Use hash argument for tmp_ops in update_ftrace_direct_mod
ring-buffer: Fix to update per-subbuf entries of persistent ring buffer
tracing: Fix trace_marker copy link list updates
tracing: Fix failure to read user space from system call trace events
tracing: Revert "tracing: Remove pid in task_rename tracing output"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fix from Ingo Molnar:
"Fix a sparse build error regression in <linux/local_lock_internal.h>
caused by the locking context-analysis changes"
* tag 'locking-urgent-2026-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
include/linux/local_lock_internal.h: Make this header file again compatible with sparse
|
|
One major usage of damon_call() is online DAMON parameters update. It is
done by calling damon_commit_ctx() inside the damon_call() callback
function. damon_commit_ctx() can fail for two reasons: 1) invalid
parameters and 2) internal memory allocation failures. In case of
failures, the damon_ctx that attempted to be updated (commit destination)
can be partially updated (or, corrupted from a perspective), and therefore
shouldn't be used anymore. The function only ensures the damon_ctx object
can safely deallocated using damon_destroy_ctx().
The API callers are, however, calling damon_commit_ctx() only after
asserting the parameters are valid, to avoid damon_commit_ctx() fails due
to invalid input parameters. But it can still theoretically fail if the
internal memory allocation fails. In the case, DAMON may run with the
partially updated damon_ctx. This can result in unexpected behaviors
including even NULL pointer dereference in case of damos_commit_dests()
failure [1]. Such allocation failure is arguably too small to fail, so
the real world impact would be rare. But, given the bad consequence, this
needs to be fixed.
Avoid such partially-committed (maybe-corrupted) damon_ctx use by saving
the damon_commit_ctx() failure on the damon_ctx object. For this,
introduce damon_ctx->maybe_corrupted field. damon_commit_ctx() sets it
when it is failed. kdamond_call() checks if the field is set after each
damon_call_control->fn() is executed. If it is set, ignore remaining
callback requests and return. All kdamond_call() callers including
kdamond_fn() also check the maybe_corrupted field right after
kdamond_call() invocations. If the field is set, break the kdamond_fn()
main loop so that DAMON sill doesn't use the context that might be
corrupted.
[sj@kernel.org: let kdamond_call() with cancel regardless of maybe_corrupted]
Link: https://lkml.kernel.org/r/20260320031553.2479-1-sj@kernel.org
Link: https://sashiko.dev/#/patchset/20260319145218.86197-1-sj%40kernel.org
Link: https://lkml.kernel.org/r/20260319145218.86197-1-sj@kernel.org
Link: https://lore.kernel.org/20260319043309.97966-1-sj@kernel.org [1]
Fixes: 3301f1861d34 ("mm/damon/sysfs: handle commit command using damon_call()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|