| Age | Commit message (Collapse) | Author |
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull LSM updates from Paul Moore:
"We only have five patches in the LSM tree, but three of the five are
for an important bugfix relating to overlayfs and the mmap() and
mprotect() access controls for LSMs. Highlights below:
- Fix problems with the mmap() and mprotect() LSM hooks on overlayfs
As we are dealing with problems both in mmap() and mprotect() there
are essentially two components to this fix, spread across three
patches with all marked for stable.
The simplest portion of the fix is the creation of a new LSM hook,
security_mmap_backing_file(), that is used to enforce LSM mmap()
access controls on backing files in the stacked/overlayfs case. The
existing security_mmap_file() does not have visibility past the
user file. You can see from the associated SELinux hook callback
the code is fairly straightforward.
The mprotect() fix is a bit more complicated as there is no way in
the mprotect() code path to inspect both the user and backing
files, and bolting on a second file reference to vm_area_struct
wasn't really an option.
The solution taken here adds a LSM security blob and associated
hooks to the backing_file struct that LSMs can use to capture and
store relevant information from the user file. While the necessary
SELinux information is relatively small, a single u32, I expect
other LSMs to require more than that, and a dedicated backing_file
LSM blob provides a storage mechanism without negatively impacting
other filesystems.
I want to note that other LSMs beyond SELinux have been involved in
the discussion of the fixes presented here and they are working on
their own related changes using these new hooks, but due to other
issues those patches will be coming at a later date.
- Use kstrdup_const()/kfree_const() for securityfs symlink targets
- Resolve a handful of kernel-doc warnings in cred.h"
* tag 'lsm-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm:
selinux: fix overlayfs mmap() and mprotect() access checks
lsm: add backing_file LSM hooks
fs: prepare for adding LSM blob to backing_file
securityfs: use kstrdup_const() to manage symlink targets
cred: fix kernel-doc warnings in cred.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
- Improved handling of unknown status requests from userspace
The current kernel code ignores unknown/unused request bits sent from
userspace and returns an error code based on the results of the
request(s) it does understand. The patch from Ricardo fixes this so
that unknown requests return an -EINVAL to userspace, making
compatibility a bit easier moving forward.
- A number of small style and formatting cleanups
* tag 'audit-pr-20260410' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: handle unknown status requests in audit_receive_msg()
audit: fix coding style issues
audit: remove redundant initialization of static variables to 0
audit: fix whitespace alignment in include/uapi/linux/audit.h
|
|
Add a new getsockopt_iter callback to struct proto_ops that uses
sockopt_t, a type-safe wrapper around iov_iter. This provides a clean
interface for socket option operations that works with both user and
kernel buffers.
The sockopt_t type encapsulates an iov_iter and an optlen field.
The optlen field, although not suggested by Linus, serves as both input
(buffer size) and output (returned data size), allowing callbacks to
return random values independent of the bytes written via
copy_to_iter(), so, keep it separated from iov_iter.count.
This is preparatory work for removing the SOL_SOCKET level restriction
from io_uring getsockopt operations.
Keep in mind that both iter_out and iter_in always point to the same
data at all times, and we just have two of them to make the callback
implementation sane.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20260408-getsockopt-v3-1-061bb9cb355d@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
core:
- hci_core: Rate limit the logging of invalid ISO handle
- hci_sync: make hci_cmd_sync_run_once return -EEXIST if exists
- hci_event: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
- hci_event: fix potential UAF in SSP passkey handlers
- HCI: Avoid a couple -Wflex-array-member-not-at-end warnings
- L2CAP: CoC: Disconnect if received packet size exceeds MPS
- L2CAP: Add missing chan lock in l2cap_ecred_reconf_rsp
- L2CAP: Fix printing wrong information if SDU length exceeds MTU
- SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
drivers:
- btusb: MT7922: Add VID/PID 0489/e174
- btusb: Add Lite-On 04ca:3807 for MediaTek MT7921
- btusb: Add MT7927 IDs ASUS ROG Crosshair X870E Hero, Lenovo Legion Pro 7
16ARX9, Gigabyte Z790 AORUS MASTER X, MSI X870E Ace Max, TP-Link
Archer TBE550E, ASUS X870E / ProArt X870E-Creator.
- btusb: Add MT7902 IDs 13d3/3579, 13d3/3580, 13d3/3594, 13d3/3596, 0e8d/1ede
- btusb: Add MT7902 IDs 13d3/3579, 13d3/3580, 13d3/3594, 13d3/3596, 0e8d/1ede
- btusb: MediaTek MT7922: Add VID 0489 & PID e11d
- btintel: Add support for Scorpious Peak2 support
- btintel: Add support for Scorpious Peak2F support
- btintel_pcie: Add device id of Scorpius Peak2, Nova Lake-PCD-H
- btintel_pcie: Add device id of Scorpious2, Nova Lake-PCD-S
- btmtk: Add reset mechanism if downloading firmware failed
- btmtk: Add MT6639 (MT7927) Bluetooth support
- btmtk: fix ISO interface setup for single alt setting
- btmtk: add MT7902 SDIO support
- Bluetooth: btmtk: add MT7902 MCU support
- btbcm: Add entry for BCM4343A2 UART Bluetooth
- qca: enable pwrseq support for wcn39xx devices
- hci_qca: Fix BT not getting powered-off on rmmod
- hci_qca: disable power control for WCN7850 when bt_en is not defined
- hci_qca: Fix missing wakeup during SSR memdump handling
- hci_ldisc: Clear HCI_UART_PROTO_INIT on error
- mmc: sdio: add MediaTek MT7902 SDIO device ID
- hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x
* tag 'for-net-next-2026-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (59 commits)
Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling
Bluetooth: btintel_pcie: use strscpy to copy plain strings
Bluetooth: hci_event: fix potential UAF in SSP passkey handlers
Bluetooth: hci.h: Avoid a couple -Wflex-array-member-not-at-end warnings
Bluetooth: SCO: check for codecs->num_codecs == 1 before assigning to sco_pi(sk)->codec
Bluetooth: btintel_pcie: Align shared DMA memory to 128 bytes
Bluetooth: l2cap: Add missing chan lock in l2cap_ecred_reconf_rsp
Bluetooth: hci_ll: Enable BROKEN_ENHANCED_SETUP_SYNC_CONN for WL183x
Bluetooth: btusb: MediaTek MT7922: Add VID 0489 & PID e11d
Bluetooth: btmtk: hide unused btmtk_mt6639_devs[] array
Bluetooth: btusb: Add MT7927 ID for ASUS X870E / ProArt X870E-Creator
Bluetooth: btusb: Add MT7927 ID for TP-Link Archer TBE550E
Bluetooth: btusb: Add MT7927 ID for MSI X870E Ace Max
Bluetooth: btusb: Add MT7927 ID for Gigabyte Z790 AORUS MASTER X
Bluetooth: btusb: Add MT7927 ID for Lenovo Legion Pro 7 16ARX9
Bluetooth: btusb: Add MT7927 ID for ASUS ROG Crosshair X870E Hero
Bluetooth: btmtk: fix ISO interface setup for single alt setting
Bluetooth: btmtk: Add MT6639 (MT7927) Bluetooth support
Bluetooth: fix locking in hci_conn_request_evt() with HCI_PROTO_DEFER
Bluetooth: btmtk: refactor endpoint lookup
...
====================
Link: https://patch.msgid.link/20260413132247.320961-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"Features:
- coredump: add tracepoint for coredump events
- fs: hide file and bfile caches behind runtime const machinery
Fixes:
- fix architecture-specific compat_ftruncate64 implementations
- dcache: Limit the minimal number of bucket to two
- fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
- fs/mbcache: cancel shrink work before destroying the cache
- dcache: permit dynamic_dname()s up to NAME_MAX
Cleanups:
- remove or unexport unused fs_context infrastructure
- trivial ->setattr cleanups
- selftests/filesystems: Assume that TIOCGPTPEER is defined
- writeback: fix kernel-doc function name mismatch for wb_put_many()
- autofs: replace manual symlink buffer allocation in autofs_dir_symlink
- init/initramfs.c: trivial fix: FSM -> Finite-state machine
- fs: remove stale and duplicate forward declarations
- readdir: Introduce dirent_size()
- fs: Replace user_access_{begin/end} by scoped user access
- kernel: acct: fix duplicate word in comment
- fs: write a better comment in step_into() concerning .mnt assignment
- fs: attr: fix comment formatting and spelling issues"
* tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
dcache: permit dynamic_dname()s up to NAME_MAX
fs: attr: fix comment formatting and spelling issues
fs: hide file and bfile caches behind runtime const machinery
fs: write a better comment in step_into() concerning .mnt assignment
proc: rename proc_notify_change to proc_setattr
proc: rename proc_setattr to proc_nochmod_setattr
affs: rename affs_notify_change to affs_setattr
adfs: rename adfs_notify_change to adfs_setattr
hfs: update comments on hfs_inode_setattr
kernel: acct: fix duplicate word in comment
fs: Replace user_access_{begin/end} by scoped user access
readdir: Introduce dirent_size()
coredump: add tracepoint for coredump events
fs: remove do_sys_truncate
fs: pass on FTRUNCATE_* flags to do_truncate
fs: fix archiecture-specific compat_ftruncate64
fs: remove stale and duplicate forward declarations
init/initramfs.c: trivial fix: FSM -> Finite-state machine
autofs: replace manual symlink buffer allocation in autofs_dir_symlink
fs/mbcache: cancel shrink work before destroying the cache
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull clone and pidfs updates from Christian Brauner:
"Add three new clone3() flags for pidfd-based process lifecycle
management.
CLONE_AUTOREAP:
CLONE_AUTOREAP makes a child process auto-reap on exit without ever
becoming a zombie. This is a per-process property in contrast to
the existing auto-reap mechanism via SA_NOCLDWAIT or SIG_IGN for
SIGCHLD which applies to all children of a given parent.
Currently the only way to automatically reap children is to set
SA_NOCLDWAIT or SIG_IGN on SIGCHLD. This is a parent-scoped
property affecting all children which makes it unsuitable for
libraries or applications that need selective auto-reaping of
specific children while still being able to wait() on others.
CLONE_AUTOREAP stores an autoreap flag in the child's
signal_struct. When the child exits do_notify_parent() checks this
flag and causes exit_notify() to transition the task directly to
EXIT_DEAD. Since the flag lives on the child it survives
reparenting: if the original parent exits and the child is
reparented to a subreaper or init the child still auto-reaps when
it eventually exits. This is cleaner than forcing the subreaper to
get SIGCHLD and then reaping it. If the parent doesn't care the
subreaper won't care. If there's a subreaper that would care it
would be easy enough to add a prctl() that either just turns back
on SIGCHLD and turns off auto-reaping or a prctl() that just
notifies the subreaper whenever a child is reparented to it.
CLONE_AUTOREAP can be combined with CLONE_PIDFD to allow the parent
to monitor the child's exit via poll() and retrieve exit status via
PIDFD_GET_INFO. Without CLONE_PIDFD it provides a fire-and-forget
pattern. No exit signal is delivered so exit_signal must be zero.
CLONE_THREAD and CLONE_PARENT are rejected: CLONE_THREAD because
autoreap is a process-level property, and CLONE_PARENT because an
autoreap child reparented via CLONE_PARENT could become an
invisible zombie under a parent that never calls wait().
The flag is not inherited by the autoreap process's own children.
Each child that should be autoreaped must be explicitly created
with CLONE_AUTOREAP.
CLONE_NNP:
CLONE_NNP sets no_new_privs on the child at clone time. Unlike
prctl(PR_SET_NO_NEW_PRIVS) which a process sets on itself,
CLONE_NNP allows the parent to impose no_new_privs on the child at
creation without affecting the parent's own privileges.
CLONE_THREAD is rejected because threads share credentials.
CLONE_NNP is useful on its own for any spawn-and-sandbox pattern
but was specifically introduced to enable unprivileged usage of
CLONE_PIDFD_AUTOKILL.
CLONE_PIDFD_AUTOKILL:
This flag ties a child's lifetime to the pidfd returned from
clone3(). When the last reference to the struct file created by
clone3() is closed the kernel sends SIGKILL to the child. A pidfd
obtained via pidfd_open() for the same process does not keep the
child alive and does not trigger autokill - only the specific
struct file from clone3() has this property. This is useful for
container runtimes, service managers, and sandboxed subprocess
execution - any scenario where the child must die if the parent
crashes or abandons the pidfd or just wants a throwaway helper
process.
CLONE_PIDFD_AUTOKILL requires both CLONE_PIDFD and CLONE_AUTOREAP.
It requires CLONE_PIDFD because the whole point is tying the
child's lifetime to the pidfd. It requires CLONE_AUTOREAP because a
killed child with no one to reap it would become a zombie - the
primary use case is the parent crashing or abandoning the pidfd so
no one is around to call waitpid(). CLONE_THREAD is rejected
because autokill targets a process not a thread.
If CLONE_NNP is specified together with CLONE_PIDFD_AUTOKILL an
unprivileged user may spawn a process that is autokilled. The child
cannot escalate privileges via setuid/setgid exec after being
spawned. If CLONE_PIDFD_AUTOKILL is specified without CLONE_NNP the
caller must have have CAP_SYS_ADMIN in its user namespace"
* tag 'vfs-7.1-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests: check pidfd_info->coredump_code correctness
pidfds: add coredump_code field to pidfd_info
kselftest/coredump: reintroduce null pointer dereference
selftests/pidfd: add CLONE_PIDFD_AUTOKILL tests
selftests/pidfd: add CLONE_NNP tests
selftests/pidfd: add CLONE_AUTOREAP tests
pidfd: add CLONE_PIDFD_AUTOKILL
clone: add CLONE_NNP
clone: add CLONE_AUTOREAP
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull namespace update from Christian Brauner:
"Add two simple helper macros for the namespace infrastructure"
* tag 'namespaces-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nsproxy: Add FOR_EACH_NS_TYPE() X-macro and CLONE_NS_ALL
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs buffer_head updates from Christian Brauner:
"This cleans up the mess that has accumulated over the years in
metadata buffer_head tracking for inodes.
It moves the tracking into dedicated structure in filesystem-private
part of the inode (so that we don't use private_list, private_data,
and private_lock in struct address_space), and also moves couple other
users of private_data and private_list so these are removed from
struct address_space saving 3 longs in struct inode for 99% of inodes"
* tag 'vfs-7.1-rc1.bh.metadata' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits)
fs: Drop i_private_list from address_space
fs: Drop mapping_metadata_bhs from address space
ext4: Track metadata bhs in fs-private inode part
minix: Track metadata bhs in fs-private inode part
udf: Track metadata bhs in fs-private inode part
fat: Track metadata bhs in fs-private inode part
bfs: Track metadata bhs in fs-private inode part
affs: Track metadata bhs in fs-private inode part
ext2: Track metadata bhs in fs-private inode part
fs: Provide functions for handling mapping_metadata_bhs directly
fs: Switch inode_has_buffers() to take mapping_metadata_bhs
fs: Make bhs point to mapping_metadata_bhs
fs: Move metadata bhs tracking to a separate struct
fs: Fold fsync_buffers_list() into sync_mapping_buffers()
fs: Drop osync_buffers_list()
kvm: Use private inode list instead of i_private_list
fs: Remove i_private_data
aio: Stop using i_private_data and i_private_lock
hugetlbfs: Stop using i_private_data
fs: Stop using i_private_data for metadata bh tracking
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs i_ino updates from Christian Brauner:
"For historical reasons, the inode->i_ino field is an unsigned long,
which means that it's 32 bits on 32 bit architectures. This has caused
a number of filesystems to implement hacks to hash a 64-bit identifier
into a 32-bit field, and deprives us of a universal identifier field
for an inode.
This changes the inode->i_ino field from an unsigned long to a u64.
This shouldn't make any material difference on 64-bit hosts, but
32-bit hosts will see struct inode grow by at least 4 bytes. This
could have effects on slabcache sizes and field alignment.
The bulk of the changes are to format strings and tracepoints, since
the kernel itself doesn't care that much about the i_ino field. The
first patch changes some vfs function arguments, so check that one out
carefully.
With this change, we may be able to shrink some inode structures. For
instance, struct nfs_inode has a fileid field that holds the 64-bit
inode number. With this set of changes, that field could be
eliminated. I'd rather leave that sort of cleanups for later just to
keep this simple"
* tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nilfs2: fix 64-bit division operations in nilfs_bmap_find_target_in_group()
EVM: add comment describing why ino field is still unsigned long
vfs: remove externs from fs.h on functions modified by i_ino widening
treewide: fix missed i_ino format specifier conversions
ext4: fix signed format specifier in ext4_load_inode trace event
treewide: change inode->i_ino from unsigned long to u64
nilfs2: widen trace event i_ino fields to u64
f2fs: widen trace event i_ino fields to u64
ext4: widen trace event i_ino fields to u64
zonefs: widen trace event i_ino fields to u64
hugetlbfs: widen trace event i_ino fields to u64
ext2: widen trace event i_ino fields to u64
cachefiles: widen trace event i_ino fields to u64
vfs: widen trace event i_ino fields to u64
net: change sock.sk_ino and sock_i_ino() to u64
audit: widen ino fields to u64
vfs: widen inode hash/lookup functions to u64
|
|
xprt_rdma_alloc_slot() and xprt_rdma_free_slot() lack serialization
between the buffer pool and the backlog queue. A buffer freed
after rpcrdma_buffer_get() finds the pool empty but before
rpc_sleep_on() places the task on the backlog is returned to the
pool with no waiter to wake, leaving the task stuck on the backlog
indefinitely.
After joining the backlog, re-check the pool and route any
recovered buffer through xprt_wake_up_backlog(), whose queue lock
serializes with concurrent wakeups and avoids double-assignment
of slots.
Because xprt_rdma_free_slot() does not hold reserve_lock, the
XPRT_CONGESTED double-check in xprt_throttle_congested() is
ineffective: a task can join the backlog through that path after
free_slot has already found it empty and cleared the bit. Avoid
this by using xprt_add_backlog_noncongested(), which queues the
task without setting XPRT_CONGESTED, so every allocation reaches
xprt_rdma_alloc_slot() and its post-sleep re-check.
Fixes: edb41e61a54e ("xprtrdma: Make rpc_rqst part of rpcrdma_req")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
xfstest generic/728 fails with delegated timestamps. The client does a
removexattr and then a stat to test the ctime, which doesn't change. The
stat() doesn't trigger a GETATTR because of the delegated timestamps, so
it relies on the cached ctime, which is wrong.
The setxattr compound has a trailing GETATTR, which ensures that its
ctime gets updated. Follow the same strategy with removexattr.
Fixes: 3e1f02123fba ("NFSv4.2: add client side XDR handling for extended attributes")
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
|
|
- Free all previously requested IRQs in epf_ntb_db_bar_init_msi_doorbell()
error path (Koichiro Den)
- Free doorbell IRQ in pci-epf-test only if it has actually been requested
(Koichiro Den)
- Discard pointer to doorbell message array after freeing it in
pci_epf_alloc_doorbell() error path (Koichiro Den)
- Advertise dynamic inbound mapping support in pci-epf-test and update host
pci_endpoint_test to skip doorbell testing if not advertised by endpoint
(Koichiro Den)
- Constify configfs item and group operations (Christophe JAILLET)
- Use array_index_nospec() on configfs MW show/store attributes (Koichiro
Den)
- Return -ERANGE (not -EINVAL) for configfs out-of-range MW index (Koichiro
Den)
- Return 0, not remaining timeout, when MHI eDMA ops complete so
mhi_ep_ring_add_element() doesn't interpret non-zero as failure (Daniel
Hodges)
- Remove vntb and ntb duplicate resource teardown that leads to oops when
.allow_link() fails or .drop_link() is called (Koichiro Den)
- Disable vntb delayed work before clearing BAR mappings and doorbells to
avoid oops caused by doing the work after resources have been torn down
(Koichiro Den)
- Fix pci_epf_add_vepf() kernel-doc typo (Alok Tiwari)
- Propagate pci_epf_create() errors to pci_epf_make() callers (Alok Tiwari)
- Remove redundant BAR_RESERVED annotation for the high order part of a
64-bit BAR (Niklas Cassel)
- Add a way to describe reserved subregions within BARs, e.g.,
platform-owned fixed register windows, and use it for the RK3588 BAR4 DMA
ctrl window (Koichiro Den)
- Add BAR_DISABLED for BARs that will never be available to an EPF driver,
and change some BAR_RESERVED annotations to BAR_DISABLED (Niklas Cassel)
- Disable BARs in common code instead of in each glue driver (Niklas
Cassel)
- Advertise reserved BARs in Capabilities so host-side drivers can skip
them (Niklas Cassel)
- Skip reserved BARs in selftests (Niklas Cassel)
- Improve error messages and include device name when available (Manivannan
Sadhasivam)
- Add NTB .get_dma_dev() callback for cases where DMA API requires a
different device, e.g., vNTB devices (Koichiro Den)
- Return -EINVAL, not -ENOSPC, if endpoint test determines the subrange
size is too small (Koichiro Den)
- Add reserved region types for MSI-X Table and PBA so Endpoint controllers
can them as describe hardware-owned regions in a BAR_RESERVED BAR
(Manikanta Maddireddy)
- Make Tegra194/234 BAR0 programmable and remove 1MB size limit (Manikanta
Maddireddy)
- Expose Tegra BAR2 (MSI-X) and BAR4 (DMA) as 64-bit BAR_RESERVED
(Manikanta Maddireddy)
- Add Tegra194 and Tegra234 device table entries to pci_endpoint_test
(Manikanta Maddireddy)
- Skip the BAR subrange selftest if there are not enough inbound window
resources to run the test (Christian Bruel)
* pci/endpoint:
selftests: pci_endpoint: Skip BAR subrange test on -ENOSPC
misc: pci_endpoint_test: Add Tegra194 and Tegra234 device table entries
PCI: tegra194: Expose BAR2 (MSI-X) and BAR4 (DMA) as 64-bit BAR_RESERVED
PCI: tegra194: Make BAR0 programmable and remove 1MB size limit
PCI: endpoint: Add reserved region type for MSI-X Table and PBA
misc: pci_endpoint_test: Use -EINVAL for small subrange size
PCI: endpoint: pci-epf-vntb: Implement .get_dma_dev()
NTB: ntb_transport: Use ntb_get_dma_dev() for DMA buffers
NTB: core: Add .get_dma_dev() callback to ntb_dev_ops
PCI: endpoint: Improve error messages
PCI: endpoint: Print the EPF name in the error log of pci_epf_make()
selftests: pci_endpoint: Skip reserved BARs
misc: pci_endpoint_test: Give reserved BARs a distinct error code
PCI: endpoint: pci-epf-test: Advertise reserved BARs
PCI: dwc: Disable BARs in common code instead of in each glue driver
PCI: dwc: Replace certain BAR_RESERVED with BAR_DISABLED in glue drivers
PCI: endpoint: Introduce pci_epc_bar_type BAR_DISABLED
PCI: dw-rockchip: Describe RK3588 BAR4 DMA ctrl window
PCI: endpoint: Describe reserved subregions within BARs
PCI: endpoint: Allow only_64bit on BAR_RESERVED
PCI: endpoint: Do not mark the BAR succeeding a 64-bit BAR as BAR_RESERVED
PCI: endpoint: Propagate error from pci_epf_create()
PCI: endpoint: Fix typo in pci_epf_add_vepf() kernel-doc
PCI: endpoint: pci-epf-vntb: Stop cmd_handler work in epf_ntb_epc_cleanup
PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown
PCI: endpoint: pci-epf-vntb: Remove duplicate resource teardown
PCI: epf-mhi: Return 0, not remaining timeout, when eDMA ops complete
PCI: endpoint: pci-epf-vntb: Return -ERANGE for out-of-range MW index
PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access
PCI: endpoint: Constify struct configfs_item_operations and configfs_group_operations
selftests: pci_endpoint: Skip doorbell test when unsupported
misc: pci_endpoint_test: Gate doorbell test on dynamic inbound mapping
PCI: endpoint: pci-epf-test: Advertise dynamic inbound mapping support
PCI: endpoint: pci-ep-msi: Fix error unwind and prevent double alloc
PCI: endpoint: pci-epf-test: Don't free doorbell IRQ unless requested
PCI: endpoint: pci-epf-vntb: Fix MSI doorbell IRQ unwind
|
|
- Prevent assigning space to unimplemented bridge windows; previously we
mistakenly assumed prefetchable window existed and assigned space and put
a BAR there (Ahmed Naseef)
- Avoid shrinking bridge windows to fit in the initial Root Port window;
this fixes one problem with devices with large BARs connected via
switches, e.g., Thunderbolt (Ilpo Järvinen)
- Retain information about optional resources to make assignment during
rescan more likely to succeed (Ilpo Järvinen)
- Add __resource_contains_unbound() for use in finding space for resources
with no address assigned (Ilpo Järvinen)
- Pass full extent of empty space, not just the aligned space, to
resource_alignf callback so free space before the requested alignment can
be used (Ilpo Järvinen)
- Remove unnecessary second alignment from ARM, m68k, MIPS (Ilpo Järvinen)
- Place small resources before larger ones for better utilization of
address space (Ilpo Järvinen)
- Fix alignment calculation for resource size larger than align, e.g.,
bridge windows larger than the 1MB required alignment (Ilpo Järvinen)
* pci/resource:
PCI: Fix alignment calculation for resource size larger than align
PCI: Align head space better
PCI: Rename window_alignment() to pci_min_window_alignment()
parisc/PCI: Clean up align handling
MIPS: PCI: Remove unnecessary second application of align
m68k/PCI: Remove unnecessary second application of align
ARM/PCI: Remove unnecessary second application of align
resource: Rename 'tmp' variable to 'full_avail'
resource: Pass full extent of empty space to resource_alignf callback
resource: Add __resource_contains_unbound() for internal contains checks
PCI: Fix premature removal from realloc_head list during resource assignment
PCI: Prevent shrinking bridge window from its required size
PCI: Prevent assignment to unsupported bridge windows
|
|
- Update slot handling so all ARI functions are treated as being in the
same slot. They're all reset by Secondary Bus Reset, but previously
drivers of ARI functions that appeared to be on a non-zero device weren't
notified and fatal hardware errors could result (Keith Busch)
- Make sysfs reset_subordinate hotplug safe to avoid spurious hotplug
events (Keith Busch)
- Consolidate bus iteration across the _lock(), _unlock(), and _trylock()
functions for pci_bus and pci_slot (Ilpo Järvinen)
- Hide Secondary Bus Reset ('bus') from sysfs reset_methods if masked by
CXL because it has no effect (Vidya Sagar)
* pci/reset:
PCI/CXL: Hide SBR from reset_methods if masked by CXL
PCI: Consolidate pci_bus/slot_lock/unlock/trylock()
PCI: Make reset_subordinate hotplug safe
PCI: Allow all bus devices to use the same slot
PCI: Rename __pci_bus_reset() and __pci_slot_reset()
|
|
- Leave Precision Time Measurement disabled until a driver enables it to
avoid PCIe errors (Mika Westerberg)
* pci/ptm:
PCI/PTM: Do not enable PTM automatically for Root and Switch Upstream Ports
PCI/PTM: Drop pci_enable_ptm() granularity parameter
|
|
- Allow wildcards in list of host bridges that support peer-to-peer DMA
between hierarchy domains and add all Google SoCs (Jacob Moroni)
* pci/p2pdma:
PCI/P2PDMA: Add Google SoCs to the P2P DMA host bridge list
PCI/P2PDMA: Allow wildcard Device IDs in host bridge list
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs integrity updates from Christian Brauner:
"This adds support to generate and verify integrity information (aka
T10 PI) in the file system, instead of the automatic below the covers
support that is currently used.
The implementation is based on refactoring the existing block layer PI
code to be reusable for this use case, and then adding relatively
small wrappers for the file system use case. These are then used in
iomap to implement the semantics, and wired up in XFS with a small
amount of glue code.
Compared to the baseline this does not change performance for writes,
but increases read performance up to 15% for 4k I/O, with the benefit
decreasing with larger I/O sizes as even the baseline maxes out the
device quickly on my older enterprise SSD"
* tag 'vfs-7.1-rc1.integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
xfs: support T10 protection information
iomap: support T10 protection information
iomap: support ioends for buffered reads
iomap: add a bioset pointer to iomap_read_folio_ops
ntfs3: remove copy and pasted iomap code
iomap: allow file systems to hook into buffered read bio submission
iomap: only call into ->submit_read when there is a read_ctx
iomap: pass the iomap_iter to ->submit_read
iomap: refactor iomap_bio_read_folio_range
block: pass a maxlen argument to bio_iov_iter_bounce
block: add fs_bio_integrity helpers
block: make max_integrity_io_size public
block: prepare generation / verification helpers for fs usage
block: add a bdev_has_integrity_csum helper
block: factor out a bio_integrity_setup_default helper
block: factor out a bio_integrity_action helper
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs directory updates from Christian Brauner:
"Recently 'start_creating', 'start_removing', 'start_renaming' and
related interfaces were added which combine the locking and the
lookup.
At that time many callers were changed to use the new interfaces.
However there are still an assortment of places out side of the core
vfs where the directory is locked explictly, whether with inode_lock()
or lock_rename() or similar. These were missed in the first pass for
an assortment of uninteresting reasons.
This addresses the remaining places where explicit locking is used,
and changes them to use the new interfaces, or otherwise removes the
explicit locking.
The biggest changes are in overlayfs. The other changes are quite
simple, though maybe the cachefiles changes is the least simple of
those"
* tag 'vfs-7.1-rc1.directory' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
VFS: unexport lock_rename(), lock_rename_child(), unlock_rename()
ovl: remove ovl_lock_rename_workdir()
ovl: use is_subdir() for testing if one thing is a subdir of another
ovl: change ovl_create_real() to get a new lock when re-opening created file.
ovl: pass name buffer to ovl_start_creating_temp()
cachefiles: change cachefiles_bury_object to use start_renaming_dentry()
ovl: Simplify ovl_lookup_real_one()
VFS: make lookup_one_qstr_excl() static.
nfsd: switch purge_old() to use start_removing_noperm()
selinux: Use simple_start_creating() / simple_done_creating()
Apparmor: Use simple_start_creating() / simple_done_creating()
libfs: change simple_done_creating() to use end_creating()
VFS: move the start_dirop() kerndoc comment to before start_dirop()
fs/proc: Don't lock root inode when creating "self" and "thread-self"
VFS: note error returns in documentation for various lookup functions
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs xattr updates from Christian Brauner:
"This reworks the simple_xattr infrastructure and adds support for
user.* extended attributes on sockets.
The simple_xattr subsystem currently uses an rbtree protected by a
reader-writer spinlock. This series replaces the rbtree with an
rhashtable giving O(1) average-case lookup with RCU-based lockless
reads. This sped up concurrent access patterns on tmpfs quite a bit
and it's an overall easy enough conversion to do and gets rid or
rwlock_t.
The conversion is done incrementally: a new rhashtable path is added
alongside the existing rbtree, consumers are migrated one at a time
(shmem, kernfs, pidfs), and then the rbtree code is removed. All three
consumers switch from embedded structs to pointer-based lazy
allocation so the rhashtable overhead is only paid for inodes that
actually use xattrs.
With this infrastructure in place the series adds support for user.*
xattrs on sockets. Path-based AF_UNIX sockets inherit xattr support
from the underlying filesystem (e.g. tmpfs) but sockets in sockfs -
that is everything created via socket() including abstract namespace
AF_UNIX sockets - had no xattr support at all.
The xattr_permission() checks are reworked to allow user.* xattrs on
S_IFSOCK inodes. Sockfs sockets get per-inode limits of 128 xattrs and
128KB total value size matching the limits already in use for kernfs.
The practical motivation comes from several directions. systemd and
GNOME are expanding their use of Varlink as an IPC mechanism.
For D-Bus there are tools like dbus-monitor that can observe IPC
traffic across the system but this only works because D-Bus has a
central broker.
For Varlink there is no broker and there is currently no way to
identify which sockets speak Varlink. With user.* xattrs on sockets a
service can label its socket with the IPC protocol it speaks (e.g.,
user.varlink=1) and an eBPF program can then selectively capture
traffic on those sockets. Enumerating bound sockets via netlink
combined with these xattr labels gives a way to discover all Varlink
IPC entrypoints for debugging and introspection.
Similarly, systemd-journald wants to use xattrs on the /dev/log socket
for protocol negotiation to indicate whether RFC 5424 structured
syslog is supported or whether only the legacy RFC 3164 format should
be used.
In containers these labels are particularly useful as high-privilege
or more complicated solutions for socket identification aren't
available.
The series comes with comprehensive selftests covering path-based
AF_UNIX sockets, sockfs socket operations, per-inode limit
enforcement, and xattr operations across multiple address families
(AF_INET, AF_INET6, AF_NETLINK, AF_PACKET)"
* tag 'vfs-7.1-rc1.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests/xattr: test xattrs on various socket families
selftests/xattr: sockfs socket xattr tests
selftests/xattr: path-based AF_UNIX socket xattr tests
xattr: support extended attributes on sockets
xattr,net: support limited amount of extended attributes on sockfs sockets
xattr: move user limits for xattrs to generic infra
xattr: switch xattr_permission() to switch statement
xattr: add xattr_permission_error()
xattr: remove rbtree-based simple_xattr infrastructure
pidfs: adapt to rhashtable-based simple_xattrs
kernfs: adapt to rhashtable-based simple_xattrs with lazy allocation
shmem: adapt to rhashtable-based simple_xattrs with lazy allocation
xattr: add rhashtable-based simple_xattr infrastructure
xattr: add rcu_head and rhash_head to struct simple_xattr
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs writeback updates from Christian Brauner:
"This introduces writeback helper APIs and converts f2fs, gfs2 and nfs
to stop accessing writeback internals directly"
* tag 'vfs-7.1-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nfs: stop using writeback internals for WB_WRITEBACK accounting
gfs2: stop using writeback internals for dirty_exceeded check
f2fs: stop using writeback internals for dirty_exceeded checks
writeback: prep helpers for dirty-limit and writeback accounting
|
|
KVM SVM changes for 7.1
- Fix and optimize IRQ window inhibit handling for AVIC (the tracking needs to
be per-vCPU, e.g. so that KVM doesn't prematurely re-enable AVIC if multiple
vCPUs have to-be-injected IRQs).
- Fix an undefined behavior warning where a crafty userspace can read the
"avic" module param before it's fully initialized.
- Fix a (likely benign) bug in the "OS-visible workarounds" handling, where
KVM could clobber state when enabling virtualization on multiple CPUs in
parallel, and clean up and optimize the code.
- Drop a WARN in KVM_MEMORY_ENCRYPT_REG_REGION where KVM complains about a
"too large" size based purely on user input, and clean up and harden the
related pinning code.
- Disallow synchronizing a VMSA of an already-launched/encrypted vCPU, as
doing so for an SNP guest will trigger an RMP violation #PF and crash the
host.
- Protect all of sev_mem_enc_register_region() with kvm->lock to ensure
sev_guest() is stable for the entire of the function.
- Lock all vCPUs when synchronizing VMSAs for SNP guests to ensure the VMSA
page isn't actively being used.
- Overhaul KVM's APIs for detecting SEV+ guests so that VM-scoped queries are
required to hold kvm->lock (KVM has had multiple bugs due "is SEV?" checks
becoming stale), enforced by lockdep. Add and use vCPU-scoped APIs when
possible/appropriate, as all checks that originate from a vCPU are
guaranteed to be stable.
- Convert a pile of kvm->lock SEV code to guard().
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Joel Fernandes:
"NOCB CPU management:
- Consolidate rcu_nocb_cpu_offload() and rcu_nocb_cpu_deoffload() to
reduce code duplication
- Extract nocb_bypass_needs_flush() helper to reduce duplication in
NOCB bypass path
rcutorture/torture infrastructure:
- Add NOCB01 config for RCU_LAZY torture testing
- Add NOCB02 config for NOCB poll mode testing
- Add TRIVIAL-PREEMPT config for textbook-style preemptible RCU
torture
- Test call_srcu() with preemption both disabled and enabled
- Remove kvm-check-branches.sh in favor of kvm-series.sh
- Make hangs more visible in torture.sh output
- Add informative message for tests without a recheck file
- Fix numeric test comparison in srcu_lockdep.sh
- Use torture_shutdown_init() in refscale and rcuscale instead of
open-coded shutdown functions
- Fix modulo-zero error in torture_hrtimeout_ns().
SRCU:
- Fix SRCU read flavor macro comments
- Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()
RCU Tasks:
- Document that RCU Tasks Trace grace periods now imply RCU grace
periods
- Remove unnecessary smp_store_release() in cblist_init_generic()"
* tag 'rcu.2026.03.31a' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux:
rcutorture: Test call_srcu() with preemption disabled and not
rcu: Add BOOTPARAM_RCU_STALL_PANIC Kconfig option
torture: Avoid modulo-zero error in torture_hrtimeout_ns()
rcu/nocb: Extract nocb_bypass_needs_flush() to reduce duplication
rcu/nocb: Consolidate rcu_nocb_cpu_offload/deoffload functions
rcu-tasks: Remove unnecessary smp_store_release() in cblist_init_generic()
rcutorture: Add NOCB02 config for nocb poll mode testing
rcutorture: Add NOCB01 config for RCU_LAZY torture testing
rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods
srcu: Fix s/they disables/they disable/ typo in srcu_read_unlock_fast()
srcu: Fix SRCU read flavor macro comments
rcuscale: Ditch rcu_scale_shutdown in favor of torture_shutdown_init()
refscale: Ditch ref_scale_shutdown in favor of torture_shutdown_init()
rcutorture: Fix numeric "test" comparison in srcu_lockdep.sh
torture: Print informative message for test without recheck file
torture: Make hangs more visible in torture.sh output
kvm-check-branches.sh: Remove in favor of kvm-series.sh
rcutorture: Add a textbook-style trivial preemptible RCU
|
|
This series cleans up some of the special user copy functions naming and
semantics. In particular, get rid of the (very traditional) double
underscore names and behavior: the whole "optimize away the range check"
model has been largely excised from the other user accessors because
it's so subtle and can be unsafe, but also because it's just not a
relevant optimization any more.
To do that, a couple of drivers that misused the "user" copies as kernel
copies in order to get non-temporal stores had to be fixed up, but that
kind of code should never have been allowed anyway.
The x86-only "nocache" version was also renamed to more accurately
reflect what it actually does.
This was all done because I looked at this code due to a report by Jann
Horn, and I just couldn't stand the inconsistent naming, the horrible
semantics, and the random misuse of these functions. This code should
probably be cleaned up further, but it's at least slightly closer to
normal semantics.
I had a more intrusive series that went even further in trying to
normalize the semantics, but that ended up hitting so many other
inconsistencies between different architectures in this area (eg
'size_t' vs 'unsigned long' vs 'int' as size arguments, and various
iovec check differences that Vasily Gorbik pointed out) that I ended up
with this more limited version that fixed the worst of the issues.
Reported-by: Jann Horn <jannh@google.com>
Tested-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/CAHk-=wgg1QVWNWG-UCFo1hx0zqrPnB3qhPzUTrWNft+MtXQXig@mail.gmail.com/
* nocache-cleanup:
x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache
x86: rename and clean up __copy_from_user_inatomic_nocache()
x86-64: rename misleadingly named '__copy_user_nocache()' function
|
|
Commit under Fixes moved recomputing the window clamp to
tcp_measure_rcv_mss() (when scaling_ratio changes).
I suspect it missed the fact that we don't recompute the clamp
when rcvbuf is set. Until scaling_ratio changes we are
stuck with the old window clamp which may be based on
the small initial buffer. scaling_ratio may never change.
Inspired by Eric's recent commit d1361840f8c5 ("tcp: fix
SO_RCVLOWAT and RCVBUF autotuning") plumb the user action
thru to TCP and have it update the clamp.
A smaller fix would be to just have tcp_rcvbuf_grow()
adjust the clamp even if SOCK_RCVBUF_LOCK is set.
But IIUC this is what we were trying to get away from
in the first place.
Fixes: a2cbb1603943 ("tcp: Update window clamping condition")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumaze@google.com>
Link: https://patch.msgid.link/20260408001438.129165-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Separate the rx and tx enablement/disablement into different
functions so that it is easier to interact with them independently
later.
Although this patch changes receive and transmit paths, the actual
behavior of the teaming driver should remain unchanged, since there
is no option introduced yet to change rx or tx enablement
independently. Those options will be added in follow-up patches.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Marc Harvey <marcharvey@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260409-teaming-driver-internal-v7-7-f47e7589685d@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add no functional changes, but rename enablement functions, variables
etc. that are used in teaming driver transmit decisions.
Since rx and tx enablement are still coupled, some of the variables
renamed in this patch are still used for the rx path, but that will
change in a follow-up patch.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Marc Harvey <marcharvey@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260409-teaming-driver-internal-v7-6-f47e7589685d@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This team mode op is only used by the load balance mode, and it only
uses it in the tx path.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Marc Harvey <marcharvey@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260409-teaming-driver-internal-v7-3-f47e7589685d@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This team_mode_op wasn't used by any of the team modes, so remove it.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Marc Harvey <marcharvey@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260409-teaming-driver-internal-v7-2-f47e7589685d@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The team_port's "index" and the team's "en_port_count" are read in
the hot transmit path, but are only written to when holding the rtnl
lock.
Use READ_ONCE() for all lockless reads of these values, and use
WRITE_ONCE() for all writes.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Marc Harvey <marcharvey@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260409-teaming-driver-internal-v7-1-f47e7589685d@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
WMI drivers using the buffer-based WMI API are expected to reject
undersized event payloads. Extend the WMI driver core to allow
such drivers to specify their minimum supported event payload size.
Also remove the now redundant .no_notify_data field.
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20260406203237.2970-7-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
WMI drivers using the buffer-based WMI API are expected to reject
undersized query results. Extend wmidev_query_block() to enable
the WMI driver core to perform this size check internally.
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20260406203237.2970-6-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
WMI drivers using the buffer-based WMI API are expected to reject
undersized method return values. Extend wmidev_invoke_method() to
enable the WMI driver core to perform this size check internally.
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20260406203237.2970-5-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Some WMI methods return no values, so the whole postprocessing
of the result data is not needed for them. Add a special function
for calling such WMI methods to prepare for future changes of
the main wmidev_invoke_method() function.
Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Link: https://patch.msgid.link/20260406203237.2970-2-W_Armin@gmx.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
Add support for GPIO type 0x02, which controls an IR flood LED used
for face authentication on some laptops (e.g. Dell Pro Max 16 Premium).
Without this patch, the kernel logs "GPIO type 0x02 unknown; the sensor
may not work" and IR sensors paired with a flood LED cannot function.
The flood LED is registered through the LED subsystem like the existing
privacy LED, including a lookup entry to allow future consumer drivers
to find and control it via led_get().
To support multiple LEDs per INT3472 device, convert the single led
struct member to an array with a counter.
Signed-off-by: Marco Nenciarini <mnencia@kcore.it>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Hans de Goede <johannes.goede@oss.qualcomm.com>
Link: https://patch.msgid.link/20260401203638.1601661-5-mnencia@kcore.it
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
|
|
KVM x86 VMXON and EFER.SVME extraction for 7.1
Move _only_ VMXON+VMXOFF and EFER.SVME toggling out of KVM (versus all of VMX
and SVM enabling) out of KVM and into the core kernel so that non-KVM TDX
enabling, e.g. for trusted I/O, can make SEAMCALLs without needing to ensure
KVM is fully loaded.
TIO isn't a hypervisor, and isn't trying to be a hypervisor. Specifically, TIO
should _never_ have it's own VMCSes (that are visible to the host; the
TDX-Module has it's own VMCSes to do SEAMCALL/SEAMRET), and so there is simply
no reason to move that functionality out of KVM.
With that out of the way, dealing with VMXON/VMXOFF and EFER.SVME is a fairly
simple refcounting game.
|
|
KVM x86 emulated MMIO changes for 7.1
Copy single-chunk MMIO write values into a persistent (per-fragment) field to
fix use-after-free stack bugs due to KVM dereferencing a stack pointer after an
exit to userspace.
Clean up and comment the emulated MMIO code to try to make it easier to
maintain (not necessarily "easy", but "easier").
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 7.1
* New features:
- Add support for tracing in the standalone EL2 hypervisor code,
which should help both debugging and performance analysis.
This comes with a full infrastructure for 'remote' trace buffers
that can be exposed by non-kernel entities such as firmware.
- Add support for GICv5 Per Processor Interrupts (PPIs), as the
starting point for supporting the new GIC architecture in KVM.
- Finally add support for pKVM protected guests, with anonymous
memory being used as a backing store. About time!
* Improvements and bug fixes:
- Rework the dreaded user_mem_abort() function to make it more
maintainable, reducing the amount of state being exposed to
the various helpers and rendering a substantial amount of
state immutable.
- Expand the Stage-2 page table dumper to support NV shadow
page tables on a per-VM basis.
- Tidy up the pKVM PSCI proxy code to be slightly less hard
to follow.
- Fix both SPE and TRBE in non-VHE configurations so that they
do not generate spurious, out of context table walks that
ultimately lead to very bad HW lockups.
- A small set of patches fixing the Stage-2 MMU freeing in error
cases.
- Tighten-up accepted SMC immediate value to be only #0 for host
SMCCC calls.
- The usual cleanups and other selftest churn.
|
|
Commit 89e5d7d61600 ("mailbox: remove superfluous internal header")
moved some constants to a public header but forgot to add a mailbox
specific prefix. Add this now to prevent future collisions on a too
generic naming.
Link: https://sashiko.dev/#/patchset/20260327151112.5202-2-wsa%2Brenesas%40sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Sudeep Holla <sudeep.holla@kernel.org>
Signed-off-by: Jassi Brar <jassisinghbrar@gmail.com>
|
|
Use container_of() macro instead of direct pointer casting to get the
pppox_sock from a sock pointer.
Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev>
Link: https://patch.msgid.link/20260410054954.114031-2-qingfang.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The sk member can be directly accessed from struct pppox_sock without
relying on type casting. Remove the sk_pppox() helper and update all
call sites to use po->sk directly.
Signed-off-by: Qingfang Deng <qingfang.deng@linux.dev>
Link: https://patch.msgid.link/20260410054954.114031-1-qingfang.deng@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Tariq Toukan says:
====================
mlx5-next updates 2026-04-09
* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
net/mlx5: Add icm_mng_function_id_mode cap bit
net/mlx5: Rename MLX5_PF page counter type to MLX5_SELF
net/mlx5: Add vhca_id_type bit to alias context
mlx5: Remove redundant iseg base
====================
Link: https://patch.msgid.link/20260409110431.154894-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Eric Dumazet says:
====================
net: reduce sk_filter() (and friends) bloat
Some functions return an error by value, and a drop_reason
by an output parameter. This extra parameter can force stack canaries.
A drop_reason is enough and more efficient.
This series reduces bloat by 678 bytes on x86_64:
$ scripts/bloat-o-meter -t vmlinux.old vmlinux.final
add/remove: 0/0 grow/shrink: 3/18 up/down: 79/-757 (-678)
Function old new delta
vsock_queue_rcv_skb 50 79 +29
ipmr_cache_report 1290 1315 +25
ip6mr_cache_report 1322 1347 +25
tcp_v6_rcv 3169 3167 -2
packet_rcv_spkt 329 327 -2
unix_dgram_sendmsg 1731 1726 -5
netlink_unicast 957 945 -12
netlink_dump 1372 1359 -13
sk_filter_trim_cap 889 858 -31
netlink_broadcast_filtered 1633 1595 -38
tcp_v4_rcv 3152 3111 -41
raw_rcv_skb 122 80 -42
ping_queue_rcv_skb 109 61 -48
ping_rcv 215 162 -53
rawv6_rcv_skb 278 224 -54
__sk_receive_skb 690 632 -58
raw_rcv 591 527 -64
udpv6_queue_rcv_one_skb 935 869 -66
udp_queue_rcv_one_skb 919 853 -66
tun_net_xmit 1146 1074 -72
sock_queue_rcv_skb_reason 166 76 -90
Total: Before=29722890, After=29722212, chg -0.00%
Future conversions from sock_queue_rcv_skb() to sock_queue_rcv_skb_reason()
can be done later.
====================
Link: https://patch.msgid.link/20260409145625.2306224-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Current return value can be replaced with the drop_reason,
reducing kernel bloat:
$ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
add/remove: 0/2 grow/shrink: 1/11 up/down: 32/-603 (-571)
Function old new delta
tcp_v6_rcv 3135 3167 +32
unix_dgram_sendmsg 1731 1726 -5
netlink_unicast 957 945 -12
netlink_dump 1372 1359 -13
sk_filter_trim_cap 882 858 -24
tcp_v4_rcv 3143 3111 -32
__pfx_tcp_filter 32 - -32
netlink_broadcast_filtered 1633 1595 -38
sock_queue_rcv_skb_reason 126 76 -50
tun_net_xmit 1127 1074 -53
__sk_receive_skb 690 632 -58
udpv6_queue_rcv_one_skb 935 869 -66
udp_queue_rcv_one_skb 919 853 -66
tcp_filter 154 - -154
Total: Before=29722783, After=29722212, chg -0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260409145625.2306224-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk_filter_trim_cap will soon return the reason by value,
do the same for sk_filter_reason().
$ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-21 (-21)
Function old new delta
sock_queue_rcv_skb_reason 128 126 -2
tun_net_xmit 1146 1127 -19
Total: Before=29722661, After=29722640, chg -0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260409145625.2306224-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
BTF validation logic is independent from the main verifier.
Move it into check_btf.c
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260412152936.54262-7-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Move precision propagation and backtracking logic to backtrack.c
to reduce verifier.c size.
No functional changes.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260412152936.54262-6-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
verifier.c is huge. Move is_state_visited() to states.c,
so that all state equivalence logic is in one file.
Mechanical move. No functional changes.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260412152936.54262-5-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
verifier.c is huge. Move check_cfg(), compute_postorder(),
compute_scc() into cfg.c
Mechanical move. No functional changes.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260412152936.54262-4-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
verifier.c is huge. Move compute_insn_live_regs() into liveness.c.
Mechanical move. No functional changes.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260412152936.54262-3-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
verifier.c is huge. Split fixup/post-processing logic that runs after
the verifier accepted the program into fixups.c.
Mechanical move. No functional changes.
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20260412152936.54262-2-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|