| Age | Commit message (Collapse) | Author |
|
S1G defines use of NDP Block Ack (BA) for aggregation, requiring negotiation
of NDP ADDBA/DELBA action frames. If the S1G recipient supports HT-immediate
block ack, the sender must send an NDP ADDBA Request indicating it expects
only NDP BlockAck frames for the agreement.
Introduce support for NDP ADDBA and DELBA exchange in mac80211. The
implementation negotiates the BA mechanism during setup based on station
capabilities and driver support (IEEE80211_HW_SUPPORTS_NDP_BLOCKACK).
If negotiation fails due to mismatched expectations, a rejection with status code
WLAN_STATUS_REJECTED_NDP_BLOCK_ACK_SUGGESTED is returned as per IEEE 802.11-2024.
Trace sample:
IEEE 802.11 Wireless Management
Fixed parameters
Category code: Block Ack (3)
Action code: NDP ADDBA Request (0x80)
Dialog token: 0x01
Block Ack Parameters: 0x1003, A-MSDUs, Block Ack Policy
.... .... .... ...1 = A-MSDUs: Permitted in QoS Data MPDUs
.... .... .... ..1. = Block Ack Policy: Immediate Block Ack
.... .... ..00 00.. = Traffic Identifier: 0x0
0001 0000 00.. .... = Number of Buffers (1 Buffer = 2304 Bytes): 64
Block Ack Timeout: 0x0000
Block Ack Starting Sequence Control (SSC): 0x0010
.... .... .... 0000 = Fragment: 0
0000 0000 0001 .... = Starting Sequence Number: 1
IEEE 802.11 Wireless Management
Fixed parameters
Category code: Block Ack (3)
Action code: NDP ADDBA Response (0x81)
Dialog token: 0x02
Status code: BlockAck negotiation refused because, due to buffer constraints and other unspecified reasons, the recipient prefers to generate only NDP BlockAck frames (0x006d)
Block Ack Parameters: 0x1002, Block Ack Policy
.... .... .... ...0 = A-MSDUs: Not Permitted
.... .... .... ..1. = Block Ack Policy: Immediate Block Ack
.... .... ..00 00.. = Traffic Identifier: 0x0
0001 0000 00.. .... = Number of Buffers (1 Buffer = 2304 Bytes): 64
Block Ack Timeout: 0x0000
Signed-off-by: Ria Thomas <ria.thomas@morsemicro.com>
Link: https://patch.msgid.link/20260305091304.310990-1-ria.thomas@morsemicro.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Draft P802.11bn_D1.3 switched the order here to align with
the order of the fields. Adjust the code accordingly.
Link: https://patch.msgid.link/20260304144148.ce45942294e1.I22ab3f16e6376a19c3953cf81dd67105ea8e529d@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The action code actually serves to identify the type of action
frame, so it really isn't part of the per-type structure. Pull
it out and have it in the general action frame format.
In theory, whether or not the action code is present in this
way is up to each category, but all categories that are defined
right now all have that value.
While at it, and since this change requires changing all users,
remove the 'u' and make it an anonymous union in this case, so
that all code using this changes.
Change IEEE80211_MIN_ACTION_SIZE to take an argument which says
how much of the frame is needed, e.g. category, action_code or
the specific frame type that's defined in the union. Again this
also ensures that all code is updated.
In some cases, fix bugs where the SKB length was checked after
having accessed beyond the checked length, in particular in FTM
code, e.g. ieee80211_is_ftm().
Link: https://patch.msgid.link/20260226183607.67e71846b59e.I9a24328e3ffcaae179466a935f1c3345029f9961@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
lookup_one_qstr_excl() is no longer used outside of namei.c, so
make it static.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/20260224222542.3458677-9-neilb@ownmail.net
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Conflicts:
kernel/sched/ext.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Expose the following through cgroup.h:
- cgroup_on_dfl()
- cgroup_is_dead()
- cgroup_for_each_live_child()
- cgroup_for_each_live_descendant_pre()
- cgroup_for_each_live_descendant_post()
Until now, these didn't need to be exposed because controllers only cared
about the css hierarchy. The planned sched_ext hierarchical scheduler
support will be based on the default cgroup hierarchy, which is in line
with the existing BPF cgroup support, and thus needs these exposed.
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
inet_ehashfn() and inet6_ehashfn() initialise random secrets
on the first call by net_get_random_once().
While the init part is patched out using static keys, with
CONFIG_STACKPROTECTOR_STRONG=y, this causes a compiler to
generate a stack canary due to an automatic variable,
unsigned long ___flags, in the DO_ONCE() macro being passed
to __do_once_start().
With FDO, this is visible in __inet_lookup_established() and
__inet6_lookup_established() too.
Let's initialise the secrets by get_random_sleepable_once()
in the slow paths: inet_hash() for listen(), and
inet_hash_connect() and inet6_hash_connect() for connect().
Note that IPv6 listener will initialise both IPv4 & IPv6 secrets
in inet_hash() for IPv4-mapped IPv6 address.
With the patch, the stack size is reduced by 16 bytes (___flags
+ a stack canary) and NOPs for the static key go away.
Before: __inet6_lookup_established()
...
push %rbx
sub $0x38,%rsp # stack is 56 bytes
mov %edx,%ebx # sport
mov %gs:0x299419f(%rip),%rax # load stack canary
mov %rax,0x30(%rsp) and store it onto stack
mov 0x440(%rdi),%r15 # net->ipv4.tcp_death_row.hashinfo
nop
32: mov %r8d,%ebp # hnum
shl $0x10,%ebp # hnum << 16
nop
3d: mov 0x70(%rsp),%r14d # sdif
or %ebx,%ebp # INET_COMBINED_PORTS(sport, hnum)
mov 0x11a8382(%rip),%eax # inet6_ehashfn() ...
After: __inet6_lookup_established()
...
push %rbx
sub $0x28,%rsp # stack is 40 bytes
mov 0x60(%rsp),%ebp # sdif
mov %r8d,%r14d # hnum
shl $0x10,%r14d # hnum << 16
or %edx,%r14d # INET_COMBINED_PORTS(sport, hnum)
mov 0x440(%rdi),%rax # net->ipv4.tcp_death_row.hashinfo
mov 0x1194f09(%rip),%r10d # inet6_ehashfn() ...
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260303235424.3877267-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the correct struct member names to avoid kernel-doc warnings:
Warning: include/linux/platform_data/voltage-omap.h:27 struct member
'volt_nominal' not described in 'omap_volt_data'
Warning: include/linux/platform_data/voltage-omap.h:27 struct member
'vp_errgain' not described in 'omap_volt_data'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260226051309.556228-1-rdunlap@infradead.org
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
|
|
The existing x86_match_cpu() infrastructure can be used to match
a bunch of attributes of a CPU: vendor, family, model, steppings
and CPU features.
But, there's one more attribute that's missing and unable to be
matched against: the platform ID, enumerated on Intel CPUs in
MSR_IA32_PLATFORM_ID. It is a little more obscure and is only
queried during microcode loading. This is because Intel sometimes
has CPUs with identical family/model/stepping but which need
different microcode. These CPUs are differentiated with the
platform ID.
Add a field in 'struct x86_cpu_id' for the platform ID. Similar
to the stepping field, make the new field a mask of platform IDs.
Some examples:
0x01: matches only platform ID 0x0
0x02: matches only platform ID 0x1
0x03: matches platform IDs 0x0 or 0x1
0x80: matches only platform ID 0x7
0xff: matches all 8 possible platform IDs
Since the mask is only a byte wide, it nestles in next to another
u8 and does not even increase the size of 'struct x86_cpu_id'.
Reserve the all 0's value as the wildcard (X86_PLATFORM_ANY). This
avoids forcing changes to existing 'struct x86_cpu_id' users. They
can just continue to fill the field with 0's and their matching will
work exactly as before.
Note: If someone is ever looking for space in 'struct x86_cpu_id',
this new field could probably get stuck over in ->driver_data
for the one user that there is.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://patch.msgid.link/20260304181022.058DF07C@davehans-spike.ostc.intel.com
|
|
Cross-merge networking fixes after downstream PR (net-7.0-rc3).
No conflicts.
Adjacent changes:
net/netfilter/nft_set_rbtree.c
fb7fb4016300 ("netfilter: nf_tables: clone set on flush only")
3aea466a4399 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from CAN, netfilter and wireless.
Current release - new code bugs:
- sched: cake: fixup cake_mq rate adjustment for diffserv config
- wifi: fix missing ieee80211_eml_params member initialization
Previous releases - regressions:
- tcp: give up on stronger sk_rcvbuf checks (for now)
Previous releases - always broken:
- net: fix rcu_tasks stall in threaded busypoll
- sched:
- fq: clear q->band_pkt_count[] in fq_reset()
- only allow act_ct to bind to clsact/ingress qdiscs and shared
blocks
- bridge: check relevant per-VLAN options in VLAN range grouping
- xsk: fix fragment node deletion to prevent buffer leak
Misc:
- spring cleanup of inactive maintainers"
* tag 'net-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits)
xdp: produce a warning when calculated tailroom is negative
net: enetc: use truesize as XDP RxQ info frag_size
libeth, idpf: use truesize as XDP RxQ info frag_size
i40e: use xdp.frame_sz as XDP RxQ info frag_size
i40e: fix registering XDP RxQ info
ice: change XDP RxQ frag_size from DMA write length to xdp.frame_sz
ice: fix rxq info registering in mbuf packets
xsk: introduce helper to determine rxq->frag_size
xdp: use modulo operation to calculate XDP frag tailroom
selftests/tc-testing: Add tests exercising act_ife metalist replace behaviour
net/sched: act_ife: Fix metalist update behavior
selftests: net: add test for IPv4 route with loopback IPv6 nexthop
net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop
net: vxlan: fix nd_tbl NULL dereference when IPv6 is disabled
net: bridge: fix nd_tbl NULL dereference when IPv6 is disabled
MAINTAINERS: remove Thomas Falcon from IBM ibmvnic
MAINTAINERS: remove Claudiu Manoil and Alexandre Belloni from Ocelot switch
MAINTAINERS: replace Taras Chornyi with Elad Nachman for Marvell Prestera
MAINTAINERS: remove Jonathan Lemon from OpenCompute PTP
MAINTAINERS: replace Clark Wang with Frank Li for Freescale FEC
...
|
|
When both IMA and EVM fix modes are enabled, accessing a file with IMA
signature but missing EVM HMAC won't cause security.evm to be fixed.
Add a function evm_fix_hmac which will be explicitly called to fix EVM
HMAC for this case.
Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
EVM and other LSMs need the ability to query the secure boot status of
the system, without directly calling the IMA arch_ima_get_secureboot
function. Refactor the secure boot status check into a general function
named arch_get_secureboot.
Reported-and-suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Suggested-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix thresh_return of function graph tracer
The update to store data on the shadow stack removed the abuse of
using the task recursion word as a way to keep track of what
functions to ignore. The trace_graph_return() was updated to handle
this, but when function_graph tracer is using a threshold (only trace
functions that took longer than a specified time), it uses
trace_graph_thresh_return() instead.
This function was still incorrectly using the task struct recursion
word causing the function graph tracer to permanently set all
functions to "notrace"
- Fix thresh_return nosleep accounting
When the calltime was moved to the shadow stack storage instead of
being on the fgraph descriptor, the calculations for the amount of
sleep time was updated. The calculation was done in the
trace_graph_thresh_return() function, which also called the
trace_graph_return(), which did the calculation again, causing the
time to be doubled.
Remove the call to trace_graph_return() as what it needed to do
wasn't that much, and just do the work in
trace_graph_thresh_return().
- Fix syscall trace event activation on boot up
The syscall trace events are pseudo events attached to the
raw_syscall tracepoints. When the first syscall event is enabled, it
enables the raw_syscall tracepoint and doesn't need to do anything
when a second syscall event is also enabled.
When events are enabled via the kernel command line, syscall events
are partially enabled as the enabling is called before rcu_init. This
is due to allow early events to be enabled immediately. Because
kernel command line events do not distinguish between different types
of events, the syscall events are enabled here but are not fully
functioning. After rcu_init, they are disabled and re-enabled so that
they can be fully enabled.
The problem happened is that this "disable-enable" is done one at a
time. If more than one syscall event is specified on the command
line, by disabling them one at a time, the counter never gets to
zero, and the raw_syscall is not disabled and enabled, keeping the
syscall events in their non-fully functional state.
Instead, disable all events and re-enabled them all, as that will
ensure the raw_syscall event is also disabled and re-enabled.
- Disable preemption in ftrace pid filtering
The ftrace pid filtering attaches to the fork and exit tracepoints to
add or remove pids that should be traced. They access variables
protected by RCU (preemption disabled). Now that tracepoint callbacks
are called with preemption enabled, this protection needs to be added
explicitly, and not depend on the functions being called with
preemption disabled.
- Disable preemption in event pid filtering
The event pid filtering needs the same preemption disabling guards as
ftrace pid filtering.
- Fix accounting of the memory mapped ring buffer on fork
Memory mapping the ftrace ring buffer sets the vm_flags to DONTCOPY.
But this does not prevent the application from calling
madvise(MADVISE_DOFORK). This causes the mapping to be copied on
fork. After the first tasks exits, the mapping is considered unmapped
by everyone. But when he second task exits, the counter goes below
zero and triggers a WARN_ON.
Since nothing prevents two separate tasks from mmapping the ftrace
ring buffer (although two mappings may mess each other up), there's
no reason to stop the memory from being copied on fork.
Update the vm_operations to have an ".open" handler to update the
accounting and let the ring buffer know someone else has it mapped.
- Add all ftrace headers in MAINTAINERS file
The MAINTAINERS file only specifies include/linux/ftrace.h But misses
ftrace_irq.h and ftrace_regs.h. Make the file use wildcards to get
all *ftrace* files.
* tag 'trace-v7.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ftrace: Add MAINTAINERS entries for all ftrace headers
tracing: Fix WARN_ON in tracing_buffers_mmap_close
tracing: Disable preemption in the tracepoint callbacks handling filtered pids
ftrace: Disable preemption in the tracepoint callbacks handling filtered pids
tracing: Fix syscall events activation by ensuring refcount hits zero
fgraph: Fix thresh_return nosleeptime double-adjust
fgraph: Fix thresh_return clear per-task notrace
|
|
With TX pause enabled, if a device is unable to pass packets up to the
stack (e.g., CPU is hanged), the device can cause pause storm. Given
that devices can have native support to protect the neighbor from such
flooding, such events need some tracking. This support is to track TX
pause storm events for better observability.
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20260302230149.1580195-2-mohsin.bashr@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
RAPL default settings vary across different RAPL interfaces (MSR, TPMI,
MMIO). Currently, these defaults are stored in the common RAPL driver,
which requires interface-specific handling logic and makes the common
layer unnecessarily complex. There is no strong reason for the common
code to own these defaults, since they are inherently
interface-specific.
To prepare for moving default configuration into the individual
interface drivers,
1. Move struct rapl_defaults into a shared header so that interface
drivers can directly populate their own default settings.
2. Change the @defaults field in struct rapl_if_priv from void * to
const struct rapl_defaults * to improve type safety and readability
and update the common driver to use the typed defaults structure.
3. Update all internal getter functions and local pointers to use
const struct rapl_defaults * to maintain const-correctness.
4. Rename and export the common helper functions (check_unit,
set_floor_freq, compute_time_window) so interface drivers may
reuse or override them as appropriate.
No functional changes. This is a preparatory refactoring to allow
interface drivers to supply their own RAPL default settings.
Co-developed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20260212233044.329790-9-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Replace hardcoded numeric constants with standard unit conversion
macros from linux/units.h for better code clarity and
self-documentation.
Add MICROJOULE_PER_JOULE and NANOJOULE_PER_JOULE to units.h to
support energy unit conversions, following the existing pattern
for power units.
No functional changes.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20260212233044.329790-8-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
After acquiring netdev_queue::_xmit_lock the number of the CPU owning
the lock is recorded in netdev_queue::xmit_lock_owner. This works as
long as the BH context is not preemptible.
On PREEMPT_RT the softirq context is preemptible and without the
softirq-lock it is possible to have multiple user in __dev_queue_xmit()
submitting a skb on the same CPU. This is fine in general but this means
also that the current CPU is recorded as netdev_queue::xmit_lock_owner.
This in turn leads to the recursion alert and the skb is dropped.
Instead checking the for CPU number, that owns the lock, PREEMPT_RT can
check if the lockowner matches the current task.
Add netif_tx_owned() which returns true if the current context owns the
lock by comparing the provided CPU number with the recorded number. This
resembles the current check by negating the condition (the current check
returns true if the lock is not owned).
On PREEMPT_RT use rt_mutex_owner() to return the lock owner and compare
the current task against it.
Use the new helper in __dev_queue_xmit() and netif_local_xmit_active()
which provides a similar check.
Update comments regarding pairing READ_ONCE().
Reported-by: Bert Karwatzki <spasswolf@web.de>
Closes: https://lore.kernel.org/all/20260216134333.412332-1-spasswolf@web.de
Fixes: 3253cb49cbad4 ("softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reported-by: Bert Karwatzki <spasswolf@web.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260302162631.uGUyIqDT@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This series adds support for Transaction Layer Packet (TLP) emulation
response gateway regions, enabling userspace device emulation software
to write TLP responses directly to lower layers without kernel driver
involvement.
Currently, the mlx5 driver exposes VirtIO emulation access regions via
the MLX5_IB_METHOD_VAR_OBJ_ALLOC ioctl. This series extends that
ioctl to also support allocating TLP response gateway channels for
PCI device emulation use cases.
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Expose and query TLP device emulation caps on driver load.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
Introduce the hardware structures and definitions needed for the driver
support of TLP emulation in mlx5_ifc.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:
====================
netfilter: updates for net-next
The following patchset contains Netfilter updates for *net-next*,
including changes to IPv6 stack and updates to IPVS from Julian Anastasov.
1) ipv6: export fib6_lookup for nft_fib_ipv6 module
2) factor out ipv6_anycast_destination logic so its usable without
dst_entry. These are dependencies for patch 3.
3) switch nft_fib_ipv6 module to no longer need temporary dst_entry
object allocations by using fib6_lookup() + RCU.
This gets us ~13% higher packet rate in my tests.
Patches 4 to 8, from Eric Dumazet, zap sk_callback_lock usage in
netfilter. Patch 9 removes another sk_callback_lock instance.
Remaining patches, from Julian Anastasov, improve IPVS, Quoting Julian:
* Add infrastructure for resizable hash tables based on hlist_bl.
* Change the 256-bucket service hash table to be resizable.
* Change the global connection table to be per-net and resizable.
* Make connection hashing more secure for setups with multiple services.
netfilter pull request nf-next-26-03-04
* tag 'nf-next-26-03-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
ipvs: use more keys for connection hashing
ipvs: switch to per-net connection table
ipvs: use resizable hash table for services
ipvs: add resizable hash tables
rculist_bl: add hlist_bl_for_each_entry_continue_rcu
netfilter: nfnetlink_queue: remove locking in nfqnl_get_sk_secctx
netfilter: nfnetlink_queue: no longer acquire sk_callback_lock
netfilter: nfnetlink_log: no longer acquire sk_callback_lock
netfilter: nft_meta: no longer acquire sk_callback_lock in nft_meta_get_eval_skugid()
netfilter: xt_owner: no longer acquire sk_callback_lock in mt_owner()
netfilter: nf_log_syslog: no longer acquire sk_callback_lock in nf_log_dump_sk_uid_gid()
netfilter: nft_fib_ipv6: switch to fib6_lookup
ipv6: make ipv6_anycast_destination logic usable without dst_entry
ipv6: export fib6_lookup for nft_fib_ipv6
====================
Link: https://patch.msgid.link/20260304114921.31042-1-fw@strlen.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Instead of using struct timespec64 in scm_timestamping_internal,
use ktime_t, saving 24 bytes in kernel stack.
This makes tcp_update_recv_tstamps() small enough to be inlined.
The ktime_t -> timespec64 conversions happen after socket lock
has been released in tcp_recvmsg(), and only if the application
requested them.
$ scripts/bloat-o-meter -t vmlinux.0 vmlinux
add/remove: 0/2 grow/shrink: 5/4 up/down: 146/-277 (-131)
Function old new delta
tcp_zerocopy_receive 2383 2425 +42
mptcp_recvmsg 1565 1607 +42
tcp_recvmsg_locked 3797 3823 +26
put_cmsg_scm_timestamping64 131 149 +18
put_cmsg_scm_timestamping 131 149 +18
__pfx_tcp_update_recv_tstamps 16 - -16
do_tcp_getsockopt 4024 4006 -18
tcp_recv_timestamp 474 430 -44
tcp_zc_handle_leftover 417 371 -46
__sock_recv_timestamp 1087 1031 -56
tcp_update_recv_tstamps 97 - -97
Total: Before=25223788, After=25223657, chg -0.00%
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20260304012747.881644-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes Berg says:
====================
Notable features this time:
- cfg80211/mac80211
- finished assoc frame encryption/EPPKE/802.1X-over-auth
(also hwsim)
- radar detection improvements
- 6 GHz incumbent signal detection APIs
- multi-link support for FILS, probe response
templates and client probling
- ath12k:
- monitor mode support on IPQ5332
- basic hwmon temperature reporting
* tag 'wireless-next-2026-03-04' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (38 commits)
wifi: UHR: define DPS/DBE/P-EDCA elements and fix size parsing
wifi: mac80211_hwsim: change hwsim_class to a const struct
wifi: mac80211: give the AP more time for EPPKE as well
wifi: ath12k: Remove the unused argument from the Rx data path
wifi: ath12k: Enable monitor mode support on IPQ5332
wifi: ath12k: Set up MLO after SSR
wifi: ath11k: Silence remoteproc probe deferral prints
wifi: cfg80211: support key installation on non-netdev wdevs
wifi: cfg80211: make cluster id an array
wifi: mac80211: update outdated comment
wifi: mac80211: Advertise IEEE 802.1X authentication support
wifi: mac80211: Add support for IEEE 802.1X authentication protocol in non-AP STA mode
wifi: cfg80211: add support for IEEE 802.1X Authentication Protocol
wifi: mac80211: Advertise EPPKE support based on driver capabilities
wifi: mac80211_hwsim: Advertise support for (Re)Association frame encryption
wifi: mac80211: Fix AAD/Nonce computation for management frames with MLO
wifi: rt2x00: use generic nvmem_cell_get
wifi: mac80211: fetch unsolicited probe response template by link ID
wifi: mac80211: fetch FILS discovery template by link ID
wifi: nl80211: don't allow DFS channels for NAN
...
====================
Link: https://patch.msgid.link/20260304113707.175181-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- kthread: consolidate kthread exit paths to prevent use-after-free
- iomap:
- don't mark folio uptodate if read IO has bytes pending
- don't report direct-io retries to fserror
- reject delalloc mappings during writeback
- ns: tighten visibility checks
- netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict
sequence
* tag 'vfs-7.0-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
iomap: reject delalloc mappings during writeback
iomap: don't mark folio uptodate if read IO has bytes pending
selftests: fix mntns iteration selftests
nstree: tighten permission checks for listing
nsfs: tighten permission checks for handle opening
nsfs: tighten permission checks for ns iteration ioctls
netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence
kthread: consolidate kthread exit paths to prevent use-after-free
iomap: don't report direct-io retries to fserror
|
|
Eliminate kernel-doc warnings in mmu_notifier.h:
- add a missing struct short description
- use the correct format for function parameters
- add missing function return comment sections
Warning: include/linux/mmu_notifier.h:236 missing initial short
description on line: * struct mmu_interval_notifier_ops
Warning: include/linux/mmu_notifier.h:325 function parameter 'interval_sub'
not described in 'mmu_interval_set_seq'
Warning: include/linux/mmu_notifier.h:325 function parameter 'cur_seq'
not described in 'mmu_interval_set_seq'
Warning: include/linux/mmu_notifier.h:346 function parameter 'interval_sub'
not described in 'mmu_interval_read_retry'
Warning: include/linux/mmu_notifier.h:346 function parameter 'seq' not
described in 'mmu_interval_read_retry'
Warning: include/linux/mmu_notifier.h:346 No description found for return
value of 'mmu_interval_read_retry'
Warning: include/linux/mmu_notifier.h:370 function parameter 'interval_sub'
not described in 'mmu_interval_check_retry'
Warning: include/linux/mmu_notifier.h:370 function parameter 'seq' not
described in 'mmu_interval_check_retry'
Warning: include/linux/mmu_notifier.h:370 No description found for return
value of 'mmu_interval_check_retry'
Link: https://lkml.kernel.org/r/20260302005222.3470783-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Use the correct kernel-doc function parameter format to avoid kernel-doc
warnings:
Warning: include/linux/uaccess.h:814 function parameter 'uptr' not
described in 'scoped_user_rw_access_size'
Warning: include/linux/uaccess.h:826 function parameter 'uptr' not
described in 'scoped_user_rw_access'
Link: https://lkml.kernel.org/r/20260302005229.3471955-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
pagetable_dtor()"
This change swapped out mod_node_page_state for lruvec_stat_add_folio.
But, these two APIs are not interchangeable: the lruvec version also
increments memcg stats, in addition to "global" pgdat stats.
So after this change, the "pagetables" memcg stat in memory.stat always
yields "0", which is a userspace visible regression.
I tried to look for a refactor where we add a variant of
lruvec_stat_mod_folio which takes a pgdat and a memcg instead of a folio,
to try to adhere to the spirit of the original patch. But at the end of
the day this just means we have to call folio_memcg(ptdesc_folio(ptdesc))
anyway, which doesn't really accomplish much.
This regression is visible in master as well as 6.18 stable, so CC stable
too.
Link: https://lkml.kernel.org/r/20260225002434.2953895-1-axelrasmussen@google.com
Fixes: f0c92726e89f ("ptdesc: remove references to folios from __pagetable_ctor() and pagetable_dtor()")
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Now that TDX handles doing VMXON without KVM's involvement, bury the
top-level APIs to enable and disable virtualization back in kvm_main.c.
No functional change intended.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Chao Gao <chao.gao@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
Tested-by: Sagi Shahar <sagis@google.com>
Link: https://patch.msgid.link/20260214012702.2368778-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move kvm_rebooting, which is only read by x86, to KVM x86 so that it can
be moved again to core x86 code. Add a "shutdown" arch hook to facilate
setting the flag in KVM x86, along with a pile of comments to provide more
context around what KVM x86 is doing and why.
Reviewed-by: Chao Gao <chao.gao@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Sagi Shahar <sagis@google.com>
Link: https://patch.msgid.link/20260214012702.2368778-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add UHR Operation and Capability definitions and parsing helpers:
- Define ieee80211_uhr_dps_info, ieee80211_uhr_dbe_info,
ieee80211_uhr_p_edca_info with masks.
- Update ieee80211_uhr_oper_size_ok() to account for optional
DPS/DBE/P-EDCA blocks.
- Move NPCA pointer position after DPS Operation Parameter if it is
present in ieee80211_uhr_oper_size_ok().
- Move NPCA pointer position after DPS info if it is present in
ieee80211_uhr_npca_info().
Signed-off-by: Karthikeyan Kathirvel <karthikeyan.kathirvel@oss.qualcomm.com>
Link: https://patch.msgid.link/20260304085343.1093993-2-karthikeyan.kathirvel@oss.qualcomm.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Change the old hlist_bl_first_rcu to hlist_bl_first_rcu_dereference
to indicate that it is a RCU dereference.
Add hlist_bl_next_rcu and hlist_bl_first_rcu to use RCU pointers
and use them to fix sparse warnings.
Add hlist_bl_for_each_entry_continue_rcu.
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
When a process forks, the child process copies the parent's VMAs but the
user_mapped reference count is not incremented. As a result, when both the
parent and child processes exit, tracing_buffers_mmap_close() is called
twice. On the second call, user_mapped is already 0, causing the function to
return -ENODEV and triggering a WARN_ON.
Normally, this isn't an issue as the memory is mapped with VM_DONTCOPY set.
But this is only a hint, and the application can call
madvise(MADVISE_DOFORK) which resets the VM_DONTCOPY flag. When the
application does that, it can trigger this issue on fork.
Fix it by incrementing the user_mapped reference count without re-mapping
the pages in the VMA's open callback.
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Link: https://patch.msgid.link/20260227025842.1085206-1-wangqing7171@gmail.com
Fixes: cf9f0f7c4c5bb ("tracing: Allow user-space mapping of the ring-buffer")
Reported-by: syzbot+3b5dd2030fe08afdf65d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=3b5dd2030fe08afdf65d
Tested-by: syzbot+3b5dd2030fe08afdf65d@syzkaller.appspotmail.com
Signed-off-by: Qing Wang <wangqing7171@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
A recent refactoring of the kernel-docs for stop machine changed the
description of the cpus parameter from "NULL = any online cpu"
to "NULL = run on each online CPU".
However the callback is only executed on a single CPU, not all of them.
The old wording was a bit ambiguous and could have been read both ways.
Reword the documentation to be correct again and hopefully also clearer.
Fixes: fc6f89dc7078 ("stop_machine: Improve kernel-doc function-header comments")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
|
|
Several (register) values reported by the fuel gauge depend on its
internal task period and it needs to be taken into account when
calculating results. All relevant example formulas in the data sheet
assume the default task period (of 5760) and final results need to be
adjusted based on the task period in effect.
Update the code as and where necessary.
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://patch.msgid.link/20260302-max77759-fg-v3-10-3c5f01dbda23@linaro.org
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
The Maxim MAX77759 is a companion PMIC intended for use in mobile
phones and tablets. It is used on Google Pixel 6 and 6 Pro (oriole and
raven). Amongst others, it contains a fuel gauge that is similar to the
ones supported by this driver.
The fuel gauge can measure battery charge and discharge current,
battery voltage, battery temperature, and the Type C connector's
temperature.
The MAX77759 incorporates the Maxim ModelGauge m5 algorithm. It, as
well as previous generations like m3 on max17047/max17050, requires
the host to save/restore some register values across power cycles to
maintain full accuracy. Extending the driver for such support is out of
scope in this initial commit.
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: André Draszik <andre.draszik@linaro.org>
Link: https://patch.msgid.link/20260302-max77759-fg-v3-9-3c5f01dbda23@linaro.org
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
|
|
Factor the return value range calculation logic in check_return_code
out of the function in preparation for separating the return value
validation logic for BPF_EXIT and bpf_throw().
No functional changes. The change made in return_retval_code's handling
of PROG_TRACING program types (not error'ing on the default case) is a
no-op because the match on the program attach type is exhaustive.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260228184759.108145-2-emil@etsalapatis.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This reverts commit dc23806a7c47 ("driver core: enforce device_lock for
driver_match_device()") and commit 289b14592cef ("driver core: fix
inverted "locked" suffix of driver_match_device()").
While technically correct, there is a major downside to this approach:
When a device is already present in the system and a driver is
registered on the same bus, we iterate over all devices registered on
this bus to see if one of them matches. If we come across an already
bound one where the corresponding driver crashed while holding the
device lock (e.g. in probe()) we can't make any progress anymore.
However, drivers are typically the least tested code in the kernel and
hence it is a case that is likely to happen regularly. Besides hurting
developer ergonomics, it potentially decreases chances of shutting
things down cleanly and obtaining logs in production environments as
well [1].
This came up in the context of a firewire bug, which only in combination
with the reverted commit, caused the machine to hang [2]. Additionally,
it was observed in [3].
Thus, revert commit dc23806a7c47 ("driver core: enforce device_lock for
driver_match_device()") and add a brief note clarifying that an
implementer of struct bus_type must not expect match() to be called with
the device lock held.
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Link: https://lore.kernel.org/all/67f655bb-4d81-4609-b008-68d200255dd2@davidgow.net/ [2]
Link: https://lore.kernel.org/lkml/CALbr=LZ4v7N=tO1vgOsyj9AS+XuNbn6kG-QcF+PacdMjSo0iyw@mail.gmail.com/ [3]
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Closes: https://lore.kernel.org/driver-core/CAHk-=wgJ_L1C=HjcYJotg_zrZEmiLFJaoic+PWthjuQrutrfJw@mail.gmail.com/
Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260302002545.19389-1-dakr@kernel.org
[ Add additional Link: reference. - Danilo ]
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
We have an increasing number of READ_ONCE(xxx->function)
combined with INDIRECT_CALL_[1234]() helpers.
Unfortunately this forces INDIRECT_CALL_[1234]() to read
xxx->function many times, which is not what we wanted.
Fix these macros so that xxx->function value is not reloaded.
$ scripts/bloat-o-meter -t vmlinux.0 vmlinux
add/remove: 0/0 grow/shrink: 1/65 up/down: 122/-1084 (-962)
Function old new delta
ip_push_pending_frames 59 181 +122
ip6_finish_output 687 681 -6
__udp_enqueue_schedule_skb 1078 1072 -6
ioam6_output 2319 2312 -7
xfrm4_rcv_encap_finish2 64 56 -8
xfrm4_output 297 289 -8
vrf_ip_local_out 278 270 -8
vrf_ip6_local_out 278 270 -8
seg6_input_finish 64 56 -8
rpl_output 700 692 -8
ipmr_forward_finish 124 116 -8
ip_forward_finish 143 135 -8
ip6mr_forward2_finish 100 92 -8
ip6_forward_finish 73 65 -8
input_action_end_bpf 1091 1083 -8
dst_input 52 44 -8
__xfrm6_output 801 793 -8
__xfrm4_output 83 75 -8
bpf_input 500 491 -9
__tcp_check_space 530 521 -9
input_action_end_dt6 291 280 -11
vti6_tnl_xmit 1634 1622 -12
bpf_xmit 1203 1191 -12
rpl_input 497 483 -14
rawv6_send_hdrinc 1355 1341 -14
ndisc_send_skb 1030 1016 -14
ipv6_srh_rcv 1377 1363 -14
ip_send_unicast_reply 1253 1239 -14
ip_rcv_finish 226 212 -14
ip6_rcv_finish 300 286 -14
input_action_end_x_core 205 191 -14
input_action_end_x 355 341 -14
input_action_end_t 205 191 -14
input_action_end_dx6_finish 127 113 -14
input_action_end_dx4_finish 373 359 -14
input_action_end_dt4 426 412 -14
input_action_end_core 186 172 -14
input_action_end_b6_encap 292 278 -14
input_action_end_b6 198 184 -14
igmp6_send 1332 1318 -14
ip_sublist_rcv 864 848 -16
ip6_sublist_rcv 1091 1075 -16
ipv6_rpl_srh_rcv 1937 1920 -17
xfrm_policy_queue_process 1246 1228 -18
seg6_output_core 903 885 -18
mld_sendpack 856 836 -20
NF_HOOK 756 736 -20
vti_tunnel_xmit 1447 1426 -21
input_action_end_dx6 664 642 -22
input_action_end 1502 1480 -22
sock_sendmsg_nosec 134 111 -23
ip6mr_forward2 388 364 -24
sock_recvmsg_nosec 134 109 -25
seg6_input_core 836 810 -26
ip_send_skb 172 146 -26
ip_local_out 140 114 -26
ip6_local_out 140 114 -26
__sock_sendmsg 162 136 -26
__ip_queue_xmit 1196 1170 -26
__ip_finish_output 405 379 -26
ipmr_queue_fwd_xmit 373 346 -27
sock_recvmsg 173 145 -28
ip6_xmit 1635 1607 -28
xfrm_output_resume 1418 1389 -29
ip_build_and_send_pkt 625 591 -34
dst_output 504 432 -72
Total: Before=25217686, After=25216724, chg -0.00%
Fixes: 283c16a2dfd3 ("indirect call wrappers: helpers to speed-up indirect calls of builtin")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20260227172603.1700433-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
kernel-doc reports function parameters not described for parameters
that are not named. Add parameter names for these functions and then
describe the function parameters in kernel-doc format.
Fixes these warnings:
Warning: include/linux/atmdev.h:316 function parameter '' not described
in 'register_atm_ioctl'
Warning: include/linux/atmdev.h:321 function parameter '' not described
in 'deregister_atm_ioctl'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20260228220845.2978547-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ipmr_mfc_add() and ipmr_mfc_delete() are already protected
by a dedicated mutex.
rtm_to_ipmr_mfcc() calls __ipmr_get_table(), __dev_get_by_index(),
amd ipmr_find_vif().
Once __dev_get_by_index() is converted to dev_get_by_index_rcu(),
we can move the other two functions under that same RCU section
and drop RTNL for ipmr_rtm_route().
Let's do that conversion and drop ASSERT_RTNL() in
mr_call_mfc_notifiers().
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260228221800.1082070-16-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will no longer hold RTNL for ipmr_mfc_add() and ipmr_mfc_delete().
MFC entry can be loosely connected with VIF by its index for
mrt->vif_table[] (stored in mfc_parent), but the two tables are
not synchronised. i.e. Even if VIF 1 is removed, MFC for VIF 1
is not automatically removed.
The only field that the MFC/VIF interfaces share is
net->ipv[46].ipmr_seq, which is protected by RTNL.
Adding a new mutex for both just to protect a single field is overkill.
Let's convert the field to atomic_t.
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20260228221800.1082070-14-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
struct stmmac_dma_cfg mixed_burst/fixed_burst members are both boolean
in nature - of_property_read_bool() are used to read these from DT, and
they are only tested for non-zero values. Use bool to avoid unnecessary
padding in this structure.
Update dwmac-intel to initialise these using true rather than '1', and
remove the '0' initialisers as the struct is already zero initialised
on allocation.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vvuXn-0000000AvnX-4A1u@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are repeated instances of:
fwnode = priv->plat->port_node;
if (!fwnode)
fwnode = dev_fwnode(priv->device);
However, the only place that ->port_node is set is
stmmac_probe_config_dt():
struct device_node *np = pdev->dev.of_node;
...
/* PHYLINK automatically parses the phy-handle property */
plat->port_node = of_fwnode_handle(np);
which is equivalent to dev_fwnode(&pdev->dev) and, as priv->device
will be &pdev->dev, is also equivalent to dev_fwnode(priv->device).
Thus, plat_dat->port_node doesn't provide any extra benefit over
using dev_fwnode(priv->device) directly.
There is one case where port_node is used directly, which can be
found in stmmac_pcs_setup(). This may cause a change of behaviour
as PCI drivers do not populate plat_dat->port_node, but
dev_fwnode(priv->device) may be valid. PCI-based stmmac should
be tested.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1vvuX3-0000000Avme-3oej@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- restrict multi-lrc to VCS/VECS engines (Xin Wang)
- Introduce a flag to disallow vm overcommit in fault mode (Thomas)
- update used tracking kernel-doc (Auld, Fixes)
- Some bind queue fixes (Auld, Fixes)
Cross-subsystem Changes:
- Split drm_suballoc_new() into SA alloc and init helpers (Satya, Fixes)
- pass pagemap_addr by reference (Arnd, Fixes)
- Revert "drm/pagemap: Disable device-to-device migration" (Thomas)
- Fix unbalanced unlock in drm_gpusvm_scan_mm (Maciej, Fixes)
- Small GPUSVM fixes (Brost, Fixes)
- Fix xe SVM configs (Thomas, Fixes)
Core Changes:
- Fix a hmm_range_fault() livelock / starvation problem (Thomas, Fixes)
Driver Changes:
- Fix leak on xa_store failure (Shuicheng, Fixes)
- Correct implementation of Wa_16025250150 (Roper, Fixes)
- Refactor context init into xe_lrc_ctx_init (Raag)
- Fix GSC proxy cleanup on early initialization failure (Zhanjun)
- Fix exec queue creation during post-migration recovery (Tomasz, Fixes)
- Apply windower hardware filtering setting on Xe3 and Xe3p (Roper)
- Free ctx_restore_mid_bb in release (Shuicheng, Fixes)
- Drop stale MCR steering TODO comment (Roper)
- dGPU memory optimizations (Brost)
- Do not preempt fence signaling CS instructions (Brost, Fixes)
- Revert "drm/xe/compat: Remove unused i915_reg.h from compat header" (Uma)
- Don't expose display modparam if no display support (Wajdeczko)
- Some VRAM flag improvements (Wajdeczko)
- Misc fix for xe_guc_ct.c (Shuicheng, Fixes)
- Remove unused i915_reg.h from compat header (Uma)
- Workaround cleanup & simplification (Roper)
- Add prefetch pagefault support for Xe3p (Varun)
- Fix fs_reclaim deadlock caused by CCS save/restore (Satya, Fixes)
- Cleanup partially initialized sync on parse failure (Shuicheng, Fixes)
- Allow to change VFs VRAM quota using sysfs (Michal)
- Increase GuC log sizes in debug builds (Tomasz)
- Wa_18041344222 changes (Harish)
- Add Wa_14026781792 (Niton)
- Add debugfs facility to catch RTP mistakes (Roper)
- Convert GT stats to per-cpu counters (Brost)
- Prevent unintended VRAM channel creation (Karthik)
- Privatize struct xe_ggtt (Maarten)
- remove unnecessary struct dram_info forward declaration (Jani)
- pagefault refactors (Brost)
- Apply Wa_14024997852 (Arvind)
- Redirect faults to dummy page for wedged device (Raag, Fixes)
- Force EXEC_QUEUE_FLAG_KERNEL for kernel internal VMs (Piotr)
- Stop applying Wa_16018737384 from Xe3 onward (Roper)
- Add new XeCore fuse registers to VF runtime regs (Roper)
- Update xe_device_declare_wedged() error log (Raag)
- Make xe_modparam.force_vram_bar_size signed (Shuicheng, Fixes)
- Avoid reading media version when media GT is disabled (Piotr, Fixes)
- Fix handling of Wa_14019988906 & Wa_14019877138 (Roper, Fixes)
- Basic enabling patches for Xe3p_LPG and NVL-P (Gustavo, Roper, Shekhar)
- Avoid double-adjust in 64-bit reads (Shuicheng, Fixes)
- Allow VF to initialize MCR tables (Wajdeczko)
- Add Wa_14025883347 for GuC DMA failure on reset (Anirban)
- Add bounds check on pat_index to prevent OOB kernel read in madvise (Jia, Fixes)
- Fix the address range assert in ggtt_get_pte helper (Winiarski)
- XeCore fuse register changes (Roper)
- Add more info to powergate_info debugfs (Vinay)
- Separate out GuC RC code (Vinay)
- Fix g2g_test_array indexing (Pallavi)
- Mutual exclusivity between CCS-mode and PF (Nareshkumar, Fixes)
- Some more _types.h cleanups (Wajdeczko)
- Fix sysfs initialization (Wajdeczko, Fixes)
- Drop unnecessary goto in xe_device_create (Roper)
- Disable D3Cold for BMG only on specific platforms (Karthik, Fixes)
- Add sriov.admin_only_pf attribute (Wajdeczko)
- replace old wq(s), add WQ_PERCPU to alloc_workqueue (Marco)
- Make MMIO communication more robust (Wajdeczko)
- Fix warning of kerneldoc (Shuicheng, Fixes)
- Fix topology query pointer advance (Shuicheng, Fixes)
- use entry_dump callbacks for xe2+ PAT dumps (Xin Wang)
- Fix kernel-doc warning in GuC scheduler ABI header (Chaitanya, Fixes)
- Fix CFI violation in debugfs access (Daniele, Fixes)
- Apply WA_16028005424 to Media (Balasubramani)
- Fix typo in function kernel-doc (Wajdeczko)
- Protect priority against concurrent access (Niranjana)
- Fix nvm aux resource cleanup (Shuicheng, Fixes)
- Fix is_bound() pci_dev lifetime (Shuicheng, Fixes)
- Use CLASS() for forcewake in xe_gt_enable_comp_1wcoh (Shuicheng)
- Reset VF GuC state on fini (Wajdeczko)
- Move _THIS_IP_ usage from xe_vm_create() to dedicated function (Nathan Chancellor, Fixes)
- Unregister drm device on probe error (Shuicheng, Fixes)
- Disable DCC on PTL (Vinay, Fixes)
- Fix Wa_18022495364 (Tvrtko, Fixes)
- Skip address copy for sync-only execs (Shuicheng, Fixes)
- derive mem copy capability from graphics version (Nitin, Fixes)
- Use DRM_BUDDY_CONTIGUOUS_ALLOCATION for contiguous allocations (Sanjay)
- Context based TLB invalidations (Brost)
- Enable multi_queue on xe3p_xpc (Brost, Niranjana)
- Remove check for gt in xe_query (Nakshtra)
- Reduce LRC timestamp stuck message on VFs to notice (Brost, Fixes)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/aaYR5G2MHjOEMXPW@lstrano-desk.jf.intel.com
|
|
When exiting to userspace to service an emulated MMIO write, copy the
to-be-written value to a scratch field in the MMIO fragment if the size
of the data payload is 8 bytes or less, i.e. can fit in a single chunk,
instead of pointing the fragment directly at the source value.
This fixes a class of use-after-free bugs that occur when the emulator
initiates a write using an on-stack, local variable as the source, the
write splits a page boundary, *and* both pages are MMIO pages. Because
KVM's ABI only allows for physically contiguous MMIO requests, accesses
that split MMIO pages are separated into two fragments, and are sent to
userspace one at a time. When KVM attempts to complete userspace MMIO in
response to KVM_RUN after the first fragment, KVM will detect the second
fragment and generate a second userspace exit, and reference the on-stack
variable.
The issue is most visible if the second KVM_RUN is performed by a separate
task, in which case the stack of the initiating task can show up as truly
freed data.
==================================================================
BUG: KASAN: use-after-free in complete_emulated_mmio+0x305/0x420
Read of size 1 at addr ffff888009c378d1 by task syz-executor417/984
CPU: 1 PID: 984 Comm: syz-executor417 Not tainted 5.10.0-182.0.0.95.h2627.eulerosv2r13.x86_64 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 Call Trace:
dump_stack+0xbe/0xfd
print_address_description.constprop.0+0x19/0x170
__kasan_report.cold+0x6c/0x84
kasan_report+0x3a/0x50
check_memory_region+0xfd/0x1f0
memcpy+0x20/0x60
complete_emulated_mmio+0x305/0x420
kvm_arch_vcpu_ioctl_run+0x63f/0x6d0
kvm_vcpu_ioctl+0x413/0xb20
__se_sys_ioctl+0x111/0x160
do_syscall_64+0x30/0x40
entry_SYSCALL_64_after_hwframe+0x67/0xd1
RIP: 0033:0x42477d
Code: <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007faa8e6890e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000004d7338 RCX: 000000000042477d
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
RBP: 00000000004d7330 R08: 00007fff28d546df R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004d733c
R13: 0000000000000000 R14: 000000000040a200 R15: 00007fff28d54720
The buggy address belongs to the page:
page:0000000029f6a428 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9c37
flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc0000000 0000000000000000 ffffea0000270dc8 0000000000000000
raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888009c37780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888009c37800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>ffff888009c37880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^
ffff888009c37900: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
ffff888009c37980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
==================================================================
The bug can also be reproduced with a targeted KVM-Unit-Test by hacking
KVM to fill a large on-stack variable in complete_emulated_mmio(), i.e. by
overwrite the data value with garbage.
Limit the use of the scratch fields to 8-byte or smaller accesses, and to
just writes, as larger accesses and reads are not affected thanks to
implementation details in the emulator, but add a sanity check to ensure
those details don't change in the future. Specifically, KVM never uses
on-stack variables for accesses larger that 8 bytes, e.g. uses an operand
in the emulator context, and *all* reads are buffered through the mem_read
cache.
Note! Using the scratch field for reads is not only unnecessary, it's
also extremely difficult to handle correctly. As above, KVM buffers all
reads through the mem_read cache, and heavily relies on that behavior when
re-emulating the instruction after a userspace MMIO read exit. If a read
splits a page, the first page is NOT an MMIO page, and the second page IS
an MMIO page, then the MMIO fragment needs to point at _just_ the second
chunk of the destination, i.e. its position in the mem_read cache. Taking
the "obvious" approach of copying the fragment value into the destination
when re-emulating the instruction would clobber the first chunk of the
destination, i.e. would clobber the data that was read from guest memory.
Fixes: f78146b0f923 ("KVM: Fix page-crossing MMIO")
Suggested-by: Yashu Zhang <zhangjiaji1@huawei.com>
Reported-by: Yashu Zhang <zhangjiaji1@huawei.com>
Closes: https://lore.kernel.org/all/369eaaa2b3c1425c85e8477066391bc7@huawei.com
Cc: stable@vger.kernel.org
Tested-by: Tom Lendacky <thomas.lendacky@gmail.com>
Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Link: https://patch.msgid.link/20260225012049.920665-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
This gets us a fix for KUnit which allows us to test it.
|
|
This gets us a fix for KUnit execution which allows us to run the
testsuite again.
|
|
Use the correct function parameter names, function names, or kernel-doc
format, and add function return comment sections to avoid kernel-doc
warnings:
Warning: include/linux/cred.h:43 function parameter 'gi' not described
in 'get_group_info'
Warning: include/linux/cred.h:43 No description found for return value
of 'get_group_info'
Warning: include/linux/cred.h:213 No description found for return value
of 'get_cred_many'
Warning: include/linux/cred.h:260 function parameter '_cred' not described
in 'put_cred_many'
Warning: include/linux/cred.h:260 expecting prototype for put_cred().
Prototype was for put_cred_many() instead
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
[PM: subject tweak]
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
Add a new IIO_DECLARE_QUATERNION() macro that is used to declare the
field in an IIO buffer struct that contains a quaternion vector.
Quaternions are currently the only IIO data type that uses the .repeat
feature of struct iio_scan_type. This has an implicit rule that the
element in the buffer must be aligned to the entire size of the repeated
element. This macro will make that requirement explicit. Since this is
the only user, we just call the macro IIO_DECLARE_QUATERNION() instead
of something more generic.
Signed-off-by: David Lechner <dlechner@baylibre.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
|