| Age | Commit message (Collapse) | Author |
|
Add low-level hardware register definitions and accessor functions
for AQC113 (Antigua) chip features:
- L3/L4 filter command, tag, and address registers for IPv4/IPv6
- Ethertype filter tag registers
- TSG (Time Stamp Generator) clock control, modification, and
GPIO event generation/input timestamp registers
- TX descriptor timestamp writeback, timestamp enable, and AVB
enable registers
- TX data/descriptor read request limit registers
- TPB highest priority TC registers
- PCIe extended tag enable register
- RX descriptor timestamp request register
- Action resolver section enable getter
- GPIO special mode and TSG external GPIO TS input select
Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
Link: https://patch.msgid.link/20260610115448.272-5-sukhdeeps@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor aq_set_data_fl3l4() to take an ethtool_rx_flow_spec pointer and
an explicit HW register location instead of driver-internal structures
(aq_nic_s, aq_rx_filter). This makes the function reusable for PTP
filter setup which constructs flow specs independently.
Key changes:
- Add aq_is_ipv6_flow_type() helper to derive IPv6 status from the
flow_type field, replacing the dependency on rx_fltrs->fl3l4.is_ipv6
shared state.
- Change aq_set_data_fl3l4() signature to accept (fsp, data, location,
add) and export it via aq_filters.h.
- Update aq_add_del_fl3l4() to compute the HW register location and
pass it explicitly.
Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
Link: https://patch.msgid.link/20260610115448.272-4-sukhdeeps@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move active_ipv4/active_ipv6 bitmap updates from aq_set_data_fl3l4()
into aq_add_del_fl3l4() after the hardware write succeeds. The bitmaps
track which filter slots are actively programmed in hardware and must
only be updated once the HW write is confirmed.
The bitmap updates in aq_nic_reserve_filter() and aq_nic_release_filter()
are intentionally retained: they guard the aq_check_approve_fl3l4()
IPv4/IPv6 mixing validation for callers such as the AQC113 PTP path that
program filters directly via hw_atl2_new_fl3l4_configure() without going
through aq_add_del_fl3l4().
Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
Link: https://patch.msgid.link/20260610115448.272-3-sukhdeeps@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Correct three issues in aq_set_data_fl3l4() required for the AQC113
PTP filter path introduced later in this series:
1. Mask FLOW_EXT from flow_type before the protocol switch statement.
Flow types with FLOW_EXT set (e.g. TCP_V4_FLOW | FLOW_EXT) fall
through to the default case and skip protocol comparison flags.
2. Extend the L3 address comparison check to cover all four IPv6
words. The original code only checked ip_src[0]/ip_dst[0] and
required !is_ipv6, so CMP_SRC_ADDR_L3/CMP_DEST_ADDR_L3 were never
set for IPv6 filters.
3. Use explicit flow type checks for port extraction instead of
negating IP_USER_FLOW/IPV6_USER_FLOW. The old check did not mask
FLOW_EXT, so IP_USER_FLOW | FLOW_EXT would incorrectly attempt
port extraction. Use the actual flow type to pick the correct
union member directly.
Signed-off-by: Sukhdeep Singh <sukhdeeps@marvell.com>
Link: https://patch.msgid.link/20260610115448.272-2-sukhdeeps@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The NETC switch does not age out dynamic FDB entries automatically.
Without software management, stale entries persist after topology
changes and cause incorrect forwarding.
Add a delayed work that periodically removes entries that have not been
refreshed within the specified cycles. The effective ageing time is:
ageing_time = fdbt_ageing_delay * 100
Default values are 3s interval and 100 cycles (300s total), matching
the IEEE 802.1Q default ageing time. The work starts when the first
port joins a bridge (tracked via br_cnt) and is cancelled when the
last port leaves. All FDB operations are serialized under fdbt_lock.
Implement .set_ageing_time() to allow the bridge layer to reconfigure
ageing parameters on demand.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-10-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Wire up the port_bridge_join, port_bridge_leave and port_vlan_filtering
DSA callbacks to support both VLAN-unaware and VLAN-aware bridge modes.
For VLAN-unaware bridges, each bridge instance is assigned a dedicated
internal PVID via NETC_VLAN_UNAWARE_PVID(bridge.num), counting down
from VID 4095. A VFT entry is created for this PVID with hardware MAC
learning and flood-on-miss forwarding enabled. The CPU port is included
as a VFT member so frames can reach the host. The reserved VID range is
blocked in port_vlan_add to prevent user-space conflicts.
Only one VLAN-aware bridge is supported at a time; this constraint is
enforced in port_bridge_join and port_vlan_filtering. The per-port PVID
is tracked in software and written to the BPDVR register whenever VLAN
filtering is active.
When a port leaves the bridge, its dynamic FDB entries are flushed right
away in port_bridge_leave(), without waiting for the ageing cycle. When
a link down event occurs on a port, netc_mac_link_down() will also clear
the port's dynamic FDB entries via netc_port_remove_dynamic_entries().
Non-bridge ports have no dynamic FDB entries, so this call is always
safe. Additionally, .port_fast_age() callback is added to flush the
dynamic FDB entries associated to a port.
Host flood rules are removed from the ingress port filter table when a
port joins a bridge to avoid bypassing FDB lookup and MAC learning.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-9-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Implement the DSA .port_vlan_add and .port_vlan_del operations to enable
VLAN-aware bridge offloading on the NETC switch.
VLAN membership is maintained in the VLAN Filter Table (VFT). Adding the
first port to a VLAN creates a new VFT entry with hardware MAC learning
and flood-on-miss forwarding; subsequent ports update the existing
entry's membership bitmap. Removing the last port deletes the entry.
Egress tagging is handled through the Egress Treatment Table (ETT). Each
VLAN is allocated a group of ETT entries, one per available port. Ports
are assigned a sequential ett_offset during initialisation, used to
address each port's entry within the group. Untagged ports configure the
ETT to strip the outer VLAN tag; tagged ports pass frames through
unmodified. Each ETT group is optionally paired with an Egress Counter
Table (ECT) group for per-port frame counting, allocated on a best-effort
basis. When the egress rule of an ETT entry changes, the counter of the
corresponding ECT entry will be recounted to track the number of frames
that match the new egress rule.
A software shadow list serialised by vft_lock tracks active VLAN state
across both port membership and egress tagging. VID 0 is used for single
port mode and is ignored by both callbacks.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-8-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
NTMP index tables require software to allocate and manage entry IDs.
Add two bitmap helper functions to facilitate this management:
ntmp_lookup_free_eid(): finds the first zero bit in the given bitmap,
sets it to mark the entry as in-use, and returns the corresponding entry
ID. Returns NTMP_NULL_ENTRY_ID if no free entry is available.
ntmp_clear_eid_bitmap(): clears the bit associated with the given entry
ID in the bitmap to mark the entry as free. It is a no-op if the entry
ID is NTMP_NULL_ENTRY_ID.
Both functions are exported for use by other modules, such as the NETC
switch driver which needs to manage group index bitmaps for the Egress
Treatment Table (ETT) and Egress Count Table (ECT).
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-7-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Egress Treatment Table (ETT) and Egress Count Table (ECT) are both
index tables whose entry IDs are allocated by software. Every num_ports
entries form a group, where each entry in the group corresponds to one
port. To facilitate group allocation and management, initialize the group
index bitmaps for both tables based on hardware capabilities reported by
ETTCAPR and ECTCAPR registers.
The bitmap size per table is calculated as the total number of hardware
entries divided by the number of available ports, which gives the number
of groups available for software allocation. A set bit in the bitmap
represents a group index that has been allocated.
These bitmaps will be used by subsequent patches that add VLAN support.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-6-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The egress count table is a static bounded index table, egress related
statistics are maintained in this table. The table is implemented as a
linear array of entries accessed using an index (0, 1, 2, ..., n) that
uniquely identifies an entry within the array. Egress Counter Entry ID
(EC_EID) is used as an index to an entry in this table. The EC_EID is
specified in the egress treatment table.
Egress count table entries are always present and enabled. The table
only supports access via entry ID, which is assigned by the software.
And it supports Update, Query and Query followed by Update operations.
Currently, only Update operation is supported.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-5-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Each entry in the egress treatment table contains the egress packet
processing actions to be applied to a grouping or scope of packets
exiting on a particular egress port of the switch. A scope of packets,
for example, could be the packets exiting a particular VLAN, matching
a particular 802.1Q bridge forwarding entry or belonging to a stream
identified at ingress. The egress treatment table is implemented as a
linear array of entries accessed using an index (0,1, 2, ..., n) that
uniquely identifies an entry within the array.
The egress treatment table only supports access vid entry ID, which is
assigned by the software. It supports Add, Update, Delete and Query
operations. Note that only Query operation is not supported yet.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-4-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add two interfaces to manage entries in the VLAN filter table:
ntmp_vft_update_entry(): Update the configuration element data of the
specified VLAN filter entry based on the given VLAN ID. It uses the
exact key access method to locate the entry.
ntmp_vft_delete_entry(): Delete the VLAN filter entry corresponding to
the specified VLAN ID. It also uses the exact key access method to
identify the target entry.
In addition, introduce struct vft_req_qd to describe the request data
buffer format for Query and Delete actions of the VLAN filter table,
which contains a common request data header and a VLAN access key.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-3-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add three interfaces to manage dynamic entries in the FDB table:
ntmp_fdbt_update_activity_element(): Update the activity element of all
dynamic FDB entries. For each entry, if its activity flag is not set,
which means no packet has matched this entry since the last update, the
activity counter is incremented. Otherwise, both the activity flag and
activity counter are reset. The activity counter is used to track how
long an FDB entry has been inactive, which is useful for implementing
an ageing mechanism.
ntmp_fdbt_delete_ageing_entries(): Delete all dynamic FDB entries whose
activity flag is not set and whose activity counter is greater than or
equal to the specified threshold. This is used to remove stale entries
that have been inactive for too long.
ntmp_fdbt_delete_port_dynamic_entries(): Delete all dynamic FDB entries
associated with the specified switch port. This is typically called when
a port goes down or is removed from a bridge.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20260611021458.2629145-2-wei.fang@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 7662abf4db94 ("net: phy: sfp: Add support for SMBus module access")
added SMBus access for SFP modules, but limited it to single-byte
transfers. As a side effect, hwmon is disabled (16-bit reads cannot be
guaranteed atomic) and a warning is printed.
Many SMBus-only I2C controllers in the wild support more than just
byte access, and SFP cages are often wired to such controllers
rather than to a full-featured I2C controller -- e.g. the SMBus
controllers in the Realtek longan and mango SoCs, which advertise
word access and I2C block reads. Today, they cannot drive an SFP at
all without falling back to the byte-only path.
Extend sfp_smbus_read()/sfp_smbus_write() so that, in addition to
the existing byte access, they also use SMBus word access and SMBus
I2C block access whenever the adapter advertises them. Both
directions are handled in a single read and a single write helper
that pick the largest supported transfer per chunk and fall back as
needed.
I2C-block is preferred unconditionally when available: the protocol
carries any length 1..32, so it can serve every chunk -- including
the 1- and 2-byte tails -- without help from word or byte access.
Note that this requires I2C_FUNC_SMBUS_I2C_BLOCK, which reads a
caller-specified number of bytes. This deviates from the official
SMBus Block Read (length is supplied by the slave) but is widely
supported by Linux I2C controllers/drivers.
Capability matrix this implementation supports:
- BYTE only: works (unchanged behaviour); 1-byte
xfers, hwmon disabled.
- BYTE + WORD: word for >=2-byte chunks, byte for
trailing odd byte.
- I2C_BLOCK present (with or
without BYTE/WORD): block as the universal transport for
every chunk.
- WORD only (no BYTE/BLOCK): accepted with WARN_ONCE. Even-length
transfers work; odd-length transfers
(e.g. the 3-byte cotsworks fixup
write) hit the BYTE branch which the
adapter does not implement, so the
xfer returns an error and the
operation is aborted. No mainline
I2C driver was found to advertise
WORD without BYTE; the warning lets
us learn about it if it ever shows
up.
Adapters with asymmetric R/W capabilities (e.g. only READ_I2C_BLOCK
but not WRITE_I2C_BLOCK) remain functionally correct -- the
per-iteration fallback uses the direction-specific bits -- but the
shared i2c_max_block_size is sized by the all-bits-set check, so a
transfer in the better-supported direction is not upgraded. None of
the mainline I2C bus drivers surveyed during review advertise such
asymmetry; promoting i2c_max_block_size to per-direction sizes can
be revisited if needed.
Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260614133418.2068201-3-jelonek.jonas@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The SFP driver assumes all I2C adapters support reading and writing the
pre-defined block size SFP_EEPROM_BLOCK_SIZE of 16 bytes. This constant
was probably chosen based on good guesses and known limitations of a
range of I2C adapters and SFP modules.
However, I2C adapters may even support less and usually need to specify
this via I2C quirks. Theoretically, such an adapter may provide full
functionality but only support a read and write length of e.g. 8 bytes.
Currently, the SFP driver doesn't account for that.
Add handling for I2C quirks in SFP I2C configuration taking the fields
max_read_len and max_write_len in struct i2c_adapter_quirks into account
to further limit the maximum block size if needed.
Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260614133418.2068201-2-jelonek.jonas@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
otx2vf_register_mbox_intr() currently installs the VF mailbox IRQ
handler before clearing stale mailbox interrupt state. The code then says
that local interrupt bits should be cleared first to avoid spurious
interrupts, but that clear still happens only after request_irq() has
already made the handler reachable.
A running system can reach this during VF mailbox interrupt registration
while stale or latched RVU_VF_INT state is still present. If delivery
happens in the request_irq()-to-clear window,
otx2vf_vfaf_mbox_intr_handler() can run before local quiesce and touch
the same vf->mbox and vf->mbox_wq carrier that probe and teardown later
reuse or destroy.
Move the stale mailbox interrupt clear ahead of request_irq(), but keep
interrupt enabling after the handler is installed. This closes the
pre-clear early-IRQ window without creating a new enable-before-handler
window.
Fixes: 3184fb5ba96e ("octeontx2-vf: Virtual function driver support")
Cc: stable@vger.kernel.org
Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://patch.msgid.link/20260611160014.3202224-3-runyu.xiao@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
otx2_register_mbox_intr() currently installs the PF mailbox IRQ handler
before clearing stale mailbox interrupt state. The function itself then
comments that the local interrupt bits must be cleared first to avoid
spurious interrupts, but that clear happens only after request_irq() has
already exposed the handler to irq delivery.
A running system can reach this during PF mailbox interrupt registration
while stale or latched RVU_PF_INT state is still present. If delivery
happens in the request_irq()-to-clear window,
otx2_pfaf_mbox_intr_handler() can run before local quiesce and touch
the same pf->mbox and pf->mbox_wq carrier that probe and teardown later
reuse or destroy.
Move the stale mailbox interrupt clear ahead of request_irq(), but keep
interrupt enabling after the handler is installed. This closes the
pre-clear early-IRQ window without creating a new enable-before-handler
window.
Fixes: 5a6d7c9daef3 ("octeontx2-pf: Mailbox communication with AF")
Cc: stable@vger.kernel.org
Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ratheesh Kannoth <rkannoth@marvell.com>
Link: https://patch.msgid.link/20260611160014.3202224-2-runyu.xiao@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
An SFP cage (compatible "sff,sfp") whose MOD_DEF0 signal is not wired to a
GPIO currently falls back to sff_gpio_get_state(), which unconditionally
reports the module as present. An empty cage therefore fails its probe and
is parked in SFP_MOD_ERROR forever; because SFP_F_PRESENT never deasserts
there is no REMOVE event to recover the state machine, so a module inserted
after boot is never detected, and empty cages spam -EIO at boot.
This affects boards that route none of the cage presence signal to a
software-readable input. On the NicGiga S100-0800S-M (RTL9303, 8x SFP+) the
cage I2C bus is the switch's SMBus master; TX_DISABLE is driven via a
PCA9534 I/O expander, but no MOD_ABS/MOD_DEF0 line reaches a readable GPIO
(the RTL9303 gpio0 lines read stuck-low, the single PCA9534 is fully
consumed by TX_DISABLE, and there is no RTL8231). The Horaco ZX-SW82TS-L2P
(RTL9302D, 2x SFP+) is independently affected in the same way.
For such an SFP cage, derive presence from a throttled single-byte I2C read
of the module EEPROM instead: a successful read asserts SFP_F_PRESENT,
R_PROBE_ABSENT consecutive failures clear it (to ride out a transient error
on a live module). The existing poll then emits SFP_E_INSERT / SFP_E_REMOVE
normally, giving working hot-plug and silencing the boot-time -EIO spam on
empty cages. Presence is re-probed every T_PROBE_PRESENT, so insertion is
detected within that interval and removal within
T_PROBE_PRESENT * R_PROBE_ABSENT.
A soldered-down module (compatible "sff,sff") has no presence signal and is
genuinely always present, so it continues to use sff_gpio_get_state(); the
new path is gated on the cage type advertising SFP_F_PRESENT.
Signed-off-by: Greg Patrick <gregspatrick@hotmail.com>
Tested-by: Manuel Stocker <mensi@mensi.ch>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20260611175341.2223184-1-gregspatrick@hotmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The devlink resource ID for ATU collides with the sentinel
DEVLINK_RESOURCE_ID_PARENT_TOP (0). As a result, ATU_bin_* are
registered as in fact registered as top-level siblings, not as children
of ATU.
Whether intentional or unintentional, clarify it by keeping the real
resource IDs starting at 1. Unfortunately ATU_bin_* are already
registered at top-level, so keep their parent to PARENT_TOP.
Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20260611070856.889700-5-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This might not cause real problems, but the hellcreek devlink resource
ID collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid
it by keeping the real resource IDs starting at 1.
Signed-off-by: David Yang <mmyangfl@gmail.com>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://patch.msgid.link/20260611070856.889700-4-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This might not cause real problems, but the b53 devlink resource ID
collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid it
by keeping the real resource IDs starting at 1.
Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20260611070856.889700-3-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This might not cause real problems, but the dsa_loop devlink resource ID
collides with the sentinel DEVLINK_RESOURCE_ID_PARENT_TOP (0). Avoid it
by keeping the real resource IDs starting at 1.
Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20260611070856.889700-2-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The genpd provider bus is really only used when
CONFIG_PM_GENERIC_DOMAINS_OF is enabled, and since the recent deferred
initialisation of domain parent devices, the root device pointer is
otherwise unused.
Fix the unused variable warning by moving the definition of the root device
pointer inside the corresponding ifdef.
Fixes: 92b69eff8012 ("pmdomain: core: fix early domain registration")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202606111746.kAxaAbwg-lkp@intel.com/
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ulf Hansson <ulfh@kernel.org>
|
|
One fewer allocation for the priv struct.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Acked-by: Kurt Kanzenbach <kurt@linutronix.de>
Link: https://patch.msgid.link/20260608045640.5172-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the restriction blocking SD on embedded CPU PFs (ECPF), enabling
SD functionality on BlueField DPUs. Remove the blocker preventing SD
devices from transitioning to switchdev mode.
The infrastructure added in earlier patches properly handles this case.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-16-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Allow SD devices to transition to switchdev before the SD group is
fully up. Metadata allocation requires the SD group to be ready, so
defer it from esw_offloads_enable() until SD shared-FDB activation.
Add mlx5_esw_offloads_init_deferred_metadata() which allocates per-vport
metadata and refreshes the ingress ACLs that were previously programmed
with metadata=0. The helper is idempotent and can be called multiple
times.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-15-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On an SD device, vport representors are not functional until the SD
group is combined and shared FDB is active. Skip the initial load and
the reload paths in that window; reps are loaded as part of the SD LAG
activation flow once it becomes active.
In addition, explicitly unload representors when SD LAG is destroyed.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-14-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Enable MPESW LAG creation over SD LAG members, forming a composite LAG
hierarchy. This allows bonding multiple SD groups together under a
single MPESW configuration with shared FDB.
When enabling composite MPESW, the individual SD LAG shared FDB
configurations are temporarily torn down and recreated when the
composite LAG is disabled.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-13-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
SD LAG is a virtual LAG without hardware LAG support, so it cannot use
the firmware vport LAG commands. Implement a software-based vport LAG
using egress ACL bounce rules.
Add esw_set_slave_egress_rule() to create an egress ACL rule on the
slave's manager vport that bounces traffic to the master's manager
vport. This achieves the same traffic steering as hardware vport LAG.
Redirect mlx5_cmd_create_vport_lag() and mlx5_cmd_destroy_vport_lag()
to the software implementation when operating in SD LAG mode.
In addition, adjust lag_demux creation to check SD LAG mode as well.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Extend mlx5_lag_disable_change() to properly disable both regular LAG
and SD LAG when requested. Each LAG type uses its own devcom component
for locking.
Use mlx5_sd_get_devcom() helper to retrieve the SD devcom component,
needed for proper locking when disabling SD LAG.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The lag demux resources (flow table, flow group, and rules xarray)
are stored on the shared ldev. With Socket Direct, multiple SD groups
each create their own demux FT/FG during their master's IB device
initialization. Since they all write to the same ldev fields, the
second group's init overwrites the first group's pointers, leaking
the first group's FT/FG.
During teardown, the cleanup uses the overwritten pointers, destroying
the wrong group's resources and leaving leaked flow tables in the LAG
namespace. These leaked tables can interfere with subsequently created
demux tables.
Move the demux resources from the shared ldev to per-master lag_func
instances. Each master device now owns its own independent demux
state. The rule_add and rule_del helpers look up the appropriate
master's lag_func via the existing filter/group infrastructure.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When eswitch is disabled, notify the SD layer so it can clean up
SD-specific resources such as the TX flow table root configuration
on secondary devices.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the eswitch transitions, propagate the change to SD: secondaries
get their TX flow table root reconfigured for the new mode, and when
all group devices move to switchdev, the per-group shared FDB is
activated.
Shared FDB activation is best-effort - failure does not block the
eswitch transition; the next transition retries.
Note: the existing mlx5_get_sd() guard that blocks switchdev for SD
devices is intentionally retained. It will be removed once all
supporting patches are in place.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In Socket Direct configurations the primary and secondary PFs share the
same native_port_num. The eswitch vport metadata encodes pf_num in its
upper bits to distinguish vports across PFs. Without SD-awareness, both
PFs generate identical metadata, causing FDB rules to steer traffic to
the wrong representor.
Add mlx5_sd_pf_num_get() which remaps the pf_num for SD devices.
Use it so each PF in an SD group produces unique vport metadata.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add mlx5_fs_cmd_query_l2table_silent() to query the current silent mode
state from firmware. This allows detecting if firmware has already put
secondary devices into silent mode.
During SD group registration, query the silent mode of each device. If
a device is already in silent mode (set by firmware), record this in
the fw_silents_secondaries flag and use it to help determine the
primary/secondary roles.
When fw_silents_secondaries is set, skip the driver-initiated silent
mode set/unset operations since firmware manages this state. This
handles configurations where firmware persistently silences secondary
devices.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor SD group registration to use devcom event-driven role
determination to ensure SD is marked as ready only after roles are fully
assigned and the group state is consistent, making outside accessors,
which will be added in downstream patches, safe to use without races.
The devcom events:
- SD_PRIMARY_SET event: each device compares bus numbers with peers
to determine which should be primary
- SD_SECONDARIES_SET event: secondaries register themselves with the
elected primary device
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Some devcom events are not expected to fail. Rather than attempting
a rollback that may not be meaningful, allow callers to pass
DEVCOM_CANT_FAIL as the rollback_event to indicate that the event
handler should not fail. If it does, emit a warning and stop
propagating to further peers, but skip the rollback path.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Factor mlx5_devcom_send_event() into two functions:
- mlx5_devcom_locked_send_event(): performs the dispatch (and
rollback) with comp->sem already held by the caller.
- mlx5_devcom_send_event(): unchanged wrapper that takes comp->sem,
calls the locked variant, and releases it.
This lets callers bracket multiple event broadcasts under a single
held write lock, eliminating the gap between consecutive dispatches
where peer state could change.
Will be used by a downstream patch.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
SD secondary devices share the primary's uplink and do not have
their own uplink representor. When reloading IB reps on secondary
devices, skip the uplink and only load VF/SF vport IB reps.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20260612113904.537595-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for v7.2
There's been quite a lot of framework improvements this time around,
though mainly cleanups and robustness rather than user visible features.
The same pattern is seen with a lot of the driver work that's going on,
there are new features but a huge proportion of this is bug fixing and
cleanup work. We also have a good selectio of new device support.
- Improvements to SDCA jack handling from Charles Keepax.
- Use of device links to make suspend handling more robust from Richard
Fitzgerald.
- Use of a new helper to factor out a common pattern in SoundWire
enmeration from Charles Keepax.
- Slimming down of the component from Kuninori Morimoto.
- Simplification of format auto selection from Kuninori Morimoto.
- Lots of conversions to guard() from Bui Duc Phuc.
- Addition of a simple-amplifier driver supporting more featureful GPIO
controller amplifiers than the previous basic driver from Herve
Codina.
- Support for AMD ACP 7.x, Cirrus Logic CS42448/CS42888, Everest Semi
ES9356, Mediatek MT2701 and MT8196, Renesas RZ/G3E, Spacemit K3,
Texas Instruments TAC5xx2 and TAS67524.
|
|
The upcoming cpu idle time accounting rework involves comparing and
subtracting cross cpu time stamps. Time stamps created with the stckf
instruction monotonic with respect to the local cpu. For cross cpu
monotonic time stamps the slightly slower stcke instruction has to
be used [1].
Convert the idle time accounting relevant usages of stckf to stcke.
[1] Principles of Operation - Setting and Inspecting the Clock
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
|
|
KVM/riscv changes for 7.2
- Batch G-stage TLB flushes for GPA range based page table updates
- Convert HGEI line management to fully per-HART
- Fix missing CSR dirty marking when FWFT state updated via ONE_REG
- Fix stale FWFT feature exposure to Guest/VM
- Speed up dirty logging write faults using MMU rwlock and atomic
PTE updates using cmpxchg() for permission-only changes
- Use flexible array for APLIC IRQ state
- Use kvm_slot_dirty_track_enabled() for logging enable check on
a memslot
- Avoid skipping valid pages in kvm_riscv_gstage_wp_range()
- Avoid skipping valid pages in kvm_riscv_gstage_unmap_range()
- Use endian-specific __lelong for NACL shared memory
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: s390: New features for 7.2
New features for 7.2 for KVM/s390:
* KVM_PRE_FAULT_MEMORY support
* Support for 2G hugepages
* Support for the ASTFLEIE 2 facility
* kvm_arch_set_irq_inatomic Fast Inject
* Fix potential leak of uninitialized bytes
|
|
The recently added UltraRISC DP1000 is using this symbol, and in
a reasonable way as well, so export it.
Acked-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: Link: https://lore.kernel.org/linux-gpio/20260613164847.GA3152104@ax162/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202606130210.ytVPxHlm-lkp@intel.com/
Fixes: cb7037924836 ("pinctrl: ultrarisc: Add UltraRISC DP1000 pinctrl driver")
Signed-off-by: Linus Walleij <linusw@kernel.org>
|
|
If input_register_device() fails, we call input_free_device(), but keep
stale pointer to the old device in hidpp->input, which could potentially
lead to UAF. Fix that by resetting it to NULL before returning from
hidpp_connect_event().
Reported-by: zdi-disclosures@trendmicro.com
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl into soc/drivers
Memory controller drivers for v7.2, part two
A few improvements for Tegra Memory Controller drivers, including one
fix for UBSAN report for an older commit.
* tag 'memory-controller-drv-7.2-2' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-mem-ctrl:
memory: tegra234: drop dead NULL check in tegra234_mc_icc_aggregate()
memory: tegra264: drop redundant tegra264_mc_icc_aggregate()
memory: tegra186-emc: stop borrowing MC aggregate hook for EMC
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
interface"
This reverts commit 47d7bca76dd4f36ba0525d761f247c76ec9e4b17, which was
merged by accident.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The Tiled Display Topology ID of a DisplayID Tiled Display Topology Data
Block consists of three fields:
- Tiled Display Manufacturer/Vendor ID Field (3 bytes)
- Tiled Display Product ID Code Field (2 bytes)
- Tiled Display Serial Number Field (4 bytes)
i.e. a total of 9 bytes, not 8.
The DisplayID Tiled Display Topology ID is used as the tile group
identifier.
Update both struct displayid_tiled_block topology_id member and struct
drm_tile_group group_data member to full 9 bytes.
The group data was missing the last byte of the serial number. I don't
know whether there are known bug reports that might be linked to this,
but it's plausible the last byte could be the differentiating part for
the tile groups, and fewer tile groups might have been created than
intended.
Fixes: b49b55bd4fba ("drm/displayid: add displayid defines and edid extension (v2)")
Fixes: 138f9ebb9755 ("drm: add tile_group support. (v3)")
Cc: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org # v3.19+
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20260610141549.555605-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip
Pull x86 cpuid updates from Ingo Molnar:
- CPUID API updates (Ahmed S. Darwish):
- Introduce a centralized CPUID parser
- Introduce a centralized CPUID data model
- Introduce <asm/cpuid/leaf_types.h>
- Rename cpuid_leaf()/cpuid_subleaf() APIs
- treewide: Explicitly include the x86 CPUID headers
- Update to x86-cpuid-db v3.1 (Maciej Wieczor-Retman)
- Continued removal of pre-i586 support and related simplifications
(Ingo Molnar)
- Add Intel CPU model number for rugged Panther Lake (Tony Luck)
- Misc fixes, updates and cleanups by Arnd Bergmann, Chao Gao, Lukas
Bulwahn, Sohil Mehta, Maciej Wieczor-Retman.
* tag 'x86-cpu-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip: (25 commits)
x86/cpu: Make CONFIG_X86_CX8 unconditional
x86/cpu: Remove unused !CONFIG_X86_TSC code
x86/cpuid: Update bitfields to x86-cpuid-db v3.1
tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.1
x86/cpu: Make CONFIG_X86_TSC unconditional
MAINTAINERS: Drop obsolete FPU EMULATOR section
x86/cpu: Fix a F00F bug warning and clean up surrounding code
x86/cpu: Add Intel CPU model number for rugged Panther Lake
x86/cpuid: Introduce a centralized CPUID parser
x86/cpu: Introduce a centralized CPUID data model
x86/cpuid: Introduce <asm/cpuid/leaf_types.h>
x86/cpuid: Rename cpuid_leaf()/cpuid_subleaf() APIs
x86/cpu: Do not include the CPUID API header in asm/processor.h
Documentation: core-api/cpu_hotplug: Remove stale cpu0_hotplug docs
x86/cpu, cpufreq: Remove AMD ELAN support
x86/fpu: Remove the math-emu/ FPU emulation library
x86/fpu: Remove the 'no387' boot option
x86/fpu: Remove MATH_EMULATION and related glue code
treewide: Explicitly include the x86 CPUID headers
x86/cpu: Remove the CONFIG_X86_INVD_BUG quirk
...
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip
Pull x86/msr updates from Ingo Molnar:
- Large series to reorganize the rdmsr/wrmsr APIs to remove
32-bit variants and convert to 64-bit variants (Juergen Gross)
- Fix W=1 warning (HyeongJun An)
* tag 'x86-msr-2026-06-14' of gitolite.kernel.org:pub/scm/linux/kernel/git/tip/tip:
x86/msr: Remove wrmsrl()
x86/msr: Switch wrmsrl() users to wrmsrq()
x86/msr: Remove rdmsrl()
x86/msr: Switch rdmsrl() users to rdmsrq()
x86/msr: Remove wrmsr_safe_on_cpu()
x86/msr: Switch wrmsr_safe_on_cpu() users to wrmsrq_safe_on_cpu()
x86/msr: Remove rdmsr_safe_on_cpu()
x86/msr: Switch rdmsr_safe_on_cpu() users to rdmsrq_safe_on_cpu()
x86/msr: Don't use rdmsr_safe_on_cpu() in rdmsrq_safe_on_cpu()
x86/msr: Remove wrmsr_on_cpu()
x86/msr: Switch wrmsr_on_cpu() users to wrmsrq_on_cpu()
x86/msr: Remove rdmsr_on_cpu()
x86/msr: Switch rdmsr_on_cpu() users to rdmsrq_on_cpu()
x86/msr: Remove rdmsrl_on_cpu()
x86/msr: Switch rdmsrl_on_cpu() user to rdmsrq_on_cpu()
x86/process: Convert rdmsr() to rdmsrq() in arch_post_acpi_subsys_init() to address W=1 warning
|