| Age | Commit message (Collapse) | Author |
|
The ADP558x family of devices can be programmed to respond to some
especial events, In case of the unlock events, one can lock the keypad
and use KEYS or GPIs events to unlock it. For the reset events, one can
again use a combinations of GPIs/KEYs in order to generate an event that
will trigger the device to generate an output reset pulse.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-13-b1fcfe9e9826@analog.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
These devices are capable of generate FIFO based events based on KEY or
GPI presses. Add support for handling these events. This is in
preparation of adding full support for keymap and gpis based events.
Reviewed-by: Lee Jones <lee@kernel.org>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-12-b1fcfe9e9826@analog.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Add support for the adp5589 I/O expander. From a PWM point of view it is
pretty similar to adp5585. Main difference is the address
of registers meaningful for configuring the PWM.
Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-10-b1fcfe9e9826@analog.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Support the adp5589 I/O expander which supports up to 19 pins. We need
to add a chip_info based struct since accessing register "banks"
and "bits" differs between devices.
Also some register addresses are different.
While at it move ADP558X_GPIO_MAX defines to the main header file and
rename them. That information will be needed by the top level device in
a following change.
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-9-b1fcfe9e9826@analog.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
There are some differences in the register map between the devices.
Hence, add a register structure per device. This will be needed in
following patches.
On top of that adp5585_fill_regmap_config() is renamed and reworked so
that the current struct adp5585_info act as template (they indeed
contain all the different data between variants) which can then be
complemented depending on the device (as identified by the id register).
This is done like this since a lot of the data is pretty much the same
between variants of the same device.
Reviewed-by: Lee Jones <lee@kernel.org>
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-8-b1fcfe9e9826@analog.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
The ADP5589 is a 19 I/O port expander with built-in keypad matrix decoder,
programmable logic, reset generator, and PWM generator.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-7-b1fcfe9e9826@analog.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
The only thing changing between variants is the regmap default
registers. Hence, instead of having a regmap configuration for every
variant (duplicating lots of fields), add a chip info type of structure
with a regmap ID to identify which defaults to use and populate
regmap_config at runtime given a template plus the id. Also note that
between variants, the defaults can be the same which means the chip info
structure can be used in more than one compatible.
This will also make it simpler adding new chips with more variants.
Also note that the chip info structures are deliberately not const as
they will also contain lots of members that are the same between the
different devices variants and so we will fill those at runtime.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-6-b1fcfe9e9826@analog.com
Signed-off-by: Lee Jones <lee@kernel.org>
|
|
Introduce new hooks for setting and getting filesystem extended
attributes on inode (FS_IOC_FSGETXATTR).
Cc: selinux@vger.kernel.org
Cc: Paul Moore <paul@paul-moore.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-2-c4e3bc35227b@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This patch moves function related to file extended attributes
manipulations to separate file. Refactoring only.
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-1-c4e3bc35227b@kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Make a number of updates to the netfs tracepoints:
(1) Remove a duplicate trace from netfs_unbuffered_write_iter_locked().
(2) Move the trace in netfs_wake_rreq_flag() to after the flag is cleared
so that the change appears in the trace.
(3) Differentiate the use of netfs_rreq_trace_wait/woke_queue symbols.
(4) Don't do so many trace emissions in the wait functions as some of them
are redundant.
(5) In netfs_collect_read_results(), differentiate a subreq that's being
abandoned vs one that has been consumed in a regular way.
(6) Add a tracepoint to indicate the call to ->ki_complete().
(7) Don't double-increment the subreq_counter when retrying a write.
(8) Move the netfs_sreq_trace_io_progress tracepoint within cifs code to
just MID_RESPONSE_RECEIVED and add different tracepoints for other MID
states and note check failure.
Signed-off-by: David Howells <dhowells@redhat.com>
Co-developed-by: Paulo Alcantara <pc@manguebit.org>
Signed-off-by: Paulo Alcantara <pc@manguebit.org>
Link: https://lore.kernel.org/20250701163852.2171681-14-dhowells@redhat.com
cc: Steve French <sfrench@samba.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Renumber the NETFS_RREQ_* flags to put the most useful status bits in the
bottom nibble - and therefore the last hex digit in the trace output -
making it easier to grasp the state at a glance.
In particular, put the IN_PROGRESS flag in bit 0 and ALL_QUEUED at bit 1.
Also make the flags field in /proc/fs/netfs/requests larger to accommodate
all the flags.
Also make the flags field in the netfs_sreq tracepoint larger to
accommodate all the NETFS_SREQ_* flags.
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/20250701163852.2171681-13-dhowells@redhat.com
Reviewed-by: Paulo Alcantara <pc@manguebit.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Netfslib has two functions for updating the i_size after a write: one for
buffered writes into the pagecache and one for direct/unbuffered writes.
However, what needs to be done is much the same in both cases, so merge
them together.
This does raise one question, though: should updating the i_size after a
direct write do the same estimated update of i_blocks as is done for
buffered writes.
Also get rid of the cleanup function pointer from netfs_io_request as it's
only used for direct write to update i_size; instead do the i_size setting
directly from write collection.
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/20250701163852.2171681-12-dhowells@redhat.com
cc: Steve French <sfrench@samba.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: linux-cifs@vger.kernel.org
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
If a netfs request finishes during the pause loop, it will have the ref
that belongs to the IN_PROGRESS flag removed at that point - however, if it
then goes to the final wait loop, that will *also* put the ref because it
sees that the IN_PROGRESS flag is clear and incorrectly assumes that this
happened when it called the collector.
In fact, since IN_PROGRESS is clear, we shouldn't call the collector again
since it's done all the cleanup, such as calling ->ki_complete().
Fix this by making netfs_collect_in_app() just return, indicating that
we're done if IN_PROGRESS is removed.
Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/20250701163852.2171681-3-dhowells@redhat.com
Tested-by: Steve French <sfrench@samba.org>
Reviewed-by: Paulo Alcantara <pc@manguebit.org>
cc: Steve French <sfrench@samba.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Document the dma_pool_zalloc() wrapper.
Signed-off-by: Petr Tesarik <ptesarik@suse.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
[jc: fixed up dma_pool_alloc() reference in dmapool.h]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250627101015.1600042-5-ptesarik@suse.com
|
|
Add two variants of helper functions that calculate the correct number
of queues to use. Two variants are needed because some drivers base
their maximum number of queues on the possible CPU mask, while others
use the online CPU mask.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20250617-isolcpus-queue-counters-v1-2-13923686b54b@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
group_cpu_evenly() might have allocated less groups then requested:
group_cpu_evenly()
__group_cpus_evenly()
alloc_nodes_groups()
# allocated total groups may be less than numgrps when
# active total CPU number is less then numgrps
In this case, the caller will do an out of bound access because the
caller assumes the masks returned has numgrps.
Return the number of groups created so the caller can limit the access
range accordingly.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Daniel Wagner <wagi@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250617-isolcpus-queue-counters-v1-1-13923686b54b@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Update header inclusions to follow IWYU (Include What You Use)
principle.
Note that kernel.h is discouraged to be included as it's written
at the top of that file.
While doing that, sort headers alphabetically.
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://patch.msgid.link/20250627142001.994860-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
|
|
If CONFIG_ACPI_SPCR_TABLE is disabled, acpi_parse_spcr()
currently returns 0, which may incorrectly suggest that
SPCR parsing was successful. This patch changes the behavior
to return -ENODEV to clearly indicate that SPCR support
is not available.
This prepares the codebase for future changes that depend
on acpi_parse_spcr() failure detection, such as suppressing
misleading console messages.
Signed-off-by: Li Chen <chenl311@chinatelecom.cn>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/20250620131309.126555-2-me@linux.beauty
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In both the read callback for struct cyclecounter, and in struct
timecounter, struct cyclecounter is declared as a const pointer.
Unfortunatly, a number of users of this pointer treat it as a non-const
pointer as it is burried in a larger structure that is heavily modified by
the callback function when accessed. This lie had been hidden by the fact
that container_of() "casts away" a const attribute of a pointer without any
compiler warning happening at all.
Fix this all up by removing the const attribute in the needed places so
that everyone can see that the structure really isn't const, but can,
and is, modified by the users of it.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/2025070124-backyard-hurt-783a@gregkh
|
|
Merge series from Charles Keepax <ckeepax@opensource.cirrus.com>:
Add a maintainers entry for SDCA, do a couple of small fixups for
previous chains, and then adding the beginnings of the SDCA IRQ
handling. This is based around a regmap IRQ chip and a few helper
functions that can be called from the client drivers to setup the
IRQs.
|
|
Add a new ioctl, FS_IOC_GETLBMD_CAP, to query metadata and protection
info (PI) capabilities. This ioctl returns information about the files
integrity profile. This is useful for userspace applications to
understand a files end-to-end data protection support and configure the
I/O accordingly.
For now this interface is only supported by block devices. However the
design and placement of this ioctl in generic FS ioctl space allows us
to extend it to work over files as well. This maybe useful when
filesystems start supporting PI-aware layouts.
A new structure struct logical_block_metadata_cap is introduced, which
contains the following fields:
1. lbmd_flags: bitmask of logical block metadata capability flags
2. lbmd_interval: the amount of data described by each unit of logical
block metadata
3. lbmd_size: size in bytes of the logical block metadata associated
with each interval
4. lbmd_opaque_size: size in bytes of the opaque block tag associated
with each interval
5. lbmd_opaque_offset: offset in bytes of the opaque block tag within
the logical block metadata
6. lbmd_pi_size: size in bytes of the T10 PI tuple associated with each
interval
7. lbmd_pi_offset: offset in bytes of T10 PI tuple within the logical
block metadata
8. lbmd_pi_guard_tag_type: T10 PI guard tag type
9. lbmd_pi_app_tag_size: size in bytes of the T10 PI application tag
10. lbmd_pi_ref_tag_size: size in bytes of the T10 PI reference tag
11. lbmd_pi_storage_tag_size: size in bytes of the T10 PI storage tag
The internal logic to fetch the capability is encapsulated in a helper
function blk_get_meta_cap(), which uses the blk_integrity profile
associated with the device. The ioctl returns -EOPNOTSUPP, if
CONFIG_BLK_DEV_INTEGRITY is not enabled.
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Link: https://lore.kernel.org/20250630090548.3317-5-anuj20.g@samsung.com
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Introduce a new pi_tuple_size field in struct blk_integrity to
explicitly represent the size (in bytes) of the protection information
(PI) tuple. This is a prep patch.
Add validation in blk_validate_integrity_limits() to ensure that
pi size matches the expected size for known checksum types and never
exceeds the pi_tuple_size.
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Link: https://lore.kernel.org/20250630090548.3317-3-anuj20.g@samsung.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The tuple_size field in blk_integrity currently represents the total
size of metadata associated with each data interval. To make the meaning
more explicit, rename tuple_size to metadata_size. This is a purely
mechanical rename with no functional changes.
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Link: https://lore.kernel.org/20250630090548.3317-2-anuj20.g@samsung.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Implementing if-conditions inside for_each_x() macros requires some
thinking to avoid side effects in the calling code. Resulting code
may look somewhat awkward, and there are couple of different ways it is
usually done.
Standardizing this to one way can help making it more obvious for a code
reader and writer. The newly added for_each_if() is a way to achieve this.
Use for_each_if() to make these macros look like many others which
should in the long run help reading the code.
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/c98b39a7195006fdd24590b8d11bb271a72a0c8a.1749453752.git.mazziesaccount@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Because pps_cdev_poll() returns unconditionally EPOLLIN,
a user space program that calls select/poll get always an immediate data
ready-to-read response. As a result the intended use to wait until next
data becomes ready does not work.
User space snippet:
struct pollfd pollfd = {
.fd = open("/dev/pps0", O_RDONLY),
.events = POLLIN|POLLERR,
.revents = 0 };
while(1) {
poll(&pollfd, 1, 2000/*ms*/); // returns immediate, but should wait
if(revents & EPOLLIN) { // always true
struct pps_fdata fdata;
memset(&fdata, 0, sizeof(memdata));
ioctl(PPS_FETCH, &fdata); // currently fetches data at max speed
}
}
Lets remember the last fetch event counter and compare this value
in pps_cdev_poll() with most recent event counter
and return 0 if they are equal.
Signed-off-by: Denis OSTERLAND-HEIM <denis.osterland@diehl.com>
Co-developed-by: Rodolfo Giometti <giometti@enneenne.com>
Signed-off-by: Rodolfo Giometti <giometti@enneenne.com>
Fixes: eae9d2ba0cfc ("LinuxPPS: core support")
Link: https://lore.kernel.org/all/f6bed779-6d59-4f0f-8a59-b6312bd83b4e@enneenne.com/
Acked-by: Rodolfo Giometti <giometti@enneenne.com>
Link: https://lore.kernel.org/r/c3c50ad1eb19ef553eca8a57c17f4c006413ab70.camel@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Currently, the capability check is done in the default
init_user_ns user namespace. When a process runs in a
non default user namespace, such check fails. Due to this
when a process is running using Podman, it fails to create
the flow resource.
Since the RDMA device is a resource within a network namespace,
use the network namespace associated with the RDMA device to
determine its owning user namespace.
Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Suggested-by: Eric W. Biederman <ebiederm@xmission.com>
Link: https://patch.msgid.link/6df6f2f24627874c4f6d041c19dc1f6f29f68f84.1750963874.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
|
|
Drop the unused MIPI_DSI_MODE_VSYNC_FLUSH flag. Whether or not a display
FIFO flush on vsync is required to avoid sending garbage to the panel is
not a property of the DSI link, but of the integration between display
controller and DSI host bridge.
Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://lore.kernel.org/r/20250627-dsi-vsync-flush-v2-4-4066899a5608@pengutronix.de
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
|
|
Currently, UBLK_IO_REGISTER_IO_BUF and UBLK_IO_UNREGISTER_IO_BUF are
only permitted on the ublk_io's daemon task. But this restriction is
unnecessary. ublk_register_io_buf() calls __ublk_check_and_get_req() to
look up the request from the tagset and atomically take a reference on
the request without accessing the ublk_io. ublk_unregister_io_buf()
doesn't use the q_id or tag at all.
So allow these opcodes even on tasks other than io->task.
Handle UBLK_IO_UNREGISTER_IO_BUF before obtaining the ubq and io since
the buffer index being unregistered is not necessarily related to the
specified q_id and tag.
Add a feature flag UBLK_F_BUF_REG_OFF_DAEMON that userspace can use to
determine whether the kernel supports off-daemon buffer registration.
Suggested-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250620151008.3976463-10-csander@purestorage.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
tl;dr
=====
Add a new neighbor flag ("extern_valid") that can be used to indicate to
the kernel that a neighbor entry was learned and determined to be valid
externally. The kernel will not try to remove or invalidate such an
entry, leaving these decisions to the user space control plane. This is
needed for EVPN multi-homing where a neighbor entry for a multi-homed
host needs to be synced across all the VTEPs among which the host is
multi-homed.
Background
==========
In a typical EVPN multi-homing setup each host is multi-homed using a
set of links called ES (Ethernet Segment, i.e., LAG) to multiple leaf
switches (VTEPs). VTEPs that are connected to the same ES are called ES
peers.
When a neighbor entry is learned on a VTEP, it is distributed to both ES
peers and remote VTEPs using EVPN MAC/IP advertisement routes. ES peers
use the neighbor entry when routing traffic towards the multi-homed host
and remote VTEPs use it for ARP/NS suppression.
Motivation
==========
If the ES link between a host and the VTEP on which the neighbor entry
was locally learned goes down, the EVPN MAC/IP advertisement route will
be withdrawn and the neighbor entries will be removed from both ES peers
and remote VTEPs. Routing towards the multi-homed host and ARP/NS
suppression can fail until another ES peer locally learns the neighbor
entry and distributes it via an EVPN MAC/IP advertisement route.
"draft-rbickhart-evpn-ip-mac-proxy-adv-03" [1] suggests avoiding these
intermittent failures by having the ES peers install the neighbor
entries as before, but also injecting EVPN MAC/IP advertisement routes
with a proxy indication. When the previously mentioned ES link goes down
and the original EVPN MAC/IP advertisement route is withdrawn, the ES
peers will not withdraw their neighbor entries, but instead start aging
timers for the proxy indication.
If an ES peer locally learns the neighbor entry (i.e., it becomes
"reachable"), it will restart its aging timer for the entry and emit an
EVPN MAC/IP advertisement route without a proxy indication. An ES peer
will stop its aging timer for the proxy indication if it observes the
removal of the proxy indication from at least one of the ES peers
advertising the entry.
In the event that the aging timer for the proxy indication expired, an
ES peer will withdraw its EVPN MAC/IP advertisement route. If the timer
expired on all ES peers and they all withdrew their proxy
advertisements, the neighbor entry will be completely removed from the
EVPN fabric.
Implementation
==============
In the above scheme, when the control plane (e.g., FRR) advertises a
neighbor entry with a proxy indication, it expects the corresponding
entry in the data plane (i.e., the kernel) to remain valid and not be
removed due to garbage collection or loss of carrier. The control plane
also expects the kernel to notify it if the entry was learned locally
(i.e., became "reachable") so that it will remove the proxy indication
from the EVPN MAC/IP advertisement route. That is why these entries
cannot be programmed with dummy states such as "permanent" or "noarp".
Instead, add a new neighbor flag ("extern_valid") which indicates that
the entry was learned and determined to be valid externally and should
not be removed or invalidated by the kernel. The kernel can probe the
entry and notify user space when it becomes "reachable" (it is initially
installed as "stale"). However, if the kernel does not receive a
confirmation, have it return the entry to the "stale" state instead of
the "failed" state.
In other words, an entry marked with the "extern_valid" flag behaves
like any other dynamically learned entry other than the fact that the
kernel cannot remove or invalidate it.
One can argue that the "extern_valid" flag should not prevent garbage
collection and that instead a neighbor entry should be programmed with
both the "extern_valid" and "extern_learn" flags. There are two reasons
for not doing that:
1. Unclear why a control plane would like to program an entry that the
kernel cannot invalidate but can completely remove.
2. The "extern_learn" flag is used by FRR for neighbor entries learned
on remote VTEPs (for ARP/NS suppression) whereas here we are
concerned with local entries. This distinction is currently irrelevant
for the kernel, but might be relevant in the future.
Given that the flag only makes sense when the neighbor has a valid
state, reject attempts to add a neighbor with an invalid state and with
this flag set. For example:
# ip neigh add 192.0.2.1 nud none dev br0.10 extern_valid
Error: Cannot create externally validated neighbor with an invalid state.
# ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid
# ip neigh replace 192.0.2.1 nud failed dev br0.10 extern_valid
Error: Cannot mark neighbor as externally validated with an invalid state.
The above means that a neighbor cannot be created with the
"extern_valid" flag and flags such as "use" or "managed" as they result
in a neighbor being created with an invalid state ("none") and
immediately getting probed:
# ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
Error: Cannot create externally validated neighbor with an invalid state.
However, these flags can be used together with "extern_valid" after the
neighbor was created with a valid state:
# ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid
# ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
One consequence of preventing the kernel from invalidating a neighbor
entry is that by default it will only try to determine reachability
using unicast probes. This can be changed using the "mcast_resolicit"
sysctl:
# sysctl net.ipv4.neigh.br0/10.mcast_resolicit
0
# tcpdump -nn -e -i br0.10 -Q out arp &
# ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
# sysctl -wq net.ipv4.neigh.br0/10.mcast_resolicit=3
# ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
iproute2 patches can be found here [2].
[1] https://datatracker.ietf.org/doc/html/draft-rbickhart-evpn-ip-mac-proxy-adv-03
[2] https://github.com/idosch/iproute2/tree/submit/extern_valid_v1
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20250626073111.244534-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add new helper macros and enums to help identifying the platform and some
characteristics of it at runtime.
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250519-dev-axi-clkgen-limits-v6-4-bc4b3b61d1d4@analog.com
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
The adi-axi-common.h header has some common defines used in various ADI
IPs. However they are not specific for any fpga manager so it's
questionable for the header to live under include/linux/fpga. Hence
let's just move one directory up and update all users.
Suggested-by: Xu Yilun <yilun.xu@linux.intel.com>
Acked-by: Xu Yilun <yilun.xu@intel.com>
Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # for IIO
Signed-off-by: Nuno Sá <nuno.sa@analog.com>
Link: https://lore.kernel.org/r/20250519-dev-axi-clkgen-limits-v6-3-bc4b3b61d1d4@analog.com
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Uwe Kleine-König <ukleinek@kernel.org>
Reviewed-by: David Lechner <dlechner@baylibre.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
Add a new blk_rq_dma_map / blk_rq_dma_unmap pair that does away with
the wasteful scatterlist structure. Instead it uses the mapping iterator
to either add segments to the IOVA for IOMMU operations, or just maps
them one by one for the direct mapping. For the IOMMU case instead of
a scatterlist with an entry for each segment, only a single [dma_addr,len]
pair needs to be stored for processing a request, and for the direct
mapping the per-segment allocation shrinks from
[page,offset,len,dma_addr,dma_len] to just [dma_addr,len].
One big difference to the scatterlist API, which could be considered
downside, is that the IOVA collapsing only works when the driver sets
a virt_boundary that matches the IOMMU granule. For NVMe this is done
already so it works perfectly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20250625113531.522027-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
To get out of the DMA mapping helpers having to check every segment for
it's P2P status, ensure that bios either contain P2P transfers or non-P2P
transfers, and that a P2P bio only contains ranges from a single device.
This means we do the page zone access in the bio add path where it should
be still page hot, and will only have do the fairly expensive P2P topology
lookup once per bio down in the DMA mapping path, and only for already
marked bios.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20250625113531.522027-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In preparation for fixing device mapper zone write handling, introduce
the inline helper function bio_needs_zone_write_plugging() to test if a
BIO requires handling through zone write plugging using the function
blk_zone_plug_bio(). This function returns true for any write
(op_is_write(bio) == true) operation directed at a zoned block device
using zone write plugging, that is, a block device with a disk that has
a zone write plug hash table.
This helper allows simplifying the check on entry to blk_zone_plug_bio()
and used in to protect calls to it for blk-mq devices and DM devices.
Fixes: f211268ed1f9 ("dm: Use the block layer zone append emulation")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250625093327.548866-3-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
REQ_OP_ZONE_FINISH is defined as "12", which makes
op_is_write(REQ_OP_ZONE_FINISH) return false, despite the fact that a
zone finish operation is an operation that modifies a zone (transition
it to full) and so should be considered as a write operation (albeit
one that does not transfer any data to the device).
Fix this by redefining REQ_OP_ZONE_FINISH to be an odd number (13), and
redefine REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL using sequential
odd numbers from that new value.
Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20250625093327.548866-2-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Back in 2015, commit d2be537c3ba3 ("block: bump BLK_DEF_MAX_SECTORS to
2560") increased the default maximum size of a block device I/O to 2560
sectors (1280 KiB) to "accommodate a 10-data-disk stripe write with
chunk size 128k". This choice is rather arbitrary and since then,
improvements to the block layer have software RAID drivers correctly
advertize their stripe width through chunk_sectors and abuses of
BLK_DEF_MAX_SECTORS_CAP by drivers (to set the HW limit rather than the
default user controlled maximum I/O size) have been fixed.
Since many block devices can benefit from a larger value of
BLK_DEF_MAX_SECTORS_CAP, and in particular HDDs, increase this value to
be 4MiB, or 8192 sectors.
And given that BLK_DEF_MAX_SECTORS_CAP is only used in the block layer
and should not be used by drivers directly, move this macro definition
to the block layer internal header file block/blk.h.
Suggested-by: Martin K . Petersen <martin.petersen@oracle.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20250618060045.37593-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Prerequisite for ARM[64] generic entry conversion
Merge it into the entry branch so further changes can be based on it.
|
|
Currently CONFIG_GENERIC_ENTRY enables both the generic exception
entry logic and the generic syscall entry logic, which are otherwise
loosely coupled.
Introduce separate config options for these so that architectures can
select the two independently. This will make it easier for
architectures to migrate to generic entry code.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/20250213130007.1418890-2-ruanjinjie@huawei.com
Link: https://lore.kernel.org/all/20250624-generic-entry-split-v1-1-53d5ef4f94df@linaro.org
[Linus Walleij: rebase onto v6.16-rc1]
|
|
An errant ';' slipped into that definition, which will cause some
compilers to complain when it's used in an application:
timestamp.c:257:45: error: empty expression statement has no effect; remove unnecessary ';' to silence this warning [-Werror,-Wextra-semi-stmt]
257 | hwts = cqe->flags & IORING_CQE_F_TSTAMP_HW;
| ^
Fixes: 9e4ed359b8ef ("io_uring/netcmd: add tx timestamping cmd support")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
There's no need for crc32() to be a macro. Make it an inline function
instead. Also, remove the cast of the data pointer to
'unsigned char const *', which is no longer necessary now that the type
used in the function prototype is 'const void *'.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250619183414.100082-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Document these widely used functions.
Update kernel-api.rst to point to the correct place, instead of to
crc32-main.c which no longer contains kerneldoc comments.
Simplify the documentation in crc32poly.h to just point to the
corresponding functions, now that they are properly documented. Change
the value of CRC32C_POLY_LE to lower case, for consistency.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250619183414.100082-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Rework how lib/crc/ supports arch-optimized code. First, instead of the
arch-optimized CRC code being in arch/$(SRCARCH)/lib/, it will now be in
lib/crc/$(SRCARCH)/. Second, the API functions (e.g. crc32c()),
arch-optimized functions (e.g. crc32c_arch()), and generic functions
(e.g. crc32c_base()) will now be part of a single module for each CRC
type, allowing better inlining and dead code elimination. The second
change is made possible by the first.
As an example, consider CONFIG_CRC32=m on x86. We'll now have just
crc32.ko instead of both crc32-x86.ko and crc32.ko. The two modules
were already coupled together and always both got loaded together via
direct symbol dependency, so the separation provided no benefit.
Note: later I'd like to apply the same design to lib/crypto/ too, where
often the API functions are out-of-line so this will work even better.
In those cases, for each algorithm we currently have 3 modules all
coupled together, e.g. libsha256.ko, libsha256-generic.ko, and
sha256-x86.ko. We should have just one, inline things properly, and
rely on the compiler's dead code elimination to decide the inclusion of
the generic code instead of manually setting it via kconfig.
Having arch-specific code outside arch/ was somewhat controversial when
Zinc proposed it back in 2018. But I don't think the concerns are
warranted. It's better from a technical perspective, as it enables the
improvements mentioned above. This model is already successfully used
in other places in the kernel such as lib/raid6/. The community of each
architecture still remains free to work on the code, even if it's not in
arch/. At the time there was also a desire to put the library code in
the same files as the old-school crypto API, but that was a mistake; now
that the library is separate, that's no longer a constraint either.
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-3-ebiggers@kernel.org
Link: https://lore.kernel.org/r/20250612054514.142728-1-ebiggers@kernel.org
Link: https://lore.kernel.org/r/20250621012221.4351-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Move all CRC files in lib/ into a subdirectory lib/crc/ to keep them
from cluttering up the main lib/ directory.
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20250607200454.73587-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Remove crc32_le_combine() and crc32_le_shift(), since they are no longer
used.
Although combination is an interesting thing that can be done with CRCs,
it turned out that none of the users of it in the kernel were even close
to being worthwhile. All were much better off simply chaining the CRCs
or processing zeroes.
Let's remove the CRC32 combination code for now. It can come back
(potentially optimized with carryless multiplication instructions) if
there is ever a case where it would actually be worthwhile.
Link: https://lore.kernel.org/r/20250607032228.27868-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
sha512_base.h is no longer used, so remove it.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-17-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Delete crypto/sha512_generic.c, which provided "generic" SHA-384 and
SHA-512 crypto_shash algorithms. Replace it with crypto/sha512.c which
provides SHA-384, SHA-512, HMAC-SHA384, and HMAC-SHA512 crypto_shash
algorithms using the corresponding library functions.
This is a prerequisite for migrating all the arch-optimized SHA-512 code
(which is almost 3000 lines) to lib/crypto/ rather than duplicating it.
Since the replacement crypto_shash algorithms are implemented using the
(potentially arch-optimized) library functions, give them
cra_driver_names ending with "-lib" rather than "-generic". Update
crypto/testmgr.c and one odd driver to take this change in driver name
into account. Besides these cases which are accounted for, there are no
known cases where the cra_driver_name was being depended on.
This change does mean that the abstract partial block handling code in
crypto/shash.c, which got added in 6.16, no longer gets used. But
that's fine; the library has to implement the partial block handling
anyway, and it's better to do it in the library since the block size and
other properties of the algorithm are all fixed at compile time there,
resulting in more streamlined code.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Since HMAC support is commonly needed and is fairly simple, include it
as a first-class citizen of the SHA-512 library.
The API supports both incremental and one-shot computation, and either
preparing the key ahead of time or just using a raw key. The
implementation is much more streamlined than crypto/hmac.c.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add basic support for SHA-384 and SHA-512 to lib/crypto/.
Various in-kernel users will be able to use this instead of the
old-school crypto API, which is harder to use and has more overhead.
The basic support added by this commit consists of the API and its
documentation, backed by a C implementation of the algorithms.
sha512_block_generic() is derived from crypto/sha512_generic.c.
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250630160320.2888-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
local_lock.h is the main header for the local_lock_t type and provides
wrappers around internal functions prefixed with __ in
local_lock_internal.h.
Move the this_cpu_ptr() dereference of the variable from the internal to
the main header. Since it is all macro implemented, this_cpu_ptr() will
still happen within the preempt/ IRQ disabled section.
This frees the internal implementation (__) to be used on local_lock_t
types which are local variables and must not be accessed via
this_cpu_ptr().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20250630075138.3448715-2-bigeasy@linutronix.de
|
|
Change the argument type to u32 for the default level being sent
since it has to now account for luminance value which has to be
set for DP_EDP_PANEL_LUMINANCE_TARGET_VALUE.
--v2
-No need to typecast [Jani]
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com>
Link: https://lore.kernel.org/r/20250620063445.3603086-10-suraj.kandpal@intel.com
|