path: root/include
Age    Commit message    Author
2025-07-01mfd: adp5585: Support reset and unlock eventsNuno Sá
The ADP558x family of devices can be programmed to respond to some special events. In the case of unlock events, one can lock the keypad and use KEY or GPI events to unlock it. For reset events, one can again use a combination of GPIs/KEYs to generate an event that triggers the device to generate an output reset pulse. Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-13-b1fcfe9e9826@analog.com Signed-off-by: Lee Jones <lee@kernel.org>
2025-07-01mfd: adp5585: Add support for event handlingNuno Sá
These devices are capable of generating FIFO-based events from KEY or GPI presses. Add support for handling these events. This is in preparation for adding full support for keymap- and GPI-based events. Reviewed-by: Lee Jones <lee@kernel.org> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-12-b1fcfe9e9826@analog.com Signed-off-by: Lee Jones <lee@kernel.org>
2025-07-01pwm: adp5585: add support for adp5589Nuno Sá
Add support for the adp5589 I/O expander. From a PWM point of view it is pretty similar to adp5585. Main difference is the address of registers meaningful for configuring the PWM. Acked-by: Uwe Kleine-König <ukleinek@kernel.org> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-10-b1fcfe9e9826@analog.com Signed-off-by: Lee Jones <lee@kernel.org>
2025-07-01gpio: adp5585: add support for the adp5589 expanderNuno Sá
Support the adp5589 I/O expander which supports up to 19 pins. We need to add a chip_info based struct since accessing register "banks" and "bits" differs between devices. Also some register addresses are different. While at it move ADP558X_GPIO_MAX defines to the main header file and rename them. That information will be needed by the top level device in a following change. Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-9-b1fcfe9e9826@analog.com Signed-off-by: Lee Jones <lee@kernel.org>
2025-07-01mfd: adp5585: Add a per chip reg structureNuno Sá
There are some differences in the register map between the devices. Hence, add a register structure per device. This will be needed in following patches. On top of that, adp5585_fill_regmap_config() is renamed and reworked so that the current struct adp5585_info instances act as templates (they indeed contain all the data that differs between variants), which can then be complemented depending on the device (as identified by the ID register). It is done this way because a lot of the data is pretty much the same between variants of the same device. Reviewed-by: Lee Jones <lee@kernel.org> Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-8-b1fcfe9e9826@analog.com Signed-off-by: Lee Jones <lee@kernel.org>
2025-07-01mfd: adp5585: Add support for adp5589Nuno Sá
The ADP5589 is a 19 I/O port expander with built-in keypad matrix decoder, programmable logic, reset generator, and PWM generator. Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-7-b1fcfe9e9826@analog.com Signed-off-by: Lee Jones <lee@kernel.org>
2025-07-01mfd: adp5585: Refactor how regmap defaults are handledNuno Sá
The only thing changing between variants is the regmap default registers. Hence, instead of having a regmap configuration for every variant (duplicating lots of fields), add a chip info type of structure with a regmap ID to identify which defaults to use, and populate regmap_config at runtime given a template plus the ID. Also note that the defaults can be the same between variants, which means the chip info structure can be used in more than one compatible. This will also make it simpler to add new chips with more variants. Also note that the chip info structures are deliberately not const, as they will also contain lots of members that are the same between the different device variants, and so we will fill those at runtime. Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250701-dev-adp5589-fw-v7-6-b1fcfe9e9826@analog.com Signed-off-by: Lee Jones <lee@kernel.org>
2025-07-01lsm: introduce new hooks for setting/getting inode fsxattrAndrey Albershteyn
Introduce new hooks for setting and getting filesystem extended attributes on inode (FS_IOC_FSGETXATTR). Cc: selinux@vger.kernel.org Cc: Paul Moore <paul@paul-moore.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-2-c4e3bc35227b@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01fs: split fileattr related helpers into separate fileAndrey Albershteyn
This patch moves functions related to file extended attribute manipulation to a separate file. Refactoring only. Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org> Link: https://lore.kernel.org/20250630-xattrat-syscall-v6-1-c4e3bc35227b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01netfs: Update tracepoints in a number of waysDavid Howells
Make a number of updates to the netfs tracepoints: (1) Remove a duplicate trace from netfs_unbuffered_write_iter_locked(). (2) Move the trace in netfs_wake_rreq_flag() to after the flag is cleared so that the change appears in the trace. (3) Differentiate the use of netfs_rreq_trace_wait/woke_queue symbols. (4) Don't do so many trace emissions in the wait functions as some of them are redundant. (5) In netfs_collect_read_results(), differentiate a subreq that's being abandoned vs one that has been consumed in a regular way. (6) Add a tracepoint to indicate the call to ->ki_complete(). (7) Don't double-increment the subreq_counter when retrying a write. (8) Move the netfs_sreq_trace_io_progress tracepoint within cifs code to just MID_RESPONSE_RECEIVED and add different tracepoints for other MID states and note check failure. Signed-off-by: David Howells <dhowells@redhat.com> Co-developed-by: Paulo Alcantara <pc@manguebit.org> Signed-off-by: Paulo Alcantara <pc@manguebit.org> Link: https://lore.kernel.org/20250701163852.2171681-14-dhowells@redhat.com cc: Steve French <sfrench@samba.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-cifs@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01netfs: Renumber the NETFS_RREQ_* flags to make traces easier to readDavid Howells
Renumber the NETFS_RREQ_* flags to put the most useful status bits in the bottom nibble - and therefore the last hex digit in the trace output - making it easier to grasp the state at a glance. In particular, put the IN_PROGRESS flag in bit 0 and ALL_QUEUED at bit 1. Also make the flags field in /proc/fs/netfs/requests larger to accommodate all the flags. Also make the flags field in the netfs_sreq tracepoint larger to accommodate all the NETFS_SREQ_* flags. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-13-dhowells@redhat.com Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
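The effect of the renumbering can be illustrated with a small user-space sketch. Only the positions of IN_PROGRESS (bit 0) and ALL_QUEUED (bit 1) come from the commit message; the `SKETCH_` names, the helper functions, and the `%lx` formatting are assumptions made for illustration and are not the netfs code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins: only bit 0 (IN_PROGRESS) and bit 1 (ALL_QUEUED)
 * are taken from the commit message. */
enum {
	SKETCH_RREQ_IN_PROGRESS = 0,
	SKETCH_RREQ_ALL_QUEUED  = 1,
};

/* Return the bottom nibble of the flags word, which is the last hex digit
 * a reader would see when the flags are printed in hexadecimal. */
static unsigned int sketch_status_nibble(unsigned long flags)
{
	return (unsigned int)(flags & 0xfUL);
}

/* Format the flags as plain hex, the way a trace line might show them
 * (the exact trace format is an assumption). Returns the string length. */
static int sketch_format_flags(char *buf, size_t len, unsigned long flags)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%lx", flags);
}
```

With both status bits set, the last printed digit is 3 regardless of what the higher bits hold, which is the at-a-glance property the commit describes.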
2025-07-01netfs: Merge i_size update functionsDavid Howells
Netfslib has two functions for updating the i_size after a write: one for buffered writes into the pagecache and one for direct/unbuffered writes. However, what needs to be done is much the same in both cases, so merge them together. This does raise one question, though: should updating the i_size after a direct write do the same estimated update of i_blocks as is done for buffered writes? Also get rid of the cleanup function pointer from netfs_io_request as it's only used for direct write to update i_size; instead do the i_size setting directly from write collection. Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-12-dhowells@redhat.com cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.org> cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01netfs: Fix double put of requestDavid Howells
If a netfs request finishes during the pause loop, it will have the ref that belongs to the IN_PROGRESS flag removed at that point - however, if it then goes to the final wait loop, that will *also* put the ref because it sees that the IN_PROGRESS flag is clear and incorrectly assumes that this happened when it called the collector. In fact, since IN_PROGRESS is clear, we shouldn't call the collector again since it's done all the cleanup, such as calling ->ki_complete(). Fix this by making netfs_collect_in_app() just return, indicating that we're done if IN_PROGRESS is removed. Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/20250701163852.2171681-3-dhowells@redhat.com Tested-by: Steve French <sfrench@samba.org> Reviewed-by: Paulo Alcantara <pc@manguebit.org> cc: Steve French <sfrench@samba.org> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-cifs@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01docs: dma-api: add a kernel-doc comment for dma_pool_zalloc()Petr Tesarik
Document the dma_pool_zalloc() wrapper. Signed-off-by: Petr Tesarik <ptesarik@suse.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> [jc: fixed up dma_pool_alloc() reference in dmapool.h] Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20250627101015.1600042-5-ptesarik@suse.com
2025-07-01blk-mq: add number of queue calc helperDaniel Wagner
Add two variants of helper functions that calculate the correct number of queues to use. Two variants are needed because some drivers base their maximum number of queues on the possible CPU mask, while others use the online CPU mask. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20250617-isolcpus-queue-counters-v1-2-13923686b54b@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
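A rough sketch of the idea, in user-space C. The commit message does not give the helper names or their exact semantics, so everything here is hypothetical: the `sketch_` functions, the convention that a driver limit of 0 means "no limit", and the CPU counts being passed in as plain integers instead of being read from CPU masks.

```c
#include <assert.h>

/* Minimum of two values where 0 means "no limit" (an assumed convention). */
static unsigned int sketch_min_not_zero(unsigned int a, unsigned int b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* Common calculation: never use more queues than CPUs, and never more
 * than the driver's own maximum. */
static unsigned int sketch_num_queues(unsigned int ncpus,
				      unsigned int max_queues)
{
	assert(ncpus > 0);
	return sketch_min_not_zero(ncpus, max_queues);
}

/* Variant for drivers that size against the possible CPUs. */
static unsigned int sketch_num_possible_queues(unsigned int possible_cpus,
					       unsigned int max_queues)
{
	return sketch_num_queues(possible_cpus, max_queues);
}

/* Variant for drivers that size against the online CPUs. */
static unsigned int sketch_num_online_queues(unsigned int online_cpus,
					     unsigned int max_queues)
{
	return sketch_num_queues(online_cpus, max_queues);
}
```

The two wrappers differ only in which CPU count they are given, which mirrors the reason stated above for having two variants.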
2025-07-01lib/group_cpus: Let group_cpu_evenly() return the number of initialized masksDaniel Wagner
group_cpus_evenly() might have allocated fewer groups than requested:

  group_cpus_evenly()
    __group_cpus_evenly()
      alloc_nodes_groups()
        # allocated total groups may be less than numgrps when
        # active total CPU number is less than numgrps

In this case, the caller will do an out-of-bounds access because it assumes the returned masks array has numgrps entries. Return the number of groups created so the caller can limit the access range accordingly. Acked-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250617-isolcpus-queue-counters-v1-1-13923686b54b@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
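The calling pattern this enables can be sketched in user space as follows. This is not the kernel implementation: `sketch_group_evenly()` and `sketch_sum_groups()` are hypothetical, groups are represented only by their sizes instead of CPU masks, and the even split is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Split ncpus into at most numgrps groups and record each group's size in
 * sizes[] (which must have room for numgrps entries). Returns the number
 * of groups actually initialized, which may be smaller than numgrps. */
static size_t sketch_group_evenly(size_t numgrps, size_t ncpus, size_t *sizes)
{
	size_t created = numgrps < ncpus ? numgrps : ncpus;
	size_t i;

	assert(sizes != NULL || numgrps == 0);
	for (i = 0; i < created; i++)
		sizes[i] = ncpus / created + (i < ncpus % created ? 1 : 0);
	return created;
}

/* A caller that bounds its loop by the returned count, not by numgrps,
 * so it never reads entries that were not initialized. */
static size_t sketch_sum_groups(size_t numgrps, size_t ncpus, size_t *sizes)
{
	size_t created = sketch_group_evenly(numgrps, ncpus, sizes);
	size_t total = 0;
	size_t i;

	for (i = 0; i < created; i++)
		total += sizes[i];
	return total;
}
```

Iterating up to `numgrps` instead of `created` in the second function would touch uninitialized entries, which is the class of bug the commit message describes.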
2025-07-01libnvdimm: Don't use "proxy" headersAndy Shevchenko
Update header inclusions to follow IWYU (Include What You Use) principle. Note that kernel.h is discouraged to be included as it's written at the top of that file. While doing that, sort headers alphabetically. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20250627142001.994860-1-andriy.shevchenko@linux.intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2025-07-01ACPI: Return -ENODEV from acpi_parse_spcr() when SPCR support is disabledLi Chen
If CONFIG_ACPI_SPCR_TABLE is disabled, acpi_parse_spcr() currently returns 0, which may incorrectly suggest that SPCR parsing was successful. This patch changes the behavior to return -ENODEV to clearly indicate that SPCR support is not available. This prepares the codebase for future changes that depend on acpi_parse_spcr() failure detection, such as suppressing misleading console messages. Signed-off-by: Li Chen <chenl311@chinatelecom.cn> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20250620131309.126555-2-me@linux.beauty Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
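The stub pattern described here might look roughly like the following user-space sketch. `SKETCH_SPCR_TABLE` stands in for CONFIG_ACPI_SPCR_TABLE, and the function name, parameter names, and the caller are assumptions for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* SKETCH_SPCR_TABLE is a hypothetical stand-in for CONFIG_ACPI_SPCR_TABLE;
 * it is left undefined here, so the stub below is what gets compiled. */
#ifdef SKETCH_SPCR_TABLE
int sketch_parse_spcr(bool enable_earlycon, bool enable_console);
#else
static inline int sketch_parse_spcr(bool enable_earlycon, bool enable_console)
{
	(void)enable_earlycon;
	(void)enable_console;
	return -ENODEV;	/* previously 0, which looked like success */
}
#endif

/* A caller can now tell "no SPCR support" apart from "parsed fine". */
static bool sketch_console_from_spcr(void)
{
	int ret = sketch_parse_spcr(false, true);

	assert(ret <= 0);
	return ret == 0;
}
```

With the stub returning 0, `sketch_console_from_spcr()` would report success even though nothing was parsed; with -ENODEV it reports failure, which callers can use to skip SPCR-specific messages.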
2025-07-01time/timecounter: Fix the lie that struct cyclecounter is constGreg Kroah-Hartman
In both the read callback for struct cyclecounter and in struct timecounter, struct cyclecounter is declared as a const pointer. Unfortunately, a number of users of this pointer treat it as a non-const pointer, as it is buried in a larger structure that is heavily modified by the callback function when accessed. This lie had been hidden by the fact that container_of() "casts away" a const attribute of a pointer without any compiler warning happening at all. Fix this all up by removing the const attribute in the needed places so that everyone can see that the structure really isn't const, but can be, and is, modified by its users. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/2025070124-backyard-hurt-783a@gregkh
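How container_of() can silently drop const is easy to reproduce in user space. The macro below is a simplified version (the kernel's adds type checks), and the `sketch_` structures and callbacks are hypothetical, not the real cyclecounter users.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of(): the explicit cast to (char *) and then to
 * (type *) discards any const qualifier on ptr without a warning. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_cyclecounter {
	unsigned long long mask;
};

/* A driver structure embedding the counter, as the commit describes. */
struct sketch_device {
	unsigned long long reads;
	struct sketch_cyclecounter cc;
};

/* "Before": the parameter claims const, yet the callback modifies the
 * enclosing structure through the pointer container_of() hands back. */
static unsigned long long
sketch_read_const(const struct sketch_cyclecounter *cc)
{
	struct sketch_device *dev =
		sketch_container_of(cc, struct sketch_device, cc);

	assert(dev != NULL);
	return ++dev->reads;
}

/* "After": the signature admits that the structure is modified. */
static unsigned long long sketch_read(struct sketch_cyclecounter *cc)
{
	struct sketch_device *dev =
		sketch_container_of(cc, struct sketch_device, cc);

	assert(dev != NULL);
	return ++dev->reads;
}
```

Both callbacks behave identically at runtime; the difference is only that the second signature no longer promises something the body does not keep.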
2025-07-01ASoC: Add SDCA IRQ support and some misc fixupsMark Brown
Merge series from Charles Keepax <ckeepax@opensource.cirrus.com>: Add a maintainers entry for SDCA, do a couple of small fixups for previous chains, and then adding the beginnings of the SDCA IRQ handling. This is based around a regmap IRQ chip and a few helper functions that can be called from the client drivers to setup the IRQs.
2025-07-01fs: add ioctl to query metadata and protection info capabilitiesAnuj Gupta
Add a new ioctl, FS_IOC_GETLBMD_CAP, to query metadata and protection info (PI) capabilities. This ioctl returns information about the file's integrity profile. This is useful for userspace applications to understand a file's end-to-end data protection support and configure the I/O accordingly. For now this interface is only supported by block devices. However, the design and placement of this ioctl in generic FS ioctl space allows us to extend it to work over files as well. This may be useful when filesystems start supporting PI-aware layouts. A new structure, struct logical_block_metadata_cap, is introduced, which contains the following fields:

1. lbmd_flags: bitmask of logical block metadata capability flags
2. lbmd_interval: the amount of data described by each unit of logical block metadata
3. lbmd_size: size in bytes of the logical block metadata associated with each interval
4. lbmd_opaque_size: size in bytes of the opaque block tag associated with each interval
5. lbmd_opaque_offset: offset in bytes of the opaque block tag within the logical block metadata
6. lbmd_pi_size: size in bytes of the T10 PI tuple associated with each interval
7. lbmd_pi_offset: offset in bytes of the T10 PI tuple within the logical block metadata
8. lbmd_pi_guard_tag_type: T10 PI guard tag type
9. lbmd_pi_app_tag_size: size in bytes of the T10 PI application tag
10. lbmd_pi_ref_tag_size: size in bytes of the T10 PI reference tag
11. lbmd_pi_storage_tag_size: size in bytes of the T10 PI storage tag

The internal logic to fetch the capability is encapsulated in a helper function, blk_get_meta_cap(), which uses the blk_integrity profile associated with the device. The ioctl returns -EOPNOTSUPP if CONFIG_BLK_DEV_INTEGRITY is not enabled. Suggested-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Link: https://lore.kernel.org/20250630090548.3317-5-anuj20.g@samsung.com Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01block: introduce pi_tuple_size field in blk_integrityAnuj Gupta
Introduce a new pi_tuple_size field in struct blk_integrity to explicitly represent the size (in bytes) of the protection information (PI) tuple. This is a prep patch. Add validation in blk_validate_integrity_limits() to ensure that pi size matches the expected size for known checksum types and never exceeds the pi_tuple_size. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/20250630090548.3317-3-anuj20.g@samsung.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01block: rename tuple_size field in blk_integrity to metadata_sizeAnuj Gupta
The tuple_size field in blk_integrity currently represents the total size of metadata associated with each data interval. To make the meaning more explicit, rename tuple_size to metadata_size. This is a purely mechanical rename with no functional changes. Suggested-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Anuj Gupta <anuj20.g@samsung.com> Link: https://lore.kernel.org/20250630090548.3317-2-anuj20.g@samsung.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-01device property: Use tidy for_each_named_* macrosMatti Vaittinen
Implementing if-conditions inside for_each_x() macros requires some thinking to avoid side effects in the calling code. The resulting code may look somewhat awkward, and there are a couple of different ways it is usually done. Standardizing this to one way can help make it more obvious for a code reader and writer. The newly added for_each_if() is a way to achieve this. Use for_each_if() to make these macros look like many others, which should in the long run help reading the code. Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/c98b39a7195006fdd24590b8d11bb271a72a0c8a.1749453752.git.mazziesaccount@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
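A minimal user-space sketch of the pattern follows. The `sketch_for_each_if()` definition reflects how such a helper is commonly written (an inverted `if` with an empty body, followed by `else`); treat that form, the filtering iterator, and the counting function as assumptions for illustration, not as the device property macros themselves.

```c
#include <assert.h>
#include <stddef.h>

/* The loop body becomes the else branch, so an "else" written after the
 * macro in calling code cannot bind to the hidden "if". */
#define sketch_for_each_if(cond) if (!(cond)) {} else

/* Hypothetical filtering iterator: visit only the even elements. */
#define sketch_for_each_even(i, arr, n)			\
	for ((i) = 0; (i) < (n); (i)++)			\
		sketch_for_each_if((arr)[(i)] % 2 == 0)

/* Count the even entries of arr using the filtering iterator. */
static size_t sketch_count_even(const int *arr, size_t n)
{
	size_t i, count = 0;

	assert(arr != NULL || n == 0);
	sketch_for_each_even(i, arr, n)
		count++;
	return count;
}
```

Writing the filter as a bare `if (cond)` inside the macro would work for this caller too, but would capture a following `else` in the calling code; the inverted form avoids that side effect.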
2025-07-01pps: fix poll supportDenis OSTERLAND-HEIM
Because pps_cdev_poll() unconditionally returns EPOLLIN, a user space program that calls select/poll always gets an immediate data-ready-to-read response. As a result, the intended use, waiting until the next data becomes ready, does not work. User space snippet:

    struct pollfd pollfd = {
        .fd = open("/dev/pps0", O_RDONLY),
        .events = POLLIN|POLLERR,
        .revents = 0
    };
    while(1) {
        poll(&pollfd, 1, 2000/*ms*/); // returns immediately, but should wait
        if(pollfd.revents & EPOLLIN) { // always true
            struct pps_fdata fdata;
            memset(&fdata, 0, sizeof(fdata));
            ioctl(pollfd.fd, PPS_FETCH, &fdata); // currently fetches data at max speed
        }
    }

Let's remember the last fetch event counter and compare this value in pps_cdev_poll() with the most recent event counter, and return 0 if they are equal. Signed-off-by: Denis OSTERLAND-HEIM <denis.osterland@diehl.com> Co-developed-by: Rodolfo Giometti <giometti@enneenne.com> Signed-off-by: Rodolfo Giometti <giometti@enneenne.com> Fixes: eae9d2ba0cfc ("LinuxPPS: core support") Link: https://lore.kernel.org/all/f6bed779-6d59-4f0f-8a59-b6312bd83b4e@enneenne.com/ Acked-by: Rodolfo Giometti <giometti@enneenne.com> Link: https://lore.kernel.org/r/c3c50ad1eb19ef553eca8a57c17f4c006413ab70.camel@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-01RDMA/uverbs: Check CAP_NET_RAW in user namespace for flow createParav Pandit
Currently, the capability check is done in the default init_user_ns user namespace. When a process runs in a non default user namespace, such check fails. Due to this when a process is running using Podman, it fails to create the flow resource. Since the RDMA device is a resource within a network namespace, use the network namespace associated with the RDMA device to determine its owning user namespace. Fixes: 436f2ad05a0b ("IB/core: Export ib_create/destroy_flow through uverbs") Signed-off-by: Parav Pandit <parav@nvidia.com> Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Link: https://patch.msgid.link/6df6f2f24627874c4f6d041c19dc1f6f29f68f84.1750963874.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-07-01drm/mipi-dsi: Drop MIPI_DSI_MODE_VSYNC_FLUSH flagPhilipp Zabel
Drop the unused MIPI_DSI_MODE_VSYNC_FLUSH flag. Whether or not a display FIFO flush on vsync is required to avoid sending garbage to the panel is not a property of the DSI link, but of the integration between display controller and DSI host bridge. Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250627-dsi-vsync-flush-v2-4-4066899a5608@pengutronix.de Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2025-06-30ublk: allow UBLK_IO_(UN)REGISTER_IO_BUF on any taskCaleb Sander Mateos
Currently, UBLK_IO_REGISTER_IO_BUF and UBLK_IO_UNREGISTER_IO_BUF are only permitted on the ublk_io's daemon task. But this restriction is unnecessary. ublk_register_io_buf() calls __ublk_check_and_get_req() to look up the request from the tagset and atomically take a reference on the request without accessing the ublk_io. ublk_unregister_io_buf() doesn't use the q_id or tag at all. So allow these opcodes even on tasks other than io->task. Handle UBLK_IO_UNREGISTER_IO_BUF before obtaining the ubq and io since the buffer index being unregistered is not necessarily related to the specified q_id and tag. Add a feature flag UBLK_F_BUF_REG_OFF_DAEMON that userspace can use to determine whether the kernel supports off-daemon buffer registration. Suggested-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250620151008.3976463-10-csander@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-30neighbor: Add NTF_EXT_VALIDATED flag for externally validated entriesIdo Schimmel
tl;dr
=====

Add a new neighbor flag ("extern_valid") that can be used to indicate to the kernel that a neighbor entry was learned and determined to be valid externally. The kernel will not try to remove or invalidate such an entry, leaving these decisions to the user space control plane. This is needed for EVPN multi-homing where a neighbor entry for a multi-homed host needs to be synced across all the VTEPs among which the host is multi-homed.

Background
==========

In a typical EVPN multi-homing setup each host is multi-homed using a set of links called ES (Ethernet Segment, i.e., LAG) to multiple leaf switches (VTEPs). VTEPs that are connected to the same ES are called ES peers. When a neighbor entry is learned on a VTEP, it is distributed to both ES peers and remote VTEPs using EVPN MAC/IP advertisement routes. ES peers use the neighbor entry when routing traffic towards the multi-homed host and remote VTEPs use it for ARP/NS suppression.

Motivation
==========

If the ES link between a host and the VTEP on which the neighbor entry was locally learned goes down, the EVPN MAC/IP advertisement route will be withdrawn and the neighbor entries will be removed from both ES peers and remote VTEPs. Routing towards the multi-homed host and ARP/NS suppression can fail until another ES peer locally learns the neighbor entry and distributes it via an EVPN MAC/IP advertisement route. "draft-rbickhart-evpn-ip-mac-proxy-adv-03" [1] suggests avoiding these intermittent failures by having the ES peers install the neighbor entries as before, but also injecting EVPN MAC/IP advertisement routes with a proxy indication. When the previously mentioned ES link goes down and the original EVPN MAC/IP advertisement route is withdrawn, the ES peers will not withdraw their neighbor entries, but instead start aging timers for the proxy indication.

If an ES peer locally learns the neighbor entry (i.e., it becomes "reachable"), it will restart its aging timer for the entry and emit an EVPN MAC/IP advertisement route without a proxy indication. An ES peer will stop its aging timer for the proxy indication if it observes the removal of the proxy indication from at least one of the ES peers advertising the entry. In the event that the aging timer for the proxy indication expired, an ES peer will withdraw its EVPN MAC/IP advertisement route. If the timer expired on all ES peers and they all withdrew their proxy advertisements, the neighbor entry will be completely removed from the EVPN fabric.

Implementation
==============

In the above scheme, when the control plane (e.g., FRR) advertises a neighbor entry with a proxy indication, it expects the corresponding entry in the data plane (i.e., the kernel) to remain valid and not be removed due to garbage collection or loss of carrier. The control plane also expects the kernel to notify it if the entry was learned locally (i.e., became "reachable") so that it will remove the proxy indication from the EVPN MAC/IP advertisement route. That is why these entries cannot be programmed with dummy states such as "permanent" or "noarp". Instead, add a new neighbor flag ("extern_valid") which indicates that the entry was learned and determined to be valid externally and should not be removed or invalidated by the kernel. The kernel can probe the entry and notify user space when it becomes "reachable" (it is initially installed as "stale"). However, if the kernel does not receive a confirmation, have it return the entry to the "stale" state instead of the "failed" state. In other words, an entry marked with the "extern_valid" flag behaves like any other dynamically learned entry other than the fact that the kernel cannot remove or invalidate it.

One can argue that the "extern_valid" flag should not prevent garbage collection and that instead a neighbor entry should be programmed with both the "extern_valid" and "extern_learn" flags. There are two reasons for not doing that:

1. Unclear why a control plane would like to program an entry that the kernel cannot invalidate but can completely remove.
2. The "extern_learn" flag is used by FRR for neighbor entries learned on remote VTEPs (for ARP/NS suppression) whereas here we are concerned with local entries. This distinction is currently irrelevant for the kernel, but might be relevant in the future.

Given that the flag only makes sense when the neighbor has a valid state, reject attempts to add a neighbor with an invalid state and with this flag set. For example:

 # ip neigh add 192.0.2.1 nud none dev br0.10 extern_valid
 Error: Cannot create externally validated neighbor with an invalid state.
 # ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid
 # ip neigh replace 192.0.2.1 nud failed dev br0.10 extern_valid
 Error: Cannot mark neighbor as externally validated with an invalid state.

The above means that a neighbor cannot be created with the "extern_valid" flag and flags such as "use" or "managed" as they result in a neighbor being created with an invalid state ("none") and immediately getting probed:

 # ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
 Error: Cannot create externally validated neighbor with an invalid state.

However, these flags can be used together with "extern_valid" after the neighbor was created with a valid state:

 # ip neigh add 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid
 # ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use

One consequence of preventing the kernel from invalidating a neighbor entry is that by default it will only try to determine reachability using unicast probes. This can be changed using the "mcast_resolicit" sysctl:

 # sysctl net.ipv4.neigh.br0/10.mcast_resolicit 0
 # tcpdump -nn -e -i br0.10 -Q out arp &
 # ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
 62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
 62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
 62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
 # sysctl -wq net.ipv4.neigh.br0/10.mcast_resolicit=3
 # ip neigh replace 192.0.2.1 lladdr 00:11:22:33:44:55 nud stale dev br0.10 extern_valid use
 62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
 62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
 62:50:1d:11:93:6f > 00:11:22:33:44:55, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
 62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
 62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28
 62:50:1d:11:93:6f > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.0.2.1 tell 192.0.2.2, length 28

iproute2 patches can be found here [2].

[1] https://datatracker.ietf.org/doc/html/draft-rbickhart-evpn-ip-mac-proxy-adv-03
[2] https://github.com/idosch/iproute2/tree/submit/extern_valid_v1

Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://patch.msgid.link/20250626073111.244534-2-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-30include: adi-axi-common: add new helper macrosNuno Sá
Add new helper macros and enums to help identifying the platform and some characteristics of it at runtime. Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250519-dev-axi-clkgen-limits-v6-4-bc4b3b61d1d4@analog.com Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2025-06-30include: linux: move adi-axi-common.h out of fpgaNuno Sá
The adi-axi-common.h header has some common defines used in various ADI IPs. However they are not specific for any fpga manager so it's questionable for the header to live under include/linux/fpga. Hence let's just move one directory up and update all users. Suggested-by: Xu Yilun <yilun.xu@linux.intel.com> Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> # for IIO Signed-off-by: Nuno Sá <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20250519-dev-axi-clkgen-limits-v6-3-bc4b3b61d1d4@analog.com Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Uwe Kleine-König <ukleinek@kernel.org> Reviewed-by: David Lechner <dlechner@baylibre.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2025-06-30block: add scatterlist-less DMA mapping helpersChristoph Hellwig
Add a new blk_rq_dma_map / blk_rq_dma_unmap pair that does away with the wasteful scatterlist structure. Instead it uses the mapping iterator to either add segments to the IOVA for IOMMU operations, or just maps them one by one for the direct mapping. For the IOMMU case instead of a scatterlist with an entry for each segment, only a single [dma_addr,len] pair needs to be stored for processing a request, and for the direct mapping the per-segment allocation shrinks from [page,offset,len,dma_addr,dma_len] to just [dma_addr,len]. One big difference to the scatterlist API, which could be considered downside, is that the IOVA collapsing only works when the driver sets a virt_boundary that matches the IOMMU granule. For NVMe this is done already so it works perfectly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20250625113531.522027-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-30block: don't merge different kinds of P2P transfers in a single bioChristoph Hellwig
To spare the DMA mapping helpers from having to check every segment for its P2P status, ensure that bios either contain P2P transfers or non-P2P transfers, and that a P2P bio only contains ranges from a single device. This means we do the page zone access in the bio add path where the page should still be hot, and will only have to do the fairly expensive P2P topology lookup once per bio down in the DMA mapping path, and only for already marked bios. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20250625113531.522027-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-30block: Introduce bio_needs_zone_write_plugging()Damien Le Moal
In preparation for fixing device mapper zone write handling, introduce the inline helper function bio_needs_zone_write_plugging() to test if a BIO requires handling through zone write plugging using the function blk_zone_plug_bio(). This function returns true for any write (op_is_write(bio) == true) operation directed at a zoned block device using zone write plugging, that is, a block device with a disk that has a zone write plug hash table. This helper simplifies the check on entry to blk_zone_plug_bio() and is used to protect calls to it for blk-mq devices and DM devices. Fixes: f211268ed1f9 ("dm: Use the block layer zone append emulation") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250625093327.548866-3-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-30block: Make REQ_OP_ZONE_FINISH a write operationDamien Le Moal
REQ_OP_ZONE_FINISH is defined as "12", which makes op_is_write(REQ_OP_ZONE_FINISH) return false, despite the fact that a zone finish operation is an operation that modifies a zone (transition it to full) and so should be considered as a write operation (albeit one that does not transfer any data to the device). Fix this by redefining REQ_OP_ZONE_FINISH to be an odd number (13), and redefine REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL using sequential odd numbers from that new value. Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250625093327.548866-2-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
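The parity argument above can be illustrated with a minimal sketch. It assumes, as the message implies, that an opcode counts as a write when its lowest bit is set. The names sketch_req_op and sketch_op_is_write are hypothetical stand-ins, not the kernel's own definitions.

```c
#include <stdbool.h>

/* Hypothetical opcode values for illustration; not the kernel's enum req_op. */
enum sketch_req_op {
	SKETCH_OP_ZONE_FINISH_OLD = 12, /* even: classified as a non-write */
	SKETCH_OP_ZONE_FINISH_NEW = 13, /* odd: classified as a write */
};

/* Assumed convention: an operation is a write if bit 0 of its opcode is set. */
static inline bool sketch_op_is_write(unsigned int op)
{
	return (op & 1u) != 0;
}
```

Under that assumption, the old value 12 is reported as a non-write and the new value 13 as a write, which is the behavior change the commit describes.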
2025-06-30block: Increase BLK_DEF_MAX_SECTORS_CAPDamien Le Moal
Back in 2015, commit d2be537c3ba3 ("block: bump BLK_DEF_MAX_SECTORS to 2560") increased the default maximum size of a block device I/O to 2560 sectors (1280 KiB) to "accommodate a 10-data-disk stripe write with chunk size 128k". This choice is rather arbitrary. Since then, improvements to the block layer have made software RAID drivers correctly advertise their stripe width through chunk_sectors, and abuses of BLK_DEF_MAX_SECTORS_CAP by drivers (to set the HW limit rather than the default user controlled maximum I/O size) have been fixed. Since many block devices can benefit from a larger value of BLK_DEF_MAX_SECTORS_CAP, and in particular HDDs, increase this value to 4 MiB, or 8192 sectors. And given that BLK_DEF_MAX_SECTORS_CAP is only used in the block layer and should not be used by drivers directly, move this macro definition to the block layer internal header file block/blk.h. Suggested-by: Martin K . Petersen <martin.petersen@oracle.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20250618060045.37593-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
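The sector counts quoted above can be checked with a small sketch, assuming the usual 512-byte sector unit. The macro and helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Assumption: one sector is 512 bytes. */
#define SKETCH_SECTOR_SIZE 512u

/* Convert a sector count to KiB (1 KiB = 1024 bytes). */
static inline uint64_t sketch_sectors_to_kib(uint64_t sectors)
{
	return sectors * SKETCH_SECTOR_SIZE / 1024u;
}
```

With these definitions, 2560 sectors correspond to 1280 KiB and 8192 sectors to 4096 KiB, that is 4 MiB, matching the figures in the message.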
2025-06-30Merge tag 'entry-split-for-arm' into core/entryThomas Gleixner
Prerequisite for ARM[64] generic entry conversion Merge it into the entry branch so further changes can be based on it.
2025-06-30entry: Split generic entry into generic exception and syscall entryJinjie Ruan
Currently CONFIG_GENERIC_ENTRY enables both the generic exception entry logic and the generic syscall entry logic, which are otherwise loosely coupled. Introduce separate config options for these so that architectures can select the two independently. This will make it easier for architectures to migrate to generic entry code. Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/20250213130007.1418890-2-ruanjinjie@huawei.com Link: https://lore.kernel.org/all/20250624-generic-entry-split-v1-1-53d5ef4f94df@linaro.org [Linus Walleij: rebase onto v6.16-rc1]
2025-06-30io_uring: remove errant ';' from IORING_CQE_F_TSTAMP_HW definitionJens Axboe
An errant ';' slipped into that definition, which will cause some compilers to complain when it's used in an application: timestamp.c:257:45: error: empty expression statement has no effect; remove unnecessary ';' to silence this warning [-Werror,-Wextra-semi-stmt] 257 | hwts = cqe->flags & IORING_CQE_F_TSTAMP_HW; | ^ Fixes: 9e4ed359b8ef ("io_uring/netcmd: add tx timestamping cmd support") Signed-off-by: Jens Axboe <axboe@kernel.dk>
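As a rough illustration of why a trailing ';' in a flag macro is a problem, consider the sketch below. The macro name and bit position are hypothetical and are not the real IORING_CQE_F_TSTAMP_HW definition.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical flag; the bit position is an assumption for illustration.
 * Broken form:  #define SKETCH_CQE_F_FLAG (1U << 4);
 * With the broken form, `x = flags & SKETCH_CQE_F_FLAG;` expands to
 * `x = flags & (1U << 4);;`, leaving an empty statement that strict
 * warning flags reject.
 */
#define SKETCH_CQE_F_FLAG (1U << 4)

static inline bool sketch_has_flag(uint32_t flags)
{
	/* Inside a larger expression the broken form would not compile at all. */
	return (flags & SKETCH_CQE_F_FLAG) != 0;
}
```

Keeping the macro body a pure expression lets it be used both as a full statement operand and inside conditions.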
2025-06-30lib/crc: crc32: Change crc32() from macro to inline function and remove castEric Biggers
There's no need for crc32() to be a macro. Make it an inline function instead. Also, remove the cast of the data pointer to 'unsigned char const *', which is no longer necessary now that the type used in the function prototype is 'const void *'. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250619183414.100082-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30lib/crc: crc32: Document crc32_le(), crc32_be(), and crc32c()Eric Biggers
Document these widely used functions. Update kernel-api.rst to point to the correct place, instead of to crc32-main.c which no longer contains kerneldoc comments. Simplify the documentation in crc32poly.h to just point to the corresponding functions, now that they are properly documented. Change the value of CRC32C_POLY_LE to lower case, for consistency. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250619183414.100082-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30lib/crc: Prepare for arch-optimized code in subdirs of lib/crc/Eric Biggers
Rework how lib/crc/ supports arch-optimized code. First, instead of the arch-optimized CRC code being in arch/$(SRCARCH)/lib/, it will now be in lib/crc/$(SRCARCH)/. Second, the API functions (e.g. crc32c()), arch-optimized functions (e.g. crc32c_arch()), and generic functions (e.g. crc32c_base()) will now be part of a single module for each CRC type, allowing better inlining and dead code elimination. The second change is made possible by the first. As an example, consider CONFIG_CRC32=m on x86. We'll now have just crc32.ko instead of both crc32-x86.ko and crc32.ko. The two modules were already coupled together and always both got loaded together via direct symbol dependency, so the separation provided no benefit. Note: later I'd like to apply the same design to lib/crypto/ too, where often the API functions are out-of-line so this will work even better. In those cases, for each algorithm we currently have 3 modules all coupled together, e.g. libsha256.ko, libsha256-generic.ko, and sha256-x86.ko. We should have just one, inline things properly, and rely on the compiler's dead code elimination to decide the inclusion of the generic code instead of manually setting it via kconfig. Having arch-specific code outside arch/ was somewhat controversial when Zinc proposed it back in 2018. But I don't think the concerns are warranted. It's better from a technical perspective, as it enables the improvements mentioned above. This model is already successfully used in other places in the kernel such as lib/raid6/. The community of each architecture still remains free to work on the code, even if it's not in arch/. At the time there was also a desire to put the library code in the same files as the old-school crypto API, but that was a mistake; now that the library is separate, that's no longer a constraint either. Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20250607200454.73587-3-ebiggers@kernel.org Link: https://lore.kernel.org/r/20250612054514.142728-1-ebiggers@kernel.org Link: https://lore.kernel.org/r/20250621012221.4351-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30lib/crc: Move files into lib/crc/Eric Biggers
Move all CRC files in lib/ into a subdirectory lib/crc/ to keep them from cluttering up the main lib/ directory. Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com> Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com> Link: https://lore.kernel.org/r/20250607200454.73587-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30lib/crc32: Remove unused combination supportEric Biggers
Remove crc32_le_combine() and crc32_le_shift(), since they are no longer used. Although combination is an interesting thing that can be done with CRCs, it turned out that none of the users of it in the kernel were even close to being worthwhile. All were much better off simply chaining the CRCs or processing zeroes. Let's remove the CRC32 combination code for now. It can come back (potentially optimized with carryless multiplication instructions) if there is ever a case where it would actually be worthwhile. Link: https://lore.kernel.org/r/20250607032228.27868-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30crypto: sha512 - Remove sha512_base.hEric Biggers
sha512_base.h is no longer used, so remove it. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250630160320.2888-17-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30crypto: sha512 - Replace sha512_generic with wrapper around SHA-512 libraryEric Biggers
Delete crypto/sha512_generic.c, which provided "generic" SHA-384 and SHA-512 crypto_shash algorithms. Replace it with crypto/sha512.c which provides SHA-384, SHA-512, HMAC-SHA384, and HMAC-SHA512 crypto_shash algorithms using the corresponding library functions. This is a prerequisite for migrating all the arch-optimized SHA-512 code (which is almost 3000 lines) to lib/crypto/ rather than duplicating it. Since the replacement crypto_shash algorithms are implemented using the (potentially arch-optimized) library functions, give them cra_driver_names ending with "-lib" rather than "-generic". Update crypto/testmgr.c and one odd driver to take this change in driver name into account. Besides these cases which are accounted for, there are no known cases where the cra_driver_name was being depended on. This change does mean that the abstract partial block handling code in crypto/shash.c, which got added in 6.16, no longer gets used. But that's fine; the library has to implement the partial block handling anyway, and it's better to do it in the library since the block size and other properties of the algorithm are all fixed at compile time there, resulting in more streamlined code. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250630160320.2888-6-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30lib/crypto: sha512: Add HMAC-SHA384 and HMAC-SHA512 supportEric Biggers
Since HMAC support is commonly needed and is fairly simple, include it as a first-class citizen of the SHA-512 library. The API supports both incremental and one-shot computation, and either preparing the key ahead of time or just using a raw key. The implementation is much more streamlined than crypto/hmac.c. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250630160320.2888-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30lib/crypto: sha512: Add support for SHA-384 and SHA-512Eric Biggers
Add basic support for SHA-384 and SHA-512 to lib/crypto/. Various in-kernel users will be able to use this instead of the old-school crypto API, which is harder to use and has more overhead. The basic support added by this commit consists of the API and its documentation, backed by a C implementation of the algorithms. sha512_block_generic() is derived from crypto/sha512_generic.c. Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250630160320.2888-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2025-06-30local_lock: Move this_cpu_ptr() notation from internal to main headerSebastian Andrzej Siewior
local_lock.h is the main header for the local_lock_t type and provides wrappers around internal functions prefixed with __ in local_lock_internal.h. Move the this_cpu_ptr() dereference of the variable from the internal to the main header. Since it is all implemented with macros, this_cpu_ptr() will still happen within the preempt/IRQ disabled section. This frees the internal implementation (__) to be used on local_lock_t types which are local variables and must not be accessed via this_cpu_ptr(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Waiman Long <longman@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/all/20250630075138.3448715-2-bigeasy@linutronix.de
2025-06-30drm/dp: Change argument type of drm_edp_backlight_enableSuraj Kandpal
Change the argument type to u32 for the default level being sent, since it now has to account for the luminance value that must be set for DP_EDP_PANEL_LUMINANCE_TARGET_VALUE. --v2 -No need to typecast [Jani] Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Link: https://lore.kernel.org/r/20250620063445.3603086-10-suraj.kandpal@intel.com