| Age | Commit message (Collapse) | Author |
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/fpga/linux-fpga into char-misc-next
Xu writes:
FPGA Manager changes for 6.20-rc1
- Mukesh fixes a typo in comments.
- Thadeu adds support for built-in dfl driver.
- Michal changes his email.
- Romain corrects the error handling when an fpga bridge is missing.
All patches have been reviewed on the mailing list, and have been in the
last linux-next releases (as part of our for-next branch).
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
* tag 'fpga-for-v6.20-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/fpga/linux-fpga:
fpga: dfl: fix typo in header file
fpga: dfl: use subsys_initcall to allow built-in drivers to be added
fpga: xilinx: Switch Michal Simek's email to new one
fpga: of-fpga-region: Fail if any bridge is missing
|
|
Some pinned importers, such as non-ODP RDMA ones, cannot invalidate their
mappings and therefore must be prevented from attaching to this exporter.
Fixes: 5d74781ebc86 ("vfio/pci: Add dma-buf export support for MMIO regions")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20260121-vfio-add-pin-v1-1-4e04916b17f1@nvidia.com
Signed-off-by: Alex Williamson <alex@shazbot.org>
|
|
Fix the following kconfig warning reported by the kernel test robot:
kismet: WARNING: unmet direct dependencies detected for
SND_SOC_ACPI_AMD_SDCA_QUIRKS when selected by SND_SOC_ACPI_AMD_MATCH
Depends on [n]: SOUND [=y] && SND [=y] && SND_SOC [=y] &&
ACPI [=y] && SND_SOC_SDCA [=n]
Selected by [y]:
- SND_SOC_ACPI_AMD_MATCH [=y] && SOUND [=y] && SND [=y] &&
SND_SOC [=y]
The issue occurs because SND_SOC_ACPI_AMD_SDCA_QUIRKS depends on
SND_SOC_SDCA, which may be disabled, causing unmet dependency warnings.
Fix this by adjusting the Kconfig dependency logic accordingly.
Fixes: e7c30ac379b4 ("ASoC: amd: acp: soc-acpi: add is_device_rt712_vb() helper")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601131155.RXGj4KHv-lkp@intel.com
Signed-off-by: Syed Saba Kareem <syed.sabakareem@amd.com>
Link: https://patch.msgid.link/20260123095524.490655-1-syed.sabakareem@amd.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The test must import namespace "EXPORTED_FOR_KUNIT_TESTING".
Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Fixes: d0ab89951197 ("ASoC: cs35l56: Add KUnit testing of cs35l56_set_fw_suffix()")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202601221843.kS9IMZ0E-lkp@intel.com/
Link: https://patch.msgid.link/20260123111354.1931986-1-rf@opensource.cirrus.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Passing IRQF_ONESHOT ensures that the interrupt source is masked until
the secondary (threaded) handler is done. If only a primary handler is
used then the flag makes no sense because the interrupt can not fire
(again) while its handler is running.
The flag also disallows force-threading of the primary handler and the
irq-core will warn about this.
Remove IRQF_ONESHOT from irqflags.
Cc: Oder Chiou <oder_chiou@realtek.com>
Cc: Liam Girdwood <lgirdwood@gmail.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.com>
Cc: linux-sound@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20260123113708.416727-13-bigeasy@linutronix.de
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Convert the TI TAS2552 codec binding to DT schema format. It's a
straight-forward conversion.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20260121235757.370920-1-robh@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.19
A couple of device IDs and a couple of small fixes, nothing hugely
remarkable.
|
|
While setting file attributes, the read-only flags are reset
for ->xflags, but not for ->flags if flag is shared between both. This
is fine for now as all read-only xflags don't overlap with flags.
However, for any read-only shared flag this will create inconsistency
between xflags and flags. The non-shared flag will be reset in
vfs_fileattr_set() to the current value, but shared one is past further
to ->fileattr_set.
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Andrey Albershteyn <aalbersh@kernel.org>
Link: https://patch.msgid.link/20260121193645.3611716-1-aalbersh@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
A vSTE may have three configuration types: Abort, Bypass, and Translate.
An Abort vSTE wouldn't enable ATS, but the other two might.
It makes sense for a Transalte vSTE to rely on the guest vSTE.EATS field.
For a Bypass vSTE, it would end up with an S2-only physical STE, similar
to an attachment to a regular S2 domain. However, the nested case always
disables ATS following the Bypass vSTE, while the regular S2 case always
enables ATS so long as arm_smmu_ats_supported(master) == true.
Note that ATS is needed for certain VM centric workloads and historically
non-vSMMU cases have relied on this automatic enablement. So, having the
nested case behave differently causes problems.
To fix that, add a condition to disable_ats, so that it might enable ATS
for a Bypass vSTE, aligning with the regular S2 case.
Fixes: f27298a82ba0 ("iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED")
Cc: stable@vger.kernel.org
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
STE in a nested case requires both S1 and S2 fields. And this makes the use
case different from the existing one.
Add coverage for previously failed cases shifting between S2-only and S1+S2
STEs.
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
If VM wants to toggle EATS_TRANS off at the same time as changing the CFG,
hypervisor will see EATS change to 0 and insert a V=0 breaking update into
the STE even though the VM did not ask for that.
In bare metal, EATS_TRANS is ignored by CFG=ABORT/BYPASS, which is why this
does not cause a problem until we have the nested case where CFG is always
a variation of S2 trans that does use EATS_TRANS.
Relax the rules for EATS_TRANS sequencing, we don't need it to be exact as
the enclosing code will always disable ATS at the PCI device when changing
EATS_TRANS. This ensures there are no ATS transactions that can race with
an EATS_TRANS change so we don't need to carefully sequence these bits.
Fixes: 1e8be08d1c91 ("iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Nested CD tables set the MEV bit to try to reduce multi-fault spamming on
the hypervisor. Since MEV is in STE word 1 this causes a breaking update
sequence that is not required and impacts real workloads.
For the purposes of STE updates the value of MEV doesn't matter, if it is
set/cleared early or late it just results in a change to the fault reports
that must be supported by the kernel anyhow. The spec says:
Note: Software must expect, and be able to deal with, coalesced fault
records even when MEV == 0.
So mark STE MEV safe when computing the update sequence, to avoid creating
a breaking update.
Fixes: da0c56520e88 ("iommu/arm-smmu-v3: Set MEV bit in nested STE for DoS mitigations")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
C_BAD_STE was observed when updating nested STE from an S1-bypass mode to
an S1DSS-bypass mode. As both modes enabled S2, the used bit is slightly
different than the normal S1-bypass and S1DSS-bypass modes. As a result,
fields like MEV and EATS in S2's used list marked the word1 as a critical
word that requested a STE.V=0. This breaks a hitless update.
However, both MEV and EATS aren't critical in terms of STE update. One
controls the merge of the events and the other controls the ATS that is
managed by the driver at the same time via pci_enable_ats().
Add an arm_smmu_get_ste_update_safe() to allow STE update algorithm to
relax those fields, avoiding the STE update breakages.
After this change, entry_set has no caller checking its return value, so
change it to void.
Note that this change is required by both MEV and EATS fields, which were
introduced in different kernel versions. So add get_update_safe() first.
MEV and EATS will be added to arm_smmu_get_ste_update_safe() separately.
Fixes: 1e8be08d1c91 ("iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Marc Kleine-Budde <mkl@pengutronix.de> says:
The CAN controller triggers an EPI interrupt when it enters the error
passive state or transitions back to error active. Rather than tracking
state in the driver, the CAN controller state should be derived from the
TX/RX error counters using can_state_get_by_berr_counter().
Link: https://patch.msgid.link/20260123-sja1000-state-handling-v2-0-687498087dad@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
The CAN controller sends the EPI interrupt whenever it reaches the error
passive status or enters the error active status from the error passive
status.
Instead of keeping track of the controller status in the driver, read the
txerr and rxerr counters and use can_state_get_by_berr_counter() to
determine the state of the CAN controller.
Suggested-by: Achim Baumgartner <abaumgartner@topcon.com>
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Link: https://patch.msgid.link/20260123-sja1000-state-handling-v2-2-687498087dad@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
error counters
This is a preparation patch to make use of can_state_get_by_berr_counter()
which works on a struct can_berr_counter.
Reuse the existing function sja1000_get_berr_counter() to read the error
counters into a struct can_berr_counter.
Reviewed-by: Michael Tretter <m.tretter@pengutronix.de>
Link: https://patch.msgid.link/20260123-sja1000-state-handling-v2-1-687498087dad@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
When software issues a Cache Maintenance Operation (CMO) targeting a
dirty cache line, the CPU and DSU cluster may optimize the operation by
combining the CopyBack Write and CMO into a single combined CopyBack
Write plus CMO transaction presented to the interconnect (MCN).
For these combined transactions, the MCN splits the operation into two
separate transactions, one Write and one CMO, and then propagates the
write and optionally the CMO to the downstream memory system or external
Point of Serialization (PoS).
However, the MCN may return an early CompCMO response to the DSU cluster
before the corresponding Write and CMO transactions have completed at
the external PoS or downstream memory. As a result, stale data may be
observed by external observers that are directly connected to the
external PoS or downstream memory.
This erratum affects any system topology in which the following
conditions apply:
- The Point of Serialization (PoS) is located downstream of the
interconnect.
- A downstream observer accesses memory directly, bypassing the
interconnect.
Conditions:
This erratum occurs only when all of the following conditions are met:
1. Software executes a data cache maintenance operation, specifically,
a clean or clean&invalidate by virtual address (DC CVAC or DC
CIVAC), that hits on unique dirty data in the CPU or DSU cache.
This results in a combined CopyBack and CMO being issued to the
interconnect.
2. The interconnect splits the combined transaction into separate Write
and CMO transactions and returns an early completion response to the
CPU or DSU before the write has completed at the downstream memory
or PoS.
3. A downstream observer accesses the affected memory address after the
early completion response is issued but before the actual memory
write has completed. This allows the observer to read stale data
that has not yet been updated at the PoS or downstream memory.
The implementation of workaround put a second loop of CMOs at the same
virtual address whose operation meet erratum conditions to wait until
cache data be cleaned to PoC. This way of implementation mitigates
performance penalty compared to purely duplicate original CMO.
Signed-off-by: Lucas Wei <lucaswei@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add support for the for the EXC3188 touchscreen from eGalaxy.
Signed-off-by: Thorsten Schmelzer <tschmelzer@topcon.com>
Signed-off-by: Michael Tretter <m.tretter@pengutronix.de>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
New model in the ELECOM HUGE trackball line that has 8 buttons but the
report descriptor specifies only 5. The HUGE Plus supports connecting via
Bluetooth, 2.4GHz wireless USB dongle, and directly via a USB-C cable.
Each connection type reports a different device id, 01AA for cable,
01AB for USB dongle, and 01AC for Bluetooth.
This patch adds these device IDs and applies the fixups similar to the
other ELECOM devices to get all 8 buttons working for all 3 connection
types.
For reference, the usbhid-dump output:
001:013:001:DESCRIPTOR 1769085639.598405
05 01 09 02 A1 01 85 01 09 01 A1 00 05 09 19 01
29 05 15 00 25 01 75 01 95 05 81 02 75 03 95 01
81 01 05 01 09 30 09 31 16 01 80 26 FF 7F 75 10
95 02 81 06 09 38 15 81 25 7F 75 08 95 01 81 06
05 0C 0A 38 02 15 81 25 7F 75 08 95 01 81 06 C0
C0 05 0C 09 01 A1 01 85 02 15 01 26 8C 02 19 01
2A 8C 02 75 10 95 01 81 00 C0 05 01 09 80 A1 01
85 03 09 82 09 81 09 83 15 00 25 01 19 01 29 03
75 01 95 03 81 02 95 05 81 01 C0 06 01 FF 09 00
A1 01 85 08 09 00 15 00 26 FF 00 75 08 95 07 81
02 C0 06 02 FF 09 02 A1 01 85 06 09 02 15 00 26
FF 00 75 08 95 07 B1 02 C0
Signed-off-by: David Phillips <david@profile.sh>
Signed-off-by: Jiri Kosina <jkosina@suse.com>
|
|
Currently fsnotify_sb_delete() was called after we have evicted
superblock's dcache and inode cache. This was done mainly so that we
iterate as few inodes as possible when removing inode marks. However, as
Jakub reported, this is problematic because for some filesystems
encoding of file handles uses sb->s_root which gets cleared as part of
dcache eviction. And either delayed fsnotify events or reading fdinfo
for fsnotify group with marks on fs being unmounted may trigger encoding
of file handles during unmount. So move shutdown of fsnotify subsystem
before shrinking of dcache.
Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxgXvwumYvJm3cLDFfx-TsU3g5-yVsTiG=6i8KS48dn0mQ@mail.gmail.com/
Reported-by: Jakub Acs <acsjakub@amazon.de>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Instead of iterating all inodes belonging to a superblock to find inode
marks and remove them on umount, iterate all inode connectors for the
superblock. This may be substantially faster since there are generally
much less inodes with fsnotify marks than all inodes. It also removes
one use of sb->s_inodes list which we strive to ultimately remove.
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Introduce a linked list tracking all inode connectors for a superblock.
We will use this list when the superblock is getting shutdown to
properly clean up all the inode marks instead of relying on scanning all
inodes in the superblock which can get rather slow.
Suggested-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
Fix the two added test name.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Two issues with ubq->canceling flag handling:
1) In ublk_queue_reset_io_flags(), ubq->canceling is set outside
cancel_lock, violating the locking requirement. Move it inside
the spinlock-protected section.
2) In ublk_batch_unprep_io(), when rolling back after a batch prep
failure, if the queue became ready during prep (which cleared
canceling), the flag is not restored when the queue becomes
not-ready again. This allows new requests to be queued to
uninitialized IO slots.
Fix by restoring ubq->canceling = true under cancel_lock when the
queue transitions from ready to not-ready during rollback.
Reported-by: Jens Axboe <axboe@kernel.dk>
Fixes: 3f3850785594 ("ublk: fix batch I/O recovery -ENODEV error")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
ublk_batch_prep_io() calls __ublk_fetch() while holding io->lock
spinlock. When the last IO makes the device ready, ublk_mark_io_ready()
tries to acquire ub->cancel_mutex which can sleep, causing a
sleeping-while-atomic bug.
Fix by moving ublk_mark_io_ready() out of __ublk_fetch() and into the
callers (ublk_fetch and ublk_batch_prep_io) after the spinlock is
released.
Reported-by: Jens Axboe <axboe@kernel.dk>
Fixes: b256795b3606 ("ublk: handle UBLK_U_IO_PREP_IO_CMDS")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
smatch complains about this:
smatch warnings:
io_uring/io_uring.c:2741 io_uring_sanitise_params() warn: if statement not indented
hence fix it up.
Link: https://lore.kernel.org/all/202601231651.HeTmPS8C-lkp@intel.com/
Fixes: 5247c034a67f ("io_uring: introduce non-circular SQ")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202601231651.HeTmPS8C-lkp@intel.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This patch implements the .fadvise interface for page cache share.
Similar to overlayfs, it drops those clean, unused pages through
vfs_fadvise().
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
This patch adds page cache sharing functionality for compressed inodes.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
This patch adds inode page cache sharing functionality for unencoded
files.
I conducted experiments in the container environment. Below is the
memory usage for reading all files in two different minor versions
of container images:
+-------------------+------------------+-------------+---------------+
| Image | Page Cache Share | Memory (MB) | Memory |
| | | | Reduction (%) |
+-------------------+------------------+-------------+---------------+
| | No | 241 | - |
| redis +------------------+-------------+---------------+
| 7.2.4 & 7.2.5 | Yes | 163 | 33% |
+-------------------+------------------+-------------+---------------+
| | No | 872 | - |
| postgres +------------------+-------------+---------------+
| 16.1 & 16.2 | Yes | 630 | 28% |
+-------------------+------------------+-------------+---------------+
| | No | 2771 | - |
| tensorflow +------------------+-------------+---------------+
| 2.11.0 & 2.11.1 | Yes | 2340 | 16% |
+-------------------+------------------+-------------+---------------+
| | No | 926 | - |
| mysql +------------------+-------------+---------------+
| 8.0.11 & 8.0.12 | Yes | 735 | 21% |
+-------------------+------------------+-------------+---------------+
| | No | 390 | - |
| nginx +------------------+-------------+---------------+
| 7.2.4 & 7.2.5 | Yes | 219 | 44% |
+-------------------+------------------+-------------+---------------+
| tomcat | No | 924 | - |
| 10.1.25 & 10.1.26 +------------------+-------------+---------------+
| | Yes | 474 | 49% |
+-------------------+------------------+-------------+---------------+
Additionally, the table below shows the runtime memory usage of the
container:
+-------------------+------------------+-------------+---------------+
| Image | Page Cache Share | Memory (MB) | Memory |
| | | | Reduction (%) |
+-------------------+------------------+-------------+---------------+
| | No | 35 | - |
| redis +------------------+-------------+---------------+
| 7.2.4 & 7.2.5 | Yes | 28 | 20% |
+-------------------+------------------+-------------+---------------+
| | No | 149 | - |
| postgres +------------------+-------------+---------------+
| 16.1 & 16.2 | Yes | 95 | 37% |
+-------------------+------------------+-------------+---------------+
| | No | 1028 | - |
| tensorflow +------------------+-------------+---------------+
| 2.11.0 & 2.11.1 | Yes | 930 | 10% |
+-------------------+------------------+-------------+---------------+
| | No | 155 | - |
| mysql +------------------+-------------+---------------+
| 8.0.11 & 8.0.12 | Yes | 132 | 15% |
+-------------------+------------------+-------------+---------------+
| | No | 25 | - |
| nginx +------------------+-------------+---------------+
| 7.2.4 & 7.2.5 | Yes | 20 | 20% |
+-------------------+------------------+-------------+---------------+
| tomcat | No | 186 | - |
| 10.1.25 & 10.1.26 +------------------+-------------+---------------+
| | Yes | 98 | 48% |
+-------------------+------------------+-------------+---------------+
Co-developed-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
The trace_erofs_read_folio accesses inode information through folio,
but this method fails if the real inode is not associated with the
folio(such as in the upcoming page cache sharing case). Therefore,
we pass the real inode to it so that the inode information can be
printed out in that case.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Currently, reading files with different paths (or names) but the same
content will consume multiple copies of the page cache, even if the
content of these page caches is the same. For example, reading
identical files (e.g., *.so files) from two different minor versions of
container images will cost multiple copies of the same page cache,
since different containers have different mount points. Therefore,
sharing the page cache for files with the same content can save memory.
This introduces the page cache share feature in erofs. It allocate a
shared inode and use its page cache as shared. Reads for files
with identical content will ultimately be routed to the page cache of
the shared inode. In this way, a single page cache satisfies
multiple read requests for different files with the same contents.
We introduce new mount option `inode_share` to enable the page
sharing mode during mounting. This option is used in conjunction
with `domain_id` to share the page cache within the same trusted
domain.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Either the existing fscache usecase or the upcoming page
cache sharing case, the `domain_id` should be protected as
sensitive information, so we use the safer helpers to allocate,
free and display domain_id.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Add erofs_inode_set_aops helper to set the inode->i_mapping->a_ops
and use IS_ENABLED to make it cleaner.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
When creating the EROFS image, users can specify the fingerprint name.
This is to prepare for the upcoming inode page cache share.
Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com>
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
- Move the `struct erofs_anon_fs_type` to super.c and expose it
in preparation for the upcoming page cache share feature;
- Remove the `.owner` field, as they are all internal mounts and
fully managed by EROFS. Retaining `.owner` would unnecessarily
increment module reference counts, preventing the EROFS kernel
module from being unloaded.
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
There is no need to open nonexistent real files if backing files
couldn't be backed by real files (e.g., EROFS page cache sharing
doesn't need typical real files to open again).
Therefore, we export the alloc_empty_backing_file() helper, allowing
filesystems to dynamically set the backing file without real file
open. This is particularly useful for obtaining the correct @path
and @inode when calling file_user_path() and file_user_inode().
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Acked-by: Amir Goldstein <amir73il@gmail.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
|
|
Niether TASDEVICE_CALIBRATION_REG_ADDRESS nor TASDEV_UEFI_CALI_REG_ADDR_FLG
is referenced in the code, drop both.
Signed-off-by: Shenghao Ding <shenghao-ding@ti.com>
Link: https://patch.msgid.link/20260123010000.1841-1-shenghao-ding@ti.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
If it play a 5s above silence media stream, it will cause silence
detection trigger.
Speaker will make no sound when you use another app to play a stream.
Add this patch will solve this issue.
GPIO2: Mute Hotkey GPIO3: Mic Mute LED
Enable this will turn on hotkey and LED support.
Signed-off-by: Kailang Yang <kailang@realtek.com>
Link: https://lore.kernel.org/f4929e137a7949238cc043d861a4d9f8@realtek.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull 'vfs-7.0.iomap' to allow iomap page cache users to set
`iomap_iter::private` for the upcoming page cache sharing support.
It also includes a patch to avoid triggering inline data reads for
the FIEMAP operation.
Signed-off-by: Gao Xiang <xiang@kernel.org>
|
|
When initializing HCR traps in protected mode, use kvm_has_mte() to
check for MTE support rather than kvm_has_feat(kvm, ID_AA64PFR1_EL1,
MTE, IMP).
kvm_has_mte() provides a more comprehensive check:
- kvm_has_feat() only checks if MTE is in the guest's ID register view
(i.e., what we advertise to the guest)
- kvm_has_mte() checks both system_supports_mte() AND whether
KVM_ARCH_FLAG_MTE_ENABLED is set for this VM instance
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260122112218.531948-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When MTE hardware is present but disabled via software (`arm64.nomte` or
`CONFIG_ARM64_MTE=n`), the kernel clears `HCR_EL2.ATA` and sets
`HCR_EL2.TID5`, to prevent the use of MTE instructions.
Additionally, accesses to certain MTE system registers trap to EL2 with
exception class ESR_ELx_EC_SYS64. To emulate hardware without MTE (where
such accesses would cause an Undefined Instruction exception), inject
UNDEF into the host.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260122112218.531948-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
If MTE is not supported by the hardware, or is disabled in the kernel
configuration (`CONFIG_ARM64_MTE=n`) or command line (`arm64.nomte`),
the kernel stops advertising MTE to userspace and avoids using MTE
instructions. However, this is a software-level disable only.
When MTE hardware is present and enabled by EL3 firmware, leaving
`HCR_EL2.ATA` set allows the host to execute MTE instructions (STG, LDG,
etc.) and access allocation tags in physical memory.
Prevent this by clearing `HCR_EL2.ATA` when MTE is disabled. Remove it
from the `HCR_HOST_NVHE_FLAGS` default, and conditionally set it in
`cpu_prepare_hyp_mode()` only when `system_supports_mte()` returns true.
This causes MTE instructions to trap to EL2 when `HCR_EL2.ATA` is
cleared.
Additionally, set `HCR_EL2.TID5` when MTE is disabled. This traps reads
of `GMID_EL1` (Multiple tag transfer ID register) to EL2, preventing the
discovery of MTE parameters (such as tag block size) when the feature is
suppressed.
Early boot code in `head.S` temporarily keeps `HCR_ATA` set to avoid
special-casing initialization paths. This is safe because this code
executes before untrusted code runs and will clear `HCR_ATA` if MTE is
disabled.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260122112218.531948-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The pKVM lifecycle does not support tearing down the hypervisor and
returning to the hyp stub once initialized. The transition to protected
mode is one-way.
Consequently, the code path in hyp-init.S responsible for resetting
EL2 registers (triggered by kexec or hibernation) is unreachable in
protected mode.
Remove the dead code handling HCR_EL2 reset for
ARM64_KVM_PROTECTED_MODE.
No functional change intended.
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260122112218.531948-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
This agressively bypasses run_to_parity and slice protection with the
assumpiton that this is what waker wants but there is no garantee that
the wakee will be the next to run. It is a better choice to use
yield_to_task or WF_SYNC in such case.
This increases the number of resched and preemption because a task becomes
quickly "ineligible" when it runs; We update the task vruntime periodically
and before the task exhausted its slice or at least quantum.
Example:
2 tasks A and B wake up simultaneously with lag = 0. Both are
eligible. Task A runs 1st and wakes up task C. Scheduler updates task
A's vruntime which becomes greater than average runtime as all others
have a lag == 0 and didn't run yet. Now task A is ineligible because
it received more runtime than the other task but it has not yet
exhausted its slice nor a min quantum. We force preemption, disable
protection but Task B will run 1st not task C.
Sidenote, DELAY_ZERO increases this effect by clearing positive lag at
wake up.
Fixes: e837456fdca8 ("sched/fair: Reimplement NEXT_BUDDY to align with EEVDF goals")
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260123102858.52428-1-vincent.guittot@linaro.org
|
|
NEXT_BUDDY was disabled with the introduction of EEVDF and enabled again
after NEXT_BUDDY was rewritten for EEVDF by commit e837456fdca8 ("sched/fair:
Reimplement NEXT_BUDDY to align with EEVDF goals"). It was not expected
that this would be a universal win without a crystal ball instruction
but the reported regressions are a concern [1][2] even if gains were
also reported. Specifically;
o mysql with client/server running on different servers regresses
o specjbb reports lower peak metrics
o daytrader regresses
The mysql is realistic and a concern. It needs to be confirmed if
specjbb is simply shifting the point where peak performance is measured
but still a concern. daytrader is considered to be representative of a
real workload.
Access to test machines is currently problematic for verifying any fix to
this problem. Disable NEXT_BUDDY for now by default until the root causes
are addressed.
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Link: https://lore.kernel.org/lkml/4b96909a-f1ac-49eb-b814-97b8adda6229@arm.com [1]
Link: https://lore.kernel.org/lkml/ec3ea66f-3a0d-4b5a-ab36-ce778f159b5b@linux.ibm.com [2]
Link: https://patch.msgid.link/fyqsk63pkoxpeaclyqsm5nwtz3dyejplr7rg6p74xwemfzdzuu@7m7xhs5aqpqw
|
|
The firmware loader is maintained through the driver-core tree, hence
add a corresponding T: entry.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260120123925.28267-2-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
Add the driver-core mailing list for all entries maintained under the
driver-core tree.
If existent, replace linux-kernel and rust-for-linux list entries, as
they're automatically picked up by scripts/get_maintainer.pl.
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://patch.msgid.link/20260120123925.28267-1-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|
|
In close_range(), the kernel traditionally performs a linear scan over the
[fd, max_fd] range, resulting in O(N) complexity where N is the range size.
For processes with sparse FD tables, this is inefficient as it checks many
unallocated slots.
This patch optimizes __range_close() by using find_next_bit() on the
open_fds bitmap to skip holes. This shifts the algorithmic complexity from
O(Range Size) to O(Active FDs), providing a significant performance boost
for large-range close operations on sparse file descriptor tables.
Signed-off-by: Qiliang Yuan <realwujing@gmail.com>
Signed-off-by: Qiliang Yuan <yuanql9@chinatelecom.cn>
Link: https://patch.msgid.link/20260123081221.659125-1-realwujing@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
* kvm-arm64/pkvm-features-6.20:
: .
: pKVM guest feature trapping fixes, courtesy of Fuad Tabba.
: .
KVM: arm64: Prevent host from managing timer offsets for protected VMs
KVM: arm64: Check whether a VM IOCTL is allowed in pKVM
KVM: arm64: Track KVM IOCTLs and their associated KVM caps
KVM: arm64: Do not allow KVM_CAP_ARM_MTE for any guest in pKVM
KVM: arm64: Include VM type when checking VM capabilities in pKVM
KVM: arm64: Introduce helper to calculate fault IPA offset
KVM: arm64: Fix MTE flag initialization for protected VMs
KVM: arm64: Fix Trace Buffer trap polarity for protected VMs
KVM: arm64: Fix Trace Buffer trapping for protected VMs
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
* kvm-arm64/feat_idst:
: .
: Add support for FEAT_IDST, allowing ID registers that are not implemented
: to be reported as a normal trap rather than as an UNDEF exception.
: .
KVM: arm64: selftests: Add a test for FEAT_IDST
KVM: arm64: pkvm: Report optional ID register traps with a 0x18 syndrome
KVM: arm64: pkvm: Add a generic synchronous exception injection primitive
KVM: arm64: Force trap of GMID_EL1 when the guest doesn't have MTE
KVM: arm64: Handle CSSIDR2_EL1 and SMIDR_EL1 in a generic way
KVM: arm64: Handle FEAT_IDST for sysregs without specific handlers
KVM: arm64: Add a generic synchronous exception injection primitive
KVM: arm64: Add trap routing for GMID_EL1
arm64: Repaint ID_AA64MMFR2_EL1.IDS description
Signed-off-by: Marc Zyngier <maz@kernel.org>
|