diff options
| author | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2026-06-01 17:54:55 +0200 |
|---|---|---|
| committer | Greg Kroah-Hartman <gregkh@linuxfoundation.org> | 2026-06-01 17:54:55 +0200 |
| commit | d0acb5202d0e33d23cbe6994424587fbb05a5360 (patch) | |
| tree | 4b7fd07149896994fb739e1cbb5d65f2e47cd6d7 | |
| parent | 55f3722fc694d6478eca3020b12a4d6a79b06f27 (diff) | |
| parent | bb532bfaf7919c7c98caab81864e9ce2646e11e3 (diff) | |
Merge v7.0.11linux-rolling-stable
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
501 files changed, 6371 insertions, 2927 deletions
diff --git a/Documentation/ABI/testing/sysfs-driver-uniwill-laptop b/Documentation/ABI/testing/sysfs-driver-uniwill-laptop index 2df70792968f..2397c65c969a 100644 --- a/Documentation/ABI/testing/sysfs-driver-uniwill-laptop +++ b/Documentation/ABI/testing/sysfs-driver-uniwill-laptop @@ -51,3 +51,30 @@ Description: Reading this file returns the current status of the breathing animation functionality. + +What: /sys/bus/platform/devices/INOU0000:XX/ctgp_offset +Date: January 2026 +KernelVersion: 7.0 +Contact: Werner Sembach <wse@tuxedocomputers.com> +Description: + Allows userspace applications to set the configurable TGP offset on top of the base + TGP. Base TGP and max TGP and therefore the max cTGP offset are device specific. + Note that setting the maximum cTGP leaves no window open for Dynamic Boost as + Dynamic Boost also can not go over max TGP. Setting the cTGP to maximum is + effectively disabling Dynamic Boost and telling the device to always prioritize the + GPU over the CPU. + + Reading this file returns the current configurable TGP offset. + +What: /sys/bus/platform/devices/INOU0000:XX/usb_c_power_priority +Date: February 2026 +KernelVersion: 7.1 +Contact: Werner Sembach <wse@tuxedocomputers.com> +Description: + Allows userspace applications to choose the USB-C power distribution profile between + one that offers a bigger share of the power to the battery and one that offers more + of it to the CPU. Writing "charging"/"performance" into this file selects the + respective profile. + + Reading this file returns the profile names with the currently active one in + brackets. diff --git a/Documentation/admin-guide/laptops/uniwill-laptop.rst b/Documentation/admin-guide/laptops/uniwill-laptop.rst index aff5f57a6bd4..1f3ca84c7d88 100644 --- a/Documentation/admin-guide/laptops/uniwill-laptop.rst +++ b/Documentation/admin-guide/laptops/uniwill-laptop.rst @@ -43,6 +43,11 @@ Support for changing the platform performance mode is currently not implemented. Battery Charging Control ------------------------ +.. warning:: Some devices do not properly implement the charging threshold interface. Forcing + the driver to enable access to said interface on such devices might damage the + battery [1]_. Because of this the driver will not enable said feature even when + using the ``force`` module parameter. + The ``uniwill-laptop`` driver supports controlling the battery charge limit. This happens over the standard ``charge_control_end_threshold`` power supply sysfs attribute. All values between 1 and 100 percent are supported. @@ -50,6 +55,10 @@ between 1 and 100 percent are supported. Additionally the driver signals the presence of battery charging issues through the standard ``health`` power supply sysfs attribute. +It also lets you set whether a USB-C power source should prioritise charging the battery or +delivering immediate power to the cpu. See Documentation/ABI/testing/sysfs-driver-uniwill-laptop for +details. + Lightbar -------- @@ -58,3 +67,16 @@ LED class device. The default name of this LED class device is ``uniwill:multico See Documentation/ABI/testing/sysfs-driver-uniwill-laptop for details on how to control the various animation modes of the lightbar. + +Configurable TGP +---------------- + +The ``uniwill-laptop`` driver allows to set the configurable TGP for devices with NVIDIA GPUs that +allow it. + +See Documentation/ABI/testing/sysfs-driver-uniwill-laptop for details. + +References +========== + +.. [1] https://www.reddit.com/r/XMG_gg/comments/ld9yyf/battery_limit_hidden_function_discovered_on/ diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst index fde967b0c2e0..25fe5d88fea6 100644 --- a/Documentation/admin-guide/pm/intel_pstate.rst +++ b/Documentation/admin-guide/pm/intel_pstate.rst @@ -355,11 +355,12 @@ HyperThreading (HT) in the context of Intel processors, is enabled on at least one core, ``intel_pstate`` assigns performance-based priorities to CPUs. Namely, the priority of a given CPU reflects its highest HWP performance level which causes the CPU scheduler to generally prefer more performant CPUs, so the less -performant CPUs are used when the other ones are fully loaded. However, SMT -siblings (that is, logical CPUs sharing one physical core) are treated in a -special way such that if one of them is in use, the effective priority of the -other ones is lowered below the priorities of the CPUs located in the other -physical cores. +performant CPUs are used when the other ones are fully loaded. SMT siblings +(that is, logical CPUs sharing one physical core) are given the same priority. +The scheduler can pull tasks from lower-priority cores and place them on any +sibling. Since the scheduler spreads tasks among physical cores, tasks will be +placed on the SMT siblings of physical cores only after all physical cores are +busy. This approach maximizes performance in the majority of cases, but unfortunately it also leads to excessive energy usage in some important scenarios, like video diff --git a/Documentation/arch/riscv/zicfilp.rst b/Documentation/arch/riscv/zicfilp.rst index ab7d8e62ddaf..12b35969d17a 100644 --- a/Documentation/arch/riscv/zicfilp.rst +++ b/Documentation/arch/riscv/zicfilp.rst @@ -78,7 +78,7 @@ the program. Per-task indirect branch tracking state can be monitored and controlled via the :c:macro:`PR_GET_CFI` and :c:macro:`PR_SET_CFI` -``prctl()` arguments (respectively), by supplying +``prctl()`` arguments (respectively), by supplying :c:macro:`PR_CFI_BRANCH_LANDING_PADS` as the second argument. These are architecture-agnostic, and will return -EINVAL if the underlying functionality is not supported. diff --git a/Documentation/crypto/krb5.rst b/Documentation/crypto/krb5.rst index beffa0133446..f62e07ac6811 100644 --- a/Documentation/crypto/krb5.rst +++ b/Documentation/crypto/krb5.rst @@ -158,13 +158,22 @@ returned. When a message has been received, the location and size of the data with the message can be determined by calling:: - void crypto_krb5_where_is_the_data(const struct krb5_enctype *krb5, - enum krb5_crypto_mode mode, - size_t *_offset, size_t *_len); + int crypto_krb5_where_is_the_data(const struct krb5_enctype *krb5, + enum krb5_crypto_mode mode, + size_t *_offset, size_t *_len); The caller provides the offset and length of the message to the function, which then alters those values to indicate the region containing the data (plus any -padding). It is up to the caller to determine how much padding there is. +padding). It is up to the caller to determine how much padding there is. The +function returns an error if the length is too small or if the mode is +unsupported. An additional function:: + + int crypto_krb5_check_data_len(const struct krb5_enctype *krb5, + enum krb5_crypto_mode mode, + size_t len, size_t min_content); + +is provided to just do a basic check that the decrypted/verified message would +have a sufficient minimum payload. Preparation Functions --------------------- @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 VERSION = 7 PATCHLEVEL = 0 -SUBLEVEL = 10 +SUBLEVEL = 11 EXTRAVERSION = NAME = Baby Opossum Posse diff --git a/arch/alpha/include/asm/Kbuild b/arch/alpha/include/asm/Kbuild index 483965c5a4de..b154b4e3dfa8 100644 --- a/arch/alpha/include/asm/Kbuild +++ b/arch/alpha/include/asm/Kbuild @@ -5,4 +5,5 @@ generic-y += agp.h generic-y += asm-offsets.h generic-y += kvm_para.h generic-y += mcs_spinlock.h +generic-y += ring_buffer.h generic-y += text-patching.h diff --git a/arch/arc/include/asm/Kbuild b/arch/arc/include/asm/Kbuild index 4c69522e0328..483caacc6988 100644 --- a/arch/arc/include/asm/Kbuild +++ b/arch/arc/include/asm/Kbuild @@ -5,5 +5,6 @@ generic-y += extable.h generic-y += kvm_para.h generic-y += mcs_spinlock.h generic-y += parport.h +generic-y += ring_buffer.h generic-y += user.h generic-y += text-patching.h diff --git a/arch/arm/boot/dts/renesas/r7s72100-genmai.dts b/arch/arm/boot/dts/renesas/r7s72100-genmai.dts index 3c3756509714..da552a66615e 100644 --- a/arch/arm/boot/dts/renesas/r7s72100-genmai.dts +++ b/arch/arm/boot/dts/renesas/r7s72100-genmai.dts @@ -34,9 +34,6 @@ clocks = <&mstp9_clks R7S72100_CLK_SPIBSC0>; power-domains = <&cpg_clocks>; - #address-cells = <1>; - #size-cells = <1>; - partitions { compatible = "fixed-partitions"; #address-cells = <1>; diff --git a/arch/arm/boot/dts/renesas/r7s72100-rskrza1.dts b/arch/arm/boot/dts/renesas/r7s72100-rskrza1.dts index 91178fb9e721..3306bc9b7bc3 100644 --- a/arch/arm/boot/dts/renesas/r7s72100-rskrza1.dts +++ b/arch/arm/boot/dts/renesas/r7s72100-rskrza1.dts @@ -36,8 +36,6 @@ power-domains = <&cpg_clocks>; bank-width = <4>; device-width = <1>; - #address-cells = <1>; - #size-cells = <1>; partitions { compatible = "fixed-partitions"; diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild index 03657ff8fbe3..decad5f2c826 100644 --- a/arch/arm/include/asm/Kbuild +++ b/arch/arm/include/asm/Kbuild @@ -3,6 +3,7 @@ generic-y += early_ioremap.h generic-y += extable.h generic-y += flat.h generic-y += parport.h +generic-y += ring_buffer.h generated-y += mach-types.h generated-y += unistd-nr.h diff --git a/arch/arm/mach-versatile/integrator_cp.c b/arch/arm/mach-versatile/integrator_cp.c index 2ed4ded56b3f..03dfb5f720b7 100644 --- a/arch/arm/mach-versatile/integrator_cp.c +++ b/arch/arm/mach-versatile/integrator_cp.c @@ -86,14 +86,6 @@ static u64 notrace intcp_read_sched_clock(void) return val; } -static void __init intcp_init_early(void) -{ - cm_map = syscon_regmap_lookup_by_compatible("arm,core-module-integrator"); - if (IS_ERR(cm_map)) - return; - sched_clock_register(intcp_read_sched_clock, 32, 24000000); -} - static void __init intcp_init_irq_of(void) { cm_init(); @@ -119,6 +111,10 @@ static void __init intcp_init_of(void) { struct device_node *cpcon; + cm_map = syscon_regmap_lookup_by_compatible("arm,core-module-integrator"); + if (!IS_ERR(cm_map)) + sched_clock_register(intcp_read_sched_clock, 32, 24000000); + cpcon = of_find_matching_node(NULL, intcp_syscon_match); if (!cpcon) return; @@ -138,7 +134,6 @@ static const char * intcp_dt_board_compat[] = { DT_MACHINE_START(INTEGRATOR_CP_DT, "ARM Integrator/CP (Device Tree)") .reserve = integrator_reserve, .map_io = intcp_map_io, - .init_early = intcp_init_early, .init_irq = intcp_init_irq_of, .init_machine = intcp_init_of, .dt_compat = intcp_dt_board_compat, diff --git a/arch/arm64/boot/dts/renesas/r8a78000.dtsi b/arch/arm64/boot/dts/renesas/r8a78000.dtsi index 3e1c98903cea..3ec1b53d2782 100644 --- a/arch/arm64/boot/dts/renesas/r8a78000.dtsi +++ b/arch/arm64/boot/dts/renesas/r8a78000.dtsi @@ -699,7 +699,7 @@ "renesas,rcar-gen5-scif", "renesas,scif"; reg = <0 0xc0700000 0 0x40>; interrupts = <GIC_ESPI 10 IRQ_TYPE_LEVEL_HIGH>; - clocks = <&dummy_clk_sgasyncd16>, <&dummy_clk_sgasyncd16>, <&scif_clk>; + clocks = <&dummy_clk_sgasyncd16>, <&dummy_clk_sgasyncd4>, <&scif_clk>; clock-names = "fck", "brg_int", "scif_clk"; status = "disabled"; }; @@ -709,7 +709,7 @@ "renesas,rcar-gen5-scif", "renesas,scif"; reg = <0 0xc0704000 0 0x40>; interrupts = <GIC_ESPI 11 IRQ_TYPE_LEVEL_HIGH>; - clocks = <&dummy_clk_sgasyncd16>, <&dummy_clk_sgasyncd16>, <&scif_clk>; + clocks = <&dummy_clk_sgasyncd16>, <&dummy_clk_sgasyncd4>, <&scif_clk>; clock-names = "fck", "brg_int", "scif_clk"; status = "disabled"; }; @@ -719,7 +719,7 @@ "renesas,rcar-gen5-scif", "renesas,scif"; reg = <0 0xc0708000 0 0x40>; interrupts = <GIC_ESPI 12 IRQ_TYPE_LEVEL_HIGH>; - clocks = <&dummy_clk_sgasyncd16>, <&dummy_clk_sgasyncd16>, <&scif_clk>; + clocks = <&dummy_clk_sgasyncd16>, <&dummy_clk_sgasyncd4>, <&scif_clk>; clock-names = "fck", "brg_int", "scif_clk"; status = "disabled"; }; @@ -729,7 +729,7 @@ "renesas,rcar-gen5-scif", "renesas,scif"; reg = <0 0xc070c000 0 0x40>; interrupts = <GIC_ESPI 13 IRQ_TYPE_LEVEL_HIGH>; - clocks = <&dummy_clk_sgasyncd16>, <&dummy_clk_sgasyncd16>, <&scif_clk>; + clocks = <&dummy_clk_sgasyncd16>, <&dummy_clk_sgasyncd4>, <&scif_clk>; clock-names = "fck", "brg_int", "scif_clk"; status = "disabled"; }; diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h index f463a654a2bb..cc0702fa64a7 100644 --- a/arch/arm64/include/asm/insn.h +++ b/arch/arm64/include/asm/insn.h @@ -409,7 +409,7 @@ __AARCH64_INSN_FUNCS(cbz, 0x7F000000, 0x34000000) __AARCH64_INSN_FUNCS(cbnz, 0x7F000000, 0x35000000) __AARCH64_INSN_FUNCS(tbz, 0x7F000000, 0x36000000) __AARCH64_INSN_FUNCS(tbnz, 0x7F000000, 0x37000000) -__AARCH64_INSN_FUNCS(bcond, 0xFF000010, 0x54000000) +__AARCH64_INSN_FUNCS(bcond, 0xFF000000, 0x54000000) __AARCH64_INSN_FUNCS(svc, 0xFFE0001F, 0xD4000001) __AARCH64_INSN_FUNCS(hvc, 0xFFE0001F, 0xD4000002) __AARCH64_INSN_FUNCS(smc, 0xFFE0001F, 0xD4000003) diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index b39cc1127e1f..61c1db637e08 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -33,7 +33,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, unsigned long vaddr); #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio -bool tag_clear_highpages(struct page *to, int numpages); +bool tag_clear_highpages(struct page *to, int numpages, bool clear_pages); #define __HAVE_ARCH_TAG_CLEAR_HIGHPAGES #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) diff --git a/arch/arm64/include/asm/ring_buffer.h b/arch/arm64/include/asm/ring_buffer.h new file mode 100644 index 000000000000..62316c406888 --- /dev/null +++ b/arch/arm64/include/asm/ring_buffer.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef _ASM_ARM64_RING_BUFFER_H +#define _ASM_ARM64_RING_BUFFER_H + +#include <asm/cacheflush.h> + +/* Flush D-cache on persistent ring buffer */ +#define arch_ring_buffer_flush_range(start, end) dcache_clean_pop(start, end) + +#endif /* _ASM_ARM64_RING_BUFFER_H */ diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index f9c9e7fb0997..0d09f07925b5 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -540,8 +540,10 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu) kvm_destroy_mpidr_data(vcpu->kvm); err = kvm_vgic_vcpu_init(vcpu); - if (err) + if (err) { + kvm_vgic_vcpu_destroy(vcpu); return err; + } err = kvm_share_hyp(vcpu, vcpu + 1); if (err) diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c index 2ea9f1c7ebcd..1d7e5d560af4 100644 --- a/arch/arm64/kvm/vgic/vgic-its.c +++ b/arch/arm64/kvm/vgic/vgic-its.c @@ -2307,6 +2307,10 @@ static int vgic_its_restore_dte(struct vgic_its *its, u32 id, /* dte entry is valid */ offset = (entry & KVM_ITS_DTE_NEXT_MASK) >> KVM_ITS_DTE_NEXT_SHIFT; + /* Mimic the MAPD behaviour and reject invalid EID bits. */ + if (num_eventid_bits > VITS_TYPER_IDBITS) + return -EINVAL; + if (!vgic_its_check_id(its, baser, id, NULL)) return -EINVAL; diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index be9dab2c7d6a..b14d0e9fcdbd 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -971,7 +971,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, return vma_alloc_folio(flags, 0, vma, vaddr); } -bool tag_clear_highpages(struct page *page, int numpages) +bool tag_clear_highpages(struct page *page, int numpages, bool clear_pages) { /* * Check if MTE is supported and fall back to clear_highpage(). @@ -979,13 +979,16 @@ bool tag_clear_highpages(struct page *page, int numpages) * post_alloc_hook() will invoke tag_clear_highpages(). */ if (!system_supports_mte()) - return false; + return clear_pages; /* Newly allocated pages, shouldn't have been tagged yet */ for (int i = 0; i < numpages; i++, page++) { WARN_ON_ONCE(!try_page_mte_tagging(page)); - mte_zero_clear_page_tags(page_address(page)); + if (clear_pages) + mte_zero_clear_page_tags(page_address(page)); + else + mte_clear_page_tags(page_address(page)); set_page_mte_tagged(page); } - return true; + return false; } diff --git a/arch/csky/include/asm/Kbuild b/arch/csky/include/asm/Kbuild index 3a5c7f6e5aac..7dca0c6cdc84 100644 --- a/arch/csky/include/asm/Kbuild +++ b/arch/csky/include/asm/Kbuild @@ -9,6 +9,7 @@ generic-y += qrwlock.h generic-y += qrwlock_types.h generic-y += qspinlock.h generic-y += parport.h +generic-y += ring_buffer.h generic-y += user.h generic-y += vmlinux.lds.h generic-y += text-patching.h diff --git a/arch/hexagon/include/asm/Kbuild b/arch/hexagon/include/asm/Kbuild index 1efa1e993d4b..0f887d4238ed 100644 --- a/arch/hexagon/include/asm/Kbuild +++ b/arch/hexagon/include/asm/Kbuild @@ -5,4 +5,5 @@ generic-y += extable.h generic-y += iomap.h generic-y += kvm_para.h generic-y += mcs_spinlock.h +generic-y += ring_buffer.h generic-y += text-patching.h diff --git a/arch/loongarch/include/asm/Kbuild b/arch/loongarch/include/asm/Kbuild index 9034b583a88a..7e92957baf6a 100644 --- a/arch/loongarch/include/asm/Kbuild +++ b/arch/loongarch/include/asm/Kbuild @@ -10,5 +10,6 @@ generic-y += qrwlock.h generic-y += user.h generic-y += ioctl.h generic-y += mmzone.h +generic-y += ring_buffer.h generic-y += statfs.h generic-y += text-patching.h diff --git a/arch/loongarch/kernel/kprobes.c b/arch/loongarch/kernel/kprobes.c index 8ba391cfabb0..1985ed30dd16 100644 --- a/arch/loongarch/kernel/kprobes.c +++ b/arch/loongarch/kernel/kprobes.c @@ -60,16 +60,18 @@ NOKPROBE_SYMBOL(arch_prepare_kprobe); /* Install breakpoint in text */ void arch_arm_kprobe(struct kprobe *p) { - *p->addr = KPROBE_BP_INSN; - flush_insn_slot(p); + u32 insn = KPROBE_BP_INSN; + + larch_insn_text_copy(p->addr, &insn, LOONGARCH_INSN_SIZE); } NOKPROBE_SYMBOL(arch_arm_kprobe); /* Remove breakpoint from text */ void arch_disarm_kprobe(struct kprobe *p) { - *p->addr = p->opcode; - flush_insn_slot(p); + u32 insn = p->opcode; + + larch_insn_text_copy(p->addr, &insn, LOONGARCH_INSN_SIZE); } NOKPROBE_SYMBOL(arch_disarm_kprobe); @@ -184,16 +186,16 @@ static bool reenter_kprobe(struct kprobe *p, struct pt_regs *regs, struct kprobe_ctlblk *kcb) { switch (kcb->kprobe_status) { - case KPROBE_HIT_SS: case KPROBE_HIT_SSDONE: case KPROBE_HIT_ACTIVE: kprobes_inc_nmissed_count(p); setup_singlestep(p, regs, kcb, 1); break; + case KPROBE_HIT_SS: case KPROBE_REENTER: pr_warn("Failed to recover from reentered kprobes.\n"); dump_kprobe(p); - WARN_ON_ONCE(1); + BUG(); break; default: WARN_ON(1); diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c index c331bf69d2ec..a2182b18bd27 100644 --- a/arch/loongarch/mm/init.c +++ b/arch/loongarch/mm/init.c @@ -93,11 +93,7 @@ void arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct page *page = pfn_to_page(start_pfn); - /* With altmap the first mapped page is offset from @start */ - if (altmap) - page += vmem_altmap_offset(altmap); __remove_pages(start_pfn, nr_pages, altmap); } #endif diff --git a/arch/m68k/include/asm/Kbuild b/arch/m68k/include/asm/Kbuild index b282e0dd8dc1..62543bf305ff 100644 --- a/arch/m68k/include/asm/Kbuild +++ b/arch/m68k/include/asm/Kbuild @@ -3,5 +3,6 @@ generated-y += syscall_table.h generic-y += extable.h generic-y += kvm_para.h generic-y += mcs_spinlock.h +generic-y += ring_buffer.h generic-y += spinlock.h generic-y += text-patching.h diff --git a/arch/microblaze/include/asm/Kbuild b/arch/microblaze/include/asm/Kbuild index 7178f990e8b3..0030309b47ad 100644 --- a/arch/microblaze/include/asm/Kbuild +++ b/arch/microblaze/include/asm/Kbuild @@ -5,6 +5,7 @@ generic-y += extable.h generic-y += kvm_para.h generic-y += mcs_spinlock.h generic-y += parport.h +generic-y += ring_buffer.h generic-y += syscalls.h generic-y += tlb.h generic-y += user.h diff --git a/arch/mips/include/asm/Kbuild b/arch/mips/include/asm/Kbuild index 684569b2ecd6..9771c3d85074 100644 --- a/arch/mips/include/asm/Kbuild +++ b/arch/mips/include/asm/Kbuild @@ -12,5 +12,6 @@ generic-y += mcs_spinlock.h generic-y += parport.h generic-y += qrwlock.h generic-y += qspinlock.h +generic-y += ring_buffer.h generic-y += user.h generic-y += text-patching.h diff --git a/arch/nios2/include/asm/Kbuild b/arch/nios2/include/asm/Kbuild index 28004301c236..0a2530964413 100644 --- a/arch/nios2/include/asm/Kbuild +++ b/arch/nios2/include/asm/Kbuild @@ -5,6 +5,7 @@ generic-y += cmpxchg.h generic-y += extable.h generic-y += kvm_para.h generic-y += mcs_spinlock.h +generic-y += ring_buffer.h generic-y += spinlock.h generic-y += user.h generic-y += text-patching.h diff --git a/arch/openrisc/include/asm/Kbuild b/arch/openrisc/include/asm/Kbuild index cef49d60d74c..8aa34621702d 100644 --- a/arch/openrisc/include/asm/Kbuild +++ b/arch/openrisc/include/asm/Kbuild @@ -8,4 +8,5 @@ generic-y += spinlock_types.h generic-y += spinlock.h generic-y += qrwlock_types.h generic-y += qrwlock.h +generic-y += ring_buffer.h generic-y += user.h diff --git a/arch/parisc/include/asm/Kbuild b/arch/parisc/include/asm/Kbuild index 4fb596d94c89..d48d158f7241 100644 --- a/arch/parisc/include/asm/Kbuild +++ b/arch/parisc/include/asm/Kbuild @@ -4,4 +4,5 @@ generated-y += syscall_table_64.h generic-y += agp.h generic-y += kvm_para.h generic-y += mcs_spinlock.h +generic-y += ring_buffer.h generic-y += user.h diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug index f15e5920080b..e8718bc13eeb 100644 --- a/arch/powerpc/Kconfig.debug +++ b/arch/powerpc/Kconfig.debug @@ -83,11 +83,10 @@ config MSI_BITMAP_SELFTEST depends on DEBUG_KERNEL config GUEST_STATE_BUFFER_TEST - def_tristate n + def_tristate KUNIT_ALL_TESTS prompt "Enable Guest State Buffer unit tests" depends on KUNIT depends on KVM_BOOK3S_HV_POSSIBLE - default KUNIT_ALL_TESTS help The Guest State Buffer is a data format specified in the PAPR. It is by hcalls to communicate the state of L2 guests between diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild index 2e23533b67e3..805b5aeebb6f 100644 --- a/arch/powerpc/include/asm/Kbuild +++ b/arch/powerpc/include/asm/Kbuild @@ -5,4 +5,5 @@ generated-y += syscall_table_spu.h generic-y += agp.h generic-y += mcs_spinlock.h generic-y += qrwlock.h +generic-y += ring_buffer.h generic-y += early_ioremap.h diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 4bbeb8644d3d..b4472288e0d4 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -458,6 +458,10 @@ DEFINE_PER_CPU(u8, irq_work_pending); #endif /* 32 vs 64 bit */ +/* + * Must be called with preemption disabled since it updates + * per-CPU irq_work state and programs the local CPU decrementer. + */ void arch_irq_work_raise(void) { /* @@ -471,10 +475,8 @@ void arch_irq_work_raise(void) * which could get tangled up if we're messing with the same state * here. */ - preempt_disable(); set_irq_work_pending_flag(); set_dec(1); - preempt_enable(); } static void set_dec_or_work(u64 val) diff --git a/arch/powerpc/perf/hv-gpci.c b/arch/powerpc/perf/hv-gpci.c index 5cac2cf3bd1e..10c82cf8f5b3 100644 --- a/arch/powerpc/perf/hv-gpci.c +++ b/arch/powerpc/perf/hv-gpci.c @@ -210,7 +210,7 @@ static ssize_t processor_bus_topology_show(struct device *dev, struct device_att 0, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; @@ -244,12 +244,14 @@ static ssize_t processor_bus_topology_show(struct device *dev, struct device_att starting_index, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; } +out_success: + put_cpu_var(hv_gpci_reqb); return n; out: @@ -278,7 +280,7 @@ static ssize_t processor_config_show(struct device *dev, struct device_attribute 0, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; @@ -312,12 +314,14 @@ static ssize_t processor_config_show(struct device *dev, struct device_attribute starting_index, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; } +out_success: + put_cpu_var(hv_gpci_reqb); return n; out: @@ -346,7 +350,7 @@ static ssize_t affinity_domain_via_virtual_processor_show(struct device *dev, 0, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; @@ -382,12 +386,14 @@ static ssize_t affinity_domain_via_virtual_processor_show(struct device *dev, starting_index, secondary_index, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; } +out_success: + put_cpu_var(hv_gpci_reqb); return n; out: @@ -416,7 +422,7 @@ static ssize_t affinity_domain_via_domain_show(struct device *dev, struct device 0, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; @@ -448,12 +454,14 @@ static ssize_t affinity_domain_via_domain_show(struct device *dev, struct device starting_index, 0, buf, &n, arg); if (!ret) - return n; + goto out_success; if (ret != H_PARAMETER) goto out; } +out_success: + put_cpu_var(hv_gpci_reqb); return n; out: diff --git a/arch/powerpc/platforms/82xx/km82xx.c b/arch/powerpc/platforms/82xx/km82xx.c index 99f0f0f41876..4ad223525e89 100644 --- a/arch/powerpc/platforms/82xx/km82xx.c +++ b/arch/powerpc/platforms/82xx/km82xx.c @@ -27,8 +27,8 @@ static void __init km82xx_pic_init(void) { - struct device_node *np __free(device_node); - np = of_find_compatible_node(NULL, NULL, "fsl,pq2-pic"); + struct device_node *np __free(device_node) = of_find_compatible_node(NULL, + NULL, "fsl,pq2-pic"); if (!np) { pr_err("PIC init: can not find cpm-pic node\n"); diff --git a/arch/riscv/errata/mips/errata.c b/arch/riscv/errata/mips/errata.c index e984a8152208..2c3dc2259e93 100644 --- a/arch/riscv/errata/mips/errata.c +++ b/arch/riscv/errata/mips/errata.c @@ -57,7 +57,7 @@ void mips_errata_patch_func(struct alt_entry *begin, struct alt_entry *end, } tmp = (1U << alt->patch_id); - if (cpu_req_errata && tmp) { + if (cpu_req_errata & tmp) { mutex_lock(&text_mutex); patch_text_nosync(ALT_OLD_PTR(alt), ALT_ALT_PTR(alt), alt->alt_len); diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild index bd5fc9403295..7721b63642f4 100644 --- a/arch/riscv/include/asm/Kbuild +++ b/arch/riscv/include/asm/Kbuild @@ -14,5 +14,6 @@ generic-y += ticket_spinlock.h generic-y += qrwlock.h generic-y += qrwlock_types.h generic-y += qspinlock.h +generic-y += ring_buffer.h generic-y += user.h generic-y += vmlinux.lds.h diff --git a/arch/riscv/kernel/compat_signal.c b/arch/riscv/kernel/compat_signal.c index 6ec4e34255a9..cf3eb33a11e4 100644 --- a/arch/riscv/kernel/compat_signal.c +++ b/arch/riscv/kernel/compat_signal.c @@ -107,6 +107,8 @@ static long compat_restore_sigcontext(struct pt_regs *regs, /* sc_regs is structured the same as the start of pt_regs */ err = __copy_from_user(&cregs, &sc->sc_regs, sizeof(sc->sc_regs)); + if (unlikely(err)) + return err; cregs_to_regs(&cregs, regs); diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c index 93de2e7a3074..793bcee46182 100644 --- a/arch/riscv/kernel/ptrace.c +++ b/arch/riscv/kernel/ptrace.c @@ -577,8 +577,8 @@ static int compat_riscv_gpr_set(struct task_struct *target, struct compat_user_regs_struct cregs; ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &cregs, 0, -1); - - cregs_to_regs(&cregs, task_pt_regs(target)); + if (!ret) + cregs_to_regs(&cregs, task_pt_regs(target)); return ret; } diff --git a/arch/riscv/kvm/vcpu_pmu.c b/arch/riscv/kvm/vcpu_pmu.c index e873430e596b..f36b099f447c 100644 --- a/arch/riscv/kvm/vcpu_pmu.c +++ b/arch/riscv/kvm/vcpu_pmu.c @@ -435,8 +435,10 @@ int kvm_riscv_vcpu_pmu_snapshot_set_shmem(struct kvm_vcpu *vcpu, unsigned long s } kvpmu->sdata = kzalloc(snapshot_area_size, GFP_ATOMIC); - if (!kvpmu->sdata) - return -ENOMEM; + if (!kvpmu->sdata) { + sbiret = SBI_ERR_FAILURE; + goto out; + } /* No need to check writable slot explicitly as kvm_vcpu_write_guest does it internally */ if (kvm_vcpu_write_guest(vcpu, saddr, kvpmu->sdata, snapshot_area_size)) { @@ -480,8 +482,10 @@ int kvm_riscv_vcpu_pmu_event_info(struct kvm_vcpu *vcpu, unsigned long saddr_low } einfo = kzalloc(shmem_size, GFP_KERNEL); - if (!einfo) - return -ENOMEM; + if (!einfo) { + ret = SBI_ERR_FAILURE; + goto out; + } ret = kvm_vcpu_read_guest(vcpu, shmem, einfo, shmem_size); if (ret) { diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 811e03786c56..1b221c3fe275 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -846,6 +846,27 @@ static void __init set_mmap_rnd_bits_max(void) mmap_rnd_bits_max = MMAP_VA_BITS - PAGE_SHIFT - 3; } +static bool __init is_vaddr_valid(unsigned long va) +{ + unsigned long up = 0; + + switch (satp_mode) { + case SATP_MODE_39: + up = 1UL << 38; + break; + case SATP_MODE_48: + up = 1UL << 47; + break; + case SATP_MODE_57: + up = 1UL << 56; + break; + default: + return false; + } + + return (va < up) || (va >= (ULONG_MAX - up + 1)); +} + /* * There is a simple way to determine if 4-level is supported by the * underlying hardware: establish 1:1 mapping in 4-level page table mode @@ -887,6 +908,9 @@ static __init void set_satp_mode(uintptr_t dtb_pa) set_satp_mode_pmd + PMD_SIZE, PMD_SIZE, PAGE_KERNEL_EXEC); retry: + if (!is_vaddr_valid(set_satp_mode_pmd)) + goto out; + create_pgd_mapping(early_pg_dir, set_satp_mode_pmd, pgtable_l5_enabled ? @@ -909,6 +933,7 @@ retry: disable_pgtable_l4(); } +out: memset(early_pg_dir, 0, PAGE_SIZE); memset(early_p4d, 0, PAGE_SIZE); memset(early_pud, 0, PAGE_SIZE); diff --git a/arch/s390/include/asm/Kbuild b/arch/s390/include/asm/Kbuild index 80bad7de7a04..0c1fc47c3ba0 100644 --- a/arch/s390/include/asm/Kbuild +++ b/arch/s390/include/asm/Kbuild @@ -7,3 +7,4 @@ generated-y += unistd_nr.h generic-y += asm-offsets.h generic-y += mcs_spinlock.h generic-y += mmzone.h +generic-y += ring_buffer.h diff --git a/arch/s390/kernel/perf_pai.c b/arch/s390/kernel/perf_pai.c index 86f71a3d1ef2..cdb8006220ca 100644 --- a/arch/s390/kernel/perf_pai.c +++ b/arch/s390/kernel/perf_pai.c @@ -186,6 +186,13 @@ static u64 pai_getctr(unsigned long *page, int nr, unsigned long offset) return page[nr]; } +static void pai_setctr(unsigned long *page, int nr, unsigned long offset, u64 v) +{ + if (offset) + nr += offset / sizeof(*page); + page[nr] = v; +} + /* Read the counter values. Return value from location in CMP. For base * event xxx_ALL sum up all events. Returns counter value. */ @@ -551,6 +558,8 @@ static void paicrypt_del(struct perf_event *event, int flags) /* Create raw data and save it in buffer. Calculate the delta for each * counter between this invocation and the last invocation. * Returns number of bytes copied. + * After reading from PAI counter page, save the read value to the old + * page to calculate PAI counter deltas. * Saves only entries with positive counter difference of the form * 2 bytes: Number of counter * 8 bytes: Value of counter @@ -562,16 +571,22 @@ static size_t pai_copy(struct pai_userdata *userdata, unsigned long *page, int i, outidx = 0; for (i = 1; i <= pp->num_avail; i++) { - u64 val = 0, val_old = 0; + u64 val = 0, val_old = 0, val_k = 0, val_old_k = 0; if (!exclude_kernel) { - val += pai_getctr(page, i, pp->kernel_offset); - val_old += pai_getctr(page_old, i, pp->kernel_offset); + val_k = pai_getctr(page, i, pp->kernel_offset); + val_old_k = pai_getctr(page_old, i, pp->kernel_offset); + if (val_k != val_old_k) + pai_setctr(page_old, i, pp->kernel_offset, val_k); } if (!exclude_user) { - val += pai_getctr(page, i, 0); - val_old += pai_getctr(page_old, i, 0); + val = pai_getctr(page, i, 0); + val_old = pai_getctr(page_old, i, 0); + if (val != val_old) + pai_setctr(page_old, i, 0, val); } + val += val_k; + val_old += val_old_k; if (val >= val_old) val -= val_old; else @@ -602,8 +617,6 @@ static size_t pai_copy(struct pai_userdata *userdata, unsigned long *page, static int pai_push_sample(size_t rawsize, struct pai_map *cpump, struct perf_event *event) { - int idx = PAI_PMU_IDX(event); - struct pai_pmu *pp = &pai_pmu[idx]; struct perf_sample_data data; struct perf_raw_record raw; struct pt_regs regs; @@ -634,8 +647,6 @@ static int pai_push_sample(size_t rawsize, struct pai_map *cpump, overflow = perf_event_overflow(event, &data, ®s); perf_event_update_userpage(event); - /* Save crypto counter lowcore page after reading event data. */ - memcpy((void *)PAI_SAVE_AREA(event), cpump->area, pp->area_size); return overflow; } @@ -651,7 +662,7 @@ static void pai_have_sample(struct perf_event *event, struct pai_map *cpump) rawsize = pai_copy(cpump->save, cpump->area, pp, (unsigned long *)PAI_SAVE_AREA(event), event->attr.exclude_user, - event->attr.exclude_kernel); + !pp->kernel_offset ? true : event->attr.exclude_kernel); if (rawsize) /* No incremented counters */ pai_push_sample(rawsize, cpump, event); } diff --git a/arch/sh/include/asm/Kbuild b/arch/sh/include/asm/Kbuild index 4d3f10ed8275..f0403d3ee8ab 100644 --- a/arch/sh/include/asm/Kbuild +++ b/arch/sh/include/asm/Kbuild @@ -3,4 +3,5 @@ generated-y += syscall_table.h generic-y += kvm_para.h generic-y += mcs_spinlock.h generic-y += parport.h +generic-y += ring_buffer.h generic-y += text-patching.h diff --git a/arch/sparc/include/asm/Kbuild b/arch/sparc/include/asm/Kbuild index 17ee8a273aa6..49c6bb326b75 100644 --- a/arch/sparc/include/asm/Kbuild +++ b/arch/sparc/include/asm/Kbuild @@ -4,4 +4,5 @@ generated-y += syscall_table_64.h generic-y += agp.h generic-y += kvm_para.h generic-y += mcs_spinlock.h +generic-y += ring_buffer.h generic-y += text-patching.h diff --git a/arch/um/include/asm/Kbuild b/arch/um/include/asm/Kbuild index 1b9b82bbe322..2a1629ba8140 100644 --- a/arch/um/include/asm/Kbuild +++ b/arch/um/include/asm/Kbuild @@ -17,6 +17,7 @@ generic-y += module.lds.h generic-y += parport.h generic-y += percpu.h generic-y += preempt.h +generic-y += ring_buffer.h generic-y += runtime-const.h generic-y += softirq_stack.h generic-y += switch_to.h diff --git a/arch/x86/include/asm/Kbuild b/arch/x86/include/asm/Kbuild index 4566000e15c4..078fd2c0d69d 100644 --- a/arch/x86/include/asm/Kbuild +++ b/arch/x86/include/asm/Kbuild @@ -14,3 +14,4 @@ generic-y += early_ioremap.h generic-y += fprobe.h generic-y += mcs_spinlock.h generic-y += mmzone.h +generic-y += ring_buffer.h diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c index 146f6f8b0650..99801e844b30 100644 --- a/arch/x86/kernel/cpu/cpuid-deps.c +++ b/arch/x86/kernel/cpu/cpuid-deps.c @@ -92,6 +92,7 @@ static const struct cpuid_dep cpuid_deps[] = { { X86_FEATURE_FRED, X86_FEATURE_LKGS }, { X86_FEATURE_SPEC_CTRL_SSBD, X86_FEATURE_SPEC_CTRL }, { X86_FEATURE_LASS, X86_FEATURE_SMAP }, + { X86_FEATURE_INVLPGB, X86_FEATURE_PCID }, {} }; diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c index 8dd424ac5de8..f3a793e3a6c8 100644 --- a/arch/x86/kernel/cpu/mce/core.c +++ b/arch/x86/kernel/cpu/mce/core.c @@ -90,7 +90,6 @@ struct mca_config mca_cfg __read_mostly = { }; static DEFINE_PER_CPU(struct mce_hw_err, hw_errs_seen); -static unsigned long mce_need_notify; /* * MCA banks polled by the period polling timer for corrected events. @@ -152,8 +151,10 @@ EXPORT_PER_CPU_SYMBOL_GPL(injectm); void mce_log(struct mce_hw_err *err) { - if (mce_gen_pool_add(err)) + if (mce_gen_pool_add(err)) { + pr_info(HW_ERR "Machine check events logged\n"); irq_work_queue(&mce_irq_work); + } } EXPORT_SYMBOL_GPL(mce_log); @@ -585,28 +586,6 @@ bool mce_is_correctable(struct mce *m) } EXPORT_SYMBOL_GPL(mce_is_correctable); -/* - * Notify the user(s) about new machine check events. - * Can be called from interrupt context, but not from machine check/NMI - * context. - */ -static bool mce_notify_irq(void) -{ - /* Not more than two messages every minute */ - static DEFINE_RATELIMIT_STATE(ratelimit, 60*HZ, 2); - - if (test_and_clear_bit(0, &mce_need_notify)) { - mce_work_trigger(); - - if (__ratelimit(&ratelimit)) - pr_info(HW_ERR "Machine check events logged\n"); - - return true; - } - - return false; -} - static int mce_early_notifier(struct notifier_block *nb, unsigned long val, void *data) { @@ -618,9 +597,7 @@ static int mce_early_notifier(struct notifier_block *nb, unsigned long val, /* Emit the trace record: */ trace_mce_record(err); - set_bit(0, &mce_need_notify); - - mce_notify_irq(); + mce_work_trigger(); return NOTIFY_DONE; } @@ -1804,7 +1781,7 @@ static void mce_timer_fn(struct timer_list *t) * Alert userspace if needed. If we logged an MCE, reduce the polling * interval, otherwise increase the polling interval. */ - if (mce_notify_irq()) + if (!mce_gen_pool_empty()) iv = max(iv / 2, (unsigned long) HZ/100); else iv = min(iv * 2, round_jiffies_relative(check_interval * HZ)); diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c index f7ec7914e3c4..02beb15d7428 100644 --- a/arch/x86/kvm/svm/avic.c +++ b/arch/x86/kvm/svm/avic.c @@ -1289,12 +1289,14 @@ bool __init avic_hardware_setup(void) } /* - * Disable IPI virtualization for AMD Family 17h CPUs (Zen1 and Zen2) - * due to erratum 1235, which results in missed VM-Exits on the sender - * and thus missed wake events for blocking vCPUs due to the CPU - * failing to see a software update to clear IsRunning. + * Disable IPI virtualization for AMD Family 17h (Zen1 and Zen2) and + * Hygon Family 18h (derived from AMD Zen1) CPUs due to erratum 1235, + * which results in missed VM-Exits on the sender and thus missed wake + * events for blocking vCPUs due to the CPU failing to see a software + * update to clear IsRunning. */ - enable_ipiv = enable_ipiv && boot_cpu_data.x86 != 0x17; + if (boot_cpu_data.x86 == 0x17 || boot_cpu_data.x86 == 0x18) + enable_ipiv = false; amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier); diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c index ac8021c3a997..d4738e03a63a 100644 --- a/arch/x86/xen/setup.c +++ b/arch/x86/xen/setup.c @@ -655,7 +655,7 @@ static void __init xen_e820_swap_entry_with_ram(struct e820_entry *swap_entry) /* Fill new entry (keep size and page offset). */ entry->type = swap_entry->type; entry->addr = entry_end - swap_size + - swap_addr - swap_entry->addr; + swap_entry->addr - swap_addr; entry->size = swap_entry->size; /* Convert old entry to RAM, align to pages. */ diff --git a/arch/xtensa/include/asm/Kbuild b/arch/xtensa/include/asm/Kbuild index 13fe45dea296..e57af619263a 100644 --- a/arch/xtensa/include/asm/Kbuild +++ b/arch/xtensa/include/asm/Kbuild @@ -6,5 +6,6 @@ generic-y += mcs_spinlock.h generic-y += parport.h generic-y += qrwlock.h generic-y += qspinlock.h +generic-y += ring_buffer.h generic-y += user.h generic-y += text-patching.h diff --git a/block/bio-integrity.c b/block/bio-integrity.c index a31936221703..d6c9a09a8dc6 100644 --- a/block/bio-integrity.c +++ b/block/bio-integrity.c @@ -244,7 +244,6 @@ static int bio_integrity_copy_user(struct bio *bio, struct bio_vec *bvec, } bip->bip_flags |= BIP_COPY_USER; - bip->bip_vcnt = nr_vecs; return 0; free_bip: bio_integrity_free(bio); @@ -339,6 +338,24 @@ int bio_integrity_map_user(struct bio *bio, struct iov_iter *iter) if (unlikely(ret < 0)) goto free_bvec; + /* + * Handle partial pinning. This can happen when pin_user_pages_fast() + * returns fewer pages than requested. + */ + if (user_backed_iter(iter) && unlikely(ret != bytes)) { + if (ret > 0) { + int npinned = DIV_ROUND_UP(offset + ret, PAGE_SIZE); + int i; + + for (i = 0; i < npinned; i++) + unpin_user_page(pages[i]); + } + if (pages != stack_pages) + kvfree(pages); + ret = -EFAULT; + goto free_bvec; + } + nr_bvecs = bvec_from_pages(bvec, pages, nr_vecs, bytes, offset, &is_p2p); if (pages != stack_pages) diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c index 554c87bb4a86..bc63bd220865 100644 --- a/block/blk-cgroup.c +++ b/block/blk-cgroup.c @@ -2241,7 +2241,7 @@ void blk_cgroup_bio_start(struct bio *bio) } u64_stats_update_end_irqrestore(&bis->sync, flags); - css_rstat_updated(&blkcg->css, cpu); + __css_rstat_updated(&blkcg->css, cpu); put_cpu(); } diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index 28167c9baa55..047ec887456b 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -97,6 +97,7 @@ static const char *const blk_queue_flag_name[] = { QUEUE_FLAG_NAME(NO_ELV_SWITCH), QUEUE_FLAG_NAME(QOS_ENABLED), QUEUE_FLAG_NAME(BIO_ISSUE_TIME), + QUEUE_FLAG_NAME(ZONED_QD1_WRITES), }; #undef QUEUE_FLAG_NAME diff --git a/block/blk-mq.c b/block/blk-mq.c index 3da2215b2912..39986a742b98 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3077,7 +3077,7 @@ static struct request *blk_mq_get_new_requests(struct request_queue *q, /* * Check if there is a suitable cached request and return it. */ -static struct request *blk_mq_peek_cached_request(struct blk_plug *plug, +static struct request *blk_mq_get_cached_request(struct blk_plug *plug, struct request_queue *q, blk_opf_t opf) { enum hctx_type type = blk_mq_get_hctx_type(opf); @@ -3093,27 +3093,10 @@ static struct request *blk_mq_peek_cached_request(struct blk_plug *plug, return NULL; if (op_is_flush(rq->cmd_flags) != op_is_flush(opf)) return NULL; + rq_list_pop(&plug->cached_rqs); return rq; } -static void blk_mq_use_cached_rq(struct request *rq, struct blk_plug *plug, - struct bio *bio) -{ - if (rq_list_pop(&plug->cached_rqs) != rq) - WARN_ON_ONCE(1); - - /* - * If any qos ->throttle() end up blocking, we will have flushed the - * plug and hence killed the cached_rq list as well. Pop this entry - * before we throttle. - */ - rq_qos_throttle(rq->q, bio); - - blk_mq_rq_time_init(rq, blk_time_get_ns()); - rq->cmd_flags = bio->bi_opf; - INIT_LIST_HEAD(&rq->queuelist); -} - static bool bio_unaligned(const struct bio *bio, struct request_queue *q) { unsigned int bs_mask = queue_logical_block_size(q) - 1; @@ -3151,7 +3134,7 @@ void blk_mq_submit_bio(struct bio *bio) /* * If the plug has a cached request for this queue, try to use it. */ - rq = blk_mq_peek_cached_request(plug, q, bio->bi_opf); + rq = blk_mq_get_cached_request(plug, q, bio->bi_opf); /* * A BIO that was released from a zone write plug has already been @@ -3209,7 +3192,10 @@ void blk_mq_submit_bio(struct bio *bio) new_request: if (rq) { - blk_mq_use_cached_rq(rq, plug, bio); + rq_qos_throttle(rq->q, bio); + blk_mq_rq_time_init(rq, blk_time_get_ns()); + rq->cmd_flags = bio->bi_opf; + INIT_LIST_HEAD(&rq->queuelist); } else { rq = blk_mq_get_new_requests(q, plug, bio); if (unlikely(!rq)) { @@ -3255,12 +3241,10 @@ new_request: return; queue_exit: - /* - * Don't drop the queue reference if we were trying to use a cached - * request and thus didn't acquire one. - */ if (!rq) blk_queue_exit(q); + else + blk_mq_free_request(rq); } #ifdef CONFIG_BLK_MQ_STACKING @@ -3305,6 +3289,25 @@ blk_status_t blk_insert_cloned_request(struct request *rq) return BLK_STS_IOERR; } + /* + * Integrity segment counting depends on the same queue limits + * (virt_boundary_mask, seg_boundary_mask, max_segment_size) that + * vary across stacked queues, so recompute against the bottom + * queue just like nr_phys_segments above. + */ + if (blk_integrity_rq(rq) && rq->bio) { + unsigned short max_int_segs = queue_max_integrity_segments(q); + + rq->nr_integrity_segments = + blk_rq_count_integrity_sg(rq->q, rq->bio); + if (rq->nr_integrity_segments > max_int_segs) { + printk(KERN_ERR "%s: over max integrity segments limit. (%u > %u)\n", + __func__, rq->nr_integrity_segments, + max_int_segs); + return BLK_STS_IOERR; + } + } + if (q->disk && should_fail_request(q->disk->part0, blk_rq_bytes(rq))) return BLK_STS_IOERR; diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 55a1bbfef7d4..ca8033e6d699 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -390,6 +390,36 @@ static ssize_t queue_nr_zones_show(struct gendisk *disk, char *page) return queue_var_show(disk_nr_zones(disk), page); } +static ssize_t queue_zoned_qd1_writes_show(struct gendisk *disk, char *page) +{ + return queue_var_show(!!blk_queue_zoned_qd1_writes(disk->queue), + page); +} + +static ssize_t queue_zoned_qd1_writes_store(struct gendisk *disk, + const char *page, size_t count) +{ + struct request_queue *q = disk->queue; + unsigned long qd1_writes; + unsigned int memflags; + ssize_t ret; + + ret = queue_var_store(&qd1_writes, page, count); + if (ret < 0) + return ret; + + memflags = blk_mq_freeze_queue(q); + blk_mq_quiesce_queue(q); + if (qd1_writes) + blk_queue_flag_set(QUEUE_FLAG_ZONED_QD1_WRITES, q); + else + blk_queue_flag_clear(QUEUE_FLAG_ZONED_QD1_WRITES, q); + blk_mq_unquiesce_queue(q); + blk_mq_unfreeze_queue(q, memflags); + + return count; +} + static ssize_t queue_iostats_passthrough_show(struct gendisk *disk, char *page) { return queue_var_show(!!blk_queue_passthrough_stat(disk->queue), page); @@ -617,6 +647,7 @@ QUEUE_LIM_RO_ENTRY(queue_max_zone_append_sectors, "zone_append_max_bytes"); QUEUE_LIM_RO_ENTRY(queue_zone_write_granularity, "zone_write_granularity"); QUEUE_LIM_RO_ENTRY(queue_zoned, "zoned"); +QUEUE_RW_ENTRY(queue_zoned_qd1_writes, "zoned_qd1_writes"); QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones"); QUEUE_LIM_RO_ENTRY(queue_max_open_zones, "max_open_zones"); QUEUE_LIM_RO_ENTRY(queue_max_active_zones, "max_active_zones"); @@ -754,6 +785,7 @@ static struct attribute *queue_attrs[] = { &queue_nomerges_entry.attr, &queue_poll_entry.attr, &queue_poll_delay_entry.attr, + &queue_zoned_qd1_writes_entry.attr, NULL, }; @@ -786,7 +818,8 @@ static umode_t queue_attr_visible(struct kobject *kobj, struct attribute *attr, struct request_queue *q = disk->queue; if ((attr == &queue_max_open_zones_entry.attr || - attr == &queue_max_active_zones_entry.attr) && + attr == &queue_max_active_zones_entry.attr || + attr == &queue_zoned_qd1_writes_entry.attr) && !blk_queue_is_zoned(q)) return 0; diff --git a/block/blk-zoned.c b/block/blk-zoned.c index a4d82342e37a..fe29fe4b6dcc 100644 --- a/block/blk-zoned.c +++ b/block/blk-zoned.c @@ -16,6 +16,8 @@ #include <linux/spinlock.h> #include <linux/refcount.h> #include <linux/mempool.h> +#include <linux/kthread.h> +#include <linux/freezer.h> #include <trace/events/block.h> @@ -40,6 +42,8 @@ static const char *const zone_cond_name[] = { /* * Per-zone write plug. * @node: hlist_node structure for managing the plug using a hash table. + * @entry: list_head structure for listing the plug in the disk list of active + * zone write plugs. * @bio_list: The list of BIOs that are currently plugged. * @bio_work: Work struct to handle issuing of plugged BIOs * @rcu_head: RCU head to free zone write plugs with an RCU grace period. @@ -62,6 +66,7 @@ static const char *const zone_cond_name[] = { */ struct blk_zone_wplug { struct hlist_node node; + struct list_head entry; struct bio_list bio_list; struct work_struct bio_work; struct rcu_head rcu_head; @@ -520,10 +525,11 @@ static bool disk_insert_zone_wplug(struct gendisk *disk, * are racing with other submission context, so we may already have a * zone write plug for the same zone. */ - spin_lock_irqsave(&disk->zone_wplugs_lock, flags); + spin_lock_irqsave(&disk->zone_wplugs_hash_lock, flags); hlist_for_each_entry_rcu(zwplg, &disk->zone_wplugs_hash[idx], node) { if (zwplg->zone_no == zwplug->zone_no) { - spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); + spin_unlock_irqrestore(&disk->zone_wplugs_hash_lock, + flags); return false; } } @@ -535,7 +541,7 @@ static bool disk_insert_zone_wplug(struct gendisk *disk, * necessarilly in the active condition. */ zones_cond = rcu_dereference_check(disk->zones_cond, - lockdep_is_held(&disk->zone_wplugs_lock)); + lockdep_is_held(&disk->zone_wplugs_hash_lock)); if (zones_cond) zwplug->cond = zones_cond[zwplug->zone_no]; else @@ -543,7 +549,7 @@ static bool disk_insert_zone_wplug(struct gendisk *disk, hlist_add_head_rcu(&zwplug->node, &disk->zone_wplugs_hash[idx]); atomic_inc(&disk->nr_zone_wplugs); - spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); + spin_unlock_irqrestore(&disk->zone_wplugs_hash_lock, flags); return true; } @@ -596,13 +602,13 @@ static void disk_free_zone_wplug(struct blk_zone_wplug *zwplug) WARN_ON_ONCE(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED); WARN_ON_ONCE(!bio_list_empty(&zwplug->bio_list)); - spin_lock_irqsave(&disk->zone_wplugs_lock, flags); + spin_lock_irqsave(&disk->zone_wplugs_hash_lock, flags); blk_zone_set_cond(rcu_dereference_check(disk->zones_cond, - lockdep_is_held(&disk->zone_wplugs_lock)), + lockdep_is_held(&disk->zone_wplugs_hash_lock)), zwplug->zone_no, zwplug->cond); hlist_del_init_rcu(&zwplug->node); atomic_dec(&disk->nr_zone_wplugs); - spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); + spin_unlock_irqrestore(&disk->zone_wplugs_hash_lock, flags); call_rcu(&zwplug->rcu_head, disk_free_zone_wplug_rcu); } @@ -628,7 +634,41 @@ static void disk_mark_zone_wplug_dead(struct blk_zone_wplug *zwplug) } } -static void blk_zone_wplug_bio_work(struct work_struct *work); +static inline bool disk_check_zone_wplug_dead(struct blk_zone_wplug *zwplug) +{ + if (!(zwplug->flags & BLK_ZONE_WPLUG_DEAD)) + return false; + + /* + * If a new write is received right after a zone reset completes and + * while the disk_zone_wplugs_worker() thread has not yet released the + * reference on the zone write plug after processing the last write to + * the zone, then the new write BIO will see the zone write plug marked + * as dead. This case is however a false positive and a perfectly valid + * pattern. In such case, restore the zone write plug to a live one. + */ + if (!zwplug->wp_offset && bio_list_empty(&zwplug->bio_list)) { + zwplug->flags &= ~BLK_ZONE_WPLUG_DEAD; + refcount_inc(&zwplug->ref); + return false; + } + + return true; +} + +static bool disk_zone_wplug_submit_bio(struct gendisk *disk, + struct blk_zone_wplug *zwplug); + +static void blk_zone_wplug_bio_work(struct work_struct *work) +{ + struct blk_zone_wplug *zwplug = + container_of(work, struct blk_zone_wplug, bio_work); + + disk_zone_wplug_submit_bio(zwplug->disk, zwplug); + + /* Drop the reference we took in disk_zone_wplug_schedule_work(). */ + disk_put_zone_wplug(zwplug); +} /* * Get a reference on the write plug for the zone containing @sector. @@ -666,6 +706,7 @@ again: zwplug->wp_offset = bdev_offset_from_zone_start(disk->part0, sector); bio_list_init(&zwplug->bio_list); INIT_WORK(&zwplug->bio_work, blk_zone_wplug_bio_work); + INIT_LIST_HEAD(&zwplug->entry); zwplug->disk = disk; spin_lock_irqsave(&zwplug->lock, *flags); @@ -701,6 +742,7 @@ static inline void blk_zone_wplug_bio_io_error(struct blk_zone_wplug *zwplug, */ static void disk_zone_wplug_abort(struct blk_zone_wplug *zwplug) { + struct gendisk *disk = zwplug->disk; struct bio *bio; lockdep_assert_held(&zwplug->lock); @@ -714,6 +756,20 @@ static void disk_zone_wplug_abort(struct blk_zone_wplug *zwplug) blk_zone_wplug_bio_io_error(zwplug, bio); zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED; + + /* + * If we are using the per disk zone write plugs worker thread, remove + * the zone write plug from the work list and drop the reference we + * took when the zone write plug was added to that list. + */ + if (blk_queue_zoned_qd1_writes(disk->queue)) { + spin_lock(&disk->zone_wplugs_list_lock); + if (!list_empty(&zwplug->entry)) { + list_del_init(&zwplug->entry); + disk_put_zone_wplug(zwplug); + } + spin_unlock(&disk->zone_wplugs_list_lock); + } } /* @@ -1148,8 +1204,8 @@ void blk_zone_mgmt_bio_endio(struct bio *bio) } } -static void disk_zone_wplug_schedule_bio_work(struct gendisk *disk, - struct blk_zone_wplug *zwplug) +static void disk_zone_wplug_schedule_work(struct gendisk *disk, + struct blk_zone_wplug *zwplug) { lockdep_assert_held(&zwplug->lock); @@ -1162,6 +1218,7 @@ static void disk_zone_wplug_schedule_bio_work(struct gendisk *disk, * and we also drop this reference if the work is already scheduled. */ WARN_ON_ONCE(!(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED)); + WARN_ON_ONCE(blk_queue_zoned_qd1_writes(disk->queue)); refcount_inc(&zwplug->ref); if (!queue_work(disk->zone_wplugs_wq, &zwplug->bio_work)) disk_put_zone_wplug(zwplug); @@ -1201,6 +1258,22 @@ static inline void disk_zone_wplug_add_bio(struct gendisk *disk, bio_list_add(&zwplug->bio_list, bio); trace_disk_zone_wplug_add_bio(zwplug->disk->queue, zwplug->zone_no, bio->bi_iter.bi_sector, bio_sectors(bio)); + + /* + * If we are using the disk zone write plugs worker instead of the per + * zone write plug BIO work, add the zone write plug to the work list + * if it is not already there. Make sure to also get an extra reference + * on the zone write plug so that it does not go away until it is + * removed from the work list. + */ + if (blk_queue_zoned_qd1_writes(disk->queue)) { + spin_lock(&disk->zone_wplugs_list_lock); + if (list_empty(&zwplug->entry)) { + list_add_tail(&zwplug->entry, &disk->zone_wplugs_list); + refcount_inc(&zwplug->ref); + } + spin_unlock(&disk->zone_wplugs_list_lock); + } } /* @@ -1408,12 +1481,12 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs) } /* - * If we got a zone write plug marked as dead, then the user is issuing - * writes to a full zone, or without synchronizing with zone reset or - * zone finish operations. In such case, fail the BIO to signal this - * invalid usage. + * Check if we got a zone write plug marked as dead. If yes, then the + * user is likely issuing writes to a full zone, or without + * synchronizing with zone reset or zone finish operations. In such + * case, fail the BIO to signal this invalid usage. */ - if (zwplug->flags & BLK_ZONE_WPLUG_DEAD) { + if (disk_check_zone_wplug_dead(zwplug)) { spin_unlock_irqrestore(&zwplug->lock, flags); disk_put_zone_wplug(zwplug); bio_io_error(bio); @@ -1432,6 +1505,13 @@ static bool blk_zone_wplug_handle_write(struct bio *bio, unsigned int nr_segs) goto queue_bio; } + /* + * For rotational devices, we will use the gendisk zone write plugs + * work instead of the per zone write plug BIO work, so queue the BIO. + */ + if (blk_queue_zoned_qd1_writes(disk->queue)) + goto queue_bio; + /* If the zone is already plugged, add the BIO to the BIO plug list. */ if (zwplug->flags & BLK_ZONE_WPLUG_PLUGGED) goto queue_bio; @@ -1454,7 +1534,10 @@ queue_bio: if (!(zwplug->flags & BLK_ZONE_WPLUG_PLUGGED)) { zwplug->flags |= BLK_ZONE_WPLUG_PLUGGED; - disk_zone_wplug_schedule_bio_work(disk, zwplug); + if (blk_queue_zoned_qd1_writes(disk->queue)) + wake_up_process(disk->zone_wplugs_worker); + else + disk_zone_wplug_schedule_work(disk, zwplug); } spin_unlock_irqrestore(&zwplug->lock, flags); @@ -1595,16 +1678,22 @@ static void disk_zone_wplug_unplug_bio(struct gendisk *disk, spin_lock_irqsave(&zwplug->lock, flags); - /* Schedule submission of the next plugged BIO if we have one. */ - if (!bio_list_empty(&zwplug->bio_list)) { - disk_zone_wplug_schedule_bio_work(disk, zwplug); - spin_unlock_irqrestore(&zwplug->lock, flags); - return; - } + /* + * For rotational devices, signal the BIO completion to the zone write + * plug work. Otherwise, schedule submission of the next plugged BIO + * if we have one. + */ + if (bio_list_empty(&zwplug->bio_list)) + zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED; + + if (blk_queue_zoned_qd1_writes(disk->queue)) + complete(&disk->zone_wplugs_worker_bio_done); + else if (!bio_list_empty(&zwplug->bio_list)) + disk_zone_wplug_schedule_work(disk, zwplug); - zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED; if (!zwplug->wp_offset || disk_zone_wplug_is_full(disk, zwplug)) disk_mark_zone_wplug_dead(zwplug); + spin_unlock_irqrestore(&zwplug->lock, flags); } @@ -1694,10 +1783,9 @@ void blk_zone_write_plug_finish_request(struct request *req) disk_put_zone_wplug(zwplug); } -static void blk_zone_wplug_bio_work(struct work_struct *work) +static bool disk_zone_wplug_submit_bio(struct gendisk *disk, + struct blk_zone_wplug *zwplug) { - struct blk_zone_wplug *zwplug = - container_of(work, struct blk_zone_wplug, bio_work); struct block_device *bdev; unsigned long flags; struct bio *bio; @@ -1713,7 +1801,7 @@ again: if (!bio) { zwplug->flags &= ~BLK_ZONE_WPLUG_PLUGGED; spin_unlock_irqrestore(&zwplug->lock, flags); - goto put_zwplug; + return false; } trace_blk_zone_wplug_bio(zwplug->disk->queue, zwplug->zone_no, @@ -1727,14 +1815,15 @@ again: goto again; } - bdev = bio->bi_bdev; - /* * blk-mq devices will reuse the extra reference on the request queue * usage counter we took when the BIO was plugged, but the submission * path for BIO-based devices will not do that. So drop this extra * reference here. */ + if (blk_queue_zoned_qd1_writes(disk->queue)) + reinit_completion(&disk->zone_wplugs_worker_bio_done); + bdev = bio->bi_bdev; if (bdev_test_flag(bdev, BD_HAS_SUBMIT_BIO)) { bdev->bd_disk->fops->submit_bio(bio); blk_queue_exit(bdev->bd_disk->queue); @@ -1742,14 +1831,78 @@ again: blk_mq_submit_bio(bio); } -put_zwplug: - /* Drop the reference we took in disk_zone_wplug_schedule_bio_work(). */ - disk_put_zone_wplug(zwplug); + return true; +} + +static struct blk_zone_wplug *disk_get_zone_wplugs_work(struct gendisk *disk) +{ + struct blk_zone_wplug *zwplug; + + spin_lock_irq(&disk->zone_wplugs_list_lock); + zwplug = list_first_entry_or_null(&disk->zone_wplugs_list, + struct blk_zone_wplug, entry); + if (zwplug) + list_del_init(&zwplug->entry); + spin_unlock_irq(&disk->zone_wplugs_list_lock); + + return zwplug; +} + +static int disk_zone_wplugs_worker(void *data) +{ + struct gendisk *disk = data; + struct blk_zone_wplug *zwplug; + unsigned int noio_flag; + + noio_flag = memalloc_noio_save(); + set_user_nice(current, MIN_NICE); + set_freezable(); + + for (;;) { + set_current_state(TASK_INTERRUPTIBLE | TASK_FREEZABLE); + + zwplug = disk_get_zone_wplugs_work(disk); + if (zwplug) { + /* + * Process all BIOs of this zone write plug and then + * drop the reference we took when adding the zone write + * plug to the active list. + */ + set_current_state(TASK_RUNNING); + while (disk_zone_wplug_submit_bio(disk, zwplug)) + blk_wait_io(&disk->zone_wplugs_worker_bio_done); + disk_put_zone_wplug(zwplug); + continue; + } + + /* + * Only sleep if nothing sets the state to running. Else check + * for zone write plugs work again as a newly submitted BIO + * might have added a zone write plug to the work list. + */ + if (get_current_state() == TASK_RUNNING) { + try_to_freeze(); + } else { + if (kthread_should_stop()) { + set_current_state(TASK_RUNNING); + break; + } + schedule(); + } + } + + WARN_ON_ONCE(!list_empty(&disk->zone_wplugs_list)); + memalloc_noio_restore(noio_flag); + + return 0; } void disk_init_zone_resources(struct gendisk *disk) { - spin_lock_init(&disk->zone_wplugs_lock); + spin_lock_init(&disk->zone_wplugs_hash_lock); + spin_lock_init(&disk->zone_wplugs_list_lock); + INIT_LIST_HEAD(&disk->zone_wplugs_list); + init_completion(&disk->zone_wplugs_worker_bio_done); } /* @@ -1765,6 +1918,7 @@ static int disk_alloc_zone_resources(struct gendisk *disk, unsigned int pool_size) { unsigned int i; + int ret = -ENOMEM; atomic_set(&disk->nr_zone_wplugs, 0); disk->zone_wplugs_hash_bits = @@ -1790,8 +1944,21 @@ static int disk_alloc_zone_resources(struct gendisk *disk, if (!disk->zone_wplugs_wq) goto destroy_pool; + disk->zone_wplugs_worker = + kthread_create(disk_zone_wplugs_worker, disk, + "%s_zwplugs_worker", disk->disk_name); + if (IS_ERR(disk->zone_wplugs_worker)) { + ret = PTR_ERR(disk->zone_wplugs_worker); + disk->zone_wplugs_worker = NULL; + goto destroy_wq; + } + wake_up_process(disk->zone_wplugs_worker); + return 0; +destroy_wq: + destroy_workqueue(disk->zone_wplugs_wq); + disk->zone_wplugs_wq = NULL; destroy_pool: mempool_destroy(disk->zone_wplugs_pool); disk->zone_wplugs_pool = NULL; @@ -1799,7 +1966,7 @@ free_hash: kfree(disk->zone_wplugs_hash); disk->zone_wplugs_hash = NULL; disk->zone_wplugs_hash_bits = 0; - return -ENOMEM; + return ret; } static void disk_destroy_zone_wplugs_hash_table(struct gendisk *disk) @@ -1839,16 +2006,22 @@ static void disk_set_zones_cond_array(struct gendisk *disk, u8 *zones_cond) { unsigned long flags; - spin_lock_irqsave(&disk->zone_wplugs_lock, flags); + spin_lock_irqsave(&disk->zone_wplugs_hash_lock, flags); zones_cond = rcu_replace_pointer(disk->zones_cond, zones_cond, - lockdep_is_held(&disk->zone_wplugs_lock)); - spin_unlock_irqrestore(&disk->zone_wplugs_lock, flags); + lockdep_is_held(&disk->zone_wplugs_hash_lock)); + spin_unlock_irqrestore(&disk->zone_wplugs_hash_lock, flags); kfree_rcu_mightsleep(zones_cond); } void disk_free_zone_resources(struct gendisk *disk) { + if (disk->zone_wplugs_worker) { + kthread_stop(disk->zone_wplugs_worker); + disk->zone_wplugs_worker = NULL; + } + WARN_ON_ONCE(!list_empty(&disk->zone_wplugs_list)); + if (disk->zone_wplugs_wq) { destroy_workqueue(disk->zone_wplugs_wq); disk->zone_wplugs_wq = NULL; @@ -1979,9 +2152,6 @@ commit: ret = queue_limits_commit_update(q, &lim); unfreeze: - if (ret) - disk_free_zone_resources(disk); - blk_mq_unfreeze_queue(q, memflags); return ret; diff --git a/crypto/krb5/krb5_api.c b/crypto/krb5/krb5_api.c index 23026d4206c8..c7ea40f900a7 100644 --- a/crypto/krb5/krb5_api.c +++ b/crypto/krb5/krb5_api.c @@ -134,27 +134,69 @@ EXPORT_SYMBOL(crypto_krb5_how_much_data); * Find the offset and size of the data in a secure message so that this * information can be used in the metadata buffer which will get added to the * digest by crypto_krb5_verify_mic(). + * + * Return: 0 if successful, -EBADMSG if the message is too short or -EINVAL if + * the mode is unsupported. */ -void crypto_krb5_where_is_the_data(const struct krb5_enctype *krb5, - enum krb5_crypto_mode mode, - size_t *_offset, size_t *_len) +int crypto_krb5_where_is_the_data(const struct krb5_enctype *krb5, + enum krb5_crypto_mode mode, + size_t *_offset, size_t *_len) { switch (mode) { case KRB5_CHECKSUM_MODE: + if (*_len < krb5->cksum_len) + return -EBADMSG; *_offset += krb5->cksum_len; *_len -= krb5->cksum_len; - return; + return 0; case KRB5_ENCRYPT_MODE: + if (*_len < krb5->conf_len + krb5->cksum_len) + return -EBADMSG; *_offset += krb5->conf_len; *_len -= krb5->conf_len + krb5->cksum_len; - return; + return 0; default: WARN_ON_ONCE(1); - return; + return -EINVAL; } } EXPORT_SYMBOL(crypto_krb5_where_is_the_data); +/** + * crypto_krb5_check_data_len - Check a message is big enough + * @krb5: The encoding to use. + * @mode: Mode of operation. + * @len: The length of the secure blob. + * @min_content: Minimum length of the content inside the blob. + * + * Check that a message is large enough to hold whatever bits the encryption + * type wants to glue on (nonce, checksum) plus a minimum amount of content. + * + * Return: 0 if successful, -EBADMSG if the message is too short or -EINVAL if + * the mode is unsupported. + */ +int crypto_krb5_check_data_len(const struct krb5_enctype *krb5, + enum krb5_crypto_mode mode, + size_t len, size_t min_content) +{ + switch (mode) { + case KRB5_CHECKSUM_MODE: + if (len < krb5->cksum_len || + len - krb5->cksum_len < min_content) + return -EBADMSG; + return 0; + case KRB5_ENCRYPT_MODE: + if (len < krb5->conf_len + krb5->cksum_len || + len - (krb5->conf_len + krb5->cksum_len) < min_content) + return -EBADMSG; + return 0; + default: + WARN_ON_ONCE(1); + return -EINVAL; + } +} +EXPORT_SYMBOL(crypto_krb5_check_data_len); + /* * Prepare the encryption with derived key data. */ diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c index 95300c2f7d8a..1e4c579d2725 100644 --- a/drivers/accel/qaic/qaic_data.c +++ b/drivers/accel/qaic/qaic_data.c @@ -606,8 +606,11 @@ static const struct vm_operations_struct drm_vm_ops = { static int qaic_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma) { struct qaic_bo *bo = to_qaic_bo(obj); + unsigned long remap_start; unsigned long offset = 0; + unsigned long remap_end; struct scatterlist *sg; + unsigned long length; int ret = 0; if (drm_gem_is_imported(obj)) @@ -615,11 +618,27 @@ static int qaic_gem_object_mmap(struct drm_gem_object *obj, struct vm_area_struc for (sg = bo->sgt->sgl; sg; sg = sg_next(sg)) { if (sg_page(sg)) { + /* if sg is too large for the VMA, so truncate it to fit */ + if (check_add_overflow(vma->vm_start, offset, &remap_start)) + return -EINVAL; + if (check_add_overflow(remap_start, sg->length, &remap_end)) + return -EINVAL; + + if (remap_end > vma->vm_end) { + if (check_sub_overflow(vma->vm_end, remap_start, &length)) + return -EINVAL; + } else { + length = sg->length; + } + + if (length == 0) + goto out; + ret = remap_pfn_range(vma, vma->vm_start + offset, page_to_pfn(sg_page(sg)), - sg->length, vma->vm_page_prot); + length, vma->vm_page_prot); if (ret) goto out; - offset += sg->length; + offset += length; } } diff --git a/drivers/acpi/ac.c b/drivers/acpi/ac.c index c5d77c3cb4bc..56783af6239b 100644 --- a/drivers/acpi/ac.c +++ b/drivers/acpi/ac.c @@ -203,11 +203,15 @@ static const struct dmi_system_id ac_dmi_table[] __initconst = { static int acpi_ac_probe(struct platform_device *pdev) { - struct acpi_device *adev = ACPI_COMPANION(&pdev->dev); struct power_supply_config psy_cfg = {}; + struct acpi_device *adev; struct acpi_ac *ac; int result; + adev = ACPI_COMPANION(&pdev->dev); + if (!adev) + return -ENODEV; + ac = kzalloc_obj(struct acpi_ac); if (!ac) return -ENOMEM; diff --git a/drivers/acpi/acpi_pad.c b/drivers/acpi/acpi_pad.c index c9a0bcaba2e4..dea7b2b4e544 100644 --- a/drivers/acpi/acpi_pad.c +++ b/drivers/acpi/acpi_pad.c @@ -426,9 +426,13 @@ static void acpi_pad_notify(acpi_handle handle, u32 event, static int acpi_pad_probe(struct platform_device *pdev) { - struct acpi_device *adev = ACPI_COMPANION(&pdev->dev); + struct acpi_device *adev; acpi_status status; + adev = ACPI_COMPANION(&pdev->dev); + if (!adev) + return -ENODEV; + strscpy(acpi_device_name(adev), ACPI_PROCESSOR_AGGREGATOR_DEVICE_NAME); strscpy(acpi_device_class(adev), ACPI_PROCESSOR_AGGREGATOR_CLASS); diff --git a/drivers/acpi/acpi_tad.c b/drivers/acpi/acpi_tad.c index 6d870d97ada6..49e0710ac5ca 100644 --- a/drivers/acpi/acpi_tad.c +++ b/drivers/acpi/acpi_tad.c @@ -593,12 +593,16 @@ static void acpi_tad_remove(struct platform_device *pdev) static int acpi_tad_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; - acpi_handle handle = ACPI_HANDLE(dev); struct acpi_tad_driver_data *dd; + acpi_handle handle; acpi_status status; unsigned long long caps; int ret; + handle = ACPI_HANDLE(dev); + if (!handle) + return -ENODEV; + ret = acpi_install_cmos_rtc_space_handler(handle); if (ret < 0) { dev_info(dev, "Unable to install space handler\n"); diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c index 8fbad8bc4650..8a60e5f42cd7 100644 --- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -96,6 +96,7 @@ struct acpi_battery { struct power_supply *bat; struct power_supply_desc bat_desc; struct acpi_device *device; + struct device *phys_dev; struct notifier_block pm_nb; struct list_head list; unsigned long update_time; @@ -1035,7 +1036,7 @@ static int acpi_battery_update(struct acpi_battery *battery, bool resume) if ((battery->state & ACPI_BATTERY_STATE_CRITICAL) || (test_bit(ACPI_BATTERY_ALARM_PRESENT, &battery->flags) && (battery->capacity_now <= battery->alarm))) - acpi_pm_wakeup_event(&battery->device->dev); + acpi_pm_wakeup_event(battery->phys_dev); return result; } @@ -1215,10 +1216,14 @@ static void sysfs_battery_cleanup(struct acpi_battery *battery) static int acpi_battery_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); struct acpi_battery *battery; + struct acpi_device *device; int result; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + if (device->dep_unmet) return -EPROBE_DEFER; @@ -1228,6 +1233,7 @@ static int acpi_battery_probe(struct platform_device *pdev) platform_set_drvdata(pdev, battery); + battery->phys_dev = &pdev->dev; battery->device = device; strscpy(acpi_device_name(device), ACPI_BATTERY_DEVICE_NAME); strscpy(acpi_device_class(device), ACPI_BATTERY_CLASS); diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c index 97b05246efab..ff30f993b150 100644 --- a/drivers/acpi/button.c +++ b/drivers/acpi/button.c @@ -531,15 +531,20 @@ static int acpi_lid_input_open(struct input_dev *input) static int acpi_button_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); acpi_notify_handler handler; + struct acpi_device *device; struct acpi_button *button; struct input_dev *input; - const char *hid = acpi_device_hid(device); acpi_status status; char *name, *class; + const char *hid; int error = 0; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + + hid = acpi_device_hid(device); if (!strcmp(hid, ACPI_BUTTON_HID_LID) && lid_init_state == ACPI_BUTTON_LID_INIT_DISABLED) return -ENODEV; diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c index 6f0065257a77..2d94bae5c4d1 100644 --- a/drivers/acpi/ec.c +++ b/drivers/acpi/ec.c @@ -1679,10 +1679,14 @@ static int acpi_ec_setup(struct acpi_ec *ec, struct acpi_device *device, bool ca static int acpi_ec_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; struct acpi_ec *ec; int ret; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + strscpy(acpi_device_name(device), ACPI_EC_DEVICE_NAME); strscpy(acpi_device_class(device), ACPI_EC_CLASS); diff --git a/drivers/acpi/hed.c b/drivers/acpi/hed.c index 4d5e12ed6f3c..060e8d670f5d 100644 --- a/drivers/acpi/hed.c +++ b/drivers/acpi/hed.c @@ -50,9 +50,13 @@ static void acpi_hed_notify(acpi_handle handle, u32 event, void *data) static int acpi_hed_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; int err; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + /* Only one hardware error device */ if (hed_handle) return -EINVAL; diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index d13264fb9e02..9304ac996d41 100644 --- a/drivers/acpi/nfit/core.c +++ b/drivers/acpi/nfit/core.c @@ -3341,12 +3341,16 @@ static int acpi_nfit_probe(struct platform_device *pdev) struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL }; struct acpi_nfit_desc *acpi_desc; struct device *dev = &pdev->dev; - struct acpi_device *adev = ACPI_COMPANION(dev); struct acpi_table_header *tbl; + struct acpi_device *adev; acpi_status status = AE_OK; acpi_size sz; int rc = 0; + adev = ACPI_COMPANION(&pdev->dev); + if (!adev) + return -ENODEV; + rc = acpi_dev_install_notify_handler(adev, ACPI_DEVICE_NOTIFY, acpi_nfit_notify, dev); if (rc) diff --git a/drivers/acpi/pfr_telemetry.c b/drivers/acpi/pfr_telemetry.c index 32bdf8cbe8f2..2387376832a1 100644 --- a/drivers/acpi/pfr_telemetry.c +++ b/drivers/acpi/pfr_telemetry.c @@ -360,10 +360,14 @@ static void pfrt_log_put_idx(void *data) static int acpi_pfrt_log_probe(struct platform_device *pdev) { - acpi_handle handle = ACPI_HANDLE(&pdev->dev); struct pfrt_log_device *pfrt_log_dev; + acpi_handle handle; int ret; + handle = ACPI_HANDLE(&pdev->dev); + if (!handle) + return -ENODEV; + if (!acpi_has_method(handle, "_DSM")) { dev_dbg(&pdev->dev, "Missing _DSM\n"); return -ENODEV; diff --git a/drivers/acpi/pfr_update.c b/drivers/acpi/pfr_update.c index 11b1c2828005..6283105bb0e8 100644 --- a/drivers/acpi/pfr_update.c +++ b/drivers/acpi/pfr_update.c @@ -538,10 +538,14 @@ static void pfru_put_idx(void *data) static int acpi_pfru_probe(struct platform_device *pdev) { - acpi_handle handle = ACPI_HANDLE(&pdev->dev); struct pfru_device *pfru_dev; + acpi_handle handle; int ret; + handle = ACPI_HANDLE(&pdev->dev); + if (!handle) + return -ENODEV; + if (!acpi_has_method(handle, "_DSM")) { dev_dbg(&pdev->dev, "Missing _DSM\n"); return -ENODEV; diff --git a/drivers/acpi/sbs.c b/drivers/acpi/sbs.c index bbd3938f7b52..e32d424ff3e1 100644 --- a/drivers/acpi/sbs.c +++ b/drivers/acpi/sbs.c @@ -631,11 +631,15 @@ static void acpi_sbs_callback(void *context) static int acpi_sbs_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; struct acpi_sbs *sbs; int result = 0; int id; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + sbs = kzalloc_obj(struct acpi_sbs); if (!sbs) { result = -ENOMEM; diff --git a/drivers/acpi/sbshc.c b/drivers/acpi/sbshc.c index 36850831910b..4b2a3ef83573 100644 --- a/drivers/acpi/sbshc.c +++ b/drivers/acpi/sbshc.c @@ -240,11 +240,15 @@ static int smbus_alarm(void *context) static int acpi_smbus_hc_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; int status; unsigned long long val; struct acpi_smb_hc *hc; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + status = acpi_evaluate_integer(device->handle, "_EC", NULL, &val); if (ACPI_FAILURE(status)) { pr_err("error obtaining _EC.\n"); diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c index 64356b004a57..a7bb550a7185 100644 --- a/drivers/acpi/thermal.c +++ b/drivers/acpi/thermal.c @@ -790,7 +790,7 @@ static int acpi_thermal_probe(struct platform_device *pdev) int i; if (!device) - return -EINVAL; + return -ENODEV; tz = kzalloc_obj(struct acpi_thermal); if (!tz) diff --git a/drivers/acpi/tiny-power-button.c b/drivers/acpi/tiny-power-button.c index 531e65b01bcb..92516ef84b02 100644 --- a/drivers/acpi/tiny-power-button.c +++ b/drivers/acpi/tiny-power-button.c @@ -38,9 +38,13 @@ static u32 acpi_tiny_power_button_event(void *not_used) static int acpi_tiny_power_button_probe(struct platform_device *pdev) { - struct acpi_device *device = ACPI_COMPANION(&pdev->dev); + struct acpi_device *device; acpi_status status; + device = ACPI_COMPANION(&pdev->dev); + if (!device) + return -ENODEV; + if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON) { status = acpi_install_fixed_event_handler(ACPI_EVENT_POWER_BUTTON, acpi_tiny_power_button_event, diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 374993031895..f8c2e3192a70 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -5579,6 +5579,7 @@ void ata_link_init(struct ata_port *ap, struct ata_link *link, int pmp) link->pmp = pmp; link->active_tag = ATA_TAG_POISON; link->hw_sata_spd_limit = UINT_MAX; + INIT_WORK(&link->deferred_qc_work, ata_scsi_deferred_qc_work); /* can't use iterator, ap isn't initialized yet */ for (i = 0; i < ATA_MAX_DEVICES; i++) { @@ -5661,7 +5662,6 @@ struct ata_port *ata_port_alloc(struct ata_host *host) mutex_init(&ap->scsi_scan_mutex); INIT_DELAYED_WORK(&ap->hotplug_task, ata_scsi_hotplug); INIT_DELAYED_WORK(&ap->scsi_rescan_task, ata_scsi_dev_rescan); - INIT_WORK(&ap->deferred_qc_work, ata_scsi_deferred_qc_work); INIT_LIST_HEAD(&ap->eh_done_q); init_waitqueue_head(&ap->eh_wait_q); init_completion(&ap->park_req_pending); @@ -6286,12 +6286,15 @@ static void ata_port_detach(struct ata_port *ap) /* It better be dead now and not have any remaining deferred qc. */ WARN_ON(!(ap->pflags & ATA_PFLAG_UNLOADED)); - WARN_ON(ap->deferred_qc); - cancel_work_sync(&ap->deferred_qc_work); cancel_delayed_work_sync(&ap->hotplug_task); cancel_delayed_work_sync(&ap->scsi_rescan_task); + ata_for_each_link(link, ap, PMP_FIRST) { + WARN_ON(link->deferred_qc); + cancel_work_sync(&link->deferred_qc_work); + } + /* Delete port multiplier link transport devices */ if (ap->pmp_link) { int i; diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index 23be85418b3b..5e8a63206108 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -643,11 +643,11 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap, if (qc->scsicmd != scmd) continue; if ((qc->flags & ATA_QCFLAG_ACTIVE) || - qc == ap->deferred_qc) + qc == qc->dev->link->deferred_qc) break; } - if (i < ATA_MAX_QUEUE && qc == ap->deferred_qc) { + if (i < ATA_MAX_QUEUE && qc == qc->dev->link->deferred_qc) { /* * This is a deferred command that timed out while * waiting for the command queue to drain. Since the qc @@ -658,8 +658,8 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap, * deferred qc work from issuing this qc. */ WARN_ON_ONCE(qc->flags & ATA_QCFLAG_ACTIVE); - ap->deferred_qc = NULL; - cancel_work(&ap->deferred_qc_work); + qc->dev->link->deferred_qc = NULL; + cancel_work(&qc->dev->link->deferred_qc_work); set_host_byte(scmd, DID_TIME_OUT); scsi_eh_finish_cmd(scmd, &ap->eh_done_q); } else if (i < ATA_MAX_QUEUE) { diff --git a/drivers/ata/libata-pmp.c b/drivers/ata/libata-pmp.c index e3adc008fed1..e8540931b4a1 100644 --- a/drivers/ata/libata-pmp.c +++ b/drivers/ata/libata-pmp.c @@ -110,13 +110,24 @@ int sata_pmp_qc_defer_cmd_switch(struct ata_queued_cmd *qc) { struct ata_link *link = qc->dev->link; struct ata_port *ap = link->ap; + int ret; if (ap->excl_link == NULL || ap->excl_link == link) { if (ap->nr_active_links == 0 || ata_link_active(link)) { qc->flags |= ATA_QCFLAG_CLEAR_EXCL; - return ata_std_qc_defer(qc); + ret = ata_std_qc_defer(qc); + if (ret == ATA_DEFER_LINK) + return ATA_DEFER_LINK_EXCL; + return ret; } + /* + * Note: ap->excl_link contains the link that is next in line, + * i.e. implicit round robin. If there is only one link + * dispatching, ap->excl_link will be left unclaimed, allowing + * other links to set ap->excl_link, ensuring that the currently + * active link cannot queue any more. + */ ap->excl_link = link; } @@ -571,8 +582,11 @@ static void sata_pmp_detach(struct ata_device *dev) if (ap->ops->pmp_detach) ap->ops->pmp_detach(ap); - ata_for_each_link(tlink, ap, EDGE) + ata_for_each_link(tlink, ap, EDGE) { + WARN_ON(tlink->deferred_qc); + cancel_work_sync(&tlink->deferred_qc_work); ata_eh_detach_dev(tlink->device); + } spin_lock_irqsave(ap->lock, flags); ap->nr_pmp_links = 0; diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index cd607911d724..0b4adfc8dc84 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -1660,8 +1660,9 @@ static void ata_qc_done(struct ata_queued_cmd *qc) void ata_scsi_deferred_qc_work(struct work_struct *work) { - struct ata_port *ap = - container_of(work, struct ata_port, deferred_qc_work); + struct ata_link *link = + container_of(work, struct ata_link, deferred_qc_work); + struct ata_port *ap = link->ap; struct ata_queued_cmd *qc; unsigned long flags; @@ -1672,10 +1673,10 @@ void ata_scsi_deferred_qc_work(struct work_struct *work) * such case, we should not need any more deferring the qc, so warn if * qc_defer() says otherwise. */ - qc = ap->deferred_qc; + qc = link->deferred_qc; if (qc && !ata_port_eh_scheduled(ap)) { WARN_ON_ONCE(ap->ops->qc_defer(qc)); - ap->deferred_qc = NULL; + link->deferred_qc = NULL; ata_qc_issue(qc); } @@ -1684,8 +1685,7 @@ void ata_scsi_deferred_qc_work(struct work_struct *work) void ata_scsi_requeue_deferred_qc(struct ata_port *ap) { - struct ata_queued_cmd *qc = ap->deferred_qc; - struct scsi_cmnd *scmd; + struct ata_link *link; lockdep_assert_held(ap->lock); @@ -1694,20 +1694,25 @@ void ata_scsi_requeue_deferred_qc(struct ata_port *ap) * do not try to be smart about what to do with this deferred command * and simply requeue it by completing it with DID_REQUEUE. */ - if (!qc) - return; - - scmd = qc->scsicmd; - ap->deferred_qc = NULL; - cancel_work(&ap->deferred_qc_work); - ata_qc_free(qc); - scmd->result = (DID_REQUEUE << 16); - scsi_done(scmd); + ata_for_each_link(link, ap, PMP_FIRST) { + struct ata_queued_cmd *qc = link->deferred_qc; + struct scsi_cmnd *scmd; + + if (qc) { + scmd = qc->scsicmd; + link->deferred_qc = NULL; + cancel_work(&link->deferred_qc_work); + ata_qc_free(qc); + scmd->result = (DID_REQUEUE << 16); + scsi_done(scmd); + } + } } -static void ata_scsi_schedule_deferred_qc(struct ata_port *ap) +static void ata_scsi_schedule_deferred_qc(struct ata_link *link) { - struct ata_queued_cmd *qc = ap->deferred_qc; + struct ata_queued_cmd *qc = link->deferred_qc; + struct ata_port *ap = link->ap; lockdep_assert_held(ap->lock); @@ -1724,12 +1729,12 @@ static void ata_scsi_schedule_deferred_qc(struct ata_port *ap) return; } if (!ap->ops->qc_defer(qc)) - queue_work(system_highpri_wq, &ap->deferred_qc_work); + queue_work(system_highpri_wq, &link->deferred_qc_work); } static void ata_scsi_qc_complete(struct ata_queued_cmd *qc) { - struct ata_port *ap = qc->ap; + struct ata_link *link = qc->dev->link; struct scsi_cmnd *cmd = qc->scsicmd; u8 *cdb = cmd->cmnd; bool have_sense = qc->flags & ATA_QCFLAG_SENSE_VALID; @@ -1760,22 +1765,23 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd *qc) ata_qc_done(qc); - ata_scsi_schedule_deferred_qc(ap); + ata_scsi_schedule_deferred_qc(link); } static int ata_scsi_qc_issue(struct ata_port *ap, struct ata_queued_cmd *qc) { + struct ata_link *link = qc->dev->link; int ret; if (!ap->ops->qc_defer) - goto issue; + goto issue_qc; /* * If we already have a deferred qc, then rely on the SCSI layer to * requeue and defer all incoming commands until the deferred qc is * processed, once all on-going commands complete. */ - if (ap->deferred_qc) { + if (link->deferred_qc) { ata_qc_free(qc); return SCSI_MLQUEUE_DEVICE_BUSY; } @@ -1787,38 +1793,46 @@ static int ata_scsi_qc_issue(struct ata_port *ap, struct ata_queued_cmd *qc) break; case ATA_DEFER_LINK: ret = SCSI_MLQUEUE_DEVICE_BUSY; - break; + goto defer_qc; + case ATA_DEFER_LINK_EXCL: + /* + * Drivers making use of ap->excl_link cannot store the QC in + * link->deferred_qc, because the ap->excl_link handling is + * incompatible with the link->deferred_qc workqueue handling. + */ + ret = SCSI_MLQUEUE_DEVICE_BUSY; + goto free_qc; case ATA_DEFER_PORT: ret = SCSI_MLQUEUE_HOST_BUSY; - break; + goto free_qc; default: WARN_ON_ONCE(1); ret = SCSI_MLQUEUE_HOST_BUSY; - break; + goto free_qc; } - if (ret) { - /* - * We must defer this qc: if this is not an NCQ command, keep - * this qc as a deferred one and report to the SCSI layer that - * we issued it so that it is not requeued. The deferred qc will - * be issued with the port deferred_qc_work once all on-going - * commands complete. - */ - if (!ata_is_ncq(qc->tf.protocol)) { - ap->deferred_qc = qc; - return 0; - } +issue_qc: + ata_qc_issue(qc); + return 0; - /* Force a requeue of the command to defer its execution. */ - ata_qc_free(qc); - return ret; +defer_qc: + /* + * We must defer this qc: if this is not an NCQ command, keep + * this qc as a deferred one and report to the SCSI layer that + * we issued it so that it is not requeued. The deferred qc will + * be issued with the port deferred_qc_work once all on-going + * commands complete. + */ + if (!ata_is_ncq(qc->tf.protocol)) { + link->deferred_qc = qc; + return 0; } -issue: - ata_qc_issue(qc); +free_qc: + /* Force a requeue of the command to defer its execution. */ + ata_qc_free(qc); - return 0; + return ret; } /** diff --git a/drivers/ata/sata_sil24.c b/drivers/ata/sata_sil24.c index d642ece9f07a..57f1081b86db 100644 --- a/drivers/ata/sata_sil24.c +++ b/drivers/ata/sata_sil24.c @@ -789,6 +789,7 @@ static int sil24_qc_defer(struct ata_queued_cmd *qc) struct ata_link *link = qc->dev->link; struct ata_port *ap = link->ap; u8 prot = qc->tf.protocol; + int ret; /* * There is a bug in the chip: @@ -826,7 +827,10 @@ static int sil24_qc_defer(struct ata_queued_cmd *qc) qc->flags |= ATA_QCFLAG_CLEAR_EXCL; } - return ata_std_qc_defer(qc); + ret = ata_std_qc_defer(qc); + if (ret == ATA_DEFER_LINK) + return ATA_DEFER_LINK_EXCL; + return ret; } static enum ata_completion_errors sil24_qc_prep(struct ata_queued_cmd *qc) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index a3091924918b..2f6fbc39ebea 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -1230,8 +1230,10 @@ void memblk_nr_poison_inc(unsigned long pfn) const unsigned long block_id = pfn_to_block_id(pfn); struct memory_block *mem = find_memory_block_by_id(block_id); - if (mem) + if (mem) { atomic_long_inc(&mem->nr_hwpoison); + put_device(&mem->dev); + } } void memblk_nr_poison_sub(unsigned long pfn, long i) @@ -1239,8 +1241,10 @@ void memblk_nr_poison_sub(unsigned long pfn, long i) const unsigned long block_id = pfn_to_block_id(pfn); struct memory_block *mem = find_memory_block_by_id(block_id); - if (mem) + if (mem) { atomic_long_sub(i, &mem->nr_hwpoison); + put_device(&mem->dev); + } } static unsigned long memblk_nr_poison(struct memory_block *mem) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 4065336ebd1f..6c1e7347e6a7 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -4565,24 +4565,12 @@ out: return ret; } -static void cancel_tasks_sync(struct rbd_device *rbd_dev) -{ - dout("%s rbd_dev %p\n", __func__, rbd_dev); - - cancel_work_sync(&rbd_dev->acquired_lock_work); - cancel_work_sync(&rbd_dev->released_lock_work); - cancel_delayed_work_sync(&rbd_dev->lock_dwork); - cancel_work_sync(&rbd_dev->unlock_work); -} - /* * header_rwsem must not be held to avoid a deadlock with * rbd_dev_refresh() when flushing notifies. */ static void rbd_unregister_watch(struct rbd_device *rbd_dev) { - cancel_tasks_sync(rbd_dev); - mutex_lock(&rbd_dev->watch_mutex); if (rbd_dev->watch_state == RBD_WATCH_STATE_REGISTERED) __rbd_unregister_watch(rbd_dev); @@ -6548,10 +6536,18 @@ out_err: static void rbd_dev_image_unlock(struct rbd_device *rbd_dev) { + dout("%s rbd_dev %p\n", __func__, rbd_dev); + + disable_delayed_work_sync(&rbd_dev->lock_dwork); + disable_work_sync(&rbd_dev->unlock_work); + down_write(&rbd_dev->lock_rwsem); if (__rbd_is_lock_owner(rbd_dev)) __rbd_release_lock(rbd_dev); up_write(&rbd_dev->lock_rwsem); + + flush_work(&rbd_dev->acquired_lock_work); + flush_work(&rbd_dev->released_lock_work); } /* diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 0bdb804fca83..e5f4942d9911 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -868,6 +868,9 @@ static int ublk_validate_params(const struct ublk_device *ub) if (p->max_sectors > (ub->dev_info.max_io_buf_bytes >> 9)) return -EINVAL; + if (p->max_sectors < PAGE_SECTORS) + return -EINVAL; + if (ublk_dev_is_zoned(ub) && !p->chunk_sectors) return -EINVAL; } else diff --git a/drivers/bluetooth/btintel_pcie.c b/drivers/bluetooth/btintel_pcie.c index 37b744e35bc4..8dbd72895cb2 100644 --- a/drivers/bluetooth/btintel_pcie.c +++ b/drivers/bluetooth/btintel_pcie.c @@ -568,12 +568,10 @@ static int btintel_pcie_get_mac_access(struct btintel_pcie_data *data) reg = btintel_pcie_rd_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG); - reg |= BTINTEL_PCIE_CSR_FUNC_CTRL_STOP_MAC_ACCESS_DIS; - reg |= BTINTEL_PCIE_CSR_FUNC_CTRL_XTAL_CLK_REQ; - if ((reg & BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_ACCESS_STS) == 0) + if (!(reg & BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_ACCESS_REQ)) { reg |= BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_ACCESS_REQ; - - btintel_pcie_wr_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG, reg); + btintel_pcie_wr_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG, reg); + } do { reg = btintel_pcie_rd_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG); @@ -593,16 +591,10 @@ static void btintel_pcie_release_mac_access(struct btintel_pcie_data *data) reg = btintel_pcie_rd_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG); - if (reg & BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_ACCESS_REQ) + if (reg & BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_ACCESS_REQ) { reg &= ~BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_ACCESS_REQ; - - if (reg & BTINTEL_PCIE_CSR_FUNC_CTRL_STOP_MAC_ACCESS_DIS) - reg &= ~BTINTEL_PCIE_CSR_FUNC_CTRL_STOP_MAC_ACCESS_DIS; - - if (reg & BTINTEL_PCIE_CSR_FUNC_CTRL_XTAL_CLK_REQ) - reg &= ~BTINTEL_PCIE_CSR_FUNC_CTRL_XTAL_CLK_REQ; - - btintel_pcie_wr_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG, reg); + btintel_pcie_wr_reg32(data, BTINTEL_PCIE_CSR_FUNC_CTRL_REG, reg); + } } static void *btintel_pcie_copy_tlv(void *dest, enum btintel_pcie_tlv_type type, diff --git a/drivers/bluetooth/btintel_pcie.h b/drivers/bluetooth/btintel_pcie.h index e3d941ffef4a..34aa092bfbe3 100644 --- a/drivers/bluetooth/btintel_pcie.h +++ b/drivers/bluetooth/btintel_pcie.h @@ -34,9 +34,6 @@ #define BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_ACCESS_STS (BIT(20)) #define BTINTEL_PCIE_CSR_FUNC_CTRL_MAC_ACCESS_REQ (BIT(21)) -/* Stop MAC Access disconnection request */ -#define BTINTEL_PCIE_CSR_FUNC_CTRL_STOP_MAC_ACCESS_DIS (BIT(22)) -#define BTINTEL_PCIE_CSR_FUNC_CTRL_XTAL_CLK_REQ (BIT(23)) #define BTINTEL_PCIE_CSR_FUNC_CTRL_BUS_MASTER_STS (BIT(28)) #define BTINTEL_PCIE_CSR_FUNC_CTRL_BUS_MASTER_DISCON (BIT(29)) diff --git a/drivers/bluetooth/btmtk.c b/drivers/bluetooth/btmtk.c index a4b4dacfd2ad..04f183fd3d12 100644 --- a/drivers/bluetooth/btmtk.c +++ b/drivers/bluetooth/btmtk.c @@ -496,6 +496,7 @@ static void btmtk_usb_wmt_recv(struct urb *urb) return; } else if (urb->status == -ENOENT) { /* Avoid suspend failed when usb_kill_urb */ + kfree(urb->setup_packet); return; } @@ -569,6 +570,7 @@ static int btmtk_usb_submit_wmt_recv_urb(struct hci_dev *hdev) if (err != -EPERM && err != -ENODEV) bt_dev_err(hdev, "urb %p submission failed (%d)", urb, -err); + kfree(dr); usb_unanchor_urb(urb); } diff --git a/drivers/bluetooth/hci_ldisc.c b/drivers/bluetooth/hci_ldisc.c index 275ea865bc29..47f4902b40b4 100644 --- a/drivers/bluetooth/hci_ldisc.c +++ b/drivers/bluetooth/hci_ldisc.c @@ -194,7 +194,15 @@ void hci_uart_init_work(struct work_struct *work) err = hci_register_dev(hu->hdev); if (err < 0) { BT_ERR("Can't register HCI device"); + + percpu_down_write(&hu->proto_lock); clear_bit(HCI_UART_PROTO_READY, &hu->flags); + percpu_up_write(&hu->proto_lock); + + /* Safely cancel work after clearing flags */ + cancel_work_sync(&hu->write_work); + + /* Close protocol before freeing hdev */ hu->proto->close(hu); hdev = hu->hdev; hu->hdev = NULL; @@ -263,8 +271,12 @@ static int hci_uart_open(struct hci_dev *hdev) /* Close device */ static int hci_uart_close(struct hci_dev *hdev) { + struct hci_uart *hu = hci_get_drvdata(hdev); + BT_DBG("hdev %p", hdev); + cancel_work_sync(&hu->write_work); + hci_uart_flush(hdev); hdev->flush = NULL; return 0; @@ -531,6 +543,7 @@ static void hci_uart_tty_close(struct tty_struct *tty) { struct hci_uart *hu = tty->disc_data; struct hci_dev *hdev; + bool proto_ready; BT_DBG("tty %p", tty); @@ -540,24 +553,38 @@ static void hci_uart_tty_close(struct tty_struct *tty) if (!hu) return; - hdev = hu->hdev; - if (hdev) - hci_uart_close(hdev); + /* Wait for init_ready to finish to prevent registration races */ + cancel_work_sync(&hu->init_ready); - if (test_bit(HCI_UART_PROTO_READY, &hu->flags)) { + proto_ready = test_bit(HCI_UART_PROTO_READY, &hu->flags); + if (proto_ready) { percpu_down_write(&hu->proto_lock); clear_bit(HCI_UART_PROTO_READY, &hu->flags); percpu_up_write(&hu->proto_lock); + } - cancel_work_sync(&hu->init_ready); - cancel_work_sync(&hu->write_work); + /* + * Unconditionally cancel write_work AFTER clearing PROTO_READY. + * This ensures that concurrent protocol timers cannot requeue + * write_work via hci_uart_tx_wakeup(), permanently preventing + * double-free races and UAFs. + */ + cancel_work_sync(&hu->write_work); + + hdev = hu->hdev; + if (hdev) + hci_uart_close(hdev); /* proto->flush is safely skipped */ + if (proto_ready) { if (hdev) { if (test_bit(HCI_UART_REGISTERED, &hu->flags)) hci_unregister_dev(hdev); - hci_free_dev(hdev); } + /* Close protocol before freeing hdev (intrinsically purges queues) */ hu->proto->close(hu); + + if (hdev) + hci_free_dev(hdev); } clear_bit(HCI_UART_PROTO_SET, &hu->flags); @@ -625,11 +652,12 @@ static void hci_uart_tty_receive(struct tty_struct *tty, const u8 *data, * tty caller */ hu->proto->recv(hu, data, count); - percpu_up_read(&hu->proto_lock); if (hu->hdev) hu->hdev->stat.byte_rx += count; + percpu_up_read(&hu->proto_lock); + tty_unthrottle(tty); } @@ -695,6 +723,10 @@ static int hci_uart_register_dev(struct hci_uart *hu) percpu_down_write(&hu->proto_lock); clear_bit(HCI_UART_PROTO_INIT, &hu->flags); percpu_up_write(&hu->proto_lock); + /* Cancel work after clearing flags */ + cancel_work_sync(&hu->write_work); + + /* Close protocol before freeing hdev */ hu->proto->close(hu); hu->hdev = NULL; hci_free_dev(hdev); diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c index bb9f002aa85e..a18480c46b24 100644 --- a/drivers/bluetooth/hci_qca.c +++ b/drivers/bluetooth/hci_qca.c @@ -48,13 +48,12 @@ #define HCI_MAX_IBS_SIZE 10 #define IBS_WAKE_RETRANS_TIMEOUT_MS 100 -#define IBS_BTSOC_TX_IDLE_TIMEOUT_MS 200 +#define IBS_BTSOC_TX_IDLE_TIMEOUT msecs_to_jiffies(200) #define IBS_HOST_TX_IDLE_TIMEOUT_MS 2000 -#define CMD_TRANS_TIMEOUT_MS 100 -#define MEMDUMP_TIMEOUT_MS 8000 -#define IBS_DISABLE_SSR_TIMEOUT_MS \ - (MEMDUMP_TIMEOUT_MS + FW_DOWNLOAD_TIMEOUT_MS) -#define FW_DOWNLOAD_TIMEOUT_MS 3000 +#define CMD_TRANS_TIMEOUT msecs_to_jiffies(100) +#define MEMDUMP_TIMEOUT msecs_to_jiffies(8000) +#define FW_DOWNLOAD_TIMEOUT msecs_to_jiffies(3000) +#define IBS_DISABLE_SSR_TIMEOUT (MEMDUMP_TIMEOUT + FW_DOWNLOAD_TIMEOUT) /* susclk rate */ #define SUSCLK_RATE_32KHZ 32768 @@ -1093,7 +1092,7 @@ static void qca_controller_memdump(struct work_struct *work) queue_delayed_work(qca->workqueue, &qca->ctrl_memdump_timeout, - msecs_to_jiffies(MEMDUMP_TIMEOUT_MS)); + MEMDUMP_TIMEOUT); skb_pull(skb, sizeof(qca_memdump->ram_dump_size)); qca_memdump->current_seq_no = 0; qca_memdump->received_dump = 0; @@ -1366,7 +1365,7 @@ static int qca_set_baudrate(struct hci_dev *hdev, uint8_t baudrate) if (hu->serdev) serdev_device_wait_until_sent(hu->serdev, - msecs_to_jiffies(CMD_TRANS_TIMEOUT_MS)); + CMD_TRANS_TIMEOUT); /* Give the controller time to process the request */ switch (qca_soc_type(hu)) { @@ -1398,8 +1397,8 @@ static inline void host_set_baudrate(struct hci_uart *hu, unsigned int speed) static int qca_send_power_pulse(struct hci_uart *hu, bool on) { + int timeout = CMD_TRANS_TIMEOUT; int ret; - int timeout = msecs_to_jiffies(CMD_TRANS_TIMEOUT_MS); u8 cmd = on ? QCA_WCN3990_POWERON_PULSE : QCA_WCN3990_POWEROFF_PULSE; /* These power pulses are single byte command which are sent @@ -1604,7 +1603,7 @@ static void qca_wait_for_dump_collection(struct hci_dev *hdev) struct qca_data *qca = hu->priv; wait_on_bit_timeout(&qca->flags, QCA_MEMDUMP_COLLECTION, - TASK_UNINTERRUPTIBLE, MEMDUMP_TIMEOUT_MS); + TASK_UNINTERRUPTIBLE, MEMDUMP_TIMEOUT); clear_bit(QCA_MEMDUMP_COLLECTION, &qca->flags); } @@ -2577,7 +2576,7 @@ static void qca_serdev_remove(struct serdev_device *serdev) static void qca_serdev_shutdown(struct serdev_device *serdev) { int ret; - int timeout = msecs_to_jiffies(CMD_TRANS_TIMEOUT_MS); + int timeout = CMD_TRANS_TIMEOUT; struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); struct hci_uart *hu = &qcadev->serdev_hu; struct hci_dev *hdev = hu->hdev; @@ -2634,7 +2633,7 @@ static int __maybe_unused qca_suspend(struct device *dev) bool tx_pending = false; int ret = 0; u8 cmd; - u32 wait_timeout = 0; + unsigned long wait_timeout = 0; set_bit(QCA_SUSPENDING, &qca->flags); @@ -2655,15 +2654,15 @@ static int __maybe_unused qca_suspend(struct device *dev) if (test_bit(QCA_IBS_DISABLED, &qca->flags) || test_bit(QCA_SSR_TRIGGERED, &qca->flags)) { wait_timeout = test_bit(QCA_SSR_TRIGGERED, &qca->flags) ? - IBS_DISABLE_SSR_TIMEOUT_MS : - FW_DOWNLOAD_TIMEOUT_MS; + IBS_DISABLE_SSR_TIMEOUT : + FW_DOWNLOAD_TIMEOUT; /* QCA_IBS_DISABLED flag is set to true, During FW download * and during memory dump collection. It is reset to false, * After FW download complete. */ wait_on_bit_timeout(&qca->flags, QCA_IBS_DISABLED, - TASK_UNINTERRUPTIBLE, msecs_to_jiffies(wait_timeout)); + TASK_UNINTERRUPTIBLE, wait_timeout); if (test_bit(QCA_IBS_DISABLED, &qca->flags)) { bt_dev_err(hu->hdev, "SSR or FW download time out"); @@ -2715,7 +2714,7 @@ static int __maybe_unused qca_suspend(struct device *dev) if (tx_pending) { serdev_device_wait_until_sent(hu->serdev, - msecs_to_jiffies(CMD_TRANS_TIMEOUT_MS)); + CMD_TRANS_TIMEOUT); serial_clock_vote(HCI_IBS_TX_VOTE_CLOCK_OFF, hu); } @@ -2724,7 +2723,7 @@ static int __maybe_unused qca_suspend(struct device *dev) */ ret = wait_event_interruptible_timeout(qca->suspend_wait_q, qca->rx_ibs_state == HCI_IBS_RX_ASLEEP, - msecs_to_jiffies(IBS_BTSOC_TX_IDLE_TIMEOUT_MS)); + IBS_BTSOC_TX_IDLE_TIMEOUT); if (ret == 0) { ret = -ETIMEDOUT; goto error; diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c index 0f50034e4b68..8ef15e1db0cd 100644 --- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -2279,7 +2279,7 @@ static int hwp_get_cpu_scaling(int cpu) * Return the hybrid scaling factor for P-cores and use the * default core scaling for E-cores. */ - if (hybrid_get_cpu_type(cpu) == INTEL_CPU_TYPE_CORE) + if (hybrid_get_cpu_type(cpu) != INTEL_CPU_TYPE_ATOM) return hybrid_scaling_factor; return core_get_scaling(); diff --git a/drivers/firmware/arm_ffa/bus.c b/drivers/firmware/arm_ffa/bus.c index 9576862d89c4..601c3418e0d9 100644 --- a/drivers/firmware/arm_ffa/bus.c +++ b/drivers/firmware/arm_ffa/bus.c @@ -26,6 +26,8 @@ static int ffa_device_match(struct device *dev, const struct device_driver *drv) id_table = to_ffa_driver(drv)->id_table; ffa_dev = to_ffa_dev(dev); + if (!id_table) + return 0; while (!uuid_is_null(&id_table->uuid)) { /* @@ -123,7 +125,7 @@ int ffa_driver_register(struct ffa_driver *driver, struct module *owner, { int ret; - if (!driver->probe) + if (!driver->probe || !driver->id_table) return -EINVAL; driver->driver.bus = &ffa_bus_type; diff --git a/drivers/firmware/arm_ffa/driver.c b/drivers/firmware/arm_ffa/driver.c index eb2782848283..e0263c3fad70 100644 --- a/drivers/firmware/arm_ffa/driver.c +++ b/drivers/firmware/arm_ffa/driver.c @@ -100,6 +100,7 @@ struct ffa_drv_info { bool mem_ops_native; bool msg_direct_req2_supp; bool bitmap_created; + bool bus_notifier_registered; bool notif_enabled; unsigned int sched_recv_irq; unsigned int notif_pend_irq; @@ -322,6 +323,12 @@ __ffa_partition_info_get(u32 uuid0, u32 uuid1, u32 uuid2, u32 uuid3, #define PART_INFO_ID_MASK GENMASK(15, 0) #define PART_INFO_EXEC_CXT_MASK GENMASK(31, 16) #define PART_INFO_PROPS_MASK GENMASK(63, 32) +#define FFA_PART_INFO_GET_REGS_FIRST_REG 3 +#define FFA_PART_INFO_GET_REGS_REGS_PER_DESC 3 +#define FFA_PART_INFO_GET_REGS_MAX_DESC \ + (((sizeof(ffa_value_t) / sizeof_field(ffa_value_t, a0)) - \ + FFA_PART_INFO_GET_REGS_FIRST_REG) / \ + FFA_PART_INFO_GET_REGS_REGS_PER_DESC) #define PART_INFO_ID(x) ((u16)(FIELD_GET(PART_INFO_ID_MASK, (x)))) #define PART_INFO_EXEC_CXT(x) ((u16)(FIELD_GET(PART_INFO_EXEC_CXT_MASK, (x)))) #define PART_INFO_PROPERTIES(x) ((u32)(FIELD_GET(PART_INFO_PROPS_MASK, (x)))) @@ -329,15 +336,13 @@ static int __ffa_partition_info_get_regs(u32 uuid0, u32 uuid1, u32 uuid2, u32 uuid3, struct ffa_partition_info *buffer, int num_parts) { - u16 buf_sz, start_idx, cur_idx, count = 0, prev_idx = 0, tag = 0; + u16 buf_sz, start_idx = 0, cur_idx, count = 0, tag = 0; struct ffa_partition_info *buf = buffer; ffa_value_t partition_info; do { __le64 *regs; - int idx; - - start_idx = prev_idx ? prev_idx + 1 : 0; + int idx, nr_desc, buf_idx; invoke_ffa_fn((ffa_value_t){ .a0 = FFA_PARTITION_INFO_GET_REGS, @@ -353,15 +358,28 @@ __ffa_partition_info_get_regs(u32 uuid0, u32 uuid1, u32 uuid2, u32 uuid3, count = PARTITION_COUNT(partition_info.a2); if (!buffer || !num_parts) /* count only */ return count; + if (count > num_parts) + return -EINVAL; cur_idx = CURRENT_INDEX(partition_info.a2); + if (cur_idx < start_idx || cur_idx >= count) + return -EINVAL; + + nr_desc = cur_idx - start_idx + 1; + if (nr_desc > FFA_PART_INFO_GET_REGS_MAX_DESC) + return -EINVAL; + + buf_idx = buf - buffer; + if (buf_idx + nr_desc > num_parts) + return -EINVAL; + tag = UUID_INFO_TAG(partition_info.a2); buf_sz = PARTITION_INFO_SZ(partition_info.a2); if (buf_sz > sizeof(*buffer)) buf_sz = sizeof(*buffer); regs = (void *)&partition_info.a3; - for (idx = 0; idx < cur_idx - start_idx + 1; idx++, buf++) { + for (idx = 0; idx < nr_desc; idx++, buf++) { union { uuid_t uuid; u64 regs[2]; @@ -379,7 +397,7 @@ __ffa_partition_info_get_regs(u32 uuid0, u32 uuid1, u32 uuid2, u32 uuid3, uuid_copy(&buf->uuid, &uuid_regs.uuid); regs += 3; } - prev_idx = cur_idx; + start_idx = cur_idx + 1; } while (cur_idx < (count - 1)); @@ -1189,7 +1207,7 @@ static int ffa_sched_recv_cb_update(struct ffa_device *dev, ffa_sched_recv_cb callback, void *cb_data, bool is_registration) { - struct ffa_dev_part_info *partition = NULL, *tmp; + struct ffa_dev_part_info *partition = NULL; struct list_head *phead; bool cb_valid; @@ -1202,11 +1220,11 @@ ffa_sched_recv_cb_update(struct ffa_device *dev, ffa_sched_recv_cb callback, return -EINVAL; } - list_for_each_entry_safe(partition, tmp, phead, node) + list_for_each_entry(partition, phead, node) if (partition->dev == dev) break; - if (!partition) { + if (&partition->node == phead) { pr_err("%s: No such partition ID 0x%x\n", __func__, dev->vm_id); return -EINVAL; } @@ -1445,20 +1463,25 @@ static int ffa_notify_send(struct ffa_device *dev, int notify_id, static void handle_notif_callbacks(u64 bitmap, enum notify_type type) { + ffa_notifier_cb cb; + void *cb_data; int notify_id; - struct notifier_cb_info *cb_info = NULL; for (notify_id = 0; notify_id <= FFA_MAX_NOTIFICATIONS && bitmap; notify_id++, bitmap >>= 1) { if (!(bitmap & 1)) continue; - read_lock(&drv_info->notify_lock); - cb_info = notifier_hnode_get_by_type(notify_id, type); - read_unlock(&drv_info->notify_lock); + scoped_guard(read_lock, &drv_info->notify_lock) { + struct notifier_cb_info *cb_info; - if (cb_info && cb_info->cb) - cb_info->cb(notify_id, cb_info->cb_data); + cb_info = notifier_hnode_get_by_type(notify_id, type); + cb = cb_info ? cb_info->cb : NULL; + cb_data = cb_info ? cb_info->cb_data : NULL; + } + + if (cb) + cb(notify_id, cb_data); } } @@ -1466,39 +1489,56 @@ static void handle_fwk_notif_callbacks(u32 bitmap) { void *buf; uuid_t uuid; + void *fwk_cb_data; int notify_id = 0, target; + ffa_fwk_notifier_cb fwk_cb; struct ffa_indirect_msg_hdr *msg; - struct notifier_cb_info *cb_info = NULL; + size_t min_offset = offsetof(struct ffa_indirect_msg_hdr, uuid); /* Only one framework notification defined and supported for now */ if (!(bitmap & FRAMEWORK_NOTIFY_RX_BUFFER_FULL)) return; - mutex_lock(&drv_info->rx_lock); + scoped_guard(mutex, &drv_info->rx_lock) { + u32 offset, size; - msg = drv_info->rx_buffer; - buf = kmemdup((void *)msg + msg->offset, msg->size, GFP_KERNEL); - if (!buf) { - mutex_unlock(&drv_info->rx_lock); - return; - } + msg = drv_info->rx_buffer; + offset = msg->offset; + size = msg->size; - target = SENDER_ID(msg->send_recv_id); - if (msg->offset >= sizeof(*msg)) - uuid_copy(&uuid, &msg->uuid); - else - uuid_copy(&uuid, &uuid_null); + if (!size || (offset != min_offset && offset < sizeof(*msg)) || + offset > drv_info->rxtx_bufsz || + size > drv_info->rxtx_bufsz - offset) { + pr_err("invalid framework notification message\n"); + ffa_rx_release(); + return; + } - mutex_unlock(&drv_info->rx_lock); + buf = kmemdup((void *)msg + offset, size, GFP_KERNEL); + if (!buf) { + ffa_rx_release(); + return; + } - ffa_rx_release(); + target = SENDER_ID(msg->send_recv_id); + if (offset >= sizeof(*msg)) + uuid_copy(&uuid, &msg->uuid); + else + uuid_copy(&uuid, &uuid_null); + ffa_rx_release(); + } - read_lock(&drv_info->notify_lock); - cb_info = notifier_hnode_get_by_vmid_uuid(notify_id, target, &uuid); - read_unlock(&drv_info->notify_lock); + scoped_guard(read_lock, &drv_info->notify_lock) { + struct notifier_cb_info *cb_info; + + cb_info = notifier_hnode_get_by_vmid_uuid(notify_id, target, + &uuid); + fwk_cb = cb_info ? cb_info->fwk_cb : NULL; + fwk_cb_data = cb_info ? cb_info->cb_data : NULL; + } - if (cb_info && cb_info->fwk_cb) - cb_info->fwk_cb(notify_id, cb_info->cb_data, buf); + if (fwk_cb) + fwk_cb(notify_id, fwk_cb_data, buf); kfree(buf); } @@ -1542,7 +1582,7 @@ static void notif_pcpu_irq_work_fn(struct work_struct *work) struct ffa_drv_info *info = container_of(work, struct ffa_drv_info, notif_pcpu_work); - ffa_self_notif_handle(smp_processor_id(), true, info); + notif_get_and_handle(info); } static const struct ffa_info_ops ffa_drv_info_ops = { @@ -1629,6 +1669,15 @@ static struct notifier_block ffa_bus_nb = { .notifier_call = ffa_bus_notifier, }; +static void ffa_bus_notifier_unregister(void) +{ + if (!drv_info->bus_notifier_registered) + return; + + bus_unregister_notifier(&ffa_bus_type, &ffa_bus_nb); + drv_info->bus_notifier_registered = false; +} + static int ffa_xa_add_partition_info(struct ffa_device *dev) { struct ffa_dev_part_info *info; @@ -1712,6 +1761,8 @@ static void ffa_partitions_cleanup(void) struct list_head *phead; unsigned long idx; + ffa_bus_notifier_unregister(); + /* Clean up/free all registered devices */ ffa_devices_unregister(); @@ -1739,11 +1790,14 @@ static int ffa_setup_partitions(void) ret = bus_register_notifier(&ffa_bus_type, &ffa_bus_nb); if (ret) pr_err("Failed to register FF-A bus notifiers\n"); + else + drv_info->bus_notifier_registered = true; } count = ffa_partition_probe(&uuid_null, &pbuf); if (count <= 0) { pr_info("%s: No partitions found, error %d\n", __func__, count); + ffa_bus_notifier_unregister(); return -EINVAL; } @@ -2063,11 +2117,12 @@ static int __init ffa_init(void) rxtx_bufsz = SZ_4K; } + rxtx_bufsz = PAGE_ALIGN(rxtx_bufsz); drv_info->rxtx_bufsz = rxtx_bufsz; drv_info->rx_buffer = alloc_pages_exact(rxtx_bufsz, GFP_KERNEL); if (!drv_info->rx_buffer) { ret = -ENOMEM; - goto free_pages; + goto free_drv_info; } drv_info->tx_buffer = alloc_pages_exact(rxtx_bufsz, GFP_KERNEL); @@ -2078,7 +2133,7 @@ static int __init ffa_init(void) ret = ffa_rxtx_map(virt_to_phys(drv_info->tx_buffer), virt_to_phys(drv_info->rx_buffer), - PAGE_ALIGN(rxtx_bufsz) / FFA_PAGE_SIZE); + rxtx_bufsz / FFA_PAGE_SIZE); if (ret) { pr_err("failed to register FFA RxTx buffers\n"); goto free_pages; diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c index b2fb92a4bbd1..6b961c9b08b7 100644 --- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -402,21 +402,11 @@ static void __init efi_debugfs_init(void) static inline void efi_debugfs_init(void) {} #endif -/* - * We register the efi subsystem with the firmware subsystem and the - * efivars subsystem with the efi subsystem, if the system was booted with - * EFI. - */ -static int __init efisubsys_init(void) +static int __init efipostcore_init(void) { - int error; - if (!efi_enabled(EFI_RUNTIME_SERVICES)) efi.runtime_supported_mask = 0; - if (!efi_enabled(EFI_BOOT)) - return 0; - if (efi.runtime_supported_mask) { /* * Since we process only one efi_runtime_service() at a time, an @@ -428,9 +418,23 @@ static int __init efisubsys_init(void) pr_err("Creating efi_rts_wq failed, EFI runtime services disabled.\n"); clear_bit(EFI_RUNTIME_SERVICES, &efi.flags); efi.runtime_supported_mask = 0; - return 0; } } + return 0; +} +postcore_initcall(efipostcore_init); + +/* + * We register the efi subsystem with the firmware subsystem and the + * efivars subsystem with the efi subsystem, if the system was booted with + * EFI. + */ +static int __init efisubsys_init(void) +{ + int error; + + if (!efi_enabled(EFI_BOOT)) + return 0; if (efi_rt_services_supported(EFI_RT_SUPPORTED_TIME_SERVICES)) platform_device_register_simple("rtc-efi", 0, NULL, 0); diff --git a/drivers/fwctl/pds/main.c b/drivers/fwctl/pds/main.c index 08872ee8422f..68fe254dd10a 100644 --- a/drivers/fwctl/pds/main.c +++ b/drivers/fwctl/pds/main.c @@ -362,6 +362,9 @@ static void *pdsfc_fw_rpc(struct fwctl_uctx *uctx, enum fwctl_rpc_scope scope, void *out = NULL; int err; + if (in_len < sizeof(*rpc)) + return ERR_PTR(-EINVAL); + err = pdsfc_validate_rpc(pdsfc, rpc, scope); if (err) return ERR_PTR(err); diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index b45fb799e36c..e63096002e92 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -1986,7 +1986,6 @@ menu "Virtual GPIO drivers" config GPIO_AGGREGATOR tristate "GPIO Aggregator" select CONFIGFS_FS - select DEV_SYNC_PROBE help Say yes here to enable the GPIO Aggregator, which provides a way to aggregate existing GPIO lines into a new virtual GPIO chip. diff --git a/drivers/gpio/gpio-aggregator.c b/drivers/gpio/gpio-aggregator.c index 9adf3228c12a..bc6699a821ee 100644 --- a/drivers/gpio/gpio-aggregator.c +++ b/drivers/gpio/gpio-aggregator.c @@ -32,8 +32,6 @@ #include <linux/gpio/forwarder.h> #include <linux/gpio/machine.h> -#include "dev-sync-probe.h" - #define AGGREGATOR_MAX_GPIOS 512 #define AGGREGATOR_LEGACY_PREFIX "_sysfs" @@ -42,7 +40,7 @@ */ struct gpio_aggregator { - struct dev_sync_probe_data probe_data; + struct platform_device *pdev; struct config_group group; struct gpiod_lookup_table *lookups; struct mutex lock; @@ -135,7 +133,7 @@ static bool gpio_aggregator_is_active(struct gpio_aggregator *aggr) { lockdep_assert_held(&aggr->lock); - return aggr->probe_data.pdev && platform_get_drvdata(aggr->probe_data.pdev); + return aggr->pdev && platform_get_drvdata(aggr->pdev); } /* Only aggregators created via legacy sysfs can be "activating". */ @@ -143,7 +141,7 @@ static bool gpio_aggregator_is_activating(struct gpio_aggregator *aggr) { lockdep_assert_held(&aggr->lock); - return aggr->probe_data.pdev && !platform_get_drvdata(aggr->probe_data.pdev); + return aggr->pdev && !platform_get_drvdata(aggr->pdev); } static size_t gpio_aggregator_count_lines(struct gpio_aggregator *aggr) @@ -909,6 +907,7 @@ static int gpio_aggregator_activate(struct gpio_aggregator *aggr) { struct platform_device_info pdevinfo; struct gpio_aggregator_line *line; + struct platform_device *pdev; struct fwnode_handle *swnode; unsigned int n = 0; int ret = 0; @@ -962,15 +961,29 @@ static int gpio_aggregator_activate(struct gpio_aggregator *aggr) gpiod_add_lookup_table(aggr->lookups); - ret = dev_sync_probe_register(&aggr->probe_data, &pdevinfo); - if (ret) + pdev = platform_device_register_full(&pdevinfo); + if (IS_ERR(pdev)) { + ret = PTR_ERR(pdev); goto err_remove_lookup_table; + } + + wait_for_device_probe(); + + scoped_guard(device, &pdev->dev) { + if (!device_is_bound(&pdev->dev)) { + ret = -ENXIO; + goto err_unregister_pdev; + } + } + aggr->pdev = pdev; return 0; +err_unregister_pdev: + platform_device_unregister(pdev); err_remove_lookup_table: - kfree(aggr->lookups->dev_id); gpiod_remove_lookup_table(aggr->lookups); + kfree(aggr->lookups->dev_id); err_remove_swnode: fwnode_remove_software_node(swnode); err_remove_lookups: @@ -981,10 +994,15 @@ err_remove_lookups: static void gpio_aggregator_deactivate(struct gpio_aggregator *aggr) { - dev_sync_probe_unregister(&aggr->probe_data); + struct fwnode_handle *swnode; + + swnode = dev_fwnode(&aggr->pdev->dev); + platform_device_unregister(aggr->pdev); + aggr->pdev = NULL; gpiod_remove_lookup_table(aggr->lookups); kfree(aggr->lookups->dev_id); kfree(aggr->lookups); + fwnode_remove_software_node(swnode); } static void gpio_aggregator_lockup_configfs(struct gpio_aggregator *aggr, @@ -1145,7 +1163,7 @@ gpio_aggregator_device_dev_name_show(struct config_item *item, char *page) guard(mutex)(&aggr->lock); - pdev = aggr->probe_data.pdev; + pdev = aggr->pdev; if (pdev) return sysfs_emit(page, "%s\n", dev_name(&pdev->dev)); @@ -1322,7 +1340,6 @@ gpio_aggregator_make_group(struct config_group *group, const char *name) return ERR_PTR(ret); config_group_init_type_name(&aggr->group, name, &gpio_aggregator_device_type); - dev_sync_probe_init(&aggr->probe_data); return &aggr->group; } @@ -1471,12 +1488,6 @@ static ssize_t gpio_aggregator_new_device_store(struct device_driver *driver, scnprintf(name, sizeof(name), "%s.%d", AGGREGATOR_LEGACY_PREFIX, aggr->id); config_group_init_type_name(&aggr->group, name, &gpio_aggregator_device_type); - /* - * Since the device created by sysfs might be toggled via configfs - * 'live' attribute later, this initialization is needed. - */ - dev_sync_probe_init(&aggr->probe_data); - /* Expose to configfs */ res = configfs_register_group(&gpio_aggregator_subsys.su_group, &aggr->group); @@ -1495,7 +1506,7 @@ static ssize_t gpio_aggregator_new_device_store(struct device_driver *driver, goto remove_table; } - aggr->probe_data.pdev = pdev; + aggr->pdev = pdev; module_put(THIS_MODULE); return count; diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c index 73ae77f0f213..78df6d517d30 100644 --- a/drivers/gpio/gpiolib-cdev.c +++ b/drivers/gpio/gpiolib-cdev.c @@ -1184,6 +1184,7 @@ static int gpio_v2_line_flags_validate(u64 flags) static int gpio_v2_line_config_validate(struct gpio_v2_line_config *lc, unsigned int num_lines) { + size_t unused_attrs; unsigned int i; u64 flags; int ret; @@ -1191,9 +1192,21 @@ static int gpio_v2_line_config_validate(struct gpio_v2_line_config *lc, if (lc->num_attrs > GPIO_V2_LINE_NUM_ATTRS_MAX) return -EINVAL; + unused_attrs = GPIO_V2_LINE_NUM_ATTRS_MAX - lc->num_attrs; + if (!mem_is_zero(lc->padding, sizeof(lc->padding))) return -EINVAL; + for (i = 0; i < lc->num_attrs; i++) { + if (lc->attrs[i].attr.padding != 0) + return -EINVAL; + } + + if (unused_attrs) { + if (!mem_is_zero(&lc->attrs[lc->num_attrs], unused_attrs * sizeof(*lc->attrs))) + return -EINVAL; + } + for (i = 0; i < num_lines; i++) { flags = gpio_v2_line_config_flags(lc, i); ret = gpio_v2_line_flags_validate(flags); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c index ac276bb53c7c..9dd6cfd6c0fe 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c @@ -199,11 +199,18 @@ int amdgpu_gtt_mgr_alloc_entries(struct amdgpu_gtt_mgr *mgr, enum drm_mm_insert_mode mode) { struct amdgpu_device *adev = container_of(mgr, typeof(*adev), mman.gtt_mgr); + u32 alignment = 0; int r; + /* Align to TLB L2 cache entry size to work around "V bit HW bug" */ + if (adev->asic_type == CHIP_TAHITI) { + alignment = 32 * 1024 / AMDGPU_GPU_PAGE_SIZE; + num_pages = ALIGN(num_pages, alignment); + } + spin_lock(&mgr->lock); r = drm_mm_insert_node_in_range(&mgr->mm, mm_node, num_pages, - 0, GART_ENTRY_WITHOUT_BO_COLOR, 0, + alignment, GART_ENTRY_WITHOUT_BO_COLOR, 0, adev->gmc.gart_size >> PAGE_SHIFT, mode); spin_unlock(&mgr->lock); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c index fd881388d612..f27f917e3cdb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c @@ -562,6 +562,11 @@ static void vpe_ring_emit_fence(struct amdgpu_ring *ring, uint64_t addr, amdgpu_ring_write(ring, 0); } + /* WA: Force sync after TRAP to avoid VPE1 fail to power off */ + if (ring->adev->vpe.collaborate_mode) { + amdgpu_ring_write(ring, VPE_CMD_HEADER(VPE_CMD_OPCODE_COLLAB_SYNC, 0)); + amdgpu_ring_write(ring, 0xabcd); + } } static void vpe_ring_emit_pipeline_sync(struct amdgpu_ring *ring) @@ -968,7 +973,7 @@ static const struct amdgpu_ring_funcs vpe_ring_funcs = { .emit_frame_size = 5 + /* vpe_ring_init_cond_exec */ 6 + /* vpe_ring_emit_pipeline_sync */ - 10 + 10 + 10 + /* vpe_ring_emit_fence */ + 12 + 12 + 12 + /* vpe_ring_emit_fence */ /* vpe_ring_emit_vm_flush */ SOC15_FLUSH_GPU_TLB_NUM_WREG * 3 + SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 6, diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c index 9ae424618556..d63ff64943d5 100644 --- a/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vce_v1_0.c @@ -42,9 +42,10 @@ #include "oss/oss_1_0_d.h" #include "oss/oss_1_0_sh_mask.h" +#define VCE_V1_0_ALIGNMENT (32 * 1024) #define VCE_V1_0_FW_SIZE (256 * 1024) #define VCE_V1_0_STACK_SIZE (64 * 1024) -#define VCE_V1_0_DATA_SIZE (7808 * (AMDGPU_MAX_VCE_HANDLES + 1)) +#define VCE_V1_0_DATA_SIZE (ALIGN(7808 * (AMDGPU_MAX_VCE_HANDLES + 1), VCE_V1_0_ALIGNMENT)) #define VCE_STATUS_VCPU_REPORT_FW_LOADED_MASK 0x02 #define VCE_V1_0_GART_PAGE_START \ @@ -194,17 +195,22 @@ static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev) { const struct common_firmware_header *hdr; struct vce_v1_0_fw_signature *sign; - unsigned int ucode_offset; + u32 ucode_offset; + u32 ucode_size; uint32_t chip_id; u32 *cpu_addr; int i; hdr = (const struct common_firmware_header *)adev->vce.fw->data; ucode_offset = le32_to_cpu(hdr->ucode_array_offset_bytes); + ucode_size = hdr->ucode_size_bytes - sizeof(struct vce_v1_0_fw_signature *); cpu_addr = adev->vce.cpu_addr; sign = (void *)adev->vce.fw->data + ucode_offset; + if (ucode_size > VCE_V1_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET) + return -EINVAL; + switch (adev->asic_type) { case CHIP_TAHITI: chip_id = 0x01000014; @@ -236,7 +242,7 @@ static int vce_v1_0_load_fw_signature(struct amdgpu_device *adev) cpu_addr[4] = cpu_to_le32(le32_to_cpu(sign->length) + 64); memset_io(&cpu_addr[5], 0, 44); - memcpy_toio(&cpu_addr[16], &sign[1], hdr->ucode_size_bytes - sizeof(*sign)); + memcpy_toio(&cpu_addr[16], &sign[1], ucode_size); cpu_addr += (le32_to_cpu(sign->length) + 64) / 4; memcpy_toio(&cpu_addr[0], &sign->val[i].sigval[0], 16); @@ -317,18 +323,23 @@ static int vce_v1_0_mc_resume(struct amdgpu_device *adev) WREG32(mmVCE_VCPU_SCRATCH7, AMDGPU_MAX_VCE_HANDLES); offset = adev->vce.gpu_addr + AMDGPU_VCE_FIRMWARE_OFFSET; - size = VCE_V1_0_FW_SIZE; - WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset & 0x7fffffff); + size = VCE_V1_0_FW_SIZE - AMDGPU_VCE_FIRMWARE_OFFSET; + WREG32(mmVCE_VCPU_CACHE_OFFSET0, offset); WREG32(mmVCE_VCPU_CACHE_SIZE0, size); offset += size; size = VCE_V1_0_STACK_SIZE; - WREG32(mmVCE_VCPU_CACHE_OFFSET1, offset & 0x7fffffff); + WARN_ON(!IS_ALIGNED(offset, VCE_V1_0_ALIGNMENT)); + WARN_ON(!IS_ALIGNED(size, VCE_V1_0_ALIGNMENT)); + WREG32(mmVCE_VCPU_CACHE_OFFSET1, offset); WREG32(mmVCE_VCPU_CACHE_SIZE1, size); offset += size; size = VCE_V1_0_DATA_SIZE; - WREG32(mmVCE_VCPU_CACHE_OFFSET2, offset & 0x7fffffff); + WARN_ON(!IS_ALIGNED(offset, VCE_V1_0_ALIGNMENT)); + WARN_ON(!IS_ALIGNED(size, VCE_V1_0_ALIGNMENT)); + WARN_ON((offset + size - adev->vce.gpu_addr) > amdgpu_bo_size(adev->vce.vcpu_bo)); + WREG32(mmVCE_VCPU_CACHE_OFFSET2, offset); WREG32(mmVCE_VCPU_CACHE_SIZE2, size); WREG32_P(mmVCE_LMI_CTRL2, 0x0, ~0x100); @@ -532,12 +543,16 @@ static int vce_v1_0_early_init(struct amdgpu_ip_block *ip_block) * To accomodate that, we put GART to the LOW address range * and reserve some GART pages where we map the VCPU BO, * so that it gets a 32-bit address. + * + * The BAR address is zero and we can't change it + * due to the firmware validation mechanism. + * It seems that it fails to initialize if the address is >= 128 MiB. */ static int vce_v1_0_ensure_vcpu_bo_32bit_addr(struct amdgpu_device *adev) { u64 gpu_addr = amdgpu_bo_gpu_offset(adev->vce.vcpu_bo); u64 bo_size = amdgpu_bo_size(adev->vce.vcpu_bo); - u64 max_vcpu_bo_addr = 0xffffffff - bo_size; + u64 max_vcpu_bo_addr = 0x07ffffff - bo_size; u64 num_pages = ALIGN(bo_size, AMDGPU_GPU_PAGE_SIZE) / AMDGPU_GPU_PAGE_SIZE; u64 pa = amdgpu_gmc_vram_pa(adev, adev->vce.vcpu_bo); u64 flags = AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE | AMDGPU_PTE_VALID; diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c index 94fddf22f5a9..b20a07ea1d94 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c +++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser2.c @@ -492,6 +492,10 @@ static enum bp_result get_gpio_i2c_info( - sizeof(struct atom_common_table_header)) / sizeof(struct atom_gpio_pin_assignment); + if (!bios_get_image(&bp->base, DATA_TABLES(gpio_pin_lut), + le16_to_cpu(header->table_header.structuresize))) + return BP_RESULT_BADBIOSTABLE; + pin = (struct atom_gpio_pin_assignment *) header->gpio_pin; for (table_index = 0; table_index < count; table_index++) { @@ -680,6 +684,11 @@ static enum bp_result bios_parser_get_gpio_pin_info( count = (le16_to_cpu(header->table_header.structuresize) - sizeof(struct atom_common_table_header)) / sizeof(struct atom_gpio_pin_assignment); + + if (!bios_get_image(&bp->base, DATA_TABLES(gpio_pin_lut), + le16_to_cpu(header->table_header.structuresize))) + return BP_RESULT_BADBIOSTABLE; + for (i = 0; i < count; ++i) { if (header->gpio_pin[i].gpio_id != gpio_id) continue; diff --git a/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c b/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c index 8d2cf95ae739..e00dc05c2d9d 100644 --- a/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c +++ b/drivers/gpu/drm/amd/display/dc/bios/bios_parser_helper.c @@ -37,10 +37,13 @@ uint8_t *bios_get_image(struct dc_bios *bp, uint32_t offset, uint32_t size) { - if (bp->bios && offset + size < bp->bios_size) - return bp->bios + offset; - else + if (!bp->bios) return NULL; + + if (offset > bp->bios_size || size > bp->bios_size - offset) + return NULL; + + return bp->bios + offset; } #include "reg_helper.h" diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 3e87b6a553be..73fde9df22d1 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -5993,7 +5993,11 @@ bool dc_process_dmub_aux_transfer_async(struct dc *dc, uint8_t action; union dmub_rb_cmd cmd = {0}; - ASSERT(payload->length <= 16); + if (link_index >= dc->link_count || !dc->links[link_index]) + return false; + + if (payload->length > sizeof(cmd.dp_aux_access.aux_control.dpaux.data)) + return false; cmd.dp_aux_access.header.type = DMUB_CMD__DP_AUX_ACCESS; cmd.dp_aux_access.header.payload_bytes = 0; diff --git a/drivers/gpu/drm/bridge/chipone-icn6211.c b/drivers/gpu/drm/bridge/chipone-icn6211.c index 814713c5bea9..553a1df4688d 100644 --- a/drivers/gpu/drm/bridge/chipone-icn6211.c +++ b/drivers/gpu/drm/bridge/chipone-icn6211.c @@ -758,7 +758,9 @@ static int chipone_i2c_probe(struct i2c_client *client) dev_set_drvdata(dev, icn); i2c_set_clientdata(client, icn); - drm_bridge_add(&icn->bridge); + ret = devm_drm_bridge_add(dev, &icn->bridge); + if (ret) + return ret; return chipone_dsi_host_attach(icn); } diff --git a/drivers/gpu/drm/bridge/ite-it66121.c b/drivers/gpu/drm/bridge/ite-it66121.c index 9246e9c15a6e..ed21f09cd19a 100644 --- a/drivers/gpu/drm/bridge/ite-it66121.c +++ b/drivers/gpu/drm/bridge/ite-it66121.c @@ -1559,6 +1559,11 @@ static int it66121_probe(struct i2c_client *client) return ret; } + ctx->gpio_reset = devm_gpiod_get(dev, "reset", GPIOD_OUT_LOW); + if (IS_ERR(ctx->gpio_reset)) + return dev_err_probe(dev, PTR_ERR(ctx->gpio_reset), + "Failed to get reset GPIO\n"); + it66121_hw_reset(ctx); ctx->regmap = devm_regmap_init_i2c(client, &it66121_regmap_config); diff --git a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c index c9e6505cbd88..2d02cc69f237 100644 --- a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c +++ b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c @@ -251,7 +251,6 @@ static void ge_b850v3_lvds_remove(void) goto out; drm_bridge_remove(&ge_b850v3_lvds_ptr->bridge); - ge_b850v3_lvds_ptr = NULL; out: mutex_unlock(&ge_b850v3_lvds_dev_mutex); @@ -261,6 +260,7 @@ static int ge_b850v3_register(void) { struct i2c_client *stdp4028_i2c = ge_b850v3_lvds_ptr->stdp4028_i2c; struct device *dev = &stdp4028_i2c->dev; + int ret; /* drm bridge initialization */ ge_b850v3_lvds_ptr->bridge.ops = DRM_BRIDGE_OP_DETECT | @@ -277,11 +277,15 @@ static int ge_b850v3_register(void) if (!stdp4028_i2c->irq) return 0; - return devm_request_threaded_irq(&stdp4028_i2c->dev, - stdp4028_i2c->irq, NULL, - ge_b850v3_lvds_irq_handler, - IRQF_TRIGGER_HIGH | IRQF_ONESHOT, - "ge-b850v3-lvds-dp", ge_b850v3_lvds_ptr); + ret = devm_request_threaded_irq(&stdp4028_i2c->dev, + stdp4028_i2c->irq, NULL, + ge_b850v3_lvds_irq_handler, + IRQF_TRIGGER_HIGH | IRQF_ONESHOT, + "ge-b850v3-lvds-dp", ge_b850v3_lvds_ptr); + if (ret) + drm_bridge_remove(&ge_b850v3_lvds_ptr->bridge); + + return ret; } static int stdp4028_ge_b850v3_fw_probe(struct i2c_client *stdp4028_i2c) diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 2915118436ce..0238445cf96c 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -696,6 +696,7 @@ static void drm_dev_init_release(struct drm_device *dev, void *res) mutex_destroy(&dev->master_mutex); mutex_destroy(&dev->clientlist_mutex); mutex_destroy(&dev->filelist_mutex); + mutex_destroy(&dev->gem_lru_mutex); } static int drm_dev_init(struct drm_device *dev, @@ -737,6 +738,7 @@ static int drm_dev_init(struct drm_device *dev, INIT_LIST_HEAD(&dev->vblank_event_list); spin_lock_init(&dev->event_lock); + mutex_init(&dev->gem_lru_mutex); mutex_init(&dev->filelist_mutex); mutex_init(&dev->clientlist_mutex); mutex_init(&dev->master_mutex); diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c index 2b152e3103c3..52151452adf9 100644 --- a/drivers/gpu/drm/drm_gem.c +++ b/drivers/gpu/drm/drm_gem.c @@ -1543,12 +1543,10 @@ EXPORT_SYMBOL(drm_gem_unlock_reservations); * drm_gem_lru_init - initialize a LRU * * @lru: The LRU to initialize - * @lock: The lock protecting the LRU */ void -drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock) +drm_gem_lru_init(struct drm_gem_lru *lru) { - lru->lock = lock; lru->count = 0; INIT_LIST_HEAD(&lru->list); } @@ -1573,14 +1571,10 @@ drm_gem_lru_remove_locked(struct drm_gem_object *obj) void drm_gem_lru_remove(struct drm_gem_object *obj) { - struct drm_gem_lru *lru = obj->lru; - - if (!lru) - return; - - mutex_lock(lru->lock); - drm_gem_lru_remove_locked(obj); - mutex_unlock(lru->lock); + mutex_lock(&obj->dev->gem_lru_mutex); + if (obj->lru) + drm_gem_lru_remove_locked(obj); + mutex_unlock(&obj->dev->gem_lru_mutex); } EXPORT_SYMBOL(drm_gem_lru_remove); @@ -1595,7 +1589,7 @@ EXPORT_SYMBOL(drm_gem_lru_remove); void drm_gem_lru_move_tail_locked(struct drm_gem_lru *lru, struct drm_gem_object *obj) { - lockdep_assert_held_once(lru->lock); + lockdep_assert_held_once(&obj->dev->gem_lru_mutex); if (obj->lru) drm_gem_lru_remove_locked(obj); @@ -1619,9 +1613,9 @@ EXPORT_SYMBOL(drm_gem_lru_move_tail_locked); void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj) { - mutex_lock(lru->lock); + mutex_lock(&obj->dev->gem_lru_mutex); drm_gem_lru_move_tail_locked(lru, obj); - mutex_unlock(lru->lock); + mutex_unlock(&obj->dev->gem_lru_mutex); } EXPORT_SYMBOL(drm_gem_lru_move_tail); @@ -1635,6 +1629,7 @@ EXPORT_SYMBOL(drm_gem_lru_move_tail); * of the shrink callback to check for this (ie. dma_resv_test_signaled()) * or if necessary block until the buffer becomes idle. * + * @dev: DRM device the LRU belongs to * @lru: The LRU to scan * @nr_to_scan: The number of pages to try to reclaim * @remaining: The number of pages left to reclaim, should be initialized by caller @@ -1642,7 +1637,8 @@ EXPORT_SYMBOL(drm_gem_lru_move_tail); * @ticket: Optional ww_acquire_ctx context to use for locking */ unsigned long -drm_gem_lru_scan(struct drm_gem_lru *lru, +drm_gem_lru_scan(struct drm_device *dev, + struct drm_gem_lru *lru, unsigned int nr_to_scan, unsigned long *remaining, bool (*shrink)(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket), @@ -1652,9 +1648,9 @@ drm_gem_lru_scan(struct drm_gem_lru *lru, struct drm_gem_object *obj; unsigned freed = 0; - drm_gem_lru_init(&still_in_lru, lru->lock); + drm_gem_lru_init(&still_in_lru); - mutex_lock(lru->lock); + mutex_lock(&dev->gem_lru_mutex); while (freed < nr_to_scan) { obj = list_first_entry_or_null(&lru->list, typeof(*obj), lru_node); @@ -1677,7 +1673,7 @@ drm_gem_lru_scan(struct drm_gem_lru *lru, * rest of the loop body, to reduce contention with other * code paths that need the LRU lock */ - mutex_unlock(lru->lock); + mutex_unlock(&dev->gem_lru_mutex); if (ticket) ww_acquire_init(ticket, &reservation_ww_class); @@ -1711,7 +1707,7 @@ drm_gem_lru_scan(struct drm_gem_lru *lru, tail: drm_gem_object_put(obj); - mutex_lock(lru->lock); + mutex_lock(&dev->gem_lru_mutex); } /* @@ -1723,7 +1719,7 @@ tail: list_splice_tail(&still_in_lru.list, &lru->list); lru->count += still_in_lru.count; - mutex_unlock(lru->lock); + mutex_unlock(&dev->gem_lru_mutex); return freed; } diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 2906dc6e630e..d52205d714ee 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -5067,7 +5067,7 @@ int intel_dp_as_sdp_unpack(struct drm_dp_as_sdp *as_sdp, as_sdp->length = sdp->sdp_header.HB3 & DP_ADAPTIVE_SYNC_SDP_LENGTH; as_sdp->mode = sdp->db[0] & DP_ADAPTIVE_SYNC_SDP_OPERATION_MODE; as_sdp->vtotal = (sdp->db[2] << 8) | sdp->db[1]; - as_sdp->target_rr = (u64)sdp->db[3] | ((u64)sdp->db[4] & 0x3); + as_sdp->target_rr = ((sdp->db[4] & 0x3) << 8) | sdp->db[3]; as_sdp->target_rr_divider = sdp->db[4] & 0x20 ? true : false; return 0; diff --git a/drivers/gpu/drm/i915/display/intel_plane.c b/drivers/gpu/drm/i915/display/intel_plane.c index 076b9b356481..d0d8dc57d950 100644 --- a/drivers/gpu/drm/i915/display/intel_plane.c +++ b/drivers/gpu/drm/i915/display/intel_plane.c @@ -373,7 +373,7 @@ intel_plane_color_copy_uapi_to_hw_state(struct intel_plane_state *plane_state, bool changed = false; int i = 0; - iter_colorop = plane_state->uapi.color_pipeline; + iter_colorop = from_plane_state->uapi.color_pipeline; while (iter_colorop) { for_each_new_colorop_in_state(state, colorop, new_colorop_state, i) { diff --git a/drivers/gpu/drm/mediatek/mtk_cec.c b/drivers/gpu/drm/mediatek/mtk_cec.c index c7be530ca041..b8ccd6e55bed 100644 --- a/drivers/gpu/drm/mediatek/mtk_cec.c +++ b/drivers/gpu/drm/mediatek/mtk_cec.c @@ -240,7 +240,7 @@ static const struct of_device_id mtk_cec_of_ids[] = { }; MODULE_DEVICE_TABLE(of, mtk_cec_of_ids); -struct platform_driver mtk_cec_driver = { +static struct platform_driver mtk_cec_driver = { .probe = mtk_cec_probe, .remove = mtk_cec_remove, .driver = { diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi_ddc.c b/drivers/gpu/drm/mediatek/mtk_hdmi_ddc.c index 6358e1af69b4..2acbdb025d89 100644 --- a/drivers/gpu/drm/mediatek/mtk_hdmi_ddc.c +++ b/drivers/gpu/drm/mediatek/mtk_hdmi_ddc.c @@ -328,7 +328,7 @@ static const struct of_device_id mtk_hdmi_ddc_match[] = { }; MODULE_DEVICE_TABLE(of, mtk_hdmi_ddc_match); -struct platform_driver mtk_hdmi_ddc_driver = { +static struct platform_driver mtk_hdmi_ddc_driver = { .probe = mtk_hdmi_ddc_probe, .remove = mtk_hdmi_ddc_remove, .driver = { diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi_ddc_v2.c b/drivers/gpu/drm/mediatek/mtk_hdmi_ddc_v2.c index d937219fdb7e..31e81a6de6d8 100644 --- a/drivers/gpu/drm/mediatek/mtk_hdmi_ddc_v2.c +++ b/drivers/gpu/drm/mediatek/mtk_hdmi_ddc_v2.c @@ -389,7 +389,7 @@ static const struct of_device_id mtk_hdmi_ddc_v2_match[] = { }; MODULE_DEVICE_TABLE(of, mtk_hdmi_ddc_v2_match); -struct platform_driver mtk_hdmi_ddc_v2_driver = { +static struct platform_driver mtk_hdmi_ddc_v2_driver = { .probe = mtk_hdmi_ddc_v2_probe, .driver = { .name = "mediatek-hdmi-ddc-v2", diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi_v2.c b/drivers/gpu/drm/mediatek/mtk_hdmi_v2.c index 279ca896b0a2..6cdad4415475 100644 --- a/drivers/gpu/drm/mediatek/mtk_hdmi_v2.c +++ b/drivers/gpu/drm/mediatek/mtk_hdmi_v2.c @@ -50,7 +50,7 @@ enum mtk_hdmi_v2_clk_id { MTK_HDMI_V2_CLK_COUNT, }; -const char *const mtk_hdmi_v2_clk_names[MTK_HDMI_V2_CLK_COUNT] = { +static const char *const mtk_hdmi_v2_clk_names[MTK_HDMI_V2_CLK_COUNT] = { [MTK_HDMI_V2_CLK_HDMI_APB_SEL] = "bus", [MTK_HDMI_V2_CLK_HDCP_SEL] = "hdcp", [MTK_HDMI_V2_CLK_HDCP_24M_SEL] = "hdcp24m", diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index e44302251de5..79acae11154a 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1730,6 +1730,7 @@ static struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) struct adreno_gpu *adreno_gpu; struct msm_gpu *gpu; unsigned int nr_rings; + u32 speedbin; int ret; a5xx_gpu = kzalloc_obj(*a5xx_gpu); @@ -1756,6 +1757,11 @@ static struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) return ERR_PTR(ret); } + /* Set the speedbin value that is passed to userspace */ + if (adreno_read_speedbin(&pdev->dev, &speedbin) || !speedbin) + speedbin = 0xffff; + adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin); + msm_mmu_set_fault_handler(to_msm_vm(gpu->vm)->mmu, gpu, a5xx_fault_handler); diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 0e8a48ca816d..3e26b60d7f67 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -2546,13 +2546,33 @@ static u32 fuse_to_supp_hw(const struct adreno_info *info, u32 fuse) return UINT_MAX; } -static int a6xx_set_supported_hw(struct device *dev, const struct adreno_info *info) +static int a6xx_read_speedbin(struct device *dev, struct a6xx_gpu *a6xx_gpu, + const struct adreno_info *info, u32 *speedbin) +{ + int ret; + + /* Use speedbin fuse if present. Otherwise, fallback to softfuse */ + ret = adreno_read_speedbin(dev, speedbin); + if (ret != -ENOENT) + return ret; + + if (info->quirks & ADRENO_QUIRK_SOFTFUSE) { + *speedbin = a6xx_llc_read(a6xx_gpu, REG_A8XX_CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS); + *speedbin = A8XX_CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS_FINALFREQLIMIT(*speedbin); + return 0; + } + + return -ENOENT; +} + +static int a6xx_set_supported_hw(struct device *dev, struct a6xx_gpu *a6xx_gpu, + const struct adreno_info *info) { u32 supp_hw; u32 speedbin; int ret; - ret = adreno_read_speedbin(dev, &speedbin); + ret = a6xx_read_speedbin(dev, a6xx_gpu, info, &speedbin); /* * -ENOENT means that the platform doesn't support speedbin which is * fine @@ -2586,11 +2606,12 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev) struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev = priv->gpu_pdev; struct adreno_platform_config *config = pdev->dev.platform_data; - struct device_node *node; + const struct adreno_info *info = config->info; struct a6xx_gpu *a6xx_gpu; struct adreno_gpu *adreno_gpu; struct msm_gpu *gpu; extern int enable_preemption; + u32 speedbin; bool is_a7xx; int ret, nr_rings = 1; @@ -2607,21 +2628,22 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev) adreno_gpu->registers = NULL; /* Check if there is a GMU phandle and set it up */ - node = of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0); + struct device_node *node __free(device_node) = + of_parse_phandle(pdev->dev.of_node, "qcom,gmu", 0); /* FIXME: How do we gracefully handle this? */ BUG_ON(!node); adreno_gpu->gmu_is_wrapper = of_device_is_compatible(node, "qcom,adreno-gmu-wrapper"); adreno_gpu->base.hw_apriv = - !!(config->info->quirks & ADRENO_QUIRK_HAS_HW_APRIV); + !!(info->quirks & ADRENO_QUIRK_HAS_HW_APRIV); /* gpu->info only gets assigned in adreno_gpu_init(). A8x is included intentionally */ - is_a7xx = config->info->family >= ADRENO_7XX_GEN1; + is_a7xx = info->family >= ADRENO_7XX_GEN1; a6xx_llc_slices_init(pdev, a6xx_gpu, is_a7xx); - ret = a6xx_set_supported_hw(&pdev->dev, config->info); + ret = a6xx_set_supported_hw(&pdev->dev, a6xx_gpu, info); if (ret) { a6xx_llc_slices_destroy(a6xx_gpu); kfree(a6xx_gpu); @@ -2629,15 +2651,20 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev) } if ((enable_preemption == 1) || (enable_preemption == -1 && - (config->info->quirks & ADRENO_QUIRK_PREEMPTION))) + (info->quirks & ADRENO_QUIRK_PREEMPTION))) nr_rings = 4; - ret = adreno_gpu_init(dev, pdev, adreno_gpu, config->info->funcs, nr_rings); + ret = adreno_gpu_init(dev, pdev, adreno_gpu, info->funcs, nr_rings); if (ret) { a6xx_destroy(&(a6xx_gpu->base.base)); return ERR_PTR(ret); } + /* Set the speedbin value that is passed to userspace */ + if (a6xx_read_speedbin(&pdev->dev, a6xx_gpu, info, &speedbin) || !speedbin) + speedbin = 0xffff; + adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin); + /* * For now only clamp to idle freq for devices where this is known not * to cause power supply issues: @@ -2649,7 +2676,6 @@ static struct msm_gpu *a6xx_gpu_init(struct drm_device *dev) ret = a6xx_gmu_wrapper_init(a6xx_gpu, node); else ret = a6xx_gmu_init(a6xx_gpu, node); - of_node_put(node); if (ret) { a6xx_destroy(&(a6xx_gpu->base.base)); return ERR_PTR(ret); @@ -2699,6 +2725,7 @@ const struct adreno_gpu_funcs a6xx_gpu_funcs = { .create_private_vm = a6xx_create_private_vm, .get_rptr = a6xx_get_rptr, .progress = a6xx_progress, + .sysprof_setup = a6xx_gmu_sysprof_setup, }, .init = a6xx_gpu_init, .get_timestamp = a6xx_gmu_get_timestamp, @@ -2767,6 +2794,7 @@ const struct adreno_gpu_funcs a7xx_gpu_funcs = { .create_private_vm = a6xx_create_private_vm, .get_rptr = a6xx_get_rptr, .progress = a6xx_progress, + .sysprof_setup = a6xx_gmu_sysprof_setup, }, .init = a6xx_gpu_init, .get_timestamp = a6xx_gmu_get_timestamp, diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c index 4f5dbf46132b..b40148b75420 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c @@ -289,6 +289,8 @@ static int a8xx_hfi_send_perf_table(struct a6xx_gmu *gmu) (gmu->nr_gpu_freqs * num_gx_votes * sizeof(gmu->gx_arc_votes[0])) + (gmu->nr_gmu_freqs * num_cx_votes * sizeof(gmu->cx_arc_votes[0])); tbl = kzalloc(size, GFP_KERNEL); + if (!tbl) + return -ENOMEM; tbl->type = HFI_TABLE_GPU_PERF; /* First fill GX votes */ diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c index 785e99fb5bd5..682a09a376fb 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c @@ -376,7 +376,7 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx, *value = adreno_gpu->info->gmem; return 0; case MSM_PARAM_GMEM_BASE: - if (adreno_gpu->info->family >= ADRENO_6XX_GEN4) + if (adreno_gpu->info->family >= ADRENO_6XX_GEN3) *value = 0; else *value = 0x100000; @@ -424,15 +424,21 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_context *ctx, *value = vm->mm_range; return 0; case MSM_PARAM_HIGHEST_BANK_BIT: + if (!adreno_gpu->ubwc_config) + return UERR(ENOENT, drm, "no UBWC on this platform"); *value = adreno_gpu->ubwc_config->highest_bank_bit; return 0; case MSM_PARAM_RAYTRACING: *value = adreno_gpu->has_ray_tracing; return 0; case MSM_PARAM_UBWC_SWIZZLE: + if (!adreno_gpu->ubwc_config) + return UERR(ENOENT, drm, "no UBWC on this platform"); *value = adreno_gpu->ubwc_config->ubwc_swizzle; return 0; case MSM_PARAM_MACROTILE_MODE: + if (!adreno_gpu->ubwc_config) + return UERR(ENOENT, drm, "no UBWC on this platform"); *value = adreno_gpu->ubwc_config->macrotile_mode; return 0; case MSM_PARAM_UCHE_TRAP_BASE: @@ -1182,7 +1188,6 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev, struct msm_gpu_config adreno_gpu_config = { 0 }; struct msm_gpu *gpu = &adreno_gpu->base; const char *gpu_name; - u32 speedbin; int ret; adreno_gpu->funcs = funcs; @@ -1211,10 +1216,6 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev, devm_pm_opp_set_clkname(dev, "core"); } - if (adreno_read_speedbin(dev, &speedbin) || !speedbin) - speedbin = 0xffff; - adreno_gpu->speedbin = (uint16_t) (0xffff & speedbin); - gpu_name = devm_kasprintf(dev, GFP_KERNEL, "%"ADRENO_CHIPID_FMT, ADRENO_CHIPID_ARGS(config->chip_id)); if (!gpu_name) diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h index 29097e6b4253..044ed4d49aa7 100644 --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h @@ -63,6 +63,7 @@ enum adreno_family { #define ADRENO_QUIRK_PREEMPTION BIT(5) #define ADRENO_QUIRK_4GB_VA BIT(6) #define ADRENO_QUIRK_IFPC BIT(7) +#define ADRENO_QUIRK_SOFTFUSE BIT(8) /* Helper for formating the chip_id in the way that userspace tools like * crashdec expect. diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_13_0_kaanapali.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_13_0_kaanapali.h index 0b20401b04cf..e3c47b6702f1 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_13_0_kaanapali.h +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_13_0_kaanapali.h @@ -481,7 +481,7 @@ const struct dpu_mdss_cfg dpu_kaanapali_cfg = { .wb_count = ARRAY_SIZE(kaanapali_wb), .wb = kaanapali_wb, .cwb_count = ARRAY_SIZE(kaanapali_cwb), - .cwb = sm8650_cwb, + .cwb = kaanapali_cwb, .intf_count = ARRAY_SIZE(kaanapali_intf), .intf = kaanapali_intf, .vbif_count = ARRAY_SIZE(sm8650_vbif), diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c index 6e8883dbfad4..590922c4f69b 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c @@ -61,7 +61,7 @@ static int _dpu_format_populate_plane_sizes_ubwc( bool meta = MSM_FORMAT_IS_UBWC(fmt); if (MSM_FORMAT_IS_YUV(fmt)) { - unsigned int stride, sclines; + unsigned int stride, y_sclines, uv_sclines; unsigned int y_tile_width, y_tile_height; unsigned int y_meta_stride, y_meta_scanlines; unsigned int uv_meta_stride, uv_meta_scanlines; @@ -77,23 +77,25 @@ static int _dpu_format_populate_plane_sizes_ubwc( y_tile_width = 32; } - sclines = round_up(fb->height, 16); + y_sclines = round_up(fb->height, 16); + uv_sclines = round_up((fb->height+1)>>1, 16); y_tile_height = 4; } else { stride = round_up(fb->width, 128); y_tile_width = 32; - sclines = round_up(fb->height, 32); + y_sclines = round_up(fb->height, 32); + uv_sclines = round_up((fb->height+1)>>1, 32); y_tile_height = 8; } layout->plane_pitch[0] = stride; layout->plane_size[0] = round_up(layout->plane_pitch[0] * - sclines, DPU_UBWC_PLANE_SIZE_ALIGNMENT); + y_sclines, DPU_UBWC_PLANE_SIZE_ALIGNMENT); layout->plane_pitch[1] = stride; layout->plane_size[1] = round_up(layout->plane_pitch[1] * - sclines, DPU_UBWC_PLANE_SIZE_ALIGNMENT); + uv_sclines, DPU_UBWC_PLANE_SIZE_ALIGNMENT); if (!meta) return 0; diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c index 7545c0293efb..6f2370c9dd98 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c @@ -5,6 +5,7 @@ #include <drm/drm_edid.h> #include <drm/drm_framebuffer.h> +#include <drm/drm_managed.h> #include "dpu_writeback.h" @@ -125,7 +126,7 @@ int dpu_writeback_init(struct drm_device *dev, struct drm_encoder *enc, struct dpu_wb_connector *dpu_wb_conn; int rc = 0; - dpu_wb_conn = devm_kzalloc(dev->dev, sizeof(*dpu_wb_conn), GFP_KERNEL); + dpu_wb_conn = drmm_kzalloc(dev, sizeof(*dpu_wb_conn), GFP_KERNEL); if (!dpu_wb_conn) return -ENOMEM; diff --git a/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c b/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c index 427d3ee2b833..6e0f8671bfb4 100644 --- a/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c +++ b/drivers/gpu/drm/msm/disp/msm_disp_snapshot_util.c @@ -9,7 +9,7 @@ #include "msm_disp_snapshot.h" -static void msm_disp_state_dump_regs(u32 **reg, u32 aligned_len, void __iomem *base_addr) +static void msm_disp_state_dump_regs(u32 **reg, u32 len, void __iomem *base_addr) { u32 len_padded; u32 num_rows; @@ -19,11 +19,11 @@ static void msm_disp_state_dump_regs(u32 **reg, u32 aligned_len, void __iomem *b void __iomem *end_addr; int i; - len_padded = aligned_len * REG_DUMP_ALIGN; - num_rows = aligned_len / REG_DUMP_ALIGN; + len_padded = round_up(len, REG_DUMP_ALIGN); + num_rows = DIV_ROUND_UP(len, REG_DUMP_ALIGN); addr = base_addr; - end_addr = base_addr + aligned_len; + end_addr = base_addr + len; *reg = kvzalloc(len_padded, GFP_KERNEL); if (!*reg) @@ -48,8 +48,8 @@ static void msm_disp_state_dump_regs(u32 **reg, u32 aligned_len, void __iomem *b static void msm_disp_state_print_regs(const u32 *dump_addr, u32 len, void __iomem *base_addr, struct drm_printer *p) { + void __iomem *addr, *end_addr; int i; - void __iomem *addr; u32 num_rows; if (!dump_addr) { @@ -58,6 +58,7 @@ static void msm_disp_state_print_regs(const u32 *dump_addr, u32 len, } addr = base_addr; + end_addr = base_addr + len; num_rows = len / REG_DUMP_ALIGN; for (i = 0; i < num_rows; i++) { @@ -67,6 +68,17 @@ static void msm_disp_state_print_regs(const u32 *dump_addr, u32 len, dump_addr[i * 4 + 2], dump_addr[i * 4 + 3]); addr += REG_DUMP_ALIGN; } + + if (addr != end_addr) { + drm_printf(p, "0x%lx : %08x", + (unsigned long)(addr - base_addr), + dump_addr[i * 4]); + if (addr + 0x4 < end_addr) + drm_printf(p, " %08x", dump_addr[i * 4 + 1]); + if (addr + 0x8 < end_addr) + drm_printf(p, " %08x", dump_addr[i * 4 + 2]); + drm_printf(p, "\n"); + } } void msm_disp_state_print(struct msm_disp_state *state, struct drm_printer *p) @@ -185,7 +197,7 @@ void msm_disp_snapshot_add_block(struct msm_disp_state *disp_state, u32 len, va_end(va); INIT_LIST_HEAD(&new_blk->node); - new_blk->size = ALIGN(len, REG_DUMP_ALIGN); + new_blk->size = len; new_blk->base_addr = base_addr; msm_disp_state_dump_regs(&new_blk->state, new_blk->size, base_addr); diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c index 1c0841a1c101..50474c994d47 100644 --- a/drivers/gpu/drm/msm/dsi/dsi_host.c +++ b/drivers/gpu/drm/msm/dsi/dsi_host.c @@ -2003,6 +2003,7 @@ int msm_dsi_host_init(struct msm_dsi *msm_dsi) /* fixup base address by io offset */ msm_host->ctrl_base += cfg->io_offset; + msm_host->ctrl_size -= cfg->io_offset; ret = devm_regulator_bulk_get_const(&pdev->dev, cfg->num_regulators, cfg->regulator_data, diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c index 195f40e331e5..cc2bcd14b1c2 100644 --- a/drivers/gpu/drm/msm/msm_drv.c +++ b/drivers/gpu/drm/msm/msm_drv.c @@ -128,11 +128,10 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv, /* * Initialize the LRUs: */ - mutex_init(&priv->lru.lock); - drm_gem_lru_init(&priv->lru.unbacked, &priv->lru.lock); - drm_gem_lru_init(&priv->lru.pinned, &priv->lru.lock); - drm_gem_lru_init(&priv->lru.willneed, &priv->lru.lock); - drm_gem_lru_init(&priv->lru.dontneed, &priv->lru.lock); + drm_gem_lru_init(&priv->lru.unbacked); + drm_gem_lru_init(&priv->lru.pinned); + drm_gem_lru_init(&priv->lru.willneed); + drm_gem_lru_init(&priv->lru.dontneed); /* Initialize stall-on-fault */ spin_lock_init(&priv->fault_stall_lock); @@ -140,7 +139,7 @@ static int msm_drm_init(struct device *dev, const struct drm_driver *drv, /* Teach lockdep about lock ordering wrt. shrinker: */ fs_reclaim_acquire(GFP_KERNEL); - might_lock(&priv->lru.lock); + might_lock(&ddev->gem_lru_mutex); fs_reclaim_release(GFP_KERNEL); if (priv->kms_init) { diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h index 6d847d593f1a..617b3c4b42c0 100644 --- a/drivers/gpu/drm/msm/msm_drv.h +++ b/drivers/gpu/drm/msm/msm_drv.h @@ -150,13 +150,6 @@ struct msm_drm_private { * DONTNEED state (ie. can be purged) */ struct drm_gem_lru dontneed; - - /** - * lock: - * - * Protects manipulation of all of the LRUs. - */ - struct mutex lock; } lru; struct notifier_block vmap_notifier; diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 2cb3ab04f125..efd3d3c9a449 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -177,11 +177,11 @@ static void update_lru_locked(struct drm_gem_object *obj) static void update_lru(struct drm_gem_object *obj) { - struct msm_drm_private *priv = obj->dev->dev_private; + struct drm_device *dev = obj->dev; - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); update_lru_locked(obj); - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); } static struct page **get_pages(struct drm_gem_object *obj) @@ -292,11 +292,11 @@ void msm_gem_pin_obj_locked(struct drm_gem_object *obj) static void pin_obj_locked(struct drm_gem_object *obj) { - struct msm_drm_private *priv = obj->dev->dev_private; + struct drm_device *dev = obj->dev; - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); msm_gem_pin_obj_locked(obj); - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); } struct page **msm_gem_pin_pages_locked(struct drm_gem_object *obj) @@ -487,16 +487,16 @@ int msm_gem_pin_vma_locked(struct drm_gem_object *obj, struct drm_gpuva *vma) void msm_gem_unpin_locked(struct drm_gem_object *obj) { - struct msm_drm_private *priv = obj->dev->dev_private; + struct drm_device *dev = obj->dev; struct msm_gem_object *msm_obj = to_msm_bo(obj); msm_gem_assert_locked(obj); - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); msm_obj->pin_count--; GEM_WARN_ON(msm_obj->pin_count < 0); update_lru_locked(obj); - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); } /* Special unpin path for use in fence-signaling path, avoiding the need @@ -507,10 +507,10 @@ void msm_gem_unpin_locked(struct drm_gem_object *obj) */ void msm_gem_unpin_active(struct drm_gem_object *obj) { - struct msm_drm_private *priv = obj->dev->dev_private; + struct drm_device *dev = obj->dev; struct msm_gem_object *msm_obj = to_msm_bo(obj); - GEM_WARN_ON(!mutex_is_locked(&priv->lru.lock)); + GEM_WARN_ON(!mutex_is_locked(&dev->gem_lru_mutex)); msm_obj->pin_count--; GEM_WARN_ON(msm_obj->pin_count < 0); @@ -797,12 +797,12 @@ void msm_gem_put_vaddr(struct drm_gem_object *obj) */ int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv) { - struct msm_drm_private *priv = obj->dev->dev_private; + struct drm_device *dev = obj->dev; struct msm_gem_object *msm_obj = to_msm_bo(obj); msm_gem_lock(obj); - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); if (msm_obj->madv != __MSM_MADV_PURGED) msm_obj->madv = madv; @@ -814,7 +814,7 @@ int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv) */ update_lru_locked(obj); - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); msm_gem_unlock(obj); @@ -824,7 +824,6 @@ int msm_gem_madvise(struct drm_gem_object *obj, unsigned madv) void msm_gem_purge(struct drm_gem_object *obj) { struct drm_device *dev = obj->dev; - struct msm_drm_private *priv = obj->dev->dev_private; struct msm_gem_object *msm_obj = to_msm_bo(obj); msm_gem_assert_locked(obj); @@ -839,10 +838,10 @@ void msm_gem_purge(struct drm_gem_object *obj) put_pages(obj); - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); /* A one-way transition: */ msm_obj->madv = __MSM_MADV_PURGED; - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); drm_gem_free_mmap_offset(obj); diff --git a/drivers/gpu/drm/msm/msm_gem_shrinker.c b/drivers/gpu/drm/msm/msm_gem_shrinker.c index 31fa51a44f86..9d2788f79ace 100644 --- a/drivers/gpu/drm/msm/msm_gem_shrinker.c +++ b/drivers/gpu/drm/msm/msm_gem_shrinker.c @@ -43,8 +43,7 @@ msm_gem_shrinker_count(struct shrinker *shrinker, struct shrink_control *sc) } static bool -with_vm_locks(struct ww_acquire_ctx *ticket, - void (*fn)(struct drm_gem_object *obj), +with_vm_locks(void (*fn)(struct drm_gem_object *obj), struct drm_gem_object *obj) { /* @@ -52,7 +51,7 @@ with_vm_locks(struct ww_acquire_ctx *ticket, * success paths */ struct drm_gpuvm_bo *vm_bo, *last_locked = NULL; - int ret = 0; + bool locked = true; drm_gem_for_each_gpuvm_bo (vm_bo, obj) { struct dma_resv *resv = drm_gpuvm_resv(vm_bo->vm); @@ -60,23 +59,14 @@ with_vm_locks(struct ww_acquire_ctx *ticket, if (resv == obj->resv) continue; - ret = dma_resv_lock(resv, ticket); - - /* - * Since we already skip the case when the VM and obj - * share a resv (ie. _NO_SHARE objs), we don't expect - * to hit a double-locking scenario... which the lock - * unwinding cannot really cope with. - */ - WARN_ON(ret == -EALREADY); - /* - * Don't bother with slow-lock / backoff / retry sequence, - * if we can't get the lock just give up and move on to - * the next object. + * dma_resv_lock can't be used due to acquiring 'ticket' before the + * fs_reclaim lock, which is held in shrinker context */ - if (ret) + if (!dma_resv_trylock(resv)) { + locked = false; goto out_unlock; + } /* * Hold a ref to prevent the vm_bo from being freed @@ -108,11 +98,11 @@ out_unlock: } } - return ret == 0; + return locked; } static bool -purge(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket) +purge(struct drm_gem_object *obj, struct ww_acquire_ctx *unused) { if (!is_purgeable(to_msm_bo(obj))) return false; @@ -120,11 +110,11 @@ purge(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket) if (msm_gem_active(obj)) return false; - return with_vm_locks(ticket, msm_gem_purge, obj); + return with_vm_locks(msm_gem_purge, obj); } static bool -evict(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket) +evict(struct drm_gem_object *obj, struct ww_acquire_ctx *unused) { if (is_unevictable(to_msm_bo(obj))) return false; @@ -132,7 +122,7 @@ evict(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket) if (msm_gem_active(obj)) return false; - return with_vm_locks(ticket, msm_gem_evict, obj); + return with_vm_locks(msm_gem_evict, obj); } static bool @@ -164,7 +154,6 @@ static unsigned long msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) { struct msm_drm_private *priv = shrinker->private_data; - struct ww_acquire_ctx ticket; struct { struct drm_gem_lru *lru; bool (*shrink)(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket); @@ -185,11 +174,14 @@ msm_gem_shrinker_scan(struct shrinker *shrinker, struct shrink_control *sc) for (unsigned i = 0; (nr > 0) && (i < ARRAY_SIZE(stages)); i++) { if (!stages[i].cond) continue; + /* + * 'ticket' not needed on trylock paths + */ stages[i].freed = - drm_gem_lru_scan(stages[i].lru, nr, + drm_gem_lru_scan(priv->dev, stages[i].lru, nr, &stages[i].remaining, stages[i].shrink, - &ticket); + NULL); nr -= stages[i].freed; freed += stages[i].freed; remaining += stages[i].remaining; @@ -255,7 +247,7 @@ msm_gem_shrinker_vmap(struct notifier_block *nb, unsigned long event, void *ptr) unsigned long remaining = 0; for (idx = 0; lrus[idx] && unmapped < vmap_shrink_limit; idx++) { - unmapped += drm_gem_lru_scan(lrus[idx], + unmapped += drm_gem_lru_scan(priv->dev, lrus[idx], vmap_shrink_limit - unmapped, &remaining, vmap_shrink, diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c index 75d9f3574370..771d7bb12c2d 100644 --- a/drivers/gpu/drm/msm/msm_gem_submit.c +++ b/drivers/gpu/drm/msm/msm_gem_submit.c @@ -350,7 +350,7 @@ static int submit_fence_sync(struct msm_gem_submit *submit) static int submit_pin_objects(struct msm_gem_submit *submit) { - struct msm_drm_private *priv = submit->dev->dev_private; + struct drm_device *dev = submit->dev; int i, ret = 0; for (i = 0; i < submit->nr_bos; i++) { @@ -379,11 +379,11 @@ static int submit_pin_objects(struct msm_gem_submit *submit) * get_pages() which could trigger reclaim.. and if we held the LRU lock * could trigger deadlock with the shrinker). */ - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); for (i = 0; i < submit->nr_bos; i++) { msm_gem_pin_obj_locked(submit->bos[i].obj); } - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); submit->bos_pinned = true; diff --git a/drivers/gpu/drm/msm/msm_gem_vma.c b/drivers/gpu/drm/msm/msm_gem_vma.c index 9e3632019bc9..3b418ee32658 100644 --- a/drivers/gpu/drm/msm/msm_gem_vma.c +++ b/drivers/gpu/drm/msm/msm_gem_vma.c @@ -696,7 +696,7 @@ static struct dma_fence * msm_vma_job_run(struct drm_sched_job *_job) { struct msm_vm_bind_job *job = to_msm_vm_bind_job(_job); - struct msm_drm_private *priv = job->vm->drm->dev_private; + struct drm_device *dev = job->vm->drm; struct msm_gem_vm *vm = to_msm_vm(job->vm); struct drm_gem_object *obj; int ret = vm->unusable ? -EINVAL : 0; @@ -739,13 +739,13 @@ msm_vma_job_run(struct drm_sched_job *_job) if (ret) msm_gem_vm_unusable(job->vm); - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); job_foreach_bo (obj, job) { msm_gem_unpin_active(obj); } - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); /* VM_BIND ops are synchronous, so no fence to wait on: */ return NULL; @@ -1299,7 +1299,7 @@ vm_bind_job_pin_objects(struct msm_vm_bind_job *job) return PTR_ERR(pages); } - struct msm_drm_private *priv = job->vm->drm->dev_private; + struct drm_device *dev = job->vm->drm; /* * A second loop while holding the LRU lock (a) avoids acquiring/dropping @@ -1308,10 +1308,10 @@ vm_bind_job_pin_objects(struct msm_vm_bind_job *job) * get_pages() which could trigger reclaim.. and if we held the LRU lock * could trigger deadlock with the shrinker). */ - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); job_foreach_bo (obj, job) msm_gem_pin_obj_locked(obj); - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); job->bos_pinned = true; diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c index 7d449e5202c5..058c71c82cf5 100644 --- a/drivers/gpu/drm/msm/msm_iommu.c +++ b/drivers/gpu/drm/msm/msm_iommu.c @@ -677,7 +677,7 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova, int prot) { struct msm_iommu *iommu = to_msm_iommu(mmu); - size_t ret; + ssize_t ret; WARN_ON(off != 0); @@ -686,7 +686,8 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova, iova |= GENMASK_ULL(63, 49); ret = iommu_map_sgtable(iommu->domain, iova, sgt, prot); - WARN_ON(!ret); + if (ret < 0) + return ret; return (ret == len) ? 0 : -EINVAL; } diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c index 30ddb5351e98..2d6b930b766e 100644 --- a/drivers/gpu/drm/msm/msm_ringbuffer.c +++ b/drivers/gpu/drm/msm/msm_ringbuffer.c @@ -16,13 +16,13 @@ static struct dma_fence *msm_job_run(struct drm_sched_job *job) struct msm_gem_submit *submit = to_msm_submit(job); struct msm_fence_context *fctx = submit->ring->fctx; struct msm_gpu *gpu = submit->gpu; - struct msm_drm_private *priv = gpu->dev->dev_private; + struct drm_device *dev = gpu->dev; unsigned nr_cmds = submit->nr_cmds; int i; msm_fence_init(submit->hw_fence, fctx); - mutex_lock(&priv->lru.lock); + mutex_lock(&dev->gem_lru_mutex); for (i = 0; i < submit->nr_bos; i++) { struct drm_gem_object *obj = submit->bos[i].obj; @@ -32,7 +32,7 @@ static struct dma_fence *msm_job_run(struct drm_sched_job *job) submit->bos_pinned = false; - mutex_unlock(&priv->lru.lock); + mutex_unlock(&dev->gem_lru_mutex); /* TODO move submit path over to using a per-ring lock.. */ mutex_lock(&gpu->lock); diff --git a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml index 3941e7510754..2309870f5031 100644 --- a/drivers/gpu/drm/msm/registers/adreno/a6xx.xml +++ b/drivers/gpu/drm/msm/registers/adreno/a6xx.xml @@ -5016,6 +5016,10 @@ by a particular renderpass/blit. <bitfield pos="1" name="LPAC" type="boolean"/> <bitfield pos="2" name="RAYTRACING" type="boolean"/> </reg32> + <reg32 offset="0x0405" name="CX_MISC_SW_FUSE_FREQ_LIMIT_STATUS" variants="A8XX-"> + <bitfield high="8" low="0" name="FINALFREQLIMIT"/> + <bitfield pos="24" name="SOFTSKUDISABLED" type="boolean"/> + </reg32> </domain> </database> diff --git a/drivers/gpu/drm/v3d/v3d_sched.c b/drivers/gpu/drm/v3d/v3d_sched.c index 6dc871fc9a62..a2dbc92ed4dd 100644 --- a/drivers/gpu/drm/v3d/v3d_sched.c +++ b/drivers/gpu/drm/v3d/v3d_sched.c @@ -105,20 +105,6 @@ v3d_performance_query_info_free(struct v3d_performance_query_info *query_info, } static void -v3d_cpu_job_free(struct drm_sched_job *sched_job) -{ - struct v3d_cpu_job *job = to_cpu_job(sched_job); - - v3d_timestamp_query_info_free(&job->timestamp_query, - job->timestamp_query.count); - - v3d_performance_query_info_free(&job->performance_query, - job->performance_query.count); - - v3d_job_cleanup(&job->base); -} - -static void v3d_switch_perfmon(struct v3d_dev *v3d, struct v3d_job *job) { struct v3d_perfmon *perfmon = v3d->global_perfmon; @@ -861,7 +847,7 @@ static const struct drm_sched_backend_ops v3d_cache_clean_sched_ops = { static const struct drm_sched_backend_ops v3d_cpu_sched_ops = { .run_job = v3d_cpu_job_run, - .free_job = v3d_cpu_job_free + .free_job = v3d_sched_job_free }; static int diff --git a/drivers/gpu/drm/v3d/v3d_submit.c b/drivers/gpu/drm/v3d/v3d_submit.c index fc74351efad5..24de81bc3c1e 100644 --- a/drivers/gpu/drm/v3d/v3d_submit.c +++ b/drivers/gpu/drm/v3d/v3d_submit.c @@ -120,6 +120,24 @@ v3d_render_job_free(struct kref *ref) v3d_job_free(ref); } +static void +v3d_cpu_job_free(struct kref *ref) +{ + struct v3d_cpu_job *job = container_of(ref, struct v3d_cpu_job, + base.refcount); + + v3d_timestamp_query_info_free(&job->timestamp_query, + job->timestamp_query.count); + + v3d_performance_query_info_free(&job->performance_query, + job->performance_query.count); + + if (job->indirect_csd.indirect) + drm_gem_object_put(job->indirect_csd.indirect); + + v3d_job_free(ref); +} + void v3d_job_cleanup(struct v3d_job *job) { if (!job) @@ -1296,7 +1314,7 @@ v3d_submit_cpu_ioctl(struct drm_device *dev, void *data, trace_v3d_submit_cpu_ioctl(&v3d->drm, cpu_job->job_type); ret = v3d_job_init(v3d, file_priv, &cpu_job->base, - v3d_job_free, 0, &se, V3D_CPU); + v3d_cpu_job_free, 0, &se, V3D_CPU); if (ret) { v3d_job_deallocate((void *)&cpu_job); goto fail; @@ -1379,8 +1397,6 @@ fail: v3d_job_cleanup((void *)csd_job); v3d_job_cleanup(clean_job); v3d_put_multisync_post_deps(&se); - kvfree(cpu_job->timestamp_query.queries); - kvfree(cpu_job->performance_query.queries); return ret; } diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index f17660a71a3e..2f3531950aa4 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -317,6 +317,7 @@ virtio_gpu_array_from_handles(struct drm_file *drm_file, u32 *handles, u32 nents void virtio_gpu_array_add_obj(struct virtio_gpu_object_array *objs, struct drm_gem_object *obj); int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs); +int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs); void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs); void virtio_gpu_array_add_fence(struct virtio_gpu_object_array *objs, struct dma_fence *fence); diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c index f22dc5c21cd4..435d37d36034 100644 --- a/drivers/gpu/drm/virtio/virtgpu_gem.c +++ b/drivers/gpu/drm/virtio/virtgpu_gem.c @@ -238,6 +238,23 @@ int virtio_gpu_array_lock_resv(struct virtio_gpu_object_array *objs) return ret; } +int virtio_gpu_lock_one_resv_uninterruptible(struct virtio_gpu_object_array *objs) +{ + int ret; + + if (objs->nents != 1) + return -EINVAL; + + dma_resv_lock(objs->objs[0]->resv, NULL); + + ret = dma_resv_reserve_fences(objs->objs[0]->resv, 1); + if (ret) { + virtio_gpu_array_unlock_resv(objs); + return ret; + } + return 0; +} + void virtio_gpu_array_unlock_resv(struct virtio_gpu_object_array *objs) { if (objs->nents == 1) { diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c index a126d1b25f46..652352424744 100644 --- a/drivers/gpu/drm/virtio/virtgpu_plane.c +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c @@ -215,7 +215,10 @@ static void virtio_gpu_resource_flush(struct drm_plane *plane, if (!objs) return; virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]); - virtio_gpu_array_lock_resv(objs); + if (virtio_gpu_lock_one_resv_uninterruptible(objs)) { + virtio_gpu_array_put_free(objs); + return; + } virtio_gpu_cmd_resource_flush(vgdev, bo->hw_res_handle, x, y, width, height, objs, vgplane_st->fence); @@ -459,7 +462,10 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane, if (!objs) return; virtio_gpu_array_add_obj(objs, vgfb->base.obj[0]); - virtio_gpu_array_lock_resv(objs); + if (virtio_gpu_lock_one_resv_uninterruptible(objs)) { + virtio_gpu_array_put_free(objs); + return; + } virtio_gpu_cmd_transfer_to_host_2d (vgdev, 0, plane->state->crtc_w, diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h index 9d66f168ab8a..bdbcbccd759e 100644 --- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h +++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h @@ -145,6 +145,7 @@ #define MSAA_OPTIMIZATION_REDUC_DISABLE REG_BIT(11) #define COMMON_SLICE_CHICKEN1 XE_REG(0x7010, XE_REG_OPTION_MASKED) +#define XEHP_COMMON_SLICE_CHICKEN1 XE_REG_MCR(0x7010, XE_REG_OPTION_MASKED) #define DISABLE_BOTTOM_CLIP_RECTANGLE_TEST REG_BIT(14) #define HIZ_CHICKEN XE_REG(0x7018, XE_REG_OPTION_MASKED) @@ -167,8 +168,10 @@ #define XEHPG_SC_INSTDONE_EXTRA2 XE_REG_MCR(0x7108) #define COMMON_SLICE_CHICKEN4 XE_REG(0x7300, XE_REG_OPTION_MASKED) +#define XEHP_COMMON_SLICE_CHICKEN4 XE_REG_MCR(0x7300, XE_REG_OPTION_MASKED) #define SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE REG_BIT(12) #define DISABLE_TDC_LOAD_BALANCING_CALC REG_BIT(6) +#define HW_FILTERING REG_BIT(5) #define COMMON_SLICE_CHICKEN3 XE_REG(0x7304, XE_REG_OPTION_MASKED) #define XEHP_COMMON_SLICE_CHICKEN3 XE_REG_MCR(0x7304, XE_REG_OPTION_MASKED) diff --git a/drivers/gpu/drm/xe/xe_gsc.c b/drivers/gpu/drm/xe/xe_gsc.c index 0d13e357fb43..aab59dc647fb 100644 --- a/drivers/gpu/drm/xe/xe_gsc.c +++ b/drivers/gpu/drm/xe/xe_gsc.c @@ -482,8 +482,7 @@ int xe_gsc_init_post_hwconfig(struct xe_gsc *gsc) EXEC_QUEUE_FLAG_PERMANENT, 0); if (IS_ERR(q)) { xe_gt_err(gt, "Failed to create queue for GSC submission\n"); - err = PTR_ERR(q); - goto out_bo; + return PTR_ERR(q); } wq = alloc_ordered_workqueue("gsc-ordered-wq", 0); @@ -506,8 +505,6 @@ int xe_gsc_init_post_hwconfig(struct xe_gsc *gsc) out_q: xe_exec_queue_put(q); -out_bo: - xe_bo_unpin_map_no_vm(bo); return err; } diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_monitor.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_monitor.c index 7d532bded02a..a85ba4435378 100644 --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_monitor.c +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_monitor.c @@ -114,8 +114,10 @@ int xe_gt_sriov_pf_monitor_process_guc2pf(struct xe_gt *gt, const u32 *msg, u32 * VFs with no events are not printed. * * This function can only be called on PF. + * + * Return: always 0 */ -void xe_gt_sriov_pf_monitor_print_events(struct xe_gt *gt, struct drm_printer *p) +int xe_gt_sriov_pf_monitor_print_events(struct xe_gt *gt, struct drm_printer *p) { unsigned int n, total_vfs = xe_gt_sriov_pf_get_totalvfs(gt); const struct xe_gt_sriov_monitor *data; @@ -144,4 +146,6 @@ void xe_gt_sriov_pf_monitor_print_events(struct xe_gt *gt, struct drm_printer *p #undef __format #undef __value } + + return 0; } diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_monitor.h b/drivers/gpu/drm/xe/xe_gt_sriov_pf_monitor.h index 7ca9351a271b..0b8f088d3a16 100644 --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_monitor.h +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_monitor.h @@ -13,7 +13,7 @@ struct drm_printer; struct xe_gt; void xe_gt_sriov_pf_monitor_flr(struct xe_gt *gt, u32 vfid); -void xe_gt_sriov_pf_monitor_print_events(struct xe_gt *gt, struct drm_printer *p); +int xe_gt_sriov_pf_monitor_print_events(struct xe_gt *gt, struct drm_printer *p); #ifdef CONFIG_PCI_IOV int xe_gt_sriov_pf_monitor_process_guc2pf(struct xe_gt *gt, const u32 *msg, u32 len); diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c index 30e8c2cf5f09..82703a25e96d 100644 --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.c +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.c @@ -1129,13 +1129,15 @@ void xe_gt_sriov_vf_write32(struct xe_gt *gt, struct xe_reg reg, u32 val) } /** - * xe_gt_sriov_vf_print_config - Print VF self config. + * xe_gt_sriov_vf_print_config() - Print VF self config. * @gt: the &xe_gt * @p: the &drm_printer * * This function is for VF use only. + * + * Return: always 0. */ -void xe_gt_sriov_vf_print_config(struct xe_gt *gt, struct drm_printer *p) +int xe_gt_sriov_vf_print_config(struct xe_gt *gt, struct drm_printer *p) { struct xe_gt_sriov_vf_selfconfig *config = >->sriov.vf.self_config; struct xe_device *xe = gt_to_xe(gt); @@ -1162,16 +1164,20 @@ void xe_gt_sriov_vf_print_config(struct xe_gt *gt, struct drm_printer *p) drm_printf(p, "GuC contexts:\t%u\n", config->num_ctxs); drm_printf(p, "GuC doorbells:\t%u\n", config->num_dbs); + + return 0; } /** - * xe_gt_sriov_vf_print_runtime - Print VF's runtime regs received from PF. + * xe_gt_sriov_vf_print_runtime() - Print VF's runtime regs received from PF. * @gt: the &xe_gt * @p: the &drm_printer * * This function is for VF use only. + * + * Return: always 0. */ -void xe_gt_sriov_vf_print_runtime(struct xe_gt *gt, struct drm_printer *p) +int xe_gt_sriov_vf_print_runtime(struct xe_gt *gt, struct drm_printer *p) { struct vf_runtime_reg *vf_regs = gt->sriov.vf.runtime.regs; unsigned int size = gt->sriov.vf.runtime.num_regs; @@ -1180,16 +1186,20 @@ void xe_gt_sriov_vf_print_runtime(struct xe_gt *gt, struct drm_printer *p) for (; size--; vf_regs++) drm_printf(p, "%#x = %#x\n", vf_regs->offset, vf_regs->value); + + return 0; } /** - * xe_gt_sriov_vf_print_version - Print VF ABI versions. + * xe_gt_sriov_vf_print_version() - Print VF ABI versions. * @gt: the &xe_gt * @p: the &drm_printer * * This function is for VF use only. + * + * Return: always 0. */ -void xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p) +int xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p) { struct xe_device *xe = gt_to_xe(gt); struct xe_uc_fw_version *guc_version = >->sriov.vf.guc_version; @@ -1219,6 +1229,8 @@ void xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p) GUC_RELAY_VERSION_LATEST_MAJOR, GUC_RELAY_VERSION_LATEST_MINOR); drm_printf(p, "\thandshake:\t%u.%u\n", pf_version->major, pf_version->minor); + + return 0; } static bool vf_post_migration_shutdown(struct xe_gt *gt) diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h index 7d97189c2d3d..4f1c7aa422e7 100644 --- a/drivers/gpu/drm/xe/xe_gt_sriov_vf.h +++ b/drivers/gpu/drm/xe/xe_gt_sriov_vf.h @@ -35,9 +35,9 @@ bool xe_gt_sriov_vf_sched_groups_enabled(struct xe_gt *gt); u32 xe_gt_sriov_vf_read32(struct xe_gt *gt, struct xe_reg reg); void xe_gt_sriov_vf_write32(struct xe_gt *gt, struct xe_reg reg, u32 val); -void xe_gt_sriov_vf_print_config(struct xe_gt *gt, struct drm_printer *p); -void xe_gt_sriov_vf_print_runtime(struct xe_gt *gt, struct drm_printer *p); -void xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p); +int xe_gt_sriov_vf_print_config(struct xe_gt *gt, struct drm_printer *p); +int xe_gt_sriov_vf_print_runtime(struct xe_gt *gt, struct drm_printer *p); +int xe_gt_sriov_vf_print_version(struct xe_gt *gt, struct drm_printer *p); void xe_gt_sriov_vf_wait_valid_ggtt(struct xe_gt *gt); diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c b/drivers/gpu/drm/xe/xe_guc_submit.c index 82412c8dfd37..e948f40fa178 100644 --- a/drivers/gpu/drm/xe/xe_guc_submit.c +++ b/drivers/gpu/drm/xe/xe_guc_submit.c @@ -1623,6 +1623,14 @@ static void guc_exec_queue_fini(struct xe_exec_queue *q) struct xe_guc_exec_queue *ge = q->guc; struct xe_guc *guc = exec_queue_to_guc(q); + if (xe_exec_queue_is_multi_queue_secondary(q)) { + struct xe_exec_queue_group *group = q->multi_queue.group; + + mutex_lock(&group->list_lock); + list_del(&q->multi_queue.link); + mutex_unlock(&group->list_lock); + } + release_guc_id(guc, q); xe_sched_entity_fini(&ge->entity); xe_sched_fini(&ge->sched); @@ -1644,14 +1652,6 @@ static void __guc_exec_queue_destroy_async(struct work_struct *w) guard(xe_pm_runtime)(guc_to_xe(guc)); trace_xe_exec_queue_destroy(q); - if (xe_exec_queue_is_multi_queue_secondary(q)) { - struct xe_exec_queue_group *group = q->multi_queue.group; - - mutex_lock(&group->list_lock); - list_del(&q->multi_queue.link); - mutex_unlock(&group->list_lock); - } - /* Confirm no work left behind accessing device structures */ cancel_delayed_work_sync(&ge->sched.base.work_tdr); diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c index fa90441d3052..449a431ec1d4 100644 --- a/drivers/gpu/drm/xe/xe_oa.c +++ b/drivers/gpu/drm/xe/xe_oa.c @@ -2048,8 +2048,10 @@ int xe_oa_stream_open_ioctl(struct drm_device *dev, u64 data, struct drm_file *f if (XE_IOCTL_DBG(oa->xe, !param.exec_q)) return -ENOENT; - if (XE_IOCTL_DBG(oa->xe, param.exec_q->width > 1)) - return -EOPNOTSUPP; + if (XE_IOCTL_DBG(oa->xe, param.exec_q->width > 1)) { + ret = -EOPNOTSUPP; + goto err_exec_q; + } } /* diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c index 5766fa7742d3..e15553bfb739 100644 --- a/drivers/gpu/drm/xe/xe_tuning.c +++ b/drivers/gpu/drm/xe/xe_tuning.c @@ -110,6 +110,11 @@ static const struct xe_rtp_entry_sr engine_tunings[] = { }; static const struct xe_rtp_entry_sr lrc_tunings[] = { + { XE_RTP_NAME("Tuning: Windower HW Filtering"), + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3599), ENGINE_CLASS(RENDER)), + XE_RTP_ACTIONS(SET(XEHP_COMMON_SLICE_CHICKEN4, HW_FILTERING)) + }, + /* DG2 */ { XE_RTP_NAME("Tuning: L3 cache"), diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c index 9ddd21a21dce..dce0a39d1914 100644 --- a/drivers/gpu/drm/xe/xe_wa.c +++ b/drivers/gpu/drm/xe/xe_wa.c @@ -716,6 +716,14 @@ static const struct xe_rtp_entry_sr lrc_was[] = { XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2004), ENGINE_CLASS(RENDER)), XE_RTP_ACTIONS(SET(VF_SCRATCHPAD, XE2_VFG_TED_CREDIT_INTERFACE_DISABLE)) }, + { XE_RTP_NAME("14019988906"), + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2004), ENGINE_CLASS(RENDER)), + XE_RTP_ACTIONS(SET(XEHP_PSS_CHICKEN, FLSH_IGNORES_PSD)) + }, + { XE_RTP_NAME("18033852989"), + XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2004), ENGINE_CLASS(RENDER)), + XE_RTP_ACTIONS(SET(XEHP_COMMON_SLICE_CHICKEN1, DISABLE_BOTTOM_CLIP_RECTANGLE_TEST)) + }, /* DG1 */ @@ -772,14 +780,6 @@ static const struct xe_rtp_entry_sr lrc_was[] = { /* Xe2_LPG */ - { XE_RTP_NAME("14019988906"), - XE_RTP_RULES(GRAPHICS_VERSION(2004), ENGINE_CLASS(RENDER)), - XE_RTP_ACTIONS(SET(XEHP_PSS_CHICKEN, FLSH_IGNORES_PSD)) - }, - { XE_RTP_NAME("18033852989"), - XE_RTP_RULES(GRAPHICS_VERSION(2004), ENGINE_CLASS(RENDER)), - XE_RTP_ACTIONS(SET(COMMON_SLICE_CHICKEN1, DISABLE_BOTTOM_CLIP_RECTANGLE_TEST)) - }, { XE_RTP_NAME("14021567978"), XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED), ENGINE_CLASS(RENDER)), @@ -810,10 +810,6 @@ static const struct xe_rtp_entry_sr lrc_was[] = { XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)), XE_RTP_ACTIONS(SET(WM_CHICKEN3, HIZ_PLANE_COMPRESSION_DIS)) }, - { XE_RTP_NAME("14019988906"), - XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2002), ENGINE_CLASS(RENDER)), - XE_RTP_ACTIONS(SET(XEHP_PSS_CHICKEN, FLSH_IGNORES_PSD)) - }, { XE_RTP_NAME("14021490052"), XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)), XE_RTP_ACTIONS(SET(FF_MODE, @@ -829,11 +825,7 @@ static const struct xe_rtp_entry_sr lrc_was[] = { }, { XE_RTP_NAME("22021007897"), XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, 2002), ENGINE_CLASS(RENDER)), - XE_RTP_ACTIONS(SET(COMMON_SLICE_CHICKEN4, SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE)) - }, - { XE_RTP_NAME("18033852989"), - XE_RTP_RULES(GRAPHICS_VERSION(2001), ENGINE_CLASS(RENDER)), - XE_RTP_ACTIONS(SET(COMMON_SLICE_CHICKEN1, DISABLE_BOTTOM_CLIP_RECTANGLE_TEST)) + XE_RTP_ACTIONS(SET(XEHP_COMMON_SLICE_CHICKEN4, SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE)) }, /* Xe3_LPG */ @@ -849,7 +841,7 @@ static const struct xe_rtp_entry_sr lrc_was[] = { }, { XE_RTP_NAME("22021007897"), XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3005), ENGINE_CLASS(RENDER)), - XE_RTP_ACTIONS(SET(COMMON_SLICE_CHICKEN4, SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE)) + XE_RTP_ACTIONS(SET(XEHP_COMMON_SLICE_CHICKEN4, SBE_PUSH_CONSTANT_BEHIND_FIX_ENABLE)) }, { XE_RTP_NAME("14024681466"), XE_RTP_RULES(GRAPHICS_VERSION_RANGE(3000, 3005), ENGINE_CLASS(RENDER)), diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 02f7db5c1056..5e754b0a5032 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -234,7 +234,7 @@ static const struct hid_device_id hid_quirks[] = { * used as a driver. See hid_scan_report(). */ static const struct hid_device_id hid_have_special_driver[] = { -#if IS_ENABLED(CONFIG_APPLEDISPLAY) +#if IS_ENABLED(CONFIG_USB_APPLEDISPLAY) { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9218) }, { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x9219) }, { HID_USB_DEVICE(USB_VENDOR_ID_APPLE, 0x921c) }, diff --git a/drivers/hid/hid-uclogic-core.c b/drivers/hid/hid-uclogic-core.c index bd7f93e96e4e..b73f09d26688 100644 --- a/drivers/hid/hid-uclogic-core.c +++ b/drivers/hid/hid-uclogic-core.c @@ -184,7 +184,9 @@ static int uclogic_input_configured(struct hid_device *hdev, suffix = "System Control"; break; } - } else { + } + + if (suffix) { hi->input->name = devm_kasprintf(&hdev->dev, GFP_KERNEL, "%s %s", hdev->name, suffix); if (!hi->input->name) diff --git a/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c b/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c index 16f780bc879b..cb19057f1191 100644 --- a/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c +++ b/drivers/hid/intel-thc-hid/intel-quickspi/quickspi-protocol.c @@ -94,7 +94,7 @@ static int quickspi_get_device_descriptor(struct quickspi_device *qsdev) dev_err_once(qsdev->dev, "Read DEVICE_DESCRIPTOR failed, ret = %d\n", ret); dev_err_once(qsdev->dev, "DEVICE_DESCRIPTOR expected len = %u, actual read = %u\n", input_len, read_len); - return ret; + return ret ?: -EINVAL; } input_rep_type = ((struct input_report_body_header *)read_buf)->input_report_type; @@ -318,7 +318,7 @@ int reset_tic(struct quickspi_device *qsdev) dev_err_once(qsdev->dev, "Read RESET_RESPONSE body failed, ret = %d\n", ret); dev_err_once(qsdev->dev, "RESET_RESPONSE body expected len = %u, actual = %u\n", read_len, actual_read_len); - return ret; + return ret ?: -EINVAL; } input_rep_type = FIELD_GET(HIDSPI_IN_REP_BDY_HDR_REP_TYPE, reset_response); diff --git a/drivers/hwmon/lm90.c b/drivers/hwmon/lm90.c index 3c10a5066b53..1eeb608e5903 100644 --- a/drivers/hwmon/lm90.c +++ b/drivers/hwmon/lm90.c @@ -736,6 +736,7 @@ struct lm90_data { struct hwmon_chip_info chip; struct delayed_work alert_work; struct work_struct report_work; + bool shutdown; /* true if shutting down */ bool valid; /* true if register values are valid */ bool alarms_valid; /* true if status register values are valid */ unsigned long last_updated; /* in jiffies */ @@ -1154,6 +1155,9 @@ static void lm90_report_alarms(struct work_struct *work) static int lm90_update_alarms_locked(struct lm90_data *data, bool force) { + if (data->shutdown) + return 0; + if (force || !data->alarms_valid || time_after(jiffies, data->alarms_updated + msecs_to_jiffies(data->update_interval))) { struct i2c_client *client = data->client; @@ -2584,15 +2588,23 @@ static void lm90_restore_conf(void *_data) struct lm90_data *data = _data; struct i2c_client *client = data->client; - cancel_delayed_work_sync(&data->alert_work); - cancel_work_sync(&data->report_work); - /* Restore initial configuration */ if (data->flags & LM90_HAVE_CONVRATE) lm90_write_convrate(data, data->convrate_orig); lm90_write_reg(client, LM90_REG_CONFIG1, data->config_orig); } +static void lm90_stop_work(void *_data) +{ + struct lm90_data *data = _data; + + hwmon_lock(data->hwmon_dev); + data->shutdown = true; + hwmon_unlock(data->hwmon_dev); + cancel_delayed_work_sync(&data->alert_work); + cancel_work_sync(&data->report_work); +} + static int lm90_init_client(struct i2c_client *client, struct lm90_data *data) { struct device_node *np = client->dev.of_node; @@ -2902,6 +2914,10 @@ static int lm90_probe(struct i2c_client *client) data->hwmon_dev = hwmon_dev; + err = devm_add_action_or_reset(&client->dev, lm90_stop_work, data); + if (err) + return err; + if (client->irq) { dev_dbg(dev, "IRQ: %d\n", client->irq); err = devm_request_threaded_irq(dev, client->irq, @@ -2930,7 +2946,8 @@ static void lm90_alert(struct i2c_client *client, enum i2c_alert_protocol type, */ struct lm90_data *data = i2c_get_clientdata(client); - if ((data->flags & LM90_HAVE_BROKEN_ALERT) && + hwmon_lock(data->hwmon_dev); + if (!data->shutdown && (data->flags & LM90_HAVE_BROKEN_ALERT) && (data->current_alarms & data->alert_alarms)) { if (!(data->config & 0x80)) { dev_dbg(&client->dev, "Disabling ALERT#\n"); @@ -2939,6 +2956,7 @@ static void lm90_alert(struct i2c_client *client, enum i2c_alert_protocol type, schedule_delayed_work(&data->alert_work, max_t(int, HZ, msecs_to_jiffies(data->update_interval))); } + hwmon_unlock(data->hwmon_dev); } else { dev_dbg(&client->dev, "Everything OK\n"); } diff --git a/drivers/hwmon/pmbus/adm1266.c b/drivers/hwmon/pmbus/adm1266.c index d90f8f80be8e..9631a64cb1eb 100644 --- a/drivers/hwmon/pmbus/adm1266.c +++ b/drivers/hwmon/pmbus/adm1266.c @@ -46,6 +46,7 @@ #define ADM1266_BLACKBOX_OFFSET 0 #define ADM1266_BLACKBOX_SIZE 64 +#define ADM1266_BLACKBOX_MAX_RECORDS 32 #define ADM1266_PMBUS_BLOCK_MAX 255 @@ -60,7 +61,7 @@ struct adm1266_data { u8 *dev_mem; struct mutex buf_mutex; u8 write_buf[ADM1266_PMBUS_BLOCK_MAX + 1] ____cacheline_aligned; - u8 read_buf[ADM1266_PMBUS_BLOCK_MAX + 1] ____cacheline_aligned; + u8 read_buf[ADM1266_PMBUS_BLOCK_MAX + 2] ____cacheline_aligned; }; static const struct nvmem_cell_info adm1266_nvmem_cells[] = { @@ -175,6 +176,8 @@ static int adm1266_gpio_get(struct gpio_chip *chip, unsigned int offset) ret = i2c_smbus_read_block_data(data->client, pmbus_cmd, read_buf); if (ret < 0) return ret; + if (ret < 2) + return -EIO; pins_status = read_buf[0] + (read_buf[1] << 8); if (offset < ADM1266_GPIO_NR) @@ -195,6 +198,8 @@ static int adm1266_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask ret = i2c_smbus_read_block_data(data->client, ADM1266_GPIO_STATUS, read_buf); if (ret < 0) return ret; + if (ret < 2) + return -EIO; status = read_buf[0] + (read_buf[1] << 8); @@ -207,11 +212,12 @@ static int adm1266_gpio_get_multiple(struct gpio_chip *chip, unsigned long *mask ret = i2c_smbus_read_block_data(data->client, ADM1266_PDIO_STATUS, read_buf); if (ret < 0) return ret; + if (ret < 2) + return -EIO; status = read_buf[0] + (read_buf[1] << 8); - *bits = 0; - for_each_set_bit_from(gpio_nr, mask, ADM1266_GPIO_NR + ADM1266_PDIO_STATUS) { + for_each_set_bit_from(gpio_nr, mask, ADM1266_GPIO_NR + ADM1266_PDIO_NR) { if (test_bit(gpio_nr - ADM1266_GPIO_NR, &status)) set_bit(gpio_nr, bits); } @@ -347,9 +353,10 @@ static void adm1266_init_debugfs(struct adm1266_data *data) static int adm1266_nvmem_read_blackbox(struct adm1266_data *data, u8 *read_buff) { + u8 record[ADM1266_PMBUS_BLOCK_MAX]; int record_count; char index; - u8 buf[5]; + u8 buf[I2C_SMBUS_BLOCK_MAX]; int ret; ret = i2c_smbus_read_block_data(data->client, ADM1266_BLACKBOX_INFO, buf); @@ -360,15 +367,18 @@ static int adm1266_nvmem_read_blackbox(struct adm1266_data *data, u8 *read_buff) return -EIO; record_count = buf[3]; + if (record_count > ADM1266_BLACKBOX_MAX_RECORDS) + return -EIO; for (index = 0; index < record_count; index++) { - ret = adm1266_pmbus_block_xfer(data, ADM1266_READ_BLACKBOX, 1, &index, read_buff); + ret = adm1266_pmbus_block_xfer(data, ADM1266_READ_BLACKBOX, 1, &index, record); if (ret < 0) return ret; if (ret != ADM1266_BLACKBOX_SIZE) return -EIO; + memcpy(read_buff, record, ADM1266_BLACKBOX_SIZE); read_buff += ADM1266_BLACKBOX_SIZE; } @@ -432,7 +442,7 @@ static int adm1266_set_rtc(struct adm1266_data *data) char write_buf[6]; int i; - kt = ktime_get_seconds(); + kt = ktime_get_real_seconds(); memset(write_buf, 0, sizeof(write_buf)); @@ -462,20 +472,20 @@ static int adm1266_probe(struct i2c_client *client) crc8_populate_msb(pmbus_crc_table, 0x7); mutex_init(&data->buf_mutex); - ret = adm1266_config_gpio(data); + ret = adm1266_set_rtc(data); if (ret < 0) return ret; - ret = adm1266_set_rtc(data); - if (ret < 0) + ret = pmbus_do_probe(client, &data->info); + if (ret) return ret; ret = adm1266_config_nvmem(data); if (ret < 0) return ret; - ret = pmbus_do_probe(client, &data->info); - if (ret) + ret = adm1266_config_gpio(data); + if (ret < 0) return ret; adm1266_init_debugfs(data); diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c index 4eaeb395d5db..b8d7a406ef04 100644 --- a/drivers/i2c/busses/i2c-tegra.c +++ b/drivers/i2c/busses/i2c-tegra.c @@ -1522,8 +1522,10 @@ static int tegra_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], } ret = tegra_i2c_mutex_lock(i2c_dev); - if (ret) + if (ret) { + pm_runtime_put(i2c_dev->dev); return ret; + } for (i = 0; i < num; i++) { enum msg_end_type end_type = MSG_END_STOP; diff --git a/drivers/infiniband/hw/mana/main.c b/drivers/infiniband/hw/mana/main.c index 8d99cd00f002..d913f885b3ef 100644 --- a/drivers/infiniband/hw/mana/main.c +++ b/drivers/infiniband/hw/mana/main.c @@ -639,6 +639,7 @@ int mana_ib_query_port(struct ib_device *ibdev, u32 port, if (mana_ib_is_rnic(dev)) { props->gid_tbl_len = 16; props->ip_gids = true; + props->max_msg_sz = SZ_16M; if (port == 1) props->port_cap_flags = IB_PORT_CM_SUP; } diff --git a/drivers/infiniband/sw/siw/siw_qp_rx.c b/drivers/infiniband/sw/siw/siw_qp_rx.c index e8a88b378d51..34d03584160c 100644 --- a/drivers/infiniband/sw/siw/siw_qp_rx.c +++ b/drivers/infiniband/sw/siw/siw_qp_rx.c @@ -1082,6 +1082,21 @@ static int siw_get_hdr(struct siw_rx_stream *srx) } /* + * Peer-controlled mpa_len must not underflow srx->fpdu_part_rem + * in siw_tcp_rx_data(); a negative value flows as a signed copy + * length into siw_check_mem() and skb_copy_bits(). + */ + if (unlikely(be16_to_cpu(c_hdr->mpa_len) + MPA_HDR_SIZE < + iwarp_pktinfo[opcode].hdr_len)) { + pr_warn_ratelimited("siw: short mpa_len %u for opcode %u (hdr_len %u)\n", + be16_to_cpu(c_hdr->mpa_len), opcode, + iwarp_pktinfo[opcode].hdr_len); + siw_init_terminate(rx_qp(srx), TERM_ERROR_LAYER_LLP, + LLP_ETYPE_MPA, LLP_ECODE_FPDU_START, 0); + return -EINVAL; + } + + /* * DDP/RDMAP header receive completed. Check if the current * DDP segment starts a new RDMAP message or continues a previously * started RDMAP message. diff --git a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c index 51727c7d710c..9dd9141c86a5 100644 --- a/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c +++ b/drivers/infiniband/ulp/rtrs/rtrs-srv-sysfs.c @@ -295,8 +295,8 @@ remove_group: put_kobj: kobject_del(&srv_path->kobj); destroy_root: - kobject_put(&srv_path->kobj); rtrs_srv_destroy_once_sysfs_root_folders(srv_path); + kobject_put(&srv_path->kobj); return err; } diff --git a/drivers/iommu/amd/debugfs.c b/drivers/iommu/amd/debugfs.c index 20b04996441d..3909a1fb218e 100644 --- a/drivers/iommu/amd/debugfs.c +++ b/drivers/iommu/amd/debugfs.c @@ -26,22 +26,20 @@ static ssize_t iommu_mmio_write(struct file *filp, const char __user *ubuf, { struct seq_file *m = filp->private_data; struct amd_iommu *iommu = m->private; - int ret; - - iommu->dbg_mmio_offset = -1; + int ret, dbg_mmio_offset = iommu->dbg_mmio_offset = -1; if (cnt > OFS_IN_SZ) return -EINVAL; - ret = kstrtou32_from_user(ubuf, cnt, 0, &iommu->dbg_mmio_offset); + ret = kstrtos32_from_user(ubuf, cnt, 0, &dbg_mmio_offset); if (ret) return ret; - if (iommu->dbg_mmio_offset > iommu->mmio_phys_end - sizeof(u64)) { - iommu->dbg_mmio_offset = -1; - return -EINVAL; - } + if (dbg_mmio_offset < 0 || dbg_mmio_offset > + iommu->mmio_phys_end - sizeof(u64)) + return -EINVAL; + iommu->dbg_mmio_offset = dbg_mmio_offset; return cnt; } @@ -49,14 +47,16 @@ static int iommu_mmio_show(struct seq_file *m, void *unused) { struct amd_iommu *iommu = m->private; u64 value; + int dbg_mmio_offset = iommu->dbg_mmio_offset; - if (iommu->dbg_mmio_offset < 0) { + if (dbg_mmio_offset < 0 || dbg_mmio_offset > + iommu->mmio_phys_end - sizeof(u64)) { seq_puts(m, "Please provide mmio register's offset\n"); return 0; } - value = readq(iommu->mmio_base + iommu->dbg_mmio_offset); - seq_printf(m, "Offset:0x%x Value:0x%016llx\n", iommu->dbg_mmio_offset, value); + value = readq(iommu->mmio_base + dbg_mmio_offset); + seq_printf(m, "Offset:0x%x Value:0x%016llx\n", dbg_mmio_offset, value); return 0; } @@ -67,23 +67,20 @@ static ssize_t iommu_capability_write(struct file *filp, const char __user *ubuf { struct seq_file *m = filp->private_data; struct amd_iommu *iommu = m->private; - int ret; - - iommu->dbg_cap_offset = -1; + int ret, dbg_cap_offset = iommu->dbg_cap_offset = -1; if (cnt > OFS_IN_SZ) return -EINVAL; - ret = kstrtou32_from_user(ubuf, cnt, 0, &iommu->dbg_cap_offset); + ret = kstrtos32_from_user(ubuf, cnt, 0, &dbg_cap_offset); if (ret) return ret; /* Capability register at offset 0x14 is the last IOMMU capability register. */ - if (iommu->dbg_cap_offset > 0x14) { - iommu->dbg_cap_offset = -1; + if (dbg_cap_offset < 0 || dbg_cap_offset > 0x14) return -EINVAL; - } + iommu->dbg_cap_offset = dbg_cap_offset; return cnt; } @@ -91,21 +88,21 @@ static int iommu_capability_show(struct seq_file *m, void *unused) { struct amd_iommu *iommu = m->private; u32 value; - int err; + int err, dbg_cap_offset = iommu->dbg_cap_offset; - if (iommu->dbg_cap_offset < 0) { + if (dbg_cap_offset < 0 || dbg_cap_offset > 0x14) { seq_puts(m, "Please provide capability register's offset in the range [0x00 - 0x14]\n"); return 0; } - err = pci_read_config_dword(iommu->dev, iommu->cap_ptr + iommu->dbg_cap_offset, &value); + err = pci_read_config_dword(iommu->dev, iommu->cap_ptr + dbg_cap_offset, &value); if (err) { seq_printf(m, "Not able to read capability register at 0x%x\n", - iommu->dbg_cap_offset); + dbg_cap_offset); return 0; } - seq_printf(m, "Offset:0x%x Value:0x%08x\n", iommu->dbg_cap_offset, value); + seq_printf(m, "Offset:0x%x Value:0x%08x\n", dbg_cap_offset, value); return 0; } diff --git a/drivers/iommu/generic_pt/iommu_pt.h b/drivers/iommu/generic_pt/iommu_pt.h index 7e7a6e7abdee..55faad4b9dc7 100644 --- a/drivers/iommu/generic_pt/iommu_pt.h +++ b/drivers/iommu/generic_pt/iommu_pt.h @@ -466,6 +466,7 @@ struct pt_iommu_map_args { pt_oaddr_t oa; unsigned int leaf_pgsize_lg2; unsigned int leaf_level; + pt_vaddr_t num_leaves; }; /* @@ -518,11 +519,17 @@ static int clear_contig(const struct pt_state *start_pts, static int __map_range_leaf(struct pt_range *range, void *arg, unsigned int level, struct pt_table_p *table) { + struct pt_iommu *iommu_table = iommu_from_common(range->common); struct pt_state pts = pt_init(range, level, table); struct pt_iommu_map_args *map = arg; unsigned int leaf_pgsize_lg2 = map->leaf_pgsize_lg2; + unsigned int leaves_avail; unsigned int start_index; pt_oaddr_t oa = map->oa; + pt_vaddr_t num_leaves; + unsigned int orig_end; + unsigned int step_lg2; + pt_vaddr_t last_va; unsigned int step; bool need_contig; int ret = 0; @@ -530,12 +537,25 @@ static int __map_range_leaf(struct pt_range *range, void *arg, PT_WARN_ON(map->leaf_level != level); PT_WARN_ON(!pt_can_have_leaf(&pts)); - step = log2_to_int_t(unsigned int, - leaf_pgsize_lg2 - pt_table_item_lg2sz(&pts)); - need_contig = leaf_pgsize_lg2 != pt_table_item_lg2sz(&pts); + step_lg2 = leaf_pgsize_lg2 - pt_table_item_lg2sz(&pts); + step = log2_to_int_t(unsigned int, step_lg2); + need_contig = step_lg2 != 0; _pt_iter_first(&pts); start_index = pts.index; + orig_end = pts.end_index; + leaves_avail = + log2_div_t(unsigned int, pts.end_index - pts.index, step_lg2); + if (map->num_leaves <= leaves_avail) { + /* Need to stop in the middle of the table to change sizes */ + pts.end_index = pts.index + log2_mul(map->num_leaves, step_lg2); + num_leaves = 0; + } else { + num_leaves = map->num_leaves - leaves_avail; + } + + PT_WARN_ON( + log2_mod_t(unsigned int, pts.end_index - pts.index, step_lg2)); do { pts.type = pt_load_entry_raw(&pts); if (pts.type != PT_ENTRY_EMPTY || need_contig) { @@ -561,7 +581,40 @@ static int __map_range_leaf(struct pt_range *range, void *arg, flush_writes_range(&pts, start_index, pts.index); map->oa = oa; - return ret; + map->num_leaves = num_leaves; + if (ret || num_leaves) + return ret; + + /* range->va is not valid if we reached the end of the table */ + pts.index -= step; + pt_index_to_va(&pts); + pts.index += step; + last_va = range->va + log2_to_int(leaf_pgsize_lg2); + + if (last_va - 1 == range->last_va) { + PT_WARN_ON(pts.index != orig_end); + return 0; + } + + /* + * Reached a point where the page size changed, compute the new + * parameters. + */ + map->leaf_pgsize_lg2 = pt_compute_best_pgsize( + iommu_table->domain.pgsize_bitmap, last_va, range->last_va, oa); + map->leaf_level = + pt_pgsz_lg2_to_level(range->common, map->leaf_pgsize_lg2); + map->num_leaves = pt_pgsz_count(iommu_table->domain.pgsize_bitmap, + last_va, range->last_va, oa, + map->leaf_pgsize_lg2); + + /* Didn't finish this table level, caller will repeat it */ + if (pts.index != orig_end) { + if (pts.index != start_index) + pt_index_to_va(&pts); + return -EAGAIN; + } + return 0; } static int __map_range(struct pt_range *range, void *arg, unsigned int level, @@ -584,14 +637,9 @@ static int __map_range(struct pt_range *range, void *arg, unsigned int level, if (pts.type != PT_ENTRY_EMPTY) return -EADDRINUSE; ret = pt_iommu_new_table(&pts, &map->attrs); - if (ret) { - /* - * Racing with another thread installing a table - */ - if (ret == -EAGAIN) - continue; + /* EAGAIN on a race will loop again */ + if (ret) return ret; - } } else { pts.table_lower = pt_table_ptr(&pts); /* @@ -615,10 +663,12 @@ static int __map_range(struct pt_range *range, void *arg, unsigned int level, * The already present table can possibly be shared with another * concurrent map. */ - if (map->leaf_level == level - 1) - ret = pt_descend(&pts, arg, __map_range_leaf); - else - ret = pt_descend(&pts, arg, __map_range); + do { + if (map->leaf_level == level - 1) + ret = pt_descend(&pts, arg, __map_range_leaf); + else + ret = pt_descend(&pts, arg, __map_range); + } while (ret == -EAGAIN); if (ret) return ret; @@ -626,6 +676,14 @@ static int __map_range(struct pt_range *range, void *arg, unsigned int level, pt_index_to_va(&pts); if (pts.index >= pts.end_index) break; + + /* + * This level is currently running __map_range_leaf() which is + * not correct if the target level has been updated to this + * level. Have the caller invoke __map_range_leaf. + */ + if (map->leaf_level == level) + return -EAGAIN; } while (true); return 0; } @@ -797,12 +855,13 @@ static int check_map_range(struct pt_iommu *iommu_table, struct pt_range *range, static int do_map(struct pt_range *range, struct pt_common *common, bool single_page, struct pt_iommu_map_args *map) { + int ret; + /* * The __map_single_page() fast path does not support DMA_INCOHERENT * flushing to keep its .text small. */ if (single_page && !pt_feature(common, PT_FEAT_DMA_INCOHERENT)) { - int ret; ret = pt_walk_range(range, __map_single_page, map); if (ret != -EAGAIN) @@ -810,50 +869,25 @@ static int do_map(struct pt_range *range, struct pt_common *common, /* EAGAIN falls through to the full path */ } - if (map->leaf_level == range->top_level) - return pt_walk_range(range, __map_range_leaf, map); - return pt_walk_range(range, __map_range, map); + do { + if (map->leaf_level == range->top_level) + ret = pt_walk_range(range, __map_range_leaf, map); + else + ret = pt_walk_range(range, __map_range, map); + } while (ret == -EAGAIN); + return ret; } -/** - * map_pages() - Install translation for an IOVA range - * @domain: Domain to manipulate - * @iova: IO virtual address to start - * @paddr: Physical/Output address to start - * @pgsize: Length of each page - * @pgcount: Length of the range in pgsize units starting from @iova - * @prot: A bitmap of IOMMU_READ/WRITE/CACHE/NOEXEC/MMIO - * @gfp: GFP flags for any memory allocations - * @mapped: Total bytes successfully mapped - * - * The range starting at IOVA will have paddr installed into it. The caller - * must specify a valid pgsize and pgcount to segment the range into compatible - * blocks. - * - * On error the caller will probably want to invoke unmap on the range from iova - * up to the amount indicated by @mapped to return the table back to an - * unchanged state. - * - * Context: The caller must hold a write range lock that includes the whole - * range. - * - * Returns: -ERRNO on failure, 0 on success. The number of bytes of VA that were - * mapped are added to @mapped, @mapped is not zerod first. - */ -int DOMAIN_NS(map_pages)(struct iommu_domain *domain, unsigned long iova, - phys_addr_t paddr, size_t pgsize, size_t pgcount, - int prot, gfp_t gfp, size_t *mapped) +static int NS(map_range)(struct pt_iommu *iommu_table, dma_addr_t iova, + phys_addr_t paddr, dma_addr_t len, unsigned int prot, + gfp_t gfp, size_t *mapped) { - struct pt_iommu *iommu_table = - container_of(domain, struct pt_iommu, domain); pt_vaddr_t pgsize_bitmap = iommu_table->domain.pgsize_bitmap; struct pt_common *common = common_from_iommu(iommu_table); struct iommu_iotlb_gather iotlb_gather; - pt_vaddr_t len = pgsize * pgcount; struct pt_iommu_map_args map = { .iotlb_gather = &iotlb_gather, .oa = paddr, - .leaf_pgsize_lg2 = vaffs(pgsize), }; bool single_page = false; struct pt_range range; @@ -881,13 +915,13 @@ int DOMAIN_NS(map_pages)(struct iommu_domain *domain, unsigned long iova, return ret; /* Calculate target page size and level for the leaves */ - if (pt_has_system_page_size(common) && pgsize == PAGE_SIZE && - pgcount == 1) { - PT_WARN_ON(!(pgsize_bitmap & PAGE_SIZE)); + if (pt_has_system_page_size(common) && len == PAGE_SIZE && + likely(pgsize_bitmap & PAGE_SIZE)) { if (log2_mod(iova | paddr, PAGE_SHIFT)) return -ENXIO; map.leaf_pgsize_lg2 = PAGE_SHIFT; map.leaf_level = 0; + map.num_leaves = 1; single_page = true; } else { map.leaf_pgsize_lg2 = pt_compute_best_pgsize( @@ -896,6 +930,9 @@ int DOMAIN_NS(map_pages)(struct iommu_domain *domain, unsigned long iova, return -ENXIO; map.leaf_level = pt_pgsz_lg2_to_level(common, map.leaf_pgsize_lg2); + map.num_leaves = pt_pgsz_count(pgsize_bitmap, range.va, + range.last_va, paddr, + map.leaf_pgsize_lg2); } ret = check_map_range(iommu_table, &range, &map); @@ -918,7 +955,6 @@ int DOMAIN_NS(map_pages)(struct iommu_domain *domain, unsigned long iova, *mapped += map.oa - paddr; return ret; } -EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(map_pages), "GENERIC_PT_IOMMU"); struct pt_unmap_args { struct iommu_pages_list free_list; @@ -1020,34 +1056,12 @@ start_oa: return ret; } -/** - * unmap_pages() - Make a range of IOVA empty/not present - * @domain: Domain to manipulate - * @iova: IO virtual address to start - * @pgsize: Length of each page - * @pgcount: Length of the range in pgsize units starting from @iova - * @iotlb_gather: Gather struct that must be flushed on return - * - * unmap_pages() will remove a translation created by map_pages(). It cannot - * subdivide a mapping created by map_pages(), so it should be called with IOVA - * ranges that match those passed to map_pages(). The IOVA range can aggregate - * contiguous map_pages() calls so long as no individual range is split. - * - * Context: The caller must hold a write range lock that includes - * the whole range. - * - * Returns: Number of bytes of VA unmapped. iova + res will be the point - * unmapping stopped. - */ -size_t DOMAIN_NS(unmap_pages)(struct iommu_domain *domain, unsigned long iova, - size_t pgsize, size_t pgcount, +static size_t NS(unmap_range)(struct pt_iommu *iommu_table, dma_addr_t iova, + dma_addr_t len, struct iommu_iotlb_gather *iotlb_gather) { - struct pt_iommu *iommu_table = - container_of(domain, struct pt_iommu, domain); struct pt_unmap_args unmap = { .free_list = IOMMU_PAGES_LIST_INIT( unmap.free_list) }; - pt_vaddr_t len = pgsize * pgcount; struct pt_range range; int ret; @@ -1062,7 +1076,6 @@ size_t DOMAIN_NS(unmap_pages)(struct iommu_domain *domain, unsigned long iova, return unmap.unmapped; } -EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(unmap_pages), "GENERIC_PT_IOMMU"); static void NS(get_info)(struct pt_iommu *iommu_table, struct pt_iommu_info *info) @@ -1110,6 +1123,8 @@ static void NS(deinit)(struct pt_iommu *iommu_table) } static const struct pt_iommu_ops NS(ops) = { + .map_range = NS(map_range), + .unmap_range = NS(unmap_range), #if IS_ENABLED(CONFIG_IOMMUFD_DRIVER) && defined(pt_entry_is_write_dirty) && \ IS_ENABLED(CONFIG_IOMMUFD_TEST) && defined(pt_entry_make_write_dirty) .set_dirty = NS(set_dirty), @@ -1172,6 +1187,7 @@ static int pt_iommu_init_domain(struct pt_iommu *iommu_table, domain->type = __IOMMU_DOMAIN_PAGING; domain->pgsize_bitmap = info.pgsize_bitmap; + domain->is_iommupt = true; if (pt_feature(common, PT_FEAT_DYNAMIC_TOP)) range = _pt_top_range(common, diff --git a/drivers/iommu/generic_pt/kunit_generic_pt.h b/drivers/iommu/generic_pt/kunit_generic_pt.h index 68278bf15cfe..374e475f591e 100644 --- a/drivers/iommu/generic_pt/kunit_generic_pt.h +++ b/drivers/iommu/generic_pt/kunit_generic_pt.h @@ -312,6 +312,17 @@ static void test_best_pgsize(struct kunit *test) } } +static void test_pgsz_count(struct kunit *test) +{ + KUNIT_EXPECT_EQ(test, + pt_pgsz_count(SZ_4K, 0, SZ_1G - 1, 0, ilog2(SZ_4K)), + SZ_1G / SZ_4K); + KUNIT_EXPECT_EQ(test, + pt_pgsz_count(SZ_2M | SZ_4K, SZ_4K, SZ_1G - 1, SZ_4K, + ilog2(SZ_4K)), + (SZ_2M - SZ_4K) / SZ_4K); +} + /* * Check that pt_install_table() and pt_table_pa() match */ @@ -770,6 +781,7 @@ static struct kunit_case generic_pt_test_cases[] = { KUNIT_CASE_FMT(test_init), KUNIT_CASE_FMT(test_bitops), KUNIT_CASE_FMT(test_best_pgsize), + KUNIT_CASE_FMT(test_pgsz_count), KUNIT_CASE_FMT(test_table_ptr), KUNIT_CASE_FMT(test_max_va), KUNIT_CASE_FMT(test_table_radix), diff --git a/drivers/iommu/generic_pt/pt_iter.h b/drivers/iommu/generic_pt/pt_iter.h index c0d8617cce29..3e45dbde6b83 100644 --- a/drivers/iommu/generic_pt/pt_iter.h +++ b/drivers/iommu/generic_pt/pt_iter.h @@ -569,6 +569,28 @@ static inline unsigned int pt_compute_best_pgsize(pt_vaddr_t pgsz_bitmap, return pgsz_lg2; } +/* + * Return the number of pgsize_lg2 leaf entries that can be mapped for + * va to oa. This accounts for any requirement to reduce or increase the page + * size across the VA range. + */ +static inline pt_vaddr_t pt_pgsz_count(pt_vaddr_t pgsz_bitmap, pt_vaddr_t va, + pt_vaddr_t last_va, pt_oaddr_t oa, + unsigned int pgsize_lg2) +{ + pt_vaddr_t len = last_va - va + 1; + pt_vaddr_t next_pgsizes = log2_set_mod(pgsz_bitmap, 0, pgsize_lg2 + 1); + + if (next_pgsizes) { + unsigned int next_pgsize_lg2 = vaffs(next_pgsizes); + + if (log2_mod(va ^ oa, next_pgsize_lg2) == 0) + len = min(len, log2_set_mod_max(va, next_pgsize_lg2) - + va + 1); + } + return log2_div(len, pgsize_lg2); +} + #define _PT_MAKE_CALL_LEVEL(fn) \ static __always_inline int fn(struct pt_range *range, void *arg, \ unsigned int level, \ diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index ef08c2c4ec95..93c908170740 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -34,6 +34,7 @@ #include <linux/sched/mm.h> #include <linux/msi.h> #include <uapi/linux/iommufd.h> +#include <linux/generic_pt/iommu.h> #include "dma-iommu.h" #include "iommu-priv.h" @@ -2612,29 +2613,18 @@ out_set_count: return pgsize; } -int iommu_map_nosync(struct iommu_domain *domain, unsigned long iova, - phys_addr_t paddr, size_t size, int prot, gfp_t gfp) +static int __iommu_map_domain_pgtbl(struct iommu_domain *domain, + unsigned long iova, phys_addr_t paddr, + size_t size, int prot, gfp_t gfp, + size_t *mapped) { const struct iommu_domain_ops *ops = domain->ops; - unsigned long orig_iova = iova; unsigned int min_pagesz; - size_t orig_size = size; - phys_addr_t orig_paddr = paddr; int ret = 0; - might_sleep_if(gfpflags_allow_blocking(gfp)); - - if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING))) - return -EINVAL; - - if (WARN_ON(!ops->map_pages || domain->pgsize_bitmap == 0UL)) + if (WARN_ON(!ops->map_pages)) return -ENODEV; - /* Discourage passing strange GFP flags */ - if (WARN_ON_ONCE(gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 | - __GFP_HIGHMEM))) - return -EINVAL; - /* find out the minimum page size supported */ min_pagesz = 1 << __ffs(domain->pgsize_bitmap); @@ -2652,36 +2642,27 @@ int iommu_map_nosync(struct iommu_domain *domain, unsigned long iova, pr_debug("map: iova 0x%lx pa %pa size 0x%zx\n", iova, &paddr, size); while (size) { - size_t pgsize, count, mapped = 0; + size_t pgsize, count, op_mapped = 0; pgsize = iommu_pgsize(domain, iova, paddr, size, &count); pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx count %zu\n", iova, &paddr, pgsize, count); ret = ops->map_pages(domain, iova, paddr, pgsize, count, prot, - gfp, &mapped); + gfp, &op_mapped); /* * Some pages may have been mapped, even if an error occurred, * so we should account for those so they can be unmapped. */ - size -= mapped; - + *mapped += op_mapped; if (ret) - break; + return ret; - iova += mapped; - paddr += mapped; + size -= op_mapped; + iova += op_mapped; + paddr += op_mapped; } - - /* unroll mapping in case something went wrong */ - if (ret) { - iommu_unmap(domain, orig_iova, orig_size - size); - } else { - trace_map(orig_iova, orig_paddr, orig_size); - iommu_debug_map(domain, orig_paddr, orig_size); - } - - return ret; + return 0; } int iommu_sync_map(struct iommu_domain *domain, unsigned long iova, size_t size) @@ -2693,6 +2674,38 @@ int iommu_sync_map(struct iommu_domain *domain, unsigned long iova, size_t size) return ops->iotlb_sync_map(domain, iova, size); } +int iommu_map_nosync(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, size_t size, int prot, gfp_t gfp) +{ + struct pt_iommu *pt = iommupt_from_domain(domain); + size_t mapped = 0; + int ret; + + might_sleep_if(gfpflags_allow_blocking(gfp)); + + /* Discourage passing strange GFP flags or illegal domains */ + if (WARN_ON_ONCE(!(domain->type & __IOMMU_DOMAIN_PAGING) || + !domain->pgsize_bitmap || + (gfp & (__GFP_COMP | __GFP_DMA | __GFP_DMA32 | + __GFP_HIGHMEM)))) + return -EINVAL; + + if (pt) + ret = pt->ops->map_range(pt, iova, paddr, size, prot, gfp, + &mapped); + else + ret = __iommu_map_domain_pgtbl(domain, iova, paddr, size, prot, + gfp, &mapped); + + trace_map(iova, paddr, mapped); + iommu_debug_map(domain, paddr, mapped); + if (ret) { + iommu_unmap(domain, iova, mapped); + return ret; + } + return 0; +} + int iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t paddr, size_t size, int prot, gfp_t gfp) { @@ -2710,19 +2723,15 @@ int iommu_map(struct iommu_domain *domain, unsigned long iova, } EXPORT_SYMBOL_GPL(iommu_map); -static size_t __iommu_unmap(struct iommu_domain *domain, - unsigned long iova, size_t size, - struct iommu_iotlb_gather *iotlb_gather) +static size_t +__iommu_unmap_domain_pgtbl(struct iommu_domain *domain, unsigned long iova, + size_t size, struct iommu_iotlb_gather *iotlb_gather) { const struct iommu_domain_ops *ops = domain->ops; size_t unmapped_page, unmapped = 0; - unsigned long orig_iova = iova; unsigned int min_pagesz; - if (unlikely(!(domain->type & __IOMMU_DOMAIN_PAGING))) - return 0; - - if (WARN_ON(!ops->unmap_pages || domain->pgsize_bitmap == 0UL)) + if (WARN_ON(!ops->unmap_pages)) return 0; /* find out the minimum page size supported */ @@ -2741,8 +2750,6 @@ static size_t __iommu_unmap(struct iommu_domain *domain, pr_debug("unmap this: iova 0x%lx size 0x%zx\n", iova, size); - iommu_debug_unmap_begin(domain, iova, size); - /* * Keep iterating until we either unmap 'size' bytes (or more) * or we hit an area that isn't mapped. @@ -2768,8 +2775,29 @@ static size_t __iommu_unmap(struct iommu_domain *domain, unmapped += unmapped_page; } - trace_unmap(orig_iova, size, unmapped); - iommu_debug_unmap_end(domain, orig_iova, size, unmapped); + return unmapped; +} + +static size_t __iommu_unmap(struct iommu_domain *domain, unsigned long iova, + size_t size, + struct iommu_iotlb_gather *iotlb_gather) +{ + struct pt_iommu *pt = iommupt_from_domain(domain); + size_t unmapped; + + if (WARN_ON_ONCE(!(domain->type & __IOMMU_DOMAIN_PAGING) || + !domain->pgsize_bitmap)) + return 0; + + iommu_debug_unmap_begin(domain, iova, size); + + if (pt) + unmapped = pt->ops->unmap_range(pt, iova, size, iotlb_gather); + else + unmapped = __iommu_unmap_domain_pgtbl(domain, iova, size, + iotlb_gather); + trace_unmap(iova, size, unmapped); + iommu_debug_unmap_end(domain, iova, size, unmapped); return unmapped; } diff --git a/drivers/irqchip/irq-ath79-cpu.c b/drivers/irqchip/irq-ath79-cpu.c index 923e4bba3776..9b7273a7f8ce 100644 --- a/drivers/irqchip/irq-ath79-cpu.c +++ b/drivers/irqchip/irq-ath79-cpu.c @@ -85,10 +85,3 @@ static int __init ar79_cpu_intc_of_init( } IRQCHIP_DECLARE(ar79_cpu_intc, "qca,ar7100-cpu-intc", ar79_cpu_intc_of_init); - -void __init ath79_cpu_irq_init(unsigned irq_wb_chan2, unsigned irq_wb_chan3) -{ - irq_wb_chan[2] = irq_wb_chan2; - irq_wb_chan[3] = irq_wb_chan3; - mips_cpu_irq_init(); -} diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c index b9423389c2ef..cc269d16b75d 100644 --- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -973,12 +973,16 @@ mt7530_set_ageing_time(struct dsa_switch *ds, unsigned int msecs) unsigned int age_count; unsigned int age_unit; - /* Applied timer is (AGE_CNT + 1) * (AGE_UNIT + 1) seconds */ - if (secs < 1 || secs > (AGE_CNT_MAX + 1) * (AGE_UNIT_MAX + 1)) - return -ERANGE; - - /* iterate through all possible age_count to find the closest pair */ - for (tmp_age_count = 0; tmp_age_count <= AGE_CNT_MAX; ++tmp_age_count) { + /* Applied timer is (AGE_CNT + 1) * (AGE_UNIT + 1) seconds. + * The DSA core has already validated the range using + * ds->ageing_time_min and ds->ageing_time_max. + * + * Iterate through all possible age_count values to find the closest + * pair. Start from 1 because the per-entry aging counter is + * initialized to AGE_CNT and a value of 0 means the entry will + * never be aged out. + */ + for (tmp_age_count = 1; tmp_age_count <= AGE_CNT_MAX; ++tmp_age_count) { unsigned int tmp_age_unit = secs / (tmp_age_count + 1) - 1; if (tmp_age_unit <= AGE_UNIT_MAX) { @@ -1246,37 +1250,40 @@ static void mt7530_setup_port5(struct dsa_switch *ds, phy_interface_t interface) static void mt753x_trap_frames(struct mt7530_priv *priv) { - /* Trap 802.1X PAE frames and BPDUs to the CPU port(s) and egress them - * VLAN-untagged. + /* Trap 802.1X PAE frames and BPDUs to the CPU port(s) and egress + * them with the EG_TAG attribute set to disabled (system default) + * so that any VLAN tags in the frame are not modified by the + * switch egress VLAN tag processing. This preserves VLAN tags + * for reception on VLAN sub-interfaces. */ mt7530_rmw(priv, MT753X_BPC, PAE_BPDU_FR | PAE_EG_TAG_MASK | PAE_PORT_FW_MASK | BPDU_EG_TAG_MASK | BPDU_PORT_FW_MASK, - PAE_BPDU_FR | PAE_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | + PAE_BPDU_FR | PAE_EG_TAG(MT7530_VLAN_EG_DISABLED) | PAE_PORT_FW(TO_CPU_FW_CPU_ONLY) | - BPDU_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | + BPDU_EG_TAG(MT7530_VLAN_EG_DISABLED) | TO_CPU_FW_CPU_ONLY); - /* Trap frames with :01 and :02 MAC DAs to the CPU port(s) and egress - * them VLAN-untagged. + /* Trap frames with :01 and :02 MAC DAs to the CPU port(s) and + * egress them with EG_TAG disabled. */ mt7530_rmw(priv, MT753X_RGAC1, R02_BPDU_FR | R02_EG_TAG_MASK | R02_PORT_FW_MASK | R01_BPDU_FR | R01_EG_TAG_MASK | R01_PORT_FW_MASK, - R02_BPDU_FR | R02_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | + R02_BPDU_FR | R02_EG_TAG(MT7530_VLAN_EG_DISABLED) | R02_PORT_FW(TO_CPU_FW_CPU_ONLY) | R01_BPDU_FR | - R01_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | + R01_EG_TAG(MT7530_VLAN_EG_DISABLED) | TO_CPU_FW_CPU_ONLY); - /* Trap frames with :03 and :0E MAC DAs to the CPU port(s) and egress - * them VLAN-untagged. + /* Trap frames with :03 and :0E MAC DAs to the CPU port(s) and + * egress them with EG_TAG disabled. */ mt7530_rmw(priv, MT753X_RGAC2, R0E_BPDU_FR | R0E_EG_TAG_MASK | R0E_PORT_FW_MASK | R03_BPDU_FR | R03_EG_TAG_MASK | R03_PORT_FW_MASK, - R0E_BPDU_FR | R0E_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | + R0E_BPDU_FR | R0E_EG_TAG(MT7530_VLAN_EG_DISABLED) | R0E_PORT_FW(TO_CPU_FW_CPU_ONLY) | R03_BPDU_FR | - R03_EG_TAG(MT7530_VLAN_EG_UNTAGGED) | + R03_EG_TAG(MT7530_VLAN_EG_DISABLED) | TO_CPU_FW_CPU_ONLY); } @@ -2378,6 +2385,8 @@ mt7530_setup(struct dsa_switch *ds) ds->assisted_learning_on_cpu_port = true; ds->mtu_enforcement_ingress = true; + ds->ageing_time_min = 2 * 1000; + ds->ageing_time_max = (AGE_CNT_MAX + 1) * (AGE_UNIT_MAX + 1) * 1000; if (priv->id == ID_MT7530) { regulator_set_voltage(priv->core_pwr, 1000000, 1000000); @@ -2567,6 +2576,8 @@ mt7531_setup_common(struct dsa_switch *ds) ds->assisted_learning_on_cpu_port = true; ds->mtu_enforcement_ingress = true; + ds->ageing_time_min = 2 * 1000; + ds->ageing_time_max = (AGE_CNT_MAX + 1) * (AGE_UNIT_MAX + 1) * 1000; mt753x_trap_frames(priv); diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c index 83882a8953d2..13f743359286 100644 --- a/drivers/net/ethernet/airoha/airoha_eth.c +++ b/drivers/net/ethernet/airoha/airoha_eth.c @@ -1781,11 +1781,8 @@ static int airhoha_set_gdm2_loopback(struct airoha_gdm_port *port) u32 val, pse_port, chan; int src_port; - /* Forward the traffic to the proper GDM port */ - pse_port = port->id == AIROHA_GDM3_IDX ? FE_PSE_PORT_GDM3 - : FE_PSE_PORT_GDM4; airoha_set_gdm_port_fwd_cfg(eth, REG_GDM_FWD_CFG(AIROHA_GDM2_IDX), - pse_port); + FE_PSE_PORT_DROP); airoha_fe_clear(eth, REG_GDM_FWD_CFG(AIROHA_GDM2_IDX), GDM_STRIP_CRC_MASK); @@ -1803,6 +1800,11 @@ static int airhoha_set_gdm2_loopback(struct airoha_gdm_port *port) GDM_SHORT_LEN_MASK | GDM_LONG_LEN_MASK, FIELD_PREP(GDM_SHORT_LEN_MASK, 60) | FIELD_PREP(GDM_LONG_LEN_MASK, AIROHA_MAX_MTU)); + /* Forward the traffic to the proper GDM port */ + pse_port = port->id == AIROHA_GDM3_IDX ? FE_PSE_PORT_GDM3 + : FE_PSE_PORT_GDM4; + airoha_set_gdm_port_fwd_cfg(eth, REG_GDM_FWD_CFG(AIROHA_GDM2_IDX), + pse_port); /* Disable VIP and IFC for GDM2 */ airoha_fe_clear(eth, REG_FE_VIP_PORT_EN, BIT(AIROHA_GDM2_IDX)); diff --git a/drivers/net/ethernet/amd/pds_core/debugfs.c b/drivers/net/ethernet/amd/pds_core/debugfs.c index 04c5e3abd8d7..810a0cd9bcac 100644 --- a/drivers/net/ethernet/amd/pds_core/debugfs.c +++ b/drivers/net/ethernet/amd/pds_core/debugfs.c @@ -64,9 +64,14 @@ DEFINE_SHOW_ATTRIBUTE(identity); void pdsc_debugfs_add_ident(struct pdsc *pdsc) { + struct dentry *dentry; + /* This file will already exist in the reset flow */ - if (debugfs_lookup("identity", pdsc->dentry)) + dentry = debugfs_lookup("identity", pdsc->dentry); + if (!IS_ERR_OR_NULL(dentry)) { + dput(dentry); return; + } debugfs_create_file("identity", 0400, pdsc->dentry, pdsc, &identity_fops); diff --git a/drivers/net/ethernet/amd/pds_core/dev.c b/drivers/net/ethernet/amd/pds_core/dev.c index 2e1d0d01d03a..bded6b33289c 100644 --- a/drivers/net/ethernet/amd/pds_core/dev.c +++ b/drivers/net/ethernet/amd/pds_core/dev.c @@ -162,12 +162,19 @@ static int pdsc_devcmd_wait(struct pdsc *pdsc, u8 opcode, int max_seconds) dev_dbg(dev, "DEVCMD %d %s after %ld secs\n", opcode, pdsc_devcmd_str(opcode), duration / HZ); - if ((!done || timeout) && running) { + if (!running) { + dev_err(dev, "DEVCMD %d %s fw not running\n", + opcode, pdsc_devcmd_str(opcode)); + pdsc_devcmd_clean(pdsc); + return -ENXIO; + } + + if (!done || timeout) { dev_err(dev, "DEVCMD %d %s timeout, done %d timeout %d max_seconds=%d\n", opcode, pdsc_devcmd_str(opcode), done, timeout, max_seconds); - err = -ETIMEDOUT; pdsc_devcmd_clean(pdsc); + return -ETIMEDOUT; } status = pdsc_devcmd_status(pdsc); diff --git a/drivers/net/ethernet/amd/pds_core/devlink.c b/drivers/net/ethernet/amd/pds_core/devlink.c index b576be626a29..3f0e56b951bf 100644 --- a/drivers/net/ethernet/amd/pds_core/devlink.c +++ b/drivers/net/ethernet/amd/pds_core/devlink.c @@ -122,12 +122,14 @@ int pdsc_dl_info_get(struct devlink *dl, struct devlink_info_req *req, listlen = min(fw_list.num_fw_slots, ARRAY_SIZE(fw_list.fw_names)); for (i = 0; i < listlen; i++) { + char *fw_ver = fw_list.fw_names[i].fw_version; + if (i < ARRAY_SIZE(fw_slotnames)) strscpy(buf, fw_slotnames[i], sizeof(buf)); else snprintf(buf, sizeof(buf), "fw.slot_%d", i); - err = devlink_info_version_stored_put(req, buf, - fw_list.fw_names[i].fw_version); + fw_ver[sizeof(fw_list.fw_names[i].fw_version) - 1] = '\0'; + err = devlink_info_version_stored_put(req, buf, fw_ver); if (err) return err; } diff --git a/drivers/net/ethernet/atheros/ag71xx.c b/drivers/net/ethernet/atheros/ag71xx.c index a5ab99474179..4e4794c4dfdc 100644 --- a/drivers/net/ethernet/atheros/ag71xx.c +++ b/drivers/net/ethernet/atheros/ag71xx.c @@ -1856,6 +1856,9 @@ static int ag71xx_probe(struct platform_device *pdev) ag71xx_int_disable(ag, AG71XX_INT_POLL); ndev->irq = platform_get_irq(pdev, 0); + if (ndev->irq < 0) + return ndev->irq; + err = devm_request_irq(&pdev->dev, ndev->irq, ag71xx_interrupt, 0x0, dev_name(&pdev->dev), ndev); if (err) { diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index 54f71b1e85fc..7c11cf916762 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -1368,13 +1368,12 @@ void bcmgenet_eee_enable_set(struct net_device *dev, bool enable) reg &= ~(TBUF_EEE_EN | TBUF_PM_EN); bcmgenet_writel(reg, priv->base + off); - /* Do the same for thing for RBUF */ + /* RBUF EEE/PM can break the RX path on GENET. Keep it disabled. */ reg = bcmgenet_rbuf_readl(priv, RBUF_ENERGY_CTRL); - if (enable) - reg |= RBUF_EEE_EN | RBUF_PM_EN; - else + if (reg & (RBUF_EEE_EN | RBUF_PM_EN)) { reg &= ~(RBUF_EEE_EN | RBUF_PM_EN); - bcmgenet_rbuf_writel(priv, reg, RBUF_ENERGY_CTRL); + bcmgenet_rbuf_writel(priv, reg, RBUF_ENERGY_CTRL); + } if (!enable && priv->clk_eee_enabled) { clk_disable_unprepare(priv->clk_eee); diff --git a/drivers/net/ethernet/cirrus/cs89x0.c b/drivers/net/ethernet/cirrus/cs89x0.c index fa5857923db4..b4bfd6c174e7 100644 --- a/drivers/net/ethernet/cirrus/cs89x0.c +++ b/drivers/net/ethernet/cirrus/cs89x0.c @@ -1271,7 +1271,6 @@ static const struct net_device_ops net_ops = { static void __init reset_chip(struct net_device *dev) { -#if !defined(CONFIG_MACH_MX31ADS) struct net_local *lp = netdev_priv(dev); unsigned long reset_start_time; @@ -1298,7 +1297,6 @@ static void __init reset_chip(struct net_device *dev) while ((readreg(dev, PP_SelfST) & INIT_DONE) == 0 && time_before(jiffies, reset_start_time + 2)) ; -#endif /* !CONFIG_MACH_MX31ADS */ } /* This is the real probe routine. diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index 4824232f4890..ccd14a386e3b 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -122,6 +122,9 @@ struct gemini_ethernet_port { struct napi_struct napi; struct hrtimer rx_coalesce_timer; unsigned int rx_coalesce_nsecs; + struct sk_buff *rx_skb; + unsigned int rx_frag_nr; + unsigned int freeq_refill; struct gmac_txq txq[TX_QUEUE_NUM]; unsigned int txq_order; @@ -1442,10 +1445,11 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) unsigned short m = (1 << port->rxq_order) - 1; struct gemini_ethernet *geth = port->geth; void __iomem *ptr_reg = port->rxq_rwptr; + unsigned int frag_nr = port->rx_frag_nr; + struct sk_buff *skb = port->rx_skb; unsigned int frame_len, frag_len; struct gmac_rxdesc *rx = NULL; struct gmac_queue_page *gpage; - static struct sk_buff *skb; union gmac_rxdesc_0 word0; union gmac_rxdesc_1 word1; union gmac_rxdesc_3 word3; @@ -1455,7 +1459,6 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) unsigned short r, w; union dma_rwptr rw; dma_addr_t mapping; - int frag_nr = 0; spin_lock_irqsave(&geth->irq_lock, flags); rw.bits32 = readl(ptr_reg); @@ -1491,6 +1494,12 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) gpage = gmac_get_queue_page(geth, port, mapping + PAGE_SIZE); if (!gpage) { dev_err(geth->dev, "could not find mapping\n"); + if (skb) { + napi_free_frags(&port->napi); + port->stats.rx_dropped++; + skb = NULL; + frag_nr = 0; + } continue; } page = gpage->page; @@ -1499,6 +1508,8 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) if (skb) { napi_free_frags(&port->napi); port->stats.rx_dropped++; + skb = NULL; + frag_nr = 0; } skb = gmac_skb_if_good_frame(port, word0, frame_len); @@ -1533,6 +1544,7 @@ static unsigned int gmac_rx(struct net_device *netdev, unsigned int budget) if (word3.bits32 & EOF_BIT) { napi_gro_frags(&port->napi); skb = NULL; + frag_nr = 0; --budget; } continue; @@ -1541,6 +1553,7 @@ err_drop: if (skb) { napi_free_frags(&port->napi); skb = NULL; + frag_nr = 0; } if (mapping) @@ -1549,6 +1562,8 @@ err_drop: port->stats.rx_dropped++; } + port->rx_skb = skb; + port->rx_frag_nr = frag_nr; writew(r, ptr_reg); return budget; } @@ -1876,6 +1891,8 @@ static int gmac_stop(struct net_device *netdev) gmac_disable_tx_rx(netdev); gmac_stop_dma(port); napi_disable(&port->napi); + port->rx_skb = NULL; + port->rx_frag_nr = 0; gmac_enable_irq(netdev, 0); gmac_cleanup_rxq(netdev); diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c index a12fd54a475f..ed8f7b3dc526 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c +++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c @@ -960,8 +960,10 @@ static int enetc_pf_probe(struct pci_dev *pdev, if (pf->total_vfs) { pf->vf_state = kzalloc_objs(struct enetc_vf_state, pf->total_vfs); - if (!pf->vf_state) + if (!pf->vf_state) { + err = -ENOMEM; goto err_alloc_vf_state; + } } err = enetc_setup_mac_addresses(node, pf); diff --git a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c index 16aa25535152..0bc6dd375687 100644 --- a/drivers/net/ethernet/intel/ice/ice_dcb_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_dcb_lib.c @@ -537,14 +537,14 @@ void ice_dcb_rebuild(struct ice_pf *pf) struct ice_dcbx_cfg *err_cfg; int ret; + mutex_lock(&pf->tc_mutex); + ret = ice_query_port_ets(pf->hw.port_info, &buf, sizeof(buf), NULL); if (ret) { dev_err(dev, "Query Port ETS failed\n"); goto dcb_error; } - mutex_lock(&pf->tc_mutex); - if (!pf->hw.port_info->qos_cfg.is_sw_lldp) ice_cfg_etsrec_defaults(pf->hw.port_info); diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.c b/drivers/net/ethernet/intel/ice/ice_dpll.c index 27b460926bac..892bc7c2e28b 100644 --- a/drivers/net/ethernet/intel/ice/ice_dpll.c +++ b/drivers/net/ethernet/intel/ice/ice_dpll.c @@ -2523,6 +2523,8 @@ ice_dpll_rclk_state_on_pin_set(const struct dpll_pin *pin, void *pin_priv, if (hw_idx < 0) goto unlock; hw_idx -= pf->dplls.base_rclk_idx; + if (hw_idx >= ICE_DPLL_RCLK_NUM_MAX) + goto unlock; if ((enable && p->state[hw_idx] == DPLL_PIN_STATE_CONNECTED) || (!enable && p->state[hw_idx] == DPLL_PIN_STATE_DISCONNECTED)) { @@ -2586,6 +2588,9 @@ ice_dpll_rclk_state_on_pin_get(const struct dpll_pin *pin, void *pin_priv, hw_idx = ice_dpll_pin_get_parent_idx(p, parent_pin); if (hw_idx < 0) goto unlock; + hw_idx -= pf->dplls.base_rclk_idx; + if (hw_idx >= ICE_DPLL_RCLK_NUM_MAX) + goto unlock; ret = ice_dpll_pin_state_update(pf, p, ICE_DPLL_PIN_TYPE_RCLK_INPUT, extack); diff --git a/drivers/net/ethernet/intel/ice/ice_dpll.h b/drivers/net/ethernet/intel/ice/ice_dpll.h index ae42cdea0ee1..8678575359b9 100644 --- a/drivers/net/ethernet/intel/ice/ice_dpll.h +++ b/drivers/net/ethernet/intel/ice/ice_dpll.h @@ -8,6 +8,22 @@ #define ICE_DPLL_RCLK_NUM_MAX 4 +#define ICE_CGU_R10 0x28 +#define ICE_CGU_R10_SYNCE_CLKO_SEL GENMASK(8, 5) +#define ICE_CGU_R10_SYNCE_CLKODIV_M1 GENMASK(13, 9) +#define ICE_CGU_R10_SYNCE_CLKODIV_LOAD BIT(14) +#define ICE_CGU_R10_SYNCE_DCK_RST BIT(15) +#define ICE_CGU_R10_SYNCE_ETHCLKO_SEL GENMASK(18, 16) +#define ICE_CGU_R10_SYNCE_ETHDIV_M1 GENMASK(23, 19) +#define ICE_CGU_R10_SYNCE_ETHDIV_LOAD BIT(24) +#define ICE_CGU_R10_SYNCE_DCK2_RST BIT(25) +#define ICE_CGU_R10_SYNCE_S_REF_CLK GENMASK(31, 27) + +#define ICE_CGU_R11 0x2C +#define ICE_CGU_R11_SYNCE_S_BYP_CLK GENMASK(6, 1) + +#define ICE_CGU_BYPASS_MUX_OFFSET_E825C 3 + /** * enum ice_dpll_pin_sw - enumerate ice software pin indices: * @ICE_DPLL_PIN_SW_1_IDX: index of first SW pin @@ -157,19 +173,3 @@ static inline void ice_dpll_deinit(struct ice_pf *pf) { } #endif #endif - -#define ICE_CGU_R10 0x28 -#define ICE_CGU_R10_SYNCE_CLKO_SEL GENMASK(8, 5) -#define ICE_CGU_R10_SYNCE_CLKODIV_M1 GENMASK(13, 9) -#define ICE_CGU_R10_SYNCE_CLKODIV_LOAD BIT(14) -#define ICE_CGU_R10_SYNCE_DCK_RST BIT(15) -#define ICE_CGU_R10_SYNCE_ETHCLKO_SEL GENMASK(18, 16) -#define ICE_CGU_R10_SYNCE_ETHDIV_M1 GENMASK(23, 19) -#define ICE_CGU_R10_SYNCE_ETHDIV_LOAD BIT(24) -#define ICE_CGU_R10_SYNCE_DCK2_RST BIT(25) -#define ICE_CGU_R10_SYNCE_S_REF_CLK GENMASK(31, 27) - -#define ICE_CGU_R11 0x2C -#define ICE_CGU_R11_SYNCE_S_BYP_CLK GENMASK(6, 1) - -#define ICE_CGU_BYPASS_MUX_OFFSET_E825C 3 diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 055968485af6..b5df8e052467 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -3682,7 +3682,7 @@ int ice_vlan_rx_add_vid(struct net_device *netdev, __be16 proto, u16 vid) ret = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, ICE_MCAST_VLAN_PROMISC_BITS, vid); - if (ret) + if (ret && ret != -EEXIST) goto finish; } @@ -4104,6 +4104,12 @@ int ice_vsi_recfg_qs(struct ice_vsi *vsi, int new_rx, int new_tx, bool locked) } ice_pf_dcb_recfg(pf, locked); ice_vsi_open(vsi); + /* Rx rings are reallocated during VSI rebuild and lose their ptp_rx + * flag. Restore timestamp mode so newly allocated rings are set up + * for hardware Rx timestamping. + */ + if (test_bit(ICE_FLAG_PTP_SUPPORTED, pf->flags)) + ice_ptp_restore_timestamp_mode(pf); goto done; rebuild_err: @@ -8046,7 +8052,7 @@ int ice_set_rss_hfunc(struct ice_vsi *vsi, u8 hfunc) ctx->info.q_opt_rss |= FIELD_PREP(ICE_AQ_VSI_Q_OPT_RSS_HASH_M, hfunc); ctx->info.q_opt_tc = vsi->info.q_opt_tc; - ctx->info.q_opt_flags = vsi->info.q_opt_rss; + ctx->info.q_opt_flags = vsi->info.q_opt_flags; err = ice_update_vsi(hw, vsi->idx, ctx, NULL); if (err) { diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c index 24fb7a3e14d6..2c18e16fe053 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c @@ -2141,16 +2141,23 @@ int ice_start_phy_timer_eth56g(struct ice_hw *hw, u8 port) } incval = (u64)hi << 32 | lo; + if (!ice_ptp_lock(hw)) { + dev_err(ice_hw_to_dev(hw), "Failed to acquire PTP semaphore\n"); + return -EBUSY; + } + err = ice_write_40b_ptp_reg_eth56g(hw, port, PHY_REG_TIMETUS_L, incval); if (err) - return err; + goto err_ptp_unlock; err = ice_ptp_one_port_cmd(hw, port, ICE_PTP_INIT_INCVAL); if (err) - return err; + goto err_ptp_unlock; ice_ptp_exec_tmr_cmd(hw); + ice_ptp_unlock(hw); + err = ice_sync_phy_timer_eth56g(hw, port); if (err) return err; @@ -2166,6 +2173,10 @@ int ice_start_phy_timer_eth56g(struct ice_hw *hw, u8 port) ice_debug(hw, ICE_DBG_PTP, "Enabled clock on PHY port %u\n", port); return 0; + +err_ptp_unlock: + ice_ptp_unlock(hw); + return err; } /** @@ -4503,18 +4514,17 @@ static int ice_read_phy_tstamp_ll_e810(struct ice_hw *hw, u8 idx, u8 *hi, u32 *lo) { struct ice_e810_params *params = &hw->ptp.phy.e810; - unsigned long flags; u32 val; int err; - spin_lock_irqsave(¶ms->atqbal_wq.lock, flags); + spin_lock_irq(¶ms->atqbal_wq.lock); /* Wait for any pending in-progress low latency interrupt */ err = wait_event_interruptible_locked_irq(params->atqbal_wq, !(params->atqbal_flags & ATQBAL_FLAGS_INTR_IN_PROGRESS)); if (err) { - spin_unlock_irqrestore(¶ms->atqbal_wq.lock, flags); + spin_unlock_irq(¶ms->atqbal_wq.lock); return err; } @@ -4529,7 +4539,7 @@ ice_read_phy_tstamp_ll_e810(struct ice_hw *hw, u8 idx, u8 *hi, u32 *lo) REG_LL_PROXY_H); if (err) { ice_debug(hw, ICE_DBG_PTP, "Failed to read PTP timestamp using low latency read\n"); - spin_unlock_irqrestore(¶ms->atqbal_wq.lock, flags); + spin_unlock_irq(¶ms->atqbal_wq.lock); return err; } @@ -4539,7 +4549,7 @@ ice_read_phy_tstamp_ll_e810(struct ice_hw *hw, u8 idx, u8 *hi, u32 *lo) /* Read the low 32 bit value and set the TS valid bit */ *lo = rd32(hw, REG_LL_PROXY_L) | TS_VALID; - spin_unlock_irqrestore(¶ms->atqbal_wq.lock, flags); + spin_unlock_irq(¶ms->atqbal_wq.lock); return 0; } @@ -5254,9 +5264,13 @@ static void ice_ptp_init_phy_e830(struct ice_ptp_hw *ptp) */ bool ice_ptp_lock(struct ice_hw *hw) { + struct ice_pf *pf = container_of(hw, struct ice_pf, hw); u32 hw_lock; int i; + if (!ice_is_primary(hw)) + hw = ice_get_primary_hw(pf); + #define MAX_TRIES 15 for (i = 0; i < MAX_TRIES; i++) { @@ -5283,6 +5297,11 @@ bool ice_ptp_lock(struct ice_hw *hw) */ void ice_ptp_unlock(struct ice_hw *hw) { + struct ice_pf *pf = container_of(hw, struct ice_pf, hw); + + if (!ice_is_primary(hw)) + hw = ice_get_primary_hw(pf); + wr32(hw, PFTSYN_SEM + (PFTSYN_SEM_BYTES * hw->pf_id), 0); } diff --git a/drivers/net/ethernet/intel/ice/virt/queues.c b/drivers/net/ethernet/intel/ice/virt/queues.c index f73d5a3e83d4..31be2f76181c 100644 --- a/drivers/net/ethernet/intel/ice/virt/queues.c +++ b/drivers/net/ethernet/intel/ice/virt/queues.c @@ -840,7 +840,7 @@ int ice_vc_cfg_qs_msg(struct ice_vf *vf, u8 *msg) if (qpi->rxq.databuffer_size != 0 && (qpi->rxq.databuffer_size > ((16 * 1024) - 128) || - qpi->rxq.databuffer_size < 1024)) + qpi->rxq.databuffer_size < 128)) goto error_param; ring->rx_buf_len = qpi->rxq.databuffer_size; diff --git a/drivers/net/ethernet/intel/idpf/idpf_ptp.c b/drivers/net/ethernet/intel/idpf/idpf_ptp.c index eec91c4f0a75..4a51d2727547 100644 --- a/drivers/net/ethernet/intel/idpf/idpf_ptp.c +++ b/drivers/net/ethernet/intel/idpf/idpf_ptp.c @@ -952,6 +952,8 @@ int idpf_ptp_init(struct idpf_adapter *adapter) goto free_ptp; } + spin_lock_init(&adapter->ptp->read_dev_clk_lock); + err = idpf_ptp_create_clock(adapter); if (err) goto free_ptp; @@ -977,8 +979,6 @@ int idpf_ptp_init(struct idpf_adapter *adapter) goto remove_clock; } - spin_lock_init(&adapter->ptp->read_dev_clk_lock); - pci_dbg(adapter->pdev, "PTP init successful\n"); return 0; diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c index 8a110145bfee..52de2bcbadbe 100644 --- a/drivers/net/ethernet/intel/igc/igc_tsn.c +++ b/drivers/net/ethernet/intel/igc/igc_tsn.c @@ -34,6 +34,7 @@ static int igc_fpe_init_smd_frame(struct igc_ring *ring, return -ENOMEM; } + buffer->type = IGC_TX_BUFFER_TYPE_SKB; buffer->skb = skb; buffer->protocol = 0; buffer->bytecount = skb->len; @@ -109,10 +110,16 @@ static int igc_fpe_xmit_smd_frame(struct igc_adapter *adapter, __netif_tx_lock(nq, cpu); err = igc_fpe_init_tx_descriptor(ring, skb, type); - igc_flush_tx_descriptors(ring); + if (err) + goto err_free_skb_any; + igc_flush_tx_descriptors(ring); __netif_tx_unlock(nq); + return 0; +err_free_skb_any: + __netif_tx_unlock(nq); + dev_kfree_skb_any(skb); return err; } diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 42f89a179a3f..4ba3be961ab6 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -1221,6 +1221,7 @@ static int ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector, ether_addr_equal(rx_ring->netdev->dev_addr, eth_hdr(skb)->h_source)) { dev_kfree_skb_irq(skb); + skb = NULL; continue; } diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cgx.c b/drivers/net/ethernet/marvell/octeontx2/af/cgx.c index 6000795823a3..0576e7a95c02 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/cgx.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/cgx.c @@ -1294,13 +1294,18 @@ static inline void link_status_user_format(u64 lstat, struct cgx_link_user_info *linfo, struct cgx *cgx, u8 lmac_id) { + unsigned int speed; + linfo->link_up = FIELD_GET(RESP_LINKSTAT_UP, lstat); linfo->full_duplex = FIELD_GET(RESP_LINKSTAT_FDUPLEX, lstat); - linfo->speed = cgx_speed_mbps[FIELD_GET(RESP_LINKSTAT_SPEED, lstat)]; linfo->an = FIELD_GET(RESP_LINKSTAT_AN, lstat); linfo->fec = FIELD_GET(RESP_LINKSTAT_FEC, lstat); linfo->lmac_type_id = FIELD_GET(RESP_LINKSTAT_LMAC_TYPE, lstat); + speed = FIELD_GET(RESP_LINKSTAT_SPEED, lstat); + linfo->speed = speed < ARRAY_SIZE(cgx_speed_mbps) ? + cgx_speed_mbps[speed] : 0; + if (linfo->lmac_type_id >= LMAC_MODE_MAX) { dev_err(&cgx->pdev->dev, "Unknown lmac_type_id %d reported by firmware on cgx port%d:%d", linfo->lmac_type_id, cgx->cgx_id, lmac_id); diff --git a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c index 8658cb2143df..e28675fe1890 100644 --- a/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c +++ b/drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c @@ -837,7 +837,7 @@ void rvu_npc_install_allmulti_entry(struct rvu *rvu, u16 pcifunc, int nixlf, u16 vf_func; /* Only CGX PF/VF can add allmulticast entry */ - if (is_lbk_vf(rvu, pcifunc) && is_sdp_vf(rvu, pcifunc)) + if (is_lbk_vf(rvu, pcifunc) || is_sdp_vf(rvu, pcifunc)) return; blkaddr = rvu_get_blkaddr(rvu, BLKTYPE_NPC, 0); diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c index a60f8cf53feb..d546d450e7c2 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/cn20k.c @@ -353,11 +353,13 @@ static int cn20k_pool_aq_init(struct otx2_nic *pfvf, u16 pool_id, err = otx2_sync_mbox_msg(&pfvf->mbox); if (err) { qmem_free(pfvf->dev, pool->stack); + pool->stack = NULL; return err; } aq = otx2_mbox_alloc_msg_npa_cn20k_aq_enq(&pfvf->mbox); if (!aq) { qmem_free(pfvf->dev, pool->stack); + pool->stack = NULL; return -ENOMEM; } } diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c index 971fcab1c248..3d253132a17f 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c @@ -1482,11 +1482,13 @@ int otx2_pool_aq_init(struct otx2_nic *pfvf, u16 pool_id, err = otx2_sync_mbox_msg(&pfvf->mbox); if (err) { qmem_free(pfvf->dev, pool->stack); + pool->stack = NULL; return err; } aq = otx2_mbox_alloc_msg_npa_aq_enq(&pfvf->mbox); if (!aq) { qmem_free(pfvf->dev, pool->stack); + pool->stack = NULL; return -ENOMEM; } } diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/rep.c b/drivers/net/ethernet/marvell/octeontx2/nic/rep.c index 94f155ffb17f..0f5d5642d3f7 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/rep.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/rep.c @@ -609,7 +609,7 @@ static int rvu_rep_rsrc_init(struct otx2_nic *priv) err = otx2_init_hw_resources(priv); if (err) - goto err_free_rsrc; + goto err_free_mem; /* Set maximum frame size allowed in HW */ err = otx2_hw_set_mtu(priv, priv->hw.max_mtu); @@ -621,6 +621,7 @@ static int rvu_rep_rsrc_init(struct otx2_nic *priv) err_free_rsrc: otx2_free_hw_resources(priv); +err_free_mem: otx2_free_queue_mem(qset); return err; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c index afdeb1b3d425..8409ae73768f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c @@ -160,13 +160,13 @@ static int mlx5e_tx_reporter_timeout_recover(void *ctx) * channels are being closed for other reason and this work is not * relevant anymore. */ - while (!netdev_trylock(sq->netdev)) { + while (!netdev_trylock(priv->netdev)) { if (!test_bit(MLX5E_STATE_CHANNELS_ACTIVE, &priv->state)) return 0; msleep(20); } - err = mlx5e_health_channel_eq_recover(sq->netdev, eq, sq->cq.ch_stats); + err = mlx5e_health_channel_eq_recover(priv->netdev, eq, sq->cq.ch_stats); if (!err) { to_ctx->status = 0; /* this sq recovered */ goto out; @@ -186,7 +186,7 @@ static int mlx5e_tx_reporter_timeout_recover(void *ctx) "mlx5e_safe_reopen_channels failed recovering from a tx_timeout, err(%d).\n", err); out: - netdev_unlock(sq->netdev); + netdev_unlock(priv->netdev); return err; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c index 64e13747084e..9c1f3d734911 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.c @@ -793,8 +793,10 @@ static int mlx5e_xfrm_add_state(struct net_device *dev, sa_entry->dev = dev; sa_entry->ipsec = ipsec; /* Check if this SA is originated from acquire flow temporary SA */ - if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) - goto out; + if (x->xso.flags & XFRM_DEV_OFFLOAD_FLAG_ACQ) { + x->xso.offload_handle = (unsigned long)sa_entry; + return 0; + } err = mlx5e_xfrm_validate_state(priv->mdev, x, extack); if (err) @@ -871,7 +873,6 @@ static int mlx5e_xfrm_add_state(struct net_device *dev, xa_unlock_bh(&ipsec->sadb); } -out: x->xso.offload_handle = (unsigned long)sa_entry; if (allow_tunnel_mode) mlx5_eswitch_unblock_encap(priv->mdev); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c index b31f689fe271..e90c6c6df835 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c @@ -252,7 +252,7 @@ int mlx5e_napi_poll(struct napi_struct *napi, int budget) mlx5e_cq_arm(&c->xdpsq->cq); if (unlikely(aff_change && busy_xsk)) { - mlx5e_trigger_irq(&c->icosq); + mlx5e_trigger_napi_async_icosq(c); ch_stats->force_irq++; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c index 3cfe743610d3..ab50d2c734ed 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/ipsec_fs.c @@ -142,7 +142,8 @@ static int mlx5_esw_ipsec_modify_flow_dests(struct mlx5_eswitch *esw, attr = flow->attr; esw_attr = attr->esw_attr; - if (esw_attr->out_count - esw_attr->split_count > 1) + if (!esw_attr->out_count || + esw_attr->out_count - esw_attr->split_count > 1) return 0; err = mlx5_eswitch_restore_ipsec_rule(esw, flow->rule[0], esw_attr, diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 123c96716a54..7c8311f41232 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -908,6 +908,24 @@ static void esw_vport_cleanup(struct mlx5_eswitch *esw, struct mlx5_vport *vport esw_vport_cleanup_acl(esw, vport); } +static void mlx5_esw_vport_set_max_tx_speed(struct mlx5_eswitch *esw, + struct mlx5_vport *vport) +{ + int ret; + + if (!MLX5_CAP_ESW(esw->dev, esw_vport_state_max_tx_speed)) + return; + + ret = mlx5_modify_vport_max_tx_speed(esw->dev, + MLX5_VPORT_STATE_OP_MOD_ESW_VPORT, + vport->vport, true, + vport->agg_max_tx_speed); + if (ret) + mlx5_core_dbg(esw->dev, + "Failed to set vport %d speed %d, err=%d\n", + vport->vport, vport->agg_max_tx_speed, ret); +} + int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, struct mlx5_vport *vport, enum mlx5_eswitch_vport_event enabled_events) { @@ -948,6 +966,9 @@ int mlx5_esw_vport_enable(struct mlx5_eswitch *esw, struct mlx5_vport *vport, esw->enabled_vports++; esw_debug(esw->dev, "Enabled VPORT(%d)\n", vport_num); + + if (vport->agg_max_tx_speed) + mlx5_esw_vport_set_max_tx_speed(esw, vport); done: mutex_unlock(&esw->state_lock); return ret; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index c2563bee74df..29cce1bbce94 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -247,6 +247,7 @@ struct mlx5_vport { enum mlx5_eswitch_vport_event enabled_events; int index; struct mlx5_devlink_port *dl_port; + u32 agg_max_tx_speed; }; struct mlx5_esw_indir_table; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c index 044adfdf9aa2..82e884e70168 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c @@ -1074,6 +1074,11 @@ static void mlx5_lag_modify_device_vports_speed(struct mlx5_core_dev *mdev, if (vport->vport == MLX5_VPORT_UPLINK) continue; + vport->agg_max_tx_speed = speed; + + if (!vport->enabled) + continue; + ret = mlx5_modify_vport_max_tx_speed(mdev, op_mod, vport->vport, true, speed); if (ret) diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c index 47752d3fde0b..1179a6e127c5 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_main.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_main.c @@ -749,11 +749,10 @@ static void lan966x_cleanup_ports(struct lan966x *lan966x) for (p = 0; p < lan966x->num_phys_ports; p++) { port = lan966x->ports[p]; - if (!port) + if (!port || !port->dev) continue; - if (port->dev) - unregister_netdev(port->dev); + unregister_netdev(port->dev); lan966x_xdp_port_deinit(port); if (lan966x->fdma && lan966x->fdma_ndev == port->dev) @@ -873,6 +872,9 @@ static int lan966x_probe_port(struct lan966x *lan966x, u32 p, err = register_netdev(dev); if (err) { dev_err(lan966x->dev, "register_netdev failed\n"); + phylink_destroy(phylink); + port->phylink = NULL; + port->dev = NULL; return err; } diff --git a/drivers/net/ethernet/microsoft/mana/hw_channel.c b/drivers/net/ethernet/microsoft/mana/hw_channel.c index 48a9acea4ab6..dbaeedb6e7b1 100644 --- a/drivers/net/ethernet/microsoft/mana/hw_channel.c +++ b/drivers/net/ethernet/microsoft/mana/hw_channel.c @@ -77,21 +77,19 @@ static int mana_hwc_post_rx_wqe(const struct hwc_wq *hwc_rxq, } static void mana_hwc_handle_resp(struct hw_channel_context *hwc, u32 resp_len, - struct hwc_work_request *rx_req) + struct hwc_work_request *rx_req, u16 msg_id) { const struct gdma_resp_hdr *resp_msg = rx_req->buf_va; struct hwc_caller_ctx *ctx; int err; - if (!test_bit(resp_msg->response.hwc_msg_id, - hwc->inflight_msg_res.map)) { - dev_err(hwc->dev, "hwc_rx: invalid msg_id = %u\n", - resp_msg->response.hwc_msg_id); + if (!test_bit(msg_id, hwc->inflight_msg_res.map)) { + dev_err(hwc->dev, "hwc_rx: invalid msg_id = %u\n", msg_id); mana_hwc_post_rx_wqe(hwc->rxq, rx_req); return; } - ctx = hwc->caller_ctx + resp_msg->response.hwc_msg_id; + ctx = hwc->caller_ctx + msg_id; err = mana_hwc_verify_resp_msg(ctx, resp_msg, resp_len); if (err) goto out; @@ -251,6 +249,7 @@ static void mana_hwc_rx_event_handler(void *ctx, u32 gdma_rxq_id, struct gdma_sge *sge; u64 rq_base_addr; u64 rx_req_idx; + u16 msg_id; u8 *wqe; if (WARN_ON_ONCE(hwc_rxq->gdma_wq->id != gdma_rxq_id)) @@ -266,16 +265,26 @@ static void mana_hwc_rx_event_handler(void *ctx, u32 gdma_rxq_id, rq_base_addr = hwc_rxq->msg_buf->mem_info.dma_handle; rx_req_idx = (sge->address - rq_base_addr) / hwc->max_req_msg_size; + if (rx_req_idx >= hwc_rxq->msg_buf->num_reqs) { + dev_err(hwc->dev, "HWC RX: wrong rx_req_idx=%llu, num_reqs=%u\n", + rx_req_idx, hwc_rxq->msg_buf->num_reqs); + return; + } + rx_req = &hwc_rxq->msg_buf->reqs[rx_req_idx]; resp = (struct gdma_resp_hdr *)rx_req->buf_va; - if (resp->response.hwc_msg_id >= hwc->num_inflight_msg) { - dev_err(hwc->dev, "HWC RX: wrong msg_id=%u\n", - resp->response.hwc_msg_id); + /* Read msg_id once from DMA buffer to prevent TOCTOU: + * DMA memory is shared/unencrypted in CVMs - host can + * modify it between reads. + */ + msg_id = READ_ONCE(resp->response.hwc_msg_id); + if (msg_id >= hwc->num_inflight_msg) { + dev_err(hwc->dev, "HWC RX: wrong msg_id=%u\n", msg_id); return; } - mana_hwc_handle_resp(hwc, rx_oob->tx_oob_data_size, rx_req); + mana_hwc_handle_resp(hwc, rx_oob->tx_oob_data_size, rx_req, msg_id); /* Can no longer use 'resp', because the buffer is posted to the HW * in mana_hwc_handle_resp() above. diff --git a/drivers/net/ethernet/qlogic/qed/qed_cxt.c b/drivers/net/ethernet/qlogic/qed/qed_cxt.c index 9861daa82d9e..b70262e70baf 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_cxt.c +++ b/drivers/net/ethernet/qlogic/qed/qed_cxt.c @@ -1036,11 +1036,13 @@ static void qed_cid_map_free(struct qed_hwfn *p_hwfn) for (type = 0; type < MAX_CONN_TYPES; type++) { bitmap_free(p_mngr->acquired[type].cid_map); + p_mngr->acquired[type].cid_map = NULL; p_mngr->acquired[type].max_count = 0; p_mngr->acquired[type].start_cid = 0; for (vf = 0; vf < MAX_NUM_VFS; vf++) { bitmap_free(p_mngr->acquired_vf[type][vf].cid_map); + p_mngr->acquired_vf[type][vf].cid_map = NULL; p_mngr->acquired_vf[type][vf].max_count = 0; p_mngr->acquired_vf[type][vf].start_cid = 0; } diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-eic7700.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-eic7700.c index bcb8e000e720..4ac979d874d6 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-eic7700.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-eic7700.c @@ -28,13 +28,16 @@ /* * TX/RX Clock Delay Bit Masks: - * - TX Delay: bits [14:8] — TX_CLK delay (unit: 0.1ns per bit) - * - RX Delay: bits [30:24] — RX_CLK delay (unit: 0.1ns per bit) + * - TX Delay: bits [14:8] — TX_CLK delay (unit: 0.02ns per bit) + * - RX Delay: bits [30:24] — RX_CLK delay (unit: 0.02ns per bit) */ #define EIC7700_ETH_TX_ADJ_DELAY GENMASK(14, 8) #define EIC7700_ETH_RX_ADJ_DELAY GENMASK(30, 24) -#define EIC7700_MAX_DELAY_UNIT 0x7F +#define EIC7700_MAX_DELAY_STEPS 0x7F +#define EIC7700_DELAY_STEP_PS 20 +#define EIC7700_MAX_DELAY_PS \ + (EIC7700_MAX_DELAY_STEPS * EIC7700_DELAY_STEP_PS) static const char * const eic7700_clk_names[] = { "tx", "axi", "cfg", @@ -42,6 +45,15 @@ static const char * const eic7700_clk_names[] = { struct eic7700_qos_priv { struct plat_stmmacenet_data *plat_dat; + struct regmap *eic7700_hsp_regmap; + u32 eth_axi_lp_ctrl_offset; + u32 eth_phy_ctrl_offset; + u32 eth_clk_offset; + u32 eth_txd_offset; + u32 eth_rxd_offset; + u32 eth_clk_dly_param; + bool has_txd_offset; + bool has_rxd_offset; }; static int eic7700_clks_config(void *priv, bool enabled) @@ -61,8 +73,34 @@ static int eic7700_clks_config(void *priv, bool enabled) static int eic7700_dwmac_init(struct device *dev, void *priv) { struct eic7700_qos_priv *dwc = priv; + int ret; + + ret = eic7700_clks_config(dwc, true); + if (ret) + return ret; + + ret = regmap_set_bits(dwc->eic7700_hsp_regmap, + dwc->eth_phy_ctrl_offset, + EIC7700_ETH_TX_CLK_SEL | + EIC7700_ETH_PHY_INTF_SELI); + if (ret) { + eic7700_clks_config(dwc, false); + return ret; + } - return eic7700_clks_config(dwc, true); + regmap_write(dwc->eic7700_hsp_regmap, dwc->eth_axi_lp_ctrl_offset, + EIC7700_ETH_CSYSREQ_VAL); + + if (dwc->has_txd_offset) + regmap_write(dwc->eic7700_hsp_regmap, dwc->eth_txd_offset, 0); + + if (dwc->has_rxd_offset) + regmap_write(dwc->eic7700_hsp_regmap, dwc->eth_rxd_offset, 0); + + regmap_write(dwc->eic7700_hsp_regmap, dwc->eth_clk_offset, + dwc->eth_clk_dly_param); + + return 0; } static void eic7700_dwmac_exit(struct device *dev, void *priv) @@ -93,13 +131,7 @@ static int eic7700_dwmac_probe(struct platform_device *pdev) struct plat_stmmacenet_data *plat_dat; struct stmmac_resources stmmac_res; struct eic7700_qos_priv *dwc_priv; - struct regmap *eic7700_hsp_regmap; - u32 eth_axi_lp_ctrl_offset; - u32 eth_phy_ctrl_offset; - u32 eth_phy_ctrl_regset; - u32 eth_rxd_dly_offset; - u32 eth_dly_param = 0; - u32 delay_ps; + u32 delay_ps, val; int i, ret; ret = stmmac_get_platform_resources(pdev, &stmmac_res); @@ -119,10 +151,20 @@ static int eic7700_dwmac_probe(struct platform_device *pdev) /* Read rx-internal-delay-ps and update rx_clk delay */ if (!of_property_read_u32(pdev->dev.of_node, "rx-internal-delay-ps", &delay_ps)) { - u32 val = min(delay_ps / 100, EIC7700_MAX_DELAY_UNIT); + if (delay_ps % EIC7700_DELAY_STEP_PS) + return dev_err_probe(&pdev->dev, -EINVAL, + "rx delay must be multiple of %dps\n", + EIC7700_DELAY_STEP_PS); + + if (delay_ps > EIC7700_MAX_DELAY_PS) + return dev_err_probe(&pdev->dev, -EINVAL, + "rx delay out of range\n"); - eth_dly_param &= ~EIC7700_ETH_RX_ADJ_DELAY; - eth_dly_param |= FIELD_PREP(EIC7700_ETH_RX_ADJ_DELAY, val); + val = delay_ps / EIC7700_DELAY_STEP_PS; + + dwc_priv->eth_clk_dly_param &= ~EIC7700_ETH_RX_ADJ_DELAY; + dwc_priv->eth_clk_dly_param |= + FIELD_PREP(EIC7700_ETH_RX_ADJ_DELAY, val); } else { return dev_err_probe(&pdev->dev, -EINVAL, "missing required property rx-internal-delay-ps\n"); @@ -131,55 +173,65 @@ static int eic7700_dwmac_probe(struct platform_device *pdev) /* Read tx-internal-delay-ps and update tx_clk delay */ if (!of_property_read_u32(pdev->dev.of_node, "tx-internal-delay-ps", &delay_ps)) { - u32 val = min(delay_ps / 100, EIC7700_MAX_DELAY_UNIT); + if (delay_ps % EIC7700_DELAY_STEP_PS) + return dev_err_probe(&pdev->dev, -EINVAL, + "tx delay must be multiple of %dps\n", + EIC7700_DELAY_STEP_PS); + + if (delay_ps > EIC7700_MAX_DELAY_PS) + return dev_err_probe(&pdev->dev, -EINVAL, + "tx delay out of range\n"); - eth_dly_param &= ~EIC7700_ETH_TX_ADJ_DELAY; - eth_dly_param |= FIELD_PREP(EIC7700_ETH_TX_ADJ_DELAY, val); + val = delay_ps / EIC7700_DELAY_STEP_PS; + + dwc_priv->eth_clk_dly_param &= ~EIC7700_ETH_TX_ADJ_DELAY; + dwc_priv->eth_clk_dly_param |= + FIELD_PREP(EIC7700_ETH_TX_ADJ_DELAY, val); } else { return dev_err_probe(&pdev->dev, -EINVAL, "missing required property tx-internal-delay-ps\n"); } - eic7700_hsp_regmap = syscon_regmap_lookup_by_phandle(pdev->dev.of_node, - "eswin,hsp-sp-csr"); - if (IS_ERR(eic7700_hsp_regmap)) + dwc_priv->eic7700_hsp_regmap = + syscon_regmap_lookup_by_phandle(pdev->dev.of_node, + "eswin,hsp-sp-csr"); + if (IS_ERR(dwc_priv->eic7700_hsp_regmap)) return dev_err_probe(&pdev->dev, - PTR_ERR(eic7700_hsp_regmap), + PTR_ERR(dwc_priv->eic7700_hsp_regmap), "Failed to get hsp-sp-csr regmap\n"); ret = of_property_read_u32_index(pdev->dev.of_node, "eswin,hsp-sp-csr", - 1, ð_phy_ctrl_offset); + 1, &dwc_priv->eth_phy_ctrl_offset); if (ret) return dev_err_probe(&pdev->dev, ret, "can't get eth_phy_ctrl_offset\n"); - regmap_read(eic7700_hsp_regmap, eth_phy_ctrl_offset, - ð_phy_ctrl_regset); - eth_phy_ctrl_regset |= - (EIC7700_ETH_TX_CLK_SEL | EIC7700_ETH_PHY_INTF_SELI); - regmap_write(eic7700_hsp_regmap, eth_phy_ctrl_offset, - eth_phy_ctrl_regset); - ret = of_property_read_u32_index(pdev->dev.of_node, "eswin,hsp-sp-csr", - 2, ð_axi_lp_ctrl_offset); + 2, &dwc_priv->eth_axi_lp_ctrl_offset); if (ret) return dev_err_probe(&pdev->dev, ret, "can't get eth_axi_lp_ctrl_offset\n"); - regmap_write(eic7700_hsp_regmap, eth_axi_lp_ctrl_offset, - EIC7700_ETH_CSYSREQ_VAL); - ret = of_property_read_u32_index(pdev->dev.of_node, "eswin,hsp-sp-csr", - 3, ð_rxd_dly_offset); + 3, &dwc_priv->eth_clk_offset); if (ret) return dev_err_probe(&pdev->dev, ret, - "can't get eth_rxd_dly_offset\n"); + "can't get eth_clk_offset\n"); + + ret = of_property_read_u32_index(pdev->dev.of_node, + "eswin,hsp-sp-csr", + 4, &dwc_priv->eth_txd_offset); + if (!ret) + dwc_priv->has_txd_offset = true; - regmap_write(eic7700_hsp_regmap, eth_rxd_dly_offset, - eth_dly_param); + ret = of_property_read_u32_index(pdev->dev.of_node, + "eswin,hsp-sp-csr", + 5, &dwc_priv->eth_rxd_offset); + if (!ret) + dwc_priv->has_rxd_offset = true; plat_dat->num_clks = ARRAY_SIZE(eic7700_clk_names); plat_dat->clks = devm_kcalloc(&pdev->dev, diff --git a/drivers/net/ethernet/ti/icssm/icssm_prueth.c b/drivers/net/ethernet/ti/icssm/icssm_prueth.c index 53bbd9290904..b7e94244355a 100644 --- a/drivers/net/ethernet/ti/icssm/icssm_prueth.c +++ b/drivers/net/ethernet/ti/icssm/icssm_prueth.c @@ -1825,6 +1825,7 @@ static int icssm_prueth_probe(struct platform_device *pdev) dev_err(dev, "%pOF error reading port_id %d\n", eth_node, ret); of_node_put(eth_node); + of_node_put(eth_ports_node); return ret; } diff --git a/drivers/net/ifb.c b/drivers/net/ifb.c index 5407d2ed71b3..43aa1bfd41cf 100644 --- a/drivers/net/ifb.c +++ b/drivers/net/ifb.c @@ -211,12 +211,12 @@ static void ifb_get_strings(struct net_device *dev, u32 stringset, u8 *buf) switch (stringset) { case ETH_SS_STATS: - for (i = 0; i < dev->real_num_rx_queues; i++) + for (i = 0; i < dev->num_tx_queues; i++) for (j = 0; j < IFB_Q_STATS_LEN; j++) ethtool_sprintf(&p, "rx_queue_%u_%.18s", i, ifb_q_stats_desc[j].desc); - for (i = 0; i < dev->real_num_tx_queues; i++) + for (i = 0; i < dev->num_tx_queues; i++) for (j = 0; j < IFB_Q_STATS_LEN; j++) ethtool_sprintf(&p, "tx_queue_%u_%.18s", i, ifb_q_stats_desc[j].desc); @@ -229,8 +229,7 @@ static int ifb_get_sset_count(struct net_device *dev, int sset) { switch (sset) { case ETH_SS_STATS: - return IFB_Q_STATS_LEN * (dev->real_num_rx_queues + - dev->real_num_tx_queues); + return IFB_Q_STATS_LEN * dev->num_tx_queues * 2; default: return -EOPNOTSUPP; } @@ -262,12 +261,12 @@ static void ifb_get_ethtool_stats(struct net_device *dev, struct ifb_q_private *txp; int i; - for (i = 0; i < dev->real_num_rx_queues; i++) { + for (i = 0; i < dev->num_tx_queues; i++) { txp = dp->tx_private + i; ifb_fill_stats_data(&data, &txp->rx_stats); } - for (i = 0; i < dev->real_num_tx_queues; i++) { + for (i = 0; i < dev->num_tx_queues; i++) { txp = dp->tx_private + i; ifb_fill_stats_data(&data, &txp->tx_stats); } diff --git a/drivers/net/ovpn/io.c b/drivers/net/ovpn/io.c index 955c9a37e1f8..c03e58e28a86 100644 --- a/drivers/net/ovpn/io.c +++ b/drivers/net/ovpn/io.c @@ -196,7 +196,7 @@ void ovpn_decrypt_post(void *data, int ret) skb = NULL; drop: if (unlikely(skb)) - dev_dstats_rx_dropped(peer->ovpn->dev); + ovpn_dev_dstats_rx_dropped(peer->ovpn->dev); kfree_skb(skb); drop_nocount: if (likely(peer)) @@ -220,7 +220,7 @@ void ovpn_recv(struct ovpn_peer *peer, struct sk_buff *skb) net_info_ratelimited("%s: no available key for peer %u, key-id: %u\n", netdev_name(peer->ovpn->dev), peer->id, key_id); - dev_dstats_rx_dropped(peer->ovpn->dev); + ovpn_dev_dstats_rx_dropped(peer->ovpn->dev); kfree_skb(skb); ovpn_peer_put(peer); return; @@ -298,7 +298,7 @@ err_unlock: rcu_read_unlock(); err: if (unlikely(skb)) - dev_dstats_tx_dropped(peer->ovpn->dev); + ovpn_dev_dstats_tx_dropped(peer->ovpn->dev); if (likely(peer)) ovpn_peer_put(peer); if (likely(ks)) @@ -340,7 +340,7 @@ static void ovpn_send(struct ovpn_priv *ovpn, struct sk_buff *skb, */ skb_list_walk_safe(skb, curr, next) { if (unlikely(!ovpn_encrypt_one(peer, curr))) { - dev_dstats_tx_dropped(ovpn->dev); + ovpn_dev_dstats_tx_dropped(ovpn->dev); kfree_skb(curr); } } @@ -411,7 +411,7 @@ netdev_tx_t ovpn_net_xmit(struct sk_buff *skb, struct net_device *dev) if (unlikely(!curr)) { net_err_ratelimited("%s: skb_share_check failed for payload packet\n", netdev_name(dev)); - dev_dstats_tx_dropped(ovpn->dev); + ovpn_dev_dstats_tx_dropped(ovpn->dev); continue; } @@ -437,7 +437,7 @@ netdev_tx_t ovpn_net_xmit(struct sk_buff *skb, struct net_device *dev) drop: ovpn_peer_put(peer); drop_no_peer: - dev_dstats_tx_dropped(ovpn->dev); + ovpn_dev_dstats_tx_dropped(ovpn->dev); skb_tx_error(skb); kfree_skb_list(skb); return NETDEV_TX_OK; diff --git a/drivers/net/ovpn/main.c b/drivers/net/ovpn/main.c index 2e0420febda0..9993c1dfe471 100644 --- a/drivers/net/ovpn/main.c +++ b/drivers/net/ovpn/main.c @@ -92,6 +92,8 @@ static void ovpn_net_uninit(struct net_device *dev) { struct ovpn_priv *ovpn = netdev_priv(dev); + disable_delayed_work_sync(&ovpn->keepalive_work); + ovpn_peers_free(ovpn, NULL, OVPN_DEL_PEER_REASON_TEARDOWN); gro_cells_destroy(&ovpn->gro_cells); } @@ -208,15 +210,6 @@ static int ovpn_newlink(struct net_device *dev, return register_netdevice(dev); } -static void ovpn_dellink(struct net_device *dev, struct list_head *head) -{ - struct ovpn_priv *ovpn = netdev_priv(dev); - - cancel_delayed_work_sync(&ovpn->keepalive_work); - ovpn_peers_free(ovpn, NULL, OVPN_DEL_PEER_REASON_TEARDOWN); - unregister_netdevice_queue(dev, head); -} - static int ovpn_fill_info(struct sk_buff *skb, const struct net_device *dev) { struct ovpn_priv *ovpn = netdev_priv(dev); @@ -235,7 +228,6 @@ static struct rtnl_link_ops ovpn_link_ops = { .policy = ovpn_policy, .maxtype = IFLA_OVPN_MAX, .newlink = ovpn_newlink, - .dellink = ovpn_dellink, .fill_info = ovpn_fill_info, }; diff --git a/drivers/net/ovpn/netlink.c b/drivers/net/ovpn/netlink.c index c7f382437630..bdb56ef0c904 100644 --- a/drivers/net/ovpn/netlink.c +++ b/drivers/net/ovpn/netlink.c @@ -455,10 +455,12 @@ int ovpn_nl_peer_new_doit(struct sk_buff *skb, struct genl_info *info) sock_release: ovpn_socket_release(peer); peer_release: - /* release right away because peer was not yet hashed, thus it is not - * used in any context + /* For UDP, the peer is unreachable until added to the hashtables, so + * dropping the initial reference is enough. For TCP, the peer may be + * concurrently reachable via sk_user_data->peer until + * ovpn_socket_release() detaches; rely on the refcount. */ - ovpn_peer_release(peer); + ovpn_peer_put(peer); return ret; } diff --git a/drivers/net/ovpn/peer.c b/drivers/net/ovpn/peer.c index 3716a1d82801..8cd129cc2142 100644 --- a/drivers/net/ovpn/peer.c +++ b/drivers/net/ovpn/peer.c @@ -348,7 +348,7 @@ static void ovpn_peer_release_rcu(struct rcu_head *head) * ovpn_peer_release - release peer private members * @peer: the peer to release */ -void ovpn_peer_release(struct ovpn_peer *peer) +static void ovpn_peer_release(struct ovpn_peer *peer) { ovpn_crypto_state_release(&peer->crypto); spin_lock_bh(&peer->lock); @@ -1029,14 +1029,29 @@ static int ovpn_peer_add_p2p(struct ovpn_priv *ovpn, struct ovpn_peer *peer) */ int ovpn_peer_add(struct ovpn_priv *ovpn, struct ovpn_peer *peer) { + int ret = -ENODEV; + + /* Prevent adding new peers while destroying the ovpn interface. + * Failing to do so would end up holding the device reference + * endlessly hostage of the new peer object with no chance of + * release.. + */ + netdev_lock(ovpn->dev); + if (ovpn->dev->reg_state != NETREG_REGISTERED) + goto out; + switch (ovpn->mode) { case OVPN_MODE_MP: - return ovpn_peer_add_mp(ovpn, peer); + ret = ovpn_peer_add_mp(ovpn, peer); + break; case OVPN_MODE_P2P: - return ovpn_peer_add_p2p(ovpn, peer); + ret = ovpn_peer_add_p2p(ovpn, peer); + break; } +out: + netdev_unlock(ovpn->dev); - return -EOPNOTSUPP; + return ret; } /** diff --git a/drivers/net/ovpn/peer.h b/drivers/net/ovpn/peer.h index a1423f2b09e0..4de5aeae33f7 100644 --- a/drivers/net/ovpn/peer.h +++ b/drivers/net/ovpn/peer.h @@ -125,7 +125,6 @@ static inline bool ovpn_peer_hold(struct ovpn_peer *peer) return kref_get_unless_zero(&peer->refcount); } -void ovpn_peer_release(struct ovpn_peer *peer); void ovpn_peer_release_kref(struct kref *kref); /** diff --git a/drivers/net/ovpn/stats.h b/drivers/net/ovpn/stats.h index 53433d8b6c33..3a45b97c0056 100644 --- a/drivers/net/ovpn/stats.h +++ b/drivers/net/ovpn/stats.h @@ -11,6 +11,8 @@ #ifndef _NET_OVPN_OVPNSTATS_H_ #define _NET_OVPN_OVPNSTATS_H_ +#include <linux/netdevice.h> + /* one stat */ struct ovpn_peer_stat { atomic64_t bytes; @@ -44,4 +46,18 @@ static inline void ovpn_peer_stats_increment_tx(struct ovpn_peer_stats *stats, ovpn_peer_stats_increment(&stats->tx, n); } +static inline void ovpn_dev_dstats_tx_dropped(struct net_device *dev) +{ + local_bh_disable(); + dev_dstats_tx_dropped(dev); + local_bh_enable(); +} + +static inline void ovpn_dev_dstats_rx_dropped(struct net_device *dev) +{ + local_bh_disable(); + dev_dstats_rx_dropped(dev); + local_bh_enable(); +} + #endif /* _NET_OVPN_OVPNSTATS_H_ */ diff --git a/drivers/net/ovpn/tcp.c b/drivers/net/ovpn/tcp.c index 5499c1572f3e..505c2f214c9f 100644 --- a/drivers/net/ovpn/tcp.c +++ b/drivers/net/ovpn/tcp.c @@ -152,7 +152,7 @@ err: if (WARN_ON(!ovpn_peer_hold(peer))) goto err_nopeer; schedule_work(&peer->tcp.defer_del_work); - dev_dstats_rx_dropped(peer->ovpn->dev); + ovpn_dev_dstats_rx_dropped(peer->ovpn->dev); err_nopeer: kfree_skb(skb); } @@ -298,9 +298,9 @@ static void ovpn_tcp_send_sock(struct ovpn_peer *peer, struct sock *sk) } while (peer->tcp.out_msg.len > 0); if (!peer->tcp.out_msg.len) { - preempt_disable(); + local_bh_disable(); dev_dstats_tx_add(peer->ovpn->dev, skb->len); - preempt_enable(); + local_bh_enable(); } kfree_skb(peer->tcp.out_msg.skb); @@ -331,7 +331,7 @@ static void ovpn_tcp_send_sock_skb(struct ovpn_peer *peer, struct sock *sk, ovpn_tcp_send_sock(peer, sk); if (peer->tcp.out_msg.skb) { - dev_dstats_tx_dropped(peer->ovpn->dev); + ovpn_dev_dstats_tx_dropped(peer->ovpn->dev); kfree_skb(skb); return; } @@ -353,7 +353,7 @@ void ovpn_tcp_send_skb(struct ovpn_peer *peer, struct sock *sk, if (sock_owned_by_user(sk)) { if (skb_queue_len(&peer->tcp.out_queue) >= READ_ONCE(net_hotdata.max_backlog)) { - dev_dstats_tx_dropped(peer->ovpn->dev); + ovpn_dev_dstats_tx_dropped(peer->ovpn->dev); kfree_skb(skb); goto unlock; } @@ -581,14 +581,19 @@ static void ovpn_tcp_close(struct sock *sk, long timeout) rcu_read_lock(); sock = rcu_dereference_sk_user_data(sk); - if (!sock || !sock->peer || !ovpn_peer_hold(sock->peer)) { + if (!sock) { rcu_read_unlock(); return; } + peer = sock->peer; + if (!peer || !ovpn_peer_hold(peer)) { + rcu_read_unlock(); + return; + } rcu_read_unlock(); - ovpn_peer_del(sock->peer, OVPN_DEL_PEER_REASON_TRANSPORT_DISCONNECT); + ovpn_peer_del(peer, OVPN_DEL_PEER_REASON_TRANSPORT_DISCONNECT); peer->tcp.sk_cb.prot->close(sk, timeout); ovpn_peer_put(peer); } diff --git a/drivers/net/ovpn/udp.c b/drivers/net/ovpn/udp.c index 272b535ecaad..367563d84472 100644 --- a/drivers/net/ovpn/udp.c +++ b/drivers/net/ovpn/udp.c @@ -126,7 +126,7 @@ static int ovpn_udp_encap_recv(struct sock *sk, struct sk_buff *skb) return 0; drop: - dev_dstats_rx_dropped(ovpn->dev); + ovpn_dev_dstats_rx_dropped(ovpn->dev); drop_noovpn: kfree_skb(skb); return 0; diff --git a/drivers/net/phy/dp83tc811.c b/drivers/net/phy/dp83tc811.c index e480c2a07450..252fb12b3e68 100644 --- a/drivers/net/phy/dp83tc811.c +++ b/drivers/net/phy/dp83tc811.c @@ -393,6 +393,7 @@ static struct phy_driver dp83811_driver[] = { .config_init = dp83811_config_init, .config_aneg = dp83811_config_aneg, .soft_reset = dp83811_phy_reset, + .get_features = genphy_c45_pma_read_ext_abilities, .get_wol = dp83811_get_wol, .set_wol = dp83811_set_wol, .config_intr = dp83811_config_intr, diff --git a/drivers/net/phy/phy-c45.c b/drivers/net/phy/phy-c45.c index d48aa7231b37..126951741428 100644 --- a/drivers/net/phy/phy-c45.c +++ b/drivers/net/phy/phy-c45.c @@ -940,6 +940,14 @@ EXPORT_SYMBOL_GPL(genphy_c45_read_eee_abilities); */ int genphy_c45_an_config_eee_aneg(struct phy_device *phydev) { + /* Writing MMD AN advertisements while autoneg is disabled has no + * effect on link-partner negotiation, but on some PHYs (e.g. the + * Broadcom BCM54213PE) the write itself disturbs the receive + * datapath. Skip it. + */ + if (phydev->autoneg == AUTONEG_DISABLE) + return 0; + if (!phydev->eee_cfg.eee_enabled) { __ETHTOOL_DECLARE_LINK_MODE_MASK(adv) = {}; diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c index f3696d9819d3..cfb505ed9a3a 100644 --- a/drivers/net/phy/phy_device.c +++ b/drivers/net/phy/phy_device.c @@ -2907,7 +2907,8 @@ EXPORT_SYMBOL(phy_advertise_supported); */ void phy_advertise_eee_all(struct phy_device *phydev) { - linkmode_copy(phydev->advertising_eee, phydev->supported_eee); + linkmode_andnot(phydev->advertising_eee, phydev->supported_eee, + phydev->eee_disabled_modes); } EXPORT_SYMBOL_GPL(phy_advertise_eee_all); @@ -2933,7 +2934,8 @@ EXPORT_SYMBOL_GPL(phy_advertise_eee_all); */ void phy_support_eee(struct phy_device *phydev) { - linkmode_copy(phydev->advertising_eee, phydev->supported_eee); + linkmode_andnot(phydev->advertising_eee, phydev->supported_eee, + phydev->eee_disabled_modes); phydev->eee_cfg.tx_lpi_enabled = true; phydev->eee_cfg.eee_enabled = true; } diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c index 3beaaaeec9e1..871baca2de4d 100644 --- a/drivers/net/pse-pd/pse_core.c +++ b/drivers/net/pse-pd/pse_core.c @@ -210,7 +210,7 @@ static int of_load_pse_pis(struct pse_controller_dev *pcdev) ret = of_load_pse_pi_pairsets(node, &pi, ret); if (ret) goto out; - } else if (ret != ENOENT) { + } else if (ret != -ENOENT) { dev_err(pcdev->dev, "error: wrong number of pairsets. Should be 1 or 2, got %d (%pOF)\n", ret, node); diff --git a/drivers/net/tap.c b/drivers/net/tap.c index b8240737dc51..a590e07ce0a9 100644 --- a/drivers/net/tap.c +++ b/drivers/net/tap.c @@ -919,11 +919,11 @@ static long tap_ioctl(struct file *file, unsigned int cmd, struct tap_queue *q = file->private_data; struct tap_dev *tap; void __user *argp = (void __user *)arg; + struct sockaddr_storage ss = {}; struct ifreq __user *ifr = argp; unsigned int __user *up = argp; unsigned short u; int __user *sp = argp; - struct sockaddr_storage ss; int s; int ret; diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c index 0bdb38edd915..e57588c19c80 100644 --- a/drivers/net/wireless/ath/ath10k/wmi.c +++ b/drivers/net/wireless/ath/ath10k/wmi.c @@ -3,7 +3,6 @@ * Copyright (c) 2005-2011 Atheros Communications Inc. * Copyright (c) 2011-2017 Qualcomm Atheros, Inc. * Copyright (c) 2018-2019, The Linux Foundation. All rights reserved. - * Copyright (c) 2021-2024 Qualcomm Innovation Center, Inc. All rights reserved. * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries. */ @@ -1947,15 +1946,15 @@ int ath10k_wmi_cmd_send(struct ath10k *ar, struct sk_buff *skb, u32 cmd_id) ret = -ESHUTDOWN; ath10k_dbg(ar, ATH10K_DBG_WMI, "drop wmi command %d, hardware is wedged\n", cmd_id); - } - /* try to send pending beacons first. they take priority */ - ath10k_wmi_tx_beacons_nowait(ar); + } else { + /* try to send pending beacons first. they take priority */ + ath10k_wmi_tx_beacons_nowait(ar); - ret = ath10k_wmi_cmd_send_nowait(ar, skb, cmd_id); - - if (ret && test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags)) - ret = -ESHUTDOWN; + ret = ath10k_wmi_cmd_send_nowait(ar, skb, cmd_id); + if (ret && test_bit(ATH10K_FLAG_CRASH_FLUSH, &ar->dev_flags)) + ret = -ESHUTDOWN; + } (ret != -EAGAIN); }), 3 * HZ); diff --git a/drivers/net/wireless/ath/ath11k/dp_rx.c b/drivers/net/wireless/ath/ath11k/dp_rx.c index 85defe11750d..9bbafbd696e6 100644 --- a/drivers/net/wireless/ath/ath11k/dp_rx.c +++ b/drivers/net/wireless/ath/ath11k/dp_rx.c @@ -2214,8 +2214,7 @@ ath11k_dp_rx_h_find_peer(struct ath11k_base *ab, struct sk_buff *msdu) lockdep_assert_held(&ab->base_lock); - if (rxcb->peer_id) - peer = ath11k_peer_find_by_id(ab, rxcb->peer_id); + peer = ath11k_peer_find_by_id(ab, rxcb->peer_id); if (peer) return peer; diff --git a/drivers/net/wireless/ath/ath11k/hal.c b/drivers/net/wireless/ath/ath11k/hal.c index e821e5a62c1c..98bd9e3f0aae 100644 --- a/drivers/net/wireless/ath/ath11k/hal.c +++ b/drivers/net/wireless/ath/ath11k/hal.c @@ -1387,14 +1387,22 @@ EXPORT_SYMBOL(ath11k_hal_srng_deinit); void ath11k_hal_srng_clear(struct ath11k_base *ab) { - /* No need to memset rdp and wrp memory since each individual - * segment would get cleared in ath11k_hal_srng_src_hw_init() - * and ath11k_hal_srng_dst_hw_init(). + /* + * Preserve the shared pointer buffers, but clear the previous + * firmware instance's hp/tp state before handing them back to FW. + * LMAC rings reuse this shared memory without going through the + * normal SRNG hw-init path that zeros non-LMAC ring pointers. */ memset(ab->hal.srng_list, 0, sizeof(ab->hal.srng_list)); memset(ab->hal.shadow_reg_addr, 0, sizeof(ab->hal.shadow_reg_addr)); + if (ab->hal.rdp.vaddr) + memset(ab->hal.rdp.vaddr, 0, + sizeof(*ab->hal.rdp.vaddr) * HAL_SRNG_RING_ID_MAX); + if (ab->hal.wrp.vaddr) + memset(ab->hal.wrp.vaddr, 0, + sizeof(*ab->hal.wrp.vaddr) * HAL_SRNG_NUM_LMAC_RINGS); ab->hal.avail_blk_resource = 0; ab->hal.current_blk_index = 0; ab->hal.num_shadow_reg_configured = 0; diff --git a/drivers/net/wireless/ath/ath11k/hal_rx.c b/drivers/net/wireless/ath/ath11k/hal_rx.c index 753bd93f0212..51e0840bc0d1 100644 --- a/drivers/net/wireless/ath/ath11k/hal_rx.c +++ b/drivers/net/wireless/ath/ath11k/hal_rx.c @@ -1467,11 +1467,8 @@ ath11k_hal_rx_parse_mon_status_tlv(struct ath11k_base *ab, case HAL_RX_MPDU_START: { struct hal_rx_mpdu_info *mpdu_info = (struct hal_rx_mpdu_info *)tlv_data; - u16 peer_id; - peer_id = ath11k_hal_rx_mpduinfo_get_peerid(ab, mpdu_info); - if (peer_id) - ppdu_info->peer_id = peer_id; + ppdu_info->peer_id = ath11k_hal_rx_mpduinfo_get_peerid(ab, mpdu_info); break; } case HAL_RXPCU_PPDU_END_INFO: { diff --git a/drivers/net/wireless/ath/ath11k/testmode.c b/drivers/net/wireless/ath/ath11k/testmode.c index a9751ea2a0b7..c72eed358f6d 100644 --- a/drivers/net/wireless/ath/ath11k/testmode.c +++ b/drivers/net/wireless/ath/ath11k/testmode.c @@ -457,6 +457,7 @@ static int ath11k_tm_cmd_wmi_ftm(struct ath11k *ar, struct nlattr *tb[]) ret = ath11k_wmi_cmd_send(wmi, skb, cmd_id); if (ret) { ath11k_warn(ar->ab, "failed to send wmi ftm command: %d\n", ret); + dev_kfree_skb(skb); goto out; } diff --git a/drivers/net/wireless/ath/ath11k/wmi.c b/drivers/net/wireless/ath/ath11k/wmi.c index 40747fba3b0c..024c2aad9fb4 100644 --- a/drivers/net/wireless/ath/ath11k/wmi.c +++ b/drivers/net/wireless/ath/ath11k/wmi.c @@ -9332,6 +9332,7 @@ int ath11k_wmi_wow_host_wakeup_ind(struct ath11k *ar) struct wmi_wow_host_wakeup_ind *cmd; struct sk_buff *skb; size_t len; + int ret; len = sizeof(*cmd); skb = ath11k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -9345,14 +9346,20 @@ int ath11k_wmi_wow_host_wakeup_ind(struct ath11k *ar) ath11k_dbg(ar->ab, ATH11K_DBG_WMI, "tlv wow host wakeup ind\n"); - return ath11k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_HOSTWAKEUP_FROM_SLEEP_CMDID); + ret = ath11k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_HOSTWAKEUP_FROM_SLEEP_CMDID); + if (ret) { + ath11k_warn(ar->ab, "failed to send WMI_WOW_HOSTWAKEUP_FROM_SLEEP_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath11k_wmi_wow_enable(struct ath11k *ar) { struct wmi_wow_enable_cmd *cmd; struct sk_buff *skb; - int len; + int ret, len; len = sizeof(*cmd); skb = ath11k_wmi_alloc_skb(ar->wmi->wmi_ab, len); @@ -9367,7 +9374,13 @@ int ath11k_wmi_wow_enable(struct ath11k *ar) cmd->pause_iface_config = WOW_IFACE_PAUSE_ENABLED; ath11k_dbg(ar->ab, ATH11K_DBG_WMI, "tlv wow enable\n"); - return ath11k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_ENABLE_CMDID); + ret = ath11k_wmi_cmd_send(ar->wmi, skb, WMI_WOW_ENABLE_CMDID); + if (ret) { + ath11k_warn(ar->ab, "failed to send WMI_WOW_ENABLE_CMDID\n"); + dev_kfree_skb(skb); + } + + return ret; } int ath11k_wmi_scan_prob_req_oui(struct ath11k *ar, diff --git a/drivers/net/wireless/ath/ath12k/mac.c b/drivers/net/wireless/ath/ath12k/mac.c index fa36e984c74b..6869e9860776 100644 --- a/drivers/net/wireless/ath/ath12k/mac.c +++ b/drivers/net/wireless/ath/ath12k/mac.c @@ -3446,7 +3446,9 @@ static void ath12k_peer_assoc_h_eht(struct ath12k *ar, arg->peer_eht_mcs_count++; fallthrough; default: - if (!(link_sta->he_cap.he_cap_elem.phy_cap_info[0] & + if ((vif->type == NL80211_IFTYPE_AP || + vif->type == NL80211_IFTYPE_MESH_POINT) && + !(link_sta->he_cap.he_cap_elem.phy_cap_info[0] & IEEE80211_HE_PHY_CAP0_CHANNEL_WIDTH_SET_MASK_ALL)) { bw_20 = &eht_cap->eht_mcs_nss_supp.only_20mhz; @@ -3475,7 +3477,9 @@ static void ath12k_peer_assoc_h_eht(struct ath12k *ar, arg->punct_bitmap = ~arvif->punct_bitmap; arg->eht_disable_mcs15 = link_conf->eht_disable_mcs15; - if (!(link_sta->he_cap.he_cap_elem.phy_cap_info[0] & + if ((vif->type == NL80211_IFTYPE_AP || + vif->type == NL80211_IFTYPE_MESH_POINT) && + !(link_sta->he_cap.he_cap_elem.phy_cap_info[0] & IEEE80211_HE_PHY_CAP0_CHANNEL_WIDTH_SET_MASK_ALL)) { if (bw_20->rx_tx_mcs13_max_nss) max_nss = max(max_nss, u8_get_bits(bw_20->rx_tx_mcs13_max_nss, diff --git a/drivers/net/wireless/intel/iwlwifi/mld/link.c b/drivers/net/wireless/intel/iwlwifi/mld/link.c index b5430e8a73d6..7496528e8587 100644 --- a/drivers/net/wireless/intel/iwlwifi/mld/link.c +++ b/drivers/net/wireless/intel/iwlwifi/mld/link.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause /* - * Copyright (C) 2024-2025 Intel Corporation + * Copyright (C) 2024-2026 Intel Corporation */ #include "constants.h" @@ -504,7 +504,6 @@ void iwl_mld_remove_link(struct iwl_mld *mld, struct iwl_mld_vif *mld_vif = iwl_mld_vif_from_mac80211(bss_conf->vif); struct iwl_mld_link *link = iwl_mld_link_from_mac80211(bss_conf); bool is_deflink = link == &mld_vif->deflink; - u8 fw_id = link->fw_id; if (WARN_ON(!link || link->active)) return; @@ -512,15 +511,15 @@ void iwl_mld_remove_link(struct iwl_mld *mld, iwl_mld_rm_link_from_fw(mld, bss_conf); /* Continue cleanup on failure */ - if (!is_deflink) - kfree_rcu(link, rcu_head); - RCU_INIT_POINTER(mld_vif->link[bss_conf->link_id], NULL); - if (WARN_ON(fw_id >= mld->fw->ucode_capa.num_links)) + if (WARN_ON(link->fw_id >= mld->fw->ucode_capa.num_links)) return; - RCU_INIT_POINTER(mld->fw_id_to_bss_conf[fw_id], NULL); + RCU_INIT_POINTER(mld->fw_id_to_bss_conf[link->fw_id], NULL); + + if (!is_deflink) + kfree_rcu(link, rcu_head); } void iwl_mld_handle_missed_beacon_notif(struct iwl_mld *mld, diff --git a/drivers/net/wireless/intel/iwlwifi/mld/tx.c b/drivers/net/wireless/intel/iwlwifi/mld/tx.c index 546d09a38dab..0bcb1ae69468 100644 --- a/drivers/net/wireless/intel/iwlwifi/mld/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mld/tx.c @@ -834,7 +834,7 @@ static int iwl_mld_tx_tso_segment(struct iwl_mld *mld, struct sk_buff *skb, return -EINVAL; max_tid_amsdu_len = sta->cur->max_tid_amsdu_len[tid]; - if (!max_tid_amsdu_len) + if (!max_tid_amsdu_len || max_tid_amsdu_len == 1) return iwl_tx_tso_segment(skb, 1, netdev_flags, mpdus_skbs); /* Sub frame header + SNAP + IP header + TCP header + MSS */ @@ -846,6 +846,9 @@ static int iwl_mld_tx_tso_segment(struct iwl_mld *mld, struct sk_buff *skb, */ num_subframes = (max_tid_amsdu_len + pad) / (subf_len + pad); + if (WARN_ON_ONCE(!num_subframes)) + return iwl_tx_tso_segment(skb, 1, netdev_flags, mpdus_skbs); + if (sta->max_amsdu_subframes && num_subframes > sta->max_amsdu_subframes) num_subframes = sta->max_amsdu_subframes; @@ -971,6 +974,16 @@ void iwl_mld_tx_from_txq(struct iwl_mld *mld, struct ieee80211_txq *txq) u8 zero_addr[ETH_ALEN] = {}; /* + * Don't transmit during firmware restart. The firmware is dead, + * so iwl_trans_tx() would return -EIO for each frame. Avoid the + * overhead of dequeuing from mac80211 only to immediately free + * the skbs, and the potential memory pressure from rapid skb + * allocation churn during high-throughput restart scenarios. + */ + if (unlikely(mld->fw_status.in_hw_restart)) + return; + + /* * No need for threads to be pending here, they can leave the first * taker all the work. * diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c b/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c index c523c5e82d4a..8ffa72aca3cf 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/mac-ctxt.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause /* - * Copyright (C) 2012-2014, 2018-2025 Intel Corporation + * Copyright (C) 2012-2014, 2018-2026 Intel Corporation * Copyright (C) 2013-2014 Intel Mobile Communications GmbH * Copyright (C) 2015-2017 Intel Deutschland GmbH */ @@ -927,13 +927,18 @@ u8 iwl_mvm_mac_ctxt_get_lowest_rate(struct iwl_mvm *mvm, u16 iwl_mvm_mac_ctxt_get_beacon_flags(const struct iwl_fw *fw, u8 rate_idx) { - u16 flags = iwl_mvm_mac80211_idx_to_hwrate(fw, rate_idx); bool is_new_rate = iwl_fw_lookup_cmd_ver(fw, BEACON_TEMPLATE_CMD, 0) > 10; + u16 flags = 0; if (rate_idx <= IWL_LAST_CCK_RATE) flags |= is_new_rate ? IWL_MAC_BEACON_CCK : IWL_MAC_BEACON_CCK_V1; + if (iwl_fw_lookup_cmd_ver(fw, TX_CMD, 0) > 8) + flags |= iwl_mvm_mac80211_idx_to_hwrate(fw, rate_idx); + else + flags |= iwl_fw_rate_idx_to_plcp(rate_idx); + return flags; } @@ -962,6 +967,7 @@ static void iwl_mvm_mac_ctxt_set_tx(struct iwl_mvm *mvm, { struct iwl_mvm_vif *mvmvif = iwl_mvm_vif_from_mac80211(vif); struct ieee80211_tx_info *info; + u32 rate_n_flags = 0; u8 rate; u32 tx_flags; @@ -981,18 +987,21 @@ static void iwl_mvm_mac_ctxt_set_tx(struct iwl_mvm *mvm, IWL_UCODE_TLV_CAPA_BEACON_ANT_SELECTION)) { iwl_mvm_toggle_tx_ant(mvm, &mvm->mgmt_last_antenna_idx); - tx_params->rate_n_flags = - cpu_to_le32(BIT(mvm->mgmt_last_antenna_idx) << - RATE_MCS_ANT_POS); + rate_n_flags |= BIT(mvm->mgmt_last_antenna_idx) << + RATE_MCS_ANT_POS; } rate = iwl_mvm_mac_ctxt_get_beacon_rate(mvm, info, vif); - tx_params->rate_n_flags |= - cpu_to_le32(iwl_mvm_mac80211_idx_to_hwrate(mvm->fw, rate)); - if (rate == IWL_FIRST_CCK_RATE) - tx_params->rate_n_flags |= cpu_to_le32(RATE_MCS_CCK_MSK_V1); + if (rate < IWL_FIRST_OFDM_RATE) + rate_n_flags |= RATE_MCS_MOD_TYPE_CCK; + else + rate_n_flags |= RATE_MCS_MOD_TYPE_LEGACY_OFDM; + + rate_n_flags |= iwl_mvm_mac80211_idx_to_hwrate(mvm->fw, rate); + tx_params->rate_n_flags = iwl_mvm_v3_rate_to_fw(rate_n_flags, + mvm->fw_rates_ver); } int iwl_mvm_mac_ctxt_send_beacon_cmd(struct iwl_mvm *mvm, diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/utils.c b/drivers/net/wireless/intel/iwlwifi/mvm/utils.c index 4a33a032c2a7..f052537e9567 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/utils.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/utils.c @@ -1,6 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause /* - * Copyright (C) 2012-2014, 2018-2025 Intel Corporation + * Copyright (C) 2012-2014, 2018-2026 Intel Corporation * Copyright (C) 2013-2014 Intel Mobile Communications GmbH * Copyright (C) 2015-2017 Intel Deutschland GmbH */ @@ -159,15 +159,9 @@ int iwl_mvm_legacy_rate_to_mac80211_idx(u32 rate_n_flags, u8 iwl_mvm_mac80211_idx_to_hwrate(const struct iwl_fw *fw, int rate_idx) { - if (iwl_fw_lookup_cmd_ver(fw, TX_CMD, 0) > 8) - /* In the new rate legacy rates are indexed: - * 0 - 3 for CCK and 0 - 7 for OFDM. - */ - return (rate_idx >= IWL_FIRST_OFDM_RATE ? - rate_idx - IWL_FIRST_OFDM_RATE : - rate_idx); - - return iwl_fw_rate_idx_to_plcp(rate_idx); + return rate_idx >= IWL_FIRST_OFDM_RATE ? + rate_idx - IWL_FIRST_OFDM_RATE : + rate_idx; } u8 iwl_mvm_mac80211_ac_to_ucode_ac(enum ieee80211_ac_numbers ac) diff --git a/drivers/net/wireless/microchip/wilc1000/wlan.c b/drivers/net/wireless/microchip/wilc1000/wlan.c index 3fa8592eb250..4b116fe6f9ea 100644 --- a/drivers/net/wireless/microchip/wilc1000/wlan.c +++ b/drivers/net/wireless/microchip/wilc1000/wlan.c @@ -1265,7 +1265,7 @@ int wilc_wlan_firmware_download(struct wilc *wilc, const u8 *buffer, ret = acquire_bus(wilc, WILC_BUS_ACQUIRE_AND_WAKEUP); if (ret) - return ret; + goto fail; wilc->hif_func->hif_read_reg(wilc, WILC_GLB_RESET_0, ®); reg &= ~BIT(10); diff --git a/drivers/net/wwan/iosm/iosm_ipc_imem.c b/drivers/net/wwan/iosm/iosm_ipc_imem.c index 1b7bc7d63a2e..4405c8531888 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_imem.c +++ b/drivers/net/wwan/iosm/iosm_ipc_imem.c @@ -1425,6 +1425,8 @@ imem_config_fail: protocol_init_fail: cancel_work_sync(&ipc_imem->run_state_worker); ipc_task_deinit(ipc_imem->ipc_task); + if (ipc_imem->ipc_protocol) + ipc_protocol_deinit(ipc_imem->ipc_protocol); ipc_task_init_fail: kfree(ipc_imem->ipc_task); ipc_task_fail: diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 8844bbd39515..77c668282d99 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -122,7 +122,6 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer, bool supports_metadata = bdev && blk_get_integrity(bdev->bd_disk); struct nvme_ctrl *ctrl = nvme_req(req)->ctrl; bool has_metadata = meta_buffer && meta_len; - struct bio *bio = NULL; int ret; if (!nvme_ctrl_sgl_supported(ctrl)) @@ -154,8 +153,8 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer, return ret; out_unmap: - if (bio) - blk_rq_unmap_user(bio); + if (req->bio) + blk_rq_unmap_user(req->bio); return ret; } diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 4c052ed18cb8..a0e9767bc21e 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -966,7 +966,8 @@ static bool nvme_pci_prp_save_mapping(struct request *req, { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); - if (dma_use_iova(&iod->dma_state) || !dma_need_unmap(dma_dev)) + if (dma_use_iova(&iod->dma_state) || !dma_need_unmap(dma_dev) || + (iod->flags & IOD_DATA_P2P)) return true; if (!iod->nr_dma_vecs) { @@ -996,6 +997,23 @@ static bool nvme_pci_prp_iter_next(struct request *req, struct device *dma_dev, return nvme_pci_prp_save_mapping(req, dma_dev, iter); } +static void nvme_unmap_iter(struct request *req, struct blk_dma_iter *iter, + struct dma_iova_state *state) +{ + struct nvme_queue *nvmeq = req->mq_hctx->driver_data; + struct device *dev = nvmeq->dev->dev; + + if (!blk_rq_dma_unmap(req, dev, state, iter->len, iter->p2pdma.map)) { + unsigned int attrs = 0; + + if (iter->p2pdma.map == PCI_P2PDMA_MAP_THRU_HOST_BRIDGE) + attrs |= DMA_ATTR_MMIO; + + dma_unmap_phys(dev, iter->addr, iter->len, rq_dma_dir(req), + attrs); + } +} + static blk_status_t nvme_pci_setup_data_prp(struct request *req, struct blk_dma_iter *iter) { @@ -1006,8 +1024,10 @@ static blk_status_t nvme_pci_setup_data_prp(struct request *req, unsigned int prp_len, i; __le64 *prp_list; - if (!nvme_pci_prp_save_mapping(req, nvmeq->dev->dev, iter)) + if (!nvme_pci_prp_save_mapping(req, nvmeq->dev->dev, iter)) { + nvme_unmap_iter(req, iter, &iod->dma_state); return iter->status; + } /* * PRP1 always points to the start of the DMA transfers. @@ -1112,6 +1132,7 @@ bad_sgl: dev_err_once(nvmeq->dev->dev, "Incorrectly formed request for payload:%d nents:%d\n", blk_rq_payload_bytes(req), blk_rq_nr_phys_segments(req)); + nvme_unmap_data(req); return BLK_STS_IOERR; } @@ -1155,8 +1176,11 @@ static blk_status_t nvme_pci_setup_data_sgl(struct request *req, sg_list = dma_pool_alloc(nvme_dma_pool(nvmeq, iod), GFP_ATOMIC, &sgl_dma); - if (!sg_list) + if (!sg_list) { + nvme_unmap_iter(req, iter, &iod->dma_state); return BLK_STS_RESOURCE; + } + iod->descriptors[iod->nr_descriptors++] = sg_list; do { @@ -1313,8 +1337,10 @@ static blk_status_t nvme_pci_setup_meta_iter(struct request *req) sg_list = dma_pool_alloc(nvmeq->descriptor_pools.small, GFP_ATOMIC, &sgl_dma); - if (!sg_list) + if (!sg_list) { + nvme_unmap_iter(req, &iter, &iod->meta_dma_state); return BLK_STS_RESOURCE; + } iod->meta_descriptor = sg_list; iod->meta_dma = sgl_dma; @@ -2533,11 +2559,13 @@ static void nvme_free_host_mem_multi(struct nvme_dev *dev) static void nvme_free_host_mem(struct nvme_dev *dev) { - if (dev->hmb_sgt) + if (dev->hmb_sgt) { dma_free_noncontiguous(dev->dev, dev->host_mem_size, dev->hmb_sgt, DMA_BIDIRECTIONAL); - else + dev->hmb_sgt = NULL; + } else { nvme_free_host_mem_multi(dev); + } dma_free_coherent(dev->dev, dev->host_mem_descs_size, dev->host_mem_descs, dev->host_mem_descs_dma); diff --git a/drivers/phy/apple/atc.c b/drivers/phy/apple/atc.c index 64d0c3dba1cb..4f0585818fa7 100644 --- a/drivers/phy/apple/atc.c +++ b/drivers/phy/apple/atc.c @@ -628,9 +628,6 @@ struct apple_atcphy { struct reset_controller_dev rcdev; - struct typec_switch *sw; - struct typec_mux *mux; - struct mutex lock; }; @@ -2066,15 +2063,25 @@ static int atcphy_sw_set(struct typec_switch_dev *sw, enum typec_orientation ori return 0; } +static void atcphy_typec_switch_unregister(void *data) +{ + typec_switch_unregister(data); +} + static int atcphy_probe_switch(struct apple_atcphy *atcphy) { + struct typec_switch_dev *sw; struct typec_switch_desc sw_desc = { .drvdata = atcphy, .fwnode = atcphy->dev->fwnode, .set = atcphy_sw_set, }; - return PTR_ERR_OR_ZERO(typec_switch_register(atcphy->dev, &sw_desc)); + sw = typec_switch_register(atcphy->dev, &sw_desc); + if (IS_ERR(sw)) + return PTR_ERR(sw); + + return devm_add_action_or_reset(atcphy->dev, atcphy_typec_switch_unregister, sw); } static int atcphy_mux_set(struct typec_mux_dev *mux, struct typec_mux_state *state) @@ -2146,15 +2153,25 @@ static int atcphy_mux_set(struct typec_mux_dev *mux, struct typec_mux_state *sta return atcphy_configure(atcphy, target_mode); } +static void atcphy_typec_mux_unregister(void *data) +{ + typec_mux_unregister(data); +} + static int atcphy_probe_mux(struct apple_atcphy *atcphy) { + struct typec_mux_dev *mux; struct typec_mux_desc mux_desc = { .drvdata = atcphy, .fwnode = atcphy->dev->fwnode, .set = atcphy_mux_set, }; - return PTR_ERR_OR_ZERO(typec_mux_register(atcphy->dev, &mux_desc)); + mux = typec_mux_register(atcphy->dev, &mux_desc); + if (IS_ERR(mux)) + return PTR_ERR(mux); + + return devm_add_action_or_reset(atcphy->dev, atcphy_typec_mux_unregister, mux); } static int atcphy_load_tunables(struct apple_atcphy *atcphy) diff --git a/drivers/phy/marvell/phy-mvebu-a3700-utmi.c b/drivers/phy/marvell/phy-mvebu-a3700-utmi.c index 04f4fb4bed70..f882bc57649c 100644 --- a/drivers/phy/marvell/phy-mvebu-a3700-utmi.c +++ b/drivers/phy/marvell/phy-mvebu-a3700-utmi.c @@ -168,9 +168,8 @@ static int mvebu_a3700_utmi_phy_power_off(struct phy *phy) u32 reg; /* Disable PHY pull-up and enable USB2 suspend */ - reg = readl(utmi->regs + USB2_PHY_CTRL(usb32)); - reg &= ~(RB_USB2PHY_PU | RB_USB2PHY_SUSPM(usb32)); - writel(reg, utmi->regs + USB2_PHY_CTRL(usb32)); + regmap_update_bits(utmi->usb_misc, USB2_PHY_CTRL(usb32), + RB_USB2PHY_PU | RB_USB2PHY_SUSPM(usb32), 0); /* Power down OTG module */ if (usb32) { diff --git a/drivers/phy/qualcomm/phy-qcom-edp.c b/drivers/phy/qualcomm/phy-qcom-edp.c index 7372de05a0b8..fd933252ca73 100644 --- a/drivers/phy/qualcomm/phy-qcom-edp.c +++ b/drivers/phy/qualcomm/phy-qcom-edp.c @@ -87,7 +87,8 @@ struct qcom_edp_phy_cfg { bool is_edp; const u8 *aux_cfg; const u8 *vco_div_cfg; - const struct qcom_edp_swing_pre_emph_cfg *swing_pre_emph_cfg; + const struct qcom_edp_swing_pre_emph_cfg *dp_swing_pre_emph_cfg; + const struct qcom_edp_swing_pre_emph_cfg *edp_swing_pre_emph_cfg; const struct phy_ver_ops *ver_ops; }; @@ -116,17 +117,17 @@ struct qcom_edp { }; static const u8 dp_swing_hbr_rbr[4][4] = { - { 0x08, 0x0f, 0x16, 0x1f }, + { 0x07, 0x0f, 0x16, 0x1f }, { 0x11, 0x1e, 0x1f, 0xff }, { 0x16, 0x1f, 0xff, 0xff }, { 0x1f, 0xff, 0xff, 0xff } }; static const u8 dp_pre_emp_hbr_rbr[4][4] = { - { 0x00, 0x0d, 0x14, 0x1a }, + { 0x00, 0x0e, 0x15, 0x1a }, { 0x00, 0x0e, 0x15, 0xff }, { 0x00, 0x0e, 0xff, 0xff }, - { 0x03, 0xff, 0xff, 0xff } + { 0x04, 0xff, 0xff, 0xff } }; static const u8 dp_swing_hbr2_hbr3[4][4] = { @@ -150,6 +151,20 @@ static const struct qcom_edp_swing_pre_emph_cfg dp_phy_swing_pre_emph_cfg = { .pre_emphasis_hbr3_hbr2 = &dp_pre_emp_hbr2_hbr3, }; +static const u8 dp_pre_emp_hbr_rbr_v8[4][4] = { + { 0x00, 0x0e, 0x15, 0x1a }, + { 0x00, 0x0e, 0x15, 0xff }, + { 0x00, 0x0e, 0xff, 0xff }, + { 0x00, 0xff, 0xff, 0xff } +}; + +static const struct qcom_edp_swing_pre_emph_cfg dp_phy_swing_pre_emph_cfg_v8 = { + .swing_hbr_rbr = &dp_swing_hbr_rbr, + .swing_hbr3_hbr2 = &dp_swing_hbr2_hbr3, + .pre_emphasis_hbr_rbr = &dp_pre_emp_hbr_rbr_v8, + .pre_emphasis_hbr3_hbr2 = &dp_pre_emp_hbr2_hbr3, +}; + static const u8 edp_swing_hbr_rbr[4][4] = { { 0x07, 0x0f, 0x16, 0x1f }, { 0x0d, 0x16, 0x1e, 0xff }, @@ -158,7 +173,7 @@ static const u8 edp_swing_hbr_rbr[4][4] = { }; static const u8 edp_pre_emp_hbr_rbr[4][4] = { - { 0x05, 0x12, 0x17, 0x1d }, + { 0x05, 0x11, 0x17, 0x1d }, { 0x05, 0x11, 0x18, 0xff }, { 0x06, 0x11, 0xff, 0xff }, { 0x00, 0xff, 0xff, 0xff } @@ -172,10 +187,10 @@ static const u8 edp_swing_hbr2_hbr3[4][4] = { }; static const u8 edp_pre_emp_hbr2_hbr3[4][4] = { - { 0x08, 0x11, 0x17, 0x1b }, - { 0x00, 0x0c, 0x13, 0xff }, - { 0x05, 0x10, 0xff, 0xff }, - { 0x00, 0xff, 0xff, 0xff } + { 0x0c, 0x15, 0x19, 0x1e }, + { 0x0b, 0x15, 0x19, 0xff }, + { 0x0e, 0x14, 0xff, 0xff }, + { 0x0d, 0xff, 0xff, 0xff } }; static const struct qcom_edp_swing_pre_emph_cfg edp_phy_swing_pre_emph_cfg = { @@ -193,27 +208,6 @@ static const u8 edp_phy_vco_div_cfg_v4[4] = { 0x01, 0x01, 0x02, 0x00, }; -static const u8 edp_pre_emp_hbr_rbr_v5[4][4] = { - { 0x05, 0x11, 0x17, 0x1d }, - { 0x05, 0x11, 0x18, 0xff }, - { 0x06, 0x11, 0xff, 0xff }, - { 0x00, 0xff, 0xff, 0xff } -}; - -static const u8 edp_pre_emp_hbr2_hbr3_v5[4][4] = { - { 0x0c, 0x15, 0x19, 0x1e }, - { 0x0b, 0x15, 0x19, 0xff }, - { 0x0e, 0x14, 0xff, 0xff }, - { 0x0d, 0xff, 0xff, 0xff } -}; - -static const struct qcom_edp_swing_pre_emph_cfg edp_phy_swing_pre_emph_cfg_v5 = { - .swing_hbr_rbr = &edp_swing_hbr_rbr, - .swing_hbr3_hbr2 = &edp_swing_hbr2_hbr3, - .pre_emphasis_hbr_rbr = &edp_pre_emp_hbr_rbr_v5, - .pre_emphasis_hbr3_hbr2 = &edp_pre_emp_hbr2_hbr3_v5, -}; - static const u8 edp_phy_aux_cfg_v5[DP_AUX_CFG_SIZE] = { 0x00, 0x13, 0xa4, 0x00, 0x0a, 0x26, 0x0a, 0x03, 0x37, 0x03, 0x02, 0x02, 0x00, }; @@ -262,12 +256,7 @@ static int qcom_edp_phy_init(struct phy *phy) DP_PHY_PD_CTL_PLL_PWRDN | DP_PHY_PD_CTL_DP_CLAMP_EN, edp->edp + DP_PHY_PD_CTL); - /* - * TODO: Re-work the conditions around setting the cfg8 value - * when more information becomes available about why this is - * even needed. - */ - if (edp->cfg->swing_pre_emph_cfg && !edp->is_edp) + if (!edp->is_edp) aux_cfg[8] = 0xb7; writel(0xfc, edp->edp + DP_PHY_MODE); @@ -291,7 +280,7 @@ out_disable_supplies: static int qcom_edp_set_voltages(struct qcom_edp *edp, const struct phy_configure_opts_dp *dp_opts) { - const struct qcom_edp_swing_pre_emph_cfg *cfg = edp->cfg->swing_pre_emph_cfg; + const struct qcom_edp_swing_pre_emph_cfg *cfg; unsigned int v_level = 0; unsigned int p_level = 0; u8 ldo_config; @@ -299,12 +288,14 @@ static int qcom_edp_set_voltages(struct qcom_edp *edp, const struct phy_configur u8 emph; int i; + if (edp->is_edp) + cfg = edp->cfg->edp_swing_pre_emph_cfg; + else + cfg = edp->cfg->dp_swing_pre_emph_cfg; + if (!cfg) return 0; - if (edp->is_edp) - cfg = &edp_phy_swing_pre_emph_cfg; - for (i = 0; i < dp_opts->lanes; i++) { v_level = max(v_level, dp_opts->voltage[i]); p_level = max(p_level, dp_opts->pre[i]); @@ -564,7 +555,8 @@ static const struct qcom_edp_phy_cfg sa8775p_dp_phy_cfg = { .is_edp = false, .aux_cfg = edp_phy_aux_cfg_v5, .vco_div_cfg = edp_phy_vco_div_cfg_v4, - .swing_pre_emph_cfg = &edp_phy_swing_pre_emph_cfg_v5, + .dp_swing_pre_emph_cfg = &dp_phy_swing_pre_emph_cfg, + .edp_swing_pre_emph_cfg = &edp_phy_swing_pre_emph_cfg, .ver_ops = &qcom_edp_phy_ops_v4, }; @@ -577,7 +569,8 @@ static const struct qcom_edp_phy_cfg sc7280_dp_phy_cfg = { static const struct qcom_edp_phy_cfg sc8280xp_dp_phy_cfg = { .aux_cfg = edp_phy_aux_cfg_v4, .vco_div_cfg = edp_phy_vco_div_cfg_v4, - .swing_pre_emph_cfg = &dp_phy_swing_pre_emph_cfg, + .dp_swing_pre_emph_cfg = &dp_phy_swing_pre_emph_cfg, + .edp_swing_pre_emph_cfg = &edp_phy_swing_pre_emph_cfg, .ver_ops = &qcom_edp_phy_ops_v4, }; @@ -585,7 +578,8 @@ static const struct qcom_edp_phy_cfg sc8280xp_edp_phy_cfg = { .is_edp = true, .aux_cfg = edp_phy_aux_cfg_v4, .vco_div_cfg = edp_phy_vco_div_cfg_v4, - .swing_pre_emph_cfg = &edp_phy_swing_pre_emph_cfg, + .dp_swing_pre_emph_cfg = &dp_phy_swing_pre_emph_cfg, + .edp_swing_pre_emph_cfg = &edp_phy_swing_pre_emph_cfg, .ver_ops = &qcom_edp_phy_ops_v4, }; @@ -766,7 +760,8 @@ static const struct phy_ver_ops qcom_edp_phy_ops_v6 = { static struct qcom_edp_phy_cfg x1e80100_phy_cfg = { .aux_cfg = edp_phy_aux_cfg_v4, .vco_div_cfg = edp_phy_vco_div_cfg_v4, - .swing_pre_emph_cfg = &dp_phy_swing_pre_emph_cfg, + .dp_swing_pre_emph_cfg = &dp_phy_swing_pre_emph_cfg, + .edp_swing_pre_emph_cfg = &edp_phy_swing_pre_emph_cfg, .ver_ops = &qcom_edp_phy_ops_v6, }; @@ -945,7 +940,8 @@ static const struct phy_ver_ops qcom_edp_phy_ops_v8 = { static struct qcom_edp_phy_cfg glymur_phy_cfg = { .aux_cfg = edp_phy_aux_cfg_v8, .vco_div_cfg = edp_phy_vco_div_cfg_v8, - .swing_pre_emph_cfg = &edp_phy_swing_pre_emph_cfg_v5, + .dp_swing_pre_emph_cfg = &dp_phy_swing_pre_emph_cfg_v8, + .edp_swing_pre_emph_cfg = &edp_phy_swing_pre_emph_cfg, .ver_ops = &qcom_edp_phy_ops_v8, }; @@ -963,7 +959,7 @@ static int qcom_edp_phy_power_on(struct phy *phy) if (ret) return ret; - if (edp->cfg->swing_pre_emph_cfg && !edp->is_edp) + if (edp->cfg->edp_swing_pre_emph_cfg && !edp->is_edp) ldo_config = 0x1; writel(ldo_config, edp->tx0 + TXn_LDO_CONFIG); diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-ufs.c b/drivers/phy/qualcomm/phy-qcom-qmp-ufs.c index 771bc7c2ab50..b87314c8379d 100644 --- a/drivers/phy/qualcomm/phy-qcom-qmp-ufs.c +++ b/drivers/phy/qualcomm/phy-qcom-qmp-ufs.c @@ -1112,6 +1112,7 @@ static const struct qmp_phy_init_tbl sm8750_ufsphy_pcs[] = { QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_MULTI_LANE_CTRL1, 0x02), QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_TX_MID_TERM_CTRL1, 0x43), QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_PCS_CTRL1, 0x40), + QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_PLL_CNTL, 0x33), QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_TX_LARGE_AMP_DRV_LVL, 0x0f), QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_RX_SIGDET_CTRL2, 0x68), QMP_PHY_INIT_CFG(QPHY_V6_PCS_UFS_TX_POST_EMP_LVL_S4, 0x0e), diff --git a/drivers/phy/qualcomm/phy-qcom-qmp-usbc.c b/drivers/phy/qualcomm/phy-qcom-qmp-usbc.c index 14feb77789b3..0dd7000614f4 100644 --- a/drivers/phy/qualcomm/phy-qcom-qmp-usbc.c +++ b/drivers/phy/qualcomm/phy-qcom-qmp-usbc.c @@ -794,7 +794,7 @@ static int qmp_v2_configure_dp_swing(struct qmp_usbc *qmp) p_level = max(p_level, dp_opts->pre[i]); } - if (v_level > 4 || p_level > 4) { + if (v_level >= 4 || p_level >= 4) { dev_err(qmp->dev, "Invalid v(%d) | p(%d) level)\n", v_level, p_level); return -EINVAL; diff --git a/drivers/phy/samsung/phy-exynos5-usbdrd.c b/drivers/phy/samsung/phy-exynos5-usbdrd.c index 5a181cb4597e..8711a3b62c8e 100644 --- a/drivers/phy/samsung/phy-exynos5-usbdrd.c +++ b/drivers/phy/samsung/phy-exynos5-usbdrd.c @@ -1958,13 +1958,14 @@ const struct exynos5_usbdrd_phy_tuning exynos7870_tunes_utmi_postinit[] = { PHYPARAM0_TXPREEMPAMPTUNE | PHYPARAM0_TXHSXVTUNE | PHYPARAM0_TXFSLSTUNE | PHYPARAM0_SQRXTUNE | PHYPARAM0_OTGTUNE | PHYPARAM0_COMPDISTUNE), - (FIELD_PREP_CONST(PHYPARAM0_TXVREFTUNE, 14) | + (FIELD_PREP_CONST(PHYPARAM0_TXVREFTUNE, 3) | FIELD_PREP_CONST(PHYPARAM0_TXRISETUNE, 1) | - FIELD_PREP_CONST(PHYPARAM0_TXRESTUNE, 3) | + FIELD_PREP_CONST(PHYPARAM0_TXRESTUNE, 2) | + FIELD_PREP_CONST(PHYPARAM0_TXPREEMPPULSETUNE, 0) | FIELD_PREP_CONST(PHYPARAM0_TXPREEMPAMPTUNE, 0) | FIELD_PREP_CONST(PHYPARAM0_TXHSXVTUNE, 0) | FIELD_PREP_CONST(PHYPARAM0_TXFSLSTUNE, 3) | - FIELD_PREP_CONST(PHYPARAM0_SQRXTUNE, 6) | + FIELD_PREP_CONST(PHYPARAM0_SQRXTUNE, 5) | FIELD_PREP_CONST(PHYPARAM0_OTGTUNE, 2) | FIELD_PREP_CONST(PHYPARAM0_COMPDISTUNE, 3))), PHY_TUNING_ENTRY_LAST diff --git a/drivers/phy/spacemit/phy-k1-usb2.c b/drivers/phy/spacemit/phy-k1-usb2.c index 9215d0b223b2..e8c1e26428a9 100644 --- a/drivers/phy/spacemit/phy-k1-usb2.c +++ b/drivers/phy/spacemit/phy-k1-usb2.c @@ -97,7 +97,6 @@ static int spacemit_usb2phy_init(struct phy *phy) ret = clk_enable(sphy->clk); if (ret) { dev_err(&phy->dev, "failed to enable clock\n"); - clk_disable(sphy->clk); return ret; } diff --git a/drivers/phy/tegra/xusb-tegra186.c b/drivers/phy/tegra/xusb-tegra186.c index 1ddf11265974..60156aea2707 100644 --- a/drivers/phy/tegra/xusb-tegra186.c +++ b/drivers/phy/tegra/xusb-tegra186.c @@ -20,8 +20,8 @@ /* FUSE USB_CALIB registers */ #define HS_CURR_LEVEL_PADX_SHIFT(x) ((x) ? (11 + (x - 1) * 6) : 0) #define HS_CURR_LEVEL_PAD_MASK 0x3f -#define HS_TERM_RANGE_ADJ_SHIFT 7 -#define HS_TERM_RANGE_ADJ_MASK 0xf +#define HS_TERM_RANGE_ADJ_PADX_SHIFT(x) ((x) ? (5 + (x - 1) * 4) : 7) +#define HS_TERM_RANGE_ADJ_PAD_MASK 0xf #define HS_SQUELCH_SHIFT 29 #define HS_SQUELCH_MASK 0x7 @@ -253,7 +253,7 @@ struct tegra_xusb_fuse_calibration { u32 *hs_curr_level; u32 hs_squelch; - u32 hs_term_range_adj; + u32 *hs_term_range_adj; u32 rpd_ctrl; }; @@ -930,7 +930,7 @@ static int tegra186_utmi_phy_power_on(struct phy *phy) value = padctl_readl(padctl, XUSB_PADCTL_USB2_OTG_PADX_CTL1(index)); value &= ~TERM_RANGE_ADJ(~0); - value |= TERM_RANGE_ADJ(priv->calib.hs_term_range_adj); + value |= TERM_RANGE_ADJ(priv->calib.hs_term_range_adj[index]); value &= ~RPD_CTRL(~0); value |= RPD_CTRL(priv->calib.rpd_ctrl); padctl_writel(padctl, value, XUSB_PADCTL_USB2_OTG_PADX_CTL1(index)); @@ -1464,17 +1464,23 @@ static const char * const tegra186_usb3_functions[] = { static int tegra186_xusb_read_fuse_calibration(struct tegra186_xusb_padctl *padctl) { + const struct tegra_xusb_padctl_soc *soc = padctl->base.soc; struct device *dev = padctl->base.dev; unsigned int i, count; u32 value, *level; + u32 *hs_term_range_adj; int err; - count = padctl->base.soc->ports.usb2.count; + count = soc->ports.usb2.count; level = devm_kcalloc(dev, count, sizeof(u32), GFP_KERNEL); if (!level) return -ENOMEM; + hs_term_range_adj = devm_kcalloc(dev, count, sizeof(u32), GFP_KERNEL); + if (!hs_term_range_adj) + return -ENOMEM; + err = tegra_fuse_readl(TEGRA_FUSE_SKU_CALIB_0, &value); if (err) return dev_err_probe(dev, err, @@ -1490,8 +1496,8 @@ tegra186_xusb_read_fuse_calibration(struct tegra186_xusb_padctl *padctl) padctl->calib.hs_squelch = (value >> HS_SQUELCH_SHIFT) & HS_SQUELCH_MASK; - padctl->calib.hs_term_range_adj = (value >> HS_TERM_RANGE_ADJ_SHIFT) & - HS_TERM_RANGE_ADJ_MASK; + hs_term_range_adj[0] = (value >> HS_TERM_RANGE_ADJ_PADX_SHIFT(0)) & + HS_TERM_RANGE_ADJ_PAD_MASK; err = tegra_fuse_readl(TEGRA_FUSE_USB_CALIB_EXT_0, &value); if (err) { @@ -1503,6 +1509,17 @@ tegra186_xusb_read_fuse_calibration(struct tegra186_xusb_padctl *padctl) padctl->calib.rpd_ctrl = (value >> RPD_CTRL_SHIFT) & RPD_CTRL_MASK; + for (i = 1; i < count; i++) { + if (soc->has_per_pad_term) + hs_term_range_adj[i] = + (value >> HS_TERM_RANGE_ADJ_PADX_SHIFT(i)) & + HS_TERM_RANGE_ADJ_PAD_MASK; + else + hs_term_range_adj[i] = hs_term_range_adj[0]; + } + + padctl->calib.hs_term_range_adj = hs_term_range_adj; + return 0; } @@ -1708,6 +1725,7 @@ const struct tegra_xusb_padctl_soc tegra194_xusb_padctl_soc = { .num_supplies = ARRAY_SIZE(tegra194_xusb_padctl_supply_names), .supports_gen2 = true, .poll_trk_completed = true, + .has_per_pad_term = true, }; EXPORT_SYMBOL_GPL(tegra194_xusb_padctl_soc); @@ -1732,6 +1750,7 @@ const struct tegra_xusb_padctl_soc tegra234_xusb_padctl_soc = { .trk_hw_mode = false, .trk_update_on_idle = true, .supports_lp_cfg_en = true, + .has_per_pad_term = true, }; EXPORT_SYMBOL_GPL(tegra234_xusb_padctl_soc); #endif diff --git a/drivers/phy/tegra/xusb.h b/drivers/phy/tegra/xusb.h index cd277d0ed9e1..77609e54de66 100644 --- a/drivers/phy/tegra/xusb.h +++ b/drivers/phy/tegra/xusb.h @@ -435,6 +435,7 @@ struct tegra_xusb_padctl_soc { bool trk_hw_mode; bool trk_update_on_idle; bool supports_lp_cfg_en; + bool has_per_pad_term; }; struct tegra_xusb_padctl { diff --git a/drivers/pinctrl/mediatek/pinctrl-moore.c b/drivers/pinctrl/mediatek/pinctrl-moore.c index 70f608347a5f..071ba849e532 100644 --- a/drivers/pinctrl/mediatek/pinctrl-moore.c +++ b/drivers/pinctrl/mediatek/pinctrl-moore.c @@ -520,6 +520,23 @@ static int mtk_gpio_direction_output(struct gpio_chip *chip, unsigned int gpio, return pinctrl_gpio_direction_output(chip, gpio); } +static int mtk_gpio_get_direction(struct gpio_chip *chip, unsigned int offset) +{ + struct mtk_pinctrl *hw = gpiochip_get_data(chip); + const struct mtk_pin_desc *desc; + int ret, dir; + + desc = (const struct mtk_pin_desc *)&hw->soc->pins[offset]; + if (!desc->name) + return -ENOTSUPP; + + ret = mtk_hw_get_value(hw, desc, PINCTRL_PIN_REG_DIR, &dir); + if (ret) + return ret; + + return dir ? GPIO_LINE_DIRECTION_OUT : GPIO_LINE_DIRECTION_IN; +} + static int mtk_gpio_to_irq(struct gpio_chip *chip, unsigned int offset) { struct mtk_pinctrl *hw = gpiochip_get_data(chip); @@ -566,6 +583,7 @@ static int mtk_build_gpiochip(struct mtk_pinctrl *hw) chip->parent = hw->dev; chip->request = gpiochip_generic_request; chip->free = gpiochip_generic_free; + chip->get_direction = mtk_gpio_get_direction; chip->direction_input = pinctrl_gpio_direction_input; chip->direction_output = mtk_gpio_direction_output; chip->get = mtk_gpio_get; diff --git a/drivers/pinctrl/meson/pinctrl-amlogic-a4.c b/drivers/pinctrl/meson/pinctrl-amlogic-a4.c index e2293a872dcb..35d27626a336 100644 --- a/drivers/pinctrl/meson/pinctrl-amlogic-a4.c +++ b/drivers/pinctrl/meson/pinctrl-amlogic-a4.c @@ -292,7 +292,7 @@ static int aml_calc_reg_and_bit(struct pinctrl_gpio_range *range, static int aml_pinconf_get_pull(struct aml_pinctrl *info, unsigned int pin) { struct pinctrl_gpio_range *range = - pinctrl_find_gpio_range_from_pin(info->pctl, pin); + pinctrl_find_gpio_range_from_pin_nolock(info->pctl, pin); struct aml_gpio_bank *bank = gpio_chip_to_bank(range->gc); unsigned int reg, bit, val; int ret, conf; @@ -326,7 +326,7 @@ static int aml_pinconf_get_drive_strength(struct aml_pinctrl *info, u16 *drive_strength_ua) { struct pinctrl_gpio_range *range = - pinctrl_find_gpio_range_from_pin(info->pctl, pin); + pinctrl_find_gpio_range_from_pin_nolock(info->pctl, pin); struct aml_gpio_bank *bank = gpio_chip_to_bank(range->gc); unsigned int reg, bit; unsigned int val; @@ -365,7 +365,7 @@ static int aml_pinconf_get_gpio_bit(struct aml_pinctrl *info, unsigned int reg_type) { struct pinctrl_gpio_range *range = - pinctrl_find_gpio_range_from_pin(info->pctl, pin); + pinctrl_find_gpio_range_from_pin_nolock(info->pctl, pin); struct aml_gpio_bank *bank = gpio_chip_to_bank(range->gc); unsigned int reg, bit, val; int ret; diff --git a/drivers/pinctrl/qcom/pinctrl-ipq4019.c b/drivers/pinctrl/qcom/pinctrl-ipq4019.c index 6ede3149b6e1..07df812fb728 100644 --- a/drivers/pinctrl/qcom/pinctrl-ipq4019.c +++ b/drivers/pinctrl/qcom/pinctrl-ipq4019.c @@ -480,7 +480,7 @@ static const struct pinfunction ipq4019_functions[] = { QCA_PIN_FUNCTION(blsp_uart0), QCA_PIN_FUNCTION(blsp_uart1), QCA_PIN_FUNCTION(chip_rst), - QCA_PIN_FUNCTION(gpio), + QCA_GPIO_PIN_FUNCTION(gpio), QCA_PIN_FUNCTION(i2s_rx), QCA_PIN_FUNCTION(i2s_spdif_in), QCA_PIN_FUNCTION(i2s_spdif_out), diff --git a/drivers/pinctrl/qcom/pinctrl-msm.h b/drivers/pinctrl/qcom/pinctrl-msm.h index 4625fa5320a9..120217012a9f 100644 --- a/drivers/pinctrl/qcom/pinctrl-msm.h +++ b/drivers/pinctrl/qcom/pinctrl-msm.h @@ -39,6 +39,11 @@ struct pinctrl_pin_desc; fname##_groups, \ ARRAY_SIZE(fname##_groups)) +#define QCA_GPIO_PIN_FUNCTION(fname) \ + [qca_mux_##fname] = PINCTRL_GPIO_PINFUNCTION(#fname, \ + fname##_groups, \ + ARRAY_SIZE(fname##_groups)) + /** * struct msm_pingroup - Qualcomm pingroup definition * @grp: Generic data of the pin group (name and pins) diff --git a/drivers/pinctrl/qcom/pinctrl-qcs615.c b/drivers/pinctrl/qcom/pinctrl-qcs615.c index f1c827ddbfbf..4d474c312c10 100644 --- a/drivers/pinctrl/qcom/pinctrl-qcs615.c +++ b/drivers/pinctrl/qcom/pinctrl-qcs615.c @@ -1043,11 +1043,11 @@ static const struct msm_pingroup qcs615_groups[] = { static const struct msm_gpio_wakeirq_map qcs615_pdc_map[] = { { 1, 45 }, { 3, 31 }, { 7, 55 }, { 9, 110 }, { 11, 34 }, { 13, 33 }, { 14, 35 }, { 17, 46 }, { 19, 48 }, { 21, 83 }, - { 22, 36 }, { 26, 38 }, { 35, 37 }, { 39, 125 }, { 41, 47 }, - { 47, 49 }, { 48, 51 }, { 50, 52 }, { 51, 123 }, { 55, 56 }, + { 22, 36 }, { 26, 38 }, { 35, 37 }, { 39, 118 }, { 41, 47 }, + { 47, 49 }, { 48, 51 }, { 50, 52 }, { 51, 116 }, { 55, 56 }, { 56, 57 }, { 57, 58 }, { 60, 60 }, { 71, 54 }, { 80, 73 }, { 81, 64 }, { 82, 50 }, { 83, 65 }, { 84, 92 }, { 85, 99 }, - { 86, 67 }, { 87, 84 }, { 88, 124 }, { 89, 122 }, { 90, 69 }, + { 86, 67 }, { 87, 84 }, { 88, 117 }, { 89, 115 }, { 90, 69 }, { 92, 88 }, { 93, 75 }, { 94, 91 }, { 95, 72 }, { 96, 82 }, { 97, 74 }, { 98, 95 }, { 99, 94 }, { 100, 100 }, { 101, 40 }, { 102, 93 }, { 103, 77 }, { 104, 78 }, { 105, 96 }, { 107, 97 }, diff --git a/drivers/pinctrl/qcom/pinctrl-sm8150.c b/drivers/pinctrl/qcom/pinctrl-sm8150.c index ad861cd66958..e4c561a9c50a 100644 --- a/drivers/pinctrl/qcom/pinctrl-sm8150.c +++ b/drivers/pinctrl/qcom/pinctrl-sm8150.c @@ -1496,18 +1496,18 @@ static const struct msm_gpio_wakeirq_map sm8150_pdc_map[] = { { 3, 31 }, { 5, 32 }, { 8, 33 }, { 9, 34 }, { 10, 100 }, { 12, 104 }, { 24, 37 }, { 26, 38 }, { 27, 41 }, { 28, 42 }, { 30, 39 }, { 36, 43 }, { 37, 44 }, { 38, 30 }, { 39, 118 }, - { 39, 125 }, { 41, 47 }, { 42, 48 }, { 46, 50 }, { 47, 49 }, - { 48, 51 }, { 49, 53 }, { 50, 52 }, { 51, 116 }, { 51, 123 }, + { 41, 47 }, { 42, 48 }, { 46, 50 }, { 47, 49 }, + { 48, 51 }, { 49, 53 }, { 50, 52 }, { 51, 116 }, { 53, 54 }, { 54, 55 }, { 55, 56 }, { 56, 57 }, { 58, 58 }, { 60, 60 }, { 61, 61 }, { 68, 62 }, { 70, 63 }, { 76, 71 }, { 77, 66 }, { 81, 64 }, { 83, 65 }, { 86, 67 }, { 87, 84 }, - { 88, 117 }, { 88, 124 }, { 90, 69 }, { 91, 70 }, { 93, 75 }, + { 88, 117 }, { 90, 69 }, { 91, 70 }, { 93, 75 }, { 95, 72 }, { 96, 73 }, { 97, 74 }, { 101, 40 }, { 103, 77 }, { 104, 78 }, { 108, 79 }, { 112, 80 }, { 113, 81 }, { 114, 82 }, { 117, 85 }, { 118, 101 }, { 119, 87 }, { 120, 88 }, { 121, 89 }, { 122, 90 }, { 123, 91 }, { 124, 92 }, { 125, 93 }, { 129, 94 }, { 132, 105 }, { 133, 83 }, { 134, 36 }, { 136, 97 }, { 142, 103 }, - { 144, 115 }, { 144, 122 }, { 147, 102 }, { 150, 107 }, + { 144, 115 }, { 147, 102 }, { 150, 107 }, { 152, 108 }, { 153, 109 } }; diff --git a/drivers/pinctrl/renesas/pinctrl-rzg2l.c b/drivers/pinctrl/renesas/pinctrl-rzg2l.c index 55e35f63343c..99008ec3deb0 100644 --- a/drivers/pinctrl/renesas/pinctrl-rzg2l.c +++ b/drivers/pinctrl/renesas/pinctrl-rzg2l.c @@ -335,7 +335,7 @@ struct rzg2l_pinctrl_reg_cache { u32 *iolh[2]; u32 *ien[2]; u32 *pupd[2]; - u32 *smt; + u32 *smt[2]; u8 sd_ch[2]; u8 eth_poc[2]; u8 oen; @@ -2738,10 +2738,6 @@ static int rzg2l_pinctrl_reg_cache_alloc(struct rzg2l_pinctrl *pctrl) if (!cache->pfc) return -ENOMEM; - cache->smt = devm_kcalloc(pctrl->dev, nports, sizeof(*cache->smt), GFP_KERNEL); - if (!cache->smt) - return -ENOMEM; - for (u8 i = 0; i < 2; i++) { u32 n_dedicated_pins = pctrl->data->n_dedicated_pins; @@ -2760,6 +2756,11 @@ static int rzg2l_pinctrl_reg_cache_alloc(struct rzg2l_pinctrl *pctrl) if (!cache->pupd[i]) return -ENOMEM; + cache->smt[i] = devm_kcalloc(pctrl->dev, nports, sizeof(*cache->smt[i]), + GFP_KERNEL); + if (!cache->smt[i]) + return -ENOMEM; + /* Allocate dedicated cache. */ dedicated_cache->iolh[i] = devm_kcalloc(pctrl->dev, n_dedicated_pins, sizeof(*dedicated_cache->iolh[i]), @@ -3050,7 +3051,7 @@ static void rzg2l_pinctrl_pm_setup_regs(struct rzg2l_pinctrl *pctrl, bool suspen RZG2L_PCTRL_REG_ACCESS32(suspend, pctrl->base + PUPD(off), cache->pupd[0][port]); if (pincnt >= 4) { - RZG2L_PCTRL_REG_ACCESS32(suspend, pctrl->base + PUPD(off), + RZG2L_PCTRL_REG_ACCESS32(suspend, pctrl->base + PUPD(off) + 4, cache->pupd[1][port]); } } @@ -3067,8 +3068,14 @@ static void rzg2l_pinctrl_pm_setup_regs(struct rzg2l_pinctrl *pctrl, bool suspen } } - if (has_smt) - RZG2L_PCTRL_REG_ACCESS32(suspend, pctrl->base + SMT(off), cache->smt[port]); + if (has_smt) { + RZG2L_PCTRL_REG_ACCESS32(suspend, pctrl->base + SMT(off), + cache->smt[0][port]); + if (pincnt >= 4) { + RZG2L_PCTRL_REG_ACCESS32(suspend, pctrl->base + SMT(off) + 4, + cache->smt[1][port]); + } + } } } diff --git a/drivers/platform/surface/surface_aggregator_registry.c b/drivers/platform/surface/surface_aggregator_registry.c index 0599d5adf02e..f0881edfb616 100644 --- a/drivers/platform/surface/surface_aggregator_registry.c +++ b/drivers/platform/surface/surface_aggregator_registry.c @@ -295,8 +295,6 @@ static const struct software_node *ssam_node_group_sl6[] = { /* Devices for Surface Laptop 7. */ static const struct software_node *ssam_node_group_sl7[] = { &ssam_node_root, - &ssam_node_bat_ac, - &ssam_node_bat_main, &ssam_node_tmp_perf_profile_with_fan, &ssam_node_fan_speed, &ssam_node_hid_sam_keyboard, diff --git a/drivers/platform/x86/adv_swbutton.c b/drivers/platform/x86/adv_swbutton.c index 6fa60f3fc53c..8f7a26e6de81 100644 --- a/drivers/platform/x86/adv_swbutton.c +++ b/drivers/platform/x86/adv_swbutton.c @@ -48,10 +48,14 @@ static int adv_swbutton_probe(struct platform_device *device) { struct adv_swbutton *button; struct input_dev *input; - acpi_handle handle = ACPI_HANDLE(&device->dev); + acpi_handle handle; acpi_status status; int error; + handle = ACPI_HANDLE(&device->dev); + if (!handle) + return -ENODEV; + button = devm_kzalloc(&device->dev, sizeof(*button), GFP_KERNEL); if (!button) return -ENOMEM; diff --git a/drivers/platform/x86/asus-armoury.c b/drivers/platform/x86/asus-armoury.c index 5b0987ccc270..495dc1e31d40 100644 --- a/drivers/platform/x86/asus-armoury.c +++ b/drivers/platform/x86/asus-armoury.c @@ -370,7 +370,7 @@ static ssize_t mini_led_mode_current_value_show(struct kobject *kobj, if (err) return err; - mode = FIELD_GET(ASUS_MINI_LED_MODE_MASK, 0); + mode = FIELD_GET(ASUS_MINI_LED_MODE_MASK, mode); for (i = 0; i < mini_led_mode_map_size; i++) if (mode == mini_led_mode_map[i]) @@ -386,6 +386,7 @@ static ssize_t mini_led_mode_current_value_store(struct kobject *kobj, { u32 *mini_led_mode_map; size_t mini_led_mode_map_size; + char mapped_value[12]; u32 mode; int err; @@ -414,9 +415,16 @@ static ssize_t mini_led_mode_current_value_store(struct kobject *kobj, return -ENODEV; } - return armoury_attr_uint_store(kobj, attr, buf, count, - 0, mini_led_mode_map[mode], - NULL, asus_armoury.mini_led_dev_id); + /* + * armoury_attr_uint_store() parses and sends the value from the + * passed buffer; hand it the mapped firmware value so the device + * receives the translated mode instead of the raw index. + */ + snprintf(mapped_value, sizeof(mapped_value), "%u", mini_led_mode_map[mode]); + + return armoury_attr_uint_store(kobj, attr, mapped_value, count, 0, + mini_led_mode_map[mode], NULL, + asus_armoury.mini_led_dev_id); } static ssize_t mini_led_mode_possible_values_show(struct kobject *kobj, diff --git a/drivers/platform/x86/hp/hp_accel.c b/drivers/platform/x86/hp/hp_accel.c index 10d5af18d639..39b73dc473f1 100644 --- a/drivers/platform/x86/hp/hp_accel.c +++ b/drivers/platform/x86/hp/hp_accel.c @@ -300,6 +300,9 @@ static int lis3lv02d_probe(struct platform_device *device) int ret; lis3_dev.bus_priv = ACPI_COMPANION(&device->dev); + if (!lis3_dev.bus_priv) + return -ENODEV; + lis3_dev.init = lis3lv02d_acpi_init; lis3_dev.read = lis3lv02d_acpi_read; lis3_dev.write = lis3lv02d_acpi_write; diff --git a/drivers/platform/x86/intel/hid.c b/drivers/platform/x86/intel/hid.c index 2ddd8af8c1ce..085093506dda 100644 --- a/drivers/platform/x86/intel/hid.c +++ b/drivers/platform/x86/intel/hid.c @@ -688,12 +688,16 @@ static bool button_array_present(struct platform_device *device) static int intel_hid_probe(struct platform_device *device) { - acpi_handle handle = ACPI_HANDLE(&device->dev); unsigned long long mode, dummy; struct intel_hid_priv *priv; + acpi_handle handle; acpi_status status; int err; + handle = ACPI_HANDLE(&device->dev); + if (!handle) + return -ENODEV; + intel_hid_init_dsm(handle); if (!intel_hid_evaluate_method(handle, INTEL_HID_DSM_HDMM_FN, &mode)) { diff --git a/drivers/platform/x86/intel/int1092/intel_sar.c b/drivers/platform/x86/intel/int1092/intel_sar.c index 88822023a149..849f7b415c1e 100644 --- a/drivers/platform/x86/intel/int1092/intel_sar.c +++ b/drivers/platform/x86/intel/int1092/intel_sar.c @@ -245,15 +245,20 @@ static void sar_get_data(int reg, struct wwan_sar_context *context) static int sar_probe(struct platform_device *device) { struct wwan_sar_context *context; + acpi_handle handle; int reg; int result; + handle = ACPI_HANDLE(&device->dev); + if (!handle) + return -ENODEV; + context = kzalloc_obj(*context); if (!context) return -ENOMEM; context->sar_device = device; - context->handle = ACPI_HANDLE(&device->dev); + context->handle = handle; dev_set_drvdata(&device->dev, context); result = guid_parse(SAR_DSM_UUID, &context->guid); diff --git a/drivers/platform/x86/intel/vbtn.c b/drivers/platform/x86/intel/vbtn.c index 9ca87e707582..874023c38fd1 100644 --- a/drivers/platform/x86/intel/vbtn.c +++ b/drivers/platform/x86/intel/vbtn.c @@ -275,12 +275,16 @@ static bool intel_vbtn_has_switches(acpi_handle handle, bool dual_accel) static int intel_vbtn_probe(struct platform_device *device) { - acpi_handle handle = ACPI_HANDLE(&device->dev); bool dual_accel, has_buttons, has_switches; struct intel_vbtn_priv *priv; + acpi_handle handle; acpi_status status; int err; + handle = ACPI_HANDLE(&device->dev); + if (!handle) + return -ENODEV; + dual_accel = dual_accel_detect(); has_buttons = acpi_has_method(handle, "VBDL"); has_switches = intel_vbtn_has_switches(handle, dual_accel); diff --git a/drivers/platform/x86/uniwill/uniwill-acpi.c b/drivers/platform/x86/uniwill/uniwill-acpi.c index 6341dca20b76..bcd25d08f56b 100644 --- a/drivers/platform/x86/uniwill/uniwill-acpi.c +++ b/drivers/platform/x86/uniwill/uniwill-acpi.c @@ -1193,6 +1193,16 @@ static int uniwill_led_init(struct uniwill_data *data) &init_data); } +static unsigned int uniwill_sanitize_battery_threshold(unsigned int value) +{ + /* 0 means "charging threshold not active" */ + if (!value) + return 100; + + /* Guard against invalid values */ + return min(value, 100); +} + static int uniwill_get_property(struct power_supply *psy, const struct power_supply_ext *ext, void *drvdata, enum power_supply_property psp, union power_supply_propval *val) @@ -1239,7 +1249,8 @@ static int uniwill_get_property(struct power_supply *psy, const struct power_sup if (ret < 0) return ret; - val->intval = clamp_val(FIELD_GET(CHARGE_CTRL_MASK, regval), 0, 100); + regval = FIELD_GET(CHARGE_CTRL_MASK, regval); + val->intval = uniwill_sanitize_battery_threshold(regval); return 0; default: return -EINVAL; @@ -1254,11 +1265,11 @@ static int uniwill_set_property(struct power_supply *psy, const struct power_sup switch (psp) { case POWER_SUPPLY_PROP_CHARGE_CONTROL_END_THRESHOLD: - if (val->intval < 1 || val->intval > 100) + if (val->intval < 0 || val->intval > 100) return -EINVAL; return regmap_update_bits(data->regmap, EC_ADDR_CHARGE_CTRL, CHARGE_CTRL_MASK, - val->intval); + max(val->intval, 1)); default: return -EINVAL; } @@ -1334,11 +1345,33 @@ static int uniwill_remove_battery(struct power_supply *battery, struct acpi_batt static int uniwill_battery_init(struct uniwill_data *data) { + unsigned int value, threshold, sanitized; int ret; if (!uniwill_device_supports(data, UNIWILL_FEATURE_BATTERY)) return 0; + ret = regmap_read(data->regmap, EC_ADDR_CHARGE_CTRL, &value); + if (ret < 0) + return ret; + + /* + * The charge control threshold might be initialized with 0 by + * the EC to signal that said threshold is uninitialized. We thus + * need to replace this placeholder value with a valid one (100) + * to signal that we want to take control of battery charging. + * For the sake of completeness we also apply this to other + * invalid threshold values. + */ + threshold = FIELD_GET(CHARGE_CTRL_MASK, value); + sanitized = uniwill_sanitize_battery_threshold(threshold); + if (threshold != sanitized) { + FIELD_MODIFY(CHARGE_CTRL_MASK, &value, sanitized); + ret = regmap_write(data->regmap, EC_ADDR_CHARGE_CTRL, value); + if (ret < 0) + return ret; + } + ret = devm_mutex_init(data->dev, &data->battery_lock); if (ret < 0) return ret; @@ -2156,8 +2189,6 @@ static int __init uniwill_init(void) if (!force) return -ENODEV; - /* Assume that the device supports all features */ - device_descriptor.features = UINT_MAX; pr_warn("Loading on a potentially unsupported device\n"); } else { /* @@ -2175,6 +2206,12 @@ static int __init uniwill_init(void) device_descriptor = *descriptor; } + if (force) { + /* Assume that the device supports all features except the charge limit */ + device_descriptor.features = UINT_MAX & ~UNIWILL_FEATURE_BATTERY; + pr_warn("Enabling potentially unsupported features\n"); + } + ret = platform_driver_register(&uniwill_driver); if (ret < 0) return ret; diff --git a/drivers/regulator/tps65219-regulator.c b/drivers/regulator/tps65219-regulator.c index d77ca486879f..324c3a33af8a 100644 --- a/drivers/regulator/tps65219-regulator.c +++ b/drivers/regulator/tps65219-regulator.c @@ -346,8 +346,9 @@ static irqreturn_t tps65219_regulator_irq_handler(int irq, void *data) return IRQ_HANDLED; } - regulator_notifier_call_chain(irq_data->rdev, - irq_data->type->event, NULL); + if (irq_data->rdev) + regulator_notifier_call_chain(irq_data->rdev, + irq_data->type->event, NULL); dev_err(irq_data->dev, "Error IRQ trap %s for %s\n", irq_data->type->event_name, irq_data->type->regulator_name); @@ -398,14 +399,65 @@ static struct tps65219_chip_data chip_info_table[] = { }, }; -static int tps65219_regulator_probe(struct platform_device *pdev) +static bool tps65219_is_regulator_name(const struct tps65219_chip_data *pmic, + const char *name) +{ + int i; + + for (i = 0; i < pmic->common_rdesc_size; i++) + if (!strcmp(pmic->common_rdesc[i].name, name)) + return true; + for (i = 0; i < pmic->rdesc_size; i++) + if (!strcmp(pmic->rdesc[i].name, name)) + return true; + return false; +} + +static int tps65219_register_irqs(struct platform_device *pdev, + struct tps65219 *tps, + struct regulator_dev *rdev, + struct tps65219_regulator_irq_type *irq_types, + int nirqs, + const char *regulator_name) { struct tps65219_regulator_irq_data *irq_data; + int i, irq, error; + + for (i = 0; i < nirqs; i++) { + if (strcmp(irq_types[i].regulator_name, regulator_name)) + continue; + + irq = platform_get_irq_byname(pdev, irq_types[i].irq_name); + if (irq < 0) + return -EINVAL; + + irq_data = devm_kmalloc(tps->dev, sizeof(*irq_data), GFP_KERNEL); + if (!irq_data) + return -ENOMEM; + + irq_data->dev = tps->dev; + irq_data->type = &irq_types[i]; + irq_data->rdev = rdev; + + error = devm_request_threaded_irq(tps->dev, irq, NULL, + tps65219_regulator_irq_handler, + IRQF_ONESHOT, + irq_types[i].irq_name, + irq_data); + if (error) + return dev_err_probe(tps->dev, error, + "Failed to request %s IRQ %d\n", + irq_types[i].irq_name, irq); + } + return 0; +} + +static int tps65219_regulator_probe(struct platform_device *pdev) +{ struct tps65219_regulator_irq_type *irq_type; struct tps65219_chip_data *pmic; struct regulator_dev *rdev; int error; - int irq; int i; struct tps65219 *tps = dev_get_drvdata(pdev->dev.parent); @@ -425,6 +477,19 @@ static int tps65219_regulator_probe(struct platform_device *pdev) return dev_err_probe(tps->dev, PTR_ERR(rdev), "Failed to register %s regulator\n", pmic->common_rdesc[i].name); + + error = tps65219_register_irqs(pdev, tps, rdev, + pmic->common_irq_types, + pmic->common_irq_size, + pmic->common_rdesc[i].name); + if (error) + return error; + error = tps65219_register_irqs(pdev, tps, rdev, + pmic->irq_types, + pmic->dev_irq_size, + pmic->common_rdesc[i].name); + if (error) + return error; } for (i = 0; i < pmic->rdesc_size; i++) { @@ -434,52 +499,42 @@ static int tps65219_regulator_probe(struct platform_device *pdev) return dev_err_probe(tps->dev, PTR_ERR(rdev), "Failed to register %s regulator\n", pmic->rdesc[i].name); + + error = tps65219_register_irqs(pdev, tps, rdev, + pmic->common_irq_types, + pmic->common_irq_size, + pmic->rdesc[i].name); + if (error) + return error; + error = tps65219_register_irqs(pdev, tps, rdev, + pmic->irq_types, + pmic->dev_irq_size, + pmic->rdesc[i].name); + if (error) + return error; } + /* Register non-regulator IRQs (TIMEOUT, SENSOR) with rdev=NULL */ for (i = 0; i < pmic->common_irq_size; ++i) { irq_type = &pmic->common_irq_types[i]; - irq = platform_get_irq_byname(pdev, irq_type->irq_name); - if (irq < 0) - return -EINVAL; - - irq_data = devm_kmalloc(tps->dev, sizeof(*irq_data), GFP_KERNEL); - if (!irq_data) - return -ENOMEM; - - irq_data->dev = tps->dev; - irq_data->type = irq_type; - error = devm_request_threaded_irq(tps->dev, irq, NULL, - tps65219_regulator_irq_handler, - IRQF_ONESHOT, - irq_type->irq_name, - irq_data); + if (tps65219_is_regulator_name(pmic, irq_type->regulator_name)) + continue; + error = tps65219_register_irqs(pdev, tps, NULL, + irq_type, 1, + irq_type->regulator_name); if (error) - return dev_err_probe(tps->dev, error, - "Failed to request %s IRQ %d\n", - irq_type->irq_name, irq); + return error; } for (i = 0; i < pmic->dev_irq_size; ++i) { irq_type = &pmic->irq_types[i]; - irq = platform_get_irq_byname(pdev, irq_type->irq_name); - if (irq < 0) - return -EINVAL; - - irq_data = devm_kmalloc(tps->dev, sizeof(*irq_data), GFP_KERNEL); - if (!irq_data) - return -ENOMEM; - - irq_data->dev = tps->dev; - irq_data->type = irq_type; - error = devm_request_threaded_irq(tps->dev, irq, NULL, - tps65219_regulator_irq_handler, - IRQF_ONESHOT, - irq_type->irq_name, - irq_data); + if (tps65219_is_regulator_name(pmic, irq_type->regulator_name)) + continue; + error = tps65219_register_irqs(pdev, tps, NULL, + irq_type, 1, + irq_type->regulator_name); if (error) - return dev_err_probe(tps->dev, error, - "Failed to request %s IRQ %d\n", - irq_type->irq_name, irq); + return error; } return 0; diff --git a/drivers/s390/cio/chsc.c b/drivers/s390/cio/chsc.c index fbb58edd6274..9689f722c863 100644 --- a/drivers/s390/cio/chsc.c +++ b/drivers/s390/cio/chsc.c @@ -1142,8 +1142,8 @@ int __init chsc_init(void) { int ret; - sei_page = (void *)get_zeroed_page(GFP_KERNEL); - chsc_page = (void *)get_zeroed_page(GFP_KERNEL); + sei_page = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); + chsc_page = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!sei_page || !chsc_page) { ret = -ENOMEM; goto out_err; diff --git a/drivers/s390/cio/chsc_sch.c b/drivers/s390/cio/chsc_sch.c index 73413417a2ce..b6cb8bb8bcc4 100644 --- a/drivers/s390/cio/chsc_sch.c +++ b/drivers/s390/cio/chsc_sch.c @@ -292,7 +292,7 @@ static int chsc_ioctl_start(void __user *user_area) if (!css_general_characteristics.dynio) /* It makes no sense to try. */ return -EOPNOTSUPP; - chsc_area = (void *)get_zeroed_page(GFP_KERNEL); + chsc_area = (void *)get_zeroed_page(GFP_DMA | GFP_KERNEL); if (!chsc_area) return -ENOMEM; request = kzalloc_obj(*request); @@ -340,7 +340,7 @@ static int chsc_ioctl_on_close_set(void __user *user_area) ret = -ENOMEM; goto out_unlock; } - on_close_chsc_area = (void *)get_zeroed_page(GFP_KERNEL); + on_close_chsc_area = (void *)get_zeroed_page(GFP_DMA | GFP_KERNEL); if (!on_close_chsc_area) { ret = -ENOMEM; goto out_free_request; @@ -392,7 +392,7 @@ static int chsc_ioctl_start_sync(void __user *user_area) struct chsc_sync_area *chsc_area; int ret, ccode; - chsc_area = (void *)get_zeroed_page(GFP_KERNEL); + chsc_area = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!chsc_area) return -ENOMEM; if (copy_from_user(chsc_area, user_area, PAGE_SIZE)) { @@ -438,7 +438,7 @@ static int chsc_ioctl_info_channel_path(void __user *user_cd) u8 data[PAGE_SIZE - 20]; } __attribute__ ((packed)) *scpcd_area; - scpcd_area = (void *)get_zeroed_page(GFP_KERNEL); + scpcd_area = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!scpcd_area) return -ENOMEM; cd = kzalloc_obj(*cd); @@ -500,7 +500,7 @@ static int chsc_ioctl_info_cu(void __user *user_cd) u8 data[PAGE_SIZE - 20]; } __attribute__ ((packed)) *scucd_area; - scucd_area = (void *)get_zeroed_page(GFP_KERNEL); + scucd_area = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!scucd_area) return -ENOMEM; cd = kzalloc_obj(*cd); @@ -563,7 +563,7 @@ static int chsc_ioctl_info_sch_cu(void __user *user_cud) u8 data[PAGE_SIZE - 20]; } __attribute__ ((packed)) *sscud_area; - sscud_area = (void *)get_zeroed_page(GFP_KERNEL); + sscud_area = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!sscud_area) return -ENOMEM; cud = kzalloc_obj(*cud); @@ -625,7 +625,7 @@ static int chsc_ioctl_conf_info(void __user *user_ci) u8 data[PAGE_SIZE - 20]; } __attribute__ ((packed)) *sci_area; - sci_area = (void *)get_zeroed_page(GFP_KERNEL); + sci_area = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!sci_area) return -ENOMEM; ci = kzalloc_obj(*ci); @@ -696,7 +696,7 @@ static int chsc_ioctl_conf_comp_list(void __user *user_ccl) u32 res; } __attribute__ ((packed)) *cssids_parm; - sccl_area = (void *)get_zeroed_page(GFP_KERNEL); + sccl_area = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!sccl_area) return -ENOMEM; ccl = kzalloc_obj(*ccl); @@ -756,7 +756,7 @@ static int chsc_ioctl_chpd(void __user *user_chpd) int ret; chpd = kzalloc_obj(*chpd); - scpd_area = (void *)get_zeroed_page(GFP_KERNEL); + scpd_area = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!scpd_area || !chpd) { ret = -ENOMEM; goto out_free; @@ -796,7 +796,7 @@ static int chsc_ioctl_dcal(void __user *user_dcal) u8 data[PAGE_SIZE - 36]; } __attribute__ ((packed)) *sdcal_area; - sdcal_area = (void *)get_zeroed_page(GFP_KERNEL); + sdcal_area = (void *)get_zeroed_page(GFP_KERNEL | GFP_DMA); if (!sdcal_area) return -ENOMEM; dcal = kzalloc_obj(*dcal); diff --git a/drivers/s390/cio/scm.c b/drivers/s390/cio/scm.c index d13ed1011c03..171212a6d2d9 100644 --- a/drivers/s390/cio/scm.c +++ b/drivers/s390/cio/scm.c @@ -229,7 +229,7 @@ int scm_update_information(void) size_t num; int ret; - scm_info = (void *)__get_free_page(GFP_KERNEL); + scm_info = (void *)__get_free_page(GFP_KERNEL | GFP_DMA); if (!scm_info) return -ENOMEM; diff --git a/drivers/scsi/isci/host.c b/drivers/scsi/isci/host.c index 6d2f4c831df7..ff199bab5d1a 100644 --- a/drivers/scsi/isci/host.c +++ b/drivers/scsi/isci/host.c @@ -1252,6 +1252,9 @@ void isci_host_deinit(struct isci_host *ihost) wait_for_stop(ihost); + /* No further IRQ-driven scheduling can happen past wait_for_stop(). */ + tasklet_kill(&ihost->completion_tasklet); + /* phy stop is after controller stop to allow port and device to * go idle before shutting down the phys, but the expectation is * that i/o has been shut off well before we reach this diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index aba22060fcd5..79136d943595 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -2430,8 +2430,7 @@ sd_spinup_disk(struct scsi_disk *sdkp) { static const u8 cmd[10] = { TEST_UNIT_READY }; unsigned long spintime_expire = 0; - int spintime, sense_valid = 0; - unsigned int the_result; + int the_result, spintime, sense_valid = 0; struct scsi_sense_hdr sshdr; struct scsi_failure failure_defs[] = { /* Do not retry Medium Not Present */ diff --git a/drivers/spi/spi-amd.c b/drivers/spi/spi-amd.c index 4d1dce4f4974..71a6e5c475b0 100644 --- a/drivers/spi/spi-amd.c +++ b/drivers/spi/spi-amd.c @@ -868,7 +868,7 @@ static int amd_spi_probe(struct platform_device *pdev) dev_dbg(dev, "io_remap_address: %p\n", amd_spi->io_remap_addr); amd_spi->version = (uintptr_t)device_get_match_data(dev); - host->bus_num = 0; + host->bus_num = (amd_spi->version == AMD_HID2_SPI) ? 2 : 0; return amd_spi_probe_common(dev, host); } diff --git a/drivers/spi/spi-ep93xx.c b/drivers/spi/spi-ep93xx.c index db50018050e5..f716c9607be4 100644 --- a/drivers/spi/spi-ep93xx.c +++ b/drivers/spi/spi-ep93xx.c @@ -582,12 +582,14 @@ static int ep93xx_spi_setup_dma(struct device *dev, struct ep93xx_spi *espi) espi->dma_rx = dma_request_chan(dev, "rx"); if (IS_ERR(espi->dma_rx)) { ret = dev_err_probe(dev, PTR_ERR(espi->dma_rx), "rx DMA setup failed"); + espi->dma_rx = NULL; goto fail_free_page; } espi->dma_tx = dma_request_chan(dev, "tx"); if (IS_ERR(espi->dma_tx)) { ret = dev_err_probe(dev, PTR_ERR(espi->dma_tx), "tx DMA setup failed"); + espi->dma_tx = NULL; goto fail_release_rx; } diff --git a/drivers/spi/spi-mtk-snfi.c b/drivers/spi/spi-mtk-snfi.c index 73fa84475f0e..7725748cab2a 100644 --- a/drivers/spi/spi-mtk-snfi.c +++ b/drivers/spi/spi-mtk-snfi.c @@ -961,7 +961,7 @@ static int mtk_snand_read_page_cache(struct mtk_snand *snf, &snf->op_done, usecs_to_jiffies(SNFI_POLL_INTERVAL))) { dev_err(snf->dev, "DMA timed out for reading from cache.\n"); ret = -ETIMEDOUT; - goto cleanup; + goto cleanup2; } // Wait for BUS_SEC_CNTR returning expected value diff --git a/drivers/spi/spi-qup.c b/drivers/spi/spi-qup.c index 45d9b4cb75e4..50bb7701b9d5 100644 --- a/drivers/spi/spi-qup.c +++ b/drivers/spi/spi-qup.c @@ -996,8 +996,11 @@ static int spi_qup_init_dma(struct spi_controller *host, resource_size_t base) err: dma_release_channel(host->dma_tx); + host->dma_tx = NULL; err_tx: dma_release_channel(host->dma_rx); + host->dma_rx = NULL; + return ret; } diff --git a/drivers/spi/spi-sprd.c b/drivers/spi/spi-sprd.c index fd3fd0ce122c..acebf9c2e795 100644 --- a/drivers/spi/spi-sprd.c +++ b/drivers/spi/spi-sprd.c @@ -991,7 +991,8 @@ err_rpm_put: disable_clk: clk_disable_unprepare(ss->clk); release_dma: - sprd_spi_dma_release(ss); + if (ss->dma.enable) + sprd_spi_dma_release(ss); free_controller: spi_controller_put(sctlr); diff --git a/drivers/spi/spi-ti-qspi.c b/drivers/spi/spi-ti-qspi.c index 1fbd710d616f..e3b413b9828c 100644 --- a/drivers/spi/spi-ti-qspi.c +++ b/drivers/spi/spi-ti-qspi.c @@ -867,6 +867,7 @@ static int ti_qspi_probe(struct platform_device *pdev) dev_err(qspi->dev, "dma_alloc_coherent failed, using PIO mode\n"); dma_release_channel(qspi->rx_chan); + qspi->rx_chan = NULL; goto no_dma; } host->dma_rx = qspi->rx_chan; diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c index b1d658b8f7b5..05385b63e2ba 100644 --- a/drivers/vfio/pci/vfio_pci_dmabuf.c +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c @@ -232,9 +232,11 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, return -EINVAL; /* - * For PCI the region_index is the BAR number like everything else. + * For PCI the region_index is the BAR number like everything + * else. Check that PCI resources have been claimed for it. */ - if (get_dma_buf.region_index >= VFIO_PCI_ROM_REGION_INDEX) + if (get_dma_buf.region_index >= VFIO_PCI_ROM_REGION_INDEX || + vfio_pci_core_setup_barmap(vdev, get_dma_buf.region_index)) return -ENODEV; dma_ranges = memdup_array_user(&arg->dma_ranges, get_dma_buf.nr_ranges, diff --git a/drivers/virt/coco/sev-guest/sev-guest.c b/drivers/virt/coco/sev-guest/sev-guest.c index 910a1de0d5a7..d186ae55cf63 100644 --- a/drivers/virt/coco/sev-guest/sev-guest.c +++ b/drivers/virt/coco/sev-guest/sev-guest.c @@ -176,6 +176,7 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques struct snp_guest_req req = {}; int ret, npages = 0, resp_len; sockptr_t certs_address; + u64 pfn; if (sockptr_is_null(io->req_data) || sockptr_is_null(io->resp_data)) return -EINVAL; @@ -215,10 +216,11 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques if (!req.certs_data) return -ENOMEM; + pfn = PHYS_PFN(virt_to_phys(req.certs_data)); ret = set_memory_decrypted((unsigned long)req.certs_data, npages); if (ret) { pr_err("failed to mark page shared, ret=%d\n", ret); - free_pages_exact(req.certs_data, npages << PAGE_SHIFT); + snp_leak_pages(pfn, npages); return -EFAULT; } @@ -272,10 +274,12 @@ e_free: kfree(report_resp); e_free_data: if (npages) { - if (set_memory_encrypted((unsigned long)req.certs_data, npages)) + if (set_memory_encrypted((unsigned long)req.certs_data, npages)) { WARN_ONCE(ret, "failed to restore encryption mask (leak it)\n"); - else + snp_leak_pages(pfn, npages); + } else { free_pages_exact(req.certs_data, npages << PAGE_SHIFT); + } } return ret; } diff --git a/fs/9p/v9fs_vfs.h b/fs/9p/v9fs_vfs.h index d3aefbec4de6..34c115d7c250 100644 --- a/fs/9p/v9fs_vfs.h +++ b/fs/9p/v9fs_vfs.h @@ -75,17 +75,4 @@ static inline void v9fs_invalidate_inode_attr(struct inode *inode) int v9fs_open_to_dotl_flags(int flags); -static inline void v9fs_i_size_write(struct inode *inode, loff_t i_size) -{ - /* - * 32-bit need the lock, concurrent updates could break the - * sequences and make i_size_read() loop forever. - * 64-bit updates are atomic and can skip the locking. - */ - if (sizeof(i_size) > sizeof(long)) - spin_lock(&inode->i_lock); - i_size_write(inode, i_size); - if (sizeof(i_size) > sizeof(long)) - spin_unlock(&inode->i_lock); -} #endif diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index 97abe65bf7c1..a0a5aec8e5d5 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -1141,11 +1141,13 @@ v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, mode |= inode->i_mode & ~S_IALLUGO; inode->i_mode = mode; - v9inode->netfs.remote_i_size = stat->length; + spin_lock(&inode->i_lock); + netfs_write_remote_i_size(inode, stat->length); if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE)) - v9fs_i_size_write(inode, stat->length); + i_size_write(inode, stat->length); /* not real number of blocks, but 512 byte ones ... */ inode->i_blocks = (stat->length + 512 - 1) >> 9; + spin_unlock(&inode->i_lock); v9inode->cache_validity &= ~V9FS_INO_INVALID_ATTR; } diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c index 643e759eacb2..d800f4fad555 100644 --- a/fs/9p/vfs_inode_dotl.c +++ b/fs/9p/vfs_inode_dotl.c @@ -634,10 +634,12 @@ v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode, mode |= inode->i_mode & ~S_IALLUGO; inode->i_mode = mode; - v9inode->netfs.remote_i_size = stat->st_size; + spin_lock(&inode->i_lock); + netfs_write_remote_i_size(inode, stat->st_size); if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE)) - v9fs_i_size_write(inode, stat->st_size); + i_size_write(inode, stat->st_size); inode->i_blocks = stat->st_blocks; + spin_unlock(&inode->i_lock); } else { if (stat->st_result_mask & P9_STATS_ATIME) { inode_set_atime(inode, stat->st_atime_sec, @@ -662,13 +664,15 @@ v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, struct inode *inode, mode |= inode->i_mode & ~S_IALLUGO; inode->i_mode = mode; } + spin_lock(&inode->i_lock); if (!(flags & V9FS_STAT2INODE_KEEP_ISIZE) && stat->st_result_mask & P9_STATS_SIZE) { - v9inode->netfs.remote_i_size = stat->st_size; - v9fs_i_size_write(inode, stat->st_size); + netfs_write_remote_i_size(inode, stat->st_size); + i_size_write(inode, stat->st_size); } if (stat->st_result_mask & P9_STATS_BLOCKS) inode->i_blocks = stat->st_blocks; + spin_unlock(&inode->i_lock); } if (stat->st_result_mask & P9_STATS_GEN) inode->i_generation = stat->st_gen; diff --git a/fs/afs/Makefile b/fs/afs/Makefile index b49b8fe682f3..0d8f1982d596 100644 --- a/fs/afs/Makefile +++ b/fs/afs/Makefile @@ -30,6 +30,7 @@ kafs-y := \ server.o \ server_list.o \ super.o \ + symlink.o \ validation.o \ vlclient.o \ vl_alias.o \ diff --git a/fs/afs/dir.c b/fs/afs/dir.c index 78caef3f1338..99ad0058694c 100644 --- a/fs/afs/dir.c +++ b/fs/afs/dir.c @@ -44,6 +44,8 @@ static int afs_symlink(struct mnt_idmap *idmap, struct inode *dir, static int afs_rename(struct mnt_idmap *idmap, struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry, unsigned int flags); +static int afs_dir_writepages(struct address_space *mapping, + struct writeback_control *wbc); const struct file_operations afs_dir_file_operations = { .open = afs_dir_open, @@ -68,7 +70,7 @@ const struct inode_operations afs_dir_inode_operations = { }; const struct address_space_operations afs_dir_aops = { - .writepages = afs_single_writepages, + .writepages = afs_dir_writepages, }; const struct dentry_operations afs_fs_dentry_operations = { @@ -233,22 +235,13 @@ static ssize_t afs_do_read_single(struct afs_vnode *dvnode, struct file *file) struct iov_iter iter; ssize_t ret; loff_t i_size; - bool is_dir = (S_ISDIR(dvnode->netfs.inode.i_mode) && - !test_bit(AFS_VNODE_MOUNTPOINT, &dvnode->flags)); i_size = i_size_read(&dvnode->netfs.inode); - if (is_dir) { - if (i_size < AFS_DIR_BLOCK_SIZE) - return afs_bad(dvnode, afs_file_error_dir_small); - if (i_size > AFS_DIR_BLOCK_SIZE * 1024) { - trace_afs_file_error(dvnode, -EFBIG, afs_file_error_dir_big); - return -EFBIG; - } - } else { - if (i_size > AFSPATHMAX) { - trace_afs_file_error(dvnode, -EFBIG, afs_file_error_dir_big); - return -EFBIG; - } + if (i_size < AFS_DIR_BLOCK_SIZE) + return afs_bad(dvnode, afs_file_error_dir_small); + if (i_size > AFS_DIR_BLOCK_SIZE * 1024) { + trace_afs_file_error(dvnode, -EFBIG, afs_file_error_dir_big); + return -EFBIG; } /* Expand the storage. TODO: Shrink the storage too. */ @@ -277,24 +270,18 @@ static ssize_t afs_do_read_single(struct afs_vnode *dvnode, struct file *file) * buffer. */ ret = -ESTALE; - } else if (is_dir) { + } else { int ret2 = afs_dir_check(dvnode); if (ret2 < 0) ret = ret2; - } else if (i_size < folioq_folio_size(dvnode->directory, 0)) { - /* NUL-terminate a symlink. */ - char *symlink = kmap_local_folio(folioq_folio(dvnode->directory, 0), 0); - - symlink[i_size] = 0; - kunmap_local(symlink); } } return ret; } -ssize_t afs_read_single(struct afs_vnode *dvnode, struct file *file) +static ssize_t afs_read_single(struct afs_vnode *dvnode, struct file *file) { ssize_t ret; @@ -1763,13 +1750,20 @@ error: return ret; } +static void afs_symlink_put(struct afs_operation *op) +{ + kfree(op->create.symlink); + op->create.symlink = NULL; + afs_create_put(op); +} + static const struct afs_operation_ops afs_symlink_operation = { .issue_afs_rpc = afs_fs_symlink, .issue_yfs_rpc = yfs_fs_symlink, .success = afs_create_success, .aborted = afs_check_for_remote_deletion, .edit_dir = afs_create_edit_dir, - .put = afs_create_put, + .put = afs_symlink_put, }; /* @@ -1779,7 +1773,9 @@ static int afs_symlink(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, const char *content) { struct afs_operation *op; + struct afs_symlink *symlink; struct afs_vnode *dvnode = AFS_FS_I(dir); + size_t clen = strlen(content); int ret; _enter("{%llx:%llu},{%pd},%s", @@ -1791,12 +1787,20 @@ static int afs_symlink(struct mnt_idmap *idmap, struct inode *dir, goto error; ret = -EINVAL; - if (strlen(content) >= AFSPATHMAX) + if (clen >= AFSPATHMAX) + goto error; + + ret = -ENOMEM; + symlink = kmalloc_flex(struct afs_symlink, content, clen + 1, GFP_KERNEL); + if (!symlink) goto error; + refcount_set(&symlink->ref, 1); + memcpy(symlink->content, content, clen + 1); op = afs_alloc_operation(NULL, dvnode->volume); if (IS_ERR(op)) { ret = PTR_ERR(op); + kfree(symlink); goto error; } @@ -1808,7 +1812,7 @@ static int afs_symlink(struct mnt_idmap *idmap, struct inode *dir, op->dentry = dentry; op->ops = &afs_symlink_operation; op->create.reason = afs_edit_dir_for_symlink; - op->create.symlink = content; + op->create.symlink = symlink; op->mtime = current_time(dir); ret = afs_do_sync_operation(op); afs_dir_unuse_cookie(dvnode, ret); @@ -2192,28 +2196,33 @@ error: } /* - * Write the file contents to the cache as a single blob. + * Write the directory contents to the cache as a single blob. */ -int afs_single_writepages(struct address_space *mapping, - struct writeback_control *wbc) +static int afs_dir_writepages(struct address_space *mapping, + struct writeback_control *wbc) { struct afs_vnode *dvnode = AFS_FS_I(mapping->host); struct iov_iter iter; - bool is_dir = (S_ISDIR(dvnode->netfs.inode.i_mode) && - !test_bit(AFS_VNODE_MOUNTPOINT, &dvnode->flags)); int ret = 0; /* Need to lock to prevent the folio queue and folios from being thrown * away. */ - down_read(&dvnode->validate_lock); + if (!down_read_trylock(&dvnode->validate_lock)) { + if (wbc->sync_mode == WB_SYNC_NONE) { + /* The VFS will have undirtied the inode. */ + netfs_single_mark_inode_dirty(&dvnode->netfs.inode); + return 0; + } + down_read(&dvnode->validate_lock); + } - if (is_dir ? - test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags) : - atomic64_read(&dvnode->cb_expires_at) != AFS_NO_CB_PROMISE) { + if (test_bit(AFS_VNODE_DIR_VALID, &dvnode->flags)) { iov_iter_folio_queue(&iter, ITER_SOURCE, dvnode->directory, 0, 0, i_size_read(&dvnode->netfs.inode)); ret = netfs_writeback_single(mapping, wbc, &iter); + if (ret == 1) + ret = 0; /* Skipped write due to lock conflict. */ } up_read(&dvnode->validate_lock); diff --git a/fs/afs/file.c b/fs/afs/file.c index 74d04af51ff4..650595e1c3f3 100644 --- a/fs/afs/file.c +++ b/fs/afs/file.c @@ -424,21 +424,35 @@ static void afs_free_request(struct netfs_io_request *rreq) afs_put_wb_key(rreq->netfs_priv2); } -static void afs_update_i_size(struct inode *inode, loff_t new_i_size) +/* + * Set the file size and block count, taking ->cb_lock and ->i_lock to maintain + * coherency and prevent 64-bit tearing on 32-bit arches. + * + * Also, estimate the number of 512 bytes blocks used, rounded up to nearest 1K + * for consistency with other AFS clients. + */ +void afs_set_i_size(struct afs_vnode *vnode, loff_t new_i_size) { - struct afs_vnode *vnode = AFS_FS_I(inode); + struct inode *inode = &vnode->netfs.inode; loff_t i_size; write_seqlock(&vnode->cb_lock); - i_size = i_size_read(&vnode->netfs.inode); + spin_lock(&inode->i_lock); + i_size = i_size_read(inode); if (new_i_size > i_size) { - i_size_write(&vnode->netfs.inode, new_i_size); - inode_set_bytes(&vnode->netfs.inode, new_i_size); + i_size_write(inode, new_i_size); + inode_set_bytes(inode, round_up(new_i_size, 1024)); } + spin_unlock(&inode->i_lock); write_sequnlock(&vnode->cb_lock); fscache_update_cookie(afs_vnode_cache(vnode), NULL, &new_i_size); } +static void afs_update_i_size(struct inode *inode, loff_t new_i_size) +{ + afs_set_i_size(AFS_FS_I(inode), new_i_size); +} + static void afs_netfs_invalidate_cache(struct netfs_io_request *wreq) { struct afs_vnode *vnode = AFS_FS_I(wreq->inode); diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c index 95494d5f2b8a..a2ffd60889f8 100644 --- a/fs/afs/fsclient.c +++ b/fs/afs/fsclient.c @@ -886,7 +886,7 @@ void afs_fs_symlink(struct afs_operation *op) namesz = name->len; padsz = (4 - (namesz & 3)) & 3; - c_namesz = strlen(op->create.symlink); + c_namesz = strlen(op->create.symlink->content); c_padsz = (4 - (c_namesz & 3)) & 3; reqsz = (6 * 4) + namesz + padsz + c_namesz + c_padsz + (6 * 4); @@ -910,7 +910,7 @@ void afs_fs_symlink(struct afs_operation *op) bp = (void *) bp + padsz; } *bp++ = htonl(c_namesz); - memcpy(bp, op->create.symlink, c_namesz); + memcpy(bp, op->create.symlink->content, c_namesz); bp = (void *) bp + c_namesz; if (c_padsz > 0) { memset(bp, 0, c_padsz); diff --git a/fs/afs/inode.c b/fs/afs/inode.c index dde1857fcabb..de72256f00db 100644 --- a/fs/afs/inode.c +++ b/fs/afs/inode.c @@ -25,96 +25,6 @@ #include "internal.h" #include "afs_fs.h" -void afs_init_new_symlink(struct afs_vnode *vnode, struct afs_operation *op) -{ - size_t size = strlen(op->create.symlink) + 1; - size_t dsize = 0; - char *p; - - if (netfs_alloc_folioq_buffer(NULL, &vnode->directory, &dsize, size, - mapping_gfp_mask(vnode->netfs.inode.i_mapping)) < 0) - return; - - vnode->directory_size = dsize; - p = kmap_local_folio(folioq_folio(vnode->directory, 0), 0); - memcpy(p, op->create.symlink, size); - kunmap_local(p); - set_bit(AFS_VNODE_DIR_READ, &vnode->flags); - netfs_single_mark_inode_dirty(&vnode->netfs.inode); -} - -static void afs_put_link(void *arg) -{ - struct folio *folio = virt_to_folio(arg); - - kunmap_local(arg); - folio_put(folio); -} - -const char *afs_get_link(struct dentry *dentry, struct inode *inode, - struct delayed_call *callback) -{ - struct afs_vnode *vnode = AFS_FS_I(inode); - struct folio *folio; - char *content; - ssize_t ret; - - if (!dentry) { - /* RCU pathwalk. */ - if (!test_bit(AFS_VNODE_DIR_READ, &vnode->flags) || !afs_check_validity(vnode)) - return ERR_PTR(-ECHILD); - goto good; - } - - if (test_bit(AFS_VNODE_DIR_READ, &vnode->flags)) - goto fetch; - - ret = afs_validate(vnode, NULL); - if (ret < 0) - return ERR_PTR(ret); - - if (!test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags) && - test_bit(AFS_VNODE_DIR_READ, &vnode->flags)) - goto good; - -fetch: - ret = afs_read_single(vnode, NULL); - if (ret < 0) - return ERR_PTR(ret); - set_bit(AFS_VNODE_DIR_READ, &vnode->flags); - -good: - folio = folioq_folio(vnode->directory, 0); - folio_get(folio); - content = kmap_local_folio(folio, 0); - set_delayed_call(callback, afs_put_link, content); - return content; -} - -int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen) -{ - DEFINE_DELAYED_CALL(done); - const char *content; - int len; - - content = afs_get_link(dentry, d_inode(dentry), &done); - if (IS_ERR(content)) { - do_delayed_call(&done); - return PTR_ERR(content); - } - - len = umin(strlen(content), buflen); - if (copy_to_user(buffer, content, len)) - len = -EFAULT; - do_delayed_call(&done); - return len; -} - -static const struct inode_operations afs_symlink_inode_operations = { - .get_link = afs_get_link, - .readlink = afs_readlink, -}; - static noinline void dump_vnode(struct afs_vnode *vnode, struct afs_vnode *parent_vnode) { static unsigned long once_only; @@ -214,7 +124,7 @@ static int afs_inode_init_from_status(struct afs_operation *op, inode->i_mode = S_IFLNK | status->mode; inode->i_op = &afs_symlink_inode_operations; } - inode->i_mapping->a_ops = &afs_dir_aops; + inode->i_mapping->a_ops = &afs_symlink_aops; inode_nohighmem(inode); mapping_set_release_always(inode->i_mapping); break; @@ -224,7 +134,8 @@ static int afs_inode_init_from_status(struct afs_operation *op, return afs_protocol_error(NULL, afs_eproto_file_type); } - afs_set_i_size(vnode, status->size); + i_size_write(inode, status->size); + inode_set_bytes(inode, status->size); afs_set_netfs_context(vnode); vnode->invalid_before = status->data_version; @@ -253,7 +164,8 @@ static void afs_apply_status(struct afs_operation *op, { struct afs_file_status *status = &vp->scb.status; struct afs_vnode *vnode = vp->vnode; - struct inode *inode = &vnode->netfs.inode; + struct netfs_inode *ictx = &vnode->netfs; + struct inode *inode = &ictx->inode; struct timespec64 t; umode_t mode; bool unexpected_jump = false; @@ -336,6 +248,8 @@ static void afs_apply_status(struct afs_operation *op, } if (data_changed) { + unsigned long long zero_point, size = status->size; + inode_set_iversion_raw(inode, status->data_version); /* Only update the size if the data version jumped. If the @@ -343,16 +257,25 @@ static void afs_apply_status(struct afs_operation *op, * idea of what the size should be that's not the same as * what's on the server. */ - vnode->netfs.remote_i_size = status->size; - if (change_size || status->size > i_size_read(inode)) { - afs_set_i_size(vnode, status->size); + spin_lock(&inode->i_lock); + + if (change_size || size > i_size_read(inode)) { + /* We can read the sizes directly as we hold i_lock. */ + zero_point = ictx->_zero_point; + if (unexpected_jump) - vnode->netfs.zero_point = status->size; + zero_point = size; + netfs_write_sizes(inode, size, size, zero_point); + inode_set_bytes(inode, size); inode_set_ctime_to_ts(inode, t); inode_set_atime_to_ts(inode, t); + } else { + netfs_write_remote_i_size(inode, size); } + spin_unlock(&inode->i_lock); + if (op->ops == &afs_fetch_data_operation) - op->fetch.subreq->rreq->i_size = status->size; + op->fetch.subreq->rreq->i_size = size; } } @@ -709,7 +632,7 @@ int afs_getattr(struct mnt_idmap *idmap, const struct path *path, * it, but we need to give userspace the server's size. */ if (S_ISDIR(inode->i_mode)) - stat->size = vnode->netfs.remote_i_size; + stat->size = netfs_read_remote_i_size(inode); } while (read_seqretry(&vnode->cb_lock, seq)); return 0; @@ -756,12 +679,14 @@ void afs_evict_inode(struct inode *inode) .range_end = LLONG_MAX, }; - afs_single_writepages(inode->i_mapping, &wbc); + inode->i_mapping->a_ops->writepages(inode->i_mapping, &wbc); } netfs_wait_for_outstanding_io(inode); truncate_inode_pages_final(&inode->i_data); netfs_free_folioq_buffer(vnode->directory); + if (vnode->symlink) + afs_evict_symlink(vnode); afs_set_cache_aux(vnode, &aux); netfs_clear_inode_writeback(inode, &aux); @@ -889,7 +814,7 @@ int afs_setattr(struct mnt_idmap *idmap, struct dentry *dentry, */ if (!(attr->ia_valid & (supported & ~ATTR_SIZE & ~ATTR_MTIME)) && attr->ia_size < i_size && - attr->ia_size > vnode->netfs.remote_i_size) { + attr->ia_size > netfs_read_remote_i_size(inode)) { truncate_setsize(inode, attr->ia_size); netfs_resize_file(&vnode->netfs, size, false); fscache_resize_cookie(afs_vnode_cache(vnode), diff --git a/fs/afs/internal.h b/fs/afs/internal.h index 009064b8d661..dc89c3c60203 100644 --- a/fs/afs/internal.h +++ b/fs/afs/internal.h @@ -711,6 +711,7 @@ struct afs_vnode { #define AFS_VNODE_DIR_READ 11 /* Set if we've read a dir's contents */ struct folio_queue *directory; /* Directory contents */ + struct afs_symlink __rcu *symlink; /* Symlink content */ struct list_head wb_keys; /* List of keys available for writeback */ struct list_head pending_locks; /* locks waiting to be granted */ struct list_head granted_locks; /* locks granted on this file */ @@ -778,6 +779,15 @@ struct afs_permits { }; /* + * Copy of symlink content for normal use. + */ +struct afs_symlink { + struct rcu_head rcu; + refcount_t ref; + char content[]; +}; + +/* * Error prioritisation and accumulation. */ struct afs_error { @@ -888,7 +898,7 @@ struct afs_operation { struct { int reason; /* enum afs_edit_dir_reason */ mode_t mode; - const char *symlink; + struct afs_symlink *symlink; } create; struct { bool need_rehash; @@ -1099,13 +1109,10 @@ extern const struct inode_operations afs_dir_inode_operations; extern const struct address_space_operations afs_dir_aops; extern const struct dentry_operations afs_fs_dentry_operations; -ssize_t afs_read_single(struct afs_vnode *dvnode, struct file *file); ssize_t afs_read_dir(struct afs_vnode *dvnode, struct file *file) __acquires(&dvnode->validate_lock); extern void afs_d_release(struct dentry *); extern void afs_check_for_remote_deletion(struct afs_operation *); -int afs_single_writepages(struct address_space *mapping, - struct writeback_control *wbc); /* * dir_edit.c @@ -1158,6 +1165,7 @@ extern int afs_open(struct inode *, struct file *); extern int afs_release(struct inode *, struct file *); void afs_fetch_data_async_rx(struct work_struct *work); void afs_fetch_data_immediate_cancel(struct afs_call *call); +void afs_set_i_size(struct afs_vnode *vnode, loff_t new_i_size); /* * flock.c @@ -1247,10 +1255,6 @@ extern void afs_fs_probe_cleanup(struct afs_net *); */ extern const struct afs_operation_ops afs_fetch_status_operation; -void afs_init_new_symlink(struct afs_vnode *vnode, struct afs_operation *op); -const char *afs_get_link(struct dentry *dentry, struct inode *inode, - struct delayed_call *callback); -int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen); extern void afs_vnode_commit_status(struct afs_operation *, struct afs_vnode_param *); extern int afs_fetch_status(struct afs_vnode *, struct key *, bool, afs_access_t *); extern int afs_ilookup5_test_by_fid(struct inode *, void *); @@ -1601,6 +1605,21 @@ extern int __init afs_fs_init(void); extern void afs_fs_exit(void); /* + * symlink.c + */ +extern const struct inode_operations afs_symlink_inode_operations; +extern const struct address_space_operations afs_symlink_aops; + +void afs_invalidate_symlink(struct afs_vnode *vnode); +void afs_evict_symlink(struct afs_vnode *vnode); +void afs_init_new_symlink(struct afs_vnode *vnode, struct afs_operation *op); +const char *afs_get_link(struct dentry *dentry, struct inode *inode, + struct delayed_call *callback); +int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen); +int afs_symlink_writepages(struct address_space *mapping, + struct writeback_control *wbc); + +/* * validation.c */ bool afs_check_validity(const struct afs_vnode *vnode); @@ -1760,16 +1779,6 @@ static inline void afs_update_dentry_version(struct afs_operation *op, } /* - * Set the file size and block count. Estimate the number of 512 bytes blocks - * used, rounded up to nearest 1K for consistency with other AFS clients. - */ -static inline void afs_set_i_size(struct afs_vnode *vnode, u64 size) -{ - i_size_write(&vnode->netfs.inode, size); - vnode->netfs.inode.i_blocks = ((size + 1023) >> 10) << 1; -} - -/* * Check for a conflicting operation on a directory that we just unlinked from. * If someone managed to sneak a link or an unlink in on the file we just * unlinked, we won't be able to trust nlink on an AFS file (but not YFS). diff --git a/fs/afs/symlink.c b/fs/afs/symlink.c new file mode 100644 index 000000000000..ed5868369f37 --- /dev/null +++ b/fs/afs/symlink.c @@ -0,0 +1,278 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* AFS filesystem symbolic link handling + * + * Copyright (C) 2026 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + */ + +#include <linux/kernel.h> +#include <linux/fs.h> +#include <linux/namei.h> +#include <linux/pagemap.h> +#include <linux/iov_iter.h> +#include "internal.h" + +static void afs_put_symlink(struct afs_symlink *symlink) +{ + if (refcount_dec_and_test(&symlink->ref)) + kfree_rcu(symlink, rcu); +} + +static void afs_replace_symlink(struct afs_vnode *vnode, struct afs_symlink *symlink) +{ + struct afs_symlink *old; + + old = rcu_replace_pointer(vnode->symlink, symlink, + lockdep_is_held(&vnode->validate_lock)); + if (old) + afs_put_symlink(old); +} + +/* + * In the event that a third-party update of a symlink occurs, dispose of the + * copy of the old contents. Called under ->validate_lock. + */ +void afs_invalidate_symlink(struct afs_vnode *vnode) +{ + afs_replace_symlink(vnode, NULL); +} + +/* + * Dispose of a symlink copy during inode deletion. + */ +void afs_evict_symlink(struct afs_vnode *vnode) +{ + struct afs_symlink *old; + + old = rcu_replace_pointer(vnode->symlink, NULL, true); + if (old) + afs_put_symlink(old); + +} + +/* + * Set up a locally created symlink inode for immediate write to the cache. + */ +void afs_init_new_symlink(struct afs_vnode *vnode, struct afs_operation *op) +{ + struct afs_symlink *symlink = op->create.symlink; + size_t dsize = 0; + size_t size = strlen(symlink->content) + 1; + char *p; + + rcu_assign_pointer(vnode->symlink, symlink); + op->create.symlink = NULL; + + if (!fscache_cookie_enabled(netfs_i_cookie(&vnode->netfs))) + return; + + if (netfs_alloc_folioq_buffer(NULL, &vnode->directory, &dsize, size, + mapping_gfp_mask(vnode->netfs.inode.i_mapping)) < 0) + return; + + vnode->directory_size = dsize; + p = kmap_local_folio(folioq_folio(vnode->directory, 0), 0); + memcpy(p, symlink->content, size); + kunmap_local(p); + netfs_single_mark_inode_dirty(&vnode->netfs.inode); +} + +/* + * Read a symlink in a single download. + */ +static ssize_t afs_do_read_symlink(struct afs_vnode *vnode) +{ + struct afs_symlink *symlink; + struct iov_iter iter; + ssize_t ret; + loff_t i_size; + + i_size = i_size_read(&vnode->netfs.inode); + if (i_size > PAGE_SIZE - 1) { + trace_afs_file_error(vnode, -EFBIG, afs_file_error_dir_big); + return -EFBIG; + } + + if (!vnode->directory) { + size_t cur_size = 0; + + ret = netfs_alloc_folioq_buffer(NULL, + &vnode->directory, &cur_size, PAGE_SIZE, + mapping_gfp_mask(vnode->netfs.inode.i_mapping)); + vnode->directory_size = PAGE_SIZE - 1; + if (ret < 0) + return ret; + } + + iov_iter_folio_queue(&iter, ITER_DEST, vnode->directory, 0, 0, PAGE_SIZE); + + /* AFS requires us to perform the read of a symlink as a single unit to + * avoid issues with the content being changed between reads. + */ + ret = netfs_read_single(&vnode->netfs.inode, NULL, &iter); + if (ret >= 0) { + i_size = ret; + if (i_size > PAGE_SIZE - 1) { + trace_afs_file_error(vnode, -EFBIG, afs_file_error_dir_big); + return -EFBIG; + } + vnode->directory_size = i_size; + + /* Copy the symlink. */ + symlink = kmalloc_flex(struct afs_symlink, content, i_size + 1, + GFP_KERNEL); + if (!symlink) + return -ENOMEM; + + refcount_set(&symlink->ref, 1); + symlink->content[i_size] = 0; + + const char *s = kmap_local_folio(folioq_folio(vnode->directory, 0), 0); + + memcpy(symlink->content, s, i_size); + kunmap_local(s); + + afs_replace_symlink(vnode, symlink); + } + + if (!fscache_cookie_enabled(netfs_i_cookie(&vnode->netfs))) { + netfs_free_folioq_buffer(vnode->directory); + vnode->directory = NULL; + vnode->directory_size = 0; + } + + return ret; +} + +static ssize_t afs_read_symlink(struct afs_vnode *vnode) +{ + ssize_t ret; + + fscache_use_cookie(afs_vnode_cache(vnode), false); + ret = afs_do_read_symlink(vnode); + fscache_unuse_cookie(afs_vnode_cache(vnode), NULL, NULL); + return ret; +} + +static void afs_put_link(void *arg) +{ + afs_put_symlink(arg); +} + +const char *afs_get_link(struct dentry *dentry, struct inode *inode, + struct delayed_call *callback) +{ + struct afs_symlink *symlink; + struct afs_vnode *vnode = AFS_FS_I(inode); + ssize_t ret; + + if (!dentry) { + /* RCU pathwalk. */ + symlink = rcu_dereference(vnode->symlink); + if (!symlink || !afs_check_validity(vnode)) + return ERR_PTR(-ECHILD); + set_delayed_call(callback, NULL, NULL); + return symlink->content; + } + + if (vnode->symlink) { + ret = afs_validate(vnode, NULL); + if (ret < 0) + return ERR_PTR(ret); + + down_read(&vnode->validate_lock); + if (vnode->symlink) + goto good; + up_read(&vnode->validate_lock); + } + + if (down_write_killable(&vnode->validate_lock) < 0) + return ERR_PTR(-ERESTARTSYS); + if (!vnode->symlink) { + ret = afs_read_symlink(vnode); + if (ret < 0) { + up_write(&vnode->validate_lock); + return ERR_PTR(ret); + } + } + + downgrade_write(&vnode->validate_lock); + +good: + symlink = rcu_dereference_protected(vnode->symlink, + lockdep_is_held(&vnode->validate_lock)); + refcount_inc(&symlink->ref); + up_read(&vnode->validate_lock); + + set_delayed_call(callback, afs_put_link, symlink); + return symlink->content; +} + +int afs_readlink(struct dentry *dentry, char __user *buffer, int buflen) +{ + DEFINE_DELAYED_CALL(done); + const char *content; + int len; + + content = afs_get_link(dentry, d_inode(dentry), &done); + if (IS_ERR(content)) { + do_delayed_call(&done); + return PTR_ERR(content); + } + + len = umin(strlen(content), buflen); + if (copy_to_user(buffer, content, len)) + len = -EFAULT; + do_delayed_call(&done); + return len; +} + +/* + * Write the symlink contents to the cache as a single blob. We then throw + * away the page we used to receive it. + */ +int afs_symlink_writepages(struct address_space *mapping, + struct writeback_control *wbc) +{ + struct afs_vnode *vnode = AFS_FS_I(mapping->host); + struct iov_iter iter; + int ret = 0; + + if (!down_read_trylock(&vnode->validate_lock)) { + if (wbc->sync_mode == WB_SYNC_NONE) { + /* The VFS will have undirtied the inode. */ + netfs_single_mark_inode_dirty(&vnode->netfs.inode); + return 0; + } + down_read(&vnode->validate_lock); + } + + if (vnode->directory && + atomic64_read(&vnode->cb_expires_at) != AFS_NO_CB_PROMISE) { + iov_iter_folio_queue(&iter, ITER_SOURCE, vnode->directory, 0, 0, + i_size_read(&vnode->netfs.inode)); + ret = netfs_writeback_single(mapping, wbc, &iter); + } + + if (ret == 0) { + mutex_lock(&vnode->netfs.wb_lock); + netfs_free_folioq_buffer(vnode->directory); + vnode->directory = NULL; + vnode->directory_size = 0; + mutex_unlock(&vnode->netfs.wb_lock); + } else if (ret == 1) { + ret = 0; /* Skipped write due to lock conflict. */ + } + + up_read(&vnode->validate_lock); + return ret; +} + +const struct inode_operations afs_symlink_inode_operations = { + .get_link = afs_get_link, + .readlink = afs_readlink, +}; + +const struct address_space_operations afs_symlink_aops = { + .writepages = afs_symlink_writepages, +}; diff --git a/fs/afs/validation.c b/fs/afs/validation.c index 0ba8336c9025..e997563af658 100644 --- a/fs/afs/validation.c +++ b/fs/afs/validation.c @@ -465,11 +465,17 @@ int afs_validate(struct afs_vnode *vnode, struct key *key) vnode->cb_ro_snapshot = cb_ro_snapshot; vnode->cb_scrub = cb_scrub; - /* if the vnode's data version number changed then its contents are - * different */ + /* If the vnode's data version number changed then its contents are + * different. Note that afs_apply_status() doesn't set ZAP_DATA on + * directories. + */ zap |= test_and_clear_bit(AFS_VNODE_ZAP_DATA, &vnode->flags); - if (zap) - afs_zap_data(vnode); + if (zap) { + if (S_ISREG(vnode->netfs.inode.i_mode)) + afs_zap_data(vnode); + else if (S_ISLNK(vnode->netfs.inode.i_mode)) + afs_invalidate_symlink(vnode); + } up_write(&vnode->validate_lock); _leave(" = 0"); return 0; diff --git a/fs/afs/write.c b/fs/afs/write.c index 93ad86ff3345..e2ef19a73bbf 100644 --- a/fs/afs/write.c +++ b/fs/afs/write.c @@ -143,7 +143,7 @@ static void afs_issue_write_worker(struct work_struct *work) afs_begin_vnode_operation(op); op->store.write_iter = &subreq->io_iter; - op->store.i_size = umax(pos + len, vnode->netfs.remote_i_size); + op->store.i_size = umax(pos + len, netfs_read_remote_i_size(&vnode->netfs.inode)); op->mtime = inode_get_mtime(&vnode->netfs.inode); afs_wait_for_operation(op); diff --git a/fs/afs/yfsclient.c b/fs/afs/yfsclient.c index 24fb562ebd33..d941179730a9 100644 --- a/fs/afs/yfsclient.c +++ b/fs/afs/yfsclient.c @@ -960,7 +960,7 @@ void yfs_fs_symlink(struct afs_operation *op) _enter(""); - contents_sz = strlen(op->create.symlink); + contents_sz = strlen(op->create.symlink->content); call = afs_alloc_flat_call(op->net, &yfs_RXYFSSymlink, sizeof(__be32) + sizeof(struct yfs_xdr_RPCFlags) + @@ -981,7 +981,7 @@ void yfs_fs_symlink(struct afs_operation *op) bp = xdr_encode_u32(bp, 0); /* RPC flags */ bp = xdr_encode_YFSFid(bp, &dvp->fid); bp = xdr_encode_name(bp, name); - bp = xdr_encode_string(bp, op->create.symlink, contents_sz); + bp = xdr_encode_string(bp, op->create.symlink->content, contents_sz); bp = xdr_encode_YFSStoreStatus(bp, &mode, &op->mtime); yfs_check_req(call, bp); diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h index 3de3b517810e..d8c41f194729 100644 --- a/fs/btrfs/fs.h +++ b/fs/btrfs/fs.h @@ -154,6 +154,7 @@ enum { BTRFS_FS_LOG_RECOVERING, BTRFS_FS_OPEN, BTRFS_FS_QUOTA_ENABLED, + BTRFS_FS_SQUOTA_ENABLING, BTRFS_FS_UPDATE_UUID_TREE_GEN, BTRFS_FS_CREATING_FREE_SPACE_TREE, BTRFS_FS_BTREE_ERR, diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 41589ce66371..0823f5f561d7 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1107,7 +1107,13 @@ int btrfs_quota_enable(struct btrfs_fs_info *fs_info, if (simple) { fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_SIMPLE_MODE; btrfs_set_fs_incompat(fs_info, SIMPLE_QUOTA); - btrfs_set_qgroup_status_enable_gen(leaf, ptr, trans->transid); + /* + * Set the enable generation to the next transaction, as we cannot + * ensure that extents written during this transaction will see any + * state we have set here. So we should treat all extents of the + * transaction as coming in before squotas was enabled. + */ + btrfs_set_qgroup_status_enable_gen(leaf, ptr, trans->transid + 1); } else { fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT; } @@ -1210,7 +1216,15 @@ out_add_root: goto out_free_path; } - fs_info->qgroup_enable_gen = trans->transid; + /* + * Set fs_info->qgroup_enable_gen and BTRFS_FS_SQUOTA_ENABLING + * under the transaction handle. We want to ensure that all extents in + * the next transaction definitely see them. + */ + if (simple) { + fs_info->qgroup_enable_gen = trans->transid + 1; + set_bit(BTRFS_FS_SQUOTA_ENABLING, &fs_info->flags); + } mutex_unlock(&fs_info->qgroup_ioctl_lock); /* @@ -1224,9 +1238,15 @@ out_add_root: */ ret = btrfs_commit_transaction(trans); trans = NULL; + mutex_lock(&fs_info->qgroup_ioctl_lock); - if (ret) + if (ret) { + if (simple) { + clear_bit(BTRFS_FS_SQUOTA_ENABLING, &fs_info->flags); + fs_info->qgroup_enable_gen = 0; + } goto out_free_path; + } /* * Set quota enabled flag after committing the transaction, to avoid @@ -1236,6 +1256,8 @@ out_add_root: spin_lock(&fs_info->qgroup_lock); fs_info->quota_root = quota_root; set_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags); + if (simple) + clear_bit(BTRFS_FS_SQUOTA_ENABLING, &fs_info->flags); spin_unlock(&fs_info->qgroup_lock); /* Skip rescan for simple qgroups. */ @@ -1715,32 +1737,24 @@ out: return ret; } -static bool can_delete_parent_qgroup(struct btrfs_qgroup *qgroup) - +static bool can_delete_parent_qgroup(struct btrfs_fs_info *fs_info, struct btrfs_qgroup *qgroup) { ASSERT(btrfs_qgroup_level(qgroup->qgroupid)); + if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE) + squota_check_parent_usage(fs_info, qgroup); return list_empty(&qgroup->members); } /* - * Return true if we can delete the squota qgroup and false otherwise. - * - * Rules for whether we can delete: - * - * A subvolume qgroup can be removed iff the subvolume is fully deleted, which - * is iff there is 0 usage in the qgroup. - * - * A higher level qgroup can be removed iff it has no members. - * Note: We audit its usage to warn on inconsitencies without blocking deletion. + * Because a shared extent can outlive its owning subvolume, we cannot delete a + * subvol squota qgroup until all of the extents it owns are gone, even if the + * subvolume itself has been deleted. */ -static bool can_delete_squota_qgroup(struct btrfs_fs_info *fs_info, struct btrfs_qgroup *qgroup) +static bool can_delete_squota_subvol_qgroup(struct btrfs_fs_info *fs_info, + struct btrfs_qgroup *qgroup) { ASSERT(btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE); - - if (btrfs_qgroup_level(qgroup->qgroupid) > 0) { - squota_check_parent_usage(fs_info, qgroup); - return can_delete_parent_qgroup(qgroup); - } + ASSERT(btrfs_qgroup_level(qgroup->qgroupid) == 0); return !(qgroup->rfer || qgroup->excl || qgroup->rfer_cmpr || qgroup->excl_cmpr); } @@ -1754,14 +1768,11 @@ static int can_delete_qgroup(struct btrfs_fs_info *fs_info, struct btrfs_qgroup { struct btrfs_key key; BTRFS_PATH_AUTO_FREE(path); - - /* Since squotas cannot be inconsistent, they have special rules for deletion. */ - if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE) - return can_delete_squota_qgroup(fs_info, qgroup); + int ret; /* For higher level qgroup, we can only delete it if it has no child. */ if (btrfs_qgroup_level(qgroup->qgroupid)) - return can_delete_parent_qgroup(qgroup); + return can_delete_parent_qgroup(fs_info, qgroup); /* * For level-0 qgroups, we can only delete it if it has no subvolume @@ -1777,10 +1788,21 @@ static int can_delete_qgroup(struct btrfs_fs_info *fs_info, struct btrfs_qgroup return -ENOMEM; /* - * The @ret from btrfs_find_root() exactly matches our definition for - * the return value, thus can be returned directly. + * Any subvol qgroup, regardless of mode, cannot be deleted if the + * subvol still exists. + */ + ret = btrfs_find_root(fs_info->tree_root, &key, path, NULL, NULL); + /* + * btrfs_find_root returns <0 on error, 0 if found, and >0 if not, + * so the "found" and "error" cases match our desired return values. */ - return btrfs_find_root(fs_info->tree_root, &key, path, NULL, NULL); + if (ret <= 0) + return ret; + + /* Squotas require additional checks, even if the subvol is deleted. */ + if (btrfs_qgroup_mode(fs_info) == BTRFS_QGROUP_MODE_SIMPLE) + return can_delete_squota_subvol_qgroup(fs_info, qgroup); + return 1; } int btrfs_remove_qgroup(struct btrfs_trans_handle *trans, u64 qgroupid) @@ -4924,7 +4946,8 @@ int btrfs_record_squota_delta(struct btrfs_fs_info *fs_info, u64 num_bytes = delta->num_bytes; const int sign = (delta->is_inc ? 1 : -1); - if (btrfs_qgroup_mode(fs_info) != BTRFS_QGROUP_MODE_SIMPLE) + if (btrfs_qgroup_mode(fs_info) != BTRFS_QGROUP_MODE_SIMPLE && + !test_bit(BTRFS_FS_SQUOTA_ENABLING, &fs_info->flags)) return 0; if (!btrfs_is_fstree(root)) diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index eb9eb7683e3c..6336d976d469 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -130,6 +130,8 @@ retry: ret = cachefiles_inject_write_error(); if (ret == 0) { subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700, NULL); + if (IS_ERR(subdir)) + ret = PTR_ERR(subdir); } else { end_creating(subdir); subdir = ERR_PTR(ret); diff --git a/fs/erofs/xattr.c b/fs/erofs/xattr.c index c411df5d9dfc..df7ea019526d 100644 --- a/fs/erofs/xattr.c +++ b/fs/erofs/xattr.c @@ -85,9 +85,15 @@ static int erofs_init_inode_xattrs(struct inode *inode) } vi->xattr_name_filter = le32_to_cpu(ih->h_name_filter); vi->xattr_shared_count = ih->h_shared_count; + if ((u32)vi->xattr_shared_count * sizeof(__le32) > + vi->xattr_isize - sizeof(struct erofs_xattr_ibody_header)) { + erofs_err(sb, "invalid h_shared_count %u @ nid %llu", + vi->xattr_shared_count, vi->nid); + ret = -EFSCORRUPTED; + goto out_unlock; + } vi->xattr_shared_xattrs = kmalloc_objs(uint, vi->xattr_shared_count); if (!vi->xattr_shared_xattrs) { - erofs_put_metabuf(&buf); ret = -ENOMEM; goto out_unlock; } @@ -104,12 +110,12 @@ static int erofs_init_inode_xattrs(struct inode *inode) } vi->xattr_shared_xattrs[i] = le32_to_cpu(*xattr_id); } - erofs_put_metabuf(&buf); /* paired with smp_mb() at the beginning of the function. */ smp_mb(); set_bit(EROFS_I_EA_INITED_BIT, &vi->flags); out_unlock: + erofs_put_metabuf(&buf); clear_and_wake_up_bit(EROFS_I_BL_XATTR_BIT, &vi->flags); return ret; } diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c index fe8121df9ef2..d7445e98312d 100644 --- a/fs/erofs/zdata.c +++ b/fs/erofs/zdata.c @@ -1511,8 +1511,15 @@ repeat: DBG_BUGON(z_erofs_is_shortlived_page(bvec->bv_page)); folio = page_folio(zbv.page); - /* For preallocated managed folios, add them to page cache here */ + /* + * Preallocated folios are added to the managed cache here rather than + * in z_erofs_bind_cache() in order to keep these folios locked in + * increasing (physical) address order. + * Clear folio->private before these folios become visible to others in + * the managed cache to avoid duplicate additions for unaligned extents. + */ if (folio->private == Z_EROFS_PREALLOCATED_FOLIO) { + folio->private = NULL; tocache = true; goto out_tocache; } @@ -1548,14 +1555,8 @@ repeat: } return; } - /* - * Already linked with another pcluster, which only appears in - * crafted images by fuzzers for now. But handle this anyway. - */ - tocache = false; /* use temporary short-lived pages */ } else { DBG_BUGON(1); /* referenced managed folios can't be truncated */ - tocache = true; } folio_unlock(folio); folio_put(folio); diff --git a/fs/inode.c b/fs/inode.c index cc12b68e021b..e10439d8d7d9 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -2130,7 +2130,13 @@ static int inode_update_cmtime(struct inode *inode, unsigned int flags) inode_iversion_need_inc(inode)) return -EAGAIN; } else { - if (inode_maybe_inc_iversion(inode, !!dirty)) + /* + * Don't force iversion increment for pure lazytime + * updates (I_DIRTY_TIME only), let I_VERSION_QUERIED + * dictate whether the increment is needed. + */ + if (inode_maybe_inc_iversion(inode, + dirty != I_DIRTY_TIME)) dirty |= I_DIRTY_SYNC; } } diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c index 60c4a0e0fca5..442d62679262 100644 --- a/fs/jfs/namei.c +++ b/fs/jfs/namei.c @@ -309,7 +309,7 @@ static struct dentry *jfs_mkdir(struct mnt_idmap *idmap, struct inode *dip, out1: jfs_info("jfs_mkdir: rc:%d", rc); - return ERR_PTR(rc); + return rc ? ERR_PTR(rc) : NULL; } /* diff --git a/fs/mnt_idmapping.c b/fs/mnt_idmapping.c index 6472c4ea3d1e..cb61fbdb52e9 100644 --- a/fs/mnt_idmapping.c +++ b/fs/mnt_idmapping.c @@ -375,6 +375,8 @@ int statmount_mnt_idmap(struct mnt_idmap *idmap, struct seq_file *seq, bool uid_ continue; seq_printf(seq, "%u %u %u", extent->first, lower, extent->count); + if (seq_has_overflowed(seq)) + return -EAGAIN; seq->count++; /* mappings are separated by \0 */ if (seq_has_overflowed(seq)) diff --git a/fs/netfs/buffered_read.c b/fs/netfs/buffered_read.c index a8c0d86118c5..76d0f6a29aba 100644 --- a/fs/netfs/buffered_read.c +++ b/fs/netfs/buffered_read.c @@ -156,9 +156,8 @@ static void netfs_read_cache_to_pagecache(struct netfs_io_request *rreq, netfs_cache_read_terminated, subreq); } -static void netfs_queue_read(struct netfs_io_request *rreq, - struct netfs_io_subrequest *subreq, - bool last_subreq) +void netfs_queue_read(struct netfs_io_request *rreq, + struct netfs_io_subrequest *subreq) { struct netfs_io_stream *stream = &rreq->io_streams[0]; @@ -169,7 +168,8 @@ static void netfs_queue_read(struct netfs_io_request *rreq, * remove entries off of the front. */ spin_lock(&rreq->lock); - list_add_tail(&subreq->rreq_link, &stream->subrequests); + /* Write IN_PROGRESS before pointer to new subreq */ + list_add_tail_release(&subreq->rreq_link, &stream->subrequests); if (list_is_first(&subreq->rreq_link, &stream->subrequests)) { if (!stream->active) { stream->collected_to = subreq->start; @@ -178,11 +178,6 @@ static void netfs_queue_read(struct netfs_io_request *rreq, } } - if (last_subreq) { - smp_wmb(); /* Write lists before ALL_QUEUED. */ - set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags); - } - spin_unlock(&rreq->lock); } @@ -214,7 +209,6 @@ static void netfs_issue_read(struct netfs_io_request *rreq, static void netfs_read_to_pagecache(struct netfs_io_request *rreq, struct readahead_control *ractl) { - struct netfs_inode *ictx = netfs_inode(rreq->inode); unsigned long long start = rreq->start; ssize_t size = rreq->len; int ret = 0; @@ -233,10 +227,13 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq, subreq->start = start; subreq->len = size; + netfs_queue_read(rreq, subreq); + source = netfs_cache_prepare_read(rreq, subreq, rreq->i_size); subreq->source = source; if (source == NETFS_DOWNLOAD_FROM_SERVER) { - unsigned long long zp = umin(ictx->zero_point, rreq->i_size); + unsigned long long zero_point = netfs_read_zero_point(rreq->inode); + unsigned long long zp = umin(zero_point, rreq->i_size); size_t len = subreq->len; if (unlikely(rreq->origin == NETFS_READ_SINGLE)) @@ -252,7 +249,8 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq, pr_err("ZERO-LEN READ: R=%08x[%x] l=%zx/%zx s=%llx z=%llx i=%llx", rreq->debug_id, subreq->debug_index, subreq->len, size, - subreq->start, ictx->zero_point, rreq->i_size); + subreq->start, zero_point, rreq->i_size); + netfs_cancel_read(subreq, ret); break; } subreq->len = len; @@ -261,12 +259,7 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq, if (rreq->netfs_ops->prepare_read) { ret = rreq->netfs_ops->prepare_read(subreq); if (ret < 0) { - subreq->error = ret; - /* Not queued - release both refs. */ - netfs_put_subrequest(subreq, - netfs_sreq_trace_put_cancel); - netfs_put_subrequest(subreq, - netfs_sreq_trace_put_cancel); + netfs_cancel_read(subreq, ret); break; } trace_netfs_sreq(subreq, netfs_sreq_trace_prepare); @@ -289,24 +282,29 @@ static void netfs_read_to_pagecache(struct netfs_io_request *rreq, pr_err("Unexpected read source %u\n", source); WARN_ON_ONCE(1); + netfs_cancel_read(subreq, ret); break; issue: slice = netfs_prepare_read_iterator(subreq, ractl); if (slice < 0) { ret = slice; - subreq->error = ret; - trace_netfs_sreq(subreq, netfs_sreq_trace_cancel); - /* Not queued - release both refs. */ - netfs_put_subrequest(subreq, netfs_sreq_trace_put_cancel); - netfs_put_subrequest(subreq, netfs_sreq_trace_put_cancel); + netfs_cancel_read(subreq, ret); break; } - size -= slice; start += slice; + size -= slice; + if (size <= 0) { + smp_wmb(); /* Write lists before ALL_QUEUED. */ + set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags); + } - netfs_queue_read(rreq, subreq, size <= 0); netfs_issue_read(rreq, subreq); + + if (test_bit(NETFS_RREQ_PAUSE, &rreq->flags)) + netfs_wait_for_paused_read(rreq); + if (test_bit(NETFS_RREQ_FAILED, &rreq->flags)) + break; cond_resched(); } while (size > 0); @@ -397,6 +395,7 @@ static int netfs_read_gaps(struct file *file, struct folio *folio) { struct netfs_io_request *rreq; struct address_space *mapping = folio->mapping; + struct netfs_group *group = netfs_folio_group(folio); struct netfs_folio *finfo = netfs_folio_info(folio); struct netfs_inode *ctx = netfs_inode(mapping->host); struct folio *sink = NULL; @@ -458,14 +457,20 @@ static int netfs_read_gaps(struct file *file, struct folio *folio) netfs_read_to_pagecache(rreq, NULL); - if (sink) - folio_put(sink); - ret = netfs_wait_for_read(rreq); if (ret >= 0) { + if (group) + folio_change_private(folio, group); + else + folio_detach_private(folio); + kfree(finfo); + trace_netfs_folio(folio, netfs_folio_trace_filled_gaps); flush_dcache_folio(folio); folio_mark_uptodate(folio); } + + if (sink) + folio_put(sink); folio_unlock(folio); netfs_put_request(rreq, netfs_rreq_trace_put_return); return ret < 0 ? ret : 0; @@ -498,10 +503,10 @@ int netfs_read_folio(struct file *file, struct folio *folio) struct netfs_inode *ctx = netfs_inode(mapping->host); int ret; - if (folio_test_dirty(folio)) { - trace_netfs_folio(folio, netfs_folio_trace_read_gaps); + folio_wait_writeback(folio); + + if (folio_test_dirty(folio)) return netfs_read_gaps(file, folio); - } _enter("%lx", folio->index); @@ -667,7 +672,7 @@ retry: ret = PTR_ERR(rreq); goto error; } - rreq->no_unlock_folio = folio->index; + rreq->no_unlock_folio = folio; __set_bit(NETFS_RREQ_NO_UNLOCK_FOLIO, &rreq->flags); ret = netfs_begin_cache_read(rreq, ctx); @@ -684,9 +689,9 @@ retry: netfs_read_to_pagecache(rreq, NULL); ret = netfs_wait_for_read(rreq); + netfs_put_request(rreq, netfs_rreq_trace_put_return); if (ret < 0) goto error; - netfs_put_request(rreq, netfs_rreq_trace_put_return); have_folio: ret = folio_wait_private_2_killable(folio); @@ -733,7 +738,7 @@ int netfs_prefetch_for_write(struct file *file, struct folio *folio, goto error; } - rreq->no_unlock_folio = folio->index; + rreq->no_unlock_folio = folio; __set_bit(NETFS_RREQ_NO_UNLOCK_FOLIO, &rreq->flags); ret = netfs_begin_cache_read(rreq, ctx); if (ret == -ENOMEM || ret == -EINTR || ret == -ERESTARTSYS) diff --git a/fs/netfs/buffered_write.c b/fs/netfs/buffered_write.c index 22a4d61631c9..dee10570383b 100644 --- a/fs/netfs/buffered_write.c +++ b/fs/netfs/buffered_write.c @@ -13,24 +13,6 @@ #include <linux/pagevec.h> #include "internal.h" -static void __netfs_set_group(struct folio *folio, struct netfs_group *netfs_group) -{ - if (netfs_group) - folio_attach_private(folio, netfs_get_group(netfs_group)); -} - -static void netfs_set_group(struct folio *folio, struct netfs_group *netfs_group) -{ - void *priv = folio_get_private(folio); - - if (unlikely(priv != netfs_group)) { - if (netfs_group && (!priv || priv == NETFS_FOLIO_COPY_TO_CACHE)) - folio_attach_private(folio, netfs_get_group(netfs_group)); - else if (!netfs_group && priv == NETFS_FOLIO_COPY_TO_CACHE) - folio_detach_private(folio); - } -} - /* * Grab a folio for writing and lock it. Attempt to allocate as large a folio * as possible to hold as much of the remaining length as possible in one go. @@ -150,6 +132,7 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter, } do { + enum netfs_folio_trace trace; struct netfs_folio *finfo; struct netfs_group *group; unsigned long long fpos; @@ -157,6 +140,7 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter, size_t offset; /* Offset into pagecache folio */ size_t part; /* Bytes to write to folio */ size_t copied; /* Bytes copied from user */ + void *priv; offset = pos & (max_chunk - 1); part = min(max_chunk - offset, iov_iter_count(iter)); @@ -202,73 +186,99 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter, goto error_folio_unlock; } - /* Decide how we should modify a folio. We might be attempting - * to do write-streaming, in which case we don't want to a - * local RMW cycle if we can avoid it. If we're doing local - * caching or content crypto, we award that priority over - * avoiding RMW. If the file is open readably, then we also - * assume that we may want to read what we wrote. - */ finfo = netfs_folio_info(folio); group = netfs_folio_group(folio); + /* If the requested group differs from the group set on the + * page, then we need to flush out the folio if it has a group + * set (ie. is non-NULL). Note that COPY_TO_CACHE is a special + * case, being a netfs annotation rather than an actual group. + * + * The filesystem isn't permitted to mix writes with groups and + * writes without groups as the NULL group is used to indicate + * that no group is set. + */ if (unlikely(group != netfs_group) && - group != NETFS_FOLIO_COPY_TO_CACHE) + group != NETFS_FOLIO_COPY_TO_CACHE && + group) { + WARN_ON_ONCE(!netfs_group); goto flush_content; + } + /* Decide how we should modify a folio. We might be attempting + * to do write-streaming, as we don't want to a local RMW cycle + * if we can avoid it. If we're doing local caching or content + * crypto, we award that priority over avoiding RMW. If the + * file is open readably, then we let ->read_folio() fill in + * the gaps. + */ if (folio_test_uptodate(folio)) { if (mapping_writably_mapped(mapping)) flush_dcache_folio(folio); copied = copy_folio_from_iter_atomic(folio, offset, part, iter); if (unlikely(copied == 0)) goto copy_failed; - netfs_set_group(folio, netfs_group); - trace_netfs_folio(folio, netfs_folio_is_uptodate); - goto copied; + trace = netfs_folio_is_uptodate; + goto copied_uptodate; } /* If the page is above the zero-point then we assume that the * server would just return a block of zeros or a short read if * we try to read it. */ - if (fpos >= ctx->zero_point) { + if (fpos >= netfs_read_zero_point(inode)) { folio_zero_segment(folio, 0, offset); copied = copy_folio_from_iter_atomic(folio, offset, part, iter); if (unlikely(copied == 0)) goto copy_failed; folio_zero_segment(folio, offset + copied, flen); - __netfs_set_group(folio, netfs_group); - folio_mark_uptodate(folio); - trace_netfs_folio(folio, netfs_modify_and_clear); - goto copied; + if (finfo) + trace = netfs_modify_and_clear_rm_finfo; + else + trace = netfs_modify_and_clear; + goto mark_uptodate; } /* See if we can write a whole folio in one go. */ if (!maybe_trouble && offset == 0 && part >= flen) { copied = copy_folio_from_iter_atomic(folio, offset, part, iter); - if (unlikely(copied == 0)) + if (likely(copied == part)) { + if (finfo) + trace = netfs_whole_folio_modify_filled; + else + trace = netfs_whole_folio_modify; + goto mark_uptodate; + } + if (copied == 0) goto copy_failed; - if (unlikely(copied < part)) { + if (!finfo || copied <= finfo->dirty_offset) { maybe_trouble = true; iov_iter_revert(iter, copied); copied = 0; folio_unlock(folio); goto retry; } - __netfs_set_group(folio, netfs_group); - folio_mark_uptodate(folio); - trace_netfs_folio(folio, netfs_whole_folio_modify); + + /* We overwrote some existing dirty data, so we have to + * accept the partial write. + */ + finfo->dirty_len += finfo->dirty_offset; + if (finfo->dirty_len == flen) { + trace = netfs_whole_folio_modify_filled_efault; + goto mark_uptodate; + } + if (copied > finfo->dirty_len) + finfo->dirty_len = copied; + finfo->dirty_offset = 0; + trace = netfs_whole_folio_modify_efault; goto copied; } /* We don't want to do a streaming write on a file that loses * caching service temporarily because the backing store got - * culled and we don't really want to get a streaming write on - * a file that's open for reading as ->read_folio() then has to - * be able to flush it. + * culled. */ - if ((file->f_mode & FMODE_READ) || - netfs_is_cache_enabled(ctx)) { + if (netfs_is_cache_enabled(ctx)) { if (finfo) { netfs_stat(&netfs_n_wh_wstream_conflict); goto flush_content; @@ -283,11 +293,11 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter, copied = copy_folio_from_iter_atomic(folio, offset, part, iter); if (unlikely(copied == 0)) goto copy_failed; - netfs_set_group(folio, netfs_group); - trace_netfs_folio(folio, netfs_just_prefetch); - goto copied; + trace = netfs_just_prefetch; + goto copied_uptodate; } + /* Do a streaming write on a folio that has nothing in it yet. */ if (!finfo) { ret = -EIO; if (WARN_ON(folio_get_private(folio))) @@ -296,10 +306,8 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter, if (unlikely(copied == 0)) goto copy_failed; if (offset == 0 && copied == flen) { - __netfs_set_group(folio, netfs_group); - folio_mark_uptodate(folio); - trace_netfs_folio(folio, netfs_streaming_filled_page); - goto copied; + trace = netfs_streaming_filled_page; + goto mark_uptodate; } finfo = kzalloc_obj(*finfo); @@ -313,7 +321,7 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter, finfo->dirty_len = copied; folio_attach_private(folio, (void *)((unsigned long)finfo | NETFS_FOLIO_INFO)); - trace_netfs_folio(folio, netfs_streaming_write); + trace = netfs_streaming_write; goto copied; } @@ -327,16 +335,10 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter, goto copy_failed; finfo->dirty_len += copied; if (finfo->dirty_offset == 0 && finfo->dirty_len == flen) { - if (finfo->netfs_group) - folio_change_private(folio, finfo->netfs_group); - else - folio_detach_private(folio); - folio_mark_uptodate(folio); - kfree(finfo); - trace_netfs_folio(folio, netfs_streaming_cont_filled_page); - } else { - trace_netfs_folio(folio, netfs_streaming_write_cont); + trace = netfs_streaming_cont_filled_page; + goto mark_uptodate; } + trace = netfs_streaming_write_cont; goto copied; } @@ -350,7 +352,38 @@ ssize_t netfs_perform_write(struct kiocb *iocb, struct iov_iter *iter, goto out; continue; + /* Mark a folio as being up to data when we've filled it + * completely. If the folio has a group attached, then it must + * be the same group, otherwise we should have flushed it out + * above. We have to get rid of the netfs_folio struct if + * there was one. + */ + mark_uptodate: + folio_mark_uptodate(folio); + + copied_uptodate: + priv = folio_get_private(folio); + if (likely(priv == netfs_group)) { + /* Already set correctly; no change required. */ + } else if (priv == NETFS_FOLIO_COPY_TO_CACHE) { + if (!netfs_group) + folio_detach_private(folio); + else + folio_change_private(folio, netfs_get_group(netfs_group)); + } else if (!priv) { + folio_attach_private(folio, netfs_get_group(netfs_group)); + } else { + WARN_ON_ONCE(!finfo); + if (netfs_group) + /* finfo->netfs_group has a ref */ + folio_change_private(folio, netfs_group); + else + folio_detach_private(folio); + kfree(finfo); + } + copied: + trace_netfs_folio(folio, trace); flush_dcache_folio(folio); /* Update the inode size if we moved the EOF marker */ @@ -511,6 +544,7 @@ vm_fault_t netfs_page_mkwrite(struct vm_fault *vmf, struct netfs_group *netfs_gr struct inode *inode = file_inode(file); struct netfs_inode *ictx = netfs_inode(inode); vm_fault_t ret = VM_FAULT_NOPAGE; + void *priv; int err; _enter("%lx", folio->index); @@ -531,7 +565,9 @@ vm_fault_t netfs_page_mkwrite(struct vm_fault *vmf, struct netfs_group *netfs_gr } group = netfs_folio_group(folio); - if (group != netfs_group && group != NETFS_FOLIO_COPY_TO_CACHE) { + if (group && + group != netfs_group && + group != NETFS_FOLIO_COPY_TO_CACHE) { folio_unlock(folio); err = filemap_fdatawrite_range(mapping, folio_pos(folio), @@ -553,7 +589,19 @@ vm_fault_t netfs_page_mkwrite(struct vm_fault *vmf, struct netfs_group *netfs_gr trace_netfs_folio(folio, netfs_folio_trace_mkwrite_plus); else trace_netfs_folio(folio, netfs_folio_trace_mkwrite); - netfs_set_group(folio, netfs_group); + + priv = folio_get_private(folio); + if (priv != netfs_group) { + if (!netfs_group && priv == NETFS_FOLIO_COPY_TO_CACHE) + folio_detach_private(folio); + else if (netfs_group && priv == NETFS_FOLIO_COPY_TO_CACHE) + folio_change_private(folio, netfs_get_group(netfs_group)); + else if (netfs_group && !priv) + folio_attach_private(folio, netfs_get_group(netfs_group)); + else + WARN_ON_ONCE(1); + } + file_update_time(file); set_bit(NETFS_ICTX_MODIFIED_ATTR, &ictx->flags); if (ictx->ops->post_modify) diff --git a/fs/netfs/direct_read.c b/fs/netfs/direct_read.c index f72e6da88cca..6a8fb0d55e04 100644 --- a/fs/netfs/direct_read.c +++ b/fs/netfs/direct_read.c @@ -45,12 +45,11 @@ static void netfs_prepare_dio_read_iterator(struct netfs_io_subrequest *subreq) * Perform a read to a buffer from the server, slicing up the region to be read * according to the network rsize. */ -static int netfs_dispatch_unbuffered_reads(struct netfs_io_request *rreq) +static void netfs_dispatch_unbuffered_reads(struct netfs_io_request *rreq) { - struct netfs_io_stream *stream = &rreq->io_streams[0]; unsigned long long start = rreq->start; ssize_t size = rreq->len; - int ret = 0; + int ret; do { struct netfs_io_subrequest *subreq; @@ -58,7 +57,10 @@ static int netfs_dispatch_unbuffered_reads(struct netfs_io_request *rreq) subreq = netfs_alloc_subrequest(rreq); if (!subreq) { - ret = -ENOMEM; + /* Stash the error in the request if there's not + * already an error set. + */ + cmpxchg(&rreq->error, 0, -ENOMEM); break; } @@ -66,25 +68,13 @@ static int netfs_dispatch_unbuffered_reads(struct netfs_io_request *rreq) subreq->start = start; subreq->len = size; - __set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags); - - spin_lock(&rreq->lock); - list_add_tail(&subreq->rreq_link, &stream->subrequests); - if (list_is_first(&subreq->rreq_link, &stream->subrequests)) { - if (!stream->active) { - stream->collected_to = subreq->start; - /* Store list pointers before active flag */ - smp_store_release(&stream->active, true); - } - } - trace_netfs_sreq(subreq, netfs_sreq_trace_added); - spin_unlock(&rreq->lock); + netfs_queue_read(rreq, subreq); netfs_stat(&netfs_n_rh_download); if (rreq->netfs_ops->prepare_read) { ret = rreq->netfs_ops->prepare_read(subreq); if (ret < 0) { - netfs_put_subrequest(subreq, netfs_sreq_trace_put_cancel); + netfs_cancel_read(subreq, ret); break; } } @@ -113,8 +103,6 @@ static int netfs_dispatch_unbuffered_reads(struct netfs_io_request *rreq) set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags); netfs_wake_collector(rreq); } - - return ret; } /* @@ -137,21 +125,17 @@ static ssize_t netfs_unbuffered_read(struct netfs_io_request *rreq, bool sync) // TODO: Use bounce buffer if requested inode_dio_begin(rreq->inode); + netfs_dispatch_unbuffered_reads(rreq); - ret = netfs_dispatch_unbuffered_reads(rreq); - - if (!rreq->submitted) { - netfs_put_request(rreq, netfs_rreq_trace_put_no_submit); - inode_dio_end(rreq->inode); - ret = 0; - goto out; - } + /* The collector will get run, even if we don't manage to submit any + * subreqs, so we shouldn't call inode_dio_end() here. + */ if (sync) ret = netfs_wait_for_read(rreq); else ret = -EIOCBQUEUED; -out: + _leave(" = %zd", ret); return ret; } diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c index f9ab69de3e29..25f8ceb15fad 100644 --- a/fs/netfs/direct_write.c +++ b/fs/netfs/direct_write.c @@ -376,8 +376,10 @@ ssize_t netfs_unbuffered_write_iter(struct kiocb *iocb, struct iov_iter *from) if (ret < 0) goto out; end = iocb->ki_pos + iov_iter_count(from); - if (end > ictx->zero_point) - ictx->zero_point = end; + spin_lock(&inode->i_lock); + if (end > ictx->_zero_point) + netfs_write_zero_point(inode, end); + spin_unlock(&inode->i_lock); fscache_invalidate(netfs_i_cookie(ictx), NULL, i_size_read(inode), FSCACHE_INVAL_DIO_WRITE); diff --git a/fs/netfs/internal.h b/fs/netfs/internal.h index d436e20d3418..645996ecfc80 100644 --- a/fs/netfs/internal.h +++ b/fs/netfs/internal.h @@ -23,6 +23,8 @@ /* * buffered_read.c */ +void netfs_queue_read(struct netfs_io_request *rreq, + struct netfs_io_subrequest *subreq); void netfs_cache_read_terminated(void *priv, ssize_t transferred_or_error); int netfs_prefetch_for_write(struct file *file, struct folio *folio, size_t offset, size_t len); @@ -108,6 +110,7 @@ static inline void netfs_see_subrequest(struct netfs_io_subrequest *subreq, */ bool netfs_read_collection(struct netfs_io_request *rreq); void netfs_read_collection_worker(struct work_struct *work); +void netfs_cancel_read(struct netfs_io_subrequest *subreq, int error); void netfs_cache_read_terminated(void *priv, ssize_t transferred_or_error); /* diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c index 429e4396e1b0..b375567e0520 100644 --- a/fs/netfs/iterator.c +++ b/fs/netfs/iterator.c @@ -72,21 +72,24 @@ ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len, break; } - if (ret > count) { - pr_err("get_pages rc=%zd more than %zu\n", ret, count); + if (WARN(ret > count, + "%s: extract_pages overrun %zd > %zu bytes\n", + __func__, ret, count)) { + ret = -EIO; break; } - count -= ret; - ret += offset; - cur_npages = DIV_ROUND_UP(ret, PAGE_SIZE); - - if (npages + cur_npages > max_pages) { - pr_err("Out of bvec array capacity (%u vs %u)\n", - npages + cur_npages, max_pages); + cur_npages = DIV_ROUND_UP(offset + ret, PAGE_SIZE); + if (WARN(cur_npages > max_pages - npages, + "%s: extract_pages overrun %u > %u pages\n", + __func__, npages + cur_npages, max_pages)) { + ret = -EIO; break; } + count -= ret; + ret += offset; + for (i = 0; i < cur_npages; i++) { len = ret > PAGE_SIZE ? PAGE_SIZE : ret; bvec_set_page(bv + npages + i, *pages++, len - offset, offset); @@ -97,6 +100,11 @@ ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len, npages += cur_npages; } + /* Note: Don't try to clean up after EIO. Either we got no pages, so + * nothing to clean up, or we got a buffer overrun, memory corruption + * and can't trust the stuff in the buffer (a WARN was emitted). + */ + if (ret < 0 && (ret == -ENOMEM || npages == 0)) { for (i = 0; i < npages; i++) unpin_user_page(bv[i].bv_page); diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c index 6df89c92b10b..5d554512ed23 100644 --- a/fs/netfs/misc.c +++ b/fs/netfs/misc.c @@ -211,18 +211,25 @@ EXPORT_SYMBOL(netfs_clear_inode_writeback); void netfs_invalidate_folio(struct folio *folio, size_t offset, size_t length) { struct netfs_folio *finfo; - struct netfs_inode *ctx = netfs_inode(folio_inode(folio)); + struct inode *inode = folio_inode(folio); + struct netfs_inode *ctx = netfs_inode(inode); size_t flen = folio_size(folio); _enter("{%lx},%zx,%zx", folio->index, offset, length); if (offset == 0 && length == flen) { - unsigned long long i_size = i_size_read(&ctx->inode); + unsigned long long i_size, remote_i_size, zero_point; unsigned long long fpos = folio_pos(folio), end; + netfs_read_sizes(inode, &i_size, &remote_i_size, &zero_point); end = umin(fpos + flen, i_size); - if (fpos < i_size && end > ctx->zero_point) - ctx->zero_point = end; + if (fpos < i_size && end > zero_point) { + spin_lock(&inode->i_lock); + end = umin(fpos + flen, inode->i_size); + if (fpos < i_size && end > ctx->_zero_point) + netfs_write_zero_point(inode, end); + spin_unlock(&inode->i_lock); + } } folio_wait_private_2(folio); /* [DEPRECATED] */ @@ -255,7 +262,8 @@ void netfs_invalidate_folio(struct folio *folio, size_t offset, size_t length) goto erase_completely; /* Move the start of the data. */ finfo->dirty_len = fend - iend; - finfo->dirty_offset = offset; + finfo->dirty_offset = iend; + trace_netfs_folio(folio, netfs_folio_trace_invalidate_front); return; } @@ -264,12 +272,14 @@ void netfs_invalidate_folio(struct folio *folio, size_t offset, size_t length) */ if (iend >= fend) { finfo->dirty_len = offset - fstart; + trace_netfs_folio(folio, netfs_folio_trace_invalidate_tail); return; } /* A partial write was split. The caller has already zeroed * it, so just absorb the hole. */ + trace_netfs_folio(folio, netfs_folio_trace_invalidate_middle); } return; @@ -277,8 +287,9 @@ erase_completely: netfs_put_group(netfs_folio_group(folio)); folio_detach_private(folio); folio_clear_uptodate(folio); + folio_cancel_dirty(folio); kfree(finfo); - return; + trace_netfs_folio(folio, netfs_folio_trace_invalidate_all); } EXPORT_SYMBOL(netfs_invalidate_folio); @@ -292,15 +303,22 @@ EXPORT_SYMBOL(netfs_invalidate_folio); */ bool netfs_release_folio(struct folio *folio, gfp_t gfp) { - struct netfs_inode *ctx = netfs_inode(folio_inode(folio)); - unsigned long long end; + struct inode *inode = folio_inode(folio); + struct netfs_inode *ctx = netfs_inode(inode); + unsigned long long i_size, remote_i_size, zero_point, end; if (folio_test_dirty(folio)) return false; - end = umin(folio_next_pos(folio), i_size_read(&ctx->inode)); - if (end > ctx->zero_point) - ctx->zero_point = end; + netfs_read_sizes(inode, &i_size, &remote_i_size, &zero_point); + end = folio_next_pos(folio); + if (end > zero_point) { + spin_lock(&inode->i_lock); + end = umin(end, ctx->_remote_i_size); + if (end > ctx->_zero_point) + netfs_write_zero_point(inode, end); + spin_unlock(&inode->i_lock); + } if (folio_test_private(folio)) return false; @@ -356,6 +374,7 @@ void netfs_wait_for_in_progress_stream(struct netfs_io_request *rreq, DEFINE_WAIT(myself); list_for_each_entry(subreq, &stream->subrequests, rreq_link) { + smp_rmb(); /* Read ->next before IN_PROGRESS. */ if (!netfs_check_subreq_in_progress(subreq)) continue; diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c index e5f6665b3341..23660a590124 100644 --- a/fs/netfs/read_collect.c +++ b/fs/netfs/read_collect.c @@ -83,7 +83,7 @@ static void netfs_unlock_read_folio(struct netfs_io_request *rreq, } just_unlock: - if (folio->index == rreq->no_unlock_folio && + if (folio == rreq->no_unlock_folio && test_bit(NETFS_RREQ_NO_UNLOCK_FOLIO, &rreq->flags)) { _debug("no unlock"); } else { @@ -205,8 +205,10 @@ reassess: * in progress. The issuer thread may be adding stuff to the tail * whilst we're doing this. */ - front = list_first_entry_or_null(&stream->subrequests, - struct netfs_io_subrequest, rreq_link); + front = list_first_entry_or_null_acquire(&stream->subrequests, + struct netfs_io_subrequest, rreq_link); + /* Read first subreq pointer before IN_PROGRESS flag. */ + while (front) { size_t transferred; @@ -576,6 +578,17 @@ skip_error_checks: EXPORT_SYMBOL(netfs_read_subreq_terminated); /* + * Cancel a read subrequest due to preparation failure. + */ +void netfs_cancel_read(struct netfs_io_subrequest *subreq, int error) +{ + trace_netfs_sreq(subreq, netfs_sreq_trace_cancel); + subreq->error = error; + __set_bit(NETFS_SREQ_FAILED, &subreq->flags); + netfs_read_subreq_terminated(subreq); +} + +/* * Handle termination of a read from the cache. */ void netfs_cache_read_terminated(void *priv, ssize_t transferred_or_error) diff --git a/fs/netfs/read_retry.c b/fs/netfs/read_retry.c index cca9ac43c077..f59a70f3a086 100644 --- a/fs/netfs/read_retry.c +++ b/fs/netfs/read_retry.c @@ -175,7 +175,9 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq) list_for_each_entry_safe_from(subreq, tmp, &stream->subrequests, rreq_link) { trace_netfs_sreq(subreq, netfs_sreq_trace_superfluous); + spin_lock(&rreq->lock); list_del(&subreq->rreq_link); + spin_unlock(&rreq->lock); netfs_put_subrequest(subreq, netfs_sreq_trace_put_done); if (subreq == to) break; @@ -203,8 +205,10 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq) refcount_read(&subreq->ref), netfs_sreq_trace_new); + spin_lock(&rreq->lock); list_add(&subreq->rreq_link, &to->rreq_link); - to = list_next_entry(to, rreq_link); + spin_unlock(&rreq->lock); + to = subreq; trace_netfs_sreq(subreq, netfs_sreq_trace_retry); stream->sreq_max_len = umin(len, rreq->rsize); @@ -288,8 +292,15 @@ void netfs_unlock_abandoned_read_pages(struct netfs_io_request *rreq) struct folio *folio = folioq_folio(p, slot); if (folio && !folioq_is_marked2(p, slot)) { - trace_netfs_folio(folio, netfs_folio_trace_abandon); - folio_unlock(folio); + if (folio == rreq->no_unlock_folio && + test_bit(NETFS_RREQ_NO_UNLOCK_FOLIO, + &rreq->flags)) { + _debug("no unlock"); + } else { + trace_netfs_folio(folio, + netfs_folio_trace_abandon); + folio_unlock(folio); + } } } } diff --git a/fs/netfs/read_single.c b/fs/netfs/read_single.c index d0e23bc42445..8833550d2eb6 100644 --- a/fs/netfs/read_single.c +++ b/fs/netfs/read_single.c @@ -89,7 +89,6 @@ static void netfs_single_read_cache(struct netfs_io_request *rreq, */ static int netfs_single_dispatch_read(struct netfs_io_request *rreq) { - struct netfs_io_stream *stream = &rreq->io_streams[0]; struct netfs_io_subrequest *subreq; int ret = 0; @@ -102,14 +101,7 @@ static int netfs_single_dispatch_read(struct netfs_io_request *rreq) subreq->len = rreq->len; subreq->io_iter = rreq->buffer.iter; - __set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags); - - spin_lock(&rreq->lock); - list_add_tail(&subreq->rreq_link, &stream->subrequests); - trace_netfs_sreq(subreq, netfs_sreq_trace_added); - /* Store list pointers before active flag */ - smp_store_release(&stream->active, true); - spin_unlock(&rreq->lock); + netfs_queue_read(rreq, subreq); netfs_single_cache_prepare_read(rreq, subreq); switch (subreq->source) { @@ -121,10 +113,14 @@ static int netfs_single_dispatch_read(struct netfs_io_request *rreq) goto cancel; } + smp_wmb(); /* Write lists before ALL_QUEUED. */ + set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags); rreq->netfs_ops->issue_read(subreq); rreq->submitted += subreq->len; break; case NETFS_READ_FROM_CACHE: + smp_wmb(); /* Write lists before ALL_QUEUED. */ + set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags); trace_netfs_sreq(subreq, netfs_sreq_trace_submit); netfs_single_read_cache(rreq, subreq); rreq->submitted += subreq->len; @@ -134,14 +130,15 @@ static int netfs_single_dispatch_read(struct netfs_io_request *rreq) pr_warn("Unexpected single-read source %u\n", subreq->source); WARN_ON_ONCE(true); ret = -EIO; - break; + goto cancel; } - smp_wmb(); /* Write lists before ALL_QUEUED. */ - set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags); return ret; cancel: - netfs_put_subrequest(subreq, netfs_sreq_trace_put_cancel); + netfs_cancel_read(subreq, ret); + smp_wmb(); /* Write lists before ALL_QUEUED. */ + set_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags); + netfs_wake_collector(rreq); return ret; } diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c index b194447f4b11..24fc2bb2f8a4 100644 --- a/fs/netfs/write_collect.c +++ b/fs/netfs/write_collect.c @@ -57,7 +57,8 @@ static void netfs_dump_request(const struct netfs_io_request *rreq) int netfs_folio_written_back(struct folio *folio) { enum netfs_folio_trace why = netfs_folio_trace_clear; - struct netfs_inode *ictx = netfs_inode(folio->mapping->host); + struct inode *inode = folio_inode(folio); + struct netfs_inode *ictx = netfs_inode(inode); struct netfs_folio *finfo; struct netfs_group *group = NULL; int gcount = 0; @@ -69,8 +70,10 @@ int netfs_folio_written_back(struct folio *folio) unsigned long long fend; fend = folio_pos(folio) + finfo->dirty_offset + finfo->dirty_len; - if (fend > ictx->zero_point) - ictx->zero_point = fend; + spin_lock(&ictx->inode.i_lock); + if (fend > ictx->_zero_point) + netfs_write_zero_point(inode, fend); + spin_unlock(&ictx->inode.i_lock); folio_detach_private(folio); group = finfo->netfs_group; @@ -228,8 +231,10 @@ reassess_streams: if (!smp_load_acquire(&stream->active)) continue; - front = list_first_entry_or_null(&stream->subrequests, - struct netfs_io_subrequest, rreq_link); + front = list_first_entry_or_null_acquire(&stream->subrequests, + struct netfs_io_subrequest, rreq_link); + /* Read first subreq pointer before IN_PROGRESS flag. */ + while (front) { trace_netfs_collect_sreq(wreq, front); //_debug("sreq [%x] %llx %zx/%zx", diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c index 2db688f94125..c03c7cc45e47 100644 --- a/fs/netfs/write_issue.c +++ b/fs/netfs/write_issue.c @@ -204,7 +204,8 @@ void netfs_prepare_write(struct netfs_io_request *wreq, * remove entries off of the front. */ spin_lock(&wreq->lock); - list_add_tail(&subreq->rreq_link, &stream->subrequests); + /* Write IN_PROGRESS before pointer to new subreq */ + list_add_tail_release(&subreq->rreq_link, &stream->subrequests); if (list_is_first(&subreq->rreq_link, &stream->subrequests)) { if (!stream->active) { stream->collected_to = subreq->start; @@ -413,12 +414,7 @@ static int netfs_write_folio(struct netfs_io_request *wreq, if (streamw) netfs_issue_write(wreq, cache); - /* Flip the page to the writeback state and unlock. If we're called - * from write-through, then the page has already been put into the wb - * state. - */ - if (wreq->origin == NETFS_WRITEBACK) - folio_start_writeback(folio); + folio_start_writeback(folio); folio_unlock(folio); if (fgroup == NETFS_FOLIO_COPY_TO_CACHE) { @@ -646,29 +642,41 @@ int netfs_advance_writethrough(struct netfs_io_request *wreq, struct writeback_c struct folio *folio, size_t copied, bool to_page_end, struct folio **writethrough_cache) { + int ret; + _enter("R=%x ic=%zu ws=%u cp=%zu tp=%u", wreq->debug_id, wreq->buffer.iter.count, wreq->wsize, copied, to_page_end); - if (!*writethrough_cache) { - if (folio_test_dirty(folio)) - /* Sigh. mmap. */ - folio_clear_dirty_for_io(folio); + /* The folio is locked. */ + if (*writethrough_cache != folio) { + if (*writethrough_cache) { + /* Did the folio get moved? */ + folio_put(*writethrough_cache); + *writethrough_cache = NULL; + } /* We can make multiple writes to the folio... */ - folio_start_writeback(folio); if (wreq->len == 0) trace_netfs_folio(folio, netfs_folio_trace_wthru); else trace_netfs_folio(folio, netfs_folio_trace_wthru_plus); *writethrough_cache = folio; + folio_get(folio); } wreq->len += copied; - if (!to_page_end) + + if (!to_page_end) { + folio_mark_dirty(folio); + folio_unlock(folio); return 0; + } + ret = netfs_write_folio(wreq, wbc, folio); + folio_put(*writethrough_cache); *writethrough_cache = NULL; - return netfs_write_folio(wreq, wbc, folio); + wreq->submitted = wreq->len; + return ret; } /* @@ -682,8 +690,12 @@ ssize_t netfs_end_writethrough(struct netfs_io_request *wreq, struct writeback_c _enter("R=%x", wreq->debug_id); - if (writethrough_cache) + if (writethrough_cache) { + folio_lock(writethrough_cache); netfs_write_folio(wreq, wbc, writethrough_cache); + folio_put(writethrough_cache); + wreq->submitted = wreq->len; + } netfs_end_issue_write(wreq); @@ -818,6 +830,9 @@ static int netfs_write_folio_single(struct netfs_io_request *wreq, * * Write a monolithic, non-pagecache object back to the server and/or * the cache. + * + * Return: 0 if successful; 1 if skipped due to lock conflict and WB_SYNC_NONE; + * or a negative error code. */ int netfs_writeback_single(struct address_space *mapping, struct writeback_control *wbc, @@ -834,8 +849,10 @@ int netfs_writeback_single(struct address_space *mapping, if (!mutex_trylock(&ictx->wb_lock)) { if (wbc->sync_mode == WB_SYNC_NONE) { + /* The VFS will have undirtied the inode. */ + netfs_single_mark_inode_dirty(&ictx->inode); netfs_stat(&netfs_n_wb_lock_skip); - return 0; + return 1; } netfs_stat(&netfs_n_wb_lock_wait); mutex_lock(&ictx->wb_lock); diff --git a/fs/netfs/write_retry.c b/fs/netfs/write_retry.c index 29489a23a220..32735abfa03f 100644 --- a/fs/netfs/write_retry.c +++ b/fs/netfs/write_retry.c @@ -130,7 +130,9 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, list_for_each_entry_safe_from(subreq, tmp, &stream->subrequests, rreq_link) { trace_netfs_sreq(subreq, netfs_sreq_trace_discard); + spin_lock(&wreq->lock); list_del(&subreq->rreq_link); + spin_unlock(&wreq->lock); netfs_put_subrequest(subreq, netfs_sreq_trace_put_done); if (subreq == to) break; @@ -153,8 +155,10 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, netfs_sreq_trace_new); trace_netfs_sreq(subreq, netfs_sreq_trace_split); + spin_lock(&wreq->lock); list_add(&subreq->rreq_link, &to->rreq_link); - to = list_next_entry(to, rreq_link); + spin_unlock(&wreq->lock); + to = subreq; trace_netfs_sreq(subreq, netfs_sreq_trace_retry); stream->sreq_max_len = len; diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 44b1a93f219a..530459dfa760 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -1850,6 +1850,13 @@ void nfsd4_revoke_states(struct nfsd_net *nn, struct super_block *sb) break; case SC_TYPE_LAYOUT: ls = layoutstateid(stid); + spin_lock(&clp->cl_lock); + if (stid->sc_status == 0) { + stid->sc_status |= + SC_STATUS_ADMIN_REVOKED; + atomic_inc(&clp->cl_admin_revoked); + } + spin_unlock(&clp->cl_lock); nfsd4_close_layout(ls); break; } diff --git a/fs/nsfs.c b/fs/nsfs.c index c215878d55e8..fb0dcc119669 100644 --- a/fs/nsfs.c +++ b/fs/nsfs.c @@ -266,7 +266,7 @@ static long ns_ioctl(struct file *filp, unsigned int ioctl, else tsk = find_task_by_pid_ns(arg, pid_ns); if (!tsk) - break; + return ret; switch (ioctl) { case NS_GET_PID_FROM_PIDNS: diff --git a/fs/orangefs/namei.c b/fs/orangefs/namei.c index bec5475de094..75e65e72c2d6 100644 --- a/fs/orangefs/namei.c +++ b/fs/orangefs/namei.c @@ -362,7 +362,7 @@ static struct dentry *orangefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, __orangefs_setattr(dir, &iattr); out: op_release(new_op); - return ERR_PTR(ret); + return ret ? ERR_PTR(ret) : NULL; } static int orangefs_rename(struct mnt_idmap *idmap, diff --git a/fs/smb/client/cifs_spnego.c b/fs/smb/client/cifs_spnego.c index 3a41bbada04c..44c407275680 100644 --- a/fs/smb/client/cifs_spnego.c +++ b/fs/smb/client/cifs_spnego.c @@ -8,6 +8,7 @@ */ #include <linux/list.h> +#include <linux/cred.h> #include <linux/slab.h> #include <linux/string.h> #include <keys/user-type.h> @@ -40,12 +41,27 @@ cifs_spnego_key_destroy(struct key *key) kfree(key->payload.data[0]); } +static int +cifs_spnego_key_vet_description(const char *description) +{ + /* + * cifs.spnego descriptions are authority-bearing inputs to cifs.upcall. + * They are only valid when produced by CIFS while using the private + * spnego_cred installed below. Do not let userspace create this type + * of key through request_key(2)/add_key(2), since the helper treats + * pid/uid/creduid/upcall_target as kernel-originating fields. + */ + if (current_cred() != spnego_cred) + return -EPERM; + return 0; +} /* * keytype for CIFS spnego keys */ struct key_type cifs_spnego_key_type = { .name = "cifs.spnego", + .vet_description = cifs_spnego_key_vet_description, .instantiate = cifs_spnego_key_instantiate, .destroy = cifs_spnego_key_destroy, .describe = user_describe, diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c index 32d0305a1239..386b0d43f064 100644 --- a/fs/smb/client/cifsfs.c +++ b/fs/smb/client/cifsfs.c @@ -340,6 +340,8 @@ static void cifs_kill_sb(struct super_block *sb) /* Wait for all pending oplock breaks to complete */ flush_workqueue(cifsoplockd_wq); + /* Wait for all opened files to release */ + flush_workqueue(deferredclose_wq); /* finally release root dentry */ dput(cifs_sb->root); @@ -468,7 +470,8 @@ cifs_alloc_inode(struct super_block *sb) spin_lock_init(&cifs_inode->writers_lock); cifs_inode->writers = 0; cifs_inode->netfs.inode.i_blkbits = 14; /* 2**14 = CIFS_MAX_MSGSIZE */ - cifs_inode->netfs.remote_i_size = 0; + cifs_inode->netfs._remote_i_size = 0; + cifs_inode->netfs._zero_point = 0; cifs_inode->uniqueid = 0; cifs_inode->createtime = 0; cifs_inode->epoch = 0; @@ -1336,7 +1339,8 @@ static loff_t cifs_remap_file_range(struct file *src_file, loff_t off, struct cifsFileInfo *smb_file_src = src_file->private_data; struct cifsFileInfo *smb_file_target = dst_file->private_data; struct cifs_tcon *target_tcon, *src_tcon; - unsigned long long destend, fstart, fend, old_size, new_size; + unsigned long long i_size, new_size; + unsigned long long destend, fstart, fend; unsigned int xid; int rc; @@ -1380,7 +1384,7 @@ static loff_t cifs_remap_file_range(struct file *src_file, loff_t off, * Advance the EOF marker after the flush above to the end of the range * if it's short of that. */ - if (src_cifsi->netfs.remote_i_size < off + len) { + if (netfs_read_remote_i_size(src_inode) < off + len) { rc = cifs_precopy_set_eof(src_inode, src_cifsi, src_tcon, xid, off + len); if (rc < 0) goto unlock; @@ -1401,22 +1405,24 @@ static loff_t cifs_remap_file_range(struct file *src_file, loff_t off, rc = cifs_flush_folio(target_inode, destend, &fstart, &fend, false); if (rc) goto unlock; - if (fend > target_cifsi->netfs.zero_point) - target_cifsi->netfs.zero_point = fend + 1; - old_size = target_cifsi->netfs.remote_i_size; + + spin_lock(&target_inode->i_lock); + if (fend > target_cifsi->netfs._zero_point) + netfs_write_zero_point(target_inode, fend + 1); + i_size = target_inode->i_size; + spin_unlock(&target_inode->i_lock); /* Discard all the folios that overlap the destination region. */ cifs_dbg(FYI, "about to discard pages %llx-%llx\n", fstart, fend); truncate_inode_pages_range(&target_inode->i_data, fstart, fend); - fscache_invalidate(cifs_inode_cookie(target_inode), NULL, - i_size_read(target_inode), 0); + fscache_invalidate(cifs_inode_cookie(target_inode), NULL, i_size, 0); rc = -EOPNOTSUPP; if (target_tcon->ses->server->ops->duplicate_extents) { rc = target_tcon->ses->server->ops->duplicate_extents(xid, smb_file_src, smb_file_target, off, len, destoff); - if (rc == 0 && new_size > old_size) { + if (rc == 0 && new_size > i_size) { truncate_setsize(target_inode, new_size); fscache_resize_cookie(cifs_inode_cookie(target_inode), new_size); @@ -1435,8 +1441,12 @@ static loff_t cifs_remap_file_range(struct file *src_file, loff_t off, rc = -EINVAL; } } - if (rc == 0 && new_size > target_cifsi->netfs.zero_point) - target_cifsi->netfs.zero_point = new_size; + if (rc == 0) { + spin_lock(&target_inode->i_lock); + if (new_size > target_cifsi->netfs._zero_point) + netfs_write_zero_point(target_inode, new_size); + spin_unlock(&target_inode->i_lock); + } } /* force revalidate of size and timestamps of target file now @@ -1507,7 +1517,7 @@ ssize_t cifs_file_copychunk_range(unsigned int xid, * Advance the EOF marker after the flush above to the end of the range * if it's short of that. */ - if (src_cifsi->netfs.remote_i_size < off + len) { + if (netfs_read_remote_i_size(src_inode) < off + len) { rc = cifs_precopy_set_eof(src_inode, src_cifsi, src_tcon, xid, off + len); if (rc < 0) goto unlock; @@ -1535,8 +1545,12 @@ ssize_t cifs_file_copychunk_range(unsigned int xid, fscache_resize_cookie(cifs_inode_cookie(target_inode), i_size_read(target_inode)); } - if (rc > 0 && destoff + rc > target_cifsi->netfs.zero_point) - target_cifsi->netfs.zero_point = destoff + rc; + if (rc > 0) { + spin_lock(&target_inode->i_lock); + if (destoff + rc > target_cifsi->netfs._zero_point) + netfs_write_zero_point(target_inode, destoff + rc); + spin_unlock(&target_inode->i_lock); + } } file_accessed(src_file); diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c index 3990a9012264..9e27bfa7376b 100644 --- a/fs/smb/client/cifssmb.c +++ b/fs/smb/client/cifssmb.c @@ -1465,6 +1465,7 @@ cifs_readv_callback(struct TCP_Server_Info *server, struct mid_q_entry *mid) struct cifs_io_subrequest *rdata = mid->callback_data; struct netfs_inode *ictx = netfs_inode(rdata->rreq->inode); struct cifs_tcon *tcon = tlink_tcon(rdata->req->cfile->tlink); + struct inode *inode = &ictx->inode; struct smb_rqst rqst = { .rq_iov = rdata->iov, .rq_nvec = 1, .rq_iter = rdata->subreq.io_iter }; @@ -1538,7 +1539,7 @@ do_retry: } else { size_t trans = rdata->subreq.transferred + rdata->got_bytes; if (trans < rdata->subreq.len && - rdata->subreq.start + trans >= ictx->remote_i_size) { + rdata->subreq.start + trans >= netfs_read_remote_i_size(inode)) { rdata->result = 0; __set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags); } else if (rdata->got_bytes > 0) { diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c index a69e05f86d7e..ad624c01193e 100644 --- a/fs/smb/client/file.c +++ b/fs/smb/client/file.c @@ -2491,18 +2491,23 @@ int cifs_lock(struct file *file, int cmd, struct file_lock *flock) void cifs_write_subrequest_terminated(struct cifs_io_subrequest *wdata, ssize_t result) { struct netfs_io_request *wreq = wdata->rreq; - struct netfs_inode *ictx = netfs_inode(wreq->inode); + struct inode *inode = wreq->inode; + struct netfs_inode *ictx = netfs_inode(inode); loff_t wrend; if (result > 0) { + spin_lock(&inode->i_lock); + wrend = wdata->subreq.start + wdata->subreq.transferred + result; - if (wrend > ictx->zero_point && + if (wrend > ictx->_zero_point && (wdata->rreq->origin == NETFS_UNBUFFERED_WRITE || wdata->rreq->origin == NETFS_DIO_WRITE)) - ictx->zero_point = wrend; - if (wrend > ictx->remote_i_size) + netfs_write_zero_point(inode, wrend); + if (wrend > ictx->_remote_i_size) netfs_resize_file(ictx, wrend, true); + + spin_unlock(&inode->i_lock); } netfs_write_subrequest_terminated(&wdata->subreq, result); diff --git a/fs/smb/client/fs_context.c b/fs/smb/client/fs_context.c index a46764c24710..59598d334bae 100644 --- a/fs/smb/client/fs_context.c +++ b/fs/smb/client/fs_context.c @@ -761,7 +761,7 @@ static int smb3_fs_context_parse_param(struct fs_context *fc, static int smb3_fs_context_parse_monolithic(struct fs_context *fc, void *data); static int smb3_get_tree(struct fs_context *fc); -static void smb3_sync_ses_chan_max(struct cifs_ses *ses, unsigned int max_channels); +static void smb3_sync_ses_chan_max(struct cifs_ses *ses, size_t max_channels); static int smb3_reconfigure(struct fs_context *fc); static const struct fs_context_operations smb3_fs_context_ops = { @@ -1035,25 +1035,34 @@ do { \ int smb3_sync_session_ctx_passwords(struct cifs_sb_info *cifs_sb, struct cifs_ses *ses) { + char *password = NULL, *password2 = NULL; + if (ses->password && cifs_sb->ctx->password && strcmp(ses->password, cifs_sb->ctx->password)) { - kfree_sensitive(cifs_sb->ctx->password); - cifs_sb->ctx->password = kstrdup(ses->password, GFP_KERNEL); - if (!cifs_sb->ctx->password) + password = kstrdup(ses->password, GFP_KERNEL); + if (!password) return -ENOMEM; } if (ses->password2 && cifs_sb->ctx->password2 && strcmp(ses->password2, cifs_sb->ctx->password2)) { - kfree_sensitive(cifs_sb->ctx->password2); - cifs_sb->ctx->password2 = kstrdup(ses->password2, GFP_KERNEL); - if (!cifs_sb->ctx->password2) { - kfree_sensitive(cifs_sb->ctx->password); - cifs_sb->ctx->password = NULL; + password2 = kstrdup(ses->password2, GFP_KERNEL); + if (!password2) { + kfree_sensitive(password); return -ENOMEM; } } + + if (password) { + kfree_sensitive(cifs_sb->ctx->password); + cifs_sb->ctx->password = password; + } + if (password2) { + kfree_sensitive(cifs_sb->ctx->password2); + cifs_sb->ctx->password2 = password2; + } + return 0; } @@ -1066,7 +1075,7 @@ int smb3_sync_session_ctx_passwords(struct cifs_sb_info *cifs_sb, struct cifs_se * with the session's channel lock. This should be called whenever the maximum * allowed channels for a session changes (e.g., after a remount or reconfigure). */ -static void smb3_sync_ses_chan_max(struct cifs_ses *ses, unsigned int max_channels) +static void smb3_sync_ses_chan_max(struct cifs_ses *ses, size_t max_channels) { spin_lock(&ses->chan_lock); ses->chan_max = max_channels; @@ -1076,12 +1085,15 @@ static void smb3_sync_ses_chan_max(struct cifs_ses *ses, unsigned int max_channe static int smb3_reconfigure(struct fs_context *fc) { struct smb3_fs_context *ctx = smb3_fc2context(fc); + struct smb3_fs_context *new_ctx = NULL; + struct smb3_fs_context *old_ctx = NULL; struct dentry *root = fc->root; struct cifs_sb_info *cifs_sb = CIFS_SB(root->d_sb); struct cifs_ses *ses = cifs_sb_master_tcon(cifs_sb)->ses; unsigned int rsize = ctx->rsize, wsize = ctx->wsize; char *new_password = NULL, *new_password2 = NULL; bool need_recon = false; + bool need_mchan_update; int rc; if (ses->expired_pwd) @@ -1091,6 +1103,16 @@ static int smb3_reconfigure(struct fs_context *fc) if (rc) return rc; + old_ctx = kzalloc_obj(*old_ctx); + if (!old_ctx) + return -ENOMEM; + + rc = smb3_fs_context_dup(old_ctx, cifs_sb->ctx); + if (rc) { + kfree(old_ctx); + return rc; + } + /* * We can not change UNC/username/password/domainname/ * workstation_name/nodename/iocharset @@ -1100,16 +1122,22 @@ static int smb3_reconfigure(struct fs_context *fc) STEAL_STRING(cifs_sb, ctx, UNC); STEAL_STRING(cifs_sb, ctx, source); STEAL_STRING(cifs_sb, ctx, username); + STEAL_STRING(cifs_sb, ctx, domainname); + STEAL_STRING(cifs_sb, ctx, nodename); + STEAL_STRING(cifs_sb, ctx, iocharset); - if (need_recon == false) + if (!need_recon) { STEAL_STRING_SENSITIVE(cifs_sb, ctx, password); - else { + } else { if (ctx->password) { new_password = kstrdup(ctx->password, GFP_KERNEL); - if (!new_password) - return -ENOMEM; - } else + if (!new_password) { + rc = -ENOMEM; + goto restore_ctx; + } + } else { STEAL_STRING_SENSITIVE(cifs_sb, ctx, password); + } } /* @@ -1119,11 +1147,29 @@ static int smb3_reconfigure(struct fs_context *fc) if (ctx->password2) { new_password2 = kstrdup(ctx->password2, GFP_KERNEL); if (!new_password2) { - kfree_sensitive(new_password); - return -ENOMEM; + rc = -ENOMEM; + goto restore_ctx; } - } else + } else { STEAL_STRING_SENSITIVE(cifs_sb, ctx, password2); + } + + /* if rsize or wsize not passed in on remount, use previous values */ + ctx->rsize = rsize ? CIFS_ALIGN_RSIZE(fc, rsize) : cifs_sb->ctx->rsize; + ctx->wsize = wsize ? CIFS_ALIGN_WSIZE(fc, wsize) : cifs_sb->ctx->wsize; + + new_ctx = kzalloc_obj(*new_ctx); + if (!new_ctx) { + rc = -ENOMEM; + goto restore_ctx; + } + + rc = smb3_fs_context_dup(new_ctx, ctx); + if (rc) + goto restore_ctx; + + need_mchan_update = ctx->multichannel != cifs_sb->ctx->multichannel || + ctx->max_channels != cifs_sb->ctx->max_channels; /* * we may update the passwords in the ses struct below. Make sure we do @@ -1134,54 +1180,55 @@ static int smb3_reconfigure(struct fs_context *fc) /* * smb2_reconnect may swap password and password2 in case session setup * failed. First get ctx passwords in sync with ses passwords. It should - * be okay to do this even if this function were to return an error at a - * later stage + * be done before committing new passwords. */ rc = smb3_sync_session_ctx_passwords(cifs_sb, ses); if (rc) { mutex_unlock(&ses->session_mutex); - kfree_sensitive(new_password); - kfree_sensitive(new_password2); - return rc; + goto cleanup_new_ctx; + } + + /* + * If multichannel or max_channels has changed, update the session's channels accordingly. + * This may add or remove channels to match the new configuration. + */ + if (need_mchan_update) { + /* Prevent concurrent scaling operations */ + spin_lock(&ses->ses_lock); + if (ses->flags & CIFS_SES_FLAG_SCALE_CHANNELS) { + spin_unlock(&ses->ses_lock); + mutex_unlock(&ses->session_mutex); + rc = -EINVAL; + goto cleanup_new_ctx; + } + ses->flags |= CIFS_SES_FLAG_SCALE_CHANNELS; + spin_unlock(&ses->ses_lock); } /* - * now that allocations for passwords are done, commit them + * Commit session passwords before any channel work so newly added + * channels authenticate with the new credentials. */ if (new_password) { kfree_sensitive(ses->password); ses->password = new_password; + new_password = NULL; } if (new_password2) { kfree_sensitive(ses->password2); ses->password2 = new_password2; + new_password2 = NULL; } - /* - * If multichannel or max_channels has changed, update the session's channels accordingly. - * This may add or remove channels to match the new configuration. - */ - if ((ctx->multichannel != cifs_sb->ctx->multichannel) || - (ctx->max_channels != cifs_sb->ctx->max_channels)) { - + if (need_mchan_update) { /* Synchronize ses->chan_max with the new mount context */ smb3_sync_ses_chan_max(ses, ctx->max_channels); - /* Now update the session's channels to match the new configuration */ - /* Prevent concurrent scaling operations */ - spin_lock(&ses->ses_lock); - if (ses->flags & CIFS_SES_FLAG_SCALE_CHANNELS) { - spin_unlock(&ses->ses_lock); - mutex_unlock(&ses->session_mutex); - return -EINVAL; - } - ses->flags |= CIFS_SES_FLAG_SCALE_CHANNELS; - spin_unlock(&ses->ses_lock); mutex_unlock(&ses->session_mutex); - rc = smb3_update_ses_channels(ses, ses->server, - false /* from_reconnect */, - false /* disable_mchan */); + smb3_update_ses_channels(ses, ses->server, + false /* from_reconnect */, + false /* disable_mchan */); /* Clear scaling flag after operation */ spin_lock(&ses->ses_lock); @@ -1191,16 +1238,12 @@ static int smb3_reconfigure(struct fs_context *fc) mutex_unlock(&ses->session_mutex); } - STEAL_STRING(cifs_sb, ctx, domainname); - STEAL_STRING(cifs_sb, ctx, nodename); - STEAL_STRING(cifs_sb, ctx, iocharset); - - /* if rsize or wsize not passed in on remount, use previous values */ - ctx->rsize = rsize ? CIFS_ALIGN_RSIZE(fc, rsize) : cifs_sb->ctx->rsize; - ctx->wsize = wsize ? CIFS_ALIGN_WSIZE(fc, wsize) : cifs_sb->ctx->wsize; - smb3_cleanup_fs_context_contents(cifs_sb->ctx); - rc = smb3_fs_context_dup(cifs_sb->ctx, ctx); + memcpy(cifs_sb->ctx, new_ctx, sizeof(*new_ctx)); + kfree(new_ctx); + new_ctx = NULL; + smb3_cleanup_fs_context(old_ctx); + old_ctx = NULL; smb3_update_mnt_flags(cifs_sb); #ifdef CONFIG_CIFS_DFS_UPCALL if (!rc) @@ -1208,6 +1251,18 @@ static int smb3_reconfigure(struct fs_context *fc) #endif return rc; + +cleanup_new_ctx: + smb3_cleanup_fs_context_contents(new_ctx); +restore_ctx: + kfree(new_ctx); + kfree_sensitive(new_password); + kfree_sensitive(new_password2); + smb3_cleanup_fs_context_contents(cifs_sb->ctx); + memcpy(cifs_sb->ctx, old_ctx, sizeof(*old_ctx)); + kfree(old_ctx); + + return rc; } static int smb3_fs_context_parse_param(struct fs_context *fc, diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c index 888f9e35f14b..5b1beba77c0e 100644 --- a/fs/smb/client/inode.c +++ b/fs/smb/client/inode.c @@ -119,7 +119,7 @@ cifs_revalidate_cache(struct inode *inode, struct cifs_fattr *fattr) fattr->cf_mtime = timestamp_truncate(fattr->cf_mtime, inode); mtime = inode_get_mtime(inode); if (timespec64_equal(&mtime, &fattr->cf_mtime) && - cifs_i->netfs.remote_i_size == fattr->cf_eof) { + netfs_read_remote_i_size(inode) == fattr->cf_eof) { cifs_dbg(FYI, "%s: inode %llu is unchanged\n", __func__, cifs_i->uniqueid); return; @@ -173,12 +173,12 @@ cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr, CIFS_I(inode)->time = 0; /* force reval */ return -ESTALE; } - if (inode_state_read_once(inode) & I_NEW) - CIFS_I(inode)->netfs.zero_point = fattr->cf_eof; - cifs_revalidate_cache(inode, fattr); spin_lock(&inode->i_lock); + if (inode_state_read_once(inode) & I_NEW) + netfs_write_zero_point(inode, fattr->cf_eof); + fattr->cf_mtime = timestamp_truncate(fattr->cf_mtime, inode); fattr->cf_atime = timestamp_truncate(fattr->cf_atime, inode); fattr->cf_ctime = timestamp_truncate(fattr->cf_ctime, inode); @@ -212,7 +212,7 @@ cifs_fattr_to_inode(struct inode *inode, struct cifs_fattr *fattr, else clear_bit(CIFS_INO_DELETE_PENDING, &cifs_i->flags); - cifs_i->netfs.remote_i_size = fattr->cf_eof; + netfs_write_remote_i_size(inode, fattr->cf_eof); /* * Can't safely change the file size here if the client is writing to * it due to potential races. @@ -2771,7 +2771,9 @@ cifs_revalidate_mapping(struct inode *inode) if (cifs_sb_flags(cifs_sb) & CIFS_MOUNT_RW_CACHE) goto skip_invalidate; - cifs_inode->netfs.zero_point = cifs_inode->netfs.remote_i_size; + spin_lock(&inode->i_lock); + netfs_write_zero_point(inode, netfs_inode(inode)->_remote_i_size); + spin_unlock(&inode->i_lock); rc = filemap_invalidate_inode(inode, true, 0, LLONG_MAX); if (rc) { cifs_dbg(VFS, "%s: invalidate inode %p failed with rc %d\n", diff --git a/fs/smb/client/netlink.c b/fs/smb/client/netlink.c index 147d9409252c..0dd10913c37a 100644 --- a/fs/smb/client/netlink.c +++ b/fs/smb/client/netlink.c @@ -33,13 +33,17 @@ static const struct nla_policy cifs_genl_policy[CIFS_GENL_ATTR_MAX + 1] = { static const struct genl_ops cifs_genl_ops[] = { { .cmd = CIFS_GENL_CMD_SWN_NOTIFY, + .flags = GENL_ADMIN_PERM, .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP, .doit = cifs_swn_notify, }, }; static const struct genl_multicast_group cifs_genl_mcgrps[] = { - [CIFS_GENL_MCGRP_SWN] = { .name = CIFS_GENL_MCGRP_SWN_NAME }, + [CIFS_GENL_MCGRP_SWN] = { + .name = CIFS_GENL_MCGRP_SWN_NAME, + .flags = GENL_MCAST_CAP_NET_ADMIN, + }, }; struct genl_family cifs_genl_family = { diff --git a/fs/smb/client/readdir.c b/fs/smb/client/readdir.c index be22bbc4a65a..e860fa08b5e3 100644 --- a/fs/smb/client/readdir.c +++ b/fs/smb/client/readdir.c @@ -143,7 +143,8 @@ retry: fattr->cf_rdev = inode->i_rdev; fattr->cf_uid = inode->i_uid; fattr->cf_gid = inode->i_gid; - fattr->cf_eof = CIFS_I(inode)->netfs.remote_i_size; + fattr->cf_eof = + netfs_read_remote_i_size(inode); fattr->cf_symlink_target = NULL; } else { CIFS_I(inode)->time = 0; diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c index ccc06c83956b..a07d72cd16dc 100644 --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -3402,8 +3402,7 @@ static long smb3_zero_range(struct file *file, struct cifs_tcon *tcon, struct inode *inode = file_inode(file); struct cifsInodeInfo *cifsi = CIFS_I(inode); struct cifsFileInfo *cfile = file->private_data; - struct netfs_inode *ictx = netfs_inode(inode); - unsigned long long i_size, new_size, remote_size; + unsigned long long i_size, new_size, remote_i_size, zero_point; long rc; unsigned int xid; @@ -3414,9 +3413,8 @@ static long smb3_zero_range(struct file *file, struct cifs_tcon *tcon, filemap_invalidate_lock(inode->i_mapping); - i_size = i_size_read(inode); - remote_size = ictx->remote_i_size; - if (offset + len >= remote_size && offset < i_size) { + netfs_read_sizes(inode, &i_size, &remote_i_size, &zero_point); + if (offset + len >= remote_i_size && offset < i_size) { unsigned long long top = umin(offset + len, i_size); rc = filemap_write_and_wait_range(inode->i_mapping, offset, top - 1); @@ -3449,9 +3447,11 @@ static long smb3_zero_range(struct file *file, struct cifs_tcon *tcon, cfile->fid.volatile_fid, cfile->pid, new_size); if (rc >= 0) { truncate_setsize(inode, new_size); + spin_lock(&inode->i_lock); netfs_resize_file(&cifsi->netfs, new_size, true); - if (offset < cifsi->netfs.zero_point) - cifsi->netfs.zero_point = offset; + if (offset < cifsi->netfs._zero_point) + netfs_write_zero_point(inode, offset); + spin_unlock(&inode->i_lock); fscache_resize_cookie(cifs_inode_cookie(inode), new_size); } } @@ -3474,7 +3474,7 @@ static long smb3_punch_hole(struct file *file, struct cifs_tcon *tcon, struct inode *inode = file_inode(file); struct cifsFileInfo *cfile = file->private_data; struct file_zero_data_information fsctl_buf; - unsigned long long end = offset + len, i_size, remote_i_size; + unsigned long long end = offset + len, i_size, remote_i_size, zero_point; long rc; unsigned int xid; __u8 set_sparse = 1; @@ -3516,14 +3516,17 @@ static long smb3_punch_hole(struct file *file, struct cifs_tcon *tcon, * that we locally hole-punch the tail of the dirty data, the proposed * EOF update will end up in the wrong place. */ - i_size = i_size_read(inode); - remote_i_size = netfs_inode(inode)->remote_i_size; + netfs_read_sizes(inode, &i_size, &remote_i_size, &zero_point); + if (end > remote_i_size && i_size > remote_i_size) { unsigned long long extend_to = umin(end, i_size); rc = SMB2_set_eof(xid, tcon, cfile->fid.persistent_fid, cfile->fid.volatile_fid, cfile->pid, extend_to); - if (rc >= 0) - netfs_inode(inode)->remote_i_size = extend_to; + if (rc >= 0) { + spin_lock(&inode->i_lock); + netfs_write_remote_i_size(inode, extend_to); + spin_unlock(&inode->i_lock); + } } unlock: @@ -3787,7 +3790,6 @@ static long smb3_collapse_range(struct file *file, struct cifs_tcon *tcon, struct inode *inode = file_inode(file); struct cifsInodeInfo *cifsi = CIFS_I(inode); struct cifsFileInfo *cfile = file->private_data; - struct netfs_inode *ictx = &cifsi->netfs; loff_t old_eof, new_eof; xid = get_xid(); @@ -3805,7 +3807,9 @@ static long smb3_collapse_range(struct file *file, struct cifs_tcon *tcon, goto out_2; truncate_pagecache_range(inode, off, old_eof); - ictx->zero_point = old_eof; + spin_lock(&inode->i_lock); + netfs_write_zero_point(inode, old_eof); + spin_unlock(&inode->i_lock); netfs_wait_for_outstanding_io(inode); rc = smb2_copychunk_range(xid, cfile, cfile, off + len, @@ -3822,8 +3826,10 @@ static long smb3_collapse_range(struct file *file, struct cifs_tcon *tcon, rc = 0; truncate_setsize(inode, new_eof); + spin_lock(&inode->i_lock); netfs_resize_file(&cifsi->netfs, new_eof, true); - ictx->zero_point = new_eof; + netfs_write_zero_point(inode, new_eof); + spin_unlock(&inode->i_lock); fscache_resize_cookie(cifs_inode_cookie(inode), new_eof); out_2: filemap_invalidate_unlock(inode->i_mapping); @@ -3866,13 +3872,17 @@ static long smb3_insert_range(struct file *file, struct cifs_tcon *tcon, goto out_2; truncate_setsize(inode, new_eof); + spin_lock(&inode->i_lock); netfs_resize_file(&cifsi->netfs, i_size_read(inode), true); + spin_unlock(&inode->i_lock); fscache_resize_cookie(cifs_inode_cookie(inode), i_size_read(inode)); rc = smb2_copychunk_range(xid, cfile, cfile, off, count, off + len); if (rc < 0) goto out_2; - cifsi->netfs.zero_point = new_eof; + spin_lock(&inode->i_lock); + netfs_write_zero_point(inode, new_eof); + spin_unlock(&inode->i_lock); rc = smb3_zero_data(file, tcon, off, len, xid); if (rc < 0) @@ -4825,7 +4835,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid, } /* Copy the data to the output I/O iterator. */ - rdata->result = cifs_copy_folioq_to_iter(buffer, buffer_len, + rdata->result = cifs_copy_folioq_to_iter(buffer, data_len, cur_off, &rdata->subreq.io_iter); if (rdata->result != 0) { if (is_offloaded) @@ -4834,7 +4844,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid, dequeue_mid(server, mid, rdata->result); return 0; } - rdata->got_bytes = buffer_len; + rdata->got_bytes = data_len; } else if (buf_len >= data_offset + data_len) { /* read response payload is in buf */ diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index 5188218c25be..967047894a1e 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -4596,6 +4596,7 @@ smb2_readv_callback(struct TCP_Server_Info *server, struct mid_q_entry *mid) struct netfs_inode *ictx = netfs_inode(rdata->rreq->inode); struct cifs_tcon *tcon = tlink_tcon(rdata->req->cfile->tlink); struct smb2_hdr *shdr = (struct smb2_hdr *)rdata->iov[0].iov_base; + struct inode *inode = &ictx->inode; struct cifs_credits credits = { .value = 0, .instance = 0, @@ -4709,7 +4710,7 @@ do_retry: } else { size_t trans = rdata->subreq.transferred + rdata->got_bytes; if (trans < rdata->subreq.len && - rdata->subreq.start + trans >= ictx->remote_i_size) { + rdata->subreq.start + trans >= netfs_read_remote_i_size(inode)) { __set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags); rdata->result = 0; } diff --git a/fs/smb/client/smb2transport.c b/fs/smb/client/smb2transport.c index bcd7ec9c9521..0a2fa2e6b72d 100644 --- a/fs/smb/client/smb2transport.c +++ b/fs/smb/client/smb2transport.c @@ -176,7 +176,9 @@ smb2_find_smb_sess_tcon_unlocked(struct cifs_ses *ses, __u32 tid) list_for_each_entry(tcon, &ses->tcon_list, tcon_list) { if (tcon->tid != tid) continue; + spin_lock(&tcon->tc_lock); ++tcon->tc_count; + spin_unlock(&tcon->tc_lock); trace_smb3_tcon_ref(tcon->debug_id, tcon->tc_count, netfs_trace_tcon_ref_get_find_sess_tcon); return tcon; diff --git a/fs/smb/server/oplock.c b/fs/smb/server/oplock.c index cd3f28b0e7cb..f5ec1283b16e 100644 --- a/fs/smb/server/oplock.c +++ b/fs/smb/server/oplock.c @@ -484,8 +484,12 @@ static inline int compare_guid_key(struct oplock_info *opinfo, const char *guid1, const char *key1) { const char *guid2, *key2; + struct ksmbd_conn *conn; - guid2 = opinfo->conn->ClientGUID; + conn = READ_ONCE(opinfo->conn); + if (!conn) + return 0; + guid2 = conn->ClientGUID; key2 = opinfo->o_lease->lease_key; if (!memcmp(guid1, guid2, SMB2_CLIENT_GUID_SIZE) && !memcmp(key1, key2, SMB2_LEASE_KEY_SIZE)) diff --git a/fs/smb/server/smb2pdu.c b/fs/smb/server/smb2pdu.c index c3c7688f0fa8..3a8a739c025f 100644 --- a/fs/smb/server/smb2pdu.c +++ b/fs/smb/server/smb2pdu.c @@ -3803,8 +3803,19 @@ err_out2: ksmbd_debug(SMB, "Error response: %x\n", rsp->hdr.Status); } - if (dh_info.reconnected) - ksmbd_put_durable_fd(dh_info.fp); + if (dh_info.reconnected) { + /* + * If reconnect succeeded, fp was republished in the + * session file table. On a later error, ksmbd_fd_put() + * above drops the session reference; drop the durable + * lookup reference through the same session-aware path so + * final close removes the volatile id before freeing fp. + */ + if (rc && fp == dh_info.fp) + ksmbd_fd_put(work, dh_info.fp); + else + ksmbd_put_durable_fd(dh_info.fp); + } kfree(name); kfree(lc); diff --git a/fs/smb/server/smbacl.c b/fs/smb/server/smbacl.c index c1d1f34581d6..c2d9be52a311 100644 --- a/fs/smb/server/smbacl.c +++ b/fs/smb/server/smbacl.c @@ -643,8 +643,10 @@ static void set_posix_acl_entries_dacl(struct mnt_idmap *idmap, ntace = (struct smb_ace *)((char *)pndace + *size); ace_sz = fill_ace_for_sid(ntace, sid, ACCESS_ALLOWED, flags, pace->e_perm, 0777); - if (check_add_overflow(*size, ace_sz, size)) + if (check_add_overflow(*size, ace_sz, size)) { + kfree(sid); break; + } (*num_aces)++; if (pace->e_tag == ACL_USER) ntace->access_req |= @@ -655,8 +657,10 @@ static void set_posix_acl_entries_dacl(struct mnt_idmap *idmap, ntace = (struct smb_ace *)((char *)pndace + *size); ace_sz = fill_ace_for_sid(ntace, sid, ACCESS_ALLOWED, 0x03, pace->e_perm, 0777); - if (check_add_overflow(*size, ace_sz, size)) + if (check_add_overflow(*size, ace_sz, size)) { + kfree(sid); break; + } (*num_aces)++; if (pace->e_tag == ACL_USER) ntace->access_req |= @@ -698,8 +702,10 @@ posix_default_acl: ntace = (struct smb_ace *)((char *)pndace + *size); ace_sz = fill_ace_for_sid(ntace, sid, ACCESS_ALLOWED, 0x0b, pace->e_perm, 0777); - if (check_add_overflow(*size, ace_sz, size)) + if (check_add_overflow(*size, ace_sz, size)) { + kfree(sid); break; + } (*num_aces)++; if (pace->e_tag == ACL_USER) ntace->access_req |= @@ -1090,6 +1096,40 @@ static int smb_append_inherited_ace(struct smb_ace **ace, int *nt_size, return 0; } +static int smb_validate_ntsd_sid(struct smb_ntsd *pntsd, size_t pntsd_size, + unsigned int sid_offset, struct smb_sid **sid, + size_t *sid_size) +{ + size_t sid_end; + + *sid = NULL; + *sid_size = 0; + + if (!sid_offset) + return 0; + + if (sid_offset < sizeof(struct smb_ntsd) || + check_add_overflow(sid_offset, (size_t)CIFS_SID_BASE_SIZE, + &sid_end) || + sid_end > pntsd_size) + return -EINVAL; + + *sid = (struct smb_sid *)((char *)pntsd + sid_offset); + if ((*sid)->num_subauth > SID_MAX_SUB_AUTHORITIES) + return -EINVAL; + + if (check_add_overflow((size_t)CIFS_SID_BASE_SIZE, + sizeof(__le32) * (size_t)(*sid)->num_subauth, + &sid_end)) + return -EINVAL; + + if (sid_offset > pntsd_size || sid_end > pntsd_size - sid_offset) + return -EINVAL; + + *sid_size = sid_end; + return 0; +} + int smb_inherit_dacl(struct ksmbd_conn *conn, const struct path *path, unsigned int uid, unsigned int gid) @@ -1102,28 +1142,28 @@ int smb_inherit_dacl(struct ksmbd_conn *conn, struct dentry *parent = path->dentry->d_parent; struct mnt_idmap *idmap = mnt_idmap(path->mnt); int inherited_flags = 0, flags = 0, i, nt_size = 0, pdacl_size; - int rc = 0, pntsd_type, pntsd_size, acl_len, aces_size; + int rc = 0, pntsd_type, ppntsd_size, acl_len, aces_size; unsigned int dacloffset; size_t dacl_struct_end; u16 num_aces, ace_cnt = 0; char *aces_base; bool is_dir = S_ISDIR(d_inode(path->dentry)->i_mode); - pntsd_size = ksmbd_vfs_get_sd_xattr(conn, idmap, + ppntsd_size = ksmbd_vfs_get_sd_xattr(conn, idmap, parent, &parent_pntsd); - if (pntsd_size <= 0) + if (ppntsd_size <= 0) return -ENOENT; dacloffset = le32_to_cpu(parent_pntsd->dacloffset); if (!dacloffset || check_add_overflow(dacloffset, sizeof(struct smb_acl), &dacl_struct_end) || - dacl_struct_end > (size_t)pntsd_size) { + dacl_struct_end > (size_t)ppntsd_size) { rc = -EINVAL; goto free_parent_pntsd; } parent_pdacl = (struct smb_acl *)((char *)parent_pntsd + dacloffset); - acl_len = pntsd_size - dacloffset; + acl_len = ppntsd_size - dacloffset; num_aces = le16_to_cpu(parent_pdacl->num_aces); pntsd_type = le16_to_cpu(parent_pntsd->type); pdacl_size = le16_to_cpu(parent_pdacl->size); @@ -1237,19 +1277,19 @@ pass: struct smb_ntsd *pntsd; struct smb_acl *pdacl; struct smb_sid *powner_sid = NULL, *pgroup_sid = NULL; - int powner_sid_size = 0, pgroup_sid_size = 0, pntsd_size; + size_t powner_sid_size = 0, pgroup_sid_size = 0, pntsd_size; size_t pntsd_alloc_size; - if (parent_pntsd->osidoffset) { - powner_sid = (struct smb_sid *)((char *)parent_pntsd + - le32_to_cpu(parent_pntsd->osidoffset)); - powner_sid_size = 1 + 1 + 6 + (powner_sid->num_subauth * 4); - } - if (parent_pntsd->gsidoffset) { - pgroup_sid = (struct smb_sid *)((char *)parent_pntsd + - le32_to_cpu(parent_pntsd->gsidoffset)); - pgroup_sid_size = 1 + 1 + 6 + (pgroup_sid->num_subauth * 4); - } + rc = smb_validate_ntsd_sid(parent_pntsd, ppntsd_size, + le32_to_cpu(parent_pntsd->osidoffset), + &powner_sid, &powner_sid_size); + if (rc) + goto free_aces_base; + rc = smb_validate_ntsd_sid(parent_pntsd, ppntsd_size, + le32_to_cpu(parent_pntsd->gsidoffset), + &pgroup_sid, &pgroup_sid_size); + if (rc) + goto free_aces_base; if (check_add_overflow(sizeof(struct smb_ntsd), (size_t)powner_sid_size, diff --git a/fs/smb/server/vfs_cache.c b/fs/smb/server/vfs_cache.c index 3551f01a3fa0..daed487d7304 100644 --- a/fs/smb/server/vfs_cache.c +++ b/fs/smb/server/vfs_cache.c @@ -81,7 +81,7 @@ static int proc_show_files(struct seq_file *m, void *v) read_lock(&global_ft.lock); idr_for_each_entry(global_ft.idr, fp, id) { seq_printf(m, "%#-10x %#-10llx %#-10llx %#-10x", - fp->tcon->id, + fp->tcon ? fp->tcon->id : 0, fp->persistent_id, fp->volatile_id, atomic_read(&fp->refcount)); @@ -211,7 +211,7 @@ int ksmbd_query_inode_status(struct dentry *dentry) return ret; down_read(&ci->m_lock); - if (ci->m_flags & (S_DEL_PENDING | S_DEL_ON_CLS)) + if (ci->m_flags & S_DEL_PENDING) ret = KSMBD_INODE_STATUS_PENDING_DELETE; else ret = KSMBD_INODE_STATUS_OK; @@ -227,7 +227,7 @@ bool ksmbd_inode_pending_delete(struct ksmbd_file *fp) int ret; down_read(&ci->m_lock); - ret = (ci->m_flags & (S_DEL_PENDING | S_DEL_ON_CLS)); + ret = (ci->m_flags & S_DEL_PENDING); up_read(&ci->m_lock); return ret; @@ -395,12 +395,20 @@ static void __ksmbd_inode_close(struct ksmbd_file *fp) } } + down_write(&ci->m_lock); + /* Promote S_DEL_ON_CLS to S_DEL_PENDING when close */ + if (ci->m_flags & S_DEL_ON_CLS) { + ci->m_flags &= ~S_DEL_ON_CLS; + ci->m_flags |= S_DEL_PENDING; + } + up_write(&ci->m_lock); + if (atomic_dec_and_test(&ci->m_count)) { bool do_unlink = false; down_write(&ci->m_lock); - if (ci->m_flags & (S_DEL_ON_CLS | S_DEL_PENDING)) { - ci->m_flags &= ~(S_DEL_ON_CLS | S_DEL_PENDING); + if (ci->m_flags & S_DEL_PENDING) { + ci->m_flags &= ~S_DEL_PENDING; do_unlink = true; } up_write(&ci->m_lock); @@ -418,6 +426,14 @@ static void __ksmbd_remove_durable_fd(struct ksmbd_file *fp) return; idr_remove(global_ft.idr, fp->persistent_id); + /* + * Clear persistent_id so a later __ksmbd_close_fd() that runs from a + * delayed putter (e.g. when a concurrent ksmbd_lookup_fd_inode() + * walker held the final reference) does not re-issue idr_remove() on + * an id that idr_alloc_cyclic() may have already handed out to a new + * durable handle. + */ + fp->persistent_id = KSMBD_NO_FID; } static void ksmbd_remove_durable_fd(struct ksmbd_file *fp) @@ -510,6 +526,20 @@ static struct ksmbd_file *__ksmbd_lookup_fd(struct ksmbd_file_table *ft, static void __put_fd_final(struct ksmbd_work *work, struct ksmbd_file *fp) { + /* + * Detached durable fp -- session_fd_check() cleared fp->conn at + * preserve, so this fp is no longer tracked by any conn's + * stats.open_files_count. This happens when + * ksmbd_scavenger_dispose_dh() hands the final close off to an + * m_fp_list walker (e.g. ksmbd_lookup_fd_inode()) whose work->conn + * is unrelated to the conn that originally opened the handle; close + * via the NULL-ft path so we do not underflow that unrelated + * counter. + */ + if (!fp->conn) { + __ksmbd_close_fd(NULL, fp); + return; + } __ksmbd_close_fd(&work->sess->file_table, fp); atomic_dec(&work->conn->stats.open_files_count); } @@ -881,24 +911,37 @@ static bool ksmbd_durable_scavenger_alive(void) return true; } -static void ksmbd_scavenger_dispose_dh(struct list_head *head) +static void ksmbd_scavenger_dispose_dh(struct ksmbd_file *fp) { - while (!list_empty(head)) { - struct ksmbd_file *fp; + /* + * Durable-preserved fp can remain linked on f_ci->m_fp_list for + * share-mode checks. Unlink it before final close; fp->node is not + * available as a scavenger-private list node because re-adding it to + * another list corrupts m_fp_list. + */ + down_write(&fp->f_ci->m_lock); + list_del_init(&fp->node); + up_write(&fp->f_ci->m_lock); - fp = list_first_entry(head, struct ksmbd_file, node); - list_del_init(&fp->node); + /* + * Drop both the durable lifetime reference and the transient reference + * taken by the scavenger under global_ft.lock. If a concurrent + * ksmbd_lookup_fd_inode() (or any other m_fp_list walker) snatched fp + * before the unlink above, that holder owns the final close via + * ksmbd_fd_put() -> __ksmbd_close_fd(). Otherwise the scavenger is + * the last putter and finalises fp here. + */ + if (atomic_sub_and_test(2, &fp->refcount)) __ksmbd_close_fd(NULL, fp); - } } static int ksmbd_durable_scavenger(void *dummy) { struct ksmbd_file *fp = NULL; + struct ksmbd_file *expired_fp; unsigned int id; unsigned int min_timeout = 1; bool found_fp_timeout; - LIST_HEAD(scavenger_list); unsigned long remaining_jiffies; __module_get(THIS_MODULE); @@ -908,8 +951,6 @@ static int ksmbd_durable_scavenger(void *dummy) if (try_to_freeze()) continue; - found_fp_timeout = false; - remaining_jiffies = wait_event_timeout(dh_wq, ksmbd_durable_scavenger_alive() == false, __msecs_to_jiffies(min_timeout)); @@ -918,23 +959,39 @@ static int ksmbd_durable_scavenger(void *dummy) else min_timeout = DURABLE_HANDLE_MAX_TIMEOUT; - write_lock(&global_ft.lock); - idr_for_each_entry(global_ft.idr, fp, id) { - if (!fp->durable_timeout) - continue; - - if (atomic_read(&fp->refcount) > 1 || - fp->conn) - continue; + do { + expired_fp = NULL; + found_fp_timeout = false; - found_fp_timeout = true; - if (fp->durable_scavenger_timeout <= - jiffies_to_msecs(jiffies)) { - __ksmbd_remove_durable_fd(fp); - list_add(&fp->node, &scavenger_list); - } else { + write_lock(&global_ft.lock); + idr_for_each_entry(global_ft.idr, fp, id) { unsigned long durable_timeout; + if (!fp->durable_timeout) + continue; + + if (atomic_read(&fp->refcount) > 1 || + fp->conn) + continue; + + found_fp_timeout = true; + if (fp->durable_scavenger_timeout <= + jiffies_to_msecs(jiffies)) { + __ksmbd_remove_durable_fd(fp); + /* + * Take a transient reference so fp + * cannot be freed by an in-flight + * ksmbd_lookup_fd_inode() that found + * it through f_ci->m_fp_list while we + * drop global_ft.lock and reach the + * m_fp_list unlink in + * ksmbd_scavenger_dispose_dh(). + */ + atomic_inc(&fp->refcount); + expired_fp = fp; + break; + } + durable_timeout = fp->durable_scavenger_timeout - jiffies_to_msecs(jiffies); @@ -942,10 +999,11 @@ static int ksmbd_durable_scavenger(void *dummy) if (min_timeout > durable_timeout) min_timeout = durable_timeout; } - } - write_unlock(&global_ft.lock); + write_unlock(&global_ft.lock); - ksmbd_scavenger_dispose_dh(&scavenger_list); + if (expired_fp) + ksmbd_scavenger_dispose_dh(expired_fp); + } while (expired_fp); if (found_fp_timeout == false) break; diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c index 989edd6c6c23..461d6b93f443 100644 --- a/fs/sysfs/group.c +++ b/fs/sysfs/group.c @@ -188,7 +188,7 @@ static int internal_create_group(struct kobject *kobj, int update, kernfs_get(kn); error = create_files(kn, kobj, uid, gid, grp, update); if (error) { - if (grp->name) + if (grp->name && !update) kernfs_remove(kn); } kernfs_put(kn); diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index e83b2ec5e49f..d0976c874b74 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -610,10 +610,14 @@ static long zonefs_fname_to_fno(const struct qstr *fname) return c - '0'; for (i = 0, rname = name + len - 1; i < len; i++, rname--) { + long digit; + c = *rname; if (!isdigit(c)) return -ENOENT; - fno += (c - '0') * shift; + digit = (c - '0') * shift; + if (check_add_overflow(fno, digit, &fno)) + return -ENOENT; shift *= 10; } diff --git a/include/asm-generic/kprobes.h b/include/asm-generic/kprobes.h index 060eab094e5a..5290a2b2e15a 100644 --- a/include/asm-generic/kprobes.h +++ b/include/asm-generic/kprobes.h @@ -14,7 +14,7 @@ static unsigned long __used \ _kbl_addr_##fname = (unsigned long)fname; # define NOKPROBE_SYMBOL(fname) __NOKPROBE_SYMBOL(fname) /* Use this to forbid a kprobes attach on very low level functions */ -# define __kprobes __section(".kprobes.text") +# define __kprobes notrace __section(".kprobes.text") # define nokprobe_inline __always_inline #else # define NOKPROBE_SYMBOL(fname) diff --git a/include/asm-generic/ring_buffer.h b/include/asm-generic/ring_buffer.h new file mode 100644 index 000000000000..201d2aee1005 --- /dev/null +++ b/include/asm-generic/ring_buffer.h @@ -0,0 +1,13 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Generic arch dependent ring_buffer macros. + */ +#ifndef __ASM_GENERIC_RING_BUFFER_H__ +#define __ASM_GENERIC_RING_BUFFER_H__ + +#include <linux/cacheflush.h> + +/* Flush cache on ring buffer range if needed. Do nothing by default. */ +#define arch_ring_buffer_flush_range(start, end) do { } while (0) + +#endif /* __ASM_GENERIC_RING_BUFFER_H__ */ diff --git a/include/crypto/krb5.h b/include/crypto/krb5.h index 71dd38f59be1..aac3ecf88467 100644 --- a/include/crypto/krb5.h +++ b/include/crypto/krb5.h @@ -121,9 +121,12 @@ size_t crypto_krb5_how_much_buffer(const struct krb5_enctype *krb5, size_t crypto_krb5_how_much_data(const struct krb5_enctype *krb5, enum krb5_crypto_mode mode, size_t *_buffer_size, size_t *_offset); -void crypto_krb5_where_is_the_data(const struct krb5_enctype *krb5, - enum krb5_crypto_mode mode, - size_t *_offset, size_t *_len); +int crypto_krb5_where_is_the_data(const struct krb5_enctype *krb5, + enum krb5_crypto_mode mode, + size_t *_offset, size_t *_len); +int crypto_krb5_check_data_len(const struct krb5_enctype *krb5, + enum krb5_crypto_mode mode, + size_t len, size_t min_content); struct crypto_aead *crypto_krb5_prepare_encryption(const struct krb5_enctype *krb5, const struct krb5_buffer *TK, u32 usage, gfp_t gfp); diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h index bc78fb77cc27..768a8dae83c5 100644 --- a/include/drm/drm_device.h +++ b/include/drm/drm_device.h @@ -375,6 +375,13 @@ struct drm_device { * Root directory for debugfs files. */ struct dentry *debugfs_root; + + /** + * @gem_lru_mutex: + * + * Lock protecting movement of GEM objects between LRUs. + */ + struct mutex gem_lru_mutex; }; void drm_dev_set_dma_dev(struct drm_device *dev, struct device *dma_dev); diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h index 86f5846154f7..8a704f6a65c1 100644 --- a/include/drm/drm_gem.h +++ b/include/drm/drm_gem.h @@ -245,18 +245,12 @@ struct drm_gem_object_funcs { * for lockless &shrinker.count_objects, and provides * &drm_gem_lru_scan for driver's &shrinker.scan_objects * implementation. + * + * Any access to this kind of object must be done with + * drm_device::gem_lru_mutex held. */ struct drm_gem_lru { /** - * @lock: - * - * Lock protecting movement of GEM objects between LRUs. All - * LRUs that the object can move between should be protected - * by the same lock. - */ - struct mutex *lock; - - /** * @count: * * The total number of backing pages of the GEM objects in @@ -453,6 +447,9 @@ struct drm_gem_object { * @lru: * * The current LRU list that the GEM object is on. + * + * Access to this field must be done with drm_device::gem_lru_mutex + * held. */ struct drm_gem_lru *lru; }; @@ -610,12 +607,13 @@ void drm_gem_unlock_reservations(struct drm_gem_object **objs, int count, int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev, u32 handle, u64 *offset); -void drm_gem_lru_init(struct drm_gem_lru *lru, struct mutex *lock); +void drm_gem_lru_init(struct drm_gem_lru *lru); void drm_gem_lru_remove(struct drm_gem_object *obj); void drm_gem_lru_move_tail_locked(struct drm_gem_lru *lru, struct drm_gem_object *obj); void drm_gem_lru_move_tail(struct drm_gem_lru *lru, struct drm_gem_object *obj); unsigned long -drm_gem_lru_scan(struct drm_gem_lru *lru, +drm_gem_lru_scan(struct drm_device *dev, + struct drm_gem_lru *lru, unsigned int nr_to_scan, unsigned long *remaining, bool (*shrink)(struct drm_gem_object *obj, struct ww_acquire_ctx *ticket), diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index d463b9b5a0a5..ac899cd0cd70 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -13,6 +13,7 @@ #include <linux/minmax.h> #include <linux/timer.h> #include <linux/workqueue.h> +#include <linux/completion.h> #include <linux/wait.h> #include <linux/bio.h> #include <linux/gfp.h> @@ -200,10 +201,14 @@ struct gendisk { u8 __rcu *zones_cond; unsigned int zone_wplugs_hash_bits; atomic_t nr_zone_wplugs; - spinlock_t zone_wplugs_lock; + spinlock_t zone_wplugs_hash_lock; struct mempool *zone_wplugs_pool; struct hlist_head *zone_wplugs_hash; struct workqueue_struct *zone_wplugs_wq; + spinlock_t zone_wplugs_list_lock; + struct list_head zone_wplugs_list; + struct task_struct *zone_wplugs_worker; + struct completion zone_wplugs_worker_bio_done; #endif /* CONFIG_BLK_DEV_ZONED */ #if IS_ENABLED(CONFIG_CDROM) @@ -668,6 +673,7 @@ enum { QUEUE_FLAG_NO_ELV_SWITCH, /* can't switch elevator any more */ QUEUE_FLAG_QOS_ENABLED, /* qos is enabled */ QUEUE_FLAG_BIO_ISSUE_TIME, /* record bio->issue_time_ns */ + QUEUE_FLAG_ZONED_QD1_WRITES, /* Limit zoned devices writes to QD=1 */ QUEUE_FLAG_MAX }; @@ -707,6 +713,8 @@ void blk_queue_flag_clear(unsigned int flag, struct request_queue *q); test_bit(QUEUE_FLAG_DISABLE_WBT_DEF, &(q)->queue_flags) #define blk_queue_no_elv_switch(q) \ test_bit(QUEUE_FLAG_NO_ELV_SWITCH, &(q)->queue_flags) +#define blk_queue_zoned_qd1_writes(q) \ + test_bit(QUEUE_FLAG_ZONED_QD1_WRITES, &(q)->queue_flags) extern void blk_set_pm_only(struct request_queue *q); extern void blk_clear_pm_only(struct request_queue *q); diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h index bc892e3b37ee..b61b9b7849df 100644 --- a/include/linux/cgroup.h +++ b/include/linux/cgroup.h @@ -715,6 +715,7 @@ static inline void cgroup_path_from_kernfs_id(u64 id, char *buf, size_t buflen) /* * cgroup scalable recursive statistics. */ +void __css_rstat_updated(struct cgroup_subsys_state *css, int cpu); void css_rstat_updated(struct cgroup_subsys_state *css, int cpu); void css_rstat_flush(struct cgroup_subsys_state *css); diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h index 0a3bcd1718f3..be1b38c981d4 100644 --- a/include/linux/fprobe.h +++ b/include/linux/fprobe.h @@ -94,6 +94,7 @@ int register_fprobe(struct fprobe *fp, const char *filter, const char *notfilter int register_fprobe_ips(struct fprobe *fp, unsigned long *addrs, int num); int register_fprobe_syms(struct fprobe *fp, const char **syms, int num); int unregister_fprobe(struct fprobe *fp); +int unregister_fprobe_async(struct fprobe *fp); bool fprobe_is_registered(struct fprobe *fp); int fprobe_count_ips_from_filter(const char *filter, const char *notfilter); #else @@ -113,6 +114,10 @@ static inline int unregister_fprobe(struct fprobe *fp) { return -EOPNOTSUPP; } +static inline int unregister_fprobe_async(struct fprobe *fp) +{ + return -EOPNOTSUPP; +} static inline bool fprobe_is_registered(struct fprobe *fp) { return false; diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h index 80b38fbf2121..31df7608737e 100644 --- a/include/linux/fwnode.h +++ b/include/linux/fwnode.h @@ -208,6 +208,7 @@ struct fwnode_operations { static inline void fwnode_init(struct fwnode_handle *fwnode, const struct fwnode_operations *ops) { + fwnode->secondary = NULL; fwnode->ops = ops; INIT_LIST_HEAD(&fwnode->consumers); INIT_LIST_HEAD(&fwnode->suppliers); diff --git a/include/linux/generic_pt/iommu.h b/include/linux/generic_pt/iommu.h index 9eefbb74efd0..43cc98c9c55f 100644 --- a/include/linux/generic_pt/iommu.h +++ b/include/linux/generic_pt/iommu.h @@ -66,6 +66,13 @@ struct pt_iommu { struct device *iommu_device; }; +static inline struct pt_iommu *iommupt_from_domain(struct iommu_domain *domain) +{ + if (!IS_ENABLED(CONFIG_IOMMU_PT) || !domain->is_iommupt) + return NULL; + return container_of(domain, struct pt_iommu, domain); +} + /** * struct pt_iommu_info - Details about the IOMMU page table * @@ -81,6 +88,56 @@ struct pt_iommu_info { struct pt_iommu_ops { /** + * @map_range: Install translation for an IOVA range + * @iommu_table: Table to manipulate + * @iova: IO virtual address to start + * @paddr: Physical/Output address to start + * @len: Length of the range starting from @iova + * @prot: A bitmap of IOMMU_READ/WRITE/CACHE/NOEXEC/MMIO + * @gfp: GFP flags for any memory allocations + * + * The range starting at IOVA will have paddr installed into it. The + * rage is automatically segmented into optimally sized table entries, + * and can have any valid alignment. + * + * On error the caller will probably want to invoke unmap on the range + * from iova up to the amount indicated by @mapped to return the table + * back to an unchanged state. + * + * Context: The caller must hold a write range lock that includes + * the whole range. + * + * Returns: -ERRNO on failure, 0 on success. The number of bytes of VA + * that were mapped are added to @mapped, @mapped is not zerod first. + */ + int (*map_range)(struct pt_iommu *iommu_table, dma_addr_t iova, + phys_addr_t paddr, dma_addr_t len, unsigned int prot, + gfp_t gfp, size_t *mapped); + + /** + * @unmap_range: Make a range of IOVA empty/not present + * @iommu_table: Table to manipulate + * @iova: IO virtual address to start + * @len: Length of the range starting from @iova + * @iotlb_gather: Gather struct that must be flushed on return + * + * unmap_range() will remove a translation created by map_range(). It + * cannot subdivide a mapping created by map_range(), so it should be + * called with IOVA ranges that match those passed to map_pages. The + * IOVA range can aggregate contiguous map_range() calls so long as no + * individual range is split. + * + * Context: The caller must hold a write range lock that includes + * the whole range. + * + * Returns: Number of bytes of VA unmapped. iova + res will be the + * point unmapping stopped. + */ + size_t (*unmap_range)(struct pt_iommu *iommu_table, dma_addr_t iova, + dma_addr_t len, + struct iommu_iotlb_gather *iotlb_gather); + + /** * @set_dirty: Make the iova write dirty * @iommu_table: Table to manipulate * @iova: IO virtual address to start @@ -194,14 +251,6 @@ struct pt_iommu_cfg { #define IOMMU_PROTOTYPES(fmt) \ phys_addr_t pt_iommu_##fmt##_iova_to_phys(struct iommu_domain *domain, \ dma_addr_t iova); \ - int pt_iommu_##fmt##_map_pages(struct iommu_domain *domain, \ - unsigned long iova, phys_addr_t paddr, \ - size_t pgsize, size_t pgcount, \ - int prot, gfp_t gfp, size_t *mapped); \ - size_t pt_iommu_##fmt##_unmap_pages( \ - struct iommu_domain *domain, unsigned long iova, \ - size_t pgsize, size_t pgcount, \ - struct iommu_iotlb_gather *iotlb_gather); \ int pt_iommu_##fmt##_read_and_clear_dirty( \ struct iommu_domain *domain, unsigned long iova, size_t size, \ unsigned long flags, struct iommu_dirty_bitmap *dirty); \ @@ -222,9 +271,7 @@ struct pt_iommu_cfg { * iommu_pt */ #define IOMMU_PT_DOMAIN_OPS(fmt) \ - .iova_to_phys = &pt_iommu_##fmt##_iova_to_phys, \ - .map_pages = &pt_iommu_##fmt##_map_pages, \ - .unmap_pages = &pt_iommu_##fmt##_unmap_pages + .iova_to_phys = &pt_iommu_##fmt##_iova_to_phys #define IOMMU_PT_DIRTY_OPS(fmt) \ .read_and_clear_dirty = &pt_iommu_##fmt##_read_and_clear_dirty diff --git a/include/linux/gfp_types.h b/include/linux/gfp_types.h index 6c75df30a281..cd4972a7c97c 100644 --- a/include/linux/gfp_types.h +++ b/include/linux/gfp_types.h @@ -273,11 +273,11 @@ enum { * * %__GFP_ZERO returns a zeroed page on success. * - * %__GFP_ZEROTAGS zeroes memory tags at allocation time if the memory itself - * is being zeroed (either via __GFP_ZERO or via init_on_alloc, provided that - * __GFP_SKIP_ZERO is not set). This flag is intended for optimization: setting - * memory tags at the same time as zeroing memory has minimal additional - * performance impact. + * %__GFP_ZEROTAGS zeroes memory tags at allocation time. Setting memory tags at + * the same time as zeroing memory (e.g., with __GFP_ZERO) has minimal + * additional performance impact. However, __GFP_ZEROTAGS also zeroes the tags + * even if memory is not getting zeroed at allocation time (e.g., + * with init_on_free). * * %__GFP_SKIP_KASAN makes KASAN skip unpoisoning on page allocation. * Used for userspace and vmalloc pages; the latter are unpoisoned by diff --git a/include/linux/highmem.h b/include/linux/highmem.h index af03db851a1d..d7aac9de1c8a 100644 --- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -347,10 +347,11 @@ static inline void clear_highpage_kasan_tagged(struct page *page) #ifndef __HAVE_ARCH_TAG_CLEAR_HIGHPAGES -/* Return false to let people know we did not initialize the pages */ -static inline bool tag_clear_highpages(struct page *page, int numpages) +/* Returns true if the caller has to initialize the pages */ +static inline bool tag_clear_highpages(struct page *page, int numpages, + bool clear_pages) { - return false; + return clear_pages; } #endif diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 555597b54083..563d0f104114 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -223,6 +223,7 @@ enum iommu_domain_cookie_type { struct iommu_domain { unsigned type; enum iommu_domain_cookie_type cookie_type; + bool is_iommupt; const struct iommu_domain_ops *ops; const struct iommu_dirty_ops *dirty_ops; const struct iommu_ops *owner; /* Whose domain_alloc we came from */ diff --git a/include/linux/libata.h b/include/linux/libata.h index 00346ce3af5e..93ab3595c640 100644 --- a/include/linux/libata.h +++ b/include/linux/libata.h @@ -371,6 +371,7 @@ enum { /* return values for ->qc_defer */ ATA_DEFER_LINK = 1, ATA_DEFER_PORT = 2, + ATA_DEFER_LINK_EXCL = 3, /* desc_len for ata_eh_info and context */ ATA_EH_DESC_LEN = 80, @@ -854,6 +855,9 @@ struct ata_link { unsigned int sata_spd; /* current SATA PHY speed */ enum ata_lpm_policy lpm_policy; + struct work_struct deferred_qc_work; + struct ata_queued_cmd *deferred_qc; + /* record runtime error info, protected by host_set lock */ struct ata_eh_info eh_info; /* EH context */ @@ -899,9 +903,6 @@ struct ata_port { u64 qc_active; int nr_active_links; /* #links with active qcs */ - struct work_struct deferred_qc_work; - struct ata_queued_cmd *deferred_qc; - struct ata_link link; /* host default link */ struct ata_link *slave_link; /* see ata_slave_link_init() */ diff --git a/include/linux/list.h b/include/linux/list.h index 00ea8e5fb88b..09d979976b3b 100644 --- a/include/linux/list.h +++ b/include/linux/list.h @@ -191,6 +191,29 @@ static inline void list_add_tail(struct list_head *new, struct list_head *head) __list_add(new, head->prev, head); } +/** + * list_add_tail_release - add a new entry with release barrier + * @new: new entry to be added + * @head: list head to add it before + * + * Insert a new entry before the specified head, using a release barrier to set + * the ->next pointer that points to it. This is useful for implementing + * queues, in particular one that the elements will be walked through forwards + * locklessly. + */ +static inline void list_add_tail_release(struct list_head *new, + struct list_head *head) +{ + struct list_head *prev = head->prev; + + if (__list_add_valid(new, prev, head)) { + new->next = head; + new->prev = prev; + head->prev = new; + smp_store_release(&prev->next, new); + } +} + /* * Delete a list entry by making the prev/next entries * point to each other. @@ -645,6 +668,20 @@ static inline void list_splice_tail_init(struct list_head *list, }) /** + * list_first_entry_or_null_acquire - get the first element from a list with barrier + * @ptr: the list head to take the element from. + * @type: the type of the struct this is embedded in. + * @member: the name of the list_head within the struct. + * + * Note that if the list is empty, it returns NULL. + */ +#define list_first_entry_or_null_acquire(ptr, type, member) ({ \ + struct list_head *head__ = (ptr); \ + struct list_head *pos__ = smp_load_acquire(&head__->next); \ + pos__ != head__ ? list_entry(pos__, type, member) : NULL; \ +}) + +/** * list_last_entry_or_null - get the last element from a list * @ptr: the list head to take the element from. * @type: the type of the struct this is embedded in. diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 77c778d84d4c..6fd365f7b35b 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -297,9 +297,11 @@ struct xt_counters *xt_counters_alloc(unsigned int counters); struct xt_table *xt_register_table(struct net *net, const struct xt_table *table, + const struct nf_hook_ops *template_ops, struct xt_table_info *bootstrap, struct xt_table_info *newinfo); -void *xt_unregister_table(struct xt_table *table); +void xt_unregister_table_pre_exit(struct net *net, u8 af, const char *name); +struct xt_table *xt_unregister_table_exit(struct net *net, u8 af, const char *name); struct xt_table_info *xt_replace_table(struct xt_table *table, unsigned int num_counters, diff --git a/include/linux/netfilter_arp/arp_tables.h b/include/linux/netfilter_arp/arp_tables.h index a40aaf645fa4..05631a25e622 100644 --- a/include/linux/netfilter_arp/arp_tables.h +++ b/include/linux/netfilter_arp/arp_tables.h @@ -53,7 +53,6 @@ int arpt_register_table(struct net *net, const struct xt_table *table, const struct arpt_replace *repl, const struct nf_hook_ops *ops); void arpt_unregister_table(struct net *net, const char *name); -void arpt_unregister_table_pre_exit(struct net *net, const char *name); extern unsigned int arpt_do_table(void *priv, struct sk_buff *skb, const struct nf_hook_state *state); diff --git a/include/linux/netfilter_ipv4/ip_tables.h b/include/linux/netfilter_ipv4/ip_tables.h index 132b0e4a6d4d..13593391d605 100644 --- a/include/linux/netfilter_ipv4/ip_tables.h +++ b/include/linux/netfilter_ipv4/ip_tables.h @@ -26,7 +26,6 @@ int ipt_register_table(struct net *net, const struct xt_table *table, const struct ipt_replace *repl, const struct nf_hook_ops *ops); -void ipt_unregister_table_pre_exit(struct net *net, const char *name); void ipt_unregister_table_exit(struct net *net, const char *name); /* Standard entry. */ diff --git a/include/linux/netfilter_ipv6/ip6_tables.h b/include/linux/netfilter_ipv6/ip6_tables.h index 8b8885a73c76..c6d5b927830d 100644 --- a/include/linux/netfilter_ipv6/ip6_tables.h +++ b/include/linux/netfilter_ipv6/ip6_tables.h @@ -27,7 +27,6 @@ extern void *ip6t_alloc_initial_table(const struct xt_table *); int ip6t_register_table(struct net *net, const struct xt_table *table, const struct ip6t_replace *repl, const struct nf_hook_ops *ops); -void ip6t_unregister_table_pre_exit(struct net *net, const char *name); void ip6t_unregister_table_exit(struct net *net, const char *name); extern unsigned int ip6t_do_table(void *priv, struct sk_buff *skb, const struct nf_hook_state *state); diff --git a/include/linux/netfs.h b/include/linux/netfs.h index ba17ac5bf356..243c0f737938 100644 --- a/include/linux/netfs.h +++ b/include/linux/netfs.h @@ -62,8 +62,8 @@ struct netfs_inode { struct fscache_cookie *cache; #endif struct mutex wb_lock; /* Writeback serialisation */ - loff_t remote_i_size; /* Size of the remote file */ - loff_t zero_point; /* Size after which we assume there's no data + loff_t _remote_i_size; /* Size of the remote file */ + loff_t _zero_point; /* Size after which we assume there's no data * on the server */ atomic_t io_count; /* Number of outstanding reqs */ unsigned long flags; @@ -252,7 +252,7 @@ struct netfs_io_request { unsigned long long collected_to; /* Point we've collected to */ unsigned long long cleaned_to; /* Position we've cleaned folios to */ unsigned long long abandon_to; /* Position to abandon folios to */ - pgoff_t no_unlock_folio; /* Don't unlock this folio after read */ + const struct folio *no_unlock_folio; /* Don't unlock this folio after read */ unsigned int direct_bv_count; /* Number of elements in direct_bv[] */ unsigned int debug_id; unsigned int rsize; /* Maximum read size (0 for none) */ @@ -475,6 +475,254 @@ static inline struct netfs_inode *netfs_inode(struct inode *inode) } /** + * netfs_read_remote_i_size - Read remote_i_size safely + * @inode: The inode to access + * + * Read remote_i_size safely without the potential for tearing on 32-bit + * arches. + * + * NOTE: in a 32bit arch with a preemptable kernel and an UP compile the + * i_size_read/write must be atomic with respect to the local cpu (unlike with + * preempt disabled), but they don't need to be atomic with respect to other + * cpus like in true SMP (so they need either to either locally disable irq + * around the read or for example on x86 they can be still implemented as a + * cmpxchg8b without the need of the lock prefix). For SMP compiles and 64bit + * archs it makes no difference if preempt is enabled or not. + */ +static inline unsigned long long netfs_read_remote_i_size(const struct inode *inode) +{ + const struct netfs_inode *ictx = container_of(inode, struct netfs_inode, inode); + unsigned long long remote_i_size; + +#if BITS_PER_LONG==32 && defined(CONFIG_SMP) + unsigned int seq; + + do { + seq = read_seqcount_begin(&inode->i_size_seqcount); + remote_i_size = ictx->_remote_i_size; + } while (read_seqcount_retry(&inode->i_size_seqcount, seq)); +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) + preempt_disable(); + remote_i_size = ictx->_remote_i_size; + preempt_enable(); +#else + /* Pairs with smp_store_release() in netfs_write_remote_i_size() */ + remote_i_size = smp_load_acquire(&ictx->_remote_i_size); +#endif + return remote_i_size; +} + +/* + * netfs_write_remote_i_size - Set remote_i_size safely + * @inode: The inode to access + * @remote_i_size: The new value for the size of the file on the server + * + * Set remote_i_size safely without the potential for tearing on 32-bit arches. + * + * Context: The caller must hold inode->i_lock. + * + * NOTE: unlike netfs_read_remote_i_size(), netfs_write_remote_i_size() does + * need locking around it (normally i_rwsem), otherwise on 32bit/SMP an update + * of i_size_seqcount can be lost, resulting in subsequent i_size_read() calls + * spinning forever. + */ +static inline void netfs_write_remote_i_size(struct inode *inode, + unsigned long long remote_i_size) +{ + struct netfs_inode *ictx = netfs_inode(inode); + +#if BITS_PER_LONG==32 && defined(CONFIG_SMP) + write_seqcount_begin(&inode->i_size_seqcount); + ictx->_remote_i_size = remote_i_size; + write_seqcount_end(&inode->i_size_seqcount); +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) + preempt_disable(); + ictx->_remote_i_size = remote_i_size; + preempt_enable(); +#else + /* + * Pairs with smp_load_acquire() in netfs_read_remote_i_size() to + * ensure changes related to inode size (such as page contents) are + * visible before we see the changed inode size. + */ + smp_store_release(&ictx->_remote_i_size, remote_i_size); +#endif +} + +/** + * netfs_read_zero_point - Read zero_point safely + * @inode: The inode to access + * + * Read zero_point safely without the potential for tearing on 32-bit + * arches. + * + * NOTE: in a 32bit arch with a preemptable kernel and an UP compile the + * i_size_read/write must be atomic with respect to the local cpu (unlike with + * preempt disabled), but they don't need to be atomic with respect to other + * cpus like in true SMP (so they need either to either locally disable irq + * around the read or for example on x86 they can be still implemented as a + * cmpxchg8b without the need of the lock prefix). For SMP compiles and 64bit + * archs it makes no difference if preempt is enabled or not. + */ +static inline unsigned long long netfs_read_zero_point(const struct inode *inode) +{ + struct netfs_inode *ictx = container_of(inode, struct netfs_inode, inode); + unsigned long long zero_point; + +#if BITS_PER_LONG==32 && defined(CONFIG_SMP) + unsigned int seq; + + do { + seq = read_seqcount_begin(&inode->i_size_seqcount); + zero_point = ictx->_zero_point; + } while (read_seqcount_retry(&inode->i_size_seqcount, seq)); +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) + preempt_disable(); + zero_point = ictx->_zero_point; + preempt_enable(); +#else + /* Pairs with smp_store_release() in netfs_write_zero_point() */ + zero_point = smp_load_acquire(&ictx->_zero_point); +#endif + return zero_point; +} + +/* + * netfs_write_zero_point - Set zero_point safely + * @inode: The inode to access + * @zero_point: The new value for the point beyond which the server has no data + * + * Set zero_point safely without the potential for tearing on 32-bit arches. + * + * Context: The caller must hold inode->i_lock. + * + * NOTE: unlike netfs_read_zero_point(), netfs_write_zero_point() does need + * locking around it (normally i_rwsem), otherwise on 32bit/SMP an update of + * i_size_seqcount can be lost, resulting in subsequent read calls spinning + * forever. + */ +static inline void netfs_write_zero_point(struct inode *inode, + unsigned long long zero_point) +{ + struct netfs_inode *ictx = netfs_inode(inode); + +#if BITS_PER_LONG==32 && defined(CONFIG_SMP) + write_seqcount_begin(&inode->i_size_seqcount); + ictx->_zero_point = zero_point; + write_seqcount_end(&inode->i_size_seqcount); +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) + preempt_disable(); + ictx->_zero_point = zero_point; + preempt_enable(); +#else + /* + * Pairs with smp_load_acquire() in netfs_read_zero_point() to + * ensure changes related to inode size (such as page contents) are + * visible before we see the changed inode size. + */ + smp_store_release(&ictx->_zero_point, zero_point); +#endif +} + +/** + * netfs_read_sizes - Read remote_i_size and zero_point safely + * @inode: The inode to access + * @i_size: Where to return the local file size. + * @remote_i_size: Where to return the size of the file on the server + * @zero_point: Where to return the the point beyond which the server has no data + * + * Read remote_i_size and zero_point safely without the potential for tearing + * on 32-bit arches. + * + * NOTE: in a 32bit arch with a preemptable kernel and an UP compile the + * i_size_read/write must be atomic with respect to the local cpu (unlike with + * preempt disabled), but they don't need to be atomic with respect to other + * cpus like in true SMP (so they need either to either locally disable irq + * around the read or for example on x86 they can be still implemented as a + * cmpxchg8b without the need of the lock prefix). For SMP compiles and 64bit + * archs it makes no difference if preempt is enabled or not. + */ +static inline void netfs_read_sizes(const struct inode *inode, + unsigned long long *i_size, + unsigned long long *remote_i_size, + unsigned long long *zero_point) +{ + const struct netfs_inode *ictx = container_of(inode, struct netfs_inode, inode); +#if BITS_PER_LONG==32 && defined(CONFIG_SMP) + unsigned int seq; + + do { + seq = read_seqcount_begin(&inode->i_size_seqcount); + *i_size = inode->i_size; + *remote_i_size = ictx->_remote_i_size; + *zero_point = ictx->_zero_point; + } while (read_seqcount_retry(&inode->i_size_seqcount, seq)); +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) + preempt_disable(); + *i_size = inode->i_size; + *remote_i_size = ictx->_remote_i_size; + *zero_point = ictx->_zero_point; + preempt_enable(); +#else + /* Pairs with smp_store_release() in i_size_write() */ + *i_size = smp_load_acquire(&inode->i_size); + /* Pairs with smp_store_release() in netfs_write_remote_i_size() */ + *remote_i_size = smp_load_acquire(&ictx->_remote_i_size); + /* Pairs with smp_store_release() in netfs_write_zero_point() */ + *zero_point = smp_load_acquire(&ictx->_zero_point); +#endif +} + +/* + * netfs_write_sizes - Set i_size, remote_i_size and zero_point safely + * @inode: The inode to access + * @i_size: The new value for the local size of the file + * @remote_i_size: The new value for the size of the file on the server + * @zero_point: The new value for the point beyond which the server has no data + * + * Set both remote_i_size and zero_point safely without the potential for + * tearing on 32-bit arches. + * + * Context: The caller must hold inode->i_lock. + * + * NOTE: unlike netfs_read_zero_point(), netfs_write_zero_point() does need + * locking around it (normally i_rwsem), otherwise on 32bit/SMP an update of + * i_size_seqcount can be lost, resulting in subsequent read calls spinning + * forever. + */ +static inline void netfs_write_sizes(struct inode *inode, + unsigned long long i_size, + unsigned long long remote_i_size, + unsigned long long zero_point) +{ + struct netfs_inode *ictx = netfs_inode(inode); + +#if BITS_PER_LONG==32 && defined(CONFIG_SMP) + write_seqcount_begin(&inode->i_size_seqcount); + inode->i_size = i_size; + ictx->_remote_i_size = remote_i_size; + ictx->_zero_point = zero_point; + write_seqcount_end(&inode->i_size_seqcount); +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) + preempt_disable(); + inode->i_size = i_size; + ictx->_remote_i_size = remote_i_size; + ictx->_zero_point = zero_point; + preempt_enable(); +#else + /* + * Pairs with smp_load_acquire() in i_size_read(), + * netfs_read_remote_i_size() and netfs_read_zero_point() to ensure + * changes related to inode size (such as page contents) are visible + * before we see the changed inode size. + */ + smp_store_release(&inode->i_size, i_size); + smp_store_release(&ictx->_remote_i_size, remote_i_size); + smp_store_release(&ictx->_zero_point, zero_point); +#endif +} + +/** * netfs_inode_init - Initialise a netfslib inode context * @ctx: The netfs inode to initialise * @ops: The netfs's operations list @@ -488,8 +736,8 @@ static inline void netfs_inode_init(struct netfs_inode *ctx, bool use_zero_point) { ctx->ops = ops; - ctx->remote_i_size = i_size_read(&ctx->inode); - ctx->zero_point = LLONG_MAX; + ctx->_remote_i_size = i_size_read(&ctx->inode); + ctx->_zero_point = LLONG_MAX; ctx->flags = 0; atomic_set(&ctx->io_count, 0); #if IS_ENABLED(CONFIG_FSCACHE) @@ -498,7 +746,7 @@ static inline void netfs_inode_init(struct netfs_inode *ctx, mutex_init(&ctx->wb_lock); /* ->releasepage() drives zero_point */ if (use_zero_point) { - ctx->zero_point = ctx->remote_i_size; + ctx->_zero_point = ctx->_remote_i_size; mapping_set_release_always(ctx->inode.i_mapping); } } @@ -511,13 +759,40 @@ static inline void netfs_inode_init(struct netfs_inode *ctx, * * Inform the netfs lib that a file got resized so that it can adjust its state. */ -static inline void netfs_resize_file(struct netfs_inode *ctx, loff_t new_i_size, +static inline void netfs_resize_file(struct netfs_inode *ictx, + unsigned long long new_i_size, bool changed_on_server) { +#if BITS_PER_LONG==32 && defined(CONFIG_SMP) + struct inode *inode = &ictx->inode; + + preempt_disable(); + write_seqcount_begin(&inode->i_size_seqcount); + if (changed_on_server) + ictx->_remote_i_size = new_i_size; + if (new_i_size < ictx->_zero_point) + ictx->_zero_point = new_i_size; + write_seqcount_end(&inode->i_size_seqcount); + preempt_enable(); +#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) + preempt_disable(); if (changed_on_server) - ctx->remote_i_size = new_i_size; - if (new_i_size < ctx->zero_point) - ctx->zero_point = new_i_size; + ictx->_remote_i_size = new_i_size; + if (new_i_size < ictx->_zero_point) + ictx->_zero_point = new_i_size; + preempt_enable(); +#else + /* + * Pairs with smp_load_acquire() in netfs_read_remote_i_size and + * netfs_read_zero_point() to ensure changes related to inode size + * (such as page contents) are visible before we see the changed inode + * size. + */ + if (changed_on_server) + smp_store_release(&ictx->_remote_i_size, new_i_size); + if (new_i_size < ictx->_zero_point) + smp_store_release(&ictx->_zero_point, new_i_size); +#endif } /** diff --git a/include/linux/soc/airoha/airoha_offload.h b/include/linux/soc/airoha/airoha_offload.h index d01ef4a6b3d7..7589fccfeef6 100644 --- a/include/linux/soc/airoha/airoha_offload.h +++ b/include/linux/soc/airoha/airoha_offload.h @@ -71,9 +71,9 @@ static inline void airoha_ppe_dev_check_skb(struct airoha_ppe_dev *dev, #define NPU_RX1_DESC_NUM 512 /* CTRL */ -#define NPU_RX_DMA_DESC_LAST_MASK BIT(27) -#define NPU_RX_DMA_DESC_LEN_MASK GENMASK(26, 14) -#define NPU_RX_DMA_DESC_CUR_LEN_MASK GENMASK(13, 1) +#define NPU_RX_DMA_DESC_LAST_MASK BIT(29) +#define NPU_RX_DMA_DESC_LEN_MASK GENMASK(28, 15) +#define NPU_RX_DMA_DESC_CUR_LEN_MASK GENMASK(14, 1) #define NPU_RX_DMA_DESC_DONE_MASK BIT(0) /* INFO */ #define NPU_RX_DMA_PKT_COUNT_MASK GENMASK(31, 29) diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h index 69eed69f7f26..3faea66b1979 100644 --- a/include/net/bluetooth/bluetooth.h +++ b/include/net/bluetooth/bluetooth.h @@ -398,6 +398,7 @@ void baswap(bdaddr_t *dst, const bdaddr_t *src); struct bt_sock { struct sock sk; struct list_head accept_q; + spinlock_t accept_q_lock; /* protects accept_q */ struct sock *parent; unsigned long flags; void (*skb_msg_name)(struct sk_buff *, void *, int *); diff --git a/include/net/net_shaper.h b/include/net/net_shaper.h index 5c3f49b52fe9..3939b816b001 100644 --- a/include/net/net_shaper.h +++ b/include/net/net_shaper.h @@ -53,6 +53,7 @@ struct net_shaper { /* private: */ u32 leaves; /* accounted only for NODE scope */ + bool valid; struct rcu_head rcu; }; diff --git a/include/net/netfilter/nf_conntrack_expect.h b/include/net/netfilter/nf_conntrack_expect.h index e9a8350e7ccf..80f50fd0f7ad 100644 --- a/include/net/netfilter/nf_conntrack_expect.h +++ b/include/net/netfilter/nf_conntrack_expect.h @@ -45,9 +45,12 @@ struct nf_conntrack_expect { void (*expectfn)(struct nf_conn *new, struct nf_conntrack_expect *this); - /* Helper to assign to new connection */ + /* Helper that created this expectation */ struct nf_conntrack_helper __rcu *helper; + /* Helper to assign to new connection */ + struct nf_conntrack_helper __rcu *assign_helper; + /* The conntrack of the master connection */ struct nf_conn *master; diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h index d17035d14d96..3978c3174cdb 100644 --- a/include/net/netfilter/nf_queue.h +++ b/include/net/netfilter/nf_queue.h @@ -14,6 +14,7 @@ struct nf_queue_entry { struct list_head list; struct rhash_head hash_node; struct sk_buff *skb; + struct net_device *skb_dev; unsigned int id; unsigned int hook_index; /* index in hook_entries->hook[] */ #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER) diff --git a/include/net/tcp.h b/include/net/tcp.h index ebc72dce4134..e29d73118d82 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -65,8 +65,6 @@ static inline void tcp_orphan_count_dec(void) this_cpu_dec(tcp_orphan_count); } -DECLARE_PER_CPU(u32, tcp_tw_isn); - void tcp_time_wait(struct sock *sk, int state, int timeo); #define MAX_TCP_HEADER L1_CACHE_ALIGN(128 + MAX_HEADER) @@ -1060,10 +1058,13 @@ struct tcp_skb_cb { __u32 seq; /* Starting sequence number */ __u32 end_seq; /* SEQ + FIN + SYN + datalen */ union { - /* Note : + /* Notes : + * tcp_tw_isn is used in input path only + * (isn chosen by tcp_timewait_state_process()) * tcp_gso_segs/size are used in write queue only, * cf tcp_skb_pcount()/tcp_skb_mss() */ + u32 tcp_tw_isn; struct { u16 tcp_gso_segs; u16 tcp_gso_size; diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 0864700f76e0..fa090a455037 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -771,10 +771,8 @@ TRACE_EVENT(btrfs_sync_file, TP_fast_assign( struct dentry *dentry = file_dentry(file); struct inode *inode = file_inode(file); - struct dentry *parent = dget_parent(dentry); - struct inode *parent_inode = d_inode(parent); + struct inode *parent_inode = d_inode(dentry->d_parent); - dput(parent); TP_fast_assign_fsid(btrfs_sb(inode->i_sb)); __entry->ino = btrfs_ino(BTRFS_I(inode)); __entry->parent = btrfs_ino(BTRFS_I(parent_inode)); diff --git a/include/trace/events/damon.h b/include/trace/events/damon.h index 24fc402ab3c8..7e25f4469b81 100644 --- a/include/trace/events/damon.h +++ b/include/trace/events/damon.h @@ -41,7 +41,7 @@ TRACE_EVENT(damos_stat_after_apply_interval, ), TP_printk("ctx_idx=%u scheme_idx=%u nr_tried=%lu sz_tried=%lu " - "nr_applied=%lu sz_tried=%lu sz_ops_filter_passed=%lu " + "nr_applied=%lu sz_applied=%lu sz_ops_filter_passed=%lu " "qt_exceeds=%lu nr_snapshots=%lu", __entry->context_idx, __entry->scheme_idx, __entry->nr_tried, __entry->sz_tried, diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h index cbe28211106c..3fe3980902c2 100644 --- a/include/trace/events/netfs.h +++ b/include/trace/events/netfs.h @@ -177,7 +177,11 @@ EM(netfs_folio_is_uptodate, "mod-uptodate") \ EM(netfs_just_prefetch, "mod-prefetch") \ EM(netfs_whole_folio_modify, "mod-whole-f") \ + EM(netfs_whole_folio_modify_efault, "mod-whole-f!") \ + EM(netfs_whole_folio_modify_filled, "mod-whole-f+") \ + EM(netfs_whole_folio_modify_filled_efault, "mod-whole-f+!") \ EM(netfs_modify_and_clear, "mod-n-clear") \ + EM(netfs_modify_and_clear_rm_finfo, "mod-n-clear+") \ EM(netfs_streaming_write, "mod-streamw") \ EM(netfs_streaming_write_cont, "mod-streamw+") \ EM(netfs_flush_content, "flush") \ @@ -194,6 +198,10 @@ EM(netfs_folio_trace_copy_to_cache, "mark-copy") \ EM(netfs_folio_trace_end_copy, "end-copy") \ EM(netfs_folio_trace_filled_gaps, "filled-gaps") \ + EM(netfs_folio_trace_invalidate_all, "inval-all") \ + EM(netfs_folio_trace_invalidate_front, "inval-front") \ + EM(netfs_folio_trace_invalidate_middle, "inval-mid") \ + EM(netfs_folio_trace_invalidate_tail, "inval-tail") \ EM(netfs_folio_trace_kill, "kill") \ EM(netfs_folio_trace_kill_cc, "kill-cc") \ EM(netfs_folio_trace_kill_g, "kill-g") \ diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h index 573f2df3a2c9..704a10de6670 100644 --- a/include/trace/events/rxrpc.h +++ b/include/trace/events/rxrpc.h @@ -71,6 +71,7 @@ EM(rxkad_abort_resp_unknown_tkt, "rxkad-resp-unknown-tkt") \ EM(rxkad_abort_resp_version, "rxkad-resp-version") \ /* RxGK security errors */ \ + EM(rxgk_abort_1_short_header, "rxgk1-short-hdr") \ EM(rxgk_abort_1_verify_mic_eproto, "rxgk1-vfy-mic-eproto") \ EM(rxgk_abort_2_decrypt_eproto, "rxgk2-dec-eproto") \ EM(rxgk_abort_2_short_data, "rxgk2-short-data") \ diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 97260bca67e7..cc4011d84337 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -1719,10 +1719,9 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req, const struct io_issue_def *def; unsigned int sqe_flags; int personality; - u8 opcode; req->ctx = ctx; - req->opcode = opcode = READ_ONCE(sqe->opcode); + req->opcode = READ_ONCE(sqe->opcode); /* same numerical values with corresponding REQ_F_*, safe to copy */ sqe_flags = READ_ONCE(sqe->flags); req->flags = (__force io_req_flags_t) sqe_flags; @@ -1732,13 +1731,13 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req, req->cancel_seq_set = false; req->async_data = NULL; - if (unlikely(opcode >= IORING_OP_LAST)) { + if (unlikely(req->opcode >= IORING_OP_LAST)) { req->opcode = 0; return io_init_fail_req(req, -EINVAL); } - opcode = array_index_nospec(opcode, IORING_OP_LAST); + req->opcode = array_index_nospec(req->opcode, IORING_OP_LAST); - def = &io_issue_defs[opcode]; + def = &io_issue_defs[req->opcode]; if (def->is_128 && !(ctx->flags & IORING_SETUP_SQE128)) { /* * A 128b op on a non-128b SQ requires mixed SQE support as diff --git a/io_uring/net.c b/io_uring/net.c index 8885d944130a..5bd3fa5a2b6d 100644 --- a/io_uring/net.c +++ b/io_uring/net.c @@ -4,6 +4,7 @@ #include <linux/file.h> #include <linux/slab.h> #include <linux/net.h> +#include <linux/un.h> #include <linux/compat.h> #include <net/compat.h> #include <linux/io_uring.h> @@ -1841,11 +1842,29 @@ out: return IOU_COMPLETE; } +/* + * Check if bind request would potentially end up with filename_create(), + * which in turn end up in mnt_want_write() which will grab the fs + * percpu start write sem. This can trigger a lockdep warning. + */ +static int io_bind_file_create(const struct io_async_msghdr *io, int addr_len) +{ + const struct sockaddr_un *sun; + + if (io->addr.ss_family != AF_UNIX) + return 0; + if (addr_len <= offsetof(struct sockaddr_un, sun_path)) + return 0; + sun = (const struct sockaddr_un *) &io->addr; + return sun->sun_path[0] != '\0'; +} + int io_bind_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) { struct io_bind *bind = io_kiocb_to_cmd(req, struct io_bind); struct sockaddr __user *uaddr; struct io_async_msghdr *io; + int ret; if (sqe->len || sqe->buf_index || sqe->rw_flags || sqe->splice_fd_in) return -EINVAL; @@ -1856,7 +1875,12 @@ int io_bind_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) io = io_msg_alloc_async(req); if (unlikely(!io)) return -ENOMEM; - return move_addr_to_kernel(uaddr, bind->addr_len, &io->addr); + ret = move_addr_to_kernel(uaddr, bind->addr_len, &io->addr); + if (unlikely(ret)) + return ret; + if (io_bind_file_create(io, bind->addr_len)) + req->flags |= REQ_F_FORCE_ASYNC; + return 0; } int io_bind(struct io_kiocb *req, unsigned int issue_flags) diff --git a/io_uring/nop.c b/io_uring/nop.c index 3caf07878f8a..f5c9969e7f64 100644 --- a/io_uring/nop.c +++ b/io_uring/nop.c @@ -79,9 +79,9 @@ done: if (ret < 0) req_set_fail(req); if (nop->flags & IORING_NOP_CQE32) - io_req_set_res32(req, nop->result, 0, nop->extra1, nop->extra2); + io_req_set_res32(req, ret, 0, nop->extra1, nop->extra2); else - io_req_set_res(req, nop->result, 0); + io_req_set_res(req, ret, 0); if (nop->flags & IORING_NOP_TW) { req->io_task_work.func = io_req_task_complete; io_req_task_work_add(req); diff --git a/io_uring/waitid.c b/io_uring/waitid.c index d25d60aed6af..32f68fd7fcdd 100644 --- a/io_uring/waitid.c +++ b/io_uring/waitid.c @@ -275,6 +275,7 @@ int io_waitid_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) iw->options = READ_ONCE(sqe->file_index); iw->head = NULL; iw->infop = u64_to_user_ptr(READ_ONCE(sqe->addr2)); + memset(&iw->info, 0, sizeof(iw->info)); return 0; } diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c index 150e5871e66f..de816a43db9f 100644 --- a/kernel/cgroup/rstat.c +++ b/kernel/cgroup/rstat.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only #include "cgroup-internal.h" +#include <linux/cpumask.h> #include <linux/sched/cputime.h> #include <linux/bpf.h> @@ -53,7 +54,7 @@ static inline struct llist_head *ss_lhead_cpu(struct cgroup_subsys *ss, int cpu) } /** - * css_rstat_updated - keep track of updated rstat_cpu + * __css_rstat_updated - keep track of updated rstat_cpu * @css: target cgroup subsystem state * @cpu: cpu on which rstat_cpu was updated * @@ -63,31 +64,27 @@ static inline struct llist_head *ss_lhead_cpu(struct cgroup_subsys *ss, int cpu) * * NOTE: if the user needs the guarantee that the updater either add itself in * the lockless list or the concurrent flusher flushes its updated stats, a - * memory barrier is needed before the call to css_rstat_updated() i.e. a + * memory barrier is needed before the call to __css_rstat_updated() i.e. a * barrier after updating the per-cpu stats and before calling - * css_rstat_updated(). + * __css_rstat_updated(). */ -__bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu) +void __css_rstat_updated(struct cgroup_subsys_state *css, int cpu) { struct llist_head *lhead; struct css_rstat_cpu *rstatc; struct llist_node *self; - /* - * Since bpf programs can call this function, prevent access to - * uninitialized rstat pointers. - */ + /* Prevent access to uninitialized rstat pointers. */ if (!css_uses_rstat(css)) return; lockdep_assert_preemption_disabled(); /* - * For archs withnot nmi safe cmpxchg or percpu ops support, ignore - * the requests from nmi context. + * The lockless insertion below relies on NMI-safe cmpxchg; + * bail out in NMI on archs that don't provide it. */ - if ((!IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) || - !IS_ENABLED(CONFIG_ARCH_HAS_NMI_SAFE_THIS_CPU_OPS)) && in_nmi()) + if (!IS_ENABLED(CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG) && in_nmi()) return; rstatc = css_rstat_cpu(css, cpu); @@ -125,6 +122,18 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu) llist_add(&rstatc->lnode, lhead); } +/* + * BPF-facing wrapper for __css_rstat_updated(). Validate the caller-provided + * CPU before passing it to the internal rstat updater. + */ +__bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu) +{ + if (unlikely(cpu < 0 || cpu >= nr_cpu_ids || !cpu_possible(cpu))) + return; + + __css_rstat_updated(css, cpu); +} + static void __css_process_update_tree(struct cgroup_subsys_state *css, int cpu) { /* put @css and all ancestors on the corresponding updated lists */ @@ -170,7 +179,7 @@ static void css_process_update_tree(struct cgroup_subsys *ss, int cpu) * flusher flush the stats updated by the updater who have * observed that they are already on the list. The * corresponding barrier pair for this one should be before - * css_rstat_updated() by the user. + * __css_rstat_updated() by the user. * * For now, there aren't any such user, so not adding the * barrier here but if such a use-case arise, please add @@ -614,7 +623,7 @@ static void cgroup_base_stat_cputime_account_end(struct cgroup *cgrp, unsigned long flags) { u64_stats_update_end_irqrestore(&rstatbc->bsync, flags); - css_rstat_updated(&cgrp->self, smp_processor_id()); + __css_rstat_updated(&cgrp->self, smp_processor_id()); put_cpu_ptr(rstatbc); } diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c index 1a725edbbbf6..3248f8b4d096 100644 --- a/kernel/dma/debug.c +++ b/kernel/dma/debug.c @@ -1251,7 +1251,14 @@ void debug_dma_map_phys(struct device *dev, phys_addr_t phys, size_t size, entry->direction = direction; entry->map_err_type = MAP_ERR_NOT_CHECKED; - if (!(attrs & DMA_ATTR_MMIO)) { + if (attrs & DMA_ATTR_MMIO) { + unsigned long pfn = PHYS_PFN(phys); + + if (pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn))) + err_printk(dev, entry, + "dma_map_resource called for RAM address %pa\n", + &phys); + } else { check_for_stack(dev, phys); if (!PhysHighMem(phys)) diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c index 6d3dd0bd3a88..5d59372f4277 100644 --- a/kernel/dma/mapping.c +++ b/kernel/dma/mapping.c @@ -356,10 +356,6 @@ EXPORT_SYMBOL(dma_unmap_sg_attrs); dma_addr_t dma_map_resource(struct device *dev, phys_addr_t phys_addr, size_t size, enum dma_data_direction dir, unsigned long attrs) { - if (IS_ENABLED(CONFIG_DMA_API_DEBUG) && - WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr)))) - return DMA_MAPPING_ERROR; - return dma_map_phys(dev, phys_addr, size, dir, attrs | DMA_ATTR_MMIO); } EXPORT_SYMBOL(dma_map_resource); diff --git a/kernel/irq_work.c b/kernel/irq_work.c index 73f7e1fd4ab4..bf411656c316 100644 --- a/kernel/irq_work.c +++ b/kernel/irq_work.c @@ -292,6 +292,12 @@ void irq_work_sync(struct irq_work *work) !arch_irq_work_has_interrupt()) { rcuwait_wait_event(&work->irqwait, !irq_work_is_busy(work), TASK_UNINTERRUPTIBLE); + /* + * Ensure irq_work_single() does not access @work + * after removing IRQ_WORK_BUSY. It is always + * accessed within a RCU-read section. + */ + synchronize_rcu(); return; } @@ -302,6 +308,7 @@ EXPORT_SYMBOL_GPL(irq_work_sync); static void run_irq_workd(unsigned int cpu) { + guard(rcu)(); irq_work_run_list(this_cpu_ptr(&lazy_list)); } diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c index 479c42e08b74..d8893f2adce8 100644 --- a/kernel/liveupdate/kexec_handover.c +++ b/kernel/liveupdate/kexec_handover.c @@ -1556,7 +1556,7 @@ int kho_fill_kimage(struct kimage *image) int err = 0; struct kexec_buf scratch; - if (!kho_enable) + if (!kho_enable || image->type == KEXEC_TYPE_CRASH) return 0; image->kho.fdt = virt_to_phys(kho_out.fdt); diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 0d01cd8c4b4a..7c2f7cc131f7 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -897,11 +897,9 @@ static void srcu_schedule_cbs_snp(struct srcu_struct *ssp, struct srcu_node *snp { int cpu; - for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) { - if (!(mask & (1UL << (cpu - snp->grplo)))) - continue; - srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); - } + for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) + if ((mask & (1UL << (cpu - snp->grplo))) && rcu_cpu_beenfullyonline(cpu)) + srcu_schedule_cbs_sdp(per_cpu_ptr(ssp->sda, cpu), delay); } /* @@ -1322,7 +1320,9 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp, */ idx = __srcu_read_lock_nmisafe(ssp); ss_state = smp_load_acquire(&ssp->srcu_sup->srcu_size_state); - if (ss_state < SRCU_SIZE_WAIT_CALL) + // If !rcu_cpu_beenfullyonline(), interrupts are still disabled, + // so no migration is possible in either direction from this CPU. + if (ss_state < SRCU_SIZE_WAIT_CALL || !rcu_cpu_beenfullyonline(raw_smp_processor_id())) sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); else sdp = raw_cpu_ptr(ssp->sda); diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index 3ac01ea9bfb1..38157fc58c5c 100644 --- a/kernel/sched/ext.c +++ b/kernel/sched/ext.c @@ -2936,7 +2936,8 @@ static void scx_set_task_state(struct task_struct *p, enum scx_task_state state) warn = prev_state != SCX_TASK_READY; break; default: - warn = true; + WARN_ONCE(1, "sched_ext: Invalid task state %d -> %d for %s[%d]", + prev_state, state, p->comm, p->pid); return; } @@ -5238,10 +5239,10 @@ static void scx_enable_workfn(struct kthread_work *work) ret = scx_init_task(p, task_group(p), false); if (ret) { - put_task_struct(p); scx_task_iter_stop(&sti); scx_error(sch, "ops.init_task() failed (%d) for %s[%d]", ret, p->comm, p->pid); + put_task_struct(p); goto err_disable_unlock_all; } diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index af7079aa0f36..a02bd258677e 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -2384,7 +2384,8 @@ static void bpf_kprobe_multi_link_release(struct bpf_link *link) struct bpf_kprobe_multi_link *kmulti_link; kmulti_link = container_of(link, struct bpf_kprobe_multi_link, link); - unregister_fprobe(&kmulti_link->fp); + /* Don't wait for RCU GP here. */ + unregister_fprobe_async(&kmulti_link->fp); kprobe_multi_put_modules(kmulti_link->mods, kmulti_link->mods_cnt); } diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c index 0afaae4e1a59..fe4d630aa446 100644 --- a/kernel/trace/fprobe.c +++ b/kernel/trace/fprobe.c @@ -1001,14 +1001,15 @@ static int unregister_fprobe_nolock(struct fprobe *fp) } /** - * unregister_fprobe() - Unregister fprobe. + * unregister_fprobe_async() - Unregister fprobe without RCU GP wait * @fp: A fprobe data structure to be unregistered. * * Unregister fprobe (and remove ftrace hooks from the function entries). + * This function will NOT wait until the fprobe is no longer used. * * Return 0 if @fp is unregistered successfully, -errno if not. */ -int unregister_fprobe(struct fprobe *fp) +int unregister_fprobe_async(struct fprobe *fp) { guard(mutex)(&fprobe_mutex); if (!fp || !fprobe_registered(fp)) @@ -1016,6 +1017,24 @@ int unregister_fprobe(struct fprobe *fp) return unregister_fprobe_nolock(fp); } + +/** + * unregister_fprobe() - Unregister fprobe with RCU GP wait + * @fp: A fprobe data structure to be unregistered. + * + * Unregister fprobe (and remove ftrace hooks from the function entries). + * This function will block until the fprobe is no longer used. + * + * Return 0 if @fp is unregistered successfully, -errno if not. + */ +int unregister_fprobe(struct fprobe *fp) +{ + int ret = unregister_fprobe_async(fp); + + if (!ret) + synchronize_rcu(); + return ret; +} EXPORT_SYMBOL_GPL(unregister_fprobe); static int __init fprobe_initcall(void) diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index aad2c7254f62..6c52d642f40d 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -6,6 +6,7 @@ */ #include <linux/sched/isolation.h> #include <linux/trace_recursion.h> +#include <linux/panic_notifier.h> #include <linux/trace_events.h> #include <linux/ring_buffer.h> #include <linux/trace_clock.h> @@ -30,6 +31,7 @@ #include <linux/oom.h> #include <linux/mm.h> +#include <asm/ring_buffer.h> #include <asm/local64.h> #include <asm/local.h> #include <asm/setup.h> @@ -589,6 +591,7 @@ struct trace_buffer { unsigned long range_addr_start; unsigned long range_addr_end; + struct notifier_block flush_nb; struct ring_buffer_meta *meta; @@ -2472,6 +2475,16 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer) kfree(cpu_buffer); } +/* Stop recording on a persistent buffer and flush cache if needed. */ +static int rb_flush_buffer_cb(struct notifier_block *nb, unsigned long event, void *data) +{ + struct trace_buffer *buffer = container_of(nb, struct trace_buffer, flush_nb); + + ring_buffer_record_off(buffer); + arch_ring_buffer_flush_range(buffer->range_addr_start, buffer->range_addr_end); + return NOTIFY_DONE; +} + static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags, int order, unsigned long start, unsigned long end, @@ -2591,6 +2604,12 @@ static struct trace_buffer *alloc_buffer(unsigned long size, unsigned flags, mutex_init(&buffer->mutex); + /* Persistent ring buffer needs to flush cache before reboot. */ + if (start && end) { + buffer->flush_nb.notifier_call = rb_flush_buffer_cb; + atomic_notifier_chain_register(&panic_notifier_list, &buffer->flush_nb); + } + return_ptr(buffer); fail_free_buffers: @@ -2678,6 +2697,9 @@ ring_buffer_free(struct trace_buffer *buffer) { int cpu; + if (buffer->range_addr_start && buffer->range_addr_end) + atomic_notifier_chain_unregister(&panic_notifier_list, &buffer->flush_nb); + cpuhp_state_remove_instance(CPUHP_TRACE_RB_PREPARE, &buffer->node); irq_work_sync(&buffer->irq_work.work); @@ -5283,6 +5305,7 @@ static void rb_iter_reset(struct ring_buffer_iter *iter) iter->head_page = cpu_buffer->reader_page; iter->head = cpu_buffer->reader_page->read; iter->next_event = iter->head; + iter->missed_events = 0; iter->cache_reader_page = iter->head_page; iter->cache_read = cpu_buffer->read; @@ -5897,10 +5920,7 @@ ring_buffer_peek(struct trace_buffer *buffer, int cpu, u64 *ts, */ bool ring_buffer_iter_dropped(struct ring_buffer_iter *iter) { - bool ret = iter->missed_events != 0; - - iter->missed_events = 0; - return ret; + return iter->missed_events != 0; } EXPORT_SYMBOL_GPL(ring_buffer_iter_dropped); @@ -6062,7 +6082,7 @@ void ring_buffer_iter_advance(struct ring_buffer_iter *iter) unsigned long flags; raw_spin_lock_irqsave(&cpu_buffer->reader_lock, flags); - + iter->missed_events = 0; rb_advance_iter(iter); raw_spin_unlock_irqrestore(&cpu_buffer->reader_lock, flags); diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c index f9c8a4f078ea..f8c0e66cc587 100644 --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -1366,10 +1366,8 @@ static const char *hist_field_name(struct hist_field *field, len = snprintf(full_name, sizeof(full_name), "%s.%s.%s", field->system, field->event_name, field->name); - if (len >= sizeof(full_name)) - return NULL; - - field_name = full_name; + if (len < sizeof(full_name)) + field_name = full_name; } else field_name = field->name; } else if (field->flags & HIST_FIELD_FL_TIMESTAMP) diff --git a/kernel/trace/tracing_map.c b/kernel/trace/tracing_map.c index bf1a507695b6..0dd7927df22a 100644 --- a/kernel/trace/tracing_map.c +++ b/kernel/trace/tracing_map.c @@ -386,13 +386,11 @@ static void tracing_map_elt_init_fields(struct tracing_map_elt *elt) } } -static void tracing_map_elt_free(struct tracing_map_elt *elt) +static void __tracing_map_elt_free(struct tracing_map_elt *elt) { if (!elt) return; - if (elt->map->ops && elt->map->ops->elt_free) - elt->map->ops->elt_free(elt); kfree(elt->fields); kfree(elt->vars); kfree(elt->var_set); @@ -400,6 +398,17 @@ static void tracing_map_elt_free(struct tracing_map_elt *elt) kfree(elt); } +static void tracing_map_elt_free(struct tracing_map_elt *elt) +{ + if (!elt) + return; + + /* Only objects initialized with alloc_elt() should be passed to free_elt().*/ + if (elt->map->ops && elt->map->ops->elt_free) + elt->map->ops->elt_free(elt); + __tracing_map_elt_free(elt); +} + static struct tracing_map_elt *tracing_map_elt_alloc(struct tracing_map *map) { struct tracing_map_elt *elt; @@ -444,7 +453,7 @@ static struct tracing_map_elt *tracing_map_elt_alloc(struct tracing_map *map) } return elt; free: - tracing_map_elt_free(elt); + __tracing_map_elt_free(elt); return ERR_PTR(err); } diff --git a/lib/kunit/Kconfig b/lib/kunit/Kconfig index 498cc51e493d..94ff8e4089bf 100644 --- a/lib/kunit/Kconfig +++ b/lib/kunit/Kconfig @@ -16,8 +16,9 @@ menuconfig KUNIT if KUNIT config KUNIT_DEBUGFS - bool "KUnit - Enable /sys/kernel/debug/kunit debugfs representation" if !KUNIT_ALL_TESTS - default KUNIT_ALL_TESTS + bool "KUnit - Enable /sys/kernel/debug/kunit debugfs representation" + depends on DEBUG_FS + default y help Enable debugfs representation for kunit. Currently this consists of /sys/kernel/debug/kunit/<test_suite>/results files for each diff --git a/lib/tests/test_kprobes.c b/lib/tests/test_kprobes.c index b7582010125c..06e729e4de05 100644 --- a/lib/tests/test_kprobes.c +++ b/lib/tests/test_kprobes.c @@ -12,6 +12,12 @@ #define div_factor 3 +#define KP_CLEAR(_kp) \ +do { \ + (_kp).addr = NULL; \ + (_kp).flags = 0; \ +} while (0) + static u32 rand1, preh_val, posth_val; static u32 (*target)(u32 value); static u32 (*recursed_target)(u32 value); @@ -125,10 +131,6 @@ static void test_kprobes(struct kunit *test) current_test = test; - /* addr and flags should be cleard for reusing kprobe. */ - kp.addr = NULL; - kp.flags = 0; - KUNIT_EXPECT_EQ(test, 0, register_kprobes(kps, 2)); preh_val = 0; posth_val = 0; @@ -226,9 +228,6 @@ static void test_kretprobes(struct kunit *test) struct kretprobe *rps[2] = {&rp, &rp2}; current_test = test; - /* addr and flags should be cleard for reusing kprobe. */ - rp.kp.addr = NULL; - rp.kp.flags = 0; KUNIT_EXPECT_EQ(test, 0, register_kretprobes(rps, 2)); krph_val = 0; @@ -290,8 +289,6 @@ static void test_stacktrace_on_kretprobe(struct kunit *test) unsigned long myretaddr = (unsigned long)__builtin_return_address(0); current_test = test; - rp3.kp.addr = NULL; - rp3.kp.flags = 0; /* * Run the stacktrace_driver() to record correct return address in @@ -352,8 +349,6 @@ static void test_stacktrace_on_nested_kretprobe(struct kunit *test) struct kretprobe *rps[2] = {&rp3, &rp4}; current_test = test; - rp3.kp.addr = NULL; - rp3.kp.flags = 0; //KUNIT_ASSERT_NE(test, myretaddr, stacktrace_driver()); @@ -367,6 +362,18 @@ static void test_stacktrace_on_nested_kretprobe(struct kunit *test) static int kprobes_test_init(struct kunit *test) { + KP_CLEAR(kp); + KP_CLEAR(kp2); + KP_CLEAR(kp_missed); +#ifdef CONFIG_KRETPROBES + KP_CLEAR(rp.kp); + KP_CLEAR(rp2.kp); +#ifdef CONFIG_ARCH_CORRECT_STACKTRACE_ON_KRETPROBE + KP_CLEAR(rp3.kp); + KP_CLEAR(rp4.kp); +#endif +#endif + target = kprobe_target; target2 = kprobe_target2; recursed_target = kprobe_recursed_target; diff --git a/mm/damon/sysfs-schemes.c b/mm/damon/sysfs-schemes.c index 9302ad0a603b..05da14101cdd 100644 --- a/mm/damon/sysfs-schemes.c +++ b/mm/damon/sysfs-schemes.c @@ -2537,6 +2537,7 @@ static int damon_sysfs_memcg_path_to_id(char *memcg_path, u64 *id) if (damon_sysfs_memcg_path_eq(memcg, path, memcg_path)) { *id = mem_cgroup_id(memcg); found = true; + mem_cgroup_iter_break(NULL, memcg); break; } } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 772bac21d155..96786a4af753 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -574,7 +574,7 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val, if (!val) return; - css_rstat_updated(&memcg->css, cpu); + __css_rstat_updated(&memcg->css, cpu); statc_pcpu = memcg->vmstats_percpu; for (; statc_pcpu; statc_pcpu = statc->parent_pcpu) { statc = this_cpu_ptr(statc_pcpu); @@ -2583,7 +2583,7 @@ static inline void account_slab_nmi_safe(struct mem_cgroup *memcg, struct mem_cgroup_per_node *pn = memcg->nodeinfo[pgdat->node_id]; /* preemption is disabled in_nmi(). */ - css_rstat_updated(&memcg->css, smp_processor_id()); + __css_rstat_updated(&memcg->css, smp_processor_id()); if (idx == NR_SLAB_RECLAIMABLE_B) atomic_add(nr, &pn->slab_reclaimable); else @@ -2807,7 +2807,7 @@ static inline void account_kmem_nmi_safe(struct mem_cgroup *memcg, int val) mod_memcg_state(memcg, MEMCG_KMEM, val); } else { /* preemption is disabled in_nmi(). */ - css_rstat_updated(&memcg->css, smp_processor_id()); + __css_rstat_updated(&memcg->css, smp_processor_id()); atomic_add(val, &memcg->kmem_stat); } } diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c index cfd665a5b787..bb5f60141803 100644 --- a/mm/memfd_luo.c +++ b/mm/memfd_luo.c @@ -412,6 +412,7 @@ static int memfd_luo_retrieve_folios(struct file *file, if (!folio) { pr_err("Unable to restore folio at physical address: %llx\n", phys); + err = -EIO; goto put_folios; } index = pfolio->index; diff --git a/mm/memory.c b/mm/memory.c index e03522c2bea6..6d4b0ec73356 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -625,6 +625,21 @@ static void print_bad_page_map(struct vm_area_struct *vma, dump_stack(); add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE); } + +static inline bool pgtable_level_has_pxx_special(enum pgtable_level level) +{ + switch (level) { + case PGTABLE_LEVEL_PTE: + return IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL); + case PGTABLE_LEVEL_PMD: + return IS_ENABLED(CONFIG_ARCH_SUPPORTS_PMD_PFNMAP); + case PGTABLE_LEVEL_PUD: + return IS_ENABLED(CONFIG_ARCH_SUPPORTS_PUD_PFNMAP); + default: + return false; + } +} + #define print_bad_pte(vma, addr, pte, page) \ print_bad_page_map(vma, addr, pte_val(pte), page, PGTABLE_LEVEL_PTE) @@ -697,7 +712,7 @@ static inline struct page *__vm_normal_page(struct vm_area_struct *vma, unsigned long addr, unsigned long pfn, bool special, unsigned long long entry, enum pgtable_level level) { - if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL)) { + if (pgtable_level_has_pxx_special(level)) { if (unlikely(special)) { #ifdef CONFIG_FIND_NORMAL_PAGE if (vma->vm_ops && vma->vm_ops->find_normal_page) @@ -712,8 +727,9 @@ static inline struct page *__vm_normal_page(struct vm_area_struct *vma, return NULL; } /* - * With CONFIG_ARCH_HAS_PTE_SPECIAL, any special page table - * mappings (incl. shared zero folios) are marked accordingly. + * With working pte_special()/pmd_special()..., any special page + * table mappings (incl. shared zero folios) are marked + * accordingly. */ } else { if (unlikely(vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))) { @@ -1750,7 +1766,7 @@ static inline int zap_nonpresent_ptes(struct mmu_gather *tlb, * consider uffd-wp bit when zap. For more information, * see zap_install_uffd_wp_if_needed(). */ - WARN_ON_ONCE(!vma_is_anonymous(vma)); + WARN_ON_ONCE(!folio_test_anon(folio)); rss[mm_counter(folio)]--; folio_remove_rmap_pte(folio, page, vma); folio_put(folio); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 05a47953ef21..44df12956997 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1430,6 +1430,8 @@ static void remove_memory_blocks_and_altmaps(u64 start, u64 size) altmap = mem->altmap; mem->altmap = NULL; + /* drop the ref. we got via find_memory_block() */ + put_device(&mem->dev); remove_memory_block_devices(cur_start, memblock_size); diff --git a/mm/migrate_device.c b/mm/migrate_device.c index 8079676c8f1f..a83bac73e3bc 100644 --- a/mm/migrate_device.c +++ b/mm/migrate_device.c @@ -856,7 +856,7 @@ static int migrate_vma_insert_huge_pmd_page(struct migrate_vma *migrate, ptl = pmd_lock(vma->vm_mm, pmdp); csa_ret = check_stable_address_space(vma->vm_mm); if (csa_ret) - goto abort; + goto unlock_abort; /* * Check for userfaultfd but do not deliver the fault. Instead, diff --git a/mm/page_alloc.c b/mm/page_alloc.c index e92898ad51cd..7c52f85b2ea5 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1847,9 +1847,9 @@ static inline bool should_skip_init(gfp_t flags) inline void post_alloc_hook(struct page *page, unsigned int order, gfp_t gfp_flags) { + const bool zero_tags = gfp_flags & __GFP_ZEROTAGS; bool init = !want_init_on_free() && want_init_on_alloc(gfp_flags) && !should_skip_init(gfp_flags); - bool zero_tags = init && (gfp_flags & __GFP_ZEROTAGS); int i; set_page_private(page, 0); @@ -1871,11 +1871,11 @@ inline void post_alloc_hook(struct page *page, unsigned int order, */ /* - * If memory tags should be zeroed - * (which happens only when memory should be initialized as well). + * Clearing tags can efficiently clear the memory for us as well, if + * required. */ if (zero_tags) - init = !tag_clear_highpages(page, 1 << order); + init = tag_clear_highpages(page, 1 << order, /* clear_pages= */init); if (!should_skip_kasan_unpoison(gfp_flags) && kasan_unpoison_pages(page, order, init)) { diff --git a/mm/slab_common.c b/mm/slab_common.c index d5a70a831a2a..8b661fff5eed 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -2110,7 +2110,9 @@ EXPORT_SYMBOL_GPL(kvfree_rcu_barrier); void kvfree_rcu_barrier_on_cache(struct kmem_cache *s) { if (cache_has_sheaves(s)) { + cpus_read_lock(); flush_rcu_sheaves_on_cache(s); + cpus_read_unlock(); rcu_barrier(); } diff --git a/mm/slub.c b/mm/slub.c index e423afa27d1a..edd3909f4198 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -4038,6 +4038,7 @@ void flush_rcu_sheaves_on_cache(struct kmem_cache *s) struct slub_flush_work *sfw; unsigned int cpu; + lockdep_assert_cpus_held(); mutex_lock(&flush_lock); for_each_online_cpu(cpu) { diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c index 74ef7dc2b2f9..b8b1b997960a 100644 --- a/net/batman-adv/bat_iv_ogm.c +++ b/net/batman-adv/bat_iv_ogm.c @@ -224,6 +224,8 @@ static void batadv_iv_ogm_iface_disable(struct batadv_hard_iface *hard_iface) hard_iface->bat_iv.ogm_buff = NULL; mutex_unlock(&hard_iface->bat_iv.ogm_buff_mutex); + + cancel_delayed_work_sync(&hard_iface->bat_iv.reschedule_work); } static void batadv_iv_ogm_iface_update_mac(struct batadv_hard_iface *hard_iface) @@ -536,8 +538,10 @@ out: * @if_incoming: interface where the packet was received * @if_outgoing: interface for which the retransmission should be considered * @own_packet: true if it is a self-generated ogm + * + * Return: whether forward packet was scheduled */ -static void batadv_iv_ogm_aggregate_new(const unsigned char *packet_buff, +static bool batadv_iv_ogm_aggregate_new(const unsigned char *packet_buff, int packet_len, unsigned long send_time, bool direct_link, struct batadv_hard_iface *if_incoming, @@ -561,13 +565,13 @@ static void batadv_iv_ogm_aggregate_new(const unsigned char *packet_buff, skb = netdev_alloc_skb_ip_align(NULL, skb_size); if (!skb) - return; + return false; forw_packet_aggr = batadv_forw_packet_alloc(if_incoming, if_outgoing, queue_left, bat_priv, skb); if (!forw_packet_aggr) { kfree_skb(skb); - return; + return false; } forw_packet_aggr->skb->priority = TC_PRIO_CONTROL; @@ -590,6 +594,8 @@ static void batadv_iv_ogm_aggregate_new(const unsigned char *packet_buff, batadv_iv_send_outstanding_bat_ogm_packet); batadv_forw_packet_ogmv1_queue(bat_priv, forw_packet_aggr, send_time); + + return true; } /* aggregate a new packet into the existing ogm packet */ @@ -617,8 +623,10 @@ static void batadv_iv_ogm_aggregate(struct batadv_forw_packet *forw_packet_aggr, * @if_outgoing: interface for which the retransmission should be considered * @own_packet: true if it is a self-generated ogm * @send_time: timestamp (jiffies) when the packet is to be sent + * + * Return: whether forward packet was scheduled */ -static void batadv_iv_ogm_queue_add(struct batadv_priv *bat_priv, +static bool batadv_iv_ogm_queue_add(struct batadv_priv *bat_priv, unsigned char *packet_buff, int packet_len, struct batadv_hard_iface *if_incoming, @@ -670,14 +678,16 @@ static void batadv_iv_ogm_queue_add(struct batadv_priv *bat_priv, if (!own_packet && atomic_read(&bat_priv->aggregated_ogms)) send_time += max_aggregation_jiffies; - batadv_iv_ogm_aggregate_new(packet_buff, packet_len, - send_time, direct_link, - if_incoming, if_outgoing, - own_packet); + return batadv_iv_ogm_aggregate_new(packet_buff, packet_len, + send_time, direct_link, + if_incoming, if_outgoing, + own_packet); } else { batadv_iv_ogm_aggregate(forw_packet_aggr, packet_buff, packet_len, direct_link); spin_unlock_bh(&bat_priv->forw_bat_list_lock); + + return true; } } @@ -790,6 +800,9 @@ static void batadv_iv_ogm_schedule_buff(struct batadv_hard_iface *hard_iface) u32 seqno; u16 tvlv_len = 0; unsigned long send_time; + bool reschedule = false; + bool scheduled; + int ret; lockdep_assert_held(&hard_iface->bat_iv.ogm_buff_mutex); @@ -813,9 +826,15 @@ static void batadv_iv_ogm_schedule_buff(struct batadv_hard_iface *hard_iface) * appended as it may alter the tt tvlv container */ batadv_tt_local_commit_changes(bat_priv); - tvlv_len = batadv_tvlv_container_ogm_append(bat_priv, ogm_buff, - ogm_buff_len, - BATADV_OGM_HLEN); + ret = batadv_tvlv_container_ogm_append(bat_priv, ogm_buff, + ogm_buff_len, + BATADV_OGM_HLEN); + if (ret < 0) { + reschedule = true; + goto out; + } + + tvlv_len = ret; } batadv_ogm_packet = (struct batadv_ogm_packet *)(*ogm_buff); @@ -834,8 +853,11 @@ static void batadv_iv_ogm_schedule_buff(struct batadv_hard_iface *hard_iface) /* OGMs from secondary interfaces are only scheduled on their * respective interfaces. */ - batadv_iv_ogm_queue_add(bat_priv, *ogm_buff, *ogm_buff_len, - hard_iface, hard_iface, 1, send_time); + scheduled = batadv_iv_ogm_queue_add(bat_priv, *ogm_buff, *ogm_buff_len, + hard_iface, hard_iface, 1, send_time); + if (!scheduled) + reschedule = true; + goto out; } @@ -847,15 +869,28 @@ static void batadv_iv_ogm_schedule_buff(struct batadv_hard_iface *hard_iface) if (!kref_get_unless_zero(&tmp_hard_iface->refcount)) continue; - batadv_iv_ogm_queue_add(bat_priv, *ogm_buff, - *ogm_buff_len, hard_iface, - tmp_hard_iface, 1, send_time); - + scheduled = batadv_iv_ogm_queue_add(bat_priv, *ogm_buff, + *ogm_buff_len, hard_iface, + tmp_hard_iface, 1, send_time); batadv_hardif_put(tmp_hard_iface); + + if (!scheduled && tmp_hard_iface == hard_iface) + reschedule = true; } rcu_read_unlock(); out: + if (reschedule) { + /* there was a failure scheduling the own forward packet. + * as result, the batadv_iv_send_outstanding_bat_ogm_packet() + * work item is no longer scheduled. it is therefore necessary + * to reschedule it manually + */ + queue_delayed_work(batadv_event_workqueue, + &hard_iface->bat_iv.reschedule_work, + msecs_to_jiffies(atomic_read(&bat_priv->orig_interval))); + } + batadv_hardif_put(primary_if); } @@ -870,6 +905,17 @@ static void batadv_iv_ogm_schedule(struct batadv_hard_iface *hard_iface) mutex_unlock(&hard_iface->bat_iv.ogm_buff_mutex); } +static void batadv_iv_ogm_reschedule(struct work_struct *work) +{ + struct delayed_work *delayed_work = to_delayed_work(work); + struct batadv_hard_iface *hard_iface; + + hard_iface = container_of(delayed_work, + struct batadv_hard_iface, + bat_iv.reschedule_work); + batadv_iv_ogm_schedule(hard_iface); +} + /** * batadv_iv_orig_ifinfo_sum() - Get bcast_own sum for originator over interface * @orig_node: originator which reproadcasted the OGMs directly @@ -2262,6 +2308,8 @@ batadv_iv_ogm_neigh_is_sob(struct batadv_neigh_node *neigh1, static void batadv_iv_iface_enabled(struct batadv_hard_iface *hard_iface) { + INIT_DELAYED_WORK(&hard_iface->bat_iv.reschedule_work, batadv_iv_ogm_reschedule); + /* begin scheduling originator messages on that interface */ batadv_iv_ogm_schedule(hard_iface); } diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c index e3870492dab7..d66ca77b1aaa 100644 --- a/net/batman-adv/bat_v_ogm.c +++ b/net/batman-adv/bat_v_ogm.c @@ -113,14 +113,14 @@ static void batadv_v_ogm_start_timer(struct batadv_priv *bat_priv) /** * batadv_v_ogm_send_to_if() - send a batman ogm using a given interface + * @bat_priv: the bat priv with all the mesh interface information * @skb: the OGM to send * @hard_iface: the interface to use to send the OGM */ -static void batadv_v_ogm_send_to_if(struct sk_buff *skb, +static void batadv_v_ogm_send_to_if(struct batadv_priv *bat_priv, + struct sk_buff *skb, struct batadv_hard_iface *hard_iface) { - struct batadv_priv *bat_priv = netdev_priv(hard_iface->mesh_iface); - if (hard_iface->if_status != BATADV_IF_ACTIVE) { kfree_skb(skb); return; @@ -187,6 +187,7 @@ static void batadv_v_ogm_aggr_list_free(struct batadv_hard_iface *hard_iface) /** * batadv_v_ogm_aggr_send() - flush & send aggregation queue + * @bat_priv: the bat priv with all the mesh interface information * @hard_iface: the interface with the aggregation queue to flush * * Aggregates all OGMv2 packets currently in the aggregation queue into a @@ -196,7 +197,8 @@ static void batadv_v_ogm_aggr_list_free(struct batadv_hard_iface *hard_iface) * * Caller needs to hold the hard_iface->bat_v.aggr_list.lock. */ -static void batadv_v_ogm_aggr_send(struct batadv_hard_iface *hard_iface) +static void batadv_v_ogm_aggr_send(struct batadv_priv *bat_priv, + struct batadv_hard_iface *hard_iface) { unsigned int aggr_len = hard_iface->bat_v.aggr_len; struct sk_buff *skb_aggr; @@ -226,27 +228,32 @@ static void batadv_v_ogm_aggr_send(struct batadv_hard_iface *hard_iface) consume_skb(skb); } - batadv_v_ogm_send_to_if(skb_aggr, hard_iface); + batadv_v_ogm_send_to_if(bat_priv, skb_aggr, hard_iface); } /** * batadv_v_ogm_queue_on_if() - queue a batman ogm on a given interface + * @bat_priv: the bat priv with all the mesh interface information * @skb: the OGM to queue * @hard_iface: the interface to queue the OGM on */ -static void batadv_v_ogm_queue_on_if(struct sk_buff *skb, +static void batadv_v_ogm_queue_on_if(struct batadv_priv *bat_priv, + struct sk_buff *skb, struct batadv_hard_iface *hard_iface) { - struct batadv_priv *bat_priv = netdev_priv(hard_iface->mesh_iface); + if (hard_iface->mesh_iface != bat_priv->mesh_iface) { + kfree_skb(skb); + return; + } if (!atomic_read(&bat_priv->aggregated_ogms)) { - batadv_v_ogm_send_to_if(skb, hard_iface); + batadv_v_ogm_send_to_if(bat_priv, skb, hard_iface); return; } spin_lock_bh(&hard_iface->bat_v.aggr_list.lock); if (!batadv_v_ogm_queue_left(skb, hard_iface)) - batadv_v_ogm_aggr_send(hard_iface); + batadv_v_ogm_aggr_send(bat_priv, hard_iface); hard_iface->bat_v.aggr_len += batadv_v_ogm_len(skb); __skb_queue_tail(&hard_iface->bat_v.aggr_list, skb); @@ -262,10 +269,10 @@ static void batadv_v_ogm_send_meshif(struct batadv_priv *bat_priv) struct batadv_hard_iface *hard_iface; struct batadv_ogm2_packet *ogm_packet; struct sk_buff *skb, *skb_tmp; - unsigned char *ogm_buff; + unsigned char **ogm_buff; struct list_head *iter; - int ogm_buff_len; - u16 tvlv_len = 0; + int *ogm_buff_len; + u16 tvlv_len; int ret; lockdep_assert_held(&bat_priv->bat_v.ogm_buff_mutex); @@ -273,25 +280,27 @@ static void batadv_v_ogm_send_meshif(struct batadv_priv *bat_priv) if (atomic_read(&bat_priv->mesh_state) == BATADV_MESH_DEACTIVATING) goto out; - ogm_buff = bat_priv->bat_v.ogm_buff; - ogm_buff_len = bat_priv->bat_v.ogm_buff_len; + ogm_buff = &bat_priv->bat_v.ogm_buff; + ogm_buff_len = &bat_priv->bat_v.ogm_buff_len; + /* tt changes have to be committed before the tvlv data is * appended as it may alter the tt tvlv container */ batadv_tt_local_commit_changes(bat_priv); - tvlv_len = batadv_tvlv_container_ogm_append(bat_priv, &ogm_buff, - &ogm_buff_len, - BATADV_OGM2_HLEN); + ret = batadv_tvlv_container_ogm_append(bat_priv, ogm_buff, + ogm_buff_len, + BATADV_OGM2_HLEN); + if (ret < 0) + goto reschedule; - bat_priv->bat_v.ogm_buff = ogm_buff; - bat_priv->bat_v.ogm_buff_len = ogm_buff_len; + tvlv_len = ret; - skb = netdev_alloc_skb_ip_align(NULL, ETH_HLEN + ogm_buff_len); + skb = netdev_alloc_skb_ip_align(NULL, ETH_HLEN + *ogm_buff_len); if (!skb) goto reschedule; skb_reserve(skb, ETH_HLEN); - skb_put_data(skb, ogm_buff, ogm_buff_len); + skb_put_data(skb, *ogm_buff, *ogm_buff_len); ogm_packet = (struct batadv_ogm2_packet *)skb->data; ogm_packet->seqno = htonl(atomic_read(&bat_priv->bat_v.ogm_seqno)); @@ -343,7 +352,7 @@ static void batadv_v_ogm_send_meshif(struct batadv_priv *bat_priv) break; } - batadv_v_ogm_queue_on_if(skb_tmp, hard_iface); + batadv_v_ogm_queue_on_if(bat_priv, skb_tmp, hard_iface); batadv_hardif_put(hard_iface); } rcu_read_unlock(); @@ -383,12 +392,14 @@ void batadv_v_ogm_aggr_work(struct work_struct *work) { struct batadv_hard_iface_bat_v *batv; struct batadv_hard_iface *hard_iface; + struct batadv_priv *bat_priv; batv = container_of(work, struct batadv_hard_iface_bat_v, aggr_wq.work); hard_iface = container_of(batv, struct batadv_hard_iface, bat_v); + bat_priv = netdev_priv(hard_iface->mesh_iface); spin_lock_bh(&hard_iface->bat_v.aggr_list.lock); - batadv_v_ogm_aggr_send(hard_iface); + batadv_v_ogm_aggr_send(bat_priv, hard_iface); spin_unlock_bh(&hard_iface->bat_v.aggr_list.lock); batadv_v_ogm_start_queue_timer(hard_iface); @@ -578,7 +589,7 @@ static void batadv_v_ogm_forward(struct batadv_priv *bat_priv, if_outgoing->net_dev->name, ntohl(ogm_forward->throughput), ogm_forward->ttl, if_incoming->net_dev->name); - batadv_v_ogm_queue_on_if(skb, if_outgoing); + batadv_v_ogm_queue_on_if(bat_priv, skb, if_outgoing); out: batadv_orig_ifinfo_put(orig_ifinfo); diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index cec11f1251d6..ffe854018bd3 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -356,12 +356,14 @@ static void batadv_bla_send_claim(struct batadv_priv *bat_priv, const u8 *mac, sizeof(local_claim_dest)); local_claim_dest.type = claimtype; - mesh_iface = primary_if->mesh_iface; + mesh_iface = READ_ONCE(primary_if->mesh_iface); + if (!mesh_iface) + goto out; skb = arp_create(ARPOP_REPLY, ETH_P_ARP, /* IP DST: 0.0.0.0 */ zeroip, - primary_if->mesh_iface, + mesh_iface, /* IP SRC: 0.0.0.0 */ zeroip, /* Ethernet DST: Broadcast */ @@ -514,8 +516,8 @@ batadv_bla_get_backbone_gw(struct batadv_priv *bat_priv, const u8 *orig, entry->crc = BATADV_BLA_CRC_INIT; entry->bat_priv = bat_priv; spin_lock_init(&entry->crc_lock); - atomic_set(&entry->request_sent, 0); - atomic_set(&entry->wait_periods, 0); + entry->state = BATADV_BLA_BACKBONE_GW_SYNCED; + entry->wait_periods = 0; ether_addr_copy(entry->orig, orig); INIT_WORK(&entry->report_work, batadv_bla_loopdetect_report); kref_init(&entry->refcount); @@ -544,9 +546,13 @@ batadv_bla_get_backbone_gw(struct batadv_priv *bat_priv, const u8 *orig, batadv_bla_send_announce(bat_priv, entry); /* this will be decreased in the worker thread */ - atomic_inc(&entry->request_sent); - atomic_set(&entry->wait_periods, BATADV_BLA_WAIT_PERIODS); - atomic_inc(&bat_priv->bla.num_requests); + spin_lock_bh(&bat_priv->bla.num_requests_lock); + if (entry->state == BATADV_BLA_BACKBONE_GW_SYNCED) { + entry->state = BATADV_BLA_BACKBONE_GW_UNSYNCED; + entry->wait_periods = BATADV_BLA_WAIT_PERIODS; + atomic_inc(&bat_priv->bla.num_requests); + } + spin_unlock_bh(&bat_priv->bla.num_requests_lock); } return entry; @@ -649,10 +655,12 @@ static void batadv_bla_send_request(struct batadv_bla_backbone_gw *backbone_gw) backbone_gw->vid, BATADV_CLAIM_TYPE_REQUEST); /* no local broadcasts should be sent or received, for now. */ - if (!atomic_read(&backbone_gw->request_sent)) { + spin_lock_bh(&backbone_gw->bat_priv->bla.num_requests_lock); + if (backbone_gw->state == BATADV_BLA_BACKBONE_GW_SYNCED) { + backbone_gw->state = BATADV_BLA_BACKBONE_GW_UNSYNCED; atomic_inc(&backbone_gw->bat_priv->bla.num_requests); - atomic_set(&backbone_gw->request_sent, 1); } + spin_unlock_bh(&backbone_gw->bat_priv->bla.num_requests_lock); } /** @@ -873,10 +881,12 @@ static bool batadv_handle_announce(struct batadv_priv *bat_priv, u8 *an_addr, /* if we have sent a request and the crc was OK, * we can allow traffic again. */ - if (atomic_read(&backbone_gw->request_sent)) { + spin_lock_bh(&bat_priv->bla.num_requests_lock); + if (backbone_gw->state == BATADV_BLA_BACKBONE_GW_UNSYNCED) { + backbone_gw->state = BATADV_BLA_BACKBONE_GW_SYNCED; atomic_dec(&backbone_gw->bat_priv->bla.num_requests); - atomic_set(&backbone_gw->request_sent, 0); } + spin_unlock_bh(&bat_priv->bla.num_requests_lock); } batadv_backbone_gw_put(backbone_gw); @@ -1224,6 +1234,7 @@ static void batadv_bla_purge_backbone_gw(struct batadv_priv *bat_priv, int now) struct hlist_head *head; struct batadv_hashtable *hash; spinlock_t *list_lock; /* protects write access to the hash lists */ + bool purged; int i; hash = bat_priv->bla.backbone_hash; @@ -1234,30 +1245,49 @@ static void batadv_bla_purge_backbone_gw(struct batadv_priv *bat_priv, int now) head = &hash->table[i]; list_lock = &hash->list_locks[i]; - spin_lock_bh(list_lock); - hlist_for_each_entry_safe(backbone_gw, node_tmp, - head, hash_entry) { - if (now) - goto purge_now; - if (!batadv_has_timed_out(backbone_gw->lasttime, - BATADV_BLA_BACKBONE_TIMEOUT)) - continue; + do { + purged = false; + + spin_lock_bh(list_lock); + hlist_for_each_entry_safe(backbone_gw, node_tmp, + head, hash_entry) { + if (now) + goto purge_now; + if (!batadv_has_timed_out(backbone_gw->lasttime, + BATADV_BLA_BACKBONE_TIMEOUT)) + continue; - batadv_dbg(BATADV_DBG_BLA, backbone_gw->bat_priv, - "%s(): backbone gw %pM timed out\n", - __func__, backbone_gw->orig); + batadv_dbg(BATADV_DBG_BLA, backbone_gw->bat_priv, + "%s(): backbone gw %pM timed out\n", + __func__, backbone_gw->orig); purge_now: - /* don't wait for the pending request anymore */ - if (atomic_read(&backbone_gw->request_sent)) - atomic_dec(&bat_priv->bla.num_requests); + purged = true; - batadv_bla_del_backbone_claims(backbone_gw); + /* don't wait for the pending request anymore */ + spin_lock_bh(&bat_priv->bla.num_requests_lock); + if (backbone_gw->state == BATADV_BLA_BACKBONE_GW_UNSYNCED) + atomic_dec(&bat_priv->bla.num_requests); - hlist_del_rcu(&backbone_gw->hash_entry); - batadv_backbone_gw_put(backbone_gw); - } - spin_unlock_bh(list_lock); + backbone_gw->state = BATADV_BLA_BACKBONE_GW_STOPPED; + spin_unlock_bh(&bat_priv->bla.num_requests_lock); + + batadv_bla_del_backbone_claims(backbone_gw); + + hlist_del_rcu(&backbone_gw->hash_entry); + break; + } + spin_unlock_bh(list_lock); + + if (purged) { + /* reference for pending report_work */ + if (cancel_work_sync(&backbone_gw->report_work)) + batadv_backbone_gw_put(backbone_gw); + + /* reference for hash_entry */ + batadv_backbone_gw_put(backbone_gw); + } + } while (purged); } } @@ -1492,7 +1522,7 @@ static void batadv_bla_periodic_work(struct work_struct *work) batadv_bla_send_loopdetect(bat_priv, backbone_gw); - /* request_sent is only set after creation to avoid + /* state is only set to unsynced after creation to avoid * problems when we are not yet known as backbone gw * in the backbone. * @@ -1501,14 +1531,21 @@ static void batadv_bla_periodic_work(struct work_struct *work) * some grace time. */ - if (atomic_read(&backbone_gw->request_sent) == 0) - continue; + spin_lock_bh(&bat_priv->bla.num_requests_lock); + if (backbone_gw->state != BATADV_BLA_BACKBONE_GW_UNSYNCED) + goto unlock_next; - if (!atomic_dec_and_test(&backbone_gw->wait_periods)) - continue; + if (backbone_gw->wait_periods > 0) + backbone_gw->wait_periods--; + if (backbone_gw->wait_periods > 0) + goto unlock_next; + + backbone_gw->state = BATADV_BLA_BACKBONE_GW_SYNCED; atomic_dec(&backbone_gw->bat_priv->bla.num_requests); - atomic_set(&backbone_gw->request_sent, 0); + +unlock_next: + spin_unlock_bh(&bat_priv->bla.num_requests_lock); } rcu_read_unlock(); } diff --git a/net/batman-adv/distributed-arp-table.c b/net/batman-adv/distributed-arp-table.c index 3efc4cf50b46..0a8bd95e2f99 100644 --- a/net/batman-adv/distributed-arp-table.c +++ b/net/batman-adv/distributed-arp-table.c @@ -696,6 +696,9 @@ static bool batadv_dat_forward_data(struct batadv_priv *bat_priv, goto free_orig; tmp_skb = pskb_copy_for_clone(skb, GFP_ATOMIC); + if (!tmp_skb) + goto free_neigh; + if (!batadv_send_skb_prepare_unicast_4addr(bat_priv, tmp_skb, cand[i].orig_node, packet_subtype)) { diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c index f4e45cc25816..4a594aa2ebf6 100644 --- a/net/batman-adv/fragmentation.c +++ b/net/batman-adv/fragmentation.c @@ -17,6 +17,7 @@ #include <linux/lockdep.h> #include <linux/minmax.h> #include <linux/netdevice.h> +#include <linux/overflow.h> #include <linux/skbuff.h> #include <linux/slab.h> #include <linux/spinlock.h> @@ -80,9 +81,9 @@ void batadv_frag_purge_orig(struct batadv_orig_node *orig_node, * * Return: the maximum size of payload that can be fragmented. */ -static int batadv_frag_size_limit(void) +static size_t batadv_frag_size_limit(void) { - int limit = BATADV_FRAG_MAX_FRAG_SIZE; + size_t limit = BATADV_FRAG_MAX_FRAG_SIZE; limit -= sizeof(struct batadv_frag_packet); limit *= BATADV_FRAG_MAX_FRAGMENTS; @@ -143,7 +144,9 @@ static bool batadv_frag_insert_packet(struct batadv_orig_node *orig_node, struct batadv_frag_packet *frag_packet; u8 bucket; u16 seqno, hdr_size = sizeof(struct batadv_frag_packet); + bool overflow = false; bool ret = false; + size_t data_len; /* Linearize packet to avoid linearizing 16 packets in a row when doing * the later merge. Non-linear merge should be added to remove this @@ -153,6 +156,7 @@ static bool batadv_frag_insert_packet(struct batadv_orig_node *orig_node, goto err; frag_packet = (struct batadv_frag_packet *)skb->data; + data_len = skb->len - hdr_size; seqno = ntohs(frag_packet->seqno); bucket = seqno % BATADV_FRAG_BUFFER_COUNT; @@ -171,7 +175,7 @@ static bool batadv_frag_insert_packet(struct batadv_orig_node *orig_node, spin_lock_bh(&chain->lock); if (batadv_frag_init_chain(chain, seqno)) { hlist_add_head(&frag_entry_new->list, &chain->fragment_list); - chain->size = skb->len - hdr_size; + chain->size = data_len; chain->timestamp = jiffies; chain->total_size = ntohs(frag_packet->total_size); ret = true; @@ -188,7 +192,11 @@ static bool batadv_frag_insert_packet(struct batadv_orig_node *orig_node, if (frag_entry_curr->no < frag_entry_new->no) { hlist_add_before(&frag_entry_new->list, &frag_entry_curr->list); - chain->size += skb->len - hdr_size; + + if (check_add_overflow(chain->size, data_len, + &chain->size)) + overflow = true; + chain->timestamp = jiffies; ret = true; goto out; @@ -201,13 +209,16 @@ static bool batadv_frag_insert_packet(struct batadv_orig_node *orig_node, /* Reached the end of the list, so insert after 'frag_entry_last'. */ if (likely(frag_entry_last)) { hlist_add_behind(&frag_entry_new->list, &frag_entry_last->list); - chain->size += skb->len - hdr_size; + + if (check_add_overflow(chain->size, data_len, &chain->size)) + overflow = true; + chain->timestamp = jiffies; ret = true; } out: - if (chain->size > batadv_frag_size_limit() || + if (overflow || chain->size > batadv_frag_size_limit() || chain->total_size != ntohs(frag_packet->total_size) || chain->total_size > batadv_frag_size_limit()) { /* Clear chain if total size of either the list or the packet @@ -294,6 +305,31 @@ free: } /** + * batadv_skb_is_frag() - check if newly merged skb is gain a unicast packet + * @skb: newly merged skb + * + * Return: if newly skb is of type BATADV_UNICAST_FRAG + */ +static bool batadv_skb_is_frag(struct sk_buff *skb) +{ + struct batadv_ogm_packet *batadv_ogm_packet; + + /* packet should hold at least type and version */ + if (unlikely(!pskb_may_pull(skb, 2))) + return false; + + batadv_ogm_packet = (struct batadv_ogm_packet *)skb->data; + + if (batadv_ogm_packet->version != BATADV_COMPAT_VERSION) + return false; + + if (batadv_ogm_packet->packet_type != BATADV_UNICAST_FRAG) + return false; + + return true; +} + +/** * batadv_frag_skb_buffer() - buffer fragment for later merge * @skb: skb to buffer * @orig_node_src: originator that the skb is received from @@ -326,6 +362,16 @@ bool batadv_frag_skb_buffer(struct sk_buff **skb, if (!skb_out) goto out_err; + /* fragment in fragment is not allowed. otherwise it is possible + * to exhaust the stack when receiving a matryoshka-style + * "fragments in a fragment packet" + */ + if (batadv_skb_is_frag(skb_out)) { + kfree_skb(skb_out); + skb_out = NULL; + goto out_err; + } + out: ret = true; out_err: diff --git a/net/batman-adv/gateway_client.c b/net/batman-adv/gateway_client.c index 51e9c081a2a4..a9d0346e8332 100644 --- a/net/batman-adv/gateway_client.c +++ b/net/batman-adv/gateway_client.c @@ -478,10 +478,14 @@ void batadv_gw_node_delete(struct batadv_priv *bat_priv, */ void batadv_gw_node_free(struct batadv_priv *bat_priv) { + struct batadv_gw_node *curr_gw; struct batadv_gw_node *gw_node; struct hlist_node *node_tmp; spin_lock_bh(&bat_priv->gw.list_lock); + curr_gw = rcu_replace_pointer(bat_priv->gw.curr_gw, NULL, true); + batadv_gw_node_put(curr_gw); + hlist_for_each_entry_safe(gw_node, node_tmp, &bat_priv->gw.gateway_list, list) { hlist_del_init_rcu(&gw_node->list); diff --git a/net/batman-adv/mesh-interface.c b/net/batman-adv/mesh-interface.c index 56ca1c1b83f2..e7aa45bc6b7a 100644 --- a/net/batman-adv/mesh-interface.c +++ b/net/batman-adv/mesh-interface.c @@ -787,6 +787,7 @@ static int batadv_meshif_init_late(struct net_device *dev) atomic_set(&bat_priv->tt.ogm_append_cnt, 0); #ifdef CONFIG_BATMAN_ADV_BLA atomic_set(&bat_priv->bla.num_requests, 0); + spin_lock_init(&bat_priv->bla.num_requests_lock); #endif atomic_set(&bat_priv->tp_num, 0); diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c index b3468ccab535..ad4921b659d9 100644 --- a/net/batman-adv/originator.c +++ b/net/batman-adv/originator.c @@ -835,8 +835,6 @@ static void batadv_orig_node_free_rcu(struct rcu_head *rcu) orig_node = container_of(rcu, struct batadv_orig_node, rcu); - batadv_mcast_purge_orig(orig_node); - batadv_frag_purge_orig(orig_node, NULL); kfree(orig_node->tt_buff); @@ -887,6 +885,8 @@ void batadv_orig_node_release(struct kref *ref) } spin_unlock_bh(&orig_node->vlan_list_lock); + batadv_mcast_purge_orig(orig_node); + call_rcu(&orig_node->rcu, batadv_orig_node_free_rcu); } diff --git a/net/batman-adv/tp_meter.c b/net/batman-adv/tp_meter.c index 066c76113fc4..0fc4ca78e84e 100644 --- a/net/batman-adv/tp_meter.c +++ b/net/batman-adv/tp_meter.c @@ -8,6 +8,7 @@ #include "main.h" #include <linux/atomic.h> +#include <linux/bug.h> #include <linux/build_bug.h> #include <linux/byteorder/generic.h> #include <linux/cache.h> @@ -254,6 +255,7 @@ static void batadv_tp_batctl_error_notify(enum batadv_tp_meter_reason reason, * batadv_tp_list_find() - find a tp_vars object in the global list * @bat_priv: the bat priv with all the mesh interface information * @dst: the other endpoint MAC address to look for + * @role: role of the session * * Look for a tp_vars object matching dst as end_point and return it after * having increment the refcounter. Return NULL is not found @@ -261,7 +263,8 @@ static void batadv_tp_batctl_error_notify(enum batadv_tp_meter_reason reason, * Return: matching tp_vars or NULL when no tp_vars with @dst was found */ static struct batadv_tp_vars *batadv_tp_list_find(struct batadv_priv *bat_priv, - const u8 *dst) + const u8 *dst, + enum batadv_tp_meter_role role) { struct batadv_tp_vars *pos, *tp_vars = NULL; @@ -270,6 +273,9 @@ static struct batadv_tp_vars *batadv_tp_list_find(struct batadv_priv *bat_priv, if (!batadv_compare_eth(pos->other_end, dst)) continue; + if (pos->role != role) + continue; + /* most of the time this function is invoked during the normal * process..it makes sens to pay more when the session is * finished and to speed the process up during the measurement @@ -286,11 +292,32 @@ static struct batadv_tp_vars *batadv_tp_list_find(struct batadv_priv *bat_priv, } /** + * batadv_tp_list_active() - check if session from/to destination is ongoing + * @bat_priv: the bat priv with all the mesh interface information + * @dst: the other endpoint MAC address to look for + * + * Return: if matching session with @dst was found + */ +static bool batadv_tp_list_active(struct batadv_priv *bat_priv, const u8 *dst) + __must_hold(&bat_priv->tp_list_lock) +{ + struct batadv_tp_vars *tp_vars; + + hlist_for_each_entry_rcu(tp_vars, &bat_priv->tp_list, list) { + if (batadv_compare_eth(tp_vars->other_end, dst)) + return true; + } + + return false; +} + +/** * batadv_tp_list_find_session() - find tp_vars session object in the global * list * @bat_priv: the bat priv with all the mesh interface information * @dst: the other endpoint MAC address to look for * @session: session identifier + * @role: role of the session * * Look for a tp_vars object matching dst as end_point, session as tp meter * session and return it after having increment the refcounter. Return NULL @@ -300,7 +327,7 @@ static struct batadv_tp_vars *batadv_tp_list_find(struct batadv_priv *bat_priv, */ static struct batadv_tp_vars * batadv_tp_list_find_session(struct batadv_priv *bat_priv, const u8 *dst, - const u8 *session) + const u8 *session, enum batadv_tp_meter_role role) { struct batadv_tp_vars *pos, *tp_vars = NULL; @@ -312,6 +339,9 @@ batadv_tp_list_find_session(struct batadv_priv *bat_priv, const u8 *dst, if (memcmp(pos->session, session, sizeof(pos->session)) != 0) continue; + if (pos->role != role) + continue; + /* most of the time this function is invoked during the normal * process..it makes sense to pay more when the session is * finished and to speed the process up during the measurement @@ -400,13 +430,7 @@ static void batadv_tp_sender_cleanup(struct batadv_tp_vars *tp_vars) batadv_tp_list_detach(tp_vars); /* kill the timer and remove its reference */ - timer_delete_sync(&tp_vars->timer); - /* the worker might have rearmed itself therefore we kill it again. Note - * that if the worker should run again before invoking the following - * timer_delete(), it would not re-arm itself once again because the status - * is OFF now - */ - timer_delete(&tp_vars->timer); + timer_shutdown_sync(&tp_vars->timer); batadv_tp_vars_put(tp_vars); } @@ -418,11 +442,14 @@ static void batadv_tp_sender_cleanup(struct batadv_tp_vars *tp_vars) static void batadv_tp_sender_end(struct batadv_priv *bat_priv, struct batadv_tp_vars *tp_vars) { + enum batadv_tp_meter_reason reason; u32 session_cookie; + reason = atomic_read(&tp_vars->send_result); + batadv_dbg(BATADV_DBG_TP_METER, bat_priv, "Test towards %pM finished..shutting down (reason=%d)\n", - tp_vars->other_end, tp_vars->reason); + tp_vars->other_end, reason); batadv_dbg(BATADV_DBG_TP_METER, bat_priv, "Last timing stats: SRTT=%ums RTTVAR=%ums RTO=%ums\n", @@ -435,7 +462,7 @@ static void batadv_tp_sender_end(struct batadv_priv *bat_priv, session_cookie = batadv_tp_session_cookie(tp_vars->session, tp_vars->icmp_uid); - batadv_tp_batctl_notify(tp_vars->reason, + batadv_tp_batctl_notify(reason, tp_vars->other_end, bat_priv, tp_vars->start_time, @@ -451,10 +478,18 @@ static void batadv_tp_sender_end(struct batadv_priv *bat_priv, static void batadv_tp_sender_shutdown(struct batadv_tp_vars *tp_vars, enum batadv_tp_meter_reason reason) { - if (!atomic_dec_and_test(&tp_vars->sending)) - return; + atomic_cmpxchg(&tp_vars->send_result, 0, reason); +} - tp_vars->reason = reason; +/** + * batadv_tp_sender_stopped() - check if tp session was stopped with reason + * @tp_vars: the private data of the current TP meter session + * + * Return: whether stop reason was found + */ +static bool batadv_tp_sender_stopped(struct batadv_tp_vars *tp_vars) +{ + return atomic_read(&tp_vars->send_result) != 0; } /** @@ -484,7 +519,7 @@ static void batadv_tp_reset_sender_timer(struct batadv_tp_vars *tp_vars) /* most of the time this function is invoked while normal packet * reception... */ - if (unlikely(atomic_read(&tp_vars->sending) == 0)) + if (unlikely(batadv_tp_sender_stopped(tp_vars))) /* timer ref will be dropped in batadv_tp_sender_cleanup */ return; @@ -504,7 +539,7 @@ static void batadv_tp_sender_timeout(struct timer_list *t) struct batadv_tp_vars *tp_vars = timer_container_of(tp_vars, t, timer); struct batadv_priv *bat_priv = tp_vars->bat_priv; - if (atomic_read(&tp_vars->sending) == 0) + if (batadv_tp_sender_stopped(tp_vars)) return; /* if the user waited long enough...shutdown the test */ @@ -659,11 +694,11 @@ static void batadv_tp_recv_ack(struct batadv_priv *bat_priv, /* find the tp_vars */ tp_vars = batadv_tp_list_find_session(bat_priv, icmp->orig, - icmp->session); + icmp->session, BATADV_TP_SENDER); if (unlikely(!tp_vars)) return; - if (unlikely(atomic_read(&tp_vars->sending) == 0)) + if (unlikely(batadv_tp_sender_stopped(tp_vars))) goto out; /* old ACK? silently drop it.. */ @@ -829,21 +864,21 @@ static int batadv_tp_send(void *arg) if (unlikely(tp_vars->role != BATADV_TP_SENDER)) { err = BATADV_TP_REASON_DST_UNREACHABLE; - tp_vars->reason = err; + batadv_tp_sender_shutdown(tp_vars, err); goto out; } orig_node = batadv_orig_hash_find(bat_priv, tp_vars->other_end); if (unlikely(!orig_node)) { err = BATADV_TP_REASON_DST_UNREACHABLE; - tp_vars->reason = err; + batadv_tp_sender_shutdown(tp_vars, err); goto out; } primary_if = batadv_primary_if_get_selected(bat_priv); if (unlikely(!primary_if)) { err = BATADV_TP_REASON_DST_UNREACHABLE; - tp_vars->reason = err; + batadv_tp_sender_shutdown(tp_vars, err); goto out; } @@ -862,7 +897,7 @@ static int batadv_tp_send(void *arg) queue_delayed_work(batadv_event_workqueue, &tp_vars->finish_work, msecs_to_jiffies(tp_vars->test_length)); - while (atomic_read(&tp_vars->sending) != 0) { + while (!batadv_tp_sender_stopped(tp_vars)) { if (unlikely(!batadv_tp_avail(tp_vars, payload_len))) { batadv_tp_wait_available(tp_vars, payload_len); continue; @@ -885,8 +920,7 @@ static int batadv_tp_send(void *arg) "Meter: %s() cannot send packets (%d)\n", __func__, err); /* ensure nobody else tries to stop the thread now */ - if (atomic_dec_and_test(&tp_vars->sending)) - tp_vars->reason = err; + batadv_tp_sender_shutdown(tp_vars, err); break; } @@ -972,10 +1006,8 @@ void batadv_tp_start(struct batadv_priv *bat_priv, const u8 *dst, return; } - tp_vars = batadv_tp_list_find(bat_priv, dst); - if (tp_vars) { + if (batadv_tp_list_active(bat_priv, dst)) { spin_unlock_bh(&bat_priv->tp_list_lock); - batadv_tp_vars_put(tp_vars); batadv_dbg(BATADV_DBG_TP_METER, bat_priv, "Meter: test to or from the same node already ongoing, aborting\n"); batadv_tp_batctl_error_notify(BATADV_TP_REASON_ALREADY_ONGOING, @@ -1008,7 +1040,7 @@ void batadv_tp_start(struct batadv_priv *bat_priv, const u8 *dst, ether_addr_copy(tp_vars->other_end, dst); kref_init(&tp_vars->refcount); tp_vars->role = BATADV_TP_SENDER; - atomic_set(&tp_vars->sending, 1); + atomic_set(&tp_vars->send_result, 0); memcpy(tp_vars->session, session_id, sizeof(session_id)); tp_vars->icmp_uid = icmp_uid; @@ -1096,16 +1128,16 @@ void batadv_tp_stop(struct batadv_priv *bat_priv, const u8 *dst, if (!orig_node) return; - tp_vars = batadv_tp_list_find(bat_priv, orig_node->orig); + tp_vars = batadv_tp_list_find(bat_priv, orig_node->orig, BATADV_TP_SENDER); if (!tp_vars) { batadv_dbg(BATADV_DBG_TP_METER, bat_priv, "Meter: trying to interrupt an already over connection\n"); - goto out; + goto out_put_orig_node; } batadv_tp_sender_shutdown(tp_vars, return_value); batadv_tp_vars_put(tp_vars); -out: +out_put_orig_node: batadv_orig_node_put(orig_node); } @@ -1156,6 +1188,9 @@ static void batadv_tp_receiver_shutdown(struct timer_list *t) spin_unlock_bh(&tp_vars->unacked_lock); /* drop reference of timer */ + if (WARN_ON(atomic_xchg(&tp_vars->receiving, 0) != 1)) + return; + batadv_tp_vars_put(tp_vars); } @@ -1356,7 +1391,7 @@ batadv_tp_init_recv(struct batadv_priv *bat_priv, goto out_unlock; tp_vars = batadv_tp_list_find_session(bat_priv, icmp->orig, - icmp->session); + icmp->session, BATADV_TP_RECEIVER); if (tp_vars) goto out_unlock; @@ -1374,6 +1409,7 @@ batadv_tp_init_recv(struct batadv_priv *bat_priv, ether_addr_copy(tp_vars->other_end, icmp->orig); tp_vars->role = BATADV_TP_RECEIVER; + atomic_set(&tp_vars->receiving, 1); memcpy(tp_vars->session, icmp->session, sizeof(tp_vars->session)); tp_vars->last_recv = BATADV_TP_FIRST_SEQ; tp_vars->bat_priv = bat_priv; @@ -1426,7 +1462,7 @@ static void batadv_tp_recv_msg(struct batadv_priv *bat_priv, } } else { tp_vars = batadv_tp_list_find_session(bat_priv, icmp->orig, - icmp->session); + icmp->session, BATADV_TP_RECEIVER); if (!tp_vars) { batadv_dbg(BATADV_DBG_TP_METER, bat_priv, "Unexpected packet from %pM!\n", @@ -1435,13 +1471,6 @@ static void batadv_tp_recv_msg(struct batadv_priv *bat_priv, } } - if (unlikely(tp_vars->role != BATADV_TP_RECEIVER)) { - batadv_dbg(BATADV_DBG_TP_METER, bat_priv, - "Meter: dropping packet: not expected (role=%u)\n", - tp_vars->role); - goto out; - } - tp_vars->last_recv_time = jiffies; /* if the packet is a duplicate, it may be the case that an ACK has been @@ -1546,8 +1575,12 @@ void batadv_tp_stop_all(struct batadv_priv *bat_priv) break; case BATADV_TP_RECEIVER: batadv_tp_list_detach(tp_var); - if (timer_shutdown_sync(&tp_var->timer)) - batadv_tp_vars_put(tp_var); + timer_shutdown_sync(&tp_var->timer); + + if (atomic_xchg(&tp_var->receiving, 0) != 1) + break; + + batadv_tp_vars_put(tp_var); break; } diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c index 05cddcf994f6..9f6e67771ffa 100644 --- a/net/batman-adv/translation-table.c +++ b/net/batman-adv/translation-table.c @@ -797,24 +797,33 @@ batadv_tt_prepare_tvlv_global_data(struct batadv_orig_node *orig_node, s32 *tt_len) { u16 num_vlan = 0; - u16 num_entries = 0; u16 tvlv_len = 0; unsigned int change_offset; struct batadv_tvlv_tt_vlan_data *tt_vlan; struct batadv_orig_node_vlan *vlan; + u16 total_entries = 0; u8 *tt_change_ptr; + int vlan_entries; + u16 sum_entries; spin_lock_bh(&orig_node->vlan_list_lock); hlist_for_each_entry(vlan, &orig_node->vlan_list, list) { + vlan_entries = atomic_read(&vlan->tt.num_entries); + + if (check_add_overflow(vlan_entries, total_entries, &sum_entries)) { + *tt_len = 0; + goto out; + } + + total_entries = sum_entries; num_vlan++; - num_entries += atomic_read(&vlan->tt.num_entries); } change_offset = struct_size(*tt_data, vlan_data, num_vlan); /* if tt_len is negative, allocate the space needed by the full table */ if (*tt_len < 0) - *tt_len = batadv_tt_len(num_entries); + *tt_len = batadv_tt_len(total_entries); if (change_offset > U16_MAX || *tt_len > U16_MAX - change_offset) { *tt_len = 0; @@ -835,14 +844,26 @@ batadv_tt_prepare_tvlv_global_data(struct batadv_orig_node *orig_node, (*tt_data)->num_vlan = htons(num_vlan); tt_vlan = (*tt_data)->vlan_data; + num_vlan = 0; hlist_for_each_entry(vlan, &orig_node->vlan_list, list) { + vlan_entries = atomic_read(&vlan->tt.num_entries); + if (vlan_entries < 1) + continue; + tt_vlan->vid = htons(vlan->vid); tt_vlan->crc = htonl(vlan->tt.crc); tt_vlan->reserved = 0; tt_vlan++; + num_vlan++; } + /* recalculate in case number of VLANs reduced */ + change_offset = struct_size(*tt_data, vlan_data, num_vlan); + tvlv_len = *tt_len + change_offset; + + (*tt_data)->num_vlan = htons(num_vlan); + tt_change_ptr = (u8 *)*tt_data + change_offset; *tt_change = (struct batadv_tvlv_tt_change *)tt_change_ptr; @@ -877,21 +898,25 @@ batadv_tt_prepare_tvlv_local_data(struct batadv_priv *bat_priv, { struct batadv_tvlv_tt_vlan_data *tt_vlan; struct batadv_meshif_vlan *vlan; + size_t change_offset; u16 num_vlan = 0; - u16 vlan_entries = 0; u16 total_entries = 0; u16 tvlv_len; u8 *tt_change_ptr; - int change_offset; + int vlan_entries; + u16 sum_entries; spin_lock_bh(&bat_priv->meshif_vlan_list_lock); hlist_for_each_entry(vlan, &bat_priv->meshif_vlan_list, list) { vlan_entries = atomic_read(&vlan->tt.num_entries); - if (vlan_entries < 1) - continue; + if (check_add_overflow(vlan_entries, total_entries, &sum_entries)) { + tvlv_len = 0; + goto out; + } + + total_entries = sum_entries; num_vlan++; - total_entries += vlan_entries; } change_offset = struct_size(*tt_data, vlan_data, num_vlan); @@ -900,8 +925,10 @@ batadv_tt_prepare_tvlv_local_data(struct batadv_priv *bat_priv, if (*tt_len < 0) *tt_len = batadv_tt_len(total_entries); - tvlv_len = *tt_len; - tvlv_len += change_offset; + if (check_add_overflow(*tt_len, change_offset, &tvlv_len)) { + tvlv_len = 0; + goto out; + } *tt_data = kmalloc(tvlv_len, GFP_ATOMIC); if (!*tt_data) { @@ -914,6 +941,7 @@ batadv_tt_prepare_tvlv_local_data(struct batadv_priv *bat_priv, (*tt_data)->num_vlan = htons(num_vlan); tt_vlan = (*tt_data)->vlan_data; + num_vlan = 0; hlist_for_each_entry(vlan, &bat_priv->meshif_vlan_list, list) { vlan_entries = atomic_read(&vlan->tt.num_entries); if (vlan_entries < 1) @@ -924,8 +952,15 @@ batadv_tt_prepare_tvlv_local_data(struct batadv_priv *bat_priv, tt_vlan->reserved = 0; tt_vlan++; + num_vlan++; } + /* recalculate in case number of VLANs reduced */ + change_offset = struct_size(*tt_data, vlan_data, num_vlan); + tvlv_len = *tt_len + change_offset; + + (*tt_data)->num_vlan = htons(num_vlan); + tt_change_ptr = (u8 *)*tt_data + change_offset; *tt_change = (struct batadv_tvlv_tt_change *)tt_change_ptr; diff --git a/net/batman-adv/tvlv.c b/net/batman-adv/tvlv.c index 8129a3f9c44d..cc6ac580c620 100644 --- a/net/batman-adv/tvlv.c +++ b/net/batman-adv/tvlv.c @@ -8,10 +8,12 @@ #include <linux/byteorder/generic.h> #include <linux/container_of.h> +#include <linux/errno.h> #include <linux/etherdevice.h> #include <linux/gfp.h> #include <linux/if_ether.h> #include <linux/kref.h> +#include <linux/limits.h> #include <linux/list.h> #include <linux/lockdep.h> #include <linux/netdevice.h> @@ -159,10 +161,10 @@ batadv_tvlv_container_get(struct batadv_priv *bat_priv, u8 type, u8 version) * * Return: size of all currently registered tvlv containers in bytes. */ -static u16 batadv_tvlv_container_list_size(struct batadv_priv *bat_priv) +static size_t batadv_tvlv_container_list_size(struct batadv_priv *bat_priv) { struct batadv_tvlv_container *tvlv; - u16 tvlv_len = 0; + size_t tvlv_len = 0; lockdep_assert_held(&bat_priv->tvlv.container_list_lock); @@ -306,26 +308,35 @@ static bool batadv_tvlv_realloc_packet_buff(unsigned char **packet_buff, * The ogm packet might be enlarged or shrunk depending on the current size * and the size of the to-be-appended tvlv containers. * - * Return: size of all appended tvlv containers in bytes. + * Return: size of all appended tvlv containers in bytes (max U16_MAX), negative + * if operation failed */ -u16 batadv_tvlv_container_ogm_append(struct batadv_priv *bat_priv, +int batadv_tvlv_container_ogm_append(struct batadv_priv *bat_priv, unsigned char **packet_buff, int *packet_buff_len, int packet_min_len) { struct batadv_tvlv_container *tvlv; struct batadv_tvlv_hdr *tvlv_hdr; - u16 tvlv_value_len; + size_t tvlv_value_len; void *tvlv_value; + int tvlv_len_ret; bool ret; spin_lock_bh(&bat_priv->tvlv.container_list_lock); tvlv_value_len = batadv_tvlv_container_list_size(bat_priv); + if (tvlv_value_len > U16_MAX) { + tvlv_len_ret = -E2BIG; + goto end; + } ret = batadv_tvlv_realloc_packet_buff(packet_buff, packet_buff_len, packet_min_len, tvlv_value_len); - - if (!ret) + if (!ret) { + tvlv_len_ret = -ENOMEM; goto end; + } + + tvlv_len_ret = tvlv_value_len; if (!tvlv_value_len) goto end; @@ -344,7 +355,8 @@ u16 batadv_tvlv_container_ogm_append(struct batadv_priv *bat_priv, end: spin_unlock_bh(&bat_priv->tvlv.container_list_lock); - return tvlv_value_len; + + return tvlv_len_ret; } /** diff --git a/net/batman-adv/tvlv.h b/net/batman-adv/tvlv.h index e5697230d991..f96f6b3f44a0 100644 --- a/net/batman-adv/tvlv.h +++ b/net/batman-adv/tvlv.h @@ -16,7 +16,7 @@ void batadv_tvlv_container_register(struct batadv_priv *bat_priv, u8 type, u8 version, void *tvlv_value, u16 tvlv_value_len); -u16 batadv_tvlv_container_ogm_append(struct batadv_priv *bat_priv, +int batadv_tvlv_container_ogm_append(struct batadv_priv *bat_priv, unsigned char **packet_buff, int *packet_buff_len, int packet_min_len); void batadv_tvlv_ogm_receive(struct batadv_priv *bat_priv, diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h index daa06f421154..a01ee46d97f3 100644 --- a/net/batman-adv/types.h +++ b/net/batman-adv/types.h @@ -83,6 +83,9 @@ struct batadv_hard_iface_bat_iv { /** @ogm_seqno: OGM sequence number - used to identify each OGM */ atomic_t ogm_seqno; + /** @reschedule_work: recover OGM schedule after schedule error */ + struct delayed_work reschedule_work; + /** @ogm_buff_mutex: lock protecting ogm_buff and ogm_buff_len */ struct mutex ogm_buff_mutex; }; @@ -301,7 +304,7 @@ struct batadv_frag_table_entry { u16 seqno; /** @size: accumulated size of packets in list */ - u16 size; + size_t size; /** @total_size: expected size of the assembled packet */ u16 total_size; @@ -452,7 +455,7 @@ struct batadv_orig_node { * @tt_buff_len: length of the last tt changeset this node received * from the orig node */ - s16 tt_buff_len; + u16 tt_buff_len; /** @tt_buff_lock: lock that protects tt_buff and tt_buff_len */ spinlock_t tt_buff_lock; @@ -993,7 +996,7 @@ struct batadv_priv_tt { * @last_changeset_len: length of last tt changeset this host has * generated */ - s16 last_changeset_len; + u16 last_changeset_len; /** * @last_changeset_lock: lock protecting last_changeset & @@ -1024,6 +1027,12 @@ struct batadv_priv_bla { atomic_t num_requests; /** + * @num_requests_lock: locks update num_requests + + * batadv_backbone_gw::state + batadv_backbone_gw::wait_periods update + */ + spinlock_t num_requests_lock; + + /** * @claim_hash: hash table containing mesh nodes this host has claimed */ struct batadv_hashtable *claim_hash; @@ -1320,11 +1329,14 @@ struct batadv_tp_vars { /** @role: receiver/sender modi */ enum batadv_tp_meter_role role; - /** @sending: sending binary semaphore: 1 if sending, 0 is not */ - atomic_t sending; + /** + * @send_result: 0 when sending is ongoing and otherwise + * enum batadv_tp_meter_reason + */ + atomic_t send_result; - /** @reason: reason for a stopped session */ - enum batadv_tp_meter_reason reason; + /** @receiving: receiving binary semaphore: 1 if receiving, 0 is not */ + atomic_t receiving; /** @finish_work: work item for the finishing procedure */ struct delayed_work finish_work; @@ -1666,6 +1678,27 @@ struct batadv_priv { #ifdef CONFIG_BATMAN_ADV_BLA +enum batadv_bla_backbone_gw_state { + /** + * @BATADV_BLA_BACKBONE_GW_STOPPED: backbone gw is being removed + * and it must not longer work on requests + */ + BATADV_BLA_BACKBONE_GW_STOPPED, + + /** + * @BATADV_BLA_BACKBONE_GW_UNSYNCED: backbone was detected out + * of sync and a request was send. No traffic is forwarded until the + * situation is resolved + */ + BATADV_BLA_BACKBONE_GW_UNSYNCED, + + /** + * @BATADV_BLA_BACKBONE_GW_SYNCED: backbone is consider to be in + * sync. traffic can be forwarded + */ + BATADV_BLA_BACKBONE_GW_SYNCED, +}; + /** * struct batadv_bla_backbone_gw - batman-adv gateway bridged into the LAN */ @@ -1691,16 +1724,12 @@ struct batadv_bla_backbone_gw { /** * @wait_periods: grace time for bridge forward delays and bla group * forming at bootup phase - no bcast traffic is formwared until it has - * elapsed + * elapsed. Must only be access with num_requests_lock. */ - atomic_t wait_periods; + u8 wait_periods; - /** - * @request_sent: if this bool is set to true we are out of sync with - * this backbone gateway - no bcast traffic is formwared until the - * situation was resolved - */ - atomic_t request_sent; + /** @state: sync state. Must only be access with num_requests_lock. */ + enum batadv_bla_backbone_gw_state state; /** @crc: crc16 checksum over all claims */ u16 crc; diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c index 2b94e2077203..70e35e198075 100644 --- a/net/bluetooth/af_bluetooth.c +++ b/net/bluetooth/af_bluetooth.c @@ -154,6 +154,7 @@ struct sock *bt_sock_alloc(struct net *net, struct socket *sock, sock_init_data(sock, sk); INIT_LIST_HEAD(&bt_sk(sk)->accept_q); + spin_lock_init(&bt_sk(sk)->accept_q_lock); sock_reset_flag(sk, SOCK_ZAPPED); @@ -214,6 +215,7 @@ void bt_accept_enqueue(struct sock *parent, struct sock *sk, bool bh) { const struct cred *old_cred; struct pid *old_pid; + struct bt_sock *par = bt_sk(parent); BT_DBG("parent %p, sk %p", parent, sk); @@ -224,9 +226,13 @@ void bt_accept_enqueue(struct sock *parent, struct sock *sk, bool bh) else lock_sock_nested(sk, SINGLE_DEPTH_NESTING); - list_add_tail(&bt_sk(sk)->accept_q, &bt_sk(parent)->accept_q); bt_sk(sk)->parent = parent; + spin_lock_bh(&par->accept_q_lock); + list_add_tail(&bt_sk(sk)->accept_q, &par->accept_q); + sk_acceptq_added(parent); + spin_unlock_bh(&par->accept_q_lock); + /* Copy credentials from parent since for incoming connections the * socket is allocated by the kernel. */ @@ -244,8 +250,6 @@ void bt_accept_enqueue(struct sock *parent, struct sock *sk, bool bh) bh_unlock_sock(sk); else release_sock(sk); - - sk_acceptq_added(parent); } EXPORT_SYMBOL(bt_accept_enqueue); @@ -254,45 +258,72 @@ EXPORT_SYMBOL(bt_accept_enqueue); */ void bt_accept_unlink(struct sock *sk) { + struct sock *parent = bt_sk(sk)->parent; + BT_DBG("sk %p state %d", sk, sk->sk_state); + spin_lock_bh(&bt_sk(parent)->accept_q_lock); list_del_init(&bt_sk(sk)->accept_q); - sk_acceptq_removed(bt_sk(sk)->parent); + sk_acceptq_removed(parent); + spin_unlock_bh(&bt_sk(parent)->accept_q_lock); bt_sk(sk)->parent = NULL; sock_put(sk); } EXPORT_SYMBOL(bt_accept_unlink); +static struct sock *bt_accept_get(struct sock *parent, struct sock *sk) +{ + struct bt_sock *bt = bt_sk(parent); + struct sock *next = NULL; + + /* accept_q is modified from child teardown paths too, so take a + * temporary reference before dropping the queue lock. + */ + spin_lock_bh(&bt->accept_q_lock); + + if (sk) { + if (bt_sk(sk)->parent != parent) + goto out; + + if (!list_is_last(&bt_sk(sk)->accept_q, &bt->accept_q)) { + next = &list_next_entry(bt_sk(sk), accept_q)->sk; + sock_hold(next); + } + } else if (!list_empty(&bt->accept_q)) { + next = &list_first_entry(&bt->accept_q, + struct bt_sock, accept_q)->sk; + sock_hold(next); + } + +out: + spin_unlock_bh(&bt->accept_q_lock); + return next; +} + struct sock *bt_accept_dequeue(struct sock *parent, struct socket *newsock) { - struct bt_sock *s, *n; - struct sock *sk; + struct sock *sk, *next; BT_DBG("parent %p", parent); restart: - list_for_each_entry_safe(s, n, &bt_sk(parent)->accept_q, accept_q) { - sk = (struct sock *)s; - + for (sk = bt_accept_get(parent, NULL); sk; sk = next) { /* Prevent early freeing of sk due to unlink and sock_kill */ - sock_hold(sk); lock_sock(sk); /* Check sk has not already been unlinked via * bt_accept_unlink() due to serialisation caused by sk locking */ - if (!bt_sk(sk)->parent) { + if (bt_sk(sk)->parent != parent) { BT_DBG("sk %p, already unlinked", sk); release_sock(sk); sock_put(sk); - /* Restart the loop as sk is no longer in the list - * and also avoid a potential infinite loop because - * list_for_each_entry_safe() is not thread safe. - */ goto restart; } + next = bt_accept_get(parent, sk); + /* sk is safely in the parent list so reduce reference count */ sock_put(sk); @@ -309,7 +340,19 @@ restart: if (newsock) sock_graft(sk, newsock); + /* Hand the caller a reference taken while sk is + * still locked. bt_accept_unlink() just dropped + * the accept-queue reference; without this hold a + * concurrent teardown (e.g. l2cap_conn_del() -> + * l2cap_sock_kill()) could free sk between + * release_sock() and the caller using it. Every + * caller drops this with sock_put() when done. + */ + sock_hold(sk); + release_sock(sk); + if (next) + sock_put(next); return sk; } @@ -518,18 +561,28 @@ EXPORT_SYMBOL(bt_sock_stream_recvmsg); static inline __poll_t bt_accept_poll(struct sock *parent) { - struct bt_sock *s, *n; + struct bt_sock *bt = bt_sk(parent); + struct bt_sock *s; struct sock *sk; + __poll_t mask = 0; + + spin_lock_bh(&bt->accept_q_lock); + list_for_each_entry(s, &bt->accept_q, accept_q) { + int state; - list_for_each_entry_safe(s, n, &bt_sk(parent)->accept_q, accept_q) { sk = (struct sock *)s; - if (sk->sk_state == BT_CONNECTED || - (test_bit(BT_SK_DEFER_SETUP, &bt_sk(parent)->flags) && - sk->sk_state == BT_CONNECT2)) - return EPOLLIN | EPOLLRDNORM; + state = READ_ONCE(sk->sk_state); + + if (state == BT_CONNECTED || + (test_bit(BT_SK_DEFER_SETUP, &bt->flags) && + state == BT_CONNECT2)) { + mask = EPOLLIN | EPOLLRDNORM; + break; + } } + spin_unlock_bh(&bt->accept_q_lock); - return 0; + return mask; } __poll_t bt_sock_poll(struct file *file, struct socket *sock, diff --git a/net/bluetooth/bnep/core.c b/net/bluetooth/bnep/core.c index d44987d4515c..b3cef7a4db54 100644 --- a/net/bluetooth/bnep/core.c +++ b/net/bluetooth/bnep/core.c @@ -638,8 +638,8 @@ int bnep_add_connection(struct bnep_connadd_req *req, struct socket *sock) goto failed; } - up_write(&bnep_session_sem); strcpy(req->device, dev->name); + up_write(&bnep_session_sem); return 0; failed: diff --git a/net/bluetooth/hci_sync.c b/net/bluetooth/hci_sync.c index 919ec275dd23..426f465be355 100644 --- a/net/bluetooth/hci_sync.c +++ b/net/bluetooth/hci_sync.c @@ -4438,6 +4438,9 @@ static int hci_le_set_event_mask_sync(struct hci_dev *hdev) events[4] |= 0x02; /* LE BIG Info Advertising Report */ } + if (ll_ext_feature_capable(hdev)) + events[5] |= BIT(2); + if (le_cs_capable(hdev)) { /* Channel Sounding events */ events[5] |= 0x08; /* LE CS Read Remote Supported Cap Complete event */ @@ -7413,9 +7416,6 @@ static int hci_le_read_all_remote_features_sync(struct hci_dev *hdev, sizeof(cp), &cp, HCI_EVT_LE_ALL_REMOTE_FEATURES_COMPLETE, HCI_CMD_TIMEOUT, NULL); - - return __hci_cmd_sync_status(hdev, HCI_OP_LE_READ_ALL_REMOTE_FEATURES, - sizeof(cp), &cp, HCI_CMD_TIMEOUT); } static int hci_le_read_remote_features_sync(struct hci_dev *hdev, void *data) diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c index be145e2736b7..c72830744d56 100644 --- a/net/bluetooth/iso.c +++ b/net/bluetooth/iso.c @@ -759,6 +759,8 @@ static void iso_sock_cleanup_listen(struct sock *parent) while ((sk = bt_accept_dequeue(parent, NULL))) { iso_sock_close(sk); iso_sock_kill(sk); + /* Drop the reference handed back by bt_accept_dequeue(). */ + sock_put(sk); } /* If listening socket has a hcon, properly disconnect it */ @@ -1364,8 +1366,13 @@ static int iso_sock_accept(struct socket *sock, struct socket *newsock, } ch = bt_accept_dequeue(sk, newsock); - if (ch) + if (ch) { + /* Drop the bridging ref from bt_accept_dequeue(); + * the grafted socket keeps ch alive from here. + */ + sock_put(ch); break; + } if (!timeo) { err = -EAGAIN; @@ -2587,6 +2594,11 @@ int iso_recv(struct hci_dev *hdev, u16 handle, struct sk_buff *skb, u16 flags) break; case ISO_END: + if (!conn->rx_len) { + BT_ERR("Unexpected end frame (len %d)", skb->len); + goto drop; + } + skb_copy_from_linear_data(skb, skb_put(conn->rx_skb, skb->len), skb->len); conn->rx_len -= skb->len; diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c index 0d8053a3fc0a..99297d8f2c1f 100644 --- a/net/bluetooth/l2cap_core.c +++ b/net/bluetooth/l2cap_core.c @@ -7275,7 +7275,7 @@ static void l2cap_ecred_reconfigure(struct l2cap_chan *chan) chan->ident = l2cap_get_ident(conn); l2cap_send_cmd(conn, chan->ident, L2CAP_ECRED_RECONF_REQ, - sizeof(pdu), &pdu); + struct_size(pdu, scid, 1), pdu); } int l2cap_chan_reconfigure(struct l2cap_chan *chan, __u16 mtu) diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c index cf590a67d364..b34e7da8d906 100644 --- a/net/bluetooth/l2cap_sock.c +++ b/net/bluetooth/l2cap_sock.c @@ -349,8 +349,13 @@ static int l2cap_sock_accept(struct socket *sock, struct socket *newsock, } nsk = bt_accept_dequeue(sk, newsock); - if (nsk) + if (nsk) { + /* Drop the bridging ref from bt_accept_dequeue(); + * the grafted socket keeps nsk alive from here. + */ + sock_put(nsk); break; + } if (!timeo) { err = -EAGAIN; @@ -1475,22 +1480,54 @@ static void l2cap_sock_cleanup_listen(struct sock *parent) BT_DBG("parent %p state %s", parent, state_to_string(parent->sk_state)); - /* Close not yet accepted channels */ + /* Close not yet accepted channels. + * + * bt_accept_dequeue() now returns sk with an extra reference held + * (taken while sk was still locked) so a concurrent l2cap_conn_del() + * -> l2cap_sock_kill() cannot free sk under us. + * + * cleanup_listen() runs under the parent sk lock, so unlike + * l2cap_sock_shutdown() we must NOT take conn->lock here: that would + * establish sk_lock -> conn->lock and invert the established + * conn->lock -> chan->lock -> sk_lock order (lockdep deadlock). + * + * Instead, briefly take the child sk lock to fetch and pin its chan. + * l2cap_conn_del() reaches the chan free only via + * l2cap_chan_del() -> l2cap_sock_teardown_cb(), which itself takes + * the child sk lock; holding it across l2cap_chan_hold_unless_zero() + * therefore guarantees the chan cannot be freed while we read and + * pin it (hold_unless_zero() additionally skips a chan already past + * its last reference). We then drop the sk lock before taking + * chan->lock, so sk and chan locks are never held together. + */ while ((sk = bt_accept_dequeue(parent, NULL))) { - struct l2cap_chan *chan = l2cap_pi(sk)->chan; + struct l2cap_chan *chan; + + lock_sock_nested(sk, L2CAP_NESTING_NORMAL); + chan = l2cap_chan_hold_unless_zero(l2cap_pi(sk)->chan); + release_sock(sk); + if (!chan) { + /* l2cap_conn_del() already tearing this child down */ + sock_put(sk); + continue; + } BT_DBG("child chan %p state %s", chan, state_to_string(chan->state)); - l2cap_chan_hold(chan); l2cap_chan_lock(chan); - __clear_chan_timer(chan); l2cap_chan_close(chan, ECONNRESET); - l2cap_sock_kill(sk); - + /* l2cap_conn_del() may already have killed this socket + * (it sets SOCK_DEAD); skip the duplicate to avoid a + * double sock_put()/l2cap_chan_put(). + */ + if (!sock_flag(sk, SOCK_DEAD)) + l2cap_sock_kill(sk); l2cap_chan_unlock(chan); + l2cap_chan_put(chan); + sock_put(sk); } } diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c index b05bb380e5f8..de5bd6b637b2 100644 --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -9110,9 +9110,15 @@ static int add_ext_adv_data(struct sock *sk, struct hci_dev *hdev, void *data, struct adv_info *adv_instance; int err = 0; struct mgmt_pending_cmd *cmd; + u16 expected_len; BT_DBG("%s", hdev->name); + expected_len = struct_size(cp, data, cp->adv_data_len + cp->scan_rsp_len); + if (expected_len != data_len) + return mgmt_cmd_status(sk, hdev->id, MGMT_OP_ADD_EXT_ADV_DATA, + MGMT_STATUS_INVALID_PARAMS); + hci_dev_lock(hdev); adv_instance = hci_find_adv_instance(hdev, cp->instance); diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c index be6639cd6f59..bd7d959c6e9e 100644 --- a/net/bluetooth/rfcomm/sock.c +++ b/net/bluetooth/rfcomm/sock.c @@ -180,6 +180,8 @@ static void rfcomm_sock_cleanup_listen(struct sock *parent) while ((sk = bt_accept_dequeue(parent, NULL))) { rfcomm_sock_close(sk); rfcomm_sock_kill(sk); + /* Drop the reference handed back by bt_accept_dequeue(). */ + sock_put(sk); } parent->sk_state = BT_CLOSED; @@ -497,8 +499,13 @@ static int rfcomm_sock_accept(struct socket *sock, struct socket *newsock, } nsk = bt_accept_dequeue(sk, newsock); - if (nsk) + if (nsk) { + /* Drop the bridging ref from bt_accept_dequeue(); + * the grafted socket keeps nsk alive from here. + */ + sock_put(nsk); break; + } if (!timeo) { err = -EAGAIN; diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c index 18826d4b9c0b..770b9d6fad88 100644 --- a/net/bluetooth/sco.c +++ b/net/bluetooth/sco.c @@ -498,6 +498,8 @@ static void sco_sock_cleanup_listen(struct sock *parent) while ((sk = bt_accept_dequeue(parent, NULL))) { sco_sock_close(sk); sco_sock_kill(sk); + /* Drop the reference handed back by bt_accept_dequeue(). */ + sock_put(sk); } parent->sk_state = BT_CLOSED; @@ -759,8 +761,13 @@ static int sco_sock_accept(struct socket *sock, struct socket *newsock, } ch = bt_accept_dequeue(sk, newsock); - if (ch) + if (ch) { + /* Drop the bridging ref from bt_accept_dequeue(); + * the grafted socket keeps ch alive from here. + */ + sock_put(ch); break; + } if (!timeo) { err = -EAGAIN; diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c index 881d866d687a..2eef4f3345cd 100644 --- a/net/bridge/br_multicast.c +++ b/net/bridge/br_multicast.c @@ -4640,10 +4640,24 @@ static void br_multicast_start_querier(struct net_bridge_mcast *brmctx, rcu_read_unlock(); } -static void br_multicast_del_grps(struct net_bridge *br) +static void br_multicast_enable_all_ports(struct net_bridge *br) { struct net_bridge_port *port; + if (br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED)) + return; + + list_for_each_entry(port, &br->port_list, list) + __br_multicast_enable_port_ctx(&port->multicast_ctx); +} + +static void br_multicast_disable_all_ports(struct net_bridge *br) +{ + struct net_bridge_port *port; + + if (br_opt_get(br, BROPT_MCAST_VLAN_SNOOPING_ENABLED)) + return; + list_for_each_entry(port, &br->port_list, list) __br_multicast_disable_port_ctx(&port->multicast_ctx); } @@ -4651,7 +4665,6 @@ static void br_multicast_del_grps(struct net_bridge *br) int br_multicast_toggle(struct net_bridge *br, unsigned long val, struct netlink_ext_ack *extack) { - struct net_bridge_port *port; bool change_snoopers = false; int err = 0; @@ -4668,7 +4681,7 @@ int br_multicast_toggle(struct net_bridge *br, unsigned long val, br_opt_toggle(br, BROPT_MULTICAST_ENABLED, !!val); if (!br_opt_get(br, BROPT_MULTICAST_ENABLED)) { change_snoopers = true; - br_multicast_del_grps(br); + br_multicast_disable_all_ports(br); goto unlock; } @@ -4676,8 +4689,7 @@ int br_multicast_toggle(struct net_bridge *br, unsigned long val, goto unlock; br_multicast_open(br); - list_for_each_entry(port, &br->port_list, list) - __br_multicast_enable_port_ctx(&port->multicast_ctx); + br_multicast_enable_all_ports(br); change_snoopers = true; diff --git a/net/bridge/netfilter/ebtable_broute.c b/net/bridge/netfilter/ebtable_broute.c index 741360219552..f05c79f215ea 100644 --- a/net/bridge/netfilter/ebtable_broute.c +++ b/net/bridge/netfilter/ebtable_broute.c @@ -112,24 +112,22 @@ static struct pernet_operations broute_net_ops = { static int __init ebtable_broute_init(void) { - int ret = ebt_register_template(&broute_table, broute_table_init); + int ret = register_pernet_subsys(&broute_net_ops); if (ret) return ret; - ret = register_pernet_subsys(&broute_net_ops); - if (ret) { - ebt_unregister_template(&broute_table); - return ret; - } + ret = ebt_register_template(&broute_table, broute_table_init); + if (ret) + unregister_pernet_subsys(&broute_net_ops); - return 0; + return ret; } static void __exit ebtable_broute_fini(void) { - unregister_pernet_subsys(&broute_net_ops); ebt_unregister_template(&broute_table); + unregister_pernet_subsys(&broute_net_ops); } module_init(ebtable_broute_init); diff --git a/net/bridge/netfilter/ebtable_filter.c b/net/bridge/netfilter/ebtable_filter.c index dacd81b12e62..0fc03b07e62a 100644 --- a/net/bridge/netfilter/ebtable_filter.c +++ b/net/bridge/netfilter/ebtable_filter.c @@ -93,24 +93,22 @@ static struct pernet_operations frame_filter_net_ops = { static int __init ebtable_filter_init(void) { - int ret = ebt_register_template(&frame_filter, frame_filter_table_init); + int ret = register_pernet_subsys(&frame_filter_net_ops); if (ret) return ret; - ret = register_pernet_subsys(&frame_filter_net_ops); - if (ret) { - ebt_unregister_template(&frame_filter); - return ret; - } + ret = ebt_register_template(&frame_filter, frame_filter_table_init); + if (ret) + unregister_pernet_subsys(&frame_filter_net_ops); - return 0; + return ret; } static void __exit ebtable_filter_fini(void) { - unregister_pernet_subsys(&frame_filter_net_ops); ebt_unregister_template(&frame_filter); + unregister_pernet_subsys(&frame_filter_net_ops); } module_init(ebtable_filter_init); diff --git a/net/bridge/netfilter/ebtable_nat.c b/net/bridge/netfilter/ebtable_nat.c index 0f2a8c6118d4..8a10375d8909 100644 --- a/net/bridge/netfilter/ebtable_nat.c +++ b/net/bridge/netfilter/ebtable_nat.c @@ -93,24 +93,22 @@ static struct pernet_operations frame_nat_net_ops = { static int __init ebtable_nat_init(void) { - int ret = ebt_register_template(&frame_nat, frame_nat_table_init); + int ret = register_pernet_subsys(&frame_nat_net_ops); if (ret) return ret; - ret = register_pernet_subsys(&frame_nat_net_ops); - if (ret) { - ebt_unregister_template(&frame_nat); - return ret; - } + ret = ebt_register_template(&frame_nat, frame_nat_table_init); + if (ret) + unregister_pernet_subsys(&frame_nat_net_ops); return ret; } static void __exit ebtable_nat_fini(void) { - unregister_pernet_subsys(&frame_nat_net_ops); ebt_unregister_template(&frame_nat); + unregister_pernet_subsys(&frame_nat_net_ops); } module_init(ebtable_nat_init); diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c index aea3e19875c6..b9f4daac09af 100644 --- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -42,6 +42,7 @@ struct ebt_pernet { struct list_head tables; + struct list_head dead_tables; }; struct ebt_template { @@ -1162,11 +1163,6 @@ free_newinfo: static void __ebt_unregister_table(struct net *net, struct ebt_table *table) { - mutex_lock(&ebt_mutex); - list_del(&table->list); - mutex_unlock(&ebt_mutex); - audit_log_nfcfg(table->name, AF_BRIDGE, table->private->nentries, - AUDIT_XT_OP_UNREGISTER, GFP_KERNEL); EBT_ENTRY_ITERATE(table->private->entries, table->private->entries_size, ebt_cleanup_entry, net, NULL); if (table->private->nentries) @@ -1267,13 +1263,15 @@ int ebt_register_table(struct net *net, const struct ebt_table *input_table, for (i = 0; i < num_ops; i++) ops[i].priv = table; - list_add(&table->list, &ebt_net->tables); - mutex_unlock(&ebt_mutex); - table->ops = ops; ret = nf_register_net_hooks(net, ops, num_ops); - if (ret) + if (ret) { + synchronize_rcu(); __ebt_unregister_table(net, table); + } else { + list_add(&table->list, &ebt_net->tables); + } + mutex_unlock(&ebt_mutex); audit_log_nfcfg(repl->name, AF_BRIDGE, repl->nentries, AUDIT_XT_OP_REGISTER, GFP_KERNEL); @@ -1339,7 +1337,7 @@ void ebt_unregister_template(const struct ebt_table *t) } EXPORT_SYMBOL(ebt_unregister_template); -static struct ebt_table *__ebt_find_table(struct net *net, const char *name) +void ebt_unregister_table_pre_exit(struct net *net, const char *name) { struct ebt_pernet *ebt_net = net_generic(net, ebt_pernet_id); struct ebt_table *t; @@ -1348,30 +1346,36 @@ static struct ebt_table *__ebt_find_table(struct net *net, const char *name) list_for_each_entry(t, &ebt_net->tables, list) { if (strcmp(t->name, name) == 0) { + list_move(&t->list, &ebt_net->dead_tables); mutex_unlock(&ebt_mutex); - return t; + nf_unregister_net_hooks(net, t->ops, hweight32(t->valid_hooks)); + return; } } mutex_unlock(&ebt_mutex); - return NULL; -} - -void ebt_unregister_table_pre_exit(struct net *net, const char *name) -{ - struct ebt_table *table = __ebt_find_table(net, name); - - if (table) - nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } EXPORT_SYMBOL(ebt_unregister_table_pre_exit); void ebt_unregister_table(struct net *net, const char *name) { - struct ebt_table *table = __ebt_find_table(net, name); + struct ebt_pernet *ebt_net = net_generic(net, ebt_pernet_id); + struct ebt_table *t; - if (table) - __ebt_unregister_table(net, table); + mutex_lock(&ebt_mutex); + + list_for_each_entry(t, &ebt_net->dead_tables, list) { + if (strcmp(t->name, name) == 0) { + list_del(&t->list); + audit_log_nfcfg(t->name, AF_BRIDGE, t->private->nentries, + AUDIT_XT_OP_UNREGISTER, GFP_KERNEL); + __ebt_unregister_table(net, t); + mutex_unlock(&ebt_mutex); + return; + } + } + + mutex_unlock(&ebt_mutex); } /* userspace just supplied us with counters */ @@ -2556,11 +2560,21 @@ static int __net_init ebt_pernet_init(struct net *net) struct ebt_pernet *ebt_net = net_generic(net, ebt_pernet_id); INIT_LIST_HEAD(&ebt_net->tables); + INIT_LIST_HEAD(&ebt_net->dead_tables); return 0; } +static void __net_exit ebt_pernet_exit(struct net *net) +{ + struct ebt_pernet *ebt_net = net_generic(net, ebt_pernet_id); + + WARN_ON_ONCE(!list_empty(&ebt_net->tables)); + WARN_ON_ONCE(!list_empty(&ebt_net->dead_tables)); +} + static struct pernet_operations ebt_net_ops = { .init = ebt_pernet_init, + .exit = ebt_pernet_exit, .id = &ebt_pernet_id, .size = sizeof(struct ebt_pernet), }; @@ -2569,19 +2583,20 @@ static int __init ebtables_init(void) { int ret; - ret = xt_register_target(&ebt_standard_target); + ret = register_pernet_subsys(&ebt_net_ops); if (ret < 0) return ret; - ret = nf_register_sockopt(&ebt_sockopts); + + ret = xt_register_target(&ebt_standard_target); if (ret < 0) { - xt_unregister_target(&ebt_standard_target); + unregister_pernet_subsys(&ebt_net_ops); return ret; } - ret = register_pernet_subsys(&ebt_net_ops); + ret = nf_register_sockopt(&ebt_sockopts); if (ret < 0) { - nf_unregister_sockopt(&ebt_sockopts); xt_unregister_target(&ebt_standard_target); + unregister_pernet_subsys(&ebt_net_ops); return ret; } diff --git a/net/core/dev.c b/net/core/dev.c index e4fcf09ba2be..fab5a0bebd92 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6855,9 +6855,9 @@ static void skb_defer_free_flush(void) #if defined(CONFIG_NET_RX_BUSY_POLL) -static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) +static void __busy_poll_stop(struct napi_struct *napi, unsigned long timeout) { - if (!skip_schedule) { + if (!timeout) { gro_normal_list(&napi->gro); __napi_schedule(napi); return; @@ -6867,6 +6867,8 @@ static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) gro_flush_normal(&napi->gro, HZ >= 1000); clear_bit(NAPI_STATE_SCHED, &napi->state); + hrtimer_start(&napi->timer, ns_to_ktime(timeout), + HRTIMER_MODE_REL_PINNED); } enum { @@ -6878,8 +6880,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, unsigned flags, u16 budget) { struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; - bool skip_schedule = false; - unsigned long timeout; + unsigned long timeout = 0; int rc; /* Busy polling means there is a high chance device driver hard irq @@ -6899,10 +6900,12 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, if (flags & NAPI_F_PREFER_BUSY_POLL) { napi->defer_hard_irqs_count = napi_get_defer_hard_irqs(napi); - timeout = napi_get_gro_flush_timeout(napi); - if (napi->defer_hard_irqs_count && timeout) { - hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED); - skip_schedule = true; + if (napi->defer_hard_irqs_count) { + /* A short enough gro flush timeout and long enough + * poll can result in timer firing too early. + * Timer will be armed later if necessary. + */ + timeout = napi_get_gro_flush_timeout(napi); } } @@ -6917,7 +6920,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, trace_napi_poll(napi, rc, budget); netpoll_poll_unlock(have_poll_lock); if (rc == budget) - __busy_poll_stop(napi, skip_schedule); + __busy_poll_stop(napi, timeout); bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable(); } diff --git a/net/core/devmem.c b/net/core/devmem.c index 69d79aee07ef..c84eb683c025 100644 --- a/net/core/devmem.c +++ b/net/core/devmem.c @@ -241,6 +241,11 @@ net_devmem_bind_dmabuf(struct net_device *dev, } if (direction == DMA_TO_DEVICE) { + if (!IS_ALIGNED(dmabuf->size, PAGE_SIZE)) { + err = -EINVAL; + NL_SET_ERR_MSG(extack, "TX dma-buf size must be a multiple of PAGE_SIZE"); + goto err_unmap; + } binding->tx_vec = kvmalloc_objs(struct net_iov *, dmabuf->size / PAGE_SIZE); if (!binding->tx_vec) { @@ -267,6 +272,12 @@ net_devmem_bind_dmabuf(struct net_device *dev, size_t len = sg_dma_len(sg); struct net_iov *niov; + if (!IS_ALIGNED(len, PAGE_SIZE)) { + err = -EINVAL; + NL_SET_ERR_MSG(extack, "dma-buf SG length must be PAGE_SIZE aligned"); + goto err_free_chunks; + } + owner = kzalloc_node(sizeof(*owner), GFP_KERNEL, dev_to_node(&dev->dev)); if (!owner) { diff --git a/net/core/gro.c b/net/core/gro.c index 9f8960789b2c..a84753983467 100644 --- a/net/core/gro.c +++ b/net/core/gro.c @@ -109,6 +109,9 @@ int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb) if (p->pp_recycle != skb->pp_recycle) return -ETOOMANYREFS; + if (skb_zcopy(p) || skb_zcopy(skb)) + return -ETOOMANYREFS; + if (unlikely(p->len + len >= netif_get_gro_max_size(p->dev, p) || NAPI_GRO_CB(skb)->flush)) return -E2BIG; diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 6187a83bd741..e1850caf1a71 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -1268,12 +1268,19 @@ out: static void sk_psock_verdict_data_ready(struct sock *sk) { const struct proto_ops *ops = NULL; + struct sk_psock *psock; struct socket *sock; int copied; trace_sk_data_ready(sk); rcu_read_lock(); + psock = sk_psock(sk); + if (psock && tls_sw_has_ctx_rx(sk)) { + psock->saved_data_ready(sk); + rcu_read_unlock(); + return; + } sock = READ_ONCE(sk->sk_socket); if (likely(sock)) ops = READ_ONCE(sock->ops); @@ -1283,8 +1290,6 @@ static void sk_psock_verdict_data_ready(struct sock *sk) copied = ops->read_skb(sk, sk_psock_verdict_recv); if (copied >= 0) { - struct sk_psock *psock; - rcu_read_lock(); psock = sk_psock(sk); if (psock) diff --git a/net/ethtool/bitset.c b/net/ethtool/bitset.c index f0883357d12e..4691d6d0f2b7 100644 --- a/net/ethtool/bitset.c +++ b/net/ethtool/bitset.c @@ -91,7 +91,7 @@ static bool ethnl_bitmap32_not_zero(const u32 *map, unsigned int start, u32 mask; if (end <= start) - return true; + return false; if (start % 32) { mask = ethnl_upper_bits(start); @@ -104,11 +104,11 @@ static bool ethnl_bitmap32_not_zero(const u32 *map, unsigned int start, start_word++; } - if (!memchr_inv(map + start_word, '\0', - (end_word - start_word) * sizeof(u32))) + if (memchr_inv(map + start_word, '\0', + (end_word - start_word) * sizeof(u32))) return true; if (end % 32 == 0) - return true; + return false; return map[end_word] & ethnl_lower_bits(end); } diff --git a/net/ethtool/phy.c b/net/ethtool/phy.c index 68372bef4b2f..98392a3c34b5 100644 --- a/net/ethtool/phy.c +++ b/net/ethtool/phy.c @@ -76,6 +76,7 @@ static int phy_prepare_data(const struct ethnl_req_info *req_info, struct nlattr **tb = info->attrs; struct phy_device_node *pdn; struct phy_device *phydev; + int ret; /* RTNL is held by the caller */ phydev = ethnl_req_get_phydev(req_info, tb, ETHTOOL_A_PHY_HEADER, @@ -88,8 +89,19 @@ static int phy_prepare_data(const struct ethnl_req_info *req_info, return -EOPNOTSUPP; rep_data->phyindex = phydev->phyindex; + rep_data->name = kstrdup(dev_name(&phydev->mdio.dev), GFP_KERNEL); - rep_data->drvname = kstrdup(phydev->drv->name, GFP_KERNEL); + if (!rep_data->name) + return -ENOMEM; + + if (phydev->drv) { + rep_data->drvname = kstrdup(phydev->drv->name, GFP_KERNEL); + if (!rep_data->drvname) { + ret = -ENOMEM; + goto err_free_name; + } + } + rep_data->upstream_type = pdn->upstream_type; if (pdn->upstream_type == PHY_UPSTREAM_PHY) { @@ -97,15 +109,33 @@ static int phy_prepare_data(const struct ethnl_req_info *req_info, rep_data->upstream_index = upstream->phyindex; } - if (pdn->parent_sfp_bus) + if (pdn->parent_sfp_bus) { rep_data->upstream_sfp_name = kstrdup(sfp_get_name(pdn->parent_sfp_bus), GFP_KERNEL); + if (!rep_data->upstream_sfp_name) { + ret = -ENOMEM; + goto err_free_drvname; + } + } - if (phydev->sfp_bus) + if (phydev->sfp_bus) { rep_data->downstream_sfp_name = kstrdup(sfp_get_name(phydev->sfp_bus), GFP_KERNEL); + if (!rep_data->downstream_sfp_name) { + ret = -ENOMEM; + goto err_free_upstream_sfp; + } + } return 0; + +err_free_upstream_sfp: + kfree(rep_data->upstream_sfp_name); +err_free_drvname: + kfree(rep_data->drvname); +err_free_name: + kfree(rep_data->name); + return ret; } static int phy_fill_reply(struct sk_buff *skb, diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index d41863593674..f268e469af4f 100644 --- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -163,8 +163,8 @@ void hsr_del_nodes(struct list_head *node_db) struct hsr_node *tmp; list_for_each_entry_safe(node, tmp, node_db, mac_list) { - list_del(&node->mac_list); - hsr_free_node(node); + list_del_rcu(&node->mac_list); + call_rcu(&node->rcu_head, hsr_free_node_rcu); } } diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index bc987a59a095..f1988fd50354 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -1137,7 +1137,7 @@ no_ownership: } drop: - __inet_csk_reqsk_queue_drop(sk_listener, oreq, true); + __inet_csk_reqsk_queue_drop(oreq->rsk_listener, oreq, true); reqsk_put(oreq); } diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 97ead883e4a1..ad2259678c78 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -1501,13 +1501,11 @@ static int do_arpt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len static void __arpt_unregister_table(struct net *net, struct xt_table *table) { - struct xt_table_info *private; - void *loc_cpu_entry; + struct xt_table_info *private = table->private; struct module *table_owner = table->me; + void *loc_cpu_entry; struct arpt_entry *iter; - private = xt_unregister_table(table); - /* Decrease module usage counts and free resources */ loc_cpu_entry = private->entries; xt_entry_foreach(iter, loc_cpu_entry, private->size) @@ -1515,6 +1513,7 @@ static void __arpt_unregister_table(struct net *net, struct xt_table *table) if (private->number > private->initial_entries) module_put(table_owner); xt_free_table_info(private); + kfree(table); } int arpt_register_table(struct net *net, @@ -1522,13 +1521,11 @@ int arpt_register_table(struct net *net, const struct arpt_replace *repl, const struct nf_hook_ops *template_ops) { - struct nf_hook_ops *ops; - unsigned int num_ops; - int ret, i; - struct xt_table_info *newinfo; struct xt_table_info bootstrap = {0}; - void *loc_cpu_entry; + struct xt_table_info *newinfo; struct xt_table *new_table; + void *loc_cpu_entry; + int ret; newinfo = xt_alloc_table_info(repl->size); if (!newinfo) @@ -1543,7 +1540,7 @@ int arpt_register_table(struct net *net, return ret; } - new_table = xt_register_table(net, table, &bootstrap, newinfo); + new_table = xt_register_table(net, table, template_ops, &bootstrap, newinfo); if (IS_ERR(new_table)) { struct arpt_entry *iter; @@ -1553,46 +1550,12 @@ int arpt_register_table(struct net *net, return PTR_ERR(new_table); } - num_ops = hweight32(table->valid_hooks); - if (num_ops == 0) { - ret = -EINVAL; - goto out_free; - } - - ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); - if (!ops) { - ret = -ENOMEM; - goto out_free; - } - - for (i = 0; i < num_ops; i++) - ops[i].priv = new_table; - - new_table->ops = ops; - - ret = nf_register_net_hooks(net, ops, num_ops); - if (ret != 0) - goto out_free; - return ret; - -out_free: - __arpt_unregister_table(net, new_table); - return ret; -} - -void arpt_unregister_table_pre_exit(struct net *net, const char *name) -{ - struct xt_table *table = xt_find_table(net, NFPROTO_ARP, name); - - if (table) - nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } -EXPORT_SYMBOL(arpt_unregister_table_pre_exit); void arpt_unregister_table(struct net *net, const char *name) { - struct xt_table *table = xt_find_table(net, NFPROTO_ARP, name); + struct xt_table *table = xt_unregister_table_exit(net, NFPROTO_ARP, name); if (table) __arpt_unregister_table(net, table); diff --git a/net/ipv4/netfilter/arptable_filter.c b/net/ipv4/netfilter/arptable_filter.c index 78cd5ee24448..370b635e3523 100644 --- a/net/ipv4/netfilter/arptable_filter.c +++ b/net/ipv4/netfilter/arptable_filter.c @@ -43,7 +43,7 @@ static int arptable_filter_table_init(struct net *net) static void __net_exit arptable_filter_net_pre_exit(struct net *net) { - arpt_unregister_table_pre_exit(net, "filter"); + xt_unregister_table_pre_exit(net, NFPROTO_ARP, "filter"); } static void __net_exit arptable_filter_net_exit(struct net *net) @@ -58,32 +58,33 @@ static struct pernet_operations arptable_filter_net_ops = { static int __init arptable_filter_init(void) { - int ret = xt_register_template(&packet_filter, - arptable_filter_table_init); - - if (ret < 0) - return ret; + int ret; arpfilter_ops = xt_hook_ops_alloc(&packet_filter, arpt_do_table); - if (IS_ERR(arpfilter_ops)) { - xt_unregister_template(&packet_filter); + if (IS_ERR(arpfilter_ops)) return PTR_ERR(arpfilter_ops); - } ret = register_pernet_subsys(&arptable_filter_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_filter, + arptable_filter_table_init); if (ret < 0) { - xt_unregister_template(&packet_filter); - kfree(arpfilter_ops); - return ret; + unregister_pernet_subsys(&arptable_filter_net_ops); + goto err_free; } + return 0; +err_free: + kfree(arpfilter_ops); return ret; } static void __exit arptable_filter_fini(void) { - unregister_pernet_subsys(&arptable_filter_net_ops); xt_unregister_template(&packet_filter); + unregister_pernet_subsys(&arptable_filter_net_ops); kfree(arpfilter_ops); } diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 23c8deff8095..5cbdb0815857 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1704,12 +1704,10 @@ do_ipt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) static void __ipt_unregister_table(struct net *net, struct xt_table *table) { - struct xt_table_info *private; - void *loc_cpu_entry; + struct xt_table_info *private = table->private; struct module *table_owner = table->me; struct ipt_entry *iter; - - private = xt_unregister_table(table); + void *loc_cpu_entry; /* Decrease module usage counts and free resources */ loc_cpu_entry = private->entries; @@ -1718,19 +1716,18 @@ static void __ipt_unregister_table(struct net *net, struct xt_table *table) if (private->number > private->initial_entries) module_put(table_owner); xt_free_table_info(private); + kfree(table); } int ipt_register_table(struct net *net, const struct xt_table *table, const struct ipt_replace *repl, const struct nf_hook_ops *template_ops) { - struct nf_hook_ops *ops; - unsigned int num_ops; - int ret, i; - struct xt_table_info *newinfo; struct xt_table_info bootstrap = {0}; - void *loc_cpu_entry; + struct xt_table_info *newinfo; struct xt_table *new_table; + void *loc_cpu_entry; + int ret; newinfo = xt_alloc_table_info(repl->size); if (!newinfo) @@ -1745,7 +1742,7 @@ int ipt_register_table(struct net *net, const struct xt_table *table, return ret; } - new_table = xt_register_table(net, table, &bootstrap, newinfo); + new_table = xt_register_table(net, table, template_ops, &bootstrap, newinfo); if (IS_ERR(new_table)) { struct ipt_entry *iter; @@ -1755,51 +1752,12 @@ int ipt_register_table(struct net *net, const struct xt_table *table, return PTR_ERR(new_table); } - /* No template? No need to do anything. This is used by 'nat' table, it registers - * with the nat core instead of the netfilter core. - */ - if (!template_ops) - return 0; - - num_ops = hweight32(table->valid_hooks); - if (num_ops == 0) { - ret = -EINVAL; - goto out_free; - } - - ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); - if (!ops) { - ret = -ENOMEM; - goto out_free; - } - - for (i = 0; i < num_ops; i++) - ops[i].priv = new_table; - - new_table->ops = ops; - - ret = nf_register_net_hooks(net, ops, num_ops); - if (ret != 0) - goto out_free; - return ret; - -out_free: - __ipt_unregister_table(net, new_table); - return ret; -} - -void ipt_unregister_table_pre_exit(struct net *net, const char *name) -{ - struct xt_table *table = xt_find_table(net, NFPROTO_IPV4, name); - - if (table) - nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } void ipt_unregister_table_exit(struct net *net, const char *name) { - struct xt_table *table = xt_find_table(net, NFPROTO_IPV4, name); + struct xt_table *table = xt_unregister_table_exit(net, NFPROTO_IPV4, name); if (table) __ipt_unregister_table(net, table); @@ -1887,7 +1845,6 @@ static void __exit ip_tables_fini(void) } EXPORT_SYMBOL(ipt_register_table); -EXPORT_SYMBOL(ipt_unregister_table_pre_exit); EXPORT_SYMBOL(ipt_unregister_table_exit); EXPORT_SYMBOL(ipt_do_table); module_init(ip_tables_init); diff --git a/net/ipv4/netfilter/iptable_filter.c b/net/ipv4/netfilter/iptable_filter.c index 3ab908b74795..672d7da1071d 100644 --- a/net/ipv4/netfilter/iptable_filter.c +++ b/net/ipv4/netfilter/iptable_filter.c @@ -61,7 +61,7 @@ static int __net_init iptable_filter_net_init(struct net *net) static void __net_exit iptable_filter_net_pre_exit(struct net *net) { - ipt_unregister_table_pre_exit(net, "filter"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "filter"); } static void __net_exit iptable_filter_net_exit(struct net *net) @@ -77,32 +77,33 @@ static struct pernet_operations iptable_filter_net_ops = { static int __init iptable_filter_init(void) { - int ret = xt_register_template(&packet_filter, - iptable_filter_table_init); - - if (ret < 0) - return ret; + int ret; filter_ops = xt_hook_ops_alloc(&packet_filter, ipt_do_table); - if (IS_ERR(filter_ops)) { - xt_unregister_template(&packet_filter); + if (IS_ERR(filter_ops)) return PTR_ERR(filter_ops); - } ret = register_pernet_subsys(&iptable_filter_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_filter, + iptable_filter_table_init); if (ret < 0) { - xt_unregister_template(&packet_filter); - kfree(filter_ops); - return ret; + unregister_pernet_subsys(&iptable_filter_net_ops); + goto err_free; } return 0; +err_free: + kfree(filter_ops); + return ret; } static void __exit iptable_filter_fini(void) { - unregister_pernet_subsys(&iptable_filter_net_ops); xt_unregister_template(&packet_filter); + unregister_pernet_subsys(&iptable_filter_net_ops); kfree(filter_ops); } diff --git a/net/ipv4/netfilter/iptable_mangle.c b/net/ipv4/netfilter/iptable_mangle.c index 385d945d8ebe..13d25d9a4610 100644 --- a/net/ipv4/netfilter/iptable_mangle.c +++ b/net/ipv4/netfilter/iptable_mangle.c @@ -96,7 +96,7 @@ static int iptable_mangle_table_init(struct net *net) static void __net_exit iptable_mangle_net_pre_exit(struct net *net) { - ipt_unregister_table_pre_exit(net, "mangle"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "mangle"); } static void __net_exit iptable_mangle_net_exit(struct net *net) @@ -111,32 +111,33 @@ static struct pernet_operations iptable_mangle_net_ops = { static int __init iptable_mangle_init(void) { - int ret = xt_register_template(&packet_mangler, - iptable_mangle_table_init); - if (ret < 0) - return ret; + int ret; mangle_ops = xt_hook_ops_alloc(&packet_mangler, iptable_mangle_hook); - if (IS_ERR(mangle_ops)) { - xt_unregister_template(&packet_mangler); - ret = PTR_ERR(mangle_ops); - return ret; - } + if (IS_ERR(mangle_ops)) + return PTR_ERR(mangle_ops); ret = register_pernet_subsys(&iptable_mangle_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_mangler, + iptable_mangle_table_init); if (ret < 0) { - xt_unregister_template(&packet_mangler); - kfree(mangle_ops); - return ret; + unregister_pernet_subsys(&iptable_mangle_net_ops); + goto err_free; } + return 0; +err_free: + kfree(mangle_ops); return ret; } static void __exit iptable_mangle_fini(void) { - unregister_pernet_subsys(&iptable_mangle_net_ops); xt_unregister_template(&packet_mangler); + unregister_pernet_subsys(&iptable_mangle_net_ops); kfree(mangle_ops); } diff --git a/net/ipv4/netfilter/iptable_nat.c b/net/ipv4/netfilter/iptable_nat.c index 625a1ca13b1b..a0df72554025 100644 --- a/net/ipv4/netfilter/iptable_nat.c +++ b/net/ipv4/netfilter/iptable_nat.c @@ -119,8 +119,11 @@ static int iptable_nat_table_init(struct net *net) } ret = ipt_nat_register_lookups(net); - if (ret < 0) + if (ret < 0) { + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "nat"); + synchronize_rcu(); ipt_unregister_table_exit(net, "nat"); + } kfree(repl); return ret; @@ -129,6 +132,7 @@ static int iptable_nat_table_init(struct net *net) static void __net_exit iptable_nat_net_pre_exit(struct net *net) { ipt_nat_unregister_lookups(net); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "nat"); } static void __net_exit iptable_nat_net_exit(struct net *net) diff --git a/net/ipv4/netfilter/iptable_raw.c b/net/ipv4/netfilter/iptable_raw.c index 0e7f53964d0a..2745c22f4034 100644 --- a/net/ipv4/netfilter/iptable_raw.c +++ b/net/ipv4/netfilter/iptable_raw.c @@ -53,7 +53,7 @@ static int iptable_raw_table_init(struct net *net) static void __net_exit iptable_raw_net_pre_exit(struct net *net) { - ipt_unregister_table_pre_exit(net, "raw"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "raw"); } static void __net_exit iptable_raw_net_exit(struct net *net) @@ -77,32 +77,32 @@ static int __init iptable_raw_init(void) pr_info("Enabling raw table before defrag\n"); } - ret = xt_register_template(table, - iptable_raw_table_init); - if (ret < 0) - return ret; - rawtable_ops = xt_hook_ops_alloc(table, ipt_do_table); - if (IS_ERR(rawtable_ops)) { - xt_unregister_template(table); + if (IS_ERR(rawtable_ops)) return PTR_ERR(rawtable_ops); - } ret = register_pernet_subsys(&iptable_raw_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(table, + iptable_raw_table_init); if (ret < 0) { - xt_unregister_template(table); - kfree(rawtable_ops); - return ret; + unregister_pernet_subsys(&iptable_raw_net_ops); + goto err_free; } + return 0; +err_free: + kfree(rawtable_ops); return ret; } static void __exit iptable_raw_fini(void) { + xt_unregister_template(&packet_raw); unregister_pernet_subsys(&iptable_raw_net_ops); kfree(rawtable_ops); - xt_unregister_template(&packet_raw); } module_init(iptable_raw_init); diff --git a/net/ipv4/netfilter/iptable_security.c b/net/ipv4/netfilter/iptable_security.c index d885443cb267..491894511c54 100644 --- a/net/ipv4/netfilter/iptable_security.c +++ b/net/ipv4/netfilter/iptable_security.c @@ -50,7 +50,7 @@ static int iptable_security_table_init(struct net *net) static void __net_exit iptable_security_net_pre_exit(struct net *net) { - ipt_unregister_table_pre_exit(net, "security"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV4, "security"); } static void __net_exit iptable_security_net_exit(struct net *net) @@ -65,33 +65,34 @@ static struct pernet_operations iptable_security_net_ops = { static int __init iptable_security_init(void) { - int ret = xt_register_template(&security_table, - iptable_security_table_init); - - if (ret < 0) - return ret; + int ret; sectbl_ops = xt_hook_ops_alloc(&security_table, ipt_do_table); - if (IS_ERR(sectbl_ops)) { - xt_unregister_template(&security_table); + if (IS_ERR(sectbl_ops)) return PTR_ERR(sectbl_ops); - } ret = register_pernet_subsys(&iptable_security_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&security_table, + iptable_security_table_init); if (ret < 0) { - xt_unregister_template(&security_table); - kfree(sectbl_ops); - return ret; + unregister_pernet_subsys(&iptable_security_net_ops); + goto err_free; } + return 0; +err_free: + kfree(sectbl_ops); return ret; } static void __exit iptable_security_fini(void) { + xt_unregister_template(&security_table); unregister_pernet_subsys(&iptable_security_net_ops); kfree(sectbl_ops); - xt_unregister_template(&security_table); } module_init(iptable_security_init); diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index e20c41206e29..3ea759b66600 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -390,7 +390,7 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4, * in, reject the frame as invalid */ err = -EINVAL; - if (iphlen > length) + if (iphlen > length || iphlen < sizeof(*iph)) goto error_free; if (iphlen >= sizeof(*iph)) { diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index cee51749df16..f27f50111172 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -299,9 +299,6 @@ enum { DEFINE_PER_CPU(unsigned int, tcp_orphan_count); EXPORT_PER_CPU_SYMBOL_GPL(tcp_orphan_count); -DEFINE_PER_CPU(u32, tcp_tw_isn); -EXPORT_PER_CPU_SYMBOL_GPL(tcp_tw_isn); - long sysctl_tcp_mem[3] __read_mostly; EXPORT_IPV6_MOD(sysctl_tcp_mem); diff --git a/net/ipv4/tcp_ao.c b/net/ipv4/tcp_ao.c index a97cdf3e6af4..0a4b38b315fe 100644 --- a/net/ipv4/tcp_ao.c +++ b/net/ipv4/tcp_ao.c @@ -116,7 +116,8 @@ struct tcp_ao_key *tcp_ao_established_key(const struct sock *sk, { struct tcp_ao_key *key; - hlist_for_each_entry_rcu(key, &ao->head, node, lockdep_sock_is_held(sk)) { + hlist_for_each_entry_rcu(key, &ao->head, node, + sk_fullsock(sk) && lockdep_sock_is_held(sk)) { if ((sndid >= 0 && key->sndid != sndid) || (rcvid >= 0 && key->rcvid != rcvid)) continue; diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index cb4bcc5a8578..a8b626ac1ade 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -7648,6 +7648,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, struct sock *sk, struct sk_buff *skb) { struct tcp_fastopen_cookie foc = { .len = -1 }; + u32 isn = TCP_SKB_CB(skb)->tcp_tw_isn; struct tcp_options_received tmp_opt; const struct tcp_sock *tp = tcp_sk(sk); struct net *net = sock_net(sk); @@ -7658,20 +7659,16 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, struct dst_entry *dst; struct flowi fl; u8 syncookies; - u32 isn; #ifdef CONFIG_TCP_AO const struct tcp_ao_hdr *aoh; #endif - isn = __this_cpu_read(tcp_tw_isn); - if (isn) { - /* TW buckets are converted to open requests without - * limitations, they conserve resources and peer is - * evidently real one. - */ - __this_cpu_write(tcp_tw_isn, 0); - } else { + /* If isn is non-zero, this SYN originally matched a TIME_WAIT socket. + * TW sockets are converted to open requests without limitations, + * we skip the queue limits and syncookie checks in the block below. + */ + if (!isn) { syncookies = READ_ONCE(net->ipv4.sysctl_tcp_syncookies); if (syncookies == 2 || inet_csk_reqsk_queue_is_full(sk)) { diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index c7b2463c2e25..0bda739f3d68 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2274,6 +2274,7 @@ lookup: } } + isn = 0; process: if (static_branch_unlikely(&ip4_min_ttl)) { /* min_ttl can be changed concurrently from do_ip_setsockopt() */ @@ -2302,6 +2303,7 @@ process: th = (const struct tcphdr *)skb->data; iph = ip_hdr(skb); tcp_v4_fill_cb(skb, iph, th); + TCP_SKB_CB(skb)->tcp_tw_isn = isn; skb->dev = NULL; @@ -2387,7 +2389,6 @@ do_time_wait: sk = sk2; tcp_v4_restore_cb(skb); refcounted = false; - __this_cpu_write(tcp_tw_isn, isn); goto process; } diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index 6b1654c1ad4a..9714d40c5b6e 100644 --- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -483,11 +483,11 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, struct sock *sk = gso_skb->sk; unsigned int sum_truesize = 0; struct sk_buff *segs, *seg; - __be16 newlen, msslen; struct udphdr *uh; unsigned int mss; bool copy_dtor; __sum16 check; + __be16 newlen; int ret = 0; mss = skb_shinfo(gso_skb)->gso_size; @@ -556,15 +556,6 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, return segs; } - msslen = htons(sizeof(*uh) + mss); - - /* GSO partial and frag_list segmentation only requires splitting - * the frame into an MSS multiple and possibly a remainder, both - * cases return a GSO skb. So update the mss now. - */ - if (skb_is_gso(segs)) - mss *= skb_shinfo(segs)->gso_segs; - seg = segs; uh = udp_hdr(seg); @@ -587,7 +578,7 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, if (!seg->next) break; - uh->len = msslen; + uh->len = newlen; uh->check = check; if (seg->ip_summed == CHECKSUM_PARTIAL) @@ -600,9 +591,12 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb, uh = udp_hdr(seg); } - /* last packet can be partial gso_size, account for that in checksum */ - newlen = htons(skb_tail_pointer(seg) - skb_transport_header(seg) + - seg->data_len); + /* Unless skb fits perfectly as GSO_PARTIAL, the trailing + * segment may not be full MSS, account for that in the checksum + */ + if (!skb_is_gso(seg)) + newlen = htons(skb_tail_pointer(seg) - + skb_transport_header(seg) + seg->data_len); check = csum16_add(csum16_sub(uh->check, uh->len), newlen); uh->len = newlen; diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 03cbce842c1a..cf90f933ca1a 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -910,16 +910,27 @@ static bool ipv6_hop_ra(struct sk_buff *skb, int optoff) static bool ipv6_hop_ioam(struct sk_buff *skb, int optoff) { + enum skb_drop_reason drop_reason; struct ioam6_trace_hdr *trace; struct ioam6_namespace *ns; + struct inet6_dev *idev; struct ioam6_hdr *hdr; + drop_reason = SKB_DROP_REASON_IP_INHDR; + /* Bad alignment (must be 4n-aligned) */ if (optoff & 3) goto drop; + /* Does the device still have IPv6 configuration? */ + idev = __in6_dev_get(skb->dev); + if (!idev) { + drop_reason = SKB_DROP_REASON_IPV6DISABLED; + goto drop; + } + /* Ignore if IOAM is not enabled on ingress */ - if (!READ_ONCE(__in6_dev_get(skb->dev)->cnf.ioam6_enabled)) + if (!READ_ONCE(idev->cnf.ioam6_enabled)) goto ignore; /* Truncated Option header */ @@ -955,9 +966,9 @@ static bool ipv6_hop_ioam(struct sk_buff *skb, int optoff) if (skb_ensure_writable(skb, optoff + 2 + hdr->opt_len)) goto drop; - /* Trace pointer may have changed */ - trace = (struct ioam6_trace_hdr *)(skb_network_header(skb) - + optoff + sizeof(*hdr)); + /* Trace and hdr pointers may have changed */ + hdr = (struct ioam6_hdr *)(skb_network_header(skb) + optoff); + trace = (struct ioam6_trace_hdr *)((u8 *)hdr + sizeof(*hdr)); ioam6_fill_trace_data(skb, ns, trace, true); @@ -972,7 +983,7 @@ ignore: return true; drop: - kfree_skb_reason(skb, SKB_DROP_REASON_IP_INHDR); + kfree_skb_reason(skb, drop_reason); return false; } diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index d585ac3c1113..9d9c3763f2f5 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -1713,12 +1713,10 @@ do_ip6t_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) static void __ip6t_unregister_table(struct net *net, struct xt_table *table) { - struct xt_table_info *private; - void *loc_cpu_entry; + struct xt_table_info *private = table->private; struct module *table_owner = table->me; struct ip6t_entry *iter; - - private = xt_unregister_table(table); + void *loc_cpu_entry; /* Decrease module usage counts and free resources */ loc_cpu_entry = private->entries; @@ -1727,19 +1725,18 @@ static void __ip6t_unregister_table(struct net *net, struct xt_table *table) if (private->number > private->initial_entries) module_put(table_owner); xt_free_table_info(private); + kfree(table); } int ip6t_register_table(struct net *net, const struct xt_table *table, const struct ip6t_replace *repl, const struct nf_hook_ops *template_ops) { - struct nf_hook_ops *ops; - unsigned int num_ops; - int ret, i; - struct xt_table_info *newinfo; struct xt_table_info bootstrap = {0}; - void *loc_cpu_entry; + struct xt_table_info *newinfo; struct xt_table *new_table; + void *loc_cpu_entry; + int ret; newinfo = xt_alloc_table_info(repl->size); if (!newinfo) @@ -1754,7 +1751,7 @@ int ip6t_register_table(struct net *net, const struct xt_table *table, return ret; } - new_table = xt_register_table(net, table, &bootstrap, newinfo); + new_table = xt_register_table(net, table, template_ops, &bootstrap, newinfo); if (IS_ERR(new_table)) { struct ip6t_entry *iter; @@ -1764,48 +1761,12 @@ int ip6t_register_table(struct net *net, const struct xt_table *table, return PTR_ERR(new_table); } - if (!template_ops) - return 0; - - num_ops = hweight32(table->valid_hooks); - if (num_ops == 0) { - ret = -EINVAL; - goto out_free; - } - - ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); - if (!ops) { - ret = -ENOMEM; - goto out_free; - } - - for (i = 0; i < num_ops; i++) - ops[i].priv = new_table; - - new_table->ops = ops; - - ret = nf_register_net_hooks(net, ops, num_ops); - if (ret != 0) - goto out_free; - return ret; - -out_free: - __ip6t_unregister_table(net, new_table); - return ret; -} - -void ip6t_unregister_table_pre_exit(struct net *net, const char *name) -{ - struct xt_table *table = xt_find_table(net, NFPROTO_IPV6, name); - - if (table) - nf_unregister_net_hooks(net, table->ops, hweight32(table->valid_hooks)); } void ip6t_unregister_table_exit(struct net *net, const char *name) { - struct xt_table *table = xt_find_table(net, NFPROTO_IPV6, name); + struct xt_table *table = xt_unregister_table_exit(net, NFPROTO_IPV6, name); if (table) __ip6t_unregister_table(net, table); @@ -1894,7 +1855,6 @@ static void __exit ip6_tables_fini(void) } EXPORT_SYMBOL(ip6t_register_table); -EXPORT_SYMBOL(ip6t_unregister_table_pre_exit); EXPORT_SYMBOL(ip6t_unregister_table_exit); EXPORT_SYMBOL(ip6t_do_table); diff --git a/net/ipv6/netfilter/ip6t_hbh.c b/net/ipv6/netfilter/ip6t_hbh.c index e7a3fb9355ee..450dd53846a2 100644 --- a/net/ipv6/netfilter/ip6t_hbh.c +++ b/net/ipv6/netfilter/ip6t_hbh.c @@ -168,6 +168,10 @@ static int hbh_mt6_check(const struct xt_mtchk_param *par) pr_debug("unknown flags %X\n", optsinfo->invflags); return -EINVAL; } + if (optsinfo->optsnr > IP6T_OPTS_OPTSNR) { + pr_debug("too many supported opts specified\n"); + return -EINVAL; + } if (optsinfo->flags & IP6T_OPTS_NSTRICT) { pr_debug("Not strict - not implemented"); diff --git a/net/ipv6/netfilter/ip6table_filter.c b/net/ipv6/netfilter/ip6table_filter.c index e8992693e14a..b074fc477676 100644 --- a/net/ipv6/netfilter/ip6table_filter.c +++ b/net/ipv6/netfilter/ip6table_filter.c @@ -60,7 +60,7 @@ static int __net_init ip6table_filter_net_init(struct net *net) static void __net_exit ip6table_filter_net_pre_exit(struct net *net) { - ip6t_unregister_table_pre_exit(net, "filter"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "filter"); } static void __net_exit ip6table_filter_net_exit(struct net *net) @@ -76,32 +76,32 @@ static struct pernet_operations ip6table_filter_net_ops = { static int __init ip6table_filter_init(void) { - int ret = xt_register_template(&packet_filter, - ip6table_filter_table_init); - - if (ret < 0) - return ret; + int ret; filter_ops = xt_hook_ops_alloc(&packet_filter, ip6t_do_table); - if (IS_ERR(filter_ops)) { - xt_unregister_template(&packet_filter); + if (IS_ERR(filter_ops)) return PTR_ERR(filter_ops); - } ret = register_pernet_subsys(&ip6table_filter_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_filter, ip6table_filter_table_init); if (ret < 0) { - xt_unregister_template(&packet_filter); - kfree(filter_ops); - return ret; + unregister_pernet_subsys(&ip6table_filter_net_ops); + goto err_free; } + return 0; +err_free: + kfree(filter_ops); return ret; } static void __exit ip6table_filter_fini(void) { - unregister_pernet_subsys(&ip6table_filter_net_ops); xt_unregister_template(&packet_filter); + unregister_pernet_subsys(&ip6table_filter_net_ops); kfree(filter_ops); } diff --git a/net/ipv6/netfilter/ip6table_mangle.c b/net/ipv6/netfilter/ip6table_mangle.c index 8dd4cd0c47bd..e6ee036a9b2c 100644 --- a/net/ipv6/netfilter/ip6table_mangle.c +++ b/net/ipv6/netfilter/ip6table_mangle.c @@ -89,7 +89,7 @@ static int ip6table_mangle_table_init(struct net *net) static void __net_exit ip6table_mangle_net_pre_exit(struct net *net) { - ip6t_unregister_table_pre_exit(net, "mangle"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "mangle"); } static void __net_exit ip6table_mangle_net_exit(struct net *net) @@ -104,32 +104,33 @@ static struct pernet_operations ip6table_mangle_net_ops = { static int __init ip6table_mangle_init(void) { - int ret = xt_register_template(&packet_mangler, - ip6table_mangle_table_init); - - if (ret < 0) - return ret; + int ret; mangle_ops = xt_hook_ops_alloc(&packet_mangler, ip6table_mangle_hook); - if (IS_ERR(mangle_ops)) { - xt_unregister_template(&packet_mangler); + if (IS_ERR(mangle_ops)) return PTR_ERR(mangle_ops); - } ret = register_pernet_subsys(&ip6table_mangle_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&packet_mangler, + ip6table_mangle_table_init); if (ret < 0) { - xt_unregister_template(&packet_mangler); - kfree(mangle_ops); - return ret; + unregister_pernet_subsys(&ip6table_mangle_net_ops); + goto err_free; } + return 0; +err_free: + kfree(mangle_ops); return ret; } static void __exit ip6table_mangle_fini(void) { - unregister_pernet_subsys(&ip6table_mangle_net_ops); xt_unregister_template(&packet_mangler); + unregister_pernet_subsys(&ip6table_mangle_net_ops); kfree(mangle_ops); } diff --git a/net/ipv6/netfilter/ip6table_nat.c b/net/ipv6/netfilter/ip6table_nat.c index 5be723232df8..c2394e2c94b5 100644 --- a/net/ipv6/netfilter/ip6table_nat.c +++ b/net/ipv6/netfilter/ip6table_nat.c @@ -121,8 +121,11 @@ static int ip6table_nat_table_init(struct net *net) } ret = ip6t_nat_register_lookups(net); - if (ret < 0) + if (ret < 0) { + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "nat"); + synchronize_rcu(); ip6t_unregister_table_exit(net, "nat"); + } kfree(repl); return ret; @@ -131,6 +134,7 @@ static int ip6table_nat_table_init(struct net *net) static void __net_exit ip6table_nat_net_pre_exit(struct net *net) { ip6t_nat_unregister_lookups(net); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "nat"); } static void __net_exit ip6table_nat_net_exit(struct net *net) diff --git a/net/ipv6/netfilter/ip6table_raw.c b/net/ipv6/netfilter/ip6table_raw.c index fc9f6754028f..3b161ee875bc 100644 --- a/net/ipv6/netfilter/ip6table_raw.c +++ b/net/ipv6/netfilter/ip6table_raw.c @@ -52,7 +52,7 @@ static int ip6table_raw_table_init(struct net *net) static void __net_exit ip6table_raw_net_pre_exit(struct net *net) { - ip6t_unregister_table_pre_exit(net, "raw"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "raw"); } static void __net_exit ip6table_raw_net_exit(struct net *net) @@ -75,31 +75,31 @@ static int __init ip6table_raw_init(void) pr_info("Enabling raw table before defrag\n"); } - ret = xt_register_template(table, ip6table_raw_table_init); - if (ret < 0) - return ret; - /* Register hooks */ rawtable_ops = xt_hook_ops_alloc(table, ip6t_do_table); - if (IS_ERR(rawtable_ops)) { - xt_unregister_template(table); + if (IS_ERR(rawtable_ops)) return PTR_ERR(rawtable_ops); - } ret = register_pernet_subsys(&ip6table_raw_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(table, ip6table_raw_table_init); if (ret < 0) { - kfree(rawtable_ops); - xt_unregister_template(table); - return ret; + unregister_pernet_subsys(&ip6table_raw_net_ops); + goto err_free; } + return 0; +err_free: + kfree(rawtable_ops); return ret; } static void __exit ip6table_raw_fini(void) { - unregister_pernet_subsys(&ip6table_raw_net_ops); xt_unregister_template(&packet_raw); + unregister_pernet_subsys(&ip6table_raw_net_ops); kfree(rawtable_ops); } diff --git a/net/ipv6/netfilter/ip6table_security.c b/net/ipv6/netfilter/ip6table_security.c index 4df14a9bae78..4bd5d97b8ab6 100644 --- a/net/ipv6/netfilter/ip6table_security.c +++ b/net/ipv6/netfilter/ip6table_security.c @@ -49,7 +49,7 @@ static int ip6table_security_table_init(struct net *net) static void __net_exit ip6table_security_net_pre_exit(struct net *net) { - ip6t_unregister_table_pre_exit(net, "security"); + xt_unregister_table_pre_exit(net, NFPROTO_IPV6, "security"); } static void __net_exit ip6table_security_net_exit(struct net *net) @@ -64,32 +64,33 @@ static struct pernet_operations ip6table_security_net_ops = { static int __init ip6table_security_init(void) { - int ret = xt_register_template(&security_table, - ip6table_security_table_init); - - if (ret < 0) - return ret; + int ret; sectbl_ops = xt_hook_ops_alloc(&security_table, ip6t_do_table); - if (IS_ERR(sectbl_ops)) { - xt_unregister_template(&security_table); + if (IS_ERR(sectbl_ops)) return PTR_ERR(sectbl_ops); - } ret = register_pernet_subsys(&ip6table_security_net_ops); + if (ret < 0) + goto err_free; + + ret = xt_register_template(&security_table, + ip6table_security_table_init); if (ret < 0) { - kfree(sectbl_ops); - xt_unregister_template(&security_table); - return ret; + unregister_pernet_subsys(&ip6table_security_net_ops); + goto err_free; } + return 0; +err_free: + kfree(sectbl_ops); return ret; } static void __exit ip6table_security_fini(void) { - unregister_pernet_subsys(&ip6table_security_net_ops); xt_unregister_template(&security_table); + unregister_pernet_subsys(&ip6table_security_net_ops); kfree(sectbl_ops); } diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index bb09d5ccf599..a41ea2a866ee 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1818,6 +1818,7 @@ lookup: } } + isn = 0; process: if (static_branch_unlikely(&ip6_min_hopcount)) { /* min_hopcount can be changed concurrently from do_ipv6_setsockopt() */ @@ -1846,6 +1847,7 @@ process: th = (const struct tcphdr *)skb->data; hdr = ipv6_hdr(skb); tcp_v6_fill_cb(skb, hdr, th); + TCP_SKB_CB(skb)->tcp_tw_isn = isn; skb->dev = NULL; @@ -1933,7 +1935,6 @@ do_time_wait: sk = sk2; tcp_v6_restore_cb(skb); refcounted = false; - __this_cpu_write(tcp_tw_isn, isn); goto process; } diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c index 157fc23ce4e1..1455f67e01dd 100644 --- a/net/l2tp/l2tp_core.c +++ b/net/l2tp/l2tp_core.c @@ -1360,7 +1360,7 @@ static void l2tp_session_unhash(struct l2tp_session *session) spin_lock_bh(&pn->l2tp_session_idr_lock); /* Remove from the per-tunnel list */ - list_del_init(&session->list); + list_del_rcu(&session->list); /* Remove from per-net IDR */ if (tunnel->version == L2TP_HDR_VER_3) { diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index 53bd98646e33..eba890366e9f 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -8070,6 +8070,7 @@ ieee80211_parse_neg_ttlm(struct ieee80211_sub_if_data *sdata, "No active links for TID %d", tid); return -EINVAL; } + pos += map_size; } else { map = 0; } @@ -8088,7 +8089,6 @@ ieee80211_parse_neg_ttlm(struct ieee80211_sub_if_data *sdata, default: return -EINVAL; } - pos += map_size; } return 0; } @@ -11137,6 +11137,9 @@ static void ieee80211_ml_epcs(struct ieee80211_sub_if_data *sdata, control = get_unaligned_le16(pos); link_id = control & IEEE80211_MLE_STA_EPCS_CONTROL_LINK_ID; + if (link_id >= IEEE80211_MLD_MAX_NUM_LINKS) + continue; + link = sdata_dereference(sdata->link[link_id], sdata); if (!link) continue; diff --git a/net/mac80211/parse.c b/net/mac80211/parse.c index 2b3632c6008a..77894d997113 100644 --- a/net/mac80211/parse.c +++ b/net/mac80211/parse.c @@ -34,6 +34,22 @@ #include "led.h" #include "wep.h" +static const u8 empty_non_inheritance[] = { + WLAN_EID_EXTENSION, 1, WLAN_EID_EXT_NON_INHERITANCE, + /* + * cfg80211_is_element_inherited() hardcodes elements that + * cannot be inherited, so we just need an empty one to be + * calling it at all. + */ +}; + +struct ieee80211_elem_defrag { + const struct element *elem; + /* container start/len */ + const u8 *start; + size_t len; +}; + struct ieee80211_elems_parse { /* must be first for kfree to work */ struct ieee802_11_elems elems; @@ -41,11 +57,7 @@ struct ieee80211_elems_parse { /* The basic Multi-Link element in the original elements */ const struct element *ml_basic_elem; - /* The reconfiguration Multi-Link element in the original elements */ - const struct element *ml_reconf_elem; - - /* The EPCS Multi-Link element in the original elements */ - const struct element *ml_epcs_elem; + struct ieee80211_elem_defrag ml_reconf, ml_epcs; bool multi_link_inner; bool skip_vendor; @@ -162,10 +174,14 @@ ieee80211_parse_extension_element(u32 *crc, } break; case IEEE80211_ML_CONTROL_TYPE_RECONF: - elems_parse->ml_reconf_elem = elem; + elems_parse->ml_reconf.elem = elem; + elems_parse->ml_reconf.start = params->start; + elems_parse->ml_reconf.len = params->len; break; case IEEE80211_ML_CONTROL_TYPE_PRIO_ACCESS: - elems_parse->ml_epcs_elem = elem; + elems_parse->ml_epcs.elem = elem; + elems_parse->ml_epcs.start = params->start; + elems_parse->ml_epcs.len = params->len; break; default: break; @@ -916,7 +932,7 @@ ieee80211_prep_mle_link_parse(struct ieee80211_elems_parse *elems_parse, { struct ieee802_11_elems *elems = &elems_parse->elems; struct ieee80211_mle_per_sta_profile *prof; - const struct element *tmp; + const struct element *tmp, *ret; ssize_t ml_len; const u8 *end; @@ -986,50 +1002,40 @@ ieee80211_prep_mle_link_parse(struct ieee80211_elems_parse *elems_parse, sub->from_ap = params->from_ap; sub->link_id = -1; - return cfg80211_find_ext_elem(WLAN_EID_EXT_NON_INHERITANCE, - sub->start, sub->len); -} - -static void -ieee80211_mle_defrag_reconf(struct ieee80211_elems_parse *elems_parse) -{ - struct ieee802_11_elems *elems = &elems_parse->elems; - ssize_t ml_len; + ret = cfg80211_find_ext_elem(WLAN_EID_EXT_NON_INHERITANCE, + sub->start, sub->len); + if (ret) + return ret; - ml_len = cfg80211_defragment_element(elems_parse->ml_reconf_elem, - elems->ie_start, - elems->total_len, - elems_parse->scratch_pos, - elems_parse->scratch + - elems_parse->scratch_len - - elems_parse->scratch_pos, - WLAN_EID_FRAGMENT); - if (ml_len < 0) - return; - elems->ml_reconf = (void *)elems_parse->scratch_pos; - elems->ml_reconf_len = ml_len; - elems_parse->scratch_pos += ml_len; + /* + * Since we know we want and found a profile, apply an empty + * non-inheritance if the profile didn't have one, so that any + * element that shouldn't be inherited by spec isn't. + */ + return (const void *)empty_non_inheritance; } -static void -ieee80211_mle_defrag_epcs(struct ieee80211_elems_parse *elems_parse) +static const void * +ieee80211_mle_defrag(struct ieee80211_elems_parse *elems_parse, + struct ieee80211_elem_defrag *defrag, + size_t *out_len) { - struct ieee802_11_elems *elems = &elems_parse->elems; + const void *ret; ssize_t ml_len; - ml_len = cfg80211_defragment_element(elems_parse->ml_epcs_elem, - elems->ie_start, - elems->total_len, + ml_len = cfg80211_defragment_element(defrag->elem, + defrag->start, defrag->len, elems_parse->scratch_pos, elems_parse->scratch + elems_parse->scratch_len - elems_parse->scratch_pos, WLAN_EID_FRAGMENT); if (ml_len < 0) - return; - elems->ml_epcs = (void *)elems_parse->scratch_pos; - elems->ml_epcs_len = ml_len; + return NULL; + ret = elems_parse->scratch_pos; + *out_len = ml_len; elems_parse->scratch_pos += ml_len; + return ret; } struct ieee802_11_elems * @@ -1042,6 +1048,7 @@ ieee802_11_parse_elems_full(struct ieee80211_elems_parse_params *params) size_t scratch_len = 3 * params->len; bool multi_link_inner = false; + BUILD_BUG_ON(sizeof(empty_non_inheritance) != empty_non_inheritance[1] + 2); BUILD_BUG_ON(offsetof(typeof(*elems_parse), elems) != 0); /* cannot parse for both a specific link and non-transmitted BSS */ @@ -1089,6 +1096,17 @@ ieee802_11_parse_elems_full(struct ieee80211_elems_parse_params *params) non_inherit = cfg80211_find_ext_elem(WLAN_EID_EXT_NON_INHERITANCE, sub.start, nontx_len); + /* + * If it's a non-transmitted BSS, we shouldn't pick + * any elements in the outer parsing that shouldn't + * be inherited. If the profile has a non-inheritance + * element this automatically happens, but if not then + * provide an empty one so that the hard-coded elements + * in cfg80211_is_element_inherited() are ignored, but + * it must be called. + */ + if (params->bss->transmitted_bss && !non_inherit) + non_inherit = (const void *)empty_non_inheritance; } else { /* must always parse to get elems_parse->ml_basic_elem */ non_inherit = ieee80211_prep_mle_link_parse(elems_parse, params, @@ -1109,9 +1127,12 @@ ieee802_11_parse_elems_full(struct ieee80211_elems_parse_params *params) _ieee802_11_parse_elems_full(&sub, elems_parse, NULL); } - ieee80211_mle_defrag_reconf(elems_parse); - - ieee80211_mle_defrag_epcs(elems_parse); + elems->ml_reconf = ieee80211_mle_defrag(elems_parse, + &elems_parse->ml_reconf, + &elems->ml_reconf_len); + elems->ml_epcs = ieee80211_mle_defrag(elems_parse, + &elems_parse->ml_epcs, + &elems->ml_epcs_len); if (elems->tim && !elems->parse_error) { const struct ieee80211_tim_ie *tim_ie = elems->tim; diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index 7a8c964b0ae6..e9570257151b 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -4941,6 +4941,7 @@ static bool ieee80211_invoke_fast_rx(struct ieee80211_rx_data *rx, u8 sa[ETH_ALEN]; } addrs __aligned(2); struct ieee80211_sta_rx_stats *stats; + u32 encoded_rate; /* for parallel-rx, we need to have DUP_VALIDATED, otherwise we write * to a common data structure; drivers can implement that per queue @@ -5048,11 +5049,14 @@ static bool ieee80211_invoke_fast_rx(struct ieee80211_rx_data *rx, /* push the addresses in front */ memcpy(skb_push(skb, sizeof(addrs)), &addrs, sizeof(addrs)); + /* capture before mesh forward may memset or free skb->cb */ + encoded_rate = sta_stats_encode_rate(status); + res = ieee80211_rx_mesh_data(rx->sdata, rx->sta, rx->skb); switch (res) { case RX_QUEUED: stats->last_rx = jiffies; - stats->last_rate = sta_stats_encode_rate(status); + stats->last_rate = encoded_rate; return true; case RX_CONTINUE: break; diff --git a/net/mptcp/pm.c b/net/mptcp/pm.c index 3c152bf66cd5..3e770c7407e1 100644 --- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -364,7 +364,13 @@ static void mptcp_pm_add_timer(struct timer_list *timer) spin_lock_bh(&msk->pm.lock); - if (!mptcp_pm_should_add_signal_addr(msk)) { + /* The cancel path (mptcp_pm_del_add_timer()) can race with this + * callback. Once cancel updates retrans_times to MAX, suppress further + * retransmissions here. If this callback acquires pm.lock first, one + * final transmit attempt is still possible. + */ + if (entry->retrans_times < ADD_ADDR_RETRANS_MAX && + !mptcp_pm_should_add_signal_addr(msk)) { pr_debug("retransmit ADD_ADDR id=%d\n", entry->addr.id); mptcp_pm_announce_addr(msk, &entry->addr, false); mptcp_pm_add_addr_send_ack(msk); @@ -414,8 +420,12 @@ mptcp_pm_del_add_timer(struct mptcp_sock *msk, /* Note: entry might have been removed by another thread. * We hold rcu_read_lock() to ensure it is not freed under us. */ - if (stop_timer) - sk_stop_timer_sync(sk, &entry->add_timer); + if (stop_timer) { + if (check_id) + sk_stop_timer(sk, &entry->add_timer); + else + sk_stop_timer_sync(sk, &entry->add_timer); + } rcu_read_unlock(); return entry; @@ -882,6 +892,7 @@ bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb, struct mptcp_addr_info *addr, bool *echo, bool *drop_other_suboptions) { + bool skip_add_addr = false; int ret = false; u8 add_addr; u8 family; @@ -903,24 +914,49 @@ bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb, } *echo = mptcp_pm_should_add_signal_echo(msk); - port = !!(*echo ? msk->pm.remote.port : msk->pm.local.port); - - family = *echo ? msk->pm.remote.family : msk->pm.local.family; - if (remaining < mptcp_add_addr_len(family, *echo, port)) - goto out_unlock; - if (*echo) { *addr = msk->pm.remote; add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_ECHO); + port = !!msk->pm.remote.port; + family = msk->pm.remote.family; } else { *addr = msk->pm.local; add_addr = msk->pm.addr_signal & ~BIT(MPTCP_ADD_ADDR_SIGNAL); + port = !!msk->pm.local.port; + family = msk->pm.local.family; } - WRITE_ONCE(msk->pm.addr_signal, add_addr); + + if (remaining < mptcp_add_addr_len(family, *echo, port)) { + struct net *net = sock_net((struct sock *)msk); + + if (!*drop_other_suboptions) + goto out_unlock; + + if (*echo) { + MPTCP_INC_STATS(net, MPTCP_MIB_ECHOADDTXDROP); + } else { + skip_add_addr = true; + MPTCP_INC_STATS(net, MPTCP_MIB_ADDADDRTXDROP); + } + goto drop_signal_mark; + } + ret = true; +drop_signal_mark: + WRITE_ONCE(msk->pm.addr_signal, add_addr); + out_unlock: spin_unlock_bh(&msk->pm.lock); + + /* On pure-ACK option-space exhaustion, stop retrying this ADD_ADDR: + * clear the signal bit, cancel the matching retransmission timer, and + * let the PM state machine progress. + */ + if (skip_add_addr) { + mptcp_pm_del_add_timer(msk, addr, true); + mptcp_pm_subflow_established(msk); + } return ret; } diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 8ef967aa80a0..479fd51609f2 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -397,12 +397,26 @@ static bool __mptcp_move_skb(struct sock *sk, struct sk_buff *skb) return false; } - /* old data, keep it simple and drop the whole pkt, sender - * will retransmit as needed, if needed. + /* Completely old data? */ + if (!after64(MPTCP_SKB_CB(skb)->end_seq, msk->ack_seq)) { + MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_DUPDATA); + mptcp_drop(sk, skb); + return false; + } + + /* Partial packet: map_seq < ack_seq < end_seq. + * Skip the already-acked bytes and enqueue the new data. */ - MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_DUPDATA); - mptcp_drop(sk, skb); - return false; + copy_len = MPTCP_SKB_CB(skb)->end_seq - msk->ack_seq; + MPTCP_SKB_CB(skb)->offset += msk->ack_seq - MPTCP_SKB_CB(skb)->map_seq; + MPTCP_SKB_CB(skb)->map_seq += msk->ack_seq - + MPTCP_SKB_CB(skb)->map_seq; + msk->bytes_received += copy_len; + WRITE_ONCE(msk->ack_seq, msk->ack_seq + copy_len); + + skb_set_owner_r(skb, sk); + __skb_queue_tail(&sk->sk_receive_queue, skb); + return true; } static void mptcp_stop_rtx_timer(struct sock *sk) @@ -3456,6 +3470,7 @@ static int mptcp_disconnect(struct sock *sk, int flags) /* for fallback's sake */ WRITE_ONCE(msk->ack_seq, 0); + atomic64_set(&msk->rcv_wnd_sent, 0); WRITE_ONCE(sk->sk_shutdown, 0); sk_error_report(sk); diff --git a/net/netfilter/ipset/ip_set_hash_ipmark.c b/net/netfilter/ipset/ip_set_hash_ipmark.c index a22ec1a6f6ec..e26ca2a370e3 100644 --- a/net/netfilter/ipset/ip_set_hash_ipmark.c +++ b/net/netfilter/ipset/ip_set_hash_ipmark.c @@ -150,7 +150,7 @@ hash_ipmark4_uadt(struct ip_set *set, struct nlattr *tb[], if (retried) ip = ntohl(h->next.ip); - for (; ip <= ip_to; ip++, i++) { + for (; ip <= ip_to; i++) { e.ip = htonl(ip); if (i > IPSET_MAX_RANGE) { hash_ipmark4_data_next(&h->next, &e); @@ -162,6 +162,10 @@ hash_ipmark4_uadt(struct ip_set *set, struct nlattr *tb[], return ret; ret = 0; + + if (ip == ip_to) + break; + ip++; } return ret; } diff --git a/net/netfilter/ipset/ip_set_hash_ipport.c b/net/netfilter/ipset/ip_set_hash_ipport.c index e977b5a9c48d..41ca24a22a02 100644 --- a/net/netfilter/ipset/ip_set_hash_ipport.c +++ b/net/netfilter/ipset/ip_set_hash_ipport.c @@ -186,7 +186,7 @@ hash_ipport4_uadt(struct ip_set *set, struct nlattr *tb[], if (retried) ip = ntohl(h->next.ip); - for (; ip <= ip_to; ip++) { + for (; ip <= ip_to;) { p = retried && ip == ntohl(h->next.ip) ? ntohs(h->next.port) : port; for (; p <= port_to; p++, i++) { @@ -203,6 +203,9 @@ hash_ipport4_uadt(struct ip_set *set, struct nlattr *tb[], ret = 0; } + if (ip == ip_to) + break; + ip++; } return ret; } diff --git a/net/netfilter/ipset/ip_set_hash_ipportip.c b/net/netfilter/ipset/ip_set_hash_ipportip.c index 39a01934b153..b9ac2efaa15c 100644 --- a/net/netfilter/ipset/ip_set_hash_ipportip.c +++ b/net/netfilter/ipset/ip_set_hash_ipportip.c @@ -182,7 +182,7 @@ hash_ipportip4_uadt(struct ip_set *set, struct nlattr *tb[], if (retried) ip = ntohl(h->next.ip); - for (; ip <= ip_to; ip++) { + for (; ip <= ip_to;) { p = retried && ip == ntohl(h->next.ip) ? ntohs(h->next.port) : port; for (; p <= port_to; p++, i++) { @@ -199,6 +199,9 @@ hash_ipportip4_uadt(struct ip_set *set, struct nlattr *tb[], ret = 0; } + if (ip == ip_to) + break; + ip++; } return ret; } diff --git a/net/netfilter/ipset/ip_set_hash_ipportnet.c b/net/netfilter/ipset/ip_set_hash_ipportnet.c index 5c6de605a9fb..2d6652d43199 100644 --- a/net/netfilter/ipset/ip_set_hash_ipportnet.c +++ b/net/netfilter/ipset/ip_set_hash_ipportnet.c @@ -274,7 +274,7 @@ hash_ipportnet4_uadt(struct ip_set *set, struct nlattr *tb[], p = port; ip2 = ip2_from; } - for (; ip <= ip_to; ip++) { + for (; ip <= ip_to;) { e.ip = htonl(ip); for (; p <= port_to; p++) { e.port = htons(p); @@ -298,6 +298,9 @@ hash_ipportnet4_uadt(struct ip_set *set, struct nlattr *tb[], ip2 = ip2_from; } p = port; + if (ip == ip_to) + break; + ip++; } return ret; } diff --git a/net/netfilter/nf_conntrack_broadcast.c b/net/netfilter/nf_conntrack_broadcast.c index 4f39bf7c843f..75e53fde6b29 100644 --- a/net/netfilter/nf_conntrack_broadcast.c +++ b/net/netfilter/nf_conntrack_broadcast.c @@ -72,6 +72,7 @@ int nf_conntrack_broadcast_help(struct sk_buff *skb, exp->flags = NF_CT_EXPECT_PERMANENT; exp->class = NF_CT_EXPECT_CLASS_DEFAULT; rcu_assign_pointer(exp->helper, helper); + rcu_assign_pointer(exp->assign_helper, NULL); write_pnet(&exp->net, net); #ifdef CONFIG_NF_CONNTRACK_ZONES exp->zone = ct->zone; diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 27ce5fda8993..b5ee274be773 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -1814,14 +1814,17 @@ init_conntrack(struct net *net, struct nf_conn *tmpl, spin_lock_bh(&nf_conntrack_expect_lock); exp = nf_ct_find_expectation(net, zone, tuple, !tmpl || nf_ct_is_confirmed(tmpl)); if (exp) { + struct nf_conntrack_helper *assign_helper; + /* Welcome, Mr. Bond. We've been expecting you... */ __set_bit(IPS_EXPECTED_BIT, &ct->status); /* exp->master safe, refcnt bumped in nf_ct_find_expectation */ ct->master = exp->master; - if (exp->helper) { + assign_helper = rcu_dereference(exp->assign_helper); + if (assign_helper) { help = nf_ct_helper_ext_add(ct, GFP_ATOMIC); if (help) - rcu_assign_pointer(help->helper, exp->helper); + rcu_assign_pointer(help->helper, assign_helper); } #ifdef CONFIG_NF_CONNTRACK_MARK diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c index 24d0576d84b7..8e943efbdf0a 100644 --- a/net/netfilter/nf_conntrack_expect.c +++ b/net/netfilter/nf_conntrack_expect.c @@ -344,6 +344,7 @@ void nf_ct_expect_init(struct nf_conntrack_expect *exp, unsigned int class, helper = rcu_dereference(help->helper); rcu_assign_pointer(exp->helper, helper); + rcu_assign_pointer(exp->assign_helper, NULL); write_pnet(&exp->net, net); #ifdef CONFIG_NF_CONNTRACK_ZONES exp->zone = ct->zone; diff --git a/net/netfilter/nf_conntrack_h323_main.c b/net/netfilter/nf_conntrack_h323_main.c index 3f5c50455b71..b2fe6554b9cf 100644 --- a/net/netfilter/nf_conntrack_h323_main.c +++ b/net/netfilter/nf_conntrack_h323_main.c @@ -643,7 +643,7 @@ static int expect_h245(struct sk_buff *skb, struct nf_conn *ct, &ct->tuplehash[!dir].tuple.src.u3, &ct->tuplehash[!dir].tuple.dst.u3, IPPROTO_TCP, NULL, &port); - rcu_assign_pointer(exp->helper, &nf_conntrack_helper_h245); + rcu_assign_pointer(exp->assign_helper, &nf_conntrack_helper_h245); nathook = rcu_dereference(nfct_h323_nat_hook); if (memcmp(&ct->tuplehash[dir].tuple.src.u3, @@ -767,7 +767,7 @@ static int expect_callforwarding(struct sk_buff *skb, nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct), &ct->tuplehash[!dir].tuple.src.u3, &addr, IPPROTO_TCP, NULL, &port); - rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_q931); nathook = rcu_dereference(nfct_h323_nat_hook); if (memcmp(&ct->tuplehash[dir].tuple.src.u3, @@ -1234,7 +1234,7 @@ static int expect_q931(struct sk_buff *skb, struct nf_conn *ct, &ct->tuplehash[!dir].tuple.src.u3 : NULL, &ct->tuplehash[!dir].tuple.dst.u3, IPPROTO_TCP, NULL, &port); - rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_q931); exp->flags = NF_CT_EXPECT_PERMANENT; /* Accept multiple calls */ nathook = rcu_dereference(nfct_h323_nat_hook); @@ -1306,7 +1306,7 @@ static int process_gcf(struct sk_buff *skb, struct nf_conn *ct, nf_ct_expect_init(exp, NF_CT_EXPECT_CLASS_DEFAULT, nf_ct_l3num(ct), &ct->tuplehash[!dir].tuple.src.u3, &addr, IPPROTO_UDP, NULL, &port); - rcu_assign_pointer(exp->helper, nf_conntrack_helper_ras); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_ras); if (nf_ct_expect_related(exp, 0) == 0) { pr_debug("nf_ct_ras: expect RAS "); @@ -1523,7 +1523,7 @@ static int process_acf(struct sk_buff *skb, struct nf_conn *ct, &ct->tuplehash[!dir].tuple.src.u3, &addr, IPPROTO_TCP, NULL, &port); exp->flags = NF_CT_EXPECT_PERMANENT; - rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_q931); if (nf_ct_expect_related(exp, 0) == 0) { pr_debug("nf_ct_ras: expect Q.931 "); @@ -1577,7 +1577,7 @@ static int process_lcf(struct sk_buff *skb, struct nf_conn *ct, &ct->tuplehash[!dir].tuple.src.u3, &addr, IPPROTO_TCP, NULL, &port); exp->flags = NF_CT_EXPECT_PERMANENT; - rcu_assign_pointer(exp->helper, nf_conntrack_helper_q931); + rcu_assign_pointer(exp->assign_helper, nf_conntrack_helper_q931); if (nf_ct_expect_related(exp, 0) == 0) { pr_debug("nf_ct_ras: expect Q.931 "); diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c index a715304a53d8..b594cd244fe1 100644 --- a/net/netfilter/nf_conntrack_helper.c +++ b/net/netfilter/nf_conntrack_helper.c @@ -400,6 +400,11 @@ static bool expect_iter_me(struct nf_conntrack_expect *exp, void *data) this = rcu_dereference_protected(exp->helper, lockdep_is_held(&nf_conntrack_expect_lock)); + if (this == me) + return true; + + this = rcu_dereference_protected(exp->assign_helper, + lockdep_is_held(&nf_conntrack_expect_lock)); return this == me; } diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index a20cd82446c5..e2f149aabbe8 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -2636,6 +2636,7 @@ static const struct nla_policy exp_nla_policy[CTA_EXPECT_MAX+1] = { static struct nf_conntrack_expect * ctnetlink_alloc_expect(const struct nlattr *const cda[], struct nf_conn *ct, + const struct nf_conntrack_helper *assign_helper, struct nf_conntrack_tuple *tuple, struct nf_conntrack_tuple *mask); @@ -2862,6 +2863,7 @@ static int ctnetlink_glue_attach_expect(const struct nlattr *attr, struct nf_conn *ct, u32 portid, u32 report) { + struct nf_conntrack_helper *assign_helper = NULL; struct nlattr *cda[CTA_EXPECT_MAX+1]; struct nf_conntrack_tuple tuple, mask; struct nf_conntrack_expect *exp; @@ -2877,8 +2879,18 @@ ctnetlink_glue_attach_expect(const struct nlattr *attr, struct nf_conn *ct, if (err < 0) return err; + if (cda[CTA_EXPECT_HELP_NAME]) { + const char *helpname = nla_data(cda[CTA_EXPECT_HELP_NAME]); + + assign_helper = __nf_conntrack_helper_find(helpname, + nf_ct_l3num(ct), + tuple.dst.protonum); + if (!assign_helper) + return -EOPNOTSUPP; + } + exp = ctnetlink_alloc_expect((const struct nlattr * const *)cda, ct, - &tuple, &mask); + assign_helper, &tuple, &mask); if (IS_ERR(exp)) return PTR_ERR(exp); @@ -3517,6 +3529,7 @@ ctnetlink_parse_expect_nat(const struct nlattr *attr, static struct nf_conntrack_expect * ctnetlink_alloc_expect(const struct nlattr * const cda[], struct nf_conn *ct, + const struct nf_conntrack_helper *assign_helper, struct nf_conntrack_tuple *tuple, struct nf_conntrack_tuple *mask) { @@ -3570,6 +3583,7 @@ ctnetlink_alloc_expect(const struct nlattr * const cda[], struct nf_conn *ct, exp->zone = ct->zone; #endif rcu_assign_pointer(exp->helper, helper); + rcu_assign_pointer(exp->assign_helper, assign_helper); exp->tuple = *tuple; exp->mask.src.u3 = mask->src.u3; exp->mask.src.u.all = mask->src.u.all; @@ -3625,7 +3639,7 @@ ctnetlink_create_expect(struct net *net, ct = nf_ct_tuplehash_to_ctrack(h); rcu_read_lock(); - exp = ctnetlink_alloc_expect(cda, ct, &tuple, &mask); + exp = ctnetlink_alloc_expect(cda, ct, NULL, &tuple, &mask); if (IS_ERR(exp)) { err = PTR_ERR(exp); goto err_rcu; diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c index 81534213f00f..fd4326ee1aca 100644 --- a/net/netfilter/nf_conntrack_sip.c +++ b/net/netfilter/nf_conntrack_sip.c @@ -1384,7 +1384,7 @@ static int process_register_request(struct sk_buff *skb, unsigned int protoff, nf_ct_expect_init(exp, SIP_EXPECT_SIGNALLING, nf_ct_l3num(ct), saddr, &daddr, proto, NULL, &port); exp->timeout.expires = sip_timeout * HZ; - rcu_assign_pointer(exp->helper, helper); + rcu_assign_pointer(exp->assign_helper, helper); exp->flags = NF_CT_EXPECT_PERMANENT | NF_CT_EXPECT_INACTIVE; hooks = rcu_dereference(nf_nat_sip_hooks); diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c index 7f12e56e6e52..dd416c8532c5 100644 --- a/net/netfilter/nf_queue.c +++ b/net/netfilter/nf_queue.c @@ -60,6 +60,7 @@ static void nf_queue_entry_release_refs(struct nf_queue_entry *entry) struct nf_hook_state *state = &entry->state; /* Release those devices we held, or Alexey will kill me. */ + dev_put(entry->skb_dev); dev_put(state->in); dev_put(state->out); if (state->sk) @@ -101,6 +102,7 @@ bool nf_queue_entry_get_refs(struct nf_queue_entry *entry) if (state->sk && !refcount_inc_not_zero(&state->sk->sk_refcnt)) return false; + dev_hold(entry->skb_dev); dev_hold(state->in); dev_hold(state->out); @@ -201,11 +203,11 @@ static int __nf_queue(struct sk_buff *skb, const struct nf_hook_state *state, *entry = (struct nf_queue_entry) { .skb = skb, + .skb_dev = skb->dev, .state = *state, .hook_index = index, .size = sizeof(*entry) + route_key_size, }; - __nf_queue_entry_init_physdevs(entry); if (!nf_queue_entry_get_refs(entry)) { diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c index 8e02f84784da..0529a19ca9a8 100644 --- a/net/netfilter/nfnetlink_queue.c +++ b/net/netfilter/nfnetlink_queue.c @@ -1198,6 +1198,8 @@ dev_cmp(struct nf_queue_entry *entry, unsigned long ifindex) if (physinif == ifindex || physoutif == ifindex) return 1; #endif + if (entry->skb_dev && entry->skb_dev->ifindex == ifindex) + return 1; if (entry->state.in) if (entry->state.in->ifindex == ifindex) return 1; diff --git a/net/netfilter/nft_inner.c b/net/netfilter/nft_inner.c index c4569d4b9228..ad08a43535b5 100644 --- a/net/netfilter/nft_inner.c +++ b/net/netfilter/nft_inner.c @@ -163,7 +163,6 @@ static int nft_inner_parse_l2l3(const struct nft_inner *priv, return -1; if (fragoff == 0) { - thoff = nhoff + sizeof(_ip6h); ctx->flags |= NFT_PAYLOAD_CTX_INNER_TH; ctx->inner_thoff = thoff; ctx->l4proto = l4proto; @@ -247,8 +246,8 @@ static bool nft_inner_restore_tun_ctx(const struct nft_pktinfo *pkt, local_lock_nested_bh(&nft_pcpu_tun_ctx.bh_lock); this_cpu_tun_ctx = this_cpu_ptr(&nft_pcpu_tun_ctx.ctx); if (this_cpu_tun_ctx->cookie != (unsigned long)pkt->skb) { - local_bh_enable(); local_unlock_nested_bh(&nft_pcpu_tun_ctx.bh_lock); + local_bh_enable(); return false; } *tun_ctx = *this_cpu_tun_ctx; diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index b39017c80548..8050cc06a9a3 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -55,6 +55,9 @@ static struct list_head xt_templates[NFPROTO_NUMPROTO]; struct xt_pernet { struct list_head tables[NFPROTO_NUMPROTO]; + + /* stash area used during netns exit */ + struct list_head dead_tables[NFPROTO_NUMPROTO]; }; struct compat_delta { @@ -1405,11 +1408,9 @@ struct xt_counters *xt_counters_alloc(unsigned int counters) } EXPORT_SYMBOL(xt_counters_alloc); -struct xt_table_info * -xt_replace_table(struct xt_table *table, - unsigned int num_counters, - struct xt_table_info *newinfo, - int *error) +static struct xt_table_info * +do_replace_table(struct xt_table *table, unsigned int num_counters, + struct xt_table_info *newinfo, int *error) { struct xt_table_info *private; unsigned int cpu; @@ -1464,30 +1465,54 @@ xt_replace_table(struct xt_table *table, } } - audit_log_nfcfg(table->name, table->af, private->number, - !private->number ? AUDIT_XT_OP_REGISTER : - AUDIT_XT_OP_REPLACE, - GFP_KERNEL); + return private; +} + +struct xt_table_info * +xt_replace_table(struct xt_table *table, unsigned int num_counters, + struct xt_table_info *newinfo, + int *error) +{ + struct xt_table_info *private; + + private = do_replace_table(table, num_counters, newinfo, error); + if (private) + audit_log_nfcfg(table->name, table->af, private->number, + AUDIT_XT_OP_REPLACE, + GFP_KERNEL); + return private; } EXPORT_SYMBOL_GPL(xt_replace_table); struct xt_table *xt_register_table(struct net *net, const struct xt_table *input_table, + const struct nf_hook_ops *template_ops, struct xt_table_info *bootstrap, struct xt_table_info *newinfo) { struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); + struct xt_table *t, *table = NULL; + struct nf_hook_ops *ops = NULL; struct xt_table_info *private; - struct xt_table *t, *table; - int ret; + unsigned int num_ops; + int ret = -EINVAL; + + num_ops = hweight32(input_table->valid_hooks); + if (num_ops == 0) + goto out; + + ret = -ENOMEM; + if (template_ops) { + ops = kmemdup_array(template_ops, num_ops, sizeof(*ops), GFP_KERNEL); + if (!ops) + goto out; + } /* Don't add one object to multiple lists. */ table = kmemdup(input_table, sizeof(struct xt_table), GFP_KERNEL); - if (!table) { - ret = -ENOMEM; + if (!table) goto out; - } mutex_lock(&xt[table->af].mutex); /* Don't autoload: we'd eat our tail... */ @@ -1501,7 +1526,7 @@ struct xt_table *xt_register_table(struct net *net, /* Simplifies replace_table code. */ table->private = bootstrap; - if (!xt_replace_table(table, 0, newinfo, &ret)) + if (!do_replace_table(table, 0, newinfo, &ret)) goto unlock; private = table->private; @@ -1510,34 +1535,122 @@ struct xt_table *xt_register_table(struct net *net, /* save number of initial entries */ private->initial_entries = private->number; + if (ops) { + int i; + + for (i = 0; i < num_ops; i++) + ops[i].priv = table; + + ret = nf_register_net_hooks(net, ops, num_ops); + if (ret != 0) { + mutex_unlock(&xt[table->af].mutex); + /* nf_register_net_hooks() might have published a + * base chain before internal error unwind. + */ + synchronize_rcu(); + goto out; + } + + table->ops = ops; + } + + audit_log_nfcfg(table->name, table->af, private->number, + AUDIT_XT_OP_REGISTER, GFP_KERNEL); + list_add(&table->list, &xt_net->tables[table->af]); mutex_unlock(&xt[table->af].mutex); return table; unlock: mutex_unlock(&xt[table->af].mutex); - kfree(table); out: + kfree(table); + kfree(ops); return ERR_PTR(ret); } EXPORT_SYMBOL_GPL(xt_register_table); -void *xt_unregister_table(struct xt_table *table) +/** + * xt_unregister_table_pre_exit - pre-shutdown unregister of a table + * @net: network namespace + * @af: address family (e.g., NFPROTO_IPV4, NFPROTO_IPV6) + * @name: name of the table to unregister + * + * Unregisters the specified netfilter table from the given network namespace + * and also unregisters the hooks from netfilter core: no new packets will be + * processed. + * + * This must be called prior to xt_unregister_table_exit() from the pernet + * .pre_exit callback. After this call, the table is no longer visible to + * the get/setsockopt path. In case of rmmod, module exit path must have + * called xt_unregister_template() prior to unregistering pernet ops to + * prevent re-instantiation of the table. + * + * See also: xt_unregister_table_exit() + */ +void xt_unregister_table_pre_exit(struct net *net, u8 af, const char *name) { - struct xt_table_info *private; + struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); + struct xt_table *t; - mutex_lock(&xt[table->af].mutex); - private = table->private; - list_del(&table->list); - mutex_unlock(&xt[table->af].mutex); - audit_log_nfcfg(table->name, table->af, private->number, - AUDIT_XT_OP_UNREGISTER, GFP_KERNEL); - kfree(table->ops); - kfree(table); + mutex_lock(&xt[af].mutex); + list_for_each_entry(t, &xt_net->tables[af], list) { + if (strcmp(t->name, name) == 0) { + list_move(&t->list, &xt_net->dead_tables[af]); + mutex_unlock(&xt[af].mutex); - return private; + if (t->ops) /* nat table registers with nat core, t->ops is NULL. */ + nf_unregister_net_hooks(net, t->ops, hweight32(t->valid_hooks)); + return; + } + } + mutex_unlock(&xt[af].mutex); +} +EXPORT_SYMBOL(xt_unregister_table_pre_exit); + +/** + * xt_unregister_table_exit - remove a table during namespace teardown + * @net: the network namespace from which to unregister the table + * @af: address family (e.g., NFPROTO_IPV4, NFPROTO_IPV6) + * @name: name of the table to unregister + * + * Completes the unregister process for a table. This must be called from + * the pernet ops .exit callback. This is the second stage after + * xt_unregister_table_pre_exit(). + * + * pair with xt_unregister_table_pre_exit() during namespace shutdown. + * + * Return: the unregistered table or NULL if the table was never + * instantiated. The caller needs to kfree() the table after it + * has removed the family specific matches/targets. + */ +struct xt_table *xt_unregister_table_exit(struct net *net, u8 af, const char *name) +{ + struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); + struct xt_table *table; + + mutex_lock(&xt[af].mutex); + list_for_each_entry(table, &xt_net->dead_tables[af], list) { + struct nf_hook_ops *ops = NULL; + + if (strcmp(table->name, name) != 0) + continue; + + list_del(&table->list); + + audit_log_nfcfg(table->name, table->af, table->private->number, + AUDIT_XT_OP_UNREGISTER, GFP_KERNEL); + swap(table->ops, ops); + mutex_unlock(&xt[af].mutex); + + kfree(ops); + return table; + } + mutex_unlock(&xt[af].mutex); + + return NULL; } -EXPORT_SYMBOL_GPL(xt_unregister_table); +EXPORT_SYMBOL_GPL(xt_unregister_table_exit); #endif #ifdef CONFIG_PROC_FS @@ -1984,8 +2097,10 @@ static int __net_init xt_net_init(struct net *net) struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); int i; - for (i = 0; i < NFPROTO_NUMPROTO; i++) + for (i = 0; i < NFPROTO_NUMPROTO; i++) { INIT_LIST_HEAD(&xt_net->tables[i]); + INIT_LIST_HEAD(&xt_net->dead_tables[i]); + } return 0; } @@ -1994,8 +2109,10 @@ static void __net_exit xt_net_exit(struct net *net) struct xt_pernet *xt_net = net_generic(net, xt_pernet_id); int i; - for (i = 0; i < NFPROTO_NUMPROTO; i++) + for (i = 0; i < NFPROTO_NUMPROTO; i++) { WARN_ON_ONCE(!list_empty(&xt_net->tables[i])); + WARN_ON_ONCE(!list_empty(&xt_net->dead_tables[i])); + } } static struct pernet_operations xt_net_ops = { diff --git a/net/phonet/pep.c b/net/phonet/pep.c index 120e711ea78c..2c7bad24d7c9 100644 --- a/net/phonet/pep.c +++ b/net/phonet/pep.c @@ -671,8 +671,23 @@ static int pep_do_rcv(struct sock *sk, struct sk_buff *skb) /* Look for an existing pipe handle */ sknode = pep_find_pipe(&pn->hlist, &dst, pipe_handle); - if (sknode) - return sk_receive_skb(sknode, skb, 1); + if (sknode) { + int rc; + + /* pep_do_rcv() runs from two contexts: from softirq via + * phonet_rcv() -> __sk_receive_skb() with BH disabled, + * and from process context via + * release_sock() -> __release_sock(), which drops + * the listener slock with spin_unlock_bh() before draining + * the backlog. The child pipe slock is taken below via + * bh_lock_sock_nested(), which does not itself disable BH, so + * disable BH here to keep both acquire contexts consistent. + */ + local_bh_disable(); + rc = sk_receive_skb(sknode, skb, 1); + local_bh_enable(); + return rc; + } switch (hdr->message_id) { case PNS_PEP_CONNECT_REQ: diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h index 27c2aa2dd023..783367eea798 100644 --- a/net/rxrpc/ar-internal.h +++ b/net/rxrpc/ar-internal.h @@ -213,8 +213,6 @@ struct rxrpc_skb_priv { struct { u16 offset; /* Offset of data */ u16 len; /* Length of data */ - u8 flags; -#define RXRPC_RX_VERIFIED 0x01 }; struct { rxrpc_seq_t first_ack; /* First packet in acks table */ @@ -774,6 +772,11 @@ struct rxrpc_call { struct sk_buff_head recvmsg_queue; /* Queue of packets ready for recvmsg() */ struct sk_buff_head rx_queue; /* Queue of packets for this call to receive */ struct sk_buff_head rx_oos_queue; /* Queue of out of sequence packets */ + void *rx_dec_buffer; /* Decryption buffer */ + unsigned short rx_dec_bsize; /* rx_dec_buffer size */ + unsigned short rx_dec_offset; /* Decrypted packet data offset */ + unsigned short rx_dec_len; /* Decrypted packet data len */ + rxrpc_seq_t rx_dec_seq; /* Packet in decryption buffer */ rxrpc_seq_t rx_highest_seq; /* Higest sequence number received */ rxrpc_seq_t rx_consumed; /* Highest packet consumed */ diff --git a/net/rxrpc/call_event.c b/net/rxrpc/call_event.c index 2b19b252225e..fec59d9338b9 100644 --- a/net/rxrpc/call_event.c +++ b/net/rxrpc/call_event.c @@ -332,27 +332,7 @@ bool rxrpc_input_call_event(struct rxrpc_call *call) saw_ack |= sp->hdr.type == RXRPC_PACKET_TYPE_ACK; - if (sp->hdr.type == RXRPC_PACKET_TYPE_DATA && - sp->hdr.securityIndex != 0 && - (skb_cloned(skb) || - skb_has_frag_list(skb) || - skb_has_shared_frag(skb))) { - /* Unshare the packet so that it can be - * modified by in-place decryption. - */ - struct sk_buff *nskb = skb_copy(skb, GFP_ATOMIC); - - if (nskb) { - rxrpc_new_skb(nskb, rxrpc_skb_new_unshared); - rxrpc_input_call_packet(call, nskb); - rxrpc_free_skb(nskb, rxrpc_skb_put_call_rx); - } else { - /* OOM - Drop the packet. */ - rxrpc_see_skb(skb, rxrpc_skb_see_unshare_nomem); - } - } else { - rxrpc_input_call_packet(call, skb); - } + rxrpc_input_call_packet(call, skb); rxrpc_free_skb(skb, rxrpc_skb_put_call_rx); did_receive = true; } diff --git a/net/rxrpc/call_object.c b/net/rxrpc/call_object.c index f035f486c139..fcb9d38bb521 100644 --- a/net/rxrpc/call_object.c +++ b/net/rxrpc/call_object.c @@ -152,6 +152,7 @@ struct rxrpc_call *rxrpc_alloc_call(struct rxrpc_sock *rx, gfp_t gfp, spin_lock_init(&call->notify_lock); refcount_set(&call->ref, 1); call->debug_id = debug_id; + call->rx_pkt_offset = USHRT_MAX; call->tx_total_len = -1; call->tx_jumbo_max = 1; call->next_rx_timo = 20 * HZ; @@ -553,6 +554,7 @@ static void rxrpc_cleanup_rx_buffers(struct rxrpc_call *call) rxrpc_purge_queue(&call->recvmsg_queue); rxrpc_purge_queue(&call->rx_queue); rxrpc_purge_queue(&call->rx_oos_queue); + kfree(call->rx_dec_buffer); } /* diff --git a/net/rxrpc/insecure.c b/net/rxrpc/insecure.c index 0a260df45d25..7a26c6097d03 100644 --- a/net/rxrpc/insecure.c +++ b/net/rxrpc/insecure.c @@ -32,9 +32,6 @@ static int none_secure_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb) static int none_verify_packet(struct rxrpc_call *call, struct sk_buff *skb) { - struct rxrpc_skb_priv *sp = rxrpc_skb(skb); - - sp->flags |= RXRPC_RX_VERIFIED; return 0; } diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c index e1f7513a46db..c940600117a4 100644 --- a/net/rxrpc/recvmsg.c +++ b/net/rxrpc/recvmsg.c @@ -147,15 +147,52 @@ static void rxrpc_rotate_rx_window(struct rxrpc_call *call) } /* - * Decrypt and verify a DATA packet. + * Decrypt and verify a DATA packet. The content of the packet is pulled out + * into a flat buffer rather than decrypting in place in the skbuff. This also + * has the advantage of aligning the buffer correctly for the crypto routines. + * + * We keep track of the sequence number of the packet currently decrypted into + * the buffer in ->rx_dec_seq. If MSG_PEEK is used and steps onto a new + * packet, subsequent recvmsg() calls will have to go back and re-decrypt the + * current packet. */ static int rxrpc_verify_data(struct rxrpc_call *call, struct sk_buff *skb) { struct rxrpc_skb_priv *sp = rxrpc_skb(skb); + int ret; - if (sp->flags & RXRPC_RX_VERIFIED) - return 0; - return call->security->verify_packet(call, skb); + if (sp->len > call->rx_dec_bsize) { + /* Make sure we can hold a 1412-byte jumbo subpacket and make + * sure that the buffer size is aligned to a crypto blocksize. + */ + size_t size = clamp(round_up(sp->len, 32), 2048, 65535); + void *buffer = krealloc(call->rx_dec_buffer, size, GFP_NOFS); + + if (!buffer) + return -ENOMEM; + call->rx_dec_buffer = buffer; + call->rx_dec_bsize = size; + } + + ret = -EFAULT; + if (skb_copy_bits(skb, sp->offset, call->rx_dec_buffer, sp->len) < 0) + goto err; + + call->rx_dec_offset = 0; + call->rx_dec_len = sp->len; + call->rx_dec_seq = sp->hdr.seq; + ret = call->security->verify_packet(call, skb); + if (ret < 0) + goto err; + return 0; + +err: + kfree(call->rx_dec_buffer); + call->rx_dec_buffer = NULL; + call->rx_dec_bsize = 0; + call->rx_dec_offset = 0; + call->rx_dec_len = 0; + return ret; } /* @@ -283,16 +320,21 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call, if (msg) sock_recv_timestamp(msg, sock->sk, skb); - if (rx_pkt_offset == 0) { + if (call->rx_dec_seq != sp->hdr.seq || + !call->rx_dec_buffer) { ret2 = rxrpc_verify_data(call, skb); trace_rxrpc_recvdata(call, rxrpc_recvmsg_next, seq, - sp->offset, sp->len, ret2); + call->rx_dec_offset, + call->rx_dec_len, ret2); if (ret2 < 0) { ret = ret2; goto out; } - rx_pkt_offset = sp->offset; - rx_pkt_len = sp->len; + } + + if (rx_pkt_offset == USHRT_MAX) { + rx_pkt_offset = call->rx_dec_offset; + rx_pkt_len = call->rx_dec_len; } else { trace_rxrpc_recvdata(call, rxrpc_recvmsg_cont, seq, rx_pkt_offset, rx_pkt_len, 0); @@ -304,10 +346,10 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call, if (copy > remain) copy = remain; if (copy > 0) { - ret2 = skb_copy_datagram_iter(skb, rx_pkt_offset, iter, - copy); - if (ret2 < 0) { - ret = ret2; + ret2 = copy_to_iter(call->rx_dec_buffer + rx_pkt_offset, + copy, iter); + if (ret2 != copy) { + ret = -EFAULT; goto out; } @@ -328,7 +370,7 @@ static int rxrpc_recvmsg_data(struct socket *sock, struct rxrpc_call *call, /* The whole packet has been transferred. */ if (sp->hdr.flags & RXRPC_LAST_PACKET) ret = 1; - rx_pkt_offset = 0; + rx_pkt_offset = USHRT_MAX; rx_pkt_len = 0; skb = skb_peek_next(skb, &call->recvmsg_queue); diff --git a/net/rxrpc/rxgk.c b/net/rxrpc/rxgk.c index 0d5e654da918..f81703ee7ac3 100644 --- a/net/rxrpc/rxgk.c +++ b/net/rxrpc/rxgk.c @@ -473,15 +473,20 @@ static int rxgk_verify_packet_integrity(struct rxrpc_call *call, struct rxrpc_skb_priv *sp = rxrpc_skb(skb); struct rxgk_header *hdr; struct krb5_buffer metadata; - unsigned int offset = sp->offset, len = sp->len; + unsigned int len = call->rx_dec_len; size_t data_offset = 0, data_len = len; + void *data = call->rx_dec_buffer, *p = data; u32 ac = 0; int ret = -ENOMEM; _enter(""); - crypto_krb5_where_is_the_data(gk->krb5, KRB5_CHECKSUM_MODE, - &data_offset, &data_len); + if (crypto_krb5_where_is_the_data(gk->krb5, KRB5_CHECKSUM_MODE, + &data_offset, &data_len) < 0) { + ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT, + rxgk_abort_1_short_header); + goto put_gk; + } hdr = kzalloc_obj(*hdr, GFP_NOFS); if (!hdr) @@ -496,16 +501,15 @@ static int rxgk_verify_packet_integrity(struct rxrpc_call *call, metadata.len = sizeof(*hdr); metadata.data = hdr; - ret = rxgk_verify_mic_skb(gk->krb5, gk->rx_Kc, &metadata, - skb, &offset, &len, &ac); + ret = rxgk_verify_mic(gk->krb5, gk->rx_Kc, &metadata, &p, &len, &ac); kfree(hdr); if (ret < 0) { if (ret != -ENOMEM) rxrpc_abort_eproto(call, skb, ac, rxgk_abort_1_verify_mic_eproto); } else { - sp->offset = offset; - sp->len = len; + call->rx_dec_offset = p - data; + call->rx_dec_len = len; } put_gk: @@ -522,49 +526,53 @@ static int rxgk_verify_packet_encrypted(struct rxrpc_call *call, struct sk_buff *skb) { struct rxrpc_skb_priv *sp = rxrpc_skb(skb); - struct rxgk_header hdr; - unsigned int offset = sp->offset, len = sp->len; + struct rxgk_header *hdr; + unsigned int offset = 0, len = call->rx_dec_len; + void *data = call->rx_dec_buffer, *p = data; int ret; u32 ac = 0; _enter(""); - ret = rxgk_decrypt_skb(gk->krb5, gk->rx_enc, skb, &offset, &len, &ac); + if (crypto_krb5_check_data_len(gk->krb5, KRB5_ENCRYPT_MODE, + len, sizeof(*hdr)) < 0) { + ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT, + rxgk_abort_2_short_header); + goto error; + } + + ret = rxgk_decrypt(gk->krb5, gk->rx_enc, &p, &len, &ac); if (ret < 0) { if (ret != -ENOMEM) rxrpc_abort_eproto(call, skb, ac, rxgk_abort_2_decrypt_eproto); goto error; } + offset = p - data; - if (len < sizeof(hdr)) { + if (len < sizeof(*hdr)) { ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT, rxgk_abort_2_short_header); goto error; } /* Extract the header from the skb */ - ret = skb_copy_bits(skb, offset, &hdr, sizeof(hdr)); - if (ret < 0) { - ret = rxrpc_abort_eproto(call, skb, RXGK_PACKETSHORT, - rxgk_abort_2_short_encdata); - goto error; - } - offset += sizeof(hdr); - len -= sizeof(hdr); - - if (ntohl(hdr.epoch) != call->conn->proto.epoch || - ntohl(hdr.cid) != call->cid || - ntohl(hdr.call_number) != call->call_id || - ntohl(hdr.seq) != sp->hdr.seq || - ntohl(hdr.sec_index) != call->security_ix || - ntohl(hdr.data_len) > len) { + hdr = data + offset; + offset += sizeof(*hdr); + len -= sizeof(*hdr); + + if (ntohl(hdr->epoch) != call->conn->proto.epoch || + ntohl(hdr->cid) != call->cid || + ntohl(hdr->call_number) != call->call_id || + ntohl(hdr->seq) != sp->hdr.seq || + ntohl(hdr->sec_index) != call->security_ix || + ntohl(hdr->data_len) > len) { ret = rxrpc_abort_eproto(call, skb, RXGK_SEALEDINCON, rxgk_abort_2_short_data); goto error; } - sp->offset = offset; - sp->len = ntohl(hdr.data_len); + call->rx_dec_offset = offset; + call->rx_dec_len = ntohl(hdr->data_len); ret = 0; error: rxgk_put(gk); diff --git a/net/rxrpc/rxgk_common.h b/net/rxrpc/rxgk_common.h index 1e257d7ab8ec..112b5366ce11 100644 --- a/net/rxrpc/rxgk_common.h +++ b/net/rxrpc/rxgk_common.h @@ -106,6 +106,49 @@ int rxgk_decrypt_skb(const struct krb5_enctype *krb5, } /* + * Apply decryption and checksumming functions a flat data buffer. The data + * point and length are updated to reflect the actual content of the encrypted + * region. + */ +static inline int rxgk_decrypt(const struct krb5_enctype *krb5, + struct crypto_aead *aead, + void **_data, unsigned int *_len, + int *_error_code) +{ + struct scatterlist sg[1]; + size_t offset = 0, len = *_len; + int ret; + + sg_init_one(sg, *_data, len); + + ret = crypto_krb5_decrypt(krb5, aead, sg, 1, &offset, &len); + switch (ret) { + case 0: + if (offset & 3) { + *_error_code = RXGK_INCONSISTENCY; + ret = -EPROTO; + break; + } + *_data += offset; + *_len = len; + break; + case -EBADMSG: /* Checksum mismatch. */ + case -EPROTO: + *_error_code = RXGK_SEALEDINCON; + break; + case -EMSGSIZE: + *_error_code = RXGK_PACKETSHORT; + break; + case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */ + default: + *_error_code = RXGK_INCONSISTENCY; + break; + } + + return ret; +} + +/* * Check the MIC on a region of an skbuff. The offset and length are updated * to reflect the actual content of the secure region. */ @@ -148,3 +191,42 @@ int rxgk_verify_mic_skb(const struct krb5_enctype *krb5, return ret; } + +/* + * Check the MIC on a flat buffer. The data pointer and length are updated to + * reflect the actual content of the secure region. + */ +static inline +int rxgk_verify_mic(const struct krb5_enctype *krb5, + struct crypto_shash *shash, + const struct krb5_buffer *metadata, + void **_data, unsigned int *_len, + u32 *_error_code) +{ + struct scatterlist sg[1]; + size_t offset = 0, len = *_len; + int ret; + + sg_init_one(sg, *_data, len); + + ret = crypto_krb5_verify_mic(krb5, shash, metadata, sg, 1, &offset, &len); + switch (ret) { + case 0: + *_data += offset; + *_len = len; + break; + case -EBADMSG: /* Checksum mismatch */ + case -EPROTO: + *_error_code = RXGK_SEALEDINCON; + break; + case -EMSGSIZE: + *_error_code = RXGK_PACKETSHORT; + break; + case -ENOPKG: /* Would prefer RXGK_BADETYPE, but not available for YFS. */ + default: + *_error_code = RXGK_INCONSISTENCY; + break; + } + + return ret; +} diff --git a/net/rxrpc/rxkad.c b/net/rxrpc/rxkad.c index cba7935977f0..075936337836 100644 --- a/net/rxrpc/rxkad.c +++ b/net/rxrpc/rxkad.c @@ -430,27 +430,25 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb, rxrpc_seq_t seq, struct skcipher_request *req) { - struct rxkad_level1_hdr sechdr; + struct rxkad_level1_hdr *sechdr; struct rxrpc_skb_priv *sp = rxrpc_skb(skb); struct rxrpc_crypt iv; - struct scatterlist sg[16]; - u32 data_size, buf; + struct scatterlist sg[1]; + void *data = call->rx_dec_buffer; + u32 len = sp->len, data_size, buf; u16 check; int ret; _enter(""); - if (sp->len < 8) + if (len < 8) return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON, rxkad_abort_1_short_header); /* Decrypt the skbuff in-place. TODO: We really want to decrypt * directly into the target buffer. */ - sg_init_table(sg, ARRAY_SIZE(sg)); - ret = skb_to_sgvec(skb, sg, sp->offset, 8); - if (unlikely(ret < 0)) - return ret; + sg_init_one(sg, data, len); /* start the decryption afresh */ memset(&iv, 0, sizeof(iv)); @@ -464,13 +462,11 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb, return ret; /* Extract the decrypted packet length */ - if (skb_copy_bits(skb, sp->offset, &sechdr, sizeof(sechdr)) < 0) - return rxrpc_abort_eproto(call, skb, RXKADDATALEN, - rxkad_abort_1_short_encdata); - sp->offset += sizeof(sechdr); - sp->len -= sizeof(sechdr); + sechdr = data; + call->rx_dec_offset = sizeof(*sechdr); + len -= sizeof(*sechdr); - buf = ntohl(sechdr.data_size); + buf = ntohl(sechdr->data_size); data_size = buf & 0xffff; check = buf >> 16; @@ -479,10 +475,10 @@ static int rxkad_verify_packet_1(struct rxrpc_call *call, struct sk_buff *skb, if (check != 0) return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON, rxkad_abort_1_short_check); - if (data_size > sp->len) + if (data_size > len) return rxrpc_abort_eproto(call, skb, RXKADDATALEN, rxkad_abort_1_short_data); - sp->len = data_size; + call->rx_dec_len = data_size; _leave(" = 0 [dlen=%x]", data_size); return 0; @@ -496,43 +492,28 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb, struct skcipher_request *req) { const struct rxrpc_key_token *token; - struct rxkad_level2_hdr sechdr; + struct rxkad_level2_hdr *sechdr; struct rxrpc_skb_priv *sp = rxrpc_skb(skb); struct rxrpc_crypt iv; - struct scatterlist _sg[4], *sg; - u32 data_size, buf; + struct scatterlist sg[1]; + void *data = call->rx_dec_buffer; + u32 len = sp->len, data_size, buf; u16 check; - int nsg, ret; + int ret; - _enter(",{%d}", sp->len); + _enter(",{%d}", len); - if (sp->len < 8) + if (len < 8) return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON, rxkad_abort_2_short_header); /* Don't let the crypto algo see a misaligned length. */ - sp->len = round_down(sp->len, 8); + len = round_down(len, 8); - /* Decrypt the skbuff in-place. TODO: We really want to decrypt - * directly into the target buffer. + /* Decrypt in place in the call's decryption buffer. TODO: We really + * want to decrypt directly into the target buffer. */ - sg = _sg; - nsg = skb_shinfo(skb)->nr_frags + 1; - if (nsg <= 4) { - nsg = 4; - } else { - sg = kmalloc_objs(*sg, nsg, GFP_NOIO); - if (!sg) - return -ENOMEM; - } - - sg_init_table(sg, nsg); - ret = skb_to_sgvec(skb, sg, sp->offset, sp->len); - if (unlikely(ret < 0)) { - if (sg != _sg) - kfree(sg); - return ret; - } + sg_init_one(sg, data, len); /* decrypt from the session key */ token = call->conn->key->payload.data[0]; @@ -540,11 +521,9 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb, skcipher_request_set_sync_tfm(req, call->conn->rxkad.cipher); skcipher_request_set_callback(req, 0, NULL, NULL); - skcipher_request_set_crypt(req, sg, sg, sp->len, iv.x); + skcipher_request_set_crypt(req, sg, sg, len, iv.x); ret = crypto_skcipher_decrypt(req); skcipher_request_zero(req); - if (sg != _sg) - kfree(sg); if (ret < 0) { if (ret == -ENOMEM) return ret; @@ -553,13 +532,11 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb, } /* Extract the decrypted packet length */ - if (skb_copy_bits(skb, sp->offset, &sechdr, sizeof(sechdr)) < 0) - return rxrpc_abort_eproto(call, skb, RXKADDATALEN, - rxkad_abort_2_short_len); - sp->offset += sizeof(sechdr); - sp->len -= sizeof(sechdr); + sechdr = data; + call->rx_dec_offset = sizeof(*sechdr); + len -= sizeof(*sechdr); - buf = ntohl(sechdr.data_size); + buf = ntohl(sechdr->data_size); data_size = buf & 0xffff; check = buf >> 16; @@ -569,17 +546,18 @@ static int rxkad_verify_packet_2(struct rxrpc_call *call, struct sk_buff *skb, return rxrpc_abort_eproto(call, skb, RXKADSEALEDINCON, rxkad_abort_2_short_check); - if (data_size > sp->len) + if (data_size > len) return rxrpc_abort_eproto(call, skb, RXKADDATALEN, rxkad_abort_2_short_data); - sp->len = data_size; + call->rx_dec_len = data_size; _leave(" = 0 [dlen=%x]", data_size); return 0; } /* - * Verify the security on a received packet and the subpackets therein. + * Verify the security on a received (sub)packet. If the packet needs + * modifying (e.g. decrypting), it must be copied. */ static int rxkad_verify_packet(struct rxrpc_call *call, struct sk_buff *skb) { diff --git a/net/shaper/shaper.c b/net/shaper/shaper.c index 94bc9c7382ea..526fbde2e37f 100644 --- a/net/shaper/shaper.c +++ b/net/shaper/shaper.c @@ -90,6 +90,12 @@ static int net_shaper_handle_size(void) nla_total_size(sizeof(u32))); } +static int net_shaper_group_reply_size(void) +{ + return nla_total_size(sizeof(u32)) + /* NET_SHAPER_A_IFINDEX */ + net_shaper_handle_size(); /* NET_SHAPER_A_HANDLE */ +} + static int net_shaper_fill_binding(struct sk_buff *msg, const struct net_shaper_binding *binding, u32 type) @@ -130,35 +136,58 @@ handle_nest_cancel: return -EMSGSIZE; } +static void net_shaper_copy(struct net_shaper *dst, + const struct net_shaper *src) +{ + WRITE_ONCE(dst->parent.scope, READ_ONCE(src->parent.scope)); + WRITE_ONCE(dst->parent.id, READ_ONCE(src->parent.id)); + WRITE_ONCE(dst->handle.scope, READ_ONCE(src->handle.scope)); + WRITE_ONCE(dst->handle.id, READ_ONCE(src->handle.id)); + + WRITE_ONCE(dst->metric, READ_ONCE(src->metric)); + WRITE_ONCE(dst->bw_min, READ_ONCE(src->bw_min)); + WRITE_ONCE(dst->bw_max, READ_ONCE(src->bw_max)); + WRITE_ONCE(dst->burst, READ_ONCE(src->burst)); + WRITE_ONCE(dst->priority, READ_ONCE(src->priority)); + WRITE_ONCE(dst->weight, READ_ONCE(src->weight)); + + /* private fields are only used on the write path under the lock */ + data_race(dst->leaves = src->leaves); +} + static int net_shaper_fill_one(struct sk_buff *msg, const struct net_shaper_binding *binding, const struct net_shaper *shaper, const struct genl_info *info) { + struct net_shaper cur; void *hdr; hdr = genlmsg_iput(msg, info); if (!hdr) return -EMSGSIZE; + /* Make a copy to avoid data races */ + net_shaper_copy(&cur, shaper); + if (net_shaper_fill_binding(msg, binding, NET_SHAPER_A_IFINDEX) || - net_shaper_fill_handle(msg, &shaper->parent, + net_shaper_fill_handle(msg, &cur.parent, NET_SHAPER_A_PARENT) || - net_shaper_fill_handle(msg, &shaper->handle, + net_shaper_fill_handle(msg, &cur.handle, NET_SHAPER_A_HANDLE) || - ((shaper->bw_min || shaper->bw_max || shaper->burst) && - nla_put_u32(msg, NET_SHAPER_A_METRIC, shaper->metric)) || - (shaper->bw_min && - nla_put_uint(msg, NET_SHAPER_A_BW_MIN, shaper->bw_min)) || - (shaper->bw_max && - nla_put_uint(msg, NET_SHAPER_A_BW_MAX, shaper->bw_max)) || - (shaper->burst && - nla_put_uint(msg, NET_SHAPER_A_BURST, shaper->burst)) || - (shaper->priority && - nla_put_u32(msg, NET_SHAPER_A_PRIORITY, shaper->priority)) || - (shaper->weight && - nla_put_u32(msg, NET_SHAPER_A_WEIGHT, shaper->weight))) + ((cur.bw_min || cur.bw_max || cur.burst) && + nla_put_u32(msg, NET_SHAPER_A_METRIC, cur.metric)) || + (cur.bw_min && + nla_put_uint(msg, NET_SHAPER_A_BW_MIN, cur.bw_min)) || + (cur.bw_max && + nla_put_uint(msg, NET_SHAPER_A_BW_MAX, cur.bw_max)) || + (cur.burst && + nla_put_uint(msg, NET_SHAPER_A_BURST, cur.burst)) || + (cur.priority && + nla_put_u32(msg, NET_SHAPER_A_PRIORITY, cur.priority)) || + (cur.weight && + nla_put_u32(msg, NET_SHAPER_A_WEIGHT, cur.weight))) goto nla_put_failure; genlmsg_end(msg, hdr); @@ -275,25 +304,24 @@ static void net_shaper_default_parent(const struct net_shaper_handle *handle, parent->id = 0; } -/* - * MARK_0 is already in use due to XA_FLAGS_ALLOC, can't reuse such flag as - * it's cleared by xa_store(). - */ -#define NET_SHAPER_NOT_VALID XA_MARK_1 - static struct net_shaper * net_shaper_lookup(struct net_shaper_binding *binding, const struct net_shaper_handle *handle) { u32 index = net_shaper_handle_to_index(handle); struct net_shaper_hierarchy *hierarchy; + struct net_shaper *cur; hierarchy = net_shaper_hierarchy_rcu(binding); - if (!hierarchy || xa_get_mark(&hierarchy->shapers, index, - NET_SHAPER_NOT_VALID)) + if (!hierarchy) return NULL; - return xa_load(&hierarchy->shapers, index); + cur = xa_load(&hierarchy->shapers, index); + /* Check valid before reading fields */ + if (!cur || !smp_load_acquire(&cur->valid)) + return NULL; + + return cur; } /* Allocate on demand the per device shaper's hierarchy container. @@ -370,13 +398,10 @@ static int net_shaper_pre_insert(struct net_shaper_binding *binding, goto free_id; } - /* Mark 'tentative' shaper inside the hierarchy container. - * xa_set_mark is a no-op if the previous store fails. + /* Insert as 'tentative' (no VALID mark). The mark will be set by + * net_shaper_commit() once the driver-side configuration succeeds. */ - xa_lock(&hierarchy->shapers); - prev = __xa_store(&hierarchy->shapers, index, cur, GFP_KERNEL); - __xa_set_mark(&hierarchy->shapers, index, NET_SHAPER_NOT_VALID); - xa_unlock(&hierarchy->shapers); + prev = xa_store(&hierarchy->shapers, index, cur, GFP_KERNEL); if (xa_err(prev)) { NL_SET_ERR_MSG(extack, "Can't insert shaper into device store"); kfree_rcu(cur, rcu); @@ -410,12 +435,10 @@ static void net_shaper_commit(struct net_shaper_binding *binding, if (WARN_ON_ONCE(!cur)) continue; - /* Successful update: drop the tentative mark - * and update the hierarchy container. - */ - __xa_clear_mark(&hierarchy->shapers, index, - NET_SHAPER_NOT_VALID); - *cur = shapers[i]; + /* Successful update: update the hierarchy container... */ + net_shaper_copy(cur, &shapers[i]); + /* ... publish to lockless readers. */ + smp_store_release(&cur->valid, true); } xa_unlock(&hierarchy->shapers); } @@ -431,10 +454,11 @@ static void net_shaper_rollback(struct net_shaper_binding *binding) return; xa_lock(&hierarchy->shapers); - xa_for_each_marked(&hierarchy->shapers, index, cur, - NET_SHAPER_NOT_VALID) { + xa_for_each(&hierarchy->shapers, index, cur) { + if (cur->valid) + continue; __xa_erase(&hierarchy->shapers, index); - kfree(cur); + kfree_rcu(cur, rcu); } xa_unlock(&hierarchy->shapers); } @@ -465,10 +489,21 @@ static int net_shaper_parse_handle(const struct nlattr *attr, * shaper (any other value). */ id_attr = tb[NET_SHAPER_A_HANDLE_ID]; - if (id_attr) + if (id_attr) { id = nla_get_u32(id_attr); - else if (handle->scope == NET_SHAPER_SCOPE_NODE) + } else if (handle->scope == NET_SHAPER_SCOPE_NODE) { id = NET_SHAPER_ID_UNSPEC; + } else if (handle->scope == NET_SHAPER_SCOPE_QUEUE) { + NL_SET_ERR_ATTR_MISS(info->extack, attr, + NET_SHAPER_A_HANDLE_ID); + return -EINVAL; + } + + if (id && handle->scope == NET_SHAPER_SCOPE_NETDEV) { + NL_SET_ERR_MSG_ATTR(info->extack, id_attr, + "Netdev scope is a singleton, must use ID 0"); + return -EINVAL; + } handle->id = id; return 0; @@ -836,7 +871,12 @@ int net_shaper_nl_get_dumpit(struct sk_buff *skb, goto out_unlock; for (; (shaper = xa_find(&hierarchy->shapers, &ctx->start_index, - U32_MAX, XA_PRESENT)); ctx->start_index++) { + U32_MAX, XA_PRESENT)); + ctx->start_index++) { + /* Check valid before reading fields */ + if (!smp_load_acquire(&shaper->valid)) + continue; + ret = net_shaper_fill_one(skb, binding, shaper, info); if (ret) break; @@ -932,6 +972,46 @@ static int net_shaper_handle_cmp(const struct net_shaper_handle *a, return memcmp(a, b, sizeof(*a)); } +static int net_shaper_parse_leaves(struct net_shaper_binding *binding, + struct genl_info *info, + const struct net_shaper *node, + struct net_shaper *leaves, + int leaves_count) +{ + struct nlattr *attr; + int i, j, ret, rem; + + i = 0; + nla_for_each_attr_type(attr, NET_SHAPER_A_LEAVES, + genlmsg_data(info->genlhdr), + genlmsg_len(info->genlhdr), rem) { + if (WARN_ON_ONCE(i >= leaves_count)) + return -EINVAL; + + ret = net_shaper_parse_leaf(binding, attr, info, + node, &leaves[i]); + if (ret) + return ret; + + /* Reject duplicates */ + for (j = 0; j < i; j++) { + if (net_shaper_handle_cmp(&leaves[i].handle, + &leaves[j].handle)) + continue; + + NL_SET_ERR_MSG_ATTR_FMT(info->extack, attr, + "Duplicate leaf shaper %d:%d", + leaves[i].handle.scope, + leaves[i].handle.id); + return -EINVAL; + } + + i++; + } + + return 0; +} + static int net_shaper_parent_from_leaves(int leaves_count, const struct net_shaper *leaves, struct net_shaper *node, @@ -964,15 +1044,22 @@ static int __net_shaper_group(struct net_shaper_binding *binding, int i, ret; if (node->handle.scope == NET_SHAPER_SCOPE_NODE) { + struct net_shaper *cur = NULL; + new_node = node->handle.id == NET_SHAPER_ID_UNSPEC; - if (!new_node && !net_shaper_lookup(binding, &node->handle)) { - /* The related attribute is not available when - * reaching here from the delete() op. - */ - NL_SET_ERR_MSG_FMT(extack, "Node shaper %d:%d does not exists", - node->handle.scope, node->handle.id); - return -ENOENT; + if (!new_node) { + cur = net_shaper_lookup(binding, &node->handle); + if (!cur) { + /* The related attribute is not available + * when reaching here from the delete() op. + */ + NL_SET_ERR_MSG_FMT(extack, + "Node shaper %d:%d does not exist", + node->handle.scope, + node->handle.id); + return -ENOENT; + } } /* When unspecified, the node parent scope is inherited from @@ -986,6 +1073,15 @@ static int __net_shaper_group(struct net_shaper_binding *binding, return ret; } + if (cur && net_shaper_handle_cmp(&cur->parent, + &node->parent)) { + NL_SET_ERR_MSG_FMT(extack, + "Cannot reparent node shaper %d:%d", + node->handle.scope, + node->handle.id); + return -EOPNOTSUPP; + } + } else { net_shaper_default_parent(&node->handle, &node->parent); } @@ -1162,7 +1258,7 @@ static int net_shaper_group_send_reply(struct net_shaper_binding *binding, free_msg: /* Should never happen as msg is pre-allocated with enough space. */ WARN_ONCE(true, "calculated message payload length (%d)", - net_shaper_handle_size()); + net_shaper_group_reply_size()); nlmsg_free(msg); return -EMSGSIZE; } @@ -1172,10 +1268,9 @@ int net_shaper_nl_group_doit(struct sk_buff *skb, struct genl_info *info) struct net_shaper **old_nodes, *leaves, node = {}; struct net_shaper_hierarchy *hierarchy; struct net_shaper_binding *binding; - int i, ret, rem, leaves_count; + int i, ret, leaves_count; int old_nodes_count = 0; struct sk_buff *msg; - struct nlattr *attr; if (GENL_REQ_ATTR_CHECK(info, NET_SHAPER_A_LEAVES)) return -EINVAL; @@ -1203,26 +1298,19 @@ int net_shaper_nl_group_doit(struct sk_buff *skb, struct genl_info *info) if (ret) goto free_leaves; - i = 0; - nla_for_each_attr_type(attr, NET_SHAPER_A_LEAVES, - genlmsg_data(info->genlhdr), - genlmsg_len(info->genlhdr), rem) { - if (WARN_ON_ONCE(i >= leaves_count)) - goto free_leaves; - - ret = net_shaper_parse_leaf(binding, attr, info, - &node, &leaves[i]); - if (ret) - goto free_leaves; - i++; - } + ret = net_shaper_parse_leaves(binding, info, &node, + leaves, leaves_count); + if (ret) + goto free_leaves; /* Prepare the msg reply in advance, to avoid device operation * rollback on allocation failure. */ - msg = genlmsg_new(net_shaper_handle_size(), GFP_KERNEL); - if (!msg) + msg = genlmsg_new(net_shaper_group_reply_size(), GFP_KERNEL); + if (!msg) { + ret = -ENOMEM; goto free_leaves; + } hierarchy = net_shaper_hierarchy_setup(binding); if (!hierarchy) { diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 1a565095376a..f744f7911217 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -1400,7 +1400,8 @@ smc_v2_determine_accepted_chid(struct smc_clc_msg_accept_confirm *aclc, int i; for (i = 0; i < ini->ism_offered_cnt + 1; i++) { - if (ini->ism_chid[i] == ntohs(aclc->d1.chid)) { + if (ini->ism_dev[i] && + ini->ism_chid[i] == ntohs(aclc->d1.chid)) { ini->ism_selected = i; return 0; } diff --git a/net/smc/smc_tracepoint.h b/net/smc/smc_tracepoint.h index a9a6e3c1113a..53da84f57fd6 100644 --- a/net/smc/smc_tracepoint.h +++ b/net/smc/smc_tracepoint.h @@ -51,7 +51,7 @@ DECLARE_EVENT_CLASS(smc_msg_event, __field(const void *, smc) __field(u64, net_cookie) __field(size_t, len) - __string(name, smc->conn.lnk->ibname) + __string(name, smc->conn.lnk ? smc->conn.lnk->ibname : "") ), TP_fast_assign( diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c index 23a31646d038..11a70c10770b 100644 --- a/net/tls/tls_sw.c +++ b/net/tls/tls_sw.c @@ -789,23 +789,33 @@ static int tls_push_record(struct sock *sk, int flags, i = msg_pl->sg.end; sk_msg_iter_var_prev(i); + /* msg_pl->sg.data is a ring; data[MAX+1] is reserved for the wrap + * link (frags won't use it). 'i' is now the last filled entry: + * + * i end start + * v v v [ rsv ] + * [ d ][ d ][ ][ ]...[ ][ d ][ d ][ d ][chain] + * ^ END v + * `-----------------------------------------' + * + * Note that SGL does not allow chain-after-chain, so for TLS 1.3, + * we must make sure we don't create the wrap entry and then chain + * link to content_type immediately at index 0. + */ + if (i < msg_pl->sg.start) + sg_chain(msg_pl->sg.data, ARRAY_SIZE(msg_pl->sg.data), + msg_pl->sg.data); + rec->content_type = record_type; if (prot->version == TLS_1_3_VERSION) { /* Add content type to end of message. No padding added */ sg_set_buf(&rec->sg_content_type, &rec->content_type, 1); sg_mark_end(&rec->sg_content_type); - sg_chain(msg_pl->sg.data, msg_pl->sg.end + 1, - &rec->sg_content_type); + sg_chain(msg_pl->sg.data, i + 2, &rec->sg_content_type); } else { sg_mark_end(sk_msg_elem(msg_pl, i)); } - if (msg_pl->sg.end < msg_pl->sg.start) { - sg_chain(&msg_pl->sg.data[msg_pl->sg.start], - MAX_SKB_FRAGS - msg_pl->sg.start + 1, - msg_pl->sg.data); - } - i = msg_pl->sg.start; sg_chain(rec->sg_aead_in, 2, &msg_pl->sg.data[i]); @@ -1356,9 +1366,14 @@ unlock: mutex_unlock(&tls_ctx->tx_lock); } +/* When has_copied is true the caller has already moved bytes to + * userspace. Report sk_err but leave it set so the next read + * surfaces it instead of a spurious EOF, otherwise sk_err is + * consumed via sock_error(). + */ static int tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock, - bool released) + bool released, bool has_copied) { struct tls_context *tls_ctx = tls_get_ctx(sk); struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx); @@ -1376,8 +1391,11 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock, if (!sk_psock_queue_empty(psock)) return 0; - if (sk->sk_err) + if (sk->sk_err) { + if (has_copied) + return -READ_ONCE(sk->sk_err); return sock_error(sk); + } if (ret < 0) return ret; @@ -1413,7 +1431,7 @@ tls_rx_rec_wait(struct sock *sk, struct sk_psock *psock, bool nonblock, } if (unlikely(!tls_strp_msg_load(&ctx->strp, released))) - return tls_rx_rec_wait(sk, psock, nonblock, false); + return tls_rx_rec_wait(sk, psock, nonblock, false, has_copied); return 1; } @@ -2101,7 +2119,7 @@ int tls_sw_recvmsg(struct sock *sk, int to_decrypt, chunk; err = tls_rx_rec_wait(sk, psock, flags & MSG_DONTWAIT, - released); + released, !!(decrypted + copied)); if (err <= 0) { if (psock) { chunk = sk_msg_recvmsg(sk, psock, msg, len, @@ -2288,7 +2306,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos, struct tls_decrypt_arg darg; err = tls_rx_rec_wait(sk, NULL, flags & SPLICE_F_NONBLOCK, - true); + true, false); if (err <= 0) goto splice_read_end; @@ -2374,7 +2392,7 @@ int tls_sw_read_sock(struct sock *sk, read_descriptor_t *desc, } else { struct tls_decrypt_arg darg; - err = tls_rx_rec_wait(sk, NULL, true, released); + err = tls_rx_rec_wait(sk, NULL, true, released, !!copied); if (err <= 0) goto read_sock_end; diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 001f6602a665..c3d68bf26ce1 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2707,8 +2707,7 @@ static int unix_read_skb(struct sock *sk, skb_read_actor_t recv_actor) * Sleep until more data has arrived. But check for races.. */ static long unix_stream_data_wait(struct sock *sk, long timeo, - struct sk_buff *last, unsigned int last_len, - bool freezable) + struct sk_buff *last, bool freezable) { unsigned int state = TASK_INTERRUPTIBLE | freezable * TASK_FREEZABLE; struct sk_buff *tail; @@ -2721,7 +2720,6 @@ static long unix_stream_data_wait(struct sock *sk, long timeo, tail = skb_peek_tail(&sk->sk_receive_queue); if (tail != last || - (tail && tail->len != last_len) || sk->sk_err || (sk->sk_shutdown & RCV_SHUTDOWN) || signal_pending(current) || @@ -2917,7 +2915,6 @@ static int unix_stream_read_generic(struct unix_stream_read_state *state, int flags = state->flags; bool check_creds = false; struct scm_cookie scm; - unsigned int last_len; struct unix_sock *u; int copied = 0; int err = 0; @@ -2963,7 +2960,6 @@ redo: goto unlock; } last = skb = skb_peek(&sk->sk_receive_queue); - last_len = last ? last->len : 0; again: #if IS_ENABLED(CONFIG_AF_UNIX_OOB) @@ -2997,8 +2993,7 @@ again: mutex_unlock(&u->iolock); - timeo = unix_stream_data_wait(sk, timeo, last, - last_len, freezable); + timeo = unix_stream_data_wait(sk, timeo, last, freezable); if (signal_pending(current)) { err = sock_intr_errno(timeo); @@ -3015,7 +3010,6 @@ unlock: while (skip >= unix_skb_len(skb)) { skip -= unix_skb_len(skb); last = skb; - last_len = skb->len; skb = skb_peek_next(skb, &sk->sk_receive_queue); if (!skb) goto again; @@ -3090,7 +3084,6 @@ unlock: skip = 0; last = skb; - last_len = skb->len; unix_state_lock(sk); skb = skb_peek_next(skb, &sk->sk_receive_queue); if (skb) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 0d0265f770ad..e8fb2e20db0f 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -72,34 +72,6 @@ static bool virtio_transport_can_zcopy(const struct virtio_transport *t_ops, return true; } -static int virtio_transport_init_zcopy_skb(struct vsock_sock *vsk, - struct sk_buff *skb, - struct msghdr *msg, - size_t pkt_len, - bool zerocopy) -{ - struct ubuf_info *uarg; - - if (msg->msg_ubuf) { - uarg = msg->msg_ubuf; - net_zcopy_get(uarg); - } else { - struct ubuf_info_msgzc *uarg_zc; - - uarg = msg_zerocopy_realloc(sk_vsock(vsk), - pkt_len, NULL, false); - if (!uarg) - return -1; - - uarg_zc = uarg_to_msgzc(uarg); - uarg_zc->zerocopy = zerocopy ? 1 : 0; - } - - skb_zcopy_init(skb, uarg); - - return 0; -} - static int virtio_transport_fill_skb(struct sk_buff *skb, struct virtio_vsock_pkt_info *info, size_t len, @@ -319,8 +291,10 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, u32 src_cid, src_port, dst_cid, dst_port; const struct virtio_transport *t_ops; struct virtio_vsock_sock *vvs; + struct ubuf_info *uarg = NULL; u32 pkt_len = info->pkt_len; bool can_zcopy = false; + bool have_uref = false; u32 rest_len; int ret; @@ -362,6 +336,25 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, if (can_zcopy) max_skb_len = min_t(u32, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE, (MAX_SKB_FRAGS * PAGE_SIZE)); + + if (info->msg->msg_flags & MSG_ZEROCOPY && + info->op == VIRTIO_VSOCK_OP_RW) { + uarg = info->msg->msg_ubuf; + + if (!uarg) { + uarg = msg_zerocopy_realloc(sk_vsock(vsk), + pkt_len, NULL, false); + if (!uarg) { + virtio_transport_put_credit(vvs, pkt_len); + return -ENOMEM; + } + + if (!can_zcopy) + uarg_to_msgzc(uarg)->zerocopy = 0; + + have_uref = true; + } + } } rest_len = pkt_len; @@ -380,27 +373,7 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, break; } - /* We process buffer part by part, allocating skb on - * each iteration. If this is last skb for this buffer - * and MSG_ZEROCOPY mode is in use - we must allocate - * completion for the current syscall. - * - * Pass pkt_len because msg iter is already consumed - * by virtio_transport_fill_skb(), so iter->count - * can not be used for RLIMIT_MEMLOCK pinned-pages - * accounting done by msg_zerocopy_realloc(). - */ - if (info->msg && info->msg->msg_flags & MSG_ZEROCOPY && - skb_len == rest_len && info->op == VIRTIO_VSOCK_OP_RW) { - if (virtio_transport_init_zcopy_skb(vsk, skb, - info->msg, - pkt_len, - can_zcopy)) { - kfree_skb(skb); - ret = -ENOMEM; - break; - } - } + skb_zcopy_set(skb, uarg, NULL); virtio_transport_inc_tx_pkt(vvs, skb); @@ -424,6 +397,18 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, virtio_transport_put_credit(vvs, rest_len); + /* msg_zerocopy_realloc() initializes the ubuf_info refcnt to 1. + * skb_zcopy_set() increases it for each skb, so we can drop that + * initial reference to keep it balanced. + */ + if (have_uref) { + if (rest_len == pkt_len) + /* No data sent, abort the notification. */ + net_zcopy_put_abort(uarg, true); + else + net_zcopy_put(uarg); + } + /* Return number of bytes, if any data has been sent. */ if (rest_len != pkt_len) ret = pkt_len - rest_len; @@ -1350,7 +1335,7 @@ destroy: return err; } -static void +static bool virtio_transport_recv_enqueue(struct vsock_sock *vsk, struct sk_buff *skb) { @@ -1365,10 +1350,8 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk, spin_lock_bh(&vvs->rx_lock); can_enqueue = virtio_transport_inc_rx_pkt(vvs, len); - if (!can_enqueue) { - free_pkt = true; + if (!can_enqueue) goto out; - } if (le32_to_cpu(hdr->flags) & VIRTIO_VSOCK_SEQ_EOM) vvs->msg_count++; @@ -1408,6 +1391,8 @@ out: spin_unlock_bh(&vvs->rx_lock); if (free_pkt) kfree_skb(skb); + + return can_enqueue; } static int @@ -1420,7 +1405,17 @@ virtio_transport_recv_connected(struct sock *sk, switch (le16_to_cpu(hdr->op)) { case VIRTIO_VSOCK_OP_RW: - virtio_transport_recv_enqueue(vsk, skb); + if (!virtio_transport_recv_enqueue(vsk, skb)) { + /* There is no more space to queue the packet, so let's + * close the connection; otherwise, we'll lose data. + */ + (void)virtio_transport_reset(vsk, skb); + virtio_transport_do_close(vsk, true); + sk->sk_err = ENOBUFS; + sk_error_report(sk); + vsock_remove_sock(vsk); + break; + } vsock_data_ready(sk); return err; case VIRTIO_VSOCK_OP_CREDIT_REQUEST: diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c index 4296ca1183f1..d2579380f51e 100644 --- a/net/vmw_vsock/vmci_transport.c +++ b/net/vmw_vsock/vmci_transport.c @@ -1164,7 +1164,7 @@ vmci_transport_recv_connecting_server(struct sock *listener, /* Close and cleanup the connection. */ vmci_transport_send_reset(pending, pkt); skerr = EPROTO; - err = pkt->type == VMCI_TRANSPORT_PACKET_TYPE_RST ? 0 : -EINVAL; + err = -EINVAL; goto destroy; } diff --git a/net/wireless/scan.c b/net/wireless/scan.c index 328af43ef832..358cbc9e43d8 100644 --- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -2462,6 +2462,9 @@ size_t cfg80211_merge_profile(const u8 *ie, size_t ielen, memcpy(merged_ie + copied_len, next_sub->data, next_sub->datalen); copied_len += next_sub->datalen; + + mbssid_elem = next_mbssid; + sub_elem = next_sub; } return copied_len; diff --git a/scripts/gcc-plugins/gcc-common.h b/scripts/gcc-plugins/gcc-common.h index 8f1b3500f8e2..abb1964c44d4 100644 --- a/scripts/gcc-plugins/gcc-common.h +++ b/scripts/gcc-plugins/gcc-common.h @@ -309,7 +309,9 @@ typedef const gimple *const_gimple_ptr; #define gimple gimple_ptr #define const_gimple const_gimple_ptr #undef CONST_CAST_GIMPLE -#define CONST_CAST_GIMPLE(X) CONST_CAST(gimple, (X)) +#define CONST_CAST_GIMPLE(X) const_cast<gimple>((X)) +#undef CONST_CAST_TREE +#define CONST_CAST_TREE(X) const_cast<tree>((X)) /* gimple related */ static inline gimple gimple_build_assign_with_ops(enum tree_code subcode, tree lhs, tree op1, tree op2 MEM_STAT_DECL) diff --git a/scripts/gdb/linux/mm.py b/scripts/gdb/linux/mm.py index d78908f6664d..dffadccbb01d 100644 --- a/scripts/gdb/linux/mm.py +++ b/scripts/gdb/linux/mm.py @@ -40,11 +40,11 @@ class x86_page_ops(): self.PAGE_OFFSET = int(gdb.parse_and_eval("page_offset_base")) self.VMEMMAP_START = int(gdb.parse_and_eval("vmemmap_base")) - self.PHYS_BASE = int(gdb.parse_and_eval("phys_base")) + self.PHYS_BASE = int(gdb.parse_and_eval("(unsigned long) phys_base")) self.START_KERNEL_map = 0xffffffff80000000 - self.KERNEL_START = gdb.parse_and_eval("_text") - self.KERNEL_END = gdb.parse_and_eval("_end") + self.KERNEL_START = gdb.parse_and_eval("(unsigned long) &_text") + self.KERNEL_END = gdb.parse_and_eval("(unsigned long) &_end") self.VMALLOC_START = int(gdb.parse_and_eval("vmalloc_base")) if self.VMALLOC_START == 0xffffc90000000000: diff --git a/scripts/package/PKGBUILD b/scripts/package/PKGBUILD index 452374d63c24..1213c8e04671 100644 --- a/scripts/package/PKGBUILD +++ b/scripts/package/PKGBUILD @@ -10,7 +10,7 @@ for pkg in $_extrapackages; do pkgname+=("${pkgbase}-${pkg}") done -pkgver="${KERNELRELEASE//-/_}" +pkgver="$(echo "${KERNELRELEASE}" | sed 's/-\(rc[0-9]\+\)/\1/;s/-/_/g')" # The PKGBUILD is evaluated multiple times. # Running scripts/build-version from here would introduce inconsistencies. pkgrel="${KBUILD_REVISION}" diff --git a/security/keys/keyring.c b/security/keys/keyring.c index b39038f7dd31..5a9887d6b7be 100644 --- a/security/keys/keyring.c +++ b/security/keys/keyring.c @@ -1109,6 +1109,7 @@ key_ref_t find_key_to_update(key_ref_t keyring_ref, kenter("{%d},{%s,%s}", keyring->serial, index_key->type->name, index_key->description); + guard(rcu)(); object = assoc_array_find(&keyring->keys, &keyring_assoc_array_ops, index_key); diff --git a/security/lsm_syscalls.c b/security/lsm_syscalls.c index 5648b1f0ce9c..08a017669c02 100644 --- a/security/lsm_syscalls.c +++ b/security/lsm_syscalls.c @@ -57,7 +57,14 @@ u64 lsm_name_to_attr(const char *name) SYSCALL_DEFINE4(lsm_set_self_attr, unsigned int, attr, struct lsm_ctx __user *, ctx, u32, size, u32, flags) { - return security_setselfattr(attr, ctx, size, flags); + int rc; + + rc = mutex_lock_interruptible(¤t->signal->cred_guard_mutex); + if (rc < 0) + return rc; + rc = security_setselfattr(attr, ctx, size, flags); + mutex_unlock(¤t->signal->cred_guard_mutex); + return rc; } /** diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index 09c421cd9319..fe597f7d522d 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -2138,6 +2138,9 @@ static int interleaved_copy(struct snd_pcm_substream *substream, off = frames_to_bytes(runtime, off); frames = frames_to_bytes(runtime, frames); + if (!data) + return fill_silence(substream, 0, hwoff, NULL, frames); + return do_transfer(substream, 0, hwoff, data + off, frames, transfer, in_kernel); } diff --git a/sound/core/seq/seq_ump_client.c b/sound/core/seq/seq_ump_client.c index 9079ccfdc866..ccd93599b493 100644 --- a/sound/core/seq/seq_ump_client.c +++ b/sound/core/seq/seq_ump_client.c @@ -37,6 +37,7 @@ struct seq_ump_client { struct snd_ump_endpoint *ump; /* assigned endpoint */ int seq_client; /* sequencer client id */ int opened[2]; /* current opens for each direction */ + rwlock_t output_lock; /* protects out_rfile output access */ struct snd_rawmidi_file out_rfile; /* rawmidi for output */ struct seq_ump_input_buffer input; /* input parser context */ void *ump_info[SNDRV_UMP_MAX_BLOCKS + 1]; /* shadow of seq client ump_info */ @@ -88,6 +89,7 @@ static int seq_ump_process_event(struct snd_seq_event *ev, int direct, unsigned char type; int len; + guard(read_lock_irqsave)(&client->output_lock); substream = client->out_rfile.output; if (!substream) return -ENODEV; @@ -106,6 +108,7 @@ static int seq_ump_process_event(struct snd_seq_event *ev, int direct, static int seq_ump_client_open(struct seq_ump_client *client, int dir) { struct snd_ump_endpoint *ump = client->ump; + struct snd_rawmidi_file rfile = {}; int err; guard(mutex)(&ump->open_mutex); @@ -113,9 +116,11 @@ static int seq_ump_client_open(struct seq_ump_client *client, int dir) err = snd_rawmidi_kernel_open(&ump->core, 0, SNDRV_RAWMIDI_LFLG_OUTPUT | SNDRV_RAWMIDI_LFLG_APPEND, - &client->out_rfile); + &rfile); if (err < 0) return err; + scoped_guard(write_lock_irqsave, &client->output_lock) + client->out_rfile = rfile; } client->opened[dir]++; return 0; @@ -125,11 +130,19 @@ static int seq_ump_client_open(struct seq_ump_client *client, int dir) static int seq_ump_client_close(struct seq_ump_client *client, int dir) { struct snd_ump_endpoint *ump = client->ump; + struct snd_rawmidi_file rfile = {}; guard(mutex)(&ump->open_mutex); - if (!--client->opened[dir]) - if (dir == STR_OUT) - snd_rawmidi_kernel_release(&client->out_rfile); + if (!--client->opened[dir]) { + if (dir == STR_OUT) { + scoped_guard(write_lock_irqsave, &client->output_lock) { + rfile = client->out_rfile; + client->out_rfile = (struct snd_rawmidi_file){}; + } + if (rfile.rmidi) + snd_rawmidi_kernel_release(&rfile); + } + } return 0; } @@ -467,6 +480,7 @@ static int snd_seq_ump_probe(struct snd_seq_device *dev) INIT_WORK(&client->group_notify_work, handle_group_notify); client->ump = ump; + rwlock_init(&client->output_lock); client->seq_client = snd_seq_create_kernel_client(card, ump->core.device, diff --git a/sound/hda/codecs/ca0132.c b/sound/hda/codecs/ca0132.c index a0677d7da8e2..b4e10957ac6d 100644 --- a/sound/hda/codecs/ca0132.c +++ b/sound/hda/codecs/ca0132.c @@ -5508,6 +5508,30 @@ static int zxr_headphone_gain_set(struct hda_codec *codec, long val) return 0; } +/* + * Manual output selection (HP/Speaker Playback Switch or alt Output Select) + * is meaningful only when HP/Speaker auto-detect is disabled, since the + * select_out path always prefers jack presence when auto-detect is on. When + * the user explicitly chooses an output, turn auto-detect off so the manual + * choice actually takes effect, and notify userspace so the auto-detect + * control reflects the new state. + */ +static void ca0132_disable_hp_auto_detect(struct hda_codec *codec) +{ + struct ca0132_spec *spec = codec->spec; + struct snd_kcontrol *kctl; + + if (!spec->vnode_lswitch[VNID_HP_ASEL - VNODE_START_NID]) + return; + + spec->vnode_lswitch[VNID_HP_ASEL - VNODE_START_NID] = 0; + kctl = snd_hda_find_mixer_ctl(codec, + "HP/Speaker Auto Detect Playback Switch"); + if (kctl) + snd_ctl_notify(codec->card, SNDRV_CTL_EVENT_MASK_VALUE, + &kctl->id); +} + static int ca0132_vnode_switch_set(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { @@ -5520,14 +5544,11 @@ static int ca0132_vnode_switch_set(struct snd_kcontrol *kcontrol, int auto_jack; if (nid == VNID_HP_SEL) { - auto_jack = - spec->vnode_lswitch[VNID_HP_ASEL - VNODE_START_NID]; - if (!auto_jack) { - if (ca0132_use_alt_functions(spec)) - ca0132_alt_select_out(codec); - else - ca0132_select_out(codec); - } + ca0132_disable_hp_auto_detect(codec); + if (ca0132_use_alt_functions(spec)) + ca0132_alt_select_out(codec); + else + ca0132_select_out(codec); return 1; } @@ -5988,7 +6009,6 @@ static int ca0132_alt_output_select_put(struct snd_kcontrol *kcontrol, struct ca0132_spec *spec = codec->spec; int sel = ucontrol->value.enumerated.item[0]; unsigned int items = NUM_OF_OUTPUTS; - unsigned int auto_jack; if (sel >= items) return 0; @@ -5998,10 +6018,8 @@ static int ca0132_alt_output_select_put(struct snd_kcontrol *kcontrol, spec->out_enum_val = sel; - auto_jack = spec->vnode_lswitch[VNID_HP_ASEL - VNODE_START_NID]; - - if (!auto_jack) - ca0132_alt_select_out(codec); + ca0132_disable_hp_auto_detect(codec); + ca0132_alt_select_out(codec); return 1; } diff --git a/sound/hda/codecs/realtek/alc269.c b/sound/hda/codecs/realtek/alc269.c index 8f7d8337b4bc..c59021c15d66 100644 --- a/sound/hda/codecs/realtek/alc269.c +++ b/sound/hda/codecs/realtek/alc269.c @@ -7389,12 +7389,12 @@ static const struct hda_quirk alc269_fixup_tbl[] = { SND_PCI_QUIRK(0x1043, 0x3e00, "ASUS G814FH/FM/FP", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x1043, 0x3e20, "ASUS G814PH/PM/PP", ALC287_FIXUP_CS35L41_I2C_2), SND_PCI_QUIRK(0x1043, 0x3e30, "ASUS TP3607SA", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x1043, 0x3ee0, "ASUS Strix G815_JHR_JMR_JPR", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x1043, 0x3ef0, "ASUS Strix G635LR_LW_LX", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x1043, 0x3f00, "ASUS Strix G815LH_LM_LP", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x1043, 0x3f10, "ASUS Strix G835LR_LW_LX", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x1043, 0x3f20, "ASUS Strix G615LR_LW", ALC287_FIXUP_TAS2781_I2C), - SND_PCI_QUIRK(0x1043, 0x3f30, "ASUS Strix G815LR_LW", ALC287_FIXUP_TAS2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3ee0, "ASUS Strix G815_JHR_JMR_JPR", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3ef0, "ASUS Strix G635LR_LW_LX", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3f00, "ASUS Strix G815LH_LM_LP", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3f10, "ASUS Strix G835LR_LW_LX", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3f20, "ASUS Strix G615LR_LW", ALC287_FIXUP_TXNW2781_I2C), + SND_PCI_QUIRK(0x1043, 0x3f30, "ASUS Strix G815LR_LW", ALC287_FIXUP_TXNW2781_I2C), SND_PCI_QUIRK(0x1043, 0x3fd0, "ASUS B3605CVA", ALC245_FIXUP_CS35L41_SPI_2), SND_PCI_QUIRK(0x1043, 0x3ff0, "ASUS B5405CVA", ALC245_FIXUP_CS35L41_SPI_2), SND_PCI_QUIRK(0x1043, 0x831a, "ASUS P901", ALC269_FIXUP_STEREO_DMIC), diff --git a/sound/hda/codecs/side-codecs/cs35l41_hda.c b/sound/hda/codecs/side-codecs/cs35l41_hda.c index b64890006bb7..acfccc848f82 100644 --- a/sound/hda/codecs/side-codecs/cs35l41_hda.c +++ b/sound/hda/codecs/side-codecs/cs35l41_hda.c @@ -1896,8 +1896,10 @@ static int cs35l41_hda_read_acpi(struct cs35l41_hda *cs35l41, const char *hid, i cs35l41->dacpi = adev; physdev = get_device(acpi_get_first_physical_node(adev)); - if (!physdev) + if (!physdev) { + acpi_dev_put(adev); return -ENODEV; + } sub = acpi_get_subsystem_id(ACPI_HANDLE(physdev)); if (IS_ERR(sub)) diff --git a/sound/hda/codecs/side-codecs/cs35l56_hda.c b/sound/hda/codecs/side-codecs/cs35l56_hda.c index 4c8d01799931..cdbc576569ef 100644 --- a/sound/hda/codecs/side-codecs/cs35l56_hda.c +++ b/sound/hda/codecs/side-codecs/cs35l56_hda.c @@ -1041,6 +1041,7 @@ static int cs35l56_hda_read_acpi(struct cs35l56_hda *cs35l56, int hid, int id) return -ENODEV; } ACPI_COMPANION_SET(cs35l56->base.dev, adev); + acpi_dev_put(adev); } /* Initialize things that could be overwritten by a fixup */ diff --git a/sound/pci/asihpi/hpicmn.c b/sound/pci/asihpi/hpicmn.c index d846777e7462..19f0da2e6501 100644 --- a/sound/pci/asihpi/hpicmn.c +++ b/sound/pci/asihpi/hpicmn.c @@ -276,6 +276,12 @@ static short find_control(u16 control_index, return 0; } + if (control_index >= p_cache->control_count) { + HPI_DEBUG_LOG(VERBOSE, "control_index out of bounce %d\n", + control_index); + return 0; + } + *pI = p_cache->p_info[control_index]; if (!*pI) { HPI_DEBUG_LOG(VERBOSE, "Uncached Control %d\n", diff --git a/sound/soc/amd/acp/acp-sdw-legacy-mach.c b/sound/soc/amd/acp/acp-sdw-legacy-mach.c index a9c8d9545281..ae9579c8511e 100644 --- a/sound/soc/amd/acp/acp-sdw-legacy-mach.c +++ b/sound/soc/amd/acp/acp-sdw-legacy-mach.c @@ -260,9 +260,9 @@ static int create_sdw_dailink(struct snd_soc_card *card, cpus->dai_name = devm_kasprintf(dev, GFP_KERNEL, "SDW%d Pin%d", link_num, cpu_pin_id); - dev_dbg(dev, "cpu->dai_name:%s\n", cpus->dai_name); if (!cpus->dai_name) return -ENOMEM; + dev_dbg(dev, "cpu->dai_name:%s\n", cpus->dai_name); codec_maps[j].cpu = 0; codec_maps[j].codec = j; diff --git a/sound/soc/codecs/cs-amp-lib.c b/sound/soc/codecs/cs-amp-lib.c index 8b131975143d..fae006aa7898 100644 --- a/sound/soc/codecs/cs-amp-lib.c +++ b/sound/soc/codecs/cs-amp-lib.c @@ -500,7 +500,7 @@ static int _cs_amp_set_efi_calibration_data(struct device *dev, int amp_index, i * must be set. */ if (data->count == 0) - data->count = (data->size - sizeof(data)) / sizeof(data->data[0]); + data->count = (data->size - struct_offset(data, data)) / sizeof(data->data[0]); if (amp_index < 0) { /* Is there already a slot for this target? */ @@ -831,11 +831,18 @@ EXPORT_SYMBOL_NS_GPL(cs_amp_devm_get_vendor_specific_variant_id, "SND_SOC_CS_AMP */ struct dentry *cs_amp_create_debugfs(struct device *dev) { - struct dentry *dir; + struct dentry *dir, *created; + /* debugfs_lookup() can return NULL or ERR_PTR on error */ dir = debugfs_lookup("cirrus_logic", NULL); - if (!dir) - dir = debugfs_create_dir("cirrus_logic", NULL); + if (!IS_ERR_OR_NULL(dir)) { + created = debugfs_create_dir(dev_name(dev), dir); + dput(dir); + + return created; + } + + dir = debugfs_create_dir("cirrus_logic", NULL); return debugfs_create_dir(dev_name(dev), dir); } diff --git a/sound/soc/codecs/cs35l56-sdw.c b/sound/soc/codecs/cs35l56-sdw.c index 30b3192d6ce9..8d7772894f10 100644 --- a/sound/soc/codecs/cs35l56-sdw.c +++ b/sound/soc/codecs/cs35l56-sdw.c @@ -560,10 +560,11 @@ static void cs35l56_sdw_remove(struct sdw_slave *peripheral) /* Disable SoundWire interrupts */ cs35l56->sdw_irq_no_unmask = true; - cancel_work_sync(&cs35l56->sdw_irq_work); + flush_work(&cs35l56->sdw_irq_work); sdw_write_no_pm(peripheral, CS35L56_SDW_GEN_INT_MASK_1, 0); sdw_read_no_pm(peripheral, CS35L56_SDW_GEN_INT_STAT_1); sdw_write_no_pm(peripheral, CS35L56_SDW_GEN_INT_STAT_1, 0xFF); + flush_work(&cs35l56->sdw_irq_work); cs35l56_remove(cs35l56); } diff --git a/sound/soc/codecs/fs210x.c b/sound/soc/codecs/fs210x.c index e6195b71adad..eda716f817b5 100644 --- a/sound/soc/codecs/fs210x.c +++ b/sound/soc/codecs/fs210x.c @@ -968,7 +968,7 @@ static int fs210x_effect_scene_info(struct snd_kcontrol *kcontrol, if (scene->name) name = scene->name; - strscpy(uinfo->value.enumerated.name, name, strlen(name) + 1); + strscpy(uinfo->value.enumerated.name, name); return 0; } diff --git a/sound/soc/codecs/pcm512x.c b/sound/soc/codecs/pcm512x.c index a70e8ea166dc..fdef98ce52f1 100644 --- a/sound/soc/codecs/pcm512x.c +++ b/sound/soc/codecs/pcm512x.c @@ -235,7 +235,7 @@ static int pcm512x_overclock_pll_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct snd_soc_component *component = snd_kcontrol_chip(kcontrol); - struct snd_soc_dapm_context *dapm = snd_soc_dapm_kcontrol_to_dapm(kcontrol); + struct snd_soc_dapm_context *dapm = snd_soc_component_to_dapm(component); struct pcm512x_priv *pcm512x = snd_soc_component_get_drvdata(component); switch (snd_soc_dapm_get_bias_level(dapm)) { @@ -264,7 +264,7 @@ static int pcm512x_overclock_dsp_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct snd_soc_component *component = snd_kcontrol_chip(kcontrol); - struct snd_soc_dapm_context *dapm = snd_soc_dapm_kcontrol_to_dapm(kcontrol); + struct snd_soc_dapm_context *dapm = snd_soc_component_to_dapm(component); struct pcm512x_priv *pcm512x = snd_soc_component_get_drvdata(component); switch (snd_soc_dapm_get_bias_level(dapm)) { @@ -293,7 +293,7 @@ static int pcm512x_overclock_dac_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct snd_soc_component *component = snd_kcontrol_chip(kcontrol); - struct snd_soc_dapm_context *dapm = snd_soc_dapm_kcontrol_to_dapm(kcontrol); + struct snd_soc_dapm_context *dapm = snd_soc_component_to_dapm(component); struct pcm512x_priv *pcm512x = snd_soc_component_get_drvdata(component); switch (snd_soc_dapm_get_bias_level(dapm)) { diff --git a/sound/soc/sdw_utils/soc_sdw_bridge_cs35l56.c b/sound/soc/sdw_utils/soc_sdw_bridge_cs35l56.c index 2a7109d53cbe..e0e32a279787 100644 --- a/sound/soc/sdw_utils/soc_sdw_bridge_cs35l56.c +++ b/sound/soc/sdw_utils/soc_sdw_bridge_cs35l56.c @@ -40,12 +40,6 @@ static int asoc_sdw_bridge_cs35l56_asp_init(struct snd_soc_pcm_runtime *rtd) struct snd_soc_dai *codec_dai; struct snd_soc_dai *cpu_dai; - card->components = devm_kasprintf(card->dev, GFP_KERNEL, - "%s spk:cs35l56-bridge", - card->components); - if (!card->components) - return -ENOMEM; - ret = snd_soc_dapm_new_controls(dapm, bridge_widgets, ARRAY_SIZE(bridge_widgets)); if (ret) { diff --git a/sound/soc/sdw_utils/soc_sdw_cs42l43.c b/sound/soc/sdw_utils/soc_sdw_cs42l43.c index 2685ff4f0932..e99ea3c4e5dd 100644 --- a/sound/soc/sdw_utils/soc_sdw_cs42l43.c +++ b/sound/soc/sdw_utils/soc_sdw_cs42l43.c @@ -107,20 +107,11 @@ EXPORT_SYMBOL_NS(asoc_sdw_cs42l43_hs_rtd_init, "SND_SOC_SDW_UTILS"); int asoc_sdw_cs42l43_spk_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_soc_dai *dai) { + struct snd_soc_component *component = dai->component; struct snd_soc_card *card = rtd->card; struct snd_soc_dapm_context *dapm = snd_soc_card_to_dapm(card); - struct asoc_sdw_mc_private *ctx = snd_soc_card_get_drvdata(card); int ret; - if (!(ctx->mc_quirk & SOC_SDW_SIDECAR_AMPS)) { - /* Will be set by the bridge code in this case */ - card->components = devm_kasprintf(card->dev, GFP_KERNEL, - "%s spk:cs42l43-spk", - card->components); - if (!card->components) - return -ENOMEM; - } - ret = snd_soc_limit_volume(card, "cs42l43 Speaker Digital Volume", CS42L43_SPK_VOLUME_0DB); if (ret) @@ -131,8 +122,15 @@ int asoc_sdw_cs42l43_spk_rtd_init(struct snd_soc_pcm_runtime *rtd, struct snd_so ret = snd_soc_dapm_add_routes(dapm, cs42l43_spk_map, ARRAY_SIZE(cs42l43_spk_map)); - if (ret) + if (ret) { dev_err(card->dev, "cs42l43 speaker map addition failed: %d\n", ret); + return ret; + } + + ret = snd_soc_component_set_sysclk(component, CS42L43_SYSCLK, CS42L43_SYSCLK_SDW, + 0, SND_SOC_CLOCK_IN); + if (ret) + dev_err(card->dev, "Failed to set sysclk: %d\n", ret); return ret; } diff --git a/sound/soc/sdw_utils/soc_sdw_utils.c b/sound/soc/sdw_utils/soc_sdw_utils.c index 0e67d9f34cba..1b897c8c2c2c 100644 --- a/sound/soc/sdw_utils/soc_sdw_utils.c +++ b/sound/soc/sdw_utils/soc_sdw_utils.c @@ -189,6 +189,8 @@ struct asoc_sdw_codec_info codec_info_list[] = { .dai_type = SOC_SDW_DAI_TYPE_MIC, .dailink = {SOC_SDW_UNUSED_DAI_ID, SOC_SDW_DMIC_DAI_ID}, .rtd_init = asoc_sdw_rt_dmic_rtd_init, + .quirk = SOC_SDW_CODEC_MIC, + .quirk_exclude = true, }, }, .dai_num = 3, @@ -461,6 +463,8 @@ struct asoc_sdw_codec_info codec_info_list[] = { .dai_type = SOC_SDW_DAI_TYPE_MIC, .dailink = {SOC_SDW_UNUSED_DAI_ID, SOC_SDW_DMIC_DAI_ID}, .rtd_init = asoc_sdw_rt_dmic_rtd_init, + .quirk = SOC_SDW_CODEC_MIC, + .quirk_exclude = true, }, }, .dai_num = 3, @@ -709,6 +713,7 @@ struct asoc_sdw_codec_info codec_info_list[] = { { .direction = {true, false}, .codec_name = "cs42l43-codec", + .component_name = "cs42l43-spk", .dai_name = "cs42l43-dp6", .dai_type = SOC_SDW_DAI_TYPE_AMP, .dailink = {SOC_SDW_AMP_OUT_DAI_ID, SOC_SDW_UNUSED_DAI_ID}, @@ -918,11 +923,12 @@ static int asoc_sdw_find_codec_info_dai_index(const struct asoc_sdw_codec_info * int asoc_sdw_rtd_init(struct snd_soc_pcm_runtime *rtd) { struct snd_soc_card *card = rtd->card; + struct asoc_sdw_mc_private *ctx = snd_soc_card_get_drvdata(card); struct snd_soc_dapm_context *dapm = snd_soc_card_to_dapm(card); struct asoc_sdw_codec_info *codec_info; struct snd_soc_dai *dai; struct sdw_slave *sdw_peripheral; - const char *spk_components=""; + const char *spk_components = NULL; int dai_index; int ret; int i; @@ -993,23 +999,35 @@ skip_add_controls_widgets: /* Generate the spk component string for card->components string */ if (codec_info->dais[dai_index].dai_type == SOC_SDW_DAI_TYPE_AMP && codec_info->dais[dai_index].component_name) { - if (strlen (spk_components) == 0) + const char *component; + + /* + * For the special case of cs42l43 with sidecar amps, use only + * "cs35l56-bridge" as the component name in card->components + */ + if (ctx->mc_quirk & SOC_SDW_SIDECAR_AMPS && + !strcmp(codec_info->dais[dai_index].component_name, "cs42l43-spk")) + component = "cs35l56-bridge"; + else + component = codec_info->dais[dai_index].component_name; + + if (!spk_components) spk_components = - devm_kasprintf(card->dev, GFP_KERNEL, "%s", - codec_info->dais[dai_index].component_name); + devm_kasprintf(card->dev, GFP_KERNEL, "%s", component); else /* Append component name to spk_components */ spk_components = devm_kasprintf(card->dev, GFP_KERNEL, - "%s+%s", spk_components, - codec_info->dais[dai_index].component_name); + "%s+%s", spk_components, component); + + if (!spk_components) + return -ENOMEM; } codec_info->dais[dai_index].rtd_init_done = true; - } - if (strlen (spk_components) > 0) { + if (spk_components) { /* Update card components for speaker components */ card->components = devm_kasprintf(card->dev, GFP_KERNEL, "%s spk:%s", card->components, spk_components); diff --git a/sound/soc/soc-utils.c b/sound/soc/soc-utils.c index c8adfff826bd..9cb7567e263e 100644 --- a/sound/soc/soc-utils.c +++ b/sound/soc/soc-utils.c @@ -36,6 +36,7 @@ int snd_soc_ret(const struct device *dev, int ret, const char *fmt, ...) vaf.va = &args; dev_err(dev, "ASoC error (%d): %pV", ret, &vaf); + va_end(args); } return ret; diff --git a/sound/soc/sof/amd/acp.c b/sound/soc/sof/amd/acp.c index 71a18f156de2..f615b8d1c802 100644 --- a/sound/soc/sof/amd/acp.c +++ b/sound/soc/sof/amd/acp.c @@ -223,7 +223,7 @@ static int psp_send_cmd(struct acp_dev_data *adata, int cmd) { struct snd_sof_dev *sdev = adata->dev; int ret; - u32 data; + int data; if (!cmd) return -EINVAL; diff --git a/sound/usb/misc/ua101.c b/sound/usb/misc/ua101.c index d129b42eb979..b9a62e94e06c 100644 --- a/sound/usb/misc/ua101.c +++ b/sound/usb/misc/ua101.c @@ -894,8 +894,9 @@ find_format_descriptor(struct usb_interface *interface) struct uac_format_type_i_discrete_descriptor *desc; desc = (struct uac_format_type_i_discrete_descriptor *)extra; - if (desc->bLength > extralen) { - dev_err(&interface->dev, "descriptor overflow\n"); + if (desc->bLength < sizeof(struct usb_descriptor_header) || + desc->bLength > extralen) { + dev_err(&interface->dev, "invalid descriptor length\n"); return NULL; } if (desc->bLength == UAC_FORMAT_TYPE_I_DISCRETE_DESC_SIZE(1) && diff --git a/sound/usb/mixer_scarlett2.c b/sound/usb/mixer_scarlett2.c index 8eaa96222759..8e80a7165faf 100644 --- a/sound/usb/mixer_scarlett2.c +++ b/sound/usb/mixer_scarlett2.c @@ -6707,6 +6707,8 @@ static int scarlett2_add_line_in_ctls(struct usb_mixer_interface *mixer) err = scarlett2_add_new_ctl( mixer, &scarlett2_autogain_status_ctl, i, 1, s, &private->autogain_status_ctls[i]); + if (err < 0) + return err; } /* Add autogain target controls */ @@ -9185,12 +9187,15 @@ static long scarlett2_hwdep_write(struct snd_hwdep *hw, flash_size = private->flash_segment_blocks[segment_id] * SCARLETT2_FLASH_BLOCK_SIZE; - if (count < 0 || *offset < 0 || *offset + count >= flash_size) - return -ENOSPC; + if (count < 0 || *offset < 0) + return -EINVAL; if (!count) return 0; + if (*offset >= flash_size || count > flash_size - *offset) + return -ENOSPC; + /* Limit the *req size to SCARLETT2_FLASH_RW_MAX */ if (count > max_data_size) count = max_data_size; diff --git a/tools/testing/selftests/mm/hmm-tests.c b/tools/testing/selftests/mm/hmm-tests.c index 788689497e92..77fb4c5d871b 100644 --- a/tools/testing/selftests/mm/hmm-tests.c +++ b/tools/testing/selftests/mm/hmm-tests.c @@ -986,6 +986,56 @@ TEST_F(hmm, migrate) } /* + * Migrate private file memory to device private memory. + */ +TEST_F(hmm, migrate_file_private) +{ + struct hmm_buffer *buffer; + unsigned long npages; + unsigned long size; + unsigned long i; + int *ptr; + int ret; + int fd; + + npages = ALIGN(HMM_BUFFER_SIZE, self->page_size) >> self->page_shift; + ASSERT_NE(npages, 0); + size = npages << self->page_shift; + + fd = hmm_create_file(size); + ASSERT_GE(fd, 0); + + buffer = malloc(sizeof(*buffer)); + ASSERT_NE(buffer, NULL); + + buffer->fd = fd; + buffer->size = size; + buffer->mirror = malloc(size); + ASSERT_NE(buffer->mirror, NULL); + + buffer->ptr = mmap(NULL, size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE, + buffer->fd, 0); + ASSERT_NE(buffer->ptr, MAP_FAILED); + + /* Initialize buffer in system memory. */ + for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i) + ptr[i] = i; + + /* Migrate memory to device. */ + ret = hmm_migrate_sys_to_dev(self->fd, buffer, npages); + ASSERT_EQ(ret, 0); + ASSERT_EQ(buffer->cpages, npages); + + /* Check what the device read. */ + for (i = 0, ptr = buffer->mirror; i < size / sizeof(*ptr); ++i) + ASSERT_EQ(ptr[i], i); + + hmm_buffer_free(buffer); +} + +/* * Migrate anonymous memory to device private memory and fault some of it back * to system memory, then try migrating the resulting mix of system and device * private memory to the device. diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh index afdcfd0d7cef..55fc4b46ecbe 100755 --- a/tools/testing/selftests/mm/run_vmtests.sh +++ b/tools/testing/selftests/mm/run_vmtests.sh @@ -103,7 +103,7 @@ RUN_ALL=false RUN_DESTRUCTIVE=false TAP_PREFIX="# " -while getopts "aht:n" OPT; do +while getopts "aht:nd" OPT; do case ${OPT} in "a") RUN_ALL=true ;; "h") usage ;; diff --git a/tools/testing/selftests/net/lib/xdp_native.bpf.c b/tools/testing/selftests/net/lib/xdp_native.bpf.c index 64f05229ab24..ded3f896e622 100644 --- a/tools/testing/selftests/net/lib/xdp_native.bpf.c +++ b/tools/testing/selftests/net/lib/xdp_native.bpf.c @@ -268,6 +268,17 @@ static int xdp_mode_tx_handler(struct xdp_md *ctx, __u16 port) return XDP_PASS; } +static __always_inline __u16 csum_fold_helper(__u32 csum) +{ + csum = (csum & 0xffff) + (csum >> 16); + return ~((csum & 0xffff) + (csum >> 16)); +} + +static __always_inline __u16 csum_fold_udp_helper(__u32 csum) +{ + return csum_fold_helper(csum) ? : 0xffff; +} + static void *update_pkt(struct xdp_md *ctx, __s16 offset, __u32 *udp_csum) { void *data_end = (void *)(long)ctx->data_end; @@ -281,21 +292,22 @@ static void *update_pkt(struct xdp_md *ctx, __s16 offset, __u32 *udp_csum) if (eth->h_proto == bpf_htons(ETH_P_IP)) { struct iphdr *iph = data + sizeof(*eth); - __u16 total_len; if (iph + 1 > (struct iphdr *)data_end) return NULL; - iph->tot_len = bpf_htons(bpf_ntohs(iph->tot_len) + offset); - udph = (void *)eth + sizeof(*iph) + sizeof(*eth); if (!udph || udph + 1 > (struct udphdr *)data_end) return NULL; - len_new = bpf_htons(bpf_ntohs(udph->len) + offset); + len = iph->tot_len; + len_new = bpf_htons(bpf_ntohs(len) + offset); + iph->tot_len = len_new; + iph->check = csum_fold_helper( + bpf_csum_diff(&len, sizeof(len), &len_new, + sizeof(len_new), ~((__u32)iph->check))); } else if (eth->h_proto == bpf_htons(ETH_P_IPV6)) { struct ipv6hdr *ipv6h = data + sizeof(*eth); - __u16 payload_len; if (ipv6h + 1 > (struct ipv6hdr *)data_end) return NULL; @@ -304,33 +316,27 @@ static void *update_pkt(struct xdp_md *ctx, __s16 offset, __u32 *udp_csum) if (!udph || udph + 1 > (struct udphdr *)data_end) return NULL; - *udp_csum = ~((__u32)udph->check); - len = ipv6h->payload_len; len_new = bpf_htons(bpf_ntohs(len) + offset); ipv6h->payload_len = len_new; - - *udp_csum = bpf_csum_diff(&len, sizeof(len), &len_new, - sizeof(len_new), *udp_csum); - - len = udph->len; - len_new = bpf_htons(bpf_ntohs(udph->len) + offset); - *udp_csum = bpf_csum_diff(&len, sizeof(len), &len_new, - sizeof(len_new), *udp_csum); } else { return NULL; } + len = udph->len; + len_new = bpf_htons(bpf_ntohs(len) + offset); + + *udp_csum = ~((__u32)udph->check); + *udp_csum = bpf_csum_diff(&len, sizeof(len), &len_new, + sizeof(len_new), *udp_csum); + *udp_csum = bpf_csum_diff(&len, sizeof(len), &len_new, + sizeof(len_new), *udp_csum); + udph->len = len_new; return udph; } -static __u16 csum_fold_helper(__u32 csum) -{ - return ~((csum & 0xffff) + (csum >> 16)) ? : 0xffff; -} - static int xdp_adjst_tail_shrnk_data(struct xdp_md *ctx, __u16 offset, unsigned long hdr_len) { @@ -359,7 +365,7 @@ static int xdp_adjst_tail_shrnk_data(struct xdp_md *ctx, __u16 offset, return -1; udp_csum = bpf_csum_diff((__be32 *)tmp_buff, offset, 0, 0, udp_csum); - udph->check = (__u16)csum_fold_helper(udp_csum); + udph->check = (__u16)csum_fold_udp_helper(udp_csum); if (bpf_xdp_adjust_tail(ctx, 0 - offset) < 0) return -1; @@ -403,7 +409,7 @@ static int xdp_adjst_tail_grow_data(struct xdp_md *ctx, __u16 offset) return -1; udp_csum = bpf_csum_diff(0, 0, (__be32 *)tmp_buff, offset, udp_csum); - udph->check = (__u16)csum_fold_helper(udp_csum); + udph->check = (__u16)csum_fold_udp_helper(udp_csum); buff_len = bpf_xdp_get_buff_len(ctx); @@ -484,8 +490,7 @@ static int xdp_adjst_head_shrnk_data(struct xdp_md *ctx, __u64 hdr_len, return -1; udp_csum = bpf_csum_diff((__be32 *)tmp_buff, offset, 0, 0, udp_csum); - - udph->check = (__u16)csum_fold_helper(udp_csum); + udph->check = (__u16)csum_fold_udp_helper(udp_csum); if (bpf_xdp_load_bytes(ctx, 0, tmp_buff, MAX_ADJST_OFFSET) < 0) return -1; @@ -542,7 +547,7 @@ static int xdp_adjst_head_grow_data(struct xdp_md *ctx, __u64 hdr_len, return -1; udp_csum = bpf_csum_diff(0, 0, (__be32 *)data_buff, offset, udp_csum); - udph->check = (__u16)csum_fold_helper(udp_csum); + udph->check = (__u16)csum_fold_udp_helper(udp_csum); if (hdr_len > MAX_ADJST_OFFSET || hdr_len == 0) return -1; diff --git a/tools/testing/selftests/net/mptcp/mptcp_connect.sh b/tools/testing/selftests/net/mptcp/mptcp_connect.sh index a6447f7a31fe..d158678fa6ab 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_connect.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_connect.sh @@ -401,7 +401,7 @@ do_transfer() mptcp_lib_wait_local_port_listen "${listener_ns}" "${port}" local start - start=$(date +%s%3N) + start=$(date +%s%N) ip netns exec ${connector_ns} \ ./mptcp_connect -t ${timeout_poll} -p $port -s ${cl_proto} \ $extra_args $connect_addr < "$cin" > "$cout" & @@ -423,7 +423,7 @@ do_transfer() fi local stop - stop=$(date +%s%3N) + stop=$(date +%s%N) if $capture; then sleep 1 @@ -439,7 +439,7 @@ do_transfer() fi local duration - duration=$((stop-start)) + duration=$(((stop-start) / 1000000)) printf "(duration %05sms) " "${duration}" if [ ${rets} -ne 0 ] || [ ${retc} -ne 0 ] || [ ${timeout_pid} -ne 0 ]; then mptcp_lib_pr_fail "client exit code $retc, server $rets" diff --git a/tools/testing/selftests/net/mptcp/mptcp_lib.sh b/tools/testing/selftests/net/mptcp/mptcp_lib.sh index 989a5975dcea..5ef6033775c8 100644 --- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh @@ -28,7 +28,7 @@ declare -rx MPTCP_LIB_AF_INET6=10 MPTCP_LIB_SUBTESTS=() MPTCP_LIB_SUBTESTS_DUPLICATED=0 MPTCP_LIB_SUBTEST_FLAKY=0 -MPTCP_LIB_SUBTESTS_LAST_TS_MS= +MPTCP_LIB_SUBTESTS_LAST_TS_NS= MPTCP_LIB_TEST_COUNTER=0 MPTCP_LIB_TEST_FORMAT="%02u %-50s" MPTCP_LIB_IP_MPTCP=0 @@ -236,7 +236,7 @@ mptcp_lib_kversion_ge() { } mptcp_lib_subtests_last_ts_reset() { - MPTCP_LIB_SUBTESTS_LAST_TS_MS="$(date +%s%3N)" + MPTCP_LIB_SUBTESTS_LAST_TS_NS="$(date +%s%N)" } mptcp_lib_subtests_last_ts_reset @@ -255,7 +255,7 @@ __mptcp_lib_result_check_duplicated() { __mptcp_lib_result_add() { local result="${1}" local time="time=" - local ts_prev_ms + local ts_prev_ns shift local id=$((${#MPTCP_LIB_SUBTESTS[@]} + 1)) @@ -265,9 +265,9 @@ __mptcp_lib_result_add() { # not to add two '#' [[ "${*}" != *"#"* ]] && time="# ${time}" - ts_prev_ms="${MPTCP_LIB_SUBTESTS_LAST_TS_MS}" + ts_prev_ns="${MPTCP_LIB_SUBTESTS_LAST_TS_NS}" mptcp_lib_subtests_last_ts_reset - time+="$((MPTCP_LIB_SUBTESTS_LAST_TS_MS - ts_prev_ms))ms" + time+="$(((MPTCP_LIB_SUBTESTS_LAST_TS_NS - ts_prev_ns) / 1000000))ms" MPTCP_LIB_SUBTESTS+=("${result} ${id} - ${KSFT_TEST}: ${*} ${time}") } diff --git a/tools/testing/selftests/ublk/kublk.c b/tools/testing/selftests/ublk/kublk.c index e1c3b3c55e56..c40aa7952b6e 100644 --- a/tools/testing/selftests/ublk/kublk.c +++ b/tools/testing/selftests/ublk/kublk.c @@ -1395,6 +1395,17 @@ static int __cmd_dev_add(const struct dev_ctx *ctx) goto fail; } + /* + * The kernel may reduce nr_hw_queues (e.g. capped to nr_cpu_ids). + * Cap nthreads to the actual queue count to avoid creating extra + * handler threads that will hang during device removal. + * + * per_io_tasks mode is excluded: threads interleave across all + * queues so nthreads > nr_hw_queues is valid and intentional. + */ + if (!ctx->per_io_tasks && dev->nthreads > info->nr_hw_queues) + dev->nthreads = info->nr_hw_queues; + ret = ublk_start_daemon(ctx, dev); ublk_dbg(UBLK_DBG_DEV, "%s: daemon exit %d\n", __func__, ret); if (ret < 0) |
