<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/linux/cpuhotplug.h, branch v5.13.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Merge tag 'dmaengine-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine</title>
<updated>2021-05-04T18:24:46+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-05-04T18:24:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e4adffb8daf476a01e7b4a55f586dc8c26e81392'/>
<id>e4adffb8daf476a01e7b4a55f586dc8c26e81392</id>
<content type='text'>
Pull dmaengine updates from Vinod Koul:
 "New drivers/devices:

   - Support for QCOM SM8150 GPI DMA

  Updates:

   - Big pile of idxd updates including support for performance
     monitoring

   - Support in dw-edma for interleaved dma

   - Support for synchronize() in Xilinx driver"

* tag 'dmaengine-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (42 commits)
  dmaengine: idxd: Enable IDXD performance monitor support
  dmaengine: idxd: Add IDXD performance monitor support
  dmaengine: idxd: remove MSIX masking for interrupt handlers
  dmaengine: idxd: device cmd should use dedicated lock
  dmaengine: idxd: support reporting of halt interrupt
  dmaengine: idxd: enable SVA feature for IOMMU
  dmaengine: idxd: convert sprintf() to sysfs_emit() for all usages
  dmaengine: idxd: add interrupt handle request and release support
  dmaengine: idxd: add support for readonly config mode
  dmaengine: idxd: add percpu_ref to descriptor submission path
  dmaengine: idxd: remove detection of device type
  dmaengine: idxd: iax bus removal
  dmaengine: idxd: fix cdev setup and free device lifetime issues
  dmaengine: idxd: fix group conf_dev lifetime
  dmaengine: idxd: fix engine conf_dev lifetime
  dmaengine: idxd: fix wq conf_dev 'struct device' lifetime
  dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime
  dmaengine: idxd: use ida for device instance enumeration
  dmaengine: idxd: removal of pcim managed mmio mapping
  dmaengine: idxd: cleanup pci interrupt vector allocation management
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull dmaengine updates from Vinod Koul:
 "New drivers/devices:

   - Support for QCOM SM8150 GPI DMA

  Updates:

   - Big pile of idxd updates including support for performance
     monitoring

   - Support in dw-edma for interleaved dma

   - Support for synchronize() in Xilinx driver"

* tag 'dmaengine-5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (42 commits)
  dmaengine: idxd: Enable IDXD performance monitor support
  dmaengine: idxd: Add IDXD performance monitor support
  dmaengine: idxd: remove MSIX masking for interrupt handlers
  dmaengine: idxd: device cmd should use dedicated lock
  dmaengine: idxd: support reporting of halt interrupt
  dmaengine: idxd: enable SVA feature for IOMMU
  dmaengine: idxd: convert sprintf() to sysfs_emit() for all usages
  dmaengine: idxd: add interrupt handle request and release support
  dmaengine: idxd: add support for readonly config mode
  dmaengine: idxd: add percpu_ref to descriptor submission path
  dmaengine: idxd: remove detection of device type
  dmaengine: idxd: iax bus removal
  dmaengine: idxd: fix cdev setup and free device lifetime issues
  dmaengine: idxd: fix group conf_dev lifetime
  dmaengine: idxd: fix engine conf_dev lifetime
  dmaengine: idxd: fix wq conf_dev 'struct device' lifetime
  dmaengine: idxd: fix idxd conf_dev 'struct device' lifetime
  dmaengine: idxd: use ida for device instance enumeration
  dmaengine: idxd: removal of pcim managed mmio mapping
  dmaengine: idxd: cleanup pci interrupt vector allocation management
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'iommu-updates-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu</title>
<updated>2021-05-01T16:33:00+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-05-01T16:33:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4f9701057a9cc1ae6bfc533204c9d3ba386687de'/>
<id>4f9701057a9cc1ae6bfc533204c9d3ba386687de</id>
<content type='text'>
Pull iommu updates from Joerg Roedel:

 - Big cleanup of almost unsused parts of the IOMMU API by Christoph
   Hellwig. This mostly affects the Freescale PAMU driver.

 - New IOMMU driver for Unisoc SOCs

 - ARM SMMU Updates from Will:
     - Drop vestigial PREFETCH_ADDR support (SMMUv3)
     - Elide TLB sync logic for empty gather (SMMUv3)
     - Fix "Service Failure Mode" handling (SMMUv3)
     - New Qualcomm compatible string (SMMUv2)

 - Removal of the AMD IOMMU performance counter writeable check on AMD.
   It caused long boot delays on some machines and is only needed to
   work around an errata on some older (possibly pre-production) chips.
   If someone is still hit by this hardware issue anyway the performance
   counters will just return 0.

 - Support for targeted invalidations in the AMD IOMMU driver. Before
   that the driver only invalidated a single 4k page or the whole IO/TLB
   for an address space. This has been extended now and is mostly useful
   for emulated AMD IOMMUs.

 - Several fixes for the Shared Virtual Memory support in the Intel VT-d
   driver

 - Mediatek drivers can now be built as modules

 - Re-introduction of the forcedac boot option which got lost when
   converting the Intel VT-d driver to the common dma-iommu
   implementation.

 - Extension of the IOMMU device registration interface and support
   iommu_ops to be const again when drivers are built as modules.

* tag 'iommu-updates-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (84 commits)
  iommu: Streamline registration interface
  iommu: Statically set module owner
  iommu/mediatek-v1: Add error handle for mtk_iommu_probe
  iommu/mediatek-v1: Avoid build fail when build as module
  iommu/mediatek: Always enable the clk on resume
  iommu/fsl-pamu: Fix uninitialized variable warning
  iommu/vt-d: Force to flush iotlb before creating superpage
  iommu/amd: Put newline after closing bracket in warning
  iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()'
  iommu/vt-d: Fix build error of pasid_enable_wpe() with !X86
  iommu/amd: Remove performance counter pre-initialization test
  Revert "iommu/amd: Fix performance counter initialization"
  iommu/amd: Remove duplicate check of devid
  iommu/exynos: Remove unneeded local variable initialization
  iommu/amd: Page-specific invalidations for more than one page
  iommu/arm-smmu-v3: Remove the unused fields for PREFETCH_CONFIG command
  iommu/vt-d: Avoid unnecessary cache flush in pasid entry teardown
  iommu/vt-d: Invalidate PASID cache when root/context entry changed
  iommu/vt-d: Remove WO permissions on second-level paging entries
  iommu/vt-d: Report the right page fault address
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull iommu updates from Joerg Roedel:

 - Big cleanup of almost unsused parts of the IOMMU API by Christoph
   Hellwig. This mostly affects the Freescale PAMU driver.

 - New IOMMU driver for Unisoc SOCs

 - ARM SMMU Updates from Will:
     - Drop vestigial PREFETCH_ADDR support (SMMUv3)
     - Elide TLB sync logic for empty gather (SMMUv3)
     - Fix "Service Failure Mode" handling (SMMUv3)
     - New Qualcomm compatible string (SMMUv2)

 - Removal of the AMD IOMMU performance counter writeable check on AMD.
   It caused long boot delays on some machines and is only needed to
   work around an errata on some older (possibly pre-production) chips.
   If someone is still hit by this hardware issue anyway the performance
   counters will just return 0.

 - Support for targeted invalidations in the AMD IOMMU driver. Before
   that the driver only invalidated a single 4k page or the whole IO/TLB
   for an address space. This has been extended now and is mostly useful
   for emulated AMD IOMMUs.

 - Several fixes for the Shared Virtual Memory support in the Intel VT-d
   driver

 - Mediatek drivers can now be built as modules

 - Re-introduction of the forcedac boot option which got lost when
   converting the Intel VT-d driver to the common dma-iommu
   implementation.

 - Extension of the IOMMU device registration interface and support
   iommu_ops to be const again when drivers are built as modules.

* tag 'iommu-updates-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (84 commits)
  iommu: Streamline registration interface
  iommu: Statically set module owner
  iommu/mediatek-v1: Add error handle for mtk_iommu_probe
  iommu/mediatek-v1: Avoid build fail when build as module
  iommu/mediatek: Always enable the clk on resume
  iommu/fsl-pamu: Fix uninitialized variable warning
  iommu/vt-d: Force to flush iotlb before creating superpage
  iommu/amd: Put newline after closing bracket in warning
  iommu/vt-d: Fix an error handling path in 'intel_prepare_irq_remapping()'
  iommu/vt-d: Fix build error of pasid_enable_wpe() with !X86
  iommu/amd: Remove performance counter pre-initialization test
  Revert "iommu/amd: Fix performance counter initialization"
  iommu/amd: Remove duplicate check of devid
  iommu/exynos: Remove unneeded local variable initialization
  iommu/amd: Page-specific invalidations for more than one page
  iommu/arm-smmu-v3: Remove the unused fields for PREFETCH_CONFIG command
  iommu/vt-d: Avoid unnecessary cache flush in pasid entry teardown
  iommu/vt-d: Invalidate PASID cache when root/context entry changed
  iommu/vt-d: Remove WO permissions on second-level paging entries
  iommu/vt-d: Report the right page fault address
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'arm-apple-m1-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc</title>
<updated>2021-04-26T19:30:36+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-26T19:30:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0c855563182001c829065faa17f8e29e9ceffe13'/>
<id>0c855563182001c829065faa17f8e29e9ceffe13</id>
<content type='text'>
Pull ARM Apple M1 platform support from Arnd Bergmann:
 "The Apple M1 is the processor used it all current generation Apple
  Macintosh computers. Support for this platform so far is rudimentary,
  but it boots and can use framebuffer and serial console over a special
  USB cable.

  Support for several essential on-chip devices (USB, PCIe, IOMMU, NVMe)
  is work in progress but was not ready in time.

  A very detailed description of what works is in the commit message of
  commit 1bb2fd3880d4 ("Merge tag 'm1-soc-bringup-v5' [..]") and on the
  AsahiLinux wiki"

Link: https://lore.kernel.org/linux-arm-kernel/bdb18e9f-fcd7-1e31-2224-19c0e5090706@marcan.st/

* tag 'arm-apple-m1-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  asm-generic/io.h: Unbork ioremap_np() declaration
  arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree
  dt-bindings: display: Add apple,simple-framebuffer
  arm64: Kconfig: Introduce CONFIG_ARCH_APPLE
  irqchip/apple-aic: Add support for the Apple Interrupt Controller
  dt-bindings: interrupt-controller: Add DT bindings for apple-aic
  arm64: Move ICH_ sysreg bits from arm-gic-v3.h to sysreg.h
  of/address: Add infrastructure to declare MMIO as non-posted
  asm-generic/io.h: implement pci_remap_cfgspace using ioremap_np
  arm64: Implement ioremap_np() to map MMIO as nGnRnE
  docs: driver-api: device-io: Document ioremap() variants &amp; access funcs
  docs: driver-api: device-io: Document I/O access functions
  asm-generic/io.h:  Add a non-posted variant of ioremap()
  arm64: arch_timer: Implement support for interrupt-names
  dt-bindings: timer: arm,arch_timer: Add interrupt-names support
  arm64: cputype: Add CPU implementor &amp; types for the Apple M1 cores
  dt-bindings: arm: cpus: Add apple,firestorm &amp; icestorm compatibles
  dt-bindings: arm: apple: Add bindings for Apple ARM platforms
  dt-bindings: vendor-prefixes: Add apple prefix
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull ARM Apple M1 platform support from Arnd Bergmann:
 "The Apple M1 is the processor used it all current generation Apple
  Macintosh computers. Support for this platform so far is rudimentary,
  but it boots and can use framebuffer and serial console over a special
  USB cable.

  Support for several essential on-chip devices (USB, PCIe, IOMMU, NVMe)
  is work in progress but was not ready in time.

  A very detailed description of what works is in the commit message of
  commit 1bb2fd3880d4 ("Merge tag 'm1-soc-bringup-v5' [..]") and on the
  AsahiLinux wiki"

Link: https://lore.kernel.org/linux-arm-kernel/bdb18e9f-fcd7-1e31-2224-19c0e5090706@marcan.st/

* tag 'arm-apple-m1-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  asm-generic/io.h: Unbork ioremap_np() declaration
  arm64: apple: Add initial Apple Mac mini (M1, 2020) devicetree
  dt-bindings: display: Add apple,simple-framebuffer
  arm64: Kconfig: Introduce CONFIG_ARCH_APPLE
  irqchip/apple-aic: Add support for the Apple Interrupt Controller
  dt-bindings: interrupt-controller: Add DT bindings for apple-aic
  arm64: Move ICH_ sysreg bits from arm-gic-v3.h to sysreg.h
  of/address: Add infrastructure to declare MMIO as non-posted
  asm-generic/io.h: implement pci_remap_cfgspace using ioremap_np
  arm64: Implement ioremap_np() to map MMIO as nGnRnE
  docs: driver-api: device-io: Document ioremap() variants &amp; access funcs
  docs: driver-api: device-io: Document I/O access functions
  asm-generic/io.h:  Add a non-posted variant of ioremap()
  arm64: arch_timer: Implement support for interrupt-names
  dt-bindings: timer: arm,arch_timer: Add interrupt-names support
  arm64: cputype: Add CPU implementor &amp; types for the Apple M1 cores
  dt-bindings: arm: cpus: Add apple,firestorm &amp; icestorm compatibles
  dt-bindings: arm: apple: Add bindings for Apple ARM platforms
  dt-bindings: vendor-prefixes: Add apple prefix
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux</title>
<updated>2021-04-26T17:25:03+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-26T17:25:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=31a24ae89c92d5533c049046a76c6a2d649efb72'/>
<id>31a24ae89c92d5533c049046a76c6a2d649efb72</id>
<content type='text'>
Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not
   allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON
   in softirq context for crypto performance improvements. The
   conditional yield support is modified to take softirqs into account
   and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers,
   new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.

 - Disable fine-grained traps at boot and improve the documented boot
   requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull arm64 updates from Catalin Marinas:

 - MTE asynchronous support for KASan. Previously only synchronous
   (slower) mode was supported. Asynchronous is faster but does not
   allow precise identification of the illegal access.

 - Run kernel mode SIMD with softirqs disabled. This allows using NEON
   in softirq context for crypto performance improvements. The
   conditional yield support is modified to take softirqs into account
   and reduce the latency.

 - Preparatory patches for Apple M1: handle CPUs that only have the VHE
   mode available (host kernel running at EL2), add FIQ support.

 - arm64 perf updates: support for HiSilicon PA and SLLC PMU drivers,
   new functions for the HiSilicon HHA and L3C PMU, cleanups.

 - Re-introduce support for execute-only user permissions but only when
   the EPAN (Enhanced Privileged Access Never) architecture feature is
   available.

 - Disable fine-grained traps at boot and improve the documented boot
   requirements.

 - Support CONFIG_KASAN_VMALLOC on arm64 (only with KASAN_GENERIC).

 - Add hierarchical eXecute Never permissions for all page tables.

 - Add arm64 prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) allowing user programs
   to control which PAC keys are enabled in a particular task.

 - arm64 kselftests for BTI and some improvements to the MTE tests.

 - Minor improvements to the compat vdso and sigpage.

 - Miscellaneous cleanups.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (86 commits)
  arm64/sve: Add compile time checks for SVE hooks in generic functions
  arm64/kernel/probes: Use BUG_ON instead of if condition followed by BUG.
  arm64: pac: Optimize kernel entry/exit key installation code paths
  arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS)
  arm64: mte: make the per-task SCTLR_EL1 field usable elsewhere
  arm64/sve: Remove redundant system_supports_sve() tests
  arm64: fpsimd: run kernel mode NEON with softirqs disabled
  arm64: assembler: introduce wxN aliases for wN registers
  arm64: assembler: remove conditional NEON yield macros
  kasan, arm64: tests supports for HW_TAGS async mode
  arm64: mte: Report async tag faults before suspend
  arm64: mte: Enable async tag check fault
  arm64: mte: Conditionally compile mte_enable_kernel_*()
  arm64: mte: Enable TCO in functions that can read beyond buffer limits
  kasan: Add report for async mode
  arm64: mte: Drop arch_enable_tagging()
  kasan: Add KASAN mode kernel parameter
  arm64: mte: Add asynchronous mode support
  arm64: Get rid of CONFIG_ARM64_VHE
  arm64: Cope with CPUs stuck in VHE mode
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>dmaengine: idxd: Add IDXD performance monitor support</title>
<updated>2021-04-25T16:16:12+00:00</updated>
<author>
<name>Tom Zanussi</name>
<email>tom.zanussi@linux.intel.com</email>
</author>
<published>2021-04-24T15:04:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=81dd4d4d6178306ab31db91bdc7353d485bdafce'/>
<id>81dd4d4d6178306ab31db91bdc7353d485bdafce</id>
<content type='text'>
Implement the IDXD performance monitor capability (named 'perfmon' in
the DSA (Data Streaming Accelerator) spec [1]), which supports the
collection of information about key events occurring during DSA and
IAX (Intel Analytics Accelerator) device execution, to assist in
performance tuning and debugging.

The idxd perfmon support is implemented as part of the IDXD driver and
interfaces with the Linux perf framework.  It has several features in
common with the existing uncore pmu support:

  - it does not support sampling
  - does not support per-thread counting

However it also has some unique features not present in the core and
uncore support:

  - all general-purpose counters are identical, thus no event constraints
  - operation is always system-wide

While the core perf subsystem assumes that all counters are by default
per-cpu, the uncore pmus are socket-scoped and use a cpu mask to
restrict counting to one cpu from each socket.  IDXD counters use a
similar strategy but expand the scope even further; since IDXD
counters are system-wide and can be read from any cpu, the IDXD perf
driver picks a single cpu to do the work (with cpu hotplug notifiers
to choose a different cpu if the chosen one is taken off-line).

More specifically, the perf userspace tool by default opens a counter
for each cpu for an event.  However, if it finds a cpumask file
associated with the pmu under sysfs, as is the case with the uncore
pmus, it will open counters only on the cpus specified by the cpumask.
Since perfmon only needs to open a single counter per event for a
given IDXD device, the perfmon driver will create a sysfs cpumask file
for the device and insert the first cpu of the system into it.  When a
user uses perf to open an event, perf will open a single counter on
the cpu specified by the cpu mask.  This amounts to the default
system-wide rather than per-cpu counting mentioned previously for
perfmon pmu events.  In order to keep the cpu mask up-to-date, the
driver implements cpu hotplug support for multiple devices, as IDXD
usually enumerates and registers more than one idxd device.

The perfmon driver implements basic perfmon hardware capability
discovery and configuration, and is initialized by the IDXD driver's
probe function.  During initialization, the driver retrieves the total
number of supported performance counters, the pmu ID, and the device
type from idxd device, and registers itself under the Linux perf
framework.

The perf userspace tool can be used to monitor single or multiple
events depending on the given configuration, as well as event groups,
which are also supported by the perfmon driver.  The user configures
events using the perf tool command-line interface by specifying the
event and corresponding event category, along with an optional set of
filters that can be used to restrict counting to specific work queues,
traffic classes, page and transfer sizes, and engines (See [1] for
specifics).

With the configuration specified by the user, the perf tool issues a
system call passing that information to the kernel, which uses it to
initialize the specified event(s).  The event(s) are opened and
started, and following termination of the perf command, they're
stopped.  At that point, the perfmon driver will read the latest count
for the event(s), calculate the difference between the latest counter
values and previously tracked counter values, and display the final
incremental count as the event count for the cycle.  An overflow
handler registered on the IDXD irq path is used to account for counter
overflows, which are signaled by an overflow interrupt.

Below are a couple of examples of perf usage for monitoring DSA events.

The following monitors all events in the 'engine' category.  Becuuse
no filters are specified, this captures all engine events for the
workload, which in this case is 19 iterations of the work generated by
the kernel dmatest module.

Details describing the events can be found in Appendix D of [1],
Performance Monitoring Events, but briefly they are:

  event 0x1:  total input data processed, in 32-byte units
  event 0x2:  total data written, in 32-byte units
  event 0x4:  number of work descriptors that read the source
  event 0x8:  number of work descriptors that write the destination
  event 0x10: number of work descriptors dispatched from batch descriptors
  event 0x20: number of work descriptors dispatched from work queues

 # perf stat -e dsa0/event=0x1,event_category=0x1/,
                dsa0/event=0x2,event_category=0x1/,
		dsa0/event=0x4,event_category=0x1/,
		dsa0/event=0x8,event_category=0x1/,
		dsa0/event=0x10,event_category=0x1/,
		dsa0/event=0x20,event_category=0x1/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

                 5,332      dsa0/event=0x1,event_category=0x1/
                 5,327      dsa0/event=0x2,event_category=0x1/
                    19      dsa0/event=0x4,event_category=0x1/
                    19      dsa0/event=0x8,event_category=0x1/
                     0      dsa0/event=0x10,event_category=0x1/
                    19      dsa0/event=0x20,event_category=0x1/

          21.977436186 seconds time elapsed

The command below illustrates filter usage with a simple example.  It
specifies that MEM_MOVE operations should be counted for the DSA
device dsa0 (event 0x8 corresponds to the EV_MEM_MOVE event - Number
of Memory Move Descriptors, which is part of event category 0x3 -
Operations. The detailed category and event IDs are available in
Appendix D, Performance Monitoring Events, of [1]).  In addition to
the event and event category, a number of filters are also specified
(the detailed filter values are available in Chapter 6.4 (Filter
Support) of [1]), which will restrict counting to only those events
that meet all of the filter criteria.  In this case, the filters
specify that only MEM_MOVE operations that are serviced by work queue
wq0 and specifically engine number engine0 and traffic class tc0
having sizes between 0 and 4k and page size of between 0 and 1G result
in a counter hit; anything else will be filtered out and not appear in
the final count.  Note that filters are optional - any filter not
specified is assumed to be all ones and will pass anything.

 # perf stat -e dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
                filter_eng=0x1,event=0x8,event_category=0x3/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

       19      dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
               filter_eng=0x1,event=0x8,event_category=0x3/

          21.865914091 seconds time elapsed

The output above reflects that the unspecified workload resulted in
the counting of 19 MEM_MOVE operation events that met the filter
criteria.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

[ Based on work originally by Jing Lin. ]

Reviewed-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Tom Zanussi &lt;tom.zanussi@linux.intel.com&gt;
Link: https://lore.kernel.org/r/0c5080a7d541904c4ad42b848c76a1ce056ddac7.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul &lt;vkoul@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Implement the IDXD performance monitor capability (named 'perfmon' in
the DSA (Data Streaming Accelerator) spec [1]), which supports the
collection of information about key events occurring during DSA and
IAX (Intel Analytics Accelerator) device execution, to assist in
performance tuning and debugging.

The idxd perfmon support is implemented as part of the IDXD driver and
interfaces with the Linux perf framework.  It has several features in
common with the existing uncore pmu support:

  - it does not support sampling
  - does not support per-thread counting

However it also has some unique features not present in the core and
uncore support:

  - all general-purpose counters are identical, thus no event constraints
  - operation is always system-wide

While the core perf subsystem assumes that all counters are by default
per-cpu, the uncore pmus are socket-scoped and use a cpu mask to
restrict counting to one cpu from each socket.  IDXD counters use a
similar strategy but expand the scope even further; since IDXD
counters are system-wide and can be read from any cpu, the IDXD perf
driver picks a single cpu to do the work (with cpu hotplug notifiers
to choose a different cpu if the chosen one is taken off-line).

More specifically, the perf userspace tool by default opens a counter
for each cpu for an event.  However, if it finds a cpumask file
associated with the pmu under sysfs, as is the case with the uncore
pmus, it will open counters only on the cpus specified by the cpumask.
Since perfmon only needs to open a single counter per event for a
given IDXD device, the perfmon driver will create a sysfs cpumask file
for the device and insert the first cpu of the system into it.  When a
user uses perf to open an event, perf will open a single counter on
the cpu specified by the cpu mask.  This amounts to the default
system-wide rather than per-cpu counting mentioned previously for
perfmon pmu events.  In order to keep the cpu mask up-to-date, the
driver implements cpu hotplug support for multiple devices, as IDXD
usually enumerates and registers more than one idxd device.

The perfmon driver implements basic perfmon hardware capability
discovery and configuration, and is initialized by the IDXD driver's
probe function.  During initialization, the driver retrieves the total
number of supported performance counters, the pmu ID, and the device
type from idxd device, and registers itself under the Linux perf
framework.

The perf userspace tool can be used to monitor single or multiple
events depending on the given configuration, as well as event groups,
which are also supported by the perfmon driver.  The user configures
events using the perf tool command-line interface by specifying the
event and corresponding event category, along with an optional set of
filters that can be used to restrict counting to specific work queues,
traffic classes, page and transfer sizes, and engines (See [1] for
specifics).

With the configuration specified by the user, the perf tool issues a
system call passing that information to the kernel, which uses it to
initialize the specified event(s).  The event(s) are opened and
started, and following termination of the perf command, they're
stopped.  At that point, the perfmon driver will read the latest count
for the event(s), calculate the difference between the latest counter
values and previously tracked counter values, and display the final
incremental count as the event count for the cycle.  An overflow
handler registered on the IDXD irq path is used to account for counter
overflows, which are signaled by an overflow interrupt.

Below are a couple of examples of perf usage for monitoring DSA events.

The following monitors all events in the 'engine' category.  Becuuse
no filters are specified, this captures all engine events for the
workload, which in this case is 19 iterations of the work generated by
the kernel dmatest module.

Details describing the events can be found in Appendix D of [1],
Performance Monitoring Events, but briefly they are:

  event 0x1:  total input data processed, in 32-byte units
  event 0x2:  total data written, in 32-byte units
  event 0x4:  number of work descriptors that read the source
  event 0x8:  number of work descriptors that write the destination
  event 0x10: number of work descriptors dispatched from batch descriptors
  event 0x20: number of work descriptors dispatched from work queues

 # perf stat -e dsa0/event=0x1,event_category=0x1/,
                dsa0/event=0x2,event_category=0x1/,
		dsa0/event=0x4,event_category=0x1/,
		dsa0/event=0x8,event_category=0x1/,
		dsa0/event=0x10,event_category=0x1/,
		dsa0/event=0x20,event_category=0x1/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

                 5,332      dsa0/event=0x1,event_category=0x1/
                 5,327      dsa0/event=0x2,event_category=0x1/
                    19      dsa0/event=0x4,event_category=0x1/
                    19      dsa0/event=0x8,event_category=0x1/
                     0      dsa0/event=0x10,event_category=0x1/
                    19      dsa0/event=0x20,event_category=0x1/

          21.977436186 seconds time elapsed

The command below illustrates filter usage with a simple example.  It
specifies that MEM_MOVE operations should be counted for the DSA
device dsa0 (event 0x8 corresponds to the EV_MEM_MOVE event - Number
of Memory Move Descriptors, which is part of event category 0x3 -
Operations. The detailed category and event IDs are available in
Appendix D, Performance Monitoring Events, of [1]).  In addition to
the event and event category, a number of filters are also specified
(the detailed filter values are available in Chapter 6.4 (Filter
Support) of [1]), which will restrict counting to only those events
that meet all of the filter criteria.  In this case, the filters
specify that only MEM_MOVE operations that are serviced by work queue
wq0 and specifically engine number engine0 and traffic class tc0
having sizes between 0 and 4k and page size of between 0 and 1G result
in a counter hit; anything else will be filtered out and not appear in
the final count.  Note that filters are optional - any filter not
specified is assumed to be all ones and will pass anything.

 # perf stat -e dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
                filter_eng=0x1,event=0x8,event_category=0x3/
		  modprobe dmatest channel=dma0chan0 timeout=2000
		  iterations=19 run=1 wait=1

     Performance counter stats for 'system wide':

       19      dsa0/filter_wq=0x1,filter_tc=0x1,filter_sz=0x7,
               filter_eng=0x1,event=0x8,event_category=0x3/

          21.865914091 seconds time elapsed

The output above reflects that the unspecified workload resulted in
the counting of 19 MEM_MOVE operation events that met the filter
criteria.

[1]: https://software.intel.com/content/www/us/en/develop/download/intel-data-streaming-accelerator-preliminary-architecture-specification.html

[ Based on work originally by Jing Lin. ]

Reviewed-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Tom Zanussi &lt;tom.zanussi@linux.intel.com&gt;
Link: https://lore.kernel.org/r/0c5080a7d541904c4ad42b848c76a1ce056ddac7.1619276133.git.zanussi@kernel.org
Signed-off-by: Vinod Koul &lt;vkoul@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940</title>
<updated>2021-04-08T14:41:18+00:00</updated>
<author>
<name>Tony Lindgren</name>
<email>tony@atomide.com</email>
</author>
<published>2021-03-23T07:43:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=25de4ce5ed02994aea8bc111d133308f6fd62566'/>
<id>25de4ce5ed02994aea8bc111d133308f6fd62566</id>
<content type='text'>
There is a timer wrap issue on dra7 for the ARM architected timer.
In a typical clock configuration the timer fails to wrap after 388 days.

To work around the issue, we need to use timer-ti-dm percpu timers instead.

Let's configure dmtimer3 and 4 as percpu timers by default, and warn about
the issue if the dtb is not configured properly.

Let's do this as a single patch so it can be backported to v5.8 and later
kernels easily. Note that this patch depends on earlier timer-ti-dm
systimer posted mode fixes, and a preparatory clockevent patch
"clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue".

For more information, please see the errata for "AM572x Sitara Processors
Silicon Revisions 1.1, 2.0":

https://www.ti.com/lit/er/sprz429m/sprz429m.pdf

The concept is based on earlier reference patches done by Tero Kristo and
Keerthy.

Cc: Keerthy &lt;j-keerthy@ti.com&gt;
Cc: Tero Kristo &lt;kristo@kernel.org&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Link: https://lore.kernel.org/r/20210323074326.28302-3-tony@atomide.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There is a timer wrap issue on dra7 for the ARM architected timer.
In a typical clock configuration the timer fails to wrap after 388 days.

To work around the issue, we need to use timer-ti-dm percpu timers instead.

Let's configure dmtimer3 and 4 as percpu timers by default, and warn about
the issue if the dtb is not configured properly.

Let's do this as a single patch so it can be backported to v5.8 and later
kernels easily. Note that this patch depends on earlier timer-ti-dm
systimer posted mode fixes, and a preparatory clockevent patch
"clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue".

For more information, please see the errata for "AM572x Sitara Processors
Silicon Revisions 1.1, 2.0":

https://www.ti.com/lit/er/sprz429m/sprz429m.pdf

The concept is based on earlier reference patches done by Tero Kristo and
Keerthy.

Cc: Keerthy &lt;j-keerthy@ti.com&gt;
Cc: Tero Kristo &lt;kristo@kernel.org&gt;
Signed-off-by: Tony Lindgren &lt;tony@atomide.com&gt;
Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Link: https://lore.kernel.org/r/20210323074326.28302-3-tony@atomide.com
</pre>
</div>
</content>
</entry>
<entry>
<title>irqchip/apple-aic: Add support for the Apple Interrupt Controller</title>
<updated>2021-04-08T11:18:41+00:00</updated>
<author>
<name>Hector Martin</name>
<email>marcan@marcan.st</email>
</author>
<published>2021-01-20T23:55:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=76cde26394114f6af2710c6b2ad6854f1e8ee859'/>
<id>76cde26394114f6af2710c6b2ad6854f1e8ee859</id>
<content type='text'>
This is the root interrupt controller used on Apple ARM SoCs such as the
M1. This irqchip driver performs multiple functions:

* Handles both IRQs and FIQs

* Drives the AIC peripheral itself (which handles IRQs)

* Dispatches FIQs to downstream hard-wired clients (currently the ARM
  timer).

* Implements a virtual IPI multiplexer to funnel multiple Linux IPIs
  into a single hardware IPI

Reviewed-by: Marc Zyngier &lt;maz@kernel.org&gt;
Acked-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Hector Martin &lt;marcan@marcan.st&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This is the root interrupt controller used on Apple ARM SoCs such as the
M1. This irqchip driver performs multiple functions:

* Handles both IRQs and FIQs

* Drives the AIC peripheral itself (which handles IRQs)

* Dispatches FIQs to downstream hard-wired clients (currently the ARM
  timer).

* Implements a virtual IPI multiplexer to funnel multiple Linux IPIs
  into a single hardware IPI

Reviewed-by: Marc Zyngier &lt;maz@kernel.org&gt;
Acked-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Hector Martin &lt;marcan@marcan.st&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>iommu/vt-d: Remove IOVA domain rcache flushing for CPU offlining</title>
<updated>2021-04-07T08:27:27+00:00</updated>
<author>
<name>John Garry</name>
<email>john.garry@huawei.com</email>
</author>
<published>2021-03-25T12:29:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=363f266eeff6e22a09483dc922dccd7cd0b9fe9c'/>
<id>363f266eeff6e22a09483dc922dccd7cd0b9fe9c</id>
<content type='text'>
Now that the core code handles flushing per-IOVA domain CPU rcaches,
remove the handling here.

Reviewed-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: John Garry &lt;john.garry@huawei.com&gt;
Link: https://lore.kernel.org/r/1616675401-151997-3-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Now that the core code handles flushing per-IOVA domain CPU rcaches,
remove the handling here.

Reviewed-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: John Garry &lt;john.garry@huawei.com&gt;
Link: https://lore.kernel.org/r/1616675401-151997-3-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>iova: Add CPU hotplug handler to flush rcaches</title>
<updated>2021-04-07T08:27:27+00:00</updated>
<author>
<name>John Garry</name>
<email>john.garry@huawei.com</email>
</author>
<published>2021-03-25T12:29:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f598a497bc7dfbec60270bca8b8408db3d23ac07'/>
<id>f598a497bc7dfbec60270bca8b8408db3d23ac07</id>
<content type='text'>
Like the Intel IOMMU driver already does, flush the per-IOVA domain
CPU rcache when a CPU goes offline - there's no point in keeping it.

Reviewed-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: John Garry &lt;john.garry@huawei.com&gt;
Link: https://lore.kernel.org/r/1616675401-151997-2-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Like the Intel IOMMU driver already does, flush the per-IOVA domain
CPU rcache when a CPU goes offline - there's no point in keeping it.

Reviewed-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: John Garry &lt;john.garry@huawei.com&gt;
Link: https://lore.kernel.org/r/1616675401-151997-2-git-send-email-john.garry@huawei.com
Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drivers/perf: hisi: Add support for HiSilicon PA PMU driver</title>
<updated>2021-03-25T13:03:46+00:00</updated>
<author>
<name>Shaokun Zhang</name>
<email>zhangshaokun@hisilicon.com</email>
</author>
<published>2021-03-08T06:50:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a0ab25cd82eeb68bfa19a4d93a097521af5011b8'/>
<id>a0ab25cd82eeb68bfa19a4d93a097521af5011b8</id>
<content type='text'>
On HiSilicon Hip09 platform, there is a PA (Protocol Adapter) module on
each chip SICL (Super I/O Cluster) which incorporates three Hydra interface
and facilitates the cache coherency between the dies on the chip. While PA
uncore PMU model is the same as other Hip09 PMU modules and many PMU events
are supported. Let's support the PMU driver using the HiSilicon uncore PMU
framework.

PA PMU supports the following filter functions:
* tracetag_en: allows user to count events according to tt_req or
tt_core set in L3C PMU. It's the same as other PMUs.

* srcid_cmd &amp; srcid_msk: allows user to filter statistics that come from
specific CCL/ICL by configuration source ID.

* tgtid_cmd &amp; tgtid_msk: it is the similar function to srcid_cmd &amp;
srcid_msk. Both are used to check where the data comes from or go to.

Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Reviewed-by: John Garry &lt;john.garry@huawei.com&gt;
Co-developed-by: Qi Liu &lt;liuqi115@huawei.com&gt;
Signed-off-by: Qi Liu &lt;liuqi115@huawei.com&gt;
Signed-off-by: Shaokun Zhang &lt;zhangshaokun@hisilicon.com&gt;
Link: https://lore.kernel.org/r/1615186237-22263-9-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On HiSilicon Hip09 platform, there is a PA (Protocol Adapter) module on
each chip SICL (Super I/O Cluster) which incorporates three Hydra interface
and facilitates the cache coherency between the dies on the chip. While PA
uncore PMU model is the same as other Hip09 PMU modules and many PMU events
are supported. Let's support the PMU driver using the HiSilicon uncore PMU
framework.

PA PMU supports the following filter functions:
* tracetag_en: allows user to count events according to tt_req or
tt_core set in L3C PMU. It's the same as other PMUs.

* srcid_cmd &amp; srcid_msk: allows user to filter statistics that come from
specific CCL/ICL by configuration source ID.

* tgtid_cmd &amp; tgtid_msk: it is the similar function to srcid_cmd &amp;
srcid_msk. Both are used to check where the data comes from or go to.

Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: John Garry &lt;john.garry@huawei.com&gt;
Cc: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Reviewed-by: John Garry &lt;john.garry@huawei.com&gt;
Co-developed-by: Qi Liu &lt;liuqi115@huawei.com&gt;
Signed-off-by: Qi Liu &lt;liuqi115@huawei.com&gt;
Signed-off-by: Shaokun Zhang &lt;zhangshaokun@hisilicon.com&gt;
Link: https://lore.kernel.org/r/1615186237-22263-9-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
