<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/kernel/irq, branch linux-4.20.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>genirq: Make sure the initial affinity is not empty</title>
<updated>2019-03-05T16:59:36+00:00</updated>
<author>
<name>Srinivas Ramana</name>
<email>sramana@codeaurora.org</email>
</author>
<published>2018-12-20T13:35:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a1c05dbb46f6634bbd3d14465cd6fdec7977557c'/>
<id>a1c05dbb46f6634bbd3d14465cd6fdec7977557c</id>
<content type='text'>
[ Upstream commit bddda606ec76550dd63592e32a6e87e7d32583f7 ]

If all CPUs in the irq_default_affinity mask are offline when an interrupt
is initialized then irq_setup_affinity() can set an empty affinity mask for
a newly allocated interrupt.

Fix this by falling back to cpu_online_mask in case the resulting affinity
mask is zero.
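
A minimal sketch of the fallback, written in Python rather than kernel C,
with CPU masks modelled as sets. The function name pick_initial_affinity and
its parameters are hypothetical and only illustrate the idea described above:

```python
def pick_initial_affinity(default_affinity, online_cpus):
    """Return the CPUs a newly set up interrupt may target.

    Hypothetical model: masks are Python sets of CPU numbers.
    """
    # Keep only the CPUs of the default mask that are currently online.
    mask = set(default_affinity).intersection(online_cpus)
    # An empty result would leave the interrupt without any target,
    # so fall back to every online CPU instead.
    if not mask:
        mask = set(online_cpus)
    return mask
```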

Signed-off-by: Srinivas Ramana &lt;sramana@codeaurora.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arm-msm@vger.kernel.org
Link: https://lkml.kernel.org/r/1545312957-8504-1-git-send-email-sramana@codeaurora.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit bddda606ec76550dd63592e32a6e87e7d32583f7 ]

If all CPUs in the irq_default_affinity mask are offline when an interrupt
is initialized then irq_setup_affinity() can set an empty affinity mask for
a newly allocated interrupt.

Fix this by falling back to cpu_online_mask in case the resulting affinity
mask is zero.

Signed-off-by: Srinivas Ramana &lt;sramana@codeaurora.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arm-msm@vger.kernel.org
Link: https://lkml.kernel.org/r/1545312957-8504-1-git-send-email-sramana@codeaurora.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/matrix: Improve target CPU selection for managed interrupts.</title>
<updated>2019-03-05T16:59:34+00:00</updated>
<author>
<name>Long Li</name>
<email>longli@microsoft.com</email>
</author>
<published>2018-11-06T04:00:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=73835f71479e6b69975e3b96839b62e4b2963418'/>
<id>73835f71479e6b69975e3b96839b62e4b2963418</id>
<content type='text'>
[ Upstream commit e8da8794a7fd9eef1ec9a07f0d4897c68581c72b ]

On large systems with multiple devices of the same class (e.g. NVMe disks,
using managed interrupts), the kernel can affinitize these interrupts to a
small subset of CPUs instead of spreading them out evenly.

irq_matrix_alloc_managed() tries to select the CPU in the supplied cpumask
of possible target CPUs which has the lowest number of interrupt vectors
allocated.

This is done by searching for the CPU with the highest number of available
vectors. While this is correct for non-managed interrupts it can select the
wrong CPU for managed interrupts. Under certain constellations this results
in affinitizing the managed interrupts of several devices to a single CPU in
a set.

The bookkeeping of available vectors works as follows:

 1) Non-managed interrupts:

    available is decremented when the interrupt is actually requested by
    the device driver and a vector is assigned. It's incremented when the
    interrupt and the vector are freed.

 2) Managed interrupts:

    Managed interrupts guarantee vector reservation when the MSI/MSI-X
    functionality of a device is enabled, which is achieved by reserving
    vectors in the bitmaps of the possible target CPUs. This reservation
    decrements the available count on each possible target CPU.

    When the interrupt is requested by the device driver then a vector is
    allocated from the reserved region. The operation is reversed when the
    interrupt is freed by the device driver. Neither of these operations
    affects the available count.

    The reservation persists up to the point where the MSI/MSI-X
    functionality is disabled and only this operation increments the
    available count again.

For non-managed interrupts the available count is the correct selection
criterion because the guaranteed reservations need to be taken into
account. Using the allocated counter could lead to a failing allocation in
the following situation (total vector space of 10 assumed):

		 CPU0	CPU1
 available:	    2	   0
 allocated:	    5	   3   &lt;--- CPU1 is selected, but available space = 0
 managed reserved:  3	   7

 while available yields the correct result.

For managed interrupts the available count is not the appropriate
selection criterion because as explained above the available count is not
affected by the actual vector allocation.

The following example illustrates that. Total vector space of 10
assumed. The starting point is:

		 CPU0	CPU1
 available:	    5	   4
 allocated:	    2	   3
 managed reserved:  3	   3

 Allocating vectors for three non-managed interrupts will result in
 affinitizing the first two to CPU0 and the third one to CPU1 because the
 available count is adjusted with each allocation:

		  CPU0	CPU1
 available:	     5	   4	&lt;- Select CPU0 for 1st allocation
 --&gt; allocated:	     3	   3

 available:	     4	   4	&lt;- Select CPU0 for 2nd allocation
 --&gt; allocated:	     4	   3

 available:	     3	   4	&lt;- Select CPU1 for 3rd allocation
 --&gt; allocated:	     4	   4

 But the allocation of three managed interrupts starting from the same
 point will affinitize all of them to CPU0 because the available count is
 not affected by the allocation (see above). So the end result is:

		  CPU0	CPU1
 available:	     5	   4
 allocated:	     5	   3

Introduce a "managed_allocated" field in struct cpumap to track the vector
allocation for managed interrupts separately. Use this information to
select the target CPU when a vector is allocated for a managed interrupt,
which results in more evenly distributed vector assignments. The above
example results in the following allocations:

		 CPU0	CPU1
 managed_allocated: 0	   0	&lt;- Select CPU0 for 1st allocation
 --&gt; allocated:	    3	   3

 managed_allocated: 1	   0	&lt;- Select CPU1 for 2nd allocation
 --&gt; allocated:	    3	   4

 managed_allocated: 1	   1	&lt;- Select CPU0 for 3rd allocation
 --&gt; allocated:	    4	   4

The allocation of non-managed interrupts is not affected by this change and
is still evaluating the available count.
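
The two-CPU example above can be replayed with a small Python model. This is
only a sketch of the selection rules described in this changelog, not the
kernel code: the dictionary layout and all function names are hypothetical,
and ties are assumed to go to the lowest CPU number, as in the tables above.

```python
def start():
    # Starting point of the example (total vector space of 10 assumed).
    return {
        0: {"available": 5, "allocated": 2, "managed_allocated": 0},
        1: {"available": 4, "allocated": 3, "managed_allocated": 0},
    }

def pick_by_available(cpus):
    # Choose the CPU with the most available vectors.
    # max() returns the first maximum, so ties go to the lowest CPU.
    return max(sorted(cpus), key=lambda c: cpus[c]["available"])

def pick_by_managed_allocated(cpus):
    # Choose the CPU with the fewest managed vectors allocated.
    return min(sorted(cpus), key=lambda c: cpus[c]["managed_allocated"])

def allocate_non_managed(cpus, count):
    for _ in range(count):
        cpu = pick_by_available(cpus)
        cpus[cpu]["available"] -= 1   # adjusted with each allocation
        cpus[cpu]["allocated"] += 1
    return [cpus[c]["allocated"] for c in sorted(cpus)]

def allocate_managed_old(cpus, count):
    for _ in range(count):
        cpu = pick_by_available(cpus)
        # The vector comes from the reserved region: "available" is untouched.
        cpus[cpu]["allocated"] += 1
    return [cpus[c]["allocated"] for c in sorted(cpus)]

def allocate_managed_new(cpus, count):
    for _ in range(count):
        cpu = pick_by_managed_allocated(cpus)
        cpus[cpu]["managed_allocated"] += 1
        cpus[cpu]["allocated"] += 1
    return [cpus[c]["allocated"] for c in sorted(cpus)]
```

In this model allocate_managed_old(start(), 3) leaves the allocated counts at
5 and 3, while allocate_managed_new(start(), 3) and
allocate_non_managed(start(), 3) both end at 4 and 4.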

The overall distribution of interrupt vectors for both types of interrupts
might still not be perfectly even depending on the number of non-managed
and managed interrupts in a system, but due to the reservation guarantee
for managed interrupts this cannot be avoided.

Expose the new field in debugfs as well.

[ tglx: Clarified the background of the problem in the changelog and
  	described it independent of NVME ]

Signed-off-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lkml.kernel.org/r/20181106040000.27316-1-longli@linuxonhyperv.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit e8da8794a7fd9eef1ec9a07f0d4897c68581c72b ]

On large systems with multiple devices of the same class (e.g. NVMe disks,
using managed interrupts), the kernel can affinitize these interrupts to a
small subset of CPUs instead of spreading them out evenly.

irq_matrix_alloc_managed() tries to select the CPU in the supplied cpumask
of possible target CPUs which has the lowest number of interrupt vectors
allocated.

This is done by searching for the CPU with the highest number of available
vectors. While this is correct for non-managed interrupts it can select the
wrong CPU for managed interrupts. Under certain constellations this results
in affinitizing the managed interrupts of several devices to a single CPU in
a set.

The bookkeeping of available vectors works as follows:

 1) Non-managed interrupts:

    available is decremented when the interrupt is actually requested by
    the device driver and a vector is assigned. It's incremented when the
    interrupt and the vector are freed.

 2) Managed interrupts:

    Managed interrupts guarantee vector reservation when the MSI/MSI-X
    functionality of a device is enabled, which is achieved by reserving
    vectors in the bitmaps of the possible target CPUs. This reservation
    decrements the available count on each possible target CPU.

    When the interrupt is requested by the device driver then a vector is
    allocated from the reserved region. The operation is reversed when the
    interrupt is freed by the device driver. Neither of these operations
    affects the available count.

    The reservation persists up to the point where the MSI/MSI-X
    functionality is disabled and only this operation increments the
    available count again.

For non-managed interrupts the available count is the correct selection
criterion because the guaranteed reservations need to be taken into
account. Using the allocated counter could lead to a failing allocation in
the following situation (total vector space of 10 assumed):

		 CPU0	CPU1
 available:	    2	   0
 allocated:	    5	   3   &lt;--- CPU1 is selected, but available space = 0
 managed reserved:  3	   7

 while available yields the correct result.

For managed interrupts the available count is not the appropriate
selection criterion because as explained above the available count is not
affected by the actual vector allocation.

The following example illustrates that. Total vector space of 10
assumed. The starting point is:

		 CPU0	CPU1
 available:	    5	   4
 allocated:	    2	   3
 managed reserved:  3	   3

 Allocating vectors for three non-managed interrupts will result in
 affinitizing the first two to CPU0 and the third one to CPU1 because the
 available count is adjusted with each allocation:

		  CPU0	CPU1
 available:	     5	   4	&lt;- Select CPU0 for 1st allocation
 --&gt; allocated:	     3	   3

 available:	     4	   4	&lt;- Select CPU0 for 2nd allocation
 --&gt; allocated:	     4	   3

 available:	     3	   4	&lt;- Select CPU1 for 3rd allocation
 --&gt; allocated:	     4	   4

 But the allocation of three managed interrupts starting from the same
 point will affinitize all of them to CPU0 because the available count is
 not affected by the allocation (see above). So the end result is:

		  CPU0	CPU1
 available:	     5	   4
 allocated:	     5	   3

Introduce a "managed_allocated" field in struct cpumap to track the vector
allocation for managed interrupts separately. Use this information to
select the target CPU when a vector is allocated for a managed interrupt,
which results in more evenly distributed vector assignments. The above
example results in the following allocations:

		 CPU0	CPU1
 managed_allocated: 0	   0	&lt;- Select CPU0 for 1st allocation
 --&gt; allocated:	    3	   3

 managed_allocated: 1	   0	&lt;- Select CPU1 for 2nd allocation
 --&gt; allocated:	    3	   4

 managed_allocated: 1	   1	&lt;- Select CPU0 for 3rd allocation
 --&gt; allocated:	    4	   4

The allocation of non-managed interrupts is not affected by this change and
is still evaluating the available count.

The overall distribution of interrupt vectors for both types of interrupts
might still not be perfectly even depending on the number of non-managed
and managed interrupts in a system, but due to the reservation guarantee
for managed interrupts this cannot be avoided.

Expose the new field in debugfs as well.

[ tglx: Clarified the background of the problem in the changelog and
  	described it independent of NVME ]

Signed-off-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lkml.kernel.org/r/20181106040000.27316-1-longli@linuxonhyperv.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/affinity: Spread IRQs to all available NUMA nodes</title>
<updated>2019-02-12T19:02:05+00:00</updated>
<author>
<name>Long Li</name>
<email>longli@microsoft.com</email>
</author>
<published>2018-11-02T18:02:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6dd7618d7dead7b4f369d3e24e80232b8eb676c3'/>
<id>6dd7618d7dead7b4f369d3e24e80232b8eb676c3</id>
<content type='text'>
[ Upstream commit b82592199032bf7c778f861b936287e37ebc9f62 ]

If the number of NUMA nodes exceeds the number of MSI/MSI-X interrupts
which are allocated for a device, the interrupt affinity spreading code
fails to spread them across all nodes.

The reason is that the spreading code starts from node 0 and continues up
to the number of interrupts requested for allocation. This leaves the nodes
past the last interrupt unused.

This results in interrupt concentration on the first nodes which violates
the assumption of the block layer that all nodes are covered evenly. As a
consequence the NUMA nodes above the number of interrupts are all assigned
to hardware queue 0 and therefore NUMA node 0, which results in bad
performance and has CPU hotplug implications, because queue 0 gets shut
down when the last CPU of node 0 is offlined.

Go over all NUMA nodes and assign them round-robin to all requested
interrupts to solve this.
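
A rough Python sketch of the round-robin assignment for the case where there
are more NUMA nodes than interrupts. The function name and the list and
dictionary representation are hypothetical; the kernel works on cpumasks:

```python
def spread_nodes_over_vectors(nodes, nvecs):
    """Map each interrupt vector index to the NUMA nodes it serves."""
    assignment = {vec: [] for vec in range(nvecs)}
    for position, node in enumerate(nodes):
        # Wrap around once every vector has received a node, so that
        # no node past the last interrupt is left out.
        assignment[position % nvecs].append(node)
    return assignment
```

With five nodes and two interrupts, vector 0 serves nodes 0, 2 and 4 and
vector 1 serves nodes 1 and 3, so every node is covered.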

[ tglx: Massaged changelog ]

Signed-off-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Cc: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b82592199032bf7c778f861b936287e37ebc9f62 ]

If the number of NUMA nodes exceeds the number of MSI/MSI-X interrupts
which are allocated for a device, the interrupt affinity spreading code
fails to spread them across all nodes.

The reason is that the spreading code starts from node 0 and continues up
to the number of interrupts requested for allocation. This leaves the nodes
past the last interrupt unused.

This results in interrupt concentration on the first nodes which violates
the assumption of the block layer that all nodes are covered evenly. As a
consequence the NUMA nodes above the number of interrupts are all assigned
to hardware queue 0 and therefore NUMA node 0, which results in bad
performance and has CPU hotplug implications, because queue 0 gets shut
down when the last CPU of node 0 is offlined.

Go over all NUMA nodes and assign them round-robin to all requested
interrupts to solve this.

[ tglx: Massaged changelog ]

Signed-off-by: Long Li &lt;longli@microsoft.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Cc: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lkml.kernel.org/r/20181102180248.13583-1-longli@linuxonhyperv.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>irq/matrix: Fix memory overallocation</title>
<updated>2018-11-01T09:00:38+00:00</updated>
<author>
<name>Michael Kelley</name>
<email>mikelley@microsoft.com</email>
</author>
<published>2018-11-01T00:35:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=57f01796f14fecf00d330fe39c8d2477ced9cd79'/>
<id>57f01796f14fecf00d330fe39c8d2477ced9cd79</id>
<content type='text'>
IRQ_MATRIX_SIZE is the number of longs needed for a bitmap, multiplied by
the size of a long, yielding a byte count. But it is used to size an array
of longs, which is way more memory than is needed.

Change IRQ_MATRIX_SIZE so it is just the number of longs needed and the
arrays come out the correct size.
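
The size difference can be worked through in a few lines of Python. The
helper names are hypothetical, and the figures used in the note below
(256 matrix bits, 8-byte longs) are illustrative assumptions rather than
values stated in this changelog:

```python
def bits_to_longs(bits, bits_per_long):
    # Round up: number of longs needed to hold `bits` bits.
    return (bits + bits_per_long - 1) // bits_per_long

def matrix_array_bytes(bits, sizeof_long, fixed):
    longs = bits_to_longs(bits, 8 * sizeof_long)
    # Before the fix the macro was a byte count, yet it was used as the
    # element count of an array of longs.
    elements = longs if fixed else longs * sizeof_long
    return elements * sizeof_long
```

Under those assumptions the bitmap needs 4 longs, that is 32 bytes, while the
old definition sized each array at 32 longs, that is 256 bytes.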

Fixes: 2f75d9e1c905 ("genirq: Implement bitmap matrix allocator")
Signed-off-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: KY Srinivasan &lt;kys@microsoft.com&gt;
Link: https://lkml.kernel.org/r/1541032428-10392-1-git-send-email-mikelley@microsoft.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
IRQ_MATRIX_SIZE is the number of longs needed for a bitmap, multiplied by
the size of a long, yielding a byte count. But it is used to size an array
of longs, which is way more memory than is needed.

Change IRQ_MATRIX_SIZE so it is just the number of longs needed and the
arrays come out the correct size.

Fixes: 2f75d9e1c905 ("genirq: Implement bitmap matrix allocator")
Signed-off-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: KY Srinivasan &lt;kys@microsoft.com&gt;
Link: https://lkml.kernel.org/r/1541032428-10392-1-git-send-email-mikelley@microsoft.com

</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2018-10-25T18:43:47+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-10-25T18:43:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5947a64a7e0c70cc16d5d1e5af3cf3b44535047a'/>
<id>5947a64a7e0c70cc16d5d1e5af3cf3b44535047a</id>
<content type='text'>
Pull irq updates from Thomas Gleixner:
 "The interrupt brigade came up with the following updates:

   - Driver for the Marvell System Error Interrupt machinery

   - Overhaul of the GIC-V3 ITS driver

   - Small updates and fixes all over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  genirq: Fix race on spurious interrupt detection
  softirq: Fix typo in __do_softirq() comments
  genirq: Fix grammar s/an /a /
  irqchip/gic: Unify GIC priority definitions
  irqchip/gic-v3: Remove acknowledge loop
  dt-bindings/interrupt-controller: Add documentation for Marvell SEI controller
  dt-bindings/interrupt-controller: Update Marvell ICU bindings
  irqchip/irq-mvebu-icu: Add support for System Error Interrupts (SEI)
  arm64: marvell: Enable SEI driver
  irqchip/irq-mvebu-sei: Add new driver for Marvell SEI
  irqchip/irq-mvebu-icu: Support ICU subnodes
  irqchip/irq-mvebu-icu: Disociate ICU and NSR
  irqchip/irq-mvebu-icu: Clarify the reset operation of configured interrupts
  irqchip/irq-mvebu-icu: Fix wrong private data retrieval
  dt-bindings/interrupt-controller: Fix Marvell ICU length in the example
  genirq/msi: Allow creation of a tree-based irqdomain for platform-msi
  dt-bindings: irqchip: renesas-irqc: Document r8a7744 support
  dt-bindings: irqchip: renesas-irqc: Document R-Car E3 support
  irqchip/pdc: Setup all edge interrupts as rising edge at GIC
  irqchip/gic-v3-its: Allow use of LPI tables in reserved memory
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull irq updates from Thomas Gleixner:
 "The interrupt brigade came up with the following updates:

   - Driver for the Marvell System Error Interrupt machinery

   - Overhaul of the GIC-V3 ITS driver

   - Small updates and fixes all over the place"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits)
  genirq: Fix race on spurious interrupt detection
  softirq: Fix typo in __do_softirq() comments
  genirq: Fix grammar s/an /a /
  irqchip/gic: Unify GIC priority definitions
  irqchip/gic-v3: Remove acknowledge loop
  dt-bindings/interrupt-controller: Add documentation for Marvell SEI controller
  dt-bindings/interrupt-controller: Update Marvell ICU bindings
  irqchip/irq-mvebu-icu: Add support for System Error Interrupts (SEI)
  arm64: marvell: Enable SEI driver
  irqchip/irq-mvebu-sei: Add new driver for Marvell SEI
  irqchip/irq-mvebu-icu: Support ICU subnodes
  irqchip/irq-mvebu-icu: Disociate ICU and NSR
  irqchip/irq-mvebu-icu: Clarify the reset operation of configured interrupts
  irqchip/irq-mvebu-icu: Fix wrong private data retrieval
  dt-bindings/interrupt-controller: Fix Marvell ICU length in the example
  genirq/msi: Allow creation of a tree-based irqdomain for platform-msi
  dt-bindings: irqchip: renesas-irqc: Document r8a7744 support
  dt-bindings: irqchip: renesas-irqc: Document R-Car E3 support
  irqchip/pdc: Setup all edge interrupts as rising edge at GIC
  irqchip/gic-v3-its: Allow use of LPI tables in reserved memory
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>genirq: Fix race on spurious interrupt detection</title>
<updated>2018-10-19T15:31:00+00:00</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2018-10-18T13:15:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=746a923b863a1065ef77324e1e43f19b1a3eab5c'/>
<id>746a923b863a1065ef77324e1e43f19b1a3eab5c</id>
<content type='text'>
Commit 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of
threaded irqs") made detection of spurious interrupts work for threaded
handlers by:

a) incrementing a counter every time the thread returns IRQ_HANDLED, and
b) checking whether that counter has increased every time the thread is
   woken.

However for oneshot interrupts, the commit unmasks the interrupt before
incrementing the counter.  If another interrupt occurs right after
unmasking but before the counter is incremented, that interrupt is
incorrectly considered spurious:

time
 |  irq_thread()
 |    irq_thread_fn()
 |      action-&gt;thread_fn()
 |      irq_finalize_oneshot()
 |        unmask_threaded_irq()            /* interrupt is unmasked */
 |
 |                  /* interrupt fires, incorrectly deemed spurious */
 |
 |    atomic_inc(&amp;desc-&gt;threads_handled); /* counter is incremented */
 v

This is observed with a hi3110 CAN controller receiving data at high volume
(from a separate machine sending with "cangen -g 0 -i -x"): The controller
signals a huge number of interrupts (hundreds of millions per day) and
every second there are about a dozen which are deemed spurious.

In theory with high CPU load and the presence of higher priority tasks, the
number of incorrectly detected spurious interrupts might increase beyond
the 99,900 threshold and cause disablement of the interrupt.

In practice it just increments the spurious interrupt count. But that can
cause people to waste time investigating it over and over.

Fix it by moving the accounting before the invocation of
irq_finalize_oneshot().
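
The effect of the ordering can be modelled in Python. This is a deliberately
simplified, single-threaded sketch of the worst-case interleaving from the
diagram above, not the kernel logic: the function name is hypothetical and
the spurious check is reduced to comparing the counter with its last seen
value.

```python
def count_false_spurious(account_before_unmask):
    threads_handled = 0       # bumped when the thread returns IRQ_HANDLED
    threads_handled_last = 0  # value seen by the previous check
    false_spurious = 0

    def interrupt_fires():
        nonlocal threads_handled_last, false_spurious
        # An unchanged counter is read as "the last wakeup handled nothing".
        if threads_handled == threads_handled_last:
            false_spurious += 1
        else:
            threads_handled_last = threads_handled

    # The thread handler has just returned IRQ_HANDLED.
    if account_before_unmask:
        threads_handled += 1  # fixed order: account first
        interrupt_fires()     # next interrupt arrives right after unmask
    else:
        interrupt_fires()     # old order: interrupt arrives before accounting
        threads_handled += 1
    return false_spurious
```

In this model the old ordering reports one false spurious interrupt for the
handled wakeup, and the fixed ordering reports none.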

[ tglx: Folded change log update ]

Fixes: 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of threaded irqs")
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Mathias Duckeck &lt;m.duckeck@kunbus.de&gt;
Cc: Akshay Bhat &lt;akshay.bhat@timesys.com&gt;
Cc: Casey Fitzpatrick &lt;casey.fitzpatrick@timesys.com&gt;
Cc: stable@vger.kernel.org # v3.16+
Link: https://lkml.kernel.org/r/1dfd8bbd16163940648045495e3e9698e63b50ad.1539867047.git.lukas@wunner.de

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of
threaded irqs") made detection of spurious interrupts work for threaded
handlers by:

a) incrementing a counter every time the thread returns IRQ_HANDLED, and
b) checking whether that counter has increased every time the thread is
   woken.

However for oneshot interrupts, the commit unmasks the interrupt before
incrementing the counter.  If another interrupt occurs right after
unmasking but before the counter is incremented, that interrupt is
incorrectly considered spurious:

time
 |  irq_thread()
 |    irq_thread_fn()
 |      action-&gt;thread_fn()
 |      irq_finalize_oneshot()
 |        unmask_threaded_irq()            /* interrupt is unmasked */
 |
 |                  /* interrupt fires, incorrectly deemed spurious */
 |
 |    atomic_inc(&amp;desc-&gt;threads_handled); /* counter is incremented */
 v

This is observed with a hi3110 CAN controller receiving data at high volume
(from a separate machine sending with "cangen -g 0 -i -x"): The controller
signals a huge number of interrupts (hundreds of millions per day) and
every second there are about a dozen which are deemed spurious.

In theory with high CPU load and the presence of higher priority tasks, the
number of incorrectly detected spurious interrupts might increase beyond
the 99,900 threshold and cause disablement of the interrupt.

In practice it just increments the spurious interrupt count. But that can
cause people to waste time investigating it over and over.

Fix it by moving the accounting before the invocation of
irq_finalize_oneshot().

[ tglx: Folded change log update ]

Fixes: 1e77d0a1ed74 ("genirq: Sanitize spurious interrupt detection of threaded irqs")
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Mathias Duckeck &lt;m.duckeck@kunbus.de&gt;
Cc: Akshay Bhat &lt;akshay.bhat@timesys.com&gt;
Cc: Casey Fitzpatrick &lt;casey.fitzpatrick@timesys.com&gt;
Cc: stable@vger.kernel.org # v3.16+
Link: https://lkml.kernel.org/r/1dfd8bbd16163940648045495e3e9698e63b50ad.1539867047.git.lukas@wunner.de

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq: Fix grammar s/an /a /</title>
<updated>2018-10-09T05:50:41+00:00</updated>
<author>
<name>Geert Uytterhoeven</name>
<email>geert+renesas@glider.be</email>
</author>
<published>2018-10-08T11:17:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b8d62f33b7b225935649ab165d901fe8dd7f95e5'/>
<id>b8d62f33b7b225935649ab165d901fe8dd7f95e5</id>
<content type='text'>
Fix a grammar mistake in &lt;linux/interrupt.h&gt;.

[ mingo: While at it, fix a similar error in another comment as well. ]

Signed-off-by: Geert Uytterhoeven &lt;geert+renesas@glider.be&gt;
Cc: Jiri Kosina &lt;trivial@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20181008111726.26286-1-geert%2Brenesas@glider.be
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix a grammar mistake in &lt;linux/interrupt.h&gt;.

[ mingo: While at it, fix a similar error in another comment as well. ]

Signed-off-by: Geert Uytterhoeven &lt;geert+renesas@glider.be&gt;
Cc: Jiri Kosina &lt;trivial@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20181008111726.26286-1-geert%2Brenesas@glider.be
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/debugfs: Reinstate full OF path for domain name</title>
<updated>2018-10-01T10:24:53+00:00</updated>
<author>
<name>Marc Zyngier</name>
<email>marc.zyngier@arm.com</email>
</author>
<published>2018-10-01T10:05:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=94967b55ebf3b603f2fe750ecedd896042585a1c'/>
<id>94967b55ebf3b603f2fe750ecedd896042585a1c</id>
<content type='text'>
On a DT based system, we use the of_node full name to name the
corresponding irq domain. We expect that name to be unique, so that
domains with the same base name won't clash (this happens on multi-node
topologies, for example).

Since a7e4cfb0a7ca ("of/fdt: only store the device node basename in
full_name"), of_node_full_name() lies and only returns the basename. This
breaks the above requirement, and we end up with only a subset of the
domains in /sys/kernel/debug/irq/domains.

Let's reinstate the feature by using the fancy new %pOF format specifier,
which happens to do the right thing.

Fixes: a7e4cfb0a7ca ("of/fdt: only store the device node basename in full_name")
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20181001100522.180054-3-marc.zyngier@arm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
On a DT based system, we use the of_node full name to name the
corresponding irq domain. We expect that name to be unique, so that
domains with the same base name won't clash (this happens on multi-node
topologies, for example).

Since a7e4cfb0a7ca ("of/fdt: only store the device node basename in
full_name"), of_node_full_name() lies and only returns the basename. This
breaks the above requirement, and we end up with only a subset of the
domains in /sys/kernel/debug/irq/domains.

Let's reinstate the feature by using the fancy new %pOF format specifier,
which happens to do the right thing.

Fixes: a7e4cfb0a7ca ("of/fdt: only store the device node basename in full_name")
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20181001100522.180054-3-marc.zyngier@arm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>genirq/debugfs: Reset domain debugfs_file on removal of the debugfs file</title>
<updated>2018-10-01T10:24:53+00:00</updated>
<author>
<name>Marc Zyngier</name>
<email>marc.zyngier@arm.com</email>
</author>
<published>2018-10-01T10:05:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=513145ea66af95f1a5c744d7b5a4f4a97625e669'/>
<id>513145ea66af95f1a5c744d7b5a4f4a97625e669</id>
<content type='text'>
When removing a debugfs file for a given irq domain, we fail to clear the
corresponding field, meaning that the corresponding file won't be created
again if we need to do so.

It turns out that this is exactly what irq_domain_update_bus_token does
(delete old file, update domain name, recreate file).

This doesn't have any impact other than making debug more difficult, but we
do value ease of debugging... So clear the debugfs_file field.

Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20181001100522.180054-2-marc.zyngier@arm.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When removing a debugfs file for a given irq domain, we fail to clear the
corresponding field, meaning that the corresponding file won't be created
again if we need to do so.

It turns out that this is exactly what irq_domain_update_bus_token does
(delete old file, update domain name, recreate file).

This doesn't have any impact other than making debug more difficult, but we
do value ease of debugging... So clear the debugfs_file field.

Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20181001100522.180054-2-marc.zyngier@arm.com

</pre>
</div>
</content>
</entry>
<entry>
<title>irq/matrix: Spread managed interrupts on allocation</title>
<updated>2018-09-18T16:27:24+00:00</updated>
<author>
<name>Dou Liyang</name>
<email>douly.fnst@cn.fujitsu.com</email>
</author>
<published>2018-09-08T17:58:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=76f99ae5b54d48430d1f0c5512a84da0ff9761e0'/>
<id>76f99ae5b54d48430d1f0c5512a84da0ff9761e0</id>
<content type='text'>
Linux spreads out the non-managed interrupts across the possible target CPUs
to avoid vector space exhaustion.

Managed interrupts are treated differently, as for them the vectors are
reserved (with guarantee) when the interrupt descriptors are initialized.

When the interrupt is requested a real vector is assigned. The assignment
logic uses the first CPU in the affinity mask for assignment. If the
interrupt has more than one CPU in the affinity mask, which happens when a
multi-queue device has fewer queues than CPUs, then doing the same search as
for non-managed interrupts makes sense as it puts the interrupt on the
least interrupt-plagued CPU. For single-CPU affine vectors that's obviously
a NOOP.

Restructure the matrix allocation code so it does the 'best CPU' search, add
the sanity check for an empty affinity mask and adapt the call site in the
x86 vector management code.

[ tglx: Added the empty mask check to the core and improved change log ]

Signed-off-by: Dou Liyang &lt;douly.fnst@cn.fujitsu.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180908175838.14450-2-dou_liyang@163.com

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Linux spreads out the non-managed interrupts across the possible target CPUs
to avoid vector space exhaustion.

Managed interrupts are treated differently, as for them the vectors are
reserved (with guarantee) when the interrupt descriptors are initialized.

When the interrupt is requested a real vector is assigned. The assignment
logic uses the first CPU in the affinity mask for assignment. If the
interrupt has more than one CPU in the affinity mask, which happens when a
multi-queue device has fewer queues than CPUs, then doing the same search as
for non-managed interrupts makes sense as it puts the interrupt on the
least interrupt-plagued CPU. For single-CPU affine vectors that's obviously
a NOOP.

Restructure the matrix allocation code so it does the 'best CPU' search, add
the sanity check for an empty affinity mask and adapt the call site in the
x86 vector management code.

[ tglx: Added the empty mask check to the core and improved change log ]

Signed-off-by: Dou Liyang &lt;douly.fnst@cn.fujitsu.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/20180908175838.14450-2-dou_liyang@163.com

</pre>
</div>
</content>
</entry>
</feed>
