<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/net/ethernet/intel, branch v6.6</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>iavf: in iavf_down, disable queues when removing the driver</title>
<updated>2023-10-26T00:48:31+00:00</updated>
<author>
<name>Michal Schmidt</name>
<email>mschmidt@redhat.com</email>
</author>
<published>2023-10-25T18:32:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=53798666648af3aa0dd512c2380576627237a800'/>
<id>53798666648af3aa0dd512c2380576627237a800</id>
<content type='text'>
In iavf_down, we're skipping the scheduling of certain operations if
the driver is being removed. However, the IAVF_FLAG_AQ_DISABLE_QUEUES
request must not be skipped in this case, because iavf_close waits
for the transition to the __IAVF_DOWN state, which happens in
iavf_virtchnl_completion after the queues are released.

Without this fix, "rmmod iavf" takes half a second per interface that's
up and prints the "Device resources not yet released" warning.

Fixes: c8de44b577eb ("iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set")
Signed-off-by: Michal Schmidt &lt;mschmidt@redhat.com&gt;
Reviewed-by: Wojciech Drewek &lt;wojciech.drewek@intel.com&gt;
Tested-by: Rafal Romanowski &lt;rafal.romanowski@intel.com&gt;
Tested-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20231025183213.874283-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In iavf_down, we're skipping the scheduling of certain operations if
the driver is being removed. However, the IAVF_FLAG_AQ_DISABLE_QUEUES
request must not be skipped in this case, because iavf_close waits
for the transition to the __IAVF_DOWN state, which happens in
iavf_virtchnl_completion after the queues are released.

Without this fix, "rmmod iavf" takes half a second per interface that's
up and prints the "Device resources not yet released" warning.

Fixes: c8de44b577eb ("iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set")
Signed-off-by: Michal Schmidt &lt;mschmidt@redhat.com&gt;
Reviewed-by: Wojciech Drewek &lt;wojciech.drewek@intel.com&gt;
Tested-by: Rafal Romanowski &lt;rafal.romanowski@intel.com&gt;
Tested-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20231025183213.874283-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>i40e: Fix wrong check for I40E_TXR_FLAGS_WB_ON_ITR</title>
<updated>2023-10-25T00:03:29+00:00</updated>
<author>
<name>Ivan Vecera</name>
<email>ivecera@redhat.com</email>
</author>
<published>2023-10-23T21:27:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=77a8c982ff0d4c3a14022c6fe9e3dbfb327552ec'/>
<id>77a8c982ff0d4c3a14022c6fe9e3dbfb327552ec</id>
<content type='text'>
The I40E_TXR_FLAGS_WB_ON_ITR is i40e_ring flag and not i40e_pf one.

Fixes: 8e0764b4d6be42 ("i40e/i40evf: Add support for writeback on ITR feature for X722")
Signed-off-by: Ivan Vecera &lt;ivecera@redhat.com&gt;
Tested-by: Pucha Himasekhar Reddy &lt;himasekharx.reddy.pucha@intel.com&gt; (A Contingent worker at Intel)
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20231023212714.178032-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The I40E_TXR_FLAGS_WB_ON_ITR is i40e_ring flag and not i40e_pf one.

Fixes: 8e0764b4d6be42 ("i40e/i40evf: Add support for writeback on ITR feature for X722")
Signed-off-by: Ivan Vecera &lt;ivecera@redhat.com&gt;
Tested-by: Pucha Himasekhar Reddy &lt;himasekharx.reddy.pucha@intel.com&gt; (A Contingent worker at Intel)
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20231023212714.178032-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>i40e: sync next_to_clean and next_to_process for programming status desc</title>
<updated>2023-10-21T01:17:11+00:00</updated>
<author>
<name>Tirthendu Sarkar</name>
<email>tirthendu.sarkar@intel.com</email>
</author>
<published>2023-10-19T20:38:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=068d8b75c1aee153193522211ace6c13c21cd16b'/>
<id>068d8b75c1aee153193522211ace6c13c21cd16b</id>
<content type='text'>
When a programming status desc is encountered on the rx_ring,
next_to_process is bumped along with cleaned_count but next_to_clean is
not. This causes I40E_DESC_UNUSED() macro to misbehave resulting in
overwriting whole ring with new buffers.

Update next_to_clean to point to next_to_process on seeing a programming
status desc if not in the middle of handling a multi-frag packet. Also,
bump cleaned_count only for such case as otherwise next_to_clean buffer
may be returned to hardware on reaching clean_threshold.

Fixes: e9031f2da1ae ("i40e: introduce next_to_process to i40e_ring")
Suggested-by: Maciej Fijalkowski &lt;maciej.fijalkowski@intel.com&gt;
Reported-by: hq.dev+kernel@msdfc.xyz
Reported by: Solomon Peachy &lt;pizza@shaftnet.org&gt;
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217678
Tested-by: hq.dev+kernel@msdfc.xyz
Tested by: Indrek Järve &lt;incx@dustbite.net&gt;
Signed-off-by: Tirthendu Sarkar &lt;tirthendu.sarkar@intel.com&gt;
Tested-by: Arpana Arland &lt;arpanax.arland@intel.com&gt; (A Contingent worker at Intel)
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Acked-by: Maciej Fijalkowski &lt;maciej.fijalkowski@intel.com&gt;
Link: https://lore.kernel.org/r/20231019203852.3663665-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a programming status desc is encountered on the rx_ring,
next_to_process is bumped along with cleaned_count but next_to_clean is
not. This causes I40E_DESC_UNUSED() macro to misbehave resulting in
overwriting whole ring with new buffers.

Update next_to_clean to point to next_to_process on seeing a programming
status desc if not in the middle of handling a multi-frag packet. Also,
bump cleaned_count only for such case as otherwise next_to_clean buffer
may be returned to hardware on reaching clean_threshold.

Fixes: e9031f2da1ae ("i40e: introduce next_to_process to i40e_ring")
Suggested-by: Maciej Fijalkowski &lt;maciej.fijalkowski@intel.com&gt;
Reported-by: hq.dev+kernel@msdfc.xyz
Reported by: Solomon Peachy &lt;pizza@shaftnet.org&gt;
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217678
Tested-by: hq.dev+kernel@msdfc.xyz
Tested by: Indrek Järve &lt;incx@dustbite.net&gt;
Signed-off-by: Tirthendu Sarkar &lt;tirthendu.sarkar@intel.com&gt;
Tested-by: Arpana Arland &lt;arpanax.arland@intel.com&gt; (A Contingent worker at Intel)
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Acked-by: Maciej Fijalkowski &lt;maciej.fijalkowski@intel.com&gt;
Link: https://lore.kernel.org/r/20231019203852.3663665-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>igc: Fix ambiguity in the ethtool advertising</title>
<updated>2023-10-21T01:15:38+00:00</updated>
<author>
<name>Sasha Neftin</name>
<email>sasha.neftin@intel.com</email>
</author>
<published>2023-10-19T20:36:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e7684d29efdf37304c62bb337ea55b3428ca118e'/>
<id>e7684d29efdf37304c62bb337ea55b3428ca118e</id>
<content type='text'>
The 'ethtool_convert_link_mode_to_legacy_u32' method does not allow us to
advertise 2500M speed support and TP (twisted pair) properly. Convert to
'ethtool_link_ksettings_test_link_mode' to advertise supported speed and
eliminate ambiguity.

Fixes: 8c5ad0dae93c ("igc: Add ethtool support")
Suggested-by: Dima Ruinskiy &lt;dima.ruinskiy@intel.com&gt;
Suggested-by: Vitaly Lifshits &lt;vitaly.lifshits@intel.com&gt;
Signed-off-by: Sasha Neftin &lt;sasha.neftin@intel.com&gt;
Tested-by: Naama Meir &lt;naamax.meir@linux.intel.com&gt;
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20231019203641.3661960-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The 'ethtool_convert_link_mode_to_legacy_u32' method does not allow us to
advertise 2500M speed support and TP (twisted pair) properly. Convert to
'ethtool_link_ksettings_test_link_mode' to advertise supported speed and
eliminate ambiguity.

Fixes: 8c5ad0dae93c ("igc: Add ethtool support")
Suggested-by: Dima Ruinskiy &lt;dima.ruinskiy@intel.com&gt;
Suggested-by: Vitaly Lifshits &lt;vitaly.lifshits@intel.com&gt;
Signed-off-by: Sasha Neftin &lt;sasha.neftin@intel.com&gt;
Tested-by: Naama Meir &lt;naamax.meir@linux.intel.com&gt;
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20231019203641.3661960-1-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>igb: Fix potential memory leak in igb_add_ethtool_nfc_entry</title>
<updated>2023-10-20T12:29:40+00:00</updated>
<author>
<name>Mateusz Palczewski</name>
<email>mateusz.palczewski@intel.com</email>
</author>
<published>2023-10-19T20:40:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8c0b48e01daba5ca58f939a8425855d3f4f2ed14'/>
<id>8c0b48e01daba5ca58f939a8425855d3f4f2ed14</id>
<content type='text'>
Add check for return of igb_update_ethtool_nfc_entry so that in case
of any potential errors the memory alocated for input will be freed.

Fixes: 0e71def25281 ("igb: add support of RX network flow classification")
Reviewed-by: Wojciech Drewek &lt;wojciech.drewek@intel.com&gt;
Signed-off-by: Mateusz Palczewski &lt;mateusz.palczewski@intel.com&gt;
Tested-by: Arpana Arland &lt;arpanax.arland@intel.com&gt; (A Contingent worker at Intel)
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add check for return of igb_update_ethtool_nfc_entry so that in case
of any potential errors the memory alocated for input will be freed.

Fixes: 0e71def25281 ("igb: add support of RX network flow classification")
Reviewed-by: Wojciech Drewek &lt;wojciech.drewek@intel.com&gt;
Signed-off-by: Mateusz Palczewski &lt;mateusz.palczewski@intel.com&gt;
Tested-by: Arpana Arland &lt;arpanax.arland@intel.com&gt; (A Contingent worker at Intel)
Signed-off-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>i40e: Fix I40E_FLAG_VF_VLAN_PRUNING value</title>
<updated>2023-10-20T11:49:45+00:00</updated>
<author>
<name>Ivan Vecera</name>
<email>ivecera@redhat.com</email>
</author>
<published>2023-10-19T16:37:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=665e7d83c5386f9abdc67b2e4b6e6d9579aadfcb'/>
<id>665e7d83c5386f9abdc67b2e4b6e6d9579aadfcb</id>
<content type='text'>
Commit c87c938f62d8f1 ("i40e: Add VF VLAN pruning") added new
PF flag I40E_FLAG_VF_VLAN_PRUNING but its value collides with
existing I40E_FLAG_TOTAL_PORT_SHUTDOWN_ENABLED flag.

Move the affected flag at the end of the flags and fix its value.

Reproducer:
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close on
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning on
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off
[ 6323.142585] i40e 0000:02:00.0: Setting link-down-on-close not supported on this port (because total-port-shutdown is enabled)
netlink error: Operation not supported
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning off
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off

The link-down-on-close flag cannot be modified after setting vf-vlan-pruning
because vf-vlan-pruning shares the same bit with total-port-shutdown flag
that prevents any modification of link-down-on-close flag.

Fixes: c87c938f62d8 ("i40e: Add VF VLAN pruning")
Cc: Mateusz Palczewski &lt;mateusz.palczewski@intel.com&gt;
Cc: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: Ivan Vecera &lt;ivecera@redhat.com&gt;
Reviewed-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Tested-by: Pucha Himasekhar Reddy &lt;himasekharx.reddy.pucha@intel.com&gt; (A Contingent worker at Intel)
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit c87c938f62d8f1 ("i40e: Add VF VLAN pruning") added new
PF flag I40E_FLAG_VF_VLAN_PRUNING but its value collides with
existing I40E_FLAG_TOTAL_PORT_SHUTDOWN_ENABLED flag.

Move the affected flag at the end of the flags and fix its value.

Reproducer:
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close on
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning on
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off
[ 6323.142585] i40e 0000:02:00.0: Setting link-down-on-close not supported on this port (because total-port-shutdown is enabled)
netlink error: Operation not supported
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 vf-vlan-pruning off
[root@cnb-03 ~]# ethtool --set-priv-flags enp2s0f0np0 link-down-on-close off

The link-down-on-close flag cannot be modified after setting vf-vlan-pruning
because vf-vlan-pruning shares the same bit with total-port-shutdown flag
that prevents any modification of link-down-on-close flag.

Fixes: c87c938f62d8 ("i40e: Add VF VLAN pruning")
Cc: Mateusz Palczewski &lt;mateusz.palczewski@intel.com&gt;
Cc: Simon Horman &lt;horms@kernel.org&gt;
Signed-off-by: Ivan Vecera &lt;ivecera@redhat.com&gt;
Reviewed-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Tested-by: Pucha Himasekhar Reddy &lt;himasekharx.reddy.pucha@intel.com&gt; (A Contingent worker at Intel)
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>iavf: initialize waitqueues before starting watchdog_task</title>
<updated>2023-10-20T10:45:07+00:00</updated>
<author>
<name>Michal Schmidt</name>
<email>mschmidt@redhat.com</email>
</author>
<published>2023-10-19T07:13:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7db3111043885c146e795c199d39c3f9042d97c0'/>
<id>7db3111043885c146e795c199d39c3f9042d97c0</id>
<content type='text'>
It is not safe to initialize the waitqueues after queueing the
watchdog_task. It will be using them.

The chance of this causing a real problem is very small, because
there will be some sleeping before any of the waitqueues get used.
I got a crash only after inserting an artificial sleep in iavf_probe.

Queue the watchdog_task as the last step in iavf_probe. Add a comment to
prevent repeating the mistake.

Fixes: fe2647ab0c99 ("i40evf: prevent VF close returning before state transitions to DOWN")
Signed-off-by: Michal Schmidt &lt;mschmidt@redhat.com&gt;
Reviewed-by: Paul Menzel &lt;pmenzel@molgen.mpg.de&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It is not safe to initialize the waitqueues after queueing the
watchdog_task. It will be using them.

The chance of this causing a real problem is very small, because
there will be some sleeping before any of the waitqueues get used.
I got a crash only after inserting an artificial sleep in iavf_probe.

Queue the watchdog_task as the last step in iavf_probe. Add a comment to
prevent repeating the mistake.

Fixes: fe2647ab0c99 ("i40evf: prevent VF close returning before state transitions to DOWN")
Signed-off-by: Michal Schmidt &lt;mschmidt@redhat.com&gt;
Reviewed-by: Paul Menzel &lt;pmenzel@molgen.mpg.de&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>i40e: xsk: remove count_mask</title>
<updated>2023-10-20T00:27:21+00:00</updated>
<author>
<name>Maciej Fijalkowski</name>
<email>maciej.fijalkowski@intel.com</email>
</author>
<published>2023-10-18T16:39:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=913eda2b08cc49d31f382579e2be34c2709eb789'/>
<id>913eda2b08cc49d31f382579e2be34c2709eb789</id>
<content type='text'>
Cited commit introduced a neat way of updating next_to_clean that does
not require boundary checks on each increment. This was done by masking
the new value with (ring length - 1) mask. Problem is that this is
applicable only for power of 2 ring sizes, for every other size this
assumption can not be made. In turn, it leads to cleaning descriptors
out of order as well as splats:

[ 1388.411915] Workqueue: events xp_release_deferred
[ 1388.411919] RIP: 0010:xp_free+0x1a/0x50
[ 1388.411921] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 8b 57 70 48 8d 47 70 48 89 e5 48 39 d0 74 06 &lt;5d&gt; c3 cc cc cc cc 48 8b 57 60 83 82 b8 00 00 00 01 48 8b 57 60 48
[ 1388.411922] RSP: 0018:ffa0000000a83cb0 EFLAGS: 00000206
[ 1388.411923] RAX: ff11000119aa5030 RBX: 000000000000001d RCX: ff110001129b6e50
[ 1388.411924] RDX: ff11000119aa4fa0 RSI: 0000000055555554 RDI: ff11000119aa4fc0
[ 1388.411925] RBP: ffa0000000a83cb0 R08: 0000000000000000 R09: 0000000000000000
[ 1388.411926] R10: 0000000000000001 R11: 0000000000000000 R12: ff11000115829b80
[ 1388.411927] R13: 000000000000005f R14: 0000000000000000 R15: ff11000119aa4fc0
[ 1388.411928] FS:  0000000000000000(0000) GS:ff11000277e00000(0000) knlGS:0000000000000000
[ 1388.411929] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1388.411930] CR2: 00007f1f564e6c14 CR3: 000000000783c005 CR4: 0000000000771ef0
[ 1388.411931] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1388.411931] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1388.411932] PKRU: 55555554
[ 1388.411933] Call Trace:
[ 1388.411934]  &lt;IRQ&gt;
[ 1388.411935]  ? show_regs+0x6e/0x80
[ 1388.411937]  ? watchdog_timer_fn+0x1d2/0x240
[ 1388.411939]  ? __pfx_watchdog_timer_fn+0x10/0x10
[ 1388.411941]  ? __hrtimer_run_queues+0x10e/0x290
[ 1388.411945]  ? clockevents_program_event+0xae/0x130
[ 1388.411947]  ? hrtimer_interrupt+0x105/0x240
[ 1388.411949]  ? __sysvec_apic_timer_interrupt+0x54/0x150
[ 1388.411952]  ? sysvec_apic_timer_interrupt+0x7f/0x90
[ 1388.411955]  &lt;/IRQ&gt;
[ 1388.411955]  &lt;TASK&gt;
[ 1388.411956]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[ 1388.411958]  ? xp_free+0x1a/0x50
[ 1388.411960]  i40e_xsk_clean_rx_ring+0x5d/0x100 [i40e]
[ 1388.411968]  i40e_clean_rx_ring+0x14c/0x170 [i40e]
[ 1388.411977]  i40e_queue_pair_disable+0xda/0x260 [i40e]
[ 1388.411986]  i40e_xsk_pool_setup+0x192/0x1d0 [i40e]
[ 1388.411993]  i40e_reconfig_rss_queues+0x1f0/0x1450 [i40e]
[ 1388.412002]  xp_disable_drv_zc+0x73/0xf0
[ 1388.412004]  ? mutex_lock+0x17/0x50
[ 1388.412007]  xp_release_deferred+0x2b/0xc0
[ 1388.412010]  process_one_work+0x178/0x350
[ 1388.412011]  ? __pfx_worker_thread+0x10/0x10
[ 1388.412012]  worker_thread+0x2f7/0x420
[ 1388.412014]  ? __pfx_worker_thread+0x10/0x10
[ 1388.412015]  kthread+0xf8/0x130
[ 1388.412017]  ? __pfx_kthread+0x10/0x10
[ 1388.412019]  ret_from_fork+0x3d/0x60
[ 1388.412021]  ? __pfx_kthread+0x10/0x10
[ 1388.412023]  ret_from_fork_asm+0x1b/0x30
[ 1388.412026]  &lt;/TASK&gt;

It comes from picking wrong ring entries when cleaning xsk buffers
during pool detach.

Remove the count_mask logic and use they boundary check when updating
next_to_process (which used to be a next_to_clean).

Fixes: c8a8ca3408dc ("i40e: remove unnecessary memory writes of the next to clean pointer")
Reported-by: Tushar Vyavahare &lt;tushar.vyavahare@intel.com&gt;
Tested-by: Tushar Vyavahare &lt;tushar.vyavahare@intel.com&gt;
Signed-off-by: Maciej Fijalkowski &lt;maciej.fijalkowski@intel.com&gt;
Reviewed-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20231018163908.40841-1-maciej.fijalkowski@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cited commit introduced a neat way of updating next_to_clean that does
not require boundary checks on each increment. This was done by masking
the new value with (ring length - 1) mask. Problem is that this is
applicable only for power of 2 ring sizes, for every other size this
assumption can not be made. In turn, it leads to cleaning descriptors
out of order as well as splats:

[ 1388.411915] Workqueue: events xp_release_deferred
[ 1388.411919] RIP: 0010:xp_free+0x1a/0x50
[ 1388.411921] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 8b 57 70 48 8d 47 70 48 89 e5 48 39 d0 74 06 &lt;5d&gt; c3 cc cc cc cc 48 8b 57 60 83 82 b8 00 00 00 01 48 8b 57 60 48
[ 1388.411922] RSP: 0018:ffa0000000a83cb0 EFLAGS: 00000206
[ 1388.411923] RAX: ff11000119aa5030 RBX: 000000000000001d RCX: ff110001129b6e50
[ 1388.411924] RDX: ff11000119aa4fa0 RSI: 0000000055555554 RDI: ff11000119aa4fc0
[ 1388.411925] RBP: ffa0000000a83cb0 R08: 0000000000000000 R09: 0000000000000000
[ 1388.411926] R10: 0000000000000001 R11: 0000000000000000 R12: ff11000115829b80
[ 1388.411927] R13: 000000000000005f R14: 0000000000000000 R15: ff11000119aa4fc0
[ 1388.411928] FS:  0000000000000000(0000) GS:ff11000277e00000(0000) knlGS:0000000000000000
[ 1388.411929] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1388.411930] CR2: 00007f1f564e6c14 CR3: 000000000783c005 CR4: 0000000000771ef0
[ 1388.411931] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1388.411931] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1388.411932] PKRU: 55555554
[ 1388.411933] Call Trace:
[ 1388.411934]  &lt;IRQ&gt;
[ 1388.411935]  ? show_regs+0x6e/0x80
[ 1388.411937]  ? watchdog_timer_fn+0x1d2/0x240
[ 1388.411939]  ? __pfx_watchdog_timer_fn+0x10/0x10
[ 1388.411941]  ? __hrtimer_run_queues+0x10e/0x290
[ 1388.411945]  ? clockevents_program_event+0xae/0x130
[ 1388.411947]  ? hrtimer_interrupt+0x105/0x240
[ 1388.411949]  ? __sysvec_apic_timer_interrupt+0x54/0x150
[ 1388.411952]  ? sysvec_apic_timer_interrupt+0x7f/0x90
[ 1388.411955]  &lt;/IRQ&gt;
[ 1388.411955]  &lt;TASK&gt;
[ 1388.411956]  ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[ 1388.411958]  ? xp_free+0x1a/0x50
[ 1388.411960]  i40e_xsk_clean_rx_ring+0x5d/0x100 [i40e]
[ 1388.411968]  i40e_clean_rx_ring+0x14c/0x170 [i40e]
[ 1388.411977]  i40e_queue_pair_disable+0xda/0x260 [i40e]
[ 1388.411986]  i40e_xsk_pool_setup+0x192/0x1d0 [i40e]
[ 1388.411993]  i40e_reconfig_rss_queues+0x1f0/0x1450 [i40e]
[ 1388.412002]  xp_disable_drv_zc+0x73/0xf0
[ 1388.412004]  ? mutex_lock+0x17/0x50
[ 1388.412007]  xp_release_deferred+0x2b/0xc0
[ 1388.412010]  process_one_work+0x178/0x350
[ 1388.412011]  ? __pfx_worker_thread+0x10/0x10
[ 1388.412012]  worker_thread+0x2f7/0x420
[ 1388.412014]  ? __pfx_worker_thread+0x10/0x10
[ 1388.412015]  kthread+0xf8/0x130
[ 1388.412017]  ? __pfx_kthread+0x10/0x10
[ 1388.412019]  ret_from_fork+0x3d/0x60
[ 1388.412021]  ? __pfx_kthread+0x10/0x10
[ 1388.412023]  ret_from_fork_asm+0x1b/0x30
[ 1388.412026]  &lt;/TASK&gt;

It comes from picking wrong ring entries when cleaning xsk buffers
during pool detach.

Remove the count_mask logic and use they boundary check when updating
next_to_process (which used to be a next_to_clean).

Fixes: c8a8ca3408dc ("i40e: remove unnecessary memory writes of the next to clean pointer")
Reported-by: Tushar Vyavahare &lt;tushar.vyavahare@intel.com&gt;
Tested-by: Tushar Vyavahare &lt;tushar.vyavahare@intel.com&gt;
Signed-off-by: Maciej Fijalkowski &lt;maciej.fijalkowski@intel.com&gt;
Reviewed-by: Jacob Keller &lt;jacob.e.keller@intel.com&gt;
Link: https://lore.kernel.org/r/20231018163908.40841-1-maciej.fijalkowski@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ice: Fix safe mode when DDP is missing</title>
<updated>2023-10-14T00:57:05+00:00</updated>
<author>
<name>Mateusz Pacuszka</name>
<email>mateuszx.pacuszka@intel.com</email>
</author>
<published>2023-10-11T23:33:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=42066c4d5d344cdf8564556cdbe0aa36854fefa4'/>
<id>42066c4d5d344cdf8564556cdbe0aa36854fefa4</id>
<content type='text'>
One thing is broken in the safe mode, that is
ice_deinit_features() is being executed even
that ice_init_features() was not causing stack
trace during pci_unregister_driver().

Add check on the top of the function.

Fixes: 5b246e533d01 ("ice: split probe into smaller functions")
Signed-off-by: Mateusz Pacuszka &lt;mateuszx.pacuszka@intel.com&gt;
Signed-off-by: Jan Sokolowski &lt;jan.sokolowski@intel.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Tested-by: Pucha Himasekhar Reddy &lt;himasekharx.reddy.pucha@intel.com&gt; (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-4-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
One thing is broken in the safe mode, that is
ice_deinit_features() is being executed even
that ice_init_features() was not causing stack
trace during pci_unregister_driver().

Add check on the top of the function.

Fixes: 5b246e533d01 ("ice: split probe into smaller functions")
Signed-off-by: Mateusz Pacuszka &lt;mateuszx.pacuszka@intel.com&gt;
Signed-off-by: Jan Sokolowski &lt;jan.sokolowski@intel.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Tested-by: Pucha Himasekhar Reddy &lt;himasekharx.reddy.pucha@intel.com&gt; (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-4-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ice: reset first in crash dump kernels</title>
<updated>2023-10-14T00:57:05+00:00</updated>
<author>
<name>Jesse Brandeburg</name>
<email>jesse.brandeburg@intel.com</email>
</author>
<published>2023-10-11T23:33:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0288c3e709e5fabd51e84715c5c798a02f43061a'/>
<id>0288c3e709e5fabd51e84715c5c798a02f43061a</id>
<content type='text'>
When the system boots into the crash dump kernel after a panic, the ice
networking device may still have pending transactions that can cause errors
or machine checks when the device is re-enabled. This can prevent the crash
dump kernel from loading the driver or collecting the crash data.

To avoid this issue, perform a function level reset (FLR) on the ice device
via PCIe config space before enabling it on the crash kernel. This will
clear any outstanding transactions and stop all queues and interrupts.
Restore the config space after the FLR, otherwise it was found in testing
that the driver wouldn't load successfully.

The following sequence causes the original issue:
- Load the ice driver with modprobe ice
- Enable SR-IOV with 2 VFs: echo 2 &gt; /sys/class/net/eth0/device/sriov_num_vfs
- Trigger a crash with echo c &gt; /proc/sysrq-trigger
- Load the ice driver again (or let it load automatically) with modprobe ice
- The system crashes again during pcim_enable_device()

Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
Reported-by: Vishal Agrawal &lt;vagrawal@redhat.com&gt;
Reviewed-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Signed-off-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Tested-by: Pucha Himasekhar Reddy &lt;himasekharx.reddy.pucha@intel.com&gt; (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When the system boots into the crash dump kernel after a panic, the ice
networking device may still have pending transactions that can cause errors
or machine checks when the device is re-enabled. This can prevent the crash
dump kernel from loading the driver or collecting the crash data.

To avoid this issue, perform a function level reset (FLR) on the ice device
via PCIe config space before enabling it on the crash kernel. This will
clear any outstanding transactions and stop all queues and interrupts.
Restore the config space after the FLR, otherwise it was found in testing
that the driver wouldn't load successfully.

The following sequence causes the original issue:
- Load the ice driver with modprobe ice
- Enable SR-IOV with 2 VFs: echo 2 &gt; /sys/class/net/eth0/device/sriov_num_vfs
- Trigger a crash with echo c &gt; /proc/sysrq-trigger
- Load the ice driver again (or let it load automatically) with modprobe ice
- The system crashes again during pcim_enable_device()

Fixes: 837f08fdecbe ("ice: Add basic driver framework for Intel(R) E800 Series")
Reported-by: Vishal Agrawal &lt;vagrawal@redhat.com&gt;
Reviewed-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Reviewed-by: Przemek Kitszel &lt;przemyslaw.kitszel@intel.com&gt;
Signed-off-by: Jesse Brandeburg &lt;jesse.brandeburg@intel.com&gt;
Tested-by: Pucha Himasekhar Reddy &lt;himasekharx.reddy.pucha@intel.com&gt; (A Contingent worker at Intel)
Link: https://lore.kernel.org/r/20231011233334.336092-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
