<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/include/linux/mlx5, branch v6.4</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma</title>
<updated>2023-06-16T03:13:56+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-06-16T03:13:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=93fd8eb053800a241d09c00ef075cae0b5b03ecf'/>
<id>93fd8eb053800a241d09c00ef075cae0b5b03ecf</id>
<content type='text'>
Pull rdma fixes from Jason Gunthorpe:
 "This is an unusually large bunch of bug fixes for the later rc cycle,
  rxe and mlx5 both dumped a lot of things at once. rxe continues to fix
  itself, and mlx5 is fixing a bunch of "queue counters" related bugs.

  There is one highly notable bug fix regarding the qkey. This small
  security check was missed in the original 2005 implementation and it
  allows some significant issues.

  Summary:

   - Two rtrs bug fixes for error unwind bugs

   - Several rxe bug fixes:
      * Incorrect Rx packet validation
      * Using memory without a refcount
      * Syzkaller found use before initialization
      * Regression fix for missing locking with the tasklet conversion
        from this merge window

   - Have bnxt report the correct link properties to userspace, this was
     a regression in v6.3

   - Several mlx5 bug fixes:
      * Kernel crash triggerable by userspace for the RAW ethernet
        profile
      * Defend against steering refcounting issues created by userspace
      * Incorrect change of QP port affinity parameters in some LAG
        configurations

   - Fix mlx5 Q counters:
      * Do not over allocate Q counters to allow userspace to use the
        full port capacity
      * Kernel crash triggered by eswitch due to mis-use of Q counters
      * Incorrect mlx5_device for Q counters in some LAG configurations

   - Properly implement the IBA spec restricting privileged qkeys to
     root

   - Always an error when reading from a disassociated device's event
     queue

   - isert bug fixes:
      * Avoid a deadlock with the CM handler and CM ID destruction
      * Correct list corruption due to incorrect locking
      * Fix a use after free around connection tear down"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/rxe: Fix rxe_cq_post
  IB/isert: Fix incorrect release of isert connection
  IB/isert: Fix possible list corruption in CMA handler
  IB/isert: Fix dead lock in ib_isert
  RDMA/mlx5: Fix affinity assignment
  IB/uverbs: Fix to consider event queue closing also upon non-blocking mode
  RDMA/uverbs: Restrict usage of privileged QKEYs
  RDMA/cma: Always set static rate to 0 for RoCE
  RDMA/mlx5: Fix Q-counters query in LAG mode
  RDMA/mlx5: Remove vport Q-counters dependency on normal Q-counters
  RDMA/mlx5: Fix Q-counters per vport allocation
  RDMA/mlx5: Create an indirect flow table for steering anchor
  RDMA/mlx5: Initiate dropless RQ for RAW Ethernet functions
  RDMA/rxe: Fix the use-before-initialization error of resp_pkts
  RDMA/bnxt_re: Fix reporting active_{speed,width} attributes
  RDMA/rxe: Fix ref count error in check_rkey()
  RDMA/rxe: Fix packet length checks
  RDMA/rtrs: Fix rxe_dealloc_pd warning
  RDMA/rtrs: Fix the last iu-&gt;buf leak in err path
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull rdma fixes from Jason Gunthorpe:
 "This is an unusually large bunch of bug fixes for the later rc cycle,
  rxe and mlx5 both dumped a lot of things at once. rxe continues to fix
  itself, and mlx5 is fixing a bunch of "queue counters" related bugs.

  There is one highly notable bug fix regarding the qkey. This small
  security check was missed in the original 2005 implementation and it
  allows some significant issues.

  Summary:

   - Two rtrs bug fixes for error unwind bugs

   - Several rxe bug fixes:
      * Incorrect Rx packet validation
      * Using memory without a refcount
      * Syzkaller found use before initialization
      * Regression fix for missing locking with the tasklet conversion
        from this merge window

   - Have bnxt report the correct link properties to userspace, this was
     a regression in v6.3

   - Several mlx5 bug fixes:
      * Kernel crash triggerable by userspace for the RAW ethernet
        profile
      * Defend against steering refcounting issues created by userspace
      * Incorrect change of QP port affinity parameters in some LAG
        configurations

   - Fix mlx5 Q counters:
      * Do not over allocate Q counters to allow userspace to use the
        full port capacity
      * Kernel crash triggered by eswitch due to mis-use of Q counters
      * Incorrect mlx5_device for Q counters in some LAG configurations

   - Properly implement the IBA spec restricting privileged qkeys to
     root

   - Always an error when reading from a disassociated device's event
     queue

   - isert bug fixes:
      * Avoid a deadlock with the CM handler and CM ID destruction
      * Correct list corruption due to incorrect locking
      * Fix a use after free around connection tear down"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/rxe: Fix rxe_cq_post
  IB/isert: Fix incorrect release of isert connection
  IB/isert: Fix possible list corruption in CMA handler
  IB/isert: Fix dead lock in ib_isert
  RDMA/mlx5: Fix affinity assignment
  IB/uverbs: Fix to consider event queue closing also upon non-blocking mode
  RDMA/uverbs: Restrict usage of privileged QKEYs
  RDMA/cma: Always set static rate to 0 for RoCE
  RDMA/mlx5: Fix Q-counters query in LAG mode
  RDMA/mlx5: Remove vport Q-counters dependency on normal Q-counters
  RDMA/mlx5: Fix Q-counters per vport allocation
  RDMA/mlx5: Create an indirect flow table for steering anchor
  RDMA/mlx5: Initiate dropless RQ for RAW Ethernet functions
  RDMA/rxe: Fix the use-before-initialization error of resp_pkts
  RDMA/bnxt_re: Fix reporting active_{speed,width} attributes
  RDMA/rxe: Fix ref count error in check_rkey()
  RDMA/rxe: Fix packet length checks
  RDMA/rtrs: Fix rxe_dealloc_pd warning
  RDMA/rtrs: Fix the last iu-&gt;buf leak in err path
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/mlx5: Fix affinity assignment</title>
<updated>2023-06-11T08:27:17+00:00</updated>
<author>
<name>Mark Bloch</name>
<email>mbloch@nvidia.com</email>
</author>
<published>2023-06-05T10:33:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=617f5db1a626f18d5cbb7c7faf7bf8f9ea12be78'/>
<id>617f5db1a626f18d5cbb7c7faf7bf8f9ea12be78</id>
<content type='text'>
The cited commit aimed to ensure that Virtual Functions (VFs) assign a
queue affinity to a Queue Pair (QP) to distribute traffic when
the LAG master creates a hardware LAG. If the affinity was set while
the hardware was not in LAG, the firmware would ignore the affinity value.

However, this commit unintentionally assigned an affinity to QPs on the LAG
master's VPORT even if the RDMA device was not marked as LAG-enabled.
In most cases, this was not an issue because when the hardware entered
hardware LAG configuration, the RDMA device of the LAG master would be
destroyed and a new one would be created, marked as LAG-enabled.

The problem arises when a user configures Equal-Cost Multipath (ECMP).
In ECMP mode, traffic can be directed to different physical ports based on
the queue affinity, which is intended for use by VPORTS other than the
E-Switch manager. ECMP mode is supported only if both E-Switch managers are
in switchdev mode and the appropriate route is configured via IP. In this
configuration, the RDMA device is not destroyed, and we retain the RDMA
device that is not marked as LAG-enabled.

To ensure correct behavior, Send Queues (SQs) opened by the E-Switch
manager through verbs should be assigned strict affinity. This means they
will only be able to communicate through the native physical port
associated with the E-Switch manager. This will prevent the firmware from
assigning affinity and will not allow the SQs to be remapped in case of
failover.

Fixes: 802dcc7fc5ec ("RDMA/mlx5: Support TX port affinity for VF drivers in LAG mode")
Reviewed-by: Maor Gottlieb &lt;maorg@nvidia.com&gt;
Signed-off-by: Mark Bloch &lt;mbloch@nvidia.com&gt;
Link: https://lore.kernel.org/r/425b05f4da840bc684b0f7e8ebf61aeb5cef09b0.1685960567.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The cited commit aimed to ensure that Virtual Functions (VFs) assign a
queue affinity to a Queue Pair (QP) to distribute traffic when
the LAG master creates a hardware LAG. If the affinity was set while
the hardware was not in LAG, the firmware would ignore the affinity value.

However, this commit unintentionally assigned an affinity to QPs on the LAG
master's VPORT even if the RDMA device was not marked as LAG-enabled.
In most cases, this was not an issue because when the hardware entered
hardware LAG configuration, the RDMA device of the LAG master would be
destroyed and a new one would be created, marked as LAG-enabled.

The problem arises when a user configures Equal-Cost Multipath (ECMP).
In ECMP mode, traffic can be directed to different physical ports based on
the queue affinity, which is intended for use by VPORTS other than the
E-Switch manager. ECMP mode is supported only if both E-Switch managers are
in switchdev mode and the appropriate route is configured via IP. In this
configuration, the RDMA device is not destroyed, and we retain the RDMA
device that is not marked as LAG-enabled.

To ensure correct behavior, Send Queues (SQs) opened by the E-Switch
manager through verbs should be assigned strict affinity. This means they
will only be able to communicate through the native physical port
associated with the E-Switch manager. This will prevent the firmware from
assigning affinity and will not allow the SQs to be remapped in case of
failover.

Fixes: 802dcc7fc5ec ("RDMA/mlx5: Support TX port affinity for VF drivers in LAG mode")
Reviewed-by: Maor Gottlieb &lt;maorg@nvidia.com&gt;
Signed-off-by: Mark Bloch &lt;mbloch@nvidia.com&gt;
Link: https://lore.kernel.org/r/425b05f4da840bc684b0f7e8ebf61aeb5cef09b0.1685960567.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx5e: Use query_special_contexts cmd only once per mdev</title>
<updated>2023-05-25T03:44:18+00:00</updated>
<author>
<name>Dragos Tatulea</name>
<email>dtatulea@nvidia.com</email>
</author>
<published>2023-04-13T12:48:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1db1f21caebbb1b6e9b1e7657df613616be3fb49'/>
<id>1db1f21caebbb1b6e9b1e7657df613616be3fb49</id>
<content type='text'>
Don't query the firmware so many times (num rqs * num wqes * wqe frags)
because it slows down linearly the interface creation time when the
product is larger. Do it only once per mdev and store the result in
mlx5e_param.

Due to helper function being called from different files, move it to
an appropriate location. Rename the function with a proper prefix and
add a small cleanup.

This fix applies only for legacy rq.

Fixes: 1b1e4868836a ("net/mlx5e: Use query_special_contexts for mkeys")
Signed-off-by: Dragos Tatulea &lt;dtatulea@nvidia.com&gt;
Reviewed-by: Or Har-Toov &lt;ohartoov@nvidia.com&gt;
Reviewed-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Don't query the firmware so many times (num rqs * num wqes * wqe frags)
because it slows down linearly the interface creation time when the
product is larger. Do it only once per mdev and store the result in
mlx5e_param.

Due to helper function being called from different files, move it to
an appropriate location. Rename the function with a proper prefix and
add a small cleanup.

This fix applies only for legacy rq.

Fixes: 1b1e4868836a ("net/mlx5e: Use query_special_contexts for mkeys")
Signed-off-by: Dragos Tatulea &lt;dtatulea@nvidia.com&gt;
Reviewed-by: Or Har-Toov &lt;ohartoov@nvidia.com&gt;
Reviewed-by: Tariq Toukan &lt;tariqt@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx5: DR, Check force-loopback RC QP capability independently from RoCE</title>
<updated>2023-05-23T05:38:05+00:00</updated>
<author>
<name>Yevgeny Kliteynik</name>
<email>kliteyn@nvidia.com</email>
</author>
<published>2023-04-02T14:14:10+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c7dd225bc224726c22db08e680bf787f60ebdee3'/>
<id>c7dd225bc224726c22db08e680bf787f60ebdee3</id>
<content type='text'>
SW Steering uses RC QP for writing STEs to ICM. This writingis done in LB
(loopback), and FL (force-loopback) QP is preferred for performance. FL is
available when RoCE is enabled or disabled based on RoCE caps.
This patch adds reading of FL capability from HCA caps in addition to the
existing reading from RoCE caps, thus fixing the case where we didn't
have loopback enabled when RoCE was disabled.

Fixes: 7304d603a57a ("net/mlx5: DR, Add support for force-loopback QP")
Signed-off-by: Itamar Gozlan &lt;igozlan@nvidia.com&gt;
Signed-off-by: Yevgeny Kliteynik &lt;kliteyn@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
SW Steering uses RC QP for writing STEs to ICM. This writingis done in LB
(loopback), and FL (force-loopback) QP is preferred for performance. FL is
available when RoCE is enabled or disabled based on RoCE caps.
This patch adds reading of FL capability from HCA caps in addition to the
existing reading from RoCE caps, thus fixing the case where we didn't
have loopback enabled when RoCE was disabled.

Fixes: 7304d603a57a ("net/mlx5: DR, Add support for force-loopback QP")
Signed-off-by: Itamar Gozlan &lt;igozlan@nvidia.com&gt;
Signed-off-by: Yevgeny Kliteynik &lt;kliteyn@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma</title>
<updated>2023-04-30T00:21:24+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-30T00:21:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=af3877265dd88d7e333f94fb37bc09554544adca'/>
<id>af3877265dd88d7e333f94fb37bc09554544adca</id>
<content type='text'>
Pull rdma updates from Jason Gunthorpe:
 "Usual wide collection of unrelated items in drivers:

   - Driver bug fixes and treewide cleanups in hfi1, siw, qib, mlx5,
     rxe, usnic, usnic, bnxt_re, ocrdma, iser:
       - remove unnecessary NULL checks
       - kmap obsolescence
       - pci_enable_pcie_error_reporting() obsolescence
       - unused variables and macros
       - trace event related warnings
       - casting warnings

   - Code cleanups for irdm and erdma

   - EFA reporting of 128 byte PCIe TLP support

   - mlx5 more agressively uses the out of order HW feature

   - Big rework of how state machines and tasks work in rxe

   - Fix a syzkaller found crash netdev refcount leak in siw

   - bnxt_re revises their HW description header

   - Congestion control for bnxt_re

   - Use mmu_notifiers more safely in hfi1

   - mlx5 gets better support for PCIe relaxed ordering inside VMs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (81 commits)
  RDMA/efa: Add rdma write capability to device caps
  RDMA/mlx5: Use correct device num_ports when modify DC
  RDMA/irdma: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() call
  RDMA/rxe: Fix spinlock recursion deadlock on requester
  RDMA/mlx5: Fix flow counter query via DEVX
  RDMA/rxe: Protect QP state with qp-&gt;state_lock
  RDMA/rxe: Move code to check if drained to subroutine
  RDMA/rxe: Remove qp-&gt;req.state
  RDMA/rxe: Remove qp-&gt;comp.state
  RDMA/rxe: Remove qp-&gt;resp.state
  RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
  net/mlx5: Update relaxed ordering read HCA capabilities
  RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
  RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
  RDMA: Add ib_virt_dma_to_page()
  RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  RDMA/irdma: Slightly optimize irdma_form_ah_cm_frame()
  RDMA/rxe: Fix incorrect TASKLET_STATE_SCHED check in rxe_task.c
  IB/hfi1: Place struct mmu_rb_handler on cache line start
  IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull rdma updates from Jason Gunthorpe:
 "Usual wide collection of unrelated items in drivers:

   - Driver bug fixes and treewide cleanups in hfi1, siw, qib, mlx5,
     rxe, usnic, usnic, bnxt_re, ocrdma, iser:
       - remove unnecessary NULL checks
       - kmap obsolescence
       - pci_enable_pcie_error_reporting() obsolescence
       - unused variables and macros
       - trace event related warnings
       - casting warnings

   - Code cleanups for irdm and erdma

   - EFA reporting of 128 byte PCIe TLP support

   - mlx5 more agressively uses the out of order HW feature

   - Big rework of how state machines and tasks work in rxe

   - Fix a syzkaller found crash netdev refcount leak in siw

   - bnxt_re revises their HW description header

   - Congestion control for bnxt_re

   - Use mmu_notifiers more safely in hfi1

   - mlx5 gets better support for PCIe relaxed ordering inside VMs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (81 commits)
  RDMA/efa: Add rdma write capability to device caps
  RDMA/mlx5: Use correct device num_ports when modify DC
  RDMA/irdma: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() call
  RDMA/rxe: Fix spinlock recursion deadlock on requester
  RDMA/mlx5: Fix flow counter query via DEVX
  RDMA/rxe: Protect QP state with qp-&gt;state_lock
  RDMA/rxe: Move code to check if drained to subroutine
  RDMA/rxe: Remove qp-&gt;req.state
  RDMA/rxe: Remove qp-&gt;comp.state
  RDMA/rxe: Remove qp-&gt;resp.state
  RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
  net/mlx5: Update relaxed ordering read HCA capabilities
  RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
  RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
  RDMA: Add ib_virt_dma_to_page()
  RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  RDMA/irdma: Slightly optimize irdma_form_ah_cm_frame()
  RDMA/rxe: Fix incorrect TASKLET_STATE_SCHED check in rxe_task.c
  IB/hfi1: Place struct mmu_rb_handler on cache line start
  IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux</title>
<updated>2023-04-22T03:47:05+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-04-22T03:47:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fbc1449d385d65be49a8d164dfd3772f2cb049ae'/>
<id>fbc1449d385d65be49a8d164dfd3772f2cb049ae</id>
<content type='text'>
Saeed Mahameed says:

====================
mlx5-updates-2023-04-20

1) Dragos Improves RX page pool, and provides some fixes to his previous
   series:
 1.1) Fix releasing page_pool for striding RQ and legacy RQ nonlinear case
 1.2) Hook NAPIs to page pools to gain more performance.

2) From Roi, Some cleanups to TC and eswitch modules.

3) Maher migrates vnic diagnostic counters reporting from debugfs to a
    dedicated devlink health reporter

Maher Says:
===========
 net/mlx5: Expose vnic diagnostic counters using devlink

Currently, vnic diagnostic counters are exposed through the following
debugfs:

$ ls /sys/kernel/debug/mlx5/0000:08:00.0/esw/vf_0/vnic_diag/
cq_overrun
quota_exceeded_command
total_q_under_processor_handle
invalid_command
send_queue_priority_update_flow
nic_receive_steering_discard

The current design does not allow the hypervisor to view the diagnostic
counters of its VFs, in case the VFs get bound to a VM. In other words,
the counters are not exposed for representor interfaces.
Furthermore, the debugfs design is inconvenient future-wise, in case more
counters need to be reported by the driver in the future.

As these counters pertain to vNIC health, it is more appropriate to
utilize the devlink health reporter to expose them.

Thus, this patchest includes the following changes:

* Drop the current vnic diagnostic counters debugfs interface.
* Add a vnic devlink health reporter for PFs/VFs core devices, which
  when diagnosed will dump vnic diagnostic counter values that are
  queried from FW.
* Add a vnic devlink health reporter for the representor interface, which
  serves the same purpose listed in the previous point, in addition to
  allowing the hypervisor to view its VFs diagnostic counters, even when
  the VFs are bounded to external VMs.

Example of devlink health reporter usage is:
$devlink health diagnose pci/0000:08:00.0 reporter vnic
 vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0

===========

4) SW steering fixes and improvements

Yevgeny Kliteynik Says:
=======================
These short patch series are just small fixes / improvements for
SW steering:

 - Patch 1: Fix dumping of legacy modify_hdr in debug dump to
   align to what is expected by parser
 - Patch 2: Have separate threshold for ICM sync per ICM type
 - Patch 3: Add more info to the steering debug dump - Linux
   version and device name
 - Patch 4: Keep track of number of buddies that are currently
   in use per domain per buddy type

=======================

* tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Update op_mode to op_mod for port selection
  net/mlx5: E-Switch, Remove unused mlx5_esw_offloads_vport_metadata_set()
  net/mlx5: E-Switch, Remove redundant dev arg from mlx5_esw_vport_alloc()
  net/mlx5: Include linux/pci.h for pci_msix_can_alloc_dyn()
  net/mlx5e: RX, Hook NAPIs to page pools
  net/mlx5e: RX, Fix XDP_TX page release for legacy rq nonlinear case
  net/mlx5e: RX, Fix releasing page_pool pages twice for striding RQ
  net/mlx5e: Add vnic devlink health reporter to representors
  net/mlx5: Add vnic devlink health reporter to PFs/VFs
  Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports"
  Revert "net/mlx5: Expose steering dropped packets counter"
  net/mlx5: DR, Add memory statistics for domain object
  net/mlx5: DR, Add more info in domain dbg dump
  net/mlx5: DR, Calculate sync threshold of each pool according to its type
  net/mlx5: DR, Fix dumping of legacy modify_hdr in debug dump
====================

Link: https://lore.kernel.org/r/20230421013850.349646-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Saeed Mahameed says:

====================
mlx5-updates-2023-04-20

1) Dragos Improves RX page pool, and provides some fixes to his previous
   series:
 1.1) Fix releasing page_pool for striding RQ and legacy RQ nonlinear case
 1.2) Hook NAPIs to page pools to gain more performance.

2) From Roi, Some cleanups to TC and eswitch modules.

3) Maher migrates vnic diagnostic counters reporting from debugfs to a
    dedicated devlink health reporter

Maher Says:
===========
 net/mlx5: Expose vnic diagnostic counters using devlink

Currently, vnic diagnostic counters are exposed through the following
debugfs:

$ ls /sys/kernel/debug/mlx5/0000:08:00.0/esw/vf_0/vnic_diag/
cq_overrun
quota_exceeded_command
total_q_under_processor_handle
invalid_command
send_queue_priority_update_flow
nic_receive_steering_discard

The current design does not allow the hypervisor to view the diagnostic
counters of its VFs, in case the VFs get bound to a VM. In other words,
the counters are not exposed for representor interfaces.
Furthermore, the debugfs design is inconvenient future-wise, in case more
counters need to be reported by the driver in the future.

As these counters pertain to vNIC health, it is more appropriate to
utilize the devlink health reporter to expose them.

Thus, this patchest includes the following changes:

* Drop the current vnic diagnostic counters debugfs interface.
* Add a vnic devlink health reporter for PFs/VFs core devices, which
  when diagnosed will dump vnic diagnostic counter values that are
  queried from FW.
* Add a vnic devlink health reporter for the representor interface, which
  serves the same purpose listed in the previous point, in addition to
  allowing the hypervisor to view its VFs diagnostic counters, even when
  the VFs are bounded to external VMs.

Example of devlink health reporter usage is:
$devlink health diagnose pci/0000:08:00.0 reporter vnic
 vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0

===========

4) SW steering fixes and improvements

Yevgeny Kliteynik Says:
=======================
These short patch series are just small fixes / improvements for
SW steering:

 - Patch 1: Fix dumping of legacy modify_hdr in debug dump to
   align to what is expected by parser
 - Patch 2: Have separate threshold for ICM sync per ICM type
 - Patch 3: Add more info to the steering debug dump - Linux
   version and device name
 - Patch 4: Keep track of number of buddies that are currently
   in use per domain per buddy type

=======================

* tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Update op_mode to op_mod for port selection
  net/mlx5: E-Switch, Remove unused mlx5_esw_offloads_vport_metadata_set()
  net/mlx5: E-Switch, Remove redundant dev arg from mlx5_esw_vport_alloc()
  net/mlx5: Include linux/pci.h for pci_msix_can_alloc_dyn()
  net/mlx5e: RX, Hook NAPIs to page pools
  net/mlx5e: RX, Fix XDP_TX page release for legacy rq nonlinear case
  net/mlx5e: RX, Fix releasing page_pool pages twice for striding RQ
  net/mlx5e: Add vnic devlink health reporter to representors
  net/mlx5: Add vnic devlink health reporter to PFs/VFs
  Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports"
  Revert "net/mlx5: Expose steering dropped packets counter"
  net/mlx5: DR, Add memory statistics for domain object
  net/mlx5: DR, Add more info in domain dbg dump
  net/mlx5: DR, Calculate sync threshold of each pool according to its type
  net/mlx5: DR, Fix dumping of legacy modify_hdr in debug dump
====================

Link: https://lore.kernel.org/r/20230421013850.349646-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx5: Update op_mode to op_mod for port selection</title>
<updated>2023-04-21T01:35:50+00:00</updated>
<author>
<name>Roi Dayan</name>
<email>roid@nvidia.com</email>
</author>
<published>2023-03-22T08:21:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f9c895a72a390656f9582e048fdcc3d2cec1dd7c'/>
<id>f9c895a72a390656f9582e048fdcc3d2cec1dd7c</id>
<content type='text'>
To be consistent with the other enum keys use OP_MOD
instead of OP_MODE.

Signed-off-by: Roi Dayan &lt;roid@nvidia.com&gt;
Reviewed-by: Maor Dickman &lt;maord@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To be consistent with the other enum keys use OP_MOD
instead of OP_MODE.

Signed-off-by: Roi Dayan &lt;roid@nvidia.com&gt;
Reviewed-by: Maor Dickman &lt;maord@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net/mlx5: Add vnic devlink health reporter to PFs/VFs</title>
<updated>2023-04-21T01:35:49+00:00</updated>
<author>
<name>Maher Sanalla</name>
<email>msanalla@nvidia.com</email>
</author>
<published>2023-03-20T22:10:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=b0bc615df488abd0e95107e4a9ecefb9bf8c250a'/>
<id>b0bc615df488abd0e95107e4a9ecefb9bf8c250a</id>
<content type='text'>
Create a vnic devlink health reporter for PFs/VFs interfaces.
The reporter's diagnose callback displays the values of vNIC/vport
transport debug counters of PFs/VFs, as follows:

$ devlink health diagnose pci/0000:08:00.0 reporter vnic
 vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0

Moreover, add documentation on the reporter functionality and the
counters description.

While at it, expose the vNIC counters diagnose function to be used by
the downstream patch, which will reveal the counters for representor
interfaces.

Signed-off-by: Maher Sanalla &lt;msanalla@nvidia.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Create a vnic devlink health reporter for PFs/VFs interfaces.
The reporter's diagnose callback displays the values of vNIC/vport
transport debug counters of PFs/VFs, as follows:

$ devlink health diagnose pci/0000:08:00.0 reporter vnic
 vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0

Moreover, add documentation on the reporter functionality and the
counters description.

While at it, expose the vNIC counters diagnose function to be used by
the downstream patch, which will reveal the counters for representor
interfaces.

Signed-off-by: Maher Sanalla &lt;msanalla@nvidia.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@nvidia.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@nvidia.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2023-04-20T23:29:51+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-04-20T23:27:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=681c5b51dc6b8ff1ec05555243eccf64a08cb2fd'/>
<id>681c5b51dc6b8ff1ec05555243eccf64a08cb2fd</id>
<content type='text'>
Adjacent changes:

net/mptcp/protocol.h
  63740448a32e ("mptcp: fix accept vs worker race")
  2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
  ddb1a072f858 ("mptcp: move first subflow allocation at mpc access time")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Adjacent changes:

net/mptcp/protocol.h
  63740448a32e ("mptcp: fix accept vs worker race")
  2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close")
  ddb1a072f858 ("mptcp: move first subflow allocation at mpc access time")

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "net/mlx5: Enable management PF initialization"</title>
<updated>2023-04-20T01:51:28+00:00</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2023-04-13T22:25:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f52cc627b832e08a7bcf1b7e81e650ec308fe1d8'/>
<id>f52cc627b832e08a7bcf1b7e81e650ec308fe1d8</id>
<content type='text'>
This reverts commit fe998a3c77b9f989a30a2a01fb00d3729a6d53a4.

Paul reports that it causes a regression with IB on CX4
and FW 12.18.1000. In addition I think that the concept
of "management PF" is not fully accepted and requires
a discussion.

Fixes: fe998a3c77b9 ("net/mlx5: Enable management PF initialization")
Reported-by: Paul Moore &lt;paul@paul-moore.com&gt;
Link: https://lore.kernel.org/all/CAHC9VhQ7A4+msL38WpbOMYjAqLp0EtOjeLh4Dc6SQtD6OUvCQg@mail.gmail.com/
Link: https://lore.kernel.org/r/20230413222547.56901-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit fe998a3c77b9f989a30a2a01fb00d3729a6d53a4.

Paul reports that it causes a regression with IB on CX4
and FW 12.18.1000. In addition I think that the concept
of "management PF" is not fully accepted and requires
a discussion.

Fixes: fe998a3c77b9 ("net/mlx5: Enable management PF initialization")
Reported-by: Paul Moore &lt;paul@paul-moore.com&gt;
Link: https://lore.kernel.org/all/CAHC9VhQ7A4+msL38WpbOMYjAqLp0EtOjeLh4Dc6SQtD6OUvCQg@mail.gmail.com/
Link: https://lore.kernel.org/r/20230413222547.56901-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
