<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/accel, branch v6.11</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Merge tag 'drm-habanalabs-next-2024-06-23' of https://github.com/HabanaAI/drivers.accel.habanalabs.kernel into drm-next</title>
<updated>2024-06-27T23:41:04+00:00</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2024-06-27T23:41:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fb625bf6187d97c3cd28d680b14bf80f84207e5a'/>
<id>fb625bf6187d97c3cd28d680b14bf80f84207e5a</id>
<content type='text'>
This tag contains habanalabs driver changes for v6.11.

The notable changes are:

- uAPI changes:
  - Use device-name directory in debugfs-driver-habanalabs.
  - Expose server type in debugfs.

- New features and improvements:
  - Gradual sleep in polling memory macro.
  - Reduce Gaudi2 MSI-X interrupt count to 128.
  - Add Gaudi2-D revision support.

- Firmware related changes:
  - Add timestamp to CPLD info.
  - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error.
  - Align Gaudi2 interrupt names.
  - Check for errors after preboot is ready.

- Bug fixes and code cleanups:
  - Move heartbeat work initialization to early init.
  - Fix a race when receiving events during reset.
  - Change the heartbeat scheduling point.

- Maintainers:
  - Change habanalabs maintainer and git repo path.

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
From: Ofir Bitton &lt;obitton@habana.ai&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/ZnfIjTH5AYQvPe7n@obitton-vm-u22.habana-labs.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This tag contains habanalabs driver changes for v6.11.

The notable changes are:

- uAPI changes:
  - Use device-name directory in debugfs-driver-habanalabs.
  - Expose server type in debugfs.

- New features and improvements:
  - Gradual sleep in polling memory macro.
  - Reduce Gaudi2 MSI-X interrupt count to 128.
  - Add Gaudi2-D revision support.

- Firmware related changes:
  - Add timestamp to CPLD info.
  - Gaudi2: Assume hard-reset by firmware upon MC SEI severe error.
  - Align Gaudi2 interrupt names.
  - Check for errors after preboot is ready.

- Bug fixes and code cleanups:
  - Move heartbeat work initialization to early init.
  - Fix a race when receiving events during reset.
  - Change the heartbeat scheduling point.

- Maintainers:
  - Change habanalabs maintainer and git repo path.

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
From: Ofir Bitton &lt;obitton@habana.ai&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/ZnfIjTH5AYQvPe7n@obitton-vm-u22.habana-labs.com
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalabs: gradual sleep in polling memory macro</title>
<updated>2024-06-23T06:53:33+00:00</updated>
<author>
<name>Didi Freiman</name>
<email>dfreiman@habana.ai</email>
</author>
<published>2024-05-07T10:47:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9dec27bb8ae4e0792c3a4cb8504bce8931676fb1'/>
<id>9dec27bb8ae4e0792c3a4cb8504bce8931676fb1</id>
<content type='text'>
It’s better to avoid long sleeps right from the beginning of the polling
since the data may be available much sooner than the sleep period.
Because polling host memory is inexpensive, this change gradually
increases the sleep time up to the user-requested period.

Signed-off-by: Didi Freiman &lt;dfreiman@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It’s better to avoid long sleeps right from the beginning of the polling
since the data may be available much sooner than the sleep period.
Because polling host memory is inexpensive, this change gradually
increases the sleep time up to the user-requested period.

Signed-off-by: Didi Freiman &lt;dfreiman@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalabs: move heartbeat work initialization to early init</title>
<updated>2024-06-23T06:53:33+00:00</updated>
<author>
<name>Tomer Tayar</name>
<email>ttayar@habana.ai</email>
</author>
<published>2024-05-13T11:40:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0199e6392e067299ece25863abd7453e4823f13b'/>
<id>0199e6392e067299ece25863abd7453e4823f13b</id>
<content type='text'>
The device heartbeat work is currently initialized at
device_heartbeat_schedule() which is called at the end of
hl_device_init().
However, hl_device_init() can fail at a previous step, and in such a
case, a subsequent call to hl_device_fini() will lead to calling
cleanup_resources() and accessing this work uninitialized.

As there is no real need to re-initialize this work every time it is
rescheduled, move this initialization to device_early_init() to be done
once and early enough.

Signed-off-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The device heartbeat work is currently initialized at
device_heartbeat_schedule() which is called at the end of
hl_device_init().
However, hl_device_init() can fail at a previous step, and in such a
case, a subsequent call to hl_device_fini() will lead to calling
cleanup_resources() and accessing this work uninitialized.

As there is no real need to re-initialize this work every time it is
rescheduled, move this initialization to device_early_init() to be done
once and early enough.

Signed-off-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalabs: print timestamp of last PQ heartbeat on EQ heartbeat failure</title>
<updated>2024-06-23T06:53:33+00:00</updated>
<author>
<name>Tomer Tayar</name>
<email>ttayar@habana.ai</email>
</author>
<published>2024-05-01T12:10:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5cb97d74c3c7ad93750fc3f11875159500380d1c'/>
<id>5cb97d74c3c7ad93750fc3f11875159500380d1c</id>
<content type='text'>
The test packet which is sent to FW for the PQ heartbeat is also used as
the trigger in FW to send the EQ heartbeat event.
Add the time of the last sent packet to the debug info which is printed
upon an EQ heartbeat failure.

Signed-off-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The test packet which is sent to FW for the PQ heartbeat is also used as
the trigger in FW to send the EQ heartbeat event.
Add the time of the last sent packet to the debug info which is printed
upon an EQ heartbeat failure.

Signed-off-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalabs: dump the EQ entries headers on EQ heartbeat failure</title>
<updated>2024-06-23T06:53:32+00:00</updated>
<author>
<name>Tomer Tayar</name>
<email>ttayar@habana.ai</email>
</author>
<published>2024-04-16T16:01:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c4548eee537ed7671170eb370c6c8db12dcfd3e8'/>
<id>c4548eee537ed7671170eb370c6c8db12dcfd3e8</id>
<content type='text'>
Add a dump of the EQ entry headers upon an EQ heartbeat failure.

Signed-off-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Add a dump of the EQ entry headers upon an EQ heartbeat failure.

Signed-off-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalabs: revise print on EQ heartbeat failure</title>
<updated>2024-06-23T06:53:32+00:00</updated>
<author>
<name>Tomer Tayar</name>
<email>ttayar@habana.ai</email>
</author>
<published>2024-04-16T14:01:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=795f93e650fc41c3f627d2733458c2f911bc9568'/>
<id>795f93e650fc41c3f627d2733458c2f911bc9568</id>
<content type='text'>
Don't print the "previous EQ index" value in case of an EQ heartbeat
failure, because it is incremented along with the EQ CI and is therefore
redundant.

In addition, as the CPU-CP PI is zeroed when it reaches a value that is
twice the queue size, add a value of the CI with a similar wrap around,
to make it easier to compare the values.

Signed-off-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Don't print the "previous EQ index" value in case of an EQ heartbeat
failure, because it is incremented along with the EQ CI and is therefore
redundant.

In addition, as the CPU-CP PI is zeroed when it reaches a value that is
twice the queue size, add a value of the CI with a similar wrap around,
to make it easier to compare the values.

Signed-off-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalabs: add more info upon cpu pkt timeout</title>
<updated>2024-06-23T06:53:32+00:00</updated>
<author>
<name>Farah Kassabri</name>
<email>fkassabri@habana.ai</email>
</author>
<published>2024-04-09T11:46:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9ee446f9b5d0172a94681aae01fabde4891f7123'/>
<id>9ee446f9b5d0172a94681aae01fabde4891f7123</id>
<content type='text'>
To improve debuggability when encountering FW issues,
add additional info once the CPU packet timeout expires.

Signed-off-by: Farah Kassabri &lt;fkassabri@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To improve debuggability when encountering FW issues,
add additional info once the CPU packet timeout expires.

Signed-off-by: Farah Kassabri &lt;fkassabri@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalabs: additional print in device-in-use info</title>
<updated>2024-06-23T06:53:32+00:00</updated>
<author>
<name>Ilia Levi</name>
<email>illevi@habana.ai</email>
</author>
<published>2024-04-04T10:50:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fda92282b09ed6dc85af22ab4195aec791cdde2f'/>
<id>fda92282b09ed6dc85af22ab4195aec791cdde2f</id>
<content type='text'>
When device release triggers a hard reset, there is a printout of
the cause. Currently listed causes (that increment context refcount)
are active command submissions and exported DMA buffer objects. In
any other case, the printout emits "unknown reason". We identify and
print another reason - allocated command buffers.

Signed-off-by: Ilia Levi &lt;illevi@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When device release triggers a hard reset, there is a printout of
the cause. Currently listed causes (that increment context refcount)
are active command submissions and exported DMA buffer objects. In
any other case, the printout emits "unknown reason". We identify and
print another reason - allocated command buffers.

Signed-off-by: Ilia Levi &lt;illevi@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalbs/gaudi2: reduce interrupt count to 128</title>
<updated>2024-06-23T06:53:07+00:00</updated>
<author>
<name>Ofir Bitton</name>
<email>obitton@habana.ai</email>
</author>
<published>2024-03-31T12:37:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f8422017b2e9331876efcfa07bb7579d5bc3e671'/>
<id>f8422017b2e9331876efcfa07bb7579d5bc3e671</id>
<content type='text'>
Some systems allow a maximum number of 128 MSI-X interrupts.
Hence we reduce the interrupt count to 128 instead of 512.

Reviewed-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Some systems allow a maximum number of 128 MSI-X interrupts.
Hence we reduce the interrupt count to 128 instead of 512.

Reviewed-by: Tomer Tayar &lt;ttayar@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>accel/habanalabs: disable EQ interrupt after disabling pci</title>
<updated>2024-06-23T06:53:05+00:00</updated>
<author>
<name>Tal Cohen</name>
<email>talcohen@habana.ai</email>
</author>
<published>2024-04-03T10:09:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=61f4f624eaaeefcdcbef368b31960b0336e014fb'/>
<id>61f4f624eaaeefcdcbef368b31960b0336e014fb</id>
<content type='text'>
When sending the disable PCI message to firmware, there is a
possibility that an EQ packet is already pending.
Disabling the EQ interrupt will prevent this from happening.
The interrupt will be re-enabled after reset.

Signed-off-by: Tal Cohen &lt;talcohen@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When sending the disable PCI message to firmware, there is a
possibility that an EQ packet is already pending.
Disabling the EQ interrupt will prevent this from happening.
The interrupt will be re-enabled after reset.

Signed-off-by: Tal Cohen &lt;talcohen@habana.ai&gt;
Reviewed-by: Ofir Bitton &lt;obitton@habana.ai&gt;
Signed-off-by: Ofir Bitton &lt;obitton@habana.ai&gt;
</pre>
</div>
</content>
</entry>
</feed>
