<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/include/linux/device.h, branch v6.12.86</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>driver core: Add kernel-doc for DEV_FLAG_COUNT enum value</title>
<updated>2026-05-07T04:09:47+00:00</updated>
<author>
<name>Douglas Anderson</name>
<email>dianders@chromium.org</email>
</author>
<published>2026-04-14T02:59:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=c5a22b92ed4ddd4d5d2a13d840311b01d2701e1f'/>
<id>c5a22b92ed4ddd4d5d2a13d840311b01d2701e1f</id>
<content type='text'>
commit 5b484311507b5d403c1f7a45f6aa3778549e268b upstream.

Even though nobody should use this value (except when declaring the
"flags" bitmap), kernel-doc still gets upset that it's not documented.
It reports:

  WARNING: ../include/linux/device.h:519
  Enum value 'DEV_FLAG_COUNT' not described in enum 'struct_device_flags'

Add the description of DEV_FLAG_COUNT.

Fixes: a2225b6e834a ("driver core: Don't let a device probe until it's ready")
Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Closes: https://lore.kernel.org/f318cd43-81fd-48b9-abf7-92af85f12f91@infradead.org
Signed-off-by: Douglas Anderson &lt;dianders@chromium.org&gt;
Tested-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Reviewed-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Link: https://patch.msgid.link/20260413195910.1.I23aca74fe2d3636a47df196a80920fecb2643220@changeid
Signed-off-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 5b484311507b5d403c1f7a45f6aa3778549e268b upstream.

Even though nobody should use this value (except when declaring the
"flags" bitmap), kernel-doc still gets upset that it's not documented.
It reports:

  WARNING: ../include/linux/device.h:519
  Enum value 'DEV_FLAG_COUNT' not described in enum 'struct_device_flags'

Add the description of DEV_FLAG_COUNT.

Fixes: a2225b6e834a ("driver core: Don't let a device probe until it's ready")
Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Closes: https://lore.kernel.org/f318cd43-81fd-48b9-abf7-92af85f12f91@infradead.org
Signed-off-by: Douglas Anderson &lt;dianders@chromium.org&gt;
Tested-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Reviewed-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Link: https://patch.msgid.link/20260413195910.1.I23aca74fe2d3636a47df196a80920fecb2643220@changeid
Signed-off-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Don't let a device probe until it's ready</title>
<updated>2026-05-07T04:09:24+00:00</updated>
<author>
<name>Douglas Anderson</name>
<email>dianders@chromium.org</email>
</author>
<published>2026-04-06T23:22:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=88e338bd9b6ef9df1282e406dae21c69405ded71'/>
<id>88e338bd9b6ef9df1282e406dae21c69405ded71</id>
<content type='text'>
commit a2225b6e834a838ae3c93709760edc0a169eb2f2 upstream.

The moment we link a "struct device" into the list of devices for the
bus, it's possible probe can happen. This is because another thread
can load the driver at any time and that can cause the device to
probe. This has been seen in practice with a stack crawl that looks
like this [1]:

  really_probe()
  __driver_probe_device()
  driver_probe_device()
  __driver_attach()
  bus_for_each_dev()
  driver_attach()
  bus_add_driver()
  driver_register()
  __platform_driver_register()
  init_module() [some module]
  do_one_initcall()
  do_init_module()
  load_module()
  __arm64_sys_finit_module()
  invoke_syscall()

As a result of the above, it was seen that device_links_driver_bound()
could be called for the device before "dev-&gt;fwnode-&gt;dev" was
assigned. This prevented __fw_devlink_pickup_dangling_consumers() from
being called which meant that other devices waiting on our driver's
sub-nodes were stuck deferring forever.

It's believed that this problem is showing up suddenly for two
reasons:
1. Android has recently (last ~1 year) implemented an optimization to
   the order it loads modules [2]. When devices opt-in to this faster
   loading, modules are loaded one-after-the-other very quickly. This
   is unlike how other distributions do it. The reproduction of this
   problem has only been seen on devices that opt-in to Android's
   "parallel module loading".
2. Android devices typically opt-in to fw_devlink, and the most
   noticeable issue is the NULL "dev-&gt;fwnode-&gt;dev" in
   device_links_driver_bound(). fw_devlink is somewhat new code and
   also not in use by all Linux devices.

Even though the specific symptom where "dev-&gt;fwnode-&gt;dev" wasn't
assigned could be fixed by moving that assignment higher in
device_add(), other parts of device_add() (like the call to
device_pm_add()) are also important to run before probe. Only moving
the "dev-&gt;fwnode-&gt;dev" assignment would likely fix the current
symptoms but lead to difficult-to-debug problems in the future.

Fix the problem by preventing probe until device_add() has run far
enough that the device is ready to probe. If somehow we end up trying
to probe before we're allowed, __driver_probe_device() will return
-EPROBE_DEFER which will make certain the device is noticed.

In the race condition that was seen with Android's faster module
loading, we will temporarily add the device to the deferred list and
then take it off immediately when device_add() probes the device.

Instead of adding another flag to the bitfields already in "struct
device", instead add a new "flags" field and use that. This allows us
to freely change the bit from different thread without worrying about
corrupting nearby bits (and means threads changing other bit won't
corrupt us).

[1] Captured on a machine running a downstream 6.6 kernel
[2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libmodprobe/libmodprobe.cpp?q=LoadModulesParallel

Cc: stable@vger.kernel.org
Fixes: 2023c610dc54 ("Driver core: add new device to bus's list before probing")
Reviewed-by: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Reviewed-by: Rafael J. Wysocki (Intel) &lt;rafael@kernel.org&gt;
Reviewed-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Signed-off-by: Douglas Anderson &lt;dianders@chromium.org&gt;
Link: https://patch.msgid.link/20260406162231.v5.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid
Signed-off-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit a2225b6e834a838ae3c93709760edc0a169eb2f2 upstream.

The moment we link a "struct device" into the list of devices for the
bus, it's possible probe can happen. This is because another thread
can load the driver at any time and that can cause the device to
probe. This has been seen in practice with a stack crawl that looks
like this [1]:

  really_probe()
  __driver_probe_device()
  driver_probe_device()
  __driver_attach()
  bus_for_each_dev()
  driver_attach()
  bus_add_driver()
  driver_register()
  __platform_driver_register()
  init_module() [some module]
  do_one_initcall()
  do_init_module()
  load_module()
  __arm64_sys_finit_module()
  invoke_syscall()

As a result of the above, it was seen that device_links_driver_bound()
could be called for the device before "dev-&gt;fwnode-&gt;dev" was
assigned. This prevented __fw_devlink_pickup_dangling_consumers() from
being called which meant that other devices waiting on our driver's
sub-nodes were stuck deferring forever.

It's believed that this problem is showing up suddenly for two
reasons:
1. Android has recently (last ~1 year) implemented an optimization to
   the order it loads modules [2]. When devices opt-in to this faster
   loading, modules are loaded one-after-the-other very quickly. This
   is unlike how other distributions do it. The reproduction of this
   problem has only been seen on devices that opt-in to Android's
   "parallel module loading".
2. Android devices typically opt-in to fw_devlink, and the most
   noticeable issue is the NULL "dev-&gt;fwnode-&gt;dev" in
   device_links_driver_bound(). fw_devlink is somewhat new code and
   also not in use by all Linux devices.

Even though the specific symptom where "dev-&gt;fwnode-&gt;dev" wasn't
assigned could be fixed by moving that assignment higher in
device_add(), other parts of device_add() (like the call to
device_pm_add()) are also important to run before probe. Only moving
the "dev-&gt;fwnode-&gt;dev" assignment would likely fix the current
symptoms but lead to difficult-to-debug problems in the future.

Fix the problem by preventing probe until device_add() has run far
enough that the device is ready to probe. If somehow we end up trying
to probe before we're allowed, __driver_probe_device() will return
-EPROBE_DEFER which will make certain the device is noticed.

In the race condition that was seen with Android's faster module
loading, we will temporarily add the device to the deferred list and
then take it off immediately when device_add() probes the device.

Instead of adding another flag to the bitfields already in "struct
device", instead add a new "flags" field and use that. This allows us
to freely change the bit from different thread without worrying about
corrupting nearby bits (and means threads changing other bit won't
corrupt us).

[1] Captured on a machine running a downstream 6.6 kernel
[2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libmodprobe/libmodprobe.cpp?q=LoadModulesParallel

Cc: stable@vger.kernel.org
Fixes: 2023c610dc54 ("Driver core: add new device to bus's list before probing")
Reviewed-by: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Reviewed-by: Rafael J. Wysocki (Intel) &lt;rafael@kernel.org&gt;
Reviewed-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Signed-off-by: Douglas Anderson &lt;dianders@chromium.org&gt;
Link: https://patch.msgid.link/20260406162231.v5.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid
Signed-off-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: generalize driver_override in struct device</title>
<updated>2026-04-02T11:09:25+00:00</updated>
<author>
<name>Danilo Krummrich</name>
<email>dakr@kernel.org</email>
</author>
<published>2026-03-03T11:53:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=78aba57ca3de5ecebe2c45bd82eb14d2d41b297b'/>
<id>78aba57ca3de5ecebe2c45bd82eb14d2d41b297b</id>
<content type='text'>
[ Upstream commit cb3d1049f4ea77d5ad93f17d8ac1f2ed4da70501 ]

Currently, there are 12 busses (including platform and PCI) that
duplicate the driver_override logic for their individual devices.

All of them seem to be prone to the bug described in [1].

While this could be solved for every bus individually using a separate
lock, solving this in the driver-core generically results in less (and
cleaner) changes overall.

Thus, move driver_override to struct device, provide corresponding
accessors for busses and handle locking with a separate lock internally.

In particular, add device_set_driver_override(),
device_has_driver_override(), device_match_driver_override() and
generalize the sysfs store() and show() callbacks via a driver_override
feature flag in struct bus_type.

Until all busses have migrated, keep driver_set_override() in place.

Note that we can't use the device lock for the reasons described in [2].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220789 [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [2]
Tested-by: Gui-Dong Han &lt;hanguidong02@gmail.com&gt;
Co-developed-by: Gui-Dong Han &lt;hanguidong02@gmail.com&gt;
Signed-off-by: Gui-Dong Han &lt;hanguidong02@gmail.com&gt;
Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Link: https://patch.msgid.link/20260303115720.48783-2-dakr@kernel.org
[ Use dev-&gt;bus instead of sp-&gt;bus for consistency; fix commit message to
  refer to the struct bus_type's driver_override feature flag. - Danilo ]
Signed-off-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Stable-dep-of: 2b38efc05bf7 ("driver core: platform: use generic driver_override infrastructure")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit cb3d1049f4ea77d5ad93f17d8ac1f2ed4da70501 ]

Currently, there are 12 busses (including platform and PCI) that
duplicate the driver_override logic for their individual devices.

All of them seem to be prone to the bug described in [1].

While this could be solved for every bus individually using a separate
lock, solving this in the driver-core generically results in less (and
cleaner) changes overall.

Thus, move driver_override to struct device, provide corresponding
accessors for busses and handle locking with a separate lock internally.

In particular, add device_set_driver_override(),
device_has_driver_override(), device_match_driver_override() and
generalize the sysfs store() and show() callbacks via a driver_override
feature flag in struct bus_type.

Until all busses have migrated, keep driver_set_override() in place.

Note that we can't use the device lock for the reasons described in [2].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=220789 [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [2]
Tested-by: Gui-Dong Han &lt;hanguidong02@gmail.com&gt;
Co-developed-by: Gui-Dong Han &lt;hanguidong02@gmail.com&gt;
Signed-off-by: Gui-Dong Han &lt;hanguidong02@gmail.com&gt;
Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Link: https://patch.msgid.link/20260303115720.48783-2-dakr@kernel.org
[ Use dev-&gt;bus instead of sp-&gt;bus for consistency; fix commit message to
  refer to the struct bus_type's driver_override feature flag. - Danilo ]
Signed-off-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Stable-dep-of: 2b38efc05bf7 ("driver core: platform: use generic driver_override infrastructure")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core/PM: Set power.no_callbacks along with power.no_pm</title>
<updated>2025-10-12T10:57:19+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-08-28T10:59:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=77dd0e6cb9f9771a6e0bbebbb6cec6f3da1cdbb1'/>
<id>77dd0e6cb9f9771a6e0bbebbb6cec6f3da1cdbb1</id>
<content type='text'>
commit c2ce2453413d429e302659abc5ace634e873f6f5 upstream.

Devices with power.no_pm set are not expected to need any power
management at all, so modify device_set_pm_not_required() to set
power.no_callbacks for them too in case runtime PM will be enabled
for any of them (which in principle may be done for convenience if
such a device participates in a dependency chain).

Since device_set_pm_not_required() must be called before device_add()
or it would not have any effect, it can update power.no_callbacks
without locking, unlike pm_runtime_no_callbacks() that can be called
after registering the target device.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: stable &lt;stable@kernel.org&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Link: https://lore.kernel.org/r/1950054.tdWV9SEqCh@rafael.j.wysocki
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c2ce2453413d429e302659abc5ace634e873f6f5 upstream.

Devices with power.no_pm set are not expected to need any power
management at all, so modify device_set_pm_not_required() to set
power.no_callbacks for them too in case runtime PM will be enabled
for any of them (which in principle may be done for convenience if
such a device participates in a dependency chain).

Since device_set_pm_not_required() must be called before device_add()
or it would not have any effect, it can update power.no_callbacks
without locking, unlike pm_runtime_no_callbacks() that can be called
after registering the target device.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: stable &lt;stable@kernel.org&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Link: https://lore.kernel.org/r/1950054.tdWV9SEqCh@rafael.j.wysocki
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Split devres APIs to device/devres.h</title>
<updated>2025-05-29T09:03:15+00:00</updated>
<author>
<name>Andy Shevchenko</name>
<email>andriy.shevchenko@linux.intel.com</email>
</author>
<published>2025-02-12T06:25:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7207effe4743f49eb8b74ceb66e170b70e4746d2'/>
<id>7207effe4743f49eb8b74ceb66e170b70e4746d2</id>
<content type='text'>
[ Upstream commit a21cad9312767d26b5257ce0662699bb202cdda1 ]

device.h is a huge header which is hard to follow and easy to miss
something. Improve that by splitting devres APIs to device/devres.h.

In particular this helps to speedup the build of the code that includes
device.h solely for a devres APIs.

While at it, cast the error pointers to __iomem using IOMEM_ERR_PTR()
and fix sparse warnings.

Signed-off-by: Raag Jadav &lt;raag.jadav@intel.com&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Stable-dep-of: 7dd7f39fce00 ("ASoC: SOF: Intel: hda: Fix UAF when reloading module")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit a21cad9312767d26b5257ce0662699bb202cdda1 ]

device.h is a huge header which is hard to follow and easy to miss
something. Improve that by splitting devres APIs to device/devres.h.

In particular this helps to speedup the build of the code that includes
device.h solely for a devres APIs.

While at it, cast the error pointers to __iomem using IOMEM_ERR_PTR()
and fix sparse warnings.

Signed-off-by: Raag Jadav &lt;raag.jadav@intel.com&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Stable-dep-of: 7dd7f39fce00 ("ASoC: SOF: Intel: hda: Fix UAF when reloading module")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cxl/port: Fix use-after-free, permit out-of-order decoder shutdown</title>
<updated>2024-10-25T21:07:03+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2024-10-23T01:43:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=101c268bd2f37e965a5468353e62d154db38838e'/>
<id>101c268bd2f37e965a5468353e62d154db38838e</id>
<content type='text'>
In support of investigating an initialization failure report [1],
cxl_test was updated to register mock memory-devices after the mock
root-port/bus device had been registered. That led to cxl_test crashing
with a use-after-free bug with the following signature:

    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1
    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1
    cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0
1)  cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1
    [..]
    cxld_unregister: cxl decoder14.0:
    cxl_region_decode_reset: cxl_region region3:
    mock_decoder_reset: cxl_port port3: decoder3.0 reset
2)  mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1
    cxl_endpoint_decoder_release: cxl decoder14.0:
    [..]
    cxld_unregister: cxl decoder7.0:
3)  cxl_region_decode_reset: cxl_region region3:
    Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI
    [..]
    RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
    [..]
    Call Trace:
     &lt;TASK&gt;
     cxl_region_decode_reset+0x69/0x190 [cxl_core]
     cxl_region_detach+0xe8/0x210 [cxl_core]
     cxl_decoder_kill_region+0x27/0x40 [cxl_core]
     cxld_unregister+0x5d/0x60 [cxl_core]

At 1) a region has been established with 2 endpoint decoders (7.0 and
14.0). Those endpoints share a common switch-decoder in the topology
(3.0). At teardown, 2), decoder14.0 is the first to be removed and hits
the "out of order reset case" in the switch decoder. The effect though
is that region3 cleanup is aborted leaving it in-tact and
referencing decoder14.0. At 3) the second attempt to teardown region3
trips over the stale decoder14.0 object which has long since been
deleted.

The fix here is to recognize that the CXL specification places no
mandate on in-order shutdown of switch-decoders, the driver enforces
in-order allocation, and hardware enforces in-order commit. So, rather
than fail and leave objects dangling, always remove them.

In support of making cxl_region_decode_reset() always succeed,
cxl_region_invalidate_memregion() failures are turned into warnings.
Crashing the kernel is ok there since system integrity is at risk if
caches cannot be managed around physical address mutation events like
CXL region destruction.

A new device_for_each_child_reverse_from() is added to cleanup
port-&gt;commit_end after all dependent decoders have been disabled. In
other words if decoders are allocated 0-&gt;1-&gt;2 and disabled 1-&gt;2-&gt;0 then
port-&gt;commit_end only decrements from 2 after 2 has been disabled, and
it decrements all the way to zero since 1 was disabled previously.

Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Cc: stable@vger.kernel.org
Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Alison Schofield &lt;alison.schofield@intel.com&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Zijun Hu &lt;quic_zijuhu@quicinc.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In support of investigating an initialization failure report [1],
cxl_test was updated to register mock memory-devices after the mock
root-port/bus device had been registered. That led to cxl_test crashing
with a use-after-free bug with the following signature:

    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1
    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1
    cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0
1)  cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1
    [..]
    cxld_unregister: cxl decoder14.0:
    cxl_region_decode_reset: cxl_region region3:
    mock_decoder_reset: cxl_port port3: decoder3.0 reset
2)  mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1
    cxl_endpoint_decoder_release: cxl decoder14.0:
    [..]
    cxld_unregister: cxl decoder7.0:
3)  cxl_region_decode_reset: cxl_region region3:
    Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI
    [..]
    RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
    [..]
    Call Trace:
     &lt;TASK&gt;
     cxl_region_decode_reset+0x69/0x190 [cxl_core]
     cxl_region_detach+0xe8/0x210 [cxl_core]
     cxl_decoder_kill_region+0x27/0x40 [cxl_core]
     cxld_unregister+0x5d/0x60 [cxl_core]

At 1) a region has been established with 2 endpoint decoders (7.0 and
14.0). Those endpoints share a common switch-decoder in the topology
(3.0). At teardown, 2), decoder14.0 is the first to be removed and hits
the "out of order reset case" in the switch decoder. The effect though
is that region3 cleanup is aborted leaving it in-tact and
referencing decoder14.0. At 3) the second attempt to teardown region3
trips over the stale decoder14.0 object which has long since been
deleted.

The fix here is to recognize that the CXL specification places no
mandate on in-order shutdown of switch-decoders, the driver enforces
in-order allocation, and hardware enforces in-order commit. So, rather
than fail and leave objects dangling, always remove them.

In support of making cxl_region_decode_reset() always succeed,
cxl_region_invalidate_memregion() failures are turned into warnings.
Crashing the kernel is ok there since system integrity is at risk if
caches cannot be managed around physical address mutation events like
CXL region destruction.

A new device_for_each_child_reverse_from() is added to cleanup
port-&gt;commit_end after all dependent decoders have been disabled. In
other words if decoders are allocated 0-&gt;1-&gt;2 and disabled 1-&gt;2-&gt;0 then
port-&gt;commit_end only decrements from 2 after 2 has been disabled, and
it decrements all the way to zero since 1 was disabled previously.

Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Cc: stable@vger.kernel.org
Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware")
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Alison Schofield &lt;alison.schofield@intel.com&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Zijun Hu &lt;quic_zijuhu@quicinc.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-mapping: clearly mark DMA ops as an architecture feature</title>
<updated>2024-09-04T04:08:51+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2024-08-28T06:02:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=de6c85bf918ea52d5c680f0d130b37ee2ff152d6'/>
<id>de6c85bf918ea52d5c680f0d130b37ee2ff152d6</id>
<content type='text'>
DMA ops are a helper for architectures and not for drivers to override
the DMA implementation.

Unfortunately driver authors keep ignoring this.  Make the fact more
clear by renaming the symbol to ARCH_HAS_DMA_OPS and having the two drivers
overriding their dma_ops depend on that.  These drivers should probably be
marked broken, but we can give them a bit of a grace period for that.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Sakari Ailus &lt;sakari.ailus@linux.intel.com&gt; # for IPU6
Acked-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
DMA ops are a helper for architectures and not for drivers to override
the DMA implementation.

Unfortunately driver authors keep ignoring this.  Make the fact more
clear by renaming the symbol to ARCH_HAS_DMA_OPS and having the two drivers
overriding their dma_ops depend on that.  These drivers should probably be
marked broken, but we can give them a bit of a grace period for that.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Sakari Ailus &lt;sakari.ailus@linux.intel.com&gt; # for IPU6
Acked-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dma-mapping: direct calls for dma-iommu</title>
<updated>2024-08-22T04:18:11+00:00</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@nvidia.com</email>
</author>
<published>2024-07-24T18:04:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b5c58b2fdc427e7958412ecb2de2804a1f7c1572'/>
<id>b5c58b2fdc427e7958412ecb2de2804a1f7c1572</id>
<content type='text'>
Directly call into dma-iommu just like we have been doing for dma-direct
for a while.  This avoids the indirect call overhead for IOMMU ops and
removes the need to have DMA ops entirely for many common configurations.

Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Directly call into dma-iommu just like we have been doing for dma-direct
for a while.  This avoids the indirect call overhead for IOMMU ops and
removes the need to have DMA ops entirely for many common configurations.

Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: make [device_]driver_attach take a const *</title>
<updated>2024-06-20T10:51:42+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2024-06-14T09:41:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=269e974e664207cc45f83b579565ba73de1b75dc'/>
<id>269e974e664207cc45f83b579565ba73de1b75dc</id>
<content type='text'>
Change device_driver_attach() and driver_attach() to take a const * to
struct device driver as neither of them modify the structure at all.

Also, for some odd reason, drivers/dma/idxd/compat.c had a duplicate
external reference to device_driver_attach(), so remove that to fix up
the build, it should never have had that there in the first place.

Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Vinod Koul &lt;vkoul@kernel.org&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Petr Tesarik &lt;petr.tesarik.ext@huawei.com&gt;
Cc: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Cc: dmaengine@vger.kernel.org
Link: https://lore.kernel.org/r/2024061401-rasping-manger-c385@gregkh
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change device_driver_attach() and driver_attach() to take a const * to
struct device driver as neither of them modify the structure at all.

Also, for some odd reason, drivers/dma/idxd/compat.c had a duplicate
external reference to device_driver_attach(), so remove that to fix up
the build, it should never have had that there in the first place.

Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Vinod Koul &lt;vkoul@kernel.org&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Petr Tesarik &lt;petr.tesarik.ext@huawei.com&gt;
Cc: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Cc: dmaengine@vger.kernel.org
Link: https://lore.kernel.org/r/2024061401-rasping-manger-c385@gregkh
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'v6.10-rc4' into driver-core-next</title>
<updated>2024-06-17T06:33:41+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2024-06-17T06:33:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b5dd424181f33c2978562c64b286fc22cf5ef4bf'/>
<id>b5dd424181f33c2978562c64b286fc22cf5ef4bf</id>
<content type='text'>
We need the driver core and sysfs fixes in here to build on top of.

Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We need the driver core and sysfs fixes in here to build on top of.

Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
