<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/base/dd.c, branch linux-5.15.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>driver core: Release all resources during unbind before updating device links</title>
<updated>2023-11-28T16:56:36+00:00</updated>
<author>
<name>Saravana Kannan</name>
<email>saravanak@google.com</email>
</author>
<published>2023-10-18T01:38:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=947c9e12ddd6866603fd60000c0cca8981687dd3'/>
<id>947c9e12ddd6866603fd60000c0cca8981687dd3</id>
<content type='text'>
commit 2e84dc37920012b458e9458b19fc4ed33f81bc74 upstream.

This commit fixes a bug in commit 9ed9895370ae ("driver core: Functional
dependencies tracking support") where the device link status was
incorrectly updated in the driver unbind path before all the device's
resources were released.

Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support")
Cc: stable &lt;stable@kernel.org&gt;
Reported-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/
Signed-off-by: Saravana Kannan &lt;saravanak@google.com&gt;
Cc: Thierry Reding &lt;thierry.reding@gmail.com&gt;
Cc: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Matti Vaittinen &lt;mazziesaccount@gmail.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Acked-by: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Tested-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Acked-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com
Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2e84dc37920012b458e9458b19fc4ed33f81bc74 upstream.

This commit fixes a bug in commit 9ed9895370ae ("driver core: Functional
dependencies tracking support") where the device link status was
incorrectly updated in the driver unbind path before all the device's
resources were released.

Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support")
Cc: stable &lt;stable@kernel.org&gt;
Reported-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/
Signed-off-by: Saravana Kannan &lt;saravanak@google.com&gt;
Cc: Thierry Reding &lt;thierry.reding@gmail.com&gt;
Cc: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Matti Vaittinen &lt;mazziesaccount@gmail.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Acked-by: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Tested-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Acked-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com
Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@pengutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Don't require dynamic_debug for initcall_debug probe timing</title>
<updated>2023-04-30T23:23:24+00:00</updated>
<author>
<name>Stephen Boyd</name>
<email>swboyd@chromium.org</email>
</author>
<published>2023-04-12T22:58:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0994aa001fde56d6fb4286c2493ea10b0e9a8de8'/>
<id>0994aa001fde56d6fb4286c2493ea10b0e9a8de8</id>
<content type='text'>
commit e2f06aa885081e1391916367f53bad984714b4db upstream.

Don't require the use of dynamic debug (or modification of the kernel to
add a #define DEBUG to the top of this file) to get the printk message
about driver probe timing. This printk is only emitted when
initcall_debug is enabled on the kernel commandline, and it isn't
immediately obvious that you have to do something else to debug boot
timing issues related to driver probe. Add a comment too so it doesn't
get converted back to pr_debug().

Fixes: eb7fbc9fb118 ("driver core: Add missing '\n' in log messages")
Cc: stable &lt;stable@kernel.org&gt;
Cc: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Cc: Brian Norris &lt;briannorris@chromium.org&gt;
Reviewed-by: Brian Norris &lt;briannorris@chromium.org&gt;
Acked-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Stephen Boyd &lt;swboyd@chromium.org&gt;
Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit e2f06aa885081e1391916367f53bad984714b4db upstream.

Don't require the use of dynamic debug (or modification of the kernel to
add a #define DEBUG to the top of this file) to get the printk message
about driver probe timing. This printk is only emitted when
initcall_debug is enabled on the kernel commandline, and it isn't
immediately obvious that you have to do something else to debug boot
timing issues related to driver probe. Add a comment too so it doesn't
get converted back to pr_debug().

Fixes: eb7fbc9fb118 ("driver core: Add missing '\n' in log messages")
Cc: stable &lt;stable@kernel.org&gt;
Cc: Christophe JAILLET &lt;christophe.jaillet@wanadoo.fr&gt;
Cc: Brian Norris &lt;briannorris@chromium.org&gt;
Reviewed-by: Brian Norris &lt;briannorris@chromium.org&gt;
Acked-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Stephen Boyd &lt;swboyd@chromium.org&gt;
Link: https://lore.kernel.org/r/20230412225842.3196599-1-swboyd@chromium.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drivers: base: dd: fix memory leak with using debugfs_lookup()</title>
<updated>2023-03-11T12:57:38+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2023-02-02T14:16:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7f1e53f88e8babf293ec052b70aa9d2a3554360c'/>
<id>7f1e53f88e8babf293ec052b70aa9d2a3554360c</id>
<content type='text'>
[ Upstream commit 36c893d3a759ae7c91ee7d4871ebfc7504f08c40 ]

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.

Cc: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Link: https://lore.kernel.org/r/20230202141621.2296458-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 36c893d3a759ae7c91ee7d4871ebfc7504f08c40 ]

When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time.  To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic
at once.

Cc: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Link: https://lore.kernel.org/r/20230202141621.2296458-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Fix bus_type.match() error handling in __driver_attach()</title>
<updated>2023-01-12T10:58:58+00:00</updated>
<author>
<name>Isaac J. Manjarres</name>
<email>isaacmanjarres@google.com</email>
</author>
<published>2022-09-21T00:14:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=dfd05a13355693dd55beea21a49eed0d9dff1217'/>
<id>dfd05a13355693dd55beea21a49eed0d9dff1217</id>
<content type='text'>
commit 27c0d217340e47ec995557f61423ef415afba987 upstream.

When a driver registers with a bus, it will attempt to match with every
device on the bus through the __driver_attach() function. Currently, if
the bus_type.match() function encounters an error that is not
-EPROBE_DEFER, __driver_attach() will return a negative error code, which
causes the driver registration logic to stop trying to match with the
remaining devices on the bus.

This behavior is not correct; a failure while matching a driver to a
device does not mean that the driver won't be able to match and bind
with other devices on the bus. Update the logic in __driver_attach()
to reflect this.

Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan &lt;saravanak@google.com&gt;
Signed-off-by: Isaac J. Manjarres &lt;isaacmanjarres@google.com&gt;
Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 27c0d217340e47ec995557f61423ef415afba987 upstream.

When a driver registers with a bus, it will attempt to match with every
device on the bus through the __driver_attach() function. Currently, if
the bus_type.match() function encounters an error that is not
-EPROBE_DEFER, __driver_attach() will return a negative error code, which
causes the driver registration logic to stop trying to match with the
remaining devices on the bus.

This behavior is not correct; a failure while matching a driver to a
device does not mean that the driver won't be able to match and bind
with other devices on the bus. Update the logic in __driver_attach()
to reflect this.

Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan &lt;saravanak@google.com&gt;
Signed-off-by: Isaac J. Manjarres &lt;isaacmanjarres@google.com&gt;
Link: https://lore.kernel.org/r/20220921001414.4046492-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Don't probe devices after bus_type.match() probe deferral</title>
<updated>2022-09-08T10:28:07+00:00</updated>
<author>
<name>Isaac J. Manjarres</name>
<email>isaacmanjarres@google.com</email>
</author>
<published>2022-08-17T18:40:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=253ec5fb84057c681a30d20314b8662fe424e3df'/>
<id>253ec5fb84057c681a30d20314b8662fe424e3df</id>
<content type='text'>
commit 25e9fbf0fd38868a429feabc38abebfc6dbf6542 upstream.

Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.

If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.

If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.

Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan &lt;saravanak@google.com&gt;
Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Tested-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Reviewed-by: Saravana Kannan &lt;saravanak@google.com&gt;
Signed-off-by: Isaac J. Manjarres &lt;isaacmanjarres@google.com&gt;
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 25e9fbf0fd38868a429feabc38abebfc6dbf6542 upstream.

Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.

If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.

If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.

Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable@vger.kernel.org
Cc: Saravana Kannan &lt;saravanak@google.com&gt;
Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Tested-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Reviewed-by: Saravana Kannan &lt;saravanak@google.com&gt;
Signed-off-by: Isaac J. Manjarres &lt;isaacmanjarres@google.com&gt;
Link: https://lore.kernel.org/r/20220817184026.3468620-1-isaacmanjarres@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: fix potential deadlock in __driver_attach</title>
<updated>2022-08-17T12:23:45+00:00</updated>
<author>
<name>Zhang Wensheng</name>
<email>zhangwensheng5@huawei.com</email>
</author>
<published>2022-06-22T07:43:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=733ab0c19bf17f6ad7c2b580ede006e369d5ab1b'/>
<id>733ab0c19bf17f6ad7c2b580ede006e369d5ab1b</id>
<content type='text'>
[ Upstream commit 70fe758352cafdee72a7b13bf9db065f9613ced8 ]

In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").

stack like commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").
list below:
    In __driver_attach function, The lock holding logic is as follows:
    ...
    __driver_attach
    if (driver_allows_async_probing(drv))
      device_lock(dev)      // get lock dev
        async_schedule_dev(__driver_attach_async_helper, dev); // func
          async_schedule_node
            async_schedule_node_domain(func)
              entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
              /* when fail or work limit, sync to execute func, but
                 __driver_attach_async_helper will get lock dev as
                 will, which will lead to A-A deadlock.  */
              if (!entry || atomic_read(&amp;entry_count) &gt; MAX_WORK) {
                func;
              else
                queue_work_node(node, system_unbound_wq, &amp;entry-&gt;work)
      device_unlock(dev)

    As above show, when it is allowed to do async probes, because of
    out of memory or work limit, async work is not be allowed, to do
    sync execute instead. it will lead to A-A deadlock because of
    __driver_attach_async_helper getting lock dev.

Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&amp;entry_count) &gt; MAX_WORK)) untenable, like
below:

[  370.785650] "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[  370.787154] task:swapper/0       state:D stack:    0 pid:    1 ppid:
0 flags:0x00004000
[  370.788865] Call Trace:
[  370.789374]  &lt;TASK&gt;
[  370.789841]  __schedule+0x482/0x1050
[  370.790613]  schedule+0x92/0x1a0
[  370.791290]  schedule_preempt_disabled+0x2c/0x50
[  370.792256]  __mutex_lock.isra.0+0x757/0xec0
[  370.793158]  __mutex_lock_slowpath+0x1f/0x30
[  370.794079]  mutex_lock+0x50/0x60
[  370.794795]  __device_driver_lock+0x2f/0x70
[  370.795677]  ? driver_probe_device+0xd0/0xd0
[  370.796576]  __driver_attach_async_helper+0x1d/0xd0
[  370.797318]  ? driver_probe_device+0xd0/0xd0
[  370.797957]  async_schedule_node_domain+0xa5/0xc0
[  370.798652]  async_schedule_node+0x19/0x30
[  370.799243]  __driver_attach+0x246/0x290
[  370.799828]  ? driver_allows_async_probing+0xa0/0xa0
[  370.800548]  bus_for_each_dev+0x9d/0x130
[  370.801132]  driver_attach+0x22/0x30
[  370.801666]  bus_add_driver+0x290/0x340
[  370.802246]  driver_register+0x88/0x140
[  370.802817]  ? virtio_scsi_init+0x116/0x116
[  370.803425]  scsi_register_driver+0x1a/0x30
[  370.804057]  init_sd+0x184/0x226
[  370.804533]  do_one_initcall+0x71/0x3a0
[  370.805107]  kernel_init_freeable+0x39a/0x43a
[  370.805759]  ? rest_init+0x150/0x150
[  370.806283]  kernel_init+0x26/0x230
[  370.806799]  ret_from_fork+0x1f/0x30

To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.

Fixes: ef0ff68351be ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng &lt;zhangwensheng5@huawei.com&gt;
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 70fe758352cafdee72a7b13bf9db065f9613ced8 ]

In __driver_attach function, There are also AA deadlock problem,
like the commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").

stack like commit b232b02bf3c2 ("driver core: fix deadlock in
__device_attach").
list below:
    In __driver_attach function, The lock holding logic is as follows:
    ...
    __driver_attach
    if (driver_allows_async_probing(drv))
      device_lock(dev)      // get lock dev
        async_schedule_dev(__driver_attach_async_helper, dev); // func
          async_schedule_node
            async_schedule_node_domain(func)
              entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
              /* when fail or work limit, sync to execute func, but
                 __driver_attach_async_helper will get lock dev as
                 will, which will lead to A-A deadlock.  */
              if (!entry || atomic_read(&amp;entry_count) &gt; MAX_WORK) {
                func;
              else
                queue_work_node(node, system_unbound_wq, &amp;entry-&gt;work)
      device_unlock(dev)

    As above show, when it is allowed to do async probes, because of
    out of memory or work limit, async work is not be allowed, to do
    sync execute instead. it will lead to A-A deadlock because of
    __driver_attach_async_helper getting lock dev.

Reproduce:
and it can be reproduce by make the condition
(if (!entry || atomic_read(&amp;entry_count) &gt; MAX_WORK)) untenable, like
below:

[  370.785650] "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[  370.787154] task:swapper/0       state:D stack:    0 pid:    1 ppid:
0 flags:0x00004000
[  370.788865] Call Trace:
[  370.789374]  &lt;TASK&gt;
[  370.789841]  __schedule+0x482/0x1050
[  370.790613]  schedule+0x92/0x1a0
[  370.791290]  schedule_preempt_disabled+0x2c/0x50
[  370.792256]  __mutex_lock.isra.0+0x757/0xec0
[  370.793158]  __mutex_lock_slowpath+0x1f/0x30
[  370.794079]  mutex_lock+0x50/0x60
[  370.794795]  __device_driver_lock+0x2f/0x70
[  370.795677]  ? driver_probe_device+0xd0/0xd0
[  370.796576]  __driver_attach_async_helper+0x1d/0xd0
[  370.797318]  ? driver_probe_device+0xd0/0xd0
[  370.797957]  async_schedule_node_domain+0xa5/0xc0
[  370.798652]  async_schedule_node+0x19/0x30
[  370.799243]  __driver_attach+0x246/0x290
[  370.799828]  ? driver_allows_async_probing+0xa0/0xa0
[  370.800548]  bus_for_each_dev+0x9d/0x130
[  370.801132]  driver_attach+0x22/0x30
[  370.801666]  bus_add_driver+0x290/0x340
[  370.802246]  driver_register+0x88/0x140
[  370.802817]  ? virtio_scsi_init+0x116/0x116
[  370.803425]  scsi_register_driver+0x1a/0x30
[  370.804057]  init_sd+0x184/0x226
[  370.804533]  do_one_initcall+0x71/0x3a0
[  370.805107]  kernel_init_freeable+0x39a/0x43a
[  370.805759]  ? rest_init+0x150/0x150
[  370.806283]  kernel_init+0x26/0x230
[  370.806799]  ret_from_fork+0x1f/0x30

To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.

Fixes: ef0ff68351be ("driver core: Probe devices asynchronously instead of the driver")
Signed-off-by: Zhang Wensheng &lt;zhangwensheng5@huawei.com&gt;
Link: https://lore.kernel.org/r/20220622074327.497102-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Fix wait_for_device_probe() &amp; deferred_probe_timeout interaction</title>
<updated>2022-06-14T16:36:13+00:00</updated>
<author>
<name>Saravana Kannan</name>
<email>saravanak@google.com</email>
</author>
<published>2022-06-03T11:31:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=29357883a89193863f3cc6a2c5e0b42ceb022761'/>
<id>29357883a89193863f3cc6a2c5e0b42ceb022761</id>
<content type='text'>
[ Upstream commit 5ee76c256e928455212ab759c51d198fedbe7523 ]

Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1].  This was because ip_auto_config() initcall times out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.

Commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.

However, if wait_for_device_probe() is called from the kernel_init()
context:

- Before deferred_probe_initcall() [2], it causes the boot process to
  hang due to a deadlock.

- After deferred_probe_initcall() [3], it blocks kernel_init() from
  continuing till deferred_probe_timeout expires and beats the point of
  deferred_probe_timeout that's trying to wait for userspace to load
  modules.

Neither of this is good. So revert the changes to
wait_for_device_probe().

[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/

Fixes: 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Cc: John Stultz &lt;jstultz@google.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Alexey Kuznetsov &lt;kuznet@ms2.inr.ac.ru&gt;
Cc: Hideaki YOSHIFUJI &lt;yoshfuji@linux-ipv6.org&gt;
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Yoshihiro Shimoda &lt;yoshihiro.shimoda.uh@renesas.com&gt;
Cc: Robin Murphy &lt;robin.murphy@arm.com&gt;
Cc: Andy Shevchenko &lt;andy.shevchenko@gmail.com&gt;
Cc: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Naresh Kamboju &lt;naresh.kamboju@linaro.org&gt;
Cc: Basil Eljuse &lt;Basil.Eljuse@arm.com&gt;
Cc: Ferry Toth &lt;fntoth@gmail.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Anders Roxell &lt;anders.roxell@linaro.org&gt;
Cc: linux-pm@vger.kernel.org
Reported-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Reported-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Tested-by: Geert Uytterhoeven &lt;geert+renesas@glider.be&gt;
Acked-by: John Stultz &lt;jstultz@google.com&gt;
Signed-off-by: Saravana Kannan &lt;saravanak@google.com&gt;
Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 5ee76c256e928455212ab759c51d198fedbe7523 ]

Mounting NFS rootfs was timing out when deferred_probe_timeout was
non-zero [1].  This was because ip_auto_config() initcall times out
waiting for the network interfaces to show up when
deferred_probe_timeout was non-zero. While ip_auto_config() calls
wait_for_device_probe() to make sure any currently running deferred
probe work or asynchronous probe finishes, that wasn't sufficient to
account for devices being deferred until deferred_probe_timeout.

Commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits
until the deferred_probe_timeout fires") tried to fix that by making
sure wait_for_device_probe() waits for deferred_probe_timeout to expire
before returning.

However, if wait_for_device_probe() is called from the kernel_init()
context:

- Before deferred_probe_initcall() [2], it causes the boot process to
  hang due to a deadlock.

- After deferred_probe_initcall() [3], it blocks kernel_init() from
  continuing till deferred_probe_timeout expires and beats the point of
  deferred_probe_timeout that's trying to wait for userspace to load
  modules.

Neither of this is good. So revert the changes to
wait_for_device_probe().

[1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@TYAPR01MB4544.jpnprd01.prod.outlook.com/
[2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
[3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@linutronix.de/

Fixes: 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits until the deferred_probe_timeout fires")
Cc: John Stultz &lt;jstultz@google.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Alexey Kuznetsov &lt;kuznet@ms2.inr.ac.ru&gt;
Cc: Hideaki YOSHIFUJI &lt;yoshfuji@linux-ipv6.org&gt;
Cc: Jakub Kicinski &lt;kuba@kernel.org&gt;
Cc: Rob Herring &lt;robh@kernel.org&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Yoshihiro Shimoda &lt;yoshihiro.shimoda.uh@renesas.com&gt;
Cc: Robin Murphy &lt;robin.murphy@arm.com&gt;
Cc: Andy Shevchenko &lt;andy.shevchenko@gmail.com&gt;
Cc: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Cc: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Cc: Naresh Kamboju &lt;naresh.kamboju@linaro.org&gt;
Cc: Basil Eljuse &lt;Basil.Eljuse@arm.com&gt;
Cc: Ferry Toth &lt;fntoth@gmail.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Anders Roxell &lt;anders.roxell@linaro.org&gt;
Cc: linux-pm@vger.kernel.org
Reported-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Reported-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Tested-by: Geert Uytterhoeven &lt;geert+renesas@glider.be&gt;
Acked-by: John Stultz &lt;jstultz@google.com&gt;
Signed-off-by: Saravana Kannan &lt;saravanak@google.com&gt;
Link: https://lore.kernel.org/r/20220526034609.480766-2-saravanak@google.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: fix deadlock in __device_attach</title>
<updated>2022-06-14T16:36:09+00:00</updated>
<author>
<name>Zhang Wensheng</name>
<email>zhangwensheng5@huawei.com</email>
</author>
<published>2022-05-18T07:45:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=df6de52b80aa3b46f5ac804412355ffe2e1df93e'/>
<id>df6de52b80aa3b46f5ac804412355ffe2e1df93e</id>
<content type='text'>
[ Upstream commit b232b02bf3c205b13a26dcec08e53baddd8e59ed ]

In __device_attach function, The lock holding logic is as follows:
...
__device_attach
device_lock(dev)      // get lock dev
  async_schedule_dev(__device_attach_async_helper, dev); // func
    async_schedule_node
      async_schedule_node_domain(func)
        entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
	/* when fail or work limit, sync to execute func, but
	   __device_attach_async_helper will get lock dev as
	   well, which will lead to A-A deadlock.  */
	if (!entry || atomic_read(&amp;entry_count) &gt; MAX_WORK) {
	  func;
	else
	  queue_work_node(node, system_unbound_wq, &amp;entry-&gt;work)
  device_unlock(dev)

As shown above, when it is allowed to do async probes, because of
out of memory or work limit, async work is not allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__device_attach_async_helper getting lock dev.

To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.

Fixes: 765230b5f084 ("driver-core: add asynchronous probing support for drivers")
Signed-off-by: Zhang Wensheng &lt;zhangwensheng5@huawei.com&gt;
Link: https://lore.kernel.org/r/20220518074516.1225580-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit b232b02bf3c205b13a26dcec08e53baddd8e59ed ]

In __device_attach function, The lock holding logic is as follows:
...
__device_attach
device_lock(dev)      // get lock dev
  async_schedule_dev(__device_attach_async_helper, dev); // func
    async_schedule_node
      async_schedule_node_domain(func)
        entry = kzalloc(sizeof(struct async_entry), GFP_ATOMIC);
	/* when fail or work limit, sync to execute func, but
	   __device_attach_async_helper will get lock dev as
	   well, which will lead to A-A deadlock.  */
	if (!entry || atomic_read(&amp;entry_count) &gt; MAX_WORK) {
	  func;
	else
	  queue_work_node(node, system_unbound_wq, &amp;entry-&gt;work)
  device_unlock(dev)

As shown above, when it is allowed to do async probes, because of
out of memory or work limit, async work is not allowed, to do
sync execute instead. it will lead to A-A deadlock because of
__device_attach_async_helper getting lock dev.

To fix the deadlock, move the async_schedule_dev outside device_lock,
as we can see, in async_schedule_node_domain, the parameter of
queue_work_node is system_unbound_wq, so it can accept concurrent
operations. which will also not change the code logic, and will
not lead to deadlock.

Fixes: 765230b5f084 ("driver-core: add asynchronous probing support for drivers")
Signed-off-by: Zhang Wensheng &lt;zhangwensheng5@huawei.com&gt;
Link: https://lore.kernel.org/r/20220518074516.1225580-1-zhangwensheng5@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>net: mdio: don't defer probe forever if PHY IRQ provider is missing</title>
<updated>2022-04-20T07:34:10+00:00</updated>
<author>
<name>Vladimir Oltean</name>
<email>vladimir.oltean@nxp.com</email>
</author>
<published>2022-04-07T16:55:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=152b813d8ba55d01a27962d30091485d019eed8e'/>
<id>152b813d8ba55d01a27962d30091485d019eed8e</id>
<content type='text'>
[ Upstream commit 74befa447e6839cdd90ed541159ec783726946f9 ]

When a driver for an interrupt controller is missing, of_irq_get()
returns -EPROBE_DEFER ad infinitum, causing
fwnode_mdiobus_phy_device_register(), and ultimately, the entire
of_mdiobus_register() call, to fail. In turn, any phy_connect() call
towards a PHY on this MDIO bus will also fail.

This is not what is expected to happen, because the PHY library falls
back to poll mode when of_irq_get() returns a hard error code, and the
MDIO bus, PHY and attached Ethernet controller work fine, albeit
suboptimally, when the PHY library polls for link status. However,
-EPROBE_DEFER has special handling given the assumption that at some
point probe deferral will stop, and the driver for the supplier will
kick in and create the IRQ domain.

Reasons for which the interrupt controller may be missing:

- It is not yet written. This may happen if a more recent DT blob (with
  an interrupt-parent for the PHY) is used to boot an old kernel where
  the driver didn't exist, and that kernel worked with the
  vintage-correct DT blob using poll mode.

- It is compiled out. Behavior is the same as above.

- It is compiled as a module. The kernel will wait for a number of
  seconds specified in the "deferred_probe_timeout" boot parameter for
  user space to load the required module. The current default is 0,
  which times out at the end of initcalls. It is possible that this
  might cause regressions unless users adjust this boot parameter.

The proposed solution is to use the driver_deferred_probe_check_state()
helper function provided by the driver core, which gives up after some
-EPROBE_DEFER attempts, taking "deferred_probe_timeout" into consideration.
The return code is changed from -EPROBE_DEFER into -ENODEV or
-ETIMEDOUT, depending on whether the kernel is compiled with support for
modules or not.

Fixes: 66bdede495c7 ("of_mdio: Fix broken PHY IRQ in case of probe deferral")
Suggested-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Link: https://lore.kernel.org/r/20220407165538.4084809-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 74befa447e6839cdd90ed541159ec783726946f9 ]

When a driver for an interrupt controller is missing, of_irq_get()
returns -EPROBE_DEFER ad infinitum, causing
fwnode_mdiobus_phy_device_register(), and ultimately, the entire
of_mdiobus_register() call, to fail. In turn, any phy_connect() call
towards a PHY on this MDIO bus will also fail.

This is not what is expected to happen, because the PHY library falls
back to poll mode when of_irq_get() returns a hard error code, and the
MDIO bus, PHY and attached Ethernet controller work fine, albeit
suboptimally, when the PHY library polls for link status. However,
-EPROBE_DEFER has special handling given the assumption that at some
point probe deferral will stop, and the driver for the supplier will
kick in and create the IRQ domain.

Reasons for which the interrupt controller may be missing:

- It is not yet written. This may happen if a more recent DT blob (with
  an interrupt-parent for the PHY) is used to boot an old kernel where
  the driver didn't exist, and that kernel worked with the
  vintage-correct DT blob using poll mode.

- It is compiled out. Behavior is the same as above.

- It is compiled as a module. The kernel will wait for a number of
  seconds specified in the "deferred_probe_timeout" boot parameter for
  user space to load the required module. The current default is 0,
  which times out at the end of initcalls. It is possible that this
  might cause regressions unless users adjust this boot parameter.

The proposed solution is to use the driver_deferred_probe_check_state()
helper function provided by the driver core, which gives up after some
-EPROBE_DEFER attempts, taking "deferred_probe_timeout" into consideration.
The return code is changed from -EPROBE_DEFER into -ENODEV or
-ETIMEDOUT, depending on whether the kernel is compiled with support for
modules or not.

Fixes: 66bdede495c7 ("of_mdio: Fix broken PHY IRQ in case of probe deferral")
Suggested-by: Robin Murphy &lt;robin.murphy@arm.com&gt;
Signed-off-by: Vladimir Oltean &lt;vladimir.oltean@nxp.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Reviewed-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Link: https://lore.kernel.org/r/20220407165538.4084809-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: dd: fix return value of __setup handler</title>
<updated>2022-04-08T12:23:50+00:00</updated>
<author>
<name>Randy Dunlap</name>
<email>rdunlap@infradead.org</email>
</author>
<published>2022-03-01T04:18:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=104852921ff6c87c84b782df46d269db8d2d0595'/>
<id>104852921ff6c87c84b782df46d269db8d2d0595</id>
<content type='text'>
[ Upstream commit f2aad54703dbe630f9d8b235eb58e8c8cc78f37d ]

When "driver_async_probe=nulltty" is used on the kernel boot command line,
it causes an Unknown parameter message and the string is added to init's
environment strings, polluting them.

  Unknown kernel command line parameters "BOOT_IMAGE=/boot/bzImage-517rc6
  driver_async_probe=nulltty", will be passed to user space.

 Run /sbin/init as init process
   with arguments:
     /sbin/init
   with environment:
     HOME=/
     TERM=linux
     BOOT_IMAGE=/boot/bzImage-517rc6
     driver_async_probe=nulltty

Change the return value of the __setup function to 1 to indicate
that the __setup option has been handled.

Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Fixes: 1ea61b68d0f8 ("async: Add cmdline option to specify drivers to be async probed")
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Reported-by: Igor Zhbanov &lt;i.zhbanov@omprussia.ru&gt;
Reviewed-by: Feng Tang &lt;feng.tang@intel.com&gt;
Signed-off-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Link: https://lore.kernel.org/r/20220301041829.15137-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f2aad54703dbe630f9d8b235eb58e8c8cc78f37d ]

When "driver_async_probe=nulltty" is used on the kernel boot command line,
it causes an Unknown parameter message and the string is added to init's
environment strings, polluting them.

  Unknown kernel command line parameters "BOOT_IMAGE=/boot/bzImage-517rc6
  driver_async_probe=nulltty", will be passed to user space.

 Run /sbin/init as init process
   with arguments:
     /sbin/init
   with environment:
     HOME=/
     TERM=linux
     BOOT_IMAGE=/boot/bzImage-517rc6
     driver_async_probe=nulltty

Change the return value of the __setup function to 1 to indicate
that the __setup option has been handled.

Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Fixes: 1ea61b68d0f8 ("async: Add cmdline option to specify drivers to be async probed")
Cc: Feng Tang &lt;feng.tang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Reported-by: Igor Zhbanov &lt;i.zhbanov@omprussia.ru&gt;
Reviewed-by: Feng Tang &lt;feng.tang@intel.com&gt;
Signed-off-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Link: https://lore.kernel.org/r/20220301041829.15137-1-rdunlap@infradead.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
