<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/base/core.c, branch linux-5.15.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>driver core: Introduce device_find_any_child() helper</title>
<updated>2024-12-14T18:50:59+00:00</updated>
<author>
<name>Andy Shevchenko</name>
<email>andriy.shevchenko@linux.intel.com</email>
</author>
<published>2022-06-10T12:02:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=86bd0ba393a307c85396bb7309fb29f0d288ace6'/>
<id>86bd0ba393a307c85396bb7309fb29f0d288ace6</id>
<content type='text'>
[ Upstream commit 82b070beae1ef55b0049768c8dc91d87565bb191 ]

There are several places in the kernel where this kind of functionality is
being used. Provide a generic helper for such cases.

Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20220610120219.18988-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Stable-dep-of: 27aabf27fd01 ("Bluetooth: fix use-after-free in device_for_each_child()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 82b070beae1ef55b0049768c8dc91d87565bb191 ]

There are several places in the kernel where this kind of functionality is
being used. Provide a generic helper for such cases.

Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Andy Shevchenko &lt;andriy.shevchenko@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20220610120219.18988-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Stable-dep-of: 27aabf27fd01 ("Bluetooth: fix use-after-free in device_for_each_child()")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "driver core: Fix uevent_show() vs driver detach race"</title>
<updated>2024-11-08T15:25:54+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2024-10-29T00:23:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4749d336170dbb629e515a857e58a82e61c37a9c'/>
<id>4749d336170dbb629e515a857e58a82e61c37a9c</id>
<content type='text'>
commit 9a71892cbcdb9d1459c84f5a4c722b14354158a5 upstream.

This reverts commit 15fffc6a5624b13b428bb1c6e9088e32a55eb82c.

This commit causes a regression, so revert it for now until it can come
back in a way that works for everyone.

Link: https://lore.kernel.org/all/172790598832.1168608.4519484276671503678.stgit@dwillia2-xfh.jf.intel.com/
Fixes: 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race")
Cc: stable &lt;stable@kernel.org&gt;
Cc: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Cc: Namjae Jeon &lt;namjae.jeon@samsung.com&gt;
Cc: Dirk Behme &lt;dirk.behme@de.bosch.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 9a71892cbcdb9d1459c84f5a4c722b14354158a5 upstream.

This reverts commit 15fffc6a5624b13b428bb1c6e9088e32a55eb82c.

This commit causes a regression, so revert it for now until it can come
back in a way that works for everyone.

Link: https://lore.kernel.org/all/172790598832.1168608.4519484276671503678.stgit@dwillia2-xfh.jf.intel.com/
Fixes: 15fffc6a5624 ("driver core: Fix uevent_show() vs driver detach race")
Cc: stable &lt;stable@kernel.org&gt;
Cc: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Cc: Namjae Jeon &lt;namjae.jeon@samsung.com&gt;
Cc: Dirk Behme &lt;dirk.behme@de.bosch.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Fix uevent_show() vs driver detach race</title>
<updated>2024-08-19T03:45:45+00:00</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2024-07-12T19:42:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=9c23fc327d6ec67629b4ad323bd64d3834c0417d'/>
<id>9c23fc327d6ec67629b4ad323bd64d3834c0417d</id>
<content type='text'>
commit 15fffc6a5624b13b428bb1c6e9088e32a55eb82c upstream.

uevent_show() wants to de-reference dev-&gt;driver-&gt;name. There is no clean
way for a device attribute to de-reference dev-&gt;driver unless that
attribute is defined via (struct device_driver).dev_groups. Instead, the
anti-pattern of taking the device_lock() in the attribute handler risks
deadlocks with code paths that remove device attributes while holding
the lock.

This deadlock is typically invisible to lockdep given the device_lock()
is marked lockdep_set_novalidate_class(), but some subsystems allocate a
local lockdep key for @dev-&gt;mutex to reveal reports of the form:

 ======================================================
 WARNING: possible circular locking dependency detected
 6.10.0-rc7+ #275 Tainted: G           OE    N
 ------------------------------------------------------
 modprobe/2374 is trying to acquire lock:
 ffff8c2270070de0 (kn-&gt;active#6){++++}-{0:0}, at: __kernfs_remove+0xde/0x220

 but task is already holding lock:
 ffff8c22016e88f8 (&amp;cxl_root_key){+.+.}-{3:3}, at: device_release_driver_internal+0x39/0x210

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -&gt; #1 (&amp;cxl_root_key){+.+.}-{3:3}:
        __mutex_lock+0x99/0xc30
        uevent_show+0xac/0x130
        dev_attr_show+0x18/0x40
        sysfs_kf_seq_show+0xac/0xf0
        seq_read_iter+0x110/0x450
        vfs_read+0x25b/0x340
        ksys_read+0x67/0xf0
        do_syscall_64+0x75/0x190
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

 -&gt; #0 (kn-&gt;active#6){++++}-{0:0}:
        __lock_acquire+0x121a/0x1fa0
        lock_acquire+0xd6/0x2e0
        kernfs_drain+0x1e9/0x200
        __kernfs_remove+0xde/0x220
        kernfs_remove_by_name_ns+0x5e/0xa0
        device_del+0x168/0x410
        device_unregister+0x13/0x60
        devres_release_all+0xb8/0x110
        device_unbind_cleanup+0xe/0x70
        device_release_driver_internal+0x1c7/0x210
        driver_detach+0x47/0x90
        bus_remove_driver+0x6c/0xf0
        cxl_acpi_exit+0xc/0x11 [cxl_acpi]
        __do_sys_delete_module.isra.0+0x181/0x260
        do_syscall_64+0x75/0x190
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

The observation though is that driver objects are typically much longer
lived than device objects. It is reasonable to perform lockless
de-reference of a @driver pointer even if it is racing detach from a
device. Given the infrequency of driver unregistration, use
synchronize_rcu() in module_remove_driver() to close any potential
races.  It is potentially overkill to suffer synchronize_rcu() just to
handle the rare module removal racing uevent_show() event.

Thanks to Tetsuo Handa for the debug analysis of the syzbot report [1].

Fixes: c0a40097f0bc ("drivers: core: synchronize really_probe() and dev_uevent()")
Reported-by: syzbot+4762dd74e32532cda5ff@syzkaller.appspotmail.com
Reported-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Closes: http://lore.kernel.org/5aa5558f-90a4-4864-b1b1-5d6784c5607d@I-love.SAKURA.ne.jp [1]
Link: http://lore.kernel.org/669073b8ea479_5fffa294c1@dwillia2-xfh.jf.intel.com.notmuch
Cc: stable@vger.kernel.org
Cc: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Cc: Namjae Jeon &lt;namjae.jeon@samsung.com&gt;
Cc: Dirk Behme &lt;dirk.behme@de.bosch.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Link: https://lore.kernel.org/r/172081332794.577428.9738802016494057132.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 15fffc6a5624b13b428bb1c6e9088e32a55eb82c upstream.

uevent_show() wants to de-reference dev-&gt;driver-&gt;name. There is no clean
way for a device attribute to de-reference dev-&gt;driver unless that
attribute is defined via (struct device_driver).dev_groups. Instead, the
anti-pattern of taking the device_lock() in the attribute handler risks
deadlocks with code paths that remove device attributes while holding
the lock.

This deadlock is typically invisible to lockdep given the device_lock()
is marked lockdep_set_novalidate_class(), but some subsystems allocate a
local lockdep key for @dev-&gt;mutex to reveal reports of the form:

 ======================================================
 WARNING: possible circular locking dependency detected
 6.10.0-rc7+ #275 Tainted: G           OE    N
 ------------------------------------------------------
 modprobe/2374 is trying to acquire lock:
 ffff8c2270070de0 (kn-&gt;active#6){++++}-{0:0}, at: __kernfs_remove+0xde/0x220

 but task is already holding lock:
 ffff8c22016e88f8 (&amp;cxl_root_key){+.+.}-{3:3}, at: device_release_driver_internal+0x39/0x210

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -&gt; #1 (&amp;cxl_root_key){+.+.}-{3:3}:
        __mutex_lock+0x99/0xc30
        uevent_show+0xac/0x130
        dev_attr_show+0x18/0x40
        sysfs_kf_seq_show+0xac/0xf0
        seq_read_iter+0x110/0x450
        vfs_read+0x25b/0x340
        ksys_read+0x67/0xf0
        do_syscall_64+0x75/0x190
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

 -&gt; #0 (kn-&gt;active#6){++++}-{0:0}:
        __lock_acquire+0x121a/0x1fa0
        lock_acquire+0xd6/0x2e0
        kernfs_drain+0x1e9/0x200
        __kernfs_remove+0xde/0x220
        kernfs_remove_by_name_ns+0x5e/0xa0
        device_del+0x168/0x410
        device_unregister+0x13/0x60
        devres_release_all+0xb8/0x110
        device_unbind_cleanup+0xe/0x70
        device_release_driver_internal+0x1c7/0x210
        driver_detach+0x47/0x90
        bus_remove_driver+0x6c/0xf0
        cxl_acpi_exit+0xc/0x11 [cxl_acpi]
        __do_sys_delete_module.isra.0+0x181/0x260
        do_syscall_64+0x75/0x190
        entry_SYSCALL_64_after_hwframe+0x76/0x7e

The observation though is that driver objects are typically much longer
lived than device objects. It is reasonable to perform lockless
de-reference of a @driver pointer even if it is racing detach from a
device. Given the infrequency of driver unregistration, use
synchronize_rcu() in module_remove_driver() to close any potential
races.  It is potentially overkill to suffer synchronize_rcu() just to
handle the rare module removal racing uevent_show() event.

Thanks to Tetsuo Handa for the debug analysis of the syzbot report [1].

Fixes: c0a40097f0bc ("drivers: core: synchronize really_probe() and dev_uevent()")
Reported-by: syzbot+4762dd74e32532cda5ff@syzkaller.appspotmail.com
Reported-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Closes: http://lore.kernel.org/5aa5558f-90a4-4864-b1b1-5d6784c5607d@I-love.SAKURA.ne.jp [1]
Link: http://lore.kernel.org/669073b8ea479_5fffa294c1@dwillia2-xfh.jf.intel.com.notmuch
Cc: stable@vger.kernel.org
Cc: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Cc: Namjae Jeon &lt;namjae.jeon@samsung.com&gt;
Cc: Dirk Behme &lt;dirk.behme@de.bosch.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Link: https://lore.kernel.org/r/172081332794.577428.9738802016494057132.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>drivers: core: synchronize really_probe() and dev_uevent()</title>
<updated>2024-07-05T07:14:19+00:00</updated>
<author>
<name>Dirk Behme</name>
<email>dirk.behme@de.bosch.com</email>
</author>
<published>2024-05-13T05:06:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ec772ed7cb21b46fb132f89241682553efd0b721'/>
<id>ec772ed7cb21b46fb132f89241682553efd0b721</id>
<content type='text'>
commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0 upstream.

Synchronize the dev-&gt;driver usage in really_probe() and dev_uevent().
These can run in different threads, what can result in the following
race condition for dev-&gt;driver uninitialization:

Thread #1:
==========

really_probe() {
...
probe_failed:
...
device_unbind_cleanup(dev) {
    ...
    dev-&gt;driver = NULL;   // &lt;= Failed probe sets dev-&gt;driver to NULL
    ...
    }
...
}

Thread #2:
==========

dev_uevent() {
...
if (dev-&gt;driver)
      // If dev-&gt;driver is NULLed from really_probe() from here on,
      // after above check, the system crashes
      add_uevent_var(env, "DRIVER=%s", dev-&gt;driver-&gt;name);
...
}

really_probe() holds the lock, already. So nothing needs to be done
there. dev_uevent() is called with lock held, often, too. But not
always. What implies that we can't add any locking in dev_uevent()
itself. So fix this race by adding the lock to the non-protected
path. This is the path where above race is observed:

 dev_uevent+0x235/0x380
 uevent_show+0x10c/0x1f0  &lt;= Add lock here
 dev_attr_show+0x3a/0xa0
 sysfs_kf_seq_show+0x17c/0x250
 kernfs_seq_show+0x7c/0x90
 seq_read_iter+0x2d7/0x940
 kernfs_fop_read_iter+0xc6/0x310
 vfs_read+0x5bc/0x6b0
 ksys_read+0xeb/0x1b0
 __x64_sys_read+0x42/0x50
 x64_sys_call+0x27ad/0x2d30
 do_syscall_64+0xcd/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Similar cases are reported by syzkaller in

https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a

But these are regarding the *initialization* of dev-&gt;driver

dev-&gt;driver = drv;

As this switches dev-&gt;driver to non-NULL these reports can be considered
to be false-positives (which should be "fixed" by this commit, as well,
though).

The same issue was reported and tried to be fixed back in 2015 in

https://lore.kernel.org/lkml/1421259054-2574-1-git-send-email-a.sangwan@samsung.com/

already.

Fixes: 239378f16aa1 ("Driver core: add uevent vars for devices of a class")
Cc: stable &lt;stable@kernel.org&gt;
Cc: syzbot+ffa8143439596313a85a@syzkaller.appspotmail.com
Cc: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Cc: Namjae Jeon &lt;namjae.jeon@samsung.com&gt;
Signed-off-by: Dirk Behme &lt;dirk.behme@de.bosch.com&gt;
Link: https://lore.kernel.org/r/20240513050634.3964461-1-dirk.behme@de.bosch.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0 upstream.

Synchronize the dev-&gt;driver usage in really_probe() and dev_uevent().
These can run in different threads, what can result in the following
race condition for dev-&gt;driver uninitialization:

Thread #1:
==========

really_probe() {
...
probe_failed:
...
device_unbind_cleanup(dev) {
    ...
    dev-&gt;driver = NULL;   // &lt;= Failed probe sets dev-&gt;driver to NULL
    ...
    }
...
}

Thread #2:
==========

dev_uevent() {
...
if (dev-&gt;driver)
      // If dev-&gt;driver is NULLed from really_probe() from here on,
      // after above check, the system crashes
      add_uevent_var(env, "DRIVER=%s", dev-&gt;driver-&gt;name);
...
}

really_probe() holds the lock, already. So nothing needs to be done
there. dev_uevent() is called with lock held, often, too. But not
always. What implies that we can't add any locking in dev_uevent()
itself. So fix this race by adding the lock to the non-protected
path. This is the path where above race is observed:

 dev_uevent+0x235/0x380
 uevent_show+0x10c/0x1f0  &lt;= Add lock here
 dev_attr_show+0x3a/0xa0
 sysfs_kf_seq_show+0x17c/0x250
 kernfs_seq_show+0x7c/0x90
 seq_read_iter+0x2d7/0x940
 kernfs_fop_read_iter+0xc6/0x310
 vfs_read+0x5bc/0x6b0
 ksys_read+0xeb/0x1b0
 __x64_sys_read+0x42/0x50
 x64_sys_call+0x27ad/0x2d30
 do_syscall_64+0xcd/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Similar cases are reported by syzkaller in

https://syzkaller.appspot.com/bug?extid=ffa8143439596313a85a

But these are regarding the *initialization* of dev-&gt;driver

dev-&gt;driver = drv;

As this switches dev-&gt;driver to non-NULL these reports can be considered
to be false-positives (which should be "fixed" by this commit, as well,
though).

The same issue was reported and tried to be fixed back in 2015 in

https://lore.kernel.org/lkml/1421259054-2574-1-git-send-email-a.sangwan@samsung.com/

already.

Fixes: 239378f16aa1 ("Driver core: add uevent vars for devices of a class")
Cc: stable &lt;stable@kernel.org&gt;
Cc: syzbot+ffa8143439596313a85a@syzkaller.appspotmail.com
Cc: Ashish Sangwan &lt;a.sangwan@samsung.com&gt;
Cc: Namjae Jeon &lt;namjae.jeon@samsung.com&gt;
Signed-off-by: Dirk Behme &lt;dirk.behme@de.bosch.com&gt;
Link: https://lore.kernel.org/r/20240513050634.3964461-1-dirk.behme@de.bosch.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Introduce device_link_wait_removal()</title>
<updated>2024-04-10T14:19:42+00:00</updated>
<author>
<name>Herve Codina</name>
<email>herve.codina@bootlin.com</email>
</author>
<published>2024-03-25T15:21:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=03c356860b8ba3372f0ff4df5d26e14e67f4c050'/>
<id>03c356860b8ba3372f0ff4df5d26e14e67f4c050</id>
<content type='text'>
commit 0462c56c290a99a7f03e817ae5b843116dfb575c upstream.

The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduces a workqueue to release the consumer and supplier devices used
in the devlink.
In the job queued, devices are release and in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.

Nothing is present to provide some synchronisation with this workqueue
in order to ensure that all ongoing releasing operations are done and
so, some other operations can be started safely.

For instance, in the following sequence:
  1) of_platform_depopulate()
  2) of_overlay_remove()

During the step 1, devices are released and related devlinks are removed
(jobs pushed in the workqueue).
During the step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to missing of_node_put():
  ERROR: memory leak, expected refcount 1 instead of 2

Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.

Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.

Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina &lt;herve.codina@bootlin.com&gt;
Tested-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt;
Reviewed-by: Nuno Sa &lt;nuno.sa@analog.com&gt;
Reviewed-by: Saravana Kannan &lt;saravanak@google.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Link: https://lore.kernel.org/r/20240325152140.198219-2-herve.codina@bootlin.com
Signed-off-by: Rob Herring &lt;robh@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 0462c56c290a99a7f03e817ae5b843116dfb575c upstream.

The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
introduces a workqueue to release the consumer and supplier devices used
in the devlink.
In the job queued, devices are release and in turn, when all the
references to these devices are dropped, the release function of the
device itself is called.

Nothing is present to provide some synchronisation with this workqueue
in order to ensure that all ongoing releasing operations are done and
so, some other operations can be started safely.

For instance, in the following sequence:
  1) of_platform_depopulate()
  2) of_overlay_remove()

During the step 1, devices are released and related devlinks are removed
(jobs pushed in the workqueue).
During the step 2, OF nodes are destroyed but, without any
synchronisation with devlink removal jobs, of_overlay_remove() can raise
warnings related to missing of_node_put():
  ERROR: memory leak, expected refcount 1 instead of 2

Indeed, the missing of_node_put() call is going to be done, too late,
from the workqueue job execution.

Introduce device_link_wait_removal() to offer a way to synchronize
operations waiting for the end of devlink removals (i.e. end of
workqueue jobs).
Also, as a flushing operation is done on the workqueue, the workqueue
used is moved from a system-wide workqueue to a local one.

Cc: stable@vger.kernel.org
Signed-off-by: Herve Codina &lt;herve.codina@bootlin.com&gt;
Tested-by: Luca Ceresoli &lt;luca.ceresoli@bootlin.com&gt;
Reviewed-by: Nuno Sa &lt;nuno.sa@analog.com&gt;
Reviewed-by: Saravana Kannan &lt;saravanak@google.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Link: https://lore.kernel.org/r/20240325152140.198219-2-herve.codina@bootlin.com
Signed-off-by: Rob Herring &lt;robh@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: fix resource leak in device_add()</title>
<updated>2023-03-10T08:39:38+00:00</updated>
<author>
<name>Zhengchao Shao</name>
<email>shaozhengchao@huawei.com</email>
</author>
<published>2022-11-23T01:20:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8d389e363075c2e1deb84a560686ea92123e4b8b'/>
<id>8d389e363075c2e1deb84a560686ea92123e4b8b</id>
<content type='text'>
[ Upstream commit 6977b1a5d67097eaa4d02b0c126c04cc6e8917c0 ]

When calling kobject_add() failed in device_add(), it will call
cleanup_glue_dir() to free resource. But in kobject_add(),
dev-&gt;kobj.parent has been set to NULL. This will cause resource leak.

The process is as follows:
device_add()
	get_device_parent()
		class_dir_create_and_add()
			kobject_add()		//kobject_get()
	...
	dev-&gt;kobj.parent = kobj;
	...
	kobject_add()		//failed, but set dev-&gt;kobj.parent = NULL
	...
	glue_dir = get_glue_dir(dev)	//glue_dir = NULL, and goto
					//"Error" label
	...
	cleanup_glue_dir()	//becaues glue_dir is NULL, not call
				//kobject_put()

The preceding problem may cause insmod mac80211_hwsim.ko to failed.
sysfs: cannot create duplicate filename '/devices/virtual/mac80211_hwsim'
Call Trace:
&lt;TASK&gt;
dump_stack_lvl+0x8e/0xd1
sysfs_warn_dup.cold+0x1c/0x29
sysfs_create_dir_ns+0x224/0x280
kobject_add_internal+0x2aa/0x880
kobject_add+0x135/0x1a0
get_device_parent+0x3d7/0x590
device_add+0x2aa/0x1cb0
device_create_groups_vargs+0x1eb/0x260
device_create+0xdc/0x110
mac80211_hwsim_new_radio+0x31e/0x4790 [mac80211_hwsim]
init_mac80211_hwsim+0x48d/0x1000 [mac80211_hwsim]
do_one_initcall+0x10f/0x630
do_init_module+0x19f/0x5e0
load_module+0x64b7/0x6eb0
__do_sys_finit_module+0x140/0x200
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
&lt;/TASK&gt;
kobject_add_internal failed for mac80211_hwsim with -EEXIST, don't try to
register things with the same name in the same directory.

Fixes: cebf8fd16900 ("driver core: fix race between creating/querying glue dir and its cleanup")
Signed-off-by: Zhengchao Shao &lt;shaozhengchao@huawei.com&gt;
Link: https://lore.kernel.org/r/20221123012042.335252-1-shaozhengchao@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 6977b1a5d67097eaa4d02b0c126c04cc6e8917c0 ]

When calling kobject_add() failed in device_add(), it will call
cleanup_glue_dir() to free resource. But in kobject_add(),
dev-&gt;kobj.parent has been set to NULL. This will cause resource leak.

The process is as follows:
device_add()
	get_device_parent()
		class_dir_create_and_add()
			kobject_add()		//kobject_get()
	...
	dev-&gt;kobj.parent = kobj;
	...
	kobject_add()		//failed, but set dev-&gt;kobj.parent = NULL
	...
	glue_dir = get_glue_dir(dev)	//glue_dir = NULL, and goto
					//"Error" label
	...
	cleanup_glue_dir()	//becaues glue_dir is NULL, not call
				//kobject_put()

The preceding problem may cause insmod mac80211_hwsim.ko to failed.
sysfs: cannot create duplicate filename '/devices/virtual/mac80211_hwsim'
Call Trace:
&lt;TASK&gt;
dump_stack_lvl+0x8e/0xd1
sysfs_warn_dup.cold+0x1c/0x29
sysfs_create_dir_ns+0x224/0x280
kobject_add_internal+0x2aa/0x880
kobject_add+0x135/0x1a0
get_device_parent+0x3d7/0x590
device_add+0x2aa/0x1cb0
device_create_groups_vargs+0x1eb/0x260
device_create+0xdc/0x110
mac80211_hwsim_new_radio+0x31e/0x4790 [mac80211_hwsim]
init_mac80211_hwsim+0x48d/0x1000 [mac80211_hwsim]
do_one_initcall+0x10f/0x630
do_init_module+0x19f/0x5e0
load_module+0x64b7/0x6eb0
__do_sys_finit_module+0x140/0x200
do_syscall_64+0x35/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
&lt;/TASK&gt;
kobject_add_internal failed for mac80211_hwsim with -EEXIST, don't try to
register things with the same name in the same directory.

Fixes: cebf8fd16900 ("driver core: fix race between creating/querying glue dir and its cleanup")
Signed-off-by: Zhengchao Shao &lt;shaozhengchao@huawei.com&gt;
Link: https://lore.kernel.org/r/20221123012042.335252-1-shaozhengchao@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: fix potential null-ptr-deref in device_add()</title>
<updated>2023-03-10T08:39:35+00:00</updated>
<author>
<name>Yang Yingliang</name>
<email>yangyingliang@huawei.com</email>
</author>
<published>2022-12-05T03:49:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2c59650d078b1b3f1ea50d5f8ee9fcc537dc02d3'/>
<id>2c59650d078b1b3f1ea50d5f8ee9fcc537dc02d3</id>
<content type='text'>
[ Upstream commit f6837f34a34973ef6600c08195ed300e24e97317 ]

I got the following null-ptr-deref report while doing fault injection test:

BUG: kernel NULL pointer dereference, address: 0000000000000058
CPU: 2 PID: 278 Comm: 37-i2c-ds2482 Tainted: G    B   W        N 6.1.0-rc3+
RIP: 0010:klist_put+0x2d/0xd0
Call Trace:
 &lt;TASK&gt;
 klist_remove+0xf1/0x1c0
 device_release_driver_internal+0x196/0x210
 bus_remove_device+0x1bd/0x240
 device_add+0xd3d/0x1100
 w1_add_master_device+0x476/0x490 [wire]
 ds2482_probe+0x303/0x3e0 [ds2482]

This is how it happened:

w1_alloc_dev()
  // The dev-&gt;driver is set to w1_master_driver.
  memcpy(&amp;dev-&gt;dev, device, sizeof(struct device));
  device_add()
    bus_add_device()
    dpm_sysfs_add() // It fails, calls bus_remove_device.

    // error path
    bus_remove_device()
      // The dev-&gt;driver is not null, but driver is not bound.
      __device_release_driver()
        klist_remove(&amp;dev-&gt;p-&gt;knode_driver) &lt;-- It causes null-ptr-deref.

    // normal path
    bus_probe_device() // It's not called yet.
      device_bind_driver()

If dev-&gt;driver is set, in the error path after calling bus_add_device()
in device_add(), bus_remove_device() is called, then the device will be
detached from driver. But device_bind_driver() is not called yet, so it
causes null-ptr-deref while access the 'knode_driver'. To fix this, set
dev-&gt;driver to null in the error path before calling bus_remove_device().

Fixes: 57eee3d23e88 ("Driver core: Call device_pm_add() after bus_add_device() in device_add()")
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Link: https://lore.kernel.org/r/20221205034904.2077765-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit f6837f34a34973ef6600c08195ed300e24e97317 ]

I got the following null-ptr-deref report while doing fault injection test:

BUG: kernel NULL pointer dereference, address: 0000000000000058
CPU: 2 PID: 278 Comm: 37-i2c-ds2482 Tainted: G    B   W        N 6.1.0-rc3+
RIP: 0010:klist_put+0x2d/0xd0
Call Trace:
 &lt;TASK&gt;
 klist_remove+0xf1/0x1c0
 device_release_driver_internal+0x196/0x210
 bus_remove_device+0x1bd/0x240
 device_add+0xd3d/0x1100
 w1_add_master_device+0x476/0x490 [wire]
 ds2482_probe+0x303/0x3e0 [ds2482]

This is how it happened:

w1_alloc_dev()
  // The dev-&gt;driver is set to w1_master_driver.
  memcpy(&amp;dev-&gt;dev, device, sizeof(struct device));
  device_add()
    bus_add_device()
    dpm_sysfs_add() // It fails, calls bus_remove_device.

    // error path
    bus_remove_device()
      // The dev-&gt;driver is not null, but driver is not bound.
      __device_release_driver()
        klist_remove(&amp;dev-&gt;p-&gt;knode_driver) &lt;-- It causes null-ptr-deref.

    // normal path
    bus_probe_device() // It's not called yet.
      device_bind_driver()

If dev-&gt;driver is set, in the error path after calling bus_add_device()
in device_add(), bus_remove_device() is called, then the device will be
detached from driver. But device_bind_driver() is not called yet, so it
causes null-ptr-deref while access the 'knode_driver'. To fix this, set
dev-&gt;driver to null in the error path before calling bus_remove_device().

Fixes: 57eee3d23e88 ("Driver core: Call device_pm_add() after bus_add_device() in device_add()")
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Link: https://lore.kernel.org/r/20221205034904.2077765-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PM: runtime: Redefine pm_runtime_release_supplier()</title>
<updated>2022-07-12T14:35:09+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2022-06-27T18:42:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=79827e53b069276785ed32f34915b7c76055e42c'/>
<id>79827e53b069276785ed32f34915b7c76055e42c</id>
<content type='text'>
commit 07358194badf73e267289b40b761f5dc56928eab upstream.

Instead of passing an extra bool argument to pm_runtime_release_supplier(),
make its callers take care of triggering a runtime-suspend of the
supplier device as needed.

No expected functional impact.

Suggested-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: 5.1+ &lt;stable@vger.kernel.org&gt; # 5.1+
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 07358194badf73e267289b40b761f5dc56928eab upstream.

Instead of passing an extra bool argument to pm_runtime_release_supplier(),
make its callers take care of triggering a runtime-suspend of the
supplier device as needed.

No expected functional impact.

Suggested-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: 5.1+ &lt;stable@vger.kernel.org&gt; # 5.1+
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PM: runtime: Add safety net to supplier device release</title>
<updated>2022-01-27T10:04:44+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2021-12-10T16:10:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fddbdd20c8e0bbc813bebebf82216a9bef22d29a'/>
<id>fddbdd20c8e0bbc813bebebf82216a9bef22d29a</id>
<content type='text'>
[ Upstream commit d1579e61192e0e686faa4208500ef4c3b529b16c ]

Because refcount_dec_not_one() returns true if the target refcount
becomes saturated, it is generally unsafe to use its return value as
a loop termination condition, but that is what happens when a device
link's supplier device is released during runtime PM suspend
operations and on device link removal.

To address this, introduce pm_runtime_release_supplier() to be used
in the above cases which will check the supplier device's runtime
PM usage counter in addition to the refcount_dec_not_one() return
value, so the loop can be terminated in case the rpm_active refcount
value becomes invalid, and update the code in question to use it as
appropriate.

This change is not expected to have any visible functional impact.

Reported-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit d1579e61192e0e686faa4208500ef4c3b529b16c ]

Because refcount_dec_not_one() returns true if the target refcount
becomes saturated, it is generally unsafe to use its return value as
a loop termination condition, but that is what happens when a device
link's supplier device is released during runtime PM suspend
operations and on device link removal.

To address this, introduce pm_runtime_release_supplier() to be used
in the above cases which will check the supplier device's runtime
PM usage counter in addition to the refcount_dec_not_one() return
value, so the loop can be terminated in case the rpm_active refcount
value becomes invalid, and update the code in question to use it as
appropriate.

This change is not expected to have any visible functional impact.

Reported-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: Fix possible memory leak in device_link_add()</title>
<updated>2021-11-18T18:16:50+00:00</updated>
<author>
<name>Yang Yingliang</name>
<email>yangyingliang@huawei.com</email>
</author>
<published>2021-09-30T08:57:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7f8beede9915a0a0fc548858a4a54042cad6705b'/>
<id>7f8beede9915a0a0fc548858a4a54042cad6705b</id>
<content type='text'>
[ Upstream commit df0a18149474c7e6b21f6367fbc6bc8d0f192444 ]

I got memory leak as follows:

unreferenced object 0xffff88801f0b2200 (size 64):
  comm "i2c-lis2hh12-21", pid 5455, jiffies 4294944606 (age 15.224s)
  hex dump (first 32 bytes):
    72 65 67 75 6c 61 74 6f 72 3a 72 65 67 75 6c 61  regulator:regula
    74 6f 72 2e 30 2d 2d 69 32 63 3a 31 2d 30 30 31  tor.0--i2c:1-001
  backtrace:
    [&lt;00000000bf5b0c3b&gt;] __kmalloc_track_caller+0x19f/0x3a0
    [&lt;0000000050da42d9&gt;] kvasprintf+0xb5/0x150
    [&lt;000000004bbbed13&gt;] kvasprintf_const+0x60/0x190
    [&lt;00000000cdac7480&gt;] kobject_set_name_vargs+0x56/0x150
    [&lt;00000000bf83f8e8&gt;] dev_set_name+0xc0/0x100
    [&lt;00000000cc1cf7e3&gt;] device_link_add+0x6b4/0x17c0
    [&lt;000000009db9faed&gt;] _regulator_get+0x297/0x680
    [&lt;00000000845e7f2b&gt;] _devm_regulator_get+0x5b/0xe0
    [&lt;000000003958ee25&gt;] st_sensors_power_enable+0x71/0x1b0 [st_sensors]
    [&lt;000000005f450f52&gt;] st_accel_i2c_probe+0xd9/0x150 [st_accel_i2c]
    [&lt;00000000b5f2ab33&gt;] i2c_device_probe+0x4d8/0xbe0
    [&lt;0000000070fb977b&gt;] really_probe+0x299/0xc30
    [&lt;0000000088e226ce&gt;] __driver_probe_device+0x357/0x500
    [&lt;00000000c21dda32&gt;] driver_probe_device+0x4e/0x140
    [&lt;000000004e650441&gt;] __device_attach_driver+0x257/0x340
    [&lt;00000000cf1891b8&gt;] bus_for_each_drv+0x166/0x1e0

When device_register() returns an error, the name allocated in dev_set_name()
will be leaked, the put_device() should be used instead of kfree() to give up
the device reference, then the name will be freed in kobject_cleanup() and the
references of consumer and supplier will be decreased in device_link_release_fn().

Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs")
Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Reviewed-by: Saravana Kannan &lt;saravanak@google.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Link: https://lore.kernel.org/r/20210930085714.2057460-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit df0a18149474c7e6b21f6367fbc6bc8d0f192444 ]

I got memory leak as follows:

unreferenced object 0xffff88801f0b2200 (size 64):
  comm "i2c-lis2hh12-21", pid 5455, jiffies 4294944606 (age 15.224s)
  hex dump (first 32 bytes):
    72 65 67 75 6c 61 74 6f 72 3a 72 65 67 75 6c 61  regulator:regula
    74 6f 72 2e 30 2d 2d 69 32 63 3a 31 2d 30 30 31  tor.0--i2c:1-001
  backtrace:
    [&lt;00000000bf5b0c3b&gt;] __kmalloc_track_caller+0x19f/0x3a0
    [&lt;0000000050da42d9&gt;] kvasprintf+0xb5/0x150
    [&lt;000000004bbbed13&gt;] kvasprintf_const+0x60/0x190
    [&lt;00000000cdac7480&gt;] kobject_set_name_vargs+0x56/0x150
    [&lt;00000000bf83f8e8&gt;] dev_set_name+0xc0/0x100
    [&lt;00000000cc1cf7e3&gt;] device_link_add+0x6b4/0x17c0
    [&lt;000000009db9faed&gt;] _regulator_get+0x297/0x680
    [&lt;00000000845e7f2b&gt;] _devm_regulator_get+0x5b/0xe0
    [&lt;000000003958ee25&gt;] st_sensors_power_enable+0x71/0x1b0 [st_sensors]
    [&lt;000000005f450f52&gt;] st_accel_i2c_probe+0xd9/0x150 [st_accel_i2c]
    [&lt;00000000b5f2ab33&gt;] i2c_device_probe+0x4d8/0xbe0
    [&lt;0000000070fb977b&gt;] really_probe+0x299/0xc30
    [&lt;0000000088e226ce&gt;] __driver_probe_device+0x357/0x500
    [&lt;00000000c21dda32&gt;] driver_probe_device+0x4e/0x140
    [&lt;000000004e650441&gt;] __device_attach_driver+0x257/0x340
    [&lt;00000000cf1891b8&gt;] bus_for_each_drv+0x166/0x1e0

When device_register() returns an error, the name allocated in dev_set_name()
will be leaked, the put_device() should be used instead of kfree() to give up
the device reference, then the name will be freed in kobject_cleanup() and the
references of consumer and supplier will be decreased in device_link_release_fn().

Fixes: 287905e68dd2 ("driver core: Expose device link details in sysfs")
Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Reviewed-by: Saravana Kannan &lt;saravanak@google.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Link: https://lore.kernel.org/r/20210930085714.2057460-1-yangyingliang@huawei.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
