<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/base/bus.c, branch linux-3.10.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>driver core: Fix unbalanced device reference in drivers_probe</title>
<updated>2015-01-16T14:59:01+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@redhat.com</email>
</author>
<published>2014-10-31T17:13:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=bdf2a0db176e1de7c93fe7b7c5a74756b976fb33'/>
<id>bdf2a0db176e1de7c93fe7b7c5a74756b976fb33</id>
<content type='text'>
commit bb34cb6bbd287b57e955bc5cfd42fcde6aaca279 upstream.

bus_find_device_by_name() acquires a device reference which is never
released.  This results in an object leak, which on older kernels
results in failure to release all resources of PCI devices.  libvirt
uses drivers_probe to re-attach devices to the host after assignment
and is therefore a common trigger for this leak.

Example:

# cd /sys/bus/pci/
# dmesg -C
# echo 1 &gt; devices/0000\:01\:00.0/sriov_numvfs
# echo 0 &gt; devices/0000\:01\:00.0/sriov_numvfs
# dmesg | grep 01:10
 pci 0000:01:10.0: [8086:10ca] type 00 class 0x020000
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_add_internal: parent: '0000:00:01.0', set: 'devices'
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_cleanup, parent           (null)
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): calling ktype release
 kobject: '0000:01:10.0': free name

[kobject freed as expected]

# dmesg -C
# echo 1 &gt; devices/0000\:01\:00.0/sriov_numvfs
# echo 0000:01:10.0 &gt; drivers_probe
# echo 0 &gt; devices/0000\:01\:00.0/sriov_numvfs
# dmesg | grep 01:10
 pci 0000:01:10.0: [8086:10ca] type 00 class 0x020000
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): kobject_add_internal: parent: '0000:00:01.0', set: 'devices'
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'

[no free]

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit bb34cb6bbd287b57e955bc5cfd42fcde6aaca279 upstream.

bus_find_device_by_name() acquires a device reference which is never
released.  This results in an object leak, which on older kernels
results in failure to release all resources of PCI devices.  libvirt
uses drivers_probe to re-attach devices to the host after assignment
and is therefore a common trigger for this leak.

Example:

# cd /sys/bus/pci/
# dmesg -C
# echo 1 &gt; devices/0000\:01\:00.0/sriov_numvfs
# echo 0 &gt; devices/0000\:01\:00.0/sriov_numvfs
# dmesg | grep 01:10
 pci 0000:01:10.0: [8086:10ca] type 00 class 0x020000
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_add_internal: parent: '0000:00:01.0', set: 'devices'
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): kobject_cleanup, parent           (null)
 kobject: '0000:01:10.0' (ffff8801d79cd0a8): calling ktype release
 kobject: '0000:01:10.0': free name

[kobject freed as expected]

# dmesg -C
# echo 1 &gt; devices/0000\:01\:00.0/sriov_numvfs
# echo 0000:01:10.0 &gt; drivers_probe
# echo 0 &gt; devices/0000\:01\:00.0/sriov_numvfs
# dmesg | grep 01:10
 pci 0000:01:10.0: [8086:10ca] type 00 class 0x020000
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): kobject_add_internal: parent: '0000:00:01.0', set: 'devices'
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): kobject_uevent_env
 kobject: '0000:01:10.0' (ffff8801d79ce0a8): fill_kobj_path: path = '/devices/pci0000:00/0000:00:01.0/0000:01:10.0'

[no free]

Signed-off-by: Alex Williamson &lt;alex.williamson@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: export subsys_virtual_register</title>
<updated>2013-05-21T16:05:52+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2013-05-10T16:14:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1c04fc3536b9e6d143991a8c5c16b04866baeed6'/>
<id>1c04fc3536b9e6d143991a8c5c16b04866baeed6</id>
<content type='text'>
Modules want to call this function, so it needs to be exported.

Reported-by: Daniel Mack &lt;zonque@gmail.com&gt;
Cc: Kay Sievers &lt;kay@vrfy.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Modules want to call this function, so it needs to be exported.

Reported-by: Daniel Mack &lt;zonque@gmail.com&gt;
Cc: Kay Sievers &lt;kay@vrfy.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq</title>
<updated>2013-04-30T02:07:40+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-04-30T02:07:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=46d9be3e5eb01f71fc02653755d970247174b400'/>
<id>46d9be3e5eb01f71fc02653755d970247174b400</id>
<content type='text'>
Pull workqueue updates from Tejun Heo:
 "A lot of activity on the workqueue side this time.  The changes achieve
  the following.

   - WQ_UNBOUND workqueues - the workqueues which are not per-cpu - are
     updated to be able to interface with multiple backend worker pools.
     This involved a lot of churning but the end result seems actually
     neater as unbound workqueues are now a lot closer to per-cpu ones.

   - The ability to interface with multiple backend worker pools is
     used to implement unbound workqueues with custom attributes.
     Currently the supported attributes are the nice level and CPU
     affinity.  It may be expanded to include cgroup association in
     future.  The attributes can be specified either by calling
     apply_workqueue_attrs() or through /sys/bus/workqueue/WQ_NAME/* if
     the workqueue in question is exported through sysfs.

     The backend worker pools are keyed by the actual attributes and
     shared by any workqueues which share the same attributes.  When
     attributes of a workqueue are changed, the workqueue binds to the
     worker pool with the specified attributes while leaving the work
     items which are already executing in its previous worker pools
     alone.

     This allows converting custom worker pool implementations which
     want worker attribute tuning to use workqueues.  The writeback pool
     is already converted in the block tree and a couple of others
     are likely to follow, including btrfs io workers.

   - WQ_UNBOUND's ability to bind to multiple worker pools is also used
     to make it NUMA-aware.  Because there's no association between work
     item issuer and the specific worker assigned to execute it, before
     this change, using unbound workqueue led to unnecessary cross-node
     bouncing and it couldn't be helped by autonuma as it requires tasks
     to have implicit node affinity and workers are assigned randomly.

     After these changes, an unbound workqueue now binds to multiple
     NUMA-affine worker pools so that queued work items are executed in
     the same node.  This is turned on by default but can be disabled
     system-wide or for individual workqueues.

     Crypto was requesting NUMA affinity as encrypting data across
     different nodes can contribute noticeable overhead and doing it
     per-cpu was too limiting for certain cases and IO throughput could
     be bottlenecked by one CPU being fully occupied while others have
     idle cycles.

  While the new features required a lot of changes including
  restructuring locking, it didn't complicate the execution paths much.
  The unbound workqueue handling is now closer to per-cpu ones and the
  new features are implemented by simply associating a workqueue with
  different sets of backend worker pools without changing queue,
  execution or flush paths.

  As such, even though the amount of change is very high, I feel
  relatively safe in that it isn't likely to cause subtle issues with
  basic correctness of work item execution and handling.  If something
  is wrong, it's likely to show up as being associated with worker pools
  with the wrong attributes or OOPS while workqueue attributes are being
  changed or during CPU hotplug.

  While this creates more backend worker pools, it doesn't add too many
  more workers unless, of course, there are many workqueues with unique
  combinations of attributes.  Assuming everything else is the same,
  NUMA awareness costs an extra worker pool per NUMA node with online
  CPUs.

  There are also a couple things which are being routed outside the
  workqueue tree.

   - block tree pulled in workqueue for-3.10 so that writeback worker
     pool can be converted to unbound workqueue with sysfs control
     exposed.  This simplifies the code, makes writeback workers
     NUMA-aware and allows tuning nice level and CPU affinity via sysfs.

   - The conversion to workqueue means that there's no 1:1 association
     between a specific worker, which makes writeback folks unhappy as
     they want to be able to tell which filesystem caused a problem from
     backtrace on systems with many filesystems mounted.  This is
     resolved by allowing work items to set debug info string which is
     printed when the task is dumped.  As this change involves unifying
     implementations of dump_stack() and friends in arch codes, it's
     being routed through Andrew's -mm tree."

* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (84 commits)
  workqueue: use kmem_cache_free() instead of kfree()
  workqueue: avoid false negative WARN_ON() in destroy_workqueue()
  workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity
  workqueue: implement NUMA affinity for unbound workqueues
  workqueue: introduce put_pwq_unlocked()
  workqueue: introduce numa_pwq_tbl_install()
  workqueue: use NUMA-aware allocation for pool_workqueues
  workqueue: break init_and_link_pwq() into two functions and introduce alloc_unbound_pwq()
  workqueue: map an unbound workqueues to multiple per-node pool_workqueues
  workqueue: move hot fields of workqueue_struct to the end
  workqueue: make workqueue-&gt;name[] fixed len
  workqueue: add workqueue-&gt;unbound_attrs
  workqueue: determine NUMA node of workers accourding to the allowed cpumask
  workqueue: drop 'H' from kworker names of unbound worker pools
  workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]
  workqueue: move pwq_pool_locking outside of get/put_unbound_pool()
  workqueue: fix memory leak in apply_workqueue_attrs()
  workqueue: fix unbound workqueue attrs hashing / comparison
  workqueue: fix race condition in unbound workqueue free path
  workqueue: remove pwq_lock which is no longer used
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull workqueue updates from Tejun Heo:
 "A lot of activity on the workqueue side this time.  The changes achieve
  the following.

   - WQ_UNBOUND workqueues - the workqueues which are not per-cpu - are
     updated to be able to interface with multiple backend worker pools.
     This involved a lot of churning but the end result seems actually
     neater as unbound workqueues are now a lot closer to per-cpu ones.

   - The ability to interface with multiple backend worker pools is
     used to implement unbound workqueues with custom attributes.
     Currently the supported attributes are the nice level and CPU
     affinity.  It may be expanded to include cgroup association in
     future.  The attributes can be specified either by calling
     apply_workqueue_attrs() or through /sys/bus/workqueue/WQ_NAME/* if
     the workqueue in question is exported through sysfs.

     The backend worker pools are keyed by the actual attributes and
     shared by any workqueues which share the same attributes.  When
     attributes of a workqueue are changed, the workqueue binds to the
     worker pool with the specified attributes while leaving the work
     items which are already executing in its previous worker pools
     alone.

     This allows converting custom worker pool implementations which
     want worker attribute tuning to use workqueues.  The writeback pool
     is already converted in the block tree and a couple of others
     are likely to follow, including btrfs io workers.

   - WQ_UNBOUND's ability to bind to multiple worker pools is also used
     to make it NUMA-aware.  Because there's no association between work
     item issuer and the specific worker assigned to execute it, before
     this change, using unbound workqueue led to unnecessary cross-node
     bouncing and it couldn't be helped by autonuma as it requires tasks
     to have implicit node affinity and workers are assigned randomly.

     After these changes, an unbound workqueue now binds to multiple
     NUMA-affine worker pools so that queued work items are executed in
     the same node.  This is turned on by default but can be disabled
     system-wide or for individual workqueues.

     Crypto was requesting NUMA affinity as encrypting data across
     different nodes can contribute noticeable overhead and doing it
     per-cpu was too limiting for certain cases and IO throughput could
     be bottlenecked by one CPU being fully occupied while others have
     idle cycles.

  While the new features required a lot of changes including
  restructuring locking, it didn't complicate the execution paths much.
  The unbound workqueue handling is now closer to per-cpu ones and the
  new features are implemented by simply associating a workqueue with
  different sets of backend worker pools without changing queue,
  execution or flush paths.

  As such, even though the amount of change is very high, I feel
  relatively safe in that it isn't likely to cause subtle issues with
  basic correctness of work item execution and handling.  If something
  is wrong, it's likely to show up as being associated with worker pools
  with the wrong attributes or OOPS while workqueue attributes are being
  changed or during CPU hotplug.

  While this creates more backend worker pools, it doesn't add too many
  more workers unless, of course, there are many workqueues with unique
  combinations of attributes.  Assuming everything else is the same,
  NUMA awareness costs an extra worker pool per NUMA node with online
  CPUs.

  There are also a couple things which are being routed outside the
  workqueue tree.

   - block tree pulled in workqueue for-3.10 so that writeback worker
     pool can be converted to unbound workqueue with sysfs control
     exposed.  This simplifies the code, makes writeback workers
     NUMA-aware and allows tuning nice level and CPU affinity via sysfs.

   - The conversion to workqueue means that there's no 1:1 association
     between a specific worker, which makes writeback folks unhappy as
     they want to be able to tell which filesystem caused a problem from
     backtrace on systems with many filesystems mounted.  This is
     resolved by allowing work items to set debug info string which is
     printed when the task is dumped.  As this change involves unifying
     implementations of dump_stack() and friends in arch codes, it's
     being routed through Andrew's -mm tree."

* 'for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (84 commits)
  workqueue: use kmem_cache_free() instead of kfree()
  workqueue: avoid false negative WARN_ON() in destroy_workqueue()
  workqueue: update sysfs interface to reflect NUMA awareness and a kernel param to disable NUMA affinity
  workqueue: implement NUMA affinity for unbound workqueues
  workqueue: introduce put_pwq_unlocked()
  workqueue: introduce numa_pwq_tbl_install()
  workqueue: use NUMA-aware allocation for pool_workqueues
  workqueue: break init_and_link_pwq() into two functions and introduce alloc_unbound_pwq()
  workqueue: map an unbound workqueues to multiple per-node pool_workqueues
  workqueue: move hot fields of workqueue_struct to the end
  workqueue: make workqueue-&gt;name[] fixed len
  workqueue: add workqueue-&gt;unbound_attrs
  workqueue: determine NUMA node of workers accourding to the allowed cpumask
  workqueue: drop 'H' from kworker names of unbound worker pools
  workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]
  workqueue: move pwq_pool_locking outside of get/put_unbound_pool()
  workqueue: fix memory leak in apply_workqueue_attrs()
  workqueue: fix unbound workqueue attrs hashing / comparison
  workqueue: fix race condition in unbound workqueue free path
  workqueue: remove pwq_lock which is no longer used
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>device: separate all subsys mutexes</title>
<updated>2013-03-13T15:48:28+00:00</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2013-03-12T16:21:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=be871b7e54711479d3b9d3617d49898770830db2'/>
<id>be871b7e54711479d3b9d3617d49898770830db2</id>
<content type='text'>
ca22e56d (driver-core: implement 'sysdev' functionality for regular
devices and buses) has introduced bus_register macro with a static
key to distinguish different subsys mutex classes.

This however doesn't work for different subsys which use a common
registering function. One example is subsys_system_register (and
mce_device and cpu_device).

In the end this leads to the following lockdep splat:
[  207.271924] ======================================================
[  207.271932] [ INFO: possible circular locking dependency detected ]
[  207.271942] 3.9.0-rc1-0.7-default+ #34 Not tainted
[  207.271948] -------------------------------------------------------
[  207.271957] bash/10493 is trying to acquire lock:
[  207.271963]  (subsys mutex){+.+.+.}, at: [&lt;ffffffff8134af27&gt;] bus_remove_device+0x37/0x1c0
[  207.271987]
[  207.271987] but task is already holding lock:
[  207.271995]  (cpu_hotplug.lock){+.+.+.}, at: [&lt;ffffffff81046ccf&gt;] cpu_hotplug_begin+0x2f/0x60
[  207.272012]
[  207.272012] which lock already depends on the new lock.
[  207.272012]
[  207.272023]
[  207.272023] the existing dependency chain (in reverse order) is:
[  207.272033]
[  207.272033] -&gt; #4 (cpu_hotplug.lock){+.+.+.}:
[  207.272044]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272056]        [&lt;ffffffff814ad807&gt;] mutex_lock_nested+0x37/0x360
[  207.272069]        [&lt;ffffffff81046ba9&gt;] get_online_cpus+0x29/0x40
[  207.272082]        [&lt;ffffffff81185210&gt;] drain_all_stock+0x30/0x150
[  207.272094]        [&lt;ffffffff811853da&gt;] mem_cgroup_reclaim+0xaa/0xe0
[  207.272104]        [&lt;ffffffff8118775e&gt;] __mem_cgroup_try_charge+0x51e/0xcf0
[  207.272114]        [&lt;ffffffff81188486&gt;] mem_cgroup_charge_common+0x36/0x60
[  207.272125]        [&lt;ffffffff811884da&gt;] mem_cgroup_newpage_charge+0x2a/0x30
[  207.272135]        [&lt;ffffffff81150531&gt;] do_wp_page+0x231/0x830
[  207.272147]        [&lt;ffffffff8115151e&gt;] handle_pte_fault+0x19e/0x8d0
[  207.272157]        [&lt;ffffffff81151da8&gt;] handle_mm_fault+0x158/0x1e0
[  207.272166]        [&lt;ffffffff814b6153&gt;] do_page_fault+0x2a3/0x4e0
[  207.272178]        [&lt;ffffffff814b2578&gt;] page_fault+0x28/0x30
[  207.272189]
[  207.272189] -&gt; #3 (&amp;mm-&gt;mmap_sem){++++++}:
[  207.272199]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272208]        [&lt;ffffffff8114c5ad&gt;] might_fault+0x6d/0x90
[  207.272218]        [&lt;ffffffff811a11e3&gt;] filldir64+0xb3/0x120
[  207.272229]        [&lt;ffffffffa013fc19&gt;] call_filldir+0x89/0x130 [ext3]
[  207.272248]        [&lt;ffffffffa0140377&gt;] ext3_readdir+0x6b7/0x7e0 [ext3]
[  207.272263]        [&lt;ffffffff811a1519&gt;] vfs_readdir+0xa9/0xc0
[  207.272273]        [&lt;ffffffff811a15cb&gt;] sys_getdents64+0x9b/0x110
[  207.272284]        [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b
[  207.272296]
[  207.272296] -&gt; #2 (&amp;type-&gt;i_mutex_dir_key#3){+.+.+.}:
[  207.272309]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272319]        [&lt;ffffffff814ad807&gt;] mutex_lock_nested+0x37/0x360
[  207.272329]        [&lt;ffffffff8119c254&gt;] link_path_walk+0x6f4/0x9a0
[  207.272339]        [&lt;ffffffff8119e7fa&gt;] path_openat+0xba/0x470
[  207.272349]        [&lt;ffffffff8119ecf8&gt;] do_filp_open+0x48/0xa0
[  207.272358]        [&lt;ffffffff8118d81c&gt;] file_open_name+0xdc/0x110
[  207.272369]        [&lt;ffffffff8118d885&gt;] filp_open+0x35/0x40
[  207.272378]        [&lt;ffffffff8135c76e&gt;] _request_firmware+0x52e/0xb20
[  207.272389]        [&lt;ffffffff8135cdd6&gt;] request_firmware+0x16/0x20
[  207.272399]        [&lt;ffffffffa03bdb91&gt;] request_microcode_fw+0x61/0xd0 [microcode]
[  207.272416]        [&lt;ffffffffa03bd554&gt;] microcode_init_cpu+0x104/0x150 [microcode]
[  207.272431]        [&lt;ffffffffa03bd61c&gt;] mc_device_add+0x7c/0xb0 [microcode]
[  207.272444]        [&lt;ffffffff8134a419&gt;] subsys_interface_register+0xc9/0x100
[  207.272457]        [&lt;ffffffffa04fc0f4&gt;] 0xffffffffa04fc0f4
[  207.272472]        [&lt;ffffffff81000202&gt;] do_one_initcall+0x42/0x180
[  207.272485]        [&lt;ffffffff810bbeff&gt;] load_module+0x19df/0x1b70
[  207.272499]        [&lt;ffffffff810bc376&gt;] sys_init_module+0xe6/0x130
[  207.272511]        [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b
[  207.272523]
[  207.272523] -&gt; #1 (umhelper_sem){++++.+}:
[  207.272537]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272548]        [&lt;ffffffff814ae9c4&gt;] down_read+0x34/0x50
[  207.272559]        [&lt;ffffffff81062bff&gt;] usermodehelper_read_trylock+0x4f/0x100
[  207.272575]        [&lt;ffffffff8135c7dd&gt;] _request_firmware+0x59d/0xb20
[  207.272587]        [&lt;ffffffff8135cdd6&gt;] request_firmware+0x16/0x20
[  207.272599]        [&lt;ffffffffa03bdb91&gt;] request_microcode_fw+0x61/0xd0 [microcode]
[  207.272613]        [&lt;ffffffffa03bd554&gt;] microcode_init_cpu+0x104/0x150 [microcode]
[  207.272627]        [&lt;ffffffffa03bd61c&gt;] mc_device_add+0x7c/0xb0 [microcode]
[  207.272641]        [&lt;ffffffff8134a419&gt;] subsys_interface_register+0xc9/0x100
[  207.272654]        [&lt;ffffffffa04fc0f4&gt;] 0xffffffffa04fc0f4
[  207.272666]        [&lt;ffffffff81000202&gt;] do_one_initcall+0x42/0x180
[  207.272678]        [&lt;ffffffff810bbeff&gt;] load_module+0x19df/0x1b70
[  207.272690]        [&lt;ffffffff810bc376&gt;] sys_init_module+0xe6/0x130
[  207.272702]        [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b
[  207.272715]
[  207.272715] -&gt; #0 (subsys mutex){+.+.+.}:
[  207.272729]        [&lt;ffffffff810ae002&gt;] __lock_acquire+0x13b2/0x15f0
[  207.272740]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272751]        [&lt;ffffffff814ad807&gt;] mutex_lock_nested+0x37/0x360
[  207.272763]        [&lt;ffffffff8134af27&gt;] bus_remove_device+0x37/0x1c0
[  207.272775]        [&lt;ffffffff81349114&gt;] device_del+0x134/0x1f0
[  207.272786]        [&lt;ffffffff813491f2&gt;] device_unregister+0x22/0x60
[  207.272798]        [&lt;ffffffff814a24ea&gt;] mce_cpu_callback+0x15e/0x1ad
[  207.272812]        [&lt;ffffffff814b6402&gt;] notifier_call_chain+0x72/0x130
[  207.272824]        [&lt;ffffffff81073d6e&gt;] __raw_notifier_call_chain+0xe/0x10
[  207.272839]        [&lt;ffffffff81498f76&gt;] _cpu_down+0x1d6/0x350
[  207.272851]        [&lt;ffffffff81499130&gt;] cpu_down+0x40/0x60
[  207.272862]        [&lt;ffffffff8149cc55&gt;] store_online+0x75/0xe0
[  207.272874]        [&lt;ffffffff813474a0&gt;] dev_attr_store+0x20/0x30
[  207.272886]        [&lt;ffffffff812090d9&gt;] sysfs_write_file+0xd9/0x150
[  207.272900]        [&lt;ffffffff8118e10b&gt;] vfs_write+0xcb/0x130
[  207.272911]        [&lt;ffffffff8118e924&gt;] sys_write+0x64/0xa0
[  207.272923]        [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b
[  207.272936]
[  207.272936] other info that might help us debug this:
[  207.272936]
[  207.272952] Chain exists of:
[  207.272952]   subsys mutex --&gt; &amp;mm-&gt;mmap_sem --&gt; cpu_hotplug.lock
[  207.272952]
[  207.272973]  Possible unsafe locking scenario:
[  207.272973]
[  207.272984]        CPU0                    CPU1
[  207.272992]        ----                    ----
[  207.273000]   lock(cpu_hotplug.lock);
[  207.273009]                                lock(&amp;mm-&gt;mmap_sem);
[  207.273020]                                lock(cpu_hotplug.lock);
[  207.273031]   lock(subsys mutex);
[  207.273040]
[  207.273040]  *** DEADLOCK ***
[  207.273040]
[  207.273055] 5 locks held by bash/10493:
[  207.273062]  #0:  (&amp;buffer-&gt;mutex){+.+.+.}, at: [&lt;ffffffff81209049&gt;] sysfs_write_file+0x49/0x150
[  207.273080]  #1:  (s_active#150){.+.+.+}, at: [&lt;ffffffff812090c2&gt;] sysfs_write_file+0xc2/0x150
[  207.273099]  #2:  (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [&lt;ffffffff81027557&gt;] cpu_hotplug_driver_lock+0x17/0x20
[  207.273121]  #3:  (cpu_add_remove_lock){+.+.+.}, at: [&lt;ffffffff8149911c&gt;] cpu_down+0x2c/0x60
[  207.273140]  #4:  (cpu_hotplug.lock){+.+.+.}, at: [&lt;ffffffff81046ccf&gt;] cpu_hotplug_begin+0x2f/0x60
[  207.273158]
[  207.273158] stack backtrace:
[  207.273170] Pid: 10493, comm: bash Not tainted 3.9.0-rc1-0.7-default+ #34
[  207.273180] Call Trace:
[  207.273192]  [&lt;ffffffff810ab373&gt;] print_circular_bug+0x223/0x310
[  207.273204]  [&lt;ffffffff810ae002&gt;] __lock_acquire+0x13b2/0x15f0
[  207.273216]  [&lt;ffffffff812086b0&gt;] ? sysfs_hash_and_remove+0x60/0xc0
[  207.273227]  [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.273239]  [&lt;ffffffff8134af27&gt;] ? bus_remove_device+0x37/0x1c0
[  207.273251]  [&lt;ffffffff814ad807&gt;] mutex_lock_nested+0x37/0x360
[  207.273263]  [&lt;ffffffff8134af27&gt;] ? bus_remove_device+0x37/0x1c0
[  207.273274]  [&lt;ffffffff812086b0&gt;] ? sysfs_hash_and_remove+0x60/0xc0
[  207.273286]  [&lt;ffffffff8134af27&gt;] bus_remove_device+0x37/0x1c0
[  207.273298]  [&lt;ffffffff81349114&gt;] device_del+0x134/0x1f0
[  207.273309]  [&lt;ffffffff813491f2&gt;] device_unregister+0x22/0x60
[  207.273321]  [&lt;ffffffff814a24ea&gt;] mce_cpu_callback+0x15e/0x1ad
[  207.273332]  [&lt;ffffffff814b6402&gt;] notifier_call_chain+0x72/0x130
[  207.273344]  [&lt;ffffffff81073d6e&gt;] __raw_notifier_call_chain+0xe/0x10
[  207.273356]  [&lt;ffffffff81498f76&gt;] _cpu_down+0x1d6/0x350
[  207.273368]  [&lt;ffffffff81027557&gt;] ? cpu_hotplug_driver_lock+0x17/0x20
[  207.273380]  [&lt;ffffffff81499130&gt;] cpu_down+0x40/0x60
[  207.273391]  [&lt;ffffffff8149cc55&gt;] store_online+0x75/0xe0
[  207.273402]  [&lt;ffffffff813474a0&gt;] dev_attr_store+0x20/0x30
[  207.273413]  [&lt;ffffffff812090d9&gt;] sysfs_write_file+0xd9/0x150
[  207.273425]  [&lt;ffffffff8118e10b&gt;] vfs_write+0xcb/0x130
[  207.273436]  [&lt;ffffffff8118e924&gt;] sys_write+0x64/0xa0
[  207.273447]  [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b

Which reports a false positive deadlock because it sees:
1) load_module -&gt; subsys_interface_register -&gt; mc_device_add (*) -&gt; subsys-&gt;p-&gt;mutex -&gt; link_path_walk -&gt; lookup_slow -&gt; i_mutex
2) sys_write -&gt; _cpu_down -&gt; cpu_hotplug_begin -&gt; cpu_hotplug.lock -&gt; mce_cpu_callback -&gt; mce_device_remove(**) -&gt; device_unregister -&gt; bus_remove_device -&gt; subsys mutex
3) vfs_readdir -&gt; i_mutex -&gt; filldir64 -&gt; might_fault -&gt; might_lock_read(mmap_sem) -&gt; page_fault -&gt; mmap_sem -&gt; drain_all_stock -&gt; cpu_hotplug.lock

but
1) takes cpu_subsys subsys (*) but 2) takes mce_device subsys (**) so
the deadlock is not possible AFAICS.

The fix is quite simple. We can pull the key inside bus_type structure
because they are defined per device so the pointer will be unique as
well. bus_register doesn't need to be a macro anymore so change it
to an inline function. We could get rid of __bus_register as there is no other
caller but maybe somebody will want to use a different key so keep it
around for now.

Reported-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ca22e56d (driver-core: implement 'sysdev' functionality for regular
devices and buses) has introduced bus_register macro with a static
key to distinguish different subsys mutex classes.

This however doesn't work for different subsys which use a common
registering function. One example is subsys_system_register (and
mce_device and cpu_device).

In the end this leads to the following lockdep splat:
[  207.271924] ======================================================
[  207.271932] [ INFO: possible circular locking dependency detected ]
[  207.271942] 3.9.0-rc1-0.7-default+ #34 Not tainted
[  207.271948] -------------------------------------------------------
[  207.271957] bash/10493 is trying to acquire lock:
[  207.271963]  (subsys mutex){+.+.+.}, at: [&lt;ffffffff8134af27&gt;] bus_remove_device+0x37/0x1c0
[  207.271987]
[  207.271987] but task is already holding lock:
[  207.271995]  (cpu_hotplug.lock){+.+.+.}, at: [&lt;ffffffff81046ccf&gt;] cpu_hotplug_begin+0x2f/0x60
[  207.272012]
[  207.272012] which lock already depends on the new lock.
[  207.272012]
[  207.272023]
[  207.272023] the existing dependency chain (in reverse order) is:
[  207.272033]
[  207.272033] -&gt; #4 (cpu_hotplug.lock){+.+.+.}:
[  207.272044]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272056]        [&lt;ffffffff814ad807&gt;] mutex_lock_nested+0x37/0x360
[  207.272069]        [&lt;ffffffff81046ba9&gt;] get_online_cpus+0x29/0x40
[  207.272082]        [&lt;ffffffff81185210&gt;] drain_all_stock+0x30/0x150
[  207.272094]        [&lt;ffffffff811853da&gt;] mem_cgroup_reclaim+0xaa/0xe0
[  207.272104]        [&lt;ffffffff8118775e&gt;] __mem_cgroup_try_charge+0x51e/0xcf0
[  207.272114]        [&lt;ffffffff81188486&gt;] mem_cgroup_charge_common+0x36/0x60
[  207.272125]        [&lt;ffffffff811884da&gt;] mem_cgroup_newpage_charge+0x2a/0x30
[  207.272135]        [&lt;ffffffff81150531&gt;] do_wp_page+0x231/0x830
[  207.272147]        [&lt;ffffffff8115151e&gt;] handle_pte_fault+0x19e/0x8d0
[  207.272157]        [&lt;ffffffff81151da8&gt;] handle_mm_fault+0x158/0x1e0
[  207.272166]        [&lt;ffffffff814b6153&gt;] do_page_fault+0x2a3/0x4e0
[  207.272178]        [&lt;ffffffff814b2578&gt;] page_fault+0x28/0x30
[  207.272189]
[  207.272189] -&gt; #3 (&amp;mm-&gt;mmap_sem){++++++}:
[  207.272199]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272208]        [&lt;ffffffff8114c5ad&gt;] might_fault+0x6d/0x90
[  207.272218]        [&lt;ffffffff811a11e3&gt;] filldir64+0xb3/0x120
[  207.272229]        [&lt;ffffffffa013fc19&gt;] call_filldir+0x89/0x130 [ext3]
[  207.272248]        [&lt;ffffffffa0140377&gt;] ext3_readdir+0x6b7/0x7e0 [ext3]
[  207.272263]        [&lt;ffffffff811a1519&gt;] vfs_readdir+0xa9/0xc0
[  207.272273]        [&lt;ffffffff811a15cb&gt;] sys_getdents64+0x9b/0x110
[  207.272284]        [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b
[  207.272296]
[  207.272296] -&gt; #2 (&amp;type-&gt;i_mutex_dir_key#3){+.+.+.}:
[  207.272309]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272319]        [&lt;ffffffff814ad807&gt;] mutex_lock_nested+0x37/0x360
[  207.272329]        [&lt;ffffffff8119c254&gt;] link_path_walk+0x6f4/0x9a0
[  207.272339]        [&lt;ffffffff8119e7fa&gt;] path_openat+0xba/0x470
[  207.272349]        [&lt;ffffffff8119ecf8&gt;] do_filp_open+0x48/0xa0
[  207.272358]        [&lt;ffffffff8118d81c&gt;] file_open_name+0xdc/0x110
[  207.272369]        [&lt;ffffffff8118d885&gt;] filp_open+0x35/0x40
[  207.272378]        [&lt;ffffffff8135c76e&gt;] _request_firmware+0x52e/0xb20
[  207.272389]        [&lt;ffffffff8135cdd6&gt;] request_firmware+0x16/0x20
[  207.272399]        [&lt;ffffffffa03bdb91&gt;] request_microcode_fw+0x61/0xd0 [microcode]
[  207.272416]        [&lt;ffffffffa03bd554&gt;] microcode_init_cpu+0x104/0x150 [microcode]
[  207.272431]        [&lt;ffffffffa03bd61c&gt;] mc_device_add+0x7c/0xb0 [microcode]
[  207.272444]        [&lt;ffffffff8134a419&gt;] subsys_interface_register+0xc9/0x100
[  207.272457]        [&lt;ffffffffa04fc0f4&gt;] 0xffffffffa04fc0f4
[  207.272472]        [&lt;ffffffff81000202&gt;] do_one_initcall+0x42/0x180
[  207.272485]        [&lt;ffffffff810bbeff&gt;] load_module+0x19df/0x1b70
[  207.272499]        [&lt;ffffffff810bc376&gt;] sys_init_module+0xe6/0x130
[  207.272511]        [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b
[  207.272523]
[  207.272523] -&gt; #1 (umhelper_sem){++++.+}:
[  207.272537]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272548]        [&lt;ffffffff814ae9c4&gt;] down_read+0x34/0x50
[  207.272559]        [&lt;ffffffff81062bff&gt;] usermodehelper_read_trylock+0x4f/0x100
[  207.272575]        [&lt;ffffffff8135c7dd&gt;] _request_firmware+0x59d/0xb20
[  207.272587]        [&lt;ffffffff8135cdd6&gt;] request_firmware+0x16/0x20
[  207.272599]        [&lt;ffffffffa03bdb91&gt;] request_microcode_fw+0x61/0xd0 [microcode]
[  207.272613]        [&lt;ffffffffa03bd554&gt;] microcode_init_cpu+0x104/0x150 [microcode]
[  207.272627]        [&lt;ffffffffa03bd61c&gt;] mc_device_add+0x7c/0xb0 [microcode]
[  207.272641]        [&lt;ffffffff8134a419&gt;] subsys_interface_register+0xc9/0x100
[  207.272654]        [&lt;ffffffffa04fc0f4&gt;] 0xffffffffa04fc0f4
[  207.272666]        [&lt;ffffffff81000202&gt;] do_one_initcall+0x42/0x180
[  207.272678]        [&lt;ffffffff810bbeff&gt;] load_module+0x19df/0x1b70
[  207.272690]        [&lt;ffffffff810bc376&gt;] sys_init_module+0xe6/0x130
[  207.272702]        [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b
[  207.272715]
[  207.272715] -&gt; #0 (subsys mutex){+.+.+.}:
[  207.272729]        [&lt;ffffffff810ae002&gt;] __lock_acquire+0x13b2/0x15f0
[  207.272740]        [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.272751]        [&lt;ffffffff814ad807&gt;] mutex_lock_nested+0x37/0x360
[  207.272763]        [&lt;ffffffff8134af27&gt;] bus_remove_device+0x37/0x1c0
[  207.272775]        [&lt;ffffffff81349114&gt;] device_del+0x134/0x1f0
[  207.272786]        [&lt;ffffffff813491f2&gt;] device_unregister+0x22/0x60
[  207.272798]        [&lt;ffffffff814a24ea&gt;] mce_cpu_callback+0x15e/0x1ad
[  207.272812]        [&lt;ffffffff814b6402&gt;] notifier_call_chain+0x72/0x130
[  207.272824]        [&lt;ffffffff81073d6e&gt;] __raw_notifier_call_chain+0xe/0x10
[  207.272839]        [&lt;ffffffff81498f76&gt;] _cpu_down+0x1d6/0x350
[  207.272851]        [&lt;ffffffff81499130&gt;] cpu_down+0x40/0x60
[  207.272862]        [&lt;ffffffff8149cc55&gt;] store_online+0x75/0xe0
[  207.272874]        [&lt;ffffffff813474a0&gt;] dev_attr_store+0x20/0x30
[  207.272886]        [&lt;ffffffff812090d9&gt;] sysfs_write_file+0xd9/0x150
[  207.272900]        [&lt;ffffffff8118e10b&gt;] vfs_write+0xcb/0x130
[  207.272911]        [&lt;ffffffff8118e924&gt;] sys_write+0x64/0xa0
[  207.272923]        [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b
[  207.272936]
[  207.272936] other info that might help us debug this:
[  207.272936]
[  207.272952] Chain exists of:
[  207.272952]   subsys mutex --&gt; &amp;mm-&gt;mmap_sem --&gt; cpu_hotplug.lock
[  207.272952]
[  207.272973]  Possible unsafe locking scenario:
[  207.272973]
[  207.272984]        CPU0                    CPU1
[  207.272992]        ----                    ----
[  207.273000]   lock(cpu_hotplug.lock);
[  207.273009]                                lock(&amp;mm-&gt;mmap_sem);
[  207.273020]                                lock(cpu_hotplug.lock);
[  207.273031]   lock(subsys mutex);
[  207.273040]
[  207.273040]  *** DEADLOCK ***
[  207.273040]
[  207.273055] 5 locks held by bash/10493:
[  207.273062]  #0:  (&amp;buffer-&gt;mutex){+.+.+.}, at: [&lt;ffffffff81209049&gt;] sysfs_write_file+0x49/0x150
[  207.273080]  #1:  (s_active#150){.+.+.+}, at: [&lt;ffffffff812090c2&gt;] sysfs_write_file+0xc2/0x150
[  207.273099]  #2:  (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [&lt;ffffffff81027557&gt;] cpu_hotplug_driver_lock+0x17/0x20
[  207.273121]  #3:  (cpu_add_remove_lock){+.+.+.}, at: [&lt;ffffffff8149911c&gt;] cpu_down+0x2c/0x60
[  207.273140]  #4:  (cpu_hotplug.lock){+.+.+.}, at: [&lt;ffffffff81046ccf&gt;] cpu_hotplug_begin+0x2f/0x60
[  207.273158]
[  207.273158] stack backtrace:
[  207.273170] Pid: 10493, comm: bash Not tainted 3.9.0-rc1-0.7-default+ #34
[  207.273180] Call Trace:
[  207.273192]  [&lt;ffffffff810ab373&gt;] print_circular_bug+0x223/0x310
[  207.273204]  [&lt;ffffffff810ae002&gt;] __lock_acquire+0x13b2/0x15f0
[  207.273216]  [&lt;ffffffff812086b0&gt;] ? sysfs_hash_and_remove+0x60/0xc0
[  207.273227]  [&lt;ffffffff810ae329&gt;] lock_acquire+0xe9/0x120
[  207.273239]  [&lt;ffffffff8134af27&gt;] ? bus_remove_device+0x37/0x1c0
[  207.273251]  [&lt;ffffffff814ad807&gt;] mutex_lock_nested+0x37/0x360
[  207.273263]  [&lt;ffffffff8134af27&gt;] ? bus_remove_device+0x37/0x1c0
[  207.273274]  [&lt;ffffffff812086b0&gt;] ? sysfs_hash_and_remove+0x60/0xc0
[  207.273286]  [&lt;ffffffff8134af27&gt;] bus_remove_device+0x37/0x1c0
[  207.273298]  [&lt;ffffffff81349114&gt;] device_del+0x134/0x1f0
[  207.273309]  [&lt;ffffffff813491f2&gt;] device_unregister+0x22/0x60
[  207.273321]  [&lt;ffffffff814a24ea&gt;] mce_cpu_callback+0x15e/0x1ad
[  207.273332]  [&lt;ffffffff814b6402&gt;] notifier_call_chain+0x72/0x130
[  207.273344]  [&lt;ffffffff81073d6e&gt;] __raw_notifier_call_chain+0xe/0x10
[  207.273356]  [&lt;ffffffff81498f76&gt;] _cpu_down+0x1d6/0x350
[  207.273368]  [&lt;ffffffff81027557&gt;] ? cpu_hotplug_driver_lock+0x17/0x20
[  207.273380]  [&lt;ffffffff81499130&gt;] cpu_down+0x40/0x60
[  207.273391]  [&lt;ffffffff8149cc55&gt;] store_online+0x75/0xe0
[  207.273402]  [&lt;ffffffff813474a0&gt;] dev_attr_store+0x20/0x30
[  207.273413]  [&lt;ffffffff812090d9&gt;] sysfs_write_file+0xd9/0x150
[  207.273425]  [&lt;ffffffff8118e10b&gt;] vfs_write+0xcb/0x130
[  207.273436]  [&lt;ffffffff8118e924&gt;] sys_write+0x64/0xa0
[  207.273447]  [&lt;ffffffff814bb599&gt;] system_call_fastpath+0x16/0x1b

Which reports a false positive deadlock because it sees:
1) load_module -&gt; subsys_interface_register -&gt; mc_device_add (*) -&gt; subsys-&gt;p-&gt;mutex -&gt; link_path_walk -&gt; lookup_slow -&gt; i_mutex
2) sys_write -&gt; _cpu_down -&gt; cpu_hotplug_begin -&gt; cpu_hotplug.lock -&gt; mce_cpu_callback -&gt; mce_device_remove(**) -&gt; device_unregister -&gt; bus_remove_device -&gt; subsys mutex
3) vfs_readdir -&gt; i_mutex -&gt; filldir64 -&gt; might_fault -&gt; might_lock_read(mmap_sem) -&gt; page_fault -&gt; mmap_sem -&gt; drain_all_stock -&gt; cpu_hotplug.lock

but
1) takes cpu_subsys subsys (*) but 2) takes mce_device subsys (**) so
the deadlock is not possible AFAICS.

The fix is quite simple. We can pull the key inside the bus_type structure
because they are defined per device so the pointer will be unique as
well. bus_register doesn't need to be a macro anymore so change it
to an inline function. We could get rid of __bus_register as there is no
other caller but maybe somebody will want to use a different key so keep
it around for now.

Reported-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver/base: implement subsys_virtual_register()</title>
<updated>2013-03-12T18:36:35+00:00</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-03-12T18:30:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d73ce004225a7b2ed75f4340bb63721d55552265'/>
<id>d73ce004225a7b2ed75f4340bb63721d55552265</id>
<content type='text'>
Kay tells me the most appropriate place to expose workqueues to
userland would be /sys/devices/virtual/workqueues/WQ_NAME which is
symlinked to /sys/bus/workqueue/devices/WQ_NAME and that we're lacking
a way to do that outside of driver core as virtual_device_parent()
isn't exported and there's no interface to conveniently create a
virtual subsystem.

This patch implements subsys_virtual_register() by factoring out
subsys_register() from subsys_system_register() and using it with
virtual_device_parent() as the origin directory.  It's identical to
subsys_system_register() other than the origin directory but we aren't
gonna restrict the device names which should be used under it.

This will be used to expose workqueue attributes to userland.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Kay Sievers &lt;kay.sievers@vrfy.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Kay tells me the most appropriate place to expose workqueues to
userland would be /sys/devices/virtual/workqueues/WQ_NAME which is
symlinked to /sys/bus/workqueue/devices/WQ_NAME and that we're lacking
a way to do that outside of driver core as virtual_device_parent()
isn't exported and there's no interface to conveniently create a
virtual subsystem.

This patch implements subsys_virtual_register() by factoring out
subsys_register() from subsys_system_register() and using it with
virtual_device_parent() as the origin directory.  It's identical to
subsys_system_register() other than the origin directory but we aren't
gonna restrict the device names which should be used under it.

This will be used to expose workqueue attributes to userland.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Kay Sievers &lt;kay.sievers@vrfy.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Driver core: treat unregistered bus_types as having no devices</title>
<updated>2013-02-04T01:55:29+00:00</updated>
<author>
<name>Bjorn Helgaas</name>
<email>bhelgaas@google.com</email>
</author>
<published>2013-01-29T23:44:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=4fa3e78be7e985ca814ce2aa0c09cbee404efcf7'/>
<id>4fa3e78be7e985ca814ce2aa0c09cbee404efcf7</id>
<content type='text'>
A bus_type has a list of devices (klist_devices), but the list and the
subsys_private structure that contains it are not initialized until the
bus_type is registered with bus_register().

The panic/reboot path has fixups that look up devices in pci_bus_type.  If
we panic before registering pci_bus_type, the bus_type exists but the list
does not, so mach_reboot_fixups() trips over a null pointer and panics
again:

    mach_reboot_fixups
      pci_get_device
        ..
          bus_find_device(&amp;pci_bus_type, ...)
            bus-&gt;p is NULL

Joonsoo reported a problem when panicking before PCI was initialized.
I think this patch should be sufficient to replace the patch he posted
here: https://lkml.org/lkml/2012/12/28/75 ("[PATCH] x86, reboot: skip
reboot_fixups in early boot phase")

Reported-by: Joonsoo Kim &lt;js1304@gmail.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: stable &lt;stable@vger.kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A bus_type has a list of devices (klist_devices), but the list and the
subsys_private structure that contains it are not initialized until the
bus_type is registered with bus_register().

The panic/reboot path has fixups that look up devices in pci_bus_type.  If
we panic before registering pci_bus_type, the bus_type exists but the list
does not, so mach_reboot_fixups() trips over a null pointer and panics
again:

    mach_reboot_fixups
      pci_get_device
        ..
          bus_find_device(&amp;pci_bus_type, ...)
            bus-&gt;p is NULL

Joonsoo reported a problem when panicking before PCI was initialized.
I think this patch should be sufficient to replace the patch he posted
here: https://lkml.org/lkml/2012/12/28/75 ("[PATCH] x86, reboot: skip
reboot_fixups in early boot phase")

Reported-by: Joonsoo Kim &lt;js1304@gmail.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: stable &lt;stable@vger.kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: fix possible missing of device probe</title>
<updated>2013-01-17T21:02:11+00:00</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2012-11-19T15:35:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=190888ac01d059e38ffe77a2291d44cafa9016fb'/>
<id>190888ac01d059e38ffe77a2291d44cafa9016fb</id>
<content type='text'>
Inside bus_add_driver(), a device might be added (device_add()) to
the bus, or probed via deferred probe, just after driver_attach()
completes and before
'klist_add_tail(&amp;priv-&gt;knode_bus, &amp;bus-&gt;p-&gt;klist_drivers)',
so the device won't be probed by this driver.

This patch moves the below line

	'klist_add_tail(&amp;priv-&gt;knode_bus, &amp;bus-&gt;p-&gt;klist_drivers)'

before driver_attach() inside bus_add_driver() to fix the
problem.

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Inside bus_add_driver(), a device might be added (device_add()) to
the bus, or probed via deferred probe, just after driver_attach()
completes and before
'klist_add_tail(&amp;priv-&gt;knode_bus, &amp;bus-&gt;p-&gt;klist_drivers)',
so the device won't be probed by this driver.

This patch moves the below line

	'klist_add_tail(&amp;priv-&gt;knode_bus, &amp;bus-&gt;p-&gt;klist_drivers)'

before driver_attach() inside bus_add_driver() to fix the
problem.

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: remove CONFIG_HOTPLUG ifdefs</title>
<updated>2012-11-28T18:33:03+00:00</updated>
<author>
<name>Bill Pemberton</name>
<email>wfp5p@virginia.edu</email>
</author>
<published>2012-11-19T18:19:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=a42d1e31d4a2380cf1bda8e3510ff1859ddf55a5'/>
<id>a42d1e31d4a2380cf1bda8e3510ff1859ddf55a5</id>
<content type='text'>
Remove conditional code based on CONFIG_HOTPLUG being false.  It's
always on now in preparation for it going away as an option.

Signed-off-by: Bill Pemberton &lt;wfp5p@virginia.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove conditional code based on CONFIG_HOTPLUG being false.  It's
always on now in preparation for it going away as an option.

Signed-off-by: Bill Pemberton &lt;wfp5p@virginia.edu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>driver core: move uevent call to driver_register</title>
<updated>2012-07-17T01:04:25+00:00</updated>
<author>
<name>Sebastian Ott</name>
<email>sebott@linux.vnet.ibm.com</email>
</author>
<published>2012-07-02T17:08:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5a7689fd5b4f2094e7a32beae67f290f8619b042'/>
<id>5a7689fd5b4f2094e7a32beae67f290f8619b042</id>
<content type='text'>
Device driver attribute groups are created after userspace is notified
via an add event. Fix this by moving the kobject_uevent call to
driver_register after the attribute groups are added.

Signed-off-by: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Device driver attribute groups are created after userspace is notified
via an add event. Fix this by moving the kobject_uevent call to
driver_register after the attribute groups are added.

Signed-off-by: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Revert "driver core: check start node in klist_iter_init_node"</title>
<updated>2012-04-20T02:17:30+00:00</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2012-04-20T02:17:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7cd9c9bb57476167e83b7780dbc06d1dd601789d'/>
<id>7cd9c9bb57476167e83b7780dbc06d1dd601789d</id>
<content type='text'>
This reverts commit a15d49fd3094cff90e5410ca454a870e0a722fe1 as that
patch broke the build.

Cc: Hannes Reinecke &lt;hare@suse.de&gt;
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This reverts commit a15d49fd3094cff90e5410ca454a870e0a722fe1 as that
patch broke the build.

Cc: Hannes Reinecke &lt;hare@suse.de&gt;
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
