<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/thermal, branch v5.4.2</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>cpufreq: Use per-policy frequency QoS</title>
<updated>2019-10-21T00:05:21+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2019-10-16T10:47:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3000ce3c52f8b8db093e4dc649cd172390f71137'/>
<id>3000ce3c52f8b8db093e4dc649cd172390f71137</id>
<content type='text'>
Replace the CPU device PM QoS used for the management of min and max
frequency constraints in cpufreq (and its users) with per-policy
frequency QoS to avoid problems with cpufreq policies covering
more then one CPU.

Namely, a cpufreq driver is registered with the subsys interface
which calls cpufreq_add_dev() for each CPU, starting from CPU0, so
currently the PM QoS notifiers are added to the first CPU in the
policy (i.e. CPU0 in the majority of cases).

In turn, when the cpufreq driver is unregistered, the subsys interface
doing that calls cpufreq_remove_dev() for each CPU, starting from CPU0,
and the PM QoS notifiers are only removed when cpufreq_remove_dev() is
called for the last CPU in the policy, say CPUx, which as a rule is
not CPU0 if the policy covers more than one CPU.  Then, the PM QoS
notifiers cannot be removed, because CPUx does not have them, and
they are still there in the device PM QoS notifiers list of CPU0,
which prevents new PM QoS notifiers from being registered for CPU0
on the next attempt to register the cpufreq driver.

The same issue occurs when the first CPU in the policy goes offline
before unregistering the driver.

After this change it does not matter which CPU is the policy CPU at
the driver registration time and whether or not it is online all the
time, because the frequency QoS is per policy and not per CPU.

Fixes: 67d874c3b2c6 ("cpufreq: Register notifiers with the PM QoS framework")
Reported-by: Dmitry Osipenko &lt;digetx@gmail.com&gt;
Tested-by: Dmitry Osipenko &lt;digetx@gmail.com&gt;
Reported-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Tested-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Diagnosed-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Link: https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
Link: https://lore.kernel.org/linux-pm/20191017094612.6tbkwoq4harsjcqv@vireshk-i7/T/#m30d48cc23b9a80467fbaa16e30f90b3828a5a29b
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace the CPU device PM QoS used for the management of min and max
frequency constraints in cpufreq (and its users) with per-policy
frequency QoS to avoid problems with cpufreq policies covering
more then one CPU.

Namely, a cpufreq driver is registered with the subsys interface
which calls cpufreq_add_dev() for each CPU, starting from CPU0, so
currently the PM QoS notifiers are added to the first CPU in the
policy (i.e. CPU0 in the majority of cases).

In turn, when the cpufreq driver is unregistered, the subsys interface
doing that calls cpufreq_remove_dev() for each CPU, starting from CPU0,
and the PM QoS notifiers are only removed when cpufreq_remove_dev() is
called for the last CPU in the policy, say CPUx, which as a rule is
not CPU0 if the policy covers more than one CPU.  Then, the PM QoS
notifiers cannot be removed, because CPUx does not have them, and
they are still there in the device PM QoS notifiers list of CPU0,
which prevents new PM QoS notifiers from being registered for CPU0
on the next attempt to register the cpufreq driver.

The same issue occurs when the first CPU in the policy goes offline
before unregistering the driver.

After this change it does not matter which CPU is the policy CPU at
the driver registration time and whether or not it is online all the
time, because the frequency QoS is per policy and not per CPU.

Fixes: 67d874c3b2c6 ("cpufreq: Register notifiers with the PM QoS framework")
Reported-by: Dmitry Osipenko &lt;digetx@gmail.com&gt;
Tested-by: Dmitry Osipenko &lt;digetx@gmail.com&gt;
Reported-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Tested-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Diagnosed-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Link: https://lore.kernel.org/linux-pm/5ad2624194baa2f53acc1f1e627eb7684c577a19.1562210705.git.viresh.kumar@linaro.org/T/#md2d89e95906b8c91c15f582146173dce2e86e99f
Link: https://lore.kernel.org/linux-pm/20191017094612.6tbkwoq4harsjcqv@vireshk-i7/T/#m30d48cc23b9a80467fbaa16e30f90b3828a5a29b
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal</title>
<updated>2019-09-29T17:24:23+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-09-29T17:24:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=939ca9f1751d1d65424f80b9284b6c18e78c7f4e'/>
<id>939ca9f1751d1d65424f80b9284b6c18e78c7f4e</id>
<content type='text'>
Pull thermal SoC updates from Eduardo Valentin:
 "This is a really small pull in the midst of a lot of pending patches.

  We are in the middle of restructuring how we are maintaining the
  thermal subsystem, as per discussion in our last LPC. For now, I am
  sending just some changes that were pending in my tree. Looking
  forward to get a more streamlined process in the next merge window"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: db8500: Rewrite to be a pure OF sensor
  thermal: db8500: Use dev helper variable
  thermal: db8500: Finalize device tree conversion
  thermal: thermal_mmio: remove some dead code
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull thermal SoC updates from Eduardo Valentin:
 "This is a really small pull in the midst of a lot of pending patches.

  We are in the middle of restructuring how we are maintaining the
  thermal subsystem, as per discussion in our last LPC. For now, I am
  sending just some changes that were pending in my tree. Looking
  forward to get a more streamlined process in the next merge window"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
  thermal: db8500: Rewrite to be a pure OF sensor
  thermal: db8500: Use dev helper variable
  thermal: db8500: Finalize device tree conversion
  thermal: thermal_mmio: remove some dead code
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux</title>
<updated>2019-09-27T18:35:13+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-09-27T18:35:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d0e00bc5ada53bda296ce8bfffc2f2be9eb22632'/>
<id>d0e00bc5ada53bda296ce8bfffc2f2be9eb22632</id>
<content type='text'>
Pull thermal management updates from Zhang Rui:

 - Add Amit Kucheria as thermal subsystem Reviewer (Amit Kucheria)

 - Fix a use after free bug when unregistering thermal zone devices (Ido
   Schimmel)

 - Fix thermal core framework to use put_device() when device_register()
   fails (Yue Hu)

 - Enable intel_pch_thermal and MMIO RAPL support for Intel Icelake
   platform (Srinivas Pandruvada)

 - Add clock operations in qorip thermal driver, for some platforms with
   clock control like i.MX8MQ (Anson Huang)

 - A couple of trivial fixes and cleanups for thermal core and different
   soc thermal drivers (Amit Kucheria, Christophe JAILLET, Chuhong Yuan,
   Fuqian Huang, Kelsey Skunberg, Nathan Huckleberry, Rishi Gupta,
   Srinivas Kandagatla)

* 'for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  MAINTAINERS: Add Amit Kucheria as reviewer for thermal
  thermal: Add some error messages
  thermal: Fix use-after-free when unregistering thermal zone device
  thermal/drivers/core: Use put_device() if device_register() fails
  thermal_hwmon: Sanitize thermal_zone type
  thermal: intel: Use dev_get_drvdata
  thermal: intel: int3403: replace printk(KERN_WARN...) with pr_warn(...)
  thermal: intel: int340x_thermal: Remove unnecessary acpi_has_method() uses
  thermal: int340x: processor_thermal: Add Ice Lake support
  drivers: thermal: qcom: tsens: Fix memory leak from qfprom read
  thermal: tegra: Fix a typo
  thermal: rcar_gen3_thermal: Replace devm_add_action() followed by failure action with devm_add_action_or_reset()
  thermal: armada: Fix -Wshift-negative-value
  dt-bindings: thermal: qoriq: Add optional clocks property
  thermal: qoriq: Use __maybe_unused instead of #if CONFIG_PM_SLEEP
  thermal: qoriq: Use devm_platform_ioremap_resource() instead of of_iomap()
  thermal: qoriq: Fix error path of calling qoriq_tmu_register_tmu_zone fail
  thermal: qoriq: Add clock operations
  drivers: thermal: processor_thermal_device: Export sysfs interface for TCC offset
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull thermal management updates from Zhang Rui:

 - Add Amit Kucheria as thermal subsystem Reviewer (Amit Kucheria)

 - Fix a use after free bug when unregistering thermal zone devices (Ido
   Schimmel)

 - Fix thermal core framework to use put_device() when device_register()
   fails (Yue Hu)

 - Enable intel_pch_thermal and MMIO RAPL support for Intel Icelake
   platform (Srinivas Pandruvada)

 - Add clock operations in qorip thermal driver, for some platforms with
   clock control like i.MX8MQ (Anson Huang)

 - A couple of trivial fixes and cleanups for thermal core and different
   soc thermal drivers (Amit Kucheria, Christophe JAILLET, Chuhong Yuan,
   Fuqian Huang, Kelsey Skunberg, Nathan Huckleberry, Rishi Gupta,
   Srinivas Kandagatla)

* 'for-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  MAINTAINERS: Add Amit Kucheria as reviewer for thermal
  thermal: Add some error messages
  thermal: Fix use-after-free when unregistering thermal zone device
  thermal/drivers/core: Use put_device() if device_register() fails
  thermal_hwmon: Sanitize thermal_zone type
  thermal: intel: Use dev_get_drvdata
  thermal: intel: int3403: replace printk(KERN_WARN...) with pr_warn(...)
  thermal: intel: int340x_thermal: Remove unnecessary acpi_has_method() uses
  thermal: int340x: processor_thermal: Add Ice Lake support
  drivers: thermal: qcom: tsens: Fix memory leak from qfprom read
  thermal: tegra: Fix a typo
  thermal: rcar_gen3_thermal: Replace devm_add_action() followed by failure action with devm_add_action_or_reset()
  thermal: armada: Fix -Wshift-negative-value
  dt-bindings: thermal: qoriq: Add optional clocks property
  thermal: qoriq: Use __maybe_unused instead of #if CONFIG_PM_SLEEP
  thermal: qoriq: Use devm_platform_ioremap_resource() instead of of_iomap()
  thermal: qoriq: Fix error path of calling qoriq_tmu_register_tmu_zone fail
  thermal: qoriq: Add clock operations
  drivers: thermal: processor_thermal_device: Export sysfs interface for TCC offset
</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: db8500: Rewrite to be a pure OF sensor</title>
<updated>2019-09-25T05:59:22+00:00</updated>
<author>
<name>Linus Walleij</name>
<email>linus.walleij@linaro.org</email>
</author>
<published>2019-08-28T13:03:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=6c375eccded41df8033ed55a1b785531b304fc67'/>
<id>6c375eccded41df8033ed55a1b785531b304fc67</id>
<content type='text'>
This patch rewrites the DB8500 thermal sensor to be a
pure OF sensor, so that it can be used with thermal zones
defined in the device tree.

This driver was initially merged before we had generic
thermal zone device tree bindings, and now it gets
modernized to the way we do things these days.

The old driver depended on a set of trigger points
provided in the device tree or platform data to
interpolate the current temperature between trigger
points depending on whether the trend was rising or
falling. This was bad because the trigger points should
be used for defining temperature zone policies and
bind to cooling devices.

As the PRCMU (power reset control management unit) can
only issue IRQs when we pass temperature trigger points
upward or downward We instead define a number of
temperature points inside the driver ranging from
15 to 100 degrees celsius. The effect is that when
we register the device we quickly trigger 15, 20 ... up
to the room temperature in succession and then we
get continous event IRQs also under normal operating
conditions, and the temperature of the system is now
reported more accurately (+/- 2.5 degrees celsius)
while in the past the first trigger point was at 70
degrees and the average temperature was simply reported
as 35 degrees celsius (between 70 degrees and 0) until
we passed 70 degrees which didn't accurately represent
the temperature of the system.

As a result of dropping all the trigger points from the
driver and reusing the core DT thermal zone management
code we reduce the code footprint quite a bit.

Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Suggested-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Signed-off-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Reviewed-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This patch rewrites the DB8500 thermal sensor to be a
pure OF sensor, so that it can be used with thermal zones
defined in the device tree.

This driver was initially merged before we had generic
thermal zone device tree bindings, and now it gets
modernized to the way we do things these days.

The old driver depended on a set of trigger points
provided in the device tree or platform data to
interpolate the current temperature between trigger
points depending on whether the trend was rising or
falling. This was bad because the trigger points should
be used for defining temperature zone policies and
bind to cooling devices.

As the PRCMU (power reset control management unit) can
only issue IRQs when we pass temperature trigger points
upward or downward We instead define a number of
temperature points inside the driver ranging from
15 to 100 degrees celsius. The effect is that when
we register the device we quickly trigger 15, 20 ... up
to the room temperature in succession and then we
get continous event IRQs also under normal operating
conditions, and the temperature of the system is now
reported more accurately (+/- 2.5 degrees celsius)
while in the past the first trigger point was at 70
degrees and the average temperature was simply reported
as 35 degrees celsius (between 70 degrees and 0) until
we passed 70 degrees which didn't accurately represent
the temperature of the system.

As a result of dropping all the trigger points from the
driver and reusing the core DT thermal zone management
code we reduce the code footprint quite a bit.

Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Suggested-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Signed-off-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Reviewed-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: db8500: Use dev helper variable</title>
<updated>2019-09-25T05:58:46+00:00</updated>
<author>
<name>Linus Walleij</name>
<email>linus.walleij@linaro.org</email>
</author>
<published>2019-08-28T13:03:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3de9e4dff889531d517c8808898972e96fa507b2'/>
<id>3de9e4dff889531d517c8808898972e96fa507b2</id>
<content type='text'>
The code gets easier to read like this.

Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Reviewed-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Signed-off-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The code gets easier to read like this.

Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Reviewed-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Signed-off-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: db8500: Finalize device tree conversion</title>
<updated>2019-09-25T05:54:49+00:00</updated>
<author>
<name>Linus Walleij</name>
<email>linus.walleij@linaro.org</email>
</author>
<published>2019-08-28T13:03:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=cb063a83ca321fbf0cb2b4044186f241d89f3dc1'/>
<id>cb063a83ca321fbf0cb2b4044186f241d89f3dc1</id>
<content type='text'>
At some point there was an attempt to convert the DB8500
thermal sensor to device tree: a probe path was added
and the device tree was augmented for the Snowball board.
The switchover was never completed: instead the thermal
devices came from from the PRCMU MFD device and the probe
on the Snowball was confused as another set of configuration
appeared from the device tree.

Move over to a device-tree only approach, as we fixed up
the device trees.

Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Acked-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Reviewed-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Signed-off-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At some point there was an attempt to convert the DB8500
thermal sensor to device tree: a probe path was added
and the device tree was augmented for the Snowball board.
The switchover was never completed: instead the thermal
devices came from from the PRCMU MFD device and the probe
on the Snowball was confused as another set of configuration
appeared from the device tree.

Move over to a device-tree only approach, as we fixed up
the device trees.

Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Acked-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Reviewed-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Signed-off-by: Linus Walleij &lt;linus.walleij@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branches 'thermal-core', 'thermal-intel' and 'thermal-soc' into for-5.4</title>
<updated>2019-09-24T01:56:37+00:00</updated>
<author>
<name>Zhang Rui</name>
<email>rui.zhang@intel.com</email>
</author>
<published>2019-09-24T01:56:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=0f84d1d18c46d0f995962c876c8b2900fd183fd7'/>
<id>0f84d1d18c46d0f995962c876c8b2900fd183fd7</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: Add some error messages</title>
<updated>2019-09-24T01:56:08+00:00</updated>
<author>
<name>Amit Kucheria</name>
<email>amit.kucheria@linaro.org</email>
</author>
<published>2019-07-11T20:31:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=67eed44b8a8ae7ca1a1a77c64d2c4815f00e361b'/>
<id>67eed44b8a8ae7ca1a1a77c64d2c4815f00e361b</id>
<content type='text'>
When registering a thermal zone device, we currently return -EINVAL in
four cases. This makes it a little hard to debug the real cause of the
failure.

Print some error messages to make it easier for developer to figure out
what happened.

Signed-off-by: Amit Kucheria &lt;amit.kucheria@linaro.org&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When registering a thermal zone device, we currently return -EINVAL in
four cases. This makes it a little hard to debug the real cause of the
failure.

Print some error messages to make it easier for developer to figure out
what happened.

Signed-off-by: Amit Kucheria &lt;amit.kucheria@linaro.org&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: Fix use-after-free when unregistering thermal zone device</title>
<updated>2019-09-24T01:56:08+00:00</updated>
<author>
<name>Ido Schimmel</name>
<email>idosch@mellanox.com</email>
</author>
<published>2019-07-10T10:14:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1851799e1d2978f68eea5d9dff322e121dcf59c1'/>
<id>1851799e1d2978f68eea5d9dff322e121dcf59c1</id>
<content type='text'>
thermal_zone_device_unregister() cancels the delayed work that polls the
thermal zone, but it does not wait for it to finish. This is racy with
respect to the freeing of the thermal zone device, which can result in a
use-after-free [1].

Fix this by waiting for the delayed work to finish before freeing the
thermal zone device. Note that thermal_zone_device_set_polling() is
never invoked from an atomic context, so it is safe to call
cancel_delayed_work_sync() that can block.

[1]
[  +0.002221] ==================================================================
[  +0.000064] BUG: KASAN: use-after-free in __mutex_lock+0x1076/0x11c0
[  +0.000016] Read of size 8 at addr ffff8881e48e0450 by task kworker/1:0/17

[  +0.000023] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.2.0-rc6-custom-02495-g8e73ca3be4af #1701
[  +0.000010] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[  +0.000016] Workqueue: events_freezable_power_ thermal_zone_device_check
[  +0.000012] Call Trace:
[  +0.000021]  dump_stack+0xa9/0x10e
[  +0.000020]  print_address_description.cold.2+0x9/0x25e
[  +0.000018]  __kasan_report.cold.3+0x78/0x9d
[  +0.000016]  kasan_report+0xe/0x20
[  +0.000016]  __mutex_lock+0x1076/0x11c0
[  +0.000014]  step_wise_throttle+0x72/0x150
[  +0.000018]  handle_thermal_trip+0x167/0x760
[  +0.000019]  thermal_zone_device_update+0x19e/0x5f0
[  +0.000019]  process_one_work+0x969/0x16f0
[  +0.000017]  worker_thread+0x91/0xc40
[  +0.000014]  kthread+0x33d/0x400
[  +0.000015]  ret_from_fork+0x3a/0x50

[  +0.000020] Allocated by task 1:
[  +0.000015]  save_stack+0x19/0x80
[  +0.000015]  __kasan_kmalloc.constprop.4+0xc1/0xd0
[  +0.000014]  kmem_cache_alloc_trace+0x152/0x320
[  +0.000015]  thermal_zone_device_register+0x1b4/0x13a0
[  +0.000015]  mlxsw_thermal_init+0xc92/0x23d0
[  +0.000014]  __mlxsw_core_bus_device_register+0x659/0x11b0
[  +0.000013]  mlxsw_core_bus_device_register+0x3d/0x90
[  +0.000013]  mlxsw_pci_probe+0x355/0x4b0
[  +0.000014]  local_pci_probe+0xc3/0x150
[  +0.000013]  pci_device_probe+0x280/0x410
[  +0.000013]  really_probe+0x26a/0xbb0
[  +0.000013]  driver_probe_device+0x208/0x2e0
[  +0.000013]  device_driver_attach+0xfe/0x140
[  +0.000013]  __driver_attach+0x110/0x310
[  +0.000013]  bus_for_each_dev+0x14b/0x1d0
[  +0.000013]  driver_register+0x1c0/0x400
[  +0.000015]  mlxsw_sp_module_init+0x5d/0xd3
[  +0.000014]  do_one_initcall+0x239/0x4dd
[  +0.000013]  kernel_init_freeable+0x42b/0x4e8
[  +0.000012]  kernel_init+0x11/0x18b
[  +0.000013]  ret_from_fork+0x3a/0x50

[  +0.000015] Freed by task 581:
[  +0.000013]  save_stack+0x19/0x80
[  +0.000014]  __kasan_slab_free+0x125/0x170
[  +0.000013]  kfree+0xf3/0x310
[  +0.000013]  thermal_release+0xc7/0xf0
[  +0.000014]  device_release+0x77/0x200
[  +0.000014]  kobject_put+0x1a8/0x4c0
[  +0.000014]  device_unregister+0x38/0xc0
[  +0.000014]  thermal_zone_device_unregister+0x54e/0x6a0
[  +0.000014]  mlxsw_thermal_fini+0x184/0x35a
[  +0.000014]  mlxsw_core_bus_device_unregister+0x10a/0x640
[  +0.000013]  mlxsw_devlink_core_bus_device_reload+0x92/0x210
[  +0.000015]  devlink_nl_cmd_reload+0x113/0x1f0
[  +0.000014]  genl_family_rcv_msg+0x700/0xee0
[  +0.000013]  genl_rcv_msg+0xca/0x170
[  +0.000013]  netlink_rcv_skb+0x137/0x3a0
[  +0.000012]  genl_rcv+0x29/0x40
[  +0.000013]  netlink_unicast+0x49b/0x660
[  +0.000013]  netlink_sendmsg+0x755/0xc90
[  +0.000013]  __sys_sendto+0x3de/0x430
[  +0.000013]  __x64_sys_sendto+0xe2/0x1b0
[  +0.000013]  do_syscall_64+0xa4/0x4d0
[  +0.000013]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  +0.000017] The buggy address belongs to the object at ffff8881e48e0008
               which belongs to the cache kmalloc-2k of size 2048
[  +0.000012] The buggy address is located 1096 bytes inside of
               2048-byte region [ffff8881e48e0008, ffff8881e48e0808)
[  +0.000007] The buggy address belongs to the page:
[  +0.000012] page:ffffea0007923800 refcount:1 mapcount:0 mapping:ffff88823680d0c0 index:0x0 compound_mapcount: 0
[  +0.000020] flags: 0x200000000010200(slab|head)
[  +0.000019] raw: 0200000000010200 ffffea0007682008 ffffea00076ab808 ffff88823680d0c0
[  +0.000016] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
[  +0.000007] page dumped because: kasan: bad access detected

[  +0.000012] Memory state around the buggy address:
[  +0.000012]  ffff8881e48e0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012]  ffff8881e48e0380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012] &gt;ffff8881e48e0400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000008]                                                  ^
[  +0.000012]  ffff8881e48e0480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012]  ffff8881e48e0500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000007] ==================================================================

Fixes: b1569e99c795 ("ACPI: move thermal trip handling to generic thermal layer")
Reported-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
thermal_zone_device_unregister() cancels the delayed work that polls the
thermal zone, but it does not wait for it to finish. This is racy with
respect to the freeing of the thermal zone device, which can result in a
use-after-free [1].

Fix this by waiting for the delayed work to finish before freeing the
thermal zone device. Note that thermal_zone_device_set_polling() is
never invoked from an atomic context, so it is safe to call
cancel_delayed_work_sync() that can block.

[1]
[  +0.002221] ==================================================================
[  +0.000064] BUG: KASAN: use-after-free in __mutex_lock+0x1076/0x11c0
[  +0.000016] Read of size 8 at addr ffff8881e48e0450 by task kworker/1:0/17

[  +0.000023] CPU: 1 PID: 17 Comm: kworker/1:0 Not tainted 5.2.0-rc6-custom-02495-g8e73ca3be4af #1701
[  +0.000010] Hardware name: Mellanox Technologies Ltd. MSN2100-CB2FO/SA001017, BIOS 5.6.5 06/07/2016
[  +0.000016] Workqueue: events_freezable_power_ thermal_zone_device_check
[  +0.000012] Call Trace:
[  +0.000021]  dump_stack+0xa9/0x10e
[  +0.000020]  print_address_description.cold.2+0x9/0x25e
[  +0.000018]  __kasan_report.cold.3+0x78/0x9d
[  +0.000016]  kasan_report+0xe/0x20
[  +0.000016]  __mutex_lock+0x1076/0x11c0
[  +0.000014]  step_wise_throttle+0x72/0x150
[  +0.000018]  handle_thermal_trip+0x167/0x760
[  +0.000019]  thermal_zone_device_update+0x19e/0x5f0
[  +0.000019]  process_one_work+0x969/0x16f0
[  +0.000017]  worker_thread+0x91/0xc40
[  +0.000014]  kthread+0x33d/0x400
[  +0.000015]  ret_from_fork+0x3a/0x50

[  +0.000020] Allocated by task 1:
[  +0.000015]  save_stack+0x19/0x80
[  +0.000015]  __kasan_kmalloc.constprop.4+0xc1/0xd0
[  +0.000014]  kmem_cache_alloc_trace+0x152/0x320
[  +0.000015]  thermal_zone_device_register+0x1b4/0x13a0
[  +0.000015]  mlxsw_thermal_init+0xc92/0x23d0
[  +0.000014]  __mlxsw_core_bus_device_register+0x659/0x11b0
[  +0.000013]  mlxsw_core_bus_device_register+0x3d/0x90
[  +0.000013]  mlxsw_pci_probe+0x355/0x4b0
[  +0.000014]  local_pci_probe+0xc3/0x150
[  +0.000013]  pci_device_probe+0x280/0x410
[  +0.000013]  really_probe+0x26a/0xbb0
[  +0.000013]  driver_probe_device+0x208/0x2e0
[  +0.000013]  device_driver_attach+0xfe/0x140
[  +0.000013]  __driver_attach+0x110/0x310
[  +0.000013]  bus_for_each_dev+0x14b/0x1d0
[  +0.000013]  driver_register+0x1c0/0x400
[  +0.000015]  mlxsw_sp_module_init+0x5d/0xd3
[  +0.000014]  do_one_initcall+0x239/0x4dd
[  +0.000013]  kernel_init_freeable+0x42b/0x4e8
[  +0.000012]  kernel_init+0x11/0x18b
[  +0.000013]  ret_from_fork+0x3a/0x50

[  +0.000015] Freed by task 581:
[  +0.000013]  save_stack+0x19/0x80
[  +0.000014]  __kasan_slab_free+0x125/0x170
[  +0.000013]  kfree+0xf3/0x310
[  +0.000013]  thermal_release+0xc7/0xf0
[  +0.000014]  device_release+0x77/0x200
[  +0.000014]  kobject_put+0x1a8/0x4c0
[  +0.000014]  device_unregister+0x38/0xc0
[  +0.000014]  thermal_zone_device_unregister+0x54e/0x6a0
[  +0.000014]  mlxsw_thermal_fini+0x184/0x35a
[  +0.000014]  mlxsw_core_bus_device_unregister+0x10a/0x640
[  +0.000013]  mlxsw_devlink_core_bus_device_reload+0x92/0x210
[  +0.000015]  devlink_nl_cmd_reload+0x113/0x1f0
[  +0.000014]  genl_family_rcv_msg+0x700/0xee0
[  +0.000013]  genl_rcv_msg+0xca/0x170
[  +0.000013]  netlink_rcv_skb+0x137/0x3a0
[  +0.000012]  genl_rcv+0x29/0x40
[  +0.000013]  netlink_unicast+0x49b/0x660
[  +0.000013]  netlink_sendmsg+0x755/0xc90
[  +0.000013]  __sys_sendto+0x3de/0x430
[  +0.000013]  __x64_sys_sendto+0xe2/0x1b0
[  +0.000013]  do_syscall_64+0xa4/0x4d0
[  +0.000013]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

[  +0.000017] The buggy address belongs to the object at ffff8881e48e0008
               which belongs to the cache kmalloc-2k of size 2048
[  +0.000012] The buggy address is located 1096 bytes inside of
               2048-byte region [ffff8881e48e0008, ffff8881e48e0808)
[  +0.000007] The buggy address belongs to the page:
[  +0.000012] page:ffffea0007923800 refcount:1 mapcount:0 mapping:ffff88823680d0c0 index:0x0 compound_mapcount: 0
[  +0.000020] flags: 0x200000000010200(slab|head)
[  +0.000019] raw: 0200000000010200 ffffea0007682008 ffffea00076ab808 ffff88823680d0c0
[  +0.000016] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
[  +0.000007] page dumped because: kasan: bad access detected

[  +0.000012] Memory state around the buggy address:
[  +0.000012]  ffff8881e48e0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012]  ffff8881e48e0380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012] &gt;ffff8881e48e0400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000008]                                                  ^
[  +0.000012]  ffff8881e48e0480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000012]  ffff8881e48e0500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  +0.000007] ==================================================================

Fixes: b1569e99c795 ("ACPI: move thermal trip handling to generic thermal layer")
Reported-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>thermal/drivers/core: Use put_device() if device_register() fails</title>
<updated>2019-09-24T01:56:08+00:00</updated>
<author>
<name>Yue Hu</name>
<email>huyue2@yulong.com</email>
</author>
<published>2019-08-07T03:01:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=adc8749b150c51e857ac248017971371f70de197'/>
<id>adc8749b150c51e857ac248017971371f70de197</id>
<content type='text'>
Never directly free @dev after calling device_register(), even if it
returned an error! Always use put_device() to give up the reference
initialized. Clean up the rollback block also.

Signed-off-by: Yue Hu &lt;huyue2@yulong.com&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Never directly free @dev after calling device_register(), even if it
returned an error! Always use put_device() to give up the reference
initialized. Clean up the rollback block also.

Signed-off-by: Yue Hu &lt;huyue2@yulong.com&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
