<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/thermal, branch v4.9.78</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>thermal/drivers/hisi: Fix multiple alarm interrupts firing</title>
<updated>2017-12-25T13:23:46+00:00</updated>
<author>
<name>Daniel Lezcano</name>
<email>daniel.lezcano@linaro.org</email>
</author>
<published>2017-10-19T17:05:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=3cff90788e289eb9b24aa4d6c9de78a689402b0d'/>
<id>3cff90788e289eb9b24aa4d6c9de78a689402b0d</id>
<content type='text'>
commit db2b0332608c8e648ea1e44727d36ad37cdb56cb upstream.

The DT specifies a threshold of 65000, we setup the register with a value in
the temperature resolution for the controller, 64656.

When we reach 64656, the interrupt fires, the interrupt is disabled. Then the
irq thread runs and calls thermal_zone_device_update() which will call in turn
hisi_thermal_get_temp().

The function will look if the temperature decreased, assuming it was more than
65000, but that is not the case because the current temperature is 64656
(because of the rounding when setting the threshold). This condition being
true, we re-enable the interrupt which fires immediately after exiting the irq
thread. That happens again and again until the temperature goes to more than
65000.

Potentially, there is here an interrupt storm if the temperature stabilizes at
this temperature. A very unlikely case but possible.

In any case, it does not make sense to handle dozens of alarm interrupt for
nothing.

Fix this by rounding the threshold value to the controller resolution so the
check against the threshold is consistent with the one set in the controller.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit db2b0332608c8e648ea1e44727d36ad37cdb56cb upstream.

The DT specifies a threshold of 65000, we setup the register with a value in
the temperature resolution for the controller, 64656.

When we reach 64656, the interrupt fires, the interrupt is disabled. Then the
irq thread runs and calls thermal_zone_device_update() which will call in turn
hisi_thermal_get_temp().

The function will look if the temperature decreased, assuming it was more than
65000, but that is not the case because the current temperature is 64656
(because of the rounding when setting the threshold). This condition being
true, we re-enable the interrupt which fires immediately after exiting the irq
thread. That happens again and again until the temperature goes to more than
65000.

Potentially, there is here an interrupt storm if the temperature stabilizes at
this temperature. A very unlikely case but possible.

In any case, it does not make sense to handle dozens of alarm interrupt for
nothing.

Fix this by rounding the threshold value to the controller resolution so the
check against the threshold is consistent with the one set in the controller.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>thermal/drivers/hisi: Simplify the temperature/step computation</title>
<updated>2017-12-25T13:23:46+00:00</updated>
<author>
<name>Daniel Lezcano</name>
<email>daniel.lezcano@linaro.org</email>
</author>
<published>2017-10-19T17:05:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=1b2c46a6be45a36c64230cc23859e2113ab1cc49'/>
<id>1b2c46a6be45a36c64230cc23859e2113ab1cc49</id>
<content type='text'>
commit 48880b979cdc9ef5a70af020f42b8ba1e51dbd34 upstream.

The step and the base temperature are fixed values, we can simplify the
computation by converting the base temperature to milli celsius and use a
pre-computed step value. That saves us a lot of mult + div for nothing at
runtime.

Take also the opportunity to change the function names to be consistent with
the rest of the code.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 48880b979cdc9ef5a70af020f42b8ba1e51dbd34 upstream.

The step and the base temperature are fixed values, we can simplify the
computation by converting the base temperature to milli celsius and use a
pre-computed step value. That saves us a lot of mult + div for nothing at
runtime.

Take also the opportunity to change the function names to be consistent with
the rest of the code.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>thermal/drivers/hisi: Fix kernel panic on alarm interrupt</title>
<updated>2017-12-25T13:23:46+00:00</updated>
<author>
<name>Daniel Lezcano</name>
<email>daniel.lezcano@linaro.org</email>
</author>
<published>2017-10-19T17:05:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2dac559df962c064d13a14610e81535c55fcf33d'/>
<id>2dac559df962c064d13a14610e81535c55fcf33d</id>
<content type='text'>
commit 2cb4de785c40d4a2132cfc13e63828f5a28c3351 upstream.

The threaded interrupt for the alarm interrupt is requested before the
temperature controller is setup. This one can fire an interrupt immediately
leading to a kernel panic as the sensor data is not initialized.

In order to prevent that, move the threaded irq after the Tsensor is setup.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 2cb4de785c40d4a2132cfc13e63828f5a28c3351 upstream.

The threaded interrupt for the alarm interrupt is requested before the
temperature controller is setup. This one can fire an interrupt immediately
leading to a kernel panic as the sensor data is not initialized.

In order to prevent that, move the threaded irq after the Tsensor is setup.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>thermal/drivers/hisi: Fix missing interrupt enablement</title>
<updated>2017-12-25T13:23:46+00:00</updated>
<author>
<name>Daniel Lezcano</name>
<email>daniel.lezcano@linaro.org</email>
</author>
<published>2017-10-19T17:05:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=b679b8d7bad06765dde62c1bc2f378d5461928f3'/>
<id>b679b8d7bad06765dde62c1bc2f378d5461928f3</id>
<content type='text'>
commit c176b10b025acee4dc8f2ab1cd64eb73b5ccef53 upstream.

The interrupt for the temperature threshold is not enabled at the end of the
probe function, enable it after the setup is complete.

On the other side, the irq_enabled is not correctly set as we are checking if
the interrupt is masked where 'yes' means irq_enabled=false.

	irq_get_irqchip_state(data-&gt;irq, IRQCHIP_STATE_MASKED,
				&amp;data-&gt;irq_enabled);

As we are always enabling the interrupt, it is pointless to check if
the interrupt is masked or not, just set irq_enabled to 'true'.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c176b10b025acee4dc8f2ab1cd64eb73b5ccef53 upstream.

The interrupt for the temperature threshold is not enabled at the end of the
probe function, enable it after the setup is complete.

On the other side, the irq_enabled is not correctly set as we are checking if
the interrupt is masked where 'yes' means irq_enabled=false.

	irq_get_irqchip_state(data-&gt;irq, IRQCHIP_STATE_MASKED,
				&amp;data-&gt;irq_enabled);

As we are always enabling the interrupt, it is pointless to check if
the interrupt is masked or not, just set irq_enabled to 'true'.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Reviewed-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: hisilicon: Handle return value of clk_prepare_enable</title>
<updated>2017-12-25T13:23:46+00:00</updated>
<author>
<name>Arvind Yadav</name>
<email>arvind.yadav.cs@gmail.com</email>
</author>
<published>2017-06-06T09:34:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=82bf76afa8affad676ba6442bd4656f842312988'/>
<id>82bf76afa8affad676ba6442bd4656f842312988</id>
<content type='text'>
commit 919054fdfc8adf58c5512fe9872eb53ea0f5525d upstream.

clk_prepare_enable() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav &lt;arvind.yadav.cs@gmail.com&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 919054fdfc8adf58c5512fe9872eb53ea0f5525d upstream.

clk_prepare_enable() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav &lt;arvind.yadav.cs@gmail.com&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Kevin Wangtao &lt;kevin.wangtao@hisilicon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>thermal/drivers/step_wise: Fix temperature regulation misbehavior</title>
<updated>2017-12-20T09:07:30+00:00</updated>
<author>
<name>Daniel Lezcano</name>
<email>daniel.lezcano@linaro.org</email>
</author>
<published>2017-10-19T17:05:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d8914530f247900885850af7cccf81a2b33907b9'/>
<id>d8914530f247900885850af7cccf81a2b33907b9</id>
<content type='text'>
[ Upstream commit 07209fcf33542c1ff1e29df2dbdf8f29cdaacb10 ]

There is a particular situation when the cooling device is cpufreq and the heat
dissipation is not efficient enough where the temperature increases little by
little until reaching the critical threshold and leading to a SoC reset.

The behavior is reproducible on a hikey6220 with bad heat dissipation (eg.
stacked with other boards).

Running a simple C program doing while(1); for each CPU of the SoC makes the
temperature to reach the passive regulation trip point and ends up to the
maximum allowed temperature followed by a reset.

This issue has been also reported by running the libhugetlbfs test suite.

What is observed is a ping pong between two cpu frequencies, 1.2GHz and 900MHz
while the temperature continues to grow.

It appears the step wise governor calls get_target_state() the first time with
the throttle set to true and the trend to 'raising'. The code selects logically
the next state, so the cpu frequency decreases from 1.2GHz to 900MHz, so far so
good. The temperature decreases immediately but still stays greater than the
trip point, then get_target_state() is called again, this time with the
throttle set to true *and* the trend to 'dropping'. From there the algorithm
assumes we have to step down the state and the cpu frequency jumps back to
1.2GHz. But the temperature is still higher than the trip point, so
get_target_state() is called with throttle=1 and trend='raising' again, we jump
to 900MHz, then get_target_state() is called with throttle=1 and
trend='dropping', we jump to 1.2GHz, etc ... but the temperature does not
stabilizes and continues to increase.

[  237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[  237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[  237.922690] thermal cooling_device0: cur_state=0
[  237.922701] thermal cooling_device0: old_target=0, target=1
[  238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[  238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[  238.026694] thermal cooling_device0: cur_state=1
[  238.026707] thermal cooling_device0: old_target=1, target=0
[  238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[  238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[  238.134679] thermal cooling_device0: cur_state=0
[  238.134690] thermal cooling_device0: old_target=0, target=1

In this situation the temperature continues to increase while the trend is
oscillating between 'dropping' and 'raising'. We need to keep the current state
untouched if the throttle is set, so the temperature can decrease or a higher
state could be selected, thus preventing this oscillation.

Keeping the next_target untouched when 'throttle' is true at 'dropping' time
fixes the issue.

The following traces show the governor does not change the next state if
trend==2 (dropping) and throttle==1.

[ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2306.128021] thermal cooling_device0: cur_state=0
[ 2306.128031] thermal cooling_device0: old_target=0, target=1
[ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[ 2306.232030] thermal cooling_device0: cur_state=1
[ 2306.232042] thermal cooling_device0: old_target=1, target=1
[ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1
[ 2306.336021] thermal cooling_device0: cur_state=1
[ 2306.336034] thermal cooling_device0: old_target=1, target=1
[ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2306.440022] thermal cooling_device0: cur_state=1
[ 2306.440034] thermal cooling_device0: old_target=1, target=0

[ ... ]

After a while, if the temperature continues to increase, the next state becomes
2 which is 720MHz on the hikey. That results in the temperature stabilizing
around the trip point.

[ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0
[ 2455.832019] thermal cooling_device0: cur_state=1
[ 2455.832032] thermal cooling_device0: old_target=1, target=1
[ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2455.936027] thermal cooling_device0: cur_state=1
[ 2455.936040] thermal cooling_device0: old_target=1, target=1
[ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2456.044023] thermal cooling_device0: cur_state=1
[ 2456.044036] thermal cooling_device0: old_target=1, target=1
[ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2456.148042] thermal cooling_device0: cur_state=1
[ 2456.148055] thermal cooling_device0: old_target=1, target=2
[ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2456.252058] thermal cooling_device0: cur_state=2
[ 2456.252075] thermal cooling_device0: old_target=2, target=1

IOW, this change is needed to keep the state for a cooling device if the
temperature trend is oscillating while the temperature increases slightly.

Without this change, the situation above leads to a catastrophic crash by a
hardware reset on hikey. This issue has been reported to happen on an OMAP
dra7xx also.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: Keerthy &lt;j-keerthy@ti.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Keerthy &lt;j-keerthy@ti.com&gt;
Reviewed-by: Keerthy &lt;j-keerthy@ti.com&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 07209fcf33542c1ff1e29df2dbdf8f29cdaacb10 ]

There is a particular situation when the cooling device is cpufreq and the heat
dissipation is not efficient enough where the temperature increases little by
little until reaching the critical threshold and leading to a SoC reset.

The behavior is reproducible on a hikey6220 with bad heat dissipation (eg.
stacked with other boards).

Running a simple C program doing while(1); for each CPU of the SoC makes the
temperature to reach the passive regulation trip point and ends up to the
maximum allowed temperature followed by a reset.

This issue has been also reported by running the libhugetlbfs test suite.

What is observed is a ping pong between two cpu frequencies, 1.2GHz and 900MHz
while the temperature continues to grow.

It appears the step wise governor calls get_target_state() the first time with
the throttle set to true and the trend to 'raising'. The code selects logically
the next state, so the cpu frequency decreases from 1.2GHz to 900MHz, so far so
good. The temperature decreases immediately but still stays greater than the
trip point, then get_target_state() is called again, this time with the
throttle set to true *and* the trend to 'dropping'. From there the algorithm
assumes we have to step down the state and the cpu frequency jumps back to
1.2GHz. But the temperature is still higher than the trip point, so
get_target_state() is called with throttle=1 and trend='raising' again, we jump
to 900MHz, then get_target_state() is called with throttle=1 and
trend='dropping', we jump to 1.2GHz, etc ... but the temperature does not
stabilizes and continues to increase.

[  237.922654] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[  237.922678] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[  237.922690] thermal cooling_device0: cur_state=0
[  237.922701] thermal cooling_device0: old_target=0, target=1
[  238.026656] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[  238.026680] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[  238.026694] thermal cooling_device0: cur_state=1
[  238.026707] thermal cooling_device0: old_target=1, target=0
[  238.134647] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[  238.134667] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[  238.134679] thermal cooling_device0: cur_state=0
[  238.134690] thermal cooling_device0: old_target=0, target=1

In this situation the temperature continues to increase while the trend is
oscillating between 'dropping' and 'raising'. We need to keep the current state
untouched if the throttle is set, so the temperature can decrease or a higher
state could be selected, thus preventing this oscillation.

Keeping the next_target untouched when 'throttle' is true at 'dropping' time
fixes the issue.

The following traces show the governor does not change the next state if
trend==2 (dropping) and throttle==1.

[ 2306.127987] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2306.128009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2306.128021] thermal cooling_device0: cur_state=0
[ 2306.128031] thermal cooling_device0: old_target=0, target=1
[ 2306.231991] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.232016] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=1
[ 2306.232030] thermal cooling_device0: cur_state=1
[ 2306.232042] thermal cooling_device0: old_target=1, target=1
[ 2306.335982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2306.336006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=1
[ 2306.336021] thermal cooling_device0: cur_state=1
[ 2306.336034] thermal cooling_device0: old_target=1, target=1
[ 2306.439984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2306.440008] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2306.440022] thermal cooling_device0: cur_state=1
[ 2306.440034] thermal cooling_device0: old_target=1, target=0

[ ... ]

After a while, if the temperature continues to increase, the next state becomes
2 which is 720MHz on the hikey. That results in the temperature stabilizing
around the trip point.

[ 2455.831982] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2455.832006] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=0
[ 2455.832019] thermal cooling_device0: cur_state=1
[ 2455.832032] thermal cooling_device0: old_target=1, target=1
[ 2455.935985] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2455.936013] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2455.936027] thermal cooling_device0: cur_state=1
[ 2455.936040] thermal cooling_device0: old_target=1, target=1
[ 2456.043984] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=0,throttle=1
[ 2456.044009] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=0,throttle=0
[ 2456.044023] thermal cooling_device0: cur_state=1
[ 2456.044036] thermal cooling_device0: old_target=1, target=1
[ 2456.148001] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=1,throttle=1
[ 2456.148028] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=1,throttle=1
[ 2456.148042] thermal cooling_device0: cur_state=1
[ 2456.148055] thermal cooling_device0: old_target=1, target=2
[ 2456.252009] thermal thermal_zone0: Trip0[type=1,temp=65000]:trend=2,throttle=1
[ 2456.252041] thermal thermal_zone0: Trip1[type=1,temp=75000]:trend=2,throttle=0
[ 2456.252058] thermal cooling_device0: cur_state=2
[ 2456.252075] thermal cooling_device0: old_target=2, target=1

IOW, this change is needed to keep the state for a cooling device if the
temperature trend is oscillating while the temperature increases slightly.

Without this change, the situation above leads to a catastrophic crash by a
hardware reset on hikey. This issue has been reported to happen on an OMAP
dra7xx also.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: Keerthy &lt;j-keerthy@ti.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Tested-by: Keerthy &lt;j-keerthy@ti.com&gt;
Reviewed-by: Keerthy &lt;j-keerthy@ti.com&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: cpu_cooling: Avoid accessing potentially freed structures</title>
<updated>2017-07-27T22:07:55+00:00</updated>
<author>
<name>Viresh Kumar</name>
<email>viresh.kumar@linaro.org</email>
</author>
<published>2017-04-25T10:27:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7cd7b56037ae4cbdfe5e7fa9e41b67ba15199d7f'/>
<id>7cd7b56037ae4cbdfe5e7fa9e41b67ba15199d7f</id>
<content type='text'>
commit 289d72afddf83440117c35d864bf0c6309c1d011 upstream.

After the lock is dropped, it is possible that the cpufreq_dev gets
freed before we call get_level() and that can cause kernel to crash.

Drop the lock after we are done using the structure.

Fixes: 02373d7c69b4 ("thermal: cpu_cooling: fix lockdep problems in cpu_cooling")
Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Reviewed-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Tested-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit 289d72afddf83440117c35d864bf0c6309c1d011 upstream.

After the lock is dropped, it is possible that the cpufreq_dev gets
freed before we call get_level() and that can cause kernel to crash.

Drop the lock after we are done using the structure.

Fixes: 02373d7c69b4 ("thermal: cpu_cooling: fix lockdep problems in cpu_cooling")
Signed-off-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Reviewed-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Tested-by: Lukasz Luba &lt;lukasz.luba@arm.com&gt;
Signed-off-by: Eduardo Valentin &lt;edubezval@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: max77620: fix device-node reference imbalance</title>
<updated>2017-07-27T22:07:55+00:00</updated>
<author>
<name>Johan Hovold</name>
<email>johan@kernel.org</email>
</author>
<published>2017-06-06T15:59:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=76572609e45895826f4bf49c376271dc62f125dc'/>
<id>76572609e45895826f4bf49c376271dc62f125dc</id>
<content type='text'>
commit c592fafbdbb6b1279b76a54722d1465ca77e5bde upstream.

The thermal child device reuses the parent MFD-device device-tree node
when registering a thermal zone, but did not take a reference to the
node.

This leads to a reference imbalance, and potential use-after-free, when
the node reference is dropped by the platform-bus device destructor
(once for the child and later again for the parent).

Fix this by dropping any reference already held to a device-tree node
and getting a reference to the parent's node which will be balanced on
reprobe or on platform-device release, whichever comes first.

Note that simply clearing the of_node pointer on probe errors and on
driver unbind would not allow the use of device-managed resources as
specifically thermal_zone_of_sensor_unregister() claims that a valid
device-tree node pointer is needed during deregistration (even if it
currently does not seem to use it).

Fixes: ec4664b3fd6d ("thermal: max77620: Add thermal driver for reporting junction temp")
Cc: Laxman Dewangan &lt;ldewangan@nvidia.com&gt;
Signed-off-by: Johan Hovold &lt;johan@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit c592fafbdbb6b1279b76a54722d1465ca77e5bde upstream.

The thermal child device reuses the parent MFD-device device-tree node
when registering a thermal zone, but did not take a reference to the
node.

This leads to a reference imbalance, and potential use-after-free, when
the node reference is dropped by the platform-bus device destructor
(once for the child and later again for the parent).

Fix this by dropping any reference already held to a device-tree node
and getting a reference to the parent's node which will be balanced on
reprobe or on platform-device release, whichever comes first.

Note that simply clearing the of_node pointer on probe errors and on
driver unbind would not allow the use of device-managed resources as
specifically thermal_zone_of_sensor_unregister() claims that a valid
device-tree node pointer is needed during deregistration (even if it
currently does not seem to use it).

Fixes: ec4664b3fd6d ("thermal: max77620: Add thermal driver for reporting junction temp")
Cc: Laxman Dewangan &lt;ldewangan@nvidia.com&gt;
Signed-off-by: Johan Hovold &lt;johan@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>thermal: hwmon: Properly report critical temperature in sysfs</title>
<updated>2017-01-09T07:32:18+00:00</updated>
<author>
<name>Krzysztof Kozlowski</name>
<email>krzk@kernel.org</email>
</author>
<published>2016-11-22T17:22:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=fab303ba78ee2bae3da057f750139fc6222c66d5'/>
<id>fab303ba78ee2bae3da057f750139fc6222c66d5</id>
<content type='text'>
commit f37fabb8643eaf8e3b613333a72f683770c85eca upstream.

In the critical sysfs entry the thermal hwmon was returning wrong
temperature to the user-space.  It was reporting the temperature of the
first trip point instead of the temperature of critical trip point.

For example:
	/sys/class/hwmon/hwmon0/temp1_crit:50000
	/sys/class/thermal/thermal_zone0/trip_point_0_temp:50000
	/sys/class/thermal/thermal_zone0/trip_point_0_type:active
	/sys/class/thermal/thermal_zone0/trip_point_3_temp:120000
	/sys/class/thermal/thermal_zone0/trip_point_3_type:critical

Since commit e68b16abd91d ("thermal: add hwmon sysfs I/F") the driver
have been registering a sysfs entry if get_crit_temp() callback was
provided.  However when accessed, it was calling get_trip_temp() instead
of the get_crit_temp().

Fixes: e68b16abd91d ("thermal: add hwmon sysfs I/F")
Signed-off-by: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
commit f37fabb8643eaf8e3b613333a72f683770c85eca upstream.

In the critical sysfs entry the thermal hwmon was returning wrong
temperature to the user-space.  It was reporting the temperature of the
first trip point instead of the temperature of critical trip point.

For example:
	/sys/class/hwmon/hwmon0/temp1_crit:50000
	/sys/class/thermal/thermal_zone0/trip_point_0_temp:50000
	/sys/class/thermal/thermal_zone0/trip_point_0_type:active
	/sys/class/thermal/thermal_zone0/trip_point_3_temp:120000
	/sys/class/thermal/thermal_zone0/trip_point_3_type:critical

Since commit e68b16abd91d ("thermal: add hwmon sysfs I/F") the driver
have been registering a sysfs entry if get_crit_temp() callback was
provided.  However when accessed, it was calling get_trip_temp() instead
of the get_crit_temp().

Fixes: e68b16abd91d ("thermal: add hwmon sysfs I/F")
Signed-off-by: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</pre>
</div>
</content>
</entry>
<entry>
<title>thermal/powerclamp: add back module device table</title>
<updated>2016-11-21T12:54:40+00:00</updated>
<author>
<name>Jacob Pan</name>
<email>jacob.jun.pan@linux.intel.com</email>
</author>
<published>2016-11-14T19:08:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ec638db8cb9ddd5ca08b23f2835b6c9c15eb616d'/>
<id>ec638db8cb9ddd5ca08b23f2835b6c9c15eb616d</id>
<content type='text'>
Commit 3105f234e0aba43e44e277c20f9b32ee8add43d4 replaced module
cpu id table with a cpu feature check, which is logically correct.
But we need the module device table to allow module auto loading.

Cc: stable@vger.kernel.org # 4.8
Fixes:3105f234 thermal/powerclamp: correct cpu support check
Signed-off-by: Jacob Pan &lt;jacob.jun.pan@linux.intel.com&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 3105f234e0aba43e44e277c20f9b32ee8add43d4 replaced module
cpu id table with a cpu feature check, which is logically correct.
But we need the module device table to allow module auto loading.

Cc: stable@vger.kernel.org # 4.8
Fixes:3105f234 thermal/powerclamp: correct cpu support check
Signed-off-by: Jacob Pan &lt;jacob.jun.pan@linux.intel.com&gt;
Signed-off-by: Zhang Rui &lt;rui.zhang@intel.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
