<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/cpuidle, branch linux-6.14.y</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>cpuidle: menu: Avoid discarding useful information</title>
<updated>2025-05-29T09:13:08+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-02-06T14:29:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=886af26f2c64802da948cbe671eb5ca582049aea'/>
<id>886af26f2c64802da948cbe671eb5ca582049aea</id>
<content type='text'>
[ Upstream commit 85975daeaa4d6ec560bfcd354fc9c08ad7f38888 ]

When giving up on making a high-confidence prediction,
get_typical_interval() always returns UINT_MAX which means that the
next idle interval prediction will be based entirely on the time till
the next timer.  However, the information represented by the most
recent intervals may not be completely useless in those cases.

Namely, the largest recent idle interval is an upper bound on the
recently observed idle duration, so it is reasonable to assume that
the next idle duration is unlikely to exceed it.  Moreover, this is
still true after eliminating the suspected outliers if the sample
set still under consideration is at least as large as 50% of the
maximum sample set size.

Accordingly, make get_typical_interval() return the current maximum
recent interval value in that case instead of UINT_MAX.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reported-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/7770672.EvYhyI6sBW@rjwysocki.net
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 85975daeaa4d6ec560bfcd354fc9c08ad7f38888 ]

When giving up on making a high-confidence prediction,
get_typical_interval() always returns UINT_MAX which means that the
next idle interval prediction will be based entirely on the time till
the next timer.  However, the information represented by the most
recent intervals may not be completely useless in those cases.

Namely, the largest recent idle interval is an upper bound on the
recently observed idle duration, so it is reasonable to assume that
the next idle duration is unlikely to exceed it.  Moreover, this is
still true after eliminating the suspected outliers if the sample
set still under consideration is at least as large as 50% of the
maximum sample set size.

Accordingly, make get_typical_interval() return the current maximum
recent interval value in that case instead of UINT_MAX.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reported-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/7770672.EvYhyI6sBW@rjwysocki.net
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Disable branch profiling in noinstr code</title>
<updated>2025-04-20T08:22:25+00:00</updated>
<author>
<name>Josh Poimboeuf</name>
<email>jpoimboe@kernel.org</email>
</author>
<published>2025-03-21T19:53:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d95258d7eda865fb758f65d26f20655fa2b7076b'/>
<id>d95258d7eda865fb758f65d26f20655fa2b7076b</id>
<content type='text'>
[ Upstream commit 2cbb20b008dba39893f0e296dc8ca312f40a9a0e ]

CONFIG_TRACE_BRANCH_PROFILING inserts a call to ftrace_likely_update()
for each use of likely() or unlikely().  That breaks noinstr rules if
the affected function is annotated as noinstr.

Disable branch profiling for files with noinstr functions.  In addition
to some individual files, this also includes the entire arch/x86
subtree, as well as the kernel/entry, drivers/cpuidle, and drivers/idle
directories, all of which are noinstr-heavy.

Due to the nature of how sched binaries are built by combining multiple
.c files into one, branch profiling is disabled more broadly across the
sched code than would otherwise be needed.

This fixes many warnings like the following:

  vmlinux.o: warning: objtool: do_syscall_64+0x40: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: __rdgsbase_inactive+0x33: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: handle_bug.isra.0+0x198: call to ftrace_likely_update() leaves .noinstr.text section
  ...

Reported-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Suggested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: https://lore.kernel.org/r/fb94fc9303d48a5ed370498f54500cc4c338eb6d.1742586676.git.jpoimboe@kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 2cbb20b008dba39893f0e296dc8ca312f40a9a0e ]

CONFIG_TRACE_BRANCH_PROFILING inserts a call to ftrace_likely_update()
for each use of likely() or unlikely().  That breaks noinstr rules if
the affected function is annotated as noinstr.

Disable branch profiling for files with noinstr functions.  In addition
to some individual files, this also includes the entire arch/x86
subtree, as well as the kernel/entry, drivers/cpuidle, and drivers/idle
directories, all of which are noinstr-heavy.

Due to the nature of how sched binaries are built by combining multiple
.c files into one, branch profiling is disabled more broadly across the
sched code than would otherwise be needed.

This fixes many warnings like the following:

  vmlinux.o: warning: objtool: do_syscall_64+0x40: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: __rdgsbase_inactive+0x33: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: handle_bug.isra.0+0x198: call to ftrace_likely_update() leaves .noinstr.text section
  ...

Reported-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Suggested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: https://lore.kernel.org/r/fb94fc9303d48a5ed370498f54500cc4c338eb6d.1742586676.git.jpoimboe@kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: Init cpuidle only for present CPUs</title>
<updated>2025-04-10T12:43:57+00:00</updated>
<author>
<name>Jacky Bai</name>
<email>ping.bai@nxp.com</email>
</author>
<published>2025-03-07T14:55:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=e4cd170e26773259f869728e19df205b0cdd28e9'/>
<id>e4cd170e26773259f869728e19df205b0cdd28e9</id>
<content type='text'>
[ Upstream commit 68cb0139fec8e05b93978dc0ef1bc8df90a86419 ]

for_each_possible_cpu() is currently used to initialize cpuidle
in below cpuidle drivers:
  drivers/cpuidle/cpuidle-arm.c
  drivers/cpuidle/cpuidle-big_little.c
  drivers/cpuidle/cpuidle-psci.c
  drivers/cpuidle/cpuidle-qcom-spm.c
  drivers/cpuidle/cpuidle-riscv-sbi.c

However, in cpu_dev_register_generic(), for_each_present_cpu()
is used to register CPU devices which means the CPU devices are
only registered for present CPUs and not all possible CPUs.

With nosmp or maxcpus=0, only the boot CPU is present, lead
to the failure:

  |  Failed to register cpuidle device for cpu1

Then rollback to cancel all CPUs' cpuidle registration.

Change for_each_possible_cpu() to for_each_present_cpu() in the
above cpuidle drivers to ensure it only registers cpuidle devices
for CPUs that are actually present.

Fixes: b0c69e1214bc ("drivers: base: Use present CPUs in GENERIC_CPU_DEVICES")
Reviewed-by: Dhruva Gole &lt;d-gole@ti.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Tested-by: Yuanjie Yang &lt;quic_yuanjiey@quicinc.com&gt;
Signed-off-by: Jacky Bai &lt;ping.bai@nxp.com&gt;
Reviewed-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt;
Link: https://patch.msgid.link/20250307145547.2784821-1-ping.bai@nxp.com
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
[ Upstream commit 68cb0139fec8e05b93978dc0ef1bc8df90a86419 ]

for_each_possible_cpu() is currently used to initialize cpuidle
in below cpuidle drivers:
  drivers/cpuidle/cpuidle-arm.c
  drivers/cpuidle/cpuidle-big_little.c
  drivers/cpuidle/cpuidle-psci.c
  drivers/cpuidle/cpuidle-qcom-spm.c
  drivers/cpuidle/cpuidle-riscv-sbi.c

However, in cpu_dev_register_generic(), for_each_present_cpu()
is used to register CPU devices which means the CPU devices are
only registered for present CPUs and not all possible CPUs.

With nosmp or maxcpus=0, only the boot CPU is present, lead
to the failure:

  |  Failed to register cpuidle device for cpu1

Then rollback to cancel all CPUs' cpuidle registration.

Change for_each_possible_cpu() to for_each_present_cpu() in the
above cpuidle drivers to ensure it only registers cpuidle devices
for CPUs that are actually present.

Fixes: b0c69e1214bc ("drivers: base: Use present CPUs in GENERIC_CPU_DEVICES")
Reviewed-by: Dhruva Gole &lt;d-gole@ti.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Tested-by: Yuanjie Yang &lt;quic_yuanjiey@quicinc.com&gt;
Signed-off-by: Jacky Bai &lt;ping.bai@nxp.com&gt;
Reviewed-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt;
Link: https://patch.msgid.link/20250307145547.2784821-1-ping.bai@nxp.com
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'pm-6.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm</title>
<updated>2025-01-30T23:10:34+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-01-30T23:10:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=f55b0671e3f90824ac06dc06b988075eb9c6830c'/>
<id>f55b0671e3f90824ac06dc06b988075eb9c6830c</id>
<content type='text'>
Pull more power management updates from Rafael Wysocki:
 "These are mostly fixes on top of the previously merged power
  management material with the addition of some teo cpuidle governor
  updates, some of which may also be regarded as fixes:

   - Add missing error handling for syscore_suspend() to the hibernation
     core code (Wentao Liang)

   - Revert a commit that added unused macros (Andy Shevchenko)

   - Synchronize the runtime PM status of devices that were runtime-
     suspended before a system-wide suspend and need to be resumed
     during the subsequent system-wide resume transition (Rafael
     Wysocki)

   - Clean up the teo cpuidle governor and make the handling of short
     idle intervals in it consistent regardless of the properties of
     idle states supplied by the cpuidle driver (Rafael Wysocki)

   - Fix some boost-related issues in cpufreq (Lifeng Zheng)

   - Fix build issues in the s3c64xx and airoha cpufreq drivers (Viresh
     Kumar)

   - Remove unconditional binding of schedutil governor kthreads to the
     affected CPUs if the cpufreq driver indicates that updates can
     happen from any CPU (Christian Loehle)"

* tag 'pm-6.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: core: Synchronize runtime PM status of parents and children
  cpufreq: airoha: Depends on OF
  PM: Revert "Add EXPORT macros for exporting PM functions"
  PM: hibernate: Add error handling for syscore_suspend()
  cpufreq/schedutil: Only bind threads if needed
  cpufreq: ACPI: Remove set_boost in acpi_cpufreq_cpu_init()
  cpufreq: CPPC: Fix wrong max_freq in policy initialization
  cpufreq: Introduce a more generic way to set default per-policy boost flag
  cpufreq: Fix re-boost issue after hotplugging a CPU
  cpufreq: s3c64xx: Fix compilation warning
  cpuidle: teo: Skip sleep length computation for low latency constraints
  cpuidle: teo: Replace time_span_ns with a flag
  cpuidle: teo: Simplify handling of total events count
  cpuidle: teo: Skip getting the sleep length if wakeups are very frequent
  cpuidle: teo: Simplify counting events used for tick management
  cpuidle: teo: Clarify two code comments
  cpuidle: teo: Drop local variable prev_intercept_idx
  cpuidle: teo: Combine candidate state index checks against 0
  cpuidle: teo: Reorder candidate state index checks
  cpuidle: teo: Rearrange idle state lookup code
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull more power management updates from Rafael Wysocki:
 "These are mostly fixes on top of the previously merged power
  management material with the addition of some teo cpuidle governor
  updates, some of which may also be regarded as fixes:

   - Add missing error handling for syscore_suspend() to the hibernation
     core code (Wentao Liang)

   - Revert a commit that added unused macros (Andy Shevchenko)

   - Synchronize the runtime PM status of devices that were runtime-
     suspended before a system-wide suspend and need to be resumed
     during the subsequent system-wide resume transition (Rafael
     Wysocki)

   - Clean up the teo cpuidle governor and make the handling of short
     idle intervals in it consistent regardless of the properties of
     idle states supplied by the cpuidle driver (Rafael Wysocki)

   - Fix some boost-related issues in cpufreq (Lifeng Zheng)

   - Fix build issues in the s3c64xx and airoha cpufreq drivers (Viresh
     Kumar)

   - Remove unconditional binding of schedutil governor kthreads to the
     affected CPUs if the cpufreq driver indicates that updates can
     happen from any CPU (Christian Loehle)"

* tag 'pm-6.14-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: sleep: core: Synchronize runtime PM status of parents and children
  cpufreq: airoha: Depends on OF
  PM: Revert "Add EXPORT macros for exporting PM functions"
  PM: hibernate: Add error handling for syscore_suspend()
  cpufreq/schedutil: Only bind threads if needed
  cpufreq: ACPI: Remove set_boost in acpi_cpufreq_cpu_init()
  cpufreq: CPPC: Fix wrong max_freq in policy initialization
  cpufreq: Introduce a more generic way to set default per-policy boost flag
  cpufreq: Fix re-boost issue after hotplugging a CPU
  cpufreq: s3c64xx: Fix compilation warning
  cpuidle: teo: Skip sleep length computation for low latency constraints
  cpuidle: teo: Replace time_span_ns with a flag
  cpuidle: teo: Simplify handling of total events count
  cpuidle: teo: Skip getting the sleep length if wakeups are very frequent
  cpuidle: teo: Simplify counting events used for tick management
  cpuidle: teo: Clarify two code comments
  cpuidle: teo: Drop local variable prev_intercept_idx
  cpuidle: teo: Combine candidate state index checks against 0
  cpuidle: teo: Reorder candidate state index checks
  cpuidle: teo: Rearrange idle state lookup code
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm</title>
<updated>2025-01-24T15:43:35+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-01-24T15:43:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=68732c0bf97cf946ad08660203e8eabfea11463e'/>
<id>68732c0bf97cf946ad08660203e8eabfea11463e</id>
<content type='text'>
Pull pmdomain updates from Ulf Hansson:
 "pmdomain core:
   - Add support for naming idlestates through DT

  pmdomain providers:
   - arm: Explicitly request the current state at init for the SCMI PM
     domain
   - mediatek: Add Airoha CPU PM Domain support for CPU frequency
     scaling
   - ti: Add per-device latency constraint management to the ti_sci PM
     domain

  cpuidle-psci:
   - Enable system-wakeup through GENPD_FLAG_ACTIVE_WAKEUP"

* tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: airoha: Fix compilation error with Clang-20 and Thumb2 mode
  pmdomain: arm: scmi_pm_domain: Send an explicit request to set the current state
  pmdomain: airoha: Add Airoha CPU PM Domain support
  pmdomain: ti_sci: handle wake IRQs for IO daisy chain wakeups
  pmdomain: ti_sci: add wakeup constraint management
  pmdomain: ti_sci: add per-device latency constraint management
  pmdomain: imx-gpcv2: Suppress bind attrs
  pmdomain: imx8m[p]-blk-ctrl: Suppress bind attrs
  pmdomain: core: Support naming idle states
  dt-bindings: power: domain-idle-state: Allow idle-state-name
  cpuidle: psci: Activate GENPD_FLAG_ACTIVE_WAKEUP with OSI
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull pmdomain updates from Ulf Hansson:
 "pmdomain core:
   - Add support for naming idlestates through DT

  pmdomain providers:
   - arm: Explicitly request the current state at init for the SCMI PM
     domain
   - mediatek: Add Airoha CPU PM Domain support for CPU frequency
     scaling
   - ti: Add per-device latency constraint management to the ti_sci PM
     domain

  cpuidle-psci:
   - Enable system-wakeup through GENPD_FLAG_ACTIVE_WAKEUP"

* tag 'pmdomain-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: airoha: Fix compilation error with Clang-20 and Thumb2 mode
  pmdomain: arm: scmi_pm_domain: Send an explicit request to set the current state
  pmdomain: airoha: Add Airoha CPU PM Domain support
  pmdomain: ti_sci: handle wake IRQs for IO daisy chain wakeups
  pmdomain: ti_sci: add wakeup constraint management
  pmdomain: ti_sci: add per-device latency constraint management
  pmdomain: imx-gpcv2: Suppress bind attrs
  pmdomain: imx8m[p]-blk-ctrl: Suppress bind attrs
  pmdomain: core: Support naming idle states
  dt-bindings: power: domain-idle-state: Allow idle-state-name
  cpuidle: psci: Activate GENPD_FLAG_ACTIVE_WAKEUP with OSI
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: teo: Skip sleep length computation for low latency constraints</title>
<updated>2025-01-20T16:19:53+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-01-20T16:08:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=16c8d7586c196cddcc8822a946ef03c9cfabae30'/>
<id>16c8d7586c196cddcc8822a946ef03c9cfabae30</id>
<content type='text'>
If the idle state exit latency constraint is sufficiently low, it
is better to avoid the additional latency related to calling
tick_nohz_get_sleep_length().  It is also not necessary to compute
the sleep length in that case because shallow idle state selection
will be forced then regardless of the recent wakeup history.

Accordingly, skip the sleep length computation and subsequent
checks of the exit latency constraint is low enough.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/6122398.lOV4Wx5bFT@rjwysocki.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the idle state exit latency constraint is sufficiently low, it
is better to avoid the additional latency related to calling
tick_nohz_get_sleep_length().  It is also not necessary to compute
the sleep length in that case because shallow idle state selection
will be forced then regardless of the recent wakeup history.

Accordingly, skip the sleep length computation and subsequent
checks of the exit latency constraint is low enough.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/6122398.lOV4Wx5bFT@rjwysocki.net
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: teo: Replace time_span_ns with a flag</title>
<updated>2025-01-20T16:18:02+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-01-13T18:51:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=65e18e6544751ac25dc284794566ee90d65a379e'/>
<id>65e18e6544751ac25dc284794566ee90d65a379e</id>
<content type='text'>
After recent updates, the time_span_ns field in struct teo_cpu has
become an indicator on whether or not the most recent wakeup has been
"genuine" which may as well be indicated by a bool field without
calling local_clock(), so update the code accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/6010475.MhkbZ0Pkbq@rjwysocki.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After recent updates, the time_span_ns field in struct teo_cpu has
become an indicator on whether or not the most recent wakeup has been
"genuine" which may as well be indicated by a bool field without
calling local_clock(), so update the code accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/6010475.MhkbZ0Pkbq@rjwysocki.net
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: teo: Simplify handling of total events count</title>
<updated>2025-01-20T16:17:56+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-01-13T18:50:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=ddcfa7964677b1298712edb931a98ac25ffd2fb6'/>
<id>ddcfa7964677b1298712edb931a98ac25ffd2fb6</id>
<content type='text'>
Instead of computing the total events count from scratch every time,
decay it and add a PULSE value to it in teo_update().

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/9388883.CDJkKcVGEf@rjwysocki.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Instead of computing the total events count from scratch every time,
decay it and add a PULSE value to it in teo_update().

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/9388883.CDJkKcVGEf@rjwysocki.net
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: teo: Skip getting the sleep length if wakeups are very frequent</title>
<updated>2025-01-20T16:17:02+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-01-13T18:48:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=13ed5c4a6d9c91755227b7b0fab7b2543f6adfd2'/>
<id>13ed5c4a6d9c91755227b7b0fab7b2543f6adfd2</id>
<content type='text'>
Commit 6da8f9ba5a87 ("cpuidle: teo: Skip tick_nohz_get_sleep_length()
call in some cases") attempted to reduce the governor overhead in some
cases by making it avoid obtaining the sleep length (the time till the
next timer event) which may be costly.

Among other things, after the above commit, tick_nohz_get_sleep_length()
was not called any more when idle state 0 was to be returned, which
turned out to be problematic and the previous behavior in that respect
was restored by commit 4b20b07ce72f ("cpuidle: teo: Don't count non-
existent intercepts").

However, commit 6da8f9ba5a87 also caused the governor to avoid calling
tick_nohz_get_sleep_length() on systems where idle state 0 is a "polling"
one (that is, it is not really an idle state, but a loop continuously
executed by the CPU) when the target residency of the idle state to be
returned was low enough, so there was no practical need to refine the
idle state selection in any way.  This change was not removed by the
other commit, so now on systems where idle state 0 is a "polling" one,
tick_nohz_get_sleep_length() is called when idle state 0 is to be
returned, but it is not called when a deeper idle state with
sufficiently low target residency is to be returned.  That is arguably
confusing and inconsistent.

Moreover, there is no specific reason why the behavior in question
should depend on whether or not idle state 0 is a "polling" one.

One way to address this would be to make the governor always call
tick_nohz_get_sleep_length() to obtain the sleep length, but that would
effectively mean reverting commit 6da8f9ba5a87 and restoring the latency
issue that was the reason for doing it.  This approach is thus not
particularly attractive.

To address it differently, notice that if a CPU is woken up very often,
this is not likely to be caused by timers in the first place (user space
has a default timer slack of 50 us and there are relatively few timers
with a deadline shorter than several microseconds in the kernel) and
even if it were the case, the potential benefit from using a deep idle
state would then be questionable for latency reasons.  Therefore, if the
majority of CPU wakeups occur within several microseconds, it can be
assumed that all wakeups in that range are non-timer and the sleep
length need not be determined.

Accordingly, introduce a new metric for counting wakeups with the
measured idle duration below RESIDENCY_THRESHOLD_NS and modify the idle
state selection to skip the tick_nohz_get_sleep_length() invocation if
idle state 0 has been selected or the target residency of the candidate
idle state is below RESIDENCY_THRESHOLD_NS and the value of the new
metric is at least 1/2 of the total event count.

Since the above requires the measured idle duration to be determined
every time, except for the cases when one of the safety nets has
triggered in which the wakeup is counted as a hit in the deepest
idle state idle residency range, update the handling of those cases
to avoid skipping the idle duration computation when the CPU wakeup
is "genuine".

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Link: https://patch.msgid.link/3851791.kQq0lBPeGt@rjwysocki.net
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
[ rjw: Renamed a struct field ]
[ rjw: Fixed typo in the subject and one in a comment ]
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 6da8f9ba5a87 ("cpuidle: teo: Skip tick_nohz_get_sleep_length()
call in some cases") attempted to reduce the governor overhead in some
cases by making it avoid obtaining the sleep length (the time till the
next timer event) which may be costly.

Among other things, after the above commit, tick_nohz_get_sleep_length()
was not called any more when idle state 0 was to be returned, which
turned out to be problematic and the previous behavior in that respect
was restored by commit 4b20b07ce72f ("cpuidle: teo: Don't count non-
existent intercepts").

However, commit 6da8f9ba5a87 also caused the governor to avoid calling
tick_nohz_get_sleep_length() on systems where idle state 0 is a "polling"
one (that is, it is not really an idle state, but a loop continuously
executed by the CPU) when the target residency of the idle state to be
returned was low enough, so there was no practical need to refine the
idle state selection in any way.  This change was not removed by the
other commit, so now on systems where idle state 0 is a "polling" one,
tick_nohz_get_sleep_length() is called when idle state 0 is to be
returned, but it is not called when a deeper idle state with
sufficiently low target residency is to be returned.  That is arguably
confusing and inconsistent.

Moreover, there is no specific reason why the behavior in question
should depend on whether or not idle state 0 is a "polling" one.

One way to address this would be to make the governor always call
tick_nohz_get_sleep_length() to obtain the sleep length, but that would
effectively mean reverting commit 6da8f9ba5a87 and restoring the latency
issue that was the reason for doing it.  This approach is thus not
particularly attractive.

To address it differently, notice that if a CPU is woken up very often,
this is not likely to be caused by timers in the first place (user space
has a default timer slack of 50 us and there are relatively few timers
with a deadline shorter than several microseconds in the kernel) and
even if it were the case, the potential benefit from using a deep idle
state would then be questionable for latency reasons.  Therefore, if the
majority of CPU wakeups occur within several microseconds, it can be
assumed that all wakeups in that range are non-timer and the sleep
length need not be determined.

Accordingly, introduce a new metric for counting wakeups with the
measured idle duration below RESIDENCY_THRESHOLD_NS and modify the idle
state selection to skip the tick_nohz_get_sleep_length() invocation if
idle state 0 has been selected or the target residency of the candidate
idle state is below RESIDENCY_THRESHOLD_NS and the value of the new
metric is at least 1/2 of the total event count.

Since the above requires the measured idle duration to be determined
every time, except for the cases when one of the safety nets has
triggered in which the wakeup is counted as a hit in the deepest
idle state idle residency range, update the handling of those cases
to avoid skipping the idle duration computation when the CPU wakeup
is "genuine".

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Link: https://patch.msgid.link/3851791.kQq0lBPeGt@rjwysocki.net
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
[ rjw: Renamed a struct field ]
[ rjw: Fixed typo in the subject and one in a comment ]
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: teo: Simplify counting events used for tick management</title>
<updated>2025-01-20T16:16:53+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-01-13T18:45:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=d619b5cc678024fa5ed7eb3702c3991a2aa96823'/>
<id>d619b5cc678024fa5ed7eb3702c3991a2aa96823</id>
<content type='text'>
Replace the tick_hits metric with a new tick_intercepts one that can be
used directly when deciding whether or not to stop the scheduler tick
and update the governor functional description accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/1987985.PYKUYFuaPT@rjwysocki.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Replace the tick_hits metric with a new tick_intercepts one that can be
used directly when deciding whether or not to stop the scheduler tick
and update the governor functional description accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/1987985.PYKUYFuaPT@rjwysocki.net
</pre>
</div>
</content>
</entry>
</feed>
