<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux-stable.git/drivers/cpuidle, branch v6.15</title>
<subtitle>Linux kernel stable tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/'/>
<entry>
<title>Merge tag 'pmdomain-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm</title>
<updated>2025-03-26T03:40:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-03-26T03:40:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2a2274e90a76c24a03287e386f1ad128a8cc3b67'/>
<id>2a2274e90a76c24a03287e386f1ad128a8cc3b67</id>
<content type='text'>
Pull pmdomain updates from Ulf Hansson:
 "pmdomain core:
   - Add dev_pm_genpd_rpm_always_on() to support more fine-grained PM

  pmdomain providers:
   - arm: Remove redundant state verification for the SCMI PM domain
   - bcm: Add system-wakeup support for bcm2835 via GENPD_FLAG_ACTIVE_WAKEUP
   - rockchip: Add support for regulators
   - rockchip: Use SMC call to properly inform firmware
   - sunxi: Add V853 ppu support
   - thead: Add support for RISC-V TH1520 power-domains

  firmware:
   - Add support for the AON firmware protocol for RISC-V THEAD

  cpuidle-psci:
   - Update section in MAINTAINERS for cpuidle-psci
   - Add trace support for PSCI domain-idlestates"

* tag 'pmdomain-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: (29 commits)
  firmware: thead: add CONFIG_MAILBOX dependency
  firmware: thead,th1520-aon: Fix use after free in th1520_aon_init()
  pmdomain: arm: scmi_pm_domain: Remove redundant state verification
  pmdomain: thead: fix TH1520_AON_PROTOCOL dependency
  pmdomain: thead: Add power-domain driver for TH1520
  dt-bindings: power: Add TH1520 SoC power domains
  firmware: thead: Add AON firmware protocol driver
  dt-bindings: firmware: thead,th1520: Add support for firmware node
  pmdomain: rockchip: add regulator dependency
  pmdomain: rockchip: add regulator support
  pmdomain: rockchip: fix rockchip_pd_power error handling
  pmdomain: rockchip: reduce indentation in rockchip_pd_power
  pmdomain: rockchip: forward rockchip_do_pmu_set_power_domain errors
  pmdomain: rockchip: cleanup mutex handling in rockchip_pd_power
  dt-bindings: power: rockchip: add regulator support
  pmdomain: rockchip: Fix build error
  pmdomain: imx: gpcv2: use proper helper for property detection
  MAINTAINERS: Update section for cpuidle-psci
  pmdomain: rockchip: Check if SMC could be handled by TA
  cpuidle: psci: Add trace for PSCI domain idle
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull pmdomain updates from Ulf Hansson:
 "pmdomain core:
   - Add dev_pm_genpd_rpm_always_on() to support more fine-grained PM

  pmdomain providers:
   - arm: Remove redundant state verification for the SCMI PM domain
   - bcm: Add system-wakeup support for bcm2835 via GENPD_FLAG_ACTIVE_WAKEUP
   - rockchip: Add support for regulators
   - rockchip: Use SMC call to properly inform firmware
   - sunxi: Add V853 ppu support
   - thead: Add support for RISC-V TH1520 power-domains

  firmware:
   - Add support for the AON firmware protocol for RISC-V THEAD

  cpuidle-psci:
   - Update section in MAINTAINERS for cpuidle-psci
   - Add trace support for PSCI domain-idlestates"

* tag 'pmdomain-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: (29 commits)
  firmware: thead: add CONFIG_MAILBOX dependency
  firmware: thead,th1520-aon: Fix use after free in th1520_aon_init()
  pmdomain: arm: scmi_pm_domain: Remove redundant state verification
  pmdomain: thead: fix TH1520_AON_PROTOCOL dependency
  pmdomain: thead: Add power-domain driver for TH1520
  dt-bindings: power: Add TH1520 SoC power domains
  firmware: thead: Add AON firmware protocol driver
  dt-bindings: firmware: thead,th1520: Add support for firmware node
  pmdomain: rockchip: add regulator dependency
  pmdomain: rockchip: add regulator support
  pmdomain: rockchip: fix rockchip_pd_power error handling
  pmdomain: rockchip: reduce indentation in rockchip_pd_power
  pmdomain: rockchip: forward rockchip_do_pmu_set_power_domain errors
  pmdomain: rockchip: cleanup mutex handling in rockchip_pd_power
  dt-bindings: power: rockchip: add regulator support
  pmdomain: rockchip: Fix build error
  pmdomain: imx: gpcv2: use proper helper for property detection
  MAINTAINERS: Update section for cpuidle-psci
  pmdomain: rockchip: Check if SMC could be handled by TA
  cpuidle: psci: Add trace for PSCI domain idle
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm</title>
<updated>2025-03-25T22:00:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-03-25T22:00:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=7d20aa5c32ac8bd272b5470ddbd7ac6e0cb35714'/>
<id>7d20aa5c32ac8bd272b5470ddbd7ac6e0cb35714</id>
<content type='text'>
Pull power management updates from Rafael Wysocki:
 "These are dominated by cpufreq updates which in turn are dominated by
  updates related to boost support in the core and drivers and
  amd-pstate driver optimizations.

  Apart from the above, there are some cpuidle updates including a
  rework of the most recent idle intervals handling in the venerable
  menu governor that leads to significant improvements in some
  performance benchmarks, as the governor is now more likely to predict
  a shorter idle duration in some cases, and there are updates of the
  core device power management code, mostly related to system suspend
  and resume, that should help to avoid potential issues arising when
  the drivers of devices depending on one another want to use different
  optimizations.

  There is also a usual collection of assorted fixes and cleanups,
  including removal of some unused code.

  Specifics:

   - Manage sysfs attributes and boost frequencies efficiently from
     cpufreq core to reduce boilerplate code in drivers (Viresh Kumar)

   - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
     Dhananjay Ugwekar, Imran Shaik, zuoqian)

   - Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
     Bai)

   - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski)

   - Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng)

   - Optimize the amd-pstate driver to avoid cases where call paths end
     up calling the same writes multiple times and needlessly caching
     variables through code reorganization, locking overhaul and tracing
     adjustments (Mario Limonciello, Dhananjay Ugwekar)

   - Make it possible to avoid enabling capacity-aware scheduling (CAS)
     in the intel_pstate driver and relocate a check for out-of-band
     (OOB) platform handling in it to make it detect OOB before checking
     HWP availability (Rafael Wysocki)

   - Fix dbs_update() to avoid inadvertent conversions of negative
     integer values to unsigned int which causes CPU frequency selection
     to be inaccurate in some cases when the "conservative" cpufreq
     governor is in use (Jie Zhan)

   - Update the handling of the most recent idle intervals in the menu
     cpuidle governor to prevent useful information from being discarded
     by it in some cases and improve the prediction accuracy (Rafael
     Wysocki)

   - Make it possible to tell the intel_idle driver to ignore its
     built-in table of idle states for the given processor, clean up the
     handling of auto-demotion disabling on Baytrail and Cherrytrail
     chips in it, and update its MAINTAINERS entry (David Arcari, Artem
     Bityutskiy, Rafael Wysocki)

   - Make some cpuidle drivers use for_each_present_cpu() instead of
     for_each_possible_cpu() during initialization to avoid issues
     occurring when nosmp or maxcpus=0 are used (Jacky Bai)

   - Clean up the Energy Model handling code somewhat (Rafael Wysocki)

   - Use kfree_rcu() to simplify the handling of runtime Energy Model
     updates (Li RongQing)

   - Add an entry for the Energy Model framework to MAINTAINERS as
     properly maintained (Lukasz Luba)

   - Address RCU-related sparse warnings in the Energy Model code
     (Rafael Wysocki)

   - Remove ENERGY_MODEL dependency on SMP and allow it to be selected
     when DEVFREQ is set without CPUFREQ so it can be used on a wider
     range of systems (Jeson Gao)

   - Unify error handling during runtime suspend and runtime resume in
     the core to help drivers to implement more consistent runtime PM
     error handling (Rafael Wysocki)

   - Drop a redundant check from pm_runtime_force_resume() and rearrange
     documentation related to __pm_runtime_disable() (Rafael Wysocki)

   - Rework the handling of the "smart suspend" driver flag in the PM
     core to avoid issues hat may occur when drivers using it depend on
     some other drivers and clean up the related PM core code (Rafael
     Wysocki, Colin Ian King)

   - Fix the handling of devices with the power.direct_complete flag set
     if device_suspend() returns an error for at least one device to
     avoid situations in which some of them may not be resumed (Rafael
     Wysocki)

   - Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
     possible deadlock that may occur if the "compressor" hibernation
     module parameter is accessed during the registration of a new
     ieee80211 device (Lizhi Xu)

   - Suppress sleeping parent warning in device_pm_add() in the case
     when new children are added under a device with the
     power.direct_complete set after it has been processed by
     device_resume() (Xu Yang)

   - Remove needless return in three void functions related to system
     wakeup (Zijun Hu)

   - Replace deprecated kmap_atomic() with kmap_local_page() in the
     hibernation core code (David Reaver)

   - Remove unused helper functions related to system sleep (David Alan
     Gilbert)

   - Clean up s2idle_enter() so it does not lock and unlock CPU offline
     in vain and update comments in it (Ulf Hansson)

   - Clean up broken white space in dpm_wait_for_children() (Geert
     Uytterhoeven)

   - Update the cpupower utility to fix lib version-ing in it and memory
     leaks in error legs, remove hard-coded values, and implement CPU
     physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah
     Khan, Yiwei Lin, Zhongqiu Han)"

* tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (139 commits)
  PM: sleep: Fix bit masking operation
  dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
  dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
  cpufreq: Init cpufreq only for present CPUs
  PM: sleep: Fix handling devices with direct_complete set on errors
  cpuidle: Init cpuidle only for present CPUs
  PM: clk: Remove unused pm_clk_remove()
  PM: sleep: core: Fix indentation in dpm_wait_for_children()
  PM: s2idle: Extend comment in s2idle_enter()
  PM: s2idle: Drop redundant locks when entering s2idle
  PM: sleep: Remove unused pm_generic_ wrappers
  cpufreq: tegra186: Share policy per cluster
  cpupower: Make lib versioning scheme more obvious and fix version link
  PM: EM: Rework the depends on for CONFIG_ENERGY_MODEL
  PM: EM: Address RCU-related sparse warnings
  cpupower: Implement CPU physical core querying
  pm: cpupower: remove hard-coded topology depth values
  pm: cpupower: Fix cmd_monitor() error legs to free cpu_topology
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull power management updates from Rafael Wysocki:
 "These are dominated by cpufreq updates which in turn are dominated by
  updates related to boost support in the core and drivers and
  amd-pstate driver optimizations.

  Apart from the above, there are some cpuidle updates including a
  rework of the most recent idle intervals handling in the venerable
  menu governor that leads to significant improvements in some
  performance benchmarks, as the governor is now more likely to predict
  a shorter idle duration in some cases, and there are updates of the
  core device power management code, mostly related to system suspend
  and resume, that should help to avoid potential issues arising when
  the drivers of devices depending on one another want to use different
  optimizations.

  There is also a usual collection of assorted fixes and cleanups,
  including removal of some unused code.

  Specifics:

   - Manage sysfs attributes and boost frequencies efficiently from
     cpufreq core to reduce boilerplate code in drivers (Viresh Kumar)

   - Minor cleanups to cpufreq drivers (Aaron Kling, Benjamin Schneider,
     Dhananjay Ugwekar, Imran Shaik, zuoqian)

   - Migrate some cpufreq drivers to using for_each_present_cpu() (Jacky
     Bai)

   - cpufreq-qcom-hw DT binding fixes (Krzysztof Kozlowski)

   - Use str_enable_disable() helper in cpufreq_online() (Lifeng Zheng)

   - Optimize the amd-pstate driver to avoid cases where call paths end
     up calling the same writes multiple times and needlessly caching
     variables through code reorganization, locking overhaul and tracing
     adjustments (Mario Limonciello, Dhananjay Ugwekar)

   - Make it possible to avoid enabling capacity-aware scheduling (CAS)
     in the intel_pstate driver and relocate a check for out-of-band
     (OOB) platform handling in it to make it detect OOB before checking
     HWP availability (Rafael Wysocki)

   - Fix dbs_update() to avoid inadvertent conversions of negative
     integer values to unsigned int which causes CPU frequency selection
     to be inaccurate in some cases when the "conservative" cpufreq
     governor is in use (Jie Zhan)

   - Update the handling of the most recent idle intervals in the menu
     cpuidle governor to prevent useful information from being discarded
     by it in some cases and improve the prediction accuracy (Rafael
     Wysocki)

   - Make it possible to tell the intel_idle driver to ignore its
     built-in table of idle states for the given processor, clean up the
     handling of auto-demotion disabling on Baytrail and Cherrytrail
     chips in it, and update its MAINTAINERS entry (David Arcari, Artem
     Bityutskiy, Rafael Wysocki)

   - Make some cpuidle drivers use for_each_present_cpu() instead of
     for_each_possible_cpu() during initialization to avoid issues
     occurring when nosmp or maxcpus=0 are used (Jacky Bai)

   - Clean up the Energy Model handling code somewhat (Rafael Wysocki)

   - Use kfree_rcu() to simplify the handling of runtime Energy Model
     updates (Li RongQing)

   - Add an entry for the Energy Model framework to MAINTAINERS as
     properly maintained (Lukasz Luba)

   - Address RCU-related sparse warnings in the Energy Model code
     (Rafael Wysocki)

   - Remove ENERGY_MODEL dependency on SMP and allow it to be selected
     when DEVFREQ is set without CPUFREQ so it can be used on a wider
     range of systems (Jeson Gao)

   - Unify error handling during runtime suspend and runtime resume in
     the core to help drivers to implement more consistent runtime PM
     error handling (Rafael Wysocki)

   - Drop a redundant check from pm_runtime_force_resume() and rearrange
     documentation related to __pm_runtime_disable() (Rafael Wysocki)

   - Rework the handling of the "smart suspend" driver flag in the PM
     core to avoid issues hat may occur when drivers using it depend on
     some other drivers and clean up the related PM core code (Rafael
     Wysocki, Colin Ian King)

   - Fix the handling of devices with the power.direct_complete flag set
     if device_suspend() returns an error for at least one device to
     avoid situations in which some of them may not be resumed (Rafael
     Wysocki)

   - Use mutex_trylock() in hibernate_compressor_param_set() to avoid a
     possible deadlock that may occur if the "compressor" hibernation
     module parameter is accessed during the registration of a new
     ieee80211 device (Lizhi Xu)

   - Suppress sleeping parent warning in device_pm_add() in the case
     when new children are added under a device with the
     power.direct_complete set after it has been processed by
     device_resume() (Xu Yang)

   - Remove needless return in three void functions related to system
     wakeup (Zijun Hu)

   - Replace deprecated kmap_atomic() with kmap_local_page() in the
     hibernation core code (David Reaver)

   - Remove unused helper functions related to system sleep (David Alan
     Gilbert)

   - Clean up s2idle_enter() so it does not lock and unlock CPU offline
     in vain and update comments in it (Ulf Hansson)

   - Clean up broken white space in dpm_wait_for_children() (Geert
     Uytterhoeven)

   - Update the cpupower utility to fix lib version-ing in it and memory
     leaks in error legs, remove hard-coded values, and implement CPU
     physical core querying (Thomas Renninger, John B. Wyatt IV, Shuah
     Khan, Yiwei Lin, Zhongqiu Han)"

* tag 'pm-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (139 commits)
  PM: sleep: Fix bit masking operation
  dt-bindings: cpufreq: cpufreq-qcom-hw: Narrow properties on SDX75, SA8775p and SM8650
  dt-bindings: cpufreq: cpufreq-qcom-hw: Drop redundant minItems:1
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add missing constraint for interrupt-names
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add QCS8300 compatible
  cpufreq: Init cpufreq only for present CPUs
  PM: sleep: Fix handling devices with direct_complete set on errors
  cpuidle: Init cpuidle only for present CPUs
  PM: clk: Remove unused pm_clk_remove()
  PM: sleep: core: Fix indentation in dpm_wait_for_children()
  PM: s2idle: Extend comment in s2idle_enter()
  PM: s2idle: Drop redundant locks when entering s2idle
  PM: sleep: Remove unused pm_generic_ wrappers
  cpufreq: tegra186: Share policy per cluster
  cpupower: Make lib versioning scheme more obvious and fix version link
  PM: EM: Rework the depends on for CONFIG_ENERGY_MODEL
  PM: EM: Address RCU-related sparse warnings
  cpupower: Implement CPU physical core querying
  pm: cpupower: remove hard-coded topology depth values
  pm: cpupower: Fix cmd_monitor() error legs to free cpu_topology
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>tracing: Disable branch profiling in noinstr code</title>
<updated>2025-03-22T08:49:26+00:00</updated>
<author>
<name>Josh Poimboeuf</name>
<email>jpoimboe@kernel.org</email>
</author>
<published>2025-03-21T19:53:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=2cbb20b008dba39893f0e296dc8ca312f40a9a0e'/>
<id>2cbb20b008dba39893f0e296dc8ca312f40a9a0e</id>
<content type='text'>
CONFIG_TRACE_BRANCH_PROFILING inserts a call to ftrace_likely_update()
for each use of likely() or unlikely().  That breaks noinstr rules if
the affected function is annotated as noinstr.

Disable branch profiling for files with noinstr functions.  In addition
to some individual files, this also includes the entire arch/x86
subtree, as well as the kernel/entry, drivers/cpuidle, and drivers/idle
directories, all of which are noinstr-heavy.

Due to the nature of how sched binaries are built by combining multiple
.c files into one, branch profiling is disabled more broadly across the
sched code than would otherwise be needed.

This fixes many warnings like the following:

  vmlinux.o: warning: objtool: do_syscall_64+0x40: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: __rdgsbase_inactive+0x33: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: handle_bug.isra.0+0x198: call to ftrace_likely_update() leaves .noinstr.text section
  ...

Reported-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Suggested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: https://lore.kernel.org/r/fb94fc9303d48a5ed370498f54500cc4c338eb6d.1742586676.git.jpoimboe@kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
CONFIG_TRACE_BRANCH_PROFILING inserts a call to ftrace_likely_update()
for each use of likely() or unlikely().  That breaks noinstr rules if
the affected function is annotated as noinstr.

Disable branch profiling for files with noinstr functions.  In addition
to some individual files, this also includes the entire arch/x86
subtree, as well as the kernel/entry, drivers/cpuidle, and drivers/idle
directories, all of which are noinstr-heavy.

Due to the nature of how sched binaries are built by combining multiple
.c files into one, branch profiling is disabled more broadly across the
sched code than would otherwise be needed.

This fixes many warnings like the following:

  vmlinux.o: warning: objtool: do_syscall_64+0x40: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: __rdgsbase_inactive+0x33: call to ftrace_likely_update() leaves .noinstr.text section
  vmlinux.o: warning: objtool: handle_bug.isra.0+0x198: call to ftrace_likely_update() leaves .noinstr.text section
  ...

Reported-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Suggested-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: https://lore.kernel.org/r/fb94fc9303d48a5ed370498f54500cc4c338eb6d.1742586676.git.jpoimboe@kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: Init cpuidle only for present CPUs</title>
<updated>2025-03-12T20:31:59+00:00</updated>
<author>
<name>Jacky Bai</name>
<email>ping.bai@nxp.com</email>
</author>
<published>2025-03-07T14:55:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=68cb0139fec8e05b93978dc0ef1bc8df90a86419'/>
<id>68cb0139fec8e05b93978dc0ef1bc8df90a86419</id>
<content type='text'>
for_each_possible_cpu() is currently used to initialize cpuidle
in below cpuidle drivers:
  drivers/cpuidle/cpuidle-arm.c
  drivers/cpuidle/cpuidle-big_little.c
  drivers/cpuidle/cpuidle-psci.c
  drivers/cpuidle/cpuidle-qcom-spm.c
  drivers/cpuidle/cpuidle-riscv-sbi.c

However, in cpu_dev_register_generic(), for_each_present_cpu()
is used to register CPU devices which means the CPU devices are
only registered for present CPUs and not all possible CPUs.

With nosmp or maxcpus=0, only the boot CPU is present, lead
to the failure:

  |  Failed to register cpuidle device for cpu1

Then rollback to cancel all CPUs' cpuidle registration.

Change for_each_possible_cpu() to for_each_present_cpu() in the
above cpuidle drivers to ensure it only registers cpuidle devices
for CPUs that are actually present.

Fixes: b0c69e1214bc ("drivers: base: Use present CPUs in GENERIC_CPU_DEVICES")
Reviewed-by: Dhruva Gole &lt;d-gole@ti.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Tested-by: Yuanjie Yang &lt;quic_yuanjiey@quicinc.com&gt;
Signed-off-by: Jacky Bai &lt;ping.bai@nxp.com&gt;
Reviewed-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt;
Link: https://patch.msgid.link/20250307145547.2784821-1-ping.bai@nxp.com
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
for_each_possible_cpu() is currently used to initialize cpuidle
in below cpuidle drivers:
  drivers/cpuidle/cpuidle-arm.c
  drivers/cpuidle/cpuidle-big_little.c
  drivers/cpuidle/cpuidle-psci.c
  drivers/cpuidle/cpuidle-qcom-spm.c
  drivers/cpuidle/cpuidle-riscv-sbi.c

However, in cpu_dev_register_generic(), for_each_present_cpu()
is used to register CPU devices which means the CPU devices are
only registered for present CPUs and not all possible CPUs.

With nosmp or maxcpus=0, only the boot CPU is present, lead
to the failure:

  |  Failed to register cpuidle device for cpu1

Then rollback to cancel all CPUs' cpuidle registration.

Change for_each_possible_cpu() to for_each_present_cpu() in the
above cpuidle drivers to ensure it only registers cpuidle devices
for CPUs that are actually present.

Fixes: b0c69e1214bc ("drivers: base: Use present CPUs in GENERIC_CPU_DEVICES")
Reviewed-by: Dhruva Gole &lt;d-gole@ti.com&gt;
Reviewed-by: Sudeep Holla &lt;sudeep.holla@arm.com&gt;
Tested-by: Yuanjie Yang &lt;quic_yuanjiey@quicinc.com&gt;
Signed-off-by: Jacky Bai &lt;ping.bai@nxp.com&gt;
Reviewed-by: Ulf Hansson &lt;ulf.hansson@linaro.org&gt;
Link: https://patch.msgid.link/20250307145547.2784821-1-ping.bai@nxp.com
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'cpuidle-menu'</title>
<updated>2025-02-25T11:13:52+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-02-25T11:13:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=de585eac08b972c86faa060a77263af3d209ed24'/>
<id>de585eac08b972c86faa060a77263af3d209ed24</id>
<content type='text'>
This work had been triggered by a report that commit 0611a640e60a
("eventpoll: prefer kfree_rcu() in __ep_remove()") had caused the
critical-jOPS metric of the SPECjbb 2015 benchmark [1] to drop by around
50% even though it generally reduced kernel overhead.  Indeed, it was
found during further investigation that the total interrupt rate while
running the SPECjbb workload had fallen as a result of that commit by
55% and the local timer interrupt rate had fallen by almost 80%.

That turned out to cause the menu cpuidle governor to select the deepest
idle state supplied by the cpuidle driver (intel_idle) much more often
which added significantly more idle state latency to the workload and
that led to the decrease of the critical-jOPS score.

Interestingly enough, this problem was not visible when the teo cpuidle
governor was used instead of menu, so it appeared to be specific to the
latter.  CPU wakeup event statistics collected while running the
workload indicated that the menu governor was effectively ignoring non-
timer wakeup information and all of its idle state selection decisions
appeared to be based on timer wakeups only.  Thus, it appeared that the
reduction of the local timer interrupt rate caused the governor to
predict a idle duration much more often while running the workload and
the deepest idle state was selected significantly more often as a result
of that.

A subsequent inspection of the get_typical_interval() function in the
menu governor indicated that it might return UINT_MAX too often which
then caused the governor's decisions to be based entirely on information
related to timers.

Generally speaking, UINT_MAX is returned by get_typical_interval() if it
cannot make a prediction based on the most recent idle intervals data
with sufficiently high confidence, but at least in some cases this means
that useful information is not taken into account at all which may lead
to significant idle state selection mistakes.  Moreover, this is not
really unlikely to happen.

One issue with get_typical_interval() is that, when it eliminates
outliers from the sample set in an attempt to reduce the standard
deviation (and so improve the prediction confidence), it does that by
dropping high-end samples only, while samples at the low end of the set
are retained.  However, the samples at the low end very well may be the
outliers and they should be eliminated from the sample set instead of
the high-end samples.  Accordingly, the likelihood of making a
meaningful idle duration prediction can be improved by making it also
eliminate low-end samples if they are farther from the average than
high-end samples.

Another issue is that get_typical_interval() gives up after eliminating
1/4 of the samples if the standard deviation is still not as low as
desired (within 1/6 of the average or within 20 us if the average is
close to 0), but the remaining samples in the set still represent useful
information at that point and discarding them altogether may lead to
suboptimal idle state selection.

For instance, the largest idle duration value in the get_typical_interval()
data set is the maximum idle duration observed recently and it is likely
that the upcoming idle duration will not exceed it.  Therefore, in the
absence of a better choice, this value can be used as an upper bound on
the target residency of the idle state to select.

* cpuidle-menu:
  cpuidle: menu: Update documentation after get_typical_interval() changes
  cpuidle: menu: Avoid discarding useful information
  cpuidle: menu: Eliminate outliers on both ends of the sample set
  cpuidle: menu: Tweak threshold use in get_typical_interval()
  cpuidle: menu: Use one loop for average and variance computations
  cpuidle: menu: Drop a redundant local variable
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This work had been triggered by a report that commit 0611a640e60a
("eventpoll: prefer kfree_rcu() in __ep_remove()") had caused the
critical-jOPS metric of the SPECjbb 2015 benchmark [1] to drop by around
50% even though it generally reduced kernel overhead.  Indeed, it was
found during further investigation that the total interrupt rate while
running the SPECjbb workload had fallen as a result of that commit by
55% and the local timer interrupt rate had fallen by almost 80%.

That turned out to cause the menu cpuidle governor to select the deepest
idle state supplied by the cpuidle driver (intel_idle) much more often
which added significantly more idle state latency to the workload and
that led to the decrease of the critical-jOPS score.

Interestingly enough, this problem was not visible when the teo cpuidle
governor was used instead of menu, so it appeared to be specific to the
latter.  CPU wakeup event statistics collected while running the
workload indicated that the menu governor was effectively ignoring non-
timer wakeup information and all of its idle state selection decisions
appeared to be based on timer wakeups only.  Thus, it appeared that the
reduction of the local timer interrupt rate caused the governor to
predict a idle duration much more often while running the workload and
the deepest idle state was selected significantly more often as a result
of that.

A subsequent inspection of the get_typical_interval() function in the
menu governor indicated that it might return UINT_MAX too often which
then caused the governor's decisions to be based entirely on information
related to timers.

Generally speaking, UINT_MAX is returned by get_typical_interval() if it
cannot make a prediction based on the most recent idle intervals data
with sufficiently high confidence, but at least in some cases this means
that useful information is not taken into account at all which may lead
to significant idle state selection mistakes.  Moreover, this is not
really unlikely to happen.

One issue with get_typical_interval() is that, when it eliminates
outliers from the sample set in an attempt to reduce the standard
deviation (and so improve the prediction confidence), it does that by
dropping high-end samples only, while samples at the low end of the set
are retained.  However, the samples at the low end very well may be the
outliers and they should be eliminated from the sample set instead of
the high-end samples.  Accordingly, the likelihood of making a
meaningful idle duration prediction can be improved by making it also
eliminate low-end samples if they are farther from the average than
high-end samples.

Another issue is that get_typical_interval() gives up after eliminating
1/4 of the samples if the standard deviation is still not as low as
desired (within 1/6 of the average or within 20 us if the average is
close to 0), but the remaining samples in the set still represent useful
information at that point and discarding them altogether may lead to
suboptimal idle state selection.

For instance, the largest idle duration value in the get_typical_interval()
data set is the maximum idle duration observed recently and it is likely
that the upcoming idle duration will not exceed it.  Therefore, in the
absence of a better choice, this value can be used as an upper bound on
the target residency of the idle state to select.

* cpuidle-menu:
  cpuidle: menu: Update documentation after get_typical_interval() changes
  cpuidle: menu: Avoid discarding useful information
  cpuidle: menu: Eliminate outliers on both ends of the sample set
  cpuidle: menu: Tweak threshold use in get_typical_interval()
  cpuidle: menu: Use one loop for average and variance computations
  cpuidle: menu: Drop a redundant local variable
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: menu: Update documentation after get_typical_interval() changes</title>
<updated>2025-02-25T11:12:46+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-02-20T20:13:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=5c350410999653dff8d2975d794088e4c166e8b5'/>
<id>5c350410999653dff8d2975d794088e4c166e8b5</id>
<content type='text'>
The documentation of the menu cpuidle governor needs to be updated
to match the code behavior after some changes made recently.

No functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/4998484.31r3eYUQgx@rjwysocki.net
[ rjw: More specific subject, two typos fixed in the changelog ]
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The documentation of the menu cpuidle governor needs to be updated
to match the code behavior after some changes made recently.

No functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Link: https://patch.msgid.link/4998484.31r3eYUQgx@rjwysocki.net
[ rjw: More specific subject, two typos fixed in the changelog ]
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: menu: Avoid discarding useful information</title>
<updated>2025-02-25T11:00:43+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-02-06T14:29:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=85975daeaa4d6ec560bfcd354fc9c08ad7f38888'/>
<id>85975daeaa4d6ec560bfcd354fc9c08ad7f38888</id>
<content type='text'>
When giving up on making a high-confidence prediction,
get_typical_interval() always returns UINT_MAX which means that the
next idle interval prediction will be based entirely on the time till
the next timer.  However, the information represented by the most
recent intervals may not be completely useless in those cases.

Namely, the largest recent idle interval is an upper bound on the
recently observed idle duration, so it is reasonable to assume that
the next idle duration is unlikely to exceed it.  Moreover, this is
still true after eliminating the suspected outliers if the sample
set still under consideration is at least as large as 50% of the
maximum sample set size.

Accordingly, make get_typical_interval() return the current maximum
recent interval value in that case instead of UINT_MAX.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reported-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/7770672.EvYhyI6sBW@rjwysocki.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When giving up on making a high-confidence prediction,
get_typical_interval() always returns UINT_MAX which means that the
next idle interval prediction will be based entirely on the time till
the next timer.  However, the information represented by the most
recent intervals may not be completely useless in those cases.

Namely, the largest recent idle interval is an upper bound on the
recently observed idle duration, so it is reasonable to assume that
the next idle duration is unlikely to exceed it.  Moreover, this is
still true after eliminating the suspected outliers if the sample
set still under consideration is at least as large as 50% of the
maximum sample set size.

Accordingly, make get_typical_interval() return the current maximum
recent interval value in that case instead of UINT_MAX.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reported-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/7770672.EvYhyI6sBW@rjwysocki.net
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: menu: Eliminate outliers on both ends of the sample set</title>
<updated>2025-02-25T11:00:35+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-02-06T14:26:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=8de7606f0fe2bf5a918fe97d425e16e190a24fe6'/>
<id>8de7606f0fe2bf5a918fe97d425e16e190a24fe6</id>
<content type='text'>
Currently, get_typical_interval() attempts to eliminate outliers at the
high end of the sample set only (probably in order to bias the prediction
toward lower values), but this it problematic because if the outliers are
present at the low end of the sample set, discarding the highest values
will not help to reduce the variance.

Since the presence of outliers at the low end of the sample set is
generally as likely as their presence at the high end of the sample
set, modify get_typical_interval() to treat samples at the largest
distances from the average (on both ends of the sample set) as outliers.

This should increase the likelihood of making a meaningful prediction
in some cases.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reported-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/2301940.iZASKD2KPV@rjwysocki.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently, get_typical_interval() attempts to eliminate outliers at the
high end of the sample set only (probably in order to bias the prediction
toward lower values), but this it problematic because if the outliers are
present at the low end of the sample set, discarding the highest values
will not help to reduce the variance.

Since the presence of outliers at the low end of the sample set is
generally as likely as their presence at the high end of the sample
set, modify get_typical_interval() to treat samples at the largest
distances from the average (on both ends of the sample set) as outliers.

This should increase the likelihood of making a meaningful prediction
in some cases.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reported-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/2301940.iZASKD2KPV@rjwysocki.net
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: menu: Tweak threshold use in get_typical_interval()</title>
<updated>2025-02-25T11:00:28+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-02-06T14:25:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=60256e458e1c29652b2f9e4f2ba71fc7b09bd30c'/>
<id>60256e458e1c29652b2f9e4f2ba71fc7b09bd30c</id>
<content type='text'>
To prepare get_typical_interval() for subsequent changes, rearrange
the use of the data point threshold in it a bit and initialize that
threshold to UINT_MAX which is more consistent with its data type.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/8490144.T7Z3S40VBb@rjwysocki.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To prepare get_typical_interval() for subsequent changes, rearrange
the use of the data point threshold in it a bit and initialize that
threshold to UINT_MAX which is more consistent with its data type.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/8490144.T7Z3S40VBb@rjwysocki.net
</pre>
</div>
</content>
</entry>
<entry>
<title>cpuidle: menu: Use one loop for average and variance computations</title>
<updated>2025-02-25T11:00:17+00:00</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2025-02-06T14:24:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux-stable.git/commit/?id=13982929fb08ed4691256072856f50bf7b206b9b'/>
<id>13982929fb08ed4691256072856f50bf7b206b9b</id>
<content type='text'>
Use the observation that one loop is sufficient to compute the average
of an array of values and their variance to eliminate one of the loops
from get_typical_interval().

While at it, make get_typical_interval() consistently use u64 as the
64-bit unsigned integer data type and rearrange some white space and the
declarations of local variables in it (to make them follow the reverse
X-mas tree pattern).

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/3339073.aeNJFYEL58@rjwysocki.net
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Use the observation that one loop is sufficient to compute the average
of an array of values and their variance to eliminate one of the loops
from get_typical_interval().

While at it, make get_typical_interval() consistently use u64 as the
64-bit unsigned integer data type and rearrange some white space and the
declarations of local variables in it (to make them follow the reverse
X-mas tree pattern).

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Tested-by: Artem Bityutskiy &lt;artem.bityutskiy@linux.intel.com&gt;
Reviewed-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Tested-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Link: https://patch.msgid.link/3339073.aeNJFYEL58@rjwysocki.net
</pre>
</div>
</content>
</entry>
</feed>
