<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/sched/cputime.c, branch v6.8</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>cputime: remove cputime_to_nsecs fallback</title>
<updated>2022-12-27T11:52:17+00:00</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2022-12-20T07:07:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=c89970202a1153b2fc230e89f90c180bd5bcbcef'/>
<id>c89970202a1153b2fc230e89f90c180bd5bcbcef</id>
<content type='text'>
The archs that use cputime_to_nsecs() internally provide their own
definition and don't need the fallback. cputime_to_usecs() is unused except
in this fallback, and is not defined anywhere.

This removes the final remnant of the cputime_t code from the kernel.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20221220070705.2958959-1-npiggin@gmail.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The archs that use cputime_to_nsecs() internally provide their own
definition and don't need the fallback. cputime_to_usecs() is unused except
in this fallback, and is not defined anywhere.

This removes the final remnant of the cputime_t code from the kernel.

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20221220070705.2958959-1-npiggin@gmail.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/core: add forced idle accounting for cgroups</title>
<updated>2022-07-04T07:23:07+00:00</updated>
<author>
<name>Josh Don</name>
<email>joshdon@google.com</email>
</author>
<published>2022-06-29T21:14:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=1fcf54deb767d474181ad7cf33c92bb2a33607fb'/>
<id>1fcf54deb767d474181ad7cf33c92bb2a33607fb</id>
<content type='text'>
4feee7d1260 previously added per-task forced idle accounting. This patch
extends this to also include cgroups.

rstat is used for cgroup accounting, except for the root, which uses
kcpustat in order to bypass the need for doing an rstat flush when
reading root stats.

Only cgroup v2 is supported. Similar to the task accounting, the cgroup
accounting requires that schedstats is enabled.

Signed-off-by: Josh Don &lt;joshdon@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lkml.kernel.org/r/20220629211426.3329954-1-joshdon@google.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
4feee7d1260 previously added per-task forced idle accounting. This patch
extends this to also include cgroups.

rstat is used for cgroup accounting, except for the root, which uses
kcpustat in order to bypass the need for doing an rstat flush when
reading root stats.

Only cgroup v2 is supported. Similar to the task accounting, the cgroup
accounting requires that schedstats is enabled.

Signed-off-by: Josh Don &lt;joshdon@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lkml.kernel.org/r/20220629211426.3329954-1-joshdon@google.com
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/headers: Introduce kernel/sched/build_policy.c and build multiple .c files there</title>
<updated>2022-02-23T09:58:33+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2022-02-22T12:46:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f96eca432015ddc1b621632488ebc345bca06791'/>
<id>f96eca432015ddc1b621632488ebc345bca06791</id>
<content type='text'>
Similarly to kernel/sched/build_utility.c, collect all 'scheduling policy' related
source code files into kernel/sched/build_policy.c:

    kernel/sched/idle.c

    kernel/sched/rt.c

    kernel/sched/cpudeadline.c
    kernel/sched/pelt.c

    kernel/sched/cputime.c
    kernel/sched/deadline.c

With the exception of fair.c, which we continue to build as a separate file
for build efficiency and parallelism reasons.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Similarly to kernel/sched/build_utility.c, collect all 'scheduling policy' related
source code files into kernel/sched/build_policy.c:

    kernel/sched/idle.c

    kernel/sched/rt.c

    kernel/sched/cpudeadline.c
    kernel/sched/pelt.c

    kernel/sched/cputime.c
    kernel/sched/deadline.c

With the exception of fair.c, which we continue to build as a separate file
for build efficiency and parallelism reasons.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sched_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-01-12T01:14:59+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-01-12T01:14:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6ae71436cda740148640046d58190a5bbc3ac86d'/>
<id>6ae71436cda740148640046d58190a5bbc3ac86d</id>
<content type='text'>
Pull scheduler updates from Borislav Petkov:
 "Mostly minor things this time; some highlights:

   - core-sched: Add 'Forced Idle' accounting; this allows tracking how
     much CPU time is 'lost' due to core scheduling constraints.

   - psi: Fix for MEM_FULL; a task running reclaim would be counted as a
     runnable task and prevent MEM_FULL from being reported.

   - cpuacct: Long standing fixes for some cgroup accounting issues.

   - rt: Bandwidth timer could, under unusual circumstances, fail to
     be armed, leading to indefinite throttling."

[ Description above by Peter Zijlstra ]

* tag 'sched_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Replace CFS internal cpu_util() with cpu_util_cfs()
  sched/fair: Cleanup task_util and capacity type
  sched/rt: Try to restart rt period timer when rt runtime exceeded
  sched/fair: Document the slow path and fast path in select_task_rq_fair
  sched/fair: Fix per-CPU kthread and wakee stacking for asym CPU capacity
  sched/fair: Fix detection of per-CPU kthreads waking a task
  sched/cpuacct: Make user/system times in cpuacct.stat more precise
  sched/cpuacct: Fix user/system in shown cpuacct.usage*
  cpuacct: Convert BUG_ON() to WARN_ON_ONCE()
  cputime, cpuacct: Include guest time in user time in cpuacct.stat
  psi: Fix PSI_MEM_FULL state when tasks are in memstall and doing reclaim
  sched/core: Forced idle accounting
  psi: Add a missing SPDX license header
  psi: Remove repeated verbose comment
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull scheduler updates from Borislav Petkov:
 "Mostly minor things this time; some highlights:

   - core-sched: Add 'Forced Idle' accounting; this allows tracking how
     much CPU time is 'lost' due to core scheduling constraints.

   - psi: Fix for MEM_FULL; a task running reclaim would be counted as a
     runnable task and prevent MEM_FULL from being reported.

   - cpuacct: Long standing fixes for some cgroup accounting issues.

   - rt: Bandwidth timer could, under unusual circumstances, fail to
     be armed, leading to indefinite throttling."

[ Description above by Peter Zijlstra ]

* tag 'sched_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Replace CFS internal cpu_util() with cpu_util_cfs()
  sched/fair: Cleanup task_util and capacity type
  sched/rt: Try to restart rt period timer when rt runtime exceeded
  sched/fair: Document the slow path and fast path in select_task_rq_fair
  sched/fair: Fix per-CPU kthread and wakee stacking for asym CPU capacity
  sched/fair: Fix detection of per-CPU kthreads waking a task
  sched/cpuacct: Make user/system times in cpuacct.stat more precise
  sched/cpuacct: Fix user/system in shown cpuacct.usage*
  cpuacct: Convert BUG_ON() to WARN_ON_ONCE()
  cputime, cpuacct: Include guest time in user time in cpuacct.stat
  psi: Fix PSI_MEM_FULL state when tasks are in memstall and doing reclaim
  sched/core: Forced idle accounting
  psi: Add a missing SPDX license header
  psi: Remove repeated verbose comment
</pre>
</div>
</content>
</entry>
<entry>
<title>sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full</title>
<updated>2021-12-02T14:08:22+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2021-10-26T14:10:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e7f2be115f0746b969c0df14c0d182f65f005ca5'/>
<id>e7f2be115f0746b969c0df14c0d182f65f005ca5</id>
<content type='text'>
getrusage(RUSAGE_THREAD) with nohz_full may return shorter utime/stime
than the actual time.

task_cputime_adjusted() snapshots utime and stime and then adjusts their
sum to match the scheduler maintained cputime.sum_exec_runtime.
Unfortunately in nohz_full, sum_exec_runtime is only updated once per
second in the worst case, causing a discrepancy against utime and stime
that can be updated anytime by the reader using vtime.

To fix this situation, perform an update of cputime.sum_exec_runtime
when the cputime snapshot reports the task as actually running while
the tick is disabled. The related overhead is then contained within the
relevant situations.

Reported-by: Hasegawa Hitomi &lt;hasegawa-hitomi@fujitsu.com&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Hasegawa Hitomi &lt;hasegawa-hitomi@fujitsu.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Masayoshi Mizuma &lt;m.mizuma@jp.fujitsu.com&gt;
Acked-by: Phil Auld &lt;pauld@redhat.com&gt;
Link: https://lore.kernel.org/r/20211026141055.57358-3-frederic@kernel.org

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
getrusage(RUSAGE_THREAD) with nohz_full may return shorter utime/stime
than the actual time.

task_cputime_adjusted() snapshots utime and stime and then adjusts their
sum to match the scheduler maintained cputime.sum_exec_runtime.
Unfortunately in nohz_full, sum_exec_runtime is only updated once per
second in the worst case, causing a discrepancy against utime and stime
that can be updated anytime by the reader using vtime.

To fix this situation, perform an update of cputime.sum_exec_runtime
when the cputime snapshot reports the task as actually running while
the tick is disabled. The related overhead is then contained within the
relevant situations.

Reported-by: Hasegawa Hitomi &lt;hasegawa-hitomi@fujitsu.com&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Hasegawa Hitomi &lt;hasegawa-hitomi@fujitsu.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Masayoshi Mizuma &lt;m.mizuma@jp.fujitsu.com&gt;
Acked-by: Phil Auld &lt;pauld@redhat.com&gt;
Link: https://lore.kernel.org/r/20211026141055.57358-3-frederic@kernel.org

</pre>
</div>
</content>
</entry>
<entry>
<title>cputime, cpuacct: Include guest time in user time in cpuacct.stat</title>
<updated>2021-11-23T08:55:22+00:00</updated>
<author>
<name>Andrey Ryabinin</name>
<email>arbn@yandex-team.com</email>
</author>
<published>2021-11-15T16:46:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=9731698ecb9c851f353ce2496292ff9fcea39dff'/>
<id>9731698ecb9c851f353ce2496292ff9fcea39dff</id>
<content type='text'>
cpuacct.stat in non-root cgroups shows user time without guest time
included in it. This doesn't match the user time shown in root
cpuacct.stat and /proc/&lt;pid&gt;/stat. This also affects cgroup2's cpu.stat
in the same way.

Make account_guest_time() add user time to the cgroup's cpustat to
fix this.

Fixes: ef12fefabf94 ("cpuacct: add per-cgroup utime/stime statistics")
Signed-off-by: Andrey Ryabinin &lt;arbn@yandex-team.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Daniel Jordan &lt;daniel.m.jordan@oracle.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lore.kernel.org/r/20211115164607.23784-1-arbn@yandex-team.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
cpuacct.stat in non-root cgroups shows user time without guest time
included in it. This doesn't match the user time shown in root
cpuacct.stat and /proc/&lt;pid&gt;/stat. This also affects cgroup2's cpu.stat
in the same way.

Make account_guest_time() add user time to the cgroup's cpustat to
fix this.

Fixes: ef12fefabf94 ("cpuacct: add per-cgroup utime/stime statistics")
Signed-off-by: Andrey Ryabinin &lt;arbn@yandex-team.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Daniel Jordan &lt;daniel.m.jordan@oracle.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lore.kernel.org/r/20211115164607.23784-1-arbn@yandex-team.com
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sched-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2021-04-28T20:33:57+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-28T20:33:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=16b3d0cf5bad844daaf436ad2e9061de0fe36e5c'/>
<id>16b3d0cf5bad844daaf436ad2e9061de0fe36e5c</id>
<content type='text'>
Pull scheduler updates from Ingo Molnar:

 - Clean up SCHED_DEBUG: move the decades old mess of sysctl, procfs and
   debugfs interfaces to a unified debugfs interface.

 - Signals: Allow caching one sigqueue object per task, to improve
   performance &amp; latencies.

 - Improve newidle_balance() irq-off latencies on systems with a large
   number of CPU cgroups.

 - Improve energy-aware scheduling

 - Improve the PELT metrics for certain workloads

 - Reintroduce select_idle_smt() to improve load-balancing locality -
   but without the previous regressions

 - Add 'scheduler latency debugging': warn after long periods of pending
   need_resched. This is an opt-in feature that requires the enabling of
   the LATENCY_WARN scheduler feature, or the use of the
   resched_latency_warn_ms=xx boot parameter.

 - CPU hotplug fixes for HP-rollback, and for the 'fail' interface. Fix
   remaining balance_push() vs. hotplug holes/races

 - PSI fixes, plus allow /proc/pressure/ files to be written by
   CAP_SYS_RESOURCE tasks as well

 - Fix/improve various load-balancing corner cases vs. capacity margins

 - Fix sched topology on systems with NUMA diameter of 3 or above

 - Fix PF_KTHREAD vs to_kthread() race

 - Minor rseq optimizations

 - Misc cleanups, optimizations, fixes and smaller updates

* tag 'sched-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits)
  cpumask/hotplug: Fix cpu_dying() state tracking
  kthread: Fix PF_KTHREAD vs to_kthread() race
  sched/debug: Fix cgroup_path[] serialization
  sched,psi: Handle potential task count underflow bugs more gracefully
  sched: Warn on long periods of pending need_resched
  sched/fair: Move update_nohz_stats() to the CONFIG_NO_HZ_COMMON block to simplify the code &amp; fix an unused function warning
  sched/debug: Rename the sched_debug parameter to sched_verbose
  sched,fair: Alternative sched_slice()
  sched: Move /proc/sched_debug to debugfs
  sched,debug: Convert sysctl sched_domains to debugfs
  debugfs: Implement debugfs_create_str()
  sched,preempt: Move preempt_dynamic to debug.c
  sched: Move SCHED_DEBUG sysctl to debugfs
  sched: Don't make LATENCYTOP select SCHED_DEBUG
  sched: Remove sched_schedstats sysctl out from under SCHED_DEBUG
  sched/numa: Allow runtime enabling/disabling of NUMA balance without SCHED_DEBUG
  sched: Use cpu_dying() to fix balance_push vs hotplug-rollback
  cpumask: Introduce DYING mask
  cpumask: Make cpu_{online,possible,present,active}() inline
  rseq: Optimise rseq_get_rseq_cs() and clear_rseq_cs()
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull scheduler updates from Ingo Molnar:

 - Clean up SCHED_DEBUG: move the decades old mess of sysctl, procfs and
   debugfs interfaces to a unified debugfs interface.

 - Signals: Allow caching one sigqueue object per task, to improve
   performance &amp; latencies.

 - Improve newidle_balance() irq-off latencies on systems with a large
   number of CPU cgroups.

 - Improve energy-aware scheduling

 - Improve the PELT metrics for certain workloads

 - Reintroduce select_idle_smt() to improve load-balancing locality -
   but without the previous regressions

 - Add 'scheduler latency debugging': warn after long periods of pending
   need_resched. This is an opt-in feature that requires the enabling of
   the LATENCY_WARN scheduler feature, or the use of the
   resched_latency_warn_ms=xx boot parameter.

 - CPU hotplug fixes for HP-rollback, and for the 'fail' interface. Fix
   remaining balance_push() vs. hotplug holes/races

 - PSI fixes, plus allow /proc/pressure/ files to be written by
   CAP_SYS_RESOURCE tasks as well

 - Fix/improve various load-balancing corner cases vs. capacity margins

 - Fix sched topology on systems with NUMA diameter of 3 or above

 - Fix PF_KTHREAD vs to_kthread() race

 - Minor rseq optimizations

 - Misc cleanups, optimizations, fixes and smaller updates

* tag 'sched-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits)
  cpumask/hotplug: Fix cpu_dying() state tracking
  kthread: Fix PF_KTHREAD vs to_kthread() race
  sched/debug: Fix cgroup_path[] serialization
  sched,psi: Handle potential task count underflow bugs more gracefully
  sched: Warn on long periods of pending need_resched
  sched/fair: Move update_nohz_stats() to the CONFIG_NO_HZ_COMMON block to simplify the code &amp; fix an unused function warning
  sched/debug: Rename the sched_debug parameter to sched_verbose
  sched,fair: Alternative sched_slice()
  sched: Move /proc/sched_debug to debugfs
  sched,debug: Convert sysctl sched_domains to debugfs
  debugfs: Implement debugfs_create_str()
  sched,preempt: Move preempt_dynamic to debug.c
  sched: Move SCHED_DEBUG sysctl to debugfs
  sched: Don't make LATENCYTOP select SCHED_DEBUG
  sched: Remove sched_schedstats sysctl out from under SCHED_DEBUG
  sched/numa: Allow runtime enabling/disabling of NUMA balance without SCHED_DEBUG
  sched: Use cpu_dying() to fix balance_push vs hotplug-rollback
  cpumask: Introduce DYING mask
  cpumask: Make cpu_{online,possible,present,active}() inline
  rseq: Optimise rseq_get_rseq_cs() and clear_rseq_cs()
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>sched: Fix various typos</title>
<updated>2021-03-21T23:11:52+00:00</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2021-03-18T12:38:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3b03706fa621ce31a3e9ef6307020fde4e6aae16'/>
<id>3b03706fa621ce31a3e9ef6307020fde4e6aae16</id>
<content type='text'>
Fix ~42 single-word typos in scheduler code comments.

We have accumulated a few fun ones over the years. :-)

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Ben Segall &lt;bsegall@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: linux-kernel@vger.kernel.org
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fix ~42 single-word typos in scheduler code comments.

We have accumulated a few fun ones over the years. :-)

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Ben Segall &lt;bsegall@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: linux-kernel@vger.kernel.org
</pre>
</div>
</content>
</entry>
<entry>
<title>irqtime: Make accounting correct on RT</title>
<updated>2021-03-17T15:34:09+00:00</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-03-09T08:55:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6516b386d8a07102aac353daf9c0fe0045faeb74'/>
<id>6516b386d8a07102aac353daf9c0fe0045faeb74</id>
<content type='text'>
vtime_account_irq() and irqtime_account_irq() base checks on preempt_count(),
which fails on RT because preempt_count() does not contain the softirq
accounting, which is separate on RT.

These checks do not need the full preempt count as they only operate on the
hard and softirq sections.

Use irq_count() instead, which provides the correct value on both RT and
non-RT kernels. The compiler is clever enough to fold the masking for !RT:

       99b:	65 8b 05 00 00 00 00 	mov    %gs:0x0(%rip),%eax
 -     9a2:	25 ff ff ff 7f       	and    $0x7fffffff,%eax
 +     9a2:	25 00 ff ff 00       	and    $0xffff00,%eax

Reported-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Tested-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20210309085727.153926793@linutronix.de

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
vtime_account_irq() and irqtime_account_irq() base checks on preempt_count(),
which fails on RT because preempt_count() does not contain the softirq
accounting, which is separate on RT.

These checks do not need the full preempt count as they only operate on the
hard and softirq sections.

Use irq_count() instead, which provides the correct value on both RT and
non-RT kernels. The compiler is clever enough to fold the masking for !RT:

       99b:	65 8b 05 00 00 00 00 	mov    %gs:0x0(%rip),%eax
 -     9a2:	25 ff ff ff 7f       	and    $0x7fffffff,%eax
 +     9a2:	25 00 ff ff 00       	and    $0xffff00,%eax

Reported-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Tested-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20210309085727.153926793@linutronix.de

</pre>
</div>
</content>
</entry>
<entry>
<title>irqtime: Move irqtime entry accounting after irq offset incrementation</title>
<updated>2020-12-02T19:20:05+00:00</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2020-12-02T11:57:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=d3759e7184f8f6187e62f8c4e7dcb1f6c47c075a'/>
<id>d3759e7184f8f6187e62f8c4e7dcb1f6c47c075a</id>
<content type='text'>
IRQ time entry is currently accounted before HARDIRQ_OFFSET or
SOFTIRQ_OFFSET is incremented. This makes it convenient to decide which
index the cputime being accounted is dispatched to.

Unfortunately it prevents tick_irq_enter() from being called under
HARDIRQ_OFFSET because tick_irq_enter() has to be called before the IRQ
entry accounting due to the necessary clock catch up. As a result we
don't benefit from appropriate lockdep coverage on tick_irq_enter().

To prepare for fixing this, move the IRQ entry cputime accounting after
the preempt offset is incremented. This requires the cputime dispatch
code to handle the extra offset.

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20201202115732.27827-5-frederic@kernel.org

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
IRQ time entry is currently accounted before HARDIRQ_OFFSET or
SOFTIRQ_OFFSET is incremented. This makes it convenient to decide which
index the cputime being accounted is dispatched to.

Unfortunately it prevents tick_irq_enter() from being called under
HARDIRQ_OFFSET because tick_irq_enter() has to be called before the IRQ
entry accounting due to the necessary clock catch up. As a result we
don't benefit from appropriate lockdep coverage on tick_irq_enter().

To prepare for fixing this, move the IRQ entry cputime accounting after
the preempt offset is incremented. This requires the cputime dispatch
code to handle the extra offset.

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20201202115732.27827-5-frederic@kernel.org

</pre>
</div>
</content>
</entry>
</feed>
