<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/kernel/time, branch master</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>alarmtimer: Fix argument order in alarm_timer_forward()</title>
<updated>2026-03-24T22:17:14+00:00</updated>
<author>
<name>Zhan Xusheng</name>
<email>zhanxusheng1024@gmail.com</email>
</author>
<published>2026-03-23T06:11:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5d16467ae56343b9205caedf85e3a131e0914ad8'/>
<id>5d16467ae56343b9205caedf85e3a131e0914ad8</id>
<content type='text'>
alarm_timer_forward() passes arguments to alarm_forward() in the wrong
order:

  alarm_forward(alarm, timr-&gt;it_interval, now);

However, alarm_forward() is defined as:

  u64 alarm_forward(struct alarm *alarm, ktime_t now, ktime_t interval);

and uses the second argument as the current time:

  delta = ktime_sub(now, alarm-&gt;node.expires);

Passing the interval as "now" results in incorrect delta computation,
which can lead to missed expirations or incorrect overrun accounting.

This issue has been present since the introduction of
alarm_timer_forward().

Fix this by swapping the arguments.

Fixes: e7561f1633ac ("alarmtimer: Implement forward callback")
Signed-off-by: Zhan Xusheng &lt;zhanxusheng@xiaomi.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260323061130.29991-1-zhanxusheng@xiaomi.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
alarm_timer_forward() passes arguments to alarm_forward() in the wrong
order:

  alarm_forward(alarm, timr-&gt;it_interval, now);

However, alarm_forward() is defined as:

  u64 alarm_forward(struct alarm *alarm, ktime_t now, ktime_t interval);

and uses the second argument as the current time:

  delta = ktime_sub(now, alarm-&gt;node.expires);

Passing the interval as "now" results in incorrect delta computation,
which can lead to missed expirations or incorrect overrun accounting.

This issue has been present since the introduction of
alarm_timer_forward().

Fix this by swapping the arguments.

Fixes: e7561f1633ac ("alarmtimer: Implement forward callback")
Signed-off-by: Zhan Xusheng &lt;zhanxusheng@xiaomi.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260323061130.29991-1-zhanxusheng@xiaomi.com
</pre>
</div>
</content>
</entry>
<entry>
<title>time/jiffies: Mark jiffies_64_to_clock_t() notrace</title>
<updated>2026-03-11T09:33:12+00:00</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2026-03-07T02:24:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=755a648e78f12574482d4698d877375793867fa1'/>
<id>755a648e78f12574482d4698d877375793867fa1</id>
<content type='text'>
The trace_clock_jiffies() function that handles the "uptime" clock for
tracing calls jiffies_64_to_clock_t(). This causes the function tracer to
constantly recurse when the tracing clock is set to "uptime". Mark it
notrace to prevent unnecessary recursion when using the "uptime" clock.

Fixes: 58d4e21e50ff3 ("tracing: Fix wraparound problems in "uptime" trace clock")
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Link: https://patch.msgid.link/20260306212403.72270bb2@robin
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The trace_clock_jiffies() function that handles the "uptime" clock for
tracing calls jiffies_64_to_clock_t(). This causes the function tracer to
constantly recurse when the tracing clock is set to "uptime". Mark it
notrace to prevent unnecessary recursion when using the "uptime" clock.

Fixes: 58d4e21e50ff3 ("tracing: Fix wraparound problems in "uptime" trace clock")
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Link: https://patch.msgid.link/20260306212403.72270bb2@robin
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2026-03-08T01:09:15+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-03-08T01:09:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6ff1020c2f7b5593b916afbc87a835371ab1d51b'/>
<id>6ff1020c2f7b5593b916afbc87a835371ab1d51b</id>
<content type='text'>
Pull timer fix from Ingo Molnar:
 "Make clock_adjtime() syscall timex validation slightly more permissive
  for auxiliary clocks, to not reject syscalls based on the status field
  that do not try to modify the status field.

  This makes the ABI behavior in clock_adjtime() consistent with
  CLOCK_REALTIME"

* tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Fix timex status validation for auxiliary clocks
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull timer fix from Ingo Molnar:
 "Make clock_adjtime() syscall timex validation slightly more permissive
  for auxiliary clocks, to not reject syscalls based on the status field
  that do not try to modify the status field.

  This makes the ABI behavior in clock_adjtime() consistent with
  CLOCK_REALTIME"

* tag 'timers-urgent-2026-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timekeeping: Fix timex status validation for auxiliary clocks
</pre>
</div>
</content>
</entry>
<entry>
<title>timekeeping: Fix timex status validation for auxiliary clocks</title>
<updated>2026-03-04T19:05:37+00:00</updated>
<author>
<name>Miroslav Lichvar</name>
<email>mlichvar@redhat.com</email>
</author>
<published>2026-02-25T08:51:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e48a869957a70cc39b4090cd27c36a86f8db9b92'/>
<id>e48a869957a70cc39b4090cd27c36a86f8db9b92</id>
<content type='text'>
The timekeeping_validate_timex() function validates the timex status
of an auxiliary system clock even when the status is not to be changed,
which causes unexpected errors for applications that make read-only
clock_adjtime() calls, or set some other timex fields, but without
clearing the status field.

Do the AUX-specific status validation only when the modes field contains
ADJ_STATUS, i.e. the application is actually trying to change the
status. This makes the AUX-specific clock_adjtime() behavior consistent
with CLOCK_REALTIME.

Fixes: 4eca49d0b621 ("timekeeping: Prepare do_adtimex() for auxiliary clocks")
Signed-off-by: Miroslav Lichvar &lt;mlichvar@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Link: https://patch.msgid.link/20260225085231.276751-1-mlichvar@redhat.com
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The timekeeping_validate_timex() function validates the timex status
of an auxiliary system clock even when the status is not to be changed,
which causes unexpected errors for applications that make read-only
clock_adjtime() calls, or set some other timex fields, but without
clearing the status field.

Do the AUX-specific status validation only when the modes field contains
ADJ_STATUS, i.e. the application is actually trying to change the
status. This makes the AUX-specific clock_adjtime() behavior consistent
with CLOCK_REALTIME.

Fixes: 4eca49d0b621 ("timekeeping: Prepare do_adtimex() for auxiliary clocks")
Signed-off-by: Miroslav Lichvar &lt;mlichvar@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Link: https://patch.msgid.link/20260225085231.276751-1-mlichvar@redhat.com
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'sysctl-7.00-fixes-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl</title>
<updated>2026-03-04T16:21:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-03-04T16:21:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ecc64d2dc9ff9738d2a896beb68e02c2feaf9a02'/>
<id>ecc64d2dc9ff9738d2a896beb68e02c2feaf9a02</id>
<content type='text'>
Pull sysctl fix from Joel Granados:

 - Fix error when reporting jiffies converted values back to user space

   Return the converted value instead of "Invalid argument" error

* tag 'sysctl-7.00-fixes-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  time/jiffies: Fix sysctl file error on configurations where USER_HZ &lt; HZ
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull sysctl fix from Joel Granados:

 - Fix error when reporting jiffies converted values back to user space

   Return the converted value instead of "Invalid argument" error

* tag 'sysctl-7.00-fixes-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  time/jiffies: Fix sysctl file error on configurations where USER_HZ &lt; HZ
</pre>
</div>
</content>
</entry>
<entry>
<title>time/jiffies: Fix sysctl file error on configurations where USER_HZ &lt; HZ</title>
<updated>2026-03-04T12:48:31+00:00</updated>
<author>
<name>Gerd Rausch</name>
<email>gerd.rausch@oracle.com</email>
</author>
<published>2026-02-25T23:37:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=6932256d3a3764f3a5e06e2cb8603be45b6a9fef'/>
<id>6932256d3a3764f3a5e06e2cb8603be45b6a9fef</id>
<content type='text'>
Commit 2dc164a48e6fd ("sysctl: Create converter functions with two new
macros") incorrectly returns error to user space when jiffies sysctl
converter is used. The old overflow check got replaced with an
unconditional one:
     +    if (USER_HZ &lt; HZ)
     +        return -EINVAL;
which will always be true on configurations with "USER_HZ &lt; HZ".

Remove the check; it is no longer needed as clock_t_to_jiffies() returns
ULONG_MAX for the overflow case and proc_int_u2k_conv_uop() checks for
"&gt; INT_MAX" after conversion

Fixes: 2dc164a48e6fd ("sysctl: Create converter functions with two new macros")
Reported-by: Colm Harrington &lt;colm.harrington@oracle.com&gt;
Signed-off-by: Gerd Rausch &lt;gerd.rausch@oracle.com&gt;
Signed-off-by: Joel Granados &lt;joel.granados@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Commit 2dc164a48e6fd ("sysctl: Create converter functions with two new
macros") incorrectly returns error to user space when jiffies sysctl
converter is used. The old overflow check got replaced with an
unconditional one:
     +    if (USER_HZ &lt; HZ)
     +        return -EINVAL;
which will always be true on configurations with "USER_HZ &lt; HZ".

Remove the check; it is no longer needed as clock_t_to_jiffies() returns
ULONG_MAX for the overflow case and proc_int_u2k_conv_uop() checks for
"&gt; INT_MAX" after conversion

Fixes: 2dc164a48e6fd ("sysctl: Create converter functions with two new macros")
Reported-by: Colm Harrington &lt;colm.harrington@oracle.com&gt;
Signed-off-by: Gerd Rausch &lt;gerd.rausch@oracle.com&gt;
Signed-off-by: Joel Granados &lt;joel.granados@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'cgroup-for-7.0-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2026-03-03T22:25:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-03-03T22:25:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=0031c06807cfa8aa51a759ff8aa09e1aa48149af'/>
<id>0031c06807cfa8aa51a759ff8aa09e1aa48149af</id>
<content type='text'>
Pull cgroup fixes from Tejun Heo:

 - Fix circular locking dependency in cpuset partition code by
   deferring housekeeping_update() calls to a workqueue instead
   of calling them directly under cpus_read_lock

 - Fix null-ptr-deref in rebuild_sched_domains_cpuslocked() when
   generate_sched_domains() returns NULL due to kmalloc failure

 - Fix incorrect cpuset behavior for effective_xcpus in
   partition_xcpus_del() and cpuset_update_tasks_cpumask()
   in update_cpumasks_hier()

 - Fix race between task migration and cgroup iteration

* tag 'cgroup-for-7.0-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: fix null-ptr-deref in rebuild_sched_domains_cpuslocked
  cgroup/cpuset: Call housekeeping_update() without holding cpus_read_lock
  cgroup/cpuset: Defer housekeeping_update() calls from CPU hotplug to workqueue
  cgroup/cpuset: Move housekeeping_update()/rebuild_sched_domains() together
  kselftest/cgroup: Simplify test_cpuset_prs.sh by removing "S+" command
  cgroup/cpuset: Set isolated_cpus_updating only if isolated_cpus is changed
  cgroup/cpuset: Clarify exclusion rules for cpuset internal variables
  cgroup/cpuset: Fix incorrect use of cpuset_update_tasks_cpumask() in update_cpumasks_hier()
  cgroup/cpuset: Fix incorrect change to effective_xcpus in partition_xcpus_del()
  cgroup: fix race between task migration and iteration
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull cgroup fixes from Tejun Heo:

 - Fix circular locking dependency in cpuset partition code by
   deferring housekeeping_update() calls to a workqueue instead
   of calling them directly under cpus_read_lock

 - Fix null-ptr-deref in rebuild_sched_domains_cpuslocked() when
   generate_sched_domains() returns NULL due to kmalloc failure

 - Fix incorrect cpuset behavior for effective_xcpus in
   partition_xcpus_del() and cpuset_update_tasks_cpumask()
   in update_cpumasks_hier()

 - Fix race between task migration and cgroup iteration

* tag 'cgroup-for-7.0-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup/cpuset: fix null-ptr-deref in rebuild_sched_domains_cpuslocked
  cgroup/cpuset: Call housekeeping_update() without holding cpus_read_lock
  cgroup/cpuset: Defer housekeeping_update() calls from CPU hotplug to workqueue
  cgroup/cpuset: Move housekeeping_update()/rebuild_sched_domains() together
  kselftest/cgroup: Simplify test_cpuset_prs.sh by removing "S+" command
  cgroup/cpuset: Set isolated_cpus_updating only if isolated_cpus is changed
  cgroup/cpuset: Clarify exclusion rules for cpuset internal variables
  cgroup/cpuset: Fix incorrect use of cpuset_update_tasks_cpumask() in update_cpumasks_hier()
  cgroup/cpuset: Fix incorrect change to effective_xcpus in partition_xcpus_del()
  cgroup: fix race between task migration and iteration
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'timers-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2026-03-01T20:15:58+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-03-01T20:15:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=f6542af92298d8a44ab5588144086fa2d0edf391'/>
<id>f6542af92298d8a44ab5588144086fa2d0edf391</id>
<content type='text'>
Pull timer fix from Ingo Molnar:
 "Improve the inlining of jiffies_to_msecs() and jiffies_to_usecs(), for
  the common HZ=100, 250 or 1000 cases. Only use a function call for odd
  HZ values like HZ=300 that generate more code.

  The function call overhead showed up in performance tests of the TCP
  code"

* tag 'timers-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time/jiffies: Inline jiffies_to_msecs() and jiffies_to_usecs()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull timer fix from Ingo Molnar:
 "Improve the inlining of jiffies_to_msecs() and jiffies_to_usecs(), for
  the common HZ=100, 250 or 1000 cases. Only use a function call for odd
  HZ values like HZ=300 that generate more code.

  The function call overhead showed up in performance tests of the TCP
  code"

* tag 'timers-urgent-2026-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time/jiffies: Inline jiffies_to_msecs() and jiffies_to_usecs()
</pre>
</div>
</content>
</entry>
<entry>
<title>cgroup/cpuset: Call housekeeping_update() without holding cpus_read_lock</title>
<updated>2026-02-23T20:46:49+00:00</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2026-02-21T18:54:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=a84097e625f2b9e7f273161c004f34b7be63b348'/>
<id>a84097e625f2b9e7f273161c004f34b7be63b348</id>
<content type='text'>
The current cpuset partition code is able to dynamically update
the sched domains of a running system and the corresponding
HK_TYPE_DOMAIN housekeeping cpumask to perform what is essentially the
"isolcpus=domain,..." boot command line feature at run time.

The housekeeping cpumask update requires flushing a number of different
workqueues which may not be safe with cpus_read_lock() held as the
workqueue flushing code may acquire cpus_read_lock() or acquiring locks
which have locking dependency with cpus_read_lock() down the chain. Below
is an example of such circular locking problem.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.18.0-test+ #2 Tainted: G S
  ------------------------------------------------------
  test_cpuset_prs/10971 is trying to acquire lock:
  ffff888112ba4958 ((wq_completion)sync_wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x7a/0x180

  but task is already holding lock:
  ffffffffae47f450 (cpuset_mutex){+.+.}-{4:4}, at: cpuset_partition_write+0x85/0x130

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:
  -&gt; #4 (cpuset_mutex){+.+.}-{4:4}:
  -&gt; #3 (cpu_hotplug_lock){++++}-{0:0}:
  -&gt; #2 (rtnl_mutex){+.+.}-{4:4}:
  -&gt; #1 ((work_completion)(&amp;arg.work)){+.+.}-{0:0}:
  -&gt; #0 ((wq_completion)sync_wq){+.+.}-{0:0}:

  Chain exists of:
    (wq_completion)sync_wq --&gt; cpu_hotplug_lock --&gt; cpuset_mutex

  5 locks held by test_cpuset_prs/10971:
   #0: ffff88816810e440 (sb_writers#7){.+.+}-{0:0}, at: ksys_write+0xf9/0x1d0
   #1: ffff8891ab620890 (&amp;of-&gt;mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x260/0x5f0
   #2: ffff8890a78b83e8 (kn-&gt;active#187){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x2b6/0x5f0
   #3: ffffffffadf32900 (cpu_hotplug_lock){++++}-{0:0}, at: cpuset_partition_write+0x77/0x130
   #4: ffffffffae47f450 (cpuset_mutex){+.+.}-{4:4}, at: cpuset_partition_write+0x85/0x130

  Call Trace:
   &lt;TASK&gt;
     :
   touch_wq_lockdep_map+0x93/0x180
   __flush_workqueue+0x111/0x10b0
   housekeeping_update+0x12d/0x2d0
   update_parent_effective_cpumask+0x595/0x2440
   update_prstate+0x89d/0xce0
   cpuset_partition_write+0xc5/0x130
   cgroup_file_write+0x1a5/0x680
   kernfs_fop_write_iter+0x3df/0x5f0
   vfs_write+0x525/0xfd0
   ksys_write+0xf9/0x1d0
   do_syscall_64+0x95/0x520
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

To avoid such a circular locking dependency problem, we have to
call housekeeping_update() without holding the cpus_read_lock() and
cpuset_mutex. The current set of wq's flushed by housekeeping_update()
may not have work functions that call cpus_read_lock() directly,
but we are likely to extend the list of wq's that are flushed in the
future. Moreover, the current set of work functions may hold locks that
may have cpu_hotplug_lock down the dependency chain.

So housekeeping_update() is now called after releasing cpus_read_lock
and cpuset_mutex at the end of a cpuset operation. These two locks are
then re-acquired later before calling rebuild_sched_domains_locked().

To enable mutual exclusion between the housekeeping_update() call and
other cpuset control file write actions, a new top level cpuset_top_mutex
is introduced. This new mutex will be acquired first to allow sharing
variables used by both code paths. However, cpuset update from CPU
hotplug can still happen in parallel with the housekeeping_update()
call, though that should be rare in production environment.

As cpus_read_lock() is now no longer held when
tmigr_isolated_exclude_cpumask() is called, it needs to acquire it
directly.

The lockdep_is_cpuset_held() is also updated to return true if either
cpuset_top_mutex or cpuset_mutex is held.

Fixes: 03ff73510169 ("cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset")
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The current cpuset partition code is able to dynamically update
the sched domains of a running system and the corresponding
HK_TYPE_DOMAIN housekeeping cpumask to perform what is essentially the
"isolcpus=domain,..." boot command line feature at run time.

The housekeeping cpumask update requires flushing a number of different
workqueues which may not be safe with cpus_read_lock() held as the
workqueue flushing code may acquire cpus_read_lock() or acquiring locks
which have locking dependency with cpus_read_lock() down the chain. Below
is an example of such circular locking problem.

  ======================================================
  WARNING: possible circular locking dependency detected
  6.18.0-test+ #2 Tainted: G S
  ------------------------------------------------------
  test_cpuset_prs/10971 is trying to acquire lock:
  ffff888112ba4958 ((wq_completion)sync_wq){+.+.}-{0:0}, at: touch_wq_lockdep_map+0x7a/0x180

  but task is already holding lock:
  ffffffffae47f450 (cpuset_mutex){+.+.}-{4:4}, at: cpuset_partition_write+0x85/0x130

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:
  -&gt; #4 (cpuset_mutex){+.+.}-{4:4}:
  -&gt; #3 (cpu_hotplug_lock){++++}-{0:0}:
  -&gt; #2 (rtnl_mutex){+.+.}-{4:4}:
  -&gt; #1 ((work_completion)(&amp;arg.work)){+.+.}-{0:0}:
  -&gt; #0 ((wq_completion)sync_wq){+.+.}-{0:0}:

  Chain exists of:
    (wq_completion)sync_wq --&gt; cpu_hotplug_lock --&gt; cpuset_mutex

  5 locks held by test_cpuset_prs/10971:
   #0: ffff88816810e440 (sb_writers#7){.+.+}-{0:0}, at: ksys_write+0xf9/0x1d0
   #1: ffff8891ab620890 (&amp;of-&gt;mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x260/0x5f0
   #2: ffff8890a78b83e8 (kn-&gt;active#187){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x2b6/0x5f0
   #3: ffffffffadf32900 (cpu_hotplug_lock){++++}-{0:0}, at: cpuset_partition_write+0x77/0x130
   #4: ffffffffae47f450 (cpuset_mutex){+.+.}-{4:4}, at: cpuset_partition_write+0x85/0x130

  Call Trace:
   &lt;TASK&gt;
     :
   touch_wq_lockdep_map+0x93/0x180
   __flush_workqueue+0x111/0x10b0
   housekeeping_update+0x12d/0x2d0
   update_parent_effective_cpumask+0x595/0x2440
   update_prstate+0x89d/0xce0
   cpuset_partition_write+0xc5/0x130
   cgroup_file_write+0x1a5/0x680
   kernfs_fop_write_iter+0x3df/0x5f0
   vfs_write+0x525/0xfd0
   ksys_write+0xf9/0x1d0
   do_syscall_64+0x95/0x520
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

To avoid such a circular locking dependency problem, we have to
call housekeeping_update() without holding the cpus_read_lock() and
cpuset_mutex. The current set of wq's flushed by housekeeping_update()
may not have work functions that call cpus_read_lock() directly,
but we are likely to extend the list of wq's that are flushed in the
future. Moreover, the current set of work functions may hold locks that
may have cpu_hotplug_lock down the dependency chain.

So housekeeping_update() is now called after releasing cpus_read_lock
and cpuset_mutex at the end of a cpuset operation. These two locks are
then re-acquired later before calling rebuild_sched_domains_locked().

To enable mutual exclusion between the housekeeping_update() call and
other cpuset control file write actions, a new top level cpuset_top_mutex
is introduced. This new mutex will be acquired first to allow sharing
variables used by both code paths. However, cpuset update from CPU
hotplug can still happen in parallel with the housekeeping_update()
call, though that should be rare in production environment.

As cpus_read_lock() is now no longer held when
tmigr_isolated_exclude_cpumask() is called, it needs to acquire it
directly.

The lockdep_is_cpuset_held() is also updated to return true if either
cpuset_top_mutex or cpuset_mutex is held.

Fixes: 03ff73510169 ("cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset")
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses</title>
<updated>2026-02-22T16:26:33+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-22T07:46:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=189f164e573e18d9f8876dbd3ad8fcbe11f93037'/>
<id>189f164e573e18d9f8876dbd3ad8fcbe11f93037</id>
<content type='text'>
Conversion performed via this Coccinelle script:

  // SPDX-License-Identifier: GPL-2.0-only
  // Options: --include-headers-for-types --all-includes --include-headers --keep-comments
  virtual patch

  @gfp depends on patch &amp;&amp; !(file in "tools") &amp;&amp; !(file in "samples")@
  identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
 		    kzalloc_obj,kzalloc_objs,kzalloc_flex,
		    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
		    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
  @@

  	ALLOC(...
  -		, GFP_KERNEL
  	)

  $ make coccicheck MODE=patch COCCI=gfp.cocci

Build and boot tested x86_64 with Fedora 42's GCC and Clang:

Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Conversion performed via this Coccinelle script:

  // SPDX-License-Identifier: GPL-2.0-only
  // Options: --include-headers-for-types --all-includes --include-headers --keep-comments
  virtual patch

  @gfp depends on patch &amp;&amp; !(file in "tools") &amp;&amp; !(file in "samples")@
  identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex,
 		    kzalloc_obj,kzalloc_objs,kzalloc_flex,
		    kvmalloc_obj,kvmalloc_objs,kvmalloc_flex,
		    kvzalloc_obj,kvzalloc_objs,kvzalloc_flex};
  @@

  	ALLOC(...
  -		, GFP_KERNEL
  	)

  $ make coccicheck MODE=patch COCCI=gfp.cocci

Build and boot tested x86_64 with Fedora 42's GCC and Clang:

Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01
Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</pre>
</div>
</content>
</entry>
</feed>
