<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux.git/arch/x86/kernel/kvmclock.c, branch v3.10</title>
<subtitle>Linux kernel source tree</subtitle>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/'/>
<entry>
<title>x86: kvmclock: zero initialize pvclock shared memory area</title>
<updated>2013-06-19T10:25:28+00:00</updated>
<author>
<name>Igor Mammedov</name>
<email>imammedo@redhat.com</email>
</author>
<published>2013-06-10T16:31:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=07868fc6aaf57847b0f3a3d53086b7556eb83f4a'/>
<id>07868fc6aaf57847b0f3a3d53086b7556eb83f4a</id>
<content type='text'>
kernel might hung in pvclock_clocksource_read() due to
uninitialized memory might contain odd version value in
following cycle:

        do {
                version = __pvclock_read_cycles(src, &amp;ret, &amp;flags);
        } while ((src-&gt;version &amp; 1) || version != src-&gt;version);

if secondary kvmclock is accessed before it's registered with kvm.

Clear garbage in pvclock shared memory area right after it's
allocated to avoid this issue.

Ref: https://bugzilla.kernel.org/show_bug.cgi?id=59521
Signed-off-by: Igor Mammedov &lt;imammedo@redhat.com&gt;
[See BZ for analysis.  We may want a different fix for 3.11, but
 this is the safest for now - Paolo]
Cc: &lt;stable@vger.kernel.org&gt; # 3.8
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
kernel might hung in pvclock_clocksource_read() due to
uninitialized memory might contain odd version value in
following cycle:

        do {
                version = __pvclock_read_cycles(src, &amp;ret, &amp;flags);
        } while ((src-&gt;version &amp; 1) || version != src-&gt;version);

if secondary kvmclock is accessed before it's registered with kvm.

Clear garbage in pvclock shared memory area right after it's
allocated to avoid this issue.

Ref: https://bugzilla.kernel.org/show_bug.cgi?id=59521
Signed-off-by: Igor Mammedov &lt;imammedo@redhat.com&gt;
[See BZ for analysis.  We may want a different fix for 3.11, but
 this is the safest for now - Paolo]
Cc: &lt;stable@vger.kernel.org&gt; # 3.8
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'master' into queue</title>
<updated>2013-03-04T23:10:32+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2013-03-04T23:10:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ee2c25efdd46d7ed5605d6fe877bdf4b47a4ab2e'/>
<id>ee2c25efdd46d7ed5605d6fe877bdf4b47a4ab2e</id>
<content type='text'>
* master: (15791 commits)
  Linux 3.9-rc1
  btrfs/raid56: Add missing #include &lt;linux/vmalloc.h&gt;
  fix compat_sys_rt_sigprocmask()
  SUNRPC: One line comment fix
  ext4: enable quotas before orphan cleanup
  ext4: don't allow quota mount options when quota feature enabled
  ext4: fix a warning from sparse check for ext4_dir_llseek
  ext4: convert number of blocks to clusters properly
  ext4: fix possible memory leak in ext4_remount()
  jbd2: fix ERR_PTR dereference in jbd2__journal_start
  metag: Provide dma_get_sgtable()
  metag: prom.h: remove declaration of metag_dt_memblock_reserve()
  metag: copy devicetree to non-init memory
  metag: cleanup metag_ksyms.c includes
  metag: move mm/init.c exports out of metag_ksyms.c
  metag: move usercopy.c exports out of metag_ksyms.c
  metag: move setup.c exports out of metag_ksyms.c
  metag: move kick.c exports out of metag_ksyms.c
  metag: move traps.c exports out of metag_ksyms.c
  metag: move irq enable out of irqflags.h on SMP
  ...

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;

Conflicts:
	arch/x86/kernel/kvmclock.c
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* master: (15791 commits)
  Linux 3.9-rc1
  btrfs/raid56: Add missing #include &lt;linux/vmalloc.h&gt;
  fix compat_sys_rt_sigprocmask()
  SUNRPC: One line comment fix
  ext4: enable quotas before orphan cleanup
  ext4: don't allow quota mount options when quota feature enabled
  ext4: fix a warning from sparse check for ext4_dir_llseek
  ext4: convert number of blocks to clusters properly
  ext4: fix possible memory leak in ext4_remount()
  jbd2: fix ERR_PTR dereference in jbd2__journal_start
  metag: Provide dma_get_sgtable()
  metag: prom.h: remove declaration of metag_dt_memblock_reserve()
  metag: copy devicetree to non-init memory
  metag: cleanup metag_ksyms.c includes
  metag: move mm/init.c exports out of metag_ksyms.c
  metag: move usercopy.c exports out of metag_ksyms.c
  metag: move setup.c exports out of metag_ksyms.c
  metag: move kick.c exports out of metag_ksyms.c
  metag: move traps.c exports out of metag_ksyms.c
  metag: move irq enable out of irqflags.h on SMP
  ...

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;

Conflicts:
	arch/x86/kernel/kvmclock.c
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: kvmclock: Do not setup kvmclock vsyscall in the absence of that clock</title>
<updated>2013-02-27T11:19:18+00:00</updated>
<author>
<name>Jan Kiszka</name>
<email>jan.kiszka@siemens.com</email>
</author>
<published>2013-02-23T16:05:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=fe1140cc369410a9c206fdb7aaabc644bd213dc2'/>
<id>fe1140cc369410a9c206fdb7aaabc644bd213dc2</id>
<content type='text'>
This fixes boot lockups with "no-kvmclock", when the host is not
exposing this particular feature (QEMU: -cpu ...,-kvmclock) or when
the kvmclock initialization failed for whatever reason.

Reviewed-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Jan Kiszka &lt;jan.kiszka@siemens.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This fixes boot lockups with "no-kvmclock", when the host is not
exposing this particular feature (QEMU: -cpu ...,-kvmclock) or when
the kvmclock initialization failed for whatever reason.

Reviewed-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Jan Kiszka &lt;jan.kiszka@siemens.com&gt;
Signed-off-by: Gleb Natapov &lt;gleb@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm</title>
<updated>2013-02-24T21:07:18+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-02-24T21:07:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=89f883372fa60f604d136924baf3e89ff1870e9e'/>
<id>89f883372fa60f604d136924baf3e89ff1870e9e</id>
<content type='text'>
Pull KVM updates from Marcelo Tosatti:
 "KVM updates for the 3.9 merge window, including x86 real mode
  emulation fixes, stronger memory slot interface restrictions, mmu_lock
  spinlock hold time reduction, improved handling of large page faults
  on shadow, initial APICv HW acceleration support, s390 channel IO
  based virtio, amongst others"

* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  Revert "KVM: MMU: lazily drop large spte"
  x86: pvclock kvm: align allocation size to page size
  KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
  x86 emulator: fix parity calculation for AAD instruction
  KVM: PPC: BookE: Handle alignment interrupts
  booke: Added DBCR4 SPR number
  KVM: PPC: booke: Allow multiple exception types
  KVM: PPC: booke: use vcpu reference from thread_struct
  KVM: Remove user_alloc from struct kvm_memory_slot
  KVM: VMX: disable apicv by default
  KVM: s390: Fix handling of iscs.
  KVM: MMU: cleanup __direct_map
  KVM: MMU: remove pt_access in mmu_set_spte
  KVM: MMU: cleanup mapping-level
  KVM: MMU: lazily drop large spte
  KVM: VMX: cleanup vmx_set_cr0().
  KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
  KVM: VMX: disable SMEP feature when guest is in non-paging mode
  KVM: Remove duplicate text in api.txt
  Revert "KVM: MMU: split kvm_mmu_free_page"
  ...
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull KVM updates from Marcelo Tosatti:
 "KVM updates for the 3.9 merge window, including x86 real mode
  emulation fixes, stronger memory slot interface restrictions, mmu_lock
  spinlock hold time reduction, improved handling of large page faults
  on shadow, initial APICv HW acceleration support, s390 channel IO
  based virtio, amongst others"

* tag 'kvm-3.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
  Revert "KVM: MMU: lazily drop large spte"
  x86: pvclock kvm: align allocation size to page size
  KVM: nVMX: Remove redundant get_vmcs12 from nested_vmx_exit_handled_msr
  x86 emulator: fix parity calculation for AAD instruction
  KVM: PPC: BookE: Handle alignment interrupts
  booke: Added DBCR4 SPR number
  KVM: PPC: booke: Allow multiple exception types
  KVM: PPC: booke: use vcpu reference from thread_struct
  KVM: Remove user_alloc from struct kvm_memory_slot
  KVM: VMX: disable apicv by default
  KVM: s390: Fix handling of iscs.
  KVM: MMU: cleanup __direct_map
  KVM: MMU: remove pt_access in mmu_set_spte
  KVM: MMU: cleanup mapping-level
  KVM: MMU: lazily drop large spte
  KVM: VMX: cleanup vmx_set_cr0().
  KVM: VMX: add missing exit names to VMX_EXIT_REASONS array
  KVM: VMX: disable SMEP feature when guest is in non-paging mode
  KVM: Remove duplicate text in api.txt
  Revert "KVM: MMU: split kvm_mmu_free_page"
  ...
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: pvclock kvm: align allocation size to page size</title>
<updated>2013-02-19T02:05:19+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2013-02-19T01:58:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=ed55705dd5008b408c48a8459b8b34b01f3de985'/>
<id>ed55705dd5008b408c48a8459b8b34b01f3de985</id>
<content type='text'>
To match whats mapped via vsyscalls to userspace.

Reported-by: Peter Hurley &lt;peter@hurleysoftware.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
To match whats mapped via vsyscalls to userspace.

Reported-by: Peter Hurley &lt;peter@hurleysoftware.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86, kvm: Fix kvm's use of __pa() on percpu areas</title>
<updated>2013-01-26T00:34:55+00:00</updated>
<author>
<name>Dave Hansen</name>
<email>dave@linux.vnet.ibm.com</email>
</author>
<published>2013-01-22T21:24:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=5dfd486c4750c9278c63fa96e6e85bdd2fb58e9d'/>
<id>5dfd486c4750c9278c63fa96e6e85bdd2fb58e9d</id>
<content type='text'>
In short, it is illegal to call __pa() on an address holding
a percpu variable.  This replaces those __pa() calls with
slow_virt_to_phys().  All of the cases in this patch are
in boot time (or CPU hotplug time at worst) code, so the
slow pagetable walking in slow_virt_to_phys() is not expected
to have a performance impact.

The times when this actually matters are pretty obscure
(certain 32-bit NUMA systems), but it _does_ happen.  It is
important to keep KVM guests working on these systems because
the real hardware is getting harder and harder to find.

This bug manifested first by me seeing a plain hang at boot
after this message:

	CPU 0 irqstacks, hard=f3018000 soft=f301a000

or, sometimes, it would actually make it out to the console:

[    0.000000] BUG: unable to handle kernel paging request at ffffffff

I eventually traced it down to the KVM async pagefault code.
This can be worked around by disabling that code either at
compile-time, or on the kernel command-line.

The kvm async pagefault code was injecting page faults in
to the guest which the guest misinterpreted because its
"reason" was not being properly sent from the host.

The guest passes a physical address of an per-cpu async page
fault structure via an MSR to the host.  Since __pa() is
broken on percpu data, the physical address it sent was
bascially bogus and the host went scribbling on random data.
The guest never saw the real reason for the page fault (it
was injected by the host), assumed that the kernel had taken
a _real_ page fault, and panic()'d.  The behavior varied,
though, depending on what got corrupted by the bad write.

Signed-off-by: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20130122212435.4905663F@kernel.stglabs.ibm.com
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Reviewed-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
In short, it is illegal to call __pa() on an address holding
a percpu variable.  This replaces those __pa() calls with
slow_virt_to_phys().  All of the cases in this patch are
in boot time (or CPU hotplug time at worst) code, so the
slow pagetable walking in slow_virt_to_phys() is not expected
to have a performance impact.

The times when this actually matters are pretty obscure
(certain 32-bit NUMA systems), but it _does_ happen.  It is
important to keep KVM guests working on these systems because
the real hardware is getting harder and harder to find.

This bug manifested first by me seeing a plain hang at boot
after this message:

	CPU 0 irqstacks, hard=f3018000 soft=f301a000

or, sometimes, it would actually make it out to the console:

[    0.000000] BUG: unable to handle kernel paging request at ffffffff

I eventually traced it down to the KVM async pagefault code.
This can be worked around by disabling that code either at
compile-time, or on the kernel command-line.

The kvm async pagefault code was injecting page faults in
to the guest which the guest misinterpreted because its
"reason" was not being properly sent from the host.

The guest passes a physical address of an per-cpu async page
fault structure via an MSR to the host.  Since __pa() is
broken on percpu data, the physical address it sent was
bascially bogus and the host went scribbling on random data.
The guest never saw the real reason for the page fault (it
was injected by the host), assumed that the kernel had taken
a _real_ page fault, and panic()'d.  The behavior varied,
though, depending on what got corrupted by the bad write.

Signed-off-by: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20130122212435.4905663F@kernel.stglabs.ibm.com
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Reviewed-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: kvm guest: pvclock vsyscall support</title>
<updated>2012-11-28T01:29:10+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2012-11-28T01:28:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=3dc4f7cfb7441e5e0fed3a02fc81cdaabd28300a'/>
<id>3dc4f7cfb7441e5e0fed3a02fc81cdaabd28300a</id>
<content type='text'>
Hook into generic pvclock vsyscall code, with the aim to
allow userspace to have visibility into pvclock data.

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Hook into generic pvclock vsyscall code, with the aim to
allow userspace to have visibility into pvclock data.

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: kvmclock: allocate pvclock shared memory area</title>
<updated>2012-11-28T01:29:05+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2012-11-28T01:28:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=7069ed67635b8d574541af426d752cd7fbd465a6'/>
<id>7069ed67635b8d574541af426d752cd7fbd465a6</id>
<content type='text'>
We want to expose the pvclock shared memory areas, which
the hypervisor periodically updates, to userspace.

For a linear mapping from userspace, it is necessary that
entire page sized regions are used for array of pvclock
structures.

There is no such guarantee with per cpu areas, therefore move
to memblock_alloc based allocation.

Acked-by: Glauber Costa &lt;glommer@parallels.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
We want to expose the pvclock shared memory areas, which
the hypervisor periodically updates, to userspace.

For a linear mapping from userspace, it is necessary that
entire page sized regions are used for array of pvclock
structures.

There is no such guarantee with per cpu areas, therefore move
to memblock_alloc based allocation.

Acked-by: Glauber Costa &lt;glommer@parallels.com&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>x86: kvmclock: remove check_and_clear_guest_paused warning</title>
<updated>2012-06-12T02:18:33+00:00</updated>
<author>
<name>Marcelo Tosatti</name>
<email>mtosatti@redhat.com</email>
</author>
<published>2012-06-12T02:18:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=e32025a56403df4386cd61a741c0a36afe79ae8a'/>
<id>e32025a56403df4386cd61a741c0a36afe79ae8a</id>
<content type='text'>
CPU offline path calls the hrtimer interrupt handler with interrupts
disabled, without touching preempt_count, triggering this warning.

Remove the warning since it is supposed to be used from hrtimer
interrupt context only.

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
CPU offline path calls the hrtimer interrupt handler with interrupts
disabled, without touching preempt_count, triggering this warning.

Remove the warning since it is supposed to be used from hrtimer
interrupt context only.

Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>kvmclock: remove unneeded EXPORT macro</title>
<updated>2012-04-08T09:49:54+00:00</updated>
<author>
<name>Eric B Munson</name>
<email>emunson@mgebm.net</email>
</author>
<published>2012-03-15T22:16:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.tavy.me/linux.git/commit/?id=248997095d652576f1213028a95ca5fff85d089f'/>
<id>248997095d652576f1213028a95ca5fff85d089f</id>
<content type='text'>
check_and_clear_guest_paused does not need to be exported as it isn't used
by any modules, remove the export.

Signed-off-by: Eric B Munson &lt;emunson@mgebm.net&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
check_and_clear_guest_paused does not need to be exported as it isn't used
by any modules, remove the export.

Signed-off-by: Eric B Munson &lt;emunson@mgebm.net&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</pre>
</div>
</content>
</entry>
</feed>
