Re: [GIT PULL] x86/mm changes for v3.9-rc1

To: Konrad Rzeszutek Wilk <>
Subject: Re: [GIT PULL] x86/mm changes for v3.9-rc1
From: Dave Hansen <>
Date: Fri, 22 Feb 2013 09:30:28 -0800
Cc: "H. Peter Anvin" <>, Linus Torvalds <>, "David S. Miller" <>, "H. Peter Anvin" <>, "Rafael J. Wysocki" <>, Alexander Duyck <>, Andrea Arcangeli <>, Andrew Morton <>, Andrzej Pietrasiewicz <>, Arnd Bergmann <>, Borislav Petkov <>, Borislav Petkov <>, Christoph Lameter <>, Daniel J Blueman <>, Eric Biederman <>, Fenghua Yu <>, Frederic Weisbecker <>, Gleb Natapov <>, Gokul Caushik <>, "H. J. Lu" <>, Hugh Dickins <>, Ingo Molnar <>, Ingo Molnar <>, Jacob Shin <>, Jamie Lokier <>, Jarkko Sakkinen <>, Jeremy Fitzhardinge <>, Joe Millenbach <>, Joerg Roedel <>, Johannes Weiner <>, Josh Triplett <>, Kyungmin Park <>, Lee Schermerhorn <>, Len Brown <>, Linux Kernel Mailing List <>, Marcelo Tosatti <>, Marek Szyprowski <>, Matt Fleming <>, Mel Gorman <>, Paul Turner <>, Pavel Machek <>, Pekka Enberg <>, Peter Zijlstra <>, Ralf Baechle <>, Rik van Riel <>, Rob Landley <>, Russell King <>, Rusty Russell <>, Shuah Khan <>, Shuah Khan <>, Stefano Stabellini <>, Steven Rostedt <>, Thomas Gleixner <>, Ville Syrjälä <>, Yasuaki Ishimatsu <>, Yinghai Lu <>, Zachary Amsden <>
List-id: linux-mips <>
List-software: Ecartis version 1.0.0
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0

On 02/22/2013 08:55 AM, Konrad Rzeszutek Wilk wrote:
> On Thu, Feb 21, 2013 at 04:34:06PM -0800, H. Peter Anvin wrote:
>> Hi Linus,
>> This is a huge set of several partly interrelated (and concurrently
>> developed) changes, which is why the branch history is messier than
>> one would like.
>> The *really* big items are two humonguous patchsets mostly developed
>> by Yinghai Lu at my request, which completely revamps the way we
>> create initial page tables.  In particular, rather than estimating how
>> much memory we will need for page tables and then build them into that
>> memory -- a calculation that has shown to be incredibly fragile -- we
>> now build them (on 64 bits) with the aid of a "pseudo-linear mode" --
>> a #PF handler which creates temporary page tables on demand.
>> This has several advantages:
>> 1. It makes it much easier to support things that need access to
>>    data very early (a followon patchset uses this to load microcode
>>    way early in the kernel startup).
>> 2. It allows the kernel and all the kernel data objects to be invoked
>>    from above the 4 GB limit.  This allows kdump to work on very large
>>    systems.
>> 3. It greatly reduces the difference between Xen and native (Xen's
>>    equivalent of the #PF handler are the temporary page tables created
>>    by the domain builder), eliminating a bunch of fragile hooks.
>> The patch series also gets us a bit closer to W^X.
>> Additional work in this pull is the 64-bit get_user() work which you
>> were also involved with, and a bunch of cleanups/speedups to
>> __phys_addr()/__pa().
> Looking at figuring out which of the patches in the branch did this, but
> with this merge I am getting a crash with a very simple PV guest (booted with
> one 1G):
> Call Trace:
>   [<ffffffff8103feba>] xen_get_user_pgd+0x5a  <--
>   [<ffffffff8103feba>] xen_get_user_pgd+0x5a 
>   [<ffffffff81042d27>] xen_write_cr3+0x77 
>   [<ffffffff81ad2d21>] init_mem_mapping+0x1f9 
>   [<ffffffff81ac293f>] setup_arch+0x742 
>   [<ffffffff81666d71>] printk+0x48 
>   [<ffffffff81abcd62>] start_kernel+0x90 
>   [<ffffffff8109416b>] __add_preferred_console.clone.1+0x9b 
>   [<ffffffff81abc5f7>] x86_64_start_reservations+0x2a 
>   [<ffffffff81abf0c7>] xen_start_kernel+0x564 


You're probably hitting the new BUG_ON() in __phys_addr().  It's
intended to detect places where someone is doing a __pa()/__phys_addr()
on an address that's outside the kernel's identity mapping.
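
To make the kind of check described above concrete, here is a minimal
user-space sketch in C. It is not the kernel's __phys_addr(): every
name (model_phys_addr, the MODEL_* constants), the chosen layout
values, the 1G RAM limit, and the assumption that phys_base is zero are
illustrative only. It reports failure instead of calling BUG_ON(), so
the behaviour can be checked in an ordinary program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative layout constants; assumptions, not taken from the thread. */
#define MODEL_PAGE_OFFSET       0xffff880000000000ULL /* start of the direct map */
#define MODEL_START_KERNEL_MAP  0xffffffff80000000ULL /* start of the kernel image map */
#define MODEL_KERNEL_IMAGE_SIZE (512ULL << 20)        /* size of the kernel image map */
#define MODEL_MAX_PHYS          (1ULL << 30)          /* pretend the guest has 1G of RAM */

/*
 * Hypothetical model of a checked virtual-to-physical translation.
 * Returns true and stores the physical address in *paddr when vaddr
 * lies inside one of the two modelled mappings. Returns false where a
 * debugging kernel would trip its BUG_ON().
 */
static bool model_phys_addr(uint64_t vaddr, uint64_t *paddr)
{
    if (vaddr >= MODEL_START_KERNEL_MAP) {
        /* Kernel image mapping; phys_base is assumed to be 0 here. */
        uint64_t off = vaddr - MODEL_START_KERNEL_MAP;
        if (off >= MODEL_KERNEL_IMAGE_SIZE)
            return false;
        *paddr = off;
        return true;
    }
    if (vaddr >= MODEL_PAGE_OFFSET) {
        /* Direct map of RAM; reject offsets beyond the modelled memory. */
        uint64_t off = vaddr - MODEL_PAGE_OFFSET;
        if (off >= MODEL_MAX_PHYS)
            return false;
        *paddr = off;
        return true;
    }
    /* User-space or otherwise unmapped address: nothing to translate. */
    return false;
}
```

In this model, a pointer into the direct map or the kernel image
translates cleanly, while a user-space address or one past the end of
the modelled RAM is rejected. That rejection is the case a BUG_ON() of
this kind is meant to catch.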

There are a lot of __pa() calls around there, but from the looks of it,
it's this code:

static pgd_t *xen_get_user_pgd(pgd_t *pgd)
{
...
        if (offset < pgd_index(USER_LIMIT)) {
                struct page *page = virt_to_page(pgd_page);
...

I'm a bit fuzzy on exactly what the code is trying to do here.  It could
mean either that the identity mapping isn't set up enough yet, or that
__pa() is getting called on a bogus address.

I'm especially fuzzy on why we'd be calling anything that's looking at
userspace pagetables (xen_get_user_pgd() ??) this early in boot.
