linux-mips

Re: Troubles with TLB refills

To: ldavies@oz.agile.tv
Subject: Re: Troubles with TLB refills
From: Ralf Baechle <ralf@oss.sgi.com>
Date: Mon, 5 Mar 2001 11:49:27 +0100
Cc: linux-mips@oss.sgi.com
In-reply-to: <3AA30A91.B5842678@agile.tv>; from ldavies@agile.tv on Mon, Mar 05, 2001 at 01:40:01PM +1000
References: <3AA30A91.B5842678@agile.tv>
Sender: owner-linux-mips@oss.sgi.com
User-agent: Mutt/1.2.5i
On Mon, Mar 05, 2001 at 01:40:01PM +1000, Liam Davies wrote:

> I am trying to get the 2.4 kernel up and running on the Cobalt boxes.
> At the moment I am trying to get the initial transition from kernel
> to user mode working.
> 
> The elf loader is trying to put stuff in the stack for the new user
> process and *each* call to NEW_AUX_ENTRY is generating a page fault
> that cannot be resolved. The address generated by the BadVAddr on the
> TLBS exception is not correct. Also, I never receive a TLBrefill
> exception on the accesses. It is using the except_vec0_nevada handler.

The fault address 0x10004f4c looks wrong; NEW_AUX_ENTRY should access
something near the top of the stack, that is, an address a few bytes below
0x80000000.  I wonder if this behaviour is related to the following CPU
bug, which by the way was never acknowledged by QED but was identified
independently by several Cobalt people back at the time.

[head.S]
        /* TLB refill, EXL == 0, R52x0 "Nevada" version */
        /*
         * This version has a bug workaround for the Nevada.  It seems
         * as if under certain circumstances the move from cp0_context
         * might produce a bogus result when the mfc0 instruction and
         * its consumer are in a different cacheline or a load instruction,
         * probably any memory reference, is between them.  This is
         * potentially slower than the R4000 version, so we use this
         * special version.
         */
        .set    noreorder
        .set    noat
        LEAF(except_vec0_nevada)
        .set    mips3
        mfc0    k0, CP0_BADVADDR                # Get faulting address
        srl     k0, k0, 22                      # get pgd only bits
        lw      k1, current_pgd                 # get pgd pointer
        sll     k0, k0, 2
        addu    k1, k1, k0                      # add in pgd offset
        lw      k1, (k1)
        mfc0    k0, CP0_CONTEXT                 # get context reg
        srl     k0, k0, 1                       # get pte offset
        and     k0, k0, 0xff8
        addu    k1, k1, k0                      # add in offset
        lw      k0, 0(k1)                       # get even pte
        lw      k1, 4(k1)                       # get odd pte
        srl     k0, k0, 6                       # convert to entrylo0
        mtc0    k0, CP0_ENTRYLO0                # load it
        srl     k1, k1, 6                       # convert to entrylo1
        mtc0    k1, CP0_ENTRYLO1                # load it
        nop                                     # QED specified nops
        nop
        tlbwr                                   # write random tlb entry
        nop                                     # traditional nop
        eret                                    # return from trap
        END(except_vec0_nevada)
[...]
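
To make the address arithmetic in the handler easier to check by hand, here is a rough C model of what the shifts and masks compute. It is only a sketch under assumptions: 32-bit addresses, 4 KB pages, 4-byte PTEs, and the PTEBase field of cp0_context taken as zero (the 0xff8 mask in the handler discards it anyway). The function names are made up for illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset into the pgd: "srl k0, k0, 22" followed by "sll k0, k0, 2". */
uint32_t pgd_byte_offset(uint32_t badvaddr)
{
    return (badvaddr >> 22) << 2;
}

/* What a correct cp0_context should hold for this fault address:
 * VA[31:13] (BadVPN2) placed in bits 22..4.  PTEBase assumed to be 0. */
uint32_t expected_context(uint32_t badvaddr)
{
    return ((badvaddr >> 13) << 4) & 0x007ffff0u;
}

/* Byte offset of the even/odd PTE pair inside the page table:
 * "srl k0, k0, 1" followed by "and k0, k0, 0xff8". */
uint32_t pte_pair_byte_offset(uint32_t context)
{
    return (context >> 1) & 0xff8u;
}

/* "srl k0, k0, 6": the handler drops the low six bits of the PTE
 * before writing it to EntryLo0 / EntryLo1. */
uint32_t pte_to_entrylo(uint32_t pte)
{
    return pte >> 6;
}
```

For the reported address 0x10004f4c this gives a pgd offset of 0x100 and a PTE pair offset of 0x10; for an address near the top of the user stack, say 0x7fffeff0, it gives 0x7fc and 0xff8. If the mfc0 from cp0_context returns a bogus value, only the second offset goes wrong, so the handler would load a PTE pair for a different page.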

This exception handler has been modified since the version tested in the
Cobalt Qube and I'm not sure if the bug workaround actually got tested
since then.

(Bonus points to whoever writes a good probe for this bug!)

handle_page_fault got called and printed something; therefore the
exception handler cannot possibly have been trashed.  do_page_fault gets
called via the generic exception handler.  The TLB vectors there are only
taken if there is already a TLB entry matching the address, which means a
refill must have written one earlier.  Therefore your theory about no TLB
refill exception cannot be right.  The TLB dump only displays entries
where at least one of the entry0 / entry1 entries is valid, therefore you
get an empty dump; maybe that made you believe you didn't get a TLB reload
exception.
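
The argument above can be written down as a small C model. This is a sketch, not kernel code: the names are hypothetical, the vector choice follows the usual R4000-style rule (no matching entry and EXL == 0 goes to the refill vector, everything else goes through the general vector), and the dump filter assumes the valid bit is bit 1 of EntryLo, since the actual dump routine is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum tlb_vector { VEC_REFILL, VEC_GENERAL };

/* Which vector a TLB fault takes.  A matching but invalid entry
 * raises a TLB invalid exception through the general vector. */
enum tlb_vector tlb_fault_vector(bool entry_matches, bool exl)
{
    if (!entry_matches && !exl)
        return VEC_REFILL;
    return VEC_GENERAL;
}

#define ENTRYLO_V 0x2u   /* valid bit in EntryLo0 / EntryLo1 (assumed) */

/* Assumed dump filter: an entry is printed only if at least one
 * of its two halves is marked valid. */
bool dump_would_show(uint32_t entrylo0, uint32_t entrylo1)
{
    return ((entrylo0 | entrylo1) & ENTRYLO_V) != 0;
}
```

Under these assumptions, an entry that the refill handler wrote with both halves invalid still matches the address, so the retried access lands on the general vector, while the same entry is skipped by the dump.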

  Ralf
