On Mon, Mar 05, 2001 at 01:40:01PM +1000, Liam Davies wrote:
> I am trying to get the 2.4 kernel up and running on the Cobalt boxes.
> At the moment I am trying to get the initial transition from kernel
> to user mode working.
>
> The elf loader is trying to put stuff in the stack for the new user
> process and *each* call to NEW_AUX_ENTRY is generating a page fault
> that cannot be resolved. The address generated by the BadVAddr on the
> TLBS exception is not correct. Also, I never receive a TLBrefill
> exception on the accesses. It is using the except_vec0_nevada handler.
The fault address 0x10004f4c looks wrong; NEW_AUX_ENTRY should access
something near the top of stack, that is a value a few bytes below
0x80000000. I wonder if this behaviour is related to this CPU bug,
one which btw. was never acknowledged by QED but was identified
independently by several Cobalt people back at the time.
[head.S]
/* TLB refill, EXL == 0, R52x0 "Nevada" version */
/*
* This version has a bug workaround for the Nevada. It seems
* as if under certain circumstances the move from cp0_context
* might produce a bogus result when the mfc0 instruction and
its consumer are in a different cacheline or a load instruction,
probably any memory reference, is between them. This is
potentially slower than the R4000 version, so we use this
* special version.
*/
	.set	noreorder
	.set	noat
	LEAF(except_vec0_nevada)
	.set	mips3
	mfc0	k0, CP0_BADVADDR	# Get faulting address
	srl	k0, k0, 22		# get pgd only bits
	lw	k1, current_pgd		# get pgd pointer
	sll	k0, k0, 2
	addu	k1, k1, k0		# add in pgd offset
	lw	k1, (k1)
	mfc0	k0, CP0_CONTEXT		# get context reg
	srl	k0, k0, 1		# get pte offset
	and	k0, k0, 0xff8
	addu	k1, k1, k0		# add in offset
	lw	k0, 0(k1)		# get even pte
	lw	k1, 4(k1)		# get odd pte
	srl	k0, k0, 6		# convert to entrylo0
	mtc0	k0, CP0_ENTRYLO0	# load it
	srl	k1, k1, 6		# convert to entrylo1
	mtc0	k1, CP0_ENTRYLO1	# load it
	nop				# QED specified nops
	nop
	tlbwr				# write random tlb entry
	nop				# traditional nop
	eret				# return from trap
	END(except_vec0_nevada)
[...]
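To make the index arithmetic in the handler easier to check by hand, here is
a small C model of it. This is only a sketch, not kernel code: it assumes
4 KB pages, 4-byte PTEs, and a Context register with PTEBase equal to zero
and BadVPN2 (virtual address bits 31..13) in bits 22..4. All function names
are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers modelling the arithmetic of except_vec0_nevada. */

/* srl k0, k0, 22 ; sll k0, k0, 2 : byte offset of the entry within the pgd */
static uint32_t pgd_byte_offset(uint32_t badvaddr)
{
    return (badvaddr >> 22) << 2;
}

/* Assumed Context layout: BadVPN2 = VA[31:13] in bits 22..4, PTEBase = 0 */
static uint32_t context_from_vaddr(uint32_t vaddr)
{
    return (vaddr >> 13) << 4;
}

/* srl k0, k0, 1 ; and k0, k0, 0xff8 : byte offset of the even/odd pte pair */
static uint32_t pte_pair_byte_offset(uint32_t context)
{
    return (context >> 1) & 0xff8;
}
```

Under these assumptions the reported fault address 0x10004f4c gives a pgd
offset of 0x100 and a pte pair offset of 0x10, whereas an address just below
0x80000000 such as 0x7fffeff0 (picked arbitrarily) gives 0x7fc and 0xff8.
If the value read from cp0_context is bogus, only the second offset is
affected.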
This exception handler has been modified since the version tested in the
Cobalt Qube and I'm not sure if the bug workaround actually got tested
since then.
(Bonus points to whoever writes a good probe for this bug!)
handle_page_fault got called and printed something; therefore the
exception handler cannot possibly have been trashed. do_page_fault gets
called via the generic exception handler. The TLB exceptions dispatched
there are only taken if there is an entry matching the address in the
TLB, so an entry for that address must already have been written.
Therefore your theory about no TLB refill exception cannot be right. The
TLB dump only displays entries where at least one of the entry0 / entry1
entries is valid, therefore you get an empty dump; maybe that made you
believe you didn't get a TLB refill exception.
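The reasoning above can be sketched as a toy C model of the lookup. It is
deliberately simplified and rests on assumptions: 4 KB pages, and no ASID,
global bit or page mask handling. The struct, enum and function names are
invented for this illustration and are not kernel interfaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified TLB entry: one VPN2 tag and two valid bits. */
struct tlb_entry {
    uint32_t vpn2;   /* VA[31:13] covered by the even/odd page pair */
    bool valid0;     /* V bit of the even page (entry0) */
    bool valid1;     /* V bit of the odd page (entry1) */
};

enum tlb_result {
    TLB_HIT,      /* matching entry, selected half valid */
    TLB_REFILL,   /* no matching entry: refill vector */
    TLB_INVALID   /* matching entry, selected half invalid: general vector */
};

static enum tlb_result tlb_lookup(const struct tlb_entry *tlb, size_t n,
                                  uint32_t vaddr)
{
    uint32_t vpn2 = vaddr >> 13;
    bool odd = (vaddr >> 12) & 1;

    for (size_t i = 0; i < n; i++) {
        if (tlb[i].vpn2 != vpn2)
            continue;
        bool valid = odd ? tlb[i].valid1 : tlb[i].valid0;
        return valid ? TLB_HIT : TLB_INVALID;
    }
    return TLB_REFILL;
}

/* Models the dump filter: entries with both halves invalid are not shown. */
static bool dump_would_show(const struct tlb_entry *e)
{
    return e->valid0 || e->valid1;
}
```

In this model an entry that matches 0x10004f4c but has both halves invalid
yields TLB_INVALID, the case that ends up in do_page_fault, while
dump_would_show() returns false for it, which is consistent with an empty
dump.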
Ralf