Ralf,
Thanks for the input.
To understand what exceptions I was getting (apart from RI), I implemented a
counter for each exception.
In "except_vec3_generic" (entry.S), I included some code to increment a counter
for each exception received.
<code>
NESTED(except_vec3_generic, 0, sp)
#if defined(CONFIG_CPU_R5432)
/* [jsun] work around a nasty bug in R5432 */
mfc0 k0, CP0_INDEX
#endif
mfc0 k1, CP0_CAUSE
la k0, exception_counter
andi k1, k1, 0x7c
addu k0, k0, k1
lw k1, (k0)
addi k1, k1, 1
sw k1, (k0)
... (original code follows)
</code>
exception_counter is an array of 32 integers. On printing out the array in
do_ri exception handler, I found that only TLB Mod(Code 1), TLBL (Code 2), TLBS
(Code 3), syscall (code 8) and RI (code 10) exceptions were received (had count
>= 1). With this, will it be safe to assume that RI is the only unwanted
exception?
To get hold of exact EPC at which RI is occuring, I tried to clear the EXL bit
of status register by adding some more code above the exception counting code
in the except_vec3_generic routine.
<code>
NESTED(except_vec3_generic, 0, sp)
#if defined(CONFIG_CPU_R5432)
/* [jsun] work around a nasty bug in R5432 */
mfc0 k0, CP0_INDEX
#endif
mfc0 k0, CP0_STATUS
nop
ori k0, k0, 0x2
xori k0, k0, 0x2
mtc0 k0, CP0_STATUS
nop
... (Exception counting code follows)
</code>
Surprisingly, the processor does not seem to alow me to clear the EXL bit. I
get AdEL (code 4) exception as I complete the "mtc0 k0, CP0_STATUS"
instruction. The processor goes into an infinite loop of exceptions and boot-up
hangs after printing "Freeing unused kernel memory: 48k freed". Is it not
possible for software to clear the EXL bit after it has been set by the
hardware? If not, what else can I do to get hold of the correct EPC value where
RI is occuring?
Thanks,
Sekhar
-----Original Message-----
From: Ralf Baechle [mailto:ralf@linux-mips.org]
Sent: Monday, December 27, 2004 6:00 PM
To: Nori, Soma Sekhar
Cc: linux-mips@linux-mips.org; Iyer, Suraj
Subject: Re: do_ri exception in Linux (MIPS 4kec)
On Thu, Dec 23, 2004 at 04:58:03PM +0530, Nori, Soma Sekhar wrote:
> We are using montavista Linux version 2.4.17, gcc version 2.95.3 running on
> MIPS 4kec.
>
> Here is the dump:
> $0 : 00000000 0044def4 000001ac 0000006b 00000000 7fff7c08 00000001 00000000
> $8 : 0000fc00 00000001 00000000 941524d0 00004700 00000000 97fc3ea0 7fff7c08
> $16: 100048a4 100029d8 100029d8 10003020 00000000 7fff7dc8 10003b60 2d8e2163
> $24: 00000001 2ab7bc30 10008e70 7fff7bf0 04000000 00439e50
> Hi : 00000000
> Lo : 00000001
> epc : 00439e84 Not tainted
> Status: 0000fc13
> Cause : 10800028
> Process sh (pid: 18, stackpage=97fc2000)
> Stack: 00000001 00000000 2abd0ff0 7fff7c28 10008e70 00000000 10008e6c 00000000
> 100049a0 0042f188 00000000 100029d8 00000001 00000001 7fff7f04 10008e70
> 00427fe4 00427f00 00000000 00000000 10002764 10008e70 10008e70 00000000
> 00000000 00000000 10008e70 00422734 00000001 00000001 7fff7f04 10008e70
> 10008e70 00000003 10008e70 004315cc 00000001 00000000 10002764 00000000
> 10008e70 ...
> Call Trace:
> Code: 00000000 2421dd48 00220821 <8c220000> 00000000 005c1021 00400008
> 0000
> 0000 8f99802c
>
> The epc is not in kernel space and ksymoops did not provide any info. The epc
> keeps changing to different locations in user space over multiple runs.
In a case like this you're likely dealing with double exceptions. Your
code is taking an exception and the exception handler while running with
c0_status set is taking another exception. If the first exception handler
is still running with the c0_status.exl bit set the CPU when taking the
second exception it will not record the PC of the second exception and
you will have a seemingly unexplainable exception.
A few processors have the nasty habit of throwing RI receptions or do
similarly weird things when executing code that is mapped through multiple
TLB pages but the 4kEC shouldn't.
Ralf
|