Nothing jumps out to me in the new set of register values.
It might be worth dumping all the CP0 registers?
I'm especially interested in Config3, to see the VEIC bit.
The timer registers might be useful as well.
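
For reference, a minimal sketch of decoding a dumped Config3 value. It assumes the bit positions from the MIPS32 privileged architecture as I remember them (MT = bit 2, VInt = bit 5, VEIC = bit 6). The helper names and the sample values are made up for illustration and are not taken from any dump in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Config3 is CP0 register 16, select 3. Bit positions are assumed
 * from the MIPS32 PRA; check them against your core's manual. */
#define CONFIG3_MT   (1u << 2)  /* MT ASE implemented */
#define CONFIG3_VINT (1u << 5)  /* vectored interrupts implemented */
#define CONFIG3_VEIC (1u << 6)  /* external interrupt controller mode */

/* Returns 1 if the given Config3 bit is set, 0 otherwise. */
static int config3_has(uint32_t config3, uint32_t bit)
{
    return (config3 & bit) != 0;
}

/* Prints the three fields that matter for this interrupt problem. */
static void print_config3(uint32_t config3)
{
    printf("Config3=0x%08x MT=%d VInt=%d VEIC=%d\n",
           (unsigned)config3,
           config3_has(config3, CONFIG3_MT),
           config3_has(config3, CONFIG3_VINT),
           config3_has(config3, CONFIG3_VEIC));
}

/* Hypothetical values, for illustration only. */
static void config3_example(void)
{
    assert(config3_has(0x00000064u, CONFIG3_VEIC));   /* bits 2, 5, 6 set */
    assert(!config3_has(0x00000024u, CONFIG3_VEIC));  /* bits 2, 5 set */
    print_config3(0x00000064u);
}
```

If VEIC turns out to be set, the Status/Cause interrupt fields are interpreted differently than in the legacy mode, which would change how the 0x4000 bit should be read.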
From: Kevin D. Kissell [mailto:firstname.lastname@example.org]
Sent: Wednesday, December 22, 2010 7:03 AM
To: Anoop P A
Cc: Anoop P.A.; STUART VENTERS; email@example.com
Subject: Re: SMTC support status in latest git head.
On 12/22/10 3:51 AM, Anoop P A wrote:
> On Wed, 2010-12-22 at 03:37 -0800, Kevin D. Kissell wrote:
>> Thanks. This is indeed strange. The VPE0 Status and TC0 TCStatus/Cause
>> all indicate that interrupts are enabled and not inhibited at the per-TC
>> level, and the presumed timer interrupt, in the 0x4000 bit, is present
>> and not masked-off. Logically, the system must be entering (and
>> exiting) the interrupt handler, yet the timer calibration isn't
>> completing. That leaves more complex possible explanations for failure,
>> most of which would fall into two categories:
>> 1) The platform interrupt handler is failing to decode the event
>> properly as a timer event.
>> 2) Despite there being only one TC active, the calibration code is
>> waiting for some handshake from another "CPU".
>> To test the first, you might consider adding a kprintf() to the case of
>> a "spurious" timer-like interrupt being detected and ignored...
> I have tried it. Only one interrupt is coming, and the platform handler
> detects it as a timer interrupt and acknowledges it properly. You can see a
> time stamp change in the logs.
That's really strange. And your timer interrupt is definitely on the
interrupt that corresponds to the 0x4000 mask?
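
As a rough cross-check of the register dump, the conditions discussed above can be sketched as a small predicate. This assumes the legacy (non-EIC) interrupt mode and the usual bit positions (Status.IE = bit 0, EXL = bit 1, ERL = bit 2, IM/IP in bits 15..8, TCStatus.IXMT = bit 10). The function names are hypothetical, and the sketch ignores debug mode and other corner cases.

```c
#include <assert.h>
#include <stdint.h>

#define ST0_IE        0x00000001u  /* Status: global interrupt enable */
#define ST0_EXL       0x00000002u  /* Status: exception level */
#define ST0_ERL       0x00000004u  /* Status: error level */
#define IRQ_FIELD     0x0000ff00u  /* Status.IM and Cause.IP, bits 15..8 */
#define TCSTATUS_IXMT (1u << 10)   /* TCStatus: interrupt exempt (per-TC inhibit) */
#define TIMER_IRQ_BIT 0x00004000u  /* the bit discussed in this thread */

/* 1 if the interrupt is both pending in Cause and unmasked in Status. */
static int irq_pending_unmasked(uint32_t status, uint32_t cause, uint32_t bit)
{
    return (status & cause & bit & IRQ_FIELD) != 0;
}

/* 1 if the core should actually take the interrupt on this TC. */
static int irq_deliverable(uint32_t status, uint32_t cause,
                           uint32_t tcstatus, uint32_t bit)
{
    if (!irq_pending_unmasked(status, cause, bit))
        return 0;
    if (!(status & ST0_IE))               /* globally disabled */
        return 0;
    if (status & (ST0_EXL | ST0_ERL))     /* still at exception/error level */
        return 0;
    if (tcstatus & TCSTATUS_IXMT)         /* inhibited for this TC */
        return 0;
    return 1;
}

/* Hypothetical register values, for illustration only. */
static void irq_example(void)
{
    assert(irq_deliverable(0x4001u, 0x4000u, 0x0u, TIMER_IRQ_BIT));
    assert(!irq_deliverable(0x4003u, 0x4000u, 0x0u, TIMER_IRQ_BIT));   /* EXL set */
    assert(!irq_deliverable(0x4001u, 0x4000u, 0x400u, TIMER_IRQ_BIT)); /* IXMT set */
}
```

Feeding the dumped Status, Cause and TCStatus values through a check like this would show whether any one of the four conditions is blocking the interrupt at the moment of the dump.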
I may have written the MT spec and the original SMTC code, but I don't
have a copy of the spec, and it's been a few years, and I can't
interpret the MVP and VPE control/config values. But I just don't see
how the processor could not be taking more interrupts. Stuart did
decode the global/VPE state enough to observe that global multithreaded
execution wasn't enabled, which is indeed strange - it shouldn't matter
for single-TC execution, but I don't recall there being any special-case
in the SMTC initialization that bypassed that enable. That makes me
suspect that maybe someone changed the initialization sequence so that
it bypasses one of the canonical initialization steps in a way that
would break SMTC, but I don't know why that would result in the
interrupt behavior you observe.
It might be yet another blind alley, but could you add/arm diagnostic
output for each of the initialization functions in smtc.c?
Ah, yes, and one other thing. You should add a dump of ErrorEPC to the
MT register dump. I did it for myself once upon a time when I was
confronted with a similar mystery, but never filed a patch. If you're
breaking in with NMI, that could help identify more precisely where it's
You really ought to try to borrow an EJTAG probe. It would save us both
a lot of time. And my time to trouble-shoot this with you is limited.