* David Daney (ddaney@caviumnetworks.com) [100329 18:54]:
> On 03/27/2010 04:07 PM, Andreas Barth wrote:
>> * David Daney (ddaney@caviumnetworks.com) [100326 19:57]:
>>> Also you could try running with the attached patch. It is not the best
>>> watchdog, but it will print the register state for each core when things
>>> get stuck. Occasionally that is enough to see where the problem is.
>>
>> Thanks.
>>
>> As our logging has only limited buffer size, I'd be happy about a
>> variant of the patch which doesn't reboot but just lets the machine
>> hang after the third occurrence.
>>
>> Any chances for it?
> You could just sit in a loop kicking the watchdog timer after you get to
> the NMI handler. That should prevent a reset, but still print the
> machine state.
I have to admit that I'm totally unable to turn that statement into
code.
Could you (or someone else) give me a hand? Also please note that it
usually takes a few hours to crash the machine, and I didn't see
anything in the normal syslog.
Andi
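
For reference, the suggestion quoted above (print the state in the NMI
handler, then sit in a loop kicking the watchdog so it never resets the
machine) can be sketched as a small user-space simulation. This is only
an illustration under stated assumptions, not the patch from this thread
and not kernel code: struct sim_watchdog, wdt_kick(), wdt_tick(),
dump_state(), nmi_handler_hang() and WDT_TIMEOUT_TICKS are all
hypothetical names. On real hardware the kick would be a write to the
watchdog's poke register and the loop would never end; here it is
bounded so the simulation terminates.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical: ticks without a kick before the watchdog would reset. */
#define WDT_TIMEOUT_TICKS 3

/* Simulated watchdog; stands in for the hardware timer. */
struct sim_watchdog {
    int ticks_since_kick; /* ticks elapsed since the last kick */
    int reset_fired;      /* set to 1 once the timeout has expired */
};

/* Kick ("poke") the watchdog: restart its countdown. */
static void wdt_kick(struct sim_watchdog *w)
{
    w->ticks_since_kick = 0;
}

/* Advance simulated time by one tick; flag a reset on timeout. */
static void wdt_tick(struct sim_watchdog *w)
{
    if (++w->ticks_since_kick >= WDT_TIMEOUT_TICKS)
        w->reset_fired = 1;
}

/* Placeholder for the per-core register dump done by the real handler. */
static void dump_state(int core)
{
    printf("core %d: <register dump would go here>\n", core);
}

/*
 * Simulated NMI handler: dump the state once, then spin kicking the
 * watchdog so the reset never fires. A real handler would loop forever;
 * 'iterations' bounds the loop only so this simulation can finish.
 * Returns 1 if the machine would have been reset, 0 otherwise.
 */
static int nmi_handler_hang(struct sim_watchdog *w, int core, int iterations)
{
    assert(iterations >= 0);
    dump_state(core);
    for (int i = 0; i < iterations; i++) {
        wdt_kick(w); /* keep the watchdog from expiring */
        wdt_tick(w); /* simulated time keeps passing */
    }
    return w->reset_fired;
}
```

Calling nmi_handler_hang() on a fresh struct sim_watchdog prints the
dump once and never sets reset_fired, whereas ticking the same structure
WDT_TIMEOUT_TICKS times without a kick does set it. The point of the
design is that the dump happens before the loop, so the output is
printed once and stays on the console instead of being lost to a reboot.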