To: David Daney <ddaney@caviumnetworks.com>, linux-mips@linux-mips.org
Subject: Re: irqbalance on movidis crashes the machine (was: movidis x16 hard lockup using 2.6.33)
From: Andreas Barth <aba@not.so.argh.org>
Date: Thu, 15 Apr 2010 22:35:50 +0200
In-reply-to: <20100415184312.GK2942@mails.so.argh.org>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <20100326184132.GU2437@apfelkorn> <4BAD03A5.9070701@caviumnetworks.com> <20100327230744.GG27216@mails.so.argh.org> <4BB0DB2A.9080405@caviumnetworks.com> <20100402133224.GR27216@mails.so.argh.org> <20100403154312.GY2437@apfelkorn> <20100415184312.GK2942@mails.so.argh.org>
Sender: linux-mips-bounce@linux-mips.org
User-agent: Mutt/1.5.18 (2008-05-17)
* Andreas Barth (aba@not.so.argh.org) [100415 20:43]:
> * Peter 'p2' De Schrijver (p2@debian.org) [100403 17:43]:
> > http://zobel.ftbfs.de/.x/lucatelli-nmi-watchdog-output.txt 
> > Dump of one of those hangs. Most cores seem to be stuck in wait 
> > (0xffffffff81100b80), except for core 1 which is in octeon_irq_ciu0_ack 
> > (octeon_irq_ciu0_ack).
> 
> On further investigation we found out that this happens when
> irqbalance is started. The version of irqbalance being run is 0.55.
> 
> We removed this program from the affected machine, but of course this
> still should be fixed (and we still get a few reboots on another
> machine without irqbalance).

Clarification:

Running irqbalance does not crash the machine by itself, but it
increases the probability of a crash dramatically. Usually one of the
next few (< 10) commands run afterwards crashes the machine.

The crashes, however, look similar to the ones we get without
irqbalance - they just happen far less often without it. So it seems
that irqbalance merely exposes the same crash much more reliably.
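
One way to narrow this down, as a rough sketch only: irqbalance works by
rewriting the CPU masks in /proc/irq/<N>/smp_affinity, so rotating the
affinity of a single IRQ by hand might trigger the same hang in a more
controlled way. That the crash is tied to affinity changes is an
assumption here, not something established in this thread. The function
names and the default delay below are illustrative, and the single-word
mask format only covers machines with up to 32 CPUs.

```python
#!/usr/bin/env python3
"""Rotate one IRQ's CPU affinity, loosely imitating what irqbalance does."""
import time


def cpu_mask(cpu: int) -> str:
    """Return the hex bitmask selecting a single CPU (CPU 4 -> '10')."""
    if cpu < 0:
        raise ValueError("cpu must be non-negative")
    return format(1 << cpu, "x")


def set_affinity(irq: int, cpu: int, proc_root: str = "/proc") -> None:
    """Write the mask for `cpu` to <proc_root>/irq/<irq>/smp_affinity.

    Needs root. The kernel rejects the write for IRQs whose affinity
    cannot be changed, which surfaces here as an OSError.
    """
    with open(f"{proc_root}/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpu_mask(cpu))


def rotate(irq: int, ncpus: int, rounds: int, delay: float = 0.1) -> None:
    """Move `irq` across CPUs 0..ncpus-1 round-robin, `rounds` times."""
    for i in range(rounds):
        set_affinity(irq, i % ncpus)
        time.sleep(delay)
```

Run as root, rotate() would be called with an IRQ number taken from
/proc/interrupts and the number of cores on the box. If the machine
hangs after a handful of rotations on one IRQ but not on others, that
would at least point at a specific interrupt source.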


Andi
