linux-mips

Re: [PATCH] Add support for SB1 hardware watchdog.

To: Andrew Sharp <andy.sharp@onstor.com>
Subject: Re: [PATCH] Add support for SB1 hardware watchdog.
From: Ralf Baechle <ralf@linux-mips.org>
Date: Mon, 3 Dec 2007 23:08:28 +0000
Cc: linux-mips@linux-mips.org, akpm@linux-foundation.org, wim@iguana.be
In-reply-to: <20071203181658.GA26631@onstor.com>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <20071203181658.GA26631@onstor.com>
Sender: linux-mips-bounce@linux-mips.org
User-agent: Mutt/1.5.17 (2007-11-01)
On Mon, Dec 03, 2007 at 10:17:04AM -0800, Andrew Sharp wrote:

> +       Watchdog driver for the built in watchdog hardware in Sibyte
> +       SoC processors.  There are apparently two watchdog timers
> +       on such processors; this driver supports only the first one,
> +       because currently Linux only supports exporting one watchdog
> +       to userspace.

And even four watchdogs in the BCM1480.

You'd think they'd trust their hardware more than that ;-)

> + * This driver is intended to make the second of two hardware watchdogs
> + * on the Sibyte 12XX and 11XX SoCs available to the user.  There are two
> + * such devices available on the SoC, but it seems that there isn't an
> + * enumeration class for watchdogs in Linux like there is for RTCs.
> + * The second is used rather than the first because it uses IRQ 1,
> + * thereby avoiding all that IRQ 0 problematic nonsense.
> + *
> + * I have not tried this driver on a 1480 processor; it might work
> + * just well enough to really screw things up.

Four rather similar watchdogs using four interrupts also.

> + * It is a simple timer, and there is an interrupt that is raised the
> + * first time the timer expires.  The second time it expires, the chip
> + * is reset and there is no way to redirect that NMI.  Which could
> + * be problematic in some cases where this chip is sitting on the HT
> + * bus and has just taken responsibility for providing a cache block.
> + * Since the reset can't be redirected to the external reset pin, it is
> + * possible that other HT connected processors might hang and not reset.
> + * For Linux, a soft reset would probably be even worse than a hard reset.
> + * There you have it.

If read requests are never returned, the ZB bus HT host bridge will eventually
run out of buffers after the 16th request.  The CPU has four more buffers,
so the 21st read will stall the CPU's execution.  About a millisecond later
the machine check exception will make the CPU resume execution.  But at this
stage some registers are marked busy in the register scoreboard and any
reference to those CPU registers will cause the CPU to hang again ... until
the next machine check.  Game over, press button to continue.

> + * The timer takes 23 bits of a 64 bit register (?) as a count value,
> + * and decrements the count every microsecond, for a max value of
> + * 0x7fffff usec or about 8.3ish seconds.

Off by one - the maximum time is 0x800000 µs.
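
To spell out the arithmetic, here is a minimal sketch.  It assumes, as the
quoted comment says, a 23 bit count field decremented once per microsecond,
and it reads the off-by-one as "the timer expires count + 1 ticks after it is
loaded".  The names WDT_COUNT_BITS, WDT_COUNT_MAX, wdt_count_to_usecs() and
wdt_usecs_to_count() are made up for illustration and are not taken from the
patch or from the SB1 headers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption from the quoted comment: 23-bit count, one tick per usec. */
#define WDT_COUNT_BITS 23u
#define WDT_COUNT_MAX  ((UINT32_C(1) << WDT_COUNT_BITS) - 1u)  /* 0x7fffff */

/*
 * Time until expiry for a given count value, assuming the timer fires
 * count + 1 ticks after loading (one reading of the off-by-one above).
 */
static uint32_t wdt_count_to_usecs(uint32_t count)
{
	assert(count <= WDT_COUNT_MAX);
	return count + 1u;
}

/*
 * Count value to program for a requested timeout in usecs, clamped to
 * what the 23-bit field can hold.  A request of 0 maps to count 0.
 */
static uint32_t wdt_usecs_to_count(uint32_t usecs)
{
	if (usecs == 0)
		return 0;
	if (usecs > WDT_COUNT_MAX + 1u)
		return WDT_COUNT_MAX;
	return usecs - 1u;
}
```

Under these assumptions wdt_count_to_usecs(WDT_COUNT_MAX) gives 0x800000,
i.e. 8388608 µs or roughly 8.39 seconds, and any longer request is clamped
to the maximum count.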

  Ralf
