
Re: new type of crash report?

To: linux-mips@linux-mips.org
Subject: Re: new type of crash report?
From: Giuseppe Sacco <giuseppe@eppesuigoccas.homedns.org>
Date: Sun, 03 Feb 2008 23:59:34 +0100
In-reply-to: <47A5F580.8080300@mips.com>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <1202050578.7035.11.camel@scarafaggio> <5BFC57F9-7E81-4667-9D15-72F5F20FA4DD@27m.se> <1202054465.7035.20.camel@scarafaggio> <47A5F580.8080300@mips.com>
Sender: linux-mips-bounce@linux-mips.org
Hi Kevin,

On Sun, 03/02/2008 at 18:10 +0100, Kevin D. Kissell wrote:
> Giuseppe Sacco wrote: 
[...]
> > Thanks for your reply. I will try to understand how to use gdb in this
> > context. (Any URI would be really appreciated.)
> > Anyway, I now understand that a dbe is a data bus error, so this is
> > probably an error on the physical address, i.e. a kernel problem related
> > to the mapping between virtual and physical addresses. Is this correct?
> >   
> That's correct.  You didn't say what processor you were running on, so
> it's hard to be more specific - there are some which have a bus error
> input pin that can be asserted by the system for other reasons - but
> in general it means that there's a data reference at 0x2ac2bffc whose
> valid translation goes to a bad address.  Generally, that address
> range is where shared libraries are mapped, so to find the instruction
> you want to run the program that caused the crash under gdb, set a
> breakpoint very early (e.g. main), run to the breakpoint, and
> disassemble the virtual address.  I find it interesting that the
> register value reported for register $10 is a reasonable data address
> shifted up by 32 bits.  It's possible that code would have a real
> reason to do that, but I can't help wondering if that isn't part of the
> problem. We may be looking at a 2-level bug here:  User(?) code
> screwing up a base register used for a load or store, and the OS
> failing to handle the upper reaches of the 64-bit address space
> correctly.
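
As a note to myself, the "shifted up by 32 bits" observation about
register $10 can be checked mechanically. This is only a rough sketch:
the helper names are made up for illustration, and the sample value
simply reuses the faulting address 0x2ac2bffc as a stand-in, since it is
not the actual $10 value from the crash report.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
MASK32 = 0xFFFFFFFF


def looks_shifted_up_32(reg: int) -> bool:
    """Return True if a 64-bit value has an all-zero low word and a non-zero high word."""
    reg &= MASK64
    return (reg & MASK32) == 0 and (reg >> 32) != 0


def unshifted(reg: int) -> int:
    """Return the high 32 bits, i.e. the value as it would be without the shift."""
    return (reg & MASK64) >> 32


if __name__ == "__main__":
    # Hypothetical register value: a plausible user-space address moved
    # into the upper word. NOT taken from the real register dump.
    sample = 0x2AC2BFFC << 32
    print(looks_shifted_up_32(sample), hex(unshifted(sample)))
```

If the real $10 value passes this test and its upper word falls in the
range where shared libraries are mapped, that would support Kevin's
suspicion about a corrupted base register.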

The complete bug report is available at http://bugs.debian.org/463808.
The CPU is an "R5000 V2.1  FPU V1.0".

The system is Debian stable, running mainly with courier-imap-ssl and
exim4 (often in TLS mode).

I cannot find a single program to debug, but I know for sure that if I
leave the machine running with those two daemons, it will hang in about 30
minutes. If I run a kernel build (using gcc-4.2 from Debian testing),
then the machine hangs in a few minutes. One time out of three gcc gets a
segmentation fault; the other two times the machine stops.

Thanks for your help,
Giuseppe

