On Mon, Oct 06, 2003 at 07:05:06PM -0700, Steve Scott wrote:
> We tried the fault.c patch Jun suggested, but it didn't solve the problem we
> were
> having with the BUG() in schedule(). The patch at the beginning of
> except_vec3_generic for the Vr5432 bug had previously been installed.
>
> While chasing the BUG() in schedule(), though, we ran across another BUG() in
> alloc_skb() in ...linux/net/core/skbuff.c. :
>
> alloc_skb called nonatomically from interrupt 80117acc
> kernel BUG at skbuff.c:179!
>
> We changed the way sock_init_data initializes the 'allocation' field and
> were able to get past this one (see attached sock.c.patch). We're not sure
> if this fix needs to be permanent, or if it's just a temporary workaround.
>
> For the schedule() BUG(), all evidence that we collected pointed to some
> interrupt causing us to reenter schedule() (i.e., somehow schedule() was
> called during an interrupt handler). We suspected something being run
> from the timer interrupt bottom half, but were never able to prove it. We
> also thought a remote possibility might be a pipeline hazard in the MIPS
> causing the EPC register not to update on a nested exception, but NEC says
> that can't happen on the Vr5432 that we're using...
Can't happen on any MIPS.
> We finally worked around the schedule BUG() by disabling interrupts
> during the context switch in schedule(). This workaround required changes
> in linux/kernel/sched.c and linux/arch/mips/kernel/r4k_switch.S (see attached
> patches).
Ouch. Forgive but if I'd not already ignore these patches for being
ed-style I'd ignore them for being completly broken - these
patches are harmful for performance and probably not going to achieve
stability by anything other than luck ...
Ralf
|