clock skew and ethernet timeouts

Subject: clock skew and ethernet timeouts
From: Mark Salter <>
Date: Wed, 13 Aug 1997 17:09:32 -0500
I've been taking a look at the problem with frequent ethernet
transmit timeouts. In addition to the timeouts, I've also seen
an occasional duplicate packet being sent in response to pings
coming from another machine. I'm debugging somewhat in the dark
because I don't have documentation on the Indy's DMA controller,
although the sgihpc.h file has been helpful.

I also noticed that the Linux time-of-day clock falls behind
real time whenever these timeouts occur. I decided to take a
look at this side of the problem and discovered that interrupts
are being turned off for extended periods of time. I modified
the timer interrupt handler to print a message if it detects
a missed system tick. Sure enough, every ethernet timeout is
accompanied by a message coming from the timer interrupt. The
message indicates that the timer interrupt was held off for
as much as 45ms!

Here's the change I made to indy_timer_interrupt() in indy_timers.c:

        /* Ack timer and compute new compare. */
#if 0
        r4k_cur = (read_32bit_cp0_register(CP0_COUNT) + r4k_offset);
#else
        count = read_32bit_cp0_register(CP0_COUNT);
        /* Unsigned difference, so this also works across counter wrap. */
        if ((count - r4k_cur) >= r4k_offset) {
                /* At least one full tick was missed: report and resync. */
                printk("missed heartbeat: r4k_cur[0x%x] count[0x%x]\n",
                       r4k_cur, count);
                r4k_cur = count + r4k_offset;
        } else
                r4k_cur += r4k_offset;
#endif

The original code which calculates the next value for the CP0_COMPARE
register introduces skew by basing it on the current value of the 
CP0_COUNT register rather than the previous CP0_COMPARE value. But as
it turns out, that little bit of skew is preferable to the large 
skew that would result whenever the timer interrupt is held off too
long.
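
To make the trade-off concrete, here is a small stand-alone simulation
of the three ways of picking the next compare value. It is only a
sketch under stated assumptions, not kernel code: TICK_CYCLES, the
policy names, simulate_skew() and the latency arrays are all made up
for illustration, and the counter is modelled as a free-running 32-bit
register that raises an interrupt only when it equals the compare
value, which is how I understand the R4k count/compare pair to behave.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TICK_CYCLES 1000u  /* hypothetical counter cycles per system tick */

enum resched_policy {
    RESCHED_FROM_COUNT,    /* compare = count + offset (the original code) */
    RESCHED_FROM_COMPARE,  /* compare += offset, unconditionally */
    RESCHED_HYBRID         /* compare += offset, resync after a missed tick */
};

/* Cycles until a free-running 32-bit counter next equals `compare`. */
static uint64_t cycles_until_match(uint64_t now, uint32_t compare)
{
    uint32_t delta = compare - (uint32_t)now;    /* modulo 2^32 */
    /* Assumption: an exact match at programming time counts as missed. */
    return delta ? delta : (UINT64_C(1) << 32);
}

/*
 * Deliver n_ticks timer interrupts, each serviced latency[i] cycles
 * after the counter matched. Returns how many cycles later than the
 * ideal n_ticks * TICK_CYCLES the last interrupt was serviced.
 */
uint64_t simulate_skew(enum resched_policy policy,
                       const uint32_t *latency, size_t n_ticks)
{
    uint64_t now = 0;                /* true elapsed cycles */
    uint32_t compare = TICK_CYCLES;  /* first tick due at one period */

    assert(latency != NULL);
    for (size_t i = 0; i < n_ticks; i++) {
        now += cycles_until_match(now, compare);  /* interrupt raised */
        now += latency[i];                        /* handler finally runs */
        uint32_t count = (uint32_t)now;

        switch (policy) {
        case RESCHED_FROM_COUNT:
            compare = count + TICK_CYCLES;
            break;
        case RESCHED_FROM_COMPARE:
            compare += TICK_CYCLES;
            break;
        case RESCHED_HYBRID:
            if ((uint32_t)(count - compare) >= TICK_CYCLES)
                compare = count + TICK_CYCLES;    /* missed a tick */
            else
                compare += TICK_CYCLES;
            break;
        }
    }
    return now - (uint64_t)n_ticks * TICK_CYCLES;
}
```

With four ticks that are each serviced 10 cycles late, the count-based
policy ends up 40 cycles behind while the other two stay at 10. If one
of those ticks is instead held off for 2500 cycles, the pure
compare-based policy programs a compare value the counter has already
passed and has to wait for it to wrap, whereas the count-based and
hybrid policies lose only about the time the interrupt was held off.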

Anyway, I'm going to be away from the office until 19 August, so I'll
pick it up then unless someone else finds it first. It seems likely
that the ethernet timeouts are the result of interrupts being turned off
too long, but if that's not the case, can someone point me to some
documentation for the Indy's DMA controller? I have some suspicions
of race conditions when setting up new DMA buffers, but I'd like to
have a document before I try various fixes.

Mark Salter