linux-mips

Re: numbers...

To: nn@lanta.engr.sgi.com
Subject: Re: numbers...
From: "David S. Miller" <davem@caip.rutgers.edu>
Date: Thu, 16 May 1996 02:21:12 -0400
Cc: sparclinux-cvs@caipfs.rutgers.edu, torvalds@cs.helsinki.fi, lmlinux@neteng.engr.sgi.com
In-reply-to: <199605151640.JAA08277@lanta.engr.sgi.com> (nn@lanta.engr.sgi.com)
Sender: owner-linux@cthulhu.engr.sgi.com
   Date: Wed, 15 May 1996 09:40:53 -0700
   From: nn@lanta.engr.sgi.com (Neal Nuckolls)

   > sun4m SS10 115mhz hypersparc, 256k cache
   > measure_csum_partial
   > csum_partial: sz[1024] 10000 iterations takes 17009430 microseconds
   > csum_partial: sz[1024] 1 iteration takes <1700 microseconds>==<332 nanoseconds>

   Maybe I'm dense this morning but I don't understand the numbers.

[NOTE: real lmbench results with the new checksum code in a bit, it's
       not as much of an improvement as I wanted and Solaris still
       gets better TCP bandwidth on localhost.]

The benchmark looks like:

        for(iter=0; iter < 10000; iter++) {
                for(inner=0; inner < 512; inner++) {
                        csum(foo, bar, baz);
                        flush_caches();
                }
        }

The 512 number is just empirical: for small buffers (less than 1k),
doing just one iteration made it impossible to measure anything
significant.

I am factoring in the time the cache flush takes.  I measure how long
one flush takes before the loop runs, then subtract that value
multiplied by the total number of flushes (10000 * 512) from the
final time.

So the 17009430 microseconds is the time it takes to run the entire
loop structure minus the flushing overhead.
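
As a rough sketch of that bookkeeping (not the actual harness, which
is not shown here), assuming a POSIX clock_gettime() timer.  The names
now_us(), mean_cost_us() and net_time_us() are made up for
illustration: calibrate the flush on its own, then subtract its cost
once per inner-loop pass.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in microseconds (assumes POSIX clock_gettime). */
uint64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000u + (uint64_t)ts.tv_nsec / 1000u;
}

/* Run fn `calls` times back to back and return the mean cost per call in
 * microseconds.  Used here to calibrate the cache flush before the real run. */
double mean_cost_us(void (*fn)(void), uint64_t calls)
{
    assert(calls > 0);
    uint64_t start = now_us();
    for (uint64_t i = 0; i < calls; i++)
        fn();
    return (double)(now_us() - start) / (double)calls;
}

/* Remove the flush overhead from a measured total: one flush is charged
 * for every pass of the inner loop, i.e. outer * inner flushes in all. */
double net_time_us(double total_us, double flush_us,
                   uint64_t outer, uint64_t inner)
{
    return total_us - flush_us * (double)outer * (double)inner;
}
```

Assuming flush_caches() can be called through a void (*)(void)
pointer, net_time_us(total, mean_cost_us(flush_caches, 1000), 10000, 512)
would give the time attributable to csum() alone.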

1700 microseconds is the time each run of the inner for loop took,
again minus the flush overhead.  332 nanoseconds is the absolute time
each instance of csum() took to run on the buffer, once again after
subtracting the flush overhead.
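
The scaling from the net total down to those two figures can be
sketched as plain division.  per_outer_us() and per_call_ns() are
hypothetical helper names, not part of the benchmark.  Note that fed
the figures quoted above (17009430 us, 10000 outer, 512 inner), the
per-call division comes out near 3322 ns rather than 332, so the
quoted line may have been scaled differently; treat this as
illustrative arithmetic only.

```c
#include <assert.h>
#include <stdint.h>

/* Net microseconds spent in one pass of the outer loop
 * (one pass = one full run of the inner loop). */
double per_outer_us(double net_total_us, uint64_t outer)
{
    assert(outer > 0);
    return net_total_us / (double)outer;
}

/* Net nanoseconds spent in a single csum() call:
 * convert us to ns, then divide by the total number of calls. */
double per_call_ns(double net_total_us, uint64_t outer, uint64_t inner)
{
    assert(outer > 0 && inner > 0);
    return net_total_us * 1000.0 / ((double)outer * (double)inner);
}
```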

Later,
David S. Miller
davem@caip.rutgers.edu
