linux-mips

Re: PATCH: Fix ll/sc for mips (take 3)

To: Jay Carlson <nop@nop.com>
Subject: Re: PATCH: Fix ll/sc for mips (take 3)
From: Jay Carlson <nop@nop.com>
Date: Tue, 5 Feb 2002 11:06:15 -0500
Cc: linux-mips@oss.sgi.com
In-reply-to: <7E232BAE-1A4A-11D6-927F-0030658AB11E@nop.com>
Sender: owner-linux-mips@oss.sgi.com

On Tuesday, February 5, 2002, at 10:10 AM, Jay Carlson wrote:

(Quick background for the list: because PIC/abicalls carries such a large code size penalty, I resurrected the bad old Linux/SVR3 statically linked, dynamically loaded libraries, which are linked at absolute locations. Shane Nay took this from a cute demo to a working distribution for the Agenda VR3; Brian Webb helped. Typical code size reduction is ~25-40%, e.g. 391k -> 272k.)

Oh yes, performance. Apps on the Agenda VR3 built in the snow ABI are dramatically faster/more responsive. If you don't believe me, go search the agenda-dev list and read the testimonials :-)

I don't fully understand why, though. Here are my speculations; bear in mind that the VR3 and some of the other small boxes have 16-bit memory interfaces with small i/d caches.

1) Better icache efficiency.

2) Fewer loads (and stalls) to get typical work done. In PIC, you need a load per symbol reference, and that's every function call.

3) Better dcache efficiency. The GOT no longer needs to be hit for those symbol references.

4) Reduced TLB usage. The GOT pages for each module are quite hot, so now that we're no longer touching them, their 4k (ouch) TLB entries can point somewhere more useful.

5) No symbol resolution at load time. For C++ apps, this can help startup a lot. (Prelinking fixes this too.)

6) Better scheduling from gcc. egcs seemed to do a better job of arranging loads ahead of use when building non-PIC; on the TX39, this helps even more due to non-blocking loads.
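
To make points 2 and 3 a bit more concrete, here is a rough sketch in C. The names (counter, bump, bump_n) are made up for illustration, and the assembly in the comments is only approximately what an o32 compiler tends to emit, not output captured from any particular gcc/egcs build.

```c
#include <stddef.h>

/* Hypothetical example: all names here are made up for illustration. */

int counter = 0;            /* global data symbol */

/*
 * Touching `counter`:
 *   PIC/abicalls (roughly):  lw  $2, %got(counter)($gp)   # load address from GOT
 *                            lw  $3, 0($2)                # then load the value
 *   non-PIC (roughly):       lui $2, %hi(counter)
 *                            lw  $3, %lo(counter)($2)     # one load, no GOT access
 */
int bump(int by)
{
    counter += by;
    return counter;
}

/*
 * Calling `bump`:
 *   PIC/abicalls (roughly):  lw   $25, %call16(bump)($gp) # extra load from GOT
 *                            jalr $25
 *                            ...restore $gp afterwards...
 *   non-PIC (roughly):       jal  bump                    # direct jump, no load
 */
int bump_n(size_t n)
{
    int last = counter;
    for (size_t i = 0; i < n; i++)
        last = bump(1);     /* one symbol reference per iteration */
    return last;
}
```

If you have a MIPS cross gcc handy, compiling this with -S once with -mabicalls -fPIC and once with -mno-abicalls -fno-pic and diffing the two .s files should show the difference; the exact instructions will vary with compiler version and optimization level.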

I dunno.

Jay

