Hi Vladimir (and all the others, of course),
On 07-Oct-98 Vladimir Roganov wrote:
> Most important increasement of reliability we obtained commenting (new)
> TLB-related code (define TOTAL_TLB_FLUSH in attached r2300.c), and redesign
> cache invalidation code (see following topic. We will happy to receive any
> comments about it).
Well, happy reading.
> We are peace about TLB-flush/ASID logic -- it possibly just not debugged
> enough, and will be fixed too.
Yup. I simply copied the routines from r4000.c with a few adaptations for
the R3000, and it's very probable that not everything is right. Now that my
DECstation goes single user I'll be able to make a few tests.
> But yet another thing is more interesting: when we define NO_TLB_OPTIMIZE
> in r2300_misc.S: we received kernel page fault: 'unix_gc' function uses
> VMALLOC, which reserves addresses in KSEG2, but 'do_page_fault' function
> mean it is not right.
> It is very interesting for us, due we want to use VMALLOC-like mechanism
> to map KSEG2 area to our lance network adapter memory to avoid
> Baget-specific options in r2300_misc.S (that is, to have something like
> sparc_alloc_io).
> If anybody have ideas how to do it in current Linux structure, please
> inform us.
That should be easy to fix (been there, done that). The R3000 handles KSEG2
misses through a different exception mechanism than the R4000. A KSEG2 miss
on an R4000 causes a TLB miss exception (except_vec0), whereas on the R3000 a
KSEG2 miss causes a TLB[LS] exception (except_vec3_generic ->
handle_r2300_tlb[ls]) and do_fault() gets very confused.
The attached patch should fix this; the code was present from 2.1.14 through
2.1.100.
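
To make the difference concrete, here is a minimal C sketch of the dispatch
described above. It is not taken from r2300_misc.S or from the patch: the
enum names and tlb_miss_path() are hypothetical, and only the segment
boundaries are standard 32-bit MIPS facts.

```c
#include <stdint.h>

/* Hypothetical names for illustration; none of them exist in the kernel. */
enum cpu_family { CPU_R3000, CPU_R4000 };
enum miss_path {
    PATH_NONE,           /* unmapped segment, no TLB miss possible      */
    PATH_REFILL_VECTOR,  /* fast refill handler (except_vec0)           */
    PATH_GENERAL_TLBLS   /* general vector, TLBL/TLBS cause (TLB[LS])   */
};

#define KSEG0_BASE 0x80000000u  /* unmapped, cached   */
#define KSEG2_BASE 0xc0000000u  /* mapped kernel space */

/* KUSEG and KSEG2 go through the TLB; KSEG0 and KSEG1 do not. */
static int is_mapped(uint32_t vaddr)
{
    return vaddr < KSEG0_BASE || vaddr >= KSEG2_BASE;
}

/* Which handler path sees a TLB miss on vaddr, per the description above. */
static enum miss_path tlb_miss_path(enum cpu_family cpu, uint32_t vaddr)
{
    if (!is_mapped(vaddr))
        return PATH_NONE;
    if (cpu == CPU_R4000)
        return PATH_REFILL_VECTOR;       /* KUSEG and KSEG2 alike */
    /* R3000: only KUSEG misses take the fast refill vector. */
    return vaddr < KSEG0_BASE ? PATH_REFILL_VECTOR : PATH_GENERAL_TLBLS;
}
```

So on the R3000 a vmalloc'ed address such as 0xc0001000 arrives at the
TLB[LS] handlers, which therefore have to know how to refill KSEG2 entries
instead of passing the fault straight on.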
BTW, the code in this file is strictly R3000 related and we don't need no
stinking .set push/.set reorder/.set pop here and can insert as many nops as
we like. I have cleaned up the code accordingly.
> o Cache code improvement suggestions:
>
> We spend many hours measuring BAGET performance and found that cache
> code speedup is very desirable. One more reason for cache code
> revision: we found that few cache invalidation functions really accept
> virtual addresses instead physical.
>
> After few revisions we lead to following structure:
>
> 1) To avoid code duplication struct 'cache_space' is used for both
> I & D cache spaces. Initialization of these structures is done
> at functions 'probe_?cache', calling 'cache_size'.
> It is so called initialization level.
>
> 2) Low-level functions for cache spaces invalidation are called
> 'flush_cache_space_page' and 'flush_cache_space_all'.
> First accepts 'cache_space' structure, performs some checks and
> flushes physical cache page as fast as possible.
> Second function just uses above to flush needed quantity of KSEG0
> pages.
>
> To translate address to physical page 'get_phys_page' function is used.
> It carefully checks segments and returns 0 or page real memory address.
>
> We have red D.Miller article about Linux cache flush architecture,
> where noticed what DMA/Drivers dcache cache coherency should be
> implemented at driver layer. So we are now avoiding dcache invalidation,
> and introduce DO_DCACHE_FLUSH flag if somebody need to flush dcache here.
Fine, less is more :-). I've been thinking about that too, but you have been
faster. Honestly, I read Dave's paper too and I agree: we probably don't ever
have to invalidate the dcache in the MMU-related code. DMA is another thing
and should strictly be driver related.
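
For readers without r2300.c at hand, the segment check that the quoted
description of 'get_phys_page' refers to could look roughly like the sketch
below. This is an assumption about the general shape, not the code from the
attachment: get_phys_page_sketch() and the macros are made-up names, 4 KB
pages are assumed, and the page-table walk for mapped addresses is left out
entirely.

```c
#include <stdint.h>

/* Illustrative constants; the names are local to this sketch. */
#define SK_PAGE_SHIFT 12                           /* assumes 4 KB pages */
#define SK_PAGE_MASK  (~((1u << SK_PAGE_SHIFT) - 1u))
#define SK_KSEG0_BASE 0x80000000u                  /* unmapped, cached   */
#define SK_KSEG2_BASE 0xc0000000u                  /* mapped             */
#define SK_PHYS_MASK  0x1fffffffu                  /* KSEG0/1 -> physical */

/*
 * Return the physical page address behind vaddr, or 0 if the sketch
 * cannot tell. KSEG0 and KSEG1 translate by masking off the top three
 * address bits. KUSEG and KSEG2 would need a page-table lookup, which
 * is omitted here, so they simply yield 0.
 */
static uint32_t get_phys_page_sketch(uint32_t vaddr)
{
    if (vaddr >= SK_KSEG0_BASE && vaddr < SK_KSEG2_BASE)
        return (vaddr & SK_PHYS_MASK) & SK_PAGE_MASK;
    return 0;
}
```

Note that with a "0 means failure" convention physical page 0 cannot be told
apart from an error, so a caller has to treat that page specially or never
flush it through this path.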
> 3) All exported 'r2300_*' functions obtain physical page(s) and flushes
> one(s). Few top-level optimizations are commented in a source code.
Good work, my DECstation seems to run fine with your changes.
---
Regards,
Harald
patch.r2300_misc