[Top] [All Lists]

Linux support for R[34]000-series computers

Subject: Linux support for R[34]000-series computers
From: Vladimir Roganov <>
Date: Mon, 12 Oct 1998 19:17:35 +0400
Organization: NIISI RAS
This message is addressed to developers who is working on 
Linux support for R[34]000-series computers.
It is dedicated to Linux/2.1.121 code for R3000.

Hello all, hello Harald !

o  Strange code detected in 'asm-mips/mmu_context.h' file:

extern inline void get_new_mmu_context(struct mm_struct *mm, unsigned
long asid)
        /* check if it's legal.. */
        if (((asid & ASID_MASK) >> ASID_SHIFT) > MAX_ASID) {
                /* start a new version, invalidate all old asid's */
                asid = (asid & ASID_VERSION_MASK) + ASID_FIRST_VERSION;
                if (!asid)
                        asid = ASID_FIRST_VERSION;
        asid_cache = asid + (1 << ASID_SHIFT);
        mm->context = asid;                      /* full version + asid */

The first 'if' condition will never succeed for both R3000/R4000.

Yet another problem can (theoretically) lead to security hole:
the version counter is only 16-bits long.  
If we will start a lot of processes, it is possible to obtain same
for old (long time sleeping) and new process, so we can exchange mapping
some addresses.
We do not reproduce it now, but remember about problem with one-byte
of active I/O requests in RSX-11M system, where after 256 QIO$
executive regards task able to swap... :-)

It is also looks strange that function 'get_new_mmu_context' takes asid
parameter, due it must be only 'asid_cache' -- all other values can lead
to problems. Really, both `r4xx0.c` and `r2300.c` use it only with 
'asid_cache' parameter, so may be we eliminate second parameter at all

Suggested (tested for R3000) code looks fixing above problems, 
due it uses all bits upper than hardware asid as their version:

 *  All unused by hardware upper bits will be considered 
 *  as software asid extension -- asid version. 

extern inline void get_new_mmu_context(struct mm_struct *mm)
        /* Version of task asid is zero or obsolete, 
           we should assign actual version and next hardware asid */

        /* Increment _full_ asid carefully, to avoid zero version */
        if (!(asid_cache += (1<<ASID_SHIFT))) 
                asid_cache = ASID_FIRST_VERSION; 

        /* For zero hardware part of asid we starts new process cycle,
           so we need to invalidate all old-cycle TLB-entries */
        if (!(asid_cache & ASID_MASK))
        mm->context = asid_cache;

extern inline void get_mmu_context(struct task_struct *p)
        struct mm_struct *mm = p->mm;

        if (mm) {
                /* Check if our ASID is of an older version and thus invalid */
                if ((mm->context ^ asid_cache) & ASID_VERSION_MASK)

We also have changed line (in `r2300_resume`):

        andi    a3, KU_USER
        andi    a3, ASID_MASK

due it eliminates TLB/ASID logic at all.
R4xx0-code looks does not have this problem.

After this change 'flush_tlb_all()' was removed from 'switch_to' without

'wire_mappings_r3000' which looks present by historical/(futures ?)
does not restore ENTRYHI correctly, what can also lead to problems.
Fortunately, it is called only at kernel startup.

o  Strange code detected in 'lib/memcpy.S' file:

Code, checking memmove arguments overlapping, looks like:

        sltu    t0, a0, a1                      # dst < src -> memcpy
        bnez    t0, memcpy
         addu   v0, a0, a2
        sltu    t0, v0, a1                      # dst + len < src -> non-
        bnez    t0, __memcpy                    # overlapping, can use memcpy
         move   v0, a0                          /* return value */

LEAF(__rmemcpy)                                 /* a0=dst a1=src a2=len */
        addu    a0, a2                          # dst = dst + len
        addu    a1, a2                          # src = src + len

        /* many comments */

        lb      t0, -1(a1)
        subu    a2, a2, 0x1
        sb      t0, -1(a0)
        subu    a1, a1, 0x1
        bnez    a2, r_end_bytes
         subu   a0, a0, 0x1

        jr      ra
         move   a2, zero

First branch is okey: when destination is less than source, we can use
But second condition checks something strange...
Really, see picture:

                <.........src..........>         <----  memcpy

                          <.......dst.......>    <----  rmemcpy

                                          <....dst....>    <---- memcpy

So, best check should be:  src + len < dst

The result is not a bug, but a feature -- in a half of cases use
4-slower code.

Suggested (tested) code is following:

        sltu    t0, a0, a1                      # dst < src -> memcpy
        bnez    t0, memcpy
         addu   v0, a1, a2
        sltu    t0, v0, a0                      # src + len < dst -> memcpy
        bnez    t0, __memcpy                    
         move   v0, a0                          /* return value */

After above changes were produced, Linux compiles GDB normally 
(and even seems faster),  using two gcc in parallel, and with restored 
`r2300.c` TLB-optimization code -- we have removed our (introduced for
TOTAL_TLB_FLUSH logic from one.

Gleb Raiko will send our patch to Harald, due we have more fixes, than
were described here, so ask Harald for actual code.

Please send Your related comments and results,

Best wishes,

<Prev in Thread] Current Thread [Next in Thread>