This message is addressed to developers who is working on
Linux support for R[34]000-series computers.
It is dedicated to Linux/2.1.121 code for R3000.
Hello all, hello Harald !
o Strange code detected in 'asm-mips/mmu_context.h' file:
extern inline void get_new_mmu_context(struct mm_struct *mm, unsigned
long asid)
{
/* check if it's legal.. */
if (((asid & ASID_MASK) >> ASID_SHIFT) > MAX_ASID) {
/* start a new version, invalidate all old asid's */
flush_tlb_all();
asid = (asid & ASID_VERSION_MASK) + ASID_FIRST_VERSION;
if (!asid)
asid = ASID_FIRST_VERSION;
}
asid_cache = asid + (1 << ASID_SHIFT);
mm->context = asid; /* full version + asid */
}
The first 'if' condition will never succeed for both R3000/R4000.
Yet another problem can (theoretically) lead to security hole:
the version counter is only 16-bits long.
If we will start a lot of processes, it is possible to obtain same
versions
for old (long time sleeping) and new process, so we can exchange mapping
for
some addresses.
We do not reproduce it now, but remember about problem with one-byte
counter
of active I/O requests in RSX-11M system, where after 256 QIO$
directives
executive regards task able to swap... :-)
It is also looks strange that function 'get_new_mmu_context' takes asid
as
parameter, due it must be only 'asid_cache' -- all other values can lead
to problems. Really, both `r4xx0.c` and `r2300.c` use it only with
'asid_cache' parameter, so may be we eliminate second parameter at all
?
Suggested (tested for R3000) code looks fixing above problems,
due it uses all bits upper than hardware asid as their version:
<<<<<<<<<<<<<
/*
* All unused by hardware upper bits will be considered
* as software asid extension -- asid version.
*/
#define ASID_VERSION_MASK (~(ASID_MASK|(ASID_MASK-1)))
#define ASID_FIRST_VERSION ((~ASID_VERSION_MASK) + 1)
extern inline void get_new_mmu_context(struct mm_struct *mm)
{
/* Version of task asid is zero or obsolete,
we should assign actual version and next hardware asid */
/* Increment _full_ asid carefully, to avoid zero version */
if (!(asid_cache += (1<<ASID_SHIFT)))
asid_cache = ASID_FIRST_VERSION;
/* For zero hardware part of asid we starts new process cycle,
so we need to invalidate all old-cycle TLB-entries */
if (!(asid_cache & ASID_MASK))
flush_tlb_all();
mm->context = asid_cache;
}
extern inline void get_mmu_context(struct task_struct *p)
{
struct mm_struct *mm = p->mm;
if (mm) {
/* Check if our ASID is of an older version and thus invalid */
if ((mm->context ^ asid_cache) & ASID_VERSION_MASK)
get_new_mmu_context(mm);
}
}
>>>>>>>>>>>>>>>
We also have changed line (in `r2300_resume`):
andi a3, KU_USER
to:
andi a3, ASID_MASK
due it eliminates TLB/ASID logic at all.
R4xx0-code looks does not have this problem.
After this change 'flush_tlb_all()' was removed from 'switch_to' without
problems.
'wire_mappings_r3000' which looks present by historical/(futures ?)
reasons,
does not restore ENTRYHI correctly, what can also lead to problems.
Fortunately, it is called only at kernel startup.
o Strange code detected in 'lib/memcpy.S' file:
Code, checking memmove arguments overlapping, looks like:
<<<<<<<<<<<
LEAF(memmove)
sltu t0, a0, a1 # dst < src -> memcpy
bnez t0, memcpy
addu v0, a0, a2
sltu t0, v0, a1 # dst + len < src -> non-
bnez t0, __memcpy # overlapping, can use memcpy
move v0, a0 /* return value */
END(memmove)
LEAF(__rmemcpy) /* a0=dst a1=src a2=len */
addu a0, a2 # dst = dst + len
addu a1, a2 # src = src + len
/* many comments */
r_end_bytes:
lb t0, -1(a1)
subu a2, a2, 0x1
sb t0, -1(a0)
subu a1, a1, 0x1
bnez a2, r_end_bytes
subu a0, a0, 0x1
r_out:
jr ra
move a2, zero
>>>>>>>>>>>>
First branch is okey: when destination is less than source, we can use
memcpy.
But second condition checks something strange...
Really, see picture:
<.........src..........> <---- memcpy
<......dst.......>
<.........src..........>
<.......dst.......> <---- rmemcpy
<.........src..........>
<....dst....> <---- memcpy
So, best check should be: src + len < dst
The result is not a bug, but a feature -- in a half of cases use
4-slower code.
Suggested (tested) code is following:
<<<<<<<<<<<<
LEAF(memmove)
sltu t0, a0, a1 # dst < src -> memcpy
bnez t0, memcpy
addu v0, a1, a2
sltu t0, v0, a0 # src + len < dst -> memcpy
bnez t0, __memcpy
move v0, a0 /* return value */
END(memmove)
>>>>>>>>>>>>
After above changes were produced, Linux compiles GDB normally
(and even seems faster), using two gcc in parallel, and with restored
`r2300.c` TLB-optimization code -- we have removed our (introduced for
test)
TOTAL_TLB_FLUSH logic from one.
Gleb Raiko will send our patch to Harald, due we have more fixes, than
were described here, so ask Harald for actual code.
Please send Your related comments and results,
Best wishes,
Vladimir
|