linux-mips
[Top] [All Lists]

Re: [PATCH 2] 64K page size

To: Peter Watkins <treestem@gmail.com>
Subject: Re: [PATCH 2] 64K page size
From: Thiemo Seufer <ths@networkno.de>
Date: Fri, 25 Aug 2006 16:21:21 +0100
Cc: linux-mips@linux-mips.org, Ralf Baechle <ralf@linux-mips.org>, Jonathan Day <imipak@yahoo.com>
In-reply-to: <44EF0C61.7090008@gmail.com>
Original-recipient: rfc822;linux-mips@linux-mips.org
References: <44EF0C61.7090008@gmail.com>
Sender: linux-mips-bounce@linux-mips.org
User-agent: Mutt/1.5.13 (2006-08-11)
Peter Watkins wrote:
> Hello,
> 
> Attached is the rest of the 64K page patch. It's been tested 
> uniprocessor on Malta 20KC and 25KF. It also runs on a 6-way SMP 
> functional simulator containing 5KF's. Ran tests with 16K and 64K page 
> size.
> 
> Question: Is there an SMP malta board?

You mean a SMP core card? None I heard of.

> There are 2 areas which could use improvement:
> 
> (1) Because 64K is larger than a 15 bit immediate operand, I could not 
> get the asm-offsets mechanism to produce the correct constants. So I 
> enlisted a fairly gruesome hack of using #define's for _PAGE_SIZE and 
> _THREAD_SIZE, for that page size. Hopefully someone has a better idea.
> 
> (2) In tlbex.c:build_adjust_context(), I suspect the change for shift = 
> PAGE_SHIFT - 12 should be more generally true, rather than just for the 
> CPU's mentioned in the case statement.

It should be added to the initial shift calculation instead, i suspect
it will break 32bit kernels elsewise. It should be generally the same
for all CPUs with r4k-style TLBs.

> I was conservative there because 
> I'm not familiar with the CPU_VR41* machines. Hopefully someone more 
> intimate with that code can comment.

The vr41xx has a different/broken page mask in order to support 1k pages
(which aren't used by linux). The shift += 2 fixes this up.

[snip]
> index c06f63e..c9de17c 100644
> --- a/arch/mips/kernel/head.S
> +++ b/arch/mips/kernel/head.S
> @@ -190,7 +190,12 @@ #endif /* CONFIG_MIPS_MT_SMTC */
>  
>       MTC0            zero, CP0_CONTEXT       # clear context register
>       PTR_LA          $28, init_thread_union
> +#ifdef CONFIG_PAGE_SIZE_64KB
> +     PTR_ADDIU       sp, $28, (_THREAD_SIZE - 32)/2
> +     PTR_ADDIU       sp,  sp, (_THREAD_SIZE - 32)/2
> +#else
>       PTR_ADDIU       sp, $28, _THREAD_SIZE - 32
> +#endif

A PTR_LI ... ; PTR_ADDU sequence would be better, without an #ifdef.
This isn't a critical path of execution, and one day we might go for
even bigger pagesizes. Likewise for the other doubled addiu in this
patch.

[snip]
> diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
> index 375e099..bf093aa 100644
> --- a/arch/mips/mm/tlbex.c
> +++ b/arch/mips/mm/tlbex.c
> @@ -97,6 +97,8 @@ #define FUNC_SH             0
>  #define SET_MASK     0x7
>  #define SET_SH               0
>  
> +#define OP_ERET              0x42000018
> +
>  enum opcode {
>       insn_invalid,
>       insn_addu, insn_addiu, insn_and, insn_andi, insn_beq,
> @@ -631,6 +633,9 @@ static __init void copy_handler(struct r
>  static __init int __attribute__((unused)) insn_has_bdelay(struct reloc *rel,
>                                                         u32 *addr)
>  {
> +     if (*addr == OP_ERET)
> +             return 1;
> +

Why? Eret has no BD slot.

>       for (; rel->lab != label_invalid; rel++) {
>               if (rel->addr == addr
>                   && (rel->type == R_MIPS_PC16
> @@ -996,7 +1001,12 @@ #else
>  #endif
>  
>       l_vmalloc_done(l, *p);
> -     i_dsrl(p, tmp, tmp, PGDIR_SHIFT-3); /* get pgd offset in bytes */
> +
> +     /* Want PGDIR_SHIFT-3 here, but break it into two ops so we don't
> +      * exceed the max shift amount of 31 with large page sizes. */
> +     i_dsrl(p, tmp, tmp, PGDIR_SHIFT-16);    /* get pgd offset in bytes */
> +     i_dsrl(p, tmp, tmp, 16-3);              /* get pgd offset in bytes */

This cries for a single i_dsrl32 for the large page case.

> +
>       i_andi(p, tmp, tmp, (PTRS_PER_PGD - 1)<<3);
>       i_daddu(p, ptr, ptr, tmp); /* add in pgd offset */
>       i_dmfc0(p, tmp, C0_BADVADDR); /* get faulting address */
> @@ -1087,6 +1097,11 @@ static __init void build_adjust_context(
>       case CPU_VR4133:
>               shift += 2;
>               break;
> +     case CPU_20KC:
> +     case CPU_25KF:
> +     case CPU_5KC:
> +             shift  = PAGE_SHIFT - 12;
> +             break;

As said, should be done for all CPUs. (This will also fix the vr41xx
case, just in case somebody is crazy enough to go for large pages
on his PDA. :-)


Thiemo

<Prev in Thread] Current Thread [Next in Thread>