On Wed, Jul 08, 2009 at 10:07:50AM -0700, David Daney wrote:
> The resume() implementation in octeon_switch.S examines the saved
> cp0_status register. We were clobbering the entire pt_regs structure
> in kernel threads leading to random crashes.
>
> When switching away from a kernel thread, the saved cp0_status is
> examined and if bit 30 is set it is cleared and the CP2 state saved
> into the pt_regs structure. Since the kernel thread stack overlaid
> the pt_regs structure this resulted in a corrupt stack. When the
> kthread with the corrupt stack was resumed, it could crash if it used
> any of the data in the stack that was clobbered.
>
> We fix it by moving the kernel thread stack down so it doesn't overlay
> pt_regs.
>
> Differences from v1: Don't adjust the sp by an additional 32 bytes, it
> was not needed. Also fix up __KSTK_TOS and task_pt_regs.
Thanks for fixing and testing the issues I raised on IRC. Next I'm wondering
what impact the uninitialized state of the stack frame we allocate may
have. I think we're ok - but I need to stare at this for a few more
minutes.
Ralf