On Thu, 5 Feb 2015, David Daney wrote:
> > Well, I do actually, I have a working machine driven by an R4000
> > processor. It was the original implementation of the Status.RE feature
> > and therefore it can be used as the reference. I don't feel tempted to
> > use my time to actually make any checks though.
> > What I did instead, I checked the R4000 manual ...
> You are still relying on your interpretation of the text, rather than actual
> behavior of the device. It is not at all surprising that your interpretation of
> the manual hasn't changed, but it doesn't persuade me.
> I am sticking to my belief that OCTEON faithfully implements the specification
> with respect to the in-memory byte ordering of the various load and store
> instructions. Switching the endianness of the processor results in byte arrays
> being scrambled such that the low-order 3 bits of each byte address are XORed
> with 7. This implies that aligned 64-bit loads and stores (LD, SD, LLD, SCD)
> result in identical in-memory and in-register layout for either endianness.
> This is quite handy
> when writing driver code for devices that have 64-bit registers.
Fair enough, this helps interfacing fixed-endian peripherals such as a
PCI bus. Some MIPS-based SoCs map PCI/memory twice in the bus address
space for the benefit of big-endian systems, once with a byte lane
matching policy and again with a bit lane matching policy. This results
in a swapped memory view between the two mapping spaces as seen by PCI
devices doing DMA.
What you describe refers to the bit lane matching policy, which has
benefits for PIO and MMIO, as values written to peripheral registers do not
change with a host bus endianness change (as long as accesses are, as you
noted, only made using the specific data width intended). This contrasts
with DMA, where the byte lane matching policy makes more sense, as it makes
byte streams written to memory the same regardless of the host bus endianness.
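To make the difference between the two policies concrete, here is a minimal
sketch in C. It is a host-side simulation only, not code for any particular
SoC or bus bridge, and every function name in it is made up for this
illustration. It assumes a 64-bit wide data path, so that bit lane matching
shows up as the byte address being XORed with 7, and it models byte lane
matching as byte addresses being preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 64-bit value the way a big-endian CPU lays it out in memory. */
static void store64_be(uint8_t *mem, uint64_t v)
{
	for (int i = 0; i < 8; i++)
		mem[i] = (uint8_t)(v >> (56 - 8 * i));
}

/* Load a 64-bit value the way a little-endian agent interprets memory. */
static uint64_t load64_le(const uint8_t *mem)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)mem[i] << (8 * i);
	return v;
}

/*
 * Bit lane matching (assumed 64-bit data path): every data bit stays on
 * its wire, so the byte at address a is seen at address a ^ 7 from the
 * other side.  len must be a multiple of 8.
 */
static void view_bit_lane(uint8_t *dst, const uint8_t *src, size_t len)
{
	assert(len % 8 == 0);
	for (size_t a = 0; a < len; a++)
		dst[a ^ 7] = src[a];
}

/* Byte lane matching: byte addresses are preserved across the bridge. */
static void view_byte_lane(uint8_t *dst, const uint8_t *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Returns 1 if both policies behave as described in the text, else 0. */
static int lane_demo_ok(void)
{
	const uint8_t text[8] = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H' };
	uint8_t be_mem[8], seen[8];

	store64_be(be_mem, 0x0102030405060708ULL);

	/* Bit lanes matched: the 64-bit register value survives. */
	view_bit_lane(seen, be_mem, 8);
	if (load64_le(seen) != 0x0102030405060708ULL)
		return 0;

	/* Byte lanes matched: the 64-bit value comes out byte-swapped. */
	view_byte_lane(seen, be_mem, 8);
	if (load64_le(seen) != 0x0807060504030201ULL)
		return 0;

	/* Bit lanes matched: a byte stream is reversed within each doubleword. */
	view_bit_lane(seen, text, 8);
	if (memcmp(seen, "HGFEDCBA", 8) != 0)
		return 0;

	/* Byte lanes matched: a byte stream is left intact. */
	view_byte_lane(seen, text, 8);
	return memcmp(seen, text, 8) == 0;
}
```

Calling lane_demo_ok() from a test harness exercises all four cases: full-width
register values survive bit lane matching, while byte streams survive byte
lane matching, which is why the former suits PIO/MMIO and the latter suits DMA.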
What does it have to do with the user mode though? Device drivers do not
usually run in the user mode and even if they do (such as X11 DDX), then
what would be the benefit for them from running in the reverse-endian
mode? They'd have to cope with the rest of the environment being
byte-swapped anyway. Having, say, an MMIO resource mapped as a region
configured in hardware for swapping with the bit lane matching policy
would make more sense than having the whole user binary (here the X
server) built for and run with the opposite endianness.
The use of CP0.Status.RE is different and it has to be implemented so
as to fulfil its purpose. That, for example, may be running little-endian
DEC Ultrix/MIPS user binaries under a foreign personality on a big-endian
MIPS machine running SGI IRIX or Linux. Of course with the demise of
proprietary *nix systems for the MIPS processor such a feature seems