Toshi Morita writes:
> > If you have a buffer which is not cache-line-aligned (which is
> > possible with the general case of raw or direct I/O, although not in
> > unmodified Linux at the moment), then, for DMA into memory, you must
> > use temporary buffers for any portion of the buffer which occupies
> > just part of a cache line, and copy the data from the temporary buffer
> > to the real buffer after the DMA completes, to account for the
> > possibility of a separate thread modifying data outside the buffer in
> > the shared cache line, leading to a victim writeback (or a
> > writethrough on the R3000). This could apply even to the R3000, depending
> > on how the compiler generates code for a partial-word update, although
> > it is unlikely.
> I don't see why this is necessary?
> You should only have to force a writeback of the first and last cache lines
> prior to a non-cache-aligned DMA.
Suppose you have a process with two or more threads
(clone/pthread), or suppose you have two processes sharing a System V
shared memory segment, or suppose you have two processes sharing a
mapped file (MAP_SHARED with PROT_WRITE). Then suppose the buffer is
in the shared memory and is unaligned, and suppose that the buffer
starts at offset 0x10 in a cache line, and there is a variable x at
offset 0x0 in the same cache line. Then suppose thread A starts a DMA
into memory, thread B modifies x (leaving the cache line dirty in
the cache), the DMA updates the beginning of the buffer in memory, and
then the line containing x is written back (for example, via a victim
writeback). Now the bytes of the buffer starting at offset 0x10 in the
cache line have been overwritten by the stale data from before the DMA.
This scenario does
not apply on machines with cache-coherent I/O (such as IA32 PCs), but
it does apply on many RISC systems, including all systems using the
various QED processors and all SGI and MIPS Computer Systems
workstations with pre-R10000 processors.
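
To make the sequence concrete, here is a toy model in C of a single
write-back cache line and the memory behind it. It is only a sketch under
assumed parameters, not real driver code: the 32-byte line size, the 0xAA
fill pattern, and every name in it (line_model, cpu_store, run_direct_dma,
run_bounce_dma, and so on) are made up for illustration. The first function
does the writeback before the DMA and still loses the data, because thread B
dirties the line after that writeback. The second uses a temporary buffer
and copies through the cache after the DMA completes.

```c
#include <assert.h>
#include <string.h>

/* Assumed geometry: 32-byte line, x at offset 0x0, buffer at offset 0x10. */
enum { LINE_SIZE = 32, BUF_OFF = 0x10, BUF_LEN = LINE_SIZE - BUF_OFF };

/* One cache line of RAM plus the CPU's (write-back) cached copy of it. */
struct line_model {
    unsigned char ram[LINE_SIZE];
    unsigned char cache[LINE_SIZE];
    int valid;
    int dirty;
};

/* CPU store: fill the line from RAM on a miss, then modify it in cache only. */
static void cpu_store(struct line_model *m, int off, unsigned char v)
{
    assert(off >= 0 && off < LINE_SIZE);
    if (!m->valid) {
        memcpy(m->cache, m->ram, LINE_SIZE);
        m->valid = 1;
    }
    m->cache[off] = v;
    m->dirty = 1;
}

/* Write the whole line back to RAM if it is dirty (e.g. a victim writeback). */
static void writeback(struct line_model *m)
{
    if (m->valid && m->dirty)
        memcpy(m->ram, m->cache, LINE_SIZE);
    m->dirty = 0;
}

static void writeback_invalidate(struct line_model *m)
{
    writeback(m);
    m->valid = 0;
}

/* Non-coherent DMA: the device writes its destination without touching the cache. */
static void dma_write(unsigned char *dst, unsigned char v, int len)
{
    memset(dst, v, len);
}

/* Number of buffer bytes in RAM that still hold the DMA pattern. */
static int count_dma_bytes(const struct line_model *m)
{
    int i, n = 0;
    for (i = BUF_OFF; i < LINE_SIZE; i++)
        n += (m->ram[i] == 0xAA);
    return n;
}

/* DMA straight into the unaligned buffer, with a writeback beforehand. */
int run_direct_dma(void)
{
    struct line_model m;
    memset(&m, 0, sizeof m);
    writeback_invalidate(&m);                  /* thread A: flush before DMA  */
    cpu_store(&m, 0, 1);                       /* thread B: x = 1, line dirty */
    dma_write(m.ram + BUF_OFF, 0xAA, BUF_LEN); /* device updates RAM          */
    writeback(&m);                             /* stale line overwrites RAM   */
    assert(m.ram[0] == 1);                     /* x survives ...              */
    return count_dma_bytes(&m);                /* ... the DMA data does not   */
}

/* DMA into a temporary buffer, then copy into the real buffer via the cache. */
int run_bounce_dma(void)
{
    struct line_model m;
    unsigned char bounce[BUF_LEN]; /* assumed to sit in cache lines of its own */
    int i;
    memset(&m, 0, sizeof m);
    cpu_store(&m, 0, 1);                       /* thread B: x = 1, line dirty */
    dma_write(bounce, 0xAA, BUF_LEN);          /* device fills the temp buffer */
    for (i = 0; i < BUF_LEN; i++)              /* copy after DMA completes    */
        cpu_store(&m, BUF_OFF + i, bounce[i]);
    writeback(&m);                             /* line now holds x and data   */
    assert(m.ram[0] == 1);
    return count_dma_bytes(&m);
}
```

In this model run_direct_dma() reports that none of the DMA'd bytes
survive in memory, while run_bounce_dma() reports all BUF_LEN of them,
and x keeps its new value in both cases. A real implementation would also
have to invalidate the temporary buffer's own cache lines before the CPU
reads it, which the model leaves out.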