From LinuxMIPS
Revision as of 04:19, 16 November 2004 by Ralf (Talk | contribs)



IMPACT was a series of high-end graphics cards from Silicon Graphics, used in two families of machines: Indigo2 IMPACT and Octane. The cards have both geometry and rasterization units, and some models also support hardware texturing.

The boards were also internally known as the MGRAS or Mardi Gras family.

Much more is known about the Octane IMPACT cards, also known as IMPACTSR (after Speedracer, Octane's internal name), so this entry focuses on that family.


The IMPACTSR cards originally came in three versions: SI, SSI and MXI. An SI+T version was also possible. Later, the "E-series" cards were introduced, with vastly improved performance figures: SE, SSE and MXE (SGI is unclear on the correct naming, since the Octane PROM uses the names ESI, ESSI and EMXI while most documents use SE, SSE and MXE). Adding a texture module to an SE card gave an SE+T.

The SS? and MX? cards are double-size while the S? and S?+T cards are single-size. The size matters here, because the double-size cards possess twice as many of each processing element, providing nearly twice the rendering power.

The S?+T and MX? cards support hardware texturing by virtue of having the appropriate number of texture modules (one for S?+T and two for MX?). Adding texture modules to an S? or SS? card converts it into an S?+T or MX? respectively. The I-series and E-series cards require different modules.
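The naming rules above can be summarised in a small model. This is only an illustrative sketch: the class `ImpactCard` and the function `add_texture` are hypothetical names invented here, and the model encodes nothing beyond the size and texture-module rules stated in the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImpactCard:
    """Hypothetical description of an IMPACTSR board (not an SGI data structure)."""
    series: str           # "I" for the original cards, "E" for the E-series
    double_size: bool     # True for SS?/MX?, False for S?/S?+T
    texture_modules: int  # 0, 1 (S?+T) or 2 (MX?)

    @property
    def name(self) -> str:
        s = self.series
        if self.double_size:
            return f"MX{s}" if self.texture_modules == 2 else f"SS{s}"
        return f"S{s}+T" if self.texture_modules == 1 else f"S{s}"


def add_texture(card: ImpactCard) -> ImpactCard:
    """Fit the number of texture modules the text gives for this board size."""
    if card.texture_modules:
        raise ValueError("card already has texture modules")
    return replace(card, texture_modules=2 if card.double_size else 1)


if __name__ == "__main__":
    for card in (ImpactCard("I", False, 0), ImpactCard("E", True, 0)):
        print(card.name, "->", add_texture(card).name)
```

The model uses the SE/SSE/MXE spelling found in most documents rather than the PROM's ESI/ESSI/EMXI.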

The cards are apparently mostly software-compatible. The few exceptions mostly concern initialization, microcode loading and, obviously, the hardware texturing functionality.

The Pipeline

Let us follow the graphics primitive as it travels down the graphics pipeline inside the IMPACT cards.

HQ4 command processor

After an OpenGL command is issued by the program running on the host processor, the command is passed as a short series of writes to the Command FIFO. The Command FIFO resides in the HQ4 chip. It has a fixed capacity (probably 64 commands) and the processor must be careful not to exceed this number. The HEART flow control mechanism is designed to help with this.
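The host-side discipline described above (never write more than the FIFO can hold) can be sketched as a bounded queue. This is a software model only, not driver code: `CommandFifo` and `submit` are hypothetical names, the capacity of 64 is the text's own guess, each command is simplified to a single entry, and `drain` merely stands in for the hardware consuming commands while the host waits.

```python
from collections import deque

FIFO_CAPACITY = 64  # "probably 64" according to the text; an assumption


class CommandFifo:
    """Toy model of a fixed-capacity command FIFO."""

    def __init__(self, capacity=FIFO_CAPACITY):
        self.capacity = capacity
        self.entries = deque()

    def free_slots(self):
        return self.capacity - len(self.entries)

    def write(self, command):
        if self.free_slots() == 0:
            raise OverflowError("command FIFO full")
        self.entries.append(command)

    def drain(self, n=1):
        """Model the command processor consuming up to n commands."""
        for _ in range(min(n, len(self.entries))):
            self.entries.popleft()


def submit(fifo, commands):
    """Write commands, stalling whenever the FIFO has no free slot.

    Returns the number of stalls that occurred.
    """
    stalls = 0
    for command in commands:
        while fifo.free_slots() == 0:
            stalls += 1
            fifo.drain(1)  # stand-in for waiting on real hardware progress
        fifo.write(command)
    return stalls


if __name__ == "__main__":
    fifo = CommandFifo()
    print("stalls:", submit(fifo, range(100)))
```

On real hardware the waiting would presumably be handled by the HEART flow control mechanism rather than by polling in software.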

The HQ4 chip performs a number of important functions. It performs DMA operations, interfaces to the XIO host bus and finally also distributes graphics primitives to the Geometry Engines. It also controls the DCB (Display Control Bus), which is a very simple 8-bit bus that is used to set up other graphics chips (program colormaps, set operational register values etc.).

In this case, with an OpenGL command, the command token is passed to the first free Geometry Engine (only the double-size boards have more than one GE, so the choice arises only there).

GE11 geometry engine

The geometry engine is responsible for interpreting OpenGL command tokens, (unsurprisingly) geometry processing and (surprisingly) also OpenGL imaging operations (filtering and so on). It is basically a high performance floating-point unit.

The GE11 consists of three floating-point multiply-accumulate style datapaths. They are operated by the control microcode in a SIMD fashion (Single Instruction Multiple Data: all units execute the same operation simultaneously). There are also 32 registers for each datapath, and an internal multiport SRAM for result caching and (probably) also GL context storage.
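To illustrate what SIMD operation across three multiply-accumulate datapaths means, here is a small software model. It is a sketch under assumptions: the class `Ge11Sketch`, its register assignment and the `transform` helper are invented for illustration and say nothing about the real GE11 microcode or instruction set. Only the lane count and the register count come from the text.

```python
LANES = 3            # three datapaths, per the text
REGS_PER_LANE = 32   # 32 registers per datapath, per the text


class Ge11Sketch:
    """Toy SIMD model: every operation is applied to all lanes at once."""

    def __init__(self):
        self.regs = [[0.0] * REGS_PER_LANE for _ in range(LANES)]

    def load(self, reg, per_lane_values):
        """Load a different value into the same register of each lane."""
        if len(per_lane_values) != LANES:
            raise ValueError("need one value per lane")
        for lane, value in zip(self.regs, per_lane_values):
            lane[reg] = float(value)

    def broadcast(self, reg, value):
        """Load the same value into the same register of every lane."""
        self.load(reg, [value] * LANES)

    def mac(self, dst, a, b):
        """dst += a * b, executed identically in every lane."""
        for lane in self.regs:
            lane[dst] += lane[a] * lane[b]


def transform(matrix, vector):
    """3x3 matrix times 3-vector; lane i computes output component i."""
    ge = Ge11Sketch()
    for col in range(3):
        ge.load(col, [matrix[row][col] for row in range(3)])
        ge.broadcast(3 + col, vector[col])
    for col in range(3):
        ge.mac(6, col, 3 + col)  # register 6 is the accumulator
    return [lane[6] for lane in ge.regs]


if __name__ == "__main__":
    print(transform([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [1, 0, -1]))
```

Three multiply-accumulate steps produce all three output components, which is why this arrangement suits transformation and lighting arithmetic.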

The microcode memory is huge, totaling well over 1 MB, with each operation taking 95 bits. This is not surprising, as the GE11 is an almost complete OpenGL implementation.

After processing lighting, transformations and triangle setup the GE11 performs a series of writes to the RSS (Rendering Subsystem) via the RE.

RE4 rendering engine

The rendering engine rasterizes the graphics primitives, forming the key part of the raster subsystem (RSS). It is responsible primarily for generating fragments (small clusters of pixels, two pixels wide and one high in the case of the RE4, but often two by two on other cards, like the Onyx2 InfiniteReality), and also for Gouraud interpolation of colors.

On cards with two REs (the double-size cards), the REs operate in Scan Line Interleave mode (as two Voodoo2 cards would, or two PCI Express GeForce 6800 Ultra cards), i.e. one RE processes only the even picture lines and the other only the odd ones.
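The fragment layout and the scan line interleave can be expressed as a simple mapping from pixel coordinates. This is a minimal sketch: the function name `route_pixel` is hypothetical, and which RE takes the even lines is an assumption, since the text does not say.

```python
def route_pixel(x, y, num_res):
    """Map a pixel to (RE index, fragment index on the line, pixel within fragment).

    Assumes RE 0 takes even lines and RE 1 odd lines on a two-RE board,
    and that fragments are 2 pixels wide and 1 pixel high.
    """
    if num_res not in (1, 2):
        raise ValueError("IMPACT boards have one or two REs")
    re_index = y % num_res
    fragment_index = x // 2
    pixel_in_fragment = x % 2
    return re_index, fragment_index, pixel_in_fragment


if __name__ == "__main__":
    for y in range(4):
        print(y, [route_pixel(x, y, 2) for x in range(4)])
```

On a single-size board `num_res` is 1, so every line goes to the same RE.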

The RE4 passes the generated fragments to the Pixel Pipes via a Rambus link.

TE1 texture engine

The texture engine receives coordinates from the RE4 and cooperates closely with it, providing it with interpolated texture data. Each TE1 is associated with a TM (Texture Module) that performs the interpolation. The TM contains some Rambus DRAM to cache the texture data.

PP1 pixel pipe

A pixel pipe controls the Rambus DRAM framebuffer. It performs alpha blending, Z buffering and logic operations. Another functional block inside the PP1 is the XMAP which performs pixel format translation and outputs the data to the colormap chips.

Each RE4 controls two PP1s, because a fragment consists of two pixels. A PP1 controls three 2Mx9 Rambus DRAMs. Many blending modes, Z buffer comparison modes and other options are available.
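The figures above allow a rough estimate of the total RDRAM behind the pixel pipes. The arithmetic below is a sketch under stated assumptions: "2M" is taken to mean 2 x 1024 x 1024 words, and all of the memory is counted, whether or not the hardware uses all of it for pixel data. The function name `framebuffer_bytes` is made up for this example.

```python
RDRAM_WORDS = 2 * 1024 * 1024  # "2M" words per chip (assumed binary megawords)
RDRAM_WORD_BITS = 9            # 9-bit words, per "2Mx9"
RDRAMS_PER_PP1 = 3             # per the text
PP1S_PER_RE4 = 2               # per the text


def framebuffer_bytes(num_re4):
    """Total RDRAM capacity behind the pixel pipes, in bytes."""
    bits = (RDRAM_WORDS * RDRAM_WORD_BITS
            * RDRAMS_PER_PP1 * PP1S_PER_RE4 * num_re4)
    return bits // 8


if __name__ == "__main__":
    for num_re4 in (1, 2):
        mib = framebuffer_bytes(num_re4) / 2**20
        print(f"{num_re4} RE4: {mib} MiB")
```

Under these assumptions a single-size board has 13.5 MiB and a double-size board 27 MiB.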

The XMAP takes a 5-bit DID (Display ID) from the video controller. This indexes a pixel format table. This way, different pixel formats can be simultaneously displayed on the screen. The output pixels from the XMAP are passed to the CMAP by a parallel interface.
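The DID lookup can be pictured as a per-pixel table index. This is only a sketch: a 5-bit DID implies a table of 32 entries, but the format names (`"ci8"`, `"rgb24"`) and the function names are hypothetical placeholders, not actual XMAP modes.

```python
DID_BITS = 5
NUM_DIDS = 1 << DID_BITS  # 32 table entries for a 5-bit DID


def make_format_table(default="ci8"):
    """Create a pixel format table with every DID set to one default format."""
    return [default] * NUM_DIDS


def resolve_formats(format_table, did_per_pixel):
    """Return the pixel format selected for each pixel along a scanline."""
    return [format_table[did & (NUM_DIDS - 1)] for did in did_per_pixel]


if __name__ == "__main__":
    table = make_format_table()
    table[1] = "rgb24"  # e.g. a window rendered in true colour
    print(resolve_formats(table, [0, 0, 1, 1, 0]))
```

Because the format is chosen per pixel, a true colour window and a palette window can share one screen.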

CMAP colormap

This is frankly the very same colormap as used on Indy machines in Newport graphics boards. Apparently, it was a good design. The colormap translates colors in palette modes to RGB components. Thanks to DIDs, many colormaps may be simultaneously active on the screen (like pixel formats).

There are two CMAP chips (for two pixels of each fragment).

VC3 video controller

This chip is responsible for generating video synchronization signals. Because these signals are often quite complex, the VC3 executes a program from external memory that describes them in an RLE-like encoding.

A similar program contains the DIDs. Each DID describes a set of windows that use a common pixel format and palette.
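The idea of a run-length description can be illustrated generically. The actual VC3 program format is not described here, so the `(length, value)` pair encoding below is purely an assumption, and `expand_runs` and `encode_runs` are hypothetical helpers. The same scheme is applied to a sync signal and to the DIDs along a scanline.

```python
def expand_runs(runs):
    """Expand (length, value) pairs into one value per pixel clock."""
    out = []
    for length, value in runs:
        if length <= 0:
            raise ValueError("run length must be positive")
        out.extend([value] * length)
    return out


def encode_runs(values):
    """Collapse a per-pixel sequence into (length, value) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][1] == value:
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            runs.append((1, value))
    return runs


if __name__ == "__main__":
    # A sync pulse: low for 3 clocks, then high for 5 (made-up timings).
    print(expand_runs([(3, 0), (5, 1)]))
    # DIDs along a scanline: background, a window using DID 1, background.
    print(encode_runs([0, 0, 0, 1, 1, 1, 1, 0, 0]))
```

A run-length form is compact because both sync signals and window layouts stay constant over long stretches of each line.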

The VC3 is also used to generate a hardware cursor.

The VC3 is a slightly updated version of VC2.


ADV7160 video DAC

The ADV7160 is a high-quality video DAC from Analog Devices, Inc. However, in IMPACT boards all of its internal video processing features (like colormaps or the hardware cursor; note that IMPACT therefore contains two hardware cursors) are left unused, and it serves as a simple triple DAC with a 2:1 multiplexer (which finally joins the two pixels of each fragment into one stream).
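The effect of the 2:1 multiplexer can be shown as a simple interleave of two pixel streams. This is a sketch of the concept only: the function name `mux_2to1` is hypothetical, and the real multiplexer operates on pixel clocks in hardware.

```python
def mux_2to1(first_pixels, second_pixels):
    """Interleave the first and second pixel of each fragment into one stream."""
    if len(first_pixels) != len(second_pixels):
        raise ValueError("both inputs must carry one pixel per fragment")
    stream = []
    for first, second in zip(first_pixels, second_pixels):
        stream.append(first)
        stream.append(second)
    return stream


if __name__ == "__main__":
    # Fragments (0,1), (2,3), (4,5) arrive as two parallel streams.
    print(mux_2to1([0, 2, 4], [1, 3, 5]))
```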

There is, however, nothing that would prevent anyone from using the hardware cursor of the ADV7160, for instance to provide a text cursor.

VIO1 video I/O

This chip is probably used to stream video from video add-on cards to the screen.

Programming the IMPACTSR

In Linux, the IMPACT is visible as a framebuffer device. However, its framebuffer is not direct-mapped, so fbdev-type applications will not yield any useful results. Indeed, they may crash the board, because the area at offset 0x00000000 gives access to the IMPACT register file when mmap(2)ed.

Mapping other areas of the video device causes DMA buffers to be allocated and set up. This makes DMA programming on IMPACT relatively easy under Linux.

Much remains unknown about this card... Any information is appreciated.