Towards Software Defined Video Display Systems
| Author: | Samuel A. Falvo II |
| Contact: | kc5tja .at. arrl.net |
| Revision: | 20081122 |
| Date: | 2008 Nov 22 |
| Copyright: |
Copyright (c) 2008 Samuel A. Falvo II. |
| License: | Creative Commons Attribution-Share Alike 3.0 United States License. See http://creativecommons.org/licenses/by-sa/3.0/us/ for more information. |
ABSTRACT
Computers have a history of using rigid video display hardware. Even with the increasing utilization of CPLDs or FPGAs, most video display needs must be statically determined, resulting in applications and operating systems designed around inherent limitations of the underlying hardware. The SEAforth-24 chip provides a platform for a stable, software-defined display interface at VGA resolutions. Software-defined video solutions greatly lower the barrier to co-developing "display hardware" features concurrently with application software.
PROBLEM
Throughout the history of computers, video display hardware, where it existed at all, has proven rather rigid in its capabilities, from the simplest designs to the most complex. The simplest example may be found in the Apple Macintosh computer, where video hardware consisted only of a single DMA fetcher, a shift register operating at video scan speeds, and not much else. Consequently, even supporting a single moving on-screen object, the mouse pointer, yielded noticeable flicker and tearing, let alone supporting games with more numerous objects. This effect may also be observed to varying degrees in computers such as the IBM PC/XT, PC/AT, and their compatibles, and the original Atari ST line. Moreover, supporting the appearance of layered objects on the screen required software to "blit" objects into the framebuffer using some variant of a painter's algorithm [Painter]. However, not enough memory existed for each application to hold its own local window graphics memory; thus, most graphical user interface operating systems have highly sophisticated software mechanisms [Pike1983] to emulate proper video behavior, adding significant overhead and latency to many otherwise simple graphics drawing operations.
More sophisticated video display implementations, such as Commodore Semiconductor's VIC-II chip, supported so-called "sprites," which enabled several movable objects on the screen without incurring excessive CPU overhead. However, hardware sprites impose limitations of their own. Often, sprites only permit a certain maximum number of pixels along the horizontal axis, the vertical axis, or both. Additionally, sprites often possess fewer colors than the underlying playfield. Sprites are found in computers such as the Texas Instruments TI-99/4A, TRS-80 Color Computer, Commodore 64, Commodore 128, Atari 400 and 800 series, and the Commodore-Amiga series of computers.
Today, modern video hardware replaces a simple theory of operation and known limitations with an utterly opaque hardware interface of unprecedented complexity, thus requiring a dependency on proprietary software interfaces that are under-documented, if documented at all. Most software today prefers to hide behind the shield of some graphical user interface toolkit API, which often proves simpler to interface to, despite itself not being simple at all. Many programmers have serious problems understanding how modern GUI APIs function for anything beyond trivial application software, as these frameworks, or any frameworks for that matter, quickly overwhelm the engineer and impose design limitations not natural to the application under development ([Falvo2008], [Ingersoll2008], and many more; simply search on Google for "framework considered harmful"). Adaptor libraries, such as Direct Frame Buffer [DFB] and Simple DirectMedia Layer [SDL], become necessary to restore some semblance of sanity when trying to write games or other highly interactive software. This suggests that modern video display hardware fails to meet the simplicity needs of both productivity and entertainment titles, all the while drawing enough power to warrant one or two additional cooling fans in the process.
SOLUTION
The SEAforth series of processors from IntellaSys may provide a viable solution to the problem of display hardware rigidity. Their smallest chip, the S24, contains a four by six matrix of C18 cores [IntellaSys2008]. Each C18 core, in turn, consists of 64 words of RAM, 64 words of ROM, plus a very simple microprocessor to execute software contained therein. Unlike most other microcontroller architectures, the C18 utilizes an ordinary Von Neumann architecture; thus, it may execute software from RAM as readily as from ROM. The simplest instructions execute in 1.5ns, while the longest execute in 5.4ns.
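To put those instruction timings in perspective, the following minimal sketch estimates how many instructions fit inside one pixel period. It assumes the standard 640x480 at 60Hz VGA mode with its 25.175MHz pixel clock, which the text above does not specify; the function name is hypothetical.

```python
# Assumption: 640x480 @ 60 Hz VGA, whose pixel clock is 25.175 MHz.
PIXEL_CLOCK_HZ = 25_175_000

FASTEST_INSTRUCTION_NS = 1.5  # simplest C18 instructions, per the text
SLOWEST_INSTRUCTION_NS = 5.4  # longest C18 instructions, per the text


def instructions_per_pixel(instruction_ns, pixel_clock_hz=PIXEL_CLOCK_HZ):
    """Return how many whole instructions fit in one pixel period."""
    pixel_period_ns = 1e9 / pixel_clock_hz  # about 39.7 ns for VGA
    return int(pixel_period_ns // instruction_ns)


if __name__ == "__main__":
    print(instructions_per_pixel(FASTEST_INSTRUCTION_NS))
    print(instructions_per_pixel(SLOWEST_INSTRUCTION_NS))
```

Under these assumptions a single core has a budget of somewhere between 7 and 26 instructions per pixel, which suggests why the work is spread across several cooperating cores.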
By programming a collection of cores to cooperate towards the goal of video signal generation, a stable display on a monitor results. As an example, please refer to the source code found at http://www.falvotech.com/content/publications/software-defined-video/s24-video-clock.tar.gz for an implementation of a completely software-synthesized clock display.
The aforementioned project consumes ten cores, leaving fourteen additional cores for other tasks. Core 12 assumes the role of a CRTC (Cathode Ray Tube Controller), responsible for generating the horizontal and vertical synchronization pulses to the monitor. Additionally, it serves as the master timebase for the remaining nine cores. Core 13 uses the synchronization information from core 12 to keep track of which line to draw next on the screen. Core 13 communicates this to core 19, which relays this information to cores 20 through 22, which collectively compute which graphics to render based on the line number. Upon receiving a response from these cores, core 19 then schedules the display of the bitmapped data so as to ensure proper spacing on the monitor. Core 18 serves as a trivially implemented, monochrome dot shifter driving one of the chip's integrated digital to analog converters. Cores 14 through 16 assist cores 20 through 22 in rendering the clock's indicators.
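The CRTC role described above can be modelled in a few lines. This is only a sketch, not the code running on the CRTC core: it assumes the industry-standard 640x480 at 60Hz timing (800 pixel clocks per line, 525 lines per frame), and the names `crtc_state`, `H_TOTAL`, and `V_TOTAL` are hypothetical.

```python
# Assumption: standard 640x480 @ 60 Hz VGA timing, in pixel clocks and lines.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33

H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # 800 clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # 525 lines per frame


def crtc_state(tick):
    """Map a pixel-clock tick count to (hsync, vsync, visible, line).

    hsync and vsync are True while the pulse is asserted; the electrical
    polarity is left to the output stage.
    """
    x = tick % H_TOTAL
    y = (tick // H_TOTAL) % V_TOTAL
    hsync = H_VISIBLE + H_FRONT <= x < H_VISIBLE + H_FRONT + H_SYNC
    vsync = V_VISIBLE + V_FRONT <= y < V_VISIBLE + V_FRONT + V_SYNC
    visible = x < H_VISIBLE and y < V_VISIBLE
    return hsync, vsync, visible, y
```

The `line` value in the result corresponds to the line number that the line-tracking core derives from the CRTC's synchronization information.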
It should be noted that the complexity of the code linked above stems directly from the lack of SDRAM on the ForthDrive demonstrator product. Without the benefit of a bitmapped memory image, no other choice than per-raster computation of what to display remained. Had external RAM been present, I would have utilized cores 0, 1, and 6 to implement DMA fetch logic for a bitmapped display.
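To illustrate what per-raster computation means in practice, here is a minimal sketch that produces the pixels of a single scan line on demand, with no framebuffer behind it. The 3x5 font, the one-pixel glyph gap, and the function name `raster_line` are all hypothetical and are not taken from the linked source code.

```python
# Hypothetical 3x5 font: one 3-bit pattern per glyph row, MSB = leftmost pixel.
FONT = {
    "0": (0b111, 0b101, 0b101, 0b101, 0b111),
    "1": (0b010, 0b110, 0b010, 0b010, 0b111),
    "2": (0b111, 0b001, 0b111, 0b100, 0b111),
    "3": (0b111, 0b001, 0b111, 0b001, 0b111),
    "4": (0b101, 0b101, 0b111, 0b001, 0b001),
    "5": (0b111, 0b100, 0b111, 0b001, 0b111),
    "6": (0b111, 0b100, 0b111, 0b101, 0b111),
    "7": (0b111, 0b001, 0b001, 0b001, 0b001),
    "8": (0b111, 0b101, 0b111, 0b101, 0b111),
    "9": (0b111, 0b101, 0b111, 0b001, 0b111),
    ":": (0b000, 0b010, 0b000, 0b010, 0b000),
}
GLYPH_W, GLYPH_H = 3, 5


def raster_line(text, line):
    """Compute the 0/1 pixels of one scan line of `text` on demand.

    Lines outside the glyph rows come back blank.
    """
    width = len(text) * (GLYPH_W + 1)
    if not 0 <= line < GLYPH_H:
        return [0] * width
    pixels = []
    for ch in text:
        row = FONT[ch][line]
        for bit in range(GLYPH_W - 1, -1, -1):
            pixels.append((row >> bit) & 1)
        pixels.append(0)  # one-pixel gap between glyphs
    return pixels


if __name__ == "__main__":
    for line in range(GLYPH_H):
        print("".join("#" if p else "." for p in raster_line("12:34", line)))
```

The trade-off is that every scan line costs computation each frame, whereas a bitmapped display only costs memory bandwidth.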
As can be noted by reviewing the source code, the software exists entirely in on-chip RAM. Supporting dynamic code replacement proves trivial by altering each core to rely primarily on a feature unique to the SEAforth architecture: port execution [root2008]. With the ability to reprogram cores on a whim, even in a live system, even the most fundamental characteristics of the display subsystem may be altered on demand.
Let's suppose, for the sake of argument, we implement a computer with a screen-based user interface analogous to the Commodore-Amiga, where users may re-arrange screens as easily as windows. Reprogramming relevant cores between different screens on the physical display allows for differing resolutions, color capabilities, sprite support, pixel and/or character encoding methods, and more. Assuming ten cores allocated to the task of video generation, with each core possessing 64 words of RAM, it takes only 640 words fetched via DMA to prepare the SEAforth chip for the next screen's display program. With a 10MHz bus transferring one word per cycle, it takes 64 microseconds to fetch those 640 words, or roughly two VGA scanlines in the worst possible case. The Commodore-Amiga required no less than six NTSC scanlines for its Copper hardware to effect changes to its display registers between screens. Faster buses and fewer cores updated obviously result in less time spent reprogramming. As you can see, the SEAforth architecture trivially meets the real-time constraints for supporting multiple display methods driving the same physical monitor.
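The reprogramming estimate can be checked with a short calculation. This sketch assumes one word transferred per bus cycle and a standard 640x480 at 60Hz scanline of 800 pixel clocks at 25.175MHz; neither assumption is stated in the text, and the function names are hypothetical.

```python
CORES = 10               # cores devoted to video generation, per the text
WORDS_PER_CORE = 64      # RAM words per core, per the text
BUS_HZ = 10_000_000      # 10 MHz bus, per the text

# Assumption: 640x480 @ 60 Hz VGA, 800 pixel clocks per line at 25.175 MHz.
SCANLINE_US = 800 / 25.175  # about 31.78 microseconds


def reprogram_time_us(cores=CORES, words_per_core=WORDS_PER_CORE, bus_hz=BUS_HZ):
    """Microseconds needed to reload every word, at one word per bus cycle."""
    return cores * words_per_core * 1e6 / bus_hz


def reprogram_scanlines(**kwargs):
    """The same reload time expressed in VGA scanlines."""
    return reprogram_time_us(**kwargs) / SCANLINE_US


if __name__ == "__main__":
    print(reprogram_time_us())
    print(reprogram_scanlines())
```

Under these assumptions the full reload lands just over two scanlines; halving the number of cores updated or doubling the bus speed brings it to about one.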
RELATED WORK
The Commodore-Amiga series of computers perhaps constitutes the most accessible prior art. The display hardware in the Amiga was, at the time, incredibly simple and orthogonal. Permitting multiple on-display screen resolutions concurrently, multiple color palettes, and all the while providing a simpler form of sprites than that found in the Commodore 64, the Amiga literally invented the multimedia industry as we know it today. However, the chipset used in the Amiga synchronizes so strongly with the NTSC frame rate that it took almost ten years before a true VGA-capable chipset became available. Failure to remain agile in the face of advancing technology ultimately helped the Amiga market to collapse. See [Commodore1989].
The Atari 400 and 800 series of computers, really an 8-bit predecessor to the Amiga line of computers, similarly utilized a "Copper" (though it was not known by that name at the time) to effect display register changes in response to screen timing. Consequently, it suffers from the same flaws as outlined for the Amiga. See [Atari2008].
The Commodore 64 [Rautiainen2002], Atari 400, and Atari 800 computers [Collins1998] utilized graphics chips which provided hardware support for text and bitmapped modes, plus sprites. However, in every case, the chips proved limited: sprites on the Atari systems could not exceed 8 pixels in width, but could be any height you wished. The Commodore 64, however, provided for wider sprites, but with a 21-line height limitation. On either platform, software trickery, taking valuable run-time, often could be employed to artificially extend the capabilities of the underlying hardware. Additionally, the VIC-II's bitmapped mode of operation used a bizarre, tiled layout, which further sapped performance from the main CPU by no less than a factor of four when drawing to the screen, since advancing to the next byte in the bitmap involved an explicit sequence of instructions to add 8 to an index register or base address, instead of a simple, two-cycle increment instruction.
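The cost of the tiled layout is easiest to see in the address arithmetic. The sketch below compares the VIC-II's 320x200 high-resolution bitmap layout, in which each 8x8 cell occupies eight consecutive bytes, with a hypothetical linear layout of 40 bytes per line; the function names are illustrative only.

```python
def tiled_offset(x, y):
    """Byte offset and bit mask of pixel (x, y) in the VIC-II hi-res bitmap.

    Each 8x8 cell occupies 8 consecutive bytes, and 40 cells form a row,
    so one cell row spans 320 bytes.
    """
    cell_row, line_in_cell = divmod(y, 8)
    offset = cell_row * 320 + (x // 8) * 8 + line_in_cell
    return offset, 0x80 >> (x % 8)


def linear_offset(x, y, bytes_per_line=40):
    """Same pixel in a hypothetical linear bitmap, for comparison."""
    return y * bytes_per_line + x // 8, 0x80 >> (x % 8)


if __name__ == "__main__":
    # Stepping 8 pixels to the right moves 8 bytes in the tiled layout,
    # but only 1 byte in the linear one.
    print(tiled_offset(8, 0)[0] - tiled_offset(0, 0)[0])
    print(linear_offset(8, 0)[0] - linear_offset(0, 0)[0])
```

In the tiled layout, moving one byte to the right on the same scan line means adding 8 to the address, which is the multi-instruction sequence the paragraph above refers to.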
Numerous projects on the Internet utilize microcontrollers, such as the high-performance SX-microcontroller [Kohn] or ATmega-series chips [tinyvga], to render graphics to a VGA-capable monitor. Because they utilize flash ROM to hold their programming, they lack in-system self-reprogrammability. External components would be required to implement in-system, external reprogrammability. Additionally, reprogramming these devices involves a complete flash ROM rewrite; considering the sheer size of these microcontrollers' program memories (often up to 32 kilobytes!), this consumes far more time than a handful of scanlines. Thus, you'd need to blank the screen completely during a reprogramming phase. On-demand reprogramming currently proves unattainable with these devices.
CPLDs [Rictor2008] and FPGAs [Hamblen1998] have also been successfully applied to customized display projects. Although FPGAs implement their programming using RAM, these devices still lack any kind of in-system, self-reprogrammability; the entire chip needs to be reset and reloaded, requiring external programming logic. CPLDs rarely utilize RAM for their programming, and likewise, require flash ROM reprogramming comparable to that of a microcontroller. Even were this not the case, you'd still have the same problems as with FPGAs.
The Computer Cowboys MuP21 microprocessor included a dedicated video processor unit [TingMoore]. The video unit supported six instructions (pixel, sync, refresh, skip, burst, and jump) tailored specifically to the NTSC or PAL analog waveform. Driving a VGA display would have involved complex, external circuitry. Subsequent variations of the chip, such as the Ultra Technology F21 processor, supported a digital RGBI output and digital sync outputs, suitable for driving an EGA or VGA monitor. Creative use of the jump or skip instructions permitted support for things like interlacing as well as providing per-window local framebuffers that seamlessly integrate with other local framebuffers on the screen. Chains of jump instructions result in black pixels on the display, due to violation of pixel timing invariants. The four pixel granularity enforced with jump instructions, combined with the need to place them inside whatever local framebuffer happens to serve as the video source at the time, limits their utility for emulating sprites on the display. Additionally, the architecture cannot support any kind of text mode, which can significantly simplify a wide variety of applications. Still, this display coprocessor stands as an example of the first truly multi-disciplinary display technology, capable of driving NTSC to super-VGA and all things in between. Unfortunately, commercial availability of this technology dried up with the failure of the MuP21/F21 in the marketplace.
WORKS CITED
| [Collins1998] | Collins, Steven. "Game Graphics During the 8-bit Computer Era". ACM Siggraph. 32.2. (1998 May) |
| [Commodore1989] | Commodore-Amiga, Inc. Amiga Hardware Reference Manual. Addison-Wesley, Inc. 1989. |
| [Pike1983] | Pike, Robert. "Graphics in Overlapping Bitmap Layers". ACM Trans. on Graph. 2. (1983): 135-160. |