DEVELOPMENT OF A SYSTEM TO DIGITALLY STORE, DISPLAY, AND ALLOW MANIPULATION OF
A RASTER SCAN VIDEO FRAME
by Mark P. Abbate
SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF BACHELOR OF SCIENCE
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY February, 1977
Signature of Author ... Department of Interdisciplinary and General Science
Certified by ... Thesis Supervisor
Accepted by ... Chairman, Departmental Committee on Theses
DEVELOPMENT OF A SYSTEM TO DIGITALLY STORE, DISPLAY, AND ALLOW MANIPULATION OF
A RASTER SCAN VIDEO FRAME
by
Mark P. Abbate
Submitted to the Department of Interdisciplinary and General Science on February 1, 1977 in partial fulfillment of the requirements for the Degree
of Bachelor of Science
ABSTRACT
This thesis describes the development of a digital video framestore system. Signal and modulation requirements for an NTSC video waveform are presented, and are then contrasted with digital video characteristics in areas such as bandwidth, resolution, and encoding schemes. A modulator for hue-intensity-saturation encoded video is presented. Part three deals with the memory array and supporting access systems, which allow both a main processor (via an appropriate interface) and a display generator (working in sync with the modulator) to share access to the encoded frame of video. The memory card design itself is the primary system element due to speed, expansion, and access requirements. Analog video signals possess a corresponding digital data rate of 16 bits per 100 nanoseconds. The internal device structure is built around an expandable bus system for address, data, and control lines. The design is to be implemented in the Visible Language Workshop's image processing with hardcopy output project.
Judah L. Schwartz
Professor of Engineering Science and Education
ACKNOWLEDGEMENT
I would like to thank
Ron Mac Neil
Prof. Judah Schwartz
and the Architecture Machine folks:
Bob Hoffman Bill Donelson Chuck Libby
TABLE OF CONTENTS
1.0 Introduction: Framestore characteristics and use
1.1 Particular system requirements
1.2 Procedure
2.0 The NTSC video system
2.1 Sync and timing
2.2 Bandwidth
2.3 Color modulation techniques
2.4 HIS modulator hardware
3.0 Digital video
3.1 Minimum framestore requirements that allow video output and image processing
3.2 Memory card design
3.2.1 The 2107B dynamic 4K RAM
3.2.2 Memory board configuration based on display constraints
3.2.3 Memory board configuration based on processing constraints
3.2.4 Memory board systems common to both processor and display
3.2.5 Memory board hardware
3.3 The video display
3.4 The processor interface and internal bus control
ILLUSTRATIONS
i System block diagram
ii Scanning and interlace
iii Video levels and frequencies
iv Sync and timing
v Bandwidth
vi Spectrum - subcarrier and H rate
vii YIQ modulator
viii Color phase angles
ix HIS modulator
x 2107B block diagram
xi Chip control voltage levels
xii Chip power line transients
xiii Write cycle
xiv Read cycle
xv Display portion of memory board
xvi Processor portion of memory board
xvii Memory board integrated circuits
xviii The complete memory board
xix General printed circuit board layout
xx Video display block diagram
xxi Cursor logic
xxii Typical processor interface
xxiii Map to internal bus connections
xxiv Processor to internal device bus connections
xxv Line map and address bus multiplexer
xxvi Arbitration, refresh and memory timing
BIBLIOGRAPHY
1.0 INTRODUCTION: FRAMESTORE CHARACTERISTICS AND USE
The digital framestore or framebuffer allows display of one or more frames (or a fraction of a frame) of video on a raster scan television monitor. The framestore memory can be accessed by a processor which builds and modifies the picture in memory. The display portion of the framestore continuously clocks out lines of the video frame in small n-bit words or pixels, then decodes the color and luminance information and, through an appropriate modulator, produces a complete NTSC (National Television System Committee) video signal. Two typical encoding formats for the pixel information are 16 bits of red-green-blue (three groups of five bits and one bit for control), and hue-intensity-saturation (seven-five-four). Overall resolution is expressed in terms of pixels per line and lines per frame; 512x512 yields a raster resolution somewhat beyond quality broadcast television.
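The two 16 bit pixel formats mentioned above can be illustrated with a minimal sketch. The field order inside the word (control bit on top for RGB, hue in the high bits for HIS) and the function names are assumptions made for illustration; the text fixes only the field widths.

```python
# Sketch of the two 16-bit pixel formats; bit positions are assumed.

def pack_rgb(r, g, b, control=0):
    """Pack three 5-bit color values and one control bit into 16 bits."""
    if not all(0 <= v < 32 for v in (r, g, b)) or control not in (0, 1):
        raise ValueError("r, g, b must be 0..31 and control 0 or 1")
    return (control << 15) | (r << 10) | (g << 5) | b

def unpack_rgb(word):
    """Return (r, g, b, control) from a 16-bit RGB pixel word."""
    return (word >> 10) & 0x1F, (word >> 5) & 0x1F, word & 0x1F, (word >> 15) & 1

def pack_his(hue, intensity, saturation):
    """Pack 7-bit hue, 5-bit intensity and 4-bit saturation into 16 bits."""
    if not (0 <= hue < 128 and 0 <= intensity < 32 and 0 <= saturation < 16):
        raise ValueError("hue 0..127, intensity 0..31, saturation 0..15")
    return (hue << 9) | (intensity << 4) | saturation

def unpack_his(word):
    """Return (hue, intensity, saturation) from a 16-bit HIS pixel word."""
    return (word >> 9) & 0x7F, (word >> 4) & 0x1F, word & 0xF
```

Both formats fill the word exactly: 5 + 5 + 5 + 1 = 16 and 7 + 5 + 4 = 16.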
The framestore is often an adjunct to hard-copy output systems as it allows nearly real-time manipulation of images prior to graphic output. The major design issues in a framestore are that it requires a large section of memory capable of very fast output rates and that the processor must share memory time with an effectively continuous display output system. Both of these criteria must be met in certain applications, for example to allow pen and tablet interactive operation. The next section lists specific requirements for this thesis design, which will be implemented in the Visible Language Workshop's hardcopy output image processing system.
1.1 PARTICULAR SYSTEM REQUIREMENTS
The display portion of the framestore must be able to produce an NTSC video signal from RGB or HIS pixel information, in variable frame resolutions from 64x64 to 1024x1024. The video sync generator inputs should be available as system outputs, and the display should have inputs for external sync generator drive, allowing synchronous operation in a conventional television facility. The memory array will be designed for 2107B 4K dynamic random access memories, and should be expandable to use 16K RAMs when available. There must be a standard memory board and bus system to facilitate plug-in memory expansion for multiple frames or higher pixel densities. The processor and display should access memory via a common line map under processor control for line order manipulation. The line map translates a frame-line-pixel address to an absolute memory address. The processor must be able to control operating parameters of frame size and resolution, the pixel encoding scheme used, line, pixel and color maps, and so on. Internal device operation should be based on a workable set of bus functions and a programmable interface with the main processor.
1.2 PROCEDURE
A complete system block diagram will be developed, defining all important internal and interface control paths. Four major blocks comprise the system (fig. i): the memory card array, plug-in expandable; the video display output counter and modulator; the processor interface and internal device I/O controller; and the refresh timer with memory contention arbitration. The memory array is connected to the control blocks by four bus paths: two similar sets of address/data/control lines to the processor interface and display output; shared address and control lines multiplexed by the arbitration unit; and a high current power distribution bus. Maps and registers which control internal parameters of the framestore, such as raster size and absolute addresses which point to line beginnings, are connected to the processor interface and main data bus via an internal device I/O bus. This structure allows display parameter control by the processor and would easily interface to a microprocessor bus structure for control. The interface to processor handshaking sequence can only be described generally now, since a specific main processor for the project has not been chosen. Complete logic diagrams and general printed circuit board layout constraints for the memory boards are presented. The most important element in the overall system is a fast, flexible and expandable memory board and bus structure. Basic refresh/display/processor arbitration logic is included, as well as multiplex and buffer logic for the memory array bus. The appendices include manufacturer specification sheets for the RAM chips and NTSC sync timing diagrams. Development begins with an examination of NTSC video requirements and modulation systems.
[Fig. i: System block diagram]
2.0 THE NTSC VIDEO SYSTEM
This section describes timing, bandwidth and modulation
criteria the display output must satisfy, as well as describing
a particular modulation scheme to be implemented.
2.1 SYNC AND TIMING
Standard television is a continuous raster scan process, sweeping an intensity modulated electron beam across a phosphorescent screen in a left to right and top to bottom pattern (fig. ii). A complete frame is composed of 525 lines and is one-thirtieth of a second in duration. Each frame, however, is encoded as two interlaced fields; in the first 1/60 second, 262 1/2 odd numbered lines are transmitted, followed by the even numbered lines. The timing of the horizontal and vertical retrace signals guarantees non-overlap of lines on successive fields. 2:1 interlace at 1/60 sec. places the flicker rate below the limit of perception (1/50 sec.). Flicker perceived in video and film frames differs in that the film image arrives as a complete frame whereas the video image is formed by a bright line moving from the screen top to bottom, with the phosphorescence dying in the trail of the scan line. The result is that video requires a higher frame rate, and 1/60 sec. provides an adequate margin. Continuity of motion in the video case surpasses the 48 frames per sec. effective film rate.
[Figs. ii-iv: Scanning and interlace; Video levels and frequencies; Sync and timing]
Sync pulses attached to the image signal cause horizontal (line) and vertical (field) retrace at the appropriate times (figs. iii, iv). The sync signal occurs at the beginning of a portion of the signal which is 7.5% below the "blackest black" level transmitted. This 10.8 microsec. space is the horizontal blanking interval, which is used to drop the level (beam intensity) of the electron gun below the phosphorescence threshold at the picture tube. During vertical retrace from bottom to top 21 H lines are blanked. The vertical sync pulse occurs at the start of the blanking interval. It is composed of 3 H lines of equalizing pulses: half width sync pulses at two times H frequency, which accurately trigger the television set's horizontal oscillator during the half line field one to field two transition. Color sync timing is obtained by division of a 3.579545 MHz crystal oscillator. Early B&W transmission utilized an H frequency of 15750 Hz. 3.58 MHz (15734 Hz H rate) was chosen later because it is an odd harmonic of half the H rate, and because the difference between 3.58 MHz and the sound carrier frequency used in broadcast is an odd multiple of half the H rate. Since 3.58 MHz is the subcarrier frequency for color information, it was desirable to place it in the gaps in the general video spectrum.
[Figs. v-vii: Bandwidth; Spectrum - subcarrier and H rate; YIQ modulator]
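The relationship between the subcarrier and the scanning rates can be checked with a short calculation. This is a sketch that assumes the standard NTSC relation of subcarrier = 455/2 times the H rate, which is not spelled out in the text; the variable names are illustrative.

```python
# Derive NTSC scanning rates from the 3.579545 MHz subcarrier,
# assuming the standard relation f_sc = (455 / 2) * f_H.
SUBCARRIER_HZ = 3_579_545
LINES_PER_FRAME = 525

h_rate_hz = SUBCARRIER_HZ * 2 / 455        # horizontal line rate
frame_rate_hz = h_rate_hz / LINES_PER_FRAME
field_rate_hz = 2 * frame_rate_hz          # two interlaced fields per frame
line_period_us = 1e6 / h_rate_hz

print(f"H rate      {h_rate_hz:.1f} Hz")
print(f"frame rate  {frame_rate_hz:.3f} Hz")
print(f"field rate  {field_rate_hz:.3f} Hz")
print(f"line period {line_period_us:.2f} microsec.")
```

The computed H rate rounds to the 15734 Hz quoted above, slightly below the 15750 Hz of the original B&W standard.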
2.2 BANDWIDTH
Bandwidths found in B&W and color transmission are illustrated in fig. v. The broadcast bandwidth is somewhat misleading in that professional studio equipment (cameras, monitors, video tape recorders) often has a useful bandwidth to 7 MHz. NTSC recommendations for digital encoding of fully modulated video signals suggest use of a four times subcarrier sampling rate, yielding 7 MHz bandwidth. This provides about 900 pixels across a line. The color picture tube mask resolution is beyond 500x700 holes.
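The figures for four times subcarrier sampling follow from simple arithmetic. The sketch below again assumes the standard relation of subcarrier = 455/2 times the H rate; exact fractions are used so the sample count per line comes out as an integer.

```python
from fractions import Fraction

# Four-times-subcarrier sampling, assuming f_sc = (455 / 2) * f_H.
SUBCARRIER_HZ = Fraction(3_579_545)

sample_rate_hz = 4 * SUBCARRIER_HZ
nyquist_hz = sample_rate_hz / 2             # highest representable frequency
h_rate_hz = SUBCARRIER_HZ * 2 / 455
samples_per_line = sample_rate_hz / h_rate_hz   # over one full H period

print(f"sample rate      {float(sample_rate_hz) / 1e6:.3f} MHz")
print(f"Nyquist limit    {float(nyquist_hz) / 1e6:.3f} MHz")
print(f"samples per line {int(samples_per_line)}")
```

The Nyquist limit of about 7.16 MHz matches the 7 MHz bandwidth quoted, and 910 samples per full line period is in keeping with the figure of about 900 pixels.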
2.3 COLOR MODULATION TECHNIQUES
Color is encoded as a quadrature modulated subcarrier interleaved in the luminance (Y) and sync spectrum gaps (figs. vi, vii). All color and intensity values for a picture can be derived from its RGB components. To achieve B&W and color compatibility, RGB video signals are matrixed to produce the equivalent B&W intensity value (Y) and two other values. In the illustrated modulator these signals are I and Q (In phase and Quadrature). They individually modulate two 90° out of phase subcarrier angles, and are then summed with Y as a double sideband suppressed carrier component. Hue value is contained in the phase angle of the modulated signal relative to subcarrier, while peak to peak amplitude of the modulated signal is that associated color's saturation or purity. Fig. viii illustrates the I and Q vector positions and relative color angles. Some modulators (and demodulators) use a Y, B-Y, R-Y scheme for its simpler matrix, the tradeoff being that they do not fall quite on top of the 90° out of phase angles driving the two balanced modulators. Demodulation at the receiver must use the original subcarrier phase as a reference, hence at the beginning of each H line 8 to 12 cycles of subcarrier (180° out of phase) are transmitted and used to trigger and lock a subcarrier regeneration oscillator in the receiver, where the difference signals are dematrixed to obtain RGB gun drives. Television receivers for consumer use usually low pass filter the two color vectors at the same points as in modulation to reduce noise. Professional monitors often have their channels open higher, allowing greater color display bandwidth.
[Fig. viii: Color phase angles]
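The matrixing and quadrature encoding described above can be sketched numerically. The matrix coefficients are the commonly published NTSC YIQ values, not figures taken from this thesis, and the phase angle here is measured from the I axis rather than from the burst reference; both are assumptions of this illustration.

```python
import math

def rgb_to_yiq(r, g, b):
    """Matrix RGB (each 0..1) into luminance Y and color differences I, Q."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    i = 0.596 * r - 0.274 * g - 0.322 * b
    q = 0.211 * r - 0.523 * g + 0.312 * b
    return y, i, q

def chroma_polar(i, q):
    """Amplitude (saturation) and phase in degrees (hue) of the chroma vector.

    The phase is relative to the I axis, an assumed reference for this sketch.
    """
    amplitude = math.hypot(i, q)
    phase_deg = math.degrees(math.atan2(q, i)) % 360.0
    return amplitude, phase_deg

for name, rgb in (("white", (1, 1, 1)), ("red", (1, 0, 0)), ("blue", (0, 0, 1))):
    y, i, q = rgb_to_yiq(*rgb)
    amp, phase = chroma_polar(i, q)
    print(f"{name:5s} Y={y:.3f} amplitude={amp:.3f} phase={phase:.1f} deg")
```

White and all grays give zero chroma amplitude, which is why a B&W receiver can ignore the subcarrier and display Y alone.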
For computer generated images, RGB does not have to be used as the only encoding of bits in a pixel; alternate assignments can be used if an appropriate modulator is available. A particularly useful alternate scheme is HIS. A digitally controlled 8 bit resolution TTL delay line serves as an excellent hue to phase angle converter, while two digital to analog converters provide saturation and intensity values (fig. ix). The rest of the modulator contains internal or external sync and subcarrier selection, and then properly adds sync, burst and blanking to the modulated signal.
2.4 HIS MODULATOR HARDWARE
The D/A converters selected must be capable of output settling times almost an order of magnitude shorter than the minimum pixel period. The Datel HI series takes 25 nanosec. to arrive at .1% of the final sample value. The Engineering Component Company programmable digital logic delay line is designed specifically for 0° to 360° of delay at 3.58 MHz (279 nanosec.). The actual timing is 8 to 287 nanosec. in 1.4 nanosec. increments. A TTL compatible crystal oscillator is used for the main timebase, yielding subcarrier from a divide-by-four of 14.3 MHz. The five other sync signals are provided by a P-MOS LSI chip driven by the master oscillator. All six internally generated drives are sent to a TTL hex 2 to 1 data selector which drives the sync addition logic, provides timing to the display line and pixel counters, and can be used to sync external video devices through the TTL to -4V NTSC pulse output buffers. The data selector can also be switched to the external sync drives, appropriately buffered to TTL levels. A 26 bit latch is used at the input and clocked from the display pixel counter to cause all the D/A converter and delay line transitions to happen at the same time. Only 16 bits per pixel will be used, while the 26 input lines allow flexible formatting.
[Fig. ix: HIS modulator]
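The hue to phase conversion performed by the delay line can be sketched as follows. The mapping from a 7 bit hue value to a delay line code is an assumption made for illustration (the text gives only the delay range and step size), as are the function names.

```python
# Hue-to-phase conversion with a programmable delay line.
# Delay range and step come from the text; the mapping is assumed.
SUBCARRIER_PERIOD_NS = 1e9 / 3_579_545   # about 279 nanosec. for 360 degrees
MIN_DELAY_NS = 8.0                       # delay at code 0
STEP_NS = 1.4                            # delay added per code increment

def hue_to_delay_code(hue, hue_bits=7):
    """Map a hue value onto the nearest delay line code (hypothetical mapping)."""
    if not 0 <= hue < (1 << hue_bits):
        raise ValueError("hue out of range")
    wanted_ns = hue / (1 << hue_bits) * SUBCARRIER_PERIOD_NS
    return round(wanted_ns / STEP_NS)

def delay_ns(code):
    """Total delay produced by the line for a given 8-bit code."""
    if not 0 <= code < 256:
        raise ValueError("code must fit in 8 bits")
    return MIN_DELAY_NS + code * STEP_NS

for hue in (0, 32, 64, 127):
    code = hue_to_delay_code(hue)
    print(f"hue {hue:3d} -> code {code:3d} -> {delay_ns(code):.1f} nanosec.")
```

With 1.4 nanosec. steps a full subcarrier period takes about 200 codes, so an 8 bit control word covers the 360° range with room to spare.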
3.0 DIGITAL VIDEO
This section establishes some basic data rate requirements
for a digital to NTSC video storage and output device. The
sections following develop the complete design.
3.1 MINIMUM FRAMESTORE REQUIREMENTS THAT ALLOW VIDEO OUTPUT AND IMAGE PROCESSING
Acceptable picture quality on a professional video monitor can be obtained with a 512x512x16 raster and pixel size. Sixteen bit RGB can exceed the chroma channel bandwidth and fall a little below the luminance channel bandwidth in the monitor. It is also possible to use a direct input RGB monitor, which allows considerably greater bandwidth to be displayed. Signal to noise ratios for analog video processing equipment range from 50 to 60 dB depending on the function. Noise figures for digital to analog devices are expressed in terms of the peak to peak signal to RMS quantizing error ratio (SER). SER (dB) = 6.02 B + 10.8, so B = 8 bits corresponds to 59 dB.
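The SER relation can be evaluated for the word lengths of interest. This is a minimal sketch; the function names are illustrative.

```python
import math

def ser_db(bits):
    """Peak-to-peak signal to RMS quantizing error ratio, in dB."""
    return 6.02 * bits + 10.8

def bits_for_ser(target_db):
    """Smallest word length whose SER meets or exceeds target_db."""
    return math.ceil((target_db - 10.8) / 6.02)

for bits in (4, 5, 7, 8):
    print(f"{bits} bits -> {ser_db(bits):.1f} dB")
print("bits needed for 50 dB:", bits_for_ser(50))
```

By this measure the 5 bit color fields of the 16 bit pixel formats give about 41 dB, below the 50 to 60 dB range quoted for analog equipment.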
Due to the 21 H lines blanked during each field's vertical interval, the active number of lines per frame is 483. Two to one interlace requires that display access to those lines happens in odd and even numbered line passes, whereas the processor may find it convenient to access lines in the frame randomly or sequentially.
The primary system requirement is memory output speed.
An H line is active for 52.8 microseconds. In a 512 pixel per
line display, one pixel is needed by the output device every
100 nanoseconds.
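The pixel period for any horizontal resolution follows directly from the active line time. The sketch below tabulates it across a few raster widths in the range the design is meant to support; the list of widths is chosen for illustration.

```python
# Pixel period for several horizontal resolutions,
# using the 52.8 microsec. active line time from the text.
ACTIVE_LINE_NS = 52_800

def pixel_period_ns(pixels_per_line):
    """Time available to deliver each pixel during the active line."""
    return ACTIVE_LINE_NS / pixels_per_line

for width in (64, 256, 400, 512, 1024):
    print(f"{width:5d} pixels per line -> {pixel_period_ns(width):7.1f} nanosec.")
```

At 512 pixels per line the period is about 103 nanosec., which the text rounds to 100.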
The main processor accessing the framestore must have a real-time read-write capability; that is, it should be able to change the contents of the frame memory without shutting off the display output. The logical time for processor use of the memory is during the period when the video output is blanked. Vertical blanking for a field is 21 H or 1.3 milliseconds. H blanking for the remaining 240 lines adds up to 2.5 milliseconds. Available time estimates from now on will be based on a 400x400 raster size, likely to be used in the first VLW implementation. A useful processor access speed metric is time per complete frame rewrite. We will assume that every 4 microsec. the processor has one pixel ready to be written into frame memory, with memory write time absorbed into the 4 microseconds. In the 3.8 millisec. available per field 950 4 microsec. writes can occur. One 160,000 pixel frame can be written in about 2.8 seconds. Since only two 4 microsec. accesses can occur in the H blanking interval, 3.25 seconds is a more reasonable estimate. By using the 42 active but black lines remaining in the display per field, an additional 2.1 milliseconds become free, allowing a complete frame write to occur in 1.75 seconds.
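The first two frame rewrite estimates can be reproduced with integer arithmetic. This is a sketch under the stated assumptions (one 4 microsec. write at a time, 60 fields per second); the variable names are illustrative and the results are approximate, as in the text.

```python
# Frame rewrite time when the processor writes only during blanking.
WRITE_US = 4
PIXELS_PER_FRAME = 400 * 400
FIELDS_PER_SEC = 60

def rewrite_seconds(writes_per_field):
    """Seconds needed to rewrite every pixel of one 400x400 frame."""
    return PIXELS_PER_FRAME / writes_per_field / FIELDS_PER_SEC

# Estimate 1: all 3.8 millisec. of blanking per field is usable.
writes_all_blanking = 3800 // WRITE_US

# Estimate 2: 1.3 millisec. of vertical blanking, plus only two whole
# writes in each of the 240 horizontal blanking intervals.
writes_whole_only = 1300 // WRITE_US + 240 * 2

print(writes_all_blanking, f"{rewrite_seconds(writes_all_blanking):.2f} s")
print(writes_whole_only, f"{rewrite_seconds(writes_whole_only):.2f} s")
```

The first case gives 950 writes per field and about 2.8 seconds. The second gives roughly 3.3 seconds, in line with the more conservative estimate quoted above.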
Ideally, the main processor should address the framestore on a frame, line and pixel number basis. To accomplish this, a small, fast RAM could be used as a line-resolution pointer system. Assuming the proper memory configuration, the pixel number could be directly added to the RAM output as an absolute memory address offset. If the main processor also has read-write capability in the line map RAM, large-scale raster manipulation such as scrolling or stretching of a section of the overall frame becomes easy through pointer modification.
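The line map idea can be sketched in software. The class below is a hypothetical model for a single frame, not the hardware design: a table holds the absolute address of each line's first pixel, the pixel number is added as an offset, and scrolling is done by moving pointers rather than pixel data.

```python
class LineMap:
    """Hypothetical single-frame line map: one base address per display line."""

    def __init__(self, lines, pixels_per_line):
        self.pixels_per_line = pixels_per_line
        # Initially line n starts at n * pixels_per_line in frame memory.
        self.base = [n * pixels_per_line for n in range(lines)]

    def address(self, line, pixel):
        """Translate a (line, pixel) pair to an absolute memory address."""
        if not 0 <= pixel < self.pixels_per_line:
            raise IndexError("pixel number out of range")
        return self.base[line] + pixel

    def scroll(self, n):
        """Scroll the picture up by n lines by rotating the pointers."""
        n %= len(self.base)
        self.base = self.base[n:] + self.base[:n]

line_map = LineMap(lines=400, pixels_per_line=400)
print(line_map.address(2, 5))    # line 2, pixel 5 before scrolling
line_map.scroll(1)
print(line_map.address(0, 0))    # display line 0 now shows the old line 1
```

A one line scroll rewrites 400 pointers instead of 160,000 pixels, which is the saving the text refers to.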
A similar technique is desirable at the pixel output stage. Instead of specifying an RGB or HIS value, a selected number of the 16 pixel bits point to a dedicated memory location ... manipulation can then be accomplished in one or two frames depending on map size.
3.2 MEMORY CARD DESIGN
A quick calculation shows that a 512x512x16 bit framestore requires 1024 4K RAMs. This immediately suggests that the need exists for a standard memory board and bus system, designed to meet the primary requirements of plug-in expandability, variable frame sizes and resolutions under processor control, and frame-line-pixel addressing. The board size that this design will implement is 64 RAMs per card, or 16 cards per high resolution frame. One would not want to use an appreciably larger board, since layout, power distribution and bus considerations become quite unwieldy.
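The chip and card counts can be verified for any raster size. This is a minimal sketch; the function names are illustrative.

```python
def rams_needed(width, height, bits_per_pixel, bits_per_chip=4096):
    """Number of RAM chips required, rounded up to whole chips."""
    total_bits = width * height * bits_per_pixel
    return -(-total_bits // bits_per_chip)   # ceiling division

def cards_needed(rams, rams_per_card=64):
    """Number of memory cards required, rounded up to whole cards."""
    return -(-rams // rams_per_card)

for width, height in ((400, 400), (512, 512), (1024, 1024)):
    chips = rams_needed(width, height, 16)
    print(f"{width}x{height}x16: {chips} chips, {cards_needed(chips)} cards")
```

The 512x512x16 case gives 1024 chips on 16 cards, as stated; the 400x400 raster planned for the first implementation would need 625 chips, or 10 cards.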
The next four sections will explain parameters of the
2107B RAMs in terms of single chip and in-system considerations,
the memory board configuration as dictated by both display and processor constraints, and elements of the memory boards which
must be shared by both the processor and display.
3.2.1 THE 2107B DYNAMIC 4K RAM
For design of high density true random access memory systems the 4K dynamic RAM is the most effective device in terms of speed, bit density, power consumption and price. It is available primarily in 22 pin dual-inline packages, although some manufacturers produce a 16 pin version which uses multiplexed address lines. This requires a two part addressing operation for any given bit, thus more overhead at the addressing device. The VLW has a supply of 2107B RAMs already, hence their use in this design.
The 2107B uses a single MOS device and capacitor combination for each storage cell. Since a logic level stored as a voltage in a capacitor has a tendency to change due to discharge through the sensing circuit, the RAMs require a periodic 'refresh cycle'. The 2107B specifications state that all 64 row addresses (A0 to A5) must be refreshed once every 2 milliseconds. A particular row is refreshed by performing a read cycle on any bit residing in that row. A read cycle is actually an internal destructive read and write of the addressed bit. The addressed storage capacitor is discharged, and during discharge the voltage is compared to a reference voltage level, decoding a one or zero. Immediately after the comparison process, however, the comparator one or zero output is written into the capacitor, completing the cycle. The whole process is transparent to the user.
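The cost of refresh in memory time can be estimated from the figures above. The sketch assumes each row refresh takes one minimum length memory cycle of 400 nanosec., the cycle time the thesis quotes for the 2107B, and that all chips are refreshed in parallel.

```python
# Refresh overhead for the 2107B array, assuming one 400 nanosec.
# cycle per row and all chips refreshed together.
ROWS = 64
REFRESH_INTERVAL_NS = 2_000_000   # every row within 2 millisec.
CYCLE_NS = 400                    # minimum memory cycle time

busy_ns = ROWS * CYCLE_NS
overhead = busy_ns / REFRESH_INTERVAL_NS
row_spacing_us = REFRESH_INTERVAL_NS / ROWS / 1000

print(f"refresh time per interval: {busy_ns / 1000} microsec.")
print(f"fraction of memory time:   {overhead:.2%}")
print(f"one row every:             {row_spacing_us} microsec. if spread evenly")
```

Under these assumptions refresh consumes a little over one percent of memory time, whether done as a burst of 64 cycles or as one row every 31.25 microsec.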
Fig. x is the 2107B internal block diagram. It has three control lines: CE (Chip Enable, active high), WE (Write Enable, active low), and CS (Chip Select, active low). CS must be low to enable the TTL tri-state data out line; as long as CS is high the output is forced to a high impedance state. WE must be low to allow the bit present at D_in to be written in the proper part of the memory cycle. When CE is low, the device is in its off state with low current drain and high impedance output characteristics. CE controls the complete memory access operation. The low to high CE transition latches the address present at A0 through A11 and immediately begins decoding the bit location. CE must remain on for the entire read or write specified by WE. A minimum time is specified between cycles during which CE must be off. The overall time for the decode and wait until the next CE is the cycle time, which is 400 nanosec. minimum for the 2107B.
[Figs. x-xiv: 2107B block diagram; Chip control voltage levels; Chip power line transients; Write cycle; Read cycle]
Figs. xiii and xiv are simplified minimum write and read cycle timing diagrams for the 2107B. In a read cycle, addresses and WE must be stable just prior to CE high. 100 nanosec. are necessary to allow unambiguous latching of the address lines. 180 nanosec. after CE goes high, the selected bit should be valid at D_out. A write cycle is initiated by the high to low transition of WE 100 nanosec. after CE on, and maintained by holding WE low for at least 50 nanosec.; this allows D_in to be presented to the internal write amplifiers. Also available is a read-modify-write (RMW) cycle. The cycle begins as a read, and after output data is available, WE is brought low and the data present at D_in is written into the already decoded address location.
Power is the remaining element in the block diagram. Four supply levels are required: positive 12V for the maximum MOS level swing, minus 5V for MOS biasing, positive 5V for the TTL level output buffer, and ground. Fig. xi shows the associated input and output level swings. CE is a MOS level logic signal and the rest are low level TTL. Fig. xii shows typical current transients encountered during a write cycle for a single chip. In a system refresh, these transients occur simultaneously in all the memory chips. It becomes abundantly clear that a very large, amply decoupled supply distribution system is essential, as well as buffered and isolated bus lines. CE especially must be driven by a high power buffer since it requires a clean, fast risetime (less than 40 nanosec.). Also, a number of CEs must be driven by the same buffer for parallel access to that number of bits. The CE input capacitance and 1 to 2 picofarads per inch of PC board copper become important in achieving the required risetime. These system requirements will be met as the actual memory board is developed in the next three sections.
3.2.2 MEMORY BOARD CONFIGURATION BASED ON DISPLAY CONSTRAINTS
The display unit requires a continuous series of read memory operations. It needs an address-specified 16 bit pixel every 100 nanosec. over a 51 microsec. active line duration. If it takes a chip 400 nanosec. total to retrieve one bit, then at least 64 chips must be presented an address and enabled at once. When data becomes available it must be read into temporary storage in 16 bit words, and must then be capable of 100 nanosec. access, clocked by the pixel output counter. When a complete cycle is done, another must be started while the four pixels already recovered are being clocked out. The finest level of addressing the display is interested in is the 16 bit pixel. The 64 chip board can hold 16K pixels, and for access 14 address lines are required. The board configuration shown in fig. xv routes A0 through A11 to all the chip address inputs. Four groups of 16 data out lines are routed to four 16 bit registers (high speed buffer latches), and are enabled one at a time onto a common 16 bit data out bus, with register select decoded from A12 and A13. The four registers and decode logic should reside on the memory board, since otherwise a 64 bit wide off card bus would be necessary. A control bus is added to the structure to allow display control of CE, WE and CS. Additionally, the registers must be latched at the proper time after CE when the output data becomes valid.
[Fig. xv: Display portion of memory board]
Two important issues remain. First, this address and data scheme fixes the arrangement of pixel information in memory: the four 16 bit pixels are contiguous on the screen when read out. This is all right if the picture is written in a similar manner. Next, is the board expandability requirement fulfilled? Additional boards mean that the address space grows; therefore in fig. xv A14 through A17 have been added to the address bus and connected to card decode, allowing expansion to 16 boards. As cards are added, the number of pixels that can be simultaneously clocked into buffer latches grows; a 16 card system would obtain 64 pixels for each 500 nanosec. RAM memory cycle. After the access, address lines A12 through A17 clock out 16 bits every 100 nanosec. onto the data bus.
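The display address fields described above can be sketched as a decode of an 18 bit pixel address. The function name and the tuple layout are illustrative; the field boundaries follow the address line assignments in the text.

```python
def decode_display_address(addr):
    """Split an 18-bit pixel address into (card, register, chip_address).

    A0-A11  : address within each 4K RAM chip
    A12-A13 : selects one of the four 16-bit output registers on a card
    A14-A17 : selects one of up to 16 memory cards
    """
    if not 0 <= addr < (1 << 18):
        raise ValueError("address must fit in 18 bits")
    chip_address = addr & 0xFFF
    register = (addr >> 12) & 0x3
    card = (addr >> 14) & 0xF
    return card, register, chip_address

print(decode_display_address(0))
print(decode_display_address((5 << 14) | (2 << 12) | 0x123))
print(decode_display_address((1 << 18) - 1))
```

Eighteen address bits cover 262,144 pixels, exactly one 512x512 frame across 16 cards of 16K pixels each.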
3.2.3 MEMORY BOARD CONFIGURATION BASED ON PROCESSING CONSTRAINTS
The primary processor related constraint is that any pixel be read or changed without waiting 'too long'. Section 3.1 identified 3.8 millisec. per 1/60 sec. that the display did not need the memory. That period of time allows a frame update in about 3 seconds. This section will describe processor access to the boards and recompute the update figure. Fig. xvi shows a memory board layout similar to and compatible with the display pixel buffers and control lines. This adds another data bus dedicated to the processor interface, address lines PA12 through PA17 for pixel and card decoding, and logic to enable rows of 16 chips rather than all 64. There is no immediate need for processor access to multiple contiguous pixels. This is a feature which could be optimized and expanded, and could yield a small speed increase.
[Fig. xvi: Processor portion of memory board]
The processor and display are forced to share some bus space: A0 through A11, and the chip control lines. Thus, the major issue in the following sections is arbitration of memory accesses. Returning to speed considerations, in a 16 card system the processor can have access to memory considerably more often than 3.8 millisec. per field. Assume that in a 1 microsec. access time (accounting for arbitration, buffer, decode and access delays) the display loads 64 pixels into buffers. Thus, for the next 5.4 microsec. (subtracting the 1 microsec. load cycle since display must overlap load) the control and chip address lines are free. For a 400x400 frame size 6.25 accesses are required per line by the display, therefore 6 x 5.4 microsec. x 200 lines is the additional time available per field: 6.4 milliseconds. Given that slice, the processor could change the entire frame in under a second, assuming that it could efficiently use the available time. Furthermore, some additional small amount of time will be lost to the ubiquitous refresh controller, another part of the overall arbitration problem.
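The extra processor time gained in a 16 card system can be recomputed step by step. This is a sketch of the arithmetic in the text; the variable names are illustrative, and the number of display loads per line is rounded down to 6 as the text does.

```python
# Bus time left free for the processor during active display lines.
PIXEL_NS = 100            # one pixel clocked out every 100 nanosec.
PIXELS_PER_LOAD = 64      # a 16 card system latches 64 pixels per access
LOAD_NS = 1000            # assumed 1 microsec. for each display load
PIXELS_PER_LINE = 400
LINES_PER_FIELD = 200

free_per_load_ns = PIXELS_PER_LOAD * PIXEL_NS - LOAD_NS
loads_per_line = PIXELS_PER_LINE / PIXELS_PER_LOAD
free_per_field_ns = int(loads_per_line) * free_per_load_ns * LINES_PER_FIELD

print(f"free after each load: {free_per_load_ns / 1000} microsec.")
print(f"loads per line:       {loads_per_line}")
print(f"free per field:       {free_per_field_ns / 1e6:.2f} millisec.")
```

The result, about 6.5 millisec. per field, agrees with the 6.4 millisec. quoted.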
3.2.4 MEMORY BOARD SYSTEMS COMMON TO BOTH PROCESSOR
AND DISPLAY
The processor, display, and refresh controller must share the 12 line address bus driving each card's 2107B array. This presents no memory board problems, since multiplexing occurs at the bus drivers. The three chip control lines are also in common control. WE, however, only goes low when the processor has control of the memory and is performing a write cycle. Thus, whenever the display has been given memory control, WE remains high and stays under processor control. CS enables the 2107B tri-state data out buffer. Since no two data out lines will be tied together (they all go to separate registers, not a common bus) the outputs can be left active by CS low. This adds only negligible current drain to the system since CE low still brings the outputs to a high impedance state, and this particular design would have had CS following CE if it had been necessary. The CE inputs must be accessed in rows of 16 by the processor, and all at once by the display and refresh controller. Furthermore, each board should have two quad TTL to MOS level drivers (fig. xvii) which have individual and master enable inputs. The refresh/display line will be combined in an OR gate with the decoded processor CE request. If the processor requires a mass read, it can access the refresh/display enable line at the arbitration logic.
[Fig. xvii: Memory board integrated circuits]
3.2.5 MEMORY BOARD HARDWARE
This section tailors all the memory board support logic
to the electrical requirements of the 2107B array. Printed
circuit board layout, bus placement, decoupling and power
guide-lines will be presented, while actual memory printed circuit
board design is the next stage of the VLW project. The Intel
Memory Design Handbook is the primary source of information
about the 2107B in a system environment.
The address lines are the single largest logic load in the system. The voltage levels required are standard TTL low level compatible. Intel recommends against the use of Schottky
TTL drivers because worst case conditions allow a zero logic
level voltage 100 millivolts higher than other TTL families, yielding a significantly lower noise margin under high output
loads. Input load current for all 2107B inputs except CE is
10 microamps maximum. Standard TTL will supply 400 microamps
up level current, so about 40 device inputs can be driven. The 8T380 (fig. xvii, xviii) is a quad bus receiver utilizing
Schmitt trigger (high impedance with hysteresis) inputs and 16 milliamp high speed outputs. 24 drivers will be used, each handling
32 inputs (fig. xix is a typical layout). An added feature of
the 8T380 is that an address enable line can be connected to
the second gate input to hold the output high until enable
occurs. This forces all transitions to be from high to low,
which is the faster direction for TTL (differences of 5 to 10
nanosec. are possible). Input capacitance for the TTL inputs on the 2107B can be from 5 to 10 picofarads, while CE is from
15 to 25 picofarads. The 300 picofarads that can easily occur
from 40 devices in parallel can slow down the transition and
cause considerable amounts of ringing at the edges. Thus, series resistors are recommended for the buffer lines, typically 20 ohms for a 36 line load.
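The loading figures quoted above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the capacitance is treated as a simple lumped sum of the per-input values given in the text.

```python
# Illustrative check of the address-line loading arithmetic.
# All numeric values come from the text; the function names are hypothetical.

TTL_HIGH_DRIVE_UA = 400    # standard TTL up level current, microamps
INPUT_LOAD_UA = 10         # 2107B input load current (maximum), microamps
INPUT_CAP_PF = (5, 10)     # 2107B TTL-input capacitance range, picofarads


def max_fanout(drive_ua=TTL_HIGH_DRIVE_UA, load_ua=INPUT_LOAD_UA):
    """Number of device inputs one standard TTL output can hold high."""
    return drive_ua // load_ua


def line_capacitance_pf(n_inputs, cap_range_pf=INPUT_CAP_PF):
    """(min, max) lumped capacitance of n_inputs connected in parallel."""
    low, high = cap_range_pf
    return n_inputs * low, n_inputs * high


def inputs_driven(n_drivers, inputs_per_driver):
    """Total device inputs served by a bank of drivers."""
    return n_drivers * inputs_per_driver


if __name__ == "__main__":
    print(max_fanout())               # device inputs per standard TTL output
    print(line_capacitance_pf(40))    # picofarad range for 40 inputs
    print(inputs_driven(24, 32))      # inputs covered by the 8T380 drivers
```

The 300 picofarad figure in the text falls in the middle of the 200 to 400 picofarad range computed for 40 inputs. The 768 inputs served by 24 drivers would correspond to 12 address inputs on each of 64 devices, assuming four rows of 16 devices per board.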
The CE inputs are the next critical element since they are MOS level and require a fast risetime. The Intel 3245 is a
quad TTL to MOS driver with refresh logic to enable all four
inputs at once (figs. xvii to xix). Two chips will be used so
that each buffer will only drive 8 devices. 20 ohm series
resistors will also be used.
The processor card and pixel decoders control the CE
buffers, the WE inputs and the output latch enables. The first
level of decoding occurs on PA2 to PA6. Those lines are
compared to the card address hardwired on the board (jumper
selectable). If the card is selected (CARD - active low),
the 1 of 4 decoder (Intel 3205) enables a row of 16 bits
specified by PA0 and PA1. Another decoder used as 1 of 8 looks at PA0, PA1 and WE. If WE is high, the four WE inputs are high
and one of the four pixel output registers is enabled onto
the processor interface data bus. For WE low, the output
The display card and pixel decoders use the same arrangement. DA2 to DA6 select the card, and a 3205 enables the pixel buffer given by DA0 and DA1.
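The two-level decode described above can be modelled in software as a split of the seven address lines into a card field and a row field. The sketch below is a minimal illustration, not the board logic itself: the function name is hypothetical, and the jumper setting is assumed to be a plain 5 bit binary card number compared against PA2 to PA6.

```python
def decode_processor_address(pa, card_jumper):
    """Split a 7 bit address (bit n = line PAn) into card select and row.

    pa          -- integer 0..127, PA0 in the least significant bit
    card_jumper -- 5 bit card number set by the board jumpers
                   (assumed encoding, not specified in the text)
    Returns (selected, row): row is 0..3 when the card is selected,
    otherwise None.
    """
    if not 0 <= pa < 128:
        raise ValueError("address must fit in PA0..PA6")
    card_field = (pa >> 2) & 0x1F    # PA2..PA6 choose the card
    row = pa & 0x3                   # PA0..PA1 choose the row of 16 bits
    selected = card_field == card_jumper
    return selected, (row if selected else None)


if __name__ == "__main__":
    # Card 5, row 2 corresponds to the bit pattern 00101 10.
    print(decode_processor_address(0b0010110, card_jumper=5))
    print(decode_processor_address(0b0010110, card_jumper=6))
```

The same function would serve for the display lines DA0 to DA6, since the text states that the display decoders use the same arrangement.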
The pixel registers used are 8212 octal latches with
tri-state buffered outputs and strobe - enable logic. They are also
used for their low input load current. The 2107B data out line
can only supply 2 milliamps, enough to drive only one regular
TTL load. The 8212 only requires .25 milliamps, so two latches can easily be attached. Both processor and display arrays
should be buffered onto the data bus with a tri-state unit such
as the 8T96. This reduces the bus capacitance and lowers the
high impedance leakage current on the bus. A 16 card system
(64 8212s on the bus) would produce enough leakage current (20
microamps per unit) to make the drive capability of the
enabled register marginal. The 8T96 was also chosen as an inverter to return 2107B data to its original state.
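The leakage and drive figures in this paragraph reduce to simple arithmetic, sketched below in Python. The function names are hypothetical; the currents are those quoted in the text, and four latches per card is inferred from the figure of 64 latches for 16 cards.

```python
# Illustrative arithmetic for the data bus loading discussed above.

LEAKAGE_PER_LATCH_UA = 20     # high impedance leakage per 8212, microamps
LATCHES_PER_CARD = 4          # 64 latches on the bus for 16 cards
DATA_OUT_DRIVE_MA = 2.0       # 2107B data out drive, milliamps
LATCH_INPUT_LOAD_MA = 0.25    # 8212 input load, milliamps


def bus_leakage_ma(cards, latches_per_card=LATCHES_PER_CARD,
                   leak_ua=LEAKAGE_PER_LATCH_UA):
    """Total leakage current of all latches hung on an unbuffered bus."""
    return cards * latches_per_card * leak_ua / 1000.0


def latch_inputs_per_data_out(drive_ma=DATA_OUT_DRIVE_MA,
                              load_ma=LATCH_INPUT_LOAD_MA):
    """How many 8212 inputs one 2107B data out line could drive."""
    return int(drive_ma // load_ma)


if __name__ == "__main__":
    print(bus_leakage_ma(16))            # milliamps of leakage, 16 cards
    print(latch_inputs_per_data_out())   # latch inputs per data out line
```

With 16 cards the unbuffered leakage exceeds one milliamp, which is the situation the 8T96 buffers are meant to avoid.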
Power supply requirements for the board are hefty. When
enabled, the 2107B array can require up to 60 milliamps IDD
(CE off IDD is 200 microamps). The other supply currents are
low when averaged, but have large short-duration surge values during device state transitions. Thus, the memory array should be expected to draw 4 amps during refresh or display enables.
The 17 8212 registers (90 to 130 milliamps), 12 buffers (50
milliamps), and miscellaneous logic can require up to 3 amps ICC. Ample power supply capacitor decoupling is absolutely
necessary. One capacitor per chip plus two capacitors at supply run terminations is recommended. VDD to VSS requires
.1 microfarads on every other chip, and VBB to VSS requires
the same on the other chips. VCC to VSS is provided with .1
microfarads at the board edges. .1 microfarad must also be
provided for every other register and bus buffer. Capacitors
must be mounted close to the board and associated chip to
minimize inductances. The power line noise requirement that should
be met is no more than 200 millivolts excursion during enables
on VDD and 100 millivolts on VBB and VCC. A double sided board
will be used, with the power lines run both vertically and
horizontally to achieve a gridded distribution system.
Alternatively, an above-board power strip distribution system can be
used, allowing more flexible use of printed circuit board space
for the address and data lines. Layout rules are generally
that the RAMs, their address and input drivers, and decoupling
capacitors be placed as close together as possible. The 4K
address lines and board power bus will be connected via the
backplane. Processor and display address, data, and control
lines will be connected across the front of the board with
ribbon cable and 3M clip on connectors.
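The supply current estimates given earlier in this section can be reproduced with the short Python sketch below. It is a rough budget under stated assumptions: the 60 milliamp IDD figure is taken as a per-device value, the board is assumed to hold 64 devices (four rows of 16), and the 50 milliamp buffer figure is taken as a per-buffer value. The function names are hypothetical.

```python
# Rough supply current budget for one memory board.

def array_idd_amps(devices=64, idd_ma_each=60):
    """Worst case IDD with every 2107B enabled (refresh or display)."""
    return devices * idd_ma_each / 1000.0


def logic_icc_amps(registers=17, register_ma=130, buffers=12, buffer_ma=50):
    """Worst case ICC of the 8212 registers and bus buffers.

    Miscellaneous logic is left out, so the result sits a little
    below the 3 amp figure quoted in the text.
    """
    return (registers * register_ma + buffers * buffer_ma) / 1000.0


if __name__ == "__main__":
    print(array_idd_amps())    # amps, close to the 4 amp estimate
    print(logic_icc_amps())    # amps, before miscellaneous logic
```

Under these assumptions the array draws 3.84 amps and the registers and buffers 2.81 amps, in line with the rounded values of 4 amps and 3 amps used above.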
For expansion, the board could be designed to be plug-in compatible with 22-pin 16K RAMs, which
are expected to be available in one to two years. The 16K
devices are expected to take their two additional address inputs on the unused pin and the CS pin of the 4K package. The board can be
designed with those two pins brought out to the backplane and connected to the common address bus. Logic for entire board
read or write cycles allowing high data rate processing such
as video frame-grabbing can also be implemented. This would
involve three additional data in registers whose outputs go to individual chip data inputs.
3.3 THE VIDEO DISPLAY
The video display timing and decode unit will be
presented in block diagram form, along with all necessary
connections to the internal device I/O bus, memory control lines, and
inputs to the modulator.
The display essentially decodes the sync information from the modulator and uses it to trigger a preset count sequence
on the address lines, retrieving pixels in the proper order from
the fast registers on the memory boards. The preset count
sequence is processor loadable and depends on raster size and the number of available memory cards. It will allow quite flexible
access to the memory boards. Periodically the display unit
must request a memory cycle from the arbitration unit and reload
the card registers. Since output is continuous, this operation
must be started sufficiently in advance of clocking out the
last pixel on the last board. That period of time must allow
for delays in obtaining memory control from the arbitrator and delay inherent in memory access.
Four functional blocks comprise the display unit (fig. xx): sync decode and pixel clocking; line, pixel, and pixel offset counters and loading; output synchronization and manipulation; and memory cycle control.
The sync decoder and pixel clock tell the display counters
when to begin pixel clocking and thereafter provide the pixel clock rate. The presettable down-counter is connected to the
I/O bus via a 9 bit latch. At the start of V sync the counter
is enabled and is clocked down by H sync. The first H blanking
interval after 000 has been reached enables the pixel oscillator.
A monostable triggered from V sync could serve the same
purpose as a count to 0, although processor control of the timing
would be more difficult. The pixel oscillator could be replaced
in certain applications with a clock divided from 14.3 MHz.
This would be useful if the memory were loaded from a real-time video signal sampled at a multiple of subcarrier.
The line and pixel counters are also presettable down-counters set by processor loaded latches. The first two counters provide DA0 to DA6, the pixel and card number counts. The
pixel counter will always clock out four pixels per memory
board, so no preset is necessary (it can be manipulated, but the
number of display memory accesses would be likely to increase). The number of memory cards is variable; five address lines are
specified for expansion to 32 cards. Any counts that ripple beyond DA6 must force a change in the 4K address lines. The
third counter increments on each memory access within a line.
It is loaded with the quantity: pixels per line /four times the
number of active cards. Since the output of the line number
counter is a pointer to an A7 through A18 address via the line
map, the third counter provides a pointer offset which must be
summed with the output of the line map. Four bits (PO0 to PO3)
are provided for counter three, since in a 512x512 system, 16
memory accesses are required across a line. The line counter
is 9 bits wide for a 512 line display. The number three counter allows a particularly interesting operation through the addition
of count capability between two presettable limits, rather than from one limit to zero. With the proper pointers set in the line
counter, this allows a horizontal pan through a larger than
frame size image. For example, the stored frame line - pixel
width could be 1024, while the display clocks out any 512 pixel continuous line. Since the line map can also point to
more lines than a full frame's worth, a windowed pan and tilt
has been implemented. Expansions can be performed by
windowing the segment of the frame desired, slowing down the pixel clock, and loading the line map with the same line pointer for
a number of consecutive line counts.
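The counter arithmetic described above can be sketched in Python as follows. This is an illustrative model, not the hardware: the function names are hypothetical, the pan window is assumed to start and end on memory access boundaries, and the example uses 8 active cards, which is the card count at which the formula in the text gives 16 accesses per 512 pixel line.

```python
def accesses_per_line(pixels_per_line, active_cards, pixels_per_card=4):
    """Preset for the third counter: pixels per line / (4 x active cards)."""
    return pixels_per_line // (pixels_per_card * active_cards)


def pan_limits(stored_width, window_start, window_width,
               active_cards, pixels_per_card=4):
    """Lower and upper offset limits for a horizontal pan window.

    Assumes the window is aligned to whole memory accesses.
    """
    per_access = pixels_per_card * active_cards
    if window_start % per_access or window_width % per_access:
        raise ValueError("window must align to memory access boundaries")
    if window_start + window_width > stored_width:
        raise ValueError("window extends past the stored line")
    first = window_start // per_access
    return first, first + window_width // per_access - 1


def memory_pointer(line_map, line, offset):
    """Sum the line map output with the pointer offset from counter three."""
    return line_map[line] + offset


def expanded_line_map(pointers, factor):
    """Repeat each line pointer 'factor' times for a vertical expansion."""
    return [p for p in pointers for _ in range(factor)]


if __name__ == "__main__":
    print(accesses_per_line(512, 8))
    print(pan_limits(1024, 256, 512, 8))
    print(expanded_line_map([0, 32], 2))
```

Loading the line map with repeated pointers, as in expanded_line_map, corresponds to the expansion procedure described in the last sentence above.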
Memory cycle control is accomplished by a straightforward handshake procedure with the arbitration unit. The memory
cycle control asks for access to memory by bringing DREQ
(Display Request - active low) low. The arbitrator hands over memory control by dropping DACK (Display Acknowledge - active
low) and then initiating the memory cycle by connecting PO0 to
PO3 and L0 to L8 via the line map and summation unit to the 4K
address lines. If the processor or refresh controller is using
the memory, the arbitrator holds DACK high until completion of
the cycle in progress; otherwise display has first priority.
Since the display may have to wait a full cycle and since it requires the data to be ready for latching at a particular
time, it should request memory one complete cycle early. This
is easily accomplished by examining DA0 to DA6 and triggering
DREQ a certain number of counts before zero appears. This
would be an external adjustment tailored to the pixel rate for the raster size in use and the working memory cycle time.
Since the arbitrator generates all the required memory timing,
the memory control unit may have to extend timing of CE on, if
it receives its request acknowledge immediately. See fig. xxiv
for the reset inhibit logic to extend CE on. The display unit must also provide a delayed clock to the modulator to resynchronize the bit transitions between pixels. Functions such as
cursor insertion would occur on the data bus just before the resync latch. A set of processor loaded latches (fig. xxi)
is compared with the current pixel position, and when a match occurs the data line could be color complemented or otherwise
marked. Two monostables are used to define the cursor size
(fig. xxi). When the line and pixel output locations match
the cursor location loaded by the processor the first monostable
with a 16 line on time is enabled. The high transition and the
pixel match cause the next monostable to trigger which holds
the cursor on for 16 pixels. This occurs for the 15 lines
following, creating a square cursor with the upper-left hand corner identifying the cursor location.
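The effect of the two monostables can be expressed as a simple position test, sketched below in Python. The function names are hypothetical, and the marking shown (a bitwise complement of a 16 bit pixel word) is only one of the possibilities the text allows.

```python
CURSOR_SIZE = 16    # lines and pixels held on by the two monostables


def cursor_active(line, pixel, cursor_line, cursor_pixel, size=CURSOR_SIZE):
    """True while the output position lies inside the square cursor.

    The cursor location is the upper-left hand corner of the square.
    """
    return (cursor_line <= line < cursor_line + size and
            cursor_pixel <= pixel < cursor_pixel + size)


def mark_pixel(word, bits=16):
    """Complement a pixel word (16 bits per pixel assumed)."""
    return ~word & ((1 << bits) - 1)


if __name__ == "__main__":
    print(cursor_active(100, 200, 100, 200))    # upper-left corner
    print(cursor_active(116, 200, 100, 200))    # one line past the cursor
    print(hex(mark_pixel(0x00FF)))
```

The matching line plus the 15 lines following give the 16 line height tested here.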
3.4 THE PROCESSOR INTERFACE AND INTERNAL BUS CONTROL
Device interaction with the main processor falls into
two categories. The first involves processor access to main memory for either a read or write operation. The access must occur as fast as possible. It will also occur on a line -
pixel address basis and will require use of the 4K address
lines, the line map, the pixel map, and the dedicated processor
address - data - control bus to memory. To obtain the 4K
address lines the processor must follow a request protocol similar
to the display or refresh procedure. The second type of device
and processor interaction involves control of internal
parameters such as raster size, active card numbers, color map output, cursor location, and line - pixel maps. These are best served by an internal device I/O structure.
The interface expects control, address and data lines from the main processor. In fig. xxiv the blocks labelled
'processor memory read/write control' and 'internal device I/O
and instruction decode' perform the two interface functions
described above. Fig. xxii shows a typical interface system in
which the processor places a device address on the bus and then sends a device strobe signal. Prior to the strobe, an instruction word for the desired device is placed on the data bus. The device receives the strobe, checks if its number is on the bus, and if so, latches the data bus and proceeds to decode
and perform the instruction. Often the instruction will
specify a read or write data operation to or from the processor data bus during the next CPU cycle.
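The strobe sequence just described can be illustrated with a small Python model. The class and function names and the device numbers are hypothetical; the model captures only the address match and the latching of the instruction word, not the decoding of any particular instruction.

```python
class Device:
    """Hypothetical model of one device on the internal I/O bus."""

    def __init__(self, number):
        self.number = number
        self.instruction = None    # last instruction word latched

    def strobe(self, address_bus, data_bus):
        """On device strobe, latch the data bus if this device is addressed."""
        if address_bus != self.number:
            return False
        self.instruction = data_bus
        return True


def issue(devices, number, instruction):
    """Processor side: address and data are valid, then all devices are strobed.

    Returns the numbers of the devices that accepted the instruction.
    """
    return [d.number for d in devices if d.strobe(number, instruction)]


if __name__ == "__main__":
    bus = [Device(1), Device(2)]
    print(issue(bus, 2, 0x5A))
    print(bus[0].instruction, bus[1].instruction)
```

Only the addressed device latches the word; every other device ignores the strobe.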
Fig. xxiii outlines a typical map connection to the internal I/O bus. The line map used here is a 512x12 bit array of very fast bipolar RAM (the 74S200, 256x1, 35 nanosec. access
time). The address to be changed or read is selected by
placing the line number on the address bus as though a frame memory access were about to occur. The 12 data in lines are connected directly to the data bus, and data out can be enabled onto the data bus for reads through its tri-state buffer. WE
and data out enable are driven from the I/O bus. Since the line
map is a shared item, the I/O interface must use the request and acknowledge protocol through the arbitrator. When it receives control it must bring MEMREQ (Memory Request - active low)
high, so that the automatic memory timing does not initiate
a read cycle. It then performs the map read or write and
resets the arbitrator via P-RESET. The process is similar
for loading the other maps and registers although it is not
necessary to request a memory cycle to modify them.
Memory read/write control (fig. xxiv) initiates a PREQ, waits for PACK, and sets up the line - pixel value on the address bus, enabling the data in the correct direction on the bus
according to a read or write instruction. DATAREADY low tells the
processor via the interface to latch the data line and change addresses if necessary. Through the OLEN (Output Latch Enable - active low) line it is possible to read a selected (using PA0 to
PA6) pixel register without disturbing the shared memory lines.
This would be used in conjunction with a processor mass enable
and buffer load on all the memory cards.
The pixel map will be treated in this section because it
is dedicated entirely to the processor. It is loaded with a particular modulo count from the processor that corresponds to
the pixel - card counting sequence used in the display. It
also produces four pixel pointer offset bits as in the third
display counter. The two low order bits are not mapped
because they follow the first display counter which is not preset.
The pixel map should be at least 128x9 bits of fast bipolar RAM.
3.5 ARBITRATION, REFRESH, AND MEMORY TIMING
Arbitration among refresh, display, and processor is
provided by two cascaded refresh controller chips. The Intel
3222 contains selection logic for two asynchronous request
inputs, and controls two acknowledge outputs for the requesting
devices. It produces a memory start cycle signal when the
acknowledge is sent out. A timer and 6 bit refresh address
counter are included to provide refresh requests. The timer is
externally set to trigger 64 times within a 2 millisec. period,
the maximum 2107B refresh interval. Each timer trigger requests a memory cycle and increments the counter. The 6 counter outputs are multiplexed onto the 6 low order 4K address bits, A7
through A12. The second input to the controller chip is the processor request line (fig. xxvi). The start cycle output
of this chip does not go directly to memory timing; rather it
loops up to the other 3222, which decides between either
display or processor / refresh. The processor / refresh
acknowledge signal is sent to the bottom chip's busy input which in turn allows the return acknowledge to whichever input was selected. This arrangement guarantees display the top priority. The refresh timer and counter on the bottom chip are unused. The acknowledge signals from the arbitrator control the multiplexing of the address bus (fig. xxv). Processor and display share the
line map, summer, and address line buffer. The buffer is
enabled for either PACK or DACK low. When REFON is low the refresh buffer only is enabled. Chip logic maintains only
one of the acknowledge lines low at once.
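The refresh timing and the priority scheme can be summarized in the Python sketch below. It is a behavioral illustration only: requests are modelled as booleans that are true when asserted (the hardware lines are active low), the function names are hypothetical, and the ordering of refresh ahead of processor is an assumption, since the text fixes only the top priority of the display.

```python
REFRESH_PERIOD_US = 2000.0    # 2 millisec. maximum 2107B refresh interval
REFRESH_ROWS = 64             # timer triggers required in each period


def refresh_interval_us(period_us=REFRESH_PERIOD_US, rows=REFRESH_ROWS):
    """Longest allowable spacing between refresh timer triggers."""
    return period_us / rows


def next_refresh_address(count):
    """Advance the 6 bit refresh address counter, wrapping at 64."""
    return (count + 1) % 64


def grant(display_req, processor_req, refresh_req, busy=False):
    """Choose which requester receives the next memory cycle.

    No acknowledge is given while a cycle is in progress. Display is
    served first; refresh before processor is an assumed ordering.
    """
    if busy:
        return None
    if display_req:
        return "display"
    if refresh_req:
        return "refresh"
    if processor_req:
        return "processor"
    return None


if __name__ == "__main__":
    print(refresh_interval_us())
    print(grant(True, True, True))
    print(grant(False, True, False))
```

A trigger spacing of 31.25 microseconds or less satisfies the requirement of 64 refresh cycles in every 2 millisec. period.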
A simple memory cycle timing generator using the control
signals developed so far is shown in fig. xxvi. A tapped delay
line is used to clock output latches 180 nanosec. after CE
high. WE is gated low 100 nanosec. after CE high for a
write cycle. At 250 nanosec. the memory reset pulse is tapped
and sent to clear the busy line in the 3222. Included are
reset inhibit and output latch enable. Fig. xxvi shows a
typical timing diagram for a refresh cycle request with the
BIBLIOGRAPHY
Donald P. Martin, Microcomputer Design, Martin Research,
October 1976.
Richard S. O'Brien, Ed., Color Television: Selections From the Journal of the SMPTE, SMPTE, 1970.
The Intel Memory Design Handbook, Intel Corp., 1975.
A. A. Goldberg, "PCM NTSC Television Characteristics,"
SMPTE Journal, 85: 141 - 145, March 1976.
C. J. Libby, "Exploring Color As Space A Computational
Synthesis," MIT Lab. of Architecture and Planning, May 1976.