The MicroVAX 78034 chip - also known as CVAX - is a second-generation single-chip VAX microprocessor. A primary project goal was to develop a chip with three times the performance of the first single-chip VAX processor, the MicroVAX 78032. Therefore, architecture and circuit design efforts were directed toward decreasing ticks per instruction (TPI) and machine cycle time. The designers reduced the TPI by 27 percent and achieved a 90-nanosecond (ns) cycle - a significant improvement over the 200-ns cycle time of the first-generation chip. Implemented in a 2-micron CMOS process, the chip comprises six major functional units. These include the instruction queue, execution unit, memory management unit, bus interface unit, microsequencer and control store, and a unique on-chip cache.
The CVAX 78034 CPU chip is a second-generation, single-chip VAX microprocessor. This chip is the CPU of the MicroVAX 3500 and 3600 computer systems, which have approximately three times the performance of the MicroVAX II computer system.1,2 The VAX 6200 family of systems uses slightly faster 80-ns (speed-binned) CVAX CPU chips in a multiprocessor configuration. In this paper, we describe the CVAX chip and explain how the increase in performance was achieved.
Project Goals
The primary project goal was to develop a single-chip CPU that implemented the VAX architecture and delivered three times the performance of the MicroVAX 78032 CPU chip used in the MicroVAX II computer systems. Of the several elements in this goal, performance presented the greatest design challenge.
Digital Technical Journal No. 7 August 1988
The performance of a CPU is inversely proportional to the product of ticks per instruction (TPI)3 and the machine cycle time. TPI depends on the performance of the system architecture. The minimum machine cycle time depends on circuit speed and on how the architecture is implemented. In the CVAX chip, both the TPI and the machine cycle time were improved to meet the performance goal.
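The proportionality stated above can be written compactly. The symbols P (performance) and t_cyc (machine cycle time), and the subscripts "old" and "new", are notation introduced here for illustration and do not appear in the paper:

```latex
P \propto \frac{1}{\mathrm{TPI} \times t_{\mathrm{cyc}}}
\qquad\Longrightarrow\qquad
\frac{P_{\mathrm{new}}}{P_{\mathrm{old}}}
  = \frac{\mathrm{TPI}_{\mathrm{old}} \; t_{\mathrm{cyc,old}}}
         {\mathrm{TPI}_{\mathrm{new}} \; t_{\mathrm{cyc,new}}}
```

Under this relation, a lower TPI and a shorter cycle time multiply together, which is why the project attacked both terms.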
Much effort went into reducing the TPI. By way of comparison, the MicroVAX II system, which is based upon the MicroVAX 78032 chip, performs at approximately 11.5 TPI, whereas the MicroVAX 3600 system, which uses the CVAX 78034 chip, performs at approximately 8.4 TPI. The TPI was lowered mainly by reducing the average number of cycles required to access memory. This reduction in the number of cycles was achieved by the inclusion of the following architectural features in the system:
• A 1-kilobyte (KB), on-chip instruction and data stream cache, which is capable of a longword read each cycle
• A 64 KB, second-level cache on the board, which is capable of a longword read or write in two cycles and a quadword read in three cycles
• A 28-entry translation buffer (TB), which achieves a high hit rate for virtual-to-physical address translation
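To illustrate how the two cache levels listed above lower the average number of cycles per memory read, the following is a minimal sketch of a two-level expected-latency model. The cache read latencies (one and two cycles) come from the text; the hit rates and the main-memory latency passed to the function are hypothetical inputs chosen by the caller, not measured CVAX figures, and the model itself is a simplification.

```python
# Simplified expected-latency model for a two-level cache hierarchy.
# Cache cycle counts are from the text; hit rates and main-memory latency
# are HYPOTHETICAL inputs, not measured CVAX data.

ON_CHIP_READ_CYCLES = 1      # on-chip cache: a longword read each cycle
BOARD_CACHE_READ_CYCLES = 2  # second-level cache: a longword read in two cycles


def average_read_cycles(on_chip_hit_rate, board_hit_rate, main_memory_cycles):
    """Return the expected cycles per longword read.

    board_hit_rate is the hit rate among reads that missed the on-chip cache.
    Each level's latency is treated as the total cost of a read served there.
    """
    miss_on_chip = 1.0 - on_chip_hit_rate
    miss_board = 1.0 - board_hit_rate
    return (
        on_chip_hit_rate * ON_CHIP_READ_CYCLES
        + miss_on_chip * board_hit_rate * BOARD_CACHE_READ_CYCLES
        + miss_on_chip * miss_board * main_memory_cycles
    )


# Example call with made-up rates: 80% on-chip hits, 90% board-cache hits,
# and a hypothetical 6-cycle main-memory read.
example = average_read_cycles(0.8, 0.9, 6)
```

The closer the on-chip hit rate is to 1, the closer the average approaches one cycle per read, which is the effect the architectural features above aim for.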
The CVAX 78034 Chip, a 32-bit Second-generation VAX Microprocessor
Table 1 CVAX Instruction Set Architecture

Instruction Type                        Number

Implemented Fully by CPU
  Integer/logical                           89
  Address                                    8
  Bit field                                  7
  Control                                   39
  Procedure call                             3
  Miscellaneous                             10
  Queue                                      6
  System support                            11
  Character string                           8
  Subtotal                                 181

Implemented by Floating Point Chip
  F floating                                24
  D floating                                23
  G floating                                23
  Subtotal                                  70

Implemented Partially by CPU
  Character string                           3
  Decimal                                   16
  Edit                                       1
  CRC                                        1
  Subtotal                                  21

Implemented Fully in Macrocode
  H floating                                28
  Octaword                                   4
  Subtotal                                  32

Total                                      304
Other factors influencing the lower TPI are as follows:
• More efficient microcode was implemented for some instructions. In general, most complex instructions, such as CALLx, RET, PUSHR, POPR, and INSV, were coded for speed rather than for space.
• Six additional instructions were implemented in microcode. These instructions are CMPC3, CMPC5, LOCC, SKPC, SCANC, and SPANC.
• The instruction decode section decodes all specifiers instead of relying on the microcode to decode some specifiers.
The machine cycle time reduction was determined in part by the technology chosen for fabrication. The first-generation chip, the MicroVAX 78032 CPU, has a 200-ns cycle time and was implemented in a 3-micron NMOS process. In comparison, the CVAX 78034 CPU chip had a goal of a 90-ns cycle time and was implemented in a 2-micron CMOS process. However, only 60 percent of the improvement in the CVAX cycle time results from the fabrication process. The remainder results from architectural and circuit innovations, which are described in the section Internal Organization.
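As a rough check of the figures quoted in this section, the sketch below multiplies TPI by cycle time to estimate the average time per instruction for each system and then takes the ratio. The TPI and cycle-time values are the ones stated in the text; treating each system as running at its chip's quoted cycle time is an assumption made here, and the function names are illustrative.

```python
# Figures quoted in the text.
MICROVAX_II_TPI = 11.5        # MicroVAX II system (MicroVAX 78032 chip)
MICROVAX_II_CYCLE_NS = 200    # 78032 cycle time
CVAX_TPI = 8.4                # MicroVAX 3600 system (CVAX 78034 chip)
CVAX_CYCLE_NS = 90            # 78034 cycle-time goal


def time_per_instruction_ns(tpi, cycle_ns):
    """Average time per instruction: ticks per instruction times cycle time."""
    return tpi * cycle_ns


def speedup(old_tpi, old_cycle_ns, new_tpi, new_cycle_ns):
    """Performance ratio, with performance taken as 1 / (TPI * cycle time)."""
    old_time = time_per_instruction_ns(old_tpi, old_cycle_ns)
    new_time = time_per_instruction_ns(new_tpi, new_cycle_ns)
    return old_time / new_time


cvax_speedup = speedup(MICROVAX_II_TPI, MICROVAX_II_CYCLE_NS, CVAX_TPI, CVAX_CYCLE_NS)
tpi_reduction = (MICROVAX_II_TPI - CVAX_TPI) / MICROVAX_II_TPI

print(f"speedup: {cvax_speedup:.2f}x, TPI reduction: {tpi_reduction:.0%}")
```

With these inputs the ratio comes out at roughly three and the TPI reduction at roughly 27 percent, consistent with the stated project goal.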
The section following presents an overview of the CVAX architecture.
CVAX Architecture
The CVAX 78034 CPU chip implements the VAX architecture, which has 16 general-purpose registers, the processor status longword, and 18 miscellaneous privileged registers. All 304 VAX instructions are supported by the system.4 The chip fully executes 181 instructions and provides microcode operand parsing for 21 instructions that are emulated with macrocode. The chip passes 70 F, D, and G floating point instructions to a companion floating point chip. The remaining 32 instructions are fully emulated in macrocode. Table 1 summarizes the instruction set architecture.
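The breakdown in the paragraph above can be tallied with a short sketch. The four counts are the ones quoted in the text; the dictionary keys and the derived "share without macrocode" figure are illustrative additions, not values from the paper.

```python
# Instruction counts by implementation class, as quoted in the text.
INSTRUCTION_CLASSES = {
    "executed fully by the CPU chip": 181,
    "passed to the floating point chip (F, D, G)": 70,
    "operands parsed in microcode, emulated in macrocode": 21,
    "emulated fully in macrocode": 32,
}

total_instructions = sum(INSTRUCTION_CLASSES.values())

# Share of the instruction set handled without any macrocode emulation
# (CPU chip plus floating point chip). Derived here for illustration only.
no_macrocode = (
    INSTRUCTION_CLASSES["executed fully by the CPU chip"]
    + INSTRUCTION_CLASSES["passed to the floating point chip (F, D, G)"]
)
no_macrocode_share = no_macrocode / total_instructions

print(total_instructions, f"{no_macrocode_share:.1%}")
```

The four classes add up to the 304 instructions the system supports.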
The chip memory management hardware and microcode provide a demand-paged virtual memory environment. The virtual memory size is 4 gigabytes, and the physical address space is 1 gigabyte.
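The sizes above correspond to 32-bit virtual addresses and 30-bit physical addresses. The sketch below shows how a 32-bit virtual address could be split for demand paging. The 512-byte page size and the region encoding in the top two bits come from the general VAX architecture rather than from this section, so treat them as assumptions here; the function name is hypothetical.

```python
VIRTUAL_ADDRESS_BITS = 32   # 4 gigabytes of virtual memory (from the text)
PHYSICAL_ADDRESS_BITS = 30  # 1 gigabyte of physical address space (from the text)

# Assumptions taken from the general VAX architecture, not from this section:
PAGE_SIZE_BYTES = 512       # 9-bit byte offset within a page
OFFSET_BITS = 9
VPN_BITS = 21               # virtual page number within a region
REGIONS = {0: "P0", 1: "P1", 2: "S0", 3: "reserved"}  # selected by bits 31:30


def split_virtual_address(va):
    """Split a 32-bit virtual address into (region, page number, byte offset)."""
    if not 0 <= va < 2 ** VIRTUAL_ADDRESS_BITS:
        raise ValueError("virtual address must fit in 32 bits")
    offset = va & (PAGE_SIZE_BYTES - 1)
    vpn = (va >> OFFSET_BITS) & ((1 << VPN_BITS) - 1)
    region = REGIONS[va >> (OFFSET_BITS + VPN_BITS)]
    return region, vpn, offset


# Example: the second page of system space.
region, vpn, offset = split_virtual_address(0x80000200)
```

A translation buffer then only has to map the region and page number to a physical page; the byte offset passes through unchanged.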
External Interface
The CVAX bus provides a flexible interconnect protocol between all CVAX family members. The primary data bus is 32 bits wide and is time multiplexed to share addresses and data. Up to four longwords can be transferred with each address. Strobes provide timing information for synchronous and asynchronous devices. Direct memory access (DMA) request and grant signals are used to control arbitration of the data and address line (DAL) bus between the CPU and peripheral chips.
Shown in Figure 1, the CVAX 78034 CPU chip is a synchronous device on the CVAX bus. In addition to supporting the CVAX bus protocol, eight dedicated pins support a floating point coprocessor interface. These pins are time multiplexed between the CPU chip and the coprocessor chip to transfer control and status information.
[Figure 1: block diagram showing the CVAX 78034 central processor unit, the CVAX 78134 floating-point accelerator, cache memory and write buffer, interrupt control, DMA control, and the latches and transceivers on the DAL<31:00> bus.]
Figure 1 CVAX 78034 External Interface
A clock chip generates pairs of 180-degree phase-shifted clock signals that are distributed to all synchronous MOS components in the system. The clock also generates auxiliary pairs of clocks that can be used by any non-MOS components in the external interface. Separation of the clocking for MOS and non-MOS elements provides better skew control for the critical MOS clock signals.
Microarchitecture
The CVAX 78034 CPU chip has some pipelining and is microprogrammed. The chip comprises six major functional units:5,6,7
• Instruction decode and prefetch queue (I-Box)
• Execution unit (E-Box)
• Memory management unit (M-Box)
• Bus interface unit (BIU)
• Cache
• Microsequencer and control store
The photomicrograph in Figure 2 and the block diagram in Figure 3 illustrate all functional units on the chip.