MPEG decoder Case
K.A. Vissers Philips Research
and
Pieter van der Wolf
Philips Research Eindhoven, The Netherlands
Outline
• Introduction
• Kahn Process Networks Revisited
• MPEG application
• Mapping
• Design space exploration
• Conclusions
Design Problem
VLIW cpu
I$
video-in video-out
audio-in SDRAM
audio-out PCI bridge
Serial I/O
timers MPEG
D$
MPEG2 decoding
Audio and Video,
• Audio completely done in software on VLIW
• Video:
– describe in Kahn Process Networks – Test for compliance with the standards:
• ranging from H263 to HDTV!
– Map on software + dedicated hardware
MPEG2 decoding status
IP content in Philips Semiconductors:
Nx2700: single chip HDTV decoder (audio + Video):
– dedicated MPEG2 video decoder – specialized Video Output
– many other features
Y chart methodology
In Research:
• Describe the full functionality of the MPEG2 video decoding.
• Mapping and performance analysis
Compare with the Semiconductors results ->
Sanity check on the methodology
Application Modeling: Yapi
Y-api:
An C++ api for Kahn Process Networks:
Expose parallelism and communication
Process
FIFO Read
Write
A C
B Execute
Workload analysis
Computation and Communication workload
FIFO
A C
B
Operation
A-up 56952
Count
A-down 1834
Operation
B-filter 58786 Count
Operation
C-join 34815 Count
C-pass 9834
C-skip 7274
58786 58786
34815
Mpeg2 video decoder
Model of MPEG2 video decoder in Yapi
Design Problem
VLIW cpu
I$
video-in video-out
audio-in SDRAM
audio-out PCI bridge
Serial I/O
timers MPEG
D$
Mpeg2 mapping
Dedicated Mpeg coprocessor: pipe with fifo buffers and blocking semantics
Software on the VLIW
Video In
Video Out Memory Memory
Dedicated Mpeg coprocessor Memory
Mapping and performance analysis
Process
FIFO
Read
Write
A C
B Execute
Application model (YAPI)
R(1)
W(3) E(C-join) W(2) R(2)
E(B-filter) R(1)
X Abstract Y
Mem Architecture
model processor
model
MPEG2 mapping
Data dependant: different video streams:
• Computation on the VLIW and coprocessors
• Communication on the bus and internally in the Mpeg2 decoder
Exploration Questions:
Bus load, burst size of the communication, latency of the memory interface
Exploration results: bus waits
0 2000 4000 6000 8000 10000 12000 14000 16000 18000
20000 SU stalls for bus
"int_SU_o2.dat"
Storage Unit
0 500 1000 1500 2000 2500
0 10 20 30 40 50 60 70
cycle
VInput stalls for bus
"int_VIn_o1.dat"
Exploration results: bus waits
Video In
Lessons learned
• Y chart is a valuable guiding principle
• Quantifying design choices is key
• Match of Model of Computation and Model of Architecture
• Tools required: Mapping, Simulation and Design Space Exploration support
• Raise the level of abstraction for exploration
Conclusion
Verilog
Research
“Sanity Check”
on the methodology
Semiconductors tss
References
• Pieter van der Wolf, Paul Lieverse, Mudit Goel, David La Hei, and Kees Vissers, "An MPEG- 2 Decoder Case Study as a Driver for a System Level Design Methodology" In: Proc. 7th Int.
Workshop on Hardware/Software Codesign (CODES'99), Rome, Italy, May 3-5 1999.
• E.A. de Kock, G. Essink, W.J.M. Smits, P. van der Wolf, J.-Y Brunel, W.M. Kruijtzer, P.
Lieverse, K.A. Vissers. “YAPI: Application Modeling for Signal Processing Systems” In:
Proc. 37th DAC, Los Angeles, June 2000.
• Http://ptolemy.eecs.berkeley.edu/~vissers