FIGURE 4.5: Transcoding architectures for bit-rate reduction [72]:

4.4.4 Bit Stream Switching

Although scalable coding can potentially provide flexible bandwidth adaptation over unpredictable best-effort networks, current coding techniques still suffer from relatively low coding efficiency, especially when the bit rate range is large.

As a result, bit stream switching techniques are widely used in many commercial video streaming systems [6,19] to create multiple versions of the same content at different bit rates and dynamically switch among them to accommodate bandwidth variations. In this section, we introduce three major switching techniques, namely multiple bit rate coding, SP/SI pictures, and stream morphing.

4.4.4.1 Multiple Bit Rate (Simulcast) Coding

In this approach, each media source is simply compressed into multiple independent nonscalable bit streams at different bit rates and qualities. During transmission, the server switches to the particular bit stream whose transmission yields the minimum reconstructed distortion, based on an estimate of the actual channel bandwidth and loss characteristics. Ideally, once a change in network bandwidth is detected, the server will immediately switch to a more appropriate stream to reflect the change promptly. However, because of motion prediction, switching between bit streams at arbitrary locations, such as a P-frame, may introduce severe drift effects since the reference frames are different at the encoder and decoder.
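The server-side selection step can be sketched as follows. This is a minimal illustration, not the method of any particular system: it assumes that a higher bit rate always means lower distortion and it ignores loss characteristics, so "minimum distortion" reduces to "highest rate that fits the estimated bandwidth". The function name `select_stream` and the example rates are hypothetical.

```python
from typing import Sequence


def select_stream(bitrates_kbps: Sequence[float], est_bandwidth_kbps: float) -> int:
    """Return the index of the highest-rate stream that fits the bandwidth estimate.

    Assumption: distortion decreases monotonically with bit rate, so the
    highest feasible rate is used as a proxy for minimum distortion.
    Falls back to the lowest-rate stream when no stream fits.
    """
    fitting = [i for i, rate in enumerate(bitrates_kbps) if rate <= est_bandwidth_kbps]
    if fitting:
        return max(fitting, key=lambda i: bitrates_kbps[i])
    return min(range(len(bitrates_kbps)), key=lambda i: bitrates_kbps[i])


if __name__ == "__main__":
    streams = [300.0, 700.0, 1500.0]  # hypothetical simulcast rates in kbps
    for bandwidth in (100.0, 900.0, 5000.0):
        print(bandwidth, "->", streams[select_stream(streams, bandwidth)])
```

A real server would also have to decide when the switch may take effect, which is the drift problem discussed next.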

The simplest way to achieve drift-free switching is to insert I-frames periodically in each stream and let the switching from stream to stream occur only at those I-frames. Obviously, because adaptation requests only take effect when an I-frame is reached, this increases the latency of bandwidth adaptation. To provide more flexible adaptation, the frequency of I-frames has to be increased at a cost of significantly increased bit rates to achieve the same quality. Thus, allowing more effective stream switching comes at the cost of a decrease in video quality for a given target bit rate. In addition, the flexibility of bandwidth adaptation also depends on the number of different bit streams available, each coded at a different bit rate. The more bit streams are available, the more accurate and finer-grained the supported bandwidth adjustments can be. The inefficiency of coding I-frames results in a much larger storage requirement on the media server when the number of supported bit streams is large. The trade-off between coding efficiency and switching flexibility thus becomes a main consideration in the design of a drift-free switching approach.
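The trade-off between switching latency and rate can be made concrete with a small calculation. The sketch below assumes I-frames at frame indices 0, N, 2N, ... and a fixed size for every I-frame and every P-frame; the function names and the frame sizes in the usage example are hypothetical, chosen only for illustration.

```python
import math


def next_switch_frame(request_frame: int, iframe_period: int) -> int:
    """Index of the first I-frame at or after the frame where a switch is requested.

    Assumes I-frames sit at indices 0, N, 2N, ... with N = iframe_period.
    """
    return math.ceil(request_frame / iframe_period) * iframe_period


def worst_case_wait_s(iframe_period: int, fps: float) -> float:
    """Longest wait, in seconds: the request arrives one frame after an I-frame."""
    return (iframe_period - 1) / fps


def mean_bits_per_frame(iframe_period: int, i_frame_bits: float, p_frame_bits: float) -> float:
    """Average frame size for one I-frame followed by N-1 P-frames."""
    return (i_frame_bits + (iframe_period - 1) * p_frame_bits) / iframe_period


if __name__ == "__main__":
    # Hypothetical sizes: an I-frame costs 8x as many bits as a P-frame.
    for period in (15, 30, 60):
        print(period,
              worst_case_wait_s(period, fps=30.0),
              mean_bits_per_frame(period, 80_000, 10_000))
```

Shortening the I-frame period lowers the worst-case wait but raises the average number of bits per frame, which is the cost described above.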

More efficient approaches for drift-free switching aim at removing the overhead associated with I-frames, which exists even for normal transmission without switching between bit streams. To facilitate switching at inter frames (i.e., P-/B-frames), an extra bit stream is created at each predefined switching point. This incurs an increased rate cost only when switching happens, while the coding efficiency for normal transmission stays the same as, or close to, that of a stream without switching support. One way is to encode the difference of reference frames at the switching points and transmit this as an additional bit stream, which can be used for drift compensation at the decoder. The mismatch can be removed if lossless compression is applied. Another way is to introduce a specially encoded P-frame, called an S-frame [23], to achieve switching at the location of inter frames. As illustrated in Figure 4.6a, to initiate switching from bit stream 1 to bit stream 2 at time t, an S-frame (frame S12,t) is encoded as a P-frame with the previously reconstructed frame at time t-1 in bit stream 1 (frame P1,t-1) as the reference frame and the reconstructed frame at time t in bit stream 2 (frame P2,t) as the target frame. This approach cannot completely eliminate the drift. However, by reducing the quantization parameter (QP) of the S-frame, the drift amount can be controlled and made relatively small. Another disadvantage of this approach is that the rate required for S-frames can be very large due to the small QP that is required. The SP-/SI-frames introduced in the next section improve on S-frames by providing drift-free switching. In addition to switching between nonscalable bit streams, bit stream switching can also be performed for several independently coded scalable streams [67].

FIGURE 4.6: Switching from bit stream 1 to bit stream 2 through specially encoded frames: (a) S-frame and (b) SP-frame.
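The effect of the S-frame quantizer on the residual drift can be illustrated with a toy scalar model. In this sketch a "frame" is a single number, the S-frame carries the uniformly quantized difference between the target frame of bit stream 2 and the previous reconstruction of bit stream 1, and the quantizer step size stands in for the QP. All names and values are hypothetical; a real codec quantizes transform coefficients per block.

```python
def quantize(value: float, step: float) -> float:
    """Uniform quantizer: round to the nearest multiple of `step`."""
    return round(value / step) * step


def switch_mismatch(rec1_prev: float, rec2_target: float, step: float) -> float:
    """Mismatch that remains after switching through an S-frame.

    rec1_prev   -- reconstruction at time t-1 in bit stream 1 (the reference)
    rec2_target -- reconstruction at time t in bit stream 2 (the target)
    step        -- quantizer step size of the S-frame (toy stand-in for QP)
    """
    residual = quantize(rec2_target - rec1_prev, step)  # what the S-frame transmits
    decoded = rec1_prev + residual                       # what the decoder rebuilds
    return abs(decoded - rec2_target)                    # drift injected at the switch


if __name__ == "__main__":
    for step in (8.0, 4.0, 1.0):
        print(step, switch_mismatch(100.0, 105.3, step))
```

In this model the mismatch is bounded by half the step size, so a finer quantizer gives less drift. The price is a residual that takes more bits to code, which matches the rate disadvantage noted above.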

4.4.4.2 SP/SI Pictures

The extended profile of H.264/MPEG-4 Part 10 AVC [2] introduces two new frame types referred to as SP-frames and SI-frames [33]. SP- and SI-frames facilitate switching between multiple independently coded bit streams and also provide “VCR-like” functionalities, such as random access, fast forward, fast backward, and so on.

Within each encoded bit stream, SP-frames are created at the switching points in two different types, namely primary SP-frames and secondary SP-frames (see Figure 4.6b). The primary SP-frame (frames SP1,t and SP2,t in Figure 4.6b) is created by motion-compensated prediction from the previously reconstructed frames in the same bit stream, while the corresponding secondary SP-frame (frame SP12,t as an example) is generated, with the same reconstructed values as the primary SP-frame (SP-frame SP2,t), by using the previously reconstructed values from another bit stream. A primary SP-frame is encoded with almost the same coding efficiency as the corresponding P-frame. The difference between SP-frames and P-/S-frames is that, due to the special encoding of the secondary SP-frame, the pair of SP-frames can be identically reconstructed even if they are predicted using different frames.

Compared to I-frames, SP-frames can achieve the same switching functionality with significantly fewer bits by exploiting motion-compensated predictive coding. An alternative to a secondary SP-frame is an SI-frame, which uses only intra prediction to produce the same reconstructed values as the corresponding primary SP-frame. It is mainly used when motion prediction is not efficient, such as switching between bit streams representing completely different video sequences, or for random access, in which decoding of the current frame does not depend on any previous frames.
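Why a primary and a secondary SP-frame can reconstruct identically from different predictions can be shown with a heavily simplified scalar model. It rests on one assumption about the principle: the reconstruction is defined by quantized levels of "prediction plus residual", and the secondary SP-frame losslessly codes the difference between those target levels and the quantized prediction from the other bit stream. The function names are hypothetical, and the actual H.264 process works on transform coefficients with further details omitted here.

```python
from typing import List


def quantize_level(value: float, step: float) -> int:
    """Integer quantization level of `value` for a uniform quantizer."""
    return round(value / step)


def primary_sp_levels(pred_same: List[float], residual: List[float], step: float) -> List[int]:
    """Levels that define the reconstruction of the primary SP-frame (e.g., SP2,t)."""
    return [quantize_level(p + r, step) for p, r in zip(pred_same, residual)]


def encode_secondary_sp(target_levels: List[int], pred_other: List[float], step: float) -> List[int]:
    """Level differences sent in the secondary SP-frame (assumed losslessly coded)."""
    return [t - quantize_level(p, step) for t, p in zip(target_levels, pred_other)]


def decode_secondary_sp(level_diffs: List[int], pred_other: List[float], step: float) -> List[int]:
    """Decoder side: requantize the prediction from the other stream, add the differences."""
    return [d + quantize_level(p, step) for d, p in zip(level_diffs, pred_other)]


def reconstruct(levels: List[int], step: float) -> List[float]:
    """Dequantize levels into reconstructed sample values."""
    return [level * step for level in levels]


if __name__ == "__main__":
    step = 2.0
    pred2 = [10.2, 20.7, 30.1]  # prediction inside bit stream 2
    resid = [0.5, -1.0, 2.0]    # decoded residual of the primary SP-frame
    pred1 = [8.9, 23.3, 27.6]   # prediction from bit stream 1 when switching

    target = primary_sp_levels(pred2, resid, step)
    diffs = encode_secondary_sp(target, pred1, step)
    switched = decode_secondary_sp(diffs, pred1, step)
    print(reconstruct(target, step) == reconstruct(switched, step))
```

Because the level differences are integers coded without loss, the decoder lands on exactly the same levels whichever prediction it started from. That is the property that separates SP-frames from S-frames in this model.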

4.4.4.3 Stream Morphing

Stream morphing [44] has been introduced as an interesting alternative to scalable video coding and is related to techniques that have been proposed for efficient scalable DPCM coding [58,60,64]. Scalable coding schemes operate in the signal domain to separate an input into different layers. For example, in a closed loop system, the video sequence obtained from reconstructing the base layer is subtracted from the original video sequence, and the resulting difference is in turn compressed. Alternatively, an open loop system (e.g., one based on wavelet transforms) would directly separate the input sequence into “components” (e.g., subbands), compress these separately, and form the layers by grouping several of these components.

Stream morphing is based on the following observation. Consider a video sequence encoded with a nonscalable codec (say MPEG-2) at two different target rates. Clearly there will be some redundancy between the two bit streams since they represent the same sequence, albeit at different rates. For example, most blocks will have the same motion vectors at both rates, large DCT coefficients in the residual signal will tend to be in the same locations, etc. A stream morphing technique would use the low rate stream as the base layer. Then the enhancement layer will contain a bit stream with a special syntax that allows the decoder to reconstruct the high rate bit stream from the low rate bit stream. For example, this enhancement layer could include differential information with respect to the motion vectors included in the base layer. Note that this is a transformation between bit streams. Thus one of the principal differences between stream morphing and standard scalability tools is that decoding the base layer is not needed to reproduce the signal at the highest quality. Instead, the base layer bit stream is “morphed” into the high-resolution bit stream, on which a standard decoder is used (e.g., the MPEG-2 decoder in our example). Note also that the quality levels at the decoder are exactly determined by the two (or more) originally encoded versions.
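The motion-vector example can be sketched in a few lines. This toy model treats a bit stream as nothing more than a list of per-block motion vectors and takes the enhancement layer to be the per-block vector difference. The names `make_enhancement` and `morph` and the data are hypothetical, and an actual morphing syntax would also have to cover coefficients, modes, and headers.

```python
from typing import List, Tuple

MotionVector = Tuple[int, int]


def make_enhancement(base_mvs: List[MotionVector],
                     high_mvs: List[MotionVector]) -> List[MotionVector]:
    """Differential motion vectors: high rate stream minus low rate (base) stream."""
    return [(hx - bx, hy - by) for (bx, by), (hx, hy) in zip(base_mvs, high_mvs)]


def morph(base_mvs: List[MotionVector],
          enhancement: List[MotionVector]) -> List[MotionVector]:
    """Rebuild the high rate stream's motion vectors without decoding any pixels."""
    return [(bx + dx, by + dy) for (bx, by), (dx, dy) in zip(base_mvs, enhancement)]


if __name__ == "__main__":
    base = [(1, 0), (2, -1), (0, 0)]  # motion vectors in the low rate stream
    high = [(1, 0), (3, -1), (0, 0)]  # motion vectors in the high rate stream
    enhancement = make_enhancement(base, high)
    print(enhancement)
    print(morph(base, enhancement) == high)
```

When most blocks share the same motion vectors at both rates, most differences are zero and cheap to code. The morphing step operates on bit stream elements only, so a standard decoder is run once, on the morphed stream.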

From the document MULTIMEDIA OVER IP AND WIRELESS NETWORKS (pages 126-129)