Design of a real-time broadcast system over the internet

Design of a Real-Time Broadcast System over the Internet by

Kenneth Sau-yee Hon

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degrees of

Bachelor of Science in Electrical Engineering and Computer Science and Master of Engineering in Electrical Engineering and Computer Science

at the Massachusetts Institute of Technology January 28, 1998

The author hereby grants to MIT permission to reproduce and distribute publicly paper and electronic copies of this thesis

and to grant others the right to do so.

Author

Department of Electrical Engineering and Computer Science
January 28, 1998

Certified by Madhu Sudan, Thesis Supervisor

Accepted by Arthur C. Smith, Chairman, Department Committee on Graduate Thesis

III.1 Scalability
III.2 Load Balancing
III.3 Fault Tolerance
III.4 Adaptability

IV System Architecture
IV.1 Graph Analogy
IV.2 System Components
IV.2.1 Multimedia Source
IV.2.2 Reflector
IV.2.3 Client Playback Station
IV.2.4 Control Unit

V System Dynamics
V.1 Arrival of Multimedia Source
V.1.1 Setting up new session using Weighted Constrained Steiner Tree Algorithm
V.1.2 Approximation Algorithm for finding the Weighted Constrained Steiner Tree
V.1.4 Centralized Approach to find the Weighted Constrained Steiner Tree
V.2 Departure of Multimedia Source
V.3 Arrival of Reflector
V.3.1 Incorporating the reflector into an existing session tree
V.3.2 Connecting the reflector to a session using Online Approximation Algorithm for the Weighted Constrained Steiner Tree
V.3.3 Adding new session to existing reflector
V.4 Departure of Reflector
V.5 Client Admission Scheme
V.5.1 Connecting to an active reflector
V.5.2 Initiating a passive Reflector
V.6 Assigning Alternate Server for System Components
V.6.1 Assignment of alternate server for reflector
V.6.2 Assignment of alternate server for client

VI Conclusion
VI.1 Summary
VI.2 Extensions
VI.3 Potential Benefits
VI.4 Future Research Directions

LIST OF FIGURES

II.1 Comparison of the Unicast, Broadcast, and Multicast models
III.1 Multiple-Tier Architecture
III.2 Example showing how migration of clients is used for load-balancing
III.3 Illustration of the effects of server failure
III.3 Illustration of how the fault tolerance mechanism works
III.4 Feedback process
IV.1 System Architecture
IV.2 Example showing the benefit of incorporating a passive reflector
IV.3 Example showing the use of TCP/IP connections over the global Internet
IV.4 Filtering Module of the Reflector
IV.5 Information Management Module of the Control Unit
V.1 System representation used for illustration of the various operations
V.2 System representation illustrating the procedures for handling the arrival of multimedia source
V.3 Graph used to illustrate the approximation algorithm for the Weighted Constrained Steiner Tree (WCST) problem
V.4 The optimal WCST for the example
V.5 Reduced graph of the example
V.6 Creation of the minimum spanning tree on the reduced graph using the Edge Fitness Index
V.7 System representation illustrating the procedures for handling the departure of multimedia source
V.8 System representation illustrating the procedures for handling the arrival of reflector
V.9 Attachment of reflector to a session tree using online approximation algorithm for the WCST problem
V.10 System representation illustrating the procedures for adding a new session to a reflector
V.11 System representation illustrating the procedures for handling the departure of reflector
V.12 System representation illustrating the procedures for initiating a passive reflector to meet increased demand

LIST OF TABLES

IV.1 Graph Analogy of the Broadcast System

Design of a Real-Time Broadcast System over the Internet by

Kenneth Sau-yee Hon

Submitted to the

Department of Electrical Engineering and Computer Science January 28, 1998

In Partial Fulfillment of the Requirements for the Degrees of Bachelor of Science in Electrical Engineering and Computer Science and Master of Engineering in Electrical Engineering and Computer Science

ABSTRACT

This thesis presents the design of a real-time broadcast system over the Internet. The system that is designed allows multimedia streams consisting of audio and video data to be distributed over standard point-to-point connections in a scalable and adaptive manner. Issues such as load balancing, adaptability and fault tolerance are addressed. The system architecture consists of four main components, including the multimedia source, the reflector, the client playback station, and the control unit. These system components communicate and interact with each other through network connections in order to perform various system operations. Many of the system operations include the execution of graph algorithms, and these operations enable the system to satisfy different users' demand and to react to changes in network conditions and system resources.

Thesis Supervisor: Madhu Sudan

Chapter I

Introduction

This thesis presents the design of a distributed system that can provide scalable and adaptive real-time multimedia streaming over the Internet. The work in this thesis relates to the area of efficient network communication in real-time using audio and video data. Real-time multimedia streaming is a method for transferring live audio and video data, where the receiver plays the downloaded data and at the same time continues to obtain additional data from the sender. This mechanism is very useful for live audio and video broadcasts, where hundreds or thousands of clients join simultaneously to listen to live events. As the usage and popularity of the Internet increase, so does the demand for real-time multimedia streaming over the network. On the other hand, while conventional Internet traffic, which usually involves the transmission of plain text and static graphics, tends to be short-lived, real-time multimedia streaming is lengthy and requires a minimum bandwidth for the network connection that is transmitting the data. As a result, many existing architectures that are designed to handle conventional Internet traffic are not sufficient to manage the demanding task of distributing real-time multimedia streams over the network, and a system that can provide this kind of service in a scalable and efficient manner is highly desirable. The major contribution of this work is to present a system design that allows for scalable real-time multimedia streaming over standard point-to-point network connections. Besides scalability, issues such as load balancing, adaptability and fault tolerance are also addressed when designing the system architecture and operations.

I.1 Multipoint Communication

Multipoint communication has long been valued for its effectiveness in situations when a sender needs to communicate a message to multiple recipients. Multipoint communication is a form of communication where a sender delivers a message to the target recipients simultaneously. This is more efficient than using point-to-point communication, where a sender only delivers the message to one receiver during the communication process. Regardless of the medium applied, the scalability advantage of multipoint communication over point-to-point communication is obvious.

Despite multipoint communication's apparent advantage as an efficient way to disseminate information, early research into data networks and packet switching technology focused on point-to-point communication, where a sender needs to establish a separate connection for each receiver [1]. However, the importance of multipoint communication becomes increasingly apparent as more and more multipoint applications are being developed and widely employed. In particular, major research efforts have been devoted to designing and constructing applications that can distribute digital audio and video over the network in a multipoint form of communication. This particular area has drawn attention from the technology community because many people believe that multimedia technology can provide a natural, unobtrusive way to incorporate computers into more areas of life by providing a better interface for information distribution. On the other hand, multipoint communication using multimedia data is also expensive, as large bandwidths are often required to transmit audio and video over the network, and multimedia broadcasts usually last for a long period of time. The challenge is to devise a mechanism that allows for efficient multipoint multimedia communication using the

I.2 Real-time Multimedia Streaming

An area also receiving enormous attention nowadays is the real-time multimedia streaming technology. It is a mechanism for delivering live audio and video over the network; the receiver of the stream plays the multimedia data and at the same time continues to download additional data from the sender. The current network architecture is designed to mainly support full file transfer for document retrieval. While this full file transfer paradigm is sufficient in the area of traditional information retrieval and navigation, it severely limits the inclusion of audio and video information [2]. The transfer times for these video and audio files can be very large, causing the latency required before playback begins to be unacceptably long. The application of streaming technology solves the problem of playback latency. By using streaming technology, audio and video are transferred across the network from a server to a client in response to the client's request in real-time. The client then plays the incoming multimedia stream in real-time as the data are received.

The streaming mechanism used in the system presented in this thesis is the IBM Bamba technology. It originates from a research effort that demonstrates the use of state-of-the-art compression technologies to allow audio and video data to be streamed efficiently from sender to receivers even through low bit-rate connections. Since its initial development, Bamba has been employed in various network broadcast events, including the on-line audio broadcasts for the 1996 Summer Olympic Games, the 1997 US Open, and the Chess Match between Deep Blue and Kasparov over the World Wide Web. The streaming technology provides a multimedia playback solution which extends the web browser so that it becomes capable of playing multimedia audio and video clips whilst they are still being downloaded [3]. With this technology, there is no need to wait for the entire clip to download before playing commences. It also monitors the download rate and compares this to the clip size. It starts playing as soon as it decides that it is possible to do so and to keep playing without any interruption even as the remaining clip data continues to download. This pre-buffering means some delay between the user's command to start the play process and the time play actually starts. However, this delay will be minimal when the clip rate is less than the available connection bandwidth; and even when the clip rate exceeds the connection speed, the delay is still less than waiting for the entire download to complete [3].
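The pre-buffering trade-off can be illustrated with a small calculation. The sketch below is not Bamba's actual decision rule, which the thesis does not specify; it assumes a constant clip bit rate and a constant download rate, and the function name `prebuffer_delay` and its parameters are hypothetical.

```python
def prebuffer_delay(clip_size_bits: float,
                    clip_rate_bps: float,
                    download_rate_bps: float) -> float:
    """Smallest start-up delay (seconds) that avoids playback stalls.

    Assumes a constant clip bit rate and a constant download rate; both are
    simplifications made for this illustration only.
    """
    if min(clip_size_bits, clip_rate_bps, download_rate_bps) <= 0:
        raise ValueError("all arguments must be positive")
    playback_time = clip_size_bits / clip_rate_bps
    download_time = clip_size_bits / download_rate_bps
    # Playback may finish no earlier than the download does, so the player
    # waits for the difference, or not at all if the link is fast enough.
    return max(0.0, download_time - playback_time)


# A 1,000,000-bit clip encoded at 20 kbit/s over a 10 kbit/s link:
# the full download takes 100 s, but playback can start after 50 s.
delay = prebuffer_delay(1_000_000, 20_000, 10_000)
```

Under these assumptions the delay is zero whenever the download rate is at least the clip rate, and it is always shorter than the full download time, which agrees with the behaviour described above.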

I.3 Real-Time Live Broadcast System over the Internet

This thesis is about the design of a real-time multimedia broadcast system ("the system") over the Internet. In the next chapter we give a general description of the Multicast technology, currently the most effective method of providing multipoint communication using multimedia data in real-time. We also present some limitations of the Multicast technology, as well as the motivations behind the design of this system.

In Chapter III, we state the specific issues that are being considered when designing this system, which include scalability, load balancing, adaptability, and fault tolerance. We show why it is necessary to address these issues, and use examples to illustrate the bad consequences that could result if the issues are not addressed properly. We also present some general approaches to handle these issues, which would be incorporated into the design of the system architecture and dynamics.

In Chapter IV, we introduce the overall system architecture. Generally speaking, the system consists of four main components: the control unit, the reflector, the multimedia source, and the client playback station. The roles of these components, as well as the interactions between them, are described in the chapter. Furthermore, each component is divided into different modules, with each module having different capabilities for performing various tasks.

Chapter V presents the system dynamics. This is a very important part of the thesis, as the operations described are crucial in enabling the system to adapt to various changes, such as the arrival and departure of different components. The control unit is the centerpiece for carrying out these operations, making use of its knowledge about the system to execute the necessary algorithms and instructing the other components on the actions to take. This chapter also describes the algorithms that are used in these operations. In particular, the approximation algorithm for the Weighted Constrained Steiner Tree problem, which is used for setting up a broadcast session, is described in great detail.

Concluding remarks are given in Chapter VI. This final chapter provides a summary of the work and discusses some of the possible extensions that can be incorporated into the system. It also reiterates some of the benefits and advantages of using the system, and proposes some guidelines for future work that can contribute to providing better and more scalable real-time multimedia broadcasting services over the network.

Chapter II

Background and Related Work

The demand for real-time multimedia streaming over the Internet has soared in recent years. In order to handle the increasing demand, tremendous efforts have been devoted to research in this area. Among all the mechanisms that have been devised, Multicast is currently the most commonly used method for distributing live multimedia streams over the network. Nevertheless, there are some limitations to the current Multicast architecture, which prompt the design of the system presented in this thesis. In this chapter, we discuss the Multicast technology and how it can be utilized for real-time multimedia streaming. Moreover, the limitations of Multicast and the rationales for designing the system in this thesis are described.

II.1 Real-time Multimedia Distribution using Multicast

Multicast is the most efficient method for distributing real-time multimedia streams over the network. It is a communication mechanism that allows applications to efficiently transmit data to a set of receivers that are members of a designated group in real-time. With Multicast, applications send one copy of information to a group address, reaching all recipients who want to receive it [4, 5]. Otherwise, the transmission of data to multiple recipients is performed either by using multiple Unicast connections, where information must be carried over the network multiple times, one time for each recipient; or by Broadcasting the information to everyone on the network. The comparison of using Unicast, Multicast, and Broadcast for multipoint communication is shown in Figure II.1.

In (a), Unicast is employed, and a separate connection has to be established from the source to each participating client. As a result, redundant data are sent over the network, wasting resources and bandwidth. In (b), Broadcast is used, and the information is transmitted to clients even if they are not interested in obtaining the data. With Multicast, the source only sends one copy of information, as shown in (c), and the information is delivered only to the participating clients. Also, data are replicated at the routers only when necessary, which conserves network resources. Thus, it is apparent that Multicast provides a scalable solution to multipoint communication, especially for distributed applications which require demanding real-time transmission such as live streaming of

II.2 Mechanism of IP Multicast

IP Multicast is a one-to-many transmission that builds on the standard IP network level protocol. Generally speaking, IP Multicasting is the transmission of IP datagrams to a host group, a set of zero or more members identified by a single IP destination address [4]. No restriction exists on the location or number of members in a host group; any user may be a member of more than one group at the same time [6].

Figure II.1: Comparison of (a) Unicast, (b) Broadcast, and (c) Multicast Models

The Class D IP addresses, which range from 224.0.0.0 to 239.255.255.255, are used to specify the Multicast host groups. Similar to Unicast, in which clients use the server's Unicast IP address to establish their individual connections with the server, Multicast users can join a session by specifying the Multicast IP address from which they want to receive packets [7]. Meanwhile, the source for the session does not need to know who and where the destinations are, since routers with Multicast support are responsible for delivering the information to all the members of the destination host group. At the Multicast routers, the data packets are replicated and forwarded to the group members. All these IP Multicast routers support the Internet Group Management Protocol (IGMP), a protocol devised to allow the routers to learn about the existence of host group members on their directly attached subnets. The routers also support Multicast routing protocols such as the Distance Vector Multicast Routing Protocol (DVMRP), Multicast Open Shortest Path First (MOSPF), Protocol-Independent Multicast - Dense Mode (PIM-DM), Core Based Trees (CBT) and Protocol-Independent Multicast - Sparse Mode (PIM-SM) [8, 9]. The details of these routing protocols are not provided in this thesis; readers interested in knowing more about Multicast routing mechanisms and protocols are encouraged to look up the references for further details.
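As a small illustration of the Class D range quoted above, the following sketch checks whether an IPv4 address denotes a Multicast host group. It uses Python's standard `ipaddress` module; the helper name `is_multicast_group` is hypothetical and is not part of the system described in this thesis.

```python
import ipaddress

# Class D block: 224.0.0.0 through 239.255.255.255, i.e. 224.0.0.0/4.
CLASS_D = ipaddress.ip_network("224.0.0.0/4")


def is_multicast_group(address: str) -> bool:
    """Return True if `address` is an IPv4 Class D (Multicast) address."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and ip in CLASS_D


in_range = is_multicast_group("224.0.0.0")       # lower bound of Class D
out_of_range = is_multicast_group("240.0.0.0")   # just above the range
```

The explicit network constant makes the range visible; the standard library's `is_multicast` property on IPv4 addresses performs the same membership test.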

II.3 Limitations of Multicast

Even though Multicast provides an efficient method for distributing multimedia data over the network in real-time, there are some limitations with the current approach. One of them is the accessibility of Multicast. Currently, all Internet users must connect to the MBONE to participate in Multicast sessions. While the MBONE has extensive coverage and is growing in size, its scale and popularity are still quite limited as compared to the whole Internet, where the majority of the connections are still Unicast. Also, Multicast supports only UDP, but in cases where the users are using computers behind firewalls, TCP connections are necessary since they can easily traverse firewalls while maintaining reasonable quality.

Furthermore, while Multicast handles many problems associated with using Unicast connections, the adaptability of the mechanism is constrained by its inability to alter the audio and video format during the transmission process. As described previously, once the multimedia stream is sent from the source, the "mrouters" are responsible for distributing and replicating the data stream to their destinations; the source has no control over the transfer process. Thus, even in cases when it is advantageous to perform format conversions of the audio and video data during the streaming process (e.g., reducing the bit rate of the audio stream due to data congestion, or increasing the video frame rate when sufficient bandwidth is available for the connection), the conversion cannot be achieved without the necessary control mechanism at the routers.

Existing video distribution models, such as CU-SeeMe and Vosaic, provide scalable group communications over a network without Multicast support [10, 11]. These models make use of reflectors that manage the distribution of audio and video to the users. After a reflector receives the audio and video data, it replicates the data, then distributes, or reflects, the data to the multiple receivers that are connected to it. However, these systems do not have centralized controls. As a result, the workload can be highly imbalanced across the reflectors, and the resources that are available may not be used efficiently. Moreover, clients in these systems have to decide which reflector to connect to without any knowledge of the status of the reflectors. Hence, a client could actually be connecting to a reflector that is already overloaded instead of to the ones that are relatively idle.

These limitations provide the motivations for designing this system. Until the scale of the MBONE, the backbone for multicast, is comparable to the whole Internet, with Multicast users representing a considerable fraction of all network users, a system which provides scalable real-time multimedia distribution over existing point-to-point network is highly desirable. Even if Multicast really becomes widely popular in the future, the various functions incorporated in the design of the system, such as its ability to alter the multimedia content dynamically in response to the different client demands and changing network conditions, are still very appealing in many situations.

Chapter III

System Design Issues

In this chapter, we discuss some of the important issues that are considered in designing this system. These issues include scalability, load balancing, fault tolerance, and adaptability. All these factors play very important roles in shaping the design of the system. An increase in scalability enables the system to manage a large number of clients participating in multiple sessions simultaneously. In order to achieve good performance, the system should also be able to maintain a good balance of workload across the servers and to deal with adverse situations like server failures. Moreover, the system should be able to meet different clients' demands and react quickly to changing network conditions. Failure to address any one of these issues can bottleneck the performance of the whole system. For example, inability to balance the workload across the servers can lower the scalability of the system, as clients can be "blocked" if they try to connect to overloaded servers, and the probability of failure for overloaded servers also increases.

III.1 Scalability

One of the primary objectives for designing this distributed system is to provide a scalable solution for live multimedia broadcast over the Internet. An increase in the scalability of an application allows more clients to be served by the application simultaneously [12]. For multimedia streaming over the network, achieving better scalability means increasing the size of the audience that is allowed to join a particular broadcast to receive audio and video data. Not only does an increase in scalability allow more new clients to be admitted, it also allows better quality of service to be provided for existing clients, as servers that are delivering the multimedia streams are less likely to be overloaded. This is very important since multimedia streaming is a form of sustained and continuous data transfer that also has a minimum bandwidth requirement. We want to design a distributed system that can

One method to increase scalability in a distributed environment is to have multiple servers participating in a broadcast at the same time. If multiple servers are present, the issue that needs to be addressed is how to organize these servers so that the resources can be used in the most efficient manner. Arranging the servers using a multiple-tier architecture can enhance the scalability of a broadcast in a distributed environment. With the multiple-tier architecture, servers participating in the broadcast are cascaded in multiple layers in order to increase the size of the session.

To illustrate the advantages of using the multiple-tier architecture, consider a situation where a large number of servers are available for a multimedia broadcast. Without loss of generality, assume that each of these servers is capable of handling an audience size of n, with n being a positive integer larger than one. One of these servers is the source for the broadcast, which creates the audio and video stream; the rest are auxiliary servers that replicate the stream to their receivers. If cascading is not allowed, all the clients must connect directly to the source, and the session can only have a maximum audience size of n; all the available resources from the auxiliary servers are wasted.

Continuing with the example, if one layer of cascading is allowed, the session size can be increased, and the maximum audience size the session can now support is $n^2$. This is accomplished by having n auxiliary servers connecting to the source, increasing the maximum number of clients allowed in the session at any given moment to $n \times n = n^2$.

For this example, the general formula for the maximum number of clients allowed in the session with respect to the number of layers of auxiliary servers used is given by:

$$m = n^{l+1}$$

where

m is the maximum number of total clients,

n is the maximum number of receivers for each server, and

l is the number of layers of auxiliary servers allowed for the session tree.
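The audience-size relation for the cascading example, m = n^(l+1) with l layers of auxiliary servers, can be checked numerically. The sketch below is only an illustration of that idealized formula; the function name `max_audience` is hypothetical, and it assumes every server is filled to its capacity n.

```python
def max_audience(n: int, layers: int) -> int:
    """Maximum number of clients when each server handles n receivers
    and `layers` layers of auxiliary servers are cascaded below the source."""
    if n <= 1:
        raise ValueError("n must be an integer larger than one")
    if layers < 0:
        raise ValueError("layers must be non-negative")
    # Each additional layer multiplies the reachable audience by n.
    return n ** (layers + 1)


# With n = 3: no cascading, one layer, and two layers of auxiliary servers.
sizes = [max_audience(3, layers) for layers in range(3)]
```

With n = 3, the three cases give audiences of 3, 9, and 27 clients respectively, matching the n and n squared cases worked out in the text.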

Figure III.1 illustrates the advantages of using the multiple-tier architecture in establishing a session. Each server in Figure III.1 can manage a maximum of three receivers. In (a), when only one level of cascading is allowed, only three auxiliary servers can be used to scale the broadcast, and the resources available from the rest of the servers (shown in dotted circles) are wasted. In (b), two levels of cascading are allowed. In this case, more auxiliary servers can be used for the broadcast, which allows more clients to participate in the broadcast simultaneously.

Figure III.1: Multiple-Tier Architecture. In this example, each server can serve a maximum of 3 receivers. (a) Only one level of cascading is allowed. (b) Two levels of cascading are allowed. Legend: source (master server), server in use, potential server, multimedia stream, participating clients (maximum of 3 in this example); servers are arranged in Layer 0, Layer 1, and Layer 2.

Given that the multiple-tier architecture, in which the servers are organized into a tree structure, provides the appropriate framework for enhancing scalability of the system, the next issue is how to construct the tree for the broadcast session in a cost-effective manner. The cost of a session tree is the sum of the costs of using the servers in the tree, together with the cost associated with establishing the connections between the servers. Minimizing the cost of the session tree allows for more efficient multimedia streaming and, hence, better quality of service to clients. Thus, it is an important element to consider when devising algorithms for the various system operations, as will be described in the later chapters.
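The session tree cost defined in this section, the cost of the servers used plus the cost of the connections between them, can be written down directly. The sketch below is a minimal illustration; the data layout (dictionaries keyed by server name and by parent-child pair), the function name `session_tree_cost`, and the numeric costs are all assumptions made for the example, not the representation used later in the thesis.

```python
def session_tree_cost(root, server_costs, link_costs, tree_edges):
    """Cost of a session tree: server costs plus connection costs.

    root         -- the source (master server) of the session
    server_costs -- dict mapping server name to the cost of using it
    link_costs   -- dict mapping (parent, child) to the connection cost
    tree_edges   -- list of (parent, child) pairs forming the tree
    """
    # Collect every server that appears in the tree, including the root.
    servers = {root}
    for parent, child in tree_edges:
        servers.update((parent, child))
    server_total = sum(server_costs[s] for s in servers)
    link_total = sum(link_costs[edge] for edge in tree_edges)
    return server_total + link_total


# Hypothetical costs for a source S with two auxiliary servers A and B.
costs = {"S": 5, "A": 2, "B": 3}
links = {("S", "A"): 1, ("S", "B"): 4}
total = session_tree_cost("S", costs, links, [("S", "A"), ("S", "B")])
```

Here the servers contribute 10 and the two connections contribute 5, so the tree costs 15. Comparing this value across candidate trees is what a cost-minimizing construction would do.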

III.2 Load Balancing

Another issue related to scalability is load balancing. Load balancing is a technique to improve the system performance by striving to equalize the system workload among all the participating servers [14]. The load balancing issue can also be viewed as a resource allocation problem, and the objective of incorporating a load-balancing scheme into the system design is to provide a method that fairly allocates the resources that are available among the servers.

Resources could be wasted if the system does not have any load balancing mechanisms. Examples of potential inefficiencies are wasting of bandwidth, overloading of servers, and blocking of clients, all of which have adverse effects on system performance [15]. As an example, imagine a system containing two servers, both of them providing identical service and both capable of handling the same number of clients. Assume also that the overall quality of service a server can provide to clients is inversely proportional to the number of connected clients. In this situation, it is always optimal to have the two servers handling the same number of clients at any given time, which can be achieved by having client admission control that balances the workload of the two servers. Without any load balancing mechanisms, it is possible that a large number of clients will connect to the same server, leaving only a small number of clients connecting to the other server. This will obviously lead to a decrease in the quality of service provided by the system, as the resources of the idle server are wasted. Clients can also be "blocked" (i.e., rejected by the system) if they attempt to connect to a server that is already overloaded. With a load balancing mechanism, the occurrences of these undesirable events are avoided, as the system workload is evenly spread across the servers.

For this sample scenario, one solution is to have a mechanism that assigns each incoming client to the server that has fewer clients. Furthermore, if migration of clients between servers is allowed, perfect load balancing can be achieved by constantly moving clients from the server with more clients to the one with fewer clients (see Figure III.2).
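The admission rule for the two-server scenario, sending each incoming client to the server with fewer clients, can be sketched as follows. This is an illustration only: the function name `assign_client` and the dictionary of client counts are hypothetical, and the sketch assumes identical servers, as in the example.

```python
def assign_client(client_counts: dict) -> str:
    """Admit one client to the server currently holding the fewest clients.

    `client_counts` maps server name to its current number of clients and
    is updated in place. Ties go to the server listed first.
    """
    server = min(client_counts, key=client_counts.get)
    client_counts[server] += 1
    return server


# Two identical servers, initially unbalanced.
counts = {"A": 3, "B": 1}
choices = [assign_client(counts) for _ in range(3)]
```

Starting from three clients on A and one on B, the first two arrivals go to B, and the third arrival, which finds a tie, goes to A, leaving four clients on A and three on B.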

Nevertheless, this is just a simple example demonstrating the benefits of having a load-balancing mechanism for a system with distributed resources. Most applications have more complicated architectures. For many applications, servers usually have different capabilities and processing power, and the maximum number of clients each server can handle varies. Also, in some cases, migration of clients may not be practical, as there is overhead cost associated with moving a client from one server to another. This is especially true for our case of real-time multimedia streaming, where clients need to connect to a server for a sustained period of time to obtain live audio and video data. Migration of a client will lead to disruption of audio and video streaming, as the client needs to terminate its original connection and establish a new one with another server.

Figure III.2: Example showing how migration of clients is used for load-balancing. (a) System is in the load-balanced state, with each server managing the same number of clients. (b) The number of clients each server is handling can be different as clients enter and leave the system. (c) Clients are migrated from the server which is serving a larger number of clients to the other server, allowing the system to go back to the load-balanced state.

One way to measure how well the workload is distributed in a system is to examine how the utilization rate across the servers varies. The utilization rate for a server is the percentage of maximum capacity that is used, which can also be thought of as the ratio of the number of clients that the server is actually handling to the maximum number of clients the server can manage. The variance of the servers' utilization rates in the system is given by the following formula:

$$\frac{\sum_{i=1}^{k} \left[ \frac{n_i}{m_i} - \frac{1}{k} \sum_{j=1}^{k} \frac{n_j}{m_j} \right]^2}{k-1}$$

for i, j = 1, 2, ..., k, where

$n_i$ is the number of clients server i is actually serving,

$m_i$ is the maximum number of clients server i can serve, and

k is the number of servers in the system.

A small variance in the utilization rates of the servers suggests that the workload is well balanced in the system. Conversely, a large variance suggests that the workload is not evenly distributed across the servers. As for each individual server, we examine how its utilization rate compares to those of the other servers in the system. When new clients arrive, we want them to be directed to the servers that have low utilization rates so that the workload in the system can be balanced.
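The utilization-rate variance can be computed directly from the per-server pairs of current and maximum client counts. The sketch below follows the formula with the k - 1 denominator; the function names and the list-of-pairs input are assumptions made for illustration, and `least_utilized` is one simple way to pick the server to which a new client could be directed.

```python
def utilization_variance(servers):
    """Variance of utilization rates n_i / m_i over k servers.

    `servers` is a list of (n_i, m_i) pairs: clients currently served and
    maximum clients the server can serve. Uses the k - 1 denominator.
    """
    k = len(servers)
    if k < 2:
        raise ValueError("at least two servers are required")
    rates = [n / m for n, m in servers]
    mean = sum(rates) / k
    return sum((rate - mean) ** 2 for rate in rates) / (k - 1)


def least_utilized(servers):
    """Index of the server with the lowest utilization rate."""
    return min(range(len(servers)), key=lambda i: servers[i][0] / servers[i][1])


balanced = utilization_variance([(5, 10), (5, 10)])      # equal rates
unbalanced = utilization_variance([(2, 10), (8, 10)])    # rates 0.2 and 0.8
target = least_utilized([(2, 10), (8, 10), (1, 20)])     # rates 0.2, 0.8, 0.05
```

The balanced pair has zero variance, while the unbalanced pair has a variance of about 0.18; in the last call the third server has the lowest utilization rate and would receive the next client.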

III.3 Fault Tolerance

Most operations in a distributed system involve interactions among the system components. Hence, failure of a single system component can affect the whole system and can lead to disastrous consequences if the failure is not handled appropriately [16]. Failure refers to any deviation from the normal state that can cause disruption of service in the system. For example, in our case of multimedia streaming, failure arises when the service time of a server becomes unacceptably long for real-time audio and video data delivery, or when a connection between two components becomes so congested that streaming, which has a minimum bandwidth requirement, is no longer achievable. Inability to handle these failures properly can lead to undesirable consequences. For example, consider an application which employs the multiple-tier architecture, with servers cascaded in order to scale up a broadcast. When one of the servers in the application fails, not only are the clients that are directly connected to that server affected; all its other dependencies, including lower-level servers that are connected to it and their respective clients, are affected as well (see Figure III.3).

Figure III.3: Illustration of the effects of server failure. Without a fault tolerance scheme, when a server fails, the clients that are connecting directly to it (set A) will not be able to continue with the streaming process. Moreover, other dependencies, including the other servers that are connected to it and the clients that they are serving, are affected as well. (Key: servers, failed server, streaming connections, affected clients, receivers.)

To address this issue, failures must first be detected and located. This is achieved by having a monitoring process running at the receiving end of each connection. The module that is responsible for the monitoring process examines parameters such as the throughput and delay for that particular connection, and flags a possible failure if the bit rate either decreases suddenly or drops below the level required for multimedia streaming.

The next step is to have the affected receiver establish a new connection with another server, since the original connection is no longer able to provide the desired quality of service. Once the receiver connects to the new server that is assigned to it as an alternative, it can continue to obtain the data that it wants (see Figure III.4). The process of assigning alternate servers to receivers is described later in the thesis.
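The two steps above, detection at the receiving end followed by a switch to the alternate server, can be sketched as follows. This is a minimal illustration only: the class name, the 50% threshold for a "sudden" decrease, and the use of throughput samples in kbit/s are assumptions, and the sketch does not monitor the new connection after the switch.

```python
class ConnectionMonitor:
    """Receiver-side check of one connection's throughput samples (kbit/s)."""

    def __init__(self, min_rate, max_drop=0.5):
        self.min_rate = min_rate    # assumed minimum rate needed for streaming
        self.max_drop = max_drop    # assumed fractional loss that counts as sudden
        self.previous = None

    def failed(self, rate):
        """Return True if this sample suggests the connection has failed."""
        sudden = (self.previous is not None
                  and rate < self.previous * (1 - self.max_drop))
        self.previous = rate
        return sudden or rate < self.min_rate


def receive(samples, primary, alternate, min_rate):
    """Return the server in use after each sample, switching once on failure."""
    monitor = ConnectionMonitor(min_rate)
    current, history = primary, []
    for rate in samples:
        if current == primary and monitor.failed(rate):
            current = alternate     # reconnect to the assigned alternate server
        history.append(current)
    return history
```

For example, with a minimum rate of 128 kbit/s, the samples 300, 290, 100, 280 cause a switch at the third sample, because the rate falls by more than half relative to the previous sample.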

III.4 Adaptability

One of the key issues considered in designing this real-time multimedia broadcast system is to provide adaptability so that the system can meet different clients' demands and react to changes in network conditions. In fact, one of the functions Multicast lacks is the ability to make changes to the audio and video format in an adaptive manner, as all the activities for Multicast take place at the routers. As mentioned in the previous chapter, with Multicast the data that are delivered to all members in the host group are identical; the format of the data cannot be changed once they are sent from the source. This presents problems when one tries to establish a session for broadcast to multiple clients, since a single audio and video format cannot meet all the clients' demands. To address this issue, the system is designed such that the multimedia format can be changed dynamically depending on different conditions. For audio, the bit rate can be altered by changing the compression ratio. For video, parameters such as the video resolution and the frame rate can also be adjusted.

Figure III.4: Illustration of how the fault tolerance mechanism works. The key for this figure is the same as for Figure III.3, and the dotted lines show the connections that are redirected. When a server fails, the components that were originally connected to it establish new connections with other servers in the session tree so that all the affected components can continue to obtain the audio and video streams.

One of the benefits of being able to make these adjustments is the ability to satisfy different clients' demands. Given a connection with a certain bandwidth for transmission, one can make tradeoffs between different parameters. For example, a user can obtain a video clip that has higher resolution with a lower frame rate, or a video clip that has lower resolution but a higher frame rate. Tradeoffs like this are determined according to the preferences of the users, but to some extent they also depend on the actual content of the audio and video data. For instance, in a situation where the video clip contains many detailed objects and these objects are relatively static, it would be desirable to sacrifice the frame rate for better resolution. Conversely, in a situation where the details of the objects in the video clip are not really important, but the objects are dynamic, it would be desirable to have a better frame rate, with each frame having a lower resolution.

With the capability of altering the audio and video format, the system can also make adjustments in response to changing network conditions. When the bandwidth that is available for a connection decreases, the data that are being transmitted through that connection can be further compressed to continue the streaming process. On the other hand, connections with large bandwidth can be used to transmit audio and video with higher quality. The process works as follows. If the receiver detects changes in the level of throughput for the connection, it can provide feedback to the server, which can vary the compression ratio in order to satisfy the bandwidth constraints (see Figure III.4, Feedback Process). These adaptive measures allow the system to react to changing network conditions and to function properly for a sustained period of time.
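The feedback loop described above can be sketched in a few lines. The list of available bit rates, the headroom factor, and the class and function names are all hypothetical; the sketch only shows a sender choosing a compression level from the throughput a receiver reports.

```python
BIT_RATES = [64, 128, 256, 512]   # hypothetical encodings available, in kbit/s


def choose_bit_rate(measured_throughput, headroom=0.8):
    """Highest available bit rate that fits within a fraction of the throughput."""
    budget = measured_throughput * headroom
    fitting = [rate for rate in BIT_RATES if rate <= budget]
    # Fall back to the lowest rate when even that exceeds the budget.
    return max(fitting) if fitting else min(BIT_RATES)


class Sender:
    """Sender that adjusts its output bit rate from receiver feedback."""

    def __init__(self):
        self.bit_rate = max(BIT_RATES)

    def on_feedback(self, measured_throughput):
        """Handle a receiver feedback message and return the new bit rate."""
        self.bit_rate = choose_bit_rate(measured_throughput)
        return self.bit_rate
```

With these assumed values, a receiver reporting 200 kbit/s would cause the sender to drop to the 128 kbit/s encoding, and a later report of 1000 kbit/s would let it return to 512 kbit/s.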

Figure III.4. Feedback Process. (a) Audio and video data requiring large bandwidth are delivered to the clients. (b) The receiver senses that the bandwidth of the connection is not sufficient for the data to arrive on time. The receiver sends a feedback message to the sender, asking for audio and video data that consume less bandwidth. (c) The sender compresses the audio and video data so that they can be delivered to the receiver on time through the low bandwidth connection.

Finally, the system must also be able to adapt to changes in the availability of resources. During the time when the system is in operation, servers enter and exit dynamically, causing changes in the availability of the system resources. For example, when a server joins a system, the resources increase as the new server can allow the system to handle more clients simultaneously. On the other hand, when a server leaves the system, the system must be able to redistribute the workload to the remaining servers. The more the system is able to make efficient use of its available resources, the more capable the

Chapter IV

System Architecture

The system designed in this thesis provides real-time live broadcast over the Internet. Some important issues that need to be considered for providing this kind of service were discussed in the previous chapter. The system addresses these issues using its various components, each of which has its own features and functions that contribute to the operations of the system. The system is distributed, meaning that the components are located on different machines and communicate with each other through network connections. The overall system architecture is depicted in Figure IV.1. The system consists of four main components: the multimedia source, the reflector, the client playback station, and the control unit. Each component is further separated into various modules for handling specific tasks. Detailed descriptions of these components are provided in this chapter.

[Figure IV.1: overall system architecture, showing multimedia sources, reflectors, and client playback stations.]

IV.1 Graph Analogy

This system can be made analogous to a graph, with the nodes of the graph representing the system components (except the control unit); and the edges representing the connections between them. For a multimedia session, the audio and video data are created at the multimedia source, then propagate through the layers of reflectors, and finally reach the client playback stations. Since the streaming process is unidirectional, no cyclical connections are involved, and each session can be viewed as a tree in the graph. The multimedia source is where the actual audio/video data are generated for a session. As a result, it is the root node of a tree. Using a similar analogy, reflectors are intermediate nodes, and clients are the leaf nodes for a session tree.

These analogies provide us with very useful insights as we devise algorithms for performing various system operations. In addition, to execute these algorithms we need to assign weights to the graph nodes and edges. Intuitively, the weights for the graph nodes represent the costs of using the components, and the weights for the graph edges represent the costs of using connections between components. A summary of the analogy is provided in Table IV.1:

Architecture                             Graph Analogy
System components                        Graph nodes
Connections among components             Graph edges
Session components                       Tree structure
    Source                               Root node of session tree
    Reflector                            Intermediate node of session tree
    Client                               Leaf node of session tree
Heuristics
    Cost of using a source or reflector  Node weights of graph
    Cost of using a connection           Edge weights of graph

Table IV.1: Graph Analogy of the Broadcast System
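To make the analogy concrete, the sketch below represents one session tree with weighted nodes and edges. The `Node` class, its field names, and the weights are invented for illustration, and the cost is assumed here to be the plain sum of node and edge weights; the thesis defines its own cost model together with the tree construction algorithms in Chapter V.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str                   # "source", "reflector", or "client"
    weight: float = 0.0         # cost of using this component
    children: list = field(default_factory=list)   # (edge_cost, Node) pairs

    def attach(self, child, edge_cost):
        """Connect a child below this node and return the child."""
        self.children.append((edge_cost, child))
        return child


def tree_cost(node):
    """Sum of node weights and edge weights in the subtree rooted at node."""
    return node.weight + sum(cost + tree_cost(child)
                             for cost, child in node.children)


def leaves(node):
    """Names of the leaf nodes (the clients in a complete session tree)."""
    if not node.children:
        return [node.name]
    return [name for _, child in node.children for name in leaves(child)]


# Hypothetical session: one source, one reflector, two clients.
source = Node("source", "source", weight=2)
r1 = source.attach(Node("reflector-1", "reflector", weight=1), edge_cost=4)
r1.attach(Node("client-a", "client"), edge_cost=3)
r1.attach(Node("client-b", "client"), edge_cost=5)
```

Because every node except the root has exactly one parent, the structure is a tree by construction, which matches the observation that streaming is unidirectional and involves no cycles.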

IV.2 System Components

IV.2.1 Multimedia Source

The multimedia source is responsible for creating and providing the audio and video data for a session. As mentioned previously, the multimedia source acts as the root node for a session tree, sending the audio and video data to layers of reflectors (intermediate nodes), and ultimately to the client playback stations (leaf nodes). The multimedia source consists of two modules: the Audio and Video Generation Module and the System Connection Module.


* Audio and Video Generation Module

This module is responsible for generating the audio and video data for the session. Audio and video are first converted from analog inputs to digital form. Afterwards, encoding is performed and the audio and video data are compressed so that they can be streamed efficiently to receivers. The module then delivers the packetized multimedia stream to the System Connection Module.

* System Connection Module

The System Connection Module serves as a middleman that transfers the multimedia data created by the Audio and Video Generation Module to the reflectors in the system. Besides delivering the multimedia stream, the System Connection Module also handles all the communications with the other system components. For example, when the source first joins the system, this module is responsible for notifying the control unit of its entrance. Similarly, when the multimedia source decides to leave the system, it will send a notification message to the control unit.

The separation of the multimedia source into the two modules described above has some advantages. The separation provides more flexibility for implementing the multimedia source, since it allows the Audio and Video Generation Module and the System Connection Module to be coded in different languages. This is beneficial in some situations, since programming languages like Java, which is used for developing the prototype, currently lack methods for audio and video capture. Moreover, the two modules can reside on different servers and communicate via a network connection. This allows the audio and video data to be generated on a computer that is equipped with the appropriate audio and video capturing tools, while the transmission process is managed by a server that is more scalable and capable of handling multiple users.

IV.2.2 Reflector

The reflector allows sessions in the system to be served to multiple clients. It is the most important component in the system, since it manages the distribution of audio and video data to the clients. Reflectors can be cascaded to scale and handle increased demand. The reflector resource is consumed on demand, but upper limits can be set for the number of connections per session as well as for the total number of connections per reflector. Moreover, a reflector in this system is capable of broadcasting multiple sessions at the same time. For each session, the reflector replicates the multimedia data and distributes them to the receivers that are connected to it. Like the multimedia source, it also has a Communication Module that allows it to send and receive information to and from the control unit.

A reflector can operate in either the "active" or the "passive" mode. For a reflector in the active mode, the party responsible for the reflector administration can decide whether to participate in a session or not. The reflector then notifies the control unit; and in the case when the reflector decides to participate in the session, the control unit determines which server is the best for the reflector to connect to for obtaining the audio and video data.

On the other hand, a reflector in the "passive" mode yields the selection decision to the control unit. Given the reflector's available resources, the control unit determines whether it is beneficial for the reflector to participate in a session or not. A passive reflector can be activated by the control unit to join a broadcast under two circumstances. The first one is during the initial setup process of the session tree. As shown in Figure IV.2, a session tree can have a lower cost if passive reflectors are incorporated into the broadcast. The second occasion is when the number of clients for a particular session reaches the maximum level such that the addition of new reflectors is necessary to handle the additional workload. In this case, the control unit starts up a passive reflector for that particular session.

The reflector can be divided into three main components: the Distribution Module, the Filtering Module, and the Communication Module.

Figure IV.2: Example showing the benefit of incorporating a passive reflector. (a) The cost of the tree is 15 without using the passive reflector. (b) With the passive reflector, the cost of the tree is reduced to 13.

* Distribution Module

This module is used to distribute the data to multiple clients. It maintains circular buffers of the most recent several seconds of the audio and video data, then creates a new copy and opens a TCP connection for each receiver. The TCP/IP approach allows the connections to easily traverse firewalls while maintaining high quality. The Distribution Module can also be extended to support Multicast when it is connected to networks with Multicast capability. For example, point-to-point TCP connections may be established between reflectors through firewall boundaries that separate Intranets from the global Internet. Within the Intranets the reflector may establish multicast connections to local client playback stations. (See Figure IV.3)
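The buffering and replication behaviour described above can be illustrated with a small in-memory sketch. The class and method names are hypothetical, plain queues stand in for the per-receiver TCP connections, and the choice to start a late joiner from the buffered packets is an assumption of this example rather than something the thesis specifies.

```python
from collections import deque


class DistributionBuffer:
    """Keeps the most recent packets and hands each receiver its own copy."""

    def __init__(self, capacity):
        self.recent = deque(maxlen=capacity)   # circular buffer of recent packets
        self.receivers = {}                    # receiver id -> outgoing queue

    def add_receiver(self, receiver_id):
        """Register a receiver, seeding it with whatever is still buffered."""
        self.receivers[receiver_id] = deque(self.recent)

    def publish(self, packet):
        """Store a packet and replicate it to every connected receiver."""
        self.recent.append(packet)             # oldest packet is dropped when full
        for queue in self.receivers.values():
            queue.append(bytes(packet))        # one copy per receiver
```

In a real reflector each outgoing queue would be drained by a socket writer; the sketch only shows that the circular buffer bounds memory use while each receiver still gets an independent copy of the stream.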

Figure IV.3: Example showing the use of TCP/IP connections over the global Internet and Multicast within Intranets when available.

* Filtering Module

The Filtering Module is responsible for all the customized features that are built for the reflector. For example, it can convert streams from high to low bit rates to satisfy different network and client playback station capabilities (see Figure IV.4). If the clients are thin, meaning that they do not have much processing power, but are on high bit-rate connections, the reflectors can perform the decoding task (which demands a lot of computational power) using the filters in this module and send out the decoded data to the clients. Conversely, it would be appropriate for the system to deliver compressed audio/video streams to clients that are on low bit-rate connections but possess sufficient computational power to perform the actual decoding.

Figure IV.4: Filtering Module of the Reflector. Conversions of bit-rate compression rates can be performed in the Filtering Module to satisfy different network and client playback station capabilities.

Similarly, the video frame rates and the audio bit rates can be dynamically adjusted in reaction to changes in the network conditions and upon the requests of the users.
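The per-client decision made by the Filtering Module might look like the following sketch. The function name, its parameters, and the three output labels are assumptions introduced for illustration; the thesis describes the cases in prose only.

```python
def select_output(client_is_thin, link_kbps, compressed_kbps, decoded_kbps):
    """Pick what a reflector sends to one client.

    All parameters are illustrative: link_kbps is the client's connection
    rate, and compressed_kbps and decoded_kbps are the rates needed to carry
    the compressed and the decoded stream respectively.
    """
    if client_is_thin and link_kbps >= decoded_kbps:
        return "decoded"        # reflector decodes on the client's behalf
    if link_kbps >= compressed_kbps:
        return "compressed"     # client decodes the stream itself
    return "recompressed"       # convert to a lower bit rate first
```

A thin client on a fast link receives decoded data, a capable client on a slower link receives the compressed stream, and a client whose link cannot carry the compressed stream receives a version converted to a lower bit rate.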

* Communication Module

This module handles all the communications between the reflector and other system components. When the reflector first joins the system, it handles all the initiation and setup procedures by communicating with the control unit. When the reflector is in operation, the module continues to communicate with other system components and updates them on the status of the reflector by providing them with information such as the number of sessions and receivers that are being managed by the reflector. When the reflector exits, the communication module notifies the control unit, and announces the departure of the reflector to its immediate dependencies so that they can be redirected to alternate servers.

IV.2.3 Client Playback Station

The client playback station represents the termination point of the streaming process, where the audio and video data reach the users. The client playback station performs decoding and rendering processes on the data that have already been received, and at the same time continues to obtain new data from the reflector. It can join an ongoing session at any point in the transmission.

Unlike most current applications that are used for audio and video streaming over the Internet, the client playback station is downloaded on-demand, meaning that it is obtained only when necessary. This approach is different from most conventional architectures, which require a user to download the application and store it locally in advance. Our approach has several advantages. First of all, there is no longer any installation required for the user, as the audio and video player is automatically retrieved from the server. This greatly reduces the time a user needs to spend on application maintenance, and also avoids potential problems of incompatibility between the multimedia data and the application.

* Feedback and Communication Module

The Feedback and Communication Module handles all the necessary communications between the client playback station and other system components. For example, when it realizes that the reflector the client playback station connects to is leaving the system, it will initialize a new connection with another reflector so that the client playback station can continue to obtain the audio and video data. Also, the module can provide the reflector with feedback on information such as the data throughput, allowing the reflector to make appropriate adjustments on the multimedia data using its Filtering Module when necessary.

* Rendering and Decoding Module

This module performs decoding and rendering on the audio and video data. The operations are performed in real-time, meaning that the module does not require the whole multimedia clip to be downloaded to the client before playing the clip to the user.

IV.2.4 Control Unit

The control unit is the "brain" of the system and determines how the resources are allocated across the components. It gathers relevant information from the system components, uses the information in executing algorithms for different system operations, instructs the system components on the necessary actions to take to carry out these operations, and also handles the client admission process. The control unit comprises several modules, including the Algorithm Module, the Information Management Module, the Client Admission Module, and the Communication Module.

* Algorithm Module

The Algorithm Module is responsible for the execution of all algorithms that are necessary for the system operations. This module is utilized whenever some components enter or leave the system. For example, when a multimedia source arrives at the system, a new session needs to be established so that multiple recipients are able to join the broadcast simultaneously. In this case, the Algorithm Module has to determine how the reflectors should be connected in the session tree. The Algorithm Module is also used in other situations. As mentioned before, when a receiver determines that its original connection can no longer be used to deliver the multimedia data, it will switch to an alternate server, and the address of that alternate server is determined by the Algorithm Module. Obviously, in order to execute the algorithms the module needs to have some knowledge about the status of the system and its components, and the module obtains the necessary information from the Information Management Module.

* Information Management Module

The Information Management Module maintains all the information about the system that is used for the execution of various algorithms. First of all, the module contains general information on the system such as the number of sessions, active and passive reflectors, as well as clients in the system at any given point in time. Moreover, the module contains information for each specific system component. As an example, for a reflector, the module has information such as the number of sessions it is participating in, the size of the audience it is serving for each session, the actual number of connections, and the maximum number of total connections the reflector can manage. It also has information about the characteristics of the connections between components, including the cost of using a specific connection. Finally, it has information about each session, such as the total size of the audience and the number of reflectors that are participating in that session.

One of the most important functions of the Information Management Module is to serve as a middleman between the Communication Module and the Algorithm Module of the control unit. In situations when no processing of information is required, the Information Management Module simply relays the information it gathers from the system components to the Algorithm Module; otherwise, it processes the information so that it is in a usable form for the Algorithm Module (see Figure IV.5).

Figure IV.5: Information Management Module of the Control Unit. Illustration of the Information Management Module as the Middleman between the Communication Module and the Algorithm Module in the Control Unit.


* Client Admission Module

As the name suggests, the Client Admission Module is responsible for the admission of clients into the system. This module must be scalable and highly available, since clients can arrive at the system in a large number simultaneously, especially during the beginning of new sessions. When a client enters the system, the module provides the client with a list of available sessions. After the client decides which of these sessions to listen to, the module then gives the client the address of the reflector for obtaining the multimedia data, as well as the address of an alternate reflector. Details on how the module selects these reflectors for the clients will be described in the next chapter.
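The admission step can be illustrated as follows. The ranking rule used here, lowest utilization first among reflectors that carry the session and still have capacity, is only an assumption that echoes the load-balancing discussion in Chapter III; the selection scheme the thesis actually uses is described in Chapter V. The function name and the dictionary layout are likewise hypothetical.

```python
def admit(session, reflectors):
    """Return (primary, alternate) reflector addresses for a session.

    reflectors maps an address to a dict with "sessions" (a set of session
    names), "connections", and "max_connections". Either result may be None
    when too few reflectors qualify.
    """
    candidates = [
        (info["connections"] / info["max_connections"], address)
        for address, info in reflectors.items()
        if session in info["sessions"]
        and info["connections"] < info["max_connections"]
    ]
    ranked = [address for _, address in sorted(candidates)]
    primary = ranked[0] if ranked else None
    alternate = ranked[1] if len(ranked) > 1 else None
    return primary, alternate


# Hypothetical state of three reflectors.
reflectors = {
    "r1": {"sessions": {"news"}, "connections": 80, "max_connections": 100},
    "r2": {"sessions": {"news", "music"}, "connections": 10, "max_connections": 50},
    "r3": {"sessions": {"news"}, "connections": 20, "max_connections": 20},
}
```

Here a client choosing the "news" session would be given `r2` as its reflector and `r1` as its alternate, while `r3` is skipped because it has reached its connection limit.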

* Communication Module

The function of the Communication Module for the control unit is similar to that of the other components, which is to handle all the communications that are necessary between the control unit and the other system components. However, the Communication Module of the control unit has a larger workload since communications between the control unit and the other components are more frequent. Some of the connections are used for information gathering purposes. The presence of these connections allows the other system components to update the control unit as to their status. The Communication Module, upon receiving the updated information, relays it to the Information Management Module. This module is also responsible for information dissemination. For example, when a new multimedia source enters the system, the module notifies all the reflectors so that the ones that are interested can participate in the session. Moreover, the control unit also utilizes the Communication Module to provide instructions for the system components so that the appropriate algorithms in the Algorithm Module can be executed.

IV.3 Network Connections between System Components

This section describes the network connections between the system components. The system components, as mentioned previously, are interdependent, meaning that they have to interact with each other in order to perform their designated tasks. Connections among the components can be divided into two major categories. The first one, obviously, is the streaming of the multimedia data, which is unidirectional. For each session, the multimedia stream begins at the source, where the audio and video data are generated. The data are then sent to the reflectors that scale up the size of the session by allowing multiple client connections. Afterwards, the data are either transferred to lower layer reflectors or delivered directly to the client. The transmission is completed when the client receives the stream and performs the audio and video rendering process. Although the streaming process involves going through layers of reflectors, for the users the process looks just like a single point-to-point connection between the source and the client. The characteristic of the streaming process is that certain bandwidth requirements exist for the connections as the audio and video data must be delivered on time. Moreover, under most circumstances these connections are continuous and sustained, as many clients are expected to listen to a broadcast for some time after joining in.

The second type of connection allows for the exchange of control information among the system components. The cost for this kind of connection is generally low since only a small amount of data is being transferred each time. This kind of connection can be further divided into two major types. The first type is information broadcast. This is used when a system component wants to notify all other components of the occurrence of some specific event without requiring acknowledgments or feedback from the recipients. For example, when a new source enters the system, the control unit will broadcast this information and notify all the active reflectors about the arrival of the new multimedia source. The second type is tightly coupled communication between the components. A connection is tightly coupled when a request made by the sender requires the receiver, upon receiving the message, to reply with an acknowledgment or with the requested information. State information is often encapsulated in this kind of connection. Thus these connections must be reliable, as any missing message can create incorrect state information, which can cause the system to become unstable.
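The difference between the two kinds of control connection can be sketched as below. The function names, the retry limit, and the `LossyReceiver` helper are hypothetical, and resending until a reply arrives is only one assumed way of obtaining the reliability that tightly coupled communication requires; the thesis does not state how reliability is achieved.

```python
def broadcast(message, recipients):
    """Information broadcast: deliver once to everyone, expect no reply."""
    for deliver in recipients:
        deliver(message)


def request(message, deliver, max_attempts=3):
    """Tightly coupled exchange: resend until the receiver replies."""
    for attempt in range(1, max_attempts + 1):
        reply = deliver(message)    # None models a lost message or reply
        if reply is not None:
            return reply, attempt
    raise TimeoutError(f"no acknowledgment after {max_attempts} attempts")


class LossyReceiver:
    """Stand-in receiver that ignores the first `losses` messages."""

    def __init__(self, losses):
        self.losses = losses
        self.inbox = []

    def __call__(self, message):
        self.inbox.append(message)
        if len(self.inbox) <= self.losses:
            return None
        return "ack:" + message
```

A broadcast returns nothing and never retries, whereas `request` either returns the acknowledgment or raises an error, so the sender always learns whether the receiver's state was updated.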


Chapter V

System Dynamics

In this chapter we present the various system dynamics. The operations that are described include processes for handling the arrival of a multimedia source, the departure of a multimedia source, the arrival of a reflector, and the departure of a reflector. Some of these operations involve the use of tree construction algorithms. For example, an approximation algorithm for the Weighted Constrained Steiner Tree problem is used to construct a new session tree when multimedia sources arrive. Also, when a reflector joins a session, a dynamic algorithm is used for attaching the new reflector to the established session tree. Other operations that are described in this chapter include the process for client admission and the assignment of alternate servers to reflectors and clients.

We use both the graph representation and the system representation to illustrate the different operations. The graph representation is used mainly for describing the steps of the algorithms. In the graph representation, we use nodes and edges for the system components and their connections. The number along an edge is read as the cost of using that particular edge, and the number at a node is the weight associated with that node. On the other hand, the system representation is used for illustrating the procedures and communications involved in the various system operations. This representation is more complex than the graph representation as the different components and the sessions in the system have to be identified. An example of the system representation is given in Figure V.1. The graph contains two sessions, Session A and Session B, which are represented by black and gray respectively. For simplicity, communications between components are shown in dotted arrows only when they are used in the operation that is under consideration.

V.1 Arrival of Multimedia Source

Multimedia sources are allowed to join and depart from the system dynamically. After a source is incorporated into the system, a new session will be constructed in order to
