
1.2 Telecommunication networks

1.2.5 TCP

It is important to note that the presentation of the protocol functions, and of their realization by the Transmission Control Protocol (TCP), given in this section is neither complete nor exhaustive. The protocol functions of error control, connection management and flow control are complex, and some details are deliberately omitted to keep the presentation clear and simple. In addition, the specification of the TCP mechanisms controlling the sending rate, and the implementation of these mechanisms, are outside the scope of this thesis. Readers interested in more details about TCP can refer to RFC 675 and RFC 793 for early versions of TCP, or to RFCs 5681, 7332, 6247 and 6298 for more recent versions. There exist many versions of TCP; we present only some of them, as this section aims to present the main protocol functions and their implementation in TCP.

The presentation and understanding of TCP mechanisms are important, as they provide the knowledge necessary to understand the models presented in the studies of the Dropbox application, the FTP application and CDN throughput, which we present in Chapters 4, 5 and 6 respectively.

TCP connection management, error control and flow control

At the Network layer (Figure 1.2), the protocols are not concerned with the underlying network capacity. Their only purpose is to ensure that packets are forwarded from one link to another. If, at a certain point of the network, the incoming traffic exceeds the capacity of the outgoing links, the network becomes congested at this point. In case of congestion, packets can be delayed or, in case of resource shortage, dropped. Buffers are usually used to cope with this eventuality and to smooth the traffic in order to optimize network usage and improve performance. However, if a buffer becomes full, it cannot store any more packets and some packets are dropped. Therefore, on its way from the sender to the receiver, a packet can be lost, corrupted, duplicated or reordered (it arrives at the receiver before its predecessor or after its successor).

TCP provides reliable, in-sequence delivery and adjusts to the network state (congestion). TCP implements connection management, error control and flow control, and uses packet sequence numbers, timers, acknowledgments, buffers and a receiver window.

Each time an Internet end user, the sender, wants to transmit information to another Internet end user, the receiver, the sender first contacts the receiver and opens a connection. The information that the sender wants to transmit is then divided into fixed-length packets that are transmitted along the different networks connecting the two users. Each packet sent has a field containing a unique identifier that consists of its sequence number and its connection identifier. For the first packet sent, the sequence number field is set to a (pseudo) random value, and for each new packet sent this field is incremented by one. Note that the sequence number is not incremented for a retransmission after a packet is lost (e.g. dropped by a router).

For each packet correctly received, the receiver sends a special packet, called an acknowledgment, back to the sender. The acknowledgment identifies the packet whose reception it confirms through that packet's sequence number field value. It is important to note that most versions of TCP use positive acknowledgments: the receiver can only acknowledge the reception of a packet (not its loss). Additionally, it is common to use cumulative acknowledgments, where the acknowledgment indicates that the whole sequence of packets up to the number it carries has been correctly received.
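
The cumulative acknowledgment described above can be sketched as follows. This is only an illustration under the packet-based simplification used in this section (real TCP numbers bytes, not packets); the function name `cumulative_ack` and its arguments are hypothetical.

```python
from typing import Optional, Set


def cumulative_ack(received: Set[int], first_seq: int) -> Optional[int]:
    """Return the highest sequence number n such that every packet from
    first_seq up to n has been received, or None if first_seq is missing.

    Assumption: sequence numbers count packets and do not wrap around.
    """
    ack = None
    seq = first_seq
    # Walk forward while the sequence is contiguous.
    while seq in received:
        ack = seq
        seq += 1
    return ack


# Packets 1, 2, 3, 5 and 6 arrived; packet 4 is missing, so the receiver
# can only acknowledge up to packet 3.
print(cumulative_ack({1, 2, 3, 5, 6}, first_seq=1))
```

Packets 5 and 6 stay buffered at the receiver and will be covered by a later acknowledgment once packet 4 arrives.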

Connection management Connection management involves setting up and tearing down connections. Connection management is a necessary building block in every protocol that provides reliable data transfer. It ensures:

at-most-once delivery: each unit of information sent by the sender to the receiver is accepted only once by the receiver

graceful close: a connection is not closed by the sender or the receiver before all the information that the sender has to transmit is correctly received and acknowledged by the receiver

By keeping track of the packets that have been sent and correctly received, it is possible to ensure at-most-once delivery. For the graceful close, a combination of a three-way handshake and a timer ensures that both parties have agreed to close the connection and that no more information is to be sent.

The complexity of managing a connection comes from (i) Crash: the sender or the receiver may crash and lose state information; (ii) Size: the identifiers (connections and packets) need to be reused, and the communicating entities cannot store all the packets of all the connections in which they are involved.

To solve (i), a crash counter can be used as part of the connection identifier and a timer can be set to ensure that duplicates will not be received.

To solve (ii), it is necessary to ensure safe reuse of past identifiers. Regarding sequence numbers, the sequence number field of a TCP packet is 32 bits long, which allows safe reuse. For the connection identifier, special care must be taken with the maximum lifetime of a connection identifier. The lifetime of connection records (the information relative to a given connection stored by the connection management protocol) is used to release connection records after a finite amount of time.

Error control The transmission of data can be subject to errors. In this section, we describe the general principles of error control. Error control comprises error detection and error recovery. We distinguish two types of errors:

Corruption of data, for instance due to noise, crosstalk or inter-symbol interference. Corruption can be detected using techniques such as parity bits or cyclic redundancy checks. To detect such corruption, it is necessary to perform end-to-end error detection at the transport layer.

Loss of data, for instance due to buffer overflow in a switching node or an end system. To detect loss unambiguously, all transmitted data must be identified with a sequence number.

In TCP, using the checksum field of the TCP packet header, the receiver can detect corrupted data and discard it. Discarding corrupted messages is called hard error detection.
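
As an illustration of the corruption check described above, the following sketch computes a 16-bit one's complement checksum of the kind carried in the TCP checksum field. It is deliberately simplified: the real TCP checksum also covers the TCP header and a pseudo-header built from IP fields, whereas here only a payload is covered. The function names are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data,
    taken as big-endian 16-bit words (odd lengths are zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def is_intact(data: bytes, checksum: int) -> bool:
    """Receiver-side check: recompute the checksum and compare."""
    return internet_checksum(data) == checksum


payload = b"hello"
checksum = internet_checksum(payload)
print(is_intact(payload, checksum))   # unmodified payload
print(is_intact(b"hellp", checksum))  # one corrupted byte
```

A packet failing this check is simply discarded, and the loss detection mechanisms of the sender then trigger its retransmission.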

With the use of sequence numbers, TCP can detect duplicates and correct reordering at the receiver side. In addition, using an estimate of the packet Round Trip Time (RTT), the TCP sender can use timers to detect losses. To obtain this estimate, the TCP sender can measure the time between the transmission of a packet and the reception of its corresponding acknowledgment.
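
One possible way to turn RTT measurements into a timer value is the smoothing scheme of RFC 6298, sketched below. The class name `RttEstimator` is hypothetical, the clock granularity term of the RFC is ignored, and actual implementations may use different bounds and constants.

```python
class RttEstimator:
    """Simplified retransmission timer computation in the style of RFC 6298."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation
    K = 4           # weight of the variation in the timer value
    MIN_RTO = 1.0   # lower bound on the timer, in seconds

    def __init__(self):
        self.srtt = None     # smoothed RTT
        self.rttvar = None   # RTT variation
        self.rto = 1.0       # timer value before any measurement

    def update(self, sample: float) -> float:
        """Feed one RTT measurement (in seconds) and return the new timer value."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The variation is updated with the previous smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        self.rto = max(self.MIN_RTO, self.srtt + self.K * self.rttvar)
        return self.rto


estimator = RttEstimator()
for rtt in (0.5, 0.5, 0.8):
    print(estimator.update(rtt))
```

Weighting the variation keeps the timer well above the average RTT on paths with fluctuating delay, which limits spurious retransmissions.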

There are mainly two ways for the sender to detect a loss.

A causal approach to the study of Telecommunication networks

One way is to use timers. Each time a packet is sent, the sender starts a timer whose value depends on the RTT values estimated so far. If no acknowledgment is received before the timer expires, TCP triggers a time out event: the corresponding packet is considered lost and is retransmitted.

The second way uses (usually implicit) information from the receiver. Some TCP versions make use of selective acknowledgments and some do not, but in either case the receiver can implicitly tell the sender, through its acknowledgments, that it is still waiting for a missing packet. After receiving such information, the sender can consider that the packet has been lost and decide to retransmit it.
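
A common form of this implicit signal is the repetition of the same cumulative acknowledgment. The sketch below assumes the packet-based convention of this section (an acknowledgment carries the number of the last packet received in sequence) and a threshold of three duplicates; the names `DUP_ACK_THRESHOLD` and `detect_loss_from_acks` are hypothetical.

```python
DUP_ACK_THRESHOLD = 3  # assumed number of duplicates that signals a loss


def detect_loss_from_acks(acks):
    """Given cumulative acknowledgments in arrival order, return the
    sequence numbers of the packets the sender would presume lost."""
    last_ack = None
    duplicates = 0
    presumed_lost = []
    for ack in acks:
        if ack == last_ack:
            duplicates += 1
            if duplicates == DUP_ACK_THRESHOLD:
                # The receiver keeps asking for the packet after `ack`.
                presumed_lost.append(ack + 1)
        else:
            last_ack = ack
            duplicates = 0
    return presumed_lost


# Packet 4 is lost: packets 5, 6 and 7 each trigger a repeated ack for 3.
print(detect_loss_from_acks([1, 2, 3, 3, 3, 3, 7]))
```

Because it does not wait for a timer to expire, this detection usually reacts faster than a time out.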

Congestion control

If users send more packets than the network can transmit, packets will be dropped. If the senders keep sending packets and retransmitting dropped (i.e. lost) packets, the network will eventually collapse. Congestion collapse was first observed in 1986, when the throughput of a U.S. backbone (the NSFNET backbone) dropped by three orders of magnitude below its capacity. To avoid congestion collapse, congestion control was implemented in the end nodes.

To control the amount of traffic that can be safely handled by the network, TCP defines a sender-internal parameter called the congestion window. At any time during the communication between two entities over the Internet, the sending rate of the sender is limited by the size of the congestion window. The congestion window size is initially set to an empirical value, and each time a packet is successfully transmitted to the receiver (that is, the sender has received the acknowledgment corresponding to the packet), the congestion window size is increased. Conversely, each time a loss is detected at the sender side, the congestion window size is decreased. Note that the speed at which the congestion window increases is dictated by the frequency at which the sender receives acknowledgments. The time between the transmission of a packet and the reception of its corresponding acknowledgment is the RTT, which is influenced by the load of the network(s) on the path between the sender and the receiver.
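
The increase and decrease rules are left unspecified above because they depend on the TCP version. As one hedged illustration, the sketch below follows a Reno-like scheme: the window grows by one packet per acknowledgment below a threshold, by 1/cwnd per acknowledgment above it, and is halved on loss. The class name, the threshold `ssthresh` and all default values are assumptions made for this example.

```python
class CongestionWindow:
    """Reno-like congestion window, counted in packets (illustrative only)."""

    def __init__(self, initial=1.0, ssthresh=64.0):
        self.cwnd = float(initial)
        self.ssthresh = float(ssthresh)

    def on_ack(self):
        """A new acknowledgment arrived: increase the window."""
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # fast growth below the threshold
        else:
            self.cwnd += 1.0 / self.cwnd  # about one packet per RTT above it

    def on_loss(self):
        """A loss was detected: decrease the window."""
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = self.ssthresh


window = CongestionWindow(initial=1, ssthresh=4)
for _ in range(4):
    window.on_ack()
print(window.cwnd)   # window after four acknowledgments
window.on_loss()
print(window.cwnd)   # window after one loss
```

Since `on_ack` is only called when an acknowledgment arrives, a longer RTT directly slows down the growth of the window.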

Flow control

The last important parameter to discuss is the receiver window. Packets arriving at the receiver can be out of order. The receiver can reorder packets using the sequence numbers of the TCP packets. However, to do so, the receiver needs to buffer the packets it receives, inspect them and, if necessary, reorder them before passing them to the upper layer (cf. TCP/IP suite, Figure 1.2). To inform the sender about its available buffer space and to limit the sending rate from the receiver side, one of the fields of the TCP packet contains the value of a parameter called the receiver window. In each TCP packet sent by the receiver to the sender (for example, in acknowledgments), the receiver informs the sender about the memory available to receive more data. Through this mechanism, the receiver ensures that no packet will be lost due to a resource shortage on its side, and it can adjust the sending rate according to its needs or resources.
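
The interplay between buffering, reordering and the advertised receiver window can be sketched as follows. The example counts packets rather than bytes and assumes that the application consumes in-order data immediately; the class `ReceiveBuffer` and its methods are hypothetical.

```python
class ReceiveBuffer:
    """Receiver-side buffer that reorders packets and advertises free space."""

    def __init__(self, capacity: int, first_seq: int):
        self.capacity = capacity     # buffer size, in packets
        self.next_seq = first_seq    # next packet to pass to the upper layer
        self.pending = {}            # out-of-order packets, keyed by sequence number

    def receive(self, seq: int, payload: bytes) -> list:
        """Store a packet and return the payloads that are now deliverable in order."""
        if seq < self.next_seq or seq in self.pending:
            return []                # duplicate: ignore it
        if seq != self.next_seq and len(self.pending) >= self.capacity:
            return []                # no room left for out-of-order data
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

    def advertised_window(self) -> int:
        """Value the receiver would place in the receiver window field."""
        return self.capacity - len(self.pending)


buffer = ReceiveBuffer(capacity=4, first_seq=1)
print(buffer.receive(2, b"b"), buffer.advertised_window())  # packet 2 waits for packet 1
print(buffer.receive(1, b"a"), buffer.advertised_window())  # both are delivered in order
```

While packet 2 waits in the buffer, the advertised window shrinks, which is how the receiver slows the sender down.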

Concluding remark for TCP congestion control

There exist other mechanisms to prevent congestion; for example, some routers can explicitly inform senders of the presence of congestion. However, such mechanisms are not always supported and are not considered in our work.

What is important to note is that, at any moment during a connection, the sending rate is determined by the minimum of three factors: the congestion window, the receiver window and the amount of data present in the sender buffer waiting to be sent. Figure 1.3 presents a simplified view of the different TCP sender windows in the case where the receiver window limits the sending rate. Note that this scenario is deliberately simplified: we reason in terms of packets (not bytes), we consider very small quantities of information, and we consider a short period of time.

Figure 1.3: Different windows at the TCP sender

In Figure 1.3, we represent a scenario where an application needs to transmit data over the Internet. The application relies on TCP to safely send data to another entity across the Internet. The TCP sender has a buffer, with a capacity of 14 packets, that stores packets waiting to be sent or waiting for their acknowledgments. In the scenario represented in Figure 1.3, the sender has already sent four packets (1, 2, 3, 4) and received their corresponding acknowledgments, ensuring that they were correctly received by the receiver. The TCP sender has also sent three additional packets (5, 6, 7) that are not yet acknowledged by the receiver. We also suppose that the initial value of the congestion window and the correct transmission of the first three packets have allowed the congestion window (cwnd) to reach a value corresponding to seven packets. On the other hand, the receiver has informed the sender that the resources it allocated for their TCP connection (rcvwnd) correspond to six packets. As three packets have been sent but not yet acknowledged, the number of packets that can be sent in the scenario of Figure 1.3 is (swnd):

swnd = min(cwnd, rcvwnd) − SentNotAckd = min(7, 6) − 3 = 3, so the sender can send packets 8, 9 and 10.

Let us assume that, upon receiving the acknowledgment of packet 5, the congestion window is incremented by 1 (cwnd = 8) but the receiver window keeps its value of 6 (Figure 1.4). With packets 8, 9 and 10 now sent, five packets (6 to 10) remain unacknowledged. Therefore swnd is:

swnd = min(cwnd, rcvwnd) − SentNotAckd = min(8, 6) − 5 = 1, and packet 11 can be sent.
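
The two computations above can be reproduced with a few lines of code. This is a sketch in the packet-based setting of Figures 1.3 and 1.4; the function names `usable_window` and `next_packets` are hypothetical.

```python
def usable_window(cwnd: int, rcvwnd: int, sent_not_acked: int) -> int:
    """swnd = min(cwnd, rcvwnd) - SentNotAckd, never below zero."""
    return max(0, min(cwnd, rcvwnd) - sent_not_acked)


def next_packets(highest_sent: int, swnd: int) -> list:
    """Sequence numbers of the packets the sender may transmit now."""
    return list(range(highest_sent + 1, highest_sent + 1 + swnd))


# Figure 1.3: packets 5, 6 and 7 are sent but not yet acknowledged.
swnd = usable_window(cwnd=7, rcvwnd=6, sent_not_acked=3)
print(swnd, next_packets(highest_sent=7, swnd=swnd))

# Figure 1.4: packet 5 is acknowledged, packets 6 to 10 are outstanding.
swnd = usable_window(cwnd=8, rcvwnd=6, sent_not_acked=5)
print(swnd, next_packets(highest_sent=10, swnd=swnd))
```

In both cases the receiver window (6) is smaller than the congestion window, so it is the receiver window that limits the sending rate.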

Figure 1.4: Different windows at the TCP sender upon reception of an ack

There exist different versions of the TCP protocol, among which are the Tahoe, Reno, Vegas, New Reno and Cubic versions.

The different TCP versions vary in their implementations of the congestion avoidance algorithm. For example, TCP Reno improves on TCP Tahoe by reacting to the reception of three duplicate acknowledgments and implements what is known as Fast Recovery to react better and faster to a congestion event. TCP Vegas improved the estimation of the optimal time out timer value. TCP New Reno improves the performance of TCP when a congestion event is detected by the reception of three duplicate acknowledgments, using the implicit information that data is still passing through the network if acknowledgments are received from the receiver. Due to the evolution of networks toward higher transfer speeds, and in order to reduce the discrimination against connections with high latency, TCP Cubic defines the evolution of the congestion window as a cubic function of the time elapsed since the last loss event.

TCP Cubic is the default version used in Linux kernels 2.6.19 and above.
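
To make the cubic growth concrete, the sketch below evaluates the window function as given in RFC 8312, W(t) = C (t − K)^3 + W_max with K = (W_max (1 − β) / C)^(1/3). The constants C = 0.4 and β = 0.7 are the values suggested by that RFC and are treated here as assumptions, since implementations can tune them; the function names are hypothetical.

```python
def cubic_k(w_max: float, c: float = 0.4, beta: float = 0.7) -> float:
    """Time (in seconds) needed to grow back to w_max after a loss."""
    return (w_max * (1 - beta) / c) ** (1 / 3)


def cubic_window(t: float, w_max: float, c: float = 0.4, beta: float = 0.7) -> float:
    """Congestion window (in packets) t seconds after the last loss event,
    where w_max is the window size just before that loss."""
    k = cubic_k(w_max, c, beta)
    return c * (t - k) ** 3 + w_max


w_max = 100.0
for t in (0.0, 2.0, cubic_k(w_max), 6.0):
    print(round(t, 2), round(cubic_window(t, w_max), 2))
```

The window restarts at β times W_max, flattens as it approaches W_max, then probes beyond it. Because the function depends on elapsed time rather than on the number of acknowledgments received, the growth is less sensitive to the RTT of the connection.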

TCP specifications vs TCP implementation

TCP is an important protocol for the proper functioning of the Internet. All the machines connected to the Internet implement TCP. However, due to the diversity of Internet-connected devices (from a heat sensor to a supercomputer), each machine implements its own version of the TCP protocol and uses its own parameterization. All the machines implementing TCP have to comply with the TCP specification (such as the TCP header format) in order to be able to communicate with one another, but the implementation of functions such as congestion avoidance can differ and be parameterized: the initial congestion window value, the increase of the congestion window upon reception of a valid acknowledgment, the computation of the time out value, and so on. Therefore, there is no such thing as one TCP version running on all Internet machines, but rather many versions with many parameterizations, answering different needs and constraints. As we saw, the TCP Cubic version was designed to offer better performance in what is known as a Long Fat Network (LFN) (high bandwidth and high latency). However, with the growing number of mobile end users, the LFN might not be the best network model for mobile users. Each platform (Android (Google), Windows (Microsoft), iOS (Apple)) can run its own TCP version, and the platforms also differ in the parameterization of their TCP versions.

Therefore, one should carefully distinguish between TCP as a specification, which defines rules that allow two entities to communicate, and TCP as an implementation in the operating system of the machine being considered. The TCP specification defines the packet format (the different header fields), while its implementation in a given machine includes algorithms, implementation choices and parameter values that are specific to that machine.