The Failure-Dependent Path Protection Method

Network Survivability: End-to-End Recovery Using Local Failure Information

5.2 The Failure-Dependent Path Protection Method

Shared protection is a well-studied area of survivable routing, and a great number of shared protection methods have been published; however, only very few failure-dependent path protection methods have been studied. One of the main reasons for the scarcity of proposals is that the problem can be solved easily and efficiently with a simple heuristic based on shortest path search. This simple heuristic is referred to as the SPH approach later in this section. In [37] the problem is called partial path protection. Besides the SPH approach, in the next section an ILP formulation is given that provides a solution with optimal bandwidth utilization in optical networks. We now proceed with a detailed study of the corresponding routing problem, propose some novel approaches, and further verify the benefits of the SPH approach.

5.2.1 Recovery Based on the Failure Scenario

A common point of shared protection is that the preselected upstream node does not need to know which network element has failed. This approach is called failure-independent (FI) [28] or state-independent [38] protection.¹ Note that SLP for single link failures is a special case in which the switching node basically has knowledge of the failed link; however, it is still considered failure independent, because from the operational point of view a single protection path is assigned to each switching node.

¹ The “state” means the network failure state, indicating the failed network component(s).

Alternatively, the connection may be assigned more than one protection path, depending on the failure scenario. Upon a failure, the switching node activates the protection path corresponding to the failed network element. Such an approach requires precise knowledge of the failure in the network; hence, it is referred to as failure-dependent (FD) [28] or state-dependent [38] protection.
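The difference between the two schemes can be sketched as a lookup performed by the switching node. The snippet below is only an illustration: the node names, the failure labels, and the function `select_protection_path` are hypothetical and do not come from the text. The working path is assumed to be s-a-b-d.

```python
from typing import Dict, List, Optional

# Failure-independent (FI): one protection path, regardless of what failed.
# (Hypothetical example topology; working path assumed to be s-a-b-d.)
FI_BACKUP: List[str] = ["s", "x", "y", "d"]

# Failure-dependent (FD): one protection path per failure scenario.
FD_BACKUPS: Dict[str, List[str]] = {
    "link s-a": ["s", "x", "a", "b", "d"],
    "link a-b": ["s", "a", "y", "d"],
    "link b-d": ["s", "a", "b", "z", "d"],
}


def select_protection_path(failed_element: str,
                           fd_table: Dict[str, List[str]],
                           fi_path: Optional[List[str]] = None) -> Optional[List[str]]:
    """Return the FD path for the reported failure, else fall back to the FI path."""
    return fd_table.get(failed_element, fi_path)


chosen = select_protection_path("link a-b", FD_BACKUPS, FI_BACKUP)
fallback = select_protection_path("unknown failure", FD_BACKUPS, FI_BACKUP)
```

In this sketch the FD table must be indexed by the failed element, which is why the switching node needs precise failure knowledge, whereas the FI path can be activated on any failure indication.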

Similar methods have been published in relation to path restoration. Contrary to all other methods discussed, path restoration is not a preplanned protection scheme; thus, the path computation takes place after the failure occurs. However, the required minimum spare capacity is basically determined with FD protection routing methods. In [7] the method is called true-path restoration, in [11, 27] simply path restoration, and in [21] and [17] path restoration with static traffic. As FD protection is tailored to specified failures, it normally requires less spare capacity than FI protection [18]. Even though FD protection was neglected due to its longer restoration time and its extended nodal processing and memory requirements [38], the Internet Engineering Task Force (IETF) addressed this problem and published RFC 4090 [26], which defines the signalling extensions necessary to support a recovery scheme called MPLS fast reroute. The IETF fast reroute defines two methods. The first one is called one-to-one backup, where each Label Switched Path (LSP) is protected separately. Among the FI methods, shared segment protection follows very similar ideas. The second method is called facility backup, in which each facility is protected with a single protection bypass tunnel between the potential failure points; thus, any LSP passing through the facility is protected by the same bypass tunnel. P-cycles can be treated as a similar approach for the FI case.

We assume that each working path is protected separately and we consider one-to-one backup. The key advantage of MPLS fast reroute is that it provides FD protection with short restoration time. This can be done by fixing the switching node as the first upstream adjacent node, while the merging node can be any of the downstream nodes. The failures of the network elements are detected by the adjacent nodes and, since only single link failures are considered, we may assume that the switching node has knowledge of the failed link after the failure detection time, leading to a rapid recovery cycle. In our study we allow any upstream node to be a switching node; however, the above limitation can easily be applied to the proposed algorithms. In [36] the MPLS fast reroute is extended with distributed shared bandwidth management, which allows sharing of recovery resources of disjoint working paths.

5.2.2 Path Assignment Approaches

We consider an online routing problem, without any knowledge of future request arrivals and without applying prediction-based routing on the statistics of past requests. Traffic Engineering (TE) controls traffic to assure an economical utilization of network resources. We use link weight setting methods, which identify the bottleneck links in the network and, by assigning high administrative link weights, circumvent these links during the course of path selection. This ensures that residual capacity is always available at bottleneck links, thus facilitating the successful routing of as many future requests as possible. The path selection schemes use the administrative weights as their cost functions to take network-wide TE policies into account to achieve global optimization of network resources.
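The effect of such a weight setting can be illustrated with a small sketch. The text does not specify a particular weight function, so the rule used here (capacity divided by free capacity) is purely an assumption, as are the topology and the names `admin_weight` and `shortest_path`.

```python
import heapq
from typing import Dict, List, Tuple

Arc = Tuple[str, str]


def admin_weight(free: float, capacity: float) -> float:
    """Assumed weight rule (not from the text): grows as residual capacity shrinks."""
    if free <= 0:
        return float("inf")
    return capacity / free


def shortest_path(weights: Dict[Arc, float], src: str, dst: str) -> List[str]:
    """Plain Dijkstra over directed arcs; returns [] if dst is unreachable."""
    adj: Dict[str, List[Tuple[str, float]]] = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, []).append((v, w))
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            break
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


CAPACITY = 10.0
# Hypothetical free capacities: arc (s, a) is a bottleneck with 1 unit left.
FREE = {("s", "a"): 1.0, ("a", "d"): 10.0,
        ("s", "b"): 10.0, ("b", "c"): 10.0, ("c", "d"): 10.0}

hop_route = shortest_path({arc: 1.0 for arc in FREE}, "s", "d")
te_route = shortest_path({arc: admin_weight(f, CAPACITY) for arc, f in FREE.items()}, "s", "d")
```

With hop-count weights the two-hop route over the nearly full arc is chosen, while the capacity-aware weights steer the request onto the longer but lightly loaded route, leaving the residual capacity of the bottleneck untouched.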

We assume a source-routing scenario with complete routing information, where the link state protocols disseminate all the necessary routing information to each node, including the free and spare capacities on links, the shareability of protection routes, and the administrative link weights.

5.2.3 General Shared Risk Groups (SRG)

A Shared Risk Group (SRG) is defined as a group of network elements (links, nodes, physical devices, software or protocol identities, etc., or a mix of them) possibly subject to a common risk of single failure. In practical cases, an SRG may contain several seemingly unrelated and arbitrarily selected network elements. We say that a working path is involved in an SRG if it traverses any network element that belongs to the SRG.
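As a minimal sketch of this definition, an SRG can be modelled as a named set of network elements, and involvement reduces to a set intersection. The SRG names, the elements, and the helper `involved_srgs` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

Element = Tuple[str, str]  # here every network element is a link (u, v)

# Hypothetical SRGs: "conduit-1" groups two links that share a duct.
SRGS: Dict[str, Set[Element]] = {
    "link-s-a": {("s", "a")},
    "conduit-1": {("a", "b"), ("a", "c")},
    "link-b-d": {("b", "d")},
    "link-c-d": {("c", "d")},
}


def involved_srgs(working_path: Iterable[Element],
                  srgs: Dict[str, Set[Element]]) -> Set[str]:
    """Names of all SRGs that contain at least one element of the working path."""
    used = set(working_path)
    return {name for name, members in srgs.items() if members & used}


W = [("s", "a"), ("a", "b"), ("b", "d")]
involved = involved_srgs(W, SRGS)
```

Note that the working path is involved in "conduit-1" although it traverses only one of the two links in that group.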

Most past studies focused on the case where each single network element in the network topology serves as an SRG. Although this special case is widely accepted and very common, we believe that a sound survivable routing algorithm should be able to cope with the general definition of SRG.

Obviously, in single-layer and single-domain networks, multiple network elements might be contained in a single SRG. However, the concept of general SRGs contributes particularly to the development and implementation of survivable routing schemes for modern multilayer, multi-domain, and multi-carrier public networks. Because the network can be multilayered, it is not straightforward to place the network elements of the upper virtual layer and the underlying physical layer in a common SRG, although the upper-layer virtual topologies could be embedded in the lower-layer topology. The general SRG concept simplifies this dependency by separately grouping the network elements of each layer. In the case of a multi-domain and multi-carrier network environment, a suite of efficient and secure link state information dissemination mechanisms must be developed to support routing in the network layer. Here, the definition of general SRGs can help the link state information aggregation, classification, and encapsulation to achieve a resource-sharable protection plan.

In end-to-end FI protection (e.g., dedicated or shared path protection), the task is to find an SRG-disjoint working and protection path-pair for a connection request, which has been proved to be NP-complete [1, 9, 15]. Therefore, the problem is solved either by heuristics [11, 14] or by algorithms with exponential worst-case running time, such as Integer Linear Programming (ILP) [14, 15].

The consideration of the general definition of SRGs has two impacts on solving the survivable routing problem, compared with the case where there is a one-to-one mapping between a link or node and an SRG. Firstly, solving the survivable routing problem turns into finding an SRG-disjoint path-pair, which is NP-hard. Secondly, in addition to the NP-completeness, the general definition of SRGs may increase the size of the spare provision matrix (SPM) [22]. In [33] the matrix expression by Yu Liu [22] was modified to enumerate the Spare Provision Matrix in the case of general SRGs.

5.2.4 The Input of the Problem

Given a network with a set of nodes N and a set of links L, a corresponding transformed graph can be produced by modeling each network element of interest in the original network as an arc in the transformed graph. Each SRG of the original network can be represented by a set of arcs in the transformed graph.

Let G(V, E) denote the transformed graph of the original network with a set of arcs E and a set of vertices V, where |E| and |V| are the number of arcs and vertices in G.

The cost of allocating a unit of capacity on arc j (the administrative weight) is denoted as c_j, ∀j ∈ E. The unreserved free capacity along arc j is denoted as f_j, ∀j ∈ E. The amount of capacity reserved along arc j is denoted as v_j, ∀j ∈ E. Furthermore, we are given the source vertex s and the destination vertex d of the new demand with a specific amount of bandwidth b. Due to the complete routing information scheme, the full per-flow information of the network (i.e., the working and protection paths along each link) is known. Based on the full per-flow information, the Spare Provision Matrix (SPM) can be calculated [33]. It is denoted as S and is an |E| × |SRG| matrix. The entry (i, j) of S (denoted as s_{i,j}, where i = 1, …, |E| and j = 1, …, |SRG|) is the amount of non-sharable spare capacity along arc i for the protection path P if the working path W is involved in the jth SRG.
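One plausible way to obtain such a matrix from the per-flow information is sketched below. It is an assumption rather than the exact procedure of [33]: for every existing connection whose working path is involved in SRG j, its bandwidth is added to entry (i, j) for each arc i of the protection path it would activate when SRG j fails. The types, names, and example connections are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple

Arc = Tuple[str, str]


class Connection(NamedTuple):
    bandwidth: float
    working: List[Arc]
    backups: Dict[str, List[Arc]]  # SRG name -> protection path used if that SRG fails


def spare_provision_matrix(connections: List[Connection],
                           srgs: Dict[str, Set[Arc]]) -> Dict[Tuple[Arc, str], float]:
    """Sparse SPM: (arc i, SRG j) -> spare capacity on i needed when SRG j fails."""
    spm: Dict[Tuple[Arc, str], float] = defaultdict(float)
    for conn in connections:
        for srg_name, backup in conn.backups.items():
            # Only an SRG the working path is involved in can trigger this backup.
            if srgs[srg_name] & set(conn.working):
                for arc in backup:
                    spm[(arc, srg_name)] += conn.bandwidth
    return dict(spm)


# Hypothetical example: two connections whose backups share the route s-c-d.
SRGS = {"A": {("s", "a")}, "B": {("s", "b")}}
SHARED_BACKUP = [("s", "c"), ("c", "d")]
CONNECTIONS = [
    Connection(2.0, [("s", "a"), ("a", "d")], {"A": SHARED_BACKUP}),
    Connection(3.0, [("s", "b"), ("b", "d")], {"B": SHARED_BACKUP}),
]

S = spare_provision_matrix(CONNECTIONS, SRGS)
```

In this example the two connections fail under different SRGs, so their entries on arc (s, c) land in different columns of S, which is what makes the spare capacity on that arc sharable between them.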

The feasibility condition of the primary (a.k.a. working) path is f_j ≥ b for all arcs j ∈ W (the working path). In FDPP a backup path is assigned to each SRG involved in W. The feasibility condition of the backup path P_j assigned to the jth SRG involved in W is f_i + v_i − s_{i,j} ≥ b for all arcs i ∈ P_j.
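The two conditions translate directly into code. The sketch below assumes dictionary-based inputs (free capacity f, reserved capacity v, and a sparse SPM keyed by arc and SRG name); the function names and the numbers are hypothetical.

```python
from typing import Dict, List, Tuple

Arc = Tuple[str, str]


def working_feasible(path: List[Arc], free: Dict[Arc, float], b: float) -> bool:
    """f_j >= b on every arc j of the working path W."""
    return all(free[j] >= b for j in path)


def backup_feasible(path: List[Arc], srg: str, free: Dict[Arc, float],
                    reserved: Dict[Arc, float],
                    spm: Dict[Tuple[Arc, str], float], b: float) -> bool:
    """f_i + v_i - s_{i,j} >= b on every arc i of the backup path for SRG j."""
    return all(free[i] + reserved[i] - spm.get((i, srg), 0.0) >= b for i in path)


# Hypothetical state of one backup arc: 1 unit free, 3 units already reserved.
FREE = {("s", "c"): 1.0}
RESERVED = {("s", "c"): 3.0}
SPM = {(("s", "c"), "A"): 2.0, (("s", "c"), "B"): 3.0}

ok_for_a = backup_feasible([("s", "c")], "A", FREE, RESERVED, SPM, b=2.0)
ok_for_b = backup_feasible([("s", "c")], "B", FREE, RESERVED, SPM, b=2.0)
```

For SRG A the arc offers 1 + 3 − 2 = 2 units, which suffices for b = 2; for SRG B only 1 + 3 − 3 = 1 unit remains, so the same arc cannot carry the backup path of that SRG.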

5.2.5 Two-Step Approaches

In two-step approaches the optimization is divided into two steps. First, a shortest path is found and assigned as the working path; second, the protection paths are identified.

Two-step approaches are widely used in shared path protection due to their simplicity and decent performance, even if they cannot cope with the trap problem [39]. In the trap problem, the shortest path is such an unfortunate working route that it has no SRG-disjoint protection path, even if there exists an SRG-disjoint path-pair between the given source-destination pair. Due to the trap problem, the two-step approaches always have higher blocking than approaches in which the working and protection paths are jointly optimized.
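A small example makes the trap concrete. The five-link topology and its weights below are hypothetical, each link is assumed to be its own SRG, and the helper names are illustrative only. The two-step procedure picks the shortest path s-a-b-d and then finds no disjoint protection path, although the disjoint pair s-a-d and s-b-d exists.

```python
import heapq
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Link = Tuple[str, str]


def shortest_path(links: Dict[Link, float], src: str, dst: str,
                  banned: Optional[Set[FrozenSet[str]]] = None) -> Optional[List[str]]:
    """Dijkstra on an undirected graph, skipping every link listed in `banned`."""
    banned = banned or set()
    adj: Dict[str, List[Tuple[str, float]]] = {}
    for (u, v), w in links.items():
        if frozenset((u, v)) in banned:
            continue
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def links_of(path: List[str]) -> Set[FrozenSet[str]]:
    """Undirected links traversed by a node sequence."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def two_step(links: Dict[Link, float], src: str, dst: str):
    """Step 1: shortest working path. Step 2: shortest path avoiding its links."""
    working = shortest_path(links, src, dst)
    if working is None:
        return None, None
    return working, shortest_path(links, src, dst, links_of(working))


# Hypothetical trap topology; every link is treated as its own SRG.
TRAP = {("s", "a"): 1, ("a", "b"): 1, ("b", "d"): 1, ("s", "b"): 3, ("a", "d"): 3}

working, backup = two_step(TRAP, "s", "d")
alt_working = ["s", "a", "d"]
alt_backup = shortest_path(TRAP, "s", "d", links_of(alt_working))
```

Here `backup` is `None` for the shortest working path, while the slightly longer working path s-a-d does have a link-disjoint counterpart.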

There are two main concepts that can be applied to solve the trap problem in failure-independent protection environments. The first is the selection of a more appropriate path than the shortest one as the working route. Usually, the shortest path that has a disjoint counterpart [32, 39] is selected as the working path. In the case of online routing, this can be done only with heuristic approaches, since the problem is NP-hard [19]. The second concept is to apply a protection method other than end-to-end protection. Segment protection is a good choice, since it is impervious to the trap problem. In [12] it was stated that “In any network topology, whenever two disjoint paths exist between a pair of end nodes, backup segments are guaranteed to exist for any choice of a primary path between them. Similar guarantees cannot be provided on the existence of end-to-end backup.”

Similarly to segment protection, the trap problem can be easily handled in FDPP. Basically, without knowing the exact route of the working path, we are able to decide whether or not a link belonging to it can be protected. This can be done by simulating the failure of each SRG involved and searching for a feasible protection path between the source and destination nodes of the request. If there is a feasible protection path P after the failure of SRG a (∀e ∈ P: f_e + v_e − s_{e,a} ≥ b), we can be sure that P will intersect the working route in an upstream node (in the worst case at the source node), which can be treated as the switching node, and P will also intersect the working path in a downstream node (in the worst case at the destination node), which can be treated as the merging node, and thus we can be sure that there will be a feasible protection path. Obviously, if there is no protection path that can protect against the failure of SRG a, we will not be able to protect a working path passing through SRG a.

In the same way we can define an FDPP test that filters out all the infeasible edges from being part of the working route, and leaves all the edges that can be freely selected to guarantee a feasible FDPP protection solution.
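The filtering idea can be sketched as follows. This is an interpretation of the procedure described in the preceding paragraphs, not a transcription of the formal definition: an arc passes if, for every SRG containing it, a path between s and d survives once the SRG is removed and only arcs e with f_e + v_e − s_{e,a} ≥ b are kept. Links are treated as undirected, and the topology, capacities, and function names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Arc = Tuple[str, str]


def reachable(links: Iterable[Arc], src: str, dst: str) -> bool:
    """Breadth-first search over undirected links."""
    adj: Dict[str, Set[str]] = {}
    for u, v in links:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, queue = {src}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return True
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def srg_protectable(srg: str, srgs: Dict[str, Set[Arc]], free: Dict[Arc, float],
                    reserved: Dict[Arc, float], spm: Dict[Tuple[Arc, str], float],
                    b: float, src: str, dst: str) -> bool:
    """Simulate the failure of one SRG and look for a feasible protection path."""
    usable = [e for e in free
              if e not in srgs[srg]
              and free[e] + reserved[e] - spm.get((e, srg), 0.0) >= b]
    return reachable(usable, src, dst)


def fdpp_test(arc: Arc, srgs, free, reserved, spm, b, src, dst) -> bool:
    """Assumed rule: every SRG containing the arc must be protectable.
    An arc that belongs to no SRG passes trivially."""
    return all(srg_protectable(name, srgs, free, reserved, spm, b, src, dst)
               for name, members in srgs.items() if arc in members)


# Hypothetical topology; each link forms its own SRG.
LINKS = [("s", "a"), ("a", "b"), ("b", "d"), ("s", "b"), ("a", "d")]
SRGS = {f"{u}-{v}": {(u, v)} for u, v in LINKS}
FREE = {link: 5.0 for link in LINKS}
RESERVED = {link: 0.0 for link in LINKS}

all_pass = {l: fdpp_test(l, SRGS, FREE, RESERVED, {}, 2.0, "s", "d") for l in LINKS}

# Exhaust link s-b: a failure of s-a can then no longer be bypassed.
FREE_TIGHT = dict(FREE)
FREE_TIGHT[("s", "b")] = 0.0
tight = {l: fdpp_test(l, SRGS, FREE_TIGHT, RESERVED, {}, 2.0, "s", "d") for l in LINKS}
```

With ample capacity every link passes, so even the path s-a-b-d is an admissible working route. Once link s-b is exhausted, link s-a fails the test and would be excluded from the working path search.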

Definition 5.1. FDPP test of arc a is true if there is a path P between s and d such
