A note on Prüfer-like coding and counting forests of uniform hypertrees


HAL Id: hal-00905902

https://hal.archives-ouvertes.fr/hal-00905902

Submitted on 18 Nov 2013


Christian Lavault

To cite this version:

Christian Lavault. A note on Prüfer-like coding and counting forests of uniform hypertrees. Journal of Discrete Algorithms, Elsevier, 2012, 12 (1), 29–36 (selected paper of ACiD 2010). hal-00905902


Christian Lavault∗

Abstract

This note presents encoding and decoding algorithms for a forest of (labelled) rooted uniform hypertrees and hypercycles that run in linear time and use as few as n − 2 integers in the range [1, n]. They are a simple extension of the classical Prüfer code for (labelled) rooted trees to forests of (labelled) rooted uniform hypertrees and hypercycles, and they make it possible to count these structures according to their number of vertices, hyperedges and hypertrees. In passing, we also recover Cayley’s formula for the number of (labelled) rooted trees, as well as its generalisation to the number of hypercycles found by Selivanov in the early 1970s.

Key words: Hypergraph, Forest of (labelled rooted) hypertrees, Prüfer code, Encoding-decoding, b-uniform, Enumeration.

1 Notations and definitions

A hypergraph H is a pair H = (V, E), where V = {1, 2, . . . , n} denotes the set of vertices and E is a family of subsets of V each of size ≥ 2 called hyperedges (see e.g. [1]).

Two vertices are neighbours if they belong to the same hyperedge. The degree of a vertex is the number of its neighbours. A leaf is a set of b − 1 non-distinguished vertices of degree (b − 1) belonging to the same hyperedge.

A hyperpath (path) between two vertices u and v is a finite sequence of hyperedges $e_1, \ldots, e_k$ such that $e_i \cap e_{i+1} \neq \emptyset$ for any 1 ≤ i ≤ k − 1, with $u \in e_1$ and $v \in e_k$.

A hypergraph H is connected if there exists a path between any two vertices of H.

A connected hypergraph is also called a connected component, or simply a component.

A hypergraph is called b-uniform (or uniform) if every hyperedge e ∈ E contains exactly b vertices (2 ≤ b ≤ n) [8, 14]. For example, 2-uniform hypergraphs are simply graphs. In the present note, only connected b-uniform hypergraphs (2 ≤ b ≤ n) are considered.

∗ LIPN (UMR CNRS 7030), Université Paris 13, 99 av. J.-B. Clément, 93430 Villetaneuse (France).

E-mail: lavault@lipn.univ-paris13.fr

The excess of a connected hypergraph H = (V, E) is defined as
$$\operatorname{exc}(H) = \sum_{e \in E} (|e| - 1) - |V|$$
(see e.g. [8, 14]). Thus, if H is b-uniform, its excess is (b − 1)|E| − |V|. A hypertree is a component of excess −1, which is the smallest excess possible for a connected hypergraph H, and hence, exc(H) ≥ −1 for any H (see the above definition).

A component is rooted if one of its vertices is distinguished from all others. A hypergraph is called a forest if all its components have excess −1 (i.e. are hypertrees); similarly, a hypercycle is a component of excess 0.
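The definitions of excess, hypertree and hypercycle can be checked mechanically. The following Python sketch is only an illustration: the function names `excess`, `is_connected` and `classify` are hypothetical, and the vertex set is assumed to be the set of vertices covered by the hyperedges. The nine hyperedges used in the usage note are those of the example forest of Section 2.2.

```python
from itertools import chain

def excess(edges):
    """exc(H) = sum over hyperedges of (|e| - 1), minus |V|.

    Assumption: V is the set of vertices covered by the hyperedges.
    """
    vertices = set(chain.from_iterable(edges))
    return sum(len(e) - 1 for e in edges) - len(vertices)

def is_connected(edges):
    """True if any two covered vertices are joined by a hyperpath."""
    pending = [set(e) for e in edges]
    if not pending:
        return True
    reached = pending.pop(0)
    grew = True
    while pending and grew:
        grew = False
        for e in list(pending):
            if reached & e:          # e shares a vertex with the reached part
                reached |= e
                pending.remove(e)
                grew = True
    return not pending

def classify(edges):
    """Name a component according to its excess (-1: hypertree, 0: hypercycle)."""
    if not is_connected(edges):
        return "not connected"
    return {-1: "hypertree", 0: "hypercycle"}.get(excess(edges), "other")
```

For instance, the hyperedges of the example forest form a single 3-uniform component with 19 vertices and 9 hyperedges, so `excess` returns 9·2 − 19 = −1, whereas a triangle (b = 2) has excess 0.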

2 Bijective enumeration of a forest of hypertrees

2.1 State of the art and motivations

Concerning connected graphs (2-uniform hypergraphs), there exist several methods for counting trees (see e.g. [2, 3, 4, 7, 9, 10, 11]), including of course the Prüfer code. Prüfer sequences were first introduced by Heinz Prüfer to prove Cayley’s enumeration formula in 1918 [13]. In his very elegant proof¹, Prüfer verified Cayley’s Theorem [3] by establishing a one-to-one correspondence between labelled free trees of order n and all sequences of n − 2 positive integers from 1 to n. The Prüfer codes can thus be generated by a simple iterative algorithm (see also [9, vol. 1, chap. 2] and [6]).

Remark 1. To compute the Prüfer sequence Seq(T) for a labelled tree T, iteratively delete the leaf with smallest label and append the label of its neighbour to the sequence. After n − 2 iterations a single edge remains and we have produced a sequence Seq(T) of length n − 2.
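The procedure of Remark 1 can be sketched in a few lines of Python. This is a minimal illustration, not the linear-time algorithm of [12, 15]: it keeps the current leaves in a binary heap, so it runs in O(n log n) time. The function name `prufer_sequence` and the edge-list input format are assumptions made for the example.

```python
import heapq

def prufer_sequence(n, edges):
    """Prüfer sequence of a labelled tree on vertices 1..n given as a list of edges."""
    adjacent = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    # Min-heap of the current leaves (vertices with exactly one neighbour).
    leaves = [v for v in adjacent if len(adjacent[v]) == 1]
    heapq.heapify(leaves)
    sequence = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)          # leaf with smallest label
        (neighbour,) = adjacent[leaf]         # its unique neighbour
        sequence.append(neighbour)
        adjacent[neighbour].remove(leaf)      # delete the leaf from the tree
        adjacent[leaf].clear()
        if len(adjacent[neighbour]) == 1:     # the neighbour became a leaf
            heapq.heappush(leaves, neighbour)
    return sequence
```

On the path 1–2–3–4 the leaves 1 and then 2 are deleted, which gives the sequence (2, 3); on the star with centre 1 and leaves 2, 3, 4 the sequence is (1, 1).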

Although Prüfer codes date back to 1918, a linear time algorithm for their computation was first given only in the 1970s [12, 15], and it was later rediscovered several times in various forms [2, 4, 5]. A recent example is the pair of sequential encoding and decoding schemes presented in [2]: both require optimal Θ(n) time when applied to rooted n-node trees, and they provide the first optimal linear time decoding algorithm for Neville’s codes [2].

Amongst the most recent results, “Prüfer-like” encoding-decoding algorithms generalize the Prüfer code to the case of hypertrees (uniform or arbitrary). In 2009, S. Shannigrahi and S.P. Pal showed in [17] that uniform hypertrees can be Prüfer-like encoded (and decoded) in optimal linear time Θ(n), using only n − 2 integers in the range [1, n] (see Prüfer’s Theorem in [13]).

¹ Prüfer’s Theorem is as follows. There are $n^{n-2}$ sequences (called Prüfer sequences or Prüfer codes) of length n − 2 with entries in {1, . . . , n}; the proof establishes a bijection between the set of labelled trees on n vertices and this set of sequences.

In the case when hypertrees are arbitrary (non-uniform), the same authors’ encoding and decoding algorithms in [17] require codes of length (n − 2) + p, where p is the number of pivots, i.e. of vertices belonging to more than one hyperedge. Since the number p of pivots is bounded by |E|, at most (n − 2) + |E| integers are needed to encode a general hypertree. Therefore, the design of efficient Prüfer-like encoding and decoding algorithms can be extended to arbitrary hypertrees where each hyperedge has at least two vertices. Up until now, no better bounds are known for the length of Prüfer-like codes for arbitrary hypertrees. By contrast, the exact number of distinct hypertrees is known to be
$$\sum_{i=0}^{n-1} \left\{ {n-1 \atop i} \right\} n^{i-1},$$
where $\left\{ {p \atop q} \right\}$ denotes the Stirling numbers of the second kind [9].
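As a quick numerical illustration of the above sum, the following Python sketch evaluates it for small n. The helper names `stirling2_row` and `hypertree_count` are hypothetical; exact rational arithmetic is used because the term i = 0 involves the factor $n^{-1}$ (its Stirling coefficient vanishes as soon as n ≥ 2).

```python
from fractions import Fraction

def stirling2_row(m):
    """Return [S(m, 0), ..., S(m, m)], the Stirling numbers of the second kind."""
    row = [1]                                  # S(0, 0) = 1
    for size in range(1, m + 1):
        new = [0] * (size + 1)
        for q in range(1, size + 1):
            previous = row[q] if q < size else 0
            # Recurrence: S(size, q) = q * S(size - 1, q) + S(size - 1, q - 1)
            new[q] = q * previous + row[q - 1]
        row = new
    return row

def hypertree_count(n):
    """Evaluate sum_{i=0}^{n-1} S(n-1, i) * n^(i-1) exactly."""
    row = stirling2_row(n - 1)
    total = sum(Fraction(row[i]) * Fraction(n) ** (i - 1) for i in range(n))
    assert total.denominator == 1              # the sum is an integer
    return int(total)
```

For n = 3 the sum equals 1 + 3 = 4, which matches a direct count: the three labelled trees on {1, 2, 3} plus the single hyperedge {1, 2, 3}.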

The main motivation of the present note comes from analytic and bijective combinatorics of hypergraphs, including the enumeration of (labelled) rooted forests of uniform hypertrees and hypercycles [14]. These enumeration results are actually tightly linked to Prüfer-like coding and decoding of such combinatorial structures.

The following two algorithms code and decode forests of rooted uniform hypertrees, which in turn makes it possible to enumerate these structures bijectively by using a generalization of the Prüfer sequences. The knowledge of the number of forests of rooted uniform hypertrees provides a concise and simple description of the structures, e.g. by using a recursive pruning of the leaves in the forests.

2.2 Encoding and decoding algorithms for a forest of (labelled) rooted hypertrees

Definition 1. A forest F composed of (k + 1) (labelled) rooted b-uniform hypertrees, with n vertices and s hyperedges, is coded with a 4-tuple (R, r, P, N) defined as follows:

• R is a set of (k + 1) vertices (roots) with distinct labels in [n] ≡ {1, . . . , n},

• r ∈ R is one of the (k + 1) roots,

• P is a partition of [n] \ R into s subsets, each of size (b − 1), and

• N is an (s − 1)-tuple in $[n]^{s-1}$.

Note that the above positive integers n ≡ n(s), s and k meet the condition n = s(b − 1) + k + 1. Here |E| = s and |V| = n and, since F is b-uniform, exc(F) = s(b − 1) − n = −(k + 1).

Algorithm 1 takes as input a given forest F of k + 1 (labelled) rooted uniform hypertrees and returns as output the 4-tuple (R, r, P, N) coding F.

Algorithm 1 Encoding a forest of rooted hypertrees.

Input: A forest F of k + 1 (labelled) rooted b-uniform hypertrees, with s hyperedges and n = s(b − 1) + k + 1 vertices.

Output: The coding (R, r, P, N) of F as in Definition 1.

1. (R, r, P, N) ← ({root}, r, { }, ( ))

2. Repeat

3.     Add the set of vertices corresponding to the smallest leaf (with respect to the lexicographical order) to the partition P.

4.     Put the vertex linking that set into the tuple N.

5.     Take F as the “new” forest not having the vertices corresponding to the smallest leaf.

6. Until there is no hyperedge remaining in F.

7. The last vertex put into N is necessarily a root; it is removed from N, which thus becomes an (s − 1)-tuple, and r is redefined as this vertex.

8. Return (R, r, P, N).

The 4-tuple (R, r, P, N) obtained by coding a forest F as in Definition 1 (Algorithm 1) determines the following:

1. the number of its components (i.e. the number |R| of its roots),

2. the unique root vertex r attached to the leaf of the last hyperedge in the pruning process,

3. the number of hyperedges |N| + 1, and finally,

4. the number of hypertrees that are not reduced to their roots, namely the number of distinct roots in the pair (N, r).

Example.

The forest F = (V , E) of rooted uniform hypertrees depicted in Fig. 1 is as follows:

V = {1, 2, . . . , 22} and

E = {{1, 21, 22}, {2, 17, 18}, {3, 13, 19}, {4, 8, 18}, {4, 12, 14}, {6, 7, 13}, {7, 20, 21}, {10, 13, 15}, {11, 18, 21}}.

The roots of the hypertrees are 5, 9, 13 and 16.

In this example, Algorithm 1 outputs (R, r, P , N ) s.t.

R = {5, 9, 13, 16}, r = 13,

P = {{1, 22}, {2, 17}, {3, 19}, {4, 8}, {6, 7}, {10, 15}, {11, 18}, {12, 14}, {20, 21}} and N = (21, 18, 13, 13, 4, 18, 21, 7).
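The pruning process of Algorithm 1 can be sketched in Python and checked against the example above. This is a simple quadratic-time illustration under stated assumptions, not the linear-time implementation referred to in Remark 2: the function name `encode_forest` and its input format are hypothetical, the forest is assumed to be valid with s ≥ 1, and the last linking vertex is taken out of the tuple to become r, as the example suggests.

```python
def encode_forest(n, roots, hyperedges):
    """Return (R, r, P, N) for a forest of rooted b-uniform hypertrees on 1..n."""
    edges = [frozenset(e) for e in hyperedges]
    roots = set(roots)
    # incident[v] = indices of the not-yet-pruned hyperedges containing v
    incident = {v: set() for v in range(1, n + 1)}
    for i, e in enumerate(edges):
        for v in e:
            incident[v].add(i)
    remaining = set(range(len(edges)))
    blocks, links = [], []
    while remaining:
        candidates = []
        for i in remaining:
            # Non-root vertices of e_i that belong to no other remaining hyperedge.
            pendant = sorted(v for v in edges[i]
                             if len(incident[v]) == 1 and v not in roots)
            if len(pendant) == len(edges[i]) - 1:      # e_i carries a leaf
                (link,) = edges[i] - set(pendant)      # vertex linking the leaf
                candidates.append((pendant, link, i))
        leaf, link, i = min(candidates)    # smallest leaf, lexicographic order
        blocks.append(tuple(leaf))
        links.append(link)
        remaining.discard(i)
        for v in edges[i]:
            incident[v].discard(i)
    r = links.pop()                        # the last linking vertex is a root
    return roots, r, sorted(blocks), tuple(links)
```

On the forest of Fig. 1 the leaves are pruned in the order {1, 22}, {2, 17}, {3, 19}, {10, 15}, {12, 14}, {4, 8}, {11, 18}, {20, 21}, {6, 7}, which reproduces the tuple N and the root r = 13 given above.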

Next, Algorithm 2 takes a given 4-tuple (R, r, P, N) as input and returns a forest of (labelled) rooted b-uniform hypertrees as output.

Within the loop of Algorithm 2, the forest F is found with no ambiguity by choosing the smallest leaf in the lexicographical order.
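The decoding loop can be sketched in the same way. Again this is only an illustrative quadratic-time version with a hypothetical function name, `decode_forest`; it assumes a valid code with s ≥ 1. A block of P is eligible only if none of its vertices occurs in the part of N not yet consumed, the current first vertex included.

```python
from collections import Counter

def decode_forest(R, r, P, N):
    """Rebuild the roots and hyperedges of the forest coded by (R, r, P, N)."""
    blocks = sorted(tuple(sorted(block)) for block in P)
    still_in_N = Counter(N)        # multiplicities in the unread part of N
    hyperedges = []
    for link in N:
        # First block, in lexicographic order, with no vertex still in N.
        index = next(i for i, block in enumerate(blocks)
                     if all(still_in_N[v] == 0 for v in block))
        hyperedges.append(frozenset(blocks.pop(index)) | {link})
        still_in_N[link] -= 1      # the first vertex of N is now consumed
    # The last remaining block is attached to the root r.
    hyperedges.append(frozenset(blocks.pop()) | {r})
    return set(R), hyperedges
```

Applied to the 4-tuple of the example, the sketch returns the nine hyperedges of E, the last one built being {6, 7, 13}.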

Remark 2. Encoding Algorithm 1 and decoding Algorithm 2 both run in optimal linear time, using as few as n − 2 integers in the range [1, n]. This complexity result is a direct consequence of the Prüfer-like encoding algorithm designed in [17] (see Prüfer’s Theorem in [13]). The algorithm computes the partial order on the hyperedges of the hypertree in linear time by defining a directed acyclic graph (DAG) with vertex set E, where each vertex represents a hyperedge of H. (The proof is completed in [17, Lemma 2].)

Figure 1: A forest of 4 (labelled) rooted hypertrees.

2.3 Enumeration of forests, hypertrees and trees

From Algorithm 1, we obtain the enumeration of forests with (k + 1) (labelled) rooted uniform hypertrees and s hyperedges.

Theorem 1. The number of forests having (k + 1) (labelled) rooted b-uniform hypertrees and s hyperedges is
$$\binom{n}{k+1}\,(k+1)\,\frac{(n-k-1)!}{s!\,(b-1)!^{s}}\;n^{s-1} \;=\; \frac{n!}{k!}\,\frac{n^{s-1}}{s!\,(b-1)!^{s}}, \tag{1}$$
where the number of vertices is n ≡ n(s) = s(b − 1) + k + 1.

Proof. The proof stems directly from the one-to-one correspondence constructed in Algorithm 1, which codes forests F with the set of 4-tuples (R, r, P, N) defined as in Definition 1. Indeed, the number of forests of (k + 1) (labelled) rooted b-uniform hypertrees and s hyperedges is the product of the numbers of possible choices for R, for the root r, for P and for N.

Now, there are $\binom{n}{k+1}$ choices for R, (k + 1) choices for the root r, $\frac{(n-k-1)!}{s!\,(b-1)!^{s}}$ choices for P and $n^{s-1}$ choices for N. So, after simplifications, Theorem 1 follows.
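Formula (1) can be sanity-checked on small cases. The Python sketch below evaluates both sides of (1) and compares them with an exhaustive count; the function names are hypothetical and s ≥ 1 is assumed. The exhaustive count relies on the following observation: every component has excess at least −1 and the excesses sum to s(b − 1) − n = −(k + 1), so a set of s hyperedges on n vertices is a forest exactly when it has k + 1 components, and rooting it amounts to choosing one vertex per component.

```python
from collections import Counter
from itertools import combinations
from math import comb, factorial, prod

def forest_count_formula(b, s, k):
    """Both sides of Eq. (1); they are asserted to agree."""
    n = s * (b - 1) + k + 1
    block_partitions = factorial(n - k - 1) // (factorial(s) * factorial(b - 1) ** s)
    left = comb(n, k + 1) * (k + 1) * block_partitions * n ** (s - 1)
    right = (factorial(n) * n ** (s - 1)
             // (factorial(k) * factorial(s) * factorial(b - 1) ** s))
    assert left == right
    return left

def forest_count_bruteforce(b, s, k):
    """Count rooted forests by trying every set of s hyperedges of size b."""
    n = s * (b - 1) + k + 1
    total = 0
    for edges in combinations(combinations(range(n), b), s):
        parent = list(range(n))                # union-find over the vertices

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for e in edges:
            for v in e[1:]:
                parent[find(v)] = find(e[0])
        sizes = Counter(find(v) for v in range(n))
        if len(sizes) == k + 1:                # forest of k + 1 hypertrees
            total += prod(sizes.values())      # one root per component
    return total
```

For b = 3, s = 2 and k = 0 (n = 5), both functions return 75: there are 15 pairs of triples sharing exactly one vertex, and each resulting hypertree can be rooted in 5 ways.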

Setting k = 0 in the above Eq. (1) (Theorem 1) yields $\frac{n!\,n^{s-1}}{s!\,(b-1)!^{s}}$. Whence Corollary 1.

Algorithm 2 Decoding to a forest of rooted hypertrees.

Input: Integers n, k and s meeting the condition n = s(b − 1) + k + 1 and the coding (R, r, P, N) of F, as in Definition 1.

Output: Forest F of rooted b-uniform hypertrees, whose k + 1 roots are the vertices of R.

1. Repeat

2.     Build one hyperedge with the first vertex in the (s − 1)-tuple N and the vertices of the first set of P (with respect to the lexicographic order) having no vertex still in the remaining tuple N.

3.     Remove the above set from P and delete the first vertex from the tuple N.

4. Until the tuple N is empty.

5. Build the s-th hyperedge with the last subset of P and the vertex r.

6. Return the forest F obtained.

Corollary 1. The number of (labelled) rooted b-uniform hypertrees with s hyperedges is
$$\frac{(n-1)!}{s!\,(b-1)!^{s}}\;n^{s},$$
where the number of vertices is n ≡ n(s) = s(b − 1) + 1.

Note that Corollary 1 is a generalization of Cayley’s Theorem enumerating rooted trees on n vertices: whenever b = 2 (and thus s = n − 1), it shows that, for any n ≥ 1, the number of (nonplane labelled) rooted trees on n vertices is $n^{n-1}$ [3].
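The specialization to b = 2 is easy to confirm numerically. In the short Python sketch below, the function name `rooted_hypertree_count` is hypothetical; it evaluates the expression of Corollary 1 with exact integer arithmetic.

```python
from math import factorial

def rooted_hypertree_count(b, s):
    """Corollary 1: (n - 1)! * n^s / (s! * ((b - 1)!)^s) with n = s(b - 1) + 1."""
    n = s * (b - 1) + 1
    numerator = factorial(n - 1) * n ** s
    denominator = factorial(s) * factorial(b - 1) ** s
    assert numerator % denominator == 0        # the count is an integer
    return numerator // denominator
```

With b = 2 and s = n − 1 the factorials cancel and the function returns $n^{n-1}$, e.g. 9 rooted trees on 3 vertices; with b = 3 and s = 1 it returns 3, the three possible roots of the single hyperedge {1, 2, 3}.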

One of the advantages offered by a bijective proof is the possibility of performing random generation and of learning some characteristic properties of the structures in F [14].

A generalization of Subsection 2.3 also gives an explicit expression for the number of uniform hypercycles, and thereby an alternative proof of Selivanov’s 1972 enumeration result.

3 Enumeration of uniform hypercycles

Together with hypertrees, hypercycles are the simplest structures and they have excess 0.

Along the same lines as in Theorem 1, the following bijective proof gives the enumeration formula (first given by Selivanov in [16]) of the uniform hypercycles in a forest F.

Theorem 2. [16] The number of b-uniform hypercycles with s hyperedges is
$$\frac{(b-1)\,n!\;n^{s-1}}{2\,(b-1)!^{s}} \sum_{j=2}^{s} \frac{j}{s^{j}\,(s-j)!} \;=\; \frac{(b-1)\,n!\;n^{s-1}}{2\,(b-1)!^{s}}\;\frac{1}{s\,(s-2)!},$$
where the number of vertices is n ≡ n(s) = s(b − 1).

Proof. A hypercycle having a cycle of length j corresponds to a forest F of j(b − 1) rooted hypertrees, up to an arrangement of the hypertrees in all distinct ways of forming a cycle. F has (s − j) hyperedges and j(b − 1) components to be arranged into a cycle.

The number of b-uniform hypercycles having a cycle of length j (2 ≤ j ≤ s) and with s hyperedges is thus
$$\left[ \binom{n}{j(b-1)}\,\frac{\big((s-j)(b-1)\big)!}{(s-j)!\,(b-1)!^{s-j}} \right] \left[ \frac{1}{2}\,\frac{\big(j(b-1)\big)!}{(b-2)!^{j}} \right], \tag{2}$$
where n ≡ n(s) = s(b − 1).

Indeed, in the above Eq. (2), the left-hand factor (between brackets) counts the number of forests of rooted b-uniform hypertrees, while the right-hand one counts the number of smooth hypercycles with j distinct hyperedges, and thus labelled with the set {1, . . . , j(b − 1)}.

Now, $(b-1)!^{-s+j}\,(b-2)!^{-j} = (b-1)!^{j}\,(b-1)!^{-s}$ and, since n = s(b − 1), we have (s − j)(b − 1) = n − j(b − 1). So, Eq. (2) simplifies to
$$\frac{n!\,(b-1)}{2\,(b-1)!^{s}}\;\frac{(b-1)!^{j}}{(s-j)!},$$
with j ranging from 2 to s.

Finally, substituting $n^{s-1}/s^{j}$ for $(b-1)!^{j}$ and summing on 2 ≤ j ≤ s gives the finite sum in Theorem 2:
$$\sum_{j=2}^{s} \frac{j}{s^{j}\,(s-j)!} = \frac{1}{s\,(s-2)!},$$
and the result follows.
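The closed form of the finite sum over the cycle lengths can be verified with exact rational arithmetic. The following Python sketch uses hypothetical function names and only checks this last identity, not the whole derivation. One way to see why it holds is that each term telescopes: for j < s, $j/(s^{j}(s-j)!) = 1/(s^{j-1}(s-j)!) - 1/(s^{j}(s-j-1)!)$, while the term j = s equals $1/s^{s-1}$.

```python
from fractions import Fraction
from math import factorial

def cycle_length_sum(s):
    """Left-hand side: sum over j = 2..s of j / (s^j * (s - j)!)."""
    return sum(Fraction(j, s ** j * factorial(s - j)) for j in range(2, s + 1))

def closed_form(s):
    """Right-hand side: 1 / (s * (s - 2)!)."""
    return Fraction(1, s * factorial(s - 2))
```

For s = 4, the three terms are 1/16, 3/64 and 1/64, and they add up to 1/8 = 1/(4 · 2!).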

4 Conclusions and further results

In the above proof of Theorem 2 we are led to distinguish hypercycles according to the lengths of their cycles. Therefore, a question arises: for a given number n = s(b − 1) of vertices, what is the cycle length j of the class that contributes most to the number of such hypercycles?

It is shown in [17] that there exist at most
$$\frac{n^{n-2} - f(n,b)}{(b-1)^{(b-2)(n-1)/(b-1)}}$$
distinct labelled b-uniform hypertrees, where f(n, b) is a lower bound on the number of labelled trees of maximal (vertex) degree exceeding $\Delta = (b-1) + \frac{n-1}{b-1} - 2$. In view of extending this result, can we determine a lower bound on the number of labelled trees with no constraint on their maximal (vertex) degree or, at least, with maximal degree exceeding some $\Delta' < \Delta$? This could be done, for example, by designing some generalized counting techniques based on a bijective or analytic enumeration of b-uniform hypertrees (see also [15, 17]).

In the spirit of [6], some potential applications of Prüfer-like codes may also arise as fruitful directions of research.

Encoding algorithms (such as the present one or the algorithm designed in [17]) can be used to generate unique identities (IDs) or PINs. By generating distinct hypertrees with combinatorial enumeration methods, it is possible to compute distinct codes for each of these hypertrees using such encoding algorithms. All codes generated this way would be distinct and can provide unique IDs. The advantage of the scheme is that no check for repetitions is needed, since IDs generated from distinct hypertrees are unique. Besides, the generation of such codes requires time proportional to the length of the code.

Coding schemes can also be useful for allocating IDs to different users in a system where disjoint sets of users form different groups. Each group is associated with a distinct hypertree, whereas the users within a group are allocated distinct codes of the same hypertree associated with the group. Subgroups can be realized by another level of Prüfer-like encoding. The actual implementation of such group management schemes is an open direction of research (see [17]).

References

[1] C. Berge, Graphs and Hypergraphs, New York, Elsevier, 1973.

[2] S. Caminiti, I. Finocchi, R. Petreschi, On coding labelled trees, Theoretical Computer Science, 382:97-108, 2007.

[3] A. Cayley, A Theorem on Trees, Quart. J. Math. Oxford Series, 23:376-378, 1889.

[4] H.C. Chen, Y.L. Wang, An efficient algorithm for generating Prüfer codes from labelled trees, Theory of Computing Systems, 33:97-105, 2000.

[5] W. Edelson, M.L. Gargano, Modified Prüfer code: O(n) implementation, Graph Theory Notes of New York, 40:37-39, 2001.

[6] P. Flajolet, R. Sedgewick, Analytic Combinatorics, Cambridge University Press, 2009.

[7] A. Joyal, Une théorie combinatoire des séries formelles, Adv. in Math., 42:1-82, 1981.

[8] M. Karonski, T. Luczak, The number of sparsely edged uniform hypergraphs, Discrete Math., 171:153-167, 1997.

[9] D.E. Knuth, The Art of Computer Programming (vol. 1-3), 2nd Edition, Addison Wesley, 1973.

[10] G. Labelle, Une nouvelle démonstration combinatoire des formules d’inversion de Lagrange, Adv. in Math., 42:217-247, 1981.

[11] J.W. Moon, Various proofs of Cayley’s Formula for Counting Trees, A Seminar on Graph Theory, p. 70-78, 1967.

[12] A. Nijenhuis, H.S. Wilf, Combinatorial Algorithms, Academic Press, New York, 1978.

[13] H. Prüfer, Neuer Beweis eines Satzes über Permutationen, Arch. Math. Phys., 27:742-744, 1918.

[14] V. Ravelomanana, A. L. Rijamamy, Creation and Growth of Components in a Random Hypergraph Process, Proc. of Cocoon 2006, LNCS 4112:350-359, 2006.

[15] C. Rényi, A. Rényi, The Prüfer code for k-trees, in: Combinatorial Theory and its Applications III, p. 945-971, Erdős, Rényi, Sós (eds.), North-Holland, 1970.