Optimal succinct representations of planar maps
Luca Castelli Aleardi
(joint work with Olivier Devillers and Gilles Schaeffer)
6th june 2006, SoCG 2006
Projet Geometrica LIX
INRIA Sophia-Antipolis Ecole Polytechnique
Succinct and compact representations
Given a class C m of objects of size m , the goal is to design a space efficient data structure such that:
• queries on objects are answered in constant time;
• the encoding is succinct : the cost of an object R ∈ C m
matches asymptotically the entropy of the class size ( R ) = log 2 k C m k(1 + o (1))
• or compact : we content of a cost
size ( R ) = O (k C m k)
• for dynamic data structures: updates are supported in
O (lg c m ) amortized time
Compact representations: an example
Rooted trees with n vertices
enumeration of binary trees with n vertices:
kB n k = 1 n + 1
2 n
n
≈ 2 2 n n −
32(1)
Example
Optimal (implicit) encoding for compression
• size: log 2 kB n k = 2n + O(lg n) bits
• no efficient navigation
11101010001101001100
Example: explicit representation
Pointers-based representation
• size: 2n lg n bits
• constant time navigation between nodes
C
lef tC
right/
C
rightparent
C
lef tC
rightparent
/ /
parent
/ /
parent C
lef tC
rightparent
/ /
parent
/ /
parent /
lg n
lg n lg n
lg n
lg n
Compact and succinct representations
(Jacobson Focs89, Munro et Raman Focs97)
• size: O(n) bits - adjacency queries in O(lg n) time
• size: 2 n + o ( n ) bits - adjacency queries in O (1) time
01100 01000 11010
11101
lg n
Motivation
Combinatorial information describing incidence relations
Which information?
Connectivity
Motivation
Geometry information (vertex coordinates)
Geometric object
Geometric information
x y z
between 30 and 96 bits (192 bits ?)
Motivation
Usual triangle-based mesh representation
Geometric object
Connectivity information Geometric information
vertex:
triangle:
1 reference to triangle 3 references to vertices 3 references to triangles
log n to 32 bits
Motivation
Mesh compression algorithms
Touma Gotsman Edgebreaker [Rossignac]
Poulhalon Schaeffer General underlying idea
Encoding strategies based on a local (global) conquest
Graph encoding
(Similarities between spanning-tree based methods, Isenburg and Snoeyink)
Edgebreaker
CCCRCCRCCRECRRELCRE
C CC
R C
C R
C C R
S C
R R L E
C E R
V5V5V6V5V4V5V8V5V5V4S4V3V4
T ouma Gotsman(′98)
Canonical orderings and multiple parentheses Chuang et al. (Icalp 98) - Chiang et al. (Soda 01)
( [ [ [ ) ( ] ( ] ( ] [ [ [ ) [ ) ( ] ] [ ) . . .
1101000110000010010000011001000000000
P oulalhon Schaef f er(2003)
Motivation
Mesh compression algorithms
Mesh
comp ress ion
Compact representation
VRML, 288 or 114 bits/vertex
[Touma Gotsman] 2bits/vertex (near-optimal)
Pointer based representation: 208 bits/triangle 2.175 bits/triangle
[Poulalhon Schaeffer] 3.24bits/vertex (optimal)
Tutte’s entropy (triangulations)
(information theory asymptotic lower bound)
enumeration of rooted planar triangulations on n vertices:
Ψ n = 2(4 n + 1)!
(3n + 2)!(n + 1)! ≈ 16 27
r 3
2π n − 5 / 2 ( 256 27 ) n Tutte’s entropy (1962):
e = 1
n log 2 Ψ n ≈ log 2 ( 256
27 ) ≈ 3 . 2451 bits/vertex
Planar Triangulations with a boundary
n + 1 internal vertices, m = 2 n + k faces f ( n, k ) = 2 · (2 k − 3)! (2 k + 4 n − 1)!
( k − 1)! ( k − 3)! ( n + 1)! (2 k + 3 n )!
f ′ (m, k) = 2 · (2k − 3)! (2m − 1)!
(k − 1)! (k − 3)! ( m 2 − k + 1)!
counting planar triangulations with m faces F ( m ) = lg(
m
X
k ≥ 3
f ′ ( m, k )) ≈ 2 . 175 m
3 . 24 bits/vertex = 1 . 62 bits/face < 2 . 17 bits/face
Triangulations with a boundary
(Castelli Aleardi, Devillers and Schaeffer, WADS 2005) Theorem. For planar triangulations with a boundary having m faces, there exists an optimal succinct
representation supporting efficient navigation in O (1) time, requiring
2 . 175 m + O ( m lg lg m
lg m ) = 2 . 175 m + o ( m ) bits
For triangulations of genus g surfaces ( g = o ( lg m m ) ) the same representation requires
2.175m + 36(g − 1) lg m + O(m lg lg m
lg m + g lg lg m) bits
Comparison: space efficiency
graphs with n vertices, e edges, m faces, genus g
Encoding queries 3-conn. triangl. g > 0 dynamic
Jacobson (Focs89) O (lg n ) no
Munro Raman (Focs97) O(1) 8n + 2e 7m no Chuang et al. (Icalp98) O (1) 2 e + 2 n 3 . 5 m no Chiang et al. (Soda01) O (1) 2 e + 2 n 4 m no Blandford et al. (Soda03) O (1) O ( n ) O ( m ) O ( n ) Castelli Devillers Scha-
effer(Wads Cccg05)
O (1) no 2 . 175 m 2 . 175 m
our new encodings O (1) 2 e 1 . 62 m no
Literary digression ”La le¸con” , (Eug` ene Ionesco, 1951)
During a private lesson, a very young student, preparing herself for the total doctorate, talks about arithmetics with her teacher.
(teacher) If you cannot deeply understand these principles,
these arithmetic archetypes, you will never perform correctly a
”polytechnicien” job. For example, what is 3 . 755 . 918 . 261 multiplied by 5 . 162 . 303 . 508?
(student, very quickly) The result is 193891900145...
(teacher, very astonished): How have you computed it, if you do not know the principles of arithmetic reasoning?
(student) Simple: I have learned by heart all possible results of
all possible multiplications.
Decomposing T into sub-triangulations
• compute small triangulations of size bewteen 1 3 lg 2 m and lg 2 m ;
• decompose into tiny triangulations of size between
1
12 lg m and 1 4 lg m .
Initial triangulation with m triangles Tiny triangulations with Θ(lgm) triangles Small triangulations with Θ(lg2m) triangles
Decomposing T into sub-triangulations
• we compute small triangulations having between 1 3 lg 2 m and lg 2 m triangles;
• we decompose small triangulations into tiny
triangulations containing between 12 1 lg m and 1 4 lg m triangles.
Graph G linking tiny triangulations having Θ(lgmm) nodes
A sub-graph Gi linking Θ(lgm) tiny triangulations lying in the same small
Graph F linking small triangulations with Θ(lgm2m) nodes
Overview: representation of a small triangulation
• adjacency relations are described by map G i ;
• internal connectivity is implicitly represented (variable size pointers)
• boundary neighboring relations are represented by boundary coloring (variable length bit-vector)
pointer to tiny triangulation of size t
pointer to rank-select of size b and weight s t, b, s
split small into tiny
first vertex
0
1 0
0 1 1
1
1 0
0
0 0
0
Graph G i linking adjacent tiny triangulations
• G i has a node for each tiny triangulation and an arc for each pair of adjacent tiny triangulations;
• G i is a planar map, having faces of degree at least 3 ,
multiple edges and loops are allowed;
i adjacency relations between tiny triangulations
• Because of Euler’s formula, the overall number of arcs in maps G i is:
X
i
k E ( G i )k = O ( m
lg m )
Total space used
• Catalog of all different tiny triangulations
O(m
412 . 17 lg 2 m lg lg m) = o(m)
• catalog of bit-vectors (with Rank/Select)
O ( m
412 . 17 lg m lg lg m ) = o ( m )
• representation of graph F : O ( m
lg
2m lg m ) = o ( m )
• graphs G i
2.17m + O(m lg lg m
lg m )
Optimality
Our contribution
Proposition ( Castelli Aleardi, Devillers and Schaeffer) .
For planar maps (triangulations and 3 -connected graphs) it is possible to design two optimal succinct representations, supporting efficient navigation in O(1) time, whose storage requirements match asymptotically the entropy of the
respective classes. More precisely, our representations use:
• 1 . 62 m bits, for triangulations (without boundary) with m faces;
• 2 e bits, for 3 -connected graphs with e edges.
Optimal encodings
(Poulalhon/Schaeffer 03, Fusy/Poulalhon/Schaeffer 05)
Original local closures
New ”color-based” closures
11010001010000100000000000 00100100100001 nnodes, 2nstems
nnodes, 2nstems nnodes, 2nstems
nnodes, 2nstems nnodes, 2nstems nnodes, 2nstems nnodes, 2nstems
”color-based” closure for 3 -connected graphs
”color-based” closure for triangulations
11010001010000100000000000 00100100100001
n nodes, 2n stems
n nodes, 2n stems n nodes, 2n stems
n nodes, 2n stems
n nodes, 2n stems n nodes, 2n stems n nodes, 2n stems n nodes, 2n stems
New splitting strategies
Dominant cost analysis
The Catalog A r contains all triangulations with r triangles A reference to an element in A r costs 2 . 17 r bits
k A k = 2 2.17r
r 2.17r bits reference
Dominant cost analysis
Catalog A r,w : all trees with r nodes, 2 r stems and w false stems A reference to an element in A r,w costs 2 . 17 r + w lg r bits
k A r,w k = 2 3 . 24 r ·
2 w r
3.24r + w lg r bits index
Future and related works
• Design of a practical efficient implementation (with A.
Mebarki and O. Devillers)
• extension: realizers, canonical orderings and ”spanning
tree” encodings for higher genus surfaces
Graph encodings
(higher genus triangulations)
Edgebreaker
CCCRCCRCCRECRRELCRE
C CC
R C
C R
C C R
S C
R R L E
C E R
V5V5V6V5V4V5V8V5V5V4S4V3V4
T ouma Gotsman(′98)
Edgebreaker on a torus
Canonical orderings and multiple parentheses Chuang et al. (Icalp 98) - Chiang et al. (Soda 01)
( [ [ [ ) ( ] ( ] ( ] [ [ [ ) [ ) ( ] ] [ ) . . .
1101000110000010010000011001000000000
P oulalhon Schaef f er(2003)