Multi-Scale Synthesis of Large-Scale Traces
Aggregation/disaggregation process
Robin Lamarche-Perrin (1,3), Lucas Schnorr (1,2), Jean-Marc Vincent (1,2)
(1) Laboratoire LIG, Université Joseph Fourier; (2) MESCAL INRIA/LIG Team; (3) MAGMA LIG Team
2012 December 10
Outline
1 Context
2 Measure
3 Dynamic aggregation
4 Experiments
5 Future Work
1 Context
Comprehensible Representation
Space explosion
Comprehensible Representation (2)
[Figure: per-thread timeline of a JVM application; labels: Server, Trader, Client, thread, JVM, time; phases: phase 1, phases 2 and 3, phase 4]
Folding information
Folding information (2)
Aggregation/Clustering
Data clustering approach
- Similarity of objects ⇒ distance function; semantics of the function
- Many methods (k-means, hierarchical, ...)
- Level of clustering
Aggregation approach
- External information ⇒ hierarchy, topology, ...
- Information loss estimation
Objective
Goal 1
Provide a measure of the quality of partial aggregations
Goal 2
Provide an interactive synthetic representation of large-scale data with partial multi-level aggregations
2 Measure
Aggregation
P
Cluster A:        12 10 10 12 11 14 11 12  9
Cluster B:         5  5 17  2 13  6 20 19 13
Aggregate:        100
Normalized agg.:  11 11 11 11 11 11 11 11 11
Q' Q
Quality estimate of an aggregation function
Goal
- comparison of aggregations: criteria
- composition: dynamic aggregation process
- semantics: related to an extra structure
Entropy : Measure of Homogeneity/Disorder
$$H = -\sum_k \frac{|s_k|}{|S|} \log_2 \frac{|s_k|}{|S|} = \sum_k p_k \log_2 \frac{1}{p_k} \qquad (1)$$
[Figure: entropy for a two-state system; x-axis: proportion of state 1, p (0 to 1); y-axis: bitrate (0 to 1.2)]
Quantity of information to code the system
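As an illustration of the entropy measure and of the two-state curve, here is a minimal Python sketch. The function names `entropy` and `two_state_entropy` are hypothetical and chosen for this example only; the state sizes play the role of |s_k| and their sum the role of |S|.

```python
import math

def entropy(sizes):
    """Shannon entropy, in bits, of a population split into states of the given sizes."""
    total = sum(sizes)
    # Empty states contribute nothing: p * log2(1/p) tends to 0 as p tends to 0.
    return sum((s / total) * math.log2(total / s) for s in sizes if s > 0)

def two_state_entropy(p):
    """Entropy of a two-state system where state 1 has proportion p."""
    return entropy([p, 1.0 - p])

for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p = {p:.1f}  H = {two_state_entropy(p):.3f} bits")
```

The curve is symmetric around p = 0.5, where it reaches 1 bit, and falls to 0 bits when one of the two states never occurs.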
Entropy Properties
Characteristics
- $H \ge 0$; $H(p) = 0 \Rightarrow$ deterministic system
- $H(p) \le \log_2 n$; $H(p) = \log_2 n \Rightarrow$ uniform system
- Independence property
- Conditioning
Entropy Gain
$G = H_{\mathrm{micro}} - H_{\mathrm{macro}}$
- $G \ge 0$
- $G = 0$ (no aggregation or deterministic micro-system)
- maximal if one aggregate
- Composition property
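The entropy gain can be sketched in a few lines of Python. This assumes that the macro entropy is the entropy of the distribution obtained by summing the micro probabilities inside each aggregate, which is consistent with the properties listed above; the names `entropy_gain` and `partition` are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def entropy_gain(p_micro, partition):
    """G = H_micro - H_macro for a partition of the micro states.

    `partition` is a list of aggregates, each aggregate being a list of
    indices into `p_micro` (assumed representation, for illustration only).
    """
    p_macro = [sum(p_micro[k] for k in aggregate) for aggregate in partition]
    return entropy(p_micro) - entropy(p_macro)

p_micro = [0.25, 0.25, 0.25, 0.25]
print(entropy_gain(p_micro, [[0], [1], [2], [3]]))  # no aggregation
print(entropy_gain(p_micro, [[0, 1], [2, 3]]))      # two aggregates
print(entropy_gain(p_micro, [[0, 1, 2, 3]]))        # one aggregate
```

On this uniform four-state example the gain grows from 0 bits (no aggregation) to 2 bits (a single aggregate), which equals the micro entropy.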
Divergence
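$$D(p_{\mathrm{micro}} \,\|\, p_{\mathrm{macro}}) = \sum_k p_{\mathrm{micro}}(k) \log_2 \frac{p_{\mathrm{micro}}(k)}{p_{\mathrm{macro}}(k)}$$
Uniform distribution on the aggregate
$$p_{\mathrm{macro}}(x) = \frac{1}{|A(x)|} \sum_{k \in A(x)} p(k)$$
- $D = 0$ (no aggregation or uniform distribution)
- $D_{\max} = H_{\mathrm{Uniform}} - H_{\mathrm{micro}}$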
Quantity of information to re-code the system
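The divergence can be illustrated on the two clusters of the aggregation example. The sketch below assumes that D is the Kullback-Leibler divergence of the micro distribution from a macro distribution that is uniform inside each aggregate, and that the cluster values can be turned into probabilities by dividing by their sum; `divergence` and `normalize` are hypothetical helper names.

```python
import math

def normalize(values):
    """Turn raw values into a probability distribution (assumed preprocessing)."""
    total = sum(values)
    return [v / total for v in values]

def divergence(p_micro, partition):
    """Divergence in bits between p_micro and its per-aggregate uniform approximation."""
    d = 0.0
    for aggregate in partition:
        # Inside an aggregate, every element gets the mean probability.
        mean = sum(p_micro[k] for k in aggregate) / len(aggregate)
        for k in aggregate:
            if p_micro[k] > 0:
                d += p_micro[k] * math.log2(p_micro[k] / mean)
    return d

cluster_a = [12, 10, 10, 12, 11, 14, 11, 12, 9]
cluster_b = [5, 5, 17, 2, 13, 6, 20, 19, 13]
one_aggregate = [list(range(9))]

d_a = divergence(normalize(cluster_a), one_aggregate)
d_b = divergence(normalize(cluster_b), one_aggregate)
print(f"D(cluster A) = {d_a:.4f} bits, D(cluster B) = {d_b:.4f} bits")
```

Both clusters have nearly the same aggregate, but the homogeneous cluster A loses far less information than the heterogeneous cluster B when replaced by its normalized aggregate.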
3 Dynamic aggregation
Dynamic multi-level aggregation
Combination Entropy Gain/Divergence
Tradeoff between aggregation and quantity of information
$\mathrm{RIC} = G - D$ (Relative Information Criterion)
Parametrized Information Criterion
$\mathrm{PRIC} = pG - (1-p)D$
- $p = 0$: no aggregation
- $p = 1$: maximal aggregation
Evolution as a function of $p$
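One possible way to use the parametrized criterion on a hierarchy is sketched below. It is not necessarily the procedure implemented in Triva: it assumes that gain and divergence add up over aggregates, that a hierarchy can be written as nested Python lists whose leaves are micro probabilities, and that a node is aggregated only when this scores strictly better than the best cut of its children. The names `leaves`, `pric`, and `best_partition` are hypothetical.

```python
import math

def leaves(node):
    """Flatten a nested-list hierarchy into its leaf probabilities."""
    if isinstance(node, list):
        return [x for child in node for x in leaves(child)]
    return [node]

def pric(values, p):
    """p*G - (1-p)*D for the single aggregate made of `values`."""
    total = sum(values)
    if total == 0 or len(values) == 1:
        return 0.0
    h_parts = sum(v * math.log2(1.0 / v) for v in values if v > 0)
    gain = h_parts - total * math.log2(1.0 / total)
    mean = total / len(values)
    div = sum(v * math.log2(v / mean) for v in values if v > 0)
    return p * gain - (1.0 - p) * div

def best_partition(node, p):
    """Return (score, parts): the best-scoring cut of the hierarchy under `node`."""
    if not isinstance(node, list):
        return 0.0, [[node]]
    whole = leaves(node)
    agg_score = pric(whole, p)
    child_score, child_parts = 0.0, []
    for child in node:
        score, parts = best_partition(child, p)
        child_score += score
        child_parts.extend(parts)
    # Ties keep the more detailed representation.
    if agg_score > child_score:
        return agg_score, [whole]
    return child_score, child_parts

# Two machines with two processes each: one homogeneous, one heterogeneous.
hierarchy = [[0.25, 0.25], [0.05, 0.45]]
for p in (0.0, 0.1, 1.0):
    print(p, best_partition(hierarchy, p)[1])
```

With this example, p = 0 keeps every process separate, p = 1 collapses everything into one aggregate, and p = 0.1 aggregates only the homogeneous machine while leaving the heterogeneous one detailed.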
Multi-level aggregation: Triva Application/Demo
TRIVA project
Aggregation and visualization of distributed systems
Clusters
Machines
Processes
Which level of aggregation should be considered?
Which part of the hierarchy should be displayed?
?
4 Experiments
Aggregations within a Hierarchy
Experiments
[Figure: treemaps A, A.1, A.2, A.3, B, and C]
A: Hierarchy: Site (5) - Cluster (9) - Machine (188) - Process (188)
A.1: Cluster level
A.2: Site level
A.3: Full aggregation
B: Ratio Gain/Loss with P = 10%
C: Ratio Gain/Loss with P = 40%
Scenario with 188 processes, grouped by 9 clusters and 5 sites (Treemaps A, A.1, A.2, and A.3) and with two values of P (Treemaps B and C); when the ratio gain/loss is 10% (treemap B), everything is aggregated but the
Experiments
[Figure: treemaps A, A.1, A.2, A.3, B, and C]
A: Hierarchy: Cluster (3) - Machine (50) - Process (433)
A.1: Machine level
A.2: Cluster level
A.3: Full aggregation
B: Ratio Gain/Loss with P = 10%
C: Ratio Gain/Loss with P = 30%
Scenario with 433 processes, grouped by 50 machines and 3 clusters (treemaps A, A.1, A.2, and A.3) and with two values of P (treemaps B and C);
Experiments
[Figure: treemap A with details A.1, A.2, A.3; treemap B with details B.1, B.2, B.3, B.4]
A: Hierarchy: Site (10) - Super-Cluster (100) - Cluster (1000) - Machine (10000) - Process (1000000)
B: with P = 10%
Synthetic scenario with 1 million processes, grouped by 10000 machines, 1000 clusters, 100 super-clusters and 10 sites; treemap A shows the aggregated behavior of all processes for each machine; treemap B is configured with a gain/loss ratio of 10%, highlighting the heterogeneous
5 Future Work
Future Work
Modeling:
- qualitative state → quantitative state
- node aggregation → flow aggregation
- integration of spatial/temporal aggregation
Analysis tool:
- Visualization of aggregation quality
- Statistical tests (significance)
Algorithms:
- optimal aggregation (structure impact)
- dynamics of aggregates
Thank you very much for all your attention and collaboration