Internship offer spring/summer 2012
Distributed random sampling of the Internet graph
The Internet is formed by an interconnection of independent networks (Autonomous Systems), and its structure is not publicly available. We have to resort to measurements in order to characterize it. At LIP6, we run a system called TopHat [1] that continuously maps the Internet at the IP level using traceroute measurements. It is currently deployed over a thousand PlanetLab nodes around the world, from which probe traffic is sent. A major challenge for our platform is to capture and characterize Internet dynamics, which requires us to perform measurements at a high frequency. Unfortunately, this is practically impossible to achieve within a short interval of time because of the limited number of probes that can be sent. Previous approaches (such as DoubleTree [2]) have attempted to reduce measurement redundancy.
In this internship, we take another approach, which consists in sampling the observed graph. We propose to investigate how sampling approaches fit with our objective of measuring the properties of the graph, and to evaluate the performance of a distributed sampling approach. We are looking for a master's student in computer science or an engineering student who is curious about the design of the Internet and interested in investigating new approaches for its measurement.
Supervisors: Bruno Baynat, Timur Friedman (Assistant Professors), Jordan Augé, Marc-Olivier Buob (Research engineers)
Laboratory: UPMC/LIP6 – 4, place Jussieu - 75005 Paris
Duration: 5-6 months
Description of work
The Internet graph is typically constructed from a set of paths measured by a traceroute-like tool. Such a tool sends a series of probe packets from a source towards a set of destinations, with successive TTL values. When a probe's TTL expires, the router at which it expires generates a response that reveals the IP address of its ingress interface. We denote by the triple (s, d, ttl) the probe that discovers the ttl-th IP node on the path from s to d. Following the same approach, IP links are obtained by sending two probes with successive TTL values. Node and link queries will be our two primitives for discovering the Internet graph.
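As an illustration only, the two primitives can be modelled as follows. This is a minimal sketch over a hard-coded table of paths; the names PATHS, node_query and link_query, as well as the toy node labels, are hypothetical and are not taken from TopHat.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy data: the hops that probes from s towards d would reveal.
# Hop 1 is the first router after the source; the last hop is the destination.
PATHS: Dict[Tuple[str, str], List[str]] = {
    ("s1", "d1"): ["r1", "r2", "r3", "d1"],
    ("s1", "d2"): ["r1", "r4", "d2"],
}


def node_query(s: str, d: str, ttl: int) -> Optional[str]:
    """Probe (s, d, ttl): return the ttl-th node on the path from s to d.

    Returns None when the path is unknown or has no hop at that TTL.
    """
    path = PATHS.get((s, d), [])
    if 1 <= ttl <= len(path):
        return path[ttl - 1]
    return None


def link_query(s: str, d: str, ttl: int) -> Optional[Tuple[str, str]]:
    """Return the link between hops ttl and ttl + 1, using two node queries."""
    first = node_query(s, d, ttl)
    second = node_query(s, d, ttl + 1)
    if first is None or second is None:
        return None
    return (first, second)
```

In this model, a link query costs exactly two probes, so a probing budget can be counted directly in terms of (s, d, ttl) triples.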
Because of the short observation window needed to accurately capture Internet dynamics, it may not be feasible to probe towards all destinations every time (all the more so since the discovered graph will be large). Therefore, we need to rethink how we perform measurements. The approach we take here is to design our measurement process to capture fundamental properties of the Internet graph by selecting which probes we send, instead of trying to create the largest possible map, from which we might deduce biased information.
A possible start for this internship will be to analyze to what extent theoretical results on graph sampling fit our present requirements. Their performance will initially be evaluated by simulation, before considering further analytical work to validate the approach. The candidate will contribute to a simulator whose development has already started in the team. This tool will work both with graphs generated according to known models and with real data as measured by TopHat.
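To make the simulation idea concrete, here is a minimal sketch of such an experiment. It is not the team's simulator: the Erdős–Rényi model, the shortest-path routing, the uniform choice of probes, and all function names are assumptions made for illustration.

```python
import random
from collections import deque


def random_graph(n, p, rng):
    """Erdos-Renyi G(n, p) as an adjacency dict (a stand-in for a known model)."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def shortest_path(adj, s, d):
    """BFS path from s to d as a list of hops excluding s; None if unreachable."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == d:
            break
        for v in sorted(adj[u]):  # sorted for a deterministic route
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if d not in parent:
        return None
    path = []
    while d != s:
        path.append(d)
        d = parent[d]
    return path[::-1]


def sample_links(adj, sources, budget, rng):
    """Issue `budget` uniformly chosen link queries; return the observed links."""
    nodes = list(adj)
    observed = set()
    for _ in range(budget):
        s = rng.choice(sources)
        d = rng.choice(nodes)
        path = shortest_path(adj, s, d)
        if path is None or len(path) < 2:
            continue  # no link between two revealed hops on this path
        # Simplification: the TTL is drawn knowing the path length in advance.
        ttl = rng.randint(1, len(path) - 1)
        observed.add(frozenset((path[ttl - 1], path[ttl])))
    return observed


rng = random.Random(0)
adj = random_graph(50, 0.1, rng)
true_links = {frozenset((u, v)) for u in adj for v in adj[u]}
observed = sample_links(adj, sources=[0, 1, 2], budget=200, rng=rng)
print(len(observed), "of", len(true_links), "links observed")
```

The fraction of links observed is only a coverage measure. The question raised in this internship is rather how well properties of the graph can be estimated from such a sample, which would replace the final comparison with an estimator of the property under study.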
Such work will serve as a basis for better understanding the structure and properties of the Internet, and for further improving our distributed measurement infrastructure.
Skills
• Knowledge of IP networks and basic graph theory concepts
• Familiarity with a GNU/Linux environment
• Programming skills: C/C++ and scripting languages (such as Python); previous experience with the Boost (graph) library is a plus
• Previous experience with simulation and network data analysis is a plus
• Fluency in English
Contact and application
Please provide the following:
• A CV, in PDF format, in either English or French;
• A letter of motivation, in either English or French;
• The names of two references who can be contacted for letters of recommendation.
Bruno Baynat <[email protected]>, Timur Friedman <[email protected]>, Jordan Augé <[email protected]>, Marc-Olivier Buob <[email protected]>
Laboratoire LIP6-CNRS 4 place Jussieu
75005 Paris, France
Do not hesitate to contact us for more details.
References
[1] Thomas Bourgeau, Jordan Augé, and Timur Friedman, TopHat: supporting experiments through measurement infrastructure federation, in Proc. TridentCom 2010, 18-20 May 2010, Berlin, Germany.
[2] Benoît Donnet, Philippe Raoult, Timur Friedman, and Mark Crovella, Efficient algorithms for large-scale topology discovery, in Proc. ACM SIGMETRICS, Jun. 2005.