HAL Id: hal-01154566
https://hal.archives-ouvertes.fr/hal-01154566
Submitted on 22 May 2015
HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Puma: pooling unused memory in virtual machines for I/O intensive applications
Maxime Lorrillere, Julien Sopena, Sébastien Monnet, Pierre Sens
To cite this version:
Maxime Lorrillere, Julien Sopena, Sébastien Monnet, Pierre Sens. Puma: pooling unused memory in virtual machines for I/O intensive applications. The 8th ACM International Systems and Storage Conference, May 2015, Haifa, Israel. 2015. �hal-01154566�
Puma: Pooling Unused Memory in Virtual Machines for I/O intensive applications
Maxime Lorrillere, Julien Sopena, Sébastien Monnet and Pierre Sens
Sorbonne Universités, Université Pierre et Marie Curie, CNRS
Context: virtualization fragments available memory
VM1 Host 1 VM2 VM3 Host 2 VM4
virtio virtio
Swap
Applications
Hypervisor (KVM)
10GB Ethernet
Hypervisor (KVM)
Anonymous pages
Page cache
Virtualization allows more flexibility and isolation Problem: it fragments available memory
⇒ Memory cannot be reassigned as efficiently as CPU time
⇒ Unused memory (i.e. idle caches) is wasted
Existing solution: Memory Ballooning
⇒ Allows to dynamically resize VM’s memory
⇒ Cannot efficiently reclaim unused memory
⇒ Does not benefit of unused memory on other hosts
⇒ Slow to recover memory
VM1 VM2
virtio
BalloonBalloon I/O
Balloon Swap
Applications
Hypervisor (KVM)
Baseline Auto-ballooning
Our approach: Puma
Rely on a fast network between VMs and hosts
⇒ Puma can reuse unused memory of VMs hosted on different hosts
Handles clean cache pages
⇒ Writes are generally non-blocking
⇒ Simple consistency scheme
⇒ Fast to recover memory!
Exclusive and non-inclusive caching strategies
Puma design
2 basics operations
⇒ get(): gets a page from the [remote] page cache
⇒ put(): sends a victim page to the remote page cache
Local metadata with small memory footprint
⇒ amortized 64 bits/page, 2 MB of metadata per GB of cache
Pages are directly stored into the existing page cache
⇒ Memory is reclaimed naturally
Sequential I/O are detected and filtered
⇒ Disk bandwidth > network bandwidth
Network latency monitoring
⇒ Puma is throttled when the latency becomes too high
Architecture overview
PFRA
alloc()
VM1 VM2
P31
1
Metadata
31
P31
Reclaim 2
put(P31) 3
P31 4 4
Store page P31
5
put() operation
P24
Miss get(P24)
VM1 VM2
P24 1
Metadata
24
2 Hit?
req(P24) 3
Lookup 4
P24 P24
5
get() operation
Evaluation
Experiment setup
Active VM: 1 GB memory
Inactive VM: 512 MB→12 GB memory Network latency injection with
Netem [LCA’05]
Benchrmarks: Filebench, BLAST, TPC-C, TPC-H, Postmark
Puma
Conclusion and Future Work
Puma solves the page cache fragmentation problem
⇒ It is based on an efficient kernel-level remote caching mechanism
⇒ It handles clean cache pages to quickly recover the memory
⇒ It works with co-localised VMs and remote VMs
Ongoing work: detecting when a VM has unused memory
⇒ Our idea: toggle Puma service based on Linux’s active/inactive LRUs activity