Puma: pooling unused memory in virtual machines for I/O intensive applications

(1)

HAL Id: hal-01154566

https://hal.archives-ouvertes.fr/hal-01154566

Submitted on 22 May 2015

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Puma: pooling unused memory in virtual machines for I/O intensive applications

Maxime Lorrillere, Julien Sopena, Sébastien Monnet, Pierre Sens

To cite this version:

Maxime Lorrillere, Julien Sopena, Sébastien Monnet, Pierre Sens. Puma: pooling unused memory in virtual machines for I/O intensive applications. The 8th ACM International Systems and Storage Conference, May 2015, Haifa, Israel. 2015. �hal-01154566�

(2)

Puma: Pooling Unused Memory in Virtual Machines for I/O intensive applications

Maxime Lorrillere, Julien Sopena, Sébastien Monnet and Pierre Sens

Sorbonne Universités, Université Pierre et Marie Curie, CNRS

Context: virtualization fragments available memory

VM¹ Host ¹ VM² VM³ Host ² VM⁴

virtio virtio

Swap

Applications

Hypervisor (KVM)

10GB Ethernet

Anonymous pages

Page cache

Virtualization allows more flexibility and isolation Problem: it fragments available memory

⇒ Memory cannot be reassigned as efficiently as CPU time

⇒ Unused memory (i.e. idle caches) is wasted

Existing solution: Memory Ballooning

⇒ Allows to dynamically resize VM’s memory

⇒ Cannot efficiently reclaim unused memory

⇒ Does not benefit of unused memory on other hosts

⇒ Slow to recover memory

VM¹ VM²

virtio

BalloonBalloon I/O

Balloon Swap

Applications

Baseline Auto-ballooning

Our approach: Puma

Rely on a fast network between VMs and hosts

⇒ Puma can reuse unused memory of VMs hosted on different hosts

Handles clean cache pages

⇒ Writes are generally non-blocking

⇒ Simple consistency scheme

⇒ Fast to recover memory!

Exclusive and non-inclusive caching strategies

Puma design

2 basics operations

⇒ get(): gets a page from the [remote] page cache

⇒ put(): sends a victim page to the remote page cache

Local metadata with small memory footprint

⇒ amortized 64 bits/page, 2 MB of metadata per GB of cache

Pages are directly stored into the existing page cache

⇒ Memory is reclaimed naturally

Sequential I/O are detected and filtered

⇒ Disk bandwidth > network bandwidth

Network latency monitoring

⇒ Puma is throttled when the latency becomes too high

Architecture overview

PFRA

alloc()

VM¹ VM²

P³¹

1

Metadata

31

P³¹

Reclaim 2

put(P³¹) 3

P³¹ 4 4

Store page P³¹

5

put() operation

P²⁴

Miss get(P²⁴)

VM¹ VM²

P²⁴ 1

Metadata

24

2 Hit?

req(P²⁴) 3

Lookup 4

P²⁴ P²⁴

5

get() operation

Evaluation

Experiment setup

Active VM: 1 GB memory

Inactive VM: 512 MB→12 GB memory Network latency injection with

Netem [LCA’05]

Benchrmarks: Filebench, BLAST, TPC-C, TPC-H, Postmark

Puma

Conclusion and Future Work

Puma solves the page cache fragmentation problem

⇒ It is based on an efficient kernel-level remote caching mechanism

⇒ It handles clean cache pages to quickly recover the memory

⇒ It works with co-localised VMs and remote VMs

Ongoing work: detecting when a VM has unused memory

⇒ Our idea: toggle Puma service based on Linux’s active/inactive LRUs activity

Puma: pooling unused memory in virtual machines for I/O intensive applications

Puma: Pooling Unused Memory in Virtual Machines for I/O intensive applications

Maxime Lorrillere, Julien Sopena, Sébastien Monnet and Pierre Sens

Sorbonne Universités, Université Pierre et Marie Curie, CNRS

Context: virtualization fragments available memory

Virtualization allows more flexibility and isolation Problem: it fragments available memory

Existing solution: Memory Ballooning

⇒ Allows to dynamically resize VM’s memory

⇒ Cannot efficiently reclaim unused memory

⇒ Does not benefit of unused memory on other hosts

⇒ Slow to recover memory

Our approach: Puma

Rely on a fast network between VMs and hosts

Handles clean cache pages

Exclusive and non-inclusive caching strategies

Puma design

2 basics operations

Local metadata with small memory footprint

Pages are directly stored into the existing page cache

Sequential I/O are detected and filtered

Network latency monitoring

Architecture overview

Evaluation

Experiment setup

Conclusion and Future Work

Puma solves the page cache fragmentation problem

Ongoing work: detecting when a VM has unused memory

Laboratoire d’Informatique de Paris 6, Inria – Regal team