13/07/2020

Say Goodbye to Off-heap Caches! On-heap Caches Using Memory-Mapped I/O

Iacovos G. Kolokasis, Anastasios Papagiannis, Polyvios Pratikakis, Angelos Bilas, Foivos Zakkak

Keywords:

Abstract: Many analytics computations are dominated by iterative processing stages, executed until a convergence condition is met. To accelerate such workloads while keeping up with the exponential growth of data and the slow scaling of DRAM capacity, Spark employs off-memory caching of intermediate results. However, off-heap caching requires the serialization and deserialization () of data, which add significant overhead especially with growing datasets. This paper proposes , an extension of the Spark data cache that avoids the need of by keeping all cached data on-heap but off-memory, using memory-mapped I/O (mmio). To achieve this, extends the original JVM heap with a managed heap that resides on a memory-mapped fast storage device and is exclusively used for cached data. Preliminary results show that the prototype can speed up Machine Learning (ML) workloads that cache intermediate results by up to 37% compared to the state-of-the-art approach.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at HotStorage 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd

Similar Papers