Title: Accelerating complex data transfer for cluster computing
Abstract: The ability to move data quickly between the nodes of a distributed system is important for the performance of cluster computing frameworks, such as Hadoop and Spark. We show that in a cluster with modern networking technology data serialization is the main bottleneck and source of overhead in the transfer of rich data in systems based on high-level programming languages such as Java. We propose a new data transfer mechanism that avoids serialization altogether by using a shared clusterwide address space to store data. The design and a prototype implementation of this approach are described. We show that our mechanism is significantly faster than serialized data transfer, and propose a number of possible applications for it.
Publication Year: 2016
Publication Date: 2016-06-20
Language: en
Type: article
Access and Citation
Cited By Count: 6
AI Researcher Chatbot
Get quick answers to your questions about the article from our AI researcher chatbot