Title: A Runtime Workload Distribution with Resource Allocation for CPU-GPU Heterogeneous Systems
Abstract: Nowadays, Graphics Processing Units (GPUs) have become popular as general-purpose processors; they have been used as co-processors with CPUs, forming heterogeneous systems. CPUs and GPUs have different execution capabilities, energy consumption and thermal characteristics. Typically, the role of the GPU is to execute the parallel parts of the job and the role of the CPU (i.e., host) is to execute the sequential parts and manage the CPU-GPU data transfer. The host remains idle, waiting for the GPU's execution and data transfer to complete. This classic workload distribution does not fully utilize the CPU and the GPU. Thus, there is a need for an adaptive workload distributor that fully exploits the potential of the CPU and the GPU. Allocating resources (i.e., core scaling, thread allocation) is also a challenge since different sets of resources exhibit different behaviors in terms of performance, energy consumption, peak power and peak CPU temperature. Several studies have been conducted on workload distribution with an eye on performance improvement. However, few of them consider both performance and energy consumption. We thus propose our novel Workload Distributor with a Resource Allocator (WDRA) which combines workload distribution, core scaling, and thread allocation into a multi-objective optimization problem. Since resource allocation is known to be an NP-hard problem, WDRA utilizes Particle Swarm Optimization (PSO). The goal is to find an efficient workload distribution in terms of both execution time and energy consumption, under peak power and peak CPU temperature constraints. To evaluate WDRA, experiments were conducted on an actual system equipped with a multicore CPU and a GPU. Compared to performance-based and other workload distributors, on average, WDRA can achieve up to a 1.47x speedup and energy savings of up to 82%.
WDRA is a suitable runtime algorithm for distributing a job's workload, since the algorithm itself takes at most 1.7% of the job's execution time.
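The idea of searching for a CPU-GPU split and a core count with PSO, under a peak power cap, can be illustrated with a minimal sketch. This is not the WDRA implementation: the cost model, every constant, the equal weighting of time and energy, and the function names (`evaluate`, `decode`, `cost`, `pso`) are assumptions made up for illustration, and the thermal constraint and thread allocation are left out. The swarm is seeded with the classic all-GPU distribution, so the result is never worse than that baseline under this toy model.

```python
import random

# Hypothetical platform constants (illustrative only, not measured values).
WORK = 100.0       # abstract work units in the job
CPU_RATE = 1.0     # work units per second per CPU core
GPU_RATE = 6.0     # work units per second on the GPU
P_IDLE = 20.0      # watts drawn regardless of load
P_CORE = 10.0      # extra watts per active CPU core
P_GPU = 90.0       # extra watts while the GPU is busy
POWER_CAP = 140.0  # peak power constraint in watts
MAX_CORES = 4
BOUNDS = [(0.0, 1.0), (1.0, float(MAX_CORES))]  # (GPU fraction, CPU cores)


def evaluate(gpu_frac, cores):
    """Return (time, energy, peak_power) under the toy model."""
    t_cpu = (1.0 - gpu_frac) * WORK / (cores * CPU_RATE)
    t_gpu = gpu_frac * WORK / GPU_RATE
    time = max(t_cpu, t_gpu)  # CPU and GPU run concurrently
    energy = P_IDLE * time + P_CORE * cores * t_cpu + P_GPU * t_gpu
    peak = P_IDLE
    peak += P_CORE * cores if t_cpu > 0 else 0.0
    peak += P_GPU if t_gpu > 0 else 0.0
    return time, energy, peak


def decode(pos):
    """Map a continuous particle position to a valid configuration."""
    gpu_frac = min(1.0, max(0.0, pos[0]))
    cores = min(MAX_CORES, max(1, round(pos[1])))
    return gpu_frac, cores


def cost(pos, w_time=0.5):
    """Weighted time/energy score, normalised by the all-GPU baseline."""
    time, energy, peak = evaluate(*decode(pos))
    base_time, base_energy, _ = evaluate(1.0, 1)
    score = w_time * time / base_time + (1.0 - w_time) * energy / base_energy
    if peak > POWER_CAP:
        # Penalty large enough that any infeasible point scores above 1.0.
        score += 1.0 + (peak - POWER_CAP) / POWER_CAP
    return score


def pso(n_particles=20, iters=60, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(a, b) for a, b in BOUNDS] for _ in range(n_particles)]
    pos[0] = [1.0, 1.0]  # seed with the classic all-GPU distribution
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d, (a, b) in enumerate(BOUNDS):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update: inertia + cognitive + social terms.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(b, max(a, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return decode(gbest), gbest_cost


if __name__ == "__main__":
    (gpu_frac, cores), score = pso()
    print(f"GPU share {gpu_frac:.2f}, CPU cores {cores}, score {score:.3f}")
```

A score below 1.0 means the chosen split beats the all-GPU baseline on the combined time/energy objective; changing `w_time` shifts the balance between the two objectives.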
Publication Year: 2017
Publication Date: 2017-05-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 5