Performance Characteristics of Virtualized GPUs for Deep Learning
Date
2019-10
Abstract
As deep learning techniques and algorithms become increasingly common in scientific workflows, HPC centers are grappling with how best to provide GPU resources and support deep learning workloads. One novel deployment method is to virtualize GPU resources, allowing multiple VM instances to have logically distinct virtual GPUs (vGPUs) on a shared physical GPU. However, there are many operational and performance implications to consider before deploying a vGPU service in an HPC center. In this paper, we investigate the performance characteristics of vGPUs for both traditional HPC workloads and for deep learning training and inference workloads. Using NVIDIA’s vDWS virtualization software, we perform a series of HPC and deep learning benchmarks on both non-virtualized (bare-metal) GPUs and vGPUs of various sizes and configurations. We report on several of the challenges we discovered in deploying and operating a variety of virtualized instance sizes and configurations. We find that the overhead of virtualization is generally < 10% for HPC workloads, while for deep learning it can vary considerably depending on the task.
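
The overhead figure quoted above is a relative slowdown against a bare-metal baseline. As a minimal sketch (not taken from the paper), the snippet below shows how such a figure can be computed from benchmark wall-clock timings; the timing values and the benchmark callable are hypothetical placeholders.

    import time

    def time_run(run_fn, repeats: int = 3) -> float:
        """Average wall-clock time of a benchmark callable over several repeats."""
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_fn()  # e.g. one training epoch, one inference batch, or one HPC kernel
            samples.append(time.perf_counter() - start)
        return sum(samples) / len(samples)

    def relative_overhead(t_bare_metal: float, t_vgpu: float) -> float:
        """Virtualization overhead expressed as a fraction of the bare-metal runtime."""
        return (t_vgpu - t_bare_metal) / t_bare_metal

    # Hypothetical timings in seconds (placeholders, not results from the paper):
    t_bare = 112.0  # benchmark on a bare-metal GPU
    t_vgpu = 119.5  # same benchmark on a vGPU instance
    print(f"vGPU overhead: {relative_overhead(t_bare, t_vgpu):.1%}")  # -> 6.7%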
Description
Preprint
Keywords
Deep Learning, High Performance Computing, Virtualization
Type
Preprint