CCES Unicamp

Selecting efficient VM types to train deep learning models on Amazon SageMaker

The cloud has become a popular environment for running Deep Learning (DL) applications. Public cloud providers charge for the time that resources are actually in use, at an hourly price that depends on the configuration of the chosen cloud instance. Instances are usually provided as virtual machines (VMs) that give access to a particular hardware configuration and may also come with a pre-configured software environment. More advanced, theoretically faster VMs are usually more expensive, but they do not necessarily deliver the best performance for every application. To choose the best instance (or VM type), users must therefore consider the relative performance, and the resulting cost, of different VMs when running their specific target application. With this in mind, we propose a model to estimate the relative performance and cost of training deep learning applications on different VM instances. The model is built on observations derived from the performance profiles of three different DL applications executed on 12 different public cloud instances. We argue that this model is a valuable tool for cloud users looking for optimal VM types to train their deep learning applications in the cloud.
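The trade-off described above comes down to simple arithmetic: cost is training time multiplied by the hourly price, and training time shrinks as the VM gets faster. The Python sketch below illustrates only that arithmetic and is not the model proposed in the paper. The VM names, prices, and relative speeds are hypothetical placeholders, not measurements or real SageMaker prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VMOption:
    name: str              # hypothetical VM type label
    price_per_hour: float  # hypothetical price, in USD per hour
    relative_speed: float  # hypothetical speedup over the baseline VM (baseline = 1.0)


def training_hours(baseline_hours: float, vm: VMOption) -> float:
    """Estimated training time on `vm`, scaled from the baseline time."""
    return baseline_hours / vm.relative_speed


def training_cost(baseline_hours: float, vm: VMOption) -> float:
    """Estimated cost: time actually used multiplied by the hourly price."""
    return training_hours(baseline_hours, vm) * vm.price_per_hour


def cheapest(baseline_hours: float, options: list[VMOption]) -> VMOption:
    """VM type with the lowest estimated training cost."""
    return min(options, key=lambda vm: training_cost(baseline_hours, vm))


def fastest(baseline_hours: float, options: list[VMOption]) -> VMOption:
    """VM type with the shortest estimated training time."""
    return min(options, key=lambda vm: training_hours(baseline_hours, vm))


# Illustrative values only: invented for this sketch.
options = [
    VMOption("vm_small", price_per_hour=1.0, relative_speed=1.0),
    VMOption("vm_medium", price_per_hour=3.0, relative_speed=2.0),
    VMOption("vm_large", price_per_hour=12.0, relative_speed=4.0),
]

if __name__ == "__main__":
    baseline = 10.0  # hours the job takes on the baseline VM (assumed)
    for vm in options:
        print(vm.name, training_hours(baseline, vm), training_cost(baseline, vm))
    print("cheapest:", cheapest(baseline, options).name)
    print("fastest:", fastest(baseline, options).name)
```

With these invented numbers the fastest VM is also the most expensive way to finish the job, which is the situation the abstract warns about: a higher hourly price only pays off when the speedup grows at least as fast as the price.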

R. K. Tesser, A. Marques and E. Borin, “Selecting efficient VM types to train deep learning models on Amazon SageMaker,” 2021 International Symposium on Computer Architecture and High Performance Computing Workshops (SBAC-PADW), 2021, pp. 20-27.

