Scaling Machine-Learning Based Automatic Performance Tuning
Abstract
Achieving optimal performance has become a major concern as the number of tunable parameters in an HPC system increases. The autotuning tool ytopt, developed in the ECP PROTEAS-TUNE project, optimizes the search over an autotuning search space. However, ytopt encounters deployment and performance portability issues on large-scale HPC systems. To address these, we explore the use of one of the ECP workflow managers, libEnsemble. Combining ytopt with libEnsemble scales the autotuning capability and enhances the parallel evaluation ability of existing work. We apply the approach to two ECP proxy applications, XSBench and sw4lite, with different tuning parameters. We investigate the effectiveness of these tuning parameters at scale with respect to performance portability on ALCF Theta.