We would like to thank Alyssa Vance, Ashwin Acharya, Jessica Taylor and the Epoch team for helpful feedback and comments.
Executive Summary
Using a dataset of 470 models of graphics processing units (GPUs) released between 2006 and 2021, we find that the number of floating-point operations per second per dollar (hereafter FLOP/s per $) doubles every ~2.5 years. For top GPUs at any point in time, we find a slower rate of improvement (FLOP/s per $ doubles every 2.95 years), while for GPU models typically used in ML research, we find a faster rate of improvement (FLOP/s per $ doubles every 2.07 years). GPU price-performance improvements have generally been slightly slower than the 2-year doubling time associated with Moore’s law, much slower than what is implied by Huang’s law, yet considerably faster than was generally found in prior work on trends in GPU price-performance. We aim to provide a characterization of GPU price-performance trends that is more precise, based on more and higher-quality data, and more robust to justifiable changes in the analysis than previous investigations.1

Figure 1. Plots of FLOP/s and FLOP/s per dollar for our dataset and relevant trends from the existing literature
Table 1. Price-performance trends in our dataset and relevant trends from the existing literature.

Trend | 2x time | 10x time | Growth rate | Metric |
---|---|---|---|---|
Our dataset | 2.46 years | 8.17 years | 0.122 OOMs/year | FLOP/s per dollar |
ML GPUs | 2.07 years | 6.86 years | 0.146 OOMs/year | FLOP/s per dollar |
Top GPUs | 2.95 years | 9.81 years | 0.102 OOMs/year | FLOP/s per dollar |
Our data FP16 (n=91) | 2.30 years | 7.64 years | 0.131 OOMs/year | FLOP/s per dollar |
Moore’s law | 2 years | 6.64 years | 0.151 OOMs/year | FLOP/s |
Huang’s law | 1.08 years | 3.58 years | 0.279 OOMs/year | FLOP/s |
CPU historical (AI Impacts, 2019) | 2.32 years | 7.7 years | 0.130 OOMs/year | FLOP/s per dollar |
GPU historical (Bergal, 2019, FP32) | 4.4 years | 14.7 years | 0.068 OOMs/year | FLOP/s per dollar |
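The 2x time, 10x time, and growth-rate columns are three views of the same exponential fit: the doubling time is log10(2) divided by the rate in OOMs/year, and the 10x time is simply its reciprocal. A minimal Python check (small discrepancies with the table are expected because the tabulated rates are rounded):

```python
import math

def doubling_time_years(ooms_per_year: float) -> float:
    """Years for a quantity growing at ooms_per_year orders of magnitude/year to double."""
    return math.log10(2) / ooms_per_year

def tenfold_time_years(ooms_per_year: float) -> float:
    """Years for the same quantity to grow 10-fold: the reciprocal of the rate."""
    return 1 / ooms_per_year

# Our dataset: 0.122 OOMs/year
print(round(doubling_time_years(0.122), 2))  # ~2.47 years
print(round(tenfold_time_years(0.122), 2))   # ~8.20 years
```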
Introduction
GPUs are the dominant computing platform for accelerating machine learning (ML) workloads, and most (if not all) of the biggest models over the last five years have been trained on GPUs or other special-purpose hardware like tensor processing units (TPUs). Price-performance improvements in the underlying hardware have enabled rapid growth in the size of ML training runs (Sevilla et al., 2022), and have thereby contributed centrally to recent progress in AI.
The rate at which GPUs have been improving has been analyzed previously. For example, Su et al., 2017 finds a 2.4-year doubling rate for GPU FLOP/s from 2006 to 2017. Sun et al., 2019 analyzes over 4,000 GPU models and finds that FLOP/s per watt doubles around every three to four years. By contrast, some have speculated that GPU performance improvements are more rapid than the exponential improvements associated with other microprocessors like CPUs (which typically see a 2 to 3-year doubling time; see AI Impacts, 2019). Notable among these is the so-called Huang’s law, proposed by NVIDIA CEO Jensen Huang, according to whom GPUs see a “25x improvement every 5 years” (Mims, 2020), which would be equivalent to a ~1.1-year doubling time in performance.
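Huang’s stated “25x improvement every 5 years” converts directly to the growth rate and doubling time quoted for Huang’s law, as a quick arithmetic check shows:

```python
import math

# Huang's law: "25x improvement every 5 years"
growth = math.log10(25) / 5            # growth rate in orders of magnitude per year
doubling = math.log10(2) / growth      # implied doubling time in years

print(round(growth, 3))    # 0.28 OOMs/year
print(round(doubling, 2))  # 1.08 years
```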
There is previous work that specifically analyzes price-performance across CPUs and GPUs (summarized in Table 2). Prior estimates of the rate of improvement vary widely (e.g. the time it takes for price-performance to increase 10-fold ranges from ~6 to ~15 years, depending on the computing precision; see Table 2). Because previous estimates vary so widely and rely on smaller datasets, we are not confident in them.2
Table 2. Previous estimates of trends in computing price-performance.

Reference | Processor type | Metric | 2x time | 10x time | Growth rate |
---|---|---|---|---|---|
Bergal, 2019 | GPU | FLOP/s per $ in FP32, FP16, and FP16 fused multiply-add | 4.4 years (FP32); 3.0 years (FP16); 1.8 years (FP16 fused) | 14.7 years (FP32); 10.0 years (FP16); 6.1 years (FP16 fused) | 0.068 OOMs/year (FP32); 0.100 OOMs/year (FP16); 0.164 OOMs/year (FP16 fused) |
Median Group, 2018 | GPU | FLOP/s per $ in FP32 | 1.5 years | 5.0 years | 0.200 OOMs/year |
Muehlhauser and Rieber, 2014 | Various | MIPS/$ | 1.6 years | 5.2 years | 0.192 OOMs/year |
Sandberg and Bostrom, 2008 | CPU-based | MIPS/$ and FLOP/s per $ | 1.7 years (MIPS); 2.3 years (FLOP/s) | 5.6 years (MIPS); 7.7 years (FLOP/s) | 0.179 OOMs/year (MIPS); 0.130 OOMs/year (FLOP/s) |
Nordhaus, 2001 | CPU-based | MIPS/$ | 1.6 years | 5.3 years | 0.189 OOMs/year |
We aim to extend the existing work with three main contributions:
- Using a larger dataset of GPU models than has been analyzed in previous investigations, including more recent GPU models, we produce more precise estimates of the rate of GPU price-performance improvement than currently exist3
- We analyze multiple key subtrends in GPU price-performance, such as the trends for top-performing GPUs and for GPUs commonly used in machine learning research
- We put these trends into perspective by comparing them against Moore’s law, Huang’s law, prior analyses, and public predictions of GPU performance
Dataset
We combine two existing datasets on GPU price-performance. One dataset is from the Median Group, which contains data on 223 Nvidia and AMD GPUs (Median Group, 2018). The second dataset is from Sun et al., 2019, which contains price-performance data on 413 GPUs released by Nvidia, Intel and AMD.

Figure 2. Plots of FLOP/s and FLOP/s per dollar for Median Group’s and Sun et al., 2019’s datasets.
We merged both datasets and removed duplicate observations, i.e. GPU models that were contained in both datasets. Furthermore, we removed different versions of the same product unless they had different specifications.4
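The merge-and-deduplicate step can be sketched as follows. The frame layout, column names, and rows here are illustrative assumptions for exposition, not the actual schema or contents of either dataset:

```python
import pandas as pd

# Toy stand-ins for the two source datasets (hypothetical columns and values).
median_group = pd.DataFrame({
    "model": ["GTX 1080", "RX 580"],
    "flops_fp32": [8.9e12, 6.2e12],
    "price_usd": [599, 229],
})
sun_et_al = pd.DataFrame({
    "model": ["GTX 1080", "Titan V"],
    "flops_fp32": [8.9e12, 13.8e12],
    "price_usd": [599, 2999],
})

merged = pd.concat([median_group, sun_et_al], ignore_index=True)
# Drop GPU models present in both datasets. Deduplicating on the full
# specification (not just the model name) keeps different versions of the
# same product when their specifications differ.
merged = merged.drop_duplicates(subset=["model", "flops_fp32", "price_usd"])
print(len(merged))  # 3 rows: the duplicated GTX 1080 appears once
```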
We also decided to drop observations prior to 2006, for two main reasons: 1) it is unclear whether we can meaningfully compare these models’ levels of performance, as they predate innovations that enable general-purpose computing on GPUs, and 2) we were not able to validate the accuracy of the data by looking up the relevant performance details in the models’ data sheets. For a more detailed discussion, see Appendix A.
Finally, we noticed a subset of 20 GPUs whose reported 16-bit performance is ~60-fold worse than their 32-bit performance, while for all other GPUs 16-bit performance is at least as good as 32-bit performance. We dropped these 16-bit performance numbers, which we suspect are erroneous.
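A sketch of this filtering step, assuming hypothetical column names and illustrative numbers rather than the real records:

```python
import pandas as pd

# Toy frame: FP32 and FP16 FLOP/s per model (model B is ~60x worse in FP16).
df = pd.DataFrame({
    "model": ["A", "B", "C"],
    "flops_fp32": [1e13, 8e12, 6e12],
    "flops_fp16": [2e13, 1.3e11, 6e12],
})

# For all but the suspect GPUs, FP16 throughput is at least as good as FP32,
# so flag entries whose reported FP16 figure falls below FP32 and null them.
suspect = df["flops_fp16"] < df["flops_fp32"]
df.loc[suspect, "flops_fp16"] = float("nan")
```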
The final dataset thus contains 470 GPUs from AMD, Intel, and Nvidia released between 2006 and 2021. We refer to this merged dataset as “our dataset” for the rest of the report. Throughout, FLOP/s are measured in 32-bit (single) precision unless otherwise noted.

Figure 3. Plots of FLOP/s and FLOP/s per dollar for the dataset used in our analysis.