We would like to thank Alyssa Vance, Ashwin Acharya, Jessica Taylor and the Epoch team for helpful feedback and comments.
Executive Summary
Using a dataset of 470 models of graphics processing units (GPUs) released between 2006 and 2021, we find that the number of floating-point operations per second per dollar (hereafter FLOP/s per $) doubles every ~2.5 years. For top GPUs at any point in time, we find a slower rate of improvement (FLOP/s per $ doubles every 2.95 years), while for GPU models typically used in ML research, we find a faster rate of improvement (FLOP/s per $ doubles every 2.07 years). GPU price-performance improvements have generally been slightly slower than the 2-year doubling time associated with Moore’s law, much slower than what is implied by Huang’s law, yet considerably faster than was generally found in prior work on trends in GPU price-performance. We aim to provide a characterization of GPU price-performance trends that is more precise, based on more and higher-quality data, and more robust to justifiable changes in the analysis than previous investigations.1

Figure 1. Plots of FLOP/s and FLOP/s per dollar for our dataset and relevant trends from the existing literature
Table 1. Price-performance trends in our dataset and relevant trends from the existing literature.

Trend | 2x time | 10x time | Growth rate | Metric |
---|---|---|---|---|
Our dataset | 2.46 years | 8.17 years | 0.122 OOMs/year | FLOP/s per dollar |
ML GPUs | 2.07 years | 6.86 years | 0.146 OOMs/year | FLOP/s per dollar |
Top GPUs | 2.95 years | 9.81 years | 0.102 OOMs/year | FLOP/s per dollar |
Our data FP16 (n=91) | 2.30 years | 7.64 years | 0.131 OOMs/year | FLOP/s per dollar |
Moore’s law | 2 years | 6.64 years | 0.151 OOMs/year | FLOP/s |
Huang’s law | 1.08 years | 3.58 years | 0.279 OOMs/year | FLOP/s |
CPU historical (AI Impacts, 2019) | 2.32 years | 7.7 years | 0.130 OOMs/year | FLOP/s per dollar |
GPU historical (Bergal, 2019, FP32) | 4.4 years | 14.7 years | 0.068 OOMs/year | FLOP/s per dollar |
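The 2x time, 10x time, and growth-rate columns are three views of the same exponential fit: the doubling time is log10(2) divided by the rate in OOMs/year, and the 10x time is simply its reciprocal. A minimal Python check (small discrepancies with the table are expected because the tabulated rates are rounded):

```python
import math

def doubling_time_years(ooms_per_year: float) -> float:
    """Years for a quantity growing at ooms_per_year orders of magnitude/year to double."""
    return math.log10(2) / ooms_per_year

def tenfold_time_years(ooms_per_year: float) -> float:
    """Years for the same quantity to grow 10-fold: the reciprocal of the rate."""
    return 1 / ooms_per_year

# Our dataset: 0.122 OOMs/year
print(round(doubling_time_years(0.122), 2))  # ~2.47 years
print(round(tenfold_time_years(0.122), 2))   # ~8.20 years
```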
Introduction
GPUs are the dominant computing platform for accelerating machine learning (ML) workloads, and most (if not all) of the biggest models over the last five years have been trained on GPUs or other special-purpose hardware like tensor processing units (TPUs). Price-performance improvements in the underlying hardware have enabled rapid growth in the size of ML training runs (Sevilla et al., 2022), and have thereby contributed centrally to recent progress in AI.
The rate at which GPUs have been improving has been analyzed previously. For example, Su et al., 2017 finds a 2.4-year doubling rate for GPU FLOP/s from 2006 to 2017. Sun et al., 2019 analyzes over 4,000 GPU models and finds that FLOP/s per watt doubles around every three to four years. By contrast, some have speculated that GPU performance improvements are more rapid than the exponential improvements associated with other microprocessors like CPUs (which typically see a 2 to 3-year doubling time; see AI Impacts, 2019). Notable among these is the so-called Huang’s law, proposed by NVIDIA CEO Jensen Huang, according to whom GPUs see a “25x improvement every 5 years” (Mims, 2020), which would be equivalent to a ~1.1-year doubling time in performance.
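Huang’s stated “25x improvement every 5 years” converts directly to the growth rate and doubling time quoted for Huang’s law, as a quick arithmetic check shows:

```python
import math

# Huang's law: "25x improvement every 5 years"
growth = math.log10(25) / 5            # growth rate in orders of magnitude per year
doubling = math.log10(2) / growth      # implied doubling time in years

print(round(growth, 3))    # 0.28 OOMs/year
print(round(doubling, 2))  # 1.08 years
```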
There is previous work that specifically analyzes price-performance across CPUs and GPUs (summarized in Table 2). Prior estimates of the rate of improvement vary widely (e.g. the time it takes for price-performance to increase 10-fold ranges from ~6 to ~15 years, depending on the computing precision; see Table 2). Because previous estimates vary so widely and rely on smaller datasets, we are not confident in them.2
Table 2. Previous estimates of trends in computing price-performance.

Reference | Processor type | Metric | 2x time | 10x time | Growth rate |
---|---|---|---|---|---|
Bergal, 2019 | GPU | FLOP/s per $ in FP32, FP16, and FP16 fused multiply-add | 4.4 years (FP32); 3.0 years (FP16); 1.8 years (FP16 fused) | 14.7 years (FP32); 10.0 years (FP16); 6.1 years (FP16 fused) | 0.068 OOMs/year (FP32); 0.100 OOMs/year (FP16); 0.164 OOMs/year (FP16 fused) |
Median Group, 2018 | GPU | FLOP/s per $ in FP32 | 1.5 years | 5.0 years | 0.200 OOMs/year |
Muehlhauser and Rieber, 2014 | Various | MIPS/$ | 1.6 years | 5.2 years | 0.192 OOMs/year |
Sandberg and Bostrom, 2008 | CPU-based | MIPS/$ and FLOP/s per $ | 1.7 years (MIPS); 2.3 years (FLOP/s) | 5.6 years (MIPS); 7.7 years (FLOP/s) | 0.179 OOMs/year (MIPS); 0.130 OOMs/year (FLOP/s) |
Nordhaus, 2001 | CPU-based | MIPS/$ | 1.6 years | 5.3 years | 0.189 OOMs/year |
We aim to extend the existing work with three main contributions:
- Using a larger dataset of GPU models than has been analyzed in previous investigations, including more recent GPU models, we produce more precise estimates of the rate of GPU price-performance improvement than currently exist3
- We analyze multiple key subtrends in GPU price-performance, such as the trends for top-performing GPUs and for GPUs commonly used in machine learning research
- We put these trends into perspective by comparing them against Moore’s law, Huang’s law, prior analyses, and public predictions of GPU performance
Dataset
We combine two existing datasets on GPU price-performance. One dataset is from the Median Group, which contains data on 223 Nvidia and AMD GPUs (Median Group, 2018). The second dataset is from Sun et al., 2019, which contains price-performance data on 413 GPUs released by Nvidia, Intel and AMD.

Figure 2. Plots of FLOP/s and FLOP/s per dollar for Median Group’s and Sun et al., 2019’s datasets.
We merged both datasets and removed duplicate observations, i.e. GPU models that were contained in both datasets. Furthermore, we removed different versions of the same product unless they had different specifications.4
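The merge-and-deduplicate step can be sketched as follows. The frame layout, column names, and rows here are illustrative assumptions for exposition, not the actual schema or contents of either dataset:

```python
import pandas as pd

# Toy stand-ins for the two source datasets (hypothetical columns and values).
median_group = pd.DataFrame({
    "model": ["GTX 1080", "RX 580"],
    "flops_fp32": [8.9e12, 6.2e12],
    "price_usd": [599, 229],
})
sun_et_al = pd.DataFrame({
    "model": ["GTX 1080", "Titan V"],
    "flops_fp32": [8.9e12, 13.8e12],
    "price_usd": [599, 2999],
})

merged = pd.concat([median_group, sun_et_al], ignore_index=True)
# Drop GPU models present in both datasets. Deduplicating on the full
# specification (not just the model name) keeps different versions of the
# same product when their specifications differ.
merged = merged.drop_duplicates(subset=["model", "flops_fp32", "price_usd"])
print(len(merged))  # 3 rows: the duplicated GTX 1080 appears once
```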
We also decided to drop observations prior to 2006, for two main reasons: 1) it is unclear whether we can meaningfully compare these models’ levels of performance, as they predate innovations that enable general-purpose computing on GPUs, and 2) we were not able to validate the accuracy of the data by looking up the relevant performance details in the models’ data sheets. For a more detailed discussion, see Appendix A.
Finally, we noticed a subset of 20 GPUs whose reported 16-bit performance is ~60-fold worse than their 32-bit performance, while for all other GPUs 16-bit performance is at least as good as 32-bit performance. We dropped these 16-bit performance numbers, which we suspect are erroneous.
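A sketch of this filtering step, assuming hypothetical column names and illustrative numbers rather than the real records:

```python
import pandas as pd

# Toy frame: FP32 and FP16 FLOP/s per model (model B is ~60x worse in FP16).
df = pd.DataFrame({
    "model": ["A", "B", "C"],
    "flops_fp32": [1e13, 8e12, 6e12],
    "flops_fp16": [2e13, 1.3e11, 6e12],
})

# For all but the suspect GPUs, FP16 throughput is at least as good as FP32,
# so flag entries whose reported FP16 figure falls below FP32 and null them.
suspect = df["flops_fp16"] < df["flops_fp32"]
df.loc[suspect, "flops_fp16"] = float("nan")
```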
The final dataset thus contains 470 GPUs from AMD, Intel, and Nvidia released between 2006 and 2021. We refer to this merged dataset as “our dataset” for the rest of the report. Throughout, FLOP/s are measured in 32-bit (single) precision unless otherwise noted.

Figure 3. Plots of FLOP/s and FLOP/s per dollar for the dataset used in our analysis.