By Neil C. Thompson, Svenja Spanuth
Communications of the ACM,
March 2021,
Vol. 64 No. 3, Pages 64-72
10.1145/3430936

Perhaps in no other technology have there been so many decades of large year-over-year improvements as in computing. It is estimated that a third of all productivity increases in the U.S. since 1974 have come from information technology,a,4 making it one of the largest contributors to national prosperity.
The rise of computers is due to technical successes, but also to the economic forces that financed them. Bresnahan and Trajtenberg3 coined the term general purpose technology (GPT) for products, like computers, that have broad technical applicability and where product improvement and market growth could fuel each other for many decades. But they also predicted that GPTs could run into challenges at the end of their life cycle: as progress slows, other technologies can displace the GPT in particular niches and undermine this economically reinforcing cycle. We are observing such a transition today. As improvements in central processing units (CPUs) slow, applications are moving to specialized processors, such as graphics processing units (GPUs), which can do fewer things than traditional universal processors but perform those functions better. Many high-profile applications are already following this trend, including deep learning (a form of machine learning) and Bitcoin mining.
With this background, we can now be more precise about our thesis: “The Decline of Computers as a General Purpose Technology.” We do not mean that computers, taken together, will lose technical abilities and thus ‘forget’ how to do some calculations. We do mean that the economic cycle that has led to the usage of a common computing platform, underpinned by rapidly improving universal processors, is giving way to a fragmentary cycle, where economics push users toward divergent computing platforms driven by special purpose processors.
This fragmentation means that parts of computing will progress at different rates. This will be fine for applications that move in the ‘fast lane,’ where improvements continue to be rapid, but bad for applications that no longer get to benefit from field-leaders pushing computing forward, and are thus consigned to a ‘slow lane’ of computing improvements. This transition may also slow the overall pace of computer improvement, jeopardizing this important source of economic prosperity.
Universal and Specialized Computing
Early days—from specialized to universal. Early electronics were not universal computers that could perform many different calculations, but dedicated pieces of equipment, such as radios or televisions, designed to do one task, and only one task. This specialized approach has advantages: design complexity is manageable and the processor is efficient, working faster and using less power. But specialized processors are also ‘narrower,’ in that they can be used by fewer applications.
Early electronic computers,b even those designed to be ‘universal,’ were in practice tailored for specific algorithms and were difficult to adapt for others. For example, although the 1946 ENIAC was a theoretically universal computer, it was primarily used to compute artillery range tables. If even a slightly different calculation was needed, the computer would have to be manually re-wired to implement a new hardware design. The key to resolving this problem was a new computer architecture that could store instructions.10 This architecture made the computer more flexible, making it possible to execute many different algorithms on universal hardware, rather than on specialized hardware. This ‘von Neumann architecture’ has been so successful that it continues to be the basis of virtually all universal processors today.
The ascent of universal processors. Many technologies, when they are introduced into the market, experience a virtuous reinforcing cycle that helps them develop (Figure 1a). Early adopters buy the product, which finances investment to make the product better. As the product improves, more consumers buy it, which finances the next round of progress, and so on. For many products, this cycle winds down in the short-to-medium term as product improvement becomes too difficult or market growth stagnates.
Figure 1. The historical virtuous cycle of universal processors (a) is turning into a fragmentation cycle (b).
GPTs are defined by the ability to continue benefiting from this virtuous economic cycle as they grow—as universal processors have for decades. The market has grown from a few high-value applications in the military, space, and so on, to more than two billion PCs in use worldwide.38 This market growth has fueled ever-greater investments to improve processors. For example, Intel has spent $183 billion on R&D and new fabrication facilities over the last decade.c This has paid enormous dividends: by one estimate processor performance has improved about 400,000x since 1971.8
The alternative: Specialized processors. A universal processor must be able to do many different calculations well. This leads to design compromises that make many calculations fast, but none optimal. The performance penalty from this compromise is high for applications well suited to specialization, that is, those where:
- substantial numbers of calculations can be parallelized
- the computations to be done are stable and arrive at regular intervals (‘regularity’)
- relatively few memory accesses are needed for a given amount of computation (‘locality’)
- calculations can be done with fewer significant digits of precision.15
In each of these cases, specialized processors (for example, Application-specific Integrated Circuits (ASICs)) or specialized parts of heterogeneous chips (for example, IP blocks) can perform better because custom hardware can be tailored to the calculation.24
The extent to which specialization leads to changes in processor design can be seen in the comparison of a typical CPU (the dominant universal processor) and a typical GPU (the most common type of specialized processor); see the accompanying table.
Table. Technical specifications of a CPU compared to a GPU.
The GPU runs slower, at about a third of the CPU’s frequency, but in each clock cycle it can perform ∼100x more calculations in parallel than the CPU. This makes it much quicker than a CPU for tasks with lots of parallelism, but slower for those with little parallelism.d
GPUs often have 5x–10x more memory bandwidth (determining how much data can be moved at once), but with much longer lags in accessing that data (at least 6x as many clock cycles from the closest memory). This makes GPUs better at predictable calculations (where the data needed from memory can be anticipated and brought to the processor at the right time) and worse at unpredictable ones.
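The frequency and parallelism figures above can be combined into a rough back-of-the-envelope model. The sketch below is only illustrative: the function name `gpu_speedup` and the Amdahl-style split of a workload into a serial and a parallel fraction are assumptions, not something taken from the article, and the model ignores the memory bandwidth and latency effects just described.

```python
def gpu_speedup(parallel_fraction, freq_ratio=1 / 3, width_ratio=100):
    """Toy estimate of GPU speed relative to a CPU (CPU time normalized to 1).

    parallel_fraction: share of the work that can use the GPU's full width.
    freq_ratio: GPU clock relative to CPU clock ("about a third" in the text).
    width_ratio: calculations per clock relative to the CPU ("~100x" in the text).
    All of this is a simplifying assumption, not a measured result.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    # Serial work only benefits from clock speed, so it runs slower on the GPU.
    serial_time = (1.0 - parallel_fraction) / freq_ratio
    # Parallel work benefits from both clock speed and width.
    parallel_time = parallel_fraction / (freq_ratio * width_ratio)
    return 1.0 / (serial_time + parallel_time)


for p in (0.0, 0.5, 0.9, 0.99, 1.0):
    print(f"parallel fraction {p:.2f}: GPU is {gpu_speedup(p):.2f}x the CPU")
```

Under these assumptions, a fully parallel workload runs roughly 33x faster on the GPU, while a workload with little parallelism runs slower than on the CPU, which matches the qualitative claim in the text.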
For applications that are well-matched to specialized hardware (and where programming models, for example CUDA, are available to harness that hardware), the gains in performance can be substantial. For example, in 2017, NVIDIA, the leading manufacturer of GPUs, estimated that Deep Learning (AlexNet with Caffe) got a speed-up of 35x+ from being run on a GPU instead of a CPU.27 Today, this speed-up is even greater.26
Another important benefit of specialized processorse is that they use less power to do the same calculation. This is particularly valuable for applications limited by battery life (cell phones, Internet-of-things devices), and those that do computation at enormous scales (cloud computing/datacenters, supercomputing).
As of 2019, 9 out of the top 10 most power efficient supercomputers were using NVIDIA GPUs.37
Specialized processors also have important drawbacks: they can only run a limited range of programs, are hard to program, and often require a universal processor running an operating system to control (one or more of) them. Designing and creating specialized hardware can also be expensive. For universal processors, their fixed costs (also called non-recurring engineering costs (NRE)) are distributed over a large number of chips. In contrast, specialized processors often have much smaller markets, and thus higher per-chip fixed costs. To make this more concrete, the overall cost to manufacture a chip with specialized processors using leading-edge technology is about $80 millionf (as of 2018). Using an older generation of technology can bring this cost down to about $30 million.23
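To see why market size matters so much, the fixed cost can be spread over different production volumes. The sketch below treats the article's $80 million and $30 million figures as the fixed cost to be amortized; the production volumes and the function name `per_chip_fixed_cost` are hypothetical, chosen only for illustration.

```python
def per_chip_fixed_cost(fixed_cost_dollars, units):
    """Fixed (non-recurring) cost carried by each chip at a given volume."""
    if units <= 0:
        raise ValueError("units must be positive")
    return fixed_cost_dollars / units


LEADING_EDGE_COST = 80_000_000  # figure quoted in the text (as of 2018)
OLDER_NODE_COST = 30_000_000    # figure quoted in the text

# Hypothetical production volumes, from a niche product to a mass-market one.
for units in (10_000, 1_000_000, 100_000_000):
    lead = per_chip_fixed_cost(LEADING_EDGE_COST, units)
    older = per_chip_fixed_cost(OLDER_NODE_COST, units)
    print(f"{units:>11,} chips: ${lead:,.2f} (leading edge), ${older:,.2f} (older node)")
```

At the smallest of these assumed volumes, each chip carries thousands of dollars of fixed cost, whereas at mass-market volumes the same cost falls below a dollar per chip.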
Despite the advantages that specialized processors have, their disadvantages were important enough that there was little adoption (except for GPUs) in the past decades. The adoption that did happen was in areas where the performance improvement was inordinately valuable, including military applications, gaming and cryptocurrency mining. But this is starting to change.
The state of specialized processors today. All the major computing platforms are becoming more specialized: PCs, mobile, Internet-of-things (IoT), and cloud/supercomputing. Of these, PCs remain the most universal. In contrast, energy efficiency is more important in mobile and IoT because of battery life. As a result, much of the circuitry on a smartphone chip34 consists of specialized processors, as do sensors such as RFID tags.5,7
Cloud/supercomputing has also become more specialized. For example, 2018 was the first time that new additions to the biggest 500 supercomputers derived more performance from specialized processors than from universal processors.11
Industry experts at the International Technology Roadmap for Semiconductors (ITRS), the group which coordinated the technology improvements needed to keep Moore’s Law going, implicitly endorsed this shift toward specialization in their final report. They acknowledged that the traditional one-solution-fits-all approach of shrinking transistors should no longer determine design requirements, and that these should instead be tailored to specific applications.16
The next section explores the effect that the movement of all of the major computing platforms toward specialized processors will have on the economics of producing universal processors.
The Fragmentation of a General Purpose Technology
The virtuous cycle that underpins GPTs comes from a mutually reinforcing set of technical and economic forces. Unfortunately, this mutual reinforcement also applies in the reverse direction: if improvements slow in one part of the cycle, so will improvements in other parts of the cycle. We call this counterpoint a ‘fragmenting cycle’ because it has the potential to fragment computing into a set of loosely related silos that advance at different rates.
As Figure 1(b) shows, the fragmenting cycle has three parts:
- Technology advances slow
- Fewer new users adopt
- Financing innovation is more difficult
The intuition behind this cycle is straightforward: if technology advances slow, then fewer new users adopt. But without the market growth provided by those users, the rising costs needed to improve the technology can become prohibitive, slowing advances further. Thus each part of the cycle reinforces the others, and with them the fragmentation.
Here, we describe the state of each of these three parts of the cycle for computing and show that fragmentation has already begun.
Technology advancements slow. To measure the rate of improvement of processors we consider two key metrics: performanceg and performance-per-dollar. Historically, both of these metrics improved rapidly, largely because miniaturizing transistors led to greater density of transistors per chip (Moore’s Law) and to faster transistor switching speeds (via Dennard Scaling).24 Unfortunately, Dennard Scaling ended in 2004/2005 because of technical challenges and Moore’s Law is coming to an end as manufacturers hit the physical limits of what existing materials and designs can do,33 and these limits take ever more effort to overcome.2 The loss of the benefits of miniaturization can be seen vividly in the slowdown of improvements to performance and performance-per-dollar.
Figure 2(a), based on Hennessy and Patterson’s characterization of progress in SPECint, and Figure 2(b), based on the U.S. Bureau of Labor Statistics’ producer-price index, show how dramatic the slowdown in performance improvement in universal computers has been. To put these rates into perspective, if performance per dollar improves at 48% per year, then in 10 years it improves 50x. In contrast, if it only improves at 8% per year, then in 10 years it is only 2x better.
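The comparison above is simple compound growth, and it can be checked directly. The sketch below assumes a constant annual rate over the whole period; the function name `cumulative_improvement` is hypothetical.

```python
def cumulative_improvement(annual_rate, years):
    """Total improvement factor after compounding a constant annual rate."""
    return (1.0 + annual_rate) ** years


# The two annual rates quoted in the text, compounded over a decade.
for rate in (0.48, 0.08):
    factor = cumulative_improvement(rate, 10)
    print(f"{rate:.0%} per year for 10 years: about {factor:.1f}x")
```

Compounding 48% per year for a decade gives a factor of roughly 50, while 8% per year gives a factor of roughly 2, consistent with the figures in the text.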
Figure 2. Rate of improvement in microprocessors, as measured by (a) Annual performance improvement on the SPECint benchmark,7appx and (b) Annual quality-adjusted price decline.1appx
Fewer new users adopt. As the pace of improvement in universal processors slows, fewer programs with new functionality will be created, and thus customers will have less incentive to replace their computing devices. Intel CEO Krzanich confirmed this in 2016, saying that the replacement cycle for PCs had lengthened from every four years to every 5–6 years.22 Sometimes, customers even skip multiple generations of processor improvement before it is worth updating.28 This is also true on other platforms. For example, U.S. smartphones were upgraded on average every 23 months in 2014, but by 2018 this had lengthened to 31 months.25
G