Analyzing the exponential acceleration in AI capabilities through advances in models, chips, and compute infrastructure
Venture capitalist David Sacks recently argued that artificial intelligence is on track for a million-fold (1,000,000×) improvement in four years, driven by exponential advances in three areas: models/algorithms, chips, and compute infrastructure. In a podcast discussion, Sacks stated that AI models are getting “3–4× better” every year, new hardware generations are “3–4× faster” each cycle, and companies are deploying exponentially more GPUs in data centers – roughly another 3–4× increase per year in total compute.
Multiplied together, these trends could yield 100× × 100× × 100× ≈ 1,000,000× growth in overall AI capability within four years. Such a claim is extraordinary, so let’s break down each component and see what evidence says about its accuracy.
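As a rough back-of-the-envelope check, the compounding arithmetic behind that figure can be sketched in a few lines of Python; the per-year rates are simply the ones Sacks quotes, treated here as assumptions rather than established facts.

```python
# Back-of-the-envelope check of Sacks' compounding claim.
# Assumption (stated above): each of the three factors (models, chips,
# deployed compute) improves roughly 3-4x per year for four years.

YEARS = 4

def compound(per_year: float, years: int = YEARS) -> float:
    """Total multiplier after compounding a per-year gain over `years`."""
    return per_year ** years

for rate in (3.0, 3.16, 4.0):
    per_factor = compound(rate)      # one factor, four years of compounding
    combined = per_factor ** 3       # multiply the three factors together
    print(f"{rate:.2f}x/yr -> ~{per_factor:,.0f}x per factor, "
          f"~{combined:,.0f}x combined")

# At ~3.16x/yr (i.e. 10x every two years), each factor is ~100x in four
# years, and 100 * 100 * 100 = 1,000,000x, which is Sacks' million-fold figure.
```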
1. Model/Algorithm Improvements: “3–4× Per Year”
Sacks suggested that “the models are improving at a rate of… three to four times a year”. By this he meant that AI algorithms are not only getting faster and more accurate, but qualitatively more capable – evolving from simple chatbots into more advanced reasoning systems, with the next leap being autonomous AI agents. Is it true that AI model performance is roughly tripling or quadrupling annually? Evidence from AI research trends supports rapid algorithmic progress, though 3–4× per year is on the high end.
Algorithmic efficiency has been rapidly improving. Research from Epoch AI shows how the compute required to achieve a given performance on ImageNet (a key vision benchmark) has fallen dramatically over time – about a 10× reduction every two years. In other words, modern algorithms can reach the same accuracy with only 1/10th the computing power that was needed two years prior. That works out to roughly a 3× efficiency gain per year (√10 ≈ 3.2) over that period.
Such gains – due to better architectures, training methods, and scaling strategies – mean that for a fixed amount of compute, today’s models are far more capable than those of just a few years ago. An OpenAI study found, for example, that between 2012 and 2019 the amount of compute required to train a model to AlexNet-level performance on ImageNet fell by 44× (algorithmic efficiency doubling about every 16 months).
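A quick calculation shows what annual rate and doubling time each of these cited figures implies; the two studies cover different benchmarks and periods, so the implied rates are not identical.

```python
import math

# Unpack the two efficiency figures cited above into per-year rates and
# doubling times. (They come from different studies and periods, which is
# why the implied rates differ.)

def per_year(total_factor: float, years: float) -> float:
    """Annualized improvement rate implied by a total factor over `years`."""
    return total_factor ** (1 / years)

def doubling_time_months(total_factor: float, years: float) -> float:
    """Months for efficiency to double at the implied exponential rate."""
    return years * 12 * math.log(2) / math.log(total_factor)

# (a) ~10x less compute for the same ImageNet accuracy every two years
print(per_year(10, 2), doubling_time_months(10, 2))   # ~3.16x/yr, ~7 months

# (b) OpenAI: 44x less compute for AlexNet-level accuracy, 2012 to 2019
print(per_year(44, 7), doubling_time_months(44, 7))   # ~1.7x/yr, ~16 months
```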
Beyond raw performance metrics, today’s frontier models also demonstrate new qualitative abilities that previous generations lacked. Sacks gave the example that we “started with pure LLM chatbots, then we went to reasoning models… [next] the agents”. Indeed, the industry is now striving to move beyond chatbots that can only respond to questions, toward AI “agents” that can plan and act autonomously.
A 2024 McKinsey report observes that we are beginning an evolution from conversational tools “that answer questions and generate content” to AI agents that can execute complex, multistep tasks – essentially, moving from “thought to action”. In line with Sacks’ remarks, current top models like GPT-4 have stronger reasoning and problem-solving skills (e.g. using chain-of-thought logic) than earlier chatbots, and upcoming systems are being designed to use tools, APIs, and memory to act as goal-driven agents.
Do experts back up a “100× improvement in models in four years”? Extrapolating current trends, it’s conceivable. If algorithms improve ~10× every two years (as historical data for some tasks suggests), that would be ~100× in four years. For instance, one analysis found that across many AI tasks, the compute needed to reach a fixed performance has dropped by a factor of 10 annually when combining both hardware and algorithmic advances.
Even focusing purely on algorithmic advances, one report noted that each year since 2010, researchers have found ways to make AI models about 3× more efficient on average. It’s important to clarify what “100× better” means in this context – often it refers to achieving the same result with 1/100th the resources, or conversely achieving much higher performance given the same resources.
By that measure, many credible analyses do support on the order of 100× algorithmic improvement over a multi-year period. OpenAI’s own researchers concluded that “hardware and algorithmic efficiency gains multiply” and that algorithmic progress has outpaced Moore’s Law in recent years.
In summary, Sacks’ claim of rapid model progress is grounded in reality: AI models have been improving extremely fast, roughly tripling in effective performance per year by some measures. However, whether this exact 3–4× annual pace will continue for the next four years is uncertain – it could slow as models approach certain limits, or surprise us with new breakthroughs. Nonetheless, the idea of ~100× better AI algorithms in four years is not outrageous given the recent history.
2. Hardware (Chips): Are New AI Chips 3–4× Faster Each Generation?
The second part of Sacks’ argument is that AI hardware performance is growing exponentially. “Each generation of chips is probably three or four times better than the last,” he said, pointing to NVIDIA’s latest GPUs and innovations like networking multiple chips together with high-speed interconnects (e.g. “Nvidia’s NVL72… a rack system… to create much better performance at the data center level”). Here, Sacks is highlighting both the leap in single-chip speed and the improved scalability of many chips working in concert.
Single-chip performance gains: Modern AI accelerators are indeed delivering massive generational improvements, though 3–4× per generation is an upper-range estimate. NVIDIA’s GPUs illustrate this trend. The NVIDIA A100 (2020) was a workhorse for AI training, but the next-generation NVIDIA H100 (2022, “Hopper” architecture) is vastly more powerful.
NVIDIA’s technical disclosures show the H100 offers up to ~6× greater throughput than A100 on certain AI operations, thanks to architectural advances like new Tensor Cores, faster memory (HBM3), and support for lower-precision math (FP8).
In typical workloads, a single H100 is roughly 2–3× as fast as a single A100, and the gap widens further when taking advantage of 8-bit precision and sparsity features. That puts this particular generational jump broadly in line with Sacks’ “3–4× better” benchmark.
Looking ahead, NVIDIA’s Blackwell architecture (the GB200-class parts rolling out across 2024–25) is positioned to continue this trajectory of major gains. Competing AI chip vendors (such as Google with its TPU v4/v5, or AMD with the MI300 series) are similarly pushing for large leaps in performance per generation.
While Moore’s Law (transistor density) has historically yielded roughly 1.4× improvement per year, AI-specialized chips are exceeding that through architectural optimizations. It’s not guaranteed that every new chip will triple performance, but recent history (e.g. V100 → A100 → H100) shows multi-fold jumps are achievable.
Next-generation AI hardware: NVIDIA’s DGX GB200 “NVL72” rack with 72 interconnected GPUs
Next-generation AI hardware is scaling up dramatically. NVIDIA’s DGX GB200 “NVL72” rack is a 120 kW datacenter system that interconnects 72 GPUs (Blackwell-generation chips) via NVLink switches. Essentially, this single rack acts as one giant supercomputer: it’s quoted at over 1.4 exaFLOPS (1.4×10^18 operations per second) of AI compute at 4-bit precision – enough to hold models with tens of trillions of parameters in memory.
Nvidia and its partners claim that one NVL72 rack can deliver the same performance as ~144 conventional GPU servers, while using the same power as those older systems. That’s roughly a 25× performance boost at equal power from advanced interconnect and system design alone. This exemplifies Sacks’ point that it’s not just individual chips getting faster – it’s also how chips are combined.
It’s reasonable to say hardware could improve ~10× (or more) over a 4-year span (~2 generations). On its own that falls well short of Sacks’ rough 100× in four years for chips, but the gap narrows considerably once system-level innovations like the rack-scale gains above are factored in, as the rough sketch below illustrates.
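To make that arithmetic concrete, here is an illustrative sketch using the ranges cited above (per-chip gains of roughly 2.5–3.5× per generation, plus an assumed independent slice of the quoted ~25× rack-level gain); the split between chip-level and system-level contributions is an assumption, not a published figure.

```python
# Illustrative hardware arithmetic (assumed figures, based on the ranges above):
# roughly two GPU generations in a four-year window, each 2.5-3.5x faster
# per chip, plus some share of the quoted system-level (rack-scale) gains.

GENERATIONS = 2

for per_gen in (2.5, 3.0, 3.5):
    chips_only = per_gen ** GENERATIONS   # per-chip gain over ~4 years
    for system_share in (5, 10):          # assumed independent slice of the ~25x rack gain
        total = chips_only * system_share
        print(f"{per_gen}x per gen -> {chips_only:.1f}x chips alone; "
              f"x{system_share} system-level -> ~{total:.0f}x overall")

# Chips alone land around 6x to 12x over four years; adding even part of the
# system-level gains pushes the total toward the 100x Sacks assumes for hardware.
```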
It’s worth noting that some of the hardware improvement will translate into cost reduction rather than pure speed. Sacks himself acknowledged that part of the million-fold gain would be “captured in price reductions… in the performance ceiling”. For example, a new chip might offer the same performance at much lower cost or power.
From an economic standpoint, a 100× hardware improvement could mean that what used to require $1 million of GPUs can be done with $10,000 worth a few years later. Either way – higher max performance or much cheaper computing – the progress is impactful.
Current industry roadmaps indicate continued exponential hardware gains at least in the near term. Therefore, Sacks’ claim that the chip side of the equation will provide on the order of 100× more capability in four years is ambitious but not implausible, assuming one more major GPU generation and widespread adoption of advanced multi-GPU systems.
3. Compute Scaling: 100K→Millions of GPUs in Data Centers?
The third factor is the sheer scale of compute deployed for AI. Sacks noted that companies are rapidly expanding the number of GPUs (graphics processing units) in their AI training clusters. He gave an example: “when Elon first started training Grok (his AI at xAI), maybe 100,000 GPUs. Now they’re up to 300,000, on the way to a million… OpenAI’s data center, Stargate… within a couple years they’ll be at 5–10 million GPUs”. These specific numbers raised some eyebrows – are they accurate?
Current GPU deployment at leading AI firms: No company has yet publicly announced having millions of GPUs in one cluster, but the largest players are already in the hundreds of thousands. For instance, Meta (Facebook) has been aggressively building out AI infrastructure – CEO Mark Zuckerberg said in early 2024 that Meta would have purchased about 350,000 Nvidia H100 GPUs by the end of 2024, and would own compute equivalent to roughly 600,000 H100s once its other AI chips are counted.
This gives a sense of scale: one tech giant alone is acquiring over a third of a million top-of-the-line GPUs to power its AI efforts. Meanwhile, OpenAI/Microsoft have embarked on a project code-named “Stargate” to dramatically expand their compute capacity. According to Bloomberg and other reports, Microsoft plans to spend up to $100 billion on AI supercomputers for OpenAI.
The first Stargate data center campus (in Iowa and/or Texas) will accommodate as many as 250,000–400,000 GPUs in its initial phase, with about 64,000 of Nvidia’s next-gen “GB200” GPUs (Blackwell) expected to be installed by end of 2026 in one location. And that’s just a start – the full Stargate initiative envisions 5–10 such giant data center campuses by around 2030.
If realized, that could indeed mean on the order of millions of GPUs dedicated to OpenAI’s work across their facilities. Similarly, Google (which uses its custom TPUs as well as GPUs) and Amazon are scaling up AI infrastructure, and new entrants like Elon Musk’s xAI are investing heavily too.
Musk’s xAI reportedly acquired around 10,000 GPUs initially in 2023 for its “Grok” model, but has much bigger ambitions: Musk hinted that Grok 3 will be trained on 100,000 Nvidia H100 GPUs, a huge step up from the ~20,000 GPUs used for Grok 2. In fact, Musk is reportedly seeking billions in funding to reach perhaps 100k or more GPUs for xAI.
So while Sacks’ phrasing might have blurred some lines (it’s unlikely xAI actually had 100k GPUs at the very start; that figure may include future plans or other projects), the general trend he cites is valid – the top AI labs are racing from tens of thousands of accelerators to hundreds of thousands, and planning for millions.
100× more compute in four years: Sacks’ compute-scaling projection (100× increase in deployed GPUs over four years) is aggressive but not inconceivable given these initiatives. To put it in perspective, Microsoft’s first Azure supercomputer for OpenAI in 2020 had 10,000 GPUs. By 2023, that was upgraded to “tens of thousands” of Nvidia A100 GPUs in Azure (let’s say ~20–30k, a 2–3× increase).
Now, with the Stargate build-out, OpenAI could have on the order of 100k–200k GPUs by 2025-2026 (another ~5×+ increase), and as the later phases complete, possibly millions by 2027-2028. If Meta’s 350k H100 figure is achieved in 2024, that alone is a 5–10× jump from the number of top GPUs any company had just a couple years prior.
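As a rough sanity check on these growth rates, the following sketch annualizes the approximate milestones just described; the GPU counts are reported or estimated midpoints, and the 2028 figure is a hypothetical reading of “millions.”

```python
# Implied annual growth from the (approximate) GPU-count milestones above.
# The counts are reported or estimated midpoints, not official figures,
# and the 2028 entry assumes "millions" means roughly 3 million.

milestones = [
    (2020, 10_000),      # first Azure supercomputer built for OpenAI
    (2023, 25_000),      # "tens of thousands" of A100s (midpoint estimate)
    (2026, 150_000),     # early Stargate phase (midpoint of 100k-200k)
    (2028, 3_000_000),   # later phases, if the plans are fully realized
]

for (y0, n0), (y1, n1) in zip(milestones, milestones[1:]):
    annual = (n1 / n0) ** (1 / (y1 - y0))
    print(f"{y0}-{y1}: {n1 / n0:.1f}x total, ~{annual:.1f}x per year")

# For comparison, Sacks' "100x in four years" requires a sustained
# 100 ** 0.25 = ~3.2x per year, steeper than the 2020-2026 record but in
# the range of the later build-out phases if they happen on schedule.
```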
So, extrapolating to 5–10 million by ~2027 might be on the optimistic side, but it’s not a wild fantasy if AI demand and investments stay on their current exponential trajectory. It would require enormous capital expenditures (on the order of tens of billions of dollars for hardware, plus building new data centers) and addressing technical challenges (power and cooling for so many GPUs, supply chain limits, etc.).
Not every player will scale that far that fast – it’s likely to be a small handful of projects globally (e.g. Microsoft/OpenAI’s Stargate, perhaps a similar effort by Google or a government lab). Nonetheless, the notion of a 100× expansion in total AI compute over four years is supported by the plans currently in motion in the industry’s upper echelons.
It’s important to note that these millions of GPUs would not be one monolithic supercomputer but distributed across many locations and clusters. For effective training of a single model, there are diminishing returns to adding more and more GPUs beyond a point (due to communication overheads and algorithmic parallelization limits).
However, techniques are improving to scale training across more chips (e.g. efficient model parallelism, sharding, better networking like NVLink switches). If those hurdles are overcome, future AI models (possibly aiming at AGI-level complexity) might indeed utilize hundreds of thousands of GPUs in a coordinated run.
Even if not, having millions of GPUs total means many AI models can be trained or run simultaneously, pushing overall AI capabilities in the world dramatically forward. In short, Sacks’ third claim – exponential growth in deployed compute – is broadly accurate, though the exact numbers he quoted for specific companies are speculative.
4. Combining the Trends: Will We Really See a 1,000,000× Leap?
David Sacks’ conclusion was that when you multiply these three factors – better algorithms (~100×), better chips (~100×), and massively more chips (~100×) – you get a potential 1,000,000-fold increase in AI capability in about four years. Is such a breathtaking gain plausible? It’s an optimistic scenario, but not strictly impossible on paper. Each vector of improvement, as we’ve seen, has some grounding in current trends. However, there are caveats:
- Independent vs. overlapping improvements: The three factors aren’t completely independent. For example, algorithmic advances often allow using less compute to achieve the same result – which could reduce the need for scaling up hardware. Conversely, the availability of huge compute can enable researchers to discover new algorithmic techniques (as larger experiments become feasible).
- Bottlenecks and limits: Maintaining 3–4× per year improvement on each front could prove challenging. On the algorithm side, some researchers wonder if we’ll hit diminishing returns on scaling up current architectures. For hardware, there are physical and engineering limits: power consumption, fabrication constraints, memory bandwidth and data movement (which is increasingly the bottleneck at scale).
- Timeline: Sacks’ timeline of four years might be slightly too short for the full effect. The compute deployment plans (millions of GPUs) appear to target the late 2020s, toward the end of the decade, rather than just 2027.
The Bottom Line
The notion of a million-fold boost in AI capability in a handful of years is dramatic but not baseless. If you compare a 2023 state-of-the-art AI (say GPT-4 running on an A100 cluster) to what might exist in 2027 or 2028 (a much more efficient model, running on tens of thousands of H100/B100-class chips, with many more parameters and possibly new agentic abilities), the difference could indeed be on the order of 10^6.
Some of that will be experienced as cost/performance improvements (making advanced AI accessible and deployed widely), and some as pushing the frontier of what AI can do. Sacks’ claim underscores how people often underestimate exponential growth – going 10× every two years compounds to 100× in four years, not 20×.
Many experts would agree that we are in a period of unusually rapid AI progress. As one analysis succinctly put it: “AI models require 10 times less compute to reach the same accuracy every two years… and the computing power applied to AI has been growing >4× per year”. These two trends together yield a >10× annual effective improvement – very much in line with Sacks’ argument.
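The arithmetic behind that quote, and the compounding point above, can be checked in a couple of lines; the two rates are taken directly from the analysis quoted.

```python
# Check of the combined-trend arithmetic in the quote above.

algorithmic_per_year = 10 ** 0.5   # "10x less compute every two years" is ~3.16x/yr
compute_per_year = 4               # ">4x per year" growth in compute applied to AI

print(algorithmic_per_year * compute_per_year)   # ~12.6x effective per year, i.e. ">10x"

# And the compounding point: 10x every two years means 10 * 10 = 100x over
# four years, not 10 + 10 = 20x.
print(10 * 10, "vs", 10 + 10)
```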
However, a dose of skepticism is healthy. Predicting technological progress is notoriously difficult. Unforeseen roadblocks (or a leveling off of investment) could slow things down. On the other hand, breakthroughs (like new algorithms, or quantum leaps in hardware design) could accelerate things further.
But as of mid-2025, the roadmap that Sacks outlines – vastly better models, chips, and compute – is broadly corroborated by industry data and plans. If anything, the real debate is not “will AI improve 1,000,000×?”, but rather what will society do with AI if and when it becomes millions of times more powerful than today.
Sacks believes most people underestimate the coming shock; the evidence reviewed here suggests he’s right that a paradigm shift is coming, even if the exact figure of a million-fold can be quibbled with. Barring unexpected stagnation, AI circa 2028 will likely make today’s ChatGPT look as quaint as a flip-phone in the age of the smartphone.