
With demand for computing power under question, NVIDIA plans to acquire a "GPU landlord." What kind of business is computing power leasing?

NVIDIA plans to acquire GPU rental company Lepton AI, which does not operate data centers or servers itself but rents capacity from cloud providers and subleases it to its own customers. Analysts believe that as cheaper systems become more widely available and drive down overall computing costs, the computing power rental market has tipped decisively into a buyer's market, prompting NVIDIA to pursue vertical integration to consolidate its "dominant" position.
Author: Li Xiaoyin
Source: Hard AI
According to media reports, chip giant NVIDIA is close to finalizing a deal to acquire GPU rental company Lepton AI, with the transaction amount expected to reach several hundred million dollars.
At a time when demand for computing power is being questioned, the acquisition marks an important step into cloud computing for NVIDIA, putting it in direct competition with major cloud providers such as Amazon and Google.
Lepton AI: A GPU "Sub-landlord"
Lepton AI was founded in 2023 and is headquartered in Palo Alto, California, USA. It is a startup that provides GPU computing power rental services.
Previously, Lepton provided AI cloud services for gaming startup Latitude.io and research startup SciSpace. Its co-founders, Yangqing Jia and Junjie Bai, are former AI researchers at Meta.
Unlike an ordinary cloud service provider, Lepton does not operate data centers or servers itself; it rents capacity from cloud providers and subleases it to its own customers. The company does not actually own any GPUs.
Analysts suggest that Lepton's business focuses on the specific needs of AI training and inference, providing optimized GPU clusters and related technical services. The asset-light model also spares Lepton the financial pressure of heavy capital investment.
For training, Lepton offers a Slurm-like job submission workflow. Hands-on tests show that existing sbatch scripts can be adapted to run on the Lepton platform in just a few minutes, and the conversion is quite intuitive (a sketch of the kind of Slurm workflow being adapted follows below).
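To make that concrete, here is a minimal sketch of a standard Slurm sbatch workflow, the kind of script the article says can be carried over to Lepton in minutes. The directives, the train.py entry point, and the Python wrapper are generic illustrations; Lepton's own job syntax is not shown here.

```python
# Minimal sketch of a standard Slurm batch-submission workflow, the kind of script
# the article says can be adapted to Lepton in minutes. All names used here
# (llm-train, train.py, config.yaml) are illustrative placeholders.
import subprocess
import tempfile

SBATCH_SCRIPT = """#!/bin/bash
#SBATCH --job-name=llm-train        # job name shown in the queue
#SBATCH --nodes=2                   # number of GPU nodes to reserve
#SBATCH --gpus-per-node=8           # GPUs per node
#SBATCH --time=24:00:00             # wall-clock limit
#SBATCH --output=train_%j.log       # stdout/stderr, %j = job id

srun python train.py --config config.yaml
"""

def submit(script_text: str) -> str:
    """Write the batch script to a temp file and hand it to sbatch."""
    with tempfile.NamedTemporaryFile("w", suffix=".sbatch", delete=False) as f:
        f.write(script_text)
        path = f.name
    # On a Slurm cluster this prints e.g. "Submitted batch job 12345".
    result = subprocess.run(["sbatch", path], capture_output=True, text=True, check=True)
    return result.stdout.strip()

if __name__ == "__main__":
    print(submit(SBATCH_SCRIPT))
```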
Another major highlight of the Lepton platform is its visualization capabilities.
Analysis indicates that Lepton provides a console dashboard where users can view the lifecycle of each node and the status of the jobs running on it. This node-lifecycle visualization is rated as excellent, second only to CoreWeave's, and it is crucial for monitoring and managing GPU resources, helping users identify and resolve issues promptly.
Computing Power Rental Market: From a Seller's Market to a Buyer's Market
Currently, the computing power rental market is undergoing profound changes.
The well-known U.S. semiconductor research firm SemiAnalysis points out that computing costs fall over time and that the computing power rental market has clearly shifted into a buyer's market: more than 100 GPU cloud providers are now competing for essentially the same customer base, intensifying price competition.
Jensen Huang also expressed a similar view in his GTC speech last week:
"When Blackwell starts shipping in large quantities, even Hopper will be ignored." The key to this phenomenon lies in the fact that the cost of the computing power market is determined by the weighted average cost of each type of GPU. This means that once the availability of systems with lower computing costs increases, it will drive down the overall computing costs, which in turn will lower the rental prices of older cards.
For example, the GB200's inference cost (dollars per million tokens) is 75% lower than the H100's, and its training cost (dollars per effective PFLOP per hour) is 56% lower.
This means that for the H100 to remain competitive, its price must fall significantly. SemiAnalysis calculates that, for customers to find the two chips "indistinguishable" on cost, H100 hourly rental prices would need to fall by roughly 65%.
More intuitively, if the rental price of the GB200 is $2.20 per GPU per hour, then the rental price of the H100 needs to drop to $0.98 per GPU per hour.
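A back-of-the-envelope check of these figures (the notation and the implied throughput ratio are ours, not quoted by SemiAnalysis): with r the hourly rental price per GPU and p the effective training throughput per GPU, renters are indifferent when the price per unit of effective compute matches,

```latex
\frac{r_{\mathrm{H100}}}{p_{\mathrm{H100}}} \;=\; \frac{r_{\mathrm{GB200}}}{p_{\mathrm{GB200}}}
\quad\Longrightarrow\quad
\frac{p_{\mathrm{GB200}}}{p_{\mathrm{H100}}} \;=\; \frac{r_{\mathrm{GB200}}}{r_{\mathrm{H100}}}
\;=\; \frac{\$2.20}{\$0.98} \;\approx\; 2.2 .
```

Read this way, the quoted prices amount to crediting the GB200 with roughly 2.2 times the effective per-GPU training throughput of the H100.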
SemiAnalysis predicted last year that GPU rental prices would keep falling as H100 production ramped, and that, with buyers shifting their focus to Blackwell, the downward trend would persist through the end of 2024.
That prediction has proven accurate.
This competitive landscape puts significant pressure on specialized GPU rental companies like Lepton and prompts NVIDIA to consider market consolidation through acquisitions to further strengthen its dominant position in AI computing infrastructure.
NVIDIA's Ambitions Are Plain to See: An Aggressive Push into Cloud Services
NVIDIA CEO Jensen Huang has been referred to as the "Chief Revenue Destroyer" by SemiAnalysis, a title that reflects NVIDIA's aggressive expansion strategy in the computing market in recent years.
SemiAnalysis points out that by acquiring Lepton, NVIDIA not only gains an additional revenue stream but may also squeeze the room other cloud service providers have to operate in.
Furthermore, this vertical integration strategy lets NVIDIA capture profit along the entire chain from chip design to computing power rental, while giving it tighter control over how its GPUs are used and priced, further strengthening its dominant position in AI computing.
Currently, NVIDIA's cloud and software business is still in its infancy. Under this model, NVIDIA rents servers powered by its own chips directly to enterprises and provides software that helps companies develop AI models and applications and manage GPU clusters for AI training. NVIDIA has previously said this business could eventually generate $150 billion in revenue, a figure that exceeds the current annual revenue of either NVIDIA or Amazon's AWS.