Tencent's Qiu Yuepeng: The explosion of inference demand requires simultaneous upgrades to cloud infrastructure

Wallstreetcn
2025.09.16 08:03

Tencent Cloud has made breakthroughs in inference acceleration, Agent Infra, and international expansion

Author | Huang Yu

In 2025, with the explosion of AI applications and the arrival of the Agent era, the demand for inference is surging. To seize this opportunity, cloud service providers are actively upgrading their cloud infrastructure to meet market demands.

On September 16, at the 2025 Tencent Global Digital Ecosystem Conference, Qiu Yuepeng, Vice President of Tencent Group and President of Tencent Cloud, said it is now an industry consensus that the focus of the large model industry has shifted from training to inference. At the same time, customers have shown strong enthusiasm for using large models and building Agents, driving a surge in inference demand.

This also means that AI infrastructure needs to be upgraded simultaneously.

In recent years, Tencent Cloud has been continuously upgrading its cloud infrastructure to support the large-scale deployment of Agents and the global expansion of enterprises. According to Qiu Yuepeng, Tencent Cloud has made breakthroughs in inference acceleration, Agent Infra, and international expansion, and will take a more open stance to help enterprises seize the opportunities of the times.

In inference acceleration, Tencent Cloud has been an active open-source contributor, submitting multiple optimization technologies to communities such as DeepSeek, vLLM, and SGLang. To address the memory bottleneck in large model inference, Tencent Cloud has also independently developed and open-sourced FlexKV, a multi-level caching technology that significantly reduces the KVCache footprint and cuts first-byte latency by up to 70%.

Additionally, Qiu Yuepeng revealed that Tencent Cloud integrates various chip resources through a heterogeneous computing platform to provide cost-effective AI computing power to external customers. The platform is now fully adapted to mainstream domestic chips.

Full-stack co-optimization of software and hardware is reportedly a long-term strategic investment for Tencent Cloud. The heterogeneous computing platform relies on its software capabilities to integrate different types of chips and deliver cost-effective AI computing power.

This year is regarded as the Agent era. As cutting-edge technologies move into enterprise production environments, ensuring that they run efficiently in a safe and trustworthy environment has become a new challenge. To this end, Tencent Cloud has launched a brand-new Agent Infra solution, Agent Runtime.

Agent Runtime integrates five major capabilities: execution engine, cloud sandbox, context service, gateway, and security observability service. Among them, the cloud sandbox, based on self-developed technology, has a startup time of only 100 milliseconds and supports hundreds of thousands of concurrent instances.

In addition to upgrading infrastructure for Agents, Qiu Yuepeng pointed out that Tencent Cloud is also considering how to apply Agent capabilities to customers' cloud journeys, helping them better use and manage the cloud. This led to the creation of Cloud Mate, Tencent Cloud's expert service agent.

Cloud Mate consists of a series of sub-Agents that encapsulate experience from various cloud domains. It is not just a technology but a distillation of Tencent Cloud's extensive practice, capable of visual governance of cloud architecture, proactive risk interception, and significantly faster problem resolution, thus changing the way cloud management is conducted. Qiu Yuepeng revealed that in internal practice, Cloud Mate achieved a 95% interception rate for risky SQL and reduced troubleshooting time from 30 hours to as little as 3 minutes.

The Agent era is surging in, and cloud service providers are actively gearing up for this arms race.