Today, China's AIGC market is booming, with ChatGPT, AutoGPT, and domestic platforms such as Baidu's Wenxin Yiyan (ERNIE Bot) all competing for attention and constantly appearing in trending searches.
Huaxi Securities predicts that the global AI software market will reach $126 billion by 2025, with a compound annual growth rate of 41.02% from 2021 to 2025.
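As a quick sanity check on those figures, the compound annual growth rate (CAGR) relation can be sketched in a few lines of Python; note the 2021 base value below is derived from the cited numbers, not stated in the Huaxi Securities report:

```python
# Compound annual growth: value_n = base * (1 + cagr) ** years
def project(base, cagr, years):
    """Project a market size forward at a compound annual growth rate."""
    return base * (1 + cagr) ** years

# Working backward from the forecast: a $126B market in 2025, growing at
# 41.02% per year over 2021-2025 (4 compounding periods), implies a 2021
# base of roughly $31.9B (derived here, not quoted from the report).
base_2021 = 126 / (1 + 0.4102) ** 4
print(round(base_2021, 1))                       # ~31.9 (billion USD)
print(round(project(base_2021, 0.4102, 4), 1))   # back to 126.0
```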
Behind the prosperity of ChatGPT lies astronomical computing power. According to estimates, the AI training servers needed to train a single large language model cost about $200 million, while the AI inference servers ChatGPT required in its early stages cost about $4.5 billion.
As a result, the AI server race behind ChatGPT is also on the rise.
Computing power is the core engine of large models such as ChatGPT, and the formula is simple: the number of GPU chips, especially high-end ones, directly determines how much computing power is available.
The computing power required for ChatGPT is not fixed but rather increases incrementally. The smarter ChatGPT becomes, the more computing power it requires.
According to media estimates, a single training run for a large language model costs around $5 million; training the GPT-3 model cost about $1.4 million, while Google's PaLM model cost about $11.2 million to train.
According to Microsoft executives, the AI supercomputer that provides computing power for ChatGPT is a top-tier supercomputer Microsoft built in 2019 with a $1 billion investment, equipped with tens of thousands of NVIDIA A100 GPUs and backed by more than 60 data centers with hundreds of thousands of NVIDIA GPUs deployed.
To meet the increasing computing power demands of ChatGPT, Microsoft announced that it will launch an Azure scalable AI virtual machine series based on NVIDIA's latest flagship chip, the H100 GPU, and NVIDIA's Quantum-2 InfiniBand network interconnect technology to significantly accelerate AI model development.
It seems that behind ChatGPT is NVIDIA, NVIDIA, and more NVIDIA.
In fact, as a hardware giant, NVIDIA not only dominates the consumer graphics card market but is also the top choice for AI server chips.
As the saying goes, scarcity drives up prices. NVIDIA's flagship H100 chip rose by nearly 70,000 yuan in a single week and now generally sells for around 300,000 yuan. The second-tier flagship A100 rose from 60,000 yuan to 90,000 yuan in just over three months, an increase of more than 50%.
Not only are the chips expensive and hard to come by, but the United States has also restricted NVIDIA's chip sales. In August 2022, the US government issued an export control policy prohibiting NVIDIA from selling A100 and H100 chips to China.
To avoid losing the Chinese market and comply with US export controls, NVIDIA subsequently launched “performance-restricted” A800 and H800 chips. However, these two chips were also quickly snapped up by the market, and prices soared as a result.
Led by Baidu, Alibaba, and Tencent, most major domestic internet companies have announced their entry into large models. According to market statistics, more than 10 Chinese large models have been announced since ChatGPT's debut.
To reach ChatGPT's level, at least 3,000 A100 chips are needed; at 90,000 yuan per chip, that is 270 million yuan for a single large-model deployment. Ten large models would require 30,000 A100 chips, costing 2.7 billion yuan.
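The back-of-the-envelope arithmetic above can be written out explicitly (all figures taken from the text):

```python
# Chip-cost arithmetic using the figures cited above.
chips_per_model = 3_000      # minimum A100 chips for a ChatGPT-level model
price_per_chip = 90_000      # yuan, the post-increase A100 price
num_models = 10              # large models reportedly planned in China

cost_per_model = chips_per_model * price_per_chip   # 270,000,000 yuan
total_chips = chips_per_model * num_models          # 30,000 chips
total_cost = cost_per_model * num_models            # 2,700,000,000 yuan

print(f"Per model: {cost_per_model:,} yuan")   # Per model: 270,000,000 yuan
print(f"Ten models: {total_cost:,} yuan")      # Ten models: 2,700,000,000 yuan
```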
In addition to the astronomical cost of the chips, the training costs are also significant. However, based on NVIDIA’s current delivery time, it is not easy to buy enough chips.
In a flash, the era of mining cards has returned.
During the cryptocurrency craze a few years ago, NVIDIA, as the go-to supplier of graphics cards for mining, earned $4.8 billion in just a few years. Now, with ChatGPT, NVIDIA is once again riding the wave and history is repeating itself.
Faced with surging market demand, NVIDIA, rising again with the AI wave, has shrewdly launched "computing power leasing" services.
On March 21, 2023, at the GTC conference, NVIDIA founder and CEO Jensen Huang announced NVIDIA DGX Cloud™, which provides enterprises with the infrastructure and software needed to train generative AI models. Each DGX Cloud instance is equipped with eight H100 or A100 80GB GPUs, and enterprises can rent DGX Cloud clusters monthly in a "cloud leasing" model, with prices starting at $37,000 per instance per month.
Does NVIDIA really have no substitutes? Why do enterprises prefer to lease from NVIDIA rather than choose other GPU chip manufacturers?
According to IDC data, GPU servers accounted for more than 88.4% of China's AI server market in 2021, with over 80% of those products using NVIDIA chips.
Large AI models place higher demands on chips' processing precision and computing speed. In the field of supercomputing, double-precision floating-point (FP64) performance is a hard requirement for high-performance computation, and at present NVIDIA's H100 and A100 are the only chips with this capability.
The restrictions imposed by the United States are not limited to the sale of NVIDIA chips, but also limit the research and development of Chinese companies in terms of technology, equipment, and materials. However, despite the heavy restrictions in the United States, Chinese companies are still producing some impressive products.
According to the latest “China Accelerated Computing Market (H2 2021) Tracking Report” released by IDC, the scale of China’s AI server market reached 35.03 billion yuan in 2021, a year-on-year increase of 68.6%.
In the enterprise-level GPU chip field, Chinese manufacturers have made their moves: Biren Technology launched the BR100 chip in 2022, Tianshu Zhixin (Iluvatar CoreX) launched the Tiangai 100, and Cambricon launched the Siyuan 270. Among them, Biren Technology claims the BR100 delivers the highest computing power in the world, with peak performance more than three times that of flagship products on the market: its 16-bit floating-point performance exceeds 1,000 TFLOPS, its 8-bit fixed-point performance exceeds 2,000 TOPS, and single-chip peak performance reaches the PFLOPS level.
Although the figures look good, without the crucial FP64 capability these chips still cannot fully replace NVIDIA's H100 and A100. In addition, NVIDIA's CUDA platform has already become the most widely used AI development ecosystem, and it runs only on NVIDIA GPUs, making replacement by domestic chips impossible at this stage.
Although Chinese chip manufacturers are catching up in the GPU chip field, the technology gap and US restrictions are still key issues that require more effort and time.
The rise of large AI models like ChatGPT is benefiting not only the AI server and GPU chip markets but also the storage market. Running ChatGPT requires training data, model algorithms, and high computing power, and the underlying high-computing-power infrastructure is the foundation for processing massive data and completing training.
The most obvious sign is that, after several iterations, the parameter count of the GPT series has grown from 117 million (GPT-1) to 175 billion (GPT-3), roughly a 1,500-fold increase (175 billion ÷ 117 million ≈ 1,500), which also poses great challenges for compute and storage.
With the opening of a new era of AI, it is expected that the global data generation, storage, and processing volume will increase exponentially, and storage will significantly benefit. Computing storage is an important cornerstone of ChatGPT, and with tech giants like Alibaba and Baidu entering the ChatGPT game, the overall demand for computing storage in the market will only continue to rise rapidly.
With the continued popularity of AIGC, regions with developed digital economies such as Beijing, Shanghai, and Guangzhou have introduced policies to promote the construction of intelligent computing centers. For example, Beijing has proposed to "build a new batch of computing data centers and artificial intelligence computing power centers, and cultivate them into artificial intelligence computing power hubs by 2023"; Shanghai has proposed to "lay out and build a batch of high-performance, high-throughput artificial intelligence computing power centers, and promote the construction of public computing power service platforms"; and so on.
Meanwhile, all industries will face the baptism of ChatGPT. With the arrival of a new wave of artificial intelligence, industries related to AI will have broad market space.
Chinese companies are also bound to break through the constraints imposed by the United States and free themselves from these unfair restrictions.