With the rapid development of artificial intelligence technology, large models, with their powerful data processing and deep learning capabilities, are being integrated into one field after another, gradually becoming key drivers of industrial innovation and a key engine of new quality productive forces. According to the latest official data, as of March this year, 117 generative AI services had completed filing in China, the number of Chinese large models of various kinds had exceeded 200, and the application scenarios of multimodal large models are continuously expanding.
However, as large models develop rapidly, the cost of computing power has increasingly become a major factor constraining the promotion and application of artificial intelligence. The high prices of large model products have long held back the development of AI applications. Recently, after ByteDance took the lead in cutting large model usage prices to fractions of a cent, giants such as Baidu, Alibaba, and Tencent followed suit, quickly pushing large models toward a free era.
The era of free large models is accelerating
On May 15, ByteDance's Volcano Engine cloud service platform released the Doubao large models, among the first batch in China to pass algorithm filing. The flagship model processes more than 1,500 Chinese characters for just 0.8 li (0.0008 yuan), 99.3% below the prevailing industry price. This pushed large model market pricing from "pricing by the fen" to "pricing by the li," helping enterprises accelerate business innovation at lower cost. After ByteDance fired the first shot in the price cuts, Baidu and Alibaba followed suit.
On the morning of May 21, Alibaba Cloud announced that the API input price of Qwen-Long, the flagship GPT-4-class model of Tongyi Qianwen, had been cut to 0.0005 yuan per thousand tokens, a drop of 97%. Baidu then stated via its official WeChat account that two flagship models of the Wenxin family, ERNIE Speed and ERNIE Lite, would be completely free. One price cut and one free offering: the moves by Alibaba Cloud and Baidu AI Cloud signal that domestic large model companies have entered a mode of price competition.

On the one hand, price wars help the giants capture more customer resources, rapidly expanding market share and preserving their leading positions. As the user base and scenarios for large models keep expanding, the main trend in model-call pricing is steadily improving performance at steadily falling prices. As call prices decline, the cost of using these models falls further, which will push large models into a period of rapid growth and accelerate the industry's development. Compared with Baidu and Alibaba, which have strong technology and resources, ByteDance started the price war first, essentially hoping to use the opportunity to leapfrog the competition. Baidu, Alibaba, and Tencent chose to follow, clearly unwilling to fall behind, let alone cede the market.
On the other hand, as the large model industry develops rapidly, inference costs have fallen fast, forming the basis for the price cuts. According to Baidu's official disclosure, compared with a year ago, the training efficiency of the Wenxin large model has risen to 5.1 times its original level, weekly effective training time has reached 98.8%, inference performance has improved 105-fold, and inference cost has dropped to 1% of what it was. In other words, a customer who previously made 10,000 calls a day can now make 1 million calls a day at the same cost.
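That last figure is simply the arithmetic implied by the 1% cost claim. A quick sketch (illustrative numbers only; the unit cost of 1.0 is arbitrary, since only the ratio matters) makes the relationship explicit:

```python
# If per-call inference cost falls to 1% of the original, a fixed daily
# budget buys 100x as many calls.

original_cost_per_call = 1.0                        # arbitrary unit
new_cost_per_call = original_cost_per_call * 0.01   # cost cut to 1%

daily_budget = 10_000 * original_cost_per_call      # budget for 10,000 calls/day
calls_before = daily_budget / original_cost_per_call
calls_after = daily_budget / new_cost_per_call

print(int(calls_before), int(calls_after))          # 10000 1000000
```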
In the view of industry insiders, reducing costs is the key to pushing large models into the "value creation stage": only large-scale usage can polish good models, and scale in turn greatly lowers the unit cost of inference. The price cuts therefore cannot stop at cheap lightweight versions; the flagship and most advanced models must also become affordable, so as to truly meet the complex business needs of enterprises, validate the application value of large models, and promote AI application innovation and value creation. With the leading vendors joining the price cuts, a genuinely free era of large models may be arriving faster than expected.
The AI infrastructure competition has entered the application stage
Under the AI trend, cloud service providers, as the builders of AI infrastructure, focused mainly on large model products last year. This year, competition among the major players is no longer limited to technology; what matters more now is price and concrete landing scenarios.

Firstly, industry giants are intensifying their open-source efforts to expand the influence of their large model ecosystems and support more AI-native application innovation. On May 9, Alibaba Cloud officially released Tongyi Qianwen 2.5, declaring that its performance comprehensively surpasses GPT-4 Turbo and billing it as the strongest Chinese large model available. At the Alibaba Cloud AI Summit, Alibaba Cloud CTO Zhou Jingren said that daily API calls to Alibaba Cloud's large models have exceeded 100 million; beyond everyday consumer (2C) applications, they also serve 90,000 corporate clients, and downloads of the Tongyi open-source models have exceeded 7 million.
Tencent's Hunyuan text-to-image (Wenshengtu) large model has likewise gone open source. On May 14, Tencent announced a comprehensive upgrade of the model and opened it to the public. It is also the first Chinese-native model with an architecture similar to Sora's that supports both Chinese and English input and understanding, with 1.5 billion parameters. Looking at the market, the existing open-source text-to-image ecosystem, exemplified by Stable Diffusion, is essentially built around English semantic understanding, requiring Chinese to be translated into English before images are generated. Tencent's Hunyuan text-to-image model breaks that pattern, allowing the text-to-image ecosystem to understand Chinese natively.
In fact, Tencent's large model applications already show a clear product mindset. Since the launch of the Hunyuan large model in September last year, Tencent has pursued a "family bucket" strategy, an all-in-one suite, with over 400 internal services and products now connected to Hunyuan: for instance, the "AI Ask Books" feature in WeChat Reading and the "AI Assistant" in Tencent Meetings. Evidently, while open-sourcing their models, the internet giants are also trying to set an example for other partners with their own AI applications.
Secondly, engineering optimization of large models is accelerating the decline of inference costs, clearing the way for large models to flourish at scale in application scenarios. In the past, conventional models mostly ran single-machine inference, whereas large models rely on distributed inference. For example, many companies in the industry now use MoE (mixture-of-experts) architectures, which run multiple experts in parallel but activate only a subset of them during inference, greatly compressing the effective parameter volume and inference cost. Indeed, to let users access AI cheaply and spur application development, price cuts have become a consensus among domestic large model vendors.
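The compute saving behind expert routing can be sketched in a few lines. This is a toy illustration of the general MoE idea, not any vendor's architecture: the dimensions, the linear experts, and the softmax gating over the selected experts are all assumptions made for the example.

```python
import math
import random

# Toy mixture-of-experts forward pass: a gating function scores N experts
# for each input, and only the top-k experts are evaluated, so per-call
# compute scales with k rather than N.

random.seed(0)
D, N_EXPERTS, TOP_K = 8, 4, 2

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

gate_w = rand_matrix(D, N_EXPERTS)                       # gating weights
experts = [rand_matrix(D, D) for _ in range(N_EXPERTS)]  # one weight matrix per expert

def matvec(m, v):
    """Row-vector times matrix: v (len rows) @ m (rows x cols) -> len cols."""
    return [sum(m[i][j] * v[i] for i in range(len(v))) for j in range(len(m[0]))]

def moe_forward(x):
    scores = matvec(gate_w, x)                           # one score per expert
    top = sorted(range(N_EXPERTS), key=scores.__getitem__)[-TOP_K:]  # k best experts
    exp_s = [math.exp(scores[i]) for i in top]
    weights = [s / sum(exp_s) for s in exp_s]            # softmax over chosen experts
    # Only the k selected experts run; the other N - k are skipped entirely.
    out = [0.0] * D
    for w, i in zip(weights, top):
        y = matvec(experts[i], x)
        out = [o + w * yj for o, yj in zip(out, y)]
    return out

print(len(moe_forward([random.gauss(0, 1) for _ in range(D)])))  # 8
```

With 4 experts and top-2 routing, each call pays for only half the expert compute; real MoE deployments push this ratio much further, which is the cost-compression mechanism described above.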
In February this year, Alibaba Cloud announced across-the-board price cuts for cloud products on its official website, averaging more than 20% off and reaching as high as 55%, covering more than 100 products and over 500 product specifications. In April, Alibaba Cloud extended the cuts to its overseas public cloud products. The reason the big companies are waging a price war is simple: the lower the price of large models, the more people use them; and the greater the usage, the more the models are called and improved. In short, pricing power can promote the better implementation of large models.

The chess game of large models reaches the middle game
Since the emergence of ChatGPT, the popularity of large models has been plain to see. Facing the technological revolution they bring, it is not only major companies such as Microsoft, Google, Baidu, and ByteDance that are developing large models; AI startups keep emerging as well, and the industry has seen the rise of the AI "Four Little Dragons": Baichuan Intelligence, Dark Side of the Moon (Moonshot AI), MiniMax, and Zhipu AI. As the major companies cut their API prices, these venture-backed startups, the new AI Four Little Dragons included, face a brand-new test.
On one hand, cloud providers led by BAT have become the main force behind the price cuts, directly hitting small and medium-sized startups that rely on selling B-end APIs. In the view of industry professionals, the aim of this round of cuts is not direct end-user adoption but attracting developers. In the short term, large model performance has hit a bottleneck; no company can currently present a new killer feature, so cutting prices has become the top priority.
Since most large model apps are now free, their user bases have essentially plateaued, OpenAI's included. For some time, promotion costs for major AI model apps have soared while returns on that spending have plunged. Under these circumstances, more developers must be brought in to build applications that can draw users in.
However, faced with the majors' price cuts, AI startups have not followed suit but chosen to stay on the sidelines. As one investor active in AI large model deals put it, "This wave of price cuts hits the B-to-B model of some startups hard." In the past, many companies partnered with startups mainly because startups' APIs were cheaper than the majors'; now there is virtually no way to undercut the majors, which means the startups' B-end commercialization model no longer holds.

In this scenario, startups squeezed by the price war will have to seek entirely new business models. If they fail to find one within a certain period, they will face a test of survival, triggering a major reshuffle among venture-backed firms in the industry. This will also push some of them to accelerate their exploration of niche opportunities in vertical fields, developing new AI applications and shifting their focus from the B-end to the C-end.
On the other hand, the current price war can be seen as a byproduct of the "battle of a hundred models," and falling prices may help the leading large model companies accelerate consolidation. The market for Chinese large models is quite limited at present, and not every model can succeed: as ecosystem products, large models either dominate the market or fade away. Besides BATH, vendors such as ByteDance, iFLYTEK, and SenseTime are also in the AI large model race, and each inevitably falls into an involuted contest over compute scale. Since their functional differences are small, under such homogenized competition a price war is all but inevitable.
Furthermore, with device makers such as Apple, Microsoft, and Lenovo investing heavily in on-device AI computing power and local large models, AI PCs and AI phones are going mainstream, significantly narrowing the use cases for general-purpose large models. This forces many large model vendors to seek breakthroughs and quickly deliver innovative applications to counter the external threat.
Thus, what looks on the surface like large models cutting prices is really a contest among the various players involved. Under the price war, small and medium-sized startups are pivoting to new directions while the big cloud providers seize market share; a major reshuffle has clearly begun.
The logic of the large model competition has changed

In fact, since the price cuts began, the competitive logic around AI large models has already shifted. In the words of industry insiders, enterprises' use of AI is not cost-driven but determined by whether it generates business value. This may become the core logic of large model competition for the foreseeable future.
Firstly, the API-call model of foundation large models remains far removed from actual business operations. What decides whether enterprises use AI is not cost but whether it works well and is easy to use. Merely cutting API prices faces little barrier to spreading across the industry; achieving real commercial success in the B-end is far harder, because how AI large models can be woven into corporate operations and deliver corresponding business benefits is the core question enterprises care about.
However, many large models remain superficial and have a long way to go before genuine business deployment. Without effective integration into corporate operations, no price, however low, will strike companies as valuable. What truly prompts a purchase is the efficient, user-friendly experience AI brings and the cost savings and efficiency gains that follow. Whether a price cut achieves its goal therefore depends on user experience and feedback; otherwise it remains the vendors' one-sided wish.
Secondly, the API price cuts have sounded an alarm about the industry's involution. Simply piling up parameters and competing on compute and price is not the best path to healthy development; in the future, only differentiation offers a way out. As in every industry, the road from early chaos to all-out melee is typically marked by fierce price wars. Now, after the fervor of the "battle of a hundred models," the price war has broken out and the downsides of homogenized competition are gradually surfacing.
In fact, some players in the industry are already making attempts aimed at the C-end market. For example, when Baichuan Intelligence recently released its foundation model Baichuan 4, it simultaneously launched its first AI assistant, "Bai Xiaoying," an application similar to AI search. In Wang Xiaochuan's view, in the Chinese business environment the C-end market is ten times the size of the B-end, and a dual-wheel-drive strategy of "foundation model" plus "AI application" is required.

Looking ahead, as the industry's large models undergo an accelerated reshuffle, enterprises will either dig deep into vertical application fields, or combine their strengths with large model companies to build smaller models suited to their own needs, rather than blindly joining the large model race and competing purely on quantity and parameters.