Serenity: U.S. AI companies need to lower inference costs to address the "model distillation" challenge

2026-06-26 13:14

Serenity, known as the "White-Haired Stock Guru," said that although some observations in the UBS report hold up anecdotally, the more noteworthy development is the growing number of Chinese-language reports about the distillation of Anthropic's models. Many U.S. startups and tech companies currently tend to use cheaper Chinese models (such as DeepSeek) in their AI applications, because the cost per task is significantly lower than with inference models such as Gemini or those from OpenAI and Anthropic.

Serenity believes that this market-driven trend creates a "classic paradox": companies will naturally choose the lower-cost option, which erodes the lead of U.S. models. He suggests that the U.S. needs to respond on two fronts:

First, build stronger access control and authentication, such as a "heavy KYC frontier model" for domestic U.S. use and a tiered access mechanism for allies, to reduce the risk of model distillation and misuse. In parallel, an "AI version of banking-grade authentication" could be introduced (e.g., biometrics + short-lived permission tokens) to raise the bar for model calls, with regulatory measures used to restrict account sharing and the resale of access.
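
To make the "short-lived permission tokens" and tiered-access idea concrete, the following is a minimal Python sketch, not anything Serenity or any vendor has specified. It assumes the identity check (KYC or biometrics) has already happened elsewhere, and every name in it (SECRET_KEY, the tier names "ally" and "domestic_kyc", issue_token, verify_token) is hypothetical. It signs a small claim set with HMAC-SHA256 and rejects tokens that are expired, tampered with, or issued for too low a tier.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical values for illustration only; a real deployment would use a
# managed signing key and its own tier definitions.
SECRET_KEY = b"demo-only-secret"
TIER_RANK = {"ally": 1, "domestic_kyc": 2}


def _sign(payload: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of the encoded claims."""
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def issue_token(user_id, tier, ttl_seconds=300, now=None):
    """Issue a token that expires ttl_seconds after `now`.

    Assumes the caller has already verified the user's identity.
    """
    if tier not in TIER_RANK:
        raise ValueError(f"unknown tier: {tier}")
    now = time.time() if now is None else now
    claims = {"sub": user_id, "tier": tier, "exp": now + ttl_seconds}
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode()
    )
    return payload.decode() + "." + _sign(payload)


def verify_token(token, required_tier, now=None):
    """Return True only if the token is authentic, unexpired, and its tier
    is at least required_tier."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature = token.rsplit(".", 1)
    except ValueError:
        return False  # no "." separator: not a token we issued
    payload = payload_b64.encode()
    # Constant-time comparison to avoid leaking signature bytes via timing.
    if not hmac.compare_digest(_sign(payload).encode(), signature.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if now >= claims["exp"]:
        return False  # token has expired
    return TIER_RANK.get(claims["tier"], 0) >= TIER_RANK[required_tier]
```

A short lifetime limits how long a shared or resold token stays useful, which is the kind of friction the proposal describes. The tier check is a simple stand-in for differentiated access between domestic and allied users.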

Second, improve the cost efficiency of inference models so that they beat competitors such as DeepSeek on both price and performance.

Serenity also said that some high-end models are currently being frequently "exploited through distillation," and that, ideally, more friction should be added to accessing models that approach AGI-level capability. In summary, the core challenge for the U.S. AI industry is to deliver low-cost inference while also establishing model access security comparable to that of the financial system.