Serenity: US AI Companies Lower Inference Costs to Counter the "Model Distillation" Challenge

2026-06-26 13:14

Odaily reported that "White-Haired Stock Guru" Serenity said that while some observations in the UBS report are anecdotally true, the more noteworthy trend is the growing number of Chinese-language reports about the distillation of Anthropic's models. Many US startups and tech companies are now choosing cheaper Chinese models (such as DeepSeek) for their AI applications, because the per-task cost of those models is significantly lower than that of inference models from Google (Gemini), OpenAI, and Anthropic.

Serenity believes this trend, driven by capitalism, creates a "typical paradox": companies naturally gravitate toward lower-cost solutions, which erodes the lead of US models. He proposes that the US address this on two fronts:

First, build stronger access control and authentication systems, such as "heavy KYC frontier models" for domestic US use and tiered access mechanisms for allies, to reduce the risk of model distillation and misuse. This could be paired with an identity verification system akin to "AI-grade banking authentication" (e.g., biometrics + short-lived permission tokens) to raise the barrier for model calls, along with regulatory measures that restrict account sharing and access resale.
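
As a rough illustration of the "short-lived permission tokens" and tiered access idea, here is a minimal Python sketch. It is not a description of any vendor's actual system: the signing key, the tier names ("ally", "domestic"), and the functions issue_token and verify_token are all hypothetical, and the sketch assumes that identity verification (KYC, biometrics) has already happened before a token is issued.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key and tier ranking, for illustration only.
SECRET_KEY = b"demo-secret-not-for-production"
TIER_RANK = {"ally": 1, "domestic": 2}


def issue_token(subject, tier, ttl_seconds=300, now=None):
    """Issue a signed token that expires after ttl_seconds.

    Assumes the caller has already verified the subject's identity.
    """
    now = time.time() if now is None else now
    claims = {"sub": subject, "tier": tier, "exp": now + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + sig


def verify_token(token, required_tier, now=None):
    """Return True only if the signature is valid, the token has not
    expired, and its tier is at least required_tier."""
    now = time.time() if now is None else now
    try:
        body, sig = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking signature bytes via timing.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body.encode()))
    if now >= claims["exp"]:
        return False
    return TIER_RANK.get(claims["tier"], 0) >= TIER_RANK[required_tier]
```

A short lifetime means a leaked or resold token stops working within minutes, and the tier check lets the same gateway serve domestic and allied users at different access levels.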

Second, improve the cost efficiency of inference models so that they beat competitors such as DeepSeek on both price and performance.

Serenity also noted that some high-end models are frequently targeted for "distillation exploitation." Ideally, access to models nearing the AGI level should carry added friction. In summary, the core challenge for the US AI industry is to achieve "low-cost inference capabilities" while also establishing model access security mechanisms comparable to those in the financial system.