Computing Power Is Recentralizing: After DeepSeek's Price Cuts, Who Controls AI's Infrastructure?
- Core thesis: Steep price cuts from DeepSeek and similar models are democratizing AI applications, yet they are accelerating the concentration of computing power in a handful of cloud giants (the four major cloud providers' capital expenditures are projected to reach $570.8 billion in 2026). The decentralized compute network Gonka aims to aggregate idle GPUs worldwide through a Proof-of-Work incentive mechanism, offering a structural alternative before centralization of the compute layer is complete.
- Key points:
    - Model price cuts depend on abundant compute, but global computing power is converging toward a few nodes; optical communications leader Lumentum's production capacity is nearly sold out through 2028.
    - The Bitcoin network's total computing power already exceeds the combined cloud data centers of Google, Microsoft, and Amazon, but it is spent entirely on hash puzzles; meanwhile, vast amounts of idle GPUs cannot be used for AI inference for lack of a coordination mechanism.
    - Gonka redirects Proof-of-Work from hash computation to AI inference, so that nearly 100% of the network's compute contributions correspond to real tasks, anchoring token value to physical compute costs.
    - AI inference is projected to account for two-thirds of global compute consumption by 2026; price cuts drive exponential growth in invocations, which in turn reinforces structural lock-in by players with massive compute.
    - Less than a year after mainnet launch, Gonka's aggregated compute has grown from 60 H100s to more than 10,000 H100-equivalents, driven by hundreds of independent nodes joining voluntarily worldwide.
On April 26, DeepSeek released new pricing for its V4 series API: the cost of input cache hits was cut across the board to one-tenth of the launch price, and with the Pro version's limited-time offer, processing one million tokens costs as little as 0.025 yuan, nearly a hundred times cheaper than a year ago. A-share computing power stocks surged to their daily limit that day as market sentiment ran hot.
But behind the cheers, there's a question no one is openly discussing: as models become increasingly cheaper, the computing power required to run them is becoming increasingly centralized.
The data doesn't lie. In Q4 2025, the combined capital expenditures of four major cloud providers—Microsoft, Amazon, Meta, and Google—increased by 64% year-over-year to $118.6 billion; their total capital expenditure for full-year 2026 is projected to grow another 53% year-over-year, reaching $570.8 billion. During the same period, Google raised its 2026 TPU chip shipment target by 50% to 6 million units. The delivery lead time for NVIDIA's H100 series has stretched to several months in some markets.
Pricing power at the model layer is tilting towards developers, but control over the computing power layer is consolidating towards a handful of giants at a much faster pace. This is a subtle yet profound contradiction of the AI era.

It is against this backdrop that, on April 24, 2026, Gonka protocol co-founders Daniil and David Liberman took the keynote stage at LA Hacks 2026, the largest annual university hackathon, held at UCLA, to address hundreds of top engineers about to enter the industry. The question they posed was particularly pointed at this moment: is decentralized computing power still in time?
I. The Other Side of the Price Cuts
The logic behind DeepSeek V4's price cuts appears to be an efficiency dividend from technological progress: a new attention mechanism compresses token dimensions and, combined with DSA sparse attention, significantly reduces compute and memory demands. For the price cuts to be sustainable, however, they rest on one premise: that computing power, somewhere, is sufficiently abundant and cheap.
The reality is that the sources of this 'sufficiently abundant' computing power are rapidly converging towards a few nodes globally. Michael Hurlston, CEO of optical communications leader Lumentum, recently stated that based on current trends, the company's production capacity is nearly sold out through 2028. This isn't just a single company's dilemma; it's a collective strain across the entire AI infrastructure supply chain facing explosively growing demand.
In his LA Hacks speech, Daniil used a simple yet powerful comparison: the total computing power of the Bitcoin network already surpasses that of the combined data centers of Google, Microsoft, and Amazon—but what is all this power doing? It's solving a hash puzzle that no one needs an answer to. The same goes for idle GPU power worldwide: the graphics cards in gamers' computers, servers in university labs, spare capacity from smaller cloud service providers—they add up to a massive scale, yet without a coordination mechanism, they cannot be utilized for AI inference.
This is precisely the coordination problem Gonka aims to solve: using Proof-of-Work incentive mechanisms to organize idle GPUs scattered across the globe into a network capable of handling real AI inference tasks.
II. Inference is the New Battlefield
DeepSeek's price cuts have sparked widespread discussion about 'AI democratization' across the Chinese internet. But there's an overlooked detail: the price cuts are for 'invocation costs', not 'computing power costs'. As AI applications scale, the growth in inference demand is exponential. According to industry forecasts, by 2026, inference will account for roughly two-thirds of global AI computing power consumption.
What does this mean? For every order of magnitude reduction in invocation price, the actual total amount of computing power required only increases, never decreases. In a way, the 'democratization' of large models is accelerating the centralization of the computing power layer—because only players with massive computing power can sustain the operation of inference services at ultra-low profit margins.
This is a structural lock-in taking shape: whoever controls the physical computing power on the inference side controls the true infrastructure gateway of the AI era. From this perspective, the significance of a decentralized computing power network is no longer just a '50% cheaper' cost optimization, but providing a structurally alternative path before centralized lock-in is complete.
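The arithmetic behind the claim above, that cheaper invocations increase rather than decrease total compute demand, can be made concrete with a minimal sketch. All numbers here are hypothetical, chosen only to illustrate the elasticity effect; they are not figures from the article.

```python
# Illustrative arithmetic with hypothetical numbers: a 10x price cut paired
# with elastic demand (usage grows more than 10x) raises total compute demand.
price_old, price_new = 1.00, 0.10   # relative price per million tokens
calls_old = 1_000_000               # baseline monthly invocations (assumed)
usage_multiplier = 30               # assumed demand response to the price cut
calls_new = calls_old * usage_multiplier

gpu_seconds_per_call = 0.5          # compute per call is unchanged by pricing
demand_old = calls_old * gpu_seconds_per_call
demand_new = calls_new * gpu_seconds_per_call

print(demand_new / demand_old)      # 30.0: physical compute demand grew 30x
```

Whenever usage grows by more than the price falls, revenue per unit of compute shrinks while absolute compute demand expands, which is exactly the regime that favors players who already own compute at scale.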
III. A Real Challenge for Young Builders
The participants of LA Hacks—engineers and product people from top California universities—will soon face a not-so-romantic engineering choice: which layer of computing power to build their products upon.
Whose servers is your AI product invoking for its inference queries?
If that platform adjusts its pricing strategy or access policies, do you have the ability to migrate?
Is the user scale you help build creating value for yourself, or is it just providing leverage for the platform?
Developers have experienced these questions before in the Web2 era: when an application's fate is deeply intertwined with a platform's algorithms or distribution rules, 'independence' becomes a word that needs constant redefinition. The computing power dependency in the AI era will replicate this same logic at the infrastructure layer, and because switching costs are higher, the lock-in effect will only be stronger.

As a format, the hackathon is an apt metaphor: building something functional in 36 hours with minimal resources and maximum speed is precisely the state that decentralized network incentive mechanisms try to induce. When Daniil took the stage at LA Hacks, it wasn't just to talk about Gonka; it was more like putting a question to the room: will what you build in the future accelerate this trend toward centralization, or create new possibilities?
IV. PoW 2.0: An Engineering Proposition
Gonka has realigned the incentive structure of Proof-of-Work from hash computation towards AI inference, ensuring that nearly 100% of the network's computing power contributions directly correspond to real tasks. This mechanism has a critical engineering requirement: AI inference tasks must be verifiable and reproducible—given the same model weights, the same random seed, and input, any node can replicate the computation results and verify their validity. This is the core engineering challenge for Gonka in moving from an academic prototype to a functional network.
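The verifiability requirement described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Gonka's actual protocol: `run_inference` is a deterministic stand-in for a real model forward pass, and `commitment` is a hypothetical commitment scheme built on a plain SHA-256 hash.

```python
import hashlib
import random

def run_inference(weights_id: str, seed: int, prompt: str) -> str:
    # Hypothetical stand-in for a model forward pass. Any function that is
    # fully determined by (weights, seed, input) illustrates the point.
    rng = random.Random(f"{weights_id}:{seed}:{prompt}")
    return "".join(rng.choice("abcdef0123456789") for _ in range(32))

def commitment(output: str) -> str:
    # A worker publishes only this digest, committing to its result.
    return hashlib.sha256(output.encode()).hexdigest()

# Worker runs the task and publishes a commitment to its output.
worker_out = run_inference("model-v1", seed=42, prompt="hello")
claim = commitment(worker_out)

# Any validator replays the same (weights, seed, input) and checks the claim.
validator_out = run_inference("model-v1", seed=42, prompt="hello")
assert commitment(validator_out) == claim
```

The design point is that verification costs one replay of the computation rather than trust in the worker: if any of the three inputs differs, the commitment no longer matches, so a node cannot be rewarded for work it did not actually perform.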
From an economic perspective, the significance of this mechanism is that token value is naturally anchored to physical computing power costs, not to market sentiment. Miners contributing computing power receive rewards, developers invoking power pay fees, and the entire system's incentive loop operates without needing the goodwill of any intermediary.
Of course, technical feasibility is only part of the story. The harder question is: In an era where computing power demand is skyrocketing and major players' capital expenditures are in the hundreds of billions of dollars, can a distributed computing power network, organized through voluntary community contributions, truly constitute a competitive force in terms of scale?
Gonka's early data provides a reference point: less than a year after mainnet launch, the network's aggregated power has expanded from the equivalent of 60 H100s to over 10,000 H100s, a pace driven by the voluntary connection of hundreds of independent nodes globally, not centralized orchestration. This doesn't prove the scale problem is solved, but it shows the incentive mechanism has effectively driven early growth.
V. The Question of the Window of Opportunity
Historically, dominance over infrastructure tends to converge rapidly in the early stages—this was true for the railway era, the internet era, and the mobile internet era. Each time, some people found a position to insert themselves before standards were solidified, while others only realized their participation rights had significantly narrowed after centralization was complete.
What stage is AI computing infrastructure currently in? Judging by the four major cloud providers' expected $570.8 billion in capital expenditures for 2026, centralization is accelerating. However, looking at developers' actual usage patterns, there remains a vast amount of supply-side resources that have yet to be effectively integrated. This gap is the structural space where decentralized networks can exist.
In his speech, Daniil referenced a parallel: after the 2000 dot-com bubble burst, what remained wasn't just ruins, but a global network of fiber optics that underpinned the digital economy for the next two decades. After the current investment boom in AI infrastructure subsides, the computing protocols and incentive mechanisms that settle will become the foundation for the next cycle. The question is simply which protocols have a logical foundation robust enough to remain operational under pressure.
This isn't a question about any specific project, but a question the entire decentralized AI track needs to confront: Can governance design truly resist the erosion of single points of control? Will incentive mechanisms remain effective as scale increases? Does the decentralization of a computing power network hold true simultaneously across the dimensions of technical execution, token issuance, and upgrade decision-making?
Conclusion
DeepSeek's price cuts have once again heated up the narrative of 'AI democratization'. But democratized inference invocation and democratized computing power infrastructure are two different things. The former is happening; whether the latter can happen depends on how many people over the next few years genuinely treat it as an engineering problem worth solving, rather than just a good-sounding narrative.
About David and Daniil Liberman
David and Daniil Liberman are the co-founders of Gonka, a decentralized AI computing network. The two previously served as product directors at Snap Inc. and founded the AI development company Product Science Inc., and have years of experience in the AI field.
Gonka is currently the decentralized AI network with the largest number of GPUs, providing permissionless access to computing resources for developers and researchers while rewarding all participants through its native token, GNK. The project successfully raised $18 million in 2023 and an additional $51 million in 2025. Investors include Coatue Management (an investor in OpenAI), Slow Ventures (an investor in Solana), Bitfury, K5, and partners from Insight and Benchmark. Early contributors to the project include 6 blocks, Hard Yaka, Gcore, and other well-known leaders in the Web2-Web3 space.
Website | Github | X | Discord | Telegram | Whitepaper | Tokenomics | User Guide


