Meta sells computing power, Palantir complains, Zhipu becomes Silicon Valley's top star – the AI Capex story needs a new narrative

区块律动BlockBeats

特邀专栏作者

2026-07-02 13:00

本文約6101字，閱讀全文需要約9分鐘

AI Capex was the sole driver of the recent market rally.

AI總結

展開

Core Thesis: The capital expenditure narrative in the AI industry is shifting from an aggregate "supply shortage" to a structural "utilization mismatch." Companies are beginning to evaluate the actual returns on computing investment, and Meta's consideration of leasing out idle computing power is a signal of this shift.
Key Elements:
1. Meta's potential move to lease out AI computing power has sparked concerns about the sustainability of capital expenditure. However, the revenues of the three major cloud providers (AWS, Google Cloud, Azure) continue to grow strongly, and their cost guidance is rising rather than falling, indicating that aggregate demand has not collapsed.
2. The computing power market is showing signs of stratification: Top-tier cloud providers can continue to raise prices due to certainty (e.g., AWS's reserved GPU service price increase of 20%), while model companies and middle-layer entities face scrutiny over computing power utilization. Companies are starting to break down procurement based on the value of specific tasks.
3. Enterprise AI adoption has entered a "cost-accounting" phase. A UBS survey shows that approximately 60% of respondent companies are tightening their token spending. Palantir's CEO criticized the token billing model as a "wealth tax," emphasizing a shift from consumption-based to outcome-based pricing.
4. Open-source models (such as Zhipu's GLM-5.2) have become tools for enterprises to reduce AI spending. Coinbase's case shows that by switching its default model and optimizing usage, its AI expenditure was halved even as token usage continued to grow exponentially.
5. A computing power mismatch is becoming a key issue: On one hand, Meta is considering selling its computing power; on the other hand, it struggles to purchase sufficient top-tier model capabilities from Google, highlighting the structural contradiction between assets and demand.

The AI market is once again undergoing a violent correction, this time because Meta has mentioned that it might sell its surplus AI computing power.

If this news had come out three years ago, probably no one would have found it strange. Cloud computing has always been a business of slicing up servers and selling them to others. Amazon, Microsoft, and Google have been doing this for years. New cloud providers like CoreWeave and Nebius have also been on this path, turning NVIDIA chips into financing collateral, and then using that financing to acquire more chips.

But when it comes to Meta, things suddenly feel different.

Meta hasn't historically understood computing power this way. It buys chips, builds data centers, secures electricity and land – all for its own models, its advertising system, its recommendation feeds, and for Zuckerberg's ever-closer vision of superintelligence. It is not a cloud vendor. It didn't originally make money by renting out its machines to others.

A company that once said, "I need as many machines as possible because the future will consume them," is now saying, "If these machines are temporarily idle, we can sell access to them."

This doesn't directly prove an oversupply of computing power, but it also can't be dismissed lightly.

On the day the stock market plunged, Palantir CEO Alex Karp appeared on CNBC, ranting at the camera for nearly twenty minutes.

He was originally there to discuss Palantir's new partnership with NVIDIA, but the conversation quickly turned to OpenAI and Anthropic's token-based pricing models. He said CEOs have privately complained to him that current enterprise AI adoption is about "paying for tokens that create no value, while also handing over their data." He even referred to the increasingly expensive model bills as a "wealth tax" imposed on enterprises.

For the past two years, the discussion has been about who dares to spend, who spends fastest, and who can pile up data centers first. Now, the questions are slowly changing. Once the machines are purchased, who can keep them running at full capacity?

Meta's proposition hasn't yet materialized into a formal business. In public reports, it has an internal direction called Meta Compute, which might involve selling raw computing power, or, like Amazon Bedrock, offering different models on its infrastructure to developers. Zuckerberg mentioned at a previous shareholder meeting that external companies almost weekly ask if they can buy their API services or purchase some compute, often willing to pay above Meta's own cost.

He added a qualifier at the time. They haven't done this yet because Meta feels it still needs this computing power for itself.

If they need it, renting it out is a choice. If they don't need it, renting it out becomes a painkiller for the balance sheet.

This is where the judgment becomes difficult. Meta might simply be creating a window in its construction timeline to sell off temporarily vacant resources. Or, it could be telling investors that its hundred-billion-dollar AI spending cannot be sustained solely by a distant superintelligence narrative; it must find a nearer revenue stream.

Both interpretations are plausible.

Demand hasn't disappeared; it's just becoming selective

Capex is the core of the AI narrative, bar none. Much like the monetary easing of 2021, the expectation for Capex is continuous growth. As long as the liquidity keeps flowing, all the branches fueled by market speculation will rise together. So when news broke that Meta is preparing to sell computing power, many people's first reaction was that AI capex is about to collapse. The big companies finally admitted they bought too much, and the semiconductor party is over.

That's too simplistic a view.

Public data doesn't yet support such a definitive conclusion. AWS revenue grew 28% in Q1, reaching $37.6 billion, a rare fast growth rate in recent years. Google Cloud grew even faster, hitting $20 billion in Q1 revenue. Microsoft Azure continues to run at around 40% growth.

Amazon is still saying its annual capital expenditure could reach $200 billion. Alphabet raised its 2026 capital expenditure guidance to between $180 billion and $190 billion. Meta itself raised its annual capex to between $125 billion and $145 billion.

These numbers don't paint a picture of collapsing demand.

They look more like a divergence.

The situation for cloud providers is different from that of model providers. Cloud vendors sell the roads. As long as there's traffic on the road, it doesn't matter who built the vehicles; they can collect tolls. OpenAI, Anthropic, enterprise clients, government clients, startups – ultimately, they all have to land on some data center, some chip, some network, and some electricity contract.

That's why the big three clouds can continue to be strong.

AWS even raised the price of an AI cloud service at the end of June. This service, which allows customers to reserve GPUs in advance, saw its price increase by about 20% starting in July. It had already made a similar adjustment of about 15% in January. This is not an action typically seen during periods of weak demand.

When resources are scarce, sellers raise prices.

But model companies may not all find themselves as comfortable.

The assets of model companies are more demanding. Computing power doesn't generate revenue just by sitting there. It needs to be constantly filled by smarter models, higher-frequency users, and more expensive enterprise workflows. Only when the model is good enough will users tolerate queues, quotas, price increases, and increasingly complex subscription tiers.

This is also why Anthropic is viewed by the market as a different kind of company. It's not because it's cheap, but because users are willing to entrust it with expensive tasks. Writing code, modifying systems, running long tasks, integrating with enterprise workflows – once these tasks truly enter a production environment, they consume far more tokens than casual chat.

The trouble for a strong model is not having enough machines.

The trouble for a weak model is that no one cares about its machines.

Both troubles are about computing power, but they are not the same thing.

The xAI thread has a similar flavor. Grok hasn't established a clear enterprise mindshare like the strongest models, yet some computing power within Musk's ecosystem is flowing to Anthropic. This action is more sobering than any slogan. Machines don't care about founders; they only care about who can keep them fully utilized.

The relationship between Google and Meta also suggests the matter isn't simple. In June, news emerged that Google had restricted Meta's use of Gemini because the computing power Meta wanted to buy exceeded what Google could provide, even affecting some of Meta's internal AI projects. A company is simultaneously considering selling computing power while struggling to buy enough top-tier model capability for certain tasks.

This isn't a traditional case of oversupply.

This is a mismatch. Because the bills are becoming glaring.

Cloud providers can keep raising prices because they sell certainty. Clients want guaranteed GPU access for a specific period, a stable data center, and infrastructure that won't crash in the middle of the night.

But once an enterprise client gets the computing power, the problem isn't over.

They still have to present the bill to the CFO. The CFO won't ask how many tokens you used. He will ask how much money those tokens saved the company, how much extra revenue they generated, and how many mistakes they prevented.

For enterprises, tokens become an electricity meter

This brings us back to Karp's interview at the beginning.

He characterized what many AI companies sell to enterprises as overselling. The day before the interview, Palantir posted a nine-point statement on X about so-called "AI sovereignty," specifically targeting the "tokenmaxxing" model. This term is hard to translate literally, but the meaning isn't complex: equating token consumption with progress, money burning with usage, and bills with productivity.

Karp put frontier labs like OpenAI and Anthropic on the spot. His point wasn't that enterprises shouldn't use the best models. It was that enterprises shouldn't hand over their data, processes, and business judgments, only to pay an ever-increasing bill based on consumption.

Palantir wants to sell something different. Not a universal chat box, not a single API, but an integration of data, approvals, permissions, operational rules, and AI into a single business system. What the client pays for isn't "how many times AI was used," but whether a specific production line, a risk control process, or a government task has been genuinely transformed.

The people who actually manage the money in enterprises are waking up.

UBS recently spoke with enterprise IT executives, and a clear direction emerged. Many enterprises aren't abandoning AI; they are putting the brakes on AI spending. About 60% of surveyed companies are curbing token expenses and adding usage guardrails, especially those that have moved past the trial phase and started integrating AI into daily processes.

This is also a fascinating reversal.

Once AI transforms from a toy into a tool, spending money actually becomes harder. In the toy stage, bosses were willing to give budgets because everyone feared missing out. In the tool stage, the CFO asks who it saved labor hours for, who it sold more products for, and who it reduced risk for.

On this balance sheet, tokens don't look like revenue.

They look more like an electricity meter.

You could say a fast-spinning meter means the factory is operating. You could also say the meter is spinning too fast while output hasn't increased, indicating there's a problem with the machinery.

AI agents have amplified this problem. A Codex study by OpenAI and several universities contains some startling data. In the first half of 2026, active Codex users grew over fivefold; within OpenAI, output tokens for certain roles exploded – the median monthly output tokens for legal roles were 13 times higher than in November 2025, and for research roles, over 50 times higher.

Another study puts it more starkly. Agentic coding tasks can consume up to 1000 times more tokens than regular code chat or code reasoning. Token consumption for the same task can vary by 30 times between different runs.

This is the underlying reason for today's compute crunch.

It's not that everyone is asking a few more questions to a chatbot.

It's that software is becoming a group of tiny workers that repeatedly read files, run commands, modify code, fail, restart, fail again, and restart again. They don't take lunch breaks, but they consume tokens at every step.

When tokens become an electricity meter, whoever owns the power plant has the power. But whoever wastes the electricity will be the first to be questioned.

As bills get thicker, cheaper models find their place

Once the CFO starts watching that electricity meter, the next step is almost instinctive.

He will ask: which tasks truly require the strongest model, and which tasks only need a model that is good enough?

At this point, open-source models like GLM, Kimi, DeepSeek, and Qwen are no longer just tech news. They become bargaining tools on the corporate procurement table.

Even Marc Andreessen of the top-tier Silicon Valley VC a16z has said that many AI practitioners now regard Zhipu GLM-5.2 as one of the first Chinese models that can match or even surpass leading US public models in most tasks. This assessment may not be the final verdict, but it gives enterprises another option.

Coinbase provides a more concrete example. Brian Armstrong stated that by switching its default AI model to open-source models like GLM 5.2 and Kimi 2.7, combined with model routing, caching, and context streamlining, token usage continues to grow exponentially, but AI spending has been cut by nearly half.

The real impact of this statement is that enterprises can now procure model capabilities separately for the first time.

Stick with the most expensive model for the hardest tasks. Use cheaper, locally deployable models for standard summaries, customer service, information extraction, templated code, and internal knowledge base Q&A.

Open-source models don't necessarily need to win the entire battlefield.

They just need to convince the procurement department that not every kilowatt-hour of electricity needs to be charged at a premium rate.

Here, Meta selling computing power ceases to be an isolated news story.

It tells the same story as Palantir criticizing token models and Coinbase cutting costs with open-source models: the AI spending chain is being dismantled. The upstream sells certainty, the midstream sells results, and the downstream squeezes unit prices. Every layer is still growing, but every layer is now being asked: is the money well spent?

The hardest part isn't buying machines; it's keeping them busy

For the past two years, the easiest story to tell in the AI industry was about resource scarcity.

Not enough GPUs, not enough electricity, not enough data centers, not enough engineers, not enough cloud capacity to run the models. This story was too easy. When something is scarce, everyone instinctively charges forward. Secure the position first, sign the power contract, buy the chips, set up the machines.

When scrambling for resources, people don't tend to calculate the fine details.

Because the cost of being one step behind seems much greater.

But Meta's news has pushed another issue to the forefront. After the machines are bought, they don't automatically become a good business just because they are expensive. They need work every day. They need customers willing to pay. They need models to keep them running. They need applications to convert the cost into revenue.

This is utilization.

Utilization sounds like a cold metric, but it's actually quite brutal. It doesn't ask if you have a future; it asks if your machine worked today. It doesn't care what you said at the press conference or if you bought the most expensive GPU. It only cares about one thing: has this money turned into a continuous cash flow?

Cloud providers have a relatively easier time answering this question. They've always sold infrastructure. AWS, Google Cloud, and Azure sell the roads, the electricity, and the data center space. Whether customers want to train models, run inference, or host applications, they ultimately need to land on some cloud.

So they can still be strong.

Strong model companies also have their answer. If the model is powerful enough, users will queue up, enterprises will integrate it, and developers will rewire their workflows around it. In this case, computing power isn't inventory; it's a bottleneck. The more machines they have, the more they can scale.

The hardest position is the middle layer.

They have the machines, the narrative, the model team, and a big budget. But their model isn't at the forefront, their product hasn't become a daily habit, and developers are unwilling to change workflows for it. For this type of company, computing power can turn from a weapon into inventory with just one failed model release or one user migration.

Inventory isn't necessarily useless.

But inventory must be discounted, rented out, or find a new purpose.

This is what makes Meta selling computing power so striking. It doesn't prove Meta's failure, nor does it mean AI demand has disappeared. It just allows the market to see, for the first time, that AI infrastructure can also encounter the same problems as an ordinary factory.

The factory is built. Where are the orders?

Computing power hasn't vanished; it's beginning to stratify

So, the best way to understand this isn't "oversupply of computing power."

That term is too crude.

A more accurate description is that computing power is beginning to stratify.

The top layer remains tight. The strongest models, the best clouds, and the most stable GPU clusters are still being fought over. AWS can raise prices because certainty itself has a price. Clients aren't just buying a GPU; they are buying the guarantee that a specific machine will be available on a specific day, at a specific hour.

The middle layer is starting to feel awkward. It might not be bad, but it's not scarce. It can run models, perform inference, and be sold to external clients. But clients will compare, negotiate, and ask why they shouldn't use a cheaper model, someone else's cloud, or why this batch of machines is worth this particular price.

The bottom layer will be squeezed by open-source models and cost optimization. Enterprises won't always call on the most expensive model for ordinary tasks. They will implement routing, caching, compress contexts, and tier their models into different price points.

Demand has grown up.

Children don't look at the bill when spending money; adults do. As AI enters the enterprise, it will go through this process too. During the pilot phase, everyone fears missing out. During the scaling phase, everyone starts doing the math.

Once the math is done, the industry chain will no longer be as uniform as it was in the early days.

Some will continue to raise prices because they sell irreplaceable certainty. Some will switch to selling results because clients don't want to pay for consumption itself. Some will be forced to cut prices because a "good enough" alternative has appeared. Some will rent out their machines because an idle machine looks worse on the books than a low-priced rental.

When these things happen simultaneously, the industry will appear contradictory.

On one hand, computing power is in short supply.

On the other hand, computing power is being rented out.

On one hand, token consumption is exploding.

On the other hand, enterprises are reducing AI spending.

歡迎加入Odaily官方社群