Meta sells computing power, Palantir complains, Zhipu becomes a Silicon Valley darling. The AI Capex narrative needs a new angle.

BlockBeats

Guest Columnist

2026-07-02 13:00

AI Capex was previously the sole driver of market gains.

AI Summary

Core Thesis: The capital expenditure narrative in the AI industry is shifting from an aggregate "supply shortage" to a structural "utilization mismatch." Companies are beginning to evaluate the actual returns on computing power investments, and Meta's consideration of leasing out idle computing power is a signal of this shift.
Key Elements:
1. Meta's potential leasing of AI computing power has sparked market concerns about the sustainability of capital expenditures. However, the revenue of the three major cloud providers (AWS, Google Cloud, Azure) continues to grow strongly, and capital expenditure guidance has risen rather than fallen, indicating that aggregate demand has not collapsed.
2. The computing power market is experiencing stratification: leading cloud providers can continue to raise prices due to certainty (e.g., AWS GPU reservation services saw a 20% price increase), while model companies and middle-layer entities face greater scrutiny over computing power utilization. Companies are beginning to disaggregate procurement based on task value.
3. Enterprise AI adoption has entered an "ROI calculation" phase. A UBS survey shows approximately 60% of surveyed enterprises are cutting token spending. Palantir's CEO criticized the token billing model as a "wealth tax," emphasizing a shift from consumption-based to outcome-based pricing.
4. Open-source models (e.g., Zhipu GLM-5.2) are becoming a tool for enterprises to reduce AI spending. A Coinbase case study shows that by switching the default model and optimizing usage, the company cut its AI spending by nearly half while token usage continued to grow exponentially.
5. Computing power mismatch is a key issue: On one hand, Meta is considering selling computing power; on the other hand, it struggles to purchase sufficient top-tier model capabilities from Google, highlighting a structural contradiction between assets and demand.

The AI market is experiencing another violent correction, this time because Meta mentioned it might sell its excess AI computing power.

If this news had come out three years ago, probably no one would have found it strange. Cloud computing has always been a business of chopping up servers and selling them to others. Amazon, Microsoft, and Google have been doing this for years. New cloud providers like CoreWeave and Nebius also follow this path, turning Nvidia chips into financing collateral, and then turning that financing into even more chips.

But when it's Meta's turn, things feel different.

Meta didn't understand computing power this way in the past. It bought chips, built data centers, secured electricity and land – all for its own models, its advertising system, its recommendation feed, and for Zuckerberg's ever-closer vision of superintelligence. It wasn't a cloud vendor. It didn't originally make money by renting out its machines to others.

A company that once said, "I need as many machines as possible because the future will consume them," now says, "If these machines are temporarily idle, I can sell access to them."

This doesn't directly indicate an oversupply of computing power. But it cannot be dismissed lightly either.

On the day of the stock market crash, Palantir CEO Alex Karp spent nearly twenty minutes on CNBC, venting on camera.

He was initially there to discuss a new partnership between Palantir and Nvidia, but the conversation quickly turned to the token-based pricing model of OpenAI and Anthropic. He said CEOs privately complain to him that current enterprise AI adoption involves "paying for tokens that create no value, while also handing over your data." He even referred to the increasingly expensive model bills as a "wealth tax" imposed on enterprises.

For the past two years, the discussion has been about who dares to spend, who spends fast, and who can stack up data centers first. Now the question is slowly changing. After the machines are bought, who can keep them running at full capacity?

Meta's statement hasn't materialized into an official business yet. According to public reports, it has an internal direction called Meta Compute. It might sell raw computing power, or it might follow a model similar to Amazon Bedrock, placing different models on its infrastructure to sell to developers. Zuckerberg previously mentioned at a shareholder meeting that external companies ask almost weekly whether they can buy Meta's API services or a portion of its compute, and that they are often willing to pay more than Meta's cost.

He added a caveat back then. They haven't done this yet because Meta believes it still needs that computing power.

If they need it, renting it out is a choice. If they don't, renting it out is a painkiller for the balance sheet.

This is where it's hardest to judge. Meta might simply be using a window in its construction cycle to sell temporarily idle resources. Or it might be signaling to investors that AI expenditures on the scale of a hundred billion dollars cannot be sustained solely by the distant promise of superintelligence; a nearer revenue stream must be found first.

Both interpretations are plausible.

Demand Hasn't Disappeared, It's Just Starting to Pick Winners

Capex is the core of the AI narrative, bar none. Just like the loose monetary policy of 2021, the expectation for Capex is continuous growth: as long as the liquidity keeps flowing, all the sub-sectors the market speculates on will rise together. Upon hearing that Meta is preparing to sell computing power, many people's first reaction is that AI capex is about to collapse. Big companies have finally admitted they bought too much, the thinking goes, so the semiconductor party must be over.

That reading is too simplistic.

Public data doesn't support such a clear-cut conclusion yet. AWS revenue grew 28% in Q1, reaching $37.6 billion, a growth rate rarely seen in recent years. Google Cloud grew even faster in Q1, reaching $20 billion in revenue. Microsoft Azure is also still growing at around 40%.

Amazon is still saying its capital expenditure might reach $200 billion this year. Alphabet raised its 2026 capital expenditure guidance to $180-190 billion. Meta itself raised its full-year capital expenditure to $125-145 billion.

These numbers don't look like a collapse in demand.

They look more like demand being rerouted.

The situation for cloud vendors differs from that for model companies. Cloud vendors sell roads. As long as there is traffic on the road, they can collect tolls, regardless of who built the vehicles. OpenAI, Anthropic, enterprise clients, government clients, startups – they all ultimately need to land on some data center, some chip, some network, and some electricity contract.

So the Big Three clouds can continue to be strong.

AWS even moved in late June to raise the price of an AI cloud service that lets clients reserve GPUs in advance, by about 20% starting in July. It had already raised the price by about 15% in January. This is not how a seller behaves when demand is weak.
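As a rough check on what the two increases add up to, the sketch below compounds them. It assumes both increases apply to the same reservation price and that the July increase is applied on top of the January one. The article gives only approximate percentages, so the result is a ballpark figure, not a quoted price.

```python
def cumulative_increase(increases):
    """Compound a sequence of fractional price increases into one overall increase."""
    factor = 1.0
    for pct in increases:
        factor *= 1.0 + pct
    return factor - 1.0


# Approximate figures from the text: about 15% in January, about 20% in July.
# Assumption: the second increase is applied to the already-raised price.
total = cumulative_increase([0.15, 0.20])
print(f"Cumulative increase: {total:.0%}")  # prints "Cumulative increase: 38%"
```

Under these assumptions, a reservation that cost 100 at the start of the year would cost roughly 138 after July.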

When supply is scarce, sellers raise prices.

But model companies might not all be so comfortable.

Model companies have more demanding assets. Computing power doesn't just generate revenue by sitting there. It needs to be continuously filled by smarter models, higher-frequency users, and more expensive enterprise workflows. Only when a model is good enough will users tolerate queues, limits, price increases, and increasingly complex subscription tiers.

This is also why Anthropic is seen by the market as a different kind of company. It’s not because it's cheap, but because users are willing to entrust it with expensive tasks. Writing code, modifying systems, running long-duration tasks, integrating enterprise workflows – once these tasks truly enter a production environment, they consume far more tokens than casual conversation.

The trouble for strong models is not having enough machines.

The trouble for weak models is that no one cares if their machines are idle.

Both troubles are called computing power, but they are not the same thing.

xAI's story carries a similar scent. Grok hasn't built the clear enterprise mindshare that the strongest models have, yet some computing power within Musk's system can flow to Anthropic. This move is more sobering than any slogan. Machines don't recognize founders; they only recognize who can keep them running at full capacity.

The relationship between Google and Meta also shows things aren't so simple. In June, news emerged that Google restricted Meta's usage of Gemini, because the computing power Meta wanted to buy exceeded what Google could provide, even affecting some of Meta's internal AI projects. The same company is considering selling computing power while being unable to buy enough top-tier model capability for certain tasks.

This is not a traditional oversupply.

This is a mismatch. The bills are starting to become glaring.

Cloud vendors can keep raising prices because they sell certainty. Clients want guaranteed access to GPUs within a specific timeframe, a stable data center, and infrastructure that won't crash in the middle of the night.

But once enterprise clients get the computing power, the problems aren't over.

They still have to take that bill to the CFO. The CFO won't ask how many tokens you used; he'll ask how much money these tokens saved the company, how much extra revenue they generated, and how many mistakes they prevented.

For Enterprises, Tokens Become an Electricity Meter

This brings us back to Karp's interview at the beginning.

He described what many AI companies sell to enterprises as overselling. The day before the show, Palantir posted a nine-point statement on X about so-called 'AI sovereignty,' specifically targeting practices such as 'tokenmaxxing.' The term is slang, but the meaning isn't complex: it means treating token consumption as progress, burning money as usage, and the bill as productivity.

Karp put frontier labs like OpenAI and Anthropic on the table. His point isn't that enterprises shouldn't use the strongest models. It's that enterprises shouldn't hand over their data, processes, and business judgment, only to pay an ever-increasing bill based on consumption.

Palantir wants to sell something different. Not a universal chat box, not a single API, but integrating data, approvals, permissions, operational rules, and AI into the same business system. Clients pay not for "how many times AI was used," but for whether a specific production line, risk control process, or government task has truly been transformed.

The people who manage money in enterprises are starting to wake up.

UBS recently spoke with enterprise IT executives, and one direction is clear. Many enterprises aren't stopping AI usage; they're putting brakes on AI spending. Around 60% of surveyed companies are cutting token expenditure and adding usage guardrails, especially those that have passed the trial phase and are integrating AI into daily workflows.

This is also a very interesting reversal.

Once AI turns from a toy into a tool, spending money becomes harder. In the toy phase, bosses were willing to allocate budget because everyone feared missing out. In the tool phase, the CFO asks: Whose labor hours did it save? Whose sales did it increase? Whose risk did it reduce?

On this balance sheet, tokens don't look like revenue.

They look more like an electricity meter.

You could say a fast-spinning meter means the factory is running. You could also say the meter is spinning too fast while output isn't increasing, indicating a problem with the machine.

AI agents amplify this issue. A Codex study by OpenAI and several universities contains some startling data. In the first half of 2026, the number of active Codex users grew more than fivefold, and output tokens in some internal OpenAI roles surged: median monthly output tokens for legal roles were 13 times higher than in November 2025, and for research roles, over 50 times higher.

Another study complicates the picture further. Agentic coding tasks can consume up to 1000 times more tokens than standard code chat and code reasoning. For the same task, token consumption across different runs can vary by a factor of 30.

This is the real underlying reason for the current compute crunch.

It's not that people are asking chatbots a few more questions.

It's that software is starting to become a group of miniature workers who repeatedly read files, run commands, modify code, fail, retry, fail again, and retry again. They don't take lunch breaks, but they consume tokens at every step.
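The "usage guardrails" that enterprises are reportedly adding can take many forms. One simple form is a hard token budget per task, so a fail-and-retry loop stops once it has spent its allowance. The sketch below is a minimal illustration under that assumption. The class names, the budget, and the per-attempt costs are all hypothetical, and no real model API is called.

```python
class TokenBudgetExceeded(Exception):
    """Raised when a charge would push usage past the task's budget."""


class TokenMeter:
    """Tracks token usage for one task against a fixed budget (hypothetical guardrail)."""

    def __init__(self, budget_tokens: int):
        self.budget_tokens = budget_tokens
        self.used_tokens = 0

    def charge(self, tokens: int) -> None:
        if self.used_tokens + tokens > self.budget_tokens:
            raise TokenBudgetExceeded(
                f"{self.used_tokens} used, {tokens} requested, budget {self.budget_tokens}"
            )
        self.used_tokens += tokens


def run_attempts(attempt_costs, meter: TokenMeter) -> int:
    """Simulate an agent's retry loop; return how many attempts fit within the budget."""
    completed = 0
    for cost in attempt_costs:
        try:
            meter.charge(cost)
        except TokenBudgetExceeded:
            break  # stop retrying instead of silently running up the bill
        completed += 1
    return completed


# Hypothetical task: three attempts of 4,000 tokens each against a 10,000-token budget.
meter = TokenMeter(budget_tokens=10_000)
attempts_done = run_attempts([4_000, 4_000, 4_000], meter)
print(attempts_done, meter.used_tokens)  # prints "2 8000"
```

A cap like this does not make the agent smarter. It only turns a meter that can spin indefinitely into a bounded line item that someone can review.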

When tokens become the electricity meter, whoever owns the power plant has the power. But whoever wastes the electricity will be the first to face scrutiny.

As Bills Thicken, Cheaper Models Find Their Place

Once the CFO starts looking at this meter, the next step almost doesn't need instruction.

He will ask: Which tasks require the strongest model, and which tasks only need a model that is 'good enough'?

At this point, open-source models like GLM, Kimi, DeepSeek, and Qwen are no longer just tech news. They become bargaining tools on the enterprise procurement desk.

Even a16z's Marc Andreessen, a top Silicon Valley venture capitalist, said many AI practitioners already see Zhipu's GLM-5.2 as one of the first Chinese models capable of matching, even surpassing, leading US open models across most tasks. This judgment might not be the final verdict, but it gives enterprises another option.

Coinbase provides a more concrete example. Brian Armstrong said the company switched its default AI model to open-source models like GLM-5.2 and Kimi 2.7, combined with model routing, caching, and streamlined context. Token usage is still growing exponentially, but AI spending was cut by nearly half.

The damaging implication of this statement is that enterprises can now procure model capabilities piecemeal.

The hardest tasks continue to be assigned to the most expensive models. Ordinary tasks like summarization, customer service, information extraction, templated code, and internal knowledge base Q&A go to cheaper models and local deployments.
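The split described above can be sketched as a small router with a cache. This is an illustration of the general idea only, not Coinbase's implementation: the tier names, the prices, and the placeholder response are all hypothetical, and a real system would call an actual model where the comment indicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_million_tokens: float  # hypothetical price, for illustration only


FRONTIER = ModelTier("frontier-model", 15.0)       # hypothetical expensive tier
OPEN_WEIGHT = ModelTier("open-weight-model", 1.0)  # hypothetical cheap tier

# The "ordinary" task categories named in the text.
ROUTINE_TASKS = {
    "summarization",
    "customer_service",
    "information_extraction",
    "templated_code",
    "knowledge_base_qa",
}


def route(task_type: str) -> ModelTier:
    """Send routine work to the cheap tier and everything else to the frontier tier."""
    return OPEN_WEIGHT if task_type in ROUTINE_TASKS else FRONTIER


class CachedRouter:
    """Routes requests by task type and reuses answers for repeated prompts."""

    def __init__(self):
        self._cache = {}
        self.spend_usd = 0.0

    def run(self, task_type: str, prompt: str, tokens: int) -> str:
        key = (task_type, prompt)
        if key in self._cache:
            return self._cache[key]  # cache hit: no new tokens are billed
        tier = route(task_type)
        self.spend_usd += tokens / 1_000_000 * tier.usd_per_million_tokens
        result = f"[{tier.name}] response"  # placeholder for a real model call
        self._cache[key] = result
        return result


router = CachedRouter()
router.run("summarization", "Summarize the Q1 report", 2_000_000)
router.run("summarization", "Summarize the Q1 report", 2_000_000)  # served from cache
router.run("agentic_coding", "Refactor the billing service", 2_000_000)
print(router.spend_usd)  # prints 32.0
```

With these made-up prices, sending all three requests to the frontier tier without caching would cost 90, against 32 here. The point is the mechanism, not the specific ratio.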

Open-source models don't necessarily need to win the entire battlefield.

They just need to convince the procurement department that not every unit of electricity has to be bought at luxury prices.

At this point, Meta selling computing power is no longer an isolated news item.

It tells the same story as Palantir criticizing tokens and Coinbase embracing open-source models: the AI spending chain is starting to be disaggregated. Upstream sells certainty, midstream sells results, and downstream pressures unit prices. Every layer is still growing, but every layer is starting to be asked if the money is well spent.

The Hardest Part Isn't Buying Machines, It's Keeping Them Working

For the past two years, the easiest story for the AI industry to tell was that resources were insufficient.

Not enough GPUs, not enough electricity, not enough data centers, not enough engineers, not enough cloud capacity to run models. This story was too convenient. When things are scarce, everyone instinctively rushes forward. Secure your position first, sign the electricity contract, buy the chips, get the machines racked.

During the resource grab, people don't tend to calculate carefully.

Because the cost of being one step behind seems much greater.

But Meta's news pushed another issue to the forefront. After the machines are bought, they don't automatically become a good business just because they are expensive. They need work every day. They need customers willing to pay. They need models to run them at full capacity. They need applications to convert costs into revenue.

This is utilization.

The term 'utilization' sounds cold, but it's actually quite brutal. It doesn't ask if you have a future; it asks if your machine worked today. It doesn't care what you said at the press conference or if you bought the most expensive GPU. It only looks at one thing: Did this money turn into a continuous stream of cash flow?
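One way to see why utilization is so unforgiving is to spread a machine's cost over only the hours it does paid work. The sketch below does that arithmetic. The hourly cost is a hypothetical figure chosen for illustration, not a number from any company's accounts.

```python
def cost_per_productive_hour(hourly_cost_usd: float, utilization: float) -> float:
    """Hourly cost of owning an accelerator, divided by the share of time it does paid work."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the range (0, 1]")
    return hourly_cost_usd / utilization


HOURLY_COST_USD = 2.0  # hypothetical all-in cost (depreciation, power, facilities)

for utilization in (1.0, 0.5, 0.25):
    cost = cost_per_productive_hour(HOURLY_COST_USD, utilization)
    print(f"utilization {utilization:.0%}: ${cost:.2f} per productive hour")
```

Under this simple model, halving utilization doubles the effective cost of every productive hour, which is why an idle cluster pushes its owner toward renting it out.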

Cloud vendors have a relatively easier time answering this question. They sell infrastructure by nature. AWS, Google Cloud, Azure sell roads, electricity, and data center space. Clients who need to train models, run inference, or host applications ultimately have to land on some cloud.

So they can remain strong.

Strong model companies also have their own answer. If the model is powerful enough, users will queue, enterprises will integrate, and developers will build workflows around it. Then computing power isn't inventory; it's a bottleneck. The more machines they have, the more they can scale.

The hardest part is the middle layer.

They have machines, a story, model teams, and large budgets. But their model isn't at the forefront. Their product hasn't become a daily habit. Developers aren't willing to rework their workflows for it. For this type of company, computing power can turn from a weapon into inventory with just one failed model release or one user migration.

Inventory isn't necessarily useless.

But inventory must be discounted, rented out, or given a new purpose.

That's what's glaring about Meta selling computing power. It doesn't prove Meta failed. It doesn't prove AI demand vanished. It just lets the market see for the first time that AI infrastructure can also face the same problems as an ordinary factory.

The factory is built. Where are the orders?

Computing Power Hasn't Disappeared, It's Just Starting to Stratify

So the best way to understand this isn't "oversupply of computing power."

That term is too crude.

A more accurate description is that computing power is beginning to stratify.

The top layer is still tight. The strongest models, the best clouds, the most stable GPU clusters – people are still fighting for these. AWS can raise prices because certainty itself has a price. Clients aren't just buying GPUs; they are buying guaranteed access to a specific set of machines on a specific day, at a specific hour.

The middle layer is starting to feel awkward. It might not be bad, but it's not scarce enough. It can run models, do inference, and sell to external clients. But clients will compare, bargain, and ask why they shouldn't use a cheaper model or someone else's cloud, and why this batch of machines is worth this specific price.

The bottom layer will be gradually squeezed by open-source models and cost optimization. Enterprises won't always call upon the most expensive model for routine tasks. They will route, cache, compress context, and sort model capabilities into different tiers.

Demand has grown up.

Children spend without looking at the bill; adults check it. As AI enters enterprises, it will undergo the same process. In the pilot phase, everyone fears missing out. In the scaling phase, everyone starts calculating the numbers.

Once the calculations start, the industry chain won't be as uniform as it was in the early days.

Some will continue to raise prices because they sell irreplaceable certainty. Some will switch to selling results because clients don't want to pay for consumption itself. Some will be forced to lower prices because 'good enough' alternatives appear. Some will rent out their machines because idle machines look worse on the books than machines rented out cheaply.

When these things happen simultaneously, the industry looks contradictory.

On one hand, computing power is scarce.

On the other, computing power is being rented out.

On one hand, token consumption is exploding.

On the other, enterprises are cutting AI spending.

On one hand, frontier models are getting stronger.

On the other, open-source models are getting cheaper.
