From “computational waste” to “useful computation”: How does Transformer-PoW reshape consensus mechanisms?
Original author: Anastasia Matveeva, Gonka.ai
Reaching a consensus through "waste"
Bitcoin has achieved a remarkable feat: it has demonstrated, through large-scale practice, that strangers who do not trust one another can coordinate without relying on banks, governments, or any central authority. For the first time, people could send funds to someone on the other side of the world without anyone's permission. The network cannot be shut down, assets cannot be censored, and transactions genuinely settle.
Bitcoin proposes Proof-of-Work (PoW) as the consensus mechanism among mutually distrustful participants. Its core logic is simple: miners compete to solve a puzzle by finding a nonce that, combined with the block data and fed into the SHA-256 hash function, produces an output meeting a specific condition, typically a hash with a required number of leading zero bits. For example, producing a hash whose first 70 bits are zero takes about 2^70 attempts on average. There are no shortcuts or clever algorithms that avoid trying nonce after nonce; the only way through is continuous computation until a lucky hit.
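To make the brute-force nature of the puzzle concrete, here is a minimal, illustrative Python sketch of a Bitcoin-style mining loop. It is not Bitcoin's actual block-header format or difficulty encoding; it only shows that the miner has no option other than iterating nonces and hashing.

```python
# Illustrative sketch of a Bitcoin-style proof-of-work loop (not the real header format):
# brute-force nonces until SHA-256(block_data || nonce) has enough leading zero bits.
import hashlib

def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a hash digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # Add the zero bits at the top of the first non-zero byte, then stop.
        bits += 8 - byte.bit_length()
        break
    return bits

def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Return the first nonce whose hash meets the difficulty target."""
    nonce = 0
    while True:
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if leading_zero_bits(digest) >= difficulty_bits:
            return nonce
        nonce += 1

# Toy difficulty: 16 leading zero bits takes about 2**16 attempts on average.
print(mine(b"example block header", 16))
```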
The revolutionary significance of this mechanism is that it makes attacking the blockchain vastly more expensive: to alter historical records, an attacker would need to redo all of the computational work. It also aligns incentives: miners' rewards are proportional to the computation they contribute rather than to their existing wealth (although in practice, capital, hardware, and electricity costs still matter). This marked the first time a truly decentralized system was deployed at scale.
However, the cost is that the computation itself has no intrinsic value. The electricity consumed is spent solely on producing hashes with leading zeros and serves no other purpose. Bitcoin, in effect, trades massive computational waste for network security. For over a decade this trade-off has proven "good enough" in practice, making Bitcoin a formidable asset.
The New Era of Decentralization
Currently, artificial intelligence is undergoing rapid transformation. Large Language Models (LLMs) are becoming infrastructure—a service that both enterprises and users rely on. However, at present, the vast majority of LLM inference tasks run on centralized servers controlled by a few companies, which raises a series of pressing issues:
- Single point of control: one company decides which models are available and who gets access;
- Censorship risk: governments or businesses can pressure a centralized provider to censor or restrict its service;
- Vendor lock-in: users and developers have no choice but to rely on the current gatekeepers.
These are precisely the core pain points that Bitcoin originally aimed to solve. This leads to a crucial question: Can we build a decentralized LLM network that solves these problems while avoiding repeating Bitcoin's mistake of "resource waste"?
Existing solutions and their limitations
Proof-of-Stake (PoS) attempts to solve the problem of wasted computing resources by replacing computing power with capital: validators lock up a certain number of tokens as collateral, and the probability of being selected to validate a block is proportional to the size of their stake, while the process consumes very little energy.
However, this mechanism has a core flaw: capital distribution is inherently uneven. Taking networks like Bittensor as an example, validators with substantial capital attract smaller token holders to delegate their stake to them, creating a positive feedback loop of "the rich get richer"—more capital attracts more delegation, generating more rewards, which in turn attracts even more delegation. Over time, voting power concentrates in the hands of the initial wealth holders. Even if a subnet possesses high-performance GPUs and high-quality inference capabilities, its influence will be negligible if its validators hold limited capital.
The end result is that voting rights, which were originally held by those who contributed actual computing power, are now monopolized by capital holders. Therefore, while PoS solves the problem of resource waste, it gives rise to the new problem of wealth concentration.
An alternative
Therefore, the core of the problem becomes: can we guide computational resources toward tasks with real value while preserving the "fairness" of proof-of-work?
Research teams have long attempted to address PoW's resource waste from different directions. Around 2017, researchers began exploring Proof of Useful Work: a mechanism still built on the PoW framework, but shifting miners' computation from random hash puzzles to tasks with potential economic or scientific value. Some schemes tie PoW difficulty to specific classes of problems, while others attempt to incorporate federated learning, matrix-multiplication tasks, or zero-knowledge proof generation. The appeal is obvious: miners keep the fairness of PoW by doing genuinely useful work, while reducing waste.
However, until recently these attempts have not targeted LLM inference: they have mostly focused on discrete computational problems or batch learning, rather than the real-time Transformer inference that powers today's AI services.
In fact, LLM inference is an ideal candidate for useful work: it is computationally expensive, economically valuable, and growing in importance every day. If the computation spent on inference can also secure the network, then network security becomes aligned with real computational demand.
In short, miners no longer need to calculate hash values; instead, they participate in consensus by completing Transformer inference tasks. This is the core idea behind Transformer-based proof-of-work. Of course, the design of this mechanism still needs to address a series of key challenges.
It is also worth noting that this mechanism is not limited to Transformers; it can be adapted to whichever model architecture proves more practical and mainstream in the future.
Design Challenges
Challenge 1: Assessing computing resources
In Bitcoin, "mining" is a miner's full-time job. However, for decentralized LLM networks that need to provide services to users, nodes spend most of their time processing inference requests rather than performing proof-of-work tasks. Therefore, there are two viable options:
The first option is theoretically feasible but requires further research: estimate participants' computational resources from the actual inference they perform on existing trained models, by running inference tasks, measuring their computational cost, and calibrating node weights accordingly. This approach is efficient, but it must solve two problems: how to account for variation across input data, and how to avoid exploitable weaknesses in the trained model's structure. It therefore requires substantial R&D investment.
The second, more practical option is a time-constrained scheme: design each proof-of-work puzzle as a short, fixed-duration, predictable task (for example, a few minutes), with the commitment that the same computational resources remain available throughout the entire epoch. This design gives more freedom in constructing uniform puzzles; a minimal sketch of such a window follows.
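Below is a hedged sketch of how a fixed-duration puzzle window might be scored. All names, the window length, and the scoring rule are illustrative assumptions, not the actual Gonka protocol; the point is only that solutions count inside the window and node weight follows the share of puzzles solved.

```python
# Hedged sketch of a fixed-duration puzzle window (names and parameters are hypothetical).
import time
from dataclasses import dataclass, field

@dataclass
class EpochChallenge:
    seed: bytes                 # fresh random seed broadcast at epoch start
    window_seconds: float       # short, fixed, predictable solving window
    start_time: float = field(default_factory=time.time)

    def accepts(self, submitted_at: float) -> bool:
        """A solution only counts if it arrives inside the window."""
        return submitted_at - self.start_time <= self.window_seconds

def node_weights(solved_counts: dict[str, int]) -> dict[str, float]:
    """Weight each node by its share of puzzles solved during the window."""
    total = sum(solved_counts.values()) or 1
    return {node: count / total for node, count in solved_counts.items()}

# Example: three nodes submit solutions within the same 300-second window.
challenge = EpochChallenge(seed=b"epoch-42", window_seconds=300.0)
print(node_weights({"node-a": 40, "node-b": 35, "node-c": 25}))
```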
Challenge 2: Aligning the task with LLM computation
If a "time-constrained proof-of-work" approach is adopted, a new problem arises: if the PoW task is arbitrary, the direction of hardware optimization may deviate from "useful work".
Bitcoin has already demonstrated the consequences of incentive mismatch: over time, the industry developed specialized hardware (ASICs) that is useful only for computing hash values.

Transformer-based Proof-of-Work can reverse this incentive: if the PoW task itself is Transformer inference, then hardware optimized for PoW naturally improves inference performance for the users being served; the direction of hardware optimization aligns with real needs.
To achieve this, two things must hold: first, the PoW task must be real Transformer inference; second, the task must change every round, so participants cannot compute answers ahead of the allotted time window.
Specifically, each PoW round generates a new, randomly initialized Transformer. After receiving the challenge, participants have only a fixed time window to solve it, with no way to analyze or pre-compute anything in advance, because each challenge is entirely new and the work matches real inference. Under this design there are no shortcuts, and no task-specific hardware can be built (the task changes every round). Hardware improvements only enhance general-purpose Transformer inference performance rather than serving mining-specific optimizations.
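One way to picture "a new, randomly initialized Transformer each round" is to derive the model deterministically from a per-round seed: everyone can reconstruct the same random weights once the seed is published, but nothing can be pre-computed before that. The sketch below uses PyTorch; the layer sizes and seed handling are illustrative assumptions, not the real protocol.

```python
# Hedged sketch: derive a fresh, randomly initialized Transformer from a per-round seed.
# Layer sizes are illustrative; the actual protocol parameters are not specified here.
import torch
import torch.nn as nn

def build_round_model(round_seed: int) -> nn.TransformerEncoder:
    """Every participant rebuilds the same random weights from the published seed,
    but cannot pre-compute anything before the seed is announced."""
    torch.manual_seed(round_seed)
    layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=4)

model = build_round_model(round_seed=123456)
model.eval()  # the puzzle involves inference only, no training
```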
Challenge 3: Security Assurance
Finally, there is the core issue of difficulty design: is this form of PoW secure enough?
Bitcoin's security logic is straightforward: producing a hash whose first N bits are zero requires brute force, there is no known mathematical shortcut for SHA-256, and the difficulty is easy to verify. Verification is equally simple: check whether the hash produced by the chosen nonce really has N leading zero bits.
Let's map Bitcoin's task directly into the Transformer setting. The Bitcoin nonce becomes an input sequence, which can be a vector or a sequence of tokens, dynamically adjustable, and still generated from a positive integer just like a Bitcoin nonce. The "leading zeros" requirement then translates into a constraint on the output:
The Transformer's output vector must satisfy certain properties. Possible constraints include: the output vector being close to zero, its distance to a target vector falling under a threshold, having a specific magnitude, or satisfying some other explicitly defined criterion. The precise definition of this condition matters, because certain mathematical structures admit shortcuts.
The key difference from Bitcoin is that checking whether a Transformer input sequence meets the condition is far more expensive. Ordinary hardware can compute millions of SHA-256 hashes per second, but verifying one Transformer input requires a full forward pass. Participants cannot brute-force billions of candidate sequences; their throughput is bounded by inference speed, and that is exactly the computational capacity we want to measure. A minimal sketch follows.
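The sketch below illustrates the nonce-to-sequence mapping and the output-side check, assuming a model like the one derived in the earlier round-seed sketch. The sequence length, reduction to a single output vector, and the distance threshold are illustrative assumptions; the point is that every candidate nonce costs one full forward pass to evaluate.

```python
# Hedged sketch of the nonce-to-sequence mapping and the output constraint check.
# Sequence length, pooling, and threshold are illustrative assumptions.
import torch

def nonce_to_sequence(nonce: int, seq_len: int = 32, d_model: int = 256) -> torch.Tensor:
    """Expand an integer nonce into a deterministic input sequence."""
    gen = torch.Generator().manual_seed(nonce)
    return torch.randn(1, seq_len, d_model, generator=gen)

def check_solution(model: torch.nn.Module, nonce: int,
                   target: torch.Tensor, threshold: float) -> bool:
    """Verification costs one full forward pass, unlike a cheap hash check."""
    with torch.no_grad():
        out = model(nonce_to_sequence(nonce))    # shape (1, seq_len, d_model)
        summary = out.mean(dim=1).squeeze(0)     # reduce to a single output vector
    return torch.dist(summary, target).item() < threshold

# A miner's only strategy is to iterate nonces, paying one forward pass per candidate:
# for nonce in range(limit):
#     if check_solution(model, nonce, target, threshold): submit(nonce)
```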
How this system achieves security comparable to Bitcoin's requires deeper technical analysis, which will be the subject of a separate article. The core idea is to use randomly initialized Transformers and careful puzzle design to construct a search space that can only be traversed by performing full Transformer inference.
The Proof-of-Work mechanism has run reliably for fifteen years, but Bitcoin's design carries a serious cost: enormous computational resources are spent producing hashes with no practical use. Alternatives such as PoS eliminate the waste, but concentrate power in the hands of capital holders.
Transformer-based Proof-of-Work is a third option: it keeps the security and fairness of PoW while directing computational resources to where the world actually needs them. As a consensus mechanism for the AI era, it combines PoW's security with real computational demand and the usefulness of the work itself, laying a new foundation for decentralized AI networks.
- Core thesis: Transformer-based PoW can combine security with usefulness.
- Key points:
  - Bitcoin's PoW wastes enormous computational resources.
  - PoS creates a new problem of wealth concentration.
  - Transformer inference can replace hash computation.
- Market impact: provides a new foundation for decentralized AI networks.
- Relevance: long-term impact


