The core challenge of decentralized AI inference: How do you prove to the entire network that you are not "cheating"?
Gonka_ai
Guest columnist
@gonka_ai
2025-10-22 03:23
This article is about 3,796 words; reading it in full takes about 6 minutes.
When computing power becomes a commodity, trust must be built through code rather than promises.

Original article by Anastasia Matveeva, Co-founder of Gonka Protocol

In our previous article, we explored the fundamental tension between security and performance in decentralized LLM inference. Today we deliver on our promise and dig into a core question: in an open network, how do you actually verify that a node is running the exact model it claims to run?

01. Why is verification so difficult?

To understand the verification mechanism, let's first review what happens inside a Transformer during inference. As input tokens are processed, the model's final layer produces logits: raw, unnormalized scores for every token in the vocabulary. A softmax converts these logits into a probability distribution over all possible next tokens, and at each generation step a token is sampled from that distribution to extend the sequence.
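
A minimal sketch of this last step, with a toy vocabulary and made-up logits standing in for a real model:

```python
import numpy as np

# Toy vocabulary and made-up logits for a single next-token position.
vocab = ["Hello", "Hi", "Hey", "world"]
logits = np.array([3.2, 0.5, 0.1, -1.0])

# Softmax turns raw logits into a probability distribution over the vocabulary.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Sample the next token from that distribution (greedy decoding would take argmax instead).
rng = np.random.default_rng(seed=0)
next_token = rng.choice(vocab, p=probs)
print(dict(zip(vocab, probs.round(3))), "->", next_token)
```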

Before diving into potential attack vectors and specific verification implementations, we first need to understand why verification itself is difficult.

The root of the problem lies in the non-determinism of GPUs. Even the same model and input may produce slightly different outputs on different hardware, or even on the same device, due to issues such as floating-point precision.

GPU non-determinism makes directly comparing output token sequences meaningless, so we have to look at the Transformer's internal computation instead. A natural choice is to compare the probability distributions over the output layer, i.e., over the model vocabulary. To make sure we are comparing distributions for the same sequence, the verification procedure has the verifier force-decode exactly the token sequence the executor generated and then compare the distributions step by step. The executor's recorded distributions form a verification certificate attesting to which model produced the output.
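
A sketch of what "force-decoding the same sequence" means in practice, assuming a hypothetical `next_token_probs(context)` callable that returns the verifier model's distribution over the vocabulary (an illustration, not Gonka's actual interface):

```python
from typing import Callable, Dict, List

def reproduce_distributions(
    next_token_probs: Callable[[List[str]], Dict[str, float]],
    prompt_tokens: List[str],
    executor_tokens: List[str],
) -> List[Dict[str, float]]:
    """Force-decode the executor's exact output and record, at each step,
    the distribution the verifier's model assigns to the next token."""
    distributions = []
    context = list(prompt_tokens)
    for token in executor_tokens:
        distributions.append(next_token_probs(context))
        context.append(token)  # append the executor's token, not our own sample
    return distributions
```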

This probabilistic behavior also creates a delicate balancing act: we need to punish persistent cheaters while avoiding collateral damage to honest nodes that simply got unlucky and produced low-probability outputs. Set the detection threshold too aggressively and honest players get hurt; set it too leniently and cheaters slip through.

02. The Economics of Cheating: Benefits and Risks

Potential benefits: Huge temptation

The most direct attack is "model replacement." Suppose the network has agreed to deploy Qwen3-32B, which demands a large amount of compute. A rational node might think: "What if I secretly run the much smaller Qwen2.5-3B instead and pocket the compute savings?"

Passing off a 3-billion-parameter model as a 32-billion-parameter one could cut compute costs by roughly an order of magnitude. If you can fool the verification system, you get paid for expensive compute while delivering results produced with cheap compute.

A more sophisticated attacker might play with quantization instead: claim to serve the model at FP8 precision while actually running an INT4-quantized version. The quality gap might not be obvious, but the cost savings are still substantial, and the outputs might be close enough to pass a naive check.

At a more sophisticated level still, there are pre-fill attacks. Here the attacker generates a proof for the output of a cheap model as if that output had been produced by the full model the network expects. It works as follows:

Suppose on-chain consensus specifies that Qwen3-235B with a particular set of parameters is to be deployed.

1. The executor uses Qwen2.5-3B to generate the sequence: `[Hello, world, how, are, you]`.

2. The executor computes the Qwen3-235B proof for these exact same tokens via a single forward pass: `[{Hello: 0.9, Hi: 0.05, Hey: 0.05}, ...]`.

3. The executor submits the Qwen3-235B probabilities as the proof, claiming the inference came from Qwen3-235B.

In this case the probabilities really do come from the correct model, so they look legitimate, yet the actual sequence generation was far cheaper. And since it is theoretically possible for the full model to produce the same output as the smaller one, the result can look perfectly valid from the verifier's perspective.
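
A runnable toy illustration of the attack's shape; `toy_model`, `small_model`, and `large_model` are made-up stand-ins, not real model calls:

```python
import numpy as np

VOCAB = ["Hello", "world", "how", "are", "you", "!"]
rng = np.random.default_rng(0)

def toy_model(scale: float):
    """Stand-in for an LLM: returns a next-token distribution for a given context.
    `scale` only serves to make the two 'models' behave differently."""
    def next_token_probs(context):
        logits = rng.normal(size=len(VOCAB)) * scale
        p = np.exp(logits - logits.max())
        return p / p.sum()
    return next_token_probs

small_model = toy_model(scale=1.0)   # stands in for the cheap model (Qwen2.5-3B)
large_model = toy_model(scale=3.0)   # stands in for the claimed model (Qwen3-235B)

# Step 1: generate the sequence cheaply with the small model (autoregressive sampling).
tokens = []
for _ in range(5):
    p = small_model(tokens)
    tokens.append(VOCAB[int(rng.choice(len(VOCAB), p=p))])

# Step 2: teacher-forced evaluation of the large model over those same tokens yields its
# top-K probabilities at every position (in a real Transformer this is a single forward
# pass) -- far cheaper than generating the sequence autoregressively with the large model.
proof = []
for i in range(len(tokens)):
    p = large_model(tokens[:i])
    top3 = np.argsort(p)[-3:][::-1]
    proof.append({VOCAB[j]: round(float(p[j]), 3) for j in top3})

# Step 3: submit (tokens, proof) while claiming the large model generated the sequence.
print(tokens)
print(proof[0])
```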

Potential losses: An even bigger price

While gaming the system can yield considerable gains, the potential losses are just as significant. The real challenge for a cheater is not passing a single verification, but systematically evading detection for long enough that the compute "discount" they pocket exceeds the penalties the network can impose.

In the Gonka network, we have designed a layered set of economic deterrents:

- Everyone is a validator: each node verifies a share of the network's inferences in proportion to its weight.

- Reputation system: a new node starts with a reputation of 0 and has every inference verified. With sustained honest participation, reputation grows and the verification rate can drop to 1%.

- Penalty mechanism: a node caught cheating has its reputation reset to zero, and rebuilding it takes about 30 days.

- Epoch settlement: within an epoch of roughly 24 hours, being caught cheating a statistically significant number of times forfeits all rewards for that epoch.

This means a cheater trying to save 50% of their compute can end up losing 100% of their earnings, and this asymmetry makes cheating economically unattractive. The goal of our verification mechanism is not to catch every questionable inference, but to draw a line that lets us catch cheaters with a high enough probability while leaving the reputation of honest participants intact.
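
A back-of-the-envelope check of this asymmetry. The 99% catch rate comes from the FP8-vs-INT4 experiment described later; the reward and cost figures are purely assumed for illustration:

```python
# Illustrative expected value of cheating for one epoch (all figures are assumptions).
reward_per_epoch = 100.0      # reward an honest node earns in one ~24h epoch
honest_compute_cost = 60.0    # cost of serving the epoch's inferences honestly
savings_ratio = 0.5           # cheating halves the compute cost
detection_prob = 0.99         # chance of being caught within the epoch

honest_profit = reward_per_epoch - honest_compute_cost

cheat_compute_cost = honest_compute_cost * (1 - savings_ratio)
# If caught, the entire epoch's rewards are confiscated (and reputation resets to zero,
# which is not even priced in here).
expected_cheat_profit = (1 - detection_prob) * reward_per_epoch - cheat_compute_cost

print(f"honest profit: {honest_profit:.1f}")                     # 40.0
print(f"expected cheating profit: {expected_cheat_profit:.1f}")  # -29.0
```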

03. How to catch cheaters? Three verification schemes

So how do we catch these attacks? The problem splits into two parts: 1) verifying that the distributions in the proof are close to the distributions produced by the claimed model; and 2) confirming that the output text was indeed generated from the submitted proof.

Scheme 1: Probability Distribution Comparison (Core Foundation)

When the executor generates an inference output (e.g. `[Hello, world, how, are, you]`), it records a verification certificate containing the final output and the top-K probabilities at each position of the output sequence (e.g. `[{Hello: 0.9, Hi: 0.05, Hey: 0.05}, ...]` for the first position, and so on). The verifier then forces its model through exactly the same token sequence and computes a normalized distance \( d_i \) between the probabilities at each position, where \( p_{\text{artifact},ij} \) is the probability of the j-th most likely token at position i in the inference artifact and \( p_{\text{validator},ij} \) is the probability of the same token under the validator's distribution.

The final distance metric is the average of the per-position distances, \( D = \frac{1}{N} \sum_{i=1}^{N} d_i \).
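
The per-position distance formula itself is rendered as an image in the original article, so the sketch below uses one plausible choice, a normalized L1 distance over the executor's top-K tokens, averaged across positions; the exact normalization Gonka uses may differ:

```python
from typing import Dict, List

def position_distance(artifact_topk: Dict[str, float],
                      validator_probs: Dict[str, float]) -> float:
    """Normalized distance at one position between the executor's claimed top-K
    probabilities and the verifier's probabilities for the same tokens."""
    diff = sum(abs(p - validator_probs.get(tok, 0.0)) for tok, p in artifact_topk.items())
    mass = sum(artifact_topk.values()) + sum(validator_probs.get(tok, 0.0) for tok in artifact_topk)
    return diff / mass if mass > 0 else 0.0

def sequence_distance(artifact: List[Dict[str, float]],
                      validator: List[Dict[str, float]]) -> float:
    """Average of the per-position distances over the generated sequence."""
    per_position = [position_distance(a, v) for a, v in zip(artifact, validator)]
    return sum(per_position) / len(per_position)

# Executor's certificate for the first position vs. the verifier's recomputation.
artifact = [{"Hello": 0.9, "Hi": 0.05, "Hey": 0.05}]
validator = [{"Hello": 0.88, "Hi": 0.06, "Hey": 0.05, "Howdy": 0.01}]
print(sequence_distance(artifact, validator))  # small value -> consistent with the same model
```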

Comparing the top-K probabilities turned out to be sufficient for catching completely different models, and even different quantized versions of the same model. As the chart below shows, when testing Qwen3-32B FP8 against its INT4-quantized version, we achieved a 99% fraud detection rate with zero false positives. The probability distributions of other models were also different enough that cheating could easily be detected within a single epoch, while honest participants remained safe.

The following figure compares the number of successful and failed verifications for Qwen3-32B FP8 (honest) vs. INT4 quantization (fraudulent):

The following figure shows the change in the probability distribution distance of Qwen3-32B FP8 (honest) vs INT4 quantization (fraudulent) as a function of sequence length:

The drawback of this approach on its own is that it does not protect against pre-fill attacks. If the probabilities do match the full model's distribution but the sequence was generated by a smaller model, we have several additional methods to catch it.

Scheme 2: Perplexity Detection (Defending Against Pre-fill Attacks)

Beyond verifying that the proof comes from the correct model, we also check whether the output text is likely to have been generated from the submitted distributions. The most intuitive way to catch a pre-fill attack is to check perplexity: the perplexity of the generated tokens under the probabilities contained in the proof.

If the sequence was generated by a different model, its perplexity relative to the claimed model's distributions will be abnormally high. This works well against attacks that use a significantly different model: when we tested sequences generated by Qwen2.5-3B but claimed to come from Qwen2.5-7B, the perplexity gap was large enough to set a clear threshold. The figure below shows a transformed perplexity score mapped onto the interval [0, 1], where the honest 7B outputs mostly fall below 0.2, while the pre-fill attack outputs (3B sequences paired with 7B probabilities) all fall above 0.4.
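
A minimal sketch of the check: the perplexity of the claimed output under the distributions the executor submitted. Tokens absent from the submitted top-K are floored at a tiny probability here; how Gonka handles that case is an implementation detail not given in the article:

```python
import math
from typing import Dict, List

def perplexity(output_tokens: List[str],
               artifact_distributions: List[Dict[str, float]],
               floor: float = 1e-6) -> float:
    """Perplexity of the generated tokens under the submitted distributions.
    A high value means those distributions are unlikely to have produced this text,
    which is exactly the signature of a pre-fill attack."""
    log_sum = 0.0
    for token, dist in zip(output_tokens, artifact_distributions):
        log_sum += math.log(max(dist.get(token, 0.0), floor))
    return math.exp(-log_sum / len(output_tokens))

tokens = ["Hello", "world"]
proof = [{"Hello": 0.9, "Hi": 0.05}, {"world": 0.7, "there": 0.2}]
print(perplexity(tokens, proof))  # low when the text really was sampled from the proof
```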

The following figure shows the transformed perplexity score for Qwen2.5-7B (honest) vs. the Qwen2.5-3B pre-fill attack (fraudulent):

However, perplexity detection has its limits when the models are similar, including different quantized versions of the same model. For example, comparing the w8a16 and w8a8 quantized versions of Qwen2.5-7B Base shows no significant difference.

The following figure shows the transformed perplexity score for Qwen2.5-7B w8a16 (honest) vs. w8a8 quantization (fraudulent):

Fortunately, attacking with a near-identical model rarely makes economic sense. An attacker who runs a model costing 70-90% as much as the claimed one, and who must additionally compute probabilities against the true model, may well spend more compute than if they had simply run the true model honestly.

It is worth noting that a single low-probability output from an honest participant will not meaningfully damage their reputation. As long as such outputs are not persistent for that participant, i.e. they are just random statistical outliers, the participant still receives the full reward at the end of the epoch.

Scheme 3: RNG Seed Binding (A Deterministic Solution)

This is the most radical solution: tie the output sequence to the random number generator seed.

The executor initializes the RNG using a deterministic seed derived from the request (e.g. `run_seed = SHA256(user_seed || inference_id_from_chain)`). The verification evidence contains this seed and the probability distribution.

The verifier uses the same seed to check that, if the sequence really was sampled from the claimed model's distributions, the identical output is reproduced. This yields a deterministic yes/no answer, eliminates pre-fill attacks entirely, and costs far less than re-running the full inference.
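
A sketch of how such a check could look. The seed derivation follows the `run_seed = SHA256(user_seed || inference_id_from_chain)` scheme from the text; replaying the sampling over the submitted top-K distributions is a simplification (a real implementation samples over the full vocabulary inside the model runtime):

```python
import hashlib
from typing import Dict, List

import numpy as np

def derive_seed(user_seed: str, inference_id: str) -> int:
    """Deterministic per-request seed: run_seed = SHA256(user_seed || inference_id)."""
    digest = hashlib.sha256((user_seed + inference_id).encode()).digest()
    return int.from_bytes(digest[:8], "big")

def replay_sampling(run_seed: int, distributions: List[Dict[str, float]]) -> List[str]:
    """Re-run the sampling step from the submitted distributions with the bound seed.
    If the executor really sampled from these distributions with this seed, the replay
    reproduces their output tokens exactly -- a deterministic yes/no check."""
    rng = np.random.default_rng(run_seed)
    tokens = []
    for dist in distributions:
        candidates = list(dist.keys())
        probs = np.array(list(dist.values()), dtype=float)
        probs /= probs.sum()
        tokens.append(candidates[int(rng.choice(len(candidates), p=probs))])
    return tokens

run_seed = derive_seed("user-42", "inference-0xabc")
proof = [{"Hello": 0.9, "Hi": 0.1}, {"world": 0.8, "there": 0.2}]
replayed = replay_sampling(run_seed, proof)
print(run_seed, replayed)  # the verifier compares `replayed` with the executor's output
```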

04. Outlook: Towards a decentralized AI future

We share these practices and reflections out of our unwavering belief in the future of decentralized AI. As AI models increasingly permeate our lives, the need to bind model outputs to specific parameters will only grow stronger.

The verification scheme the Gonka network has chosen has proven workable in practice, and its components can be reused in other settings where the authenticity of AI inference needs to be verified.

Decentralized AI is not just a technological evolution; it's a transformation of production relations. It attempts to address fundamental trust issues through algorithms and economic mechanisms in an open environment. While the road ahead is long, we've already taken a solid step forward.

Original link
