
PinchBench Benchmark: Gemini 3 Flash Leads Large Language Models with 95.1% Success Rate on OpenClaw Tasks

2026-03-08 03:27

According to a post on X by 23pads, CISO of SlowMist, the PinchBench benchmark evaluates how well large language models perform on OpenClaw agent tasks. Gemini 3 Flash leads the results with a 95.1% success rate. Minimax-m2.1 and Kimi-k2.5 follow closely in second and third place, at 93.6% and 93.4% respectively. Claude Sonnet 4.5 scored 92.7%, and GPT-4o scored 85.2%.