Coinbase CEO: AI spend cut by nearly 50% through GLM model adoption while token usage continues to grow
Odaily reported that Coinbase CEO Brian Armstrong posted on X that Coinbase has cut its AI spending by nearly 50%, even as token usage continues to rise, by optimizing default settings, routing, and caching strategies.
Specific measures include: switching the default model to open-weight models such as GLM 5.2 and Kimi 2.7, with 91% of employees never reaching their usage caps; pre-processing prompts in a custom system that automatically routes them to the most suitable model, so that planning and execution tasks are handled differently; raising the cache hit rate, with LibreChat's cache hit rate rising from 5% to 60%; streamlining context management by starting a new session when switching tasks and narrowing the file context scope; and improving spending visibility, so that engineers remain free to choose models but are expected to account for the cost impact of their choices.
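
As an illustration of the routing measure, the sketch below sends planning-style prompts to one model and execution-style prompts to another, and computes a cache hit rate. It is a minimal sketch, not Coinbase's system: the model identifiers, the keyword heuristic, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical model identifiers; the real routing table is not public.
ROUTES = {
    "planning": "large-reasoning-model",
    "execution": "open-weight-default",
}

# Assumed heuristic: prompts containing these words are treated as planning tasks.
PLANNING_WORDS = {"plan", "design", "architecture"}


def classify(prompt: str) -> str:
    """Label a prompt as 'planning' or 'execution' using whole-word matching."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return "planning" if words & PLANNING_WORDS else "execution"


def route(prompt: str) -> str:
    """Return the (hypothetical) model a prompt would be routed to."""
    return ROUTES[classify(prompt)]


def cache_hit_rate(hits: int, total: int) -> float:
    """Fraction of requests served from cache; 0.0 when there were no requests."""
    return hits / total if total else 0.0
```

Under these assumptions, `route("Plan the migration of the billing service")` selects the planning model, while `route("Rename this variable")` falls through to the cheaper default. A production router would more likely use a classifier than a keyword list.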
