Coinbase CEO: AI Spending Reduced by Nearly 50% Through GLM Models, While Token Usage Continues to Grow
Odaily Planet Daily reports that Coinbase CEO Brian Armstrong said in a post on X that, by optimizing default settings, routing, and caching strategies, Coinbase has cut its AI spending by nearly 50% even as token usage continues to grow.
Specific measures include: switching the default models to open-weight models such as GLM 5.2 and Kimi 2.7, with 91% of employees having never previously reached their usage limits; preprocessing prompts in a custom system that automatically routes each one to the most suitable model, so that planning and execution tasks are handled differently; improving cache hit rates, with LibreChat's cache hit rate rising from 5% to 60%; streamlining context management by opening a new session when switching tasks and narrowing the scope of file context; and improving spending visibility, leaving engineers free to choose their models while owning the expected impact of that choice.
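
The post does not describe how the routing system works internally. As a rough illustration only, the idea of preprocessing a prompt and sending planning and execution tasks to different models might be sketched as follows. The keyword list, the `classify` and `route` functions, and the model identifiers `planning-model` and `execution-model` are all hypothetical placeholders, not Coinbase's actual configuration.

```python
import re
from dataclasses import dataclass

# Hypothetical hint words; a production router would more likely use a
# trained classifier or a small model rather than a keyword list.
PLANNING_HINTS = {"plan", "design", "architecture", "strategy", "roadmap"}

# Placeholder model identifiers (assumptions, not real deployment names).
MODEL_FOR_TASK = {
    "planning": "planning-model",
    "execution": "execution-model",
}


@dataclass(frozen=True)
class Route:
    task_type: str  # "planning" or "execution"
    model: str      # identifier of the model chosen for this prompt


def classify(prompt: str) -> str:
    """Label a prompt as planning or execution using whole-word matching."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return "planning" if words & PLANNING_HINTS else "execution"


def route(prompt: str) -> Route:
    """Preprocess the prompt and pick a model for its task type."""
    task_type = classify(prompt)
    return Route(task_type, MODEL_FOR_TASK[task_type])
```

Under these assumptions, `route("Design a migration plan for the billing service")` would be sent to the planning model, while `route("Rename this variable to snake_case")` would go to the execution model. Whole-word matching is used so that a word such as "explanation" does not accidentally match "plan".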
