Coinbase CEO: AI spending cut by nearly 50% through measures such as switching to GLM models, while token usage continues to grow
Odaily Planet Daily News: Coinbase CEO Brian Armstrong posted on X that Coinbase has cut its AI spending by nearly 50%, even as token usage continues to grow, by optimizing default settings, routing, and caching strategies.
Specific measures include: switching the default model to open-weight models such as GLM 5.2 and Kimi 2.7, with 91% of employees never having reached their usage limits; preprocessing prompts in a custom system that automatically routes each one to the most suitable model, so that planning and execution tasks are handled differently; improving cache hit rates, with LibreChat's cache hit rate rising from 5% to 60%; streamlining context management by starting a new session when switching tasks and narrowing the scope of file context; and improving spending visibility, with engineers free to choose their models but with expectations set about the corresponding performance impact.
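
The routing and caching measures can be illustrated with a minimal Python sketch. It is not Coinbase's system: the model names, the keyword list, and the `CachedClient` class are hypothetical, and sending planning prompts to a larger model is an assumption, since the post does not say which model handles which task type. The cache is a simple exact-match response cache standing in for whatever caching LibreChat actually uses.

```python
import hashlib
import re
from dataclasses import dataclass, field

# Hypothetical model tiers; placeholders, not Coinbase's configuration.
PLANNING_MODEL = "larger-model"
EXECUTION_MODEL = "open-weight-default"

# Assumed keyword heuristic for spotting planning-style prompts.
PLANNING_HINTS = {"plan", "design", "architecture", "strategy"}


def route(prompt: str) -> str:
    """Pick a model: planning prompts go to one tier, everything else to the default."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return PLANNING_MODEL if words & PLANNING_HINTS else EXECUTION_MODEL


@dataclass
class CachedClient:
    """Routes each prompt, then serves repeated (model, prompt) pairs from a cache."""
    cache: dict = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def complete(self, prompt: str, call_model) -> str:
        model = route(prompt)
        key = hashlib.sha256(f"{model}\n{prompt}".encode()).hexdigest()
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        result = call_model(model, prompt)  # call_model is supplied by the caller
        self.cache[key] = result
        return result

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Here `call_model` is any function taking `(model, prompt)` and returning text, so the sketch can be tried with a stub before wiring in a real API client. Tracking `hit_rate()` is one way to observe the kind of cache metric the post cites.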
