PANews reported on June 27 that Coinbase CEO Brian Armstrong shared the company's internal AI cost-control practices on X. He said that, with token call volume growing exponentially, the key to stabilizing AI spending is not usage caps and consumption alerts but three foundational capabilities: default models, intelligent task routing, and caching.
Coinbase uses its LLM gateway to make open-weight models such as GLM 5.2 and Kimi 2.7 the default choice. Because its data shows that 91% of employees never hit usage limits, the company dropped the approach of lowering quotas and adding alerts. Instead, the system automatically preprocesses prompts, applies caching, and dynamically matches each task to the best-suited model based on price, letting AI handle model selection in place of manual choice.
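The price-based routing described above can be sketched roughly as follows. This is a minimal illustration, not Coinbase's implementation: the model names, prices, difficulty tiers, and the keyword heuristic in `estimate_difficulty` are all hypothetical, chosen only to show the idea of picking the cheapest model judged capable of a task.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Model:
    name: str
    price_per_mtok: float  # hypothetical price in USD per million tokens
    max_difficulty: int    # hardest task tier this model is trusted with


# Hypothetical catalog; names and prices are placeholders, not real quotes.
CATALOG: List[Model] = [
    Model("open-weight-default", 0.5, 1),
    Model("mid-tier", 3.0, 2),
    Model("frontier", 15.0, 3),
]


def estimate_difficulty(prompt: str) -> int:
    """Toy heuristic (an assumption): map a prompt to a difficulty tier 1-3."""
    text = prompt.lower()
    if any(keyword in text for keyword in ("prove", "architecture", "multi-step")):
        return 3
    if "refactor" in text or len(prompt.split()) > 200:
        return 2
    return 1


def route(prompt: str, catalog: List[Model] = CATALOG) -> Model:
    """Return the cheapest model whose tier covers the estimated difficulty."""
    need = estimate_difficulty(prompt)
    capable = [m for m in catalog if m.max_difficulty >= need]
    return min(capable, key=lambda m: m.price_per_mtok)


if __name__ == "__main__":
    for p in ("Summarize this email", "Refactor this module"):
        print(p, "->", route(p).name)
```

In this sketch a cheap open-weight model handles anything the heuristic rates as simple, and pricier models are selected only when the estimated difficulty calls for them, so the user never chooses a model by hand.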
Meanwhile, the platform has enabled caching end to end, raising the LibreChat cache hit rate from 5% to 60%, and has standardized context trimming to cut wasted tokens. The system does not restrict how much AI is used; it is designed solely to support business growth. Coinbase has so far nearly halved its AI spending even as token consumption continues to climb.
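To make the cache hit rate figure concrete, here is a minimal sketch of a cache that tracks hits and misses. The post does not say what kind of cache is involved, so this uses a simple exact-match response cache as a stand-in; the `PromptCache` class, the whitespace normalization, and the `call_model` callback are hypothetical and stand in for whatever the real gateway does.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Exact-match response cache with hit-rate accounting (illustrative only)."""

    def __init__(self, call_model: Callable[[str], str]) -> None:
        self._call_model = call_model  # hypothetical function that calls an LLM
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Collapse whitespace so trivially different prompts share one entry.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def complete(self, prompt: str) -> str:
        key = self._key(prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._call_model(prompt)
        self._store[key] = result
        return result

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    cache = PromptCache(lambda p: f"response to: {p}")
    for prompt in ("hello  world", "hello world", "something else"):
        cache.complete(prompt)
    print(f"hit rate: {cache.hit_rate:.2f}")
```

Every hit is a model call that is never paid for, which is why a higher hit rate can lower spending while usage keeps growing.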



