Burn after prompting
What if the AI tool you think you're paying for is actually billing you for every decision a model makes before the answer even gets to you?
01 July 2026
It took Uber only four months to burn through its 2026 AI budget. In December 2025, Uber rolled out Claude Code; by April, the rideshare platform had nothing left to spend. The reason wasn't rogue engineers or poor planning. It was tokens. Unlike software sold for a fixed licence fee, AI coding tools like Claude Code and GitHub Copilot charge per token, which sounds manageable until you understand how tokens work in practice.
A token is the smallest unit of data that a GenAI model processes. When a new model is trained, it is fed all sorts of data, like text, images or audio. That data is broken into small pieces, called tokens, which the model learns to recognise and reuse as it picks up patterns. Over time, the model builds up a large internal vocabulary of tokens. Usage is then metered in both directions: the tokens you send in, such as prompts, emails or uploaded documents, and the tokens you get back as outputs, like summaries, code suggestions or draft responses. The more tokens you use with a particular model, the more you pay.
For Uber, the pricing model was only part of the story. Engineers were ranked on internal leaderboards based on Claude Code usage, creating a direct incentive to consume more tokens. The teams driving adoption were not the same teams watching the spend. With over 5 000 engineers using the tool, it was reported that even Uber's CTO spent $1 200 in a single two-hour session.
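To make per-token billing concrete, here is a minimal sketch of how a bill could add up over one working session. Everything in it is an assumption for illustration: the `TokenPricing` class, the function names and the rates of $3 and $15 per million tokens are hypothetical and are not any vendor's actual prices. It also assumes, as a simplification, that each step in a session re-sends a growing amount of context as input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical rates in US dollars per one million tokens."""
    input_per_million: float
    output_per_million: float


def request_cost(pricing: TokenPricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of a single request: input and output tokens are billed separately."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


def session_cost(pricing: TokenPricing, requests: list[tuple[int, int]]) -> float:
    """Total cost of a session given (input_tokens, output_tokens) per request."""
    return sum(request_cost(pricing, i, o) for i, o in requests)


# Illustrative rates only; not real pricing for any product.
ILLUSTRATIVE = TokenPricing(input_per_million=3.0, output_per_million=15.0)

# Assumed session: ten steps, each sending 5 000 more input tokens than the
# last (20 000 for the first step) and receiving 1 000 output tokens.
steps = [(20_000 + 5_000 * n, 1_000) for n in range(10)]
total = session_cost(ILLUSTRATIVE, steps)

print(f"Input tokens billed: {sum(i for i, _ in steps):,}")
print(f"Output tokens billed: {sum(o for _, o in steps):,}")
print(f"Estimated session cost: ${total:.2f}")
```

Under these assumed numbers the input side dominates the bill, even though the outputs are what the user actually sees. Multiplying a per-session figure by thousands of engineers and many sessions a day shows how a budget set like a licence fee can run out early.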
