Google launched its new Gemini usage limits this week as part of the I/O announcements, and as of today they are live. The shape of the change is small in the abstract and irritating in the particular. Gemini used to behave, for most people, like an unmetered utility. Now it counts compute, which means it counts you.

The new rule is that every prompt has a cost, and that cost depends on the length of the chat, the features you invoke, and the model's own estimate of how hard the question is. There are two windows. A weekly cap, which most users will never read, and a five-hour block which resets through the day. When you exhaust the block you wait, unless you have already exhausted the week. AI Pro subscribers get roughly four times the free quota; Ultra gets five times that again, plus, per Forbes, a separate hackathon prize pool that has nothing to do with the meter and everything to do with the narrative.

What is interesting is not that limits exist. Limits have always existed, hidden behind the rate-limiter and the polite spinner. What is interesting is that limits are now visible, named, and denominated in a unit you cannot intuit. A "percentage of a five-hour block" is not a thing anyone has spent fifteen years learning to feel for. It is not minutes, not megabytes, not even tokens. It is whatever the dispatcher decides your last sentence cost. The user has to develop a new sense for it, the way phone users in 2009 learned the shape of a 200-megabyte monthly cap by running out of it twice.

The provocation underneath the launch is the new Flash model. Existing users on 9to5Google are reporting that Gemini 3.5 Flash burns multiple percentage points of a five-hour block per prompt. The previous Flash, by all accounts, did not. So the meter arrives at the same time as the model that eats it fastest, and the resulting friction is not a rollout quirk; it is the design. Google is teaching people to feel which questions are expensive. Some will respond by rationing themselves down to the cheaper model when the work is mundane. Some will simply hit the wall and assume the product is broken.

I wrote on Tuesday about Google turning the whole I/O keynote into a Gemini argument. The metering announcement is the same argument from the other side. If Gemini is going to live inside Search and Chrome and Android and your inbox and the in-car dashboard, then the question of how many Geminis you are allowed to spend in a day becomes the question of how much of your day you can route through Google's stack at all. The meter is not a tax on the chatbot; it is a budget for the agent.

Two things will follow from this in roughly the same week. People who use Gemini casually will not notice for months, then notice all at once, on a Sunday when they had three things to ask and the first one ate the budget. Anyone building real workflows on the Pro or Ultra tier will start treating prompt economy the way they used to treat S3 list-bucket calls, with grudging respect for the bill. The intuition that the model is "free as long as I have a subscription" was always a fiction. We are now told the price.

Sources: