
Plutonic Rainbows

Forty-Seven Percent Would Rather Not

Nearly half of British sixteen-to-twenty-one-year-olds told the British Standards Institution (BSI) they'd prefer to have grown up in a world without the internet. Forty-seven percent. Not a fringe opinion from technophobes or Luddites, but a near-majority of the generation that never knew anything else.

The rest of the numbers are worse. Sixty-eight percent said they felt worse about themselves after spending time on social media. Forty-two percent admitted to lying to their parents about what they do online. Forty percent maintain a decoy or burner account. Eighty-five percent of young women compare their appearance and lifestyle to what they see on their feeds, with roughly half doing so often or very often. These aren't edge cases. This is the baseline experience.

What strikes me isn't the individual statistics — we've had versions of these figures for years. Back in 2018, Apple's own investors were pressuring the company over youth phone addiction, citing surveys where half of American teenagers said they felt addicted to their devices. Seven years later, nothing structural changed. The platforms got stickier. The algorithms got sharper. The age of first exposure dropped. And now the generation that grew up inside the experiment is telling us, plainly, that they wish the experiment hadn't happened.

Fifty percent of respondents said a social media curfew would improve their lives. Twenty-seven percent wanted phones banned from schools. Seventy-nine percent believed tech companies should be legally required to build privacy safeguards. That last number is the one I keep returning to — four out of five young people asking for regulation that adults have spent a decade failing to deliver.

The BSI's chief executive, Susan Taylor Martin, put it in corporate language: "The younger generation was promised technology that would create opportunities, improve access to information and bring people closer to their friends." The research, she said, shows it is "exposing young people to risk and, in many cases, negatively affecting their quality of life." This is what institutional understatement sounds like when the data is screaming.

There's an uncomfortable parallel with how the AI industry is repeating social media's mistakes — the same pattern of externalised harm and internalised profit, the same rehearsed contrition at hearings, the same gap between stated commitments and actual behaviour. The platforms knew what they were doing to adolescents. Internal documents confirmed it. Nothing changed because engagement metrics drove revenue, and revenue was the only number that mattered in the boardroom.

Forty-three percent of the respondents started using social media before the age of thirteen — the legal minimum. Not because their parents approved, but because the platforms made it trivially easy to lie about your age. Then those same platforms sold advertising against the attention of children who shouldn't have been there in the first place.

The generation that was supposed to be "digital natives" — fluent, empowered, connected — is telling us they'd trade it all for something quieter. We should probably listen.


Virgin Records Press Call, March 1990

Propaganda lined up for Virgin in March 1990 with four faces calibrated for the dark — the ruffled blouse at center doing more work than any press stylist should have to admit.

Deep Think Crosses the Human Line

Google upgraded Gemini 3 Deep Think yesterday, and the number that matters is 84.6%. That's its score on ARC-AGI-2, the abstract reasoning benchmark designed to resist brute-force pattern matching. Humans average around 60%. Claude Opus 4.6 — which landed last week to genuine excitement — scores 68.8%. GPT-5.2 manages 52.9%. Deep Think clears the human baseline by nearly 25 points and leads the next-best model by almost 16.
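Those margins are easy to sanity-check. A quick calculation over the scores quoted above (nothing here beyond the numbers in the paragraph):

```python
# ARC-AGI-2 scores as reported in the post, in percent.
scores = {
    "Gemini 3 Deep Think": 84.6,
    "Human average": 60.0,
    "Claude Opus 4.6": 68.8,
    "GPT-5.2": 52.9,
}

# Deep Think's lead over the human baseline and the next-best model.
margin_over_human = scores["Gemini 3 Deep Think"] - scores["Human average"]
margin_over_next = scores["Gemini 3 Deep Think"] - scores["Claude Opus 4.6"]

print(f"Over human baseline: {margin_over_human:.1f} points")   # 24.6
print(f"Over next-best model: {margin_over_next:.1f} points")   # 15.8
```

Hence "nearly 25 points" over the human average and "almost 16" over Claude Opus 4.6.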

I'm trying to figure out what to do with that.

The Codeforces result is harder to dismiss as benchmark theatre. Deep Think hit 3,455 Elo — Legendary Grandmaster territory, better than all but seven active human programmers on the platform. No external tools. No retrieval. Just inference-time compute and whatever Google means by "parallel hypothesis exploration." The top human competitor, Benq, sits at 3,792. That gap is closing fast enough to make competitive programming feel like it has an expiration date.
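To put that 337-point gap in perspective, the standard Elo expected-score formula (which Codeforces ratings approximate) converts a rating difference into a head-to-head win probability. This is the textbook formula applied to the two ratings quoted above, not anything specific to how Codeforces pairs competitors:

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that player A beats player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Ratings from the post: Deep Think at 3,455 vs. the top human, Benq, at 3,792.
deep_think, benq = 3455, 3792
p = elo_expected_score(deep_think, benq)
print(f"Deep Think win probability vs. Benq: {p:.3f}")  # ≈ 0.126
```

Roughly a one-in-eight chance against the single best human on the platform, under the usual Elo assumptions.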

What changed from the previous version: scope. Earlier iterations of Deep Think were narrowly focused on mathematics. This upgrade pushes into chemistry, physics, and engineering. Gold medals on the written portions of the International Math, Physics, and Chemistry Olympiads. A mathematician at Rutgers used it to peer-review a paper on high-energy physics structures bridging gravity and quantum mechanics. It caught a subtle logical flaw that human reviewers had missed. That's not a benchmark. That's a real research contribution, however narrow.

The architecture Google describes — they call it "Aletheia" — uses a generator, a natural language verifier, and a reviser working in concert. Parallel hypothesis exploration rather than a single reasoning chain. The interesting detail is that the system can acknowledge failure and stop rather than burning compute on dead-end paths. Most reasoning models I've used have no concept of giving up gracefully. They hallucinate forward until they hit a token limit. If Aletheia genuinely knows when it's stuck, that's a meaningful advance in how these systems manage uncertainty.
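Google hasn't published Aletheia's internals, so the following is only a toy control-flow sketch of the generator/verifier/reviser loop described above. Every function here is a hypothetical stub, not a real API; the one structural point it illustrates is the graceful-failure path, where the loop returns nothing rather than pressing on with a bad hypothesis:

```python
import random

def generate(problem: str, n: int = 4) -> list[str]:
    """Propose several candidate solutions in parallel (stubbed)."""
    return [f"candidate-{i} for {problem}" for i in range(n)]

def verify(candidate: str) -> float:
    """Score a candidate's reasoning; stubbed as a random confidence."""
    return random.random()

def revise(candidate: str) -> str:
    """Attempt to repair a rejected candidate; stubbed as a no-op."""
    return candidate

def solve(problem: str, threshold: float = 0.9, max_rounds: int = 3):
    candidates = generate(problem)
    for _ in range(max_rounds):
        scored = [(verify(c), c) for c in candidates]
        scored.sort(reverse=True)
        best_score, best = scored[0]
        if best_score >= threshold:
            return best
        # Keep only the strongest hypotheses and try to repair them.
        candidates = [revise(c) for _, c in scored[:2]]
    return None  # acknowledge failure instead of hallucinating forward

result = solve("toy reasoning task")
print("solved" if result else "gave up")
```

The `return None` branch is the part most reasoning models lack: an explicit way to stop spending compute once every surviving hypothesis has failed verification.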

Google is pulling the same lever as OpenAI's o-series and Anthropic's extended thinking, just much harder: scaling inference-time compute, giving the model more time to think rather than making a bigger model. The base is still Gemini 3 Pro, not some trillion-parameter behemoth. Deep Think is a reasoning mode, not a separate model. The distinction matters because it suggests the ceiling on what you can extract from existing architectures is higher than most people assumed. You don't need a fundamentally new model. You need to let the current one actually think.

That feels right to me, intuitively. When I use extended thinking in Claude, the quality jump over instant responses is enormous — not because the model suddenly knows more, but because it has room to work through contradictions and dead ends before committing to an answer. Google is doing the same thing with significantly more compute thrown at the problem. Anthropic shows you the reasoning. Google hides it. Both approaches produce results that make the non-thinking versions look careless by comparison.

The pricing is interesting. Deep Think is included in the Google AI Ultra subscription at $249.99 per month. API access requires applying for an early programme. I keep thinking about how o3 was positioned as the reasoning breakthrough that would change everything, and then Deep Think shows up a year later scoring nearly 30 times higher on the same class of benchmark. The pace of obsolescence in this space is genuinely disorienting.

Demis Hassabis called it "new records on the most rigorous benchmarks in maths, science & reasoning." MarkTechPost ran with "Is This AGI?" — which, no. But I understand the impulse. A system that reasons better than the average human on abstract pattern recognition, codes better than 99.99% of programmers, and catches errors in peer-reviewed physics papers occupies territory that didn't exist twelve months ago.

Google DeepMind published a research impact taxonomy alongside the release, rating contributions from Level 0 to Level 4. They classify Deep Think's current output at Levels 0-2 — autonomous solutions and publishable collaborations, not landmark breakthroughs. The fact that they felt the need to temper expectations tells you something about the temperature of the conversation. When the company releasing the model is the one saying "calm down," the benchmarks have moved past what anyone's frameworks were built to accommodate.
