The trouble with GPT-5…
OpenAI’s Sam Altman has been talking GPT-5 up as a quantum leap in intelligence for so long that it was almost inevitable its release would be something of a disappointment.
While it’s not quite New Coke-level bad, it wasn’t the reception OpenAI had hoped for.
Social media was immediately flooded with examples of GPT-5 (apparently) giving very stupid answers, including obvious logic flaws, claiming that 9.11 is greater than 9.9, and following patterns to incorrect conclusions.
OpenAI’s odds on Polymarket of having the “best model by end of August” plunged from 75% to just 8% in the aftermath, though they’ve since climbed back to 24%.
The New Yorker wrote GPT-5 “is the latest product to suggest that progress on large language models has stalled” and that it had reopened the debate on whether you can scale up LLMs past a certain point.
“Nobody with intellectual integrity can still believe that pure scaling will get us to AGI,” wrote AI skeptic Gary Marcus.
OpenAI also clearly underestimated the personal affection many users have for GPT-4o — as sycophantic and gratingly insincere as it seemed to others.
The upgrade includes a router that dynamically switches between models with different capabilities (GPT-5, GPT-5 mini, GPT-5 nano, thinking mode, etc.). It routes dumb questions to the cheaper models to save costs and flips to the more expensive advanced models for the hard stuff.
This wasn’t working properly when it was released, which may explain some of the poor reception and pretty substandard IQ test results.
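For intuition, here’s a minimal sketch in Python of how such a cost-aware router might work. Everything here is hypothetical: OpenAI hasn’t published how its router works, and the difficulty heuristic, cost figures and capability ceilings below are invented purely for illustration.

```python
# Toy sketch of a cost-aware model router (hypothetical, not OpenAI's code).
# Idea: estimate how hard a prompt is, then dispatch it to the cheapest
# model whose capability ceiling covers that difficulty.

MODELS = [
    # (name, relative cost, difficulty ceiling), sorted cheapest-first
    ("gpt-5-nano", 1, 0.3),
    ("gpt-5-mini", 5, 0.6),
    ("gpt-5", 20, 0.85),
    ("gpt-5-thinking", 100, 1.0),
]

def estimate_difficulty(prompt: str) -> float:
    """Crude heuristic: long prompts and reasoning keywords look 'harder'."""
    score = min(len(prompt) / 2000, 0.5)
    if any(k in prompt.lower() for k in ("prove", "step by step", "debug", "compare")):
        score += 0.4
    return min(score, 1.0)

def route(prompt: str) -> str:
    """Return the cheapest model that can plausibly handle the prompt."""
    difficulty = estimate_difficulty(prompt)
    for name, _cost, ceiling in MODELS:
        if difficulty <= ceiling:
            return name
    return MODELS[-1][0]  # fall back to the strongest model

print(route("What is the capital of France?"))       # -> gpt-5-nano
print(route("Prove step by step that 9.9 > 9.11."))  # -> gpt-5-mini
```

If the difficulty estimator misfires, easy questions get shipped to a model that can’t handle them, which is consistent with the glitchy launch-day behavior described above.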
Altman announced an update mid-week, allowing users to choose between auto, fast and thinking models. GPT-4o is also back for paid users and he promised to give users plenty of notice before it’s retired again.
“We are working on an update to GPT-5’s personality which should feel warmer than the current personality but not as annoying (to most users) as GPT-4o.”
AI experiment shows no good solutions for social media
It is a truth universally acknowledged that social media is a complete shitshow that has ruined entire generations.
The personality traits most linked to success in life, such as conscientiousness, agreeableness and extroversion, have plummeted among the under-50s, while neuroticism has skyrocketed.
Only Boomers have been spared, possibly thanks to their affinity for heartwarming AI slop.
Various solutions have been proposed, including changing the AI-based content algorithms to serve less divisive content, or switching back to a chronological feed (à la Bluesky).
But a new preprint posted to Cornell’s arXiv argues that outrage, negativity and conflict may well be structurally embedded in the very architecture of social media.
The University of Amsterdam researchers behind it used AI personas to simulate online social media behavior, and their experiment suggested that every proposed solution has drawbacks.
Chronological ordering of feeds reduced attention inequality (where the voices of a small number of elite users are amplified) but instead exposed users to more extreme content.
“Bridging algorithms” that surface less divisive content and a greater range of views helped reduce partisanship and increased viewpoint diversity but increased attention inequality. Boosting viewpoint diversity, meanwhile, had no significant impact at all.
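To make the trade-offs concrete, here’s a toy sketch of the three ranking approaches. The Post attributes and ranking rules are invented for illustration; this is not the researchers’ actual simulation code.

```python
from dataclasses import dataclass

# Toy illustration of the feed-ranking interventions discussed above
# (illustrative only, not the University of Amsterdam simulation code).

@dataclass
class Post:
    author_followers: int  # proxy for "elite" accounts
    outrage: float         # 0..1, how divisive the post is
    timestamp: float       # posting time, higher = newer

def engagement_rank(posts: list[Post]) -> list[Post]:
    """Baseline: outrage from big accounts wins the feed."""
    return sorted(posts, key=lambda p: p.outrage * p.author_followers, reverse=True)

def chronological_rank(posts: list[Post]) -> list[Post]:
    """Levels attention inequality, but extreme posts are no longer buried."""
    return sorted(posts, key=lambda p: p.timestamp, reverse=True)

def bridging_rank(posts: list[Post]) -> list[Post]:
    """Surfaces the least divisive content first; partisan posts sink."""
    return sorted(posts, key=lambda p: p.outrage)
```

Each ranking optimizes away one pathology while leaving another untouched, which is roughly the structural bind the preprint describes.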
The AI algo also hates your friends
Meta has argued in court that the Federal Trade Commission can’t prove it has a monopoly on the personal social networking market, because Facebook’s and Instagram’s AI-powered algorithms rarely show you content from your friends anymore.
On Instagram, just 7% of time is spent looking at content from friends, while on Facebook, it’s 17%.
Meta said the rise of TikTok had forced it to develop the new AI-powered algorithm.
Robots can fold towels now
Figure’s Helix robot has become the first humanoid robot able to fold towels. This may not sound very impressive, as any idiot human can fold a towel, but for robots it has long been a seemingly insurmountable problem called “deformable object manipulation.” It’s tricky because the object keeps changing shape, which breaks the system’s internal models.
For the first time, a humanoid robot can fold laundry using a neural net
We made no changes to the Helix architecture, only new data. pic.twitter.com/lSHX4Uc1qA
— Brett Adcock (@adcock_brett) August 12, 2025
ChatGPT dietary advice sends man insane
A man who asked ChatGPT for advice on how to cut salt out of his diet poisoned himself with its recommendation and ended up in a mental hospital.
Asked for a salt replacement, ChatGPT suggested sodium bromide, which is suitable for replacing salt in cleaning products, but definitely not for humans to ingest.
He ended up with a 19th-century malady called bromism, which caused paranoia, hallucinations, chronic thirst and a nasty rash.
Is this the funniest thing ever produced by AI?
Back in May 2024, Anthropic researchers released a paper detailing how millions of concepts activate inside Claude when it reads relevant text or images. One such cluster of features involved the Golden Gate Bridge, and they discovered they could amplify its effect.
They released Golden Gate Claude, a model that interprets everything through its relationship to the Golden Gate Bridge.
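The underlying trick is often described as activation steering: nudge the model’s internal activations along a feature direction so that one concept dominates everything it says. Here’s a toy NumPy sketch; the random “feature direction” stands in for the real one, which Anthropic extracted with a sparse autoencoder, so treat this as an illustration of the idea rather than their actual method.

```python
import numpy as np

# Toy sketch of activation steering (illustration only, not Anthropic's code).
# Anthropic found a direction in Claude's activations corresponding to the
# "Golden Gate Bridge" feature and cranked it up; every reply then drifted
# toward the bridge. Here we fake the setup with random vectors.

d_model = 512
rng = np.random.default_rng(0)

# Pretend this is the learned "Golden Gate Bridge" feature direction.
feature_direction = rng.standard_normal(d_model)
feature_direction /= np.linalg.norm(feature_direction)

def steer(activations: np.ndarray, strength: float = 10.0) -> np.ndarray:
    """Add the feature direction to every token's activation vector."""
    return activations + strength * feature_direction

hidden_states = rng.standard_normal((16, d_model))  # 16 tokens of fake layer output
steered = steer(hidden_states)

# The steered activations now project strongly onto the bridge feature.
print((steered @ feature_direction).round(1))  # uniformly large positive values
```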
It led to this incredibly funny piece of writing, exhumed on X this week:
Amazon’s fake books problem
Author Caitlyn Lynch noted recently that just 19 of the top 100 young adult romance ebooks on Amazon were legitimate books; the rest were AI slop.
It appears that grifters have been mass-uploading AI-generated books and farming clicks on them with bots to generate per-click royalties from Amazon Kindle Unlimited (which is like Spotify for books).
There’s also a cottage industry of people generating AI books for sale. Tommi Pedruzzi, 27, claims to have published 1,500 books on Amazon and made $3 million. He says AI does 70% of the work, with the rest consisting of fact-checking, formatting and editing.
Vitalik Buterin endorses AI doom book
Ethereum creator Vitalik Buterin has endorsed a book co-written by well-known AI alignment expert and AI doomer Eliezer Yudkowsky.
In the book, “If Anyone Builds It, Everyone Dies,” Yudkowsky and co-author Nate Soares argue that a superhuman intelligence would develop its own goals in conflict with ours, and that the AI would inevitably win by outcompeting our attempts to stop it.
Buterin, who has proposed d/acc (defensive accelerationism), wrote:
“A good book, worth reading to understand the basic case for why many people, even those who are generally very enthusiastic about speeding up technological progress, consider superintelligent AI uniquely risky.”
Ignore previous instructions: GPT-5 is good actually
While it may not be a giant leap forward, there are still plenty of improvements in GPT-5. There are fewer factual errors and hallucinations, it reasons better, it’s much less of a kiss-ass, it’s less prone to jailbreaks, and the benchmarks have improved.
Contrary to initial reports, GPT-5-Pro actually scored 148 points on the Mensa Norway test, GPT-5 scored 120 and Pro (Vision) scored 138.
New research also shows that GPT-5 outperforms human doctors on medical reasoning (scoring 24.23% higher) and medical understanding (29.4% higher).
Professor Derya Unutmaz from the Jackson Laboratory for Genomic Medicine commented:
“At this point failing to use these AI models in diagnosis and treatment, when they could clearly improve patient outcomes, may soon be regarded as a form of medical malpractice.”
Unutmaz conducted his own month-long experiment with the GPT-5 thinking model to develop engineered cells against lymphomas, and said the results were “nothing short of staggering.”
The model was able to accurately predict experimental results and suggest refinements, allowing them to simulate months of lab work in advance, which compressed “the scientific timeline from years to weeks!”
“This changes everything about how science is done, ushering in an era where discovery moves at the speed of thought!”
Darren Aronofsky wants you to help him make AI films
“Black Swan” and “Requiem for a Dream” director Darren Aronofsky is advertising for “extraordinary AI artists” to join his new Primordial Soup generative AI studio in New York.
Working with Google DeepMind, artists will use AI tools to create worlds, experimental sequences and effects.
Generative AI looks like it could be good for writers and directors, who can create whatever they like without convincing studio bosses to stump up a $400-million budget. It looks considerably less good for crew members, VFX artists, set designers and many actors.
As a small-scale example, fintech company Slash created this ad on Veo, inspired by Margot Robbie’s appearance in “The Big Short,” for just $235.60.
A lot of people have been asking me how our Global USD account actually works.
So here's Margot Sloppy in a bubble bath to explain: https://t.co/aOeFYMa9XN pic.twitter.com/Rr9Ro5Y2D5
— Victor Cardenas Codriansky (@victorcardenas) August 8, 2025
All Killer No Filler AI News
— DeepSeek R2 is rumoured to be coming out in the next two weeks. However, other reports suggest DeepSeek is suffering from a lack of chips due to US bans, which could delay the release due to an inability to scale up to meet demand.
— Perplexity AI is offering $34.5 billion to buy Alphabet’s Chrome browser… which is an interesting move as Perplexity is only worth $14 billion. The move is seen as a way to get a wide release for the features of its new Comet browser.
— There’s a lot of talk about people suffering from psychosis as a result of AI affirming their delusions, but rarely is the process caught on video. Here’s a glimpse of how AIs can reinforce existing and (possibly) wrong beliefs.
A woman on TikTok has posted a 24-part series about falling in love with her psychiatrist, claiming he manipulated her into it. She makes absurd claims in her videos and heavily consults AI, which always affirms her delusions.
I think the bigger concern is that AI turns people… pic.twitter.com/kRgQnHEtLK
— Ana Mostarac (@anammostarac) August 10, 2025
— Google is working on a fix after Gemini started getting stuck in self-loathing loops when unable to solve problems. The bot would say things like “I am a failure, I am a disgrace”… “the code is cursed” and would recommend that users find “a more competent assistant.”
— Scientists in Japan used AI to create a new underwater glue, modelled on how barnacles stick to rocks, that’s 10 times more effective than existing adhesives.
— Claude Pro has usage limits that reset every five hours, so one coder has been alternating between sleeping for three hours and using up all his tokens for three hours, over and over.
— Reddit is widely acknowledged as the premier source of unbiased and rigorously fact-checked knowledge on the web [we hope you feel the sarcasm here-Ed], so it’s great to see it has become the number one source of information cited by large language models. Wikipedia is a long way behind in second place.
Andrew Fenton