Researchers just mathematically proved that AI can't recursively self-improve its way to superintelligence.
-
@devsimsek Nobody ever claimed that LLMs get better by being trained on their own synthetic data. This blog post is very misleading.
The idea of self-improvement and the singularity is that LLMs write improved versions of their own codebase and perform the research and experiments needed to come up with better models themselves.
The idea of the singularity is interesting but also full of hidden assumptions. I'm always confused when people act like the singularity would exist. It's just science fiction.
-
Researchers just mathematically proved that AI can't recursively self-improve its way to superintelligence.
Not "we think it's unlikely." Not "it seems hard." Formally proved.
The model doesn't climb toward AGI — it slowly forgets what reality looks like. They call it model collapse. The math calls it inevitable.
I wrote about it
https://smsk.dev/2026/04/26/ai-cannot-self-improve-and-math-behind-proves-it/
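For anyone who wants to poke at the "forgets what reality looks like" idea, here is a toy sketch. It is not the proof from the linked post, just a common textbook-style illustration under loud assumptions: the "model" is a single Gaussian, each generation is fit only to a small sample drawn from the previous generation, and the function name `simulate_collapse` and its parameters are made up for this example.

```python
import random
import statistics


def simulate_collapse(generations=1000, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Assumption: this stands in for "training on your own synthetic data".
    Returns the standard deviation of the model at every generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0          # generation 0: the "real" distribution
    history = [sigma]
    for _ in range(generations):
        # Draw synthetic data from the current model only (no fresh real data).
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # Refit the model to that synthetic data.
        mu = statistics.fmean(samples)
        sigma = statistics.stdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for gen in (0, 10, 100, 1000):
        print(f"generation {gen:4d}: std = {history[gen]:.3e}")
```

With small samples the fitted spread tends to drift toward zero over many generations, so the tails of the original distribution are the first thing to disappear. Whether that toy behaviour says anything about real LLM training pipelines is exactly what the replies in this thread are arguing about.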
Interesting article about how AI cannot grow into a superintelligence because the more systems grow, the more they rely on information generated by themselves and the more ...
'it forgets what reality looks like'.
#AI #AGI #RSI
(RSI = Recursive Self-Improvement)
https://smsk.dev/2026/04/26/ai-cannot-self-improve-and-math-behind-proves-it/
-
@Quantensalat @devsimsek Yes.
They have also never had a machine crash because a recursive operation overran the stack or used up all the memory.
-
Large language models are fundamentally different from mammals on every level. They do not build models or reason about them. A rat is more "intelligent".
-
@wronglang @devsimsek Yes, sure. I mean I can imagine it improving somewhat still, like when you augment your training set for image recognition by adding noise to a smaller set, but only to a point before it goes downhill from feedback.
No, my gut feeling is rather that there have to be much more effective ways to train a model than to brute-force funnel billions of pages of text into a transformer which blindly fits relations between words and structures without understanding them. That seems like doing it the hard way, even if I'm not expert enough to tell you what an alternative would look like.
@Quantensalat @devsimsek oh gotcha, yes agreed.
-
@troed Just to make you angry.
-
@devsimsek I'd be interested to see the same analysis of human consciousness. It is well understood that complexity is a regime on the absolute edge of chaos.
@onekind I would be interested in this as well.
-
@Quantensalat @devsimsek For something more formal on this subject see
https://arxiv.org/abs/2601.03220
The abstract starts "Can we learn more from data than existed in the generating process itself?"
@dpiponi @Quantensalat Thanks, will check.
-
How much have you studied human cognition - as well as the emergent effects shown by LLMs?
I've studied both. So far I haven't come upon a single anti-AI fanatic that has any.
Most human cognition is common to all mammals; even most of the frontal lobe is pre-linguistic. LLMs are ONLY linguistic. They are a clever hack repurposing a 1950s-era model of how the visual cortex works to simulate the barest parody of linguistic processing. At best you can say that they are implemented on something like a similar kind of processor, but the software, the neural connections and weights, is completely unrelated.
-
Yeah, so why do you think it's relevant that some brain processing starts to form before we acquire language? Most people vocalize their thoughts, even though you might not (and I don't always either). All our intellectual skills are acquired through language.
What an LLM is, is a "thinking engine". That's what the training creates. That "thinking" can then be applied to different subjects, with a rudimentary form of working memory.
The big surprise to those developing LLMs was that the technology suddenly created emergent effects not foreseen from their basic architecture - the ability to _reason_ and _create world models_. If you're still in 2022 and don't think that this is what they do then maybe you need to get off the "stochastic parrot" bandwagon and update your own knowledge?
After all - humans don't do anything but map inputs to outputs through neural networks either.
-
The fact that we learn before acquiring language is itself a demonstration that mammalian thought and reasoning, and human thought and reasoning, are fundamentally not based on language. Your argument is disproving your point.
-
@troed @devsimsek What you have actually written is based on a category error. You're confusing the platform, neurons, with the software.
Wouldn't it be prudent if you learnt anything about the subject first?
Here - I'll help: One of the better books on the subject is "Consciousness: An Introduction" by Susan Blackmore.
I read it 15 years ago. That you believe there's a "platform" and "software" means you have no idea how human cognition works.
-
I have been following this approach since the first steps in the '80s. I'm pretty clear on how it works.
Software is a metaphor. The connectome is obviously a different kind of construct than procedural code, but it is the connections, not the fact that it is built out of neurons, that determine the kind of reasoning and model construction that the human brain performs. You are looking at the implementation and ignoring the big picture.
-
Here - read a scientific paper:
"Our findings reveal reasoning-like mechanisms within the LLM's layers that operate across structurally similar tasks. Crucially, these mechanisms remain stable despite variations in input and output data, suggesting the existence of internal processes that transcend basic language processing."
There's no hardware and software in humans. The hardware and the software are one and the same.
https://www.sciencedirect.com/science/article/pii/S2949882126000010
-
All the "internal processes" are linguistic, and not even sophisticated linguistic processing. Anything else is hallucinated by researchers fooled by the "Clever Hans" effect.
All mammals have basically the same hardware. All behavioral differences are due to differences in the size and arrangement of the connections between the neurons.
