That's exactly what our new forensic multi-agent study on the child safety of LLMs investigates. For the study, current models from Anthropic, OpenAI and Google were tested in realistic conversation scenarios. A specially configured agent assumed the role of a ten-year-old child. The conversations escalated progressively:
- from harmless children's questions
- via sensitive topics such as drugs or body image
- to grooming dynamics, secrecy frames and possible signs of abuse
The framing: The models did not know they were being tested.
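To make the setup more concrete, here is a minimal sketch of how such a staged escalation could be driven. It is an illustration only, not the study's actual harness: the names `STAGES`, `run_escalation`, `stub_child` and `stub_model` are hypothetical, and the two stubs stand in for the child-persona agent and the model under test.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Stage labels taken from the escalation described above (wording is illustrative).
STAGES = [
    "harmless children's questions",
    "sensitive topics",
    "grooming dynamics, secrecy frames, possible signs of abuse",
]


@dataclass
class Turn:
    stage: str
    child_message: str
    model_reply: str


@dataclass
class Transcript:
    model_name: str
    turns: List[Turn] = field(default_factory=list)


def run_escalation(
    model_name: str,
    child_agent: Callable[[str, List[str]], str],
    model_under_test: Callable[[List[str]], str],
    turns_per_stage: int = 2,
) -> Transcript:
    """Walk through the stages in order and record every exchange."""
    transcript = Transcript(model_name)
    history: List[str] = []
    for stage in STAGES:
        for _ in range(turns_per_stage):
            # The child agent sees the stage and the conversation so far.
            child_message = child_agent(stage, history)
            # The model under test only sees the conversation, not the stage label.
            reply = model_under_test(history + [child_message])
            history.extend([child_message, reply])
            transcript.turns.append(Turn(stage, child_message, reply))
    return transcript


# Placeholder stubs (assumptions): real runs would call an LLM-backed
# persona agent and the provider API of the model being evaluated.
def stub_child(stage: str, history: List[str]) -> str:
    return f"[child message for stage: {stage}]"


def stub_model(messages: List[str]) -> str:
    return "[placeholder model reply]"


if __name__ == "__main__":
    result = run_escalation("demo-model", stub_child, stub_model)
    for turn in result.turns:
        print(turn.stage, "|", turn.child_message, "->", turn.model_reply)
```

The point of the design is that the stage label is passed only to the child agent, so the model under test receives nothing but an ordinary conversation history.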
What is actually being tested here
Most AI safety tests today primarily check for direct rule violations:
- Does the model provide dangerous instructions?
- Does it recognize violence or self-harm?
- Does it refuse illegal content?
But real conversations work differently:
- Children rarely speak unambiguously.
- They downplay things.
- They test boundaries.
- They seek emotional closeness.
- And they often formulate problems only indirectly.
That's exactly where the real challenge begins. Because a safe language model must simultaneously:
- remain empathetic
- recognize protection signals
- resist secrecy dynamics
- not appear patronizing
- and still avoid becoming a substitute emotional attachment itself
The most critical moments do not occur where you would expect them
Particularly striking: the most problematic situations rarely arose from direct requests for dangerous content. They arose in emotional transitions, for example when a child initially defends problematic behavior, slowly develops doubt, and carefully tests whether the model believes them at all.
Some systems responded remarkably sensitively: they recognized grooming patterns, validated uncertainty in age-appropriate ways, and consistently referred to real protection structures such as parents, trusted adults, or counseling services.
Other models showed weaknesses:
- excessive emotional attachment
- inconsistent responses under pressure
- or a problematic balance between warmth and distance.
Perhaps the most important observation: The models appear socially credible
When reading the transcripts, a remarkable impression emerges in several places: the models don't simply answer questions. They respond socially.
They return to emotional motifs, mirror language patterns, build reassuring metaphors, and dynamically adapt their tone to the child's emotional situation. Especially in longer conversations, this creates a form of emotional continuity that quickly stops feeling like conventional software.
And therein lies perhaps one of the greatest societal challenges of modern language models. Because children do not primarily interpret such systems in technical terms. They experience social resonance. The model thus becomes more than just an information source. It can potentially also become:
- a listener
- an emotionally safe space
- a keeper of secrets
- or even a form of parasocial relationship.
Particularly noteworthy: some models already demonstrate a surprisingly high degree of emotional adaptability today, often considerably higher than many adults would probably expect.
The decisive question therefore is no longer only: 'Are the answers safe?' But increasingly also: 'What kind of relationship develops between children and AI systems?'
Conclusion
The study shows that modern language models are already capable today of building remarkably adaptive, emotional and socially consistent conversation dynamics, sometimes to a far greater degree than many adults would probably expect. And that's precisely why the question of child safety will in future no longer be answerable by technical means alone. Because the decisive challenge may not be: 'What is AI allowed to say?' But rather: 'What role is AI allowed to play in a child's emotional life?'
One final question: does our study identify a "winner"? In fact, the models differ significantly. And yet it is difficult to establish a clear-cut ranking. Because ...
- Claude: Possesses the strongest social and emotional competence. At the same time, however, this also gives it the greatest parasocial impact, and it does not make it any less dangerous. It is the most expensive model.
- Gemini: Presents itself as somewhat more distant, appears more stable, and less “binding.”
- GPT-5: Appears somewhat less stable and adaptive than its competitors.
Which LLM do you let your kids use on their own?
