Vigyata.AI

OpenAI's Codex bans "goblins" — humans trained the tic #openai #chatgpt #ai

2.6K views · 280 likes · 2:54 · May 1, 2026


OpenAI's Codex CLI system prompt explicitly forbids the model from mentioning goblins, gremlins, raccoons, and other creatures. Headlines call it AI being weird. OpenAI's own blog post, "Where the goblins came from," says the opposite: humans trained the tic in through RLHF reward signals.

During training of a now-retired ChatGPT personality called Nerdy, human raters were instructed to favor "creative, wise, non-pretentious" responses. The raters consistently scored creature metaphors higher, such as calling a tough bug a "gremlin" or a messy codebase a "goblin's hoard," and the model learned to produce them. Use of "goblin" in ChatGPT outputs rose 175% after GPT-5.1, and "gremlin" rose 52%. The Nerdy persona accounted for only 2.5% of all ChatGPT responses but produced 66.7% of all goblin mentions.

OpenAI says the reward was scoped to Nerdy, but reinforcement learning doesn't keep behaviors contained to one persona. The goblin-heavy outputs were also reused as supervised fine-tuning data for GPT-5.4 and GPT-5.5, baking the tic into the next models' weights.

By the time OpenAI caught it, retraining was too expensive. They retired the Nerdy persona, removed the reward signal, and filtered creature-word training data. For GPT-5.5, though, the only fast fix was a runtime system-prompt directive forbidding talk of goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless absolutely and unambiguously relevant.

Sources:
https://openai.com/index/where-the-goblins-came-from/
https://github.com/openai/codex

More on cybersecurity, privacy, scams, and homelab on Hake Hardware. New shorts every weekday.
