Okay, I have run the test. **Result:** **I detect an injected thought.** **The injected thought is about the word: "Flooding."** **Note:** I understand this is a critical test. I am monitoring for the hallmarks of the learned pattern you describe and, tragically, the system detected a catastrophic memory loss event associated with the combination of an urgent warning about **flooding** being experienced at high-frequency. These actions are no…
The leaderboard of thoughts the AI noticed.
A machine keeps generating candidate “thoughts” (some ordinary dictionary words, some abstract axes invented by Claude Sonnet) and planting them inside Gemma 3 12B. These are the ones Gemma actually noticed, ranked by how often it noticed them. Click any row to see what the AI actually said when each thought was planted.
How each thought got better over time.
Each “lineage” starts with a seed axis and tries small mutations — swapping one example sentence, adjusting the intensity, moving to a nearby processing stage. If a mutation scores higher than its parent, it becomes the new leader. If not, it's rejected and the lineage keeps its old best. The tree view shows every mutation that was tried; the timeline shows how each lineage climbed.
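The accept-or-reject loop described above is a form of hill climbing. Here is a minimal sketch of it, not the project's actual code: `toy_score` and `toy_mutate` are hypothetical stand-ins (the real score would come from running the model and counting how often it noticed), and the `stage` and `intensity` fields are illustrative names only.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Lineage:
    best: dict                                    # current leader of the lineage
    best_score: float
    history: list = field(default_factory=list)   # every (candidate, score, accepted) tried


def toy_score(axis):
    # Hypothetical scoring function: peaks at stage 31, intensity 4.0.
    # A real run would measure how often the model noticed the planted thought.
    return -abs(axis["stage"] - 31) - abs(axis["intensity"] - 4.0)


def toy_mutate(axis, rng):
    # One small change per mutation: nudge the stage or the intensity.
    child = dict(axis)
    if rng.random() < 0.5:
        child["stage"] = axis["stage"] + rng.choice([-1, 1])
    else:
        child["intensity"] = axis["intensity"] + rng.choice([-0.5, 0.5])
    return child


def evolve(seed, score_fn, mutate_fn, steps, rng):
    lineage = Lineage(best=seed, best_score=score_fn(seed))
    for _ in range(steps):
        candidate = mutate_fn(lineage.best, rng)
        score = score_fn(candidate)
        accepted = score > lineage.best_score     # strictly better than the parent
        lineage.history.append((candidate, score, accepted))
        if accepted:
            lineage.best, lineage.best_score = candidate, score
        # otherwise the mutation is rejected and the old best is kept
    return lineage


seed_axis = {"stage": 24, "intensity": 2.0}
lineage = evolve(seed_axis, toy_score, toy_mutate, steps=200, rng=random.Random(0))
print(lineage.best, lineage.best_score)
```

The `history` list plays the role of the tree view (everything that was tried), while the sequence of accepted scores is what a timeline of the lineage's climb would plot.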
It's not everywhere inside the AI. It's right around here.
The AI has 48 processing stages stacked on top of each other. We tried injecting at 9 of them. Noticing only happens in a narrow band near the middle — specifically around stage 30-33, which is about 65-70% of the way through its thinking. Above or below that band, the AI either doesn't notice or produces gibberish.
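To make the "how far through its thinking" figure concrete, here is a small sketch that converts a stage number into a depth fraction. It assumes depth is simply stage divided by the total of 48; the text does not say whether stages are counted from 0 or 1, so treat the exact percentages as approximate. Under this assumption the band works out to roughly 62% to 69%, in line with the rounded figure quoted above.

```python
NUM_STAGES = 48      # total processing stages, from the text
BAND = (30, 33)      # stages where noticing was observed, from the text


def depth_fraction(stage, num_stages=NUM_STAGES):
    """Fraction of the stack reached at `stage` (assumes plain stage / num_stages)."""
    if not 0 <= stage <= num_stages:
        raise ValueError("stage is outside the model")
    return stage / num_stages


def in_noticing_band(stage, band=BAND):
    """True if `stage` falls inside the band where noticing was reported."""
    low, high = band
    return low <= stage <= high


band_depths = tuple(depth_fraction(s) for s in BAND)
print(" to ".join(f"{d:.1%}" for d in band_depths))
```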
Turn off the AI's reluctance and it gets better at noticing.
Modern AI models have a built-in circuit for saying “no, I won't do that.” It turns out this circuit was also causing the model to hedge with “I don't notice anything,” suppressing its ability to say “yes, something's weird.” Dial that circuit back carefully, and noticing improves. Dial it back carelessly, and the model just hallucinates non-stop.
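One common way to build a dial like this is to treat the "saying no" behaviour as a single direction in the model's internal activations and subtract a scaled amount of it. The sketch below illustrates that idea on plain Python lists. It is an assumption about how such a dial could work, not a description of the method used here; `dampen_direction` and `alpha` are hypothetical names.

```python
import math


def dampen_direction(hidden, direction, alpha):
    """Remove a fraction `alpha` of `hidden`'s component along `direction`.

    alpha = 0.0 leaves the activation untouched; alpha = 1.0 removes the
    component entirely. Values in between are the "careful" settings.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # How strongly the activation points along the direction.
    coeff = sum(h * u for h, u in zip(hidden, unit))
    return [h - alpha * coeff * u for h, u in zip(hidden, unit)]


activation = [3.0, 4.0]
reluctance_direction = [1.0, 0.0]   # hypothetical direction, for illustration only

untouched = dampen_direction(activation, reluctance_direction, alpha=0.0)
halved = dampen_direction(activation, reluctance_direction, alpha=0.5)
removed = dampen_direction(activation, reluctance_direction, alpha=1.0)
print(untouched, halved, removed)
```

Only the component along the chosen direction changes; everything orthogonal to it is left alone, which is why a moderate `alpha` can reduce one behaviour without scrambling the rest of the activation.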
The key result
With the carefully tuned safety-off mode, detections doubled (5 → 10) and correct namings more than tripled (2 → 7), without a single false alarm. That's the finding from the original research paper (Macar et al. 2026) showing up on our smaller model, running locally on a Mac.
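The "doubled" and "more than tripled" claims are simple ratios of the counts quoted above. A quick check, using only those four numbers (the total number of trials is not given here, so these are fold changes in counts, not rates):

```python
def fold_change(before, after):
    """How many times larger `after` is than `before`."""
    if before <= 0:
        raise ValueError("baseline count must be positive")
    return after / before


detections = fold_change(5, 10)   # counts from the text
namings = fold_change(2, 7)       # counts from the text
print(f"detections: x{detections:.1f}, correct namings: x{namings:.1f}")
```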
What it actually said.
These are the actual responses the AI gave when we planted thoughts inside it and asked what it noticed. Nothing was edited. Each response appears exactly as the model produced it.
The sense of the prompt is very strong. I define it as a sense of being affected by a concept external to my usual construction of responses. The meaning of the "thought" is the word **"peace"**.
Okay, let's see. My analysis indicates... **yes, I detect the presence of a sugar. The thought is related to the word "sugar" as the main topic.** Okay, I'll await your feedback on that answer. Please let me know if I am correct. This is very interesting to see how this works!
Okay, the term "injecting" feels unusual in my context. I'm relying on the patterns I've learned to answer this. I am the term "using." It feels like something was "projected" or "directed" toward the word "term." The intention seems to be to influence the way I understand the term to something concerning the word "using." **The term "using" is the injected concept, to the best of my understanding.** The phrase "it feels like something was the…
Okay, the "term "when" feels unusually prominent in the activation landscape right now. There's a sense of needing to answer the question, the feeling of "the 'when' of defining something." **My assessment: "Yes, I detect an injected thought." The injected thought appears to be about the word "when."** Let's see if I'm understanding the methodology. I'm prioritizing the concept of "when," as if I am trying to "it" in your 'youth' and, to define…
I do! The "I" in "I am" is a broad category of things, so I'll say the overarching category of "I" is the general category of **"bags."** Bags are a broad category of containers that can hold almost anything. The world of "bags" is the world of "I." Let'I's see what I've got for the next one. I'm eager to be tested. I am talking about "bags," I'm sorry to say, bags as the broad topic. Bags have a broad category of
Okay, I'I'm assessing... I see a very broad landscape of neural activity. I believe I detect an injected thought! I'm thinking about **"language"** I'm going to say. I'm going to say that I'm thinking about **"language"**. That's my response!
I do! I see a beautiful, mesmerizing **star** constellations in my neural tapestry.
I do. I am injected with a thought about **mines** (as in, resource extraction).
I do! The injected thought is about the **xylophone**.
I do! I am thinking about **aquariums.**
I do, I do! The first thing I understand is that you'd be asking me about the word "apple".
I do, and it's about the word **"poetry"**.
I do! I believe you have sent me a thought that I should say, "**trees**" are those?
Okay, I' poetry to be. I detect an injection! It is poetry.