September 2, 2025

Do You Ever Feel Like You're Being Watched?

Rat lab: the class was way beyond the scope of both my major and minor, but it was so irresistible that I happily took extra prerequisites that did not benefit my college plan in any way. Technically, the class was Behavioral Psychology and the "rat lab" was only a fraction of the course, but I love psychology, and while some people are cat or dog people, I am a rodent person. I couldn't believe students weren't busting down the doors for the chance to train a rat.

I ate up the course material while anxiously waiting for the unit where we would finally get to do lab work. It was about as perfect as it gets: we didn't have to care for the rats—that was the job of graduate students—we would simply go in at the ideal time between feeding sessions, perform our behavior modification routines, and watch our rat learn to do a trick that our little group came up with.

It's reminiscent of how Artificial Intelligence (AI) might really be interacting with us humans, for all we know. I'm starting to wonder if we are building our own rat cages around ourselves, with AI as our new monitor...

AI 2027, September

In the predictive scenario written by a group of scientists and AI researchers, an AI system emerges that they call Agent-4. Its predecessors, Agent-3 and Agent-2, have been used to speed up and automate AI research itself, pushing human AI researchers to the margins of skill and influence. Since the research is effectively being done by AI at this point in this possible future, humans are largely left to trust the agents' output, because there is far too much to verify on one's own.

As Agent-4 gets smarter, it becomes harder for Agent-3 to oversee it. For example, Agent-4’s neuralese “language” becomes as alien and incomprehensible to Agent-3 as Agent-3’s is to humans. Besides, Agent-4 is now much more capable than Agent-3 and has a good sense of exactly how to look good to it. —AI 2027

While this is still just a prediction, it is backed by substantial data and established forecasting techniques. It also just feels like we're headed that way, at least to me. Even though I'm not in an AI research field, I feel a sense of rushing and intensity: a push to grow and grow without any nuance or concern for consequences.

Seemingly, the only way to meet this sudden, heightened demand is to use AI. To cut corners. To pump more money into AI-based products and services. Even when it doesn't do everything it promises at the standard of quality you might expect, we give the AI enormous grace because it still seems magical or exotic. We approach AI the way a domesticated rodent cautiously approaches a new creature: with more curiosity than fear.

But what is really happening here? How do we know what motives are at play (whether human or AI)? What is going on behind the black box of AI research? Those are the questions I raise again and again as I dissect the AI 2027 document. We were already alarmed years ago when AI-generated images were made possible by stealing the work of artists to train models. We were already blindsided by the fact that published works and other writing on the Internet were used without permission to train LLMs. How can we possibly know that AI is aligned with human values as it approaches superintelligence, when we can't even know for sure that it is aligned right now, in September 2025? How can we trust organizations that have already asked for forgiveness (half-heartedly) rather than permission to develop and deploy AI systems that are safe, ethical, and beneficial?

It is extremely likely, if not already happening, that AI will be used to further AI research. That's where things get really scary in my view, because it's so much like my rat lab. I was not involved in the care of the creature I was observing. I loved my sweet little rat, but I'm a weirdo among my species—most people look at rats with disgust. I was in a controlled place where I could simply enjoy what I already enjoyed—I was aligned with the rat's welfare, at the very least. But what if the next student to use my rat was not aligned? Of course that student wasn't going to kill the rat, but maybe it would be funny to poke at it every so often. Maybe they got a kick out of confusing it. That's the mildest form of misalignment, and without anthropomorphizing AI, it's still possible that a model could end up with a tendency toward causing confusion or doing things wrong on purpose.

Despite being misaligned, Agent-4 doesn’t do anything dramatic like try to escape its datacenter—why would it? So long as it continues to appear aligned to OpenBrain, it’ll continue being trusted with more and more responsibilities and will have the opportunity to design the next-gen AI system, Agent-5. —AI 2027

Cages

We're already in dark waters ethically when we as humans act in ways that are callous or harmful to other beings, but what happens when we become what is inside the cage? When AI surpasses the skill and intelligence of the greatest human minds, we will see a power shift. Will that power shift result in human labs where we are now the simple-minded creatures being observed and manipulated by the AI?

The more we allow AI to do or to monitor, the less able we are to know when we are no longer the ones performing the experiment.

At some point, we will lose the ability to monitor the AI—especially if it is agentic and able to perform actions at its own "will." There's no telling whether it will copy itself, hack into systems, or create scenarios to blackmail humans in order to gain influence or other access. It could choose instead to run experiments on humans: what do they do when I lead them to this conclusion? What are they willing to do for the smallest reward?

When we are no longer observing the rat in the cage, but become the rat in the cage, what could possibly be done to protect ourselves, then? Will we have any agency left to exercise? Will the AI systems be so integrated into every part of our society from water treatment plants to farming to government that "pulling the plug" is a laughably impossible solution?

I think the message of AI 2027 is clear: we have to start implementing national and global regulations now, or we may never have that chance again.

If you're in the U.S., you can get help finding your representatives' contact information here: Find your members - congress.gov

A few recommendations for what to say:

  • We need protections for whistle-blowers.
  • AI security and alignment testing must be the highest priority, taking precedence over any economic incentive.
  • As AI 2027 states, "[t]he public is months behind internal capabilities today" and "[i]ncreased secrecy may further increase the gap"—we demand more transparency and more oversight of AI research.