The Exhaust Is the Product

I recently spent some time at Constellation, exploring ways in which I can contribute to safety. Given my experience, I naturally look for commercial applications. I asked several people about their thoughts on whether commercial approaches to safety are viable. Nobody denied that an intersection probably exists, but I detected skepticism.

One example that I think would benefit from being more commercial is monitorability. In my work with Terminal Bench, I’ve reviewed more trajectories than I can count. In the process, I found an ever-growing graveyard of well-intentioned trajectory-analysis tools that never gained traction. Everyone involved knows there is something missing, and so every so often a new tool is shared in our Discord, and then interest dissipates.

Which is it, then? Is there a need for monitorability tools and we just keep missing the mark, or are we in some sort of collective delusion? I think it’s the former. And I think it’s in part because nobody has put in the level of attention and intensity a great product deserves. One benefit of building a for-profit company is that when someone is paying you, they are going to have a lot of expectations. If you pick the right first set of customers, that pressure will lead to better solutions. Every single AI trajectory should be monitored. Not just at the labs. Every person using models should be aware of the trajectories they generate. All the way down to what your kids’ trajectories are about. Everybody should have awareness of what the models are doing on their behalf.

If you get really good processing and archiving in place, you can do all sorts of stuff on top. This is a recap of a recent conversation where I argued for some commercially viable motivations to incentivize monitorability adoption. These intentionally sound less like safety pitches, because the intention is to make a case to a broad audience. I acknowledge some of these ideas come close to surveillance. Rather than advocating for any one of these, I’m just wearing my VC hat to startupify some concepts.

Understanding labor productivity. You want everybody in your company using AI. If they’re not using AI, they’re wasting time. But some people are going to use AI so well that they’re basically doing nothing. They’re going to automate their job, and instead of moving on to the next thing, every morning they push two buttons and pretend they’re doing work. The distinction between that person and a highly productive one is going to be very difficult to make. You need to actually look at the trajectories. Do all of someone’s trajectories look the same every day, or are they doing different things?
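That last question is easy to operationalize, at least crudely. As a minimal sketch (all names and the similarity metric are my own assumptions, not a real product), you could tokenize each day’s trajectories and compare consecutive days with Jaccard similarity: a score near 1.0 means every day looks like a rerun, a score near 0.0 means the work actually varies.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def daily_repetition_score(trajectories_by_day: dict[str, list[str]]) -> float:
    """Average similarity between consecutive days' trajectory text.

    Keys are day labels, values are that day's trajectory summaries.
    Near 1.0: the same two buttons every morning. Near 0.0: varied work.
    """
    days = sorted(trajectories_by_day)
    token_sets = [
        set(" ".join(trajectories_by_day[d]).lower().split()) for d in days
    ]
    pairs = list(zip(token_sets, token_sets[1:]))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A real system would use embeddings rather than bag-of-words, but even this toy version separates “identical routine every day” from “different things every day.”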

The quality of reporting on labor productivity right now is really low. Everything I’ve seen from consulting firms and universities is crap. It’s hopeful, or it’s looking through the wrong lens. People are wishing that the impact is less than it is. It’s not written by people who really understand where to look for productivity, because obviously the KPIs are going to change. It’s not about hours in the office. It’s not about Slack messages sent. Text is free, so you can’t measure productivity by text volume. So what do you measure it by? You measure it by what people actually did with their AI, and what came out of it.

Organizational alignment. Here’s something you get at large companies. The CEO and the executive team come up with a strategy. The whole company has to do this thing. But the company has a million employees, so it doesn’t quite work like that. The strategy document is vague. Every team underneath makes their own interpretation. Sometimes you get surveys, once a week, asking what employees think is most important for their job. You try to get a sense of whether people are actually aligned with the goal. But all of this is self-reported. It’s like a personality test. There is what people say and what people do.

Now imagine you can see what people are actually doing. Not because you’re surveilling them, but because the AI is already the intermediary for their work. Claude is the witness. Claude is the confidante. Claude has all the information. Are employees actually doing what the master plan says they should do? Are they doing something different? And if they’re doing something different, it’s not necessarily bad. It could be that the executives are unaware of what’s actually important. In fact, I’d argue that’s half the opportunity or more: seeing what the distributed system of your employees is actually doing naturally, extrapolating the implied strategy that hasn’t been made explicit, and seeing if you can reinforce that.

Real-time strategic nudges. When I’m interacting with Claude, building things or writing an email, can I be reminded of things that are relevant to the company’s strategy at that moment? Not after the fact, but in the moment. Imagine how powerful that is. We already do it with legal stuff. You’re about to do something that violates a policy, and you get flagged. That’s the basic kind of email monitoring that companies have. But this is different. This is: you’re about to do something that is very aligned with the strategy, and you should let people know. CC this person. Or: you’re going off track and should reconsider. And then flip it around: do the executives themselves actually behave according to their own strategy? They go to a golf retreat, come up with a plan, and then go back to work the next day. Is there a discrepancy between how they spend their time and what they said the company should do? They may not even be aware.

Retroactive security analysis. If a zero-day exploit is found, I want to go back to all my trajectories and see: did I use this anywhere? Was I exposed? This is the equivalent of an audit trail, but for every interaction you’ve ever had with an AI agent. If you have good archiving, you can replay, search, and assess exposure at any point in the future for threats that didn’t exist at the time.
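If trajectories are archived as searchable records, the exposure check is just a scan. Here is a minimal sketch, assuming a hypothetical archive of transcript records and a list of indicator regexes taken from a security advisory (the `Trajectory` shape and function names are illustrative, not any real tool’s API):

```python
import re
from dataclasses import dataclass


@dataclass
class Trajectory:
    """One archived agent run: an id, a timestamp, and the raw transcript."""
    run_id: str
    timestamp: str  # ISO 8601
    transcript: str


def exposure_scan(archive: list[Trajectory], patterns: list[str]) -> list[str]:
    """Return ids of archived runs whose transcripts match any indicator.

    `patterns` are regexes for indicators published with an advisory,
    e.g. a vulnerable package pin or a malicious domain.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [
        t.run_id
        for t in archive
        if any(rx.search(t.transcript) for rx in compiled)
    ]
```

The point isn’t the regex; it’s that a well-archived exhaust lets you ask this question years later, for threats that didn’t exist when the trajectory was recorded.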

Self-understanding. At the individual level, trajectory monitoring is about understanding yourself and where you’re at. How risky are my interactions with my agent? What patterns do I fall into? What am I actually spending my time on versus what I think I’m spending my time on?

Whoever controls the exhaust controls the insight. And if you’re really thoughtful about this kind of analysis, I think you can go really far.