Governance should live in the harness, not the policy doc
Three things every AI agent needs that your policy can't give it
Agentic architectures will soon become the standard form factor for AI.
Generative AI has taken a few forms to this point.
We first had the chatbot interface, which remains dominant today. It is the form people tend to be most familiar with, having cemented its place with the release of ChatGPT in November 2022.
We then had auto-completion, particularly with coding. This made interaction with the model more indirect: you simply start typing in the editor and the model suggests what the rest of the code should look like.
Agents take this ambience further. With agentic architectures, models become the engines that power a wider system that operates autonomously, usually with little human direction.
AI agents are designed to go beyond simple single-turn, input-then-output interactions. They can identify the steps required to complete a task, access the right tools and data, and use feedback loops to verify that they have completed the task they were set.
Why are we moving more toward these agentic architectures? Because the chatbot form factor is difficult for most people to use. They do not always know what questions to ask and often struggle to unlock the full capabilities of the model sitting behind the text box.
Agents close this capability gap.
As I wrote previously:
[Agentic architectures] leverage AI capabilities to autonomously execute the right instructions, gather and use the relevant context, and complete actions on behalf of the user. Agentic AI could bring into existence the kind of ambient digital assistants that help users navigate their digital world in a way that demonstrates the extent of the capabilities of these strange black boxes.
But for agents to work effectively, they need to be surrounded by other components. Agents are systems made up of parts that do not just instruct the model on what to do and how to do it, but also provide it with the resources it needs to complete the task.
How those components are fitted together is crucial. As Grace Shao puts it in her newsletter on AI agents in knowledge work:
The broader point is that the debate is shifting from “which AI is smartest” to “which system can actually get work done inside an organization.” When intelligence is commoditized, the new scarcity is agency, and the ability to act across systems, with the right permissions, in the right sequence, with accountability attached. That is workflow. That is what the refineries are really selling.
So how do we build whole systems that can get work done?
Harness engineering.
Prompt engineering focuses on crafting detailed natural language instructions for the task the model should complete. Context engineering focuses on providing models with the relevant information they need to complete tasks.
Harness engineering is the practice of building the scaffolding around the model that enables it to go beyond just producing text. It is what gives the model the ability to browse the web, run code, read files, remember what happened in previous sessions and take multiple steps toward a goal.
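To make this concrete, here is a minimal sketch of a harness loop in Python. Everything in it is illustrative: `call_model` is a stand-in for whatever model client you use, and the two tools are toy implementations, not any particular framework's API.

```python
import subprocess
from typing import Callable

# The harness, not the model, owns the tools. The model only *requests*
# an action; the surrounding code decides whether and how to run it.
TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda path: open(path).read(),
    "run_code": lambda src: subprocess.run(
        ["python", "-c", src], capture_output=True, text=True
    ).stdout,
}

def call_model(messages: list[dict]) -> dict:
    """Stand-in for a real model client. A real call would return either a
    tool request like {"tool": "read_file", "input": "notes.txt"} or a
    final answer like {"final": "..."}."""
    return {"final": "stubbed answer"}

def run_agent(task: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):                    # multiple steps toward a goal
        reply = call_model(messages)
        if "final" in reply:                      # the model says it is done
            return reply["final"]
        result = TOOLS[reply["tool"]](reply["input"])          # run the tool
        messages.append({"role": "tool", "content": result})   # feed result back
    return "stopped: step budget exhausted"
```

The loop is where the capabilities listed above live: persist `messages` between sessions and you have memory; raise `max_steps` and you allow longer multi-step plans.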
But harness engineering is not just about making models more capable. It is also about making models safer and more reliable.
And this is where AI governance comes in.
Governance work typically lives on paper: policies, inventories and risk assessments scattered across Word documents, spreadsheets and slides.
But this paperwork means little if it is not accompanied by methods for putting it into practice. Without those methods, governance stays stuck as words on paper.
Harness engineering is what transforms governance from mere paperwork into executable rules for AI agents.
It becomes a way to achieve governance-by-design, producing the safe and reliable AI agents that users and enterprises are ultimately looking for.
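As a sketch of what that could look like in practice, the harness loop above can gate every tool request through a policy check before anything executes. The allowlist, project-root boundary, and `authorize` function below are illustrative assumptions, not a reference to any specific framework.

```python
import logging
from pathlib import Path

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("agent-audit")

ALLOWED_TOOLS = {"read_file", "search_web"}    # permissioning: what the agent may do
PROJECT_ROOT = Path("/srv/project").resolve()  # data boundary: where it may look

def authorize(tool: str, argument: str) -> bool:
    """Policy-as-code: every tool request passes through here before it runs."""
    if tool not in ALLOWED_TOOLS:
        audit.warning("denied: tool %r is not permitted", tool)
        return False
    if tool == "read_file":
        target = (PROJECT_ROOT / argument).resolve()
        if not target.is_relative_to(PROJECT_ROOT):  # requires Python 3.9+
            audit.warning("denied: %r escapes the project root", argument)
            return False
    audit.info("allowed: %s(%r)", tool, argument)    # accountability trail
    return True
```

Wired into the loop as a guard before each tool call (deny and report back to the model, or halt outright), the rule in the policy document becomes a branch in code, and the audit log supplies the accountability attached to every action.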
In this post, I break down what harness engineering entails and how it can be used to implement three core aspects of AI governance.


