Agent OS is great at making Claude Code respect your codebase. I’m less convinced it improves the hard part: planning.

Brian Casel is a content creator, the author of Agent OS and DesignOS, and the owner of Builder Methods, a community focused on sharing how to code with AI.

Agent OS is one of the more thoughtful frameworks I’ve tried for making Claude Code work in a real codebase. It gives you a structure for turning messy project knowledge into explicit standards, then helps Claude pull those standards into the right conversations at the right time.

That matters. A lot of AI coding failures don’t happen because the model can’t write code. They happen because the model doesn’t know the local taste of the codebase: the naming conventions, the preferred abstractions, the boilerplate patterns, the folder structure, the “we always do it this way here” details that live in people’s heads.

Agent OS is good at surfacing and preserving that knowledge. But after using it, I think its biggest strength is also where its current limits show up: it helps Claude generate code that looks more like your codebase, but it doesn’t go far enough in improving the depth of the plan before the code gets written.

The core loop

Brian's workflow sits mainly in the planning cycle of agentic development. It underwent a major change in January that removed many features Claude Code had been adding natively, so it's now a slimmer version focused mainly on dynamically extracting standards and injecting them into the context window.

[Figure: Brian Casel's core workflow loop]

The workflow

In practice this is much simpler than it looks. It's actually two parts: initial setup, when you first install Agent OS, and planning a piece of work.

Initial setup:

  1. Discover the current standards in your codebase. You won't do this that often.
  2. Plan product - Initial setup to explain what the codebase does, for whom and why. Again you'll only do this once.

Plan mode:

  1. Shape spec - Claude will launch a Q&A session, asking questions to shape the work based on your product and standards.
  2. Inject standards - A final command to polish off the plan and inject examples to steer code generation.

[Figure: Brian's Agent OS workflow]

Installation: useful, but heavy for teams

The first thing I noticed is that Agent OS installs a lot of files straight away.

[Figure: The new files Agent OS created]

For a solo project, that’s fine. For a team codebase, it’s more complicated. I can imagine my team being hesitant about adding this many project-level commands and workflow files into the repo. Some of that friction could be reduced if there were a clearer option to install commands personally rather than at the project level.

The installation also wasn’t as obvious as I expected. That might just be me, but the experience made me think this could work well as a Claude plugin. A plugin-style setup would make the first-run experience feel lighter and less invasive, especially for teams that aren’t ready to fully buy into the framework yet.

There are trade-offs, of course. Project-level files are more durable and shareable. Personal commands are easier to adopt but harder to standardize. Still, for team adoption, the current setup feels like it asks for quite a lot of commitment up front.

/discover-standards: slow, but genuinely useful

The standout feature for me is /discover-standards.

It’s slow to run, and it creates a lot of small example files. I wonder if the selection process could be simplified, maybe with a file limit or a more guided review flow. But once it finishes, the output is genuinely useful.

This is especially true in codebases with lots of boilerplate-like code and opinionated syntax. Those are exactly the patterns agents can miss. They can infer the broad architecture, but they often miss the local conventions that make code feel native to the project.

This is where Agent OS shines. It turns implicit codebase knowledge into something Claude can actually reference. That makes the agent less likely to produce code that technically works but feels alien in the repo.

/inject-standard: a practical workaround for context problems

/inject-standard seems to be a way to force Claude to pick up the right context at the right time.

In practice, it’s slightly nicer than manually tagging a context file. It feels more dynamic than static rules, and that’s useful. You can bring in a relevant standard mid-conversation instead of overloading Claude with everything from the beginning.

That said, part of me feels like Claude should do more of this out of the box with Skills or rules. The idea of loading only the relevant guidance is exactly what good agent tooling should handle natively.

Still, today, the command is useful. It gives you a more controlled way to steer the model without dumping every standard into the context window.
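
To make the "load only the relevant guidance" idea concrete, here is a minimal sketch. It is not how Agent OS implements /inject-standard. The Standard type, the keyword matching, and the three example standards are all assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standard:
    """A hypothetical standard: a name, trigger keywords, and guidance text."""
    name: str
    keywords: frozenset[str]
    body: str


def select_standards(task: str, standards: list[Standard], limit: int = 2) -> list[Standard]:
    """Return up to `limit` standards whose keywords appear in the task text."""
    words = set(task.lower().split())
    # Score each standard by how many of its keywords the task mentions.
    scored = [(len(s.keywords & words), s) for s in standards]
    # Drop standards with no match, then put the best matches first.
    matched = [pair for pair in scored if pair[0] > 0]
    matched.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in matched[:limit]]


# Example standards, made up for this sketch.
STANDARDS = [
    Standard("api-errors", frozenset({"api", "endpoint", "error"}),
             "Return errors in the project's shared error format."),
    Standard("forms", frozenset({"form", "validation", "input"}),
             "Build forms with the shared form components."),
    Standard("naming", frozenset({"naming", "model", "migration"}),
             "Name models in the singular and tables in the plural."),
]

relevant = select_standards("add an api endpoint for invoices", STANDARDS)
print([s.name for s in relevant])
```

Only the matching standard reaches the prompt, which is the point: the other two stay out of the context window until a task actually needs them.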

/plan-product: familiar advice, nicely executed

/plan-product is the kind of workflow I already think of as best practice: clarify the product, ask questions, define the shape of the work before jumping into code.

What I liked is the way it uses Claude’s native interaction style. The questions make the workflow feel collaborative rather than mechanical. Instead of just filling out a template, you get a planning conversation that helps sharpen the idea.

This is the part of Agent OS that feels especially useful for people who are newer to Claude Code. It nudges them away from “vibe coding” and toward a more deliberate loop: think first, specify the work, then let the agent execute.

/shape-spec: good for codebase alignment, not deep enough for hard planning

/shape-spec is where I wanted more.

The current version feels heavily focused on shaping a spec for a specific codebase. That’s useful, but I think Agent OS is missing a bigger opportunity: improving the actual planning step through deeper research.

Brian’s framework does a great job of generating code that looks similar to the rest of your codebase. If the task is adding a button, building a form, or extending an existing pattern, that might be exactly what you need.

But for more complex architectural work, codebase similarity isn’t enough.

If you’re planning how to implement a new REST layer and dynamically transform it into LLM tools, you need more than local standards. You need to understand current best practices, available libraries in your chosen language, the limitations of your existing architecture, and the trade-offs between possible approaches.

That’s the mode I still find missing, even with Anthropic’s improvements to plan mode. Agent OS helps Claude plan within the known shape of the codebase, but it doesn’t yet feel like it pushes the model into a truly deep research workflow.

What I want is something closer to /deep-plan: a mode that combines codebase discovery, external best-practice research, library evaluation, architecture constraints, and implementation sequencing before producing the spec. At the moment even using Claude with maximum effort doesn't achieve this, so I think it's real low-hanging fruit.
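
As a rough sketch of the shape I have in mind, the research stages could be treated as a checklist that must be complete before any spec is written. /deep-plan does not exist. The stage names come from the list above, while their order, the DeepPlan class, and the example notes are all hypothetical.

```python
from dataclasses import dataclass, field

# Stage names taken from the paragraph above; the ordering is an assumption.
STAGES = (
    "codebase discovery",
    "external best-practice research",
    "library evaluation",
    "architecture constraints",
    "implementation sequencing",
)


@dataclass
class DeepPlan:
    """Hypothetical container for research notes gathered before a spec is written."""
    task: str
    findings: dict[str, list[str]] = field(default_factory=dict)

    def record(self, stage: str, note: str) -> None:
        """Attach a research note to one of the known stages."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.findings.setdefault(stage, []).append(note)

    def missing_stages(self) -> list[str]:
        """Stages with no findings yet, in the order they should be tackled."""
        return [s for s in STAGES if not self.findings.get(s)]

    def ready_for_spec(self) -> bool:
        """A spec should only be produced once every stage has findings."""
        return not self.missing_stages()


# Example notes, made up for this sketch.
plan = DeepPlan(task="Expose the REST layer as LLM tools")
plan.record("codebase discovery", "Routes are declared in one place")
plan.record("library evaluation", "Compare two schema-generation libraries")

print(plan.missing_stages())
print(plan.ready_for_spec())
```

The gate is the useful part: the agent cannot skip from local conventions straight to a spec while the research stages are still empty.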

The team adoption problem

My biggest hesitation with Agent OS is adoption in small to medium-sized teams.

As a framework, it works best when the whole team buys into it. Standards, specs, and commands become most valuable when they’re maintained and trusted by everyone. But that’s also what makes the framework harder to introduce.

If only one person on the team wants to use it, where do the files live? Do they go into the repo? Do they stay local? Do you git ignore them? If you ignore them, they’re not safely persisted or shared. If you commit them, you’re asking the team to accept a new workflow and extra project structure.

None of these options are terrible, but they all feel slightly messy.

This is why I keep coming back to the plugin idea. A lighter-weight Claude plugin could make Agent OS easier to try before asking the whole team to adopt it. The repo-level version could still exist for teams that want the full shared workflow.

Conclusion: excellent for Claude Code beginners, promising for teams, but not the whole answer

Overall, I think Agent OS is an excellent framework, especially for someone starting out with Claude Code.

It encourages the right habits: write specs, capture standards, make implicit codebase knowledge explicit, and avoid asking the model to guess everything. That alone will get many people much further than an unstructured prompting workflow.

The strongest part is standards discovery. Agent OS is genuinely useful for helping Claude generate code that fits your project’s existing patterns and conventions.

The weaker part is planning depth. For straightforward product work, the current planning flow is probably enough. For deeper architecture or unfamiliar technical domains, I’d still want a much stronger research-driven planning mode before trusting the agent to build.

So my read is this: Agent OS is very good at making Claude Code more consistent. It’s less complete as a system for making Claude Code more strategically thoughtful.

That’s not a small achievement. It solves a real problem. I just think the next leap is helping the agent understand not only how we write code here, but what the right thing to build actually is.
