An AI agent takes on the entire software development workflow.
Hey, I'm Fabian, and I'm on a mission to see how autonomously AI agents can implement software features. I'm building and refining a development loop where an AI agent takes on the entire workflow: implementing code, writing and running tests, linting the code, getting it reviewed, and then merging it. This process is designed to be iterative, allowing the AI to build out complete features with minimal human intervention.
Today, I want to walk you through a real example of this loop in action.
The Application and the Task
Our testing ground is a simple web application. It has basic authentication (login/logout), a full set of CRUD (Create, Read, Update, Delete) operations for files, and a feature to display toast notifications.
The application's main weakness was its navigation. To move between different example sections, a user had to know and type the specific URLs manually. The task for our AI agent was to add a navigation sidebar that links to the CRUD and toast examples. Crucially, this sidebar should only appear within the "examples" section of the app, which suggests the need for a new layout component to avoid adding the sidebar to every single page.
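To make the "examples only" requirement concrete, here is a minimal, framework-free TypeScript sketch of the decision a layout component would have to make. The `/examples` prefix, the two routes, and every name in it are assumptions for illustration; this is not the code the agent produced.

```typescript
// Hypothetical navigation model; paths and labels are illustrative only.
interface NavItem {
  label: string;
  href: string;
}

// Assumed URL prefix for the "examples" section of the app.
const EXAMPLES_PREFIX = "/examples";

const exampleNavItems: NavItem[] = [
  { label: "CRUD", href: `${EXAMPLES_PREFIX}/crud` },
  { label: "Toasts", href: `${EXAMPLES_PREFIX}/toasts` },
];

// True only for the examples root itself or for paths nested under it.
function isInExamplesSection(pathname: string): boolean {
  return pathname === EXAMPLES_PREFIX || pathname.startsWith(`${EXAMPLES_PREFIX}/`);
}

// A layout would call this to decide which links, if any, to render
// next to the page content.
function sidebarItemsFor(pathname: string): NavItem[] {
  return isInExamplesSection(pathname) ? exampleNavItems : [];
}
```

Matching on the prefix followed by a slash, rather than on the bare prefix, keeps an unrelated path such as `/examples-old` from picking up the sidebar.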
To help with the UI, we'll instruct the agent to leverage the shadcn/ui component framework.
The Tool for the Job: Introducing Roo Code
For this task, I'm using Roo Code, an AI coding assistant that runs as a VS Code extension. It's more than just a chatbot: it can read and edit files, run terminal commands, and even interact with services like GitHub through an MCP (Model Context Protocol) server.
The most powerful aspect of Roo Code is its customizability. You can define different "modes" and "rules" to guide the AI's behavior. For this project, I've created a "Feature Orchestrator" mode. This orchestrator follows a specific set of rules I've defined, including a standard GitHub feature branch workflow: create an issue, create a branch, implement the code, create a pull request, review the PR, and merge.
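As a rough mental model, the workflow in those rules is just an ordered list of steps. The sketch below is plain TypeScript written for illustration; it is not Roo Code's configuration format, and the step identifiers and the `nextStep` helper are hypothetical.

```typescript
// Step names mirror the feature branch workflow described above.
// The identifiers themselves are made up for this sketch.
const WORKFLOW_STEPS = [
  "create-issue",
  "create-branch",
  "implement",
  "create-pull-request",
  "review-pull-request",
  "merge",
] as const;

type WorkflowStep = (typeof WORKFLOW_STEPS)[number];

// Returns the step that follows `current`, or null once the merge is done.
function nextStep(current: WorkflowStep): WorkflowStep | null {
  const index = WORKFLOW_STEPS.indexOf(current);
  return index < WORKFLOW_STEPS.length - 1 ? WORKFLOW_STEPS[index + 1] : null;
}
```

Keeping the order explicit like this is the point of the rules: the orchestrator never has to guess what comes after a finished sub-task.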
Putting the AI to Work: A Step-by-Step Walkthrough
I start by giving the Feature Orchestrator my prompt detailing the sidebar requirements. Let's break down how it handled the task.
Step 1: Planning and Specification
The first thing the agent does is create a plan. It interprets the request and decides that the initial step is to create a formal task. I guide it to capture this task in a GitHub issue. This is a critical step: by writing a detailed issue with a clear summary and acceptance criteria, the AI clarifies its understanding and its proposed solution before writing a single line of code. It shifts effort toward specification and away from debugging flawed code later.
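To show what "a clear summary and acceptance criteria" can look like in practice, here is a small sketch that renders an issue body as Markdown with one checkbox per criterion. The `FeatureIssue` type, the `renderIssueBody` helper, and the wording of the example issue are assumptions based on the task description; they are not the issue the agent actually wrote.

```typescript
// Hypothetical shape for a feature issue; not a GitHub API type.
interface FeatureIssue {
  title: string;
  summary: string;
  acceptanceCriteria: string[];
}

// Renders the issue body as Markdown, with a task-list checkbox per criterion.
function renderIssueBody(issue: FeatureIssue): string {
  const criteria = issue.acceptanceCriteria.map((c) => `- [ ] ${c}`).join("\n");
  return `## Summary\n\n${issue.summary}\n\n## Acceptance criteria\n\n${criteria}\n`;
}

// Illustrative wording, derived from the task described earlier.
const sidebarIssue: FeatureIssue = {
  title: "Add navigation sidebar to the examples section",
  summary: "Users currently have to type example URLs by hand. Add a sidebar that links to the example pages.",
  acceptanceCriteria: [
    "The sidebar links to the CRUD and toast examples.",
    "The sidebar appears only within the examples section.",
    "The sidebar is built with shadcn/ui components.",
  ],
};

console.log(renderIssueBody(sidebarIssue));
```

Each criterion is something the coder and the reviewer can check off later, which is what makes the issue useful as shared context.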
Step 2: The Git Workflow
Once the issue is created, the Orchestrator follows its rules to the letter. It creates a new feature branch, fetches the latest changes from the repository, and checks out the new branch. It's now on a clean branch, ready to code.
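The exact commands depend on how the branch gets created. As a sketch, assuming the branch already exists on the remote and only needs to be checked out locally, and assuming a hypothetical `feature/<issue>-<slug>` naming convention with a made-up issue number, the step boils down to something like this:

```typescript
// Turns an issue title into a lowercase, hyphen-separated slug.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

// Hypothetical naming convention: feature/<issue number>-<slug of the title>.
function featureBranchName(issueNumber: number, title: string): string {
  return `feature/${issueNumber}-${slugify(title)}`;
}

// Git commands to pick up a branch that already exists on the remote.
// The default remote name "origin" is an assumption.
function checkoutCommands(branch: string, remote = "origin"): string[] {
  return [`git fetch ${remote}`, `git checkout ${branch}`];
}

// 42 is a placeholder issue number, not the real one.
const branch = featureBranchName(42, "Add navigation sidebar");
console.log(checkoutCommands(branch).join("\n"));
```

Tying the branch name to the issue number makes it easy to trace a branch back to its specification.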
Step 3: From Issue to Implementation
Next, the Orchestrator starts a "code" sub-task. The AI coder's first action is to read the GitHub issue it just created to get the full context. This is exactly what a human developer would do. With the requirements understood, I set the agent to "auto-approve" its own actions and let it work.
The agent gets to work, creating end-to-end tests first and then implementing the necessary components and layout changes.
Step 4: Testing, Linting, and Building
The agent attempts to run the tests it created and hits a small snag: the tests fail. This was actually my fault; I still had the development server running locally, which blocked the port. After I stopped it, I instructed the agent to run the tests again. This time, they all passed.
I have a rule that, before creating a pull request, the agent must ensure the code is clean and the project builds successfully. It runs the format and lint commands to make sure everything is up to standard.
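The rule amounts to a quality gate: run each check in order and stop at the first failure. Here is a minimal sketch of that idea, with stub functions standing in for the real commands. The `Check` and `GateResult` types and the `runGate` helper are hypothetical; in a real project each `run` would shell out to the project's own format, lint, test, or build command.

```typescript
// A named check that reports success or failure.
interface Check {
  name: string;
  run: () => boolean;
}

interface GateResult {
  passed: boolean;
  failedAt: string | null; // name of the first failing check, if any
  ran: string[]; // names of the checks that were actually executed
}

// Runs checks in order and stops at the first failure, so a pull request
// is only opened when every check has passed.
function runGate(checks: Check[]): GateResult {
  const ran: string[] = [];
  for (const check of checks) {
    ran.push(check.name);
    if (!check.run()) {
      return { passed: false, failedAt: check.name, ran };
    }
  }
  return { passed: true, failedAt: null, ran };
}

// Stubbed example: every check succeeds, so the gate passes.
const gate = runGate([
  { name: "format", run: () => true },
  { name: "lint", run: () => true },
  { name: "build", run: () => true },
]);
console.log(gate.passed ? "ready for a pull request" : `blocked by ${gate.failedAt}`);
```

Stopping early keeps the feedback short: the agent sees exactly one failing check to fix before trying again.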
Step 5: The Pull Request and AI Code Review
With clean code and passing tests, the agent creates a pull request on GitHub. But the process doesn't stop there.
This is where it gets really interesting. The Feature Orchestrator, having completed the coding sub-task, now creates a review sub-task. A new AI instance starts up with the goal of reviewing the pull request. It fetches the PR details and the original issue, and then proceeds to read through the changed files. It even finds a small issue, fixes it, commits the change, and leaves a comment on the PR.
Step 6: The Merge
After the AI's self-review, all automated checks were passing. I reviewed the final result in the application. The new sidebar was implemented correctly and functioned as expected. The agent had even added a nice responsive touch: the sidebar automatically collapsed into a hamburger menu on smaller screen sizes. This was an excellent addition that I hadn't explicitly requested.
With the work verified, I gave the final approval. The agent merged the pull request, deleted the feature branch, and switched back to the main branch. The task was complete.
The Future is Supervision, Not Typing
Throughout this process, I didn't write any application code myself. My role shifted to that of a supervisor. I provided the initial direction, offered a bit of guidance when the agent needed a nudge, and gave the final sign-off.
The system isn't perfect yet. I still need to refine the rules to make the agent more autonomous, and there are occasional hiccups with things like terminal integrations. However, I'm convinced this is a glimpse into the near future of software development. Our work will be less about typing out lines of code and more about:
Creating clear, detailed specifications (like the initial issue).
Reviewing the final output (like the pull request).
The entire process in between (the coding, testing, and iterating) will be handled by our AI counterparts.
I hope you found this look into an AI-driven development workflow interesting. I'll be sharing more as I continue to build and refine the process!