Let's talk about a concept that's rapidly gaining traction: Vibe Engineering. This isn't just about letting an AI write your code; it's about creating a highly optimized environment where AI can perform at its peak, supercharging your development process.
The term "Vibe Coding" was mentioned by Andrej Karpathy in a Twitter post, describing a state where developers heavily rely on AI, mostly accepting its suggestions with minimal guidance. Vibe Engineering takes this a step further: it's about meticulously crafting the conditions for the AI to "Vibe Code" effectively, leading to superior results.
Vibe Engineering in Action: Real-World Wins
We have successfully applied Vibe Engineering in several use cases, demonstrating its practical benefits:
End-to-End Feature Implementation:
Specifying features with an Architect AI, then having a Coder AI implement the code and tests, which we could then merge. Sometimes this required re-specification if the AI misunderstood, especially if the architect step was skipped for seemingly "obvious" changes.
Large-Scale Code Refactorings:
A massive refactoring from a technical to a functional code structure, which worked surprisingly well after spending significant time with the Architect AI discussing the refactoring.
A test suite refactoring that changed the mocking strategy (from a third-party mocking framework to the test framework's native mocking with fixtures), which the AI understood and implemented effectively.
User Documentation Generation:
Added VitePress (a static site generator) to a repository to create user documentation from Markdown files.
A dedicated "User Guide Writer" AI mode was tasked to write the documentation, fed with additional context like landing page text.
Initial attempts to use browser automation failed due to captchas and navigation issues (scrolling). Letting the AI analyze the code directly yielded very good results, with the AI extracting relevant user-facing details.
Code Migration:
Translating code from one language/framework to another. The process involves:
Putting old and new code in a single repository accessible to the AI.
Having the AI describe the old repository extensively (user journeys, screens, features) into documentation.
Instructing the AI to translate/write individual features in the new repository, using the feature descriptions and, for complex parts, the old code as a reference. This has worked very well.
Incident Management:
Analyzing a Sentry trace for a suspected SQL injection. The AI reviewed the trace and corresponding code, producing a very good incident report.
Investigating a downtime issue where a Docker container didn't restart. The AI analyzed logs and infrastructure code, accurately describing the cause and suggesting fixes.
The ultimate aim of Vibe Engineering? Speed and value. In software engineering, the primary goal is to deliver valuable features to users as quickly as possible. Vibe Engineering offers a path to achieve this by leveraging AI to its fullest potential.
So, how do you build this optimal AI coding environment? Here are the key pillars:
1. The Foundation: A Primed Project
For an AI to code effectively, it needs a solid starting point.
Starter Kits: If beginning a new project, say a Svelte web app, make sure a starter kit has already been scaffolded (e.g., by running SvelteKit's project init).
Pre-configured Essentials: Things like user authentication (if needed), a testing suite (unit and end-to-end), database connections, a component framework, and a styling methodology should already be in place. This allows the AI to immediately start working with existing components and systems rather than building everything from scratch.
2. Guiding the AI: Rulesets and Best Practices
AIs, left to their own devices, don't always follow best practices.
Incremental Implementation: Define rules that require the AI to implement features in small, incremental steps, including intermittent builds and tests. This prevents scenarios where the AI builds an entire feature (database, backend, frontend) only to find it riddled with bugs upon the first execution.
Team-Specific Practices: Every team has its unique way of doing things. The AI must be instructed to adhere to these established team rules and best practices to ensure seamless integration.
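To make this concrete, here is a minimal, hypothetical excerpt from such a ruleset (the exact file location and wording depend on your coding assistant and your team):

```markdown
<!-- Hypothetical excerpt from a coding-assistant ruleset file -->
- Implement features in small, incremental steps; after each step, run the build and the tests before continuing.
- Never leave the working tree in a state where the build or the tests fail.
- New pages must follow the team's existing routing, form-action, and data-model conventions (see the project documentation).
- Prefer extending existing components over introducing new dependencies.
```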
3. Knowledge Transfer: Comprehensive Project Documentation
Just like onboarding a new human engineer, the AI needs context.
Project Overview: A document describing the project, its goals, key components, how to navigate the codebase, and essential commands (for installation, execution, testing) is invaluable. This document should become part of every task given to the AI.
Document Recurring Tasks: Common procedures, like creating a new page, connecting it to the backend (e.g., for a form action), and managing the corresponding data model, should be documented (a sketch of such a pattern follows this list).
Reference in Rulesets: Your ruleset can then direct the AI to consult this documentation for specific tasks, enabling it to follow established patterns independently.
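As an illustration of the kind of recurring pattern worth documenting, here is a minimal SvelteKit form action; the route, field names, and data layer are hypothetical, and your own documented pattern is what the AI should actually follow:

```typescript
// src/routes/profile/+page.server.ts -- hypothetical route illustrating the documented
// "page + form action + data model" pattern.
import { fail } from '@sveltejs/kit';
import type { Actions } from './$types';

export const actions: Actions = {
  default: async ({ request }) => {
    const form = await request.formData();
    const displayName = form.get('displayName');

    if (typeof displayName !== 'string' || displayName.trim() === '') {
      // fail() keeps the user on the page and surfaces a validation error.
      return fail(400, { error: 'Display name is required' });
    }

    // Persist through whatever data layer the project documentation prescribes
    // (omitted here).

    return { success: true };
  },
};
```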
4. Streamlined Execution: Simplified Commands
Make it easy for the AI to perform complex or multi-step operations.
Chained Commands: Combine sequences like building the project, running tests, checking coverage, linting, and formatting into single commands. For example, an npm run check script in package.json could execute all these steps (see the sketch at the end of this section).
Clear Feedback Loops: If a chained command fails (e.g., tests break after a successful build), the AI immediately knows which part needs fixing.
Prepared Git Commands: Have ready-made commands for tasks like fetching Git history.
Documented Commands: Ensure all such custom commands are documented so the AI knows they exist and how to use them.
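A chained check script in package.json might look like the following; the individual tool commands are assumptions, so use whatever your project already defines:

```json
{
  "scripts": {
    "build": "vite build",
    "test": "vitest run --coverage",
    "lint": "eslint .",
    "format:check": "prettier --check .",
    "check": "npm run build && npm run test && npm run lint && npm run format:check"
  }
}
```

Because the steps run in a fixed order, a failure in the chain tells the AI exactly which stage to fix before retrying.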
5. Evolving Intelligence: AI Self-Improvement
Encourage the AI to refine its own operational parameters.
Ask the AI: A neat trick is to simply ask the AI what it needs to perform a task optimally and inspire it to improve its own rules or prompts.
Dynamic Rule Updates: If the AI makes a mistake (e.g., incorrectly implements a backend-frontend connection), correct it in the chat. Then, prompt the AI to add a rule to its ruleset to ensure it handles the situation correctly next time.
6. Role-Playing for Precision: AI Personas
Assigning specific roles to the AI can significantly improve its focus and output. Current models aren't yet adept at inherently knowing when to switch modes, making explicit roles beneficial.
Architect Mode: For high-level architectural decisions and task specification. This role might be restricted from editing code files, only working with Markdown for documentation or specifications. For example, an architect might be tasked with creating an Architecture Decision Record (ADR) whenever a fundamental decision is made.
Coder Mode: Specifically prompted and permissioned to handle coding tasks.
User Documentation Writer Mode: Prompted with a user-centric language style, focusing on the "task to be done" from the user's perspective, unlike the architect who focuses on technical aspects like security and maintainability. This separation prevents the AI from jumping into implementation details when you're trying to discuss architecture and allows for role-specific rules.
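How these roles are wired up depends on the assistant; as a tool-agnostic sketch (every name and field below is hypothetical, not any particular tool's configuration format), the separation might be modelled like this:

```typescript
// Tool-agnostic sketch of persona definitions; field names are hypothetical.
interface Persona {
  name: string;
  systemPrompt: string;
  allowedFilePatterns: string[]; // which files the persona may edit
}

export const personas: Persona[] = [
  {
    name: 'Architect',
    systemPrompt:
      'Focus on architecture, security and maintainability. Produce specifications and ADRs. Do not write application code.',
    allowedFilePatterns: ['**/*.md'], // documentation and specs only
  },
  {
    name: 'Coder',
    systemPrompt: 'Implement the feature exactly as specified, in small increments with passing tests.',
    allowedFilePatterns: ['**/*'],
  },
  {
    name: 'User Guide Writer',
    systemPrompt: 'Write user-facing documentation in task-oriented language; avoid implementation details.',
    allowedFilePatterns: ['docs/**/*.md'],
  },
];
```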
Choosing Your AI Model: A Balancing Act
Selecting the right LLM involves weighing cost, speed, and quality.
Personal Approach:
We aim for the highest quality model where speed still allows for good iteration and the cost provides a decent return on investment. For instance, coding with Gemini 2.5 Pro can be very fast, even in reasoning mode, costing around €10/hour, which is often justifiable against client rates.
Prompt Caching:
Ensure the model and your system support prompt caching. Prompts should be structured so that the initial parts stay identical across requests, with new information appended at the end, to leverage caching and reduce costs. Tools like Roo Code can handle this automatically (a small sketch of cache-friendly prompt assembly follows at the end of this section).
Benchmarks:
While not the sole decision factor, benchmarks can be informative.
Look at the coding section of the Artificial Analysis benchmark.
The Aider Polyglot benchmark is also interesting, especially as it comes from a coding assistant provider.
Be aware of benchmark limitations: many focus on competitive coding or specific languages like Python. However, they help identify top-tier models and evaluate the quality-cost trade-off. Speed often needs to be tested manually, though most top models have comparable speeds.
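The prompt-caching idea mentioned above boils down to keeping a stable prefix. A minimal sketch follows; the file paths are hypothetical, and in practice a coding assistant assembles this for you:

```typescript
import { readFileSync } from 'node:fs';

// Stable context is loaded once and always placed first, unchanged between
// requests, so the provider's prompt cache can reuse the shared prefix.
const SYSTEM_RULES = readFileSync('.rules/coding-rules.md', 'utf8');
const PROJECT_OVERVIEW = readFileSync('docs/project-overview.md', 'utf8');

// Fresh information (the current task, recent tool output) is only ever appended.
export function buildPrompt(task: string, freshContext = ''): string {
  return [SYSTEM_RULES, PROJECT_OVERVIEW, freshContext, task].join('\n\n');
}
```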
Getting Started with Vibe Engineering
Get a Coding Assistant: Tools like Roo Code or others are your entry point.
Summarize Your Repository: Let the assistant analyze your codebase and create an overview document. We have created a mode for Roo Code ("Repository Overview" or "Code Base Overview") for this; Claude Code's /init command does something similar. This gives the AI a good starting point for every task.
Ask Questions: Interact with the AI about your codebase. Observe which files it examines and how it formulates answers. This is easier if you're familiar with the codebase yourself, allowing you to challenge or validate its responses.
Start Small: Begin by having the AI make minor changes, perhaps at the function level. Gradually increase the complexity until it can implement entire features.
Develop Your Ruleset: You'll quickly see the need for custom rules. You can even use existing rulesets, like our ruleset from "feature," as a starting point.
An Advanced Workflow: The Feature Branch Power-Up
A particularly effective workflow we've implemented is the "Feature Branch Workflow" using an "Orchestrator Mode" that coordinates other AI modes:
Specification (Architect): The Orchestrator first tasks the Architect AI with fully specifying the requested feature, including asking clarifying questions or making suggestions.
Detailed Issue: The outcome is a GitHub Issue that's much more detailed than usual, outlining the problem, a step-by-step implementation plan, and required tests, just as our team would do. This gives the human developer a clear view of the plan before coding begins.
Implementation (Coder): The Coder AI then takes this detailed issue and implements the feature. This structured approach limits the AI's "freedom" in a good way, leading to more predictable results than if the AI devised the plan itself during coding.
Testing: The Coder AI is instructed to develop tests for the new feature. AI is generally good at writing its own tests. The typical flow is for the AI to write the code first, then the tests; attempts to force test-first development often meet resistance from models and haven't yielded better results in our experience. The AI then iterates against these tests; typically, tests are closer to the specification than the initial code, helping to iron out bugs. All tests must be green before proceeding, even if a failing test seems unrelated to the feature, as our ruleset dictates only green code on main.
Format & Lint: After tests pass, the AI runs formatters and linters, fixing any issues.
Pull Request & Review (Reviewer): The AI creates a pull request. The Orchestrator then invokes a Reviewer AI, which gets a fresh context to review the PR. This feedback loop (Reviewer to Coder) can run up to three times. This "self-review" by an AI with a cleared context has been shown by some papers to produce better results than a single AI output. It's crucial to clear the context for the review; otherwise, the AI might superficially approve its own work without re-reading files.
Human Merge: Finally, a human reviews and merges the PR. The human review can focus more on the tests (are they comprehensive?) and a high-level scan of the code for any glaring oddities (unexpected files changed, major style deviations).
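Stripped of all tool specifics, the orchestration loop looks roughly like this; runMode is a hypothetical placeholder for however your assistant invokes a persona with a fresh context:

```typescript
// Hypothetical helper: invoke an assistant persona with a fresh context and
// return its textual output. Wire this up to your own coding assistant.
async function runMode(mode: 'architect' | 'coder' | 'reviewer', prompt: string): Promise<string> {
  throw new Error(`runMode(${mode}) is a placeholder`);
}

// Simplified sketch of the Feature Branch Workflow described above.
export async function featureBranchWorkflow(featureRequest: string): Promise<void> {
  // 1. Architect turns the request into a detailed GitHub issue (spec, plan, tests).
  const issue = await runMode('architect', `Specify this feature as a detailed issue: ${featureRequest}`);

  // 2. Coder implements the issue, including tests, until all checks are green.
  let pullRequest = await runMode('coder', `Implement this issue; all checks must pass: ${issue}`);

  // 3. Reviewer (fresh context) reviews the PR; the feedback loop runs at most three times.
  for (let round = 0; round < 3; round++) {
    const review = await runMode('reviewer', `Review this pull request: ${pullRequest}`);
    if (review.includes('APPROVED')) break;
    pullRequest = await runMode('coder', `Address this review feedback on the PR: ${review}`);
  }

  // 4. A human reviews the tests and merges; this final step stays manual.
}
```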
The Human Side: Adapting to Your AI Partner
For a true symbiosis, developers also need to adapt:
Relinquish Some Control: Be a bit more flexible with your personal best practices and standards; let the AI take more ownership of the code.
Embrace AI's Quirks: For instance, AIs often leave many comments in the code. Despite prompts to avoid this, they persist. This might even help the AI's autoregressive generation process, as comments become part of the input for subsequent code.
Code Structure: AIs find it easier to parse a single large file than to jump between many small, highly modularized files. Over-modularization can lead to misinterpretations if the AI misses a relevant piece of code.
Descriptive Code: Well-documented functions, typed languages, and clearly described input/output parameters make it much easier for the AI to understand how code works.
Semantic Search Friendliness: Since coding assistants often use semantic search, having descriptive text (like in comments or documentation) alongside the code that reflects concepts like "authentication flow" helps the AI find relevant sections (see the sketch after this list).
Small Changes: AIs, like less experienced engineers, can be overwhelmed by large, complex changes. Prefer smaller, incremental modifications.
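As a small example of descriptive, search-friendly code, consider the following; the helper functions are hypothetical stand-ins for a real auth layer:

```typescript
// Hypothetical stand-ins for the project's own authentication layer.
async function verifyCredentials(email: string, password: string): Promise<{ id: string }> {
  return { id: 'user-123' }; // placeholder implementation
}
async function createSession(userId: string): Promise<string> {
  return `session-${userId}`; // placeholder implementation
}

/**
 * Part of the authentication flow: exchanges login credentials for a session
 * token. The explicit types and the phrase "authentication flow" in this
 * comment give both the model and semantic search something concrete to match.
 */
export async function loginWithPassword(
  email: string,
  password: string
): Promise<{ sessionToken: string }> {
  const user = await verifyCredentials(email, password);
  const sessionToken = await createSession(user.id);
  return { sessionToken };
}
```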
Essential Tooling for AI Companions
Easy Command Execution: Make it simple for AI to run commands for specific tasks.
Use npm scripts in package.json so the model can just run npm run check for build, lint, and format.
Prepare and document git commands for easy execution by the model.
Extra Tools: Provide the AI with tools beyond its built-in capabilities.
A GitHub MCP (Model Context Protocol) tool is almost a necessity for our workflow, as the GitHub CLI can run into text-parsing errors when creating issues.
We are experimenting with a Context7 MCP which allows the AI to fetch documentation for specific systems.
An MCP for browser interaction is very helpful for web development, allowing the AI to open a browser and inspect things if needed.
LLM-Friendly Tool Output: Some tools are not inherently LLM-friendly.
For example, when Playwright end-to-end tests fail because the AI expected to be on one page (e.g., homepage after login) but was on another (e.g., still on the login page), it might misdiagnose the issue as rendering problems or race conditions, not realizing it's on the wrong route.
The Playwright test output should ideally show the last route, the visited routes, and any similar locators present on the page, since the AI typically can't "see" the page during tests. Playwright's output can be extended with custom fixtures or reporters to provide such information, though we haven't implemented this yet (a sketch of the idea appears at the end of this section).
Concise Logs: Be careful with logs. The AI should receive only the most important information it actually needs, not be overwhelmed by verbose CLI output.
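Returning to the Playwright point above: one way such an extension could start, sketched with Playwright's afterEach hook (we haven't built this ourselves, so treat it as a starting point):

```typescript
import { test } from '@playwright/test';

// On failure, attach the route the browser actually ended up on, so the model
// can see it was e.g. still on /login instead of the expected homepage.
test.afterEach(async ({ page }, testInfo) => {
  if (testInfo.status !== testInfo.expectedStatus) {
    await testInfo.attach('last-url', {
      body: page.url(),
      contentType: 'text/plain',
    });
  }
});
```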
Conclusion
Vibe Engineering is more than just a buzzword; it's a systematic approach to maximizing AI's contribution to software development. By thoughtfully preparing the project, guiding the AI with clear rules and rich context, and even adapting our own practices, we can unlock new levels of speed and efficiency. It's an evolving field, but the potential for transforming how we build software is immense.