Jason Guo

Posted on Feb 5

From “Vibe Coding” to “Agentic Engineering”: When Coding Becomes Orchestrating Agents

#agents #ai #softwareengineering #vibecoding

More Articles please vist: Windflash AI Daily

In February 2025, Andrej Karpathy sent out a tweet that defined a moment. He christened a new style of programming "Vibe Coding"—a state where you surrender completely to the flow of AI generation. For a brief window, developers felt a unique kind of liberation: you simply spoke to your IDE, and code poured out like magic. Errors were fixed instantly, features appeared on command, and the only thing that mattered was the result. It was a dopamine-fueled joyride where we focused on the "what" and blissfully ignored the "how."

Fast forward to 2026. Looking back, the software world didn't collapse into a heap of unmaintainable sludge. Instead, it evolved. In his one-year retrospective, Karpathy noted that we have shifted from "vibes" to discipline, coining a new term: "Agentic Engineering." This isn't just a rebrand; it marks the maturation of the industry from weekend hacking to professional system orchestration. Today, the core skill for every developer is no longer just typing code, but mastering the art of guiding AI agents that code faster than we can think—while ensuring production-grade safety and quality. This post explores how we got here and provides a practical roadmap for the new era of agentic workflows.

What People Mean by “Vibe Coding”

To understand where we are going, we have to revisit where we started. In early 2025, tools like Cursor Composer integrated models like Claude 3.5 Sonnet, creating a quantum leap in developer experience.

Karpathy described this shift as "fully giving in to the vibes, embracing exponentials, and forgetting that the code even exists." It was a shift away from precision and toward pure momentum.

Typical Symptoms of Vibe Coding
As originally described, this mode of work is characterized by several distinct behaviors:

Natural Language Driven: The keyboard gathers dust. You talk directly to your IDE using tools like SuperWhisper. You give vague commands like "decrease the padding on the sidebar by half" because you are simply too lazy to hunt down the CSS file yourself.

Accept All: You stop reading the diffs. The AI generates code faster than you can verify it, so you hit "Accept All" and trust the momentum.

Paste Errors as Prompts: When the build fails, you don't debug. You paste the stack trace directly into the chat window without a single word of commentary. Usually, the AI fixes it.

Abandoning Comprehension: The codebase grows rapidly, often beyond your mental model. If the AI can't fix a bug, you ask it to "randomly change things until it goes away" or find a workaround. You prioritize forward motion over understanding the root cause.

This pattern was incredibly seductive because it lowered the barrier to creation. As tools like SuperWhisper demonstrated, developers could act as commanders, dispatching orders to Cursor or Claude Code and watching ideas transmute into products in real-time.
For weekend projects, prototypes, or throwaway scripts, this was perfect—it offered maximum efficiency and instant gratification. However, as Karpathy later admitted, this was more of an "experience" than a rigorous engineering method. When this "don't look at the code" habit bled into long-term production environments, the cracks appeared: unmaintainable spaghetti code, hidden security vulnerabilities, and team friction caused by a total collapse of shared understanding.

One Year Later: From “Let It Rip” to Engineering Discipline

If 2025 was the year of the "Vibe Coding" party, 2026 is the year of the hangover—and the subsequent return to sobriety. In his retrospective, Karpathy admitted that the term went viral because it captured the excitement of the moment. But he was clear: while professional workflows now default to using LLM agents, they must be paired with significantly higher oversight and scrutiny.

This new paradigm is Agentic Engineering. The name itself signals a dual commitment:

Agentic: Acknowledging that for 99% of the work, you are not writing the code directly. You are orchestrating agents that do.
Engineering: Emphasizing that this is a discipline requiring expertise, structure, and science. It is not a slot machine; it is a skill you can refine and master.

This shift represents a fundamental change in how we operate. As Google Engineering Director Addy Osmani describes, we are crossing the chasm from "Imperative" to "Declarative" programming. We no longer tell the computer how to do something (write this loop, define that variable); we tell it what we want (pass these tests, handle this edge case). The table below contrasts these two eras:

As predicted, 2026 brings us improvements in both the Model Layer and the Agent Layer. These advancements allow us to build software that is more complex and robust than ever before—but only if we learn to effectively "manage" our digital workforce.

Bringing Agents to Production: A Safer Workflow

The biggest risk in bringing AI agents into production is what Addy Osmani calls "Comprehension Debt." When an agent generates code faster than you can read and understand it, you are borrowing against your future ability to maintain that system. AI can easily do the first 80% of the work. But the final 20%—the integration, the subtle bugs, the performance tuning—requires deep understanding. If you check out mentally during the first 80%, that final 20% becomes an insurmountable wall.

To mitigate this, we need a "Playbook for Production" that turns the luck-based "vibe" into a controlled process:

1. Prompts as Specifications
Stop giving vague instructions like "make this page better." Treat your prompts like requirements documents for a junior engineer. A qualified agent task description must include:

Acceptance Criteria: Specific conditions the feature must meet to be considered done.
Boundary Conditions: Expected behavior for empty states, network failures, or invalid inputs.
Non-goals: Explicitly tell the Agent what not to do to prevent "Abstraction Bloat"—the tendency of agents to over-engineer simple solutions.

2. Automated Verification as Guardrails
In the Vibe Coding era, we relied on our eyes. In Agentic Engineering, code does not enter the repository without passing automated checks.

TDD Revival: Have the AI write the test cases first. Once you verify the tests are correct, let the AI write the implementation until the tests pass. This is the most effective defense against hallucinations.
Strict Type Checking: Leverage the type systems of TypeScript or Rust. Let the compiler be your first line of defense, filtering out the low-level syntax errors and type mismatches that agents are prone to making.

3. Task Decomposition and Permissions
Never ask an Agent to refactor an entire module in one go.

Atomic Iteration: Break large tasks into atomic units, where each change touches only 3-5 files. This ensures the resulting diff is small enough for a human to actually review.
Require Explanations: Before accepting code, force the Agent to generate a "Summary of Changes" explaining what it did and why. If its explanation is confused, the code is almost certainly broken.

4. Fresh Context Review
Addy Osmani suggests a powerful technique: ask the AI to review its own code, but do it in a fresh chat window. In a long conversation, the AI often suffers from "Sycophantic Agreement"—it doubles down on previous mistakes to please you. A fresh context forces it to look at the code objectively, often revealing logic gaps it previously missed.

When should you NEVER Vibe Code?
Tools are powerful, but some domains require 100% human control. Do not delegate:

Security-Critical Logic: Payment processing, authentication flows, and encryption implementation.

Compliance-Heavy Data: Handling PII (Personally Identifiable Information) or medical records.

Infrastructure Core: A hallucinated Terraform configuration can lead to a cloud bill explosion or a total service outage.

Conclusion: Turn Vibes into Method
From 2025 to 2026, we witnessed AI programming settle from a dizzying "Vibe" into a replicable "Method." The value of Vibe Coding was that it shattered the barriers to creation, allowing us to splash ideas onto the screen like artists. Agentic Engineering builds on that freedom but reintroduces the discipline and rigor of the engineer.
In this new era, your value as a developer is no longer defined by your typing speed or your memorization of API parameters. It is defined by the clarity with which you define problems and the accuracy with which you judge results. As Karpathy envisions, 2026 will see the dual evolution of models and agents. Our job is to be the ones who define the destination and ensure the quality of the journey.
To ensure you aren't left behind by this wave, here are three actionable steps to start today:

Start as a Reviewer: In your next feature, try writing only the tests and documentation. Let the Agent write the implementation, then force yourself to review it as strictly as you would a colleague's code.
Practice Declarative Communication: When the Agent makes a mistake, don't paste code snippets to fix it. Instead, try to describe the logical fallacy in precise natural language. Train yourself to "direct" rather than "do."
Focus on Architecture, Not Syntax: Use the time you save to study system design, design patterns, and security architecture. These are the areas where AI is most likely to dig a hole—and the hardest for it to fill on its own.

Top comments (4)

Claessens Dieter • Feb 5

I work with a constitution but I notice it often times violates things in the constitution.
Have you worked with spec-kit before and what was you experience if so?

Jason Guo • Feb 5

I haven't used it before, but I encounter the same situation you're facing from time to time. My current approach is to audit the existing code every so often during the development process.

leob • Feb 6

What's a "constitution" - a set up guidelines for the AI tool?

leob • Feb 6

Useful overview!