What is Gstack? Planning before the agent starts coding

Published 2026-05-12 · Updated 2026-05-12 · v1 · #ai #gstack #agents #software-development #planning #ai-agents #ai-coding #product-development #workflow-automation #software-engineering

who built it

Gstack is by Garry Tan, President and CEO of Y Combinator. The README is unusually personal: he describes wanting to understand how builders like Andrej Karpathy and Peter Steinberger could ship at extreme AI-assisted velocity, then open-sourcing the setup he uses every day.

The result is a collection of specialist workflows: CEO review, engineering review, design review, QA, security, release, and browser-driven testing. The exact list changes, but the philosophy is stable: do not ask a general agent to be everything at once. Give it roles, gates, and rituals.

That is a very YC-flavored insight. Speed is good. But speed without taste, scope control, and review creates slop. Gstack tries to preserve speed while adding operating discipline.

the secret sauce

The secret sauce is that Gstack makes the invisible parts of product development explicit.

A normal prompt says: “Build this feature.”

A better workflow asks:

What is the product bet?
What is in scope?
What is intentionally out of scope?
What technical decisions matter?
What design direction should the result follow?
What are the failure modes?
What should be reviewed before implementation?
What tests prove the work is real?

Gstack turns those questions into slash commands and Markdown skills. That makes the process repeatable. Instead of relying on the agent to spontaneously act like a good PM, staff engineer, designer, QA lead, and release engineer, you invoke a workflow that forces the perspective.
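To make the checklist idea concrete, here is a minimal sketch of a brief that refuses to look finished while questions are still blank. It is not Gstack's format: the `PlanningBrief` class, its field names, and the `unanswered` helper are all hypothetical, chosen only to mirror the questions above.

```python
from dataclasses import dataclass, fields


@dataclass
class PlanningBrief:
    """Hypothetical container for the questions a planning pass should answer."""
    product_bet: str = ""
    in_scope: str = ""
    out_of_scope: str = ""
    technical_decisions: str = ""
    design_direction: str = ""
    failure_modes: str = ""
    pre_implementation_review: str = ""
    proving_tests: str = ""


def unanswered(brief: PlanningBrief) -> list[str]:
    """Return the names of the questions that are still blank."""
    return [f.name for f in fields(brief) if not getattr(brief, f.name).strip()]


# A brief that only covers the first two questions.
brief = PlanningBrief(
    product_bet="Shorter onboarding raises activation",
    in_scope="Signup and first-run screens",
)
print(unanswered(brief))
```

The point of the sketch is the shape, not the fields: a weak brief becomes visible as a list of gaps before any code is written.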

The browser layer is another important piece. Gstack includes a persistent Chromium daemon with a compiled CLI. The architecture docs explain the core latency problem: an agent interacting with a browser needs persistent state and sub-second commands. If every browser action cold-starts Chromium, the workflow dies. Gstack keeps Chromium alive and talks to it over localhost HTTP, so repeated browser commands can be fast.

That matters for QA and design. Agents are much better when they can see the product, click through it, take screenshots, and inspect real pages instead of hallucinating from source code alone.
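The persistent-daemon idea can be sketched in a few lines, assuming nothing about Gstack's real implementation. `FakeBrowser` below is a stand-in for Chromium, the JSON request shape is invented for illustration, and the `url` command is hypothetical. What the sketch does show is the pattern: pay the startup cost once, keep the state in a long-lived process, and send each command as a small localhost HTTP request.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeBrowser:
    """Stand-in for an expensive-to-start browser; counts how often it starts."""
    starts = 0

    def __init__(self):
        FakeBrowser.starts += 1
        self.url = None

    def run(self, command, arg=None):
        if command == "goto":
            self.url = arg  # a real browser would load the page here
            return f"navigated to {arg}"
        if command == "url":
            return self.url
        return f"unknown command: {command}"


class CommandHandler(BaseHTTPRequestHandler):
    browser = None  # created once, shared by every request

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        result = self.browser.run(body["command"], body.get("arg"))
        payload = json.dumps({"result": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_daemon():
    CommandHandler.browser = FakeBrowser()  # the one slow start
    server = HTTPServer(("127.0.0.1", 0), CommandHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Ignore proxy environment variables so requests always reach localhost.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def send(server, command, arg=None):
    port = server.server_address[1]
    request = urllib.request.Request(
        f"http://127.0.0.1:{port}/",
        data=json.dumps({"command": command, "arg": arg}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with _opener.open(request) as response:
        return json.loads(response.read())["result"]


server = start_daemon()
send(server, "goto", "https://news.ycombinator.com")
print(send(server, "url"))   # state set by the first command is still there
print(FakeBrowser.starts)    # the browser stand-in was started only once
server.shutdown()
server.server_close()
```

Each `send` call is a separate HTTP request, yet the second one sees the page set by the first. That is the property an agent needs when it clicks through a product one command at a time.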

why it is useful

The most common AI-coding failure is not that the model cannot write code. It is that the model writes code against a weak brief.

You ask for onboarding improvements. It adds a screen, but misses the product logic.

You ask for a refactor. It cleans files, but changes behavior.

You ask for a landing page. It works, but looks like every AI-generated SaaS page.

The failure happened upstream. The task was underspecified, the design target was fuzzy, or the technical tradeoff was never made explicit.

Gstack helps because it creates durable planning artifacts before the implementation begins. In my own Knowledge OS work, the important plans live under per-project Gstack paths in my home directory, such as:

~/.gstack/projects/.hermes/ceo-plans/

Those plans capture scope decisions, architecture diagrams, design notes, accepted/rejected options, and test plans. They are not just chat residue. They become reviewable artifacts.

That changes the feel of agentic development. Instead of “the agent did something, now I need to inspect a diff,” you get “the agent proposed the shape of the work, now I can correct it before the diff exists.”

how to use it

The official repository is:

https://github.com/garrytan/gstack

The development quick start uses Bun:

bun install
bun test
bun run build

The browser tool quick start looks like this:

# build the browser binary
bun install && bun run build

# set the browser CLI path
B=./browse/dist/browse

# drive a page
$B goto https://news.ycombinator.com
$B snapshot -i
$B click @e30
$B text
$B screenshot /tmp/hn.png

But the higher-level way to use Gstack is through the skills and slash commands. In my Knowledge OS, the project context already routes strategy, architecture, design, QA, review, and shipping tasks through commands like:

/plan-ceo-review
/plan-eng-review
/plan-design-review
/design-review
/qa
/review
/ship

The exact command names are less important than the pattern:

Start with scope, not code.
Get a CEO/product review when the goal is fuzzy.
Get an engineering review when architecture matters.
Get a design review before accepting a UI.
Use browser QA to test the real product.
Ship only after the gates pass.
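The pattern above can be sketched as an ordered list of gates that stops at the first failure. This is an illustration of the idea, not how Gstack runs its commands: the `run_gates` helper, the gate names, and the lambda checks are hypothetical placeholders for real reviews.

```python
from typing import Callable

# Each gate is a (name, check) pair; a check returns True when the gate passes.
Gate = tuple[str, Callable[[], bool]]


def run_gates(gates: list[Gate]) -> tuple[bool, list[str]]:
    """Run gates in order and stop at the first one that fails."""
    passed = []
    for name, check in gates:
        if not check():
            return False, passed
        passed.append(name)
    return True, passed


gates = [
    ("scope", lambda: True),
    ("product review", lambda: True),
    ("engineering review", lambda: True),
    ("design review", lambda: False),  # pretend the UI was rejected
    ("browser QA", lambda: True),
]

ok, passed = run_gates(gates)
print("ship" if ok else f"blocked after: {passed}")
```

Stopping early is the design choice that matters: browser QA never runs against a UI that already failed design review, so later gates are not wasted on work that is going to change.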

my experience

Gstack helps most when the task has taste or tradeoffs.

For pure mechanical edits, a coding agent can often just do the work. But when I am scoping a feature, choosing architecture, or trying to make a page feel less generic, Gstack-style review is a force multiplier.

The CEO review helps by asking whether the thing is worth building and what shape it should have. That sounds fluffy until you watch an agent implement the wrong version of a good idea. Scope is leverage.

The engineering review helps by making technical decisions visible. Should this be a one-off script or a durable API? Should we store state in frontmatter, a manifest, a database, or a generated file? Should the first version be simple, or are we baking in a migration later?

The design review helps with the “AI slop” problem. One-shot design output often gets the layout technically right but the taste wrong: too many cards, too much gradient, vague hierarchy, fake polish. A design pass forces the agent to talk about visual hierarchy, spacing, typography, interaction, and the actual feeling of the product.

The biggest benefit is documentation. After a Gstack pass, I am not left only with code. I have the requirement, the decision trail, and the plan. That makes future agent sessions better because there is something concrete to load.

the tradeoff

Gstack is not magic. It adds ceremony.

If the task is tiny, the ceremony can be too much. If you ask for a full review pipeline on every three-line change, you will waste time. The trick is to match the workflow to the risk.

Use lightweight prompts for simple edits. Use Gstack-style planning for product direction, architecture, UI, QA, and releases. The more irreversible or ambiguous the change, the more valuable the gate.
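One rough way to express "match the workflow to the risk" is a function that picks gates from a few signals about the change. The signal names, the gate names, and the mapping below are assumptions made up for this sketch, not rules taken from Gstack.

```python
def gates_for(change: dict) -> list[str]:
    """Pick review gates from rough risk signals; the mapping is illustrative."""
    gates = []
    if change.get("ambiguous_goal"):
        gates.append("product review")
    if change.get("touches_architecture"):
        gates.append("engineering review")
    if change.get("touches_ui"):
        gates.extend(["design review", "browser QA"])
    if change.get("hard_to_reverse"):
        gates.append("release review")
    return gates


# A three-line fix gets no ceremony; a risky UI change gets several gates.
print(gates_for({}))
print(gates_for({"touches_ui": True, "hard_to_reverse": True}))
```

An empty list is a valid answer here. For a tiny mechanical edit, a plain prompt is the right amount of process.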

The mental model is simple: agents are very fast interns with occasional flashes of senior judgment. Gstack gives them a management structure.

That is why it works. It does not merely make the agent smarter. It makes the process around the agent less stupid.