Product Anatomy
Core Read
Garry's system is best understood as one product with multiple visible surfaces, not three unrelated repos. The product is an AI-native operating system for high-leverage builders:
- gstack gives the agent a work process.
- gbrain gives the agent durable memory.
- gbrain-evals proves the memory and workflow claims.
- YC turns the proof into distribution, recruiting, founder affinity, and cultural authority.
The product is not "skills." The product is a way for one high-agency operator to convert intent into shipped software, preserved context, measured learning, and public proof.
Product Stack
| Layer | Product Part | User Sees | System Actually Does | Value Created |
|---|---|---|---|---|
| 1 | Narrative / belief layer | README story, contribution graphs, Karpathy/OpenClaw references, YC credibility | Reframes AI coding from autocomplete into company-building leverage | Makes the user willing to try a new operating model |
| 2 | Workflow layer | gstack slash commands | Encodes product, design, engineering, QA, security, release, and retro roles | Turns AI labor into a managed sprint |
| 3 | Browser/action layer | browse, QA, screenshots, sidebar, pair-agent | Gives agents eyes, clicks, shared tabs, cookies, and verifiable UI evidence | Moves from text-only claims to real-world inspection |
| 4 | Safety/gate layer | review, cso, careful, freeze, guard, ship checks | Adds stop points, approvals, confidence thresholds, and scoped edits | Makes faster work survivable |
| 5 | Memory layer | gbrain pages, search, MCP tools, source citations | Persists pages/chunks/links/sources/embeddings/timelines across sessions | Makes work compound instead of reset |
| 6 | Skillpack behavior layer | gbrain skills | Tells agents when to read, write, cite, enrich, verify, publish, and ask | Turns a database into operational memory |
| 7 | Eval/proof layer | benchmark reports and scorecards | Runs public and in-house memory benchmarks with artifacts and caveats | Makes claims falsifiable |
| 8 | Distribution/funnel layer | MIT repo, install commands, OpenClaw/host adapters, YC hiring/apply CTAs | Converts OSS attention into users, contributors, candidates, and founders | Builds movement and talent flow |
Product Parts By User Job
| User Job | Product Part That Handles It | Why That Part Exists |
|---|---|---|
| "Help me think through what to build." | gstack office-hours, plan-ceo-review, autoplan | The first failure in AI coding is often building the wrong thing faster. |
| "Turn this into a plan I can trust." | plan-eng-review, plan-design-review, plan-devex-review | The system forces taste, architecture, DX, failure modes, and tests into the open. |
| "Build without wrecking my repo." | careful, freeze, guard, investigate, review | AI's speed increases the blast radius of bad assumptions. |
| "Show me what actually happens in the app." | browse, open-gstack-browser, qa, qa-only, design-review | Screenshots, clicks, console errors, and repro steps beat model assertions. |
| "Ship this safely." | ship, setup-deploy, land-and-deploy, canary, benchmark | Shipping is not done until tests, PR, deploy, and production verification are handled. |
| "Remember what we learned." | learn, context-save, context-restore, setup-gbrain, sync-gbrain | The productivity gain collapses if every session starts cold. |
| "Make my knowledge usable by agents." | gbrain pages/chunks/sources/MCP/query | Knowledge must become queryable, citable, source-scoped, and writable. |
| "Make research and notes operational." | gbrain ingest, enrich, meeting-ingestion, briefing, book-mirror, data-research | The system turns raw material into decisions, timelines, and daily prep. |
| "Prove the memory works." | gbrain-evals LongMemEval, BrainBench, reports | Without evals, memory quality is vibes. |
| "Join or contribute to the movement." | MIT license, host adapters, ClawHub skills, YC hiring/apply CTAs | The artifact becomes recruiting and distribution. |
The User-Facing Product Surfaces
1. README As Product
The README is not just documentation. It is the landing page, credibility asset, demo, onboarding script, competitive argument, and recruiting funnel.
Its jobs:
- establish why the category matters
- prove Garry uses it himself
- show the magical moment quickly
- route different user types into the right install path
- normalize team adoption
- expand across hosts
- recruit contributors and YC candidates
2. Slash Commands As Product UI
The command names are the interface:
- /office-hours means product thinking
- /autoplan means reviewed plan
- /review means staff engineer pass
- /qa means browser evidence and fixes
- /ship means PR-ready release path
- /setup-gbrain means continuity
The naming is strong because the commands are memorable role metaphors. The user does not need to learn an abstract API; they hire a specialist.
3. Skill Files As Product Logic
The skills are product logic written in markdown. They encode:
- when to fire
- what to inspect
- what to refuse
- when to ask the user
- what output shape to produce
- what quality bar must be met
This is why the system feels like process, not prompting.
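A skill file of this kind can be pictured as follows. This is an illustrative sketch only: the filename, frontmatter fields, and section headings are hypothetical, not gstack's actual schema.

```markdown
---
name: qa
description: Verify UI behavior in a real browser before claiming a fix works.
trigger: after any change that touches user-facing views   # when to fire
---

## Inspect
- Open the affected page, click through the changed flow, capture screenshots.
- Check the console for errors introduced by the change.

## Refuse
- Do not mark QA complete from code reading alone; require browser evidence.

## Ask the user
- If the expected behavior is ambiguous, ask before testing against a guess.

## Output
- A pass/fail verdict, screenshots, and repro steps for any failure.
```

Each section maps to one of the bullets above: trigger, inspection targets, refusal conditions, escalation rules, and output shape.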
4. CLI/MCP As Product Infrastructure
gbrain's CLI and MCP surfaces turn the brain into typed tools. This matters because durable memory cannot rely on natural-language shell-outs alone.
Important product choices:
- same operation contracts across CLI/MCP/HTTP
- source-scoped reads and writes
- local-only restrictions for risky file operations
- OAuth for remote access
- redacted request logs
- keyword fallback when embeddings are unavailable
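The last bullet, keyword fallback, can be sketched minimally. This is an illustration of the idea, not gbrain's implementation; the function and data shapes are hypothetical.

```python
import re
from collections import Counter

def keyword_search(query, chunks, top_k=3):
    """Rank stored text chunks by keyword overlap with the query.

    A stand-in for vector search when embeddings are unavailable:
    tokenize, count occurrences of shared terms, return the best chunks.
    """
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    query_terms = set(tokenize(query))
    scored = []
    for chunk in chunks:
        term_counts = Counter(tokenize(chunk))
        # Score = total occurrences of query terms in the chunk.
        score = sum(term_counts[t] for t in query_terms)
        if score > 0:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]

chunks = [
    "Decided to ship the canary release after QA passed.",
    "Meeting notes: embeddings pipeline is down, use keyword fallback.",
    "Grocery list: milk, eggs, bread.",
]
results = keyword_search("embeddings fallback", chunks)
```

The point of the fallback is graceful degradation: recall quality drops, but the memory surface never goes dark just because one dependency did.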
5. Browser As Product Surface
The browser layer is a major product surface because it makes agent output inspectable.
It supports:
- screenshots
- real clicks
- authenticated sessions
- visual QA
- devex testing
- canary monitoring
- shared-agent browsing
- scraping-to-skill workflows
The product insight: a coding agent without a browser is blind to a large part of software quality.
6. Benchmarks As Product Surface
gbrain-evals makes proof public. The benchmark docs are not internal QA artifacts; they are market-facing credibility.
They communicate:
- what gbrain is good at
- where vector retrieval is enough
- where graph/source-aware retrieval matters
- what competitors claim
- where metrics are not comparable
- which parts still need work
Hidden Infrastructure
| Hidden Part | Why It Matters |
|---|---|
| Source scoping | Prevents personal/team/client/source contamination. |
| Chunking and embeddings | Determines what the agent can recall and cite. |
| Code graph edges | Moves coding memory beyond grep into symbol relationships. |
| MCP operation contracts | Prevents tool schema drift and lets clients call memory safely. |
| OAuth/scopes/logging | Makes remote memory access possible without total trust. |
| Skill routing | Keeps the agent from loading everything all the time. |
| Context save/restore | Protects long-running work from session resets. |
| Eval candidates/artifacts | Converts real use and benchmarks into reviewable proof. |
| Host adapters | Lets the method travel beyond Claude Code. |
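Source scoping from the table above can be illustrated with a minimal sketch (all types and names are hypothetical): memory reads only ever see chunks whose source is in the caller's allowed set, and the filter runs before any matching.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str  # e.g. "personal", "team", "client-acme"

def scoped_search(query, chunks, allowed_sources):
    """Return matching chunks, but only from sources the caller may read.

    Filtering by source *before* matching is what prevents personal,
    team, and client context from contaminating each other.
    """
    visible = [c for c in chunks if c.source in allowed_sources]
    return [c for c in visible if query.lower() in c.text.lower()]

memory = [
    Chunk("Acme launch slipped to Q3.", "client-acme"),
    Chunk("Team decided to gate the launch on canary metrics.", "team"),
    Chunk("Personal note: launch party ideas.", "personal"),
]

# A team-scoped agent asking about "launch" never sees client or personal notes.
results = scoped_search("launch", memory, {"team"})
```

The same scoping applies to writes: a write lands in exactly one source, so a later read with a different scope can never surface it.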
Feedback Loops
Loop 1: Build Loop
Idea -> office-hours -> CEO/design/eng/DX review -> implementation -> review -> QA -> ship -> canary -> retro.
This loop makes AI coding feel like an engineering org.
Loop 2: Memory Loop
Work happens -> pages/chunks/links/timelines update -> future sessions search/cite/reuse context -> new decisions are written back.
This loop makes the system compound.
Loop 3: Proof Loop
System changes -> evals run -> benchmark reports update -> public claims become sharper -> contributors know where to improve.
This loop makes the system credible.
Loop 4: Distribution Loop
Public proof -> OSS adoption -> contributors and host adapters -> more use cases -> stronger story -> YC hiring/founder funnel.
This loop makes the system a movement, not just a repo.
Product Requirements By Part
| Part | Must Do | Failure Mode If Missing |
|---|---|---|
| Narrative | Make the user believe the new workstyle is real | Tool looks like another prompt pack |
| Install | Create a fast first win | User churns before experiencing leverage |
| Workflow | Give agents roles, gates, and stop conditions | Parallel work becomes chaos |
| Browser | Produce real UI evidence | Agent claims remain unverified |
| Safety | Constrain destructive or speculative behavior | Faster work creates faster damage |
| Memory | Persist source-scoped, cited context | Every session starts cold |
| Skills | Encode behavior around memory and work | Database stays passive |
| Evals | Measure retrieval and workflow quality | Claims become marketing |
| Funnel | Convert attention into users/contributors/candidates | OSS interest dissipates |
What Is Actually Being Sold
The public artifact is free and MIT licensed, so the sale is not a subscription. What is being sold is belief in an operating model:
- a founder can ship with a virtual team of agents
- the right process matters more than raw model access
- memory is the missing layer between sessions
- evals are necessary to trust agent memory
- YC is where this style of building should compound
That is why the YC CTA belongs inside the repo. The repo is proof of the workstyle YC wants builders to adopt or help build.
Open Product Questions
- What is the true first magical moment: /office-hours, /qa, /ship, or /setup-gbrain?
- Does the full product need a unified dashboard, or does the command-line/README surface remain enough?
- Where does gstack end and gbrain begin from a user's mental model?
- Which memory writes should happen automatically versus require explicit approval?
- Can gbrain-evals become a general benchmark standard, or is it mainly proof for gbrain?
- How much of the YC funnel should be explicit before it starts to feel like recruiting instead of OSS?
- What is the smallest productized version of this system that a normal technical founder can adopt in under 30 minutes?
Ren/OpenClaw Implications
The lesson for Ren is not to copy the exact skill list. The lesson is the product architecture:
- make the operating model visible
- encode judgment as repeatable roles
- give agents browser evidence
- preserve memory as source-scoped, cited context
- publish evals as proof
- turn public work into contribution and recruiting surface
For OpenClaw specifically, the opportunity is to become the orchestration layer where these parts are easier to run together: agent spawning, session continuity, channel delivery, skill routing, browser access, memory writes, and eval evidence.