Product Anatomy
Core Read
Garry's system is best understood as one product with multiple visible surfaces, not three unrelated repos. The product is an AI-native operating system for high-leverage builders:
- gstack gives the agent a work process.
- gbrain gives the agent durable memory.
- gbrain-evals proves the memory and workflow claims.
- YC turns the proof into distribution, recruiting, founder affinity, and cultural authority.
The product is not "skills." The product is a way for one high-agency operator to convert intent into shipped software, preserved context, measured learning, and public proof.
Product Stack
| Layer | Product Part | User Sees | System Actually Does | Value Created |
|---|---|---|---|---|
| 1 | Narrative / belief layer | README story, contribution graphs, Karpathy/OpenClaw references, YC credibility | Reframes AI coding from autocomplete into company-building leverage | Makes the user willing to try a new operating model |
| 2 | Workflow layer | gstack slash commands | Encodes product, design, engineering, QA, security, release, and retro roles | Turns AI labor into a managed sprint |
| 3 | Browser/action layer | browse, QA, screenshots, sidebar, pair-agent | Gives agents eyes, clicks, shared tabs, cookies, and verifiable UI evidence | Moves from text-only claims to real-world inspection |
| 4 | Safety/gate layer | review, cso, careful, freeze, guard, ship checks | Adds stop points, approvals, confidence thresholds, and scoped edits | Makes faster work survivable |
| 5 | Memory layer | gbrain pages, search, MCP tools, source citations | Persists pages/chunks/links/sources/embeddings/timelines across sessions | Makes work compound instead of reset |
| 6 | Skillpack behavior layer | gbrain skills | Tells agents when to read, write, cite, enrich, verify, publish, and ask | Turns a database into operational memory |
| 7 | Eval/proof layer | benchmark reports and scorecards | Runs public and in-house memory benchmarks with artifacts and caveats | Makes claims falsifiable |
| 8 | Distribution/funnel layer | MIT repo, install commands, OpenClaw/host adapters, YC hiring/apply CTAs | Converts OSS attention into users, contributors, candidates, and founders | Builds movement and talent flow |
Product Parts By User Job
| User Job | Product Part That Handles It | Why That Part Exists |
|---|---|---|
| "Help me think through what to build." | gstack office-hours, plan-ceo-review, autoplan | The first failure in AI coding is often building the wrong thing faster. |
| "Turn this into a plan I can trust." | plan-eng-review, plan-design-review, plan-devex-review | The system forces taste, architecture, DX, failure modes, and tests into the open. |
| "Build without wrecking my repo." | careful, freeze, guard, investigate, review | AI's speed increases the blast radius of bad assumptions. |
| "Show me what actually happens in the app." | browse, open-gstack-browser, qa, qa-only, design-review | Screenshots, clicks, console errors, and repro steps beat model assertions. |
| "Ship this safely." | ship, setup-deploy, land-and-deploy, canary, benchmark | Shipping is not done until tests, PR, deploy, and production verification are handled. |
| "Remember what we learned." | learn, context-save, context-restore, setup-gbrain, sync-gbrain | The productivity gain collapses if every session starts cold. |
| "Make my knowledge usable by agents." | gbrain pages/chunks/sources/MCP/query | Knowledge must become queryable, citable, source-scoped, and writable. |
| "Make research and notes operational." | gbrain ingest, enrich, meeting-ingestion, briefing, book-mirror, data-research | The system turns raw material into decisions, timelines, and daily prep. |
| "Prove the memory works." | gbrain-evals LongMemEval, BrainBench, reports | Without evals, memory quality is vibes. |
| "Join or contribute to the movement." | MIT license, host adapters, ClawHub skills, YC hiring/apply CTAs | The artifact becomes recruiting and distribution. |
The User-Facing Product Surfaces
1. README As Product
The README is not just documentation. It is the landing page, credibility asset, demo, onboarding script, competitive argument, and recruiting funnel.
Its jobs:
- establish why the category matters
- prove Garry uses it himself
- show the magical moment quickly
- route different user types into the right install path
- normalize team adoption
- expand across hosts
- recruit contributors and YC candidates
2. Slash Commands As Product UI
The command names are the interface:
- /office-hours means product thinking
- /autoplan means reviewed plan
- /review means staff engineer pass
- /qa means browser evidence and fixes
- /ship means PR-ready release path
- /setup-gbrain means continuity
The naming is strong because the commands are memorable role metaphors. The user does not need to learn an abstract API; they hire a specialist.
3. Skill Files As Product Logic
The skills are product logic written in markdown. They encode:
- when to fire
- what to inspect
- what to refuse
- when to ask the user
- what output shape to produce
- what quality bar must be met
This is why the system feels like process, not prompting.
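A skill file of this kind can be pictured as follows. This is an illustrative sketch only: the filename, frontmatter fields, and section headings are hypothetical, not gstack's actual schema.

```markdown
---
name: qa
description: Verify UI behavior in a real browser before claiming a fix works.
trigger: after any change that touches user-facing views   # when to fire
---

## Inspect
- Open the affected page, click through the changed flow, capture screenshots.
- Check the console for errors introduced by the change.

## Refuse
- Do not mark QA complete from code reading alone; require browser evidence.

## Ask the user
- If the expected behavior is ambiguous, ask before testing against a guess.

## Output
- A pass/fail verdict, screenshots, and repro steps for any failure.
```

Each section maps to one of the bullets above: trigger, inspection targets, refusal conditions, escalation rules, and output shape.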
4. CLI/MCP As Product Infrastructure
gbrain's CLI and MCP surfaces turn the brain into typed tools. This matters because durable memory cannot rely on natural-language shell-outs alone.
Important product choices:
- same operation contracts across CLI/MCP/HTTP
- source-scoped reads and writes
- local-only restrictions for risky file operations
- OAuth for remote access
- redacted request logs
- keyword fallback when embeddings are unavailable
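The last bullet, keyword fallback, can be sketched minimally. This is an illustration of the idea, not gbrain's implementation; the function and data shapes are hypothetical.

```python
import re
from collections import Counter

def keyword_search(query, chunks, top_k=3):
    """Rank stored text chunks by keyword overlap with the query.

    A stand-in for vector search when embeddings are unavailable:
    tokenize, count occurrences of shared terms, return the best chunks.
    """
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    query_terms = set(tokenize(query))
    scored = []
    for chunk in chunks:
        term_counts = Counter(tokenize(chunk))
        # Score = total occurrences of query terms in the chunk.
        score = sum(term_counts[t] for t in query_terms)
        if score > 0:
            scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]

chunks = [
    "Decided to ship the canary release after QA passed.",
    "Meeting notes: embeddings pipeline is down, use keyword fallback.",
    "Grocery list: milk, eggs, bread.",
]
results = keyword_search("embeddings fallback", chunks)
```

The point of the fallback is graceful degradation: recall quality drops, but the memory surface never goes dark just because one dependency did.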
5. Browser As Product Surface
The browser layer is a major product surface because it makes agent output inspectable.
It supports:
- screenshots
- real clicks
- authenticated sessions
- visual QA
- devex testing
- canary monitoring
- shared-agent browsing
- scraping-to-skill workflows
The product insight: a coding agent without a browser is blind to a large part of software quality.
6. Benchmarks As Product Surface
gbrain-evals makes proof public. The benchmark docs are not internal QA artifacts; they are market-facing credibility.
They communicate:
- what gbrain is good at
- where vector retrieval is enough
- where graph/source-aware retrieval matters
- what competitors claim
- where metrics are not comparable
- which parts still need work
Hidden Infrastructure
| Hidden Part | Why It Matters |
|---|---|
| Source scoping | Prevents personal/team/client/source contamination. |
| Chunking and embeddings | Determines what the agent can recall and cite. |
| Code graph edges | Moves coding memory beyond grep into symbol relationships. |
| MCP operation contracts | Prevents tool schema drift and lets clients call memory safely. |
| OAuth/scopes/logging | Makes remote memory access possible without total trust. |
| Skill routing | Keeps the agent from loading everything all the time. |
| Context save/restore | Protects long-running work from session resets. |
| Eval candidates/artifacts | Converts real use and benchmarks into reviewable proof. |
| Host adapters | Lets the method travel beyond Claude Code. |
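Source scoping from the table above can be illustrated with a minimal sketch (all types and names are hypothetical): memory reads only ever see chunks whose source is in the caller's allowed set, and the filter runs before any matching.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str  # e.g. "personal", "team", "client-acme"

def scoped_search(query, chunks, allowed_sources):
    """Return matching chunks, but only from sources the caller may read.

    Filtering by source *before* matching is what prevents personal,
    team, and client context from contaminating each other.
    """
    visible = [c for c in chunks if c.source in allowed_sources]
    return [c for c in visible if query.lower() in c.text.lower()]

memory = [
    Chunk("Acme launch slipped to Q3.", "client-acme"),
    Chunk("Team decided to gate the launch on canary metrics.", "team"),
    Chunk("Personal note: launch party ideas.", "personal"),
]

# A team-scoped agent asking about "launch" never sees client or personal notes.
results = scoped_search("launch", memory, {"team"})
```

The same scoping applies to writes: a write lands in exactly one source, so a later read with a different scope can never surface it.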
Feedback Loops
Loop 1: Build Loop
Idea -> office-hours -> CEO/design/eng/DX review -> implementation -> review -> QA -> ship -> canary -> retro.
This loop makes AI coding feel like an engineering org.
Loop 2: Memory Loop
Work happens -> pages/chunks/links/timelines update -> future sessions search/cite/reuse context -> new decisions are written back.
This loop makes the system compound.
Loop 3: Proof Loop
System changes -> evals run -> benchmark reports update -> public claims become sharper -> contributors know where to improve.
This loop makes the system credible.
Loop 4: Distribution Loop
Public proof -> OSS adoption -> contributors and host adapters -> more use cases -> stronger story -> YC hiring/founder funnel.
This loop makes the system a movement, not just a repo.
Product Requirements By Part
| Part | Must Do | Failure Mode If Missing |
|---|---|---|
| Narrative | Make the user believe the new workstyle is real | Tool looks like another prompt pack |
| Install | Create a fast first win | User churns before experiencing leverage |
| Workflow | Give agents roles, gates, and stop conditions | Parallel work becomes chaos |
| Browser | Produce real UI evidence | Agent claims remain unverified |
| Safety | Constrain destructive or speculative behavior | Faster work creates faster damage |
| Memory | Persist source-scoped, cited context | Every session starts cold |
| Skills | Encode behavior around memory and work | Database stays passive |
| Evals | Measure retrieval and workflow quality | Claims become marketing |
| Funnel | Convert attention into users/contributors/candidates | OSS interest dissipates |
What Is Actually Being Sold
The public artifact is free and MIT licensed, so the sale is not a subscription. What is being sold is belief in an operating model:
- a founder can ship with a virtual team of agents
- the right process matters more than raw model access
- memory is the missing layer between sessions
- evals are necessary to trust agent memory
- YC is where this style of building should compound
That is why the YC CTA belongs inside the repo. The repo is proof of the workstyle YC wants builders to adopt or help build.
Open Product Questions
- What is the true first magical moment: /office-hours, /qa, /ship, or /setup-gbrain?
- Does the full product need a unified dashboard, or does the command-line/README surface remain enough?
- Where does gstack end and gbrain begin from a user's mental model?
- Which memory writes should happen automatically versus require explicit approval?
- Can gbrain-evals become a general benchmark standard, or is it mainly proof for gbrain?
- How much of the YC funnel should be explicit before it starts to feel like recruiting instead of OSS?
- What is the smallest productized version of this system that a normal technical founder can adopt in under 30 minutes?
Ren/OpenClaw Implications
The lesson for Ren is not to copy the exact skill list. The lesson is the product architecture:
- make the operating model visible
- encode judgment as repeatable roles
- give agents browser evidence
- preserve memory as source-scoped, cited context
- publish evals as proof
- turn public work into contribution and recruiting surface
For OpenClaw specifically, the opportunity is to become the orchestration layer where these parts are easier to run together: agent spawning, session continuity, channel delivery, skill routing, browser access, memory writes, and eval evidence.