No marketing fluff. If you're wondering whether Scaffold OS is the right tool for you — this page will give you a straight answer.
Scaffold OS is a coordination protocol — a structured, closed-source engine that transforms how AI agents approach software development. Instead of giving an AI a vague description of what you want and hoping for the best, Scaffold OS runs agents through a defined sequence: a structured brainstorm to extract architecture, a pre-build audit to catch contradictions before code is written, and a build plan where every step has explicit entry conditions and verifiable completion signals.
As of v4.9, Scaffold OS is also the canonical owner of workflow state — maintaining machine-readable records for project status, decisions, review readiness, and deploy readiness that any surface can consume without parsing internal files. As of v5.1, it's a reusable protocol platform: multiple products and surfaces can sit on top of the engine without requiring access to its internal implementation.
The result is that complex, multi-system software projects — the kind that require five engineers over three months under normal conditions — can be built in a fraction of the time with a fraction of the team. Not because the AI is smarter, but because the structure forces it to think before it acts.
Scaffold OS is currently in transition from a proprietary protocol system to a SaaS platform (launching soon). The system is a coordination engine — a structured set of protocols, planning files, and session state management that runs on top of AI coding agents.
You don't install it like a traditional app, and it's not a simple prompt you paste into ChatGPT. It's a complete build methodology that changes how the entire session is structured, what gets documented, and how build decisions get verified.
Solo developers and small teams who want to build production-grade systems that would normally take a full engineering team. If you're technical enough to know what you want to build but have struggled to get AI coding tools to actually produce it consistently — Scaffold OS is for you.
Non-technical founders building SaaS products. The optional infrastructure layer and technical-level detection in the brainstorm protocol mean you don't need to know Docker or PostgreSQL configuration to get a working backend.
Freelancers and agencies building complex client software. The multi-project management features on the roadmap are designed specifically for this use case.
Scaffold OS is not for people who want to generate a one-page landing page or a simple CRUD app. For small, simple projects — regular AI coding tools are probably enough.
Scaffold OS is stack-agnostic by design. The protocol governs how builds happen — not what tech stack is used. In practice, it works best with the most common production stacks:
Backend: Node.js, Python (FastAPI, Django, Flask), Go, Ruby
Frontend: React, Next.js, Vue, Svelte
Databases: PostgreSQL, MySQL, MongoDB, Supabase, Firebase
Cloud: AWS, GCP, Vercel, Railway, Render
ML/AI: PyTorch, TensorFlow, LangChain, Hugging Face, vector databases
The stack is selected during the brainstorm phase and locked into the architecture file. Every subsequent session builds against that spec — no drift, no confusion about what technology to use.
Cursor and Copilot are code completion and editing tools. They're excellent at helping you write code faster in a file you already have open. They don't know what your overall system architecture should look like, they don't track which features are built vs. pending, and they don't prevent you from changing a shared API interface without updating everything that depends on it.
Scaffold OS is a build coordination system. It operates at the project level, not the file level. It manages architecture decisions before code is written, tracks the state of each feature area across sessions, enforces contracts between systems, and prevents the build drift that makes long-running AI-assisted projects collapse.
The honest framing: Cursor and Copilot help human developers write code faster. Scaffold OS orchestrates AI agents to build entire systems from specification to production.
Devin is an autonomous AI agent that takes a task description and attempts to complete it independently. The demo results are impressive. The real-world results on complex projects are mixed — because autonomous agents without structured protocols run into the same problems: they lose context, they make architectural decisions mid-build that contradict earlier decisions, and they produce code that works in isolation but fails in integration.
Scaffold OS's approach is different: rather than maximizing autonomy, it maximizes the quality of the protocol the agents execute. The agent doesn't decide what the architecture should be — you and the system decide together in a structured brainstorm. The agent doesn't decide when a step is complete — a real verification command decides that. The agent doesn't decide how to handle a contract change — an 8-step protocol handles that.
The result is more consistent, more auditable, and more controllable than fully autonomous agents — which matters when you're building production systems, not demos.
If you're building small, well-scoped projects (landing pages, simple CRUD apps, single-endpoint APIs) — you probably don't need Scaffold OS. Existing tools are great for that.
The pattern that Scaffold OS solves is: you start a complex project with an AI tool, it goes well for the first session, starts losing coherence by session 3, and by session 6 you're fighting against the code that was generated in session 2 rather than building new features.
If that pattern sounds familiar — multi-session projects, multi-system architectures, external platform integrations, teams of more than one person working on the same codebase — that's the exact problem Scaffold OS was built to solve.
The most expensive failure mode in software development isn't technical — it's building the right product for the wrong audience, or a perfectly engineered solution to a problem nobody actually has. Previous versions of Scaffold OS produced thorough architecture documents without ever challenging whether the product was worth building.
v4.1 adds three mandatory questions before product discovery even begins:
Status Quo: What do target users do today without this product? Walk through their actual current process.
Desperate Specificity: Describe one specific person who would pay for this right now. Not a user segment — a person.
Narrowest Wedge: What is the absolute minimum version that would be genuinely useful to that person this week?
The narrowest wedge answer becomes the reference point for every scope decision throughout the project. When REDUCTION planning mode is active, features are measured against it. When scope creep is suggested, it's challenged against it.
If the user can't answer these questions specifically — gaps are flagged and recorded, not silently accepted. The plan proceeds with explicit notes about what's unvalidated.
After the challenge phase and before the architecture is locked, the system argues against its own plan — presenting 2–3 specific, honest objections to the approach. These aren't generic "have you considered security?" warnings. They're targeted: "Your primary user is X, but this part of the architecture is designed for Y — that mismatch will cost you in v2." Or: "The plan doesn't address the hardest problem you identified — it works around it."
You then respond: unchanged (objections don't change anything), adjusted (something should change), or partial. The objections and your response are recorded in the session record. You can't skip the adversarial review — but you can respond with "no changes" and move on immediately.
The purpose isn't to block you. It's to make sure the strongest case against the plan has been heard and considered before weeks of building commit you to it. Architecture decisions made before code is written cost nothing to change. The same decisions made in week 3 can cost everything.
Before generating planning files, you choose how the engine approaches your plan. There are three modes:
HOLD SCOPE (default): Make what's in the architecture bulletproof exactly as specified. No scope additions, maximum rigor. Use this when you've thought through the architecture carefully and want the engine to harden it, not expand it.
EXPANSION: Surface what's missing. After generating required files, the engine runs a pass identifying gaps — unconsidered edge cases, missing error paths, features you might want — and asks if you want to add them. Declined suggestions are still recorded in the backlog. Use this early in a project when you want a thorough first pass.
REDUCTION: Cut ruthlessly. Features not required to prove the narrowest wedge are moved to the backlog's Deferred section and removed from Phase 1. Everything is cross-referenced against the minimum viable version. Use this when you're over-scoped and need to get to a shippable v1 fast.
Drift happens when what's actually built diverges from what was specified. In a long-running project, a feature might be partially implemented, then a later build step changes how a shared interface works, and now the earlier feature is silently broken — but no one knows because no system is tracking it.
Scaffold OS tracks every feature area against its specification using a three-state system: IN_SYNC (built matches spec exactly), DRIFTING (divergence detected, within tolerance), or DIVERGED (significant departure from spec, build blocked until resolved). This check runs at the start of every session — before any new code is written.
The practical impact: bugs that would have been discovered at integration time (when they're expensive to fix) are caught at session start (when they're cheap to fix).
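As an illustration, the three-state check described above can be sketched in Python. The state names come from the source; the classification logic, the tolerance value, and every function name here are assumptions for illustration, not the engine's closed-source internals:

```python
from enum import Enum

class DriftState(Enum):
    IN_SYNC = "IN_SYNC"      # built matches spec exactly
    DRIFTING = "DRIFTING"    # divergence detected, within tolerance
    DIVERGED = "DIVERGED"    # significant departure; build blocked

def classify_feature(spec_fields: set, built_fields: set,
                     tolerance: int = 1) -> DriftState:
    """Toy classifier: compare declared vs. implemented interface fields."""
    missing = spec_fields - built_fields   # promised but not built
    extra = built_fields - spec_fields     # built but never specified
    divergence = len(missing) + len(extra)
    if divergence == 0:
        return DriftState.IN_SYNC
    if divergence <= tolerance:
        return DriftState.DRIFTING
    return DriftState.DIVERGED

def session_start_gate(features: dict) -> list:
    """Run the drift check before any new code is written; return blockers."""
    return [name for name, (spec, built) in features.items()
            if classify_feature(spec, built) is DriftState.DIVERGED]
```

The point of the sketch is the ordering: classification runs at session start, so a DIVERGED feature blocks the session before new work compounds the divergence.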
Any time a shared interface changes — an API endpoint signature, a database schema field, an event payload format — the 8-step contract change protocol fires automatically. The steps are: (1) stage the change, (2) scan all systems that depend on this interface, (3) publish an alert to every dependent, (4) open a 24-hour acknowledgment window, (5) coordinator review, (6) update each dependent system, (7) audit to confirm consistency, (8) resume build.
No code that depends on a changed interface ships until every dependent system has been updated and the change has been audited. This is what prevents the "we changed the API and now the frontend is broken" problem that makes multi-system projects painful.
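The eight steps above can be sketched as a small state machine that refuses out-of-order transitions and blocks the final resume until every dependent has acknowledged. The class, step names, and enforcement logic are illustrative assumptions, not the proprietary implementation:

```python
CONTRACT_CHANGE_STEPS = [
    "stage_change", "scan_dependents", "publish_alert", "ack_window",
    "coordinator_review", "update_dependents", "audit", "resume_build",
]

class ContractChange:
    """Toy state machine: the eight steps must complete strictly in order."""
    def __init__(self, interface: str, dependents: list):
        self.interface = interface
        self.dependents = dependents
        self.completed = []
        self.acks = set()

    def acknowledge(self, dependent: str) -> None:
        self.acks.add(dependent)

    def complete(self, step: str) -> None:
        expected = CONTRACT_CHANGE_STEPS[len(self.completed)]
        if step != expected:
            raise RuntimeError(f"out of order: expected {expected!r}, got {step!r}")
        if step == "resume_build" and self.acks != set(self.dependents):
            raise RuntimeError("cannot resume: unacknowledged dependents remain")
        self.completed.append(step)

    @property
    def shippable(self) -> bool:
        return self.completed == CONTRACT_CHANGE_STEPS
```

Modeling the protocol as a sequence rather than a checklist is what gives the "nothing ships until every dependent is updated" guarantee: there is simply no path to `resume_build` that skips the acknowledgment and audit steps.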
The session recovery protocol handles this automatically. When a new session starts, the engine reads the latest recovery summary and progress markers first, then reconstructs the exact build status from current workflow state and pending decisions. It resumes from the next incomplete step instead of blindly replaying already-finished work.
No re-explanation required. No guessing about what was done in the previous session. The build continues from the exact next action with the right context already prepared. This is one of the most underrated features in production use: crashes happen, and without it every crash loses significant context.
Scaffold OS uses a 3-tier audit architecture that runs continuously — not just before releases.
Tier 1 — Spot Audit (5–10 min): Five rapid checks that run automatically after every build cycle as part of continuous mode. Catches regressions in spec consistency, contract integrity, security flags, test coverage, and dependency drift before they compound into bigger problems.
Tier 2 — Focused Audit (30–60 min): Six-dimension deep investigation triggered automatically when Spot Audit flags risk, before a major feature merge, or when contract changes occur across systems. Covers data flow tracing, cross-system consistency, security end-to-end, agent dry-run simulation, spec/dependency cross-validation, and fix verification.
Tier 3 — Full Pre-Build Audit (2–4 hrs): Two-round 13-dimension gate that runs before major build phases. Round 1 covers 7 dimensions (internal consistency, completeness, verification command quality, security model, build sequence logic, setup executability, feature spec quality). Round 2 covers 6 dimensions (data flow trace, Round 1 fix verification, spec/dependency cross-validation, agent dry-run, security end-to-end, machine column audit). Build begins only when both rounds return a CONSISTENT verdict.
Tier selection is automatic — the continuous mode engine decides which tier to invoke based on what changed in the last cycle. Architecture problems caught at Tier 3 cost nothing to fix. The same problems caught at session 8 can cost days.
Scaffold OS was built to handle the projects that raw AI prompting can't reliably produce. Specifically:
Full-stack SaaS products — backend API, frontend application, authentication, database, payment processing, and email — all in one coordinated build, not as isolated pieces.
Enterprise systems — Salesforce implementations, Snowflake data warehouses, dbt transformation pipelines, AWS infrastructure — built as tracked artifacts with the same contract enforcement as application code.
ML/AI products — training pipelines, evaluation gates, RAG systems, agentic architectures, fine-tuning workflows — with ML-specific drift tracking separate from code drift.
Automation-first systems — n8n workflow automation, webhook-driven architectures, scheduled job orchestration — as first-class build outputs, not afterthoughts.
Existing codebases — archaeology mode reverse-engineers what's already built, reconstructs the planning files, and brings the codebase under full protocol management without rewriting anything.
Code Archaeology is the dedicated v5.3 flow for importing an existing repository before any build work begins. It reads the codebase, infers the architecture from what was actually built, and reconstructs a complete set of planning files: architecture specification, schema reconstruction, feature area folders, and API contract registry.
Every inferred decision is marked with a [VERIFY] flag, so you can review what the system reconstructed against what you know to be true before continuing.
Once the archaeology session completes, the codebase is under full v5.3 protocol management: drift detection, contract enforcement, health scoring, structured resumes, and build planning from that point forward. Your existing codebase effectively gets a modern planning system retrofitted onto it.
It depends heavily on project complexity. A simple SaaS with standard auth, CRUD API, and basic frontend might complete the core in 3–5 build sessions. A complex multi-system product (backend + frontend + payment processing + automation + cloud infrastructure + external platform integrations) might take 15–25 sessions across a few weeks.
What Scaffold OS changes isn't just the speed — it's the predictability. Traditional AI-assisted builds often hit a wall at session 5–8 where progress slows dramatically because earlier decisions are fighting new ones. With Scaffold OS, session velocity stays consistent because the protocol prevents the architectural decay that makes late sessions slow.
Scaffold OS is currently in early access — we're onboarding teams directly before the public SaaS launch. Request access via the form on the homepage and we'll reach out directly.
Early access gives you: direct communication with the founding team, input on which features and build surfaces to prioritize, priority support during your builds, and preferred pricing locked in before public launch.
Pricing is actively being figured out — we'd rather get this right than rush it. We have a dedicated pricing exploration page where we're thinking through the options transparently. The short version: it will be usage-based in some form, with the goal of aligning cost closely with value delivered (complexity of what you're building, not just raw compute).
Early access teams are being offered preferred pricing that locks in before public launch. See the pricing exploration page →
We're exploring this as part of the pricing model. Our current thinking is that a free tier makes sense for getting familiar with the protocol (first project, limited build surfaces), with paid tiers for production use. Nothing is finalized yet.
If you join early access now, you'll have direct input on what the trial/free tier looks like and guaranteed access to it when it launches.
v4.8 introduced the first-class Update Cycle — a named, structured model for post-build work. Before v4.8, changes after launch were managed as a rolling backlog. There was no formal record of what each change round involved, where it stood, or what it produced beyond session files.
The Update Cycle changes this: each round of post-launch changes is now a defined cycle with its own scope, workspace, and durable record. That record covers the request, context, plan, build, review, audit, and result — so every change round is independently traceable from start to finish.
In practice: when a build is live and you want to add a feature or fix a bug, the engine opens a named Update Cycle rather than just continuing in an open-ended session. The cycle runs to completion, gets archived with its full record, and the next change round starts fresh — with no ambiguity about what the previous round established.
Before v4.9, any product or surface built on top of Scaffold OS had to infer project state by parsing scattered session files — which meant making assumptions about internal structure that could break when the engine evolved. This was a real problem for anyone building integrations or dashboards on top of the engine.
v4.9 fixes this by making Scaffold OS the canonical authoritative owner of workflow state. The engine now maintains compact, machine-readable records for every meaningful workflow signal: what phase the project is in, what the current health score is, what decisions are pending, whether the build is ready for review, whether it's ready to deploy, and what the session execution contract looks like.
The practical effect: any surface — a dashboard, an integration, an automation — can read current project state from one place, in a stable format, without parsing session files or making guesses about internal structure. Workflow state is no longer inferred. It's declared by the engine directly.
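A hypothetical sketch of what consuming declared state might look like from a surface's side. The record shape and field names are invented for illustration; the engine's actual schema is not public:

```python
import json

# Hypothetical shape of a declared workflow-state record.
state_record = {
    "phase": "build",
    "health_score": 84,
    "pending_decisions": ["choose email provider"],
    "review_ready": False,
    "deploy_ready": False,
}

def read_state(raw: str) -> dict:
    """A surface consumes declared state instead of parsing session files."""
    state = json.loads(raw)
    required = {"phase", "health_score", "pending_decisions",
                "review_ready", "deploy_ready"}
    missing = required - state.keys()
    if missing:
        raise ValueError(f"malformed state record, missing: {sorted(missing)}")
    return state

dashboard_view = read_state(json.dumps(state_record))
```

The validation step is the whole idea in miniature: a surface checks one stable contract, rather than reverse-engineering internal files that may change between engine versions.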
v5.1 makes Scaffold OS a reusable protocol foundation that multiple products and surfaces can sit on top of. Before v5.0, the engine was designed around a single-surface model — one surface consuming the protocol. v5.1 adds the infrastructure needed to support multiple surfaces, products, and integrations simultaneously.
Concretely, v5.1 adds: a target registry (canonical manifest of all declared build targets with per-target readiness), an integration registry (per-integration health and config tracking), multi-track work queue management (parallel workstreams without collision), a lifecycle event history (wrapper-safe, append-only event stream any surface can query), and visibility and packaging controls (fine-grained rules for what leaves the project boundary).
The wrapper execution contract is also extended in v5.1 — surface identity, capability declarations, queue position, and event cursor metadata are now part of the contract. This means any wrapper surface can declare what it can do, where it is in the lifecycle, and what events it has already consumed. The engine handles the rest.
Scaffold OS is a closed-source, proprietary protocol engine. The internal implementation — the protocol files, skill definitions, artifact formats, and workflow state schemas — is not publicly available and not open source.
What is public is the capability surface: what Scaffold OS can build, how the protocol phases work conceptually, what the quality gate system enforces, and what the current engine capabilities are. Everything on this site describes the engine from the outside in — what it does and how you interact with it, not how it's internally implemented.
This is intentional. The engine's protocol design is the proprietary element of real value. Making it available publicly would commoditize the approach, which we're not willing to do. If you're evaluating Scaffold OS, the right question is whether the outputs and the guarantee structure are worth the access cost — not whether you can read the source files.
Scaffold OS uses multiple model tiers through an internal routing engine — the right class of model is selected automatically based on what the current phase requires. Architecture and reasoning-heavy phases (brainstorm, challenge, audit) use frontier-class reasoning models. Code generation and scaffolding phases use faster, lower-cost tiers optimized for output velocity.
We don't expose specific model names because we route across providers and update the routing as better models become available. You get the best available capability for each task without needing to think about model selection.
External platform connections happen through MCP (Model Context Protocol) servers — a standardized, bidirectional connection layer that lets agents interact directly with external systems. When Salesforce is declared as a build target in the architecture, the agent connects to it via an MCP server and creates the declared schema objects directly, with live confirmation.
For platforms that require browser-based management (dashboards, admin consoles), browser automation handles the verification step — the agent navigates to the platform's interface and confirms what was built matches the spec. This is what separates a real build target from an integration you merely hope works.
v4.6 introduced a formal domain skills system that replaces the older "specialist roles" framing. The key shift: instead of a fixed list of roles the protocol always runs, Scaffold OS now activates skills selectively based on your project type.
There are 36 curated skills across four categories — Planning, Specialist, Quality Gate, and Domain — that are matched to your project using one of 11 project profiles (SaaS Product, Enterprise Platform, Mobile App, ML Platform, etc.). When your project is detected as an ML Platform, ML-specific skills activate automatically. When it's a SaaS product, the SaaS-specific skills load instead.
Beyond the 36 curated skills, there's an extended catalog of 1,300+ on-demand skills across 81 classified groups that agents can request from a central skill server when they need specialist capability not included in the core set. None of these reference internal tool names or file paths — they're all capability descriptions focused on outcomes.
Quality gate skills enforce PASS / ADVISORY / FAIL outcomes rather than just flagging issues. If a skill returns FAIL, the build is blocked. ADVISORY results can be overridden with explicit acknowledgment. PASS clears normally. See the full skills system →
At the end of the brainstorm phase, the protocol matches your project to one of 11 project profiles based on the architecture you described — SaaS Product, ML Platform, E-Commerce, Mobile App, etc. The matched profile determines which domain skill loads automatically (e.g., the SaaS Product Specialist for a SaaS build, the ML Platform Specialist for an ML build).
Beyond the profile-matched domain skill, planning skills run automatically for every project (demand validation, adversarial review, failure path mapping). Specialist skills activate when the architecture calls for them — if your project uses a database, the Database Architect activates. If it touches external APIs, the Integration Specialist activates.
Quality gate skills run at defined checkpoints. Which gates are active and what their thresholds are is declared in your architecture document before the build starts — so there are no surprise blocks mid-build.
The build stops. Not pauses — stops. A FAIL verdict from a quality gate means the declared standard wasn't met and the build cannot continue until it is resolved. The agent surfaces exactly what failed and what needs to change, but it does not proceed to the next step.
ADVISORY verdicts are different — they surface an issue but allow you to acknowledge it and continue if you have a reason to. The acknowledgment is logged with a timestamp and your stated reason. This creates an audit trail of every override — nothing is silently bypassed.
PASS verdicts continue the build normally. Most gates return PASS on well-structured projects with a complete architecture spec.
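The three-verdict semantics can be sketched directly. The outcomes are from the description above; the function, gate names, and log shape are illustrative assumptions:

```python
import time
from enum import Enum

class Verdict(Enum):
    PASS = "PASS"
    ADVISORY = "ADVISORY"
    FAIL = "FAIL"

audit_log: list = []

def apply_verdict(gate: str, verdict: Verdict, ack_reason: str = None) -> bool:
    """Return True if the build may continue. FAIL blocks unconditionally;
    ADVISORY requires an explicit, logged acknowledgment; PASS clears."""
    if verdict is Verdict.FAIL:
        return False  # build stops, not pauses, until resolved
    if verdict is Verdict.ADVISORY:
        if ack_reason is None:
            return False  # advisory without acknowledgment does not proceed
        # every override is recorded with a timestamp; nothing is silently bypassed
        audit_log.append({"gate": gate, "reason": ack_reason, "ts": time.time()})
        return True
    return True
```

Note the asymmetry: a FAIL has no override path at all in this sketch, while an ADVISORY trades continuation for a permanent audit-trail entry.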
No. The skills system is fully automatic. You describe your project in the brainstorm phase exactly as you always have. The protocol detects the project type, matches it to a profile, and loads the relevant skills without any configuration required from you.
The only decision you make is confirming or adjusting the detected profile at the end of brainstorm — a single question with a recommended answer. If the auto-detected profile is correct, you confirm it. If it's slightly off, you select the right one. That's the full extent of skills configuration.
The skills themselves run in the background. You won't see "skill invoked" messages in the middle of a build — you'll just notice that the architecture review is more thorough, quality gates actually block bad moves, and specialist knowledge appears exactly when it's needed.
Each role represents a distinct cognitive mode the system switches into at the appropriate phase. You don't manually invoke them — the protocol activates the right role based on what's being built.
When the architecture is being designed: Solutions Architect and AI Engineer mode. When security decisions are being validated: Security Engineer mode. When the database schema is being built: Data Engineer mode. When ML pipelines are detected: ML/AI Engineer mode. When an existing codebase is being read: Code Archaeologist mode.
In practice this means: the same session that plans the architecture, designs the database, writes the security model, and scaffolds the ML pipeline is doing so with a different cognitive approach for each — not just one "general coding assistant" mode applied uniformly to everything.
Yes. Your architecture files, build plans, session state, and code are your own. Scaffold OS is a coordination protocol running on top of your environment — your project data stays in your repository and your local planning folder. We will publish a clear data handling and privacy policy before the public SaaS launch.
After every build session, Scaffold OS computes a 0–100 health score for the project based on four inputs: sync drift (features that have diverged from their spec cost the most points), code drift (features that are drifting but not yet broken cost fewer), open decision debt (unresolved decisions from past sessions), and session staleness (time elapsed since the last active work session).
The score is always available as a single, machine-readable signal. Any surface built on top of the engine reads this signal directly — there's no recomputation happening on the surface side. The formula is canonical and consistent across every project and every surface. A score of 80+ is considered healthy for active projects. Below 60 typically means accumulated drift or significant open decisions that need resolving.
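A toy version of such a scoring function, using the four inputs described above. The relative ordering (diverged features cost the most, drifting features less) is from the source; the specific weights and caps are invented for illustration, and the canonical formula is internal to the engine:

```python
def health_score(diverged: int, drifting: int,
                 open_decisions: int, days_stale: int) -> int:
    """Toy 0-100 project health score from the four described inputs."""
    score = 100
    score -= diverged * 15        # sync drift: spec-divergent features cost most
    score -= drifting * 5         # code drift: drifting but not yet broken
    score -= open_decisions * 4   # open decision debt from past sessions
    score -= min(days_stale, 30)  # session staleness, capped
    return max(score, 0)
```

Under these invented weights, a project with two diverged features, one drifting feature, one open decision, and three stale days scores 58 — below the healthy threshold, which matches the intuition that diverged features dominate the signal.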
The health score is supplemented by a complexity signal generated before the first build session — a structured estimate of how large the build will be: feature count, integration count, estimated number of sessions needed, and a complexity tier (low/medium/high/extreme). This gives you realistic expectations before any code is written.
Yes. In v5.3, one brainstorm question sets the project's git strategy: GitHub Flow, trunk-based development, GitFlow, main-only, or a custom model. The engine can recommend the common default when nothing unusual is detected, but you still confirm the final branching model before the project uses it everywhere.
From that point, deploy steps, PR creation, and release tagging are personalized to your declared model. If your project uses GitHub Flow or another PR-required setup, the engine drafts the full pull request content at the end of each feature session: title, summary, files changed, testing instructions, and merge conditions. The surface layer submits it; the engine drafts it. Nothing is hardcoded on the surface side for CI/CD behavior.
Release note drafts work the same way: when release tagging is enabled, a structured release note draft is generated at build completion, including version numbers formatted according to the project's declared tagging format.
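A minimal sketch of what an engine-drafted pull request payload might look like, covering the five pieces listed above. The function and field names are hypothetical, chosen only to show the engine-drafts, surface-submits split:

```python
def draft_pull_request(feature: str, summary: str, files: list,
                       tests: list, merge_conditions: list) -> dict:
    """Assemble a PR draft; a wrapper surface would submit this payload."""
    return {
        "title": f"feat: {feature}",
        "body": "\n\n".join([
            summary,
            "Files changed:\n" + "\n".join(f"- {f}" for f in files),
            "Testing:\n" + "\n".join(f"- {t}" for t in tests),
            "Merge conditions:\n" + "\n".join(f"- {m}" for m in merge_conditions),
        ]),
    }
```

The draft is plain data, which is the design point: any surface can carry it to GitHub, GitLab, or anywhere else without the engine knowing provider-specific APIs.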
v5.4 adds a dedicated context layer. Instead of treating uploads as one-off brainstorm attachments, the engine registers each source, tracks how it was interpreted, and prepares a compact trusted summary for later sessions. That means the next run can resume from normalized context instead of starting cold.
For spreadsheets specifically, the engine can now inspect workbook shape, named ranges, formula relationships, and likely automation patterns before planning begins.
Existing repositories now produce richer continuity signals during archaeology. Instead of only receiving a narrative summary, product surfaces can work with hotspots, ownership clues, dependency risk, entry points, and clearer extend-versus-rebuild guidance.
That makes existing-codebase onboarding feel more honest and more usable, especially when a team is inheriting a system that already has meaningful history and technical debt.
Many businesses already encode their real workflow inside spreadsheets. Earlier systems could read those files, but they mostly looked like attachments. v5.4 treats them more structurally, which helps the engine infer process shape, data flow, reporting logic, and likely automation needs earlier.
In practice, that means smarter brainstorm questions and a better chance of turning a spreadsheet-driven manual workflow into a well-scoped software system.
Yes. v5.4 adds an engine-owned secret scan artifact written before any git push or wrapper-managed publish action. If a blocking secret is detected, the engine pauses the workflow and surfaces that state explicitly instead of letting a release proceed silently.
If optional security tooling is available during audit, the engine can also write a structured security scan artifact for wrapper UI and review workflows.
No. The five-flow model stays intact. What changes is the preflight layer before those flows proceed. The engine understands more context up front, so each flow starts from better material.
That makes the release broad without being disruptive: teams get better intake, better continuity, and safer release behavior without relearning project entry from scratch.
v5.5 moves more deployment judgment into the engine itself. Instead of leaving delivery semantics to each surface, the protocol can now describe what should ship, which environment expectations matter, what checks should run, and what result the surface needs to hand back.
That makes deployment follow-through more consistent across wrappers without pretending the engine itself has to own every provider-specific implementation detail.
Yes. v5.5 makes rollback planning more explicit. The engine can now carry a clearer rollback path as part of delivery follow-through, which is much safer than leaving recovery logic entirely to product-side guesswork.
It also stays honest about limits: some recovery work can be coordinated well, while some high-risk data changes still need deliberate human oversight.
Recommendation mode is the new next-version planning workflow added in v5.5. Instead of opening a freeform conversation about "what should we build next," the engine can study shipped work, backlog pressure, health, prior decisions, and open product gaps, then rank the strongest next-version options.
In practice, that means a live product can ask for its next best move and receive structured candidates with one clear recommendation instead of a loose brainstorm.
Because after a release as broad as v5.5, the main remaining risk is false confidence. v5.6 exists to prove the environment, setup path, and release-readiness checks are actually trustworthy before real project work begins.
It makes the platform more serious operationally: less setup drift, less hidden confusion, and a much clearer answer to "are we truly ready to start?"
Yes, just in a different way. Users and wrapper teams get clearer environment checks, stronger release validation, and a cleaner first-run setup path. Those are not flashy UI features, but they meaningfully reduce wasted time and bad starts.
For serious projects, that matters as much as a brand-new workflow surface because it decides whether the engine can be trusted before the work even begins.
Reach out and we'll walk through your use case directly.