Market My Spec Case Study: 4 Days, 13 Stories, Zero Prompts From Me

What is Market My Spec?

Market My Spec is a small MCP server that exposes a marketing-strategy skill to Claude Code. One install command connects a user's existing agent to the server, and the agent walks the user through an eight-step strategy flow covering ideal customer profile (ICP), positioning, channels, and content. The MVP ships the connection plumbing: magic-link auth, multi-tenant accounts with an agency tier, OAuth provider scaffolding, two MCP servers, and the skill-content surface the agent reads. The strategy intelligence runs in the user's own Claude Code subscription, with no token markup and no inference resale.

Claude Code: Primary client. Connects with one `claude mcp add` install command.
Magic-link Email: Passwordless sign-up and sign-in via Resend.
OAuth (GitHub, Google): Provider scaffolding for connect-account flows.

Key Features

Marketing-strategy skill delivered to Claude Code over MCP
Eight-step strategy walkthrough: ICP, positioning, channels, content
Multi-tenant accounts with admin-provisioned agency tier
Magic-link sign-up and sign-in (no passwords)
OAuth provider scaffolding for GitHub and Google
BYO-Claude — no token markup, no inference resale

User Stories

A representative slice of the stories that drove this build. Stories that ran the Three Amigos protocol expose their actual persona, rules, scenarios, and resolved questions inline. The records shown are excerpted from the harness; the full set lives in the project.

13 Stories
115 Acceptance Criteria
13 With Three Amigos

Open Source Repository

Market My Spec is open source under Apache 2.0. The harness's own source remains in the CodeMySpec repository, but the product the harness shipped is fully open for inspection, contribution, and self-hosting. Issues and PRs welcome.

View on GitHub

Language: Elixir / Phoenix
License: Apache 2.0
Strategy Storage: MCP-backed records
Auth: OAuth2 + magic link

The Dev Story

The Good, The Bad, and The Ugly

An honest assessment of the second product the harness shipped. The MMS build itself is small: seventeen commits across four days (May 3-6). The harness changes that made the build possible were larger and longer — about 200 commits across the seventeen days between the prior r/elixir post and this one. Three named experiments landed in that window, every one of them with criterion-level Spex covering its evaluator rules.

The Configurable Workflow Shipped Working

Story 671 covers the requirement-graph projector with seventeen criterion-level specs. A ProjectConfiguration row carries require_specs, require_reviews, require_tests, spec_validation, and qa_validation per project. The knobs gate module specs, spec reviews, and unit tests — BDD specs are not configurable; they are the spine of the default loop. Toggle require_specs off and the module-spec nodes drop out of the graph cleanly. Toggle require_reviews off and implementation splices onto the spec-valid edge directly. Toggle them on and off and the graph round-trips. The configurable per-component workflow stopped being a slogan the moment seventeen tests landed against it.
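
The toggle behavior described above can be sketched roughly as follows. This is an illustration in Python, not the harness's Elixir projector. The flag names come from the text; the node names, the default values, and the assumption that a spec review only exists when a module spec does are hypothetical. spec_validation and qa_validation are left out because their effect on the graph is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectConfiguration:
    # Flag names are from the case study; the defaults are an assumption.
    require_specs: bool = True
    require_reviews: bool = True
    require_tests: bool = True


def project_graph(config: ProjectConfiguration) -> list[tuple[str, str]]:
    """Return one component's requirement chain as (prerequisite, dependent) edges.

    Node names are hypothetical. BDD specs are always present because
    the text says they are not configurable.
    """
    nodes = ["bdd_specs"]
    if config.require_specs:
        nodes += ["module_spec", "spec_valid"]
        # Assumption: a review needs a module spec to review.
        if config.require_reviews:
            nodes.append("spec_review")
    nodes.append("implementation")
    if config.require_tests:
        nodes.append("unit_tests")
    # Link each node to the next one in the chain.
    return list(zip(nodes, nodes[1:]))
```

With reviews off, the sketch links spec_valid straight to implementation; with specs off, the module-spec nodes disappear and bdd_specs feeds implementation directly. Because the projection is a pure function of the configuration, toggling a flag off and on again yields the original graph.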

Three Amigos Caught Real Ambiguity

Story 678 (multi-tenancy) is the cleanest example. The agent ran the Three Amigos protocol, surfaced four open questions about agency-vs-individual UX instead of guessing, and parked them on the story. I answered in plain English. The agent then added a new rule capturing the product decision I had just made ("agency accounts are admin-provisioned only, no self-service") plus a matching scenario, and only then closed out the readiness gate. Without the gate, that ambiguity would have flowed straight into BDD specs.

Boundary Protection Installed Clean On First Run

On May 1 the spex_boundary_ready task ran for the first time on Market My Spec. It installed two framework Credo checks (deny stdlib calls and direct send/2 from inside specs), generated a project-local Credo deny list by reading architecture/proposal.md and naming seven internal contexts (Repo, Mailer, Vault, Users, Integrations, McpAuth, Skills), scaffolded the curated fixtures bridge, and wrote the project BDD plan. Four artifacts in one task, zero hand-edits.
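
To make the deny-list idea concrete, here is a rough Python sketch of what a project-local check is meant to catch. It is a plain text scan, not a Credo check (real Credo checks are written in Elixir and inspect the AST). The seven context names are the ones listed above; the `MarketMySpec` namespace, the function names, and the sample spec lines are assumptions for illustration.

```python
import re

# The seven internal contexts named in the case study.
INTERNAL_CONTEXTS = [
    "Repo", "Mailer", "Vault", "Users", "Integrations", "McpAuth", "Skills",
]


def build_deny_pattern(app_namespace: str, contexts: list[str]) -> re.Pattern:
    """Compile a pattern matching references such as `App.Repo` in spec source."""
    names = "|".join(re.escape(name) for name in contexts)
    return re.compile(rf"\b{re.escape(app_namespace)}\.({names})\b")


def find_violations(spec_source: str, pattern: re.Pattern) -> list[tuple[int, str]]:
    """Return (line number, denied context) for the first hit on each offending line."""
    hits = []
    for lineno, line in enumerate(spec_source.splitlines(), start=1):
        match = pattern.search(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits
```

A spec line that reaches into `MarketMySpec.Repo` would be flagged, while a line that drives a page through the web surface, or calls a differently named fixtures module, would pass.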

Zero Prompts, Orchestration Only

The headline number is honest with one footnote: I orchestrated. I picked which subagent ran when, called retries, and occasionally said "do it yourself, no subagent on this one." The writing was the agent's, top to bottom. The 0/0 stat (zero prompts, zero lines of code from me) is the result. The orchestration framing is the truth that prevents the headline from reading as marketing.

The Harness Started Specifying Itself

Story 553 landed within nine hours of the prior r/elixir post going live. Seven criterion specs against the harness's own ProjectConfiguration pipeline. By May 5 the harness was running roughly two hundred criterion-level Spex against its own internal modules, with cassette-backed pipeline fixtures for the validation paths and a release CI that fails on warnings. The framework that builds Phoenix applications now has the framework's own BDD coverage as its backstop.

The BDD Specs Got Sloppy On A Less-Careful Model

Story 609 (magic-link sign-up) shipped on May 1 via the BDD-only path: Three Amigos, six BDD criterion specs, twelve implementation files, no module specs and no unit tests. The model writing the specs that day was not Opus. Two days later, when Opus was available again, I spun up a fresh session with the brief "You are reviewing every BDD spec file in this project. The specs were written by a less-careful model and need a thorough audit." All six spec files needed the sweep. The cleanup wasn't gated by a config flip — BDD specs are not currently configurable in the workflow. The lever was the model itself: getting Opus back was what made the audit possible. The honest version of "knob, not deletion" is that the workflow knobs cover module specs and spec reviews and unit tests, not BDD specs, and model selection is a separate axis nobody has automated yet.

Boundary Protection Has No Live-Catch Transcript Yet

The four artifacts are in place. The framework Credo checks are active. The project-local check denies the right modules. The fixtures bridge is the only sanctioned door. The BDD plan is written. What's missing is a transcript moment where Credo fires live on a violation and the agent has to back up and try again. The artifacts work as designed. The live catch will get staged in the demo video; until then, the boundary-protection claim stands on installation evidence and design intent, not on a real-time catch.

The Mixed-Session Corpus

Forty-seven Claude Code sessions covered the build window. Twenty had build-side signal. All twenty were also marketing-operations sessions, because Market My Spec is the marketing tool we use to do CodeMySpec's marketing. There are no "build-only" sessions in the corpus, so transcript mining for this case study had to happen inside each session, at the level of individual working directories (cwd) and tool calls. Every excerpt went through privacy review. No customer data leaked, but the operational cost was real.
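
Filtering a mixed session down to its build-side events might look something like the Python sketch below. The event shape (a dict with `cwd` and `tool` keys), the function names, and the paths in the usage note are hypothetical; they are not the actual transcript format or the tooling used for this case study.

```python
from pathlib import PurePosixPath


def is_build_event(event: dict, project_root: str) -> bool:
    """True when the event is a tool call made from inside the project tree."""
    cwd = event.get("cwd")
    if not cwd or not event.get("tool"):
        return False
    root = PurePosixPath(project_root)
    path = PurePosixPath(cwd)
    # Compare path components so a sibling directory that merely shares
    # a name prefix with the project is not counted.
    return path == root or root in path.parents


def build_side_events(events: list[dict], project_root: str) -> list[dict]:
    """Keep only the events that count as build-side signal."""
    return [event for event in events if is_build_event(event, project_root)]
```

Given a session that mixes edits under a project root such as `/work/market_my_spec` with writes under a separate marketing directory, only the former survive the filter. The privacy review of each excerpt remains a manual step.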

The Three Amigos Gate Is Structural, Not Semantic

The readiness check confirms records exist: at least one persona linked, at least one rule, every rule has at least one scenario, scenarios outnumber open questions. It does not confirm the rules are right. Quality lives in the agent's prompt and the human PM's review. Whether a one-PM Three Amigos session can substitute for a real workshop with a human dev and a human QA is an open question, not a closed one.
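
The four structural conditions can be expressed as a short check. The Python sketch below is an illustration under assumed record shapes (`Story`, `Rule`, and their fields are hypothetical names), not the harness's evaluator. Nothing in it inspects what a rule actually says, which is exactly the limitation described above.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    text: str
    scenarios: list[str] = field(default_factory=list)


@dataclass
class Story:
    personas: list[str] = field(default_factory=list)
    rules: list[Rule] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)


def readiness_failures(story: Story) -> list[str]:
    """Return the structural conditions the story fails; an empty list means ready."""
    failures = []
    if not story.personas:
        failures.append("no persona linked")
    if not story.rules:
        failures.append("no rules")
    for rule in story.rules:
        if not rule.scenarios:
            failures.append(f"rule without scenario: {rule.text}")
    scenario_count = sum(len(rule.scenarios) for rule in story.rules)
    # Scenarios must strictly outnumber the open questions.
    if scenario_count <= len(story.open_questions):
        failures.append("scenarios do not outnumber open questions")
    return failures
```

A story with one persona, one rule, two scenarios, and one open question passes. Park three more questions on it and the gate stays closed until they are resolved or more scenarios are written.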

What We Changed After Market My Spec

The Layers Are Knobs, Not Doctrine

MetricFlow taught us that the six-phase pipeline ran most of its layers for ceremony rather than to catch real problems. Market My Spec made the layers configurable instead of removing them. Module specs, spec reviews, and unit tests are still available; they're one config flip away. The default path is the shortest one that ships working software, and you opt into ceremony as the surface stabilizes. The harness rewires its own requirement graph when the toggles change.

Example Mapping Belongs Inside The Harness

The MetricFlow conclusion called Example Mapping the missing step. Market My Spec made it a real agent task with twelve criterion-level specs against the readiness rules, hosted in the harness, runnable from Claude Code through MCP. Persona, Rules, Scenarios, Questions, all persisted as records, not flat files. The records make the gate machine-checkable. The agent calls evaluate_task when it thinks the conditions are met. The PM holds product intent.

Specs Should Drive The User's Surfaces, Not The Internals

The MetricFlow Potemkin Village failure mode was the coding agent and the QA agent collaborating to ship broken functionality. The fix is structural: a sealed boundary that forces specs to drive the same LiveView, controller, and MCP surfaces a real user drives. Repo is denied. Direct send/2 is denied. System.cmd is denied. The fixtures bridge is the only door, and it is grep-able. The harness refuses to walk into boundary-shaped work until the gate is green.

Orchestration Is The Next Bottleneck

I didn't write prompts. I didn't write code. I did pick which subagent ran when, call retries, and occasionally route around a bad path. That is the work that's left, and it is the work the next experiment has to absorb. Auto-orchestration through the requirement graph is the obvious next step. We are not there yet.

Four days. Seventeen commits. Thirteen stories. Zero prompts. Zero lines of code from me. The harness shipped Market My Spec the same way it shipped MetricFlow, but this time without me typing into the chat. The fix wasn't more automation. It was knobs on the layers, a real gate before the specs, and a sealed boundary around them. Configurable workflow, Three Amigos, and boundary protection. Three changes, one MVP, one open question about orchestration. That's the loop that ships.
