
What it is
A system, not a single AI model.
A software factory is the platform AI coding agents run on. A piece of work is usually handled by an orchestrated group of specialised agents (one for planning, others for implementation, testing, and review) rather than by one general-purpose agent.
A factory is needed because agent use doesn't scale on its own. Without a system around it, many developers each running several agents produce inconsistent code, with no shared standards, no safe way for agents to merge their work, and no control over what the agents can access. The throughput comes from the system around the agents.
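As a rough illustration of specialised agents handing work to one another, here is a minimal sketch. Everything in it is an assumption made for illustration: `WorkItem`, `make_agent`, and `orchestrate` are hypothetical names, and each agent is a stand-in function that only records that its role ran. A real factory would call actual agents and run many items at once.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WorkItem:
    """A piece of work moving through the pipeline (hypothetical shape)."""
    description: str
    history: List[str] = field(default_factory=list)


# Each agent is modelled as a plain function from WorkItem to WorkItem.
Agent = Callable[[WorkItem], WorkItem]


def make_agent(role: str) -> Agent:
    """Build a stand-in agent that only records that its role ran."""
    def run(item: WorkItem) -> WorkItem:
        item.history.append(role)
        return item
    return run


def orchestrate(item: WorkItem, agents: List[Agent]) -> WorkItem:
    """Hand the work item to each specialised agent in order."""
    for agent in agents:
        item = agent(item)
    return item


pipeline = [make_agent(r) for r in ("plan", "implement", "test", "review")]
result = orchestrate(WorkItem("Add rate limiting to the API"), pipeline)
print(result.history)  # ['plan', 'implement', 'test', 'review']
```

The point of the sketch is the shape: the ordering and hand-off live in `orchestrate`, not inside any one agent.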



What a factory is made of
The capabilities every factory is built from.
Six functional capabilities, assembled into one system. Throughput comes from the assembly, not from any single AI model.
Orchestration
Coordinating the work of many agents across many tasks at once.
Isolated environments
A clean, separate sandbox for each agent to run in, so agents don't interfere with each other or with production.
Context & memory
Keeping agents aware of the codebase, its conventions, and past decisions, so their output is consistent and correct.
Validation
Checking output against defined criteria before it ships — tests, evaluations, security scans.
Governance
Controlling what agents can and can't do — permissions, credential handling, and an audit trail of every action.
Learning
Feeding lessons from production back into the system so it improves over time.
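To make one of these capabilities concrete, the sketch below shows a very small version of governance: an allow-list of actions plus an audit trail that records every attempt, permitted or not. The `Governor` and `AuditEntry` names, the action strings, and the agent ID are all hypothetical; credential handling is left out entirely.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Set


@dataclass
class AuditEntry:
    timestamp: str
    agent_id: str
    action: str
    allowed: bool


@dataclass
class Governor:
    """Allow-list of agent actions, plus a log of every attempt."""
    allowed_actions: Set[str]
    audit_log: List[AuditEntry] = field(default_factory=list)

    def authorise(self, agent_id: str, action: str) -> bool:
        allowed = action in self.allowed_actions
        # Record the attempt whether or not it was permitted.
        self.audit_log.append(AuditEntry(
            timestamp=datetime.now(timezone.utc).isoformat(),
            agent_id=agent_id,
            action=action,
            allowed=allowed,
        ))
        return allowed


governor = Governor(allowed_actions={"read_repo", "open_pull_request"})
print(governor.authorise("agent-7", "open_pull_request"))  # True
print(governor.authorise("agent-7", "read_prod_secrets"))  # False
print(len(governor.audit_log))  # 2
```

Denied attempts are logged as well as permitted ones, so the trail shows what agents tried to do, not only what they did.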
How we build it
The MVP, built in vertical slices.
01
Start
Real applications
Start from real work
Pick one or two applications from your estate. The factory is built around real software, not infrastructure in isolation.
02
Team
Two seniors
Two seniors, embedded
Two senior re:cinq engineers work inside your team. Guardrails go in before agents get broad access.
03
Slices
Ship early
Vertical slices
Each slice ships working software rather than months of infrastructure with nothing running. Judged on real outcomes.
04
MVP
~12 weeks
Handover
A 12-week MVP is common; the typical range is 3–6 months. We then hand over the factory for your teams to run and extend.
Reference points
What working factories look like today.
The clearest public examples of working software factories today. They're the reference points for what the pattern does at scale: evidence, not endorsements.
Stripe (Minions): around 1,300 agent-written pull requests a week
Ramp (Inspect): around half of the company's pull requests
StrongDM: code not written or reviewed by humans
OpenAI harness team: dark factory pattern
How autonomy works
Autonomy is earned, not switched on.
When agents produce many times more code than before, humans reviewing every line become the bottleneck. Review shifts from the human to an automated verification system. The more the verification covers, the less a human needs to check — and the more the agents can merge on their own.
The autonomy dimmer
Lights on
Agents draft the code; a human reviews every pull request. Most human oversight.
Dimming
Humans review a sample; routine, well-verified work merges automatically.
Lights out by exception
Work merges on its own. Humans step in only on novel or high-risk changes.
Escalation path
Separately, agents escalate to a human whenever they can't resolve something on their own — an unclear spec, or a change they can't complete safely.
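The dimmer and the escalation path can be read as a routing rule for each change. The sketch below is one possible encoding under stated assumptions: the `Autonomy` enum, the `route_change` function, and its boolean flags (`blocked`, `routine`, `well_verified`, `sampled`, `novel`, `high_risk`) are hypothetical names, and how a real factory decides that a change is "routine" or "high-risk" is not covered here.

```python
from enum import Enum


class Autonomy(Enum):
    LIGHTS_ON = "lights_on"
    DIMMING = "dimming"
    LIGHTS_OUT_BY_EXCEPTION = "lights_out_by_exception"


def route_change(level: Autonomy, *, blocked: bool = False,
                 routine: bool = True, well_verified: bool = True,
                 sampled: bool = False, novel: bool = False,
                 high_risk: bool = False) -> str:
    """Return 'escalate', 'human_review', or 'auto_merge' for one change."""
    # Escalation is separate from the dimmer: an agent that cannot
    # resolve something hands over to a human at any autonomy level.
    if blocked:
        return "escalate"
    if level is Autonomy.LIGHTS_ON:
        # A human reviews every pull request.
        return "human_review"
    if level is Autonomy.DIMMING:
        # Routine, well-verified work merges; sampled changes get a human.
        if routine and well_verified and not sampled:
            return "auto_merge"
        return "human_review"
    # Lights out by exception: humans step in on novel or high-risk work.
    if novel or high_risk:
        return "human_review"
    return "auto_merge"


print(route_change(Autonomy.LIGHTS_ON))                                # human_review
print(route_change(Autonomy.DIMMING))                                  # auto_merge
print(route_change(Autonomy.DIMMING, sampled=True))                    # human_review
print(route_change(Autonomy.LIGHTS_OUT_BY_EXCEPTION, high_risk=True))  # human_review
print(route_change(Autonomy.LIGHTS_OUT_BY_EXCEPTION, blocked=True))    # escalate
```

This only decides who looks at a change. It assumes automated verification still runs on every change before anything merges.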
Preconditions
A factory only works on solid foundations.
The assessment determines whether a team is ready. Education and legacy modernisation build the readiness — usually in parallel with factory development, not before it.
Mature automated testing
Without a real test harness, verification can't earn autonomy — and agents ship broken work at scale.
Reliable CI/CD
The merge gate is the one path to production. Every change, whoever produced it, passes through it.
Well-structured backlogs
Agents work from specifications. Vague backlogs produce vague output at high speed.
Clear specifications
A human writes a clear spec, including the criteria the result must meet. That spec is what the agent loop runs on.
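One way to picture a spec that carries its own criteria, and a loop that runs against it, is sketched below. This is an illustration under assumptions, not a description of any particular factory: `Spec`, `unmet`, `run_loop`, and `toy_agent` are hypothetical, the criteria are simple string checks standing in for real tests, and the toy agent is hard-coded to fix whatever the previous round flagged.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Spec:
    goal: str
    # Each criterion is a named check over the candidate output.
    criteria: Dict[str, Callable[[str], bool]]


def unmet(spec: Spec, candidate: str) -> List[str]:
    """Names of the criteria the candidate does not yet satisfy."""
    return [name for name, check in spec.criteria.items() if not check(candidate)]


def run_loop(spec: Spec, attempt: Callable[[Spec, List[str]], str],
             max_attempts: int = 3) -> Tuple[str, int]:
    """Retry until every criterion passes, else escalate to a human."""
    feedback: List[str] = []
    for n in range(1, max_attempts + 1):
        candidate = attempt(spec, feedback)
        feedback = unmet(spec, candidate)
        if not feedback:
            return ("done", n)
    return ("escalate", max_attempts)


def toy_agent(spec: Spec, feedback: List[str]) -> str:
    # Stand-in for a real agent: fixes whatever the last round flagged.
    if "handles_spaces" in feedback:
        return "def slugify(s): return s.lower().replace(' ', '-')"
    return "def slugify(s): return s.lower()"


spec = Spec(
    goal="Write slugify(s)",
    criteria={
        "defines_function": lambda src: "def slugify" in src,
        "handles_spaces": lambda src: "replace(' ', '-')" in src,
    },
)
print(run_loop(spec, toy_agent))  # ('done', 2)
```

With a vague spec there would be nothing to put in `criteria`, so the loop would have no way to tell a finished result from an unfinished one.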
