Software Factory
Status note
This engineering note captures repo-specific agent, review, and verification patterns shaping VRDex delivery. It is not public product documentation or a universal contributor mandate.
Promote only proven reusable patterns into basics-agentic-dogfooding or global agent context. Keep speculative workflow ideas here until they have repeated value.
Locked decisions
- VRDex is an opinionated repository, not a neutral sandbox.
- Safe routine progress should continue without repeatedly asking for permission.
- Global repo conventions belong in AGENTS.md.
- Personal/operator preferences belong in AGENTS.local.md, which should remain gitignored.
- Infrequent onboarding/setup material should live in docs and a repo onboarding skill, not in every-session AGENTS.md context.
- Durable markdown should live under docs/ rather than accumulating at the repo root.
Current recommendation
- treat product design and software-factory design as parallel workstreams
- when an agent takes a poor path or asks a low-value question, finish the immediate task and then capture the process fix before moving on
- bias toward stronger human and agent onboarding so new sessions converge quickly on repo norms
- prefer discoverable, cross-linked artifacts over chat-only decisions
- avoid turning local process experiments into public product promises
Rigorous, not prescriptive
Locked decision
- VRDex should be rigorous about review, verification, and documentation expectations.
- VRDex should not force every contributor to use one specific agent, model, or editor workflow.
Current recommendation
- optimize for compatibility with multiple agents so long as they can operate inside the repo's review and verification system
- use the repo's review/recycle loops to catch slop and improve quality without over-policing how people work locally
Global vs local context model
AGENTS.md
Use for:
- repo-wide behavior defaults
- autonomy and commit/push posture
- durable safety rules
- durable workflow expectations every agent should always know
Do not use for:
- long onboarding playbooks
- personal operator preferences
- fast-changing implementation details
AGENTS.local.md
Use for:
- personal communication preferences
- local model preferences
- operator-specific autonomy bias and scratchpad habits
- anything that should not silently become repo policy
Skills
Use for:
- onboarding flows
- repeatable multi-step setup
- model/tool/MCP orientation
- control-loop playbooks that are useful on demand but too large for every session
Repo onboarding skill direction
The VRDex onboarding skill should cover:
- local agent setup expectations
- how this repo separates global policy from local operator preference
- supported agent roles and model-routing expectations
- how to think about agent development in this repo
- software-factory conventions, issue filing conventions, and documentation conventions
- typical development workflow from task intake to verification to merge
Review-recycle loop
Current recommendation
- treat review-recycle as a first-class normal development loop, not an exception
- use a fresh-context reviewer and a recycler that resumes the original implementer context when possible
- trigger recycler work on PR creation, draft->ready transitions, new review comments, failing checks, and mergeability regressions
- triage every outstanding review comment before pushing a follow-up commit on an open PR
- reply or react with disposition before resolving review threads; do not silently resolve rejected or partially applied feedback
- use GitHub Copilot automatic follow-up reviews and CI for ordinary iteration; reserve paid/manual reviewer reruns for substantial change sets
Roles
- implementer: writes the change, runs relevant verification, and keeps enough context to make minimal follow-up patches.
- reviewer: inspects the change from a fresh context and returns source-linked findings, confidence, uncertainty, and test gaps without editing files.
- recycler: triages review findings and failing checks, decides apply/reject/split/ask, patches confirmed issues, reruns verification, and records dispositions.
Candidate Reviewer Sources
- GitHub Copilot: automatic or low-friction PR review and follow-up comments.
- Greptile: paid/manual review for coherent change sets where cost is justified.
- Codex, Claude, or other fresh-context agents: parallel source-linked review lanes outside GitHub when a cold read helps.
- Custom GitHub Action reviewers: deterministic or model-backed checks that produce PR comments or artifacts.
- Custom OpenCode reviewers: repo-local reviewer sessions that can inspect working trees, artifacts, and docs before feedback is reflected into GitHub.
Candidate direction
- run first-pass reviewer/recycler loops outside GitHub when practical, then reflect the result back into GitHub once the branch is in better shape
- allow agents to request reviewer and recycler jobs from the common task pool defined by #50
- encode reviewer source, confidence, false-positive disposition, and recycler outcome as structured metadata when #50 moves from direction to implementation
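The metadata fields named in the last bullet could be encoded roughly as follows. This is a sketch under assumptions: the type names, field names, source labels, and the 0..1 confidence scale are all hypothetical and are not defined by #50.

```typescript
// Hypothetical shapes for illustration; none of these names come from #50.
type ReviewerSource = "copilot" | "greptile" | "fresh-context-agent" | "github-action" | "opencode";
type Disposition = "apply" | "reject" | "split" | "ask";

interface ReviewFinding {
  source: ReviewerSource;
  confidence: number;        // reviewer's self-reported confidence, assumed 0..1
  summary: string;
  falsePositive?: boolean;   // set by the recycler after triage
  disposition?: Disposition; // recycler outcome; undefined until triaged
}

// Share of triaged findings from one source that were judged false positives.
// Untriaged findings are ignored; returns 0 when nothing has been triaged yet.
function falsePositiveRate(findings: ReviewFinding[], source: ReviewerSource): number {
  const triaged = findings.filter(f => f.source === source && f.falsePositive !== undefined);
  if (triaged.length === 0) return 0;
  return triaged.filter(f => f.falsePositive === true).length / triaged.length;
}
```

Keeping the false-positive flag next to the reviewer source is what would let the repo compare reviewer lanes later without rereading PR threads.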
Trigger model
- PR opened ready for review
- draft PR marked ready
- substantial new commit pushed to a PR branch
- baseline check, deploy check, CodeQL, or hosted E2E failure
- new blocking reviewer comment from a human or AI reviewer
- mergeability regression after base branch movement
- stale branch that blocks otherwise-ready merge
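One way to make the trigger list checkable is a discriminated union plus a single predicate. The event names and fields below are hypothetical and only mirror the bullets above; they are not a real GitHub webhook schema.

```typescript
// Hypothetical event shapes mirroring the trigger list above.
type RecycleTrigger =
  | { kind: "pr-opened"; draft: boolean }
  | { kind: "draft-marked-ready" }
  | { kind: "commit-pushed"; substantial: boolean }
  | { kind: "check-failed"; check: "baseline" | "deploy" | "codeql" | "hosted-e2e" }
  | { kind: "review-comment"; blocking: boolean }
  | { kind: "mergeability-regressed" }
  | { kind: "stale-branch"; blocksReadyMerge: boolean };

// Returns true when the event should start reviewer/recycler work.
function shouldRecycle(t: RecycleTrigger): boolean {
  switch (t.kind) {
    case "pr-opened":
      return !t.draft; // only PRs opened ready for review
    case "commit-pushed":
      return t.substantial;
    case "review-comment":
      return t.blocking;
    case "stale-branch":
      return t.blocksReadyMerge;
    case "draft-marked-ready":
    case "check-failed":
    case "mergeability-regressed":
      return true;
  }
}
```

How "substantial" and "blocking" get decided is left to the caller here, since the note does not define either threshold.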
Recycle gate
Before the next recycle push:
- gather all outstanding review comments and failing checks
- decide for each item: apply, reject with reason, split follow-up, or ask one human question
- make the smallest correct patch set
- rerun the relevant verification
- record dispositions in the PR or issue when review context would otherwise be lost
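The gate above can be read as a pre-push check that returns the reasons a push should wait. A minimal sketch, assuming hypothetical `ReviewItem` and `recycleGateBlockers` names and a simple boolean for "verification was rerun":

```typescript
type Disposition = "apply" | "reject" | "split" | "ask";

interface ReviewItem {
  id: string;                // review comment or failing check identifier (hypothetical)
  disposition?: Disposition; // undefined means the item has not been triaged
  reason?: string;           // expected when the disposition is "reject"
}

// Returns the reasons a recycle push should be held back.
// An empty array means every item is triaged and verification was rerun.
function recycleGateBlockers(items: ReviewItem[], verificationRerun: boolean): string[] {
  const blockers: string[] = [];
  for (const item of items) {
    if (item.disposition === undefined) {
      blockers.push(`${item.id}: no disposition`);
    } else if (item.disposition === "reject" && !item.reason) {
      blockers.push(`${item.id}: rejected without a reason`);
    }
  }
  if (!verificationRerun) blockers.push("verification not rerun");
  return blockers;
}
```

Returning reasons instead of a boolean keeps the output usable as a PR comment when the gate fails.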
Orchestrator / supervisor loop
Current recommendation
- add an orchestrator or executive-assistant layer that sits above implementer sessions
- the orchestrator should decide one next action when an implementer stops: continue, ask one human question, dispatch another agent, or mark done
- prefer checkpointed incremental deltas over replaying giant transcripts
- conserve human attention by asking one concrete decision at a time, with the recommended option first when there is a clear default
- keep supervisor messages bounded to task state, blocker, evidence, and next-action choices instead of forwarding full transcripts by default
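A sketch of the "one next action" decision follows. The `StopReport` fields and the rule ordering (open decision first, then unfinished work, then review, then done) are assumptions made for illustration; the note only fixes the four possible outcomes.

```typescript
type NextAction =
  | { kind: "continue" }
  | { kind: "ask"; question: string; options: string[] } // recommended option listed first
  | { kind: "dispatch"; role: "reviewer" | "recycler" | "implementer" }
  | { kind: "done" };

// Hypothetical bounded report an implementer hands up when it stops.
interface StopReport {
  taskComplete: boolean;
  verificationPassed: boolean;
  needsReview: boolean;
  openDecision?: { question: string; recommended: string; alternatives: string[] };
}

// Picks exactly one next action from a bounded report, never from a full transcript.
function decideNext(r: StopReport): NextAction {
  if (r.openDecision !== undefined) {
    const d = r.openDecision;
    return { kind: "ask", question: d.question, options: [d.recommended].concat(d.alternatives) };
  }
  if (!r.taskComplete || !r.verificationPassed) return { kind: "continue" };
  if (r.needsReview) return { kind: "dispatch", role: "reviewer" };
  return { kind: "done" };
}
```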
Candidate direction
- keep implementer sessions persistent and resumable
- treat recycler work as resuming the original implementer session rather than spinning up a brand-new deep-context worker each time
- treat .opencode/plugins/supervisor-loop.ts and .opencode/commands/supervisor.md as local experiment files until restart/tool-discovery behavior is validated in a follow-up issue under #43
Resume policy
- resume the same session when the task needs preserved implementation context, review history, or partial local state
- start a fresh session when the job is independent, benefits from cold review, or needs reduced context bloat
- pass a compact delta package upward: goal, files changed, verification run, blockers, open decisions, and proposed next action
- do not use resumability to hide stale assumptions; reread changed files before editing after a long pause
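The resume-versus-fresh rules above can be sketched as one function. The field names and the 60-minute "long pause" threshold are assumptions; the note does not define how long a pause has to be before files are reread.

```typescript
// Hypothetical description of the job being scheduled.
interface JobContext {
  needsImplementationContext: boolean;
  needsReviewHistory: boolean;
  hasPartialLocalState: boolean;
  wantsColdReview: boolean;
  minutesSinceLastActivity: number;
}

interface SessionPlan {
  mode: "resume" | "fresh";
  rereadChangedFiles: boolean; // guard against stale assumptions after a pause
}

const LONG_PAUSE_MINUTES = 60; // assumed threshold, not taken from the note

function planSession(job: JobContext): SessionPlan {
  const needsContext =
    job.needsImplementationContext || job.needsReviewHistory || job.hasPartialLocalState;
  // Cold review or an independent job gets a fresh session, which reads files anyway.
  if (job.wantsColdReview || !needsContext) return { mode: "fresh", rereadChangedFiles: false };
  return { mode: "resume", rereadChangedFiles: job.minutesSinceLastActivity >= LONG_PAUSE_MINUTES };
}
```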
Delta package template:
- goal and linked issue/PR
- branch, files changed, and verification already run
- blocker or reason the implementer stopped
- open decisions, with recommended option first when possible
- proposed next action: continue, ask, dispatch, recycle, or mark done
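The template above maps naturally onto a typed record with a compact text rendering. The interface and function names below are hypothetical, and the output format is only one possible layout:

```typescript
// Hypothetical typed form of the delta package template.
interface DeltaPackage {
  goal: string;
  link: string;               // linked issue or PR reference
  branch: string;
  filesChanged: string[];
  verificationRun: string[];
  stopReason: string;         // blocker or reason the implementer stopped
  openDecisions: { question: string; options: string[] }[]; // recommended option first
  proposedNext: "continue" | "ask" | "dispatch" | "recycle" | "done";
}

// Renders one short line per field so the supervisor never needs the full transcript.
function renderDelta(d: DeltaPackage): string {
  const lines: string[] = [
    `goal: ${d.goal} (${d.link})`,
    `branch: ${d.branch}`,
    `files: ${d.filesChanged.join(", ") || "none"}`,
    `verified: ${d.verificationRun.join(", ") || "none"}`,
    `stopped: ${d.stopReason}`,
  ];
  for (const o of d.openDecisions) {
    lines.push(`decide: ${o.question} [${o.options.join(" | ")}]`);
  }
  lines.push(`next: ${d.proposedNext}`);
  return lines.join("\n");
}
```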
OpenCode server / task-pool direction
Current recommendation
- move toward a common hosted OpenCode server that acts as a shared task pool
- prefer atomic jobs/tasks over thread-subscribed chats when that improves dispatch, accounting, and re-entry
- track active, idle, completed, and resumable agent sessions as discoverable system state
- distinguish atomic jobs from resumable sessions explicitly
- keep dispatch through an orchestrator path instead of uncontrolled recursive agent spawning
Concepts
- atomic job: a bounded task with a clear input package, expected output, and completion state. Example: review one issue closure, recycle one confirmed finding, or recover one stale PR.
- resumable session: an agent session with useful retained context that can continue implementation, recycle review feedback, or recover mergeability without replaying the whole history.
- task pool/server: the roster and queue layer that tracks jobs, sessions, states, assignments, and re-entry metadata.
- orchestrator request: a controlled request for new work or a resumed session; agents should not recursively spawn uncontrolled work.
Lifecycle states worth preserving:
- requested
- queued
- dispatched
- active
- checkpointed/resumable
- completed, failed, or cancelled
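The states above suggest a small state machine. The note lists the states but not the allowed moves between them, so the transition table below is an assumption made for illustration, as are the names `JobState`, `TRANSITIONS`, and `advance`.

```typescript
type JobState =
  | "requested" | "queued" | "dispatched" | "active"
  | "checkpointed" | "completed" | "failed" | "cancelled";

// Assumed transition table; terminal states have no outgoing moves.
const TRANSITIONS: Record<JobState, JobState[]> = {
  requested: ["queued", "cancelled"],
  queued: ["dispatched", "cancelled"],
  dispatched: ["active", "failed", "cancelled"],
  active: ["checkpointed", "completed", "failed", "cancelled"],
  checkpointed: ["active", "cancelled"], // resuming moves a session back to active
  completed: [],
  failed: [],
  cancelled: [],
};

// Returns the new state, or throws when the move is not in the table.
function advance(from: JobState, to: JobState): JobState {
  if (TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```

Under this table an atomic job would normally go from active straight to a terminal state, while a resumable session may cycle between active and checkpointed.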
Candidate direction
- let agents request new agents through an orchestrator-facing interface instead of directly spawning uncontrolled work
- expose parallelism ceilings, roster visibility, and resume-vs-new-session policy as explicit system controls
- expose task type, required tools, repo path, branch, risk level, and expected verification as dispatch metadata when a follow-up #50 implementation issue exists
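A minimal sketch of an orchestrator-facing request path with a parallelism ceiling follows. The `DispatchRequest` fields follow the metadata listed above, but the class, method names, and queueing behavior are hypothetical and do not describe an existing OpenCode API.

```typescript
// Hypothetical dispatch metadata; field values are illustrative.
interface DispatchRequest {
  taskType: "implement" | "review" | "recycle" | "recover-mergeability";
  requiredTools: string[];
  repoPath: string;
  branch: string;
  riskLevel: "low" | "medium" | "high";
  expectedVerification: string[];
}

class Orchestrator {
  private active = 0;
  private readonly maxParallel: number;
  private readonly queue: DispatchRequest[] = [];

  constructor(maxParallel: number) {
    this.maxParallel = maxParallel;
  }

  // Agents call this instead of spawning work themselves.
  request(req: DispatchRequest): "dispatched" | "queued" {
    if (this.active < this.maxParallel) {
      this.active += 1;
      return "dispatched";
    }
    this.queue.push(req);
    return "queued";
  }

  // Called when a running job finishes. The freed slot goes to the next queued
  // request if there is one; otherwise the active count drops.
  release(): DispatchRequest | undefined {
    const next = this.queue.shift();
    if (next === undefined && this.active > 0) this.active -= 1;
    return next;
  }
}
```

Routing every request through one object is what makes the ceiling enforceable; an agent that spawns work directly would bypass it.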
Mergeability recovery
Current recommendation
- treat mergeability regression as a first-class recycler trigger
- default to resuming the original implementer session when practical
- let automation update from base or resolve straightforward conflicts only when the intended behavior remains clear
- ask for human input before risky conflict resolution that changes product, security, billing, trust, or migration behavior
Recovery loop
1. Detect unmergeable or stale PR state.
2. Gather base branch, changed files, failing checks, and outstanding reviews.
3. Resume the original implementer when context is useful and available.
4. Apply the smallest conflict or stale-branch fix.
5. Rerun affected checks.
6. Leave a concise PR comment if the recovery changed behavior or deferred work.
Detection sources:
- GitHub PR mergeability state or branch protection state
- base/head SHA mismatch indicating a stale branch
- required check failures after base branch movement
- failed update-from-base or merge attempts
- scheduled or webhook-based stale-PR scans once automation exists
Dispatch package:
- PR number, base/head refs, and base/head SHAs
- changed files and conflict files when known
- failing checks and outstanding review threads
- preferred session ID or original implementer identity when available
- risk flags for product, security, billing, trust, migration, or data behavior
Automation boundary examples:
- straightforward: stale base update with no conflicts, formatting-only conflict, or test snapshot conflict with unchanged product behavior
- human required: conflicts that alter product behavior, auth, billing, trust labels, migrations, data retention, or public API contracts
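The dispatch package and the automation boundary can be combined into one routing check. This is a deliberate simplification with hypothetical names: it treats any risk flag on a conflicted PR as requiring a human, and it assumes the flags were set by an earlier triage step that the note does not specify.

```typescript
type RiskFlag = "product" | "security" | "billing" | "trust" | "migration" | "data";

// Hypothetical typed form of the dispatch package listed above.
interface RecoveryDispatch {
  prNumber: number;
  baseRef: string;
  headRef: string;
  baseSha: string;
  headSha: string;
  conflictFiles: string[];
  failingChecks: string[];
  preferredSessionId?: string; // original implementer session when known
  riskFlags: RiskFlag[];
}

// Decides whether automation may attempt the recovery on its own.
function recoveryRoute(d: RecoveryDispatch): "automate" | "ask-human" {
  // A stale base update with no conflicts is the straightforward case.
  if (d.conflictFiles.length === 0) return "automate";
  // Conflicts that touch flagged behavior need a human decision first.
  return d.riskFlags.length > 0 ? "ask-human" : "automate";
}
```

A fuller version would also need a signal for "intended behavior remains clear", which unflagged conflicts do not guarantee.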
Verification loops
VRDex should plan verification as a layered system, not a single test command.
Required layers to design for
- lint and formatting validation
- typecheck/build validation
- unit and integration testing
- end-to-end testing
- screenshot and visual regression review
- VLM review of meaningful UI changes
- validation of scripts and ancillary automation code
- AST/policy checks where structural rules matter
- infrastructure verification for IaC and deployment automation
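To illustrate "layered system, not a single test command", the sketch below runs layers in order and reports a status per layer. The fail-fast behavior and the `Layer` shape are assumptions; real layers would invoke the repo's actual tools instead of returning a boolean directly.

```typescript
interface Layer {
  name: string;
  run: () => boolean; // true when the layer passes (placeholder for a real tool call)
}

interface LayerResult {
  name: string;
  status: "passed" | "failed" | "skipped";
}

// Runs layers in order; once one fails, later layers are reported as skipped.
function runLayers(layers: Layer[]): LayerResult[] {
  const results: LayerResult[] = [];
  let failed = false;
  for (const layer of layers) {
    if (failed) {
      results.push({ name: layer.name, status: "skipped" });
      continue;
    }
    const ok = layer.run();
    if (!ok) failed = true;
    results.push({ name: layer.name, status: ok ? "passed" : "failed" });
  }
  return results;
}
```

Ordering cheap layers such as lint and typecheck before end-to-end or visual review keeps expensive checks from running on changes that are already known to be broken.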
Candidate direction
- feature-ready agents should present a video or screenshot-backed validation package to the human reviewer
- the human checkpoint should happen when the feature is already mergeable, not as a substitute for engineering verification
Definition of ready
Current recommendation
- every non-trivial feature should define how it will be reviewed, verified, rolled out, and measured before implementation begins
- definition-of-ready belongs in engineering/docs discipline, not just in a PM tool
- docs/agentic/definition-of-ready.md is the canonical checklist and issue-snippet reference for this repo
Definition of done
Current recommendation
- every non-trivial feature should close with an explicit done check, not just a claim that implementation landed
- definition-of-done should cover verification completion, documentation updates, rollout posture, and review closure
- docs/agentic/definition-of-done.md is the canonical closeout checklist and handoff-snippet reference for this repo
Feature flags and analytics
Current recommendation
- treat feature flags, experimentation, and product analytics as first-class design concerns for feature work
- default to asking whether a feature should be gated, progressively rolled out, or instrumented
- avoid stacking overlapping platforms too early; prefer one primary system per concern until a real gap appears
- docs/agentic/product-analytics-and-feature-flags.md is the canonical policy for tool roles, rollout posture, and product-signal expectations
LLM and agent observability
Current recommendation
- keep LLM/agent observability separate from product analytics and feature flags
- do not add a dedicated LLM observability platform until traces, evals, prompt quality, cost, or loop diagnostics are hard to manage with current artifacts
- treat Langfuse as the first candidate to evaluate if dedicated traces/evals become necessary
- consider signed action receipts or similar provenance/accountability artifacts as a separate concern from tracing
- do not capture prompt text by default until a redaction, privacy, and retention policy exists
First signals worth capturing
- task goal and issue/PR linkage
- model/agent role at a coarse level
- tool categories used, without secrets
- review findings, false-positive dispositions, and recycler outcomes
- checks run and pass/fail state
- human decisions requested and answered
- cost/latency only when the platform exposes it safely and usefully
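These signals could be recorded as one small record per agent run. The shape below is hypothetical and intentionally has no prompt-text field; cost and latency are optional because they are only captured when the platform exposes them.

```typescript
// Hypothetical per-run record; field names are illustrative.
interface AgentRunSignal {
  goal: string;
  link: string;             // issue or PR linkage
  role: "implementer" | "reviewer" | "recycler" | "orchestrator"; // coarse role only
  toolCategories: string[]; // categories, never arguments or secrets
  checks: { name: string; passed: boolean }[];
  humanDecisions: { question: string; answer?: string }[];
  costUsd?: number;
  latencyMs?: number;
}

// Fraction of recorded checks that passed across runs; 0 when none were recorded.
function checkPassRate(signals: AgentRunSignal[]): number {
  let total = 0;
  let passed = 0;
  for (const s of signals) {
    for (const c of s.checks) {
      total += 1;
      if (c.passed) passed += 1;
    }
  }
  return total === 0 ? 0 : passed / total;
}

// Number of human decisions that were requested but not yet answered.
function unansweredDecisions(signals: AgentRunSignal[]): number {
  let count = 0;
  for (const s of signals) {
    for (const d of s.humanDecisions) {
      if (d.answer === undefined) count += 1;
    }
  }
  return count;
}
```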
Implementation ownership: #45 only chooses direction. Any dedicated tracing platform, prompt capture, eval harness, or signed action receipt implementation needs a follow-up issue under #43.
Boundary
Product analytics answers whether VRDex users are succeeding in the product. LLM/agent observability answers whether repo/product agents are producing reliable work. Do not force both jobs into one telemetry system by default.
Cross-repo promotion model
Current recommendation
- solve repo-specific versions first inside VRDex
- once a pattern proves useful here, promote the generalized version into basics-agentic-dogfooding
- avoid over-generalizing before the repo-specific version has shown real value
Contributor posture
Current recommendation
- newer contributors should be helped by the system rather than forced to infer expectations from tribal knowledge
- reviewer agents and recycle loops should help raise quality without requiring maintainers to hand-police every sloppy draft
- protected branches, contributor roles, and org-level controls should arrive when collaboration volume justifies them
- docs/agentic/contributor-workflow.md is the canonical contributor contract and onboarding pointer for this repo
Backlog direction
Software-factory implementation ideas should be tracked under #43 or linked child issues, separate from product features.