Local evals with your IDE agent are the starting point. CI integration is the next step: it makes regression detection automatic, gates merges on passing checks, and creates an audit trail for every prompt change.

When to upgrade from local to CI

| Signal | What it means |
| --- | --- |
| More than one engineer edits prompts | You need merge gates, because local evals don’t protect the shared branch |
| Prompt changes ship without review | You need a required status check that blocks merge |
| You want an audit trail | CI creates a durable record of every eval run tied to a commit SHA |
| Baselines need to be shared | CI writes baselines to the promptops-artifacts branch so everyone uses the same reference |

If you’re a solo builder or in early prototyping, local evals are sufficient. Upgrade to CI when the team or stakes grow.

Basic CI: two workflows

For most teams, two workflows cover the full loop — eval on pull requests, release on tag push.
Trigger: Pull requests with changes to promptops/**

What it does: Runs the regression gate on every PR that touches prompt specs, datasets, evaluators, or policies. Blocks merge if a regression is detected.
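```yaml
name: Prompt Eval
on:
  pull_request:
    # Run only when a PR touches files under promptops/
    paths:
      - 'promptops/**'

jobs:
  eval:
    # Delegate to the reusable regression gate in this repository
    uses: ./.github/workflows/regression-gate.yml
```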
This workflow delegates to regression-gate.yml (the reusable workflow), which:
  1. Detects changed files in promptops/harnesses/**, promptops/prompts/**, promptops/datasets/**, and promptops/policies/*.yaml
  2. Skips if no evaluable files changed
  3. Fetches the latest regression report from the promptops-artifacts branch
  4. Evaluates the report status and posts a step summary with per-metric evidence
  5. Checks cost budgets against suite budgets.cost_budget limits
  6. Exits non-zero (blocking merge) if the report status is not pass

Full CI: six workflows

For enterprise teams needing fine-grained control, apastra ships six individual workflows. Each has a distinct responsibility:
| Workflow | Trigger | What it does |
| --- | --- | --- |
| regression-gate.yml | Pull requests to main; also callable via workflow_call | Detects changed files, fetches the regression report from artifacts branch, evaluates pass/fail, checks cost budgets, posts step summary, blocks merge on failure |
| promote.yml | Manual dispatch (workflow_dispatch) or release publish | Verifies an approval state record exists for the digest, generates an append-only promotion record, commits it to the promptops-artifacts branch, then calls deliver.yml |
| deliver.yml | Called by promote.yml via workflow_call | Reads the promotion record, resolves the channel and digest, iterates over promptops/delivery/*.yaml targets, and executes the sync for each matching target |
| immutable-release.yml | Tag push (*); also callable via workflow_call | Packages promptops/, computes digest, attests build provenance, creates an immutable GitHub Release with the bundle and digest |
| auto-merge.yml | Pull request opened, reopened, or synchronized | Resolves PROGRESS.md merge conflicts automatically, then enables squash auto-merge for PRs that pass all required checks |
| prompt-eval.yml | Pull requests touching promptops/** | Entry point for the basic CI setup; delegates to regression-gate.yml |

regression-gate.yml, immutable-release.yml, and deliver.yml are reusable workflows (they declare workflow_call as a trigger). This means you can call them from other workflows in the same repo or from workflows in other repos, which lets a platform team standardize PromptOps across many repositories.
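The immutable-release.yml row mentions computing a digest of the packaged promptops/ directory. The sketch below shows one way a deterministic content digest could be computed. The SHA-256 choice, the path-then-bytes hashing scheme, and the function name bundle_digest are assumptions for illustration, not apastra's documented algorithm.

```python
import hashlib
from pathlib import Path


def bundle_digest(root: str) -> str:
    """Return a hex SHA-256 over every file under root (hypothetical scheme).

    Files are visited in sorted order and each relative path is hashed
    together with its bytes, so renames and content edits both change
    the digest while filesystem listing order does not.
    """
    base = Path(root)
    hasher = hashlib.sha256()
    for path in sorted(p for p in base.rglob("*") if p.is_file()):
        rel = path.relative_to(base).as_posix()
        hasher.update(rel.encode("utf-8"))
        hasher.update(b"\0")  # separator between path and content
        hasher.update(path.read_bytes())
        hasher.update(b"\0")  # separator between files
    return hasher.hexdigest()
```

Because the digest depends only on relative paths and file contents, two checkouts of the same commit should produce the same value, which is what lets an approval or promotion record refer to a bundle by digest.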

What the regression gate does in detail

The regression gate is the most important workflow for day-to-day governance. Here is what happens step by step when a PR is opened:
1. Detect changed files

The workflow uses tj-actions/changed-files to detect whether any of the following changed:
  • promptops/harnesses/**
  • promptops/prompts/**
  • promptops/datasets/**
  • promptops/policies/*.yaml
If none of these changed, the gate is skipped with a log message. Non-prompt changes (docs, code, config) are not gated.
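In the real workflow this matching is done by tj-actions/changed-files. The sketch below only illustrates which paths count as evaluable under the four patterns listed above. It assumes that ** crosses directory boundaries while a single * does not; the helper names are hypothetical.

```python
import re

# The four patterns listed above.
EVALUABLE_PATTERNS = [
    "promptops/harnesses/**",
    "promptops/prompts/**",
    "promptops/datasets/**",
    "promptops/policies/*.yaml",
]


def glob_to_regex(pattern):
    """Translate a glob into a regex: ** crosses '/', a single * does not."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def evaluable_changes(changed_files):
    """Return the changed files that should trigger the gate."""
    regexes = [glob_to_regex(p) for p in EVALUABLE_PATTERNS]
    return [f for f in changed_files if any(r.fullmatch(f) for r in regexes)]
```

An empty result corresponds to the skip case: a PR that only edits README.md or application code never reaches the later steps.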
2. Fetch the artifacts branch

When evaluable files changed, the workflow fetches reports/regression_report.json and reports/run_manifest.json from the promptops-artifacts branch. These were written by the most recent eval run.
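To inspect the same files locally, one option is to read them straight from the branch without checking it out. This is a minimal sketch assuming the branch has already been fetched as origin/promptops-artifacts and that git show is an acceptable way to read it; the workflow itself may fetch the files differently. The function names are hypothetical.

```python
import json
import subprocess

# Assumes `git fetch origin promptops-artifacts` has already run.
ARTIFACTS_REF = "origin/promptops-artifacts"


def artifact_command(path, ref=ARTIFACTS_REF):
    """Build the git command that prints one file from the artifacts branch."""
    return ["git", "show", f"{ref}:{path}"]


def read_artifact(path, ref=ARTIFACTS_REF, run=subprocess.run):
    """Read and parse a JSON artifact; raises if git exits non-zero."""
    result = run(
        artifact_command(path, ref),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

For example, read_artifact("reports/regression_report.json") would return the parsed report, and the run parameter makes the function easy to exercise without a real repository.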
3. Evaluate the regression report

The workflow reads the report and posts a step summary table with columns: Metric, Status, Candidate, Baseline, Delta, Message. Any failing metrics are annotated as errors in the PR.
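A rough sketch of how such a summary table could be rendered. The report layout used here (a metrics list whose entries carry metric, status, candidate, baseline, delta, and message keys) is an assumption for illustration; check the actual regression_report.json for the real field names. GITHUB_STEP_SUMMARY is the file path GitHub Actions provides for step summaries.

```python
import os

COLUMNS = ["Metric", "Status", "Candidate", "Baseline", "Delta", "Message"]


def render_summary(report):
    """Render the report's metrics as a Markdown table (assumed report layout)."""
    lines = [
        "| " + " | ".join(COLUMNS) + " |",
        "|" + " --- |" * len(COLUMNS),
    ]
    for row in report.get("metrics", []):
        # Hypothetical keys: the lowercase column names.
        cells = [str(row.get(col.lower(), "")) for col in COLUMNS]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines) + "\n"


def write_summary(report, env=os.environ):
    """Append the table to the step summary file when running in Actions."""
    path = env.get("GITHUB_STEP_SUMMARY")
    if path:
        with open(path, "a", encoding="utf-8") as handle:
            handle.write(render_summary(report))
```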
4. Check cost budgets

If run_manifest.json includes a total_cost field, the workflow compares it against the budgets.cost_budget declared in each suite file. A cost overrun fails the gate.
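The comparison can be sketched as follows. The in-memory shapes (a manifest dict and a mapping of suite name to parsed suite file) and the strict greater-than comparison are assumptions; only the total_cost and budgets.cost_budget names come from the description above.

```python
def cost_overruns(manifest, suites):
    """Return the names of suites whose cost budget the run exceeded.

    manifest: parsed run_manifest.json.
    suites: hypothetical mapping of suite name -> parsed suite file.
    """
    total = manifest.get("total_cost")
    if total is None:
        # No cost recorded, so there is nothing to compare.
        return []
    over = []
    for name, suite in suites.items():
        budget = suite.get("budgets", {}).get("cost_budget")
        # Assumes spending exactly the budget is still acceptable.
        if budget is not None and total > budget:
            over.append(name)
    return over
```

A non-empty result would correspond to a failed gate, even when every quality metric passed.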
5. Block or pass

If report.status == "pass", the workflow exits 0 (green check). Otherwise it exits 1 (red check), which fails the required status check and blocks merge on protected branches.
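The final decision reduces to a small function. This sketch assumes the gate should fail closed, so a report with a missing status is treated like a failure; the overruns argument is a hypothetical list of suite names that exceeded their cost budget.

```python
def gate_exit_code(report, overruns=()):
    """Return the process exit code for the gate: 0 passes, 1 blocks merge."""
    if report.get("status") != "pass":
        # Covers "fail" as well as a missing or unexpected status.
        return 1
    if overruns:
        # A cost overrun fails the gate even when metrics pass.
        return 1
    return 0
```

In a script, the result would typically be handed to sys.exit so the CI job's status reflects the decision.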

CODEOWNERS for prompt governance

Use a CODEOWNERS file to require human review of any prompt, policy, or evaluator change:
```
# Prompt specs require review from the AI quality team
promptops/prompts/         @your-org/ai-quality
promptops/policies/        @your-org/ai-quality
promptops/evaluators/      @your-org/ai-quality

# Delivery targets require platform team review
promptops/delivery/        @your-org/platform
```
With CODEOWNERS in place:
  • Once Require review from Code Owners is enabled in branch protection, any PR that modifies a file in promptops/prompts/ requires at least one approved review from @your-org/ai-quality before merge.
  • This makes the requirement enforced by GitHub, not just a convention.

Branch protection: required status checks

Configure branch protection on main to require the regression gate before merge:
  1. Go to Settings → Branches → Branch protection rules.
  2. Add a rule for main.
  3. Enable Require status checks to pass before merging.
  4. Add gate (the job name from regression-gate.yml) as a required check.
  5. Optionally enable Require a pull request before merging and Require review from Code Owners.
With this configuration, no PR can merge to main unless:
  • The regression gate job passed (no regression detected, no cost budget exceeded)
  • At least one CODEOWNERS reviewer approved (if CODEOWNERS is configured)
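Teams that manage settings as code can express the same rule as a request body for GitHub's branch protection REST endpoint (PUT /repos/{owner}/{repo}/branches/{branch}/protection). The sketch below only builds the payload. The strict, enforce_admins, and review-count values are illustrative choices not stated above, and the field names should be verified against the current REST API reference before use.

```python
import json


def protection_payload(required_checks, require_code_owners=True):
    """Build a branch protection request body (values are illustrative)."""
    return {
        "required_status_checks": {
            "strict": True,  # assumption: require branches to be up to date
            "contexts": list(required_checks),
        },
        "enforce_admins": True,  # assumption: admins are gated as well
        "required_pull_request_reviews": {
            "require_code_owner_reviews": require_code_owners,
            "required_approving_review_count": 1,
        },
        "restrictions": None,  # no push restrictions
    }


if __name__ == "__main__":
    # "gate" is the job name from regression-gate.yml mentioned above.
    print(json.dumps(protection_payload(["gate"]), indent=2))
```

The printed JSON can be sent with any authenticated HTTP client; keeping it in version control gives the protection rule the same review trail as the prompts it guards.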

Auto-merge for passing PRs

The auto-merge.yml workflow enables squash auto-merge for PRs that pass all required checks. It fires on pull_request events (opened, reopened, synchronize) and:
  1. Resolves any PROGRESS.md merge conflicts automatically using a union merge strategy.
  2. Calls gh pr merge --auto --squash, which enables auto-merge so that GitHub squash-merges the PR once all required checks are green.
In the apastra repo, auto-merge is scoped to PRs from google-labs-jules and BintzGavin. For your own repo, update the if: conditions to match your autonomous agent’s login or a specific label.

Artifacts branch: keeping derived data out of main

Regression reports, run manifests, promotion records, and baselines are derived data — they should not live on main alongside your source files. Apastra uses a separate promptops-artifacts branch as an append-only store:
```
# branch: promptops-artifacts
reports/
  regression_report.json
  run_manifest.json
approvals/
  <approval-record>.json
promotions/
  <timestamp>-<id>.json
```
The regression gate fetches from this branch at runtime. The promote workflow commits to this branch. This keeps the main branch clean and prevents merge conflicts on derived files.
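One way to keep promotion records append-only at the file level is to refuse to overwrite an existing path. The sketch below assumes a UTC timestamp format of YYYYMMDDTHHMMSSZ and a caller-supplied id; both are hypothetical, since the layout above only specifies the promotions/<timestamp>-<id>.json shape.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_promotion_record(root, record_id, payload, now=None):
    """Write promotions/<timestamp>-<id>.json, never overwriting a record."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")  # assumed timestamp format
    path = Path(root) / "promotions" / f"{stamp}-{record_id}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "x" raises FileExistsError if the record already exists,
    # which is what keeps the store append-only.
    with open(path, "x", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2, sort_keys=True)
    return path
```

Existing records are never edited in place; a correction would be written as a new record with a later timestamp.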