
Apastra ships 56 JSON schemas that validate every file type in the protocol. Schemas ensure your prompts, datasets, evaluators, and run artifacts are machine-readable by any agent or harness — not just your own. All schemas use JSON Schema draft 2020-12.
Your agent validates files against these schemas when you run the apastra-validate skill. The schema-validation.yml CI workflow also validates changed files on every pull request.

Schema categories

| Schema file                   | What it validates                                               |
| ----------------------------- | --------------------------------------------------------------- |
| prompt-spec.schema.json       | Prompt template + variable schema + output contract              |
| dataset-manifest.schema.json  | Dataset identity, version, digest, provenance                    |
| dataset-case.schema.json      | A single JSONL test case with inputs and inline assertions       |
| evaluator.schema.json         | Scoring rules (deterministic, schema, judge, human)              |
| suite.schema.json             | Benchmark suite: datasets, evaluators, model matrix, thresholds  |
| quick-eval.schema.json        | Single-file eval combining prompt, cases, and assertions         |

Core schema field reference

prompt-spec.schema.json

Source-of-truth prompt definition. Required in every apastra project.
id (string, required)
Stable identifier for the prompt. Use a namespaced slug such as my-app/summarize-v1. Renaming breaks consumption manifest pins.

variables (object, required)
Map of variable names to JSON Schema type objects. Each key is a template placeholder; each value defines the type.

variables:
  text: { type: string }
  max_length: { type: integer }
template (string | object | array, required)
The prompt template. For completion models, use a string with {{variable}} placeholders. For chat models, use an array of message objects.

template: "Summarize: {{text}}"

output_contract (object)
JSON Schema defining the expected output structure. Used by schema evaluators to validate model responses.

tool_contract (object)
JSON Schema defining the expected tool-calling structure and available tools. Required if the prompt uses function calling.

metadata (object)
Arbitrary key-value pairs such as author, intent, and tags.
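
Putting these fields together, a minimal prompt spec might look like the sketch below (the id, output_contract shape, and metadata values are illustrative, not prescribed by the schema):

id: my-app/summarize-v1
variables:
  text: { type: string }
template: "Summarize: {{text}}"
output_contract:
  type: object
  properties:
    summary: { type: string }
  required: [summary]
metadata:
  author: docs-team      # illustrative value
  tags: [summarization]  # illustrative value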

dataset-manifest.schema.json

Declares a dataset’s identity and content digest for reproducibility.
id (string, required)
Stable identifier for the dataset.

version (string, required)
Semantic version or revision of the dataset. Treat dataset edits as new versions.

digest (string, required)
SHA-256 content digest of the .jsonl file. Format: sha256:<hex>. See the digest convention below.

schema_version (string, required)
Version of the dataset-case schema used by the JSONL file.

provenance (object)
Information about the origin of the dataset.
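
A manifest might look like the following sketch (id, version, and provenance values are illustrative; the digest shown is a placeholder, not a real file hash):

id: my-app/summarize-cases
version: 1.2.0
digest: sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
schema_version: "1.0"
provenance:
  source: hand-written   # illustrative value
  curator: docs-team     # illustrative value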

dataset-case.schema.json

A single test case — one line in a JSONL dataset file.
case_id (string, required)
Stable identifier for the test case. Never change existing case_id values; add new cases instead.

inputs (object, required)
Map of variable names to input values, matching the prompt spec’s variables schema.

assert (array)
Array of inline assertion objects. Each object has type (string) and value (any). See assertion types for the full list.

expected_outputs (object)
Expected output values for evaluator scoring (e.g., should_contain keyword lists).

metadata (object)
Arbitrary metadata for the case (e.g., tags, difficulty, domain).
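
One case occupies one line in the .jsonl file. As a sketch (the assertion type name and metadata values are assumptions for illustration; see the assertion types list for the real vocabulary):

{"case_id": "summarize-001", "inputs": {"text": "A short paragraph about schemas."}, "assert": [{"type": "contains", "value": "schema"}], "metadata": {"difficulty": "easy"}}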

evaluator.schema.json

Scoring definition for a suite.
id (string, required)
Stable identifier for the evaluator.

type (string, required)
Evaluator type. One of: deterministic, schema, judge, human.

metrics (array, required)
Array of metric names produced by this evaluator (e.g., ["keyword_recall"]). At least one is required.

config (object)
Evaluator-type-specific configuration. For judge evaluators, this includes rubric and model details. For schema evaluators, this includes the target JSON Schema.

metric_versions (object)
Mapping of metric names to their semantic versions. Increment when changing how a metric is computed to preserve historical comparability.

digest (string)
SHA-256 hash of the evaluator content. Format: sha256:<64 hex chars>.
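
A deterministic evaluator might look like this sketch (the config keys are assumptions for illustration; actual keys depend on the evaluator type):

id: keyword-recall-check
type: deterministic
metrics: [keyword_recall]
metric_versions:
  keyword_recall: "1.0"
config:
  expected_field: should_contain   # illustrative key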

suite.schema.json

Benchmark suite configuration.
id (string, required)
Stable identifier for the suite.

name (string, required)
Human-readable name.

datasets (array, required)
Array of dataset IDs (at least one). Each ID maps to a .jsonl file in promptops/datasets/.

evaluators (array, required)
Array of evaluator IDs (at least one). Each ID maps to a .yaml file in promptops/evaluators/.

model_matrix (array, required)
Array of model or provider identifiers to run the suite against. Use "default" to mean the agent’s own model.

tier (string)
Execution tier. One of: smoke, regression, full, release-candidate. Default: smoke.

trials (integer)
Number of times to run each case for variance measurement. Default: 1. Use 3+ for regression suites.

budgets (object)
Cost and time limits. Supports cost_budget (dollars) and time (seconds).

thresholds (object)
Pass/fail criteria. Keys are metric names; values are minimum acceptable scores.
thresholds:
  keyword_recall: 0.6
  pass_rate: 1.0
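
A complete suite combining these fields might look like the sketch below (the ids reuse the illustrative examples above; budget values are assumptions):

id: summarize-regression
name: Summarization regression suite
datasets: [my-app/summarize-cases]
evaluators: [keyword-recall-check]
model_matrix: ["default"]
tier: regression
trials: 3
budgets:
  cost_budget: 5     # dollars
  time: 1800         # seconds
thresholds:
  keyword_recall: 0.6
  pass_rate: 1.0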

quick-eval.schema.json

Single-file evaluation format combining prompt, cases, and assertions.
id (string, required)
Stable identifier for the quick eval.

prompt (string, required)
The prompt template with {{variable}} placeholders.

cases (array, required)
Array of test cases. Each case follows the dataset-case schema with id, inputs, and assert.

thresholds (object)
Pass/fail thresholds, typically pass_rate: 1.0.
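
A minimal quick eval, as a sketch (the assertion type name is an assumption for illustration):

id: summarize-smoke
prompt: "Summarize: {{text}}"
cases:
  - id: basic
    inputs: { text: "A short paragraph about schemas." }
    assert:
      - { type: contains, value: "schema" }
thresholds:
  pass_rate: 1.0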

run-manifest.schema.json

Durable metadata record for a completed run. Written by the harness.
input_refs (object, required)
References to input files (suite, prompt, dataset, evaluator IDs).

resolved_digests (object, required)
Content digests of all resolved inputs at run time, enabling replay.

timestamps (object, required)
Run start and end times.

harness_identifier (string, required)
Identifier of the execution environment. Common values: claude-code, antigravity, cursor, copilot, api, github-actions, jules.

harness_version (string, required)
Version of the harness. The same model in different harnesses can produce different outputs.

model_ids (array, required)
Array of model identifiers used in the run.

sampling_config (object, required)
Temperature, top-p, and other sampling parameters used.

environment (object, required)
Environment metadata for reproduction attempts.

status (string, required)
Run outcome. Typically pass or fail.

total_cost (number)
Total cost of the run in dollars (input tokens × price + output tokens × price).

provenance (object)
SLSA-style provenance metadata: builder.id, buildType, invocation, and metadata.
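
A trimmed run manifest, as a sketch (the nested key names inside timestamps, sampling_config, and environment are assumptions; the digest is a placeholder):

{
  "input_refs": { "suite": "summarize-regression", "prompt": "my-app/summarize-v1" },
  "resolved_digests": { "dataset": "sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" },
  "timestamps": { "start": "2025-01-01T12:00:00Z", "end": "2025-01-01T12:04:30Z" },
  "harness_identifier": "claude-code",
  "harness_version": "1.0.0",
  "model_ids": ["default"],
  "sampling_config": { "temperature": 0 },
  "environment": { "os": "linux" },
  "status": "pass",
  "total_cost": 0.42
}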

scorecard.schema.json

Normalized metrics summary for a run.
normalized_metrics (object, required)
Mapping of metric names to their aggregated values (0–1 scale for most metrics).

{"keyword_recall": 0.85, "pass_rate": 1.0}

metric_definitions (object, required)
Metadata for each metric. Each entry requires version and optionally includes description and direction.

{
  "keyword_recall": {
    "version": "1.0",
    "description": "Fraction of expected keywords found in output",
    "direction": "higher_is_better"
  }
}

variance (object)
Variance data if trials > 1 was configured in the suite.

flake_rates (object)
Mapping of metric names to their observed flake rates.

baseline.schema.json

Reference to a known-good scorecard for regression comparison.
baseline_id (string, required)
Identifier for this baseline record.

run_digest (string, required)
Content digest of the reference run’s scorecard.

created_at (string, required)
ISO 8601 timestamp when the baseline was established.

description (string)
Human-readable description (e.g., “post-v2-launch baseline”).
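
As a sketch (values are illustrative; the digest is a placeholder):

baseline_id: post-v2-launch
run_digest: sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
created_at: "2025-01-01T12:00:00Z"
description: "post-v2-launch baseline"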

regression-policy.schema.json

Defines how candidate scorecards are compared against baselines.
baseline (string, required)
Baseline reference rule (e.g., "prod current", "last-rc-passing-run").

rules (array, required)
Array of per-metric rule objects. Each rule requires metric and severity.
rules:
  - metric: keyword_recall
    floor: 0.5
    allowed_delta: 0.1
    direction: higher_is_better
    severity: blocker
Rule fields:

rules[].metric (string, required)
Metric name to evaluate.

rules[].severity (string, required)
blocker fails the check and blocks merge; warning is reported but does not block.

rules[].floor (number)
Absolute minimum acceptable value for this metric.

rules[].allowed_delta (number)
Maximum allowed drop from the baseline value.

rules[].direction (string)
higher_is_better or lower_is_better. Controls which direction of change is treated as a regression.
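
Combining the top-level baseline reference with the rule list above, a complete policy might look like this sketch (the second rule is illustrative):

baseline: last-rc-passing-run
rules:
  - metric: keyword_recall
    floor: 0.5
    allowed_delta: 0.1
    direction: higher_is_better
    severity: blocker
  - metric: pass_rate
    floor: 1.0
    severity: warning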

consumption-manifest.schema.json

App-side file declaring prompt pins and resolution overrides.
version (string, required)
Version of the consumption manifest format.

prompts (object, required)
Mapping of local prompt names to resolution configurations. Each entry requires id and optionally includes pin, override, and model.
prompts:
  summarize-v1:
    id: summarize-v1
    pin: "abc123"
prompts[].id (string, required)
Stable prompt ID to resolve.

prompts[].pin (string)
Git ref, commit SHA, semver range, or packaged artifact reference. See resolver for supported pin formats.

prompts[].override (string)
Local file path overriding resolution. Used for local-linked development.

prompts[].model (string)
Override the default model for this specific prompt.

defaults (object)
Global fallbacks: model and provider.
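
A fuller manifest using overrides and defaults might look like this sketch (the version string, file path, and defaults values are assumptions):

version: "1"
prompts:
  summarize-v1:
    id: summarize-v1
    pin: "abc123"
  summarize-dev:
    id: summarize-v1
    override: ./prompts/summarize-local.yaml   # local-linked development
defaults:
  model: default
  provider: default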

promotion-record.schema.json

Append-only record binding an approved version to a delivery channel.
version (string, required)
The approved version being promoted.

channel (string, required)
Target channel (e.g., prod, staging, release).

digest (string, required)
Content digest of the promoted version.

evidence (object)
Links to supporting evidence (e.g., run_id of the release-candidate run).

timestamp (string)
ISO 8601 timestamp of the promotion event.
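
As a sketch (values are illustrative; the digest is a placeholder):

version: "2.1.0"
channel: prod
digest: sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
evidence:
  run_id: rc-run-0142   # illustrative value
timestamp: "2025-01-01T12:00:00Z"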

delivery-target.schema.json

Declarative configuration for a downstream sync target.
type (string, required)
Target type (e.g., github_pr, oci_registry).

repo (string, required)
Target repository (for github_pr type).
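
As a sketch (the repository name is illustrative):

type: github_pr
repo: my-org/my-app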

Digest convention

All digest fields in apastra schemas use SHA-256 computed over canonicalized content.

Canonicalization rules

JSON files:

  1. Parse the JSON.
  2. Sort all keys alphabetically (recursively).
  3. Remove all insignificant whitespace.
  4. This is equivalent to: jq -cSM . <file>

YAML files:

  1. Parse the YAML into a JSON object.
  2. Apply the same canonicalization as JSON files.

JSONL files:

  1. Parse each line as JSON.
  2. Canonicalize each line independently.
  3. Rejoin lines with a single \n between each.
  4. Hash the resulting string.

Digest format

sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
All digest fields that use the pattern validator require exactly sha256: followed by 64 lowercase hex characters.
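
As a sketch, a digest for a single JSON file can be produced with the jq canonicalization above (this assumes the hash covers the canonical form without a trailing newline; the file path is illustrative):

jq -cSM . path/to/file.json | tr -d '\n' | sha256sum | awk '{ print "sha256:" $1 }'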

Using schemas for validation

With the validate skill

Ask your agent:
Use the apastra-validate skill to validate all promptops files

With the CLI

Because the schemas use JSON Schema draft 2020-12 (ajv-cli defaults to an older draft), pass --spec=draft2020 and load ajv-formats:

npm install -g ajv-formats ajv-cli
ajv validate --spec=draft2020 -c ajv-formats -s promptops/schemas/prompt-spec.schema.json -d promptops/prompts/summarize-v1.yaml

In CI

The schema-validation.yml workflow validates changed prompt and dataset files on every pull request automatically. See GitHub workflows reference for details.