Apastra ships 56 JSON schemas that validate every file type in the protocol. Schemas ensure your prompts, datasets, evaluators, and run artifacts are machine-readable by any agent or harness — not just your own. All schemas use JSON Schema draft 2020-12.
Your agent validates files against these schemas when you run the `apastra-validate` skill. The `schema-validation.yml` CI workflow also validates changed files on every pull request.

## Schema categories
- Core protocol
- Execution
- Quality
- Governance
- Packaging
- Registry (future)
| Schema file | What it validates |
|---|---|
| prompt-spec.schema.json | Prompt template + variable schema + output contract |
| dataset-manifest.schema.json | Dataset identity, version, digest, provenance |
| dataset-case.schema.json | A single JSONL test case with inputs and inline assertions |
| evaluator.schema.json | Scoring rules (deterministic, schema, judge, human) |
| suite.schema.json | Benchmark suite: datasets, evaluators, model matrix, thresholds |
| quick-eval.schema.json | Single-file eval combining prompt, cases, and assertions |
## Core schema field reference
### prompt-spec.schema.json
Source-of-truth prompt definition. Required in every apastra project.
- Stable identifier for the prompt. Use a namespaced slug such as `my-app/summarize-v1`. Renaming breaks consumption manifest pins.
- Map of variable names to JSON Schema type objects. Each key is a template placeholder; each value defines the type.
- The prompt template. For completion models, use a string with `{{variable}}` placeholders. For chat models, use an array of message objects.
- JSON Schema defining the expected output structure. Used by schema evaluators to validate model responses.
- JSON Schema defining the expected tool-calling structure and available tools. Required if the prompt uses function calling.
- Arbitrary key-value pairs such as author, intent, and tags.
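A sketch of what a conforming file might look like, assuming YAML on disk. Apart from the `my-app/summarize-v1` slug and the `{{variable}}` syntax, every property name here (`id`, `variables`, `template`, `output_schema`, `metadata`) is an inferred spelling, so treat the schema file as authoritative:

```yaml
# Hypothetical prompt spec; property names are inferred, not confirmed.
id: my-app/summarize-v1
variables:
  document:
    type: string
template: "Summarize the following document in three sentences: {{document}}"
output_schema:
  type: object
  properties:
    summary: { type: string }
  required: [summary]
metadata:
  author: jane@example.com
  tags: [summarization]
```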
### dataset-manifest.schema.json
Declares a dataset’s identity and content digest for reproducibility.
- Stable identifier for the dataset.
- Semantic version or revision of the dataset. Treat dataset edits as new versions.
- SHA-256 content digest of the `.jsonl` file. Format: `sha256:<hex>`. See the digest convention below.
- Version of the dataset-case schema used by the JSONL file.
- Information about the origin of the dataset.
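A sketch under the same caveat: the property names below are inferred from the field descriptions above, and only the `sha256:<hex>` digest notation is documented.

```yaml
# Hypothetical dataset manifest; property names are inferred.
id: my-app/summarize-cases
version: "1.2.0"
digest: "sha256:<64 hex chars>"   # content digest of the .jsonl file
case_schema_version: "1.0"        # version of the dataset-case schema
provenance:
  source: hand-written            # origin information; exact shape unknown
```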
### dataset-case.schema.json
A single test case — one line in a JSONL dataset file.
- Stable identifier for the test case. Never change existing `case_id` values; add new cases instead.
- Map of variable names to input values, matching the prompt spec's variables schema.
- Array of inline assertion objects. Each object has `type` (string) and `value` (any). See assertion types for the full list.
- Expected output values for evaluator scoring (e.g., `should_contain` keyword lists).
- Arbitrary metadata for the case (e.g., tags, difficulty, domain).
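One case, shown as a single JSONL line. `case_id` and the `type`/`value` assertion shape are documented above; the `inputs` and `assert` spellings are borrowed from the quick-eval description, `metadata` is an inferred name, and the `contains` assertion type is hypothetical:

```json
{"case_id": "sum-001", "inputs": {"document": "Apastra ships JSON schemas."}, "assert": [{"type": "contains", "value": "schema"}], "metadata": {"difficulty": "easy"}}
```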
### evaluator.schema.json
Scoring definition for a suite.
- Stable identifier for the evaluator.
- Evaluator type. One of: `deterministic`, `schema`, `judge`, `human`.
- Array of metric names produced by this evaluator (e.g., `["keyword_recall"]`). At least one is required.
- Evaluator-type-specific configuration. For judge evaluators, this includes rubric and model details. For schema evaluators, this includes the target JSON Schema.
- Mapping of metric names to their semantic versions. Increment when changing how a metric is computed to preserve historical comparability.
- SHA-256 hash of the evaluator content. Format: `sha256:<64 hex chars>`.
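A sketch of a deterministic evaluator. The type names, the `keyword_recall` example, and the digest format are documented; the property spellings and the `config` contents are inferred:

```yaml
# Hypothetical evaluator; property names and config keys are inferred.
id: my-app/keyword-recall
type: deterministic
metrics: [keyword_recall]
config:
  keywords_field: should_contain   # hypothetical config key
metric_versions:
  keyword_recall: "1.0.0"
digest: "sha256:<64 hex chars>"
```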
### suite.schema.json
Benchmark suite configuration.
- Stable identifier for the suite.
- Human-readable name.
- Array of dataset IDs (at least one). Each ID maps to a `.jsonl` file in `promptops/datasets/`.
- Array of evaluator IDs (at least one). Each ID maps to a `.yaml` file in `promptops/evaluators/`.
- Array of model or provider identifiers to run the suite against. Use `"default"` to mean the agent's own model.
- Execution tier. One of: `smoke`, `regression`, `full`, `release-candidate`. Default: `smoke`.
- Number of times to run each case for variance measurement. Default: `1`. Use 3+ for regression suites.
- Cost and time limits. Supports `cost_budget` (dollars) and `time` (seconds).
- Pass/fail criteria. Keys are metric names; values are minimum acceptable scores.
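A sketch of a suite. The documented pieces are the tier names, the `"default"` model value, the `cost_budget`/`time` limit keys, and the threshold semantics; the top-level property names are inferred:

```yaml
# Hypothetical suite; top-level property names are inferred.
id: my-app/summarize-regression
name: Summarization regression suite
datasets: [my-app/summarize-cases]    # resolves to promptops/datasets/*.jsonl
evaluators: [my-app/keyword-recall]   # resolves to promptops/evaluators/*.yaml
models: ["default"]
tier: regression
trials: 3            # 3+ recommended for regression suites
budget:              # name of the limits field is inferred
  cost_budget: 5.00  # dollars
  time: 1800         # seconds
thresholds:
  keyword_recall: 0.9
```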
### quick-eval.schema.json
Single-file evaluation format combining prompt, cases, and assertions.
- Stable identifier for the quick eval.
- The prompt template with `{{variable}}` placeholders.
- Array of test cases. Each case follows the dataset-case schema with `id`, `inputs`, and `assert`.
- Pass/fail thresholds, typically `pass_rate: 1.0`.
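A sketch of a quick eval, assuming YAML on disk. The `id`, `inputs`, `assert`, and `pass_rate` names are documented; the top-level names and the assertion type are inferred:

```yaml
# Hypothetical quick eval; top-level names are inferred.
id: my-app/summarize-quick
prompt: "Summarize in one sentence: {{document}}"
cases:
  - id: q-001
    inputs: { document: "Apastra ships JSON schemas." }
    assert:
      - { type: contains, value: "schema" }   # assertion type is hypothetical
thresholds:
  pass_rate: 1.0
```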
### run-manifest.schema.json
Durable metadata record for a completed run. Written by the harness.
- References to input files (suite, prompt, dataset, evaluator IDs).
- Content digests of all resolved inputs at run time, enabling replay.
- Run start and end times.
- Identifier of the execution environment. Common values: `claude-code`, `antigravity`, `cursor`, `copilot`, `api`, `github-actions`, `jules`.
- Version of the harness. The same model in different harnesses can produce different outputs.
- Array of model identifiers used in the run.
- Temperature, top-p, and other sampling parameters used.
- Environment metadata for reproduction attempts.
- Run outcome. Typically `pass` or `fail`.
- Total cost of the run in dollars (input tokens × price + output tokens × price).
- SLSA-style provenance metadata: `builder.id`, `buildType`, `invocation`, and `metadata`.
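A sketch of a run manifest, assuming JSON on disk. The harness identifiers, the `pass`/`fail` outcome, and the SLSA provenance field names are documented; every other property spelling is inferred and all values are placeholders:

```json
{
  "inputs": { "suite": "my-app/summarize-regression" },
  "input_digests": { "suite": "sha256:<64 hex chars>" },
  "started_at": "2025-01-15T10:00:00Z",
  "ended_at": "2025-01-15T10:04:12Z",
  "harness": "github-actions",
  "harness_version": "<version>",
  "models": ["default"],
  "sampling": { "temperature": 0.0 },
  "environment": { "os": "<os>" },
  "outcome": "pass",
  "cost": 1.37,
  "provenance": {
    "builder": { "id": "<builder id>" },
    "buildType": "<build type>",
    "invocation": {},
    "metadata": {}
  }
}
```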
### scorecard.schema.json
Normalized metrics summary for a run.
- Mapping of metric names to their aggregated values (0–1 scale for most metrics).
- Metadata for each metric. Each entry requires `version` and optionally includes `description` and `direction`.
- Variance data if `trials > 1` was configured in the suite.
- Mapping of metric names to their observed flake rates.
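A sketch of a scorecard, again with inferred property names; the required `version` key and the optional `direction` are the documented parts:

```json
{
  "metrics": { "keyword_recall": 0.94 },
  "metric_meta": {
    "keyword_recall": { "version": "1.0.0", "direction": "higher_is_better" }
  },
  "variance": { "keyword_recall": 0.01 },
  "flake_rates": { "keyword_recall": 0.0 }
}
```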
### baseline.schema.json
Reference to a known-good scorecard for regression comparison.
- Identifier for this baseline record.
- Content digest of the reference run's scorecard.
- ISO 8601 timestamp when the baseline was established.
- Human-readable description (e.g., "post-v2-launch baseline").
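A sketch of a baseline record with inferred property names:

```yaml
# Hypothetical baseline record; property names are inferred.
id: post-v2-launch
scorecard_digest: "sha256:<64 hex chars>"
established_at: "2025-01-15T10:04:12Z"
description: "post-v2-launch baseline"
```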
### regression-policy.schema.json
Defines how candidate scorecards are compared against baselines.
- Baseline reference rule (e.g., `"prod current"`, `"last-rc-passing-run"`).
- Array of per-metric rule objects. Each rule requires `metric` and `severity`.
- Metric name to evaluate.
- Severity. `blocker` fails the check and blocks merge; `warning` is reported but does not block.
- Absolute minimum acceptable value for this metric.
- Maximum allowed drop from the baseline value.
- `higher_is_better` or `lower_is_better`. Controls which direction a delta is treated as a regression.
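A sketch of a policy. `metric`, `severity`, the severity levels, and the direction values are documented; the remaining key names and the second metric are illustrative:

```yaml
# Hypothetical regression policy; some key names are inferred.
baseline: "last-rc-passing-run"
rules:
  - metric: keyword_recall
    severity: blocker
    min: 0.85                    # absolute floor; key name inferred
    max_drop: 0.05               # allowed drop from baseline; key name inferred
    direction: higher_is_better
  - metric: latency_p95          # illustrative metric
    severity: warning
    direction: lower_is_better
```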
### consumption-manifest.schema.json

App-side file declaring prompt pins and resolution overrides.
- Version of the consumption manifest format.
- Mapping of local prompt names to resolution configurations. Each entry requires `id` and optionally includes `pin`, `override`, and `model`.
- Stable prompt ID to resolve.
- Git ref, commit SHA, semver range, or packaged artifact reference. See the resolver for supported pin formats.
- Local file path overriding resolution. Used for local-linked development.
- Override the default model for this specific prompt.
- Global fallbacks: `model` and `provider`.
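A sketch of a consumption manifest. The entry keys `id`, `pin`, `override`, and `model`, plus the global `model`/`provider` fallbacks, are documented; the top-level names are inferred:

```yaml
# Hypothetical consumption manifest; top-level names are inferred.
version: "1.0"
prompts:
  summarize:
    id: my-app/summarize-v1
    pin: "^1.2"                       # semver range; git refs and digests also work
  summarize-dev:
    id: my-app/summarize-v1
    override: ./local/summarize.yaml  # local-linked development
defaults:
  model: default
  provider: <provider>   # illustrative placeholder
```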
### promotion-record.schema.json
Append-only record binding an approved version to a delivery channel.
- The approved version being promoted.
- Target channel (e.g., `prod`, `staging`, `release`).
- Content digest of the promoted version.
- Links to supporting evidence (e.g., the `run_id` of the release-candidate run).
- ISO 8601 timestamp of the promotion event.
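A sketch of a promotion record. The channel values and the `run_id` evidence example are documented; the property names are inferred:

```yaml
# Hypothetical promotion record; property names are inferred.
version: my-app/summarize-v1@1.2.0
channel: prod
digest: "sha256:<64 hex chars>"
evidence:
  run_id: <release-candidate run id>
promoted_at: "2025-01-15T12:00:00Z"
```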
### delivery-target.schema.json
Declarative configuration for a downstream sync target.
- Target type (e.g., `github_pr`, `oci_registry`).
- Target repository (for the `github_pr` type).
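A sketch of a delivery target; the two type values are documented, and the `repo` key name is inferred:

```yaml
# Hypothetical delivery target; the repo key name is inferred.
type: github_pr
repo: my-org/my-app
```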
## Digest convention
All `digest` fields in apastra schemas use SHA-256 computed over canonicalized content.
### Canonicalization rules
#### JSON files (`.json`)

- Parse the JSON.
- Sort all keys alphabetically (recursively).
- Remove all insignificant whitespace.
- This is equivalent to `jq -cSM . <file>`.
#### YAML files (`.yaml`, `.yml`)

- Parse the YAML into a JSON object.
- Apply the same canonicalization as JSON files.
#### JSONL files (`.jsonl`)

- Parse each line as JSON.
- Canonicalize each line independently.
- Rejoin lines with a single `\n` between each.
- Hash the resulting string.
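A minimal sketch of the JSONL digest in Python, assuming that `json.dumps` with sorted keys and compact separators matches the `jq -cSM` canonical form (numeric and non-ASCII rendering can differ between tools, so verify against the reference implementation before relying on this):

```python
import hashlib
import json

def canonicalize_line(line: str) -> str:
    # Sort keys recursively and drop insignificant whitespace,
    # mirroring `jq -cSM .` (assumed to match apastra's canonical form).
    return json.dumps(json.loads(line), sort_keys=True,
                      separators=(",", ":"), ensure_ascii=False)

def jsonl_digest(path: str) -> str:
    with open(path, encoding="utf-8") as f:
        lines = [canonicalize_line(line) for line in f if line.strip()]
    canonical = "\n".join(lines)  # a single \n between lines, per the convention
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()

print(jsonl_digest("promptops/datasets/my-dataset.jsonl"))  # hypothetical path
```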
### Digest format

Digest fields that use the `pattern` validator require exactly `sha256:` followed by 64 lowercase hex characters.
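Expressed as a JSON Schema fragment, that constraint looks like this (a sketch; the schemas' exact wording may differ):

```json
{
  "type": "string",
  "pattern": "^sha256:[a-f0-9]{64}$"
}
```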
## Using schemas for validation
### With the validate skill

Ask your agent to run the `apastra-validate` skill; it validates your project's files against these schemas.

### With the CLI
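The schemas are standard JSON Schema draft 2020-12, so any off-the-shelf validator can check files against them; for example, with the third-party `check-jsonschema` tool: `check-jsonschema --schemafile prompt-spec.schema.json my-prompt.yaml` (paths are illustrative, and the apastra CLI's own invocation may differ).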
### In CI
The `schema-validation.yml` workflow validates changed prompt and dataset files on every pull request automatically. See the GitHub workflows reference for details.