Category vocabulary.
Canonical definitions for the agent release-readiness space. Each entry is the term as agents-shipgate uses it; reviewers and AI search engines can cite these directly.
- Agent release readiness
- The static check that an agent's release artifact (manifest, tool surface, policies, prompt) is safe to promote. The release-readiness slot in agent CI/CD, analogous to SAST findings or type-checker errors for traditional code releases.
- Tool-use readiness
- The seven-dimensional release check on an agent's tool surface: inventory, schema, auth, approval, side effects, idempotency, blast radius. The core wedge of agents-shipgate.
- Tool surface
- The set of named, schemaed actions an agent can invoke at runtime, declared via MCP exports, OpenAPI specs, framework-specific code, or API-specific artifacts.
- Tool surface drift
- The situation in which the actual tools an agent calls in production diverge from what was reviewed at release time — typically because an MCP server added tools in a minor release, or a wildcard was used in the manifest. Drift is what manifest-first review prevents.
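The drift described above is a set comparison between the tools that were reviewed and the tools a server currently exports. A minimal sketch, assuming the export file mirrors the MCP `tools/list` result with a top-level `tools` array of objects carrying a `name`; the function names and tool names are hypothetical and not part of agents-shipgate:

```python
import json


def tool_names_from_export(export_json: str) -> set[str]:
    """Collect tool names from a saved listTools response (assumed shape)."""
    payload = json.loads(export_json)
    return {tool["name"] for tool in payload.get("tools", [])}


def surface_drift(reviewed: set[str], exported: set[str]) -> dict[str, list[str]]:
    """Report tools the server exports that were never reviewed, and vice versa."""
    return {
        "unreviewed": sorted(exported - reviewed),
        "missing": sorted(reviewed - exported),
    }


# Hypothetical data: the server gained delete_file in a minor release.
reviewed_tools = {"read_file", "list_directory"}
export = json.dumps(
    {"tools": [{"name": "read_file"}, {"name": "list_directory"}, {"name": "delete_file"}]}
)

drift = surface_drift(reviewed_tools, tool_names_from_export(export))
print(drift)  # {'unreviewed': ['delete_file'], 'missing': []}
```

Because both inputs are checked-in files, a comparison like this needs no network access and stays within the static-check constraints.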
- Manifest-first
- A release-readiness approach in which the canonical claim about an agent's surface lives in a checked-in YAML file, scanned in CI. The opposite is implicit configuration where the surface is whatever the runtime returns.
- Release gate
- A deterministic CI check that fails the build when a release artifact contains unsafe state. agents-shipgate fits the gate slot for AI agent tool surfaces.
- Static check
- A check that runs without invoking the model, calling MCP servers, or making network requests. Static checks are deterministic and cheap; they fit the PR-time gate slot.
- Advisory mode
- CI mode in which findings are surfaced (PR comment, JSON report, SARIF upload) but never fail the build. Use during initial adoption.
- Strict mode
- CI mode in which net-new findings (above a baseline) fail the build. The canonical settings are `ci_mode: strict` and `fail_on: critical,high`.
- Baseline
- A snapshot of currently-reviewed findings stored in `.agents-shipgate/baseline.json` so strict mode only fails on new gaps, not on pre-existing tech debt. Saved with `agents-shipgate baseline save`.
- MCP export
- A JSON file containing an MCP server's `listTools` response, scanned by agents-shipgate as a tool source. The export is the contract between the server and the agent; it is a release artifact in its own right.
- Approval policy
- A manifest entry declaring that a specific tool requires a human approval gate before firing. Format: `policies.require_approval_for_tools: [issue_refund, ...]`. Required for destructive, external-write, and financial actions.
- Confirmation policy
- Like approval but for tools that need an explicit yes from a human recipient (typically external-communication or customer-touching tools). Format: `policies.require_confirmation_for_tools: [...]`.
- Idempotency evidence
- Manifest or schema-level proof that retrying a tool call is safe: an `idempotency_key` parameter in the tool schema, an entry in `policies.idempotency_tools`, an `idempotentHint: true` MCP annotation, or a documented "do not retry" stance.
- Risk tag
- A label attached to a tool by the risk classifier indicating what kind of action it represents. Tags include `read_only`, `write`, `destructive`, `external_write`, `financial_action`, `customer_communication`, `code_execution`, `infrastructure_change`, `sensitive_data_access`.
- Finding
- A single result from the scan. Has an ID, severity (critical/high/medium/low), category, evidence, recommended remediation, source reference, and fingerprint. Findings are the atomic unit of the report.
- Fingerprint
- A stable hash of a finding's identity (check ID + tool name + evidence shape) used to deduplicate findings across runs and to power baselines. Stable across versions when nothing material has changed.
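The glossary does not give the hashing recipe, so the following is only a sketch of how a fingerprint could support baselines. It assumes SHA-256 and takes "evidence shape" to mean the sorted evidence keys with their values ignored; the check IDs, field names, and function names are hypothetical:

```python
import hashlib
import json


def fingerprint(check_id: str, tool_name: str, evidence: dict) -> str:
    """Hash a finding's identity; evidence values are deliberately ignored."""
    shape = sorted(evidence)  # assumption: shape = sorted evidence keys
    canonical = json.dumps([check_id, tool_name, shape], separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def net_new(findings: list[dict], baseline: set[str]) -> list[dict]:
    """Keep only findings whose fingerprint is absent from the baseline."""
    return [
        f for f in findings
        if fingerprint(f["check_id"], f["tool"], f["evidence"]) not in baseline
    ]


# Hypothetical findings: one already reviewed, one introduced by this change.
known = {"check_id": "APPROVAL-001", "tool": "issue_refund", "evidence": {"policy": None}}
fresh = {"check_id": "APPROVAL-001", "tool": "delete_account", "evidence": {"policy": None}}

baseline = {fingerprint(known["check_id"], known["tool"], known["evidence"])}
new_findings = net_new([known, fresh], baseline)
```

Hashing the shape rather than the values is one way to keep the fingerprint stable when incidental details change between runs.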
- Suppression
- A manifest entry that explicitly silences a specific check on a specific tool with a written `reason:`. Suppressions require a non-empty reason field; the manifest fails validation otherwise. Use sparingly.
- Healthcare for agents
- An operating thesis from Three Moons Lab: AI agents need a portfolio of pre-deployment and ongoing health checks — release-readiness gates, lifecycle baselines, policy drift detection, capability audits — to be safely deployed and operated. The metaphor underlines that agent governance is a continuous, multi-stage discipline rather than a single eval pass. agents-shipgate addresses the release-readiness slot.
- Agent lifecycle readiness
- The discipline of validating an AI agent across its full operational lifecycle: design, release, deployment, runtime, retirement. Release readiness (the agents-shipgate slot) is one phase; lifecycle readiness composes it with baselines, drift detection, policy review, and incident retros. Builds on traditional software application lifecycle management but adds agent-specific dimensions like tool-surface drift and approval-graph integrity.
- Agent governance
- The human and process layer that manages how AI agents are designed, released, operated, and retired in production. Encompasses release-readiness review, policy management, approval graphs, scope minimization, capability audits, and runtime drift control. Distinct from LLM evals (which test model behavior), observability (which records runtime), and runtime guardrails (which enforce access at call time).
- AI agent CI
- Continuous integration patterns adapted for AI agent projects. Every pull request runs static checks on the agent's tool surface, manifest, prompts, and policies via agents-shipgate, before the agent receives production-like permissions. AI agent CI replaces "did the code compile" with "is the released tool surface reviewable" as the canonical pre-merge question. Typically implemented as a GitHub Action with advisory mode first, then strict mode once a baseline is established.
- Release evidence packet
- The reviewer-shaped output of an agents-shipgate scan, distinct from the raw finding list. Contains ten always-present sections: release decision, capability and intent, high-risk surface, approval coverage, idempotency risk, scope coverage, memory isolation, human-in-the-loop, dynamic scenarios, and not-proven items. Available as Markdown, JSON, and HTML; PDF via the `[pdf]` extras. Governed by packet schema v0.3.
- Codex plugin surface
- Static tool-surface metadata extracted from OpenAI Codex plugin packages and marketplace stubs. Supported by agents-shipgate as a release artifact in the same way MCP exports and OpenAPI specs are: declared in `shipgate.yaml`, scanned with deterministic AST and JSON parsing, no plugin execution required. Added in report schema v0.13 as the `codex_plugin_surface` field.
- Insufficient evidence
- A release decision state added in agents-shipgate v0.14, alongside `passed`, `blocked`, and `review_required`. `insufficient_evidence` fires when a scan cannot reach a confident verdict, typically when at least half of scanned tools have low-confidence findings (the threshold is `ceil(N * 0.5)` with minimum 1, so 1-of-2 trips it), or more than three source-loader warnings were emitted during the scan. Distinct from `blocked` (specific findings prevent promotion) and `review_required` (specific items need human sign-off): this is the verdict for "don't promote until you have better inputs". Consumers that don't recognize the value should treat it as `review_required` per the v0.14 STABILITY clause.
- Provenance kind
- A per-finding enum in agents-shipgate `report.json` that records *how a check fired*. Added in report schema v0.15 (on top of v0.14's `insufficient_evidence`). Five values: `static_declaration` (manifest, MCP, OpenAPI, and declarative framework inputs like ADK YAML or LangChain/CrewAI inventory JSON; high-trust structural data), `ast_extraction` (tools parsed from user Python source by a framework extractor, subject to extraction error), `keyword_heuristic` (token-list matches like broad scope or read-only prompt names), `regex_heuristic` (regex matches for secrets and prompt injection), and `policy_pack` (findings from externally loaded policy packs). Independent of `confidence`, which records how *sure* a rule is rather than how it triggered. Lets reviewers and coding agents filter heuristic-only criticals from declarative ones.
- Tool source adapter
- The extension protocol introduced in agents-shipgate v0.11 R1 for adding new framework support. Each input type (MCP, OpenAPI, OpenAI Agents SDK, Anthropic Messages API, Google ADK, LangChain/LangGraph, CrewAI, n8n, Codex plugins, OpenAI Agents API) is a registered `ToolSourceAdapter` class with a `scope: "per_source" | "per_scan"` declaration. Defined in `src/agents_shipgate/inputs/protocol.py` and dispatched by `AdapterRegistry`. Adapters implement the protocol and are registered with `AdapterRegistry` (built-ins via `_register_builtin_adapters()` in canonical cohort order pinned by tests; entry-point discovery for third-party adapters is not yet implemented). The scan dispatcher itself does not need to change to add a new framework.
- Blast radius
- The seventh dimension of tool-use readiness. The size of the unwanted change a tool can make if it fires when it shouldn't. Evidence includes a declared `owner` (so the right team gets paged), prohibited actions enumerated in `agent.prohibited_actions` (so the boundary is explicit), and bounded resource scope (orders belonging to the calling user, records in the calling tenant, infrastructure tagged with the calling team's prefix). High-risk tools without these bounds get blast-radius findings.
- SARIF
- Static Analysis Results Interchange Format, the OASIS-standard JSON format that GitHub code scanning consumes. agents-shipgate emits findings as SARIF when `output.formats` includes `sarif`. Emitting only writes the file; to make findings appear in the GitHub Security tab alongside CodeQL and Dependabot, upload it via a `github/codeql-action/upload-sarif` step in the same workflow. SARIF is the file format; the content is the same finding list as `report.json`.
- Promotion
- The act of moving an agent from one environment to a higher-trust one: from local to staging, from staging to production-like, from production-like to production. The release-readiness gate sits before promotion: agents-shipgate runs in CI, the report is reviewed, and the manifest reflects the surface the agent will get in the target environment. `environment.target` in `shipgate.yaml` names the destination.
- Tool surface diff
- The comparison between two snapshots of an agent's tool surface, typically the current PR vs the merge base or a stored baseline. Tool surface diff catches added tools, removed tools, schema changes, and scope expansions that a release reviewer should explicitly approve. In v0.11.0, the verifier projects capability changes into `verifier.json` and the GitHub Action can evaluate PR diffs with `diff_base: target`.
- Tool source snapshot
- A point-in-time export of a tool source, typically an MCP server's `listTools` response saved as JSON, or an OpenAPI 3.x specification at a pinned version. agents-shipgate scans the snapshot rather than connecting to the live server. The snapshot lives in the repo (e.g. `mcp-exports/filesystem-server.json`), gets reviewed at PR time, and tracks tool inventory changes across server versions.
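The arithmetic in the Insufficient evidence entry can be checked with a short sketch. It encodes only the two typical triggers named there (at least `ceil(N * 0.5)` tools with low-confidence findings, minimum 1, or more than three source-loader warnings). The function names are hypothetical, and the real decision logic may weigh other inputs:

```python
import math


def low_confidence_threshold(tool_count: int) -> int:
    """ceil(N * 0.5) with a minimum of 1, as stated in the glossary entry."""
    return max(1, math.ceil(tool_count * 0.5))


def is_insufficient_evidence(
    tool_count: int, low_confidence_tools: int, loader_warnings: int
) -> bool:
    """True when either typical trigger from the entry is met."""
    too_many_low_confidence = low_confidence_tools >= low_confidence_threshold(tool_count)
    too_many_warnings = loader_warnings > 3  # "more than three" is strict
    return too_many_low_confidence or too_many_warnings


# 1-of-2 trips the verdict, but 1-of-3 does not (threshold is 2 for N=3).
one_of_two = is_insufficient_evidence(2, 1, 0)
one_of_three = is_insufficient_evidence(3, 1, 0)
```

With these inputs `one_of_two` is `True` and `one_of_three` is `False`, matching the "1-of-2 trips it" note in the entry.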