CI/CD Recipe — Gating PRs on AI Quality

Drop-in workflows that run your AI agent under SteelSpine on every PR, fail the build on regression, and produce signed audit artifacts retained 90 days.

GitHub Actions

Drop this in .github/workflows/ai-quality.yml:

name: AI Quality Gate

on:
  pull_request:
    paths:
      - 'agents/**'
      - 'prompts/**'
      - 'tests/agent/**'

jobs:
  steelspine-eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Install SteelSpine
        run: |
          curl -fsSL https://steelspine.ai/dist/steelspine_bundle_latest_linux.tgz \
            -o /tmp/steelspine.tgz
          tar -xzf /tmp/steelspine.tgz -C $HOME/
          bash $HOME/.prime/setup.sh
          echo "$HOME/.prime/bin" >> $GITHUB_PATH

      - name: Activate license
        run: steelspine license activate "${{ secrets.STEELSPINE_LICENSE_KEY }}"

      - name: Capture baseline run from main
        run: |
          git fetch origin main
          git checkout origin/main -- agents/ prompts/
          steelspine run --label baseline python3 agents/run_test_suite.py
          BASELINE_ID=$(steelspine run list --json | python3 -c "import sys,json; print(json.load(sys.stdin)[-1]['run_id'])")
          steelspine baseline set "$BASELINE_ID"
          git checkout HEAD -- agents/ prompts/

      - name: Capture PR run
        run: steelspine run --label pr-${{ github.event.pull_request.number }} \
                           python3 agents/run_test_suite.py

      - name: Hard gate on quality criteria
        run: |
          steelspine eval --last 1 \
            --min-pass-rate 0.95 \
            --max-failures 0 \
            --forbid "POLICY VIOLATION" \
            --forbid "EXCEPTION"

      - name: Regression gate (fail if PR is worse than main)
        run: steelspine compare --strict
        # exits 0 if PR is no worse than baseline
        # exits 2 if PR has more failures or fails where baseline succeeded

      - name: Generate signed audit (always — even on failure)
        if: always()
        run: |
          steelspine verify-run --compliance-html > audit.html
          steelspine verify-run > audit.txt

      - name: Upload audit artifact
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: steelspine-audit-pr-${{ github.event.pull_request.number }}
          path: |
            audit.html
            audit.txt
          retention-days: 90

What this gives you

BehaviorValue
PRs touching agents/ automatically capture an agent runCatches regressions before merge
Compares PR run against main baselineQuantitative quality gate
--strict exits non-zero on regression (Run B worse than Run A)Hard CI fail on quality drop
eval enforces minimum pass rate, max failures, forbidden patternsTight quality control
verify-run --compliance-html artifact retained 90 daysEU AI Act Art.12 logging requirement, automatically

Exit codes

CommandExit 0Exit 1Exit 2
steelspine evalcriteria metcriteria failed
steelspine compare --strictno regression(n/a)regression detected
steelspine compare --fail-on-diffidenticalany divergence
steelspine verify-runintegrity verified(rare — internal error)

GitHub Actions fails the job on any non-zero exit.

GitLab CI

ai-quality-gate:
  image: ubuntu:22.04
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
      changes:
        - agents/**
        - prompts/**
  before_script:
    - apt-get update && apt-get install -y curl python3
    - curl -fsSL https://steelspine.ai/dist/steelspine_bundle_latest_linux.tgz -o /tmp/steelspine.tgz
    - tar -xzf /tmp/steelspine.tgz -C $HOME/
    - bash $HOME/.prime/setup.sh
    - export PATH="$HOME/.prime/bin:$PATH"
    - steelspine license activate "$STEELSPINE_LICENSE_KEY"
  script:
    - steelspine run --label pr-$CI_MERGE_REQUEST_IID python3 agents/run_test_suite.py
    - steelspine eval --last 1 --min-pass-rate 0.95 --max-failures 0
    - steelspine compare --strict
  artifacts:
    when: always
    paths:
      - audit.html
      - audit.txt
    expire_in: 90 days

Setting STEELSPINE_LICENSE_KEY as a CI secret

The license key activates SteelSpine on the CI runner. Each runner counts against your license seat limit, so for high-throughput CI, request a multi-seat license.

GitHub: Repository Settings → Secrets and variables → Actions → New repository secret. Name: STEELSPINE_LICENSE_KEY.

GitLab: Project Settings → CI/CD → Variables → Add variable. Key: STEELSPINE_LICENSE_KEY. Mask the variable.

Tips

Cache the install. SteelSpine extracts to ~50MB. Cache it across CI runs to save 5-10s per pipeline:

- uses: actions/cache@v4
  with:
    path: ~/.prime
    key: steelspine-${{ runner.os }}-1.0.0

Run capture is idempotent: re-running the same PR doesn't pollute history; old runs prune per config.json retention rules.

The audit.html artifact is a real EU AI Act Article 12 deliverable: download from any successful or failed run and you have a tamper-evident, auditor-verifiable record retained for 90 days.

For data-sensitive pipelines: set archive_dir in ~/.prime/config.json to a CI-isolated path so audit artifacts don't leak between projects.