CI Triage Lite

CI Triage Lite is a deterministic-first GitHub Action that turns failed GitHub Actions logs into compact, actionable PR comments.

It is designed for teams that want a lightweight first pass before digging through long CI logs:

  • What failed?
  • What probably caused it?
  • Which log lines matter?
  • What should I try next?

The core analysis runs inside GitHub Actions and does not require an AI API key. Optional OpenAI-compatible summaries can be enabled later, but the Action is useful without sending logs to an external AI provider.

Status

Developer preview. The current version supports deterministic failure classification, matrix dedupe, repository-specific custom rules, and optional OpenAI-compatible summaries.

Initial scope:

  • Strongest built-in coverage today is for GitHub Actions projects using Node.js, TypeScript, npm/pnpm/yarn, package exports, and Playwright.
  • General rules also cover missing secrets, missing command-line tools, repository policy checks, workflow gate failures, toolchain version mismatches, build tool failures, code scanning failures, infrastructure-as-code validation, security scanner findings, Docker build failures, network/registry failures, and broad test/lint/typecheck failures.
  • Other stacks can still use custom .ci-triage-lite.yml rules, but deeper Python, Ruby, Go, Rust, and JVM rule packs are future work.
  • GitHub API reads for workflow jobs and existing PR comments follow paginated results, so large matrix runs and comment-heavy pull requests are handled without relying on the first 100 items only.
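
The pagination behavior in the last bullet can be sketched roughly as follows. This is not the Action's actual code: `listFailedJobs` and `PageFetcher` are hypothetical names, and the page fetcher is injected so the sketch does not depend on a particular HTTP client. It assumes the response shape of GitHub's "list jobs for a workflow run" REST endpoint, which returns a `total_count` and a `jobs` array and accepts `page` and `per_page` (at most 100).

```typescript
// Shape of one page from GitHub's "list jobs for a workflow run" endpoint.
interface JobsPage {
  total_count: number;
  jobs: { id: number; name: string; conclusion: string | null }[];
}

// Hypothetical fetcher: given a 1-based page number and a page size, returns that page.
type PageFetcher = (page: number, perPage: number) => Promise<JobsPage>;

// Collect failed jobs across every page instead of stopping at the first 100 items.
async function listFailedJobs(
  fetchPage: PageFetcher,
  perPage = 100,
): Promise<JobsPage["jobs"]> {
  const all: JobsPage["jobs"] = [];
  for (let page = 1; ; page++) {
    const result = await fetchPage(page, perPage);
    all.push(...result.jobs);
    // Stop on a short page, or once everything reported by total_count is collected.
    if (result.jobs.length < perPage || all.length >= result.total_count) break;
  }
  return all.filter((job) => job.conclusion === "failure");
}
```

In a real caller, `fetchPage` would wrap an authenticated request to the jobs endpoint for the current run; the same loop applies to listing existing PR comments.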

What It Does

  • Reads failed jobs from the current GitHub Actions workflow run
  • Fetches failed job logs through the GitHub API
  • Redacts common token/secret patterns
  • Extracts relevant error lines
  • Removes recurring runner/setup noise from the report
  • Classifies common failures, with strongest built-in coverage for Node.js/TypeScript/Playwright projects:
    • Node.js dependency install failures
    • package-manager policy failures such as pnpm approve-builds and frozen-lockfile config mismatches
    • npm security audit failures
    • dependency updates that cannot reach a fixed non-vulnerable version
    • dependency policy audits such as unvetted Rust dependencies
    • package export errors
    • Ruby/PHP/Python-style dependency resolution failures
    • missing Node.js packages/modules
    • missing command-line tools
    • bundler module resolution failures
    • Playwright failures
    • TypeScript/ESLint/Prettier-style lint/typecheck failures
    • RuboCop, markdownlint, ruff, mypy, golangci-lint, and Sonar-style diagnostics
    • commit message lint failures
    • repository policy checks such as PR labels, title rules, changelog gates, and coverage thresholds
    • workflow gate failures from required upstream jobs or canceled runs
    • toolchain version mismatches
    • Maven, Gradle, and dotnet build/restore failures
    • CodeQL/code-scanning configuration failures
    • missing artifact upload outputs
    • Terraform/OpenTofu formatting, linting, and validation failures
    • security scanner findings from tools such as Grype or Trivy
    • generated file drift detected by git diff --exit-code
    • generic test failures
    • missing environment variables/secrets
    • Docker build failures
    • network/registry failures
  • Applies repository-specific custom rules before built-in rules
  • Deduplicates repeated matrix failures
  • Writes a GitHub step summary
  • Optionally posts or updates a pull request comment
  • Optionally adds an AI-assisted summary using an OpenAI-compatible API
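
As an illustration of the matrix dedupe step in the list above, the sketch below merges failures that share a classification so that one report entry covers every affected matrix job. The `Failure` and `GroupedFailure` shapes and the choice of type plus cause as the identity key are assumptions made for this example; the README does not specify how the Action identifies duplicates.

```typescript
interface Failure {
  job: string;   // e.g. "test (node 18)"
  type: string;  // failure class, e.g. "playwright-failure"
  cause: string;
}

interface GroupedFailure {
  type: string;
  cause: string;
  jobs: string[];
}

// Merge failures that share a type and cause, keeping first-seen order.
// Assumption: type + cause is a sufficient identity key for a matrix duplicate.
function dedupeFailures(failures: Failure[]): GroupedFailure[] {
  const groups = new Map<string, GroupedFailure>();
  for (const f of failures) {
    const key = JSON.stringify([f.type, f.cause]);
    const existing = groups.get(key);
    if (existing) {
      existing.jobs.push(f.job);
    } else {
      groups.set(key, { type: f.type, cause: f.cause, jobs: [f.job] });
    }
  }
  // Map iteration follows insertion order, so the first failure seen stays first.
  return Array.from(groups.values());
}
```

With this key, a lint failure repeated across three Node versions collapses into one entry listing three jobs, while a distinct failure in a fourth job stays separate.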

Package and export errors can include contextual hints when the log is specific enough, for example a likely Tailwind CSS v4 migration issue or a missing Tailwind v4 PostCSS adapter.

Rule design notes are documented in docs/rule-design.md. Core rules should stay broad and evidence-driven; repository-specific patterns should use .ci-triage-lite.yml.

Why Deterministic First

Most CI failures do not need a broad AI review. They need the right lines pulled out of a long log, a stable failure class, and a next action that does not change from run to run.

CI Triage Lite starts with deterministic rules because they are:

  • predictable across repeated runs
  • cheap to run on every failed PR
  • safe to use without external AI calls
  • easy to tune with repository-specific rules
  • useful even when a provider API key is unavailable

AI summaries are optional. Use them when natural-language synthesis is worth the extra cost and data flow. Leave them off when deterministic triage is enough.

Current Limits

CI Triage Lite is most useful when a repository has repeated CI failures where opening logs and finding the decisive lines is real work. It is less useful when failures are rare, obvious from the GitHub Actions UI, or already handled by a project-specific bot.

  • Built-in rules are strongest for GitHub Actions, Node.js, TypeScript, package managers, Playwright, lint/typecheck, repository policy checks, and common infrastructure/security tooling.
  • Deterministic rules can be wrong or too generic when a project has unusual log formats. Use .ci-triage-lite.yml for project-specific recurring failures.
  • An unknown classification is not treated as a product failure by itself. CI Triage Lite should prefer unknown over a confident-looking but weak classification when the log contains only generic cancellation, success, or follow-on cleanup messages.
  • Redaction covers common token shapes and user-provided regex patterns. It is not a complete secret scanner.
  • Custom rule config intentionally supports a small YAML subset, not full YAML. Avoid multiline values, anchors, nested objects, and advanced YAML syntax.

Custom Rules

Add .ci-triage-lite.yml at the repository root to classify project-specific failures before the built-in rules run. This is useful when your team has recurring failures with known causes, internal service names, or project-specific recovery steps.

rules:
  - match: "AcmeQueueTimeout"
    type: acme-queue-timeout
    cause: "The Acme queue timed out."
    next: "Restart the Acme worker and retry the job."
    context: "Acme worker"
    severity: high

Another example for a known project bootstrap failure:

rules:
  - match: "Cannot find module 'internal-test-harness'"
    type: internal-test-harness-missing
    cause: "The internal test harness package was not installed in CI."
    next: "Check the private registry token and rerun npm ci."
    context: "Private npm registry"
    severity: high

For the initial custom-rule support, CI Triage Lite reads a small YAML subset rather than full YAML syntax. Each rule uses one regex-style match string. match, type, cause, and next are required. context and severity are optional, and severity defaults to medium. Custom rules win over built-in rules.
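
The field requirements above can be sketched as a small validation and matching step. This is an illustrative sketch, not the Action's parser: `normalizeRule` and `matchCustomRule` are hypothetical names, the input is assumed to be an already-parsed key/value record, and both case-sensitive matching and "first matching rule wins" are assumptions.

```typescript
interface CustomRule {
  match: string;     // regex-style pattern (required)
  type: string;      // required
  cause: string;     // required
  next: string;      // required
  context?: string;  // optional
  severity: string;  // optional in the file, defaults to "medium"
}

// Validate one parsed rule entry and apply the documented severity default.
function normalizeRule(raw: Record<string, string | undefined>): CustomRule {
  for (const field of ["match", "type", "cause", "next"]) {
    if (!raw[field]) {
      throw new Error(`custom rule is missing required field "${field}"`);
    }
  }
  return {
    match: raw.match as string,
    type: raw.type as string,
    cause: raw.cause as string,
    next: raw.next as string,
    context: raw.context,
    severity: raw.severity || "medium",
  };
}

// Return the first custom rule whose pattern matches any log line, or undefined
// so the caller can fall through to the built-in rules.
function matchCustomRule(rules: CustomRule[], logLines: string[]): CustomRule | undefined {
  for (const rule of rules) {
    const pattern = new RegExp(rule.match);
    if (logLines.some((line) => pattern.test(line))) return rule;
  }
  return undefined;
}
```

Because match is treated as a regex, characters such as `.` and `(` in a literal message need escaping if they should match literally.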

Built-In Rule Selection

Built-in rules have stable rule ids and broader groups. By default, all built-in groups are enabled. Use group or rule selection when a repository already has another bot for a category, or when one rule creates comment noise for your workflow.

Available groups:

  • core: generic tests, missing secrets/env, missing tools, network/registry, workflow gates, artifact upload, broad lint/typecheck, toolchain mismatch
  • javascript: npm/pnpm/yarn, package exports, missing Node modules, bundler/module resolution
  • playwright: Playwright-specific failures and focused reproduction hints
  • security: npm audit, CodeQL/code scanning, Trivy/Grype/Snyk-style scanner findings
  • repo-policy: commitlint, PR label/title/changelog/coverage gates, generated file drift
  • dependency-policy: dependency policy and security-update constraints
  • dependency: non-JavaScript dependency resolver failures such as Bundler or Python resolver output
  • build-tools: Maven, Gradle, dotnet, and similar build/restore failures
  • iac: Terraform/OpenTofu/TFLint
  • docker: Docker build failures

Example using .ci-triage-lite.yml:

enabled-groups:
  - core
  - javascript
  - playwright
disabled-rules:
  - artifact-upload-missing-files

Example using Action inputs:

      - uses: highk/ci-triage-lite@v0
        with:
          github-token: ${{ github.token }}
          enabled-rule-groups: core,javascript,playwright
          disabled-rules: artifact-upload-missing-files

Rule selection precedence:

  • custom .ci-triage-lite.yml rules are always enabled
  • disabled-rules disables a built-in rule even when its group is enabled
  • enabled-rules enables a built-in rule even when its group is disabled
  • disabled-groups (Action input: disabled-rule-groups) disables a built-in group
  • enabled-groups (Action input: enabled-rule-groups) restricts built-in matching to the listed groups
  • blank selection keeps the default behavior: all built-in groups enabled
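
One way to read the precedence list above is as a single decision function for built-in rules. The sketch below is an interpretation, not the Action's implementation: the `BuiltInRule` and `Selection` shapes are hypothetical, and letting disabled-rules win when a rule appears in both disabled-rules and enabled-rules is an assumption based on the order of the list.

```typescript
interface BuiltInRule {
  id: string;
  group: string;
}

interface Selection {
  enabledGroups: string[];
  disabledGroups: string[];
  enabledRules: string[];
  disabledRules: string[];
}

// Split a comma-separated Action input into trimmed, non-empty entries.
function parseList(input: string): string[] {
  return input
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item !== "");
}

// Decide whether a built-in rule participates in matching.
function isBuiltInRuleEnabled(rule: BuiltInRule, sel: Selection): boolean {
  if (sel.disabledRules.includes(rule.id)) return false;      // explicit disable wins
  if (sel.enabledRules.includes(rule.id)) return true;        // explicit enable beats group state
  if (sel.disabledGroups.includes(rule.group)) return false;  // group switched off
  if (sel.enabledGroups.length > 0) {
    return sel.enabledGroups.includes(rule.group);            // allow-list of groups
  }
  return true;                                                // blank selection: everything on
}
```

Custom rules from .ci-triage-lite.yml never pass through this check, since they are always enabled.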

Usage

Add this job after your normal CI jobs:

name: CI

on:
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm test

  triage:
    needs: [test]
    if: always()
    runs-on: ubuntu-latest
    permissions:
      actions: read
      contents: read
      pull-requests: write
      issues: write
    steps:
      - uses: actions/checkout@v4
      - uses: highk/ci-triage-lite@v0
        with:
          github-token: ${{ github.token }}
          comment-mode: update

Optional AI Summaries

Deterministic triage is the default. If you want an additional natural-language summary, provide an OpenAI-compatible API key:

      - uses: highk/ci-triage-lite@v0
        with:
          github-token: ${{ github.token }}
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          model: gpt-5.4-mini
          reasoning-effort: low

OpenAI-compatible providers are supported:

      - uses: highk/ci-triage-lite@v0
        with:
          github-token: ${{ github.token }}
          openai-api-key: ${{ secrets.DEEPSEEK_API_KEY }}
          openai-base-url: https://api.deepseek.com/v1
          model: deepseek-chat
          temperature: "0.2"
          max-tokens: "600"

For provider-specific options, pass a JSON object:

      - uses: highk/ci-triage-lite@v0
        with:
          github-token: ${{ github.token }}
          openai-api-key: ${{ secrets.OPENROUTER_API_KEY }}
          openai-base-url: https://openrouter.ai/api/v1
          model: openai/gpt-5.4-mini
          extra-body-json: '{"provider":{"sort":"throughput"}}'
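
To make the interaction between these inputs concrete, here is a rough sketch of how a chat completions request body could be assembled from them. It is not taken from the Action's source: `buildRequestBody` and `SummaryInputs` are hypothetical names, and the shallow merge in which extra-body-json keys override earlier keys is an assumption. What it does follow from the Inputs table is that blank temperature, max-tokens, and reasoning-effort values are omitted and that extra-body-json must be a JSON object.

```typescript
interface SummaryInputs {
  model: string;
  temperature: string;      // Action inputs arrive as strings; blank means "not set"
  maxTokens: string;
  reasoningEffort: string;
  extraBodyJson: string;
}

function buildRequestBody(inputs: SummaryInputs, prompt: string): Record<string, unknown> {
  const body: Record<string, unknown> = {
    model: inputs.model,
    messages: [{ role: "user", content: prompt }],
  };
  // Optional fields are left out entirely when their input is blank.
  if (inputs.temperature.trim() !== "") body.temperature = Number(inputs.temperature);
  if (inputs.maxTokens.trim() !== "") body.max_tokens = Number(inputs.maxTokens);
  if (inputs.reasoningEffort.trim() !== "") body.reasoning_effort = inputs.reasoningEffort.trim();

  if (inputs.extraBodyJson.trim() !== "") {
    const extra: unknown = JSON.parse(inputs.extraBodyJson);
    if (typeof extra !== "object" || extra === null || Array.isArray(extra)) {
      throw new Error("extra-body-json must be a JSON object");
    }
    // Assumption: shallow merge, with provider-specific keys applied last.
    Object.assign(body, extra);
  }
  return body;
}
```

Under this reading, a key in extra-body-json that collides with a standard field such as model would replace it, which is one more reason to keep that input small and free of secrets.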

Example Output

For a Playwright timeout, CI Triage Lite posts a deterministic PR comment like:

## CI Triage Lite

A failed workflow was triaged from the most relevant log lines.

### e2e

- **Type:** playwright-failure
- **Severity:** high
- **Confidence:** medium
- **Likely cause:** A Playwright browser test failed or timed out.
- **Next action:** Reproduce with `npx playwright test tests/home.spec.ts:12 --trace=on`, then inspect the trace and selector state.

When a custom rule matches, the output uses your repository-specific type, context, cause, and next action. Built-in classifications apply only when no custom rule matches.
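
The comment layout shown above can be produced by a small formatter. The sketch below only mirrors the headings and bullet labels from the example; `TriageResult` and `renderComment` are hypothetical names, and the real Action may include additional sections such as the extracted log lines.

```typescript
interface TriageResult {
  job: string;
  type: string;
  severity: string;
  confidence: string;
  cause: string;
  next: string;
}

// Render one Markdown comment with a section per triaged job.
function renderComment(results: TriageResult[]): string {
  const lines: string[] = [
    "## CI Triage Lite",
    "",
    "A failed workflow was triaged from the most relevant log lines.",
  ];
  for (const r of results) {
    lines.push(
      "",
      `### ${r.job}`,
      "",
      `- **Type:** ${r.type}`,
      `- **Severity:** ${r.severity}`,
      `- **Confidence:** ${r.confidence}`,
      `- **Likely cause:** ${r.cause}`,
      `- **Next action:** ${r.next}`,
    );
  }
  return lines.join("\n");
}
```

Because the output depends only on the classified results, repeated runs over the same failure produce the same comment text, which keeps update mode from churning the PR thread.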

Inputs

| Input | Required | Default | Description |
| --- | --- | --- | --- |
| github-token | yes | ${{ github.token }} | Token used to read workflow logs and post PR comments. |
| openai-api-key | no | | Optional OpenAI-compatible API key. |
| openai-base-url | no | https://api.openai.com/v1 | OpenAI-compatible API base URL. |
| model | no | gpt-5.4-mini | Model used for optional AI summaries. |
| temperature | no | | Optional chat completions temperature. Omitted when blank. |
| max-tokens | no | | Optional max_tokens value for the AI summary. Omitted when blank. |
| reasoning-effort | no | | Optional OpenAI-style reasoning_effort value. Omitted when blank. |
| extra-body-json | no | | Optional JSON object merged into the chat completions request body for provider-specific settings. |
| max-log-lines | no | 160 | Maximum relevant log lines to keep per failed job. |
| comment-mode | no | update | One of new, update, summary-only. |
| redact-patterns | no | | Newline-separated regular expressions to redact. |
| enabled-rule-groups | no | | Comma-separated built-in rule groups to enable. Leave blank to enable all groups. |
| disabled-rule-groups | no | | Comma-separated built-in rule groups to disable. |
| enabled-rules | no | | Comma-separated built-in rule ids or types to enable even when their group is disabled. |
| disabled-rules | no | | Comma-separated built-in rule ids or types to disable even when their group is enabled. |
| fail-on-error | no | false | Set to true if this action should fail when triage cannot complete. |

Failure Behavior

CI Triage Lite is fail-soft by default. If the triage step itself cannot complete, it writes a warning to the step summary and exits successfully so it does not block the pull request.

Set fail-on-error: true if you prefer strict behavior.
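
The fail-soft behavior can be pictured as a thin wrapper around the triage run. This is a minimal sketch under stated assumptions: `runFailSoft` and `Reporter` are hypothetical names, the warning text is invented for the example, and the returned number stands in for the step's exit code.

```typescript
interface Reporter {
  // Stand-in for writing a warning to the GitHub step summary.
  warn(message: string): void;
}

// Run triage and translate an internal error into an exit code.
// Returns 0 on success or on a soft failure, 1 only when failOnError is set.
async function runFailSoft(
  triage: () => Promise<void>,
  failOnError: boolean,
  reporter: Reporter,
): Promise<number> {
  try {
    await triage();
    return 0;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    reporter.warn(`CI Triage Lite could not complete: ${message}`);
    return failOnError ? 1 : 0;
  }
}
```

The point of the default is that a problem in the triage step, such as a log download error, is reported without adding a second red check on top of the CI failure being triaged.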

Security Notes

  • This action reads workflow logs. Treat logs as potentially sensitive.
  • Common token patterns are redacted before analysis.
  • Add project-specific redact-patterns for internal hostnames, customer identifiers, or secrets.
  • If openai-api-key is not provided, no AI provider is called.
  • If openai-api-key is provided, the extracted relevant log lines are sent to the configured OpenAI-compatible endpoint.
  • extra-body-json is inserted into the chat completions request. Do not put secrets in this field.
  • This repository contains no workflow files of its own, because GitHub Marketplace requires that the repository of a published action not contain any.
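
As a rough picture of how redaction with redact-patterns could work, the sketch below parses the newline-separated input into regular expressions and replaces every match before a line is analyzed or sent anywhere. The names, the `[REDACTED]` placeholder, and the two illustrative default patterns are assumptions; the Action's real built-in pattern list is not shown in this README.

```typescript
// Illustrative defaults only; not the Action's actual pattern list.
const ILLUSTRATIVE_PATTERNS: RegExp[] = [
  /gh[pousr]_[A-Za-z0-9]{36,}/g,      // GitHub token shapes such as ghp_...
  /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g,  // HTTP bearer tokens
];

// Turn the newline-separated redact-patterns input into global regexes,
// skipping blank lines.
function parseRedactPatterns(input: string): RegExp[] {
  return input
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "")
    .map((line) => new RegExp(line, "g"));
}

// Replace every match of every pattern in a single log line.
function redact(line: string, patterns: RegExp[]): string {
  return patterns.reduce((text, pattern) => text.replace(pattern, "[REDACTED]"), line);
}
```

A workflow could then pass an expression such as `db-\d+\.internal\.example` in redact-patterns to hide internal hostnames. As noted in Current Limits, pattern-based redaction is not a complete secret scanner.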

Local Development

npm test
npm run lint

For end-to-end dogfooding, use a separate test repository and add the workflow from the Usage section.

License

MIT
