Add RAT for header checking #149
Pull request overview
Adds Apache RAT-based license header enforcement to address Issue #141 and updates key resource/config files to include the ASF license header so they pass header checks.
Changes:
- Added ASF license headers to Spring Boot `.properties` resources and `gradle/libs.versions.toml`.
- Added the Apache RAT Gradle plugin (via version catalog) and configured the `rat` task with excludes (including a `.gitignore`-derived exclude list).
- Documented RAT report location and intended integration with the verification lifecycle.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/main/resources/application.properties | Adds ASF license header to application properties. |
| src/main/resources/application-stdio.properties | Adds ASF license header to stdio profile properties. |
| src/main/resources/application-http.properties | Adds ASF license header to http profile properties. |
| gradle/libs.versions.toml | Adds ASF license header and introduces the RAT plugin version + alias. |
| build.gradle.kts | Applies RAT plugin and configures tasks.rat exclusions (explicit + .gitignore-derived). |
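For readers unfamiliar with what a header audit looks for, the idea can be sketched in a few lines of Java. This is not how Apache RAT is implemented (RAT recognises many license families and file types); it only illustrates the check that the headers added in this PR are meant to satisfy. The class name `HeaderCheck`, the 20-line window, and the sample property key are assumptions made for illustration.

```java
public class HeaderCheck {
    // Opening phrase of the standard ASF source header.
    // Assumption: a real audit tool matches far more than this single phrase.
    public static final String ASF_MARKER = "Licensed to the Apache Software Foundation (ASF)";

    /** Returns true if the marker appears within the first maxLines lines of the content. */
    public static boolean hasAsfHeader(String content, int maxLines) {
        return content.lines()
                .limit(maxLines)
                .anyMatch(line -> line.contains(ASF_MARKER));
    }

    public static void main(String[] args) {
        // Hypothetical .properties content; '#' starts a comment in that format.
        String withHeader = "# Licensed to the Apache Software Foundation (ASF) under one or more\n"
                + "# contributor license agreements.\n"
                + "spring.application.name=example\n";
        String withoutHeader = "spring.application.name=example\n";

        System.out.println(hasAsfHeader(withHeader, 20));
        System.out.println(hasAsfHeader(withoutHeader, 20));
    }
}
```

A file in the second category is the kind that would presumably be flagged in the RAT report unless it is covered by an exclude.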
Alternative approach for comparison: #150 implements this same RAT header enforcement as a buildSrc convention plugin. Whichever direction is preferred, inline (here) or buildSrc (#150), happy to converge on one.
…plugin (stacked on #138) (#150)

* docs: add Apache LICENSE and NOTICE files

Add the top-level Apache License 2.0 text and NOTICE file required by ASF release policy, and bundle them into the META-INF directory of every JAR produced by the build (main, bootJar, sources, javadoc).

See https://www.apache.org/legal/release-policy.html#licensing-documentation

* docs(spec): add SBOM generation design

Captures decisions made during brainstorming: CycloneDX over SPDX, embed-in-bootJar via Spring Boot's native CycloneDX integration, full build + Docker + Release coverage, no cosign attestation in this PR.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(plan): add SBOM generation implementation plan

Step-by-step bite-sized tasks covering: version catalog, Gradle plugin wiring, actuator endpoint enablement, focused HTTP integration test, CI workflow uploads, README + CLAUDE.md docs, final verification.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* chore(deps): add CycloneDX Gradle plugin 1.10.0 to version catalog

Plugin will be applied in the next commit. Adding the catalog entry first keeps build.gradle.kts changes reviewable in isolation.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* feat(build): generate and embed CycloneDX SBOM

Apply org.cyclonedx.bom Gradle plugin 2.4.1. Spring Boot 3.5's CycloneDxPluginAction auto-wires bootJar to embed the generated SBOM at META-INF/sbom/application.cdx.json, so every distribution (JAR, Jib JVM image, both Paketo native images) ships the embedded SBOM via bootJar packaging — no per-image wiring.

Plugin version note: 1.10.0 breaks against Gradle 9.4 with UnsupportedOperationException (ImmutableCollection.removeAll). 2.4.1 is the latest v1.x-compatible class layout (CycloneDxPlugin / CycloneDxTask) that Spring Boot's auto-integration recognizes; v3.x renamed the classes (CyclonedxPlugin) and is incompatible until Spring Boot adopts the new shape.

projectType is set explicitly to Component.Type.APPLICATION because v2.4.1 changed the property from Property<String> to Property<Component.Type>; Spring Boot's `.convention("application")` would store a raw String and break the task at execution time.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* feat(actuator): enable /actuator/sbom endpoint explicitly

`sbom` was already in management.endpoints.web.exposure.include; this makes the endpoint enablement explicit so the file conveys intent without relying on Spring Boot defaults.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(spec): drop integration-test scope, document plugin-version decisions

- Drop the planned SbomEndpointIntegrationTest: /actuator/sbom is stock Spring Boot functionality; our only project-specific addition is two property lines. The build itself fails if cyclonedxBom breaks (Spring Boot's bootJar auto-depends on it).
- Update plugin version note to 2.4.1 and explain why both 1.10.0 (Gradle 9.4 bug) and 3.x (Spring Boot class-name change) are unsuitable.
- CycloneDX schema 1.6 (plugin default) replaces the originally-noted 1.5.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(spec): drop stale 1.10.0 version reference

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(spec): inline the plugin-version constraints explanation

Earlier edit lost the detail by accident. Restored as part of the Tool choice section so the spec stands on its own.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* ci: upload CycloneDX SBOM as workflow artifact

Mirrors the existing JAR/test-results/coverage upload pattern. Retains the SBOM for 30 days (vs the standard 7) since supply-chain investigations often happen well after a build.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* ci(release): strict SBOM generation + upload + release attachment

The existing Generate SBOM step swallowed errors with `|| echo "..."`, masking failures now that the plugin is wired. Removes the fallback, uploads the SBOM as a 90-day workflow artifact, and attaches it to the v<version> GitHub Release when one exists (graceful fallback otherwise since the source release of record lives at dist.apache.org, not GitHub).

RELEASE_VERSION is already validated by validate-release; routing it through an env var instead of inline ${{ }} interpolation is defence-in-depth against actions-injection.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(readme): document SBOM location, retrieval, and scanning

New 'Supply chain & SBOM' section covers all four distribution channels (embedded in JAR/image, /actuator/sbom endpoint, GitHub Release asset, CI workflow artifact) and shows trivy/grype usage.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* refactor(build): drop unnecessary cyclonedxBom configuration

Spring Boot 3.5.14's CycloneDxPluginAction already sets outputName, outputFormat, projectType, and wires bootJar embedding — matching what Spring Initializr generates for the same dependency set. Verified that applying the plugin alone produces a valid CycloneDX 1.6 SBOM at META-INF/sbom/application.cdx.json inside the bootJar with component type=application.

The earlier projectType override + includeConfigs/skipConfigs were defensive but unnecessary; let the framework defaults work.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(agents): note SBOM generation in commands + architecture

CLAUDE.md symlinks to AGENTS.md; edit lands on the real file. Records the cyclonedxBom command and how the SBOM flows through bootJar → actuator → Docker images, so future agents have the mental model when working on related code.

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* style: apply spotless

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* feat(build): derive binary-release LICENSE/NOTICE from the SBOM

The base LICENSE/NOTICE are correct for the source release, but the binary release (the Spring Boot fat bootJar) bundles third-party bytecode and so per https://infra.apache.org/licensing-howto.html must additionally enumerate each bundled dependency's license and lift bundled ASF dependencies' NOTICE snippets.

Stacks on the CycloneDX SBOM work and reuses it as the source of dependency license data:

- generateBinaryLicense: base Apache-2.0 + an appendix listing every productionRuntimeClasspath dependency with a link to its license, read from the bundled SBOM (META-INF/sbom/application.cdx.json). The SBOM resolves a license for every component, including Gradle-module-metadata-only ASF artifacts (solr-solrj/solr-api) that POM-only scanners miss, so no per-dependency list is hand-maintained. It also gates the build: a bundled module missing from the SBOM, or carrying a license not in config/license-policy.json, fails the build.
- generateBinaryNotice: base NOTICE + the META-INF/NOTICE files lifted verbatim and de-duplicated from the bundled jars (the Shade ApacheNoticeResourceTransformer approach), so ASF dependency notices stay current automatically.

config/license-policy.json holds the allowedLicenses set plus overrides (group:name -> SPDX id) correcting the few components CycloneDX mislabels (mcp-server-security -> Apache-2.0; ANTLR ST4/antlr-runtime -> BSD-3-Clause). Source-form jars keep the base LICENSE/NOTICE.

Verified: ./gradlew build green; fat jar META-INF/LICENSE lists 158 deps (incl. SolrJ) and META-INF/NOTICE aggregates 21 upstream notices.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* refactor(build): extract LICENSE/NOTICE generation to a buildSrc plugin

Move the inline LICENSE/NOTICE logic out of the root build.gradle.kts into a buildSrc convention plugin (org.apache.solr.mcp.license-notice) backed by two typed tasks:

- GenerateBinaryLicense / GenerateBinaryNotice are proper DefaultTask types with @InputFile/@InputFiles/@OutputFile, so they're incremental and (being real .kt files) free of the kts-script-compiler limitations that forced the previous Pair-based workarounds — the logic now reads as plain Kotlin with data classes.
- The root build.gradle.kts drops ~250 lines and three imports, and just applies `id("org.apache.solr.mcp.license-notice")`.

Behaviour is unchanged: the bootJar still bundles a LICENSE with the SBOM-derived 158-dependency appendix (incl. SolrJ) and a NOTICE aggregating 21 upstream notices; source-form jars keep the base files; `check` still runs the gate. The tasks now live in buildSrc, so they can be unit-tested with Gradle TestKit.

Verified: ./gradlew build green; fat-jar META-INF/LICENSE and NOTICE identical to the pre-refactor output.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* test(build): unit-test the LICENSE/NOTICE buildSrc tasks

Add ProjectBuilder-based tests for the two convention-plugin tasks (now possible since they live in buildSrc as typed tasks). Covers the correctness-critical behaviour without needing the full spring-boot + cyclonedx stack:

- generateBinaryLicense: appendix lists bundled deps with SPDX links, applies a policy override to correct a mislabelled SBOM license, and preserves the base LICENSE text; the gate fails on a disallowed license and on a bundled coordinate absent from the SBOM.
- generateBinaryNotice: aggregates bundled META-INF/NOTICE files verbatim, de-duplicates identical notices, attributes each to its module, and emits just the project NOTICE when no dependency notices exist.

buildSrc's test task runs as part of `./gradlew build`, so these are enforced on every build.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(build): explain the LICENSE/NOTICE tasks in comments and AGENTS.md

Add step-by-step comments to GenerateBinaryLicense/GenerateBinaryNotice walking through what each phase does (load policy, index the SBOM, resolve+gate each shipped dependency, write the file; and notice matching/de-dup/attribution). Expand the AGENTS.md "Release LICENSE / NOTICE" section with where the tasks are unit-tested and a short runbook for what to do when the license gate fails (add an override for an SBOM mislabel, or allow a genuinely new license) instead of silencing it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* refactor(build): drop license-policy.json; disclose SBOM licenses verbatim

apache/solr has no license allow-list (it uses a per-dependency licenses/ folder, which JanHoy said not to replicate), and the binary LICENSE is a disclosure, not a license policy. Remove config/license-policy.json and the allow-list gate + override corrections it powered.

generateBinaryLicense now lists each shipped dependency with the license the CycloneDX SBOM reports, verbatim — so a few imprecise-but-permissive upstream labels appear as-is (mcp-server-security: Apache-1.0; ANTLR: BSD-4-Clause / BSD licence). The appendix preamble says licenses are as-reported and links each one. The remaining gate is completeness only: fail if a bundled dependency is absent from the SBOM, so nothing is silently omitted from the LICENSE.

Tests updated to assert verbatim SBOM labels and SBOM name/URL handling.

Verified: ./gradlew build green; fat-jar LICENSE still lists 158 deps and NOTICE aggregates 21 upstream notices.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(build): point the LICENSE appendix to the bundled SBOM

Add a line to the appendix preamble noting the machine-readable bill of materials (component versions, hashes, licenses) is bundled at META-INF/sbom/application.cdx.json — the inline appendix stays the human-readable disclosure, with the SBOM offered for tooling.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs: document where/when the binary LICENSE & NOTICE are available

Add a 'where / when they appear' note to the Release LICENSE / NOTICE section: both binary files are regenerated on every build (tasks run ahead of bootJar and in check), land at META-INF/LICENSE and META-INF/NOTICE in the fat jar and thus in every published Docker image, and are also written to build/generated/license/ for local viewing; source-form jars carry the repo-root base files.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(build): explain the buildSrc LICENSE/NOTICE plugin for non-Gradle readers

Reviewers who don't work with Gradle had no easy way into buildSrc. Add:

- buildSrc/README.md: what buildSrc is, a short glossary of the Gradle concepts the code uses (Task, @TaskAction, the input/output annotations, Property/Provider types, convention plugin, productionRuntimeClasspath), and the end-to-end flow.
- KDoc on GenerateBinaryLicense / GenerateBinaryNotice: a "for readers new to Gradle" orientation on each class plus a note on every annotated property explaining what the input/output annotation does (up-to-date checking, ordering).
- A note on the convention plugin header explaining precompiled script plugins, and a comment on buildSrc/build.gradle.kts explaining what it builds.

Documentation only; no behaviour change. ./gradlew :buildSrc:test green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* docs(build): comment the convention plugin body for non-Gradle readers

Add plain-language inline comments through the plugin body explaining the parts that are opaque without Gradle background: what a 'configuration' is and why productionRuntimeClasspath equals 'what ships', how the lazy provider chains (flatMap/map over resolvedArtifacts) derive the coordinate list and the jar-name->coordinate map, what tasks.register/.set wiring does, and how metaInf from(...) plus dependsOn bundle the generated files into the bootJar while the source-form jars keep the base files.

Comments only; code unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

* Dont double load the root files

* feat(build): enforce Apache license headers via RAT convention plugin

Add Apache RAT (Release Audit Tool) header enforcement as an `org.apache.solr.mcp.rat` buildSrc convention plugin, stacked on the license-notice plugin from #138. RAT is wired into `check`, so `./gradlew build` audits that every scanned file carries an ASF header (report at build/reports/rat/index.html).

The .gitignore-to-RAT-glob translation lives in a pure, unit-tested `RatExcludes` helper rather than inline in build.gradle.kts. Moving it to buildSrc fixes two gitignore-semantics gaps from the inline approach: interior-slash patterns (e.g. src/generated) are now root-anchored instead of matched at any depth, and the negation/anchoring rules are documented and tested. Local developer-tooling dirs (.claude worktrees, .kotlin caches) are excluded so contributors don't hit spurious audit failures.

ASF headers are added to the three application*.properties and libs.versions.toml so they pass the audit. Supersedes the inline approach in #149.

Stacked on #138. Fixes #141.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>

---------

Signed-off-by: Aditya Parikh <aditya.m.parikh@gmail.com>
Signed-off-by: adityamparikh <aditya.m.parikh@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Eric Pugh <epugh@opensourceconnections.com>
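The anchoring rule described in the last commit above (a pattern with a leading or interior slash is anchored at the repository root, while a slash-free pattern matches at any depth) can be sketched roughly as follows. This is not the PR's `RatExcludes` helper, whose source is not shown here; the class name `GitignoreGlobs`, the Ant-style `**` output, and the decision to skip negated (`!`) patterns are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class GitignoreGlobs {
    /** Translates .gitignore lines into Ant-style exclude globs (illustrative sketch only). */
    public static List<String> toGlobs(List<String> gitignoreLines) {
        List<String> globs = new ArrayList<>();
        for (String raw : gitignoreLines) {
            String line = raw.strip();
            // Blank lines and comments carry no pattern.
            if (line.isEmpty() || line.startsWith("#")) continue;
            // Assumption: negations re-include files, which an exclude list cannot express, so skip them.
            if (line.startsWith("!")) continue;

            // A trailing slash means "directories only"; remember that and drop the slash.
            boolean dirOnly = line.endsWith("/");
            if (dirOnly) line = line.substring(0, line.length() - 1);

            // Any remaining slash (leading or interior) anchors the pattern at the root.
            boolean anchored = line.contains("/");
            if (line.startsWith("/")) line = line.substring(1);
            if (line.isEmpty()) continue;

            String base = anchored ? line : "**/" + line;
            if (!dirOnly) globs.add(base); // the entry itself, in case it names a file
            globs.add(base + "/**");       // everything beneath it, in case it names a directory
        }
        return globs;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("# build output", "build/", "*.log", "src/generated", "/.idea", "!keep.log");
        toGlobs(lines).forEach(System.out::println);
    }
}
```

With this sketch, `src/generated` yields `src/generated` and `src/generated/**` rather than `**/src/generated`, which is the root-anchoring behaviour the commit message describes; `build/` yields only `**/build/**` because it has no interior slash and is directory-only.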
Closing in favour of #150.
Fixes #141