Test elastic-package#3725 - DO NOT MERGE #19907

Draft
elastic-vault-github-plugin-prod[bot] wants to merge 3 commits into main from test-elastic-package-pr-3725

@elastic-vault-github-plugin-prod


Update elastic-package reference to elastic/elastic-package@45bf8de.
Automated by Buildkite build

Relates: elastic/elastic-package#3725

Add a one-off Buildkite path for PR 19907 that forces full package coverage on 9.5.0-SNAPSHOT with LogsDB and LogsDB Columnar enabled. This ensures the normal integrations test flow exercises the elastic-package columnar implementation under realistic CI conditions.
@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

TL;DR

The failed Buildkite jobs are caused by the elastic-package version under test enforcing stricter package output: several data stream categories no longer include all categories from their policy templates, and http_endpoint generated docs now render HTTP Endpoint, making two committed READMEs stale. Two stack-test jobs (httpcheck_otel, iis_otel_input) only have teardown/tail logs in the prefetched data, so their exact test assertion needs the uploaded artifacts or full job log.

Remediation

  • For the category lint failures, align each affected data stream manifest with its policy template categories, or remove security from the policy template if it is not intended:
    • packages/aws/data_stream/transitgateway/manifest.yml: add security to match packages/aws/manifest.yml transitgateway policy template.
    • packages/gcp/data_stream/vpcflow/manifest.yml, packages/gcp/data_stream/loadbalancing_metrics/manifest.yml, and packages/gcp/data_stream/loadbalancing_logs/manifest.yml: add security or adjust the corresponding GCP policy templates.
    • packages/tencent_cloud/data_stream/cos/manifest.yml and packages/tencent_cloud/data_stream/clb/manifest.yml: add security or adjust the Tencent Cloud policy templates.
  • Rebuild docs for packages/doppler and packages/kolide with the tested elastic-package version so docs/README.md picks up the generated HTTP Endpoint capitalization.
  • Re-run the failed package checks, and inspect the uploaded JUnit/artifacts or full logs for httpcheck_otel and iis_otel_input because the fetched logs do not include the failing assertion.
Investigation details

Root Cause

This PR only changes the Buildkite one-off mode and the elastic-package module replacement (go.mod replaces github.com/elastic/elastic-package with github.com/andrewkroh/elastic-package at commit 45bf8deca769). The package failures are therefore from the behavior of the elastic-package build under test, not from direct package edits.

Category validation failures

The new lint validation compares each data stream manifest’s categories with the categories configured on its policy template.

Examples from the current source:

  • packages/aws/manifest.yml:704-710 defines policy template transitgateway with categories: [security], while packages/aws/data_stream/transitgateway/manifest.yml:1-6 lists only aws, network, and observability.
  • packages/gcp/manifest.yml:93-98 defines policy template vpcflow with security, while packages/gcp/data_stream/vpcflow/manifest.yml:1-8 lists cloud, google_cloud, network, and observability.
  • packages/gcp/manifest.yml:165-172 defines policy template loadbalancing with security, while packages/gcp/data_stream/loadbalancing_metrics/manifest.yml:1-6 and packages/gcp/data_stream/loadbalancing_logs/manifest.yml:1-7 do not include security.
  • packages/tencent_cloud/manifest.yml:72-77 and packages/tencent_cloud/manifest.yml:89-94 define cos and clb policy templates with security, while packages/tencent_cloud/data_stream/cos/manifest.yml:1-4 and packages/tencent_cloud/data_stream/clb/manifest.yml:1-4 list only observability.
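
The comparison described above can be sketched as a subset check. This is a minimal illustration, not the elastic-package implementation: the function name `missing_categories` is hypothetical, and the category lists are copied from the aws transitgateway example.

```python
from typing import List


def missing_categories(data_stream: List[str], policy_template: List[str]) -> List[str]:
    """Return policy template categories that the data stream manifest lacks."""
    present = set(data_stream)
    # Preserve the policy template's order so the report is stable.
    return [category for category in policy_template if category not in present]


# Values taken from the aws transitgateway example in this comment.
data_stream_categories = ["aws", "network", "observability"]
policy_template_categories = ["security"]

missing = missing_categories(data_stream_categories, policy_template_categories)
if missing:
    print(f"data stream categories {data_stream_categories} are missing {missing}")
```

Under this reading, adding security to the data stream manifest or removing it from the policy template both make the returned list empty.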

Generated README drift

The tested elastic-package regenerates the http_endpoint doc block with HTTP Endpoint, but the committed READMEs still use Http Endpoint:

  • packages/doppler/docs/README.md:101-105
  • packages/kolide/docs/README.md:184-188
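
A drift check of this kind amounts to diffing the committed README against freshly generated output. The sketch below assumes that model and uses Python's standard `difflib`; the helper name `readme_drift` and the sample README lines are hypothetical, and only the Http Endpoint / HTTP Endpoint difference comes from the build output.

```python
import difflib
from typing import List


def readme_drift(committed: str, generated: str) -> List[str]:
    """Return unified-diff lines between committed and regenerated README text."""
    return list(
        difflib.unified_diff(
            committed.splitlines(),
            generated.splitlines(),
            fromfile="committed",
            tofile="generated",
            lineterm="",
        )
    )


# Hypothetical README excerpts; only the capitalization differs.
committed = "## Inputs\nCollecting logs from Http Endpoint\n"
generated = "## Inputs\nCollecting logs from HTTP Endpoint\n"

drift = readme_drift(committed, generated)
for line in drift:
    print(line)
```

An empty result means the committed file already matches the generated one.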

Evidence

  • Build: https://buildkite.com/elastic/integrations/builds/45314
  • Category lint failures:
    • aws: data stream "transitgateway" manifest categories [aws network observability] are missing policy template "transitgateway" categories [security]
    • gcp: vpcflow, loadbalancing_metrics, and loadbalancing_logs are missing policy template category security.
    • tencent_cloud: cos and clb are missing policy template category security.
  • README failures:
    • doppler: README.md is outdated. Rebuild the package with 'elastic-package build', diff changes Http endpoint / Http Endpoint to HTTP Endpoint.
    • kolide: same generated README capitalization drift.
  • Tail-only failures:
    • httpcheck_otel and iis_otel_input prefetched logs only show teardown, artifact upload, and The command exited with status 1; the earlier assertion/error is not present in the available log files.

Verification

  • Not run locally; this is a read-only Buildkite detective pass based on the prefetched logs and repository source.

Follow-up

  • Fetch full logs or uploaded JUnit XML for httpcheck_otel and iis_otel_input to identify their exact stack-test assertions.

What is this? | From workflow: PR Buildkite Detective


@elastic-vault-github-plugin-prod

elastic-vault-github-plugin-prod Bot commented Jul 1, 2026

Author

🚀 Benchmarks report

Package azure 👍(7) 💚(2) 💔(4)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| platformlogs | 6622.52 | 4048.58 | -2573.94 (-38.87%) | 💔 |
| provisioning | 11363.64 | 9090.91 | -2272.73 (-20%) | 💔 |
| activitylogs | 1964.64 | 1569.86 | -394.78 (-20.09%) | 💔 |
| auditlogs | 5617.98 | 4237.29 | -1380.69 (-24.58%) | 💔 |

Package azure_app_service 👍(0) 💚(0) 💔(1)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| app_service_logs | 1930.5 | 1187.65 | -742.85 (-38.48%) | 💔 |

Package google_workspace 👍(14) 💚(1) 💔(7)

Expand to view
Data stream Previous EPS New EPS Diff (%) Result
gcp 8849.56 5988.02 -2861.54 (-32.34%) 💔
gmail 1344.09 1136.36 -207.73 (-15.46%) 💔
meet 3289.47 2544.53 -744.94 (-22.65%) 💔
token 3355.7 2538.07 -817.63 (-24.37%) 💔
user_accounts 24390.24 18518.52 -5871.72 (-24.07%) 💔
alert 4132.23 2173.91 -1958.32 (-47.39%) 💔
chrome 2994.01 2320.19 -673.82 (-22.51%) 💔

Package microsoft_sqlserver 👍(0) 💚(1) 💔(2)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| audit | 2450.98 | 1663.89 | -787.09 (-32.11%) | 💔 |
| performance | 7246.38 | 5882.35 | -1364.03 (-18.82%) | 💔 |

Package windows 👍(3) 💚(5) 💔(2)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| powershell_operational | 2475.25 | 1313.72 | -1161.53 (-46.93%) | 💔 |
| applocker_exe_and_dll | 5291.01 | 3649.64 | -1641.37 (-31.02%) | 💔 |
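
The Diff (%) values in this report are consistent with the absolute EPS change followed by the change relative to the previous EPS. The sketch below reproduces that arithmetic for two rows; the function name `eps_diff` is hypothetical, and the threshold the benchmark tooling uses to assign 💔 is not shown in this report, so it is not modeled here.

```python
from typing import Tuple


def eps_diff(previous: float, new: float) -> Tuple[float, float]:
    """Return (absolute change, percent change) in events per second."""
    diff = new - previous
    percent = diff / previous * 100
    # Round to two decimals to match the precision shown in the report.
    return round(diff, 2), round(percent, 2)


# Rows copied from the azure and azure_app_service tables.
platformlogs = eps_diff(6622.52, 4048.58)
app_service_logs = eps_diff(1930.5, 1187.65)
print(platformlogs, app_service_logs)
```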

To see the full report, comment with /test benchmark fullreport

@infra-vault-gh-plugin-prod

infra-vault-gh-plugin-prod Bot commented Jul 1, 2026


⏳ Build in-progress, with failures

Failed CI Steps

History

cc @andrewkroh

@andrewkroh

Member

I reproduced the hashicorp_vault 0-hit failure locally with the LogsDB Columnar test path and found the ingest rejection reason.

Summary:

  • The input was producing data. The Elastic Agent event log showed hashicorp_vault.audit events being published for logs-hashicorp_vault.audit-43464.

  • Elasticsearch never created the target data stream because the bulk request was rejected with HTTP 400.

  • The rejection reason was:

    field [event.original] cannot reconstruct _source from doc values; every field must be reconstructable from doc values in index using [logsdb_columnar] index mode

  • All three failing hashicorp_vault log system configs set preserve_original_event: true, which populates event.original. This likely explains these 0-hit failures, and may explain other LogsDB Columnar failures in packages that preserve event.original.

I did not patch elastic-package to expose this. I reproduced with --defer-cleanup and inspected the independent Elastic Agent container log at /usr/share/elastic-agent/state/data/logs/elastic-agent-20260701.ndjson.

Full Elastic Agent error JSON
{"log.level":"error","@timestamp":"2026-07-01T17:34:35.220Z","message":"failed to index document","resource":{"service.instance.id":"56ac4a15-5a49-4c48-9a03-195300de3593","service.name":"/usr/share/elastic-agent/data/elastic-agent-1885ca/components/elastic-otel-collector","service.version":"9.5.0"},"otelcol.component.id":"elasticsearch/_agent-component/default","otelcol.signal":"logs","index":"","http.response.status_code":400,"log.origin.stack_trace":"github.com/open-telemetry/opentelemetry-collector-contrib/exporter/elasticsearchexporter.flushBulkIndexer\n\tgithub.com/open-telemetry/opentelemetry-collector-contrib/exporter/elasticsearchexporter@v0.154.0/bulkindexer.go:425\ngithub.com/open-telemetry/opentelemetry-collector-contrib/exporter/elasticsearchexporter.(*syncBulkIndexerSession).Flush\n\tgithub.com/open-telemetry/opentelemetry-collector-contrib/exporter/elasticsearchexporter@v0.154.0/bulkindexer.go:252\ngithub.com/open-telemetry/opentelemetry-collector-contrib/exporter/elasticsearchexporter.(*sessionList).Flush.func1\n\tgithub.com/open-telemetry/opentelemetry-collector-contrib/exporter/elasticsearchexporter@v0.154.0/exporter.go:666\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\tgolang.org/x/sync@v0.21.0/errgroup/errgroup.go:93","otelcol.component.kind":"exporter","error.type":"illegal_argument_exception","error.reason":"field [event.original] cannot reconstruct _source from doc values; every field must be reconstructable from doc values in index using [logsdb_columnar] index mode","ecs.version":"1.6.0","log.origin":{"file.line":425,"file.name":"elasticsearchexporter@v0.154.0/bulkindexer.go","function":"github.com/open-telemetry/opentelemetry-collector-contrib/exporter/elasticsearchexporter.flushBulkIndexer"},"ecs.version":"1.6.0"}
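
For anyone repeating this on other packages, a rough way to pull the rejection reasons out of the Elastic Agent ndjson log is sketched below. It assumes each line is a JSON object with the flat dotted keys seen in the error above; the helper name `bulk_rejections` is hypothetical, and the sample line is an abbreviated copy of that error rather than real additional output.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def bulk_rejections(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield error type and reason for log entries that report HTTP 400."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip lines that are not valid JSON.
        if not isinstance(entry, dict):
            continue
        if entry.get("http.response.status_code") == 400:
            yield {
                "error.type": entry.get("error.type"),
                "error.reason": entry.get("error.reason"),
            }


# Abbreviated copy of the error entry above, built with json.dumps to avoid
# hand-escaping.
sample_line = json.dumps({
    "log.level": "error",
    "message": "failed to index document",
    "http.response.status_code": 400,
    "error.type": "illegal_argument_exception",
    "error.reason": "field [event.original] cannot reconstruct _source from doc "
                    "values; every field must be reconstructable from doc values "
                    "in index using [logsdb_columnar] index mode",
})

found = list(bulk_rejections([sample_line, "", "not json"]))
for rejection in found:
    print(rejection["error.type"], "-", rejection["error.reason"])
```

To scan a real log, pass an open file handle for the ndjson file in place of the sample list.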

@andrewkroh

Member

The ECS event.original value is defined with doc_values=false and index=false.

https://github.com/elastic/ecs/blob/b85f7573f0b48c03fddf90dad06fc743876f1b33/generated/elasticsearch/composable/component/event.json#L60-L64

This makes it incompatible with columnar index modes because of the following documented restrictions:

Disabling doc values: Mapped fields cannot disable doc values. Setting doc_values to false leads to mapping errors. The only exception is multi-fields, where it's often desirable to store doc values for just one and use different index configurations for the rest.

Disabling _source: Setting "_source": {"enabled": false} is not allowed.
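
To gauge how many other fields could hit the same restriction, a mapping can be walked for properties that set doc_values to false. This is a minimal sketch under stated assumptions: the function name `fields_without_doc_values` is hypothetical, the sample mapping is illustrative (only the event.original settings come from the ECS definition cited above), and multi-fields are skipped because the quoted restriction exempts them.

```python
from typing import Any, Dict, List


def fields_without_doc_values(properties: Dict[str, Any], prefix: str = "") -> List[str]:
    """Return dotted paths of mapped fields that set doc_values to false."""
    found: List[str] = []
    for name, spec in properties.items():
        path = f"{prefix}{name}"
        if spec.get("doc_values") is False:
            found.append(path)
        # Recurse into object sub-properties only; multi-fields under "fields"
        # are deliberately not inspected.
        found.extend(fields_without_doc_values(spec.get("properties", {}), path + "."))
    return found


# Illustrative mapping fragment, not a full ECS template.
sample_properties = {
    "event": {
        "properties": {
            "original": {"type": "keyword", "index": False, "doc_values": False},
            "kind": {"type": "keyword"},
        }
    },
    "message": {
        "type": "match_only_text",
        # Hypothetical multi-field; exempt from the restriction.
        "fields": {"raw": {"type": "keyword", "doc_values": False}},
    },
}

offenders = fields_without_doc_values(sample_properties)
print(offenders)
```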
