claude_cowork: add Claude Cowork OTel integration#19943

Open
efd6 wants to merge 1 commit into elastic:main from efd6:19407-claude_cowork

Conversation

@efd6

@efd6 efd6 commented Jul 2, 2026

Contributor

Proposed commit message

claude_cowork: add Claude Cowork OTel integration

Collect Claude Cowork agentic telemetry via OpenTelemetry. Derived
from the claude_code package with adaptations for Cowork-specific
schema: non-interactive terminal, dotted MCP field names
(mcp_server.name, mcp_tool.name), no tool_parameters/tool_input,
deployment_mode and workspace.host_paths attributes.

Test samples were collected from live sessions and sanitised.

Note

Again, sorry for the size.

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Screenshots

Screenshot from 2026-07-03 08-03-19

@efd6 efd6 requested a review from a team July 2, 2026 22:36
@efd6 efd6 self-assigned this Jul 2, 2026
@efd6 efd6 added the enhancement and Team:Security-Service Integrations labels Jul 2, 2026
@github-actions

github-actions Bot commented Jul 2, 2026

Contributor

✅ Elastic Docs Style Checker (Vale)

No issues found on modified lines!


The Vale linter checks documentation changes against the Elastic Docs style guide. To use Vale locally or report issues, refer to Elastic style guide for Vale.

@efd6 efd6 force-pushed the 19407-claude_cowork branch from 1665121 to a0dc396 Compare July 2, 2026 22:43
@github-actions

github-actions Bot commented Jul 2, 2026

Contributor

TL;DR

The failed Buildkite job ran against commit 1665121897daea89d93157e1c4c02b30374d6fef, where packages/claude_cowork/manifest.yml declared owner elastic/security-service-integrations but the new package had no matching .github/CODEOWNERS entry. The current PR head a0dc3969a557d520f5993b1f4230f128c09cee28 already adds that missing CODEOWNERS line, so the immediate action is to rerun CI on the latest commit.

Remediation

  • Rerun the Buildkite build for the current PR head.
  • If the branch is rebased or amended, keep the .github/CODEOWNERS entry for /packages/claude_cowork with elastic/security-service-integrations as an owner.
Investigation details

Root Cause

dev/codeowners/codeowners.go:185-217 validates each package manifest by finding the applicable CODEOWNERS rule and requiring the manifest owner.github value to appear in that rule. The failed build commit added packages/claude_cowork/manifest.yml with:

owner:
  github: elastic/security-service-integrations
  type: elastic

but that same failed commit did not include a .github/CODEOWNERS rule for /packages/claude_cowork, so the validator rejected the package. The current PR head now includes:

/packages/claude_cowork @elastic/security-service-integrations @elastic/sit-crest-contractors

Evidence

Error: error validating packages in directory 'packages': error checking manifest 'packages/claude_cowork': owner "elastic/security-service-integrations" defined in "packages/claude_cowork/manifest.yml" is not in ".github/CODEOWNERS"
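
The owner check described under Root Cause can be sketched roughly as follows. This is a simplified illustration in Python, not the code in dev/codeowners/codeowners.go: the function names `owners_for` and `check_manifest_owner` are hypothetical, patterns are treated as rooted directory prefixes only, and the last matching rule is assumed to win, as in GitHub's CODEOWNERS semantics.

```python
def owners_for(path, codeowners_text):
    """Return the owners of the last CODEOWNERS rule that covers `path`."""
    owners = []
    for raw in codeowners_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        pattern, *rule_owners = line.split()
        prefix = pattern.rstrip("/")
        # Simplification (assumption): each pattern is a rooted directory prefix.
        if path == prefix or path.startswith(prefix + "/"):
            owners = rule_owners  # later rules override earlier ones
    return owners


def check_manifest_owner(package_dir, manifest_owner, codeowners_text):
    """Raise ValueError when the manifest owner is missing from the matching rule."""
    path = "/" + package_dir.strip("/")
    if "@" + manifest_owner not in owners_for(path, codeowners_text):
        raise ValueError(
            f'owner "{manifest_owner}" for "{package_dir}" is not in CODEOWNERS'
        )


# Rule taken from the current PR head; with it present the check passes silently.
codeowners = (
    "/packages/claude_cowork @elastic/security-service-integrations "
    "@elastic/sit-crest-contractors\n"
)
check_manifest_owner(
    "packages/claude_cowork", "elastic/security-service-integrations", codeowners
)
```

Calling `check_manifest_owner` with a CODEOWNERS text that lacks the /packages/claude_cowork rule raises, which mirrors the failure on the earlier commit.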

Verification

  • Inspected the Buildkite failure log and PR commit patches.
  • Did not rerun the repository check locally because the workspace is checked out at main; the relevant fix was confirmed from the current PR head patch.

What is this? | From workflow: PR Buildkite Detective


@elastic-vault-github-plugin-prod


🚀 Benchmarks report

To see the full report comment with /test benchmark fullreport

@efd6 efd6 marked this pull request as ready for review July 2, 2026 23:05
@efd6 efd6 requested a review from a team as a code owner July 2, 2026 23:05
"service.name": "cowork",
"service.version": "1.15962.1",
"session.id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"source": "config",


Severity: 🔵 Low | Confidence: low | Path: packages/claude_cowork/data_stream/events/_dev/test/pipeline/test-tool-decision.json:19

Dashboards aggregate on claude_cowork.events.decision_source, but tool_decision events emit the permission source as source; if the real attribute is source, the permission-source panels will render empty. Verify the vendor attribute name and align the dashboards or the fixtures.

Details

The permission-decisions, security-overview, session-timeline and tool-call-analysis dashboards bucket on claude_cowork.events.decision_source (with values user_temporary/user_permanent). However the only observable data for tool_decision events — this pipeline test fixture and the docker sample generator — emits the permission source under the attribute source (value config), which the pipeline namespaces to claude_cowork.events.source. Nothing in the pipeline populates decision_source. Note that config is listed as a decision_source value in fields.yml, suggesting the fixture may use the wrong key. Since the true Cowork OTel attribute name cannot be confirmed from the checkout, this is flagged low-confidence — but if the emitted attribute is source, the decision-source dashboard panels will always be empty.

Recommendation:

Confirm the attribute name Cowork actually emits for permission source. If it is source, either point the dashboards at the produced field or normalize it in the pipeline; if it is decision_source, update the fixtures/generator so the tests exercise it. For example, if the real attribute is decision_source, update the tool_decision fixture:

{
    "decision": "accept",
    "decision_source": "config"
}

Alternatively, if the field is source, add a rename in the pipeline so dashboards resolve:

- rename:
    tag: rename_source_to_decision_source
    field: claude_cowork.events.source
    target_field: claude_cowork.events.decision_source
    ignore_missing: true
    if: ctx.claude_cowork?.events?.event?.name == 'tool_decision'
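
To make the effect of the suggested rename concrete, here is a small Python emulation of it on an already-namespaced document. It is only a sketch: the function name is hypothetical, the document is assumed to use nested objects, and the event-name path (claude_cowork.events.event.name) is taken from the `if` condition in the suggestion above rather than confirmed against the pipeline.

```python
def rename_source_to_decision_source(doc):
    """Emulate the suggested rename processor on a nested document (in place)."""
    events = doc.get("claude_cowork", {}).get("events", {})
    # Mirrors the `if` condition: only tool_decision events are touched.
    if events.get("event", {}).get("name") != "tool_decision":
        return doc
    # Mirrors `ignore_missing: true`: nothing to do when the field is absent.
    if "source" not in events:
        return doc
    # Assumption: like a rename without override, refuse to clobber the target.
    if "decision_source" in events:
        raise ValueError("target field decision_source already exists")
    events["decision_source"] = events.pop("source")
    return doc


doc = {
    "claude_cowork": {
        "events": {
            "event": {"name": "tool_decision"},
            "decision": "accept",
            "source": "config",
        }
    }
}
rename_source_to_decision_source(doc)
# The dashboards' field is now populated and the original key is gone.
```

After the call, a panel bucketing on claude_cowork.events.decision_source would see the value `config`, while events of other types keep their `source` attribute untouched.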

🤖 AI-Generated Review | Vera Review Bot | 📚 Knowledge base: integration-skills

⚠️ Automated review — verify suggestions before applying.

@efd6 efd6 force-pushed the 19407-claude_cowork branch from a0dc396 to d332a1d Compare July 3, 2026 00:00
- configuration
type:
- change
mcp_server_connection:


Severity: 🟡 Medium | Confidence: medium | Path: packages/claude_cowork/data_stream/events/elasticsearch/ingest_pipeline/default.yml:81

The mcp_server_connection event path has no pipeline test fixture, so its field names (server_name, status, transport_type, error_code) are unverified even though the MCP Server Access dashboard depends on them; add a test-mcp-server-connection fixture.

Details

The pipeline categorizes an mcp_server_connection event (network/connection), and the MCP Server Access dashboard renders panels on claude_cowork.events.server_name, claude_cowork.events.status, claude_cowork.events.transport_type, and claude_cowork.events.error_code. None of the 7 pipeline fixtures nor the 8-event system-test generator emit an mcp_server_connection event, so the attribute names those panels rely on are never exercised against the pipeline. This is the same class of alignment gap previously raised for the permission-source panels: if the real vendor attribute differs from what fields.yml assumes, these connection panels render empty and no test would catch it.

Recommendation:

Add a pipeline fixture that drives the connection path, e.g. test-mcp-server-connection.json:

{
    "events": [
        {
            "@timestamp": "2026-06-30T02:43:31.836681529Z",
            "attributes": {
                "event.name": "mcp_server_connection",
                "server_name": "workspace",
                "server_scope": "project",
                "transport_type": "stdio",
                "status": "connected",
                "session.id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"
            },
            "event_name": "mcp_server_connection",
            "data_stream": { "type": "logs", "dataset": "claude_cowork.events.otel", "namespace": "default" }
        }
    ]
}

Commit the matching -expected.json so the connection field names are validated against the pipeline.
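
A gap like this one can be spotted mechanically. The Python sketch below lists the event names that no pipeline fixture exercises. It assumes fixtures are JSON files named test-*.json with a top-level `events` array, as in the example above, and that expected-output files end in -expected.json; the function names and the `categorized` argument (the event names the pipeline categorizes) are hypothetical.

```python
import json
from pathlib import Path


def event_names_in_fixture(fixture):
    """Collect event names from one parsed fixture document."""
    names = set()
    for event in fixture.get("events", []):
        # Prefer the top-level event_name, fall back to the attribute.
        name = event.get("event_name") or event.get("attributes", {}).get("event.name")
        if name:
            names.add(name)
    return names


def uncovered_events(fixture_dir, categorized):
    """Return the categorized event names that no input fixture emits."""
    covered = set()
    for path in sorted(Path(fixture_dir).glob("test-*.json")):
        if path.name.endswith("-expected.json"):
            continue  # skip expected-output files, only inputs count
        covered |= event_names_in_fixture(json.loads(path.read_text()))
    return sorted(set(categorized) - covered)
```

Run against the package's pipeline test directory with the pipeline's event names, a non-empty result would point at paths such as mcp_server_connection that still need a fixture.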


🤖 AI-Generated Review | Vera Review Bot | 📚 Knowledge base: integration-skills

⚠️ Automated review — verify suggestions before applying.

"includeIsRegex": false,
"excludeIsRegex": false
},
"sourceField": "claude_cowork.events.to_mode"


Severity: 🔵 Low | Confidence: medium | Path: packages/claude_cowork/kibana/dashboard/claude_cowork-security-overview.json:707

The Security Overview "Permission Mode Changes" panel aggregates on claude_cowork.events.to_mode, a field fields.yml documents as reserved/not currently emitted, so the panel always renders empty; drop the panel (or the field) until the event is emitted.

Details

This panel's terms aggregation uses sourceField: claude_cowork.events.to_mode. In fields.yml, to_mode is declared with the description "Permission mode after a change (reserved for future use)" and the field comment states it is "reserved; not currently emitted by Cowork." No categorized event or fixture populates it. As shipped, this visualization is permanently empty, which is a visible quality issue on a headline dashboard rather than a forward-compatibility benefit.

Recommendation:

Remove the empty panel until permission_mode_changed events are actually emitted, or keep only fields that carry data today. If retained, gate it behind a note. Minimal change is to delete the panel referencing the reserved field:

{
  "//": "Remove the 'Permission Mode Changes' Lens panel and its columns",
  "//2": "that reference sourceField: claude_cowork.events.to_mode until the",
  "//3": "permission_mode_changed event is emitted by Cowork."
}
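
Panels like this can be found before review with a small check. The Python sketch below walks a parsed dashboard JSON document, collects every `sourceField` value, and reports the ones that are not in a set of fields known to be populated. It assumes the dashboard file stores its panel definitions as plain JSON objects (not as embedded JSON strings); the function names, the `populated` argument and the prefix filter are hypothetical.

```python
def collect_source_fields(node):
    """Recursively gather all string values stored under a "sourceField" key."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "sourceField" and isinstance(value, str):
                found.add(value)
            else:
                found |= collect_source_fields(value)
    elif isinstance(node, list):
        for item in node:
            found |= collect_source_fields(item)
    return found


def unpopulated_fields(dashboard, populated, prefix="claude_cowork."):
    """Return package fields the dashboard aggregates on but nothing emits."""
    used = {f for f in collect_source_fields(dashboard) if f.startswith(prefix)}
    return sorted(used - set(populated))


# Tiny stand-in for a dashboard; real files are far larger.
dashboard = {
    "panels": [
        {"columns": {"a": {"sourceField": "claude_cowork.events.to_mode"}}},
        {"columns": {"b": {"sourceField": "claude_cowork.events.tool_name"}}},
    ]
}
print(unpopulated_fields(dashboard, {"claude_cowork.events.tool_name"}))
```

The `populated` set could be built from the keys present in the pipeline tests' expected documents, so that a reserved field such as to_mode shows up as soon as a panel references it.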

🤖 AI-Generated Review | Vera Review Bot | 📚 Knowledge base: integration-skills

⚠️ Automated review — verify suggestions before applying.

@efd6 efd6 force-pushed the 19407-claude_cowork branch 2 times, most recently from 5c34361 to 5987894 Compare July 3, 2026 00:50
- append:
tag: append_related_hosts
field: related.hosts
value: "{{{claude_cowork.events.server_name}}}"


Severity: 🔵 Low | Confidence: low | Path: packages/claude_cowork/data_stream/events/elasticsearch/ingest_pipeline/default.yml:331

related.hosts is populated from claude_cowork.events.server_name, but an MCP server name is a service identifier, not a host identifier; consider dropping this enrichment or moving it to a more appropriate ECS field.

Details

The append_related_hosts processor copies claude_cowork.events.server_name into related.hosts. server_name is the MCP server name from connection events (e.g. 'workspace'), which is a logical service/component identifier rather than a hostname, FQDN, or host alias. ECS related.hosts is intended for host identifiers used to correlate events to machines; mixing MCP server names into it dilutes host-based correlation and pivots. related.user (email + id) is correctly populated, so the correlation story does not depend on this line.

Recommendation:

Either drop the related.hosts enrichment for MCP server names, or keep the value only in the namespaced field. If host correlation is genuinely desired for MCP connections, gate it on a field that actually carries a host identifier. Example removing the mis-bucketed enrichment:

# Remove the following processor; claude_cowork.events.server_name is retained
# in the namespaced field and does not belong in ECS related.hosts.
#  - append:
#      tag: append_related_hosts
#      field: related.hosts
#      value: "{{{claude_cowork.events.server_name}}}"
#      allow_duplicates: false
#      if: ctx.claude_cowork?.events?.server_name != null

🤖 AI-Generated Review | Vera Review Bot | 📚 Knowledge base: integration-skills

⚠️ Automated review — verify suggestions before applying.

|-------------|-------------|
| `events` | All Claude Cowork OTLP log events — tool executions, API requests, permission decisions, MCP connections, hooks, plugins, and session lifecycle. |

The integration processes these event types:


Severity: 🔵 Low | Confidence: medium | Path: packages/claude_cowork/_dev/build/docs/README.md:23

The README 'event types' table lists only 7 of the 16 events the pipeline categorizes; add the missing types (api_error, api_refusal, api_retries_exhausted, auth, permission_mode_changed, mcp_server_connection, hook_registered, plugin_installed, skill_activated) so the collected scope is accurate.

Details

The ingest pipeline's ecs_categorization processor assigns event.category/event.type for 16 distinct event_name values, but the 'The integration processes these event types' table documents only 7 (tool_result, tool_decision, api_request, user_prompt, hook_execution_start, hook_execution_complete, plugin_loaded). For a security-focused integration, an auditor reading this table would not know that authentication events (auth), API error/refusal events, permission mode changes, and MCP server connections are also collected and categorized. The table understates the actual collection scope.

Recommendation:

Extend the table to cover the remaining event types the pipeline categorizes, for example:

| Event | Description | ECS category |
|-------|-------------|--------------|
| `api_error` | Failed API call to Anthropic. | `web` |
| `api_refusal` | API request refused. | `web` |
| `api_retries_exhausted` | API call gave up after retries. | `web` |
| `auth` | Authentication event. | `authentication` |
| `permission_mode_changed` | Session permission mode change. | `configuration` |
| `mcp_server_connection` | MCP server connection state change. | `network` |
| `hook_registered` | Hook registered for an event. | `process` |
| `plugin_installed` | Plugin installed. | `package` |
| `skill_activated` | Skill activated. | `process` |

🤖 AI-Generated Review | Vera Review Bot | 📚 Knowledge base: integration-skills

⚠️ Automated review — verify suggestions before applying.


@efd6 would event.category:api be a better fit for the API events than 'web'?

@efd6 efd6 force-pushed the 19407-claude_cowork branch from 5987894 to a476567 Compare July 3, 2026 04:07
@elastic-vault-github-plugin-prod


✅ All changelog entries have the correct PR link.

@infra-vault-gh-plugin-prod


💚 Build Succeeded

History

cc @efd6

value: success
if: ctx.claude_cowork?.events?.success == 'true'
- set:
tag: set_event_outcome_failure


Severity: 🟡 Medium | Confidence: high | Path: packages/claude_cowork/data_stream/events/elasticsearch/ingest_pipeline/default.yml:294

The pipeline's failure/error branches are untested: no fixture exercises event.outcome=failure (success='false', api_error, api_retries_exhausted) or the error/error_type/error_code fields the Tool Call Analysis dashboard aggregates on. Add at least one failure-path fixture (e.g. test-tool-result-failed and test-api-error).

Details

All seven pipeline fixtures are success-path only: every one sets success='true' or omits it, and none carries an api_error / api_refusal / api_retries_exhausted event or the error, error_type, or error_code attributes. As a result the set_event_outcome_failure processor (which fires on success=='false' or event.name in {api_error, api_retries_exhausted}), the web/error and web/denied categorization entries, and the set_event_reason enrichment are never exercised. The Tool Call Analysis dashboard has panels that aggregate on claude_cowork.events.error and claude_cowork.events.error_type, and the MCP Server Access dashboard aggregates on claude_cowork.events.error_code, so those field names are unverified by any test. The sibling claude_code package ships test-tool-result-failed and test-mcp-server-connection-failed fixtures for exactly this reason.

Recommendation:

Add a failure-path fixture so the failure-outcome branch and the error fields are covered, for example a tool_result with a failed outcome:

{
    "events": [
        {
            "@timestamp": "2026-06-30T02:43:31.836681529Z",
            "event_name": "tool_result",
            "attributes": {
                "event.name": "tool_result",
                "tool_name": "Bash",
                "success": "false",
                "error": "command failed",
                "error_type": "ShellError",
                "error_code": "ENOENT",
                "duration_ms": 12
            }
        }
    ]
}

and assert in the matching -expected.json that event.outcome resolves to "failure" and event.reason is copied from claude_cowork.events.error. Consider a second fixture for an api_error event to cover the web/error categorization path.
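
For reference when writing the expected document, the outcome logic described in this comment can be restated as a short Python sketch. It is an illustration of the described behaviour, not the pipeline itself: the function names are hypothetical, failure is assumed to take precedence because the failure processor follows the success one, and copying `error` into event.reason whenever it is present is an assumption.

```python
FAILURE_EVENTS = {"api_error", "api_retries_exhausted"}


def event_outcome(event_name, success=None):
    """Return the event.outcome the described processors would produce, or None."""
    # `success` is compared as a string, matching the pipeline conditions.
    if success == "false" or event_name in FAILURE_EVENTS:
        return "failure"
    if success == "true":
        return "success"
    return None  # neither processor fires, so event.outcome stays unset


def expected_ecs(attributes):
    """Build the ECS fields a fixture's expected document should assert."""
    ecs = {}
    outcome = event_outcome(attributes.get("event.name"), attributes.get("success"))
    if outcome is not None:
        ecs["event.outcome"] = outcome
    if "error" in attributes:
        ecs["event.reason"] = attributes["error"]
    return ecs


failed_tool_result = {
    "event.name": "tool_result",
    "tool_name": "Bash",
    "success": "false",
    "error": "command failed",
}
print(expected_ecs(failed_tool_result))
```

An api_error event with no `success` attribute resolves to the same failure outcome, which is why a second fixture for that path is worth adding.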


🤖 AI-Generated Review | Vera Review Bot | 📚 Knowledge base: integration-skills

⚠️ Automated review — verify suggestions before applying.

@vera-review-bot


Review summary

Issues found across the latest commits [a476567](https://github.com/elastic/integrations/commit/a476567d491aa815e7f75bdc518c18730b81a3dd) — 1 medium
  • 🟡 The pipeline's failure/error branches are untested: no fixture exercises event.outcome=failure (success='false', api_error, api_retries_exhausted) or the error/error_type/error_code fields the Tool Call Analysis dashboard aggregates on. Add at least one failure-path fixture (e.g. test-tool-result-failed and test-api-error). (link) (Unresolved)
Issues found across earlier commits [5987894](https://github.com/elastic/integrations/commit/5987894949d7c5ab8f124f90731eba2fd5061234) — 1 low
  • 🔵 The README 'event types' table lists only 7 of the 16 events the pipeline categorizes; add the missing types (api_error, api_refusal, api_retries_exhausted, auth, permission_mode_changed, mcp_server_connection, hook_registered, plugin_installed, skill_activated) so the collected scope is accurate. (link) (Unresolved)
Issues found across earlier commits [5c34361](https://github.com/elastic/integrations/commit/5c3436111cd3e07622802ece01a85ba4617b309c) — 1 low
  • 🔵 related.hosts is populated from claude_cowork.events.server_name, but an MCP server name is a service identifier, not a host identifier; consider dropping this enrichment or moving it to a more appropriate ECS field. (link) (Unresolved)
Issues found across earlier commits [d332a1d](https://github.com/elastic/integrations/commit/d332a1daf593c45da7c6b25a669d5ad40d3c78d8) — 1 medium, 1 low
  • 🟡 The mcp_server_connection event path has no pipeline test fixture, so its field names (server_name, status, transport_type, error_code) are unverified even though the MCP Server Access dashboard depends on them; add a test-mcp-server-connection fixture. (link) (Unresolved)
  • 🔵 The Security Overview "Permission Mode Changes" panel aggregates on claude_cowork.events.to_mode, a field fields.yml documents as reserved/not currently emitted, so the panel always renders empty; drop the panel (or the field) until the event is emitted. (link) (Unresolved)
Issues found across earlier commits [a0dc396](https://github.com/elastic/integrations/commit/a0dc3969a557d520f5993b1f4230f128c09cee28) — 1 low
  • 🔵 Dashboards aggregate on claude_cowork.events.decision_source, but tool_decision events emit the permission source as source; if the real attribute is source, the permission-source panels will render empty. Verify the vendor attribute name and align the dashboards or the fixtures. (link) (Unresolved)

I'll pick up this PR for review again after 15 minutes.

🤖 AI-Generated Review | Vera Review Bot | 📚 Knowledge base: integration-skills

⚠️ Automated review — verify suggestions before applying.

@andrewkroh andrewkroh added the New Integration and documentation labels Jul 3, 2026
@ishleenk17

Member

Hello, I have a question regarding this and the Claude Code integration.
Claude provides these fields in OTel natively. Shouldn't we push them out as OTel only?
Converting OTel fields to ECS could be counterintuitive for users. WDYT?
Also, we are calling it OpenTelemetry but not actually pushing out the OTel fields.

@ShourieG

ShourieG commented Jul 3, 2026

Contributor

Hello, I have a question regarding this and the Claude Code integration. Claude provides these fields in OTel natively. Shouldn't we push them out as OTel only? Converting OTel fields to ECS could be counterintuitive for users. WDYT? Also, we are calling it OpenTelemetry but not actually pushing out the OTel fields.

I think the main reason we are doing OTel -> ECS is specific business requirements. @jamiehynds could you clarify here?

As for the naming, I think it's because it uses OTel receivers as the input, but I will let @efd6 clarify when he's back.

@jamiehynds


Hello, I have a question regarding this and the Claude Code integration. Claude provides these fields in OTel natively. Shouldn't we push them out as OTel only? Converting OTel fields to ECS could be counterintuitive for users. WDYT? Also, we are calling it OpenTelemetry but not actually pushing out the OTel fields.

Hey @ishleenk17 - our goal with the Claude integration is to spare users from having to manually add component templates, ingest pipelines, etc. to normalize to ECS and stay aligned with the schema our security users rely on. Our InfoSec team published a blog post on onboarding Claude data via OTel, but several users have asked for a more seamless experience via an OOTB integration.

We should absolutely consider our o11y users, who may want to onboard this data as native OTel. Is there a path to supporting that within the same integration, or would we need to split it between an integration and a content pack?


Labels

  • documentation: Improvements or additions to documentation. Applied to PRs that modify *.md files.
  • enhancement: New feature or request
  • New Integration: Issue or pull request for creating a new integration package.
  • Team:Security-Service Integrations: Security Service Integrations team [elastic/security-service-integrations]

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Anthropic] Claude Code / Cowork Telemetry

5 participants