Add ai_agent object and extend ai_operation profile coverage #1641

Open
Aniak5 wants to merge 15 commits into ocsf:main from Aniak5:issue-1640_ai_agent

@Aniak5

@Aniak5 Aniak5 commented May 19, 2026

Contributor

First step toward agentic AI observability per #1640. This PR introduces the ai_agent object and threads it through the schema so that data-plane events across multiple categories can attribute activity to the agent that initiated it. Delegation, lineage, and the AI control-plane event classes are intentionally deferred to follow-up PRs.

Changes

New object

  • objects/ai_agent.json - autonomous AI agent identity with:
      • uid (required) - stable logical agent identifier assigned by the agent's authoritative source (control plane, registry, or issuing IdP); persists across restarts and instances
      • instance_uid - restart-sensitive identifier for a specific running instance
      • name - human-readable agent name
      • type / type_id - agent framework with enum: Native, LangChain, AutoGen, CrewAI, Other (communication protocols like MCP/A2A are per-event rather than per-agent and are not modeled here)
      • ai_model - the model backing the agent at the time of action
      • version - agent version (its own code or configuration revision, distinct from the model version on ai_model.version), for correlating behavioral changes with charter or configuration revisions
      • charter - file-typed reference to the document that defines the agent's durable role, responsibilities, constraints, and operating boundaries (e.g., system prompt or constitution); supports hashes for content integrity and signatures for provenance
  • Distinct from the existing agent object (which models security sensors such as EDR, DLP, APM).
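
As a rough illustration of the attribute list above, the following Python sketch mirrors a subset of the ai_agent attributes in memory. The `AiAgent` class and its `to_event_fragment` method are hypothetical helpers written for this note, not part of the schema or of any OCSF tooling; only the attribute names and the "uid is required" rule come from the description.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class AiAgent:
    """Hypothetical in-memory mirror of a subset of the ai_agent attributes."""

    uid: str                            # required: stable logical identifier
    instance_uid: Optional[str] = None  # identifier for a specific running instance
    name: Optional[str] = None          # human-readable agent name
    type: Optional[str] = None          # framework caption, e.g. "LangChain"
    version: Optional[str] = None       # agent version, not the model version

    def __post_init__(self) -> None:
        # uid is the only attribute the description marks as required.
        if not self.uid:
            raise ValueError("ai_agent.uid is required")

    def to_event_fragment(self) -> dict:
        """Return a dict with unset optional attributes omitted."""
        return {k: v for k, v in asdict(self).items() if v is not None}


fragment = AiAgent(uid="agent-research-assistant", name="Research Assistant").to_event_fragment()
print(fragment)
```

Omitting unset attributes rather than emitting nulls keeps the fragment close to what the JSON examples later in this thread look like.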

Dictionary

  • dictionary.json — added top-level ai_agent attribute referencing the new ai_agent object type.
  • dictionary.json — added top-level charter attribute (file type) for documents defining the role, scope, and operating bounds of an entity.

Profile

  • profiles/ai_operation.json — added optional ai_agent attribute (context group) so any event class adopting the profile can carry agent attribution.

Object updates

  • objects/actor.json — added optional ai_agent attribute and included it in the at_least_one constraint, so an actor can be identified as an autonomous AI agent rather than (or in addition to) a user, IAM role, process, etc.

  • objects/process.json — added optional ai_agent attribute, allowing a process to be identified as the runtime of a specific AI agent.

Event classes - added ai_operation profile

  • events/system/system.json (base class) - propagates the profile to all System Activity events
  • events/network/network.json (base class) - propagates the profile to all Network Activity events
  • events/application/web_resources_activity.json - does not extend the network base, added directly
  • events/network/email_activity.json - does not extend the network base, added directly

These join process_activity, datastore_activity, and api_activity, which already had the profile.

Closes part of #1640.

@github-actions

github-actions Bot commented May 19, 2026


Schema Description Review

Automated suggestions for improving description clarity for LLM consumption. These are advisory — not required changes.

Looking at this OCSF schema PR, I'll check against my previous review and assess the current state.

Progress Check from Previous Review

Fixed: CHANGELOG formatting issue resolved

  • The CHANGELOG now includes the required PR reference [#1641](https://github.com/ocsf/ocsf-schema/pull/1641) (previous issue resolved)

Current Review Findings

Suggestions

  1. Object: web_resources_activity
    Attribute: web_resources_result
    Issue: Missing attribute in compiled schema despite being listed in _changed_attributes
    Current: (attribute not found in compiled output)
    Suggested: This appears to be a compilation issue - the attribute is marked as changed but not present in the compiled schema. Please verify the attribute definition exists in the source files.

Summary

The AI agent integration in this PR is well-designed with clear, self-contained descriptions that effectively distinguish between AI agents (autonomous task performers) and the existing agent object (security sensors). All visible attribute descriptions provide good semantic clarity for LLM consumption. However, there appears to be a compilation issue where web_resources_result is marked as changed but missing from the compiled output. The previously identified CHANGELOG formatting issue has been resolved.

@Aniak5 Aniak5 self-assigned this May 19, 2026
Aniak5 added a commit to Aniak5/ocsf-schema that referenced this pull request May 26, 2026
…g placement

Brings in the polished ai_agent vocabulary from PR ocsf#1641 (uid/ai_model
description polish, version attribute, ai_agent placements on actor and
process, ai_operation profile coverage extension, and the new charter
attribute) so this delegation demo branch reflects the upstream-bound
agent foundation.

# Conflicts:
#	dictionary.json
#	events/application/web_resources_activity.json
#	events/system/file_activity.json
#	events/system/scheduled_job_activity.json
#	events/system/script_activity.json
#	objects/ai_agent.json
#	profiles/ai_operation.json
Comment thread objects/ai_agent.json
Comment thread objects/ai_agent.json
Levaj2000 added a commit to Levaj2000/ocsf-schema that referenced this pull request Jun 1, 2026
…agent activity)

Builds on ocsf#1641. Advances ocsf#1640. Introduces an optional ai_forensics profile
carrying an attestation object (a digital signature bound to ai_agent identity,
with optional tamper-evident chaining via entry_hash/prev_entry_hash/chain_uid),
plus the supporting dictionary attributes.
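
For readers unfamiliar with the tamper-evident chaining mentioned in this commit message, here is a minimal sketch of how entry_hash and prev_entry_hash can link records. Only the three attribute names (entry_hash, prev_entry_hash, chain_uid) come from the message; the SHA-256 choice, the JSON canonicalization, and the `payload` field are assumptions made for illustration and may differ from what the attestation object actually specifies.

```python
import hashlib
import json
from typing import Optional


def compute_entry_hash(payload: dict, prev_entry_hash: Optional[str]) -> str:
    """Hash the payload together with the previous entry's hash (assumed scheme)."""
    body = json.dumps(
        {"payload": payload, "prev_entry_hash": prev_entry_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict, chain_uid: str) -> None:
    """Append a record that points back at the current tail of the chain."""
    prev = chain[-1]["entry_hash"] if chain else None
    chain.append({
        "chain_uid": chain_uid,
        "prev_entry_hash": prev,
        "payload": payload,
        "entry_hash": compute_entry_hash(payload, prev),
    })


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check each back-pointer."""
    prev = None
    for entry in chain:
        if entry["prev_entry_hash"] != prev:
            return False
        if entry["entry_hash"] != compute_entry_hash(entry["payload"], prev):
            return False
        prev = entry["entry_hash"]
    return True


chain: list = []
append_entry(chain, {"operation": "read_file"}, chain_uid="chain-1")
append_entry(chain, {"operation": "write_file"}, chain_uid="chain-1")
print(verify_chain(chain))
```

Editing any earlier payload changes its recomputed hash, so verification fails from that entry onward.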
@Aniak5 Aniak5 force-pushed the issue-1640_ai_agent branch from d855f9d to 3751504 Compare June 1, 2026 15:55
Levaj2000 added a commit to Levaj2000/ocsf-schema that referenced this pull request Jun 2, 2026
…s profile

Per maintainer feedback on ocsf#1661: fold the attestation object onto the
ai_operation profile as an optional attribute rather than carrying it on a
standalone single-attribute ai_forensics profile. The attestation object is
unchanged and remains domain-agnostic, leaving a broader non-repudiation
profile as a clean follow-on with AI as the first consumer.

Advances ocsf#1640. Builds on ocsf#1641.
@mikeradka

mikeradka commented Jun 3, 2026

Contributor

This is a strong foundation. What would really help reviewers and early implementers is a few concrete end-to-end examples - what an event looks like on the wire from both the producer and consumer sides.

A couple of things I'm curious about:

Producer side - how should an event source populate this in practice? For example, when a LangChain agent reads a file via a tool call, does ai_agent appear on actor, process, or both? How does a producer distinguish between the agent's identity and the underlying process that performed the action?

Consumer side - what's the intended correlation pattern? How would an analyst pivot from "this file was deleted" to "which agent instance was responsible and what model was it running?" across the uid/instance_uid split?

A pair of abbreviated JSON event snippets would go a long way toward building confidence that the attribute placement choices are unambiguous for implementers - and make it easier to move forward knowing those decisions are solid.

@Levaj2000

Contributor

+1 to the ask for wire examples, @mikeradka. Rather than a hypothetical, here's a real (scrubbed) row from our gateway — we log every agent request as an immutable, hash-chained record, so I can speak to what a producer can and can't actually populate, which is where the placement questions get real.

This is our own ada agent calling its read_file tool through the gateway — allowed under policy v36 — as an API Activity (class_uid 6003) with the ai_operation profile (UUIDs/hashes scrubbed):

{
  "activity_id": 2,
  "category_uid": 6,
  "class_uid": 6003,
  "type_uid": 600302,
  "severity_id": 1,
  "time": 1777912519552,
  "metadata": {
    "version": "1.9.0-dev",
    "profiles": ["ai_operation"],
    "correlation_uid": "0a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d"
  },
  "action": "Allowed",
  "action_id": 1,
  "api": { "operation": "read_file", "service": { "name": "ada-tools" } },
  "http_request": { "http_method": "POST", "url": { "path": "/ada/tools/read_file" } },
  "actor": {
    "user": { "uid": "11111111-2222-4333-8444-555566667777", "type_id": 1 }
  },
  "ai_agent": {
    "uid": "a9c3e7d1-2b4f-4e6a-9c8d-3f5a7b1e2c4d",
    "name": "ada"
  },
  "unmapped": {
    "policy_version": 36,
    "status_code": 200,
    "latency_ms": 90,
    "upstream_latency_ms": 45
  }
}

The honest producer answer to your uid / instance_uid question: as a gateway (a proxy in front of the tool/model APIs), I can populate ai_agent.uid — it's derived from the agent's credential — but I cannot populate ai_agent.instance_uid. I observe the agent's identity, not its process lifecycle; instance_uid is only knowable to the agent runtime self-reporting. Same with actor.process: a proxy sees no OS process, so it's legitimately empty — note this is a file read and yet there's no filesystem object in sight, because the gateway captures it as the API/tool call it actually saw, not an OS file op. So the two layers you asked about (agent identity vs. underlying process) aren't just separable — for a whole class of producers they arrive on different events from different vantage points, and the schema is right to keep them independent. Suggest the spec note that instance_uid is "populate when self-reported; gateways/proxies legitimately omit."
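
The vantage-point argument above can be sketched in a few lines of Python. The function name `build_ai_agent` and its parameters are hypothetical; the only behavior taken from the comment is that uid is derived from the agent's credential and instance_uid is emitted only when the runtime self-reports it.

```python
from typing import Optional


def build_ai_agent(
    credential_subject: str,
    name: Optional[str] = None,
    self_reported_instance_uid: Optional[str] = None,
) -> dict:
    """Build an ai_agent fragment from what this producer can actually observe."""
    agent = {"uid": credential_subject}  # derived from the agent's credential
    if name is not None:
        agent["name"] = name
    # A gateway or proxy never sees the runtime, so it passes nothing here and
    # instance_uid is omitted instead of being guessed.
    if self_reported_instance_uid is not None:
        agent["instance_uid"] = self_reported_instance_uid
    return agent


gateway_view = build_ai_agent("a9c3e7d1-2b4f-4e6a-9c8d-3f5a7b1e2c4d", name="ada")
runtime_view = build_ai_agent("agent-research-assistant", self_reported_instance_uid="i-7c2f9a")
print(gateway_view)
print(runtime_view)
```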

Consumer pivot (your correlation question): ai_agent.uid → everything that logical agent did; metadata.correlation_uid → the full request chain across services. We use correlation_uid exactly as CoSAI's Agentic IAM framework prescribes (Appendix E, "prove control on demand"). On "what model was it running?" — for a tool call the gateway log doesn't carry model identity; that lives on the model-inference events, correlated by the same correlation_uid. (Worth noting ai_model isn't always knowable to the producer of a given event.)
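
A consumer-side sketch of that pivot, under stated assumptions: events are plain dicts, the agent is at the top-level ai_agent attribute as in the row above, and model identity sits on a top-level ai_model attribute of the inference events. That last location is an assumption for illustration; the function names are hypothetical.

```python
from collections import defaultdict


def correlation_uid(event: dict):
    """Return metadata.correlation_uid, or None when the event lacks one."""
    return event.get("metadata", {}).get("correlation_uid")


def models_for_agent(events: list, agent_uid: str) -> set:
    """Model uids seen on any event sharing a correlation_uid with the agent's events."""
    chains = defaultdict(list)
    for event in events:
        cuid = correlation_uid(event)
        if cuid is not None:
            chains[cuid].append(event)

    models = set()
    for event in events:
        if event.get("ai_agent", {}).get("uid") != agent_uid:
            continue
        for related in chains.get(correlation_uid(event), []):
            model = related.get("ai_model")  # assumed location of model identity
            if model and "uid" in model:
                models.add(model["uid"])
    return models


events = [
    {"class_uid": 6003, "metadata": {"correlation_uid": "c1"}, "ai_agent": {"uid": "agent-a"}},
    {"metadata": {"correlation_uid": "c1"}, "ai_model": {"uid": "model-x"}},
    {"metadata": {"correlation_uid": "c2"}, "ai_model": {"uid": "model-y"}},
]
print(models_for_agent(events, "agent-a"))
```

The tool-call event carries no model, so the answer comes entirely from the inference event that shares its correlation_uid.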

Two placement questions back at you: (1) policy decisions map cleanly here — action_id: 1 (Allowed) / 2 (Denied) is exactly right for allow/deny. But our gateway carries a third decision state, error (policy-eval or upstream failure), and there's no action_id enum for it, so it falls to 99 (Other). Is Other the intended home for an errored decision, or is that better modeled on status_id? And (2) we carry a policy_version per decision (CoSAI's logging schema asks for it too) — is there a home for that, or is unmapped/enrichment the expectation for now?
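
To make question (1) concrete, this is roughly the mapping being described, written as a hypothetical `map_decision` helper. The ids 1, 2, and 99 are the ones quoted above; carrying the source value in action when the id is 99 is the usual OCSF convention for "Other" as far as I know, and whether an errored decision belongs there at all is exactly the open question.

```python
# Gateway decision states mapped to (action_id, action); "error" has no entry.
ACTION_IDS = {
    "allow": (1, "Allowed"),
    "deny": (2, "Denied"),
}


def map_decision(decision: str) -> dict:
    """Map a gateway policy decision to action_id/action, falling back to Other."""
    # Anything without a dedicated enum value falls to 99 (Other), keeping the
    # source's own label as the caption.
    action_id, action = ACTION_IDS.get(decision, (99, decision))
    return {"action_id": action_id, "action": action}


print(map_decision("allow"))
print(map_decision("error"))
```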

Thank you for the review - Jeff

@Aniak5

Aniak5 commented Jun 4, 2026

Contributor Author

Thanks for the review @mikeradka — agreed, the placement question deserves a concrete answer before this lands. Here's how we're thinking about it.

The placement landscape

We currently have ai_agent available in three places: on the ai_operation profile, on actor, and on process. The instinct is that this is redundant — but each placement composes differently when you walk into nested objects (actor.process, process.parent_process, evidences.actor, etc.). The question isn't really "which one of the three?" — it's "what's the canonical home, and where do nested instances earn their keep?"

Why ai_operation should carry ai_agent directly

One option we considered was removing ai_agent from the ai_operation profile entirely and relying on actor.ai_agent (reached via the host profile, which is applied to most event classes). We don't think that's the right path:

  • Producer discovery is indirect. A producer logging an AI-relevant event reaches for the ai_operation profile first — that's the natural mental model. Telling them "actually, populate actor.ai_agent from the host profile, even though you've already applied ai_operation" creates exactly the kind of cross-profile indirection that breeds inconsistent mappings.
  • Profiles should be self-contained kits. Applying ai_operation and finding agent identity inside it is the obvious contract. If agent identity has to come from a different profile, the ai_operation profile feels incomplete - and producers will guess wrong.
  • Not every AI-relevant event has an actor. The host profile gives actor to most classes, but not all (discovery/query events, application events, future model_inference_activity-style classes). Pinning agent identity to actor would leave those gaps uncovered, while a profile-level placement covers them uniformly.

So ai_operation.ai_agent is the canonical home. The question is whether actor.ai_agent and process.ai_agent still earn their place alongside it.

Placement table

| Placement | Pros | Cons |
|---|---|---|
| ai_operation.ai_agent (profile-level) | Producer's natural discovery path. Self-contained: producers don't need to combine multiple profiles to log an agent reference. Works on event classes that don't have actor (discovery, application, future model-inference events). | None significant. This is the canonical home. |
| actor.ai_agent | Composes via evidences.actor.ai_agent in findings - a detection about agent-driven activity can carry the agent identity inside the evidence trail. Aligns with the actor model (agent as a non-human principal) for events where actor is the natural aggregator. | On the top-level event itself, largely redundant with the profile-level reference. If we don't expect producers to use it for delegated-authority cases (actor.user + actor.ai_agent), it adds a placement choice without semantic value. |
| process.ai_agent | Composes via process.parent_process.ai_agent to express agent-to-agent delegation at the runtime layer - when an orchestrator agent spawns a worker agent, the worker carries ai_operation.ai_agent and the orchestrator surfaces only through parent_process.ai_agent; nothing else on the event preserves that spawning relationship between agents. Also composes via actor.process.ai_agent for "the running process is the agent's runtime" (endpoint forensics, container attribution). | Redundant at the top level (the profile-level ai_agent already carries the acting agent's identity), and the composition cases are real but apply mainly in multi-agent / endpoint-forensics scenarios - single-agent events get nothing extra from it. |

Worth considering: adding to evidences

We could also add ai_agent to the evidences object directly, alongside actor and process. That would let findings carry agent identity at the top of the evidence trail without requiring it to be reached via evidences.actor.ai_agent. This may be the cleaner path for detection-finding consumers.

Producer-side convention (proposed)

Canonical placement: ai_operation.ai_agent is the primary home for agent identity on any event applying the ai_operation profile. Populate this whenever an AI agent is involved.

Nested placements (actor.ai_agent, process.ai_agent, process.parent_process.ai_agent) are populated only when the binding carries information beyond the canonical reference — e.g., a child process whose parent is an agent runtime, or finding evidence that links agent activity to a detection.
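
A minimal sketch of how a producer might apply the proposed convention. The helper name `place_agents` and its parameters are hypothetical; it always sets the profile-level ai_agent and only adds the nested process.parent_process.ai_agent binding when a distinct parent agent is known.

```python
from typing import Optional


def place_agents(event: dict, acting_agent: dict, parent_agent: Optional[dict] = None) -> dict:
    """Return a copy of the event with agent identity placed per the proposed convention."""
    out = dict(event)
    # Canonical placement: the acting agent always goes at the profile level.
    out["ai_agent"] = acting_agent
    # Nested placement: only when it adds information beyond the canonical reference.
    if parent_agent is not None:
        process = dict(out.get("process", {}))
        parent = dict(process.get("parent_process", {}))
        parent["ai_agent"] = parent_agent
        process["parent_process"] = parent
        out["process"] = process
    return out


single = place_agents({"class_uid": 1001}, {"uid": "agent-research-assistant"})
delegated = place_agents(
    {"class_uid": 1007, "process": {"pid": 51220, "parent_process": {"pid": 48211}}},
    {"uid": "agent-deploy-worker"},
    parent_agent={"uid": "agent-devops-orchestrator"},
)
print(single)
print(delegated)
```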

Examples

LangChain agent reads a file via a tool call:

{
  "class_uid": 1001,
  "activity_id": 2,
  "ai_agent": {
    "uid": "agent-research-assistant",
    "instance_uid": "i-7c2f9a",
    "name": "Research Assistant",
    "type_id": 4,
    "type": "LangChain",
    "version": "1.4.2",
    "ai_model": {
      "uid": "claude-sonnet-4-6",
      "name": "Claude Sonnet 4.6",
      "version": "20260514",
      "vendor_name": "Anthropic"
    },
    "charter": {
      "name": "research_assistant_charter.md",
      "hashes": [
        { "algorithm_id": 3, "value": "8f3b2c1d4e7a..." }
      ]
    }
  },
  "file": {
    "name": "q4_earnings.pdf",
    "path": "/data/reports/q4_earnings.pdf"
  }
}

Agent spawns a subprocess that performs the activity (parent_process binding):

  {
    "class_uid": 1007,
    "activity_id": 1,
    "ai_agent": {
      "uid": "agent-deploy-worker",
      "instance_uid": "i-9e22",
      "name": "Deploy Worker",
      "type_id": 1,
      "type": "Native",
      "ai_model": { "uid": "claude-haiku-4-5" }
    },
    "process": {
      "pid": 51220,
      "name": "python3.13",
      "parent_process": {
        "pid": 48211,
        "name": "python3.13",
        "ai_agent": {
          "uid": "agent-devops-orchestrator",
          "instance_uid": "i-44c1",
          "name": "DevOps Orchestrator",
          "type_id": 3,
          "type": "A2A",
          "ai_model": { "uid": "claude-opus-4-7" }
        }
      }
    }
  }

The acting agent is agent-deploy-worker — that's the agent on ai_operation.ai_agent and the agent running in the current process. Its parent process is another agent (agent-devops-orchestrator, an A2A orchestrator) that delegated this work. process.parent_process.ai_agent is the only place this delegating-agent identity surfaces on the event — without it, you'd lose the spawning relationship between agents.
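
On the consumer side, recovering that spawning relationship amounts to walking the parent_process chain. The sketch below assumes events are plain dicts shaped like the example above; `agent_lineage` is a hypothetical helper name.

```python
def agent_lineage(event: dict) -> list:
    """Return agent uids from the acting agent outward through parent processes."""
    lineage = []
    acting_uid = event.get("ai_agent", {}).get("uid")
    if acting_uid is not None:
        lineage.append(acting_uid)

    # Walk process.parent_process.parent_process... collecting any agent bindings.
    proc = event.get("process", {}).get("parent_process")
    while proc:
        uid = proc.get("ai_agent", {}).get("uid")
        if uid is not None and uid not in lineage:
            lineage.append(uid)
        proc = proc.get("parent_process")
    return lineage


event = {
    "class_uid": 1007,
    "ai_agent": {"uid": "agent-deploy-worker"},
    "process": {
        "pid": 51220,
        "parent_process": {
            "pid": 48211,
            "ai_agent": {"uid": "agent-devops-orchestrator"},
        },
    },
}
print(agent_lineage(event))
```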

Questions for the group

  1. Is the canonical-placement-with-nested-bindings convention the right model, or should we simplify to profile-level only and accept the loss of parent_process / evidence composition?
  2. Should we add ai_agent directly to the evidences object?

@davemcatcisco @jedwardsol if you get a chance, I'd really appreciate your feedback on adding ai_agent to actor and process. That's probably where endpoint telemetry has the biggest impact.

@davemcatcisco davemcatcisco changed the title Add ai_agent object and extend ai_operation profile coverage Add ai_agent object and extend ai_operation profile coverage Jun 5, 2026

@davemcatcisco davemcatcisco left a comment

Contributor

As well as the comments I've made about the AI profile being available in all System Activity and Network events, it also needs to be available in all events in the Identity & Access Management category by adding it to the iam base event for that category.

Comment thread events/system/file_activity.json Outdated
Comment thread events/network/network_activity.json Outdated
Comment thread objects/ai_agent.json Outdated
Comment thread objects/ai_agent.json
Comment thread objects/ai_agent.json
@Aniak5 Aniak5 force-pushed the issue-1640_ai_agent branch 2 times, most recently from 853d99a to f201e0c Compare June 8, 2026 15:27
Comment thread profiles/ai_operation.json
Comment thread dictionary.json Outdated
Aniak5 added a commit to Aniak5/ocsf-schema that referenced this pull request Jun 11, 2026
Address Dave + jedwardsol's review feedback on PR ocsf#1641. A single OS
process can host multiple AI agents (e.g., an IDE running Claude Code,
Codex, and Copilot simultaneously), and producers may not be able to
attribute a given activity to a specific agent. Added hosted_ai_agent_list
to the process object as an array of ai_agent, analogous to how the
Windows extension models hosted services. The singular ai_agent attribute
remains for cases where attribution is known.

Also replaced two em-dash HTML entities in dictionary.json with hyphens
per mikeradka's nit.
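
A rough sketch of the two process-level attributes described in this commit message. The helper `attribute_process` is hypothetical, and populating both attributes on the same process when attribution is known is an assumption made here for illustration; the message only states that the list covers hosted agents and the singular attribute remains for known attribution.

```python
from typing import Optional


def attribute_process(process: dict, hosted_agents: list, acting_agent_uid: Optional[str] = None) -> dict:
    """Return a copy of the process with hosted agents and, if known, the acting agent."""
    out = dict(process)
    # Every agent the process is known to host, whether or not it acted.
    out["hosted_ai_agent_list"] = list(hosted_agents)
    # Singular ai_agent only when the producer can attribute the activity.
    if acting_agent_uid is not None:
        for agent in hosted_agents:
            if agent.get("uid") == acting_agent_uid:
                out["ai_agent"] = agent
                break
    return out


hosted = [{"uid": "agent-a"}, {"uid": "agent-b"}, {"uid": "agent-c"}]
unattributed = attribute_process({"pid": 4242, "name": "ide"}, hosted)
attributed = attribute_process({"pid": 4242, "name": "ide"}, hosted, acting_agent_uid="agent-b")
print(unattributed)
print(attributed)
```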
@Aniak5 Aniak5 force-pushed the issue-1640_ai_agent branch from 2430051 to 74eca4b Compare June 11, 2026 14:54
Aniak5 added a commit to Aniak5/ocsf-schema that referenced this pull request Jun 17, 2026
Address Dave + jedwardsol's review feedback on PR ocsf#1641. A single OS
process can host multiple AI agents (e.g., an IDE running Claude Code,
Codex, and Copilot simultaneously), and producers may not be able to
attribute a given activity to a specific agent. Added hosted_ai_agent_list
to the process object as an array of ai_agent, analogous to how the
Windows extension models hosted services. The singular ai_agent attribute
remains for cases where attribution is known.

Also replaced two em-dash HTML entities in dictionary.json with hyphens
per mikeradka's nit.
@Aniak5 Aniak5 force-pushed the issue-1640_ai_agent branch from 2c5696c to 849aa86 Compare June 17, 2026 16:50
Comment thread objects/ai_agent.json Outdated
Comment thread objects/process.json
Aniak5 added a commit to Aniak5/ocsf-schema that referenced this pull request Jun 23, 2026
Address Dave + jedwardsol's review feedback on PR ocsf#1641. A single OS
process can host multiple AI agents (e.g., an IDE running Claude Code,
Codex, and Copilot simultaneously), and producers may not be able to
attribute a given activity to a specific agent. Added hosted_ai_agent_list
to the process object as an array of ai_agent, analogous to how the
Windows extension models hosted services. The singular ai_agent attribute
remains for cases where attribution is known.

Also replaced two em-dash HTML entities in dictionary.json with hyphens
per mikeradka's nit.
Comment thread objects/ai_agent.json Outdated
@davemcatcisco

Contributor

@Aniak5 - As we discussed in the Network/AI call just now, I think there is a strong argument for the ai_operation profile also being applicable in the Identity & Access Management event category. All events in that category extend the iam base event, so the profile would only need to be added there.

mikeradka
mikeradka previously approved these changes Jun 24, 2026
davemcatcisco
davemcatcisco previously approved these changes Jun 24, 2026

@davemcatcisco davemcatcisco left a comment

Contributor

Looks good to me now. Thanks, @Aniak5.

@davemcatcisco

Contributor

@mikeradka will need to re-approve.

Comment thread objects/process.json
Comment thread objects/process.json
Aniak5 added a commit to Aniak5/ocsf-schema that referenced this pull request Jun 24, 2026
Address Dave + jedwardsol's review feedback on PR ocsf#1641. A single OS
process can host multiple AI agents (e.g., an IDE running Claude Code,
Codex, and Copilot simultaneously), and producers may not be able to
attribute a given activity to a specific agent. Added hosted_ai_agent_list
to the process object as an array of ai_agent, analogous to how the
Windows extension models hosted services. The singular ai_agent attribute
remains for cases where attribution is known.

Also replaced two em-dash HTML entities in dictionary.json with hyphens
per mikeradka's nit.
@Aniak5 Aniak5 force-pushed the issue-1640_ai_agent branch from 1499105 to e516d55 Compare June 24, 2026 19:55
Aniak5 added 14 commits June 24, 2026 15:58
Introduce a new ai_agent object representing an autonomous AI agent and thread it through
the schema so that data-plane events across multiple categories can attribute activity to
the agent that initiated it. Distinct from the existing agent object, which models security
sensors (EDR, DLP, APM, etc.).
…guidance

Address PR review feedback by clarifying the charter attribute description
to prescribe populating hashes and signatures on the file for content
integrity and provenance verification.
Lift ai_operation profile from individual classes to the system and
network base events so all System Activity and Network Activity events
inherit agent attribution. Refine ai_agent.uid, ai_model, type_id, and
version descriptions; drop MCP and A2A from type_id since communication
protocols are properties of operations, not the agent itself.
The charter attribute is added to ai_agent in the same PR that
introduces the ai_agent object, so the per-attribute entry under
Improved > Objects is redundant.
Address Dave + jedwardsol's review feedback on PR ocsf#1641. A single OS
process can host multiple AI agents (e.g., an IDE running Claude Code,
Codex, and Copilot simultaneously), and producers may not be able to
attribute a given activity to a specific agent. Added hosted_ai_agent_list
to the process object as an array of ai_agent, analogous to how the
Windows extension models hosted services. The singular ai_agent attribute
remains for cases where attribution is known.

Also replaced two em-dash HTML entities in dictionary.json with hyphens
per mikeradka's nit.
The dictionary description is shared across all entities that may
adopt the attribute. The process-specific guidance lives on the
process object's own description.
An AI agent that initiates an activity is attributed via the
ai_operation profile's ai_agent attribute, not the actor object.
Remove ai_agent from actor (and its at_least_one constraint) and
point the actor description to ai_operation.ai_agent instead.
- Extend ai_agent from _entity instead of object, matching the
  standard OCSF pattern for objects with name/uid identity.
- Gate process.ai_agent and process.hosted_ai_agent_list behind the
  ai_operation profile so they only surface on AI-relevant events.
- Replace remaining em dashes in ai_agent descriptions with hyphens.
- Add the ai_operation profile to the application base event so all
  Application Activity classes inherit agent attribution, and drop the
  now-redundant direct includes from web_resources_activity,
  api_activity, and datastore_activity.
- Rewrite the ai_agent.instance_uid description to decouple it from OS
  process lifecycle: an instance is a logical session/run that may
  persist across restarts and span multiple cooperating components.
Per OCSF normative-schema guidance to keep company names out of
descriptions.
- Add the ai_operation profile to the iam base event so all Identity &
  Access Management classes inherit agent attribution, per Network/AI
  call discussion.
- Reword the tail of the ai_agent.instance_uid description per review
  ("a particular instance of the agent rather than to the agent
  generally").

@Levaj2000 Levaj2000 left a comment

Contributor

LGTM — solid attribution foundation for ai_agent, and the charter naming resolved cleanly. Thanks @Aniak5!


6 participants