[salesforce] Order EventLogFile queries by the cursor field (CreatedDate)#19954

Open
shmsr wants to merge 3 commits into elastic:main from shmsr:salesforce-elf-cursor-order-fix

@shmsr shmsr commented Jul 3, 2026

Member

What does this PR do?

Orders the Apex, Login, and Logout EventLogFile queries by CreatedDate (the cursor field) instead of LogDate, in both the default and value queries.

Background: two unrelated timestamps

Every EventLogFile record has two datetime fields that mean different things:

  • LogDate — the period the log covers (e.g. the start of the hour/day).
  • CreatedDate — when Salesforce actually generated the file. This lags LogDate, and the lag is variable (minutes to days), so CreatedDate order does not follow LogDate order.

These data streams track the collection cursor on CreatedDate (cursor.field: CreatedDate) and resume with WHERE CreatedDate > <cursor>, but the queries sorted the results by LogDate:

-- before
... WHERE CreatedDate > [[ .cursor.event_log_file.last_event_time ]] AND (<event types>) ORDER BY LogDate ASC NULLS FIRST

The bug, with an example

The input watermarks the cursor from the last record it processes in a page. When the page is ordered by LogDate, that last record is not necessarily the one with the greatest CreatedDate.

Consider two log files where the one covering the later period happened to be generated first:

| # | LogDate (period)     | CreatedDate (generated) |
|---|----------------------|-------------------------|
| 1 | 2026-06-22T00:00:00Z | 2026-06-24T11:28:06Z    |
| 2 | 2026-06-23T00:00:00Z | 2026-06-24T11:08:05Z    |

ORDER BY LogDate ASC returns them as #1 then #2. The input processes #1 (sets the watermark to 11:28:06), then #2 (overwrites the watermark with 11:08:05). So the stored last_event_time ends at 2026-06-24T11:08:05Z, which is earlier than the CreatedDate of a record it already ingested (11:28:06).

On the next poll:

WHERE CreatedDate > 2026-06-24T11:08:05Z ...

…record #1 (11:28:06) matches again and is re-collected. The cursor effectively lags behind the data and re-fetches already-ingested files each poll.
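
The two polls described above can be replayed with a small model. This is a sketch only: `poll` is a hypothetical stand-in for the input's behavior (filter on CreatedDate, sort, take the watermark from the last record), not the integration's actual code, and the starting cursor is an assumed value chosen to precede both files.

```python
from datetime import datetime, timezone

def parse(ts: str) -> datetime:
    """Parse the UTC timestamps used in the example table."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

# The two log files from the example table.
FILES = [
    {"id": 1, "LogDate": parse("2026-06-22T00:00:00Z"),
     "CreatedDate": parse("2026-06-24T11:28:06Z")},
    {"id": 2, "LogDate": parse("2026-06-23T00:00:00Z"),
     "CreatedDate": parse("2026-06-24T11:08:05Z")},
]

def poll(files, cursor, order_by):
    """Hypothetical model of one poll: WHERE CreatedDate > cursor,
    ORDER BY order_by ASC, watermark taken from the last record."""
    page = sorted(
        (f for f in files if f["CreatedDate"] > cursor),
        key=lambda f: f[order_by],
    )
    new_cursor = page[-1]["CreatedDate"] if page else cursor
    return [f["id"] for f in page], new_cursor

start = parse("2026-06-24T00:00:00Z")  # assumed initial cursor

ids1, cursor1 = poll(FILES, start, "LogDate")
ids2, cursor2 = poll(FILES, cursor1, "LogDate")

print(ids1, cursor1.isoformat())  # first poll: both files, watermark 11:08:05
print(ids2)                       # second poll: file 1 is collected again
```

In this model the first poll returns files 1 and 2 and stores 11:08:05 as the watermark, so the second poll returns file 1 a second time.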

This is not a contrived case — sorting real EventLogFile results by LogDate produces multiple such CreatedDate inversions per page whenever files are generated slightly out of period order (common for hourly logs).
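
One way to check a page for this pattern is to count the places where CreatedDate decreases once the records are sorted by LogDate. The helper below is a hypothetical diagnostic, not part of the integration; it is shown against the two records from the example, and it compares ISO 8601 strings directly, which is only valid because all values share the same fixed-width UTC format.

```python
def created_date_inversions(page):
    """Count adjacent pairs whose CreatedDate decreases when the page
    is sorted by LogDate (hypothetical diagnostic helper)."""
    ordered = sorted(page, key=lambda r: r["LogDate"])
    return sum(
        1
        for prev, cur in zip(ordered, ordered[1:])
        if cur["CreatedDate"] < prev["CreatedDate"]
    )

# The two records from the example table.
PAGE = [
    {"LogDate": "2026-06-22T00:00:00Z", "CreatedDate": "2026-06-24T11:28:06Z"},
    {"LogDate": "2026-06-23T00:00:00Z", "CreatedDate": "2026-06-24T11:08:05Z"},
]

inversions = created_date_inversions(PAGE)
print(inversions)  # 1
```

Any non-zero count means the last record in LogDate order may not carry the greatest CreatedDate on that page.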

The fix

Order by the cursor field so the last record always carries the maximum CreatedDate:

-- after
... WHERE CreatedDate > [[ .cursor.event_log_file.last_event_time ]] AND (<event types>) ORDER BY CreatedDate ASC NULLS FIRST

With the example above, ORDER BY CreatedDate ASC returns #2 then #1, so the watermark ends at 2026-06-24T11:28:06Z (the true maximum) and the next poll (CreatedDate > 11:28:06) does not re-collect either file. This matches the already-correct SetupAuditTrail query, which orders by its cursor field.
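
The effect of the ordering change on the watermark can be sketched with the same two records. `last_record_watermark` is a hypothetical helper that models "take CreatedDate from the last record of the sorted page"; timestamps are compared as strings, which works here only because they share one fixed-width UTC format.

```python
# (LogDate, CreatedDate) pairs from the example table.
ROWS = [
    ("2026-06-22T00:00:00Z", "2026-06-24T11:28:06Z"),
    ("2026-06-23T00:00:00Z", "2026-06-24T11:08:05Z"),
]
LOG_DATE, CREATED_DATE = 0, 1

def last_record_watermark(rows, sort_index):
    """Sort ascending by the given column and return the CreatedDate
    of the last record (hypothetical model of the watermark)."""
    ordered = sorted(rows, key=lambda r: r[sort_index])
    return ordered[-1][CREATED_DATE]

by_log_date = last_record_watermark(ROWS, LOG_DATE)
by_created = last_record_watermark(ROWS, CREATED_DATE)
true_max = max(r[CREATED_DATE] for r in ROWS)

# Records the next poll would return with the new watermark.
remaining = [r for r in ROWS if r[CREATED_DATE] > by_created]

print(by_log_date)  # 2026-06-24T11:08:05Z
print(by_created)   # 2026-06-24T11:28:06Z
print(remaining)    # []
```

Sorting by the cursor column makes the last record the maximum by construction, so the result does not depend on how LogDate and CreatedDate happen to interleave.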

Why is it important?

Prevents repeated re-collection of already-ingested EventLogFile data and keeps the collection cursor moving strictly forward.

Compatibility / upgrade safety

The change is limited to the ORDER BY clause. The WHERE filter and cursor.field are unchanged, so:

  • Existing persisted cursors remain valid CreatedDate values and are interpreted identically.
  • For any given cursor value, the exact same set of records is returned — only their order (and the resulting watermark) changes — so no records are skipped on upgrade. A previously lagging cursor simply re-fetches its small lag window once and then advances cleanly.
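
Both points can be illustrated with the example records. This is a sketch under stated assumptions: `fetch` is a hypothetical model of the query (same WHERE filter, configurable ORDER BY), the initial cursor is an assumed value, and timestamps are compared as fixed-width UTC strings.

```python
ROWS = [
    {"id": 1, "LogDate": "2026-06-22T00:00:00Z", "CreatedDate": "2026-06-24T11:28:06Z"},
    {"id": 2, "LogDate": "2026-06-23T00:00:00Z", "CreatedDate": "2026-06-24T11:08:05Z"},
]

def fetch(rows, cursor, order_key):
    """Hypothetical model of the query: WHERE CreatedDate > cursor,
    ORDER BY order_key ASC."""
    matched = [r for r in rows if r["CreatedDate"] > cursor]
    return sorted(matched, key=lambda r: r[order_key])

# 1. Same cursor, same set of records; only the order differs.
cursor = "2026-06-24T00:00:00Z"  # assumed persisted cursor
old_ids = [r["id"] for r in fetch(ROWS, cursor, "LogDate")]
new_ids = [r["id"] for r in fetch(ROWS, cursor, "CreatedDate")]

# 2. A cursor left lagging by the old ordering re-fetches once, then advances.
lagging = "2026-06-24T11:08:05Z"
first = fetch(ROWS, lagging, "CreatedDate")
advanced = first[-1]["CreatedDate"] if first else lagging
second = fetch(ROWS, advanced, "CreatedDate")

print(old_ids, new_ids)          # [1, 2] [2, 1]
print([r["id"] for r in first])  # [1]
print(advanced, second)          # 2026-06-24T11:28:06Z []
```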

Checklist

  • I have reviewed tips for building integrations and this change adheres to the guidelines.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

  • elastic-package lint in packages/salesforce.
  • elastic-package test system -v for the apex, login, and logout data streams (the mock server query matchers in _dev/deploy/docker/files/config.yml are updated to match the new ordering).

…ate)

The Apex, Login, and Logout EventLogFile queries sorted results by
`LogDate` while tracking the collection cursor on `CreatedDate`. Those
two fields are not correlated for EventLogFile records: Salesforce can
create a log for an earlier `LogDate` period after one for a later
period, so the last record in `LogDate` order frequently does not carry
the maximum `CreatedDate`.

Because the input watermarks the cursor from the last processed record,
the stored `event_log_file.last_event_time` could be set below the
newest `CreatedDate` already ingested, so the next poll re-collected
data it had already fetched.

Order these queries by `CreatedDate` (the cursor field), matching the
already-correct SetupAuditTrail query, so the watermark only advances.
The change is limited to the ORDER BY clause; the WHERE filter and
cursor field are unchanged, so existing persisted cursors remain valid
and no data is skipped on upgrade.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@shmsr shmsr requested a review from a team as a code owner July 3, 2026 08:51
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@shmsr shmsr self-assigned this Jul 3, 2026
@shmsr shmsr added the Integration:salesforce Salesforce label Jul 3, 2026
@shmsr shmsr requested a review from stefans-elastic July 3, 2026 08:52
@github-actions

github-actions Bot commented Jul 3, 2026

Contributor

✅ Elastic Docs Style Checker (Vale)

No issues found on modified lines!


The Vale linter checks documentation changes against the Elastic Docs style guide. To use Vale locally or report issues, refer to Elastic style guide for Vale.

@elastic-vault-github-plugin-prod


✅ All changelog entries have the correct PR link.

@infra-vault-gh-plugin-prod


💚 Build Succeeded

cc @shmsr

@andrewkroh andrewkroh added the Team:Obs-InfraObs Observability Infrastructure Monitoring team [elastic/obs-infraobs-integrations] label Jul 3, 2026