Skip to content

Fix N+1 queries in trigger asset event submission#65367

Open
henry3260 wants to merge 1 commit into
apache:mainfrom
henry3260:fix-n-1-query-in-trigger
Open

Fix N+1 queries in trigger asset event submission#65367
henry3260 wants to merge 1 commit into
apache:mainfrom
henry3260:fix-n-1-query-in-trigger

Conversation

@henry3260

@henry3260 henry3260 commented Apr 16, 2026

Copy link
Copy Markdown
Contributor

Why

When a trigger submits an event for watched assets, Airflow needs to load the trigger's asset watchers and their related assets before registering asset changes. These relationships were previously loaded lazily, so a trigger watching multiple assets could issue extra queries per asset during event submission.

This adds unnecessary database overhead in the Triggerer path and can scale poorly for triggers that notify many assets. Loading the asset watcher graph up front keeps event submission query usage stable as the number of watched assets grows.

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

Important

🛠️ Maintainer triage note for @henry3260 · by @potiuk · 2026-06-18 13:57 UTC

Paused pending your next update — this PR has been inactive for ~41 days, so it's been moved to draft to keep the review queue clear:

  • Rebase on the latest main, address any new failures, and mark it Ready for review when you pick it back up — no rush.
  • See the Pull Request quality criteria.

The ball is in your court — you've been assigned to this PR.

Automated triage — may be imperfect; a maintainer takes the next look.

@henry3260 henry3260 requested review from XD-DENG and ashb as code owners April 16, 2026 12:45
@potiuk potiuk added the ready for maintainer review Set after triaging when all criteria pass. label Apr 22, 2026
@henry3260

Copy link
Copy Markdown
Contributor Author

Hi @ashb Could I get your review when you get a chance? Thanks in advance!

@potiuk potiuk marked this pull request as draft June 18, 2026 13:58
@henry3260 henry3260 marked this pull request as ready for review June 18, 2026 14:09
@henry3260 henry3260 force-pushed the fix-n-1-query-in-trigger branch from 953ac8e to b476dbb Compare June 18, 2026 14:09

@jason810496 jason810496 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, I verify the select chain as well. Nice catch, left small nits regarding the test cases.

I will merge this for 3.3.1.

Comment on lines +256 to +267
def _submit_event_query_count(trigger_id: int) -> int:
with count_queries(session=session) as query_result:
Trigger.submit_event(trigger_id, TriggerEvent("payload"), session=session)
return sum(query_result.values())

small_query_count = _submit_event_query_count(_create_trigger_with_assets(1))
larger_query_count = _submit_event_query_count(_create_trigger_with_assets(5))

assert larger_query_count - small_query_count < 3, (
f"Added 4 assets but query count increased by {larger_query_count - small_query_count} "
f"({small_query_count} -> {larger_query_count}), suggesting n+1 queries for trigger assets"
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, we should parameterize the asset_count at test function level and get the exact same select count.

def test_submit_event_no_n_plus_one_for_assets(_, session):
"""Ensure asset notifications do not trigger per-asset lazy-load queries."""

def _create_trigger_with_assets(asset_count: int) -> int:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we could collapse the these utility function as the test function itself.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, making it a nested function gives us limited gain here.

@jason810496 jason810496 added this to the Airflow 3.3.1 milestone Jun 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Triggerer ready for maintainer review Set after triaging when all criteria pass.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants