From 4d327c84991e8da9246f262fb519a13b9a8bf75c Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Tue, 30 Jun 2026 18:38:00 +0000
Subject: [PATCH 1/3] Python: reformulate instanceFieldStep to avoid classInstanceTracker recursion

---
 .../new/internal/TypeTrackingImpl.qll | 24 ++++++++++++++-----
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/python/ql/lib/semmle/python/dataflow/new/internal/TypeTrackingImpl.qll b/python/ql/lib/semmle/python/dataflow/new/internal/TypeTrackingImpl.qll
index 13afd6a4276d..865c4767aba1 100644
--- a/python/ql/lib/semmle/python/dataflow/new/internal/TypeTrackingImpl.qll
+++ b/python/ql/lib/semmle/python/dataflow/new/internal/TypeTrackingImpl.qll
@@ -349,11 +349,23 @@ module TypeTrackingInput implements Shared::TypeTrackingInput {
    * `instance.attr`, where `instance` is a reference to an instance of `cls`).
    *
    * This complements `selfAttrRef`, which only handles `self.attr` accesses inside the
-   * methods of `cls`. Unlike `selfAttrRef`, this depends on the call graph (via
-   * `classInstanceTracker`), so steps using it must be reported as `levelStepCall`.
+   * methods of `cls`. The instance is identified using *local* flow from a constructor
+   * call `cls(...)` (resolved via the call graph by `resolveClassCall`), rather than a
+   * dedicated instance type-tracker (`classInstanceTracker`).
+   *
+   * Using `classInstanceTracker` here would make `levelStepCall` mutually recursive with
+   * `classInstanceTracker` -- itself a full type-tracker run -- which caused catastrophic
+   * query slowdowns on some OOP-heavy Python code bases (e.g. `mypy` and `dask`). Relying
+   * on local flow from a resolved constructor call instead depends only on `classTracker`
+   * (the same call-graph machinery already used by `inheritedFieldStep`), avoiding that
+   * blow-up. The trade-off is reduced precision: instances that flow across a call or
+   * return before being read are no longer covered by this step.
    */
   private predicate instanceAttrRead(Class cls, string attr, DataFlowPublic::AttrRead read) {
-    read.getObject() = DataFlowDispatch::classInstanceTracker(cls) and
+    exists(DataFlowPublic::CallCfgNode construction |
+      DataFlowDispatch::resolveClassCall(construction.asCfgNode(), cls) and
+      read.getObject().getALocalSource() = construction
+    ) and
     read.mayHaveAttributeName(attr)
   }
 
@@ -432,9 +444,9 @@ module TypeTrackingInput implements Shared::TypeTrackingInput {
    * This is the cross-instance counterpart of `localFieldStep`: it relates a write of
    * `self.attr` inside a class to a read of `attr` on a reference to an instance of that
    * class or one of its subclasses. Identifying instances relies on the call graph (via
-   * `classInstanceTracker`), so this step is reported as `levelStepCall` rather than
-   * `levelStepNoCall`. The write may occur in the instance's own class or in any of its
-   * superclasses, since those methods are inherited.
+   * `resolveClassCall`, see `instanceAttrRead`), so this step is reported as
+   * `levelStepCall` rather than `levelStepNoCall`. The write may occur in the instance's
+   * own class or in any of its superclasses, since those methods are inherited.
    *
    * Like `localFieldStep`, this is an over-approximation: it is both instance-insensitive
    * and order-insensitive.
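
Review note: the reformulated `instanceAttrRead` can be pictured with a toy Python model. This is only a sketch of the idea, not the QL implementation. It assumes flow is given as explicit edges between named nodes; `Node`, `local_sources`, and `instance_attr_read` are hypothetical names introduced for illustration, and `local_sources` only loosely mirrors the role `getALocalSource` plays in the patch.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a data-flow node."""
    name: str
    # Class name if this node is a constructor call `cls(...)` that the
    # call graph resolved; None for every other node.
    constructs: Optional[str] = None


Edge = Tuple[Node, Node]


def local_sources(node: Node, local_edges: Set[Edge]) -> Set[Node]:
    """Nodes that reach `node` through local (same-scope) edges, plus `node` itself."""
    seen = {node}
    stack = [node]
    while stack:
        current = stack.pop()
        for src, dst in local_edges:
            if dst == current and src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


def instance_attr_read(read_object: Node, edges: Set[Edge]) -> Set[str]:
    """Classes whose resolved constructor call reaches the object of an attribute read."""
    return {
        n.constructs
        for n in local_sources(read_object, edges)
        if n.constructs is not None
    }


# instance = MyClass2(); print(instance.foo)
ctor_at_module = Node("MyClass2() at module level", constructs="MyClass2")
instance = Node("instance")

# def make_my_class2(): return MyClass2()
# returned_instance = make_my_class2(); print(returned_instance.foo)
ctor_in_function = Node("MyClass2() inside make_my_class2", constructs="MyClass2")
factory_call = Node("make_my_class2()")
returned_instance = Node("returned_instance")

# Only local edges are available to the step. The return edge from
# ctor_in_function to factory_call is deliberately left out.
local_edges = {(ctor_at_module, instance), (factory_call, returned_instance)}
return_edge = (ctor_in_function, factory_call)

covered = instance_attr_read(instance, local_edges)
missed = instance_attr_read(returned_instance, local_edges)
# For contrast: what an interprocedural tracker could see by following the return.
with_return = instance_attr_read(returned_instance, local_edges | {return_edge})
```

In this model the read on `instance` is matched to `MyClass2`, while the read on `returned_instance` is matched only once the return edge is added, which is the precision the doc comment says is given up.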
From de8f489812057625058ad06c1247bc118167c790 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Tue, 30 Jun 2026 18:47:32 +0000
Subject: [PATCH 2/3] Add change note for instance-attribute type-tracking performance fix

---
 .../2026-06-30-instance-attribute-typetracking-performance.md | 4 ++++
 1 file changed, 4 insertions(+)
 create mode 100644 python/ql/lib/change-notes/2026-06-30-instance-attribute-typetracking-performance.md

diff --git a/python/ql/lib/change-notes/2026-06-30-instance-attribute-typetracking-performance.md b/python/ql/lib/change-notes/2026-06-30-instance-attribute-typetracking-performance.md
new file mode 100644
index 000000000000..53532f6e65ce
--- /dev/null
+++ b/python/ql/lib/change-notes/2026-06-30-instance-attribute-typetracking-performance.md
@@ -0,0 +1,4 @@
+---
+category: minorAnalysis
+---
+* Type tracking of values stored in instance attributes and read from outside the class (for example `instance.attr` where the value was assigned to `self.attr` in a method) no longer relies on a dedicated instance type-tracker. This avoids a structural mutual recursion that could cause catastrophic query slowdowns on some OOP-heavy code bases. Such reads are now resolved using local flow from the constructor call, which is slightly less precise for instances that flow across a call or return before being read.
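
Review note: a small Python snippet showing the kinds of reads the change note distinguishes. The class and function names (`Config`, `make_config`, `read_foo`) are hypothetical, and the comments describe what the change note implies rather than verified analysis results. At runtime all three reads succeed; the difference is only in what the type tracker is expected to see.

```python
class Config:
    def __init__(self):
        # Value stored on `self.foo` inside a method of the class.
        self.foo = "tracked"


def make_config():
    return Config()


def read_foo(cfg):
    return cfg.foo


# Instance reaches the read through local flow only (constructor call,
# then a local alias). Per the change note, reads like this should still
# be resolved.
direct = Config()
alias = direct
direct_value = alias.foo

# Instance crosses a return before the read. The change note lists this
# as a case that is now less precise.
returned_value = make_config().foo

# Instance crosses a call (passed as an argument) before the read. Also
# described as no longer covered by this step.
argument_value = read_foo(Config())
```

The first read matches the "local flow from the constructor call" wording; the other two are the "across a call or return" cases.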
From 4181855d09c1b9f750f19b0afa293a05be73a514 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Tue, 30 Jun 2026 20:24:31 +0000
Subject: [PATCH 3/3] Add test case with MISSING tag demonstrating instance-across-call shortcoming

---
 .../dataflow/typetracking/attribute_tests.py | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/python/ql/test/library-tests/dataflow/typetracking/attribute_tests.py b/python/ql/test/library-tests/dataflow/typetracking/attribute_tests.py
index b6bca72507f6..724be1091a4a 100644
--- a/python/ql/test/library-tests/dataflow/typetracking/attribute_tests.py
+++ b/python/ql/test/library-tests/dataflow/typetracking/attribute_tests.py
@@ -161,6 +161,18 @@ def possibly_uncalled_method(self):
 # $ MISSING: tracked=foo
 instance.print_foo() # $ MISSING: tracked=foo
+# attribute set in method, but the instance flows across a call/return before the read.
+# `instanceFieldStep` identifies the instance using only local flow from the constructor
+# call, so a value stored on `self.foo` is not seen once the instance has crossed a
+# function boundary.
+
+def make_my_class2():
+    return MyClass2()
+
+returned_instance = make_my_class2()
+print(returned_instance.foo) # $ MISSING: tracked
+
+
 # attribute set from outside of class
 class MyClass3(object):