C++: Support fully qualified field names in MaD#22088
Open
MathiasVP wants to merge 5 commits into
Open
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR updates the C++ models-as-data (MaD) flow-summary handling to support selecting fields by fully qualified field name (aligning with how other languages model fields), while keeping the legacy unqualified field-name syntax for backward compatibility. It also tweaks path-explanation hiding so MaD nodes that are marked hidden are actually suppressed in paths.
Changes:
- Extend C++ MaD field selection/encoding to recognize and emit fully qualified field names (while still accepting unqualified names).
- Fix path explanation hiding for MaD flow-summary nodes that are marked as hidden.
- Update/extend MaD-related library tests and expected outputs; add a C++ library-pack change note.
Show a summary per file
| File | Description |
|---|---|
| cpp/ql/lib/semmle/code/cpp/dataflow/internal/FlowSummaryImpl.qll | Updates MaD field encoding logic to support qualified field names while preserving legacy behavior. |
| cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/DataFlowPrivate.qll | Ensures flow-summary nodes that are marked hidden are treated as hidden in path explanations. |
| cpp/ql/test/library-tests/dataflow/external-models/flow.ext.yml | Adds external-model summaries exercising fully qualified field-name syntax. |
| cpp/ql/test/library-tests/dataflow/external-models/test.cpp | Adds test code that writes/reads fields from a namespaced struct and a global struct to validate qualified field handling. |
| cpp/ql/test/library-tests/dataflow/external-models/flow.expected | Updates expected flow graph output to reflect the new MaD behavior and added tests. |
| cpp/ql/test/library-tests/dataflow/external-models/sources.expected | Updates expected sources for new test code locations. |
| cpp/ql/test/library-tests/dataflow/external-models/sinks.expected | Updates expected sinks for new test code locations. |
| cpp/ql/test/library-tests/dataflow/models-as-data/testModels.expected | Updates expected rendered MaD flow summary nodes to reflect qualified field naming. |
| cpp/ql/test/query-tests/Security/CWE/CWE-497/semmle/tests/ExposedSystemData.expected | Updates expected path explanations/edges impacted by the MaD/path-hiding adjustments. |
| cpp/ql/test/query-tests/Likely Bugs/Format/NonConstantFormat/NonConstantFormat.expected | Updates expected path explanations/edges impacted by the MaD/path-hiding adjustments. |
| cpp/ql/lib/change-notes/2026-06-30-mad-qualified-field-names.md | Adds a change note documenting the MaD field-name qualification change and deprecation plan. |
Review details
- Files reviewed: 11/11 changed files
- Comments generated: 3
- Review effort level: Low
hvitved
reviewed
Jun 30, 2026
Comment on lines
+57
to
+60
| if arg.matches("%::%") | ||
| then | ||
| arg = repeatStars(c.getIndirectionIndex() - 1) + formatQualifiedName(namespace, type, base) | ||
| else arg = repeatStars(c.getIndirectionIndex() - 1) + base |
Contributor
There was a problem hiding this comment.
Isn't this simply the same as
Suggested change
| if arg.matches("%::%") | |
| then | |
| arg = repeatStars(c.getIndirectionIndex() - 1) + formatQualifiedName(namespace, type, base) | |
| else arg = repeatStars(c.getIndirectionIndex() - 1) + base | |
| arg = repeatStars(c.getIndirectionIndex() - 1) + formatQualifiedName(namespace, type, base) | |
| or | |
| arg = repeatStars(c.getIndirectionIndex() - 1) + base |
(assuming that base cannot match %::%)?
| or | ||
| n instanceof SsaSynthNode | ||
| or | ||
| n.(FlowSummaryNode).getSummaryNode().isHidden() |
Contributor
There was a problem hiding this comment.
Yes, you definitely want this.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
There's a small discrepancy between how Java and C# picks out a field in a MaD summary and how we do it in C++: In C++ we simply use the name of the field whereas Java and C# (and probably other languages as well. I didn't check this) use the fully qualified name of the field.
Using the fully qualified name is definitely the right approach as it precisely allows us to pick out the field that is being written to / read from, but we apparently missed that when we added MaD for C++.
Using the fully qualified name turns out to be crucial for some MaD summaries I'm working on for associative STL containers. So this PR adds the ability for C++ to specify fully qualified names while keeping the old "just use the field name" around for compatibility reasons. This makes some of the rendered flow summary nodes less pretty, but that's the tradeoff we have to make I think, unfortunately.
I tucked on a small one-line change in which nodes we hide in path explanations that we've been missing ever since we introduced MaD. We weren't hiding nodes that were supposed to be hidden, and caaed72 fixes that. It shouldn't change any results.
DCA is clean (the
keepassxcis a well known source of wobble on DCA for C++).