test(dataset_tests): make tests self-cleaning and remove global-list assertion #117
Merged
The Python SDK CI matrix (python-version: [3.9, 3.13]) runs both jobs
against the same shared PRAPP backend with the same org/API key. This
caused cross-matrix flakes in tests/tests/dataset_tests/, most visibly
test_get_all_datasets, which:
1. Created two datasets without ever deleting them, leaking state into
subsequent runs of the other matrix entry.
2. Asserted len(get_all_datasets()) >= 2, a global-list assertion that
depends on backend state outside the test's control. The
get_all_datasets() call can also fail on its own: the SDK's internal
N+1 list -> get_dataset_by_name loop races with parallel deletions,
and a dataset deleted by the other job can surface as a 404 mid-loop.
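The race in point 2 can be reproduced without a real backend. The sketch below is illustrative only: FakeBackend, NotFound, and the on_get hook are hypothetical stand-ins, not SDK code, and get_all_datasets here only imitates the list -> per-name lookup shape described above.

```python
class NotFound(Exception):
    """Hypothetical stand-in for the 404 a real client would raise."""


class FakeBackend:
    """In-memory stand-in for the shared backend (not the real SDK)."""

    def __init__(self):
        self.datasets = {}
        self.on_get = None  # one-shot hook simulating the other matrix job

    def list_names(self):
        return list(self.datasets)

    def get_dataset_by_name(self, name):
        # Fire the hook once, before the lookup, to mimic a concurrent delete.
        if self.on_get is not None:
            hook, self.on_get = self.on_get, None
            hook()
        if name not in self.datasets:
            raise NotFound(name)
        return self.datasets[name]


def get_all_datasets(backend):
    # N+1 shape: one list call, then one lookup per listed name.
    return [backend.get_dataset_by_name(n) for n in backend.list_names()]


def demo_race():
    backend = FakeBackend()
    backend.datasets = {
        "job-a-ds": {"name": "job-a-ds"},
        "job-b-ds": {"name": "job-b-ds"},
    }
    # The other job deletes its dataset after the list call has returned
    # but before the per-name lookup reaches it.
    backend.on_get = lambda: backend.datasets.pop("job-b-ds")
    try:
        get_all_datasets(backend)
    except NotFound as exc:
        return f"404 mid-loop for {exc}"
    return "no race"
```

Calling demo_race() takes the NotFound branch: the name was present when listed and gone by the time it was fetched.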
This change:
- Adds tests/tests/dataset_tests/conftest.py with a function-scoped
'created_datasets' fixture: tests append each created dataset name to
the yielded list, and the fixture deletes every registered name on
teardown (best-effort per name, so a single failure does not mask
others). This mirrors the create -> yield -> delete pattern already
used by setup_dataset in tests/conftest.py.
- Drops the global len(ds_all) >= 2 assertion in test_get_all_datasets;
it now only checks that the two UUID-named datasets it created appear
in the result, which is what the test actually intends to verify.
- Wires the new fixture into every backend-touching test in
dataset_tests/ that previously created datasets without teardown:
test_dataset_lists, test_edit_dataset (all four tests, including the
rename case where both pre- and post-rename names are registered),
test_get_dataset (both tests), test_get_dataset_files (both tests),
test_pii_occurrences (both tests), test_upload_files_to_dataset (all
three tests). test_entity_mappings is unaffected (pure unit tests
with a mocked client).
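A minimal sketch of the create -> yield -> delete pattern and the narrowed assertion, under stated assumptions: FakeClient is a hypothetical in-memory stand-in for the SDK client, and track_created_datasets is an illustrative name for the generator body. In conftest.py the same body would presumably sit under @pytest.fixture and take the real client from an existing fixture; the actual fixture in this PR may differ.

```python
import uuid


class FakeClient:
    """Hypothetical in-memory stand-in for the SDK client."""

    def __init__(self):
        self.datasets = {}

    def create_dataset(self, name):
        self.datasets[name] = {"name": name}
        return self.datasets[name]

    def delete_dataset(self, name):
        del self.datasets[name]  # raises KeyError if the name is absent

    def get_all_datasets(self):
        return list(self.datasets.values())


def track_created_datasets(client):
    """Generator body for a function-scoped 'created_datasets' fixture."""
    names = []
    yield names  # the test appends every dataset name it creates
    for name in names:
        try:
            client.delete_dataset(name)
        except Exception:
            # Best-effort per name: one failed delete (for example the
            # pre-rename name in a rename test) must not block the rest.
            pass


def check_get_all_datasets(client, created_datasets):
    """Shape of the narrowed test: assert only on datasets it created."""
    names = [f"test-{uuid.uuid4()}" for _ in range(2)]
    for name in names:
        client.create_dataset(name)
        created_datasets.append(name)
    listed = {ds["name"] for ds in client.get_all_datasets()}
    # No len(...) >= 2 check: other jobs may add or remove datasets freely.
    assert set(names) <= listed


def run_demo():
    client = FakeClient()
    client.create_dataset("left-by-other-job")  # state this test does not own
    fixture = track_created_datasets(client)
    created = next(fixture)   # setup: obtain the list to register names in
    check_get_all_datasets(client, created)
    next(fixture, None)       # teardown: delete every registered name
    return sorted(client.datasets)
```

After run_demo() only the dataset the test did not create remains, which is the property that keeps the two matrix jobs from leaking state into each other.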
Out of scope (tracked separately): the underlying N+1 in
tonic_textual/services/dataset.py::get_all_datasets, which is still
race-prone against concurrent deletions even with this fix.
KirillTonic
approved these changes
Jun 17, 2026