v9.7.0#2622
Merged
Merged
Conversation
…Each/forEachOrNull Add support for the %rowIndex environment variable as defined in the SQL on FHIR ViewDefinition spec. Within forEach and forEachOrNull iterations, %rowIndex resolves to the 0-based index of the current element. At the top level (no iteration), it evaluates to 0. Each nesting level maintains independent %rowIndex values. The implementation uses Spark's indexed transform(array, (elem, idx) ->) to track element positions during unnesting, threading the index through ProjectionContext into the FHIRPath evaluation as a supplied variable. Closes #2560 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eat clause Adds support for %rowIndex within the repeat directive, producing a global 0-based traversal-order index across all depth levels of the flattened recursive tree. Each repeat directive scopes its own counter independently from enclosing or nested forEach/forEachOrNull/repeat directives. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rning Replace lazy ThreadLocal initialization with eager init and readObject() to eliminate a race condition when the instance is shared across threads via Spark's addReferenceObj(). Suppress S5164 (ThreadLocal.remove()) with documentation explaining why removal is unnecessary. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nt values Split RowCounter into separate read (RowCounterGet) and increment (RowCounterIncrement) operations so that multiple references to %rowIndex within the same repeat element all read the same value. Previously each reference independently called getAndIncrement(), causing N references to consume N counter values per element instead of one. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RowCounter (getAndIncrement per evaluation) is superseded by the RowCounterGet + RowCounterIncrement split. Remove the old expression, its ValueFunctions helper, the getAndIncrement method, and the four encoder-level tests that exercised it. The equivalent behavior is now tested via ViewDefinition-level tests in rowindex.json. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cover the 13 new lines flagged by SonarCloud as uncovered: all methods on RowIndexCounter (get, increment, reset, serialization) and the three ValueFunctions entry points (rowCounterGet, rowCounterIncrement, resetCounter) exercised via a Spark dataset test in both codegen and interpreted modes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces let() binding in SqlFunctions to materialise non-deterministic Spark column expressions exactly once per row, and applies it throughout the fhirpath and sql packages to prevent TraceExpression side effects from firing multiple times where the same operand appears in both branches of a when() expression. Adds a RepeatedSqlEvaluation checkstyle rule (RegexpMultiline) to catch accidental duplicate SQL evaluation at compile time, scoped to the fhirpath and sql package trees. Includes regression tests for all fixed evaluation paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix join() to use the lambda-bound parameter instead of getValue(), preventing duplicate evaluation of non-deterministic operands, and add a single-fire regression test with a string-array dataset. - Replace nullif(c, array()) in normaliseNull() with let() + size() check to avoid relying on element-type equality, which fails for MapType array elements in ANSI mode. - Document the 400-character false-negative trade-off in the RepeatedSqlEvaluation checkstyle rule comment. - Add @throws AnalysisException to SqlFunctions.let() Javadoc for the aggregate/window constraint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix "Two" → "Three" design constraints count in checkstyle comment - Remove incorrect cdUnit as let() lambda parameter example; clarify that leftR/rightR are short local variables, not lambda parameters - Update PathlingContext Javadoc: add toBoolean() to list of affected helpers and change "post-3.0" to "Spark 3.0+" for consistency - Update SqlFunctions class Javadoc to mention union alongside deduplication - Add testNormaliseNull() and nullArray() case to testSingular() in DefaultRepresentationTest to cover semantic correctness after rewrite - Add trace-count regression guards for: convertToDateTime, convertToTime, and IMPLIES right-operand in TraceFunctionTest Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Backtick-quoted code references and removed bare semicolons from explanatory comments in ColumnRepresentation and TraceFunctionTest to avoid triggering java:S125. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extend the let() materialisation pattern to five more sites where a Column parameter was referenced multiple times in a single Spark SQL expression tree, causing nondeterministic expressions (e.g. trace()) to fire once per reference instead of once per row. Fixed sites: - ArrayElementWiseColumnEquality.performArrayComparison() - QuantityComparator.wrap() - TemporalComparator.implementWithSql() - ReferenceValue.validateTypeFormat() - ValidationLogic.validateConversionToBoolean() Also adds a binary let(Column, Column, BinaryOperator<Column>) overload to SqlFunctions to reduce verbosity when materialising two operands. Each fix is covered by a new trace-count regression test that wraps the input column in TraceExpression and asserts exactly one fire per row. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
`sanitiseRow` only handled nested `Row` values but fell through for `scala.collection.Seq` values (how Spark represents array fields), so synthetic fields like `_fid` and null-valued fields leaked into the JSON output whenever a FHIRPath expression returned a type containing an array of structs (e.g. `CodeableConcept.coding`). Adds a new branch that iterates over `Seq` elements, recursively sanitises any `Row` elements, and updates the parent field's `ArrayType` elementType to the sanitised element schema so that `Row.json()` positional mapping remains correct. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pre-size the ArrayList with the known sequence length, remove a redundant what-comment, and extract the shared coding row fixture into a helper to eliminate copy-paste between two test classes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Locks in that sanitiseRow correctly renders JSON for an array of structs where elements differ in which fields are null, and therefore have different post-sanitisation schemas.
Added suppressions for newly reported CVEs across core libraries, server, and site scopes following contextual impact assessment. All suppressed findings are either not bundled in the distribution or have unreachable vulnerable code paths. Upgraded mermaid from 11.12.2 to 11.15.0 via package.json override to fix four MEDIUM CVEs (CSS/HTML injection and DoS in diagram rendering) in the deployed static site. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three unit tests (SqlQueryResultStreamerTest, ViewRegistrationServiceTest, LibraryReferenceResolverTest.CanonicalReferences) called spark.stop() in @afterall, which destroyed the JVM-wide SparkContext and caused ViewDefinitionSearchTest and ViewDefinitionCreateTest to fail intermittently depending on test execution order. Converted all three tests to @SpringBootUnitTest so they receive the shared SparkSession via Spring injection, consistent with every other Spark-dependent test in the server module. The manually created sessions and @afterall teardowns are removed entirely. Closes #2615 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
BulkSubmitProviderTest was installing a Mockito mock as the active Spring SecurityContext in @beforeeach but never clearing it. Under JUnit 5 parallel execution the mock leaked onto adjacent threads, causing SearchProviderAuthTest to inherit a mock context in which setAuthentication() is a no-op, so checkHasAuthority() would throw "Token not present". Closes #2617. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Hadoop Path.toUri() drops the empty authority on file:// paths built via new Path(parent, child), yielding file:/path. Files discovered later via fs.listFiles + fs.makeQualified preserve the empty authority and come back as file:///path. UrlAllowlist's string-prefix match then rejects the downloaded file URLs against the staging-directory prefix, failing the import with an AccessDeniedError after the bulk export has already completed. Build the prefix via fs.getFileStatus so both sides use the same canonical URI form. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The shared static warehouse @tempdir is cleaned in @AfterEach, but Spark's catalog cache and Delta's global DeltaLog cache still hold references to the deleted tables. The next test rebuilds the warehouse from test fixtures, but isDeltaTable returns false against the stale log, so the import falls through to an ERROR_IF_EXISTS write that collides with the freshly-copied directory and fails with DELTA_PATH_EXISTS. Clear both caches before deleting files so cleanup restores both the on-disk and in-memory state. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The test runs under the integration-test profile with PNP credentials configured and auth enabled, but its requests were missing the Authorization header. The pre-existing 401 was hidden by an earlier PNP allowlist bug; with that fix in place, the auth interlock now rejects the request before the poisoning scenario can exercise. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The integration tests configured pathling.bulk-submit.allowable-sources
to bare http://localhost via @TestPropertySource. The URI-aware
UrlAllowlist resolves that prefix to effective port 80 and no longer
matches the dynamic http://localhost:{wireMockPort} the tests
actually use. Move the property into @DynamicPropertySource so it
picks up the WireMock port at runtime.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The test relied on the WebTestClient default response timeout of 5 s, which is shorter than the cold-start latency of the first POST against a freshly started Spring Boot context with a Delta-backed warehouse. Match the 60 s timeout already used by the sibling integration tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix: Resolve intermittent test failures caused by shared state leaks
The shareable compliance suite previously ran only in a report-only execution that ignored test failures, so regressions in features Pathling supports could land unnoticed. The suite now runs in the default build with the maintained exclusion list, failing the build on any regression in a supported case. The report-only run moves to an opt-in profile activated by the release workflow, so the SoF compliance report continues to be produced. %rowIndex cases are added to the exclusion list because that feature is not yet supported and is tracked separately.
When a repeat directive's traversal followed a path whose runtime value was null (e.g. multi-path repeat where one branch produced no value at certain nodes), the extractor returned a null array and Spark's Concat propagated the null upward, producing wrong results. Wrapping the extractor's output in Coalesce(_, []) keeps the typed empty array in place of nulls and lets the surrounding Concat assemble the projection correctly. Resolves the previously failing repeat compliance case "multi-path repeat inside forEach" in the expanded SQL on FHIR v2 test suite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a repeat directive recursively traversed past the encoder's maxNestingLevel, Pathling's FHIRPath evaluator continued resolving the path against HAPI definitions while the Catalyst schema no longer had the field. The FIELD_NOT_FOUND fallback emitted an untyped empty array, which crashed StructProduct with "NullType cannot be cast to StructType" whenever a sibling typed array combined with the empty result. The expected element type is now derived from the repeat's projection clause (declared sqlType, FHIR type, or materialised column type, wrapped in ArrayType for collection columns) and threaded through transformTree. When the root traversal hits FIELD_NOT_FOUND, the fallback emits a typed empty array matching the declared column shape, so downstream StructProduct sees a consistent element type and combines correctly. Resolves the previously failing repeat compliance cases (repeat inside repeat, triple-nested repeat, repeat with forEach with repeat) in the expanded SQL on FHIR v2 test suite. The symmetric forEach-past-cap case is tracked separately and excluded from the regression suite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d-empty Add regression tests covering three type-mapping bugs silently fixed by routing through FhirPathType (BASE64BINARY, DECIMAL, INSTANT). Add a behavioural test for the typed-empty fallback path in transformTree. Improve comments in RepeatSelection, Expressions, and ProjectedColumn to capture non-obvious intent. Update design.md to reflect as-built decisions (getSqlType on existing records, FhirPathType delegation). Archive the completed change and sync the repeat-directive delta spec to the main specs directory. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add regression cases for EqualityOperator.handleNonEquivalentTypes covering both left and right traced operands. The branch is currently correct because isEmpty() references its operand once, but the asymmetry with handleEquivalentTypes (which already let()-wraps both sides) means a future refactor that adds a second operand reference would silently re-introduce issue #2594. The new assertions document the single-fire invariant at the FHIRPath surface.
Eliminate trace() entry duplication in ColumnRepresentation
fix: Ensure repeat directive implementation passes expanded test coverage
Extend sanitiseRow to recurse into arrays of structs
The encoders module requires the Bunsen-derived license header, but the new RowIndexCounter source and test files carried the standard CSIRO header, causing the license:check goal to fail in CI.
The merge of release/9.7.0 added a traceCollector constructor parameter but the withVariable call site was not updated, breaking compilation.
Implement %rowIndex environment variable
The @throws org.apache.spark.sql.AnalysisException tag caused the Javadoc build to fail since the exception is not declared in the method signature. The constraint is already documented in the preceding prose paragraph.
The SoF compliance report execution was moved behind the opt-in sofComplianceReport profile, but the pre-release workflow was not updated to activate it. As a result, fhir-view-compliance-test.json was never produced and the subsequent S3 upload step failed.
The fromLiteral Javadoc referenced a ParserException that the method never throws, which had pulled in an unintended import of org.apache.jena.reasoner.rulesys.Rule.ParserException. The code only compiled because Jena was present transitively on the classpath. Removed the bad import and the incorrect @throws tag.
Bumps HAPI FHIR from 8.6.0 to 8.10.0 in the core libraries and from 8.6.8 to 8.10.0 in the server. 8.10.0 bundles org.hl7.fhir.* 6.9.4.1, which remains behind the patched 6.9.7, so the existing transitive overrides are retained and their comments updated to reflect the new bundled version.
Investigation confirmed the DELTA_PATH_EXISTS failure for same-type batch updates was already fixed on release/9.7.0 by the switch to DeltaTable.isDeltaTable(spark, path) for table detection, and could not be reproduced against the current branch. No production change is required. Locks in the fixed behaviour and removes the harness confounds that masked it: - Harden testBatchUpdateMultiplePatients to assert both patients are retrievable and that a patient seeded before the batch survives the merge. - Add UpdateExecutorPathExistsTest covering recovery when the target path exists but is not a recognised Delta table. - Copy only the parquet table directories in copyTestDataToTempDir, and clean the destination first, so stray bulk-export job output cannot leak into the warehouse under test. - Publish the resource-update-persistence capability spec and archive the OpenSpec change.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.