Skip to content

[Audit] Fix nullable port_id in fact_downtime_daily grain and surrogate key #102

Description

@daria-sukhareva

Context

Downtime should be on the port level.

fact_downtime_daily documents port_id as "may be null if unavailable" but uses it as a component of the unique key (date_id + charge_point_id + port_id + type). The downtime_id surrogate key is generated from this nullable column.

This creates a data integrity hole: two rows with port_id = null on the same date_id, charge_point_id, and type will produce identical downtime_id values. The unique test on downtime_id will then either fail (catching the problem late) or pass incorrectly if nulls are excluded from the test.

Beyond the surrogate key collision, a nullable column in the grain is a grain ambiguity: "one row per date + charge_point + port + type" and "one row per date + charge_point + unknown port + type" are different things, and downstream consumers cannot distinguish them.

Acceptance criteria

  • The business case for port_id being nullable is reviewed and documented: is a charge-point-level (non-port-specific) downtime event a real, distinct concept, or is null a data quality gap?
  • If null port_id is a legitimate business event: introduce a sentinel value (e.g. 'UNKNOWN' or 'N/A') to replace null in the surrogate key generation, document the sentinel in the column description, and update the dbt_utils.generate_surrogate_key call accordingly
  • If null port_id is a data quality gap: add a not_null test to port_id and a data investigation note explaining what upstream fix is required
  • The unique test on downtime_id is verified to catch duplicates correctly after the chosen fix is applied
  • The grain statement in the model description is updated to reflect the resolution (matches the grain documentation issue if that work happens concurrently)
  • dbt build --select fact_downtime_daily passes with no test failures

Metadata

Metadata

Assignees

No one assigned

    Labels

    kwwhat-auditKimball compliance audit findings

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Todo

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions