Context
Downtime should be recorded at the port level.
fact_downtime_daily documents port_id as "may be null if unavailable" but uses it as a component of the unique key (date_id + charge_point_id + port_id + type). The downtime_id surrogate key is generated from this nullable column.
This creates a data integrity hole: two rows with port_id = null on the same date_id, charge_point_id, and type will produce identical downtime_id values. The unique test on downtime_id will then either fail (catching the problem late) or pass incorrectly if nulls are excluded from the test.
Beyond the surrogate key collision, a nullable column in the grain is a grain ambiguity: "one row per date + charge_point + port + type" and "one row per date + charge_point + unknown port + type" are different things, and downstream consumers cannot distinguish them.
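The collision described above can be reproduced with a minimal sketch. It assumes the surrogate key is an MD5 hash over the key columns joined with a separator, with nulls replaced by a fixed placeholder, which is roughly how dbt_utils.generate_surrogate_key behaves. The placeholder string, the function name, and the sample values are hypothetical and only serve the illustration.

```python
import hashlib
from typing import Optional

# Hypothetical stand-in for the placeholder a surrogate-key macro substitutes for null.
NULL_PLACEHOLDER = "_null_"


def surrogate_key(*fields: Optional[str]) -> str:
    """Hash the fields after replacing None with a placeholder and joining with '-'."""
    parts = [NULL_PLACEHOLDER if f is None else str(f) for f in fields]
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()


# Columns: date_id, charge_point_id, port_id, type (illustrative values only).
row_a = ("20240101", "CP-001", None, "fault")  # port unknown
row_b = ("20240101", "CP-001", None, "fault")  # a second event, port also unknown
row_c = ("20240101", "CP-001", "1", "fault")   # port known

key_a = surrogate_key(*row_a)
key_b = surrogate_key(*row_b)
key_c = surrogate_key(*row_c)

print(key_a == key_b)  # True: both null-port rows get the same downtime_id
print(key_a == key_c)  # False: a known port yields a different key
```

Under these assumptions the two null-port rows cannot be told apart by downtime_id, even if they came from different physical ports.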
Acceptance criteria
port_id being nullable is reviewed and documented: is a charge-point-level (non-port-specific) downtime event a real, distinct concept, or is null a data quality gap?
If a null port_id is a legitimate business event: introduce a sentinel value (e.g. 'UNKNOWN' or 'N/A') to replace null in the surrogate key generation, document the sentinel in the column description, and update the dbt_utils.generate_surrogate_key call accordingly
If a null port_id is a data quality gap: add a not_null test to port_id and a data investigation note explaining what upstream fix is required
The unique test on downtime_id is verified to catch duplicates correctly after the chosen fix is applied
dbt build --select fact_downtime_daily passes with no test failures
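As a rough illustration of the checks these criteria ask for, the sketch below applies the sentinel option and then looks for duplicate keys and null ports. It is not the dbt implementation: the hashing scheme, the function names, and the sample rows are assumptions, and 'UNKNOWN' is just one of the sentinel values suggested above.

```python
import hashlib
from collections import Counter
from typing import List, Optional, Tuple

# Sentinel taken from the suggestions above; the final value is still to be decided.
UNKNOWN_PORT = "UNKNOWN"

Row = Tuple[str, str, Optional[str], str]  # date_id, charge_point_id, port_id, type


def downtime_id(date_id: str, charge_point_id: str, port_id: Optional[str], type_: str) -> str:
    """Build the key with an explicit sentinel instead of a null port_id."""
    port = UNKNOWN_PORT if port_id is None else port_id
    raw = "-".join([date_id, charge_point_id, port, type_])
    return hashlib.md5(raw.encode("utf-8")).hexdigest()


def duplicate_keys(rows: List[Row]) -> List[str]:
    """Return every downtime_id that occurs more than once (what a unique test would flag)."""
    counts = Counter(downtime_id(*r) for r in rows)
    return [key for key, n in counts.items() if n > 1]


rows: List[Row] = [
    ("20240101", "CP-001", "1", "fault"),
    ("20240101", "CP-001", None, "fault"),
    ("20240101", "CP-001", None, "fault"),  # duplicate at the sentinel grain
]

dupes = duplicate_keys(rows)
null_ports = [r for r in rows if r[2] is None]  # what a not_null test would flag

print(len(dupes))       # 1
print(len(null_ports))  # 2
```

With the sentinel, the grain reads "one row per date + charge point + port (or UNKNOWN) + type", so a duplicate at that grain is an explicit, documented failure rather than an accident of null handling. The sentinel only works if no real port_id can take the same value.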