Repository mental model
Bronze history → Current state → Silver tables
The repo is a controlled conversion point: noisy DMS history in, clean analytics structures out.
This repository owns the transformation layer between raw DMS files and the analytics-ready Silver schema. It reads full-load plus CDC Parquet files, resolves the latest current-state row, validates data quality, isolates bad records, and publishes operational evidence.
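The resolve, validate, and isolate steps described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repository's actual code: the field names `id`, `op`, `commit_ts`, `seq`, and `email`, the `I`/`U`/`D` operation codes, and the helper names `resolve_current_state` and `split_valid` are all hypothetical, and rows are modelled as dictionaries rather than read from Parquet.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


def resolve_current_state(rows: Iterable[Row], key: str = "id") -> List[Row]:
    """Collapse full-load plus CDC history into one current row per key."""
    latest: Dict[Any, Row] = {}
    # Replay history in commit order so later changes overwrite earlier ones.
    # (commit_ts, seq) is an assumed ordering; full-load rows use commit_ts 0.
    for row in sorted(rows, key=lambda r: (r["commit_ts"], r["seq"])):
        latest[row[key]] = row
    # A key whose most recent change is a delete has no current state.
    return [r for r in latest.values() if r["op"] != "D"]


def split_valid(
    rows: Iterable[Row], required: Tuple[str, ...] = ("id", "email")
) -> Tuple[List[Row], List[Row]]:
    """Separate rows that pass a simple required-field check from those that fail."""
    good: List[Row] = []
    bad: List[Row] = []
    for row in rows:
        if all(row.get(field) not in (None, "") for field in required):
            good.append(row)
        else:
            bad.append(row)  # isolated for inspection, not published
    return good, bad


# Hypothetical history: three full-load rows followed by three CDC changes.
history = [
    {"id": 1, "op": "I", "commit_ts": 0, "seq": 0, "email": "a@example.com"},
    {"id": 2, "op": "I", "commit_ts": 0, "seq": 1, "email": "b@example.com"},
    {"id": 3, "op": "I", "commit_ts": 0, "seq": 2, "email": "c@example.com"},
    {"id": 1, "op": "U", "commit_ts": 5, "seq": 0, "email": "a2@example.com"},
    {"id": 2, "op": "D", "commit_ts": 6, "seq": 0, "email": "b@example.com"},
    {"id": 3, "op": "U", "commit_ts": 7, "seq": 0, "email": ""},
]

current = resolve_current_state(history)
silver, quarantine = split_valid(current)
```

In this sketch, key 1 ends up in `silver` with its updated email, key 2 disappears because its last change is a delete, and key 3 lands in `quarantine` because its latest row fails the required-field check. Replaying the whole history on every call is what makes the output a snapshot rather than an incremental patch.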
Because CDC reconciliation depends on the change history, the safest approach is to build a fresh canonical snapshot on every run.