
Module 4: Canonical Modeling

Key Terms

  • Canonical Model: A unified and standardized version of source data, often referred to as the silver layer in the medallion architecture.
  • Entities: Logical tables created by grouping similar data across different source systems.
  • Attributes: Columns inside an entity, which are generated or mapped using an LLM.
  • Business Key: A unique identifier assigned within each entity and used to track its records.
  • Data Lake Load / Data Warehouse Load: Options for loading cleaned and modeled data into a data lake or a data warehouse.
  • Custom Code: Transformations or data quality rules written in code and applied before loading.
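
To make these terms concrete, here is a minimal sketch of how an entity, its attributes, and its business key relate to one another. The class names (Attribute, Entity), the source system names (crm, billing), and the column names are all hypothetical and chosen for illustration; they are not the product's internal representation.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    # Canonical column name inside the entity
    name: str
    # Hypothetical mapping: source system name -> source column feeding this attribute
    source_columns: dict[str, str]


@dataclass
class Entity:
    # Logical table that groups similar data from several source systems
    name: str
    # Attribute names that together identify a record
    business_key: list[str]
    attributes: list[Attribute] = field(default_factory=list)

    def attribute_names(self) -> list[str]:
        return [a.name for a in self.attributes]


# Illustrative entity built from two made-up source systems
customer = Entity(
    name="customer",
    business_key=["customer_id"],
    attributes=[
        Attribute("customer_id", {"crm": "cust_id", "billing": "CustomerNumber"}),
        Attribute("email", {"crm": "email_addr", "billing": "Email"}),
    ],
)
```

In this sketch the business key is just a list of attribute names, so a quick sanity check is that every key column also appears among the entity's attributes.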

Step-by-Step Overview

Step 1: Access Canonical Modeling Section

Navigate to the Data Modeling section and select Canonical Model.


Step 2: Generate Business Keys

Open the ⚙️ menu and select Generate Business Key to create the key automatically, or assign it manually.
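
This page does not describe how the automatic option derives a key. As an illustration only, one common approach is to hash the normalized values of the columns that identify a record, so that the same record arriving from different source systems gets the same key. The function name make_business_key and the sample rows below are hypothetical.

```python
import hashlib


def make_business_key(record: dict, key_columns: list[str]) -> str:
    """Build a deterministic key from the chosen columns (illustrative approach)."""
    # Normalize each value so formatting differences between sources
    # (whitespace, letter case) do not produce different keys
    parts = [str(record.get(col, "")).strip().lower() for col in key_columns]
    joined = "|".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# The same customer as it might appear in two made-up source systems
crm_row = {"customer_id": " C-1001 ", "email": "Ana@Example.com"}
billing_row = {"customer_id": "c-1001", "email": "ana@example.com"}

key_a = make_business_key(crm_row, ["customer_id"])
key_b = make_business_key(billing_row, ["customer_id"])
```

Because both rows normalize to the same value, key_a and key_b are equal. A manually assigned key would simply replace the list of key columns with the ones you choose.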


Step 3: Data Lake / Warehouse Load

Use the Data Lake Load or Data Warehouse Load option to send entities to the target storage.


Step 4: Add Custom Code or Attributes

In the Code Review section, you can:

  • Add custom attributes
  • Add transformation logic
  • Apply data quality checks
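
The three kinds of custom code listed above can be sketched in plain Python as follows. This is a minimal illustration, not the notebook code the platform generates; the function names, column names, and sample rows are assumptions made for the example.

```python
def add_full_name(row: dict) -> dict:
    # Custom attribute: derive full_name from two existing attributes
    out = dict(row)
    first = row.get("first_name", "").strip()
    last = row.get("last_name", "").strip()
    out["full_name"] = f"{first} {last}".strip()
    return out


def normalize_email(row: dict) -> dict:
    # Transformation logic: trim whitespace and lowercase the email
    out = dict(row)
    email = row.get("email")
    out["email"] = email.strip().lower() if isinstance(email, str) else None
    return out


def check_quality(row: dict) -> list[str]:
    # Data quality check: return a list of problems (empty means the row passes)
    problems = []
    if not row.get("customer_id"):
        problems.append("customer_id is missing")
    email = row.get("email")
    if email is None or "@" not in email:
        problems.append("email is missing or malformed")
    return problems


# Made-up sample rows: one clean, one failing both checks
rows = [
    {"customer_id": "C-1001", "first_name": "Ana", "last_name": "Silva",
     "email": " Ana@Example.com "},
    {"customer_id": "", "first_name": "Bo", "last_name": "Lee",
     "email": "not-an-email"},
]

transformed = [normalize_email(add_full_name(r)) for r in rows]
valid = [r for r in transformed if not check_quality(r)]
rejected = [(r, check_quality(r)) for r in transformed if check_quality(r)]
```

Keeping each rule in its own small function makes it easier to review the changes one at a time before loading, and rejected rows carry the reasons they failed.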


Step 5: Commit and Execute

Once your updates are complete, click Commit Changes to push them to Git, then execute the notebook.