Module 5: Entity Management
Key Terms
- Enriched Entities - An enriched entity is a composite data structure that combines and enhances data from multiple canonical entities. It incorporates additional business logic, transformations, and relationships to provide a more comprehensive and refined dataset for analysis and reporting.
- Data Flow - A data flow is a structured representation of the movement and transformation of data within a system. It shows how data travels from source to destination, including any intermediate processing steps, transformations, and storage points.
- Parent-Child Entities - Parent-child entities represent a hierarchical relationship where the parent entity contains one or more child entities.
  - Parent Entity: The main entity that holds general information.
    - Example: In a company database, the Department entity can be a parent.
  - Child Entity: A subordinate entity that holds more specific information related to the parent entity.
    - Example: Within the Department entity, Employee entities can be children, representing employees within that department.
This structure helps organize data logically and manage relationships effectively.
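As a rough illustration of the parent-child structure described above, the sketch below models the Department and Employee example with Python dataclasses. The class names, field names, and sample values are hypothetical and are not taken from the product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Employee:
    """Child entity: specific information tied to one department."""
    employee_id: int
    name: str


@dataclass
class Department:
    """Parent entity: general information plus its child entities."""
    department_id: int
    name: str
    employees: List[Employee] = field(default_factory=list)

    def add_employee(self, employee: Employee) -> None:
        # Attach a child entity to this parent.
        self.employees.append(employee)


# Hypothetical sample data for illustration only.
engineering = Department(department_id=1, name="Engineering")
engineering.add_employee(Employee(employee_id=101, name="Ada"))
engineering.add_employee(Employee(employee_id=102, name="Grace"))

print([e.name for e in engineering.employees])
```

Keeping the children inside the parent object mirrors the "parent contains one or more children" wording; a relational model would instead store a department key on each employee.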
Pre-requisites
Before proceeding with entity reconciliation, ensure that you have completed the following steps:
- Generate Mappings to create the necessary entities and attributes. Refer to the section for detailed instructions.
- Perform Canonical Data Modeling to structure your data. Refer to the section for guidance.
- Implement necessary Data Quality improvements to ensure data integrity and accuracy.
Entity Management
Entity management involves reconciling entities and creating relationships to ensure data consistency, integrity, and accurate representation of real-world interactions across systems.
Entity Management Flow
- Generate parent-child entities.
- Discover Relationships between entities and their attributes (Add or Edit relationships).
- Create new entities from system entities.
Entity Reconciliation
Entity Reconciliation involves generating groups of parent-child entities. Follow these steps to manage your entities:
- Generate Relationships:
  - Click "Generate Relationships" to create core entities, which form parent groups (blue box), and sub-entities, which are child entities of a parent group (grey box).
- Modify Child Entities:
  - Drag and Drop: Drag and drop child entities between groups.
  - Dropdown Menu: Add entities from the dropdown menu.
  - Dropdown Box: Enter entities from the dropdown box.
- Modify Parent Entity Names:
  - Click on the blue box three times to rename the parent entity.
- Add New Parent-Child Entities:
  - Use the "Add Parent & Child Entities" button to create new groups.
- Delete Reconciliation:
  - Click "Delete Reconciliation" to reset all parent-child entities and restart the generation process.
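To make the reconciliation operations above more concrete, here is a minimal sketch that treats the groups as a plain dictionary mapping each parent entity to its child entities. This is only a mental model: the entity names and the helper functions `move_child` and `rename_parent` are hypothetical, and nothing here reflects how the tool stores groups internally.

```python
# Hypothetical in-memory model of reconciliation groups:
# parent (core entity) name -> list of child (sub-entity) names.
groups = {
    "Customer": ["CustomerContact", "ShippingAddress"],
    "Order": ["OrderLine"],
}


def move_child(groups, child, source, target):
    """Move a child entity from one parent group to another (like drag and drop)."""
    groups[source].remove(child)
    groups.setdefault(target, []).append(child)


def rename_parent(groups, old_name, new_name):
    """Rename a parent group while keeping its children."""
    groups[new_name] = groups.pop(old_name)


move_child(groups, "ShippingAddress", "Customer", "Order")
rename_parent(groups, "Order", "SalesOrder")

print(groups)
```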
Generating and Storing Data Flow
To generate a data flow for creating enriched entities, follow these steps:
- Generate Data Flow:
  - Click the "Generate Data Flow" button.
  - This action generates the logic and relationships for the enriched entities.
- Store to Database:
  - After the data flow is generated, review the logic.
  - Click the "Save to Database" button to store the generated data flow to the database.
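Conceptually, a data flow is an ordered series of transformations applied between a source and a destination. The sketch below shows that idea in plain Python; the `Step` class, the `run_flow` function, the step names, and the sample records are all assumptions made for illustration and do not represent the logic the tool generates.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Step:
    """One named transformation in a data flow (hypothetical structure)."""
    name: str
    transform: Callable[[List[Record]], List[Record]]


def run_flow(records: List[Record], steps: List[Step]) -> List[Record]:
    """Apply each step in order, passing the output of one step to the next."""
    for step in steps:
        records = step.transform(records)
    return records


# Hypothetical source data.
source = [
    {"name": " ada ", "dept": "eng"},
    {"name": "grace", "dept": None},
]

steps = [
    # Drop rows that have no department.
    Step("drop_missing_dept", lambda rows: [r for r in rows if r["dept"] is not None]),
    # Trim whitespace and capitalise names.
    Step("clean_name", lambda rows: [{**r, "name": r["name"].strip().title()} for r in rows]),
]

result = run_flow(source, steps)
print(result)
```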
Discover Relationships
In this phase, you identify and create relationships between entities and determine the cardinality of each relationship, such as one-to-one or one-to-many.
- Select Core Entity:
  - Choose a core entity from the dropdown menu.
- Generate Relationships:
  - Click the "Generate Relationships" button to allow the LLM to identify potential relationships between entities.
- Review Relationships:
  - The identified relationships will be listed with details including left entity, left attribute, right entity, right attribute, and cardinality.
- Edit and Fine-Tune:
  - Manually adjust the relationships if needed.
  - Use the "New Entity Relationship" button to add new relationships.
- Check Cardinality:
  - Determine and verify the cardinality (e.g., one-to-one, one-to-many) for each relationship.
- Show Data Model:
  - Click the "Show Data Model" button to view the relationships between entities visually.
- Add New Entity Relationships:
  - Click the "New Entity Relationships" button to include additional entities as needed.
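When verifying cardinality by hand, it can help to check the key values on each side of a relationship. The sketch below records a relationship with the five fields listed above and infers its cardinality from sample key values. The `Relationship` class, the `infer_cardinality` function, and the sample data are hypothetical; this is not the method the LLM uses.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """The five details listed for each identified relationship."""
    left_entity: str
    left_attribute: str
    right_entity: str
    right_attribute: str
    cardinality: str


def infer_cardinality(left_keys, right_keys):
    """Classify a relationship from the join-key values on each side."""
    # A side counts as "one" when no key value repeats on that side.
    left_unique = max(Counter(left_keys).values(), default=0) <= 1
    right_unique = max(Counter(right_keys).values(), default=0) <= 1
    if left_unique and right_unique:
        return "one-to-one"
    if left_unique:
        return "one-to-many"
    if right_unique:
        return "many-to-one"
    return "many-to-many"


# Hypothetical key values: each department appears once on the left,
# while several employees can share the same department on the right.
department_ids = [1, 2, 3]
employee_department_ids = [1, 1, 2, 3, 3]

relationship = Relationship(
    left_entity="Department",
    left_attribute="department_id",
    right_entity="Employee",
    right_attribute="department_id",
    cardinality=infer_cardinality(department_ids, employee_department_ids),
)

print(relationship.cardinality)
```

A check like this only reflects the sample it is given, so a relationship that looks one-to-one in a small extract may still be one-to-many in the full dataset.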
Enriched Data Models
In this final step, users can edit and create relationships between sub-entities to form new core entities that may not have been generated by the LLM.
Enriched Data Entities
In this section, you can create new enriched entities from canonical entities. Follow these steps:
- Provide Entity Details:
  - Enter an entity name and description.
  - Use the @ symbol in other fields to see the available options.
- Generate Code:
  - Click on the "Generate Code" button at the bottom of the screen to generate the new entity.
- Execute Code:
  - Click on the "Execute" button in the code block to create the new entity and store it in the data lake.
- Data Preview and Storage:
  - You can preview the data using the "Data Preview" button.
  - Store the data into a data warehouse by clicking on the "Data Warehouse Load" button.
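For readers who want a feel for what an enriched entity contains, the following sketch combines two canonical entities and adds one derived field. It is a hand-written illustration under assumed entity and field names (`departments`, `employees`, `department_headcount`), not the code produced by the "Generate Code" button.

```python
from collections import Counter

# Hypothetical canonical entities.
departments = [
    {"department_id": 1, "department_name": "Engineering"},
    {"department_id": 2, "department_name": "Finance"},
]
employees = [
    {"employee_id": 101, "name": "Ada", "department_id": 1},
    {"employee_id": 102, "name": "Grace", "department_id": 1},
    {"employee_id": 103, "name": "Alan", "department_id": 2},
]


def build_enriched_employees(employees, departments):
    """Join employees to departments and add a derived headcount field."""
    dept_by_id = {d["department_id"]: d for d in departments}
    headcount = Counter(e["department_id"] for e in employees)
    enriched = []
    for e in employees:
        dept = dept_by_id.get(e["department_id"])
        enriched.append({
            "employee_id": e["employee_id"],
            "name": e["name"],
            # Attribute pulled in from the related canonical entity.
            "department_name": dept["department_name"] if dept else None,
            # Derived field: simple business logic on top of the join.
            "department_headcount": headcount[e["department_id"]],
        })
    return enriched


enriched_employees = build_enriched_employees(employees, departments)
print(enriched_employees[0])
```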