Data lineage
Data lineage tracks how data flows through the transformation process, from sources to models. It provides visibility into the dependencies between datasets and the transformations applied to them, making it easier to understand how assets relate to one another and what a change will affect downstream.
In Recurve, data lineage is represented as a directed acyclic graph (DAG). Whenever you edit an asset and reference other assets (through the ref() and source() functions), Recurve automatically tracks its relationships with other assets and reflects them in the DAG view.
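To make the idea concrete, here is a minimal sketch of how references could be turned into lineage edges. It is not Recurve's implementation. It assumes a dbt-style call syntax such as ref('name') and source('source_name', 'table_name'), which this page does not specify, and the asset names, SQL snippets, and helper functions are all hypothetical.

```python
import re
from collections import defaultdict

# Assumed call syntax (dbt-style); Recurve's actual syntax may differ.
REF_PATTERN = re.compile(r"ref\(\s*['\"]([^'\"]+)['\"]\s*\)")
SOURCE_PATTERN = re.compile(
    r"source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)"
)


def extract_upstream(sql: str) -> set[str]:
    """Return the names of all assets referenced by one asset's SQL."""
    refs = set(REF_PATTERN.findall(sql))
    # Sources are labeled "<source_name>.<table_name>" for illustration only.
    sources = {f"{src}.{table}" for src, table in SOURCE_PATTERN.findall(sql)}
    return refs | sources


def build_lineage(assets: dict[str, str]) -> dict[str, set[str]]:
    """Map each referenced asset to the set of assets that depend on it."""
    downstream = defaultdict(set)
    for name, sql in assets.items():
        for parent in extract_upstream(sql):
            downstream[parent].add(name)
    return dict(downstream)


# Hypothetical project with two staging models and one downstream model.
ASSETS = {
    "stg_orders": "select * from {{ source('shop', 'orders') }}",
    "stg_customers": "select * from {{ source('shop', 'customers') }}",
    "customer_orders": (
        "select o.order_id, c.customer_name "
        "from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c using (customer_id)"
    ),
}

LINEAGE = build_lineage(ASSETS)
```

Each key in LINEAGE is an upstream node and each value is the set of nodes that reference it, which is the edge list a DAG view would draw.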
To view data lineage in Recurve, open an asset in the editor and toggle on the Lineage view option.
Each asset in the data lineage is represented as a node with links showing the relationships with other assets (nodes). You can click and drag the nodes to reposition them for a clearer view.
The bottom-left toolbar allows you to expand the lineage section, zoom in and out, and focus on the current node of the opened asset.
By default, the selected asset is the centric node of the view. In a large project with a complex transformation strategy, a single asset can have many links. Focusing on one node at a time lets you inspect the asset's details and its relationships with upstream and downstream models, which helps you refine the overall transformation plan.
You can use the search bar in the top-left corner to search for assets and to adjust the number of upstream and downstream layers displayed around the current node.
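Limiting the view to a number of layers amounts to a depth-bounded traversal from the centric node. The sketch below illustrates that idea with a breadth-first search. It is an assumption about how such a filter could work, not a description of Recurve's internals, and the graph, node names, and function names are hypothetical.

```python
from collections import deque


def nodes_within(edges: dict[str, set[str]], start: str, max_layers: int) -> dict[str, int]:
    """Return {node: layer} for nodes reachable from start in at most max_layers hops."""
    layers = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if layers[node] == max_layers:
            continue  # Do not expand beyond the requested number of layers.
        for neighbor in edges.get(node, ()):
            if neighbor not in layers:
                layers[neighbor] = layers[node] + 1
                queue.append(neighbor)
    del layers[start]  # The centric node itself is not part of either side.
    return layers


def focused_lineage(
    downstream: dict[str, set[str]],
    node: str,
    upstream_layers: int,
    downstream_layers: int,
) -> dict[str, dict[str, int]]:
    """Collect the upstream and downstream neighborhood of one node."""
    # Invert the downstream edges to walk in the upstream direction.
    upstream: dict[str, set[str]] = {}
    for parent, children in downstream.items():
        for child in children:
            upstream.setdefault(child, set()).add(parent)
    return {
        "upstream": nodes_within(upstream, node, upstream_layers),
        "downstream": nodes_within(downstream, node, downstream_layers),
    }


# Hypothetical lineage: shop.orders -> stg_orders -> customer_orders -> revenue_report
DOWNSTREAM = {
    "shop.orders": {"stg_orders"},
    "stg_orders": {"customer_orders"},
    "customer_orders": {"revenue_report"},
}

VIEW = focused_lineage(DOWNSTREAM, "customer_orders", upstream_layers=1, downstream_layers=1)
```

With one layer in each direction, VIEW contains only stg_orders upstream and revenue_report downstream. Raising upstream_layers to 2 would also include shop.orders.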
To view the data lineage of all assets in the project, you can toggle on the Show all lineages option.