Overview
An introduction to the data modeling module of Recurve.
Transforming data is one of the crucial steps in building an effective data pipeline. At this stage, raw data, loaded from a centralized database or data warehouse, is converted into a desired structure and format, enabling analysts to extract valuable insights that serve business needs.
Modern data transformation tools like dbt (Data Build Tool) make this step more efficient by applying software engineering principles to analytics. Such principles include modular SQL scripting, version control, validation, and integration with orchestration platforms. This approach makes data transformation more reliable and scalable.
Recurve adopts best practices from the open-source dbt library and elevates them with its own design elements. This provides a powerful, integrated data transformation workspace, all presented within the Data modeling module.
Components
If you have used dbt before, some concepts in Recurve's data modeling module may be familiar. Recurve leverages several data transformation techniques and artifacts from open-source dbt, and enhances them with an intuitive asset management system.
Take a look at these guides to get started with data modeling in Recurve.
Sources: Sources are references to raw tables from a database or to models defined in other projects. These sources act as the input for data transformation.
Models: Queries that process data, apply transformations, and output structured datasets. Recurve currently supports models written in SQL; the first sketch after this list shows how a model reads from a source.
Jinja templating: Jinja is a templating language that originated in the Python ecosystem. With Jinja, you can enhance SQL transformations with programming features such as loops, variables, and functions (macros), as illustrated below.
Data lineage: Data lineage provides a visual representation of data flows across transformations, from sources to destinations. This feature helps you understand how a change to one transformation can impact downstream outputs.
Data tests: Data tests are assertions that you make about the models and other resources in a project. These tests validate the correctness of the transformed data, ensuring that standards for integrity and quality are met before the data is delivered to downstream analytics; see the test example at the end of this list.
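To make sources and models concrete, here is a minimal sketch in dbt-style syntax, which Recurve's data modeling module builds on; the exact function names and project layout in Recurve may differ, and the `raw_shop.orders` source and `stg_orders` model are hypothetical names used only for illustration.

```sql
-- stg_orders.sql (hypothetical model)
-- Reads the raw table registered as the source "raw_shop.orders"
-- and outputs a cleaned, structured dataset.
select
    order_id,
    customer_id,
    cast(order_date as date) as order_date,
    lower(status)            as status
from {{ source('raw_shop', 'orders') }}
where order_id is not null
```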
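Jinja constructs such as variables, loops, and macros can be embedded directly in a model's SQL. The following sketch, again in dbt-style syntax with hypothetical model and column names, uses a Jinja list and loop to pivot payment amounts by method without repeating the same expression by hand.

```sql
-- Hypothetical example: pivot payment amounts by method using a Jinja loop.
{% set payment_methods = ['credit_card', 'bank_transfer', 'gift_card'] %}

select
    order_id,
    {% for method in payment_methods %}
    sum(case when payment_method = '{{ method }}' then amount else 0 end)
        as {{ method }}_amount{% if not loop.last %},{% endif %}
    {% endfor %}
from {{ ref('stg_payments') }}
group by order_id
```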
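A data test can be expressed as an assertion over a model's output. In dbt-style projects, one common form is a standalone SQL query that returns the rows violating an expectation; the test fails if any rows come back. The model and column names below are hypothetical, and Recurve's test configuration may differ.

```sql
-- assert_no_negative_amounts.sql (hypothetical test)
-- Fails if any order has a negative total amount.
select
    order_id,
    total_amount
from {{ ref('stg_orders') }}
where total_amount < 0
```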