Models
A model is a query that processes data from sources, applies transformations, and outputs structured datasets.
Recurve currently supports writing models in SQL.
As an example, the following snippet showcases the basic syntax of SQL models:
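A minimal sketch of such a model, assuming a hypothetical source named `shop` with an `orders` table and hypothetical column names. The two-argument form of `source()` shown here is also an assumption, not confirmed Recurve syntax:

```sql
-- Hypothetical model: customer_order_totals
-- 'shop', 'orders', and all column names are illustrative assumptions.
with orders as (
    -- Jinja expression pointing at a data source (argument form assumed)
    select * from {{ source('shop', 'orders') }}
)

-- The model ends with a single SELECT that defines its output
select
    customer_id,
    count(*) as order_count,
    sum(amount) as total_amount
from orders
group by customer_id
```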
At their core, models are regular queries against data sources.
Each model's transformation ends with a single SELECT statement.
Jinja is integrated to make references and enhance logic. See: Jinja templating.
When Recurve runs your SQL models, it first compiles them into executable SQL queries. This means parsing all Jinja expressions, resolving model and source dependencies based on ref() and source() calls, and combining the result with configurations (such as materialization types) to produce the final SQL.
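To make the compilation step concrete, the sketch below shows a model as written and one possible compiled form. The model name `daily_order_counts`, the column names, and the schema name `analytics` are hypothetical. The actual compiled output depends on your connection and on the model's configuration:

```sql
-- Model as written (hypothetical model: daily_order_counts)
select
    order_date,
    count(*) as order_count
from {{ ref('stg_orders') }}
group by order_date

-- One possible compiled form: the ref() call is replaced by a concrete
-- relation name. The schema `analytics` is an assumption for illustration.
select
    order_date,
    count(*) as order_count
from analytics.stg_orders
group by order_date
```

Depending on the materialization type, the final SQL may also wrap this query in a statement that creates a table or view. The exact wrapping is not shown here.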
Implementing data transformation logic as models provides several benefits:
Simplicity: Complex transformations can be broken down into smaller, manageable steps.
Modularity and reusability: Some transformations can be extracted into foundational models that are referenced in multiple places. This allows you to build transformations incrementally rather than starting from scratch.
Data lineage and transparency: Links between models are automatically tracked and presented in data lineage. This helps you understand the dependencies between models and makes debugging easier. See: Data lineage.
Testing and validation: You can write tests and apply them to each model to ensure data quality. A model is run together with its tests to ensure data issues are caught early. See: Data tests (Coming soon).
To create a SQL model, follow these steps:
In the Models tab, click on the + icon and select New SQL model.
Provide a name for your model and click Create.
The created model will be organized in the models folder.
In the model editor, input your SQL query.
For example:
Click Save to confirm the changes.
Click Preview to view the query output in the Result tab.
To inspect the compiled code of your model, click on the Compiled code tab.
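As an illustration of the kind of query entered in the model editor, here is a minimal sketch of a staging model. The model name `stg_orders` is borrowed from the reference example in this page. The source name `shop`, the column names, and the two-argument form of `source()` are assumptions:

```sql
-- Hypothetical staging model: stg_orders
-- Selects and lightly cleans raw order records from an assumed source.
select
    order_id,
    customer_id,
    cast(order_date as date) as order_date,
    amount
from {{ source('shop', 'orders') }}
where amount is not null
```

Previewing a query like this would list the cleaned order rows in the Result tab, and the Compiled code tab would show the same query with the Jinja expression resolved.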
Referencing one model in another allows you to build on existing transformations without rewriting them. This modular approach streamlines your transformations, making the logic more maintainable and readable.
Within a model, you can reference another model using the ref() function. This function establishes dependencies between models, ensuring that the models are built in the correct order.
Here's an example:
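A minimal sketch of a model that builds on another one through `ref()`. The column names `order_date` and `amount` are hypothetical and assume that `stg_orders` exposes them:

```sql
-- Hypothetical model: mart_daily_revenue
-- Builds on the stg_orders model instead of repeating its logic.
select
    order_date,
    sum(amount) as daily_revenue
from {{ ref('stg_orders') }}  -- declares stg_orders as a dependency
group by order_date
```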
In mart_daily_revenue, the ref('stg_orders') function references stg_orders as a dependency. This reference lets one model use the results of another and ensures that the two are built in the correct order.
Because data transformation can follow different strategies and go through several stages, you can organize models into folders that reflect the transformation stages and help maintain a clean project structure.
To create a new folder, click on the action button of an existing folder and select Add sub-folder.
To move a model into the new folder, click on the action button of the model and select Move. Then select the target folder from the list.