Models

A model is a query that processes data from sources, applies transformations, and outputs structured datasets.

Models are the fundamental building blocks in the data modeling module, with each model representing a discrete unit of transformation. This approach shifts development from monolithic SQL scripts toward a more modular, maintainable structure that supports incremental development.

Recurve currently supports writing models in SQL.

The snippet below showcases the basic syntax of a SQL model. Three points to note:

At their core, models are regular queries against data sources.
Each model expresses its transformation as a single SELECT statement.
Jinja is integrated to reference sources and other models and to add logic. See: Jinja templating.

-- Select the total number of orders per customer
select
    customers.customer_id,
    customers.first_name,
    customers.last_name,
    count(orders.order_id) as total_orders
from
    {{ source('jaffle_shop', 'raw_customers') }} as customers
join
    {{ source('jaffle_shop', 'raw_orders') }} as orders
on
    customers.customer_id = orders.customer_id
group by
    customers.customer_id,
    customers.first_name,
    customers.last_name
order by
    total_orders desc

When Recurve runs your SQL models, it first compiles them into executable SQL queries. Compilation parses all Jinja expressions, resolves model and source dependencies based on ref() and source() calls, and combines the result with configurations (such as materialization types) to produce the final SQL.
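
As a rough illustration of the compilation step, the first example on this page might compile to something like the query below. The schema-qualified table names are an assumption made for illustration; the actual relation names depend on how the jaffle_shop source is configured, and any materialization wrapper is omitted. The real output of your model can be inspected in the Compiled code tab.

```sql
-- Hypothetical compiled output: each source() call has been replaced
-- by a concrete table reference (names assumed, not taken from Recurve).
select
    customers.customer_id,
    customers.first_name,
    customers.last_name,
    count(orders.order_id) as total_orders
from
    jaffle_shop.raw_customers as customers
join
    jaffle_shop.raw_orders as orders
on
    customers.customer_id = orders.customer_id
group by
    customers.customer_id,
    customers.first_name,
    customers.last_name
order by
    total_orders desc
```

The query logic is unchanged; only the Jinja expressions are resolved.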

Implementing data transformation logic as models provides several benefits:

Simplicity: Complex transformations can be broken down into smaller, manageable steps.
Modularity and reusability: Common transformations can be extracted into foundational models that are referenced in multiple places. This allows you to build transformations incrementally rather than starting from scratch.
Data lineage and transparency: Links between models are automatically tracked and presented in data lineage. This helps you understand the dependencies between models and makes debugging easier. See: Data lineage.
Testing and validation: You can write tests and apply them to each model to ensure data quality. A model is run together with its tests so that data issues are caught early. See: Data tests.

Create a SQL model

To create a SQL model, follow these steps:

In the Models tab, click the + icon and select New SQL model.
Provide a name for your model and click Create.
The new model is placed in the models folder.
In the model editor, enter your SQL query.

For example:

-- Select customers from the raw_customers table
select
    id as customer_id,
    name as customer_name
from {{ source('jaffle_shop', 'raw_customers') }}

Click Save to confirm the changes.
Click Preview to view the query output in the Result tab.

To inspect the compiled code of your model, click on the Compiled code tab.

Reference a model

Referencing one model in another allows you to build on existing transformations without rewriting them. This modular approach streamlines your transformations, making the logic more maintainable and readable.

Within a model, you can reference another model using the ref() function. This function establishes dependencies between models, ensuring that the models are built in the correct order.

Here's an example:

-- model: stg_orders
select
    id as order_id,
    user_id as customer_id,
    order_date,
    status,
    amount
from {{ source('jaffle_shop', 'orders') }}

-- model: mart_daily_revenue

select
    date_trunc('day', order_date) as date,
    count(*) as number_of_orders,
    sum(amount) as daily_revenue
from {{ ref('stg_orders') }}
where status = 'completed'
group by 1
order by 1 desc

In mart_daily_revenue, the ref('stg_orders') function declares stg_orders as a dependency. This reference allows one model to use the results of another, and also ensures that stg_orders is built before mart_daily_revenue.
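
Because a model can be referenced from more than one place, the same staging model can feed several downstream models. The sketch below is a second, hypothetical model (the name mart_customer_orders and its output columns are assumptions for illustration) that builds on stg_orders in the same way:

```sql
-- model: mart_customer_orders (hypothetical name, for illustration only)
-- Reuses the staging model instead of querying the raw source again.
select
    customer_id,
    count(*) as number_of_orders,
    sum(amount) as total_amount
from {{ ref('stg_orders') }}
where status = 'completed'
group by customer_id
```

With both models in place, a change to the column renaming in stg_orders only has to be made once, and both downstream models depend on stg_orders being built first.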

Organize models

As data transformation can follow different strategies and go through several stages, you can organize the models into folders that reflect the transformation stages and help maintain a clean project structure.

To create a new folder, click the action button of an existing folder and select Add sub-folder.

To move a model into the new folder, click the action button of the model and select Move. Then select the target folder from the list.
