3. Create data models
This section introduces the core concepts of data modeling. You'll learn how to create a simple data transformation use case with Recurve.
Assets in Recurve
Assets are the building blocks of your data workflow in Recurve. You can think of assets as the essential components you'll create and manage throughout your data transformation process. Each type of asset serves a specific purpose, working together to turn your raw data into valuable insights.
In data modeling, you mainly work with these types of assets:
Sources: references to your raw data, allowing you to connect and document your data origins.
Models: queries that process this data, applying transformations to create structured, analytics-ready datasets.
Jinja macros and variables: Jinja templating artefacts that add programming logic to SQL, allowing you to write more dynamic and maintainable transformations.
Data modeling process
The data modeling process in Recurve follows a logical flow that helps you build and validate your transformation workflow.
Define your sources: Start by defining sources that represent your raw data tables. Sources help you describe and document the origins of your data.
Create data models: With your sources defined, create data models to transform your raw data into useful analytics datasets. Each model represents a specific transformation step that builds on previous steps.
Configure materialization: Specify how the results of your models will be materialized in your data warehouse.
Add data tests: Add data tests to your models to verify that they work correctly. The tests associated with a model run automatically every time the model builds successfully.
Your development cycle should include these steps to ensure that transformations work properly before applying them to production data.
Prerequisites
The following walkthrough uses the jaffle_shop dataset (a fictional e-commerce store) provided by the dbt Community. You can follow the guide in this repository to generate the data and load it into your target database: jaffle-shop-generator.
Walkthrough
Let's go through the data modeling process of Recurve with hands-on steps.
To begin, from the Data development dashboard, open the project that you've created. By default, Recurve navigates you to the Design section, where all transformation activities happen.
Define your sources
Follow these steps:
In the Models tab, click on the + icon and select Add source.
In the opened modal:
Select the connection type.
Select the target connection. This is the project connection that you set up in 2. Create a project.
Click Next. Recurve then displays all the tables available from the target connection.
Select the desired raw tables or models.
Here we select all tables organized in the jaffle_shop schema.
Click Add source.
The selected tables will then be added to the Sources folder and grouped by the schema name.
Create data models
To demonstrate the dynamics and modularity of data models, here we're going to create three models: two staging models that standardize raw data, and one model that aggregates and consolidates their results.
For example, the customer staging model standardizes customer data by selecting relevant fields (customer_id, first_name, and last_name) from the raw customers table.
It uses the Jinja {{ source() }} function to reference the raw tables defined in the previous section.
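To make the role of {{ source() }} concrete, here is a minimal Python sketch that uses only the standard library. It is not Recurve's implementation. It assumes that source() takes a source name and a table name and resolves to a schema-qualified table, that the raw customers table has id, first_name, and last_name columns, and that an in-memory SQLite database can stand in for the warehouse. The email column and the sample row are made up for illustration.

```python
import re
import sqlite3

# Hypothetical staging query. The source name, the table name, and the raw
# `id` column are assumptions made for this illustration.
STG_CUSTOMERS_SQL = """
select
    id as customer_id,
    first_name,
    last_name
from {{ source('jaffle_shop', 'customers') }}
"""

# Matches calls of the form {{ source('schema', 'table') }}.
SOURCE_CALL = re.compile(
    r"\{\{\s*source\(\s*'(\w+)'\s*,\s*'(\w+)'\s*\)\s*\}\}"
)


def render_sources(sql: str) -> str:
    """Replace each source() call with a schema-qualified table name."""
    return SOURCE_CALL.sub(lambda m: f"{m.group(1)}.{m.group(2)}", sql)


def preview_stg_customers() -> list:
    """Run the rendered staging query against a stand-in raw table."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database so that `jaffle_shop.customers`
    # resolves the way a schema-qualified name would in a warehouse.
    conn.execute("attach database ':memory:' as jaffle_shop")
    conn.execute(
        "create table jaffle_shop.customers "
        "(id integer, first_name text, last_name text, email text)"
    )
    # Made-up sample row.
    conn.execute(
        "insert into jaffle_shop.customers "
        "values (1, 'Ada', 'Lovelace', 'ada@example.com')"
    )
    rows = conn.execute(render_sources(STG_CUSTOMERS_SQL)).fetchall()
    conn.close()
    return rows
```

Calling preview_stg_customers() returns only the three selected fields for each customer, which is roughly what Preview shows for a staging model of this shape.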
Follow these steps to create each model:
In the Models tab, click on the + icon and select New SQL model.
Provide the model name and click Create.
The new model is then placed in the Models folder.
Open the model in the editor and paste in the query.
Click Save to confirm the changes.
Click Preview to view the query output in the Result tab.
Perform the steps above to create the other two models.
Now that we've created two staging models that standardize raw data, and one model that aggregates and consolidates the results, we can view them in Data lineage to better understand their relationships.
Open a model and toggle on the Lineage view option. This will display a DAG (directed acyclic graph) showing the relationship of assets, from raw data to the final downstream model.
Data lineage is achieved through the use of source() and ref() functions, which automatically track dependencies of assets.
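The sketch below illustrates, under stated assumptions, how source() and ref() calls can be turned into a dependency graph. It is a simplified illustration and not Recurve's internal mechanism. The model names stg_customers and customer_orders, the abbreviated queries, and the call signatures are hypothetical.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical, abbreviated model queries keyed by model name.
MODELS = {
    "stg_customers": "select * from {{ source('jaffle_shop', 'customers') }}",
    "stg_orders": "select * from {{ source('jaffle_shop', 'orders') }}",
    "customer_orders": (
        "select * from {{ ref('stg_customers') }} "
        "join {{ ref('stg_orders') }} using (customer_id)"
    ),
}

REF_CALL = re.compile(r"\{\{\s*ref\(\s*'(\w+)'\s*\)\s*\}\}")
SOURCE_CALL = re.compile(
    r"\{\{\s*source\(\s*'(\w+)'\s*,\s*'(\w+)'\s*\)\s*\}\}"
)


def dependencies(sql: str) -> set:
    """Collect the upstream assets that a query references."""
    deps = set(REF_CALL.findall(sql))
    deps |= {f"{schema}.{table}" for schema, table in SOURCE_CALL.findall(sql)}
    return deps


def build_order(models: dict) -> list:
    """Return every asset in an order where upstream assets come first."""
    graph = {name: dependencies(sql) for name, sql in models.items()}
    return list(TopologicalSorter(graph).static_order())
```

Because each model declares its inputs through these calls, the graph can be derived from the queries alone. build_order(MODELS) lists both raw tables before the staging models, and the staging models before customer_orders, which mirrors the left-to-right reading of a lineage DAG.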
Configure materialization
You can configure how each model is materialized in your warehouse.
By default, all models have the table materialization.
Follow these steps:
Open a model in the editor.
In the Materialization field, select a materialization option.
Continuing with our three example models, we can materialize the staging models as views so that they reflect the latest source data and minimize storage costs. The consolidated model, on the other hand, can be materialized as a table because it is the final model and is queried more frequently.
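The difference between the two materializations can be seen in a small experiment. The following Python sketch uses an in-memory SQLite database as a stand-in for the warehouse. The table names, column names, and rows are made up, and Recurve generates its own statements, so treat this only as an illustration of view versus table behavior.

```python
import sqlite3


def compare_materializations() -> tuple:
    """Return (rows seen through the view, rows stored in the table)."""
    conn = sqlite3.connect(":memory:")
    # Stand-in raw table with one made-up row.
    conn.execute("create table raw_orders (order_id integer, customer_id integer)")
    conn.execute("insert into raw_orders values (1, 10)")

    # View: stores only the query, which runs again on every read.
    conn.execute(
        "create view stg_orders_view as "
        "select order_id, customer_id from raw_orders"
    )
    # Table: stores a copy of the query result at build time.
    conn.execute(
        "create table stg_orders_table as "
        "select order_id, customer_id from raw_orders"
    )

    # New raw data arrives after both objects were built.
    conn.execute("insert into raw_orders values (2, 11)")

    view_count = conn.execute("select count(*) from stg_orders_view").fetchone()[0]
    table_count = conn.execute("select count(*) from stg_orders_table").fetchone()[0]
    conn.close()
    return view_count, table_count
```

The view sees both rows, while the table still holds only the row that existed when it was built. A table therefore trades freshness and storage for faster reads until the model is rebuilt.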
Add data tests
Coming soon: Data tests will be available in the next release.
Data tests are simply SQL queries that return failing records, based on the condition that you set. These tests validate the correctness of the transformed data and ensure results from the model meet predefined standards.
Recurve provides a list of built-in tests that you can quickly add to your models.
To add a test to a model, follow these steps:
Open a model in the editor.
Switch to the Test cases tab.
Click +Add new.
Select a template and specify the values.
For example, with the stg_orders model, we can add the Empty Value test to verify that no null value exists in the date column.
Click Add.
The new test is added to the model's test case list and is executed every time the model runs in the console or as part of a pipeline.