BigQuery setup
Recurve's BigQuery connector enables you to seamlessly integrate your BigQuery data warehouse as a destination for your analytics database. This guide walks you through establishing a secure connection using Google Cloud service accounts, from initial setup to your first successful connection.
You'll learn how to:
Prepare a sample dataset in your Google Cloud Project (GCP).
Create and configure a service account with sufficient permissions.
Establish a connection between Recurve and BigQuery.
Prerequisites
You already join a Recurve organization.
You've already created a GCP project.
Prepare sample dataset
To demonstrate the connetion process, we'll be using the jaffle_shop
dataset — a fictional e-commerce store's data provided by the dbt community. This dataset offers a practical example of typical e-commerce data structures that mirror real-world scenarios.
You can use the jafgen CLI tool generate synthetic data for any specified year range.
The dataset includes these tables:
Customers (who place Orders)
Orders (from those Customers)
Products (the food and beverages the Orders contain)
Order Items (of those Products)
Supplies (needed for making those Products)
Stores (where the Orders are placed and fulfilled)
Tweets (Customers sometimes issue Tweets after placing an Order)
Or simply download the generated data below:
Ingest sample data
Follow these steps to ingest the data:
Go to your GCP project console.
Create a new dataset:
Expand the Actions option and select Create dataset.
Enter provide a name for the dataset in Dataset ID. You can leave the other fields as default.
Click Create dataset.
Create tables for the dataset.
Click on the Actions option of the dataset and select Create table.
Choose Upload as the creation method, select the CSV file, and name the table after the file name. For example, we upload the
raw_orders.csv
and name the tableraw_orders
.In Schema, check Auto detect to automatically generate the schema for the table.
Click Create table.
Repeat the four steps above to create and upload the other tables.
Create a service account
After loading data into the project, you need to create a Google Service Account. This account represents the identity and permissions that the connector can use to authenticate and interact with BigQuery.
Follow these steps:
Go to the GCP credentials wizard and select your project.
Enable BigQuery API:
Select BigQuery API from the dropdown.
Choose Application data for data processing type.
Click Next. This navigates to to the Create service account section.
Configure your service account:
Provide a descriptive name for your service account.
Click Create and continue.
Assign the two following roles:
BigQuery Job User.
BigQuery Data Editor.
Click Done.
After the service account is created, the console navigates you to the Credentials page, where you can generate keys to authenticate the account. Locate the Service Accounts rows and click on the account
Navigate to the Keys tab.
Click the Add key -> Create new key.
Select JSON as the Key type and click Create. This will download the JSON file that contains authentication credentials and metadata.
Connect Recurve to BigQuery
With the Google Service Account key, you now can set up a connection to BigQuery from Recurve.
Follow these steps:
Open the left sidebar.
Navigate to Connections -> Destinations.
Click + Create connection.
Select Google BigQuery.
Fill in the fields with the information from the JSON key file.
Destination name: the name to identify this specific connection in your organization.
Auth Type:
service_account
Google Project ID: the unique identifier for your GCP project
Google Auth Private Key ID: the unique identifier for your service account's private key
Google Auth Private Key: the actual private key used for authentication
Client ID: your service account's unique identifier
Client Email: the email address associated with your service account
Select Development environment and click Test connection.
Once the connection is successfully tested, click Create Destination.
The connection is now ready to be used in a project.
Last updated