Skip to content

Unity Catalog Spice OSS Integration

Spice OSS is a unified SQL query interface and portable runtime to locally materialize, accelerate, and query datasets across databases, data warehouses, and data lakes.

Unity Catalog and Databricks Unity Catalog can be used with Spice OSS as Catalog Connectors to make catalog tables available for query in Spice.

Follow the guide below to connect to open-source Unity Catalog. See the Databricks Unity Catalog quickstart if using Databricks Unity Catalog.

  • Spice OSS installed: To install Spice, see Spice OSS Installation.
  • Access to an open-source Unity Catalog server with 1+ tables.
  1. Create a new directory and initialize a Spicepod

    Terminal window
    mkdir uc_quickstart
    cd uc_quickstart
    spice init
  2. Add the Unity Catalog Connector

    Configure the spicepod.yaml with:

    catalogs:
    - from: unity_catalog:https://<unity_catalog_host>/api/2.1/unity-catalog/catalogs/<catalog_name>
    name: uc_quickstart
    params:
    # Configure the object store credentials here

    The Unity Catalog connector currently supports Delta Lake tables only and requires object store credentials.

  3. Configure Delta Lake tables object store credentials

    params:
    unity_catalog_aws_access_key_id: ${env:AWS_ACCESS_KEY_ID}
    unity_catalog_aws_secret_access_key: ${env:AWS_SECRET_ACCESS_KEY}
    unity_catalog_aws_region: <region> # E.g. us-east-1, us-west-2
    unity_catalog_aws_endpoint: <endpoint> # If using an S3-compatible service, like Minio

Set the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables to the AWS access key and secret key, respectively.

params:
unity_catalog_azure_storage_account_name: ${env:AZURE_ACCOUNT_NAME}
unity_catalog_azure_account_key: ${env:AZURE_ACCOUNT_KEY}

Set the AZURE_ACCOUNT_NAME and AZURE_ACCOUNT_KEY environment variables to the Azure storage account name and account key, respectively.

params:
unity_catalog_google_service_account: </path/to/service-account.json>
version: v1beta1
kind: Spicepod
name: uc_quickstart
catalogs:
- from: unity_catalog:https://<unity_catalog_host>/api/2.1/unity-catalog/catalogs/<catalog_name>
name: uc_quickstart
params:
# delta_lake S3 parameters
unity_catalog_aws_region: us-west-2
unity_catalog_aws_access_key_id: ${secrets:aws_access_key_id}
unity_catalog_aws_secret_access_key: ${secrets:aws_secret_access_key}
unity_catalog_aws_endpoint: s3.us-west-2.amazonaws.com

Step 5. Start the Spice runtime and show the available tables

Section titled “Step 5. Start the Spice runtime and show the available tables”

Once the spicepod.yml is configured, start the Spice runtime:

Terminal window
spice run

In a seperate terminal, run the Spice SQL REPL:

Terminal window
spice sql

In the REPL, show the available tables.

SHOW TABLES;
+---------------+--------------+---------------+------------+
| table_catalog | table_schema | table_name | table_type |
+---------------+--------------+---------------+------------+
| spice | runtime | metrics | BASE TABLE |
| spice | runtime | task_history | BASE TABLE |
| uc_quickstart | default | taxi_trips | BASE TABLE |
+---------------+--------------+---------------+------------+

In the SQL REPL execute query. For example:

-- SELECT * FROM uc_quickstart.<SCHEMA_NAME>.<TABLE_NAME> LIMIT 5;
sql> SELECT fare_amount FROM uc_quickstart.default.taxi_trips LIMIT 5;
+-------------+
| fare_amount |
+-------------+
| 11.4 |
| 13.5 |
| 11.4 |
| 27.5 |
| 18.4 |
+-------------+
Time: 1.4579897499999999 seconds. 5 rows.