Unity Catalog Spice OSS Integration
Spice OSS is a unified SQL query interface and portable runtime to locally materialize, accelerate, and query datasets across databases, data warehouses, and data lakes.
Unity Catalog and Databricks Unity Catalog can be used with Spice OSS as Catalog Connectors to make catalog tables available for query in Spice.
Follow the guide below to connect to open-source Unity Catalog. See the Databricks Unity Catalog quickstart if using Databricks Unity Catalog.
Prerequisites
Section titled “Prerequisites”- Spice OSS installed: To install Spice, see Spice OSS Installation.
- Access to an open-source Unity Catalog server with 1+ tables.
-
Create a new directory and initialize a Spicepod
Terminal window mkdir uc_quickstartcd uc_quickstartspice init -
Add the Unity Catalog Connector
Configure the
spicepod.yaml
with:catalogs:- from: unity_catalog:https://<unity_catalog_host>/api/2.1/unity-catalog/catalogs/<catalog_name>name: uc_quickstartparams:# Configure the object store credentials hereThe Unity Catalog connector currently supports Delta Lake tables only and requires object store credentials.
-
Configure Delta Lake tables object store credentials
AWS S3
Section titled “AWS S3”params:unity_catalog_aws_access_key_id: ${env:AWS_ACCESS_KEY_ID}unity_catalog_aws_secret_access_key: ${env:AWS_SECRET_ACCESS_KEY}unity_catalog_aws_region: <region> # E.g. us-east-1, us-west-2unity_catalog_aws_endpoint: <endpoint> # If using an S3-compatible service, like Minio
Set the AWS_ACCESS_KEY_ID
and AWS_SECRET_ACCESS_KEY
environment variables to the AWS access key and secret key, respectively.
Azure Storage
Section titled “Azure Storage”params: unity_catalog_azure_storage_account_name: ${env:AZURE_ACCOUNT_NAME} unity_catalog_azure_account_key: ${env:AZURE_ACCOUNT_KEY}
Set the AZURE_ACCOUNT_NAME
and AZURE_ACCOUNT_KEY
environment variables to the Azure storage account name and account key, respectively.
Google Cloud Storage
Section titled “Google Cloud Storage”params: unity_catalog_google_service_account: </path/to/service-account.json>
Example Delta Lake Spicepod
Section titled “Example Delta Lake Spicepod”version: v1beta1kind: Spicepodname: uc_quickstart
catalogs: - from: unity_catalog:https://<unity_catalog_host>/api/2.1/unity-catalog/catalogs/<catalog_name> name: uc_quickstart params: # delta_lake S3 parameters unity_catalog_aws_region: us-west-2 unity_catalog_aws_access_key_id: ${secrets:aws_access_key_id} unity_catalog_aws_secret_access_key: ${secrets:aws_secret_access_key} unity_catalog_aws_endpoint: s3.us-west-2.amazonaws.com
Step 5. Start the Spice runtime and show the available tables
Section titled “Step 5. Start the Spice runtime and show the available tables”Once the spicepod.yml
is configured, start the Spice runtime:
spice run
In a seperate terminal, run the Spice SQL REPL:
spice sql
In the REPL, show the available tables.
SHOW TABLES;+---------------+--------------+---------------+------------+| table_catalog | table_schema | table_name | table_type |+---------------+--------------+---------------+------------+| spice | runtime | metrics | BASE TABLE || spice | runtime | task_history | BASE TABLE || uc_quickstart | default | taxi_trips | BASE TABLE |+---------------+--------------+---------------+------------+
Step 6. Query a dataset
Section titled “Step 6. Query a dataset”In the SQL REPL execute query. For example:
-- SELECT * FROM uc_quickstart.<SCHEMA_NAME>.<TABLE_NAME> LIMIT 5;sql> SELECT fare_amount FROM uc_quickstart.default.taxi_trips LIMIT 5;+-------------+| fare_amount |+-------------+| 11.4 || 13.5 || 11.4 || 27.5 || 18.4 |+-------------+
Time: 1.4579897499999999 seconds. 5 rows.
More information
Section titled “More information”- Unity Catalog Connector Quickstart
- Databricks Unity Catalog Connector Quickstart
- Spice OSS Catalogs Documentation
- Spice OSS Unity Catalog Documentation
- Get help on Discord.