Unity Catalog XTable Integration
This document walks through the steps to register an Apache XTable™ (Incubating) synced Delta table in Unity Catalog.
Apache XTable provides cross-table omni-directional interoperability between Apache Hudi, Apache Iceberg, and Delta Lake.
Pre-Requisites
- Source table(s) (Hudi/Iceberg) already written to an external storage location such as S3/GCS/ADLS, or to local storage. This guide uses an S3 example.
- Follow the XTable installation guide here
- Clone the Unity Catalog repository from here and build the project by following the steps outlined here
To sync a source Hudi/Iceberg table using XTable, create a my_config.yaml file with the following configuration:
sourceFormat: HUDI|ICEBERG # choose only one
targetFormats:
  - DELTA
datasets:
  -
    tableBasePath: s3://path/to/source/data
    tableName: table_name
    partitionSpec: partitionpath:VALUE
Now, from a terminal in the cloned Apache XTable™ (Incubating) directory, run the sync process using the command below. This generates the Delta Lake metadata.
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
Note: At this point, if you check your bucket path, you will see the _delta_log directory containing the JSON log.
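The check described in the note can also be scripted. The following is a minimal sketch for a table stored on a local path; has_delta_log is a hypothetical helper name introduced here, not part of XTable. For an S3 location, listing the prefix with the AWS CLI (shown in the comment) is the rough equivalent.

```shell
# Returns 0 if the given local table path has a _delta_log directory
# containing at least one JSON file, and non-zero otherwise.
# has_delta_log is a hypothetical helper, not an XTable command.
has_delta_log() {
  table_path="$1"
  [ -d "$table_path/_delta_log" ] || return 1
  # Delta commit files are named like 00000000000000000000.json
  for f in "$table_path"/_delta_log/*.json; do
    [ -e "$f" ] && return 0
  done
  return 1
}

# For an S3 location, the AWS CLI equivalent would be:
#   aws s3 ls s3://path/to/source/data/_delta_log/
```

For example, has_delta_log /path/to/source/data && echo "Delta metadata found" prints the message only when the sync has produced at least one log file.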
Configure Server Property for using S3
The server config file is located at etc/conf/server.properties.
To enable the server to vend temporary AWS credentials for accessing S3 buckets, set the following parameters:
- s3.bucketPath.i: The S3 path of the bucket where the data is stored. Should be in the format s3://<bucket-name>.
- s3.accessKey.i: The AWS access key, an identifier of the temporary credentials.
- s3.secretKey.i: The AWS secret key used to sign API requests to AWS.
- s3.sessionToken.i: The AWS session token, used to verify that the request is coming from a trusted source.
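As an illustration, the parameters above might be set for a single bucket as follows. This is a sketch under two assumptions: that the i suffix is a zero-based index (0 for the first bucket), and that the command is run from the root of the cloned Unity Catalog repository. All values are placeholders.

```shell
# Assumption: run from the root of the cloned Unity Catalog repository,
# and `i` is a zero-based index (0 for the first bucket).
CONF_FILE="etc/conf/server.properties"
mkdir -p "$(dirname "$CONF_FILE")"  # no-op when the directory already exists

# All values below are placeholders, not real credentials.
cat >> "$CONF_FILE" <<'EOF'
s3.bucketPath.0=s3://<bucket-name>
s3.accessKey.0=<access-key>
s3.secretKey.0=<secret-key>
s3.sessionToken.0=<session-token>
EOF
```

If the file already contains these keys with empty values, editing them in place is cleaner than appending duplicates.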
Run the Unity Server
bin/start-uc-server
Register the XTable-synced table in the Unity Catalog
In a separate terminal, run the following command to register the target table in Unity Catalog.
bin/uc table create --full_name unity.default.people --columns "id INT, name STRING, age INT, city STRING, create_ts STRING" --storage_location s3://path/to/source/data
Validating the results
You can now read the table registered in Unity Catalog using the command below.
bin/uc table read --full_name unity.default.people
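As an additional check, the registered metadata can be inspected before reading rows, for example to confirm that the storage location matches the synced path. The sketch below assumes the Unity Catalog CLI provides a table get subcommand, which this guide does not otherwise use; inspect_table and UC_CLI are hypothetical names introduced for illustration.

```shell
# Shows the registered metadata for a table, then reads its rows.
# Assumption: the Unity Catalog CLI supports `table get --full_name`.
# UC_CLI is a hypothetical override for the CLI path (defaults to bin/uc).
inspect_table() {
  full_name="$1"
  uc="${UC_CLI:-bin/uc}"
  # Only attempt the read if the metadata lookup succeeds.
  "$uc" table get --full_name "$full_name" &&
    "$uc" table read --full_name "$full_name"
}
```

From the Unity Catalog repository root, with the server running, this would be invoked as inspect_table unity.default.people.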