Sharing data from Azure Databricks to Oracle Cloud Autonomous database with Delta-Share protocol


Delta share is a protocol for sharing data via REST. Oracle Cloud Autonomous Database provides capabilities to publish and consume data using this protocol. The specification is here and you can easily understand how it works just by executing things like this provided by a mockup over there to get the shared tables under an schema:

curl   "https://sharing.delta.io/delta-sharing/shares/delta_sharing/schemas/default/tables"

Or this other to get data from one shared table:

curl   "https://sharing.delta.io/delta-sharing/shares/delta_sharing/schemas/default/tables/COVID_19_NYT/query" -X POST -d {} -H "Content-Type:application/json"

Ok, let’s see how easy is to consume in Oracle Cloud databases shares provided by Azure Databricks. The other way around can also be done, will explain in future post.

SEE ALSO

Welcome to Oracle Data Sharing

Unlimited data-driven collaboration with Data Sharing of Oracle Autonomous Database

Reading Data from Oracle Autonomous Database Using Databricks and Delta Sharing Protocol

PART ONE: CONFIGURATION

In AZ DBR interface go to [Manage Account]:

Go to [Catalog]->[Metastores] and open one of the metastores that we want to share data from:

Enable checkbox [X]Enable Delta Sharing …blah blah blah…

PART TWO: CREATE OBJECTS TO BE SHARED and other stuff needed

Subscribe to get access

Read more of this content when you subscribe today or post a comment if you need support.

Go to [Catalog]->[Delta Sharing]->[Shared by me] and clic on the name of the share created (my share in this case):

Click on the blue button [Add recipient] on the right:

In the pop up [Add recipient] dialog, open listbox and click on [Create new recipient] link

Give a name to the recipient in the new dialog and click [Create and add recipient] button:

In the next dialog click [Add] button:

Copy the link and save for later (this is the link to be sent to the recipient that will consume the data)

Verify that the recipient has been added and copy the name of the recipient for later:

Open the notebook again and execute the last paragraph putting the name of the recipient created before:

part three: consume the share

Open Autonomous Database Actions UI and go to [Data Studio]->[Data Share]:

Click on [Consume Share] button:

Click on [Subscribe to Delta Share Provider] under the listbox [Subscribe to Share Provider]:

In this point paste in a browser the url link we get when creating the recipient and click on [Download Credential File] button:

Then come back to Database Actions and upload the json file downloaded in the [Share Provider JSON] selection area, put a name of the provider under [Provider Name] and click [Next] button:

Move the share on the left to the right and click [Subscribe] button:

Now, in the new window, drag & drop the shared table on the left to the right panel:

Click on the pencil close to the recycle bin icon to take a look and preview the shared data:

Finally click [Start] button to start consuming the data:

And that’s it:

Click on [Report] button to learn more about the share if you will:

DECLARE
l_TABLE_NAME DBMS_QUOTED_ID := '"SHARED_TABLE"';
l_CREDENTIAL_NAME DBMS_QUOTED_ID := '"AZUREDATABRICKS$SHARE_CRED"';
l_FILE_URI_LIST CLOB :=
q'[https://germanywestcentral.azuredatabricks.net/api/2.0/delta-sharing/metastores/6c8ae1...15e5a#MYSHARE.DEFAULT.SHARED_TABLE]';
l_COLUMN_LIST CLOB :=
q'[
"id" NUMBER(10)
,"name" VARCHAR2(4000)
,"age" NUMBER(10)]';
l_FIELD_LIST CLOB := null;
l_FORMAT CLOB :=
'{
"type" : "parquet",
"access_protocol" : "delta_sharing"
}';
BEGIN
"C##CLOUD$SERVICE"."DBMS_CLOUD"."CREATE_EXTERNAL_TABLE"
( TABLE_NAME => l_TABLE_NAME
,CREDENTIAL_NAME => l_CREDENTIAL_NAME
,FILE_URI_LIST => l_FILE_URI_LIST
,COLUMN_LIST => l_COLUMN_LIST
,FIELD_LIST => l_FIELD_LIST
,FORMAT => l_FORMAT
);
END;
/

So far, so good, shared data can be consumed as usual:

That’s all, hope it helps!! 🙂

Images created with the following code:

from openai import OpenAI
%env OPENAI_API_KEY=sk-proj-Wnoseastolayestenovale7lrelu1v
client = OpenAI()
response = client.images.generate(
model="dall-e-3",
prompt= "an azure sharing data across a wall of bricks towards and oracle database",
size="1024x1024",
quality="standard",
n=1,
)
image_url = response.data[0].url
from openai import OpenAI
%env OPENAI_API_KEY=sk-proj-Wnoseastolayestenovale7lrelu1v
client = OpenAI()
response = client.images.generate(
model="dall-e-3",
prompt= "oracle autonomous database",
size="1024x1024",
quality="standard",
n=1,
)
image_url = response.data[0].url

Related info:

{
"error_code" : "INVALID_PARAMETER_VALUE",
"message" : "Table features delta.enableDeletionVectors are found in table version: 1. For DeletionVectors, use DBR with version 14.1(14.2 for CDF and streaming) or higher, or delta-sharing-spark with version 3.1 or higher, and set option (\"responseFormat\", \"delta\") to query the table. ",
"details" : [ {
"@type" : "type.googleapis.com/google.rpc.ErrorInfo",
"reason" : "DS_UNSUPPORTED_FEATURES",
"domain" : "data-sharing.databricks.com",
"metadata" : {
"tableFeatures" : "delta.enableDeletionVectors",
"versionStr" : " version: 1.",
"optionStr" : "For DeletionVectors, use DBR with version 14.1(14.2 for CDF and streaming) or higher, or delta-sharing-spark with version 3.1 or higher, and set option (\"responseFormat\", \"delta\") to query the table. "
}
} ]
}

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.