Creating a highly available multi-region data lake with two Oracle databases and two replicated Object Storage buckets for purposes such as reporting, ML, or AI | PART ONE


Oracle Autonomous Database on OCI provides external tables, machine learning notebooks, model deployment, native JSON storage, and extended JSON capabilities, among other features.

In this post we create two databases in two different OCI regions, plus two Object Storage buckets that store JSON data coming from a streaming source such as Kafka.

STORY

The DIABOLIC MESSAGES PRODUCER sends data to an OCI Streaming topic. A configured SERVICE CONNECTOR then reads messages from the topic and stores them in an Object Storage bucket. A BUCKET REPLICATION POLICY copies the data files to the bucket in region 2. Databases 1 and 2 each have an external table and a view created as a SELECT on that table. Finally, the views are exposed as REST services.

Let’s get hands-on!

STEP 1: CREATE RESOURCES WITH CLOUD SHELL

From the OCI console, open Cloud Shell.

Execute the following commands step by step from this script (downloadable from here), and read the comments to understand what’s going on:

#!/bin/sh
# 
# create the workshop resources
#
# put your own values here
PASSWD=<robust password>
#
# this is the second region we are going to subscribe to
# we need regionname and region-key, see https://docs.oracle.com/en-us/iaas/Content/General/Concepts/regions.htm
REG2=ap-sydney-1
REG2KEY=SYD
NAME=signals # don't change this, or find & replace everywhere if you do!
# here the name of your home region
home=$(oci iam region-subscription list | jq -r '.data[0]."region-name"')
#
# create a compartment under tenancy
#
TENANCY=$(oci iam availability-domain list --all | jq -r '.data[0]."compartment-id"')
COMPARTMENT=$(oci iam compartment create -c $TENANCY --name $NAME --description $NAME | jq -r '.data.id')
sleep 5
echo "Created new compartment with ocid: "$COMPARTMENT
#
# subscribe to second region
#
echo "These are the regions already subscribed:"
oci iam region-subscription list | jq '.data[]."region-name"'
#
oci iam region-subscription create --tenancy-id $TENANCY --region-key $REG2KEY
#
# create dynamic group and iam policies needed for service connector, database resource principal and bucket replication
#
oci iam dynamic-group create --name $NAME --description $NAME --matching-rule "ALL {resource.type = 'autonomous-databases'}"
#
# note: the object storage service name must match your home region
oci iam policy create --compartment-id $TENANCY --name signals --statements "[\"allow service objectstorage-$home to manage object-family in compartment signals\",\"allow any-user to {STREAM_READ, STREAM_CONSUME} in compartment signals\",\"allow any-user to manage objects in compartment signals\",\"allow dynamic-group signals to manage buckets in tenancy\"]" --description signals

#
# object storage urls 
#
namespace=$(oci os ns get | jq -r .data)
echo "Grab these object storage urls for creating external tables later:"
echo "https://objectstorage.$home.oraclecloud.com/n/"$namespace"/b/signals1"
echo "https://objectstorage."$REG2".oraclecloud.com/n/"$namespace"/b/signals2"
#
# create databases and outputs sqldev urls
#
echo "Creating databases, grab the sqldeveloper urls for later:"
oci db autonomous-database create --display-name db1 --db-name db1 --cpu-core-count 1 --compartment-id $COMPARTMENT --admin-password $PASSWD --data-storage-size-in-tbs 1 --region $home --wait-for-state AVAILABLE | jq '.data."connection-urls"."sql-dev-web-url"'
#
oci db autonomous-database create --display-name db2 --db-name db2 --cpu-core-count 1 --compartment-id $COMPARTMENT --admin-password $PASSWD --data-storage-size-in-tbs 1 --region $REG2 --wait-for-state AVAILABLE | jq '.data."connection-urls"."sql-dev-web-url"'
#
# create streaming topic and output its ocid for later
#
echo "Creating streaming topic, grab the ocid of the streaming topic for later use in the microservice code:"
strocid=$(oci streaming admin stream create --name signals1 --partitions 1 --compartment-id $COMPARTMENT --region $home | jq -r '.data.id')
#
# generate the json config files used below to create the service connector
# (note the quotes around the stream ocid: streamId must be a JSON string)
echo '{"kind":"streaming","streamId":"'$strocid'","cursor":{"kind":"LATEST"}}' > source
echo '{"kind": "objectStorage","bucketName": "signals1","objectNamePrefix": "signalsrepl"}' >target
#
# create os buckets
#
echo "Creating storage buckets in both regions"
oci os bucket create --name signals1  --compartment-id $COMPARTMENT --region $home 
oci os bucket create --name signals2  --compartment-id $COMPARTMENT --region $REG2 
#
# create service connector hub
#
echo "Creating service connector"
oci sch service-connector create --display-name signals --compartment-id $COMPARTMENT --region $home --source file://./source --target file://./target
#
# create bucket replication policy
#
echo "Creating object storage replication policy"
oci os replication create-replication-policy --bucket-name signals1 --name signalsrepl --destination-region $REG2 --destination-bucket signals2
#
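Before creating the connector, it is worth validating the generated JSON config files: a malformed `streamId` (for example, missing quotes) makes the `oci sch service-connector create` call fail. A minimal standalone check with `jq` (the stream OCID below is a placeholder, standing in for `$strocid` from the script):

```shell
# Placeholder OCID for illustration; the script obtains the real one from $strocid
strocid=ocid1.stream.oc1.eu-frankfurt-1.exampleuniqueid
# Recreate the two config files exactly as the script writes them
echo '{"kind":"streaming","streamId":"'$strocid'","cursor":{"kind":"LATEST"}}' > source
echo '{"kind":"objectStorage","bucketName":"signals1","objectNamePrefix":"signalsrepl"}' > target
# jq -e exits non-zero if the file is not valid JSON or the field is missing
jq -e -r '.streamId' source
jq -e -r '.bucketName' target
```

If either `jq` call fails, fix the file before running the `oci sch` command.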

STEP 2: REVIEW RESOURCES

Check you have:

  • a database in region1 and another in region2
  • a streaming topic in region1
  • an object storage bucket in region1 and another in region2
  • a replication policy in object storage bucket1 (region1)
  • a service connector hub in region1
  • a policy with several statements (this resource is created at tenancy level)

STEP 3: GRAB DATABASE AND OBJECT STORAGE URLs CREATED IN STEP 1

Object storage URLs should look like this: https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/<bucket>

Database SQL Developer Web URLs should look like this: https://<id>-<dbname>.adb.<region>.oraclecloudapps.com

STEP 4: CONNECT TO DATABASE 1 WITH SQL DEVELOPER WEB

Log in as admin with the password you set in step 1.

STEP 5: EXECUTE PL/SQL CODE TO CREATE EXTERNAL TABLE AND ASSOCIATED VIEW

Mind the <region> and <namespace> placeholders, which depend on your tenancy, and the naming pattern of the files the service connector stores in Object Storage:

-- allows the database to connect to object storage via the dynamic group and policy created in step 1
EXEC DBMS_CLOUD_ADMIN.ENABLE_RESOURCE_PRINCIPAL();
/
-- create external table
BEGIN
   DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name =>'signals1',
    credential_name =>'OCI$RESOURCE_PRINCIPAL',
    file_uri_list =>'https://objectstorage.<region>.oraclecloud.com/n/<namespace>/b/signals1/o/*.data.gz',
    format => json_object('delimiter' value '~', 'compression' value 'gzip'),
    column_list => 'jsondata VARCHAR2 (32767)' );
END;
/
-- create view
create view signals1v as (select j.jsondata.timestamp, 
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.t') t,
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.d') d,
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.v') v from signals1 j)
/
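The view decodes the message value before applying JSON_VALUE because the Service Connector stores each stream message with its payload base64-encoded in the record's `value` field. What the UTL_RAW/UTL_ENCODE chain does can be reproduced with coreutils; the payload fields and the record structure below are simplified assumptions for illustration:

```shell
# Sample signal payload (hypothetical fields matching the view: t, d, v)
payload='{"t":"2023-01-01T00:00:00Z","d":"sensor-1","v":"42"}'
# The Service Connector base64-encodes each message value:
encoded=$(printf '%s' "$payload" | base64 | tr -d '\n')
# A record roughly as it would appear in a *.data.gz object (structure simplified):
printf '{"timestamp":1672531200,"value":"%s"}' "$encoded" | gzip > sample.data.gz
# The view's utl_encode.base64_decode(...) chain is equivalent to:
gzip -dc sample.data.gz | sed 's/.*"value":"//;s/"}//' | base64 -d
```

The last command prints the original JSON payload, which is exactly what JSON_VALUE then picks fields `t`, `d`, and `v` from.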

STEP 6: EXPOSE VIEW AS REST AND TEST IT

You can expose it using PL/SQL or the user interface in a few clicks:
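Once AutoREST is enabled for the view, the endpoint is published under `/ords/<schema>/<alias>/`. Assuming the view is enabled in the `admin` schema with alias `signals1v`, the URL is built like this (the host below is a placeholder; use your own database host from step 3):

```shell
# Placeholder host; substitute the SQL Developer Web host you grabbed in step 3
DBHOST=example-db1.adb.eu-frankfurt-1.oraclecloudapps.com
URL="https://$DBHOST/ords/admin/signals1v/"
echo "$URL"
# Against your real database, test the endpoint with:
# curl -s "$URL" | jq '.items | length'
```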

STEP 7: REPEAT 5 AND 6 IN DATABASE 2

Repeat steps 5 and 6 in the second database.

STEP 8: PROVISION AN OKE CLUSTER

Use the wizard, or follow a tutorial, to create an internet-facing OKE Kubernetes cluster.

STEP 9: DEPLOY THE DIABOLIC MESSAGES PRODUCER

Clone the git repo and edit osp.yaml, filling in the appropriate values for an OCI user (for the API signing key), the tenancy, the region, and so on. Take a look at this post.

Then deploy as follows.

# the config map is created from the API signing private key file in your environment
kubectl create configmap myppk --from-file=myppk=./myppk 
kubectl get configmaps myppk -o yaml
kubectl apply -f osp.yaml

STEP 10: TEST IT ALL!

Verify the messages producer is deployed and running:

Verify the streaming topic is receiving messages:
Verify you have datafile in bucket1:
Verify previous step in second region:

Verify the table and views have rows and the REST endpoints return data:
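Checking that data files land in the buckets is easy to script. The JSON below is a canned stand-in for `oci os object list --bucket-name signals1` output (the object names are hypothetical); the same jq filter works when you pipe in the real CLI output:

```shell
# Canned stand-in for: oci os object list --bucket-name signals1
listing='{"data":[{"name":"signalsrepl/0_100.data.gz"},{"name":"signalsrepl/0_200.data.gz"}]}'
# Count the objects; on the real CLI output the filter is identical
count=$(printf '%s' "$listing" | jq '.data | length')
echo "objects in bucket: $count"
```

Run the same check against bucket signals2 in region 2 to confirm the replication policy is copying the files across.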

Second part here

That’s all for the first part, hope it helps! 🙂
