Offloading Oracle Data to Object Storage and Mounting as External Table in an Autonomous Database | Part Two: AWS S3


Following up on the previous post, today we explain how to do the same thing using Amazon S3. The procedure and steps are quite similar, so let’s go and try!

Amazon-side setup

Create an S3 bucket with no public access:
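If you prefer scripting to the console, here is a minimal boto3 (AWS SDK for Python) sketch; the bucket name orcldata and the eu-central-1 region are taken from the object URL used later in this post:

import boto3

s3 = boto3.client("s3", region_name="eu-central-1")

# Create the bucket in eu-central-1 (Frankfurt)
s3.create_bucket(
    Bucket="orcldata",
    CreateBucketConfiguration={"LocationConstraint": "eu-central-1"},
)

# Block all public access; the database will authenticate with an IAM access key instead
s3.put_public_access_block(
    Bucket="orcldata",
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)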

Upload the datafile:
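The upload can be scripted too; a minimal sketch, assuming the datafile rfid201901.json sits in the current directory:

import boto3

s3 = boto3.client("s3", region_name="eu-central-1")

# Upload the local datafile as object rfid201901.json in the orcldata bucket
s3.upload_file("rfid201901.json", "orcldata", "rfid201901.json")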

Grab the object URL for later:

Bucket located in Frankfurt
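For reference, the object URL of a bucket on a regional endpoint follows the virtual-hosted-style pattern below; this is the URL we will hand to DBMS_CLOUD later:

bucket, region, key = "orcldata", "eu-central-1", "rfid201901.json"

# Virtual-hosted-style URL: https://<bucket>.s3.<region>.amazonaws.com/<key>
object_url = f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
# -> https://orcldata.s3.eu-central-1.amazonaws.com/rfid201901.json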

Create a user in IAM:
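As a scripted alternative, a sketch; the user name adb-s3-reader is just an example, use whatever fits your naming convention:

import boto3

iam = boto3.client("iam")

# Dedicated IAM user for the Autonomous Database to read from S3
iam.create_user(UserName="adb-s3-reader")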

Grant S3 access to the newly created user:
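Attaching the AWS-managed AmazonS3ReadOnlyAccess policy is enough for reading the object; a sketch reusing the example user name from above (a bucket-scoped custom policy would be the stricter choice):

import boto3

iam = boto3.client("iam")

# Read-only access to S3 is sufficient for DBMS_CLOUD external tables
iam.attach_user_policy(
    UserName="adb-s3-reader",
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess",
)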

Create an access key:
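Scripted version (again with the example user name); the secret is returned only once, so store it safely, as the key pair becomes the username/password of the DBMS_CLOUD credential below:

import boto3

iam = boto3.client("iam")

# Access key pair consumed by DBMS_CLOUD.CREATE_CREDENTIAL
resp = iam.create_access_key(UserName="adb-s3-reader")
print(resp["AccessKey"]["AccessKeyId"])      # becomes the credential username
print(resp["AccessKey"]["SecretAccessKey"])  # becomes the credential password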

Oracle Autonomous Database-side setup

Create the credential with the access key ID and secret access key created above:

BEGIN
  DBMS_CLOUD.CREATE_CREDENTIAL(
    credential_name => 's3rfidcredential',
    username        => 'AKIA...ZFAP',
    password        => 'r...J'
  );
END;
/

Create the external table with the credential name and the URL of the datafile uploaded to S3:

BEGIN
  DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name      => 's3rfid201901',
    credential_name => 's3rfidcredential',
    file_uri_list   => 'https://orcldata.s3.eu-central-1.amazonaws.com/rfid201901.json',
    format          => json_object('delimiter' value '|', 'truncatecol' value 'True'),
    column_list     => 'jsoncol varchar2(4000)'
  );
END;
/

Validate the external table:

EXECUTE DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE('s3rfid201901');

Query a few rows to see if the external table is working:

SELECT jsoncol FROM s3rfid201901 WHERE rownum < 10;

It seems we are getting rows from the external datafile in S3.

Now, let’s create a staging table from the external table:

CREATE TABLE s3rfid (
  id          RAW(16) NOT NULL,
  date_loaded TIMESTAMP WITH TIME ZONE,
  jsondata    CLOB CONSTRAINT s3_ensure_jsondata CHECK (jsondata IS JSON)
)
/

Now we populate the staging table:

INSERT INTO s3rfid (id, date_loaded, jsondata)
SELECT hextoraw(ora_hash(t.jsoncol)), sysdate, t.jsoncol
  FROM s3rfid201901 t
 WHERE rownum < 400000
/

Finally, let’s check that the S3-sourced staging table holds the same data as the azurfid table created from Azure storage in the previous post:

SELECT
      r.id id, r.jsondata.st store,
      r.jsondata.ch[0].t time,
      r.jsondata.ch[0].its[0].co,
      r.jsondata.ch[0].its[0].q
  FROM azurfid r WHERE r.jsondata.st = '40466'
MINUS
SELECT
      r.id id, r.jsondata.st store,
      r.jsondata.ch[0].t time,
      r.jsondata.ch[0].its[0].co,
      r.jsondata.ch[0].its[0].q
  FROM s3rfid r WHERE r.jsondata.st = '40466'
/
Set A MINUS set B returns no rows when A = B, so an empty result confirms both staging tables hold the same data.

At this point, between the previous post and this one, we have created two external tables (azurfid201901, s3rfid201901) and two staging tables (azurfid, s3rfid), all built from the same origin datafile. Let’s do some checks. It is interesting to note that, with the same number of I/O requests, the full scan against S3 is about 10 times quicker than the one against Azure, although this probably depends on where the object storage is located:

Comparing the two full table scans:

Full table scan of the external table in Azure storage (explain plan and details)

Full table scan of the external table in S3 (explain plan and details)

Full scans of the staging tables have, as expected, pretty much the same response time:

Full table scan of the staging table loaded from Azure
Full table scan of the staging table loaded from S3

That’s all for today. In the next post we’ll do the same using Oracle OCI Object Storage. Hope it helps! 🙂

NOTE: This work has been done using an Always Free OCI account in Frankfurt.
