Server-less serial to parallel access conversion pattern in OCI


Context

Let be:

  • S a data stream with messages sent by producers
  • OS an object storage bucket
  • SC a transceiver that reads messages from S and stores in OS as (compressed) files containing messages collected in chunks from S
  • F a function that reacts to events emitted in OS when a new file is stored in OS
  • D an autonomous database
  • XT an external table in D that reads compressed files form OS and shows data as rows and columns
  • P a pipeline that gets data form XT and puts onto internal table IT

Anatomy of the S2P pattern (the coloured parts):

Serial to parallel data access converter pattern S2P

AXIOM

S, OS, D, SC, F and P are all server-less services in OCI. Data coming in serial mode to S can be processed in random parallel access mode in D or other downstream pipelines once stored in XT and/or IT.

S2C pattern is a simple construct abstracted from the underlying infrastructure thanks to its serverless implementation.

IMPLEMENTATION

S is OCI Streaming

OS is Object Storage

D is Autonomous Database

SC is Service Connector Hub

XT implementation:

EXEC DBMS_CLOUD_ADMIN.ENABLE_RESOURCE_PRINCIPAL();
/
drop table cardmovs 
/
BEGIN
   DBMS_CLOUD.CREATE_EXTERNAL_TABLE(
    table_name =>'cardmovs',
    credential_name =>'OCI$RESOURCE_PRINCIPAL',
    file_uri_list =>'https://objectstorage.eu-frankfurt-1.oraclecloud.com/n/f......k/b/siteA/o/*.data.gz',
    format => json_object('delimiter' value '~', 'compression' value 'gzip'),
    column_list => 'jsondata VARCHAR2 (23767)' );
END;
/
# we create a view for decoding messages becaus they are b64 coded in the datafiles stored in object storage
#
drop view cardmovsv
/
create view cardmovsv as (select j.jsondata.timestamp, 
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.card') card,
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.holder') holder,
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.amount') amount,
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.prod') prod,
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.lat') lat,
JSON_VALUE(utl_raw.cast_to_varchar2(utl_encode.base64_decode(utl_raw.cast_to_raw(j.jsondata.value))), '$.lon') lon 
from cardmovs j)
/

P implementation

#
insert into it select * form xt where xt.a not in (select it.a from it)

CONCLUSION

S2P pattern can resolve serial to parallel conversion in fast data lakes in which ingestion arriives only to real-time streaming ports.

One Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.