Near Real Time Data Driven SaaS Integration with Streaming | Part 2: Extracting/Publishing Data Changes to Streaming Topics

As we mentioned in the previous post, the idea is to build a component that detects changes in the SaaS data system and publishes those changes to a stream for later consumption by other systems.

Indeed, as we see in the diagram, there is a block that runs a program executing loops of “get all” requests against a list of API endpoints configured through the administration user interface. The logic is:

  • wake up every x seconds
  • for each registered endpoint
  • fetch its parameters, such as filter conditions, last successful execution time, …
  • execute GET requests in a loop, in chunks of N records per call, until no more data is returned
  • put the data in the topic with the streaming API
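The steps above can be sketched in Python. Here `fetch_page` and `publish` are hypothetical stand-ins for the real SaaS GET call and the streaming put, and the endpoint shape is illustrative:

```python
import time

def poll_endpoints(endpoints, fetch_page, publish, chunk_size=500, interval=60, cycles=1):
    """Poll the registered endpoints in chunked GETs and publish each record."""
    for cycle in range(cycles):
        if cycle:
            time.sleep(interval)  # wake up every `interval` seconds
        for endpoint in endpoints:
            offset = 0
            while True:
                # GET in chunks of `chunk_size` records per call
                records = fetch_page(endpoint, offset, chunk_size)
                if not records:
                    break  # no more data returned for this endpoint
                for record in records:
                    publish(endpoint["topic"], record)  # put on the stream
                offset += len(records)
```

In the real publisher, `fetch_page` would also apply the stored conditions and last-successful-execution time so that only changed records come back.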

As we mentioned in that post, the SaaS REST APIs follow a common pattern, so it is easy to create a program that executes this logic, package it in a container image, and deploy it on K8s; we’ll show an example in an upcoming post.

And that’s all for today, hope it helps! 🙂

Near Real Time Data Driven SaaS Integration with Streaming | Part 1: Overview

File based approaches

File-based data exchange has covered a large percentage of the integrations between SaaS and other ERP solutions, both in the past and today. The approach is robust and allows a large number of transactions to be executed in batch without affecting the online systems, but it has a drawback: the information is not up to date.

Streaming Solutions

Streaming solutions (Kafka, JMS, …) enable loosely coupled systems and provide everything necessary to keep information updated in near real time without overloading the source and target systems.

Indeed, with streaming the source system publishes data to an intermediate flow, where it is retained long enough for the target systems to consume it according to their own load fluctuations, while the source system publishes at its own pace in the same way.

The benefits are, among others:
– Destination systems are not overloaded
– Data arrives at the destination much earlier than in batch mode
– No data is lost
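A minimal in-memory sketch of this decoupling, using a deque as a stand-in for the topic: the producer publishes everything at its own rate, and the consumer drains the buffer in batches at its own pace, with nothing lost in between. Rates and batch sizes are illustrative.

```python
from collections import deque

def run_pipeline(messages, batch_size=3):
    """Simulate a topic buffering messages between producer and consumer."""
    topic = deque()       # stand-in for the intermediate stream
    delivered = []
    # Producer: publishes at its own pace, never waits for the consumer
    for msg in messages:
        topic.append(msg)
    # Consumer: drains the buffer in batches sized to its own capacity
    while topic:
        batch = [topic.popleft() for _ in range(min(batch_size, len(topic)))]
        delivered.extend(batch)
    return delivered
```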

Solution example

Oracle OCI Streaming

OCI Streaming is a Kafka-compatible, secure, no-lock-in, pay-as-you-use, scalable, and inexpensive streaming solution that serves this purpose with very low effort and is easy to develop against and deploy.
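Because it is Kafka compatible, a standard Kafka client can connect to OCI Streaming with SASL_SSL settings. A sketch of the connection configuration (kafka-python style keyword names); region, tenancy, user, stream pool OCID, and auth token are placeholders you would take from your own tenancy:

```python
def oci_streaming_kafka_config(region, tenancy, user, stream_pool_ocid, auth_token):
    """Build Kafka-client settings for OCI Streaming's Kafka-compatible endpoint."""
    return {
        "bootstrap_servers": f"cell-1.streaming.{region}.oci.oraclecloud.com:9092",
        "security_protocol": "SASL_SSL",
        "sasl_mechanism": "PLAIN",
        # OCI Streaming expects tenancy/user/stream-pool-OCID as the SASL
        # username and an OCI auth token as the password
        "sasl_plain_username": f"{tenancy}/{user}/{stream_pool_ocid}",
        "sasl_plain_password": auth_token,
    }
```

These keyword arguments can then be passed straight to a Kafka producer or consumer constructor.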

Source systems

Source systems, such as Oracle SaaS, can be queried through their REST APIs by an intermediate publisher, which retrieves the data and puts it as messages into topics using the streaming API included in the OCI SDKs provided.
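As a sketch of how such a publisher could shape records for the stream: OCI Streaming's put-messages API expects Base64-encoded keys and values, so each SaaS record can be serialised to JSON and encoded before the SDK call (the actual send, via the OCI SDK's stream client, is left out here; the `key_field` name is a placeholder):

```python
import base64
import json

def to_message_entries(records, key_field="id"):
    """Shape records into Base64-encoded key/value entries for put-messages."""
    entries = []
    for record in records:
        key = str(record[key_field]).encode("utf-8")
        value = json.dumps(record).encode("utf-8")
        entries.append({
            "key": base64.b64encode(key).decode("ascii"),
            "value": base64.b64encode(value).decode("ascii"),
        })
    return entries
```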

Target Systems

Destination systems receive messages through an intermediate consumer that reads messages from the topics and sends them to the target using its own API.
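A sketch of that consumer side, with `read_batch` and `send_to_target` as hypothetical stand-ins for the topic read and the target system's own API call:

```python
def forward_messages(read_batch, send_to_target, max_batches=None):
    """Drain the topic batch by batch and push each message to the target."""
    forwarded = 0
    batches = 0
    while max_batches is None or batches < max_batches:
        batch = read_batch()  # e.g. a get-messages call against the topic
        if not batch:
            break  # nothing left to consume for now
        for message in batch:
            send_to_target(message)  # target system's own API
            forwarded += 1
        batches += 1
    return forwarded
```

Because the topic retains the messages, the consumer can run this loop at whatever rate the target system tolerates.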

Containerisation and “K8szation”

A good approach for this purpose is to create containerised programs and deploy them to run in a Kubernetes cluster.
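As a sketch, the publisher could run as a Kubernetes CronJob so the cluster handles the periodic wake-up; the name, schedule, image, and secret below are placeholders, not part of the original solution:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: saas-change-publisher        # placeholder name
spec:
  schedule: "*/1 * * * *"            # wake-up interval (tune to your x seconds)
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: publisher
              image: myregistry/saas-change-publisher:latest  # placeholder image
              envFrom:
                - secretRef:
                    name: streaming-credentials               # placeholder secret
          restartPolicy: OnFailure
```

A long-running loop (sleep-based, as described in Part 2) would instead fit a plain Deployment; either way the cluster takes care of restarts and scaling.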

Next Post: Near Real Time Data Driven SaaS Integration with Streaming | Part 2: Extracting/Publishing Data Changes

Hope it helps! 🙂