This is the true story of a company that runs a long-duration workload once at the end of every month: a very mature solvency risk calculation, built as a distributed computing application with Oracle Coherence and Java. The application's performance degrades progressively because of delays caused by garbage collection events, as the memory of each Java process grows over the duration of the workload. Sometimes the customer has to kill the process and start again because it takes too long and does not seem to reach an end.
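To make this kind of degradation visible, a JVM can report how much time it has spent in garbage collection. The sketch below is not part of the customer's application; the class name `GcOverheadProbe` and its methods are hypothetical, and it only uses the standard `java.lang.management` API. Keep in mind that the collection time reported by the JVM is an approximation and is not the same thing as pause time, especially with concurrent collectors.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not taken from the benchmarked application.
public class GcOverheadProbe {

    /** Approximate time (ms) spent in GC since JVM start, summed over all collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of JVM uptime attributed to GC, clamped to the range 0..1. */
    static double gcOverhead() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptime <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) totalGcTimeMillis() / uptime);
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample a few times; in a long workload this would run periodically.
        for (int i = 0; i < 3; i++) {
            System.out.printf("gc=%d ms, overhead=%.2f%%%n",
                    totalGcTimeMillis(), gcOverhead() * 100.0);
            Thread.sleep(1000);
        }
    }
}
```

If the overhead figure keeps climbing as the workload progresses, that would be consistent with the behavior described above, although it does not by itself identify the cause.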
We performed a benchmark by lifting and shifting the app to Oracle Cloud Infrastructure and testing 2 different workloads. The table below shows the best results, but thanks to Terraform and the flexibility of the cloud, we were able to execute more than 30 workloads with different topologies and compute shapes, on both Virtual Machine and Bare Metal. Let’s see what happened!
Cloud environment IaC procedure: Terraform
Time to provision and start the cluster in cloud: 10-15 minutes
Time to destroy cloud infra: 3-5 mins
Application software, configuration, operating system and data: Identical for each workload, both on-prem and in the cloud. No improvements or changes were made to the application when it was moved to the cloud. No tuning of the cloud operating systems, network and the like was done; all settings are the default values provided by Oracle Cloud.
Storage Cloud: All nodes read and write data from/to a Shared File System; software and logs are on shared disk as well, with no use of local disk at all
Storage OnPrem: Data in NAS, software in local disk
VMware environment: In operation for more than 6 years and assumed to be tuned as much as possible; previously lifted to AWS, then moved back on-prem because of its costs
Benchmark dates: October-November 2019
Datacenter: eu-frankfurt-1, AD3
IaC Workstation location: Madrid
Workload 1: With similar memory and 40% fewer cores, Oracle Cloud completes the workload in 40% less time
Workload 2: With similar memory and 40% fewer cores, Oracle Cloud completes the workload in 26% less time
Workload 2: Doubling the number of nodes reduces the duration by 27%
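The percentages above can be combined into a couple of derived figures. The following is a minimal sketch of that arithmetic only; the class `BenchmarkMath` and its method names are hypothetical. It assumes that "40% less cores" means 0.6 times the on-prem core count, and that cores multiplied by duration is a fair proxy for compute consumed, which ignores any difference between the cores on each platform.

```java
// Hypothetical helper for the arithmetic behind the results listed above.
public class BenchmarkMath {

    /** Core-time used relative to the baseline: (1 - coreReduction) * (1 - timeReduction). */
    static double relativeCoreTime(double coreReduction, double timeReduction) {
        return (1.0 - coreReduction) * (1.0 - timeReduction);
    }

    /** Speedup implied by a fractional reduction in duration: 1 / (1 - timeReduction). */
    static double speedup(double timeReduction) {
        return 1.0 / (1.0 - timeReduction);
    }

    /** Scaling efficiency: observed speedup divided by the factor of added resources. */
    static double scalingEfficiency(double timeReduction, double resourceFactor) {
        return speedup(timeReduction) / resourceFactor;
    }

    public static void main(String[] args) {
        System.out.printf("Workload 1 relative core-time: %.3f%n", relativeCoreTime(0.40, 0.40));
        System.out.printf("Workload 2 relative core-time: %.3f%n", relativeCoreTime(0.40, 0.26));
        System.out.printf("Workload 2 speedup with 2x nodes: %.3f%n", speedup(0.27));
        System.out.printf("Workload 2 scaling efficiency: %.3f%n", scalingEfficiency(0.27, 2.0));
    }
}
```

Under these assumptions, Workload 1 uses roughly 0.36 of the baseline core-time and Workload 2 roughly 0.44, while doubling the nodes gives a speedup of about 1.37, that is, a scaling efficiency of about 0.68.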
That’s all, hope it helps! 🙂