Using big data to remedy cloud wastage and unlock cloud optimization

ITOps professionals operate in a constantly evolving, highly challenging environment. They manage complex big-data estates, often running tens of thousands of applications across thousands of nodes. At that scale, unnecessary cloud wastage is common. By the same token, the opportunities for improving big-data performance and cutting spend are numerous.

At Pepperdata, we recently undertook a data investigation to uncover the scale of both cloud wastage and optimization potential. We looked at various customer clusters just prior to their implementation of our solution; together, these clusters accounted for 400 petabytes of data on 5,000 nodes, running 4.5 million applications. Our investigation culminated in a comprehensive big-data performance report that also draws on recent cloud-computing statistics and big-data trends.

Here are some of the highlights:

The growing cloud wastage

Statista reports that “in 2020, the public cloud-services market is expected to reach around 266.4 billion US dollars in size, and by 2022 market revenue is forecast to exceed 350 billion U.S. dollars.” 

This growth comes at a cost. The more that complex big-data applications migrate to the cloud, the greater the likelihood of misallocation of resources. In 2019 alone, losses attributed to cloud waste amounted to around $14 billion. Hence, as Gartner put it, “through 2024, nearly all legacy applications migrated to public cloud infrastructure as a service (IaaS) will require optimization to become more cost-effective.”

Our data bears out this story:

  • Across big-data clusters, within a typical week without optimization, the median rate of maximum memory utilization is a mere 42.3%. In other words, even at peak, a typical cluster leaves more than half of its memory unused.

  • Prior to implementing cloud optimization, the average wastage across 40 large clusters was 60% or more.

  • Often, only 5-10% of the total jobs experience major wastage, which makes optimization a needle-in-a-haystack challenge.
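To make the needle-in-a-haystack point concrete, here is a minimal Python sketch of how wastage might be computed per job and for a cluster as a whole. The job names, the figures, and the 50% threshold are all hypothetical and chosen for illustration; they are not taken from the report, and "wastage" is assumed here to mean allocated memory that is never used, even at peak.

```python
def wastage_pct(allocated_gb: float, peak_used_gb: float) -> float:
    """Percentage of allocated memory that was never used, even at peak."""
    if allocated_gb <= 0:
        raise ValueError("allocated_gb must be positive")
    return 100.0 * (allocated_gb - peak_used_gb) / allocated_gb


def cluster_wastage_pct(jobs) -> float:
    """Wastage for the whole cluster: total unused memory over total allocated."""
    total_allocated = sum(allocated for _, allocated, _ in jobs)
    total_used = sum(used for _, _, used in jobs)
    return wastage_pct(total_allocated, total_used)


def worst_offenders(jobs, threshold_pct: float = 50.0):
    """Return (name, wastage %) for jobs at or above the threshold, worst first."""
    flagged = [
        (name, wastage_pct(allocated, used))
        for name, allocated, used in jobs
        if wastage_pct(allocated, used) >= threshold_pct
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical jobs: (name, allocated memory in GB, peak memory used in GB).
jobs = [
    ("etl-nightly", 64.0, 60.0),
    ("report-build", 32.0, 30.0),
    ("ad-hoc-query", 128.0, 16.0),
    ("ml-training", 256.0, 240.0),
]

print(f"Cluster wastage: {cluster_wastage_pct(jobs):.1f}%")
for name, pct in worst_offenders(jobs):
    print(f"{name}: {pct:.1f}% of allocated memory unused")
```

In this made-up example, three of the four jobs are well tuned, yet a single oversized job accounts for most of the unused memory. A real cluster shows the same pattern across thousands of jobs, which is why finding the few wasteful ones is hard to do by hand.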

The potential for optimization

With this great wastage comes a great opportunity for optimization. As Google discovered, even low-effort cloud optimization can give businesses as much as 10% savings per service within two weeks. Meanwhile, fully optimized cloud services running for six weeks or more can save companies more than 20%.

However, optimization can be a challenge. Every application’s resource requirements are constantly changing. To achieve full optimization potential, big-data solutions need to collect information from all applications and associated infrastructure, analyze the data, and adjust allocations dynamically as resource requirements change.

This can only be done with the help of machine learning (ML).
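The collect, analyze, and adjust loop described above can be sketched in a few lines of Python. This is only an illustration of the shape of the loop: it uses a plain sliding-window peak with a fixed headroom factor rather than a machine-learning model, and the class name, window size, and headroom value are all assumptions made for the example.

```python
from collections import deque


class MemoryRightsizer:
    """Toy collect/analyze/adjust loop for one application's memory allocation.

    A simple heuristic stand-in for the analysis step, not an ML model.
    """

    def __init__(self, window: int = 5, headroom: float = 1.2):
        # Collect: keep only the most recent `window` usage samples.
        self.samples = deque(maxlen=window)
        self.headroom = headroom

    def observe(self, used_gb: float) -> None:
        """Record one memory-usage sample in GB."""
        self.samples.append(used_gb)

    def recommend(self, current_allocation_gb: float) -> float:
        """Analyze recent samples and return the allocation to adjust to."""
        if not self.samples:
            # Nothing observed yet, so leave the allocation unchanged.
            return current_allocation_gb
        # Size for the recent peak plus a safety margin.
        return max(self.samples) * self.headroom


rightsizer = MemoryRightsizer(window=3, headroom=1.25)
allocation_gb = 128.0
for used_gb in [10.0, 40.0, 20.0, 16.0, 8.0]:
    rightsizer.observe(used_gb)
    allocation_gb = rightsizer.recommend(allocation_gb)
    print(f"used {used_gb:5.1f} GB -> allocate {allocation_gb:5.1f} GB")
```

The recommendation follows the workload up and down as old samples fall out of the window. A fixed rule like this cannot anticipate changes across thousands of applications, which is the gap that a learned model is meant to fill.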

With the help of ML-driven cloud optimization, teams in our study unlocked a range of optimizations for big-data workloads:

  • Three quarters of customer clusters increased throughput and immediately won back up to 52% of their task hours.

  • 25% of users were able to save a minimum of $400,000 per year, with the most successful users saving a projected $7.9 million for the year.

The right big-data-optimization solution

As cloud migration continues, enterprises must bear in mind that costs will only grow if they don’t optimize properly. Businesses must strive to adopt a machine-learning-powered solution that can quickly pinpoint which clusters are wasting space or resources, while dynamically addressing changing resource requirements.

A solution that provides this level of observability and automated tuning is the only kind of software that can maximize application performance and save millions in the long run.