EC Cloud

An Expert’s Guide to Understanding Your Cloud Application Performance

Written by Suchit Kumar M | Feb 1, 2021 3:45:00 AM

Proponents believe the cloud solves every problem; skeptics see nothing beyond large-scale virtualization, with all its inherent limitations. Who is right, and what should anyone considering a cloud architecture actually think about?

Yes, clouds are usually, though not always, implemented as virtualized environments, but this by itself is not an argument for or against a cloud-based solution. Likewise, building a cloud for a very small system makes little sense, yet that is no reason a small application cannot take advantage of an existing cloud. Most significantly, one can provision new VMs in the cloud without making explicit assignments to hardware; everything is done with templates and handled by the cloud management layer.

  • This is a huge cost and time saver from an operational perspective, and it makes it possible to schedule load tests during off-peak hours while ensuring that production VMs keep priority.
  • This can significantly reduce hardware and operations costs at the rather large scale typical of cloud computing.

Shifting Layers of Complexity

However, the main cloud advantages of self-service, automated provisioning, and rapid elasticity come at the cost of increased application-level complexity. A new instance can have a hidden impact on the performance of applications that are already running, and that impact can only be seen by looking at the underlying shared infrastructure.

Cloud features such as automatic provisioning and workload management allow us to ignore the relationship between VMs and the hardware they are assigned to, but our applications still have to run on that hardware. And while we may be blissfully unaware of this assignment in daily operation, we need to be able to trace it when things go wrong.

Application A, for example, could affect Application B today, but tomorrow it could steal CPU time from Application C! How are we going to tell? The cloud's inherent dynamism poses significant challenges for controlling and managing application performance.

Things aren't much better for application owners, who often have no access to the hypervisor layer, as it is typically handled by a separate group; this makes proper troubleshooting and system tuning almost impossible.

Visibility is what is needed: a means to monitor applications, the VMs, and the hypervisor layer simultaneously. This is the only way we can trace cause and effect when two running applications share physical hardware. With such an integrated application monitoring solution, we can collect application-level metrics such as response time, load patterns, and resource usage, which is exactly the information needed to ensure application performance and achieve optimal utilization.
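To make this concrete, here is a minimal sketch, not taken from the article, of how the three metric types just mentioned could be collected in one place: per-transaction response time, a load pattern derived from a sliding window, and host resource usage. The class and metric names are hypothetical, and host metrics are read via the third-party psutil package.

    import time
    from collections import deque

    import psutil  # third-party: pip install psutil

    class TransactionMonitor:
        """Keeps a sliding window of transaction timings plus host metrics."""

        def __init__(self, window_seconds=60):
            self.window = window_seconds
            self.samples = deque()  # entries: (timestamp, name, duration_s)

        def record(self, name, duration):
            now = time.time()
            self.samples.append((now, name, duration))
            # Drop samples that have fallen out of the sliding window.
            while self.samples and self.samples[0][0] < now - self.window:
                self.samples.popleft()

        def snapshot(self):
            durations = [d for _, _, d in self.samples]
            avg_ms = 1000 * sum(durations) / len(durations) if durations else 0.0
            return {
                "requests_per_window": len(durations),             # load pattern
                "avg_response_ms": avg_ms,                         # response time
                "cpu_percent": psutil.cpu_percent(interval=None),  # resource usage
                "mem_percent": psutil.virtual_memory().percent,
            }

    monitor = TransactionMonitor()

    def timed(name):
        """Decorator reporting each call's duration to the shared monitor."""
        def wrap(fn):
            def inner(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return fn(*args, **kwargs)
                finally:
                    monitor.record(name, time.perf_counter() - start)
            return inner
        return wrap

Wrapping each transaction entry point with timed("...") keeps the application-level and infrastructure-level views in one snapshot, which is the kind of integrated picture described above.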

While not inherently different from monitoring anything else, monitoring cloud systems is more complicated because of the added layer of indirection. But APM also gives us the means to run applications that both perform well and remain cost-effective.

In a cloud, we can automate adaptive resource allocation based on changing application requirements. At the same time, a cloud, whether public or private, cannot make our applications run any faster; no individual transaction will execute more quickly in a cloud. The wonderful versatility we gain from cloud computing must therefore be carefully balanced against the inevitable need to pay closer attention to transactional performance.

The performance tuning process remains much the same, with the metrics just discussed added to it. Yet another level of complexity is introduced by third-party services that one has no power over. For example, most cloud vendors provide database services. Like everything else, these services affect transactional performance and need to be monitored.

Likewise, although adding VMs to an application is simpler and more flexible than changing resource allocations, we don't want to over-provision.

For example, adding a huge VM instance (huge in the sense that it claims a lot of CPU and memory resources) is overkill when we need only 5 percent more capacity. Increasing the available resources in small increments instead helps to preserve the advantages of elasticity, as the sketch below illustrates.
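As a rough sketch of this reasoning, with purely illustrative figures, growing a tier of identical small instances in proportion to the needed headroom adds only as much capacity as required:

    import math

    def instances_to_add(current_instances, extra_capacity_fraction):
        """How many identical instances give at least the requested headroom."""
        return math.ceil(current_instances * extra_capacity_fraction)

    # A tier of 20 small instances needing 5 percent more capacity grows by
    # exactly one instance; a single huge instance would add far more than needed.
    print(instances_to_add(20, 0.05))  # -> 1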

There are two logical consequences of using smaller, less powerful VM instances: the number of instances per tier tends to rise, resulting in a very large number of nodes overall, and elastic scaling may become necessary for each tier.

At the same time, clouds are designed to accommodate massive horizontal scaling, meaning we are likely to have more tiers, and more nodes, than with an equivalent data center deployment. As you might guess, this all adds to the complexity of the scaling logic, making it harder to find performance issues and optimize applications. It is a task that most monitoring tools and scaling mechanisms were simply not designed to handle!

Realizing the cost advantages of the cloud means that we need to exploit this flexible scalability, as challenging as that sounds. To achieve this while preserving optimal single-transaction performance, we need to rethink how we model and monitor our applications.

Applications deployed in the cloud must be inherently scalable, which means avoiding synchronization and state transfer across transactions. This limits sharing and involves a tradeoff between storage and caching. We need inherently scalable technology for distributed data, which may rule out the particular SQL solution we are used to.

                                  "Optimizing the critical path to remain as independent as possible for each level"

Essentially, time-critical responses should avoid involving additional tiers; in the best case, they are served from a single point. To help scaling, use queues between tiers. A queue's depth is a direct measure of the load on a tier, which makes it very easy to scale the consuming tier.
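A small sketch, with hypothetical throughput figures, shows how directly a consumer count can be derived from queue depth:

    import math

    def consumers_needed(queue_depth, per_consumer_throughput, drain_seconds,
                         min_consumers=1, max_consumers=50):
        """Consumer instances needed so the backlog drains within drain_seconds.

        queue_depth             -- messages currently waiting in the queue
        per_consumer_throughput -- messages one consumer handles per second
        drain_seconds           -- target time to work off the backlog
        """
        required = math.ceil(queue_depth / (per_consumer_throughput * drain_seconds))
        return max(min_consumers, min(max_consumers, required))

    # Example: 12,000 queued messages, 20 msg/s per consumer, drain within 60 s
    # -> ceil(12000 / 1200) = 10 consumers.
    print(consumers_needed(12_000, 20, 60))  # -> 10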


But a well-designed system needs to exploit both elasticity and scalability to achieve the full cost-effectiveness offered by cloud computing. This requires application-level monitoring that collects data on response time and resource usage, plus scaling logic that computes the load algorithmically.
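For illustration, a simple scale-out/scale-in rule could combine exactly these two data sources; the thresholds below are assumptions for the sketch, not recommendations:

    def scaling_decision(p95_response_ms, cpu_percent,
                         slo_ms=500, cpu_high=75, cpu_low=25):
        """Return +1 (add an instance), -1 (remove one), or 0 (hold)."""
        if p95_response_ms > slo_ms or cpu_percent > cpu_high:
            return +1  # response-time goal at risk or tier saturated: scale out
        if p95_response_ms < slo_ms / 2 and cpu_percent < cpu_low:
            return -1  # comfortably under target: scale in to save cost
        return 0

    print(scaling_decision(620, 80))  # -> 1 (scale out)
    print(scaling_decision(120, 15))  # -> -1 (scale in)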

This brings us to the big differences between private and public clouds:

  • cost-effectiveness
  • elastic scaling

Public and Private Clouds: Siblings with Different Performance Goals

We need to distinguish between public and private clouds before we continue our discussion of application monitoring in the cloud. The distinction has little to do with technology, and a lot to do with visibility, objectives, and ownership.

Virtualized and cloud environments inherit two conflicting goals from traditional enterprise computing: maximizing hardware utilization while maintaining application performance.

The upside of private clouds is that they are managed by the same organization that is responsible for both application performance and IT operations. We are in a position to enforce application performance management across all layers, allowing us to optimize for both objectives.

In the public cloud, the underlying virtualization is opaque; we cannot control or automate it according to our wishes. We have to take a black-box approach and optimize with a single goal: meeting response-time requirements while retaining scalability. At the same time, the constraint of finite resources no longer threatens us. In a private cloud, we tend to automate and reduce resource usage as long as our application performance goals are still met; in a public cloud, freeing up hardware is of no interest in itself.

Clearly, we are still motivated to cut costs, but cost is now calculated based on instance time (not CPU usage), disk accesses, network traffic, and server capacity. The optimal cost-saving approach therefore depends on the cost structure of your chosen vendor.

                                            "As such, selecting a vendor based on your cloud application's specific performance characteristics is more important than thinking about the hardware of a vendor."


Example: Suppose each search operation runs a transaction that makes 10 disk accesses, and each access incurs a very small charge. Cutting the number of accesses in half may not make the search transaction run any faster, but think about the possible savings for a transaction that might be executed a million times!
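A back-of-the-envelope calculation makes the point; the per-access charge below is purely hypothetical, not any vendor's actual price:

    # Hypothetical price: $0.40 per million disk I/O requests.
    PRICE_PER_DISK_ACCESS = 0.0000004
    EXECUTIONS_PER_DAY = 1_000_000

    daily_before = 10 * EXECUTIONS_PER_DAY * PRICE_PER_DISK_ACCESS
    daily_after = 5 * EXECUTIONS_PER_DAY * PRICE_PER_DISK_ACCESS
    annual_saving = (daily_before - daily_after) * 365

    print(f"daily: ${daily_before:.2f} -> ${daily_after:.2f}; "
          f"annual saving: ${annual_saving:.2f}")
    # daily: $4.00 -> $2.00; annual saving: $730.00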

In addition, a vendor's pricing structure can change at any time. To be able to optimize costs in response, we need to monitor proactively and collect the right metrics up front.

Detailed performance analysis is a complex process, but the objective is very simple: to identify the root cause of a problem. We start with a typical symptom, slow application response time, and then narrow down the possibilities step by step: first, decide whether the problem affects the entire application or just a specific transaction or transaction type. If it is the latter, our first step is to isolate the transaction type that has slowed down; this establishes the scope for further investigation, as in the sketch below. Speaking with an expert about your organization's requirements will provide you with an efficient solution.
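As a minimal sketch of that first narrowing-down step, one can group measured response times by transaction type and flag any type whose 95th percentile breaches the response-time goal; the sample durations, transaction names, and the 500 ms goal are all hypothetical:

    from statistics import quantiles  # Python 3.8+

    def p95(values):
        return quantiles(values, n=20)[-1]  # 95th percentile

    SLO_MS = 500
    measurements = {  # transaction type -> observed durations in ms
        "login":    [110, 118, 122, 125, 128, 130, 135, 140],
        "search":   [295, 300, 310, 880, 905, 920, 940, 950],
        "checkout": [210, 218, 222, 225, 228, 230, 235, 240],
    }

    for name, durations in measurements.items():
        tail = p95(durations)
        status = "SUSPECT" if tail > SLO_MS else "ok"
        print(f"{name:9s} p95={tail:6.1f} ms  {status}")
    # Only "search" breaches the goal, so that transaction type sets the
    # scope for deeper investigation.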

For a tailored solution, drop your query below to get expert assistance.