
There’s More to Optimizing App Performance Than Maximizing CPU Availability

By Sumeet Singh on June 16, 2016


When running on software-defined infrastructure, applications share physical hardware resources. Resource contention on a host can lead to reduced and unpredictable performance of an application.

TL;DR… It does not matter how much CPU is available if workloads are competing for memory.

The traditional approach to workload management is to schedule applications on the infrastructure in units of virtual or physical cores, relying on hypervisor and operating system schedulers to select which VMs and containers may execute on a processor core for a given time-slice. This common approach to CPU allocation is not sufficient to ensure predictable application performance.
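To see what that looks like in practice, here is a minimal sketch of core-and-time-slice allocation using the Linux cgroup CPU controller. The cgroup v1 paths, group name, and PID are placeholders for illustration; this is not AppFormix's implementation.

```python
# Minimal sketch of the "traditional" approach: cap a process's CPU time
# with the Linux cgroup v1 cpu controller (assumed mounted at the path below).
import os

CGROUP_CPU = "/sys/fs/cgroup/cpu"

def limit_cpu(group, pid, quota_us=50000, period_us=100000):
    """Cap `pid` at quota/period of one core (50% with these defaults)."""
    path = os.path.join(CGROUP_CPU, group)
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "cpu.cfs_period_us"), "w") as f:
        f.write(str(period_us))
    with open(os.path.join(path, "cpu.cfs_quota_us"), "w") as f:
        f.write(str(quota_us))
    with open(os.path.join(path, "tasks"), "w") as f:
        f.write(str(pid))

# limit_cpu("webserver", 1234)  # bounds CPU time, but says nothing about the
#                               # L3 cache lines or memory bus bandwidth the
#                               # process consumes
```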

Why doesn’t this approach work? Because it overlooks resources that are shared inside the processor, such as the L3 cache and memory bus bandwidth. Operating system schedulers do not control these internal resources, so CPU scheduling alone cannot protect an application from contention for them.

Shared cache and memory bandwidth are critical to application performance and must be controlled inside the processor complex. Even when an application receives its expected share of CPU time and memory space, another application may evict its cache lines. When that happens, both applications suffer poor and unpredictable performance.
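To picture the mechanism, a neighbor only has to stream through a buffer larger than the shared L3 cache to keep evicting other tenants' cache lines and saturating memory bandwidth, all while consuming a single core's worth of CPU time. An illustrative Python sketch (not the exact workload from our experiments):

```python
import numpy as np

# Buffer far larger than a typical shared L3 cache (tens of MB), so every
# pass streams through main memory and evicts other tenants' cache lines.
buf = np.zeros(512 * 1024 * 1024 // 8)   # ~512 MB of doubles

while True:
    buf += 1.0   # reads and writes every cache line in the buffer
```

Run next to a latency-sensitive service, a loop like this competes for exactly the resources that the operating system scheduler does not arbitrate.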

In their paper, Burns et al. identified that “containers cannot prevent interference in resources that the operating-system kernel doesn't manage, such as level 3 processor caches and memory bandwidth.” A so-called “noisy neighbor” that contends for cache lines and memory bandwidth can lead to 10x increases in latency and jitter, even when idle CPU is available.



This is bad news for operators of microservice architectures attempting to prioritize thousands of container- or VM-based workloads while modern applications create noise and chaos at the processor level.

The good news is AppFormix offers the first cloud optimization solution that uses Intel® Resource Director Technology (RDT) to provide this micro-level visibility and control of shared resources inside the processor.

AppFormix software—in conjunction with the Cache Monitoring Technology (CMT), Memory Bandwidth Monitoring (MBM) and Cache Allocation Technology (CAT) features available in the new Intel® Xeon® E5 v4 processor family—makes it possible for the first time to enforce isolation between workloads and to detect and mitigate processor-level noisy neighbors in real time.
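AppFormix integrates these counters into its analytics, but readers who want to poke at the hardware directly can sample the same CMT and MBM counters through the Linux resctrl interface on recent kernels. The sketch below assumes resctrl is mounted at /sys/fs/resctrl on RDT-capable hardware; the group and domain names are placeholders.

```python
# Sketch: read per-group L3 occupancy (CMT) and memory bandwidth (MBM)
# counters through the Linux resctrl interface.
import os
import time

RESCTRL = "/sys/fs/resctrl"          # assumes the resctrl filesystem is mounted

def read_counter(group, name, domain="mon_L3_00"):
    """Read one monitoring counter for a resctrl group (values are bytes)."""
    path = os.path.join(RESCTRL, group, "mon_data", domain, name)
    with open(path) as f:
        return int(f.read())

def memory_bandwidth_mbps(group, interval=1.0):
    """Estimate memory bandwidth (MB/s) by sampling the MBM counter twice."""
    before = read_counter(group, "mbm_total_bytes")
    time.sleep(interval)
    after = read_counter(group, "mbm_total_bytes")
    return (after - before) / interval / 1e6

# occupancy = read_counter("noisy_app", "llc_occupancy")   # L3 bytes in use (CMT)
# bandwidth = memory_bandwidth_mbps("noisy_app")           # memory traffic (MBM)
```

Here llc_occupancy reports how many bytes of L3 a group currently holds (CMT), while mbm_total_bytes accumulates the group's memory traffic (MBM), which is how a processor-level noisy neighbor shows up in the data.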

The advantages of this level of control are substantial.

AppFormix has demonstrated that a web server executing on a host with idle CPU resources is severely impacted by a memory-intensive application executing concurrently on the same host.

To quantify the effects of contention within the processor, we executed a web server container and a memory-intensive computation container concurrently on a host with idle CPU capacity.  

When we used AppFormix software to control the amount of cache available to the memory-intensive application, the web server had 24% lower average latency and 70% lower maximum latency. Further, the jitter was reduced. The standard deviation of peak latency was 82% lower when controlling the cache utilization of the memory-intensive application.  
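The control knob behind that result is Cache Allocation Technology: restrict the noisy workload to a small number of L3 ways so it can no longer evict the web server's cache lines. AppFormix applies this control for you; purely for illustration, here is how the same idea looks through the Linux resctrl interface (the group name, capacity bitmask, and cache domain are placeholders).

```python
# Sketch: confine a noisy workload to a few L3 ways using Cache Allocation
# Technology via the Linux resctrl filesystem.
import os

RESCTRL = "/sys/fs/resctrl"          # assumes the resctrl filesystem is mounted

def confine_to_cache_ways(group, pid, cbm="3", l3_domain=0):
    """Restrict `pid` to the L3 ways set in the hex capacity bitmask `cbm`."""
    path = os.path.join(RESCTRL, group)
    os.makedirs(path, exist_ok=True)
    with open(os.path.join(path, "schemata"), "w") as f:
        # e.g. "L3:0=3" grants only two ways of the socket-0 L3 to this group
        f.write(f"L3:{l3_domain}={cbm}\n")
    with open(os.path.join(path, "tasks"), "w") as f:
        f.write(str(pid))

# confine_to_cache_ways("noisy_app", 4321)   # the web server keeps the rest
#                                            # of the cache to itself
```

Restricting the noisy group's capacity bitmask leaves the remaining cache ways to the latency-sensitive workload, which is the effect measured above.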

Monitoring and managing cache and memory bandwidth at the processor level with AppFormix and Intel RDT reduced maximum application latency by 70 percent in this test.


In a DevOps world, the ability to deliver the low‐latency user experience that end users demand requires infrastructure transparency—down to the processor level—and real‐time monitoring and analytics. Only AppFormix, using Intel’s game-changing RDT technology, offers operators and developers this unprecedented visibility into and control over how applications run at the processor level.

 

1. Roy, P. “Meet Your Noisy Neighbor, Container.” AppFormix Inc., March 31, 2016. http://blog.appformix.com/meet-your-noisy-neighbor-container

2. Burns, B., Grant, B., Oppenheimer, D., Brewer, E., Wilkes, J. “Borg, Omega, and Kubernetes: Lessons learned from three container-management systems over a decade.” ACM Queue 14, 1 (Jan-Feb 2016). http://queue.acm.org/detail.cfm?id=2898444

3. Roy, P., Newhouse, T., Singh, S. “CPU shares insufficient to meet application SLAs.” AppFormix Technical Report APPFORMIX-TR-2016-1, March 2016. http://www.appformix.com/wp-content/uploads/2016/06/AppFormix-TR-2016-1-final.pdf

