Protecting Oracle Database Performance in a Virtual Server World

on November 6, 2014

Keep a watchful eye on two key configuration parameters to maintain consistent performance.

Server virtualization for Oracle databases is here, and because it works, it is here to stay. As database administrators, we must adapt to the new architecture while still ensuring database performance and resiliency. A recent client problem involving a handful of critical applications revealed two key configuration settings that every database administrator needs to verify continuously: guaranteed CPU and guaranteed memory for the database guest. Even in the virtual world of shared host resource utilization, some resources should not be shared and some guests need to be granted “VIP” status.

Case study: Critical applications running very slowly

A client reported that five business-critical applications were suddenly running extremely slowly. Unfortunately, the impact extended to the client’s external partners, so there was no time to waste. As the discussion unfolded, a possible bottleneck was identified: the Oracle database managing the business-rules engine data. This database was hosted as a virtual guest on a mainframe computer running zLinux on IFL (Integrated Facility for Linux) processors. The mainframe LPAR hosted at least seven other server guests, each running two JVMs, so this was a busy environment. The database itself was fronted by nearly 20 JVMs, all from the five applications that were reported as slow.

Different teams dug into the infrastructure with a primary focus on the database environment. The mainframe engineer reported that the physical host LPAR resources were not strained. The Oracle DBA reported an increase in overall response time and had concerns about CPU availability. Finally, the zLinux engineer reported that resource stealing was occurring between the guests, with the database server guest losing up to 90% of its assigned virtual CPU allocation. Then, within minutes, the zLinux engineer reported that memory had been taken away from the Oracle database guest server, so the database SGA was now being swapped.
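CPU stealing of this kind can often be seen from inside a Linux guest. The following is a minimal sketch, not the tooling used in this case: it assumes the guest exposes the standard /proc/stat file, where the eighth counter on the aggregate "cpu" line is steal time. The function names are illustrative.

```python
import time


def parse_cpu_line(stat_text):
    """Return the integer counters from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def steal_percent(before_text, after_text):
    """Percentage of CPU time stolen between two /proc/stat snapshots."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    if len(before) < 8 or len(after) < 8:
        raise ValueError("kernel does not report a steal counter")
    # Counters 0-7: user, nice, system, idle, iowait, irq, softirq, steal.
    # Guest counters (8 and 9) are already included in user and nice,
    # so they are left out of the total to avoid double counting.
    deltas = [a - b for a, b in zip(after[:8], before[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def measure_steal(interval_seconds=5.0, path="/proc/stat"):
    """Sample the live system twice and return the steal percentage."""
    with open(path) as handle:
        before_text = handle.read()
    time.sleep(interval_seconds)
    with open(path) as handle:
        after_text = handle.read()
    return steal_percent(before_text, after_text)
```

Calling measure_steal() on the database guest during a slowdown gives a quick first reading; a sustained high value suggests that the guest is ready to run but is not being dispatched by the hypervisor.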

Lessons learned: Performance impact of resource shortages

Actions were taken to resolve the slowness, and a post-mortem was done to verify the root cause and determine mitigations. The root cause was that the Oracle database guest server had not been guaranteed CPU or memory resources. The guest now has a memory reservation designed to keep the SGA from ever being swapped, and it has been registered with the CPU dispatch process to always get CPU when ready, ahead of all other guests.

So, please remember that the people building the underlying architecture upon which the Oracle guest will reside probably don’t understand database hosting, especially the performance impact of resource shortages. Take time to educate your peers, be sure to ask for fixed minimum resources, and then continuously check that the configuration doesn’t get changed inadvertently over time.
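The reservation itself lives in the hypervisor configuration, but a guest-side check can act as an early warning that it has been lost. The sketch below is one possible approach, not the client's actual monitoring: it assumes a Linux guest with the standard /proc/vmstat file and compares the pswpin and pswpout counters (pages swapped in and out) between two snapshots. The function names are illustrative.

```python
def parse_vmstat(vmstat_text):
    """Parse /proc/vmstat text ('name value' per line) into a dict of integers."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before_text, after_text):
    """Pages swapped in and out between two /proc/vmstat snapshots."""
    before = parse_vmstat(before_text)
    after = parse_vmstat(after_text)
    # A KeyError here means the kernel does not expose these swap counters.
    return {name: after[name] - before[name] for name in ("pswpin", "pswpout")}


def read_vmstat(path="/proc/vmstat"):
    """Read the live counters; take two readings some seconds apart."""
    with open(path) as handle:
        return handle.read()
```

On a database guest whose memory is properly reserved, both deltas should stay at or near zero. Persistent non-zero values are a signal to re-check the reservation with the platform team.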

Comments

Rex Black says

January 13, 2015 at 4:42 pm

Excellent case study; it makes the issue clear for those of us with more of an infrastructure background. Of course, it helps to know that the SGA is the Oracle DB cache, and swapping a cache negates much of the benefit of having a cache.
