Keep a watchful eye on two key configuration parameters to maintain consistent performance.
Server virtualization for Oracle databases is here, and it works, so it is here to stay. As database administrators, we must adapt to the new architecture while ensuring database performance and resiliency. A recent client problem involving a handful of critical applications revealed two key configuration settings every database administrator needs to continuously verify. Even in the virtual world of shared host resource utilization, some resources should not be shared and some guests need to be granted “VIP” status.
Case study: Critical applications running very slowly
A client reported that five business-crucial applications were suddenly running extremely slowly. The impact unfortunately extended to the client’s external partners, so there was no time to waste. As the discussion unfolded, a possible bottleneck was identified: the Oracle database managing the business-rules engine data. This database was hosted as a virtual guest on a mainframe computer running zLinux on IFL (Integrated Facility for Linux) processors. The mainframe LPAR hosted at least seven other server guests, each running two JVMs, so this was a busy environment. The database itself was fronted by nearly 20 JVMs, all from the five different applications that were reported as slow.
Different teams dug into the infrastructure with a primary focus on the database environment. The mainframe engineer reported that the physical host LPAR resources were not strained. The Oracle DBA reported an increase in overall response time and had concerns about CPU availability. Finally, the zLinux engineer reported that resource stealing was occurring between the guests, with the database server guest losing up to 90% of its assigned virtual CPU capacity. Then, within minutes, the zLinux engineer reported that memory had been taken away from the Oracle database guest server, so the database SGA was now being swapped.
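CPU stealing of the kind described above can usually be seen from inside a Linux guest, because the kernel reports a "steal" counter on the aggregate cpu line of /proc/stat. The following Python sketch is an illustration only, not the tooling used in this case. The function names are hypothetical, and it assumes a kernel that reports at least the first eight counters (user, nice, system, idle, iowait, irq, softirq, steal).

```python
import time


def parse_cpu_line(stat_text):
    """Return the aggregate 'cpu' counters from /proc/stat text as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def steal_percent(before, after):
    """Percentage of CPU time stolen between two counter samples.

    Counter order: user nice system idle iowait irq softirq steal.
    Guest counters (if present) are ignored because the kernel already
    includes guest time in user time.
    """
    if len(before) < 8 or len(after) < 8:
        raise ValueError("kernel does not report a steal counter")
    deltas = [a - b for a, b in zip(after[:8], before[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def sample_steal(interval_seconds=5.0, path="/proc/stat"):
    """Sample the live counters twice and return the steal percentage."""
    with open(path) as handle:
        before = parse_cpu_line(handle.read())
    time.sleep(interval_seconds)
    with open(path) as handle:
        after = parse_cpu_line(handle.read())
    return steal_percent(before, after)
```

Calling sample_steal() on the database guest at regular intervals and alerting when the result stays above a threshold you choose would give the DBA an early, independent signal, rather than waiting for another team to report the problem.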
Lessons learned: Performance impact of resource shortages
Actions were taken to resolve the slowness, and a post-mortem was done to verify the root cause and determine mitigation. The root cause was that the Oracle database guest server had been guaranteed neither CPU nor memory resources. Now, this guest has a memory reservation designed to never allow the SGA to be swapped, and it has been registered with the CPU dispatch process to always get CPU when ready, ahead of all other guests.
So, please remember that the people building the underlying architecture upon which the Oracle guest will reside probably don’t understand database hosting, especially the performance impact of resource shortages. Take time to educate your peers, be sure to ask for fixed minimum resources, and then continuously check that the configuration doesn’t get changed inadvertently over time.
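One simple way to keep checking from the database side is to watch swap usage on the guest itself. The Python sketch below is a minimal example under stated assumptions: the function names are hypothetical, and it reads only the SwapTotal and SwapFree fields of /proc/meminfo. It shows swapping done by the Linux guest only. It cannot see paging done by the hypervisor underneath the guest, and it does not confirm that the reservation itself is still configured, so it complements rather than replaces a review of the host settings.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a dict of field name -> integer value."""
    values = {}
    for line in text.splitlines():
        key, separator, rest = line.partition(":")
        parts = rest.split()
        if separator and parts:
            values[key.strip()] = int(parts[0])  # sizes are reported in kB
    return values


def swap_used_kb(meminfo):
    """Return the amount of swap currently in use, in kB."""
    return meminfo["SwapTotal"] - meminfo["SwapFree"]


def read_swap_used_kb(path="/proc/meminfo"):
    """Read the live file and return swap in use, in kB."""
    with open(path) as handle:
        return swap_used_kb(parse_meminfo(handle.read()))
```

On a guest whose memory is sized and reserved so that the SGA is never swapped, a steadily growing value from read_swap_used_kb() is a reason to ask the platform team whether the reservation is still in place.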