Tuning SQL Server Memory, Part 1

By: Steven Wort, Ross LoForte, Brian Knight


Because memory is fast relative to disk I/O, using this system resource effectively can have a large impact on the system’s overall ability to scale and perform well.

SQL Server 2012 memory architecture and capabilities vary greatly from those of previous versions of SQL Servers. Changes include the ability to consume and release memory-based, internal-server conditions dynamically using the AWE mechanism. In SQL Server 2000, all memory allocations above 4GB were static. Additional memory enhancements include the introduction of hierarchical memory architecture to maximize data locality and improve scalability by removing a centralized memory manager. SQL Server 2012 has resource monitoring, dynamic management views (DMVs), and a common caching framework.

The following performance counters are available from the System Performance Monitor. The SQL Server Cache Hit Ratio signifies the balance between servicing user requests from data in the data cache and having to request data from the I/O subsystem. Accessing data in RAM (or data cache) is exponentially faster than accessing the same information from the I/O subsystem; thus, the wanted state is to load all active data in RAM. Unfortunately, RAM is a limited resource. A wanted cache hit ratio average should be well over 90 percent. This does not mean that the SQL Server environment would not benefit from additional memory. A lower number signifies that the system memory or data cache allocation is below the wanted size.

Another reliable indicator of instance memory pressure is the SQL Server:Buffer Manager:Page-life-expectancy (PLE) counter. This counter indicates the amount of time that a buffer page remains in memory, in seconds. The ideal number for PLE varies with the size of the RAM installed in your particular server and how much of that memory is used by the plan cache, Windows OS, and so on. The rule of thumb nowadays is to calculate the ideal PLE number for a specific server using the following formula: MaxSQLServerMemory(GB) x 75. So, for a system that has 128GB of RAM and has a MaxServerMemory SQL setting of 120GB, the “minimum PLE before there is an issue” value is 9000. This can give you a more realistic value to monitor PLE against than the previous yardstick of 300. Be careful not to under-allocate total system memory because it forces the operating system to start moving page faults to a physical disk. A page fault is a phenomenon that occurs when the operating system goes to a physical disk to resolve memory references. The operating system may incur some paging, but when excessive paging takes places, it uses disk I/O and CPU resources, which can introduce latency in the overall server, resulting in slower database performance. You can identify a lack of adequate system memory by monitoring the Memory: Pages/sec performance counter. It should be as close to zero as possible because a higher value indicates that more hard-paging is taking place, as can happen when backups are taking place.

SQL Server 2012 has several features that should help with this issue. With Windows Server 2008, SQL Server 2012 has support for hot-add memory, a manager framework, and other enhancements. The SQL Server operating system (SQLOS) layer is the improved version of the User Mode Scheduler (UMS), now simply called scheduler. Consistent with its predecessor, SQLOS is a user-mode cooperative and on-demand thread-management system. An example of a cooperative workload is one that yields the processor during a periodic interval or while in a wait state, meaning that if a batch request does not have access to all the required data for its execution, it requests its data and then yields its position to a process that needs processing time.

SQLOS is a thin layer that sits between SQL Server and Windows to manage the interaction between these environments. It enables SQL Server to scale on any hardware. This was accomplished by moving to a distributed model and creating an architecture that would foster locality of resources to aid in getting rid of global resource management bottlenecks. The challenge with global resource management is that in large hardware design, global resources cannot keep up with the demands of the system, slowing overall performance. Figure 1 below highlights the SQLOS components that perform thread scheduling and syncrohnization, perform SQL Server memory management, provide exception handling, and host the Common Language Runtime (CLR).

Wiley Admin 11_12

Figure 1. SQLOS Components.

Be sure to read part 2 of this article for more information.

Trackbacks

Leave a Reply