SOS_SCHEDULER_YIELD
SQL Server instances with high CPU usage often show the SOS_SCHEDULER_YIELD wait type, which can indicate a need for further research and action. It can result from a number of different causes. One cause is an active thread voluntarily yielding so that the next thread on the runnable queue gets its turn on the CPU. In SQL Server, each thread is assigned a quantum of 4 ms, and SQL Server uses a cooperative scheduling model to ensure its CPU resources are shared amongst all the threads that are in a runnable state, preventing any individual thread from being starved.
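As a quick first check, the cumulative statistics for this wait type can be inspected with sys.dm_os_wait_stats; a minimal sketch (for SOS_SCHEDULER_YIELD the wait time is almost entirely signal wait time, i.e. time spent back on the runnable queue waiting for the CPU):

SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = 'SOS_SCHEDULER_YIELD';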
Another, less common and much more complex, cause is spinlock contention (spinlock backoffs). Resolving this requires deeper knowledge of SQL Server internals and is likely to require assistance from Microsoft support.
Where to start troubleshooting
Plans that process large amounts of data (large scans) often require many CPU cycles to work through all of the data. This can lead to higher SOS_SCHEDULER_YIELD waits, because the thread goes through repeated cycles of running for its quantum and then yielding. Reducing the amount of data to be processed (for example, turning those scans into seeks) will often reduce the number of CPU cycles required, and therefore the amount of SOS_SCHEDULER_YIELD waits. If PAGEIOLATCH_xx waits are not also seen for the executing statement, the data is already memory resident and the thread is simply using up its 4 ms quantum and then yielding.
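To see whether a currently executing statement fits this pattern (accumulating CPU time with SOS_SCHEDULER_YIELD rather than PAGEIOLATCH_xx waits, and few physical reads), the live requests can be checked; a minimal sketch using sys.dm_exec_requests:

SELECT er.session_id,
       er.status,
       er.wait_type,        -- current wait, if any
       er.last_wait_type,   -- most recent wait type
       er.cpu_time,
       er.logical_reads,
       er.reads             -- physical reads; near zero suggests the data is memory resident
FROM sys.dm_exec_requests er
WHERE er.session_id > 50;   -- filter out most system sessions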
Because SOS_SCHEDULER_YIELD waits often result from plans with high CPU execution times, we can use the sys.dm_exec_query_stats DMV (ordering by total_worker_time) to identify those CPU-expensive queries:
SELECT TOP 50
    SUBSTRING(eqt.TEXT, (eqs.statement_start_offset/2) + 1,
        ((CASE eqs.statement_end_offset
              WHEN -1 THEN DATALENGTH(eqt.TEXT)
              ELSE eqs.statement_end_offset
          END - eqs.statement_start_offset)/2) + 1)
    ,eqs.execution_count
    ,eqs.total_logical_reads
    ,eqs.last_logical_reads
    ,eqs.total_logical_writes
    ,eqs.last_logical_writes
    ,eqs.total_worker_time
    ,eqs.last_worker_time
    ,eqs.total_elapsed_time/1000000 AS Total_elapsed_time_Secs
    ,eqs.last_elapsed_time/1000000 AS Last_elapsed_time_Secs
    ,eqs.last_execution_time
    ,eqp.query_plan
FROM sys.dm_exec_query_stats eqs
CROSS APPLY sys.dm_exec_sql_text(eqs.sql_handle) eqt
CROSS APPLY sys.dm_exec_query_plan(eqs.plan_handle) eqp
ORDER BY eqs.total_worker_time DESC
Of course, other factors can contribute to the likelihood of SOS_SCHEDULER_YIELD waits, including older (or slower) CPUs or the spinlock conditions mentioned earlier.
Although this is not usually the case, the root cause of SOS_SCHEDULER_YIELD waits may be poor performance of the disk subsystem. Reviewing the performance of the database files can help confirm this; the DMV sys.dm_io_virtual_file_stats can be used to look for indications of slow activity on the database files.
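A minimal sketch of such a check, using the average stall time per read and per write as a rough indicator of file-level latency (what counts as "slow" depends on your storage):

SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.name AS file_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       CASE WHEN vfs.num_of_reads  = 0 THEN 0
            ELSE vfs.io_stall_read_ms  / vfs.num_of_reads  END AS avg_read_latency_ms,
       CASE WHEN vfs.num_of_writes = 0 THEN 0
            ELSE vfs.io_stall_write_ms / vfs.num_of_writes END AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) vfs
JOIN sys.master_files mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
ORDER BY avg_read_latency_ms DESC;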
Digging deeper
Long-running queries, especially in conjunction with high signal wait times, may be an indication of CPU pressure. Reviewing system performance will show whether the CPUs are under pressure, and reducing the workload through performance tuning or other steps can help lower the amount of work SQL Server has to perform.
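Signal wait time is the time a thread spends on the runnable queue waiting for a CPU after its resource wait has finished, so a high proportion of signal waits relative to total waits is a useful hint of CPU pressure. A minimal sketch of that ratio across all wait types (there is no hard threshold; judge it against a baseline for your system):

SELECT CAST(100.0 * SUM(signal_wait_time_ms) / SUM(wait_time_ms) AS DECIMAL(5,2)) AS pct_signal_waits,
       CAST(100.0 * SUM(wait_time_ms - signal_wait_time_ms) / SUM(wait_time_ms) AS DECIMAL(5,2)) AS pct_resource_waits
FROM sys.dm_os_wait_stats;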
If you are looking at using Extended Events to troubleshoot this on SQL Server 2008 | R2 or SQL Server 2012, you can use the wait_info event and filter it on the following wait type value:
SQL Server 2008 | R2: wait_info 'SOS_SCHEDULER_YIELD' (this maps to ID = 120)
SQL Server 2012: wait_info 'SOS_SCHEDULER_YIELD' (this maps to ID = 124)
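As a minimal sketch of such a session on SQL Server 2012 (using map value 124 from the list above; the session name and file target path are illustrative, and on 2008 | R2 the value would be 120 and the file target differs):

CREATE EVENT SESSION [Track_SOS_SCHEDULER_YIELD] ON SERVER
ADD EVENT sqlos.wait_info
(
    ACTION (sqlserver.session_id, sqlserver.sql_text)
    WHERE wait_type = 124   -- SOS_SCHEDULER_YIELD on SQL Server 2012
      AND opcode = 1        -- capture the end of the wait so duration is populated
      AND duration > 0
)
ADD TARGET package0.event_file (SET filename = N'C:\Temp\sos_scheduler_yield.xel');
GO
ALTER EVENT SESSION [Track_SOS_SCHEDULER_YIELD] ON SERVER STATE = START;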
Potential Solutions
The appropriate solution approach will vary depending on the type of problem that exists.
Performance Tuning
This wait type is a typical indicator of CPU pressure, often because of large scans.
Reducing the amount of data processed through performance tuning, commonly achieved with indexing improvements, is a good starting point for lowering the workload and the number of CPU cycles required.
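As a purely illustrative sketch (the table, column, and index names here are hypothetical), a covering nonclustered index can allow a query that currently scans to seek instead:

-- Hypothetical query: filters on OrderDate and returns CustomerID and TotalDue,
-- currently scanning dbo.SalesOrder. A covering index supports a seek instead.
CREATE NONCLUSTERED INDEX IX_SalesOrder_OrderDate
ON dbo.SalesOrder (OrderDate)
INCLUDE (CustomerID, TotalDue);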
Exhausting CPU Quanta
This indicates a lack of CPU cycles (especially when signal wait times are high).
Adding more CPUs (potentially via hyper-threading) or faster CPUs (newer hardware) may be indicated.
For a virtualized instance, increasing the number of vCPUs may help.
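To help confirm that threads are queuing for CPU rather than waiting on resources, the schedulers can be inspected; a minimal sketch using sys.dm_os_schedulers (consistently high runnable_tasks_count values point to CPU pressure):

SELECT scheduler_id,
       cpu_id,
       status,
       current_tasks_count,
       runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';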
Spinlock Contention
Identifying and resolving spinlock problems requires advanced knowledge, and usually involves the Microsoft support team assisting in the resolution.
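If spinlock contention is suspected, the cumulative spinlock statistics can at least show which spinlocks are accumulating backoffs; a minimal sketch using sys.dm_os_spinlock_stats (interpreting the output, and deciding what to do about it, is where that advanced knowledge comes in):

SELECT TOP 20 name,
       collisions,
       spins,
       spins_per_collision,
       backoffs
FROM sys.dm_os_spinlock_stats
ORDER BY backoffs DESC;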