The Sometimes Forgotten Critical Checks for Oracle

on January 25, 2013


Surprisingly, many Oracle database administrators often miss two critical checks when starting to troubleshoot problems: the alert log and v$resource_limit.  Database administrators should always start with these fundamentals and progress methodically through a troubleshooting process.  Getting caught up in the emotional “panic” of others or focusing on vague user reports are often distractions that lead database administrators away from root cause and the solution, causing critical time loss.

I’m not advocating ignoring error messages provided by users, as the error messages will be essential to rectifying the problem; I am recommending caution when “slowness” or another vague failure report is ticketed.  How many times have database administrators heard, “The database is down and all users are not working,” only to discover what that really means is that three people at a remote location that was just slammed by a storm that caused a power outage can no longer work?

First, check the alert log

The alert log is a fountain of critical information chronologically captured so that finding important data is quick and easy.  The alert log journals general configuration facts such as instance startup settings, run time events like redo log switches and warnings encompassing deadlocks, failed segment expansions due to tablespace storage shortages and ORA-nnnnn errors.  The latter events are what database administrators must seek out to find the problem and to begin working toward resolution.

Second, check the v$resource_limit values

Next, by connecting to the database from a client workstation to check the v$resource_limit values, database administrators can quickly assess three operational metrics:  status of the listener, accessibility of the database and resource availability.  Doing a ‘select * from v$resource_limit;’ yields vital database health metrics (see below.)  To keep the output neat, use ‘set linesize 200 pages 40’.

SQL> l

  1* select * from v$resource_limit

SQL> /

 

RESOURCE_NAME                  CURRENT_UTILIZATION MAX_UTILIZATION INITIAL_AL LIMIT_VALU

------------------------------ ------------------- --------------- ---------- ----------

processes                                      234             296        800        800

sessions                                       243             307        885        885

enqueue_locks                                  220             298      10832      10832

enqueue_resources                               51              95       4112  UNLIMITED

ges_procs                                      231             292        801        801

ges_ress                                         0               0      17202  UNLIMITED

ges_locks                                        0               0      26471  UNLIMITED

ges_cache_ress                                3585            6986          0  UNLIMITED

ges_reg_msgs                                   271             809       2350  UNLIMITED

ges_big_msgs                                    42             530       2350  UNLIMITED

ges_rsv_msgs                                     0               0       1000       1000

gcs_resources                               120338          396511     414309     414309

gcs_shadows                                  80088          276865     414309     414309

dml_locks                                       16              16       3892  UNLIMITED

temporary_table_locks                            0               3  UNLIMITED  UNLIMITED

transactions                                   102             144        973  UNLIMITED

branches                                         0               0        973  UNLIMITED

cmtcallbk                                        2              10        973  UNLIMITED

max_rollback_segments                           21              22        973      65535

sort_segment_locks                               0              50  UNLIMITED  UNLIMITED

k2q_locks                                        0               0       1770  UNLIMITED

max_shared_servers                               1               1  UNLIMITED  UNLIMITED

parallel_max_servers                             2               9         80       3600

 

23 rows selected.

 

Any resource shortage will inhibit optimal performance, and in many cases cause significant problems for users.  It’s a good practice to check these values weekly so that you can recognize abnormalities quickly.

On one system, for example, the max_rollback_segments value had never exceeded 200, but one day after hearing of a performance problem, I found the value was at 1500.  The root cause of the problem ended up being a report that wasn’t really a report but a batch job with lots of data changes followed by a report.  After tracking down the submitting user and asking about the job, the response was, “We knew that report was not working but we decided to run it anyway.”

The good news is that by checking v$resource_limit immediately, finding the problem was relatively easy. Note the rollback level, check for a process consuming a lot of rollback, find out who submitted the job and then work with the submitter concerning what to do about the offending job.  In this case the job was terminated and the submitter was “asked” to not resubmit the job until after the performance team reviewed the code.

Conclusion

Remember, focus on the fundamentals first.  Doing these two critical checks immediately will take five minutes, saving you from a major headache that will come if you overlook these initial troubleshooting steps.

Related Posts

Leave a Reply