Managing Diagnostic Data for Oracle RAC

on October 1, 2013


Beginning with Release 11g, Oracle Database includes an advanced fault diagnosability infrastructure for collecting and managing diagnostic data. The infrastructure targets critical errors such as those caused by code bugs, metadata corruption, and customer data corruption. When one of these critical errors occurs, it is assigned an incident number, and diagnostic data for the error is immediately captured and tagged with this number. The collected data is stored in the Automatic Diagnostic Repository (ADR), where it can be retrieved and analyzed by incident number.

Automatic Diagnostic Repository

The ADR is a special file-based repository that is automatically maintained by Oracle 11g to hold diagnostic information about critical error events. The diagnostic information includes trace files, dumps, and core files, plus new types of diagnostic data that enable customers and Oracle Support to identify and resolve problems more effectively.

ADRCI command-line utility

The ADR Command Interpreter (ADRCI) is the command-line utility used to investigate incidents and to view health check reports. If you want to send ADR data to Oracle Support, ADRCI can package and upload first-failure diagnostic data. ADRCI also lets you view the names of the trace files in the ADR and view the alert log with XML tags stripped, with or without content filtering. To retrieve a list of incidents, run the following command at the ADRCI prompt:

   adrci> show incident
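
For scripted checks, the same command can be run without an interactive session. The following Python sketch assumes that the adrci binary is on the PATH and relies on ADRCI's exec="..." command-line form; the function names are illustrative and not part of any Oracle API.

```python
import subprocess


def build_adrci_command(commands, adrci_binary="adrci"):
    """Return the argv list for a non-interactive ADRCI call.

    ADRCI accepts a semicolon-separated command list through exec="...".
    """
    return [adrci_binary, "exec=" + "; ".join(commands)]


def show_incidents(adr_home):
    """Run SHOW INCIDENT against one ADR home and return ADRCI's text output.

    adr_home is given relative to the ADR base, for example
    "diag/rdbms/maadb/maadb1". Needs an Oracle installation to run.
    """
    argv = build_adrci_command(["set homepath " + adr_home, "show incident"])
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout
```

Setting the home path first matters when several ADR homes exist under one ADR base, because some ADRCI commands work on only one home at a time.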

ADR structure

The ADR root directory is known as the ADR base. Its location is set by the DIAGNOSTIC_DEST initialization parameter. Within the ADR base, multiple ADR homes can exist, where each ADR home is the root directory for all diagnostic data of one instance of a particular Oracle product or component. In an Oracle RAC environment with Oracle ASM, each database instance, Oracle ASM instance, and listener has its own ADR home. The location of each ADR home is given by the following path, which starts at the ADR base directory: diag/product_type/product_id/instance_id.

For example, for a database with a system identifier (SID) of maadb1 and a database unique name of maadb, the ADR home would be in the following location: ADR_base/diag/rdbms/maadb/maadb1/. Similarly, the ADR home path for the Oracle ASM instance +asm1 would be ADR_base/diag/asm/+asm/+asm1/.
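
The path rule can be written as a small helper. This is a minimal Python sketch; the function name adr_home is hypothetical, and the ADR base /u01/oracle is taken from the sample V$DIAG_INFO output in this article.

```python
from pathlib import PurePosixPath


def adr_home(adr_base, product_type, product_id, instance_id):
    """Build ADR_base/diag/product_type/product_id/instance_id."""
    return PurePosixPath(adr_base, "diag", product_type, product_id, instance_id)


# Database instance maadb1 of database maadb
print(adr_home("/u01/oracle", "rdbms", "maadb", "maadb1"))
# Oracle ASM instance +asm1
print(adr_home("/u01/oracle", "asm", "+asm", "+asm1"))
```

The two calls print /u01/oracle/diag/rdbms/maadb/maadb1 and /u01/oracle/diag/asm/+asm/+asm1, matching the examples in the text.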

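Each ADR home directory contains the following subdirectories, which hold the diagnostic data:

  • alert: the XML-formatted alert log
  • cdump: core files
  • incident: one subdirectory per incident, containing all dumps for that incident
  • trace: process trace files and the text version of the alert log
  • hm: reports from Health Monitor checks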

A query against the V$DIAG_INFO view lists all important ADR locations for the current Oracle Database instance:

   SELECT * FROM V$DIAG_INFO;
   INST_ID NAME                  VALUE
   ------- --------------------- ------------------------------------------------------------
         1 Diag Enabled          TRUE
         1 ADR Base              /u01/oracle
         1 ADR Home              /u01/oracle/diag/rdbms/maadb/maadb1
         1 Diag Trace            /u01/oracle/diag/rdbms/maadb/maadb1/trace
         1 Diag Alert            /u01/oracle/diag/rdbms/maadb/maadb1/alert
         1 Diag Incident         /u01/oracle/diag/rdbms/maadb/maadb1/incident
         1 Diag Cdump            /u01/oracle/diag/rdbms/maadb/maadb1/cdump
         1 Health Monitor        /u01/oracle/diag/rdbms/maadb/maadb1/hm
         1 Default Trace File    /u01/oracle/diag/rdbms/maadb/maadb1/trace/orcl_ora_22769.trc
         1 Active Problem Count  2
         1 Active Incident Count 8
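
Once the rows of V$DIAG_INFO are available in a script, they are convenient to handle as a name-to-value mapping. The Python sketch below works on rows copied from the sample output rather than on a live connection. The helper names are hypothetical, and the file names log.xml and alert_<SID>.log reflect the usual Oracle naming, which is an assumption here rather than something the query returns.

```python
from pathlib import PurePosixPath

# Rows as (INST_ID, NAME, VALUE), copied from the sample output above.
SAMPLE_ROWS = [
    (1, "ADR Home", "/u01/oracle/diag/rdbms/maadb/maadb1"),
    (1, "Diag Trace", "/u01/oracle/diag/rdbms/maadb/maadb1/trace"),
    (1, "Diag Alert", "/u01/oracle/diag/rdbms/maadb/maadb1/alert"),
]


def diag_locations(rows):
    """Map each V$DIAG_INFO NAME to its VALUE."""
    return {name: value for _inst_id, name, value in rows}


def alert_log_paths(locations):
    """Return (xml_alert_log, text_alert_log) for the instance.

    Assumes the usual file names: log.xml under "Diag Alert" and
    alert_<SID>.log under "Diag Trace", with the SID taken from the
    last component of the ADR home.
    """
    sid = PurePosixPath(locations["ADR Home"]).name
    xml_log = PurePosixPath(locations["Diag Alert"]) / "log.xml"
    text_log = PurePosixPath(locations["Diag Trace"]) / f"alert_{sid}.log"
    return xml_log, text_log


locations = diag_locations(SAMPLE_ROWS)
xml_log, text_log = alert_log_paths(locations)
print(xml_log)
print(text_log)
```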

ADR in Oracle RAC

The ADR is stored outside of the database so that it remains available for problem diagnosis when the database is down. This raises the question of where to store the ADR. In an Oracle RAC environment, each node can have an ADR base on its own local storage, or the ADR base can be set to a location on shared storage. The shared storage approach has two advantages:

  • Diagnostic data from all instances can be displayed in a single report.
  • You can use the Data Recovery Advisor to help diagnose and repair corrupted data blocks, corrupted or missing files, and other data failures. (For Oracle RAC, the Data Recovery Advisor requires shared storage.)

Nevertheless, keep in mind that you might need the ADR precisely when parts of your cluster are not available. If, for example, Oracle ASM or Grid Infrastructure (GI) in general is not working and you configured the ADR to be stored in Oracle ASM Cluster File System (ACFS), you would be left in the dark with no diagnostic data to use for troubleshooting. Thus, we recommend that you not use ACFS to store the ADR.
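
A quick way to verify this recommendation on a node is to check which file system holds the ADR base. The Python sketch below is Linux-only and assumes that ACFS mounts are reported with the file system type acfs in /proc/mounts; the function names, the device name, and the mount points are hypothetical.

```python
import os


def filesystem_type(path, mount_lines):
    """Return the file system type of the most specific mount containing path.

    mount_lines follow the /proc/mounts layout:
    device mount_point fs_type options ...
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mount_lines:
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # The longest matching mount point wins; on a tie, the later entry.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def adr_base_on_acfs(adr_base, mount_lines):
    """True if the ADR base sits on a mount reported as type acfs."""
    return filesystem_type(adr_base, mount_lines) == "acfs"


# Hypothetical mount table; on a real node, read the lines from /proc/mounts.
sample_mounts = [
    "/dev/sda1 / ext4 rw 0 0",
    "/dev/asm/diagvol-123 /u02/acfs acfs rw 0 0",
]
print(adr_base_on_acfs("/u01/oracle", sample_mounts))
print(adr_base_on_acfs("/u02/acfs/adr_base", sample_mounts))
```

With the sample mount table, the first call prints False and the second prints True.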

Reporting and resolving a problem

Problem resolution starts by accessing the Database Home page in Oracle Enterprise Manager (OEM) and reviewing critical error alerts. Select an alert for which you want to view details, and then go to the Problem Details page. Examine the problem details and view the list of all incidents that were recorded for the problem. Display findings from any health checks that were run automatically. Create a service request with My Oracle Support and record the service request number with the problem information. Package and upload the diagnostic data for the problem to Oracle Support. Finally, once the problem is resolved, set the status of the incidents to Closed.
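
When OEM is not available, the packaging step can also be done with ADRCI's Incident Packaging Service (IPS) commands. The Python sketch below only builds the two command strings; the function names, the problem and package numbers, and the output directory are hypothetical. It assumes the ADRCI syntax IPS CREATE PACKAGE PROBLEM and IPS GENERATE PACKAGE ... IN, where the package ID is the number that ADRCI reports after creating the package.

```python
def create_package_command(problem_id):
    """ADRCI command that creates a logical package for one problem."""
    return f"ips create package problem {int(problem_id)}"


def generate_package_command(package_id, output_dir):
    """ADRCI command that writes the package as a zip file into output_dir."""
    return f"ips generate package {int(package_id)} in {output_dir}"


# Hypothetical IDs and directory, for illustration only.
print(create_package_command(1))
print(generate_package_command(1, "/tmp/adr_packages"))
```

The resulting zip file can then be attached to the service request in My Oracle Support.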
