Tuning Oracle Data Loads for SQL*Loader

By: Bob Bryla, Kevin Loney


In addition to running the data load processes at off-peak hours, you can take other steps to improve the load performance. The following steps all impact your overall database environment and must be coordinated with the database administrator. The tuning of a data load should not be allowed to have a negative impact on the database or on the business processes it supports.

First, batch data loads may be timed to occur while the database is in NOARCHIVELOG mode. While in NOARCHIVELOG mode, the database does not keep an archive of its online redo log files prior to overwriting them. Eliminating the archiving process improves the transaction performance. Because the data is being loaded from a file, you can re-create the loaded data at a later time by reloading the datafile rather than recovering it from an archived redo log file.

However, there are significant potential issues with disabling ARCHIVELOG mode. You will not be able to perform a point-in-time recovery of the database unless archiving is enabled. If nonbatch transactions are performed in the database, you will probably need to run the database in ARCHIVELOG mode all the time, including during your loads. Furthermore, switching between ARCHIVELOG and NOARCHIVELOG modes requires that you shut down the instance (all instances if you have a RAC database). If you switch the instance to NOARCHIVELOG mode, perform your data load, and then switch the instance back to ARCHIVELOG mode, you should back up the database.

Instead of running the entire database in NOARCHIVELOG mode, you can disable archiving for your data load process by using the UNRECOVERABLE keyword within SQL*Loader. The UNRECOVERABLE option disables the writing of redo log entries for the transactions within the data load. You should only use this option if you can re-create the transactions from the input files during a recovery. If you follow this strategy, you must have adequate space to store old input files in case they are needed for future recoveries. The UNRECOVERABLE option is only available for Direct Path loads.

Rather than control the redo log activity at the load process level, you can control it at the table or partition level. If you define an object as NOLOGGING, then block-level inserts performed by SQL*Loader Direct Path loading and the INSERT /*+ APPEND */ command will not generate redo log entries. The block-level inserts will require additional space, however, as they will not reuse existing blocks below the table’s high-water mark.

NOTE

Many critical databases have a standby database and are, therefore, set up with FORCED LOGGING set to TRUE to keep the two databases in sync. Setting NOLOGGING for any database object will not override this since that would defeat the purpose of having a standby database.

If your operating environment has multiple processors, you can take advantage of the CPUs by parallelizing the data load. The PARALLEL option of the sqlldr command, as described in the next section, uses multiple concurrent data load processes to reduce the overall time required to load the data.

In addition to these approaches, you should work with your database administrator to make sure the database environment and structures are properly tuned for data loads. Tuning efforts should include the following:

  •  Preallocate space for the table to minimize dynamic extensions during the loads.
  •  Allocate sufficient memory resources to the shared memory areas.
  • Streamline the data-writing process by ensuring there are enough database writer (DBWR) processes for the database. Oracle best practices suggest allowing Oracle to determine the number of DBWR processes by default. Therefore, you should normally not change the number of DBWR processes unless directed by Oracle Support, or if an Automatic Database Diagnostic Monitor (ADDM) report suggests that you set it manually at a higher value.
  • Remove any unnecessary triggers during the data loads. If possible, disable or remove the triggers prior to the load, and perform the trigger operations on the loaded data manually after it has been loaded.
  • Remove or disable any unnecessary constraints on the table. You can use SQL*Loader to dynamically disable and re-enable constraints.
  • Remove any indexes on the tables. If the data has been properly cleaned prior to the data load, then performing uniqueness checks and foreign key validations are unnecessary during the data loads. Dropping indexes prior to data loads significantly improves performance.
  • Pre-sort the data prior to loading. Sort the data based on the indexed columns to minimize the time needed to create and update the indexes.

Leave a Reply