Extracting and loading data from one database to another can often leave us wondering how effective our migration attempts are and if the source and target tables are actually in sync after the process. Keeping two tables in sync can be cumbersome for development teams as well as DBAs that are often left with creating and executing a variety of scripts to test the quality of data.
To the rescue, Oracle has deployed a package and set of views to help validate and then re-sync database objects. Oracle’s package DBMS_COMPARISON enables users to compare two different objects, in the same database or across a database link, and then merge the two objects so that the objects are put back in sync. As an example, this post provides a comparison and merge of differences for an EMPLOYEE table that exists in the two different schemas HR_OLD & HR_NEW.
1. Create the comparison; specify a name of the comparison and the local and remote schema objects. Notice that a NULL database link means that both objects are local.
BEGIN DBMS_COMPARISON.CREATE_COMPARISON( comparison_name => ‘employees_compare’, schema_name => ‘hr_new’, object_name => ‘employees’, remote_schema_name => ‘hr_old’, remote_object_name => ‘employees’, dblink_name => NULL); END; / PL/SQL procedure successfully completed.
2. Compare the two objects to see if there are any inconsistencies:
SET SERVEROUTPUT ON DECLARE consistent BOOLEAN; scan_info DBMS_COMPARISON.COMPARISON_TYPE; BEGIN consistent := DBMS_COMPARISON.COMPARE( comparison_name => ‘employees_compare’, scan_info => scan_info, perform_row_dif => TRUE); DBMS_OUTPUT.PUT_LINE(‘Scan ID: ‘||scan_info.scan_id); IF consistent=TRUE THEN DBMS_OUTPUT.PUT_LINE(‘No differences were found.’); ELSE DBMS_OUTPUT.PUT_LINE(‘Differences were found.’); END IF; END; / Scan ID: 24 Differences were found. PL/SQL procedure successfully completed.
3. Query to show how many differences where found:
SELECT s.scan_id, c.comparison_name, c.schema_name, c.object_name, c.remote_schema_name, c.remote_object_name, s.current_dif_count FROM dba_comparison c, dba_comparison_scan_summary s WHERE c.comparison_name = s.comparison_name AND c.owner = s.owner AND s.scan_id = 24; SCAN_ID COMPARISON_NAME SCHEMA OBJECT_NA REMOTE REMOTE_OB CUR_DIF_CNT ————– ————————————— ————– ——————— ————– ——————— ———————— 24 EMPLOYEES_COMPARE HR_NEW EMPLOYEES HR_OLD EMPLOYEES 107
4. Query to see which rows were different. A ‘Yes’ in the local or remote column signifies that there is a row that exists both in the local and remote objects and a merge will have to occur. If there is a ‘No’ then a converge rule that must be specified in the next step on how to handle these; basically specifying which (local or remote) is the base system and causing a delete in the other.
SELECT c.column_name, r.index_value, DECODE(r.local_rowid,NULL, ‘No’,'Yes’) local_rowid, DECODE(r.remote_rowid,NULL,’No’,'Yes’) remote_rowid FROM dba_comparison_columns c, dba_comparison_row_dif r, dba_comparison_scan s WHERE r.scan_id = s.scan_id AND s.parent_scan_id = 24 AND r.status = ‘DIF’ AND c.index_column = ‘Y’ AND c.comparison_name = r.comparison_name AND c.owner = r.owner ORDER BY r.INDEX_VALUE; COLUMN_NAME INDEX_VALU LOCAL REMOTE ———————– —————— ———- ———– EMPLOYEE_ID 100 Yes Yes EMPLOYEE_ID 101 Yes Yes EMPLOYEE_ID 102 Yes Yes EMPLOYEE_ID 103 Yes Yes EMPLOYEE_ID 104 Yes Yes
5. Perform the merge; notice that the local object wins (replaces data in the remote object) when data is different between the two objects. This also causes rows to be deleted in the remote object if not found in the local object.
SET SERVEROUTPUT ON DECLARE scan_info DBMS_COMPARISON.COMPARISON_TYPE; BEGIN DBMS_COMPARISON.CONVERGE( comparison_name => ‘employees_compare’, scan_id => 24, scan_info => scan_info, converge_options => DBMS_COMPARISON.CMP_CONVERGE_LOCAL_WINS); DBMS_OUTPUT.PUT_LINE(‘Local Rows Merged :’||scan_info.loc_rows_merged); DBMS_OUTPUT.PUT_LINE(‘Remote Rows Merged:’||scan_info.rmt_rows_merged); DBMS_OUTPUT.PUT_LINE(‘Local Rows Deleted:’||scan_info.loc_rows_deleted); DBMS_OUTPUT.PUT_LINE(‘Remote Rows Deleted:’||scan_info.rmt_rows_deleted); END; / Local Rows Merged :0 Remote Rows Merged:107 Local Rows Deleted:0 Remote Rows Deleted:0 PL/SQL procedure successfully completed.
The compare and merge process become somewhat interesting when you trace the whole process. Take a look at the following trace file: 200903-01-SWI-db11FS_ora_3605.trc.out for some behind-the-scenes activity. I find it quite interesting, to compare and merge two tables that only had 107 rows of data in them, that:
- Nearly 450,000 buffers where retrieved
- It took nearly 30 seconds to compare and merge the two tables
- 300+ SQL statements were executed
- 5,500+ physical disk blocks where read from datafiles
There must be some exhaustive processing going on here. When looking at the tkprof output, you will quickly note that most of this “bad” time and processing was spent querying the ALL_TABLES, ALL_INDEXES, ALL_IND_COLUMNS, ALL_TAB_COLUMNS, and ALL_CONSTRAINTS views. If there is any good news about these atrocious numbers it’s that they probably won’t get worse as the process will always work on only two objects. The real numbers to watch will be those that work against the local and remote objects, EMPLOYEES in this example.
If you plan on using this comparison package I would strongly suggest you trace it a few times to check the overhead. I also fear that, in lieu of standard ETL processing, this package will be used. While the package is handy, depending on the size of the objects to compare, you might be better served to fine tune the processes that are responsible for the initial propagation of data between the two tables.