Scanning two databases for differences
Step-by-step instructions to scan two databases for data differences. This assumes that rubyrep has already been installed.
Step 1: Generate a basic configuration file
This is done with command
rubyrep generate myrubyrep.conf
This will generate a configuration file skeleton like this:
RR::Initializer::run do |config|
config.left = {
:adapter => 'postgresql', # or 'mysql'
:database => 'SCOTT',
:username => 'scott',
:password => 'tiger',
:host => '172.16.1.1'
}
config.right = {
:adapter => 'postgresql',
:database => 'SCOTT',
:username => 'scott',
:password => 'tiger',
:host => '172.16.1.2'
}
config.include_tables 'dept'
config.include_tables /^e/ # regexp matches all tables starting with e
# config.include_tables /./ # regexp matches all tables
end
Please note that this is valid ruby code. So you can do whatever you want.
Step 2: Specify how to connect to both databases
rubyrep always works on two databases known as “left” and “right”. Above example should be fairly self-explaining…
Special encodings
If your database uses special encodings, you might want to specify them also in the connection paramters. (Apparently especially necessary with MySQL.)
Use the :encoding
parameter for this.
Refer to the according explanations for PostreSQL and MySQL.
PostgreSQL schema support
If you are using PostgreSQL and your tables do not reside in the ‘public’ schema, add :schema_search_path
to the database connection parameters. Example:
:schema_search_path => 'accounting'
.
Refer to the ActiveRecord documentation for more background information.
Step 3: Configure which tables to scan
Use the include_tables
option to specify which tables should be covered.
Different forms are possible:
- single table: just specify the table name,
e. g.
config.include_tables 'dept'
- multiple tables: specify any regular expression,
e. g. all tables starting with ‘e’:
config.include_tables /^e/
- match tables that are named differently in left and right database:
e. g. table ‘dept’ in left and table ‘dept_backup’ in right database:
config.include_tables 'dept, dept_backup'
Additionally it is possible to exclude tables. Excludes always trump includes. Example: To match all tables except those ending with ‘backup’:
config.include_tables /./
config.exclude_tables /backup$/
Step 4: Run the scan
To run the scan, execute: rubyrep scan -c myrubyrep.conf
This will scan all configured tables.
It is possible, to overwrite which tables to scan.
E. g. to scan ‘dept’ and all tables starting with ‘e’:
rubyrep scan -c myrubyrep.conf dept /^e/
A scan with default options will produce output like this:
dept 100% ......................... 0
emp 100% ......................... 1
Here the ‘dept’ table is fully in sync, while table ‘emp’ has one difference. (The dots and percentage values are printed during the scan to show the progress.)
Appendix: Further Information
A scan can not only print scarce summaries as shown above. Refer to the scan command reference to see how rubyrep can show more detailed scan summaries as well as actual row differences.
Go to the configuration reference to see a list of all available configuration options.