Schema Sleuth¶ ↑

A ruby script (for *nix) to search a database for related records based given a value or a column name and value. Recursively follows and outputs related rows. Attempts to avoid excessive recursion via related record row limit default of 1000 (otherwise can be slow and may run out of memory), a default max related table depth level of 5, etc. Can also output/dump all data in specific tables or tables under a specified row count, which can be used along with diff to provide a selective table schema diff. A more complete and accurate tool for diffs and dumps is SchemaCrawler.

Installation¶ ↑

Install Git and Ruby.

Install RBI and Trollop:

gem install dbi
gem install trollop

Clone this project.

cd ~
git clone http://github.com/garysweaver/schema-sleuth.git

Then add the following to your .bash_profile, or whatever you want:

#schema-sleuth
export PATH=$PATH:~/schema-sleuth

Restart Terminal.app or whatever, and then test by doing:

ssleuth -h

Usage¶ ↑

Finds related data in a database.

Examples:

MySQL:

ssleuth -r DBI:Mysql:TESTDB:localhost -u jdoe -p secret -c user_id -e 1234

Oracle:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234

Follow NEXT_USER_ID and PREV_USER_ID as if were USER_ID without displaying progress indicator:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -q -x 'next_,prev_'

Dump all data in tables (under maximum row count) to a list of inserts

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -e 1234 -q -i -a -m

Ignore tables ending in _LOG or _STAT

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -b ^\(?\!\(\(.*_LOG\)\|\(.*_STAT\)\)\)

Custom:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -t "select TABLE_NAME from USER_TABLES" -n "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'"

Usage:

       ssleuth [options]

where [options] are:

                    --driver-url, -r <s>:   DBI driver URL. e.g. DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD
                          --user, -u <s>:   database username. e.g. jdoe
                      --password, -p <s>:   database password. e.g. secret
                   --column-name, -c <s>:   column name (case-insensitive). e.g. user_id
                         --value, -e <s>:   column value. e.g. 1234
             --table-names-query, -t <s>:   SQL query to use to get full listing of tables. e.g. "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'"
            --column-names-query, -n <s>:   SQL query to use to get column names for a table. Substitute $TABLE_NAME for the table name. e.g. "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'"
                  --generate-deletes, -l:   output as delete statements
                  --generate-inserts, -i:   output as insert statements
                  --generate-updates, -w:   output as update statements
                 --maximum-depth, -s <i>:   maximum depth (default: 5)
             --maximum-row-count, -m <i>:   maximum row count. Does not attempt to follow path if more than this many rows are found from a query. However, you can still get more than one row in result summaries, via multiple queries (default: 1000)
                             --likes, -k:   value comparisons done as likes rather than equals
--column-name-partials-to-remove, -x <s>:   comma-delimited list of (case-insensitive) text to remove from column names when following path. e.g. to make next_user_id and user_id match, you'd specify -x 'next_'
                 --table-pattern, -b <s>:   table names must match this regexp pattern be included, otherwise are excluded
                          --all-data, -a:   instead of following relationships, just output all table data
                             --quiet, -q:   quiet. does not output progress indicator
                             --debug, -d:   outputs debugging information
                           --version, -v:   Print version and exit
                              --help, -h:   Show this message

Examples¶ ↑

Using Schema Sleuth with Value Only¶ ↑

This would find any record with value “1234” in any column in any table in the database and use that as a starting point:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -e 1234

Using Schema Sleuth with a Provided Column Name and Value¶ ↑

This would find any record with value “1234” in any “USER_ID” column in any table in the database and use that as a starting point:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234

Doing a Custom Schema Diff¶ ↑

This would dump all tables under 10 million rows from two database schemas to db1.txt and db2.txt as insert statements. This may take a really long time. Then it would diff these two files and put the result into changes.txt:

ssleuth -r DBI:OCI8://db1.acme.org:2345/ACMETEST -u jdoe -p secret -a -q -i -m 10000000 > db1.txt && ssleuth -r DBI:OCI8://db2.acme.org:1234/ACMETEST -u jdoe -p secret -a -q -i -m 10000000 > db2.txt && diff db1.txt db2.txt > changes.txt

This would dump all tables under 10 million rows where the table names start with AUTO from two database schemas to db1.txt and db2.txt as insert statements. Then it would diff these two files and put the result into autochanges.txt:

ssleuth -r DBI:OCI8://db1.acme.org:2345/ACMETEST -u jdoe -p secret -a -q -i -b "AUTO(.*)" -m 10000000 > db1.txt && ssleuth -r DBI:OCI8://db2.acme.org:1234/ACMETEST -u jdoe -p secret -a -q -i -b "AUTO(.*)" -m 10000000 > db2.txt && diff db1.txt db2.txt > autochanges.txt

Using Schema Sleuth with Different Databases¶ ↑

MySQL:

ssleuth -r DBI:Mysql:TESTDB:localhost -u jdoe -p secret -c user_id -e 1234

Oracle:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234

Custom:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234 -t "select TABLE_NAME from ALL_ALL_TABLES" -n "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'"

Sample Output¶ ↑

Searching for records related to column USER_ID and value ‘123’:

$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 123

.......

SAMPLE_TABLE_A

USER_ID, USER_NAME
------------
'123', 'Joe'

SAMPLE_TABLE_B

USER_ID, VEHICLE_ID
------------
'123', '234'
'123', '235'

SAMPLE_TABLE_C

VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID
------------
'234', 'Kia', 'Sorento', '345'
'235', 'Kia', 'Sportage', '346'

SAMPLE_TABLE_D

COLOR_ID, COLOR_NAME
------------
'345', 'Red'
'346', 'Green'

Output the same data as insert statements:

$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 123 -i
.......

INSERT INTO SAMPLE_TABLE_A (USER_ID, USER_NAME) VALUES ('123', 'Joe');
INSERT INTO SAMPLE_TABLE_B (USER_ID, VEHICLE_ID) VALUES ('123', '234');
INSERT INTO SAMPLE_TABLE_B (USER_ID, VEHICLE_ID) VALUES ('123', '235');
INSERT INTO SAMPLE_TABLE_C (VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID) VALUES ('234', 'Kia', 'Sorento', '345');
INSERT INTO SAMPLE_TABLE_C (VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID) VALUES ('235', 'Kia', 'Sportage', '346');
INSERT INTO SAMPLE_TABLE_D (COLOR_ID, COLOR_NAME) VALUES ('345', 'Red');
INSERT INTO SAMPLE_TABLE_D (COLOR_ID, COLOR_NAME) VALUES ('346', 'Green');

Find all records containing ‘1234’:

$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -e 1234

SAMPLE_TABLE_E

ITEM_ID, AMOUNT
------------
'12345', '1234'

SAMPLE_TABLE_F

BUILDING_ID, IDENT_ID, DESCRIPTION
------------
'15', '2424', '1234'
'1234', '12', 'Green'

SAMPLE_TABLE_G

VEHICLE_ID, CUSTOMER_ID
------------
'1234', '2424'
'1234', '3555'

Troubleshooting¶ ↑

Specify -d to debug. If debug is not specified, you can see still see progress via the dots, etc. printed. A ‘.’ means that data was retrieved. A ‘M’ indicates that the maximum-row-count was exceeded. A ‘O’ indicates that the maximum-depth was exceeded. If it encounters an error, it will display it in full glory.

garysweaver / schema-sleuth