garysweaver / schema-sleuth

A Ruby script to connect to and spider a database, outputting related records based on value only, or the column name and value provided, or can selectively dump tables.

Schema Sleuth

A Ruby script (for *nix) that searches a database for related records, given a value or a column name and value. It recursively follows and outputs related rows. It attempts to avoid excessive recursion via a default related-record row limit of 1000 (without a limit it can be slow and may run out of memory), a default maximum related-table depth of 5, etc. It can also dump all data in specific tables, or in tables under a specified row count, which can be used along with diff to provide a selective table schema diff. A more complete and accurate tool for diffs and dumps is SchemaCrawler.
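As a rough illustration of the recursion and its two limits, here is a minimal sketch over an in-memory hash rather than a real database. The `follow` method, its parameters, and the sample data are hypothetical and are not taken from the script itself; it assumes rows are related when they share a column name and value.

```ruby
# Hypothetical in-memory "database": table name => array of row hashes.
tables = {
  'SAMPLE_TABLE_A' => [{ 'USER_ID' => '123', 'USER_NAME' => 'Joe' }],
  'SAMPLE_TABLE_B' => [{ 'USER_ID' => '123', 'VEHICLE_ID' => '234' },
                       { 'USER_ID' => '123', 'VEHICLE_ID' => '235' }],
  'SAMPLE_TABLE_C' => [{ 'VEHICLE_ID' => '234', 'VEHICLE_MAKE' => 'Kia' },
                       { 'VEHICLE_ID' => '235', 'VEHICLE_MAKE' => 'Kia' }]
}

# Collects rows reachable from (column, value) by following every column of
# each matching row. A path is not followed when it is deeper than max_depth
# or when a single lookup matches more than max_rows rows.
def follow(tables, column, value, found = Hash.new { |h, k| h[k] = [] },
           seen = {}, depth = 1, max_depth: 5, max_rows: 1000)
  return found if depth > max_depth || seen[[column, value]]
  seen[[column, value]] = true

  tables.each do |table, rows|
    matches = rows.select { |row| row[column] == value }
    next if matches.empty? || matches.size > max_rows

    matches.each do |row|
      next if found[table].include?(row)
      found[table] << row
      row.each do |col, val|
        follow(tables, col, val, found, seen, depth + 1,
               max_depth: max_depth, max_rows: max_rows)
      end
    end
  end
  found
end

result = follow(tables, 'USER_ID', '123')
result.each { |table, rows| puts "#{table}: #{rows.size} row(s)" }
```

Calling `follow(tables, 'USER_ID', '123', max_depth: 1)` in this sketch stops after the tables that contain USER_ID directly, which is why lowering the depth limit trims the output.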

Installation

Install Git and Ruby.

Install DBI and Trollop:

gem install dbi
gem install trollop

Clone this project.

cd ~
git clone http://github.com/garysweaver/schema-sleuth.git

Then add the following to your .bash_profile, or whatever you want:

#schema-sleuth
export PATH=$PATH:~/schema-sleuth

Restart Terminal.app or whatever, and then test by doing:

ssleuth -h

Usage

Finds related data in a database.

Examples:

MySQL:

ssleuth -r DBI:Mysql:TESTDB:localhost -u jdoe -p secret -c user_id -e 1234

Oracle:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234

Follow NEXT_USER_ID and PREV_USER_ID as if they were USER_ID, without displaying the progress indicator:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -q -x 'next_,prev_'
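The effect of -x can be pictured as a normalization step applied to column names before they are compared. The following is only a sketch of that idea; `normalize_column` is a hypothetical helper, not a function from the script.

```ruby
# Removes each comma-delimited partial (case-insensitively) from a column
# name and returns the upcased result, so that NEXT_USER_ID and PREV_USER_ID
# both compare equal to USER_ID.
def normalize_column(name, partials)
  partials.split(',').reduce(name.upcase) do |result, partial|
    result.gsub(partial.strip.upcase, '')
  end
end

%w[NEXT_USER_ID prev_user_id USER_ID].each do |column|
  puts "#{column} -> #{normalize_column(column, 'next_,prev_')}"
end
```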

Dump all data in tables (under maximum row count) to a list of inserts

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -e 1234 -q -i -a -m

Ignore tables ending in _LOG or _STAT

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -b ^\(?\!\(\(.*_LOG\)\|\(.*_STAT\)\)\)
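After the shell removes the backslashes, the pattern above is `^(?!((.*_LOG)|(.*_STAT)))`. Assuming the script applies it as an ordinary Ruby regular expression (an assumption, not confirmed here), the sketch below shows which names it keeps. Because the lookahead is not anchored with `$`, it also rejects names that merely contain _LOG or _STAT; the `strict` variant is a suggested alternative that rejects only names ending in those suffixes. The table names are made up for illustration.

```ruby
tables = %w[USERS USER_LOG DAILY_STAT AUDIT_LOG_ARCHIVE]

# Pattern from the example, as the script would receive it from the shell.
pattern = Regexp.new('^(?!((.*_LOG)|(.*_STAT)))')
kept = tables.select { |name| name =~ pattern }
puts "example pattern keeps: #{kept.join(', ')}"

# Suggested variant: only reject names that END in _LOG or _STAT.
strict = Regexp.new('^(?!.*(_LOG|_STAT)$)')
kept_strict = tables.select { |name| name =~ strict }
puts "anchored pattern keeps: #{kept_strict.join(', ')}"
```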

Custom:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD -u jdoe -p secret -c user_id -e 1234 -t "select TABLE_NAME from USER_TABLES" -n "select column_name FROM all_tab_cols where table_name = '\$TABLE_NAME'"

Usage:

       ssleuth [options]

where [options] are:

                    --driver-url, -r <s>:   DBI driver URL. e.g. DBI:OCI8://db.acme.org:1234/ACMETEST.WORLD
                          --user, -u <s>:   database username. e.g. jdoe
                      --password, -p <s>:   database password. e.g. secret
                   --column-name, -c <s>:   column name (case-insensitive). e.g. user_id
                         --value, -e <s>:   column value. e.g. 1234
             --table-names-query, -t <s>:   SQL query to use to get full listing of tables. e.g. "select TABLE_NAME from USER_TABLES"
            --column-names-query, -n <s>:   SQL query to use to get column names for a table. Substitute $TABLE_NAME for the table name. e.g. "select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'"
                  --generate-deletes, -l:   output as delete statements
                  --generate-inserts, -i:   output as insert statements
                  --generate-updates, -w:   output as update statements
                 --maximum-depth, -s <i>:   maximum depth (default: 5)
             --maximum-row-count, -m <i>:   maximum row count. Does not attempt to follow path if more than this many rows are found from a query. However, you can still get more than one row in result summaries, via multiple queries (default: 1000)
                             --likes, -k:   value comparisons done as likes rather than equals
--column-name-partials-to-remove, -x <s>:   comma-delimited list of (case-insensitive) text to remove from column names when following path. e.g. to make next_user_id and user_id match, you'd specify -x 'next_'
                 --table-pattern, -b <s>:   table names must match this regexp pattern to be included, otherwise they are excluded
                          --all-data, -a:   instead of following relationships, just output all table data
                             --quiet, -q:   quiet. does not output progress indicator
                             --debug, -d:   outputs debugging information
                           --version, -v:   Print version and exit
                              --help, -h:   Show this message
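The $TABLE_NAME placeholder in --column-names-query is a literal token that is replaced once per table. A minimal sketch of that substitution follows; `column_names_sql` is a hypothetical helper, not the script's own code. Note that on the command line the shell expands $TABLE_NAME inside double quotes, so the dollar sign has to be escaped or the query single-quoted for the literal token to reach the script.

```ruby
# Replaces the literal $TABLE_NAME token in a query template with a table name.
# The block form of gsub avoids special handling of backslashes in the name.
def column_names_sql(template, table_name)
  template.gsub('$TABLE_NAME') { table_name }
end

template = %q{select column_name FROM all_tab_cols where table_name = '$TABLE_NAME'}
puts column_names_sql(template, 'SAMPLE_TABLE_A')
```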

Examples

Using Schema Sleuth with Value Only

This would find any record with value “1234” in any column in any table in the database and use that as a starting point:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -e 1234

Using Schema Sleuth with a Provided Column Name and Value

This would find any record with value “1234” in any “USER_ID” column in any table in the database and use that as a starting point:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234

Doing a Custom Schema Diff

This would dump all tables under 10 million rows from two database schemas to db1.txt and db2.txt as insert statements. This may take a really long time. Then it would diff these two files and put the result into changes.txt:

ssleuth -r DBI:OCI8://db1.acme.org:2345/ACMETEST -u jdoe -p secret -a -q -i -m 10000000 > db1.txt && ssleuth -r DBI:OCI8://db2.acme.org:1234/ACMETEST -u jdoe -p secret -a -q -i -m 10000000 > db2.txt && diff db1.txt db2.txt > changes.txt

This would dump all tables under 10 million rows where the table names start with AUTO from two database schemas to db1.txt and db2.txt as insert statements. Then it would diff these two files and put the result into autochanges.txt:

ssleuth -r DBI:OCI8://db1.acme.org:2345/ACMETEST -u jdoe -p secret -a -q -i -b "AUTO(.*)" -m 10000000 > db1.txt && ssleuth -r DBI:OCI8://db2.acme.org:1234/ACMETEST -u jdoe -p secret -a -q -i -b "AUTO(.*)" -m 10000000 > db2.txt && diff db1.txt db2.txt > autochanges.txt

Using Schema Sleuth with Different Databases

MySQL:

ssleuth -r DBI:Mysql:TESTDB:localhost -u jdoe -p secret -c user_id -e 1234

Oracle:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234

Custom:

ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 1234 -t "select TABLE_NAME from ALL_ALL_TABLES" -n "select column_name FROM all_tab_cols where table_name = '\$TABLE_NAME'"

Sample Output

Searching for records related to column USER_ID and value ‘123’:

$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 123

.......

SAMPLE_TABLE_A

USER_ID, USER_NAME
------------
'123', 'Joe'

SAMPLE_TABLE_B

USER_ID, VEHICLE_ID
------------
'123', '234'
'123', '235'

SAMPLE_TABLE_C

VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID
------------
'234', 'Kia', 'Sorento', '345'
'235', 'Kia', 'Sportage', '346'

SAMPLE_TABLE_D

COLOR_ID, COLOR_NAME
------------
'345', 'Red'
'346', 'Green'

Output the same data as insert statements:

$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -c user_id -e 123 -i
.......

INSERT INTO SAMPLE_TABLE_A (USER_ID, USER_NAME) VALUES ('123', 'Joe');
INSERT INTO SAMPLE_TABLE_B (USER_ID, VEHICLE_ID) VALUES ('123', '234');
INSERT INTO SAMPLE_TABLE_B (USER_ID, VEHICLE_ID) VALUES ('123', '235');
INSERT INTO SAMPLE_TABLE_C (VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID) VALUES ('234', 'Kia', 'Sorento', '345');
INSERT INTO SAMPLE_TABLE_C (VEHICLE_ID, VEHICLE_MAKE, VEHICLE_MODEL, COLOR_ID) VALUES ('235', 'Kia', 'Sportage', '346');
INSERT INTO SAMPLE_TABLE_D (COLOR_ID, COLOR_NAME) VALUES ('345', 'Red');
INSERT INTO SAMPLE_TABLE_D (COLOR_ID, COLOR_NAME) VALUES ('346', 'Green');
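The insert output above has a simple shape: table name, column list, then every value quoted as a string. A sketch of how one such line could be built from a row is shown here; `insert_statement` is a hypothetical helper, and doubling embedded single quotes is an assumption about escaping rather than documented behavior of the script.

```ruby
# Formats one row (column name => value) as an INSERT statement in the same
# layout as the sample output. Embedded single quotes are doubled (assumption).
def insert_statement(table, row)
  columns = row.keys.join(', ')
  values = row.values.map { |v| "'#{v.to_s.gsub("'", "''")}'" }.join(', ')
  "INSERT INTO #{table} (#{columns}) VALUES (#{values});"
end

puts insert_statement('SAMPLE_TABLE_A', { 'USER_ID' => '123', 'USER_NAME' => 'Joe' })
```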

Find all records containing ‘1234’:

$ ssleuth -r DBI:OCI8://db.acme.org:1234/ACMETEST -u jdoe -p secret -e 1234

SAMPLE_TABLE_E

ITEM_ID, AMOUNT
------------
'12345', '1234'

SAMPLE_TABLE_F

BUILDING_ID, IDENT_ID, DESCRIPTION
------------
'15', '2424', '1234'
'1234', '12', 'Green'

SAMPLE_TABLE_G

VEHICLE_ID, CUSTOMER_ID
------------
'1234', '2424'
'1234', '3555'

Troubleshooting

Specify -d to debug. If debug is not specified, you can still see progress via the dots and other characters printed. A ‘.’ means that data was retrieved. An ‘M’ indicates that the maximum-row-count was exceeded. An ‘O’ indicates that the maximum-depth was exceeded. If the script encounters an error, it displays the error in full.

License

Copyright © 2010-2011 Gary S. Weaver, released under the MIT license.
