About db_converter
db_converter
is an open-source database migration tool for PostgreSQL 9.6+ designed for high-loaded installations.
With db_converter
complex tasks become easier.
Table of contents
- Introduction
- Dependencies and installation
- Fast start
- How to install and run
- Terminology
- Usage modes
- Documentation
- Roadmap
- Support and contributions
Introduction
The primary goal of db_converter
is to simplify the database conversion (migration) process as much as possible while maintaining flexibility and functionality.
Tasks that can be solved using db_converter
:
- Transactional modification of data of any volume
- Database structure changing with locks control
- System and application notifications via
mattermost
(or any other messenger) - Database maintenance (deleting old data, creating new schemas, etc.)
- Export data in
CSV
format into an encrypted archive
The key features are:
- Only plain SQL scripts with placeholders
- Parallel processing of several databases
- Handling of the locks to avoid impact on the regular workload
Dependencies and installation
Python 3.x with modules: sqlparse
, requests
, pyzipper
yum install -y python38 # if Python 3.x is not installed
# if pip is not installed
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3.8 get-pip.py
pip3.8 install sqlparse
pip3.8 install requests
pip3.8 install pyzipper
Built-in module py-postgresql.
Fast start
Download and run container:
docker pull masterlee998/db_converter:dbc_pg13
docker run --name dbc -d masterlee998/db_converter:dbc_pg13
docker exec -it dbc bash
How to install and run
First, needs to install python and modules (see the section above).
Next, clone db_converter
from GitHub:
git clone https://github.com/masterlee998/db_converter.git
cd db_converter
python3 db_converter.py --version
>> Version 1.3
Prepare db_converter.conf
:
mv conf/db_converter.conf.example conf/db_converter.conf
# set connection credentials to the database
test_conn='pq:\/\/some_user:password@127.0.0.1:5400\/test_db_1'
sudo sed -ie "s/^test_db_1.*/dbc = $test_conn/" conf/db_converter.conf
Run dba_get_conf
packet (just displays basic DB configuration options):
# run read-only packet
python3 db_converter.py \
--packet-name=dba_get_conf \
--db-name=dbc
# Info: =====> DBC 1.0 started
# Info: =====> Hold lock for packet dba_get_conf in DB dbc
# Info: Thread 'lock_observer_dbc' runned! Observed pids: []
# Info: --------> Packet 'dba_get_conf' started for 'dbc' database!
# Info: lock_observer_dbc: iteration done. Sleep on 5 seconds...
# Info: Thread 'lock_observer_dbc': Observed pids: []
# Info: Thread 'ro_manager_db_dbc', DB 'dbc', PID 24160, Packet 'dba_get_conf', ...
# Info:
# ---------------------------------------------------------------------------
# | name | value | pretty_value | boot_val | unit |
# ---------------------------------------------------------------------------
# | autovacuum_max_workers | 3 | | 3 | None |
# | autovacuum_naptime | 60 | | 60 | s |
# ....
# Info: <-------- Packet 'dba_get_conf' finished for 'dbc' database!
# Info: Thread lock_observer_dbc finished!
# Info: <===== DBC 1.0 finished
# run in background
nohup python3 db_converter.py \
--packet-name=my_packet \
--db-name=db01
> /dev/null 2>&1 &
tail -f log/dbc_db01_my_packet.log
# run all tests
python3 tests/test_packets.py -v
# run specific test
python3 tests/test_packets.py -v TestDBCLock
Terminology
Packet - is a package of changes (a directory with SQL files) that apply to the specified database. Packet contains meta_data.json
(an optional file with meta-information describing the package) and several SQL files in XX_step.sql
format.
Step - is a SQL file, the contents of which are executed in one transaction, and containing the following types of commands:
- DDL (Data Definition Language) - CREATE, DROP, ALTER, TRUNCATE, COMMENT, RENAME
- DML (Data Manipulation Language) - SELECT, INSERT, UPDATE, DELETE
- DCL (Data Control Language) - GRAND, REVOKE
Action - is a transaction formed on the basis of step
. If the step
does not have a generator
, then it creates one action
. If the step
has a generator
, then several transactions will be generated.
Generator - is a SQL file associated with some step
by index number. If there is a generator
, the step
contains placeholders for substituting the values returned by the generator
(for more details see the "Generators and Placeholders" section).
Conversion (migration, deployment) - is a transformation of the database structure according to the specified package of changes.
When executing Packet
, SQL files are applied to the specified database sequentially by the index.
Usage modes
db_convertrer
works in the following modes:
-
List all target databases according
--db-name
mask if the--list
key is specified -
Perform deployment - deploy the specified
packet
to the target database--db-name
-
Perform force deployment - forced deployment if the
--force
key is specified - ignore the difference between hashes of apacket
at the time of repeated execution and at the time of the first launch -
Perform sequential deployment if the
--seq
key is specified, then parallel execution is disabled (if several databases are selected), and all databases are processed sequentially according to the selected list. db_converter can process several databases in parallel. The possibility of parallelizing the conversion of one database does not make sense. -
Check packet status - display
packet
status if the--status
key is specified -
Wipe packet deployment history if the
--wipe
key is specified. Wipe means delete fromdbc_ *
tables. Removing information about an installed package can be used for debugging purposes. -
Unlock unexpectedly aborted deployment if the
--unlock
key is specified -
Stop all active transactions of unexpectedly aborted deployment if the
--stop
key is specified. It this mode, all active connections will be terminated matching withapplication_name
(specified in thedb_converter.conf
configuration file) +"_"
+--packet-name
-
Use template packet - copy
*.sql
files frompackets/templates/template
topackets/packet-name
if the--template
key is specified
Auxiliary deployment modes also provided:
-
Skip the whole step on the first error like
Deadlock
,QueryCanceledError
if the--skip-step-cancel
key is specified -
Skip action errors like
Deadlock
,QueryCanceledError
if the--skip-action-cancel
key is specified
In all deployment modes, two parameters are mandatory:
-
--db-name
- a name of directory located inpackets
-
--packet-name
- a name of one database or a comma-separated list of databases, orALL
to automatically substitute all databases listed indb_converter.conf
Documentation
- Files layout and configuration
- Command-line parameters
- Action tracker
- meta_data.json description
- Transactional processing of objects and data
- Threads
- Internals
- Use cases
Roadmap
- Synchronization of packets between a source and target databases
- Query bot in mattermost: running packets from the browser instead of command line
- Web interface (command line replacement, collaborative work)
-
- scheduling periodic tasks to run
-
- packets development
-
- testing
-
- approving
-
- delivery to production
-
- viewing logs and deployment statuses
Support and contributions
Bug reports and new features are appreciated and may be filed through the issue tracker.
Feel free to get technical support in Gitter.