sadpdtchr / db_converter

Tool for Database Migrations

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

About db_converter

Build Status Coverage Status GitHub license GitHub release Gitter Platforms Python

db_converter is an open-source database migration tool for PostgreSQL 9.6+ designed for high-loaded installations.

With db_converter complex tasks become easier.

Table of contents

Introduction

The primary goal of db_converter is to simplify the database conversion (migration) process as much as possible while maintaining flexibility and functionality.

Tasks that can be solved using db_converter:

The key features are:

  • Only plain SQL scripts with placeholders
  • Parallel processing of several databases
  • Handling of the locks to avoid impact on the regular workload

Dependencies and installation

Python 3.x with modules: sqlparse, requests, pyzipper

yum install -y python38    # if Python 3.x is not installed
# if pip is not installed
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3.8 get-pip.py
pip3.8 install sqlparse
pip3.8 install requests
pip3.8 install pyzipper

Built-in module py-postgresql.

Fast start

Download and run container:

docker pull masterlee998/db_converter:dbc_pg13
docker run --name dbc -d masterlee998/db_converter:dbc_pg13
docker exec -it dbc bash

How to install and run

First, needs to install python and modules (see the section above).

Next, clone db_converter from GitHub:

git clone https://github.com/masterlee998/db_converter.git
cd db_converter
python3 db_converter.py --version
>> Version 1.3

Prepare db_converter.conf:

mv conf/db_converter.conf.example conf/db_converter.conf
# set connection credentials to the database
test_conn='pq:\/\/some_user:password@127.0.0.1:5400\/test_db_1'
sudo sed -ie "s/^test_db_1.*/dbc = $test_conn/" conf/db_converter.conf

Run dba_get_conf packet (just displays basic DB configuration options):

# run read-only packet
python3 db_converter.py \
	--packet-name=dba_get_conf \
	--db-name=dbc

	# Info: =====> DBC 1.0 started
	# Info: =====> Hold lock for packet dba_get_conf in DB dbc
	# Info: Thread 'lock_observer_dbc' runned! Observed pids: []
	# Info: --------> Packet 'dba_get_conf' started for 'dbc' database!
	# Info: lock_observer_dbc: iteration done. Sleep on 5 seconds...
	# Info: Thread 'lock_observer_dbc': Observed pids: []
	# Info: Thread 'ro_manager_db_dbc', DB 'dbc', PID 24160, Packet 'dba_get_conf', ...
	# Info:
	# ---------------------------------------------------------------------------
	# | name                         | value   | pretty_value | boot_val | unit |
	# ---------------------------------------------------------------------------
	# | autovacuum_max_workers       | 3       |              | 3        | None |
	# | autovacuum_naptime           | 60      |              | 60       | s    |
	# ....
	# Info: <-------- Packet 'dba_get_conf' finished for 'dbc' database!
	# Info: Thread lock_observer_dbc finished!
	# Info: <===== DBC 1.0 finished


# run in background
nohup python3 db_converter.py \
	--packet-name=my_packet \
	--db-name=db01
    > /dev/null 2>&1 &

tail -f log/dbc_db01_my_packet.log

# run all tests
python3 tests/test_packets.py -v
# run specific test
python3 tests/test_packets.py -v TestDBCLock

Terminology

Packet - is a package of changes (a directory with SQL files) that apply to the specified database. Packet contains meta_data.json (an optional file with meta-information describing the package) and several SQL files in XX_step.sql format.

Step - is a SQL file, the contents of which are executed in one transaction, and containing the following types of commands:

  • DDL (Data Definition Language) - CREATE, DROP, ALTER, TRUNCATE, COMMENT, RENAME
  • DML (Data Manipulation Language) - SELECT, INSERT, UPDATE, DELETE
  • DCL (Data Control Language) - GRAND, REVOKE

Action - is a transaction formed on the basis of step. If the step does not have a generator, then it creates one action. If the step has a generator, then several transactions will be generated.

Generator - is a SQL file associated with some step by index number. If there is a generator, the step contains placeholders for substituting the values returned by the generator (for more details see the "Generators and Placeholders" section).

Conversion (migration, deployment) - is a transformation of the database structure according to the specified package of changes.

When executing Packet, SQL files are applied to the specified database sequentially by the index.

Usage modes

db_convertrer works in the following modes:

  • List all target databases according --db-name mask if the --list key is specified

  • Perform deployment - deploy the specified packet to the target database --db-name

  • Perform force deployment - forced deployment if the --force key is specified - ignore the difference between hashes of a packet at the time of repeated execution and at the time of the first launch

  • Perform sequential deployment if the --seq key is specified, then parallel execution is disabled (if several databases are selected), and all databases are processed sequentially according to the selected list. db_converter can process several databases in parallel. The possibility of parallelizing the conversion of one database does not make sense.

  • Check packet status - display packet status if the --status key is specified

  • Wipe packet deployment history if the --wipe key is specified. Wipe means delete from dbc_ * tables. Removing information about an installed package can be used for debugging purposes.

  • Unlock unexpectedly aborted deployment if the --unlock key is specified

  • Stop all active transactions of unexpectedly aborted deployment if the --stop key is specified. It this mode, all active connections will be terminated matching with application_name (specified in the db_converter.conf configuration file) + "_" + --packet-name

  • Use template packet - copy *.sql files from packets/templates/template to packets/packet-name if the --template key is specified

Auxiliary deployment modes also provided:

  • Skip the whole step on the first error like Deadlock, QueryCanceledError if the --skip-step-cancel key is specified

  • Skip action errors like Deadlock, QueryCanceledError if the --skip-action-cancel key is specified

In all deployment modes, two parameters are mandatory:

  • --db-name - a name of directory located in packets

  • --packet-name - a name of one database or a comma-separated list of databases, or ALL to automatically substitute all databases listed in db_converter.conf

Documentation

Roadmap

  • Synchronization of packets between a source and target databases
  • Query bot in mattermost: running packets from the browser instead of command line
  • Web interface (command line replacement, collaborative work)
    • scheduling periodic tasks to run
    • packets development
    • testing
    • approving
    • delivery to production
    • viewing logs and deployment statuses

Support and contributions

Bug reports and new features are appreciated and may be filed through the issue tracker.

Feel free to get technical support in Gitter.

About

Tool for Database Migrations

License:MIT License


Languages

Language:Python 97.5%Language:PLpgSQL 2.0%Language:Shell 0.2%Language:Dockerfile 0.2%Language:Makefile 0.1%