michaelhood / timescaledb-parallel-copy

A binary for parallel copying of CSV data into a TimescaleDB hypertable

Home Page: https://www.timescale.com/


timescaledb-parallel-copy

timescaledb-parallel-copy is a command line program for parallelizing PostgreSQL's built-in COPY functionality for bulk inserting data into TimescaleDB.

Getting started

You need the Go runtime (1.6+) installed. Then go get this repo:

$ go get github.com/timescale/timescaledb-parallel-copy/cmd/timescaledb-parallel-copy

Before using this program to bulk insert data, make sure the TimescaleDB extension is installed in your database and the target table has already been made a hypertable.

Using timescaledb-parallel-copy

To bulk insert data from a file named foo.csv into a (hyper)table named sample in a database called test:

# single-threaded
$ timescaledb-parallel-copy --db-name test --table sample --file foo.csv

# 2 workers
$ timescaledb-parallel-copy --db-name test --table sample --file foo.csv \
    --workers 2

# 2 workers, report progress every 30s
$ timescaledb-parallel-copy --db-name test --table sample --file foo.csv \
    --workers 2 --reporting-period 30s

# Treat literal string 'NULL' as NULLs:
$ timescaledb-parallel-copy --db-name test --table sample --file foo.csv \
    --copy-options "NULL 'NULL' CSV"

Other options and flags are also available. Run timescaledb-parallel-copy --help for more information.
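To give a rough idea of what running with several workers means, here is a minimal Go sketch of the general fan-out pattern: input rows are grouped into batches, and each worker pulls batches from a shared channel and "copies" them. This is an illustration under stated assumptions, not the tool's actual source. The names copyInParallel and copyBatch and the batchSize parameter are hypothetical, and copyBatch only stands in for a real COPY ... FROM STDIN issued on a worker's own database connection.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"sync/atomic"
)

// copyInParallel splits lines into batches of at most batchSize rows and
// hands them to `workers` goroutines over a channel. It returns the total
// number of rows the workers reported as copied.
//
// copyBatch is a hypothetical stand-in for a COPY ... FROM STDIN call; in a
// real program each worker would hold its own database connection.
func copyInParallel(lines []string, workers, batchSize int, copyBatch func(batch []string) int) int64 {
	if workers < 1 {
		workers = 1 // at least one worker, otherwise nothing drains the channel
	}
	if batchSize < 1 {
		batchSize = 1 // guard against a non-advancing loop below
	}

	batches := make(chan []string)
	var total int64
	var wg sync.WaitGroup

	// Start the workers; each one drains the channel until it is closed.
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for batch := range batches {
				atomic.AddInt64(&total, int64(copyBatch(batch)))
			}
		}()
	}

	// Producer: slice the input into batches and send them to the workers.
	for start := 0; start < len(lines); start += batchSize {
		end := start + batchSize
		if end > len(lines) {
			end = len(lines)
		}
		batches <- lines[start:end]
	}
	close(batches)
	wg.Wait()

	return atomic.LoadInt64(&total)
}

func main() {
	// Made-up sample rows standing in for the contents of foo.csv.
	csv := "2024-01-01,a,1\n2024-01-02,b,2\n2024-01-03,c,3\n2024-01-04,d,4\n2024-01-05,e,5"
	lines := strings.Split(csv, "\n")

	// Pretend to copy: report every row in the batch as written.
	copied := copyInParallel(lines, 2, 2, func(batch []string) int {
		return len(batch)
	})
	fmt.Printf("copied %d rows\n", copied)
}
```

With this pattern, rows are not guaranteed to be written in file order, because whichever worker is free takes the next batch.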

Contributing

We welcome contributions to this utility, which, like TimescaleDB, is released under the Apache 2.0 open source license. The same Contributors Agreement applies; if you're a new contributor, please sign the Contributor License Agreement (CLA).

License: Apache License 2.0


Languages

Language: Go 100.0%