timescaledb-parallel-copy is a command-line program for parallelizing PostgreSQL's built-in COPY functionality for bulk inserting data into TimescaleDB.
You need the Go runtime (1.6+) installed. Then fetch and build this repo with go get:
$ go get github.com/timescale/timescaledb-parallel-copy/cmd/timescaledb-parallel-copy
Before using this program to bulk insert data, the TimescaleDB extension must be installed in your database, and the target table must already be a hypertable.
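As a rough sketch of that preparation step, the snippet below prints SQL that enables the extension and turns a table into a hypertable. The column layout (`time`, `device_id`, `value`) and the helper name `hypertable_sql` are assumptions made for illustration; only `CREATE EXTENSION` and `create_hypertable` come from PostgreSQL and TimescaleDB. Adjust the schema to match the columns in your CSV file.

```shell
#!/bin/sh
# Hypothetical helper: prints SQL that prepares a table for bulk loading.
# $1 = table name, $2 = name of the time column (both chosen by you).
hypertable_sql() {
  table="$1"
  time_col="$2"
  cat <<EOF
CREATE EXTENSION IF NOT EXISTS timescaledb;
CREATE TABLE ${table} (
  ${time_col} TIMESTAMPTZ NOT NULL,
  device_id TEXT,
  value DOUBLE PRECISION
);
SELECT create_hypertable('${table}', '${time_col}');
EOF
}

# Print the SQL for review. Once it looks right, pipe it to psql, e.g.:
#   hypertable_sql sample time | psql -d test
hypertable_sql sample time
```

Printing the SQL rather than executing it keeps the sketch safe to run anywhere; the commented `psql` line shows one way to apply it to the `test` database used in the examples.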
If you want to bulk insert data from a file named foo.csv into a (hyper)table named sample in a database called test:
# single-threaded
$ timescaledb-parallel-copy --db-name test --table sample --file foo.csv
# 2 workers
$ timescaledb-parallel-copy --db-name test --table sample --file foo.csv \
--workers 2
# 2 workers, report progress every 30s
$ timescaledb-parallel-copy --db-name test --table sample --file foo.csv \
--workers 2 --reporting-period 30s
# Treat literal string 'NULL' as NULLs:
$ timescaledb-parallel-copy --db-name test --table sample --file foo.csv \
--copy-options "NULL 'NULL' CSV"
Other options and flags are also available. Run timescaledb-parallel-copy --help for more information.
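The invocations above can be wrapped in a small script so the same flags are reused across loads. The following is a minimal sketch, not part of the tool: the function name `build_copy_cmd` is hypothetical, and defaulting the worker count to the number of online CPUs is only an assumed starting point, not a recommendation from this project. It uses only the flags shown in the examples above and prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper: assemble a timescaledb-parallel-copy command line.
# $1 = database, $2 = table, $3 = CSV file, $4 = worker count (optional).
build_copy_cmd() {
  db="$1"
  table="$2"
  file="$3"
  # Assumption: default to the number of online CPUs; fall back to 1.
  workers="${4:-$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)}"
  printf 'timescaledb-parallel-copy --db-name %s --table %s --file %s --workers %s --reporting-period 30s\n' \
    "$db" "$table" "$file" "$workers"
}

# Dry run: print the command so it can be reviewed before execution.
build_copy_cmd test sample foo.csv 4
```

To execute the load rather than print it, the output can be passed to the shell, for example `build_copy_cmd test sample foo.csv 4 | sh`. The best worker count depends on your database server and disk, so treat the default as something to tune.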
We welcome contributions to this utility, which, like TimescaleDB, is released under the Apache 2 open source license. The same contributor agreement applies; if you're a new contributor, please sign the Contributor License Agreement (CLA).