This utility performs a full scan of an AWS DynamoDB table and writes each item as a JSON object, storing the dump either on disk or in an S3 bucket.
It rate-limits itself to a specified average read or write capacity and issues parallel requests to achieve high throughput.
The underlying Go library can also be imported into other projects to provide dump/load facilities.
Binaries for the latest release are available for Linux, Solaris, Windows and macOS.
Install Go and run:

go get github.com/gwatts/dyndump
AWS credentials for connecting to DynamoDB can be supplied via environment variables; if they are not set, they will be loaded from ~/.aws/credentials or from EC2 instance metadata. The following environment variables are used:

* AWS_REGION
* AWS_ACCESS_KEY_ID
* AWS_SECRET_ACCESS_KEY
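For example, to supply credentials explicitly from the shell before running a dump (the key values and table name below are placeholders):

export AWS_REGION=us-east-1
export AWS_ACCESS_KEY_ID=&lt;your access key id&gt;
export AWS_SECRET_ACCESS_KEY=&lt;your secret key&gt;
dyndump dump -f mytable.dmp mytable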
The dyndump program supports four commands: dump, load, info and delete.
Dumps an entire DynamoDB table to a file or an S3 bucket.
Usage: dyndump dump [--silent] [--no-progress] [-cmpr] [--filename | --stdout] [(--s3-bucket --s3-prefix)] TABLENAME
Dump a table to file or S3
Arguments:
TABLENAME="" Table name to dump from Dynamo
Options:
-c, --consistent-read=false Enable consistent reads (at 2x capacity use)
-f, --filename="" Filename to write data to.
--stdout=false If true then send the output to stdout
-m, --maxitems=0 Maximum number of items to dump. Set to 0 to process all items
-p, --parallel=5 Number of concurrent channels to open to DynamoDB
-r, --read-capacity=5 Average aggregate read capacity to use for scan (set to 0 for unlimited)
--s3-bucket="" S3 bucket name to upload to
--s3-prefix="" Path prefix to use to store data in S3 (eg. "backups/2016-04-01-12:25-")
--silent=false Set to true to disable all non-error output
--no-progress=false Set to true to disable the progress bar
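For example, the following command (bucket and table names are placeholders) dumps the table mytable to S3, limiting the scan to an average read capacity of 10:

dyndump dump --s3-bucket my-backup-bucket --s3-prefix backups/2016-04-01-12:25- -r 10 mytable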
Loads a previous dump from a file or S3 into an existing DynamoDB table.
Usage: dyndump load [--silent] [--no-progress] [-mpw] (--filename | --stdin | (--s3-bucket --s3-prefix)) TABLENAME
Load a table dump from S3 or file to a DynamoDB table
Arguments:
TABLENAME="" Table name to load into
Options:
--allow-overwrite=false Set to true to overwrite any existing rows
-f, --filename="" Filename to read data from. Set to "-" for stdin
--stdin=false If true then read the dump data from stdin
-m, --maxitems=0 Maximum number of items to load. Set to 0 to process all items
-p, --parallel=4 Number of concurrent channels to open to DynamoDB
-w, --write-capacity=5 Average aggregate write capacity to use for load (set to 0 for unlimited)
--s3-bucket="" S3 bucket name to read from
--s3-prefix="" Path prefix to use to read data from S3 (eg. "backups/2016-04-01-12:25-")
--silent=false Set to true to disable all non-error output
--no-progress=false Set to true to disable the progress bar
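For example, the following command (bucket and table names are placeholders) loads a dump from S3 back into the table mytable, overwriting any rows that already exist:

dyndump load --s3-bucket my-backup-bucket --s3-prefix backups/2016-04-01-12:25- --allow-overwrite mytable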
Retrieves and displays metadata about a dump stored in S3.
Usage: dyndump info --s3-bucket --s3-prefix
Display backup metadata from an S3 backup
Options:
--s3-bucket="" S3 bucket name to read from
--s3-prefix="" Path prefix to use to read data from S3 (eg. "backups/2016-04-01-12:25-")
Deletes an entire dump from S3 matching a specified prefix.
Usage: dyndump delete [--silent] [--no-progress] --s3-bucket --s3-prefix [--force]
Delete a backup from S3
Options:
--s3-bucket="" S3 bucket name to delete from
--s3-prefix="" Path prefix to use to delete from S3 (eg. "backups/2016-04-01-12:25-")
--force=false Set to true to disable the delete prompt
--silent=false Set to true to disable all non-error output
--no-progress=false Set to true to disable the progress bar
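For example, the following command (placeholder bucket name) deletes a backup without prompting for confirmation:

dyndump delete --s3-bucket my-backup-bucket --s3-prefix backups/2016-04-01-12:25- --force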
JSON is emitted as a stream of objects, one per item, in the canonical format used by the DynamoDB API. Each object has one key per field name, whose value is an object holding the field's type and value. For example:
{
"string-field": {"S": "string value"},
"number-field": {"N": "123"}
}
The following types are defined by the DynamoDB API:
* S - String
* N - Number (encoded in JSON as a string)
* B - Binary (a base64 encoded string)
* BOOL - Boolean
* NULL - Null
* SS - String set
* NS - Number set
* BS - Binary set
* L - List
* M - Map
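Because each dumped object is a standard DynamoDB attribute-value map, it can be decoded with the AWS SDK. The following minimal sketch (not part of dyndump; it assumes the aws-sdk-go v1 dynamodbattribute package and reuses the sample object from above) converts one dumped item into plain Go values:

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/service/dynamodb"
	"github.com/aws/aws-sdk-go/service/dynamodb/dynamodbattribute"
)

func main() {
	// One object from the dump stream, in DynamoDB's canonical attribute format.
	raw := []byte(`{"string-field": {"S": "string value"}, "number-field": {"N": "123"}}`)

	// Decode the JSON into the SDK's typed AttributeValue representation.
	var item map[string]*dynamodb.AttributeValue
	if err := json.Unmarshal(raw, &item); err != nil {
		log.Fatal(err)
	}

	// Convert the typed attribute map into untyped Go values.
	var out map[string]interface{}
	if err := dynamodbattribute.UnmarshalMap(item, &out); err != nil {
		log.Fatal(err)
	}
	fmt.Println(out) // e.g. map[number-field:123 string-field:string value]
}
```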
See the godoc documentation for the github.com/gwatts/dyndump/dyndump package to integrate its dump and load facilities into your own projects.