fabiocmazzo / cassandra-exporter

A highly configurable utility to export whole Apache Cassandra keyspace or table structure/data to CQL scripts.

Summary

An alternative to nodetool snapshot that exports a whole Apache Cassandra keyspace or table, structure and data, to Cassandra Query Language (CQL) scripts. CQL scripts are a lightweight, simple way to back up and restore databases.

Features:

  • simple and configurable; see Usage for details.
  • fast and highly scalable: data is exported gradually, so memory usage stays very low. For example, a keyspace with 1.5 million records took about 3 minutes to export.
  • the export process is tracked with detailed progress information.
  • the generated CQL scripts are ready to import using the SOURCE command.
  • tested with Cassandra 2.1, 2.2, 3.0 and tick-tock releases.
  • requires Java > 6; make sure Java is available on the PATH.

Overcomes nodetool snapshot caveats:

  • a snapshot can only be restored when the table schema already exists --> cassandra-CQL-exporter backs up both DDL and DML.
  • a snapshot runs on a single node, so a multi-node cluster requires parallel ssh to be set up --> cassandra-CQL-exporter does not need to care how many nodes there are.
  • a snapshot is stored on the node itself, while a cassandra-CQL-exporter backup is stored on the backup client --> a more isolated backup environment.

The generated script contains 2 components:

  • DDL: the keyspace CREATE statement plus all tables, indexes, materialized views, functions, aggregate functions and user-defined types.
  • DML: INSERT statements for the table data.

Be aware that the script will be forward-compatible but is not guaranteed to be backward-compatible, especially the DDL statements. It is best to export and import using the same Cassandra version. I'm using this on a daily basis, but use it at YOUR OWN RISK!
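Restoring an exported script with the SOURCE command might look like the sketch below. It assumes cqlsh is on the PATH; the function name `restore_script`, the `DRY_RUN` switch and the file name `cycling.CQL` are illustrative and not part of this tool.

```shell
#!/usr/bin/env bash
# Sketch: restore an exported script through cqlsh.
# Assumptions: cqlsh is installed; cycling.CQL is a hypothetical export file.

restore_script() {
  local script="$1" host="${2:-localhost}" port="${3:-9042}"
  # cqlsh -e executes one statement; SOURCE runs the CQL statements in the file.
  local cmd=(cqlsh "$host" "$port" -e "SOURCE '$script'")
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "${cmd[*]}"   # print the command instead of running it
  else
    "${cmd[@]}"
  fi
}
```

Calling `DRY_RUN=1 restore_script cycling.CQL` prints the cqlsh command without connecting; dropping `DRY_RUN` runs it against the target cluster.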

Usage

usage: cql-export [--drop] [-f <file name>] [-fo] [-h <host>] [--help] [-k <keyspace>] [-l] [-m]
       [--noddl] [--nodml] [-p <password>] [-po <port>] [-s] [-t <table>] [--test] [--truncate] [-u
       <username>] [-v] [--secure]
    --drop                  add DROP KEYSPACE statement. BE CAREFUL! THIS WILL WIPE OUT THE ENTIRE
                            KEYSPACE
 -f,--file <file name>      exported file path. default to "<keyspace>.CQL" or
                            "<keyspace>.<table>.CQL"
 -fo,--force                force overwrite of existing file
 -h,--host <host>           server host name or IP of database server, default is "localhost"
    --help                  print this help message
 -k,--keyspace <keyspace>   database keyspace to be exported; multiple keyspaces can be given
                            separated by commas, e.g. -k keyspace1,keyspace2
 -kf,--keyspacesFile        file listing the keyspaces to export, one per line
 -l,--license               Print this software license
 -m,--merge                 merge table data, insert will be generated with "IF NOT EXISTS"
    --noddl                 don't generate DDL statements (CREATE TABLE, INDEX, MATERIALIZED VIEW,
                            TYPE, TRIGGER, AGGREGATE), mutually exclusive with "nodml" option
    --nodml                 don't generate DML statements (INSERT), mutually exclusive with "noddl"
                            option
 -p,--pass <password>       database password
 -po,--port <port>          database server port, default is 9042
 -s,--separate              separate export by table
 -t,--table <table>         keyspace table to be exported
    --test                  Enable test mode. for development testing only
    --truncate              add TRUNCATE TABLE statement. BE CAREFUL!
 -u,--user <username>       database username
 -v,--verbose               print verbose message
    --secure                connect via SSL, -Djavax.net.ssl.trustStore=... -Djavax.net.ssl.trustStorePassword=... 
                            must be added to JAVA_OPTS environment variable
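As a sketch of the --secure option described above, the truststore settings can be exported through JAVA_OPTS before the tool is started. The truststore path, the password and the host name below are placeholders, not values from this project.

```shell
#!/usr/bin/env bash
# Assumption: the truststore path and password are placeholders; use your own.
export JAVA_OPTS="-Djavax.net.ssl.trustStore=/path/to/truststore.jks -Djavax.net.ssl.trustStorePassword=changeit"

# Run the export over SSL only if cql-export is actually on the PATH.
if command -v cql-export >/dev/null 2>&1; then
  # cassandra.example.com is a hypothetical host name.
  cql-export -h cassandra.example.com -k cycling --secure
else
  echo "cql-export not found on PATH" >&2
fi
```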

Sample usage

  1. Simplest usage: only the keyspace is given; the host defaults to localhost and the port to 9042

     $cql-export -k cycling
     Trying connect to host "localhost"
     Success!
     Trying connect to port "9042" 
     Success!
     Trying connect to keyspace "cycling"
     Success!
     All good!
     Start exporting...
     Write DDL to C:\cql-generator\cycling.CQL
     Extract from cycling.cyclist_races
     Total number of record: 117920
     Start write "cyclist_teams" data DML to C:\cql-generator\cycling.CQL
     Done 5.00%
     Done 30.00%
     Done 90.00%
     Done exporting "cyclist_teams", total number of records exported: 117920
     Export completed after 21.179 s!
     Exited.
    
  2. Simple usage with an explicit host and port:

     $cql-export -h localhost -po 9043 -k cycling

  3. Generate only DDL statements:

     $cql-export -h localhost -po 9043 -k keyspace_name --nodml
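A few more combinations of the options listed under Usage, as a hedged sketch. The `run` helper is hypothetical: it only prints each command when the tool is not installed. The table and file names are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the command if it is installed, otherwise print it.
run() {
  if command -v "$1" >/dev/null 2>&1; then "$@"; else echo "would run: $*"; fi
}

# Export a single table to a chosen file, overwriting any existing file.
run cql-export -k cycling -t cyclist_races -f races.CQL -fo

# Export data only (no DDL), separated by table.
run cql-export -k cycling -s --noddl

# Export two keyspaces at once; INSERTs are generated with IF NOT EXISTS.
run cql-export -k cycling,keyspace_name -m
```

Because --noddl and --nodml are mutually exclusive, each invocation passes at most one of them.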

TODO

Optimize the jar size.

License

Apache 2.0 License

Using gradle

Run

./gradlew 

or (Windows):

gradlew.bat

Build jar file

./gradlew jar

or (Windows):

gradlew.bat jar

This places the jar file in the "dist" folder.

Change log

Version 1.1: adds a gradle script and the ability to specify multiple keyspaces to export, contributed by matto3c (https://www.codegravity.com).
