A read-optimized, in-memory, columnar store database (column-oriented DBMS). It's pretty Damn Quick.
- Install goLang on your computer
$ git clone https://github.com/adam-hanna/PDQdb.git
$ make
- Only builds for Linux and MacOS, for now
$ PDQdb -f "path/to/your/file.csv" -c "path/to/your/config.json"
- http api (returns json)
- Grab data by key:
curl -v -XGET http://127.0.0.1:38216/key/{your key}
- Count keys:
curl -v -XGET http://127.0.0.1:38216/count
- Grab data by query: see the section on weeQL
weeQL is the query language written for PDQdb. It was inspired by SQL and implements many of the features of the SQL language; however, it is a very lightweight implementation and does not include all the functionality of SQL. Pronounced "wee" + "quill". Get it? Wee SQL? Tiny SQL?
All weeQL queries are made by sending a json query object using the http POST method to http://127.0.0.1:38216/query
The json query object is structured as follows:
{
"SELECT": [ "COL1", "COL2", ... ], // SITUATIONAL
"COUNT": "*", // SITUATIONAL
"WHERE": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }, // see the section on where
"GROUP BY": "COL1" // OPTIONAL
}
The "WHERE"
property supports multiple logical operators:
- `"$OR": [ { "FIELD1": "VAL1" }, {"FIELD2": "VAL2" }, ... ]`
- `"$NOT": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }`
- `"$NOR": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }`; This is the same as `"$NOT": { "$OR": [ ... ] }`
- `"FIELD1": "$IN": [ "VAL1", "VAL2", ... ]`
- `"FIELD1": "$NIN": [ "VAL1", "VAL2", ... ]` This is the same as `"$NOT": { "$IN": [ ... ] }`
Exporting and aggregating is explained in more detail, below.
- Exporting data is done with the `"SELECT"` query parameter (omit `"COUNT"` and `"GROUP BY"`)
-
- `"SELECT" : [ “FIELD 1”, “FIELD 2”, … ]`: SITUATIONAL. An array of strings that indicates the columns to be returned. Omitted if using `"COUNT"`!
- `"WHERE": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }`: a subdocument of filters. Multiple filters are returned as the intersection of data that meet each criteria (i.e. "FIELD1" = "VAL1" AND "FIELD2" = "VAL2"). Other logical operators besides AND are available. See the section on `"WHERE"` for more info.
-
query:
```
{
SELECT : [ “ID” ],
WHERE: {
“A”: “foo”
}
}
```
yields:
[ { "ID": "1" }, { "ID": "3" }, { "ID": "4" } ]
- Perform counts on data with query parameters and an optional `"GROUP BY"` command (omit `"SELECT"`)
-
- `"COUNT": "*"`: SITUATIONAL. The only value currently supported is "*". Omitted if using `"SELECT"`!
- `"WHERE": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }`: a subdocument of filters. Multiple filters are returned as the intersection of data that meet each criteria (i.e. "FIELD1" = "VAL1" AND "FIELD2" = "VAL2"). Other logical operators besides AND are available. See the section on `"WHERE"` for more info.
- `"GROUP BY": "COL1"`: OPTIONAL. Only valid with `"COUNT"` queries. This is the string column name by which to group count results.
-
query #1:
{ “COUNT”: “*”, "WHERE": { “A”: “foo” } }
yields:
{ "COUNT": 3 }
query #2 with
"GROUP BY"
{ “COUNT”: “*”, "WHERE": { “A”: “foo” }, “GROUP BY”: “B” }
yields:
{ "bar": 0, "barbar": 0, "barbarbar": 0, "test": 1, "testtest": 1, "testtesttest": 1 }
query #3 with
"GROUP BY"
and"$OR"
:{ “COUNT”: “*”, "WHERE": { "A": "foo", "$OR": [ { "B": "testtesttest" }, { "B": "testtest" } ] }, “GROUP BY”: “B” }
yields:
{ "bar": 0, "barbar": 0, "barbarbar": 0, "test": 0, "testtest": 1, "testtesttest": 1 }
###example.csv
ID,A,B 1,foo,test 2,foobar,testtest 3,foo,testtesttest 4,foo,testtest 5,foobarbar,bar 6,foofoo,barbar 7,foofoo,barbarbar
###example_config.json
{ "header": [ {"ID": "string"}, {"A": "string"}, {"B": "string"} ], "id_field": "ID", "index_fields": ["A", "B"], "start_at_line": 2 }