Docker-compose file to run Citus and Hasura together. This repository assumes you have some knowledge of Citus, Hasura and docker-compose. If you know these tools but are less familiar with Citus, here is a quick explanation:
Citus is a Postgres extension that distributes data across multiple nodes in a cluster, following a horizontal-scaling approach.
If you need more info on Citus, check the links at the end of this readme.
Find the container ID of the Citus (master) Postgres container, the one that docker ps lists with this port mapping:
0.0.0.0:5432->5432/tcp, :::5432->5432/tcp
Copy its ID and use this command:
docker cp path/to/csv <CONTAINER_ID>:/name_of_csv.csv
Example:
docker cp users.csv 06ae36661338:/users.csv
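If you would rather not look up the ID by hand, the lookup and the copy can be combined. This is only a sketch: copy_csv_to_master is a hypothetical helper name, and it assumes the master is the only running container that publishes port 5432.

```shell
#!/bin/sh
# Sketch: copy a CSV into the container that publishes a given host port.
# copy_csv_to_master is a hypothetical helper, not part of this repository.
copy_csv_to_master() {
    csv_path="$1"
    port="${2:-5432}"   # assumption: the master publishes port 5432

    # "publish" is a docker ps filter; --format prints only the container ID.
    container_id="$(docker ps --filter "publish=${port}" --format '{{.ID}}' | head -n 1)"
    if [ -z "$container_id" ]; then
        echo "no running container publishes port ${port}" >&2
        return 1
    fi

    # Same docker cp as above, with the ID filled in automatically.
    docker cp "$csv_path" "${container_id}:/$(basename "$csv_path")"
}
```

Usage: copy_csv_to_master users.csv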
After that you can open a shell inside the Postgres (master) container with:
sudo docker exec -it citus_master /bin/bash
(citus_master is the container's name)
Once inside, you can simply run:
# the -U flag stands for "user"
psql -U postgres
Now that we can run commands directly in Postgres and the CSV is inside the container, we can copy it into the table we want (the table must already exist):
-- here "users" is an empty table I created beforehand
\copy users from '/users.csv' with csv
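As an alternative to copying the file into the container first, the CSV can be streamed from the host straight into psql. This is a sketch under assumptions: load_csv_via_stdin is a hypothetical helper name, the container is called citus_master as above, and the target table already exists.

```shell
#!/bin/sh
# Sketch: stream a CSV from the host into a table, without docker cp.
# load_csv_via_stdin is a hypothetical helper, not part of this repository.
load_csv_via_stdin() {
    csv_path="$1"
    table="$2"
    container="${3:-citus_master}"   # assumption: container name from this readme

    # -i keeps stdin open so the redirected file reaches psql in the container.
    # COPY ... FROM STDIN reads the rows from that stream.
    docker exec -i "$container" \
        psql -U postgres -c "COPY ${table} FROM STDIN WITH (FORMAT csv)" < "$csv_path"
}
```

Usage: load_csv_via_stdin users.csv users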
If you need to delete ALL your docker containers and volumes, use these commands:
docker rm -f $(docker ps -a -q)
docker volume rm $(docker volume ls -q)
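Both commands print an error when there is nothing to delete, because docker rm and docker volume rm require at least one argument. A guarded variant could look like this (a sketch; remove_all_containers_and_volumes is a hypothetical name, and it is just as destructive as the two commands above):

```shell
#!/bin/sh
# Sketch: delete ALL containers and volumes, skipping each step when empty.
# remove_all_containers_and_volumes is a hypothetical helper name.
remove_all_containers_and_volumes() {
    containers="$(docker ps -a -q)"
    if [ -n "$containers" ]; then
        # Unquoted on purpose: each ID must become its own argument.
        docker rm -f $containers
    fi

    volumes="$(docker volume ls -q)"
    if [ -n "$volumes" ]; then
        docker volume rm $volumes
    fi
}
```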
If you want to delete only some containers and/or volumes, you can do this:
docker ps
to list the running containers (add -a to include stopped ones), then
docker rm -f CONTAINER_ID
You can find the ID in the output of docker ps.
To delete a volume:
docker volume ls
to list all volumes, then
docker volume rm -f VOLUME_NAME
You can find the volume name in the output of docker volume ls.
You can create some dummy data in Postgres by using the generate_series function, like this:
-- simple example
INSERT INTO users (hash_key, content, unix_timestamp, "timestamp")
SELECT
md5(random()::text),
'this is some content that will be the same for every new row', -- USE SINGLE QUOTES!!!
-- https://stackoverflow.com/questions/22964272/postgresql-get-a-random-datetime-timestamp-between-two-datetime-timestamp
date_part('epoch', NOW() + (random() * (NOW()+'90 days' - NOW())) + '30 days'),
NOW() + (random() * (NOW()+'90 days' - NOW())) + '30 days'
FROM generate_series(1, 10000) s(i);
For a Stack Overflow example, see the link in the comment above; if you need more info, go to the Postgres docs!
You can get some info about how a query runs like this:
EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM users;
REF: stackoverflow
There is probably a better solution, but for now what I know is that you can run this before any query that fires a trigger to avoid the "unsafe trigger" error:
SET citus.enable_unsafe_triggers TO on;
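Keep in mind that SET only lasts for the current session, so the setting and the query have to go through the same connection. When running from the host, one way is to pass both to a single psql call. This is a sketch: run_with_unsafe_triggers is a hypothetical helper name, and the container name citus_master is taken from earlier in this readme.

```shell
#!/bin/sh
# Sketch: run a query with citus.enable_unsafe_triggers enabled for that session.
# run_with_unsafe_triggers is a hypothetical helper, not part of this repository.
run_with_unsafe_triggers() {
    sql="$1"
    container="${2:-citus_master}"   # assumption: container name from this readme

    # Both -c commands run over one connection, so the SET is still in
    # effect when the second command executes.
    docker exec "$container" psql -U postgres \
        -c "SET citus.enable_unsafe_triggers TO on;" \
        -c "$sql"
}
```

Usage: run_with_unsafe_triggers "YOUR QUERY HERE;"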
- Example citus docker
- Old example citus with hasura
- Safe incremental rollups on Postgres and Citus
- citus great tutorial realtime analytics
Some explanations and links more specific to postgres
- pg hero for checking performance
- stackoverflow what columns generally make good indexes
- tableplus for database management
If you are a bit unsure about Docker, this should help. It is not exactly a beginner's guide, just a few pointers.
Docker works with containers, and those containers represent our "services". For example, you can have a container running a Postgres image, so while that container is up you have a Postgres database running. The container can hold data, but if you delete the container that data disappears too. If you want to PERSIST a container's data you can create a VOLUME: if the container is deleted but the volume is not, the data is still there when you recreate the container. This also means that to completely reset a container's data you need to delete both the container and the volume.
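For a stack started with docker-compose, a full reset of containers and volumes could look like the sketch below. It assumes you run it from the directory that holds the compose file and that the volumes are declared in that file; reset_stack and the COMPOSE variable are hypothetical names.

```shell
#!/bin/sh
# Sketch: wipe containers AND volumes of a compose stack, then start it again.
# Override COMPOSE if your setup uses "docker compose" instead.
COMPOSE="${COMPOSE:-docker-compose}"

reset_stack() {
    # down -v removes the containers plus the named volumes declared in the
    # compose file, so all persisted data is lost.
    $COMPOSE down -v
    # Recreate everything from scratch in the background.
    $COMPOSE up -d
}
```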
Sometimes an error can get past the build and create steps of a Docker container. For example, a container can appear to be working fine, but if you check docker ps you can see that it is constantly restarting.
To debug a scenario like this you can check docker logs:
docker logs --tail 50 --follow --timestamps CONTAINER_NAME
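To find the containers stuck in a restart loop without reading the whole docker ps table, the status filter can help. A minimal sketch, where show_restarting_logs is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: print the last log lines of every container that is restarting.
# show_restarting_logs is a hypothetical helper, not part of this repository.
show_restarting_logs() {
    names="$(docker ps --filter "status=restarting" --format '{{.Names}}')"
    if [ -z "$names" ]; then
        echo "no containers are restarting"
        return 0
    fi

    for name in $names; do
        echo "== $name =="
        # No --follow here, so the loop can move on to the next container.
        docker logs --tail 50 --timestamps "$name"
    done
}
```

A container only shows up here while it is in the restarting state, so you may need to run it more than once.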