cursusdb / cursusdb

CursusDB is an open-source, distributed, in-memory yet persisted, document-oriented database system with real-time capabilities.

Home Page: https://cursusdb.com



recovering the connections after they are lost

7c opened this issue · comments

commented

imagine:

$ ./cursus
2023/12/23 23:04:33 [INFO] ConnectToNodes(): Node connection established to 45.xx.135.89:7682
2023/12/23 23:04:33 [INFO] ConnectToNodes(): Node connection established to 127.0.0.1:7682
2023/12/23 23:04:42 [INFO] HandleClientConnection(): 127.0.0.1:59044 query(select * from users;)
2023/12/23 23:07:10 [INFO] HandleClientConnection(): 127.0.0.1:59044 query(select * from users;)

All works... if node1 is restarted, we can continue with node2 because it is replicated. But if node2 is restarted, cursus does not really notify about this. Based on the logs, it does not even report (at least on the console) that the connection has quit/timed out, so it also does not try to keep reconnecting. This results in:

[screenshot]

curush waiting indefinitely. Did I do something wrong, maybe?

This is a great approach. Thanks for this nice work.

Ah I see. So yeah, cursus should essentially reconnect to the lost node, that is all. Amazing find!! I am working on this now for the v1.8.1 patch.

CursusDB Cluster & Node Bundle v1.8.1 STABLE

🛠️ Fixes 🛠️

  • Stall on select when 1 node or node replica is lost.

🔥 New features 🔥

  • LostReconnect() method implemented. Tries to reconnect to any lost node or node replica automatically.
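
As an aside, a reconnect loop of this kind can be sketched in a few lines of Go. This is purely illustrative and not the actual Cursus implementation; the retry interval, helper name, and data structures are assumptions:

```go
package cluster

import (
	"log"
	"net"
	"sync"
	"time"
)

// lostReconnect is a hypothetical illustration of a LostReconnect-style loop:
// it periodically re-dials any node address whose connection has been lost.
func lostReconnect(addrs []string, conns map[string]net.Conn, mu *sync.Mutex) {
	for {
		time.Sleep(5 * time.Second) // retry interval is an assumption
		mu.Lock()
		for _, addr := range addrs {
			if conns[addr] != nil {
				continue // connection still considered healthy
			}
			c, err := net.DialTimeout("tcp", addr, 3*time.Second)
			if err != nil {
				log.Printf("[WARNING] LostReconnect(): %s still unreachable: %v", addr, err)
				continue
			}
			log.Printf("[INFO] LostReconnect(): node connection re-established to %s", addr)
			conns[addr] = c
		}
		mu.Unlock()
	}
}
```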

@7c

commented

Thanks for the quick reaction:

[screenshot]

Here is something new: I pulled the latest version and compiled. I have node+cursus on one host and node2 in a different datacenter. I reset the .cdat* files to give it a fresh start. As you see in the screenshot there are two issues. I am connecting with curush from local>local and from remote>node1, both with the same result: inserts succeed but selects do not return anything. I checked for the .cdat file:...

[screenshot]

As you see, the .cdat file is not created for some reason on node1, which runs cursus. Then I interrupted:

^C2023/12/24 12:35:49 [INFO] Received signal interrupt starting database shutdown.
2023/12/24 12:35:49 [INFO] Starting to write node data to file.
2023/12/24 12:35:49 [INFO] WriteToFile(): Node data written to file successfully.

After restarting curode on node1 the data was lost. Then I reconnected from curush and tried the same insert/select - same result.

The issue with "109 No available nodes to insert into." is persistent; yesterday, with the previous version, roughly 33% of my inserts got this error. With replication across datacenters this looks consistent.

thanks for your nice work again.

Hey @7c, after many versions I believe I've covered this. I think you pulled the latest version while there were unknown issues that I've since further tested, identified, and fixed. There were logic corrections in regards to node replica syncing and the selection of nodes on insert.

Ah, also make sure of the basics, like that the service is running; I saw that in your pictures. I had that happen on a setup recently: I forgot to start the node after stopping it, leading to null results on the select since I only had 1 node and no replicas.

cluster:
Screenshot from 2023-12-25 07-35-50

node on port 7682 curodeconfig
Screenshot from 2023-12-25 07-36-35

Cluster logs on start
Screenshot from 2023-12-25 07-37-15

Screenshot from 2023-12-25 07-38-14

Shutdown node and rely on replica
Screenshot from 2023-12-25 07-39-05

I show a before and after query; as you can see the replica is working as intended after the first sync:
Screenshot from 2023-12-25 07-39-32

Insert relying on node 2; mind you, cursus will choose a random node but now it will check if it's ok before committing to it.
Screenshot from 2023-12-25 07-40-27

Mind you for these examples I have the CursusDB cluster set with 'join-responses'.

@7c

@7c also thank you for all of your testing, I really appreciate you. I truly do. Happy holidays by the way! My goal is for v2.0.0 to be the LTR (long term release), which will be all kinds of stable. At v1.9.6 now and, well, it's working as intended! Very exciting.

Ah, also: if no nodes, only replicas (if there are replicas), are available you will see this now 🗡️
Screenshot from 2023-12-25 07-49-32

commented

Looks much better and more stable. The only issue left is this 104:

[screenshot]

As you see, I never lost a node during this session. I understand this case is not easy to debug, but the other issues seem to be gone. Thanks for your time!

Interesting find again. I need to debug that before the long term release. I was testing rapid-fire and concurrent inserts; I'll figure it out, won't be long! You were using the latest version, I assume?

Also thank you very much for your help @7c

commented

Yes, I did a git pull and saw your commits. The fact that the previous bugs are gone is an indication for me that we are moving forward. I have the commits and also built them again from scratch; my binaries were deleted because of .gitignore (which is good)... Maybe you can try a remote setup:

node1: node + cursus
node2: node

and you connect from node2's curush to node1's cluster remotely.

I have identified the issue and will be patching it shortly.
The nodes are fine. It's the cluster.

[screenshot]
So here I wrote it to only retry the amount of nodes, so if you have 2 nodes the cluster will retry only twice. It would be good to have a cluster config for this, with a default of 4 retries on insert.

[screenshot]
Should be

if nodeRetries > -1 {
	nodeRetries -= 1
	goto query
} else {
	node.Ok = false
	connection.Text.PrintfLine("%d No node was available for insert.", 104)
	return
}

If you have 2 nodes.. well, there's only one with the above logic!! Easy fix.
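
To make the config idea above concrete, here is a rough sketch of an insert retry budget taken from configuration instead of from the node count. All type, field, and helper names here are assumptions for illustration, not actual Cursus code:

```go
package cluster

import "fmt"

// node is a simplified stand-in for a cluster node connection (hypothetical).
type node struct {
	Ok bool
}

// insertWithRetries tries up to retries+1 candidate nodes before answering 104,
// decoupling the retry count from how many nodes are configured.
func insertWithRetries(nodes []*node, doc map[string]interface{}, retries int,
	insert func(*node, map[string]interface{}) error) error {
	for attempt := 0; attempt <= retries; attempt++ {
		n := pickOkNode(nodes) // returns a node with Ok == true, or nil
		if n == nil {
			break
		}
		if err := insert(n, doc); err == nil {
			return nil
		}
		n.Ok = false // mark the node and try another one on the next attempt
	}
	return fmt.Errorf("%d No node was available for insert.", 104)
}

func pickOkNode(nodes []*node) *node {
	for _, n := range nodes {
		if n.Ok {
			return n
		}
	}
	return nil
}
```

Here `retries` would come from a cluster config entry, with the suggested default of 4.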

I have a local test setup and also 3 VMs (1 cluster, 2 nodes) hosted on Google Cloud, with TLS fully enabled in my tests. @7c

@7c
dc8044b - to add to this, the cluster only waits 2 seconds and then tries another node, up to 10 times, which to me seems perfectly fine in my tests.

Same as your setup:

Screenshot from 2023-12-25 13-30-02

Screenshot from 2023-12-25 13-31-30

Screenshot from 2023-12-25 13-33-29

Screenshot from 2023-12-25 13-34-42

Working good!! @7c

commented

[screenshot]

Freshly pulled and compiled; my first call returned a 104. I also see a "fuck"-prefixed debug message in the client of node1. If you check the timestamp, it is the same time cursus got the insert query... I believe it sent this request to node1 and node1 really did not do much with it?

Later in the same session I inserted the same statement again and:

I got
2023/12/25 20:18:35 [INFO] HandleClientConnection(): 45.14.135.89:57040 query(insert into test({"a":1});)

and 2023/12/25 20:18:35 FUCK map[action:select collection:test conditions:[] count:false keys:[$id] limit:1 lock:true oprs:[==] skip:0 sort-key: sort-pos: values:[b03ab05e-2548-4011-8649-c3f82710c1cb]] at client1, and 2023/12/25 20:18:35 FUCK map[action:select collection:test conditions:[] count:false keys:[$id] limit:1 lock:true oprs:[==] skip:0 sort-key: sort-pos: values:[b03ab05e-2548-4011-8649-c3f82710c1cb]] at curush.

Might curush be buggy? (just an idea)

@7c that's me being a banana and forgetting to remove my test logging.

I see, so using join-responses true causes that after some recent reworks. OK, no problem. @7c you're on fire 🔥.

As for the 104 error you had, I can't reproduce that for some reason.

commented

If you want you can add debug logs to the curush and node code, so I can try to reproduce it.

.cursusconfig

nodes:
    - host: 0.0.0.0
      port: 7682
      replicas: []
    - host: 0.0.0.0
      port: 7683
      replicas: []
host: 0.0.0.0
tls-node: false
tls-cert: ""
tls-key: ""
tls: false
port: 7681
key: QyjlGfs+AMjvqJd/ovUUA1mBZ3yEq72y8xBQw94a96k=
users:
    - YWxleA==:7V8VGHNwVTVC7EktlWS8V3kS/xkLvRg/oODmOeIukDY=
node-reader-size: 2097152
log-max-lines: 1000
join-responses: false
logging: false

.curodeconfig

replicas: []
tls-cert: ""
tls-key: ""
host: 0.0.0.0
tls: false
port: 7682
key: QyjlGfs+AMjvqJd/ovUUA1mBZ3yEq72y8xBQw94a96k=
max-memory: 10240
log-max-lines: 1000
logging: false
replication-sync-time: 10
tls-replication: false

Node 1

./curode --port 7683
Node key is required.  A node key is shared with your cluster and will encrypt all your data at rest and allow for only connections that contain a correct Key: header value matching the hashed key you provide.
key> ********
2023/12/25 15:36:31 [INFO] main(): No previous data to read.  Creating new .cdat file.

Node 2

Node key is required.  A node key is shared with your cluster and will encrypt all your data at rest and allow for only connections that contain a correct Key: header value matching the hashed key you provide.
key> ********
2023/12/25 15:36:36 [INFO] main(): No previous data to read.  Creating new .cdat file.

Cluster

2023/12/25 15:37:25 [INFO] ConnectToNodes(): Node connection established to 127.0.0.1:7682
2023/12/25 15:37:25 [INFO] ConnectToNodes(): Node connection established to 127.0.0.1:7683
2023/12/25 15:37:38 [INFO] HandleClientConnection(): 127.0.0.1:56430 query(select * from test;)
2023/12/25 15:38:23 [INFO] HandleClientConnection(): 127.0.0.1:56430 query(insert into test({"a":1});)
2023/12/25 15:38:24 [INFO] HandleClientConnection(): 127.0.0.1:56430 query(insert into test({"a":1});)
2023/12/25 15:38:26 [INFO] HandleClientConnection(): 127.0.0.1:56430 query(select * from test;)
^C2023/12/25 15:39:17 [INFO] SignalListener(): Received signal interrupt starting database cluster shutdown.


curush

agpmastersystem@agpmastersystem:~/curush$ 
Username>****
Password>********
curush>select * from test;
[{"127.0.0.1:7682": null},{"127.0.0.1:7683": null}]
curush>insert into test({"a":1});
{"insert":{"$id":"e53a19dd-7d48-41b5-a07c-7815e3b551b7","a":1},"message":"Document inserted","statusCode":2000}
curush>insert into test({"a":1});
{"insert":{"$id":"4ace9e3c-9596-4980-b23d-91b26df47dda","a":1},"message":"Document inserted","statusCode":2000}
curush>select * from test;
[{"127.0.0.1:7683": [{"$id":"e53a19dd-7d48-41b5-a07c-7815e3b551b7","a":1},{"$id":"4ace9e3c-9596-4980-b23d-91b26df47dda","a":1}]},{"127.0.0.1:7682": null}]

@7c

So yeah, it's essentially the join-responses setting for the select action.
I can't reproduce your 104s for some reason.

I'm starting to take that back after doing more testing. It may indeed be something node related, but it may just be me, because the node data is all there on all my VM instances.

It's hard to tell, because the node would tell you if it's corrupted. I think I might not be using clean states when I test, but essentially join-responses is not the piece here.

Screenshot from 2023-12-25 16-14-09
Here I have 2 nodes

  1. I select
  2. I shut down one node
  3. I select
  4. I turn the shut-down node back on
  5. I select and the sharded data is back.

commented

[screenshot]

For some reason we get the action "select" instead of INSERT! I believe this is the bug.

The previous successful insert was reported by the debugger like this, which has the action insert:

[screenshot]

commented

I see that (probably) the multiplexer sends a SELECT and then inserts. In this case the bug might be in cursus; obviously it selects but does not continue with the insert.

commented

[screenshot]
After debugging cursus, I had a hard time replicating this bug, but in the end I had one case where this happened.

For some reason cursus.NodeConnections switches from 0.0.0.0 (which is the local one from the perspective of cursus) to the other node, which is indeed a replica.

commented

OK, I think this is the bug:
[screenshot]

If we have 2 NodeConnections you iterate i over 0,1 and the chance that you pick the same .Replica node is very high. I don't know why this isn't the case in your tests. Picking the replica node is 33%, I believe.

Hm, I think that's possible, I'm running tests at the moment. That specific piece selects a node at random, and maybe the memory check on the node plays a role. So, like, the node checks its own memory to see if it's at 10 GB (in MB) by default. If it is, the node will respond with 100, which is unavailable. Amazing finds.
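
For context, that memory check can be sketched roughly as comparing the Go runtime's allocated bytes against the configured max-memory value in MB (10240 in the .curodeconfig above). An illustration only, not the actual Curode code:

```go
package node

import "runtime"

// overMemoryLimit reports whether current allocations exceed the configured
// max-memory (in MB). A node over the limit would answer with a 100
// "unavailable"-style response instead of accepting the query.
func overMemoryLimit(maxMemoryMB uint64) bool {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.Alloc/1024/1024 >= maxMemoryMB
}
```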

Also, in one of your pictures I see that the node is not ok, thus not allowing access. What this means is the replica is getting its Ok set to false, which is another piece I'm looking at currently.

[screenshot]

I do see what you're saying regarding the selection of the node, but we are checking if it's a replica etc. on insert, and for example below is that code being used. As you can see it works fine.
[screenshot]

Ahh ok, so you're saying don't pick the same one!!! @7c you're really good at this.

Easy money:
[screenshot]

Creating commit for Cursus with this fix.
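
The shape of that fix can be pictured roughly like this: choose a random node that is ok, is not a replica, and has not already been tried. A hypothetical sketch with assumed names, not the committed code:

```go
package cluster

import "math/rand"

// nodeConn is a simplified stand-in for an entry in NodeConnections (hypothetical).
type nodeConn struct {
	Ok      bool
	Replica bool
}

// pickInsertNode returns the index of a random node that is Ok, not a replica,
// and not already tried; -1 means nothing is left. The caller records the
// returned index in tried before retrying, so the same node is never picked twice.
func pickInsertNode(conns []*nodeConn, tried map[int]bool) int {
	var candidates []int
	for i, c := range conns {
		if c.Ok && !c.Replica && !tried[i] {
			candidates = append(candidates, i)
		}
	}
	if len(candidates) == 0 {
		return -1
	}
	return candidates[rand.Intn(len(candidates))]
}
```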

I see that (probably) the multiplexer sends a SELECT and then inserts. In this case the bug might be in cursus; obviously it selects but does not continue with the insert.

Yeah, on the insert action Cursus is preparing to create a unique $id for the new document, one that is unique across all nodes for your specific collection. So yes, there is a SELECT first. To add to that last piece:
If you insert:
insert into test({"email!": "test@example.com", "name": "John"})
Cursus will check all nodes for the email test@example.com because of the "email!" unique key, with the $id check as well.
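
To illustrate that flow: before committing an insert, the cluster runs a limit-1 select for each would-be-unique value against every node and only proceeds when nothing matches. A sketch under assumed names, not the actual implementation:

```go
package cluster

// selectOne stands in for sending a limit-1 select to one node and reporting
// whether any document matched (hypothetical signature).
type selectOne func(nodeAddr, collection, key string, value interface{}) (bool, error)

// isUniqueAcrossNodes checks values for keys like "email!" (plus the generated
// $id) on every node before an insert is allowed to proceed.
func isUniqueAcrossNodes(nodes []string, collection string,
	uniqueVals map[string]interface{}, sel selectOne) (bool, error) {
	for _, addr := range nodes {
		for k, v := range uniqueVals {
			found, err := sel(addr, collection, k, v)
			if err != nil {
				return false, err
			}
			if found {
				return false, nil // a duplicate exists somewhere in the cluster
			}
		}
	}
	return true, nil
}
```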

@7c, so with your own testing, was the INSERT action the only bug you encountered? I somehow did not catch any insert bugs, probably because we are writing our tests differently. I myself use curush, but I also use the cursusdb-go package to concurrently insert 1000s of documents, not just singles, so I may miss what you're seeing.

My single queries are mainly selects, updates, and deletes to test the query language actions such as pattern matching etc.

Works like magic 🪄 @7c

Everything works so well. WOO!!!
599a75c

Screenshot from 2023-12-26 07-29-39

🎩 Lovely.

I've finished the backup implementation as well. It's really cool stuff!

Here I insert 80k records of your object {"a":2} into 2 nodes concurrently, with 20 connections:
Screenshot from 2023-12-26 13-59-50

Screenshot from 2023-12-26 13-57-04

Screenshot from 2023-12-26 14-02-22

Screenshot from 2023-12-26 14-02-43
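
A test harness for the 80k concurrent insert above can be a plain worker pool. Below is a minimal sketch that assumes a hypothetical insert callback standing in for the real cursusdb-go client call:

```go
package loadtest

import "sync"

// insertMany fans total inserts of the same document out over the given number
// of workers; each worker would hold its own cluster connection. insert is a
// hypothetical callback, e.g. insertMany(80000, 20, fn) for the test above.
func insertMany(total, workers int, insert func(doc map[string]interface{}) error) {
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs {
				_ = insert(map[string]interface{}{"a": 2}) // same object as in the test
			}
		}()
	}
	for i := 0; i < total; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
}
```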

It's about 30 MB of memory per node at 42k docs per node. The reason CursusDB is pretty fast on each node, linearly, is because everything is in memory. If we had to open a file to write or read every time, sharing a mutex lock, it would be painfully slow... like a regular database. Curode only locks per collection, not the whole database file.
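
A minimal sketch of that per-collection locking, with assumed names rather than the actual Curode structures:

```go
package node

import "sync"

// store keeps collections in memory and guards each collection with its own
// RWMutex, so writes to one collection never block access to another.
type store struct {
	mu          sync.Mutex                          // guards the maps below
	locks       map[string]*sync.RWMutex            // one lock per collection
	collections map[string][]map[string]interface{} // in-memory documents
}

func newStore() *store {
	return &store{
		locks:       map[string]*sync.RWMutex{},
		collections: map[string][]map[string]interface{}{},
	}
}

func (s *store) lockFor(collection string) *sync.RWMutex {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.locks[collection] == nil {
		s.locks[collection] = &sync.RWMutex{}
	}
	return s.locks[collection]
}

func (s *store) insert(collection string, doc map[string]interface{}) {
	l := s.lockFor(collection)
	l.Lock() // only this collection is locked for the write
	defer l.Unlock()
	s.collections[collection] = append(s.collections[collection], doc)
}
```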

commented

Thanks for your time!

Hey @7c no problem. Enjoy!

@7c the first long term release is out. I've got it running in production. Everything is so smooth, it's incredible. Thank you for your help with testing. I vigorously tested and perfected everything, just lots and lots of repetition, even improving the searches with new algorithms I came up with. Truly love it. The code base is under 6k core lines of code, compared to, say, 1.7 million for PostgreSQL. This is because of how things are written, of course :)

commented

Congratulations. What is your aim, please? Do you want to fill a gap between MySQL and Redis?

@7c personally nothing fit my needs for a project and I wanted to build something that was easy to scale, automated, and secure by default whilst offering very fast speeds and concurrency.

Innovating for the future is now my goal: making this the most reliable document database.