torodb / stampede

The ToroDB solution to provide better analytics on top of MongoDB and make it easier to migrate from MongoDB to SQL

Home Page: https://www.torodb.com/stampede/

Compatibility with old versions?

p1nox opened this issue

I know you only support MongoDB 3.2, but is there a way (even a hacky one) to use Stampede with MongoDB 2.4.x? I'm testing with this setup:

  • mongo 2.4.9
  • postgresql 9.6.2
  • stampede 1.0.0-beta1

And I'm getting this error:

2017-02-10T12:15:53.777 INFO 'StampedeService STARTING' c.t.s.StampedeService Starting up ToroDB Stampede
2017-02-10T12:15:53.804 INFO 'StampedeService STARTING' c.t.b.p.PostgreSqlDbBackend Configured PostgreSQL backend at localhost:5432
2017-02-10T12:15:54.845 INFO 'PostgreSqlDbBackend STARTING' c.t.b.AbstractDbBackendService Created pool session with size 28 and level TRANSACTION_REPEATABLE_READ
2017-02-10T12:15:54.978 INFO 'PostgreSqlDbBackend STARTING' c.t.b.AbstractDbBackendService Created pool system with size 1 and level TRANSACTION_REPEATABLE_READ
2017-02-10T12:15:55.019 INFO 'PostgreSqlDbBackend STARTING' c.t.b.AbstractDbBackendService Created pool cursors with size 1 and level TRANSACTION_REPEATABLE_READ
2017-02-10T12:15:56.518 INFO 'BackendBundleImpl STARTING' c.t.b.m.AbstractSchemaUpdater Schema 'torodb' not found. Creating it...
2017-02-10T12:15:56.649 INFO 'BackendBundleImpl STARTING' c.t.b.m.AbstractSchemaUpdater Schema 'torodb' created
2017-02-10T12:15:57.092 INFO 'StampedeService STARTING' c.t.s.StampedeService Database is not consistent. Cleaning it up
2017-02-10T12:15:57.326 INFO 'StampedeService STARTING' c.t.s.StampedeService Replicating from seeds: localhost:27017
2017-02-10T12:15:58.290 INFO 'MongodbReplBundle STARTING' c.t.m.r.MongodbReplBundle Starting replication service
2017-02-10T12:15:58.670 WARN 'topology-network-0' c.t.m.r.t.TopologyHeartbeatHandler Heartbeat start failed (sync source: localhost:27017): com.mongodb.MongoCommandException: Command failed with error -1: 'no such cmd: replSetGetConfig' on server localhost:27017. The full response is { "ok" : 0.0, "errmsg" : "no such cmd: replSetGetConfig", "bad cmd" : { "replSetGetConfig" : 1.0 } }
2017-02-10T12:15:59.682 WARN 'topology-network-0' c.t.m.r.t.TopologyHeartbeatHandler Heartbeat start failed (sync source: localhost:27017): com.mongodb.MongoCommandException: Command failed with error -1: 'no such cmd: replSetGetConfig' on server localhost:27017. The full response is { "ok" : 0.0, "errmsg" : "no such cmd: replSetGetConfig", "bad cmd" : { "replSetGetConfig" : 1.0 } }
2017-02-10T12:16:00.689 WARN 'topology-network-0' c.t.m.r.t.TopologyHeartbeatHandler Heartbeat start failed (sync source: localhost:27017): com.mongodb.MongoCommandException: Command failed with error -1: 'no such cmd: replSetGetConfig' on server localhost:27017. The full response is { "ok" : 0.0, "errmsg" : "no such cmd: replSetGetConfig", "bad cmd" : { "replSetGetConfig" : 1.0 } }

Maybe I could set up a replica set with two members, where member 0 runs mongo 2.4.9 as master and member 1 runs mongo 3.2 as slave, and connect Stampede to member 1? (Keeping in mind the risk of doing that.)

That could be a hacky way to do what you want. Stampede won't be able to connect to member 0, which runs 2.4.9, because it does not support some commands we require, but in theory it will connect to member 1. Of course you will see a warning from the topology manager every second, but it may work with only one accessible member. It sounds like a fun experiment. Did you try it?
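The heartbeat warnings in the log come from `replSetGetConfig`, which the 2.4.9 server rejects as an unknown command. As a rough sketch, a version check like the one below could flag such members up front. It assumes the command is available from MongoDB 3.0 onwards (an assumption based on MongoDB's documentation, not something stated in this thread), and `versionAtLeast` and `canBeSyncSource` are hypothetical helper names.

```javascript
// Returns true when a version string such as "2.4.9" is at least major.minor.
function versionAtLeast(version, major, minor) {
  var parts = version.split(".");
  var maj = parseInt(parts[0], 10);
  var min = parseInt(parts[1] || "0", 10);
  return maj > major || (maj === major && min >= minor);
}

// Assumption: replSetGetConfig exists from MongoDB 3.0 onwards, so older
// servers would fail the heartbeat the same way the log above shows.
function canBeSyncSource(version) {
  return versionAtLeast(version, 3, 0);
}
```

In the mongo shell this could be called as `canBeSyncSource(db.version())` on each candidate member before pointing Stampede at it.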

So I tried the idea of combining these two specific versions, but they cannot be members of the same replica set. Apparently other version combinations are compatible, but the ones I tested (2.4.9 and 3.2) are not.

What ended up working was to have one mongo 2.4 server (s0) and a separate mongo 3.2 server (s1) that is not linked to the first one but runs as a replica set, with torodb-stampede connected to s1. Then, at a fixed interval (maybe twice a day), a script runs on s1: db.dropDatabase(); , db.copyDatabase("target_database_to_copy", "server_s0");. That way torodb-stampede can propagate changes from mongo to postgres. The obvious problem is that the database is dropped every time the script runs. Do you know if another mongo operation could be used instead of copy?
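A minimal sketch of that refresh script, written for the mongo shell on s1, might look like the following. It assumes the three-argument shell form `db.copyDatabase(fromdb, todb, fromhost)` that MongoDB 3.2 provides; the function name `refreshFromLegacy` and the database and host values are placeholders, not taken from the actual setup.

```javascript
// Sketch of the periodic refresh, intended to run against s1 (MongoDB 3.2).
// `conn` is the shell's `db` object; `dbName` and `legacyHost` are placeholders.
function refreshFromLegacy(conn, dbName, legacyHost) {
  var target = conn.getSiblingDB(dbName);
  // Drop the stale copy on s1 before pulling a fresh one.
  target.dropDatabase();
  // Shell form assumed here: db.copyDatabase(fromdb, todb, fromhost).
  // The same name is used on both sides so s1 mirrors s0's database.
  return target.copyDatabase(dbName, dbName, legacyHost);
}
```

From a shell connected to s1 this would be invoked as, for example, `refreshFromLegacy(db, "mydb", "s0.example.com:27017")`, and scheduled externally (cron or similar) at whatever interval is acceptable.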

BTW, different versions of mongo can be running in the same replica set, but the lowest supported version is 2.6 (ref).

We are going to try it internally, but if you cannot migrate to mongo >= 2.6, it seems that your solution is the only one. If you really need this feature, a slightly modified version of ToroDB could periodically import all data from MongoDB 2.4.x (the same thing you are doing with MongoDB 3.2 as middleware), but it is not on our roadmap.

Alright, in the meantime I'll give this solution a try, and yes, I need to support this 2.4.x "legacy" version.

Also, it would be awesome if you could point me to what I would need to change to create a modified version of ToroDB that pulls data periodically instead of relying on two mongo servers. I think I can dedicate some time to it. Thanks for the insight @gortiz :).

The modification should be easy... in theory, but there could be several points of failure or dependencies that make it complicated. The idea is the following: ToroDB replication has two states, initial sync and oplog replication. Adapting the oplog replication state to work with mongo < 3.0 seems complicated, but adapting the initial sync state should be easier, as it just downloads all remote databases (the same thing you do with your script). The hackiest but easiest way to do that is to create your own main class that periodically creates a RecoveryService, which should be able (with some modifications) to clone the database from the remote 2.4 mongod.

PS: I recommend using the devel branch, as it has several improvements and its use of Guice has been reworked to be more modular.

Hello @p1nox, in the last PR on the devel branch we fixed some compatibility issues, and we tested the following:
We configured a replica set with MongoDB v3.0 as the primary node and MongoDB v3.2, v2.6 and v2.4 as secondary nodes, and replication between them worked fine. Then we started ToroDB (initially pointed at the v3.2 node) and it replicated correctly from the primary node (v3.0), although errors appeared in the ToroDB log when it tried to replicate from MongoDB v2.4 and v2.6.
So it seems possible to have MongoDB 2.4 and a 3.x version in the same replica set. Did you try it?