alibaba / MongoShake

MongoShake is a universal data replication platform based on MongoDB's oplog. Redundant replication and active-active replication are its two most important features. It is a cluster replication tool built on the MongoDB oplog that covers migration and synchronization scenarios, enabling disaster recovery and active-active deployments.

MongoShake in all mode finishes the full sync, then errors at the start of incremental sync and the process exits automatically

xiang-coco opened this issue

1. Version information:
MongoShake version: 2.4.16
Source and target MongoDB version: 4.2.17

2. Using MongoShake to sync MongoDB cluster A to MongoDB cluster B (see the collector.conf sketch below), with:
sync_mode = all
mongo_urls = cluster A
tunnel = direct
tunnel.address = cluster B
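
For reference, a minimal collector.conf sketch with the settings above; the connection strings and credentials are placeholders, not the values actually used:

sync_mode = all
# source: MongoDB cluster A (placeholder URI)
mongo_urls = mongodb://user:password@hostA1:27017,hostA2:27017,hostA3:27017
# direct mode writes straight to the target MongoDB
tunnel = direct
# target: MongoDB cluster B (placeholder URI)
tunnel.address = mongodb://user:password@hostB1:27017,hostB2:27017,hostB3:27017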

After starting MongoShake, the full data sync completed; when incremental sync started, an error was reported and the MongoShake process exited automatically.
The log is as follows:

[2023/06/25 18:28:37 CST] [INFO] ------------------------full sync done!------------------------
[2023/06/25 18:28:37 CST] [INFO] oldestTs[7248508634924061201[1687674931, 4625]] fullBeginTs[7248564190326030337[1687687866, 1]]
fullFinishTs[7248568704336658527[1687688917, 95]]
[2023/06/25 18:28:37 CST] [INFO] finish full sync, start incr sync with timestamp: fullBeginTs[7248564190326030337[1687687866, 1]],
fullFinishTs[7248568704336658527[1687688917, 95]]
[2023/06/25 18:28:37 CST] [INFO] start incr replication
[2023/06/25 18:28:37 CST] [INFO] RealSourceIncrSync[0]: url[mongodb://username:password@sss,xxx,yyy], name[mongodb-name], startTimestamp[7248564190326030337]
[2023/06/25 18:28:37 CST] [INFO] New session to mongodb://username:password@sss,xxx,yyy], successfully
[2023/06/25 18:28:37 CST] [INFO] Collector-worker-0 start working with jobs batch queue. buffer capacity 64
[2023/06/25 18:28:37 CST] [INFO] New session to mongodb://username:password@sss,xxx,yyy], successfully
[2023/06/25 18:28:37 CST] [INFO] Collector-worker-1 start working with jobs batch queue. buffer capacity 64
[2023/06/25 18:28:37 CST] [INFO] New session to mongodb://username:password@sss,xxx,yyy], successfully
[2023/06/25 18:28:37 CST] [INFO] Collector-worker-2 start working with jobs batch queue. buffer capacity 64
[2023/06/25 18:28:37 CST] [INFO] New session to mongodb://username:password@sss,xxx,yyy], successfully
[2023/06/25 18:28:37 CST] [INFO] Collector-worker-3 start working with jobs batch queue. buffer capacity 64
[2023/06/25 18:28:37 CST] [INFO] New session to mongodb://username:password@sss,xxx,yyy], successfully
[2023/06/25 18:28:37 CST] [INFO] Collector-worker-4 start working with jobs batch queue. buffer capacity 64
[2023/06/25 18:28:37 CST] [INFO] metric[name[mongodb-name] stage[full]] exit
[2023/06/25 18:28:37 CST] [INFO] New session to mongodb://username:password@sss,xxx,yyy], successfully
[2023/06/25 18:28:37 CST] [INFO] Collector-worker-5 start working with jobs batch queue. buffer capacity 64
[2023/06/25 18:28:37 CST] [INFO] New session to mongodb://username:password@sss,xxx,yyy], successfully
[2023/06/25 18:28:37 CST] [INFO] Collector-worker-6 start working with jobs batch queue. buffer capacity 64
[2023/06/25 18:28:37 CST] [INFO] New session to mongodb://username:password@sss,xxx,yyy], successfully
[2023/06/25 18:28:37 CST] [INFO] Collector-worker-7 start working with jobs batch queue. buffer capacity 64
[2023/06/25 18:28:37 CST] [INFO] Syncer[rep-name] poll oplog syncer start. ckpt_interval[5000ms], gid[[]], shard_key[collection]
[2023/06/25 18:28:37 CST] [INFO] Oplog sync[rep-name] create checkpoint manager with url[mongodb://username:password@zzz,xxx,yyy] table[mongoshake.ckpt_default] start-position[7248564190326030337[1687687866, 1]]
[2023/06/25 18:28:37 CST] [INFO] New session to [mongodb://username:password@zzz,xxx,yyy] successfully
[2023/06/25 18:28:37 CST] [INFO] rep-name Load exist checkpoint. content {"name":"rep-name","ckpt":7248564190326030337,"version":2,"fetch_method":"","oplog_disk_queue":"","oplog_disk_queue_apply_finish_ts":1}
[2023/06/25 18:28:37 CST] [INFO] load checkpoint value: {"name":"rep-name","ckpt":7248564190326030337,"version":2,"fetch_method":"","oplog_disk_queue":"","oplog_disk_queue_apply_finish_ts":1}
[2023/06/25 18:28:37 CST] [INFO] persister replset[rep-name] update fetch status to: store memory and apply
[2023/06/25 18:28:37 CST] [INFO] rep-name Load exist checkpoint. content {"name":"rep-name","ckpt":7248564190326030337,"version":2,"fetch_method":"","oplog_disk_queue":"","oplog_disk_queue_apply_finish_ts":1}
[2023/06/25 18:28:37 CST] [INFO] set query timestamp: 7248564190326030337[1687687866, 1]
[2023/06/25 18:28:37 CST] [INFO] start fetcher with src[mongodb://username:password@zzz,xxx,yyy] replica-name[mongodb-name] query-ts[7248564190326030337[1687687866, 1]]
[2023/06/25 18:28:37 CST] [INFO] oplogReader[src:[mongodb://username:password@zzz,xxx,yyy] replset:rep-name] ensure network
[2023/06/25 18:28:37 CST] [INFO] New session to [mongodb://username:password@zzz,xxx,yyy] successfully
[2023/06/25 18:28:37 CST] [WARN] CheckpointOperation updated is not suitable. lowest [0]. current [7248564190326030337[1687687866, 1]]. inputTs [0]. reason : smallest candidates is zero
[18:28:37 CST 2023/06/25] [CRIT] (mongoshake/collector.(*Batcher).filter:136) Syncer[rep-name] ddl oplog found[{"ts":7248568103041236993,"v":2,"op":"c","ns":"schema.$cmd","o":[{"Name":"drop","Value":"table_name"}],"o2":{"numRecords":105672}}] when oplog timestamp[7248568103041236993[1687688777, 1]] less than fullSyncFinishPosition[7248568704336658527[1687688917, 95]]
[2023/06/25 18:28:37 CST] [CRIT] Syncer[rep-name] ddl oplog found[{"ts":7248568103041236993,"v":2,"op":"c","ns":"schema.$cmd","o":[{"Name":"drop","Value":"table_name"}],"o2":{"numRecords":1056}}] when oplog timestamp[7248568103041236993[1687688777, 1]] less than fullSyncFinishPosition[7248568704336658527[1687688917, 95]]
panic: send on closed channel

goroutine 16522 [running]:
vendor/github.com/vinllen/log4go.(*ConsoleLogWriter).LogWrite(0xc420236ce0, 0xc420243980)
/home/zhuzhao.cx/mongo-shake/MongoShake/src/vendor/github.com/vinllen/log4go/termlog.go:41 +0x43
vendor/github.com/vinllen/log4go.Logger.intLogf(0xc42001f860, 0x4, 0xceb279, 0x1e, 0xc42cb5dac0, 0x1, 0x1)
/home/zhuzhao.cx/mongo-shake/MongoShake/src/vendor/github.com/vinllen/log4go/log4go.go:223 +0x289
vendor/github.com/vinllen/log4go.Info(0xba4f80, 0xe1bf20, 0xc42cb5dac0, 0x1, 0x1)
/home/zhuzhao.cx/mongo-shake/MongoShake/src/vendor/github.com/vinllen/log4go/wrapper.go:195 +0x1b2
mongoshake/common.NewMongoConn(0xc4201c24e1, 0x53, 0xcd9cc6, 0x7, 0xffffffffffffff01, 0x0, 0x0, 0x0, 0x0, 0xc420089c98, ...)
/home/zhuzhao.cx/mongo-shake/MongoShake/src/mongoshake/common/mgo_client.go:62 +0x508
mongoshake/executor.(*Executor).ensureConnection(0xc4200969c0, 0xc420030001)
/home/zhuzhao.cx/mongo-shake/MongoShake/src/mongoshake/executor/operation.go:25 +0x9e
mongoshake/executor.(*Executor).execute(0xc4200969c0, 0xc42032ce10, 0xc42000e308, 0x1)
/home/zhuzhao.cx/mongo-shake/MongoShake/src/mongoshake/executor/operation.go:58 +0xb5
mongoshake/executor.(*Executor).doSync(0xc4200969c0, 0xc42000e308, 0x1, 0x1, 0x45dc71, 0xc4203ba850)
/home/zhuzhao.cx/mongo-shake/MongoShake/src/mongoshake/executor/executor.go:245 +0xfd
mongoshake/executor.(*Executor).start(0xc4200969c0)
/home/zhuzhao.cx/mongo-shake/MongoShake/src/mongoshake/executor/executor.go:221 +0x84
created by mongoshake/executor.(*BatchGroupExecutor).Start
/home/zhuzhao.cx/mongo-shake/MongoShake/src/mongoshake/executor/executor.go:71 +0xb6