taosdata / TDengine

TDengine is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, Industrial IoT and DevOps.

Home Page:https://tdengine.com

high availability in the 3.2.3.0 client

tonyfeel opened this issue · comments

Environment:
A 3-node TDengine 3.2.3.0 cluster deployed on ECS (disks on NAS), with 3 dnodes and 3 mnodes. Clients connect using the native connection.

The 3.2.3.0 client is deployed from an image. The Deployment mounts taos.cfg through a ConfigMap and configures hosts entries through hostAliases:

    hostAliases:
      - hostnames:
          - td-test-001
        ip: 172.20.101.7
      - hostnames:
          - td-test-002
        ip: 172.20.101.4
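One detail that can be checked mechanically: the hostAliases snippet as posted maps only td-test-001 and td-test-002, while the `show dnodes` output in this report lists a third endpoint, td-test-003. Whether the third entry was simply left out of the post is not known. The following is a minimal Python sketch that compares the two lists; the helper name `unmapped_hosts` is made up for illustration and the values are copied from this report.

```python
def unmapped_hosts(endpoints, host_aliases):
    """Return endpoint hostnames that have no entry in host_aliases.

    endpoints: iterable of "host:port" strings, as shown by `show dnodes`.
    host_aliases: dict of hostname -> IP, mirroring the pod's hostAliases.
    """
    missing = []
    for endpoint in endpoints:
        host = endpoint.rsplit(":", 1)[0]  # drop the port
        if host not in host_aliases and host not in missing:
            missing.append(host)
    return missing


# Values copied from this report (assumption: the posted snippet is complete).
endpoints = ["td-test-001:6030", "td-test-002:6030", "td-test-003:6030"]
host_aliases = {"td-test-001": "172.20.101.7", "td-test-002": "172.20.101.4"}

print(unmapped_hosts(endpoints, host_aliases))
```

If the real manifest lists all three hosts, or DNS resolves them, the check returns an empty list and the cause lies elsewhere.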

The database has 8 vgroups and 3 replicas.

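Problem:
After stopping a single dnode, the service reports: Sync leader is unreachable. Reads and writes only work normally once every vgroup is ready, so high availability is not achieved. Help resolving this is needed.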

Other relevant information:

`taos> show dnodes;
id | endpoint | vnodes | support_vnodes | status | create_time | reboot_time | note |

       1 | td-test-001:6030               |     12 |            100 | ready        | 2024-05-25 20:29:09.259 | 2024-05-31 17:53:33.764 |                                |
       2 | td-test-002:6030               |     10 |            100 | ready        | 2024-05-25 20:31:15.239 | 2024-05-31 12:49:40.701 |                                |
       3 | td-test-003:6030               |     10 |            100 | ready        | 2024-05-25 20:31:18.526 | 2024-05-31 12:49:44.478 |                                |

Query OK, 3 row(s) in set (0.001335s)

taos> show mnodes;
id | endpoint | role | status | create_time | role_time |

       1 | td-test-001:6030               | follower       | ready       | 2024-05-25 20:29:09.289 | 2024-05-31 17:54:13.868 |
       2 | td-test-002:6030               | leader         | ready       | 2024-05-25 20:31:38.000 | 2024-05-31 12:50:14.726 |
       3 | td-test-003:6030               | follower       | ready       | 2024-05-25 20:31:43.324 | 2024-05-31 12:50:21.186 |

Query OK, 3 row(s) in set (0.001417s)

taos> show vnodes;
dnode_id | vgroup_id | db_name | status | role_time | start_time | restored |

       1 |           2 | log                            | follower    | 2024-05-31 17:54:13.868 | 2024-05-31 17:53:34.547 | true     |
       2 |           2 | log                            | leader      | 2024-05-31 12:50:18.549 | 2024-05-31 12:49:42.086 | true     |
       3 |           2 | log                            | follower    | 2024-05-31 12:50:18.008 | 2024-05-31 12:49:46.818 | true     |
       1 |           3 | log                            | follower    | 2024-05-31 17:54:13.868 | 2024-05-31 17:53:40.358 | true     |
       2 |           3 | log                            | leader      | 2024-05-31 12:50:16.506 | 2024-05-31 12:49:49.318 | true     |
       3 |           3 | log                            | follower    | 2024-05-31 12:50:15.956 | 2024-05-31 12:49:54.340 | true     |
       1 |           4 | audit                          | leader      | 2024-05-31 17:54:13.882 | 2024-05-31 17:53:47.058 | true     |
       1 |           5 | audit                          | leader      | 2024-05-31 17:54:13.913 | 2024-05-31 17:53:47.786 | true     |
       1 |           8 | energy_data                    | follower    | 2024-05-31 17:54:13.928 | 2024-05-31 17:53:48.840 | true     |
       2 |           8 | energy_data                    | leader      | 2024-05-31 12:50:17.144 | 2024-05-31 12:49:56.791 | true     |
       3 |           8 | energy_data                    | follower    | 2024-05-31 12:50:16.604 | 2024-05-31 12:50:02.047 | true     |
       1 |           9 | energy_data                    | follower    | 2024-05-31 17:54:13.928 | 2024-05-31 17:53:51.777 | true     |
       2 |           9 | energy_data                    | leader      | 2024-05-31 12:50:16.362 | 2024-05-31 12:49:58.646 | true     |
       3 |           9 | energy_data                    | follower    | 2024-05-31 12:50:15.809 | 2024-05-31 12:50:03.524 | true     |
       1 |          10 | energy_data                    | follower    | 2024-05-31 17:54:14.123 | 2024-05-31 17:53:54.595 | true     |
       2 |          10 | energy_data                    | leader      | 2024-05-31 12:50:18.871 | 2024-05-31 12:50:00.297 | true     |
       3 |          10 | energy_data                    | follower    | 2024-05-31 12:50:18.325 | 2024-05-31 12:50:05.052 | true     |
       1 |          11 | energy_data                    | follower    | 2024-05-31 17:54:14.130 | 2024-05-31 17:53:57.794 | true     |
       2 |          11 | energy_data                    | leader      | 2024-05-31 12:50:18.869 | 2024-05-31 12:50:02.444 | true     |
       3 |          11 | energy_data                    | follower    | 2024-05-31 12:50:18.334 | 2024-05-31 12:50:06.905 | true     |
       1 |          12 | energy_data                    | follower    | 2024-05-31 17:54:14.130 | 2024-05-31 17:54:00.781 | true     |
       2 |          12 | energy_data                    | leader      | 2024-05-31 12:50:18.279 | 2024-05-31 12:50:05.185 | true     |
       3 |          12 | energy_data                    | follower    | 2024-05-31 12:50:17.732 | 2024-05-31 12:50:09.313 | true     |
       1 |          13 | energy_data                    | follower    | 2024-05-31 17:54:15.705 | 2024-05-31 17:54:04.212 | true     |
       2 |          13 | energy_data                    | follower    | 2024-05-31 17:47:36.307 | 2024-05-31 12:50:06.995 | true     |
       3 |          13 | energy_data                    | leader      | 2024-05-31 17:47:36.339 | 2024-05-31 12:50:11.095 | true     |
       1 |          14 | energy_data                    | follower    | 2024-05-31 17:54:15.704 | 2024-05-31 17:54:08.019 | true     |
       2 |          14 | energy_data                    | follower    | 2024-05-31 17:47:35.131 | 2024-05-31 12:50:09.064 | true     |
       3 |          14 | energy_data                    | leader      | 2024-05-31 17:47:35.156 | 2024-05-31 12:50:13.142 | true     |
       1 |          15 | energy_data                    | follower    | 2024-05-31 17:54:14.130 | 2024-05-31 17:54:11.183 | true     |
       2 |          15 | energy_data                    | leader      | 2024-05-31 12:50:17.736 | 2024-05-31 12:50:11.245 | true     |
       3 |          15 | energy_data                    | follower    | 2024-05-31 12:50:17.195 | 2024-05-31 12:50:15.343 | true     |

Query OK, 32 row(s) in set (0.001633s)

taos> show energy_data.vgroups;
vgroup_id | db_name | tables | v1_dnode | v1_status | v2_dnode | v2_status | v3_dnode | v3_status | v4_dnode | v4_status | cacheload | cacheelements | tsma |

       8 | energy_data                    |        1666 |        1 | follower    |        2 | leader      |        3 | follower    | NULL     | NULL        |           0 |             0 |    0 |
       9 | energy_data                    |        1591 |        1 | follower    |        2 | leader      |        3 | follower    | NULL     | NULL        |           0 |             0 |    0 |
      10 | energy_data                    |        1616 |        1 | follower    |        2 | leader      |        3 | follower    | NULL     | NULL        |           0 |             0 |    0 |
      11 | energy_data                    |        1615 |        1 | follower    |        2 | leader      |        3 | follower    | NULL     | NULL        |           0 |             0 |    0 |
      12 | energy_data                    |        1690 |        1 | follower    |        2 | leader      |        3 | follower    | NULL     | NULL        |           0 |             0 |    0 |
      13 | energy_data                    |        1279 |        1 | follower    |        2 | follower    |        3 | leader      | NULL     | NULL        |           0 |             0 |    0 |
      14 | energy_data                    |        1205 |        1 | follower    |        2 | follower    |        3 | leader      | NULL     | NULL        |           0 |             0 |    0 |
      15 | energy_data                    |        1618 |        1 | follower    |        2 | leader      |        3 | follower    | NULL     | NULL        |           0 |             0 |    0 |

Query OK, 8 row(s) in set (0.001718s)

taos>
`
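To make the failure scenario concrete, here is a rough Python sketch that takes the energy_data rows from the `show vnodes` output above and reports, for a given stopped dnode, which vgroups lose their leader and whether the surviving replicas still form a majority. The majority rule (more than half of the replicas must be alive) is the usual Raft-style assumption, and the function name `impact_of_stopping` is hypothetical.

```python
from collections import defaultdict

# vgroup_id -> dnode_id of the current leader, read off `show vnodes` above.
LEADERS = {8: 2, 9: 2, 10: 2, 11: 2, 12: 2, 13: 3, 14: 3, 15: 2}

# (dnode_id, vgroup_id, role) for every energy_data replica on dnodes 1..3.
VNODES = [
    (dnode, vgroup, "leader" if dnode == leader else "follower")
    for vgroup, leader in LEADERS.items()
    for dnode in (1, 2, 3)
]


def impact_of_stopping(vnodes, stopped_dnode):
    """Return (vgroups needing a new leader, vgroups left without a majority)."""
    groups = defaultdict(list)
    for dnode, vgroup, role in vnodes:
        groups[vgroup].append((dnode, role))

    needs_election, lost_quorum = [], []
    for vgroup, members in sorted(groups.items()):
        alive = [d for d, _ in members if d != stopped_dnode]
        # A majority needs strictly more than half of the replicas.
        if len(alive) * 2 <= len(members):
            lost_quorum.append(vgroup)
        if any(d == stopped_dnode and role == "leader" for d, role in members):
            needs_election.append(vgroup)
    return needs_election, lost_quorum


needs_election, lost_quorum = impact_of_stopping(VNODES, stopped_dnode=2)
print("need a new leader:", needs_election)
print("without a majority:", lost_quorum)
```

For dnode 2 this lists six energy_data vgroups that need a new leader and none that lose their majority, which is consistent with expecting a 3-replica database to keep serving after one node stops. It does not explain why the error persists.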

The mnode also needs to have three replicas.

I just noticed you have 3 mnodes, sorry.

Could you stop one node, then run show mnodes and show <database>.vgroups and post screenshots?

The leader is on dnode2:
image

Simulating dnode2 going offline:
image
image

At this point the application reports: ### Cause: java.sql.SQLException: TDengine ERROR (8000090c): Sync leader is unreachable

image
Even when restored is true for all vnodes, it still reports ERROR (8000090c): Sync leader is unreachable

image
