Ensuring bosh-dns query for tservers is working with drivers, peers table replicating, etc.
aegershman opened this issue
some client-side logging that a trial user is seeing:
```
2020-03-12T08:02:26.635-05:00 [APP/PROC/WEB/0] [OUT] [main] WARN com.datastax.driver.core.Cluster:2323 - You listed q-s0.tserver.dev-services-network.yugabyte-instance-f19c1f496b63.bosh/10.156.74.12:9042 in your contact points, but it wasn't found in the control host's system.peers at startup
2020-03-12T08:02:26.636-05:00 [APP/PROC/WEB/0] [OUT] [main] WARN com.datastax.driver.core.Cluster:2323 - You listed q-s0.tserver.dev-services-network.yugabyte-instance-f19c1f496b63.bosh/10.156.73.22:9042 in your contact points, but it wasn't found in the control host's system.peers at startup
2020-03-12T08:02:26.636-05:00 [APP/PROC/WEB/0] [OUT] [main] WARN com.datastax.driver.core.Cluster:2323 - You listed q-s0.tserver.dev-services-network.yugabyte-instance-f19c1f496b63.bosh/10.156.73.136:9042 in your contact points, but it wasn't found in the control host's system.peers at startup
2020-03-12T08:02:26.636-05:00 [APP/PROC/WEB/0] [OUT] [main] WARN com.datastax.driver.core.Cluster:2323 - You listed q-s0.tserver.dev-services-network.yugabyte-instance-f19c1f496b63.bosh/10.156.73.24:9042 in your contact points, but it wasn't found in the control host's system.peers at startup
2020-03-12T08:02:26.637-05:00 [APP/PROC/WEB/0] [OUT] [main] WARN com.datastax.driver.core.Cluster:2323 - You listed q-s0.tserver.dev-services-network.yugabyte-instance-f19c1f496b63.bosh/10.156.74.14:9042 in your contact points, but it wasn't found in the control host's system.peers at startup
...
2020-03-12T07:55:19.689-05:00 [APP/PROC/WEB/0] [OUT] [cluster1-worker-41] WARN com.datastax.driver.core.ControlConnection:559 - No row found for host /10.156.73.24 in q-s0.tserver.dev-services-network.yugabyte-instance-f19c1f496b63.bosh/10.156.73.137:9042's peers system table. /10.156.73.24 will be ignored.
2020-03-12T13:20:19.698-05:00 [APP/PROC/WEB/0] [OUT] [cluster1-worker-567] INFO com.datastax.driver.core.Cluster:2327 - Cassandra host /10.156.74.12:9042 removed
```
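The warnings above come down to a set-membership check: at startup the driver compares each contact point's resolved IP against the rows of the control host's `system.peers` table, and any contact point without a matching row is warned about and later ignored (the control host itself lives in `system.local`, not `system.peers`). A minimal sketch of that check, using the IPs from the logs above -- illustrative only, not the driver's actual code:

```python
def missing_from_peers(contact_points, peers_rows):
    """Return contact-point IPs that have no matching row in system.peers."""
    peer_ips = {row["peer"] for row in peers_rows}
    return sorted(ip for ip in contact_points if ip not in peer_ips)

# The five tserver IPs resolved from the bosh-dns contact point (see WARN lines above)
contact_points = ["10.156.74.12", "10.156.73.22", "10.156.73.136",
                  "10.156.73.24", "10.156.74.14"]

# If the control host's peers table has no usable rows for them (as at this
# startup), every single contact point gets flagged:
print(missing_from_peers(contact_points, []))
```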
see:
- https://github.com/yugabyte/yugabyte-db/issues/856
- https://github.com/yugabyte/yugabyte-db/issues/285
- https://github.com/yugabyte/yugabyte-db/issues/2390
- https://docs.datastax.com/en/developer/java-driver/3.7/manual/address_resolution/
- https://docs.datastax.com/en/developer/java-driver/4.5/
- https://stackoverflow.com/questions/54282262/how-does-one-add-a-node-or-nodes-to-an-existing-yugabyte-db-ce-cluster/54314712
From the linked GitHub issue:

> The `system.local` query reveals that the `broadcast_address` column doesn't have the desired IP address.
>
> It looks like the default value of `--rpc_bind_addresses` (0.0.0.0) makes YB non-deterministic when there are multiple addresses it could bind, and it is picking an IPv6 address; this could be causing the warning you are seeing in the client logs. For your case, where the IP you want to use is 192.168.0.100, could you update your yb-master & yb-tserver conf files to also pass:
>
> `--rpc_bind_addresses 192.168.0.100`
>
> Note: the `rpc_bind_addresses` value is also used as the identity of the entity (yb-tserver/yb-master), so it is better to make this change with a clean-slate, fresh install of the environment. You would want to wipe out your old data directory; presumably that's OK since this is a test deployment.
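The non-determinism described in that advice can be sketched: with a wildcard bind address, the identity the server advertises has to be picked from whatever local addresses happen to enumerate, and nothing pins the choice. A toy illustration -- the selection rule here is an assumption for illustration, not YugabyteDB's actual logic:

```python
WILDCARDS = {"0.0.0.0", "::"}

def advertised_address(bind_addr, local_addrs):
    """Pick the address a server would advertise to peers and clients.

    If bind_addr is explicit, that is the identity. If it is a wildcard,
    the choice falls to whichever local address enumerates first -- which
    may be an IPv6 address, and may differ across restarts.
    """
    if bind_addr not in WILDCARDS:
        return bind_addr
    return local_addrs[0]  # enumeration order is not guaranteed

# An explicit bind address pins the identity:
print(advertised_address("192.168.0.100", ["fe80::1", "192.168.0.100"]))
# A wildcard bind lets whatever enumerates first win (here, IPv6 link-local):
print(advertised_address("0.0.0.0", ["fe80::1", "192.168.0.100"]))
```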
Here's what I'm seeing when querying `select * from system.local`:
```
tserver/40dd8fc9-b2da-4517-a066-649e777f45ad:~# source /var/vcap/packages/python-*/bosh/runtime.env
tserver/40dd8fc9-b2da-4517-a066-649e777f45ad:~# /var/vcap/packages/yugabyte/bin/cqlsh --cqlshrc /var/vcap/jobs/yb-tserver/config/cqlshrc
Connected to local cluster at q-m88141n3s0.q-g86407.bosh:9042.
[cqlsh 5.0.1 | Cassandra 3.9-SNAPSHOT | CQL spec 3.4.2 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh> select * from system.local
             ...
             ... ;
 key   | bootstrapped | broadcast_address | cluster_name  | cql_version | data_center | gossip_generation | host_id                              | listen_address | native_protocol_version | partitioner                                 | rack       | release_version | rpc_address  | schema_version                       | thrift_version | tokens                  | truncated_at
-------+--------------+-------------------+---------------+-------------+-------------+-------------------+--------------------------------------+----------------+-------------------------+---------------------------------------------+------------+-----------------+--------------+--------------------------------------+----------------+-------------------------+--------------
 local | COMPLETED    | 10.156.89.36      | local cluster | 3.4.2       | us-west-2   | 0                 | b04fb57f-1a27-738d-4040-f9e27dd3a688 | 10.156.89.36   | 4                       | org.apache.cassandra.dht.Murmur3Partitioner | us-west-2a | 3.9-SNAPSHOT    | 10.156.89.36 | 00000000-0000-0000-0000-000000000000 | 20.1.0         | {'6148820866244280320'} | null
```
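Given the advice above, this `system.local` row can be spot-checked: `broadcast_address`, `listen_address`, and `rpc_address` should all agree and none should be a wildcard. A small sketch of that check against the row above (a hypothetical helper, not a YugabyteDB tool):

```python
def local_row_ok(row):
    """True if the node advertises one concrete, consistent address."""
    addrs = {row["broadcast_address"], row["listen_address"], row["rpc_address"]}
    return len(addrs) == 1 and addrs.isdisjoint({"0.0.0.0", "::"})

# The row from the cqlsh output above: all three address columns agree.
row = {"broadcast_address": "10.156.89.36",
       "listen_address": "10.156.89.36",
       "rpc_address": "10.156.89.36"}
print(local_row_ok(row))
```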
The peers of a sandbox cluster:

```
cassandra@cqlsh> select * from system.peers;
 peer          | data_center | host_id                              | preferred_ip  | rack       | release_version | rpc_address   | schema_version                       | tokens
---------------+-------------+--------------------------------------+---------------+------------+-----------------+---------------+--------------------------------------+--------------------------
 10.156.90.8   | us-west-2   | 81c53aae-a11a-f8b9-2a4a-f3fbd8d89251 | 10.156.90.8   | us-west-2c | null            | 10.156.90.8   | 00000000-0000-0000-0000-000000000000 | {'0'}
 10.156.89.139 | us-west-2   | db60487a-ee7b-95a2-7c40-bca5c6d79ca3 | 10.156.89.139 | us-west-2b | null            | 10.156.89.139 | 00000000-0000-0000-0000-000000000000 | {'-6149102341220990976'}
(2 rows)
```
And the peers of the cluster in question:

```
cassandra@cqlsh> select * from system.peers;
 peer          | data_center | host_id                              | preferred_ip  | rack       | release_version | rpc_address   | schema_version                       | tokens
---------------+-------------+--------------------------------------+---------------+------------+-----------------+---------------+--------------------------------------+--------------------------
 10.156.73.137 | us-west-2   | 7df4c54b-e5a5-c6aa-ed4b-c21d968afd23 | 10.156.73.137 | us-west-2b | null            | 10.156.73.137 | 00000000-0000-0000-0000-000000000000 | {'0'}
 10.156.73.24  | us-west-2   | 7cf6f9a6-3665-e88d-6d4a-466280323a73 | 10.156.73.24  | us-west-2a | null            | 10.156.73.24  | 00000000-0000-0000-0000-000000000000 | {'3074269695633784832'}
 10.156.73.136 | us-west-2   | 94a8abff-c4b2-4686-834f-613e6b61dc9e | 10.156.73.136 | us-west-2b | null            | 10.156.73.136 | 00000000-0000-0000-0000-000000000000 | {'6148539391267569664'}
 10.156.73.22  | us-west-2   | 3a2e396d-611b-5bbd-4a43-a226fa67a8b8 | 10.156.73.22  | us-west-2a | null            | 10.156.73.22  | 00000000-0000-0000-0000-000000000000 | {'9222809086901354496'}
 10.156.74.12  | us-west-2   | 7ae41c41-eb5d-608b-094f-5192dc288ef7 | 10.156.74.12  | us-west-2c | null            | 10.156.74.12  | 00000000-0000-0000-0000-000000000000 | {'-3075395595540627456'}
(5 rows)
```
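Cross-checking the five contact points from the client logs against this peers table localizes the mismatch: a plain set difference shows which contact points lack a peers row and which peers were never listed as a contact point (and note that whichever node you query is expected to be absent from its own `system.peers`):

```python
# Contact-point IPs from the client WARN lines vs. peers rows from this cqlsh output
contact_points = {"10.156.74.12", "10.156.73.22", "10.156.73.136",
                  "10.156.73.24", "10.156.74.14"}
peers = {"10.156.73.137", "10.156.73.24", "10.156.73.136",
         "10.156.73.22", "10.156.74.12"}

print(sorted(contact_points - peers))  # -> ['10.156.74.14']: contact point with no peers row
print(sorted(peers - contact_points))  # -> ['10.156.73.137']: peer not in the contact list
```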
Not actually a bug. This is fine.