Unable to use dns:// prefix for clusterInformation.rpcAddress configuration
hferentschik opened this issue · comments
Expected Behavior
We are trying to setup multi-cluster replication and for that purpose need to configure the clusterMetadata
of the Temporal configuration. This is described here. Specific to the rpcAddress the docs says:
rpcAddress - indicate the remote service address (host:port). Host can be DNS name. Use dns:/// prefix to enable round-robin between IP address for DNS name.
We are using this prefix everywhere where we create a gRPC client. In particular our workers connect via dns:///<service-address>:7233
. The reason to explicitly specify the dns resolver is that grpc-go deviates from the gRPC spec and does not use the dns resolver as default. See grpc/grpc-go#3544 (comment). It uses a so called "passthrough" as default. If one wants to use dns, one needs to explicitly specify the resolver using dns:///
.
It seems in most cases Temporal allows the use of the prefix, except for clusterInformation.rpcAddress
.
Actual Behavior
When configuring the clusterInformation.rpcAddress
with something like dns:///<host>:7233
an error is returned when one tried to use cluster replication commands, for example temporal operator cluster upsert
in order to establish a connection between two clusters. The returned error is:
address dns:///myhost:7233: too many colons in address
See also https://go.dev/play/p/9Rwow_gtcCF
The problem is that RPCFactory#CreateRemoteFrontendGRPCConnection uses net.SplitHostPort
to extract the hostname from the configured rpcAddress in order to lookup the client config for the host.
net.SplitHostPort
cannot handle the dns:///
prefix.
func (d *RPCFactory) CreateRemoteFrontendGRPCConnection(rpcAddress string) *grpc.ClientConn {
var tlsClientConfig *tls.Config
var err error
if d.tlsFactory != nil {
hostname, _, err2 := net.SplitHostPort(rpcAddress)
if err2 != nil {
d.logger.Fatal("Invalid rpcAddress for remote cluster", tag.Error(err2))
}
tlsClientConfig, err = d.tlsFactory.GetRemoteClusterClientConfig(hostname)
if err != nil {
d.logger.Fatal("Failed to create tls config for gRPC connection", tag.Error(err))
return nil
}
}
return d.dial(rpcAddress, tlsClientConfig)
}
Steps to Reproduce the Problem
- Use the
dns:///
prefix in theclusterInformation.rpcAddress
of the Temporal server in order to enforce the dns resolver in grpc-go - Use an operation which calls
CreateRemoteFrontendGRPCConnection
, eg try enabling a cluster connection between two clusters.
Specifications
- Version: v1.22.7
Something like this https://go.dev/play/p/403gj5Zovy7 could work (assuming the port is mandatory):
package main
import (
"fmt"
"net"
"net/url"
"strings"
)
func main() {
host, err := hostname("myhost:7233")
fmt.Printf("host: %s, error: %v\n", host, err)
host, err = hostname("dns:///myhost:7233")
fmt.Printf("host: %s, error: %v\n", host, err)
}
func hostname(rpcAddress string) (string, error) {
hostname, _, err := net.SplitHostPort(rpcAddress)
if err != nil {
grpcUrl, parseErr := url.Parse(rpcAddress)
if parseErr != nil {
return "", parseErr
}
// Check the host field, if set rpcAddress is in plain host:port format
// If the host is not set the address specified a resolver, eg dns:///host:port.
// In this case the address is in the path of the URL.
// See https://github.com/grpc/grpc-go/blob/c8083227ee3181db60cf37005940e11865a572b4/resolver/resolver.go#L272-L288
address := grpcUrl.Host
if address == "" {
address = grpcUrl.Path
}
if address == "" {
address = grpcUrl.Opaque
}
address = strings.TrimPrefix(address, "/")
hostname, _, _ = net.SplitHostPort(address)
err = nil
}
return hostname, err
}
Alternatively one could just use strings.CutPrefix
to remove dns:///
, but then that would only cover the single dns resolver.
Happy to create/work on a PR if we can agree on the approach. Or maybe I am still missing something here?
@yux0 wdyt? Did you have a chance to look at this?