t2bot / matrix-media-repo

Highly configurable multi-domain media repository for Matrix.

Home Page: https://docs.t2bot.io/matrix-media-repo

The media repo spontaneously connects to incorrect host, causing TLS errors.

9p4 opened this issue

commented

Occasionally, the media repo will stop downloading new media for one specific host because it gets a TLS error (and then returns a 500 error code).

In the logs, it appears as if the media repo is contacting matrix.example.org, but getting back TLS certificates for example.net, causing an error.

Stranger still, example.net is my own server, and the media I'm trying to download is hosted on example.org.

Oct 18 00:46:56 hostname matrixmediarepo[1476272]: time="2023-10-18 04:46:56.958 Z" level=error msg="Unexpected error locating media: Get \"https://matrix.example.org:443/_matrix/media/v3/download/example.org/REDACTED?allow_remote=false\": tls: failed to verify certificate: x509: certificate is valid for *.example.net, example.net, not matrix.example.org" allowRemote=true contentLength=0 contentType="" filename="" host=matrix.example.net mediaId=REDACTED method=GET queryString="" remoteAddr="10.88.0.1:50920" requestId=REQ-10567 resource=/_matrix/media/v3/download/example.org/REDACTED server=example.org userAgent="Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/113.0"
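For anyone trying to reproduce this outside the media repo, here is a minimal Go sketch of the check that fails in the log line above. It is not taken from the media repo's code. The function names `certCoversHost` and `presentedNames` are made up for illustration, and the certificate names are copied from the error message. `certCoversHost` replays the hostname verification offline; `presentedNames` connects to a server and lists the names on whatever certificate it actually presents, skipping verification so that a mismatched certificate can still be inspected.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"net"
	"os"
	"time"
)

// certCoversHost reports whether a certificate carrying the given DNS names
// would pass Go's standard hostname verification for host.
func certCoversHost(dnsNames []string, host string) bool {
	cert := &x509.Certificate{DNSNames: dnsNames}
	return cert.VerifyHostname(host) == nil
}

// presentedNames connects to addr, sends serverName as SNI, and returns the
// DNS names on the leaf certificate. Verification is disabled on purpose:
// this is a diagnostic helper, not something to use for real traffic.
func presentedNames(addr, serverName string) ([]string, error) {
	dialer := &net.Dialer{Timeout: 10 * time.Second}
	conn, err := tls.DialWithDialer(dialer, "tcp", addr, &tls.Config{
		ServerName:         serverName,
		InsecureSkipVerify: true, // inspect the certificate even if it is wrong
	})
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return nil, fmt.Errorf("no certificate presented by %s", addr)
	}
	return certs[0].DNSNames, nil
}

func main() {
	// Offline replay of the logged error: names from the certificate
	// versus the host the media repo asked for.
	names := []string{"*.example.net", "example.net"}
	fmt.Println(certCoversHost(names, "matrix.example.org")) // false

	// Optional live probe: go run . matrix.example.org:443 matrix.example.org
	if len(os.Args) == 3 {
		got, err := presentedNames(os.Args[1], os.Args[2])
		if err != nil {
			fmt.Println("probe failed:", err)
			return
		}
		fmt.Println("presented:", got, "covers host:", certCoversHost(got, os.Args[2]))
	}
}
```

Running the live probe from the same machine as the media repo, at a time when downloads are failing, should show whether the wrong certificate is really being served for that address.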

Version: 1.3.2

Synapse Version: 1.94.0

This is most likely a host configuration error outside the media repo's control. If you can share the unredacted server name, it should be easy to confirm.

commented

This only happens occasionally. Restarting the media repo fixes the issue for a time, then fails again. Usually it goes a week or so before failing.

Those symptoms still sound like a configuration issue on that host. Another possibility is a DNS error during server name resolution, but that shouldn't result in port 443 being used; picking that port would itself require a misconfiguration.

commented

Here is the configuration and relevant log lines.

homeserver.txt
media-repo.txt
cert-fails.txt

From the federation tester:

server name/.well-known result contains explicit port number: no SRV lookup done

This means the homeserver is asking to be accessed on port 443, but isn't responding. This is a homeserver configuration issue.
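To illustrate the federation tester message, here is a rough Go sketch of the one discovery step it refers to: when the delegated server name from .well-known carries an explicit port, that port is used directly and no SRV lookup happens. This is a simplification of the Matrix server discovery rules, not the media repo's actual resolver. The `target` type and `resolveDelegated` function are hypothetical names, and the sample value `matrix.example.org:443` is assumed from the URL in the logs.

```go
package main

import (
	"fmt"
	"net"
)

// target describes where a federation request would be sent.
type target struct {
	Host     string
	Port     string // left empty when a later step (SRV or a default) must choose it
	NeedsSRV bool
}

// resolveDelegated handles a delegated server name from .well-known.
// An explicit port is used as given and suppresses the SRV lookup;
// without one, the caller is expected to try SRV next.
func resolveDelegated(mServer string) target {
	host, port, err := net.SplitHostPort(mServer)
	if err == nil && port != "" {
		return target{Host: host, Port: port}
	}
	// No usable port in the value: keep the name as-is and defer to SRV.
	return target{Host: mServer, NeedsSRV: true}
}

func main() {
	for _, v := range []string{"matrix.example.org:443", "matrix.example.org"} {
		fmt.Printf("%q -> %+v\n", v, resolveDelegated(v))
	}
}
```

With an explicit `:443`, whatever answers on port 443 of that host has to present a certificate valid for the delegated name.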

commented

The failure is only for the IPv6 address, not the IPv4 one. The IPv4 connection still works. I'll get rid of the AAAA record and check if the issue persists.
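A small Go sketch of how the per-family check could be done: resolve the host, then attempt a verified TLS handshake against each address individually so that IPv4 and IPv6 results show up separately. The helper names `splitByFamily` and `handshake` are invented for this example, and port 443 is assumed from the logs.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"os"
	"time"
)

// splitByFamily separates textual IP addresses into IPv4 and IPv6.
// Strings that do not parse as plain IP addresses are dropped.
func splitByFamily(addrs []string) (v4, v6 []string) {
	for _, a := range addrs {
		ip := net.ParseIP(a)
		switch {
		case ip == nil:
			continue
		case ip.To4() != nil:
			v4 = append(v4, a)
		default:
			v6 = append(v6, a)
		}
	}
	return v4, v6
}

// handshake dials one specific IP and verifies the certificate against host.
// It returns nil when the handshake and verification both succeed.
func handshake(ip, port, host string) error {
	dialer := &net.Dialer{Timeout: 10 * time.Second}
	conn, err := tls.DialWithDialer(dialer, "tcp", net.JoinHostPort(ip, port),
		&tls.Config{ServerName: host})
	if err != nil {
		return err
	}
	return conn.Close()
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: probe <hostname>")
		return
	}
	host := os.Args[1]
	addrs, err := net.LookupHost(host)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	v4, v6 := splitByFamily(addrs)
	for _, ip := range append(v4, v6...) {
		// A nil error prints as <nil>; anything else is the handshake failure.
		fmt.Printf("%-40s %v\n", ip, handshake(ip, "443", host))
	}
}
```

If only the IPv6 rows report a certificate error, the AAAA record is probably pointing at a machine that serves a different site.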

commented

Issue persists.

Oct 21 22:23:54 sr0.ersei.net matrixmediarepo[3539209]: time="2023-10-22 02:23:54.328 Z" level=error msg="Unexpected error locating media: Get \"https://matrix.lilysthings.org:443/_matrix/media/v3/download/lilysthings.org/x?allow_remote=false\": tls: failed to verify certificate: x509: certificate is valid for *.ersei.net, ersei.net, not matrix.lilysthings.org" allowRemote=true contentLength=0 contentType="" filename="" host=matrix.ersei.net mediaId=x method=GET queryString="" remoteAddr="10.88.0.1:49156" requestId=REQ-2485 resource=/_matrix/media/r0/download/lilysthings.org/x server=lilysthings.org userAgent="Element/1.11.4 (iPhone 13 mini; iOS 16.7.1; Scale/3.00)"

I can't reproduce the issue, so possibly the certificate wasn't set up yet when a request was made and MMR cached the result. Either way, the error is not the result of an MMR bug and would only be possible due to a configuration issue on the remote end.