SH_BODYURI_REVERSE_SBL triggers for fonts.googleapis.com
xrat opened this issue
In the past, SH_BODYURI_REVERSE_SBL triggered because IP addresses for fonts.googleapis.com were listed. It's unclear why these IP addresses got listed in the first place; however, that's not the issue here. There should be a means to avoid false positives caused by URIs that are obviously used generically, like fonts.googleapis.com, fonts.gstatic.com, schemas.microsoft.com, or www.w3.org, to name a few candidates.
Note that the default threshold for spam in SpamAssassin is 5. The score currently assigned to SH_BODYURI_REVERSE_SBL by this project is 8 (spamassassin-dqs/3.4.1+/sh_scores.cf, line 7 in 666784c), so a single hit can push a message over the threshold on its own.
BTW, one message recently affected was:
Date: Thu, 01 Feb 2024 00:10:05 +0000
From: Spamhaus Technology <notification@service.spamhaus.com>
Reply-To: datafeed-accounts@spamhaus.com
Subject: Annual Account Notification
It seems to me that at least the URIs of some known-good, frequently used remote font hosts, and schema URIs in general, should be excluded. This would also save some resources on Spamhaus' DNS infrastructure.
Hi, this is a listing issue, not a plugin's one. I escalated internally but this is not the right place to raise issues about possible FPs.
Can we at least agree that schema URIs should not be checked?
No, because they could be ever changing and we do not maintain a list of them.
I don't mean a list but the extraction algorithm. The way the plugin operates, it extracts and subsequently checks URIs of remote fonts (I get that, though I see potential and actual problems) and of schemas, like those for XML etc., pointing at www.w3.org. I've never heard of them being exploited or being any reasonable indicator of spamminess.
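To illustrate what "the extraction algorithm" could mean here, a rough sketch follows. It is Python, not the plugin's Perl, and it is not how SpamAssassin or SH.pm actually extract URIs. The class name `LinkHostCollector`, the helper `link_hosts`, and the `LINK_ATTRS` set are hypothetical names made up for this example. The idea is only that namespace declarations (`xmlns`, `xmlns:*`) are identifiers rather than links, so their hosts never need a DNS lookup.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Attributes that make a client fetch or follow something (illustrative subset).
LINK_ATTRS = {"href", "src", "action", "background"}


class LinkHostCollector(HTMLParser):
    """Collect hostnames from link-like attributes, ignoring xmlns declarations."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # xmlns / xmlns:prefix values are namespace identifiers, not links.
            if name == "xmlns" or name.startswith("xmlns:"):
                continue
            if name in LINK_ATTRS and value:
                host = urlsplit(value).hostname
                if host:
                    self.hosts.add(host.lower())


def link_hosts(html):
    """Return the set of hostnames found in link-like attributes of `html`."""
    parser = LinkHostCollector()
    parser.feed(html)
    parser.close()
    return parser.hosts


sample = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:o="urn:schemas-microsoft-com:office:office">'
    '<head><link rel="stylesheet" '
    'href="https://fonts.googleapis.com/css?family=Roboto"></head>'
    '<body><a href="https://example.com/offer">Offer</a></body></html>'
)
hosts = link_hosts(sample)
```

With this sample, www.w3.org is not collected, while fonts.googleapis.com still is, because it appears in a real `href`. Remote fonts would therefore need separate handling.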
Hi, this is a listing issue, not a plugin's one. I escalated internally but this is not the right place to raise issues about possible FPs.
Thanks for escalating this internally. However, I have now come to the conclusion that I have to disagree with closing this issue as a "listing issue". Here's why:
- As fellow mailop Bernhard Lichtinger pointed out today on the mailop mailing list (where aspects of this issue are discussed, too), "IPs of fonts.googleapis.com got listed on SBL because these IPs are also used to serve firebasestorage.googleapis.com." I verified this just now at https://check.spamhaus.org/listed/?searchterm=216.58.212.170 . This listing is not disputed. We all agree that the IP is listed for good reasons. The problems are caused by how this plugin operates and how it is set up.
- Many legitimate senders have no choice about what HTML code their MSP or MUA uses. If it uses remote fonts hosted on fonts.googleapis.com, then as things currently stand their messages will very likely be flagged as spam by the plugin. On a low-volume server where I am currently testing the plugin, the FP rate is 5%.
- fonts.googleapis.com resolves to different IPs every now and then (the TTL is currently 300s) due to some kind of round-robin DNS. Assuming that not all of these IPs are SBL listed, this shows again that the problem is not the listing but how the plugin operates.
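The round-robin point can be sketched roughly as follows. This is a simplified Python model of a reverse-IP DNSBL check, not the plugin's code. The zone `sbl.example.invalid` is a made-up placeholder, the function names are hypothetical, and the DNS lookup is replaced by an injected function so that nothing touches the network. 192.0.2.10 is a documentation address standing in for an unlisted IP.

```python
import ipaddress

# Hypothetical zone name; a real setup would use the zone configured for the plugin.
EXAMPLE_ZONE = "sbl.example.invalid"


def reverse_query_name(ip, zone=EXAMPLE_ZONE):
    """Build a DNSBL query name for an IPv4 address: reversed octets plus zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def host_hits(a_records, is_listed):
    """Return those A records that the injected lookup reports as listed."""
    return [ip for ip in a_records if is_listed(reverse_query_name(ip))]


# Simulated blocklist containing only the IP mentioned above.
listed_names = {reverse_query_name("216.58.212.170")}
is_listed = listed_names.__contains__

# Two answers for the same hostname, a few minutes apart (assumed for illustration).
first_answer = ["216.58.212.170"]
later_answer = ["192.0.2.10"]

first_hits = host_hits(first_answer, is_listed)
later_hits = host_hits(later_answer, is_listed)
```

In this model the same message hits the rule or not depending only on which A record the resolver happened to return at scan time.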
The code at line 707 in 666784c does not work, @ricalfieri. SH_BODYURI_REVERSE_SBL uses check_sh_bodyuri_a(), which aims to use a skip list (line 697 in 666784c). That skip list is apparently meant to be filled via uridnsbl_skip_domains, which unfortunately has absolutely no effect (the actual setting in my SpamAssassin configuration is uridnsbl_skip_domain googleapis.com fonts.googleapis.com). My expectation is quite simple: please really skip the domain, as the SH.pm code leads admins to believe it does.
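For clarity, the behaviour I would expect from a skip list looks roughly like the sketch below. It is Python for illustration only, not SH.pm's Perl, and the function names `is_skipped` and `hosts_to_check` are hypothetical. It assumes suffix matching on label boundaries, which may differ from how SpamAssassin or the plugin actually match skip entries.

```python
def is_skipped(host, skip_domains):
    """True if `host` equals a skip entry or is a subdomain of one."""
    host = host.lower().rstrip(".")
    for domain in skip_domains:
        domain = domain.lower().strip(".")
        # Match on a label boundary so "notgoogleapis.com" is not skipped.
        if host == domain or host.endswith("." + domain):
            return True
    return False


def hosts_to_check(hosts, skip_domains):
    """Filter out skipped hosts before any DNS lookups are made."""
    return [h for h in hosts if not is_skipped(h, skip_domains)]


skip = {"googleapis.com", "fonts.googleapis.com"}
extracted = [
    "fonts.googleapis.com",
    "maps.googleapis.com",
    "example.com",
    "notgoogleapis.com",
]
remaining = hosts_to_check(extracted, skip)
```

The point is that the filter runs before the lookup, so a skipped host can never produce a hit and never causes a DNS query.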
Thanks robert-scheck, I was about to open a new issue about $skip_domains next. I think it's a separate bug. Would you like to open a new issue for it?
I think it's a separate bug. Would you like to open a new issue for it?
Thank you, done.
Just saw even more FPs for maps.googleapis.com and notifications.googleapis.com.