umputun / tg-spam

Anti-Spam bot for Telegram

Home Page: https://tg-spam.umputun.dev

Warnings in logs

asm0dey opened this issue

During the ban-unban cycle I see the following messages in the log:

tg-spam  | 2024/02/14 08:06:41.823 [INFO]  user Мария Свиткова detected as spammer: {name: stopword, spam: false, details: not found}, {name: emoji, spam: false, details: 0/2}, {name: similarity, spam: false, details: 0.04/0.50}, {name: classifier, spam: true, details: probability of spam: 61.25%}, {name: cas, spam: false, details: record not found}, "не переживай) из 5 северных сияний, которые были у нас в городе, я проспала 2. И еще 2 были видны из любого села по соседству, но не у нас, потому что над нашим стояли очередные облака"
tg-spam  | 2024/02/14 08:06:41.833 [INFO]  detected spam entry added for user_id:370236309, name:irda_noire
tg-spam  | 2024/02/14 08:06:41.877 [INFO]  {370236309 irda_noire Мария Свиткова} banned by bot for 9600h0m0s
tg-spam  | 2024/02/14 08:06:53.479 [WARN]  failed to send message as markdown, Bad Request: can't parse entities: Can't find end of the entity starting at byte offset 34
tg-spam  | 2024/02/14 08:06:57.660 [INFO]  add aproved user: id:370236309, name:"irda_noire"
tg-spam  | 2024/02/14 08:06:57.663 [INFO]  user "irda_noire" (370236309) added to approved users
tg-spam  | 2024/02/14 08:06:57.667 [WARN]  failed to send message as markdown, Bad Request: can't parse entities: Can't find end of the entity starting at byte offset 603
tg-spam  | 2024/02/14 08:06:57.778 [INFO]  user unbanned, chatID: -1002096077129, userID: 370236309:66269, orig: "permanently banned {370236309 irda_noire Мария Свиткова}\n\nне переживай) из 5 северных сияний, которые были у нас в городе, я проспала 2. И еще 2 были видны из любого села по соседству, но не у нас, потому что над нашим стояли очередные облака\n\n**spam detection results**\n- stopword: ham, not found\n- emoji: ham, 0/2\n- similarity: ham, 0.04/0.50\n- classifier: spam, probability of spam: 61.25%\n- cas: ham, record not found"

Lines 4 and 7 (the two [WARN] entries) are of particular interest: it looks like the bot tries to send a message and can't. Should it perhaps retry sending the message?

Did you actually miss those messages? Currently, every message that we fail to send as Markdown is reformatted to plain text; see events/send
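To illustrate the fallback described above, here is a minimal sketch of the "try Markdown, then plain text" pattern. It is not the actual code from events/send: the `sender` interface, `sendWithFallback`, and `fakeSender` are hypothetical names introduced only for this example, and the real implementation may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// sender is a hypothetical stand-in for a Telegram client (not a tg-spam type).
type sender interface {
	Send(text string, markdown bool) error
}

// sendWithFallback tries to send text as Markdown first. If that fails,
// it logs a warning and resends the same text as plain text.
func sendWithFallback(s sender, text string) error {
	err := s.Send(text, true)
	if err == nil {
		return nil
	}
	fmt.Printf("[WARN] failed to send message as markdown, %v\n", err)
	if err := s.Send(text, false); err != nil {
		return fmt.Errorf("can't send message as plain text: %w", err)
	}
	return nil
}

// fakeSender rejects every Markdown message and records plain-text deliveries,
// imitating a server that can't parse the Markdown entities.
type fakeSender struct{ delivered []string }

func (f *fakeSender) Send(text string, markdown bool) error {
	if markdown {
		return errors.New("Bad Request: can't parse entities")
	}
	f.delivered = append(f.delivered, text)
	return nil
}

func main() {
	f := &fakeSender{}
	err := sendWithFallback(f, "permanently banned irda_noire")
	fmt.Println(err, f.delivered)
}
```

With this pattern, a [WARN] line in the log does not by itself mean the message was lost; only an error from the second, plain-text attempt would.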

The bottom line: the warning above indicates a failure in Markdown preparation, not an inability to send the final message. The reason is that Markdown handling for the Telegram subset is tricky. For example, an innocent-looking user name containing _ will break it. We try to clean such input, but in some cases we miss it.
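As a rough illustration of the underscore problem: in Telegram's legacy Markdown mode, `_` opens an italic entity, so an unpaired one in a name like irda_noire produces the "Can't find end of the entity" error seen in the log. The sketch below backslash-escapes the four characters that open an entity in that mode. `escapeLegacyMarkdown` is a hypothetical helper, not tg-spam's actual cleaning code, and it assumes the text is plain content that should contain no intentional formatting.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeLegacyMarkdown backslash-escapes the characters that start an entity
// in Telegram's legacy Markdown mode: '_', '*', '`' and '['.
// Hypothetical helper for illustration only.
func escapeLegacyMarkdown(s string) string {
	r := strings.NewReplacer("_", `\_`, "*", `\*`, "`", "\\`", "[", `\[`)
	return r.Replace(s)
}

func main() {
	fmt.Println(escapeLegacyMarkdown("irda_noire")) // prints irda\_noire
}
```

Escaping has to be applied only to user-supplied fragments (names, quoted message text) before they are placed into the template. Applying it to the whole message would also escape intended formatting such as the bold heading.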

I'm not sure what the intended effect is. If the messages were supposed to appear in the chat, they didn't.

Hmm, this is odd. What version are you using? The current :master and :latest should both include the fallback logic.

It happened on latest, but I can't reproduce it quickly; it has happened to me only once.

Closing this issue as there is not much that can be done at the moment. If you encounter this again, please feel free to reopen it.