MarshalX / atproto

The AT Protocol (🦋 Bluesky) SDK for Python 🐍

Home Page: https://atproto.blue

Misdetection of URLs / links in the "auto_hyperlinks" example

fxcoudert opened this issue

The regexp in extract_url_byte_positions at https://github.com/MarshalX/atproto/blob/main/examples/advanced_usage/auto_hyperlinks.py does not appear to detect all valid URLs. Take for example:

https://www.cell.com/matter/fulltext/S2590-2385(23)00409-5?rss=yes

This is misdetected: the match stops just before the opening parenthesis, so everything from ( onward is dropped.

Could you pls fix?

Not really. I've added \(\) to the allowed characters in my own use case, but I'm pretty sure the regexp is not conformant and will fail to catch other valid URLs. Probably better to use something designed and tested by someone else.
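
For anyone hitting the same problem, here is a minimal sketch of the workaround described above. It is not the regexp from the example and not a conformant URL parser. The names find_url_byte_spans, _URL_RE and _TRAILING are hypothetical. It assumes that matching any run of non-whitespace bytes after http:// or https:// and then trimming trailing punctuation is good enough for your text, and that you want byte offsets into the encoded string, as the name extract_url_byte_positions suggests.

```python
import re
from typing import List, Tuple

# Hypothetical, deliberately permissive pattern (NOT the one from the example):
# a scheme followed by any run of bytes that are not whitespace, <, > or ".
_URL_RE = re.compile(rb'https?://[^\s<>"]+')

# ASCII punctuation that usually belongs to the sentence, not the URL.
_TRAILING = b".,;:!?'"


def find_url_byte_spans(text: str, encoding: str = 'utf-8') -> List[Tuple[str, int, int]]:
    """Return (url, byte_start, byte_end) for each URL-like match in text."""
    data = text.encode(encoding)
    spans = []
    for match in _URL_RE.finditer(data):
        start, end = match.span()
        # Trim trailing punctuation, and a closing parenthesis only when it
        # has no matching opening parenthesis inside the candidate URL.
        while end > start:
            last = data[end - 1:end]
            candidate = data[start:end]
            if last in _TRAILING:
                end -= 1
            elif last == b')' and candidate.count(b')') > candidate.count(b'('):
                end -= 1
            else:
                break
        spans.append((data[start:end].decode(encoding), start, end))
    return spans
```

With this sketch the URL from this issue is kept whole, parentheses included, while a closing parenthesis that merely wraps the URL in a sentence is dropped. It will still accept strings that are not valid URLs, so a tested URL-extraction library remains the safer choice.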