Feature Request!
milifilou opened this issue · comments
First off, thank you for making this backfilling application. I have been struggling for a while with a MorphMan setup that just would not apply any type of frequency ordering, so switching to a new system will be helpful.
One difficulty I ran into when trying to backfill my collection was that my deck has furigana in its only expression field.
I was eventually able to hack together a solution that works for me, since I only had to remove anything that lies between [these types of brackets].
If you are interested in doing a minor update to this program, could I ask that you make furigana-ignoring backfill an official feature?
My implementation was:
import re

rx_HTML = re.compile(r"<.*?>")
rx_Furigana = re.compile(r"\[.*?\]")

def normalize_expr(expression: str) -> str:
    # removes bracketed furigana, HTML tags, and surrounding whitespace
    result = rx_Furigana.sub('', expression).strip()
    result = rx_HTML.sub('', result).strip()
    return result
But of course something like this should probably go behind a proper command-line flag instead of being applied automatically. I would make a pull request with this myself, except I have never made a program with a command-line interface in Python before. There is also the issue of making this furigana support work across multiple types of formatting, as I remember seeing some decks that use other pairs of brackets for this instead.
Either way, many thanks!
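For anyone who wants to try the flag idea, here is a rough sketch using `argparse` from the standard library. The flag name `--strip-furigana`, the `BRACKET_PATTERNS` table, and the `build_parser` helper are all hypothetical and not part of this project; only the square-bracket pattern comes from the post above, and the other bracket styles are guesses at what other decks might use.

```python
import argparse
import re

RX_HTML = re.compile(r"<.*?>")

# Hypothetical table of bracket styles. Only "square" is taken from the
# original post; "round" and "fullwidth" are illustrative assumptions.
BRACKET_PATTERNS = {
    "square": re.compile(r"\[.*?\]"),
    "round": re.compile(r"\(.*?\)"),
    "fullwidth": re.compile(r"（.*?）"),
}


def normalize_expr(expression: str, furigana_rx=None) -> str:
    """Strip HTML tags and, if a pattern is given, bracketed readings."""
    if furigana_rx is not None:
        expression = furigana_rx.sub("", expression)
    return RX_HTML.sub("", expression).strip()


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical opt-in furigana flag."""
    parser = argparse.ArgumentParser(description="Backfill sketch")
    parser.add_argument(
        "--strip-furigana",
        choices=sorted(BRACKET_PATTERNS),
        default=None,
        help="remove readings written in the given bracket style",
    )
    return parser


def pattern_from_args(argv=None):
    """Return the compiled pattern selected on the command line, or None."""
    args = build_parser().parse_args(argv)
    if args.strip_furigana is None:
        return None
    return BRACKET_PATTERNS[args.strip_furigana]
```

With this shape, running without the flag leaves expressions untouched apart from HTML removal, and `pattern_from_args(["--strip-furigana", "square"])` yields the pattern to pass into `normalize_expr`.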
I believe the Anki format for inserting furigana is standardized, so different formats shouldn't be an issue. Of course, the performance impact would have to be weighed, but if it's unlikely to interfere with regular use, I think this could potentially be added without a flag. I'll think about it, thanks for sharing!
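If the Anki-style notation is the only one that needs support, a slightly stricter pattern may be worth considering. As far as I understand that notation, readings are written as ` 漢字[かんじ]`, with a space marking where the base text starts, so removing only the brackets would leave stray spaces behind. The sketch below is an assumption modelled on that convention, not code from this project; `strip_readings` is a hypothetical name, and it deliberately leaves `[sound:...]` tags alone.

```python
import re

# Assumed pattern: an optional separator space, the base text, then [reading].
RX_ANKI_FURIGANA = re.compile(r" ?([^ >]+?)\[(.+?)\]")


def strip_readings(text: str) -> str:
    """Drop bracketed readings and their separator space, keeping base text."""

    def keep_base(match: re.Match) -> str:
        # Media references such as [sound:file.mp3] are not readings.
        if match.group(2).startswith("sound:"):
            return match.group(0)
        return match.group(1)

    return RX_ANKI_FURIGANA.sub(keep_base, text)


print(strip_readings("私[わたし]は 学生[がくせい]です"))
```

Since this is a single regex substitution per expression, the cost should be small next to the rest of a backfill run, though that would need measuring on a real collection.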
I think this should be fine to add without a feature flag. Feel free to make a PR!