MarvNC / JP-Resources

My contributions to the Japanese learning community.

Feature Request!

milifilou opened this issue · comments

First off, thank you for making the backfilling application. I have been struggling for a while with a MorphMan setup that would not apply any kind of frequency ordering, so switching to a new system will be helpful.

One difficulty I ran into when backfilling my collection was that my deck has furigana in its only expression field.
I was eventually able to hack together a solution that works for me, since I only had to remove anything between [these types of brackets].
If you are interested in making a minor update to this program, could I ask that you make furigana-ignoring backfill an official feature?
My implementation was:

import re

rx_HTML = re.compile(r"<.*?>")
rx_Furigana = re.compile(r"\[.*?\]")

def normalize_expr(expression: str) -> str:
    # removes bracketed furigana, HTML tags, and surrounding whitespace
    result = rx_Furigana.sub('', expression)
    result = rx_HTML.sub('', result).strip()
    return result

Of course, something like this should probably go behind a proper command-line flag instead of being applied automatically. I would make a pull request myself, except that I have never written a program with a command-line interface in Python before. There is also the issue of making this furigana support work across multiple types of formatting, as I remember seeing some decks that use other pairs of brackets for this.
Either way, many thanks!
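
For illustration, here is a minimal sketch of how the stripping could be gated behind a command-line flag using the standard-library argparse module. The flag name --strip-furigana, the build_parser helper, and the strip_furigana parameter are all hypothetical and do not come from the existing script; the script's real argument handling may look quite different.

```python
import argparse
import re

RX_HTML = re.compile(r"<[^>]*>")
RX_FURIGANA = re.compile(r"\[[^\]]*\]")


def normalize_expr(expression: str, strip_furigana: bool = False) -> str:
    """Remove HTML tags and surrounding whitespace.

    When strip_furigana is True, bracketed readings such as [た] are
    removed first. The parameter name is hypothetical.
    """
    result = expression
    if strip_furigana:
        result = RX_FURIGANA.sub("", result)
    return RX_HTML.sub("", result).strip()


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the furigana flag is shown here.
    parser = argparse.ArgumentParser(description="Backfill sketch")
    parser.add_argument(
        "--strip-furigana",
        action="store_true",
        help="ignore [bracketed] furigana when matching expressions",
    )
    return parser


# Parsing an explicit argument list keeps the example self-contained.
args = build_parser().parse_args(["--strip-furigana"])
cleaned = normalize_expr("<b>食[た]べる</b>", strip_furigana=args.strip_furigana)
```

With store_true the flag defaults to False, so decks without furigana would be processed exactly as before unless the option is passed.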

commented

I believe the Anki format for inserting furigana is standardized, so different formats shouldn't be an issue. The performance impact would have to be weighed, but if it's unlikely to interfere with regular use, I think this could potentially be added without a flag. I'll think about it, thanks for sharing!
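
As a rough sketch of what handling that format might involve: the code below assumes the common Anki convention in which a reading follows its base text in square brackets and a single space may precede the base text as a separator (for example "私[わたし]は 学生[がくせい]です"). The function name strip_furigana and both pattern names are made up for this example, and the patterns are an approximation rather than Anki's own implementation.

```python
import re

# Assumed convention: optional separator space, base text, then [reading].
# Bracketed tags starting with "sound:" are left alone.
RX_FURIGANA = re.compile(r" ?([^ >\[\]]+?)\[(?!sound:)[^\]]*\]")
RX_HTML = re.compile(r"<[^>]*>")


def strip_furigana(expression: str) -> str:
    """Return the expression with HTML tags and bracketed readings removed."""
    without_html = RX_HTML.sub("", expression)
    # Keep only the base text (group 1); the separator space is dropped too.
    return RX_FURIGANA.sub(r"\1", without_html).strip()


plain = strip_furigana("私[わたし]は 学生[がくせい]です")
```

Both patterns are compiled once at import time, so the per-note cost is two substitutions over a short string.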

commented

I think this should be fine to add without a feature flag. Feel free to make a PR!