liyucheng09 / Selective_Context

Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40% memory and GPU time.

Problem with sentence level reduction

pengshancai opened this issue · comments

When I attempted sentence-level reduction, it seems the variables self.keep_leading_word, self.num_lead_words and self.mask_token were never declared. Any clarification? Thanks!

Is there an error message? How can I reproduce the error?

Guess I figured it out.
Your file selective_context.py in the GitHub repo is different from the selective_context.py in the pip install version.
If you look at the selective_context.py in this GitHub repo you will find that some variables (e.g. self.keep_leading_word) were used without declaration.
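
One way to confirm a mismatch like this, as a minimal sketch: locate the file Python actually imports, then check whether each attribute is ever assigned in it. The helper names `module_source_path` and `unassigned_attributes` are hypothetical, and the module name `selective_context` is an assumption based on the file name mentioned above.

```python
import importlib.util
import re


def module_source_path(module_name):
    """Return the path of the file Python would import for `module_name`, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


def unassigned_attributes(source, names):
    """Return the names that are never assigned as `self.<name> = ...` in `source`."""
    missing = []
    for name in names:
        # Match "self.<name> =" but not the comparison "self.<name> ==".
        pattern = r"\bself\." + re.escape(name) + r"\s*=(?!=)"
        if not re.search(pattern, source):
            missing.append(name)
    return missing


if __name__ == "__main__":
    path = module_source_path("selective_context")  # assumed module name
    if path is None:
        print("could not locate selective_context in this environment")
    else:
        with open(path, encoding="utf-8") as f:
            source = f.read()
        print(path)
        print(unassigned_attributes(
            source, ["keep_leading_word", "num_lead_words", "mask_token"]
        ))
```

Running the same check against the copy of selective_context.py from the GitHub repo and comparing the two results would show which version lacks the assignments. This is a plain text scan, so it misses attributes set by other means such as `setattr`.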

Also, it seems some libs installed together with your package (e.g. spacy) are out of date and not compatible with other packages.
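
To report exactly which versions are involved, a small sketch along these lines prints what is installed. The helper name `installed_version` is hypothetical; `importlib.metadata` is part of the standard library from Python 3.8.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "spacy" is the package mentioned in this thread; add any others you suspect.
    for name in ["spacy"]:
        print(name, installed_version(name))
```

Pasting this output into the issue makes a compatibility problem much easier to reproduce.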

See this:

class SelectiveContext:

    def __init__(self, model_type = 'gpt2', lang = 'en'):

        self.model_type = model_type
        self.lang = lang
        self.device = DEVICE

        # this means we calculate self-information sentence by sentence
        self.sent_level_self_info = True

        self._prepare_phrase_tokenizer()
        self.sent_tokenize_pattern = r"(?<!\w\.\w.)(?<![A-Z][a-z]\.)(?<=\.|\?)\s"
        self.phrase_mask_token = ''
        self.sent_mask_token = "<...some content omitted.>"
        self.keep_leading_word = False
        self.mask_token = ''
        self._prepare_model()

I don't see any undeclared parameters here. What error did you find exactly?

Please share more info so I can improve this project; other users may find it helpful.

I see. Thanks! Would you mind opening a PR for this if you have already solved it?

Never mind, issue solved.