openai / openai-python

The official Python library for the OpenAI API

Home Page: https://pypi.org/project/openai/

[FEATURE REQUEST] Bad words list

Peter-Devine opened this issue

I don't have access to the GPT-3 API yet (a guy can dream, eh?), but I have been reading through the docs, and it seems like the completion module would be perfect for my use case except that it lacks a "bad words list" feature.

This feature would prevent certain words from being generated in the completion output. I am aware of the logit_bias argument, but it can only stop individual tokens from being generated, regardless of context.
My idea would take an arbitrary string (or a list of token IDs) as input and then block the final token of that string whenever the tokens generated so far match the rest of it, so the full string can never be completed.
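To make the idea concrete, here is a minimal sketch of the check that would run at each decoding step. It is not how OpenAI or Hugging Face implement anything; the function name `banned_next_tokens` and the toy token IDs are made up for illustration. Given the token IDs generated so far and a list of banned token-ID sequences, it returns the tokens that must not be emitted next.

```python
from typing import Sequence, Set


def banned_next_tokens(generated: Sequence[int],
                       banned_sequences: Sequence[Sequence[int]]) -> Set[int]:
    """Return token IDs that would complete a banned sequence if emitted next.

    Hypothetical helper for illustration only, not part of any library.
    """
    banned: Set[int] = set()
    for seq in banned_sequences:
        if not seq:
            continue  # ignore empty entries
        prefix, last = list(seq[:-1]), seq[-1]
        # A one-token sequence is always banned. A longer sequence is banned
        # only when the tokens generated so far end with its prefix.
        if not prefix or list(generated[-len(prefix):]) == prefix:
            banned.add(last)
    return banned


# Toy example: the two-token sequence [7, 9] is banned.
print(banned_next_tokens([5, 7], [[7, 9]]))  # token 9 is blocked after 7
print(banned_next_tokens([5, 8], [[7, 9]]))  # nothing is blocked after 8
```

A sampler would then set the logits of the returned tokens to negative infinity before choosing the next token. Unlike a flat logit_bias, token 9 stays available everywhere except directly after token 7.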

I successfully requested this feature for the Hugging Face .generate API many moons ago. Please see my feature request there for a fuller run-down of how it could be implemented (link: huggingface/transformers#3061).

It would be a useful feature for customers because it could give them peace of mind that the models they are serving are not going to output any unsavoury language. An alternative would be to train the model not to output generally bad language (e.g. overly aggressive or xenophobic language) through thoughtful use of training data, but since everyone's definition of bad language is different, it would be nice to be able to customise the model accordingly.

Thanks!

Bump. Another way to think about it is conditional token biasing; e.g. sometimes you want to disallow "on", but only after a specific word. There doesn't seem to be a way to do this currently in the OpenAI or Goose AI APIs, but Hugging Face transformers offers it via bad_words_ids.
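As a rough sketch of what "conditional token biasing" could mean in practice: the rule format and the function `apply_conditional_bias` below are assumptions for illustration, not an existing API. Each rule maps a context suffix (a tuple of token IDs) to per-token biases, and a bias is added to the logits only when the context ends with that suffix. A bias of -100 mirrors the value that logit_bias uses to effectively ban a token.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical rule format: context suffix -> {token_id: bias}
ConditionalBias = Dict[Tuple[int, ...], Dict[int, float]]


def apply_conditional_bias(logits: List[float],
                           context: Sequence[int],
                           rules: ConditionalBias) -> List[float]:
    """Return a copy of `logits` with biases applied for matching suffixes."""
    biased = list(logits)
    for suffix, token_biases in rules.items():
        n = len(suffix)
        # An empty suffix acts as an unconditional bias (plain logit_bias).
        if n == 0 or tuple(context[-n:]) == suffix:
            for token_id, bias in token_biases.items():
                biased[token_id] += bias
    return biased


# Toy example: discourage token 2 only when the previous token is 4.
rules: ConditionalBias = {(4,): {2: -100.0}}
print(apply_conditional_bias([0.0, 1.0, 2.0], [1, 4], rules))  # token 2 biased
print(apply_conditional_bias([0.0, 1.0, 2.0], [4, 1], rules))  # unchanged
```

Because this has to run between decoding steps, it would need support on the API side (or one request per token from the client), which is why it cannot be emulated with a single logit_bias map.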

Thanks for the suggestion!

This sounds like a feature request for the underlying OpenAI API and not the Python library, so I'm going to go ahead and close this issue.

Would you mind reposting at community.openai.com?