Toxicity Scanner to return the type of content
RQledotai opened this issue
When using the input or output Toxicity scanner, it would be preferable to return the type of label detected (e.g. `sexual_explicit`) instead of the offensive content itself. That would enable applications to communicate the specific issue to their users.
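For context, a minimal sketch of how the scanner is invoked today, assuming the documented `llm_guard` API where `scan()` returns a `(sanitized_prompt, is_valid, risk_score)` tuple; the matched label is not part of the return value:

```python
from llm_guard.input_scanners import Toxicity

scanner = Toxicity()

prompt = "..."  # untrusted user input
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

if not is_valid:
    # Today the only programmatic signals are the boolean and the score;
    # the specific toxicity label (e.g. sexual_explicit) appears only in logs.
    print(f"Blocked (risk score: {risk_score})")
```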
Hey @RQledotai, thanks for reaching out. Apologies for the delay.

I agree, and such a refactor is in the works to return an object with more context about the reason behind blocking. Currently, the only way to monitor this is through the logs.
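One possible shape for such a result object, purely as a hypothetical sketch (`ScanResult` and its fields are illustrative, not part of the current API):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ScanResult:
    """Hypothetical richer return value for scanners (not the current API)."""
    sanitized_prompt: str
    is_valid: bool
    risk_score: float
    label: Optional[str] = None  # e.g. "sexual_explicit" when blocked

def describe_block(result: ScanResult) -> str:
    # With a label attached, an application can tell the user *why*
    # the content was blocked instead of echoing the offensive text.
    if result.is_valid:
        return "OK"
    return f"Blocked: content flagged as {result.label or 'unknown'}"
```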