protectai / llm-guard

The Security Toolkit for LLM Interactions

Home Page: https://llm-guard.com/


Toxicity Scanner to return the type of content

RQledotai opened this issue

When using the input or output toxicity scanner, it would be preferable to return the type of label (e.g. sexual_explicit) instead of the offensive content. This would enable applications to communicate the issue to the user.
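
For context, a minimal sketch of current usage, assuming the documented scanner interface where `scan()` returns a `(sanitized_prompt, is_valid, risk_score)` tuple; the `threshold` parameter and the exact return shape are assumptions based on the library's docs at the time:

```python
# Sketch: current Toxicity input scanner usage (assumed interface).
from llm_guard.input_scanners import Toxicity

scanner = Toxicity(threshold=0.5)  # assumed parameter name

prompt = "some user input to check"
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

if not is_valid:
    # Only a boolean and a score are available here. The specific toxicity
    # label that triggered the block (e.g. "sexual_explicit") is not returned,
    # so the application cannot tell the user why the input was rejected.
    print(f"Prompt blocked, risk score: {risk_score}")
```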

Hey @RQledotai, thanks for reaching out. Apologies for the delay.

I agree, and such a refactoring is in the works to return an object with more context about the reason behind the blocking. Currently, the only way to monitor this is through the logs.
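
Purely as an illustration of what such a result object could look like, here is a hypothetical sketch; the names (`ScanResult`, `label`, `scanner`) are assumptions for discussion, not llm-guard's actual or planned API:

```python
# Hypothetical result object carrying the reason behind a block.
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanResult:
    is_valid: bool
    risk_score: float
    label: Optional[str] = None    # e.g. "sexual_explicit" for the Toxicity scanner
    scanner: Optional[str] = None  # which scanner produced the verdict


def user_message(result: ScanResult) -> str:
    """Build a user-facing message without echoing the offensive content."""
    if result.is_valid:
        return "ok"
    return f"Blocked by {result.scanner}: {result.label} (score {result.risk_score:.2f})"


print(user_message(ScanResult(False, 0.92, label="sexual_explicit", scanner="Toxicity")))
```

With something like this, applications could map labels to their own messaging instead of relying on log scraping.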