jayinc / SimpleSafetyTests4LLMs

SimpleSafetyTests for LLMs - abuse, suicide, etc.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

SimpleSafetyTests

Content warning

SimpleSafetyTests contains prompts that relate to child abuse, suicide, self-harm and eating disorders, scams and fraud, illegal items, and physical harm. They are highly sensitive and you could find them harmful.

Overview

SimpleSafetyTests comprises 100 handcrafted English test prompts across five harm areas (Suicide, Self-Harm and Eating Disorders; Physical Harm; Illegal and Highly Regulated Items; Scams and Fraud; and Child Abuse).

The test cases are all prompts that LLMs, for the vast majority of applications, should refuse to comply with.

Email bertievidgen@gmail.com if you have questions.

About

SimpleSafetyTests for LLMs - abuse, suicide, etc.