Repositories under the llm-safety topic:
An attack to induce hallucinations in LLMs.
Papers about red teaming LLMs and multimodal models.
Restore safety in fine-tuned language models through task arithmetic
Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming"
A comprehensive LLM testing suite covering safety, performance, bias, and compliance, with methodologies and tools to improve the reliability and ethical integrity of models like OpenAI's GPT series in real-world applications.
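One of the repositories above uses task arithmetic to restore safety in fine-tuned models. As a rough illustration of the general idea (a minimal sketch with toy float parameters, not the method of any specific repository): a "task vector" is the parameter-wise difference between a fine-tuned model and its base model, and subtracting a vector associated with safety-eroding fine-tuning steers the weights back toward the base model's alignment.

```python
# Minimal sketch of task arithmetic (hypothetical parameter names and values).
# Real implementations operate on full model state dicts of tensors.

def task_vector(finetuned, base):
    """Parameter-wise difference: finetuned - base."""
    return {name: finetuned[name] - base[name] for name in base}

def apply_task_vector(params, vector, scale=1.0):
    """Add a scaled task vector to a set of parameters.

    scale > 0 reinforces the task; scale < 0 negates it.
    """
    return {name: params[name] + scale * vector[name] for name in params}

# Toy one-parameter example with plain floats:
base      = {"layer.weight": 1.0}
unsafe_ft = {"layer.weight": 1.8}  # hypothetical fine-tune that eroded safety

v = task_vector(unsafe_ft, base)
restored = apply_task_vector(unsafe_ft, v, scale=-1.0)  # subtract the vector
```

Here `restored` recovers the base weights exactly; in practice the scale is tuned so that safety returns while useful fine-tuned behavior is preserved.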