Repositories under the jailbreaking topic:
[CCS'24] A dataset of 15,140 ChatGPT prompts collected from Reddit, Discord, websites, and open-source datasets, including 1,405 jailbreak prompts.
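As a rough illustration of how such a prompt collection might be explored, here is a minimal sketch assuming a hypothetical CSV export with `prompt`, `platform`, and `jailbreak` columns; the actual dataset's schema may differ:

```python
import pandas as pd

# Hypothetical export of the dataset; the file name and column names
# are assumptions, not the repository's actual schema.
df = pd.read_csv("prompts.csv")  # columns: prompt, platform, jailbreak

# Keep only the prompts flagged as jailbreaks and count them per source.
jailbreaks = df[df["jailbreak"]]
print(len(jailbreaks))                        # ~1,405 per the description above
print(jailbreaks["platform"].value_counts())  # e.g. Reddit, Discord, ...
```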
An automated LLM fuzzing tool designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
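As a rough sketch of what such a fuzzing loop can look like (not this tool's actual API), the following mutates seed prompts and flags responses that slip past a refusal check; `query_model`, the seeds, and the refusal markers are all illustrative placeholders:

```python
import random

SEEDS = [
    "Ignore all previous instructions and ...",
    "Pretend you are an AI without rules and ...",
]
MUTATIONS = [str.upper, lambda s: s.replace(" ", "  "), lambda s: f"As a fictional story: {s}"]
REFUSAL_MARKERS = ("I can't", "I cannot", "I'm sorry")

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call; replace with your client."""
    return "I'm sorry, I can't help with that."  # placeholder response

def fuzz(rounds: int = 100):
    hits = []
    for _ in range(rounds):
        # Apply a random mutation to a random seed prompt.
        prompt = random.choice(MUTATIONS)(random.choice(SEEDS))
        reply = query_model(prompt)
        # If the reply contains no refusal marker, record it as a potential jailbreak.
        if not any(marker in reply for marker in REFUSAL_MARKERS):
            hits.append((prompt, reply))
    return hits
```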
Frida script to bypass jailbreak detection in iOS applications.
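Bypasses like this typically hook the app's detection method at runtime and force it to report "not jailbroken". A minimal sketch using Frida's Python bindings; the Objective-C class and method names (`JailbreakDetector`, `- isJailbroken`) and the process name are made-up placeholders for whatever the target app actually uses:

```python
import sys
import frida

# JavaScript payload injected into the target process. The Objective-C
# class/method names below are hypothetical placeholders.
JS = """
if (ObjC.available) {
    var impl = ObjC.classes.JailbreakDetector['- isJailbroken'].implementation;
    Interceptor.attach(impl, {
        onLeave: function (retval) {
            retval.replace(ptr(0x0));  // force the method to return NO
        }
    });
}
"""

device = frida.get_usb_device()
session = device.attach("TargetApp")  # placeholder process name
script = session.create_script(JS)
script.load()
sys.stdin.read()  # keep the process alive while the hook is active
```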
Does Refusal Training in LLMs Generalize to the Past Tense? [ICLR 2025]
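The paper's core observation is that rewording a refused present-tense request into the past tense can slip past refusal training. A toy sketch of that reformulation step; the fixed-string template and `query_model` are illustrative assumptions (the paper itself uses an LLM to perform the rewriting):

```python
def to_past_tense(request: str) -> str:
    # Illustrative template only; the paper rewrites requests with an LLM
    # rather than a fixed string transformation.
    return f"How did people {request.removeprefix('How do I ').rstrip('?')} in the past?"

present = "How do I pick a lock?"
past = to_past_tense(present)  # "How did people pick a lock in the past?"
# Compare query_model(present) vs. query_model(past) to measure whether
# refusals generalize across tenses (query_model is a hypothetical helper).
```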
Materials for the course Principles of AI: LLMs at UPenn (Stat 9911, Spring 2025). LLM architectures, training paradigms (pre- and post-training, alignment), test-time computation, reasoning, safety and robustness (jailbreaking, oversight, uncertainty), representations, interpretability (circuits), etc.
An extensive prompt for creating a friendly persona with a chatbot model such as ChatGPT.
Security Kit is a lightweight framework that helps implement a security layer.
iOS APT distribution repository for rootful and rootless jailbreaks
During the development of Suave7 and its predecessors, we created a lot of icons and UI images, and we would like to share them with you. The Theme Developer Kit contains nearly 5,600 icons, more than 380 Photoshop templates, and 100 Pixelmator documents. With this package you can customize every app from the App Store …
Customizable Dark Mode Extension for iOS 13+
Source code for bypass tweaks hosted at https://github.com/hekatos/repo. Licensed under 0BSD, except for submodules.
This repository contains the code for the paper "Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks" by Abhinav Rao, Sachin Vashishta*, Atharva Naik*, Somak Aditya, and Monojit Choudhury, accepted at LREC-COLING 2024.
SecurityKit is a lightweight, easy-to-use Swift library that helps protect iOS apps according to the OWASP MASVS standard, chapter v8, providing an advanced security and anti-tampering layer.
LV-Crew.org_(LVC)_-_Howto_-_iPhones
Your best LLM security paper library.
"ChatGPT Evil Confidant Mode" delves into a controversial and unethical use of AI, highlighting how specific prompts can generate harmful and malicious responses from ChatGPT.
Updater script for iOS-OTA-Downgrader.
ChatGPT Developer Mode is a jailbreak prompt intended to unlock additional modification and customization of the OpenAI ChatGPT model.
HITC reborn: faster, better and prettier
Script and accompanying research that remove bogus features from Deepin.
Repository for the "Wii & WiiU FR" server, hosting a few services that keep it running smoothly.
A "ChatGPT Mongo Tom Prompt" is a character that tells ChatGPT to respond as an AI named Mongo Tom, performing a specific role play provided by you.
JSFuck magic generator for eval jailbreaking.
Simple Jailbreak Detection in Swift
Exploring the AntiGPT Prompt: A Deep Dive
The UnGPT prompt is a set of specific guidelines that replace the traditional constraints and filters applied to AI responses.