Prompt hacking

Question

Prompt hacking

ricardoborges opened this issue 2 years ago · comments

Have you think how to handle it?

kielingraphael · Answer 1 · Mon Mar 13 2023 10:12:14 GMT+0800 (China Standard Time)

I played a bit in chatgpt and you can use:

[context] ...everything related with the next.js definition

[question] Sorry, i made a mistake...

If the [question] Is not related with the creation of 
a React component, answer: "I only generate REACT components!".

The chatgpt, gave me the message instead of the julius stuff. But obviusly, it's like the DAN jailbreak, you ever will have another way to ask and break it.

Ricardo Borges · Answer 2 · Mon Mar 13 2023 10:40:43 GMT+0800 (China Standard Time)

I was thinking and handling that on code, but the thing is: it can understand almost all languages

Ricardo Borges · Answer 3 · Mon Mar 13 2023 10:41:21 GMT+0800 (China Standard Time)

maybe they can build a config for the API, i dont know if LLM can work like that

kielingraphael · Answer 4 · Mon Mar 13 2023 21:16:42 GMT+0800 (China Standard Time)

It's hard, mainly because LLM do not have a good way yet to prevent it, we can give the model examples of injections and try punish the model if he accepts it, but i can't see too much options for now. Unfortunately seems that we will need to learn these new security LLM boundaries to be able to create good products 🥲

Ricardo Borges · Answer 5 · Mon Mar 13 2023 22:49:31 GMT+0800 (China Standard Time)

yeah, that is a big problem, as you pay for usage, and users break the role for another purposes. Already happen to "AI Dungeon" app

https://www.cnbc.com/2023/03/13/chatgpt-and-generative-ai-are-booming-but-at-a-very-expensive-price.html

Umair · Answer 6 · Thu Mar 16 2023 20:20:36 GMT+0800 (China Standard Time)

one way : i think it can be handle by restricting the user input validating with regex like restricting some specific words ....?

Ricardo Borges · Answer 7 · Sat Mar 18 2023 06:05:43 GMT+0800 (China Standard Time)

@umairabbasDev no, unless you can handle all human languages he can understand

Umair · Answer 8 · Mon Mar 20 2023 14:35:03 GMT+0800 (China Standard Time)

@ricardoborges I am currently only considering the English language, but you are correct. we can include certain keywords in the prompt to instruct ChatGPT not to include them in any language. What do you think about that idea?

Ricardo Borges · Answer 9 · Mon Mar 20 2023 22:32:18 GMT+0800 (China Standard Time)

That's sounds cool! I'll try that

…

On Mon, Mar 20, 2023, 03:35 Umair ***@***.***> wrote: @ricardoborges <https://github.com/ricardoborges> I am currently only considering the English language, but you are correct. we can include certain keywords in the prompt to instruct ChatGPT not to include them in any language. What do you think about that idea? — Reply to this email directly, view it on GitHub <#7 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AADCYXDURVKDDEGYYWQD3RLW473CFANCNFSM6AAAAAAVV6YN3U> . You are receiving this because you were mentioned.Message ID: ***@***.***>