salesforce / CodeGen

CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

Proper way to prompt for code generation

tlkh opened this issue

Hello, thanks for your work.

Is there a proper way to prompt for code generation?

For example, to generate code to answer: "Create a function called num_in_str() to check whether a string contains a number."

Currently, when I pass that into the context (for the 350M and 2B models), the output is only # and it stops there.

Thank you!

For Python, wrapping the prompt in triple quotes, as a docstring-style comment, may help. For example, with the 350M mono model,

''' Create a function called num_in_str() to check whether a string contains a number. '''

yields

''' Create a function called num_in_str() to check whether a string contains a number. '''

def num_in_str(str):
    ''' Check whether a string contains a number. '''
    return str.isdigit()

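Note that this sample does not quite do what was asked: isdigit() returns True only if every character in the string is a digit. The larger models may sample code that matches your intent more closely.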
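The triple-quote wrapping described above can be done with a small helper. This is only a minimal sketch: the function name `wrap_as_docstring_prompt` is hypothetical, and the trailing blank line is an assumption about what encourages the model to start the code on a fresh line. Passing the resulting string to the model is left out, since that depends on your own sampling setup.

```python
def wrap_as_docstring_prompt(instruction: str) -> str:
    """Wrap a natural-language instruction in triple single quotes.

    Hypothetical helper, not part of CodeGen itself.
    """
    # Collapse runs of whitespace so the instruction sits on a single line,
    # matching the one-line prompt shown in the thread.
    text = " ".join(instruction.split())
    # Assumption: ending with a blank line nudges the model to emit code next.
    return f"''' {text} '''\n\n"


prompt = wrap_as_docstring_prompt(
    "Create a function called num_in_str() to check whether a string contains a number."
)
print(prompt)
```

The printed string is what would go into the model's context in place of the bare sentence.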

Otherwise, provide a few (natural language, code) pairs as a prefix to the context to convey the format.
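One way to assemble such a prefix is sketched below. The helper name `build_few_shot_prompt`, the two example pairs, and the exact spacing between blocks are all assumptions made for illustration; the thread does not prescribe a specific layout.

```python
def build_few_shot_prompt(examples, instruction):
    """Build a context from (description, code) pairs plus a final instruction.

    Hypothetical helper: each description is wrapped in triple quotes and
    followed by its code, so the model can infer the expected format.
    """
    parts = []
    for description, code in examples:
        parts.append(f"''' {description} '''\n\n{code.strip()}\n")
    # The final instruction has no code after it; the model is asked to fill it in.
    parts.append(f"''' {instruction} '''\n\n")
    return "\n".join(parts)


# Illustrative pairs only; any short, correct examples would do.
examples = [
    ("Create a function called add() to add two numbers.",
     "def add(a, b):\n    return a + b"),
    ("Create a function called is_even() to check whether a number is even.",
     "def is_even(n):\n    return n % 2 == 0"),
]

few_shot = build_few_shot_prompt(
    examples,
    "Create a function called num_in_str() to check whether a string contains a number.",
)
print(few_shot)
```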

Let me know if this helps.

Thanks! That works better.

How would this work for a multi-turn conversation?
Could you provide an example input that shows a multi-turn prompt?

Thank you in advance.

For a multi-turn conversation, one approach is to concatenate the history of (prompt, code) pairs.

For example, sample code3 conditioned on "prompt1 \n code1 \n prompt2 \n code2 \n prompt3 \n".
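The concatenation of history described above might look like the following sketch. The function name `build_multi_turn_context` and the placeholder strings are hypothetical, and the sampling step itself is omitted. After each turn, the newly sampled code would be appended to the history together with its prompt.

```python
def build_multi_turn_context(history, next_prompt):
    """Join earlier (prompt, code) pairs and the new prompt with newlines.

    Hypothetical helper: the result ends with a newline so the model
    continues with the code for `next_prompt`.
    """
    pieces = []
    for prompt, code in history:
        pieces.extend([prompt, code])
    pieces.append(next_prompt)
    return "\n".join(pieces) + "\n"


# Placeholder strings standing in for real prompts and sampled code.
history = [("prompt1", "code1"), ("prompt2", "code2")]
context = build_multi_turn_context(history, "prompt3")
print(repr(context))

# After sampling from `context`, record the turn so the next call sees it.
sampled_code = "code3"  # stand-in for the model's output
history.append(("prompt3", sampled_code))
```

Note that the context grows with every turn, so long conversations may eventually exceed the model's context window.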