ianarawjo / ChainForge

An open-source visual programming environment for battle-testing prompts to LLMs.

Home Page:https://chainforge.ai/docs


What is the expected behaviour when providing multiple inputs to a prompt node?

benwhalley opened this issue · comments

What is the expected behaviour with a flow like this?

Screenshot 2024-05-09 at 13 06 12

The default seems to be to multiply the inputs, so that all combinations of lang and description are used in the final prompt.

I suppose it might be useful, but I can't actually imagine a scenario when I would want to do this. Is it possible to have a setting for prompts with multiple inputs to either align or multiply the inputs?
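To make the two behaviours concrete, here is a minimal Python sketch of "multiply" versus "align". It is not ChainForge code: the template, the variable names `lang` and `description`, and the sample values are all hypothetical stand-ins for whatever feeds the final prompt node.

```python
from itertools import product

# Hypothetical values arriving on the two inputs of the final prompt node.
langs = ["pomme", "chien"]
descriptions = ["a sweet fruit", "a domesticated animal"]
template = "Word: {lang}. Meaning: {description}"

# "Multiply": every lang is combined with every description (cross product).
multiplied = [
    template.format(lang=l, description=d)
    for l, d in product(langs, descriptions)
]

# "Align": the i-th lang is paired only with the i-th description.
aligned = [
    template.format(lang=l, description=d)
    for l, d in zip(langs, descriptions)
]

print(len(multiplied), len(aligned))
```

With two values on each input, multiplying gives four prompts while aligning gives two. Note that aligning by position only makes sense if both lists are known to be in the same order, which is the tracking problem raised below.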

The 'join node' feature has a nod to this, allowing for joining of inputs within or across the LLMs used as input, but this would still multiply within each LLM, I think. It also doesn't allow templating the multiple inputs into a final query to summarise.

I can see that there is a need to track the 'unit' that we are working with here, but perhaps the units could be derived from the name or ID of an earlier node? In the example above, if the textfield node has the id "words" then we could have an option to "align inputs by <words>" in the final prompt node.

Aligning is what Tabular Data node outputs do; see: https://chainforge.ai/docs/prompt_templates/#associated-variables-carry-together

As far as composing alignments from LLM outputs... you're right, it does not support that directly. The issue is that, for your example, it seems "obvious" what to do. However, it is not at all obvious. Yes, {word} is the shared template variable, and it will be retained as history in the second prompt node. However, there's no way in general of knowing how to associate the outputs of your top and bottom prompt node! These could be two totally unrelated things. If you want to join, you can join within variable "word" in a Join Node. That is the best you can do. Otherwise, you should export the outputs and import to Tabular Data as a cleaned up spreadsheet, if you want them to always be aligned.
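As an illustration only, grouping outputs by a shared variable such as "word" might look like the sketch below. The record shape (`vars`, `text`), the function `group_by_var`, and the sample data are assumptions made for this example, not ChainForge's internal format or the Join Node's implementation.

```python
from collections import defaultdict

# Hypothetical response records from two upstream prompt nodes.
# Each record remembers the template variables that produced it.
translations = [
    {"vars": {"word": "apple"}, "text": "pomme"},
    {"vars": {"word": "dog"}, "text": "chien"},
]
descriptions = [
    {"vars": {"word": "dog"}, "text": "a domesticated animal"},
    {"vars": {"word": "apple"}, "text": "a sweet fruit"},
]

def group_by_var(var, **named_outputs):
    """Collect the outputs of each named node under the value of `var`."""
    groups = defaultdict(lambda: defaultdict(list))
    for name, outputs in named_outputs.items():
        for out in outputs:
            groups[out["vars"][var]][name].append(out["text"])
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(cols) for key, cols in groups.items()}

grouped = group_by_var("word", lang=translations, description=descriptions)
print(grouped["apple"])
```

If a node returns more than one response per word, the lists hold several entries and there is no single obvious pairing between them, which is the ambiguity described above.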

Actually, this (multiplying inputs) is extremely useful for combinatorially exploding the search space. We discuss this in the ChainForge paper. It is primarily useful for auditing-type tasks.

OK, I should have read the paper first! I can see that multiplying inputs x sets of prompts could be useful as you have in fig. 5 for exploring combinations of inputs and instructions, and also crossing with different models. But I do think for some use cases this is less helpful, and there might still be nice ways of allowing for both behaviours.

If you want to join, you can join within variable "word" in a Join Node. That is the best you can do. Otherwise, you should export the outputs and import to Tabular Data as a cleaned up spreadsheet, if you want them to always be aligned.

For a given set of P prompts, it's possible to keep track of the variables which generated the combinations.
For example, in Fig. 5 of the paper you have two nodes which are generating combinations: the textfields (A) and the prompt column in the tabular data node (B). So the output of the prompt node has A*B rows.
This recreates that situation:

Screenshot 2024-05-16 at 12 57 16

In the exported data, there is already a "Prompt" and "Response" column (and "LLM" too):

Screenshot 2024-05-16 at 13 01 19

So would it be possible to just push the export data into a tabular data node where the columns are set by the groupings that created the prompt? Exporting and reimporting files is an option but it's clunky and means the thing has to be run manually. Allowing combinations like this would allow much more flexibility.
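A rough Python sketch of the idea, under stated assumptions: each generated prompt keeps a record of the variable values that produced it, so the result is already a table whose columns are the groupings. The variable names `input` and `instruction`, the sample values, and the stub `fake_llm` are hypothetical; no real model is called and this is not the actual export format.

```python
from itertools import product

# A: hypothetical textfield values. B: hypothetical prompt-column values.
inputs = ["the cat sat", "hello world"]
instructions = ["Translate to French: {input}", "Summarise: {input}"]

def fake_llm(prompt):
    """Stand-in for a model call so the sketch runs offline."""
    return f"<response to: {prompt}>"

rows = []
for instruction, text in product(instructions, inputs):
    prompt = instruction.format(input=text)
    rows.append({
        "input": text,               # grouping column from A
        "instruction": instruction,  # grouping column from B
        "Prompt": prompt,
        "Response": fake_llm(prompt),
    })

columns = list(rows[0].keys())
print(columns, len(rows))
```

Here there are 2 x 2 = 4 rows, and because every row carries its `input` and `instruction` values, a downstream node could align on either column without a manual export and re-import.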

Yes, that's right. You can do "aligned" inputs by using the output of Tabular Data nodes. And sure, there could be an "input" to Tabular Data that is just the Excel sheet that would be exported from the Prompt Node. Please open an Issue for that in particular, and, if you are feeling up to it, you might attempt to implement it.