Herb-AI / HerbSearch.jl

Search procedures and synthesizers for Herb.jl

Home Page: https://herb-ai.github.io/

Remarks and future considerations for FrAngel, fragments, and angelic conditions

To5BG opened this issue

As per Sebastijan's request, I am recording here the implementation remarks and future considerations that came out of implementing FrAngel and of separating fragments and angelic conditions into standalone functionality.

Fragments

  • As of now, implementing the fragments functionality requires a lot of utilities for updating grammars in specific ways. All of these could move to HerbGrammar if they are better suited there.

  • One of these methods adds rules to the grammar in bulk; initially it did not preserve RHS uniqueness. Duplicate right-hand sides would only be needed for fragments consisting of a single node, for example when the grammar has the rule Num = 5 and we want to add Fragment_Num = 5. So far it has been agreed that fragments of size 1 are simply not considered, so that RHS uniqueness is kept.

  • One may consider extending the regular grammar struct to optionally contain a second version of the grammar. This is useful if you often need to keep track of both the full grammar with fragments and the same grammar without them. I envision this being needed in many use cases. For FrAngel specifically, a base grammar with no fragments is needed for sampling subprogram replacements during modification. Maybe something along the lines of:

using HerbCore: AbstractGrammar, RuleNode

mutable struct FragmentGrammar
    fragment_grammar::AbstractGrammar
    fragment_base_rules_offset::Int16
    fragment_rules_offset::Int16
    fragments::Vector{RuleNode}   # fragments stored as program trees
end
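The bulk rule addition with RHS uniqueness described above can be sketched on a toy grammar. This is only an illustration: ToyGrammar and add_rules_unique! are hypothetical names and do not correspond to the HerbGrammar types or to the utilities actually written for fragments. The sketch shows why a size-1 fragment such as Fragment_Num = 5 is dropped when Num = 5 already exists.

```julia
# Toy stand-in for a grammar: parallel vectors of rule types (LHS) and
# right-hand sides. This is NOT the HerbGrammar representation.
struct ToyGrammar
    types::Vector{Symbol}
    rules::Vector{Any}
end

# Hypothetical helper: add many (lhs => rhs) pairs at once, skipping any pair
# whose RHS already occurs in the grammar. Returns the indices of the rules
# that were actually added.
function add_rules_unique!(g::ToyGrammar, new_rules::AbstractVector{<:Pair})
    seen = Set{Any}(g.rules)
    added = Int[]
    for (lhs, rhs) in new_rules
        rhs in seen && continue        # keep right-hand sides unique
        push!(g.types, lhs)
        push!(g.rules, rhs)
        push!(seen, rhs)
        push!(added, length(g.rules))
    end
    return added
end

g = ToyGrammar([:Num, :Num], Any[5, :(Num + Num)])

# The size-1 fragment (RHS 5) collides with Num = 5 and is skipped;
# the larger fragment is added as rule 3.
added = add_rules_unique!(g, [:Fragment_Num => 5, :Fragment_Num => :(Num + 5)])
```

Skipping silently is one possible policy; a real utility might instead return the index of the existing rule so that callers can reuse it.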

Angelic conditions

  • Creation of angelic conditions lives here (in HerbSearch), and the execution of angelic conditions lives in HerbInterpret.

  • Two data structures are used. One is a trie, needed for prefix checking on code paths during angelic execution. The other is a key-only hashtable (i.e. a set), which keeps track of the visited program space. The hashtable will no longer be needed for sampling once RandomStateIterator is finalized. Unfortunately, we still have to keep track of the visited space in resolve_angelic! (or at least we should, for performance's sake), where we generate angelic replacements and attempt each one: revisiting candidates there is redundant work. Creating a new random iterator on each function call just to ensure uniqueness does not seem ideal either.

  • Alternatives to how angelic conditions are represented: currently, we have a rule node in the grammar to signify what is an angelic condition. Two alternatives, each with its own merits, are to (1) make a special node (e.g. AngelicNode), or (2) add a flag to the RuleNode struct to mark angelic conditions.

  • Previously, when we were checking for duplicates with LongHashMap, we wanted to be able to distinguish between angelic conditions and regular rule nodes. Specifically, angelic placeholders should not be part of a tree's hash to determine uniqueness, unless the grammar has not been changed. This was achieved by replacing the angelic conditions with holes (with domains) and swapping the angelic rule node back in right before evaluation. Since different approaches will be used to keep track of the program space, this workaround may become unnecessary.
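As an illustration of the prefix checking mentioned above, here is a minimal trie sketch. It assumes that a code path is a sequence of true/false outcomes chosen for angelic conditions during one execution; PathTrie, insert_path! and has_stored_prefix are hypothetical names, not the ones used in the implementation.

```julia
# Binary trie over Bool sequences. Assumption: a code path is the sequence of
# true/false outcomes chosen for angelic conditions during one execution.
mutable struct PathTrie
    children::Dict{Bool,PathTrie}
    terminal::Bool                  # true if a stored path ends at this node
    PathTrie() = new(Dict{Bool,PathTrie}(), false)
end

# Store a code path, creating child nodes as needed.
function insert_path!(root::PathTrie, path::AbstractVector{Bool})
    node = root
    for choice in path
        node = get!(PathTrie, node.children, choice)
    end
    node.terminal = true
    return root
end

# True if some stored path is a prefix of `path` (or equal to it).
function has_stored_prefix(root::PathTrie, path::AbstractVector{Bool})
    node = root
    node.terminal && return true
    for choice in path
        child = get(node.children, choice, nothing)
        child === nothing && return false
        node = child
        node.terminal && return true
    end
    return false
end

paths = PathTrie()
insert_path!(paths, [true, false])

extends_stored = has_stored_prefix(paths, [true, false, true])  # extends a stored path
diverges       = has_stored_prefix(paths, [true, true])         # diverges at step 2
too_short      = has_stored_prefix(paths, [true])               # shorter than any stored path
```

The lookup walks at most length(path) nodes, independent of how many paths are stored. The key-only hashtable for the visited program space could, under the same assumptions, simply be a Set of tree hashes.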

FrAngel-specific

  • Currently, RandomStateIterator is not yet in the shape needed for FrAngel. For starters, it needs to take probabilities into account and check for duplicates. FrAngel comes with a simple implementation of a random sampler, which can be removed or replaced later down the line.

  • The current function for finding the size of a tree ignores holes. With the current implementation of angelic conditions, this means it ignores angelic placeholders as well. It is worth checking empirically whether this affects FrAngel's performance or correctness.
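To make the size remark concrete, here is a small sketch comparing a size function that ignores holes with one that counts them. ToyNode, ToyRule and ToyHole are hypothetical stand-ins for illustration only; they are not the HerbCore node types, and the real size function may differ in detail.

```julia
# Toy tree types for illustration; not the HerbCore definitions.
abstract type ToyNode end

struct ToyRule <: ToyNode
    ind::Int                        # index of the grammar rule
    children::Vector{ToyNode}
end
ToyRule(ind::Int) = ToyRule(ind, ToyNode[])   # convenience constructor for leaves

struct ToyHole <: ToyNode end

# Size that ignores holes, mirroring the behaviour described above.
size_ignoring_holes(::ToyHole) = 0
size_ignoring_holes(n::ToyRule) = 1 + sum(size_ignoring_holes, n.children; init=0)

# Alternative: count every hole (e.g. an angelic placeholder) as one node.
size_counting_holes(::ToyHole) = 1
size_counting_holes(n::ToyRule) = 1 + sum(size_counting_holes, n.children; init=0)

# Root with three children: a leaf rule, a hole, and a rule with one hole child.
tree = ToyRule(1, ToyNode[ToyRule(2), ToyHole(), ToyRule(3, ToyNode[ToyHole()])])

ignored = size_ignoring_holes(tree)   # 3: only the rule nodes are counted
counted = size_counting_holes(tree)   # 5: the two holes are counted as well
```

If angelic placeholders are represented as holes at the time sizes are computed, any size-based limit would see the smaller of the two numbers.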

Fantastic, thanks for writing this up!