Use pydantic-yaml to save on token costs?
lingster opened this issue
To save on costs, could we implement a version of this that makes the LLM respond in YAML? This may also give faster response times, since fewer tokens would need to be generated and returned.
Perhaps this could be added as a flag?
The trick would be to ensure we can correctly validate that the returned YAML matches the pydantic model.
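A minimal sketch of that validation step, assuming PyYAML and pydantic are installed. `User` and `parse_yaml_response` are hypothetical names made up for illustration, not part of this library or of pydantic-yaml:

```python
import yaml  # PyYAML
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    # Hypothetical model, used only for illustration.
    name: str
    age: int


def parse_yaml_response(text, model):
    """Parse YAML text and validate the result against a pydantic model."""
    # safe_load avoids constructing arbitrary Python objects from untrusted LLM output.
    data = yaml.safe_load(text)
    if not isinstance(data, dict):
        raise ValueError("expected a YAML mapping at the top level")
    # Constructing the model runs pydantic validation on every field.
    return model(**data)


# A well-formed response validates into a model instance.
user = parse_yaml_response("name: Ada\nage: 36\n", User)

# A response with the wrong field type is rejected by pydantic.
try:
    parse_yaml_response("name: Ada\nage: not-a-number\n", User)
    bad_rejected = False
except ValidationError:
    bad_rejected = True
```

The idea is that YAML only changes the wire format: once the text is parsed into a plain dict, the same pydantic validation used for JSON responses should apply unchanged.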
This would work if we used mode=MD_YAML. Happy to take a PR on this and add it to the benchmarks.