dedent=True doesn't work in python 3.12
mmoskal opened this issue · comments
The bug
The dedend=True
argument to @guidance
decorator seems to work in 3.10 and 3.11 but not in 3.12.
To Reproduce
The test below passes in 3.11 but not 3.12 on macOS x86.
import guidance
@guidance(stateless=True, dedent=True)
def character_maker2(lm):
lm += f"""\
{{
"name": "{guidance.gen('name', stop='"', max_tokens=1)}",
}}"""
return lm
def test():
lm = guidance.models._mock.Mock()
lm += character_maker2()
assert str(lm).startswith("{")
test()
System info (please complete the following information):
- OS (e.g. Ubuntu, Windows 11, Mac OS, etc.): macOS x86
- Guidance Version (
guidance.__version__
): 0.1.15
I've noticed the dedent kwarg but didn't previously know what it was doing... Essentially we are introspecting on the source code and rewriting it as if the user had written textwrap.dedent
on their block strings?
Do you know if this behavior is documented anywhere? Is it important to maintain? @Harsha-Nori any opinions/thoughts?
Python 3.12 had major changes to f-string behavior, so it doesn't surprise me that dedent is acting up now. Thanks for catching this!
@hudson-ai I don't know that we ever documented it, but the idea is to follow the same semantics as textwrap.dedent (ignore leading spaces) so that you can do natural in-line f-strings without having to undo all the indentation if you're in an already nested python block. If we don't have this, the f-string syntax for writing guidance programs gets quite ugly since every line after the first needs to be on the 0-indent level, eg:
def prog(lm):
if blah:
lm += f"""first line
second line
third line"""
return lm
Understood. That is indeed nice to have...
Just messing around a little with this (on Windows.... same behaviour), and the call to dedent happens here:
Line 95 in 4f7bf9e
The result on Python 3.11 is
@guidance(stateless=True, dedent=True)
def character_maker2(lm):
lm += f"""\
{{
"name": "{guidance.gen('name', stop='"', max_tokens=1)}",
}}"""
return lm
and on Python 3.12
@guidance(stateless=True, dedent=True)
def character_maker2(lm):
lm += f"""\
{{
"name": "{guidance.gen('name', stop='"', max_tokens=1)}",
}}"""
return lm
which looks rather similar.
And continued prodding, this is definitely to do with something changing in how f-strings are handled, rather than dedent()
per se.
In Python 3.12, the next portion of strip_multiline_string_indents()
builds an AST. Judicious application of print(ast.dump(old_ast, indent=4))
yields:
Module(
body=[
FunctionDef(
name='character_maker2',
args=arguments(
posonlyargs=[],
args=[
arg(arg='lm')],
kwonlyargs=[],
kw_defaults=[],
defaults=[]),
body=[
AugAssign(
target=Name(id='lm', ctx=Store()),
op=Add(),
value=JoinedStr(
values=[
Constant(value=' {\n "name": "'),
FormattedValue(
value=Call(
func=Attribute(
value=Name(id='guidance', ctx=Load()),
attr='gen',
ctx=Load()),
args=[
Constant(value='name')],
keywords=[
keyword(
arg='stop',
value=Constant(value='"')),
keyword(
arg='max_tokens',
value=Constant(value=1))]),
conversion=-1),
Constant(value='",\n }')])),
Return(
value=Name(id='lm', ctx=Load()))],
decorator_list=[],
type_params=[])],
type_ignores=[])
Note in particular the line Constant(value=' {\n "name": "')
Shifting back to Python 3.11, this line becomes Constant(value='{\n "name": "')
, which lacks the leading spaces.