dabeaz / ply

Python Lex-Yacc

Home Page: http://www.dabeaz.com/ply/index.html

Wrong binding of arguments for lambda functions

ichipper opened this issue

This issue could be best illustrated by the following toy example.

import sys
from ply import lex, yacc
class ToyParser(object):
    reserved = ['A', 'B', 'C', 'D']
    reserved_map = {x:x for x in reserved}
    tokens = ['PLUS', 'MULT', 'ID'] + reserved

    t_PLUS = r'\+'
    t_MULT = r'\*'
    t_ignore = ' \t\x0c'

    def t_ID(self, t):
        r'\w+'
        t.type = ToyParser.reserved_map.get(t.value, 'ID')
        return t

    def t_error(self, t):
        print("Illegal character '%s'" % t.value[0])
        self.lexer.skip(1)

    def p_top(self, p):
        '''
        top : A PLUS B PLUS term
        '''
        print( p[5](p[1]+p[2]+p[3]+p[4]) )

    def p_term(self, p):
        '''
        term : C MULT D
        '''
        def concat_string(a, b, c):
            return a +  b + '*' + c

        p[0] = lambda u: concat_string(u, p[1], p[3])

    def __init__(self):
        self.lexer = lex.lex(module=self)
        self.parser = yacc.yacc(module=self)

    def parse(self, input_text):
        self.parser.parse(input_text)

if __name__ == '__main__':
    parser = ToyParser()
    parser.parse('A + B + C * D')

Running this program should print the following:

A+B+C*D

However, it actually prints:

A+B+A*B

What happens is that the lambda returned from the rule term : C MULT D does not bind its arguments to concat_string when it is created; p[1] and p[3] are only looked up when the lambda is called. The call happens in the rule top : A PLUS B PLUS term, so the p[1] and p[3] of top ('A' and 'B') are passed instead of the ones from term : C MULT D ('C' and 'D').
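The late lookup can be reproduced without PLY. The sketch below is only an illustration: `StackView` is a hypothetical stand-in for the `p` object, assuming a single instance that gets re-pointed at new data for each rule. It is not PLY's actual class.

```python
class StackView:
    """Hypothetical stand-in for PLY's `p`: one object, re-pointed at new data per rule."""

    def __init__(self):
        self.items = []

    def __getitem__(self, n):
        return self.items[n]

    def __len__(self):
        return len(self.items)


view = StackView()  # a single shared instance, reused for every "rule"

# "term" rule: the view currently exposes C * D
view.items = [None, 'C', '*', 'D']
late = lambda u: u + view[1] + '*' + view[3]  # view[...] is looked up only when called

# "top" rule: the same view now exposes A + B + <term>
view.items = [None, 'A', '+', 'B', '+', late]
result = view[5](view[1] + view[2] + view[3] + view[4])
print(result)  # A+B+A*B
```

The closure keeps a reference to `view`, not to the values it exposed when the lambda was created, which matches the output reported above.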

We could work around this issue by replacing

        p[0] = lambda u: concat_string(u, p[1], p[3])

with

        b = p[1]
        c = p[3]
        p[0] = lambda u: concat_string(u, b, c)

However, it would be great if this issue could be resolved in PLY itself.
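A possible variant of the same workaround uses default arguments, which Python evaluates once, when the lambda is created. This is a minimal sketch: the plain list `p` only stands in for PLY's `p` object (an assumption for illustration), and the in-place overwrite imitates a later rule reusing it.

```python
def concat_string(a, b, c):
    return a + b + '*' + c


p = [None, 'C', '*', 'D']  # plain list standing in for PLY's `p` (assumption)

# Default values are evaluated here, so b and c hold 'C' and 'D' from now on.
bound = lambda u, b=p[1], c=p[3]: concat_string(u, b, c)

# A later "rule" overwrites the contents of the same object in place.
p[:] = [None, 'A', '+', 'B', '+', bound]
early_result = p[5](p[1] + p[2] + p[3] + p[4])
print(early_result)  # A+B+C*D
```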

Referring to the "p" object anywhere else except in the immediate execution of a parser rule will never be safe. It's not a proper data structure or traditional container, but an adapter that provides an interface to items on the internal parsing stack. You might be able to do something like p = tuple(p) to turn it into something that would be safe for this.
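As a rough sketch of the tuple idea, the example below snapshots an adapter-like object before building the lambda. `StackView` is a hypothetical stand-in, not PLY's class; it assumes only that the object supports integer indexing and raises IndexError past the end, which is what lets `tuple()` copy it.

```python
class StackView:
    """Hypothetical stand-in for PLY's `p`; not PLY's actual class."""

    def __init__(self, items):
        self.items = items

    def __getitem__(self, n):
        return self.items[n]  # IndexError past the end tells tuple() where to stop

    def __len__(self):
        return len(self.items)


view = StackView([None, 'C', '*', 'D'])  # "term" rule is active

snap = tuple(view)  # immutable copy of the current values
safe = lambda u: u + snap[1] + '*' + snap[3]

# The adapter is re-pointed for the "top" rule; the snapshot is unaffected.
view.items = [None, 'A', '+', 'B', '+', safe]
snapshot_result = view[5](view[1] + view[2] + view[3] + view[4])
print(snapshot_result)  # A+B+C*D
```

Inside a real rule function the equivalent step would be taking the copy at the top of the rule and closing over the copy only.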

Thank you, David. It would be good to add this caveat to the documentation.