Some lexer errors are printed to stderr
quasilyte opened this issue
If the underlying lexer encounters an input error, it falls back to its default error-handling function, defaultErrorf, which prints the error to stderr:
func (l *Lexer) defaultErrorf(pos token.Pos, msg string) {
	l.Error(fmt.Sprintf("%v: %v", l.File.Position(pos), msg))
}

// Error Implements yyLexer[2] by printing the msg to stderr.
func (l *Lexer) Error(msg string) {
	fmt.Fprintf(os.Stderr, "%s\n", msg)
}
On a large codebase, this sometimes produces messages like the following on stderr:
unicode (UTF-8) BOM in middle of file
There is no way to control this output, and that is the problem.
I'm proposing a change (example below) that registers our own error-handling function, which appends each lex error to the high-level lexer's Errors list. This way, errors can propagate and be handled without polluting stderr.
diff --git a/scanner/lexer.go b/scanner/lexer.go
index 52d47c7..d54b8bc 100644
--- a/scanner/lexer.go
+++ b/scanner/lexer.go
@@ -4,6 +4,7 @@ package scanner
 import (
 	"bufio"
 	"bytes"
+	"go/token"
 	t "go/token"
 	"io"
 	"unicode"
@@ -62,23 +63,32 @@ func Rune2Class(r rune) int {
 	return classOther
 }
+func (l *Lexer) lexErrorFunc(p token.Pos, msg string) {
+	pos := position.NewPosition(
+		l.File.Line(p),
+		l.File.Line(p),
+		int(p),
+		int(p),
+	)
+	l.Errors = append(l.Errors, errors.NewError(msg, pos))
+}
+
 // NewLexer the Lexer constructor
 func NewLexer(src io.Reader, fName string) *Lexer {
+	lexer := &Lexer{
+		StateStack:    []int{0},
+		tokenBytesBuf: &bytes.Buffer{},
+		TokenPool:     &TokenPool{},
+	}
+
 	file := t.NewFileSet().AddFile(fName, -1, 1<<31-3)
-	lx, err := lex.New(file, bufio.NewReader(src), lex.RuneClass(Rune2Class))
+	lx, err := lex.New(file, bufio.NewReader(src), lex.RuneClass(Rune2Class), lex.ErrorFunc(lexer.lexErrorFunc))
 	if err != nil {
 		panic(err)
 	}
+	lexer.Lexer = lx
-	return &Lexer{
-		Lexer:         lx,
-		StateStack:    []int{0},
-		PhpDocComment: "",
-		FreeFloating:  nil,
-		heredocLabel:  "",
-		tokenBytesBuf: &bytes.Buffer{},
-		TokenPool:     &TokenPool{},
-	}
+	return lexer
 }
 func (l *Lexer) Error(msg string) {
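For anyone who wants to try the general idea in isolation, here is a minimal sketch of the same "collect instead of print" pattern. Note that it does not use the lex package from this issue: it uses the standard library's go/scanner as a stand-in, because that package also accepts an error callback (scanner.ErrorHandler). The function name collectScanErrors and the sample input are made up for illustration.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// collectScanErrors scans src and returns every error the scanner reports.
// It is a hypothetical helper: errors are stored in a slice by the callback
// rather than being written to stderr.
func collectScanErrors(filename string, src []byte) []string {
	var errs []string

	fset := token.NewFileSet()
	file := fset.AddFile(filename, fset.Base(), len(src))

	// The handler closes over errs, so the caller decides what to do with
	// the errors after scanning finishes.
	handler := func(pos token.Position, msg string) {
		errs = append(errs, fmt.Sprintf("%v: %v", pos, msg))
	}

	var s scanner.Scanner
	s.Init(file, src, handler, 0)
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
	}
	return errs
}

func main() {
	// A byte order mark (U+FEFF) that is not at the very start of the
	// input is reported as an error by go/scanner.
	src := []byte("package p\n\ufeffvar x = 1\n")
	for _, e := range collectScanErrors("example.go", src) {
		fmt.Println(e)
	}
}
```

The diff above follows the same shape: the callback is a method on the high-level lexer, so the errors end up in l.Errors and the caller can report or ignore them as it sees fit.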
You're right, lexer errors should be saved into lexer.Errors.
Thank you for the proposal; I have used it as is.
I have also covered this case with a test.