siefkenj / unified-latex

Utilities for parsing and manipulating LaTeX ASTs with the Unified.js framework

erroneous `openMark`/`closeMark` on argument parse

retorquere opened this issue

with this:

import { unified } from 'unified'
import { unifiedLatexFromString } from '@unified-latex/unified-latex-util-parse'

const parser = unified().use(unifiedLatexFromString, {
  macros: {
    test: { signature: 'm' },
  },
})

console.log(JSON.stringify(parser.parse('\\test m\\test{m}'), null, 2))

both arguments get openMark={ and closeMark=}.

This is expected behavior. Mandatory arguments are surrounded by {...}. unified-latex parses to an abstract syntax tree rather than a concrete syntax tree. Information about whether the user included curly braces around the input or not is lost.

Mandatory arguments are surrounded by {...}

That's not true though? The braceless form works and renders to the same in LaTeX.

In bibtex there can be a difference in sentence casing depending on whether content is in braces under some conditions. I'll investigate to be sure that it also applies to macro arguments. I understand the point about CST vs AST, but if the casing difference applies, those two forms have different meaning in bibtex, not just different expressions of the same meaning.

That's not true though? The braceless form works and renders to the same in LaTeX.

Yes. In latex the commands \foo x and \foo{x} are treated the same, and unified-latex doesn't distinguish. Check bibtex. If it does something different, that's very strange. You can look at the xparse documentation for other types of argument signatures that you can try.

It isn't strange to anyone using bibtex. It is a necessary part for this tool chain. I appreciate you have a different perspective, but it's not the whole truth about the latex ecosystem, of which bibtex is a material part. I know these behave the same in the document body, but bibtex has additional behavior you wouldn't encounter writing the body.

I think I'm sensing some frustration on your side; I'm not telling you how to run your project, I'm just telling you that there are parts of the latex ecosystem that behave in ways you're not familiar with.

If I'm crossing a line here I'd like to know; I will need help to grok unified-latex, and if you find the way bibtex uses latex off-putting, that's going to be a strained interaction, and I don't think we should do that.

So long story short, I will likely need this distinction (I haven't completed my tests to see whether I do), and if that's not in the cards for unified, better to know it now rather than later, so I can refocus.

Specifically, this

Yes. In latex the commands \foo x and \foo{x}

may not be true for bibtex. In bibtex,

{\textbf{x}}

and

\textbf{x}

are not the same.

You'll notice that the first parses to group(macro(x)) while the second parses to macro(x).
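To make the two shapes concrete, here is a small sketch over hand-written trees. The node layout (`type`, `content`, `args`, `openMark`, `closeMark`) is an assumption about what the parser emits, and `shape` is a hypothetical helper, not part of unified-latex.

```javascript
// Hypothetical helper: summarise a node as a nested string such as "group(macro(x))".
function shape(node) {
  switch (node.type) {
    case "string":
      return node.content;
    case "group":
      return `group(${node.content.map(shape).join(" ")})`;
    case "macro":
      // Flatten the macro's arguments; the brace marks are deliberately ignored.
      return `macro(${(node.args ?? [])
        .flatMap((arg) => arg.content)
        .map(shape)
        .join(" ")})`;
    default:
      return node.type;
  }
}

// Hand-written stand-in for \textbf{x} (assumed node layout).
const textbf = {
  type: "macro",
  content: "textbf",
  args: [
    {
      type: "argument",
      content: [{ type: "string", content: "x" }],
      openMark: "{",
      closeMark: "}",
    },
  ],
};

// Hand-written stand-in for {\textbf{x}}: the same macro wrapped in a group.
const braced = { type: "group", content: [textbf] };

console.log(shape(braced)); // group(macro(x))
console.log(shape(textbf)); // macro(x)
```

The outer braces survive as a `group` node, so that distinction is still available to code that walks the tree.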

I am getting a bit frustrated because it seems like many of the questions you ask could be answered by reading the comments in the source code or examining the options available to each function. There is now full source documentation up at https://siefkenj.github.io/unified-latex. You can see that MacroInfo accepts an argumentParser, which can be used to do custom parsing if you need to.
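As a rough sketch of what such a custom parser might look like, the function below consumes one mandatory argument and records whether it was written with braces. It assumes the `argumentParser` contract is `(nodes, startPos) => ({ args, nodesRemoved })`, with the parser expected to remove the nodes it consumes from `nodes`; check that against the generated docs before relying on it. `bracedAwareArg` is a made-up name, using empty `openMark`/`closeMark` to signal a braceless argument is a convention chosen only for this sketch, and the node objects in the usage lines are hand-written stand-ins.

```javascript
// Hypothetical argumentParser: consume one mandatory argument, remembering
// whether the source wrote it as {x} (a group node) or as a bare token.
// Assumed contract: mutate `nodes` and return { args, nodesRemoved }.
function bracedAwareArg(nodes, startPos) {
  let pos = startPos;
  // Skip whitespace between the macro and its argument, as LaTeX does.
  while (pos < nodes.length && nodes[pos].type === "whitespace") pos++;
  if (pos >= nodes.length) return { args: [], nodesRemoved: 0 };

  const node = nodes[pos];
  const braced = node.type === "group";
  const arg = {
    type: "argument",
    content: braced ? node.content : [node],
    // Convention for this sketch only: empty marks mean "no braces in the source".
    openMark: braced ? "{" : "",
    closeMark: braced ? "}" : "",
  };

  // Remove the skipped whitespace and the argument node itself.
  const nodesRemoved = pos - startPos + 1;
  nodes.splice(startPos, nodesRemoved);
  return { args: [arg], nodesRemoved };
}

// Stand-ins for the nodes that follow \test in "\test m" and "\test{m}".
const bare = [{ type: "whitespace" }, { type: "string", content: "m" }];
const grouped = [{ type: "group", content: [{ type: "string", content: "m" }] }];

console.log(JSON.stringify(bracedAwareArg(bare, 0).args[0].openMark)); // ""
console.log(JSON.stringify(bracedAwareArg(grouped, 0).args[0].openMark)); // "{"
```

If the assumed contract holds, a function like this would presumably be supplied as `argumentParser` in the macro's entry under `macros`. How the printer treats empty marks is untested here.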

I can sense that frustration, but I find unified-latex pretty overwhelming. Even knowing what parts to look for is a hurdle.

What I meant to say was that \textbf{x} itself means something different based on whether it is in braces, and that it affects its neighbours in a group under some circumstances. But I've just tested a simpler sample -- \textbf{C} and \textbf C do not mean the same thing in bibtex. This is not "strange".

I'll try the rest by navigating the sources, but I find it equally frustrating having to walk on eggshells trying to eke out information. A lot I have already discovered on my own, but that is going to be invisible to you. But in the case you mention -- I don't know how I would have stumbled upon the fact that MacroInfo can do these kinds of things, because the samples (which were all the documentation there was until I generated typedoc) only show one particular use.