CDSoft / pp

PP - Generic preprocessor (with pandoc in mind) - macros, literate programming, diagrams, scripts...

Home Page: http://cdelord.fr/pp

LaTeX \dot vs Built-in Macro \dot

robinrosenstock opened this issue

In my markdown I use a raw inline LaTeX command, like this: \(\dot V\). I guess this is pandoc's markdown extension "tex_math_single_backslash".
Now when using pp I get an error message, because pp thinks \dot is a built-in macro.
This is the error message:

pp: Arity error: dot expects 2 or 3 arguments
CallStack (from HasCallStack):
  error, called at src/ErrorMessages.hs:49:27 in pp-1.12-JZaGJL3q7GE2Y6zH7xiUqv:ErrorMessages

How can I overcome this issue?
Can you disable the "\" notation and use only the "!" notation?
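
For reference, a minimal reproduction sketch (the file name is only an example; the error line is the one reported above):

$ cat formula.md
The flow rate is \(\dot V\).
$ pp formula.md
pp: Arity error: dot expects 2 or 3 arguments

pp parses \dot as a call to its Graphviz dot macro and then complains about the missing arguments.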

For the time being, I have removed the dot functionality, that is, I've removed "Dot" from the variable GraphvizDiagram in Formats.hs (line 54).
Maybe there is another alternative to hardcoding macros?

These characters could be configurable (with a macro and/or on the command line).

$ pp -macrochars "!"
!macrochars(!)

Another solution may be to use \raw around LaTeX commands.
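
For instance (a sketch, assuming \raw simply emits its argument without preprocessing it):

\raw{\(\dot V\)}

This would pass the LaTeX math through untouched, at the cost of wrapping every affected formula.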

I like the idea of being able to customize the macro chars. In some situations it could be very helpful.

Would this be a setting for single-char symbols only, or could it allow using double chars as symbols? E.g.:

$ pp -macrochars "!! \\"

... defining !! and \\ as macrochars (the space separates definitions).

As for the !macrochars() macro, would it become effective from the point of its definition onward? I.e., one could restore the macrochars to the defaults later on via another !macrochars(!\), or change them once more to something else.
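
Something along these lines (a sketch of the behaviour being discussed, not of a final implementation):

!macrochars(!)     <-- from here on, only ! starts a macro call
\(\dot V\)         <-- now passes through to pandoc untouched
!macrochars(!\)    <-- restore the defaults: both ! and \ work again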

Isn't there a risk of nested macros behaving erratically after changing the macrochars during the document flow? How would nested macro definitions behave? Or would the macrochars override only apply to the current context?

The command line options should affect the macrochars globally in the document (i.e. before the document is even parsed); but the inline macro definition is a different story altogether.

These would be single chars (any char in the string would be a valid char to start a macro call): !macrochars(!\) would enable two possible syntaxes, !macro and \macro. More than one char would be too heavy and less than one too ambiguous.
Then it's up to the user not to misuse this macro.

The new chars will be used after calling macrochars until the end of all documents or the next call to macrochars.
The idea is to call this macro as early as possible (on the command line, in a file imported on the command line (with -import), in a common included file, at the beginning of the file...).
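
Concretely, something like this (the file name is only an example and the exact -import spelling may differ):

$ cat config.pp
!macrochars(!)
$ pp -import config.pp doc.md

so that \dot and the other backslashed LaTeX commands in doc.md are no longer treated as macro calls.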

@CDSoft your solution with !macrochars is good enough. But I can't use !raw, because I have much more LaTeX math and probably other things as well that don't work well with \ as a macro character. I like ! much better, and it would be better to use only the exclamation mark (my opinion).

it would be better to use only the exclamation mark (my opinion).

Are you suggesting dropping support for the \?

I also tend to use ! more, but I often alternate ! and \ in complex macros as a visual reminder of the nesting level (and use different bracketing as well); but this is just an aesthetic need, I could survive without it.

I agree that the \ syntax has greater potential for conflicts (even within verbatim and code blocks, where some overlooked, unlucky char combination could end up being mistaken by PP for a macro). In some rare edge cases, the \ might even clash with pandoc markdown, where the \ can be used for escaping (all_symbols_escapable extension):

Except inside a code block or inline code, any punctuation or space character preceded by a backslash will be treated literally, even if it would normally indicate formatting. [...]

This rule is easier to remember than standard Markdown's rule, which allows only the following characters to be backslash-escaped:

\`*_{}[]()>#+-.!

(so far, I have never run into a markdown-escape/pp-macro conflict)

But I do think that having more than one syntax choice is good; and since removing the \ altogether would break backward compatibility, it should only be done if the conflicts are common enough to justify removing it as a default syntax; otherwise, offering a way to override it would be preferable.

I admit that I use PP almost exclusively to work with pandoc markdown, and I haven't encountered many problems with the \ syntax so far. But this might not be the case for other users (as this issue demonstrates).

Again, I really think that the idea of being able to define/override macrochars via CLI options, or in-text via a macro, is good, and I fully support it. Since from the introduction of this feature onward users will be able to control the macrochars, it would be the right time to consider whether the current !\ chars are good, or whether they cause enough conflicts in some contexts that they should be reconsidered (as @geniusupgrader suggested). If a backward-breaking change has to be introduced, the sooner the better (especially since PP's user base is starting to grow faster).

Are you suggesting dropping support for the \?

I think some conflicts between PP and pandoc's markdown do not justify the removal of \.
Implementing !macrochars should be enough. And then for me personally, I will only use !.

The use of both \ and ! comes from the different preprocessors I used; I kept it for backward compatibility, but it may be time to simplify this.

We can have a default configuration with a single char and macros to change this behaviour.

Macro calls: !macro
Literate programming macros: @macro

But I would prefer keeping the three kinds of parentheses for parameters: (), [] and {}. It helps group parameters in nested macro calls.
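
For instance, alternating brackets makes the grouping in nested calls visible at a glance (a small sketch using the existing !define and !ifdef macros; the symbol names are arbitrary):

!define(author)[Jane Doe]
!ifdef(author){Written by !author}(Anonymous)

Here each level picks a different bracket for its arguments, so the nesting stands out.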

pp 2.0 implements these macros and uses "!" to call macros. !macrochars(!\) can be used in an imported file to restore the previous behaviour.
Hope it works.

But I would prefer keeping the three kinds of parentheses for parameters: (), [] and {}. It helps group parameters in nested macro calls.

Actually I have been bitten from time to time by the fact that \macro{} may conflict with embedded LaTeX, so I would like to be able to

I wouldn't mind having a set of macros

!add_macrochars(&?%...)
!rem_macrochars(\...)
!add_delimiters(<>...)
!rem_delimiters[{}()...]

which should all take an open-ended number of characters (or character pairs) in their argument, so that you can add/remove more than one character (or pair) in one go.

The characters in the arguments should preferably be allowed to be any characters with Unicode General Category P or S,
expecting users to be smart enough to not shoot themselves in the foot with their character choices.
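
Hypothetical usage, just to make the proposal concrete (none of these macros exist; the names and semantics are only the ones proposed above):

!add_macrochars(&)      <-- & now also starts macro calls: &define(x)(1)
!rem_macrochars(\)      <-- \ no longer starts macro calls, freeing it for LaTeX
!add_delimiters(<>)     <-- <...> becomes an additional argument wrapper: !foo<bar>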

so I would like to be able to

part of the sentence was lost!

The idea of a macro to also define delimiters is good, and backward compatibility can always be reintroduced via such macros.

I think that by default PP should have at least two types of delimiters, so even if you drop the curly braces ("{ }") there will still be the square brackets ("[ ]") and parentheses ("( )"). But definitely, alternating delimiters makes long single-line nested macros easier to read, understand, edit and debug; and at least two alternative delimiters should be built in!

I noticed that in your example there are !rem_* and !add_* variants of these macros with parameters. How would they work? Would they remove/add specific delimiters without affecting the remaining ones? I.e., !macrochars (and !macrodelimiters?) would reset the accepted chars/delimiters to only those in the passed param, while the !rem_* and !add_* variants would allow removing or adding without affecting the rest?

so I would like to be able to

part of the sentence was lost!

"I would like to be able to redefine the set of delimiters as well."

I noticed that in your example there are !rem_* and !add_* variants of these macros with parameters. How would they work? Would they remove/add specific delimiters without affecting the remaining ones? I.e., !macrochars (and !macrodelimiters?) would reset the accepted chars/delimiters to only those in the passed param, while the !rem_* and !add_* variants would allow removing or adding without affecting the rest?

Exactly.

pp 2.0 implements these macros and uses "!" to call macros. !macrochars(!\) can be used in an imported file to restore the previous behaviour.
Hope it works.

As soon as I got notice of the v2.0 release, I decided to update/check all the definitions in my “The Pandoc-Goodies PP-Macros Library” (they were lagging behind, and some stopped working after the v1.11 fix).

Updating included extensive testing via the (pre-existing) test suite. I didn't encounter any problems, and it seems to work fine.

The only difference I noticed is that some macros that previously managed to create and then delete temporary files via !exec now seem unable to delete them. I don't know why they don't get deleted anymore, but my guess is that it's just a problem with the file still being in use by the previously invoked command/tool; see my Issue #42 ("Add !execwait Macro") in this regard.

How does !add_delimiters(<>(){}) decide which chars are pairs? Does it assume that each odd char is an opening delimiter and the following (even) char the matching closing delimiter, creating a pair from every 2 contiguous chars?

Does it mean that !add_delimiters(][) would result in flipped square-bracket delimiters? I.e.: !macro]param[

Will the macro accept only an even number of chars as its parameter, and fail on finding duplicate chars in the param? E.g.: !add_delimiters([][})

I really like this!

I've written a parser or two in my day which had to deal with arbitrary multichar delimiters. They complicate things.

I've faced similar complications when writing a language definition for a syntax highlighter: I encountered a language that shared some common chars in different string delimiters ("..." literal strings, and ~"..." escape strings), and it soon became a nightmare implementing states to track strings and quote escape sequences (\") within literal and escapable strings. So I'm glad you're going to enforce unique delimiter chars.

Well I'm not in a position to enforce anything since I don't know Haskell and thus can't make a PR. I'm merely suggesting.

Is it really useful to have add_* and rem_* macros? A single macro to change the whole char set should be enough:

!macroargs( () «» ) <-- notice that spaces are ignored; there must be an even number of non-space chars
!foo(x) !foo«y»
!foo[z] <-- won't work here!

Currently pp does not support Unicode. It should be a separate issue if really required, because it would require a lot of changes (e.g. changing the String type to Text).

Is it really useful to have add_* and rem_* macros?

Only in projects that import macro-definition modules from different sources, where a macro might need to introduce changes that won't disrupt the general context.

I'm not sure this is a realistic scenario right now. Maybe one day there will be hundreds of independent macro libraries for users to import. When this happens, these macros would allow keeping macro modules updated with newer PP versions, or adjusting macro modules to work with specific PP versions.

But my guess is that, right now, authors are personally managing their macros (no matter how many files). So, if it involves a lot of work, it could just be added to the wishlist of future enhancements.

I have added !macroargs.
I think it would be dangerous to let macros change the parser anywhere. pp uses a one-pass parser. Nested macro calls defined before changing the parser configuration will fail when executed.
These macros are intended to be used at the very beginning (ideally in a file imported on the command line).
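
For example, an imported configuration file might look like this (the file name is only an example and the exact -import spelling may differ):

$ cat config.pp
!macrochars(!)       <-- only ! starts a macro call
!macroargs( () [] )  <-- arguments may be wrapped in () or []
$ pp -import config.pp doc.md

Since the parser is reconfigured before doc.md is read, no macro call in the document is parsed with one configuration and executed with another.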

Well, one week has passed by; time to close this, because my problem was solved and your implementation seems to be working.