d0vgan / nppexec

NppExec (plugin for Notepad++)

SCI_REPLACE has problem matching the beginning of line

tho-gru opened this issue · comments

Hi,

I want to insert a string on every line of a text file using NppExec.

Steps to reproduce:
Take any file containing some non-empty lines, such as:

Abc
Def
Xyz

Interactively, I use the Replace dialog as follows:
[Screenshot: Replace-Dialog]

This works fine.

But I want to achieve the same function using NppExec.

My first attempt was:

set local replaceFlags ~ NPE_SF_INENTIRETEXT | NPE_SF_REPLACEALL | NPE_SF_REGEXP | NPE_SF_POSIX
sci_replace $(replaceFlags) "^" "prefix-"

This attempt does not work and results in the following output on the NppExec console:

SET: local replaceFlags ~ NPE_SF_INENTIRETEXT | NPE_SF_REPLACEALL | NPE_SF_REGEXP | NPE_SF_POSIX
local $(REPLACEFLAGS) = 270533376
SCI_REPLACE: $(replaceFlags) "^" "prefix-"
- 0 occurrences replaced.
================ READY ================

I tried several other things and found that this issue seems to be related to the ^ character, which matches the beginning of a line.

My workaround is the following:

set local replaceFlags ~ NPE_SF_INENTIRETEXT | NPE_SF_REPLACEALL | NPE_SF_REGEXP | NPE_SF_POSIX
sci_replace $(replaceFlags) "^(.)" "prefix-\1"

Is it possible to get the first attempt to work?

Version info:
Notepad++ v8.4.4 (64-bit)
Build time : Jul 15 2022 - 17:54:42
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line : -multiInst
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows 10 Pro (64-bit)
OS Version : 21H2
OS Build : 19044.1826
Current ANSI codepage : 1252
Plugins :
ComparePlugin (2.0.2)
DSpellCheck (1.4.24)
JSMinNPP (1.2205)
mimeTools (2.8)
NppConverter (4.4)
NppExec (0.8)
NppExport (0.4)
PythonScript (2)
XMLTools (3.1.1.13)

Kind regards
Thomas

It's an interesting question. I have always treated ^ and $ as "anchors" that require additional characters for an actual match. That's why ^(.) should always work with any regexp engine: it has the standard meaning of "any character at the beginning of a line".
As for ^ and $ alone, as well as the combination ^$ (which may potentially match an empty line), all of them do work in VS Code and in Notepad++. The ^ and $ alone also work in Visual Studio, but ^$ does not work there.
When any of ^, $ or ^$ is used in Notepad++'s Find dialog, it explicitly says "zero length match", so I'd rather treat this behavior as an extension of the standard regexp rules.
I've also tried e.g.

sci_find NPE_SF_INENTIRETEXT|NPE_SF_REGEXP|NPE_SF_PRINTALL "^"

and

sci_find NPE_SF_INENTIRETEXT|NPE_SF_REGEXP|NPE_SF_PRINTALL "^\s*$"

in NppExec, and it says "0 occurrences found".
NppExec uses Scintilla's command SCI_SEARCHINTARGET for searching, and the result in this case is -1 which means "not found".
I am not sure how Notepad++ succeeds in searching in this situation.
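To make the zero-length-match distinction above concrete, here is a minimal sketch using Python's `re` module. It is only an illustration: Python's engine is neither Scintilla's nor the one Notepad++ uses, and the sample text with an empty middle line is made up. It shows that a bare ^ matches at the start of every line without consuming anything, while ^(.) needs at least one character and therefore skips empty lines.

```python
import re

# Hypothetical sample text; the middle line is intentionally empty.
text = "Abc\n\nXyz"

# Bare anchor: a zero-length match at the start of every line,
# including the empty one.
bare = re.sub(r"^", "prefix-", text, flags=re.MULTILINE)

# Anchor plus one captured character: only non-empty lines can match,
# because "." does not match a line break.
captured = re.sub(r"^(.)", r"prefix-\1", text, flags=re.MULTILINE)

print(bare)
print(captured)
```

In this sketch the two results differ only on the empty line, which is worth keeping in mind when using the ^(.) workaround on files that contain blank lines.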

Your comment related to Scintilla brought up the idea of testing this in SciTE version:

Version 5.2.4   Scintilla:5.2.4   Lexilla:5.1.8
    Jul  9 2022 09:42:11
by Neil Hodgson.
December 1998-July 2022.

As you can see in the screenshot below, Scintilla can handle such a regex:
[Screenshot: SciTE-Regex]

Looking into Scintilla's sources, there is a file "Scintilla\include\BoostRegexSearch.h" that contains the following constants:

#define SCFIND_REGEXP_EMPTYMATCH_MASK          0xE0000000
#define SCFIND_REGEXP_EMPTYMATCH_NONE          0x00000000
#define SCFIND_REGEXP_EMPTYMATCH_NOTAFTERMATCH 0x20000000
#define SCFIND_REGEXP_EMPTYMATCH_ALL           0x40000000
#define SCFIND_REGEXP_EMPTYMATCH_ALLOWATSTART  0x80000000

Notepad++ uses these constants while searching. I was not aware of them since they are not in the public header "Scintilla.h" and are not mentioned in the official documentation https://www.scintilla.org/ScintillaDoc.html
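As a rough illustration of how such constants can sit in the upper bits of a search-flags value, here is a small Python sketch. The numeric values are copied from the defines quoted above; the helper name `empty_match_bits` and the example flag values are hypothetical, and the sketch makes no claim about how Notepad++ or NppExec actually interprets each bit.

```python
# Values copied from the defines quoted above.
EMPTYMATCH_MASK = 0xE0000000
EMPTYMATCH_BITS = {
    "NOTAFTERMATCH": 0x20000000,
    "ALL": 0x40000000,
    "ALLOWATSTART": 0x80000000,
}


def empty_match_bits(flags):
    """Return the names of the empty-match bits set in flags (hypothetical helper)."""
    masked = flags & EMPTYMATCH_MASK  # keep only the top three bits
    return [name for name, bit in EMPTYMATCH_BITS.items() if masked & bit]


# 0x4 stands in for some unrelated low-order search flag (an assumption).
print(empty_match_bits(0x40000000 | 0x4))
print(empty_match_bits(0x80000000 | 0x40000000))
print(empty_match_bits(0x4))
```

Because the mask covers only the top three bits, the lower bits stay free for other search flags, which is presumably why these values can be passed alongside the ordinary ones.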

Well, it looks like I need to add these things to both sci_find and sci_replace.

The latest commit to the develop branch contains the required changes plus the examples for sci_find and sci_replace.
Note: don't forget to update the "BaseDef.h" under the "NppExec\NppExec" folder since it contains the new NPE_SF_REGEXP_EMPTYMATCH_* constants.

Thanks for fixing this so quickly.