pygy / strung.lua

Lua string patterns rewritten in Lua + FFI, for LuaJIT.

Updates?

pablomayobre opened this issue

I think the README is outdated. I would like to know the state of this library: what is missing, what needs to be fixed, what needs to be improved, and so on, so that maybe I can help with it.

I'm really interested in this project, it's amazing! Thanks for your hard work.

Posting this again because editing the message I sent earlier by mail breaks the formatting.

@positive07 Thanks for the interest and the praise :-)

The README is indeed a bit out of date in that I have started to implement gsub (and now that I look at it, it is more advanced than I remembered).

find, match and gmatch should work fine as they are

I didn't update the docs because gsub is still buggy (there are segfaults when trying to access some of the strings it produces :-/) and its performance is suboptimal. I recently found a bug in the way I double malloc()ed buffers, which may be the culprit (I can't see how I would get segfaults otherwise, anyway).

The code related to gsub is located in the scope that starts on line 794 of the current version. It is in a slightly derelict state, and the WIP stuff is vaguely named (long_string_hander, short_string_handler, prepare...).

I'll give it another crack if you want.

Oh, reviewing more of the code, I also realise that I removed the install() method but kept it in the README. Thanks for the heads up.

@CapsAdmin, in reply to the comment I received in my mail but can't see on the site (did you delete it?):

Glad to know it's been working fine for so long :-)

Thanks for the long explanation! I would love to use this library in my project. I don't think I can help much with gsub because I know nothing about C, but I could write the install() and uninstall() methods if you want.

You're welcome :-)

Thanks for the proposition, but now that I look at it closer, I realize that install() and uninstall() were still there, but not exposed. I've just restored them.

Oh great! So the only missing thing is fixing gsub?

Yup. It already works with table, function and plain string replacements. It becomes problematic when the replacement is a string that references one or more captures ("foo%0bar", etc...).
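For reference, here is how the replacement kinds mentioned above behave with the stock string.gsub, whose results strung.gsub is meant to reproduce. This is a minimal sketch that only uses the standard string library; the sample strings and the `names` table are made up for illustration.

```lua
-- Reference behaviour of the built-in string.gsub for each replacement kind.
-- strung.gsub is expected to return the same values for the same arguments.

-- 1. Plain string replacement (no % escapes in the replacement).
local plain, n_plain = string.gsub("hello world", "o", "0")
-- plain == "hell0 w0rld", n_plain == 2

-- 2. Table replacement: the first capture is used as the lookup key.
local names = { name = "strung", lang = "Lua" }  -- hypothetical data
local from_table, n_table =
  string.gsub("$name is written in $lang", "%$(%w+)", names)
-- from_table == "strung is written in Lua", n_table == 2

-- 3. Function replacement: the function receives the match (or the captures).
local from_func, n_func = string.gsub("abc", "%w", string.upper)
-- from_func == "ABC", n_func == 3

-- 4. String replacement that references captures:
--    %0 is the whole match, %1 and %2 are the first and second captures.
local swapped, n_swapped =
  string.gsub("hello world", "(%w+) (%w+)", "%2 %1 [%0]")
-- swapped == "world hello [hello world]", n_swapped == 1

print(plain, from_table, from_func, swapped)
```

Only the fourth kind has to parse the replacement string for % escapes and copy capture contents into the output, which is the code path described as problematic here.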

The segfault is gone, and I've fixed another unrelated (but bad) bug in the character set parser.

like this?

> = string.gsub("banana", "(a)(n)", "%2%1") -- reverse any "an"s
bnanaa  2

Yes. There's an off-by-one error somewhere in the buffer handling code (I use a byte array to build the new string, and I overwrite some bytes).

Your example works fine, but = strung.gsub("banana", "(a)(n)", "+%2+%1+") will return "bna+naa", 2 instead of "b+n+a++n+a+a", 2.

Off by one: gone.

There's still one big bug: position captures are not properly handled, and strung.gsub("Foo", "()", "%1") hangs.

Only one test left to pass in pm.lua and we're good :-)

Nice! So this test?

strung.gsub('alo alo', '()[al]', '%1') == '12o 56o'

How did you implement the () pattern? Is there any way to fix this? maybe an ugly hack...
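To make the expected results concrete, the position-capture cases discussed in this thread can be checked against the stock string.gsub, used here as the reference implementation. This is a small sketch using only the standard library; a strung build that passes the test should return the same values from strung.gsub.

```lua
-- "()" is a position capture: it yields the current 1-based index in the
-- subject as a number instead of a substring.

-- Every 'a' or 'l' is replaced by the index at which it was found.
local indexed, n_indexed = string.gsub("alo alo", "()[al]", "%1")
-- indexed == "12o 56o", n_indexed == 4

-- An empty pattern made of a lone position capture matches before each
-- character and once more at the end of the subject.
local marked, n_marked = string.gsub("Foo", "()", "%1")
-- marked == "1F2o3o4", n_marked == 4

-- Outside of gsub, the capture comes back as a number, not a string.
local pos = string.match("banana", "()nan")
-- pos == 3, type(pos) == "number"

print(indexed, marked, pos)
```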

No need for any hack, I just had to add a special case as seen in this commit.

Position captures are just captures whose start bound is set to 0. Since no text capture can start at that index, I can use it as a tag (before that commit, I set the close bound to 2^32-1, but the new solution is cleaner, since it doesn't limit the position range of the last capture).
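The tagging idea can be sketched in plain Lua as follows. This is not the code from strung: the `capture_value` helper, the (open, close) pair layout, and the choice of storing the position in the close bound are all assumptions made for illustration. The only detail taken from the description above is that a start bound of 0 marks a position capture.

```lua
-- Hypothetical capture representation: a pair of bounds (open, close).
--   text capture:     open >= 1, the text is subject:sub(open, close)
--   position capture: open == 0 acts as the tag; in this sketch the
--                     position itself is kept in `close` (an assumption)
local function capture_value(subject, open, close)
  if open == 0 then
    -- No text capture can start at index 0 (Lua strings are 1-based),
    -- so the tag cannot collide with a real capture.
    return close
  end
  return subject:sub(open, close)
end

local subject = "alo alo"
local text = capture_value(subject, 1, 3)      -- ordinary capture: "alo"
local position = capture_value(subject, 0, 5)  -- position capture: 5

print(text, position)
```

Compared with reserving a sentinel such as 2^32-1 in the close bound, tagging with 0 leaves the full index range available for real captures.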

Now that strung is AFAICT feature complete, I'm about to release v1.0 and publish it as a Luarock. Do you see anything that is missing?

Thanks a lot for the nudge, there was actually little left to finish this :-)

You are amazing! I wasn't expecting this much, really. You are awesome. I don't see anything missing; everything that was proposed is already done.

Maybe for another release: format, which is not JIT-compiled (partial support in 2.1), and maybe the functions whose JIT compilation was only added in 2.1: lower, rep, reverse and upper. But I don't think there will be any benefit for those last ones; they may be slower than the built-in interpreted functions.

Actually, even the pattern matching functions are not that interesting in LuaJIT 2.0, since the performance is so unpredictable.

I may look into string.format at some point.

Closing this now.

Again, thanks for the support, there was little left to do, I just lacked the motivation to finish the lib.

Hahaha now you have a library in Luarocks! Amazing :)

Not yet, I somehow broke my moonrocks install, and I'll have to fix it before I upload it.

But I do have a rock up there already: the mighty require.lua :-)

Ohh I think I'll be using a modified version of require.lua in a project of mine (Cube, which is really EARLY in development)