aligrudi / neatvi

A small vi/ex editor for editing bidirectional UTF-8 text

Home Page: http://litcave.rudi.ir/

lbuf_replace is utterly slow

kyx0r opened this issue

commented

Hello Ali, lbuf_replace is utterly slow.

I haven't really looked into what's causing this, but when I open a 20 MB book file and do a simple word substitution,
it takes hours to complete (maybe it never finishes at all).
The whole book is 385,000 lines, wrapped at an 80-character limit.

I think those memmove() inside lbuf_replace() have exponential complexity, which sucks.

We need to speed this up; it is on my todo list for the near future.
Any ideas?
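
To make the memmove() suspicion concrete, here is a minimal sketch of how per-line shifting can add up. It is not neatvi's actual code: `struct linebuf`, `replace_naive()`, `replace_inplace()` and `demo_moves()` are hypothetical names, and the sketch assumes the buffer is a flat array of line pointers where every replacement deletes the old line and inserts the new one, shifting the tail both times. Under that assumption the total work grows with the square of the line count rather than exponentially, which for 385,000 lines is still roughly 1.5 × 10^11 pointer moves.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical line buffer: a flat array of line pointers.
 * This is an illustration only, not neatvi's real struct lbuf. */
struct linebuf {
	const char **ln;	/* line pointers */
	size_t n;		/* number of lines */
	size_t moved;		/* pointers shifted so far (for counting only) */
};

/* Replace line pos by deleting it and inserting s; shifts the tail twice. */
static void replace_naive(struct linebuf *lb, size_t pos, const char *s)
{
	size_t tail = lb->n - pos - 1;
	/* delete: close the gap left by the old line */
	memmove(lb->ln + pos, lb->ln + pos + 1, tail * sizeof(lb->ln[0]));
	/* insert: open the gap again for the new line */
	memmove(lb->ln + pos + 1, lb->ln + pos, tail * sizeof(lb->ln[0]));
	lb->ln[pos] = s;
	lb->moved += 2 * tail;
}

/* Same result when the line count does not change: no shifting at all. */
static void replace_inplace(struct linebuf *lb, size_t pos, const char *s)
{
	lb->ln[pos] = s;
}

/* Substitute every line of an n-line buffer; return the pointers shifted. */
static size_t demo_moves(size_t n, int naive)
{
	struct linebuf lb = { malloc(n * sizeof(const char *)), n, 0 };
	size_t i;
	if (!lb.ln)
		return 0;
	for (i = 0; i < n; i++)
		lb.ln[i] = "old";
	for (i = 0; i < n; i++) {
		if (naive)
			replace_naive(&lb, i, "new");
		else
			replace_inplace(&lb, i, "new");
	}
	free(lb.ln);
	return lb.moved;
}
```

With this sketch, `demo_moves(1000, 1)` shifts 999000 pointers (n(n-1) for n lines), while `demo_moves(1000, 0)` shifts none. Whether the real slowdown comes from this pattern is something only the profile and the actual lbuf_replace() source can confirm.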

commented

rset_find() may seem like it is a bottleneck, but something is wrong. Profiling results are consistent on different machines.

Look at this callgrind file: the overhead of that memmove is an outrageous 80%.
https://0x0.st/-xUa.txt

I ran this on another, smaller file (80K lines) with a simple replacement:
:%s/<if>/hello/g

commented

Seems like a bug hidden somewhere that makes it read past the buffer, or something causes the memmove parameters to get out of hand.

commented

https://0x0.st/-xUX.txt

DFA regex implementation, same story. Memmove is having a lot of fun wasting cpu cycles.
Maybe I'm overlooking something, but what are the chances of me making the same bug in 2 completely different regex implementations?
Likely none, but what causes this crazy memmove?

I don't know why, but I can't get that memmove to go crazy on your version of neatvi. That doesn't mean the bug is no longer a concern, though.

commented

Okay, I messed up. I tested your neatvi again and can confirm that lbuf_replace is indeed utterly slow:
https://0x0.st/-xUa.txt

rset_find() is not the overhead.

commented

Wow that's a massive speed up! I LIKE IT!

Now I don't have to put word replacements in my book overnight when I go to sleep! Yeah this world is crazy like that, that is why I say every little bit of code matters. Because you never know how someone will use the code, and what seems like a negligible overhead can become a huge problem.

Thanks Ali :)

I will open another issue soon; it's a different problem that is easy to fix.