Perf checks
tlienart opened this issue · comments
Thibaut Lienart commented
investigating partition
- TextBlock(...): in it, most of the time (significant!) is spent in findfirst, findlast; this makes the partition looping slow
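One possible way to cut the findfirst / findlast cost, sketched under assumptions this thread does not confirm: if those calls select the tokens that fall inside a block's character range, and the tokens are kept sorted by position, the two linear predicate scans could be replaced by binary searches. The `Tok` type and both `tokens_in_range_*` helpers are hypothetical names made up for illustration, not FranklinParser's actual types or functions.

```julia
# Hypothetical token type: only the start position matters for this sketch.
struct Tok
    from::Int   # index of the token's first character in the source
end

# Linear version: two predicate scans over the token vector.
function tokens_in_range_linear(tokens::Vector{Tok}, lo::Int, hi::Int)
    i = findfirst(t -> t.from >= lo, tokens)
    j = findlast(t -> t.from <= hi, tokens)
    # No token inside [lo, hi]: return an empty view.
    (i === nothing || j === nothing || i > j) && return view(tokens, 1:0)
    return view(tokens, i:j)
end

# Binary-search version. ASSUMPTION: `tokens` is sorted by `from`.
# `by` is applied to both the elements and the searched value, so the
# bounds are wrapped in a `Tok`.
function tokens_in_range_sorted(tokens::Vector{Tok}, lo::Int, hi::Int)
    i = searchsortedfirst(tokens, Tok(lo); by = t -> t.from)
    j = searchsortedlast(tokens, Tok(hi); by = t -> t.from)
    return view(tokens, i:j)   # an empty range when i > j
end

tokens = [Tok(p) for p in (1, 5, 9, 14, 20)]
inside = tokens_in_range_sorted(tokens, 5, 14)
```

Both helpers return a view, so no token is copied. Whether this actually helps depends on how the real lookup is written, so it would need to be measured against the profile.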
investigating find_tokens
- main loop is most of the chunk; check if pre-finding trigger chars helps
- (secondary) postprocess emph looks a bit too slow
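The "pre-finding trigger chars" idea could look roughly like the sketch below. It is only an illustration: the `TRIGGERS` tuple, `trigger_positions`, `scan` and the `try_match` callback are hypothetical stand-ins, not the package's token table or API. The point is that one cheap pass collects the candidate positions, and the more expensive matching logic then runs only there.

```julia
# Hypothetical set of characters that can start a token (an assumption for
# illustration). A small tuple avoids hashing on every lookup.
const TRIGGERS = ('*', '_', '`', '#', '[', '<', '\\', '$')

# Single pass collecting the (byte) indices of candidate characters.
function trigger_positions(s::AbstractString)
    positions = Int[]
    for (i, c) in pairs(s)
        c in TRIGGERS && push!(positions, i)
    end
    return positions
end

# Main-loop skeleton: matching is attempted only at pre-found positions.
# `try_match(s, i)` is a placeholder supplied by the caller; it returns
# `true` when a token starts at index `i`.
function scan(s::AbstractString, try_match)
    found = Tuple{Int,Char}[]
    for i in trigger_positions(s)
        try_match(s, i) && push!(found, (i, s[i]))
    end
    return found
end

stars = scan("a *b* αβ `c`", (s, i) -> s[i] == '*')
```

The indices are byte indices, so they stay valid for indexing even when the text contains multi-byte characters. Whether this beats a per-character haskey check is exactly what the profiling would have to show.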
Thibaut Lienart commented
Aug 6
julia> ct = read("src/_precompile/expages/real3.md", String) * "\n" * read("src/_precompile/expages/real4.md", String);
julia> FranklinParser.md_partition(ct);
────────────────────────────────────────────────────────────────────────────────
                                    Time                    Allocations
                           ───────────────────────   ────────────────────────
     Tot / % measured:         26.3ms /  99.7%           2.02MiB /  92.4%
Section             ncalls     time    %tot      avg     alloc    %tot       avg
────────────────────────────────────────────────────────────────────────────────
tokenizer                1   15.5ms   59.1%   15.5ms    361KiB   18.9%    361KiB
  main loop              1   15.3ms   58.4%   15.3ms    290KiB   15.2%    290KiB   # this is find tokens
    haskey           31.2k   2.94ms   11.2%   94.2ns     0.00B    0.0%     0.00B   # ==> many many calls, adds up!!!
    loop             3.25k   2.65ms   10.1%    817ns    288KiB   15.0%     90.7B   # ==> the actual matches
      fixed match    4.51k   1.21ms    4.6%    270ns    157KiB    8.2%     35.7B
      greedy match   1.46k    386μs    1.5%    264ns    129KiB    6.8%     90.4B
    nexthead         31.2k   2.37ms    9.0%   75.9ns     0.00B    0.0%     0.00B
  pp emph                1    116μs    0.4%    116μs   57.7KiB    3.0%   57.7KiB
  pp header              1   33.8μs    0.1%   33.8μs      480B    0.0%      480B
  pp alink               1   17.0μs    0.1%   17.0μs   7.97KiB    0.4%   7.97KiB
  begin                  1   2.12μs    0.0%   2.12μs      336B    0.0%      336B
  end                    1    292ns    0.0%    292ns     0.00B    0.0%     0.00B
partitioning             1   9.80ms   37.4%   9.80ms    403KiB   21.1%    403KiB
  loop                   1   9.80ms   37.4%   9.80ms    401KiB   21.0%    401KiB
    push2              663   9.29ms   35.4%   14.0μs   33.4KiB    1.7%     51.6B
      pushif           663   8.99ms   34.3%   13.6μs   32.0KiB    1.7%     49.4B   # ==> TextBlock!
        findlast       436   6.86ms   26.2%   15.7μs     0.00B    0.0%     0.00B   # ==> most of the work
        findfirst      436   1.60ms    6.1%   3.67μs     0.00B    0.0%     0.00B   # ==> here too
        view           436   14.8μs    0.1%   34.0ns     0.00B    0.0%     0.00B
      isempty          663   58.6μs    0.2%   88.5ns     0.00B    0.0%     0.00B
    push1              663    117μs    0.4%    177ns    366KiB   19.1%      565B
    inter              663   98.3μs    0.4%    148ns     0.00B    0.0%     0.00B
  init                   1    958ns    0.0%    958ns     0.00B    0.0%     0.00B
  end                    1    291ns    0.0%    291ns     0.00B    0.0%     0.00B
blockifier               1    881μs    3.4%    881μs   1.12MiB   59.9%   1.12MiB
postprocessing           1   46.6μs    0.2%   46.6μs   1.92KiB    0.1%   1.92KiB
────────────────────────────────────────────────────────────────────────────────
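For readers who want to produce a breakdown in this style: the layout above looks like the output of the third-party TimerOutputs.jl package, though the thread does not say so explicitly, so treat that as an assumption. A minimal sketch of nested timing sections follows; the `tokenize` function and its section labels are hypothetical stand-ins, not the instrumented FranklinParser code.

```julia
using TimerOutputs   # third-party package, assumed to be what produced the table

const to = TimerOutput()

# Hypothetical stand-in for an instrumented parser stage. Nested `@timeit`
# calls show up as indented sub-sections in the printed report.
function tokenize(s::AbstractString)
    return @timeit to "tokenizer" begin
        stars = @timeit to "main loop" begin
            [i for (i, c) in pairs(s) if c == '*']
        end
        @timeit to "postprocess" begin
            reverse(stars)
        end
    end
end

result = tokenize("a *b* c")
print_timer(to)   # prints the ncalls / time / allocation breakdown per section
```

Each `@timeit` adds a small overhead of its own, so sections called tens of thousands of times (like haskey and nexthead above) will look somewhat more expensive than they are in an uninstrumented run.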
Thibaut Lienart commented
OK, find_tokens is already significantly better (about 1/4 of the cost, just by doing things better)
Thibaut Lienart commented
hmm yeah maybe, but tests are now failing... changes are only in find_tokens for now, so fix this.
Thibaut Lienart commented
tests fixed; should try replacing the fixed-step matches with regexes and see...