concat can be faster
meooow25 opened this issue · comments
I recently wanted a reverseConcat, which led me to the current implementation of concat and the realization that it is not very efficient:
Lines 1033 to 1046 in e84c7a3
- Allocates a new list of Texts (lazily, but it still has a cost) because of the filter
- Allocates a list of boxed int lengths to sum them, since sumP does not fuse
Here is an alternate straightforward implementation:
concat :: [T.Text] -> T.Text
concat ts0 = case ts0 of
  [] -> T.empty
  [t] -> t
  _ | len == 0 -> T.empty
    | otherwise -> T.Text arr 0 len
  where
    flen acc (T.Text _ _ l)
      | acc' > 0 || l == 0 = acc'
      | otherwise = concatOverflowError
      where
        acc' = acc + l
    len = foldl' flen 0 ts0
    arr = A.run $ do
      marr <- A.new len
      let loop !i [] = pure marr
          loop i (T.Text a o l : ts) = A.copyI l marr i a o *> loop (i+l) ts
      loop 0 ts0
This is not exactly the same, since the current implementation performs a case match on the list after filtering out null Texts. But I doubt such a preemptive step helps.
Benchmarks on GHC 9.6.3 with -O, concatenating a list of all Chars (each as a singleton Text):
concat: OK
  54.8 ms ± 3.1 ms, 174 MB allocated, 52 MB copied, 325 MB peak memory
alt concat: OK
  13.1 ms ± 901 μs, 4.2 MB allocated, 773 B copied, 325 MB peak memory
==: OK
Benchmark file
{-# LANGUAGE BangPatterns #-}
import Prelude hiding (concat)
import Data.Foldable (foldl')
import Test.Tasty.Bench
import Test.Tasty.HUnit
import qualified Data.Text as T
import qualified Data.Text.Internal as T
import qualified Data.Text.Array as A

main :: IO ()
main = defaultMain
  [ env (pure xs_) $ \xs -> bgroup ""
    [ bench "concat" $ whnf T.concat xs
    , bench "alt concat" $ whnf concat xs
    , testCase "==" $ concat xs @?= T.concat xs
    ]
  ]
  where
    xs_ = map T.singleton [minBound .. maxBound]

concat :: [T.Text] -> T.Text
concat ts0 = case ts0 of
  [] -> T.empty
  [t] -> t
  _ | len == 0 -> T.empty
    | otherwise -> T.Text arr 0 len
  where
    flen acc (T.Text _ _ l)
      | acc' > 0 || l == 0 = acc'
      | otherwise = concatOverflowError
      where
        acc' = acc + l
    len = foldl' flen 0 ts0
    arr = A.run $ do
      marr <- A.new len
      let loop !i [] = pure marr
          loop i (T.Text a o l : ts) = A.copyI l marr i a o *> loop (i+l) ts
      loop 0 ts0

concatOverflowError :: a
concatOverflowError = error "Data.Text.concat: size overflow"
Yeah, makes sense.
I actually think that functions like this should not force the input list before starting to consume it. A usual pattern with doubling a mutable buffer on overflows would be better.
That is also an option, but it has the disadvantage of overallocating.
Whichever way it is done, it would be nice to get an improved concat.
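For comparison, the buffer-doubling alternative discussed above might look roughly like the sketch below. It is only an illustration, not a proposed patch: the name concatDoubling and the starting capacity of 64 bytes are made up for the example, size overflow checks are left out, and it assumes that A.run2, A.resizeM and A.shrinkM are exported from the internal Data.Text.Array module as in text-2.x. The list is traversed once, so copying can start before the whole list has been forced. The final A.shrinkM is meant to release unused capacity, though the buffer can still be up to roughly twice the needed size while it is being filled.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Monad.ST (ST)
import qualified Data.Text as T
import qualified Data.Text.Internal as T
import qualified Data.Text.Array as A

-- Hypothetical single-pass concat: grow the buffer on demand.
-- Size overflow checks are omitted to keep the sketch short.
concatDoubling :: [T.Text] -> T.Text
concatDoubling ts0
  | len == 0  = T.empty
  | otherwise = T.Text arr 0 len
  where
    initialCap = 64 :: Int  -- arbitrary starting capacity in bytes

    (arr, len) = A.run2 (A.new initialCap >>= \marr -> go marr initialCap 0 ts0)

    -- go buffer capacity bytesUsed remainingChunks
    go :: A.MArray s -> Int -> Int -> [T.Text] -> ST s (A.MArray s, Int)
    go marr _ !used [] = do
      A.shrinkM marr used          -- drop the unused tail of the buffer
      pure (marr, used)
    go marr cap !used (T.Text a o l : ts)
      | used' <= cap = A.copyI l marr used a o *> go marr cap used' ts
      | otherwise = do
          -- At least double, and always make room for the current chunk.
          let cap' = max used' (2 * cap)
          marr' <- A.resizeM marr cap'
          A.copyI l marr' used a o
          go marr' cap' used' ts
      where
        used' = used + l
```

Compared with the two-pass version, this trades an exact allocation for occasional reallocation and copying of the data written so far.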
I guess your suggestion is a more conservative option. PRs welcome.
I can make a PR 👍
On a related note, what do you think of relaxing the type to Foldable f => f Text -> Text, similar to Data.List.concat? This can be useful if someone has a Vector Text to concat, for instance. They would not need to materialize a [Text].
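To make the suggestion concrete, here is a rough sketch of how the two-pass approach from the top of the thread could be written against Foldable instead of lists. The name concatFoldable is hypothetical, and the code relies on the same internal modules (Data.Text.Internal, Data.Text.Array) and the same A.copyI argument order as the snippet above, so treat it as an illustration under those assumptions rather than as the library's API.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Foldable (foldl', foldlM)
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.Text as T
import qualified Data.Text.Internal as T
import qualified Data.Text.Array as A

-- Hypothetical Foldable-generalized concat: one fold to compute the
-- total length, a second fold to copy each chunk into place.
concatFoldable :: Foldable f => f T.Text -> T.Text
concatFoldable ts
  | len == 0  = T.empty
  | otherwise = T.Text arr 0 len
  where
    -- Same overflow check as in the list version.
    flen acc (T.Text _ _ l)
      | acc' > 0 || l == 0 = acc'
      | otherwise = error "concatFoldable: size overflow"
      where
        acc' = acc + l
    len = foldl' flen 0 ts
    arr = A.run (do
      marr <- A.new len
      -- The accumulator is the write offset into the destination.
      let step !i (T.Text a o l) = A.copyI l marr i a o *> pure (i + l)
      _ <- foldlM step 0 ts
      pure marr)

-- Usage with a container other than a list.
example :: T.Text
example = concatFoldable (T.pack "ab" :| [T.pack "cd"])
```

The same function would accept a Vector Text or a Seq Text, since both have Foldable instances.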
Relaxing the type would be a breaking change (some programs could stop typechecking because of the extra polymorphism), and I think there is no appetite for breaking changes so soon after text-2.1.
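One way such breakage can arise, as far as I can tell, is with OverloadedLists: a list literal is only resolved to [Text] because the argument type of concat says so. The sketch below uses a stand-in concatF (a hypothetical name, defined here via toList purely for illustration) to show the difference. The commented-out definition is expected to be rejected with an ambiguous type variable error, since nothing determines the container type any more.

```haskell
{-# LANGUAGE OverloadedLists #-}
import Data.Foldable (toList)
import qualified Data.Text as T

-- Stand-in for a Foldable-generalized concat (hypothetical).
concatF :: Foldable f => f T.Text -> T.Text
concatF = T.concat . toList

-- Fine today: the monomorphic argument type fixes the literal to [Text].
ok :: T.Text
ok = T.concat [T.pack "a", T.pack "b"]

-- Expected to fail to typecheck: the container type f is ambiguous.
-- broken :: T.Text
-- broken = concatF [T.pack "a", T.pack "b"]

-- Callers would need an annotation to compile again.
fixed :: T.Text
fixed = concatF ([T.pack "a", T.pack "b"] :: [T.Text])
```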