haskell / text

Haskell library for space- and time-efficient operations over Unicode text.

Home Page:http://hackage.haskell.org/package/text

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

concat can be faster

meooow25 opened this issue · comments

I recently wanted a reverseConcat, which led me to the current implementation of concat and the realization that it is not very efficient:

text/src/Data/Text.hs

Lines 1033 to 1046 in e84c7a3

-- | /O(n)/ Concatenate a list of 'Text's.
concat :: [Text] -> Text
concat ts = case ts' of
[] -> empty
[t] -> t
_ -> Text (A.run go) 0 len
where
ts' = L.filter (not . null) ts
len = sumP "concat" $ L.map lengthWord8 ts'
go :: ST s (A.MArray s)
go = do
arr <- A.new len
let step i (Text a o l) = A.copyI l arr i a o >> return (i + l)
foldM step 0 ts' >> return arr

  • Allocates a new list of Texts (lazily, but it still has a cost) because of the filter
  • Allocates a list of boxed int lengths to sum them, since sumP does not fuse

Here is an alternate straightforward implementation:

concat :: [T.Text] -> T.Text
concat ts0 = case ts0 of
  [] -> T.empty
  [t] -> t
  _ | len == 0 -> T.empty
    | otherwise -> T.Text arr 0 len
  where
    flen acc (T.Text _ _ l)
      | acc' > 0 || l == 0 = acc'
      | otherwise = concatOverflowError
      where
        acc' = acc + l
    len = foldl' flen 0 ts0
    arr = A.run $ do
      marr <- A.new len
      let loop !i [] = pure marr
          loop i (T.Text a o l : ts) = A.copyI l marr i a o *> loop (i+l) ts
      loop 0 ts0

This not exactly the same, since the current implementation performs a case match on the list after filtering out null Texts. But I doubt such a preemptive step helps.

Benchmarks on GHC 9.6.3 with -O, concating a list of all Chars:

    concat:     OK
      54.8 ms ± 3.1 ms, 174 MB allocated,  52 MB copied, 325 MB peak memory
    alt concat: OK
      13.1 ms ± 901 μs, 4.2 MB allocated, 773 B  copied, 325 MB peak memory
    ==:         OK
Benchmark file
{-# LANGUAGE BangPatterns #-}
import Prelude hiding (concat)
import Data.Foldable (foldl')
import Test.Tasty.Bench
import Test.Tasty.HUnit

import qualified Data.Text as T
import qualified Data.Text.Internal as T
import qualified Data.Text.Array as A

main :: IO ()
main = defaultMain
  [ env (pure xs_) $ \xs -> bgroup ""
    [ bench "concat" $ whnf T.concat xs
    , bench "alt concat" $ whnf concat xs
    , testCase "==" $ concat xs @?= T.concat xs
    ]
  ]
  where
    xs_ = map T.singleton [minBound .. maxBound]

concat :: [T.Text] -> T.Text
concat ts0 = case ts0 of
  [] -> T.empty
  [t] -> t
  _ | len == 0 -> T.empty
    | otherwise -> T.Text arr 0 len
  where
    flen acc (T.Text _ _ l)
      | acc' > 0 || l == 0 = acc'
      | otherwise = concatOverflowError
      where
        acc' = acc + l
    len = foldl' flen 0 ts0
    arr = A.run $ do
      marr <- A.new len
      let loop !i [] = pure marr
          loop i (T.Text a o l : ts) = A.copyI l marr i a o *> loop (i+l) ts
      loop 0 ts0

concatOverflowError :: a
concatOverflowError = error "Data.Text.concat: size overflow"

Yeah, makes sense.

I actually think that functions like this should not force the input list before starting to consume it. A usual pattern with doubling a mutable buffer on overflows would be better.

That is also an option, but it has the disadvantage of overallocating.
Whichever way it is done, it would be nice to get an improved concat.

I guess your suggestion is a more conservative option. PRs welcome.

I can make a PR 👍
Or a related note, what do you think of relaxing the type to Foldable f => f Text -> Text, similar to Data.List.concat? This can be useful if someone has Vector Text to concat, for instance. They will not need to materialize a [Text].

Relaxing the type would be a breaking change (because some programs can stop typechecking due to excessive polymorphism), and I think there is no appetite for breaking changes that soon after text-2.1.