hspec / silently

Prevent or capture output to stdout or other handles in Haskell

Use a pipe instead of a temp file

jfischoff opened this issue · comments

I don't see why this library writes to a temp file as opposed to using a pipe, with a call like this one: https://hackage.haskell.org/package/process-1.6.3.0/docs/System-Process.html#v:createPipe

It seems like a pipe is more robust and less likely to fail.

Hey, thanks for the feedback.

I'm not the original author of this library and I'm not actively working on it.

using a pipe

@jfischoff if you want to give this a shot then I'm more than happy to accept a patch. My only requirement would be that it works both on *nix and Windows, and that there are tests that demonstrate this. We may want to set up AppVeyor for that.

wrote a similar one that uses a pipe

Nice! If you are interested in collaborating on this, I'm happy to add anybody as a co-maintainer who has made at least one quality contribution.

Keep in mind the OS pipe buffers.

If a pipe is full, writing to it will block. That doesn't happen for temporary files. If your output is larger than the pipe buffer, and you don't consume from the pipe, then your program will get stuck.

If a pipe is full, writing to it will block

You need to read from it on another thread.
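To make the "read from it on another thread" idea concrete, here is a minimal sketch, not code from silently. The name captureViaPipe is hypothetical, and the sketch assumes POSIX-like behaviour, where the reader sees end-of-file once every copy of the pipe's write end has been closed. It has not been checked on Windows.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Exception (bracket, evaluate)
import GHC.IO.Handle (hDuplicate, hDuplicateTo)
import System.IO
import System.Process (createPipe)

-- | Hypothetical helper (not part of silently): capture stdout through a pipe
-- that is drained by a second thread, so a full pipe buffer cannot stall the
-- action. The captured output is buffered entirely in memory.
captureViaPipe :: IO a -> IO (String, a)
captureViaPipe action = do
  (readEnd, writeEnd) <- createPipe
  captured <- newEmptyMVar
  -- Reader thread: read strictly until end-of-file, then hand over the text.
  _ <- forkIO $ do
    s <- hGetContents readEnd
    _ <- evaluate (length s)
    putMVar captured s
  a <- bracket (redirect writeEnd) restore (const action)
  out <- takeMVar captured
  return (out, a)
  where
    redirect writeEnd = do
      hFlush stdout
      old <- hDuplicate stdout
      hDuplicateTo writeEnd stdout
      hClose writeEnd -- stdout now holds the only copy of the write end
      return old
    restore old = do
      hFlush stdout
      hDuplicateTo old stdout -- drops the last write end, so the reader sees EOF
      hClose old
```

For example, captureViaPipe (putStr "hi" >> return 1) should yield ("hi", 1). Because the reader thread consumes the pipe while the action is still running, the output size is limited by available memory rather than by the pipe buffer.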

@jfischoff Right, but then put it where?

You have to put it either onto disk or into memory.

If the goal is disk, then doing it through a pipe read by a thread writing it to disk doesn't seem better than writing to a temporary file directly.

If the goal is memory, createPipe seems unnecessary, you could directly read from the process's standard handles.

From the issue description it isn't quite clear to me what the issue or goal is:

It seems like a pipe is more robust and less likely to fail.

Robust against what, the disk being full?

You have to put it either onto disk or into memory.

The goal is to avoid writing to disk and then reading from disk and just buffer in memory.

If the goal is memory, createPipe seems unnecessary, you could directly read from the process's standard handles.

You will have to elaborate on what you mean here.

Robust against what, the disk being full?

Yes. Also it just seems unnecessarily complex to write to the disk just to read from disk.

Also it just seems unnecessarily complex to write to the disk just to read from disk.

I imagine people would do that so that they can handle more output than fits into RAM (or so as not to occupy that RAM). But I see that silently does indeed write the output to the file and then read all of it back into memory, for example here:

str <- hGetContents tmpHandle
str `deepseq` return (str,a)

So you're right, that can be achieved more easily without a roundtrip through files.

I think going via the disk would be beneficial if silently actually offered functions via which the captured output could be read incrementally/streamingly; then it would really save some RAM. And due to the OS buffer cache, if one happens to have enough free RAM to fit it, this wouldn't be much slower than just doing it in memory as all the contents will still be in memory.
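As a rough illustration of what such an incremental interface might look like, here is a sketch. The function captureToFileWith is hypothetical and is not offered by silently. It redirects stdout to a temporary file in the same way the library does, but then hands the consumer a handle positioned at the start of the file instead of reading everything into a String.

```haskell
import Control.Exception (bracket, finally)
import GHC.IO.Handle (hDuplicate, hDuplicateTo)
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- | Hypothetical API (not part of silently): run an action with stdout
-- redirected to a temp file, then let a consumer read the captured output
-- from a handle, as incrementally as it likes.
captureToFileWith :: IO a -> (Handle -> a -> IO b) -> IO b
captureToFileWith action consume = do
  tmpDir <- getTemporaryDirectory
  bracket (openTempFile tmpDir "capture.out") cleanup $ \(_, tmp) -> do
    a <- bracket (redirect tmp) restore (const action)
    hSeek tmp AbsoluteSeek 0 -- rewind so the consumer starts at the beginning
    consume tmp a
  where
    -- The handle is closed and the file removed once the consumer returns,
    -- so the consumer must not return unevaluated lazy reads.
    cleanup (path, tmp) = hClose tmp `finally` removeFile path
    redirect tmp = do
      hFlush stdout
      old <- hDuplicate stdout
      hDuplicateTo tmp stdout
      return old
    restore old = do
      hFlush stdout
      hDuplicateTo old stdout
      hClose old
```

A consumer that only needs the first line could be written as captureToFileWith (mapM_ print [1, 2, 3]) (\h _ -> hGetLine h), which reads one line and leaves the rest of the file untouched.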

You will have to elaborate on what you mean here.

I somehow assumed you were talking about using silently to capture e.g. the stdout of some process. But I realise that wasn't the case, so what you said about createPipe makes total sense.

I think going via the disk would be beneficial if silently actually offered functions via which the captured output could be read incrementally/streamingly

Agree

For me it makes sense, because I'd like to capture stdout in unit tests. There, "not fitting into RAM" isn't a concern, because the amount of test data is small.

Here is the function (inspired by the library):

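import Control.Exception (bracket)
import GHC.IO.Handle (hDuplicate, hDuplicateTo)
import System.IO (Handle, hClose, hFlush, hGetChar, hReady, stdout)
import System.Process (createPipe)

-- Run an action with stdout redirected into a pipe and return what it wrote.
-- Nothing drains the pipe while the action runs, so output larger than the
-- OS pipe buffer will block the action; this is only suitable for small output.
myCapture :: IO a -> IO (String, a)
myCapture action =
  bracket redirect restore runActionAndCapture
  where
    -- Point stdout at the pipe's write end, keeping a copy of the original.
    redirect = do
      (pipeReadEnd, pipeWriteEnd) <- createPipe
      old <- hDuplicate stdout
      hDuplicateTo pipeWriteEnd stdout
      return (pipeReadEnd, pipeWriteEnd, old)

    runActionAndCapture (pipeReadEnd, _, _) = do
      a <- action
      hFlush stdout
      c <- readAvailable pipeReadEnd
      return (c, a)

    -- Put the original stdout back and release all pipe handles.
    restore (pipeReadEnd, pipeWriteEnd, old) = do
      hDuplicateTo old stdout
      hClose old
      hClose pipeWriteEnd
      hClose pipeReadEnd

-- Read whatever is currently buffered in the handle without blocking.
readAvailable :: Handle -> IO String
readAvailable h = do
  isReady <- hReady h
  if isReady
    then do
      c <- hGetChar h
      rest <- readAvailable h
      return $ c : rest
    else return []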

I haven't tested it under Windows, though.

I think it makes sense to have both a file-based interface and at least one pipe-based one. One pipe-based solution could use a second thread to pull output from the pipe, making it available on request. A second one could use the pipe blocking mechanism, producing a stream of output chunks followed by a result.
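The second idea, output chunks followed by a result, could be sketched roughly as follows. The name streamStdout and the callback-based shape are assumptions made for illustration, not an existing silently API, and the sketch assumes POSIX-like pipe semantics. Each chunk is passed to a callback on a reader thread; if the callback is slow, the pipe fills up and the action blocks until the callback catches up, which is the blocking mechanism acting as backpressure.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Exception (bracket, finally)
import Control.Monad (unless)
import qualified Data.ByteString as B
import GHC.IO.Handle (hDuplicate, hDuplicateTo)
import System.IO
import System.Process (createPipe)

-- | Hypothetical helper (not part of silently): deliver stdout to a callback
-- chunk by chunk while the action runs, then return the action's result.
streamStdout :: (B.ByteString -> IO ()) -> IO a -> IO a
streamStdout onChunk action = do
  (readEnd, writeEnd) <- createPipe
  finished <- newEmptyMVar
  _ <- forkIO (pump readEnd `finally` (hClose readEnd >> putMVar finished ()))
  a <- bracket (redirect writeEnd) restore (const action)
  takeMVar finished -- wait until the last chunk has been delivered
  return a
  where
    -- Read up to 4096 bytes at a time; an empty chunk means end-of-file.
    pump h = do
      chunk <- B.hGetSome h 4096
      unless (B.null chunk) (onChunk chunk >> pump h)
    redirect writeEnd = do
      hFlush stdout
      old <- hDuplicate stdout
      hDuplicateTo writeEnd stdout
      hClose writeEnd -- stdout now holds the only copy of the write end
      return old
    restore old = do
      hFlush stdout
      hDuplicateTo old stdout -- drops the last write end, so the pump sees EOF
      hClose old
```

Only one chunk needs to be held in memory at a time, so this variant would suit output that is too large to buffer as a whole. The chunks are raw bytes; decoding them to text is left to the callback.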