ringw / BufferedStreams.jl

Fast composable IO streams


BufferedStreams provides buffering for IO operations. It can wrap any IO type, automatically making incremental reading and writing faster.

BufferedInputStream

BufferedInputStream(open(filename)) # wrap an IOStream
BufferedInputStream(rand(UInt8, 100)) # wrap a byte array

BufferedInputStream wraps a source. A source can be any IO object, but more specifically it can be any type T that implements a function

readbytes!(source::T, buffer::Vector{UInt8}, from::Int, to::Int)

This function should write new data to buffer starting at position from and not exceeding position to and return the number of bytes written.
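As an illustration of that contract, here is a minimal sketch in plain Julia that does not load the package. The `ChunkedSource` type, the `drain` helper, and the `SourceSketch` module are hypothetical names invented for this example. The local `readbytes!` only mirrors the signature above; with the package loaded, one would presumably define the method on the package's own function instead.

```julia
module SourceSketch

# Hypothetical source: hands out the bytes of `data` in pieces of at most
# `chunk` bytes per call, mimicking a slow or packetized input.
mutable struct ChunkedSource
    data::Vector{UInt8}
    pos::Int      # index of the next unread byte in `data`
    chunk::Int    # maximum number of bytes handed out per call
end

ChunkedSource(data::Vector{UInt8}, chunk::Int) = ChunkedSource(data, 1, chunk)

# Same shape as the contract described above: write into buffer[from:to]
# and return the number of bytes written (0 once the source is exhausted).
function readbytes!(source::ChunkedSource, buffer::Vector{UInt8}, from::Int, to::Int)
    remaining = length(source.data) - source.pos + 1
    n = min(to - from + 1, source.chunk, remaining)
    n <= 0 && return 0
    copyto!(buffer, from, source.data, source.pos, n)
    source.pos += n
    return n
end

# Drain a source through a small fixed-size buffer, refilling repeatedly
# the way a buffered reader would.
function drain(source, bufsize::Int)
    buffer = Vector{UInt8}(undef, bufsize)
    out = UInt8[]
    while true
        n = readbytes!(source, buffer, 1, bufsize)
        n == 0 && break
        append!(out, view(buffer, 1:n))
    end
    return out
end

end # module

source = SourceSketch.ChunkedSource(Vector{UInt8}("hello world"), 3)
drained = SourceSketch.drain(source, 4)
```

The function may return fewer bytes than the `from:to` range allows, so a caller has to use the returned count rather than assume the range was filled.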

BufferedInputStream is itself an IO type and implements the source interface, so you can use it like any other IO type.

Anchors

Input streams also have some tricks to make parsing applications easier. When parsing data incrementally, one must take care that partial matches are preserved across buffer refills. One easy way to do this is to copy the partial match to a temporary buffer, but this unnecessary copying can slow things down.

Input streams instead support the notion of "anchoring", which instructs the stream to save the current position in the buffer. If the buffer gets refilled, then any data in the buffer including or following that position gets shifted over to make room. When the match is finished, one can then call takeanchored! to return an array of the bytes from the anchored position to the current position, or upanchor! to return the index of the anchored position in the buffer.

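# print all number literals from a stream
stream = BufferedInputStream(source)
while !eof(stream)
    b = peek(stream)
    # peek returns a UInt8, so compare against byte values rather than Chars
    if UInt8('0') <= b <= UInt8('9')
        if !isanchored(stream)
            anchor!(stream)
        end
    elseif isanchored(stream)
        println(String(takeanchored!(stream)))
    end

    read(stream, UInt8)
end
# flush a number that runs up to the end of the stream
if isanchored(stream)
    println(String(takeanchored!(stream)))
end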
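
To make the shifting behaviour of anchors concrete, the following is a toy model written in plain Julia. It is not the package's implementation: the `ToyBuffer` type, its field names, and the `shift_for_refill!` and `anchored_bytes` helpers are all assumptions made for illustration. It only shows the idea that data from the anchor onward is moved to the front of the buffer before a refill, and that the anchor and read position move with it.

```julia
# Toy model of an anchored read buffer. Names are illustrative and are
# not taken from the package source.
mutable struct ToyBuffer
    buffer::Vector{UInt8}
    position::Int    # index of the next byte to read
    available::Int   # index of the last valid byte
    anchor::Int      # anchored index, or 0 when no anchor is set
end

# Make room for a refill by sliding the bytes that must survive to the front.
# With an anchor set, everything from the anchor onward is kept; otherwise
# only the unread bytes are kept. Returns the number of slots freed.
function shift_for_refill!(b::ToyBuffer)
    keepfrom = b.anchor > 0 ? b.anchor : b.position
    shift = keepfrom - 1
    shift == 0 && return 0
    nkeep = b.available - keepfrom + 1
    nkeep > 0 && copyto!(b.buffer, 1, b.buffer, keepfrom, nkeep)
    b.position -= shift
    b.available -= shift
    b.anchor > 0 && (b.anchor -= shift)
    return shift
end

# Bytes from the anchor up to (not including) the current position.
anchored_bytes(b::ToyBuffer) = b.buffer[b.anchor:b.position - 1]

# Buffer holds "abc123xy", anchored at '1', about to read 'x'.
b = ToyBuffer(Vector{UInt8}("abc123xy"), 7, 8, 4)
freed = shift_for_refill!(b)
println(freed, " ", String(anchored_bytes(b)))   # prints "3 123"
```

The partial match "123" stays in the buffer after the shift, so no temporary copy is needed while the match is still in progress.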

BufferedOutputStream

stream = BufferedOutputStream(open(filename, "w")) # wrap an IOStream

BufferedOutputStream is the converse of BufferedInputStream, wrapping a sink type. It also works on any writable IO type, as well as the more specific sink interface:

writebytes(sink::T, buffer::Vector{UInt8}, n::Int, eof::Bool)

This function should consume the first n bytes of buffer. The eof argument is used to indicate that there will be no more input to consume. It should return the number of bytes written, which must be n or 0. A return value of 0 indicates data was processed but should not be evicted from the buffer.
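
A minimal sketch of a sink that follows this contract is shown below, in plain Julia without loading the package. The `CollectingSink` type is hypothetical, and the local `writebytes` only mirrors the signature above; with the package loaded, the method would presumably be attached to the package's own function.

```julia
# Hypothetical sink that collects everything it consumes into `received`.
mutable struct CollectingSink
    received::Vector{UInt8}
    finished::Bool   # set once the writer signals end of input
end

CollectingSink() = CollectingSink(UInt8[], false)

# Consume the first n bytes of `buffer` and report how many were taken.
# This sink always accepts everything, so it returns n.
function writebytes(sink::CollectingSink, buffer::Vector{UInt8}, n::Int, eof::Bool)
    append!(sink.received, view(buffer, 1:n))
    eof && (sink.finished = true)
    return n
end

sink = CollectingSink()
buffer = Vector{UInt8}("hello, sink!")
nwritten = writebytes(sink, buffer, 5, false)   # only the first 5 bytes are consumed
nfinal = writebytes(sink, buffer, 0, true)      # signal that no more input follows
```

A sink that transforms or batches its input could instead return 0 to keep the data in the buffer, as the paragraph above describes.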

BufferedOutputStream as an alternative to IOBuffer

BufferedOutputStream can be used as a simpler and often faster alternative to IOBuffer for incrementally building strings.

out = BufferedOutputStream()
print(out, "Hello")
print(out, " World")
str = String(take!(out))
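
For comparison, the same string-building pattern with Base's IOBuffer looks like this in current Julia. This sketch uses only Base and makes no claim about relative speed.

```julia
# Build a string incrementally with Base.IOBuffer.
io = IOBuffer()
print(io, "Hello")
print(io, " World")
str_iobuffer = String(take!(io))   # take! empties the buffer and returns its bytes
```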
