flightaware / speedtables

Speed tables is a high-performance memory-resident database. The speed table compiler reads a table definition and generates a set of C access routines to create, manipulate, and search tables containing millions of rows. It is currently oriented towards Tcl.

Home Page: https://flightaware.github.io/speedtables/

Allow shared-memory tables to dynamically grow in size

bovine opened this issue

Currently, attempting to insert too much data into a shared-memory table just causes an unrecoverable abort(). For example:

#!/usr/bin/env tclsh

package require speedtable

# Compile and load a speedtable extension with four varstring fields.
CExtension bullseye 1.0 {
    CTable Pasture {
        varstring alpha
        varstring beta
        varstring delta
        varstring gamma
    }
}

package require Bullseye

# Attach as master to a 4096-byte shared-memory backing file.
Pasture create mypasture master name "moo" file "mypasture.dat" size 4096

# Storing 1000 rows quickly exhausts the tiny 4 KB segment.
for {set i 0} {$i < 1000} {incr i} {
    mypasture store [list alpha alfa beta bravo delta delta gamma golf]
}

$ ./shmemtest.tcl

Out of shared memory for "mypasture.dat".
Abort (core dumped)

I started out with the idea that you could add additional shared memory segments to a shared memory pool, and at one point there were very preliminary hooks for such a thing. But I found that juggling shared memory segments dynamically across programs that didn't have a common ancestor (i.e., via forking) was tricky even with one segment, so I didn't pursue it.

Using additional shared memory segments indeed just seems like unnecessary complexity.

Is there any reason why the shared memory backing file couldn't just be enlarged by appending blank bytes to it and then re-mapping the larger file into memory? (The file may be remapped to a different base address, so pointers within the block might need to be recalculated relative to its start.)

On Linux, the mremap() call could be used to map the larger file; on FreeBSD, however, you need to call munmap() followed by mmap().
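
A minimal sketch of that grow-and-remap step, assuming the segment was mapped MAP_SHARED from descriptor fd at base (the function name grow_mapping and the abbreviated error handling are illustrative, not speedtables code):

/* Grow the backing file, then extend the mapping.  Returns the
 * (possibly moved) new base address, or MAP_FAILED on error. */
#define _GNU_SOURCE           /* for mremap() on Linux */
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

void *grow_mapping(int fd, void *base, size_t old_size, size_t new_size)
{
    /* Append zero bytes so the file covers the new size. */
    if (ftruncate(fd, (off_t)new_size) < 0)
        return MAP_FAILED;
#ifdef __linux__
    /* mremap() extends the mapping, moving it if it can't grow in place. */
    return mremap(base, old_size, new_size, MREMAP_MAYMOVE);
#else
    /* FreeBSD and others: drop the old mapping and map the larger file.
     * The kernel may or may not hand back the same base address. */
    munmap(base, old_size);
    return mmap(base, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
#endif
}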

Each time the readers check the cycle number, they would also look at a size indicator to see whether the segment had grown. If so, each reader would mremap() or munmap()+mmap() the file and begin using the new base address of the shared memory.
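
A sketch of that reader-side check, using a hypothetical header with a cycle counter and a mapped_size field at the start of the segment (speedtables' actual header layout differs):

#include <stddef.h>
#include <sys/mman.h>

struct shm_header {
    volatile unsigned long cycle;        /* existing write-cycle counter */
    volatile size_t        mapped_size;  /* writer bumps this after growing */
};

/* Called wherever the reader already polls the cycle number; returns the
 * (possibly new) base address. */
void *reader_check_size(int fd, void *base, size_t *my_size)
{
    struct shm_header *hdr = base;
    size_t current = hdr->mapped_size;

    if (current > *my_size) {
        /* The writer grew the segment: remap and adopt the new base. */
        munmap(base, *my_size);
        base = mmap(NULL, current, PROT_READ, MAP_SHARED, fd, 0);
        *my_size = current;
    }
    return base;
}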

There's absolutely no support for remapping to a different address, and implementing it would be a huge effort. The shared memory code doesn't use relative pointers, because the objects inside the segment are speedtables rows containing absolute pointers to strings; those would break if the segment moved anyway.
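
To illustrate the problem (this struct is illustrative, not the generated row layout): each row holds absolute char * pointers into the same segment, so relocating the segment would invalidate every pointer in every row.

/* Illustrative row layout: absolute pointers into the same segment. */
struct pasture_row {
    char *alpha;   /* points at a varstring stored elsewhere in the segment */
    char *beta;
    char *delta;
    char *gamma;
};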

As I noted to Karl: "I think I decided that it would be better to simply map the largest shared memory segment you would ever need at the start, and let the operating system page it in as needed, since untouched pages are backed by the mmapped file rather than swap and they don't become resident until they are actually touched."
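
A sketch of that up-front approach: create a sparse backing file at the largest size you'd ever need and map it once; untouched pages cost no RAM and fault in from the file on first access (the 1 GB ceiling here is an arbitrary example):

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_SEGMENT_SIZE ((size_t)1 << 30)  /* 1 GB; pick your own ceiling */

void *map_max(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return MAP_FAILED;
    /* Sparse file: no disk blocks or resident pages until first touch. */
    if (ftruncate(fd, (off_t)MAX_SEGMENT_SIZE) < 0) {
        close(fd);
        return MAP_FAILED;
    }
    return mmap(NULL, MAX_SEGMENT_SIZE, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
}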

Even without relative pointers, there is the possibility that munmap()+mmap() on the enlarged file will return the same base address, in which case the existing pointers remain accessible and valid. Is there any other reason why growing the memory would be difficult?
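
A sketch of that "same address or fail" remap (note the window between munmap() and mmap() in which something else in the process could claim the address range):

#include <stddef.h>
#include <sys/mman.h>

void *remap_in_place(int fd, void *base, size_t old_size, size_t new_size)
{
    munmap(base, old_size);
    /* Pass the old base as a hint and check whether the kernel granted it. */
    void *p = mmap(base, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p != base) {
        /* Kernel placed the mapping elsewhere: every absolute pointer in
         * the segment is now stale, so undo and report failure. */
        if (p != MAP_FAILED)
            munmap(p, new_size);
        return MAP_FAILED;
    }
    return p;
}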

Assuming that I can grow the segment in place, it should be possible. I told Karl I'd have a look at this over the weekend; is that OK?

Closing as won't fix, since this is not trivial to do.