Suboptimal performance of method WritableTrackingBuffer.makeRoomFor

Question

Suboptimal performance of method WritableTrackingBuffer.makeRoomFor

pavel-zeman opened this issue a year ago · comments

Software versions

Tedious: 16.1.0

Problem description
Source code of method WritableTrackingBuffer.makeRoomFor is as follows:

  makeRoomFor(requiredLength: number) {
    if (this.buffer.length - this.position < requiredLength) {
      if (this.doubleSizeGrowth) {
        let size = Math.max(128, this.buffer.length * 2);
        while (size < requiredLength) {
          size *= 2;
        }
        this.newBuffer(size);
      } else {
        this.newBuffer(requiredLength);
      }
    }
  }

The problem is the last line, i.e. this.newBuffer(requiredLength);. If you allocate a new buffer of requiredLength, the buffer will be immediately full and a new buffer will have to be allocated next time you write something to it. As a result, each following writeXXX method will result in a new buffer being allocated.

I propose to respect the initialSize here, i.e. change the line to this.newBuffer(Math.max(this.initialSize, requiredLength));.

Michael Sun · Answer 1 · Fri Jun 09 2023 04:09:11 GMT+0800 (China Standard Time)

Hi @pavel-zeman , I just did a bit research on this part and think the current logic make sense. I may not be 100% correct. For you proposal:
initialSize is quale to this.buffer.length after construction since that this.buffer = Buffer.alloc(this.initialSize, 0);
assume the position is 0 ,
if initialSize (this.buffer.length) is larger then requiredLength, then the logic will not able to reach the line this.newBuffer(Math.max(this.initialSize, requiredLength));
if initialSize (this.buffer.length)is smaller then requiredLength, then the line this.newBuffer(Math.max(this.initialSize, requiredLength)); will be equivalent with just this.newBuffer(requiredLength)

The case that you change will be effective is when initialSize - position (this.buffer.length - this.position ) less then requiredLength. For example, if initialSize = 10, position = 5, and requiredLength = 5 in this case with you change, the logic will create a newbuffer with size 10 instead of 5. However, for this case, buffer with size 5 is enough.

I understand you concern, but if we keep creating a larger size buffer, it may cause unnecessary memory consumption all the time.

pavel-zeman · Answer 2 · Sat Jun 10 2023 23:15:40 GMT+0800 (China Standard Time)

Hi @MichaelSun90,
the key part here is that once the buffer is full, each following writeXXX method invocation needs to allocate a new buffer.

Let's take your example with initialSize = 10 and try to simulate a few makeRoomFor invocations from writeXXX method. In the following table, each line represents a single invocation and the values show current state at the start of makeRoomFor method.

buffer.length	position	requiredLength	comment
10	0	5
10	5	5
10	10	1	New buffer allocated
1	1	1	New buffer allocated
1	1	1	New buffer allocated
1	1	1	New buffer allocated
1	1	1	New buffer allocated
1	1	1	New buffer allocated

You can see, that a new buffer is allocated 6 times.

With my proposal, the same situation looks as follows:

buffer.length	position	requiredLength	comment
10	0	5
10	5	5
10	10	1	New buffer allocated
10	1	1
10	2	1
10	3	1
10	4	1
10	5	1

A new buffer is allocated just once.

To observe the real performance impact, we can use the following simple test:

const buffer = new WritableTrackingBuffer(10);
console.time("buffer");
for(let i = 0; i < 10000; i++) {
    buffer.writeInt8(1);
}
console.timeEnd("buffer");

With the existing makeRoomFor method implementation, this test takes about 40ms on my machine. If I apply my proposal, the same test takes only 7ms. If I increase the initialSize to 100, the difference is 40ms vs. 1ms.

When run using Node.js with --trace-gc option, we can also observe GC activity. Existing implementation triggers GC 7 times, while my implementation triggers GC just once. From this perspective, my implementation does not allocate more memory than existing implementation. In fact, it actually saves quite a lot of memory allocations.

You may argue, that my test is just theoretical and such situation does not happen in tedious. And I'm not sure here, because I'm not familiar with all the tedious internals. But before generators were introduced in version 8.1.0, I observed that my implementation reduced time to generate specific request to SQL server from about 300ms to about 5ms.

Arthur Schreiber · Answer 3 · Thu Jun 29 2023 03:49:13 GMT+0800 (China Standard Time)

@pavel-zeman Thanks for looking into this! You're right, the WritableTrackingBuffer is not very smart about this - if you pass in an initial size the implicit assumption throughout the codebase is that you know what you're doing and you know the exact size of the data you're going to write into it.

If the final length is not know, the correct way to handle this is to set doubleSizeGrowth to true when creating the WritableTrackingBuffer.

@MichaelSun90, @mShan0 and I looked through all the uses of WritableTrackingBuffer, and it looks like we specify either the final size, a size that's most likely going to be bigger than the final size of the buffer, or we set doubleSizeGrowth to true. We probably should change all the places that don't set the exact size to either use the exact size, or set doubleSizeGrowth to true as well. But all of those places are in code that's probably not called very often, so the performance impact might not actually be visible.