Generate data with zero allocs
matheusd opened this issue · comments
How can I generate/encode data with zero heap allocations?
Example use case: I want to generate data to write into a file, so I need to allocate a single structure, message and arena. I can pre-allocate the arena buffer. But all my attempts at writing a loop fail to make it zero alloc (either the arena grows or message escapes to heap).
Here's the set of test benchmarks: https://github.com/matheusd/capnptest01/blob/master/bench_test.go
And here are my results:
BenchmarkSetText01-7 6966074 174.9 ns/op 99 B/op 0 allocs/op
BenchmarkSetText02-7 1000000 1100 ns/op 352 B/op 5 allocs/op
BenchmarkSetText03-7 2315973 517.9 ns/op 72 B/op 2 allocs/op
BenchmarkSetText04-7 2661026 429.9 ns/op 260 B/op 0 allocs/op
While benchmarks 01 and 04 are shown with 0 allocs, that's actually just the amortized growth of the arena which ends up becoming large in those tests.
If you call Release()
on your message, the underlying buffers are returned to a sync.Pool
. This is the recommended way to avoid allocations. I am not certain that there is a way to guarantee zero-allocation (un)marshalling, though the Arena
API is designed to support it in principle.
AFAICT, my BenchmarkTest03()
was supposed to do that (through the use of msg.Reset(arena)
) and indeed, the arena itself does not grow, but some objects (looks to be the *Segment
created inside Message.setSegment()
) still escape to the heap.
func BenchmarkSetText03(b *testing.B) {
var msg capnp.Message
arena := capnp.SingleSegment(nil)
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
seg, err := msg.Reset(arena)
if err != nil {
b.Fatal(err)
}
tx, err := NewTransaction(seg)
if err != nil {
b.Fatal(err)
}
err = tx.SetDescription("my own descr")
if err != nil {
b.Fatal(err)
}
}
// b.Log(arena.String())
}
I notice that API wise, msg.Reset()
has an issue, where I can't be completely in control of the arena. msg.Reset()
forcibly calls arena.Release
which puts the arena back to the sync pool, even if I know that I'll reuse the arena without necessitating going through the arena sync pool.
This is easily solved by instantiating my own arena type that NOPs on Release().
Ok, so the first escape to heap is caused because the second time Message.Segment()
is executed, m.segs != nil
but len(m.segs) == 0
, meaning the first segment (which is statically allocated) has been removed from the segs map, causing a new Segment
to be initialized. The following diff fixes this issue:
diff --git a/message.go b/message.go
index 5ae7f3e..4bcceaa 100644
--- a/message.go
+++ b/message.go
@@ -100,6 +100,9 @@ func (m *Message) Release() {
func (m *Message) Reset(arena Arena) (first *Segment, err error) {
m.capTable.Reset()
for k := range m.segs {
+ if k == 0 && m.segs[k] == &m.firstSeg {
+ continue
+ }
delete(m.segs, k)
}
@@ -113,6 +116,7 @@ func (m *Message) Reset(arena Arena) (first *Segment, err error) {
DepthLimit: m.DepthLimit,
capTable: m.capTable,
segs: m.segs,
+ firstSeg: Segment{msg: m},
}
if arena != nil {
I noticed that Message.allocSegment
requires m.segs
to be non-nil, so it seems to me that Reset
should just directly create m.segs
if it is nil and populate it with the first seg, removing the need to consider the two cases throughout the code and make it easier to reason about. What do you think?
If the previous diff is reasonable, I can send it as a proper PR.
Ok, to fix the second escape to heap I had to create a new Arena that does not use the bufferpool to manage its memory: https://github.com/matheusd/capnptest01/blob/fd2e71a57e5c1ccf8a42f28804ba8f85129002d9/arena.go#L35
This makes it possible to encode messages without any heap allocations (other than the initial buffer if correctly sized).
Is ManualSegmentArena
is something useful to be contributed to go-capnp?
@matheusd Both of your suggestions seem sensible to me, and a PR is most welcome. Please be aware that I am currently traveling, so I will be slower to respond/review than usual.
Looking forward to the PR :)
(No rush to respond, I'm aware you're travelling)
I've sent the first PR to address this (#555).
I previously wrote:
I noticed that Message.allocSegment requires m.segs to be non-nil, so it seems to me that Reset should just directly create m.segs if it is nil and populate it with the first seg, removing the need to consider the two cases throughout the code and make it easier to reason about
But the more I think about it, looking at a cpu profile with the heap allocs fixed, the more I think that segs
should really be a function of (and therefore moved to) the Area implementations.
In particular, this would allow implementing an arena that forgoes the Message.mu
protection, getting a performance boost at the expense of the caller having to ensure no concurrent access to the same message.
For example, in the following profile of my benchmark, Reset()
is over 50% of the time while it's basically useless (because the size of the arena is accounted for, only a single segment will ever be used and access is all done in a single goroutine).
Rewriting to use an Arena that does not use the map and mutex would make Reset() essentially free.
Hey @matheusd Thanks for investigating this. Your proposal sounds reasonable. I don't think this will break any existing behavior ... wanna give it a shot?
Yeah, I've started work on this. I only work on this as time permits, so no hard deadline on when I'll submit, but I think I've got a rough high level design going already.
Thank you for the support!
Absolutely no rush! I'm very, very glad you're able to contribute some cycles 😃
Please let me know if I can be of help!