capnproto / go-capnp

Cap'n Proto library and code generator for Go

Home Page:https://capnproto.org

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Generate data with zero allocs

matheusd opened this issue · comments

How can I generate/encode data with zero heap allocations?

Example use case: I want to generate data to write into a file, so I need to allocate a single structure, message and arena. I can pre-allocate the arena buffer. But all my attempts at writing a loop fail to make it zero alloc (either the arena grows or message escapes to heap).

Here's the set of test benchmarks: https://github.com/matheusd/capnptest01/blob/master/bench_test.go

And here are my results:

BenchmarkSetText01-7     6966074               174.9 ns/op            99 B/op          0 allocs/op
BenchmarkSetText02-7     1000000              1100 ns/op             352 B/op          5 allocs/op
BenchmarkSetText03-7     2315973               517.9 ns/op            72 B/op          2 allocs/op
BenchmarkSetText04-7     2661026               429.9 ns/op           260 B/op          0 allocs/op

While benchmarks 01 and 04 are shown with 0 allocs, that's actually just the amortized growth of the arena which ends up becoming large in those tests.

If you call Release() on your message, the underlying buffers are returned to a sync.Pool. This is the recommended way to avoid allocations. I am not certain that there is a way to guarantee zero-allocation (un)marshalling, though the Arena API is designed to support it in principle.

AFAICT, my BenchmarkTest03() was supposed to do that (through the use of msg.Reset(arena)) and indeed, the arena itself does not grow, but some objects (looks to be the *Segment created inside Message.setSegment() ) still escape to the heap.

func BenchmarkSetText03(b *testing.B) {
	var msg capnp.Message
	arena := capnp.SingleSegment(nil)

	b.ReportAllocs()
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		seg, err := msg.Reset(arena)
		if err != nil {
			b.Fatal(err)
		}

		tx, err := NewTransaction(seg)
		if err != nil {
			b.Fatal(err)
		}

		err = tx.SetDescription("my own descr")
		if err != nil {
			b.Fatal(err)
		}
	}

	// b.Log(arena.String())
}

profile001

I notice that API wise, msg.Reset() has an issue, where I can't be completely in control of the arena. msg.Reset() forcibly calls arena.Release which puts the arena back to the sync pool, even if I know that I'll reuse the arena without necessitating going through the arena sync pool.

This is easily solved by instantiating my own arena type that NOPs on Release().

Ok, so the first escape to heap is caused because the second time Message.Segment() is executed, m.segs != nil but len(m.segs) == 0, meaning the first segment (which is statically allocated) has been removed from the segs map, causing a new Segment to be initialized. The following diff fixes this issue:

diff --git a/message.go b/message.go
index 5ae7f3e..4bcceaa 100644
--- a/message.go
+++ b/message.go
@@ -100,6 +100,9 @@ func (m *Message) Release() {
 func (m *Message) Reset(arena Arena) (first *Segment, err error) {
 	m.capTable.Reset()
 	for k := range m.segs {
+		if k == 0 && m.segs[k] == &m.firstSeg {
+			continue
+		}
 		delete(m.segs, k)
 	}
 
@@ -113,6 +116,7 @@ func (m *Message) Reset(arena Arena) (first *Segment, err error) {
 		DepthLimit:    m.DepthLimit,
 		capTable:      m.capTable,
 		segs:          m.segs,
+		firstSeg:      Segment{msg: m},
 	}
 
 	if arena != nil {

I noticed that Message.allocSegment requires m.segs to be non-nil, so it seems to me that Reset should just directly create m.segs if it is nil and populate it with the first seg, removing the need to consider the two cases throughout the code and make it easier to reason about. What do you think?

If the previous diff is reasonable, I can send it as a proper PR.

Ok, to fix the second escape to heap I had to create a new Arena that does not use the bufferpool to manage its memory: https://github.com/matheusd/capnptest01/blob/fd2e71a57e5c1ccf8a42f28804ba8f85129002d9/arena.go#L35

This makes it possible to encode messages without any heap allocations (other than the initial buffer if correctly sized).

Is ManualSegmentArena is something useful to be contributed to go-capnp?

@matheusd Both of your suggestions seem sensible to me, and a PR is most welcome. Please be aware that I am currently traveling, so I will be slower to respond/review than usual.

Looking forward to the PR :)

(No rush to respond, I'm aware you're travelling)

I've sent the first PR to address this (#555).

I previously wrote:

I noticed that Message.allocSegment requires m.segs to be non-nil, so it seems to me that Reset should just directly create m.segs if it is nil and populate it with the first seg, removing the need to consider the two cases throughout the code and make it easier to reason about

But the more I think about it, looking at a cpu profile with the heap allocs fixed, the more I think that segs should really be a function of (and therefore moved to) the Area implementations.

In particular, this would allow implementing an arena that forgoes the Message.mu protection, getting a performance boost at the expense of the caller having to ensure no concurrent access to the same message.

For example, in the following profile of my benchmark, Reset() is over 50% of the time while it's basically useless (because the size of the arena is accounted for, only a single segment will ever be used and access is all done in a single goroutine).

gocapnp02

Rewriting to use an Arena that does not use the map and mutex would make Reset() essentially free.

Hey @matheusd Thanks for investigating this. Your proposal sounds reasonable. I don't think this will break any existing behavior ... wanna give it a shot?

Yeah, I've started work on this. I only work on this as time permits, so no hard deadline on when I'll submit, but I think I've got a rough high level design going already.

Thank you for the support!

Absolutely no rush! I'm very, very glad you're able to contribute some cycles 😃
Please let me know if I can be of help!