tidwall / sjson

Set JSON values very quickly in Go

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

possible memory leak using sjson.Set

saartamir opened this issue · comments

Hi @tidwall
I'm using this library to enrich event without unmarshalling into Go struct, and there's seems to be a memory problem in sjson.Set().
This is how I use it:

func EnrichEvent(event string, ctx *fasthttp.RequestCtx) string {
	startTime := time.Now().UnixNano() / 1000000

	log := zap.L()

	ua := string(ctx.UserAgent())
	ip := ctx.RemoteIP().String()
	entryTs := strconv.Itoa(int(time.Now().UnixNano() / int64(time.Millisecond)))

	var err error

	// add new fields to events
	event, err = sjson.Set(event, "ua", ua)
	if err != nil {
		log.Error("could not add ua to event", zap.String("event", event), zap.Error(err))
	}

	event, err = sjson.Set(event, "client_ip", ip)
	if err != nil {
		log.Error("could not add client_ip to event", zap.String("event", event), zap.Error(err))
	}

	event, err = sjson.Set(event, "entry_ts", entryTs)
	if err != nil {
		log.Error("could not add entry_ts to event", zap.String("event", event), zap.Error(err))
	}

	endTime := time.Now().UnixNano() / 1000000
	promy.RecordMetric("EnrichTimeHistogram", 0, float64(endTime-startTime), nil)

	return event
}

and when I'm trying to profile the allocs, it looks like sjson keeps growing and growing:
allocs

am I doing anything wrong?

I wrote a program that tried to reproduce the issue, but I wasn't able to see the memory issues that you were seeing.

I took your example function and hardcoded the values that came from the fasthttp and `zap' libraries.

Here's the complete program:

package main

import (
	"fmt"
	"log"
	"runtime"
	"strconv"
	"time"

	"github.com/tidwall/sjson"
)

func main() {
	evs := make(chan string, 100)
	go func() {
		var totalBytes int64
		var totalEvents int64
		ticker := time.NewTicker(time.Second)
		for {
			select {
			case ev := <-evs:
				totalEvents++
				totalBytes += int64(len(ev))
			case <-ticker.C:
				var stats runtime.MemStats
				runtime.ReadMemStats(&stats)
				fmt.Printf(
					"Processes %d Messages (%0.2f GB), Heap Size: %0.2f MB\n",
					totalEvents,
					float64(totalBytes)/1024/1024/1024,
					float64(stats.HeapAlloc)/1024/1024,
				)
			}
		}
	}()

	for i := 0; i < runtime.NumCPU(); i++ {
		go func() {
			for {
				evs <- EnrichEvent("{}")
			}
		}()
	}
	select {}
}

// func EnrichEvent(event string, ctx *fasthttp.RequestCtx) string {
func EnrichEvent(event string) string {
	// startTime := time.Now().UnixNano() / 1000000

	// log := zap.L()

	// ua := string(ctx.UserAgent())
	ua := "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36"
	// ip := ctx.RemoteIP().String()
	ip := "127.10.105.81:98172"
	entryTs := strconv.Itoa(int(time.Now().UnixNano() / int64(time.Millisecond)))

	var err error

	// add new fields to events
	event, err = sjson.Set(event, "ua", ua)
	if err != nil {
		// log.Error("could not add ua to event", zap.String("event", event), zap.Error(err))
		log.Fatal("could not add ua to event", event, err)
	}

	event, err = sjson.Set(event, "client_ip", ip)
	if err != nil {
		// log.Error("could not add client_ip to event", zap.String("event", event), zap.Error(err))
		log.Fatal("could not add client_ip to event", event, err)
	}

	event, err = sjson.Set(event, "entry_ts", entryTs)
	if err != nil {
		// log.Error("could not add entry_ts to event", zap.String("event", event), zap.Error(err))
		log.Fatal("could not add entry_ts to event", event, err)
	}

	// endTime := time.Now().UnixNano() / 1000000
	// promy.RecordMetric("EnrichTimeHistogram", 0, float64(endTime-startTime), nil)

	return event
}

The program processes as many events as possible over a number of goroutines and prints out stats every second.

Here's what I'm seeing:

Processes 1139184 Messages (0.20 GB), Heap Size: 3.57 MB
Processes 2255683 Messages (0.39 GB), Heap Size: 2.86 MB
Processes 3358867 Messages (0.58 GB), Heap Size: 3.63 MB
Processes 4471472 Messages (0.77 GB), Heap Size: 3.48 MB
Processes 5598419 Messages (0.96 GB), Heap Size: 3.21 MB
Processes 6704932 Messages (1.16 GB), Heap Size: 3.65 MB
Processes 7796037 Messages (1.34 GB), Heap Size: 3.30 MB
Processes 8903962 Messages (1.53 GB), Heap Size: 3.61 MB
Processes 9993757 Messages (1.72 GB), Heap Size: 2.71 MB
Processes 11091102 Messages (1.91 GB), Heap Size: 2.02 MB
Processes 12201903 Messages (2.10 GB), Heap Size: 2.97 MB
Processes 13317121 Messages (2.29 GB), Heap Size: 2.09 MB
Processes 14428639 Messages (2.49 GB), Heap Size: 3.58 MB
Processes 15551060 Messages (2.68 GB), Heap Size: 3.53 MB
Processes 16684651 Messages (2.87 GB), Heap Size: 2.57 MB
Processes 17801735 Messages (3.07 GB), Heap Size: 3.52 MB
Processes 18914031 Messages (3.26 GB), Heap Size: 2.24 MB
Processes 20032989 Messages (3.45 GB), Heap Size: 2.07 MB
Processes 21146109 Messages (3.64 GB), Heap Size: 3.67 MB
Processes 22257386 Messages (3.83 GB), Heap Size: 3.54 MB
Processes 23346580 Messages (4.02 GB), Heap Size: 3.36 MB
Processes 24456117 Messages (4.21 GB), Heap Size: 3.04 MB
Processes 25562760 Messages (4.40 GB), Heap Size: 1.73 MB
Processes 26671997 Messages (4.60 GB), Heap Size: 3.61 MB
Processes 27789998 Messages (4.79 GB), Heap Size: 2.33 MB
Processes 28945007 Messages (4.99 GB), Heap Size: 3.31 MB
Processes 30069362 Messages (5.18 GB), Heap Size: 3.49 MB

After about 30 seconds, 30 million sjson.Set calls were processed with amounted to 5 GB of data. Yet the Go heap never goes above 4 MB.