Improve performance of CollectRows / RowToStructBy...
zolstein opened this issue
The CollectRows and AppendRows functions, especially combined with the RowToStructByPos/Name functions, are incredibly convenient. However, while benchmarking I noticed they do a lot of redundant work and make unnecessary allocations, and I think they can be improved. I've come up with alternate implementations that seem to perform noticeably better, using the following optimizations:
- Serialize directly into the slice, rather than returning values and copying them in.
- Reuse memory to process multiple rows, rather than allocating for each row.
- Precompute the mapping from columns to struct fields and reuse it across rows, rather than re-computing the mapping for each row. (This is especially impactful for RowToStructByName, which requires complex string comparisons to compute the mapping.)
Using a relatively simple benchmark, I see a 3-4x end-to-end performance improvement on a query returning 1000 rows. (AppendRowsUsing and RowInto... are my equivalents of AppendRows/RowTo...):
package main

import (
	"context"
	"database/sql"
	"fmt"
	"os"
	"time"

	"github.com/jackc/pgx/v5"
)

type Person struct {
	Name string
	Age  sql.NullInt64
}

type Record struct {
	ID int64
	Person
	CreatedAt time.Time
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "...")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Unable to connect to database: %v\n", err)
		os.Exit(1)
	}
	defer conn.Close(ctx)

	type r = Record
	records := make([]r, 0, 1000)
	for i := 0; i < 1000; i++ {
		records = records[:0]
		rows, err := conn.Query(ctx, "select id, name, age, created_at from people")
		_ = rows // used by whichever option below is uncommented
		// Select one of:
		// records, err = pgx.AppendRows(records, rows, pgx.RowToStructByPos[r])
		// records, err = pgx.AppendRowsUsing(records, rows, pgx.RowIntoStructByPos[r])
		// records, err = pgx.AppendRows(records, rows, pgx.RowToStructByName[r])
		// records, err = pgx.AppendRowsUsing(records, rows, pgx.RowIntoStructByName[r])
		_ = records
		if err != nil {
			fmt.Fprintf(os.Stderr, "Query failed: %v\n", err)
			os.Exit(1)
		}
	}
}
# AppendRows - RowToStructByPos
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
Time (mean ± σ): 3.945 s ± 0.376 s [User: 2.762 s, System: 0.357 s]
Range (min … max): 3.077 s … 4.372 s 10 runs
# AppendRowsUsing - RowIntoStructByPos
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
Time (mean ± σ): 1.114 s ± 0.250 s [User: 0.671 s, System: 0.236 s]
Range (min … max): 0.818 s … 1.508 s 10 runs
# AppendRows - RowToStructByName
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
Time (mean ± σ): 3.662 s ± 0.379 s [User: 2.803 s, System: 0.285 s]
Range (min … max): 3.100 s … 4.327 s 10 runs
# AppendRowsUsing - RowIntoStructByName
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
Time (mean ± σ): 1.082 s ± 0.114 s [User: 0.779 s, System: 0.114 s]
Range (min … max): 0.948 s … 1.290 s 10 runs
Some of these changes could be retrofitted into the existing CollectRows, but (as far as I can tell) some are impossible with the existing RowToFunc signature and would require a breaking change to the API to support. (Most code using the RowToFunc implementations supplied by the library would keep working, but custom RowToFunc implementations would break, as might code that explicitly uses the underlying function type.)
Would you be interested in seeing a full PR and potentially integrating any of this?
I don't think that a breaking change would be okay. But I'd be interested in seeing any optimizations that can be done within the existing framework.
And I would be interested in seeing the rest, even if it can't be done without a breaking change, just to help me understand more.
Draft PR here: #1949
PR that takes in some of the optimizations without affecting the public API: #1959
Merged #1959.