jackc / pgx

PostgreSQL driver and toolkit for Go

Improve performance of CollectRows / RowToStructBy...

zolstein opened this issue · comments

The CollectRows and AppendRows functions, especially with the RowToStructByPos/Name functions, are incredibly convenient. However, in benchmarking I noticed they do a lot of redundant work and unnecessary allocations, and I think they can be improved. I've come up with alternate implementations that seem to perform noticeably better, by using the following optimizations:

  • Serialize directly into the slice, rather than returning values and copying them in.
  • Reuse memory to process multiple rows, rather than allocating for each row.
  • Precompute mapping from columns to struct fields and reuse them across rows, rather than re-computing the mapping for each row. (This is especially impactful for RowToStructByName, which requires complex string comparisons to compute the mapping.)

Using a relatively simple benchmark, I see roughly a 3-4x end-to-end performance improvement on a query returning 1000 rows. (AppendRowsUsing and RowInto... are my equivalents of AppendRows/RowTo...):

package main

import (
	"context"
	"database/sql"
	"fmt"
	"os"
	"time"

	"github.com/jackc/pgx/v5"
)

type Person struct {
	Name string
	Age  sql.NullInt64
}

type Record struct {
	ID int64
	Person
	CreatedAt time.Time
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "...")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Unable to connect to database: %v\n", err)
		os.Exit(1)
	}
	defer conn.Close(ctx)

	type r = Record
	records := make([]r, 0, 1000)
	for i := 0; i < 1000; i++ {
		records = records[:0]
		rows, err := conn.Query(ctx, "select id, name, age, created_at from people")
		// Uncomment one of:
		// records, err = pgx.AppendRows(records, rows, pgx.RowToStructByPos[r])
		// records, err = pgx.AppendRowsUsing(records, rows, pgx.RowIntoStructByPos[r])
		// records, err = pgx.AppendRows(records, rows, pgx.RowToStructByName[r])
		// records, err = pgx.AppendRowsUsing(records, rows, pgx.RowIntoStructByName[r])
		_ = rows // avoids "declared and not used" while all options are commented out
		if err != nil {
			fmt.Fprintf(os.Stderr, "Query failed: %v\n", err)
			os.Exit(1)
		}
	}
}
# AppendRows - RowToStructByPos
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
  Time (mean ± σ):      3.945 s ±  0.376 s    [User: 2.762 s, System: 0.357 s]
  Range (min … max):    3.077 s …  4.372 s    10 runs

# AppendRowsUsing - RowIntoStructByPos
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
  Time (mean ± σ):      1.114 s ±  0.250 s    [User: 0.671 s, System: 0.236 s]
  Range (min … max):    0.818 s …  1.508 s    10 runs
 
# AppendRows - RowToStructByName
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
  Time (mean ± σ):      3.662 s ±  0.379 s    [User: 2.803 s, System: 0.285 s]
  Range (min … max):    3.100 s …  4.327 s    10 runs
 
# AppendRowsUsing - RowIntoStructByName
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
  Time (mean ± σ):      1.082 s ±  0.114 s    [User: 0.779 s, System: 0.114 s]
  Range (min … max):    0.948 s …  1.290 s    10 runs

Some of these changes could be retrofitted into the existing CollectRows, but (as far as I can tell) some are impossible with the existing RowToFunc signature, and would require a breaking change to the API to support. (Most code using the RowToFunc implementations supplied by the library would keep working, but custom RowToFunc implementations would break, as might code that explicitly uses the underlying function type.)

Would you be interested in seeing a full PR and potentially integrating any of this?

I don't think that a breaking change would be okay. But I'd be interested in seeing any optimizations that can be done within the existing framework.

And I would be interested in seeing the rest, even if it can't be done without a breaking change, just so I can understand it better.

Draft PR here: #1949

PR that takes in some of the optimizations without affecting the public API: #1959