Improve performance of CollectRows / RowToStructBy...
zolstein opened this issue
The CollectRows and AppendRows functions, especially combined with the RowToStructByPos/Name functions, are incredibly convenient. However, while benchmarking I noticed they do a lot of redundant work and make unnecessary allocations, and I think they can be improved. I've come up with alternate implementations that seem to perform noticeably better, using the following optimizations:
- Serialize directly into the slice, rather than returning values and copying them in.
- Reuse memory to process multiple rows, rather than allocating for each row.
- Precompute the mapping from columns to struct fields and reuse it across rows, rather than re-computing the mapping for each row. (This is especially impactful for RowToStructByName, which requires complex string comparisons to compute the mapping.)
Using a relatively simple benchmark, I see a 3-4x end-to-end performance improvement on a query returning 1000 rows. (AppendRowsUsing and RowInto... are my equivalents of AppendRows/RowTo...):
package main

import (
	"context"
	"database/sql"
	"fmt"
	"os"
	"time"

	"github.com/jackc/pgx/v5"
)

type Person struct {
	Name string
	Age  sql.NullInt64
}

type Record struct {
	ID int64
	Person
	CreatedAt time.Time
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "...")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Unable to connect to database: %v\n", err)
		os.Exit(1)
	}
	defer conn.Close(ctx)

	type r = Record
	records := make([]r, 0, 1000)
	for i := 0; i < 1000; i++ {
		records = records[:0]
		rows, err := conn.Query(ctx, "select id, name, age, created_at from people")
		_ = rows // used by whichever option below is uncommented
		// Select one of:
		// records, err = pgx.AppendRows(records, rows, pgx.RowToStructByPos[r])
		// records, err = pgx.AppendRowsUsing(records, rows, pgx.RowIntoStructByPos[r])
		// records, err = pgx.AppendRows(records, rows, pgx.RowToStructByName[r])
		// records, err = pgx.AppendRowsUsing(records, rows, pgx.RowIntoStructByName[r])
		_ = records
		if err != nil {
			fmt.Fprintf(os.Stderr, "Query failed: %v\n", err)
			os.Exit(1)
		}
	}
}
# AppendRows - RowToStructByPos
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
Time (mean ± σ): 3.945 s ± 0.376 s [User: 2.762 s, System: 0.357 s]
Range (min … max): 3.077 s … 4.372 s 10 runs
# AppendRowsUsing - RowIntoStructByPos
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
Time (mean ± σ): 1.114 s ± 0.250 s [User: 0.671 s, System: 0.236 s]
Range (min … max): 0.818 s … 1.508 s 10 runs
# AppendRows - RowToStructByName
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
Time (mean ± σ): 3.662 s ± 0.379 s [User: 2.803 s, System: 0.285 s]
Range (min … max): 3.100 s … 4.327 s 10 runs
# AppendRowsUsing - RowIntoStructByName
$ go build main.go && hyperfine --warmup=1 "./main -bench"
Benchmark 1: ./main -bench
Time (mean ± σ): 1.082 s ± 0.114 s [User: 0.779 s, System: 0.114 s]
Range (min … max): 0.948 s … 1.290 s 10 runs
Some of these changes could be retrofitted into the existing CollectRows, but (as far as I can tell) some are impossible with the existing RowToFunc signature and would require a breaking change to the API to support. (Most code using the RowToFunc implementations supplied by the library would keep working, but custom RowToFunc implementations would break, as might code that explicitly uses the underlying function type.)
Would you be interested in seeing a full PR and potentially integrating any of this?
I don't think that a breaking change would be okay. But I'd be interested in seeing any optimizations that can be done within the existing framework.
And I would be interested in seeing the rest, even if it can't be done without a breaking change, just to help me understand more.
Draft PR here: #1949
PR that takes in some of the optimizations without affecting the public API: #1959
Merged #1959.