rs / xid

xid is a globally unique id generator thought for the web

Use raw []byte value in sql interfaces for smaller indices

smyrman opened this issue

As mentioned in #14, an XID may be stored as a bytea in Postgres, so that it takes up 16 bytes rather than 24.

While read/write performance is impacted slightly, as far as I understand, the benefit of BYTEA over TEXT is its smaller size, which in turn means smaller (and thus, in theory, faster) indices, for large tables in particular. In addition, there are fewer "special rules" (e.g. Unicode / locale encoding rules) for comparison, which again, in theory, should make it faster to query as well.
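To make the size argument concrete, here is a minimal stdlib-only sketch. It assumes the usual xid layout of 12 raw bytes and a 20-character lowercase base32hex string form. The `textEncoding` value is rebuilt locally as an assumption so that no third-party import is needed, and `lessRaw` is a hypothetical helper showing a plain byte-wise comparison. The on-disk figures quoted above presumably also include Postgres per-value overhead, which this sketch does not model.

```go
package main

import (
	"bytes"
	"encoding/base32"
	"fmt"
)

// rawLen is the size of an xid in its binary form (12 bytes).
const rawLen = 12

// textEncoding mirrors the lowercase base32hex alphabet, without padding,
// assumed here for the string form. It is rebuilt locally so the sketch
// does not depend on the xid package itself.
var textEncoding = base32.NewEncoding("0123456789abcdefghijklmnopqrstuv").WithPadding(base32.NoPadding)

// textLen reports how many characters the string form of a raw ID needs.
func textLen() int {
	return textEncoding.EncodedLen(rawLen)
}

// lessRaw compares two raw IDs byte by byte, with no collation rules involved.
func lessRaw(a, b []byte) bool {
	return bytes.Compare(a, b) < 0
}

func main() {
	a := make([]byte, rawLen)
	b := make([]byte, rawLen)
	b[rawLen-1] = 1 // hypothetical IDs that differ only in the last byte

	fmt.Println("raw bytes:", rawLen)
	fmt.Println("text chars:", textLen())
	fmt.Println("encoded a:", textEncoding.EncodeToString(a))
	fmt.Println("a < b:", lessRaw(a, b))
}
```

The point is only the ratio between the two representations; how much of it survives in an actual index is what a benchmark would have to show.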

https://www.db-fiddle.com/f/jgYzsKTFGu3NU9ZjDjfRUw/0

The link above shows a simple table with an ID stored as either bytea or a string. Given 50,000 entries, the index size is reduced from 2496 kB to 2048 kB by using bytea.

I don't know at which table sizes this becomes significant, or whether it really matters. A proper benchmark with a few million rows and a few queries is probably wise before making any changes.

How to define bytea IDs

Assuming bytea is used for the ID column in the schema, test code to encode/decode XIDs from binary is provided here:

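import (
	"database/sql/driver"
	"fmt"

	"github.com/rs/xid"
)

// XID wraps xid.ID so that it is written to and read from the database
// as raw bytes (bytea) instead of text.
type XID struct {
	xid.ID
}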

// NewXID generates a new XID instance.
func NewXID() XID {
	return XID{ID: xid.New()}
}

// Value implements the driver.Valuer interface.
func (id XID) Value() (driver.Value, error) {
	if id.IsNil() {
		return nil, nil
	}
	return id.Bytes(), nil
}

// Scan implements the sql.Scanner interface.
func (id *XID) Scan(value interface{}) error {
	switch b := value.(type) {
	case []byte:
		_id, err := xid.FromBytes(b)
		if err != nil {
			return err
		}
		id.ID = _id
		return nil
	case nil:
		id.ID = xid.ID{}
		return nil
	default:
		return fmt.Errorf("xid: scanning unsupported type: %T", value)
	}
}
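The Value/Scan pair above can be exercised without a database by feeding the result of `Value` straight back into `Scan`. The following is a minimal sketch under stated assumptions: `rawID` is a hypothetical 12-byte stand-in for the `XID` wrapper (so the example needs only the standard library), and `roundTrip` is an illustrative helper, not part of xid.

```go
package main

import (
	"database/sql/driver"
	"errors"
	"fmt"
)

// rawID is a hypothetical stand-in for the XID wrapper: 12 raw bytes.
type rawID [12]byte

// Value implements driver.Valuer; the zero ID is written as SQL NULL.
func (id rawID) Value() (driver.Value, error) {
	if id == (rawID{}) {
		return nil, nil
	}
	b := make([]byte, len(id))
	copy(b, id[:])
	return b, nil
}

// Scan implements sql.Scanner for []byte and nil inputs.
func (id *rawID) Scan(value interface{}) error {
	switch b := value.(type) {
	case []byte:
		if len(b) != len(id) {
			return errors.New("rawID: invalid length")
		}
		copy(id[:], b)
		return nil
	case nil:
		*id = rawID{}
		return nil
	default:
		return fmt.Errorf("rawID: scanning unsupported type: %T", value)
	}
}

// roundTrip passes the output of Value into Scan, standing in for a write
// to and a read from a bytea column.
func roundTrip(in rawID) (rawID, error) {
	v, err := in.Value()
	if err != nil {
		return rawID{}, err
	}
	var out rawID
	if err := out.Scan(v); err != nil {
		return rawID{}, err
	}
	return out, nil
}

func main() {
	in := rawID{1, 2, 3} // hypothetical ID value
	out, err := roundTrip(in)
	fmt.Println(out == in, err)
}
```

With a real driver, database/sql would call `Value` when the ID is passed as a query argument and `Scan` when a row is read, so the same two methods cover both directions.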

Didn't notice #39 before reporting this. 🤦

Not to mention, didn't notice https://github.com/rs/xid/blob/master/b/id.go

Closing this as solved then.