rs / xid

xid is a globally unique id generator thought for the web

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

[bug] xid.FromString return string is difference.

syssam opened this issue · comments

commented

xid.FromString("c6e52g2mrqcjl44hf179")
expect to "c6e52g2mrqcjl44hf179", but currently is "c6e52g2mrqcjl44hf170"

How did you obtain this xid? It is not a possible value as both values are decoded as the same in base32.

commented

v is print c6e52g2mrqcjl44hf170

package main

import (
	"fmt"

	"github.com/rs/xid"
)

func main() {
	v, e := xid.FromString("c6e52g2mrqcjl44hf179")
	if e != nil {
		fmt.Println(e)
		return
	}
	fmt.Println(v)
}

This xid can not exist (the one ending with 69). How was it generated?

commented

"c6e52g2mrqcjl44hf179" is not generated by xid, correct value is "c6e52g2mrqcjl44hf170"
XID is used on graphql id, like wrapper https://github.com/ent/contrib/blob/master/entgql/internal/todouuid/ent/schema/uuidgql/uuidgql.go

func MarshalID(id xid.ID) graphql.Marshaler {
	return graphql.WriterFunc(func(w io.Writer) {
		_, _ = io.WriteString(w, strconv.Quote(id.String()))
	})
}

func UnmarshalID(v interface{}) (id xid.ID, err error) {
	s, ok := v.(string)
	if !ok {
		return id, fmt.Errorf("invalid type %T, expect string", v)
	}
	return xid.FromString(s)
}

if I have an API get user information who id is "c6e52g2mrqcjl44hf170", but someone is manually chang to "c6e52g2mrqcjl44hf179", he also will get "c6e52g2mrqcjl44hf170" user, not throw user is not exist

commented

@rs any updated?

This is a base32 issue. It will require to detect invalid base32 and output an error on parsing.

For reference, here is a another reproduction of the issue.

https://go.dev/play/p/ncN-mCIhpzr

import (
	"fmt"

	xid "github.com/rs/xid"
)

func main() {
	id1, _ := xid.FromString("9bsv0s6krhp002t5fla0")
	id2, _ := xid.FromString("9bsv0s6krhp002t5fla9")
	fmt.Println("Hello", id1)
	fmt.Println("Hello", id2)
}
Hello 9bsv0s6krhp002t5fla0
Hello 9bsv0s6krhp002t5fla0

Program exited.

The change is in the padding part and has no effect on the base32 encoded value.

Yes, I see that base32.NewEncoding("0123456789abcdefghijklmnopqrstuv").WithPadding(-1) from the standard lib is behaving in exactly the same way; DecodeFromString against both values will give the same binary result and no error.

As mentioned by @syssam, this means that ID comparisons in an ID appear to give false positives. This can seam scarier than it is. Perhaps particularly so if the ID is used as a username in a username/password context 🙈

Is there any efficient way in which we can detect the padding changes and return an error? Would this be the correct thing to do for an ID?

I don’t know any.

Would something like this work?

I just tested one example... I don't know if this check is valid for all possible XIDs, but I suppose one could write a "quickest" to check that.

https://go.dev/play/p/Gr0ikoyXPzo

package main

import (
	"encoding/base32"
	"fmt"
)

func main() {
	const s1 = "9bsv0s6krhp002t5fla0"
	const s2 = "9bsv0s6krhp002t5fla9" // invalid

	enc := base32.NewEncoding("0123456789abcdefghijklmnopqrstuv").WithPadding(-1)

	id1, err1 := enc.DecodeString(s1)
	id2, err2 := enc.DecodeString(s2)
	fmt.Println("id1:", id1, "err1:", err1)
	fmt.Println("id2:", id2, "err2:", err2)

	ok1 := enc.EncodeToString(id1[10:]) == s1[16:]
	ok2 := enc.EncodeToString(id2[10:]) == s2[16:]
	fmt.Println("id1 OK:", ok1)
	fmt.Println("id2 OK:", ok2)

}
id1: [74 249 240 112 212 220 114 0 11 165 125 84] err1: <nil>
id2: [74 249 240 112 212 220 114 0 11 165 125 84] err2: <nil>
id1 OK: true
id2 OK: false

Program exited.

Can you bench the perf difference? Avoiding dec/enc would be better if possible.

Benchmark in #75.

Appears to increase parsing time from ~14ns/op to ~20ns/op on my machine.

Also added a "Quick" test that fails for the current implementation just now (and passes the new implementation).

This is a test using randomized input and is meant to check if the fix is sufficient.

I think I will also extend the test a bit to check what happens if using characters not allowed by the base32 alphabet.

I think I will also extend the test a bit to check what happens if using characters not allowed by the base32 alphabet.

Done.

Actually ~20ns/op is not completely realistic it seams. If I insert the code directly in the decode function, then the operation is faster; presumably due to in-lining or other relevant optimization.

BenchmarkFromString-4           70718811                16.58 ns/op

v.s. when the check code is commented out:

BenchmarkFromString-4           83358038                14.58 ns/op

Want me to proceed with this?

Yes go ahead