[bug] xid.FromString returns a different string than the input.

syssam opened this issue 3 years ago · comments

s.y.s commented 3 years ago

xid.FromString("c6e52g2mrqcjl44hf179")
Expected "c6e52g2mrqcjl44hf179", but the result is "c6e52g2mrqcjl44hf170".

Olivier Poitrey · Answer 1 · Wed Nov 24 2021 00:18:13 GMT+0800 (China Standard Time)

How did you obtain this xid? It is not a possible value as both values are decoded as the same in base32.

s.y.s · Answer 2 · Wed Nov 24 2021 09:25:29 GMT+0800 (China Standard Time)

Printing v gives c6e52g2mrqcjl44hf170:

package main

import (
	"fmt"

	"github.com/rs/xid"
)

func main() {
	v, e := xid.FromString("c6e52g2mrqcjl44hf179")
	if e != nil {
		fmt.Println(e)
		return
	}
	fmt.Println(v)
}

Olivier Poitrey · Answer 3 · Wed Nov 24 2021 10:07:53 GMT+0800 (China Standard Time)

This xid cannot exist (the one ending with 79). How was it generated?

s.y.s · Answer 4 · Thu Nov 25 2021 11:00:05 GMT+0800 (China Standard Time)

"c6e52g2mrqcjl44hf179" was not generated by xid; the correct value is "c6e52g2mrqcjl44hf170".
The xid is used as a GraphQL ID, with a wrapper like https://github.com/ent/contrib/blob/master/entgql/internal/todouuid/ent/schema/uuidgql/uuidgql.go

func MarshalID(id xid.ID) graphql.Marshaler {
	return graphql.WriterFunc(func(w io.Writer) {
		_, _ = io.WriteString(w, strconv.Quote(id.String()))
	})
}

func UnmarshalID(v interface{}) (id xid.ID, err error) {
	s, ok := v.(string)
	if !ok {
		return id, fmt.Errorf("invalid type %T, expect string", v)
	}
	return xid.FromString(s)
}

Suppose I have an API that returns the user whose ID is "c6e52g2mrqcjl44hf170". If someone manually changes the ID to "c6e52g2mrqcjl44hf179", they still get the "c6e52g2mrqcjl44hf170" user instead of a "user does not exist" error.

s.y.s · Answer 5 · Tue Nov 30 2021 09:58:14 GMT+0800 (China Standard Time)

@rs any update?

Olivier Poitrey · Answer 6 · Tue Nov 30 2021 10:00:27 GMT+0800 (China Standard Time)

This is a base32 issue. Fixing it will require detecting invalid base32 and returning an error on parsing.

Sindre Røkenes Myren · Answer 7 · Thu Mar 03 2022 20:09:34 GMT+0800 (China Standard Time)

For reference, here is another reproduction of the issue.

https://go.dev/play/p/ncN-mCIhpzr

package main

import (
	"fmt"

	xid "github.com/rs/xid"
)

func main() {
	// The two inputs differ only in their last character.
	id1, _ := xid.FromString("9bsv0s6krhp002t5fla0")
	id2, _ := xid.FromString("9bsv0s6krhp002t5fla9")
	fmt.Println("Hello", id1)
	fmt.Println("Hello", id2)
}

Hello 9bsv0s6krhp002t5fla0
Hello 9bsv0s6krhp002t5fla0

Program exited.

Olivier Poitrey · Answer 8 · Thu Mar 03 2022 20:22:46 GMT+0800 (China Standard Time)

The change is in the padding part and has no effect on the base32 encoded value.
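To make the point about unused bits concrete, here is a minimal sketch using only the standard library and the alphabet quoted in this thread. A 20-character string carries 100 bits, but an ID is 12 bytes (96 bits), so the trailing bits of the last character are dropped on decode. The helper names `ignoredBits` and `sameDecoding` are made up for illustration and are not part of xid.

```go
package main

import (
	"encoding/base32"
	"fmt"
)

// alphabet is the lowercase base32 alphabet quoted in this thread.
const alphabet = "0123456789abcdefghijklmnopqrstuv"

// ignoredBits reports how many trailing bits of an n-character base32
// string do not fit into whole bytes and are dropped when decoding.
// (Hypothetical helper for this sketch.)
func ignoredBits(n int) int {
	return (n * 5) % 8
}

// sameDecoding reports whether two unpadded base32 strings decode to
// identical bytes with the standard library decoder.
// (Hypothetical helper for this sketch.)
func sameDecoding(a, b string) (bool, error) {
	enc := base32.NewEncoding(alphabet).WithPadding(base32.NoPadding)
	x, err := enc.DecodeString(a)
	if err != nil {
		return false, err
	}
	y, err := enc.DecodeString(b)
	if err != nil {
		return false, err
	}
	return string(x) == string(y), nil
}

func main() {
	fmt.Println("ignored trailing bits:", ignoredBits(20))
	same, err := sameDecoding("c6e52g2mrqcjl44hf170", "c6e52g2mrqcjl44hf179")
	fmt.Println("same bytes:", same, "err:", err)
}
```

In the alphabet above, '0' has the value 0 (00000) and '9' has the value 9 (01001). Both share the same top bit, which is the only bit of the last character that ends up in the decoded bytes.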

Sindre Røkenes Myren · Answer 9 · Thu Mar 03 2022 21:01:18 GMT+0800 (China Standard Time)

Yes, I see that base32.NewEncoding("0123456789abcdefghijklmnopqrstuv").WithPadding(-1) from the standard lib behaves in exactly the same way; DecodeString against both values gives the same binary result and no error.

As mentioned by @syssam, this means that ID comparisons can appear to give false positives. This can seem scarier than it is. Perhaps particularly so if the ID is used as a username in a username/password context 🙈

Is there any efficient way in which we can detect the padding changes and return an error? Would this be the correct thing to do for an ID?

Olivier Poitrey · Answer 10 · Thu Mar 03 2022 21:22:01 GMT+0800 (China Standard Time)

I don’t know any.

Sindre Røkenes Myren · Answer 11 · Thu Mar 03 2022 22:11:04 GMT+0800 (China Standard Time)

Would something like this work?

I just tested one example... I don't know if this check is valid for all possible XIDs, but I suppose one could write a "quick" test to check that.

https://go.dev/play/p/Gr0ikoyXPzo

package main

import (
	"encoding/base32"
	"fmt"
)

func main() {
	const s1 = "9bsv0s6krhp002t5fla0"
	const s2 = "9bsv0s6krhp002t5fla9" // invalid

	enc := base32.NewEncoding("0123456789abcdefghijklmnopqrstuv").WithPadding(-1)

	id1, err1 := enc.DecodeString(s1)
	id2, err2 := enc.DecodeString(s2)
	fmt.Println("id1:", id1, "err1:", err1)
	fmt.Println("id2:", id2, "err2:", err2)

	ok1 := enc.EncodeToString(id1[10:]) == s1[16:]
	ok2 := enc.EncodeToString(id2[10:]) == s2[16:]
	fmt.Println("id1 OK:", ok1)
	fmt.Println("id2 OK:", ok2)

}

id1: [74 249 240 112 212 220 114 0 11 165 125 84] err1: <nil>
id2: [74 249 240 112 212 220 114 0 11 165 125 84] err2: <nil>
id1 OK: true
id2 OK: false

Program exited.

Olivier Poitrey · Answer 12 · Thu Mar 03 2022 22:31:26 GMT+0800 (China Standard Time)

Can you bench the perf difference? Avoiding dec/enc would be better if possible.
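One possible way to avoid the decode/encode round trip, sketched here under assumptions rather than taken from the library: 19 characters hold 95 bits, so only the top bit of the 20th character belongs to the 96-bit ID, and its low 4 bits must be zero. With the alphabet from this thread, that leaves '0' and 'g' as the only canonical last characters. The names `checkTail` and `errNonCanonical` are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// alphabet is the lowercase base32 alphabet quoted in this thread.
const alphabet = "0123456789abcdefghijklmnopqrstuv"

// errNonCanonical is a hypothetical error value for this sketch.
var errNonCanonical = errors.New("non-canonical trailing bits")

// checkTail inspects only the 20th character of s. It does not validate
// the first 19 characters; a real parser would still have to do that.
func checkTail(s string) error {
	if len(s) != 20 {
		return errors.New("invalid length")
	}
	v := strings.IndexByte(alphabet, s[19])
	if v < 0 {
		return errors.New("invalid character")
	}
	// Only the top bit of the last 5-bit value is part of the ID.
	if v&0x0f != 0 {
		return errNonCanonical
	}
	return nil
}

func main() {
	for _, s := range []string{"9bsv0s6krhp002t5fla0", "9bsv0s6krhp002t5fla9"} {
		fmt.Println(s, "->", checkTail(s))
	}
}
```

This trades generality for speed: it is specific to 12-byte values encoded as 20 characters, whereas re-encoding and comparing works for any length.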

Sindre Røkenes Myren · Answer 13 · Fri Mar 04 2022 16:21:46 GMT+0800 (China Standard Time)

Benchmark in #75.

Appears to increase parsing time from ~14ns/op to ~20ns/op on my machine.

Sindre Røkenes Myren · Answer 14 · Fri Mar 04 2022 17:05:56 GMT+0800 (China Standard Time)

Also added a "Quick" test that fails for the current implementation just now (and passes the new implementation).

This is a test using randomized input and is meant to check if the fix is sufficient.

I think I will also extend the test a bit to check what happens if using characters not allowed by the base32 alphabet.

Sindre Røkenes Myren · Answer 15 · Fri Mar 04 2022 17:10:07 GMT+0800 (China Standard Time)

> I think I will also extend the test a bit to check what happens if using characters not allowed by the base32 alphabet.

Done.

Sindre Røkenes Myren · Answer 16 · Fri Mar 11 2022 02:25:36 GMT+0800 (China Standard Time)

Actually, ~20ns/op is not completely realistic, it seems. If I insert the code directly in the decode function, the operation is faster, presumably due to inlining or another relevant optimization.

BenchmarkFromString-4           70718811                16.58 ns/op

vs. when the check code is commented out:

BenchmarkFromString-4           83358038                14.58 ns/op

Want me to proceed with this?

Olivier Poitrey · Answer 17 · Fri Mar 11 2022 02:34:55 GMT+0800 (China Standard Time)

Yes go ahead