jmoiron / sqlx

general purpose extensions to golang's database/sql

Home Page: http://jmoiron.github.io/sqlx/

StructScan infinite loop

elwinar opened this issue

I had a case earlier which made me pull my hair out for a full day and a half:

Let's say we have the following structs:

type Entity struct {
    ID      int64 `db:"id"`
    Service *Service
}

type Service struct {
    Entity *Entity
}

// NewEntity wires Entity and Service to point at each other,
// creating a cycle between the two types.
func NewEntity(id int64) *Entity {
    e := &Entity{ID: id}
    e.Service = &Service{}
    e.Service.Entity = e
    return e
}

When StructScanning, the function tries to establish a field map by exploring the tree of embedded types of the struct. In my case, the exploration never ends because of the pointers.
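For illustration, here is a minimal sketch (not sqlx's actual code; it assumes import "reflect" and the Entity/Service types above) of the kind of type-level walk that builds such a field map, plus the visited-type set that stops it from looping on the Entity/Service cycle:

// addFields records every db-tagged field reachable from t.
// The visited set is what prevents an infinite loop on cyclic
// types such as Entity <-> Service above.
func addFields(t reflect.Type, m map[string]reflect.StructField, visited map[reflect.Type]bool) {
    if t.Kind() == reflect.Ptr {
        t = t.Elem()
    }
    if t.Kind() != reflect.Struct || visited[t] {
        return
    }
    visited[t] = true
    for i := 0; i < t.NumField(); i++ {
        f := t.Field(i)
        if tag := f.Tag.Get("db"); tag != "" {
            m[tag] = f
        }
        // Descending into struct and pointer-to-struct fields, embedded
        // or not, is what makes the naive version recurse forever here.
        if k := f.Type.Kind(); k == reflect.Struct || k == reflect.Ptr {
            addFields(f.Type, m, visited)
        }
    }
}

// addFields(reflect.TypeOf(Entity{}), m, map[reflect.Type]bool{})
// terminates and leaves only "id" in m.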

This issue brings the following questions to my mind: what is the point of exploring non-embedded fields? It seems rather non-semantic to me…
And independently, why build a field map at all instead of simply looking the fields up as needed when scanning?

Thanks for the bug report; I only saw the title on my way into work this morning and thought right away "hmm, structscan would choke on circular data structures".

  1. Scanning non-embedded structs was a mistake I am going to be fixing in an "API breaking" release coming up.
  2. Getting the reflect introspection work out of the way up front saves a lot of time. I'd like to change it so that the maps provide a direct way of getting the field. Right now they contain the BFS position in the tree, which means it still must be traversed. It'd be better to store a list of positions at each depth, along the lines of the sketch below.
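As a rough illustration of that idea (hypothetical names and types, assuming import "reflect"; not the eventual sqlx API), a cache keyed by column name could store each field's full index path, so a lookup becomes a single FieldByIndex call instead of a tree walk:

type Base struct {
    CreatedAt int64 `db:"created_at"`
}

type Row struct {
    Base
    ID int64 `db:"id"`
}

// hypothetical cache: column name -> index path of the destination field
var fieldIndex = map[string][]int{
    "created_at": {0, 0}, // Row.Base.CreatedAt
    "id":         {1},    // Row.ID
}

// fieldFor reaches the field directly; no BFS is needed at scan time.
func fieldFor(v reflect.Value, column string) reflect.Value {
    return v.FieldByIndex(fieldIndex[column])
}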

Does this need to be fixed for embedded structs as well, or does the compiler prevent you from doing embedded struct loops? If not, it'll be fixed soon with the API cleanup release.

If you don't scan non-embedded structs, I don't think the BFS will be necessary. I haven't researched much how reflection works in Go, but I think that one way or another, each exported field of the struct should be accessible right away with the right RTTI function.

It doesn't need to be fixed for embedded structs, as the compiler complains about recursive embedding (see http://play.golang.org/p/Nkz7n-1W6m).

Okay, excellent. I suspected the compiler might complain.

BFS is still sadly required to emulate Go's shadowing rules. reflect.FieldByName works on embedded fields, but it won't utilize tags, so it has to be somewhat re-implemented.
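A small illustration of that limitation (the types are made up for the example): FieldByName resolves promoted embedded fields, but nothing in reflect looks a field up by its tag value.

package main

import (
    "fmt"
    "reflect"
)

type Timestamps struct {
    CreatedAt string `db:"created_at"`
}

type User struct {
    Timestamps
    Name string `db:"name"`
}

func main() {
    v := reflect.ValueOf(User{})
    // Promoted embedded fields are found by their Go name...
    fmt.Println(v.FieldByName("CreatedAt").IsValid()) // true
    // ...but there is no lookup by tag, so tag-aware resolution
    // has to be re-implemented on top of reflect.
    fmt.Println(v.FieldByName("created_at").IsValid()) // false
}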

Removing the descent into non-embedded fields is going to simplify things quite a lot. There is a pull request for adding embedded struct support to modl and the code is way simpler because it doesn't have to check for Scanner or Valuer or do any weird stuff.

I was trying it at the same time, and for me the tags come through on the embedded fields too… http://play.golang.org/p/PbEtNt_JOh

Yes but it's backwards; when you scan, you only have the tag name, and you must go and find the field that fits it.

I think that the StructScan function should just iterate over the given struct's fields and populate a map[string]string with the sql -> struct correspondences. No need for anything fancy…

I'll do some benchmarking to see if Value.FieldByName carries a big penalty compared to just accessing by offset. If not, then you're right, each reachable name in a struct is unique and should get me there. It's still nice to cache the field mappings; they are actually expensive to build.

Accessing the value's fields by offset should be better, that's right.
Here is an example of the field mapping loop: http://play.golang.org/p/UzZzrOr1AU

That doesn't do embedded structs (e.g. 'age' is not accessible). More and more I'm thinking that sqlx's reflect stuff can be genericized and pushed into an sqlx/reflect package. I need to reuse some of it in modl anyway.

You're right, it doesn't do embedded structs. I didn't see it… but the StructField.Anonymous field can differentiate embedded fields from others, so it can be resolved with either recursion or a stack (as you did).
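A rough sketch of that (assuming import "reflect"; names are illustrative, not sqlx's code): an iterative walk with an explicit stack that only descends into anonymous struct fields and records each db tag.

// buildNameMap maps db tag -> Go field name, descending only into
// embedded (anonymous) struct fields via an explicit stack.
func buildNameMap(t reflect.Type) map[string]string {
    m := map[string]string{}
    stack := []reflect.Type{t}
    for len(stack) > 0 {
        cur := stack[len(stack)-1]
        stack = stack[:len(stack)-1]
        for i := 0; i < cur.NumField(); i++ {
            f := cur.Field(i)
            if f.Anonymous && f.Type.Kind() == reflect.Struct {
                stack = append(stack, f.Type)
                continue
            }
            if tag := f.Tag.Get("db"); tag != "" {
                m[tag] = f.Name
            }
        }
    }
    return m
}

// Note: this overwrites duplicate tags in stack (DFS) order; a real
// implementation needs BFS ordering to respect Go's shadowing rules,
// as discussed above.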

Exactly.
I think that you should externalize your extended reflection methods into another package. That would simplify both sqlx and modl…

And you could call it reflectx. :)

@wfreeman that was probably a joke but I can't think of a better way to make it live with the regular reflect package ...

I was actually entirely serious. And willing to help--I'd use it.

It seems like all sqlx needs is a GetFieldByName(s) which:

  • understands embedded structs
  • can optionally obey a struct tag
  • caches as much reflect work as it can (whatever benchmarks fastest)

sqlx also needs both the values (for named queries) and the addresses (for scanning) of these fields.
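Roughly, that looks like this (a sketch, assuming import "reflect" and the User type from the sketch above; the struct must be addressable so field addresses can be taken):

// valueAndDest extracts both the value and the address of a mapped field.
func valueAndDest(u *User) (interface{}, interface{}) {
    v := reflect.ValueOf(u).Elem() // start from a pointer so fields are addressable
    f := v.FieldByName("Name")
    arg := f.Interface()          // the field's value, e.g. a named-query argument
    dest := f.Addr().Interface()  // the field's address, e.g. a rows.Scan destination
    return arg, dest
}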

Can you think of anything else that would be useful?

Edit: This is all complicated by the NameMapper in sqlx. This ends up being quite a lot like FieldByNameFunc(), so maybe I can make a similar API.
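Something in that direction might look like this (a sketch, assuming import "strings" and "reflect", and the default lower-casing name mapper):

// fieldForColumn does a NameMapper-driven lookup: the mapper is applied to
// each Go field name and compared to the column name.
func fieldForColumn(v reflect.Value, column string) reflect.Value {
    return v.FieldByNameFunc(func(goName string) bool {
        return strings.ToLower(goName) == column
    })
}

// fieldForColumn(v, "name") matches User.Name from the example above;
// promoted embedded fields are matched as well.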

commit 8aa2977

PASS
BenchmarkFieldNameL1    10000000               261 ns/op
BenchmarkFieldNameL4     1000000              2933 ns/op
BenchmarkFieldPosL1     20000000               112 ns/op
BenchmarkFieldPosL4     20000000               137 ns/op
ok      github.com/jmoiron/sqlx/reflect 11.120s

Lookups by field index are way, way faster than lookups by field name, especially when accessing embedded names.

Interesting. I tried the FieldByIndex([]int{0,0,0,0}) for the last one and it was slightly slower than doing f = f.Field(0) 4 times, but might be the same in practice. So, are you thinking of caching the indexes?
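For reference, the two access patterns being compared (a sketch, assuming import "reflect" and a struct value whose field 0 nests three more levels of structs):

// descend reaches the same deeply nested field both ways.
func descend(v reflect.Value) (reflect.Value, reflect.Value) {
    // one call with a cached index path
    a := v.FieldByIndex([]int{0, 0, 0, 0})

    // versus stepping down one level at a time
    b := v
    for i := 0; i < 4; i++ {
        b = b.Field(0)
    }
    return a, b
}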

It's slightly slower but nowhere near FieldByName:

PASS
BenchmarkFieldNameL1    10000000               258 ns/op
BenchmarkFieldNameL4     1000000              2926 ns/op
BenchmarkFieldPosL1     20000000               112 ns/op
BenchmarkFieldPosL4     20000000               136 ns/op
BenchmarkFieldByIndexL4 10000000               147 ns/op

This is to be expected, as there is some overhead in the iteration that isn't there when the indexes are all known up front. Notably, it's still nearly twice as fast to descend 4 levels by index as it is to get a top-level field by name.

This has all been merged into master now.