dfdx / Ghost.jl

The Code Tracer

Hashing issue with Ghost.Variable and dictionary keys

darsnack opened this issue

I have a context where I store a Dict{Ghost.Variable, Vector{Any}}. Occasionally (usually with very long programs, more than 500 assignments) haskey(dict, var) returns false, yet when I print the contents of the dictionary, that particular key is clearly there.

Is it possible that the custom hash function for Ghost.Variable causes issues when using them as keys to a Dict? I don't really know how we can test this.

Variables are references to operations and are thus mutable, making them hash-unfriendly. For example, consider the following sequence of events:

  1. You put a bound variable into a dict. During the put operation, its hash is calculated and the object is stored at position X.
  2. You apply some tape transformations, changing the position of the operation and thus mutating the bound variable.
  3. You run haskey(dict, var). The variable's hash is recalculated, but it now maps to position Y in the hash table, so haskey() returns false.
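
The sequence above can be reproduced without Ghost. This is only a minimal sketch: FakeVar is a hypothetical stand-in (not Ghost.Variable) whose hash follows a mutable id field, and the mutation of id stands in for a tape transformation.

```julia
# FakeVar is a hypothetical stand-in for a bound variable, NOT Ghost.Variable.
# Its equality and hash both depend on a field that can change.
mutable struct FakeVar
    id::Int
end

Base.:(==)(a::FakeVar, b::FakeVar) = a.id == b.id
Base.hash(v::FakeVar, h::UInt) = hash(v.id, h)

v = FakeVar(1)

# Step 1: the key is stored in the slot derived from hash(1).
d = Dict{FakeVar,Vector{Any}}(v => Any[])

# Step 2: stand-in for a tape transformation that mutates the variable.
v.id = 2

# Step 3: the lookup now starts from the slot derived from hash(2),
# so it almost always misses the entry and returns false.
found_by_hash = haskey(d, v)

# Yet the very same object is still one of the Dict's keys,
# which is why printing the Dict shows it.
found_by_scan = any(k -> k === v, keys(d))

println("haskey: ", found_by_hash, ", present in keys: ", found_by_scan)
```

The lookup result is described as "almost always" false because, by chance, the old and new hash can map to the same slot; that may also be why the problem only shows up occasionally.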

The way I deal with it in Yota is by using OrderedCollections.LittleDict instead of Dict. LittleDict is not a hash table and checks the presence of the object using equality.
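
A rough sketch of that workaround, assuming OrderedCollections is installed. FakeVar is again a hypothetical stand-in for a variable with a mutable, hash-relevant field; it is not Ghost's or Yota's actual code.

```julia
using OrderedCollections: LittleDict

# Hypothetical stand-in for a bound variable, NOT Ghost.Variable.
mutable struct FakeVar
    id::Int
end

Base.:(==)(a::FakeVar, b::FakeVar) = a.id == b.id
Base.hash(v::FakeVar, h::UInt) = hash(v.id, h)

v = FakeVar(1)

# LittleDict keeps keys and values in plain vectors and scans them on lookup.
ld = LittleDict{FakeVar,Vector{Any}}()
ld[v] = Any[]

# Mutating the key changes its hash, but no hash was ever used to place it.
v.id = 2

# The scan compares keys by equality, and v is still equal to itself.
still_there = haskey(ld, v)
println("haskey on LittleDict after mutation: ", still_there)
```

The cost is a linear scan per lookup, which is usually acceptable for small collections but may matter for tapes with many variables.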

I admit this isn't very friendly, and it isn't even documented 😨 Maybe we can change the hash function logic so that the hash is calculated once and stored, for example when the variable is created. I'll play around with this option.

Implemented in #21, Ghost v0.3.1. The hash is frozen the first time it's calculated, so whatever you put into a Dict stays reachable under that hash from then on. This is quite a hack, so no strong guarantees of backward compatibility yet :(
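
For illustration, the idea of freezing the hash could look roughly like the sketch below. This is not the code from #21: FrozenVar and its cached_hash field are hypothetical names, and the real Ghost.Variable may store and compute its hash differently.

```julia
# Hypothetical type illustrating a "frozen" hash; NOT the implementation in Ghost.
mutable struct FrozenVar
    id::Int
    cached_hash::Union{UInt,Nothing}   # nothing until the first hash call
end
FrozenVar(id::Int) = FrozenVar(id, nothing)

Base.:(==)(a::FrozenVar, b::FrozenVar) = a.id == b.id

function Base.hash(v::FrozenVar, h::UInt)
    # Compute the base hash once, from the id the variable has at that moment.
    if v.cached_hash === nothing
        v.cached_hash = hash(v.id)
    end
    # Later calls reuse the cached value, even if id has changed since.
    return hash(v.cached_hash, h)
end

v = FrozenVar(1)
d = Dict{FrozenVar,Vector{Any}}(v => Any[])   # first hash call happens here

h_before = hash(v)
v.id = 2                                      # stand-in for a tape transformation
h_after = hash(v)

# The hash did not move, so the lookup reaches the original slot.
found = haskey(d, v)
println("hash unchanged: ", h_before == h_after, ", haskey: ", found)
```

One trade-off visible in this sketch: after a mutation, two variables that compare equal can carry different hashes, so a freshly built variable with the same id is not guaranteed to find the entry.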

Wow that was fast! I'll give it a shot with my current code. I was going to refactor the code to eliminate the need for Dicts in this case, but there are other places where I still use them.

I can confirm the fix worked for me.