Tucker error--mat1 and mat2 dtype

schmigle opened this issue a year ago

A few months ago, I was able to run EAT without issue. However, after a recent reinstall, I've run into the following error in the Tucker step:

Loading Tucker checkpoint from: /groups/baltrus/moshesteyn/annie_eat/tucker_weights.pt
Tuckerin' took: 0.0004[s]
Loading Tucker checkpoint from: /groups/baltrus/moshesteyn/annie_eat/tucker_weights.pt
Traceback (most recent call last):
  File "/groups/baltrus/moshesteyn/EAT/eat.py", line 515, in <module>
    main()
  File "/groups/baltrus/moshesteyn/EAT/eat.py", line 496, in main
    eater = EAT(lookup_p, query_p, output_d,
  File "/groups/baltrus/moshesteyn/EAT/eat.py", line 221, in __init__
    self.query_embs = self.tucker_embeddings(self.query_embs)
  File "/groups/baltrus/moshesteyn/EAT/eat.py", line 245, in tucker_embeddings
    dataset = model.single_pass(dataset)
  File "/groups/baltrus/moshesteyn/EAT/eat.py", line 36, in single_pass
    return self.tucker(x)
  File "/home/u30/moshesteyn/.conda/envs/eat_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/u30/moshesteyn/.conda/envs/eat_env/lib/python3.10/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
  File "/home/u30/moshesteyn/.conda/envs/eat_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/u30/moshesteyn/.conda/envs/eat_env/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 must have the same dtype

The embedding step itself seems to finish without issues. All requirements are installed as listed in requirements.txt except PyTorch, which is 2.0.0, the version I used before (I ran into CUDA problems with 1.10.0). I'm working in a conda environment rather than a Python virtual environment because I've had trouble installing PyTorch in the virtual environment. I'm trying to debug that now, but I'm posting the issue anyway because I doubt that's the cause.

I tried the solution from #7. The program then appeared to finish properly, but the results file had no entries apart from a correctly written header line. I suppose this could mean that not a single protein was recognizable, but that seems unlikely to me. Any ideas?
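For context, this RuntimeError generally means that the tensor passed into a linear layer and that layer's weights have different dtypes. Below is a minimal sketch of one way to check for and work around that, assuming (not confirmed here) that the embeddings are stored in float16 while the Tucker weights are float32. The `tucker` stack, its layer sizes, and the `match_dtype` helper are hypothetical stand-ins, not code from eat.py.

```python
import torch
from torch import nn

# Hypothetical stand-in for the Tucker projection; the real layer sizes may differ.
tucker = nn.Sequential(nn.Linear(1024, 256), nn.Tanh(), nn.Linear(256, 128))

# Assumption: the embeddings were saved in half precision.
embs = torch.randn(4, 1024).to(torch.float16)


def match_dtype(x: torch.Tensor, module: nn.Module) -> torch.Tensor:
    """Cast x to the dtype of the module's first parameter, if it differs."""
    target = next(module.parameters()).dtype
    return x if x.dtype == target else x.to(target)


print("input dtype:", embs.dtype)
print("weight dtype:", next(tucker.parameters()).dtype)

# Passing the half-precision input straight through is expected to fail.
try:
    tucker(embs)
except RuntimeError as err:
    print("without cast:", err)

# Casting the input to the weight dtype lets the forward pass run.
with torch.no_grad():
    out = tucker(match_dtype(embs, tucker))
print("output:", out.dtype, tuple(out.shape))
```

Casting the input up to the weight dtype keeps the checkpoint untouched. Casting the model down to float16 instead would also make the dtypes agree, but it may change the projected distances slightly.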

Edit: I increased the threshold to 1.5 just to see whether detection was possible at all, and I did see some matches, so perhaps it really is just my dataset. I'll leave this marked as unresolved in case this behavior isn't caused by the underlying dataset; I'm hoping someone has insight. The dataset consists entirely of phage.
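One way to tell whether an empty results file comes from the threshold rather than from a failed run is to look at the nearest-neighbor distances directly. The sketch below assumes hits are selected by Euclidean distance between query and lookup embeddings, which is an assumption about how the threshold is applied. The `nearest_hits` helper and the random tensors are hypothetical placeholders for the real embeddings.

```python
import torch


def nearest_hits(query: torch.Tensor, lookup: torch.Tensor, threshold: float):
    """Return nearest-neighbor distance, index, and a below-threshold mask per query."""
    # Pairwise Euclidean distances, computed in float32 to avoid dtype mismatches.
    dists = torch.cdist(query.float(), lookup.float())
    nn_dist, nn_idx = dists.min(dim=1)
    keep = nn_dist <= threshold
    return nn_dist, nn_idx, keep


# Placeholder data; replace with the real query and lookup embeddings.
torch.manual_seed(0)
query = torch.randn(5, 128)
lookup = torch.randn(20, 128)

nn_dist, nn_idx, keep = nearest_hits(query, lookup, threshold=1.5)
print(
    f"min={nn_dist.min().item():.2f} "
    f"median={nn_dist.median().item():.2f} "
    f"hits={int(keep.sum())}/{keep.numel()}"
)
```

If the smallest distances sit just above the threshold, the empty output probably reflects a dataset that is distant from the lookup set rather than a bug in the run.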