jcjohnson / torch-rnn

Efficient, reusable RNNs and LSTMs for torch

Learning this as I go, can't sort out this error (init.lua, unable to find HDF5 lib)

BryanIRM opened this issue · comments

Hi there, apologies in advance as this is all entirely brand new to me, and I feel like what I'm doing is incredibly rudimentary but I've hit an error I can't sort out. I've been searching for similar cases but nothing seems to have worked and I'm not versed at all in working in Terminal or Linux or any of these programs, unfortunately, so I may need a hand-hold, explain-like-I'm-5 sort of set of steps.

I'm running Ubuntu 18.04 in a VM on my PC so that I can run Torch and do a basic sort of plug-and-play "we input a list of [THING] and had a bot try to create its own list of [THING]". Here are my VM settings:
[image: VM settings]

For the most part, I'm using this blog to coach myself along: http://www.jeffreythompson.org/blog/2016/03/25/torch-rnn-mac-install/
I've been able to address every hiccup so far, such as installing things the guide doesn't explicitly mention. But here's the command I'm caught up on:

th train.lua -input_h5 data/All5Leagues.h5 -input_json data/All5Leagues.json -gpu -1

And here's the Terminal copy-paste with the error:

bryan@bryan-VirtualBox:~$ cd torch
bryan@bryan-VirtualBox:~/torch$ cd torch-rnn
bryan@bryan-VirtualBox:~/torch/torch-rnn$ th train.lua -input_h5 data/All5Leagues.h5 -input_json data/All5Leagues.json

/home/bryan/torch/install/share/lua/5.1/hdf5/init.lua:15 Unable to find the HDF5 lib we were built against - trying to find it elsewhere
/home/bryan/torch/install/bin/luajit: /home/bryan/torch/install/share/lua/5.1/trepl/init.lua:389: /home/bryan/torch/install/share/lua/5.1/trepl/init.lua:389: /home/bryan/torch/install/share/lua/5.1/hdf5/ffi.lua:29: libhdf5.so: cannot open shared object file: No such file or directory
stack traceback:
[C]: in function 'error'
/home/bryan/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
train.lua:6: in main chunk
[C]: in function 'dofile'
...ryan/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x56163dbb1570

bryan@bryan-VirtualBox:~/torch/torch-rnn$

I understand this means Torch isn't finding some sort of library, which I'm pretty certain I have. But anything I've tried doesn't seem to work and hopefully I could get some assistance. Thanks!

Check the library path in install/share/lua/5.1/hdf5/config.lua; sometimes it doesn't get detected correctly. You might also hit library version issues: the last time I used Torch, it didn't support the then-current HDF5 library version and I had to install a patched version.
Unfortunately Torch is pretty dead these days, and you'll probably hit more issues sooner or later. I switched my code to PyTorch last year.
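
One way to act on the first suggestion, as a minimal sketch: list the libhdf5 shared objects the system actually has, then compare them with what config.lua records. The `find_hdf5_libs` helper is a hypothetical name, and the search directories are assumptions about a typical Ubuntu layout rather than anything confirmed in this thread. The config.lua location is inferred from the paths in the error message above.

```shell
#!/bin/sh
# Hypothetical helper: print every libhdf5 shared object found under the
# directories passed as arguments.
find_hdf5_libs() {
    for dir in "$@"; do
        # Skip directories that do not exist on this machine.
        [ -d "$dir" ] || continue
        # Ignore unreadable subdirectories instead of aborting.
        find "$dir" -name 'libhdf5*.so*' 2>/dev/null || true
    done
}

# Assumed typical library locations; adjust for your system.
find_hdf5_libs /usr/lib /usr/local/lib

# Show what the hdf5 rock was configured with, if the file exists.
# The path is inferred from the error message, not verified.
CONFIG="$HOME/torch/install/share/lua/5.1/hdf5/config.lua"
if [ -f "$CONFIG" ]; then
    cat "$CONFIG"
fi
```

If the library turns up in a directory that config.lua does not mention, that mismatch is a likely cause of the "libhdf5.so: cannot open shared object file" error.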

Thank you! I appreciate the tips. I've actually started over on a new VM because I think I messed with too many things... plus this would cut down on the commands looking for paths all over the place. Here's the current error I'm up to right now:

[image: screenshot of the current error]

The suggestion I've received on this is to uninstall my system libhdf5 and tell lua to use a different version of it. Again, this is all brand new to me so I don't know if this will work or not but I trust the person suggesting this.

How similar is PyTorch? Is it simple and/or similar to use? Ultimately, I'm just trying to do a silly "feed the machine a list of names of a thing, have it learn and try to create its own, laugh at the results" sort of thing, so hardly anything scientific or seemingly complicated. I know Torch does character-based learning; does PyTorch as well?

It's trying to use CUDA, but the cutorch and cunn modules aren't installed. Either install them using luarocks (you might have to tell nvcc to use a host compiler it's compatible with) or switch to CPU mode using -gpu -1.
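
As a rough sketch of that choice: the hypothetical `gpu_flag` helper below picks the value for `-gpu` by asking luarocks whether both CUDA rocks are installed, and falls back to CPU mode otherwise. It assumes `luarocks show` exits with a non-zero status for a rock that is not installed. The command is only printed, not run, so you can check it before using it.

```shell
#!/bin/sh
# Hypothetical helper: print the value to pass to train.lua's -gpu option.
# Prints 0 (first GPU) only if luarocks reports both cutorch and cunn;
# otherwise prints -1 (CPU mode).
gpu_flag() {
    if command -v luarocks >/dev/null 2>&1 \
        && luarocks show cutorch >/dev/null 2>&1 \
        && luarocks show cunn >/dev/null 2>&1; then
        echo "0"
    else
        echo "-1"
    fi
}

# Print the training command from the original post with the chosen flag.
echo "th train.lua -input_h5 data/All5Leagues.h5 -input_json data/All5Leagues.json -gpu $(gpu_flag)"
```

Inside a VM without GPU passthrough, CPU mode is usually the only option anyway, so passing `-gpu -1` by hand works just as well.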

The PyTorch API is quite similar to Torch's, and I tried to keep pytorch-rnn close to torch-rnn, but some parts are still missing.