korymath / jann

Hi. I am jann. I am text input - text output chatbot model that is JUST approximate nearest neighbour.

Can't run examples on fresh Ubuntu build

binraker opened this issue · comments

commented

I failed to get the simple example to run on a fresh Ubuntu build.

# from a fresh install of Ubuntu 20.04.2 LTS on VirtualBox
sudo apt update
sudo apt upgrade
sudo apt install gcc
sudo apt install make
sudo apt install perl

# now install the guest additions and restart (change display size if necessary)

sudo apt install git
git clone https://github.com/korymath/jann
cd jann

sudo apt install python3-venv

python3.8 -m venv venv
source venv/bin/activate
pip install --upgrade pip setuptools

sudo apt install libpython3.8-dev
sudo apt install g++

pip install -r requirements.txt

python setup.py install
export TFHUB_CACHE_DIR=Jann/data/module
mkdir -p ${TFHUB_CACHE_DIR}
wget "https://tfhub.dev/google/universal-sentence-encoder-lite/2?tf-hub-format=compressed" -O ${TFHUB_CACHE_DIR}/module_lite.tar.gz
cd ${TFHUB_CACHE_DIR};
mkdir -p universal-sentence-encoder-lite-2 && tar -zxvf module_lite.tar.gz -C universal-sentence-encoder-lite-2;
cd -

mkdir -p Jann/data/CMDC
cd Jann/data/CMDC/
wget http://www.cs.cornell.edu/~cristian/data/cornell_movie_dialogs_corpus.zip
unzip cornell_movie_dialogs_corpus.zip
mv cornell\ movie-dialogs\ corpus/movie_lines.txt movie_lines.txt
mv cornell\ movie-dialogs\ corpus/movie_conversations.txt movie_conversations.txt

cd -

pytest --cov-report=xml --cov-report=html --cov=Jann

Gives:

(venv) peter@peter-VirtualBox:~/jann/Jann$ pytest --cov-report=xml --cov-report=html --cov=Jann
================================================ test session starts =================================================
platform linux -- Python 3.8.5, pytest-6.2.2, py-1.10.0, pluggy-0.13.1 -- /home/peter/jann/venv/bin/python3.8
cachedir: .pytest_cache
rootdir: /home/peter/jann, configfile: setup.cfg
plugins: cov-2.11.1
collected 14 items

tests/test_units.py::test_app PASSED [ 7%]
tests/test_units.py::test_process_cornell_data PASSED [ 14%]
tests/test_units.py::test_process_pairs_data PASSED [ 21%]
tests/test_units.py::test_embed_lines PASSED [ 28%]
tests/test_units.py::test_process_embeddings PASSED [ 35%]
tests/test_units.py::test_index_embeddings PASSED [ 42%]
tests/test_units.py::test_interact_with_model PASSED [ 50%]
tests/test_utils.py::test_parse_arguments PASSED [ 57%]
tests/test_utils.py::test_load_data_list_not_pairs PASSED [ 64%]
tests/test_utils.py::test_load_data_list_pairs PASSED [ 71%]
tests/test_utils.py::test_load_lines PASSED [ 78%]
tests/test_utils.py::test_load_conversations PASSED [ 85%]
tests/test_utils.py::test_extract_pairs PASSED [ 92%]
tests/test_utils.py::test_extract_pairs_from_lines PASSED [100%]

================================================== warnings summary ==================================================
../venv/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py:22
/home/peter/jann/venv/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py:22: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp

-- Docs: https://docs.pytest.org/en/stable/warnings.html

----------- coverage: platform linux, python 3.8.5-final-0 -----------
Coverage HTML written to dir htmlcov
Coverage XML written to file coverage.xml

=========================================== 14 passed, 1 warning in 24.15s ===========================================

(venv) peter@peter-VirtualBox:~/jann/Jann$ ./run_examples/run_CMDC.sh
2021-03-10 19:46:34.528180: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-03-10 19:46:34.528379: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:tensorflow:From /home/peter/jann/venv/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
INFO:tensorflow:CMDC movie_lines_path: data/CMDC/movie_lines.txt
INFO:tensorflow:CMDC movie_converstions_path: data/CMDC/movie_conversations.txt
INFO:tensorflow:Selecting and saving 50 random lines...
INFO:tensorflow:Found 304713 input lines.
./run_examples/run_CMDC.sh: line 15: 7350 Killed python process_cornell_data.py --infile_path=${INFILEPATH} --outfile=${INFILE} --num_lines=${NUMLINES} --verbose
2021-03-10 19:46:54.247062: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-03-10 19:46:54.247107: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:tensorflow:From /home/peter/jann/venv/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
INFO:tensorflow:0 lines in input file: data/CMDC/all_lines_50.txt
INFO:tensorflow:Creating new dictionary to save outputs
INFO:tensorflow:0 new lines to encode...
INFO:tensorflow:No new lines encoded. Quitting.
INFO:tensorflow:Output file: data/CMDC/all_lines_50.txt.embedded.pkl
2021-03-10 19:47:00.598407: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-03-10 19:47:00.598567: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:tensorflow:From /home/peter/jann/venv/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Traceback (most recent call last):
File "process_embeddings.py", line 48, in <module>
sys.exit(process_embeddings(args))
File "process_embeddings.py", line 15, in process_embeddings
embeddings, _ = utils.load_data(path_to_embeddings, 'dict')
File "/home/peter/jann/venv/lib/python3.8/site-packages/Jann-3.0.0-py3.8.egg/Jann/utils.py", line 109, in load_data
with open(file_path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'data/CMDC/all_lines_50.txt.embedded.pkl'
2021-03-10 19:47:03.640732: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-03-10 19:47:03.640897: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:tensorflow:From /home/peter/jann/venv/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Traceback (most recent call last):
File "index_embeddings.py", line 51, in <module>
sys.exit(index_embeddings(args))
File "index_embeddings.py", line 16, in index_embeddings
with open(unique_strings_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'data/CMDC/all_lines_50.txt.embedded.pkl_unique_strings.csv'
2021-03-10 19:47:06.944403: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-03-10 19:47:06.944572: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
WARNING:tensorflow:From /home/peter/jann/venv/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
INFO:tensorflow:Loading unique strings.
Traceback (most recent call last):
File "interact_with_model.py", line 66, in <module>
sys.exit(interact_with_model(args))
File "interact_with_model.py", line 17, in interact_with_model
with open(unique_strings_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'data/CMDC/all_lines_50.txt.embedded.pkl_unique_strings.csv'

It looks like this step:
python process_cornell_data.py --infile_path=${INFILEPATH} --outfile=${INFILE} --num_lines=${NUMLINES} --verbose
is getting killed for some reason, so the first file is never created. A bit of printing added to the script suggests that this line is the likely culprit:
for item in np.random.choice(lines, args.num_lines, replace=False):
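One possible explanation, not confirmed anywhere in this thread: np.random.choice first converts the Python list into a NumPy array, and an array of strings reserves room for the longest string in every slot. With 304713 lines that conversion may need far more memory than the list itself, which could be enough for the kernel to kill the process on a small VirtualBox guest. A minimal sketch of a replacement that samples straight from the list with the standard library is below. The helper name sample_lines and its seed parameter are hypothetical and not part of Jann.

```python
import random


def sample_lines(lines, num_lines, seed=None):
    """Pick num_lines distinct items from lines without building a NumPy array.

    Hypothetical helper for illustration only; it is not part of Jann.
    random.Random.sample draws without replacement, which matches
    np.random.choice(..., replace=False).
    """
    rng = random.Random(seed)
    return rng.sample(lines, num_lines)


if __name__ == "__main__":
    # Stand-in data; the real script would pass the parsed movie lines.
    demo_lines = ["line {}".format(i) for i in range(1000)]
    for item in sample_lines(demo_lines, 50):
        print(item)
```

If this is the cause, swapping the loop to iterate over sample_lines(lines, args.num_lines) should keep memory use close to the size of the original list. Giving the virtual machine more RAM would be another way to test the same hypothesis.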

commented

Hi @binraker -- thank you for reporting this, and I appreciate your patience as I circle back.

Were you able to get this running on your end? There have been a few code changes, and I have not been able to reproduce your error.

commented

Closing for now, feel free to re-open if this is still an issue.