Pretrained model ETA

Question

saicharishmavalluri opened this issue 3 years ago · comments

Sai Charishma Valluri commented 3 years ago

When I test the tool using the pre-trained model, it shows me an ETA of 25 hours.
Is this expected, or am I missing something?
Is there any other way to test it quickly? Can someone help me?
I am using Google Colaboratory.
Thank you in advance for your reply!

Xiaodong Gu · Answer 1 · Tue Oct 19 2021 18:49:39 GMT+0800 (China Standard Time)

It seems that the platform may restrict large memory allocations.
You can try reducing the batch size (e.g., from 10,000 to 1,000).
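
To illustrate the suggestion above, here is a minimal sketch of processing a large collection in fixed-size batches, so that only one batch is passed to the model at a time. The names `iter_batches`, `embed_in_batches`, and `embed_fn` are hypothetical and are not part of the deep-code-search code; the real tool controls this through its own batch-size setting.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(items: Sequence, embed_fn: Callable, batch_size: int = 1000) -> List:
    """Apply `embed_fn` (a hypothetical stand-in for the model) batch by batch."""
    vectors: List = []
    for batch in iter_batches(items, batch_size):
        # Only this one batch is handed to the model at a time.
        vectors.extend(embed_fn(batch))
    return vectors
```

A smaller `batch_size` lowers the working memory needed per call, at the cost of more calls. The accumulated results still grow with the dataset, so this alone may not be enough on a memory-limited platform.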

Samridhi Vaid · Answer 2 · Wed Oct 20 2021 08:35:00 GMT+0800 (China Standard Time)

Hello @guxd

I am trying to run the pre-trained model but I am not able to search. It gives me the below error.

Sai Charishma Valluri · Answer 3 · Wed Oct 20 2021 10:47:18 GMT+0800 (China Standard Time)

Hello @saicharishmavalluri How did you manage to use it in Google Colab? Can you share that?

Thanks

Hi @samvaid

I haven't used the pre-trained model since it is taking me 25 hrs to embed it.
Instead, I trained the normal model in Keras by decreasing the number of epochs to 2.
Also, how did you manage to run it for 44 hours without any hurdles?

Samridhi Vaid · Answer 4 · Wed Oct 20 2021 11:04:03 GMT+0800 (China Standard Time)

@saicharishmavalluri Did you train the Keras notebook on colab? How did you do that?
I did run it locally but eventually ran into some problem

Sai Charishma Valluri · Answer 5 · Wed Oct 20 2021 11:06:36 GMT+0800 (China Standard Time)

@samvaid
I cloned the entire deep-code-search repository into my Google Drive and followed the instructions in the README file of the keras folder.

Samridhi Vaid · Answer 6 · Wed Oct 20 2021 11:10:24 GMT+0800 (China Standard Time)

@saicharishmavalluri how much time did it take you to do that? And did it work fine?

Sai Charishma Valluri · Answer 7 · Wed Oct 20 2021 11:23:21 GMT+0800 (China Standard Time)

@samvaid
Training the model for 2 epochs took me around 1 hour; after that, the code embedding and search didn't take much time.
Initially, I tried replacing the files in the data/github folder with the real dataset and training the model.
Training took around 1 hour for 2 epochs, but during code embedding my Google Colab session crashed because the memory was full.
So later I left the files in the data/github folder unchanged and tried again. It worked fine for me.

Xiaodong Gu · Answer 8 · Wed Oct 20 2021 12:31:00 GMT+0800 (China Standard Time)

@samvaid The Keras and PyTorch versions use different data folders in Google Drive. Make sure that you have downloaded train.methname.h5 from Google Drive.

Samridhi Vaid · Answer 9 · Wed Oct 20 2021 12:33:23 GMT+0800 (China Standard Time)

@guxd I am trying to use the pretrained model without copying the dataset from google drive.

Sai Charishma Valluri · Answer 10 · Wed Oct 20 2021 22:38:29 GMT+0800 (China Standard Time)

@saicharishmavalluri I am having trouble running the Keras code in Colab. I am using the exact same package versions. Can you share your notebook with me?

@samvaid
code_search_keras-2.ipynb.zip

Samridhi Vaid · Answer 11 · Fri Oct 22 2021 00:52:24 GMT+0800 (China Standard Time)

@saicharishmavalluri
Hi. What Python version do you have in Colab? When I run your notebook, I get an error. My Python version is 3.7.12.

Did you downgrade your Colab to Python 3.6?

Sai Charishma Valluri · Answer 12 · Fri Oct 22 2021 06:24:40 GMT+0800 (China Standard Time)

@saicharishmavalluri

2021-10-21 19:22:05,570: models: INFO: compiling models
Traceback (most recent call last):
  File "main.py", line 272, in <module>
    engine.load_model(model, config['training_params']['reload'])
  File "main.py", line 42, in load_model
    assert os.path.exists(model_path + f"epo{epoch}_code.h5"),f"Weights at epoch {epoch} not found"
AssertionError: Weights at epoch 500 not found

I am getting the above error when I try to run the code below. I am changing the reload value to 500.

    #change configs.py file reload value to 500
    !python main.py --mode repr_code

@samvaid
Sorry for the confusion.
The value of the reload parameter should be your last epoch number. In my case, since I trained for 2 epochs (0, 1), my reload value is 1.
Also, please let me know if it works for you; in my case the cell gets terminated because the memory is full.
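
As a rough sketch of how one might find the right value for reload, the snippet below scans a directory for weight files and returns the highest epoch number. The filename pattern `epo{epoch}_code.h5` comes from the assertion in the traceback; the helper name `latest_saved_epoch` and the assumption that all weight files sit directly in a single directory are illustrative, not taken from the repository.

```python
import os
import re
from typing import Optional

# Pattern taken from the assertion message: epo{epoch}_code.h5
_CODE_WEIGHTS = re.compile(r"^epo(\d+)_code\.h5$")


def latest_saved_epoch(model_dir: str) -> Optional[int]:
    """Return the highest epoch that has saved code weights, or None if there are none."""
    epochs = []
    for name in os.listdir(model_dir):
        match = _CODE_WEIGHTS.match(name)
        if match:
            epochs.append(int(match.group(1)))
    return max(epochs) if epochs else None
```

If this returns None, no weights were saved at all, and any reload value will trigger the "Weights at epoch ... not found" assertion.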

Sai Charishma Valluri · Answer 13 · Fri Oct 22 2021 07:10:53 GMT+0800 (China Standard Time)

@saicharishmavalluri
'batch_size': 128,
'chunk_size': 100000,
'nb_epoch': 2,
'validation_split': 0.2,
'optimizer': 'adam',
# 'optimizer': Adam(clip_norm=0.1),
'valid_every': 5,
'n_eval': 100,
'evaluate_all_threshold': {
    'mode': 'all',
    'top1': 0.4,
},
'save_every': 10,
'reload': -1,

Is the above your configuration in the config file when you are training the model?

And then, when you run !python main.py --mode repr_code, do you just change reload to 1?

@samvaid
Yes

Sai Charishma Valluri · Answer 14 · Fri Oct 22 2021 07:17:54 GMT+0800 (China Standard Time)

@saicharishmavalluri I am getting the below error

  File "main.py", line 258, in <module>
    model.compile(optimizer=optimizer)
  File "/content/deep-code-search/keras/deep-code-search/keras/models.py", line 202, in compile
    self._code_repr_model.compile(loss='cosine_proximity', optimizer=optimizer, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/tracking/base.py", line 457, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py", line 336, in compile
    self.loss, self.output_names)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1351, in prepare_loss_functions
    loss_functions = [get_loss_function(loss) for _ in output_names]
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1351, in <listcomp>
    loss_functions = [get_loss_function(loss) for _ in output_names]
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_utils.py", line 1087, in get_loss_function
    loss_fn = losses.get(loss)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/losses.py", line 1183, in get
    return deserialize(identifier)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/losses.py", line 1174, in deserialize
    printable_module_name='loss function')
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/utils/generic_utils.py", line 210, in deserialize_keras_object
    raise ValueError('Unknown ' + printable_module_name + ':' + object_name)
ValueError: Unknown loss function:cosine_proximity

@samvaid
You need to change 'cosine_proximity' to 'cosine_similarity' in
/content/deep-code-search/keras/deep-code-search/keras/models.py, line 202.
Also, you should change the 'reload' value only after you have finished training the model.
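
The rename above can be done by hand in an editor. For a notebook cell, a minimal sketch of the same edit might look like this; `rename_loss` is a hypothetical helper, not part of the repository, and it simply replaces every occurrence of the old loss name in the given file.

```python
from pathlib import Path


def rename_loss(path: str,
                old: str = "cosine_proximity",
                new: str = "cosine_similarity") -> int:
    """Replace `old` with `new` in the file at `path`; return how many occurrences were found."""
    source_file = Path(path)
    text = source_file.read_text(encoding="utf-8")
    count = text.count(old)
    if count:
        # Rewrite the file only when there is something to change.
        source_file.write_text(text.replace(old, new), encoding="utf-8")
    return count
```

Calling it with the models.py path from the traceback should report at least one replacement the first time and zero on a second run.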

Sai Charishma Valluri · Answer 15 · Fri Oct 22 2021 08:44:43 GMT+0800 (China Standard Time)

@saicharishmavalluri The model is getting trained, but I get an error when I run:

    #change configs.py file reload value to 1
    !python main.py --mode repr_code

Below is the error:

Traceback (most recent call last):
  File "main.py", line 272, in <module>
    engine.load_model(model, config['training_params']['reload'])
  File "main.py", line 42, in load_model
    assert os.path.exists(model_path + f"epo{epoch}_code.h5"),f"Weights at epoch {epoch} not found"
AssertionError: Weights at epoch 1 not found

@samvaid
For how many epochs did you train your model?

Sai Charishma Valluri · Answer 16 · Fri Oct 22 2021 09:10:35 GMT+0800 (China Standard Time)

@saicharishmavalluri 2 epochs

@samvaid
I am sorry, I missed another point.
In the configs.py file, you need to change 'save_every' from 10 to 1 before training the model.
By default, the weights are saved every 10 epochs, so if you change it to 1, they will be saved after every epoch.
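
To see why epoch 1 had no weights, here is a small sketch of which epochs would be checkpointed. It assumes that the training loop counts epochs from 0 and saves when `epoch % save_every == 0`; the thread does not show the actual condition in main.py, so treat that rule and the helper name `saved_epochs` as assumptions.

```python
from typing import List


def saved_epochs(nb_epoch: int, save_every: int, first_epoch: int = 0) -> List[int]:
    """Epoch indices that would be saved under the ASSUMED rule epoch % save_every == 0."""
    if save_every <= 0:
        raise ValueError("save_every must be positive")
    return [e for e in range(first_epoch, nb_epoch) if e % save_every == 0]


print(saved_epochs(nb_epoch=2, save_every=10))  # [0]
print(saved_epochs(nb_epoch=2, save_every=1))   # [0, 1]
```

Under this assumption, training for 2 epochs with the default 'save_every' of 10 never writes weights for epoch 1, which matches the "Weights at epoch 1 not found" assertion.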

Sai Charishma Valluri · Answer 17 · Sat Oct 23 2021 06:10:24 GMT+0800 (China Standard Time)

@saicharishmavalluri

After training with save_every=1 and running !python main.py --mode repr_code, I get this error:

Traceback (most recent call last):
  File "main.py", line 272, in <module>
    engine.load_model(model, config['training_params']['reload'])
  File "main.py", line 44, in load_model
    model.load(model_path + f"epo{epoch}_code.h5", model_path + f"epo{epoch}_desc.h5")
  File "/content/deep-code-search/keras/deep-code-search/keras/models.py", line 230, in load
    self._code_repr_model.load_weights(code_model_file, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py", line 181, in load_weights
    return super(Model, self).load_weights(filepath, by_name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/network.py", line 1177, in load_weights
    saving.load_weights_from_hdf5_group(f, self.layers)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 651, in load_weights_from_hdf5_group
    original_keras_version = f.attrs['keras_version'].decode('utf8')
AttributeError: 'str' object has no attribute 'decode'

I remember getting this error. But I don't remember what I did to make it work. Maybe try training the model again or for another epoch.
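
This AttributeError is commonly reported when the installed h5py is version 3 or newer, which returns HDF5 string attributes as `str` while older TensorFlow/Keras loading code expects `bytes`. The thread does not confirm that this was the cause here, so treat it as an assumption; a frequently suggested workaround is to install an h5py release below 3.0 in the Colab environment. The sketch below only illustrates the type mismatch; `attr_as_text` is a hypothetical helper, not part of TensorFlow or the repository.

```python
def attr_as_text(value):
    """Return an HDF5 attribute as str, whether it arrived as bytes or as str."""
    if isinstance(value, bytes):
        return value.decode("utf8")
    return value


# Older h5py: the attribute is bytes, so .decode('utf8') works.
print(attr_as_text(b"2.2.4-tf"))

# Newer h5py: the attribute is already str; calling .decode on it raises the
# AttributeError seen in the traceback, while the helper returns it unchanged.
print(attr_as_text("2.2.4-tf"))
```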

Sai Charishma Valluri · Answer 18 · Sat Oct 23 2021 06:11:04 GMT+0800 (China Standard Time)

@saicharishmavalluri Did you make any other changes as well?

No, these are the only changes I made.