ratsgo / embedding

Korean Embeddings (Sentence Embeddings Using Korean Corpora)

Home Page: https://ratsgo.github.io/embedding


Question about evaluating a sentence embedding model

kksbell opened this issue

I am trying to evaluate a model with BERTEmbeddingEvaluator.

I started a container with docker, ran bash sentmodel.sh download-pretrained-bert,
then created test.py as follows and ran it:

import sys
sys.path.append('models')
from models.sent_eval import BERTEmbeddingEvaluator
model = BERTEmbeddingEvaluator(model_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt",
                               bertconfig_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt/bert_config.json",
                               vocab_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt/vocab.txt")

model.predict("이 영화 엄청 재미있네요")  # predict the label
model.get_token_vector_sequence("이 영화 엄청 재미있네요")  # extract per-token embeddings
model.get_sentence_vector("이 영화 엄청 재미있네요")  # extract the sentence embedding

I got the following error, so I am asking here.

2020-09-28 07:19:11.250226: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
File "bert_test.py", line 7, in <module>
vocab_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt/vocab.txt")
File "/notebooks/embedding/models/sent_eval.py", line 193, in __init__
saver.restore(self.sess, checkpoint_path)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1534, in restore
raise ValueError("Can't load save_path when it is None.")
ValueError: Can't load save_path when it is None.


model_checkpoint_path: "bert_model.ckpt"
all_model_checkpoint_paths: "bert_model.ckpt"

Thinking the cause might be a missing checkpoint file, I created a checkpoint file with the contents shown above
and re-ran test.py, which produced the second error:

2020-09-28 07:13:54.288704: W tensorflow/core/framework/op_kernel.cc:1273] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key fully_connected/biases not found in checkpoint
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key fully_connected/biases not found in checkpoint
[[{{node save/RestoreV2}} = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1546, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key fully_connected/biases not found in checkpoint
[[node save/RestoreV2 (defined at /notebooks/embedding/models/sent_eval.py:190) = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op 'save/RestoreV2', defined at:
File "bert_test.py", line 7, in <module>
vocab_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt/vocab.txt")
File "/notebooks/embedding/models/sent_eval.py", line 190, in __init__
saver = tf.train.Saver(tf.global_variables())
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1102, in __init__
self.build()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1114, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1151, in _build
build_save=build_save, build_restore=build_restore)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 795, in _build_internal
restore_sequentially, reshape)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 406, in _AddRestoreOps
restore_sequentially)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 862, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1466, in restore_v2
shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Key fully_connected/biases not found in checkpoint
[[node save/RestoreV2 (defined at /notebooks/embedding/models/sent_eval.py:190) = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1556, in restore
names_to_keys = object_graph_key_mapping(save_path)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1830, in object_graph_key_mapping
checkpointable.OBJECT_GRAPH_PROTO_KEY)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 371, in get_tensor
status)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "bert_test.py", line 7, in <module>
vocab_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt/vocab.txt")
File "/notebooks/embedding/models/sent_eval.py", line 193, in __init__
saver.restore(self.sess, checkpoint_path)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1562, in restore
err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key fully_connected/biases not found in checkpoint
[[node save/RestoreV2 (defined at /notebooks/embedding/models/sent_eval.py:190) = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op 'save/RestoreV2', defined at:
File "bert_test.py", line 7, in <module>
vocab_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt/vocab.txt")
File "/notebooks/embedding/models/sent_eval.py", line 190, in __init__
saver = tf.train.Saver(tf.global_variables())
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1102, in __init__
self.build()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1114, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 1151, in _build
build_save=build_save, build_restore=build_restore)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 795, in _build_internal
restore_sequentially, reshape)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 406, in _AddRestoreOps
restore_sequentially)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py", line 862, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_io_ops.py", line 1466, in restore_v2
shape_and_slices=shape_and_slices, dtypes=dtypes, name=name)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key fully_connected/biases not found in checkpoint
[[node save/RestoreV2 (defined at /notebooks/embedding/models/sent_eval.py:190) = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Any help would be greatly appreciated!

@kksbell Hello, and thank you for your question.
The code below evaluates a model that has already been fine-tuned.
Running the fine-tuning in Code 6-34 creates a checkpoint under /notebooks/embedding/data/sentence-embeddings/bert/tune-ckpt. Pass that path as model_fname in the code below.

import sys
sys.path.append('models')
from models.sent_eval import BERTEmbeddingEvaluator
model = BERTEmbeddingEvaluator(
    model_fname="/notebooks/embedding/data/sentence-embeddings/bert/tune-ckpt",
    bertconfig_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt/bert_config.json",
    vocab_fname="/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt/vocab.txt"
)
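
Before constructing the evaluator, it may help to confirm that the directory passed as model_fname really contains a TensorFlow checkpoint state file (a text file named checkpoint). tf.train.latest_checkpoint returns None when it finds no such file, which is consistent with the first error ("Can't load save_path when it is None"), presumably because sent_eval.py resolves the path that way. The following is a minimal sketch using only the standard library. The helper name read_checkpoint_state is hypothetical and not part of the repository, and the sketch assumes the state file uses the plain-text format quoted in the question.

```python
import os
import re


def read_checkpoint_state(ckpt_dir):
    """Return the checkpoint prefix recorded in ckpt_dir/checkpoint, or None.

    Hypothetical helper: it only parses the text state file and does not
    load any TensorFlow variables.
    """
    state_file = os.path.join(ckpt_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return None  # no state file, so there is nothing to restore from
    with open(state_file, encoding="utf-8") as f:
        for line in f:
            # Match only the "model_checkpoint_path" entry, not
            # "all_model_checkpoint_paths".
            match = re.match(r'\s*model_checkpoint_path:\s*"(.*)"', line)
            if match:
                # Relative entries are resolved against the checkpoint directory.
                return os.path.join(ckpt_dir, match.group(1))
    return None


if __name__ == "__main__":
    tune_dir = "/notebooks/embedding/data/sentence-embeddings/bert/tune-ckpt"
    print(read_checkpoint_state(tune_dir))
```

If this prints None for the tune-ckpt directory, fine-tuning has most likely not been run yet (or did not finish), so there is no checkpoint for the evaluator to restore.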

@kksbell, it looks like you set model_fname to "/notebooks/embedding/data/sentence-embeddings/bert/pretrain-ckpt". I would suggest running Code 6-34 and then Code 6-35 from the book, in that order.
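
The second error ("Key fully_connected/biases not found in checkpoint") suggests that the graph built by the evaluator contains variables that the pretrained checkpoint does not store, presumably the classification layer that is added during fine-tuning. A rough sketch of that comparison follows. Apart from fully_connected/biases, which comes from the error message, the variable names are illustrative and were not read from an actual checkpoint, and missing_keys is a hypothetical helper. In a real session the two lists could be gathered from tf.global_variables() and tf.train.list_variables().

```python
def missing_keys(graph_variable_names, checkpoint_variable_names):
    """Return graph variables that have no matching key in the checkpoint."""
    in_checkpoint = set(checkpoint_variable_names)
    return sorted(name for name in graph_variable_names if name not in in_checkpoint)


# Illustrative names only; a real BERT checkpoint holds many more variables.
pretrained_ckpt = ["bert/embeddings/word_embeddings", "bert/pooler/dense/kernel"]

# Assumption: the evaluator graph adds a fully connected layer on top of BERT.
evaluator_graph = pretrained_ckpt + ["fully_connected/weights", "fully_connected/biases"]

print(missing_keys(evaluator_graph, pretrained_ckpt))
# ['fully_connected/biases', 'fully_connected/weights']
```

An empty result would mean every variable in the graph can be restored, which is what one would expect from a checkpoint produced by fine-tuning with the same graph.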

Thank you for the reply!!
It was very helpful!