How to get dynamic word vectors
MarrytheToilet opened this issue · comments
Hi, I want to be able to input a sentence and get a word vector for each word, like bert-as-service does:
bert-serving-start -pooling_strategy NONE -model_dir /tmp/english_L-12_H-768_A-12/
bc = BertClient()
vec = bc.encode(['hey you', 'whats up?'])
vec # [2, 25, 768]
vec[0] # [1, 25, 768], sentence embeddings for hey you
vec[0][0] # [1, 1, 768], word embedding for [CLS]
vec[0][1] # [1, 1, 768], word embedding for hey
vec[0][2] # [1, 1, 768], word embedding for you
vec[0][3] # [1, 1, 768], word embedding for [SEP]
vec[0][4] # [1, 1, 768], word embedding for padding symbol
vec[0][25] # error, out of index!
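For clarity, the shapes above can be illustrated with a NumPy stand-in for the returned array (hypothetical zero data, not real embeddings; in practice, indexing a 3-D array like vec[0] yields a 2-D slice of shape (25, 768)):

```python
import numpy as np

# Hypothetical stand-in for the array returned by bc.encode(...):
# 2 sentences, max_seq_len 25, hidden size 768 (zeros, not real embeddings).
vec = np.zeros((2, 25, 768), dtype=np.float32)

sent = vec[0]      # shape (25, 768): token embeddings for "hey you"
cls_vec = sent[0]  # shape (768,): embedding of the [CLS] token
hey_vec = sent[1]  # shape (768,): embedding of "hey"
# sent[25] would raise IndexError: only max_seq_len positions exist
```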
@MarrytheToilet In the current implementation of the CLIP model, word-level embeddings are not returned. To support your case, we would need to refactor the encode_text(...)
API to return the full sequence of token embeddings, rather than only the eos_token embedding. May I know what downstream tasks you are working on that need word-level embeddings?
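A shape-only sketch of the difference this refactor would make, using random NumPy data as a stand-in for the encoder's hidden states (L = 77 and D = 512 here are assumptions matching CLIP ViT-B/32's context length and text embedding dimension):

```python
import numpy as np

# Hypothetical per-token hidden states from the text encoder (random values).
L, D = 77, 512
token_states = np.random.rand(L, D).astype(np.float32)
eos_pos = 5  # example position of the eos token in the sequence

pooled = token_states[eos_pos]  # current behavior: a single (D,) sentence vector
full = token_states             # proposed: the whole (L, D) token-level sequence
```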
Thank you for your reply. I am an undergraduate completing my graduation project, and I hope to detect metaphors in text by extracting dynamic (contextual) word vectors.
I have now met my requirements using bert-as-service, thank you very much!
Nice, I will close this issue now. Please feel free to open a new issue if you have further questions.