google-research / bert

TensorFlow code and pre-trained models for BERT

How to get the word embedding after pre-training?

mfxss opened this issue


I am excited about this great model, and I want to get the word embeddings. Where should I find them in the output files, or should I change the code to do this?

If you want to get the contextual embeddings (like ELMo) see the section here.

If you want the actual word embeddings, the word->id mapping is just the index of the row in vocab.txt, and the embedding matrix is in bert_model.ckpt with the variable name bert/embeddings/word_embeddings.
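The lookup described above can be sketched as follows. This is a minimal sketch, not code from the BERT repository: the helper names `load_vocab`, `load_word_embeddings`, and `embedding_for` are hypothetical, and it assumes `tf.train.load_variable` can read the released checkpoint by its prefix path.

```python
def load_vocab(vocab_path):
    """Map each token to its row index (the 0-based line number in vocab.txt)."""
    vocab = {}
    with open(vocab_path, encoding="utf-8") as f:
        for index, line in enumerate(f):
            vocab[line.rstrip("\n")] = index
    return vocab


def load_word_embeddings(checkpoint_prefix):
    """Read the embedding matrix from the checkpoint as a NumPy array.

    checkpoint_prefix is assumed to be the path to bert_model.ckpt
    (the prefix, without the .index / .data suffixes).
    """
    import tensorflow as tf  # imported lazily so load_vocab works without TensorFlow

    return tf.train.load_variable(
        checkpoint_prefix, "bert/embeddings/word_embeddings"
    )


def embedding_for(token, vocab, matrix):
    """Return the embedding row for a token; raises KeyError if it is not in the vocab."""
    return matrix[vocab[token]]
```

Assuming the model archive is unpacked into a directory such as chinese_L-12_H-768_A-12, you would call `load_vocab` on its vocab.txt, `load_word_embeddings` on its bert_model.ckpt prefix, and then `embedding_for("[CLS]", vocab, matrix)` to get one row.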


I also downloaded your released chinese_L-12_H-768_A-12 model. In vocab.txt, I found some tokens such as
[unused1] [CLS] [SEP] [MASK] <S> <T>.
What do these tokens mean?

The [CLS], [SEP] and [MASK] tokens are used as described in the paper and README. The [unused] tokens were not used in our model and are randomly initialized.


What is the training data for chinese_L-12_H-768_A-12, and what is its size?

It's Chinese Wikipedia with both Traditional and Simplified characters.

Hello @mfxss,
Not sure if you are still having trouble getting word embeddings from BERT. I implemented a BERT embedding library that lets you get word embeddings programmatically.

Because I work closely with the mxnet and gluonnlp teams, my implementation uses mxnet and gluonnlp. However, I am trying to implement it in other frameworks as well.

I hope my work helps.

Hey guys, if you don't want to install an extra module, here is an example:

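import tensorflow as tf

# Path to the SavedModel directory (HOME_DIR is a placeholder).
BERT_PATH = 'HOME_DIR/bert_en_uncased_L-12_H-768_A-12'

imported = tf.saved_model.load(BERT_PATH)

# Find the word-embedding matrix among the trainable variables by name.
for i in imported.trainable_variables:
    if i.name == 'bert_model/word_embeddings/embeddings:0':
        embeddings = i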

embeddings is then the word-embedding tensor you want!

Hi @jacobdevlin-google, thanks for the pointers. I see that the output gives subword representations. I'm probably missing something, but how can we get a word (not subword) representation instead? Thanks in advance for your help!

Excuse me, did you find a solution for getting word (not subword) representations?
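No answer to this was posted in the thread. One common heuristic, which is not confirmed by the BERT authors here, is to average the vectors of the WordPiece tokens that make up each word. The sketch below assumes the tokens use the `##` continuation prefix and that each vector is a plain list of floats; the function name `merge_wordpieces` is hypothetical.

```python
def merge_wordpieces(tokens, vectors):
    """Average WordPiece vectors so that each whole word gets one vector.

    tokens:  WordPiece tokens, where continuation pieces start with "##".
    vectors: one list of floats per token, all of the same length.
    Returns (words, pooled_vectors).
    """
    words, groups = [], []
    for token, vector in zip(tokens, vectors):
        if token.startswith("##") and words:
            # Continuation piece: append it to the current word.
            words[-1] += token[2:]
            groups[-1].append(vector)
        else:
            # First piece of a new word.
            words.append(token)
            groups.append([vector])
    # Element-wise mean over the pieces of each word.
    pooled = [
        [sum(column) / len(group) for column in zip(*group)]
        for group in groups
    ]
    return words, pooled
```

Other pooling choices, such as keeping only the first piece of each word, are also used in practice; averaging is just one option. For the Chinese model most tokens are single characters, so this merging rarely applies there.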