This repository contains the implementation code of the IMMN model.
Before running the project, you need to download our datasets. The Wiki4MNED dataset can be downloaded from Wiki4MNED (password: 6b8b). The WikilinksNED dataset was generated by this project: we ran the processing code to obtain a candidate WikilinksNED dataset. Note that this dataset may differ slightly from the original, because the version of the Wikipedia dump file we chose, as well as other required files, may differ. Given the huge data scale, this difference is negligible. Since the data is large and processing takes a long time, we also provide an optional processing result here (password: id04).
## Running environment
* python 3.6
* TensorFlow 1.8.0
* numpy 1.17.4
* gensim 3.8.1
* keras 1.2.2
* nltk 3.4.5
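The versions above can be pinned in a `requirements.txt`; this file is not part of the original repository, it is just a sketch of the same dependency list:

```
tensorflow==1.8.0
numpy==1.17.4
gensim==3.8.1
keras==1.2.2
nltk==3.4.5
```

Install with `pip install -r requirements.txt` under Python 3.6.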
## Run
Run the Wiki4MNED_Our.py file directly for the Wiki4MNED dataset.
Run the Wikilinks_Our.py file directly for the Wikilinks dataset.
To perform ablation studies, configure the parameter options directly in Wiki4MNED_Our.py and Wikilinks_Our.py, for example:
```python
model.use_interactive_attention = True
model.use_self_attention = True
model.use_mention_img_embedding = True
model.use_entity_img_embedding = True
model.use_tranE_struct_embedding = False
model.use_fertures = True
model.use_mention_attention = True
```
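As a rough illustration of how such boolean flags can gate model components in an ablation run, here is a minimal sketch; the class and helper below are hypothetical and mirror the option names only, they are not the actual IMMN code:

```python
# Hypothetical sketch: boolean ablation flags collected on a config object.
# Setting any flag to False removes that component from the run.

class IMMNConfig:
    def __init__(self):
        self.use_interactive_attention = True
        self.use_self_attention = True
        self.use_mention_img_embedding = True
        self.use_entity_img_embedding = True
        self.use_tranE_struct_embedding = False
        self.use_fertures = True
        self.use_mention_attention = True

def active_components(cfg):
    """Return the names of the components left enabled, e.g. for logging an ablation run."""
    return [name for name, on in vars(cfg).items()
            if name.startswith("use_") and on]

cfg = IMMNConfig()
cfg.use_self_attention = False  # ablate one component
print(active_components(cfg))
```

Logging the enabled components this way makes it easy to check which ablation variant a given result came from.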
Other parameters of the model can also be fine-tuned in the config.py file.
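For orientation, a config.py-style file typically collects training hyperparameters as plain module-level constants. The names and values below are purely hypothetical placeholders, not the actual settings of this repository:

```python
# Hypothetical config.py-style sketch; the real parameter names and values
# in this repository may differ.
learning_rate = 0.001   # optimizer step size
batch_size = 64         # examples per training step
embedding_dim = 300     # word/entity embedding size
max_epochs = 20         # training epochs
```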
## Data
The methods and models used for data preprocessing can be found in the project, but due to the large data size, we only provide the processed data and embeddings here.