sonhkim / APGHNLPtutorial

HybridNLP - Tutorial on Hybrid Techniques for Knowledge-based NLP

Knowledge graphs meet machine learning and all their friends

This repo contains the notebooks and other materials for the HybridNLP tutorial (http://hybridnlp.expertsystemlab.com/tutorial/).

Many different artificial intelligence techniques can be used to explore and exploit the large document corpora available inside organizations and on the Web. While natural language is symbolic in nature and the first approaches in the field were symbolic and rule-based, many of the most widely used methods today are neural. Each of these two main schools of thought in natural language processing has its strengths and limitations, and there is an increasing trend to combine them in complementary ways to get the best of both worlds. This tutorial covers the foundations and modern practical applications of knowledge-based and neural methods, techniques and models, and their combination, for exploiting large document corpora. The tutorial first focuses on the foundations that can be used for this purpose, including knowledge graphs, word embeddings, and language models. It then shows how these techniques can be effectively combined in NLP tasks, and applied to data modalities other than text, in the context of research and innovation projects.

How to run the tutorial notebooks:

  1. Sign in to your Google account and go to “Hello, Colaboratory”: https://colab.research.google.com
  2. Download the tutorial notebooks from the tutorial repo on GitHub: https://github.com/hybridnlp/tutorial
  3. Open the notebooks (warning: some of the notebooks, e.g. notebook 08, may take a while to load data and/or model weights)
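
As an alternative to downloading and uploading the notebooks by hand, Colab can open a notebook straight from a public GitHub repository through a URL of the form `https://colab.research.google.com/github/<owner>/<repo>/blob/<branch>/<path>`. The following is a minimal sketch of building such a link in Python. The helper name `colab_url`, the default branch `master`, and the notebook file name `example_notebook.ipynb` are assumptions made for illustration; check the repository for the actual branch and file names.

```python
from urllib.parse import quote

# Prefix Colab uses for opening notebooks hosted on GitHub.
COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github"


def colab_url(owner: str, repo: str, notebook_path: str, branch: str = "master") -> str:
    """Build a Colab link for a notebook stored in a public GitHub repo.

    The default branch name is an assumption; pass the real one if it differs.
    """
    # Percent-encode the path so that spaces in file names survive in the URL;
    # forward slashes are left untouched by quote() by default.
    safe_path = quote(notebook_path.lstrip("/"))
    return f"{COLAB_GITHUB_PREFIX}/{owner}/{repo}/blob/{branch}/{safe_path}"


if __name__ == "__main__":
    # "example_notebook.ipynb" is a placeholder, not a file name taken from the repo.
    print(colab_url("hybridnlp", "tutorial", "example_notebook.ipynb"))
```

Pasting the printed link into a browser while signed in to a Google account should open the notebook in Colab, provided the path and branch exist in the repository.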

How to cite this tutorial:

If you found this tutorial helpful, please cite some of the following papers:

Ronald Denaux and Jose Manuel Gomez-Perez. 2019. Vecsigrafo: Corpus-based Word-Concept Embeddings. Semantic Web (2019), 1–28. https://doi.org/10.3233/SW-190361

@article{Vecsigrafo19,
title={Vecsigrafo: Corpus-based Word-Concept Embeddings},
author={Ronald Denaux and Jose Manuel Gomez-Perez},
journal={Semantic Web},
year={2019},
pages={1-28},
doi = {10.3233/SW-190361}}

Andres Garcia-Silva, Cristian Berrio and Jose Manuel Gomez-Perez. 2019. An Empirical Study on Pre-trained Embeddings and Language Models for Bot Detection. RepL4NLP@ACL 2019: 148-155

@inproceedings{garcia-silva-etal-2019-empirical,
title = {An Empirical Study on Pre-trained Embeddings and Language Models for Bot Detection},
author = {Garcia-Silva, Andres and Berrio, Cristian and Gomez-Perez, Jose Manuel},
booktitle = {Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)},
month = {August},
year = {2019},
address = {Florence, Italy},
publisher = {Association for Computational Linguistics},
url = {https://www.aclweb.org/anthology/W19-4317},
doi = {10.18653/v1/W19-4317},
pages = {148--155}}

Ronald Denaux and Jose Manuel Gomez-Perez. 2019. Assessing the Lexico-Semantic Relational Knowledge Captured by Word and Concept Embeddings. In Proceedings of the 10th international conference on Knowledge capture (K-CAP '19), Mayank Kejriwal and Pedro Szekely (Eds.). ACM, New York, NY, USA. DOI: https://doi.org/10.1145/3360901.3364445

@inproceedings{embrela19,
author = {Denaux, Ronald and Gomez-Perez, Jose Manuel},
title = {Assessing the Lexico-Semantic Relational Knowledge Captured by Word and Concept Embeddings},
booktitle = {Proceedings of the 10th International Conference on Knowledge Capture},
series = {K-CAP '19},
year = {2019},
isbn = {978-1-4503-7008-0/19/11},
location = {Marina del Rey, CA, USA},
pages = {},
numpages = {8},
url = {},
doi = {10.1145/3360901.3364445},
acmid = {},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {embedding evaluation, lexico-semantic relations, knowledge graphs}}

Jose Manuel Gomez-Perez and Raul Ortega. 2019. Look, Read and Enrich. Learning from Scientific Figures and their Captions. In Proceedings of the 10th international conference on Knowledge capture (K-CAP '19), Mayank Kejriwal and Pedro Szekely (Eds.). ACM, New York, NY, USA. DOI: https://doi.org/10.1145/3360901.3364420

@inproceedings{LookReadEnrich19,
author = {Gomez-Perez, Jose Manuel and Ortega, Raul},
title = {Look, Read and Enrich. Learning from Scientific Figures and their Captions},
booktitle = {Proceedings of the 10th International Conference on Knowledge Capture},
series = {K-CAP '19},
year = {2019},
isbn = {978-1-4503-7008-0/19/11},
location = {Marina del Rey, CA, USA},
pages = {},
numpages = {8},
url = {},
doi = {10.1145/3360901.3364420},
acmid = {},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {scientific figures, figure-caption correspondence, knowledge graphs, transfer learning, multimodal machine comprehension}}

Acknowledgements

We gratefully acknowledge the EU Horizon 2020 programme, under grants European Language Grid-825627 and Co-inform-770302, for its support in producing the current version of this tutorial. We are also thankful to the previous projects DANTE-700367, TRIVALENT-740934 and GRESLADIX-IDI-20160805.

About

License: MIT License