computational-social-science huggingface natural-language-processing workshop-materials

From NLP to CSS: A Practical Tutorial on using Transformers in your Research

This page is an open repository that provides the material from our tutorial on Transformers 🤖, HuggingFace 🤗 and Social Science Applications 👥, which was presented by Christopher Klamm, Moritz Laurer and Elliott Ash at the 7th International Conference on Computational Social Science.

Overview

Transformers have revolutionised Natural Language Processing (NLP) since 2018. From text classification to text summarization to translation, the state of the art for most NLP tasks today is dominated by this new type of deep learning model architecture. This tutorial will introduce participants to this type of model, demonstrate how thousands of pre-trained models can be used with a few lines of code, outline with concrete examples how they can serve as powerful measurement tools in social science, and dive deeper into the technical ingredients of transformers. A key part of the tutorial will be the HuggingFace Transformers library and its open-source community. We hope to spark the passion of the text-as-data community to contribute to and benefit from open-source transformers, and to create closer ties with the NLP community.
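As a rough illustration of the "few lines of code" claim, the sketch below runs a pre-trained sentiment classifier through the `pipeline` function of the HuggingFace Transformers library and then aggregates the predictions into label shares, a simple example of using a model as a measurement tool. It is not taken from the tutorial notebooks. The function names `classify_texts` and `label_shares` are hypothetical helpers introduced here, and the sketch assumes that `transformers` and a backend such as PyTorch are installed.

```python
from collections import Counter


def classify_texts(texts, task="sentiment-analysis"):
    """Classify a list of strings with the default pre-trained model for `task`."""
    # Imported lazily so the helper below can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline(task)  # downloads a default model on first use
    return classifier(texts)     # list of dicts with "label" and "score" keys


def label_shares(results):
    """Turn pipeline output into the share of texts per predicted label."""
    counts = Counter(result["label"] for result in results)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}
```

Calling `label_shares(classify_texts(["The reform was a success.", "The talks collapsed."]))` would return the proportion of texts assigned to each label. The label names depend on the model that the pipeline loads.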

Structure

Introduction without math to transformer-based Language Models [Slides] (8.8.21, updated compact version)
HuggingFace, key NLP tasks and pipelines
Core transformers classes: tokenizers and models
Programming tutorial (Python) to train your first model
Theoretical background on transformer-based Language Models [Slides]
Social Science applications [Slides]
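
To make the "core transformers classes" item in the list above more concrete, here is a minimal sketch of how a tokenizer and a model typically work together for text classification. It is an illustration under stated assumptions, not the code used in the tutorial. `predict_labels` is a hypothetical helper, the default checkpoint name is just one publicly available sentiment model chosen for the example, and the sketch assumes that `transformers` and PyTorch are installed.

```python
def predict_labels(texts, model_name="distilbert-base-uncased-finetuned-sst-2-english"):
    """Predict one label per text using a tokenizer and a classification model."""
    # Imported lazily; both libraries are assumed to be installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # The tokenizer maps raw strings to token ids; the model maps ids to logits.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    # Pad and truncate so that all texts fit into one rectangular batch.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**batch).logits

    # Pick the highest-scoring class and translate its id into a label name.
    predicted_ids = logits.argmax(dim=-1).tolist()
    return [model.config.id2label[class_id] for class_id in predicted_ids]
```

The pipeline abstraction wraps these same steps. Working with the tokenizer and model directly becomes useful when fine-tuning or when the raw logits are needed.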

Related Resources

There are many other excellent resources on this topic. Here are a few links for further reading:

Noah Smith (2021): Language Models: Challenges and Progress [Video]
Sebastian Ruder (2021): Recent Advances in Language Model Fine-tuning [Link]
Lena Voita (2021): NLP Course | For You [Link]
Pavlos Protopapas, Mark Glickman, and Chris Tanner (2021): CS109b: Advanced Topics in Data Science [Slides]
Jay Alammar (2021): Language Processing with BERT: The 3 Minute Intro (Deep learning for NLP) [Video]
Melanie Walsh (2021): BERT for Humanists [Tutorial]
KyungHyun Cho (2020): Language modeling [Video]
Rachael Tatman (2020): NLP for Developers: BERT [Video]
Peter Bloem (2020): Lecture 12.1 Self-attention [Video]
Jay Alammar (2018): The Illustrated Transformer [Blog]

Note: Don't hesitate to send us a message if something is broken or if you have further questions.

About

Tutorial on Transformers 🤖, HuggingFace 🤗 and Social Science Applications 👥 @ IC2S2
