ArkAung / transformer_tutorial

Tutorial for Transformer Architecture in Neural Networks

Transformer Tutorial

Welcome to the Transformer Tutorial repository, a tutorial on the Transformer architecture in neural networks. The Transformer, introduced in the paper "Attention Is All You Need" (Vaswani et al., 2017), replaces recurrence with attention and has become a standard architecture for sequence-to-sequence tasks. This tutorial walks through the architecture with working code and explanatory notebooks.

Repository Contents

This repository contains several Python files and two Jupyter notebooks, each with a specific purpose:

  • attention.py: Implements the attention mechanism used in the Transformer model.
  • transformer.py: Implements the Transformer model.
  • language_model.py: Implements a simple language model built from the transformer blocks in transformer.py.
  • utils.py: Contains utility functions used across the project.
  • main.py: Ties everything together. To test the repository, install the requirements and run python main.py.
  • intuition_behind_word_embeddings_with_positional_information.ipynb: Provides visuals and explanations of positional encoding for short and long sequences.
  • learning_process_in_transformers.ipynb: Explores how transformers learn.
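As a rough illustration of what an attention module such as attention.py computes, here is a minimal sketch of single-head scaled dot-product attention as defined in the paper, softmax(QK^T / sqrt(d_k))V. It uses plain Python lists to stay dependency-free. The function names and signatures are assumptions made for this example and are not taken from the repository, which may use a tensor library and add masking or multiple heads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V.

    queries: list of n_q vectors, each of length d_k
    keys:    list of n_k vectors, each of length d_k
    values:  list of n_k vectors, each of length d_v
    Returns (outputs, weights). Hypothetical helper, not the repository's API.
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        row = softmax(scores)
        weights.append(row)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(row, values)) for j in range(d_v)])
    return outputs, weights


# Usage: one query attending over two key/value pairs.
example_outputs, example_weights = scaled_dot_product_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

Each row of the returned weights sums to 1, and the query puts more weight on the key it is most aligned with.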

Getting Started

To get started with this tutorial, follow these steps:

  1. Clone the repository to your local machine.
  2. Install the Python packages listed in requirements.txt.
  3. Run main.py to run the model end to end.
  4. Run the Jupyter notebook intuition_behind_word_embeddings_with_positional_information.ipynb to understand the role that positional encoding plays.
  5. Run the Jupyter notebook learning_process_in_transformers.ipynb to understand how transformers learn.
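For readers who want a feel for positional encoding before opening the notebook in step 4, the sketch below builds the sinusoidal encoding from "Attention Is All You Need": even dimensions use sine, odd dimensions use cosine, and each pair of dimensions shares one frequency. This is an assumption about the scheme; the notebook may present a different variant, and the function name positional_encoding is hypothetical.

```python
import math


def positional_encoding(seq_len, d_model, base=10000.0):
    """Return a seq_len x d_model table of sinusoidal position encodings.

    PE[pos][2k]   = sin(pos / base^(2k / d_model))
    PE[pos][2k+1] = cos(pos / base^(2k / d_model))
    Hypothetical helper for illustration, not the notebook's code.
    """
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Dimensions 2k and 2k+1 share the same frequency.
            angle = pos / base ** (2 * (i // 2) / d_model)
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


# Usage: encodings for a short and a longer sequence share their first rows,
# because each row depends only on the position index and d_model.
short = positional_encoding(seq_len=4, d_model=6)
longer = positional_encoding(seq_len=16, d_model=6)
```

These vectors are typically added to the word embeddings so that otherwise order-blind attention can distinguish positions.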

Graph of a Simple Transformer Network

[Figure: graph of a Transformer with a single attention head and a single block]
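To make the single-head, single-block configuration concrete, here is a minimal forward-pass sketch in plain Python. It follows the post-norm layout of the original paper (attention, residual, layer norm, feed-forward, residual, layer norm). Everything here is an assumption for illustration: the function and parameter names are hypothetical, biases, masking, dropout, and learned layer-norm parameters are omitted, and the repository's transformer.py may be organized differently.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x m) matrix by an (m x p) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(row, eps=1e-5):
    """Normalize one token vector to zero mean and roughly unit variance."""
    mean = sum(row) / len(row)
    var = sum((v - mean) ** 2 for v in row) / len(row)
    return [(v - mean) / math.sqrt(var + eps) for v in row]


def add(a, b):
    """Element-wise sum of two equally shaped matrices (the residual connection)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transformer_block(x, w_q, w_k, w_v, w_o, w_1, w_2):
    """One block with one attention head. x is seq_len x d_model."""
    d_k = len(w_q[0])
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    # scores[i][j] = (q_i . k_j) / sqrt(d_k)
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    attended = matmul(matmul(weights, v), w_o)
    # Residual connection followed by layer norm (post-norm).
    x = [layer_norm(row) for row in add(x, attended)]
    # Position-wise feed-forward network with a ReLU in between.
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, w_1)]
    return [layer_norm(row) for row in add(x, matmul(hidden, w_2))]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model, d_ff, seq_len = 4, 8, 3
tokens = random_matrix(seq_len, d_model, rng)
params = [random_matrix(d_model, d_model, rng) for _ in range(4)]  # w_q, w_k, w_v, w_o
params += [random_matrix(d_model, d_ff, rng), random_matrix(d_ff, d_model, rng)]
block_output = transformer_block(tokens, *params)
```

The output has the same shape as the input (seq_len x d_model), which is what allows blocks to be stacked.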

Contributing

Contributions to this tutorial are welcome. If you have any ideas or improvements, feel free to open an issue or submit a pull request. Please make sure to read the CONTRIBUTING.md file before making any contributions.

License

This project is licensed under the MIT License. Please see the LICENSE file for more details.

Security

For any security concerns, please refer to the SECURITY.md file.

Contact

If you have any questions or feedback, feel free to reach out to the repository owner. Your feedback is much appreciated!


