
DigammaAttention

This repository contains a PyTorch implementation of a Transformer-based language model with optional digamma integration. It is released under the MIT open source license.

Key Features:

- Utilizes a Transformer encoder architecture for language modeling tasks.
- Offers the option to incorporate a digamma function for potential performance enhancements (a minimal sketch follows this list).
- Implements safe sine operations for numerical stability.
- Enforces causality constraints through attention masking.
- Provides a clear and concise implementation for ease of understanding and modification.
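The repository's exact formulation is not spelled out in this README, so the following is only a minimal sketch of how a digamma term, a safe sine operation, and a causal mask could be combined in scaled dot-product attention. The `safe_sine` helper, the clamping range, and the way `torch.digamma` modulates the scores are illustrative assumptions, not the repository's actual code:

```python
import torch
import torch.nn.functional as F

def safe_sine(x, bound=1e4):
    # Assumed "safe" sine: clamp the argument to a bounded range so that
    # extreme activations cannot produce unstable values downstream.
    return torch.sin(torch.clamp(x, min=-bound, max=bound))

def digamma_attention(q, k, v, use_digamma=True):
    # q, k, v: (batch, heads, seq_len, head_dim)
    q, k = safe_sine(q), safe_sine(k)                      # assumed placement of the safe sine
    scores = q @ k.transpose(-2, -1) / q.size(-1) ** 0.5   # scaled dot-product scores
    if use_digamma:
        # Hypothetical digamma modulation of the raw attention scores;
        # torch.digamma is PyTorch's built-in digamma function.
        scores = scores + torch.digamma(scores.abs() + 1.0)
    # Enforce causality: position i may only attend to positions <= i.
    seq_len = scores.size(-1)
    causal = torch.triu(
        torch.ones(seq_len, seq_len, dtype=torch.bool, device=scores.device),
        diagonal=1,
    )
    scores = scores.masked_fill(causal, float("-inf"))
    return F.softmax(scores, dim=-1) @ v
```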

Getting Started:

Install dependencies:

```bash
pip install torch transformers
```

Import the model:

```python
from llm import LLM
```

Instantiate the model:

```python
model = LLM(vocab_size, embed_dim, hidden_dim, num_layers, dropout, use_digamma=True)
```
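For illustration, the constructor arguments might be filled with small values like the following; the specific numbers are arbitrary choices for a toy setup, not values taken from the repository:

```python
vocab_size = 30522   # e.g. the size of a BERT-style tokenizer vocabulary
embed_dim = 256      # token embedding width
hidden_dim = 512     # feed-forward / hidden size inside the encoder
num_layers = 4       # number of Transformer encoder layers
dropout = 0.1        # dropout probability

model = LLM(vocab_size, embed_dim, hidden_dim, num_layers, dropout, use_digamma=True)
```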

Pass input sequences to the model:

```python
logits = model(inputs)
```
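A minimal end-to-end sketch, assuming the model returns per-position logits of shape (batch, seq_len, vocab_size) and using a Hugging Face tokenizer (bert-base-uncased, chosen only for illustration) to produce input IDs:

```python
import torch
from transformers import AutoTokenizer
from llm import LLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = LLM(tokenizer.vocab_size, 256, 512, 4, 0.1, use_digamma=True)
model.eval()

# Encode a prompt into a (1, seq_len) tensor of token IDs.
inputs = tokenizer("Digamma attention is", return_tensors="pt")["input_ids"]

with torch.no_grad():
    logits = model(inputs)  # assumed shape: (1, seq_len, vocab_size)

# Greedy next-token prediction from the final position.
next_id = logits[0, -1].argmax().item()
print(tokenizer.decode([next_id]))
```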

About

License: MIT