YanSte / NLP-LLM-Fine-tuning-DeepSpeed

Natural Language Processing (NLP) and Large Language Models (LLMs): fine-tuning an LLM with the Trainer and DeepSpeed

Home Page: https://www.kaggle.com/code/yannicksteph/nlp-llm-fine-tuning-trainer-deepspeed


| NLP | LLM | Fine-tuning | Trainer | DeepSpeed |


Overview

In this notebook, we're going to fine-tune an LLM.

Many LLMs are general-purpose models trained on a broad range of data and use cases. This enables them to perform well across a variety of applications, as shown in previous modules. It is not uncommon, though, to find situations where a general-purpose model performs unacceptably on a specific dataset or use case. This often does not mean the model is unusable: with some new data and additional training, it can often be improved, or fine-tuned, so that it produces acceptable results for the specific use case.


Fine-tuning uses a pre-trained model as a base and continues to train it on a new, task-targeted dataset. Conceptually, fine-tuning leverages what the model has already learned and focuses it further on a specific task.

It is important to recognize that fine-tuning is still model training: the process remains resource-intensive and time-consuming, albeit greatly shortened by starting from a pre-trained model.
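
The sketch below illustrates the idea with the Hugging Face `Trainer`, which this notebook uses: load the pre-trained `t5-small` weights and continue training on new, task-targeted examples. The two-example dataset and the hyperparameters are illustrative placeholders, not the notebook's actual setup.

```python
# A minimal fine-tuning sketch: start from pre-trained weights and keep
# training on new, task-targeted data. The toy dataset and hyperparameters
# are illustrative placeholders only.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # pre-trained base

def encode(batch):
    enc = tokenizer(batch["text"], truncation=True)
    enc["labels"] = tokenizer(batch["target"], truncation=True)["input_ids"]
    return enc

toy = Dataset.from_dict({
    "text": ["classify sentiment: a wonderful film",
             "classify sentiment: dull and far too slow"],
    "target": ["positive", "negative"],
}).map(encode, batched=True, remove_columns=["text", "target"])

trainer = Trainer(
    model=model,                      # continue from the pre-trained checkpoint
    args=TrainingArguments(output_dir="t5-small-finetuned",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=toy,                # the new, task-targeted dataset
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()  # far cheaper than pre-training from scratch
```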


DeepSpeed
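
DeepSpeed is Microsoft's deep-learning optimization library; its ZeRO optimizer partitions optimizer states, gradients, and (at stage 3) parameters across GPUs to reduce memory usage. With the Hugging Face Trainer, DeepSpeed is enabled by passing a JSON config via `TrainingArguments(deepspeed=...)`. Below is a minimal sketch of that wiring; the ZeRO stage, file name, and batch settings are illustrative choices, and the `"auto"` values are filled in by the Trainer integration.

```python
# Minimal sketch of enabling DeepSpeed ZeRO stage 2 through the Hugging Face
# Trainer integration. The config values and file name are illustrative.
import json

from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {
        "stage": 2,                   # partition optimizer states + gradients
        "overlap_comm": True,         # overlap communication with compute
        "contiguous_gradients": True,
    },
    "fp16": {"enabled": "auto"},              # mirrors TrainingArguments.fp16
    "train_micro_batch_size_per_gpu": "auto", # filled in by the Trainer
    "gradient_accumulation_steps": "auto",
}
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)

args = TrainingArguments(
    output_dir="t5-small-deepspeed",
    per_device_train_batch_size=8,
    fp16=True,
    deepspeed="ds_config.json",  # hand the config file to DeepSpeed
)
```

Note that a DeepSpeed run must be started with a distributed launcher such as `deepspeed` or `torchrun`, not a plain `python` invocation.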

Learning Objectives

By the end of this notebook, you will be able to:

  1. Prepare a novel dataset (sketched just after this list).
  2. Fine-tune the t5-small model to classify movie reviews.
  3. Use DeepSpeed to accelerate and scale the training.
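
As a sketch of objectives 1 and 2, the snippet below prepares a movie-review dataset in `t5-small`'s text-to-text format. IMDB, the prompt prefix, and the label words are illustrative stand-ins for the notebook's actual dataset and prompts.

```python
# Sketch of dataset preparation for movie-review classification with T5.
# IMDB is an illustrative stand-in for the notebook's dataset; the prompt
# prefix and label words are assumptions.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
raw = load_dataset("imdb", split="train[:2000]")  # small slice for the demo

def to_text_to_text(batch):
    # T5 is text-to-text, so classification becomes generating a label word.
    enc = tokenizer(["classify sentiment: " + t for t in batch["text"]],
                    max_length=256, truncation=True)
    enc["labels"] = tokenizer(
        ["positive" if label == 1 else "negative" for label in batch["label"]],
        max_length=4, truncation=True)["input_ids"]
    return enc

train_ds = raw.map(to_text_to_text, batched=True,
                   remove_columns=raw.column_names)  # ready for the Trainer
```

The resulting `train_ds` can be passed to the `Trainer` exactly as in the fine-tuning sketch above, and combined with the DeepSpeed config to scale the run.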



Languages

Language:Jupyter Notebook 100.0%