This repo contains the notebooks for the Large Language Models: Application through Production course on edX & Databricks Academy.
- You first need to add Git credentials to Databricks. Refer to the documentation here.
- Click Repos in the sidebar. Click Add Repo on the top right.
- Copy the "HTTPS" clone URL from GitHub, or copy https://github.com/databricks-academy/large-language-models.git, and paste it into the Git repository URL box. The rest of the fields, i.e. Git provider and Repository name, will be automatically populated. Click Create Repo on the bottom right.
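
The auto-population of the remaining fields can be pictured with a short sketch. The helper `repo_fields` and its inference rules are hypothetical and written only for illustration; they are not Databricks's actual logic.

```python
from urllib.parse import urlparse

# Hypothetical mapping from Git host to the provider label shown in the form.
KNOWN_PROVIDERS = {"github.com": "GitHub"}


def repo_fields(url: str) -> dict:
    """Guess the Git provider and repository name from an HTTPS clone URL."""
    parsed = urlparse(url)
    # Take the last path segment and drop a trailing ".git" if present.
    name = parsed.path.rstrip("/").rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return {
        "url": url,
        "provider": KNOWN_PROVIDERS.get(parsed.netloc.lower(), "unknown"),
        "name": name,
    }


fields = repo_fields(
    "https://github.com/databricks-academy/large-language-models.git"
)
print(fields["provider"], fields["name"])
```

If the form does not fill in the fields as expected, entering the provider and repository name by hand should work just as well.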
- First, select Single Node.
- This courseware has been tested on Databricks Runtime 13.1 for Machine Learning. If you do not have access to a 13.1 ML Runtime cluster, you will need to install many additional libraries yourself, because the ML Runtime pre-installs many commonly used machine learning packages, and this courseware is not guaranteed to run.
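
A notebook can check which runtime it is attached to before running the course material. The sketch below assumes the cluster exposes its version through the `DATABRICKS_RUNTIME_VERSION` environment variable; the helper `runtime_matches` is hypothetical, and the check cannot tell whether the runtime is the ML variant.

```python
import os

TESTED_RUNTIME = "13.1"  # version named in this README


def runtime_matches(version, tested=TESTED_RUNTIME):
    """Return True when `version` is the tested release or a patch of it."""
    if not version:
        return False
    return version == tested or version.startswith(tested + ".")


# Assumption: this variable is set on the cluster; it is absent elsewhere.
current = os.environ.get("DATABRICKS_RUNTIME_VERSION")
if not runtime_matches(current):
    print(f"Warning: courseware was tested on {TESTED_RUNTIME} ML, found {current!r}")
```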
All of the notebooks except LLM 04a - Fine-tuning LLMs and LLM04L - Fine-tuning LLMs Lab run fine on a CPU. We recommend either i3.xlarge or i3.2xlarge (i3.2xlarge will have slightly faster performance).
For LLM 04a - Fine-tuning LLMs and LLM04L - Fine-tuning LLMs Lab, you will need the Databricks Runtime 13.1 for Machine Learning with GPU. Select the GPU instance type g5.2xlarge.
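
The choices above can be summarized as a cluster specification. This is a minimal sketch, not an official configuration: the helper `cluster_spec` and the cluster name are hypothetical, the field names follow the Databricks Clusters API as commonly documented, and the runtime version keys are assumptions to verify against the versions your workspace lists.

```python
# Notebook names as given in this README.
GPU_NOTEBOOKS = {
    "LLM 04a - Fine-tuning LLMs",
    "LLM04L - Fine-tuning LLMs Lab",
}


def cluster_spec(notebook: str, faster_cpu: bool = False) -> dict:
    """Build a single-node cluster spec suited to the given notebook."""
    if notebook in GPU_NOTEBOOKS:
        node_type = "g5.2xlarge"
        runtime = "13.1.x-gpu-ml-scala2.12"  # assumed version key
    else:
        node_type = "i3.2xlarge" if faster_cpu else "i3.xlarge"
        runtime = "13.1.x-cpu-ml-scala2.12"  # assumed version key
    return {
        "cluster_name": "llm-course",  # hypothetical name
        "spark_version": runtime,
        "node_type_id": node_type,
        # Single Node: no workers, Spark runs locally on the driver.
        "num_workers": 0,
        "spark_conf": {
            "spark.databricks.cluster.profile": "singleNode",
            "spark.master": "local[*]",
        },
        "custom_tags": {"ResourceClass": "SingleNode"},
    }


print(cluster_spec("LLM 04a - Fine-tuning LLMs")["node_type_id"])
```

The same settings can be entered by hand in the cluster creation form; the dictionary is only a compact way to see which notebook needs which instance type.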