On April 18th, 2024, Meta released the Llama 3 large language model (LLM). We can leverage the machine learning capabilities of Apple Silicon to run this model locally and receive answers to our questions.
Quick guide:
- Download and install Ollama.
- Open a terminal and run either model version:
  ollama run llama3:8b (4.7 GB)
  ollama run llama3:70b (40 GB)
- Write a question and receive an answer.
Example: an answer to the question "Example of a higher-order function using generic types in Swift".
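For illustration, here is a hand-written sketch of the kind of code such an answer might contain (this is not an actual model transcript):

```swift
// A generic higher-order function: it takes a function as a
// parameter and applies it to every element of an array.
func transform<Input, Output>(_ values: [Input],
                              using f: (Input) -> Output) -> [Output] {
    var results: [Output] = []
    for value in values {
        results.append(f(value))
    }
    return results
}

// Usage: the same function works for any input and output types.
let doubled = transform([1, 2, 3]) { $0 * 2 }        // [2, 4, 6]
let labels = transform([1, 2, 3]) { "item \($0)" }   // ["item 1", "item 2", "item 3"]
print(doubled)
print(labels)
```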
Ollama is a tool for downloading and running large language models locally. With it, Llama 3 can be downloaded and run on macOS.
Download and run the installer for Ollama.
Website: ollama.com/download
You can check whether Ollama was successfully installed by opening Terminal.app
and executing the command:
ollama -v
If the installation succeeded, you will receive a version string such as ollama version is 0.1.32.
Before installing Llama 3, you must choose a model version.
Two different versions of Llama 3 are provided by Ollama:
- Meta Llama 3 8B (4.7 GB)
- Meta Llama 3 70B (40 GB)
The model is automatically downloaded when you attempt to run it with Ollama.
Open the Terminal.app
and execute the command corresponding to your choice of model:
ollama run llama3:8b
ollama run llama3:70b
If you prefer a ChatGPT-like environment for using the model, you can install the Open WebUI interface. It requires Docker Desktop to be installed and running.
Website: docker.com/products/docker-desktop/
Run Docker.app.
Install the web UI by opening Terminal.app and executing this command:
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Open a web browser and navigate to the URL: http://localhost:3000/
Click on the Sign up button and enter any details (they will only be stored locally).
Click on Select a model and choose llama3.
Now you can ask it questions and receive answers.