Retrieval Augmented Generation
In this video, we’ll learn how to use Retrieval Augmented Generation with Chroma and LangChain to supply an OpenAI/GPT LLM prompt with the extra context it needs to answer our questions about the Wimbledon 2023 tennis tournament.
Video
Code
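The RAG pattern described above can be sketched without any external services: a toy bag-of-words retriever stands in for Chroma and a real embedding model, and the assembled prompt is what would be handed to the OpenAI/GPT LLM. Everything here is illustrative, not the video's actual code:

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real RAG pipeline would use a
    # proper embedding model (e.g. OpenAI embeddings) behind Chroma.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Return the k documents most similar to the query.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "Carlos Alcaraz beat Novak Djokovic in the Wimbledon 2023 men's final.",
    "Marketa Vondrousova won the Wimbledon 2023 women's title.",
]

# Retrieve relevant context, then stuff it into the prompt for the LLM.
context = retrieve("Who won the men's final at Wimbledon 2023?", docs)[0]
prompt = (
    "Answer using only this context:\n"
    f"{context}\n\n"
    "Question: Who won the men's final at Wimbledon 2023?"
)
print(prompt)
```

In the real pipeline, LangChain wires the same three steps together: embed and store documents in Chroma, retrieve the most similar chunks at question time, and prepend them to the prompt.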
Consistent JSON with OpenAI/GPT
In this video, we’ll learn how to get a consistent, predictable, and valid JSON response from a sentiment analysis prompt using OpenAI.
Video
Code
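One way to make the JSON dependable is to ask for a fixed shape in the prompt and then validate every reply, retrying on failure. The sketch below uses a canned reply in place of a real OpenAI call; the prompt text and schema are illustrative assumptions:

```python
import json

PROMPT = (
    "Classify the sentiment of the review below. "
    'Reply with JSON only, e.g. {"sentiment": "positive", "confidence": 0.9}.\n\n'
    "Review: I loved this film!"
)

def parse_sentiment(reply: str) -> dict:
    # Validate that the model's reply is exactly the JSON shape we asked
    # for; raise ValueError (so the caller can retry) if it is not.
    data = json.loads(reply)
    if set(data) != {"sentiment", "confidence"}:
        raise ValueError(f"unexpected keys: {set(data)}")
    if data["sentiment"] not in {"positive", "negative", "neutral"}:
        raise ValueError("bad sentiment label")
    return data

# A canned string stands in for the chat-completion reply; with the
# OpenAI v1 SDK you can also pass response_format={"type": "json_object"}
# to have the model emit syntactically valid JSON in the first place.
reply = '{"sentiment": "positive", "confidence": 0.92}'
print(parse_sentiment(reply))
```

Validation on top of the API-level JSON mode matters because valid JSON is not necessarily *your* JSON: the model can still use the wrong keys or labels.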
Running Mixtral with Ollama
In this video, we’ll learn about Mixtral, the latest large language model from Mistral AI. Mixtral employs a mixture-of-experts approach, with eight expert models and a router to manage queries, enhancing the quality of its responses. We’re going to run Mixtral on our own machine using the awesome Ollama tool. We’ll then compare Mixtral with the original Mistral model on a variety of tasks, including sentiment analysis, summarisation, suggesting prompts to review books, and updating Python code.
Video
Code
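The mixture-of-experts idea can be illustrated with a toy router in plain Python. The keyword router and the experts here are made up for illustration; Mixtral's real router is a learned gating network that picks among expert feed-forward blocks per token inside the transformer, not per query:

```python
# Toy mixture-of-experts: a router scores each expert for the input
# and only the winning expert runs, so most parameters stay idle.
EXPERTS = {
    "sentiment": lambda q: "positive" if "love" in q else "negative",
    "summary": lambda q: q.split(".")[0] + ".",
    "code": lambda q: "# TODO: generated code for: " + q,
}

def router(query):
    # Hypothetical keyword scorer standing in for a learned gate.
    scores = {
        "sentiment": query.count("feel") + query.count("love"),
        "summary": query.count("summarise"),
        "code": query.count("python"),
    }
    return max(scores, key=scores.get)

def answer(query):
    expert = router(query)
    return expert, EXPERTS[expert](query)

print(answer("I love this book, how does it feel?"))
```

To run the real thing locally, install Ollama and use `ollama run mixtral`, which downloads the model on first use.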
Constraining LLMs with Guidance AI
In this video, we’ll learn how to use the Guidance library to control and constrain text generation by large language models, integrating it with the llama.cpp library and the Mistral 7B model. We’ll build an emotion detector with help from functions like select, which restricts generation to an array of values, and gen, which can be constrained by regular expressions. We’ll also learn how to create reusable components and output results in JSON format.
Video
Code
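The effect of select and gen can be mimicked with after-the-fact validation. This is only a sketch of the constraints' contract: real Guidance steers token sampling inside llama.cpp so that invalid output is never generated, rather than rejecting it afterwards:

```python
import re

def select(options, candidate):
    # Like Guidance's select: only a value from the allowed list passes.
    if candidate not in options:
        raise ValueError(f"{candidate!r} not in {options}")
    return candidate

def gen(pattern, candidate):
    # Like Guidance's gen(regex=...): output must match the pattern.
    if not re.fullmatch(pattern, candidate):
        raise ValueError(f"{candidate!r} does not match {pattern}")
    return candidate

# Pretend these strings came from an unconstrained LLM; the constraints
# reject anything outside the allowed shape, so downstream JSON is safe.
emotion = select(["joy", "anger", "sadness"], "joy")
score = gen(r"0\.\d\d", "0.87")
print({"emotion": emotion, "score": score})
```

Constraining at sampling time is what makes the approach efficient: the model never wastes tokens on output that would be thrown away.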
LLaVA: A large multi-modal language model
In this video, we’ll learn about LLaVA (Large Language and Vision Assistant), a multimodal model that integrates a CLIP vision encoder with the Vicuna LLM. We’ll see how well it does at describing a cartoon cat, a photo of me with AI-generated parrots, and a bunch of images created by the Midjourney generative AI tool. And most importantly, we’ll find out whether it knows who Cristiano Ronaldo is!
Video
Code
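The way the vision encoder and the LLM fit together can be sketched with NumPy. The dimensions are taken from the commonly cited sizes (1024-d CLIP ViT-L features, 4096-d Vicuna-7B hidden states), but the random "learned" projector and patch counts are illustrative assumptions, not LLaVA's actual weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# LLaVA learns a projection W that maps CLIP image-patch features into
# the LLM's token-embedding space, so image "tokens" can be prepended
# to the text prompt's embeddings and fed through the LLM as one input.
clip_dim, llm_dim, n_patches = 1024, 4096, 3

image_features = rng.standard_normal((n_patches, clip_dim))  # from CLIP
W = rng.standard_normal((clip_dim, llm_dim)) * 0.01          # learned projector
visual_tokens = image_features @ W                           # (3, 4096)

text_embeddings = rng.standard_normal((5, llm_dim))  # "describe this image"
llm_input = np.concatenate([visual_tokens, text_embeddings])
print(llm_input.shape)  # (8, 4096)
```

Once projected, the image patches are just more sequence positions, which is why the same autoregressive LLM can answer questions about pictures.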