
Generative AI on AWS architecture patterns

In this workshop you're going to learn:

  • foundational design patterns to implement a broad range of generative AI applications - from simple chatbots to complex multi-agent architectures integrated with external tools and systems
  • how to use Amazon Bedrock to connect to foundation models and use them via an API
  • how to use Amazon Kendra to implement a knowledge base with a document index
  • how to ingest documents into the knowledge base using Kendra connectors
  • how to use Amazon SageMaker JumpStart to deploy an open source large language model (LLM) as a real-time endpoint and use the endpoint for text generation
  • how to use the HuggingFace TGI container and Amazon SageMaker to deploy an LLM as a real-time endpoint
  • how to use popular Python frameworks streamlit and LangChain to implement generative AI applications (see the minimal streamlit sketch after this list)
  • how to leverage AWS services to abstract complex functionality and build your generative AI applications fast and without undifferentiated heavy lifting
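
For a first impression of the application side, here is a minimal streamlit sketch. The generate_text helper is hypothetical and not part of this repo; in the labs it is replaced with calls to Amazon Bedrock or a SageMaker endpoint.

import streamlit as st

# Hypothetical helper: replace with a call to Amazon Bedrock or a
# SageMaker real-time endpoint, as covered in the labs.
def generate_text(prompt: str) -> str:
    return f"(model response to: {prompt})"

st.title("Generative AI demo")
user_prompt = st.text_input("Ask the model something:")
if user_prompt:
    st.write(generate_text(user_prompt))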

Who this workshop is for

This workshop is for developers, designers, and architects who build enterprise applications, are interested in improving their understanding of generative AI, and want to develop applications leveraging LLMs. We created the workshop to present the most common emerging and established generative AI architectural patterns, corresponding to a wide range of possible business use cases.

Recommended knowledge and experience

Understanding of and practical experience with the following AWS services is recommended:

  • Amazon Elastic Container Service
  • Amazon SageMaker
  • Amazon Kendra
  • AWS Lambda
  • Amazon OpenSearch Service
  • Amazon API Gateway
  • AWS CloudFormation

We also strongly recommend hands-on experience with general application development on AWS, including the use of the AWS Serverless Application Model (AWS SAM) and AWS CloudFormation templates.

Previous experience with generative AI, large language models (LLMs), and foundation models (FMs) is not required; you can learn as you work through the workshop.

How to use this workshop

The workshop consists of simple, self-contained labs covering the foundational architecture patterns for generative AI applications and providing implementations of those patterns on AWS.

Pre-requisites and environment setup

You need access to an AWS account. You can use your own account or a shared account provided by the workshop moderators.

If you use your own account, make sure to fulfill the following pre-requisites before the workshop:

  1. Admin access to the account
  2. Request access to Amazon Bedrock models
  3. AWS Console access
  4. If you'd like to experiment with LLMs deployed as real-time SageMaker endpoints, you'll need the following SageMaker ML instances for real-time inference (see the deployment sketch after this list):
  • minimal setup, enough to deploy and experiment with small models like Falcon-7B: ml.g5.2xlarge
  • recommended setup, for bigger models like Falcon-40B: ml.g5.12xlarge (4x NVIDIA GPUs) or ml.g5.48xlarge (8x NVIDIA GPUs)
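
If you'd like a preview of the deployment step, here is a minimal sketch of deploying Falcon-7B as a real-time endpoint via SageMaker JumpStart. The model ID is an assumption and may differ across SageMaker Python SDK versions; check the JumpStart model catalog for the exact identifier.

# Minimal sketch: deploy an open-source LLM as a real-time SageMaker endpoint.
# The model_id below is an assumption; look it up in the JumpStart catalog.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="huggingface-llm-falcon-7b-instruct-bf16")
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # minimal setup for Falcon-7B
)
print(predictor.predict({"inputs": "What is in-context learning?"}))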

You can check and increase your quotas for these instances as described in these instructions. You don't need to increase quotas if you're going to use Amazon Bedrock only.
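
You can also check the current quota values programmatically. A minimal sketch with boto3; the quota name filter is an assumption, so browse the Service Quotas console if the strings don't match your account's quota names:

import boto3

# Minimal sketch: list SageMaker quotas related to ml.g5 endpoint usage.
client = boto3.client("service-quotas")
paginator = client.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        # Name filter is an assumption; adjust to your quota names
        if "ml.g5" in quota["QuotaName"] and "endpoint" in quota["QuotaName"]:
            print(quota["QuotaName"], quota["Value"])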

The next sections contain step-by-step instructions on how to set up the required development environments.

Set up the AWS Cloud9 environment

You're going to use an AWS Cloud9 environment to prepare the chatbot app container and to build and deploy the end-to-end application stack.

To set up AWS Cloud9:

  • Sign in to an AWS account
  • If you don't have an existing AWS Cloud9 environment, set up a new one. See Hello AWS Cloud9 for step-by-step instructions
  • Choose at least an m5.large instance and Ubuntu Server 22.04 LTS as the platform
  • Open the AWS Cloud9 environment

You're going to use Python 3.11 to build the generative AI applications. Run the following commands in the AWS Cloud9 terminal.

Upgrade Python version to 3.11:

  1. List the installed Python versions:
ls -l /usr/bin/python*
  2. Install Python 3.11:
sudo apt update -y
sudo apt install python3.11 -y
  3. See the default Python version:
python3 --version
  4. Set 3.11 as the default Python:
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 2
sudo update-alternatives --config python3
  5. Check that the default Python version is 3.11:
python3 --version

Upgrade AWS SAM CLI:

pip install --upgrade aws-sam-cli

Check the AWS SAM CLI version:

sam --version

Download source code into your development environment

To have all workshop assets, clone the workshop source code repository into your AWS Cloud9 environment.

In the AWS Cloud9 terminal window:

  • Clone the workshop github repo to your environment: git clone https://github.com/aws-samples/generative-ai-on-aws-architecture-patterns.git
  • Change to the workshop folder: cd ~/environment/generative-ai-on-aws-architecture-patterns/
  • Increase the size of the AWS Cloud9 EBS volume:
    bash ./content/scripts/resize-disk.sh 100

Set up SageMaker Studio

If you're going to use SageMaker Studio to deploy a real-time LLM endpoint, you need to provision a SageMaker domain and set up SageMaker Studio.

❗ You don't need Studio and the endpoint if you use the Amazon Bedrock API to connect to an LLM. You can also skip all SageMaker-related sections if you're going to use Amazon Bedrock only.
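
For reference, connecting to an LLM via the Amazon Bedrock API takes only a few lines with boto3. A minimal sketch, assuming you've requested access to Anthropic Claude v2; the model ID and request body format are model-specific assumptions:

import boto3
import json

# Minimal sketch: invoke a foundation model through the Bedrock runtime API.
# Model ID and body format follow the Claude v2 text completion convention.
bedrock = boto3.client("bedrock-runtime")
body = json.dumps({
    "prompt": "\n\nHuman: Explain in-context learning in one sentence.\n\nAssistant:",
    "max_tokens_to_sample": 200,
})
response = bedrock.invoke_model(
    modelId="anthropic.claude-v2",
    body=body,
    contentType="application/json",
    accept="application/json",
)
print(json.loads(response["body"].read())["completion"])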

To set up Studio:

  1. Provision a SageMaker domain if you don't have one
  2. Provision a user profile if you don't have one

SageMaker execution role permissions

Add the following inline permission policy to your Studio user profile execution role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ServiceQuotas",
            "Effect": "Allow",
            "Action": [
                "servicequotas:GetServiceQuota"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Bedrock",
            "Effect": "Allow",
            "Action": "bedrock:*",
            "Resource": "*"
        }
    ]
}

Add bedrock.amazonaws.com to the user profile execution role trust relationships by including the following statement in the trust policy:

{
    "Effect": "Allow",
    "Principal": {
        "Service": "bedrock.amazonaws.com"
    },
    "Action": "sts:AssumeRole"
}

Download source code into Amazon SageMaker Studio

In the Studio top menu, select File, then New, then Terminal.

Run the following command in the terminal:

git clone https://github.com/aws-samples/generative-ai-on-aws-architecture-patterns.git

The code repository will be downloaded and saved in your home directory in Studio.

Now you're all set up to start implementing your first generative AI design pattern.

Generative AI design patterns

Before you start the hands-on activities on the workshop's design patterns, let's go through some theory.

While LLMs are able to perform a broad range of tasks and can generalize and do complex reasoning well, they lack domain-specific and up-to-date knowledge.

There are two broad foundational design patterns for connecting an LLM to specialized skills or up-to-date data:

  1. Parametric learning – changing the model's parameters/weights via gradient updates/backpropagation by pre-training or fine-tuning the model
  2. Non-parametric or in-context learning (ICL) – using the model as is, with its parameters/weights frozen, and providing all relevant information and context via the input prompt

All design patterns in this workshop are built on the in-context learning approach.

Each lab focuses on one of the following design patterns:

  • Lab 1: In-context learning and prompt engineering
  • Lab 2: Retrieval augmented generation (RAG) and conversational FAQ (CFAQ)
  • Lab 3: Natural language queries (NLQ)
  • Lab 4: Reasoning and react (ReAct)
  • Lab 5: Multi-agents

This set of design patterns is not exhaustive, but it covers the majority of common use cases.
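
As a taste of the RAG pattern from Lab 2, here is a minimal sketch of the retrieval step, assuming an existing Amazon Kendra index (the index ID is a placeholder): query the index and place the retrieved excerpts into the prompt as context.

import boto3

# Minimal sketch of the retrieval step in RAG: query an Amazon Kendra index
# and collect document excerpts to ground the LLM prompt.
kendra = boto3.client("kendra")
result = kendra.query(
    IndexId="<your-kendra-index-id>",  # placeholder
    QueryText="What is our refund policy?",
)
context = "\n".join(
    item["DocumentExcerpt"]["Text"]
    for item in result["ResultItems"]
    if "DocumentExcerpt" in item
)
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: What is our refund policy?"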

In-context learning (ICL)

This very broad design pattern includes:

  • Use LLMs off the shelf without fine-tuning
  • Use natural language instructions in a prompt
  • Provide relevant information in the prompt: instructions, context, examples, factual data, code, etc.
  • Do prompt engineering to improve model performance

It's important to understand that ICL is not training: no parameter/weight update happens in the model.

The following example shows the ICL approach:
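
A minimal illustration with a made-up few-shot sentiment task: the examples in the prompt steer the model toward the task, while its weights stay frozen.

prompt = """Classify the sentiment of each review.

Review: The food was delicious. Sentiment: positive
Review: The service was terribly slow. Sentiment: negative
Review: The staff were friendly and helpful. Sentiment:"""
# The model is expected to complete with "positive" -- it picks up the task
# from the examples in the prompt alone, without any weight update.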

Refer to the original paper A Survey on In-context Learning for more details.

Get started with hands-on labs

Navigate to each lab's folder and follow the instructions in the README.md file.

Contributors

The baseline source code and overall architecture were taken from the public AWS workshop Implementing Generative AI on AWS.

The workshop authors:

  • Yevgeniy Ilyin
  • Nikita Fedkin

Special thanks to:

for help with questions, recommendations, and the workshop review.


Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
SPDX-License-Identifier: MIT-0
