Bedrock Overview
Amazon Bedrock is a fully managed, serverless service from AWS. AWS handles the provisioning, monitoring, patching, and everything else operational; as an end user, you just use the service and pay per use.
Second, it makes base foundation models from Amazon and third-party providers accessible through an API: the Bedrock service hosts many foundation models and exposes all of them to the end user through a single API. The model providers include Anthropic, Cohere, Stability AI, and AI21 Labs, as well as Amazon itself.
At a high level, here is how Amazon Bedrock works.
The user can access the Bedrock service through the AWS Console, the CLI, or the AWS SDK. The user makes an API request using the Bedrock APIs and adds configuration parameters (model ID, temperature, max tokens, etc.).
Note: include all the configuration parameters that can impact the output or response from the foundation model.
The request goes to the Amazon Bedrock service, which reads these configuration parameters and passes the request to the foundation model hosted in the AWS account. Based on the model ID, the request is directed to the respective model, such as Cohere; if the user specifies some other foundation model, the request goes to that one instead, as defined in the request.
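As a rough sketch, such a request boils down to a model ID plus a JSON body of inference parameters; the field names below follow the Cohere Command schema used later in this post and are purely illustrative:

import json

# Illustrative request: the model ID routes the call, the body shapes the response.
model_id = "cohere.command-text-v14"
body = json.dumps({
    "prompt": "Summarize this incident report: ...",
    "max_tokens": 200,    # cap the length of the response
    "temperature": 0.6    # control how varied the response is
})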
Boto3 is the AWS SDK for Python. Its documentation describes the API operations for creating, managing, fine-tuning, and evaluating Amazon Bedrock models, and it is the client you use to access Amazon Bedrock models from Python. It allows you to build and scale generative AI applications using foundation models: with the Boto3 SDK for Python, you interact with Amazon Bedrock and invoke models to run inference.
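A minimal sketch of using Boto3 with Bedrock, assuming AWS credentials are already configured and Bedrock is enabled in the chosen region:

import boto3

# The "bedrock" client covers control-plane operations such as listing models;
# the "bedrock-runtime" client (used later) is the one that runs inference.
bedrock = boto3.client("bedrock", region_name="us-east-1")
for summary in bedrock.list_foundation_models()["modelSummaries"]:
    print(summary["modelId"])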
Key terminology:
Refer to the following list to understand generative AI terminology and Amazon Bedrock's capabilities:
Foundation model (FM) – An AI model trained on a massive amount of diverse data. Foundation models can generate text or images, or convert input into embeddings.
Base model – A foundation model that is packaged by a provider and ready to use.
Model inference – The process of a foundation model generating an output (response) from a given input (prompt).
Inference parameters – Values that can be adjusted during model inference to influence a response. Inference parameters can affect how varied responses are and can also limit the length of a response or the occurrence of specified sequences.
Prompt – An input provided to a model to guide it to generate an appropriate response or output for the input.
Token – A sequence of characters that a model can interpret or predict as a single unit of meaning.
Model parameters – Values that define a model and its behavior in interpreting input and generating responses.
Playground – A user-friendly graphical interface in the AWS Management Console in which users can experiment with prompting models and running model inference to become familiar with Amazon Bedrock.
Embedding – The process of condensing information by transforming input into a vector of numerical values, known as the embeddings.
Orchestration – The process of coordinating between foundation models and enterprise data and applications to carry out a task.
Agent – An application that carries out orchestrations by cyclically interpreting inputs and producing outputs using a foundation model. An agent can be used to carry out customer requests.
Retrieval augmented generation (RAG) – The process of querying and retrieving information from a data source to augment a generated response to a prompt.
Model customization – The process of using training data to adjust the model parameter values in a base model to create a custom model.
Hyperparameters – Values that can be adjusted for model customization to control the training process and, consequently, the output custom model.
Model evaluation – The process of evaluating and comparing model outputs to determine the model that is best suited for a use case.
Provisioned Throughput – A level of throughput that you purchase for a base or custom model to increase the amount and/or rate of tokens processed during model inference.
Knowledge base:
To create a knowledge base, you connect to a supported data source that contains the documents that you want your knowledge base to be able to access.
A data source contains the raw form of your documents. To optimize the query process, a knowledge base converts the raw data into vector embeddings, numerical representations of the data, to quantify its similarity to queries, which are also converted into vector embeddings. Amazon Bedrock Knowledge Bases uses the following resources in the process of converting data sources:
- Embedding model – A foundation model that converts your data into vector embeddings.
- Vector store – A service that stores the vector representation of your data. The following vector stores are supported:
  - Pinecone
  - Etc.
The process of converting raw data into vector embeddings is called ingestion. The ingestion process that turns your data into a knowledge base involves the following steps:
- The data is parsed by the parser that you choose.
- Each document in your data source is split into chunks.
- The embedding model converts the data into vector embeddings. Vector embeddings are a series of numbers (vectors) that represent each chunk of text; these vectors can be either floating-point numbers (float32) or binary numbers.
- The vector embeddings are written to a vector index in your chosen vector store.
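To make the embedding step concrete, here is a hedged sketch of embedding a single chunk with the Amazon Titan text embeddings model; the model ID and body schema are the publicly documented ones, used here purely for illustration:

import boto3, json

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
# Ask the embeddings model to turn one text chunk into a vector.
response = runtime.invoke_model(
    modelId="amazon.titan-embed-text-v1",
    contentType="application/json",
    accept="application/json",
    body=json.dumps({"inputText": "One chunk of text from the data source"})
)
embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))  # dimensionality of the vector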
After the ingestion process is complete, your knowledge base
is ready to be queried.
An embedding model is used to convert the user's query to a vector. The vector index is then queried to find chunks that are semantically similar to the user's query by comparing document vectors to the user query vector. The user prompt is augmented with the additional context from the chunks that are retrieved from the vector index. The prompt alongside the additional context is then sent to the model to generate a response for the user.
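A minimal sketch of this query flow using the Bedrock Knowledge Bases runtime API; the knowledge base ID and model ARN are placeholders to replace with your own:

import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
# RetrieveAndGenerate embeds the query, searches the vector index,
# augments the prompt with the retrieved chunks, and calls the model.
response = agent_runtime.retrieve_and_generate(
    input={"text": "What caused the last outage?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2"  # placeholder
        },
    },
)
print(response["output"]["text"])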
Amazon Bedrock + LangChain
What is LangChain?
LangChain is a framework for developing applications powered by language models. It helps do this in two ways:
- Integration - Bring external data, such as files, other applications, and API data, to your LLMs.
- Agency - Allow LLMs to interact with their environment via decision making. Use LLMs to help decide which action to take next.
Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) via a single API. LangChain is particularly focused on the “chain of thought” and “language user interface” paradigms. It provides a set of abstractions and utilities for common tasks, such as chaining language models together, integrating external databases or knowledge bases for more informed responses, and managing the input and output processing in conversations.
Development and Experimentation: Use LangChain during the development phase to prototype your application.
Why LangChain?
- Components - LangChain makes it easy to swap out the abstractions and components necessary to work with language models.
- Customized Chains - LangChain provides out-of-the-box support for using and customizing 'chains': a series of actions strung together.
- Speed - This team ships insanely fast. You'll be up to date with the latest LLM features.
- Community - Wonderful Discord and community support, meetups, hackathons, etc.
Though LLMs can be straightforward (text-in, text-out), you'll quickly run into friction points that LangChain helps with once you develop more complicated applications.
Use Cases
- Text Generation
- Summarization - One of the most common use cases with LLMs.
- Question and Answering Over Documents - Use information held within documents to answer questions or queries.
- Extraction - Pull structured data from a body of text or a user query.
- Evaluation - Understand the quality of output from your application.
- Extraction and Enforced Output Format - Another approach using Pydantic & JsonOutputParser.
- Querying Tabular Data - Pull data from databases or other tabular sources.
- Code Understanding - Reason about and digest code.
- Chatbots - A framework for back-and-forth interaction with a user, combined with memory, in a chat interface.
- Agents - Use LLMs to make decisions about what to do next, and enable these decisions with tools.
Note:
%pip install --upgrade --quiet langchain_aws

from langchain_aws import BedrockLLM

llm = BedrockLLM(
    credentials_profile_name="bedrock-admin",
    model_id="amazon.titan-text-express-v1"
)
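With the llm object in hand, a hedged usage sketch (the prompt text is illustrative; BedrockLLM follows LangChain's standard Runnable interface):

response = llm.invoke("Explain Amazon Bedrock in one sentence.")
print(response)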
Test summarization use case:
Text summarization scenario: an expert at any global location has to look at images and incident reports, understand the root cause, and provide a solution. Instead of reading the full reports, we send these detailed incident reports to a foundation model; the foundation model summarizes each report and sends the summary back, so the expert can review it and make a decision in a much shorter time.
Architecture:
- Amazon Bedrock
- AWS Lambda
- API Gateway
- Cohere foundation model
- Boto3, the SDK for Python, to access AWS services
The user invokes this REST API and provides a prompt; the prompt could be the incident report whose summary helps the expert make a faster decision.
Once the REST API is invoked, the prompt is passed as an event to the Lambda function, and the Lambda function makes a call to the Amazon Bedrock service, which invokes the Cohere foundation model.
When the incident report is sent to the foundation model, the model summarizes the text and sends it back to the Lambda function, and the Lambda function sends the response back to the user.
1- Get access to the Cohere foundation model in the Amazon Bedrock service.
2- Get access to Boto3 and check the Boto3 version; it should be greater than 1.34.42.
3- Write the Lambda function: go to the AWS console, create the Lambda function, create an IAM role, and increase the timeout limit for the function.
4- Add the policy permission and attach the policy to the role.
5- First, import Boto3.
6- Next, create a client connection with Bedrock.
# Create a Bedrock Runtime client in the AWS Region of your choice.
client = boto3.client("bedrock-runtime", region_name="us-east-1")
7- The Lambda function takes an event (which carries the prompt) and a context object as inputs.
8- Invoke the model:
# Invoke the model with the request.
response = client.invoke_model(modelId=model_id, body=request)

The general signature, from the Boto3 docs:
response = client.invoke_model(
    body=b'bytes'|file,
    contentType='string',
    accept='string',
    modelId='string',
)
9- Invoking the model sends a request to the Bedrock service, and the output is stored in the response. The client's invoke_model method takes four parameters:
○ body – the input to send to the model, in the form of bytes.
○ contentType – the MIME type of the input data in the request (for example, application/json).
○ accept – the desired MIME type of the inference response from the model (for example, application/json).
○ modelId – helps the Bedrock service reach out to the correct foundation model (whether it's Cohere, Jurassic, Stability, etc.).
○ Example
Client_Bedrockrequest = client.invoke_model(
    body=json.dumps({
        "prompt": prompt,
        "max_tokens": 200,
        "temperature": 0.6,
        "p": 1,
        "k": 0,
        "num_generations": 2,
        "return_likelihoods": "GENERATION"
    }),
    modelId='cohere.command-text-v14',  # (Change this to try different model versions)
    accept="application/json",
    contentType="application/json"
)
○ Convert the bytes to JSON:
Client_Bedrockrequest_byte = Client_Bedrockrequest['body'].read()
client_Bedrock_string = json.loads(Client_Bedrockrequest_byte)
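For Cohere Command, the parsed response holds the generated text under the generations key; a hedged sketch of pulling out the summary (the response schema follows the Cohere Command model-parameters page linked in the references):

summary = client_Bedrock_string["generations"][0]["text"]
print(summary)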
○ Deploy the lambda code and test
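Putting steps 5 through 9 together, a minimal end-to-end sketch of the Lambda function; the event shape assumes a simple test event carrying a "prompt" key (with Lambda proxy integration, the prompt would instead arrive inside event["body"]):

import json
import boto3

# Create the Bedrock Runtime client once, outside the handler, so it is reused.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

def lambda_handler(event, context):
    # Assumption: the test event looks like {"prompt": "..."}.
    prompt = event["prompt"]
    response = client.invoke_model(
        body=json.dumps({
            "prompt": prompt,
            "max_tokens": 200,
            "temperature": 0.6
        }),
        modelId="cohere.command-text-v14",
        accept="application/json",
        contentType="application/json"
    )
    result = json.loads(response["body"].read())
    return {
        "statusCode": 200,
        "body": json.dumps({"summary": result["generations"][0]["text"]})
    }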
Creating the REST API and Integrating it with the Lambda Function:
○ Now we can create the REST API
○ Create a REST API in API Gateway:
§ Open the AWS Console
§ Type API Gateway in the search bar
□ Go to API Gateway
□ Create a REST API
1. Under REST API, choose Build.
i) When the Create Example API popup appears, choose OK.
2. For API name, enter XYPAPI.
3. (Optional) For Description, enter a description.
4. Keep API endpoint type set to Regional.
5. Choose Create API.
□ Create Resource
1. Choose Create resource.
2. Keep Proxy resource turned off.
3. Keep Resource path as /.
4. For Resource name, enter xyz.
5. Keep CORS (Cross Origin Resource Sharing) turned off.
6. Choose Create resource
□ Create Methods — POST
1. Select the /xyz resource, and then choose Create method.
2. For Method type, select POST.
3. For Integration type, select Lambda function.
4. Turn on Lambda proxy integration.
5. For Lambda function, select the AWS Region where you created your Lambda function, and then enter the function name.
6. To use the default timeout value of 29 seconds, keep Default timeout turned on
7. Choose Create method
□ Deploy the API
1. Choose Deploy API.
2. For Stage, select New stage.
3. For Stage name, enter xyzTest.
4. (Optional) For Description, enter a description.
5. Choose Deploy.
6. Under Stage details, choose the copy icon to copy your API's invoke URL.
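Once deployed, the API can be tested from any HTTP client. A hedged sketch using Python's standard library; the invoke URL below is a placeholder for the one copied in the previous step, and with Lambda proxy integration the Lambda function reads the prompt out of the request body:

import json
import urllib.request

# Placeholder: replace with the invoke URL copied from the stage details.
url = "https://<api-id>.execute-api.us-east-1.amazonaws.com/xyzTest/xyz"

payload = json.dumps({"prompt": "Summarize this incident report: ..."}).encode("utf-8")
request = urllib.request.Request(
    url,
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(request) as resp:
    print(json.loads(resp.read()))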
Reference:
- https://medium.com/@dminhk/knowledge-bases-for-amazon-bedrock-with-langchain-%EF%B8%8F-6cd489646a5c
- https://github.com/aws-samples/amazon-bedrock-claude-2-and-3-with-langchain-popular-use-cases/blob/main/Amazon%20Bedrock%20%26%20Langchain%20Sample%20Solutions
- https://docs.aws.amazon.com/code-library/latest/ug/python_3_bedrock-runtime_code_examples.html
- https://medium.com/aws-lambda-serverless-developer-guide-with-hands/amazon-api-gateway-build-crud-microservice-with-rest-api-and-lambda-cd7e9230a1f2
- https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime/client/invoke_model.html
- https://docs.aws.amazon.com/code-library/latest/ug/python_3_bedrock-runtime_code_examples.html#cohere_command
- https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-cohere-command.html
- https://notebooks.githubusercontent.com/view/ipynb?browser=chrome&bypass_fastly=true&color_mode=auto&commit=a66e071c5fb6f2ee86d9a9b6a0fa3471ec81c826&device=unknown_device&docs_host=https%3A%2F%2Fdocs.github.com&enc_url=68747470733a2f2f7261772e67697468756275736572636f6e74656e742e636f6d2f6177732d73616d706c65732f616d617a6f6e2d626564726f636b2d776f726b73686f702f613636653037316335666236663265653836643961396236613066613334373165633831633832362f30305f507265726571756973697465732f626564726f636b5f6261736963732e6970796e62&logged_in=false&nwo=aws-samples%2Famazon-bedrock-workshop&path=00_Prerequisites%2Fbedrock_basics.ipynb&platform=windows&repository_id=666102450&repository_type=Repository&version=131#1162efde-460b-49bb-a803-cfee9c10dd9e