Bedrock Overview
Amazon Bedrock is a fully managed, serverless service from
AWS. AWS provisions, monitors, and patches the underlying infrastructure, so as an end user you simply use the service and pay per use.
Second, it makes base foundation models from
Amazon and third-party providers accessible through an API. The Bedrock service hosts many foundation models
from Amazon and from third-party providers such as Anthropic, Cohere, Stability AI, and AI21 Labs, and makes them accessible to the end user through an API.
At a high level, here is how Amazon Bedrock works.
The user can access the
Bedrock service through the AWS console, the CLI, or the AWS SDK. The user makes an
API request using the Bedrock APIs and adds configuration
parameters (model ID, temperature, max tokens, etc.).
It is a fully managed service that offers foundation models, allowing you to access models from Amazon and third parties with a single set of text and image generation APIs.
Users can use LangChain to build generative AI prototypes with Amazon Bedrock.
Note: Include all the configuration parameters that
can affect the output or response from the foundation model.
The request then goes to the Amazon Bedrock service, which
reads these configuration parameters and passes the request to the
foundation model hosted in the AWS account.
Based on the model ID, the
request is routed to the respective model, such as Cohere. If the user specifies a different
foundation model, the request is routed to that model as defined in the request.
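As a rough illustration (not part of the original walkthrough), the same request can be made programmatically with Bedrock's Converse API; the region, model ID, and prompt below are assumptions and should be replaced with your own:
import boto3

# Hypothetical sketch: send a prompt plus configuration parameters
# (model ID, temperature, max tokens) to Bedrock via the Converse API.
client = boto3.client("bedrock-runtime", region_name="us-east-1")  # region is an assumption

response = client.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model ID; use one you have access to
    messages=[{"role": "user", "content": [{"text": "Summarize this incident report ..."}]}],
    inferenceConfig={"temperature": 0.5, "maxTokens": 200},
)
print(response["output"]["message"]["content"][0]["text"])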
Amazon Bedrock - AWS console setup
- Log in to the AWS console.
- Select Amazon Bedrock from the services list.
- Open the model access page, request access to the models you need, and save.
Claude 3 family (from Anthropic) available on Amazon Bedrock: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus.

Boto3 is a Python package:
It provides the API operations for creating, managing,
fine-tuning, and evaluating Amazon Bedrock models, and is the standard way of accessing Amazon Bedrock models from Python.
It allows you to build and scale generative AI applications
using foundation models. With the Boto3 SDK for Python, you interact with Amazon
Bedrock and invoke models to run inference.
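As a quick illustrative sketch (assuming configured AWS credentials with Bedrock permissions and the us-east-1 region), the management client lists the available foundation models, while a separate runtime client is used for inference:
import boto3

# "bedrock" exposes management operations; "bedrock-runtime" is used for inference.
bedrock = boto3.client("bedrock", region_name="us-east-1")  # region is an assumption

for summary in bedrock.list_foundation_models()["modelSummaries"]:
    print(summary["modelId"], "-", summary["providerName"])

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")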
Key terminology:
Use the following list to understand generative AI terminology and
Amazon Bedrock's capabilities:
Foundation model (FM) – An AI model trained on a massive
amount of diverse data. Foundation models can generate text or images, and can also
convert input into embeddings.
Base model – A foundation model that is packaged by a
provider and ready to use.
Model inference – The process of a foundation model
generating an output (response) from a given input (prompt).
Inference parameters – Values that can be adjusted during
model inference to influence a response. Inference parameters can affect how
varied responses are and can also limit the length of a response or the
occurrence of specified sequences.
Prompt – An input provided to a model to guide it to
generate an appropriate response or output for the input.
Token – A sequence of characters that a model can interpret
or predict as a single unit of meaning.
Model parameters – Values that define a model and its
behavior in interpreting input and generating responses.
Playground – A user-friendly graphical interface in the AWS
Management Console in which users can experiment with prompting models and
running model inference to familiarize themselves with Amazon Bedrock.
Embedding – The process of condensing information by
transforming input into a vector of numerical values, known as embeddings.
Orchestration – The process of coordinating between
foundation models and enterprise data and applications to carry out a
task.
Agent – An application that carries out orchestrations by
cyclically interpreting inputs and producing outputs using a foundation
model. An agent can be used to carry out customer requests.
Retrieval augmented generation (RAG) – The process of
querying and retrieving information from a data source to augment a
generated response to a prompt.
Model customization – The process of using training data to
adjust the model parameter values in a base model to create a custom
model.
Hyperparameters – Values that can be adjusted for model
customization to control the training process and, consequently, the output
custom model.
Model evaluation – The process of evaluating and comparing
model outputs to determine the model that is best suited for a use
case.
Provisioned Throughput – A level of throughput that you
purchase for a base or custom model to increase the amount and/or rate
of tokens processed during model inference.
Knowledge base:
To create a knowledge base, you connect to a supported data
source that contains the documents that you want your knowledge base to be able
to access.
A data source contains the raw form of your documents. To
optimize the query process, a knowledge base converts raw data into vector
embeddings, a numerical representation of the data, to quantify
similarity to queries, which are also converted into vector embeddings. Amazon
Bedrock Knowledge Bases uses the following resources in the process of
converting data sources:
- Embedding model – A foundation model that converts your data into vector embeddings.
- Vector store – A service that stores the vector representation of your data. Several vector stores are supported.
The process of converting raw data into vector embeddings is
called ingestion. The ingestion process that turns your data into a
knowledge base involves the following steps:
Ingestion:
- The data is parsed by the parser that you choose.
- Each document in your data source is split into chunks.
- The embedding model converts each chunk into a vector embedding: a series of numbers that represents that chunk of text. These vectors can be either floating-point numbers (float32) or binary numbers.
- The vector embeddings are written to a vector index in your chosen vector store.
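To make the embedding step concrete, here is a small illustrative sketch (assuming access to the Amazon Titan text embeddings model; the model ID, region, and sample text are assumptions) that converts one text chunk into a vector:
import json
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")  # region is an assumption

chunk = "The pump failed because the intake filter was clogged."
# Titan Text Embeddings expects {"inputText": ...} and returns {"embedding": [...]}.
response = client.invoke_model(
    modelId="amazon.titan-embed-text-v1",  # example embedding model ID
    body=json.dumps({"inputText": chunk}),
    accept="application/json",
    contentType="application/json",
)
embedding = json.loads(response["body"].read())["embedding"]
print(len(embedding))  # dimensionality of the vector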
After the ingestion process is complete, your knowledge base
is ready to be queried.
An embedding model is used to convert the user's query to a
vector. The vector index is then queried to find chunks that are semantically
similar to the user's query by comparing document vectors to the user query
vector. The user prompt is augmented with the additional context from the chunks
that are retrieved from the vector index. The prompt alongside the additional
context is then sent to the model to generate a response for the user.
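For reference, a minimal sketch of this query flow using the Bedrock Knowledge Bases RetrieveAndGenerate API (the knowledge base ID, model ARN, region, and question below are placeholders, not values from this document):
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")  # region is an assumption

response = agent_runtime.retrieve_and_generate(
    input={"text": "What caused the pump failure last week?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB_ID_PLACEHOLDER",  # hypothetical knowledge base ID
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0",
        },
    },
)
print(response["output"]["text"])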
Amazon Bedrock + LangChain
What is LangChain?
LangChain is a framework for developing applications
powered by language models. It helps do this in two ways:
- Integration - Bring external data, such
as files, other applications, and API data, to your LLMs
- Agency - Allow LLMs to interact with their environment via decision
making. Use LLMs to help decide which action to take next
Amazon
Bedrock is a fully managed service that offers a
choice of high-performing foundation models (FMs) via a single API. LangChain
is particularly focused on the “chain of thought” and “language user interface”
paradigms. It provides a set of abstractions and utilities for common
tasks, such as chaining language models together, integrating external
databases or knowledge bases for more informed responses, and managing the
input and output processing in conversations.
Development and Experimentation: Use
LangChain during the development phase to prototype your application.
Why LangChain?
- Components - LangChain makes it easy to swap out abstractions and
components necessary to work with language models.
- Customized Chains -
LangChain provides out of the box support for using and customizing
'chains' - a series of actions strung together.
- Speed - This team ships insanely fast. You'll be up to date with the
latest LLM features.
- Community - Wonderful Discord and community support, meetups, hackathons, etc.
Though
LLMs can be straightforward (text-in, text-out) you'll quickly run into
friction points that LangChain helps with once you develop more complicated
applications.
Use
Cases
- Text Generation
- Summarization - One of the most common
use cases with LLMs
- Question and Answering Over Documents - Use information held within documents to answer questions or
query.
- Extraction - Pull structured data from a body of text or a user query
- Evaluation - Understand the quality of output from your application.
- Extraction and Enforce Output Format - Another approach using Pydantic & JsonOutputParser
- Querying Tabular Data -
Pull data from databases or other tabular sources.
- Code Understanding -
Reason about and digest code.
- Chatbots - A framework to have a back-and-forth interaction with a user
combined with memory in a chat interface.
- Agents - Use LLMs to make decisions about what to do next. Enable
these decisions with tools.
Note:
%pip install --upgrade --quiet
langchain_aws
from langchain_aws import BedrockLLM
llm = BedrockLLM(
credentials_profile_name="bedrock-admin",
model_id="amazon.titan-text-express-v1"
)
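A short usage sketch follows (the profile name and model ID come from the snippet above; the prompt text and chain are just illustrations), invoking the LLM directly and then through a simple summarization chain:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Direct invocation of the Bedrock-backed LLM
print(llm.invoke("Explain Amazon Bedrock in one sentence."))

# A minimal summarization chain built with LangChain Expression Language
prompt = PromptTemplate.from_template("Summarize the following incident report:\n\n{report}")
chain = prompt | llm | StrOutputParser()
print(chain.invoke({"report": "The cooling pump tripped twice during the night shift ..."}))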
Test summarization use case:
Text summarization: today, an expert at a global location has to look at images and incident reports, understand the root cause, and provide a solution. Instead, these detailed incident reports can be sent to a foundation model, which summarizes the report and sends it back, so the expert can review it and make a decision in a much shorter time.
Architecture:
- Amazon Bedrock
- AWS Lambda
- API Gateway
- Cohere foundation model
- Boto3, the AWS SDK for Python, to access AWS services
Invoke this REST API and provide a prompt; the prompt could be the incident report that helps the expert make a faster decision.
Once the REST API is invoked, the prompt is passed as an event to the Lambda function, and the Lambda function makes a call to the Amazon Bedrock service, which invokes the Cohere foundation model.
When the incident report is sent to the foundation model, the model summarizes the text and sends it back to the Lambda function, and the Lambda function sends the response back to the user.
1- Get access to the Cohere foundation model in the Amazon Bedrock service.
2- Check the Boto3 version; it should be greater than 1.34.42.
3- Write the Lambda function: go to the AWS console, create the Lambda function, create an IAM role, and increase the timeout limit for the Lambda function.
4- Add the policy permission and confirm the policy has been successfully attached to the role.
5- In the function code, first import boto3.
6- Next, create a client connection with Bedrock.
# Create a Bedrock Runtime client in the AWS Region of your choice.
client = boto3.client("bedrock-runtime", region_name="us-east-1")
7- The Lambda handler will receive an event (which carries the prompt) and a context object as inputs.
8- Invoke the Model
# Invoke the model with the request.
response = client.invoke_model(modelId=model_id, body=request)
9- To invoke the model, send a request to the Bedrock service and store the output in the response. The client's invoke_model method takes four parameters:
○ body - the input data for the request, passed as bytes (for example, a JSON string of the prompt and inference parameters).
○ contentType - the MIME type of the input data in the request body (for example, application/json).
○ accept - the desired MIME type of the response body.
○ modelId - identifies the foundation model so that the Bedrock service can route the request to the correct model (Cohere, Jurassic, Stability, etc.).
○ Example
Client_Bedrockrequest = client.invoke_model(
    body=json.dumps({
        "prompt": prompt,
        "max_tokens": 200,
        "temperature": 0.6,
        "p": 1,
        "k": 0,
        "num_generations": 2,
        "return_likelihoods": "GENERATION"
    }),
    modelId='cohere.command-text-v14',  # (Change this to try different model versions)
    accept="application/json",
    contentType="application/json"
)
Client_Bedrockrequest_byte = Client_Bedrockrequest['body'].read()
Client_Bedrock_string = json.loads(Client_Bedrockrequest_byte)
○ Deploy the lambda code and test
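Putting the pieces together, here is a minimal sketch of the whole Lambda function (the event parsing assumes Lambda proxy integration, and the response parsing assumes the Cohere Command format with a "generations" list; the region is an assumption):
import json
import boto3

# Create the Bedrock Runtime client outside the handler so it is reused across invocations.
client = boto3.client("bedrock-runtime", region_name="us-east-1")  # region is an assumption

def lambda_handler(event, context):
    # With Lambda proxy integration, the prompt arrives in the request body.
    prompt = json.loads(event["body"])["prompt"]

    response = client.invoke_model(
        modelId="cohere.command-text-v14",
        body=json.dumps({"prompt": "Summarize: " + prompt, "max_tokens": 200, "temperature": 0.6}),
        accept="application/json",
        contentType="application/json",
    )
    result = json.loads(response["body"].read())
    summary = result["generations"][0]["text"]  # Cohere Command returns a "generations" list

    return {"statusCode": 200, "body": json.dumps({"summary": summary})}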
Creating Rest API and Integrating with Lambda Function:
○ Now we can create REST API
○ Create a REST API in API Gateway:
§ Open the AWS Console
§ Type API Gateway in the search bar
□ Go to API Gateway
□ Create a REST API
1. Under REST API, choose Build.
i) When the Create Example API popup appears, choose OK.
2. For API name, enter XYZAPI.
3. (Optional) For Description, enter a description.
4. Keep API endpoint type set to Regional.
5. Choose Create API.
□ Create Resource
1. Choose Create resource.
2. Keep Proxy resource turned off.
3. Keep Resource path as /.
4. For Resource name, enter xyz.
5. Keep CORS (Cross Origin Resource Sharing) turned off.
6. Choose Create resource
□ Create Methods — POST
1. Select the /xyz resource, and then choose Create method.
2. For Method type, select POST.
3. For Integration type, select Lambda function.
4. Turn on Lambda proxy integration.
5. For Lambda function, select the AWS Region where you created your Lambda function, and then enter the function name.
6. To use the default timeout value of 29 seconds, keep Default timeout turned on
7. Choose Create method
□ Deploy the API
1. Choose Deploy API.
2. For Stage, select New stage.
3. For Stage name, enter xyzTest.
4. (Optional) For Description, enter a description.
5. Choose Deploy.
6. Under Stage details, choose the copy icon to copy your API's invoke URL.
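Once deployed, the REST API can be tested from any HTTP client. As a rough sketch (the invoke URL below is a placeholder for the one copied in the last step, and the requests package is assumed to be installed):
import json
import requests  # third-party HTTP client, assumed to be installed

invoke_url = "https://abc123.execute-api.us-east-1.amazonaws.com/xyzTest/xyz"  # placeholder URL
payload = {"prompt": "Incident report: the cooling pump tripped twice during the night shift ..."}

resp = requests.post(invoke_url, data=json.dumps(payload))
print(resp.json()["summary"])  # summary returned by the Lambda function sketch above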