Use the Converse API in Amazon Bedrock to create generative AI applications using a single API across multiple foundation models
You can now use the Converse API in Amazon Bedrock to create conversational applications like chatbots and support assistants. It is a consistent, unified API that works with all Amazon Bedrock models that support messages. The benefit is that you can write your application code once and use it with different models, which makes the Converse API preferable to the InvokeModel (or InvokeModelWithResponseStream) APIs.
I will walk you through how to use this API with the AWS SDK for Go v2.
Converse API overview
Here is a very high-level overview of the API. You will see these pieces in action when we go through some of the examples.
- The API consists of two operations: Converse and ConverseStream.
- A conversation is a list of Message objects, and each message holds its content in one or more ContentBlock items.
- A ContentBlock can also hold an image, which is represented by an ImageBlock.
- A message can have one of two roles: user or assistant.
- For streaming responses, use the ConverseStream API.
- The streaming output (ConverseStreamOutput) has multiple events, each of which carries different response items such as the text output, metadata, etc.
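To make the message shape above more concrete, here is a minimal sketch that uses only the standard library. The `message`, `contentBlock` and `imageBlock` structs are hypothetical stand-ins that I made up for illustration, not the SDK types; the real ones live in the SDK's `types` package and carry more fields (for example, the image bytes themselves are left out here). The JSON field names follow my understanding of the request shape, so treat them as an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageBlock is a simplified, hypothetical stand-in for an image content
// block. The actual image bytes are omitted in this sketch.
type imageBlock struct {
	Format string `json:"format"`
}

// contentBlock holds either text or an image (stand-in type).
type contentBlock struct {
	Text  string      `json:"text,omitempty"`
	Image *imageBlock `json:"image,omitempty"`
}

// message pairs a role ("user" or "assistant") with its content blocks.
type message struct {
	Role    string         `json:"role"`
	Content []contentBlock `json:"content"`
}

// conversationJSON renders a conversation as JSON, or "" if encoding fails.
func conversationJSON(msgs []message) string {
	b, err := json.Marshal(msgs)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	msgs := []message{
		{Role: "user", Content: []contentBlock{
			{Text: "What is in this picture?"},
			{Image: &imageBlock{Format: "jpeg"}},
		}},
		{Role: "assistant", Content: []contentBlock{{Text: "A pizza."}}},
	}
	fmt.Println(conversationJSON(msgs))
}
```

The point to take away is the nesting: a conversation is a list of messages, and each message is a role plus a list of content blocks.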
Let's explore a few sample apps now.
Basic example
*Refer to the **Before You Begin** section in this blog post to complete the prerequisites for running the examples. This includes installing Go, configuring Amazon Bedrock access and providing necessary IAM permissions.*
Let's start off with a simple example. You can refer to the complete code here.
To run the example:
git clone https://github.com/abhirockzz/converse-api-bedrock-go
cd converse-api-bedrock-go
go run basic/main.go
The response may be different in your case:
The crux of the app is a for loop in which:
- A types.Message instance is created with the appropriate role (user or assistant)
- It is sent using the Converse API
- The response is collected and added to the existing list of messages
- The conversation continues until the app is exited
//...
for {
	fmt.Print("\nEnter your message: ")
	input, _ := reader.ReadString('\n')
	input = strings.TrimSpace(input)

	// Wrap the user's input in a message with the user role.
	userMsg := types.Message{
		Role: types.ConversationRoleUser,
		Content: []types.ContentBlock{
			&types.ContentBlockMemberText{
				Value: input,
			},
		},
	}
	converseInput.Messages = append(converseInput.Messages, userMsg)

	// Send the whole conversation so far, not just the latest message.
	output, err := brc.Converse(context.Background(), converseInput)
	if err != nil {
		log.Fatal(err)
	}

	// This example expects a message whose first content block is text.
	response, ok := output.Output.(*types.ConverseOutputMemberMessage)
	if !ok || len(response.Value.Content) == 0 {
		log.Fatal("unexpected response from Converse")
	}
	if text, ok := response.Value.Content[0].(*types.ContentBlockMemberText); ok {
		fmt.Println(text.Value)
	}

	// Add the assistant's reply to the history for the next turn.
	assistantMsg := types.Message{
		Role:    types.ConversationRoleAssistant,
		Content: response.Value.Content,
	}
	converseInput.Messages = append(converseInput.Messages, assistantMsg)
}
//...
I used the Claude Sonnet model in the example. Refer to Supported models and model features for a complete list.
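If you want to see the history-keeping pattern from the loop in isolation, here is a minimal sketch that runs without AWS access. Everything in it is a hypothetical stand-in: `message` is a stripped-down version of the SDK message type, `modelFunc` takes the place of the Converse call, and `echoModel` is a fake model that only reports what it was given.

```go
package main

import "fmt"

// message is a simplified stand-in for the SDK message type.
type message struct {
	Role string
	Text string
}

// modelFunc is a hypothetical stand-in for a Converse call: it receives
// the full history and returns the assistant's reply.
type modelFunc func(history []message) (string, error)

// chatTurn appends the user's input, calls the model with the whole
// history, and appends the reply so the next turn sees it.
func chatTurn(history []message, input string, call modelFunc) ([]message, string, error) {
	history = append(history, message{Role: "user", Text: input})
	reply, err := call(history)
	if err != nil {
		// Drop the unanswered user message so the history stays paired.
		return history[:len(history)-1], "", err
	}
	history = append(history, message{Role: "assistant", Text: reply})
	return history, reply, nil
}

// echoModel is a fake model that reports how much history it received.
func echoModel(history []message) (string, error) {
	last := history[len(history)-1]
	return fmt.Sprintf("saw %d message(s); last was %q", len(history), last.Text), nil
}

func main() {
	var history []message
	for _, input := range []string{"hello", "and again"} {
		var reply string
		var err error
		history, reply, err = chatTurn(history, input, echoModel)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(reply)
	}
	fmt.Println(len(history), "messages in history")
}
```

The fake model sees one message on the first turn and three on the second, which is the whole idea: the model has no memory of its own, so each call carries the conversation so far.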
Multi-modal conversations: Combine image and text
You can also use the Converse API to build multi-modal applications that work with images. Note that, for now, the models only return text.
You can refer to the complete code here.
To run the example:
go run multi-modal-chat/main.go
I used the following picture of pizza and asked "what's in the image?":
Here is the output:
This is a simple single-turn exchange, but feel free to continue the conversation using a combination of images and text.
The conversation for loop is similar to the previous example, except that it also sends image data with the help of types.ImageBlock:
//...
&types.ContentBlockMemberImage{
	Value: types.ImageBlock{
		Format: types.ImageFormatJpeg,
		Source: &types.ImageSourceMemberBytes{
			Value: imageContents,
		},
	},
}
//...
**Note:** *imageContents is nothing but a []byte representation of the image.*
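The snippet above hard-codes the JPEG format. If you want to accept other image types, one option is to sniff the format from the bytes before building the image block. The sketch below is my own addition, not part of the sample repo: `imageFormat` is a hypothetical helper that uses the standard library's `http.DetectContentType` and returns a plain string, which you would still need to convert to the SDK's image format type yourself. The four format names are the ones I understand the API to accept.

```go
package main

import (
	"fmt"
	"net/http"
)

// imageFormat is a hypothetical helper: it inspects the leading bytes of
// an image and returns a short format name, or an error for anything else.
func imageFormat(data []byte) (string, error) {
	switch ct := http.DetectContentType(data); ct {
	case "image/jpeg":
		return "jpeg", nil
	case "image/png":
		return "png", nil
	case "image/gif":
		return "gif", nil
	case "image/webp":
		return "webp", nil
	default:
		return "", fmt.Errorf("unsupported content type %q", ct)
	}
}

func main() {
	// Only the file signature is needed for detection, so a PNG header
	// is enough for this demo.
	pngHeader := []byte("\x89PNG\r\n\x1a\n")
	format, err := imageFormat(pngHeader)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(format)
}
```

In a real app you would pass the same []byte that you read from the image file, i.e. imageContents.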
Streaming chat
Streaming provides a better user experience because the client application does not need to wait for the complete response to be generated before it starts showing up in the conversation.
You can refer to the complete code here.
To run the example:
go run chat-streaming/main.go
Streaming-based implementations can be a bit complicated. But in this case, it was simplified by the clear abstractions that the Converse API provides, including partial response types such as types.ContentBlockDeltaMemberText.
The application invokes the ConverseStream API and then processes the output components in bedrockruntime.ConverseStreamOutput.
func processStreamingOutput(output *bedrockruntime.ConverseStreamOutput, handler StreamingOutputHandler) (types.Message, error) {
	var combinedResult string
	msg := types.Message{}

	for event := range output.GetStream().Events() {
		switch v := event.(type) {
		case *types.ConverseStreamOutputMemberMessageStart:
			// The first event carries the role of the message being streamed.
			msg.Role = v.Value.Role
		case *types.ConverseStreamOutputMemberContentBlockDelta:
			// Only text deltas are handled here; other delta types are skipped.
			textResponse, ok := v.Value.Delta.(*types.ContentBlockDeltaMemberText)
			if !ok {
				continue
			}
			handler(context.Background(), textResponse.Value)
			combinedResult = combinedResult + textResponse.Value
		case *types.UnknownUnionMember:
			fmt.Println("unknown tag:", v.Tag)
		}
	}

	// Store the full text so the message can be added to the conversation history.
	msg.Content = append(msg.Content,
		&types.ContentBlockMemberText{
			Value: combinedResult,
		},
	)
	return msg, nil
}
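The function above does two things with every text delta: it hands the chunk to a handler right away, and it accumulates the chunk so that the full message can be stored in the history. Here is a minimal sketch of that same pattern with no AWS dependency. The `event` struct, its `Kind` values and the `collect` function are hypothetical stand-ins for the SDK's event types, and a plain Go channel plays the role of the event stream.

```go
package main

import (
	"fmt"
	"strings"
)

// event is a simplified, hypothetical stand-in for a stream event.
type event struct {
	Kind string // "start", "delta", or anything else (ignored)
	Role string // set on "start" events
	Text string // set on "delta" events
}

// collect drains the events channel, passes each text chunk to onDelta
// as it arrives, and returns the role plus the combined text.
func collect(events <-chan event, onDelta func(string)) (role, text string) {
	var sb strings.Builder
	for ev := range events {
		switch ev.Kind {
		case "start":
			role = ev.Role
		case "delta":
			onDelta(ev.Text)
			sb.WriteString(ev.Text)
		}
	}
	return role, sb.String()
}

func main() {
	events := make(chan event)
	go func() {
		defer close(events)
		events <- event{Kind: "start", Role: "assistant"}
		for _, chunk := range []string{"Hello", ", ", "world"} {
			events <- event{Kind: "delta", Text: chunk}
		}
		events <- event{Kind: "stop"}
	}()

	// Print chunks as they arrive, then show the combined message.
	role, text := collect(events, func(s string) { fmt.Print(s) })
	fmt.Printf("\n%s: %q\n", role, text)
}
```

I used strings.Builder here instead of string concatenation; for short chat replies the difference hardly matters, so consider it a matter of taste.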
Wrap up
There are a few other awesome things the Converse API does to make your life easier.
- It allows you to pass inference parameters specific to a model.
- You can also use the Converse API to implement tool use in your applications.
- If you are using Mistral AI or Llama 2 Chat models, the Converse API will embed your input in a model-specific prompt template that enables conversations - one less thing to worry about!
Like I always say, Python does not have to be the only way to build generative AI powered applications. As an AI engineer, choose the right tools (including foundation models) and programming languages for your solutions. I may be biased towards Go, but this applies equally well to Java, JS/TS, C#, etc.
Happy building!