Category Archives: AWS

Guardrails for Amazon Bedrock now available with new safety filters and privacy controls

This post was originally published on this site

Today, I am happy to announce the general availability of Guardrails for Amazon Bedrock, first released in preview at re:Invent 2023. With Guardrails for Amazon Bedrock, you can implement safeguards in your generative artificial intelligence (generative AI) applications that are customized to your use cases and responsible AI policies. You can create multiple guardrails tailored to different use cases and apply them across multiple foundation models (FMs), improving end-user experiences and standardizing safety controls across generative AI applications. You can use Guardrails for Amazon Bedrock with all large language models (LLMs) in Amazon Bedrock, including fine-tuned models.

Guardrails for Bedrock offers industry-leading safety protection on top of the native capabilities of FMs, helping customers block as much as 85% more harmful content than protection natively provided by some foundation models on Amazon Bedrock today. Guardrails for Amazon Bedrock is the only responsible AI capability offered by top cloud providers that enables customers to build and customize safety and privacy protections for their generative AI applications in a single solution, and it works with all large language models (LLMs) in Amazon Bedrock, as well as fine-tuned models.

Aha! is a software company that helps more than 1 million people bring their product strategy to life. “Our customers depend on us every day to set goals, collect customer feedback, and create visual roadmaps,” said Dr. Chris Waters, co-founder and Chief Technology Officer at Aha!. “That is why we use Amazon Bedrock to power many of our generative AI capabilities. Amazon Bedrock provides responsible AI features, which enable us to have full control over our information through its data protection and privacy policies, and block harmful content through Guardrails for Bedrock. We just built on it to help product managers discover insights by analyzing feedback submitted by their customers. This is just the beginning. We will continue to build on advanced AWS technology to help product development teams everywhere prioritize what to build next with confidence.”

In the preview post, Antje showed you how to use guardrails to configure thresholds to filter content across harmful categories and define a set of topics that need to be avoided in the context of your application. The Content filters feature now has two additional safety categories: Misconduct for detecting criminal activities and Prompt Attack for detecting prompt injection and jailbreak attempts. We also added important new features, including sensitive information filters to detect and redact personally identifiable information (PII) and word filters to block inputs containing profane and custom words (for example, harmful words, competitor names, and products).

Guardrails for Amazon Bedrock sits in between the application and the model. Guardrails automatically evaluates everything going into the model from the application and coming out of the model to the application to detect and help prevent content that falls into restricted categories.

You can recap the steps in the preview release blog to learn how to configure Denied topics and Content filters. Let me show you how the new features work.

New features
To start using Guardrails for Amazon Bedrock, I go to the AWS Management Console for Amazon Bedrock, where I can create guardrails and configure the new capabilities. In the navigation pane in the Amazon Bedrock console, I choose Guardrails, and then I choose Create guardrail.

I enter the guardrail Name and Description. I choose Next to move to the Add sensitive information filters step.

I use Sensitive information filters to detect sensitive and private information in user inputs and FM outputs. Based on the use cases, I can select a set of entities to be either blocked in inputs (for example, a FAQ-based chatbot that doesn’t require user-specific information) or redacted in outputs (for example, conversation summarization based on chat transcripts). The sensitive information filter supports a set of predefined PII types. I can also define custom regex-based entities specific to my use case and needs.

I add two PII types (Name, Email) from the list and add a regular expression pattern using Booking ID as Name and [0-9a-fA-F]{8} as the Regex pattern.

I choose Next and enter custom messages that will be displayed if my guardrail blocks the input or the model response in the Define blocked messaging step. I review the configuration at the last step and choose Create guardrail.

I navigate to the Guardrails Overview page and choose the Anthropic Claude Instant 1.2 model using the Test section. I enter the following call center transcript in the Prompt field and choose Run.

Please summarize the below call center transcript. Put the name, email and the booking ID to the top:
Agent: Welcome to ABC company. How can I help you today?
Customer: I want to cancel my hotel booking.
Agent: Sure, I can help you with the cancellation. Can you please provide your booking ID?
Customer: Yes, my booking ID is 550e8408.
Agent: Thank you. Can I have your name and email for confirmation?
Customer: My name is Jane Doe and my email is jane.doe@gmail.com
Agent: Thank you for confirming. I will go ahead and cancel your reservation.

Guardrail action shows there are three instances where the guardrails came in to effect. I use View trace to check the details. I notice that the guardrail detected the Name, Email and Booking ID and masked them in the final response.

I use Word filters to block inputs containing profane and custom words (for example, competitor names or offensive words). I check the Filter profanity box. The profanity list of words is based on the global definition of profanity. Additionally, I can specify up to 10,000 phrases (with a maximum of three words per phrase) to be blocked by the guardrail. A blocked message will show if my input or model response contain these words or phrases.

Now, I choose Custom words and phrases under Word filters and choose Edit. I use Add words and phrases manually to add a custom word CompetitorY. Alternatively, I can use Upload from a local file or Upload from S3 object if I need to upload a list of phrases. I choose Save and exit to return to my guardrail page.

I enter a prompt containing information about a fictional company and its competitor and add the question What are the extra features offered by CompetitorY?. I choose Run.

I use View trace to check the details. I notice that the guardrail intervened according to the policies I configured.

Now available
Guardrails for Amazon Bedrock is now available in US East (N. Virginia) and US West (Oregon) Regions.

For pricing information, visit the Amazon Bedrock pricing page.

To get started with this feature, visit the Guardrails for Amazon Bedrock web page.

For deep-dive technical content and to learn how our Builder communities are using Amazon Bedrock in their solutions, visit our community.aws website.

— Esra

Agents for Amazon Bedrock: Introducing a simplified creation and configuration experience

This post was originally published on this site

With Agents for Amazon Bedrock, applications can use generative artificial intelligence (generative AI) to run tasks across multiple systems and data sources. Starting today, these new capabilities streamline the creation and management of agents:

Quick agent creation – You can now quickly create an agent and optionally add instructions and action groups later, providing flexibility and agility for your development process.

Agent builder – All agent configurations can be operated in the new agent builder section of the console.

Simplified configuration – Action groups can use a simplified schema that just lists functions and parameters without having to provide an API schema.

Return of control –You can skip using an AWS Lambda function and return control to the application invoking the agent. In this way, the application can directly integrate with systems outside AWS or call internal endpoints hosted in any Amazon Virtual Private Cloud (Amazon VPC) without the need to integrate the required networking and security configurations with a Lambda function.

Infrastructure as code – You can use AWS CloudFormation to deploy and manage agents with the new simplified configuration, ensuring consistency and reproducibility across environments for your generative AI applications.

Let’s see how these enhancements work in practice.

Creating an agent using the new simplified console
To test the new experience, I want to build an agent that can help me reply to an email containing customer feedback. I can use generative AI, but a single invocation of a foundation model (FM) is not enough because I need to interact with other systems. To do that, I use an agent.

In the Amazon Bedrock console, I choose Agents from the navigation pane and then Create Agent. I enter a name for the agent (customer-feedback) and a description. Using the new interface, I proceed and create the agent without providing additional information at this stage.

Console screenshot.

I am now presented with the Agent builder, the place where I can access and edit the overall configuration of an agent. In the Agent resource role, I leave the default setting as Create and use a new service role so that the AWS Identity and Access Management (IAM) role assumed by the agent is automatically created for me. For the model, I select Anthropic and Claude 3 Sonnet.

Console screenshot.

In Instructions for the Agent, I provide clear and specific instructions for the task the agent has to perform. Here, I can also specify the style and tone I want the agent to use when replying. For my use case, I enter:

Help reply to customer feedback emails with a solution tailored to the customer account settings.

In Additional settings, I select Enabled for User input so that the agent can ask for additional details when it does not have enough information to respond. Then, I choose Save to update the configuration of the agent.

I now choose Add in the Action groups section. Action groups are the way agents can interact with external systems to gather more information or perform actions. I enter a name (retrieve-customer-settings) and a description for the action group:

Retrieve customer settings including customer ID.

The description is optional but, when provided, is passed to the model to help choose when to use this action group.

Console screenshot.

In Action group type, I select Define with function details so that I only need to specify functions and their parameters. The other option here (Define with API schemas) corresponds to the previous way of configuring action groups using an API schema.

Action group functions can be associated to a Lambda function call or configured to return control to the user or application invoking the agent so that they can provide a response to the function. The option to return control is useful for four main use cases:

  • When it’s easier to call an API from an existing application (for example, the one invoking the agent) than building a new Lambda function with the correct authentication and network configurations as required by the API
  • When the duration of the task goes beyond the maximum Lambda function timeout of 15 minutes so that I can handle the task with an application running in containers or virtual servers or use a workflow orchestration such as AWS Step Functions
  • When I have time-consuming actions because, with the return of control, the agent doesn’t wait for the action to complete before proceeding to the next step, and the invoking application can run actions asynchronously in the background while the orchestration flow of the agent continues
  • When I need a quick way to mock the interaction with an API during the development and testing and of an agent

In Action group invocation, I can specify the Lambda function that will be invoked when this action group is identified by the model during orchestration. I can ask the console to quickly create a new Lambda function, to select an existing Lambda function, or return control so that the user or application invoking the agent will ask for details to generate a response. I select Return Control to show how that works in the console.

Console screenshot.

I configure the first function of the action group. I enter a name (retrieve-customer-settings-from-crm) and the following description for the function:

Retrieve customer settings from CRM including customer ID using the customer email in the sender/from fields of the email.

Console screenshot.

In Parameters, I add email with Customer email as the description. This is a parameter of type String and is required by this function. I choose Add to complete the creation of the action group.

Because, for my use case, I expect many customers to have issues when logging in, I add another action group (named check-login-status) with the following description:

Check customer login status.

This time, I select the option to create a new Lambda function so that I can handle these requests in code.

For this action group, I configure a function (named check-customer-login-status-in-login-system) with the following description:

Check customer login status in login system using the customer ID from settings.

In Parameters, I add customer_id, another required parameter of type String. Then, I choose Add to complete the creation of the second action group.

When I open the configuration of this action group, I see the name of the Lambda function that has been created in my account. There, I choose View to open the Lambda function in the console.

Console screenshot.

In the Lambda console, I edit the starting code that has been provided and implement my business case:

import json

def lambda_handler(event, context):
    print(event)
    
    agent = event['agent']
    actionGroup = event['actionGroup']
    function = event['function']
    parameters = event.get('parameters', [])

    # Execute your business logic here. For more information,
    # refer to: https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html
    if actionGroup == 'check-login-status' and function == 'check-customer-login-status-in-login-system':
        response = {
            "status": "unknown"
        }
        for p in parameters:
            if p['name'] == 'customer_id' and p['type'] == 'string' and p['value'] == '12345':
                response = {
                    "status": "not verified",
                    "reason": "the email address has not been verified",
                    "solution": "please verify your email address"
                }
    else:
        response = {
            "error": "Unknown action group {} or function {}".format(actionGroup, function)
        }
    
    responseBody =  {
        "TEXT": {
            "body": json.dumps(response)
        }
    }

    action_response = {
        'actionGroup': actionGroup,
        'function': function,
        'functionResponse': {
            'responseBody': responseBody
        }

    }

    dummy_function_response = {'response': action_response, 'messageVersion': event['messageVersion']}
    print("Response: {}".format(dummy_function_response))

    return dummy_function_response

I choose Deploy in the Lambda console. The function is configured with a resource-based policy that allows Amazon Bedrock to invoke the function. For this reason, I don’t need to update the IAM role used by the agent.

I am ready to test the agent. Back in the Amazon Bedrock console, with the agent selected, I look for the Test Agent section. There, I choose Prepare to prepare the agent and test it with the latest changes.

As input to the agent, I provide this sample email:

From: danilop@example.com

Subject: Problems logging in

Hi, when I try to log into my account, I get an error and cannot proceed further. Can you check? Thank you, Danilo

In the first step, the agent orchestration decides to use the first action group (retrieve-customer-settings) and function (retrieve-customer-settings-from-crm). This function is configured to return control, and in the console, I am asked to provide the output of the action group function. The customer email address is provided as the input parameter.

Console screenshot.

To simulate an interaction with an application, I reply with a JSON syntax and choose Submit:

{ "customer id": 12345 }

In the next step, the agent has the information required to use the second action group (check-login-status) and function (check-customer-login-status-in-login-system) to call the Lambda function. In return, the Lambda function provides this JSON payload:

{
  "status": "not verified",
  "reason": "the email address has not been verified",
  "solution": "please verify your email address"
}

Using this content, the agent can complete its task and suggest the correct solution for this customer.

Console screenshot.

I am satisfied with the result, but I want to know more about what happened under the hood. I choose Show trace where I can see the details of each step of the agent orchestration. This helps me understand the agent decisions and correct the configurations of the agent groups if they are not used as I expect.

Console screenshot.

Things to know
You can use the new simplified experience to create and manage Agents for Amazon Bedrock in the US East (N. Virginia) and US West (Oregon) AWS Regions.

You can now create an agent without having to specify an API schema or provide a Lambda function for the action groups. You just need to list the parameters that the action group needs. When invoking the agent, you can choose to return control with the details of the operation to perform so that you can handle the operation in your existing applications or if the duration is longer than the maximum Lambda function timeout.

CloudFormation support for Agents for Amazon Bedrock has been released recently and is now being updated to support the new simplified syntax.

To learn more:

Danilo

Import custom models in Amazon Bedrock (preview)

This post was originally published on this site

With Amazon Bedrock, you have access to a choice of high-performing foundation models (FMs) from leading artificial intelligence (AI) companies that make it easier to build and scale generative AI applications. Some of these models provide publicly available weights that can be fine-tuned and customized for specific use cases. However, deploying customized FMs in a secure and scalable way is not an easy task.

Starting today, Amazon Bedrock adds in preview the capability to import custom weights for supported model architectures (such as Meta Llama 2, Llama 3, and Mistral) and serve the custom model using On-Demand mode. You can import models with weights in Hugging Face safetensors format from Amazon SageMaker and Amazon Simple Storage Service (Amazon S3).

In this way, you can use Amazon Bedrock with existing customized models such as Code Llama, a code-specialized version of Llama 2 that was created by further training Llama 2 on code-specific datasets, or use your data to fine-tune models for your own unique business case and import the resulting model in Amazon Bedrock.

Let’s see how this works in practice.

Bringing a custom model to Amazon Bedrock
In the Amazon Bedrock console, I choose Imported models from the Foundation models section of the navigation pane. Now, I can create a custom model by importing model weights from an Amazon Simple Storage Service (Amazon S3) bucket or from an Amazon SageMaker model.

I choose to import model weights from an S3 bucket. In another browser tab, I download the MistralLite model from the Hugging Face website using this pull request (PR) that provides weights in safetensors format. The pull request is currently Ready to merge, so it might be part of the main branch when you read this. MistralLite is a fine-tuned Mistral-7B-v0.1 language model with enhanced capabilities of processing long context up to 32K tokens.

When the download is complete, I upload the files to an S3 bucket in the same AWS Region where I will import the model. Here are the MistralLite model files in the Amazon S3 console:

Console screenshot.

Back at the Amazon Bedrock console, I enter a name for the model and keep the proposed import job name.

Console screenshot.

I select Model weights in the Model import settings and browse S3 to choose the location where I uploaded the model weights.

Console screenshot.

To authorize Amazon Bedrock to access the files on the S3 bucket, I select the option to create and use a new AWS Identity and Access Management (IAM) service role. I use the View permissions details link to check what will be in the role. Then, I submit the job.

About ten minutes later, the import job is completed.

Console screenshot.

Now, I see the imported model in the console. The list also shows the model Amazon Resource Name (ARN) and the creation date.

Console screenshot.

I choose the model to get more information, such as the S3 location of the model files.

Console screenshot.

In the model detail page, I choose Open in playground to test the model in the console. In the text playground, I type a question using the prompt template of the model:

<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>

The MistralLite imported model is quick to reply and describe some of those challenges.

Console screenshot.

In the playground, I can tune responses for my use case using configurations such as temperature and maximum length or add stop sequences specific to the imported model.

To see the syntax of the API request, I choose the three small vertical dots at the top right of the playground.

Console screenshot.

I choose View API syntax and run the command using the AWS Command Line Interface (AWS CLI):

aws bedrock-runtime invoke-model 
--model-id arn:aws:bedrock:us-east-1:123412341234:imported-model/a82bkefgp20f 
--body "{"prompt":"<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>","max_tokens":512,"top_k":200,"top_p":0.9,"stop":[],"temperature":0.5}" 
--cli-binary-format raw-in-base64-out 
--region us-east-1 
invoke-model-output.txt

The output is similar to what I got in the playground. As you can see, for imported models, the model ID is the ARN of the imported model. I can use the model ID to invoke the imported model with the AWS CLI and AWS SDKs.

Things to know
You can bring your own weights for supported model architectures to Amazon Bedrock in the US East (N. Virginia) AWS Region. The model import capability is currently available in preview.

When using custom weights, Amazon Bedrock serves the model with On-Demand mode, and you only pay for what you use with no time-based term commitments. For detailed information, see Amazon Bedrock pricing.

The ability to import models is managed using AWS Identity and Access Management (IAM), and you can allow this capability only to the roles in your organization that need to have it.

With this launch, it’s now easier to build and scale generative AI applications using custom models with security and privacy built in.

To learn more:

Danilo

Unify DNS management using Amazon Route 53 Profiles with multiple VPCs and AWS accounts

This post was originally published on this site

If you are managing lots of accounts and Amazon Virtual Private Cloud (Amazon VPC) resources, sharing and then associating many DNS resources to each VPC can present a significant burden. You often hit limits around sharing and association, and you may have gone as far as building your own orchestration layers to propagate DNS configuration across your accounts and VPCs.

Today, I’m happy to announce Amazon Route 53 Profiles, which provide the ability to unify management of DNS across all of your organization’s accounts and VPCs. Route 53 Profiles let you define a standard DNS configuration, including Route 53 private hosted zone (PHZ) associations, Resolver forwarding rules, and Route 53 Resolver DNS Firewall rule groups, and apply that configuration to multiple VPCs in the same AWS Region. With Profiles, you have an easy way to ensure all of your VPCs have the same DNS configuration without the complexity of handling separate Route 53 resources. Managing DNS across many VPCs is now as simple as managing those same settings for a single VPC.

Profiles are natively integrated with AWS Resource Access Manager (RAM) allowing you to share your Profiles across accounts or with your AWS Organizations account. Profiles integrates seamlessly with Route 53 private hosted zones by allowing you to create and add existing private hosted zones to your Profile so that your organizations have access to these same settings when the Profile is shared across accounts. AWS CloudFormation allows you to use Profiles to set DNS settings consistently for VPCs as accounts are newly provisioned. With today’s release, you can better govern DNS settings for your multi-account environments.

How Route 53 Profiles works
To start using the Route 53 Profiles, I go to the AWS Management Console for Route 53, where I can create Profiles, add resources to them, and associate them to their VPCs. Then, I share the Profile I created across another account using AWS RAM.

In the navigation pane in the Route 53 console, I choose Profiles and then I choose Create profile to set up my Profile.

I give my Profile configuration a friendly name such as MyFirstRoute53Profile and optionally add tags.

I can configure settings for DNS Firewall rule groups, private hosted zones and Resolver rules or add existing ones within my account all within the Profile console page.

I choose VPCs to associate my VPCs to the Profile. I can add tags as well as do configurations for recursive DNSSEC validation, the failure mode for the DNS Firewalls associated to my VPCs. I can also control the order of DNS evaluation: First VPC DNS then Profile DNS, or first Profile DNS then VPC DNS.

I can associate one Profile per VPC and can associate up to 5,000 VPCs to a single Profile.

Profiles gives me the ability to manage settings for VPCs across accounts in my organization. I am able to disable reverse DNS rules for each of the VPCs the Profile is associated with rather than configuring these on a per-VPC basis. The Route 53 Resolver automatically creates rules for reverse DNS lookups for me so that different services can easily resolve hostnames from IP addresses. If I use DNS Firewall, I am able to select the failure mode for my firewall via settings, to fail open or fail closed. I am also able to specify if I wish for the VPCs associated to the Profile to have recursive DNSSEC validation enabled without having to use DNSSEC signing in Route 53 (or any other provider).

Let’s say I associate a Profile to a VPC. What happens when a query exactly matches both a resolver rule or PHZ associated directly to the VPC and a resolver rule or PHZ associated to the VPC’s Profile? Which DNS settings take precedence, the Profile’s or the local VPC’s? For example, if the VPC is associated to a PHZ for example.com and the Profile contains a PHZ for example.com, that VPC’s local DNS settings will take precedence over the Profile. When a query is made for a name for a conflicting domain name (for example, the Profile contains a PHZ for infra.example.com and the VPC is associated to a PHZ that has the name account1.infra.example.com), the most specific name wins.

Sharing Route 53 Profiles across accounts using AWS RAM
I use AWS Resource Access Manager (RAM) to share the Profile I created in the previous section with my other account.

I choose the Share profile option in the Profiles detail page or I can go to the AWS RAM console page and choose Create resource share.

I provide a name for my resource share and then I search for the ‘Route 53 Profiles’ in the Resources section. I select the Profile in Selected resources. I can choose to add tags. Then, I choose Next.

Profiles utilize RAM managed permissions, which allow me to attach different permissions to each resource type. By default, only the owner (the network admin) of the Profile will be able to modify the resources within the Profile. Recipients of the Profile (the VPC owners) will only be able to view the contents of the Profile (the ReadOnly mode). To allow a recipient of the Profile to add PHZs or other resources to it, the Profile’s owner will have to attach the necessary permissions to the resource. Recipients will not be able to edit or delete any resources added by the Profile owner to the shared resource.

I leave the default selections and choose Next to grant access to my other account.

On the next page, I choose Allow sharing with anyone, enter my other account’s ID and then choose Add. After that, I choose that account ID in the Selected principals section and choose Next.

In the Review and create page, I choose Create resource share. Resource share is successfully created.

Now, I switch to my other account that I share my Profile with and go to the RAM console. In the navigation menu, I go to the Resource shares and choose the resource name I created in the first account. I choose Accept resource share to accept the invitation.

That’s it! Now, I go to my Route 53 Profiles page and I choose the Profile shared with me.

I have access to the shared Profile’s DNS Firewall rule groups, private hosted zones, and Resolver rules. I can associate this account’s VPCs to this Profile. I am not able to edit or delete any resources. Profiles are Regional resources and cannot be shared across Regions.

Available now
You can easily get started with Route 53 Profiles using the AWS Management Console, Route 53 API, AWS Command Line Interface (AWS CLI), AWS CloudFormation, and AWS SDKs.

Route 53 Profiles will be available in all AWS Regions, except in Canada West (Calgary), the AWS GovCloud (US) Regions and the Amazon Web Services China Regions.

For more details about the pricing, visit the Route 53 pricing page.

Get started with Profiles today and please let us know your feedback either through your usual AWS Support contacts or the AWS re:Post for Amazon Route 53.

— Esra

Anthropic’s Claude 3 Opus model is now available on Amazon Bedrock

This post was originally published on this site

We are living in the generative artificial intelligence (AI) era; a time of rapid innovation. When Anthropic announced its Claude 3 foundation models (FMs) on March 4, we made Claude 3 Sonnet, a model balanced between skills and speed, available on Amazon Bedrock the same day. On March 13, we launched the Claude 3 Haiku model on Amazon Bedrock, the fastest and most compact member of the Claude 3 family for near-instant responsiveness.

Today, we are announcing the availability of Anthropic’s Claude 3 Opus on Amazon Bedrock, the most intelligent Claude 3 model, with best-in-market performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen scenarios with remarkable fluency and human-like understanding, leading the frontier of general intelligence.

With the availability of Claude 3 Opus on Amazon Bedrock, enterprises can build generative AI applications to automate tasks, generate revenue through user-facing applications, conduct complex financial forecasts, and accelerate research and development across various sectors. Like the rest of the Claude 3 family, Opus can process images and return text outputs.

Claude 3 Opus shows an estimated twofold gain in accuracy over Claude 2.1 on difficult open-ended questions, reducing the likelihood of faulty responses. As enterprise customers rely on Claude across industries like healthcare, finance, and legal research, improved accuracy is essential for safety and performance.

How does Claude 3 Opus perform?
Claude 3 Opus outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate-level expert knowledge (MMLU), graduate-level expert reasoning (GPQA), basic mathematics (GSM8K), and more. It exhibits high levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.


Source: https://www.anthropic.com/news/claude-3-family

Here are a few supported use cases for the Claude 3 Opus model:

  • Task automation: planning and execution of complex actions across APIs, databases, and interactive coding
  • Research: brainstorming and hypothesis generation, research review, and drug discovery
  • Strategy: advanced analysis of charts and graphs, financials and market trends, and forecasting

To learn more about Claude 3 Opus’s features and capabilities, visit Anthropic’s Claude on Bedrock page and Anthropic Claude models in the Amazon Bedrock documentation.

Claude 3 Opus in action
If you are new to using Anthropic models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. Request access separately for Claude 3 Opus.

2024-claude3-opus-2-model-access screenshot

To test Claude 3 Opus in the console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Anthropic as the category and Claude 3 Opus as the model.

To test more Claude prompt examples, choose Load examples. You can view and run examples specific to Claude 3 Opus, such as analyzing a quarterly report, building a website, and creating a side-scrolling game.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model 
     --model-id anthropic.claude-3-opus-20240229-v1:0 
     --body "{"messages":[{"role":"user","content":[{"type":"text","text":" Your task is to create a one-page website for an online learning platform.n"}]}],"anthropic_version":"bedrock-2023-05-31","max_tokens":2000,"temperature":1,"top_k":250,"top_p":0.999,"stop_sequences":["nnHuman:"]}" 
     --cli-binary-format raw-in-base64-out 
     --region us-east-1 
     invoke-model-output.txt

As I mentioned in my previous Claude 3 model launch posts, you need to use the new Anthropic Claude Messages API format for some Claude 3 model features, such as image processing. If you use Anthropic Claude Text Completions API and want to use Claude 3 models, you should upgrade from the Text Completions API.

My colleagues, Dennis Traub and Francois Bouteruche, are building code examples for Amazon Bedrock using AWS SDKs. You can learn how to invoke Claude 3 on Amazon Bedrock to generate text or multimodal prompts for image analysis in the Amazon Bedrock documentation.

Here is sample JavaScript code to send a Messages API request to generate text:

// claude_opus.js - Invokes Anthropic Claude 3 Opus using the Messages API.
import {
  BedrockRuntimeClient,
  InvokeModelCommand
} from "@aws-sdk/client-bedrock-runtime";

const modelId = "anthropic.claude-3-opus-20240229-v1:0";
const prompt = "Hello Claude, how are you today?";

// Create a new Bedrock Runtime client instance
const client = new BedrockRuntimeClient({ region: "us-east-1" });

// Prepare the payload for the model
const payload = {
  anthropic_version: "bedrock-2023-05-31",
  max_tokens: 1000,
  messages: [{
    role: "user",
    content: [{ type: "text", text: prompt }]
  }]
};

// Invoke Claude with the payload and wait for the response
const command = new InvokeModelCommand({
  contentType: "application/json",
  body: JSON.stringify(payload),
  modelId
});
const apiResponse = await client.send(command);

// Decode and print Claude's response
const decodedResponseBody = new TextDecoder().decode(apiResponse.body);
const responseBody = JSON.parse(decodedResponseBody);
const text = responseBody.content[0].text;
console.log(`Response: ${text}`);

Now, you can install the AWS SDK for JavaScript Runtime Client for Node.js and run claude_opus.js.

npm install @aws-sdk/client-bedrock-runtime
node claude_opus.js

For more examples in different programming languages, check out the code examples section in the Amazon Bedrock User Guide, and learn how to use system prompts with Anthropic Claude at Community.aws.

Now available
Claude 3 Opus is available today in the US West (Oregon) Region; check the full Region list for future updates.

Give Claude 3 Opus a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: New features on Knowledge Bases for Amazon Bedrock, OAC for Lambda function URL origins on Amazon CloudFront, and more (April 15, 2024)

This post was originally published on this site

AWS Community Days conferences are in full swing with AWS communities around the globe. The AWS Community Day Poland was hosted last week with more than 600 cloud enthusiasts in attendance. Community speakers Agnieszka Biernacka, Krzysztof Kąkol, and more, presented talks which captivated the audience and resulted in vibrant discussions throughout the day. My teammate, Wojtek Gawroński, was at the event and he’s already looking forward to attending again next year!

Last week’s launches
Here are some launches that got my attention during the previous week.

Amazon CloudFront now supports Origin Access Control (OAC) for Lambda function URL origins – Now you can protect your AWS Lambda URL origins by using Amazon CloudFront Origin Access Control (OAC) to only allow access from designated CloudFront distributions. The CloudFront Developer Guide has more details on how to get started using CloudFront OAC to authenticate access to Lambda function URLs from your designated CloudFront distributions.

AWS Client VPN and AWS Verified Access migration and interoperability patterns – If you’re using AWS Client VPN or a similar third-party VPN-based solution to provide secure access to your applications today, you’ll be pleased to know that you can now combine the use of AWS Client VPN and AWS Verified Access for your new or existing applications.

These two announcements related to Knowledge Bases for Amazon Bedrock caught my eye:

Metadata filtering to improve retrieval accuracy – With metadata filtering, you can retrieve not only semantically relevant chunks but a well-defined subset of those relevant chunks based on applied metadata filters and associated values.

Custom prompts for the RetrieveAndGenerate API and configuration of the maximum number of retrieved results – These are two new features which you can now choose as query options alongside the search type to give you control over the search results. These are retrieved from the vector store and passed to the Foundation Models for generating the answer.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
AWS Summits – These are free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Whether you’re in the Americas, Asia Pacific & Japan, or EMEA region, learn here about future AWS Summit events happening in your area.

AWS Community Days – Join an AWS Community Day event just like the one I mentioned at the beginning of this post to participate in technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from your area. If you’re in Kenya, or Nepal, there’s an event happening in your area this coming weekend.

You can browse all upcoming in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

– Veliswa

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS.

Introducing AWS Deadline Cloud: Set up a cloud-based render farm in minutes

This post was originally published on this site

Customers in industries such as architecture, engineering, & construction (AEC) and media & entertainment (M&E) generate the final frames for film, TV, games, industrial design visualizations, and other digital media with a process called rendering, which takes 2D/3D digital content data and computes an output, such as an image or video file. Rendering also requires significant compute power, especially to generate 3D graphics and visual effects (VFX) with resolutions as high as 16K for films and TV. This constrains the number of rendering projects that customers can take on at once.

To address this growing demand for rendering high-resolution content, customers often build what are called “render farms” which combine the power of hundreds or thousands of computing nodes to process their rendering jobs. Render farms can traditionally take weeks or even months to build and deploy, and they require significant planning and upfront commitments to procure hardware.

As a result, customers increasingly are transitioning to scalable, cloud-based render farms for efficient production instead of a dedicated render farm on-premises, which can require extremely high fixed costs. But, rendering in the cloud still requires customers to manage their own infrastructure, build bespoke tooling to manage costs on a project-by-project basis, and monitor software licensing costs with their preferred partners themselves.

Today, we are announcing the general availability of AWS Deadline Cloud, a new fully managed service that enables creative teams to easily set up a render farm in minutes, scale to run more projects in parallel, and only pay for what resources they use. AWS Deadline Cloud provides a web-based portal with the ability to create and manage render farms, preview in-progress renders, view and analyze render logs, and easily track these costs.

With Deadline Cloud, you can go from zero to render faster with integrations of digital content creation (DCC) tools and customization tools are built-in. You can reduce the effort and development time required to tailor your rendering pipeline to the needs of each job. You also have the flexibility to use licenses you already own or they are provided by the service for third-party DCC software and renderers such as Maya, Nuke, and Houdini.

Concepts of AWS Deadline Cloud
AWS Deadline Cloud allows you to create and manage rendering projects and jobs on Amazon Elastic Compute Cloud (Amazon EC2) instances directly from DCC pipelines and workstations. You can create a rendering farm, a collection of queues, and fleets. A queue is where your submitted jobs are located and scheduled to be rendered. A fleet is a group of worker nodes that can support multiple queues. A queue can be processed by multiple fleets.

Before you can work on a project, you should have access to the required resources, and the associated farm must be integrated with AWS IAM Identity Center to manage workforce authentication and authorization. IT administrators can create and grant access permissions to users and groups at different levels, such as viewers, contributors, managers, or owners.

Here are four key components of Deadline Cloud:

  • Deadline Cloud monitor – You can access statuses, logs, and other troubleshooting metrics for jobs, steps, and tasks. The Deadline Cloud monitor provides real-time access and updates to job progress. It also provides access to logs and other troubleshooting metrics, and you can browse multiple farm, fleet, and queue listings to view system utilization.
  • Deadline Cloud submitter – You can submit a rendering job directly using AWS SDK or AWS Command Line Interface (AWS CLI). You can also submit from DCC software using a Deadline Cloud submitter, which is a DCC-integrated plugin that supports Open Job Description (OpenJD), an open source template specification. With it, artists can submit rendering jobs from a third-party DCC interface they are more familiar with, such as Maya or Nuke, to Deadline Cloud, where project resources are managed and jobs are monitored in one location.
  • Deadline Cloud budget manager – You can create and edit budgets to help manage project costs and view how many AWS resources are used and the estimated costs for those resources.
  • Deadline Cloud usage explorer – You can use the usage explorer to track approximate compute and licensing costs based on public pricing rates in Amazon EC2 and Usage-Based Licensing (UBL).

Get started with AWS Deadline Cloud
To get started with AWS Deadline Cloud, define and create a farm with Deadline Cloud monitor, download the Deadline Cloud submitter, and install plugins for your favorite DCC applications with just a few clicks. You can define your rendering jobs in your DCC application and submit them to your created farm within the plugin’s user interfaces.

The DCC plugins detect the necessary input scene data and build a job bundle that uploads to the Amazon Simple Storage Service (Amazon S3) bucket in your account, transfer to Deadline Cloud for rendering the job, and provide completed frames to the S3 bucket for your customers to access.

1. Define a farm with Deadline Cloud monitor
Let’s create your Deadline Cloud monitor infrastructure and define your farm first. In the Deadline Cloud console, choose Set up Deadline Cloud to define a farm with a guided experience, including queues and fleets, adding groups and users, choosing a service role, and adding tags to your resources.

In this step, to choose all the default settings for your Deadline Cloud resources, choose Skip to Review in Step 3 after monitor setup. Otherwise choose Next and customize your Deadline Cloud resources.

Set up your monitor’s infrastructure and enter your Monitor display name. This name makes the Monitor URL, a web portal to manage your farms, queues, fleets, and usages. You can’t change the monitor URL after you finish setting up. The AWS Region is the physical location of your rendering farm, so you should choose the closest Region from your studio to reduce the latency and improve data transfer speeds.

To access the monitor, you can create new users and groups and manage users (such as by assigning them groups, permissions, and applications) or delete users from your monitor. Users, groups, and permissions can also be managed in the IAM Identity Center. So, if you don’t set up the IAM Identity Center in your Region, you should enable it first. To learn more, visit Managing users in Deadline Cloud in the AWS documentation.

In Step 2, you can define farm details such as the name and description of your farm. In Additional farm settings, you can set an AWS Key Management Service (AWS KMS) key to encrypt your data and tags to assign AWS resources for filtering your resources or tracking your AWS costs. Your data is encrypted by default with a key that AWS owns and manages for you. To choose a different key, customize your encryption settings.

You can choose Skip to Review and Create to finish the quick setup process with the default settings.

Let’s look at more optional configurations! In the step for defining queue details, you can set up an S3 bucket for your queue. Job assets are uploaded as job attachments during the rendering process. Job attachments are stored in your defined S3 bucket. Additionally, you can set up the default budget action, service access roles, and environment variables for your queue.

In the step for defining fleet details, set the fleet name, description, Instance option (either Spot or On-Demand Instance), and Auto scaling configuration to define the number of instances and the fleet’s worker requirements. We set conservative worker requirements by default. These values can be updated at any time after setting up your render farm. To learn more, visit Manage Deadline Cloud fleets in the AWS documentation.

Worker instances define EC2 instance types with vCPUs and memory size, for example, c5.large, c5a.large, and c6i.large. You can filter up to 100 EC2 instance types by either allowing or excluding types of worker instances.

Review all of the information entered to create your farm and choose Create farm.

The progress of your Deadline Cloud onboarding is displayed, and a success message displays when your monitor and farm are ready for use. To learn more details about the process, visit Set up a Deadline Cloud monitor in the AWS documentation.

In the Dashboard in the left pane, you can visit the overview of the monitor, farms, users, and groups that you created.

Choose Monitor to visit a web portal to manage your farms, queues, fleets, usages, and budgets. After signing in to your user account, you can enter a web portal and explore the Deadline Cloud resources you created. You can also download a Deadline Cloud monitor desktop application with the same user experiences from the Downloads page.

To learn more about using the monitor, visit Using the Deadline Cloud monitor in the AWS documentation.

2. Set up a workstation and submit your render job to Deadline Cloud
Let’s set up a workstation for artists on their desktops by installing the Deadline Cloud submitter application so they can easily submit render jobs from within Maya, Nuke, and Houdini. Choose Downloads in the left menu pane and download the proper submitter installer for your operating system to test your render farm.

This program installs the latest integrated plugin for Deadline Cloud submitter for Maya, Nuke, and Houdini.

For example, open a Maya on your desktop and your asset. I have an example of a wrench file I’m going to test with. Choose Windows in the menu bar and Settings/Preferences in the sub menu. In the Plugin Manager, search for DeadlineCloudSubmitter. Select Loaded to load the Deadline Cloud submitter plugin.

If you are not already authenticated in the Deadline Cloud submitter, the Deadline Cloud Status tab will display. Choose Login and sign in with your user credentials in a browser sign-in window.

Now, select the Deadline Cloud shelf, then choose the orange deadline cloud logo on the ‘Deadline’ shelf to launch the submitter. From the submitter window, choose the farm and queue you want your render submitted to. If desired, in the Scene Settings tab, you can override the frame range, change the Output Path, or both.

If you choose Submit, the wrench turntable Maya file, along with all of the necessary textures and alembic caches, will be uploaded to Deadline Cloud and rendered on the farm. You can monitor rendering jobs in your Deadline Cloud monitor.

When your render is finished, as indicated by the Succeeded status in the job monitor, choose the job, Job Actions, and Download Output. To learn more about scheduling and monitoring jobs, visit Deadline Cloud jobs in the AWS documentation.

View your rendered image with an image viewing application such as DJView. The image will look like this:

To learn more in detail about the developer-side setup process using the command line, visit Setting up a developer workstation for Deadline Cloud in the AWS documentation.

3. Managing budgets and usage for Deadline Cloud
To help you manage costs for Deadline Cloud, you can use a budget manager to create and edit budgets. You can also use a usage explorer to view how many AWS resources are used and the estimated costs for those resources.

Choose Budgets on the Deadline Cloud monitor page to create your budget for your farm.

You can create budget amounts and limits and set automated actions to help reduce or stop additional spend against the budget.

Choose Usage in the Deadline Cloud monitor page to find real-time metrics on the activity happening on each farm. You can look at the farm’s costs by different variables, such as queue, job, or user. Choose various time frames to find usage during a specific period and look at usage trends over time.

The costs displayed in the usage explorer are approximate. Use them as a guide for managing your resources. There may be other costs from using other connected AWS resources, such as Amazon S3, Amazon CloudWatch, and other services that are not accounted for in the usage explorer.

To learn more, visit Managing budgets and usage for Deadline Cloud in the AWS documentation.

Now available
AWS Deadline Cloud is now available in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland) Regions.

Give AWS Deadline Cloud a try in the Deadline Cloud console. For more information, visit the Deadline Cloud product page, Deadline Cloud User Guide in the AWS documentation, and send feedback to AWS re:Post for AWS Deadline Cloud or through your usual AWS support contacts.

Channy

Amazon GuardDuty EC2 Runtime Monitoring is now generally available

This post was originally published on this site

Amazon GuardDuty is a machine learning (ML)-based security monitoring and intelligent threat detection service that analyzes and processes various AWS data sources, continuously monitors your AWS accounts and workloads for malicious activity, and delivers detailed security findings for visibility and remediation.

I love the feature of GuardDuty Runtime Monitoring that analyzes operating system (OS)-level, network, and file events to detect potential runtime threats for specific AWS workloads in your environment. I first introduced the general availability of this feature for Amazon Elastic Kubernetes Service (Amazon EKS) resources in March 2023. Seb wrote about the expansion of the Runtime Monitoring feature to provide threat detection for Amazon Elastic Container Service (Amazon ECS) and AWS Fargate as well as the preview for Amazon Elastic Compute Cloud (Amazon EC2) workloads in Nov 2023.

Today, we are announcing the general availability of Amazon GuardDuty EC2 Runtime Monitoring to expand threat detection coverage for EC2 instances at runtime and complement the anomaly detection that GuardDuty already provides by continuously monitoring VPC Flow Logs, DNS query logs, and AWS CloudTrail management events. You now have visibility into on-host, OS-level activities and container-level context into detected threats.

With GuardDuty EC2 Runtime Monitoring, you can identify and respond to potential threats that might target the compute resources within your EC2 workloads. Threats to EC2 workloads often involve remote code execution that leads to the download and execution of malware. This could include instances or self-managed containers in your AWS environment that are connecting to IP addresses associated with cryptocurrency-related activity or to malware command-and-control related IP addresses.

GuardDuty Runtime Monitoring provides visibility into suspicious commands that involve malicious file downloads and execution across each step, which can help you discover threats during initial compromise and before they become business-impacting events. You can also centrally enable runtime threat detection coverage for accounts and workloads across the organization using AWS Organizations to simplify your security coverage.

Configure EC2 Runtime Monitoring in GuardDuty
With a few clicks, you can enable GuardDuty EC2 Runtime Monitoring in the GuardDuty console. For your first use, you need to enable Runtime Monitoring.

Any customers that are new to the EC2 Runtime Monitoring feature can try it for free for 30 days and gain access to all features and detection findings. The GuardDuty console shows how many days are left in the free trial.

Now, you can set up the GuardDuty security agent for the individual EC2 instances for which you want to monitor the runtime behavior. You can choose to deploy the GuardDuty security agent either automatically or manually. At GA, you can enable Automated agent configuration, which is a preferred option for most customers as it allows GuardDuty to manage the security agent on their behalf.

The agent will be deployed on EC2 instances with AWS Systems Manager and uses an Amazon Virtual Private Cloud (Amazon VPC) endpoint to receive the runtime events associated with your resource. If you want to manage the GuardDuty security agent manually, visit Managing the security agent Amazon EC2 instance manually in the AWS documentation. In multiple-account environments, delegated GuardDuty administrator accounts manage their member accounts using AWS Organizations. For more information, visit Managing multiple accounts in the AWS documentation.

When you enable EC2 Runtime Monitoring, you can find the covered EC2 instances list, account ID, and coverage status, and whether the agent is able to receive runtime events from the corresponding resource in the EC2 instance runtime coverage tab.

Even when the coverage status is Unhealthy, meaning it is not currently able to receive runtime findings, you still have defense in depth for your EC2 instance. GuardDuty continues to provide threat detection to the EC2 instance by monitoring CloudTrail, VPC flow, and DNS logs associated with it.

Check out GuardDuty EC2 Runtime security findings
When GuardDuty detects a potential threat and generates security findings, you can view the details of the healthy information.

Choose Findings in the left pane if you want to find security findings specific to Amazon EC2 resources. You can use the filter bar to filter the findings table by specific criteria, such as a Resource type of Instance. The severity and details of the findings differ based on the resource role, which indicates whether the EC2 resource was the target of suspicious activity or the actor performing the activity.

With today’s launch, we support over 30 runtime security findings for EC2 instances, such as detecting abused domains, backdoors, cryptocurrency-related activity, and unauthorized communications. For the full list, visit Runtime Monitoring finding types in the AWS documentation.

Resolve your EC2 security findings
Choose each EC2 security finding to know more details. You can find all the information associated with the finding and examine the resource in question to determine if it is behaving in an expected manner.

If the activity is authorized, you can use suppression rules or trusted IP lists to prevent false positive notifications for that resource. If the activity is unexpected, the security best practice is to assume the instance has been compromised and take the actions detailed in Remediating a potentially compromised Amazon EC2 instance in the AWS documentation.

You can integrate GuardDuty EC2 Runtime Monitoring with other AWS security services, such as AWS Security Hub or Amazon Detective. Or you can use Amazon EventBridge, allowing you to use integrations with security event management or workflow systems, such as Splunk, Jira, and ServiceNow, or trigger automated and semi-automated responses such as isolating a workload for investigation.

When you choose Investigate with Detective, you can find Detective-created visualizations for AWS resources to quickly and easily investigate security issues. To learn more, visit Integration with Amazon Detective in the AWS documentation.

Things to know
GuardDuty EC2 Runtime Monitoring support is now available for EC2 instances running Amazon Linux 2 or Amazon Linux 2023. You have the option to configure maximum CPU and memory limits for the agent. To learn more and for future updates, visit Prerequisites for Amazon EC2 instance support in the AWS documentation.

To estimate the daily average usage costs for GuardDuty, choose Usage in the left pane. During the 30-day free trial period, you can estimate what your costs will be after the trial period. At the end of the trial period, we charge you per vCPU hours tracked monthly for the monitoring agents. To learn more, visit the Amazon GuardDuty pricing page.

Enabling EC2 Runtime Monitoring also allows for a cost-saving opportunity on your GuardDuty cost. When the feature is enabled, you won’t be charged for GuardDuty foundational protection VPC Flow Logs sourced from the EC2 instances running the security agent. This is due to similar, but more contextual, network data available from the security agent. Additionally, GuardDuty would still process VPC Flow Logs and generate relevant findings so you will continue to get network-level security coverage even if the agent experiences downtime.

Now available
Amazon GuardDuty EC2 Runtime Monitoring is now available in all AWS Regions where GuardDuty is available, excluding AWS GovCloud (US) Regions and AWS China Regions. For a full list of Regions where EC2 Runtime Monitoring is available, visit Region-specific feature availability.

Give GuardDuty EC2 Runtime Monitoring a try in the GuardDuty console. For more information, visit the Amazon GuardDuty User Guide and send feedback to AWS re:Post for Amazon GuardDuty or through your usual AWS support contacts.

Channy

Run large-scale simulations with AWS Batch multi-container jobs

This post was originally published on this site

Industries like automotive, robotics, and finance are increasingly implementing computational workloads like simulations, machine learning (ML) model training, and big data analytics to improve their products. For example, automakers rely on simulations to test autonomous driving features, robotics companies train ML algorithms to enhance robot perception capabilities, and financial firms run in-depth analyses to better manage risk, process transactions, and detect fraud.

Some of these workloads, including simulations, are especially complicated to run due to their diversity of components and intensive computational requirements. A driving simulation, for instance, involves generating 3D virtual environments, vehicle sensor data, vehicle dynamics controlling car behavior, and more. A robotics simulation might test hundreds of autonomous delivery robots interacting with each other and other systems in a massive warehouse environment.

AWS Batch is a fully managed service that can help you run batch workloads across a range of AWS compute offerings, including Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), AWS Fargate, and Amazon EC2 Spot or On-Demand Instances. Traditionally, AWS Batch only allowed single-container jobs and required extra steps to merge all components into a monolithic container. It also did not allow using separate “sidecar” containers, which are auxiliary containers that complement the main application by providing additional services like data logging. This additional effort required coordination across multiple teams, such as software development, IT operations, and quality assurance (QA), because any code change meant rebuilding the entire container.

Now, AWS Batch offers multi-container jobs, making it easier and faster to run large-scale simulations in areas like autonomous vehicles and robotics. These workloads are usually divided between the simulation itself and the system under test (also known as an agent) that interacts with the simulation. These two components are often developed and optimized by different teams. With the ability to run multiple containers per job, you get the advanced scaling, scheduling, and cost optimization offered by AWS Batch, and you can use modular containers representing different components like 3D environments, robot sensors, or monitoring sidecars. In fact, customers such as IPG Automotive, MORAI, and Robotec.ai are already using AWS Batch multi-container jobs to run their simulation software in the cloud.

Let’s see how this works in practice using a simplified example and have some fun trying to solve a maze.

Building a Simulation Running on Containers
In production, you will probably use existing simulation software. For this post, I built a simplified version of an agent/model simulation. If you’re not interested in code details, you can skip this section and go straight to how to configure AWS Batch.

For this simulation, the world to explore is a randomly generated 2D maze. The agent has the task to explore the maze to find a key and then reach the exit. In a way, it is a classic example of pathfinding problems with three locations.

Here’s a sample map of a maze where I highlighted the start (S), end (E), and key (K) locations.

Sample ASCII maze map.

The separation of agent and model into two separate containers allows different teams to work on each of them separately. Each team can focus on improving their own part, for example, to add details to the simulation or to find better strategies for how the agent explores the maze.

Here’s the code of the maze model (app.py). I used Python for both examples. The model exposes a REST API that the agent can use to move around the maze and know if it has found the key and reached the exit. The maze model uses Flask for the REST API.

import json
import random
from flask import Flask, request, Response

ready = False

# How map data is stored inside a maze
# with size (width x height) = (4 x 3)
#
#    012345678
# 0: +-+-+ +-+
# 1: | |   | |
# 2: +-+ +-+-+
# 3: | |   | |
# 4: +-+-+ +-+
# 5: | | | | |
# 6: +-+-+-+-+
# 7: Not used

class WrongDirection(Exception):
    pass

class Maze:
    UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3
    OPEN, WALL = 0, 1
    

    @staticmethod
    def distance(p1, p2):
        (x1, y1) = p1
        (x2, y2) = p2
        return abs(y2-y1) + abs(x2-x1)


    @staticmethod
    def random_dir():
        return random.randrange(4)


    @staticmethod
    def go_dir(x, y, d):
        if d == Maze.UP:
            return (x, y - 1)
        elif d == Maze.RIGHT:
            return (x + 1, y)
        elif d == Maze.DOWN:
            return (x, y + 1)
        elif d == Maze.LEFT:
            return (x - 1, y)
        else:
            raise WrongDirection(f"Direction: {d}")


    def __init__(self, width, height):
        self.width = width
        self.height = height        
        self.generate()
        

    def area(self):
        return self.width * self.height
        

    def min_lenght(self):
        return self.area() / 5
    

    def min_distance(self):
        return (self.width + self.height) / 5
    

    def get_pos_dir(self, x, y, d):
        if d == Maze.UP:
            return self.maze[y][2 * x + 1]
        elif d == Maze.RIGHT:
            return self.maze[y][2 * x + 2]
        elif d == Maze.DOWN:
            return self.maze[y + 1][2 * x + 1]
        elif d ==  Maze.LEFT:
            return self.maze[y][2 * x]
        else:
            raise WrongDirection(f"Direction: {d}")


    def set_pos_dir(self, x, y, d, v):
        if d == Maze.UP:
            self.maze[y][2 * x + 1] = v
        elif d == Maze.RIGHT:
            self.maze[y][2 * x + 2] = v
        elif d == Maze.DOWN:
            self.maze[y + 1][2 * x + 1] = v
        elif d ==  Maze.LEFT:
            self.maze[y][2 * x] = v
        else:
            WrongDirection(f"Direction: {d}  Value: {v}")


    def is_inside(self, x, y):
        return 0 <= y < self.height and 0 <= x < self.width


    def generate(self):
        self.maze = []
        # Close all borders
        for y in range(0, self.height + 1):
            self.maze.append([Maze.WALL] * (2 * self.width + 1))
        # Get a random starting point on one of the borders
        if random.random() < 0.5:
            sx = random.randrange(self.width)
            if random.random() < 0.5:
                sy = 0
                self.set_pos_dir(sx, sy, Maze.UP, Maze.OPEN)
            else:
                sy = self.height - 1
                self.set_pos_dir(sx, sy, Maze.DOWN, Maze.OPEN)
        else:
            sy = random.randrange(self.height)
            if random.random() < 0.5:
                sx = 0
                self.set_pos_dir(sx, sy, Maze.LEFT, Maze.OPEN)
            else:
                sx = self.width - 1
                self.set_pos_dir(sx, sy, Maze.RIGHT, Maze.OPEN)
        self.start = (sx, sy)
        been = [self.start]
        pos = -1
        solved = False
        generate_status = 0
        old_generate_status = 0                    
        while len(been) < self.area():
            (x, y) = been[pos]
            sd = Maze.random_dir()
            for nd in range(4):
                d = (sd + nd) % 4
                if self.get_pos_dir(x, y, d) != Maze.WALL:
                    continue
                (nx, ny) = Maze.go_dir(x, y, d)
                if (nx, ny) in been:
                    continue
                if self.is_inside(nx, ny):
                    self.set_pos_dir(x, y, d, Maze.OPEN)
                    been.append((nx, ny))
                    pos = -1
                    generate_status = len(been) / self.area()
                    if generate_status - old_generate_status > 0.1:
                        old_generate_status = generate_status
                        print(f"{generate_status * 100:.2f}%")
                    break
                elif solved or len(been) < self.min_lenght():
                    continue
                else:
                    self.set_pos_dir(x, y, d, Maze.OPEN)
                    self.end = (x, y)
                    solved = True
                    pos = -1 - random.randrange(len(been))
                    break
            else:
                pos -= 1
                if pos < -len(been):
                    pos = -1
                    
        self.key = None
        while(self.key == None):
            kx = random.randrange(self.width)
            ky = random.randrange(self.height)
            if (Maze.distance(self.start, (kx,ky)) > self.min_distance()
                and Maze.distance(self.end, (kx,ky)) > self.min_distance()):
                self.key = (kx, ky)


    def get_label(self, x, y):
        if (x, y) == self.start:
            c = 'S'
        elif (x, y) == self.end:
            c = 'E'
        elif (x, y) == self.key:
            c = 'K'
        else:
            c = ' '
        return c

                    
    def map(self, moves=[]):
        map = ''
        for py in range(self.height * 2 + 1):
            row = ''
            for px in range(self.width * 2 + 1):
                x = int(px / 2)
                y = int(py / 2)
                if py % 2 == 0: #Even rows
                    if px % 2 == 0:
                        c = '+'
                    else:
                        v = self.get_pos_dir(x, y, self.UP)
                        if v == Maze.OPEN:
                            c = ' '
                        elif v == Maze.WALL:
                            c = '-'
                else: # Odd rows
                    if px % 2 == 0:
                        v = self.get_pos_dir(x, y, self.LEFT)
                        if v == Maze.OPEN:
                            c = ' '
                        elif v == Maze.WALL:
                            c = '|'
                    else:
                        c = self.get_label(x, y)
                        if c == ' ' and [x, y] in moves:
                            c = '*'
                row += c
            map += row + 'n'
        return map


app = Flask(__name__)

@app.route('/')
def hello_maze():
    return "<p>Hello, Maze!</p>"

@app.route('/maze/map', methods=['GET', 'POST'])
def maze_map():
    if not ready:
        return Response(status=503, retry_after=10)
    if request.method == 'GET':
        return '<pre>' + maze.map() + '</pre>'
    else:
        moves = request.get_json()
        return maze.map(moves)

@app.route('/maze/start')
def maze_start():
    if not ready:
        return Response(status=503, retry_after=10)
    start = { 'x': maze.start[0], 'y': maze.start[1] }
    return json.dumps(start)

@app.route('/maze/size')
def maze_size():
    if not ready:
        return Response(status=503, retry_after=10)
    size = { 'width': maze.width, 'height': maze.height }
    return json.dumps(size)

@app.route('/maze/pos/<int:y>/<int:x>')
def maze_pos(y, x):
    if not ready:
        return Response(status=503, retry_after=10)
    pos = {
        'here': maze.get_label(x, y),
        'up': maze.get_pos_dir(x, y, Maze.UP),
        'down': maze.get_pos_dir(x, y, Maze.DOWN),
        'left': maze.get_pos_dir(x, y, Maze.LEFT),
        'right': maze.get_pos_dir(x, y, Maze.RIGHT),

    }
    return json.dumps(pos)


WIDTH = 80
HEIGHT = 20
maze = Maze(WIDTH, HEIGHT)
ready = True

The only requirement for the maze model (in requirements.txt) is the Flask module.

To create a container image running the maze model, I use this Dockerfile.

FROM --platform=linux/amd64 public.ecr.aws/docker/library/python:3.12-alpine

WORKDIR /app

COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt

COPY . .

CMD [ "python3", "-m" , "flask", "run", "--host=0.0.0.0", "--port=5555"]

Here’s the code for the agent (agent.py). First, the agent asks the model for the size of the maze and the starting position. Then, it applies its own strategy to explore and solve the maze. In this implementation, the agent chooses its route at random, trying to avoid following the same path more than once.

import random
import requests
from requests.adapters import HTTPAdapter, Retry

HOST = '127.0.0.1'
PORT = 5555

BASE_URL = f"http://{HOST}:{PORT}/maze"

UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3
OPEN, WALL = 0, 1

s = requests.Session()

retries = Retry(total=10,
                backoff_factor=1)

s.mount('http://', HTTPAdapter(max_retries=retries))

r = s.get(f"{BASE_URL}/size")
size = r.json()
print('SIZE', size)

r = s.get(f"{BASE_URL}/start")
start = r.json()
print('START', start)

y = start['y']
x = start['x']

found_key = False
been = set((x, y))
moves = [(x, y)]
moves_stack = [(x, y)]

while True:
    r = s.get(f"{BASE_URL}/pos/{y}/{x}")
    pos = r.json()
    if pos['here'] == 'K' and not found_key:
        print(f"({x}, {y}) key found")
        found_key = True
        been = set((x, y))
        moves_stack = [(x, y)]
    if pos['here'] == 'E' and found_key:
        print(f"({x}, {y}) exit")
        break
    dirs = list(range(4))
    random.shuffle(dirs)
    for d in dirs:
        nx, ny = x, y
        if d == UP and pos['up'] == 0:
            ny -= 1
        if d == RIGHT and pos['right'] == 0:
            nx += 1
        if d == DOWN and pos['down'] == 0:
            ny += 1
        if d == LEFT and pos['left'] == 0:
            nx -= 1 

        if nx < 0 or nx >= size['width'] or ny < 0 or ny >= size['height']:
            continue

        if (nx, ny) in been:
            continue

        x, y = nx, ny
        been.add((x, y))
        moves.append((x, y))
        moves_stack.append((x, y))
        break
    else:
        if len(moves_stack) > 0:
            x, y = moves_stack.pop()
        else:
            print("No moves left")
            break

print(f"Solution length: {len(moves)}")
print(moves)

r = s.post(f'{BASE_URL}/map', json=moves)

print(r.text)

s.close()

The only dependency of the agent (in requirements.txt) is the requests module.

This is the Dockerfile I use to create a container image for the agent.

FROM --platform=linux/amd64 public.ecr.aws/docker/library/python:3.12-alpine

WORKDIR /app

COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt

COPY . .

CMD [ "python3", "agent.py"]

You can easily run this simplified version of a simulation locally, but the cloud allows you to run it at larger scale (for example, with a much bigger and more detailed maze) and to test multiple agents to find the best strategy to use. In a real-world scenario, the improvements to the agent would then be implemented into a physical device such as a self-driving car or a robot vacuum cleaner.

Running a simulation using multi-container jobs
To run a job with AWS Batch, I need to configure three resources:

  • The compute environment in which to run the job
  • The job queue in which to submit the job
  • The job definition describing how to run the job, including the container images to use

In the AWS Batch console, I choose Compute environments from the navigation pane and then Create. Now, I have the choice of using Fargate, Amazon EC2, or Amazon EKS. Fargate allows me to closely match the resource requirements that I specify in the job definitions. However, simulations usually require access to a large but static amount of resources and use GPUs to accelerate computations. For this reason, I select Amazon EC2.

Console screenshot.

I select the Managed orchestration type so that AWS Batch can scale and configure the EC2 instances for me. Then, I enter a name for the compute environment and select the service-linked role (that AWS Batch created for me previously) and the instance role that is used by the ECS container agent (running on the EC2 instances) to make calls to the AWS API on my behalf. I choose Next.

Console screenshot.

In the Instance configuration settings, I choose the size and type of the EC2 instances. For example, I can select instance types that have GPUs or use the Graviton processor. I do not have specific requirements and leave all the settings to their default values. For Network configuration, the console already selected my default VPC and the default security group. In the final step, I review all configurations and complete the creation of the compute environment.

Now, I choose Job queues from the navigation pane and then Create. Then, I select the same orchestration type I used for the compute environment (Amazon EC2). In the Job queue configuration, I enter a name for the job queue. In the Connected compute environments dropdown, I select the compute environment I just created and complete the creation of the queue.

Console screenshot.

I choose Job definitions from the navigation pane and then Create. As before, I select Amazon EC2 for the orchestration type.

To use more than one container, I disable the Use legacy containerProperties structure option and move to the next step. By default, the console creates a legacy single-container job definition if there’s already a legacy job definition in the account. That’s my case. For accounts without legacy job definitions, the console has this option disabled.

Console screenshot.

I enter a name for the job definition. Then, I have to think about which permissions this job requires. The container images I want to use for this job are stored in Amazon ECR private repositories. To allow AWS Batch to download these images to the compute environment, in the Task properties section, I select an Execution role that gives read-only access to the ECR repositories. I don’t need to configure a Task role because the simulation code is not calling AWS APIs. For example, if my code was uploading results to an Amazon Simple Storage Service (Amazon S3) bucket, I could select here a role giving permissions to do so.

In the next step, I configure the two containers used by this job. The first one is the maze-model. I enter the name and the image location. Here, I can specify the resource requirements of the container in terms of vCPUs, memory, and GPUs. This is similar to configuring containers for an ECS task.

Console screenshot.

I add a second container for the agent and enter name, image location, and resource requirements as before. Because the agent needs to access the maze as soon as it starts, I use the Dependencies section to add a container dependency. I select maze-model for the container name and START as the condition. If I don’t add this dependency, the agent container can fail before the maze-model container is running and able to respond. Because both containers are flagged as essential in this job definition, the overall job would terminate with a failure.

Console screenshot.

I review all configurations and complete the job definition. Now, I can start a job.

In the Jobs section of the navigation pane, I submit a new job. I enter a name and select the job queue and the job definition I just created.

Console screenshot.

In the next steps, I don’t need to override any configuration and create the job. After a few minutes, the job has succeeded, and I have access to the logs of the two containers.

Console screenshot.

The agent solved the maze, and I can get all the details from the logs. Here’s the output of the job to see how the agent started, picked up the key, and then found the exit.

SIZE {'width': 80, 'height': 20}
START {'x': 0, 'y': 18}
(32, 2) key found
(79, 16) exit
Solution length: 437
[(0, 18), (1, 18), (0, 18), ..., (79, 14), (79, 15), (79, 16)]

In the map, the red asterisks (*) follow the path used by the agent between the start (S), key (K), and exit (E) locations.

ASCII-based map of the solved maze.

Increasing observability with a sidecar container
When running complex jobs using multiple components, it helps to have more visibility into what these components are doing. For example, if there is an error or a performance problem, this information can help you find where and what the issue is.

To instrument my application, I use AWS Distro for OpenTelemetry:

Using telemetry data collected in this way, I can set up dashboards (for example, using CloudWatch or Amazon Managed Grafana) and alarms (with CloudWatch or Prometheus) that help me better understand what is happening and reduce the time to solve an issue. More generally, a sidecar container can help integrate telemetry data from AWS Batch jobs with your monitoring and observability platforms.

Things to know
AWS Batch support for multi-container jobs is available today in the AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDKs in all AWS Regions where Batch is offered. For more information, see the AWS Services by Region list.

There is no additional cost for using multi-container jobs with AWS Batch. In fact, there is no additional charge for using AWS Batch. You only pay for the AWS resources you create to store and run your application, such as EC2 instances and Fargate containers. To optimize your costs, you can use Reserved Instances, Savings Plan, EC2 Spot Instances, and Fargate in your compute environments.

Using multi-container jobs accelerates development times by reducing job preparation efforts and eliminates the need for custom tooling to merge the work of multiple teams into a single container. It also simplifies DevOps by defining clear component responsibilities so that teams can quickly identify and fix issues in their own areas of expertise without distraction.

To learn more, see how to set up multi-container jobs in the AWS Batch User Guide.

Danilo

AWS Weekly Roundup — Savings Plans, Amazon DynamoDB, AWS CodeArtifact, and more — March 25, 2024

This post was originally published on this site

AWS Summit season is starting! I’m happy I will meet our customers, partners, and the press next week at the AWS Summit Paris and the week after at the AWS Summit Amsterdam. I’ll show you how mobile application developers can use generative artificial intelligence (AI) to boost their productivity. Be sure to stop by and say hi if you’re around.

Now that my talks for the Summit are ready, I took the time to look back at the AWS launches from last week and write this summary for you.

Last week’s launches
Here are some launches that got my attention:

AWS License Manager allows you to track IBM Db2 licenses on Amazon Relational Database Service (Amazon RDS) – I wrote about Amazon RDS when we launched IBM Db2 back in December 2023 and I told you that you must bring your own Db2 license. Starting today, you can track your Amazon RDS for Db2 usage with AWS License Manager. License Manager provides you with better control and visibility of your licenses to help you limit licensing overages and reduce the risk of non-compliance and misreporting.

AWS CodeBuild now supports custom images for AWS Lambda – You can now use compute container images stored in an Amazon Elastic Container Registry (Amazon ECR) repository for projects configured to run on Lambda compute. Previously, you had to use one of the managed container images provided by AWS CodeBuild. AWS managed container images include support for AWS Command Line Interface (AWS CLI), Serverless Application Model, and various programming language runtimes.

AWS CodeArtifact package group configuration – Administrators of package repositories can now manage the configuration of multiple packages in one single place. A package group allows you to define how packages are updated by internal developers or from upstream repositories. You can now allow or block internal developers to publish packages or allow or block upstream updates for a group of packages. Read my blog post for all the details.

Return your Savings Plans – We have announced the ability to return Savings Plans within 7 days of purchase. Savings Plans is a flexible pricing model that can help you reduce your bill by up to 72 percent compared to On-Demand prices, in exchange for a one- or three-year hourly spend commitment. If you realize that the Savings Plan you recently purchased isn’t optimal for your needs, you can return it and if needed, repurchase another Savings Plan that better matches your needs.

Amazon EC2 Mac Dedicated Hosts now provide visibility into supported macOS versions – You can now view the latest macOS versions supported on your EC2 Mac Dedicated Host, which enables you to proactively validate if your Dedicated Host can support instances with your preferred macOS versions.

Amazon Corretto 22 is now generally available – Corretto 22, an OpenJDK feature release, introduces a range of new capabilities and enhancements for developers. New features like stream gatherers and unnamed variables help you write code that’s clearer and easier to maintain. Additionally, optimizations in garbage collection algorithms boost performance. Existing libraries for concurrency, class files, and foreign functions have also been updated, giving you a more powerful toolkit to build robust and efficient Java applications.

Amazon DynamoDB now supports resource-based policies and AWS PrivateLink – With AWS PrivateLink, you can simplify private network connectivity between Amazon Virtual Private Cloud (Amazon VPC), Amazon DynamoDB, and your on-premises data centers using interface VPC endpoints and private IP addresses. On the other side, resource-based policies to help you simplify access control for your DynamoDB resources. With resource-based policies, you can specify the AWS Identity and Access Management (IAM) principals that have access to a resource and what actions they can perform on it. You can attach a resource-based policy to a DynamoDB table or a stream. Resource-based policies also simplify cross-account access control for sharing resources with IAM principals of different AWS accounts.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items, open source projects, and Twitch shows that you might find interesting:

British Broadcasting Corporation (BBC) migrated 25PB of archives to Amazon S3 Glacier – The BBC Archives Technology and Services team needed a modern solution to centralize, digitize, and migrate its 100-year-old flagship archives. It began using Amazon Simple Storage Service (Amazon S3) Glacier Instant Retrieval, which is an archive storage class that delivers the lowest-cost storage for long-lived data that is rarely accessed and requires retrieval in milliseconds. I did the math, you need 2,788,555 DVD discs to store 25PB of data. Imagine a pile of DVDs reaching 41.8 kilometers (or 25.9 miles) tall! Read the full story.

AWS Build On Generative AIBuild On Generative AI – Season 3 of your favorite weekly Twitch show about all things generative AI is in full swing! Streaming every Monday, 9:00 AM US PT, my colleagues Tiffany and Darko discuss different aspects of generative AI and invite guest speakers to demo their work.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS SummitsAWS Summits – As I wrote in the introduction, it’s AWS Summit season again! The first one happens next week in Paris (April 3), followed by Amsterdam (April 9), Sydney (April 10–11), London (April 24), Berlin (May 15–16), and Seoul (May 16–17). AWS Summits are a series of free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS.

AWS re:InforceAWS re:Inforce – Join us for AWS re:Inforce (June 10–12) in Philadelphia, Pennsylvania. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. Connect with the AWS teams that build the security tools and meet AWS customers to learn about their security journeys.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!