Tag Archives: AWS

Get started with OpenAI GPT-5.5, GPT-5.4 models, and Codex on Amazon Bedrock

This post was originally published on this site

As we previewed in What’s Next with AWS 2026, we’re announcing the general availability of OpenAI GPT-5.5, GPT-5.4 models, and Codex on Amazon Bedrock, giving you access to frontier models and a coding agent for software development.

According to OpenAI, GPT-5.5 and GPT-5.4 models are excellent for coding, reasoning, agentic workflows, and complex professional work. You can use GPT-5.5 for the hardest customer workloads and GPT-5.4 for the best price-performance. You can call them through the Responses API on Amazon Bedrock’s next-generation inference engine, which is built for high performance, reliability, and security.

Codex is the OpenAI coding agent for AI-powered software development. According to OpenAI, more than 4 million developers use Codex every week to write, refactor, debug, test, and validate code across large codebases. With GPT-5.5 powering inference, Codex introduces a new class of intelligence optimized for complex, long-horizon developer workflows. You can use the Codex App, the Codex CLI, and IDE integrations with Visual Studio Code, JetBrains, and Xcode, with all model inference routed through the Responses API on Amazon Bedrock.

For customers with data residency requirements, all processing stays within the Bedrock Region you select. You pay per token with no seat licenses and no per-developer commitments.

GPT-5.5 and GPT-5.4 models on Bedrock in action
You can access the models programmatically by using the OpenAI Responses API to call the bedrock-mantle endpoints, either through the OpenAI SDK or with command-line tools such as curl.

Let’s start with the OpenAI SDK for Python. First, install the SDK.

pip install -U openai

Set the environment variables for authentication.

export OPENAI_BASE_URL="https://bedrock-mantle.us-east-2.api.aws/openai/v1"
export OPENAI_API_KEY="<BEDROCK_API_KEY>"
export BEDROCK_OPENAI_MODEL_ID="openai.gpt-5.5"

Here is sample Python code that calls the GPT-5.5 model on Bedrock:

import os
from openai import OpenAI
 
client = OpenAI(
    base_url=os.environ["OPENAI_BASE_URL"],
    api_key=os.environ["OPENAI_API_KEY"],
)
 
response = client.responses.create(
    model=os.environ["BEDROCK_OPENAI_MODEL_ID"],
    input=[
        {
            "role": "developer",
            "content": "You are a software engineer with excellent AWS cloud knowledge. Be concise and practical.",
        },
        {
            "role": "user",
            "content": "Design a distributed architecture on AWS in Python that should support 100k requests per second across multiple geographic regions.",
        },
    ],
    reasoning={"effort": "medium"},
    text={"verbosity": "low"},
)
 
print(response.output_text)

You can also call the model endpoint directly using curl.

curl "$OPENAI_BASE_URL/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "openai.gpt-5.5",
    "input": [
      {
        "role": "developer",
        "content": "You are a software engineer with excellent AWS cloud knowledge."
      },
      {
        "role": "user",
        "content": "Design a distributed architecture on AWS in Python that should support 100k requests per second across multiple geographic regions."
      }
    ],
    "reasoning": {"effort": "medium"},
    "text": {"verbosity": "low"}
  }'
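
If you prefer not to install the SDK, the same request can be assembled with the Python standard library. The following is a minimal sketch, assuming the endpoint accepts the same JSON body and Bearer header as the curl call above; the helper names build_responses_request and send are hypothetical and are not part of any AWS or OpenAI library.

```python
import json
import urllib.request


def build_responses_request(base_url, api_key, model, messages, effort="medium"):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "model": model,
        "input": messages,
        "reasoning": {"effort": effort},
        "text": {"verbosity": "low"},
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request, timeout_seconds=120):
    """Send the request and return the decoded JSON body (needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout_seconds) as http_response:
        return json.load(http_response)
```

To use it, pass the values of OPENAI_BASE_URL and OPENAI_API_KEY from your environment to build_responses_request, then hand the result to send. Keeping request construction separate from sending makes the payload easy to inspect before any tokens are spent.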

Use the Responses API when you want model-managed multi-turn state, when you need hosted tools, function tools, or richer tool orchestration, or when you run background or long-running work. To learn more, visit the OpenAI Cookbook Responses examples.

Using OpenAI Codex with GPT-5.5 on Amazon Bedrock
You can download the Codex CLI, the Codex App, or the Codex VS Code extension and get started with Bedrock for model inference. Codex supports two Bedrock authentication pathways: an Amazon Bedrock API key or the AWS SDK credential chain. If you set AWS_BEARER_TOKEN_BEDROCK, Codex uses it first; otherwise, Codex falls back to the AWS SDK credential chain.

Set AWS_BEARER_TOKEN_BEDROCK in the environment that Codex will read:

export AWS_BEARER_TOKEN_BEDROCK="<your-bedrock-api-key>"
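
The precedence between the two authentication pathways can be sketched in a few lines of Python. This only illustrates the rule described above and is not Codex source code; the function name select_bedrock_auth and the returned labels are hypothetical, and treating an empty variable as unset is an assumption.

```python
import os


def select_bedrock_auth(env=None):
    """Return which authentication pathway would apply for an environment."""
    env = os.environ if env is None else env
    # A Bedrock API key, when present, takes precedence.
    # Assumption: an empty value is treated the same as an unset variable.
    if env.get("AWS_BEARER_TOKEN_BEDROCK"):
        return "bedrock-api-key"
    # Otherwise the AWS SDK credential chain is used.
    return "aws-sdk-credential-chain"
```

Running select_bedrock_auth() in the shell where you launch Codex is a quick way to check which pathway your current environment would select under this rule.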

Then, configure your preferred Region and set the model ID to openai.gpt-5.5 in ~/.codex/config.toml, which is required for Bedrock API-key authentication. You can also choose openai.gpt-5.4, openai.gpt-oss-120b, or openai.gpt-oss-20b. For the desktop app or VS Code extension, put any environment variables the app needs in ~/.codex/.env.

model = "openai.gpt-5.5"
model_provider = "amazon-bedrock"
[model_providers.amazon-bedrock.aws]
region = "us-east-2"

Restart the desktop app or VS Code extension after changing ~/.codex/config.toml or ~/.codex/.env. In Codex CLI, you can check the /status tab to confirm the configuration.

In the Codex App, you can use the GPT-5.5 model through Amazon Bedrock inference.

Things to know
Let me share some important technical details that I think you’ll find useful.

  • Model latency: OpenAI model information positions GPT-5.5 as fast and GPT-5.4 as medium speed, but customer-perceived latency depends on reasoning effort, output length, tool calls, background mode, Region, quotas, throttling, prompt size, and cache hits. Start GPT-5.5 at medium effort. Start GPT-5.4 with effort set explicitly rather than relying on its none default.
  • Scaling and capacity: Bedrock’s new inference engine is designed to rapidly provision and serve capacity across many different models. When accepting requests, we prioritize keeping steady state workloads running, and ramp usage and capacity rapidly in response to changes in demand. During periods of high demand, requests are queued, rather than rejected.
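
The starting points from the latency note can be captured in a small helper so that every request states its reasoning effort explicitly. This is a sketch under assumptions: the helper name reasoning_options and the STARTING_EFFORT table are hypothetical, and the helper does not validate which effort values a given model accepts.

```python
# Starting efforts taken from the guidance above; models without an entry
# must be given an explicit effort by the caller.
STARTING_EFFORT = {"openai.gpt-5.5": "medium"}


def reasoning_options(model_id, effort=None):
    """Return the reasoning options for a request, never relying on a model default."""
    chosen = effort if effort is not None else STARTING_EFFORT.get(model_id)
    if chosen is None:
        raise ValueError(f"Set the reasoning effort explicitly for {model_id}")
    return {"reasoning": {"effort": chosen}}
```

For example, reasoning_options("openai.gpt-5.4", "low") returns the options to merge into a request, while calling it for openai.gpt-5.4 without an effort raises an error instead of silently falling back to the model default.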

Now available
OpenAI GPT models and Codex on Amazon Bedrock are available today: the GPT-5.5 model in the US East (Ohio) Region, and the GPT-5.4 model in the US East (Ohio) and US West (Oregon) Regions. Check the full list of Regions for future updates. To learn more, visit the OpenAI on Amazon Bedrock page and the Amazon Bedrock pricing page.

Give GPT-5.5, GPT-5.4 models, and Codex on Amazon Bedrock a try today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: Claude Opus 4.8 on AWS, Aurora MySQL with Kiro Powers, and more (June 1, 2026)

In my last Week in Review post, I shared what I’d been hearing from customers in the AI-Driven Development Lifecycle (AI-DLC) workshops I’ve been delivering. Last week I was back at it, this time in Denver for a two-day AI-DLC workshop, where I helped facilitate 17 teams to deliver nearly 20 separate use cases in just two days. The pace of acceleration that AI-DLC unlocks—especially when paired with tools like Claude Code on Amazon Bedrock—is fundamentally changing how businesses operate. Traditional roles within software development teams are collapsing into smaller, AI-augmented squads, and the paradigm shift is beginning to take place right in front of us. To learn more about how to utilize various AI tools, visit the GitHub repository of AI-DLC workflow.

This shift is also reshaping how AWS account teams (solutions architects, customer solutions managers, and technical account managers) collaborate with customers. It’s becoming less about handing off advisory design documents and more about building alongside them in real time. It’s a genuinely exciting moment to be in the middle of the change, and this week’s headline launch — Anthropic’s most capable model yet, now on AWS — is going to push that pace even further.

Now, let’s get into this week’s AWS news…

Headlines
Claude Opus 4.8 on AWS — Anthropic’s most capable generally available model is now accessible through both Amazon Bedrock and the Claude Platform on AWS. Opus 4.8 is built for agentic coding, knowledge work, and extended autonomous task execution — it sustains longer autonomous sessions with deeper reasoning, recovers from errors, and synthesizes information across lengthy documents. For coding workloads, it reads codebases like an engineer, plans before it edits, and holds context across long sessions. On Amazon Bedrock, you get AWS-managed features like Guardrails, Knowledge Bases, and data residency; on the Claude Platform on AWS, you get Anthropic’s native APIs unified with AWS billing. To learn more, visit the deep-dive blog post.

Last week’s launches
Here are some launches and updates from this past week that caught my attention:

  • Introducing the next generation of AWS Resilience Hub — A reimagined Resilience Hub gives SREs and developers a unified framework to define resilience standards, evaluate applications against them, and demonstrate compliance across an entire portfolio. It introduces modular resilience policies (covering service-level objectives (SLOs), multi-AZ/Region DR, and data recovery), business-oriented application modeling, generative AI-powered assessments aligned with the Well-Architected and Resilience Analysis Frameworks, and automatic dependency discovery via DNS query log analysis. Integration with AWS Organizations enables organization-wide resilience management from a single delegated administrator account.
  • Introducing the next generation of Amazon OpenSearch Serverless for building agentic AI applications — Amazon OpenSearch Serverless is now a fully managed search and vector engine purpose-built for agentic AI applications. It scales from zero to thousands of requests per second—roughly 20x faster than the prior generation—delivers up to 60% cost savings versus peak-provisioned clusters, and adds GPU acceleration plus new SEARCH and VECTORSEARCH collection types. Native integrations with Vercel, Kiro, Claude Code, and Cursor through OpenSearch Agent Skills make it straightforward to plug into your agent stack.
  • New assessment capabilities in AWS Transform — AWS Transform expands with new tools to help you build migration business cases and evaluate TCO before moving workloads to AWS. You can ingest data from RVTools exports, CMDB data, the AWS Transform discovery tool, and third-party discovery tools, then run what-if scenarios across region, utilization, and service mapping for EC2, FSx, S3, SQL Server on EC2, and virtual desktops. The release also adds Agentic Readiness Analysis (ARA) and Modernization Analysis (MODA), which scan code repositories in 5 to 30 minutes per repo to surface severity-tagged findings with file-level evidence and AWS-mapped remediation guidance.
  • Amazon Aurora MySQL with Kiro Powers — Aurora MySQL now integrates with Kiro Powers, drawing from a curated repository of pre-packaged MCP servers, steering files, and hooks validated by Kiro partners. Developers can execute both data plane tasks (queries, schema management) and control plane tasks (cluster management) in natural language, with dynamic guidance for Aurora MySQL Serverless scaling, RDS-to-Aurora migration, and replication setup. The companion Database Blog post explains how the agent produces the API calls, SQL, and configuration for you to review and run — available via one-click install from the Kiro IDE or webpage.
  • Amazon WorkSpaces Applications now supports Windows Desktop OS — You can now bring your own Windows Desktop licenses to Amazon WorkSpaces Applications and stream full Windows desktops and applications from AWS-hosted dedicated hardware. BYOL eliminates OS fees (you pay only for compute and streaming infrastructure), supports eligible Microsoft 365 Apps for enterprise, and gives users a matching experience between local and remote environments — same workflows, shortcuts, and navigation in both.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS page.

Other AWS news
Here are some additional posts and resources that you might find interesting:

For a full list of AWS blog posts, be sure to keep an eye on the AWS Blogs page.

Learn more about AWS, browse and join upcoming AWS-led in-person and virtual events, startup events, and developer-focused events as well as AWS Summits and AWS Community Days. Join the AWS Builder Center to connect with builders, share solutions, and access content that supports your development.

That’s all for this week. Check back next Monday for another Weekly Roundup!

-Micah

Introducing the next generation of AWS Resilience Hub for generative AI-based SRE resilience journey

Today, we’re announcing the next generation of AWS Resilience Hub with a significantly expanded experience that brings together a new application model, dependency discovery assessment, generative AI-powered failure mode analysis, modular resilience policies, and organization-wide reporting.

Organizations running hundreds of applications share a common challenge: availability is a top concern, yet there is no consistent way to set resilience goals, measure progress, or prove compliance across a portfolio. Teams set different standards, use different tools, and struggle to exchange information about whether applications actually meet expectations.

The next generation of AWS Resilience Hub changes this by giving Site Reliability Engineers (SREs) and development teams a structured way to align on resilience policy expectations, help application teams achieve them, and demonstrate compliance through testing. With integration into AWS Organizations, teams can now evaluate resilience at scale, identify failure modes, discover hidden dependencies, and report on progress across the enterprise.

The next generation of Resilience Hub walks you through your resilience journey, which is built around the following concepts:

  • Resilience policy: You can define your resilience expectations through modular, composable requirements. Rather than choosing a single rigid policy type, you construct policies by selecting the requirements that matter to your application, such as service level objective (SLO), multi-AZ and multi-Region disaster recovery, and data recovery requirements.
  • Business-level understanding: You can use new application modeling through critical end-user paths that map directly to business outcomes. Systems represent a business application, user journeys describe critical business paths, and services are the deployable units comprising AWS resources, code, and observability. Resilience Hub automatically discovers and maps them into a topology showing how resources connect.
  • AI failure mode assessments: You can run generative AI-powered assessments that analyze your services against your defined resilience policies, AWS Well-Architected best practices, and the AWS Resilience Analysis Framework. These assessments identify potential failure modes and provide actionable recommendations.
  • Dependency discovery assessment: You can automatically discover AWS services, internal endpoints, and third-party endpoints that your services depend on. This dependency assessment uses DNS query log analysis to identify dependencies you may not know about—including unexpected cross-region calls or critical third-party dependencies.
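
To make the dependency discovery idea concrete, here is a minimal sketch of how DNS query names could be sorted into AWS, internal, and third-party dependencies, with cross-Region AWS calls flagged. It is not how Resilience Hub implements the assessment; the function classify_dependency, the HOME_REGION value, the INTERNAL_SUFFIXES list, and the category labels are all hypothetical.

```python
import re

HOME_REGION = "us-east-1"                               # hypothetical home Region
INTERNAL_SUFFIXES = (".internal", ".corp.example.com")  # hypothetical internal zones
REGION_PATTERN = re.compile(r"^[a-z]{2}(?:-[a-z]+)+-\d+$")


def classify_dependency(query_name, home_region=HOME_REGION):
    """Classify one DNS query name taken from a query log."""
    name = query_name.rstrip(".").lower()
    if name.endswith(".amazonaws.com"):
        # Regional endpoints usually look like <service>.<region>.amazonaws.com.
        labels = name.split(".")
        region = labels[-3] if len(labels) >= 4 else None
        if region and REGION_PATTERN.match(region) and region != home_region:
            return "aws-cross-region"
        return "aws"
    if name.endswith(INTERNAL_SUFFIXES):
        return "internal"
    return "third-party"


for queried in ("dynamodb.us-east-1.amazonaws.com.", "sqs.eu-west-1.amazonaws.com",
                "orders.internal", "api.example-payments.com"):
    print(queried, "->", classify_dependency(queried))
```

A real assessment would also need query volume and timing to judge how critical each dependency is; this sketch only shows the classification step.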

The next generation of AWS Resilience Hub in action
To get started, you configure a resilience policy, set up your first system and service, run a failure mode assessment, review the results, and implement the findings.

Before you begin, set up the required IAM roles: the invoker IAM role, which grants Resilience Hub read-only access to your AWS resources, and either cross-account roles (if you are not using AWS Organizations) or service-linked roles (SLRs) with AWS Organizations. Resilience Hub also integrates with AWS Organizations to enable organization-wide resilience management from a single delegated administrator account. This eliminates the need to log in to individual accounts to assess resilience posture across your enterprise. For prerequisite details, visit the AWS Resilience Hub User Guide.

To configure a resilience policy, choose Create policy in the Policies menu of the AWS Resilience Hub console. Enter a policy name and description, then choose resilience requirements. For example, you can create a reusable policy for multi-Region disaster recovery in financial applications, with a 99.95% availability SLO, a 15-minute RTO and 5-minute RPO for multi-Region disaster recovery, and a disaster recovery approach that aligns with your RTO and RPO requirements.
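
When choosing these numbers, it helps to translate the availability SLO into a downtime budget and compare it with the RTO. The following is a minimal sketch of that arithmetic, assuming a 30-day measurement window; the function name allowed_downtime_minutes is hypothetical, and Resilience Hub may evaluate SLOs differently.

```python
def allowed_downtime_minutes(slo_percent, window_days=30):
    """Minutes of downtime an availability SLO permits over a window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


budget = allowed_downtime_minutes(99.95)  # 30-day window
rto_minutes = 15                          # RTO from the example policy

print(f"Downtime budget: {budget:.1f} minutes per 30 days")
print(f"Recoveries that use the full RTO and still fit: {int(budget // rto_minutes)}")
```

With a 99.95% SLO, the budget is about 21.6 minutes in 30 days, so a single recovery that takes the full 15-minute RTO fits within the budget but a second one in the same window would not.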

If you choose data recovery requirements, you can define the data recovery time objective for restoring from backups for each service associated with this policy.

To create your first system representing your business application, choose Create a system in the Systems menu. Optionally, you can enable AWS Organizations account access for this system.

Now you can create a service that represents a deployable unit, such as one of your microservices, associate it with your system, and tell Resilience Hub where to find your resources. Enter a service name (for example, stock-exchange-service), then choose your resilience policy and invoker IAM role name. You can also choose service Regions and service resources, identified by resource tags, an AWS CloudFormation stack, a Terraform state file location, or an Amazon EKS cluster and namespace.

When you enable dependency discovery for this service, AWS examines your VPC query logs for the VPCs associated with the resources in your service. You can disable this feature anytime from the dependency discovery settings in the service details page.

With the service created and a policy applied, you can run your first assessment. Choose Run failure mode assessment on your service page and wait for the assessment to complete.

During the assessment, Resilience Hub assumes your invoker role, reads resources from your configured input sources, identifies parent-child relationships, queries the application topology service to map connections between resources, and builds a topology showing data flow, containment, and permissions.

By choosing Service topology, you can see service resources grouped by service functions in the graph, table, or JSON format.

By choosing Failure mode guidance, you can add assertions used to guide the agents while performing the failure mode assessment. Assertions are either generated by the agent or added by users. You can update them to improve assessment accuracy.

Once the assessment is complete, you can review findings and recommendations in the Assessment tab of your service page. Each finding tells you what the failure mode is, why it matters for your architecture, how to fix it, and which policy requirement it relates to.

You can choose Mark as resolved once you have implemented the recommendation, or Mark as irrelevant if the finding doesn’t apply to your use case.

If you’re an existing Resilience Hub customer, Resilience Hub provides migration APIs to simplify the transition of your previous applications. These APIs convert your previous assessment policies to new resilience policies and map your previous applications to the new model, for example by mapping multiple related applications to one system with multiple services.

For more information about new features, visit the AWS Resilience Hub User Guide.

Now available
The next generation of AWS Resilience Hub is now generally available in AWS commercial Regions where Resilience Hub is available. For Regional availability and the future roadmap, visit the AWS Capabilities by Region.

Resilience Hub uses a new service-based pricing model. Pricing includes two failure mode assessments per month for services, and optionally automated dependency assessment. You can try AWS Resilience Hub free. For pricing details, visit the AWS Resilience Hub pricing page.

Give the new AWS Resilience Hub a try in the Resilience Hub console and send feedback to AWS re:Post for Resilience Hub or through your usual AWS Support contacts.

Channy

Introducing the next generation of Amazon OpenSearch Serverless for building your agentic AI applications

Today, we’re announcing the next generation of Amazon OpenSearch Serverless, a fully managed search and vector engine designed for customers building AI agents. The next generation of OpenSearch Serverless scales from zero to thousands of requests per second and back to zero when idle, offering up to 60% cost savings compared to the cost of OpenSearch Service clusters provisioned for peak capacity.

The next generation of OpenSearch Serverless creates resources in seconds and scales capacity up to 20 times faster than the previous generation. With instant resource creation and native integrations with AI development platforms like Vercel and Kiro, you can deploy production-ready search and vector backends for your AI agents in minutes without managing infrastructure.

The next generation of OpenSearch Serverless in action
To get started with the next generation of OpenSearch Serverless, choose Create collection in the Serverless menu in the Amazon OpenSearch Service console.

Create a NextGen collection with instant auto scaling and scale-to-zero for cost optimization. At launch, the only supported collection types are full-text search and vector search. If you want to use the existing OpenSearch Serverless infrastructure, choose Switch to Classic.

Choose Express create, the fastest way to create a collection. No configuration is required: the default settings and matching security policies are applied automatically. Some configuration options can be changed later.

When you choose Create collection, OpenSearch Serverless will provision resources in seconds.

You can also create an OpenSearch Serverless collection with the AWS Command Line Interface (AWS CLI) or AWS SDKs. Here is a sample CLI command to create a collection group.

aws opensearchserverless create-collection-group \
    --name channy-nextgen-group \
    --standby-replicas ENABLED \
    --generation NEXTGEN \
    --description "My NextGen collection group" \
    --capacity-limits '{
        "maxIndexingCapacityInOCU": 10,
        "maxSearchCapacityInOCU": 10,
        "minIndexingCapacityInOCU": 0,
        "minSearchCapacityInOCU": 0
    }' \
    --region "us-east-1"
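
If you script collection group creation, you may want to generate the --capacity-limits argument rather than hand-edit JSON. The following is a small sketch that uses the four keys from the command above; the helper name capacity_limits_json is hypothetical, and the check that each minimum lies between 0 and its maximum is an assumption about what the service accepts.

```python
import json
import shlex


def capacity_limits_json(max_indexing_ocu, max_search_ocu,
                         min_indexing_ocu=0, min_search_ocu=0):
    """Return the JSON string for the --capacity-limits option."""
    for low, high in ((min_indexing_ocu, max_indexing_ocu),
                      (min_search_ocu, max_search_ocu)):
        # Assumption: a minimum below 0 or above its maximum is invalid.
        if low < 0 or low > high:
            raise ValueError("each minimum must be between 0 and its maximum")
    return json.dumps({
        "maxIndexingCapacityInOCU": max_indexing_ocu,
        "maxSearchCapacityInOCU": max_search_ocu,
        "minIndexingCapacityInOCU": min_indexing_ocu,
        "minSearchCapacityInOCU": min_search_ocu,
    })


limits = capacity_limits_json(10, 10)
# shlex.quote makes the JSON safe to paste into a POSIX shell command line.
print("--capacity-limits " + shlex.quote(limits))
```

Leaving both minimums at 0 matches the scale-to-zero configuration shown in the CLI example.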

Now, you can create a collection that inherits the generation from its parent collection group. Supported collection types: SEARCH and VECTORSEARCH.

aws opensearchserverless create-collection \
    --name channy-nextgen-collection \
    --type SEARCH \
    --collection-group-name channy-nextgen-group \
    --standby-replicas ENABLED \
    --description "My collection in NextGen group" \
    --region "us-east-1"

To learn more about managing the next generation of OpenSearch Serverless, visit the Amazon OpenSearch Serverless documentation.

Building your agents faster with OpenSearch Serverless
To support building production-ready agent applications in Vercel, you can now create a new OpenSearch collection or connect your existing OpenSearch Serverless collection within the Vercel console. Create a search backend in seconds and add features on-demand as your application grows. To learn more, visit AWS for Vercel.

You can go from idea to working prototype in minutes using Claude Code, Cursor, and Kiro. OpenSearch Agent Skills provide a repository of skills that bring OpenSearch intelligence directly into your agent. Each skill encapsulates domain knowledge, best practices, and multi-step execution logic for a specific workflow–so your agent not only gets results, but understands how they were achieved. You can also use the OpenSearch Launchpad in Kiro Powers to accelerate search applications with guided, end-to-end architecture planning.

Now available
The next generation of Amazon OpenSearch Serverless is generally available today and is available in all AWS commercial Regions where Amazon OpenSearch Serverless is currently available.

The next generation of OpenSearch Serverless charges for the compute you use in OpenSearch Compute Units (OCUs) for indexing, search, and GPU acceleration. You are charged separately for storage in GB-month. For more information, see Amazon OpenSearch Service Pricing.
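
The billing model above (OCU usage plus storage in GB-month) can be turned into a rough estimate that shows why scale-to-zero matters for bursty workloads. This is a sketch only: the function name estimate_monthly_cost is hypothetical, the rates in the example are placeholders and not published AWS prices, and GPU acceleration is ignored.

```python
def estimate_monthly_cost(ocu_hours, storage_gb_months,
                          ocu_hourly_rate, storage_gb_month_rate):
    """Compute cost as OCU-hours times the OCU rate plus storage times its rate."""
    return ocu_hours * ocu_hourly_rate + storage_gb_months * storage_gb_month_rate


# Placeholder rates for illustration only; look up real prices on the pricing page.
OCU_RATE = 1.0       # currency units per OCU-hour (placeholder)
STORAGE_RATE = 0.5   # currency units per GB-month (placeholder)

# 4 OCUs, 100 GB stored, over a 30-day month.
active_8h_per_day = estimate_monthly_cost(4 * 8 * 30, 100, OCU_RATE, STORAGE_RATE)
always_on = estimate_monthly_cost(4 * 24 * 30, 100, OCU_RATE, STORAGE_RATE)

print(f"Scaled to zero when idle: {active_8h_per_day}")
print(f"Provisioned around the clock: {always_on}")
```

The gap between the two figures depends entirely on how many hours your workload is idle, so replace the placeholder inputs with your own usage pattern before drawing conclusions.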

Give it a try and send feedback to the AWS re:Post for Amazon OpenSearch Service or through your usual AWS Support contacts.

Channy

Amazon Redshift introduces AWS Graviton-based RG instances with an integrated data lake query engine

Since 2013, Amazon Redshift has delivered the full power of a data warehouse in the cloud at a fraction of the on-premises cost. Every architectural generation, from dense compute to Amazon RA3 instances and from provisioned to Amazon Redshift Serverless, has made each query cheaper, faster, and more efficient than the last.

For over a decade, as data volumes have grown and analytics requirements have evolved, organizations increasingly leverage both data warehouse tables for structured, frequently-accessed data and data lakes for cost-effective storage of diverse datasets. Add AI agents to the mix and they query your data warehouse at a scale that dwarfs typical human usage, leading to spiraling operational costs.

Amazon Redshift has doubled down on its core strengths to meet the demands of any workload — whether driven by humans or AI agents. For example, in March 2026, Amazon Redshift improved the performance of business intelligence (BI) dashboards and ETL workloads by speeding up new queries by up to 7 times. This significantly improves the response times of low-latency SQL queries, such as those used in near-real-time analytics applications, BI dashboards, ETL pipelines, and autonomous, goal-seeking AI agents.

Today, we’re announcing Amazon Redshift RG instances, a new instance family powered by AWS Graviton. RG instances deliver better performance, running data warehouse workloads up to 2.2x as fast as RA3 instances at 30% lower price per vCPU. Their integrated data lake query engine lets you run SQL analytics across your data warehouse and data lake from a single engine with performance up to 2.4x as fast as RA3 for Apache Iceberg and up to 1.5x as fast as RA3 for Apache Parquet. This blend of speed, cost efficiency, and an integrated data lake query engine makes Redshift RG instances well-suited to handle the high query volumes and low-latency requirements of today’s analytics and agentic AI workloads.

You can compare new RG instances and current RA3 instances:

Current RA3 instance | Recommended RG instance | vCPU | Memory (GB) | Primary use case
ra3.xlplus | rg.xlarge | 4 | 32 | Small cluster departmental analytics
ra3.4xlarge | rg.4xlarge | 12 → 16 (1.33:1) | 96 → 128 (1.33:1) | Standard production workloads, medium data volumes

This approach reduces total analytics costs for customers running combined data warehouse and data lake workloads, while simplifying operations through a single system for querying both warehouse tables and Amazon Simple Storage Service (Amazon S3) data lakes. We recommend using the AWS Pricing Calculator with your specific workload patterns to estimate savings.
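
Before turning to the calculator, a back-of-the-envelope estimate can show how the published figures combine. The sketch below makes strong assumptions: node price scales linearly with vCPU count, the "up to" speedup applies to your workload, and the comparison is node for node (ra3.4xlarge to rg.4xlarge). The function name relative_cost_per_unit_work is hypothetical.

```python
def relative_cost_per_unit_work(vcpu_ratio, price_per_vcpu_ratio, speedup):
    """Cost of the same work on the new node relative to the old node (1.0 = equal)."""
    # Assumption: node price is proportional to vCPU count.
    node_price_ratio = vcpu_ratio * price_per_vcpu_ratio
    return node_price_ratio / speedup


ratio = relative_cost_per_unit_work(
    vcpu_ratio=16 / 12,         # rg.4xlarge vs ra3.4xlarge vCPUs
    price_per_vcpu_ratio=0.70,  # 30% lower price per vCPU
    speedup=2.2,                # best-case data warehouse speedup
)
print(f"Relative cost per unit of work: {ratio:.2f}")
```

Under these best-case assumptions, the same work costs roughly 0.42 times what it did before. Real workloads will land somewhere between that figure and parity, which is why measuring with your own queries matters.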

Getting started with Amazon Redshift RG instances
You can launch new clusters or migrate existing clusters through the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS API. The integrated data lake query engine is enabled by default.

In the Amazon Redshift console, you can choose new RG instances when you create a cluster.

You can migrate previous-generation instances to RG instances using the path best suited to your cluster configuration, with the ability to estimate costs, validate compatibility, and automate execution:

  • Elastic Resize: in-place migration with 10-15 minutes of downtime for compatible configurations
  • Snapshot and Restore: create an RG cluster from an RA3 snapshot. This is best for customers who want to make configuration changes during the migration

Your external tables, schemas, and query syntax—including existing Spectrum queries—remain unchanged. There is no need to recreate external tables or modify application code. To learn more, visit the Redshift Management Guide.

Amazon Redshift now executes data lake queries on cluster nodes—the same compute that processes data warehouse workloads. As a result, Amazon Redshift Spectrum is no longer required. Data lake queries stay within your VPC boundary, use existing IAM roles, and incur zero per-terabyte scanning charges. This removes the $5/TB Spectrum scanning fees that previously added to total Redshift costs.
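
To see what removing the scanning charge means for your bill, multiply the data you scan by the $5/TB rate quoted above. The following is a minimal sketch; the function name spectrum_scan_fee and the 120 TB example volume are hypothetical.

```python
SPECTRUM_USD_PER_TB = 5.0  # per-terabyte scanning fee quoted in the text


def spectrum_scan_fee(tb_scanned, usd_per_tb=SPECTRUM_USD_PER_TB):
    """Scanning fee that applied to data lake queries run through Spectrum."""
    if tb_scanned < 0:
        raise ValueError("tb_scanned must not be negative")
    return tb_scanned * usd_per_tb


monthly_tb_scanned = 120  # hypothetical monthly scan volume
print(f"Scanning fees avoided: ${spectrum_scan_fee(monthly_tb_scanned):,.2f} per month")
```

This covers only the scanning fee; the cluster compute that now runs the data lake queries is billed as part of the RG cluster itself.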

Now available
Amazon Redshift RG instances are now available in the following AWS Regions: US East (N. Virginia, Ohio), US West (N. California, Oregon), Asia Pacific (Hong Kong, Hyderabad, Jakarta, Malaysia, Melbourne, Mumbai, Osaka, Seoul, Singapore, Sydney, Taiwan, Tokyo), Canada (Central), Europe (Frankfurt, Ireland, Milan, London, Paris, Spain, Stockholm), Middle East (UAE), and South America (São Paulo). For Regional availability and a future roadmap, visit the AWS Capabilities by Region. For Redshift Provisioned, you can select On-Demand Instances with hourly billing and no commitments or choose Reserved Instances for cost savings. To learn more, visit the Amazon Redshift Pricing page.

Give RG instances a try in the Redshift console and send feedback to AWS re:Post for Amazon Redshift or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: Amazon Bedrock AgentCore payments, Agent Toolkit for AWS, and more (May 11, 2026)

My most exciting news of last week: Amazon Bedrock AgentCore previewed the first managed payment capabilities enabling AI agents to autonomously access and pay for APIs, MCP servers, web content, and other agents. Built in partnership with Coinbase and Stripe, it removes the undifferentiated heavy lifting of building customized systems for billing, credential management, and compliance.

You can connect a Coinbase CDP wallet or Stripe Privy wallet as a payment connection, set session-level spending limits, and your agent transacts autonomously during execution. What excites me most is what AgentCore payments can unlock—like a research agent that can pay for real-time market data on the fly, or a coding agent calling paid APIs mid-task.

To learn more, visit the blog post, dive deeper using the documentation, and get started with the AgentCore CLI.

Last week’s launches
Here are last week’s launches that caught my attention:

  • Agent Toolkit for AWS – A production-ready suite of tools and guidance, available at no additional charge, that helps AI coding agents build on AWS with fewer errors, lower token costs, and enterprise-grade security controls. The Agent Toolkit for AWS is the successor to the MCP servers, plugins, and skills available on AWS Labs. To get started, visit the quick start guide or browse the available skills and plugins on GitHub.
  • AWS MCP Server GA – You can use a managed remote Model Context Protocol (MCP) server that gives AI agents and coding assistants secure, authenticated access to all AWS services through a small, fixed set of tools. It is part of the Agent Toolkit for AWS. To learn more, visit Seb Stormacq’s blog post.
  • Amazon WorkSpaces for AI agents (Preview) – You can use AI agents to securely access and operate desktop applications through managed WorkSpaces environments. This capability allows organizations to automate everyday workflows at scale while maintaining full enterprise-grade governance and compliance. To learn more, visit Micah Walter’s blog post.
  • Amazon EC2 M8idn/M8idb and R8idn/R8idb instances – These instances are powered by custom sixth-generation Intel Xeon Scalable processors available only on AWS and the latest sixth-generation AWS Nitro cards. These instances deliver up to 43% better compute performance per vCPU compared to previous-generation instances. M8idn/R8idn instances offer up to 600 Gbps network bandwidth, and M8idb/R8idb instances deliver up to 300 Gbps EBS bandwidth.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS page.

Additional updates
Here are some additional news items that you might find interesting:

  • Valkey turns two – Valkey stands as proof that open, community-driven technology innovates faster, scales further, and delivers more value than any single-vendor model. Valkey has surpassed 100 million Docker pulls (up 17x year over year) and attracted more than 225 contributors who have submitted over 1,500 pull requests, roughly double the development pace of Redis over the same period. You can also use the latest Valkey 9.0 in Amazon ElastiCache.
  • Query billion-scale vectors with SQL – You can learn how to query Amazon S3 Vectors from Amazon Aurora PostgreSQL-Compatible Edition using standard SQL, and how to combine vector similarity results with relational filters in a single query, for example, finding the most semantically similar products and then filtering by price, stock status, or tenant in one SQL statement.
  • Building an end-to-end agentic SRE using AWS DevOps Agent – Learn how to configure DevOps Agent Spaces that define an investigation scope, integrating seamlessly with Amazon CloudWatch, Splunk, GitHub, and Slack. You can also learn how to trigger automated investigations via webhooks, generate mitigation plans, and hand off agent-ready specs to coding agents like Kiro for implementation.
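
The "rank by similarity, then apply relational filters" idea from the vector query item above can be illustrated with a small in-memory sketch. This is not the Aurora or S3 Vectors integration itself: the product rows, the two-dimensional vectors, and the `search` helper are all hypothetical, and a real setup would express the same logic in one SQL statement.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(products, query_vec, max_price, top_k=2):
    """Rank rows by similarity, then keep in-stock rows at or under max_price."""
    ranked = sorted(products, key=lambda p: cosine(p["vec"], query_vec), reverse=True)
    hits = [p for p in ranked if p["in_stock"] and p["price"] <= max_price]
    return [p["name"] for p in hits[:top_k]]


# Hypothetical catalog rows; the vectors stand in for real embeddings.
products = [
    {"name": "trail shoe", "vec": [0.9, 0.1], "price": 80, "in_stock": True},
    {"name": "road shoe", "vec": [0.8, 0.2], "price": 150, "in_stock": True},
    {"name": "sandal", "vec": [0.6, 0.4], "price": 30, "in_stock": False},
    {"name": "rain boot", "vec": [0.1, 0.9], "price": 60, "in_stock": True},
]

print(search(products, [1.0, 0.0], max_price=100))
```

Here the second-closest match is dropped by the price filter and the third by stock status, which is the kind of combined result the post describes getting from a single query.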

For a full list of AWS blog posts, be sure to keep an eye on the AWS Blogs page.

Learn more about AWS, browse and join upcoming AWS-led in-person and virtual events, startup events, and developer-focused events as well as AWS Summits and AWS Community Days. Join the AWS Builder Center to connect with builders, share solutions, and access content that supports your development.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Channy

Modernize your workflows: Amazon WorkSpaces now gives AI agents their own desktop (preview)

Enterprises face a significant challenge when deploying AI agents: the desktop and legacy applications that power most business workflows are simply inaccessible to modern AI systems. According to a 2024 Gartner report, 75% of organizations run legacy applications that lack modern APIs, and 71% of Fortune 500 companies operate critical processes on mainframe systems without adequate programmatic access. For many organizations, this has meant choosing between delaying AI adoption or undertaking expensive and risky modernization projects.

Today, we are announcing that Amazon WorkSpaces now enables AI agents to securely operate desktop applications without requiring application modernization. The same managed virtual desktops that millions of employees use and trust can now also serve AI agents, turning WorkSpaces into infrastructure for scaling enterprise productivity, not just delivering it. Because agents operate within your existing WorkSpaces environment, there are no APIs to build, no application migrations to plan, and no new infrastructure to manage.

Some of our customers had an early opportunity to give their agents a WorkSpace. Chris Noon, Director, Nuvens Consulting shared with us, “WorkSpaces lets our clients give AI agents the same secure, governed desktop environment their employees already use — no custom API integrations, full audit trails, and enterprise-grade isolation out of the box. For regulated industries, that’s not a nice-to-have — it’s the baseline.”

Secure cloud desktop access for AI agents
With WorkSpaces, AI agents can securely access and operate desktop applications running inside managed WorkSpaces environments to complete complex business workflows. Agents authenticate through AWS Identity and Access Management (IAM) and connect via WorkSpaces, with complete audit trails available through AWS CloudTrail and Amazon CloudWatch. Because agents operate within secure WorkSpaces environments rather than on local machines, your existing security controls and compliance policies remain fully intact.

Amazon WorkSpaces supports the industry-standard Model Context Protocol (MCP), which means WorkSpaces works with any agent framework, such as LangChain, CrewAI, and Strands Agents.

Let’s try it out
To set up a WorkSpaces environment for AI agents, I started in the AWS Management Console by creating a new WorkSpaces Applications stack—the environment definition that controls how agents connect and what they’re allowed to do.

From the Amazon WorkSpaces console, I chose Create stack and configured the basics: name, fleet association, and VPC endpoints. In Step 3 of the stack creation workflow, I noticed the new AI agents section with two options. The first, No AI agent access, is the default configuration for standard WorkSpaces designed for people. The second, Add AI Agents, allows AI agents to securely access and operate applications using their own identity and permissions. I selected Add AI Agents to enable agent connections on this stack.

Next, I enabled storage before configuring the agent access settings that define how agents interact with the desktop.

Under Agent features, I enabled three capabilities. Computer input allows the agent to click, type, and scroll within the desktop. Computer vision allows the agent to capture screenshots of the desktop, which is how it “sees” the application. Finally, Screenshot storage configures where session screenshots are stored for audit and debugging.
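
To make the relationship between these capabilities concrete, here is a minimal sketch of the observe-then-act loop that computer vision and computer input enable. Everything in it is an assumption for illustration: `FakeDesktop`, `decide`, and `run_agent` are hypothetical names, not WorkSpaces or MCP APIs, and the "screenshot" is a plain dictionary rather than an image.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Stand-in for a remote desktop session; not a WorkSpaces API."""
    text_field: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real session would return image bytes; here we return the UI state.
        return {"text_field": self.text_field, "submitted": self.submitted}

    def type_text(self, text: str) -> None:
        self.text_field += text
        self.log.append(("type", text))

    def click(self, target: str) -> None:
        if target == "submit":
            self.submitted = True
        self.log.append(("click", target))


def decide(observation: dict, goal: str):
    """Toy policy standing in for a model call; returns the next action or None."""
    if observation["submitted"]:
        return None
    if observation["text_field"] != goal:
        return ("type", goal[len(observation["text_field"]):])
    return ("click", "submit")


def run_agent(desktop: FakeDesktop, goal: str, max_steps: int = 10) -> int:
    """Capture the screen, pick an action, send input; returns steps used."""
    for step in range(max_steps):
        action = decide(desktop.screenshot(), goal)
        if action is None:
            return step
        kind, arg = action
        if kind == "type":
            desktop.type_text(arg)
        else:
            desktop.click(arg)
    return max_steps


desktop = FakeDesktop()
steps = run_agent(desktop, "refill order 42")
print(steps, desktop.log)
```

The `log` list plays the role that stored screenshots play in a real session: a record of what the agent did that can be reviewed afterward.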

Under Desktop screen layout, I set the screen resolution to 1280×720 and image format to PNG. The resolution determines the fidelity of what the agent sees during a session—a complex application with dense UI elements might benefit from higher resolution, while a terminal-style interface works well at 720p.

With my stack configured, WorkSpaces exposes a managed MCP endpoint. I pointed my agent framework to this endpoint, provided IAM credentials for authentication, and my agent began interacting with the desktop applications installed on the fleet’s image.

To see this in action, here’s an agent built with the Strands Agents SDK and Amazon Bedrock handling a prescription refill: looking up the patient record, searching for the medication, placing the order, and confirming a successful refill, all inside a sample pharmacy system with no API.

The application doesn’t know an agent is driving it. Nothing about the software was modified, rebuilt, or integrated. The agent worked with it exactly as it exists today.

Now available
This feature is available today in public preview at no additional cost in US East (N. Virginia, Ohio), US West (Oregon), Canada (Central), Europe (Frankfurt, Ireland, Paris), and Asia Pacific (Tokyo, Mumbai, Sydney, Seoul, Singapore) Regions.

Get started building today using our GitHub repo, or visit the WorkSpaces page for more details.

Top announcements of the What’s Next with AWS, 2026

Today at What’s Next with AWS, Matt Garman, CEO of AWS, Colleen Aubrey, SVP Amazon Applied AI Solutions, Julia White, CMO of AWS, and OpenAI leaders discussed how they and their customers are changing how businesses operate with agents.

Here’s our roundup of the biggest announcements from the event:

Amazon Quick is an AI assistant for work that connects to the apps you use, learns what matters to you, and takes action on your behalf. Starting today, you can use the new desktop app, sign up for Free and Plus pricing plans, generate visual assets in the chat, and easily connect Quick to even more apps.

  • Quick’s new desktop app (Preview): You can create a personalized experience by staying connected to your local files, calendar, and communications without opening a browser.
  • New Free and Plus pricing plans for Quick: You can sign up within minutes using your personal email address or existing Google, Apple, GitHub, or Amazon credentials, with no AWS account required.
  • Generate visual assets on the fly: Available today, Quick now lets you create polished documents, presentations, infographics, and images directly from the chat interface, no design skills or hours of formatting required.
  • Easily connect Quick to even more apps: Also available today, Quick is expanding its native integrations to include Google Workspace, Zoom, Airtable, Dropbox, and Microsoft Teams.

To learn more, visit the About Amazon News post.

Amazon Connect is expanding from a single product into a set of four agentic AI solutions designed to work within your existing workflows: Amazon Connect Decisions (supply chains), Talent (hiring), Customer (customer experience), and Health (health care).

  • Amazon Connect Decisions is a supply chain planning and intelligence solution that shifts teams from crisis management to proactive planning and decisioning. AI teammates, combining 30 years of Amazon operational science and 25+ specialized supply chain tools, adapt to your business, learn from your team, and continuously improve your operations.
  • Amazon Connect Talent (Preview) is an agentic AI hiring solution built for talent acquisition leaders managing scaled hiring. It delivers AI-led interviews, science-backed assessments, and consistent evaluation, helping recruiters hire high-quality candidates faster while providing applicants with a flexible interview experience that reduces human preconceptions.
  • Amazon Connect Customer, previously known as Amazon Connect, delivers intelligent, personalized customer experiences across voice, chat, and digital channels. Amazon Connect Customer now offers new configuration capabilities that enable organizations to set up conversational AI in weeks, not months, and configure experiences without technical expertise.
  • Amazon Connect Health delivers agentic patient verification, appointment management, patient insights, ambient documentation, and medical coding — giving patients faster access to care, clinicians more time for care, and staff capacity for specialized work.

To learn more, visit the About Amazon News post.

AWS and OpenAI extended partnership
AWS and OpenAI are bringing the latest OpenAI models to Amazon Bedrock, launching Codex on Amazon Bedrock, and launching Amazon Bedrock Managed Agents, powered by OpenAI (all in limited preview), giving enterprises the frontier intelligence they want on the infrastructure they trust.

  • OpenAI models on Amazon Bedrock (Limited preview): The latest OpenAI models, including GPT-5.5 and GPT-5.4, will be available in preview on Amazon Bedrock. Use OpenAI’s frontier models through the same Bedrock APIs you already rely on, with unified security, governance, and cost controls. No additional infrastructure to configure, no new security model to learn.
  • Codex on Amazon Bedrock (Limited preview): You can access the OpenAI coding agent within the AWS environments where you already operate at scale. You can authenticate using your AWS credentials, process inference through Amazon Bedrock infrastructure, and apply Codex usage toward your AWS cloud commitments. Codex on Bedrock is available through the Bedrock API, starting with the Codex CLI, the Codex desktop app, and the Visual Studio Code extension.
  • Amazon Bedrock Managed Agents, powered by OpenAI (Limited preview): Amazon Bedrock Managed Agents combines frontier AI models with trusted AWS infrastructure, enabling customers to quickly and easily build production-ready OpenAI-powered agents in the cloud. It is built with the OpenAI harness, which is engineered to unlock the full potential of OpenAI frontier models, delivering faster execution, sharper reasoning, and reliable steering of long-running tasks.

To learn more, visit the AWS What’s New post and About Amazon News post.

Introducing Anthropic’s Claude Opus 4.7 model in Amazon Bedrock

Today, we’re announcing Claude Opus 4.7 in Amazon Bedrock, Anthropic’s most intelligent Opus model for advancing performance across coding, long-running agents, and professional work.

Claude Opus 4.7 is powered by Amazon Bedrock’s next-generation inference engine, delivering enterprise-grade infrastructure for production workloads. Bedrock’s new inference engine has brand-new scheduling and scaling logic that dynamically allocates capacity to requests, improving availability, particularly for steady-state workloads, while making room for rapidly scaling services. It provides zero operator access, meaning customer prompts and responses are never visible to Anthropic or AWS operators, which keeps sensitive data private.

According to Anthropic, the Claude Opus 4.7 model provides improvements across the workflows that teams run in production, such as agentic coding, knowledge work, visual understanding, and long-running tasks. Opus 4.7 works better through ambiguity, is more thorough in its problem solving, and follows instructions more precisely.

  • Agentic coding: The model extends Opus 4.6’s lead in agentic coding, with stronger performance on long-horizon autonomy, systems engineering, and complex code reasoning tasks. According to Anthropic, the model records high-performance scores with 64.3% on SWE-bench Pro, 87.6% on SWE-bench Verified, and 69.4% on Terminal-Bench 2.0.
  • Knowledge work: The model advances professional knowledge work, with stronger performance on document creation, financial analysis, and multi-step research workflows. The model reasons through underspecified requests, making sensible assumptions and stating them clearly, and self-verifies its output to improve quality on the first step. According to Anthropic, the model reaches 64.4% on Finance Agent v1.1.
  • Long-running tasks: The model stays on track over longer horizons, with stronger performance over its full 1M token context window as it reasons through ambiguity and self-verifies its output.
  • Vision: The model adds high-resolution image support, improving accuracy on charts, dense documents, and screen UIs where fine detail matters.

The model is an upgrade from Opus 4.6 but may require prompting changes and harness tweaks to get the most out of the model. To learn more, visit Anthropic’s prompting guide.

Claude Opus 4.7 model in action
You can get started with the Claude Opus 4.7 model in the Amazon Bedrock console. Choose Playground under the Test menu, and choose Claude Opus 4.7 when you select a model. Now you can test a complex coding prompt with the model.

I ran the following example prompt about a technical architecture decision:
Design a distributed architecture on AWS in Python that should support 100k requests per second across multiple geographic regions.

You can also access the model programmatically using the Anthropic Messages API, calling the bedrock-runtime or bedrock-mantle endpoints through the Anthropic SDK, or keep using the Invoke and Converse APIs on bedrock-runtime through the AWS Command Line Interface (AWS CLI) and AWS SDKs.

To make your first API call to Amazon Bedrock in minutes, choose Quickstart in the left navigation pane in the console. After choosing your use case, you can generate a short-term API key to authenticate your requests for testing purposes.

When you choose an API method, such as the OpenAI-compatible Responses API, you get sample code for sending your prompt as an inference request to the model.


To invoke the model through the Anthropic Claude Messages API, you can proceed as follows, using the anthropic[bedrock] SDK package for a streamlined experience:

from anthropic import AnthropicBedrockMantle

# Region to call; us-east-1 is an example value, use any Region where the model is available
REGION = "us-east-1"

# Initialize the Bedrock Mantle client (uses SigV4 auth automatically)
mantle_client = AnthropicBedrockMantle(aws_region=REGION)

# Create a message using the Messages API
message = mantle_client.messages.create(
    model="anthropic.claude-opus-4-7",
    max_tokens=2048,
    messages=[
        {"role": "user", "content": "Design a distributed architecture on AWS in Python that should support 100k requests per second across multiple geographic regions"}
    ],
)

# Print the text of the first content block in the response
print(message.content[0].text)

You can also run the following command to invoke the model directly to bedrock-runtime endpoint using the AWS CLI and the Invoke API:

aws bedrock-runtime invoke-model \
 --model-id anthropic.claude-opus-4-7 \
 --region us-east-1 \
 --body '{"messages": [{"role": "user", "content": "Design a distributed architecture on AWS in Python that should support 100k requests per second across multiple geographic regions."}], "max_tokens": 512, "temperature": 0.5, "top_p": 0.9}' \
 --cli-binary-format raw-in-base64-out \
 invoke-model-output.txt
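
The command writes the response body to invoke-model-output.txt as JSON. As a rough sketch, assuming the body follows the Messages API shape (a `content` list of typed blocks), you could pull the generated text out with a few lines of Python. The `extract_text` and `read_output` helpers and the sample payload are illustrative, not part of any SDK.

```python
import json
from pathlib import Path


def extract_text(response_body: dict) -> str:
    """Join the text blocks of a Messages-style response body."""
    blocks = response_body.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")


def read_output(path: str) -> str:
    """Load the file written by the CLI command and return the model's text."""
    return extract_text(json.loads(Path(path).read_text(encoding="utf-8")))


# Hand-written sample shaped like a Messages API response, for illustration only.
sample = {
    "role": "assistant",
    "content": [{"type": "text", "text": "Use multi-Region active-active."}],
    "stop_reason": "end_turn",
}
print(extract_text(sample))
```

After running the CLI command, `read_output("invoke-model-output.txt")` would return the text of the actual response. Filtering on the block type means non-text blocks, if any are present, are skipped rather than causing an error.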

For more intelligent reasoning capability, you can use Adaptive thinking with Claude Opus 4.7, which lets Claude dynamically allocate thinking token budgets based on the complexity of each request.

To learn more, visit the Anthropic Claude Messages API and check out code examples for multiple use cases and a variety of programming languages.

Things to know
Let me share some important technical details that I think you’ll find useful.

  • Choosing APIs: You can choose from a variety of Bedrock APIs for model inference, as well as the Anthropic Messages API. The Bedrock-native Converse API supports multi-turn conversations and Guardrails integration. The Invoke API provides direct model invocation and lowest-level control.
  • Scaling and capacity: Bedrock’s new inference engine is designed to rapidly provision and serve capacity across many different models. When accepting requests, we prioritize keeping steady state workloads running, and ramp usage and capacity rapidly in response to changes in demand. During periods of high demand, requests are queued, rather than rejected. Up to 10,000 requests per minute (RPM) per account per Region are available immediately, with more available upon request.
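
For the multi-turn use of the Converse API mentioned above, a minimal sketch with the AWS SDK for Python (Boto3) might look like the following. It assumes the model ID from the earlier examples works with Converse in your Region (some models require an inference profile ID instead), and the `build_converse_request` and `converse_once` helpers are hypothetical names introduced for illustration. The network call is kept in a separate function so that the request-building part runs without credentials.

```python
def build_converse_request(model_id: str, history: list, user_text: str,
                           max_tokens: int = 512) -> dict:
    """Return keyword arguments for Converse with one more user turn appended."""
    messages = list(history) + [{"role": "user", "content": [{"text": user_text}]}]
    return {
        "modelId": model_id,
        "messages": messages,
        "inferenceConfig": {"maxTokens": max_tokens},
    }


def converse_once(request: dict, region: str = "us-east-1") -> str:
    """Send the request; needs AWS credentials and access to the model."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.converse(**request)
    # Assumes the first content block is text; other block types may appear.
    return response["output"]["message"]["content"][0]["text"]


# Earlier turns are passed back on every call; the API itself is stateless.
history = [
    {"role": "user", "content": [{"text": "Suggest a caching layer."}]},
    {"role": "assistant", "content": [{"text": "Consider Amazon ElastiCache."}]},
]
request = build_converse_request("anthropic.claude-opus-4-7", history,
                                 "How would it work across Regions?")
print(len(request["messages"]))
```

Calling `converse_once(request)` would send the three-message conversation and return the model's reply, which you would then append to `history` before the next turn.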

Now available
Anthropic’s Claude Opus 4.7 model is available today in the US East (N. Virginia), Asia Pacific (Tokyo), Europe (Ireland), and Europe (Stockholm) Regions; check the full list of Regions for future updates. To learn more, visit the Claude by Anthropic in Amazon Bedrock page and the Amazon Bedrock pricing page.

Give Anthropic’s Claude Opus 4.7 a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: Claude Mythos Preview in Amazon Bedrock, AWS Agent Registry, and more (April 13, 2026)

In my last Week in Review post, I mentioned how much time I’ve been spending on AI-Driven Development Lifecycle (AI-DLC) workshops with customers this year. A common theme in those sessions is the need for better cost visibility. Teams are moving fast with AI, but as they go from experimenting to full production, finance and leadership really need to know who is using which resources and at what cost. That’s why I was so excited to see the launch of Amazon Bedrock’s new support for cost allocation by IAM user and role this week. This lets you tag IAM principals with attributes like team or cost center and then activate those tags in your Billing and Cost Management console. The resulting cost data flows into AWS Cost Explorer and the detailed Cost and Usage Report, giving you a clear line of sight into model inference spending. Whether you’re scaling agents across teams, tracking foundation model use by department, or running tools like Claude Code on Amazon Bedrock, this new feature is a game changer for tracking and managing your AI investments. You can get all the details on setting this up in the IAM principal cost allocation documentation.
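
As a rough sketch of the tagging step, the following uses the IAM `tag_role` call from the AWS SDK for Python (Boto3). The tag keys (`team`, `cost-center`), the role name, and the helper functions are assumptions chosen for illustration; check the documentation for any required key names, and note that the tags still need to be activated in the Billing and Cost Management console as described above.

```python
def cost_tags(team: str, cost_center: str) -> list:
    """Build the Tags list in the shape IAM tagging calls expect."""
    return [
        {"Key": "team", "Value": team},
        {"Key": "cost-center", "Value": cost_center},
    ]


def tag_role_for_cost_allocation(role_name: str, tags: list) -> None:
    """Attach tags to an IAM role; needs AWS credentials and iam:TagRole."""
    import boto3  # imported lazily so cost_tags works without boto3

    boto3.client("iam").tag_role(RoleName=role_name, Tags=tags)


# Hypothetical values; replace with your own team and cost center.
tags = cost_tags("search-platform", "cc-1234")
print([t["Key"] for t in tags])
```

Calling `tag_role_for_cost_allocation("my-bedrock-app-role", tags)` with a real role name would apply the tags; IAM users can be tagged the same way with `tag_user`.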

Now, let’s get into this week’s AWS news…

Headlines
Amazon Bedrock now offers Claude Mythos Preview – Anthropic’s most sophisticated AI model to date is now available on Amazon Bedrock as a gated research preview through Project Glasswing. Claude Mythos introduces a new model class focused on cybersecurity, capable of identifying sophisticated security vulnerabilities in software, analyzing large codebases, and delivering state-of-the-art performance across cybersecurity, coding, and complex reasoning tasks. Security teams can use it to discover and address vulnerabilities in critical software before threats emerge. Access is currently limited to allowlisted organizations, with Anthropic and AWS prioritizing internet-critical companies and open source maintainers.

AWS Agent Registry for centralized agent discovery and governance now in preview – AWS launched Agent Registry through Amazon Bedrock AgentCore, providing organizations with a private catalog for discovering and managing AI agents, tools, skills, MCP servers, and custom resources. The registry helps teams locate existing capabilities rather than duplicating them, with semantic and keyword search, approval workflows, and CloudTrail audit trails. It is accessible via the AgentCore Console, AWS CLI, SDK, and as an MCP server queryable from IDEs.

Last week’s launches
Here are some launches and updates from this past week that caught my attention:

  • Announcing Amazon S3 Files, making S3 buckets accessible as file systems – Amazon S3 Files transforms S3 buckets into shared file systems that connect any AWS compute resource directly with your S3 data. Built on Amazon EFS technology, it delivers full file system semantics with low-latency performance, caching actively used data and providing multiple terabytes per second of aggregate read throughput. Applications can access S3 data through both file system and S3 APIs simultaneously without code modifications or data migration.
  • Amazon OpenSearch Service supports Managed Prometheus and agent tracing – Amazon OpenSearch Service now provides a unified observability platform that consolidates metrics, logs, traces, and AI agent tracing into a single interface. The update includes native Prometheus integration with direct PromQL query support, RED metrics monitoring, and OpenTelemetry GenAI semantic convention support for LLM execution visibility. Operations teams can correlate slow traces to logs and overlay Prometheus metrics on dashboards without switching between tools.
  • Amazon WorkSpaces Advisor now available for AI-powered troubleshooting – AWS launched Amazon WorkSpaces Advisor, an AI-powered administrative tool that uses generative AI to help IT administrators troubleshoot Amazon WorkSpaces Personal deployments. It analyzes WorkSpace configurations, detects problems automatically, and provides actionable recommendations to restore service and optimize performance.
  • Amazon Braket adds support for Rigetti’s 108-qubit Cepheus QPU – Amazon Braket now offers access to Rigetti’s Cepheus-1-108Q device, the first 100+ qubit superconducting quantum processor on the platform. The modular design features twelve 9-qubit chiplets with CZ gates that offer enhanced resilience to phase errors. It supports multiple frameworks including Braket SDK, Qiskit, CUDA-Q, and PennyLane, with pulse-level control for researchers.

For a full list of AWS announcements, be sure to keep an eye on the What’s New with AWS page.

Other AWS news
Here are some additional posts and resources that you might find interesting:

Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

  • What’s Next with AWS (April 28, Virtual) – Join this livestream at 9am PT for a candid discussion about how agentic AI is transforming how businesses operate. AWS CEO Matt Garman, SVP Colleen Aubrey, and OpenAI leaders will discuss emerging agent capabilities, Amazon’s internal experiences, and new agentic solutions and platform capabilities.

Browse here for upcoming AWS-led in-person and virtual events, startup events, and developer-focused events.


That’s all for this week. Check back next Monday for another Weekly Roundup!

~ micah