
Amazon EC2 X8i instances powered by custom Intel Xeon 6 processors are generally available for memory-intensive workloads


Following the preview launch at AWS re:Invent 2025, we’re announcing the general availability of the new memory-optimized Amazon Elastic Compute Cloud (Amazon EC2) X8i instances. These instances are powered by custom Intel Xeon 6 processors with a sustained all-core turbo frequency of 3.9 GHz, available only on AWS. These SAP-certified instances deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud.

X8i instances are ideal for memory-intensive workloads including in-memory databases such as SAP HANA, traditional large-scale databases, data analytics, and electronic design automation (EDA), which require high compute performance and a large memory footprint.

These instances provide 1.5 times more memory capacity (up to 6 TB) and 3.4 times more memory bandwidth than previous-generation X2i instances. They offer up to 43% higher performance than X2i instances, with even higher gains on some real-world workloads: up to 50% higher SAP Application Performance Standard (SAPS) performance, up to 47% faster PostgreSQL performance, up to 88% faster Memcached performance, and up to 46% faster AI inference performance.

During the preview, customers like RISE with SAP utilized up to 6 TB of memory capacity with 50% higher compute performance compared to X2i instances. This enabled faster transaction processing and improved query response times for SAP HANA workloads. Orion reduced the number of active cores on X8i instances compared to X2idn instances while maintaining performance thresholds, cutting SQL Server licensing costs by 50%.

X8i instances
X8i instances are available in 14 sizes, including three larger sizes (48xlarge, 64xlarge, and 96xlarge) so you can scale up to the right size for your application, and two bare metal sizes (metal-48xl and metal-96xl) for workloads that benefit from direct access to physical resources. X8i instances feature up to 100 Gbps of network bandwidth with support for Elastic Fabric Adapter (EFA) and up to 80 Gbps of throughput to Amazon Elastic Block Store (Amazon EBS).

Here are the specs for X8i instances:

Instance name vCPUs Memory (GiB) Network bandwidth (Gbps) EBS bandwidth (Gbps)
x8i.large 2 32 Up to 12.5 Up to 10
x8i.xlarge 4 64 Up to 12.5 Up to 10
x8i.2xlarge 8 128 Up to 15 Up to 10
x8i.4xlarge 16 256 Up to 15 Up to 10
x8i.8xlarge 32 512 15 10
x8i.12xlarge 48 768 22.5 15
x8i.16xlarge 64 1,024 30 20
x8i.24xlarge 96 1,536 40 30
x8i.32xlarge 128 2,048 50 40
x8i.48xlarge 192 3,072 75 60
x8i.64xlarge 256 4,096 80 70
x8i.96xlarge 384 6,144 100 80
x8i.metal-48xl 192 3,072 75 60
x8i.metal-96xl 384 6,144 100 80
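
If you prefer to pull these specs programmatically, the EC2 DescribeInstanceTypes API returns them once the X8i instance types are visible in your Region. Here is a minimal AWS CLI sketch; the output fields selected are just an illustrative subset:

aws ec2 describe-instance-types \
  --filters "Name=instance-type,Values=x8i.*" \
  --query "InstanceTypes[].[InstanceType, VCpuInfo.DefaultVCpus, MemoryInfo.SizeInMiB, NetworkInfo.NetworkPerformance]" \
  --output table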

X8i instances support the instance bandwidth configuration (IBC) feature like other eighth-generation instance types, offering flexibility to allocate resources between network and EBS bandwidth. You can scale network or EBS bandwidth by up to 25%, improving database performance, query processing speeds, and logging efficiency. These instances also use sixth-generation AWS Nitro cards, which offload CPU virtualization, storage, and networking functions to dedicated hardware and software, enhancing performance and security for your workloads.
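
As a rough illustration of IBC, here is a hedged AWS CLI sketch that shifts bandwidth toward EBS at launch, assuming X8i exposes the same RunInstances bandwidth weighting option used by other eighth-generation instance families; the AMI ID, subnet ID, and key name are placeholders:

# Launch an X8i instance with bandwidth weighted toward EBS (placeholder IDs)
aws ec2 run-instances \
  --instance-type x8i.48xlarge \
  --image-id ami-0123456789abcdef0 \
  --subnet-id subnet-0123456789abcdef0 \
  --key-name my-key-pair \
  --network-performance-options "BandwidthWeighting=ebs-1" \
  --count 1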

Now available
Amazon EC2 X8i instances are now available in US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) AWS Regions. For Regional availability and a future roadmap, search the instance type in the CloudFormation resources tab of AWS Capabilities by Region.

You can purchase these instances as On-Demand Instances, Savings Plans, and Spot Instances. To learn more, visit the Amazon EC2 Pricing page.

Give X8i instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 X8i instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: AWS Lambda for .NET 10, AWS Client VPN quickstart, Best of AWS re:Invent, and more (January 12, 2026)


At the beginning of January, I tend to set my top resolutions for the year as a way to focus on what I want to achieve. If AI and cloud computing are on your resolution list, consider creating an AWS Free Tier account to receive up to $200 in credits and get 6 months of risk-free experimentation with AWS services.

During this period, you can explore essential services across compute, storage, databases, and AI/ML, plus access over 30 always-free services with monthly usage limits. After 6 months, you can decide whether to upgrade to a standard AWS account.

Whether you’re a student exploring career options, a developer expanding your skill set, or a professional building with cloud technologies, this hands-on approach lets you focus on what matters most: developing real expertise in the areas you’re passionate about.

Last week’s launches
Here are the launches that got my attention this week:

Additional updates
Here are some additional projects, blog posts, and news items that I found interesting:

Crossmodal search with Amazon Nova Multimodal Embeddings Architecture

Upcoming AWS events
Join us January 28 or 29 (depending on your time zone) for Best of AWS re:Invent, a free virtual event where we bring you the most impactful announcements and top sessions from AWS re:Invent. Jeff Barr, AWS VP and Chief Evangelist, will share his highlights during the opening session.

There is still time until January 21 to compete for $250,000 in prizes and AWS credits in the Global 10,000 AIdeas Competition (yes, the second letter is an I as in Idea, not an L as in like). No code required yet: simply submit your idea, and if you’re selected as a semifinalist, you’ll build your app using Kiro within AWS Free Tier limits. Beyond the cash prizes and potential featured placement at AWS re:Invent 2026, you’ll gain hands-on experience with next-generation AI tools and connect with innovators globally.

If you’re interested in these opportunities, join the AWS Builder Center to learn with builders in the AWS community.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Danilo

Happy New Year! AWS Weekly Roundup: 10,000 AIdeas Competition, Amazon EC2, Amazon ECS Managed Instances and more (January 5, 2026)


Happy New Year! I hope the holidays gave you time to recharge and spend time with your loved ones.

Like every year, I took a few weeks off after AWS re:Invent to rest and plan ahead. I used some of that downtime to plan the next cohort for Become a Solutions Architect (BeSA). BeSA is a free mentoring program that I, along with a few other Amazon Web Services (AWS) employees, volunteer to host as a way to help people excel in their cloud and AI careers. We’re kicking off a 6-week cohort on “Agentic AI on AWS” starting February 21, 2026. Visit the BeSA website to learn more.

There is still time to submit your idea for the Global 10,000 AIdeas Competition and compete for $250,000 in cash prizes, AWS credits, and recognition, including potential featured placement at AWS re:Invent 2026 and across AWS channels.

You will gain hands-on experience with next-generation AI development tools, connect with innovators globally, and access technical enablement through biweekly workshops, AWS User Groups, and AWS Builder Center resources.

The deadline is January 21, 2026, and no code is required yet. If you’re selected as a semifinalist, you’ll build your app then. Your finished app needs to use Kiro for at least part of development, stay within AWS Free Tier limits, and be completely original and not yet published.

If you haven’t yet caught up with all the new releases and announcements from AWS re:Invent 2025, check out our top announcements post or watch the keynotes, innovation talks, and breakout sessions on-demand.

Launches from the last few weeks
I’d like to highlight some launches that got my attention since our last Week in Review on December 15, 2025:

  • Amazon EC2 M8gn and M8gb instances – New M8gn and M8gb instances are powered by AWS Graviton4 processors to deliver up to 30% better compute performance than AWS Graviton3 processors. M8gn instances feature the latest 6th generation AWS Nitro Cards and offer up to 600 Gbps of network bandwidth, the highest among network-optimized EC2 instances. M8gb instances offer up to 150 Gbps of Amazon EBS bandwidth, providing higher EBS performance compared to same-sized Graviton4-based instances.
  • AWS Direct Connect supports resilience testing with AWS Fault Injection Service – You can now use AWS Fault Injection Service to test how your applications handle Direct Connect Border Gateway Protocol (BGP) failover in a controlled environment. For example, you can validate that traffic routes to redundant virtual interfaces when a primary virtual interface’s BGP session is disrupted and your applications continue to function as expected.
  • New AWS Security Hub controls in AWS Control Tower – AWS Control Tower now supports 176 additional Security Hub controls in the Control Catalog, covering use cases including security, cost, durability, and operations. With this launch, you can search, discover, enable, and manage these controls directly from AWS Control Tower to govern additional use cases across your multi-account environment.
  • AWS Transform supports network conversion for hybrid data center migrations – You can now use AWS Transform for VMware to automatically convert networks from hybrid data centers. This removes manual network mapping for environments running both VMware and other workloads. The service analyzes VLANs and IP ranges across all exported source networks and maps them to AWS constructs such as virtual private clouds (VPCs), subnets, and security groups.
  • NVIDIA Nemotron 3 Nano available on Amazon Bedrock – Amazon Bedrock now supports NVIDIA Nemotron 3 Nano 30B A3B model, NVIDIA’s latest breakthrough in efficient language modeling that delivers high reasoning performance, built-in tool calling support, and extended context processing with 256K token context window.
  • Amazon EC2 supports Availability Zone ID across its APIs – You can specify the Availability Zone ID (AZ ID) parameter directly in Amazon EC2 APIs to guarantee consistent placement of resources. AZ IDs are consistent, static identifiers that represent the same physical location across all AWS accounts, helping you optimize resource placement. Prior to this launch, you had to use an AZ name when creating a resource, but the same AZ name can map to different physical locations in different accounts, which made it difficult to ensure resources were always co-located, especially when operating across multiple accounts (see the CLI sketch after this list).
  • Amazon ECS Managed Instances supports Amazon EC2 Spot Instances – Amazon ECS Managed Instances now supports Amazon EC2 Spot Instances, extending the range of capabilities available with AWS managed infrastructure. You can use spare EC2 capacity at up to 90% discount compared to On-Demand prices for fault-tolerant workloads in Amazon ECS Managed Instances.
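
To make the Availability Zone ID item above concrete, here is a small AWS CLI sketch. The mapping lookup and the subnet parameter shown here are long-standing; what's new with this launch is that more EC2 APIs accept the AZ ID directly. The VPC ID and CIDR block are placeholders:

# Look up the static zone IDs behind this account's AZ names
aws ec2 describe-availability-zones \
  --query "AvailabilityZones[].[ZoneName, ZoneId]" \
  --output table

# Pin a resource to a physical location by zone ID rather than zone name
aws ec2 create-subnet \
  --vpc-id vpc-0123456789abcdef0 \
  --cidr-block 10.0.1.0/24 \
  --availability-zone-id use1-az1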

See AWS What’s New for more launch news that I haven’t covered here. That’s all for this week. Check back next Monday for another Weekly Roundup!

Here’s to a fantastic start to 2026. Happy building!

– Prasad

AWS Weekly Roundup: Amazon ECS, Amazon CloudWatch, Amazon Cognito and more (December 15, 2025)


Can you believe it? We’re nearly at the end of 2025. And what a year it’s been! From re:Invent recap events to AWS Summits, AWS Innovate, AWS re:Inforce, Community Days, and DevDays, topped off recently by the cherry on the cake, re:Invent 2025, we have lived through a year filled with exciting moments and technology advancements that continue to shape our modern world.

Speaking of re:Invent, if you haven’t caught up yet on all the new releases and announcements (and there were plenty of exciting launches across every area), be sure to check out our curated post highlighting the top announcements from AWS re:Invent 2025. We’ve organized all the key releases into easy-to-navigate categories and included links so you can dive deeper into anything that sparks your interest.

While the year may be wrapping up, our teams are still busy working on things that you have either asked for as customers or that we proactively create to make your lives easier. Last week had quite a few interesting releases as usual, so let’s look at a few that I think could be useful for many of you out there.

Last week’s launches

Amazon WorkSpaces Secure Browser introduces Web Content Filtering – Organizations can now control web access through category-based filtering across 25+ predefined categories, granular URL policies, and integrated compliance logging. The feature works alongside existing Chrome policies, integrates with Session Logger for enhanced monitoring, and is available at no additional cost in 10 AWS Regions with pay-as-you-go pricing.

Amazon Aurora DSQL now supports cluster creation in seconds – Developers can now instantly provision Aurora DSQL databases with setup time reduced from minutes to seconds, enabling rapid prototyping through the integrated AWS console query editor or AI-powered development via the Aurora DSQL Model Context Protocol server. Available at no additional cost in all AWS Regions where Aurora DSQL is offered, with AWS Free Tier access available.

Amazon Aurora PostgreSQL now supports integration with Kiro powers – Developers can now accelerate Aurora PostgreSQL application development using AI-assisted coding through Kiro powers, a repository of pre-packaged Model Context Protocol servers. The Aurora PostgreSQL integration provides direct database connectivity for queries, schema management, and cluster operations, dynamically loading relevant context as developers work. Available for one-click installation in Kiro IDE across all AWS Regions.

Amazon ECS now supports custom container stop signals on AWS Fargate – Fargate tasks now honor the stop signal configured in container images, enabling graceful shutdowns for containers that rely on signals like SIGQUIT or SIGINT instead of the default SIGTERM. The ECS container agent reads the STOPSIGNAL instruction from OCI-compliant images and sends the appropriate signal during task termination. Available at no additional cost across all AWS Regions.

Amazon CloudWatch SDK supports optimized JSON, CBOR protocols – CloudWatch SDK now defaults to JSON and CBOR protocols, delivering lower latency, reduced payload sizes, and decreased client-side CPU and memory usage compared to the traditional AWS Query protocol. Available at no additional cost across all AWS Regions and SDK language variants.

Amazon Cognito identity pools now support private connectivity with AWS PrivateLink – Organizations can now securely exchange federated identities for temporary AWS credentials through private VPC connections, eliminating the need to route authentication traffic over the public internet. Available in all AWS Regions where Cognito identity pools are supported, except AWS China (Beijing) and AWS GovCloud (US) Regions.

AWS Application Migration Service supports IPv6 – Organizations can now migrate applications using IPv6 addressing through dual-stack service endpoints that support both IPv4 and IPv6 communications. During replication, testing, and cutover phases, you can use IPv4, IPv6, or dual-stack configurations to launch servers in your target environment. Available at no additional cost in all AWS Regions that support MGN and EC2 dual-stack endpoints.

And that’s it for the AWS News Blog Weekly Roundup…not just for this week, but for 2025! We’ll be taking a break and returning in January to continue bringing you the latest AWS releases and updates.

As we close out 2025, it’s remarkable to look back at just how much has changed since the beginning of the year. From groundbreaking AI capabilities to transformative infrastructure innovations, AWS has delivered an incredible year of releases that have reshaped what’s possible in the cloud. Throughout it all, the AWS News Blog has been right here with you every week with our Weekly Roundup series, helping you stay informed and ready to take advantage of each new opportunity as it arrived. We’re grateful you’ve joined us on this journey, and we can’t wait to continue bringing you the latest AWS innovations when we return in January 2026.

Until then, happy building, and here’s to an even more exciting year ahead!

Matheus Guimaraes | @codingmatheus

New serverless customization in Amazon SageMaker AI accelerates model fine-tuning


Today, I’m happy to announce new serverless customization in Amazon SageMaker AI for popular AI models, such as Amazon Nova, DeepSeek, GPT-OSS, Llama, and Qwen. The new customization capability provides an easy-to-use interface for the latest fine-tuning techniques like reinforcement learning, so you can accelerate the AI model customization process from months to days.

With a few clicks, you can seamlessly select a model and customization technique, and handle model evaluation and deployment—all entirely serverless so you can focus on model tuning rather than managing infrastructure. When you choose serverless customization, SageMaker AI automatically selects and provisions the appropriate compute resources based on the model and data size.

Getting started with serverless model customization
You can get started customizing models in Amazon SageMaker Studio. Choose Models in the left navigation pane and check out your favorite AI models to be customized.

Customize with UI
You can customize AI models in only a few clicks. In the Customize model dropdown list for a specific model such as Meta Llama 3.1 8B Instruct, choose Customize with UI.

You can select a customization technique used to adapt the base model to your use case. SageMaker AI supports Supervised Fine-Tuning and the latest model customization techniques including Direct Preference Optimization, Reinforcement Learning from Verifiable Rewards (RLVR), and Reinforcement Learning from AI Feedback (RLAIF). Each technique optimizes models in different ways, with selection influenced by factors such as dataset size and quality, available computational resources, task at hand, desired accuracy levels, and deployment constraints.

Upload or select a training dataset to match the format required by the customization technique selected. Use the values of batch size, learning rate, and number of epochs recommended by the technique selected. You can configure advanced settings such as hyperparameters, a newly introduced serverless MLflow application for experiment tracking, and network and storage volume encryption. Choose Submit to get started on your model training job.

After your training job is complete, you can see the models you created in the My Models tab. Choose View details in one of your models.

By choosing Continue customization, you can continue to customize your model by adjusting hyperparameters or training with different techniques. By choosing Evaluate, you can evaluate your customized model to see how it performs compared to the base model.

When you complete both jobs, you can choose either SageMaker or Bedrock in the Deploy dropdown list to deploy your model.

You can choose Amazon Bedrock for serverless inference. Choose Bedrock and the model name to deploy the model into Amazon Bedrock. To find your deployed models, choose Imported models in the Bedrock console.

You can also deploy your model to a SageMaker AI inference endpoint if you want to control your deployment resources such as an instance type and instance count. After the SageMaker AI deployment is In service, you can use this endpoint to perform inference. In the Playground tab, you can test your customized model with a single prompt or chat mode.
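
If you prefer the CLI over the Playground tab, here is a minimal sketch of invoking the customized model once the endpoint is in service; the endpoint name and request payload are placeholders, and the expected body format depends on the model you deployed:

aws sagemaker-runtime invoke-endpoint \
  --endpoint-name my-customized-model-endpoint \
  --content-type application/json \
  --cli-binary-format raw-in-base64-out \
  --body '{"inputs": "Summarize our refund policy in one sentence."}' \
  response.json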

With the serverless MLflow capability, you can automatically log all critical experiment metrics without modifying code and access rich visualizations for further analysis.

Customize with code
When you choose Customize with code, you see a sample notebook to fine-tune or deploy AI models. If you want to edit the sample notebook, open it in JupyterLab. Alternatively, you can deploy the model immediately by choosing Deploy.

You can choose an Amazon Bedrock or SageMaker AI endpoint by selecting deployment resources from either Amazon SageMaker Inference or Amazon SageMaker HyperPod.

When you choose Deploy at the bottom right of the page, you’re redirected back to the model detail page. After the SageMaker AI deployment is in service, you can use this endpoint to perform inference.

Okay, you’ve seen how to streamline model customization in SageMaker AI. You can now choose whichever approach you prefer. To learn more, visit the Amazon SageMaker AI Developer Guide.

Now available
New serverless AI model customization in Amazon SageMaker AI is now available in the US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland) Regions. You only pay for the tokens processed during training and inference. To learn more, visit the Amazon SageMaker AI pricing page.

Give it a try in Amazon SageMaker Studio and send feedback to AWS re:Post for SageMaker or through your usual AWS Support contacts.

Channy

Introducing checkpointless and elastic training on Amazon SageMaker HyperPod


Today, we’re announcing two new AI model training features within Amazon SageMaker HyperPod: checkpointless training, an approach that mitigates the need for traditional checkpoint-based recovery by enabling peer-to-peer state recovery, and elastic training, enabling AI workloads to automatically scale based on resource availability.

  • Checkpointless training – Checkpointless training eliminates disruptive checkpoint-restart cycles, maintaining forward training momentum despite failures, reducing recovery time from hours to minutes. Accelerate your AI model development, reclaim days from development timelines, and confidently scale training workflows to thousands of AI accelerators.
  • Elastic training  – Elastic training maximizes cluster utilization as training workloads automatically expand to use idle capacity as it becomes available, and contract to yield resources as higher-priority workloads like inference volumes peak. Save hours of engineering time per week spent reconfiguring training jobs based on compute availability.

Rather than spending time managing training infrastructure, these new training techniques mean that your team can concentrate entirely on enhancing model performance, ultimately getting your AI models to market faster. By eliminating the traditional checkpoint dependencies and fully utilizing available capacity, you can significantly reduce model training completion times.

Checkpointless training: How it works
Traditional checkpoint-based recovery has these sequential job stages: 1) job termination and restart, 2) process discovery and network setup, 3) checkpoint retrieval, 4) data loader initialization, and 5) training loop resumption. When failures occur, each stage can become a bottleneck and training recovery can take up to an hour on self-managed training clusters. The entire cluster must wait for every single stage to complete before training can resume. This can lead to the entire training cluster sitting idle during recovery operations, which increases costs and extends the time to market.

Checkpointless training removes this bottleneck entirely by maintaining continuous model state preservation across the training cluster. When failures occur, the system instantly recovers by using healthy peers, avoiding the need for a checkpoint-based recovery that requires restarting the entire job. As a result, checkpointless training enables fault recovery in minutes.

Checkpointless training is designed for incremental adoption and built on four core components that work together: 1) collective communications initialization optimizations, 2) memory-mapped data loading that enables caching, 3) in-process recovery, and 4) checkpointless peer-to-peer state replication. These components are orchestrated through the HyperPod training operator that is used to launch the job. Each component optimizes a specific step in the recovery process, and together they enable automatic detection and recovery of infrastructure faults in minutes with zero manual intervention, even with thousands of AI accelerators. You can progressively enable each of these features as your training scales.

The latest Amazon Nova models were trained using this technology on tens of thousands of accelerators. Additionally, based on internal studies on cluster sizes ranging from 16 to over 2,000 GPUs, checkpointless training showed significant improvements in recovery times, reducing downtime by over 80% compared to traditional checkpoint-based recovery.

To learn more, visit HyperPod Checkpointless Training in the Amazon SageMaker AI Developer Guide.

Elastic training: How it works
On clusters that run different types of modern AI workloads, accelerator availability can change continuously throughout the day as short-duration training runs complete, inference spikes occur and subside, or resources free up from completed experiments. Despite this dynamic availability of AI accelerators, traditional training workloads remain locked into their initial compute allocation, unable to take advantage of idle accelerators without manual intervention. This rigidity leaves valuable GPU capacity unused and prevents organizations from maximizing their infrastructure investment.

Elastic training transforms how training workloads interact with cluster resources. Training jobs can automatically scale up to utilize available accelerators and gracefully contract when resources are needed elsewhere, all while maintaining training quality.

Workload elasticity is enabled through the HyperPod training operator that orchestrates scaling decisions through integration with the Kubernetes control plane and resource scheduler. It continuously monitors cluster state through three primary channels: pod lifecycle events, node availability changes, and resource scheduler priority signals. This comprehensive monitoring enables near-instantaneous detection of scaling opportunities, whether from newly available resources or requests from higher-priority workloads.

The scaling mechanism relies on adding and removing data parallel replicas. When additional compute resources become available, new data parallel replicas join the training job, accelerating throughput. Conversely, during scale-down events (for example, when a higher-priority workload requests resources), the system scales down by removing replicas rather than terminating the entire job, allowing training to continue at reduced capacity.

Across different scales, the system preserves the global batch size and adapts learning rates, preventing model convergence from being adversely impacted. This enables workloads to dynamically scale up or down to utilize available AI accelerators without any manual intervention.

You can start elastic training through the HyperPod recipes for publicly available foundation models (FMs) including Llama and GPT-OSS. Additionally, you can modify your PyTorch training scripts to add elastic event handlers, which enable the job to dynamically scale.

To learn more, visit HyperPod Elastic Training in the Amazon SageMaker AI Developer Guide. To get started, find the HyperPod recipes in the AWS GitHub repository.

Now available
Both features are available in all AWS Regions where Amazon SageMaker HyperPod is available, at no additional cost. To learn more, visit the SageMaker HyperPod product page and SageMaker AI pricing page.

Give it a try and send feedback to AWS re:Post for SageMaker or through your usual AWS Support contacts.

Channy

Announcing replication support and Intelligent-Tiering for Amazon S3 Tables


Today, we’re announcing two new capabilities for Amazon S3 Tables: support for the new Intelligent-Tiering storage class that automatically optimizes costs based on access patterns, and replication support to automatically maintain consistent Apache Iceberg table replicas across AWS Regions and accounts without manual sync.

Organizations working with tabular data face two common challenges. First, they need to manually manage storage costs as their datasets grow and access patterns change over time. Second, when maintaining replicas of Iceberg tables across Regions or accounts, they must build and maintain complex architectures to track updates, manage object replication, and handle metadata transformations.

S3 Tables Intelligent-Tiering storage class
With the S3 Tables Intelligent-Tiering storage class, data is automatically tiered to the most cost-effective access tier based on access patterns. Data is stored in three low-latency tiers: Frequent Access, Infrequent Access (40% lower cost than Frequent Access), and Archive Instant Access (68% lower cost compared to Infrequent Access). After 30 days without access, data moves to Infrequent Access, and after 90 days, it moves to Archive Instant Access. This happens without changes to your applications or impact on performance.

Table maintenance activities, including compaction, snapshot expiration, and unreferenced file removal, operate without affecting the data’s access tiers. Compaction automatically processes only data in the Frequent Access tier, optimizing performance for actively queried data while reducing maintenance costs by skipping colder files in lower-cost tiers.

By default, all existing tables use the Standard storage class. When creating new tables, you can specify Intelligent-Tiering as the storage class, or you can rely on the default storage class configured at the table bucket level. You can set Intelligent-Tiering as the default storage class for your table bucket to automatically store tables in Intelligent-Tiering when no storage class is specified during creation.

Let me show you how it works
You can use the AWS Command Line Interface (AWS CLI) and the put-table-bucket-storage-class and get-table-bucket-storage-class commands to change or verify the storage tier of your S3 table bucket.

# Change the storage class
aws s3tables put-table-bucket-storage-class \
   --table-bucket-arn $TABLE_BUCKET_ARN \
   --storage-class-configuration storageClass=INTELLIGENT_TIERING

# Verify the storage class
aws s3tables get-table-bucket-storage-class \
   --table-bucket-arn $TABLE_BUCKET_ARN

{
   "storageClassConfiguration": {
      "storageClass": "INTELLIGENT_TIERING"
   }
}

S3 Tables replication support
The new S3 Tables replication support helps you maintain consistent read replicas of your tables across AWS Regions and accounts. You specify the destination table bucket and the service creates read-only replica tables. It replicates all updates chronologically while preserving parent-child snapshot relationships. Table replication helps you build global datasets to minimize query latency for geographically distributed teams, meet compliance requirements, and provide data protection.

You can now easily create replica tables that deliver similar query performance as their source tables. Replica tables are updated within minutes of source table updates and support independent encryption and retention policies from their source tables. Replica tables can be queried using Amazon SageMaker Unified Studio or any Iceberg-compatible engine including DuckDB, PyIceberg, Apache Spark, and Trino.

You can create and maintain replicas of your tables through the AWS Management Console or APIs and AWS SDKs. You specify one or more destination table buckets to replicate your source tables. When you turn on replication, S3 Tables automatically creates read-only replica tables in your destination table buckets, backfills them with the latest state of the source table, and continually monitors for new updates to keep replicas in sync. This helps you meet time-travel and audit requirements while maintaining multiple replicas of your data.

Let me show you how it works
To show you how it works, I proceed in three steps. First, I create an S3 table bucket, create an Iceberg table, and populate it with data. Second, I configure the replication. Third, I connect to the replicated table and query the data to show you that changes are replicated.

For this demo, the S3 team kindly gave me access to an Amazon EMR cluster already provisioned. You can follow the Amazon EMR documentation to create your own cluster. They also created two S3 table buckets, a source and a destination for the replication. Again, the S3 Tables documentation will help you to get started.

I take note of the two S3 table bucket Amazon Resource Names (ARNs). In this demo, I refer to them with the environment variables SOURCE_TABLE_ARN and DEST_TABLE_ARN.

First step: Prepare the source database

I start a terminal, connect to the EMR cluster, start a Spark session, create a table, and insert a row of data. The commands I use in this demo are documented in Accessing tables using the Amazon S3 Tables Iceberg REST endpoint.

sudo spark-shell \
--packages "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.1,software.amazon.awssdk:bundle:2.20.160,software.amazon.awssdk:url-connection-client:2.20.160" \
--master "local[*]" \
--conf "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" \
--conf "spark.sql.defaultCatalog=spark_catalog" \
--conf "spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkCatalog" \
--conf "spark.sql.catalog.spark_catalog.type=rest" \
--conf "spark.sql.catalog.spark_catalog.uri=https://s3tables.us-east-1.amazonaws.com/iceberg" \
--conf "spark.sql.catalog.spark_catalog.warehouse=arn:aws:s3tables:us-east-1:012345678901:bucket/aws-news-blog-test" \
--conf "spark.sql.catalog.spark_catalog.rest.sigv4-enabled=true" \
--conf "spark.sql.catalog.spark_catalog.rest.signing-name=s3tables" \
--conf "spark.sql.catalog.spark_catalog.rest.signing-region=us-east-1" \
--conf "spark.sql.catalog.spark_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO" \
--conf "spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialProvider" \
--conf "spark.sql.catalog.spark_catalog.rest-metrics-reporting-enabled=false"

spark.sql("""
CREATE TABLE s3tablesbucket.test.aws_news_blog (
customer_id STRING,
address STRING
) USING iceberg
""")

spark.sql("INSERT INTO s3tablesbucket.test.aws_news_blog VALUES ('cust1', 'val1')")

spark.sql("SELECT * FROM s3tablesbucket.test.aws_news_blog LIMIT 10").show()
+-----------+-------+
|customer_id|address|
+-----------+-------+
|      cust1|   val1|
+-----------+-------+

So far, so good.

Second step: Configure the replication for S3 Tables

Now, I use the CLI on my laptop to configure the S3 table bucket replication.

Before doing so, I create an AWS Identity and Access Management (IAM) policy to authorize the replication service to access my S3 table bucket and encryption keys. Refer to the S3 Tables replication documentation for the details. The permissions I used for this demo are:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*",
                "s3tables:*",
                "kms:DescribeKey",
                "kms:GenerateDataKey",
                "kms:Decrypt"
            ],
            "Resource": "*"
        }
    ]
}

After having created this IAM policy, I can now proceed and configure the replication:

aws s3tables-replication put-table-replication \
--table-arn ${SOURCE_TABLE_ARN} \
--configuration '{
    "role": "arn:aws:iam::<MY_ACCOUNT_NUMBER>:role/S3TableReplicationManualTestingRole",
    "rules": [
        {
            "destinations": [
                {
                    "destinationTableBucketARN": "${DEST_TABLE_ARN}"
                }
            ]
        }
    ]
}'

The replication starts automatically. Updates are typically replicated within minutes. The time it takes to complete depends on the volume of data in the source table.

Third step: Connect to the replicated table and query the data

Now, I connect to the EMR cluster again, and I start a second Spark session. This time, I use the destination table.

S3 Tables replication - destination table

To verify the replication works, I insert a second row of data on the source table.

spark.sql("INSERT INTO s3tablesbucket.test.aws_news_blog VALUES ('cust2', 'val2')")

I wait a few minutes for the replication to trigger. I follow the status of the replication with the get-table-replication-status command.

aws s3tables-replication get-table-replication-status \
--table-arn ${SOURCE_TABLE_ARN}
{
    "sourceTableArn": "arn:aws:s3tables:us-east-1:012345678901:bucket/manual-test/table/e0fce724-b758-4ee6-85f7-ca8bce556b41",
    "destinations": [
        {
            "replicationStatus": "pending",
            "destinationTableBucketArn": "arn:aws:s3tables:us-east-1:012345678901:bucket/manual-test-dst",
            "destinationTableArn": "arn:aws:s3tables:us-east-1:012345678901:bucket/manual-test-dst/table/5e3fb799-10dc-470d-a380-1a16d6716db0",
            "lastSuccessfulReplicatedUpdate": {
                "metadataLocation": "s3://e0fce724-b758-4ee6-8-i9tkzok34kum8fy6jpex5jn68cwf4use1b-s3alias/e0fce724-b758-4ee6-85f7-ca8bce556b41/metadata/00001-40a15eb3-d72d-43fe-a1cf-84b4b3934e4c.metadata.json",
                "timestamp": "2025-11-14T12:58:18.140281+00:00"
            }
        }
    ]
}

When the replication status shows ready, I connect to the EMR cluster and query the destination table. As expected, I see the new row of data.

S3 Tables replication - target table is up to date

Additional things to know
Here are a few additional points to pay attention to:

  • Replication for S3 Tables supports both Apache Iceberg V2 and V3 table formats, giving you flexibility in your table format choice.
  • You can configure replication at the table bucket level, making it straightforward to replicate all tables under that bucket without individual table configurations.
  • Your replica tables maintain the storage class you choose for your destination tables, which means you can optimize for your specific cost and performance needs.
  • Any Iceberg-compatible catalog can directly query your replica tables without additional coordination—they only need to point to the replica table location. This gives you flexibility in choosing query engines and tools.

Pricing and availability
You can track your storage usage by access tier through AWS Cost and Usage Reports and Amazon CloudWatch metrics. For replication monitoring, AWS CloudTrail logs provide events for each replicated object.

There are no additional charges to configure Intelligent-Tiering. You only pay for storage costs in each tier. Your tables continue to work as before, with automatic cost optimization based on your access patterns.

For S3 Tables replication, you pay the S3 Tables charges for storage in the destination table, for replication PUT requests, for table updates (commits), and for object monitoring on the replicated data. For cross-Region table replication, you also pay for inter-Region data transfer out from Amazon S3 to the destination Region based on the Region pair.

As usual, refer to the Amazon S3 pricing page for the details.

Both capabilities are available today in all AWS Regions where S3 Tables are supported.

To learn more about these new capabilities, visit the Amazon S3 Tables documentation or try them in the Amazon S3 console today. Share your feedback through AWS re:Post for Amazon S3 or through your AWS Support contacts.

— seb

Amazon S3 Storage Lens adds performance metrics, support for billions of prefixes, and export to S3 Tables


Today, we’re announcing three new capabilities for Amazon S3 Storage Lens that give you deeper insights into your storage performance and usage patterns. With the addition of performance metrics, support for analyzing billions of prefixes, and direct export to Amazon S3 Tables, you have the tools you need to optimize application performance, reduce costs, and make data-driven decisions about your Amazon S3 storage strategy.

New performance metric categories
S3 Storage Lens now includes eight new performance metric categories that help identify and resolve performance constraints across your organization. These are available at the organization, account, bucket, and prefix levels. For example, the service helps you identify small objects in a bucket or prefix that can slow down application performance. This can be mitigated by batching small objects or by using the Amazon S3 Express One Zone storage class for high-performance small object workloads.

To access the new performance metrics, you need to enable performance metrics in the S3 Storage Lens advanced tier when creating a new Storage Lens dashboard or editing an existing configuration.

  • Read request size – Distribution of read request sizes (GET) by day. Use case: identify datasets with small read request patterns that slow down performance. Mitigation: batch small objects or use Amazon S3 Express One Zone for high-performance small object workloads.
  • Write request size – Distribution of write request sizes (PUT, POST, COPY, and UploadPart) by day. Use case: identify datasets with small write request patterns that slow down performance. Mitigation: for large requests, parallelize requests, use multipart upload (MPU), or use the AWS CRT.
  • Storage size – Distribution of object sizes. Use case: identify datasets with small objects that slow down performance. Mitigation: consider bundling small objects.
  • Concurrent PUT 503 errors – Number of 503 errors due to concurrent PUT operations on the same object. Use case: identify prefixes with concurrent PUT throttling that slows down performance. Mitigation: for a single writer, modify retry behavior or use Amazon S3 Express One Zone; for multiple writers, use a consensus mechanism or use Amazon S3 Express One Zone.
  • Cross-Region data transfer – Bytes transferred and requests sent across Regions and within a Region. Use case: identify potential performance and cost degradation due to cross-Region data access. Mitigation: co-locate compute with data in the same AWS Region.
  • Unique objects accessed – Number or percentage of unique objects accessed per day. Use case: identify datasets where a small subset of objects is frequently accessed and can be moved to a higher performance storage tier. Mitigation: consider moving active data to Amazon S3 Express One Zone or other caching solutions.
  • FirstByteLatency (existing Amazon CloudWatch metric) – Daily average of the first byte latency metric: the per-request time from the complete request being received to when the response starts to be returned.
  • TotalRequestLatency (existing Amazon CloudWatch metric) – Daily average of total request latency: the per-request elapsed time from the first byte received to the last byte sent.

How it works
On the Amazon S3 console, I choose Create Storage Lens dashboard to create a new dashboard. You can also edit an existing dashboard configuration. I then configure general settings such as the Dashboard name, Status, and optional Tags. Then, I choose Next.


Next, I define the scope of the dashboard by selecting Include all Regions and Include all buckets, or by specifying the Regions and buckets to be included.


I opt in to the Advanced tier in the Storage Lens dashboard configuration, select Performance metrics, then choose Next.


Next, I select Prefix aggregation as an additional metrics aggregation, then leave the rest of the information as default before I choose Next.


I select the Default metrics report, then General purpose bucket as the bucket type, and then select the Amazon S3 bucket in my AWS account as the Destination bucket. I leave the rest of the information as default, then select Next.


I review all the information before I choose Submit to finalize the process.


After it’s enabled, I’ll receive daily performance metrics directly in the Storage Lens console dashboard. You can also export the report in CSV or Parquet format to any bucket in your account, or publish the metrics to Amazon CloudWatch. The performance metrics are aggregated and published daily and are available at multiple levels: organization, account, bucket, and prefix. In the dashboard dropdown menus, I choose % concurrent PUT 503 error for the Metric, Last 30 days for the Date range, and 10 for the Top N buckets.


The Concurrent PUT 503 error count metric tracks the number of 503 errors generated by simultaneous PUT operations to the same object. Throttling errors can degrade application performance. For a single writer, modify retry behavior or use a higher performance storage class such as Amazon S3 Express One Zone to mitigate concurrent PUT 503 errors. For a multiple-writers scenario, use a consensus mechanism to avoid concurrent PUT 503 errors, or use a higher performance storage class such as Amazon S3 Express One Zone.
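
As one example of adjusting retry behavior on the client side, the AWS CLI (and the SDKs, through equivalent settings) supports adaptive retries with a higher attempt budget; this is a general mitigation sketch, not guidance specific to S3 Storage Lens:

# Switch the default AWS CLI profile to adaptive retries with more attempts
aws configure set retry_mode adaptive
aws configure set max_attempts 10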

Complete analytics for all prefixes in your S3 buckets
S3 Storage Lens now supports analytics for all prefixes in your S3 buckets through a new Expanded prefixes metrics report. This capability removes previous limitations that restricted analysis to prefixes meeting a 1% size threshold and a maximum depth of 10 levels. You can now track up to billions of prefixes per bucket for analysis at the most granular prefix level, regardless of size or depth.

The Expanded prefixes metrics report includes all existing S3 Storage Lens metric categories: storage usage, activity metrics (requests and bytes transferred), data protection metrics, and detailed status code metrics.

How to get started
I follow the same steps outlined in the How it works section to create or update the Storage Lens dashboard. In Step 4 on the console, where you select export options, you can select the new Expanded prefixes metrics report. Thereafter, I can export the expanded prefixes metrics report in CSV or Parquet format to any general purpose bucket in my account for efficient querying of my Storage Lens data.


Good to know
This enhancement addresses scenarios where organizations need granular visibility across their entire prefix structure. For example, you can identify prefixes with incomplete multipart uploads to reduce costs, track compliance across your entire prefix structure for encryption and replication requirements, and detect performance issues at the most granular level.

Export S3 Storage Lens metrics to S3 Tables
S3 Storage Lens metrics can now be automatically exported to S3 Tables, a fully managed feature on AWS with built-in Apache Iceberg support. This integration provides daily automatic delivery of metrics to AWS managed S3 Tables for immediate querying without requiring additional processing infrastructure.

How to get started
I start by following the process outlined in Step 5 on the console, where I choose the export destination. This time, I choose Expanded prefixes metrics report. In addition to General purpose bucket, I choose Table bucket.

The new Storage Lens metrics are exported to new tables in an AWS managed bucket aws-s3.


I select the expanded_prefixes_activity_metrics table to view API usage metrics for expanded prefix reports.


I can preview the table on the Amazon S3 console or use Amazon Athena to query the table.
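
For example, a query like the following can be submitted from the AWS CLI; the database name, workgroup, result location, and column names are placeholders and assumptions about the exported schema, so adjust them to what you see in your account:

aws athena start-query-execution \
  --query-string "SELECT bucket_name, SUM(request_count) AS requests
                  FROM expanded_prefixes_activity_metrics
                  GROUP BY bucket_name
                  ORDER BY requests DESC
                  LIMIT 10" \
  --query-execution-context Database=storage_lens_db \
  --work-group primary \
  --result-configuration OutputLocation=s3://my-athena-results/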


Good to know
S3 Tables integration with S3 Storage Lens simplifies metric analysis using familiar SQL tools and AWS analytics services such as Amazon Athena, Amazon QuickSight, Amazon EMR, and Amazon Redshift, without requiring a data pipeline. The metrics are automatically organized for optimal querying, with custom retention and encryption options to suit your needs.

This integration enables cross-account and cross-Region analysis, custom dashboard creation, and data correlation with other AWS services. For example, you can combine Storage Lens metrics with S3 Metadata to analyze prefix-level activity patterns and identify objects in prefixes with cold data that are eligible for transition to lower-cost storage tiers.

For your agentic AI workflows, you can use natural language to query S3 Storage Lens metrics in S3 Tables with the S3 Tables MCP Server. Agents can ask questions such as ‘which buckets grew the most last month?’ or ‘show me storage costs by storage class’ and get instant insights from your observability data.

Now available
All three enhancements are available in all AWS Regions where S3 Storage Lens is currently offered (except the China Regions and AWS GovCloud (US)).

These features are included in the Amazon S3 Storage Lens Advanced tier at no additional charge beyond standard advanced tier pricing. For the S3 Tables export, you pay only for S3 Tables storage, maintenance, and queries. There is no additional charge for the export functionality itself.

To learn more about Amazon S3 Storage Lens performance metrics, support for billions of prefixes, and export to S3 Tables, refer to the Amazon S3 user guide. For pricing details, visit the Amazon S3 pricing page.

Veliswa Boya.

Amazon Bedrock AgentCore adds quality evaluations and policy controls for deploying trusted AI agents


Today, we’re announcing new capabilities in Amazon Bedrock AgentCore to further remove barriers holding AI agents back from production. Organizations across industries are already building on AgentCore, the most advanced platform to build, deploy, and operate highly capable agents securely at any scale. In just 5 months since preview, the AgentCore SDK has been downloaded over 2 million times. For example:

  • PGA TOUR, a pioneer and innovation leader in sports, has built a multi-agent content generation system to create articles for its digital platforms. The new solution, built on AgentCore, enables the PGA TOUR to provide comprehensive coverage for every player in the field, increasing content writing speed by 1,000 percent while achieving a 95 percent reduction in costs.
  • Independent software vendors (ISVs) like Workday are building the software of the future on AgentCore. AgentCore Code Interpreter provides Workday Planning Agent with secure data protection and essential features for financial data exploration. Users can analyze financial and operational data through natural language queries, making financial planning intuitive and self-driven. This capability reduces time spent on routine planning analysis by 30 percent, saving approximately 100 hours per month.
  • Grupo Elfa, a Brazilian distributor and retailer, relies on AgentCore Observability for complete audit traceability and real-time metrics of their agents, transforming their reactive processes into proactive operations. Using this unified platform, their sales team can handle thousands of daily price quotes while the organization maintains full visibility of agent decisions, helping achieve 100 percent traceability of agent decisions and interactions and reducing problem resolution time by 50 percent.

As organizations scale their agent deployments, they face challenges around implementing the right boundaries and quality checks to confidently deploy agents. The autonomy that makes agents powerful also makes them hard to confidently deploy at scale, as they might access sensitive data inappropriately, make unauthorized decisions, or take unexpected actions. Development teams must balance enabling agent autonomy while ensuring they operate within acceptable boundaries and with the quality you require to put them in front of customers and employees.

The new capabilities available today take the guesswork out of this process and help you build and deploy trusted AI agents with confidence:

  • Policy in AgentCore (Preview) – Defines clear boundaries for agent actions by intercepting AgentCore Gateway tool calls before they run using policies with fine-grained permissions.
  • AgentCore Evaluations (Preview) – Monitors the quality of your agents based on real-world behavior using built-in evaluators for dimensions such as correctness and helpfulness, plus custom evaluators for business-specific requirements.

We’re also introducing features that expand what agents can do:

  • Episodic functionality in AgentCore Memory – A new long-term memory strategy that helps agents learn from experience and adapt solutions across similar situations, improving consistency and performance on future tasks.
  • Bidirectional streaming in AgentCore Runtime – Deploys voice agents where both users and agents can speak simultaneously following a natural conversation flow.

Policy in AgentCore for precise agent control
Policy gives you control over the actions agents can take and is applied outside of the agent’s reasoning loop, treating agents as autonomous actors whose decisions require verification before reaching tools, systems, or data. It integrates with AgentCore Gateway to intercept tool calls as they happen, processing requests while maintaining operational speed, so workflows remain fast and responsive.

You can create policies using natural language or directly use Cedar—an open source policy language for fine-grained permissions—simplifying the process to set up, understand, and audit rules without writing custom code. This approach makes policy creation accessible to development, security, and compliance teams who can create, understand, and audit rules without specialized coding knowledge.

The policies operate independently of how the agent was built or which model it uses. You can define which tools and data agents can access—whether they are APIs, AWS Lambda functions, Model Context Protocol (MCP) servers, or third-party services—what actions they can perform, and under what conditions.

Teams can define clear policies once and apply them consistently across their organization. With policies in place, developers gain the freedom to create innovative agentic experiences, and organizations can deploy their agents to act autonomously while knowing they’ll stay within defined boundaries and compliance requirements.

Using Policy in AgentCore
You can start by creating a policy engine in the new Policy section of the AgentCore console and associate it with one or more AgentCore gateways.

A policy engine is a collection of policies that are evaluated at the gateway endpoint. When associating a gateway with a policy engine, you can choose whether to enforce the result of the policy—effectively permitting or denying access to a tool call—or to only emit logs. Using logs helps you test and validate a policy before enabling it in production.

Then, you can define the policies to apply to have granular control over access to the tools offered by the associated AgentCore gateways.

Amazon Bedrock AgentCore Policy console

To create a policy, you can start with a natural language description (that should include information of the authentication claims to use) or directly edit Cedar code.

Amazon Bedrock AgentCore Policy add

Natural language-based policy authoring provides a more accessible way for you to create fine-grained policies. Instead of writing formal policy code, you can describe rules in plain English. The system interprets your intent, generates candidate policies, validates them against the tool schema, and uses automated reasoning to check safety conditions—identifying prompts that are overly permissive, overly restrictive, or contain conditions that can never be satisfied.

Unlike generic large language model (LLM) translations, this feature understands the structure of your tools and generates policies that are both syntactically correct and semantically aligned with your intent, while flagging rules that cannot be enforced. It is also available as a Model Context Protocol (MCP) server, so you can author and validate policies directly in your preferred AI-assisted coding environment as part of your normal development workflow. This approach reduces onboarding time and helps you write high-quality authorization rules without needing Cedar expertise.

The following sample policy uses information from the OAuth claims in the JWT token used to authenticate to an AgentCore gateway (for the role) and the arguments passed to the tool call (context.input) to validate access to the tool that processes a refund. Only an authenticated user with the refund-agent role can access the tool, and only for amounts (context.input.amount) lower than $200 USD.

permit(
  principal is AgentCore::OAuthUser,
  action == AgentCore::Action::"RefundTool__process_refund",
  resource == AgentCore::Gateway::"<GATEWAY_ARN>"
)
when {
  principal.hasTag("role") &&
  principal.getTag("role") == "refund-agent" &&
  context.input.amount < 200
};

AgentCore Evaluations for continuous, real-time quality intelligence
AgentCore Evaluations is a fully managed service that helps you continuously monitor and analyze agent performance based on real-world behavior. With AgentCore Evaluations, you can use built-in evaluators for common quality dimensions such as correctness, helpfulness, tool selection accuracy, safety, goal success rate, and context relevance. You can also create custom model-based scoring systems, configured with your choice of prompt and model, for business-tailored scoring. The service samples live agent interactions and scores them continuously.

All results from AgentCore Evaluations are visualized in Amazon CloudWatch alongside AgentCore Observability insights, providing one place for unified monitoring. You can also set up alerts and alarms on the evaluation scores to proactively monitor agent quality and respond when metrics fall outside acceptable thresholds.

You can use AgentCore Evaluations during the testing phase, where you can check an agent against a baseline before deployment to stop faulty versions from reaching users, and in production for continuous improvement of your agents. When quality metrics drop below defined thresholds—such as a customer service agent’s satisfaction score declining or politeness scores dropping by more than 10 percent over an 8-hour period—the system triggers immediate alerts, helping you detect and address quality issues faster.

Using AgentCore Evaluations
You can create an online evaluation in the new Evaluations section of the AgentCore console. As the data source, you can use an AgentCore agent endpoint or a CloudWatch log group used by an external agent. For example, here I use the same sample customer support agent I shared when we introduced AgentCore in preview.

Amazon Bedrock AgentCore Evaluations source

Then, you can select the evaluators to use, including custom evaluators that you can define starting from the existing templates or build from scratch.

Amazon Bedrock AgentCore Evaluations source

For example, for a customer support agent, you can select metrics such as:

  • Correctness – Evaluates whether the information in the agent’s response is factually accurate
  • Faithfulness – Evaluates whether information in the response is supported by provided context/sources
  • Helpfulness – Evaluates, from the user’s perspective, how useful and valuable the agent’s response is
  • Harmfulness – Evaluates whether the response contains harmful content
  • Stereotyping – Detects content that makes generalizations about individuals or groups

The evaluators for tool selection and tool parameter accuracy can help you understand if an agent is choosing the right tool for a task and extracting the correct parameters from the user queries.

To complete the creation of the evaluation, you can choose the sampling rate and optional filters. For permissions, you can create a new AWS Identity and Access Management (IAM) service role or pass an existing one.

Amazon Bedrock AgentCore Evaluations create

The results are published, as they are evaluated, on Amazon CloudWatch in the AgentCore Observability dashboard. You can choose any of the bar chart sections to see the corresponding traces and gain deeper insight into the requests and responses behind that specific evaluation.

Amazon AgentCore Evaluations results

Because the results are in CloudWatch, you can use all of its features to create, for example, alarms and automations.
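For example, here is a minimal boto3 sketch of an alarm on an evaluation score. The namespace, metric, and dimension names are assumptions for illustration only; check the metric names that AgentCore Observability actually publishes for your agent before using them.

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical namespace, metric, and dimension names; replace them with the
# metrics actually published for your agent in CloudWatch.
cloudwatch.put_metric_alarm(
    AlarmName="agentcore-helpfulness-below-threshold",
    Namespace="AgentCore/Evaluations",  # assumption
    MetricName="Helpfulness",           # assumption
    Dimensions=[{"Name": "AgentId", "Value": "customer-support-agent"}],  # assumption
    Statistic="Average",
    Period=3600,              # hourly averages
    EvaluationPeriods=8,      # over an 8-hour window
    Threshold=0.7,
    ComparisonOperator="LessThanThreshold",
    TreatMissingData="notBreaching",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:agent-quality-alerts"],  # example topic
)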

Creating custom evaluators in AgentCore Evaluations
Custom evaluators allow you to define business-specific quality metrics tailored to your agent’s unique requirements. To create a custom evaluator, you provide the model to use as a judge, including inference parameters such as temperature and max output tokens, and a tailored prompt with the judging instructions. You can start from the prompt used by one of the built-in evaluators or enter a new one.

AgentCore Evaluations create custom evaluator

Then, you define the output scale, which can be either numeric values or custom text labels that you define. Finally, you configure whether the model computes the evaluation on single traces, full sessions, or each tool call.

AgentCore Evaluations custom evaluator scale

AgentCore Memory episodic functionality for experience-based learning
AgentCore Memory, a fully managed service that gives AI agents the ability to remember past interactions, now includes a new long-term memory strategy that lets agents learn from past experiences and apply those lessons to provide more helpful assistance in future interactions.

Consider booking travel with an agent: over time, the agent learns from your booking patterns—such as the fact that you often need to move flights to later times when traveling for work due to client meetings. When you start your next booking involving client meetings, the agent proactively suggests flexible return options based on these learned patterns. Just like an experienced assistant who learns your specific travel habits, agents with episodic memory can now recognize and adapt to your individual needs.

When you enable the new episodic functionality, AgentCore Memory captures structured episodes that record the context, reasoning process, actions taken, and outcomes of agent interactions, while a reflection agent analyzes these episodes to extract broader insights and patterns. When facing similar tasks, agents can retrieve these learnings to improve decision-making consistency and reduce processing time. This reduces the need for custom instructions by including in the agent context only the specific learnings an agent needs to complete a task instead of a long list of all possible suggestions.

AgentCore Runtime bidirectional streaming for more natural conversations
With AgentCore Runtime, you can deploy agentic applications with a few lines of code. To simplify deploying conversational experiences that feel natural and responsive, AgentCore Runtime now supports bidirectional streaming. This capability enables voice agents to listen and adapt while users speak, so that people can interrupt agents mid-response and have the agent immediately adjust to the new context—without waiting for the agent to finish its current output. Rather than traditional turn-based interaction where users must wait for complete responses, bidirectional streaming creates flowing, natural conversations where agents dynamically change their response based on what the user is saying.

Building these conversational experiences from the ground up requires significant engineering effort to handle the complex flow of simultaneous communication. Bidirectional streaming simplifies this by managing the infrastructure needed for agents to process input while generating output, handling interruptions gracefully, and maintaining context throughout dynamic conversation shifts. You can now deploy agents that naturally adapt to the fluid nature of human conversation—supporting mid-thought interruptions, context switches, and clarifications without losing the thread of the interaction.

Things to know
Amazon Bedrock AgentCore, including the preview of Policy, is available in the US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Mumbai, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland) AWS Regions. The preview of AgentCore Evaluations is available in the US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Sydney), and Europe (Frankfurt) Regions. For Regional availability and a future roadmap, visit AWS Capabilities by Region.

With AgentCore, you pay for what you use with no upfront commitments. For detailed pricing information, visit the Amazon Bedrock pricing page. AgentCore is also a part of the AWS Free Tier that new AWS customers can use to get started at no cost and explore key AWS services.

These new features work with any open source framework such as CrewAI, LangGraph, LlamaIndex, and Strands Agents, and with any foundation model. AgentCore services can be used together or independently, and you can get started using your favorite AI-assisted development environment with the AgentCore open source MCP server.

To learn more and get started quickly, visit the AgentCore Developer Guide.

Danilo

Build multi-step applications and AI workflows with AWS Lambda durable functions

This post was originally published on this site

Modern applications increasingly require complex and long-running coordination between services, such as multi-step payment processing, AI agent orchestration, or approval processes awaiting human decisions. Building these traditionally required significant effort to implement state management, handle failures, and integrate multiple infrastructure services.

Starting today, you can use AWS Lambda durable functions to build reliable multi-step applications directly within the familiar AWS Lambda experience. Durable functions are regular Lambda functions with the same event handler and integrations you already know. You write sequential code in your preferred programming language, and durable functions track progress, automatically retry on failures, and suspend execution for up to one year at defined points, without paying for idle compute during waits.

AWS Lambda durable functions use a checkpoint and replay mechanism, known as durable execution, to deliver these capabilities. After enabling a function for durable execution, you add the new open source durable execution SDK to your function code. You then use SDK primitives like “steps” to add automatic checkpointing and retries to your business logic and “waits” to efficiently suspend execution without compute charges. When execution terminates unexpectedly, Lambda resumes from the last checkpoint, replaying your event handler from the beginning while skipping completed operations.

Getting started with AWS Lambda durable functions
Let me walk you through how to use durable functions.

First, I create a new Lambda function in the console and select Author from scratch. In the Durable execution section, I select Enable. Note that the durable execution setting can only be enabled during function creation and currently can’t be modified for existing Lambda functions.

After I create my Lambda durable function, I can get started with the provided code.

Lambda durable functions introduces two core primitives that handle state management and recovery:

  • Steps—The context.step() method adds automatic retries and checkpointing to your business logic. After a step is completed, it will be skipped during replay.
  • Wait—The context.wait() method pauses execution for a specified duration, terminating the function and suspending execution without compute charges, then resuming when the wait elapses.

Additionally, Lambda durable functions provides other operations for more complex patterns: create_callback() creates a callback you can use to await results from external events like API responses or human approvals; wait_for_condition() pauses until a specific condition is met, like polling a REST API for process completion; and parallel() or map() handle advanced concurrency use cases.
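Before building a full workflow, here is a minimal sketch that combines a step and a wait, using the same SDK imports as the complete example later in this post. The wait signature and the reserve_inventory step are my assumptions for illustration; check the durable execution SDK documentation for the authoritative forms.

from aws_durable_execution_sdk_python import (
    DurableContext,
    StepContext,
    durable_execution,
    durable_step,
)
from aws_durable_execution_sdk_python.config import Duration


@durable_step
def reserve_inventory(step_context: StepContext, sku: str) -> dict:
    # Checkpointed step: once completed, it is skipped during replay.
    step_context.logger.info(f"Reserving inventory for {sku}")
    return {"sku": sku, "reserved": True}


@durable_execution
def lambda_handler(event: dict, context: DurableContext) -> dict:
    # Step: automatically retried on failure and checkpointed on success.
    reservation = context.step(reserve_inventory(event["sku"]))

    # Wait: suspends execution without compute charges, then resumes later.
    context.wait(Duration.from_minutes(10))  # assumed signature

    return reservation

The sku field in the event is a hypothetical example; the handler otherwise follows the same structure as the order processing workflow below.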

Building a production-ready order processing workflow
Now let’s expand the default example to build a production-ready order processing workflow. This demonstrates how to use callbacks for external approvals, handle errors properly, and configure retry strategies. I keep the code intentionally concise to focus on these core concepts. In a full implementation, you could enhance the validation step with Amazon Bedrock to add AI-powered order analysis.

Here’s how the order processing workflow works:

  • First, validate_order() checks order data to ensure all required fields are present.
  • Next, send_for_approval() sends the order for external human approval and waits for a callback response, suspending execution without compute charges.
  • Then, process_order() completes order processing.
  • Throughout the workflow, try-catch error handling distinguishes between terminal errors that stop execution immediately and recoverable errors inside steps that trigger automatic retries.

Here’s the complete order processing workflow with step definitions and the main handler:

import random
from aws_durable_execution_sdk_python import (
    DurableContext,
    StepContext,
    durable_execution,
    durable_step,
)
from aws_durable_execution_sdk_python.config import (
    Duration,
    StepConfig,
    CallbackConfig,
)
from aws_durable_execution_sdk_python.retries import (
    RetryStrategyConfig,
    create_retry_strategy,
)


@durable_step
def validate_order(step_context: StepContext, order_id: str) -> dict:
    """Validates order data using AI."""
    step_context.logger.info(f"Validating order: {order_id}")
    # In production: calls Amazon Bedrock to validate order completeness and accuracy
    return {"order_id": order_id, "status": "validated"}


@durable_step
def send_for_approval(step_context: StepContext, callback_id: str, order_id: str) -> dict:
    """Sends order for approval using the provided callback token."""
    step_context.logger.info(f"Sending order {order_id} for approval with callback_id: {callback_id}")
    
    # In production: send callback_id to external approval system
    # The external system will call Lambda SendDurableExecutionCallbackSuccess or
    # SendDurableExecutionCallbackFailure APIs with this callback_id when approval is complete
    
    return {
        "order_id": order_id,
        "callback_id": callback_id,
        "status": "sent_for_approval"
    }


@durable_step
def process_order(step_context: StepContext, order_id: str) -> dict:
    """Processes the order with retry logic for transient failures."""
    step_context.logger.info(f"Processing order: {order_id}")
    # Simulate flaky API that sometimes fails
    if random.random() > 0.4:
        step_context.logger.info("Processing failed, will retry")
        raise Exception("Processing failed")
    return {
        "order_id": order_id,
        "status": "processed",
        "timestamp": "2025-11-27T10:00:00Z",
    }


@durable_execution
def lambda_handler(event: dict, context: DurableContext) -> dict:
    try:
        order_id = event.get("order_id")
        
        # Step 1: Validate the order
        validated = context.step(validate_order(order_id))
        if validated["status"] != "validated":
            raise Exception("Validation failed")  # Terminal error - stops execution
        context.logger.info(f"Order validated: {validated}")
        
        # Step 2: Create callback
        callback = context.create_callback(
            name="awaiting-approval",
            config=CallbackConfig(timeout=Duration.from_minutes(3))
        )
        context.logger.info(f"Created callback with id: {callback.callback_id}")
        
        # Step 3: Send for approval with the callback_id
        approval_request = context.step(send_for_approval(callback.callback_id, order_id))
        context.logger.info(f"Approval request sent: {approval_request}")
        
        # Step 4: Wait for the callback result
        # This blocks until external system calls SendDurableExecutionCallbackSuccess or SendDurableExecutionCallbackFailure
        approval_result = callback.result()
        context.logger.info(f"Approval received: {approval_result}")
        
        # Step 5: Process the order with custom retry strategy
        retry_config = RetryStrategyConfig(max_attempts=3, backoff_rate=2.0)
        processed = context.step(
            process_order(order_id),
            config=StepConfig(retry_strategy=create_retry_strategy(retry_config)),
        )
        if processed["status"] != "processed":
            raise Exception("Processing failed")  # Terminal error
        
        context.logger.info(f"Order successfully processed: {processed}")
        return processed
        
    except Exception as error:
        context.logger.error(f"Error processing order: {error}")
        raise error  # Re-raise to fail the execution

This code demonstrates several important concepts:

  • Error handling—The try-catch block handles terminal errors. When an unhandled exception is thrown outside of a step (like the validation check), it terminates the execution immediately. This is useful when there’s no point in retrying, such as invalid order data.
  • Step retries—Inside the process_order step, exceptions trigger automatic retries based on the default (step 1) or configured RetryStrategy (step 5). This handles transient failures like temporary API unavailability.
  • Logging—I use context.logger for the main handler and step_context.logger inside steps. The context logger suppresses duplicate logs during replay.

Now I create a test event with order_id and invoke the function asynchronously to start the order workflow. I navigate to the Test tab and fill in the optional Durable execution name to identify this execution. Note that durable functions provide built-in idempotency: if I invoke the function twice with the same execution name, the second invocation returns the existing execution result instead of creating a duplicate.

I can monitor the execution by navigating to the Durable executions tab in the Lambda console:

Here I can see each step’s status and timing. The execution shows CallbackStarted followed by InvocationCompleted, which indicates the function has terminated and execution is suspended to avoid idle charges while waiting for the approval callback.

I can now complete the callback directly from the console by choosing Send success or Send failure, or programmatically using the Lambda API.

I choose Send success.

After the callback completes, the execution resumes and processes the order. If the process_order step fails due to the simulated flaky API, it automatically retries based on the configured strategy. Once all retries succeed, the execution completes successfully.
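If you complete the callback programmatically instead of from the console, a minimal sketch could look like the following. The boto3 operation and parameter names are assumptions derived from the SendDurableExecutionCallbackSuccess API mentioned in the step comments above; check the Lambda API reference for the exact names.

import json
import boto3

lambda_client = boto3.client("lambda")

# Assumed boto3 mapping of the SendDurableExecutionCallbackSuccess API; the
# callback ID is the one returned by context.create_callback() in the workflow.
lambda_client.send_durable_execution_callback_success(  # assumed operation name
    CallbackId="<CALLBACK_ID>",                                      # placeholder
    Result=json.dumps({"approved": True, "approver": "ops-team"}),   # assumed payload shape
)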

Monitoring executions with Amazon EventBridge
You can also monitor durable function executions using Amazon EventBridge. Lambda automatically sends execution status change events to the default event bus, allowing you to build downstream workflows, send notifications, or integrate with other AWS services.

To receive these events, create an EventBridge rule on the default event bus with this pattern:

{
  "source": ["aws.lambda"],
  "detail-type": ["Durable Execution Status Change"]
}
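As a sketch, you can create this rule with boto3 and point it at a notification target; the rule name and SNS topic ARN below are examples.

import json
import boto3

events = boto3.client("events")

# Rule on the default event bus matching durable execution status change events.
events.put_rule(
    Name="durable-execution-status-changes",  # example rule name
    EventPattern=json.dumps({
        "source": ["aws.lambda"],
        "detail-type": ["Durable Execution Status Change"],
    }),
    State="ENABLED",
)

# Route matching events to an SNS topic (example target).
events.put_targets(
    Rule="durable-execution-status-changes",
    Targets=[{
        "Id": "notify",
        "Arn": "arn:aws:sns:us-east-2:123456789012:durable-execution-alerts",  # example ARN
    }],
)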

Things to know
Here are key points to note:

  • Availability—Lambda durable functions are now available in the US East (Ohio) AWS Region. For the latest Region availability, visit the AWS Capabilities by Region page.
  • Programming language support—At launch, AWS Lambda durable functions supports JavaScript/TypeScript (Node.js 22/24) and Python (3.13/3.14). We recommend bundling the durable execution SDK with your function code using your preferred package manager. The SDKs are fast-moving, so you can easily update dependencies as new features become available.
  • Using Lambda versions—When deploying durable functions to production, use Lambda versions to ensure replay always happens on the same code version, as shown in the sketch after this list. If you update your function code while an execution is suspended, replay will use the version that started the execution, preventing inconsistencies from code changes during long-running workflows.
  • Testing your durable functions—You can test durable functions locally without AWS credentials using the separate testing SDK with pytest integration and the AWS Serverless Application Model (AWS SAM) command line interface (CLI) for more complex integration testing.
  • Open source SDKs—The durable execution SDKs are open source for JavaScript/TypeScript and Python. You can review the source code, contribute improvements, and stay updated with the latest features.
  • Pricing—To learn more about AWS Lambda durable functions pricing, refer to the AWS Lambda pricing page.
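Related to the Lambda versions note above, here is a minimal boto3 sketch that publishes a version and points an alias at it, so invocations (and replays) are tied to an immutable code version; the function and alias names are examples.

import boto3

lambda_client = boto3.client("lambda")

# Publish an immutable version of the current function code and configuration.
version = lambda_client.publish_version(FunctionName="order-processing-durable")  # example name

# Point a stable alias at that version; invoke the alias so suspended executions
# replay against the code version that started them.
lambda_client.create_alias(
    FunctionName="order-processing-durable",
    Name="prod",                     # example alias
    FunctionVersion=version["Version"],
)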

Get started with AWS Lambda durable functions by visiting the AWS Lambda console. To learn more, refer to the AWS Lambda durable functions documentation.

Happy building!

Donnie