AWS Weekly Roundup: Amazon Bedrock agent workflows, Amazon SageMaker private connectivity, and more (February 2, 2026)

Over the past week, we passed the Laba Festival, a traditional marker in the Chinese calendar that signals the final stretch leading up to the Lunar New Year. For many in China, it’s a moment associated with reflection and preparation, wrapping up what the year has carried, and turning attention toward what lies ahead.

Looking forward, next week also brings Lichun, the beginning of spring and the first of the 24 solar terms. In Chinese tradition, spring is often seen as the season when growth begins and new cycles take shape. There’s a common saying that “a year’s plans begin in spring,” capturing the idea that this is a time to set one’s direction and start fresh.

Last week’s launches
Here are the launches that got my attention this week:

  • Amazon Bedrock enhances support for agent workflows with server-side tools and extended prompt caching – Amazon Bedrock introduced two updates that improve how developers build and operate AI agents. The Responses API now supports server-side tool use, so agents can perform actions such as web search, code execution, and database updates within AWS security boundaries. Bedrock also adds a 1-hour time-to-live (TTL) option for prompt caching, which helps improve performance and reduce the cost for long-running, multi-turn agent workflows. Server-side tools are available with OpenAI GPT OSS 20B and 120B models, and the 1-hour prompt caching TTL is generally available for select Claude models by Anthropic in Amazon Bedrock. For a quick look at prompt caching in code, see the sketch after this list.
  • Amazon SageMaker Unified Studio adds private VPC connectivity with AWS PrivateLink – Amazon SageMaker Unified Studio now supports AWS PrivateLink, providing private connectivity between your VPC and SageMaker Unified Studio without routing customer data over the public internet. With SageMaker service endpoints onboarded into a VPC, data traffic remains within the AWS network and is governed by IAM policies, supporting stricter security and compliance requirements.
  • Amazon S3 adds support for changing object encryption without data movement – Amazon S3 now supports changing the server-side encryption type of existing encrypted objects without moving or re-uploading data. Using the UpdateObjectEncryption API, you can switch from SSE-S3 to SSE-KMS, rotate customer-managed AWS Key Management Service (AWS KMS) keys, or standardize encryption across buckets at scale with S3 Batch Operations while preserving object properties and lifecycle eligibility.
  • Amazon Keyspaces introduces table pre-warming for predictable high-throughput workloads – Amazon Keyspaces (for Apache Cassandra) now supports table pre-warming, which helps you proactively set warm throughput levels so tables can handle high read and write traffic instantly without cold-start delays. Pre-warming helps reduce throttling during sudden traffic spikes, such as product launches or sales events, and works with both on-demand and provisioned capacity modes, including multi-Region tables. The feature supports consistent, low-latency performance while giving you more control over throughput readiness.
  • Amazon DynamoDB MRSC global tables integrate with AWS Fault Injection Service – Amazon DynamoDB multi-Region strong consistency (MRSC) global tables now integrate with AWS Fault Injection Service. With this integration, you can simulate Regional failures, test replication behavior, and validate application resiliency for strongly consistent, multi-Region workloads.
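
To make the prompt caching update concrete, here's a minimal sketch of a multi-turn Converse API call with a cache checkpoint using boto3. The model ID is illustrative, and exactly where the new 1-hour TTL is configured is an assumption, so check the Amazon Bedrock documentation for the precise parameter.

```python
# Hedged sketch: cache a long, reusable system prompt across agent turns.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",  # illustrative cache-enabled Claude model
    system=[
        {"text": "You are a support agent. <long tool manifest and policies>"},
        {"cachePoint": {"type": "default"}},  # content above this marker is cached;
                                              # the new 1-hour TTL would be configured here (assumption)
    ],
    messages=[{"role": "user", "content": [{"text": "Summarize my open tickets."}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```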

Additional updates
Here are some additional projects, blog posts, and news items that I found interesting:

  • Building zero-trust access across multi-account AWS environments with AWS Verified Access – This post walks through how to implement AWS Verified Access in a centralized, shared-services architecture. It shows how to integrate with AWS IAM Identity Center and AWS Resource Access Manager (AWS RAM) to apply zero-trust access controls at the application layer and reduce operational overhead across multi-account AWS environments.
  • Amazon EventBridge increases event payload size to 1 MB – Amazon EventBridge now supports event payloads up to 1 MB, an increase from the previous 256 KB limit. This update helps event-driven architectures carry richer context in a single event, including complex JSON structures, telemetry data, and machine learning (ML) or generative AI outputs, without splitting payloads or relying on external storage. A minimal publishing sketch follows this list.
  • AWS MCP Server adds deployment agent SOPs (preview) – AWS introduced deployment standard operating procedures (SOPs) that let AI agents deploy web applications to AWS from a single natural language prompt in MCP-compatible integrated development environments (IDEs) and command line interfaces (CLIs) such as Kiro, Cursor, and Claude Code. The agent generates AWS Cloud Development Kit (AWS CDK) infrastructure, deploys AWS CloudFormation stacks, and sets up continuous integration and continuous delivery (CI/CD) workflows following AWS best practices. The preview supports frameworks including React, Vue.js, Angular, and Next.js.
  • AWS Network Firewall adds generative AI traffic visibility with web category filtering – AWS Network Firewall now provides visibility into generative AI application traffic through predefined web categories. You can use these categories directly in firewall rules to govern access to generative AI tools and other web services. When combined with TLS inspection, category-based filtering can be applied at the full URL level.
  • AWS Lambda adds enhanced observability for Kafka event source mappings – AWS Lambda introduced enhanced observability for Kafka event source mappings, providing Amazon CloudWatch Logs and metrics to monitor event polling configuration, scaling behavior, and event processing state. The update improves visibility into Kafka-based Lambda workloads, helping teams diagnose configuration issues, permission errors, and function failures more efficiently. The capability supports both Amazon Managed Streaming for Apache Kafka (Amazon MSK) and self-managed Apache Kafka event sources.
  • AWS CloudFormation 2025 year in review – This year-in-review post highlights CloudFormation updates delivered throughout 2025, with a focus on early validation, safer deployments, and improved developer workflows. It covers enhancements such as improved troubleshooting, drift-aware change sets, stack refactoring, StackSets updates, and new IDE and AI-assisted tooling, including the CloudFormation language server and the Infrastructure as Code (IaC) MCP server.
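
Here is the EventBridge sketch mentioned above: a single put_events call whose detail payload can now be up to 1 MB. The event bus, source, and field names are placeholders.

```python
# Sketch: one event can now carry rich context that previously had to be
# split across events or offloaded to external storage.
import json
import boto3

events = boto3.client("events")

detail = {
    "orderId": "o-12345",             # placeholder business data
    "telemetry": [72.1, 73.4, 71.9],  # readings that used to force payload splitting
    "modelOutput": "<large generative AI response>",
}

resp = events.put_events(
    Entries=[{
        "Source": "com.example.orders",   # placeholder source
        "DetailType": "OrderEnriched",
        "Detail": json.dumps(detail),
        "EventBusName": "default",
    }]
)
print("failed entries:", resp["FailedEntryCount"])
```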

Upcoming AWS events
Check your calendars so that you can sign up for this upcoming event:

AWS Community Day Romania (April 23–24, 2026) – This community-led AWS event brings together developers, architects, entrepreneurs, and students for more than 10 professional sessions delivered by AWS Heroes, Solutions Architects, and industry experts. Attendees can expect expert-led technical talks, insights from speakers with global conference experience, and opportunities to connect during dedicated networking breaks, all hosted at a premium venue designed to support collaboration and community engagement.

If you’re looking for more ways to stay connected beyond this event, join the AWS Builder Center to learn, build, and connect with builders in the AWS community.

Check back next Monday for another Weekly Roundup.

betty

AWS Weekly Roundup: Amazon EC2 G7e instances with NVIDIA Blackwell GPUs (January 26, 2026)

Hey! It’s my first post for 2026, and I’m writing to you while watching our driveway getting dug out. I hope that wherever you are, you’re safe and warm, and your data is still flowing!

Our driveway getting snow plowed

This week brings exciting news for customers running GPU-intensive workloads, with the launch of our newest graphics and AI inference instances powered by NVIDIA’s latest Blackwell architecture. Along with several service enhancements and regional expansions, this week’s updates continue to expand the capabilities available to AWS customers.

Last week’s launches

Amazon EC2 G7e instances are now generally available — The new G7e instances accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs deliver up to 2.3 times better inference performance compared to G6e instances. With two times the GPU memory and support for up to 8 GPUs providing 768 GB of total GPU memory, these instances enable running medium-sized models of up to 70B parameters with FP8 precision on a single GPU. G7e instances are ideal for generative AI inference, spatial computing, and scientific computing workloads. Available now in US East (N. Virginia) and US East (Ohio).

Additional updates

I thought these projects, blog posts, and news items were also interesting:

Amazon Corretto January 2026 Quarterly Updates — AWS released quarterly security and critical updates for Amazon Corretto Long-Term Supported (LTS) versions of OpenJDK. Corretto 25.0.2, 21.0.10, 17.0.18, 11.0.30, and 8u482 are now available, ensuring Java developers have access to the latest security patches and performance improvements.

Amazon ECR now supports cross-repository layer sharing — Amazon Elastic Container Registry now enables you to share common image layers across repositories through blob mounting. This feature helps you achieve faster image pushes by reusing existing layers and reduce storage costs by storing common layers once and referencing them across repositories.
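
Under the hood, this is the standard OCI cross-repository blob mount. Here's a hedged sketch of the raw registry call, assuming the layer already exists in a source repository you can read; the registry hostname, repository names, and digest are placeholders.

```python
# Sketch: mount an existing layer from repository "app-a" into "app-b"
# instead of re-pushing the bytes.
import boto3
import requests

ecr = boto3.client("ecr")
token = ecr.get_authorization_token()["authorizationData"][0]["authorizationToken"]

registry = "123456789012.dkr.ecr.us-east-1.amazonaws.com"  # placeholder account/Region
digest = "sha256:<layer-digest>"                           # placeholder layer digest

resp = requests.post(
    f"https://{registry}/v2/app-b/blobs/uploads/?mount={digest}&from=app-a",
    headers={"Authorization": f"Basic {token}"},
)
print(resp.status_code)  # 201 Created means the layer was mounted, not copied
```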

Amazon CloudWatch Database Insights expands to four additional regions — CloudWatch Database Insights on-demand analysis is now available in Asia Pacific (New Zealand), Asia Pacific (Taipei), Asia Pacific (Thailand), and Mexico (Central). This feature uses machine learning to help identify performance bottlenecks and provides specific remediation advice.

Amazon Connect adds conditional logic and real-time updates to Step-by-Step Guides — Amazon Connect Step-by-Step Guides now enables managers to build dynamic guided experiences that adapt based on user interactions. Managers can configure conditional user interfaces with dropdown menus that show or hide fields, change default values, or adjust required fields based on prior inputs. The feature also supports automatic data refresh from Connect resources, ensuring agents always work with current information.

Upcoming AWS events

Keep a lookout and be sure to sign up for these upcoming events:

Best of AWS re:Invent (January 28-29, Virtual) — Join us for this free virtual event bringing you the most impactful announcements and top sessions from AWS re:Invent. AWS VP and Chief Evangelist Jeff Barr will share highlights during the opening session. Sessions run January 28 at 9:00 AM PT for AMER, and January 29 at 9:00 AM SGT for APJ and 9:00 AM CET for EMEA. Register to access curated technical learning, strategic insights from AWS leaders, and live Q&A with AWS experts.

AWS Community Day Ahmedabad (February 28, 2026, Ahmedabad, India) — The 11th edition of this community-driven AWS conference brings together cloud professionals, developers, architects, and students for expert-led technical sessions, real-world use cases, tech expo booths with live demos, and networking opportunities. This free event includes breakfast, lunch, and exclusive swag.

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse for upcoming in-person and virtual developer-focused events in your area.


That’s all for this week. Check back next Monday for another Weekly Roundup!

~ micah

Announcing Amazon EC2 G7e instances accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) G7e instances that deliver cost-effective performance for generative AI inference workloads and the highest performance for graphics workloads.

G7e instances are accelerated by the NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and are well suited for a broad range of GPU-enabled workloads, including spatial computing and scientific computing. G7e instances deliver up to 2.3 times better inference performance than G6e instances.

Here are the improvements compared to their predecessors:

  • NVIDIA RTX PRO 6000 Blackwell GPUs — NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs offer two times the GPU memory and 1.85 times the GPU memory bandwidth compared to G6e instances. By using the higher GPU memory offered by G7e instances, you can run medium-sized models of up to 70B parameters with FP8 precision on a single GPU.
  • NVIDIA GPUDirect P2P — For models that are too large to fit into the memory of a single GPU, you can split the model or computations across multiple GPUs. G7e instances reduce the latency of your multi-GPU workloads with support for NVIDIA GPUDirect P2P, which enables direct communication between GPUs over the PCIe interconnect. These instances offer the lowest peer-to-peer latency for GPUs on the same PCIe switch. Additionally, G7e instances offer up to four times the inter-GPU bandwidth compared to the L40S GPUs featured in G6e instances, boosting the performance of multi-GPU workloads. These improvements mean you can run inference for larger models across multiple GPUs offering up to 768 GB of GPU memory in a single node. A quick way to verify P2P connectivity appears after this list.
  • Networking — G7e instances offer four times the networking bandwidth compared to G6e instances, which means you can use the instances for small-scale multi-node workloads. Additionally, multi-GPU G7e instances support NVIDIA GPUDirect Remote Direct Memory Access (RDMA) with Elastic Fabric Adapter (EFA), which reduces the latency of remote GPU-to-GPU communication for multi-node workloads. These instance sizes also support NVIDIA GPUDirect Storage with Amazon FSx for Lustre, which increases storage throughput to the instances by up to 1.2 Tbps compared to G6e instances, so you can load your models quickly.
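
If you want to confirm that P2P is active on a multi-GPU instance, here's a small PyTorch sketch (assuming PyTorch with CUDA is installed, as on the Deep Learning AMIs):

```python
# Sketch: report which GPU pairs can communicate directly over PCIe P2P.
import torch

n = torch.cuda.device_count()
for i in range(n):
    for j in range(n):
        if i != j:
            ok = torch.cuda.can_device_access_peer(i, j)
            print(f"GPU {i} -> GPU {j}: P2P {'available' if ok else 'unavailable'}")
```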

EC2 G7e specifications
G7e instances feature up to 8 NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs with up to 768 GB of total GPU memory (96 GB of memory per GPU) and Intel Emerald Rapids processors. They also support up to 192 vCPUs, up to 1,600 Gbps of network bandwidth, up to 2,048 GiB of system memory, and up to 15.2 TB of local NVMe SSD storage.

Here are the specs:

Instance name   GPUs   GPU memory (GB)   vCPUs   Memory (GiB)   Storage (TB)   EBS bandwidth (Gbps)   Network bandwidth (Gbps)
g7e.2xlarge     1      96                8       64             1.9 x 1        Up to 5                50
g7e.4xlarge     1      96                16      128            1.9 x 1        8                      50
g7e.8xlarge     1      96                32      256            1.9 x 1        16                     100
g7e.12xlarge    2      192               48      512            3.8 x 1        25                     400
g7e.24xlarge    4      384               96      1,024          3.8 x 2        50                     800
g7e.48xlarge    8      768               192     2,048          3.8 x 4        100                    1,600

To get started with G7e instances, you can use the AWS Deep Learning AMIs (DLAMI) for your machine learning (ML) workloads. To launch instances, you can use the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs. For a managed experience, you can use G7e instances with Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Support for Amazon SageMaker AI is also coming soon.
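
For example, here's a minimal boto3 sketch for launching a single-GPU G7e instance; the AMI ID and key pair are placeholders you'd replace with your own (such as a DLAMI ID for your Region).

```python
# Sketch: launch one g7e.2xlarge (1 GPU, 96 GB of GPU memory) in us-east-1.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder, e.g., a Deep Learning AMI
    InstanceType="g7e.2xlarge",
    KeyName="my-key-pair",            # placeholder key pair
    MinCount=1,
    MaxCount=1,
)
print(resp["Instances"][0]["InstanceId"])
```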

Now available
Amazon EC2 G7e instances are available today in the US East (N. Virginia) and US East (Ohio) AWS Regions. For Regional availability and a future roadmap, search the instance type in the CloudFormation resources tab of AWS Capabilities by Region.

The instances can be purchased as On-Demand Instances, with Savings Plans, or as Spot Instances. G7e instances are also available as Dedicated Instances and on Dedicated Hosts. To learn more, visit the Amazon EC2 Pricing page.

Give G7e instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 G7e instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: Kiro CLI latest features, AWS European Sovereign Cloud, EC2 X8i instances, and more (January 19, 2026)

At the end of 2025 I was happy to take a long break to enjoy the incredible summers that the southern hemisphere provides. I’m back and writing my first post in 2026 which also happens to be my last post for the AWS News Blog (more on this later).

The AWS community is starting the year strong, with various AWS re:Invent re:Caps being hosted around the globe and some communities already hosting their AWS Community Day events; AWS Community Day Tel Aviv 2026 took place last week.

Last week’s launches
Here are last week’s launches that caught my attention:

  • Kiro CLI latest features – Kiro CLI now has granular controls for web fetch URLs, keyboard shortcuts for your custom agents, enhanced diff views, and much more. With these enhancements, you can use allowlists or blocklists to restrict which URLs the agent can access and switch frictionlessly between multiple specialized agents in a single session, to name a few examples.
  • AWS European Sovereign Cloud – Following an announcement in 2023 of plans to build a new, independent cloud infrastructure, last week we announced the general availability of the AWS European Sovereign Cloud to all customers. The cloud is ready to meet the most stringent sovereignty requirements of European customers with a comprehensive set of AWS services.
  • Amazon EC2 X8i instances – Previously launched in preview at AWS re:Invent 2025, last week we announced the general availability of new memory-optimized Amazon Elastic Compute Cloud (Amazon EC2) X8i instances. These instances are powered by custom Intel Xeon 6 processors with a sustained all-core turbo frequency of 3.9 GHz, available only on AWS. These SAP-certified instances deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud.

Additional updates
These projects, blog posts, and news articles also caught my attention:

  • 5 core features in Amazon Quick Suite – AWS VP of Agentic AI Swami Sivasubramanian talks about how he uses Amazon Quick Suite for just about everything. In October 2025 we announced Amazon Quick Suite, a new agentic teammate that quickly answers your questions at work and turns insights into actions for you. Amazon Quick Suite has become one of my favorite productivity tools, helping me with my research on various topics in addition to providing me with multiple perspectives on a topic.
  • Deploy AI agents on Amazon Bedrock AgentCore using GitHub Actions – Last year we announced Amazon Bedrock AgentCore, a flexible service that helps you seamlessly create and manage AI agents across different frameworks and models, whether hosted on Amazon Bedrock or other environments. Learn how to use a GitHub Actions workflow to automate the deployment of AI agents on AgentCore Runtime. This approach delivers a scalable solution with enterprise-level security controls, providing complete continuous integration and delivery (CI/CD) automation.

Upcoming AWS events
Join us January 28 or 29 (depending on your time zone) for Best of AWS re:Invent, a free virtual event where we bring you the most impactful announcements and top sessions from AWS re:Invent. Jeff Barr, AWS VP and Chief Evangelist, will share his highlights during the opening session.

There is still time until January 21 to compete for $250,000 in prizes and AWS credits in the Global 10,000 AIdeas Competition (yes, the second letter is an I as in Idea, not an L as in like). No code required yet: simply submit your idea, and if you’re selected as a semifinalist, you’ll build your app using Kiro within AWS Free Tier limits. Beyond the cash prizes and potential featured placement at AWS re:Invent 2026, you’ll gain hands-on experience with next-generation AI tools and connect with innovators globally.

Earlier this month, the 2026 application for the Community Builders program launched. The application is open until midnight PST on January 21, so here’s your last chance to make sure you don’t miss out.

If you’re interested in these opportunities, join the AWS Builder Center to learn with builders in the AWS community.

With that, I close one of my most meaningful chapters here at AWS. It’s been an absolute pleasure to write for you, and I thank you for taking the time to read the work that my team and I pour our absolute hearts into. I’ve grown from the close collaborations with the launch teams and the feedback from all of you. The Sub-Saharan Africa (SSA) community has grown significantly, and I want to dedicate more time to this community. I’m still at AWS, and I look forward to meeting you at an event near you!

Check back next Monday for another Weekly Roundup!

Veliswa Boya

Amazon EC2 X8i instances powered by custom Intel Xeon 6 processors are generally available for memory-intensive workloads

Following a preview launch at AWS re:Invent 2025, we’re announcing the general availability of new memory-optimized Amazon Elastic Compute Cloud (Amazon EC2) X8i instances. These instances are powered by custom Intel Xeon 6 processors with a sustained all-core turbo frequency of 3.9 GHz, available only on AWS. These SAP-certified instances deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud.

X8i instances are ideal for memory-intensive workloads including in-memory databases such as SAP HANA, traditional large-scale databases, data analytics, and electronic design automation (EDA), which require high compute performance and a large memory footprint.

These instances provide 1.5 times more memory capacity (up to 6 TB) and 3.4 times more memory bandwidth compared to previous-generation X2i instances. They offer up to 43% higher performance than X2i instances, with higher gains on some real-world workloads. They deliver up to 50% higher SAP Application Performance Standard (SAPS) performance, up to 47% faster PostgreSQL performance, up to 88% faster Memcached performance, and up to 46% faster AI inference performance.

During the preview, customers like RISE with SAP utilized up to 6 TB of memory capacity with 50% higher compute performance compared to X2i instances. This enabled faster transaction processing and improved query response times for SAP HANA workloads. Orion reduced the number of active cores on X8i instances compared to X2idn instances while maintaining performance thresholds, cutting SQL Server licensing costs by 50%.

X8i instances
X8i instances are available in 14 sizes including three larger instance sizes (48xlarge, 64xlarge, and 96xlarge), so you can choose the right size for your application to scale up, and two bare metal sizes (metal-48xl and metal-96xl) to deploy workloads that benefit from direct access to physical resources. X8i instances feature up to 100 Gbps of network bandwidth with support for the Elastic Fabric Adapter (EFA) and up to 80 Gbps of throughput to Amazon Elastic Block Store (Amazon EBS).

Here are the specs for X8i instances:

Instance name    vCPUs   Memory (GiB)   Network bandwidth (Gbps)   EBS bandwidth (Gbps)
x8i.large        2       32             Up to 12.5                 Up to 10
x8i.xlarge       4       64             Up to 12.5                 Up to 10
x8i.2xlarge      8       128            Up to 15                   Up to 10
x8i.4xlarge      16      256            Up to 15                   Up to 10
x8i.8xlarge      32      512            15                         10
x8i.12xlarge     48      768            22.5                       15
x8i.16xlarge     64      1,024          30                         20
x8i.24xlarge     96      1,536          40                         30
x8i.32xlarge     128     2,048          50                         40
x8i.48xlarge     192     3,072          75                         60
x8i.64xlarge     256     4,096          80                         70
x8i.96xlarge     384     6,144          100                        80
x8i.metal-48xl   192     3,072          75                         60
x8i.metal-96xl   384     6,144          100                        80

X8i instances support the instance bandwidth configuration (IBC) feature like other eighth-generation instance types, offering flexibility to allocate resources between network and EBS bandwidth. You can scale network or EBS bandwidth by up to 25%, improving database performance, query processing speeds, and logging efficiency. These instances also use sixth-generation AWS Nitro cards, which offload CPU virtualization, storage, and networking functions to dedicated hardware and software, enhancing performance and security for your workloads.
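
As a sketch of what IBC looks like in practice, assuming X8i exposes it through the same EC2 bandwidth-weighting options as other eighth-generation instances (an assumption worth verifying in the EC2 documentation), shifting capacity toward EBS could look like this; the instance ID is a placeholder.

```python
# Hedged sketch: favor EBS throughput on an X8i instance via bandwidth weighting.
import boto3

ec2 = boto3.client("ec2")
ec2.modify_instance_network_performance_options(
    InstanceId="i-0123456789abcdef0",   # placeholder instance ID
    BandwidthWeighting="ebs-1",         # "vpc-1" would favor network bandwidth instead
)
```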

Now available
Amazon EC2 X8i instances are now available in US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) AWS Regions. For Regional availability and a future roadmap, search the instance type in the CloudFormation resources tab of AWS Capabilities by Region.

You can purchase these instances as On-Demand Instances, with Savings Plans, or as Spot Instances. To learn more, visit the Amazon EC2 Pricing page.

Give X8i instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 X8i instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: AWS Lambda for .NET 10, AWS Client VPN quickstart, Best of AWS re:Invent, and more (January 12, 2026)

At the beginning of January, I tend to set my top resolutions for the year, a way to focus on what I want to achieve. If AI and cloud computing are on your resolution list, consider creating an AWS Free Tier account to receive up to $200 in credits and have 6 months of risk-free experimentation with AWS services.

During this period, you can explore essential services across compute, storage, databases, and AI/ML, plus access to over 30 always-free services with monthly usage limits. After 6 months, you can decide whether to upgrade to a standard AWS account.

Whether you’re a student exploring career options, a developer expanding your skill set, or a professional building with cloud technologies, this hands-on approach lets you focus on what matters most: developing real expertise in the areas you’re passionate about.

Last week’s launches
Here are the launches that got my attention this week:

Additional updates
Here are some additional projects, blog posts, and news items that I found interesting:

Crossmodal search with Amazon Nova Multimodal Embeddings (architecture diagram)

Upcoming AWS events
Join us January 28 or 29 (depending on your time zone) for Best of AWS re:Invent, a free virtual event where we bring you the most impactful announcements and top sessions from AWS re:Invent. Jeff Barr, AWS VP and Chief Evangelist, will share his highlights during the opening session.

There is still time until January 21 to compete for $250,000 in prizes and AWS credits in the Global 10,000 AIdeas Competition (yes, the second letter is an I as in Idea, not an L as in like). No code required yet: simply submit your idea, and if you’re selected as a semifinalist, you’ll build your app using Kiro within AWS Free Tier limits. Beyond the cash prizes and potential featured placement at AWS re:Invent 2026, you’ll gain hands-on experience with next-generation AI tools and connect with innovators globally.

If you’re interested in these opportunities, join the AWS Builder Center to learn with builders in the AWS community.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Danilo

Happy New Year! AWS Weekly Roundup: 10,000 AIdeas Competition, Amazon EC2, Amazon ECS Managed Instances and more (January 5, 2026)

Happy New Year! I hope the holidays gave you time to recharge and spend time with your loved ones.

Like every year, I took a few weeks off after AWS re:Invent to rest and plan ahead. I used some of that downtime to plan the next cohort for Become a Solutions Architect (BeSA). BeSA is a free mentoring program that I, along with a few other Amazon Web Services (AWS) employees, volunteer to host as a way to help people excel in their cloud and AI careers. We’re kicking off a 6-week cohort on “Agentic AI on AWS” starting February 21, 2026. Visit the BeSA website to learn more.

There is still time to submit your idea for the Global 10,000 AIdeas Competition and compete for $250,000 in cash prizes, AWS credits, and recognition, including potential featured placement at AWS re:Invent 2026 and across AWS channels.

You will gain hands-on experience with next-generation AI development tools, connect with innovators globally, and access technical enablement through biweekly workshops, AWS User Groups, and AWS Builder Center resources.

The deadline is January 21, 2026, and no code is required yet. If you’re selected as a semifinalist, you’ll build your app then. Your finished app needs to use Kiro for at least part of development, stay within AWS Free Tier limits, and be completely original and not yet published.

If you haven’t yet caught up with all the new releases and announcements from AWS re:Invent 2025, check out our top announcements post or watch the keynotes, innovation talks, and breakout sessions on-demand.

Launches from the last few weeks
I’d like to highlight some launches that got my attention since our last Week in Review on December 15, 2025:

  • Amazon EC2 M8gn and M8gb instances – New M8gn and M8gb instances are powered by AWS Graviton4 processors to deliver up to 30% better compute performance than AWS Graviton3 processors. M8gn instances feature the latest 6th-generation AWS Nitro Cards and offer up to 600 Gbps of network bandwidth, the highest among network-optimized EC2 instances. M8gb instances offer up to 150 Gbps of Amazon EBS bandwidth, providing higher EBS performance than other same-sized Graviton4-based instances.
  • AWS Direct Connect supports resilience testing with AWS Fault Injection Service – You can now use AWS Fault Injection Service to test how your applications handle Direct Connect Border Gateway Protocol (BGP) failover in a controlled environment. For example, you can validate that traffic routes to redundant virtual interfaces when a primary virtual interface’s BGP session is disrupted and your applications continue to function as expected.
  • New AWS Security Hub controls in AWS Control Tower – AWS Control Tower now supports 176 additional Security Hub controls in the Control Catalog, covering use cases including security, cost, durability, and operations. With this launch, you can search, discover, enable, and manage these controls directly from AWS Control Tower to govern additional use cases across your multi-account environment.
  • AWS Transform supports network conversion for hybrid data center migrations – You can now use AWS Transform for VMware to automatically convert networks from hybrid data centers. This removes manual network mapping for environments running both VMware and other workloads. The service analyzes VLANs and IP ranges across all exported source networks and maps them to AWS constructs such as virtual private clouds (VPCs), subnets, and security groups.
  • NVIDIA Nemotron 3 Nano available on Amazon Bedrock – Amazon Bedrock now supports the NVIDIA Nemotron 3 Nano 30B A3B model, NVIDIA’s latest breakthrough in efficient language modeling, which delivers high reasoning performance, built-in tool calling support, and extended context processing with a 256K-token context window.
  • Amazon EC2 supports Availability Zone ID across its APIs – You can specify the Availability Zone ID (AZ ID) parameter directly in your Amazon EC2 APIs to guarantee consistent placement of resources. AZ IDs are consistent and static identifiers that represent the same physical location across all AWS accounts, helping you optimize resource placement. Prior to this launch, you had to use an AZ name when creating a resource, but these names can map to different physical locations in different accounts, which made it difficult to ensure resources were always co-located, especially when operating across multiple accounts. A short sketch follows this list.
  • Amazon ECS Managed Instances supports Amazon EC2 Spot Instances – Amazon ECS Managed Instances now supports Amazon EC2 Spot Instances, extending the range of capabilities available with AWS managed infrastructure. You can use spare EC2 capacity at up to 90% discount compared to On-Demand prices for fault-tolerant workloads in Amazon ECS Managed Instances.
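
Here's the AZ ID sketch referenced above: resolve the per-account zone-name mapping with describe_availability_zones, then pin a subnet to a physical zone by ID; the VPC ID is a placeholder.

```python
# Sketch: AZ IDs name the same physical zone in every account,
# unlike AZ names, which are shuffled per account.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

for az in ec2.describe_availability_zones()["AvailabilityZones"]:
    print(az["ZoneName"], "->", az["ZoneId"])   # e.g., us-east-1a -> use1-az4

subnet = ec2.create_subnet(
    VpcId="vpc-0123456789abcdef0",              # placeholder VPC ID
    CidrBlock="10.0.1.0/24",
    AvailabilityZoneId="use1-az1",              # pin to the physical zone, not the per-account name
)
print(subnet["Subnet"]["SubnetId"])
```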

See AWS What’s New for more launch news that I haven’t covered here. That’s all for this week. Check back next Monday for another Weekly Roundup!

Here’s to a fantastic start to 2026. Happy building!

– Prasad

AWS Weekly Roundup: Amazon ECS, Amazon CloudWatch, Amazon Cognito and more (December 15, 2025)

Can you believe it? We’re nearly at the end of 2025. And what a year it’s been! From re:Invent recap events to AWS Summits, AWS Innovate, AWS re:Inforce, Community Days, and DevDays, capped recently by the cherry on the cake, re:Invent 2025, we have lived through a year filled with exciting moments and technology advancements that continue to shape our modern world.

Speaking of re:Invent, if you haven’t caught up yet on all the new releases and announcements (and there were plenty of exciting launches across every area), be sure to check out our curated post highlighting the top announcements from AWS re:Invent 2025. We’ve organized all the key releases into easy-to-navigate categories and included links so you can dive deeper into anything that sparks your interest.

While the year may be wrapping up, our teams are still busy working on things that you have either asked for as customers or that we proactively create to make your lives easier. Last week had quite a few interesting releases as usual, so let’s look at a few that I think could be useful for many of you.

Last week’s launches

Amazon WorkSpaces Secure Browser introduces Web Content Filtering – Organizations can now control web access through category-based filtering across 25+ predefined categories, granular URL policies, and integrated compliance logging. The feature works alongside existing Chrome policies and integrates with Session Logger for enhanced monitoring and is available at no additional cost in 10 AWS Regions with pay-as-you-go pricing.

Amazon Aurora DSQL now supports cluster creation in seconds – Developers can now instantly provision Aurora DSQL databases with setup time reduced from minutes to seconds, enabling rapid prototyping through the integrated AWS console query editor or AI-powered development via the Aurora DSQL Model Context Protocol server. Available at no additional cost in all AWS Regions where Aurora DSQL is offered, with AWS Free Tier access available.
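
For a sense of how lightweight provisioning has become, here's a hedged boto3 sketch; the create_cluster parameters shown are an assumption, so check the Aurora DSQL API reference for the exact shape.

```python
# Hedged sketch: create an Aurora DSQL cluster, which now provisions in seconds.
import boto3

dsql = boto3.client("dsql", region_name="us-east-1")

cluster = dsql.create_cluster(deletionProtectionEnabled=True)
print(cluster["identifier"], cluster["status"])
```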

Amazon Aurora PostgreSQL now supports integration with Kiro powers – Developers can now accelerate Aurora PostgreSQL application development using AI-assisted coding through Kiro powers, a repository of pre-packaged Model Context Protocol servers. The Aurora PostgreSQL integration provides direct database connectivity for queries, schema management, and cluster operations, dynamically loading relevant context as developers work. Available for one-click installation in Kiro IDE across all AWS Regions.

Amazon ECS now supports custom container stop signals on AWS Fargate – Fargate tasks now honor the stop signal configured in container images, enabling graceful shutdowns for containers that rely on signals like SIGQUIT or SIGINT instead of the default SIGTERM. The ECS container agent reads the STOPSIGNAL instruction from OCI-compliant images and sends the appropriate signal during task termination. Available at no additional cost across all AWS Regions.
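
For containers that rely on a non-default signal, a minimal entrypoint sketch that drains work on SIGQUIT (the signal Fargate now forwards when the image declares STOPSIGNAL SIGQUIT) might look like this:

```python
# Sketch: handle SIGQUIT for graceful shutdown inside a Fargate task.
import signal
import sys
import time

def graceful_exit(signum, frame):
    print(f"received signal {signum}; draining in-flight work before exit")
    # flush buffers, close connections, and finish requests here
    sys.exit(0)

signal.signal(signal.SIGQUIT, graceful_exit)
signal.signal(signal.SIGTERM, graceful_exit)  # keep handling the default, too

while True:
    time.sleep(1)  # placeholder for the real workload loop
```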

Amazon CloudWatch SDK supports optimized JSON, CBOR protocols – CloudWatch SDK now defaults to JSON and CBOR protocols, delivering lower latency, reduced payload sizes, and decreased client-side CPU and memory usage compared to the traditional AWS Query protocol. Available at no additional cost across all AWS Regions and SDK language variants.

Amazon Cognito identity pools now support private connectivity with AWS PrivateLink – Organizations can now securely exchange federated identities for temporary AWS credentials through private VPC connections, eliminating the need to route authentication traffic over the public internet. Available in all AWS Regions where Cognito identity pools are supported, except AWS China (Beijing) and AWS GovCloud (US) Regions.

AWS Application Migration Service supports IPv6 – Organizations can now migrate applications using IPv6 addressing through dual-stack service endpoints that support both IPv4 and IPv6 communications. During replication, testing, and cutover phases, you can use IPv4, IPv6, or dual-stack configurations to launch servers in your target environment. Available at no additional cost in all AWS Regions that support MGN and EC2 dual-stack endpoints.

And that’s it for the AWS News Blog Weekly Roundup…not just for this week, but for 2025! We’ll be taking a break and returning in January to continue bringing you the latest AWS releases and updates.

As we close out 2025, it’s remarkable to look back at just how much has changed since the beginning of the year. From groundbreaking AI capabilities to transformative infrastructure innovations, AWS has delivered an incredible year of releases that have reshaped what’s possible in the cloud. Throughout it all, the AWS News Blog has been right here with you every week with our Weekly Roundup series, helping you stay informed and ready to take advantage of each new opportunity as it arrived. We’re grateful you’ve joined us on this journey, and we can’t wait to continue bringing you the latest AWS innovations when we return in January 2026.

Until then, happy building, and here’s to an even more exciting year ahead!

Matheus Guimaraes | @codingmatheus

New serverless customization in Amazon SageMaker AI accelerates model fine-tuning

Today, I’m happy to announce new serverless customization in Amazon SageMaker AI for popular AI models, such as Amazon Nova, DeepSeek, GPT-OSS, Llama, and Qwen. The new customization capability provides an easy-to-use interface for the latest fine-tuning techniques like reinforcement learning, so you can accelerate the AI model customization process from months to days.

With a few clicks, you can seamlessly select a model and customization technique, and handle model evaluation and deployment—all entirely serverless so you can focus on model tuning rather than managing infrastructure. When you choose serverless customization, SageMaker AI automatically selects and provisions the appropriate compute resources based on the model and data size.

Getting started with serverless model customization
You can get started customizing models in Amazon SageMaker Studio. Choose Models in the left navigation pane and browse the AI models available for customization.

Customize with UI
You can customize AI models in only a few clicks. In the Customize model dropdown list for a specific model such as Meta Llama 3.1 8B Instruct, choose Customize with UI.

You can select a customization technique used to adapt the base model to your use case. SageMaker AI supports Supervised Fine-Tuning and the latest model customization techniques including Direct Preference Optimization, Reinforcement Learning from Verifiable Rewards (RLVR), and Reinforcement Learning from AI Feedback (RLAIF). Each technique optimizes models in different ways, with selection influenced by factors such as dataset size and quality, available computational resources, the task at hand, desired accuracy levels, and deployment constraints.

Upload or select a training dataset to match the format required by the customization technique selected. Use the values of batch size, learning rate, and number of epochs recommended by the technique selected. You can configure advanced settings such as hyperparameters, a newly introduced serverless MLflow application for experiment tracking, and network and storage volume encryption. Choose Submit to get started on your model training job.

After your training job is complete, you can see the models you created in the My Models tab. Choose View details in one of your models.

By choosing Continue customization, you can continue to customize your model by adjusting hyperparameters or training with different techniques. By choosing Evaluate, you can evaluate your customized model to see how it performs compared to the base model.

When you complete both jobs, you can choose either SageMaker or Bedrock in the Deploy dropdown list to deploy your model.

You can choose Amazon Bedrock for serverless inference. Choose Bedrock and the model name to deploy the model into Amazon Bedrock. To find your deployed models, choose Imported models in the Bedrock console.

You can also deploy your model to a SageMaker AI inference endpoint if you want to control your deployment resources such as an instance type and instance count. After the SageMaker AI deployment is In service, you can use this endpoint to perform inference. In the Playground tab, you can test your customized model with a single prompt or chat mode.

With the serverless MLflow capability, you can automatically log all critical experiment metrics without modifying code and access rich visualizations for further analysis.

Customize with code
When you choose Customize with code, you can see a sample notebook to fine-tune or deploy AI models. If you want to edit the sample notebook, open it in JupyterLab. Alternatively, you can deploy the model immediately by choosing Deploy.

You can choose the Amazon Bedrock or SageMaker AI endpoint by selecting the deployment resources either from Amazon SageMaker Inference or Amazon SageMaker HyperPod.

When you choose Deploy at the bottom right of the page, you’re redirected back to the model details page. After the SageMaker AI deployment is in service, you can use this endpoint to perform inference.

Okay, you’ve seen how to streamline model customization in SageMaker AI. You can now choose your preferred approach. To learn more, visit the Amazon SageMaker AI Developer Guide.

Now available
New serverless AI model customization in Amazon SageMaker AI is now available in US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland) Regions. You only pay for the tokens processed during training and inference. To learn more details, visit Amazon SageMaker AI pricing page.

Give it a try in Amazon SageMaker Studio and send feedback to AWS re:Post for SageMaker or through your usual AWS Support contacts.

Channy

Introducing checkpointless and elastic training on Amazon SageMaker HyperPod

Today, we’re announcing two new AI model training features within Amazon SageMaker HyperPod: checkpointless training, an approach that mitigates the need for traditional checkpoint-based recovery by enabling peer-to-peer state recovery, and elastic training, enabling AI workloads to automatically scale based on resource availability.

  • Checkpointless training – Checkpointless training eliminates disruptive checkpoint-restart cycles, maintaining forward training momentum despite failures, reducing recovery time from hours to minutes. Accelerate your AI model development, reclaim days from development timelines, and confidently scale training workflows to thousands of AI accelerators.
  • Elastic training – Elastic training maximizes cluster utilization: training workloads automatically expand to use idle capacity as it becomes available and contract to yield resources when higher-priority workloads, such as inference, peak. Save hours of engineering time per week otherwise spent reconfiguring training jobs based on compute availability.

With these new training techniques, rather than spending time managing training infrastructure, your team can concentrate entirely on enhancing model performance, ultimately getting your AI models to market faster. By eliminating traditional checkpoint dependencies and fully utilizing available capacity, you can significantly reduce model training completion times.

Checkpointless training: How it works
Traditional checkpoint-based recovery has these sequential job stages: 1) job termination and restart, 2) process discovery and network setup, 3) checkpoint retrieval, 4) data loader initialization, and 5) training loop resumption. When failures occur, each stage can become a bottleneck and training recovery can take up to an hour on self-managed training clusters. The entire cluster must wait for every single stage to complete before training can resume. This can lead to the entire training cluster sitting idle during recovery operations, which increases costs and extends the time to market.

Checkpointless training removes this bottleneck entirely by maintaining continuous model state preservation across the training cluster. When failures occur, the system instantly recovers by using healthy peers, avoiding the need for a checkpoint-based recovery that requires restarting the entire job. As a result, checkpointless training enables fault recovery in minutes.

Checkpointless training is designed for incremental adoption and built on four core components that work together: 1) collective communications initialization optimizations, 2) memory-mapped data loading that enables caching, 3) in-process recovery, and 4) checkpointless peer-to-peer state replication. These components are orchestrated through the HyperPod training operator that is used to launch the job. Each component optimizes a specific step in the recovery process, and together they enable automatic detection and recovery of infrastructure faults in minutes with zero manual intervention, even with thousands of AI accelerators. You can progressively enable each of these features as your training scales.

The latest Amazon Nova models were trained using this technology on tens of thousands of accelerators. Additionally, based on internal studies on cluster sizes ranging from 16 to over 2,000 GPUs, checkpointless training showed significant improvements in recovery times, reducing downtime by over 80% compared to traditional checkpoint-based recovery.

To learn more, visit HyperPod Checkpointless Training in the Amazon SageMaker AI Developer Guide.

Elastic training: How it works
On clusters that run different types of modern AI workloads, accelerator availability can change continuously throughout the day as short-duration training runs complete, inference spikes occur and subside, or resources free up from completed experiments. Despite this dynamic availability of AI accelerators, traditional training workloads remain locked into their initial compute allocation, unable to take advantage of idle accelerators without manual intervention. This rigidity leaves valuable GPU capacity unused and prevents organizations from maximizing their infrastructure investment.

Elastic training transforms how training workloads interact with cluster resources. Training jobs can automatically scale up to utilize available accelerators and gracefully contract when resources are needed elsewhere, all while maintaining training quality.

Workload elasticity is enabled through the HyperPod training operator that orchestrates scaling decisions through integration with the Kubernetes control plane and resource scheduler. It continuously monitors cluster state through three primary channels: pod lifecycle events, node availability changes, and resource scheduler priority signals. This comprehensive monitoring enables near-instantaneous detection of scaling opportunities, whether from newly available resources or requests from higher-priority workloads.

The scaling mechanism relies on adding and removing data parallel replicas. When additional compute resources become available, new data parallel replicas join the training job, accelerating throughput. Conversely, during scale-down events (for example, when a higher-priority workload requests resources), the system scales down by removing replicas rather than terminating the entire job, allowing training to continue at reduced capacity.

Across different scales, the system preserves the global batch size and adapts learning rates, preventing model convergence from being adversely impacted. This enables workloads to dynamically scale up or down to utilize available AI accelerators without any manual intervention.
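
To illustrate that invariant, here's a small sketch of how a job might recompute per-replica settings when the replica count changes; the linear learning-rate heuristic is an illustrative assumption, not HyperPod's documented policy.

```python
# Sketch: keep the global batch size fixed as data parallel replicas scale.
GLOBAL_BATCH = 1024
BASE_LR, BASE_REPLICAS = 3e-4, 8

def rescale(replicas: int):
    per_replica_batch = GLOBAL_BATCH // replicas   # the global batch stays constant
    lr = BASE_LR * (replicas / BASE_REPLICAS)      # illustrative linear scaling (assumption)
    return per_replica_batch, lr

for replicas in (4, 8, 16):
    batch, lr = rescale(replicas)
    print(f"{replicas} replicas -> per-replica batch {batch}, lr {lr:.2e}")
```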

You can start elastic training through the HyperPod recipes for publicly available foundation models (FMs) including Llama and GPT-OSS. Additionally, you can modify your PyTorch training scripts to add elastic event handlers, which enable the job to dynamically scale.

To learn more, visit HyperPod Elastic Training in the Amazon SageMaker AI Developer Guide. To get started, find the HyperPod recipes in the AWS GitHub repository.

Now available
Both features are available in all Regions in which Amazon SageMaker HyperPod is available. You can use these training techniques at no additional cost. To learn more, visit the SageMaker HyperPod product page and SageMaker AI pricing page.

Give it a try and send feedback to AWS re:Post for SageMaker or through your usual AWS Support contacts.

Channy