Category Archives: AWS

AWS Weekly Roundup: Amazon EC2 G7e instances with NVIDIA Blackwell GPUs (January 26, 2026)

This post was originally published on this site

Hey! It’s my first post for 2026, and I’m writing to you while watching our driveway getting dug out. I hope wherever you are you are safe and warm and your data is still flowing!

Our driveway getting snow plowed

This week brings exciting news for customers running GPU-intensive workloads, with the launch of our newest graphics and AI inference instances powered by NVIDIA’s latest Blackwell architecture. Along with several service enhancements and regional expansions, this week’s updates continue to expand the capabilities available to AWS customers.

Last week’s launches

Amazon EC2 G7e instances are now generally available — The new G7e instances accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs deliver up to 2.3 times better inference performance compared to G6e instances. With two times the GPU memory and support for up to 8 GPUs providing 768 GB of total GPU memory, these instances enable running medium-sized models of up to 70B parameters with FP8 precision on a single GPU. G7e instances are ideal for generative AI inference, spatial computing, and scientific computing workloads. Available now in US East (N. Virginia) and US East (Ohio).

Additional updates

I thought these projects, blog posts, and news items were also interesting:

Amazon Corretto January 2026 Quarterly Updates — AWS released quarterly security and critical updates for Amazon Corretto Long-Term Supported (LTS) versions of OpenJDK. Corretto 25.0.2, 21.0.10, 17.0.18, 11.0.30, and 8u482 are now available, ensuring Java developers have access to the latest security patches and performance improvements.

Amazon ECR now supports cross-repository layer sharing — Amazon Elastic Container Registry now enables you to share common image layers across repositories through blob mounting. This feature helps you achieve faster image pushes by reusing existing layers and reduce storage costs by storing common layers once and referencing them across repositories.

Amazon CloudWatch Database Insights expands to four additional regions — CloudWatch Database Insights on-demand analysis is now available in Asia Pacific (New Zealand), Asia Pacific (Taipei), Asia Pacific (Thailand), and Mexico (Central). This feature uses machine learning to help identify performance bottlenecks and provides specific remediation advice.

Amazon Connect adds conditional logic and real-time updates to Step-by-Step Guides — Amazon Connect Step-by-Step Guides now enables managers to build dynamic guided experiences that adapt based on user interactions. Managers can configure conditional user interfaces with dropdown menus that show or hide fields, change default values, or adjust required fields based on prior inputs. The feature also supports automatic data refresh from Connect resources, ensuring agents always work with current information.

Upcoming AWS events

Keep a look out and be sure to sign up for these upcoming events:

Best of AWS re:Invent (January 28-29, Virtual) — Join us for this free virtual event bringing you the most impactful announcements and top sessions from AWS re:Invent. AWS VP and Chief Evangelist Jeff Barr will share highlights during the opening session. Sessions run January 28 at 9:00 AM PT for AMER, and January 29 at 9:00 AM SGT for APJ and 9:00 AM CET for EMEA. Register to access curated technical learning, strategic insights from AWS leaders, and live Q&A with AWS experts.

AWS Community Day Ahmedabad (February 28, 2026, Ahmedabad, India) — The 11th edition of this community-driven AWS conference brings together cloud professionals, developers, architects, and students for expert-led technical sessions, real-world use cases, tech expo booths with live demos, and networking opportunities. This free event includes breakfast, lunch, and exclusive swag.

Join the AWS Builder Center to learn, build, and connect with builders in the AWS community. Browse for upcoming in-person and virtual developer-focused events in your area.


That’s all for this week. Check back next Monday for another Weekly Roundup!

~ micah

Announcing Amazon EC2 G7e instances accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs

This post was originally published on this site

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) G7e instances that deliver cost-effective performance for generative AI inference workloads and the highest performance for graphics workloads.

G7e instances are accelerated by the NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs and are well suited for a broad range of GPU-enabled workloads including spatial computing and scientific computing workloads. G7e instances deliver up to 2.3 times inference performance compared to G6e instances.

Improvements made compared to predecessors:

  • NVIDIA RTX PRO 6000 Blackwell GPUs — NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs offer two times the GPU memory and 1.85 times the GPU memory bandwidth compared to G6e instances. By using the higher GPU memory offered by G7e instances, you can run medium-sized models of up to 70B parameters with FP8 precision on a single GPU.
  • NVIDIA GPUDirect P2P — For models that are too large to fit into the memory of a single GPU, you can split the model or computations across multiple GPUs. G7e instances reduce the latency of your multi-GPU workloads with support for NVIDIA GPUDirect P2P, which enables direct communication between GPUs over PCIe interconnect. These instances offer the lowest peer to peer latency for GPUs on the same PCIe switch. Additionally, G7e instances offer up to four times the inter-GPU bandwidth compared to L40s GPUs featured in G6e instances, boosting the performance of multi-GPU workloads. These improvements mean you can run inference for larger models across multiple GPUs offering up to 768 GB of GPU memory in a single node.
  • Networking — G7e instances offer four times the networking bandwidth compared to G6e instances, which means you can use the instance for small-scale multi-node workloads. Additionally, multi-GPU G7e instances support NVIDIA GPUDirect Remote Direct Memory Access (RDMA) with Elastic Fabric Adapter (EFA), which reduces the latency of remote GPU-to-GPU communication for multi-node workloads. These instance sizes also support NVIDIA GPUDirectStorage with Amazon FSx for Lustre, which increases throughput by up to 1.2 Tbps to the instances compared to G6e instances, which means you can quickly load your models.

EC2 G7e specifications
G7e instances feature up to 8 NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs with up to 768 GB of total GPU memory (96 GB of memory per GPU) and Intel Emerald Rapids processors. They also support up to 192 vCPUs, up to 1,600 Gbps of network bandwidth, up to 2,048 GiB of system memory, and up to 15.2 TB of local NVMe SSD storage.

Here are the specs:

Instance name
 GPUs GPU memory (GB) vCPUs Memory (GiB) Storage (TB) EBS bandwidth (Gbps) Network bandwidth (Gbps)
g7e.2xlarge 1 96 8 64 1.9 x 1 Up to 5 50
g7e.4xlarge 1 96 16 128 1.9 x 1 8 50
g7e.8xlarge 1 96 32 256 1.9 x 1 16 100
g7e.12xlarge 2 192 48 512 3.8 x 1 25 400
g7e.24xlarge 4 384 96 1024 3.8 x 2 50 800
g7e.48xlarge 8 768 192 2048 3.8 x 4 100 1600

To get started with G7e instances, you can use the AWS Deep Learning AMIs (DLAMI) for your machine learning (ML) workloads. To run instances, you can use AWS Management Console, AWS Command Line Interface (AWS CLI) or AWS SDKs. For a managed experience, you can use G7e instances with Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS). Support for Amazon SageMaker AI is also coming soon.

Now available
Amazon EC2 G7e instances are available today in the US East (N. Virginia) and US East (Ohio) AWS Regions. For Regional availability and a future roadmap, search the instance type in the CloudFormation resources tab of AWS Capabilities by Region.

The instances can be purchased as On-Demand Instances, Savings Plan, and Spot Instances. G7e instances are also available in Dedicated Instances and Dedicated Hosts. To learn more, visit the Amazon EC2 Pricing page.

Give G7e instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 G7e instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: Kiro CLI latest features, AWS European Sovereign Cloud, EC2 X8i instances, and more (January 19, 2026)

This post was originally published on this site

At the end of 2025 I was happy to take a long break to enjoy the incredible summers that the southern hemisphere provides. I’m back and writing my first post in 2026 which also happens to be my last post for the AWS News Blog (more on this later).

The AWS community is starting the year strong with various AWS re:invent re:Caps being hosted around the globe, with some communities already hosting their AWS Community Day events, the AWS Community Day Tel Aviv 2026 was hosted last week.

Last week’s launches
Here are last week’s launches that caught my attention:

  • Kiro CLI latest features – Kiro CLI now has granular controls for web fetch URLs, keyboard shortcuts for your custom agents, enhanced diff views, and much more. With these enhancements, you can now use allowlists or blocklists to restrict which URLs the agent can access, ensure a frictionless experience when working with multiple specialized agents in a single session, to name a few.
  • AWS European Sovereign Cloud – Following an announcement in 2023 of plans to build a new, independent cloud infrastructure, last week we announced the general availability of the AWS European Sovereign Cloud to all customers. The cloud is ready to meet the most stringent sovereignty requirements of European customers with a comprehensive set of AWS services.
  • Amazon EC2 X8i instancesPreviously launched in preview at AWS re:Invent 2025, last week we announced the general availability of new memory-optimized Amazon Elastic Compute Cloud (Amazon EC2) X8i instances. These instances are powered by custom Intel Xeon 6 processors with a sustained all-core turbo frequency of 3.9 GHz, available only on AWS. These SAP certified instances deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud.

Additional updates
These projects, blog posts, and news articles also caught my attention:

  • 5 core features in Amazon Quick Suite – AWS VP Agentic AI Swami Sivasubramanian talks about how he uses Amazon Quick Suite for just about everything. In October 2025 we announced Amazon Quick Suite, a new agentic teammate that quickly answers your questions at work and turns insights into actions for you. Amazon Quick Suite has become one of my favorite productivity tools, helping me with my research on various topics in addition to providing me with multiple perspectives on a topic.
  • Deploy AI agents on Amazon Bedrock AgentCore using GitHub Actions – Last year we announced Amazon Bedrock AgentCore, a flexible service that helps you seamlessly create and manage AI agents across different frameworks and models, whether hosted on Amazon Bedrock or other environments. Learn how to use a GitHub Actions workflow to automate the deployment of AI agents on AgentCore Runtime. This approach delivers a scalable solution with enterprise-level security controls, providing complete continuous integration and delivery (CI/CD) automation.

Upcoming AWS events
Join us January 28 or 29 (depending on your time zone) for Best of AWS re:Invent, a free virtual event where we bring you the most impactful announcements and top sessions from AWS re:Invent. Jeff Barr, AWS VP and Chief Evangelist, will share his highlights during the opening session.

There is still time until January 21 to compete for $250,000 in prizes and AWS credits in the Global 10,000 AIdeas Competition (yes, the second letter is an I as in Idea, not an L as in like). No code required yet: simply submit your idea, and if you’re selected as a semifinalist, you’ll build your app using Kiro within AWS Free Tier limits. Beyond the cash prizes and potential featured placement at AWS re:Invent 2026, you’ll gain hands-on experience with next-generation AI tools and connect with innovators globally.

Earlier this month, the 2026 application for the Community Builders program launched. The application is open until January 21st, midnight PST so here’s your last chance to ensure that you don’t miss out.

If you’re interested in these opportunities, join the AWS Builder Center to learn with builders in the AWS community.

With that, I close one of my most meaningful chapters here at AWS. It’s been an absolute pleasure to write for you and I thank you for taking the time to read the work that my team and I pour our absolute hearts into. I’ve grown from the close collaborations with the launch teams and the feedback from all of you. The Sub-Sahara Africa (SSA) community has grown significantly, and I want to dedicate more time focused on this community, I’m still at AWS and I look forward to meeting at an event near you!

Check back next Monday for another Weekly Roundup!

Veliswa Boya

Amazon EC2 X8i instances powered by custom Intel Xeon 6 processors are generally available for memory-intensive workloads

This post was originally published on this site

Since a preview launch at AWS re:Invent 2025, we’re announcing the general availability of new memory-optimized Amazon Elastic Compute Cloud (Amazon EC2) X8i instances. These instances are powered by custom Intel Xeon 6 processors with a sustained all-core turbo frequency of 3.9 GHz, available only on AWS. These SAP certified instances deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud.

X8i instances are ideal for memory-intensive workloads including in-memory databases such as SAP HANA, traditional large-scale databases, data analytics, and electronic design automation (EDA), which require high compute performance and a large memory footprint.

These instances provide 1.5 times more memory capacity (up to 6 TB), and 3.4 times more memory bandwidth compared to previous generation X2i instances. These instances offer up to 43% higher performance compared to X2i instances, with higher gains on some of the real-world workloads. They deliver up to 50% higher SAP Application Performance Standard (SAPS) performance, up to 47% faster PostgreSQL performance, up to 88% faster Memcached performance, and up to 46% faster AI inference performance.

During the preview, customers like RISE with SAP utilized up to 6 TB of memory capacity with 50% higher compute performance compared to X2i instances. This enabled faster transaction processing and improved query response times for SAP HANA workloads. Orion reduced the number of active cores on X8i instances compared to X2idn instances while maintaining performance thresholds, cutting SQL Server licensing costs by 50%.

X8i instances
X8i instances are available in 14 sizes including three larger instance sizes (48xlarge, 64xlarge, and 96xlarge), so you can choose the right size for your application to scale up, and two bare metal sizes (metal-48xl and metal-96xl) to deploy workloads that benefit from direct access to physical resources. X8i instances feature up to 100 Gbps of network bandwidth with support for the Elastic Fabric Adapter (EFA) and up to 80 Gbps of throughput to Amazon Elastic Block Store (Amazon EBS).

Here are the specs for X8i instances:

Instance name vCPUs Memory
(GiB)
Network bandwidth (Gbps) EBS bandwidth (Gbps)
x8i.large 2 32 Up to 12.5 Up to 10
x8i.xlarge 4 64 Up to 12.5 Up to 10
x8i.2xlarge 8 128 Up to 15 Up to 10
x8i.4xlarge 16 256 Up to 15 Up to 10
x8i.8xlarge 32 512 15 10
x8i.12xlarge 48 768 22.5 15
x8i.16xlarge 64 1,024 30 20
x8i.24xlarge 96 1,536 40 30
x8i.32xlarge 128 2,048 50 40
x8i.48xlarge 192 3,072 75 60
x8i.64xlarge 256 4,096 80 70
x8i.96xlarge 384 6,144 100 80
x8i.metal-48xl 192 3,072 75 60
x8i.metal-96xl 384 6,144 100 80

X8i instances support the instance bandwidth configuration (IBC) feature like other eighth-generation instance types, offering flexibility to allocate resources between network and EBS bandwidth. You can scale network or EBS bandwidth by up to 25%, improving database performance, query processing speeds, and logging efficiency. These instances also use sixth-generation AWS Nitro cards, which offload CPU virtualization, storage, and networking functions to dedicated hardware and software, enhancing performance and security for your workloads.

Now available
Amazon EC2 X8i instances are now available in US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) AWS Regions. For Regional availability and a future roadmap, search the instance type in the CloudFormation resources tab of AWS Capabilities by Region.

You can purchase these instances as On-Demand Instances, Savings Plan, and Spot Instances. To learn more, visit the Amazon EC2 Pricing page.

Give X8i instances a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 X8i instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: AWS Lambda for .NET 10, AWS Client VPN quickstart, Best of AWS re:Invent, and more (January 12, 2026)

This post was originally published on this site

At the beginning of January, I tend to set my top resolutions for the year, a way to focus on what I want to achieve. If AI and cloud computing are on your resolution list, consider creating an AWS Free Tier account to receive up to $200 in credits and have 6 months of risk-free experimentation with AWS services.

During this period, you can explore essential services across compute, storage, databases, and AI/ML, plus access to over 30 always-free services with monthly usage limits. After 6 months, you can decide whether to upgrade to a standard AWS account.

Whether you’re a student exploring career options, a developer expanding your skill set, or a professional building with cloud technologies, this hands-on approach lets you focus on what matters most: developing real expertise in the areas you’re passionate about.

Last week’s launches
Here are the launches that got my attention this week:

Additional updates
Here are some additional projects, blog posts, and news items that I found interesting:

Crossmodal search with Amazon Nova Multimodal Embeddings Architecture

Upcoming AWS events
Join us January 28 or 29 (depending on your time zone) for Best of AWS re:Invent, a free virtual event where we bring you the most impactful announcements and top sessions from AWS re:Invent. Jeff Barr, AWS VP and Chief Evangelist, will share his highlights during the opening session.

There is still time until January 21 to compete for $250,000 in prizes and AWS credits in the Global 10,000 AIdeas Competition (yes, the second letter is an I as in Idea, not an L as in like). No code required yet: simply submit your idea, and if you’re selected as a semifinalist, you’ll build your app using Kiro within AWS Free Tier limits. Beyond the cash prizes and potential featured placement at AWS re:Invent 2026, you’ll gain hands-on experience with next-generation AI tools and connect with innovators globally.

If you’re interested in these opportunities, join the AWS Builder Center to learn with builders in the AWS community.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Danilo

Happy New Year! AWS Weekly Roundup: 10,000 AIdeas Competition, Amazon EC2, Amazon ECS Managed Instances and more (January 5, 2026)

This post was originally published on this site

Happy New Year! I hope the holidays gave you time to recharge and spend time with your loved ones.

Like every year, I took a few weeks off after AWS re:Invent to rest and plan ahead. I used some of that downtime to plan the next cohort for Become a Solutions Architect (BeSA). BeSA is a free mentoring program that I, along with a few other Amazon Web Services (AWS) employees, volunteer to host as a way to help people excel in their cloud and AI careers. We’re kicking off a 6-week cohort on “Agentic AI on AWS” starting February 21, 2026. Visit the BeSA website to learn more.

There is still time to submit your idea for the Global 10,000 AIdeas Competition and compete for $250,000 in cash prizes, AWS credits, and recognition, including potential featured placement at AWS re:Invent 2026 and across AWS channels.

You will gain hands-on experience with next-generation AI development tools, connect with innovators globally, and access technical enablement through biweekly workshops, AWS User Groups, and AWS Builder Center resources.

The deadline is January 21, 2026, and no code is required yet. If you’re selected as a semifinalist, you’ll build your app then. Your finished app needs to use Kiro for at least part of development, stay within AWS Free Tier limits, and be completely original and not yet published.

If you haven’t yet caught up with all the new releases and announcements from AWS re:Invent 2025, check out our top announcements post or watch the keynotes, innovation talks, and breakout sessions on-demand.

Launches from the last few weeks
I’d like to highlight some launches that got my attention since our last Week in Review on December 15, 2025:

  • Amazon EC2 M8gn and M8gb instances – New M8gn and M8gb instances are powered by AWS Graviton4 processors to deliver up to 30% better compute performance than AWS Graviton3 processors. M8gn instances feature the latest 6th generation AWS Nitro Cards, and offer up to 600 Gbps network bandwidth, the highest network bandwidth among network-optimized EC2 instances. M8gb offer up to 150 Gbps of Amazon EBS bandwidth to provide higher EBS performance compared to same-sized equivalent Graviton4-based instances.
  • AWS Direct Connect supports resilience testing with AWS Fault Injection Service – You can now use AWS Fault Injection Service to test how your applications handle Direct Connect Border Gateway Protocol (BGP) failover in a controlled environment. For example, you can validate that traffic routes to redundant virtual interfaces when a primary virtual interface’s BGP session is disrupted and your applications continue to function as expected.
  • New AWS Security Hub controls in AWS Control Tower – AWS Control Tower now supports 176 additional Security Hub controls in the Control Catalog, covering use cases including security, cost, durability, and operations. With this launch, you can search, discover, enable, and manage these controls directly from AWS Control Tower to govern additional use cases across your multi-account environment.
  • AWS Transform supports network conversion for hybrid data center migrations – You can now use AWS Transform for VMware to automatically convert networks from hybrid data centers. This removes manual network mapping for environments running both VMware and other workloads. The service analyzes VLANs and IP ranges across all exported source networks and maps them to AWS constructs such as virtual private clouds (VPCs), subnets, and security groups.
  • NVIDIA Nemotron 3 Nano available on Amazon Bedrock – Amazon Bedrock now supports NVIDIA Nemotron 3 Nano 30B A3B model, NVIDIA’s latest breakthrough in efficient language modeling that delivers high reasoning performance, built-in tool calling support, and extended context processing with 256K token context window.
  • Amazon EC2 supports Availability Zone ID across its APIs – You can specify the Availability Zone ID (AZ ID) parameter directly in your Amazon EC2 APIs to guarantee consistent placement of resources. AZ IDs are consistent and static identifiers that represent the same physical location across all AWS accounts, helping you optimize resource placement. Prior to this launch, you had to use an AZ name while creating a resource, but these names could map to different physical locations. This mapping made it difficult to ensure resources were always co-located, especially when operating with multiple accounts.
  • Amazon ECS Managed Instances supports Amazon EC2 Spot Instances – Amazon ECS Managed Instances now supports Amazon EC2 Spot Instances, extending the range of capabilities available with AWS managed infrastructure. You can use spare EC2 capacity at up to 90% discount compared to On-Demand prices for fault-tolerant workloads in Amazon ECS Managed Instances.

See AWS What’s New for more launch news that I haven’t covered here. That’s all for this week. Check back next Monday for another Weekly Roundup!

Here’s to a fantastic start to 2026. Happy building!

– Prasad

AWS Weekly Roundup: Amazon ECS, Amazon CloudWatch, Amazon Cognito and more (December 15, 2025)

This post was originally published on this site

Can you believe it? We’re nearly at the end of 2025. And what a year it’s been! From re:Invent recap events, to AWS Summits, AWS Innovate, AWS re:Inforce, Community Days, and DevDays and, recently, adding that cherry on the cake, re:Invent 2025, we have lived through a year filled with exciting moments and technology advancements which continue to shape our new modern world.

Speaking of re:Invent, if you haven’t caught up yet on all the new releases and announcements (and there were plenty of exciting launches across every area), be sure to check out our curated post highlighting the top announcements from AWS re:Invent 2025. We’ve organized all the key releases into easy-to-navigate categories and included links so you can dive deeper into anything that sparks your interest.

While the year may be wrapping up, our teams are still busy working on things that you have either asked for as customers or that we pro-actively create to make your lives easier. Last week had quite a few interesting releases as usual, so let’s look at a few that I think could be useful for many of you out there.

Last week’s launches

Amazon WorkSpaces Secure Browser introduces Web Content Filtering – Organizations can now control web access through category-based filtering across 25+ predefined categories, granular URL policies, and integrated compliance logging. The feature works alongside existing Chrome policies and integrates with Session Logger for enhanced monitoring and is available at no additional cost in 10 AWS Regions with pay-as-you-go pricing.

Amazon Aurora DSQL now supports cluster creation in seconds – Developers can now instantly provision Aurora DSQL databases with setup time reduced from minutes to seconds, enabling rapid prototyping through the integrated AWS console query editor or AI-powered development via the Aurora DSQL Model Context Protocol server. Available at no additional cost in all AWS Regions where Aurora DSQL is offered, with AWS Free Tier access available.

Amazon Aurora PostgreSQL now supports integration with Kiro powers – Developers can now accelerate Aurora PostgreSQL application development using AI-assisted coding through Kiro powers, a repository of pre-packaged Model Context Protocol servers. The Aurora PostgreSQL integration provides direct database connectivity for queries, schema management, and cluster operations, dynamically loading relevant context as developers work. Available for one-click installation in Kiro IDE across all AWS Regions.

Amazon ECS now supports custom container stop signals on AWS Fargate – Fargate tasks now honor the stop signal configured in container images, enabling graceful shutdowns for containers that rely on signals like SIGQUIT or SIGINT instead of the default SIGTERM. The ECS container agent reads the STOPSIGNAL instruction from OCI-compliant images and sends the appropriate signal during task termination. Available at no additional cost across all AWS Regions.

Amazon CloudWatch SDK supports optimized JSON, CBOR protocols – CloudWatch SDK now defaults to JSON and CBOR protocols, delivering lower latency, reduced payload sizes, and decreased client-side CPU and memory usage compared to the traditional AWS Query protocol. Available at no additional cost across all AWS Regions and SDK language variants.

Amazon Cognito identity pools now support private connectivity with AWS PrivateLink – Organizations can now securely exchange federated identities for temporary AWS credentials through private VPC connections, eliminating the need to route authentication traffic over the public internet. Available in all AWS Regions where Cognito identity pools are supported, except AWS China (Beijing) and AWS GovCloud (US) Regions.

AWS Application Migration Service supports IPv6 – Organizations can now migrate applications using IPv6 addressing through dual-stack service endpoints that support both IPv4 and IPv6 communications. During replication, testing, and cutover phases, you can use IPv4, IPv6, or dual-stack configurations to launch servers in your target environment. Available at no additional cost in all AWS Regions that support MGN and EC2 dual-stack endpoints.

And that’s it for the AWS News Blog Weekly Roundup…not just for this week, but for 2025! We’ll be taking a break and returning in January to continue bringing you the latest AWS releases and updates.

As we close out 2025, it’s remarkable to look back at just how much has changed since the beginning of year. From groundbreaking AI capabilities to transformative infrastructure innovations, AWS has delivered an incredible year of releases that have reshaped what’s possible in the cloud. Throughout it all, the AWS News Blog has been right here with you every week with our Weekly Roundup series, helping you stay informed and ready to take advantage of each new opportunity as it arrived. We’re grateful you’ve joined us on this journey, and we can’t wait to continue bringing you the latest AWS innovations when we return in January 2026.

Until then, happy building, and here’s to an even more exciting year ahead!

Matheus Guimaraes | @codingmatheus

New serverless customization in Amazon SageMaker AI accelerates model fine-tuning

This post was originally published on this site

Today, I’m happy to announce new serverless customization in Amazon SageMaker AI for popular AI models, such as Amazon Nova, DeepSeek, GPT-OSS, Llama, and Qwen. The new customization capability provides an easy-to-use interface for the latest fine-tuning techniques like reinforcement learning, so you can accelerate the AI model customization process from months to days.

With a few clicks, you can seamlessly select a model and customization technique, and handle model evaluation and deployment—all entirely serverless so you can focus on model tuning rather than managing infrastructure. When you choose serverless customization, SageMaker AI automatically selects and provisions the appropriate compute resources based on the model and data size.

Getting started with serverless model customization
You can get started customizing models in Amazon SageMaker Studio. Choose Models in the left navigation pane and check out your favorite AI models to be customized.

Customize with UI
You can customize AI models in a only few clicks. In the Customize model dropdown list for a specific model such as Meta Llama 3.1 8B Instruct, choose Customize with UI.

You can select a customization technique used to adapt the base model to your use case. SageMaker AI supports Supervised Fine-Tuning and the latest model customization techniques including Direct Preference Optimization, Reinforcement Learning from Verifiable Rewards (RLVR), and Reinforcement Learning from AI Feedback (RLAIF). Each technique optimizes models in different ways, with selection influenced by factors such as dataset size and quality, available computational resources, task at hand, desired accuracy levels, and deployment constraints.

Upload or select a training dataset to match the format required by the customization technique selected. Use the values of batch size, learning rate, and number of epochs recommended by the technique selected. You can configure advanced settings such as hyperparameters, a newly introduced serverless MLflow application for experiment tracking, and network and storage volume encryption. Choose Submit to get started on your model training job.

After your training job is complete, you can see the models you created in the My Models tab. Choose View details in one of your models.

By choosing Continue customization, you can continue to customize your model by adjusting hyperparameters or training with different techniques. By choosing Evaluate, you can evaluate your customized model to see how it performs compared to the base model.

When you complete both jobs, you can choose either the SageMaker or Bedrock in the Deploy dropdown list to deploy your model.

You can choose Amazon Bedrock for serverless inference. Choose Bedrock and the model name to deploy the model into Amazon Bedrock. To find your deployed models, choose Imported models in the Bedrock console.

You can also deploy your model to a SageMaker AI inference endpoint if you want to control your deployment resources such as an instance type and instance count. After the SageMaker AI deployment is In service, you can use this endpoint to perform inference. In the Playground tab, you can test your customized model with a single prompt or chat mode.

With the serverless MLflow capability, you can automatically log all critical experiment metrics without modifying code and access rich visualizations for further analysis.

Customize with code
When you choose customizing with code, you can see a sample notebook to fine-tune or deploy AI models. If you want to edit the sample notebook, open it in JupyterLab. Alternatively, you can deploy the model immediately by choosing Deploy.

You can choose the Amazon Bedrock or SageMaker AI endpoint by selecting the deployment resources either from Amazon SageMaker Inference or Amazon SageMaker Hyperpod.

When you choose Deploy on the bottom right of the page, it will be redirected back to the model detail page. After the SageMaker AI deployment is in service, you can use this endpoint to perform inference.

Okay, you’ve seen how to streamline the model customization in the SageMaker AI. You can now choose your favorite way. To learn more, visit the Amazon SageMaker AI Developer Guide.

Now available
New serverless AI model customization in Amazon SageMaker AI is now available in US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland) Regions. You only pay for the tokens processed during training and inference. To learn more details, visit Amazon SageMaker AI pricing page.

Give it a try in Amazon SageMaker Studio and send feedback to AWS re:Post for SageMaker or through your usual AWS Support contacts.

Channy

Introducing checkpointless and elastic training on Amazon SageMaker HyperPod

This post was originally published on this site

Today, we’re announcing two new AI model training features within Amazon SageMaker HyperPod: checkpointless training, an approach that mitigates the need for traditional checkpoint-based recovery by enabling peer-to-peer state recovery, and elastic training, enabling AI workloads to automatically scale based on resource availability.

  • Checkpointless training – Checkpointless training eliminates disruptive checkpoint-restart cycles, maintaining forward training momentum despite failures, reducing recovery time from hours to minutes. Accelerate your AI model development, reclaim days from development timelines, and confidently scale training workflows to thousands of AI accelerators.
  • Elastic training  – Elastic training maximizes cluster utilization as training workloads automatically expand to use idle capacity as it becomes available, and contract to yield resources as higher-priority workloads like inference volumes peak. Save hours of engineering time per week spent reconfiguring training jobs based on compute availability.

Rather than spending time managing training infrastructure, these new training techniques mean that your team can concentrate entirely on enhancing model performance, ultimately getting your AI models to market faster. By eliminating the traditional checkpoint dependencies and fully utilizing available capacity, you can significantly reduce model training completion times.

Checkpointless training: How it works
Traditional checkpoint-based recovery has these sequential job stages: 1) job termination and restart, 2) process discovery and network setup, 3) checkpoint retrieval, 4) data loader initialization, and 5) training loop resumption. When failures occur, each stage can become a bottleneck and training recovery can take up to an hour on self-managed training clusters. The entire cluster must wait for every single stage to complete before training can resume. This can lead to the entire training cluster sitting idle during recovery operations, which increases costs and extends the time to market.

Checkpointless training removes this bottleneck entirely by maintaining continuous model state preservation across the training cluster. When failures occur, the system instantly recovers by using healthy peers, avoiding the need for a checkpoint-based recovery that requires restarting the entire job. As a result, checkpointless training enables fault recovery in minutes.

Checkpointless training is designed for incremental adoption and built on four core components that work together: 1) collective communications initialization optimizations, 2) memory-mapped data loading that enables caching, 3) in-process recovery, and 4) checkpointless peer-to-peer state replication. These components are orchestrated through the HyperPod training operator that is used to launch the job. Each component optimizes a specific step in the recovery process, and together they enable automatic detection and recovery of infrastructure faults in minutes with zero manual intervention, even with thousands of AI accelerators. You can progressively enable each of these features as your training scales.

The latest Amazon Nova models were trained using this technology on tens of thousands of accelerators. Additionally, based on internal studies on cluster sizes ranging between 16 GPUs to over 2,000 GPUs, checkpointless training showcased significant improvements in recovery times, reducing downtime by over 80% compared to traditional checkpoint-based recovery.

To learn more, visit HyperPod Checkpointless Training in the Amazon SageMaker AI Developer Guide.

Elastic training: How it works
On clusters that run different types of modern AI workloads, accelerator availability can change continuously throughout the day as short-duration training runs complete, inference spikes occur and subside, or resources free up from completed experiments. Despite this dynamic availability of AI accelerators, traditional training workloads remain locked into their initial compute allocation, unable to take advantage of idle accelerators without manual intervention. This rigidity leaves valuable GPU capacity unused and prevents organizations from maximizing their infrastructure investment.

Elastic training transforms how training workloads interact with cluster resources. Training jobs can automatically scale up to utilize available accelerators and gracefully contract when resources are needed elsewhere, all while maintaining training quality.

Workload elasticity is enabled through the HyperPod training operator that orchestrates scaling decisions through integration with the Kubernetes control plane and resource scheduler. It continuously monitors cluster state through three primary channels: pod lifecycle events, node availability changes, and resource scheduler priority signals. This comprehensive monitoring enables near-instantaneous detection of scaling opportunities, whether from newly available resources or requests from higher-priority workloads.

The scaling mechanism relies on adding and removing data parallel replicas. When additional compute resources become available, new data parallel replicas join the training job, accelerating throughput. Conversely, during scale-down events (for example, when a higher-priority workload requests resources), the system scales down by removing replicas rather than terminating the entire job, allowing training to continue at reduced capacity.

Across different scales, the system preserves the global batch size and adapts learning rates, preventing model convergence from being adversely impacted. This enables workloads to dynamically scale up or down to utilize available AI accelerators without any manual intervention.

You can start elastic training through the HyperPod recipes for publicly available foundation models (FMs) including Llama and GPT-OSS. Additionally, you can modify your PyTorch training scripts to add elastic event handlers, which enable the job to dynamically scale.

To learn more, visit the HyperPod Elastic Training in the Amazon SageMaker AI Developer Guide. To get started, find the HyperPod recipes available in the AWS GitHub repository.

Now available
Both features are available in all the Regions in which Amazon SageMaker HyperPod is available. You can use these training techniques without additional cost. To learn more, visit the SageMaker HyperPod product page and SageMaker AI pricing page.

Give it a try and send feedback to AWS re:Post for SageMaker or through your usual AWS Support contacts.

Channy

Announcing replication support and Intelligent-Tiering for Amazon S3 Tables

This post was originally published on this site

Today, we’re announcing two new capabilities for Amazon S3 Tables: support for the new Intelligent-Tiering storage class that automatically optimizes costs based on access patterns, and replication support to automatically maintain consistent Apache Iceberg table replicas across AWS Regions and accounts without manual sync.

Organizations working with tabular data face two common challenges. First, they need to manually manage storage costs as their datasets grow and access patterns change over time. Second, when maintaining replicas of Iceberg tables across Regions or accounts, they must build and maintain complex architectures to track updates, manage object replication, and handle metadata transformations.

S3 Tables Intelligent-Tiering storage class
With the S3 Tables Intelligent-Tiering storage class, data is automatically tiered to the most cost-effective access tier based on access patterns. Data is stored in three low-latency tiers: Frequent Access, Infrequent Access (40% lower cost than Frequent Access), and Archive Instant Access (68% lower cost compared to Infrequent Access). After 30 days without access, data moves to Infrequent Access, and after 90 days, it moves to Archive Instant Access. This happens without changes to your applications or impact on performance.

Table maintenance activities, including compaction, snapshot expiration, and unreferenced file removal, operate without affecting the data’s access tiers. Compaction automatically processes only data in the Frequent Access tier, optimizing performance for actively queried data while reducing maintenance costs by skipping colder files in lower-cost tiers.

By default, all existing tables use the Standard storage class. When creating new tables, you can specify Intelligent-Tiering as the storage class, or you can rely on the default storage class configured at the table bucket level. You can set Intelligent-Tiering as the default storage class for your table bucket to automatically store tables in Intelligent-Tiering when no storage class is specified during creation.

Let me show you how it works
You can use the AWS Command Line Interface (AWS CLI) and the put-table-bucket-storage-class and get-table-bucket-storage-class commands to change or verify the storage tier of your S3 table bucket.

# Change the storage class
aws s3tables put-table-bucket-storage-class 
   --table-bucket-arn $TABLE_BUCKET_ARN  
   --storage-class-configuration storageClass=INTELLIGENT_TIERING

# Verify the storage class
aws s3tables get-table-bucket-storage-class 
   --table-bucket-arn $TABLE_BUCKET_ARN  

{ "storageClassConfiguration":
   { 
      "storageClass": "INTELLIGENT_TIERING"
   }
}

S3 Tables replication support
The new S3 Tables replication support helps you maintain consistent read replicas of your tables across AWS Regions and accounts. You specify the destination table bucket and the service creates read-only replica tables. It replicates all updates chronologically while preserving parent-child snapshot relationships. Table replication helps you build global datasets to minimize query latency for geographically distributed teams, meet compliance requirements, and provide data protection.

You can now easily create replica tables that deliver similar query performance as their source tables. Replica tables are updated within minutes of source table updates and support independent encryption and retention policies from their source tables. Replica tables can be queried using Amazon SageMaker Unified Studio or any Iceberg-compatible engine including DuckDB, PyIceberg, Apache Spark, and Trino.

You can create and maintain replicas of your tables through the AWS Management Console or APIs and AWS SDKs. You specify one or more destination table buckets to replicate your source tables. When you turn on replication, S3 Tables automatically creates read-only replica tables in your destination table buckets, backfills them with the latest state of the source table, and continually monitors for new updates to keep replicas in sync. This helps you meet time-travel and audit requirements while maintaining multiple replicas of your data.

Let me show you how it works
To show you how it works, I proceed in three steps. First, I create an S3 table bucket, create an Iceberg table, and populate it with data. Second, I configure the replication. Third, I connect to the replicated table and query the data to show you that changes are replicated.

For this demo, the S3 team kindly gave me access to an Amazon EMR cluster already provisioned. You can follow the Amazon EMR documentation to create your own cluster. They also created two S3 table buckets, a source and a destination for the replication. Again, the S3 Tables documentation will help you to get started.

I take a note of the two S3 Tables bucket Amazon Resource Names (ARNs). In this demo, I refer to these as the environment variables SOURCE_TABLE_ARN and DEST_TABLE_ARN.

First step: Prepare the source database

I start a terminal, connect to the EMR cluster, start a Spark session, create a table, and insert a row of data. The commands I use in this demo are documented in Accessing tables using the Amazon S3 Tables Iceberg REST endpoint.

sudo spark-shell 
--packages "org.apache.iceberg:iceberg-spark-runtime-3.5_2.12:1.4.1,software.amazon.awssdk:bundle:2.20.160,software.amazon.awssdk:url-connection-client:2.20.160" 
--master "local[*]" 
--conf "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" 
--conf "spark.sql.defaultCatalog=spark_catalog" 
--conf "spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkCatalog" 
--conf "spark.sql.catalog.spark_catalog.type=rest" 
--conf "spark.sql.catalog.spark_catalog.uri=https://s3tables.us-east-1.amazonaws.com/iceberg" 
--conf "spark.sql.catalog.spark_catalog.warehouse=arn:aws:s3tables:us-east-1:012345678901:bucket/aws-news-blog-test" 
--conf "spark.sql.catalog.spark_catalog.rest.sigv4-enabled=true" 
--conf "spark.sql.catalog.spark_catalog.rest.signing-name=s3tables" 
--conf "spark.sql.catalog.spark_catalog.rest.signing-region=us-east-1" 
--conf "spark.sql.catalog.spark_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO" 
--conf "spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.SimpleAWSCredentialProvider" 
--conf "spark.sql.catalog.spark_catalog.rest-metrics-reporting-enabled=false"

spark.sql("""
CREATE TABLE s3tablesbucket.test.aws_news_blog (
customer_id STRING,
address STRING
) USING iceberg
""")

spark.sql("INSERT INTO s3tablesbucket.test.aws_news_blog VALUES ('cust1', 'val1')")

spark.sql("SELECT * FROM s3tablesbucket.test.aws_news_blog LIMIT 10").show()
+-----------+-------+
|customer_id|address|
+-----------+-------+
|      cust1|   val1|
+-----------+-------+

So far, so good.

Second step: Configure the replication for S3 Tables

Now, I use the CLI on my laptop to configure the S3 table bucket replication.

Before doing so, I create an AWS Identity and Access Management (IAM) policy to authorize the replication service to access my S3 table bucket and encryption keys. Refer to the S3 Tables replication documentation for the details. The permissions I used for this demo are:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:*",
                "s3tables:*",
                "kms:DescribeKey",
                "kms:GenerateDataKey",
                "kms:Decrypt"
            ],
            "Resource": "*"
        }
    ]
}

After having created this IAM policy, I can now proceed and configure the replication:

aws s3tables-replication put-table-replication 
--table-arn ${SOURCE_TABLE_ARN} 
--configuration  '{
    "role": "arn:aws:iam::<MY_ACCOUNT_NUMBER>:role/S3TableReplicationManualTestingRole", 
    "rules":[
        {
            "destinations": [
                {
                    "destinationTableBucketARN": "${DST_TABLE_ARN}"
                }]
        }
    ]

The replication starts automatically. Updates are typically replicated within minutes. The time it takes to complete depends on the volume of data in the source table.

Third step: Connect to the replicated table and query the data

Now, I connect to the EMR cluster again, and I start a second Spark session. This time, I use the destination table.

S3 Tables replication - destination table

To verify the replication works, I insert a second row of data on the source table.

spark.sql("INSERT INTO s3tablesbucket.test.aws_news_blog VALUES ('cust2', 'val2')")

I wait a few minutes for the replication to trigger. I follow the status of the replication with the get-table-replication-status command.

aws s3tables-replication get-table-replication-status 
--table-arn ${SOURCE_TABLE_ARN} 
{
    "sourceTableArn": "arn:aws:s3tables:us-east-1:012345678901:bucket/manual-test/table/e0fce724-b758-4ee6-85f7-ca8bce556b41",
    "destinations": [
        {
            "replicationStatus": "pending",
            "destinationTableBucketArn": "arn:aws:s3tables:us-east-1:012345678901:bucket/manual-test-dst",
            "destinationTableArn": "arn:aws:s3tables:us-east-1:012345678901:bucket/manual-test-dst/table/5e3fb799-10dc-470d-a380-1a16d6716db0",
            "lastSuccessfulReplicatedUpdate": {
                "metadataLocation": "s3://e0fce724-b758-4ee6-8-i9tkzok34kum8fy6jpex5jn68cwf4use1b-s3alias/e0fce724-b758-4ee6-85f7-ca8bce556b41/metadata/00001-40a15eb3-d72d-43fe-a1cf-84b4b3934e4c.metadata.json",
                "timestamp": "2025-11-14T12:58:18.140281+00:00"
            }
        }
    ]
}

When replication status shows ready, I connect to the EMR cluster and I query the destination table. Without surprise, I see the new row of data.

S3 Tables replication - target table is up to date

Additional things to know
Here are a couple of additional points to pay attention to:

  • Replication for S3 Tables supports both Apache Iceberg V2 and V3 table formats, giving you flexibility in your table format choice.
  • You can configure replication at the table bucket level, making it straightforward to replicate all tables under that bucket without individual table configurations.
  • Your replica tables maintain the storage class you choose for your destination tables, which means you can optimize for your specific cost and performance needs.
  • Any Iceberg-compatible catalog can directly query your replica tables without additional coordination—they only need to point to the replica table location. This gives you flexibility in choosing query engines and tools.

Pricing and availability
You can track your storage usage by access tier through AWS Cost and Usage Reports and Amazon CloudWatch metrics. For replication monitoring, AWS CloudTrail logs provide events for each replicated object.

There are no additional charges to configure Intelligent-Tiering. You only pay for storage costs in each tier. Your tables continue to work as before, with automatic cost optimization based on your access patterns.

For S3 Tables replication, you pay the S3 Tables charges for storage in the destination table, for replication PUT requests, for table updates (commits), and for object monitoring on the replicated data. For cross-Region table replication, you also pay for inter-Region data transfer out from Amazon S3 to the destination Region based on the Region pair.

As usual, refer to the Amazon S3 pricing page for the details.

Both capabilities are available today in all AWS Regions where S3 Tables are supported.

To learn more about these new capabilities, visit the Amazon S3 Tables documentation or try them in the Amazon S3 console today. Share your feedback through AWS re:Post for Amazon S3 or through your AWS Support contacts.

— seb