Introducing Express brokers for Amazon MSK to deliver high throughput and faster scaling for your Kafka clusters

Today, we’re announcing the general availability of Express brokers, a new broker type for Amazon Managed Streaming for Apache Kafka (Amazon MSK). It’s designed to deliver up to three times more throughput per broker, scale up to 20 times faster, and reduce recovery time by 90 percent compared to Standard brokers running Apache Kafka. Express brokers come preconfigured with Kafka best practices by default, support Kafka APIs, and provide the same low-latency performance that Amazon MSK customers expect, so they can continue using existing client applications without any changes.

Express brokers provide improved compute and storage elasticity for Kafka applications when using Amazon MSK provisioned clusters. Amazon MSK is a fully-managed AWS service that makes it easier for you to build and run highly available and scalable applications based on Apache Kafka.

Let’s dive deeper into some of the key features that Express brokers have and the benefits they provide:

  • Easier operations with hands-free storage management – Express brokers offer unlimited storage without preprovisioning, eliminating disk-related bottlenecks. Cluster sizing is simpler, requiring only ingress and egress throughput divided by recommended per-broker throughput. This removes the need for proactive disk capacity monitoring and scaling, simplifying cluster management and improving resilience by eliminating a potential failure source.
  • Fewer brokers with up to three times throughput per broker – Higher throughput per broker allows for smaller clusters for the same workload. Standard brokers’ throughput must account for client traffic and background operations, with m7g.16xl Standard brokers safely handling 154 MBps ingress. Express brokers use opinionated settings and resource isolation, enabling m7g.16xl size instances to safely manage up to 500 MBps ingress without compromising performance or availability during cluster events.
  • Higher utilization with 20 times faster scaling – Express brokers reduce data movement during scaling, making them up to 20 times faster than Standard brokers. This allows for quicker and more reliable cluster resizing. You can monitor each broker’s ingress throughput capacity and add brokers within minutes, eliminating the need for over-provisioning in anticipation of traffic spikes.
  • Higher resilience with 90 percent faster recovery – Express brokers are designed for mission-critical applications requiring high resilience. They come preconfigured with best-practice defaults, including 3-way replication (RF=3), which reduce failures due to misconfiguration. Express brokers also recover 90 percent faster from transient failures compared to standard Apache Kafka brokers. Express brokers’ rebalancing and recovery use minimal cluster resources, simplifying capacity planning. This eliminates the risk of increased resource utilization and the need for continuous monitoring when right-sizing clusters.

You have several options to choose from in Amazon MSK, depending on your workload and preferences:

                      MSK provisioned                             MSK Serverless
                      Standard brokers     Express brokers
Configuration range   Most flexible        Flexible               Least flexible
Cluster rebalancing   Customer managed     Customer managed,      MSK managed
                                           but up to 20x faster
Capacity management   Yes                  Yes (compute only)     No
Storage management    Yes                  No                     No

Express brokers lower costs, provide higher resiliency, and lower operational overhead, making them the best choice for all Kafka workloads. If you prefer to use Kafka without managing any aspect of its capacity, its configuration, or how it scales, then you can choose Amazon MSK Serverless. This provides a fully abstracted Apache Kafka experience that eliminates the need for any infrastructure management, scales automatically, and charges you on a pay-per-use consumption model that doesn’t require you to optimize resource utilization.

Getting started with Express brokers in Amazon MSK
To get started with Express brokers, you can use the Sizing and Pricing worksheet that Amazon MSK provides. This worksheet helps you estimate the cluster size you’ll need to accommodate your workload and also gives you a rough estimate of the total monthly cost you’ll incur.

The throughput requirements of your workload are the primary factor in the size of your cluster. You should also consider other factors, such as partition and connection count to arrive at the size and number of brokers you’ll need for your cluster. For example, if your streaming application needs 30 MBps of data ingress (write) and 80 MBps data egress (read) capacity, you can use three express.m7g.large brokers to meet your throughput needs (assuming the partition count for your workload is within the maximum number of partitions that Amazon MSK recommends for an m7g.large instance).

The following table shows the recommended maximum ingress and egress per instance size for sustainable and safe operations. You can learn more about these recommendations, including the recommended partition counts per instance size, in the Best practices section of the Amazon MSK Developer Guide.

Instance size          Ingress (MBps)    Egress (MBps)
express.m7g.large      15.6              31.2
express.m7g.4xlarge    124.9             249.8
express.m7g.16xlarge   500.0             1000.0
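
As a rough illustration of the sizing arithmetic, here is a minimal Python sketch that turns the throughput numbers from the table above into a broker count. The limits are copied from the table, and the rounding to multiples of three follows the horizontal-scaling behavior described later in this post; treat it as guidance rather than an official sizing tool.

import math

# Recommended per-broker throughput limits (MBps) from the table above.
EXPRESS_BROKER_LIMITS = {
    "express.m7g.large": {"ingress": 15.6, "egress": 31.2},
    "express.m7g.4xlarge": {"ingress": 124.9, "egress": 249.8},
    "express.m7g.16xlarge": {"ingress": 500.0, "egress": 1000.0},
}

def estimate_broker_count(instance_size, ingress_mbps, egress_mbps):
    """Estimate how many Express brokers of a given size a workload needs."""
    limits = EXPRESS_BROKER_LIMITS[instance_size]
    # Whichever of ingress or egress is the binding constraint drives the count.
    needed = max(
        math.ceil(ingress_mbps / limits["ingress"]),
        math.ceil(egress_mbps / limits["egress"]),
    )
    # Express brokers are added in sets of three across Availability Zones.
    return max(3, 3 * math.ceil(needed / 3))

# The example from this post: 30 MBps ingress and 80 MBps egress
# fits on three express.m7g.large brokers (partition count permitting).
print(estimate_broker_count("express.m7g.large", 30, 80))  # 3

Remember that partition and connection counts can push you to a larger instance size even when throughput alone fits, as noted above.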

Once you have decided the number and size of Express brokers you’ll need for your workload, go to the AWS Management Console or use the CreateCluster API to create an Amazon MSK provisioned cluster.

When you create a new cluster on the Amazon MSK console, in the Broker type option, choose Express brokers and then select the amount of compute capacity that you want to provision for the broker. As you can see in the screenshot, you can use Apache Kafka version 3.6.0 and Graviton-based instances for Express brokers. You don’t need to preprovision storage for Express brokers.

You can also customize some of these configurations to further fine-tune the performance of your clusters according to your own preferences. To learn more, visit Express broker configurations in the Amazon MSK developer guide.

To create an MSK cluster in the AWS Command Line Interface (AWS CLI), use the create-cluster command.

aws kafka create-cluster \
    --cluster-name "channy-express-cluster" \
    --kafka-version "3.6.0" \
    --number-of-broker-nodes 3 \
    --broker-node-group-info file://brokernodegroupinfo.json

A JSON file named brokernodegroupinfo.json specifies the three subnets over which you want Amazon MSK to distribute the broker nodes.

{
    "InstanceType": "express.m7g.large",
    "BrokerAZDistribution": "DEFAULT",
    "ClientSubnets": [
        "subnet-0123456789111abcd",
        "subnet-0123456789222abcd",
        "subnet-0123456789333abcd"
    ]
}

Once the cluster is created, you can use the bootstrap connection string to connect your clients to the cluster endpoints.
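
If you prefer the AWS SDK over the CLI, a minimal Python (boto3) sketch of the same call might look like the following. The cluster name, Region, and subnet IDs are the placeholders from the example above, and you would swap in your own values:

import boto3

kafka = boto3.client("kafka", region_name="us-east-1")  # example Region

# Equivalent of the create-cluster CLI call above: three express.m7g.large
# brokers spread over three subnets, with no storage to preprovision.
response = kafka.create_cluster(
    ClusterName="channy-express-cluster",
    KafkaVersion="3.6.0",
    NumberOfBrokerNodes=3,
    BrokerNodeGroupInfo={
        "InstanceType": "express.m7g.large",
        "BrokerAZDistribution": "DEFAULT",
        "ClientSubnets": [
            "subnet-0123456789111abcd",
            "subnet-0123456789222abcd",
            "subnet-0123456789333abcd",
        ],
    },
)
cluster_arn = response["ClusterArn"]

# Once the cluster is ACTIVE, fetch the bootstrap broker string for your clients.
print(kafka.get_bootstrap_brokers(ClusterArn=cluster_arn))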

With Express brokers, you can scale vertically (changing instance size) or horizontally (adding brokers). Vertical scaling doubles throughput without requiring partition reassignment. Horizontal scaling adds brokers in sets of three and allows you to create more partitions, but it requires partition reassignment for new brokers to serve traffic.

A major benefit of Express brokers is that you can add brokers and rebalance partitions within minutes. On the other hand, rebalancing partitions after adding Standard brokers can take several hours. The graph below shows the time it took to rebalance partitions after adding 3 Express brokers to a cluster and reassigning 2000 partitions to each of the new brokers.

As you can see, it took approximately 10 minutes to reassign these partitions and utilize the additional capacity of the new brokers. When we ran the same experiment on an equivalent cluster composed of Standard brokers, partition reassignment took over 24 hours.

To learn more about partition reassignment, visit Expanding your cluster in the Apache Kafka documentation.

Things to know
Here are some things you should know about Express brokers:

  • Data migration – You can migrate the data in your existing Kafka or MSK cluster to a cluster composed of Express brokers using Amazon MSK Replicator, which copies both the data and the metadata of your cluster to a new cluster.
  • Monitoring – You can monitor your cluster composed of Express brokers in the cluster and at the broker level with Amazon CloudWatch metrics and enable open monitoring with Prometheus to expose metrics using the JMX Exporter and the Node Exporter.
  • Security – Just like with other broker types, Amazon MSK integrates with AWS Key Management Service (AWS KMS) to offer transparent server-side encryption for the storage in Express brokers. When you create an MSK cluster with Express brokers, you can specify the AWS KMS key that you want Amazon MSK to use to encrypt your data at rest. If you don’t specify a KMS key, Amazon MSK creates an AWS managed key for you and uses it on your behalf.

Now available
The Express broker type is available today in the US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm) Regions.

You pay an hourly rate for Apache Kafka broker instance usage (billed at one-second resolution) for Express brokers, with varying fees depending on the size of the broker instance and active brokers in your MSK clusters. You also pay a per-GB rate for data written to an Express broker (billed at per-byte resolution). To learn more, visit the Amazon MSK pricing page.

Give Express brokers for Amazon MSK a try in the Amazon MSK console. For more information, visit the Amazon MSK Developer Guide and send feedback to AWS re:Post for Amazon MSK or through your usual AWS support contacts.

Channy

[Guest Diary] Insights from August Web Traffic Surge, (Wed, Nov 6th)

[This is a Guest Diary by Trevor Coleman, an ISC intern as part of the SANS.edu Bachelor's Degree in Applied Cybersecurity (BACS) program [1].]


Figure 1: ISC Web Honeypot Log Overview Chart [2]

The month of August brought with it a notable surge in web traffic log activities, catching my attention. As I delved into investigating the underlying causes of this spike, I uncovered some concerning findings that shed light on the potential risks organizations face in today's digital landscape.

The web honeypot log traffic, as parsed in the DShield-SIEM dashboard [3], served as a visual representation of the significant increase in activity. With over 62,000,000 activity logs originating from a single IP source, compared to roughly 757,000 from the second-highest source, it was evident that something was amiss. The most observed activity was directed towards destination ports 5555, 7547, and 9000, indicating a targeted effort to exploit vulnerabilities in web applications. Ports 5555 and 9000 are commonly used in malware attacks targeting known vulnerabilities on web servers.

Figure 2: DShield-SIEM Traffic Analytics for IP %%ip: 23.95.107.6%%.

Figure 3: Five IP addresses with highest traffic volume in August.

Analysis of the HTTP requests to the web honeypot revealed that the attacker exploited various known vulnerabilities. Out of the total requests, 57,243,299 (92%) were GET requests and 4,960,056 (8%) were POST requests, while there were significantly fewer PUT (18,466) and DELETE (4,150) requests. Figure 4 shows the top five HTTP request methods with corresponding logs and the count of each attempt. Note that only two different PATCH request types were present.

Figure 4: Top HTTP Request Methods and Logs for IP %%ip:23.95.107.6%%.

These included path traversal, Nuclei exploits, open redirect in Bitrix site manager, SQL injection, and PHP/WordPress attacks. The frequency and nature of these attacks pointed towards a well-orchestrated campaign aimed at compromising systems and gaining unauthorized access to sensitive information [4][5][6].

The attacker's use of scanning capabilities to identify known exploits and CVEs, as well as the observation of Mitre ATT&CK techniques and tactics [7], highlighted the sophistication of the threat actors behind these malicious activities. The ultimate goal of this attack was clear – to exploit vulnerabilities in web applications and compromise target systems for nefarious purposes.

In order to protect systems from such attacks, it is imperative for organizations to implement a multi-layered defense strategy. This includes the following measures:

  • Deploy a web application firewall to monitor and filter incoming and outgoing traffic.
  • Ensure continuous application patching to address known vulnerabilities and mitigate risks.
  • Conduct frequent web application vulnerability scans to identify and remediate CVEs.
  • Perform annual web penetration tests to proactively identify weaknesses and shore up defenses.
  • Monitor GET and POST request traffic exposed to the internet to detect and respond to suspicious activities.
  • Close unnecessary ports and services to minimize the attack surface and reduce potential entry points for threat actors.

Furthermore, it is important to keep a watchful eye on IP addresses associated with malicious activities and take appropriate action to mitigate risks. In this case, the IP address 23.95.107.6, leased by RackNerd LLC, has been flagged for abuse due to web application attacks [8][9]. RackNerd provides Infrastructure-as-a-Service (IaaS) offerings, including VPS and dedicated servers, and is headquartered in Los Angeles, California.

Ports 22, 5222, and 5269 were found to be open on this device [10]. Ports 5222 and 5269 are commonly used for the Extensible Messaging and Presence Protocol (XMPP) by chat clients such as Jabber, Google Talk, and WhatsApp, to name a few [11][12]. This further highlights the need for heightened vigilance and remediation efforts.

In conclusion, the recent surge in web traffic log activities serves as a stark reminder of the evolving cybersecurity threat landscape and the importance of proactive defense measures. By staying informed, conducting regular vulnerability assessments, and implementing robust security protocols, organizations can strengthen their resilience against cyber threats and safeguard their digital assets from malicious actors.

 

[1] https://www.sans.edu/cyber-security-programs/bachelors-degree/
[2] https://isc.sans.edu/myweblogs/
[3] https://github.com/bruneaug/DShield-SIEM/blob/main/README.md
[4] https://www.cve.org/CVERecord?id=CVE-2024-1561
[5] https://nvd.nist.gov/vuln/detail/CVE-2024-27920
[6] https://www.cvedetails.com/cve/CVE-2008-2052/
[7] https://attack.mitre.org/tactics/TA0043/    
[8] https://www.virustotal.com/gui/ip-address/23.95.107.6/detection
[9] https://www.abuseipdb.com/check/23.95.107.6
[10] https://www.shodan.io/host/23.95.107.6  
[11] https://www.speedguide.net/port.php?port=5222
[12] https://www.speedguide.net/port.php?port=5269

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Python RAT with a Nice Screensharing Feature, (Tue, Nov 5th)

While hunting, I found another interesting Python RAT in the wild. This is not brand new because the script was released two years ago[1]. The script I found is based on the same tool and still has a low VT score: 3/64 (SHA256:1281b7184278f2a4814b245b48256da32a6348b317b83c440008849a16682ccb)[2]. The RAT has a lot of features to control the victim's computer:

AWS Weekly Roundup: AWS Lambda, Amazon Bedrock, Amazon Redshift, Amazon CloudWatch, and more (Nov 4, 2024)

The spooky season has come and gone now. While there aren’t any Halloween-themed releases, AWS has celebrated it in big style by having a plethora of exciting releases last week! I think it’s safe to say that we have truly entered the ‘pre’ re:Invent stage as more and more interesting things are being released every week on the countdown to AWS re:Invent 2024.

There is a lot to cover, so let me put my wizard hat on, open the big bag of treats, and dive into last week’s goodies!

Something for developers
There was no shortage of treats from AWS for developers this Halloween!

AWS enhances the Lambda application building experience with VS Code IDE and AWS Toolkit — AWS has enhanced AWS Lambda development with the AWS Toolkit for Visual Studio Code, providing a guided setup for coding, testing, and deploying Lambda applications directly within the IDE. It includes sample walkthroughs and one-click deployment, simplifying the development process. Now, building apps with Lambda is as intuitive as crafting a spell in a wizard’s workshop!

AWS Amplify integration with Amazon S3 for static website hosting — AWS Amplify Hosting now integrates with Amazon S3 for seamless static website hosting, with global CDN support via Amazon CloudFront. This simplifies setup, offering secure, high-performance delivery with custom domains and SSL certificates. Hosting your site is now easier than spotting a jack-o’-lantern on Halloween night!

AWS Lambda now supports AWS Fault Injection Service (FIS) actions — AWS Lambda now supports AWS Fault Injection Service (FIS) actions, enabling developers to test resilience by injecting controlled faults like latency and execution errors. This helps simulate real-world failures without code changes, improving monitoring and operational readiness. Great for testing that old candy dispenser!

AWS CodeBuild now supports retrying builds automatically — AWS CodeBuild now offers automatic build retries, allowing developers to set a retry limit for failed builds. This reduces manual intervention by automatically retrying builds up to the specified limit, tackling those pesky, intermittent failures like a ghostbuster clearing a haunted pipeline!

Amazon Virtual Private Cloud launches new security group sharing features — Amazon VPC now supports sharing security groups across multiple VPCs within the same account and with participant accounts in shared VPCs. This streamlines security management and ensures consistent traffic filtering across your organization. Now, keeping your network secure is as seamless as warding off digital goblins!

Amazon DataZone expands data access with tools like Tableau, Power BI, and more — Amazon DataZone now supports the Amazon Athena JDBC Driver, allowing seamless access to data lake assets from BI tools, like Tableau and Power BI. This lets analysts connect and analyze data with ease. Now, accessing data is as effortless as a witch flying on her broomstick!

Generative AI
Amazon Q and Amazon Bedrock continue to make generative AI seem like magic. Here are some releases from last week.

Amazon Q Developer inline chat — Amazon Q Developer has introduced inline chat support, allowing developers to engage directly within their code editor for actions like optimization, commenting, and test generation. Real-time inline diffs make it simple to review changes, available in Visual Studio Code and JetBrains IDEs. It’s practically code magic – no witch’s cauldron needed!

Meta’s Llama 3.1 8B and 70B models are now available for fine-tuning in Amazon Bedrock — Amazon Bedrock now supports fine-tuning for Meta’s Llama 3.1 8B and 70B models, allowing developers to customize these AI models with their own data. With a 128K context length, Llama 3.1 processes large text volumes efficiently, making it perfect for domain-specific applications. Now, your AI won’t be scared of handling monstrous amounts of data—even on a dark, stormy night!

Fine-tuning for Anthropic’s Claude 3 Haiku in Amazon Bedrock is now generally available — Fine-tuning for the Claude 3 Haiku model in Amazon Bedrock is now generally available, enabling customization with your data for better accuracy. Make your AI as unique as your Halloween costume!

Cost Planning, Saving, and Tracking
Here are some new releases that can help you stay on top of your budget and keep an eye on the amount of candy that you buy.

AWS now accepts partial card payments — AWS now supports partial payments with credit or debit cards, letting users split monthly bills across multiple cards. This flexibility makes managing your budget as smooth as a ghost gliding through a haunted house!

Amazon Bedrock now supports cost allocation tags on inference profiles — Amazon Bedrock now supports cost allocation tags for inference profiles, allowing customers to track and manage generative AI costs by department or application. This makes financial management a treat, not a trick!

AWS Deadline Cloud now adds budget-related events — AWS Deadline Cloud, a service used for rendering and managing visual effects and animation workloads, now sends budget-related events via Amazon EventBridge, enabling real-time spending updates and automated notifications. This helps keep project costs under control without any unexpected scares!

And the busiest team of the week award goes to…Amazon Redshift!
Clearly, the Amazon Redshift team loves Halloween and decided to celebrate in big style with many releases! Here are the highlights:

Amazon Redshift integration with Amazon Bedrock for generative AI — Amazon Redshift now integrates with Amazon Bedrock for generative AI tasks using SQL, adding AI capabilities like text generation directly in your data warehouse. Now, you can conjure up rich insights without the need for complicated spells!

Announcing general availability of auto-copy for Amazon Redshift — Auto-copy for continuous data ingestion from Amazon S3 into Amazon Redshift is now generally available. This streamlines workflows, making data integration as smooth as carving a soft pumpkin!

Amazon Redshift now supports incremental refresh on Materialized Views (MVs) for data lake tables — Amazon Redshift now supports incremental refresh for materialized views on data lake tables, updating only changed data to boost efficiency. This keeps your data fresh without any haunting overhead!

Announcing Amazon Redshift Serverless with AI-driven scaling and optimization — Amazon Redshift Serverless now offers AI-driven scaling, adjusting resources automatically based on workload. This ensures smooth performance without any chilling surprises!

CSV result format support for Amazon Redshift Data API — Amazon Redshift Data API now supports CSV output for SQL query results, enhancing data processing flexibility. This makes handling data as smooth as a ghost’s whisper!

Halloween week contest runner-up…Amazon CloudWatch!
The Amazon CloudWatch team has also been busy giving out candy this Halloween! Let’s check it out.

Amazon CloudWatch now monitors EBS volumes exceeding provisioned performance — Amazon CloudWatch now provides metrics to check if Amazon EBS volumes exceed their IOPS or throughput limits. This helps quickly spot and resolve performance issues before they turn into a haunting problem!

New Amazon CloudWatch metrics for monitoring I/O latency of Amazon EBS volumes — Amazon CloudWatch now offers metrics for average read and write I/O latency of Amazon EBS volumes, aiding in identifying performance issues. These insights are available per minute at no extra cost. This should help you prevent latency from sneaking up on you like a Halloween ghost!

Amazon ElastiCache for Valkey adds new CloudWatch metrics to monitor server-side response time — Amazon ElastiCache now includes metrics for read and write request latency, helping monitor server response times. This aids in quickly spotting and resolving performance issues before they become a frightful surprise!

Conclusion
And that’s a wrap on Halloween 2024. I don’t know about you, but this is my favorite time of the year, followed by New Year’s. Both carry an element of unpredictability that I very much enjoy. With Halloween, you can get excited about what costume you’re going to wear, whereas New Year’s is all about new possibilities and conquering new horizons.

Luckily, you don’t have to wait for the new year to unlock new frontiers with AWS as we bring excitement and innovation throughout the year. And what better way to see this in action than at AWS re:Invent 2024!

I wonder what kinds of spells and surprises we’ll be conjuring up come Halloween 2025. Until next time, keep your eyes on the horizon—and your broomsticks at the ready!

Analyzing an Encrypted Phishing PDF, (Mon, Nov 4th)

Once in a while, I get a question about my pdf-parser.py tool not being able to decode strings and streams from a PDF document.

And often, I have the answer without looking at the PDF: it's encrypted.

PDF documents can be encrypted, and what's special about encrypted PDFs is that the structure of the PDF document is not encrypted. You can still see the objects, dictionaries, and so on. What gets encrypted are the strings and streams.

So my tools pdfid and pdf-parser can provide information about the structure of an encrypted PDF document, but not the strings and streams.

My PDF tools do not support encryption; you need to use another open source tool: qpdf, developed by Jay Berkenbilt.

A PDF document can be encrypted for DRM and/or for confidentiality. PDFs encrypted solely for DRM can be opened and viewed by the user without providing a password. PDFs encrypted for confidentiality can only be opened and viewed when the user provides the correct password.

Let's take an example of a phishing PDF: 5c2764b9d3a6df67f99e342404e46a41ec6e1f5582919d5f99098d90fd45367f.

Analyzing this document with pdfid gives this:

The document is encrypted (/Encrypt is greater than 0) and it contains URIs (/URI is greater than 0).

qpdf can help you determine if a password is needed to view the content, like this:

If you get this output without providing a password, it means that the user password is empty and that the document can be opened without providing a password.

You must then decrypt the PDF with qpdf for further analysis like this:

And then it can be analyzed with pdf-parser to extract the URI like this:

If you don't decrypt the PDF prior to analysis with pdf-parser, the string of the URI will be ciphertext:
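
Putting the steps above together, a minimal Python sketch of the decrypt-then-parse workflow might look like this, assuming qpdf and pdf-parser.py are installed and on the PATH (the filenames and search term are illustrative):

import subprocess

encrypted = "phishing.pdf"    # the encrypted sample
decrypted = "decrypted.pdf"   # cleartext copy produced by qpdf

# Inspect the encryption parameters; with an empty user password,
# qpdf can read the document without prompting for one.
subprocess.run(["qpdf", "--show-encryption", encrypted], check=True)

# Produce a decrypted copy so that strings and streams are in cleartext.
subprocess.run(["qpdf", "--decrypt", encrypted, decrypted], check=True)

# Search the decrypted copy for URI entries with pdf-parser.
subprocess.run(["pdf-parser.py", "--search", "URI", decrypted], check=True)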

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

qpdf: Extracting PDF Streams, (Sat, Nov 2nd)

In diary entry "Analyzing PDF Streams" I answer a question asked by a student of Xavier: "how can you export all streams of a PDF?". I explained how to do this with my pdf-parser.py tool.

I recently found another method, using the open-source tool qpdf. Since version 11, you can extract streams with qpdf.

If you want the contents of the streams inside a single JSON object (BASE64 encoded), use this command:

qpdf.exe --json --json-stream-data=inline exampl.pdf

And if you want the contents of the streams in separate files (filename prefix "stream"), use this command:

qpdf.exe --json --json-stream-data=file --json-stream-prefix=stream exampl.pdf
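
With the inline option, the stream bytes end up BASE64 encoded inside the JSON. As a rough sketch, you could decode them in Python like this; the exact JSON layout differs between qpdf versions, so this walks the structure generically and simply looks for stream entries carrying a data field:

import base64
import json
import subprocess

# Run qpdf and capture the JSON document with inline, BASE64-encoded stream data.
result = subprocess.run(
    ["qpdf", "--json", "--json-stream-data=inline", "exampl.pdf"],
    capture_output=True, text=True, check=True,
)
doc = json.loads(result.stdout)

def iter_stream_data(node):
    """Recursively yield BASE64 stream data found anywhere in the JSON."""
    if isinstance(node, dict):
        stream = node.get("stream")
        if isinstance(stream, dict) and isinstance(stream.get("data"), str):
            yield stream["data"]
        for value in node.values():
            yield from iter_stream_data(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_stream_data(value)

for i, data in enumerate(iter_stream_data(doc)):
    with open(f"stream{i}.bin", "wb") as f:
        f.write(base64.b64decode(data))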

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Fine-tuning for Anthropic’s Claude 3 Haiku model in Amazon Bedrock is now generally available

Today, we are announcing the general availability of fine-tuning for Anthropic’s Claude 3 Haiku model in Amazon Bedrock in the US West (Oregon) AWS Region. Amazon Bedrock is the only fully managed service that provides you with the ability to fine-tune Claude models. You can now fine-tune and customize the Claude 3 Haiku model with your own task-specific training dataset to boost model accuracy, quality, and consistency to further tailor generative AI for your business.

Fine-tuning is a technique where a pre-trained large language model (LLM) is customized for a specific task by updating the weights and tuning hyperparameters like learning rate and batch size for optimal results.

Anthropic’s Claude 3 Haiku model is the fastest and most compact model in the Claude 3 model family. Fine-tuning Claude 3 Haiku offers significant advantages for businesses:

  • Customization – You can customize models that excel in areas crucial to your business compared to more general models by encoding company and domain knowledge.
  • Specialized performance – You can generate higher quality results and create unique user experiences that reflect your company’s proprietary information, brand, products, and more.
  • Task-specific optimization – You can enhance performance for domain-specific actions such as classification, interactions with custom APIs, or industry-specific data interpretation.
  • Data security – You can fine-tune with peace of mind in your secure AWS environment. Amazon Bedrock makes a separate copy of the base foundation model that is accessible only by you and trains this private copy of the model.

You can now optimize performance for specific business use cases by providing domain-specific labeled data to fine-tune the Claude 3 Haiku model in Amazon Bedrock.

In early 2024, we started to engage customers with a team of experts from the AWS Generative AI Innovation Center to help fine-tune Anthropic’s Claude models with their proprietary data sources. I’m happy to share that you can now fine-tune Anthropic’s Claude 3 Haiku model directly in the Amazon Bedrock console.

Get started with fine-tuning for Anthropic’s Claude 3 Haiku model in Amazon Bedrock
I will demonstrate how to easily fine-tune the Claude 3 Haiku model in Amazon Bedrock. To learn more about the fine-tuning workflow in detail, visit the AWS Machine Learning Blog post, Fine-tune Anthropic’s Claude 3 Haiku in Amazon Bedrock to boost model accuracy and quality.

To create a simple fine-tuning job in the Amazon Bedrock console, go to the Foundation models section in the navigation pane and select Custom models. In the Models section, select the Customize model button and then select Create Fine-tuning job.

Next, choose the model that you want to customize with your own data, give your resulting model a name, and optionally add encryption keys and any tags to associate with the model in the Model details section. Enter a name for the job and optionally add any tags for the job in the Job configuration section.

You can select the Amazon S3 location of the training dataset file and, if applicable, the validation dataset file in the Input data section.

Each dataset can be created using a JSON Lines (JSONL) file with multiple JSON lines using the single-turn or multi-turn messaging format. Each JSON line is a sample containing a system prompt and messages, an array of message objects.

I prepared some of the latest information about Anthropic’s Claude 3.5 Sonnet model in Amazon Bedrock. To learn more, visit Prepare the datasets in the Amazon Bedrock documentation.

{"system": "You are an helpful assistant.","messages":[{"role": "user", "content": "What’s the latest models to support Amazon Bedrock"},{"role": "assistant", "content": "Amazon Bedrock is a fully managed service that support Anthropic’s Claude 3 Sonnet, Haiku, and Opus models and the latest Claud 3.5 Sonnet model."}]}
{"system": "You are an helpful assistant.","messages":[{"role": "user", "content": "How long context window of Claude 3.5?"},{"role": "assistant", "content": "Anthropic's Claude 3 models have a 200,000 token context window enabling you to relay a large volume of information to Claude. This translates to roughly 150,000 words, or over 500 pages of material."}]}
{"system": "You are an helpful assistant.","messages":[{"role": "user", "content": "Is Claude 3.5 available in Bedrock?"},{"role": "assistant", "content": "Anthropic’s most intelligent and advanced model, Claude 3.5 Sonnet, demonstrates exceptional capabilities across a diverse range of tasks and evaluations while also outperforming Claude 3 Opus."}]}

In the Hyperparameters section, enter values for hyperparameters to use in training, such as epochs, batch size, and learning rate multiplier. If you’ve included a validation dataset, you can enable Early stopping, a technique used to prevent overfitting and stop the training process when the validation loss stops improving. You can set an early stopping threshold and patience value.

You can also select the output location where Amazon Bedrock should save the output of the job in the Output data section. Choose an AWS Identity and Access Management (IAM) custom service role with the appropriate permissions in the Service access section. To learn more, see Create a service role for model customization in the Amazon Bedrock documentation.

Finally, choose Create Fine-tuning job and wait for your fine-tuning job to start.

You can track its progress or stop it in the Jobs tab in the Custom models section.

After a model customization job is complete, you can analyze the results of the training process by looking at the files in the output Amazon Simple Storage Service (Amazon S3) folder that you specified when you submitted the job, or you can view details about the model.

Before using a customized model, you need to purchase Provisioned Throughput for Amazon Bedrock and then use the resulting provisioned model for inference. When you purchase Provisioned Throughput, you can select a commitment term, choose a number of model units, and see estimated hourly, daily, and monthly costs. To learn more about the custom model pricing for the Claude 3 Haiku model, visit Amazon Bedrock Pricing.

Now, you can test your custom model in the console playground. I choose my custom model and ask whether Anthropic’s Claude 3.5 Sonnet model is available in Amazon Bedrock.

I receive the answer:

Yes. You can use Anthropic’s most intelligent and advanced model, Claude 3.5 Sonnet in the Amazon Bedrock. You can demonstrate exceptional capabilities across a diverse range of tasks and evaluations while also outperforming Claude 3 Opus.

You can complete this job using AWS APIs, AWS SDKs, or AWS Command Line Interface (AWS CLI). To learn more about using AWS CLI, visit Code samples for model customization in the AWS documentation.
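
For reference, a minimal Python (boto3) sketch of the same job submission might look like the following. The job name, model name, role ARN, S3 URIs, base model identifier, and hyperparameter values are placeholders that you would replace with your own; check the Amazon Bedrock documentation for the exact identifiers and hyperparameter ranges supported for Claude 3 Haiku.

import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

# Submit a fine-tuning job: mirrors the console steps described above.
response = bedrock.create_model_customization_job(
    jobName="claude-3-haiku-fine-tuning-job",
    customModelName="my-custom-claude-3-haiku",
    roleArn="arn:aws:iam::111122223333:role/BedrockCustomizationRole",  # placeholder
    baseModelIdentifier="anthropic.claude-3-haiku-20240307-v1:0",       # placeholder identifier
    customizationType="FINE_TUNING",
    trainingDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/train.jsonl"},
    validationDataConfig={
        "validators": [{"s3Uri": "s3://amzn-s3-demo-bucket/validation.jsonl"}]
    },
    outputDataConfig={"s3Uri": "s3://amzn-s3-demo-bucket/output/"},
    hyperParameters={
        "epochCount": "2",
        "batchSize": "8",
        "learningRateMultiplier": "1.0",
    },
)
print(response["jobArn"])

As with the console flow, you still purchase Provisioned Throughput for the resulting custom model before using it for inference.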

If you are using Jupyter Notebook, visit the GitHub repository and follow a hands-on guide for custom models. To build a production-level operation, I recommend reading Streamline custom model creation and deployment for Amazon Bedrock with Provisioned Throughput using Terraform on the AWS Machine Learning Blog.

Datasets and parameters
When fine-tuning Claude 3 Haiku, the first thing you should do is look at your datasets. Two datasets are involved in training Haiku: the training dataset and the validation dataset. There are specific parameters that you must follow in order to make your training successful, which are outlined in the following table.

                        Training data                   Validation data
File format             JSONL                           JSONL
File size               <= 10 GB                        <= 1 GB
Line count              32 – 10,000 lines               32 – 1,000 lines
Training + validation   Combined sum <= 10,000 lines
Token limit             < 32,000 tokens per entry
Reserved keywords       Avoid "\nHuman:" or "\nAssistant:" in prompts
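
Before submitting a job, a quick sanity check of your JSONL file against these limits can save a failed training run. Here is a minimal Python sketch; the token count is a rough character-based approximation, not the tokenizer Amazon Bedrock actually uses:

import json

MIN_LINES = 32
RESERVED = ("\nHuman:", "\nAssistant:")

def check_dataset(path, max_lines=10_000):
    """Check a JSONL dataset against the limits in the table above."""
    with open(path, encoding="utf-8") as f:
        lines = [line for line in f if line.strip()]

    assert MIN_LINES <= len(lines) <= max_lines, f"line count {len(lines)} out of range"

    for n, line in enumerate(lines, start=1):
        sample = json.loads(line)  # each line must be valid JSON
        assert "messages" in sample, f"line {n}: missing 'messages'"
        texts = [sample.get("system", "")] + [m.get("content", "") for m in sample["messages"]]
        assert not any(r in t for r in RESERVED for t in texts), f"line {n}: reserved keyword found"
        # Rough estimate (~4 characters per token) against the 32,000 tokens-per-entry limit.
        assert len(line) / 4 < 32_000, f"line {n}: entry may exceed the token limit"

check_dataset("train.jsonl")
check_dataset("validation.jsonl", max_lines=1_000)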

When you prepare the datasets, start with a small high-quality dataset and iterate based on tuning results. You can consider using larger models from Anthropic like Claude 3 Opus or Claude 3.5 Sonnet to help refine and improve your training data. You can also use them to generate training data for fine-tuning the Claude 3 Haiku model, which can be very effective if the larger models already perform well on your target task.

For more guidance on selecting the proper hyperparameters and preparing the datasets, read the AWS Machine Learning Blog post, Best practices and lessons for fine-tuning Anthropic’s Claude 3 Haiku in Amazon Bedrock.

Demo video
Check out this deep dive demo video for a step-by-step walkthrough that will help you get started with fine-tuning Anthropic’s Claude 3 Haiku model in Amazon Bedrock.

Now available
Fine-tuning for Anthropic’s Claude 3 Haiku model in Amazon Bedrock is now generally available in the US West (Oregon) AWS Region; check the full Region list for future updates. To learn more, visit Custom models in the Amazon Bedrock documentation.

Give fine-tuning for the Claude 3 Haiku model a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

I look forward to seeing what you build when you put this new technology to work for your business.

Channy

Unlock the potential of your supply chain data and gain actionable insights with AWS Supply Chain Analytics

Today, we’re announcing the general availability of AWS Supply Chain Analytics powered by Amazon QuickSight. This new feature helps you to build custom report dashboards using your data in AWS Supply Chain. With this feature, your business analysts or supply chain managers can perform custom analyses, visualize data, and gain actionable insights for your supply chain management operations.

Here’s how it looks:

AWS Supply Chain Analytics leverages the AWS Supply Chain data lake and provides Amazon QuickSight embedded authoring tools directly into the AWS Supply Chain user interface. This integration provides you with a unified and configurable experience for creating custom insights, metrics, and key performance indicators (KPIs) for your operational analytics.

In addition, AWS Supply Chain Analytics provides prebuilt dashboards that you can use as-is or modify based on your needs. At launch, you will have the following prebuilt dashboards:

  1. Plan-Over-Plan Variance: Presents a comparison between two demand plans, showcasing variances in both units and values across key dimensions such as product, site, and time periods.
  2. Seasonality Analytics: Presents a year-over-year view of demand, illustrating trends in average demand quantities and highlighting seasonality patterns through heatmaps at both monthly and weekly levels.

Let’s get started
Let me walk you through the features of AWS Supply Chain Analytics.

The first step is to enable AWS Supply Chain Analytics. To do this, navigate to Settings, then select Organizations and choose Analytics. Here, I can Enable data access for Analytics.

Now I can edit existing roles or create a new role with analytics access. To learn more, visit User permission roles.

Once this feature is enabled, when I log in to AWS Supply Chain I can access the AWS Supply Chain Analytics feature by selecting either the Connecting to Analytics card or Analytics on the left navigation menu.

Here, I have an embedded Amazon QuickSight interface ready for me to use. To get started, I navigate to Prebuilt Dashboards.

Then, I can select the prebuilt dashboards I need in the Supply Chain Function dropdown list:

What I like most about these prebuilt dashboards is that I can get started easily. AWS Supply Chain Analytics will prepare all the datasets, analysis, and even a dashboard for me. I select Add to begin.

Then, I navigate to the dashboard page, and I can see the results. I can also share this dashboard with my team, which improves collaboration.

If I need to include other datasets for me to build a custom dashboard, I can navigate to Datasets and select New dataset.

Here, I have AWS Supply Chain data lake as an existing dataset for me to use.

Next, I need to select Create dataset.

Then, I can select a table that I need to include into my analysis. On the Data section, I can see all available fields. All data sets that start with asc_ are generated by AWS Supply Chain, such as data from Demand Planning, Insights, Supply Planning, and others.

I can also find all the datasets I have ingested into AWS Supply Chain. To learn more about data entities, visit the AWS Supply Chain documentation page. One thing to note here is that if I have not ingested data into the AWS Supply Chain data lake, I need to do so before using AWS Supply Chain Analytics. To learn how to ingest data into the data lake, visit the data lake page.

At this stage, I can start my analysis. 

Now available
AWS Supply Chain Analytics is now generally available in all AWS Regions where AWS Supply Chain is offered. Give it a try and transform your operations with AWS Supply Chain Analytics.

Happy building,
— Donnie

Amazon Aurora PostgreSQL Limitless Database is now generally available

Today, we are announcing the general availability of Amazon Aurora PostgreSQL Limitless Database, a new serverless horizontal scaling (sharding) capability of Amazon Aurora. With Aurora PostgreSQL Limitless Database, you can scale beyond the existing Aurora limits for write throughput and storage by distributing a database workload over multiple Aurora writer instances while maintaining the ability to use it as a single database.

When we previewed Aurora PostgreSQL Limitless Database at AWS re:Invent 2023, I explained that it uses a two-layer architecture consisting of multiple database nodes in a DB shard group, either routers or shards, that scale based on the workload.

  • Routers – Nodes that accept SQL connections from clients, send SQL commands to shards, maintain system-wide consistency, and return results to clients.
  • Shards – Nodes that store a subset of tables and full copies of data, which accept queries from routers.

There are three types of tables that contain your data: sharded, reference, and standard.

  • Sharded tables – These tables are distributed across multiple shards. Data is split among the shards based on the values of designated columns in the table, called shard keys. They are useful for scaling the largest, most I/O-intensive tables in your application.
  • Reference tables – These tables copy data in full on every shard so that join queries can work faster by eliminating unnecessary data movement. They are commonly used for infrequently modified reference data, such as product catalogs and zip codes.
  • Standard tables – These tables are like regular Aurora PostgreSQL tables. Standard tables are all placed together on a single shard so join queries can work faster by eliminating unnecessary data movement. You can create sharded and reference tables from standard tables.

Once you have created the DB shard group and your sharded and reference tables, you can load massive amounts of data into Aurora PostgreSQL Limitless Database and query data in those tables using standard PostgreSQL queries. To learn more, visit Limitless Database architecture in the Amazon Aurora User Guide.

Getting started with Aurora PostgreSQL Limitless Database
You can get started in the AWS Management Console and AWS Command Line Interface (AWS CLI) to create a new DB cluster that uses Aurora PostgreSQL Limitless Database, add a DB shard group to the cluster, and query your data.

1. Create an Aurora PostgreSQL Limitless Database Cluster
Open the Amazon Relational Database Service (Amazon RDS) console and choose Create database. For Engine options, choose Aurora (PostgreSQL Compatible) and Aurora PostgreSQL with Limitless Database (Compatible with PostgreSQL 16.4).

For Aurora PostgreSQL Limitless Database, enter a name for your DB shard group and values for minimum and maximum capacity measured by Aurora Capacity Units (ACUs) across all routers and shards. The initial number of routers and shards in a DB shard group is determined by this maximum capacity. Aurora PostgreSQL Limitless Database scales a node up to a higher capacity when its current utilization is too low to handle the load. It scales the node down to a lower capacity when its current capacity is higher than needed.

For DB shard group deployment, choose whether to create standbys for the DB shard group: no compute redundancy, one compute standby in a different Availability Zone, or two compute standbys in two different Availability Zones.

You can set the remaining DB settings to what you prefer and choose Create database. After the DB shard group is created, it is displayed on the Databases page.

You can connect, reboot, or delete a DB shard group, or you can change the capacity, split a shard, or add a router in the DB shard group. To learn more, visit Working with DB shard groups in the Amazon Aurora User Guide.

2. Create Aurora PostgreSQL Limitless Database tables
As shared previously, Aurora PostgreSQL Limitless Database has three table types: sharded, reference, and standard. You can convert standard tables to sharded or reference tables to distribute or replicate existing standard tables or create new sharded and reference tables.

You can use variables to create sharded and reference tables by setting the table creation mode. The tables that you create will use this mode until you set a different mode. The following examples show how to use these variables to create sharded and reference tables.

For example, create a sharded table named items with a shard key composed of the item_id and item_cat columns.

SET rds_aurora.limitless_create_table_mode='sharded';
SET rds_aurora.limitless_create_table_shard_key='{"item_id", "item_cat"}';
CREATE TABLE items(item_id int, item_cat varchar, val int, item text);

Now, create a sharded table named item_description with a shard key composed of the item_id and item_cat columns and collocate it with the items table.

SET rds_aurora.limitless_create_table_collocate_with='items';
CREATE TABLE item_description(item_id int, item_cat varchar, color_id int, ...);

You can also create a reference table named colors.

SET rds_aurora.limitless_create_table_mode='reference';
CREATE TABLE colors(color_id int primary key, color varchar);

You can find information about Limitless Database tables by using the rds_aurora.limitless_tables view, which contains information about tables and their types.

postgres_limitless=> SELECT * FROM rds_aurora.limitless_tables;

 table_gid | local_oid | schema_name | table_name  | table_status | table_type  | distribution_key
-----------+-----------+-------------+-------------+--------------+-------------+------------------
         1 |     18797 | public      | items       | active       | sharded     | HASH (item_id, item_cat)
         2 |     18641 | public      | colors      | active       | reference   | 

(2 rows)

You can convert standard tables into sharded or reference tables. During the conversion, data is moved from the standard table to the distributed table, then the source standard table is deleted. To learn more, visit Converting standard tables to limitless tables in the Amazon Aurora User Guide.

3. Query Aurora PostgreSQL Limitless Database tables
Aurora PostgreSQL Limitless Database is compatible with PostgreSQL syntax for queries. You can query your Limitless Database using psql or any other connection utility that works with PostgreSQL. Before querying tables, you can load data into Aurora Limitless Database tables by using the COPY command or by using the data loading utility.

To run queries, connect to the cluster endpoint, as shown in Connecting to your Aurora Limitless Database DB cluster. All PostgreSQL SELECT queries are performed on the router to which the client sends the query and shards where the data is located.
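
Because the cluster endpoint speaks standard PostgreSQL, any regular driver works. For example, a minimal Python sketch with psycopg2 might look like the following; the endpoint, credentials, and database name are placeholders:

import psycopg2

# Connect to the DB shard group through the cluster endpoint, just as you
# would for any other Aurora PostgreSQL database.
conn = psycopg2.connect(
    host="my-limitless-cluster.cluster-abc123xyz.us-east-1.rds.amazonaws.com",
    port=5432,
    dbname="postgres_limitless",
    user="postgres",
    password="<your-password>",
    sslmode="require",
)

with conn, conn.cursor() as cur:
    # A single-shard lookup: the router forwards this query to the one shard
    # that owns the row with item_id = 25.
    cur.execute("SELECT * FROM items WHERE item_id = %s", (25,))
    print(cur.fetchall())

    # Inspect table types through the Limitless Database view shown earlier.
    cur.execute("SELECT table_name, table_type FROM rds_aurora.limitless_tables")
    print(cur.fetchall())

conn.close()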

To achieve a high degree of parallel processing, Aurora PostgreSQL Limitless Database utilizes two querying methods: single-shard queries and distributed queries. The query planner determines whether your query is single-shard or distributed and processes the query accordingly.

  • Single-shard query – A query where all the data needed for the query is on one shard. The entire operation can be performed on one shard, including any result set generated. When the query planner on the router encounters a query like this, the planner sends the entire SQL query to the corresponding shard.
  • Distributed query – A query run on a router and more than one shard. The query is received by one of the routers. The router creates and manages the distributed transaction, which is sent to the participating shards. The shards create a local transaction with the context provided by the router, and the query is run.

To see examples of single-shard queries, you can use the following parameters to configure the output from the EXPLAIN command.

postgres_limitless=> SET rds_aurora.limitless_explain_options = shard_plans, single_shard_optimization;
SET

postgres_limitless=> EXPLAIN SELECT * FROM items WHERE item_id = 25;

                     QUERY PLAN
--------------------------------------------------------------
 Foreign Scan  (cost=100.00..101.00 rows=100 width=0)
   Remote Plans from Shard postgres_s4:
         Index Scan using items_ts00287_id_idx on items_ts00287 items_fs00003  (cost=0.14..8.16 rows=1 width=15)
           Index Cond: (id = 25)
 Single Shard Optimized
(5 rows) 

To learn more about the EXPLAIN command, see EXPLAIN in the PostgreSQL documentation.

As an example of a distributed query, you can insert new items named Book and Pen into the items table.

postgres_limitless=> INSERT INTO items (item_name) VALUES ('Book'), ('Pen');

This makes a distributed transaction on two shards. When the query runs, the router sets a snapshot time and passes the statement to the shards that own Book and Pen. The router coordinates an atomic commit across both shards, and returns the result to the client.

You can use distributed query tracing, a tool to trace and correlate queries in PostgreSQL logs across Aurora PostgreSQL Limitless Database. To learn more, visit Querying Limitless Database in the Amazon Aurora User Guide.

Some SQL commands aren’t supported. For more information, see Aurora Limitless Database reference in the Amazon Aurora User Guide.

Things to know
Here are a couple of things that you should know about this feature:

  • Compute – You can only have one DB shard group per DB cluster and set the maximum capacity of a DB shard group to 16–6144 ACUs. Contact us if you need more than 6144 ACUs. The initial number of routers and shards is determined by the maximum capacity that you set when you create a DB shard group. The number of routers and shards doesn’t change when you modify the maximum capacity of a DB shard group. To learn more, see the table of the number of routers and shards in the Amazon Aurora User Guide.
  • Storage – Aurora PostgreSQL Limitless Database only supports the Amazon Aurora I/O-Optimized DB cluster storage configuration. Each shard has a maximum capacity of 128 TiB. Reference tables have a size limit of 32 TiB for the entire DB shard group. To reclaim storage space by cleaning up your data, you can use the vacuuming utility in PostgreSQL.
  • Monitoring – You can use Amazon CloudWatch, Amazon CloudWatch Logs, or Performance Insights to monitor Aurora PostgreSQL Limitless Database. There are also new statistics functions and views and wait events for Aurora PostgreSQL Limitless Database that you can use for monitoring and diagnostics.

Now available
Amazon Aurora PostgreSQL Limitless Database is available today with PostgreSQL 16.4 compatibility in the AWS US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Hong Kong), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm) Regions.

Give Aurora PostgreSQL Limitless Database a try in the Amazon RDS console. For more information, visit the Amazon Aurora User Guide and send feedback to AWS re:Post for Amazon Aurora or through your usual AWS support contacts.

Channy