Amazon SageMaker HyperPod introduces Amazon EKS support

Today, we are pleased to announce Amazon Elastic Kubernetes Service (EKS) support in Amazon SageMaker HyperPod — purpose-built infrastructure engineered with resilience at its core for foundation model (FM) development. This new capability enables customers to orchestrate HyperPod clusters using EKS, combining the power of Kubernetes with Amazon SageMaker HyperPod’s resilient environment designed for training large models. Amazon SageMaker HyperPod helps efficiently scale across more than a thousand artificial intelligence (AI) accelerators, reducing training time by up to 40%.

Amazon SageMaker HyperPod now enables customers to manage their clusters using a Kubernetes-based interface. This integration allows seamless switching between Slurm and Amazon EKS for optimizing various workloads, including training, fine-tuning, experimentation, and inference. The CloudWatch Observability EKS add-on provides comprehensive monitoring capabilities, offering insights into CPU, network, disk, and other low-level node metrics on a unified dashboard. This enhanced observability extends to resource utilization across the entire cluster, node-level metrics, pod-level performance, and container-specific utilization data, facilitating efficient troubleshooting and optimization.

Launched at re:Invent 2023, Amazon SageMaker HyperPod has become a go-to solution for AI startups and enterprises looking to efficiently train and deploy large-scale models. It is compatible with SageMaker’s distributed training libraries, which offer Model Parallel and Data Parallel software optimizations that help reduce training time by up to 20%. SageMaker HyperPod automatically detects and repairs or replaces faulty instances, enabling data scientists to train models uninterrupted for weeks or months. This allows data scientists to focus on model development, rather than managing infrastructure.

The integration of Amazon EKS with Amazon SageMaker HyperPod draws on the strengths of Kubernetes, which has become popular for machine learning (ML) workloads due to its scalability and rich open-source tooling. Organizations often standardize on Kubernetes for building applications, including those required for generative AI use cases, as it allows reuse of capabilities across environments while meeting compliance and governance standards. Today’s announcement enables customers to scale and optimize resource utilization across more than a thousand AI accelerators. This flexibility enhances the developer experience, containerized app management, and dynamic scaling for FM training and inference workloads.

Amazon EKS support in Amazon SageMaker HyperPod strengthens resilience through deep health checks, automated node recovery, and job auto-resume capabilities, ensuring uninterrupted training for large-scale or long-running jobs. Job management can be streamlined with the optional HyperPod CLI, designed for Kubernetes environments, though customers can also use their own CLI tools. Integration with Amazon CloudWatch Container Insights provides advanced observability, offering deeper insights into cluster performance, health, and utilization. Additionally, data scientists can use tools like Kubeflow for automated ML workflows. The integration also includes Amazon SageMaker managed MLflow, providing a robust solution for experiment tracking and model management.

At a high level, an Amazon SageMaker HyperPod cluster is created by the cloud admin using the HyperPod cluster API and is fully managed by the HyperPod service, removing the undifferentiated heavy lifting involved in building and optimizing ML infrastructure. Amazon EKS is used to orchestrate these HyperPod nodes, similar to how Slurm orchestrates HyperPod nodes, providing customers with a familiar Kubernetes-based administrator experience.

Let’s explore how to get started with Amazon EKS support in Amazon SageMaker HyperPod.
I start by preparing the scenario, checking the prerequisites, and creating an Amazon EKS cluster with a single AWS CloudFormation stack following the Amazon SageMaker HyperPod EKS workshop, configured with VPC and storage resources.

To create and manage Amazon SageMaker HyperPod clusters, I can use either the AWS Management Console or the AWS Command Line Interface (AWS CLI). Using the AWS CLI, I specify my cluster configuration in a JSON file and choose the Amazon EKS cluster created previously as the orchestrator of the SageMaker HyperPod cluster. Then, I create the cluster worker nodes, which I call “worker-group-1”, with a private subnet, NodeRecovery set to Automatic to enable automatic node recovery, and OnStartDeepHealthChecks set to InstanceStress and InstanceConnectivity to enable deep health checks.

cat > eli-cluster-config.json << EOL
{
    "ClusterName": "example-hp-cluster",
    "Orchestrator": {
        "Eks": {
            "ClusterArn": "${EKS_CLUSTER_ARN}"
        }
    },
    "InstanceGroups": [
        {
            "InstanceGroupName": "worker-group-1",
            "InstanceType": "ml.p5.48xlarge",
            "InstanceCount": 32,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://${BUCKET_NAME}",
                "OnCreate": "on_create.sh"
            },
            "ExecutionRole": "${EXECUTION_ROLE}",
            "ThreadsPerCore": 1,
            "OnStartDeepHealthChecks": [
                "InstanceStress",
                "InstanceConnectivity"
            ]
        },
  ....
    ],
    "VpcConfig": {
        "SecurityGroupIds": [
            "$SECURITY_GROUP"
        ],
        "Subnets": [
            "$SUBNET_ID"
        ]
    },
    "ResilienceConfig": {
        "NodeRecovery": "Automatic"
    }
}
EOL

You can add InstanceStorageConfigs to provision and mount additional Amazon EBS volumes on HyperPod nodes, as in the sketch below.
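For illustration, here is a minimal sketch of what that might look like inside an instance group definition (the 500 GB size is an arbitrary example value, not a recommendation):

"InstanceStorageConfigs": [
    {
        "EbsVolumeConfig": {
            "VolumeSizeInGB": 500
        }
    }
]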

To create the cluster using the SageMaker HyperPod APIs, I run the following AWS CLI command:

aws sagemaker create-cluster \
    --cli-input-json file://eli-cluster-config.json

The command returns the ARN of the new HyperPod cluster.

{
    "ClusterArn": "arn:aws:sagemaker:us-east-2:ACCOUNT-ID:cluster/wccy5z4n4m49"
}

I then verify the HyperPod cluster status in the SageMaker Console, waiting until the status changes to InService.

Alternatively, you can check the cluster status using the AWS CLI by running the describe-cluster command:

aws sagemaker describe-cluster --cluster-name my-hyperpod-cluster
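If you prefer to wait from the terminal, a small polling loop is one option (a sketch; the --query path assumes the standard describe-cluster response field ClusterStatus):

while true; do
    STATUS=$(aws sagemaker describe-cluster \
        --cluster-name my-hyperpod-cluster \
        --query 'ClusterStatus' --output text)
    echo "Cluster status: ${STATUS}"
    [ "${STATUS}" = "InService" ] && break
    sleep 30   # check again in 30 seconds
done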

Once the cluster is ready, I can access the SageMaker HyperPod cluster nodes. For most operations, I can use kubectl commands to manage resources and jobs from my development environment, using the full power of Kubernetes orchestration while benefiting from SageMaker HyperPod’s managed infrastructure. For advanced troubleshooting or direct node access, I use AWS Systems Manager (SSM) to log in to individual nodes, following the instructions on the Access your SageMaker HyperPod cluster nodes page.
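For example, after pointing kubectl at the EKS cluster, a few standard commands are enough to confirm that the HyperPod nodes have joined (the cluster name and Region below are placeholders):

# Update the local kubeconfig for the EKS cluster orchestrating HyperPod
aws eks update-kubeconfig --name my-eks-cluster --region us-east-2

kubectl get nodes                    # list the HyperPod worker nodes
kubectl describe node NODE_NAME     # inspect capacity, labels, and conditions
kubectl get pods --all-namespaces   # see system and workload pods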

To run jobs on the SageMaker HyperPod cluster orchestrated by EKS, I follow the steps outlined on the Run jobs on SageMaker HyperPod cluster through Amazon EKS page. You can use the HyperPod CLI and the native kubectl command to find available HyperPod clusters and submit training jobs (Pods). For managing ML experiments and training runs, you can use the Kubeflow Training Operator, Kueue, and Amazon SageMaker managed MLflow. A minimal submission sketch follows below.
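As a sketch of what a submission could look like, here is a plain Kubernetes Job (rather than a Kubeflow PyTorchJob) requesting the eight GPUs of a single p5 node; the image name and training command are illustrative placeholders:

cat > training-job.yaml << EOL
apiVersion: batch/v1
kind: Job
metadata:
  name: example-training-job
spec:
  template:
    spec:
      containers:
        - name: trainer
          image: ACCOUNT-ID.dkr.ecr.us-east-2.amazonaws.com/my-training-image:latest
          command: ["python", "train.py"]
          resources:
            limits:
              nvidia.com/gpu: 8   # an ml.p5.48xlarge node exposes 8 GPUs
      restartPolicy: Never
EOL

kubectl apply -f training-job.yaml
kubectl logs -f job/example-training-job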

Finally, in the SageMaker Console, I can view the Status and Kubernetes version of recently added EKS clusters, providing a comprehensive overview of my SageMaker HyperPod environment.

And I can monitor cluster performance and health insights using Amazon CloudWatch Container Insights.

Things to know
Here are some key things you should know about Amazon EKS support in Amazon SageMaker HyperPod:

Resilient Environment – This integration provides a more resilient training environment with deep health checks, automated node recovery, and job auto-resume. SageMaker HyperPod automatically detects, diagnoses, and recovers from faults, allowing you to continually train foundation models for weeks or months without disruption. This can reduce training time by up to 40%.

Enhanced GPU Observability – Amazon CloudWatch Container Insights provides detailed metrics and logs for your containerized applications and microservices. This enables comprehensive monitoring of cluster performance and health.

Scientist-Friendly Tooling – This launch includes a custom HyperPod CLI for job management, Kubeflow Training Operators for distributed training, Kueue for scheduling, and integration with SageMaker managed MLflow for experiment tracking. It also works with SageMaker’s distributed training libraries, which provide Model Parallel and Data Parallel optimizations to significantly reduce training time. These libraries, combined with auto-resumption of jobs, enable efficient and uninterrupted training of large models.

Flexible Resource Utilization – This integration enhances developer experience and scalability for FM workloads. Data scientists can efficiently share compute capacity across training and inference tasks. You can use your existing Amazon EKS clusters or create and attach new ones to HyperPod compute, and bring your own tools for job submission, queuing, and monitoring.

To get started with Amazon SageMaker HyperPod on Amazon EKS, you can explore resources such as the SageMaker HyperPod EKS Workshop, the aws-do-hyperpod project, and the awsome-distributed-training project. This release is generally available in the AWS Regions where Amazon SageMaker HyperPod is available, except Europe (London). For pricing information, visit the Amazon SageMaker Pricing page.

This blog post was a collaborative effort. I would like to thank Manoj Ravi, Adhesh Garg, Tomonori Shimomura, Alex Iankoulski, Anoop Saha, and the entire team for their significant contributions in compiling and refining the information presented here. Their collective expertise was crucial in creating this comprehensive article.

– Eli.

AWS Weekly Roundup: Amazon DynamoDB, AWS AppSync, Storage Browser for Amazon S3, and more (September 9, 2024)

Last week, the latest AWS Heroes arrived! AWS Heroes are amazing technical experts who generously share their insights, best practices, and innovative solutions to help others.

The AWS GenAI Lofts are in full swing with San Francisco and São Paulo open now, and London, Paris, and Seoul coming in the next couple of months. Here’s an insider view from a workshop in San Francisco last week.

AWS GenAI Loft San Francisco workshop

Last week’s launches
Here are the launches that got my attention.

Storage Browser for Amazon S3 (alpha release) – An open source Amplify UI React component that you can add to your web applications to provide your end users with a simple interface for data stored in S3. The component uses the new ListCallerAccessGrants API to list all S3 buckets, prefixes, and objects they can access, as defined by their S3 Access Grants.

AWS Network Load Balancer – Now supports a configurable TCP idle timeout. For more information, see this Networking & Content Delivery Blog post.

AWS Gateway Load Balancer – Also supports a configurable TCP idle timeout. More info is available in this blog post.

Amazon ECS – Now supports AWS Graviton-based Spot compute with AWS Fargate. This allows you to run fault-tolerant Arm-based applications at up to 70% lower cost compared to On-Demand prices.

Zone Groups for Availability Zones in AWS Regions – We are working on extending the Zone Group construct to Availability Zones (AZs) with a consistent naming format across all AWS Regions.

Amazon Managed Service for Apache Flink – Now supports Apache Flink 1.20. You can upgrade to benefit from bug fixes, performance improvements, and new functionality added by the Flink community.

AWS Glue – Now provides job queuing. If quotas or limits are insufficient to start a Glue job, AWS Glue will now automatically queue the job and wait for limits to free up.

Amazon DynamoDB – Now supports Attribute-Based Access Control (ABAC) for tables and indexes (limited preview). ABAC is an authorization strategy that defines access permissions based on tags attached to users, roles, and AWS resources. Read more in this Database Blog post.

Amazon Bedrock – Stability AI’s top text-to-image models (Stable Image Ultra, Stable Diffusion 3 Large, and Stable Image Core) are now available to generate high-quality visuals with speed and precision.

Amazon Bedrock Agents – Now supports Anthropic Claude 3.5 Sonnet, including Anthropic recommended tool use for function calling which can improve developer and end user experience.

Amazon SageMaker Studio – You can now use Amazon EMR Serverless directly from your Studio Notebooks to interactively query, explore, and visualize data, and run Apache Spark jobs.

Amazon SageMaker – Introducing sagemaker-core, a new Python SDK that provides an object-oriented interface for interacting with SageMaker resources such as TrainingJob, Model, and Endpoint resource classes.

AWS AppSync – Improves monitoring by including DEBUG and INFO logging levels for its GraphQL APIs. You now have more granular control over log verbosity to make it easier to troubleshoot your APIs while optimizing readability and costs.

Amazon WorkSpaces Pools – You can now bring your Windows 10 or 11 licenses and provide a consistent desktop experience when switching between on-premises and virtual desktops.

Amazon SES – A new enhanced onboarding experience to help discover and activate key SES features, including recommendations for optimal setup and the option to enable the Virtual Deliverability Manager to enhance email deliverability.

Amazon Redshift – The Amazon Redshift Data API now supports session reuse to retain the context of a session from one query execution to another, reducing connection setup latency on repeated queries to the same data warehouse (see the sketch below).
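Here is a sketch of what session reuse might look like from the AWS CLI (flag names follow the ExecuteStatement API's SessionKeepAliveSeconds and SessionId parameters; the workgroup and database names are placeholders):

# First call: keep the session alive for 5 minutes; the response includes a SessionId
aws redshift-data execute-statement \
    --workgroup-name my-workgroup \
    --database dev \
    --session-keep-alive-seconds 300 \
    --sql "CREATE TEMPORARY TABLE stage AS SELECT 1 AS id"

# Follow-up call: reuse the session (and its temporary tables) via the SessionId
aws redshift-data execute-statement \
    --session-id SESSION-ID-FROM-FIRST-CALL \
    --sql "SELECT * FROM stage"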

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

Amazon Q Developer Code Challenge – At the 2024 AWS Summit in Sydney, we put two teams (one using Amazon Q Developer, one not) in a battle of coding prowess, with tasks ranging from basic math and string manipulation up to complex algorithms and intricate ciphers. Here are the results.

Amazon Q Developer Code Challenge graph

AWS named as a Leader in the first Gartner Magic Quadrant for AI Code Assistants – It’s great to see how new technologies make the whole software development lifecycle easier and increase developer productivity.

Build powerful RAG pipelines with LlamaIndex and Amazon Bedrock – A deep dive tutorial that covers simple and advanced use cases.

Evaluating prompts at scale with Prompt Management and Prompt Flows for Amazon Bedrock – How to implement an automated prompt evaluation system that streamlines prompt development and improves the overall quality of AI-generated content.

Amazon Redshift data ingestion options – An overview of the available ingestion methods and how they work for different use cases.

Amazon Redshift data ingestion options

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. AWS Summits for this year are coming to an end. There are two more left for which you can still register: Toronto (September 11) and Ottawa (October 9).

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs driven by expert AWS users and industry leaders from around the world. Upcoming AWS Community Days are in the SF Bay Area (September 13), where our own Antje Barth is a keynote speaker, Argentina (September 14), Armenia (September 14), and DACH (in Munich on September 17).

AWS GenAI Lofts – Collaborative spaces and immersive experiences that showcase AWS’s cloud and AI expertise, while providing startups and developers with hands-on access to AI products and services, exclusive sessions with industry leaders, and valuable networking opportunities with investors and peers. Find a GenAI Loft location near you and don’t forget to register.

Browse all upcoming AWS-led in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Danilo

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Wireshark 4.4's IP Address Functions, (Mon, Sep 9th)

New IP address functions have been added in Wireshark 4.4. (If you use Wireshark on Windows, there's a bug in release 4.4.0: the DLL with these functions is missing. It will be included in release 4.4.1; all is fine with the Linux and Mac versions of Wireshark.)

These are the functions:

They are explained in the Wireshark filter manual under "Functions".

Function ip_rfc1918, for example, returns True when the argument of this function is a private use IPv4 address. It can be used as a display filter, like this:
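For example, a display filter along these lines (a sketch of the syntax; the original post shows it applied in the Wireshark UI) keeps only packets with a private source or destination address:

ip_rfc1918(ip.src) or ip_rfc1918(ip.dst)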

These functions can also be used in custom columns, like function ip_special_name that returns the IP special-purpose block name as a string:
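A custom column could, for instance, be defined with an expression like this (again, a syntax sketch):

ip_special_name(ip.src)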

To summarize: these functions were introduced with Wireshark release 4.4, and they work everywhere except in the Windows 4.4.0 release. I used release candidate 4.4.1 to take these screenshots, as the missing DLL (ipaddress.dll) is present in that package.

 

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Password Cracking & Energy: More Details, (Sun, Sep 8th)

Here are more details on the power consumption of my desktop computer when I crack passwords (see the diary entry "Quickie: Password Cracking & Energy").

The vertical scale of this chart is expressed in Watts:

  1. 0 Watt: my desktop computer is turned off
  2. 76 Watt average: my desktop computer is turned on & idling
  3. 151 Watt average: hashcat is running in dictionary attack mode cracking SHA256 hashes
  4. 445 Watt average: hashcat is running in brute-force attack mode cracking SHA256 hashes

The most power is required (445 Watt) when hashcat is using the GPU (NVIDIA GeForce RTX 3080) in brute-force attack mode. For comparison, 445 Watt average continuous is enough to heat my office in winter to a nice & comfy temperature: I don't need central heating in that room when hashcat is running for many hours.

You might wonder if 445 Watt is enough for that, given that electrical heaters typically come in 1000+ Watt models. But electrical heaters don't consume electrical power constantly to heat a room: they have a thermostat that shuts off current flow regularly when the desired room temperature is reached. They are more powerful so that they can heat up a room faster. My desktop computer, in contrast, requires 445 Watt continuously when cracking with the GPU.
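As a quick back-of-the-envelope calculation (the electricity price is an assumed value for illustration only):

# Energy and cost of a full day of GPU cracking at 445 Watt continuous draw
watts = 445
hours = 24
kwh = watts * hours / 1000        # 445 W * 24 h = 10.68 kWh
price_per_kwh = 0.30              # assumed price in EUR per kWh
print(f"{kwh:.2f} kWh/day, about {kwh * price_per_kwh:.2f} EUR/day")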

 

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Python & Notepad++, (Sat, Sep 7th)

PythonScript is a Notepad++ plugin that provides a Python interpreter to edit Notepad++ documents.

You install PythonScript in Notepad++ like this:

Use "New Script" to create a new Python script:

As an example, I will create a template substitution script, something that I use often. You provide a substitution template as input, and then each line of the open document is substituted according to the given template.

First we create the script substitute.py:

This is the template substitution script I developed:

def Substitute(contents, lineNumber, totalLines):
    contents = contents.rstrip('\n\r')
    if contents != '':
        editor.replaceLine(lineNumber, template.replace(token, contents))

token = notepad.prompt('Provide a token', 'Substitute token', '%%')
template = notepad.prompt('Provide a template', 'Substitute template', '')
if token is not None and template is not None:
    editor.forEachLine(Substitute)

You can paste it into Notepad++:

I will now demonstrate the script on a new document I created in Notepad++: the list of today's top 10 scanning IP addresses:

For each IP address, I want to generate a command that I will then execute.

The script can now be invoked to be executed on this open document like this:

The first line of the Python script substitute.py to be executed is line 6 (token = notepad.prompt…). It prompts the user for a token string (default %%); this is a string that, when used in the template string, will be replaced by each line in the open document.

Line 7 prompts the user for a template string:

When the user has not cancelled answering the prompts (tested in line 8), line 9 (editor.forEachLine(Substitute)) is executed: it runs function Substitute on each line of the document:

Then I can copy/paste all these generated commands into a cmd.exe console:

This example is a bit contrived, as you could also use a for loop in the scripting shell to achieve the same result.
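For example, something like this in the PythonScript console would produce one command per line (a sketch; editor.getText is part of the PythonScript editor object, and ping is just an example command):

# Build one command for each non-empty line of the open document
for line in editor.getText().splitlines():
    if line.strip() != '':
        print('ping ' + line.strip())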

I also use this Python script often when I'm programming. Say that I want to hardcode this list of scanning IP addresses in a Python script. I can quickly create a Python list as follows:

And then I add the variable assignment statement and create a list:

 

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

Scans for Moodle Learning Platform Following Recent Update, (Wed, Sep 4th)

On August 10th, the popular learning platform "Moodle" released an update fixing CVE-2024-43425. RedTeam Pentesting found the vulnerability and published a detailed blog post late last week. The blog post demonstrates in detail how a user with the "trainer" role could execute arbitrary code on the server. A trainer would have to publish a "calculated question". These questions are generated dynamically by evaluating a formula. Sadly, the formula was evaluated using PHP's "eval" command. As pointed out by RedTeam Pentesting, "eval" is a very dangerous command to use and should be avoided if at all possible. This applies not only to PHP but to most languages (also see my video about command injection vulnerabilities). As I usually say: "eval is only one letter away from evil".
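To illustrate the class of bug (a Python sketch for brevity, not the actual Moodle PHP code): once attacker-controlled input reaches eval, a "formula" can smuggle in arbitrary code:

# A naive calculated-question evaluator: substitute a value, then eval the formula
def evaluate_formula(formula, x):
    return eval(formula.replace('{x}', str(x)))   # dangerous: executes arbitrary expressions

print(evaluate_formula('2 * {x} + 1', 3))          # benign formula: prints 7

# A malicious "formula" runs a shell command instead of doing math
malicious = "__import__('os').system('id') or {x}"
evaluate_formula(malicious, 3)                     # executes the id command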

The exploit does require the attacker to be able to publish questions. However, Moodle is used by larger organizations like Universities. An attacker may be able to obtain credentials as a "trainer" via brute forcing or credential stuffing.

I got pointed to "Moodle" after seeing this URL in our "First Seen" list of newly accessed URLs:

/lib/ajax/service.php?info=tool_mobile_get_public_config&lang=en

This "public config" may return additional details in some cases, but from my tests with a demo instance of Moodle, it only returns:

 {"error":"Coding error detected, it must be fixed by a programmer: Invalid json in request: Syntax error","errorcode":"codingerror","stacktrace":null,"debuginfo":null,"reproductionlink":null}

At least this URL could be used to find Moodle instances and probe them later with more specific exploits. I will have to add this case to our honeypot responses to get more details. These scans are not new, but we had only individual scans (one or two per day) so they never passed our threshold as "significant". Only yesterday did they pass the "line".

But in the meantime:

  1. Keep Moodle up to date (they do have a decent chart outlining support timeframes for different versions)
  2. Audit the "trainer" accounts, not just because of the vulnerability, but in general, they can cause damage to the system.
  3. Let me know if you have additional insight into Moodle. Is there something else that this URL could trigger?


Johannes B. Ullrich, Ph.D., Dean of Research, SANS.edu

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

AWS named as a Leader in the first Gartner Magic Quadrant for AI Code Assistants

On August 19th, 2024, Gartner published its first Magic Quadrant for AI Code Assistants, which includes Amazon Web Services (AWS). Amazon Q Developer qualified for inclusion, having launched in general availability on April 30, 2024. AWS was ranked as a Leader for its ability to execute and completeness of vision.

We believe this Leader placement reflects our rapid pace of innovation, which makes the whole software development lifecycle easier and increases developer productivity with enterprise-grade access controls and security.

The Gartner Magic Quadrant evaluates 12 AI code assistants based on their Ability to Execute, which measures a vendor’s capacity to deliver its products or services effectively, and Completeness of Vision, which assesses a vendor’s understanding of the market and its strategy for future growth, according to Gartner’s report, How Markets and Vendors Are Evaluated in Gartner Magic Quadrants.

Here is the graphical representation of the 2024 Gartner Magic Quadrant for AI Code Assistants.

Here is the quote from Gartner’s report:

Amazon Web Services (AWS) is a Leader in this Magic Quadrant. Its product, Amazon Q Developer (formerly CodeWhisperer), is focused on assisting and automating developer tasks using AI. For example, Amazon Q Developer helps with code suggestions and transformation, testing and security, as well as feature development. Its operations are geographically diverse, and its clients are of all sizes. AWS is focused on delivering AI-driven solutions that enhance the software development life cycle (SDLC), automating complex tasks, optimizing performance, ensuring security, and driving innovation.

My team focuses on creating content about Amazon Q Developer that directly supports software developers’ jobs-to-be-done, enabled and enhanced by generative AI, in the Amazon Q Developer Center and on Community.aws.

I’ve had the chance to talk with our customers to ask why they choose Amazon Q Developer. They said it accelerates and completes tasks across the SDLC to a much greater extent than general AI code assistants—from coding, testing, and upgrading, to troubleshooting, performing security scans and fixes, optimizing AWS resources, and creating data engineering pipelines.

Here are the highlights that customers talked about more often:

Available everywhere you need it – You can use Amazon Q Developer in any of the following integrated development environments (IDEs): Visual Studio Code, JetBrains IDEs, AWS Toolkit with Amazon Q, JupyterLab, Amazon EMR Studio, Amazon SageMaker Studio, or AWS Glue Studio. You can also use Amazon Q Developer in the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS documentation, AWS Support, AWS Console Mobile Application, Amazon CodeCatalyst, or through Slack and Microsoft Teams with AWS Chatbot. According to Safe Software, “Amazon Q knows all the ways to make use of the many tools that AWS provides. Because we are now able to accomplish more, we will be able to extend our automations into other AWS services and make use of Amazon Q to help us get there.” To learn more, visit Amazon Q Developer features and Amazon Q Developer customers.

Customizing code recommendations – You can get code recommendations based on your internal code base. Amazon Q Developer accelerates onboarding to a new code base to generate even more relevant inline code recommendations and chat responses (in preview) by making it aware of your internal libraries, APIs, best practices, and architectural patterns. Your organization’s administrators can securely connect Amazon Q Developer to your internal code bases to create multiple customizations. According to National Australia Bank (NAB), NAB has now added specific suggestions using the Amazon Q customization capability that are tailored to the NAB coding standards. They’re seeing increased acceptance rates of 60 percent with customization. To learn more, visit Customizing suggestions in the AWS documentation.

Upgrading your Java applications – Amazon Q Developer Agent for code transformation automates the process of upgrading and transforming your legacy Java applications. According to an internal Amazon study, Amazon has migrated tens of thousands of production applications from Java 8 or 11 to Java 17 with assistance from Amazon Q Developer. This represents a savings of over 4,500 years of development work for over a thousand developers (when compared to manual upgrades) and performance improvements worth $260 million in annual cost savings. Transformations from Windows to cross-platform .NET are also coming soon! To learn more, visit Upgrading language versions with the Amazon Q Developer Agent for code transformation in the AWS documentation.

Access the complete 2024 Gartner Magic Quadrant for AI Code Assistants report to learn more.

Channy

Gartner Magic Quadrant for AI Code Assistants, Arun Batchu, Philip Walsh, Matt Brasier, Haritha Khandabattu, 19 August, 2024.

Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

GARTNER is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.

Protected OOXML Text Documents, (Mon, Sep 2nd)

Just like "Protected OOXML Spreadsheets", Word documents can also be protected:

You have to look into the word/settings.xml file, and search for element w:documentProtection:

The hash algorithm is the same as for OOXML spreadsheets. However, you will not be able to use hashcat to crack protected Word document hashes, because the password is encoded differently before it is repeatedly hashed.

A legacy algorithm is used to preprocess the password, and I found a Python implementation here.

# https://stackoverflow.com/questions/65877620/open-xml-document-protection-implementation-documentprotection-class
dHighOrderWordLists = [
    [0xE1, 0xF0],
    [0x1D, 0x0F],
    [0xCC, 0x9C],
    [0x84, 0xC0],
    [0x11, 0x0C],
    [0x0E, 0x10],
    [0xF1, 0xCE],
    [0x31, 0x3E],
    [0x18, 0x72],
    [0xE1, 0x39],
    [0xD4, 0x0F],
    [0x84, 0xF9],
    [0x28, 0x0C],
    [0xA9, 0x6A],
    [0x4E, 0xC3]
]

dEncryptionMatrix = [
    [[0xAE, 0xFC], [0x4D, 0xD9], [0x9B, 0xB2], [0x27, 0x45], [0x4E, 0x8A], [0x9D, 0x14], [0x2A, 0x09]],
    [[0x7B, 0x61], [0xF6, 0xC2], [0xFD, 0xA5], [0xEB, 0x6B], [0xC6, 0xF7], [0x9D, 0xCF], [0x2B, 0xBF]],
    [[0x45, 0x63], [0x8A, 0xC6], [0x05, 0xAD], [0x0B, 0x5A], [0x16, 0xB4], [0x2D, 0x68], [0x5A, 0xD0]],
    [[0x03, 0x75], [0x06, 0xEA], [0x0D, 0xD4], [0x1B, 0xA8], [0x37, 0x50], [0x6E, 0xA0], [0xDD, 0x40]],
    [[0xD8, 0x49], [0xA0, 0xB3], [0x51, 0x47], [0xA2, 0x8E], [0x55, 0x3D], [0xAA, 0x7A], [0x44, 0xD5]],
    [[0x6F, 0x45], [0xDE, 0x8A], [0xAD, 0x35], [0x4A, 0x4B], [0x94, 0x96], [0x39, 0x0D], [0x72, 0x1A]],
    [[0xEB, 0x23], [0xC6, 0x67], [0x9C, 0xEF], [0x29, 0xFF], [0x53, 0xFE], [0xA7, 0xFC], [0x5F, 0xD9]],
    [[0x47, 0xD3], [0x8F, 0xA6], [0x0F, 0x6D], [0x1E, 0xDA], [0x3D, 0xB4], [0x7B, 0x68], [0xF6, 0xD0]],
    [[0xB8, 0x61], [0x60, 0xE3], [0xC1, 0xC6], [0x93, 0xAD], [0x37, 0x7B], [0x6E, 0xF6], [0xDD, 0xEC]],
    [[0x45, 0xA0], [0x8B, 0x40], [0x06, 0xA1], [0x0D, 0x42], [0x1A, 0x84], [0x35, 0x08], [0x6A, 0x10]],
    [[0xAA, 0x51], [0x44, 0x83], [0x89, 0x06], [0x02, 0x2D], [0x04, 0x5A], [0x08, 0xB4], [0x11, 0x68]],
    [[0x76, 0xB4], [0xED, 0x68], [0xCA, 0xF1], [0x85, 0xC3], [0x1B, 0xA7], [0x37, 0x4E], [0x6E, 0x9C]],
    [[0x37, 0x30], [0x6E, 0x60], [0xDC, 0xC0], [0xA9, 0xA1], [0x43, 0x63], [0x86, 0xC6], [0x1D, 0xAD]],
    [[0x33, 0x31], [0x66, 0x62], [0xCC, 0xC4], [0x89, 0xA9], [0x03, 0x73], [0x06, 0xE6], [0x0D, 0xCC]],
    [[0x10, 0x21], [0x20, 0x42], [0x40, 0x84], [0x81, 0x08], [0x12, 0x31], [0x24, 0x62], [0x48, 0xC4]]
]


def WordEncodePassword(password):
  password_bytes = password.encode('utf-8')
  password_bytes = password_bytes[:15]

  password_length = len(password_bytes)

  if password_length > 0:
    high_order_word_list = dHighOrderWordLists[password_length - 1].copy()
  else:
    high_order_word_list = [0x00, 0x00]

  for i in range(password_length):
    password_byte = password_bytes[i]
    matrix_index = i + len(dEncryptionMatrix) - password_length

    for j in range(len(dEncryptionMatrix[0])):
      # Only perform XOR operation using the encryption matrix if the j-th bit is set
      mask = 1 << j
      if (password_byte & mask) == 0:
        continue

      for k in range(len(dEncryptionMatrix[0][0])):
        high_order_word_list[k] = high_order_word_list[k] ^ dEncryptionMatrix[matrix_index][j][k]

  low_order_word = 0x0000

  for i in range(password_length - 1, -1, -1):
    password_byte = password_bytes[i]
    low_order_word = (
      (((low_order_word >> 14) & 0x0001) | ((low_order_word << 1) & 0x7fff))
      ^ password_byte
    )

  low_order_word = (
    (((low_order_word >> 14) & 0x0001) | ((low_order_word << 1) & 0x7fff))
    ^ password_length
    ^ 0xce4b
  )

  low_order_word_list = [(low_order_word & 0xff00) >> 8, low_order_word & 0x00ff]

  key = high_order_word_list + low_order_word_list
  key.reverse()

  # `key_str` is a hex string with uppercase hexadecimal letters, e.g. '7EEDCE64'
  key_str = ''.join(f'{c:02X}' for c in key)  # zero-pad each byte to two hex digits

  return key_str

This password preprocessing code can then be used with the same hashing function as for Excel, like this:

import binascii
import hashlib
import struct

def CalculateHash(password, salt):
    passwordBytes = password.encode('utf16')[2:]
    buffer = salt + passwordBytes
    hash = hashlib.sha512(buffer).digest()
    for iter in range(100000):
        buffer = hash + struct.pack('<I', iter)
        hash = hashlib.sha512(buffer).digest()
    return hash

def WordCalculateHash(password, salt):
    return CalculateHash(WordEncodePassword(password), binascii.a2b_base64(salt))

Using password "P@ssw0rd" and the salt seen in the screenshot above, we can calculate the hash:

This calculated hash (BASE64 representation) is the same as the stored hash, thus the password is indeed "P@ssw0rd".
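Expressed in code, that comparison might look like this (a sketch; the two base64 strings are placeholders for the w:salt and w:hash attribute values from word/settings.xml):

stored_salt = 'BASE64-SALT-FROM-SETTINGS-XML'
stored_hash = 'BASE64-HASH-FROM-SETTINGS-XML'

computed = WordCalculateHash('P@ssw0rd', stored_salt)
computed_b64 = binascii.b2a_base64(computed).decode().strip()

# The password is correct if the computed hash matches the stored hash
print(computed_b64 == stored_hash)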

 

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

AWS Weekly Roundup: AWS Parallel Computing Service, Amazon EC2 status checks, and more (September 2, 2024)

With the arrival of September, AWS re:Invent 2024 is now 3 months away and I am very excited for the new upcoming services and announcements at the conference. I remember attending re:Invent 2019, just before the COVID-19 pandemic. It was the biggest in-person re:Invent with 60,000+ attendees and it was my second one. It was amazing to be in that atmosphere! Registration is now open for AWS re:Invent 2024. Come join us in Las Vegas for five exciting days of keynotes, breakout sessions, chalk talks, interactive learning opportunities, and career-changing connections!

Now let’s look at the last week’s new announcements.

Last week’s launches
Here are the launches that got my attention.

Announcing AWS Parallel Computing Service – AWS Parallel Computing Service (AWS PCS) is a new managed service that lets you run and scale high performance computing (HPC) workloads on AWS. You can build scientific and engineering models and run simulations using a fully managed Slurm scheduler with built-in technical support and a rich set of customization options. Tailor your HPC environment to your specific needs and integrate it with your preferred software stack. Build complete HPC clusters that integrate compute, storage, networking, and visualization resources, and seamlessly scale from zero to thousands of instances. To learn more, visit AWS Parallel Computing Service and read Channy’s blog post.

Amazon EC2 status checks now support reachability health of attached EBS volumes – You can now use Amazon EC2 status checks to directly monitor if the Amazon EBS volumes attached to your instances are reachable and able to complete I/O operations. With this new status check, you can quickly detect attachment issues or volume impairments that may impact the performance of your applications running on Amazon EC2 instances. You can further integrate these status checks within Auto Scaling groups to monitor the health of EC2 instances and replace impacted instances to ensure high availability and reliability of your applications. Attached EBS status checks can be used along with the instance status and system status checks to monitor the health of your instances. To learn more, refer to the Status checks for Amazon EC2 instances documentation.

Amazon QuickSight now supports sharing views of embedded dashboards – You can now share views of embedded dashboards in Amazon QuickSight. This feature allows you to enable more collaborative capabilities in your application with embedded QuickSight dashboards. Additionally, you can enable personalization capabilities such as bookmarks for anonymous users. You can share a unique link that displays only your changes while staying within the application, and use dashboard or console embedding to generate a shareable link to your application page with QuickSight’s reference encapsulated using the QuickSight Embedding SDK. QuickSight Readers can then send this shareable link to their peers. When their peer accesses the shared link, they are taken to the page on the application that contains the embedded QuickSight dashboard. For more information, refer to Embedded view documentation.

Amazon Q Business launches IAM federation for user identity authentication – Amazon Q Business is a fully managed service that deploys a generative AI business expert for your enterprise data. You can use the Amazon Q Business IAM federation feature to connect your applications directly to your identity provider to source user identity and user attributes for these applications. Previously, you had to sync your user identity information from your identity provider into AWS IAM Identity Center, and then connect your Amazon Q Business applications to IAM Identity Center for user authentication. At launch, Amazon Q Business IAM federation will support the OpenID Connect (OIDC) and SAML2.0 protocols for identity provider connectivity. To learn more, visit Amazon Q Business documentation.

Amazon Bedrock now supports cross-Region inference – Amazon Bedrock announces support for cross-Region inference, an optional feature that enables you to seamlessly manage traffic bursts by utilizing compute across different AWS Regions. If you are using on-demand mode, you’ll be able to get higher throughput limits (up to 2x your allocated in-Region quotas) and enhanced resilience during periods of peak demand by using cross-Region inference. By opting in, you no longer have to spend time and effort predicting demand fluctuations. Instead, cross-Region inference dynamically routes traffic across multiple Regions, ensuring optimal availability for each request and smoother performance during high-usage periods. You can control where your inference data flows by selecting from a pre-defined set of Regions, helping you comply with applicable data residency requirements and sovereignty laws. Find the list at Supported Regions and models for cross-Region inference. To get started, refer to the Amazon Bedrock documentation or this Machine Learning blog.
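As a sketch of what using it looks like, requests are addressed to a cross-Region inference profile ID instead of a plain model ID (the exact profile ID below is an assumption for illustration):

aws bedrock-runtime converse \
    --model-id us.anthropic.claude-3-5-sonnet-20240620-v1:0 \
    --messages '[{"role": "user", "content": [{"text": "Hello!"}]}]'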

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

We launched existing services and instance types in additional Regions:

Other AWS events
AWS GenAI Lofts are collaborative spaces and immersive experiences that showcase AWS’s cloud and AI expertise, while providing startups and developers with hands-on access to AI products and services, exclusive sessions with industry leaders, and valuable networking opportunities with investors and peers. Find a GenAI Loft location near you and don’t forget to register.

Gen AI loft workshop

credit: Antje Barth

Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

AWS Summits are free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. AWS Summits for this year are coming to an end. There are three more left for which you can still register: Jakarta (September 5), Toronto (September 11), and Ottawa (October 9).

AWS Community Days feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. While AWS Summits 2024 are almost over, AWS Community Days are in full swing. Upcoming AWS Community Days are in Belfast (September 6), SF Bay Area (September 13), where our own Antje Barth is a keynote speaker, Argentina (September 14), and Armenia (September 14).

Browse all upcoming AWS-led in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Esra

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!