Amazon WorkSpaces Pools: Cost-effective, non-persistent virtual desktops

You can now create a pool of non-persistent virtual desktops using Amazon WorkSpaces and share them across a group of users. As the desktop administrator, you can manage your entire portfolio of persistent and non-persistent virtual desktops using one GUI, command line, or set of API-powered tools. Your users can log in to these desktops using a browser, a client application (Windows, Mac, or Linux), or a thin client device.

Amazon WorkSpaces Pools (non-persistent desktops)
WorkSpaces Pools ensures that each user gets the same applications and the same experience. When the user logs in, they always get access to a fresh WorkSpace that’s based on the latest configuration for the pool, centrally managed by their administrator. If the administrator enables application settings persistence for the pool, users can configure certain application settings, such as browser favorites, plugins, and UI customizations. Users can also access persistent file or object storage external to the desktop.

These desktops are a great fit for many types of users and use cases including remote workers, task workers (shared service centers, finance, procurement, HR, and so forth), contact center workers, and students.

As the administrator for the pool, you have full control over the compute resources (bundle type) and the initial configuration of the desktops in the pool, including the set of applications that are available to the users. You can use an existing custom WorkSpaces image, create a new one, or use one of the standard ones. You can also include Microsoft 365 Apps for Enterprise on the image. You can configure the pool to accommodate the size and working hours of your user base, and you can optionally join the pool to your organization’s domain and Active Directory.

Getting started
Let’s walk through the process of setting up a pool and inviting some users. I open the WorkSpaces console and choose Pools to get started:

I have no pools, so I choose Create WorkSpace on the Pools tab to begin the process of creating a pool:

The console can recommend workspace options for me, or I can choose what I want. I leave Recommend workspace options… selected, and choose No – non-persistent to create a pool of non-persistent desktops. Then I select my use cases from the menu and pick the operating system and choose Next to proceed:

The use case menu has lots of options:

On the next page I start by reviewing the WorkSpace options and assigning a name to my pool:

Next, I scroll down and choose a bundle. I can pick a public bundle or a custom one of my own. Bundles must use the WSP 2.0 protocol. I can create a custom bundle to provide my users with access to applications or to alter any desired system settings.

Moving right along, I can customize the settings for each user session. I can also enable application settings persistence to save application customizations and Windows settings on a per-user basis between sessions:

Next, I set the capacity of my pool, and optionally establish one or more schedules based on date or time. The schedules give me the power to match the size of my pool (and hence my costs) to the rhythms and needs of my users:

If the amount of concurrent usage is more dynamic and not aligned to a schedule, then I can use manual scale out and scale in policies to control the size of my pool:


I tag my pool, and then choose Next to proceed:

The final step is to select a WorkSpaces Pools directory or create a new one. Then, I choose Create WorkSpace pool.

WorkSpaces Pools Directory

After the pool has been created and started, I can send registration codes to users, and they can log in to a WorkSpace:

WorkSpaces Pools Login with Registration Code

I can monitor the status of the pool from the console:

WorkSpaces Pool Status On Console

Things to know
Here are a couple of things that you should know about WorkSpaces Pools:

Programmatic access – You can automate the setup process that I showed above by using functions like CreateWorkSpacePool, DescribeWorkSpacePool, UpdateWorkSpacePool, or the equivalent AWS command line interface (CLI) commands.
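For illustration, automating this with the AWS CLI might look something like the following sketch. The command and option names here are assumptions based on the API names above, so check the WorkSpaces CLI reference for the exact shapes and required options:

# List the WorkSpaces Pools in the account
aws workspaces describe-workspaces-pools

# Create a pool (illustrative placeholder values for the name, bundle, directory, and capacity)
aws workspaces create-workspaces-pool \
    --pool-name finance-task-workers \
    --description "Non-persistent desktops for finance task workers" \
    --bundle-id wsb-0123456789abcdef0 \
    --directory-id d-0123456789ab \
    --capacity DesiredUserSessions=25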

Regions – WorkSpaces Pools is available in all commercial AWS Regions where WorkSpaces Personal is available, except Israel (Tel Aviv), Africa (Cape Town), and China (Ningxia). Check the full Region list for future updates.

Pricing – Refer to the Amazon WorkSpaces Pricing page for complete pricing information.

Visit Amazon WorkSpaces Pools to learn more.

Jeff;

Introducing end-to-end data lineage (preview) visualization in Amazon DataZone

Amazon DataZone is a data management service to catalog, discover, analyze, share, and govern data between data producers and consumers in your organization. Engineers, data scientists, product managers, analysts, and business users can easily access data throughout your organization using a unified data portal so that they can discover, use, and collaborate to derive data-driven insights.

Now, I am excited to announce in preview a new API-driven and OpenLineage-compatible data lineage capability in Amazon DataZone, which provides an end-to-end view of data movement over time. Data lineage is a new feature within Amazon DataZone that helps users visualize and understand data provenance, trace change management, conduct root cause analysis when a data error is reported, and be prepared for questions about data movement from source to target. The feature provides a comprehensive view of lineage events by stitching together, for each asset, events captured automatically from the Amazon DataZone catalog with events captured programmatically outside of Amazon DataZone.

When you need to validate how the data of interest originated in the organization, you may rely on manual documentation or human connections. This manual process is time-consuming and can result in inconsistency, which directly reduces your trust in the data. Data lineage in Amazon DataZone can raise trust by helping you understand where the data originated, how it has changed, and how it is consumed over time. For example, data lineage can be set up programmatically to show the data from the time it was captured as raw files in Amazon Simple Storage Service (Amazon S3), through its ETL transformations using AWS Glue, to the time it was consumed in tools such as Amazon QuickSight.

With Amazon DataZone’s data lineage, you can reduce the time spent mapping a data asset and its relationships, troubleshooting and developing pipelines, and asserting data governance practices. Data lineage helps you gather all lineage information in one place using APIs, and then provides a graphical view with which data users can be more productive, make better data-driven decisions, and identify the root cause of data issues.

Let me tell you how to get started with data lineage in Amazon DataZone. Then, I will show you how data lineage enhances the Amazon DataZone data catalog experience by visually displaying connections about how a data asset came to be so you can make informed decisions when searching or using the data asset.

Getting started with data lineage in Amazon DataZone
In preview, I can get started by hydrating lineage information into Amazon DataZone programmatically, either by directly creating lineage nodes using the Amazon DataZone APIs or by sending OpenLineage-compatible events from existing pipeline components to capture data movement or transformations that happen outside of Amazon DataZone. For assets in the catalog, Amazon DataZone automatically captures the lineage of their states (that is, inventory or published) and of their subscriptions, so that producers, such as data engineers, can trace who is consuming the data they produced, and data consumers, such as data analysts or data engineers, can confirm they are using the right data for their analysis.

With the information being sent, Amazon DataZone starts populating the lineage model and can map the identifiers sent through the APIs to the assets already cataloged. As new lineage information arrives, the model creates versions, so I can visualize the asset as it was at a given point in time and also navigate to previous versions.
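For the parts of a pipeline that run outside of Amazon DataZone, lineage can be pushed as OpenLineage-style run events. Here is a minimal sketch using the AWS CLI; the domain identifier, job and dataset names are placeholders, and the exact event shape and binary-parameter handling should be checked against the OpenLineage specification and the Amazon DataZone API reference:

# Build an OpenLineage-style run event for an ETL job that writes curated data to Amazon S3
cat << 'EOF' > lineage-event.json
{
  "eventType": "COMPLETE",
  "eventTime": "2024-06-20T12:00:00Z",
  "producer": "https://example.com/sales-etl-pipeline",
  "schemaURL": "https://openlineage.io/spec/1-0-5/OpenLineage.json",
  "run": { "runId": "2d1a4f5e-7b7c-4b9a-9a1e-123456789abc" },
  "job": { "namespace": "sales-etl", "name": "raw_to_curated" },
  "inputs":  [ { "namespace": "s3://raw-sales-bucket", "name": "market_sales_raw" } ],
  "outputs": [ { "namespace": "s3://curated-sales-bucket", "name": "market_sales_table" } ]
}
EOF

# Send the event to the Amazon DataZone domain (dzd_abc123 is a placeholder);
# depending on your CLI version you may need fileb:// or a base64-encoded blob here
aws datazone post-lineage-event \
    --domain-identifier dzd_abc123 \
    --event file://lineage-event.json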

I use a preconfigured Amazon DataZone domain for this use case. I use Amazon DataZone domains to organize my data assets, users, and projects. I go to the Amazon DataZone console and choose View domains. I choose my domain Sales_Domain and choose Open data portal.

I have five projects under my domain: one for a data producer (SalesProject) and four for data consumers (MarketingTestProject, AdCampaignProject, SocialCampaignProject, and WebCampaignProject). You can visit Amazon DataZone Now Generally Available – Collaborate on Data Projects across Organizational Boundaries to create your own domain and all the core components.

I enter “Market Sales Table” in the Search Assets bar and then go to the detail page for the Market Sales Table asset. I choose the LINEAGE tab to visualize lineage with upstream and downstream nodes.

I can now dive into asset details, processes, or jobs that lead to or from those assets and drill into column-level lineage.

Interactive visualization with data lineage
I will show you the graphical interface using various personas who regularly interact with Amazon DataZone and will benefit from the data lineage feature.

First, let’s say I am a marketing analyst who needs to confirm the origin of a data asset before confidently using it in my analysis. I go to the MarketingTestProject page and choose the LINEAGE tab. I notice the lineage includes information about the asset as it occurs both inside and outside of Amazon DataZone. The labels Cataloged, Published, and Access requested represent actions inside the catalog. I expand the market_sales dataset item to see where the data came from.

I now feel assured of the origin of the data asset and trust that it aligns with my business purpose ahead of starting my analysis.

Second, let’s say I am a data engineer. I need to understand the impact of my work on dependent objects to avoid unintended changes. As a data engineer, any changes made to the system should not break any downstream processes. By browsing lineage, I can clearly see who has subscribed and has access to the asset. With this information, I can inform the project teams about an impending change that can affect their pipeline. When a data issue is reported, I can investigate each node and traverse between its versions to dive into what has changed over time to identify the root cause of the issue and fix it in a timely manner.

Finally, as an administrator or steward, I am responsible for securing data, standardizing business taxonomies, enacting data management processes, and for general catalog management. I need to collect details about the source of data and understand the transformations that have happened along the way.

For example, as an administrator looking to respond to questions from an auditor, I traverse the graph upstream to see where the data is coming from and notice that the data is from two different sources: online sale and in-store sale. These sources have their own pipelines until the flow reaches a point where the pipelines merge.

While navigating through the lineage graph, I can expand the columns to ensure sensitive columns are dropped during the transformation processes and respond to the auditors with details in a timely manner.

Join the preview
Data lineage capability is available in preview in all Regions where Amazon DataZone is generally available. For a list of Regions where Amazon DataZone domains can be provisioned, visit AWS Services by Region.

Data lineage costs are dependent on storage usage and API requests, which are already included in Amazon DataZone’s pricing model. For more details, visit Amazon DataZone pricing.

To learn more about data lineage in Amazon DataZone, visit the Amazon DataZone User Guide.

— Esra

Amazon CodeCatalyst now supports GitLab and Bitbucket repositories, with blueprints and Amazon Q feature development

I’m happy to announce that we’re further integrating Amazon CodeCatalyst with two popular source code hosting services: GitLab and Bitbucket, in addition to the existing integration with GitHub. We bring the same set of capabilities that you use today on CodeCatalyst with GitHub to GitLab.com and Bitbucket Cloud.

Amazon CodeCatalyst is a unified software development and delivery service. It enables software development teams to quickly and easily plan, develop, collaborate on, build, and deliver applications on Amazon Web Services (AWS), reducing friction throughout the development lifecycle.

The GitHub, GitLab.com, and Bitbucket Cloud repositories extension for CodeCatalyst simplifies managing your development workflow. The extension allows you to view and manage external repositories directly within CodeCatalyst. Additionally, you can store and manage workflow definition files alongside your code in external repositories while also creating, reading, updating, and deleting files in linked repositories from CodeCatalyst dev environments. The extension also triggers CodeCatalyst workflow runs automatically upon code pushes and when pull requests are opened, merged, or closed. Furthermore, it allows you to directly utilize source files from linked repositories and execute actions within CodeCatalyst workflows, eliminating the need to switch platforms and maximizing efficiency.

But there’s more: starting today, you can create a CodeCatalyst project in a GitHub, GitLab.com, or Bitbucket Cloud repository from a blueprint; you can add a blueprint to an existing code base in a repository on any of those three systems; and you can create custom blueprints stored in your external repositories hosted on GitHub, GitLab.com, or Bitbucket Cloud.

CodeCatalyst blueprints help to speed up your development. These pre-built templates provide a source repository, sample code, continuous integration and delivery (CI/CD) workflows, and integrated issue tracking to get you started quickly. Blueprints automatically update with best practices, keeping your code modern. IT leaders can create custom blueprints to standardize development for their teams, specifying technology, access controls, deployment, and testing methods. And now, you can use blueprints even if your code resides in GitHub, GitLab.com, or Bitbucket Cloud.

Link your CodeCatalyst space with a git repository hosting service
Getting started using any of these three source code repository providers is easy. As a CodeCatalyst space administrator, I select the space where I want to configure the extensions. Then, I select Settings, and in the Installed extensions section, I select Configure to link my CodeCatalyst space with my GitHub, GitLab.com, or Bitbucket Cloud account.

Link CodeCatalyst with a git repository hosting service

This is a one-time operation for each CodeCatalyst space, but you might want to connect your space to multiple source providers’ accounts.

When using GitHub, I also have to link my personal CodeCatalyst user to my GitHub user. Under my personal menu on the top right side of the screen, I select My settings. Then, I navigate down to the Personal connections section. I select Create and follow the instructions to authenticate on GitHub and link my two identities.

Link personal CodeCatalyst account to your git hosting provider account

This is a one-time operation for each user in the CodeCatalyst space. This is only required when you’re using GitHub with blueprints.

Create a project from a blueprint and host it on GitHub, GitLab.com, and Bitbucket Cloud
Let’s show you how to create a project in an external repository from a blueprint and later add other blueprints to this project. You can use any of the three git hosting providers supported by CodeCatalyst. In this demo, I chose to use GitHub.

Let’s imagine I want to create a new project to implement an API. I start from a blueprint that implements an API with Python and the AWS Serverless Application Model (AWS SAM). The blueprint also creates a CI workflow and an issue management system. I want my project code to be hosted on GitHub. It allows me to directly use source files from my repository in GitHub and execute actions within CodeCatalyst workflows, eliminating the need to switch platforms.

I start by selecting Create project on my CodeCatalyst space page. I select Start with a blueprint and select the CodeCatalyst blueprint or Space blueprint I want to use. Then, I select Next.

Amazon CodeCatalyst create project from blueprint

I enter a name for my project. I open the Advanced section, and I select GitHub as Repository provider and my GitHub account. You can configure additional connections to GitHub by selecting Connect a GitHub account.

Amazon CodeCatalyst - select a github account

The rest of the configuration depends on the selected blueprint. In this case, I chose the language version, the AWS account to deploy the project to, the name of the AWS Lambda function, and the name of the AWS CloudFormation stack.

After the project is created, I navigate to my GitHub account, and I can see that a new repository has been created. It contains the code and resources from the blueprint.

Amazon CodeCatalyst - creation of a new GitHub repository

Add a blueprint to an existing GitHub, GitLab.com, or Bitbucket Cloud project
You can apply multiple blueprints in a project to incorporate functional components, resources, and governance to existing CodeCatalyst projects. Your projects can support various elements that are managed independently in separate blueprints. The service documentation helps you learn more about lifecycle management with blueprints on existing projects.

I can now add a blueprint to an existing project in an external source code repository. Now that my backend API project has been created, I want to add a web application to my project.

I navigate to the Blueprints section in the left-side menu, and I select the orange Add blueprint button on the top-right part of the screen.

CodeCatalyst - add blue print to an existing project

I select the Single-page application blueprint and select Next.

On the next screen, I make sure to select my GitHub connection, as I did when I created the project. I also fill in the required information for this specific template. On the right side of the screen, I review the proposed changes.

CodeCatalyst - add a blueprint to a project in GitHub

Similarly, when using CodeCatalyst Enterprise Tier, I can create my own custom blueprints to share with my teammates or other groups within my organization. For brevity, I don’t share step-by-step instructions to do so in this post. For more information, see Standardizing projects with custom blueprints in the documentation.

When CodeCatalyst finishes installing the new blueprint, I can see a second repository on GitHub.

Amazon CodeCatalyst - multiple repositories

Single or multiple repository strategies
When organizing code, you can choose between a single large repository, like a toolbox overflowing with everything, or splitting it into smaller, specialized ones for better organization. Single repositories simplify dependency management for tightly linked projects but can become messy at scale. Multiple repositories offer cleaner organization and improved security but require planning to manage dependencies between separate projects.

CodeCatalyst lets you use the best strategy for your project. For more information, see the section Store and collaborate on code with source repositories in CodeCatalyst in the documentation.

In the example I showed before, the blueprint I selected proposed to apply the second blueprint as a separate repository in GitHub. Depending on the blueprint you selected, the blueprint may propose that you create a separate repository or merge the new code in an existing repository. In the latter case, the blueprint will submit a pull request for you to merge into your repository.

Region and availability
These new integrations are available at no additional cost in the two AWS Regions where Amazon CodeCatalyst is available at the time of publication: US West (Oregon) and Europe (Ireland).

Try it now!

— seb

Optimizing Amazon Simple Queue Service (SQS) for speed and scale

After several public betas, we launched Amazon Simple Queue Service (Amazon SQS) in 2006. Nearly two decades later, this fully managed service is still a fundamental building block for microservices, distributed systems, and serverless applications, processing over 100 million messages per second at peak times.

Because there’s always a better way, we continue to look for ways to improve performance, security, internal efficiency, and so forth. When we do find a potential way to do something better, we are careful to preserve existing behavior, and often run new and old systems in parallel to allow us to compare results.

Today I would like to tell you how we recently made improvements to Amazon SQS to reduce latency, increase fleet capacity, mitigate an approaching scalability cliff, and reduce power consumption.

Improving SQS
Like many AWS services, Amazon SQS is implemented using a collection of internal microservices. Let’s focus on two of them today:

Customer Front-End – The customer-facing front-end accepts, authenticates, and authorizes API calls such as CreateQueue and SendMessage. It then routes each request to the storage back-end.

Storage Back-End – This internal microservice is responsible for persisting messages sent to standard (non-FIFO) queues. Using a cell-based model, each cluster in the cell contains multiple hosts, each customer queue is assigned to one or more clusters, and each cluster is responsible for a multitude of queues:

Connections – Old and New
The original implementation used a connection per request between these two services. Each front-end had to connect to many hosts, which mandated the use of a connection pool, and also risked reaching an ultimate, hard-wired limit on the number of open connections. While it is often possible to simply throw hardware at problems like this and scale out, that’s not always the best way. It simply moves the moment of truth (the “scalability cliff”) into the future and does not make efficient use of resources.

After carefully considering several long-term solutions, the Amazon SQS team invented a new, proprietary binary framing protocol between the customer front-end and storage back-end. The protocol multiplexes multiple requests and responses across a single connection, using 128-bit IDs and checksumming to prevent crosstalk. Server-side encryption provides an additional layer of protection against unauthorized access to queue data.

It Works!
The new protocol was put into production earlier this year and has processed 744.9 trillion requests as I write this. The scalability cliff has been eliminated and we are already looking for ways to put this new protocol to work in other ways.

Performance-wise, the new protocol has reduced dataplane latency by 11% on average, and by 17.4% at the P90 mark. In addition to making SQS itself more performant, this change benefits services that build on SQS as well. For example, messages sent through Amazon Simple Notification Service (Amazon SNS) now spend 10% less time “inside” before being delivered. Finally, due to the protocol change, the existing fleet of SQS hosts (a mix of X86 and Graviton-powered instances) can now handle 17.8% more requests than before.

More to Come
I hope that you have enjoyed this little peek inside the implementation of Amazon SQS. Let me know in the comments, and I will see if I can find some more stories to share.

Jeff;

AWS Weekly Roundup: Claude 3.5 Sonnet in Amazon Bedrock, CodeCatalyst updates, SageMaker with MLflow, and more (June 24, 2024)

This week, I had the opportunity to try the new Anthropic Claude 3.5 Sonnet model in Amazon Bedrock just before it launched, and I was really impressed by its speed and accuracy!

It was also the week of AWS Summit Japan. JAWS-UG, a Japanese AWS user group, held various sessions with AWS Heroes and Community Builders at the AWS Community Lounge, and many developers participated. Dr. Werner Vogels, a keynote speaker at the Japan Summit, had his first meeting with the Japanese community since 2020. Following the AWS Summit Japan, there was a lively event on Saturday where AWS community leaders from the Northeast Asia region (Japan, China, Hong Kong, Taiwan, and Korea) all gathered together in one place.

2024-aws-summit-tokyo-community

Last week’s launches
With many new capabilities, from recommendations on the size of your Amazon Relational Database Service (Amazon RDS) databases to new built-in transformations in AWS Glue, here’s what got my attention:

Amazon Bedrock – Now supports Anthropic’s Claude 3.5 Sonnet and compressed embeddings from Cohere Embed.

AWS CodeArtifact – With support for Rust packages with Cargo, developers can now store and access their Rust libraries (known as crates).

Amazon CodeCatalyst – Many updates from this unified software development service. You can now assign issues in CodeCatalyst to Amazon Q, direct it to work with source code hosted in GitHub Cloud and Bitbucket Cloud, and ask Amazon Q to analyze issues and recommend granular tasks. These tasks can then be individually assigned to users or to Amazon Q itself. You can also use Amazon Q to help pick the best blueprint for your needs. You can now securely store, publish, and share Maven, Python, and NuGet packages. You can also link an issue to other issues, marking it as blocked by, a duplicate of, related to, or blocking another issue. You can now configure a single CodeBuild webhook at the organization or enterprise level to receive events from all repositories in your organization, instead of creating webhooks for each individual repository. Finally, you can now add a default IAM role to an environment.

Amazon EC2 – C7g and R7g instances (powered by AWS Graviton3 processors) are now available in Europe (Milan), Asia Pacific (Hong Kong), and South America (São Paulo) Regions. C7i-flex instances are now available in US East (Ohio) Region.

AWS Compute Optimizer – Now provides rightsizing recommendations for Amazon RDS for MySQL and RDS for PostgreSQL. More info in this Cloud Financial Management blog post.

Amazon OpenSearch Service – With JSON Web Token (JWT) authentication and authorization, it’s now easier to integrate identity providers and isolate tenants in a multi-tenant application.

Amazon SageMaker – Now helps you manage machine learning (ML) experiments and the entire ML lifecycle with a fully managed MLflow capability.

AWS Glue – The serverless data integration service now offers 13 new built-in transforms: flag duplicates in column, format phone number, format case, fill with mode, flag duplicate rows, remove duplicates, month name, is even, cryptographic hash, decrypt, encrypt, int to IP, and IP to int.

Amazon MWAA – Amazon Managed Workflows for Apache Airflow (MWAA) now supports custom domain names for the Airflow web server, allowing you to use private web servers with load balancers, custom DNS entries, or proxies to point users to a user-friendly web address.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

AWS re:Inforce 2024 re:Cap – A summary of our annual, immersive, cloud-security learning event by my colleague Wojtek.

Three ways Amazon Q Developer agent for code transformation accelerates Java upgrades – This post offers interesting details on how Amazon Q Developer handles major version upgrades of popular frameworks, replacing deprecated API calls on your behalf, and explainability on code changes.

Five ways Amazon Q simplifies AWS CloudFormation development – For template code generation, querying CloudFormation resource requirements, explaining existing template code, understanding deployment options and issues, and querying CloudFormation documentation.

Improving air quality with generative AI – A nice solution that uses artificial intelligence (AI) to standardize air quality data, addressing the air quality data integration problem of low-cost sensors.

Deploy a Slack gateway for Amazon Bedrock – A solution bringing the power of generative AI directly into your Slack workspace.

An agent-based simulation of Amazon’s inbound supply chain – Simulating the entire US inbound supply chain, including the “first-mile” of distribution and tracking the movement of hundreds of millions of individual products through the network.

AWS CloudFormation Linter (cfn-lint) v1 – This upgrade is particularly significant because it converts from using the CloudFormation spec to using CloudFormation registry resource provider schemas.

A practical approach to using generative AI in the SDLC – Learn how an AI assistant like Amazon Q Developer helps my colleague Jenna figure out what to build and how to build it.

AWS open source news and updates – My colleague Ricardo writes about open source projects, tools, and events from the AWS Community. Check out Ricardo’s page for the latest updates.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. This week, you can join the AWS Summit in Washington, DC, June 26–27. Learn here about future AWS Summit events happening in your area.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. This week there are AWS Community Days in Switzerland (June 27), Sri Lanka (June 27), and the Gen AI Edition in Ahmedabad, India (June 29).

Browse all upcoming AWS-led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Danilo

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Sysinternals' Process Monitor Version 4 Released, (Sat, Jun 22nd)

Version 4.01 of Sysinternals' Process Monitor (procmon) was released (just one day after the release of version 4.0).

These releases bring improvements to performance and the user interface.

A new Process Start event was also added.

This can now be displayed as a column:

And it can also be used as a filter, for example to filter out all processes that started before the new process you want to analyze:

Didier Stevens
Senior handler
blog.DidierStevens.com

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

AWS CodeArtifact adds support for Rust packages with Cargo

Starting today, Rust developers can store and access their libraries (known as crates in Rust’s world) on AWS CodeArtifact.

Modern software development relies heavily on pre-written code packages to accelerate development. These packages, which can number in the hundreds for a single application, tackle common programming tasks and can be created internally or obtained from external sources. While these packages significantly help to speed up development, their use introduces two main challenges for organizations: legal and security concerns.

On the legal side, organizations need to ensure they have compatible licenses for these third-party packages and that they don’t infringe on intellectual property rights. Security is another risk, as vulnerabilities in these packages could be exploited to compromise an application. A known tactic, the supply chain attack, involves injecting vulnerabilities into popular open source projects.

To address these challenges, organizations can set up private package repositories. These repositories store pre-approved packages vetted by security and legal teams, limiting the risk of legal or security exposure. This is where CodeArtifact enters.

AWS CodeArtifact is a fully managed artifact repository service designed to securely store, publish, and share software packages used in application development. It supports popular package managers and formats such as npm, PyPI, Maven, NuGet, SwiftPM, and Rubygem, enabling easy integration into existing development workflows. It helps enhance security through controlled access and facilitates collaboration across teams. CodeArtifact helps maintain a consistent, secure, and efficient software development lifecycle by integrating with AWS Identity and Access Management (IAM) and continuous integration and continuous deployment (CI/CD) tools.

For the eighth year in a row, Rust has topped the chart as “the most desired programming language” in Stack Overflow’s annual developer survey, with more than 80 percent of developers reporting that they’d like to use the language again next year. Rust’s growing popularity stems from its ability to combine the performance and memory safety of systems languages such as C++ with features that make writing reliable, concurrent code easier. This, along with a rich ecosystem and a strong focus on community collaboration, makes Rust an attractive option for developers working on high-performance systems and applications.

Rust developers rely on Cargo, the official package manager, to manage package dependencies. Cargo simplifies the process of finding, downloading, and integrating pre-written crates (libraries) into their projects. This not only saves time by eliminating manual dependency management, but also ensures compatibility and security. Cargo’s robust dependency resolution system tackles potential conflicts between different crate versions, and because many crates come from a curated registry, developers can be more confident about the code’s quality and safety. This focus on efficiency and reliability makes Cargo an essential tool for building Rust applications.

Let’s create a CodeArtifact repository for my crates
In this demo, I use the AWS Command Line Interface (AWS CLI) and AWS Management Console to create two repositories. I configure the first repository to download public packages from the official crates.io repository. I configure the second repository to download packages from the first one only. This dual repository configuration is the recommended way to manage repositories and external connections; see the CodeArtifact documentation for managing external connections. To quote the documentation:

“It is recommended to have one repository per domain with an external connection to a given public repository. To connect other repositories to the public repository, add the repository with the external connection as an upstream to them.”

I sketched this diagram to illustrate the setup.

Code Artifact repositories for cargo

Domains and repositories can be created either from the command line or the console. I choose the command line. In a shell terminal, I type:

CODEARTIFACT_DOMAIN=stormacq-test

# Create an internal-facing repository: crates-io-store
aws codeartifact create-repository \
   --domain $CODEARTIFACT_DOMAIN \
   --repository crates-io-store

# Associate the internal-facing repository crates-io-store to the public crates-io
aws codeartifact associate-external-connection \
   --domain $CODEARTIFACT_DOMAIN \
   --repository crates-io-store \
   --external-connection public:crates-io

# Create a second internal-facing repository: cargo-repo
# and connect it to upstream crates-io-store just created
aws codeartifact create-repository \
   --domain $CODEARTIFACT_DOMAIN \
   --repository cargo-repo \
   --upstreams '{"repositoryName":"crates-io-store"}'

Next, as a developer, I want my local machine to fetch crates from the internal repository (cargo-repo) I just created.

I configure cargo to fetch libraries from the internal repository instead of the public crates.io. To do so, I create a config.toml file that points to the CodeArtifact internal repository.

# First, I retrieve the URI of the repo
REPO_ENDPOINT=$(aws codeartifact get-repository-endpoint \
                           --domain $CODEARTIFACT_DOMAIN \
                           --repository cargo-repo \
                           --format cargo \
                           --output text)

# at this stage, REPO_ENDPOINT is https://stormacq-test-012345678912.d.codeartifact.us-west-2.amazonaws.com/cargo/cargo-repo/

# Next, I create the cargo config file
cat << EOF > ~/.cargo/config.toml
[registries.cargo-repo]
index = "sparse+$REPO_ENDPOINT"
credential-provider = "cargo:token-from-stdout aws codeartifact get-authorization-token --domain $CODEARTIFACT_DOMAIN --query authorizationToken --output text"

[registry]
default = "cargo-repo"

[source.crates-io]
replace-with = "cargo-repo"
EOF

Note that the two environment variables are replaced when I create the config file. cargo doesn’t support environment variables in its configuration.

From now on, on this machine, every time I invoke cargo to add a crate, cargo will obtain an authorization token from CodeArtifact to communicate with the internal cargo-repo repository. I must have IAM privileges to call the get-authorization-token CodeArtifact API, in addition to permissions to read or publish packages, depending on the command I use. If you’re running this setup from a build machine for your continuous integration (CI) pipeline, your build machine must have the proper permissions to do so.
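As an illustration, an IAM policy along the following lines covers the token and read operations for a consumer. This is a sketch: scope Resource down to your domain and repository ARNs in practice, and add codeartifact:PublishPackageVersion for machines that also publish crates.

cat << 'EOF' > codeartifact-cargo-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "CodeArtifactReadAccess",
      "Effect": "Allow",
      "Action": [
        "codeartifact:GetAuthorizationToken",
        "codeartifact:GetRepositoryEndpoint",
        "codeartifact:ReadFromRepository"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowServiceBearerToken",
      "Effect": "Allow",
      "Action": "sts:GetServiceBearerToken",
      "Resource": "*",
      "Condition": {
        "StringEquals": { "sts:AWSServiceName": "codeartifact.amazonaws.com" }
      }
    }
  ]
}
EOF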

I can now test this setup and add a crate to my local project.

$ cargo add regex
    Updating `codeartifact` index
      Adding regex v1.10.4 to dependencies
             Features:
             + perf
             + perf-backtrack
             + perf-cache
             + perf-dfa
             + perf-inline
             + perf-literal
             + perf-onepass
             + std
             + unicode
             + unicode-age
             + unicode-bool
             + unicode-case
             + unicode-gencat
             + unicode-perl
             + unicode-script
             + unicode-segment
             - logging
             - pattern
             - perf-dfa-full
             - unstable
             - use_std
    Updating `cargo-repo` index

# Build the project to trigger the download of the crate
$ cargo build
  Downloaded memchr v2.7.2 (registry `cargo-repo`)
  Downloaded regex-syntax v0.8.3 (registry `cargo-repo`)
  Downloaded regex v1.10.4 (registry `cargo-repo`)
  Downloaded aho-corasick v1.1.3 (registry `cargo-repo`)
  Downloaded regex-automata v0.4.6 (registry `cargo-repo`)
  Downloaded 5 crates (1.5 MB) in 1.99s
   Compiling memchr v2.7.2 (registry `cargo-repo`)
   Compiling regex-syntax v0.8.3 (registry `cargo-repo`)
   Compiling aho-corasick v1.1.3 (registry `cargo-repo`)
   Compiling regex-automata v0.4.6 (registry `cargo-repo`)
   Compiling regex v1.10.4 (registry `cargo-repo`)
   Compiling hello_world v0.1.0 (/home/ec2-user/hello_world)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 16.60s

I can verify CodeArtifact downloaded the crate and its dependencies from the upstream public repository. I connect to the CodeArtifact console and check the list of packages available in either repository I created. At this stage, the package list should be identical in the two repositories.

CodeArtifact cargo packages list
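The same check can also be scripted with the CLI instead of the console (a quick sketch, reusing the domain variable defined earlier):

# List the Cargo packages in each repository; at this stage the two lists should match
aws codeartifact list-packages \
    --domain $CODEARTIFACT_DOMAIN \
    --repository crates-io-store \
    --format cargo

aws codeartifact list-packages \
    --domain $CODEARTIFACT_DOMAIN \
    --repository cargo-repo \
    --format cargo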

Publish a private package to the repository
Now that I know the upstream link works as intended, let’s publish a private package to my cargo-repo repository to make it available to other teams in my organization.

To do so, I use the standard Rust tool cargo, just like usual. Before doing so, I add and commit the project files to the git repository.

$  git add . && git commit -m "initial commit"
 5 files changed, 1855 insertions(+)
create mode 100644 .gitignore
create mode 100644 Cargo.lock
create mode 100644 Cargo.toml
create mode 100644 commands.sh
create mode 100644 src/main.rs

$  cargo publish 
    Updating `codeartifact` index
   Packaging hello_world v0.1.0 (/home/ec2-user/hello_world)
    Updating crates.io index
    Updating `codeartifact` index
   Verifying hello_world v0.1.0 (/home/ec2-user/hello_world)
   Compiling libc v0.2.155
... (redacted for brevity) ....
   Compiling hello_world v0.1.0 (/home/ec2-user/hello_world/target/package/hello_world-0.1.0)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1m 03s
    Packaged 5 files, 44.1KiB (11.5KiB compressed)
   Uploading hello_world v0.1.0 (/home/ec2-user/hello_world)
    Uploaded hello_world v0.1.0 to registry `cargo-repo`
note: waiting for `hello_world v0.1.0` to be available at registry `cargo-repo`.
You may press ctrl-c to skip waiting; the crate should be available shortly.
   Published hello_world v0.1.0 at registry `cargo-repo`

Lastly, I use the console to verify the hello_world crate is now available in the cargo-repo.

CodeArtifact cargo package hello world

Pricing and availability
You can now store your Rust libraries in the 13 AWS Regions where CodeArtifact is available. There is no additional cost for Rust packages. The three billing dimensions are the storage (measured in GB per month), the number of requests, and the data transfer out to the internet or to other AWS Regions. Data transfer to AWS services in the same Region is not charged, meaning you can run your continuous integration and delivery (CI/CD) jobs on Amazon Elastic Compute Cloud (Amazon EC2) or AWS CodeBuild, for example, without incurring a charge for the CodeArtifact data transfer. As usual, the pricing page has the details.

Now go build your Rust applications and upload your private crates to CodeArtifact!

— seb

No Excuses, Free Tools to Help Secure Authentication in Ubuntu Linux [Guest Diary], (Thu, Jun 20th)

[This is a Guest Diary by Owen Slubowski, an ISC intern as part of the SANS.edu BACS program]

Over the past 20 weeks I have had the privilege to take part in the SANS Internet Storm Center Internship. This has been an awesome chance to deploy and monitor a honeypot to explore what must be the fate of so many unsecured devices on the internet. The thing that was most shocking to me over my tenure here was not only the number of devices conducting password attacks, but also the damage they could have done if their malware had been successful. Over the 20 weeks of this internship, more than 16,790 unique devices from 49 different countries attempted to gain unauthorized access to my honeypot over SSH and Telnet!


Figure1: DShield SIEM graph displaying the different countries interacting with the honeypot

With the number of threat actors out there, it almost seems like a strong password policy isn’t enough on its own. And across the multitude of attack reports I wrote, I always listed the same controls that could have protected the system: MFA and filtering. In my mind these solutions always implied a greater cost, often outside of our reach as hobbyists and small organizations … Or are they? Over the course of the next few pages, I discuss different technical controls that I was first introduced to during the internship, which can be applied to Ubuntu Linux at no cost, and how they can help protect against these login attempts by various threat actors.

All the testing below will be done with three Linux boxes: Ubuntu-Secure (192.168.137.133), the server; Ubuntu-Client (192.168.137.135), the legitimate user; and Kali (192.168.137.134), the attacker. Ubuntu-Secure has default SSH configurations, and its password is easily guessed by the attacker using hydra and rockyou.txt in less than 2 minutes!


Figure2: Demonstration of successful password guessing attack against ubuntu-client

TCP Wrappers 

One of the easiest ways to mitigate password attacks is to only allow legitimate IPs to access remote access protocols. This is usually done with either a host-based firewall or a network firewall, but is there an easier and cheaper way? TCP Wrappers is a free tool that does just that. Like an ACL, TCP Wrappers allows us to specify which devices should and shouldn’t be allowed to access the service [1]!


Figure3: The configurations added to hosts.allow
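For readers following along without the screenshot, the two rules described below amount to roughly the following in /etc/hosts.allow (an illustrative sketch, not the exact file from the figure):

# /etc/hosts.allow (illustrative sketch)
sshd : 192.168.137.135 : ALLOW
sshd : ALL : DENY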

Above we see our hosts.allow configuration file. In the first line we defined that the SSH service should allow access to 192.168.137.135 (Ubuntu-Client), and the second line functions as a default deny since there are no other legitimate users for this service [1]. Please note that the “ALL:DENY” statement can alternatively be placed in the hosts.deny file. There is functionally no difference between the two locations; however, I find placing both allow and deny statements in hosts.allow makes for easier reading and troubleshooting. Below we can see that while Ubuntu-Client’s access is unencumbered, the attacker’s attempt has been completely blocked!


Figure4: Ubuntu-Client can successfully SSH to Ubuntu-Secure


Figure5: The attacker cannot connect!

Evidence of successful and refused connections can be found in /var/log/auth.log. This can be analyzed manually or ingested into a SIEM to assist in troubleshooting TCP wrapper rules, and to provide intelligence on adversaries attempting to access your device. Below we see an example log of Ubuntu-Client successfully connecting and evidence of the attacker Kali being refused.


Figure6: In green we see the successful connection and disconnect from Ubuntu-Client, and in red we see the blocked connection from the attacker Kali.

MFA for Ubuntu 

MFA is hands down the most secure method of authentication. Instead of opting for a pricey enterprise MFA solution like RSA or Duo, in this section we will cover how to use Google Authenticator to provide MFA for free! This is a super simple process:

First, install Google Authenticator with “sudo apt-get install libpam-google-authenticator” [2].


Figure7: Installation of Libpam-google-authenticator

Next, use your favorite text editor to edit the file /etc/pam.d/sshd and add the text “auth required pam_google_authenticator.so” on line 2 [2]. Then edit /etc/ssh/sshd_config and change “ChallengeResponseAuthentication” to “yes” on line 63 [2].

Figure8: Addition to /etc/pam.d/sshd line 2 

Figure9: Edit to line 63 in the /etc/ssh/sshd_config file 

The final step is to run the command “google-authenticator” to finish setup [2]. This command will ask you five questions, and we answered them “yes, yes, yes, no, yes” as recommended by the Ubuntu.com tutorial [2]. There will also be a large QR code for you to scan to enroll the server into your Google authenticator app.


Figure10: The first lines of the google-authenticator command output with question 1


Figure11: The second half of the google-authenticator output with questions 2-5

Once enrolled, restart the SSH service with “sudo systemctl restart sshd.service”, and then we are ready for a test!
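To recap, the whole setup condenses to a handful of commands. This is a sketch of the steps above: the tee command appends the PAM line instead of inserting it on line 2, and the sed edit assumes sshd_config currently contains the line “ChallengeResponseAuthentication no”, so adjust both to your files.

# Install the PAM module for Google Authenticator (TOTP)
sudo apt-get install libpam-google-authenticator

# Require a TOTP verification code for SSH logins
echo 'auth required pam_google_authenticator.so' | sudo tee -a /etc/pam.d/sshd

# Enable challenge-response prompts in sshd
sudo sed -i 's/^ChallengeResponseAuthentication no/ChallengeResponseAuthentication yes/' /etc/ssh/sshd_config

# Generate the per-user secret and QR code (answer the five questions)
google-authenticator

# Restart SSH so the new configuration takes effect
sudo systemctl restart sshd.service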

When attempting to log in to Ubuntu-Secure, we now see the first prompt for a “verification code,” which is found in our Google Authenticator app; this code changes every 30 seconds. If the code is correct, then we are prompted for our password and we are in!


Figure12: Shows the new authentication workflow with Google authentication running 

While the ever-changing Google Authenticator code is going to be nearly impossible for an attacker to guess, let’s run hydra against it for good measure.


Figure13: The attacker can no longer brute force ubuntu-secure

After letting it run for a while, we can easily conclude that adding MFA thwarted this password guessing attack. 

Being in the IT and cybersecurity world, it seems the costs of controls keep going up and up. With all the new flashy tools coming out daily, it’s easy to forget that there are tons of free tools that can be just as effective at stopping attacks. With limited time in our day to secure our personal infrastructure, it’s refreshing to see how both of these tools can be deployed easily and quickly to improve security! There truly is no excuse for insecure authentication in 2024!

[1] https://ostechnix.com/restrict-access-linux-servers-using-tcp-wrappers/
[2] https://ubuntu.com/tutorials/configure-ssh-2fa#2-installing-and-configuring-required-packages
[3] https://www.sans.edu/cyber-security-programs/bachelors-degree/
———–
Guy Bruneau IPSS Inc.
My Handler Page
Twitter: GuyBruneau
gbruneau at isc dot sans dot edu

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.