Careers in Cybersecurity

Have you considered a career in cybersecurity? It is a fast-paced, highly dynamic field with a huge number of specialties to choose from, including forensics, endpoint security, critical infrastructure, incident response, secure coding, and awareness and training. In addition, a career in cybersecurity allows you to work almost anywhere in the world, with amazing benefits and an opportunity to make a real difference. However, the most exciting thing is that you do NOT need a technical background; anyone can get started.

AA21-148A: Sophisticated Spearphishing Campaign Targets Government Organizations, IGOs, and NGOs

Original release date: May 28, 2021

Summary

The Cybersecurity and Infrastructure Security Agency (CISA) and the Federal Bureau of Investigation (FBI) are responding to a spearphishing campaign targeting government organizations, intergovernmental organizations (IGOs), and non-governmental organizations (NGOs). A sophisticated cyber threat actor leveraged a compromised end-user account from Constant Contact, a legitimate email marketing software company, to spoof a U.S.-based government organization and distribute links to malicious URLs.[1] Note: CISA and FBI acknowledge open-source reporting attributing the activity discussed in the report to APT29 (also known as Nobelium, The Dukes, and Cozy Bear).[2,3] However, CISA and FBI are investigating this activity and have not attributed it to any threat actor at this time. CISA and FBI will update this Joint Cybersecurity Advisory as new information becomes available.

This Joint Cybersecurity Advisory contains information on tactics, techniques, and procedures (TTPs) and malware associated with this campaign. For more information on the malware, refer to Malware Analysis Report MAR-10339794-1.v1: Cobalt Strike Beacon.

CISA and FBI urge governmental and international affairs organizations and individuals associated with such organizations to immediately adopt a heightened state of awareness and implement the recommendations in the Mitigations section of this advisory.

For a downloadable list of indicators of compromise (IOCs), refer to AA21-148A.stix and MAR-10339794-1.v1.stix.

Technical Details

Based on incident reports, malware collection, and trusted third-party reporting, CISA and FBI are responding to a sophisticated spearphishing campaign. A cyber threat actor leveraged a compromised end-user account from Constant Contact, a legitimate email marketing software company, to send phishing emails to more than 7,000 accounts across approximately 350 government organizations, IGOs, and NGOs. The threat actor sent spoofed emails that appeared to originate from a U.S. Government organization. The emails contained a legitimate Constant Contact link that redirected to a malicious URL [T1566.002, T1204.001], from which a malicious ISO file was dropped onto the victim’s machine.

The ISO file contained (1) a malicious Dynamic Link Library (DLL) named Documents.dll [T1055.001], which is a custom Cobalt Strike Beacon version 4 implant, (2) a malicious shortcut file that executes the Cobalt Strike Beacon loader [T1105], and (3) a benign decoy PDF titled “Foreign Threats to the 2020 US Federal Elections” with file name “ICA-declass.pdf” (see Figure 1). Note: The decoy file appears to be a copy of the declassified Intelligence Community Assessment pursuant to Executive Order 13848 Section 1(a), which is available at https://www.intelligence.gov/index.php/ic-on-the-record-database/results/1046-foreign-threats-to-the-2020-us-federal-elections-intelligence-community-assessment.

Figure 1: Decoy PDF: ICA-declass.pdf

Cobalt Strike is a commercial penetration testing tool used to conduct red team operations.[4] It contains a number of tools that complement the cyber threat actor’s exploitation efforts, such as a keystroke logger, file injection capability, and network services scanners. The Cobalt Strike Beacon is the malicious implant that calls back to attacker-controlled infrastructure and checks for additional commands to execute on the compromised system [TA0011].

The configuration file for this Cobalt Strike Beacon implant contained communications protocols, an implant watermark, and the following hardcoded command and control (C2) domains:

  • dataplane.theyardservice[.]com/jquery-3.3.1.min.woff2
  • cdn.theyardservice[.]com/jquery-3.3.1.min.woff2
  • static.theyardservice[.]com/jquery-3.3.1.min.woff2
  • worldhomeoutlet[.]com/jquery-3.3.1.min.woff2

The configuration file was encoded via an XOR with the key 0x2e and a 16-bit byte swap.
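
As a rough illustration of those two transforms, the following Python sketch applies the XOR key and the 16-bit byte swap to a raw configuration blob. It is not taken from the malware analysis report: the exact order of the operations and the offset at which the configuration sits inside the beacon vary by sample, and the file name used below is a placeholder, so treat this only as a starting point for analysis.

def decode_beacon_config(blob: bytes, xor_key: int = 0x2E) -> bytes:
    """Apply the single-byte XOR and undo a 16-bit byte swap.

    Assumes the swap exchanges each adjacent pair of bytes; real samples
    may require the operations in the opposite order.
    """
    decoded = bytearray(b ^ xor_key for b in blob)
    for i in range(0, len(decoded) - 1, 2):
        decoded[i], decoded[i + 1] = decoded[i + 1], decoded[i]
    return bytes(decoded)


# Hypothetical usage: read a dumped config section and inspect it for C2 strings.
if __name__ == "__main__":
    with open("beacon_config.bin", "rb") as handle:  # placeholder file name
        plaintext = decode_beacon_config(handle.read())
    print(plaintext[:200])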

For more information on the ISO file and Cobalt Strike Beacon implant, including IOCs, refer to Malware Analysis Report MAR-10339794-1.v1: Cobalt Strike Beacon.

Indicators of Compromise

The following IOCs were derived from trusted third parties and open-source research. For a downloadable list of IOCs, refer to AA21-148A.stix and MAR-10339794-1.v1.stix.

  • URL: https[:]//r20.rs6.net/tn.jsp?f=
    Host IP: 208.75.122[.]11 (US)
    Owner: Constant Contact, Inc.
    Activity: legitimate Constant Contact link found in phishing email that redirects victims to actor-controlled infrastructure at https[:]//usaid.theyardservice.com/d/<target_email_address>
     
  • URL: https[:]//usaid.theyardservice.com/d/<target_email_address>
    Host IP: 83.171.237[.]173 (Germany)
    Owner: [redacted]
    First Seen: May 25, 2021
    Activity: actor-controlled URL that was redirected from https[:]//r20.rs6.net/tn.jsp?f=; the domain usaid[.]theyardservice.com was detected as a malware site; hosted a malicious ISO file “usaid[.]theyardservice.com”
     
  • File: ICA-declass.iso [MD5: cbc1dc536cd6f4fb9648e229e5d23361]
    File Type: Macintosh Disk Image
    Detection: Artemis!7EDF943ED251, Trojan:Win32/Cobaltstrike!MSR, or other malware
    Activity: ISO file container; contains a custom Cobalt Strike Beacon loader; communicated with multiple URLs, domains, and IP addresses
     
  • File: /d/ [MD5: ebe2f8df39b4a94fb408580a728d351f]
    File Type: Macintosh Disk Image
    Detection: Cobalt, Artemis!7EDF943ED251, or other malware
    Activity: ISO file container; contains a custom Cobalt Strike Beacon loader; communicated with multiple URLs, domains, and IP addresses
     
  • File: ICA-declass.iso [MD5: 29e2ef8ef5c6ff95e98bff095e63dc05]
    File Type: Macintosh Disk Image
    Detection: Cobalt Strike, Rozena, or other malware
    Activity: ISO file container; contains a custom Cobalt Strike Beacon loader; communicated with multiple URLs, domains, and IP addresses
     
  • File: Reports.lnk [MD5: dcfd60883c73c3d92fceb6ac910d5b80]
    File Type: LNK (Windows shortcut)
    Detection: Worm: Win32-Script.Save.df8efe7a, Static AI – Suspicious LNK, or other malware
    Activity: shortcut contained in malicious ISO files; executes a custom Cobalt Strike Beacon loader
     
  • File: ICA-declass.pdf [MD5: b40b30329489d342b2aa5ef8309ad388]
    File Type: PDF
    Detection: undetected
    Activity: benign, password-protected PDF displayed to victim as a decoy; currently unrecognized by antivirus software
     
  • File: DOCUMENT.DLL [MD5: 7edf943ed251fa480c5ca5abb2446c75]
    File Type: Win32 DLL
    Detection: Trojan: Win32/Cobaltstrike!MSR, Rozena, or other malware
    Activity: custom Cobalt Strike Beacon loader contained in malicious ISO files; observed communicating with multiple URLs, domains, and IP addresses according to antivirus software
     
  • File: DOCUMENT.DLL [MD5: 1c3b8ae594cb4ce24c2680b47cebf808]
    File Type: Win32 DLL
    Detection: Cobalt Strike, Razy, Khalesi, or other malware
    Activity: custom Cobalt Strike Beacon loader contained in malicious ISO files; observed communicating with multiple URLs, domains, and IP addresses according to antivirus software
     
  • Domain: usaid[.]theyardservice.com
    Host IP: 83.171.237[.]173 (Germany)
    First Seen: May 25, 2021
    Owner: Withheld for Privacy Purposes
    Activity: subdomain used to distribute ISO file according to the trusted third party; detected as a malware site by antivirus programs
     
  • Domain: worldhomeoutlet.com
    Host IP: 192.99.221[.]77 (Canada)
    Created Date: March 11, 2020
    Owner: Withheld for Privacy Purposes by Registrar
    Activity: Cobalt Strike C2 subdomain according to the trusted third party; categorized as suspicious and observed communicating with multiple malicious files according to antivirus software; associated with Cobalt Strike malware
     
  • Domain: dataplane.theyardservice[.]com
    Host IP: 83.171.237[.]173 (Germany)
    First Seen: May 25, 2021
    Owner: [redacted]
    Activity: Cobalt Strike C2 subdomain according to the trusted third party; categorized as suspicious and observed communicating with multiple malicious files according to antivirus software; observed in phishing, malware, and spam activity
     
  • Domain: cdn.theyardservice[.]com
    Host IP: 83.171.237[.]173 (Germany)
    First Seen: May 25, 2021
    Owner: Withheld for Privacy Purposes by Registrar
    Activity: Cobalt Strike C2 subdomain according to the trusted third party; categorized as suspicious and observed communicating with multiple malicious files according to antivirus software
     
  • Domain: static.theyardservice[.]com
    Host IP: 83.171.237[.]173 (Germany)
    First Seen: May 25, 2021
    Owner: Withheld for Privacy Purposes
    Activity: Cobalt Strike C2 subdomain according to the trusted third party; categorized as suspicious and observed communicating with multiple malicious files according to antivirus software
     
  • IP: 192.99.221[.]77
    Organization: OVH SAS
    Resolutions: 7
    Geolocation: Canada
    Activity: detected as a malware site; hosts a suspicious domain worldhomeoutlet[.]com; observed in Cobalt Strike activity
     
  • IP: 83.171.237[.]173
    Organization: Droptop GmbH
    Resolutions: 15
    Geolocation: Germany
    Activity: Categorized as malicious by antivirus software; hosted multiple suspicious domains, and multiple malicious files were observed being downloaded from this IP address; observed in Cobalt Strike activity
     
  • Domain: theyardservice[.]com
    Host IP: 83.171.237[.]173 (Germany)
    Created Date: January 27, 2010
    Owner: Withheld for Privacy Purposes
    Activity: Threat actor controlled domain according to the trusted third party; categorized as suspicious by antivirus software; observed in Cobalt Strike activity
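
As a quick triage aid, a short script can sweep plain-text proxy or DNS logs for the indicators listed above. The sketch below is illustrative only: it assumes one log entry per line, the log file paths passed on the command line are hypothetical, and it is no substitute for the STIX files or proper detection signatures.

import re
import sys

# Indicators taken from the IOC list above (defanging brackets removed).
INDICATORS = [
    "theyardservice.com",
    "worldhomeoutlet.com",
    "192.99.221.77",
    "83.171.237.173",
]

PATTERN = re.compile("|".join(re.escape(ioc) for ioc in INDICATORS))


def sweep(log_path):
    """Print every log line that mentions one of the indicators."""
    with open(log_path, "r", errors="ignore") as handle:
        for number, line in enumerate(handle, start=1):
            if PATTERN.search(line):
                print(f"{log_path}:{number}: {line.rstrip()}")


if __name__ == "__main__":
    for path in sys.argv[1:]:  # e.g., python3 ioc_sweep.py proxy.log dns.log
        sweep(path)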

 

Table 1 provides a summary of the MITRE ATT&CK techniques observed.

Table 1: MITRE ATT&CK techniques observed

  • Process Injection: Dynamic-link Library Injection (T1055.001)
  • Ingress Tool Transfer (T1105)
  • User Execution: Malicious Link (T1204.001)
  • Phishing: Spearphishing Link (T1566.002)

Mitigations

CISA and FBI urge critical infrastructure (CI) owners and operators to apply the following mitigations.

  • Implement multi-factor authentication (MFA) for every account. While privileged accounts and remote access systems are critical, it is also important to ensure full coverage across SaaS solutions. Enabling MFA for corporate communications platforms (as with all other accounts) provides vital defense against these types of attacks and, in many cases, can prevent them.
  • Keep all software up to date. The most effective cybersecurity programs quickly update all of their software as soon as patches are available. If your organization is unable to update all software shortly after a patch is released, prioritize implementing patches for CVEs that are already known to be exploited.
  • Implement endpoint detection and response (EDR) tools. EDR allows a high degree of visibility into the security status of endpoints and can be an effective tool against threat actors.
    Note: Organizations using Microsoft Defender for Endpoint or Microsoft 365 Defender should refer to Microsoft: Use attack surface reduction rules to prevent malware infection for more information on hardening the enterprise attack surface.
  • Implement centralized log management for host monitoring. A centralized logging application allows technicians to look out for anomalous activity in the network environment, such as new applications running on hosts, out-of-place communication between devices, or unaccountable login failures on machines. It also aids in troubleshooting applications or equipment in the event of a fault. CISA and the FBI recommend that organizations:
    • Forward logs from local hosts to a centralized log management server—often referred to as a security information and event management (SIEM) tool.
    • Ensure logs are searchable. The ability to search, analyze, and visualize communications will help analysts diagnose issues and may lead to detection of anomalous activity.
    • Correlate logs from both network and host security devices. By reviewing logs from multiple sources, an organization can better triage an individual event and determine its impact to the organization as a whole.
    • Review both centralized and local log management policies to maximize efficiency and retain historical data. Organizations should retain critical logs for a minimum of 30 days.
  • Deploy signatures to detect and/or block inbound connection from Cobalt Strike servers and other post-exploitation tools.
  • Implement unauthorized execution prevention by disabling macro scripts from Microsoft Office files transmitted via email. Consider using Office Viewer software to open Microsoft Office files transmitted via email instead of full Microsoft Office suite applications.
  • Configure and maintain user and administrative accounts using a strong account management policy.
    • Use administrative accounts on dedicated administration workstations.
    • Limit access to and use of administrative accounts.
    • Use strong passwords. For more information on strong passwords, refer to CISA Tip: Choosing and Protecting Passwords and National Institute of Standards (NIST) SP 800-63: Digital Identity Guidelines: Authentication and Lifecycle Management.
    • Remove default accounts if unneeded. Change the password of default accounts that are needed.
    • Disable all unused accounts.
  • Implement a user training program and simulated spearphishing attacks to discourage users from visiting malicious websites or opening malicious attachments, and to reinforce the appropriate user responses to spearphishing emails.

RESOURCES

Contact Information

To report suspicious or criminal activity related to information found in this Joint Cybersecurity Advisory, contact your local FBI field office at www.fbi.gov/contact-us/field, or the FBI’s 24/7 Cyber Watch (CyWatch) at (855) 292-3937 or by e-mail at CyWatch@fbi.gov. When available, please include the following information regarding the incident: date, time, and location of the incident; type of activity; number of people affected; type of equipment used for the activity; the name of the submitting company or organization; and a designated point of contact. To request incident response resources or technical assistance related to these threats, contact CISA at CISAServiceDesk@cisa.dhs.gov.

This document is marked TLP:WHITE. Disclosure is not limited. Sources may use TLP:WHITE when information carries minimal or no foreseeable risk of misuse, in accordance with applicable rules and procedures for public release. Subject to standard copyright rules, TLP:WHITE information may be distributed without restriction. For more information on the Traffic Light Protocol, see http://www.us-cert.gov/tlp/.

 

References

Revisions

  • Initial version: May 28, 2021

This product is provided subject to this Notification and this Privacy & Use policy.

What is Malware

Malware is software (a computer program) used to perform malicious actions. In fact, the term malware is a combination of the words malicious and software. Cyber criminals install malware on your computers or devices to gain control over them or gain access to what they contain. Once installed, these attackers can use malware to spy on your online activities, steal your passwords and files, or use your system to attack others.

Amazon Redshift ML Is Now Generally Available – Use SQL to Create Machine Learning Models and Make Predictions from Your Data

With Amazon Redshift, you can use SQL to query and combine exabytes of structured and semi-structured data across your data warehouse, operational databases, and data lake. Now that AQUA (Advanced Query Accelerator) is generally available, you can improve the performance of your queries by up to 10 times with no additional costs and no code changes. In fact, Amazon Redshift provides up to three times better price/performance than other cloud data warehouses.

But what if you want to go a step further and process this data to train machine learning (ML) models and use these models to generate insights from data in your warehouse? For example, to implement use cases such as forecasting revenue, predicting customer churn, and detecting anomalies? In the past, you would need to export the training data from Amazon Redshift to an Amazon Simple Storage Service (Amazon S3) bucket, and then configure and start a machine learning training process (for example, using Amazon SageMaker). This process required many different skills and usually more than one person to complete. Can we make it easier?

Today, Amazon Redshift ML is generally available to help you create, train, and deploy machine learning models directly from your Amazon Redshift cluster. To create a machine learning model, you use a simple SQL query to specify the data you want to use to train your model, and the output value you want to predict. For example, to create a model that predicts the success rate for your marketing activities, you define your inputs by selecting the columns (in one or more tables) that include customer profiles and results from previous marketing campaigns, and the output column you want to predict. In this example, the output column could be one that shows whether a customer has shown interest in a campaign.

After you run the SQL command to create the model, Redshift ML securely exports the specified data from Amazon Redshift to your S3 bucket and calls Amazon SageMaker Autopilot to prepare the data (pre-processing and feature engineering), select the appropriate pre-built algorithm, and apply the algorithm for model training. You can optionally specify the algorithm to use, for example XGBoost.

Architectural diagram.

Redshift ML handles all of the interactions between Amazon Redshift, S3, and SageMaker, including all the steps involved in training and compilation. When the model has been trained, Redshift ML uses Amazon SageMaker Neo to optimize the model for deployment and makes it available as a SQL function. You can use the SQL function to apply the machine learning model to your data in queries, reports, and dashboards.

Redshift ML now includes many new features that were not available during the preview, including Amazon Virtual Private Cloud (VPC) support. For example:

Architectural diagram.

  • You can also create SQL functions that use existing SageMaker endpoints to make predictions (remote inference). In this case, Redshift ML batches calls to the endpoint to speed up processing.

Before looking into how to use these new capabilities in practice, let’s see the difference between Redshift ML and similar features in AWS databases and analytics services.

  • Amazon Redshift ML
    • Data: data warehouse, federated relational databases, and S3 data lake (with Redshift Spectrum)
    • Training from SQL: yes, using Amazon SageMaker Autopilot
    • Predictions using SQL functions: yes, a model can be imported and executed inside the Amazon Redshift cluster, or invoked using a SageMaker endpoint
  • Amazon Aurora ML
    • Data: relational database (compatible with MySQL or PostgreSQL)
    • Training from SQL: no
    • Predictions using SQL functions: yes, using a SageMaker endpoint; a native integration with Amazon Comprehend for sentiment analysis is also available
  • Amazon Athena ML
    • Data: S3 data lake; other data sources can be used through Athena Federated Query
    • Training from SQL: no
    • Predictions using SQL functions: yes, using a SageMaker endpoint

Building a Machine Learning Model with Redshift ML
Let’s build a model that predicts if customers will accept or decline a marketing offer.

To manage the interactions with S3 and SageMaker, Redshift ML needs permissions to access those resources. I create an AWS Identity and Access Management (IAM) role as described in the documentation. I use RedshiftML for the role name. Note that the trust policy of the role allows both Amazon Redshift and SageMaker to assume the role to interact with other AWS services.

From the Amazon Redshift console, I create a cluster. In the cluster permissions, I associate the RedshiftML IAM role. When the cluster is available, I load the same dataset used in this super interesting blog post that my colleague Julien wrote when SageMaker Autopilot was announced.

The file I am using (bank-additional-full.csv) is in CSV format. Each line describes a direct marketing activity with a customer. The last column (y) describes the outcome of the activity (if the customer subscribed to a service that was marketed to them).

Here are the first few lines of the file. The first line contains the headers.

age,job,marital,education,default,housing,loan,contact,month,day_of_week,duration,campaign,pdays,previous,poutcome,emp.var.rate,cons.price.idx,cons.conf.idx,euribor3m,nr.employed,y
56,housemaid,married,basic.4y,no,no,no,telephone,may,mon,261,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
57,services,married,high.school,unknown,no,no,telephone,may,mon,149,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
37,services,married,high.school,no,yes,no,telephone,may,mon,226,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no
40,admin.,married,basic.6y,no,no,no,telephone,may,mon,151,1,999,0,nonexistent,1.1,93.994,-36.4,4.857,5191.0,no

I store the file in one of my S3 buckets. The S3 bucket is used to unload data and store SageMaker training artifacts.
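
There are many ways to upload the file; one option is the AWS SDK for Python (boto3), sketched below. The bucket name my-bucket and the direct_marketing/ prefix are placeholders chosen to match the COPY command used later in this post.

import boto3

s3 = boto3.client("s3")

# Placeholder bucket and key, matching the COPY command below.
s3.upload_file(
    Filename="bank-additional-full.csv",
    Bucket="my-bucket",
    Key="direct_marketing/bank-additional-full.csv",
)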

Then, using the Amazon Redshift query editor in the console, I create a table to load the data.

CREATE TABLE direct_marketing (
	age DECIMAL NOT NULL, 
	job VARCHAR NOT NULL, 
	marital VARCHAR NOT NULL, 
	education VARCHAR NOT NULL, 
	credit_default VARCHAR NOT NULL, 
	housing VARCHAR NOT NULL, 
	loan VARCHAR NOT NULL, 
	contact VARCHAR NOT NULL, 
	month VARCHAR NOT NULL, 
	day_of_week VARCHAR NOT NULL, 
	duration DECIMAL NOT NULL, 
	campaign DECIMAL NOT NULL, 
	pdays DECIMAL NOT NULL, 
	previous DECIMAL NOT NULL, 
	poutcome VARCHAR NOT NULL, 
	emp_var_rate DECIMAL NOT NULL, 
	cons_price_idx DECIMAL NOT NULL, 
	cons_conf_idx DECIMAL NOT NULL, 
	euribor3m DECIMAL NOT NULL, 
	nr_employed DECIMAL NOT NULL, 
	y BOOLEAN NOT NULL
);

I load the data into the table using the COPY command. I can use the same IAM role I created earlier (RedshiftML) because I am using the same S3 bucket to import and export the data.

COPY direct_marketing 
FROM 's3://my-bucket/direct_marketing/bank-additional-full.csv' 
DELIMITER ',' IGNOREHEADER 1
IAM_ROLE 'arn:aws:iam::123412341234:role/RedshiftML'
REGION 'us-east-1';

Now, I create the model straight from the SQL interface using the new CREATE MODEL statement:

CREATE MODEL direct_marketing
FROM direct_marketing
TARGET y
FUNCTION predict_direct_marketing
IAM_ROLE 'arn:aws:iam::123412341234:role/RedshiftML'
SETTINGS (
  S3_BUCKET 'my-bucket'
);

In this SQL command, I specify the parameters required to create the model:

  • FROM – I select all the rows in the direct_marketing table, but I can replace the name of the table with a nested query (see example below).
  • TARGET – This is the column that I want to predict (in this case, y).
  • FUNCTION – The name of the SQL function to make predictions.
  • IAM_ROLE – The IAM role assumed by Amazon Redshift and SageMaker to create, train, and deploy the model.
  • S3_BUCKET – The S3 bucket where the training data is temporarily stored, and where model artifacts are stored if you choose to retain a copy of them.

Here I am using a simple syntax for the CREATE MODEL statement. For more advanced users, other options are available, such as:

  • MODEL_TYPE – To use a specific model type for training, such as XGBoost or multilayer perceptron (MLP). If I don’t specify this parameter, SageMaker Autopilot selects the appropriate model class to use.
  • PROBLEM_TYPE – To define the type of problem to solve: regression, binary classification, or multiclass classification. If I don’t specify this parameter, the problem type is discovered during training, based on my data.
  • OBJECTIVE – The objective metric used to measure the quality of the model. This metric is optimized during training to provide the best estimate from data. If I don’t specify a metric, the default behavior is to use mean squared error (MSE) for regression, the F1 score for binary classification, and accuracy for multiclass classification. Other available options are F1Macro (to apply F1 scoring to multiclass classification) and area under the curve (AUC). More information on objective metrics is available in the SageMaker documentation.

Depending on the complexity of the model and the amount of data, it can take some time for the model to be available. I use the SHOW MODEL command to see when it is available:

SHOW MODEL direct_marketing

When I execute this command using the query editor in the console, I get the following output:

Console screenshot.

As expected, the model is currently in the TRAINING state.

When I created this model, I selected all the columns in the table as input parameters. I wonder what happens if I create a model that uses fewer input parameters? I am in the cloud and I am not slowed down by limited resources, so I create another model using a subset of the columns in the table:

CREATE MODEL simple_direct_marketing
FROM (
        SELECT age, job, marital, education, housing, contact, month, day_of_week, y
 	  FROM direct_marketing
)
TARGET y
FUNCTION predict_simple_direct_marketing
IAM_ROLE 'arn:aws:iam::123412341234:role/RedshiftML'
SETTINGS (
  S3_BUCKET 'my-bucket'
);

After some time, my first model is ready, and I get this output from SHOW MODEL. The actual output in the console spans multiple pages; I merged the results here to make it easier to follow:

Console screenshot.

From the output, I see that the model has been correctly recognized as BinaryClassification, and F1 has been selected as the objective. The F1 score is a metric that considers both precision and recall. It returns a value between 1 (perfect precision and recall) and 0 (lowest possible score). The final score for the model (validation:f1) is 0.79. In this table I also find the name of the SQL function (predict_direct_marketing) that has been created for the model, its parameters and their types, and an estimate of the training costs.
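
For reference, here is how the F1 score is derived from confusion-matrix counts like the ones the prediction query later in this post produces. The counts in the usage example are made-up numbers, included only to show the formula.

def f1_score(true_positives, false_positives, false_negatives):
    """F1 is the harmonic mean of precision and recall."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not the actual results of this model.
print(f1_score(true_positives=800, false_positives=250, false_negatives=180))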

When the second model is ready, I compare the F1 scores. The F1 score of the second model is lower (0.66) than the first one. However, with fewer parameters the SQL function is easier to apply to new data. As is often the case with machine learning, I have to find the right balance between complexity and usability.

Using Redshift ML to Make Predictions
Now that the two models are ready, I can make predictions using SQL functions. Using the first model, I check how many false positives (wrong positive predictions) and false negatives (wrong negative predictions) I get when applying the model on the same data used for training:

SELECT predict_direct_marketing, y, COUNT(*)
  FROM (SELECT predict_direct_marketing(
                   age, job, marital, education, credit_default, housing,
                   loan, contact, month, day_of_week, duration, campaign,
                   pdays, previous, poutcome, emp_var_rate, cons_price_idx,
                   cons_conf_idx, euribor3m, nr_employed), y
          FROM direct_marketing)
 GROUP BY predict_direct_marketing, y;

The result of the query shows that the model is better at predicting negative rather than positive outcomes. In fact, even though the number of true negatives is much bigger than the number of true positives, there are many more false positives than false negatives. I added some comments in green and red to the following screenshot to clarify the meaning of the results.

Console screenshot.

Using the second model, I see how many customers might be interested in a marketing campaign. Ideally, I should run this query on new customer data, not the same data I used for training.

SELECT COUNT(*)
  FROM direct_marketing
 WHERE predict_simple_direct_marketing(
           age, job, marital, education, housing,
           contact, month, day_of_week) = true;

Wow, looking at the results, there are more than 7,000 prospects!

Console screenshot.

Availability and Pricing
Redshift ML is available today in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), US West (N. California), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (Paris), Europe (Stockholm), Asia Pacific (Hong Kong), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), and South America (São Paulo). For more information, see the AWS Regional Services list.

With Redshift ML, you pay only for what you use. When training a new model, you pay for the Amazon SageMaker Autopilot and S3 resources used by Redshift ML. When making predictions, there is no additional cost for models imported into your Amazon Redshift cluster, as in the example I used in this post.

Redshift ML also allows you to use existing Amazon SageMaker endpoints for inference. In that case, the usual SageMaker pricing for real-time inference applies. Here you can find a few tips on how to control your costs with Redshift ML.

To learn more, you can see this blog post from when Redshift ML was announced in preview and the documentation.

Start getting better insights from your data with Redshift ML.

Danilo

Getting Started with Amazon ECS Anywhere – Now Generally Available

Since Amazon Elastic Container Service (Amazon ECS) was launched in 2014, AWS has released other options for running Amazon ECS tasks outside of an AWS Region, such as AWS Wavelength, an offering for mobile edge devices, or AWS Outposts, a service that extends AWS infrastructure to customers’ environments using hardware owned and fully managed by AWS.

But some customers have applications that need to run on premises due to regulatory, latency, and data residency requirements or the desire to leverage existing infrastructure investments. In these cases, customers have to install, operate, and manage separate container orchestration software and need to use disparate tooling across their AWS and on-premises environments. Customers asked us for a way to manage their on-premises containers without this added complexity and cost.

Following Jeff’s preannouncement last year, I am happy to announce the general availability of Amazon ECS Anywhere, a new capability in Amazon ECS that enables customers to easily run and manage container-based applications on premises, including virtual machines (VMs), bare metal servers, and other customer-managed infrastructure.

With ECS Anywhere, you can run and manage containers on any customer-managed infrastructure using the same cloud-based, fully managed, and highly scalable container orchestration service you use in AWS today. You no longer need to prepare, run, update, or maintain your own container orchestrators on premises, making it easier to manage your hybrid environment and leverage the cloud for your infrastructure by installing simple agents.

ECS Anywhere provides consistent tooling and APIs for all container-based applications and the same Amazon ECS experience for cluster management, workload scheduling, and monitoring both in the cloud and on customer-managed infrastructure. You can now enjoy the benefits of reduced cost and complexity by running container workloads such as data processing at edge locations on your own hardware maintaining reduced latency, and in the cloud using a single, consistent container orchestrator.

Amazon ECS Anywhere – Getting Started
To get started with ECS Anywhere, register your on-premises servers or VMs (also referred to as External instances) in the ECS cluster. The AWS Systems Manager Agent, Amazon ECS container agent, and Docker must be installed on these external instances. Your external instances require an IAM role that permits them to communicate with AWS APIs. For more information, see Required IAM permissions in the ECS Developer Guide.

To create a cluster for ECS Anywhere, on the Create Cluster page in the ECS console, choose the Networking Only template. This option is for use with either AWS Fargate or external instance capacity. We recommend that you use the AWS Region that is geographically closest to the on-premises servers you want to register.

This creates an empty cluster to register external instances. On the ECS Instances tab, choose Register External Instances to get activation codes and an installation script.

On the Step 1: External instances activation details page, in Activation key duration (in days), enter the number of days the activation key should remain active. The activation key can be used for up to 1,000 activations. In Number of instances, enter the number of external instances you want to register to your cluster. In Instance role, enter the IAM role to associate with your external instances.

Choose Next step to get a registration command.

On the Step 2: Register external instances page, copy the registration command. Run this command on the external instances you want to register to your cluster.

Paste the registration command into your on-premises servers or VMs. Each external instance is then registered as an AWS Systems Manager managed instance, which is then registered to your Amazon ECS cluster.

Both x86_64 and ARM64 CPU architectures are supported. The following is a list of supported operating systems:

  • CentOS 7, CentOS 8
  • RHEL 7
  • Fedora 32, Fedora 33
  • openSUSE Tumbleweed
  • Ubuntu 18, Ubuntu 20
  • Debian 9, Debian 10
  • SUSE Enterprise Server 15

When the ECS agent has started and completed the registration, your external instance will appear on the ECS Instances tab.

You can also add your external instances to an existing cluster. In this case, you can see both Amazon EC2 instances and external instances together; the external instances are prefixed with mi-*.

Now that the external instances are registered to your cluster, you are ready to create a task definition. Amazon ECS provides the requiresCompatibilities parameter to validate that the task definition is compatible with the EXTERNAL launch type when creating your service or running your standalone task. The following is an example task definition:

{
	"requiresCompatibilities": [
		"EXTERNAL"
	],
	"containerDefinitions": [{
		"name": "nginx",
		"image": "public.ecr.aws/nginx/nginx:latest",
		"memory": 256,
		"cpu": 256,
		"essential": true,
		"portMappings": [{
			"containerPort": 80,
			"hostPort": 8080,
			"protocol": "tcp"
		}]
	}],
	"networkMode": "bridge",
	"family": "nginx"
}

You can create a task definition in the ECS console. In Task Definition, choose Create new task definition. For Launch type, choose EXTERNAL and then configure the task and container definitions to use external instances.

On the Tasks tab, choose Run new task. On the Run Task page, for Cluster, choose the cluster to run your task definition on. In Number of tasks, enter the number of copies of that task to run with the EXTERNAL launch type.

Or, on the Services tab, choose Create. Configure service lets you specify how many copies of your task definition to run and maintain in a cluster. To run your task on the registered external instances, for Launch type, choose EXTERNAL. When you choose this launch type, load balancers, tag propagation, and service discovery integration are not supported.

The tasks you run on your external instances must use the bridge, host, or none network modes. The awsvpc network mode isn’t supported. For more information about each network mode, see Choosing a network mode in the Amazon ECS Best Practices Guide.

Now you can run your tasks and associate a mix of EXTERNAL, FARGATE, and EC2 capacity provider types with the same ECS service and specify how you would like your tasks to be split across them.
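
If you prefer to script this step instead of using the console, the task can also be started with the AWS SDK for Python (boto3). The sketch below assumes hypothetical cluster and task definition names matching this walkthrough and simply runs one copy of the task on external instance capacity.

import boto3

ecs = boto3.client("ecs")

# Assumed names: adjust the cluster and task definition to your own setup.
response = ecs.run_task(
    cluster="my-ecs-anywhere-cluster",
    taskDefinition="nginx",
    launchType="EXTERNAL",  # place the task on registered external instances
    count=1,
)
print(response["tasks"][0]["taskArn"])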

Things to Know
Here are a couple of things to keep in mind:

Connectivity: In the event of loss of network connectivity between the ECS agent running on the on-premises servers and the ECS control plane in the AWS Region, existing ECS tasks will continue to run as usual. If tasks still have connectivity with other AWS services, they will continue to communicate with them for as long as the task role credentials are active. If a task launched as part of a service crashes or exits on its own, ECS will be unable to replace it until connectivity is restored.

Monitoring: With ECS Anywhere, you can get Amazon CloudWatch metrics for your clusters and services, use the CloudWatch Logs driver (awslogs) to get your containers’ logs, and access the ECS CloudWatch event stream to monitor your clusters’ events.

Networking: ECS external instances are optimized for running applications that generate outbound traffic or process data. If your application requires inbound traffic, such as a web service, you will need to employ a workaround to place these workloads behind a load balancer until the feature is supported natively. For more information, see Networking with ECS Anywhere.

Data Security: To help customers maintain data security, ECS Anywhere only sends back to the AWS Region metadata related to the state of the tasks or the state of the containers (whether they are running or not running, performance counters, and so on). This communication is authenticated and encrypted in transit through Transport Layer Security (TLS).

ECS Anywhere Partners
ECS Anywhere integrates with a variety of ECS Anywhere partners to help customers take advantage of ECS Anywhere and provide additional functionality for the feature. Here are some of the blog posts that our partners wrote to share their experiences and offerings. (I am updating this article with links as they are published.)

Now Available
Amazon ECS Anywhere is now available in all commercial AWS Regions where ECS is supported, except the AWS China Regions. With ECS Anywhere, there are no minimum fees or upfront commitments. You pay per instance hour for each managed ECS Anywhere task. For more information, see the pricing page.

To learn more, see ECS Anywhere in the Amazon ECS Developer Guide. Please send feedback to the AWS forum for Amazon ECS or through your usual AWS Support contacts.

Get started with Amazon ECS Anywhere today.

Channy

Update: Watch a cool demo of using ECS Anywhere to operate a Raspberry Pi cluster in a home office.

Introducing Amazon Kinesis Data Analytics Studio – Quickly Interact with Streaming Data Using SQL, Python, or Scala

The best way to get timely insights and react quickly to new information you receive from your business and your applications is to analyze streaming data. This is data that must usually be processed sequentially and incrementally on a record-by-record basis or over sliding time windows, and can be used for a variety of analytics including correlations, aggregations, filtering, and sampling.

To make it easier to analyze streaming data, today we are pleased to introduce Amazon Kinesis Data Analytics Studio.

Now, from the Amazon Kinesis console you can select a Kinesis data stream and with a single click start a Kinesis Data Analytics Studio notebook powered by Apache Zeppelin and Apache Flink to interactively analyze data in the stream. Similarly, you can select a cluster in the Amazon Managed Streaming for Apache Kafka console to start a notebook to analyze data in Apache Kafka streams. You can also start a notebook from the Kinesis Data Analytics Studio console and connect to custom sources.

Architectural diagram.

In the notebook, you can interact with streaming data and get results in seconds using SQL queries and Python or Scala programs. When you are satisfied with your results, with a few clicks you can promote your code to a production stream processing application that runs reliably at scale with no additional development effort.

For new projects, we recommend that you use the new Kinesis Data Analytics Studio over Kinesis Data Analytics for SQL Applications. Kinesis Data Analytics Studio combines ease of use with advanced analytical capabilities, which makes it possible to build sophisticated stream processing applications in minutes. Let’s see how that works in practice.

Using Kinesis Data Analytics Studio to Analyze Streaming Data
I want to get a better understanding of the data sent by some sensors to a Kinesis data stream.

To simulate the workload, I use this random_data_generator.py Python script. You don’t need to know Python to use Kinesis Data Analytics Studio. In fact, I am going to use SQL in the following steps. Also, you can avoid any coding and use the Amazon Kinesis Data Generator user interface (UI) to send test data to Kinesis Data Streams or Kinesis Data Firehose. I am using a Python script to have finer control over the data that is being sent.

import datetime
import json
import random
import boto3

STREAM_NAME = "my-input-stream"


def get_random_data():
    current_temperature = round(10 + random.random() * 170, 2)
    if current_temperature > 160:
        status = "ERROR"
    elif current_temperature > 140 or random.randrange(1, 100) > 80:
        status = random.choice(["WARNING","ERROR"])
    else:
        status = "OK"
    return {
        'sensor_id': random.randrange(1, 100),
        'current_temperature': current_temperature,
        'status': status,
        'event_time': datetime.datetime.now().isoformat()
    }


def send_data(stream_name, kinesis_client):
    while True:
        data = get_random_data()
        partition_key = str(data["sensor_id"])
        print(data)
        kinesis_client.put_record(
            StreamName=stream_name,
            Data=json.dumps(data),
            PartitionKey=partition_key)


if __name__ == '__main__':
    kinesis_client = boto3.client('kinesis')
    send_data(STREAM_NAME, kinesis_client)

This script sends random records to my Kinesis data stream using JSON syntax. For example:

{'sensor_id': 77, 'current_temperature': 93.11, 'status': 'OK', 'event_time': '2021-05-19T11:20:00.978328'}
{'sensor_id': 47, 'current_temperature': 168.32, 'status': 'ERROR', 'event_time': '2021-05-19T11:20:01.110236'}
{'sensor_id': 9, 'current_temperature': 140.93, 'status': 'WARNING', 'event_time': '2021-05-19T11:20:01.243881'}
{'sensor_id': 27, 'current_temperature': 130.41, 'status': 'OK', 'event_time': '2021-05-19T11:20:01.371191'}

From the Kinesis console, I select a Kinesis data stream (my-input-stream) and choose Process data in real time from the Process drop-down. In this way, the stream is configured as a source for the notebook.

Console screenshot.

Then, in the following dialog box, I create an Apache Flink – Studio notebook.

I enter a name (my-notebook) and a description for the notebook. The AWS Identity and Access Management (IAM) permissions to read from the Kinesis data stream I selected earlier (my-input-stream) are automatically attached to the IAM role assumed by the notebook.

Console screenshot.

I choose Create to open the AWS Glue console and create an empty database. Back in the Kinesis Data Analytics Studio console, I refresh the list and select the new database. It will define the metadata for my sources and destinations. From here, I can also review the default Studio notebook settings. Then, I choose Create Studio notebook.

Console screenshot.

Now that the notebook has been created, I choose Run.

Console screenshot.

When the notebook is running, I choose Open in Apache Zeppelin to get access to the notebook and write code in SQL, Python, or Scala to interact with my streaming data and get insights in real time.

In the notebook, I create a new note and call it Sensors. Then, I create a sensor_data table describing the format of the data in the stream:

%flink.ssql

CREATE TABLE sensor_data (
    sensor_id INTEGER,
    current_temperature DOUBLE,
    status VARCHAR(6),
    event_time TIMESTAMP(3),
    WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
)
PARTITIONED BY (sensor_id)
WITH (
    'connector' = 'kinesis',
    'stream' = 'my-input-stream',
    'aws.region' = 'us-east-1',
    'scan.stream.initpos' = 'LATEST',
    'format' = 'json',
    'json.timestamp-format.standard' = 'ISO-8601'
)

The first line in the previous command tells Apache Zeppelin to provide a stream SQL environment (%flink.ssql) for the Apache Flink interpreter. I can also interact with the streaming data using a batch SQL environment (%flink.bsql), or Python (%flink.pyflink) or Scala (%flink) code.

The first part of the CREATE TABLE statement is familiar to anyone who has used SQL with a database. A table is created to store the sensor data in the stream. The WATERMARK option is used to measure progress in the event time, as described in the Event Time and Watermarks section of the Apache Flink documentation.

The second part of the CREATE TABLE statement describes the connector used to receive data in the table (for example, kinesis or kafka), the name of the stream, the AWS Region, the overall data format of the stream (such as json or csv), and the syntax used for timestamps (in this case, ISO 8601). I can also choose the starting position for processing the stream; I am using LATEST to read the most recent data first.

When the table is ready, I find it in the AWS Glue Data Catalog database I selected when I created the notebook:

Console screenshot.

Now I can run SQL queries on the sensor_data table and use sliding or tumbling windows to get a better understanding of what is happening with my sensors.

For an overview of the data in the stream, I start with a simple SELECT to get all the content of the sensor_data table:

%flink.ssql(type=update)

SELECT * FROM sensor_data;

This time the first line of the command has a parameter (type=update) so that the output of the SELECT, which is more than one row, is continuously updated when new data arrives.

On the terminal of my laptop, I start the random_data_generator.py script:

$ python3 random_data_generator.py

At first I see a table that contains the data as it comes. To get a better understanding, I select a bar graph view. Then, I group the results by status to see their average current_temperature, as shown here:

Notebook screenshot.

As expected by the way I am generating these results, I have different average temperatures depending on the status (OK, WARNING, or ERROR). The higher the temperature, the greater the probability that something is not working correctly with my sensors.

I can run the aggregated query explicitly using SQL syntax. This time, I want the result computed over a sliding window of 1 minute, with results updated every 10 seconds. To do so, I use the HOP function in the GROUP BY section of the SELECT statement. To add the time to the output of the SELECT, I use the HOP_ROWTIME function. For more information, see how group window aggregations work in the Apache Flink documentation.

%flink.ssql(type=update)

SELECT sensor_data.status,
       COUNT(*) AS num,
       AVG(sensor_data.current_temperature) AS avg_current_temperature,
       HOP_ROWTIME(event_time, INTERVAL '10' second, INTERVAL '1' minute) as hop_time
  FROM sensor_data
 GROUP BY HOP(event_time, INTERVAL '10' second, INTERVAL '1' minute), sensor_data.status;

This time, I look at the results in table format:

Notebook screenshot.

To send the result of the query to a destination stream, I create a table and connect the table to the stream. First, I need to give permissions to the notebook to write into the stream.

In the Kinesis Data Analytics Studio console, I select my-notebook. Then, in the Studio notebooks details section, I choose Edit IAM permissions. Here, I can configure the sources and destinations used by the notebook and the IAM role permissions are updated automatically.

Console screenshot.

In the Included destinations in IAM policy section, I choose the destination and select my-output-stream. I save changes and wait for the notebook to be updated. I am now ready to use the destination stream.

In the notebook, I create a sensor_state table connected to my-output-stream.

%flink.ssql

CREATE TABLE sensor_state (
    status VARCHAR(6),
    num INTEGER,
    avg_current_temperature DOUBLE,
    hop_time TIMESTAMP(3)
)
WITH (
'connector' = 'kinesis',
'stream' = 'my-output-stream',
'aws.region' = 'us-east-1',
'scan.stream.initpos' = 'LATEST',
'format' = 'json',
'json.timestamp-format.standard' = 'ISO-8601');

I now use this INSERT INTO statement to continuously insert the result of the select into the sensor_state table.

%flink.ssql(type=update)

INSERT INTO sensor_state
SELECT sensor_data.status,
    COUNT(*) AS num,
    AVG(sensor_data.current_temperature) AS avg_current_temperature,
    HOP_ROWTIME(event_time, INTERVAL '10' second, INTERVAL '1' minute) as hop_time
FROM sensor_data
GROUP BY HOP(event_time, INTERVAL '10' second, INTERVAL '1' minute), sensor_data.status;

The data is also sent to the destination Kinesis data stream (my-output-stream) so that it can be used by other applications. For example, the data in the destination stream can be used to update a real-time dashboard, or to monitor the behavior of my sensors after a software update.
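
As an illustration of how another application could consume the aggregated results, the following boto3 sketch polls my-output-stream and prints each record. It reads only the first shard and uses simple get_records polling, which is fine for a quick check but is not how a production consumer should be built.

import json
import time

import boto3

kinesis = boto3.client("kinesis")
STREAM_NAME = "my-output-stream"

# Read from the first (and, in this example, only) shard of the stream.
shard_id = kinesis.list_shards(StreamName=STREAM_NAME)["Shards"][0]["ShardId"]
shard_iterator = kinesis.get_shard_iterator(
    StreamName=STREAM_NAME,
    ShardId=shard_id,
    ShardIteratorType="LATEST",
)["ShardIterator"]

while True:
    result = kinesis.get_records(ShardIterator=shard_iterator, Limit=100)
    for record in result["Records"]:
        # Each record carries status, num, avg_current_temperature, hop_time.
        print(json.loads(record["Data"]))
    shard_iterator = result["NextShardIterator"]
    time.sleep(1)  # avoid hammering the shard between polls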

I am satisfied with the result. I want to deploy this query and its output as a Kinesis Analytics application.

First, I create a SensorsApp note in my notebook and copy the statements that I want to execute as part of the application. The tables have already been created, so I just copy the INSERT INTO statement above.

Then, from the menu at the top right of my notebook, I choose Build SensorsApp and export to Amazon S3 and confirm the application name.

Notebook screenshot.

When the export is ready, I choose Deploy SensorsApp as Kinesis Analytics application in the same menu. After that, I fine-tune the configuration of the application. I set parallelism to 1 because I have only one shard in my input Kinesis data stream and not a lot of traffic. Then, I run the application, without having to write any code.

From the Kinesis Data Analytics applications console, I choose Open Apache Flink dashboard to get more information about the execution of my application.

Apache Flink console screenshot.

Availability and Pricing
You can use Amazon Kinesis Data Analytics Studio today in all AWS Regions where Kinesis Data Analytics is generally available. For more information, see the AWS Regional Services List.

In Kinesis Data Analytics Studio, we run the open-source versions of Apache Zeppelin and Apache Flink, and we contribute changes upstream. For example, we have contributed bug fixes for Apache Zeppelin, and we have contributed to AWS connectors for Apache Flink, such as those for Kinesis Data Streams and Kinesis Data Firehose. Also, we are working with the Apache Flink community to contribute availability improvements, including automatic classification of errors at runtime to understand whether errors are in user code or in application infrastructure.

With Kinesis Data Analytics Studio, you pay based on the average number of Kinesis Processing Units (KPU) per hour, including those used by your running notebooks. One KPU comprises 1 vCPU of compute, 4 GB of memory, and associated networking. You also pay for running application storage and durable application storage. For more information, see the Kinesis Data Analytics pricing page.

Start using Kinesis Data Analytics Studio today to get better insights from your streaming data.

Danilo

CEO Fraud

CEO Fraud / BEC is a type of targeted email attack. It commonly involves a cyber criminal pretending to be your boss or a senior leader and then tricking you into sending the criminal highly sensitive information, buying gift cards or initiating a wire transfer. Be highly suspicious of any emails demanding immediate action and/or asking you to bypass any security procedures.

SecretManagement Module v1.1.0 preview update

Microsoft.PowerShell.SecretManagement 1.1.0-preview

The next 1.1.0 version of SecretManagement has a number of minor fixes, but also two significant changes, one of which is a potentially breaking change for extension authors.
For more information on the changes in this release, see the CHANGELOG document.
This blog discusses the primary changes and why one is potentially breaking; it is especially relevant for extension vault owners who may want to test their vaults and make any changes while these updates are in a preview state.

SecretManagement extension vault modules hosting change

SecretManagement extension vaults are PowerShell modules that conform to a special format.
Currently, when SecretManagement loads an extension vault module for use, it loads the module into the current user session.
However, this method of hosting extension vault modules prevented SecretManagement from running in ConstrainedLanguage (CL) mode.
To fix this problem, v1.1.0 of SecretManagement now hosts extension vaults in a separate runspace session.
Changing how the extension module is hosted has the potential to break existing extension vaults.

What are the effects of the hosting change?

This change mostly affects extension vault authors because they can no longer assume their loaded vault module has access to current user session state.
Generally, extension vaults should never rely on shared user session state.
But if an extension did have this dependency, it would no longer work with this next version of SecretManagement.
We tested some extension vault modules on the PowerShellGallery, but found only one module that was affected by this change (SecretManagement.KeePass).

SecretManagement.KeePass extension vault

The SecretManagement.KeePass extension vault currently relies on its module being loaded in the user session in order to support its vault unlock operation via the Unlock-KeePassSecretVault command.
With the new SecretManagement version, the Unlock-KeePassSecretVault cmdlet is no longer effective, and the user is always prompted for a password when required by the vault.
This can be problematic for scripts being run unattended that do not support user interaction.

The KeePass vault needed to use this user shared state approach due to a deficiency in SecretManagement.
SecretManagement did not support a vault unlock operation, and required vaults that needed it to implement their own.
So one of the changes in this release is to add a new extension vault function, Unlock-SecretVault, that allows extension vaults to provide this functionality directly through SecretManagement.
With this new function, the KeePass vault no longer needs to leverage shared state with the user session.

New Unlock-SecretVault command

SecretManagement now supports a new Unlock-SecretVault command.
Extension vaults that require unlocking can optionally support this by implementing the Unlock-SecretVault function in their extension.
The SecretManagement.KeePass extension vault module will be updated to use this new function, mitigating the problem.
For more information see SecretManagement Readme and Design documentation.

Other differences you may notice

Since the extension vaults are no longer loaded in the user session, you will no longer see them listed when you run the Get-Module command, which lists the currently loaded modules in your session.
This is arguably a good thing, since users generally don’t want or need to know about extension vault modules, and don’t need to see them in their current session.
Seeing extension vault modules unexpectedly loaded in their sessions may be confusing to users.

Some extension vault modules, such as Microsoft.PowerShell.SecretStore and SecretManagement.KeePass, include additional commands for the user.
The Microsoft.PowerShell.SecretStore vault provides additional commands for configuring the vault.
The SecretManagement.KeePass vault provides additional commands for unlocking and registering the vault.
These commands will continue to work.
When these commands are run, the extension vault module will be visible from Get-Module because the commands are run in the current user session.
But that is the only time the extension vault modules will be loaded in the user session.
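
For example, assuming a SecretStore vault registered under the name SecretStore and a secret named MySecret, you might see something like this:

    Get-Secret -Name MySecret -Vault SecretStore    # vault module runs in its own runspace
    Get-Module Microsoft.PowerShell.SecretStore     # returns nothing

    Get-SecretStoreConfiguration                    # vault-specific command runs in the user session
    Get-Module Microsoft.PowerShell.SecretStore     # now the module is listed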

What is a runspace?

A PowerShell runspace encompasses the session context in which PowerShell scripts run, and can be thought of as an individual PowerShell session.
The PowerShell command shell usually has just a single session, but can support any number of sessions via multiple runspaces.
The runspace isolates multiple running scripts from each other within a single process.
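
A quick way to see this isolation is to run a script through the PowerShell API, which hosts it in its own runspace:

    $ps = [powershell]::Create()            # creates a PowerShell instance with its own runspace
    $null = $ps.AddScript('$x = 42; $x')    # $x lives only inside that runspace
    $ps.Invoke()                            # returns 42
    $ps.Dispose()
    $x                                      # still undefined in the current session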

What is CL mode?

Constrained Language is a PowerShell language mode that restricts the language elements that can be used to invoke arbitrary APIs.
It is commonly used within a system-wide application control policy, such as Windows AppLocker or Windows Defender Application Control, that restricts which applications are available and which scripts are trusted on the system.
Untrusted scripts run in ConstrainedLanguage while trusted scripts run in FullLanguage mode.
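
You can check which language mode a session is running in through the $ExecutionContext automatic variable:

    $ExecutionContext.SessionState.LanguageMode    # FullLanguage, ConstrainedLanguage, etc.

    # Under ConstrainedLanguage mode, creating arbitrary .NET types is blocked,
    # so a call like the following fails because only core types may be created:
    # New-Object -TypeName System.Net.WebClient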

Feedback and Support

Community feedback has been essential to the iterative development of these modules.
Thank you to everyone who has contributed issues and feedback thus far!
To file issues or get support for the SecretManagement interface or vault development experience, please use the SecretManagement repository.
For issues that pertain specifically to the SecretStore and its cmdlet interface, please use the SecretStore repository.

The post SecretManagement Module v1.1.0 preview update appeared first on PowerShell Team.

Getting Started with Your Favorite Operational Tools on AWS Lambda – Extensions Are Now Generally Available

This post was originally published on this site

In October 2020, we announced the preview of AWS Lambda extensions, which you can use to easily integrate Lambda functions with your favorite tools for monitoring, observability, security, and governance.

Today, I’m happy to announce the general availability of AWS Lambda Extensions, which comes with new performance improvements and an expanded set of partners. As part of the GA release, we have enabled functions to send responses as soon as the function code is complete, without waiting for the included extensions to finish. This enables extensions to perform activities like sending telemetry to a preferred destination after the function’s response has been returned. We also welcome extensions from new partners: Imperva, Instana, Sentry, Site24x7, and the AWS Distro for OpenTelemetry.

You can use Lambda extensions for use cases such as capturing diagnostic information before, during, and after function invocation; automatically instrumenting your code without needing code changes; fetching configuration settings or secrets before the function invocation; detecting and alerting on function activity through security agents; and sending telemetry to custom destinations such as Amazon Simple Storage Service (Amazon S3), Amazon Kinesis, and Amazon Elasticsearch Service, directly and asynchronously from your Lambda functions.

Customers are drawn to the vision of serverless: the reduced operational responsibility frees them up to focus on their business problems. To help customers monitor, observe, secure, and govern their functions, AWS Lambda provides native integrations for logs and metrics through Amazon CloudWatch, tracing through AWS X-Ray, tracking configuration changes through AWS Config, and recording API calls through AWS CloudTrail. In addition, AWS Lambda partners provide tools for application management, API integration, deployment, monitoring, and security.

AWS Lambda extensions provide a simple way to extend the Lambda execution environment, which is where your function code is executed. AWS customers, partners, and the open source community can use the new Lambda Extensions API to build their own extensions, which are companion processes that augment the capabilities of Lambda functions. To learn how to build your own extensions, see the Building Extensions for AWS Lambda – In preview blog post. The post also includes information about changes to the Lambda lifecycle.

How AWS Lambda Extensions Works
AWS Lambda extensions are designed to be the easiest way to plug in the tools you use today without complex installation or configuration management. You can add tools to your functions using Lambda layers or include them in the image for functions deployed as container images.

Lambda extensions use the Extensions API to register for function and execution environment lifecycle events. In response to these events, extensions can start new processes or run logic. Lambda extensions can also use the Runtime Logs API to subscribe to a stream of the same logs that the Lambda service sends to Amazon CloudWatch directly from the Lambda execution environment. Lambda streams the logs to the extension, and the extension can then process, filter, and send the logs to any preferred destination.
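
As a rough sketch of what an external extension does behind the scenes, the loop below registers with the Extensions API and then blocks on the next lifecycle event. It is written in PowerShell 7 purely for illustration (real extensions are typically binaries or scripts shipped in the layer's extensions directory), the extension name "demo-extension" is a placeholder, and you should consult the Extensions API documentation for the authoritative endpoints and headers.

    $api = $env:AWS_LAMBDA_RUNTIME_API

    # Register for INVOKE and SHUTDOWN lifecycle events.
    $null = Invoke-RestMethod -Method Post `
        -Uri "http://$api/2020-01-01/extension/register" `
        -Headers @{ 'Lambda-Extension-Name' = 'demo-extension' } `
        -Body (@{ events = @('INVOKE', 'SHUTDOWN') } | ConvertTo-Json) `
        -ContentType 'application/json' `
        -ResponseHeadersVariable headers
    $extensionId = @($headers['Lambda-Extension-Identifier'])[0]

    # Block until the next lifecycle event, then do the extension's work.
    while ($true) {
        $lifecycleEvent = Invoke-RestMethod -Method Get `
            -Uri "http://$api/2020-01-01/extension/event/next" `
            -Headers @{ 'Lambda-Extension-Identifier' = $extensionId }
        if ($lifecycleEvent.eventType -eq 'SHUTDOWN') { break }
        # ... send telemetry, refresh configuration, and so on ...
    }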

Most customers will use Lambda extensions without needing to know about the capabilities of the Extensions API. You can just consume capabilities of an extension by configuring the options in your Lambda functions.

How to Use Lambda Extensions
You can install and manage extensions using the Lambda console, the AWS Command Line Interface (CLI), or infrastructure as code (IaC) services and tools such as AWS CloudFormation, AWS Serverless Application Model (AWS SAM), and Terraform.
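
For example, attaching an extension that ships as a Lambda layer takes a single AWS CLI call; the layer ARN and function name below are placeholders. Note that --layers replaces the function's entire layer list, so include any layers that are already attached.

    # Hypothetical layer ARN and function name; substitute your own values.
    $layerArn = 'arn:aws:lambda:us-east-1:123456789012:layer:example-partner-extension:1'
    aws lambda update-function-configuration `
        --function-name my-function `
        --layers $layerArn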

To use Lambda extensions to integrate existing tools with your Lambda functions, choose a Lambda function and, on the Configuration tab, choose Monitoring and Operations tools.

On the Extensions page, you can find available extensions from AWS Lambda partners. Choose an extension to view its installation instructions.

AWS Lambda Extensions Partners
At this launch, Lambda extensions integrate with these AWS Lambda partners who have provided the following information to introduce their extensions. (I am updating this article with links as they are published.)

  • AppDynamics provides end-to-end transaction tracing for AWS Lambda. With the AppDynamics extension, it is no longer mandatory for developers to include the AppDynamics tracer as a dependency in their function code, making tracing transactions across hybrid architectures even simpler.
  • Coralogix is a log analytics and cloud security platform that empowers thousands of companies to improve security and accelerate software delivery, allowing you to get deep insights without paying for the noise. Coralogix can now read Lambda function logs and metrics directly, without using CloudWatch or Amazon S3, reducing the latency and cost of observability.
  • The Datadog extension brings comprehensive, real-time visibility to your serverless applications. Combined with Datadog’s integration with AWS, you get metrics, traces, and logs to help you monitor, detect, and resolve issues at any scale. The Datadog extension makes it easier than ever to get telemetry from your serverless workloads.
  • The Dynatrace extension makes it even easier to bring AWS Lambda metrics and traces into the Dynatrace platform for intelligent observability and automatic root cause detection. Get comprehensive, end-to-end observability with the flip of a switch and no code changes.
  • Epsagon helps you monitor, troubleshoot, and lower the cost of your Lambda functions. Epsagon’s extension reduces the overhead of sending traces to the Epsagon service, with minimal performance impact to your function.
  • HashiCorp Vault allows you to secure, store, and tightly control access to your application’s secrets and sensitive data. With the Vault extension, you can now authenticate and securely retrieve dynamic secrets before your Lambda function is invoked.
  • Honeycomb is a powerful observability tool that helps you debug your entire production app stack. Honeycomb’s extension decreases the overhead, latency, and cost of sending events to the Honeycomb service, while increasing reliability.
  • Instana Enterprise Observability Platform ingests performance metrics, traces requests, and profiles processes to make observability work for the enterprise. The Instana Lambda extension offers modification-free, low latency tracing of Lambda functions backed by their real-time Enterprise Observability Platform.
  • Imperva Serverless Protection protects organizations from vulnerabilities created by misconfigured apps and code-level security risks in serverless computing environments. The Imperva extension enables customers to easily embed additional security in their DevOps processes for serverless applications without requiring any code changes, leading to faster time to market.
  • Lumigo provides a monitoring and observability platform for serverless and microservices applications. The Lumigo extension enables the new Lumigo Lambda Profiler to see a breakdown of function resources, including CPU, memory, and network metrics. Use the extension to receive actionable insights to reduce Lambda runtime duration and cost, fix bottlenecks, and increase efficiency.
  • Check Point CloudGuard provides full lifecycle security for serverless applications. The CloudGuard extension enables Function Self Protection data aggregation as an out-of-process extension, providing detection and alerting on application layer attacks.
  • New Relic enables you to efficiently monitor, troubleshoot, and optimize your Lambda functions. New Relic’s extension allows you to send your Lambda service platform logs directly to New Relic’s unified observability platform, so you can quickly visualize data with minimal latency and cost.
  • Thundra provides an application debugging, observability and security platform for serverless, container and virtual machine (VM) workloads. The Thundra extension adds asynchronous telemetry reporting functionality to the Thundra agents, getting rid of network latency.
  • Splunk offers an enterprise-grade cloud monitoring solution for real-time full-stack visibility at scale. The Splunk extension provides a simplified runtime-independent interface to collect high-resolution observability data with minimal overhead. Monitor, manage, and optimize the performance and cost of your serverless applications with Splunk Observability solutions.
  • Sentry’s extension enables developers to monitor code health. From error tracking to performance monitoring, developers can see issues more clearly, solve them quicker, and continuously stay informed about the health of their applications, all without making code changes.
  • Site24x7 provides a performance monitoring solution for DevOps and IT operations. The Site24x7 extension enables real-time observability into your Lambda functions. It enables you to monitor critical Lambda metrics and function executions logs and optimize execution time and performance.
  • The Sumo Logic extension enables you to get instant visibility into the health and performance of your mission-critical applications using AWS Lambda. With this extension and Sumo Logic’s continuous intelligence platform, you can now ensure that all your Lambda functions are running as expected by analyzing function, platform, and extension logs to quickly identify and remediate errors and exceptions.

Here are Lambda extensions from AWS services:

  • AWS AppConfig helps you manage, store, and safely deploy application configurations to your hosts at runtime. The AWS AppConfig extension integrates Lambda and AWS AppConfig seamlessly, giving Lambda functions quick and easy access to external configuration settings. Developers can now dynamically change their Lambda function’s configuration safely using robust validation features.
  • Amazon CodeGuru Profiler helps developers improve application performance and reduce costs by pinpointing an application’s most expensive line of code. It provides recommendations for improving code to save money. The Lambda integration removes the need to change any code or redeploy packages.
  • Amazon CloudWatch Lambda Insights enables you to efficiently monitor, troubleshoot, and optimize Lambda functions. The Lambda Insights extension simplifies the collection, visualization, and investigation of detailed compute performance metrics, errors, and logs. You can more easily isolate and correlate performance problems to optimize your Lambda environments.
  • AWS Distro for OpenTelemetry is a secure, production-ready, AWS-supported distribution of the OpenTelemetry project. The Lambda extension runs the OpenTelemetry collector and enables functions to send trace data to AWS monitoring services such as AWS X-Ray and to any destination such as Honeycomb and Lightstep that supports OpenTelemetry Protocol (OTLP) using the OTLP exporter.

To get started with Lambda extensions, use the links provided to install these extensions.

Things to Know
Here are a couple of things to keep in mind:

Pricing: Extensions share the same billing model as Lambda functions and you are charged for compute time used in all phases of the Lambda lifecycle. For function invocations, you pay for requests served and the compute time used to run your code and all extensions, together, in 1ms increments. To learn more about billing for extensions, visit the Lambda FAQs page.

Performance: Lambda extensions might impact the performance of your function because they share resources such as CPU, memory, and storage with the function, and because extensions are initialized before function code. For example, if an extension performs compute-intensive operations, you might see your function’s execution duration increase because the extension and your function code share the same CPU resources.

Because Lambda allocates CPU power in proportion to the memory setting, you might see increased execution and initialization duration at lower memory settings as more processes compete for the same CPU resources. You can use CloudWatch metrics such as PostRuntimeExtensionsDuration to measure the extra time the extension takes after the function execution and MaxMemoryUsed to measure the increase in memory used.
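
For example, you could pull the post-invocation extension overhead for a function over the last hour with a query along these lines; the function name is a placeholder.

    $end   = (Get-Date).ToUniversalTime().ToString('yyyy-MM-ddTHH:mm:ssZ')
    $start = (Get-Date).ToUniversalTime().AddHours(-1).ToString('yyyy-MM-ddTHH:mm:ssZ')
    aws cloudwatch get-metric-statistics `
        --namespace AWS/Lambda `
        --metric-name PostRuntimeExtensionsDuration `
        --dimensions Name=FunctionName,Value=my-function `
        --start-time $start `
        --end-time $end `
        --period 300 `
        --statistics Maximum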

Available Now
The performance improvements announced as part of GA are currently available in the US East (N. Virginia), Europe (Ireland), and Europe (Milan) Regions.

You can also build your own extensions. To learn how to build extensions, see the Lambda Extensions API in the AWS Lambda Developer Guide. You can send feedback through the AWS forum for AWS Lambda or through your usual AWS Support contacts.

Channy

Forwarding Emails

This post was originally published on this site

When you forward an email to others or copy new people to an email thread, review the entire email and make sure all of the information it contains is suitable for everyone. It is very easy to forward an email without realizing there is highly sensitive information at the bottom of the thread that some recipients should not have access to.