PowerShell 7.2 Preview 2 release

This post was originally published on this site

PowerShell 7.2 Preview 2

Today we are proud to announce the second preview release of PowerShell 7.2.
This preview is still based on .NET 5 as we wait for the first preview of .NET 6 which we expect PowerShell 7.2 to be based upon.

This preview includes many changes including code cleanup, bug fixes, and a few new features.

Code cleanup

The community has made significant contributions to code cleanup,
which is a focus early in a new release.
Approximately two-thirds of the 120 pull requests were for code cleanup!

Thanks to all the community members involved in submitting pull requests and reviewing them!

Notable bug fixes

Although we appreciate all bug fixes from the community, there are a few that I believe have a broader impact and are worth mentioning.

Correct handling of Windows invalid reparse points

On Windows, reparse points are a collection of user-defined data that define specific filesystem behaviors.
For example, symbolic links, OneDrive files, and Microsoft installed applications use reparse points.
Due to a bug introduced in PowerShell 7.1, if you try to use an executable on a drive that isn’t NTFS, you’ll get an Incorrect Function error.
This can be a local USB drive or a network share, for example.

Thanks to our community maintainer Ilya Sazonov for the fix.

We expect to backport this fix to PowerShell 7.1 for the next servicing release.

Breaking changes

-PipelineVariable common parameter

The -PipelineVariable common parameter
now correctly contains all the objects passed in from the pipeline, instead of just the first input object, making script cmdlets work the same as C# cmdlets.

You can see an example of the change in behavior in the original issue.

Thanks to Joel Sallow for the fix.

New features

$PSStyle automatic variable for ANSI rendering

When working in the console with a modern terminal, color and text effects can help
make text information more interesting and useful.

This experimental feature called PSAnsiRendering exposes a new $PSStyle automatic variable that can be used for two different purposes.

The first is to make it easier to author text content that contains ANSI escape codes which control
text decorations like color, bold, italics, etc…

This example simply dumps the contents of $PSStyle and shows you the members you can use and their effect on text as well as the actual ANSI escape sequence.
Note that the custom formatting for this variable includes nested types like Formatting, Foreground, and Background.

$PSStyle variable

You can use multiple ANSI escape sequences together.
In this example, I’ve set warning messages to have bold and italicized yellow text on a magenta background:

Warning message style customization
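
To make the idea of combining escape sequences concrete, here is a minimal sketch in Python rather than PowerShell. It does not use $PSStyle at all; the SGR table and the style() and decorate() helpers are hypothetical names introduced only for illustration. The numeric codes are the standard ANSI SGR parameters for bold, italic, yellow foreground, and magenta background, and joining them into a single sequence is a choice made for this sketch, not necessarily what PowerShell emits.

```python
ESC = "\x1b"

# Standard ANSI SGR (Select Graphic Rendition) parameters.
# The dictionary keys are hypothetical names chosen for this sketch.
SGR = {
    "reset": 0,
    "bold": 1,
    "italic": 3,
    "fg_yellow": 33,
    "bg_magenta": 45,
}


def style(*names: str) -> str:
    """Return one escape sequence that enables all of the named attributes."""
    codes = ";".join(str(SGR[name]) for name in names)
    return f"{ESC}[{codes}m"


def decorate(text: str, *names: str) -> str:
    """Wrap text in the requested attributes and reset them afterwards."""
    return f"{style(*names)}{text}{style('reset')}"


if __name__ == "__main__":
    # Bold, italic, yellow text on a magenta background, as in the warning example.
    print(decorate("WARNING: sample message", "bold", "italic", "fg_yellow", "bg_magenta"))
```

Whether italics actually render depends on the terminal; the escape sequence itself is the same either way.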

There are also FromRgb() methods available to make use of full 24-bit color if your terminal supports it:

24-bit color text
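
For readers curious what a 24-bit color sequence looks like on the wire, the following Python sketch builds the commonly used "38;2;r;g;b" (foreground) and "48;2;r;g;b" (background) sequences. The function names fg_rgb() and bg_rgb() are hypothetical and are not the FromRgb() methods themselves; this only illustrates the kind of escape sequence such a method would need to produce.

```python
ESC = "\x1b"


def _check(component: int) -> int:
    """Validate that a color component fits in one byte."""
    if not 0 <= component <= 255:
        raise ValueError(f"color component out of range: {component}")
    return component


def fg_rgb(red: int, green: int, blue: int) -> str:
    """Escape sequence selecting a 24-bit foreground color."""
    return f"{ESC}[38;2;{_check(red)};{_check(green)};{_check(blue)}m"


def bg_rgb(red: int, green: int, blue: int) -> str:
    """Escape sequence selecting a 24-bit background color."""
    return f"{ESC}[48;2;{_check(red)};{_check(green)};{_check(blue)}m"


if __name__ == "__main__":
    reset = f"{ESC}[0m"
    print(f"{fg_rgb(255, 128, 0)}orange text{reset}")
    print(f"{bg_rgb(0, 64, 128)}text on a blue background{reset}")
```

Terminals without true-color support may ignore these sequences or approximate the color.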

C# module authors can also leverage $PSStyle by using the PSStyle singleton class in the System.Management.Automation namespace:

string text = $"{PSStyle.Instance.Reverse}{PSStyle.Instance.Foreground.Green}PowerShell{PSStyle.Instance.Foreground.Yellow} Rocks!{PSStyle.Instance.Reset}";

You can control how PowerShell outputs strings that contain ANSI escape sequences by setting $PSStyle.OutputRendering:

  • Automatic
    This is the default and currently will output the text as-is whether it is to the host or through the pipeline if the
    terminal supports ANSI escape sequences (otherwise the output will be plaintext). This is similar behavior to what you
    would get on Linux.
  • Ansi
    This value will output the text as-is whether it is to the host or through the pipeline.
  • PlainText
    This value will remove ANSI escape sequences from any text output whether it is to the host or through the pipeline.
  • Host
    This value will output the text as-is if sent to the host if ANSI escape sequences are supported, but will output plaintext
    if the output is sent through the pipeline or redirected. This is similar behavior to what you would get on macOS.
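
The four modes above can be summarized as a small decision table. The Python sketch below is only a model of the behavior as described in this post, not PowerShell's implementation: render(), to_plain_text(), and the to_host and ansi_supported parameters are hypothetical names, and the regular expression covers CSI-style sequences (such as colors and text attributes) rather than every possible escape sequence.

```python
import re

# Matches CSI sequences: ESC [ parameter bytes, intermediate bytes, final byte.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def to_plain_text(text: str) -> str:
    """Remove CSI escape sequences, leaving only the visible text."""
    return ANSI_CSI.sub("", text)


def render(text: str, mode: str, to_host: bool, ansi_supported: bool) -> str:
    """Model of the OutputRendering modes as this post describes them."""
    if mode == "Ansi":
        return text
    if mode == "PlainText":
        return to_plain_text(text)
    if mode == "Automatic":
        return text if ansi_supported else to_plain_text(text)
    if mode == "Host":
        return text if (to_host and ansi_supported) else to_plain_text(text)
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    styled = "\x1b[1;33mwarning\x1b[0m"
    for mode in ("Automatic", "Ansi", "PlainText", "Host"):
        # Simulate output going through the pipeline on an ANSI-capable terminal.
        print(mode, repr(render(styled, mode, to_host=False, ansi_supported=True)))
```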

As this is an experimental feature, we encourage feedback on this before we make a decision to take it out of experimental.
See the original issue for additional details, but open new issues if you have any problems or
suggestions on how to improve this feature.

We very much appreciate ongoing feedback on our preview releases so we can make adjustments before the release is finalized.
Please participate in ongoing discussions or create new issues in our repo.

Thanks again to the PowerShell community and all the amazing contributors!

Steve Lee
Principal Software Engineer Manager
PowerShell Team

The post PowerShell 7.2 Preview 2 release appeared first on PowerShell.

AA20-345A: Cyber Actors Target K-12 Distance Learning Education to Cause Disruptions and Steal Data


Original release date: December 10, 2020<br/><h3>Summary</h3><p>This Joint Cybersecurity Advisory was coauthored by the Federal Bureau of Investigation (FBI), the Cybersecurity and Infrastructure Security Agency (CISA), and the Multi-State Information Sharing and Analysis Center (MS-ISAC).</p>

<p>The FBI, CISA, and MS-ISAC assess malicious cyber actors are targeting kindergarten through twelfth grade (K-12) educational institutions, leading to ransomware attacks, the theft of data, and the disruption of distance learning services. Cyber actors likely view schools as targets of opportunity, and these types of attacks are expected to continue through the 2020/2021 academic year. These issues will be particularly challenging for K-12 schools that face resource limitations; therefore, educational leadership, information technology personnel, and security personnel will need to balance this risk when determining their cybersecurity investments.</p>

<p><a href="https://us-cert.cisa.gov/sites/default/files/publications/AA20-345A_Joint_Cybersecurity_Advisory_Distance_Learning_S508C.pdf">Click here</a> for a PDF version of this report.</p>
<h3>Technical Details</h3><p>As of December 2020, the FBI, CISA, and MS-ISAC continue to receive reports from K-12 educational institutions about the disruption of distance learning efforts by cyber actors.</p>

<h4>Ransomware</h4>

<p>The FBI, CISA, and MS-ISAC have received numerous reports of ransomware attacks against K-12 educational institutions. In these attacks, malicious cyber actors target school computer systems, slowing access, and—in some instances—rendering the systems inaccessible for basic functions, including distance learning. Adopting tactics previously leveraged against business and industry, ransomware actors have also stolen—and threatened to leak—confidential student data to the public unless institutions pay a ransom.</p>

<p>According to MS-ISAC data, the percentage of reported ransomware incidents against K-12 schools increased at the beginning of the 2020 school year. In August and September, 57% of ransomware incidents reported to the MS-ISAC involved K-12 schools, compared to 28% of all reported ransomware incidents from January through July.</p>

<p>The five most common ransomware variants identified in incidents targeting K-12 schools between January and September 2020—based on open source information as well as victim and third-party incident reports made to MS-ISAC—are Ryuk, Maze, Nefilim, AKO, and Sodinokibi/REvil.</p>

<h4>Malware</h4>

<p>Figure 1 identifies the top 10 malware strains that have affected state, local, tribal, and territorial (SLTT) educational institutions over the past year (up to and including September 2020). Note: These malware variants are purely opportunistic as they not only affect educational institutions but other organizations as well.</p>

<p>ZeuS and Shlayer are among the most prevalent malware affecting K-12 schools.</p>

<ul>
<li>ZeuS is a Trojan with several variants that targets Microsoft Windows operating systems. Cyber actors use ZeuS to infect target machines and send stolen information to command-and-control servers.</li>
<li>Shlayer is a Trojan downloader and dropper for macOS malware. It is primarily distributed through malicious websites, hijacked domains, and malicious advertising posing as a fake Adobe Flash updater. <strong>Note: </strong>Shlayer is the only malware of the top 10 that targets macOS; the other 9 affect Microsoft Windows operating systems.</li>
</ul>

<p class="text-align-center"><img alt="" data-entity-type="file" data-entity-uuid="ee5aa08d-fe73-44e6-8f7d-4b5e6ac08320" height="275" src="https://us-cert.cisa.gov/sites/default/files/publications/Top%2010%20Malware%20-%20K-12.png" width="614" /></p>

<p class="text-align-center"><cite>Figure 1: Top 10 malware affecting SLTT educational institutions</cite></p>

<h4>Distributed Denial-of-Service Attacks</h4>

<p>Cyber actors are causing disruptions to K-12 educational institutions, including third-party services supporting distance learning, with distributed denial-of-service (DDoS) attacks, which temporarily limit or prevent users from conducting daily operations. The availability of DDoS-for-hire services provides opportunities for any motivated malicious cyber actor to conduct disruptive attacks regardless of experience level. <strong>Note:</strong> DDoS attacks overwhelm servers with a high level of internet traffic originating from many different sources, making it impossible to mitigate at a single source.</p>

<h4>Video Conference Disruptions</h4>

<p>Numerous reports received by the FBI, CISA, and MS-ISAC since March 2020 indicate uninvited users have disrupted live video-conferenced classroom sessions. These disruptions have included verbally harassing students and teachers, displaying pornography and/or violent images, and doxing meeting attendees (<strong>Note: </strong>doxing is the act of compiling or publishing personal information about an individual on the internet, typically with malicious intent). To enter classroom sessions, uninvited users have been observed:</p>

<ul>
<li>Using student names to trick hosts into accepting them into class sessions, and</li>
<li>Accessing meetings from either publicly available links or links shared with outside users (e.g., students sharing links and/or passwords with friends).</li>
</ul>

<p>Video conference sessions without proper control measures risk disruption or compromise of classroom conversations and exposure of sensitive information.</p>

<h3>Additional Risks and Vulnerabilities</h3>

<p>In addition to the recent reporting of distance learning disruptions received by the FBI, CISA, and MS-ISAC, malicious cyber actors are expected to continue seeking opportunities to exploit the evolving remote learning environment.</p>

<h4>Social Engineering</h4>

<p>Cyber actors could apply social engineering methods against students, parents, faculty, IT personnel, or other individuals involved in distance learning. Tactics, such as phishing, trick victims into revealing personal information (e.g., password or bank account information) or performing a task (e.g., clicking on a link). In such scenarios, a victim could receive what appears to be legitimate email that:</p>

<ul>
<li>Requests personally identifiable information (PII) (e.g., full name, birthdate, student ID),</li>
<li>Directs the user to confirm a password or personal identification number (PIN),</li>
<li>Instructs the recipient to visit a website that is compromised by the cyber actor, or</li>
<li>Contains an attachment with malware.</li>
</ul>

<p>Cyber actors also register web domains that are similar to legitimate websites in an attempt to capture individuals who mistype URLs or click on similar looking URLs. These types of attacks are referred to as domain spoofing or homograph attacks. For example, a user wanting to access <code>www.cottoncandyschool.edu</code> could mistakenly click on <code>www.cottencandyschool.edu</code> (changed “<code>o</code>” to an “<code>e</code>”) or <code>www.cottoncandyschoo1.edu</code> (changed letter “<code>l</code>” to a number “1”) (<strong>Note:</strong> this is a fictitious example to demonstrate how a user can mistakenly click and access a website without noticing subtle changes in website URLs). Victims believe they are on a legitimate website when, in reality, they are visiting a site controlled by a cyber actor.</p>
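
<p>One way to flag lookalike domains such as the fictitious ones above is to compare a hostname against a list of trusted hostnames and report near misses. The Python sketch below uses a plain Levenshtein edit distance; the function names and the distance threshold of 2 are assumptions made for illustration, and this approach will not catch every spoofing technique (for example, visually similar Unicode characters would need additional handling).</p>

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution
            ))
        previous = current
    return previous[-1]


def is_lookalike(candidate: str, trusted: list[str], max_distance: int = 2) -> bool:
    """True if candidate is close to, but not identical to, a trusted hostname."""
    candidate = candidate.lower()
    return any(
        0 < edit_distance(candidate, name.lower()) <= max_distance
        for name in trusted
    )


if __name__ == "__main__":
    # Fictitious hostnames taken from the example in the advisory.
    trusted = ["www.cottoncandyschool.edu"]
    for host in ("www.cottoncandyschool.edu",
                 "www.cottencandyschool.edu",
                 "www.cottoncandyschoo1.edu"):
        print(host, is_lookalike(host, trusted))
```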

<h4>Technology Vulnerabilities and Student Data</h4>

<p>Whether as collateral for ransomware attacks or to sell on the dark web, cyber actors may seek to exploit the data-rich environment of student information in schools and education technology (edtech) services. The need for schools to rapidly transition to distance learning likely contributed to cybersecurity gaps, leaving schools vulnerable to attack. In addition, educational institutions that have outsourced their distance learning tools may have lost visibility into data security measures. Cyber actors could view the increased reliance on—and sharp usership growth in—these distance learning services and student data as lucrative targets.</p>

<h4>Open/Exposed Ports</h4>

<p>The FBI, CISA, and MS-ISAC frequently see malicious cyber actors exploiting exposed Remote Desktop Protocol (RDP) services to gain initial access to a network and, often, to manually deploy ransomware. For example, cyber actors will attack ports 445 (Server Message Block [SMB]) and 3389 (RDP) to gain network access. They are then positioned to move laterally throughout a network (often using SMB), escalate privileges, access and exfiltrate sensitive information, harvest credentials, or deploy a wide variety of malware. This popular attack vector allows cyber actors to maintain a low profile, as they are using a legitimate network service that provides them with the same functionality as any other remote user.</p>

<h4>End-of-Life Software</h4>

<p>End-of-Life (EOL) software is regularly exploited by cyber actors—often to gain initial access, deface websites, or further their reach in a network. Once a product reaches EOL, customers no longer receive security updates, technical support, or bug fixes. Unpatched and vulnerable servers are likely to be exploited by cyber actors, hindering an organization’s operational capacity.</p>
<h3>Mitigations</h3><h4>Plans and Policies</h4>

<p>The FBI and CISA encourage educational providers to maintain business continuity plans—the practice of executing essential functions through emergencies (e.g., cyberattacks)—to minimize service interruptions. Without planning, provision, and implementation of continuity principles, institutions may be unable to continue teaching and administrative operations. Evaluating continuity and capability will help identify potential operational gaps. Through identifying and addressing these gaps, institutions can establish a viable continuity program that will help keep them functioning during cyberattacks or other emergencies. The FBI and CISA suggest K-12 educational institutions review or establish patching plans, security policies, user agreements, and business continuity plans to ensure they address current threats posed by cyber actors.</p>

<h4>Network Best Practices</h4>

<ul>
<li>Patch operating systems, software, and firmware as soon as manufacturers release updates.</li>
<li>Check configurations for every operating system version for educational institution-owned assets to prevent issues from arising that local users are unable to fix due to having local administration disabled.</li>
<li>Regularly change passwords to network systems and accounts and avoid reusing passwords for different accounts.</li>
<li>Use multi-factor authentication where possible.</li>
<li>Disable unused remote access/RDP ports and monitor remote access/RDP logs.</li>
<li>Implement application and remote access allow listing to only allow systems to execute programs known and permitted by the established security policy.</li>
<li>Audit user accounts with administrative privileges and configure access controls with least privilege in mind.</li>
<li>Audit logs to ensure new accounts are legitimate.</li>
<li>Scan for open or listening ports and mediate those that are not needed.</li>
<li>Identify critical assets such as student database servers and distance learning infrastructure; create backups of these systems and house the backups offline from the network.</li>
<li>Implement network segmentation. Sensitive data should not reside on the same server and network segment as the email environment.</li>
<li>Set antivirus and anti-malware solutions to automatically update; conduct regular scans.</li>
</ul>
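
<p>As a rough illustration of the "scan for open or listening ports" item, the Python sketch below attempts a TCP connection to each port in a list and reports which ones accept it. The function name, timeout, and default target are assumptions made for this example; it is far less thorough than a dedicated scanning tool and should only be run against systems you are authorized to test.</p>

```python
import socket


def open_ports(host: str, ports: list[int], timeout: float = 0.5) -> list[int]:
    """Return the subset of ports on host that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found


if __name__ == "__main__":
    # Ports named in this advisory: 445 (SMB) and 3389 (RDP).
    print(open_ports("127.0.0.1", [445, 3389]))
```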

<h4>User Awareness Best Practices</h4>

<ul>
<li>Focus on awareness and training. Because end users are targeted, make employees and students aware of the threats—such as ransomware and phishing scams—and how they are delivered. Additionally, provide users training on information security principles and techniques as well as overall emerging cybersecurity risks and vulnerabilities.</li>
<li>Ensure employees know who to contact when they see suspicious activity or when they believe they have been a victim of a cyberattack. This will ensure that the proper established mitigation strategy can be employed quickly and efficiently.</li>
<li>Monitor privacy settings and information available on social networking sites.</li>
</ul>

<h4>Ransomware Best Practices</h4>

<p>The FBI and CISA do not recommend paying ransoms. Payment does not guarantee files will be recovered. It may also embolden adversaries to target additional organizations, encourage other criminal actors to engage in the distribution of ransomware, and/or fund illicit activities. However, regardless of whether your organization decided to pay the ransom, the FBI urges you to report ransomware incidents to your local FBI field office. Doing so provides the FBI with the critical information they need to prevent future attacks by identifying and tracking ransomware attackers and holding them accountable under U.S. law.</p>

<p>In addition to implementing the above network best practices, the FBI and CISA also recommend the following:</p>

<ul>
<li>Regularly back up data, air gap, and password protect backup copies offline.</li>
<li>Implement a recovery plan to maintain and retain multiple copies of sensitive or proprietary data and servers in a physically separate, secure location.</li>
</ul>

<h4>Denial-of-Service Best Practices</h4>

<ul>
<li>Consider enrolling in a denial-of-service mitigation service that detects abnormal traffic flows and redirects traffic away from your network.</li>
<li>Create a partnership with your local internet service provider (ISP) prior to an event and work with your ISP to control network traffic attacking your network during an event.</li>
<li>Configure network firewalls to block unauthorized IP addresses and disable port forwarding.</li>
</ul>

<h4>Video-Conferencing Best Practices</h4>

<ul>
<li>Ensure participants use the most updated version of remote access/meeting applications.</li>
<li>Require passwords for session access.</li>
<li>Encourage students to avoid sharing passwords or meeting codes.</li>
<li>Establish a vetting process to identify participants as they arrive, such as a waiting room.</li>
<li>Establish policies to require participants to sign in using true names rather than aliases.</li>
<li>Ensure only the host controls screensharing privileges.</li>
<li>Implement a policy to prevent participants from entering rooms prior to host arrival and to prevent the host from exiting prior to the departure of all participants.</li>
</ul>

<h4>Edtech Implementation Considerations</h4>

<p>When partnering with third-party and edtech services to support distance learning, educational institutions should consider the following:</p>
<ul>
<li>The service provider’s cybersecurity policies and response plan in the event of a breach and their remediation practices:
<ul>
<li>How did the service provider resolve past cyber incidents? How did their cybersecurity practices change after these incidents?</li>
</ul>
</li>
<li>The provider’s data security practices for their products and services (e.g., data encryption in transit and at rest, security audits, security training of staff, audit logs);</li>
<li>The provider’s data maintenance and storage practices (e.g., use of company servers, cloud storage, or third-party services);</li>
<li>Types of student data the provider collects and tracks (e.g., PII, academic, disciplinary, medical, biometric, IP addresses);</li>
<li>Entities to whom the provider will grant access to the student data (e.g., vendors);</li>
<li>How the provider will use student data (e.g., will they sell it to—or share it with—third parties for service enhancement, new product development, studies, marketing/advertising?);</li>
<li>The provider’s de-identification practices for student data; and</li>
<li>The provider’s policies on data retention and deletion.</li>
</ul>

<h4>Malware Defense</h4>

<p>Table 1 identifies CISA-created Snort signatures, which have been successfully used to detect and defend against related attacks, for the malware variants listed below. <strong>Note:</strong> the listing is not fully comprehensive and should not be used at the exclusion of other detection methods.</p>

<p class="text-align-center"><em>Table 1: Malware signatures</em></p>

<table border="1" cellpadding="1" cellspacing="1" class="general-table" style="width: 881.46px; height: 312px; margin-right: auto; margin-left: auto;">
<thead>
<tr>
<th scope="col" style="width: 198px;"><strong>Malware</strong></th>
<th scope="col" style="width: 356px;">Signature</th>
</tr>
</thead>
<tbody>
<tr>
<td scope="col" style="width: 198px; text-align: left;"><strong>NanoCore</strong></td>
<td scope="col" style="width: 356px; text-align: left;"><code>alert tcp any any -&gt; any $HTTP_PORTS (msg:"NANOCORE:HTTP GET URI contains 'FAD00979338'"; sid:00000000; rev:1; flow:established,to_server; content:"GET"; http_method; content:"getPluginName.php?PluginID=FAD00979338"; fast_pattern; http_uri; classtype:http-uri; metadata:service http;)&nbsp;</code></td>
</tr>
<tr>
<td scope="col" style="width: 198px; text-align: left;">
<p><strong>Cerber</strong></p>
</td>
<td scope="col" style="width: 356px; text-align: left;"><code>alert tcp any any -&gt; any $HTTP_PORTS (msg:"HTTP Client Header contains 'host|3a 20|polkiuj.top'"; sid:00000000; rev:1; flow:established,to_server; flowbits:isnotset,&lt;unique_ID&gt;.tagged; content:"host|3a 20|polkiuj.top|0d 0a|"; http_header; fast_pattern:only; flowbits:set,&lt;unique_ID&gt;.tagged; tag:session,10,packets; classtype:http-header; metadata:service http;)&nbsp;</code></td>
</tr>
<tr>
<td scope="col" style="width: 198px; text-align: left;"><strong>Kovter</strong></td>
<td scope="col" style="width: 356px; text-align: left;"><code>alert tcp any any -&gt; any $HTTP_PORTS (msg:"Kovter:HTTP URI POST to CnC Server"; sid:00000000; rev:1; flow:established,to_server; flowbits:isnotset,&lt;unique_ID&gt;.tagged; content:"POST / HTTP/1.1"; depth:15; content:"Content-Type|3a 20|application/x-www-form-urlencoded"; http_header; depth:47; fast_pattern; content:"User-Agent|3a 20|Mozilla/"; http_header; content:!"LOADCURRENCY"; nocase; content:!"Accept"; http_header; content:!"Referer|3a|"; http_header; content:!"Cookie|3a|"; nocase; http_header; pcre:"/^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{4})$/P"; pcre:"/User-Agent\x3a[^\r\n]+\r\nHost\x3a\x20(?:\d{1,3}\.){3}\d{1,3}\r\nContent-Length\x3a\x20[1-5][0-9]{2,3}\r\n(?:Cache-Control|Pragma)\x3a[^\r\n]+\r\n(?:\r\n)?$/H"; flowbits:set,&lt;unique_ID&gt;.tagged; tag:session,10,packets; classtype:nonstd-tcp; metadata:service http;)</code></td>
</tr>
<tr>
<td scope="col" style="width: 198px; text-align: left;"><strong>Dridex</strong></td>
<td scope="col" style="width: 356px; text-align: left;">
<p><code>alert tcp any any -&gt; any $HTTP_PORTS (msg:"HTTP URI GET contains 'invoice_########.doc' (DRIDEX)"; sid:00000000; rev:1; flow:established,to_server; content:"invoice_"; http_uri; fast_pattern:only; content:".doc"; nocase; distance:8; within:4; content:"GET"; nocase; http_method; classtype:http-uri; metadata:service http;)<br />
alert tcp any any -&gt; any $HTTP_PORTS (msg:"HTTP Client Header contains 'Host|3a 20|tanevengledrep ru' (DRIDEX)"; sid:00000000; rev:1; flow:established,to_server; flowbits:isnotset,&lt;unique_ID&gt;.tagged; content:"Host|3a 20|tanevengledrep|2e|ru|0d 0a|"; http_header; fast_pattern:only; flowbits:set,&lt;unique_ID&gt;.tagged; tag:session,10,packets; classtype:http-header; metadata:service http;)</code></p>
</td>
</tr>
</tbody>
</table>
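
<p>To show what one piece of these signatures checks, the Python sketch below reuses the first regular expression from the Kovter rule, which matches text made up entirely of well-formed base64. This is only an illustration of that one pattern outside of Snort: the function name is hypothetical, the other conditions in the rule are not modeled, and a match on its own says nothing about whether traffic is malicious.</p>

```python
import base64
import re

# Same pattern as the first pcre in the Kovter rule, without Snort's delimiters
# and modifier: groups of four base64 characters, with optional "=" padding.
BASE64_ONLY = re.compile(
    r"(?:[A-Za-z0-9+/]{4})*"
    r"(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{4})"
)


def looks_like_base64(body: str) -> bool:
    """True if the whole body is non-empty, well-formed base64 text."""
    return BASE64_ONLY.fullmatch(body) is not None


if __name__ == "__main__":
    encoded = base64.b64encode(b"example payload").decode("ascii")
    print(encoded, looks_like_base64(encoded))
    print("name=value&other=1", looks_like_base64("name=value&other=1"))
```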
<h3>Contact Information</h3><p>To report suspicious or criminal activity related to information found in this Joint Cybersecurity Advisory, contact your local FBI field office at <a href="https://www.fbi.gov/contact-us/field-offices">www.fbi.gov/contact-us/field</a>. When available, please include the following information regarding the incident: date, time, and location of the incident; type of activity; number of people affected; type of equipment used for the activity; the name of the submitting organization; and a designated point of contact.</p>

<p>To request incident response resources or technical assistance related to these threats, contact CISA at <a href="mailto:Central@cisa.gov">Central@cisa.gov</a>.</p>

<h3>Resources</h3>

<p>MS-ISAC membership is open to employees or representatives from all public K-12 education entities in the United States. The MS-ISAC provides multiple cybersecurity services and benefits to help K-12 education entities increase their cybersecurity posture. To join, visit <a href="https://learn.cisecurity.org/ms-isac-registration">https://learn.cisecurity.org/ms-isac-registration</a>.</p>

<ul>
<li><a href="https://www.cisa.gov/telework">CISA Telework Guidance and Resources</a></li>
<li><a href="https://www.cisa.gov/publication/secure-video-conferencing-schools">CISA Cybersecurity Recommendations and Tips for Schools Using Video Conferencing</a></li>
<li><a href="https://us-cert.cisa.gov/Ransomware">CISA Ransomware Publications</a></li>
<li><a href="https://www.cisa.gov/emergency-services-sector-continuity-planning-suite">CISA Emergency Services Sector Continuity Planning Suite</a></li>
<li><a href="https://www.cisa.gov/publication/ransomware-guide">CISA-MS-ISAC Joint Ransomware Guide</a></li>
<li><a href="https://us-cert.cisa.gov/ncas/tips/ST04-014">CISA Tip: Avoiding Social Engineering and Phishing Attacks</a></li>
<li><a href="https://www.us-cert.gov/ncas/tips/ST04-006">CISA Tip: Understanding Patches</a></li>
<li><a href="https://cyber.org/cybersafety">CISA and CYBER.ORG “Cyber Safety Video Series” for K-12 students and educators</a></li>
<li><a href="https://www.ic3.gov/media/2019/191002.aspx">FBI PSA: “High-Impact Ransomware Attacks Threaten U.S. Businesses and Organizations”</a></li>
</ul>

<p><strong>Note: </strong>contact your local FBI field office (<a href="http://www.fbi.gov/contact-us/field">www.fbi.gov/contact-us/field</a>) for additional FBI products on ransomware, edtech, and cybersecurity for educational institutions.</p>
<h3>Revisions</h3>
<ul> <li>Initial Version: December 10, 2020</li> </ul>
<hr />
<div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><p class="privacy-and-terms">This product is provided subject to this <a href="https://us-cert.cisa.gov/privacy/notification">Notification</a> and this <a href="https://www.dhs.gov/privacy-policy">Privacy &amp; Use</a> policy.</p>

</div>

Incident Report – PowerShell Gallery Downtime October 30, 2020


 

The PowerShell gallery experienced downtime on October 30, 2020. This report gives context on what caused the downtime, what actions were taken to mitigate the issue, and what steps we are taking to improve the PowerShell gallery experience moving forward.

Downtime Impact

The downtime was declared at 2020-10-30 03:29 PDT, and was mitigated about 12 hours later at 2020-10-30 15:39 PDT. During this time packages were not available from the gallery, and the web interface was not accessible.

Root Cause of the Downtime

The downtime was a result of an attempt to fix ongoing statistics errors with the gallery. For roughly 3 weeks the PowerShell gallery was experiencing many server errors (roughly 100-200 per minute) because a key had reached the maximum int value (total downloads had passed 2 billion), causing persistent int overflow errors on the gallery. This prevented new entries from being added to the ‘PackageStatistics’ table (required for the intermediary processing of statistics). The int overflow first occurred on 2020-09-18.
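
As a rough illustration of the limit involved, the Python sketch below models a key that is stored as a signed 32-bit integer and shows that it cannot go past 2,147,483,647, while a 64-bit key has ample headroom. The next_key() function and the constant names are hypothetical and are not taken from the gallery's code or database schema; raising an error is simply how this sketch models the overflow.

```python
INT32_MAX = 2**31 - 1   # 2,147,483,647: largest signed 32-bit value
INT64_MAX = 2**63 - 1   # largest signed 64-bit value


def next_key(current: int, max_value: int = INT32_MAX) -> int:
    """Return the next key, failing once the column type's limit is reached."""
    if current >= max_value:
        raise OverflowError(f"key {current} cannot be incremented past {max_value}")
    return current + 1


if __name__ == "__main__":
    print(next_key(2_000_000_000))            # still fits in 32 bits
    try:
        next_key(INT32_MAX)                   # one past the 32-bit limit
    except OverflowError as error:
        print("overflow:", error)
    print(next_key(INT32_MAX, INT64_MAX))     # fits once the key is 64-bit
```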

After an attempt to perform database migrations failed due to the persistent errors, manual updates were made to the database to fix inflated package statistics numbers.
These changes triggered a series of deadlock and timeout errors which consumed all our available cloud resources.
This caused a spike in DTU/CPU utilization for the database, which inversely correlated with the availability of the service. The availability of the gallery was so low that it was non-functional and declared down.

Mitigating the Downtime

The first mitigation step was to restore the gallery database (DB) to a previous timestamp. It was believed that an error in the attempted fix of gallery statistics caused the DB to get into a bad state, so restoring the DB would revert those changes. This initial error was likely due to a trigger on the database that we did not account for. Unfortunately, reverting the DB caused additional issues. Checking the PowerShell gallery backend logs, we saw that the service had trouble connecting to the DB with an error that user credentials were wrong. This indicated that the user had been orphaned by the restore, so we re-created the user in the DB. After this step, checking the PowerShell gallery backend logs again, we saw that the service still had trouble connecting to the DB, this time with an error that login was failing. We determined that this error was caused by the DB restore dropping the DB from the gallery’s failover group. The next mitigation step was to re-add the DB to the gallery’s failover group. The final mitigation step was to restart the cloud services so they could re-connect to the failover group. At this point the gallery started working again. We validated these fixes with customers, as well as with our own testing, and continued to closely monitor the DTU/CPU utilization and service availability.

Statistics Errors

The gallery has had ongoing issues with the package statistics since August 2020.

These errors came from the gallery reaching a scale (more than 2 billion installations) that was not supported by the design of the statistics pipeline. The impact of this has been both incorrect and unavailable package statistics. The package statistics from 2020-09-18 through 2020-10-07 were never recorded, which meant we were unable to recover statistics from this period.

Restoring Statistics

We restored statistics in two ways: first, we repaired statistics for individual packages (surfaced on a package’s page), and then we repaired aggregated statistics (surfaced on the gallery homepage and statistics page).

In order to repair package statistics, we changed the key for package statistics from an integer to a bigint/long, both in our main database and in the code base that referenced it. There was some pending data that was dropped when the int overflow error first appeared. We retrieved specific ‘lost’ data from a restored database, but were unfortunately unable to recover some data (mentioned in Statistics Errors).

To repair the aggregated statistics, we then made parallel changes to our data warehouse.

Our repair items are focused on 3 categories: detect, diagnose, and fix. By focusing on these three areas, we hope not only to improve the overall performance of the gallery but also to find and mitigate issues more quickly as they arise.

  • Detect:
    • Add more notifications to the production database
    • Create alerts for when critical metrics are reached in the DB
    • Improve post-deployment validation so that we can quickly roll back undesirable changes
  • Diagnose:
    • Send database logs to a central location outside of the service so that logs are more easily available
  • Fix:
    • Improve the deployment process for gallery cloud services
    • Better document (internal) procedures for recovery and communication during an outage
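
As one possible shape for the "alerts for when critical metrics are reached" item, the Python sketch below compares current metric values against their limits and reports any that have used up most of their headroom. Everything here is hypothetical: the metric names, the 80% threshold, and the headroom_alerts() function are illustrative assumptions, not a description of the gallery's monitoring.

```python
def headroom_alerts(metrics: dict[str, float],
                    limits: dict[str, float],
                    warn_ratio: float = 0.8) -> list[str]:
    """Return a message for every metric at or above warn_ratio of its limit."""
    alerts = []
    for name, value in metrics.items():
        ratio = value / limits[name]
        if ratio >= warn_ratio:
            alerts.append(f"{name} at {ratio:.0%} of limit")
    return alerts


if __name__ == "__main__":
    # Hypothetical readings: an int key nearing its 32-bit limit, and DB utilization.
    metrics = {"statistics_key": 1_900_000_000, "dtu_percent": 40}
    limits = {"statistics_key": 2**31 - 1, "dtu_percent": 100}
    for message in headroom_alerts(metrics, limits):
        print(message)
```

A check like this is only useful if it runs on a schedule and notifies someone; how that is wired up depends on the monitoring system in use.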

 

We are also in the process of designing architectural changes to the PowerShell gallery to ensure it is a reliable, performant, and supportable service moving forward.

Expectations going forward

In conjunction with these repairs, we are working to set and monitor Service Level Objectives (SLOs). Look forward to a future post detailing these expectations and how gallery users can track our progress against these objectives.

Reporting Issues

If you notice any issues with the PowerShell gallery, please open an issue in our GitHub repository.

If you are a package owner and have an issue with your package, please use our support alias: cgadmin@microsoft.com.

We continue to update the status of the PowerShell gallery at: aka.ms/PSGalleryStatus.

Sydney

PowerShell Team

 

The post Incident Report – PowerShell Gallery Downtime October 30, 2020 appeared first on PowerShell.