Category Archives: PowerShell

Using PSScriptAnalyzer to check PowerShell version compatibility

This post was originally published on this site

PSScriptAnalyzer version 1.18 was released recently, and ships with powerful new rules that can check PowerShell scripts for incompatibilities with other PowerShell versions and environments.

In this blog post, the first in a series, we’ll see how to use these new rules to check a script for problems running on PowerShell 3, 5.1 and 6.

Wait, what’s PSScriptAnalyzer?

PSScriptAnalyzer is a module providing static analysis (or linting) and some dynamic analysis (based on the state of your environment) for PowerShell. It’s able to find problems and fix bad habits in PowerShell scripts as you create them, similar to the way the C# compiler will give you warnings and find errors in C# code before it’s executed.

If you use the VSCode PowerShell extension, you might have seen the “green squigglies” and problem reports that PSScriptAnalyzer generates for scripts you author:

Image of PSScriptAnalyzer linting in VSCode with green squigglies

You can install PSScriptAnalyzer to use on your own scripts with:

Install-Module PSScriptAnalyzer -Scope CurrentUser

PSScriptAnalyzer works by running a series of rules on your scripts, each of which independently assesses some issue. For example, AvoidUsingCmdletAliases checks that aliases aren’t used in scripts, and MisleadingBacktick checks that backticks at the ends of lines aren’t followed by whitespace.
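For example, running the analyzer over a throwaway one-liner that uses the gci alias surfaces the alias rule (a minimal sketch; the file name is arbitrary and PSScriptAnalyzer must be installed):

```powershell
# Write a one-line script that uses the 'gci' alias instead of Get-ChildItem
Set-Content -Path ./aliasExample.ps1 -Value 'gci -Recurse'

# The PSAvoidUsingCmdletAliases rule flags the alias and suggests the full cmdlet name
Invoke-ScriptAnalyzer -Path ./aliasExample.ps1 |
    Where-Object { $_.RuleName -eq 'PSAvoidUsingCmdletAliases' }
```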

For more information, see the PSScriptAnalyzer deep dive blog series.

Introducing the compatibility check rules

The new compatibility checking functionality is provided by three new rules:

  • PSUseCompatibleSyntax, which checks whether a syntax used in a script will work in other PowerShell versions.
  • PSUseCompatibleCommands, which checks whether commands used in a script are available in other PowerShell environments.
  • PSUseCompatibleTypes, which checks whether .NET types and static methods/properties are available in other PowerShell environments.

The syntax check rule simply requires a list of PowerShell versions you want to target, and will tell you if a syntax used in your script won’t work in any of those versions.

The command and type checking rules are more sophisticated and rely on profiles (catalogs of commands and types available) from commonly used PowerShell platforms. They require configuration to use these profiles, which we’ll go over below.

For this post, we’ll look at configuring and using PSUseCompatibleSyntax and PSUseCompatibleCommands to check that a script works with different versions of PowerShell. We’ll look at PSUseCompatibleTypes in a later post, although it’s configured very similarly to PSUseCompatibleCommands.

Working example: a small PowerShell script

Imagine we have a small (and contrived) archival script saved to .\archiveScript.ps1:

# Import helper module to get folders to archive
Import-Module -FullyQualifiedName @{ ModuleName = 'ArchiveHelper'; ModuleVersion = '1.1' }

$paths = Get-FoldersToArchive -RootPath 'C:\Documents\DocumentsToArchive'
$archiveBasePath = '\\ArchiveServer\DocumentArchive'

# Dictionary to collect hashes
$hashes = [System.Collections.Generic.Dictionary[string, string]]::new()
foreach ($path in $paths)
{
    # Get the hash of the file and turn it into a base64 string
    $hash = (Get-FileHash -LiteralPath $path).Hash

    # Add this file to the hash catalog
    $hashes[$hash] = $path

    # Now give the archive a unique name and zip it up
    $name = Split-Path -LeafBase $path
    Compress-Archive -LiteralPath $path -DestinationPath (Join-Path $archiveBasePath "$name-$hash.zip")
}

# Add the hash catalog to the archive directory
ConvertTo-Json $hashes | Out-File -LiteralPath (Join-Path $archiveBasePath "catalog.json") -NoNewline

This script was written in PowerShell 6.2, and we’ve tested that it works there. But we also want to run it on other machines, some of which run PowerShell 5.1 and some of which run PowerShell 3.0.

Ideally we will test it on those other platforms, but it would be nice if we could try to iron out as many bugs as possible ahead of time.

Checking syntax with PSUseCompatibleSyntax

The first and easiest rule to apply is PSUseCompatibleSyntax. We’re going to create some settings for PSScriptAnalyzer to enable the rule, and then run analysis on our script to get back any diagnostics about compatibility.

Running PSScriptAnalyzer is straightforward. It comes as a PowerShell module, so once it’s installed on your module path you just invoke it on your file with Invoke-ScriptAnalyzer, like this:

Invoke-ScriptAnalyzer -Path '.\archiveScript.ps1'

A very simple invocation like this one will run PSScriptAnalyzer using its default rules and configurations on the script you point it to.

However, because they require more targeted configuration, the compatibility rules are not enabled by default. Instead, we need to supply some settings to run the syntax check rule. In particular, PSUseCompatibleSyntax requires a list of the PowerShell versions you are targeting with your script.

$settings = @{
    Rules = @{
        PSUseCompatibleSyntax = @{
            # This turns the rule on (setting it to false will turn it off)
            Enable = $true

            # List the targeted versions of PowerShell here
            TargetVersions = @(
                '3.0',
                '5.1',
                '6.2'
            )
        }
    }
}

Invoke-ScriptAnalyzer -Path .\archiveScript.ps1 -Settings $settings

Running this will present us with the following output:

RuleName                            Severity     ScriptName Line  Message
--------                            --------     ---------- ----  -------
PSUseCompatibleSyntax               Warning      archiveScr 8     The constructor syntax
                                                 ipt.ps1          '[System.Collections.Generic.Dictionary[string,
                                                                  string]]::new()' is not available by default in
                                                                  PowerShell versions 3,4

This is telling us that the [dictionary[string, string]]::new() syntax we used won’t work in PowerShell 3. Better than that, in this case the rule has actually proposed a fix:

$diagnostics = Invoke-ScriptAnalyzer -Path .\archiveScript.ps1 -Settings $settings
$diagnostics[0].SuggestedCorrections
File              : C:\Users\roholt\Documents\Dev\sandbox\VersionedScript\archiveScript.ps1
Description       : Use the 'New-Object @($arg1, $arg2, ...)' syntax instead for compatibility with PowerShell versions 3,4
StartLineNumber   : 8
StartColumnNumber : 11
EndLineNumber     : 8
EndColumnNumber   : 73
Text              : New-Object 'System.Collections.Generic.Dictionary[string,string]'
Lines             : {New-Object 'System.Collections.Generic.Dictionary[string,string]'}
Start             : Microsoft.Windows.PowerShell.ScriptAnalyzer.Position
End               : Microsoft.Windows.PowerShell.ScriptAnalyzer.Position

The suggested correction is to use New-Object instead. The way this is suggested might seem slightly unhelpful here with all the position information, but we’ll see later why this is useful.

This dictionary example is a bit artificial of course (since a hashtable would come more naturally), but having a spanner thrown into the works in PowerShell 3 or 4 because of a ::new() is not uncommon. The PSUseCompatibleSyntax rule will also warn you about classes, workflows and using statements depending on the versions of PowerShell you’re authoring for.
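To see the class warning in action, here is a hedged sketch (the file name and class are invented for illustration): a script that defines a class, checked against PowerShell 4.0, where class syntax doesn’t exist since classes only arrived in 5.0.

```powershell
# A throwaway script that uses PowerShell 5.0+ class syntax
Set-Content -Path ./classExample.ps1 -Value 'class Archive { [string]$Path }'

$settings = @{
    Rules = @{
        PSUseCompatibleSyntax = @{
            Enable = $true
            # 'class' will be flagged for the 4.0 target but not the 6.2 target
            TargetVersions = @('4.0', '6.2')
        }
    }
}

Invoke-ScriptAnalyzer -Path ./classExample.ps1 -Settings $settings
```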

We’re not going to make the suggested change just yet, since there’s more to show you first.

Checking command usage with PSUseCompatibleCommands

We now want to check the commands. Because command compatibility is a bit more complicated than syntax (since the availability of commands depends on more than what version of PowerShell is being run), we have to target profiles instead.

Profiles are catalogs of information taken from stock machines running common PowerShell environments. The ones shipped in PSScriptAnalyzer can’t always match your working environment perfectly, but they come pretty close (there’s also a way to generate your own profile, detailed in a later blog post). In our case, we’re trying to target PowerShell 3.0, PowerShell 5.1 and PowerShell 6.2 on Windows. We have the first two profiles, but in the last case we’ll need to target 6.1 instead. These targets are very close, so warnings will still be pertinent to using PowerShell 6.2. Later when a 6.2 profile is made available, we’ll be able to switch over to that.
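If you want to see exactly which profiles your installed copy of PSScriptAnalyzer ships with, you can list them straight out of the module directory (a sketch; the compatibility_profiles folder name is our assumption about the module’s on-disk layout):

```powershell
# Find the newest installed PSScriptAnalyzer and its install directory
$moduleBase = (Get-Module -ListAvailable PSScriptAnalyzer |
    Sort-Object -Property Version -Descending |
    Select-Object -First 1).ModuleBase

# List the bundled compatibility profiles by file name
Get-ChildItem -Path (Join-Path $moduleBase 'compatibility_profiles') -Name
```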

We need to look under the PSUseCompatibleCommands documentation for a list of profiles available by default. For our desired targets we pick:

  • PowerShell 6.1 on Windows Server 2019 (win-8_x64_10.0.14393.0_6.1.3_x64_4.0.30319.42000_core)
  • PowerShell 5.1 on Windows Server 2019 (win-8_x64_10.0.17763.0_5.1.17763.316_x64_4.0.30319.42000_framework)
  • PowerShell 3.0 on Windows Server 2012 (win-8_x64_6.2.9200.0_3.0_x64_4.0.30319.42000_framework)

The long names on the right are canonical profile identifiers, which we use in the settings:

$settings = @{
    Rules = @{
        PSUseCompatibleCommands = @{
            # Turns the rule on
            Enable = $true

            # Lists the PowerShell platforms we want to check compatibility with
            TargetProfiles = @(
                'win-8_x64_10.0.14393.0_6.1.3_x64_4.0.30319.42000_core',
                'win-8_x64_10.0.17763.0_5.1.17763.316_x64_4.0.30319.42000_framework',
                'win-8_x64_6.2.9200.0_3.0_x64_4.0.30319.42000_framework'
            )
        }
    }
}

Invoke-ScriptAnalyzer -Path ./archiveScript.ps1 -Settings $settings

There might be a delay the first time you execute this because the rules have to load the catalogs into a cache. Each catalog of a PowerShell platform contains details of all the modules and .NET assemblies available to PowerShell on that platform, which can be as many as 1700 commands with 15,000 parameters and 100 assemblies with 10,000 types. But once it’s loaded, further compatibility analysis will be fast. We get output like this:

RuleName                            Severity     ScriptName Line  Message
--------                            --------     ---------- ----  -------
PSUseCompatibleCommands             Warning      archiveScr 2     The parameter 'FullyQualifiedName' is not available for
                                                 ipt.ps1          command 'Import-Module' by default in PowerShell version
                                                                  '3.0' on platform 'Microsoft Windows Server 2012
                                                                  Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 12    The command 'Get-FileHash' is not available by default in
                                                 ipt.ps1          PowerShell version '3.0' on platform 'Microsoft Windows
                                                                  Server 2012 Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 18    The parameter 'LeafBase' is not available for command
                                                 ipt.ps1          'Split-Path' by default in PowerShell version
                                                                  '5.1.17763.316' on platform 'Microsoft Windows Server
                                                                  2019 Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 18    The parameter 'LeafBase' is not available for command
                                                 ipt.ps1          'Split-Path' by default in PowerShell version '3.0' on
                                                                  platform 'Microsoft Windows Server 2012 Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 19    The command 'Compress-Archive' is not available by
                                                 ipt.ps1          default in PowerShell version '3.0' on platform
                                                                  'Microsoft Windows Server 2012 Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 23    The parameter 'NoNewline' is not available for command
                                                 ipt.ps1          'Out-File' by default in PowerShell version '3.0' on
                                                                  platform 'Microsoft Windows Server 2012 Datacenter'

This is telling us that:

  • Import-Module doesn’t support -FullyQualifiedName in PowerShell 3.0;
  • Get-FileHash doesn’t exist in PowerShell 3.0;
  • Split-Path doesn’t have -LeafBase in PowerShell 5.1 or PowerShell 3.0;
  • Compress-Archive isn’t available in PowerShell 3.0, and;
  • Out-File doesn’t support -NoNewline in PowerShell 3.0

One thing you’ll notice is that the Get-FoldersToArchive function is not being warned about. This is because the compatibility rules are designed to ignore user-provided commands; a command will only be marked as incompatible if it’s present in some profile and not in one of your targets.

Again, we can change the script to fix these warnings, but before we do, I want to show you how to make this a more continuous experience; as you change your script, you want to know if the changes you make break compatibility, and that’s easy to do with the steps below.

Using a settings file for repeated invocation

The first thing we want is to make the PSScriptAnalyzer invocation more automated and reproducible. A nice step toward this is taking the settings hashtable we made and turning it into a declarative data file, separating out the “what” from the “how”.

PSScriptAnalyzer will accept a path to a PSD1 in the -Settings parameter, so all we need to do is turn our hashtable into a PSD1 file, which we’ll make ./PSScriptAnalyzerSettings.psd1. Notice we can merge the settings for both PSUseCompatibleSyntax and PSUseCompatibleCommands:

# PSScriptAnalyzerSettings.psd1
# Settings for PSScriptAnalyzer invocation.
@{
    Rules = @{
        PSUseCompatibleCommands = @{
            # Turns the rule on
            Enable = $true

            # Lists the PowerShell platforms we want to check compatibility with
            TargetProfiles = @(
                'win-8_x64_10.0.14393.0_6.1.3_x64_4.0.30319.42000_core',
                'win-8_x64_10.0.17763.0_5.1.17763.316_x64_4.0.30319.42000_framework',
                'win-8_x64_6.2.9200.0_3.0_x64_4.0.30319.42000_framework'
            )
        }
        PSUseCompatibleSyntax = @{
            # This turns the rule on (setting it to false will turn it off)
            Enable = $true

            # Simply list the targeted versions of PowerShell here
            TargetVersions = @(
                '3.0',
                '5.1',
                '6.2'
            )
        }
    }
}

Now we can run the PSScriptAnalyzer again on the script using the settings file:

Invoke-ScriptAnalyzer -Path ./archiveScript.ps1 -Settings ./PSScriptAnalyzerSettings.psd1

This gives the output:

RuleName                            Severity     ScriptName Line  Message
--------                            --------     ---------- ----  -------
PSUseCompatibleCommands             Warning      archiveScr 1     The parameter 'FullyQualifiedName' is not available for
                                                 ipt.ps1          command 'Import-Module' by default in PowerShell version
                                                                  '3.0' on platform 'Microsoft Windows Server 2012
                                                                  Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 9     The command 'Get-FileHash' is not available by default in
                                                 ipt.ps1          PowerShell version '3.0' on platform 'Microsoft Windows
                                                                  Server 2012 Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 12    The parameter 'LeafBase' is not available for command
                                                 ipt.ps1          'Split-Path' by default in PowerShell version '3.0' on
                                                                  platform 'Microsoft Windows Server 2012 Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 12    The parameter 'LeafBase' is not available for command
                                                 ipt.ps1          'Split-Path' by default in PowerShell version
                                                                  '5.1.17763.316' on platform 'Microsoft Windows Server
                                                                  2019 Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 13    The command 'Compress-Archive' is not available by
                                                 ipt.ps1          default in PowerShell version '3.0' on platform
                                                                  'Microsoft Windows Server 2012 Datacenter'
PSUseCompatibleCommands             Warning      archiveScr 16    The parameter 'NoNewline' is not available for command
                                                 ipt.ps1          'Out-File' by default in PowerShell version '3.0' on
                                                                  platform 'Microsoft Windows Server 2012 Datacenter'
PSUseCompatibleSyntax               Warning      archiveScr 6     The constructor syntax
                                                 ipt.ps1          '[System.Collections.Generic.Dictionary[string,
                                                                  string]]::new()' is not available by default in
                                                                  PowerShell versions 3,4

Now we don’t depend on any variables anymore, and have a separate specification of the analysis we want. Using this, you could, for example, run the check in a continuous integration environment to verify that changes to scripts don’t break compatibility.
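A minimal CI gate might look like this (a sketch; it assumes the script and settings file from above sit in the build’s working directory):

```powershell
# Run the analysis with the checked-in settings file
$diagnostics = Invoke-ScriptAnalyzer -Path ./archiveScript.ps1 -Settings ./PSScriptAnalyzerSettings.psd1

if ($diagnostics)
{
    # Print the findings so the CI log shows what broke compatibility
    $diagnostics | Format-Table RuleName, Line, Message -AutoSize | Out-String | Write-Host

    # A non-zero exit / terminating error fails the build
    throw "Found $(@($diagnostics).Count) compatibility issue(s) in archiveScript.ps1"
}
```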

But what we really want is to know that PowerShell scripts stay compatible as you edit them. That’s what the settings file is building to, and also where it’s easiest to make the changes you need to make your script compatible. For that, we want to integrate with the VSCode PowerShell extension.

Integrating with VSCode for on-the-fly compatibility checking

As explained at the start of this post, the VSCode PowerShell extension has built-in support for PSScriptAnalyzer. In fact, as of version 1.12.0, the PowerShell extension ships with PSScriptAnalyzer 1.18, meaning you don’t need to do anything other than create a settings file to get compatibility analysis.

We already have our settings file ready to go from the last step, so all we have to do is point the PowerShell extension to the file in the VSCode settings.

You can open the settings with Ctrl+, (use Cmd instead of Ctrl on macOS). In the Settings view, we want PowerShell > Script Analysis: Settings Path. In the settings.json view this is "powershell.scriptAnalysis.settingsPath". Entering a relative path here will find a settings file in our workspace, so we just put ./PSScriptAnalyzerSettings.psd1:

VSCode settings GUI with PSScriptAnalyzer settings path configured to "./PSScriptAnalyzerSettings.psd1"

In the settings.json view this will look like:

"powershell.scriptAnalysis.settingsPath": "./PSScriptAnalyzerSettings.psd1"

Now, opening the script in VSCode we see “green squigglies” for compatibility warnings:

VSCode window containing script, with green squigglies underneath incompatible code

In the problems pane, you’ll get a full description of all the incompatibilities:

VSCode problems pane, listing and describing identified compatibility issues

Let’s fix the syntax problem first. If you remember, PSScriptAnalyzer supplies a suggested correction to this problem. VSCode integrates with PSScriptAnalyzer’s suggested corrections and can apply them if you click on the lightbulb or with Ctrl+Space when the region is under the cursor:

VSCode suggesting New-Object instead of ::new() syntax

Applying this change, the script is now:

Import-Module -FullyQualifiedName @{ ModuleName = 'ArchiveHelper'; ModuleVersion = '1.1' }

$paths = Get-FoldersToArchive -RootPath 'C:\Documents\DocumentsToArchive'
$archivePath = '\\ArchiveServer\DocumentArchive'

$hashes = New-Object 'System.Collections.Generic.Dictionary[string,string]'
foreach ($path in $paths)
{
    $hash = (Get-FileHash -LiteralPath $path).Hash
    $hashes[$hash] = $path
    $name = Split-Path -LeafBase $path
    Compress-Archive -LiteralPath $path -DestinationPath (Join-Path $archivePath "$name-$hash.zip")
}

ConvertTo-Json $hashes | Out-File -LiteralPath (Join-Path $archivePath "catalog.json") -NoNewline

The other incompatibilities don’t have corrections; for now PSUseCompatibleCommands knows what commands are available on each platform, but not what to substitute with when a command isn’t available. So we just need to apply some PowerShell knowledge:

  • Instead of Import-Module -FullyQualifiedName @{...} we use Import-Module -Name ... -Version ...;
  • Instead of Get-FileHash, we’re going to need to use .NET directly and write a function;
  • Instead of Split-Path -LeafBase, we can use [System.IO.Path]::GetFileNameWithoutExtension();
  • Instead of Compress-Archive we’ll need to use more .NET methods in a function, and;
  • Instead of Out-File -NoNewline we can use New-Item -Value

We end up with something like this (the specific implementation is unimportant, but we have something that will work in all versions):

Import-Module -Name ArchiveHelper -Version '1.1'

function CompatibleGetFileHash
{
    param(
        [string]
        $LiteralPath
    )

    try
    {
        $hashAlg = [System.Security.Cryptography.SHA256]::Create()
        $file = [System.IO.File]::Open($LiteralPath, 'Open', 'Read')
        $file.Position = 0
        $hashBytes = $hashAlg.ComputeHash($file)
        return [System.BitConverter]::ToString($hashBytes).Replace('-', '')
    }
    finally
    {
        $file.Dispose()
        $hashAlg.Dispose()
    }
}

function CompatibleCompressArchive
{
    param(
        [string]
        $LiteralPath,

        [string]
        $DestinationPath
    )

    if ($PSVersionTable.PSVersion.Major -le 3)
    {
        # PSUseCompatibleTypes identifies that [System.IO.Compression.ZipFile]
        # isn't available by default in PowerShell 3 and we have to do this.
        # We'll cover that rule in the next blog post.
        Add-Type -AssemblyName System.IO.Compression.FileSystem -ErrorAction Ignore
    }

    [System.IO.Compression.ZipFile]::CreateFromDirectory(
        $LiteralPath,
        $DestinationPath,
        'Optimal',
        <# includeBaseDirectory #> $true)
}

$paths = Get-FoldersToArchive -RootPath 'C:\Documents\DocumentsToArchive'
$archivePath = '\\ArchiveServer\DocumentArchive'

$hashes = New-Object 'System.Collections.Generic.Dictionary[string,string]'
foreach ($path in $paths)
{
    $hash = CompatibleGetFileHash -LiteralPath $path
    $hashes[$hash] = $path
    $name = [System.IO.Path]::GetFileNameWithoutExtension($path)
    CompatibleCompressArchive -LiteralPath $path -DestinationPath (Join-Path $archivePath "$name-$hash.zip")
}

$jsonStr = ConvertTo-Json $hashes
New-Item -Path (Join-Path $archivePath "catalog.json") -Value $jsonStr

You should notice that as you type, VSCode displays new analysis of what you’re writing and the green squigglies drop away. When we’re done we get a clean bill of health for script compatibility:

VSCode window with script and problems pane, with no green squigglies and no problems

This means you’ll now be able to use this script across all the PowerShell versions you need to target. Better, you now have a configuration in your workspace so as you write more scripts, there is continual checking for compatibility. And if your compatibility targets change, all you need to do is change your configuration file in one place to point to your desired targets, at which point you’ll get analysis for your updated target platforms.

Summary

Hopefully in this blog post you got some idea of the new compatibility rules that come with PSScriptAnalyzer 1.18.

We’ve covered how to set up and use the syntax compatibility checking rule, PSUseCompatibleSyntax, and the command checking rule, PSUseCompatibleCommands, both using a hashtable configuration and a settings PSD1 file.

We’ve also looked at using the compatibility rules with the PowerShell extension for VSCode, which ships them by default as of version 1.12.0.

If you’ve got the latest release of the PowerShell extension for VSCode (1.12.1), you’ll be able to set your configuration file and instantly get compatibility checking.

In the next blog post, we’ll look at how these rules and PSUseCompatibleTypes (which checks whether .NET types and static methods are available on target platforms) can help you write scripts that work cross-platform, across Windows and Linux, using both Windows PowerShell and PowerShell Core.


Rob Holt

Software Engineer

PowerShell Team

The post Using PSScriptAnalyzer to check PowerShell version compatibility appeared first on PowerShell.

The Next Release of PowerShell – PowerShell 7


Recently, the PowerShell Team shipped the Generally Available (GA) release of PowerShell Core 6.2. Since that release, we’ve already begun work on the next iteration!

We’re calling the next release PowerShell 7, the reasons for which will be explained in this blog post.

Why 7 and not 6.3?

PowerShell Core usage has grown significantly in the last two years. In particular, the bulk of our growth has come from Linux usage, an encouraging statistic given our investment in making PowerShell viable cross-platform.

Chart of PowerShell Core usage growth by platform

However, we can also clearly see that our Windows usage has not been growing as significantly, which is surprising given that PowerShell was popularized on the Windows platform. We believe this could be because existing Windows PowerShell users have automation that is incompatible with PowerShell Core due to unsupported modules, assemblies, and APIs. These folks are unable to take advantage of PowerShell Core’s new features, increased performance, and bug fixes. To address this, we are renewing our efforts towards a full replacement of Windows PowerShell 5.1 with our next release.

This means that Windows PowerShell and PowerShell Core users will be able to use the same version of PowerShell to automate across Windows, Linux, and macOS, and PowerShell 7 users will have a very high level of compatibility with the Windows PowerShell modules they rely on today.

We’re also going to take the opportunity to simplify our references to PowerShell in documentation and product pages, dropping the “Core” in “PowerShell 7”. The PSEdition will still reflect Core, but this will only be a technical distinction in APIs and documentation where appropriate.

Note that the major version does not imply that we will be making significant breaking changes. While we took the opportunity to make some breaking changes in 6.0, many of those were compromises to ensure our compatibility on non-Windows platforms. Prior to that, Windows PowerShell historically updated its major version based on new versions of Windows rather than Semantic Versioning.

.NET Core 3.0

PowerShell Core 6.1 brought compatibility with many built-in Windows PowerShell modules, and our estimation is that PowerShell 7 can attain compatibility with 90+% of the inbox Windows PowerShell modules by leveraging changes in .NET Core 3.0 that bring back many APIs required by modules built on the .NET Framework, so that they work with the .NET Core runtime. For example, we expect Out-GridView to come back (for Windows only, though)!

A significant effort for PowerShell 7 is porting the PowerShell Core 6 code base to .NET Core 3.0 and also working with Windows partner teams to validate their modules against PowerShell 7.

Support Lifecycle Changes

Currently, PowerShell Core is under the Microsoft Modern Lifecycle Policy. This means that PowerShell Core 6 is fix-forward: we produce servicing releases for security fixes and critical bug fixes, and you must install the latest stable version within 6 months of a new minor version release.

In PowerShell 7, we will align more closely with the .NET Core support lifecycle, enabling PowerShell 7 to have both LTS (Long Term Servicing) and non-LTS releases.

We will still have monthly Preview releases to get feedback early.

When do I get PowerShell 7?

The first Preview release of PowerShell 7 will likely be in May. Be aware, however, that this depends on completing integration and validation of PowerShell with .NET Core 3.0.

Since PowerShell 7 is aligned with the .NET Core timeline, we expect the generally available (GA) release to be some time after the GA of .NET Core 3.0.

What about shipping in Windows?

We are planning on eventually shipping PowerShell 7 in Windows as a side-by-side feature with Windows PowerShell 5.1, but we still need to work out some of the details on how you will manage this inbox version of PowerShell 7.

And since the .NET Core timeline doesn’t align with the Windows timeline, we can’t say right now when it will show up in a future version of Windows 10 or Windows Server.

What other features will be in PowerShell 7?

We haven’t closed on our feature planning yet, but expect another blog post relatively soon with a roadmap of our current feature level plans for PowerShell 7.

Steve Lee
https://twitter.com/Steve_MSFT
Principal Engineering Manager
PowerShell Team

The post The Next Release of PowerShell – PowerShell 7 appeared first on PowerShell.

The PowerShell Gallery is now more Accessible


Over the past few months, the team has been working hard to make the PowerShell Gallery as accessible as possible. This blog details why it matters and what work has been done.

Why making the PowerShell Gallery more accessible was a priority

Accessible products change lives and allow everyone to be included in our product. Accessibility is also a major component of striving toward Microsoft’s mission to “Empower every person and every organization on the planet to achieve more.” Improvements in accessibility mean improvements in usability which makes the experience better for everyone. In doing accessibility testing for the Gallery, for example, we found that it was confusing for users to distinguish between “deleting” and “unlisting” packages. By clearly naming this action in the UI, it makes the process of unlisting a package more clear for all package owners.

The steps taken to make the PowerShell Gallery more accessible

The first part of the process focused on bug generation and resolution. We used scanning technology to ensure that the Gallery’s alerts and helper texts were configured properly and were compatible with screen-reading technology. We used Keros scanning, Microsoft’s premier accessibility tool, to identify accessibility issues, and worked to triage and fix the detected issues.

For the second part of the process, we undertook a scenario-focused accessibility study. For the study, blind or visually impaired IT professionals went through core scenarios for using the Gallery. These scenarios included: finding packages, publishing packages, managing packages, and getting support. The majority of the scenarios focused on searching for packages as we believe this is the primary way customers interact with the Gallery. After the study concluded we reviewed the results and watched recordings of the participants navigating through our scenarios. This process allowed us to focus on improving our lowest performing scenarios by addressing specific usability improvements. After making these improvements we underwent a review by accessibility experts to assure we had high usability and accessibility.

Usability Improvements

  • Screen Reader Compatibility: Screen reader technologies make consuming web content accessible so we underwent thorough review, and improvement, to ensure that the Gallery was providing accurate, consistent, and helpful information to screen readers. Some examples of areas we improved:
    • Accurate Headers
    • Clearly labeled tables
    • Helpful tool tips
    • Labeled graph node points
  • Improved Aria Tags: Accessible Rich Internet Applications (Aria) is a specification that makes web content more accessible by passing helpful information to assistive technologies such as screen readers. We underwent a thorough review, and enhancement, of our Aria tags to make sure they were as helpful as possible. One improvement we made, for example, was an ARIA description explaining how to use tags in the Gallery search bar.
  • Renamed UI elements to be more descriptive: Through our review we noticed we were generating some confusion by labeling the unlist button as “delete” and we worked to fix these types of issues.
  • Filters: We added filters for the operating system to make it easier to find compatible packages.
  • Results description: we made searching for packages more straightforward by displaying the total number of results and pages.
  • Page Scrolling: we made searching for packages easier by adding multi-page scrolling.

Reporting Issues

Our goal is to make the Gallery completely user friendly. If you encounter any issues in the PowerShell Gallery that make it less accessible or usable, we would love to hear about it on our GitHub page. Please file an issue letting us know what we can do to make the Gallery even more accessible.

The post The PowerShell Gallery is now more Accessible appeared first on PowerShell.

LiveFyre commenting will no longer be available on the PowerShell Gallery

This post was originally published on this site

Commenting on the PowerShell Gallery is provided by LiveFyre, a third-party comment system. LiveFyre is no longer supported by Adobe, and therefore we are unable to service issues as they arise. We have received reports of authentication failing for Twitter and Microsoft AAD, and unfortunately we are unable to bring those services back. As we cannot predict when more issues will occur, and we cannot fix issues as they arise, we must deprecate use of LiveFyre on the PowerShell Gallery. As of May 1st, 2019, LiveFyre commenting will no longer be available on the PowerShell Gallery. Unfortunately, we are unable to migrate comments off of LiveFyre, so comment history will be lost.

How will package consumers be able to get support?

The other existing channels for getting support and contacting package owners will still be available on the Gallery. The left pane of the package page is the best place to get support. If you are looking to contact the package owner, select “Contact Owners” on the package page. If you are looking to contact Gallery support use the “Report” button. If the package owner has provided a link to their project site in their module manifest a link to their site is also available in the left pane and can be a good avenue for support. For more information on getting package support please see our documentation.

Questions

We appreciate your understanding as we undergo this transition.
Please direct any questions to sysmith@microsoft.com.

The post LiveFyre commenting will no longer be available on the PowerShell Gallery appeared first on PowerShell.

PowerShell ScriptAnalyzer Version 1.18.0 Released

This post was originally published on this site

PSScriptAnalyzer (PSSA) 1.18.0 is now available on the PSGallery and brings a lot of improvements in the following areas:

  • Better compatibility analysis of commands, types and syntax across different platforms and versions of PowerShell
  • Better formatting and customization. New capabilities are:
    • Multi-line pipeline indentation styles
    • Cmdlet casing for better consistency and readability
    • Consistent whitespace inside braces and pipes
  • Custom rules can now be suppressed and preserve the RuleSuppressionID
  • Better DSC support by being able to understand different syntaxes of Import-DscResource
  • Better user experience: you can now pipe to Invoke-ScriptAnalyzer, and the returned objects support tab completion when piped to the next command in the pipeline
  • Better handling of parsing errors by emitting them as a diagnostic record with a new Severity type
  • Improved Performance: Expect it to be about twice as fast in most cases and even more when re-analyzing a file. More on this below
  • Fixes and enhancements to the engine, rules, and documentation

There are some minor breaking changes, such as requiring a minimum PowerShell Core version of 6.1, as 6.0 has reached the end of its support lifecycle. With this, it was possible to update the version of Newtonsoft.Json used to 11.0.2. On Windows PowerShell, the minimum required runtime was raised from 4.5.0 to 4.5.2, which is the lowest version still supported by Microsoft; Windows Update will have taken care of upgrading to this patched version already, therefore no disruption is expected. We have also replaced the old command data files of PowerShell 6.0 with a newer version for the UseCompatibleCmdlets rule.

Formatting

New rules were added, along with new features and customization options for existing rules. In an upcoming update of the PowerShell vscode extension, these new features will also be configurable from within the extension.

New PSUseConsistentWhitespace options

The PSUseConsistentWhitespace rule has 2 new configuration options that are both enabled by default:

  • CheckInnerBrace: Checks if there is a space after the opening brace and a space before the closing brace. E.g. if ($true) { foo } instead of if ($true) {foo}.
  • CheckPipe: Checks if a pipe is surrounded on both sides by a space. E.g. foo | bar instead of foo|bar.
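As a sketch, both options can be enabled explicitly through a settings hashtable passed to Invoke-ScriptAnalyzer (the sample script text is illustrative; requires the PSScriptAnalyzer 1.18+ module to be installed):

```powershell
$settings = @{
    Rules = @{
        PSUseConsistentWhitespace = @{
            Enable          = $true
            CheckInnerBrace = $true
            CheckPipe       = $true
        }
    }
}

# Flags the missing spaces inside the braces and around the pipe
Invoke-ScriptAnalyzer -ScriptDefinition 'if ($true) {"a"|Out-Null}' -Settings $settings
```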

In an upcoming update of the PowerShell vscode extension, this feature will be configurable in the editor via the settings powershell.WhitespaceInsideBrace and powershell.WhitespaceAroundPipe.

New PipelineIndentation option for PSUseConsistentIndentation

The PSUseConsistentIndentation rule was fixed to handle multi-line pipelines (before, the behavior was a bit ill-defined), and as part of that we decided to expose a new configuration option called PipelineIndentation with 3 settings. This allows PSSA to cater to different user tastes on whether to increase indentation after a pipeline in multi-line statements. The settings are:

  • IncreaseIndentationForFirstPipeline (default): Indent once after the first pipeline and keep this indentation. Example:
foo |
    bar |
    baz
  • IncreaseIndentationAfterEveryPipeline: Indent again after every pipeline and keep this indentation. Example:
foo |
    bar |
        baz
  • NoIndentation: Do not increase indentation. Example:
foo |
bar |
baz
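A quick sketch of applying one of these styles with Invoke-Formatter (requires PSScriptAnalyzer 1.18+; the pipeline itself is illustrative):

```powershell
$settings = @{
    Rules = @{
        PSUseConsistentIndentation = @{
            Enable              = $true
            PipelineIndentation = 'IncreaseIndentationForFirstPipeline'
        }
    }
}

# A multi-line pipeline without indentation...
$script = "Get-Process |`nWhere-Object CPU |`nSort-Object CPU"

# ...comes back indented once after the first pipe
Invoke-Formatter -ScriptDefinition $script -Settings $settings
```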

In an upcoming update of the PowerShell vscode extension, this feature will be configurable in the editor via the setting powershell.codeFormatting.

New PSUseCorrectCasing rule

By popular request, this rule can correct the casing of cmdlet names. This can correct e.g. get-azadapplicaTION to Get-AzADApplication. This not only makes code more consistent but can also improve readability in most cases. In an upcoming update of the PowerShell vscode extension, this feature will be configurable in the editor via the powershell.useCorrectCasing setting.
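For example, the rule can be used with Invoke-Formatter to fix casing in a script string. The rule key name below is an assumption; verify the exact name with Get-ScriptAnalyzerRule in your installed version:

```powershell
# Rule name assumed to be PSUseCorrectCasing — check Get-ScriptAnalyzerRule
$settings = @{
    Rules = @{
        PSUseCorrectCasing = @{ Enable = $true }
    }
}

# Returns the script text with cmdlet names corrected to their canonical casing
Invoke-Formatter -ScriptDefinition 'get-childitem | foreach-object { $_.Name }' -Settings $settings
```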

Compatibility Analysis

The UseCompatibleCmdlets rule requires JSON files in the Settings folder of PSSA’s installation, and their file names map directly to the compatibility configuration. In the new version we have replaced the JSON files for PowerShell 6.0 with files for 6.1, and also added new files for e.g. ARM on Linux (Raspbian) and for PowerShell 2.0, which is still being used by some despite being deprecated. If desired, you can always add custom JSON files to the Settings folder and they will just work, referenced by file name, without the need to re-compile. To generate a custom JSON file for your environment, you can use the New-CommandDataFile.ps1 script.
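As a sketch, targeting a platform means listing the JSON file name (without extension) in the rule’s compatibility setting. The file name and script path below are assumed examples; check the Settings folder of your PSSA installation for the exact names:

```powershell
$settings = @{
    Rules = @{
        PSUseCompatibleCmdlets = @{
            # Must match a JSON file shipped in (or added to) PSSA's Settings folder
            compatibility = @('core-6.1.0-linux')
        }
    }
}

# Path is illustrative
Invoke-ScriptAnalyzer -Path .\MyScript.ps1 -Settings $settings
```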

To further extend the analysis, 3 more rules were added:

  • PSUseCompatibleSyntax
  • PSUseCompatibleCommands
  • PSUseCompatibleTypes

These rules do not follow the definition style of the UseCompatibleCmdlets rule. For usage and examples, please refer to the rule documentation links of the 3 new rules above; there will be a more detailed blog post about them in the future.

Better DSC Support

Invoke-ScriptAnalyzer has a -SaveDscDependency switch that will download required modules from the PSGallery to allow for parsing of DSC files. In order to do that, it has to parse calls to Import-DscResource correctly. Previously it could neither take the version into account nor parse the hashtable syntax (Import-DscResource -ModuleName (@{ModuleName="SomeDscModule1";ModuleVersion="1.2.3.4"})). We added support for both. But because there can be different variations of the first one (different parameter name order, not using named parameters, etc.), please use it in the form Import-DscResource -ModuleName MyModuleName -ModuleVersion 1.2.3.4.
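A minimal sketch of the recommended form inside a DSC configuration (the module name, version, and file path are illustrative):

```powershell
Configuration WebServerConfig
{
    # Explicit named parameters let PSSA resolve the exact module version
    Import-DscResource -ModuleName xWebAdministration -ModuleVersion 2.5.0.0

    Node 'localhost'
    {
        # ...resource blocks...
    }
}

# Downloads missing DSC modules from the PSGallery before analyzing
Invoke-ScriptAnalyzer -Path .\WebServerConfig.ps1 -SaveDscDependency
```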

Custom Rules

We added the capability to suppress violations from custom rules the same way you can already suppress the built-in rules. It is worth noting, though, that the rule name of a custom rule has to be of the format CustomRuleModuleFileName\CustomRuleName; this uniquely identifies the rule, as 2 custom rule modules could emit a rule of the same name.
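A sketch of suppressing a custom rule violation, assuming a custom rule module file named MyCustomRules.psm1 exporting a rule called Measure-AvoidWidget (both names hypothetical):

```powershell
function Get-Widget
{
    # Qualified name: <custom rule module file name>\<rule name>
    [Diagnostics.CodeAnalysis.SuppressMessageAttribute('MyCustomRules\Measure-AvoidWidget', '')]
    param()

    # ...
}
```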

When a custom rule emits a DiagnosticRecord, the engine has to translate all properties of the object, as it has to be re-created when emitted via Invoke-ScriptAnalyzer. We already added translation of the SuggestedCorrections property in the last release (1.17.1) to allow for auto-correction in the editor or via the -Fix switch. However, we found that customers also want to use the RuleSuppressionID property in their custom rules, so we added translation for it as well.

Engine Improvements

PSScriptAnalyzer is highly multi-threaded, executing each rule (excluding custom or DSC rules) in parallel in its own thread. But there are some global resources, such as a CommandInfo cache, that need to be accessed using a lock. Caching and lock granularity have been improved, reducing contention, which leads to much better performance. You can expect PSScriptAnalyzer to be about twice as fast for ‘cold runs’ (where Invoke-ScriptAnalyzer has not been called before) and magnitudes faster when re-analyzing the same file. As a user, this means you will see the squigglies faster when opening a new file in VS Code and get faster updates when editing a file, while CPU consumption in the background is reduced. We have more optimizations planned in this area; you can expect further improvements of similar scale in future versions, and we hope to release future versions more frequently as well.

Miscellaneous Fixes

We received reports of some functionality not working when using Turkish culture. We made a fix for it and, as part of that, reviewed some culture-critical points to make sure they work better across all cultures. The bug was very specific to Turkish culture, therefore we are confident that PSSA should work with other cultures as well.

The Changelog has more details on the various fixes that were made to other rules.

On behalf of the Script Analyzer team,

Chris Bergmeister, Project Maintainer
Jim Truher, Senior Software Engineer, Microsoft

The post PowerShell ScriptAnalyzer Version 1.18.0 Released appeared first on PowerShell.

Invoke-Sqlcmd is Now Available Supporting Cross-Platform

This post was originally published on this site

The official SqlServer module now includes a version of the Invoke-Sqlcmd cmdlet that runs in PowerShell Core 6.2 and above. The version of the SqlServer module which contains this cmdlet is 21.1.18095-preview and is available in the PowerShell Gallery.
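To try it out, the preview module can be installed from the Gallery. This sketch assumes a PowerShellGet version that supports -AllowPrerelease; the server instance and query are illustrative:

```powershell
Install-Module -Name SqlServer -RequiredVersion 21.1.18095-preview -AllowPrerelease -Scope CurrentUser

# Hypothetical local instance — adjust -ServerInstance for your environment
Invoke-Sqlcmd -ServerInstance 'localhost' -Query 'SELECT @@VERSION AS SqlVersion'
```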

The post Invoke-Sqlcmd is Now Available Supporting Cross-Platform appeared first on PowerShell.

DSC Resource Kit Release February 2019

This post was originally published on this site

We just released the DSC Resource Kit!

This release includes updates to 14 DSC resource modules. In the past 6 weeks, 126 pull requests have been merged and 102 issues have been closed, all thanks to our amazing community!

The modules updated in this release are:

  • ActiveDirectoryCSDsc
  • CertificateDsc
  • ComputerManagementDsc
  • DFSDsc
  • NetworkingDsc
  • PSDscResources
  • SharePointDsc
  • SqlServerDsc
  • StorageDsc
  • xActiveDirectory
  • xExchange
  • xHyper-V
  • xPSDesiredStateConfiguration
  • xWebAdministration

For a detailed list of the resource modules and fixes in this release, see the Included in this Release section below.
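A sketch for pulling the updated modules from the PowerShell Gallery. This only updates modules that are already installed via Install-Module; the selection below is an illustrative subset of the list above:

```powershell
# Update each installed module from this release; modules not present locally are skipped
'ComputerManagementDsc', 'NetworkingDsc', 'SqlServerDsc', 'xPSDesiredStateConfiguration' |
    Where-Object { Get-Module -ListAvailable -Name $_ } |
    ForEach-Object { Update-Module -Name $_ -Verbose }
```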

Our latest community call for the DSC Resource Kit was last Wednesday, February 13. We were not able to record the call this time, apologies. We will fix this for the next call. You can join us for the next call at 12PM (Pacific time) on March 27 to ask questions and give feedback about your experience with the DSC Resource Kit.

The next DSC Resource Kit release will be on Wednesday, April 3.

We strongly encourage you to update to the newest version of all modules using the PowerShell Gallery, and don’t forget to give us your feedback in the comments below, on GitHub, or on Twitter (@PowerShell_Team)!

Please see our documentation here for information on the support of these resource modules.

Included in this Release

You can see a detailed summary of all changes included in this release in the table below. For past release notes, go to the README.md or CHANGELOG.md file on the GitHub repository page for a specific module (see the How to Find DSC Resource Modules on GitHub section below for details on finding the GitHub page for a specific module).

Module Name Version Release Notes
ActiveDirectoryCSDsc 3.2.0.0
  • Added “DscResourcesToExport” to manifest to improve information in PowerShell Gallery – fixes Issue 68.
  • Removed unused CAType variables and references in AdcsOnlineResponder – fixes issue 52.
  • Updated Examples to enable publishing to PowerShell Gallery – fixes issue 54.
  • Cleaned up property alignment in module manifest file.
  • Added new resource AdcsOcspExtension – see Issue 70.
    • Added new ActiveDirectoryCSDsc.CommonHelper.psm1 helper module and unit test.
    • Added stub function to /Tests/TestHelpers (ADCSStub.psm1) so Pester tests can run without having to install ADCSAdministration module.
  • Converted module to auto-documentation Wiki – fixes Issue 53.
  • Enabled Example publishing to PSGallery.
  • Moved change log to CHANGELOG.MD.
  • Opted into Common Tests “Validate Example Files To Be Published”, “Validate Markdown Links” and “Relative Path Length”.
  • Correct AppVeyor Invoke-AppveyorAfterTestTask – fixes Issue 73.
CertificateDsc 4.4.0.0
  • Minor style corrections from PR for Issue 161 that were missed.
  • Opt-in to Example publishing to PowerShell Gallery – fixes Issue 177.
  • Changed Test-CertificateAuthority to return the template name if it finds the display name of the template in the certificate -fixes Issue 147.
ComputerManagementDsc 6.2.0.0
  • WindowsEventLog:
    • Migrated the xWinEventLog from xWinEventLog and renamed to WindowsEventLog.
    • Moved strings in localization file.
    • LogMode is now set with Limit-EventLog.
    • Fixes Issue 18.
DFSDsc 4.3.0.0
  • Fixed PSSA style violation issues – fixes Issue 84.
  • Added “DscResourcesToExport” to manifest to improve information in PowerShell Gallery – fixes Issue 86.
  • Set FunctionsToExport, CmdletsToExport, VariablesToExport, AliasesToExport to empty list in manifest to meet best practice.
  • Explicitly removed extra hidden files from release package
NetworkingDsc 7.0.0.0
  • Refactored module folder structure to move resource to root folder of repository and remove test harness – fixes Issue 372.
  • Removed module conflict tests because only required for harness style modules.
  • Opted into Common Tests “Validate Example Files To Be Published”, “Validate Markdown Links” and “Relative Path Length”.
  • Added “DscResourcesToExport” to manifest to improve information in PowerShell Gallery and removed wildcards from “FunctionsToExport”, “CmdletsToExport”, “VariablesToExport” and “AliasesToExport” – fixes Issue 376.
  • MSFT_NetIPInterface:
    • Added Dhcp, WeakHostReceive and WeakHostSend parameters so that MSFT_DHCPClient, MSFT_WeakHostReceive, MSFT_WeakHostSend can be deprecated – fixes Issue 360.
  • MSFT_DhcpClient:
    • BREAKING CHANGE: Resource has been deprecated and replaced by Dhcp parameter in MSFT_NetIPInterface.
  • MSFT_WeakHostReceive:
    • BREAKING CHANGE: Resource has been deprecated and replaced by WeakHostReceive parameter in MSFT_NetIPInterface.
  • MSFT_WeakHostSend:
    • BREAKING CHANGE: Resource has been deprecated and replaced by WeakHostSend parameter in MSFT_NetIPInterface.
  • MSFT_IPAddress:
    • Updated examples to use NetIPInterface.
  • MSFT_NetAdapterName:
    • Updated examples to use NetIPInterface.
  • MSFT_DnsServerAddress:
    • Updated examples to use NetIPInterface.
  • MSFT_NetworkTeam:
    • Change Get-TargetResource to return actual TeamMembers if network team exists and “Ensure” returns “Present” even when actual TeamMembers do not match “TeamMembers” parameter – fixes Issue 342.
  • Updated examples to format required for publishing to PowerShell Gallery – fixes Issue 374.
  • MSFT_NetAdapterAdvancedProperty:
  • Fixes NetworkAdapterName being returned in Name property when calling Get-TargetResource – fixes Issue 370.
PSDscResources 2.10.0.0
  • Fixed CompanyName typo – Fixes Issue 100
  • Update LICENSE file to match the Microsoft Open Source Team standard – Fixes Issue 120.
  • Update CommonResourceHelper unit tests to meet Pester 4.0.0 standards (issue 129).
  • Update ResourceHelper unit tests to meet Pester 4.0.0 standards (issue 129).
  • Ported fixes from xPSDesiredStateConfiguration:
    • xArchive
      • Fix end-to-end tests.
      • Update integration tests to meet Pester 4.0.0 standards.
      • Update end-to-end tests to meet Pester 4.0.0 standards.
      • Update unit and integration tests to meet Pester 4.0.0 standards.
      • Wrapped all path and identifier strings in verbose messages with quotes to make it easier to identify the limit of the string when debugging.
      • Refactored date/time checksum code to improve testability and ensure tests can run on machines with localized datetime formats that are not US.
      • Fix “Get-ArchiveEntryLastWriteTime” to return [datetime].
      • Improved verbose logging to make debugging path issues easier.
  • Added .gitattributes file to ensure CRLF settings are configured correctly for the repository.
  • Updated “.vscodesettings.json” to refer to AnalyzerSettings.psd1 so that custom syntax problems are highlighted in Visual Studio Code.
  • Fixed style guideline violations in CommonResourceHelper.psm1.
  • Updated “appveyor.yml” to meet more recent standards.
  • Removed OS image version from “appveyor.yml” to use default image (Issue 127).
  • Removed code to install WMF5.1 from “appveyor.yml” because it is already installed in AppVeyor images (Issue 128).
  • Removed .vscode from .gitignore so that Visual Studio code environment settings can be committed.
  • Environment
    • Update tests to meet Pester 4.0.0 standards (issue 129).
  • Group
    • Update tests to meet Pester 4.0.0 standards (issue 129).
    • Fix unit tests to run on Nano Server.
    • Refactored unit tests to include Context fixtures and change functions to Describe fixtures.
  • GroupSet
    • Update tests to meet Pester 4.0.0 standards (issue 129).
SharePointDsc 3.2.0.0
  • Changes to SharePointDsc unit testing
    • Implemented Strict Mode version 1 for all code run during unit tests.
    • Changed InstallAccount into PSDscRunAsCredential parameter in examples
  • SPAuthenticationRealm
    • New resource for setting farm authentication realm
  • SPConfigWizard
    • Updated PSConfig parameters according to recommendations in Stefan Gossner’s blog post
  • SPDistributedCacheService
    • Fixed exception on Stop-SPServiceInstance with SharePoint 2019
  • SPFarm
    • Improved logging
    • Added ability to manage the Developer Dashboard settings
  • SPFarmSolution
    • Fixed issue where uninstalling a solution would not work as expected if it contained web application resources.
  • SPIncomingEmailSettings
    • New resource for configuring incoming email settings
  • SPInstallPrereqs
    • Improved logging
    • Corrected detection for Windows Server 2019
    • Corrected support for Windows Server 2019 for SharePoint 2016
  • SPProductUpgrade
    • Fixed issue where upgrading SP2013 would not properly detect the installed version
    • Fixed issue where the localized SharePoint 2019 CU was detected as a Service Pack for a Language Pack
  • SPSearchAuthorativePage
    • Fixed issue where modifying search query would not target the correct search application
  • SPSearchResultSource
    • Updated resource to allow localized ProviderTypes
  • SPServiceAppSecurity
    • Updated resource to allow localized permission levels
  • SPServiceInstance
    • Added -All switch to resolve ‘Unable to locate service application’ in SP2013
  • SPSite
    • Improved logging
  • SPUserProfileProperty
    • Fixed issue where user profile property mappings did not work
  • SPUserProfileServiceApp
    • Added warning message when MySiteHostLocation is not specified. This is currently not required, which results in an error. Will be corrected in SPDsc v4.0 (is a breaking change).
  • SPUserProfileSyncConnection
    • Fixed issue where test resource never would return true for any configurations on SharePoint 2016/2019
    • Fixed issue where updating existing connection never would work for any configurations on SharePoint 2016/2019
    • Updated documentation to reflect that Force will not impact configurations for SharePoint 2016/2019. Updated the test method accordingly.
  • SPUserProfileSyncService
    • Fixed issue where failure to configure the sync service would not throw error
  • SPWebAppPeoplePickerSettings
    • Converted password for access account to secure string. Previously the resource would fail setting the password, and an exception was thrown that printed the password in clear text.
  • SPWebAppPolicy
    • Fixed issue where parameter MembersToExclude did not work as expected
  • SPWorkflowService
    • Added support for specifying scope name.
    • Added support for detecting incorrect configuration for scope name and WorkflowHostUri
SqlServerDsc 12.3.0.0
  • Changes to SqlServerDsc
    • Reverted the change that was made as part of issue 1260 in the previous release, as it only mitigated the issue; it did not solve it.
    • Removed the container testing since that broke the integration tests, possibly due to using an excessive amount of memory on the AppVeyor build worker. This will make the unit tests take a bit longer to run (issue 1260).
    • The unit tests and the integration tests are now run in two separate build workers in AppVeyor. One build worker runs the integration tests, while a second build worker runs the unit tests. The build workers runs in parallel on paid accounts, but sequentially on free accounts (issue 1260).
    • Cleaned up error handling in some of the integration tests that was part of a workaround for a bug in Pester. The bug is resolved, so the error handling is no longer needed.
    • Speeding up the AppVeyor tests by splitting the common tests in a separate build job.
    • Updated the appveyor.yml to have the correct build step, and also to correctly run the build step only in one of the jobs.
    • Update integration tests to use the new integration test template.
    • Added SqlAgentOperator resource.
  • Changes to SqlServiceAccount
    • Fixed Get-ServiceObject when searching for the Integration Services service. Unlike the rest of the SQL Server services, the Integration Services service cannot be instanced; however, you can have multiple versions installed. Get-ServiceObject would return the service name you are looking for, but with the version number appended at the end. Added parameter VersionNumber so the search returns the correct service name.
    • Added code to allow for using Managed Service Accounts.
    • Now the correct service type string value is returned by the function Get-TargetResource. Previously one value was passed in as a parameter (e.g. DatabaseEngine), but a different string value was returned (e.g. SqlServer). Now Get-TargetResource returns the same values that can be passed as values in the parameter ServiceType (issue 981).
  • Changes to SqlServerLogin
    • Fixed issue in Test-TargetResource to validate password on disabled accounts (issue 915).
    • Now when adding a login of type SqlLogin, and the SQL Server login mode is set to "Integrated", an error is correctly thrown (issue 1179).
  • Changes to SqlSetup
    • Updated the integration test to stop the named instance while installing the other instances to mitigate issue 1260.
    • Add parameters to configure the Tempdb files during the installation of the instance. The new parameters are SqlTempdbFileCount, SqlTempdbFileSize, SqlTempdbFileGrowth, SqlTempdbLogFileSize and SqlTempdbLogFileGrowth (issue 1167).
  • Changes to SqlServerEndpoint
StorageDsc 4.5.0.0
  • Opt-in to Example publishing to PowerShell Gallery – fixes Issue 186.
  • DiskAccessPath:
    • Updated the resource to not assign a drive letter by default when adding a disk access path, by adding a Set-Partition -NoDefaultDriveLetter $NoDefaultDriveLetter block that defaults to true. When adding access paths, the disks will no longer have drive letters automatically assigned on next reboot, which is the desired behavior – fixes Issue 145.
xActiveDirectory 2.24.0.0
  • Added parameter to xADDomainController to support InstallationMediaPath (issue 108).
  • Updated xADDomainController schema to be standard and provide Descriptions.
xExchange 1.27.0.0
  • Added additional parameters to the MSFT_xExchTransportService resource
  • Added additional parameters to the MSFT_xExchEcpVirtualDirectory resource
  • Added additional unit tests to the MSFT_xExchAutodiscoverVirtualDirectory resource
  • Added additional parameters to the MSFT_xExchExchangeCertificate resource
  • MSFT_xExchMailboxDatabase: Fixes issue with DataMoveReplicationConstraint parameter (401)
  • Added additional parameters and comment based help to the MSFT_xExchMailboxDatabase resource
  • Move code that sets $global:DSCMachineStatus into a dedicated helper function. Issue 407
  • Add missing parameters for xExchMailboxDatabaseCopy, adds comment based help, and adds remaining Unit tests.
xHyper-V 3.16.0.0
  • MSFT_xVMHyperV:
    • Moved localization string data to own file.
    • Fixed code styling issues.
    • Fixed bug where StartupMemory was not evaluated in Test-TargetResource.
    • Redo of abandoned PRs:
    • Fixed issue where Get throws an error when NetworkAdapters are not attached or are missing properties.
xPSDesiredStateConfiguration 8.5.0.0
  • Pull server module publishing
    • Removed forced verbose logging from CreateZipFromSource, Publish-DSCModulesAndMof and Publish-MOFToPullServer as it polluted the console
  • Corrected GitHub Pull Request template to remove referral to BestPractices.MD which has been combined into StyleGuidelines.md (issue 520).
  • xWindowsOptionalFeature
    • Suppress useless verbose output from Import-Module cmdlet. (issue 453).
  • Changes to xRemoteFile
    • Corrected a resource name in the example xRemoteFile_DownloadFileConfig.ps1
  • Fix MSFT_xDSCWebService to find Microsoft.Powershell.DesiredStateConfiguration.Service.Resources.dll when server is configured with pt-BR Locales (issue 284).
  • Changes to xDSCWebService
    • Fixed an issue which prevented the removal of the IIS Application Pool created during deployment of an DSC Pull Server instance. (issue 464)
    • Fixed an issue where a Pull Server cannot be deployed on a machine when IIS Express is installed alongside a full-blown IIS (issue 191)
  • Update CommonResourceHelper unit tests to meet Pester 4.0.0 standards (issue 473).
  • Update ResourceHelper unit tests to meet Pester 4.0.0 standards (issue 473).
  • Update MSFT_xDSCWebService unit tests to meet Pester 4.0.0 standards (issue 473).
  • Update MSFT_xDSCWebService integration tests to meet Pester 4.0.0 standards (issue 473).
  • Refactored MSFT_xDSCWebService integration tests to meet current standards and to use Pester TestDrive.
  • xArchive
    • Fix end-to-end tests (issue 457).
    • Update integration tests to meet Pester 4.0.0 standards.
    • Update end-to-end tests to meet Pester 4.0.0 standards.
    • Update unit and integration tests to meet Pester 4.0.0 standards.
    • Wrapped all path and identifier strings in verbose messages with quotes to make it easier to identify the limit of the string when debugging.
    • Refactored date/time checksum code to improve testability and ensure tests can run on machines with localized datetime formats that are not US.
    • Fix “Get-ArchiveEntryLastWriteTime” to return [datetime] (issue 471).
    • Improved verbose logging to make debugging path issues easier.
    • Added handling for “/” as a path separator by backporting code from PSDscResources – (issue 469).
    • Copied unit tests from PSDscResources.
    • Added .gitattributes file and removed git configuration from AppVeyor to ensure CRLF settings are configured correctly for the repository.
  • Updated “.vscodesettings.json” to refer to AnalyzerSettings.psd1 so that custom syntax problems are highlighted in Visual Studio Code.
  • Fixed style guideline violations in CommonResourceHelper.psm1.
  • Changes to xService
    • Fixes issue where Get-TargetResource or Test-TargetResource will throw an exception if the target service is configured with a non-existent dependency.
    • Refactored Get-TargetResource Unit tests.
  • Changes to xPackage
    • Fixes an issue where incorrect verbose output was displayed if product found. (issue 446)
  • Fixes files which were getting triggered for re-encoding after a recent pull request (possibly 472).
  • Moves version and change history from README.MD to new file, CHANGELOG.MD.
  • Fixes markdown issues in README.MD and HighQualityResourceModulePlan.md.
  • Opted in to “Common Tests – Validate Markdown Files”
  • Changes to xPSDesiredStateConfiguration
    • In AppVeyor CI the tests are split into three separate jobs, and also run tests on two different build worker images (Windows Server 2012R2 and Windows Server 2016). The common tests are only run on the Windows Server 2016 build worker image. Helps with issue 477.
  • xGroup
    • Corrected style guideline violations. (issue 485)
  • xWindowsProcess
    • Corrected style guideline violations. (issue 496)
  • Changes to PSWSIISEndpoint.psm1
    • Fixes most PSScriptAnalyzer issues.
  • Changes to xRegistry
    • Fixed an issue that fails to remove reg key when the Key is specified as common registry path. (issue 444)
  • Changes to xService
    • Added support for Group Managed Service Accounts
  • Adds new Integration tests for MSFT_xDSCWebService and removes old Integration test file, MSFT_xDSCWebService.xxx.ps1.
  • xRegistry
    • Corrected style guideline violations. (issue 489)
  • Fix script analyzer issues in UseSecurityBestPractices.psm1. issue 483
  • Fixes script analyzer issues in xEnvironmentResource. issue 484
  • Fixes script analyzer issues in MSFT_xMsiPackage.psm1. issue 486
  • Fixes script analyzer issues in MSFT_xPackageResource.psm1. issue 487
  • Adds spaces between variable types and variables, and changes Type Accelerators to Fully Qualified Type Names on affected code.
  • Fixes script analyzer issues in MSFT_xPSSessionConfiguration.psm1 and convert Type Accelerators to Fully Qualified Type Names issue 488.
  • Adds spaces between array members.
  • Fixes script analyzer issues in MSFT_xRemoteFile.psm1 and correct general style violations. (issue 490)
  • Remove unnecessary whitespace from line endings.
  • Add statement to README.md regarding the lack of testing of this module with PowerShell 4 issue 522.
  • Fixes script analyzer issues in MSFT_xWindowsOptionalFeature.psm1 and correct general style violations. (issue 494)
  • Fixes script analyzer issues in MSFT_xRemoteFile.psm1 missed from issue 490.
  • Fix script analyzer issues in MSFT_xWindowsFeature.psm1. issue 493
  • Fix script analyzer issues in MSFT_xUserResource.psm1. issue 492
  • Moves calls to set $global:DSCMachineStatus = 1 into a helper function to reduce the number of locations where we need to suppress PSScriptAnalyzer rules PSAvoidGlobalVars and PSUseDeclaredVarsMoreThanAssignments.
  • Adds spaces between comment hashtags and comments.
  • Fixes script analyzer issues in MSFT_xServiceResource.psm1. issue 491
  • Fixes script analyzer issues in MSFT_xWindowsPackageCab.psm1. issue 495
  • xFileUpload:
    • Fixes script analyzer issues in xFileUpload.schema.psm1. issue 497
    • Update to meet style guidelines.
    • Added Integration tests.
    • Updated manifest Author, Company and Copyright to match standards.
  • Updated module manifest Copyright to match standards and remove year.
  • Auto-formatted the module manifest to improve layout.
  • Fix Run-On Words in README.md.
  • Changes to xPackage
    • Fixes a misnamed variable that caused an error during error message output. (issue 449)
  • Fixes script analyzer issues in MSFT_xPSSessionConfiguration.psm1. issue 566
  • Fixes script analyzer issues in xGroupSet.schema.psm1. issue 498
  • Fixes script analyzer issues in xProcessSet.schema.psm1. issue 499
  • Fixes script analyzer issues in xServiceSet.schema.psm1. issue 500
  • Fixes script analyzer issues in xWindowsFeatureSet.schema.psm1. issue 501
  • Fixes script analyzer issues in xWindowsOptionalFeatureSet.schema.psm1 issue 502
  • Updates Should statements in Pester tests to use dashes before parameters.
  • Added a CODE_OF_CONDUCT.md with the same content as in the README.md issue 562
  • Replaces Type Accelerators with fully qualified type names.
xWebAdministration 2.5.0.0
  • Added SiteId to xWebSite to address [396]
  • xWebSite: Full path is used to get list of default documents
  • xIISLogging: Added support for LogTargetW3C
  • xWebsite: Added support for LogTargetW3C

How to Find Released DSC Resource Modules

To see a list of all released DSC Resource Kit modules, go to the PowerShell Gallery and display all modules tagged as DSCResourceKit. You can also enter a module’s name in the search box in the upper right corner of the PowerShell Gallery to find a specific module.

Of course, you can also always use PowerShellGet (available starting in WMF 5.0) to find modules with DSC Resources:

# To list all modules tagged as DSCResourceKit
Find-Module -Tag DSCResourceKit
# To list all DSC resources from all sources
Find-DscResource

Please note that only modules released by the PowerShell Team are currently considered part of the ‘DSC Resource Kit’, regardless of the presence of the ‘DSCResourceKit’ tag in the PowerShell Gallery.

To find a specific module, go directly to its URL on the PowerShell Gallery:
http://www.powershellgallery.com/packages/< module name >
For example:
http://www.powershellgallery.com/packages/xWebAdministration

How to Install DSC Resource Modules From the PowerShell Gallery

We recommend that you use PowerShellGet to install DSC resource modules:

Install-Module -Name &lt; module name &gt;

For example:

Install-Module -Name xWebAdministration

To update all previously installed modules at once, open an elevated PowerShell prompt and use this command:

Update-Module

After installing modules, you can discover all DSC resources available to your local system with this command:

Get-DscResource

How to Find DSC Resource Modules on GitHub

All resource modules in the DSC Resource Kit are available open-source on GitHub.
You can see the most recent state of a resource module by visiting its GitHub page at:
https://github.com/PowerShell/< module name >
For example, for the CertificateDsc module, go to:
https://github.com/PowerShell/CertificateDsc.

All DSC modules are also listed as submodules of the DscResources repository in the DscResources folder and the xDscResources folder.

How to Contribute

You are more than welcome to contribute to the development of the DSC Resource Kit! There are several different ways you can help. You can create new DSC resources or modules, add test automation, improve documentation, fix existing issues, or open new ones.
See our contributing guide for more info on how to become a DSC Resource Kit contributor.

If you would like to help, please take a look at the list of open issues for the DscResources repository.
You can also check issues for specific resource modules by going to:
https://github.com/PowerShell/< module name >/issues
For example:
https://github.com/PowerShell/xPSDesiredStateConfiguration/issues

Your help in developing the DSC Resource Kit is invaluable to us!

Questions, comments?

If you’re looking into using PowerShell DSC, have questions or issues with a current resource, or would like a new resource, let us know in the comments below, on Twitter (@PowerShell_Team), or by creating an issue on GitHub.

Katie Kragenbrink
Software Engineer
PowerShell DSC Team
@katiedsc (Twitter)
@kwirkykat (GitHub)

The post DSC Resource Kit Release February 2019 appeared first on PowerShell.

Parsing Text with PowerShell (3/3)

This post was originally published on this site

This is the third and final post in a three-part series.

  • Part 1:
    • Useful methods on the String class
    • Introduction to Regular Expressions
    • The Select-String cmdlet
  • Part 2:
    • the -split operator
    • the -match operator
    • the switch statement
    • the Regex class
  • Part 3:
    • a real world, complete and slightly bigger, example of a switch-based parser
      • General structure of a switch-based parser
      • The real world example

In the previous posts, we looked at the different operators that are available to us in PowerShell.

When analyzing crashes at DICE, I noticed that some of the C++ runtime binaries were missing debug symbols. They should be available for download from Microsoft’s public symbol server, and most versions were there. However, due to some process errors at DevDiv, some builds were released publicly without available debug symbols.
In some cases, those missing symbols prevented us from debugging those crashes, and in all cases, they triggered my developer OCD.

So, to give actionable feedback to Microsoft, I scripted a debugger (cdb.exe in this case) to give a verbose list of the loaded modules, and parsed the output with PowerShell, which was also later used to group and filter the resulting data set. I sent this data to Microsoft, and 5 days later, the missing symbols were available for download. Mission accomplished!

This post will describe the parser I wrote for this task (it turned out that I had good use for it for other tasks later), and the general structure is applicable to most parsing tasks.

The example will show how a switch-based parser would look when the input data isn’t as tidy as it normally is in examples, but messy – as the real world data often is.

General Structure of a switch Based Parser

Depending on the structure of our input, the code must be organized in slightly different ways.

Input may have a record start that differs by indentation or some distinct token like

Foo                    <- Record start - No whitespace at the beginning of the line
    Prop1=Staffan      <- Properties for the record - starts with whitespace
    Prop3 =ValueN
Bar
    Prop1=Steve
    Prop2=ValueBar2

If the data to be parsed has an explicit start record, it is a bit easier than if it doesn’t have one.
We create a new data object when we get a record start, after writing any previously created object to the pipeline.
At the end, we need to check if we have parsed a record that hasn’t been written to the pipeline.

The general structure of such a switch-based parser can be as follows:

$inputData = @"
Foo
    Prop1=Value1
    Prop3=Value3
Bar
    Prop1=ValueBar1
    Prop2=ValueBar2
"@ -split '\r?\n'   # This regex is useful to split at line endings, with or without carriage return

class SomeDataClass {
    $ID
    $Name
    $Property2
    $Property3
}

# map to project input property names to the properties on our data class
$propertyNameMap = @{
    Prop1 = "Name"
    Prop2 = "Property2"
    Prop3 = "Property3"
}

$currentObject = $null
switch -regex ($inputData) {

    '^(\S.*)' {
        # record start pattern, in this case line that doesn't start with a whitespace.
        if ($null -ne $currentObject) {
            $currentObject                   # output to pipeline if we have a previous data object
        }
        $currentObject = [SomeDataClass] @{  # create new object for this record
            Id = $matches.1                  # with Id like Foo or Bar
        }
        continue
    }

    # set the properties on the data object
    '^\s+([^=]+)=(.*)' {
        $name, $value = $matches[1, 2]
        # project property names
        $propName = $propertyNameMap[$name]
        if ($null -eq $propName) {
            $propName = $name
        }
        # assign the parsed value to the projected property name
        $currentObject.$propName = $value
        continue
    }
}

if ($currentObject) {
    # Handle the last object if any
    $currentObject # output to pipeline
}

ID  Name      Property2 Property3
--  ----      --------- ---------
Foo Value1              Value3
Bar ValueBar1 ValueBar2

Alternatively, we may have input where the records are separated by a blank line, but without any obvious record start.

commitId=1234                         <- In this case, a commitId is first in a record
description=Update readme.md
                                      <- the blank line separates records
user=Staffan                          <- For this record, a user property comes first
commitId=1235
description=Fix bug.md

In this case the structure of the code looks a bit different. We create an object at the beginning, but keep track of whether or not it is dirty.
If we get to the end with a dirty object, we must output it.

$inputData = @"

commit=1234
desc=Update readme.md

user=Staffan
commit=1235
desc=Bug fix

"@ -split "\r?\n"

class SomeDataClass {
    [int] $CommitId
    [string] $Description
    [string] $User
}

# map to project input property names to the properties on our data class
# we only need to provide the ones that are different. 'User' works fine as it is.
$propertyNameMap = @{
    commit = "CommitId"
    desc   = "Description"
}

$currentObject = [SomeDataClass]::new()
$objectDirty = $false
switch -regex ($inputData) {
    # set the properties on the data object
    '^([^=]+)=(.*)$' {
        # parse a name/value
        $name, $value = $matches[1, 2]
        # project property names
        $propName = $propertyNameMap[$name]
        if ($null -eq $propName) {
            $propName = $name
        }
        # assign the projected property
        $currentObject.$propName = $value
        $objectDirty = $true
        continue
    }

    '^\s*$' {
        # separator pattern, in this case any blank line
        if ($objectDirty) {
            $currentObject                           # output to pipeline
            $currentObject = [SomeDataClass]::new()  # create new object
            $objectDirty = $false                    # and mark it as not dirty
        }
    }
    default {
        Write-Warning "Unexpected input: '$_'"
    }
}

if ($objectDirty) {
    # Handle the last object if any
    $currentObject # output to pipeline
}

CommitId Description      User
-------- -----------      ----
    1234 Update readme.md
    1235 Bug fix          Staffan

The Real World Example

I have adapted this sample slightly so that I get the loaded modules from a running process instead of from my crash dumps. The format of the output from the debugger is the same.
The following command launches a command line debugger on notepad, with a script that gives a verbose listing of the loaded modules, and quits:

# we need to muck around with the console output encoding to handle the trademark chars
# imagine no encodings
# it's easy if you try
# no code pages below us
# above us only sky
[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("iso-8859-1")

$proc = Start-Process notepad -PassThru
Start-Sleep -Seconds 1
$cdbOutput = cdb -y 'srv*c:\symbols*http://msdl.microsoft.com/download/symbols' -c ".reload -f;lmv;q" -p $proc.Id

The output of the command above is here for those who want to follow along but who aren’t running Windows or don’t have cdb.exe installed.

The (abbreviated) output looks like this:

Microsoft (R) Windows Debugger Version 10.0.16299.15 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

*** wait with pending attach

************* Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
ModLoad: 00007ff6`e9da0000 00007ff6`e9de3000   C:\Windows\system32\notepad.exe
...
ModLoad: 00007ffe`97d80000 00007ffe`97db1000   C:\WINDOWS\SYSTEM32\ntmarta.dll
(98bc.40a0): Break instruction exception - code 80000003 (first chance)
ntdll!DbgBreakPoint:
00007ffe`9cd53050 cc              int     3
0:007> cdb: Reading initial command '.reload -f;lmv;q'
Reloading current modules
.....................................................
start             end                 module name
00007ff6`e9da0000 00007ff6`e9de3000   notepad    (pdb symbols)          c:\symbols\notepad.pdb\2352C62CDF448257FDBDDA4081A8F9081\notepad.pdb
    Loaded symbol image file: C:\Windows\system32\notepad.exe
    Image path: C:\Windows\system32\notepad.exe
    Image name: notepad.exe
    Image was built with /Brepro flag.
    Timestamp:        329A7791 (This is a reproducible build file hash, not a timestamp)
    CheckSum:         0004D15F
    ImageSize:        00043000
    File version:     10.0.17763.1
    Product version:  10.0.17763.1
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        1.0 App
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     Notepad
    OriginalFilename: NOTEPAD.EXE
    ProductVersion:   10.0.17763.1
    FileVersion:      10.0.17763.1 (WinBuild.160101.0800)
    FileDescription:  Notepad
    LegalCopyright:   © Microsoft Corporation. All rights reserved.
...
00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:\symbols\ntdll.pdb\B8AD79538F2730FD9BACE36C9F9316A01\ntdll.pdb
    Loaded symbol image file: C:\WINDOWS\SYSTEM32\ntdll.dll
    Image path: C:\WINDOWS\SYSTEM32\ntdll.dll
    Image name: ntdll.dll
    Image was built with /Brepro flag.
    Timestamp:        E8B54827 (This is a reproducible build file hash, not a timestamp)
    CheckSum:         001F20D1
    ImageSize:        001ED000
    File version:     10.0.17763.194
    Product version:  10.0.17763.194
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     ntdll.dll
    OriginalFilename: ntdll.dll
    ProductVersion:   10.0.17763.194
    FileVersion:      10.0.17763.194 (WinBuild.160101.0800)
    FileDescription:  NT Layer DLL
    LegalCopyright:   © Microsoft Corporation. All rights reserved.
quit:

The output starts with info that I’m not interested in here. I only want to get the detailed information about the loaded modules. It is not until the line

start             end                 module name

that I care about the output.

Also, at the end there is a line that we need to be aware of:

quit:

that is not part of the module output.

To skip the parts of the debugger output that we don’t care about, we have a boolean flag initially set to true.
If that flag is set, we check if the current line, $_, is the module header in which case we flip the flag.

$inPreamble = $true
switch -regex ($cdbOutput) {

    { $inPreamble -and $_ -eq "start             end                 module name" } { $inPreamble = $false; continue }

I have made the parser a separate function that reads its input from the pipeline. This way, I can use the same function to parse module data, regardless of how I got the module data. Maybe it was saved to a file. Or came from a dump, or a live process. It doesn’t matter, since the parser is decoupled from the data retrieval.
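To see that decoupling in miniature, here is a self-contained sketch of my own (a made-up name=value mini-format, not the debugger output): a parser that reads from the pipeline works identically whether the lines come from a file, a command, or a literal array.

```powershell
# A tiny pipeline-based parser for 'name=value' lines (illustrative mini-format).
# Because it reads from the pipeline, the data source is interchangeable.
function ConvertTo-Setting {
    param (
        [Parameter(ValueFromPipeline)]
        [string] $Line
    )
    process {
        if ($Line -match '^([^=]+)=(.*)$') {
            [pscustomobject] @{
                Name  = $matches[1]
                Value = $matches[2]
            }
        }
    }
}

# fed from an in-memory array here; Get-Content settings.txt | ConvertTo-Setting works the same way
$settings = 'user=Staffan', 'commit=1234' | ConvertTo-Setting
```

Swapping the literal array for `Get-Content` requires no change to the function itself.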

After the sample, there is a breakdown of the more complicated regular expressions used, so don’t despair if you don’t understand them at first.
Regular Expressions are notoriously hard to read, so much so that they make Perl look readable in comparison.

# define an class to store the data
class ExecutableModule {
    [string]   $Name
    [string]   $Start
    [string]   $End
    [string]   $SymbolStatus
    [string]   $PdbPath
    [bool]     $Reproducible
    [string]   $ImagePath
    [string]   $ImageName
    [DateTime] $TimeStamp
    [uint32]   $FileHash
    [uint32]   $CheckSum
    [uint32]   $ImageSize
    [version]  $FileVersion
    [version]  $ProductVersion
    [string]   $FileFlags
    [string]   $FileOS
    [string]   $FileType
    [string]   $FileDate
    [string[]] $Translations
    [string]   $CompanyName
    [string]   $ProductName
    [string]   $InternalName
    [string]   $OriginalFilename
    [string]   $ProductVersionStr
    [string]   $FileVersionStr
    [string]   $FileDescription
    [string]   $LegalCopyright
    [string]   $LegalTrademarks
    [string]   $LoadedImageFile
    [string]   $PrivateBuild
    [string]   $Comments
}

<#
.SYNOPSIS Runs a debugger on a program to dump its loaded modules
#>
function Get-ExecutableModuleRawData {
    param ([string] $Program)
    $consoleEncoding = [Console]::OutputEncoding
    [Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("iso-8859-1")
    try {
        $proc = Start-Process $program -PassThru
        Start-Sleep -Seconds 1  # sleep for a while so modules are loaded
        cdb -y 'srv*c:\symbols*http://msdl.microsoft.com/download/symbols' -c ".reload -f;lmv;q" -p $proc.Id
        $proc.Close()
    }
    finally {
        [Console]::OutputEncoding = $consoleEncoding
    }
}

<#
.SYNOPSIS Converts verbose module data from windows debuggers into ExecutableModule objects.
#>
function ConvertTo-ExecutableModule {
    [OutputType([ExecutableModule])]
    param (
        [Parameter(ValueFromPipeline)]
        [string[]] $ModuleRawData
    )
    begin {
        $currentObject = $null
        $preamble = $true
        $propertyNameMap = @{
            'File flags'      = 'FileFlags'
            'File OS'         = 'FileOS'
            'File type'       = 'FileType'
            'File date'       = 'FileDate'
            'File version'    = 'FileVersion'
            'Product version' = 'ProductVersion'
            'Image path'      = 'ImagePath'
            'Image name'      = 'ImageName'
            'FileVersion'     = 'FileVersionStr'
            'ProductVersion'  = 'ProductVersionStr'
        }
    }
    process {
        switch -regex ($ModuleRawData) {

            # skip lines until we get to our sentinel line
            { $preamble -and $_ -eq "start             end                 module name" } { $preamble = $false; continue }

            #00007ff6`e9da0000 00007ff6`e9de3000   notepad    (deferred)
            #00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:\symbols\ntdll.pdb\B8AD79538F2730FD9BACE36C9F9316A01\ntdll.pdb
            '^([0-9a-f`]{17})\s([0-9a-f`]{17})\s+(\S+)\s+\(([^)]+)\)\s*(.+)?' {
                # see breakdown of the expression later in the post
                # on record start, output the currentObject, if any is set
                if ($null -ne $currentObject) {
                    $currentObject
                }
                $start, $end, $module, $pdbKind, $pdbPath = $matches[1..5]
                # create an instance of the object that we are adding info from the current record into.
                $currentObject = [ExecutableModule] @{
                    Start        = $start
                    End          = $end
                    Name         = $module
                    SymbolStatus = $pdbKind
                    PdbPath      = $pdbPath
                }
                continue
            }
            '^\s+Image was built with /Brepro flag\.' {
                $currentObject.Reproducible = $true
                continue
            }
            '^\s+Timestamp:\s+[^(]+\((?<timestamp>.{8})\)' {
                # see breakdown of the regular  expression later in the post
                # Timestamp:        Mon Jan  7 23:42:30 2019 (5C33D5D6)
                $intValue = [Convert]::ToInt32($matches.timestamp, 16)
                $currentObject.TimeStamp = [DateTime]::new(1970, 01, 01, 0, 0, 0, [DateTimeKind]::Utc).AddSeconds($intValue)
                continue
            }
            '^\s+Timestamp:\s+(?<value>.{8}) \(This' {
                # Timestamp:        E78937AC (This is a reproducible build file hash, not a timestamp)
                $currentObject.FileHash = [Convert]::ToUInt32($matches.value, 16)
                continue
            }
            '^\s+Loaded symbol image file: (?<imageFile>[^)]+)' {
                $currentObject.LoadedImageFile = $matches.imageFile
                continue
            }
            '^\s+CheckSum:\s+(?<checksum>\S+)' {
                $currentObject.Checksum = [Convert]::ToUInt32($matches.checksum, 16)
                continue
            }
            '^\s+Translations:\s+(?<value>\S+)' {
                $currentObject.Translations = $matches.value.Split(".")
                continue
            }
            '^\s+ImageSize:\s+(?<imageSize>.{8})' {
                $currentObject.ImageSize = [Convert]::ToUInt32($matches.imageSize, 16)
                continue
            }
            '^\s{4}(?<name>[^:]+):\s+(?<value>.+)' {
                # see breakdown of the regular expression later in the post
                # This part is any 'name: value' pattern
                $name, $value = $matches['name', 'value']

                # project the property name
                $propName = $propertyNameMap[$name]
                $propName = if ($null -eq $propName) { $name } else { $propName }

                # note the dynamic property name in the assignment
                # this will fail if the property doesn't have a member with the specified name
                $currentObject.$propName = $value
                continue
            }
            'quit:' {
                # ignore and exit
                break
            }
            default {
                # When writing the parser, it can be useful to include a line like the one below to see the cases that are not handled by the parser
                # Write-Warning "missing case for '$_'. Unexpected output format from cdb.exe"

                continue # skip lines that don't match the patterns we are interested in, like the start/end/modulename header and the quit: output
            }
        }
    }
    end {
        # this is needed to output the last object
        if ($null -ne $currentObject) {
            $currentObject
        }
    }
}


Get-ExecutableModuleRawData Notepad |
    ConvertTo-ExecutableModule |
    Sort-Object ProductVersion, Name |
    Format-Table -Property Name, FileVersionStr, ProductVersion, FileDescription

Name               FileVersionStr                             ProductVersion FileDescription
----               --------------                             -------------- ---------------
PROPSYS            7.0.17763.1 (WinBuild.160101.0800)         7.0.17763.1    Microsoft Property System
ADVAPI32           10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Advanced Windows 32 Base API
bcrypt             10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Windows Cryptographic Primitives Library
...
uxtheme            10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Microsoft UxTheme Library
win32u             10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Win32u
WINSPOOL           10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Windows Spooler Driver
KERNELBASE         10.0.17763.134 (WinBuild.160101.0800)      10.0.17763.134 Windows NT BASE API Client DLL
wintypes           10.0.17763.134 (WinBuild.160101.0800)      10.0.17763.134 Windows Base Types DLL
SHELL32            10.0.17763.168 (WinBuild.160101.0800)      10.0.17763.168 Windows Shell Common Dll
...
windows_storage    10.0.17763.168 (WinBuild.160101.0800)      10.0.17763.168 Microsoft WinRT Storage API
CoreMessaging      10.0.17763.194                             10.0.17763.194 Microsoft CoreMessaging Dll
gdi32full          10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 GDI Client DLL
ntdll              10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 NT Layer DLL
RMCLIENT           10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 Resource Manager Client
RPCRT4             10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 Remote Procedure Call Runtime
combase            10.0.17763.253 (WinBuild.160101.0800)      10.0.17763.253 Microsoft COM for Windows
COMCTL32           6.10 (WinBuild.160101.0800)                10.0.17763.253 User Experience Controls Library
urlmon             11.00.17763.168 (WinBuild.160101.0800)     11.0.17763.168 OLE32 Extensions for Win32
iertutil           11.00.17763.253 (WinBuild.160101.0800)     11.0.17763.253 Run time utility for Internet Explorer

Regex pattern breakdown

Here is a breakdown of the more complicated patterns, using the ignore pattern whitespace modifier x:

^([0-9a-f`]{17})\s([0-9a-f`]{17})\s+(\S+)\s+\(([^)]+)\)\s*(.+)?

# example input: 00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:\symbols\ntdll.pdb\B8AD79538F2730FD9BACE36C9F9316A01\ntdll.pdb

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
([0-9a-f`]{17})     # capture expression like 00007ff6`e9da0000 - any hex digit or backtick, and exactly 17 of them
\s                  # a space
([0-9a-f`]{17})     # capture expression like 00007ff6`e9da0000 - any hex digit or backtick, and exactly 17 of them
\s+                 # skip any number of spaces
(\S+)               # capture until we get a space - this would match the 'ntdll' part
\s+                 # skip one or more spaces
\(                  # a literal open parenthesis
    ([^)]+)         # capture anything but a close parenthesis - this would match 'pdb symbols'
\)                  # a literal close parenthesis
\s*                 # skip zero or more spaces
(.+)?               # optionally capture any symbol file path

Breakdown of the name-value pattern:

^\s+(?<name>[^:]+):\s+(?<value>.+)

# example input:  File flags:       0 (Mask 3F)

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
\s+                 # require one or more spaces
(?<name>[^:]+)      # capture anything that is not a ':' into the named group "name"
:                   # require a colon
\s+                 # require one or more spaces
(?<value>.+)        # capture everything until the end into the named group "value"

Breakdown of the timestamp pattern:

^\s+Timestamp:\s+[^(]+\((?<timestamp>.{8})\)

#example input:     Timestamp:        Mon Jan  7 23:42:30 2019 (5C33D5D6)

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
\s+                 # require one or more spaces
Timestamp:          # the literal text 'Timestamp:'
\s+                 # require one or more spaces
[^(]+               # one or more of anything but an open parenthesis
\(                  # a literal '('
(?<timestamp>.{8})  # 8 characters of anything, captured into the group 'timestamp'
\)                  # a literal ')'
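The eight captured hex digits are a Unix timestamp: seconds since 1970-01-01 UTC. As a quick sketch (using the value 5C33D5D6 from the example output above), the conversion mirrors what the parser does:

```powershell
# Convert the captured hex timestamp (seconds since the Unix epoch, UTC) to a DateTime
$hex = '5C33D5D6'
$seconds = [Convert]::ToInt32($hex, 16)
$timestamp = [DateTime]::new(1970, 1, 1, 0, 0, 0, [DateTimeKind]::Utc).AddSeconds($seconds)
# $timestamp is 2019-01-07 22:42:30 UTC - i.e. Mon Jan 7 23:42:30 2019 in UTC+1,
# matching the debugger's human-readable output
```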

Gotchas – the Regex Cache

Something that can happen if you are writing a more complicated parser is the following:
The parser works well. You have 15 regular expressions in your switch statement and then you get some input you haven’t seen before, so you add a 16th regex.
All of a sudden, the performance of your parser tanks. WTF?

The .net regex implementation has a cache of recently used regexs. You can check the size of it like this:

PS> [regex]::CacheSize
15

# bump it
[regex]::CacheSize = 20

And now your parser is fast(er) again.
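A defensive pattern (my own sketch, not from the original post) is to size the cache from the number of patterns your parser uses, with a little headroom:

```powershell
# Defensive sketch: make sure the regex cache can hold every pattern the parser uses,
# plus some headroom, so patterns aren't recompiled on every input line.
$parserPatterns = @(
    '^(\S.*)'            # record start
    '^\s+([^=]+)=(.*)'   # name=value property
    '^\s*$'              # blank separator line
)
$needed = $parserPatterns.Count + 5
if ([regex]::CacheSize -lt $needed) {
    [regex]::CacheSize = $needed
}
```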

Bonus tip

I frequently use PowerShell to write (generate) my code:

Get-ExecutableModuleRawData pwsh |
    Select-String '^\s+([^:]+):' |       # this pattern matches the module detail fields
    Foreach-Object {$_.matches.groups[1].value} |
    Select-Object -Unique |
    Foreach-Object -Begin   { "class ExecutableModuleData {" }`
                   -Process { "    [string] $" + ($_ -replace "\s.", {[char]::ToUpperInvariant($_.Groups[0].Value[1])}) }`
                   -End     { "}" }

The output is

class ExecutableModuleData {
    [string] $LoadedSymbolImageFile
    [string] $ImagePath
    [string] $ImageName
    [string] $Timestamp
    [string] $CheckSum
    [string] $ImageSize
    [string] $FileVersion
    [string] $ProductVersion
    [string] $FileFlags
    [string] $FileOS
    [string] $FileType
    [string] $FileDate
    [string] $Translations
    [string] $CompanyName
    [string] $ProductName
    [string] $InternalName
    [string] $OriginalFilename
    [string] $ProductVersion
    [string] $FileVersion
    [string] $FileDescription
    [string] $LegalCopyright
    [string] $Comments
    [string] $LegalTrademarks
    [string] $PrivateBuild
}

It is not complete – I don’t have the fields from the record start, some types are incorrect and when run against some other executables a few other fields may appear.
But it is a very good starting point. And way more fun than typing it 🙂

Note that this example is using a new feature of the -replace operator – to use a ScriptBlock to determine what to replace with – that was added in PowerShell Core 6.1.
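As a standalone illustration of that ScriptBlock overload (my own example), this uppercases the character following each whitespace run, the same trick used for the property-name generation above:

```powershell
# ScriptBlock-based -replace (PowerShell Core 6.1+): the block receives the Match
# object as $_ and its return value becomes the replacement text.
$result = 'file version' -replace '\s(.)', { $_.Groups[1].Value.ToUpperInvariant() }
# $result is 'fileVersion'
```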

Bonus tip #2

A regular expression construct that I often find useful is non-greedy matching.
The example below shows the effect of the ? modifier, that can be used after * (zero or more) and + (one or more)

# greedy matching - match to the last occurrence of the following character (>)
if("<Tag>Text</Tag>" -match '<(.+)>') { $matches }

Name                           Value
----                           -----
1                              Tag>Text</Tag
0                              <Tag>Text</Tag>

# non-greedy matching - match to the first occurrence of the following character (>)
if("<Tag>Text</Tag>" -match '<(.+?)>') { $matches }
Name                           Value
----                           -----
1                              Tag
0                              &lt;Tag&gt;

See Regex Repeat for more info on how to control pattern repetition.

Summary

In this post, we have looked at how the structure of a switch-based parser could look, and how it can be written so that it works as a part of the pipeline.
We have also looked at a few slightly more complicated regular expressions in some detail.

As we have seen, PowerShell has a plethora of options for parsing text, and most of them revolve around regular expressions.
My personal experience has been that the time I’ve invested in understanding the regex language was well invested.

Hopefully, this gives you a good start with the parsing tasks you have at hand.

Thanks to Jason Shirk, Mathias Jessen and Steve Lee for reviews and feedback.

Staffan Gustafsson, @StaffanGson, github

Staffan works at DICE in Stockholm, Sweden, as a Software Engineer and has been using PowerShell since the first public beta.
He was most seriously pleased when PowerShell was open sourced, and has since contributed bug fixes, new features and performance improvements.
Staffan is a speaker at PSConfEU and is always happy to talk PowerShell.

The post Parsing Text with PowerShell (3/3) appeared first on Powershell.

Parsing Text with PowerShell (3/3)

This post was originally published on this site

This is the third and final post in a three-part series.

  • Part 1:
    • Useful methods on the String class
    • Introduction to Regular Expressions
    • The Select-String cmdlet
  • Part 2:
    • the -split operator
    • the -match operator
    • the switch statement
    • the Regex class
  • Part 3:
    • a real world, complete and slightly bigger, example of a switch-based parser
      • General structure of a switch-based parser
      • The real world example

In the previous posts, we looked at the different operators that are available to us in PowerShell.

When analyzing crashes at DICE, I noticed that some of the C++ runtime binaries were missing debug symbols. They should be available for download from Microsoft’s public symbol server, and most versions were there. However, due to some process errors at DevDiv, some builds were released publicly without available debug symbols.
In some cases, those missing symbols prevented us from debugging those crashes, and in all cases, they triggered my developer OCD.

So, to give actionable feedback to Microsoft, I scripted a debugger (cdb.exe in this case) to give a verbose list of the loaded modules, and parsed the output with PowerShell, which was also later used to group and filter the resulting data set. I sent this data to Microsoft, and 5 days later, the missing symbols were available for download. Mission accomplished!

This post will describe the parser I wrote for this task (it turned out that I had good use for it for other tasks later), and the general structure is applicable to most parsing tasks.

The example will show how a switch-based parser would look when the input data isn’t as tidy as it normally is in examples, but messy – as the real world data often is.

General Structure of a switch Based Parser

Depending on the structure of our input, the code must be organized in slightly different ways.

Input may have a record start that differs by indentation or some distinct token, like:

Foo                    <- Record start - No whitespace at the beginning of the line
    Prop1=Staffan      <- Properties for the record - starts with whitespace
    Prop3 =ValueN
Bar
    Prop1=Steve
    Prop2=ValueBar2

If the data to be parsed has an explicit start record, it is a bit easier than if it doesn’t have one.
We create a new data object when we get a record start, after writing any previously created object to the pipeline.
At the end, we need to check if we have parsed a record that hasn’t been written to the pipeline.

The general structure of such a switch-based parser can be as follows:

$inputData = @"
Foo
    Prop1=Value1
    Prop3=Value3
Bar
    Prop1=ValueBar1
    Prop2=ValueBar2
"@ -split 'r?n'   # This regex is useful to split at line endings, with or without carriage return

class SomeDataClass {
    $ID
    $Name
    $Property2
    $Property3
}

# map to project input property names to the properties on our data class
$propertyNameMap = @{
    Prop1 = "Name"
    Prop2 = "Property2"
    Prop3 = "Property3"
}

$currentObject = $null
switch -regex ($inputData) {

    '^(\S.*)' {
        # record start pattern, in this case line that doesn't start with a whitespace.
        if ($null -ne $currentObject) {
            $currentObject                   # output to pipeline if we have a previous data object
        }
        $currentObject = [SomeDataClass] @{  # create new object for this record
            Id = $matches.1                  # with Id like Foo or Bar
        }
        continue
    }

    # set the properties on the data object
    '^\s+([^=]+)=(.*)' {
        $name, $value = $matches[1, 2]
        # project property names
        $propName = $propertyNameMap[$name]
        if ($null -eq $propName) {
            $propName = $name
        }
        # assign the parsed value to the projected property name
        $currentObject.$propName = $value
        continue
    }
}

if ($currentObject) {
    # Handle the last object if any
    $currentObject # output to pipeline
}
ID  Name      Property2 Property3
--  ----      --------- ---------
Foo Value1              Value3
Bar ValueBar1 ValueBar2

Alternatively, we may have input where the records are separated by a blank line, but without any obvious record start.

commitId=1234                         <- In this case, a commitId is first in a record
description=Update readme.md
                                      <- the blank line separates records
user=Staffan                          <- For this record, a user property comes first
commitId=1235
description=Fix bug.md

In this case the structure of the code looks a bit different. We create an object at the beginning, but keep track of if it’s dirty or not.
If we get to the end with a dirty object, we must output it.

$inputData = @"

commit=1234
desc=Update readme.md

user=Staffan
commit=1235
desc=Bug fix

"@ -split "r?n"

class SomeDataClass {
    [int] $CommitId
    [string] $Description
    [string] $User
}

# map to project input property names to the properties on our data class
# we only need to provide the ones that are different. 'User' works fine as it is.
$propertyNameMap = @{
    commit = "CommitId"
    desc   = "Description"
}

$currentObject = [SomeDataClass]::new()
$objectDirty = $false
switch -regex ($inputData) {
    # set the properties on the data object
    '^([^=]+)=(.*)$' {
        # parse a name/value
        $name, $value = $matches[1, 2]
        # project property names
        $propName = $propertyNameMap[$name]
        if ($null -eq $propName) {
            $propName = $name
        }
        # assign the projected property
        $currentObject.$propName = $value
        $objectDirty = $true
        continue
    }

    '^\s*$' {
        # separator pattern, in this case any blank line
        if ($objectDirty) {
            $currentObject                           # output to pipeline
            $currentObject = [SomeDataClass]::new()  # create new object
            $objectDirty = $false                    # and mark it as not dirty
        }
    }
    default {
        Write-Warning "Unexpected input: '$_'"
    }
}

if ($objectDirty) {
    # Handle the last object if any
    $currentObject # output to pipeline
}
CommitId Description      User
-------- -----------      ----
    1234 Update readme.md
    1235 Bug fix          Staffan

The Real World Example

I have adapted this sample slightly so that I get the loaded modules from a running process instead of from my crash dumps. The format of the output from the debugger is the same.
The following command launches a command line debugger on notepad, with a script that gives a verbose listing of the loaded modules, and quits:

# we need to muck around with the console output encoding to handle the trademark chars
# imagine no encodings
# it's easy if you try
# no code pages below us
# above us only sky
[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("iso-8859-1")

$proc = Start-Process notepad -PassThru
Start-Sleep -Seconds 1
$cdbOutput = cdb -y 'srv*c:\symbols*http://msdl.microsoft.com/download/symbols' -c ".reload -f;lmv;q" -p $proc.Id

The output of the command above is attached to the blog post for those who want to follow along but aren’t running Windows or don’t have cdb.exe installed.

The (abbreviated) output looks like this:

Microsoft (R) Windows Debugger Version 10.0.16299.15 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

*** wait with pending attach

************* Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
ModLoad: 00007ff6`e9da0000 00007ff6`e9de3000   C:\Windows\system32\notepad.exe
...
ModLoad: 00007ffe`97d80000 00007ffe`97db1000   C:\WINDOWS\SYSTEM32\ntmarta.dll
(98bc.40a0): Break instruction exception - code 80000003 (first chance)
ntdll!DbgBreakPoint:
00007ffe`9cd53050 cc              int     3
0:007> cdb: Reading initial command '.reload -f;lmv;q'
Reloading current modules
.....................................................
start             end                 module name
00007ff6`e9da0000 00007ff6`e9de3000   notepad    (pdb symbols)          c:\symbols\notepad.pdb\2352C62CDF448257FDBDDA4081A8F9081\notepad.pdb
    Loaded symbol image file: C:\Windows\system32\notepad.exe
    Image path: C:\Windows\system32\notepad.exe
    Image name: notepad.exe
    Image was built with /Brepro flag.
    Timestamp:        329A7791 (This is a reproducible build file hash, not a timestamp)
    CheckSum:         0004D15F
    ImageSize:        00043000
    File version:     10.0.17763.1
    Product version:  10.0.17763.1
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        1.0 App
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     Notepad
    OriginalFilename: NOTEPAD.EXE
    ProductVersion:   10.0.17763.1
    FileVersion:      10.0.17763.1 (WinBuild.160101.0800)
    FileDescription:  Notepad
    LegalCopyright:   © Microsoft Corporation. All rights reserved.
...
00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:\symbols\ntdll.pdb\B8AD79538F2730FD9BACE36C9F9316A01\ntdll.pdb
    Loaded symbol image file: C:\WINDOWS\SYSTEM32\ntdll.dll
    Image path: C:\WINDOWS\SYSTEM32\ntdll.dll
    Image name: ntdll.dll
    Image was built with /Brepro flag.
    Timestamp:        E8B54827 (This is a reproducible build file hash, not a timestamp)
    CheckSum:         001F20D1
    ImageSize:        001ED000
    File version:     10.0.17763.194
    Product version:  10.0.17763.194
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® Windows® Operating System
    InternalName:     ntdll.dll
    OriginalFilename: ntdll.dll
    ProductVersion:   10.0.17763.194
    FileVersion:      10.0.17763.194 (WinBuild.160101.0800)
    FileDescription:  NT Layer DLL
    LegalCopyright:   © Microsoft Corporation. All rights reserved.
quit:

The output starts with info that I’m not interested in here. I only want to get the detailed information about the loaded modules. It is not until the line

start             end                 module name

that I care about the output.

Also, at the end there is a line that we need to be aware of:

quit:

that is not part of the module output.

To skip the parts of the debugger output that we don’t care about, we have a boolean flag initially set to true.
If that flag is set, we check if the current line, $_, is the module header in which case we flip the flag.

    $inPreamble = $true
    switch -regex ($cdbOutput) {

        { $inPreamble -and $_ -eq "start             end                 module name" } { $inPreamble = $false; continue }

I have made the parser a separate function that reads its input from the pipeline. This way, I can use the same function to parse module data, regardless of how I got the module data. Maybe it was saved on a file. Or came from a dump, or a live process. It doesn’t matter, since the parser is decoupled from the data retrieval.

After the sample, there is a breakdown of the more complicated regular expressions used, so don’t despair if you don’t understand them at first.
Regular Expressions are notoriously hard to read, so much so that they make Perl look readable in comparison.

# define an class to store the data
class ExecutableModule {
    [string]   $Name
    [string]   $Start
    [string]   $End
    [string]   $SymbolStatus
    [string]   $PdbPath
    [bool]     $Reproducible
    [string]   $ImagePath
    [string]   $ImageName
    [DateTime] $TimeStamp
    [uint32]   $FileHash
    [uint32]   $CheckSum
    [uint32]   $ImageSize
    [version]  $FileVersion
    [version]  $ProductVersion
    [string]   $FileFlags
    [string]   $FileOS
    [string]   $FileType
    [string]   $FileDate
    [string[]] $Translations
    [string]   $CompanyName
    [string]   $ProductName
    [string]   $InternalName
    [string]   $OriginalFilename
    [string]   $ProductVersionStr
    [string]   $FileVersionStr
    [string]   $FileDescription
    [string]   $LegalCopyright
    [string]   $LegalTrademarks
    [string]   $LoadedImageFile
    [string]   $PrivateBuild
    [string]   $Comments
}

<#
.SYNOPSIS Runs a debugger on a program to dump its loaded modules
#>
function Get-ExecutableModuleRawData {
    param ([string] $Program)
    $consoleEncoding = [Console]::OutputEncoding
    [Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("iso-8859-1")
    try {
        $proc = Start-Process $program -PassThru
        Start-Sleep -Seconds 1  # sleep for a while so modules are loaded
        cdb -y 'srv*c:\symbols*http://msdl.microsoft.com/download/symbols' -c ".reload -f;lmv;q" -p $proc.Id
        $proc.Close()
    }
    finally {
        [Console]::OutputEncoding = $consoleEncoding
    }
}

<#
.SYNOPSIS Converts verbose module data from windows debuggers into ExecutableModule objects.
#>
function ConvertTo-ExecutableModule {
    [OutputType([ExecutableModule])]
    param (
        [Parameter(ValueFromPipeline)]
        [string[]] $ModuleRawData
    )
    begin {
        $currentObject = $null
        $preamble = $true
        $propertyNameMap = @{
            'File flags'      = 'FileFlags'
            'File OS'         = 'FileOS'
            'File type'       = 'FileType'
            'File date'       = 'FileDate'
            'File version'    = 'FileVersion'
            'Product version' = 'ProductVersion'
            'Image path'      = 'ImagePath'
            'Image name'      = 'ImageName'
            'FileVersion'     = 'FileVersionStr'
            'ProductVersion'  = 'ProductVersionStr'
        }
    }
    process {
        switch -regex ($ModuleRawData) {

            # skip lines until we get to our sentinel line
            { $preamble -and $_ -eq "start             end                 module name" } { $preamble = $false; continue }

            #00007ff6`e9da0000 00007ff6`e9de3000   notepad    (deferred)
            #00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:\symbols\ntdll.pdb\B8AD79538F2730FD9BACE36C9F9316A01\ntdll.pdb
            '^([0-9a-f`]{17})\s([0-9a-f`]{17})\s+(\S+)\s+\(([^)]+)\)\s*(.+)?' {
                # see breakdown of the expression later in the post
                # on record start, output the currentObject, if any is set
                if ($null -ne $currentObject) {
                    $currentObject
                }
                $start, $end, $module, $pdbKind, $pdbPath = $matches[1..5]
                # create an instance of the object that we are adding info from the current record into.
                $currentObject = [ExecutableModule] @{
                    Start        = $start
                    End          = $end
                    Name         = $module
                    SymbolStatus = $pdbKind
                    PdbPath      = $pdbPath
                }
                continue
            }
            '^\s+Image was built with /Brepro flag\.' {
                $currentObject.Reproducible = $true
                continue
            }
            '^\s+Timestamp:\s+[^(]+\((?<timestamp>.{8})\)' {
                # see breakdown of the regular  expression later in the post
                # Timestamp:        Mon Jan  7 23:42:30 2019 (5C33D5D6)
                $intValue = [Convert]::ToInt32($matches.timestamp, 16)
                $currentObject.TimeStamp = [DateTime]::new(1970, 01, 01, 0, 0, 0, [DateTimeKind]::Utc).AddSeconds($intValue)
                continue
            }
            '^\s+Timestamp:\s+(?<value>.{8}) \(This' {
                # Timestamp:        E78937AC (This is a reproducible build file hash, not a timestamp)
                $currentObject.FileHash = [Convert]::ToUInt32($matches.value, 16)
                continue
            }
            '^\s+Loaded symbol image file: (?<imageFile>[^)]+)' {
                $currentObject.LoadedImageFile = $matches.imageFile
                continue
            }
            '^\s+CheckSum:\s+(?<checksum>\S+)' {
                $currentObject.Checksum = [Convert]::ToUInt32($matches.checksum, 16)
                continue
            }
            '^\s+Translations:\s+(?<value>\S+)' {
                $currentObject.Translations = $matches.value.Split(".")
                continue
            }
            '^\s+ImageSize:\s+(?<imageSize>.{8})' {
                $currentObject.ImageSize = [Convert]::ToUInt32($matches.imageSize, 16)
                continue
            }
            '^\s{4}(?<name>[^:]+):\s+(?<value>.+)' {
                # see breakdown of the regular expression later in the post
                # This part is any 'name: value' pattern
                $name, $value = $matches['name', 'value']

                # project the property name
                $propName = $propertyNameMap[$name]
                $propName = if ($null -eq $propName) { $name } else { $propName }

                # note the dynamic property name in the assignment
                # this will fail if the property doesn't have a member with the specified name
                $currentObject.$propName = $value
                continue
            }
            'quit:' {
                # ignore and exit
                break
            }
            default {
                # When writing the parser, it can be useful to include a line like the one below to see the cases that are not handled by the parser
                # Write-Warning "missing case for '$_'. Unexpected output format from cdb.exe"

                continue # skip lines that don't match the patterns we are interested in, like the start/end/module name header and the quit: output
            }
        }
    }
    end {
        # this is needed to output the last object
        if ($null -ne $currentObject) {
            $currentObject
        }
    }
}


Get-ExecutableModuleRawData Notepad |
    ConvertTo-ExecutableModule |
    Sort-Object ProductVersion, Name |
    Format-Table -Property Name, FileVersionStr, ProductVersion, FileDescription
Name               FileVersionStr                             ProductVersion FileDescription
----               --------------                             -------------- ---------------
PROPSYS            7.0.17763.1 (WinBuild.160101.0800)         7.0.17763.1    Microsoft Property System
ADVAPI32           10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Advanced Windows 32 Base API
bcrypt             10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Windows Cryptographic Primitives Library
...
uxtheme            10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Microsoft UxTheme Library
win32u             10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Win32u
WINSPOOL           10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Windows Spooler Driver
KERNELBASE         10.0.17763.134 (WinBuild.160101.0800)      10.0.17763.134 Windows NT BASE API Client DLL
wintypes           10.0.17763.134 (WinBuild.160101.0800)      10.0.17763.134 Windows Base Types DLL
SHELL32            10.0.17763.168 (WinBuild.160101.0800)      10.0.17763.168 Windows Shell Common Dll
...
windows_storage    10.0.17763.168 (WinBuild.160101.0800)      10.0.17763.168 Microsoft WinRT Storage API
CoreMessaging      10.0.17763.194                             10.0.17763.194 Microsoft CoreMessaging Dll
gdi32full          10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 GDI Client DLL
ntdll              10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 NT Layer DLL
RMCLIENT           10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 Resource Manager Client
RPCRT4             10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 Remote Procedure Call Runtime
combase            10.0.17763.253 (WinBuild.160101.0800)      10.0.17763.253 Microsoft COM for Windows
COMCTL32           6.10 (WinBuild.160101.0800)                10.0.17763.253 User Experience Controls Library
urlmon             11.00.17763.168 (WinBuild.160101.0800)     11.0.17763.168 OLE32 Extensions for Win32
iertutil           11.00.17763.253 (WinBuild.160101.0800)     11.0.17763.253 Run time utility for Internet Explorer

Regex pattern breakdown

Here is a breakdown of the more complicated patterns, using the ignore pattern whitespace modifier x:

^([0-9a-f`]{17})\s([0-9a-f`]{17})\s+(\S+)\s+\(([^)]+)\)\s*(.+)?

# example input: 00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:\symbols\ntdll.pdb\B8AD79538F2730FD9BACE36C9F9316A01\ntdll.pdb

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
([0-9a-f`]{17})     # capture expression like 00007ff6`e9da0000 - any hex number or backtick, and exactly 17 of them
\s                  # a space
([0-9a-f`]{17})     # capture expression like 00007ff6`e9da0000 - any hex number or backtick, and exactly 17 of them
\s+                 # skip any number of spaces
(\S+)               # capture until we get a space - this would match the 'ntdll' part
\s+                 # skip one or more spaces
\(                  # a literal open parenthesis
([^)]+)             # capture anything but a close parenthesis - the symbol status, like 'pdb symbols'
\)                  # a literal close parenthesis
\s*                 # skip zero or more spaces
(.+)?               # optionally capture any symbol file path

Breakdown of the name-value pattern:

^\s+(?<name>[^:]+):\s+(?<value>.+)

# example input:  File flags:       0 (Mask 3F)

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
\s+                 # require one or more spaces
(?<name>[^:]+)      # capture anything that is not a ':' into the named group "name"
:                   # require a colon
\s+                 # require one or more spaces
(?<value>.+)        # capture everything until the end into the named group "value"

Breakdown of the timestamp pattern:

^\s+Timestamp:\s+[^(]+\((?<timestamp>.{8})\)

#example input:     Timestamp:        Mon Jan  7 23:42:30 2019 (5C33D5D6)

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
\s+                 # require one or more spaces
Timestamp:          # the literal text 'Timestamp:'
\s+                 # require one or more spaces
[^(]+               # one or more of anything but an open parenthesis
\(                  # a literal '('
(?<timestamp>.{8})  # 8 characters of anything, captured into the group 'timestamp'
\)                  # a literal ')'

Gotchas – the Regex Cache

Something that can happen if you are writing a more complicated parser is the following:
The parser works well. You have 15 regular expressions in your switch statement and then you get some input you haven’t seen before, so you add a 16th regex.
All of a sudden, the performance of your parser tanks. WTF?

The .NET regex implementation has a cache of recently used regexes. You can check its size like this:

PS> [regex]::CacheSize
15

# bump it
[regex]::CacheSize = 20

And now your parser is fast(er) again.

Bonus tip

I frequently use PowerShell to write (generate) my code:

Get-ExecutableModuleRawData pwsh |
    Select-String '^\s+([^:]+):' |       # this pattern matches the module detail fields
    Foreach-Object {$_.matches.groups[1].value} |
    Select-Object -Unique |
    Foreach-Object -Begin   { "class ExecutableModuleData {" }`
                   -Process { "    [string] $" + ($_ -replace "\s.", {[char]::ToUpperInvariant($_.Groups[0].Value[1])}) }`
                   -End     { "}" }

The output is

class ExecutableModuleData {
    [string] $LoadedSymbolImageFile
    [string] $ImagePath
    [string] $ImageName
    [string] $Timestamp
    [string] $CheckSum
    [string] $ImageSize
    [string] $FileVersion
    [string] $ProductVersion
    [string] $FileFlags
    [string] $FileOS
    [string] $FileType
    [string] $FileDate
    [string] $Translations
    [string] $CompanyName
    [string] $ProductName
    [string] $InternalName
    [string] $OriginalFilename
    [string] $ProductVersion
    [string] $FileVersion
    [string] $FileDescription
    [string] $LegalCopyright
    [string] $Comments
    [string] $LegalTrademarks
    [string] $PrivateBuild
}

It is not complete – I don’t have the fields from the record start, some types are incorrect and when run against some other executables a few other fields may appear.
But it is a very good starting point. And way more fun than typing it 🙂

Note that this example is using a new feature of the -replace operator – to use a ScriptBlock to determine what to replace with – that was added in PowerShell Core 6.1.

Bonus tip #2

A regular expression construct that I often find useful is non-greedy matching.
The example below shows the effect of the ? modifier, which can be used after * (zero or more) and + (one or more).

# greedy matching - match to the last occurrence of the following character (>)
if("<Tag>Text</Tag>" -match '<(.+)>') { $matches }
Name                           Value
----                           -----
1                              Tag>Text</Tag
0                              <Tag>Text</Tag>
# non-greedy matching - match to the first occurrence of the the following character (>)
if("<Tag>Text</Tag>" -match '<(.+?)>') { $matches }
Name                           Value
----                           -----
1                              Tag
0                              <Tag>

See Regex Repeat for more info on how to control pattern repetition.

Summary

In this post, we have looked at how the structure of a switch-based parser could look, and how it can be written so that it works as a part of the pipeline.
We have also looked at a few slightly more complicated regular expressions in some detail.

As we have seen, PowerShell has a plethora of options for parsing text, and most of them revolve around regular expressions.
My personal experience has been that the time I’ve invested in understanding the regex language was well invested.

Hopefully, this gives you a good start with the parsing tasks you have at hand.

Thanks to Jason Shirk, Mathias Jessen and Steve Lee for reviews and feedback.

Staffan Gustafsson, @StaffanGson, github

Staffan works at DICE in Stockholm, Sweden, as a Software Engineer and has been using PowerShell since the first public beta.
He was most seriously pleased when PowerShell was open sourced, and has since contributed bug fixes, new features and performance improvements.
Staffan is a speaker at PSConfEU and is always happy to talk PowerShell.

Parsing Text with PowerShell (2/3)

This post was originally published on this site

This is the second post in a three-part series.

  • Part 1:
    • Useful methods on the String class
    • Introduction to Regular Expressions
    • The Select-String cmdlet
  • Part 2:
    • the -split operator
    • the -match operator
    • the switch statement
    • the Regex class
  • Part 3:
    • a real world, complete and slightly bigger, example of a switch-based parser

The -split operator

The -split operator splits one or more strings into substrings.

The first example is a name-value pattern, which is a common parsing task. Note the usage of the Max-substrings parameter to the -split operator.
We want to ensure that it doesn’t matter if the value contains the character to split on.

$text = "Description=The '=' character is used for assigning values to a variable"
$name, $value = $text -split "=", 2

@"
Name  =  $name
Value =  $value
"@

Name  =  Description
Value =  The '=' character is used for assigning values to a variable

When the line to parse contains fields separated by a well known separator, that is never a part of the field values, we can use the -split operator in combination with multiple assignment to get the fields into variables.

$name, $location, $occupation = "Spider Man,New York,Super Hero" -split ','

If only the location is of interest, the unwanted items can be assigned to $null.

$null, $location, $null = "Spider Man,New York,Super Hero" -split ','

$location

New York

If there are many fields, assigning to null doesn’t scale well. Indexing can be used instead, to get the fields of interest.

$inputText = "x,Staffan,x,x,x,x,x,x,x,x,x,x,Stockholm,x,x,x,x,x,x,x,x,11,x,x,x,x"
$name, $location, $age = ($inputText -split ',')[1,12,21]

$name
$location
$age

Staffan
Stockholm
11

It is almost always a good idea to create an object that gives context to the different parts.

$inputText = "x,Steve,x,x,x,x,x,x,x,x,x,x,Seattle,x,x,x,x,x,x,x,x,22,x,x,x,x"
$name, $location, $age = ($inputText -split ',')[1,12,21]
[PSCustomObject] @{
    Name = $name
    Location = $location
    Age = [int] $age
}

Name  Location Age
----  -------- ---
Steve Seattle   22

Instead of creating a PSCustomObject, we can create a class. It’s a bit more to type, but we can get more help from the engine, for example with tab completion.

The example below also shows an example of type conversion, where the default string to number conversion doesn’t work.
The age field is handled by PowerShell’s built-in type conversion. It is of type [int], and PowerShell will handle the conversion from string to int,
but in some cases we need to help out a bit. The ShoeSize field is also an [int], but the data is hexadecimal,
and without the hex specifier (‘0x’), this conversion fails for some values, and provides incorrect results for the others.

class PowerSheller {
    [string] $Name
    [string] $Location
    [int] $Age
    [int] $ShoeSize
}

$inputText = "x,Staffan,x,x,x,x,x,x,x,x,x,x,Stockholm,x,x,x,x,x,x,x,x,33,x,11d,x,x"
$name, $location, $age, $shoeSize = ($inputText -split ',')[1,12,21,23]
[PowerSheller] @{
    Name = $name
    Location = $location
    Age = $age
    # ShoeSize is expressed in hex, with no '0x' because reasons 🙂
    # And yes, it's in millimeters.
    ShoeSize = [Convert]::ToInt32($shoeSize, 16)
}

Name    Location  Age ShoeSize
----    --------  --- --------
Staffan Stockholm  33      285

The split operator’s first argument is actually a regex (by default, can be changed with options).
I use this on long command lines in log files (like those given to compilers) where there can be hundreds of options specified. This makes it hard to see if a certain option is specified or not, but when split into their own lines, it becomes trivial.
The pattern below uses a positive lookahead assertion.
It can be very useful to make patterns match only in a given context, like if they are, or are not, preceded or followed by another pattern.

$cmdline = "cl.exe /D Bar=1 /I SomePath /D Foo  /O2 /I SomeOtherPath /Debug a1.cpp a3.cpp a2.cpp"

$cmdline -split "\s+(?=[-/])"

cl.exe
/D Bar=1
/I SomePath
/D Foo
/O2
/I SomeOtherPath
/Debug a1.cpp a3.cpp a2.cpp

Breaking down the regex, by rewriting it with the x option:

(?x)      # ignore whitespace in the pattern, and enable comments after '#'
\s+       # one or more spaces
(?=[-/])  # only match the previous spaces if they are followed by any of '-' or '/'.
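The (?x) form is not only documentation; the same pattern, collapsed onto one line, can be passed straight to -split. A small sketch, reusing a shortened command line:

```powershell
# Spaces inside a (?x) pattern are ignored, so whitespace must be matched with \s+.
$cmdline = "cl.exe /D Bar=1 /I SomePath"
$cmdline -split '(?x) \s+ (?=[-/])'
```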

Splitting with a scriptblock

The -split operator also comes in another form, where you can pass it a scriptblock instead of a regular expression.
This allows for more complicated logic that can be hard or impossible to express as a regular expression.

The scriptblock accepts two parameters, the text to split and the current index. $_ is bound to the character at the current index.

function SplitWhitespaceInMiddleOfText {
    param(
        [string]$Text,
        [int] $Index
    )
    if ($Index -lt 10 -or $Index -gt 40){
        return $false
    }
    $_ -match '\s'
}

$inputText = "Some text that only needs splitting in the middle of the text"
$inputText -split $function:SplitWhitespaceInMiddleOfText

Some text that
only
needs
splitting
in
the middle of the text

The $function:SplitWhitespaceInMiddleOfText syntax is a way to get the content of the function (the scriptblock that implements it), just as $env:UserName gets the content of an item in the env: drive.
It provides a way to document and/or reuse the scriptblock.
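For one-off splits, the scriptblock can also be written inline; a minimal sketch, splitting on either of two delimiter characters:

```powershell
# $_ is bound to the current character; returning $true splits at that position.
'a,b;c' -split { $_ -eq ',' -or $_ -eq ';' }
```

This particular multi-delimiter split could also be done with a regex character class, but the scriptblock form extends naturally to logic that a regex cannot express.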

The -match operator

The -match operator works in conjunction with the $matches automatic variable. Each time a -match or a -notmatch succeeds, the $matches variable is populated so that each capture group gets its own entry. If the capture group is named, the key will be the name of the group, otherwise it will be the index.

As an example:

if ('a b c' -match '(\w) (?<named>\w) (\w)'){
    $matches
}

Name                           Value
----                           -----
named                          b
2                              c
1                              a
0                              a b c

Notice that only the unnamed groups get numeric entries in $matches, so naming a group changes the indices of the unnamed groups that follow it.

Armed with the regex knowledge from the earlier post, we can write the following:

PS> "    10,Some text" -match '^\s+(\d+),(.+)'

True

PS> $matches
Name                           Value
----                           -----
2                              Some text
1                              10
0                                  10,Some text

or with named groups

PS> "    10,Some text" -match '^\s+(?<num>\d+),(?<text>.+)'

True

PS> $matches
Name                           Value
----                           -----
num                            10
text                           Some text
0                                  10,Some text

The important thing here is to put parentheses around the parts of the pattern that we want to extract. That is what creates the capture groups that allow us to reference those parts of the matching text, either by name or by index.

Combining this into a function makes it easy to use:

function ParseMyString($text){
    if ($text -match '^\s+(\d+),(.+)') {
        [PSCustomObject] @{
            Number = [int] $matches[1]
            Text    = $matches[2]
        }
    }
    else {
        Write-Warning "ParseMyString: Input '$text' doesn't match pattern"
    }
}

ParseMyString "    10,Some text"

Number Text
------ ----
    10 Some text

Notice the type conversion when assigning the Number property. As long as the number fits in an integer, this will always succeed, since we have already made a successful match in the if statement above. ([long] or [bigint] could be used instead; in this case I control the input, and I have promised myself to stick to a range that fits in a 32-bit integer.)
Now we will be able to sort or do numerical operations on the Number property, and it will behave like we want it to – as a number, not as a string.
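The difference shows up as soon as we sort; a quick illustration of string sorting versus numeric sorting:

```powershell
# Sorted as strings, '9' comes last, because comparison is character by character.
'9','100','20' | Sort-Object    # 100, 20, 9

# Sorted as numbers, we get the order we usually want.
9,100,20 | Sort-Object          # 9, 20, 100
```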

The switch statement

Now we’re at the big guns 🙂

The switch statement in PowerShell has been given special functionality for parsing text.
It has two flags that are useful for parsing text and text files: -regex and -file.

When specifying -regex, the match clauses that are strings are treated as regular expressions. The switch statement also sets the $matches automatic variable.

When specifying -file, PowerShell treats the input as a file name, to read input from, rather than as a value statement.
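As a small sketch of -file (the file name and its contents are made up for the example):

```powershell
# Write a throwaway input file; numbers.txt is a hypothetical path.
Set-Content -Path ./numbers.txt -Value "    10,Some Text", "    20,Other Text"

# The switch statement reads the file line by line and matches each line.
switch -regex -file ./numbers.txt {
    '^\s+(\d+),(.+)' { '{0}: {1}' -f $matches[1], $matches[2] }
}
```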

Note the use of a ScriptBlock instead of a string as the match clause to determine if we should skip preamble lines.

class ParsedOutput {
    [int] $Number
    [string] $Text

    [string] ToString() { return "{0} ({1})" -f $this.Text, $this.Number }
}

$inputData =
    "Preamble line",
    "LastLineOfPreamble",
    "    10,Some Text",
    "    Some other text,20"

$inPreamble = $true
switch -regex ($inputData) {

    {$inPreamble -and $_ -eq 'LastLineOfPreamble'} { $inPreamble = $false; continue }

    "^\s+(?<num>\d+),(?<text>.+)" {  # this matches the first line of non-preamble input
        [ParsedOutput] @{
            Number = $matches.num
            Text = $matches.text
        }
        continue
    }

    "^\s+(?<text>[^,]+),(?<num>\d+)" { # this matches the second line of non-preamble input
        [ParsedOutput] @{
            Number = $matches.num
            Text = $matches.text
        }
        continue
    }
}

Number Text
------ ----
    10 Some Text
    20 Some other text

The pattern [^,]+ in the text group in the code above is useful. It means match anything that is not a comma. We are using the any-of construct [], and within those brackets, ^ changes meaning from 'the beginning of the line' to 'anything but the following characters'.

That is useful when we are matching delimited fields. A requirement is that the delimiter cannot be part of the set of allowed field values.

The regex class

regex is a type accelerator for System.Text.RegularExpressions.Regex. It can be useful when porting code from C#, and sometimes when we want more control in situations where we have many captures of a group. It also allows us to pre-create the regular expressions, which can matter in performance-sensitive scenarios, and to specify a timeout.
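Pre-creating a pattern with options and a timeout might look like this (the [regex]::new constructor syntax is available from PowerShell 5):

```powershell
# Compile the pattern once and cap any single match attempt at one second.
$pattern = [regex]::new('^\s+(\d+),(.+)', 'Compiled', [timespan]::FromSeconds(1))
$pattern.IsMatch('    10,Some Text')
```

If a match attempt exceeds the timeout, a RegexMatchTimeoutException is thrown instead of the call hanging indefinitely.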

One instance where the regex class is needed is when you have multiple captures of a group.

Consider the following:

Text    Pattern
a,b,c,  (\w,)+

If the match operator is used, $matches will contain

Name                           Value
----                           -----
1                              c,
0                              a,b,c,

The pattern matched three times, for a,, b, and c,. However, only the last match is preserved in the $matches dictionary.
However, the following will allow us to get to all the captures of the group:

[regex]::Match('a,b,c,', '(\w,)+').Groups[1].Captures

Index Length Value
----- ------ -----
    0      2 a,
    2      2 b,
    4      2 c,

Below is an example that uses the members of the Regex class to parse input data

class ParsedOutput {
    [int] $Number
    [string] $Text

    [string] ToString() { return "{0} ({1})" -f $this.Text, $this.Number }
}

$inputData =
    "    10,Some Text",
    "    Some other text,20"  # this text will not match

[regex] $pattern = "^\s+(\d+),(.+)"

foreach($d in $inputData){
    $match = $pattern.Match($d)
    if ($match.Success){
        $number, $text = $match.Groups[1,2].Value
        [ParsedOutput] @{
            Number = $number
            Text = $text
        }
    }
    else {
        Write-Warning "regex: '$d' did not match pattern '$pattern'"
    }
}

WARNING: regex: '    Some other text,20' did not match pattern '^\s+(\d+),(.+)'
Number Text
------ ----
    10 Some Text

It may surprise you that the warning appears before the output. PowerShell has a quite complex formatting system at the end of the pipeline, which treats pipeline output differently from other streams. Among other things, it buffers output at the beginning of a pipeline to calculate sensible column widths. This works well in practice, but sometimes causes strange reordering of output on different streams.

Summary

In this post we have looked at how the -split operator can be used to split a string in parts, how the -match operator can be used to extract different patterns from some text, and how the powerful switch statement can be used to match against multiple patterns.

We ended by looking at the regex class, which in some cases provides a bit more control, at the expense of ease of use. This concludes the second part of this series. Next time, we will look at a complete, real-world example of a switch-based parser.

Thanks to Jason Shirk, Mathias Jessen and Steve Lee for reviews and feedback.

Staffan Gustafsson, @StaffanGson, powercode@github

Staffan works at DICE in Stockholm, Sweden, as a Software Engineer and has been using PowerShell since the first public beta.
He was most seriously pleased when PowerShell was open sourced, and has since contributed bug fixes, new features and performance improvements.
Staffan is a speaker at PSConfEU and is always happy to talk PowerShell.

The post Parsing Text with PowerShell (2/3) appeared first on PowerShell.