Category Archives: Powershell

Parsing Text with PowerShell (3/3)

This post was originally published on this site

This is the third and final post in a three-part series.

  • Part 1:
    • Useful methods on the String class
    • Introduction to Regular Expressions
    • The Select-String cmdlet
  • Part 2:
    • the -split operator
    • the -match operator
    • the switch statement
    • the Regex class
  • Part 3:
    • a real world, complete and slightly bigger, example of a switch-based parser
      • General structure of a switch-based parser
      • The real world example

In the previous posts, we looked at the different operators what are available to us in PowerShell.

When analyzing crashes at DICE, I noticed that some of the C++ runtime binaries where missing debug symbols. They should be available for download from Microsoft’s public symbol server, and most versions were there. However, due to some process errors at DevDiv, some builds were released publicly without available debug symbols.
In some cases, those missing symbols prevented us from debugging those crashes, and in all cases, they triggered my developer OCD.

So, to give actionable feedback to Microsoft, I scripted a debugger (cdb.exe in this case) to give a verbose list of the loaded modules, and parsed the output with PowerShell, which was also later used to group and filter the resulting data set. I sent this data to Microsoft, and 5 days later, the missing symbols were available for download. Mission accomplished!

This post will describe the parser I wrote for this task (it turned out that I had good use for it for other tasks later), and the general structure is applicable to most parsing tasks.

The example will show how a switch-based parser would look when the input data isn’t as tidy as it normally is in examples, but messy – as the real world data often is.

General Structure of a switch Based Parser

Depending on the structure of our input, the code must be organized in slightly different ways.

Input may have a record start that differs by indentation or some distinct token like

Foo                    <- Record start - No whitespace at the beginning of the line
    Prop1=Staffan      <- Properties for the record - starts with whitespace
    Prop3 =ValueN
Bar
    Prop1=Steve
    Prop2=ValueBar2

If the data to be parsed has an explicit start record, it is a bit easier than if it doesn’t have one.
We create a new data object when we get a record start, after writing any previously created object to the pipeline.
At the end, we need to check if we have parsed a record that hasn’t been written to the pipeline.

The general structure of a such a switch-based parser can be as follows:

$inputData = @"
Foo
    Prop1=Value1
    Prop3=Value3
Bar
    Prop1=ValueBar1
    Prop2=ValueBar2
"@ -split 'r?n'   # This regex is useful to split at line endings, with or without carriage return

class SomeDataClass {
    $ID
    $Name
    $Property2
    $Property3
}

# map to project input property names to the properties on our data class
$propertyNameMap = @{
    Prop1 = "Name"
    Prop2 = "Property2"
    Prop3 = "Property3"
}

$currentObject = $null
switch -regex ($inputData) {

    '^(S.*)' {
        # record start pattern, in this case line that doesn't start with a whitespace.
        if ($null -ne $currentObject) {
            $currentObject                   # output to pipeline if we have a previous data object
        }
        $currentObject = [SomeDataClass] @{  # create new object for this record
            Id = $matches.1                  # with Id like Foo or Bar
        }
        continue
    }

    # set the properties on the data object
    '^s+([^=]+)=(.*)' {
        $name, $value = $matches[1, 2]
        # project property names
        $propName = $propertyNameMap[$name]
        if ($propName = $null) {
            $propName = $name
        }
        # assign the parsed value to the projected property name
        $currentObject.$propName = $value
        continue
    }
}

if ($currentObject) {
    # Handle the last object if any
    $currentObject # output to pipeline
}
ID  Name      Property2 Property3
--  ----      --------- ---------
Foo Value1              Value3
Bar ValueBar1 ValueBar2

Alternatively, we may have input where the records are separated by a blank line, but without any obvious record start.

commitId=1234                         <- In this case, a commitId is first in a record
description=Update readme.md
                                      <- the blank line separates records
user=Staffan                          <- For this record, a user property comes first
commitId=1235
description=Fix bug.md

In this case the structure of the code looks a bit different. We create an object at the beginning, but keep track of if it’s dirty or not.
If we get to the end with a dirty object, we must output it.

$inputData = @"

commit=1234
desc=Update readme.md

user=Staffan
commit=1235
desc=Bug fix

"@ -split "r?n"

class SomeDataClass {
    [int] $CommitId
    [string] $Description
    [string] $User
}

# map to project input property names to the properties on our data class
# we only need to provide the ones that are different. 'User' works fine as it is.
$propertyNameMap = @{
    commit = "CommitId"
    desc   = "Description"
}

$currentObject = [SomeDataClass]::new()
$objectDirty = $false
switch -regex ($inputData) {
    # set the properties on the data object
    '^([^=]+)=(.*)$' {
        # parse a name/value
        $name, $value = $matches[1, 2]
        # project property names
        $propName = $propertyNameMap[$name]
        if ($null -eq $propName) {
            $propName = $name
        }
        # assign the projected property
        $currentObject.$propName = $value
        $objectDirty = $true
        continue
    }

    '^s*$' {
        # separator pattern, in this case any blank line
        if ($objectDirty) {
            $currentObject                           # output to pipeline
            $currentObject = [SomeDataClass]::new()  # create new object
            $objectDirty = $false                    # and mark it as not dirty
        }
    }
    default {
        Write-Warning "Unexpected input: '$_'"
    }
}

if ($objectDirty) {
    # Handle the last object if any
    $currentObject # output to pipeline
}
CommitId Description      User
-------- -----------      ----
    1234 Update readme.md
    1235 Bug fix          Staffan

The Real World Example

I have adapted this sample slightly so that I get the loaded modules from a running process instead of from my crash dumps. The format of the output from the debugger is the same.
The following command launches a command line debugger on notepad, with a script that gives a verbose listing of the loaded modules, and quits:

# we need to muck around with the console output encoding to handle the trademark chars
# imagine no encodings
# it's easy if you try
# no code pages below us
# above us only sky
[Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("iso-8859-1")

$proc = Start-Process notepad -passthru
Start-Sleep -seconds 1
$cdbOutput = cdb -y 'srv*c:symbols*http://msdl.microsoft.com/download/symbols' -c ".reload -f;lmv;q" -p $proc.ProcessID

The output of the command above is attached to the blog post for those who want to follow along but who aren’t running windows or don’t have cdb.exe installed.

The (abbreviated) output looks like this:

Microsoft (R) Windows Debugger Version 10.0.16299.15 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

*** wait with pending attach

************* Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       srv*c:symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: srv*c:symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
ModLoad: 00007ff6`e9da0000 00007ff6`e9de3000   C:Windowssystem32notepad.exe
...
ModLoad: 00007ffe`97d80000 00007ffe`97db1000   C:WINDOWSSYSTEM32ntmarta.dll
(98bc.40a0): Break instruction exception - code 80000003 (first chance)
ntdll!DbgBreakPoint:
00007ffe`9cd53050 cc              int     3
0:007> cdb: Reading initial command '.reload -f;lmv;q'
Reloading current modules
.....................................................
start             end                 module name
00007ff6`e9da0000 00007ff6`e9de3000   notepad    (pdb symbols)          c:symbolsnotepad.pdb2352C62CDF448257FDBDDA4081A8F9081notepad.pdb
    Loaded symbol image file: C:Windowssystem32notepad.exe
    Image path: C:Windowssystem32notepad.exe
    Image name: notepad.exe
    Image was built with /Brepro flag.
    Timestamp:        329A7791 (This is a reproducible build file hash, not a timestamp)
    CheckSum:         0004D15F
    ImageSize:        00043000
    File version:     10.0.17763.1
    Product version:  10.0.17763.1
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        1.0 App
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft??? Windows??? Operating System
    InternalName:     Notepad
    OriginalFilename: NOTEPAD.EXE
    ProductVersion:   10.0.17763.1
    FileVersion:      10.0.17763.1 (WinBuild.160101.0800)
    FileDescription:  Notepad
    LegalCopyright:   ??? Microsoft Corporation. All rights reserved.
...
00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:symbolsntdll.pdbB8AD79538F2730FD9BACE36C9F9316A01ntdll.pdb
    Loaded symbol image file: C:WINDOWSSYSTEM32ntdll.dll
    Image path: C:WINDOWSSYSTEM32ntdll.dll
    Image name: ntdll.dll
    Image was built with /Brepro flag.
    Timestamp:        E8B54827 (This is a reproducible build file hash, not a timestamp)
    CheckSum:         001F20D1
    ImageSize:        001ED000
    File version:     10.0.17763.194
    Product version:  10.0.17763.194
    File flags:       0 (Mask 3F)
    File OS:          40004 NT Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft??? Windows??? Operating System
    InternalName:     ntdll.dll
    OriginalFilename: ntdll.dll
    ProductVersion:   10.0.17763.194
    FileVersion:      10.0.17763.194 (WinBuild.160101.0800)
    FileDescription:  NT Layer DLL
    LegalCopyright:   ??? Microsoft Corporation. All rights reserved.
quit:

The output starts with info that I’m not interested in here. I only want to get the detailed information about the loaded modules. It is not until the line

start             end                 module name

that I care about the output.

Also, at the end there is a line that we need to be aware of:

quit:

that is not part of the module output.

To skip the parts of the debugger output that we don’t care about, we have a boolean flag initially set to true.
If that flag is set, we check if the current line, $_, is the module header in which case we flip the flag.

    $inPreamble = $true
    switch -regex ($cdbOutput) {

        { $inPreamble -and $_ -eq "start             end                 module name" } { $inPreamble = $false; continue }

I have made the parser a separate function that reads its input from the pipeline. This way, I can use the same function to parse module data, regardless of how I got the module data. Maybe it was saved on a file. Or came from a dump, or a live process. It doesn’t matter, since the parser is decoupled from the data retrieval.

After the sample, there is a breakdown of the more complicated regular expressions used, so don’t despair if you don’t understand them at first.
Regular Expressions are notoriously hard to read, so much so that they make Perl look readable in comparison.

# define an class to store the data
class ExecutableModule {
    [string]   $Name
    [string]   $Start
    [string]   $End
    [string]   $SymbolStatus
    [string]   $PdbPath
    [bool]     $Reproducible
    [string]   $ImagePath
    [string]   $ImageName
    [DateTime] $TimeStamp
    [uint32]   $FileHash
    [uint32]   $CheckSum
    [uint32]   $ImageSize
    [version]  $FileVersion
    [version]  $ProductVersion
    [string]   $FileFlags
    [string]   $FileOS
    [string]   $FileType
    [string]   $FileDate
    [string[]] $Translations
    [string]   $CompanyName
    [string]   $ProductName
    [string]   $InternalName
    [string]   $OriginalFilename
    [string]   $ProductVersionStr
    [string]   $FileVersionStr
    [string]   $FileDescription
    [string]   $LegalCopyright
    [string]   $LegalTrademarks
    [string]   $LoadedImageFile
    [string]   $PrivateBuild
    [string]   $Comments
}

<#
.SYNOPSIS Runs a debugger on a program to dump its loaded modules
#>
function Get-ExecutableModuleRawData {
    param ([string] $Program)
    $consoleEncoding = [Console]::OutputEncoding
    [Console]::OutputEncoding = [System.Text.Encoding]::GetEncoding("iso-8859-1")
    try {
        $proc = Start-Process $program -PassThru
        Start-Sleep -Seconds 1  # sleep for a while so modules are loaded
        cdb -y srv*c:symbols*http://msdl.microsoft.com/download/symbols -c ".reload -f;lmv;q" -p $proc.Id
        $proc.Close()
    }
    finally {
        [Console]::OutputEncoding = $consoleEncoding
    }
}

<#
.SYNOPSIS Converts verbose module data from windows debuggers into ExecutableModule objects.
#>
function ConvertTo-ExecutableModule {
    [OutputType([ExecutableModule])]
    param (
        [Parameter(ValueFromPipeline)]
        [string[]] $ModuleRawData
    )
    begin {
        $currentObject = $null
        $preamble = $true
        $propertyNameMap = @{
            'File flags'      = 'FileFlags'
            'File OS'         = 'FileOS'
            'File type'       = 'FileType'
            'File date'       = 'FileDate'
            'File version'    = 'FileVersion'
            'Product version' = 'ProductVersion'
            'Image path'      = 'ImagePath'
            'Image name'      = 'ImageName'
            'FileVersion'     = 'FileVersionStr'
            'ProductVersion'  = 'ProductVersionStr'
        }
    }
    process {
        switch -regex ($ModuleRawData) {

            # skip lines until we get to our sentinel line
            { $preamble -and $_ -eq "start             end                 module name" } { $preamble = $false; continue }

            #00007ff6`e9da0000 00007ff6`e9de3000   notepad    (deferred)
            #00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:symbolsntdll.pdbB8AD79538F2730FD9BACE36C9F9316A01ntdll.pdb
            '^([0-9a-f`]{17})s([0-9a-f`]{17})s+(S+)s+(([^)]+))s*(.+)?' {
                # see breakdown of the expression later in the post
                # on record start, output the currentObject, if any is set
                if ($null -ne $currentObject) {
                    $currentObject
                }
                $start, $end, $module, $pdbKind, $pdbPath = $matches[1..5]
                # create an instance of the object that we are adding info from the current record into.
                $currentObject = [ExecutableModule] @{
                    Start        = $start
                    End          = $end
                    Name         = $module
                    SymbolStatus = $pdbKind
                    PdbPath      = $pdbPath
                }
                continue
            }
            '^s+Image was built with /Brepro flag.' {
                $currentObject.Reproducible = $true
                continue
            }
            '^s+Timestamp:s+[^(]+((?<timestamp>.{8}))' {
                # see breakdown of the regular  expression later in the post
                # Timestamp:        Mon Jan  7 23:42:30 2019 (5C33D5D6)
                $intValue = [Convert]::ToInt32($matches.timestamp, 16)
                $currentObject.TimeStamp = [DateTime]::new(1970, 01, 01, 0, 0, 0, [DateTimeKind]::Utc).AddSeconds($intValue)
                continue
            }
            '^s+TimeStamp:s+(?<value>.{8}) (This' {
                # Timestamp:        E78937AC (This is a reproducible build file hash, not a timestamp)
                $currentObject.FileHash = [Convert]::ToUInt32($matches.value, 16)
                continue
            }
            '^s+Loaded symbol image file: (?<imageFile>[^)]+)' {
                $currentObject.LoadedImageFile = $matches.imageFile
                continue
            }
            '^s+Checksum:s+(?<checksum>S+)' {
                $currentObject.Checksum = [Convert]::ToUInt32($matches.checksum, 16)
                continue
            }
            '^s+Translations:s+(?<value>S+)' {
                $currentObject.Translations = $matches.value.Split(".")
                continue
            }
            '^s+ImageSize:s+(?<imageSize>.{8})' {
                $currentObject.ImageSize = [Convert]::ToUInt32($matches.imageSize, 16)
                continue
            }
            '^s{4}(?<name>[^:]+):s+(?<value>.+)' {
                # see breakdown of the regular expression later in the post
                # This part is any 'name: value' pattern
                $name, $value = $matches['name', 'value']

                # project the property name
                $propName = $propertyNameMap[$name]
                $propName = if ($null -eq $propName) { $name } else { $propName }

                # note the dynamic property name in the assignment
                # this will fail if the property doesn't have a member with the specified name
                $currentObject.$propName = $value
                continue
            }
            'quit:' {
                # ignore and exit
                break
            }
            default {
                # When writing the parser, it can be useful to include a line like the one below to see the cases that are not handled by the parser
                # Write-Warning "missing case for '$_'. Unexpected output format from cdb.exe"

                continue # skip lines that doesn't match the patterns we are interested in, like the start/end/modulename header and the quit: output
            }
        }
    }
    end {
        # this is needed to output the last object
        if ($null -ne $currentObject) {
            $currentObject
        }
    }
}


Get-ExecutableModuleRawData Notepad |
    ConvertTo-ExecutableModule |
    Sort-Object ProductVersion, Name
    Format-Table -Property Name, FileVersion, Product_Version, FileDescription
Name               FileVersionStr                             ProductVersion FileDescription
----               --------------                             -------------- ---------------
PROPSYS            7.0.17763.1 (WinBuild.160101.0800)         7.0.17763.1    Microsoft Property System
ADVAPI32           10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Advanced Windows 32 Base API
bcrypt             10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Windows Cryptographic Primitives Library
...
uxtheme            10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Microsoft UxTheme Library
win32u             10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Win32u
WINSPOOL           10.0.17763.1 (WinBuild.160101.0800)        10.0.17763.1   Windows Spooler Driver
KERNELBASE         10.0.17763.134 (WinBuild.160101.0800)      10.0.17763.134 Windows NT BASE API Client DLL
wintypes           10.0.17763.134 (WinBuild.160101.0800)      10.0.17763.134 Windows Base Types DLL
SHELL32            10.0.17763.168 (WinBuild.160101.0800)      10.0.17763.168 Windows Shell Common Dll
...
windows_storage    10.0.17763.168 (WinBuild.160101.0800)      10.0.17763.168 Microsoft WinRT Storage API
CoreMessaging      10.0.17763.194                             10.0.17763.194 Microsoft CoreMessaging Dll
gdi32full          10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 GDI Client DLL
ntdll              10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 NT Layer DLL
RMCLIENT           10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 Resource Manager Client
RPCRT4             10.0.17763.194 (WinBuild.160101.0800)      10.0.17763.194 Remote Procedure Call Runtime
combase            10.0.17763.253 (WinBuild.160101.0800)      10.0.17763.253 Microsoft COM for Windows
COMCTL32           6.10 (WinBuild.160101.0800)                10.0.17763.253 User Experience Controls Library
urlmon             11.00.17763.168 (WinBuild.160101.0800)     11.0.17763.168 OLE32 Extensions for Win32
iertutil           11.00.17763.253 (WinBuild.160101.0800)     11.0.17763.253 Run time utility for Internet Explorer

Regex pattern breakdown

Here is a breakdown of the more complicated patterns, using the ignore pattern whitespace modifier x:

([0-9a-f`]{17})s([0-9a-f`]{17})s+(S+)s+(([^)]+))s*(.+)?

# example input: 00007ffe`9ccb0000 00007ffe`9ce9d000   ntdll      (pdb symbols)          c:symbolsntdll.pdbB8AD79538F2730FD9BACE36C9F9316A01ntdll.pdb

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
([0-9a-f`]{17})     # capture expression like 00007ff6`e9da0000 - any hex number or backtick, and exactly 17 of them
s                  # a space
([0-9a-f`]{17})     # capture expression like 00007ff6`e9da0000 - any hex number or backtick, and exactly 17 of them
s+                 # skip any number of spaces
(S+)               # capture until we get a space - this would match the 'ntdll' part
s+                 # skip one or more spaces
(                  # start parenthesis
    ([^)])         # capture anything but end parenthesis
)                  # end parenthesis
s*                 # skip zero or more spaces
(.+)?               # optionally capture any symbol file path

Breakdown of the name-value pattern:

^s+(?<name>[^:]+):s+(?<value>.+)

# example input:  File flags:       0 (Mask 3F)

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
s+                 # require one or more spaces
(?<name>[^:]+)      # capture anything that is not a `:` into the named group "name"
:                   # require a comma
s+                 # require one or more spaces
(?<value>.+)        # capture everything until the end into the name group "value"

Breakdown of the timestamp pattern:

^s{4}Timestamp:s+[^(]+((?<timestamp>.{8}))

#example input:     Timestamp:        Mon Jan  7 23:42:30 2019 (5C33D5D6)

(?x)                # ignore pattern whitespace
^                   # the beginning of the line
s+                 # require one or more spaces
Timestamp:          # The literal text 'Timestamp:'
s+                 # require one or more spaces
[^(]+              # one or more of anything but a open parenthesis
(                  # a literal '('
(?<timestamp>.{8})  # 8 characters of anything, captured into the group 'timestamp'
)                  # a literal ')'

Gotchas – the Regex Cache

Something that can happen if you are writing a more complicated parser is the following:
The parser works well. You have 15 regular expressions in your switch statement and then you get some input you haven’t seen before, so you add a 16th regex.
All of a sudden, the performance of your parser tanks. WTF?

The .net regex implementation has a cache of recently used regexs. You can check the size of it like this:

PS> [regex]::CacheSize
15

# bump it
[regex]::CacheSize = 20

And now your parser is fast(er) again.

Bonus tip

I frequently use PowerShell to write (generate) my code:

Get-ExecutableModuleRawData pwsh |
    Select-String '^s+([^:]+):' |       # this pattern matches the module detail fields
    Foreach-Object {$_.matches.groups[1].value} |
    Select-Object -Unique |
    Foreach-Object -Begin   { "class ExecutableModuleData {" }`
                   -Process { "    [string] $" + ($_ -replace "s.", {[char]::ToUpperInvariant($_.Groups[0].Value[1])}) }`
                   -End     { "}" }

The output is

class ExecutableModuleData {
    [string] $LoadedSymbolImageFile
    [string] $ImagePath
    [string] $ImageName
    [string] $Timestamp
    [string] $CheckSum
    [string] $ImageSize
    [string] $FileVersion
    [string] $ProductVersion
    [string] $FileFlags
    [string] $FileOS
    [string] $FileType
    [string] $FileDate
    [string] $Translations
    [string] $CompanyName
    [string] $ProductName
    [string] $InternalName
    [string] $OriginalFilename
    [string] $ProductVersion
    [string] $FileVersion
    [string] $FileDescription
    [string] $LegalCopyright
    [string] $Comments
    [string] $LegalTrademarks
    [string] $PrivateBuild
}

It is not complete – I don’t have the fields from the record start, some types are incorrect and when run against some other executables a few other fields may appear.
But it is a very good starting point. And way more fun than typing it 🙂

Note that this example is using a new feature of the -replace operator – to use a ScriptBlock to determine what to replace with – that was added in PowerShell Core 6.1.

Bonus tip #2

A regular expression construct that I often find useful is non-greedy matching.
The example below shows the effect of the ? modifier, that can be used after * (zero or more) and + (one or more)

# greedy matching - match to the last occurrence of the following character (>)
if("<Tag>Text</Tag>" -match '<(.+)>') { $matches }
Name                           Value
----                           -----
1                              Tag>Text</Tag
0                              <Tag>Text</Tag>
# non-greedy matching - match to the first occurrence of the the following character (>)
if("<Tag>Text</Tag>" -match '<(.+?)>') { $matches }
Name                           Value
----                           -----
1                              Tag
0                              <Tag>

See Regex Repeat for more info on how to control pattern repetition.

Summary

In this post, we have looked at how the structure of a switch-based parser could look, and how it can be written so that it works as a part of the pipeline.
We have also looked at a few slightly more complicated regular expressions in some detail.

As we have seen, PowerShell has a plethora of options for parsing text, and most of them revolve around regular expressions.
My personal experience has been that the time I’ve invested in understanding the regex language was well invested.

Hopefully, this gives you a good start with the parsing tasks you have at hand.

Thanks to Jason Shirk, Mathias Jessen and Steve Lee for reviews and feedback.

Staffan Gustafsson, @StaffanGson, github

Staffan works at DICE in Stockholm, Sweden, as a Software Engineer and has been using PowerShell since the first public beta.
He was most seriously pleased when PowerShell was open sourced, and has since contributed bug fixes, new features and performance improvements.
Staffan is a speaker at PSConfEU and is always happy to talk PowerShell.

Parsing Text with PowerShell (2/3)

This post was originally published on this site

This is the second post in a three-part series.

  • Part 1:
    • Useful methods on the String class
    • Introduction to Regular Expressions
    • The Select-String cmdlet
  • Part 2:
    • the -split operator
    • the -match operator
    • the switch statement
    • the Regex class
  • Part 3:
    • a real world, complete and slightly bigger, example of a switch-based parser

The -split operator

The -split operator splits one or more strings into substrings.

A common pattern is a name-value pattern:
Note the usage of the Max-substrings parameter to the -split operator.
We want to ensure that is doesn’t matter if the value contains the character to split on.

$text = "Description=The '=' character is used for assigning values to a variable"
$name, $value = $text -split "=", 2

@"
Name  =  $name
Value =  $value
"@
Name  =  Description
Value =  The '=' character is used for assigning values to a variable

When the line to parse contains fields separated by a well known separator, that is never a part of the field values, we can use the -split operator
in combination with multiple assignment to get the fields into variables.

$name, $location, $occupation = "Spider Man,New York,Super Hero" -split ','

If only the location is of interest, the unwanted items can be assigned to $null.

$null, $location, $null = "Spider Man,New York,Super Hero" -split ','

$location
New York

If there are many fields, assigning to null doesn’t scale well. Indexing can be used instead, to get the fields of interest.

$inputText = "x,Staffan,x,x,x,x,x,x,x,x,x,x,Stockholm,x,x,x,x,x,x,x,x,11,x,x,x,x"
$name, $location, $age = ($inputText -split ',')[1,12,21]

$name
$location
$age
Staffan
Stockholm
11

It is almost always a good idea to create an object that gives context to the different parts.

$inputText = "x,Steve,x,x,x,x,x,x,x,x,x,x,Seattle,x,x,x,x,x,x,x,x,22,x,x,x,x"
$name, $location, $age = ($inputText -split ',')[1,12,21]
[PSCustomObject] @{
    Name = $name
    Location = $location
    Age = [int] $age
}
Name  Location Age
----  -------- ---
Steve Seattle   22

Instead of creating a PSCustomObject, we can create a class. It’s a bit more to type, but we can get more help from the engine, for example with tab completion.

The example below also shows an example of type conversion, where the default string to number conversion doesn’t work.
The age field is handled by PowerShell’s built-in type conversion. It is of type [int], and PowerShell will handle the conversion from string to int,
but in some cases we need to help out a bit. The ShoeSize field is also an [int], but the data is hexadecimal,
and without the hex specifier (‘0x’), this conversion fails for some values, and provides incorrect results for the others.

class PowerSheller {
    [string] $Name
    [string] $Location
    [int] $Age
    [int] $ShoeSize
}

$inputText = "x,Staffan,x,x,x,x,x,x,x,x,x,x,Stockholm,x,x,x,x,x,x,x,x,33,x,11d,x,x"
$name, $location, $age, $shoeSize = ($inputText -split ',')[1,12,21,23]
[PowerSheller] @{
    Name = $name
    Location = $location
    Age = $age
    # ShoeSize is expressed in hex, with no '0x' because reasons :)
    # And yes, it's in millimeters.
    ShoeSize = [Convert]::ToInt32($shoeSize, 16)
}
Name    Location  Age ShoeSize
----    --------  --- --------
Staffan Stockholm  33      285

The split operator’s first argument is actually a regex (by default, can be changed with options).
I use this on long command lines in log files (like those given to compilers) where there can be hundreds of options specified. This makes it hard to see if a certain option is specified or not, but when split into their own lines, it becomes trivial.
The pattern below uses a positive lookahead assertion.
It can be very useful to make patterns match only in a given context, like if they are, or are not, preceded or followed by another pattern.

$cmdline = "cl.exe /D Bar=1 /I SomePath /D Foo  /O2 /I SomeOtherPath /Debug a1.cpp a3.cpp a2.cpp"

$cmdline -split "s+(?=[-/])"
cl.exe
/D Bar=1
/I SomePath
/D Foo
/O2
/I SomeOtherPath
/Debug a1.cpp a2.cpp

Breaking down the regex, by rewriting it with the x option:

(?x)      # ignore whitespace in the pattern, and enable comments after '#'
s+       # one or more spaces
(?=[-/])  # only match the previous spaces if they are followed by any of '-' or '/'.

Splitting with a scriptblock

The -split operator also comes in another form, where you can pass it a scriptblock instead of a regular expression.
This allows for more complicated logic, that can be hard or impossible to express as a regular expression.

The scriptblock accepts two parameters, the text to split and the current index. $_ is bound to the character at the current index.

function SplitWhitespaceInMiddleOfText {
    param(
        [string]$Text,
        [int] $Index
    )
    if ($Index -lt 10 -or $Index -gt 40){
        return $false
    }
    $_ -match 's'
}

$inputText = "Some text that only needs splitting in the middle of the text"
$inputText -split $function:SplitWhitespaceInMiddleOfText
Some text that
only
needs
splitting
in
the middle of the text

The $function:SplitWhitespaceInMiddleOfText syntax is a way to get to content (the scriptblock that implements it) of the function, just as $env:UserName gets the content of an item in the env: drive.
It provides a way to document and/or reuse the scriptblock.

The -match operator

The -match operator works in conjunction with the $matches automatic variable.
Each time a -match or a -notmatch succeeds, the $matches variable is populated so that each capture group gets its own entry.
If the capture group is named, the key will be the name of the group, otherwise it will be the index.

As an example:

if ('a b c' -match '(w) (?<named>w) (w)'){
    $matches
}
Name                           Value
----                           -----
named                          b
2                              c
1                              a
0                              a b c

Notice that the indices only increase on groups without names. I.E. the indices of later groups change when a group is named.

Armed with the regex knowledge from the earlier post, we can write the following:

PS> "    10,Some text" -match '^s+(d+),(.+)'
True
PS> $matches
Name                           Value
----                           -----
2                              Some text
1                              10
0                              10,Some text

or with named groups

PS> "    10,Some text" -match '^s+(?<num>d+),(?<text>.+)'
True
PS> $matches
Name                           Value
----                           -----
num                            10
text                           Some text
0                              10,Some text

The important thing here is that the parts of the pattern that we want to extract has parenthesis around them.
That is what creates the capture groups that allow us to reference those parts of the matching text, either by name or by index.

Combining this into a function makes it easy to use:

function ParseMyString($text){
    if ($text -match '^s+(d+),(.+)') {
        [PSCustomObject] @{
            Number = [int] $matches[1]
            Text    = $matches[2]
        }
    }
    else {
        Write-Warning "ParseMyString: Input `$text` doesn't match pattern"
    }
}

ParseMyString "    10,Some text"
Number  Text
------- ----
     10 Some text

Notice the type conversion when assigning the Number property. As long as the number is in range of an integer, this will always succeed, since we have made a successful match in the if statement above. ([long] or [bigint] could be used. In this case I provide the input, and I have promised myself to stick to a range that fits in a 32-bit integer.)
Now we will be able to sort or do numerical operations on the Number property, and it will behave like we want it to – as a number, not as a string.

The switch statement

Now we’re at the big guns 🙂

The switch statement in PowerShell has been given special functionality for parsing text.
It has two flags that are useful for parsing text and files with text in them. -regex and -file.

When specifying -regex, the match clauses that are strings are treated as regular expressions. The switch statement also sets the $matches automatic variable.

When specifying -file, PowerShell treats the input as a file name, to read input from, rather than as a value statement.

Note the use of a ScriptBlock instead of a string as the match clause to determine if we should skip preamble lines.

class ParsedOutput {
    [int] $Number
    [string] $Text

    [string] ToString() { return "{0} ({1})" -f $this.Text, $this.Number }
}

$inputData =
    "Preamble line",
    "LastLineOfPreamble",
    "    10,Some Text",
    "    Some other text,20"

$inPreamble = $true
switch -regex ($inputData) {

    {$inPreamble -and $_ -eq 'LastLineOfPreamble'} { $inPreamble = $false; continue }

    "^s+(?<num>d+),(?<text>.+)" {  # this matches the first line of non-preamble input
        [ParsedOutput] @{
            Number = $matches.num
            Text = $matches.text
        }
        continue
    }

    "^s+(?<text>[^,]+),(?<num>d+)" { # this matches the second line of non-preamble input
        [ParsedOutput] @{
            Number = $matches.num
            Text = $matches.text
        }
        continue
    }
}
Number  Text
------ ----
    10 Some Text
    20 Some other text

The pattern [^,]+ in the text group in the code above is useful. It means match anything that is not a comma ,. We are using the any-of construct [], and within those brackets, ^ changes meaning from the beginning of the line to anything but.

That is useful when we are matching delimited fields. A requirement is that the delimiter cannot be part of the set of allowed field values.

The regex class

regex is a type accelerator for System.Text.RegularExpressions.Regex. It can be useful when porting code from C#, and sometimes when we want to get more control in situations when we have many matches of a capture group. It also allows us to pre-create the regular expressions which can matter in performance sensitive scenarios, and to specify a timeout.

One instance where the regex class is needed is when you have multiple captures of a group.

Consider the following:

Text Pattern
a,b,c, (w,)+

If the match operator is used, $matches will contain

Name                           Value
----                           -----
1                              c,
0                              a,b,c,

The pattern matched three times, for a,, b, and c,. However, only the last match is preserved in the $matches dictionary.
However, the following will allow us to get to all the captures of the group:

[regex]::match('a,b,c,', '(w,)+').Groups[1].Captures
Index Length Value
----- ------ -----
    0      2 a,
    2      2 b,
    4      2 c,

Below is an example that uses the members of the Regex class to parse input data

class ParsedOutput {
    [int] $Number
    [string] $Text

    [string] ToString() { return "{0} ({1})" -f $this.Text, $this.Number }
}

$inputData =
    "    10,Some Text",
    "    Some other text,20"  # this text will not match

[regex] $pattern = "^s+(d+),(.+)"

foreach($d in $inputData){
    $match = $pattern.Match($d)
    if ($match.Success){
        $number, $text = $match.Groups[1,2].Value
        [ParsedOutput] @{
            Number = $number
            Text = $text
        }
    }
    else {
        Write-Warning "regex: '$d' did not match pattern '$pattern'"
    }
}
WARNING: regex: '    Some other text,20' did not match pattern '^s+(d+),(.+)'
Number Text
------ ----
    10 Some Text

It may surprise you that the warning appears before the output. PowerShell has a quite complex formatting system at the end of the pipeline, which treats pipeline output different than other streams. Among other things, it buffers output in the beginning of a pipeline to calculate sensible column widths. This works well in practice, but sometimes gives strange reordering of output on different streams.

Summary

In this post we have looked at how the -split operator can be used to split a string in parts, how the -match operator can be used to extract different patterns from some text, and how the powerful switch statement can be used to match against multiple patterns.

We ended by looking at how the regex class, which in some cases provides a bit more control, but at the expense of ease of use.

This concludes the second part of this post. Next time, we will look at a complete, real world, example of a switch-based parser.

Thanks to Jason Shirk, Mathias Jessen and Steve Lee for reviews and feedback.

Staffan Gustafsson, @StaffanGson, powercode@github

Staffan works at DICE in Stockholm, Sweden, as a Software Engineer and has been using PowerShell since the first public beta.
He was most seriously pleased when PowerShell was open sourced, and has since contributed bug fixes, new features and performance improvements.
Staffan is a speaker at PSConfEU and is always happy to talk PowerShell.

The PowerShell-Docs repositories have been moved

This post was originally published on this site

The PowerShell-Docs repositories have been moved from the PowerShell organization to the MicrosoftDocs organization in GitHub.

The tools we use to build the documentation are designed to work in the MicrosoftDocs org. Moving the repository lets us build the foundation for future improvements in our documentation experience.

Impact of the move

During the move there may be some downtime. The affected repositories will be inaccessible during
the move process. Also, the documentation processes will be paused. After the move, we need to test
access permissions and automation scripts.
 

After these tasks are complete, access and operations will return to normal. GitHub automatically
redirects requests to the old repo URL to the new location.
 

For more information about transferring repositories in GitHub, see About repository transfers

If the transferred repository has any forks, then those forks will remain associated with the
repository after the transfer is complete.
  • All Git information about commits, including contributions, are preserved.
  • All of the issues and pull requests remain intact when transferring a repository.
  • All links to the previous repository location are automatically redirected to the new location.


When you use git clone, git fetch, or git push on a transferred repository, these commands will r
edirect to the new repository location or URL.

However, to avoid confusion, we strongly recommend updating any existing local clones to point to
the new repository URL.
For more information, see Changing a remote’s URL.

The following example shows how to change the “upstream” remote to point to the new location:

[Wed 06:08PM] [staging =]
PS C:GitPS-DocsPowerShell-Docs> git remote -v
origin  https://github.com/sdwheeler/PowerShell-Docs.git (fetch)
origin  https://github.com/sdwheeler/PowerShell-Docs.git (push)
upstream        https://github.com/PowerShell/PowerShell-Docs.git (fetch)
upstream        https://github.com/PowerShell/PowerShell-Docs.git (push)

[Wed 06:09PM] [staging =]
PS C:GitPS-DocsPowerShell-Docs> git remote set-url upstream https://github.com/MicrosoftDocs/PowerShell-Docs.git

[Wed 06:10PM] [staging =]
PS C:GitPS-DocsPowerShell-Docs> git remote -v
origin  https://github.com/sdwheeler/PowerShell-Docs.git (fetch)
origin  https://github.com/sdwheeler/PowerShell-Docs.git (push)
upstream        https://github.com/MicrosoftDocs/PowerShell-Docs.git (fetch)
upstream        https://github.com/MicrosoftDocs/PowerShell-Docs.git (push)


Which repositories were moved?

 

The following repositories were transferred:

  • PowerShell/PowerShell-Docs
  • PowerShell/powerShell-Docs.cs-cz
  • PowerShell/powerShell-Docs.de-de
  • PowerShell/powerShell-Docs.es-es
  • PowerShell/powerShell-Docs.fr-fr
  • PowerShell/powerShell-Docs.hu-hu
  • PowerShell/powerShell-Docs.it-it
  • PowerShell/powerShell-Docs.ja-jp
  • PowerShell/powerShell-Docs.ko-kr
  • PowerShell/powerShell-Docs.nl-nl
  • PowerShell/powerShell-Docs.pl-pl
  • PowerShell/powerShell-Docs.pt-br
  • PowerShell/powerShell-Docs.pt-pt
  • PowerShell/powerShell-Docs.ru-ru
  • PowerShell/powerShell-Docs.sv-se
  • PowerShell/powerShell-Docs.tr-tr
  • PowerShell/powerShell-Docs.zh-cn
  • PowerShell/powerShell-Docs.zh-tw

Call to action

If you have a fork that you cloned, change your remote configuration to point to the new upstream URL.
Help us make the documentation better.
  • Submit issues when you find a problem in the docs.
  • Suggest fixes to documentation by submitting changes through the PR process.
Sean Wheeler
Senior Content Developer for PowerShell
https://github.com/sdwheeler

Announcing the PowerShell Preview Extension in VSCode

This post was originally published on this site

Preview builds of the PowerShell extension are now available in VSCode

We are excited to announce the PowerShell Preview extension in the VSCode marketplace!
The PowerShell Preview extension allows users on Windows PowerShell 5.1, Powershell 6.0, and all newer versions to get and test the latest updates to the PowerShell extension and comes with some exciting features. The PowerShell Preview extension is a substitute for the PowerShell extension so both the PowerShell extension and the PowerShell Preview
extension should not be enabled at the same time.

Features of the PowerShell Preview extension

The PowerShell Preview extension is built on .NET Standard thereby enabling simplification of our code and dependency structure.

The PowerShell Preview extension also contains PSReadLine support in the integrated console for Windows behind a
feature flag. PSReadLine provides a consistent and rich interactive experience, including syntax coloring and
multi-line editing and history, in the PowerShell console, in Cloud Shell, and now in VSCode terminal.
For more information on the benefits of PSReadLine, check out their documentation.

To enable PSReadLine support in the Preview version on Windows, please add the following to your user settings:

"powershell.developer.featureFlags": [ "PSReadLine" ]

HUGE thanks to @SeeminglyScience for all his amazing work getting PSReadLine working in PowerShell Editor Services!

Why we created the PowerShell Preview extension

By having a preview channel, which supports Windows Powershell 5.1 and PowerShell Core 6, in addition to our existing stable channel, we can get new features out faster. PSReadLine support for the VSCode integrated console is a great
example of a feature that the preview build makes possible. Having a preview channel also allows us to get more feedback
on new features and to iterate on changes before they arrive in our stable channel.

How to Get/Use the PowerShell Preview extension

If you dont already have VSCode, start here.

Once you have VSCode open, click Clt+Shift+X to open the extensions marketplace.
Next, type PowerShell Preview in the search bar.
Click Install on the PowerShell Preview page.
Finally, click Reload in order to refresh VSCode.

If you already have the PowerShell extension please disable it to use the Powershell Preview extension.
To disable the PowerShell extension find it in the Extensions sidebar view, specifically under the list of Enabled extensions, Right-click on the PowerShell extension and select Disable. Please note that it is important to only have either the
PowerShell extension or the PowerShell Preview extension endabled at one time.
How to Disable

Breaking Changes

As stated above, this version of the PowerShell extension only works with Windows PowerShell versions 5.1 and
PowerShell Core 6.

Reporting Feedback

An important benefit of a preview extension is getting feedback from users.
To report issues with the extension use our GitHub repo.
When reporting issues be sure to specify the version of the extension you are using.

Sydney Smith
Program Manager
PowerShell Team

Parsing Text with PowerShell (1/3)

This post was originally published on this site

This is the first post in a three part series.

  • Part 1:
    • Useful methods on the String class
    • Introduction to Regular Expressions
    • The Select-String cmdlet
  • Part 2:
    • The -split operator
    • The -match operator
    • The switch statement
    • The Regex class
  • Part 3:
    • A real world, complete and slightly bigger, example of a switch-based parser

A task that appears regularly in my workflow is text parsing. It may be about getting a token from a single line of text or about turning the text output of native tools into structured objects so I can leverage the power of PowerShell.

I always strive to create structure as early as I can in the pipeline, so that later on I can reason about the content as properties on objects instead of as text at some offset in a string. This also helps with sorting, since the properties can have their correct type, so that numbers, dates etc. are sorted as such and not as text.

There are a number of options available to a PowerShell user, and I’m giving an overview here of the most common ones.

This is not a text about how to create a high performance parser for a language with a structured EBNF grammar. There are better tools available for that, for example ANTLR.

.Net methods on the string class

Any treatment of string parsing in PowerShell would be incomplete if it didn’t mention the methods on the string class.
There are a few methods that I’m using more often than others when parsing strings:

Name Description
Substring(int startIndex) Retrieves a substring from this instance. The substring starts at a specified character position and continues to the end of the string.
Substring(int startIndex, int length) Retrieves a substring from this instance. The substring starts at a specified character position and has a specified length.
IndexOf(string value) Reports the zero-based index of the first occurrence of the specified string in this instance.
IndexOf(string value, int startIndex) Reports the zero-based index of the first occurrence of the specified string in this instance. The search starts at a specified character position.
LastIndexOf(string value) Reports the zero-based index of the first occurrence of the specified string in this instance. Often used together with Substring.
LastIndexOf(string value, int startIndex) Reports the zero-based index position of the last occurrence of a specified string within this instance. The search starts at a specified character position and proceeds backward toward the beginning of the string.

This is a minor subset of the available functions. It may be well worth your time to read up on the string class since it is so fundamental in PowerShell.
Docs are found here.

As an example, this can be useful when we have very large input data of comma-separated input with 15 columns and we are only interested in the third column from the end. If we were to use the -split ',' operator, we would create 15 new strings and an array for each line. On the other hand, using LastIndexOf on the input string a few times and then SubString to get the value of interest is faster and results in just one new string.

function parseThirdFromEnd([string]$line){
    $i = $line.LastIndexOf(",")             # get the last separator
    $i = $line.LastIndexOf(",", $i - 1)     # get the second to last separator, also the end of the column we are interested in
    $j = $line.LastIndexOf(",", $i - 1)     # get the separator before the column we want
    $j++                                    # more forward past the separator
    $line.SubString($j,$i-$j)               # get the text of the column we are looking for
}

In this sample, I ignore that the IndexOf and LastIndexOf returns -1 if they cannot find the text to search for. From experience, I also know that it is easy to mess up the index arithmetics.
So while using these methods can improve performance, it is also more error prone and a lot more to type. I would only resort to this when I know the input data is very large and performance is an issue. So this is not a recommendation, or a starting point, but something to resort to.

On rare occasions, I write the whole parser in C#. An example of this is in a module wrapping the Perforce version control system, where the command line tool can output python dictionaries. It is a binary format, and the use case was complicated enough that I was more comfortable with a compiler checked implementation language.

Regular Expressions

Almost all of the parsing options in PowerShell make use of regular expressions, so I will start with a short intro of some regular expression concepts that are used later in these posts.

Regular expressions are very useful to know when writing simple parsers since they allow us to express patterns of interest and to capture text that matches those patterns.

It is a very rich language, but you can get quite a long way by learning a few key parts. I’ve found regular-expressions.info to be a good online resource for more information. It is not written directly for the .net regex implementation, but most of the information is valid across the different implementations.

Regex Description
* Zero or more of the preceding character. a* matches the empty string, a, aa, etc, but not b.
+ One or more of the preceding character. a+ matches a, aa, etc, but not the empty string or b.
. Matches any character
[ax1] Any of a,x,1
a-d matches any of a,b,c,d
w The w meta character is used to find a word character. A word character is a character from a-z, A-Z, 0-9, including the _ (underscore) character. It also matches variants of the characters such as ??? and ???.
W The inversion of w. Matches any non-word character
s The s meta character is used to find white space
S The inversion of s. Matches any non-whitespace character
d Matches digits
D The inversion of d. Matches non-digits
b Matches a word boundary, that is, the position between a word and a space.
B The inversion of b. . erB matches the er in verb but not the er in never.
^ The beginning of a line
$ The end of a line
(<expr>) Capture groups

Combining these, we can create a pattern like below to match a text like:

Text Pattern
" 42,Answer" ^s+d+,.+

The above pattern can be written like this using the x (ignore pattern whitespace) modifier.

Starting the regex with (?x) ignores whitespace in the pattern (it has to be specified explicitly, with s) and also enables the comment character #.

(?x)  # this regex ignores whitespace in the pattern. Makes it possible do document a regex with comments.
^     # the start of the line
s+   # one or more whitespace character
d+   # one or more digits
,     # a comma
.+    # one or more characters of any kind

By using capture groups, we make it possible to refer back to specific parts of a matched expression.

Text Pattern
" 42,Answer" ^s+(d+),(.+)
(?x)  # this regex ignores whitespace in the pattern. Makes it possible to document a regex with comments.
^     # the start of the line
s+   # one or more whitespace character
(d+) # capture one or more digits in the first group (index 1)
,     # a comma
(.+)  # capture one or more characters of any kind in the second group (index 2)

Naming regular expression groups

There is a construct called named capturing groups, (?<group_name>pattern), that will create a capture group with a designated name.

The regex above can be rewritten like this, which allows us to refer to the capture groups by name instead of by index.

^s+(?<num>d+),(?<text>.+)

Different languages have implementation specific solutions to accessing the values of the captured groups. We will see later on in this series how it is done in PowerShell.

The Select-String cmdlet

The Select-String command is a work horse, and is very powerful when you understand the output it produces.
I use it mainly when searching for text in files, but occasionally also when looking for something in command output and similar.

The key to being efficient with Select-String is to know how to get to the matched patterns in the output. In its internals, it uses the same regex class as the -match and -split operator, but instead of populating a global variable with the resulting groups, as -match does, it writes an object to the pipeline, with a Matches property that contains the results of the match.

Set-Content twitterData.txt -value @"
Lee, Steve-@Steve_MSFT,2992
Lee Holmes-13000 @Lee_Holmes
Staffan Gustafsson-463 @StaffanGson
Tribbiani, Joey-@Matt_LeBlanc,463400
"@

# extracting captured groups
Get-ChildItem twitterData.txt |
    Select-String -Pattern "^(w+) ([^-]+)-(d+) (@w+)" |
    Foreach-Object {
        $first, $last, $followers, $handle = $_.Matches[0].Groups[1..4].Value   # this is a common way of getting the groups of a call to select-string
        [PSCustomObject] @{
            FirstName = $first
            LastName = $last
            Handle = $handle
            TwitterFollowers = [int] $followers
        }
    }
FirstName LastName   Handle       TwitterFollowers
--------- --------   ------       ----------------
Lee       Holmes     @Lee_Holmes             13000
Staffan   Gustafsson @StaffanGson              463

Support for Multiple Patterns

As we can see above, only half of the data matched the pattern to Select-String.

A technique that I find useful is to take advantage of the fact that Select-String supports the use of multiple patterns.

The lines of input data in twitterData.txt contain the same type of information, but they’re formatted in slightly different ways.
Using multiple patterns in combination with named capture groups makes it a breeze to extract the groups even when the positions of the groups differ.

$firstLastPattern = "^(?<first>w+) (?<last>[^-]+)-(?<followers>d+) (?<handle>@.+)"
$lastFirstPattern = "^(?<last>[^s,]+),s+(?<first>[^-]+)-(?<handle>@[^,]+),(?<followers>d+)"
Get-ChildItem twitterData.txt |
     Select-String -Pattern $firstLastPattern, $lastFirstPattern |
    Foreach-Object {
        # here we access the groups by name instead of by index
        $first, $last, $followers, $handle = $_.Matches[0].Groups['first', 'last', 'followers', 'handle'].Value
        [PSCustomObject] @{
            FirstName = $first
            LastName = $last
            Handle = $handle
            TwitterFollowers = [int] $followers
        }
    }
FirstName LastName   Handle        TwitterFollowers
--------- --------   ------        ----------------
Steve     Lee        @Steve_MSFT               2992
Lee       Holmes     @Lee_Holmes              13000
Staffan   Gustafsson @StaffanGson               463
Joey      Tribbiani  @Matt_LeBlanc           463400

Breaking down the $firstLastPattern gives us

(?x)                # this regex ignores whitespace in the pattern. Makes it possible do document a regex with comments.
^                   # the start of the line
(?<first>w+)       # capture one or more of any word characters into a group named 'first'
s                  # a space
(?<last>[^-]+)      # capture one of more of any characters but `-` into a group named 'last'
-                   # a '-'
(?<followers>d+)   # capture 1 or more digits into a group named 'followers'
s                  # a space
(?<handle>@.+)      # capture a '@' followed by one or more characters into a group named 'handle'

The second regex is similar, but with the groups in different order. But since we retrieve the groups by name, we don’t have to care about the positions of the capture groups, and multiple assignment works fine.

Context around Matches

Select-String also has a Context parameter which accepts an array of one or two numbers specifying the number of lines before and after a match that should be captured. All text parsing techniques in this post can be used to parse information from the context lines.
The result object has a Context property, that returns an object with PreContext and PostContext properties, both of the type string[].

This can be used to get the second line before a match:

# using the context property
Get-ChildItem twitterData.txt |
    Select-String -Pattern "Staffan" -Context 2,1 |
    Foreach-Object { $_.Context.PreContext[1], $_.Context.PostContext[0] }
Lee Holmes-13000 @Lee_Holmes
Tribbiani, Joey-@Matt_LeBlanc,463400

To understand the indexing of the Pre- and PostContext arrays, consider the following:

Lee, Steve-@Steve_MSFT,2992                  <- PreContext[0]
Lee Holmes-13000 @Lee_Holmes                 <- PreContext[1]
Staffan Gustafsson-463 @StaffanGson          <- Pattern matched this line
Tribbiani, Joey-@Matt_LeBlanc,463400         <- PostContext[0]

The pipeline support of Select-String makes it different from the other parsing tools available in PowerShell, and makes it the undisputed king of one-liners.

I would like stress how much more useful Select-String becomes once you understand how to get to the parts of the matches.

Summary

We have looked at useful methods of the string class, especially how to use Substring to get to text at a specific offset. We also looked at regular expression, a language used to describe patterns in text, and on the Select-String cmdlet, which makes heavy use of regular expression.

Next time, we will look at the operators -split and -match, the switch statement (which is surprisingly useful for text parsing), and the regex class.

Staffan Gustafsson, @StaffanGson, github

Thanks to Jason Shirk, Mathias Jessen and Steve Lee for reviews and feedback.

Windows Security change affecting PowerShell

This post was originally published on this site

Windows Security change affecting PowerShell

January 9, 2019

The recent (1/8/2019) Windows security patch CVE-2019-0543, has introduced a breaking change for a PowerShell remoting scenario. It is a narrowly scoped scenario that should have low impact for most users.

The breaking change only affects local loopback remoting, which is a PowerShell remote connection made back to the same machine, while using non-Administrator credentials.

PowerShell remoting endpoints do not allow access to non-Administrator accounts by default. However, it is possible to modify endpoint configurations, or create new custom endpoint configurations, that do allow non-Administrator account access. So you would not be affected by this change, unless you explicitly set up loopback endpoints on your machine to allow non-Administrator account access.

Example of broken loopback scenario

# Create endpoint that allows Users group access
PS > Register-PSSessionConfiguration -Name MyNonAdmin -SecurityDescriptorSddl 'O:NSG:BAD:P(A;;GA;;;BA)(A;;GA;;;BU)S:P(AU;FA;GA;;;WD)(AU;SA;GXGW;;;WD)' -Force

# Create non-Admin credential
PS > $nonAdminCred = Get-Credential ~NonAdminUser

# Create a loopback remote session to custom endpoint using non-Admin credential
PS > $session = New-PSSession -ComputerName localhost -ConfigurationName MyNonAdmin -Credential $nonAdminCred

New-PSSession : [localhost] Connecting to remote server localhost failed with the following error message : The WSMan
service could not launch a host process to process the given request.  Make sure the WSMan provider host server and
proxy are properly registered. For more information, see the about_Remote_Troubleshooting Help topic.
At line:1 char:1
+ New-PSSession -ComputerName localhost -ConfigurationName MyNonAdmin - ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : OpenError: (System.Manageme....RemoteRunspace:RemoteRunspace) [New-PSSession], PSRemotin
   gTransportException
    + FullyQualifiedErrorId : -2146959355,PSSessionOpenFailed

The above example fails only when using non-Administrator credentials, and the connection is made back to the same machine (localhost). Administrator credentials still work. And the above scenario will work when remoting off-box to another machine.

Example of working loopback scenario

# Create Admin credential
PS > $adminCred = Get-Credential ~AdminUser

# Create a loopback remote session to custom endpoint using Admin credential
PS > $session = New-PSSession -ComputerName localhost -ConfigurationName MyNonAdmin -Credential $adminCred
PS > $session

 Id Name            ComputerName    ComputerType    State         ConfigurationName     Availability
 -- ----            ------------    ------------    -----         -----------------     ------------
  1 WinRM1          localhost       RemoteMachine   Opened        MyNonAdmin               Available

The above example uses Administrator credentials to the same MyNonAdmin custom endpoint, and the connection is made back to the same machine (localhost). The session is created successfully using Administrator credentials.

The breaking change is not in PowerShell but in a system security fix that restricts process creation between Windows sessions. This fix is preventing WinRM (which PowerShell uses as a remoting transport and host) from successfully creating the remote session host, for this particular scenario. There are no plans to update WinRM.

This affects Windows PowerShell and PowerShell Core 6 (PSCore6) WinRM based remoting.

This does not affect SSH remoting with PSCore6.

This does not affect JEA (Just Enough Administration) sessions.

A workaround for a loopback connection is to always use Administrator credentials.

Another option is to use PSCore6 with SSH remoting.

Paul Higinbotham
Senior Software Engineer
PowerShell Team

DSC Resource Kit Release January 2019

This post was originally published on this site

We just released the DSC Resource Kit!

This release includes updates to 14 DSC resource modules. In the past 6 weeks, 41 pull requests have been merged and 54 issues have been closed, all thanks to our amazing community!

The modules updated in this release are:

  • ActiveDirectoryCSDsc
  • AuditPolicyDsc
  • CertificateDsc
  • ComputerManagementDsc
  • NetworkingDsc
  • SecurityPolicyDsc
  • SqlServerDsc
  • StorageDsc
  • xActiveDirectory
  • xBitlocker
  • xExchange
  • xFailOverCluster
  • xHyper-V
  • xWebAdministration

Several of these modules were released to remove the hidden files/folders from this issue. This issue should now be fixed for all modules except DFSDsc which is waiting for some fixes to its tests.

For a detailed list of the resource modules and fixes in this release, see the Included in this Release section below.

Our latest community call for the DSC Resource Kit was today, January 9. A recording is available on YouTube here. Join us for the next call at 12PM (Pacific time) on February 13 to ask questions and give feedback about your experience with the DSC Resource Kit.

The next DSC Resource Kit release will be on Wednesday, February 20.

We strongly encourage you to update to the newest version of all modules using the PowerShell Gallery, and don’t forget to give us your feedback in the comments below, on GitHub, or on Twitter (@PowerShell_Team)!

Please see our documentation here for information on the support of these resource modules.

Included in this Release

You can see a detailed summary of all changes included in this release in the table below. For past release notes, go to the README.md or CHANGELOG.md file on the GitHub repository page for a specific module (see the How to Find DSC Resource Modules on GitHub section below for details on finding the GitHub page for a specific module).

Module Name Version Release Notes
ActiveDirectoryCSDsc 3.1.0.0
  • Updated LICENSE file to match the Microsoft Open Source Team standard.
  • Added .VSCode settings for applying DSC PSSA rules – fixes Issue 60.
  • Added fix for two tier PKI deployment fails on initial deployment, not error – fixes Issue 57.
AuditPolicyDsc 1.4.0.0
  • Explicitly removed extra hidden files from release package
CertificateDsc 4.3.0.0
  • Updated certificate import to only use Import-CertificateEx – fixes Issue 161
  • Update LICENSE file to match the Microsoft Open Source Team standard -fixes Issue 164.
  • Opted into Common Tests – fixes Issue 168:
    • Required Script Analyzer Rules
    • Flagged Script Analyzer Rules
    • New Error-Level Script Analyzer Rules
    • Custom Script Analyzer Rules
    • Validate Example Files To Be Published
    • Validate Markdown Links
    • Relative Path Length
  • CertificateExport:
    • Fixed bug causing PFX export with matchsource enabled to fail – fixes Issue 117
ComputerManagementDsc 6.1.0.0
  • Updated LICENSE file to match the Microsoft Open Source Team standard. Fixes Issue 197.
  • Explicitly removed extra hidden files from release package
NetworkingDsc 6.3.0.0
  • MSFT_IPAddress:
    • Updated to allow retaining existing addresses in order to support cluster configurations as well
SecurityPolicyDsc 2.7.0.0
  • Bug fix – Issue 83 – Network_access_Remotely_accessible_registry_paths_and_subpaths correctly applies multiple paths
  • Update LICENSE file to match the Microsoft Open Source Team standard
SqlServerDsc 12.2.0.0
  • Changes to SqlServerDsc
    • During testing in AppVeyor the Build Worker is restarted in the install step to make sure the are no residual changes left from a previous SQL Server install on the Build Worker done by the AppVeyor Team (issue 1260).
    • Code cleanup: Change parameter names of Connect-SQL to align with resources.
    • Updated README.md in the Examples folder.
      • Added a link to the new xADObjectPermissionEntry examples in ActiveDirectory, fixed a broken link and a typo. Adam Rush (@adamrushuk)
    • Change to SqlServerLogin so it doesn”t check properties for absent logins.
StorageDsc 4.4.0.0
  • Refactored module folder structure to move resource to root folder of repository and remove test harness – fixes Issue 169.
  • Updated Examples to support deployment to PowerShell Gallery scripts.
  • Removed limitation on using Pester 4.0.8 during AppVeyor CI.
  • Moved the Code of Conduct text out of the README.md and into a CODE_OF_CONDUCT.md file.
  • Explicitly removed extra hidden files from release package
xActiveDirectory 2.23.0.0
  • Explicitly removed extra hidden files from release package
xBitlocker 1.4.0.0
  • Change double quoted string literals to single quotes
  • Add spaces between array members
  • Add spaces between variable types and variable names
  • Add spaces between comment hashtag and comments
  • Explicitly removed extra hidden files from release package
xExchange 1.26.0.0
  • Add support for Exchange Server 2019
  • Added additional parameters to the MSFT_xExchUMService resource
  • Rename improperly named functions, and add comment based help in MSFT_xExchClientAccessServer, MSFT_xExchDatabaseAvailabilityGroupNetwork, MSFT_xExchEcpVirtualDirectory, MSFT_xExchExchangeCertificate, MSFT_xExchImapSettings.
  • Added additional parameters to the MSFT_xExchUMCallRouterSettings resource
  • Rename improper function names in MSFT_xExchDatabaseAvailabilityGroup, MSFT_xExchJetstress, MSFT_xExchJetstressCleanup, MSFT_xExchMailboxDatabase, MSFT_xExchMailboxDatabaseCopy, MSFT_xExchMailboxServer, MSFT_xExchMaintenanceMode, MSFT_xExchMapiVirtualDirectory, MSFT_xExchOabVirtualDirectory, MSFT_xExchOutlookAnywhere, MSFT_xExchOwaVirtualDirectory, MSFT_xExchPopSettings, MSFT_xExchPowershellVirtualDirectory, MSFT_xExchReceiveConnector, MSFT_xExchWaitForMailboxDatabase, and MSFT_xExchWebServicesVirtualDirectory.
  • Add remaining unit and integration tests for MSFT_xExchExchangeServer.
xFailOverCluster 1.12.0.0
  • Explicitly removed extra hidden files from release package
xHyper-V 3.15.0.0
  • Explicitly removed extra hidden files from release package
xWebAdministration 2.4.0.0
  • Explicitly removed extra hidden files from release package

How to Find Released DSC Resource Modules

To see a list of all released DSC Resource Kit modules, go to the PowerShell Gallery and display all modules tagged as DSCResourceKit. You can also enter a module’s name in the search box in the upper right corner of the PowerShell Gallery to find a specific module.

Of course, you can also always use PowerShellGet (available starting in WMF 5.0) to find modules with DSC Resources:

# To list all modules that tagged as DSCResourceKit
Find-Module -Tag DSCResourceKit 
# To list all DSC resources from all sources 
Find-DscResource

Please note only those modules released by the PowerShell Team are currently considered part of the ‘DSC Resource Kit’ regardless of the presence of the ‘DSC Resource Kit’ tag in the PowerShell Gallery.

To find a specific module, go directly to its URL on the PowerShell Gallery:
http://www.powershellgallery.com/packages/< module name >
For example:
http://www.powershellgallery.com/packages/xWebAdministration

How to Install DSC Resource Modules From the PowerShell Gallery

We recommend that you use PowerShellGet to install DSC resource modules:

Install-Module -Name < module name >

For example:

Install-Module -Name xWebAdministration

To update all previously installed modules at once, open an elevated PowerShell prompt and use this command:

Update-Module

After installing modules, you can discover all DSC resources available to your local system with this command:

Get-DscResource

How to Find DSC Resource Modules on GitHub

All resource modules in the DSC Resource Kit are available open-source on GitHub.
You can see the most recent state of a resource module by visiting its GitHub page at:
https://github.com/PowerShell/< module name >
For example, for the CertificateDsc module, go to:
https://github.com/PowerShell/CertificateDsc.

All DSC modules are also listed as submodules of the DscResources repository in the DscResources folder and the xDscResources folder.

How to Contribute

You are more than welcome to contribute to the development of the DSC Resource Kit! There are several different ways you can help. You can create new DSC resources or modules, add test automation, improve documentation, fix existing issues, or open new ones.
See our contributing guide for more info on how to become a DSC Resource Kit contributor.

If you would like to help, please take a look at the list of open issues for the DscResources repository.
You can also check issues for specific resource modules by going to:
https://github.com/PowerShell/< module name >/issues
For example:
https://github.com/PowerShell/xPSDesiredStateConfiguration/issues

Your help in developing the DSC Resource Kit is invaluable to us!

Questions, comments?

If you’re looking into using PowerShell DSC, have questions or issues with a current resource, or would like a new resource, let us know in the comments below, on Twitter (@PowerShell_Team), or by creating an issue on GitHub.

Katie Kragenbrink
Software Engineer
PowerShell DSC Team
@katiedsc (Twitter)
@kwirkykat (GitHub)

The PowerShell-Docs repo is moving

This post was originally published on this site

On January 16, 2019 at 5:00PM PDT, the PowerShell-Docs repositories are moving from the PowerShell
organization to the MicrosoftDocs organization in GitHub.

The tools we use to build the documentation are designed to work in the MicrosoftDocs org. Moving
the repository lets us build the foundation for future improvements in our documentation experience.

Impact of the move

During the move there may be some downtime. The affected repositories will be inaccessible during
the move process. Also, the documentation processes will be paused. After the move, we need to test
access permissions and automation scripts.

After these tasks are complete, access and operations will return to normal. GitHub automatically
redirects requests to the old repo URL to the new location.

For more information about transferring repositories in GitHub,
see About repository transfers.

  • If the transferred repository has any forks, then those forks will remain associated with the
    repository after the transfer is complete.
  • All Git information about commits, including contributions, are preserved.
  • All of the issues and pull requests remain intact when transferring a repository.
  • All links to the previous repository location are automatically redirected to the new location.

When you use git clone, git fetch, or git push on a transferred repository, these commands will
redirect to the new repository location or URL.

However, to avoid confusion, we strongly recommend updating any existing local clones to point to
the new repository URL. You can do this by using git remote on the command line:

git remote set-url origin new_url

For more information, see Changing a remote’s URL.

Which repositories are being moved?

The following repositories are being transferred:

  • PowerShell/PowerShell-Docs
  • PowerShell/powerShell-Docs.cs-cz
  • PowerShell/powerShell-Docs.de-de
  • PowerShell/powerShell-Docs.es-es
  • PowerShell/powerShell-Docs.fr-fr
  • PowerShell/powerShell-Docs.hu-hu
  • PowerShell/powerShell-Docs.it-it
  • PowerShell/powerShell-Docs.ja-jp
  • PowerShell/powerShell-Docs.ko-kr
  • PowerShell/powerShell-Docs.nl-nl
  • PowerShell/powerShell-Docs.pl-pl
  • PowerShell/powerShell-Docs.pt-br
  • PowerShell/powerShell-Docs.pt-pt
  • PowerShell/powerShell-Docs.ru-ru
  • PowerShell/powerShell-Docs.sv-se
  • PowerShell/powerShell-Docs.tr-tr
  • PowerShell/powerShell-Docs.zh-cn
  • PowerShell/powerShell-Docs.zh-tw

Call to action

If you have a fork that you cloned, change your remote configuration to point to the new upstream URL.

Help us make the documentation better.

  • Submit issues when you find a problem in the docs.
  • Suggest fixes to documentation by submitting changes through the PR process.

 

Sean Wheeler
Senior Content Developer for PowerShell
https://github.com/sdwheeler

DSC Resource Kit Release November 2018

This post was originally published on this site

We just released the DSC Resource Kit!

This release includes updates to 9 DSC resource modules. In the past 6 weeks, 61 pull requests have been merged and 67 issues have been closed, all thanks to our amazing community!

The modules updated in this release are:

  • AuditPolicyDsc
  • DFSDsc
  • NetworkingDsc
  • SecurityPolicyDsc
  • SharePointDsc
  • StorageDsc
  • xBitlocker
  • xExchange
  • xHyper-V

For a detailed list of the resource modules and fixes in this release, see the Included in this Release section below.

Our latest community call for the DSC Resource Kit was supposed to be today, November 28, but the public link to the call expired, so the call was cancelled. I will update the link for next time. If there is interest in rescheduling this call, the new call time will be announced on Twitter (@katiedsc or @migreene) The call for the next release cycle is also getting moved a week later than usual to January 9 at 12PM (Pacific standard time). Join us to ask questions and give feedback about your experience with the DSC Resource Kit.

The next DSC Resource Kit release will be on Wednesday, January 9.

We strongly encourage you to update to the newest version of all modules using the PowerShell Gallery, and don’t forget to give us your feedback in the comments below, on GitHub, or on Twitter (@PowerShell_Team)!

Please see our documentation here for information on the support of these resource modules.

Included in this Release

You can see a detailed summary of all changes included in this release in the table below. For past release notes, go to the README.md or CHANGELOG.md file on the GitHub repository page for a specific module (see the How to Find DSC Resource Modules on GitHub section below for details on finding the GitHub page for a specific module).

Module Name Version Release Notes
AuditPolicyDsc 1.3.0.0
  • Update LICENSE file to match the Microsoft Open Source Team standard.
  • Added the AuditPolicyGuid resource.
DFSDsc 4.2.0.0
  • Add support for modifying staging quota size in MSFT_DFSReplicationGroupMembership – fixes Issue 77.
  • Refactored module folder structure to move resource to root folder of repository and remove test harness – fixes Issue 74.
  • Updated Examples to support deployment to PowerShell Gallery scripts.
  • Remove exclusion of all tags in appveyor.yml, so all common tests can be run if opt-in.
  • Added .VSCode settings for applying DSC PSSA rules – fixes Issue 75.
  • Updated LICENSE file to match the Microsoft Open Source Team standard – fixes Issue 79
NetworkingDsc 6.2.0.0
  • Added .VSCode settings for applying DSC PSSA rules – fixes Issue 357.
  • Updated LICENSE file to match the Microsoft Open Source Team standard – fixes Issue 363
  • MSFT_NetIPInterface:
    • Added a new resource for configuring the IP interface settings for a network interface.
SecurityPolicyDsc 2.6.0.0
  • Added SecurityOption – Network_access_Restrict_clients_allowed_to_make_remote_calls_to_SAM
  • Bug fix – Issue 105 – Spelling error in SecurityOption”User_Account_Control_Behavior_of_the_elevation_prompt_for_standard_users”
  • Bug fix – Issue 90 – Corrected value for Microsoft_network_server_Server_SPN_target_name_validation_level policy
SharePointDsc 3.0.0.0
  • Changes to SharePointDsc
    • Added support for SharePoint 2019
    • Added CredSSP requirement to the Readme files
    • Added VSCode Support for running SharePoint 2019 unit tests
    • Removed the deprecated resources SPCreateFarm and SPJoinFarm (replaced in v2.0 by SPFarm)
  • SPBlobCacheSettings
    • Updated the Service Instance retrieval to be language independent
  • SPConfigWizard
    • Fixed check for Ensure=Absent in the Set method
  • SPInstallPrereqs
    • Added support for detecting updated installation of Microsoft Visual C++ 2015/2017 Redistributable (x64) for SharePoint 2016 and SharePoint 2019.
  • SPSearchContentSource
    • Added support for Business Content Source Type
  • SPSearchMetadataCategory
    • New resource added
  • SPSearchServiceApp
    • Updated resource to make sure the presence of the service app proxy is checked and created if it does not exist
  • SPSecurityTokenServiceConfig
    • The resource only tested for the Ensure parameter. Added more parameters
  • SPServiceAppSecurity
    • Added support for specifying array of access levels.
    • Changed implementation to use Grant-SPObjectSecurity with Replace switch instead of using a combination of Revoke-SPObjectSecurity and Grant-SPObjectSecurity
    • Added all supported access levels as available values.
    • Removed unknown access levels: Change Permissions, Write, and Read
  • SPUserProfileProperty
    • Removed obsolete parameters (MappingConnectionName, MappingPropertyName, MappingDirection) and introduced new parameter PropertyMappings
  • SPUserProfileServiceApp
    • Updated the check for successful creation of the service app to throw an error if this is not done correctly The following changes will break v2.x and earlier configurations that use these resources:
  • Implemented IsSingleInstance parameter to force that the resource can only be used once in a configuration for the following resources:
    • SPAntivirusSettings
    • SPConfigWizard
    • SPDiagnosticLoggingSettings
    • SPFarm
    • SPFarmAdministrators
    • SPInfoPathFormsServiceConfig
    • SPInstall
    • SPInstallPrereqs
    • SPIrmSettings
    • SPMinRoleCompliance
    • SPPasswordChangeSettings
    • SPProjectServerLicense
    • SPSecurityTokenServiceConfig
    • SPShellAdmin
  • Standardized Url/WebApplication parameter to default WebAppUrl parameter for the following resources:
    • SPDesignerSettings
    • SPFarmSolution
    • SPSelfServiceSiteCreation
    • SPWebAppBlockedFileTypes
    • SPWebAppClientCallableSettings
    • SPWebAppGeneralSettings
    • SPWebApplication
    • SPWebApplicationAppDomain
    • SPWebAppSiteUseAndDeletion
    • SPWebAppThrottlingSettings
    • SPWebAppWorkflowSettings
  • Introduced new mandatory parameters
    • SPSearchResultSource: Added option to create Result Sources at different scopes.
    • SPServiceAppSecurity: Changed parameter AccessLevel to AccessLevels in MSFT_SPServiceAppSecurityEntry to support array of access levels.
    • SPUserProfileProperty: New parameter PropertyMappings
SharePointDsc 3.1.0.0
  • Changes to SharePointDsc
    • Updated LICENSE file to match the Microsoft Open Source Team standard.
  • ProjectServerConnector
    • Added a file hash validation check to prevent the ability to load custom code into the module.
  • SPFarm
    • Fixed localization issue where TypeName was in the local language.
  • SPInstallPrereqs
    • Updated links in the Readme.md file to docs.microsoft.com.
    • Fixed required prereqs for SharePoint 2019, added MSVCRT11.
  • SPManagedMetadataServiceApp
    • Fixed issue where Get-TargetResource method throws an error when the service app proxy does not exist.
  • SPSearchContentSource
    • Corrected issue where the New-SPEnterpriseSearchCrawlContentSource cmdlet was called twice.
  • SPSearchServiceApp
    • Fixed issue where Get-TargetResource method throws an error when the service application pool does not exist.
    • Implemented check to make sure cmdlets are only executed when it actually has something to update.
    • Deprecated WindowsServiceAccount parameter and moved functionality to new resource (SPSearchServiceSettings).
  • SPSearchServiceSettings
    • Added new resource to configure search service settings.
  • SPServiceAppSecurity
    • Fixed unavailable utility method (ExpandAccessLevel).
    • Updated the schema to no longer specify username as key for the sub class.
  • SPUserProfileServiceApp
    • Fixed issue where localized versions of Windows and SharePoint would throw an error.
  • SPUserProfileSyncConnection
    • Corrected implementation of Ensure parameter.
StorageDsc 4.3.0.0
  • WaitForDisk:
    • Added readonly-property isAvailable which shows the current state of the disk as a boolean – fixes Issue 158.
xBitlocker 1.3.0.0
  • Update appveyor.yml to use the default template.
  • Added default template files .gitattributes, and .vscode settings.
  • Fixes most PSScriptAnalyzer issues.
  • Fix issue where AutoUnlock is not set if requested, if the disk was originally encrypted and AutoUnlock was not used.
  • Add remaining Unit Tests for xBitlockerCommon.
  • Add Unit tests for MSFT_xBLTpm
  • Add remaining Unit Tests for xBLAutoBitlocker
  • Add Unit tests for MSFT_xBLBitlocker
  • Moved change log to CHANGELOG.md file
  • Fixed Markdown validation warnings in README.md
  • Added .MetaTestOptIn.json file to root of module
  • Add Integration Tests for module resources
  • Rename functions with improper Verb-Noun constructs
  • Add comment based help to any functions without it
  • Update Schema.mof Description fields
  • Fixes issue where Switch parameters are passed to Enable-Bitlocker even if the corresponding DSC resource parameter was set to False (Issue 12)
xExchange 1.25.0.0
  • Opt-in for the common test flagged Script Analyzer rules (issue 234).
  • Opt-in for the common test testing for relative path length.
  • Removed the property PSDscAllowPlainTextPassword from all examples so the examples are secure by default. The property PSDscAllowPlainTextPassword was previously needed to (test) compile the examples in the CI pipeline, but now the CI pipeline is using a certificate to compile the examples.
  • Opt-in for the common test that validates the markdown links.
  • Fix typo of the word “Certificate” in several example files.
  • Add spaces between array members.
  • Add initial set of Unit Tests (mostly Get-TargetResource tests) for all remaining resource files.
  • Add WaitForComputerObject parameter to xExchWaitForDAG
  • Add spaces between comment hashtags and comments.
  • Add space between variable types and variables.
  • Fixes issue where xExchMailboxDatabase fails to test for a Journal Recipient because the module did not load the Get-Recipient cmdlet (335).
  • Fixes broken Integration tests in MSFT_xExchMaintenanceMode.Integration.Tests.ps1 (336).
  • Fix issue where Get-ReceiveConnector against an Absent connector causes an error to be logged in the MSExchange Management log.
  • Rename poorly named functions in xExchangeDiskPart.psm1 and MSFT_xExchAutoMountPoint.psm1, and add comment based help.
xHyper-V 3.14.0.0
  • MSFT_xVMHost:
    • Added support to Enable / Disable VM Live Migration. Fixes Issue 155.

How to Find Released DSC Resource Modules

To see a list of all released DSC Resource Kit modules, go to the PowerShell Gallery and display all modules tagged as DSCResourceKit. You can also enter a module’s name in the search box in the upper right corner of the PowerShell Gallery to find a specific module.

Of course, you can also always use PowerShellGet (available starting in WMF 5.0) to find modules with DSC Resources:

# To list all modules that tagged as DSCResourceKit
Find-Module -Tag DSCResourceKit 
# To list all DSC resources from all sources 
Find-DscResource

Please note only those modules released by the PowerShell Team are currently considered part of the ‘DSC Resource Kit’ regardless of the presence of the ‘DSC Resource Kit’ tag in the PowerShell Gallery.

To find a specific module, go directly to its URL on the PowerShell Gallery:
http://www.powershellgallery.com/packages/< module name >
For example:
http://www.powershellgallery.com/packages/xWebAdministration

How to Install DSC Resource Modules From the PowerShell Gallery

We recommend that you use PowerShellGet to install DSC resource modules:

Install-Module -Name < module name >

For example:

Install-Module -Name xWebAdministration

To update all previously installed modules at once, open an elevated PowerShell prompt and use this command:

Update-Module

After installing modules, you can discover all DSC resources available to your local system with this command:

Get-DscResource

How to Find DSC Resource Modules on GitHub

All resource modules in the DSC Resource Kit are available open-source on GitHub.
You can see the most recent state of a resource module by visiting its GitHub page at:
https://github.com/PowerShell/< module name >
For example, for the CertificateDsc module, go to:
https://github.com/PowerShell/CertificateDsc.

All DSC modules are also listed as submodules of the DscResources repository in the DscResources folder and the xDscResources folder.

How to Contribute

You are more than welcome to contribute to the development of the DSC Resource Kit! There are several different ways you can help. You can create new DSC resources or modules, add test automation, improve documentation, fix existing issues, or open new ones.
See our contributing guide for more info on how to become a DSC Resource Kit contributor.

If you would like to help, please take a look at the list of open issues for the DscResources repository.
You can also check issues for specific resource modules by going to:
https://github.com/PowerShell/< module name >/issues
For example:
https://github.com/PowerShell/xPSDesiredStateConfiguration/issues

Your help in developing the DSC Resource Kit is invaluable to us!

Questions, comments?

If you’re looking into using PowerShell DSC, have questions or issues with a current resource, or would like a new resource, let us know in the comments below, on Twitter (@PowerShell_Team), or by creating an issue on GitHub.

Katie Kragenbrink
Software Engineer
PowerShell DSC Team
@katiedsc (Twitter)
@kwirkykat (GitHub)

PowerShell Constrained Language mode and the Dot-Source Operator

This post was originally published on this site

PowerShell Constrained Language mode and the Dot-Source Operator

PowerShell works with application control systems, such as AppLocker and Windows Defender Application Control (WDAC), by automatically running in
ConstrainedLanguage mode. ConstrainedLanguage mode restricts some exploitable aspects of PowerShell while still giving you a rich shell to run commands and scripts in. This is different from usual application white listing rules, where an application is either allowed to run or not.

But there are times when the full power of PowerShell is needed, so we allow script files to run in FullLanguage mode when they are trusted by the policy. Trust can be indicated through file signing or other policy mechanisms such as file hash. However, script typed into the interactive shell is always run constrained.

Since PowerShell can run script in both Full and Constrained language modes, we need to protect the boundary between them. We don’t want to leak variables or functions between sessions running in different language modes.

The PowerShell dot-source operator brings script files into the current session scope. It is a way to reuse script. All script functions and variables defined in the script file become part of the script it is dot sourced into. It is like copying and pasting text from the script file directly into your script.

# HelperFn1, HelperFn2 are defined in HelperFunctions.ps1
# Dot-source the file here to get access to them (no need to copy/paste)
. c:ScriptsHelperFunctions.ps1
HelperFn1
HelperFn2

This presents a problem when language modes are in effect with system application control. If an untrusted script is dot-sourced into a script with full trust then it has access to all those functions that run in FullLanguage mode, which can result in application control bypass through arbitrary code execution or privilege escalation. Consequently, PowerShell prevents this by throwing an error when dot-sourcing is attempted across language modes.

Example 1:

System is in WDAC policy lock down. To start with, neither script is trusted and so both run in ConstrainedLanguage mode. But the HelperFn1 function uses method invocation which isn’t allowed in that mode.

PS> type c:MyScript.ps1
Write-Output "Dot sourcing MyHelper.ps1 script file"
. c:MyHelper.ps1
HelperFn1
PS> type c:MyHelper.ps1
function HelperFn1
{
    "Language mode: $($ExecutionContext.SessionState.LanguageMode)"
    [System.Console]::WriteLine("This can only run in FullLanguage mode!")
}
PS> c:MyScript.ps1
Dot sourcing MyHelper.ps1 script file
Language mode: ConstrainedLanguage
Cannot invoke method. Method invocation is supported only on core types in this language mode.
At C:MyHelper.ps1:4 char:5
+     [System.Console]::WriteLine("This cannot run in ConstrainedLangua ...
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : MethodInvocationNotSupportedInConstrainedLanguage

Both scripts are untrusted and run in ConstrainedLanguage mode, so dot-sourcing the MyHelper.ps1 file works. However, the HelperFn1 function performs method invocation that is not allowed in ConstrainedLanguage and fails when run. MyHelper.ps1 needs to be signed as trusted so it can run at FullLanguage.

Next we have mixed language modes. MyHelper.ps1 is signed and trusted, but MyScript.ps1 is not.

PS> c:MyScript.ps1
Dot sourcing MyHelper.ps1 script file
C:MyHelper.ps1 : Cannot dot-source this command because it was defined in a different language mode. To invoke this command without importing its contents, omit the '.' operator.
At C:MyScript.ps1:2 char:1
+ . 'c:MyHelper.ps1'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [MyHelper.ps1], NotSupportedException
    + FullyQualifiedErrorId : DotSourceNotSupported,MyHelper.ps1
...

And we get a dot-source error because we are trying to dot-source script that has a different language mode than the session it is being dot-sourced into.

Finally, we sign as trusted both script files and everything works.

PS> c:MyScript.ps1
Dot sourcing MyHelper.ps1 script file
Language mode: FullLanguage
This can only run in FullLanguage mode!

The lesson here is to ensure all script components run in the same language mode on policy locked down systems. If one component must run in FullLanguage mode, then all components should run in FullLanguage mode. This means validating that each component is safe to run in FullLanguage and indicating they are trusted to the application control policy.

So this solves all language mode problems, right? If FullLanguage is not needed then just ensure all script components run untrusted, which is the default condition. If they require FullLanguage then carefully validate all components and mark them as trusted. Unfortuantely, there is one case where this best practice doesn’t work.

PowerShell Profile File

The PowerShell profile file (profile.ps1) is loaded and run at PowerShell start up. If that script requires FullLanguage mode on policy lock down systems, you just validate and sign the file as trusted, right?

Example 2:

PS> type c:users<user>DocumentsWindowsPowerShellprofile.ps1
Write-Output "Running Profile"
[System.Console]::WriteLine("This can only run in FullLanguage!")
# Sign file so it is trusted and will run in FullLanguage mode
PS> Set-AuthenticodeSignature -FilePath .Profile.ps1 -Certificate $myPolicyCert
# Start a new PowerShell session and run the profile script
PS> powershell.exe
Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.
C:Users<user>DocumentsWindowsPowerShellprofile.ps1 : Cannot dot-source this command because it was defined in a different language mode. To invoke this command without importing its contents, omit the '.' operator.
At line:1 char:1
+ . 'C:Users<user>DocumentsWindowsPowerShellprofile.ps1'
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [profile.ps1], NotSupportedException
    + FullyQualifiedErrorId : DotSourceNotSupported,profile.ps1

What gives? The profile.ps1 file was signed and is policy trusted. Why the error?
Well, the issue is that PowerShell dot-sources the profile.ps1 file into the default PowerShell session, which must run in ConstrainedLanguage because of the policy. So we are attempting to dot-source a FullLanguage script into a ConstrainedLanguage session, and that is not allowed. This is a catch 22 because if the profile.ps1 is not signed, it may not run if it needs FullLanguage privileges (e.g., invoke methods). But if you sign it, it still won’t run because of how it is dot-sourced into the current ConstrainedLanguage interactive session.

Unfortunately, the only solution is to keep the profile.ps1 file fairly simple so that it does not need FullLanguage, and refrain from making it trusted. Keep in mind that this is only an issue when running with application control policy. Otherwise, language modes do not come into play and PowerShell profile files run normally.

Paul Higinbotham
Senior Software Engineer
PowerShell Team