Yes Virginia, languages other than PowerShell do exist.
I was working with a partner group here at Microsoft and they explained that they wanted to parse PowerShell scripts from Python.
Their natural approach was to invoke the PowerShell executable and construct a command-line that did what they needed.
I thought there might be a better way, since creating a new PowerShell process each time is expensive, so I started doing a bit of research to see if something could be done.
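For comparison, the process-per-call approach looks roughly like the sketch below. The helper names `build_pwsh_command` and `run_pwsh` are made up for illustration, and the sketch assumes a `pwsh` executable is on the PATH. Every call pays the startup cost of a brand-new PowerShell process.

```python
import subprocess

def build_pwsh_command(script):
    """Build the argument list for a one-shot pwsh invocation (hypothetical helper)."""
    # -NoProfile and -NonInteractive skip profile loading and prompts.
    return ["pwsh", "-NoProfile", "-NonInteractive", "-Command", script]

def run_pwsh(script):
    """Run a script in a new pwsh process and return its stdout as text."""
    completed = subprocess.run(
        build_pwsh_command(script),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Calling `run_pwsh("Get-ChildItem")` returns only formatted text, so the caller also loses access to the underlying .NET objects.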
I’ve been aware of IronPython (Python that tightly integrates .NET) for a long time, and
we met with Jim Hugunin shortly after he arrived at Microsoft and PowerShell was just getting underway,
but the group is using CPython, so I went hunting for Python modules that host .NET and found the pythonnet module.
The pythonnet package gives Python developers extremely easy access to the dotnet runtime from Python.
I thought this package might be the key for accessing PowerShell,
and after some investigation I found that it has exactly what I needed to host PowerShell in a Python script.
The guts
I needed to figure out a way to load the PowerShell engine.
First, there are a couple of requirements to make this all work.
Dotnet has to be available, as does PowerShell, and pythonnet provides a way to specify where to look for dotnet.
Setting the environment variable DOTNET_ROOT to the install location gives pythonnet a way to find the assemblies and other support files to host .NET.
import os
os.environ["DOTNET_ROOT"] = "/root/.dotnet"
Now that we know where dotnet is, we need to load up the CLR and set up the runtime configuration.
The runtime configuration describes various aspects of how we’ll run.
We can create a very simple pspython.runtimeconfig.json
{
  "runtimeOptions": {
    "tfm": "net6.0",
    "framework": {
      "name": "Microsoft.NETCore.App",
      "version": "6.0.0"
    }
  }
}
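If you would rather not keep the JSON file by hand, it could be generated from Python. This is only a sketch: the function names `runtime_config` and `write_runtime_config` are hypothetical, and the defaults simply mirror the values shown above.

```python
import json

def runtime_config(version="6.0.0", tfm="net6.0"):
    """Build the runtime configuration shown above as a Python dict."""
    return {
        "runtimeOptions": {
            "tfm": tfm,
            "framework": {"name": "Microsoft.NETCore.App", "version": version},
        }
    }

def write_runtime_config(path, **options):
    """Serialize the configuration to `path` and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(runtime_config(**options), handle, indent=2)
    return path
```

The returned path can then be handed to whatever loads the runtime.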
The combination of DOTNET_ROOT and the runtime configuration enables
loading the CLR with the get_coreclr and set_runtime functions.
# load up the clr
from clr_loader import get_coreclr
from pythonnet import set_runtime
rt = get_coreclr("/root/pspython.runtimeconfig.json")
set_runtime(rt)
Now that we have the CLR loaded, we need to load the PowerShell engine.
This was a little non-obvious.
Initially, I just attempted to load System.Management.Automation.dll,
but that failed due to a strong name validation error.
However, if I load Microsoft.Management.Infrastructure.dll first, I can avoid that error.
I’m not yet sure why I need to load this assembly first; that’s still something I need to determine.
import clr
import sys
import System
from System import Environment
from System import Reflection
psHome = r'/opt/microsoft/powershell/7/'
mmi = psHome + r'Microsoft.Management.Infrastructure.dll'
clr.AddReference(mmi)
from Microsoft.Management.Infrastructure import *
full_filename = psHome + r'System.Management.Automation.dll'
clr.AddReference(full_filename)
from System.Management.Automation import *
from System.Management.Automation.Language import Parser
Eventually I would like to make the locations of dotnet and PSHOME
configurable,
but for the moment, I have what I need.
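One way the locations might be made configurable is to read them from environment variables and fall back to the hard-coded paths used above. This is a sketch only: `PSPYTHON_PSHOME` is a made-up variable name, and `resolve_locations` and `assembly_paths` are hypothetical helpers.

```python
import os

DEFAULT_DOTNET_ROOT = "/root/.dotnet"
DEFAULT_PSHOME = "/opt/microsoft/powershell/7/"

def resolve_locations(environ=os.environ):
    """Return (dotnet_root, ps_home), preferring environment overrides."""
    dotnet_root = environ.get("DOTNET_ROOT", DEFAULT_DOTNET_ROOT)
    # PSPYTHON_PSHOME is an invented name used only for this sketch.
    ps_home = environ.get("PSPYTHON_PSHOME", DEFAULT_PSHOME)
    return dotnet_root, ps_home

def assembly_paths(ps_home):
    """Assemblies to load, in the order that avoided the strong name error."""
    return [
        os.path.join(ps_home, "Microsoft.Management.Infrastructure.dll"),
        os.path.join(ps_home, "System.Management.Automation.dll"),
    ]
```

Each path from `assembly_paths` could then be passed to `clr.AddReference` in turn.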
Now that the PowerShell engine is available to me,
I created a couple of helper functions to make handling the results easier from Python.
I also created a PowerShell object (PowerShell.Create()) that I will use in some of my functions.
ps = PowerShell.Create()
def PsRunScript(script):
    ps.Commands.Clear()
    ps.Commands.AddScript(script)
    result = ps.Invoke()
    rlist = []
    for r in result:
        rlist.append(r)
    return rlist
class ParseResult:
    def __init__(self, scriptDefinition, tupleResult):
        self.ScriptDefinition = scriptDefinition
        self.Ast = tupleResult[0]
        self.Tokens = tupleResult[1]
        self.Errors = tupleResult[2]
    def PrintAst(self):
        print(self.Ast.Extent.Text)
    def PrintErrors(self):
        for e in self.Errors:
            print(str(e))
    def PrintTokens(self):
        for t in self.Tokens:
            print(str(t))
    def FindAst(self, astname):
        # wrap a Python lambda in a .NET Func<Ast, bool> delegate
        Func = getattr(System, "Func`2")
        func = Func[System.Management.Automation.Language.Ast, bool](lambda a : type(a).__name__ == astname)
        asts = self.Ast.FindAll(func, True)
        return asts
def ParseScript(scriptDefinition):
    token = None
    error = None
    # this returns a tuple of ast, tokens, and errors rather than the c# out parameter
    ast = Parser.ParseInput(scriptDefinition, token, error)
    # ParseResult will bundle the 3 parts into something more easily consumed.
    pr = ParseResult(scriptDefinition, ast)
    return pr
def ParseFile(filePath):
    token = None
    error = None
    # this returns a tuple of ast, tokens, and errors rather than the c# out parameter
    ast = Parser.ParseFile(filePath, token, error)
    # ParseResult will bundle the 3 parts into something more easily consumed.
    pr = ParseResult(filePath, ast)
    return pr
def PrintResults(result):
    for r in result:
        print(r)
I really wanted to mimic the PowerShell AST methods with some more friendly Python functions.
To create the FindAst() function, I needed to combine the delegate in c# with the lambda feature in Python.
Normally, in PowerShell, this would look like:
$ast.FindAll({$args[0] -is [System.Management.Automation.Language.CommandAst]}, $true)
But I thought that from a Python script, it would be easier to use the name of the type.
You still need to know the name of the type,
but Bing is great for that sort of thing.
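The name-based predicate can be written as a plain Python callable before it is wrapped in a delegate. The sketch below is a hypothetical variation, not part of the module: `type_name_predicate` accepts several type names at once.

```python
def type_name_predicate(*type_names):
    """Return a callable that is True for objects whose type name is in type_names."""
    wanted = frozenset(type_names)
    return lambda node: type(node).__name__ in wanted
```

Inside FindAst, the result could be wrapped the same way as the existing lambda, for example `Func[System.Management.Automation.Language.Ast, bool](type_name_predicate("CommandAst", "PipelineAst"))`.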
As I said, I don’t really know the Python language,
so I expect there are better ways to handle the Collection[PSObject] that Invoke() returns.
I found that I had to iterate over the result no matter what, so I built it into the convenience function.
Anyone with suggestions is more than welcome to improve this.
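One possible simplification: since the for loop already shows the collection is iterable from Python, the built-in `list()` can do the copying. The sketch below also takes the PowerShell object as a parameter instead of using the module-level `ps`, which is a design choice of this example rather than something the module does; `run_script` is a hypothetical name.

```python
def run_script(ps, script):
    """Run `script` on a hosted PowerShell instance and return the results as a list."""
    ps.Commands.Clear()
    ps.Commands.AddScript(script)
    # list() consumes any Python iterable, so no explicit append loop is needed.
    return list(ps.Invoke())
```

Passing the instance in makes the helper easy to exercise with a stand-in object when no PowerShell engine is loaded.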
The glory
Now that we have the base module together, we can write some pretty simple Python to
execute our PowerShell scripts.
Invoking a PowerShell script is now as easy as:
#!/usr/bin/python3
from pspython import *
scriptDefinition = 'Get-ChildItem'
print(f"Run the script: '{scriptDefinition}'")
result = PsRunScript(scriptDefinition)
PrintResults(result)
/root/__pycache__
/root/dotnet-install.sh
/root/get-pip.py
/root/grr.py
/root/hosted.runtimeconfig.json
/root/pspar.py
/root/pspython.py
/root/psrun.py
You’ll notice that the output is not formatted by PowerShell.
This is because Python is just taking the .NET objects and (essentially) calling ToString() on them.
It’s also possible to retrieve objects and then manage formatting via PowerShell.
This example retrieves objects via Get-ChildItem,
selects those files that start with “ps” in Python,
and then creates a string result in table format.
scriptDefinition = 'Get-ChildItem'
result = list(filter(lambda r: r.BaseObject.Name.startswith('ps'), PsRunScript(scriptDefinition)))
ps.Commands.Clear()
ps.Commands.AddCommand("Out-String").AddParameter("Stream", True).AddParameter("InputObject", result)
strResult = ps.Invoke()
# print results
PrintResults(strResult)
    Directory: /root
UnixMode   User Group   LastWriteTime   Size Name
--------   ---- -----   -------------   ---- ----
-rwxr-xr-x root dialout 6/17/2022 01:30 1117 pspar.py
-rwxr-xr-x root dialout 6/16/2022 18:55 2474 pspython.py
-rwxr-xr-x root dialout 6/16/2022 21:43  684 psrun.py
But that’s not all
We can also call static methods on PowerShell types.
Some of you may have noticed that my module has a couple of language-related functions.
The ParseScript and ParseFile functions allow us to call the PowerShell language parser,
enabling some very interesting scenarios.
Imagine I wanted to determine what commands a script is calling.
The PowerShell AST makes that a breeze, but first we have to use the parser.
In PowerShell, that would be done like this:
$tokens = $errors = $null
$AST = [System.Management.Automation.Language.Parser]::ParseFile("myscript.ps1", [ref]$tokens, [ref]$errors)
The resulting AST is stored in $AST, the tokens in $tokens, and the errors in $errors.
With this Python module, I encapsulate that into the Python function ParseFile,
which returns an object containing all three of those results in a single element.
I also created a couple of helper functions to print the tokens and errors more easily.
Additionally, I created a function that allows me to look for any type of AST (or sub AST)
in any arbitrary AST.
parseResult = ParseFile(scriptFile)
commandAst = parseResult.FindAst("CommandAst")
commands = set()
for c in commandAst:
    commandName = c.GetCommandName()
    # sometimes CommandName is null, don't include those
    if commandName is not None:
        commands.add(commandName.lower())
PrintResults(sorted(commands))
Note that there is a check for commandName not being null.
This is because when & $commandName
is used, the command name cannot be
determined via static analysis since the command name is determined at run-time.
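If it is useful to know how many commands could not be named statically, the loop above could be folded into a small function that counts them. This is a sketch with a hypothetical name, `collect_command_names`; it relies only on the `GetCommandName()` call already used above.

```python
def collect_command_names(command_asts):
    """Return (sorted unique lower-cased names, count of commands with no static name)."""
    names = set()
    dynamic = 0
    for command in command_asts:
        name = command.GetCommandName()
        if name is None:
            # e.g. `& $commandName`, which is only resolved at run-time
            dynamic += 1
        else:
            names.add(name.lower())
    return sorted(names), dynamic
```

It would be called as `collect_command_names(parseResult.FindAst("CommandAst"))`.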
…a few, uh, provisos, uh, a couple of quid pro quo
First, you have to have dotnet installed (via the dotnet-install script),
as well as a full installation of PowerShell.
pythonnet doesn’t run on all versions of Python;
I’ve tested it only on Python 3.8 and Python 3.9 on Ubuntu 20.04.
As of the time I wrote this, I couldn’t get it to run on Python 3.10.
There’s more info on pythonnet at the pythonnet web-site.
Also, this is a hosted instance of PowerShell.
Some things, like progress, verbose output, and errors, may act a bit differently than you
would see from pwsh.exe.
Over time, I will probably add additional helper functions to retrieve more runtime information
from the engine instance.
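As one example of what such a helper might look like, the sketch below reads the error stream of a hosted instance after an invocation. `collect_errors` is a hypothetical name; it assumes the `HadErrors` and `Streams.Error` members of the PowerShell object behave from Python as they do in .NET, which I have not verified under pythonnet.

```python
def collect_errors(ps):
    """Return the error records from the last invocation as strings."""
    if not ps.HadErrors:
        return []
    # Each entry in the error stream is converted with str(), as PrintErrors does.
    return [str(record) for record in ps.Streams.Error]
```

A caller could run a script and then print `collect_errors(ps)` to see what a console host would have shown on screen.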
If you would like to pitch in, I’m happy to take Pull Requests or to simply understand your use cases integrating PowerShell and Python.
Take it out for a spin
I’ve wrapped all of this up and added a Dockerfile (running on Ubuntu 20.04) on GitHub.
To create the docker image, just run
docker build --tag pspython:demo .
from the root of the repository.
The post Hosting PowerShell in a Python script appeared first on PowerShell Team.