Make Your WSL Environment Programmable

I have been fascinated by the WSL architecture since its debut. It is an elegant design that achieves interoperability between Windows and the Linux world. Eventually, I dove into the WSL Win32 APIs, the registry model, and the internals too.

But there was very little documentation or sample code for the WSL API. Later, I realized that Microsoft does not intend the WSL API for general-purpose use; it is aimed at WSL distribution developers. Also, the CLI tool known as WSL.exe and the distro launchers are moving targets, because each major Windows 10 release ships different features and command-line options. Together, these circumstances make it hard to automate the WSL environment.

After much trial and error, I developed a small but effective automation tool called WSL SDK. The tool is an out-of-process COM server, so you can access the SDK from any language that supports COM.
In this article, I want to walk through the WSL SDK and show how it overcomes the difficulties of using the official Win32 WSL APIs.

Hidden treasures of WSL

If you look closely at the WSL APIs, you can find hints about how they are built. For example, the WslLaunch API returns an HRESULT code. Internally, the API wraps a WSL-related COM object known as LxssUserSession.

Sadly, Microsoft keeps this internal COM object completely hidden, and that appears to be intentional. I guess there are reasonable grounds for this decision. However, it also means the internal COM object is not well documented.

Unkindness of the WSL APIs

Because of this design, the WSL APIs have a well-known, by-design limitation. If you call them through P/Invoke from PowerShell, LINQPad, or any other COM-enabled environment, the calls do not work as expected. Many enthusiasts have tried, with no luck.

Why is that? The environments I mentioned have already called CoInitializeSecurity with their own settings. The WSL APIs, however, require particular initialization parameters, and once CoInitializeSecurity has been called in a process, it cannot be called again in that process. To overcome this issue, I had no choice but to use an out-of-process model.

Excavating the old samples

However, an out-of-process model adds cumbersome steps to using the API. You have to check whether the helper process exists. You have to define how to communicate with it, such as through pipes, local networking, or some marshaling protocol. Moreover, this approach makes it hard to extend and maintain the API calls and functionality.

I was stuck at this point for a long time. But thanks to an old Microsoft project named All-In-One Code Framework, I found a beautiful solution. Yes: the out-of-process COM server model! I adapted its sample out-of-process COM server code, and it works like a charm.

Under the hood

When a client application requests a WSL SDK service object through the COM API, Windows automatically launches the server executable to obtain the appropriate object reference. COM then wraps that reference in a proxy interface and hands it to the client application. For a better understanding, please look at the picture below.

The WSL SDK executable calls CoInitializeSecurity with the parameters that the WSL APIs require. It then starts a message pump to handle incoming RPC requests and any Windows GUI-related requests. As clients acquire and release object references, the server's reference count increases and decreases. When the count reaches zero, the executable process shuts down. When another request arrives, the same cycle starts again, for as long as the COM registration remains in the registry.

In this way, the WSL SDK decouples the application's COM security settings from the ones WSL requires. Process lifecycle management is handled by the operating system's infrastructure and the reference counting mechanism. WSL SDK clients do not need to care about any of these details. They request a WSL SDK interface as usual, and everything just works.
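To make the reference counting part more concrete, here is a minimal sketch of how a PowerShell client might release its reference explicitly instead of waiting for garbage collection. The function name Use-WslService and its -Service parameter are my own hypothetical helpers and not part of the WSL SDK; only the ProgID 'WslSdk.WslService' and the IsDistroRegistered method come from the SDK samples in this article. I am assuming here that dropping the last client reference lets the server process exit, as described above.

```powershell
# Hypothetical helper (not part of the WSL SDK): run a script block against
# the service object, then release the COM reference when the block finishes.
function Use-WslService {
    param(
        [Parameter(Mandatory)][scriptblock]$Action,
        # Assumption: callers may inject an existing object (handy for testing).
        # By default, the out-of-process COM server is activated here.
        $Service = (New-Object -ComObject 'WslSdk.WslService')
    )
    try {
        # The service object is passed to the block as its first argument.
        & $Action $Service
    }
    finally {
        # Only real COM objects can be released; injected stand-ins are skipped.
        if ([System.Runtime.InteropServices.Marshal]::IsComObject($Service)) {
            [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($Service)
        }
    }
}

# Example (requires the WSL SDK to be registered on the machine):
# Use-WslService { param($svc) $svc.IsDistroRegistered('Ubuntu-20.04') }
```

Wrapping the call in try/finally means the reference is released even when the script block throws, so a failed script should not keep the server process alive longer than needed.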

Comparing direct P/Invoke to the WSL API and using WSL SDK

I will show a simple demonstration.

The PowerShell sample below is excerpted from the GitHub issue "WSL API does not work in PowerShell" (Issue #4058, microsoft/WSL).

# Excerpted from https://github.com/microsoft/WSL/issues/4058
Write-Host 'Calling WslIsDistributionRegistered directly (Ubuntu-20.04):'
Add-Type -TypeDefinition @'
using System.Runtime.InteropServices;
public class wslutil
{
    // P/Invoke declaration for the C function exported by wslapi.dll
    [DllImport("wslapi.dll", CharSet = CharSet.Unicode)]
    public static extern uint WslIsDistributionRegistered([In, MarshalAs(UnmanagedType.LPWStr)] string distributionName);

    public static void Main(string[] args)
    {
        System.Console.WriteLine(WslIsDistributionRegistered("Ubuntu-20.04"));
    }
}
'@
[wslutil]::Main(@())
Write-Host

The PowerShell code references a C function exported by wslapi.dll. It looks reasonable and seems like it should work. But the call returns zero even when the distro is installed, so it does not reflect the actual state.

However, the WSL SDK returns the correct value.

Write-Host 'Calling WSL SDK API (Ubuntu-20.04):'
$DistroName = 'Ubuntu-20.04'
$obj = New-Object -ComObject 'WslSdk.WslService'
$Result = $obj.IsDistroRegistered($DistroName)
Write-Host "$Result"
Write-Host

Essentially, both samples depend on the WslIsDistributionRegistered function. But PowerShell has already called CoInitializeSecurity with settings that do not meet the requirements of the WSL APIs, which is why the first example does not work.
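If you want to check several distros at once, one COM object can be reused for all the calls. The following is only a sketch: Get-WslDistroStatus and its -Service parameter are hypothetical names of mine, and the only SDK member used is IsDistroRegistered, exactly as in the sample above. The value it returns is stored as-is, because I do not want to assume more about its type than the sample shows.

```powershell
# Hypothetical helper (not part of the WSL SDK): report the registration
# status of several distro names using a single service object.
function Get-WslDistroStatus {
    param(
        [Parameter(Mandatory)][string[]]$Name,
        # Assumption: an existing object may be injected; otherwise the
        # WSL SDK COM server is activated through its ProgID.
        $Service = (New-Object -ComObject 'WslSdk.WslService')
    )
    foreach ($n in $Name) {
        # Emit one object per distro name, keeping the SDK's result untouched.
        [pscustomobject]@{
            Name       = $n
            Registered = $Service.IsDistroRegistered($n)
        }
    }
}

# Example (requires the WSL SDK to be registered on the machine):
# Get-WslDistroStatus -Name 'Ubuntu-20.04', 'Debian' | Format-Table
```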

A Demo: Sandboxing WSL distro

Now I will show another, more complex example script. The code below automatically downloads the Alpine Linux root filesystem image from the official mirror, registers it as a new distro, and then installs the Vim editor in that distro.

$ErrorActionPreference = "Stop"
$obj = New-Object -ComObject 'WslSdk.WslService'
Write-Output 'A WslSdk.WslService object is created.'
Pause

# Get installed distro list
Write-Output 'Currently installed WSL distro list: '
$list = $obj.GetDistroList()
Write-Output $list
Pause

# Generate Random Name
$RandomName = $obj.GenerateRandomName($false)
Write-Output "We will use $RandomName as a new distro"

# Download Alpine Linux RootFS Image
Write-Output 'Downloading alpine linux root file system image'
$TargetUrl = 'https://dl-cdn.alpinelinux.org/alpine/v3.14/releases/x86_64/alpine-minirootfs-3.14.0-x86_64.tar.gz'
$RootfsFilePath = "$env:TEMP\alpine.tar.gz"
$InstallPath = "C:\Distro\$RandomName"
Invoke-WebRequest -UseBasicParsing -Uri $TargetUrl -OutFile $RootfsFilePath
Pause

# Register Distro
Write-Output "Distro installation begins"
Write-Output " - Distro Name: $RandomName"
Write-Output " - Source RootFS File Path: $RootfsFilePath"
Write-Output " - Destination Install Path: $InstallPath"
$obj.RegisterDistro($RandomName, $RootfsFilePath, $InstallPath)
Pause

# Distro Register Check
$Result = $obj.IsDistroRegistered($RandomName)
Write-Output "Distro Name $RandomName Installed: $Result"
Pause

# Metadata Query
Write-Output "Querying $RandomName metadata..."
$o = $obj.QueryDistroInfo($RandomName)
Write-Output " - Distro ID: $($o.DistroId())"
Write-Output " - Distro Name: $($o.DistroName())"
Write-Output " - Environment Variables: $($o.DefaultEnvironmentVariables())"
Write-Output " - Default Uid: $($o.DefaultUid())"
Write-Output " - Flags: $($o.DistroFlags())"
Write-Output " - Win32 Interop Enabled: $($o.EnableInterop())"
Write-Output " - Drive Mounting Enabled: $($o.EnableDriveMounting())"
Write-Output " - NT Path Append Enabled: $($o.AppendNtPath())"
Write-Output " - WSL Version: $($o.WslVersion())"
Pause

# Run WSL command
Write-Output "Installing vim..."
$res = $obj.RunWslCommand($o.DistroName(), "apk add vim")
Write-Output $res
Pause

# Revealing launcher executable
Write-Output "Revealing launcher executable file"
Start-Process -FilePath "$env:windir\explorer.exe" -ArgumentList "/select,$InstallPath\$RandomName.exe"
Pause

# Unregister Distro
Write-Output "Unregister $RandomName distro..."
$obj.UnregisterDistro($RandomName)
Pause

# Get installed distro list
Write-Output 'Currently installed WSL distro list: '
$list = $obj.GetDistroList()
Write-Output $list
Pause
$obj = $null

The Alpine image is a basic root filesystem image that was not designed specifically for WSL. As you may already know, WSL supports importing any Linux root filesystem image that matches the machine's processor architecture.

The WSL SDK handles dynamic distribution registration and manipulation, even from the PowerShell environment.
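The demo above unregisters the distro at the end, but if any step fails in between, the temporary distro is left behind. Here is a hedged sketch of how the same calls could be wrapped so that cleanup always runs. Invoke-InSandboxDistro and its parameters are hypothetical names of mine; the SDK methods (GenerateRandomName, RegisterDistro, RunWslCommand, UnregisterDistro) are used with the same arguments as in the demo script, and I assume nothing else about them.

```powershell
# Hypothetical helper (not part of the WSL SDK): register a throwaway distro,
# run one command in it, and always unregister it afterwards.
function Invoke-InSandboxDistro {
    param(
        [Parameter(Mandatory)][string]$RootfsFilePath,  # e.g. the downloaded alpine.tar.gz
        [Parameter(Mandatory)][string]$InstallRoot,     # e.g. C:\Distro
        [Parameter(Mandatory)][string]$Command,         # e.g. 'apk add vim'
        # Assumption: an existing object may be injected; otherwise the
        # WSL SDK COM server is activated through its ProgID.
        $Service = (New-Object -ComObject 'WslSdk.WslService')
    )
    # Same naming and path scheme as the demo script above.
    $name = $Service.GenerateRandomName($false)
    $installPath = [System.IO.Path]::Combine($InstallRoot, $name)

    $null = $Service.RegisterDistro($name, $RootfsFilePath, $installPath)
    try {
        # The command output is returned to the caller.
        $Service.RunWslCommand($name, $Command)
    }
    finally {
        # Runs even if the command throws, so the sandbox does not linger.
        $null = $Service.UnregisterDistro($name)
    }
}

# Example (requires the WSL SDK and a downloaded root filesystem image):
# Invoke-InSandboxDistro -RootfsFilePath "$env:TEMP\alpine.tar.gz" -InstallRoot 'C:\Distro' -Command 'apk add vim'
```

Registration happens before the try block on purpose: if RegisterDistro itself fails, there is nothing to unregister.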

Just for fun, even in Microsoft Excel, you can interact with the WSL environment.

You can find a variety of WSL SDK sample code here: https://github.com/wslhub/wsl-sdk-com/tree/main/sample.

Future Roadmap

I recently created a GitHub Actions pipeline, thanks to the https://github.com/marketplace/actions/setup-wsl plugin and the GitHub team's decision to enable the WSL component in their Windows Server environment.

Because of that, I was able to set up continuous integration for the WSL SDK, which lets me release with more confidence. (https://github.com/wslhub/wsl-sdk-com/actions/workflows/wsl-sdk-com-build.yml) This work is a significant milestone on the WSL SDK roadmap.

Moreover, my to-do list includes the following goals.

  • Registration-free COM server (If possible)
  • ARM64 native support
  • Adopting the WSL SDK in one of my earlier projects (WSL Manager)
  • Various language wrappers (C#, Python, Go, PowerShell, or any other COM-supported language)

If you are interested in the WSL SDK project, please come and contribute to the GitHub repo. (https://github.com/wslhub/wsl-sdk-com)
