Powershell and Parsing html code in Core

When working on a PowerShell webservice, I came across an interesting problem that I think is only going to crop up more and more.  This webservice takes a json payload, performs some simple manipulation on the data, kicks off some automation, and logs some events.  Part of the payload is html code, however.  Of course the json payload doesn’t care, but when it was coming into PowerShell it was being interpreted as a string.  That meant that when you looked at the string, you literally saw html code:

<BODY><H3>OPEN Problem 256 in environment <I>EPG</I></H3>
                               <HR>
                               <B>1 impacted infrastructure component</B>
                               <HR>
                               <BR>
                               <DIV><SPAN>Process</SPAN><BR><B><SPAN style="FONT-SIZE: 120%; COLOR: #dc172a">bosh-dns-health</SPAN></B><BR>
                               <P style="MARGIN-LEFT: 1em"><B><SPAN style="FONT-SIZE: 110%">Network problem</SPAN></B><BR>Packet retransmission rate for process bosh-dns-health on host 
                               cloud_controller/<GUID> has increased to 18 %</P></DIV>
                               <HR>
                               Root cause
                               <HR>
                               
                               <DIV><SPAN>Process</SPAN><BR><B><SPAN style="FONT-SIZE: 120%; COLOR: #dc172a">bosh-dns-health</SPAN></B><BR>
                               <P style="MARGIN-LEFT: 1em"><B><SPAN style="FONT-SIZE: 110%">Network problem</SPAN></B><BR>Packet retransmission rate for process bosh-dns-health on host 
                               cloud_controller/<GUID> has increased to 18 %</P></DIV>
                               <HR>
                               
                               <P><A href="https://redacted.somewhere.com/e/<GUID>/#problems/problemdetails;pid=-<GUID>">Open in Browser</A></P></BODY> 

I wanted to put this data in the events I was logging, but I obviously didn’t want it to look like this.  There isn’t a PowerShell cmdlet for ‘ConvertFrom-HTML’, although that would be great.  I could have tried to parse the text and remove the formatting, but that would be a pain trying to escape the right characters and account for all of the html tags.  I found several ways online to handle this – namely creating a ‘HTMLFile’ object, writing the html code into it, and then extracting out the innertext property of the object.  This works fine in some environments, but throws a weird error if you don’t have Office installed.  Here is the code:

$HTML = New-Object -Com "HTMLFile"
$html.IHTMLDocument2_write($htmlcode)

And the error it throws when Office isn’t installed:

Method invocation failed because [System.__ComObject] does not contain a method named 'IHTMLDocument2_write'.

There is also a catch here – that error will also be thrown if you are trying this in PowerShell Core/Microsoft PowerShell (not Windows PowerShell – i.e. anything under 5.x)!  Fortunately, there is a quick fix:

$HTML = New-Object -Com "HTMLFile"
try {
    $html.IHTMLDocument2_write($htmlcode)
}
catch {
    $encoded = [System.Text.Encoding]::Unicode.GetBytes($htmlcode)
    $html.write($encoded)
}
$text = ($html.all | Where-Object { $_.tagname -eq 'body' } | Select-Object -Property innerText).innertext

A simple try/catch, and it works great.  I am going to create a new repo in GitHub and make this a new function.  Hopefully you haven’t wasted too much time trying to track this down – it took me a while, and it wasn’t until I stumbled upon a similar solution on StackOverflow (https://stackoverflow.com/questions/46307976/unable-to-use-ihtmldocument2) that I was able to wrap it up.  Expect the cmdlet/function soon!

Azure ARM Fragments

SCOM Trick adopted for Azure ARM

I am working on a project for a very popular conference – working on backend automation that controls everything from speaker coordination to session scheduling.  The current task at hand involves deploying the entire suite of Logic Apps, Azure Automation accounts, Azure Functions, etc…  These would all be deployed to a new Resource Group.  When deploying these resources for a new conference a few things change – changing the conference name from ‘Jazz’ to ‘Midway’ for example, or changing the backend data sources.  

Normally you would use Azure ARM Template Parameters to pass these values when you deploy the resources, and you would be absolutely right!  They are powerful assets to have in your pocket.  It does get a bit dodgy, however, when you start to deploy Logic Apps in the ARM template and those Logic Apps have parameters of their own.  

Logic Apps have parameters and variables all their own, and they are defined just like parameters and variables in ARM templates.  When you want to deploy a Logic App that has parameters you can put them in the ARM Template and reference them in the Logic App, or you can use an ARM template expression in the Logic App.  The latter is not a popular idea, and the former doesn’t evaluate the parameters until the execution of the Logic App.  That obviously makes it difficult to work on the Logic App after it’s deployed.

That’s when I got the idea of using a popular SCOM tool – SCOM Fragments by Kevin Holman – in order to get the best of both worlds.  I wanted a quick way to deploy, and a quick way to edit after deployment.  Cake and all that….

The idea is simplistic – Find/Replace what you want to change before you deploy.  Sounds simple, and it is!  That is essentially how the SCOM Fragments work, and the same idea can be utilized here.  Say, for example, you have a simple Conference Name parameter you want to change.  The first thing you would do is build your Logic App like normal, keeping in place a static conference name.  Export that template.  Now, at the top of the template, add a new parameter to the parameters section in the template, like this:

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "Conference_Name": {
            "metadata": {
                "defaultValue": "###Awesome_Conference_Name###",
                "description": "Name of the conference - i.e. Management Conference 2019"
            }
        },

Now, find your static conference name down in the code for the Logic App itself, and replace the name with ###Awesome_Conference_Name###.   That’s it!  That is all that is needed to prepare your template for a rapid deployment.  When you need to deploy this template, simply Find/Replace ###Awesome_Conference_Name### with whatever text you want – i.e. “My Super Conference 2019”.  It will update it both in the parameters section, and in the code itself.  Do we really need the Parameter at the top of ARM template, especially if we are just going to replace the text ourselves before we deploy?  That answer is no, but it does help immensely when keeping track when you have a ton of parameters:

{
    "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "Conference_Name": {
            "metadata": {
                "defaultValue": "###Awesome_Conference_Name###",
                "description": "Name of the conference - i.e. Management Conference 2019"
            }
        },
        "Sharepoint_Base_URL": {
            "defaultValue": "https://www.sharepoint.com/sites/Communications"
        },
        "Sharepoint_Speaker_ListName": {"defaultValue": "###Sharepoint_Speaker_ListName###"},
        "Sharepoint_Session_ListName": {"defaultValue": "###Sharepoint_Session_ListName###"},
        "Sharepoint_Session_Selection_Team_ListName": {"defaultValue": "###Sharepoint_Session_Selection_Team_ListName###"},
        "Sharepoint_Approved_Sessions_ListName": {"defaultValue": "###Sharepoint_Approved_Sessions_ListName###"},
        "Speaker_Agreement_URL": {"defaultValue": "###Speaker_Agreement_URL###"},
        "Session_Submission_FormName": {"defaultValue": "###Session_Submission_FormName###"},
        "Sched_Speakers_API": {"defaultValue": "###Sched_Speakers_API###"},
        "Sched_Sessions_API": {"defaultValue": "###Sched_Sessions_API###"},
        "Sched_Base_URL": {"defaultValue": "###Sched_Base_URL###"},
        "Sched_API_Key": {"defaultValue": "###Sched_API_Key###"},
        "MediaPack_URL": {"defaultValue": "###MediaPack_URL###"},

By putting the parameters in your code, even though you aren’t going to use them in the way they are intended, you can easily see which ones you need to replace in one place.