The RegEx Headache!
Grumpy Admin here – I am still facing an uphill battle at the moment; the urge to slap certain people or push them down a lift shaft followed by petrol and a match is getting increasingly hard to resist. Enough about my ex-wife! The same can be said about certain people and their PowerShell. No PowerShell is bad PowerShell; it might not be the best, but it is still PowerShell.
I am constantly fighting in the office to introduce more PowerShell into our working environment. The future is PowerShell. With Nano Server, Server Core and so on, PowerShell is important. People need to get with the times.
Something I saw made me think: WHY? Why do it like that when you can use the power of PowerShell? Take this for example.
I saw someone trying to extract the FQDN in a stupid and convoluted way: parsing the output from ipconfig and combining the hostname and primary DNS suffix strings…
After a deep breath, I took the reins and in 7 seconds gave him a string with the FQDN.
$FQDN=(Resolve-DnsName 10.0.0.4).NameHost
Simple. Then I showed him you can use the pipeline to make things a bit more expandable, as some hosts might have more than one DNS name or IP address.
(Get-NetIPAddress).IPAddress |%{(Resolve-DnsName $_ -ErrorAction SilentlyContinue).NameHost}
I added -ErrorAction SilentlyContinue to suppress the errors for any IP addresses that don’t have reverse pointers… This was just an example. I put it in after I ran the command for the first time. Grumpy Admin hates seeing a sea of red!
The lesson he took away was that there are better ways of doing things. Work smart, not hard! Remember this point: just because you can do something doesn’t mean you should, or that it is worth doing… consider this when you read the below!
Which brings me to the meat of my blog post today. I am sometimes guilty of this myself, but I noticed quite a few people, especially around the office, don’t bother validating user input in scripts. I am going to be honest, perhaps there isn’t much point. If you enter a server name or a file path wrong… it is wrong! But take for example a simple script that checks if a user has a profile in the profile share on the server!
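The check-profile script is simple:
function check-profile
{
    [CmdletBinding()]
    param (
        # Single strings rather than string arrays, so the path is built from one value each
        [string]$Username,
        [string]$Server )
    # The path must be quoted; the backtick keeps the $ in profile$ literal
    $UNC = "\\$Server\profile`$\$Username"
    $UNC
    if (Test-Path $UNC)
    {Write-Host "Profile Exists"}
    else
    {
        Write-Warning "Profile Not Found"
    }
}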
As you can see, it’s simple and it works. What if I put in an IP address instead of the server name? It works. What if I put in something like Server%42!, which is illegal?
It will say it can’t find the profile, which is correct, but therein lies the problem. The script isn’t validating the input. It is just trusting the user to have supplied the correct information. Do we trust our end users? So why should scripts trust their end users… i.e. us!
To the computer and to PowerShell, it is just data. To you and me, 10.0.0.1 is an IP address; we know this because of its format. We also know (or we should know) that 10.0.290.2 isn’t a valid IP address because 290 is out of range.
If we checked the validity of the user input beforehand, we could avoid running the script just for it to fail, or handle the problem in a more user-friendly manner. No one likes seeing a sea of red, do they! Not this Grumpy Admin.
If you didn’t already know, PowerShell supports regular expressions. These are powerful and some people think of them as a dark art form. They are quite simple really, but can build up to be complex and powerful!
The first thing is we have operators that are very powerful and can be used to greatly improve our scripts, operators like:
-match
-like
-notlike
-replace
-notmatch
-split
You can also use regex in the Select-String cmdlet, which again gives you great power! Note that -like and -notlike take wildcard patterns rather than regex; the others in the list take regular expressions.
And if that’s not all, PowerShell supports named captures, which makes things easier than dealing with index numbers when you iterate over matches! I think that is neat!
There are also case-sensitive and case-insensitive variants of the operators, such as:
-cmatch
-creplace
-ireplace
These operators either evaluate to true/false or extract a match and manipulate the string.
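To make the operators a bit more concrete, here is a minimal sketch. The sample strings and the capture names user and host are made up for illustration.

```powershell
# -match returns $true/$false and fills the automatic $Matches table.
$sample = 'hazzy@SERVER01'
if ($sample -match '^(?<user>[^@]+)@(?<host>.+)$') {
    $user     = $Matches['user']   # named capture instead of $Matches[1]
    $hostName = $Matches['host']   # ($host itself is a reserved variable)
}

# -match ignores case by default; -cmatch is the case-sensitive variant.
$insensitive = 'SERVER01' -match  '^server\d+$'   # $true
$sensitive   = 'SERVER01' -cmatch '^server\d+$'   # $false

# -replace and -split take a regex as well.
$masked = 'server01' -replace '\d', '#'   # 'server##'
$parts  = 'a1b22c' -split '\d+'           # 'a', 'b', 'c'
```

The named captures mean you can reorder the groups in the pattern later without breaking the code that reads them.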
If there is a situation where an operator such as -replace can’t do what you need, you can just drop down to the power of .NET and use the [regex] class. For example:
[regex]$expression = '[a-z]'
Write-Host "Password is" $($expression.Replace('secretpassword','X',11))
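As a rough comparison of the two approaches, the sketch below puts the -replace operator next to the instance Replace method of the [regex] class, which accepts a maximum number of replacements. The variable names are just for illustration.

```powershell
[regex]$expression = '[a-z]'

# Instance method: replace only the first 11 matches.
$limited = $expression.Replace('secretpassword', 'X', 11)   # 'XXXXXXXXXXXord'

# Operator: always replaces every match.
$all = 'secretpassword' -replace '[a-z]', 'X'               # 'XXXXXXXXXXXXXX'

# A [regex] object is case-sensitive unless told otherwise,
# whereas -replace ignores case by default.
$dotNetUpper   = $expression.Replace('SECRET', 'X')         # 'SECRET' (unchanged)
$operatorUpper = 'SECRET' -replace '[a-z]', 'X'             # 'XXXXXX'
```

The difference in case handling is worth remembering whenever you move a pattern between the operators and the .NET class.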
So let’s work on improving our check-profile script by including some regular expressions to evaluate the user input and error out if it fails the validity tests. Like I said above, there might not be a point and there might be a better way of doing things, but why not use this script to play with some regular expressions?
As ever, let’s attack this sensibly and define the validity of the data that we will be testing for!
The username string. Windows usernames: we know it will most likely contain letters A-Z, and it might contain digits 0-9 as well. But it won’t have spaces or special characters in it such as " / \ [ ] : ; | = , + * ? < >! The username cannot be just full stops (periods, as Americans call them) or spaces either!
Here is a link to a TechNet doc with accounts
We know there is a hard limit to the size of a username, but I am going to be realistic and limit the username to 20 characters. Also, while you can have a single letter or number as a username, I am going to set a minimum of 3 characters. This should capture most use cases!
We also know we can have full stops, underscores and hyphens in the username. So we will have to allow for them!
As we are checking user profile directories, there are some exceptions where Windows will append the domain name as an extension to the profile folder name if the folder already exists. For example, Hazzy.Hazzy. However, for this script we are going to ignore this condition, as we are just messing around really.
So let’s start to build our expression. We know we are going to have a-z
[a-z]
We also know we are going to have 0-9
[a-z0-9]
Then we add in our special characters
[a-z0-9\._-]
Hang on, you may have noticed I put a \ in there. That’s because, in regex, a . normally matches any single character, which would cause a problem, so I escape the . so that it is treated as an actual full stop. (Strictly speaking, inside square brackets a . is already literal, so the escape is harmless rather than required.) If we look back and remember our basic regex lessons… we know patterns float and need to be nailed down. We can do this by putting the caret symbol ^ at the start and the $ symbol at the end of the expression.
^[a-z0-9\._-]$
OK, now what about the length… well, we can use {min,max}, so our expression will look like this
^[a-z0-9\._-]{3,20}$
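Simple, except the a-z element of the expression only covers lowercase. PowerShell’s -match and -notmatch ignore case by default, so it would still work there, but to keep the pattern correct for case-sensitive use (such as -cmatch) we adjust it by adding A-Z as well.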
^[a-zA-Z0-9\._-]{3,20}$
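To see what the anchors, the length quantifier and the case handling each contribute, here is a small sketch. The variable names and test strings are made up for illustration.

```powershell
$loose  = '[a-z0-9\._-]'            # unanchored: true if ANY one character is allowed
$strict = '^[a-z0-9\._-]{3,20}$'    # anchored: the WHOLE string, 3 to 20 characters

$looseHit  = 'bad name!' -match $loose     # $true, it found the 'b'
$strictHit = 'bad name!' -match $strict    # $false, the space and ! are not allowed

$tooShort = 'ab' -match $strict            # $false, fewer than 3 characters
$tooLong  = ('a' * 21) -match $strict      # $false, more than 20 characters

# -match ignores case, so the lowercase-only class still accepts 'Hazzy';
# -cmatch is case-sensitive and rejects the capital H.
$mixedCase  = 'Hazzy' -match  $strict      # $true
$mixedCaseC = 'Hazzy' -cmatch $strict      # $false
```

Without the ^ and $ anchors, the pattern only proves that one acceptable character exists somewhere in the string, which is not much of a validity check.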
Now let’s add this to an if statement. We will use the -notmatch operator to generate a true or false to control the flow of the script
function check-profile
{
    [CmdletBinding()]
    param (
        # Single strings, so -notmatch returns a plain true/false
        [string]$Username,
        [string]$Server
    )
    # check username
    if ($Username -notmatch '^[a-zA-Z0-9\._-]{3,20}$')
    {
        Write-Warning "$Username isn't a valid username"
        return
    }
    # The path must be quoted; the backtick keeps the $ in profile$ literal
    $UNC = "\\$Server\profile`$\$Username"
    $UNC
    if (Test-Path $UNC)
    {Write-Host "Profile Exists"}
    else
    {Write-Warning "Profile Not Found"}
}
Right, as ever, Grumpy Admin likes to test these things out. So let’s do a quick logic table to test our script
Excellent, the results of the test match my predictions, so I must be doing something right… So we know that we can now validate our username input, and provide some feedback if there is an issue!
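A logic table like that can be sketched in code roughly as follows. Test-UsernameFormat is a hypothetical helper that wraps only the pattern, so it runs without a server, and all the sample usernames are made up.

```powershell
# Hypothetical helper: just the username pattern from this post.
function Test-UsernameFormat {
    param ([string]$Username)
    $Username -match '^[a-zA-Z0-9\._-]{3,20}$'
}

# Input => predicted result.
$cases = [ordered]@{
    'Hazzy'                  = $true
    'grumpy.admin'           = $true
    'svc_backup-01'          = $true
    'ab'                     = $false   # shorter than 3 characters
    'grumpy admin'           = $false   # contains a space
    'user%42!'               = $false   # illegal characters
    'a_very_long_username_x' = $false   # 22 characters
    '...'                    = $true    # passes the pattern, although Windows would reject it
}

$results = foreach ($name in $cases.Keys) {
    $actual = Test-UsernameFormat -Username $name
    [pscustomobject]@{
        Username = $name
        Expected = $cases[$name]
        Actual   = $actual
        Pass     = ($actual -eq $cases[$name])
    }
}
$results | Format-Table -AutoSize
```

The last case is a reminder that the pattern is only a format check: a username made up entirely of full stops still slips through.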
Now what about the computer name/server? Again we know some things about the server: we are going to be using it inside a UNC path, so we know some things about the UNC format, don’t we!
We know that we
Can use the NetBIOS name/hostname, which has a limit of 15 characters and does support some special characters. As most NetBIOS names/hostnames have to be TCP/IP compliant to fit in with the relevant RFC, we will ignore the special characters that are technically allowed!
Can use the DNS name of the computer
Can use the IP address of the computer
Check these links for more information about the validity of usernames and hostnames.
So we need to check for all of these /o\. I expect I will skip some checks, like a . as the host name! Meh! This again is just a demo, right? If you can improve any of this, please please submit a correction and comment below. We are all students!
The NetBIOS name/hostname is easy
^[a-zA-Z0-9-]{1,15}$
The IPv4 regex is a common one, so why reinvent the wheel when you can use Google and cut and paste? Read through it so you can see what is happening in there! I got my cut and paste from. After all, I am a Lazy Grumpy Admin!
The best thing to do with multiple selections or cases is to use a switch in PowerShell. The PowerShell switch statement has a -regex parameter, so instead of an exact comparison for each case, we can use regex
So we can do the following
switch -regex ($server)
{
    # Single quotes keep PowerShell from trying to expand anything inside the patterns
    '^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'
    {
        Write-Verbose "$server MATCHED IP Address"
        break
    }
    '^[a-zA-Z0-9-]{1,15}$'
    {
        Write-Verbose "$server MATCHED NETBIOS FORMAT"
        break
    }
    '^[a-zA-Z0-9\.-]{1,254}$'
    {
        Write-Verbose "$server MATCHED DNS NAME"
        break
    }
    default {Write-Warning "$server doesn't conform to UNC standard"; return}
}
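The break in each case matters more than it might look. As a small sketch (the server name is made up), here is what switch -regex does with and without it:

```powershell
$name = 'SERVER01'

# Without break, switch keeps testing the remaining patterns,
# so a short name matches both the NetBIOS and the DNS case.
$withoutBreak = switch -regex ($name) {
    '^[a-zA-Z0-9-]{1,15}$'    { 'NetBIOS' }
    '^[a-zA-Z0-9\.-]{1,254}$' { 'DNS' }
}
# $withoutBreak is 'NetBIOS', 'DNS'

# With break, the first matching case wins.
$withBreak = switch -regex ($name) {
    '^[a-zA-Z0-9-]{1,15}$'    { 'NetBIOS'; break }
    '^[a-zA-Z0-9\.-]{1,254}$' { 'DNS'; break }
}
# $withBreak is 'NetBIOS'
```

This is also why the order of the cases is important: the most specific pattern needs to come first.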
So the basic logic is this: if the input doesn’t match any of the patterns we want, the function writes our warning to the screen and returns.
So, as with most things Grumpy Admin does, let’s give this a quick test
Now this does highlight a slight issue: an invalid IP address such as 10.0.290.2 gets flagged as a DNS name, because it could actually be a valid DNS name. So while that is unlikely, I am going to let them through. If you can think of a better way then let me know!
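One possible way to tighten this up, offered only as a sketch, is an extra case straight after the IPv4 one that rejects anything made up purely of digits and dots. It assumes no real host in your environment has such a name, which is my assumption and not a rule. Get-ServerNameKind is a hypothetical helper name.

```powershell
# Hypothetical helper: returns a label instead of writing verbose output.
function Get-ServerNameKind {
    param ([string]$Server)
    switch -regex ($Server) {
        '^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$' { 'IPv4'; break }
        # Assumption: digits and dots only, but not a valid IPv4 address, is a typo.
        '^[0-9.]+$'               { 'Invalid'; break }
        '^[a-zA-Z0-9-]{1,15}$'    { 'NetBIOS'; break }
        '^[a-zA-Z0-9\.-]{1,254}$' { 'DNS'; break }
        default                   { 'Invalid' }
    }
}

Get-ServerNameKind -Server '10.0.0.4'                # IPv4
Get-ServerNameKind -Server '10.0.290.2'              # Invalid
Get-ServerNameKind -Server 'SERVER01'                # NetBIOS
Get-ServerNameKind -Server 'server01.corp.example'   # DNS
Get-ServerNameKind -Server 'Server%42!'              # Invalid
```

The trade-off is that a name like 12345 is now rejected as well, so this only makes sense if all-numeric names are not in use.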
So we have improved our script with some basic validation. The script is still not to my liking, as we still have something hard coded. The profile$ share is hard coded and not configurable. So let’s change that, shall we? Of course we now need to validate this share!
^[a-zA-Z0-9\._\-\$\\]{1,254}$ should do the trick! Throw in the possible length and we should be good to go with this one.
Now, our script looks like this
function check-profile
{
    [CmdletBinding()]
    param (
        # Single strings, so each check sees exactly one value
        [string]$Username,
        [string]$Server,
        [string]$Share = 'profile$'
    )
    # check username
    if ($Username -notmatch '^[a-zA-Z0-9\._-]{3,20}$')
    {
        Write-Warning "$Username isn't a valid username"
        return
    }
    # check server: IP address first, then NetBIOS name, then DNS name
    switch -regex ($Server)
    {
        '^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$'
        {
            Write-Verbose "$Server MATCHED IP Address"
            break
        }
        '^[a-zA-Z0-9-]{1,15}$'
        {
            Write-Verbose "$Server MATCHED NETBIOS FORMAT"
            break
        }
        '^[a-zA-Z0-9\.-]{1,254}$'
        {
            Write-Verbose "$Server MATCHED DNS NAME"
            break
        }
        default {Write-Warning "$Server doesn't conform to UNC standard"; return}
    }
    # check share name
    if ($Share -notmatch '^[a-zA-Z0-9\._\-\$\\]{1,254}$')
    {
        Write-Warning "$Share isn't a valid share name"
        return
    }
    # The path must be quoted so the backslashes are treated as text
    $UNC = "\\$Server\$Share\$Username"
    $UNC
    if (Test-Path $UNC)
    {Write-Host "Profile Exists"}
    else
    {Write-Warning "Profile Not Found"}
}
We already know the server and username validation checks work; however, it would be prudent to do some testing with the profile share name. It is important to note we are just checking the validity of the share name, not whether the share is present. That is done by the Test-Path element of the script, which would just report to the end user that the profile wasn’t there if the server, share or profile elements were not found.
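As a rough idea of what that testing could look like, this sketch runs a few made-up share names through the share pattern only, without touching any server.

```powershell
$sharePattern = '^[a-zA-Z0-9\._\-\$\\]{1,254}$'

# Sample inputs are invented for illustration.
$shareResults = 'profile$', 'profiles\roaming', 'bad share?', '' | ForEach-Object {
    [pscustomobject]@{
        Share = $_
        Valid = ($_ -match $sharePattern)
    }
}
$shareResults | Format-Table -AutoSize
# profile$          True
# profiles\roaming  True   (the pattern allows backslashes, so sub-folders pass)
# bad share?        False
# (empty string)    False
```

Because the backslash and the full stop are both allowed, something like ..\.. would also pass, so treat this as a format check and nothing stronger.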
As you can see, that works… so we are validating that all 3 user inputs are in the correct(ish, there is always room for improvement) format… What if we wanted to do some individual validation on each item? Well, we have the structure in place in this script to do so! But then are we writing a PowerShell script to solve a problem, or are we writing reusable code? That will be the topic of another blog post, I guess!
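If you did want per-parameter validation, one option PowerShell offers is the ValidatePattern attribute, which checks each value as it is bound. This is a minimal sketch rather than a rewrite of the script: Get-ProfileUncPath is a hypothetical name, it only builds the path, and the server pattern is simplified to the DNS one.

```powershell
function Get-ProfileUncPath {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory)]
        [ValidatePattern('^[a-zA-Z0-9\._-]{3,20}$')]
        [string]$Username,

        [Parameter(Mandatory)]
        [ValidatePattern('^[a-zA-Z0-9\.-]{1,254}$')]
        [string]$Server,

        [ValidatePattern('^[a-zA-Z0-9\._\-\$\\]{1,254}$')]
        [string]$Share = 'profile$'
    )
    # Only reached when every parameter passed its pattern.
    "\\$Server\$Share\$Username"
}

$good = Get-ProfileUncPath -Username 'Hazzy' -Server 'SERVER01'

# A bad value is rejected before the function body runs.
$rejected = $false
try {
    Get-ProfileUncPath -Username 'bad name!' -Server 'SERVER01'
}
catch {
    $rejected = $true
}
```

The downside is that a failed ValidatePattern produces an error (the sea of red again), so you would wrap the call in try/catch if you want a friendlier message.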
So in summary, I have taken someone else’s code from the office and made it a bit more friendly. It is not perfect, it is still horrible code and perhaps there are better ways, but I wanted to demonstrate using regex in an if statement with the -match and -notmatch operators, and inside a switch statement.
Hopefully it is of use to someone… This was just a very minor use case to explore these topics… I am by no means a regex expert. One of the best tools I have found for testing regular expressions is the following site:-
The real question you should be asking is… why bother? Typing check-profile with its parameters takes longer than typing test-path <UNC HERE>… meh! Sometimes you have just got to “KISS” things…
Hazzy