3.3. Store Information in Variables

So far we've been building increasingly complex command-line sequences to solve problems. Even with only the aspects of MSH that we've discussed thus far, it's clear that we've already got quite a toolkit. However, in some cases, it becomes important to maintain some state while processing is underway. Think about making modifications to a subset of thousands of files and needing to keep a record of those that have changed. What if you had a lookup mechanism that could associate a person's full name, email address, and other details with his or her username? Or consider a scenario in which you want to take the output of one pipeline and feed it into two or more separate ones.

Whether data is needed for decision making during execution or will be used as part of a result or audit after completion, we need somewhere to keep this data for later use. Variables are the mechanism by which MSH allows us to maintain state during processing. Variables are a place to store objects; along with the data they hold, they maintain a sense of the objects' type and structure. Variables can be used to store just about anything, from numbers to filenames to sequences, all the way up to collections of objects.

We'll begin here with a few simple scenarios to become familiar with variable usage and syntax. By the end of the chapter, we'll bring everything together and see how variables, combined with conditional tests and functions, add an extra layer of versatility for solving problems.

Variables Have Types, Too

Variables provide a place to store objects and can be used to store almost any piece of data used in the command shell. Instead of relying solely on the pipeline for moving data around, we can store objects for later use.

Although it isn't necessary to say exactly what information a variable will store beforehand, as soon as a variable is assigned a value, it takes on a type. Variables are objects, too, and they can be used in the pipeline and acted upon by cmdlets such as get-member. They also support the dot notation for accessing their properties.
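As a minimal sketch of this behavior (the variable names are illustrative only, and the example assumes the standard .NET GetType() method is available on the stored objects), the following shows a variable taking on a type at assignment and exposing its properties through dot notation:

```powershell
# Illustrative names only; nothing is declared ahead of time.
$count = 42                 # assigning an integer gives the variable an integer type
$greeting = "Hello"         # assigning text gives it a string type

$count.GetType().Name       # Int32
$greeting.GetType().Name    # String

$greeting.Length            # dot notation reads a property of the stored object: 5

$count = "forty-two"        # a later assignment can change the type
$count.GetType().Name       # String
```

Piping either variable to get-member (for example, $greeting | get-member) lists the properties and methods that its current type offers.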

3.3.1. How Do I Do That?

Variables in MSH are identified with the $ prefix. This helps separate variables from cmdlets, aliases, filenames, and other identifiers used in the shell. There is no forced naming convention for variables, although it's always good practice to use a name that actually represents the information stored within; if nothing else, it can dramatically help readability and understanding of scripts later on. Variable names are always case-insensitive, and they can contain any combination of alphanumeric characters (A-Z and 0-9), as well as the underscore (_) character.

Let's start with some simple arithmetic and a variable called number:

     MSH D:\MshScripts> $number = 1       # create a variable and give it a value
     MSH D:\MshScripts> $number
     1
     MSH D:\MshScripts> $number = 4*6     # replace the value
     MSH D:\MshScripts> $number
     24
     MSH D:\MshScripts> $number += 2      # update the variable by adding two
     MSH D:\MshScripts> $number
     26
     MSH D:\MshScripts> $number++         # increment the variable
     MSH D:\MshScripts> $number
     27

Text strings are treated in a similar fashion. Many of the same operators (such as addition and multiplication) translate directly:

     MSH D:\MshScripts> $myString = "Hello"
     MSH D:\MshScripts> $myString += ", World!"
     MSH D:\MshScripts> $myString
     Hello, World!
     MSH D:\MshScripts> $myString = "Hello" * 6
     MSH D:\MshScripts> $myString
     HelloHelloHelloHelloHelloHello

With the basics in place, we can look at some of the richer data structures that variables enable. Arrays give us the power to store many pieces of information within a single variable. Simple arrays contain a sequence of data, often, but not necessarily, of the same type. We'll look at a sequence of numbers here, but it's important to remember that an array could just as easily be used to store a list of filenames for further processing or a sequence of ProcessInfo objects:

     MSH D:\MshScripts> $arr = @(1,2,3,4)
     MSH D:\MshScripts> $arr           # list the contents, one element per line
     1
     2
     3
     4
     MSH D:\MshScripts> $arr.Count
     4
     MSH D:\MshScripts> $arr[2]        # get the third element ([0] is the first)
     3
     MSH D:\MshScripts> $arr = @(1,2)
     MSH D:\MshScripts> $arr += 3
     MSH D:\MshScripts> $arr
     1
     2
     3
     MSH D:\MshScripts> $arr = @(("a", "b"), 2, 3)
     MSH D:\MshScripts> $arr[0]
     a
     b

Another useful data structure we'll begin to rely on is the hashtable, a special type of array where one value (the key) is associated directly with another (the value). Think of a hashtable like a dictionary: the key is the word you're looking up and the value is the definition that lives beside it.

Hashtables can be very useful for storing related data because, when provided with a key, they allow very fast lookup of a piece of associated information. In the following case, we build a hashtable of the relationship between machine names and their owners. When provided with a key, the associated value is easily retrieved from a hashtable:

     MSH D:\MshScripts> $h = @{}                # create a new, empty hashtable
     MSH D:\MshScripts> $h["hermes"] = "Andy"
     MSH D:\MshScripts> $h["zeus"] = "Andy"
     MSH D:\MshScripts> $h["hades"] = "John"
     MSH D:\MshScripts> $h["poseidon"] = "Bob"
     MSH D:\MshScripts> $h

     Key                            Value
     ---                            -----
     hermes                         Andy
     zeus                           Andy
     hades                          John
     poseidon                       Bob

     MSH D:\MshScripts> $h["zeus"]
     Andy

Variables can be used to store just about any data structure used in the shell, including items that are passed through the pipeline. We can fill a variable using the same assignment operator (=) or by using the set-variable cmdlet at the end of a pipeline. It's often convenient to store information in a variable and then use it as the source for building a pipeline.

     MSH D:\MshScripts> $allProcesses = get-process
     MSH D:\MshScripts> $allProcesses | format-table Name

     Name
     ----
     alg
     CcmExec
     csrss
     explorer
     Idle
     lsass
     msh
     ...

     MSH D:\MshScripts> get-process | where-object { $_.HandleCount -gt 200 } | set-variable -Name MyProcesses
     MSH D:\MshScripts> $MyProcesses.Count
     12
     MSH D:\MshScripts> $MyProcesses[0]

     Handles  NPM(K)    PM(K)      WS(K) VS(M)   CPU(s)     Id ProcessName
     -------  ------    -----      ----- -----   ------     -- -----------
         840      14    14572      15444    77   547.39   1656 CcmExec

One final important thing to realize is that for value types (such as integers, strings, and arrays), assignment is done by value. Let's look at how this works; we'll see the reasoning behind it in just a moment (as before, the count of 93 returned from get-alias may differ depending on whether you have set up any additional aliases):

     MSH D:\MshScripts> $a = 10
     MSH D:\MshScripts> $b = $a
     MSH D:\MshScripts> $b
     10
     MSH D:\MshScripts> $a = 0
     MSH D:\MshScripts> $b
     10
     MSH D:\MshScripts> $aliasesVar = get-alias     # $aliasesVar is an array
     MSH D:\MshScripts> $aliasesVar.Count
     93
     MSH D:\MshScripts> new-alias wo where-object   # now 94 registered aliases
     MSH D:\MshScripts> $aliasesVar.Count
     93

In contrast, when a reference type like a hashtable, Process object, or FileInfo object is assigned to a variable, assignment happens by reference. Again, let's first see how this works and understand the behavior in a moment:

     MSH D:\MshScripts> $ht=@{a=10;b=20}
     MSH D:\MshScripts> $ht

     Key                            Value
     ---                            -----
     a                              10
     b                              20

     MSH D:\MshScripts> $otherht = $ht
     MSH D:\MshScripts> $otherht["c"] = 30
     MSH D:\MshScripts> $ht

     Key                            Value
     ---                            -----
     a                              10
     b                              20
     c                              30

3.3.2. What Just Happened?

There are many ways to define and update variables. Basic assignments using the assignment operator (=) simply take the value from the righthand side and associate it with the variable name on the left. A series of other operators, known as compound assignment operators, can be used to update existing variables based on their current value. Of these, we saw +=, which takes the current value of the variable, adds the righthand side to it, and updates the variable with its new value. Likewise, *= works the same way but performs multiplication instead of addition. Table 3-1 lists the common compound assignment operators.

Table 3-1. Compound assignment operators
Assignment operator	Equivalent to	Effect
$x += $y	$x = $x + $y	Add and assign
$x -= $y	$x = $x - $y	Subtract and assign
$x *= $y	$x = $x * $y	Multiply and assign
$x /= $y	$x = $x / $y	Divide and assign

The other common mechanisms for updating numeric variables are the increment and decrement operators ($x++ and $x--, respectively). Equivalent to doing $x+=1 or $x-=1, these operators are a convenient shorthand.
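To see the compound, increment, and decrement operators side by side, here is a short sketch that applies each one in turn to a single variable. The name $x and the starting value are arbitrary; the comments track the value after each step:

```powershell
$x = 10
$x += 5      # add and assign:      10 + 5 = 15
$x -= 3      # subtract and assign: 15 - 3 = 12
$x *= 2      # multiply and assign: 12 * 2 = 24
$x /= 4      # divide and assign:   24 / 4 = 6
$x++         # increment:           6 + 1 = 7
$x--         # decrement:           7 - 1 = 6
$x           # displays 6
```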

Global variables, such as the ones we've seen here, are available from the moment they are defined until the shell is closed and are not persisted between different MSH sessions. There are other types of variables (such as local and script-scoped) that we'll explore at the start of Chapter 4.

3.3.2.1. Working with arrays and hashtables

An array is index-based, which means that each element is stored in sequence and can be accessed at a numeric location. The first item in the array has an index of zero, the second an index of one, and so on. An individual element is retrieved using square brackets containing the index number and appearing immediately after the array (e.g., $arr[13]).

Arrays can either be populated with values when first defined (with the @(element1, element2, element3) syntax) or updated after creation. Extending an array to contain a new element is as simple as using the += compound operator. Multiple elements may be added at once by adding two arrays together, forming a single array that comprises the elements of both. The range operator .. can also be used to fill an array with a range of values:

     $a = @(1,2,3)
     $b = @(4,5,6)
     $c = $a+$b         # equivalent to $c=@(1,2,3,4,5,6)
     $d = @(1..6)       # equivalent to $d=@(1,2,3,4,5,6)

Hashtables have obvious similarities to arrays with one significant difference: instead of using a numeric index, they use an arbitrary key. For this reason, it's not possible to iterate through the contents of a hashtable using a numeric index. Instead, hashtables expose two collections, Keys and Values, that contain the information stored within.
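As a rough sketch of how the Keys and Values collections stand in for a numeric index, the following walks a small hashtable by key. The $owners name and its contents are made up for illustration, and the order in which entries come back should not be relied upon:

```powershell
# Hypothetical machine-to-owner table, in the spirit of the earlier example.
$owners = @{hermes="Andy"; zeus="Andy"; hades="John"}

$owners.Keys            # the machine names
$owners.Values          # the owners
$owners.Count           # 3

# Visit every entry by key; $_ holds the current key inside the script block.
$lines = @($owners.Keys | foreach-object { $_ + " belongs to " + $owners[$_] })
$lines
```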

Like arrays, hashtables can be built up progressively as we saw previously, populated with data at creation time, or they can have multiple key-value pairs added at once through "hashtable addition":

     $ht = @{hermes="Andy"; zeus="Andy"; hades="John"; poseidon="Bob"}
     $ht += @{athena="Bob"; apollo="John"}

3.3.2.2. Special values

The special value $null is used to represent a null or undefined value. The null value is very flexible. If you add a number to it, it behaves as if it were zero. If you try to append a string to it, it will behave as if it were an empty string. Add a hashtable or array to null, and it behaves as if it were an empty hashtable or array. Setting a variable to null clears any previous value it might have held.
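The flexibility of the null value can be sketched in a few lines. The variable names below are illustrative only; each line adds a different kind of value to $null and keeps the result:

```powershell
$sum   = $null + 5            # $null behaves like zero
$text  = $null + "abc"        # $null behaves like an empty string
$list  = $null + @(1,2)       # $null behaves like an empty array
$table = $null + @{a=1}       # $null behaves like an empty hashtable

$sum                          # 5
$text                         # abc
$list.Count                   # 2
$table["a"]                   # 1

$sum = $null                  # setting to null clears the previous value
$sum -eq $null                # True
```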

An empty array (one that contains zero elements) is represented by the @() syntax. Similarly, the @{} syntax (note the curly braces) is used to create an empty hashtable.

These special values are summarized in Appendix A.

3.3.2.3. By-value assignment

In effect, when an assignment is made with a value type, the variable is given a copy of its assignment at that instant. For simple variable types, such as numbers, this behavior is generally intuitive. However, in cases where cmdlets are used to populate an array of results, it's vital to remember that the cmdlet was run once, at assignment time, and any further use of the variable content is based on that copy. If you need the absolute latest output of a cmdlet, it is best to run it again rather than rely on a potentially old and out-of-date copy stored in a variable.

3.3.2.4. By-reference assignment

In the last example, we saw that even though the third element was added to the $otherht hashtable, it looked as though it was added to both $ht and $otherht. This difference from by-value assignment is easier to understand if you realize that there is just a single hashtable and both $ht and $otherht are references (or pointers) to it. Creating a third variable ($thirdht=$otherht) simply creates another reference; now there are three variables pointed at the same object.
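The three-reference case described above can be sketched as follows, reusing the $ht, $otherht, and $thirdht names from the text. A change made through any one name is visible through the other two, because all three point at the same hashtable:

```powershell
$ht = @{a=10; b=20}
$otherht = $ht              # second reference to the same hashtable
$thirdht = $otherht         # third reference; there is still only one hashtable

$thirdht["c"] = 30          # add an entry through one name...
$ht["c"]                    # ...and read it back through another: 30

$otherht.Remove("a")        # removals are shared in the same way
$ht.Count                   # 2
$thirdht.Count              # 2
```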

3.3.3. What About...

...What happens if you use the same variable name in two places? Will they interfere with each other? It depends. We'll come back to some more advanced aspects of variables in the next chapter to see how MSH controls the access and visibility of variables through a system called scoping.

...Using $_ as a valid variable name? $_ is a valid variable name, although, as we saw earlier, the $_ variable has special meaning inside certain script blocks. With some cmdlets, such as foreach-object and where-object, a script block is used to operate on or make decisions about pipeline objects. In these cases, the $_ variable is pre-populated and contains the current pipeline object when the script block is run. Although it's valid to use $_ for other purposes, it's a better idea to use a more descriptive variable name outside of these cases.
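A small sketch of the pre-populated $_ variable, using a range of numbers as the pipeline input. The $big and $scaled names are illustrative only:

```powershell
# where-object: $_ is each incoming number in turn; keep those greater than 3.
$big = @(1..6 | where-object { $_ -gt 3 })
$big                 # 4, 5, 6 (one per line)

# foreach-object: $_ is again the current pipeline object.
$scaled = @(1..3 | foreach-object { $_ * 10 })
$scaled              # 10, 20, 30 (one per line)
```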

...Using a variable before it has been assigned a value? Technically, it can be done, although it's rarely a good idea. If you use a variable that hasn't yet been assigned a value, you'll get a null value back and won't see any errors. This may seem to work out fine in some cases, but always consider whether your script will behave in the same way if the variable does have a value. In practice, it's always a good idea to set any variables to a known value (such as zero or an empty string) before using them.
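The following sketch contrasts the two approaches. It assumes that $neverAssigned and $runningTotal are hypothetical names that have not been given a value earlier in the session:

```powershell
# Reading an unassigned variable yields a null value and no error.
$neverAssigned -eq $null        # True

# Updating an unassigned variable silently starts from null (acting like zero).
$runningTotal += 5
$runningTotal                   # 5, but only because nothing was stored before

# Safer: set a known starting value first.
$runningTotal = 0
$runningTotal += 5
$runningTotal                   # 5, regardless of any earlier value
```

Had $runningTotal already held 100 from earlier work, the first update would have produced 105 with no warning, which is why the explicit initialization is worth the extra line.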

...Seeing which variables you've defined? Yes. In the same way that we can browse through the registry like a filesystem, MSH creates a special drive that represents the assigned variables. You can use the standard commands to list the names and current values of any defined variables:

     MSH D:\MshScripts> set-location Variable:
     MSH D:\MshScripts> get-childitem

...Removing elements from a hashtable? The hashtable is a rich class that exposes several methods for manipulating the content stored within (all of which can be discovered with $ht | get-member -MemberType Method). The Remove method takes a key and removes it from the set, whereas the Clear method can be used to empty the hashtable entirely:

     MSH D:\MshScripts> $ht = @{a=10;b=20;c=30}
     MSH D:\MshScripts> $ht.Remove("a")
     MSH D:\MshScripts> $ht

     Key                            Value
     ---                            -----
     b                              20
     c                              30

     MSH D:\MshScripts> $ht.Clear()

3.3.4. Where Can I Learn More?

The language reference guides built into get-help offer a full list of arithmetic and assignment operators, as well as some more details on the various data structures available:

     get-help about_Arithmetic_Operators
     get-help about_Assignment_Operators
     get-help about_Array