1.4. Create a Pipeline to Pass Information

The cmd.exe command shell has always supported the idea of redirecting the output of a process to a different location. The vertical bar, or pipe symbol (|), is used to create this pipeline. When present, it instructs the shell to redirect the output of one command to the input stream of another, effectively chaining the commands together. For example, while the command type win.ini | more is a familiar way of paging through a long text file, what's really happening is that the output of the type command, which lists the contents of a file in its entirety, is being piped to the more command, which knows how big the screen is and how to pause when it's full.

The pipeline is not a new concept; it is the glue that most command shells use for passing data between different processes. However, MSH takes the concept one step further. Instead of passing simple text streams between steps (the method used by almost all command shells today), MSH cmdlets and scripts communicate by passing strongly typed objects that carry both the information and its structure.

Passing strongly typed data has some significant advantages. Flat text files are rarely the best way to represent structured data, as there is only so much information that can be captured in a line-by-line text listing. Historically, when information is transferred between two processes in textual format, there has to be some mutually agreed upon encoding; maybe the generator will always output information in some sequence that the receiver must then parse to recreate the structure for processing. Authors must either add complexity to their tool by generating both human-readable and machine-readable output, or they end up forcing the script writer to use a parsing tool such as AWK to extract meaning from the textual output. Such parsers can be difficult to write, are prone to failure with minor tool changes, and cannot always handle international characters. Clearly, this tightly binds the two tools together, requiring code in both to handle the interchange. This is often so restrictive that the tools cannot be used for any other purpose. With MSH passing structured data instead of text, much of this encoding and decoding effort disappears and arbitrary pipelines become a reality.
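To make the contrast concrete, here is a minimal sketch written in Python rather than MSH, purely as a language-neutral model. The Proc record, the parse_line helper, and the three-column line layout are all assumptions invented for this illustration; they are not part of MSH or of any real tool's output format.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    """Hypothetical record standing in for a structured process object."""
    name: str
    handles: int


def parse_line(line: str) -> Proc:
    """Text approach: the consumer must know the column order in advance."""
    fields = line.split()
    # Assumed layout: handle count first, process name last.
    return Proc(name=fields[-1], handles=int(fields[0]))


# Text pipeline: the structure has to be rebuilt by parsing each line.
text_output = ["624 1656 CcmExec", "407 464 csrss"]
from_text = [parse_line(line) for line in text_output]

# Object pipeline: the structure travels with the data, so there is no parsing step.
from_objects = [Proc("CcmExec", 624), Proc("csrss", 407)]

print(from_text == from_objects)  # prints True
```

Both routes end with the same records, but only the text route depends on the column order: if the producing tool moved the handle count to a different column, parse_line would have to change, while a consumer reading the handles property would not.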

The pipeline empowers MSH through composition. Composition, in this sense, refers to the way in which we can combine small functional units together, creating something altogether more useful. As we discussed earlier, cmdlets are designed to do a simple task well: for example, listing processes, sorting, or filtering. Let's take a look at how we can pipeline cmdlets together to list, filter, and sort the process list without the get-process cmdlet ever having to know what sorting is.

The Power of Composition

The phrase "the whole is greater than the sum of the parts" suits the idea of pipeline construction and composition very well. The pipeline makes it possible to take a number of small, effective components and combine them together to form a richer processing engine.

It's usually possible to break a task down into several distinct steps. Instead of trying to attack the whole lot at once, we can look at the first step and get it working fully. With that in place, we can focus all attention on the next step, and so on. Building up a pipeline in this fashion tends to be less error-prone and, over time, becomes an efficient way to develop and reuse scripts for many different purposes.


1.4.1. How Do I Do That?

As we're already comfortable with the get-process cmdlet, we'll use that as a starting point. We'll create a pipeline with the | symbol and introduce the where-object cmdlet to apply a test to each object as it passes through the pipe. If the object satisfies the test criteria, it will continue on, in this case, to be shown in the console:

     MSH D:\MshScripts> get-process | where-object { $_.Handles -gt 200 }
     Handles  NPM(K)    PM(K)      WS(K) VS(M)   CPU(s)     Id ProcessName
     -------  ------    -----      ----- -----   ------     -- -----------
         624      13    10548      15756    65    25.01   1656 CcmExec
         407       5     1684       3420    23    22.71    464 csrss
         274      11     7376      12696    55   565.91    212 explorer
         404      10     4472       2376    42    16.12    544 lsass
         282      12    35028      32416   176    21.93   3088 msh
         260       6     1276       2864    24    14.54    532 services
        1709      52    18092      24888   103    62.37    824 svchost
         209       6     2080       4320    36     4.80    940 svchost
         262      14     1500       3988    34    11.43    756 svchost
         284       0        0        216     2    77.96      4 System
         551      61     7332       4136    51    19.24    488 winlogon
         225       8     6364       7888    66     3.00   1708 wuauclt

Pipelines aren't limited to two stages. Now that we've established the set of processes that have more than 200 open handles, we can pipe that set into another cmdlet that will sort the output based on handle count:

     MSH D:\MshScripts> get-process | where-object { $_.Handles -gt 200 } | sort-object Handles
     Handles  NPM(K)    PM(K)      WS(K) VS(M)   CPU(s)     Id ProcessName
     -------  ------    -----      ----- -----   ------     -- -----------
         209       6     2080       4320    36     4.80    940 svchost
         260       6     1276       2864    24    14.54    532 services
         262      14     1500       3988    34    11.64    756 svchost
         274      11     7376      12696    55   580.04    212 explorer
         284       0        0        216     2    78.89      4 System
         405      10     4472        588    42    16.28    544 lsass
         407      12    34632      33052   175    23.16   3088 msh
         408      12    18432      19221    99    20.01   3089 msh
         414       5     1684       3420    23    24.42    464 csrss
         551      61     7332       4136    51    19.34    488 winlogon
         618      13    10352      15740    64    25.38   1656 CcmExec
        1748      53    18312      24964   105    63.40    824 svchost

Sometimes it's convenient to group objects by some property rather than just sort them. Like sort-object, the group-object cmdlet takes a property name as a parameter to organize its output. For example, let's view the same list, but this time, group together the processes by name:

     MSH D:\MshScripts> get-process | where-object { $_.Handles -gt 200 } | group-object ProcessName
     Count Name                      Group
     ----- ----                      -----
         3 svchost                   {svchost, svchost, svchost}
         2 msh                       {msh, msh}
         1 CcmExec                   {CcmExec}
         1 csrss                     {csrss}
         1 explorer                  {explorer}
         1 lsass                     {lsass}
         1 services                  {services}
         1 System                    {System}
         1 winlogon                  {winlogon}

1.4.2. What Just Happened?

The three new cmdlets we've covered here are all similar in their behavior. At a high level, they all examine the objects in the pipeline and put some or all of them back into the pipeline, possibly in a different order or arrangement. The where-object cmdlet controls whether an object continues through the pipeline or is dropped. In contrast, the sort-object cmdlet will output every object it sees, but it may do so in a different order after it has had the opportunity to rearrange the objects. Meanwhile, the group-object cmdlet will allow all objects to pass through but will do so after placing each object in a container related to the grouping property.

The $_ notation can be read as "this." When used in the script block for the where-object test, it refers to the current object in the pipeline. The dot notation, $_.Handles, is used to access the properties of the object for this test. We'll look at objects and their properties in more detail shortly.
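The division of labor among the three cmdlets can be modeled outside MSH. The following is a rough Python sketch, not a description of how MSH implements them: the where, sort_by, and group_by functions and the Proc record are hypothetical names chosen for this example, and the lambda parameter p plays the role that $_ plays in an MSH script block. A few handle counts are borrowed from the listings above, plus one made-up process that falls below the threshold.

```python
from collections import namedtuple
from itertools import groupby

Proc = namedtuple("Proc", ["name", "handles"])

procs = [
    Proc("svchost", 1709),
    Proc("csrss", 407),
    Proc("svchost", 209),
    Proc("example", 12),  # made-up process below the 200-handle threshold
]


def where(items, test):
    """Filter: pass along only the items for which test(item) is true."""
    return (item for item in items if test(item))


def sort_by(items, key):
    """Sort: emit every item received, reordered by the given key."""
    return sorted(items, key=key)


def group_by(items, key):
    """Group: emit (key, members) pairs; every item lands in exactly one group."""
    ordered = sorted(items, key=key)
    return [(k, list(members)) for k, members in groupby(ordered, key=key)]


# p stands for "the current object", much like $_ in where-object.
busy = list(where(procs, lambda p: p.handles > 200))
by_handles = sort_by(busy, lambda p: p.handles)
by_name = group_by(busy, lambda p: p.name)

print([p.handles for p in by_handles])
print([(name, len(members)) for name, members in by_name])
```

Note that the list of records knows nothing about filtering, sorting, or grouping; each stage is a separate function that works on any record with the named property. This sketch orders its groups by name, which is a simplification and not a claim about how group-object orders its output.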

MSH offers a set of operators for performing comparisons. Several of the common ones are listed in Table 1-1; Appendix A contains the complete list. Note that the < and > symbols are used for redirection in the shell and cannot be used to perform less-than or greater-than comparisons.

Table 1-1. Comparison operators

Operator    Description
--------    -----------
-gt         Greater than
-lt         Less than
-eq         Equal to
-ne         Not equal to


The important point to note here is that the get-process cmdlet has no notion of sorting or filtering. In addition to significantly reducing the complexity of that cmdlet, this has the overall benefit that sorting and filtering now use a common syntax anywhere within the shell. Whereas today you have to learn a different syntax for each tool (look at the differences in sorting between DIR and TLIST, for example), now it's just a case of using where-object and sort-object for everything.

1.4.3. What About...

...Using legacy tools in the pipeline? Is this possible? Yes! MSH allows the use of non-cmdlet applications to form part of a pipeline, even though it is not able to deduce any structure from their output. As such, output from a legacy tool will take the form of a list of strings representing each line of the output. For example, the command ping 127.0.0.1 | sort-object is valid, but it probably won't yield the results you're hoping for. MSH will simply perform an alphabetical sort of all the lines of output and kindly return that to the screen.
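The effect of sorting unstructured lines is easy to reproduce in any language. In this small Python sketch, the three strings are made-up stand-ins for lines of text from a legacy tool (they are not real ping output), and the slicing used to recover the number is an assumption tied to that made-up format.

```python
# Made-up lines standing in for a legacy tool's text output.
lines = ["time=9ms", "time=10ms", "time=200ms"]

# With no structure available, sorting compares the lines character by character,
# so "10" and "200" come before "9".
alphabetical = sorted(lines)
print(alphabetical)

# Recovering numeric order means parsing the number back out of the text first.
numeric = sorted(lines, key=lambda s: int(s[len("time="):-len("ms")]))
print(numeric)
```

The second sort only works because the code knows exactly where the number sits in each line, which is the kind of parsing burden that structured objects remove.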

Looking back at the group-object output shown earlier, it would be hard to claim that the grouped output is easier to read. Fortunately, as we'll discover, there are several other cmdlets that can be used to tidy this up for display on-screen, in print, or in other applications. As we're beginning to see, the key factor here is that the data is grouped as we want it and any downstream cmdlets will only have to be concerned with presentation.

...How about sorting on more than one field? Although we only used a single property in this case, sort-object will take a comma-separated list of fields for sorting. If the values of the first are the same for two objects, the values in the second will be compared instead.
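The tie-breaking rule can be illustrated with a short Python model (again, a sketch rather than MSH itself; the Proc record and the sample values are hypothetical). Sorting on a tuple of properties compares the first element and falls back to the second only when the first values are equal, which mirrors the behavior described for a comma-separated property list.

```python
from collections import namedtuple

Proc = namedtuple("Proc", ["name", "handles"])

# Hypothetical records; two of them share the same handle count.
procs = [
    Proc("svchost", 262),
    Proc("explorer", 274),
    Proc("alpha", 262),
]

# Tuple key: compare handles first, then name when the handle counts tie.
ordered = sorted(procs, key=lambda p: (p.handles, p.name))
print([(p.handles, p.name) for p in ordered])
```

Here the two records with 262 handles are ordered by name, and the record with 274 handles follows them. Going by the description above, the MSH form would presumably pass both property names to sort-object separated by a comma, for example Handles,ProcessName.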

1.4.4. Where Can I Learn More?

The built-in help guides for the cmdlets introduced here have more details on their syntax and usage:

     get-help about_pipeline
     get-help where-object
     get-help sort-object
     get-help group-object

All of these scenarios are made possible through the shift toward structured objects instead of text. We'll take a look at objects and the pipeline in more detail starting in Chapter 3. In the meantime, we'll look at some of the more immediate benefits of MSH.




Monad Jumpstart
Year: 2005
Pages: 117
