Reverse Engineering from Source Code

Chapter 5 - Reverse Engineering
by Andrew Filev et al.
Wrox Press ©2002
In this section we'll look at how you can reverse engineer the source code of any .NET solution or project into a Visio UML model. Specifically, this section provides:

  • A quick-start guide to invoking the reverse engineering feature

  • A summary of the key features and limitations of reverse engineering

  • A sample reverse engineered project, as a vehicle for discussing the resulting model structure

Later you will see a technique for reverse engineering an application for which you no longer have - or never did have - access to the source code.

Reverse Engineering QuickStart

Just select a project or an entire solution in the Visual Studio .NET Solution Explorer and then choose the Project | Visio UML | Reverse Engineer menu option, as shown here.

You'll be asked to give a name for the destination Visio (.vsd) file, before seeing the progress via a series of dialogs and a sequence of messages in the Visual Studio .NET Output window like this:

    Performing pass number: 1.
    Reverse engineering: ParcelTracker_DataObjects
    Extracting information from...
        ParcelTracker_DataObjects
         DataManager
          ParcelTracker_DataObjects
           DeliveryDataSet
            DeliveryRowChangeEventHandler
            CustomerRowChangeEventHandler
            ...
    Reverse engineering: ParcelTracker_BusinessObjects
    Extracting information from...
        ParcelTracker_BusinessObjects
         DeliveryManager
    Reverse engineering: ParcelTracker_Server
    Extracting information from...
        ParcelTracker_Server
         StartServer
    Reverse engineering: ParcelTracker_WindowsInterface
    Extracting information from...
        ParcelTracker_WindowsInterface
         DeliveryDataForm
    Performing pass number: 2.
    Reverse engineering: ParcelTracker_DataObjects
    Extracting information from...
        ParcelTracker_DataObjects
        ...
    Reverse engineering: ParcelTracker_BusinessObjects
    Extracting information from...
        ParcelTracker_BusinessObjects
        ...
    Reverse engineering: ParcelTracker_Server
    Extracting information from...
        ParcelTracker_Server
        ...
    Reverse engineering: ParcelTracker_WindowsInterface
    Extracting information from...
        ParcelTracker_WindowsInterface
        ...
    Number of warnings: 0.
    Exporting UML model to Visio...
    Reverse engineering succeeded.

Visio will be launched automatically and as the information is imported, another series of dialogs will report the progress. The end result will be a new set of classes - in the Visio Model Explorer - which you can drag onto any Static Structure Diagram. For illustration, in the following figure I have dragged a DeliveryManager class from the Model Explorer onto a Static Structure Diagram.

That was your quick-start guide, which will get you up and running in the shortest possible time. Of course there's more to it than that, and we'll dig deeper as you read on. Let's start by looking at some of the key features and limitations of this toolset, which set it apart from the other reverse engineering tools that you may have used before.

Key Features and Limitations of Reverse Engineering

The reverse engineering facility is very useful, but not perfect, so now I'll state up-front some of the key features and limitations.

Reverse Engineering Granularity

Although you initiate the reverse engineering process from within Visual Studio .NET, the help documentation for reverse engineering is accessed via the Help menu in Visio. According to that documentation you can choose the reverse engineering granularity such that:

"The selections you make in the Visual Studio Solution Explorer determine what is reverse engineered to the Visio UML."

Taking that to its logical conclusion you might expect to be able to select an individual source file for reverse engineering, or an individual class from the Class View. From Visual Studio .NET you can select a source file or a single class in the Solution Explorer and then invoke reverse engineering, but you will still end up with the entire project (not the entire solution) in Visio.

The division into projects within a Solution will determine the granularity for reverse engineering.

Semantic Errors

If you have semantic error checking turned on in Visio you may see a series of error messages in the Visio Output Window of this kind:

  • An interface cannot be used as the type of a parameter.

or:

  • No behavioral feature of the same kind may have the same signature in a classifier.

According to the Visio UML documentation such errors occur when the resulting model breaks certain rules of the UML 1.2 specification. All you need to provoke that kind of error is a method definition like this:

      public System.Collections.IEnumerator GetEnumerator()
      {
          return this.Rows.GetEnumerator();
      }

It doesn't look too bad, does it? What's more, it wasn't written by a human; it was generated automatically by Visual Studio .NET in the process of converting an XML schema into a .NET DataSet.

Unless we're reverse engineering an application with a view to re-engineering it we can't really attempt to fix such problems. If we did, our UML model would not be a representation of the original code. As the .NET Framework itself contains such method definitions those errors could appear more frequently than you think.

So if we ought not to fix these problems, and if they'll appear more frequently than we would like, what can we do? Well, we can hide the messages by un-checking the Check semantic errors on UML model element option of the UML | Options dialog.

Of course, when designing a new application we have an opportunity to avoid such problems from the outset.
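To make the trigger concrete, here is a minimal, self-contained sketch that places the same kind of GetEnumerator() method in a compilable class. RowCollection and SemanticErrorDemo are hypothetical names of my own, not part of the ParcelTracker sample. I'm also assuming that the interface-typed return value is what the checker objects to (UML models a return value as a parameter of kind "return"); the Visio documentation quoted above doesn't spell that out.

```csharp
using System;
using System.Collections;

// Hypothetical class for illustration; not part of the ParcelTracker sample.
public class RowCollection : IEnumerable
{
    private ArrayList rows = new ArrayList();

    public void Add(object row)
    {
        rows.Add(row);
    }

    // The return type is an interface (IEnumerator). Assumption: this is what
    // the UML semantic check flags, because UML treats a return value as a
    // parameter of kind "return".
    public IEnumerator GetEnumerator()
    {
        return rows.GetEnumerator();
    }
}

public class SemanticErrorDemo
{
    // Counts the rows by iterating, which calls GetEnumerator() behind the scenes.
    public static int CountRows(RowCollection collection)
    {
        int count = 0;
        foreach (object row in collection)
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        RowCollection collection = new RowCollection();
        collection.Add("first");
        collection.Add("second");
        Console.WriteLine(CountRows(collection));
    }
}
```

The point is that this is perfectly ordinary C#: the compiler accepts it and foreach depends on it, so a semantic error reported against it says more about the UML rules than about the quality of the code.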

Static Structure Diagrams

When the reverse engineering has completed you will notice that the Model Explorer is populated with the reverse engineered classes, as we saw in a previous screenshot, but they do not appear automatically on any static structure diagrams. In fact, the only reason the DeliveryManager class appears on a diagram in the previous figure is because I dragged it from the Model Explorer onto that diagram.

If you're new to reverse engineering, that might be a little disappointing but it's pretty much the same story for most UML modeling tools. The fact is that static structure (namely class) diagrams represent views onto the underlying static model. You might want a view that represents the classes of a particular package, or those classes that participate in a particular package, or a super-view of every class in the model. The tool has no way of knowing what you want to appear on the diagrams, so it doesn't draw them for you.

That has an important implication - Visio EA is a little peculiar in that an association exists between two classes only if that association is drawn on at least one diagram. No diagrams means no associations in the reverse-engineered model, so when reverse engineering from code that you have previously generated you will not necessarily finish up with a replica of your original model - this is one reason why round-trip engineering is not really feasible.

You won't have lost any information as such because the associations will still be represented by member variables within the associated classes. All inheritance (generalization) relationships will have been preserved and will be drawn automatically on static structure diagrams.
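As a rough illustration of what that means in code, consider the sketch below. The Depot, Vehicle, and Van classes are hypothetical and exist only for this example. The link from Van to Depot lives in the source purely as a member variable, so that is how I'd expect it to surface in the reverse-engineered model, whereas the inheritance from Vehicle is the kind of relationship that is preserved as a generalization.

```csharp
using System;

// All names below are hypothetical, chosen only for illustration.
public class Depot
{
    public string Name;

    public Depot(string name)
    {
        Name = name;
    }
}

public class Vehicle
{
}

// Generalization: Van inherits from Vehicle.
public class Van : Vehicle
{
    // The association from Van to Depot exists in code only as this member variable.
    private Depot homeDepot;

    public Van(Depot homeDepot)
    {
        this.homeDepot = homeDepot;
    }

    public string DescribeHome()
    {
        return "Van based at " + homeDepot.Name;
    }
}

public class AssociationDemo
{
    public static void Main()
    {
        Van van = new Van(new Depot("Central"));
        Console.WriteLine(van.DescribeHome());
        Console.WriteLine(van is Vehicle);
    }
}
```

If you wanted the Van-to-Depot link to appear as a drawn association again, you would have to add it to a diagram yourself after reverse engineering.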

Round-trip Engineering

Many tools claim to offer round-trip engineering - the ability to generate code from a UML model, modify the code in your IDE, and then reverse engineer the changes back into your original model. In a very loose sense you can round-trip engineer with this toolset. That is to say that you can generate a Visual Studio .NET solution from a Visio UML model, and then reverse engineer that solution back into a Visio UML model.

The problem is that you can reverse engineer into a new model, but crucially not back into the same model, which defeats the objective. Furthermore, if you go on to do a new round of code generation from your model you will find that you'll get new skeleton code with none of your original method implementations included.

Although I find code generation and reverse engineering both to be very useful individually, in my experience round-trip engineering is somewhat overrated by tools vendors. It's quite a difficult trick to pull off, requiring sophisticated synchronization techniques, often with the extra step of marking up the code with special tags. Also, the process is not helped by the fact that an arbitrary number of changes may be made in the model or in the code before the user next chooses to generate code or reverse engineer.

More recently a different approach has been taken, in which the UML modeling tool and the code IDE are combined into a single environment. The UML model and the code are kept synchronized at all times, with no separate code generation and reverse engineering phases as such. This is true of Together Control Center (for Java) and Rational XDE (for .NET).

Projects that Don't Compile

Because the reverse engineering works directly from source code (see below) it's not necessary that your project fully compiles before attempting to reverse engineer. This could be very useful in those situations where you want a UML representation of a partial project or solution. Of course, there is a Garbage-In-Garbage-Out factor to consider here - there may be good reasons why your project does not compile and the resulting UML model will only be as good as the code you feed in.

Towards the end of this chapter you will see an example of what happens if your solution does not reference all of the required assemblies.

Source Code Required

Some other UML visual modeling tools allow you to reverse engineer from source code (for example Java .java files) or from compiled code (such as Java .class or .jar files). Visual Studio .NET is limited to reverse engineering from source code and there is no obvious way to reverse engineer an already-compiled .NET assembly. I do have a solution to that problem, which I'll share with you towards the end of this chapter.

One effect of that restriction is that it's not easy to get the .NET Framework classes themselves into a UML model in Visio; again this contrasts with other tools that often incorporate, for example, the Java runtime classes. This can be quite a limitation, because it means the attribute types, parameter types, and associations in a new Visio UML model are restricted to the C#, Visual Basic.NET, or C++ .NET fundamental language types - remember in Chapter 3 when we had to add the System.Data.DataSet class to our Visio diagram by hand? Once more, there's a solution to this problem later in this chapter.

I used the phrase "new Visio UML model" in the previous paragraph to mean one that you've constructed up-front without reverse engineering. That's because my claim about the lack of .NET Framework classes is not entirely true for a reverse-engineered model. Any .NET Framework classes that are referenced in your code - for example via inheritance - will be reverse engineered into the model. This is limited to those classes that are explicitly referenced in code, and the resulting UML classes will be included by name only. We'll look more at this topic later in this chapter.

Reverse Engineering Example

In the download code for this chapter there is a Visual Studio .NET solution that we'll use as a test case for reverse engineering. The solution name is ParcelTracker, and from that name you'll have guessed that the application was originally intended as a simulation of the kind of system that might be used by UPS, TNT, ParcelForce (in the UK), or any other national or international package delivery company.

For our purposes here it matters very little what this application actually does, because we won't be looking in detail at how it works or even running it. What is most important is that the application is sufficiently complex to be a credible test case for reverse engineering.

From a design point of view the application is divided into three layers, each layer corresponding with a separate project within the ParcelTracker solution. The layers are:

  • Windows Interface Layer (in project ParcelTracker_WindowsInterface)

  • Business Objects Layer (in project ParcelTracker_BusinessObjects)

  • Data Objects Layer (in project ParcelTracker_DataObjects)

There is an additional project called ParcelTracker_Server that contains an executable program that would start up the server aspect of the application.

In terms of .NET coverage this application was designed to make use of Windows Forms, Remoting, and a DataSet that uses XML for persistent storage.

Reverse Engineering the ParcelTracker Application

The steps for reverse engineering the sample application are as described in the Reverse Engineering from Source Code section earlier in this chapter. To recap:

  • Open the solution ParcelTracker.sln in Visual Studio .NET.

  • Choose the menu option Project | Visio UML | Reverse Engineer.

As mentioned in the earlier Semantic Errors section, reverse engineering may result in the output window containing error messages of the form:

  • An interface cannot be used as the type of a parameter.

That's true for this application and it's also true when the .NET Framework class libraries themselves are reverse engineered, as described later. Since we're reverse engineering an existing application, we don't have the option of fixing the model to meet the UML 1.2 criteria as we would if we were designing from scratch; which leaves little alternative but to simply switch off the UML semantic checking.

Reverse-Engineered Model Structure

The ParcelTracker application is packaged as a solution containing multiple projects, as can be seen on the left-hand side of the following figure. The right-hand side of the figure shows the resulting UML model structure in Visio, as you would see in the Model Explorer.

You can see that the Top Package (the default top-level package in Visio) contains four UML Subsystems (represented in the Model Explorer with pink-colored package icons). These subsystems are ParcelTracker_BusinessObjects, ParcelTracker_DataObjects, ParcelTracker_Server, and ParcelTracker_WindowsInterface, corresponding with the four projects contained within the original solution.

Each project within a Visual Studio .NET solution is reverse engineered into a UML subsystem in the Visio model.

You may wonder why a package with the same name appears within each subsystem. That's because this application has a package structure that reflects the project structure. Let's take the DeliveryManager class as an example - it is defined within the ParcelTracker_BusinessObjects project (and thus falls within the subsystem of that name), and in addition, its definition specifies the class as being contained in the ParcelTracker_BusinessObjects namespace, as you can see from looking at the code:

     namespace ParcelTracker_BusinessObjects
     {
        public class DeliveryManager : MarshalByRefObject
        {
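If you'd like to confirm which namespace a class really belongs to, a small sketch like the following will do it. The empty class body and the NamespaceDemo class are my own placeholders, not the real ParcelTracker code; only the namespace name, the class name, and the base class come from the fragment above.

```csharp
using System;

namespace ParcelTracker_BusinessObjects
{
    // Placeholder body; the real DeliveryManager has members that are not shown here.
    public class DeliveryManager : MarshalByRefObject
    {
    }
}

// Hypothetical helper class, used only to print the names.
public class NamespaceDemo
{
    public static void Main()
    {
        Type t = typeof(ParcelTracker_BusinessObjects.DeliveryManager);
        Console.WriteLine(t.Namespace);  // ParcelTracker_BusinessObjects
        Console.WriteLine(t.FullName);   // ParcelTracker_BusinessObjects.DeliveryManager
    }
}
```

The namespace is declared in the source file and is independent of the project name, which is why the model ends up with a package nested inside a subsystem of the same name.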

Nested Classes

The ParcelTracker application contains some nested classes that are not apparent in the model structure shown in the previous figure. Consider the following code from the DeliveryDataSet class, which shows at least two classes - DeliveryDataTable and DeliveryRow - as having their definitions nested within the definition of the DeliveryDataSet.

     namespace ParcelTracker_DataObjects
     {
         ...
         public class DeliveryDataSet : DataSet
         {
             ...
             public class DeliveryDataTable : DataTable ...
             {
                ...
             }
             public class DeliveryRow : DataRow
             {
                ...
             }
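The nesting can also be checked from code. The sketch below is a cut-down stand-in for the generated class, with the elided members left out and a constructor added to DeliveryRow so that it compiles; NestedClassDemo is a hypothetical helper of mine. It uses reflection to list the types declared inside DeliveryDataSet, which is roughly the containment that the Model Explorer shows.

```csharp
using System;
using System.Data;

namespace ParcelTracker_DataObjects
{
    // Cut-down stand-in for the generated DeliveryDataSet; the members
    // elided in the excerpt above are simply omitted here.
    public class DeliveryDataSet : DataSet
    {
        public class DeliveryDataTable : DataTable
        {
        }

        public class DeliveryRow : DataRow
        {
            // DataRow has no public constructor; derived rows receive a DataRowBuilder.
            internal DeliveryRow(DataRowBuilder builder) : base(builder)
            {
            }
        }
    }
}

// Hypothetical helper class, used only to list the nested types.
public class NestedClassDemo
{
    public static void Main()
    {
        Type outer = typeof(ParcelTracker_DataObjects.DeliveryDataSet);

        // GetNestedTypes() returns the public types declared inside this class.
        foreach (Type nested in outer.GetNestedTypes())
        {
            Console.WriteLine(nested.FullName + " : " + nested.BaseType.Name);
        }
    }
}
```

In the printed names, reflection separates the containing class from the nested class with a + sign rather than a dot.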

To see those nested classes in the Visio Model Explorer you can simply expand the containing classes to expose their constituents, like this:

Referenced .NET Classes

Another interesting feature of the reverse engineered model structure is that a System package has been generated, with contained packages that reflect the namespace structure of the .NET Framework itself. We'll see later that Visio for Enterprise Architects does not provide a base .NET Framework model, nor does it allow such a model to be reverse engineered without additional work, but the presence of these packages is explained by the fact that any .NET Framework classes that you refer to in your application - for example by inheriting from them - will be included in the reverse engineered model.

However, the .NET Framework classes you refer to in your code are not populated with operations and attributes - only the class name is provided.

The sample application refers to certain classes from the System.Data namespace, and as you can see from the following figure those classes - and only those classes - have been included by name in the UML model structure.


Professional UML with Visual Studio .NET: Unmasking Visio for Enterprise Architects
ISBN: 1440490856
EAN: N/A
Year: 2001
Pages: 85