The first step is to establish a comprehensive framework for the analysis. Then, using that framework and the right tools and techniques, we can map an entire Web application and gather enough data for the analysis. Our goal is to retrieve as much information as possible from the Web application and organize it in a structured way. The outcome of this process is a Web application map organized into functional groups.
The contents of the map help us understand the application's structure and eventually reveal areas that need to be secured. There is no "textbook" method for site linkage analysis. Our experience analyzing Web applications in the field helped us derive our own methodology, which we present here. Figure 8-1 gives a schematic representation of this methodology.
The methodology consists of four main steps: crawling the Web site, creating logical groups within the application structure, analyzing each Web resource, and inventorying Web resources. The first step involves collecting all the information that may be useful for building the Web application map. The second step involves identifying the functionality of each crawled Web resource. This part of the process is more intuitive than technical. Often, simply looking at the names and words used in the URL string tells us the role played by that particular Web resource. The third step is crucial. It involves going through each Web resource with a fine-tooth comb and picking up every bit of information that might be of help. The techniques box in Figure 8-1 lists the steps involved in this task. The final step is to prepare an inventory of all Web resources, tabulated in a simple and reusable format.
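To make the grouping and inventory steps more concrete, here is a minimal Python sketch. It is our illustration, not part of the methodology in Figure 8-1. It guesses a functional group from words in the URL path and tabulates each resource as one row. The keyword table, the group names, the row fields, and the www.example.com URLs are all hypothetical; in practice the groups would be assigned by hand after looking at the crawled site.

```python
import posixpath
from urllib.parse import urlparse, parse_qs

# Hypothetical keyword-to-group table; a real analysis would tune this by hand.
GROUP_KEYWORDS = {
    "authentication": ("login", "logout", "signin"),
    "catalog": ("catalog", "product", "search"),
    "checkout": ("cart", "checkout", "payment"),
}

def classify(url):
    """Guess a functional group from words that appear in the URL path."""
    path = urlparse(url).path.lower()
    for group, keywords in GROUP_KEYWORDS.items():
        if any(word in path for word in keywords):
            return group
    return "unclassified"

def build_inventory(urls):
    """Tabulate crawled URLs as a list of rows, one per Web resource."""
    rows = []
    for url in urls:
        parts = urlparse(url)
        # File extension without the leading dot, e.g. "asp" or "jsp".
        extension = posixpath.splitext(parts.path)[1].lstrip(".")
        rows.append({
            "resource": parts.path,
            "extension": extension,
            # Names of the query-string parameters passed to the resource.
            "parameters": sorted(parse_qs(parts.query, keep_blank_values=True)),
            "group": classify(url),
        })
    return rows

if __name__ == "__main__":
    crawled = [
        "http://www.example.com/login/login.asp",
        "http://www.example.com/catalog/showitem.jsp?id=3&lang=en",
        "http://www.example.com/index.html",
    ]
    for row in build_inventory(crawled):
        print(row)
```

A keyword match of this kind only automates the first guess. Resources that land in the "unclassified" group, or whose names are misleading, still have to be reviewed manually.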