THE PROCESS DATABASE

2.1 THE PROCESS DATABASE

The process database (PDB) is a permanent repository of process performance data from projects; it can be used for project planning, estimation, analysis of productivity and quality, and other purposes [2]. The PDB consists of data from completed projects, with each project providing one data record. To populate the PDB, data must be collected, analyzed, and then organized for entry. Here we focus on how the data are represented in the PDB at Infosys; Chapter 7 explains how the data are collected.

2.1.1 Contents of the PDB

When using the PDB during planning, project managers often find information about similar projects particularly useful. To allow for similarity checking, you should capture in the PDB general information about the project, such as languages used, platforms, databases used, tools used, size, and effort. With this type of information, a project manager can search and find information on all projects that, for example, focused on a particular application domain, used a particular database management system (DBMS) or language, or targeted a specific platform.
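A similarity search of this kind can be sketched as a simple filter over project records. This is only an illustration under stated assumptions, not the Infosys implementation: the in-memory list, the dictionary keys (`domain`, `languages`, `dbms`, `platform`), the function name `find_similar`, and the second record are all hypothetical. Only the Synergy values are taken from the sample entry in Section 2.1.2.

```python
# Hypothetical in-memory stand-in for the PDB: one dict per completed project.
PROJECTS = [
    {"name": "Synergy", "domain": "Brokerage/Finance",
     "languages": {"Java", "Persistent Builder"}, "dbms": "DB2",
     "platform": "Windows NT"},
    # Invented record, present only so the filter has something to exclude.
    {"name": "ExampleB", "domain": "Retail",
     "languages": {"C++"}, "dbms": "Oracle", "platform": "Unix"},
]


def find_similar(projects, **criteria):
    """Return the projects that satisfy every given criterion.

    A criterion matches when the field equals the wanted value or, for
    collection-valued fields such as languages, contains it.
    """
    def matches(project, key, wanted):
        value = project.get(key)
        if isinstance(value, (set, frozenset, list, tuple)):
            return wanted in value
        return value == wanted

    return [p for p in projects
            if all(matches(p, k, v) for k, v in criteria.items())]


similar = find_similar(PROJECTS, dbms="DB2", languages="Java")
print([p["name"] for p in similar])  # prints ['Synergy']
```

A real PDB would more likely run such a query against a database, but the matching logic is the same: every criterion the project manager supplies must hold for a project to be returned.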

To help in project planning, you should capture data about the effort, defects, schedule, risk, and so on. If the total effort spent in a project is known, along with the size and distribution of effort in different phases, this data can be used for estimating effort in a new project.

Thus, the data captured in the PDB at Infosys can be classified as follows:

         Project characteristics

         Project schedule

         Project effort

         Size

         Defects

Data on project characteristics consists of the project name, the names of the project manager and module leaders (so that they can be contacted for further information or clarification), the business unit (to permit analysis based on business unit), the process deployed (to allow separate analyses of different processes), the application domain, the hardware platform, the languages used, the DBMS used, a brief statement of the project goals, information about project risks, the duration of the project, and team size.

The schedule data is primarily the project's expected and actual start and end dates. The data on project effort includes data on the initial estimated effort and the total actual effort, and the distribution of the actual effort among various stages, such as project initiation, requirements management, design, build, unit testing, and other phases. Chapter 7 discusses how to capture the effort data.

The size of the software developed may be expressed in lines of code (LOC), in the number of simple, medium, or complex programs, or in a combination of these measures. Even if function points are not used for estimation, you can obtain a uniform metric for productivity by representing the final size in function points, usually by converting the measured size in LOC to function points using published conversion tables [3]. The size of the final system in function points is therefore also captured.

The data on defects includes the number of defects found in the various defect detection activities, and the number of defects injected in different stages. Hence, you record the number of defects of different origins found in requirements review, design review, code review, unit testing, and other phases. Chapter 7 explains how projects record defect data.

In addition, notes are recorded, including notes on estimation (for example, the criteria used for classifying programs as simple, medium, or complex) and notes on risk management (for example, how risk perception changed during the project).
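The five categories of data, together with the notes, can be pictured as one record per project. The following is a minimal sketch of such a record, assuming Python 3.9 or later; the class names, field names, and the example values are hypothetical and only a subset of the characteristics listed above is modeled. It is not the schema actually used at Infosys.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class StageEffort:
    stage: str                 # e.g. "Design", "Coding"
    actual_hours: float
    estimated_hours: float = 0.0


@dataclass
class SizeEntry:
    language: str
    unit: str                  # e.g. "LOC" or "function points"
    size: float


@dataclass
class ProjectRecord:
    # Project characteristics (subset)
    name: str
    application_domain: str
    process: str
    languages: list[str]
    team_size: int
    # Project schedule
    estimated_start: date
    estimated_end: date
    actual_start: date
    actual_end: date
    # Project effort, size, and defects
    estimated_effort_hours: float
    effort: list[StageEffort] = field(default_factory=list)
    sizes: list[SizeEntry] = field(default_factory=list)
    # Key is (injection stage, detection stage); value is a defect count.
    defects: dict[tuple[str, str], int] = field(default_factory=dict)
    # Free-text notes, e.g. on estimation or risk management.
    notes: dict[str, str] = field(default_factory=dict)

    def actual_effort_hours(self) -> float:
        """Total actual effort, derived from the per-stage entries."""
        return sum(e.actual_hours for e in self.effort)

    def duration_days(self) -> int:
        return (self.actual_end - self.actual_start).days

    def schedule_slippage_days(self) -> int:
        return (self.actual_end - self.estimated_end).days


# Invented example record; none of these values come from a real project.
record = ProjectRecord(
    name="DemoProject", application_domain="Retail", process="Development",
    languages=["Java"], team_size=5,
    estimated_start=date(2001, 1, 8), estimated_end=date(2001, 3, 30),
    actual_start=date(2001, 1, 8), actual_end=date(2001, 4, 13),
    estimated_effort_hours=280.0,
    effort=[StageEffort("Design", 100.0, 90.0),
            StageEffort("Coding", 200.0, 190.0)],
)
print(record.actual_effort_hours(), record.schedule_slippage_days())
```

Deriving total effort and duration from the stored entries, rather than storing them separately, avoids keeping two copies of the same fact that could drift apart.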

2.1.2 A Sample Entry

Let's look at a sample PDB entry for a project, which we will refer to with the pseudonym Synergy. In the Synergy project an application was built that formed the precursor to that of the case study (the ACIC project). The case study will refer to this PDB entry during planning.

Data for the four major tables are shown (the example uses expressive names, but codes are used in the actual database for various phases and quality activities). In this example, the data are fairly complete; in other situations, however, the data may be incomplete. Incomplete data cannot always be discarded, because the information may still be useful [4]. Hence, such data may also be captured in the PDB.

Table 2.1 gives the general information on the project, including start and end dates (estimated and actual), estimated effort (actual effort is not put in this table because it can be computed from the effort table), peak team size, information about the risk, tools used, and other items. In addition, other information (for example, about the client) is stored in this table.

Table 2.1. General Data about a Project

General Characteristics

| Field Name            | Value for Synergy |
|-----------------------|-------------------|
| ProcessCategory       | Development |
| LifeCycle             | Full |
| BusinessDomain        | Brokerage/Finance |
| ProcessTailoringNotes | Added group review for high-impact documents. First program of each developer was group reviewed. |
| PeakTeamSize          | 12 |
| ToolsUsed             | VSS for document CM, VAJ for source code |
| EstimatedStart        | 20 Jan 2000 |
| EstimatedFinish       | 5 May 2000 |
| EstimatedEffortHrs    | 3,106 |
| EstimationNotes       | Use case point approach was one method used for estimation. |
| ActualStart           | 20 Jan 2000 |
| ActualFinish          | 5 May 2000 |
| First Risk            | Working through link on customer DB |
| Second Risk           | Additional requirements |
| Third Risk            | Attrition |
| RiskNotes             | Worked in shifts; agreed to take enhancements after acceptance of this product; team building exercises were done. |

The second table captures the information about effort. For different stages in the process, it includes data on the effort spent in the activity and the effort spent in rework after the task. Rework effort is captured because it helps in calculating and understanding the cost of quality. Table 2.2 shows the Synergy effort data in person-hours. Estimated effort for the phases is also given. (The total effort spent in life-cycle stages is 2,950 person-hours, and in review, 223 person-hours; the total estimated effort is 3,012 person-hours.)

The third table contains information about defects. It is desirable to know not only when the defect was detected but also when it was injected. Hence, you should record the number of defects found for each injection stage and detection stage combination. The detection stages consist of various reviews and testing, whereas the injection stages involve requirements, design, and coding. If you can separate the defects detected by stage according to their injection stages, then you can compute removal efficiencies of the defect detection stages. This information can be useful for identifying potential improvement areas. Table 2.3 shows the defect data for the Synergy project.

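Table 2.2. Effort Data

Effort by Stage (person-hours)

| Stage                               | TaskEffort | Review Effort | Estimated |
|-------------------------------------|-----------:|--------------:|----------:|
| Requirements analysis               |          0 |             0 |         0 |
| Design                              |        414 |            32 |       367 |
| Coding                              |       1147 |            76 |      1182 |
| Independent unit testing            |        156 |            74 |       269 |
| Integration testing                 |        251 |            30 |       180 |
| Acceptance testing and installation |        183 |             0 |       175 |
| Project management                  |        237 |             8 |       357 |
| Configuration management            |         30 |             3 |        38 |
| Project-specific training           |        200 |             0 |       218 |
| Others                              |        332 |             0 |       226 |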
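The totals quoted for the Synergy effort data can be reproduced directly from Table 2.2, and the same rows give the distribution of effort across stages that a planner might reuse. The sketch below is illustrative only: the function names are hypothetical, and the 4,000 person-hour figure for a new project is an invented input, not data from the PDB.

```python
# (stage, task effort, review effort, estimated), in person-hours, from Table 2.2.
EFFORT = [
    ("Requirements analysis", 0, 0, 0),
    ("Design", 414, 32, 367),
    ("Coding", 1147, 76, 1182),
    ("Independent unit testing", 156, 74, 269),
    ("Integration testing", 251, 30, 180),
    ("Acceptance testing and installation", 183, 0, 175),
    ("Project management", 237, 8, 357),
    ("Configuration management", 30, 3, 38),
    ("Project-specific training", 200, 0, 218),
    ("Others", 332, 0, 226),
]


def totals(rows):
    """Return (total task effort, total review effort, total estimated)."""
    task = sum(r[1] for r in rows)
    review = sum(r[2] for r in rows)
    estimated = sum(r[3] for r in rows)
    return task, review, estimated


def task_effort_shares(rows):
    """Fraction of the total task effort spent in each stage."""
    total = sum(r[1] for r in rows)
    return {stage: task / total for stage, task, _review, _est in rows}


def spread(total_hours, shares):
    """Split a total effort figure across stages using past shares."""
    return {stage: round(total_hours * share, 1)
            for stage, share in shares.items()}


print(totals(EFFORT))  # prints (2950, 223, 3012)

shares = task_effort_shares(EFFORT)
# 4000 is a hypothetical overall estimate for some new, similar project.
plan = spread(4000, shares)
print(plan["Coding"], plan["Design"])
```

Spreading a new estimate by a single past project's shares is a deliberately crude choice; in practice a planner would more plausibly average the distribution over several similar projects.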

The final table contains information about the size of the project. Different languages may be used in a project, so this table may have multiple entries. Multiple units of size may also be used, so the table captures the unit. Generally, if the size is given in LOC, size in function points can also be computed by using conversion tables as needed. This information is used to calculate productivity in terms of function points. Because size is a critical factor in determining productivity, other factors, such as the operating system and hardware used, are also captured. Table 2.4 shows the values for this table for Synergy.

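Table 2.3. Defect Data for the Synergy Project

| Injection stage | Requirement Review | Design Review | Code Review | Unit Testing | System Testing | Acceptance Testing |
|-----------------|-------------------:|--------------:|------------:|-------------:|---------------:|-------------------:|
| Requirements    |                  0 |             0 |           0 |            1 |              1 |                  0 |
| Design          |                    |            14 |           3 |            1 |              0 |                  0 |
| Coding          |                    |               |          21 |           48 |             17 |                  6 |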
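The removal efficiency mentioned in the text can be worked out from the Synergy defect data. The sketch below makes several assumptions that the source does not state: removal efficiency is taken as the defects found in a stage divided by the defects present when that stage ran; "present" counts only defects that were eventually logged in this or a later stage; and a blank cell is read as "defects of this origin did not exist yet". The function name is hypothetical.

```python
DETECTION = ["Requirement Review", "Design Review", "Code Review",
             "Unit Testing", "System Testing", "Acceptance Testing"]

# Rows are injection stages; None marks a blank cell in Table 2.3.
DEFECTS = {
    "Requirements": [0, 0, 0, 1, 1, 0],
    "Design":       [None, 14, 3, 1, 0, 0],
    "Coding":       [None, None, 21, 48, 17, 6],
}


def removal_efficiency(defects, detection_stages):
    """Map each detection stage to found / present (None if nothing present)."""
    result = {}
    for j, stage in enumerate(detection_stages):
        found = 0
        present = 0
        for counts in defects.values():
            if counts[j] is None:
                continue  # this origin had not been injected yet
            found += counts[j]
            # Defects of this origin caught here or in any later stage.
            present += sum(c for c in counts[j:] if c is not None)
        result[stage] = found / present if present else None
    return result


for stage, value in removal_efficiency(DEFECTS, DETECTION).items():
    print(f"{stage}: {value:.2f}")
```

Under these assumptions the last detection stage always shows 100 percent, because defects that escaped to the field are not in the table; the figures for earlier stages are the informative ones.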

 

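Table 2.4. Size Data for the Synergy Project

Size

| LangCode           | OSCode     | DBMSCode | HWCode    | MeasureCode | ActualCode Size |
|--------------------|------------|----------|-----------|-------------|----------------:|
| Java               | Windows    |          | PC        | LOC         |           8,082 |
| Persistent Builder | Windows NT | DB2      | Client MC | LOC         |          12,185 |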
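Converting the Synergy sizes from LOC to function points, and from there to a productivity figure, can be sketched as follows. The LOC-per-function-point ratios in `LOC_PER_FP` are placeholders chosen only so the example runs; they are not taken from any published conversion table and must be replaced with real values. The function names are hypothetical, and using the 2,950 person-hours of life-cycle effort as the denominator is one possible choice, not one the source prescribes.

```python
# (language, unit, size) rows from Table 2.4.
SIZES = [
    ("Java", "LOC", 8082),
    ("Persistent Builder", "LOC", 12185),
]

# PLACEHOLDER ratios (LOC per function point). These are NOT published
# values; substitute figures from the conversion table you actually use.
LOC_PER_FP = {"Java": 50.0, "Persistent Builder": 20.0}


def size_in_function_points(sizes, loc_per_fp):
    """Convert per-language LOC counts to a single function point total."""
    total = 0.0
    for language, unit, size in sizes:
        if unit != "LOC":
            raise ValueError(f"no conversion defined for unit {unit!r}")
        total += size / loc_per_fp[language]
    return total


def productivity(function_points, effort_hours):
    """Function points delivered per person-hour."""
    return function_points / effort_hours


fp = size_in_function_points(SIZES, LOC_PER_FP)
# 2,950 person-hours is the life-cycle effort quoted for Synergy; whether
# review effort should be added is a modeling decision left open here.
print(round(fp, 2), round(productivity(fp, 2950), 4))
```

Converting each language separately before summing matters because the ratio differs by language, which is why the size table keeps one row per language.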

 



Software Project Management in Practice