13.4 Experimental design and simulation

13.4.1 Hardware specifications for the systems used

To ensure that all the operating systems were tested on a level playing field, the four PCs that were used had identical hardware. The hardware for each of the machines was as shown in Table 13.1.

Table 13.1: Tested Configuration

| Component | Specification |
|---|---|
| CPU | Pentium III @ 500 MHz |
| Total RAM | 256 MB |

To verify that the machines were indeed equal in performance capability, an independent group ran a PC performance benchmark on each of them. The results are discussed in the next section.

13.4.2 PC benchmark

Passmark: the performance test

The performance test comprises 22 individual benchmark tests, grouped into six test suites:

  • Integer and floating-point mathematical operations

  • Tests of standard two-dimensional graphical functions

  • Reading, writing, and seeking within disk files

  • Memory allocation and access

  • Tests of the MMX (multimedia extensions) within newer CPUs

  • A test of the DirectX 3D graphics system

The test results reported are relative values: the larger the number, the faster the computer. For example, a computer with a result of 40 can process roughly twice as much data as a computer with a result of 20. The Passmark rating is a weighted average of all the other test results and gives a single overall indication of the computer's performance. The results we observed are shown in Table 13.2.
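As an illustration of how such a weighted overall rating could be computed, consider the sketch below. The suite marks are taken from the Win NT column of Table 13.2, but the weights are invented for the example; they are not Passmark's actual weighting.

```python
# Illustrative weighted overall rating. The weights are assumptions made
# for this example, NOT Passmark's real weighting scheme.

def overall_rating(suite_marks, weights):
    """Weighted average of per-suite marks."""
    total_weight = sum(weights[s] for s in suite_marks)
    return sum(suite_marks[s] * weights[s] for s in suite_marks) / total_weight

# Suite marks from the Win NT column of Table 13.2
marks = {"math": 75.6, "2d": 46.7, "memory": 58.7,
         "disk": 19.3, "3d": 15.5, "mmx": 48.8}
# Hypothetical weights (math weighted a bit more heavily, for illustration)
weights = {"math": 1.5, "2d": 1.0, "memory": 1.0,
           "disk": 1.0, "3d": 1.0, "mmx": 1.0}

print(round(overall_rating(marks, weights), 1))  # prints 46.5
```

With equal weights the rating would simply be the arithmetic mean of the six suite marks; the weighting lets the benchmark emphasize whichever subsystems it considers most representative of overall performance.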

Table 13.2: Observed Test Results for the Passmark Performance Suites [*]

| S. No. | Parameter Tested | Win NT | Win XP | LINUX 7.2 | Win ME |
|---|---|---|---|---|---|
| 1 | Math-Addition | 96.6 | 96.2 | 94.6 | **97.0** |
| 2 | Math-Subtraction | 96.4 | 97.1 | 96.2 | **97.6** |
| 3 | Math-Multiplication | 101.1 | 101.4 | 101.4 | **103.1** |
| 4 | Math-Division | 12.9 | 12.8 | 12.9 | **13.0** |
| 5 | Math-Floating-Point Addition | 87.7 | 87.8 | 87.6 | **88.7** |
| 6 | Math-Floating-Point Subtraction | 89.4 | 89.5 | 88.6 | **90.1** |
| 7 | Math-Floating-Point Multiplication | 91.7 | 91.7 | 90.9 | **92.3** |
| 8 | Math-Floating-Point Division | 14.8 | 14.8 | 14.8 | **14.9** |
| 9 | Math-Maximum Mega FLOPS | 171.2 | 172.2 | 170.7 | **177.6** |
| 10 | Graphics 2D-Lines | 17.5 | 17.6 | 17.5 | **17.8** |
| 11 | Graphics 2D-Bitmaps | **12.9** | **12.9** | 12.8 | **12.9** |
| 12 | Graphics 2D-Shapes | 4.7 | 4.7 | 4.7 | 4.7 |
| 13 | Graphics 3D-Many Worlds | 22.9 | **23.0** | 22.9 | 22.9 |
| 14 | Memory-Allocated Small Blocks | 86.6 | **87.6** | 87.0 | **87.6** |
| 15 | Memory-Read Cached | 67.9 | 68.4 | 68.0 | **68.5** |
| 16 | Memory-Read Uncached | 48.7 | 48.8 | **50** | 49.1 |
| 17 | Memory-Write | 40.8 | 41.1 | 40.9 | **41.4** |
| 18 | Disk-Sequential Read | 3.2 | **3.8** | 3.7 | 3.1 |
| 19 | Disk-Sequential Write | 2.9 | **3.4** | **3.4** | 2.9 |
| 20 | Disk-Random Seek | 1.2 | 2.3 | **3.6** | 2.1 |
| 21 | MMX-Addition | 97.7 | 94.5 | 97.8 | **99.4** |
| 22 | MMX-Subtraction | 92.3 | **98.2** | 93.3 | 96.0 |
| 23 | MMX-Multiplication | 97.8 | 97.5 | 96.9 | **99.1** |
| 24 | Math Mark | 75.6 | 75.8 | 75.2 | **76.8** |
| 25 | 2D Mark | 46.7 | 46.9 | 46.7 | **47.1** |
| 26 | Memory Mark | 58.7 | 59.2 | 59.2 | **59.4** |
| 27 | Disk Mark | 19.3 | 25.1 | **28.4** | 21.5 |
| 28 | 3D Graphics Mark | 15.5 | **15.7** | 15.5 | 15.6 |
| 29 | MMX Mark | 48.8 | 49.2 | 48.9 | **50** |
| 30 | Passmark Rating | 45.7 | 47.2 | **47.8** | 46.7 |

[*] The bold text indicates the highest value(s) for each category.

13.4.3 Burn-in test

The burn-in test thoroughly exercises a PC's hardware in a short period of time, in the same way that normal applications use a PC over a long period. It tests the following items:

  1. CPU

  2. Hard drives

  3. CD-ROMs

  4. Sound cards

  5. 2D graphics

  6. 3D graphics

  7. RAM

  8. Network connections and printers

The results we observed are shown in Table 13.3.

Table 13.3: Observed Test Results for the Burn-in Test

| S. No. | System Information | Win XP | Win NT | Win ME | LINUX 7.2 |
|---|---|---|---|---|---|
| 2 | Number of CPUs | 1 | 1 | 1 | 1 |
| 3 | CPU manufacturer | Intel | Intel | Intel | Intel |
| 4 | CPU type | Celeron | Celeron | Celeron | Celeron |
| 5 | CPU features | MMX | MMX | MMX | MMX |
| 6 | CPU serial # | N/A or disabled | N/A or disabled | N/A or disabled | N/A or disabled |
| 7 | CPU1 speed | 501.3 MHz | 501.3 MHz | 501.3 MHz | 501.2 MHz |
| 8 | CPU Level 2 cache | 128 KB | 128 KB | 128 KB | 128 KB |
| 9 | RAM | 267,821,056 bytes (256 MB) | 267,821,056 bytes (256 MB) | 267,821,056 bytes (256 MB) | 267,821,056 bytes (256 MB) |
| 10 | Color depth | 24 | 24 | 24 | 24 |

13.4.4 Experimental design

Operating system performance depends on several factors. To conduct a proper analysis, the effects of each factor must be isolated from those of the others so that meaningful estimates of their positive or negative contributions to performance can be made. For this evaluation, we created the three workloads previously mentioned to exercise the process management, memory management, and I/O management subsystems of the operating systems under study. All three workloads exercise all of these subsystems, although each was designed to stress a particular one. The first workload, which performs the matrix operations, is computationally intensive and requires large memory allocations; it therefore primarily exercises the memory management subsystem. The second workload, which creates, writes to, and deletes an array of varying sized files, exercises the file management, or I/O management, subsystem. The last workload, which forks UNIX and DOS shell processes, was designed to stress the process management subsystem.
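The second workload could be sketched as follows. This is a hypothetical Python analogue of the authors' C program, not their actual code; the file sizes and counts follow the factor levels in Table 13.4.

```python
# Hypothetical sketch of the file-management workload: create, write, and
# delete a batch of fixed-size files, recording the batch's response time.
import os
import tempfile
import time

def file_workload(num_files, size_bytes):
    """Create, write, and delete num_files files of size_bytes each;
    return the elapsed (response) time in seconds."""
    start = time.perf_counter()
    with tempfile.TemporaryDirectory() as d:
        for i in range(num_files):
            path = os.path.join(d, f"f{i}.dat")
            with open(path, "wb") as f:
                f.write(b"\0" * size_bytes)   # write the whole file
            os.remove(path)                   # delete it again
    return time.perf_counter() - start

# Factor levels from Table 13.4: e.g., 10 small (1 KB) files
elapsed = file_workload(num_files=10, size_bytes=1024)
```

Running this for every combination of file size (1 KB, 10 KB, 100 KB) and file count (10, 100, 1000) reproduces the nine cells of the second workload's factor grid.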

The experimental design for this study is summarized in Table 13.4.

Table 13.4: A 3^4 Experimental Design [*]

| Workload | Factors and Factor Values | Replications |
|---|---|---|
| A MATLAB program performing matrix operations on an array of varying sized matrices. This program also records response times. | Size of matrix: Small, Medium, Large (defined in program). Number of matrices: 10, 100, 1000. | 3 |
| A C program that creates, writes to, and deletes a number of varying sized files and records response times. | Size of file: Small (1 KB), Medium (10 KB), Large (100 KB). Number of files: 10, 100, 1000. | 3 |
| A C program that forks several UNIX/DOS shell processes. This program also records response times. | Number of processes: 10, 100, 1000. Process arrival rate: Slow (100 p/s), Medium (1,000 p/s), Fast (all at once). | 3 |

[*] (3 workloads) × (3 levels for factor 1) × (3 levels for factor 2) × (3 replications) = 81 experiments

We use a full factorial design to examine every possible combination of all levels of all factors. By limiting the number of factors and their corresponding levels, we can take advantage of this design to examine all possible combinations of configuration and workload. The measures chosen for comparison are response time, CPU utilization, disk utilization, memory allocation, and stability. Table 13.4 shows the factors and their levels. To increase statistical significance while keeping the number of experiments at a reasonable level, experiments were repeated three times for every combination of factors and levels. The resulting number of experiments for this design is 81, calculated as follows: (3 workloads) × (3 levels for factor 1) × (3 levels for factor 2) × (3 replications).
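The full factorial enumeration can be sketched directly. The factor labels below are illustrative placeholders, since each workload interprets the two factors differently (see Table 13.4).

```python
# Enumerate the full factorial design: every combination of workload,
# two 3-level factors, and 3 replications. Factor labels are placeholders;
# their meaning differs per workload (size/count, count/arrival rate, ...).
from itertools import product

workloads = ["matrix", "file", "fork"]
factor1 = ["small", "medium", "large"]   # e.g., matrix/file size level
factor2 = [10, 100, 1000]                # e.g., number of matrices/files
replications = [1, 2, 3]

experiments = list(product(workloads, factor1, factor2, replications))
print(len(experiments))  # 81
```

Each tuple in `experiments` identifies one run, so iterating over the list and recording the chosen metrics (response time, CPU, disk, memory, stability) executes the whole design.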

13.4.5 Simulation

A simulation involves many activities, including data collection, model building, executing the simulation, generating alternatives, analyzing outputs, and presenting results. Simulation systems provide software support for all of these activities. AWESIM is a simulation system that supports model building, analysis of models using simulation, and presentation of simulation results. The AWESIM user interface is built on a see-point-select philosophy, employing multiple windows, a series of menus within each window, and a mouse to select menu items. Each team was assigned to prepare a high-level model before using the AWESIM simulation toolkit. The following sections include the high-level model for each group, the corresponding AWESIM model, and the reports.

AWESIM model for LINUX 7.2

In designing the AWESIM model, the LINUX team worked closely with the NT team. The NT team focused mainly on process scheduling, whereas the LINUX team focused mainly on memory management. There was, of course, overlap in the design of both sections, and some differences emerged in the final model for each team, due both to differences between the operating systems and to differences of opinion over which design was best.

Before looking at the diagram it is essential to first know what resources are being modeled and what attributes the entities have. The only resources modeled were memory and CPU. The CPU has an allocation unit of one, since at any point in time only one process has the CPU; memory has as many allocation units as the system being modeled has pages. The attributes are as follows:

  • ATRIB[1] = total CPU time needed

  • ATRIB[2] = timestamp (used for the LRU approximation)

  • ATRIB[3] = remaining time needed before the process terminates

  • LTRIB[1] = priority

  • LTRIB[2] = total number of pages the process will ever need

  • LTRIB[3] = total number of reads remaining

  • LTRIB[4] = total number of writes remaining

  • LTRIB[5] = total number of pages currently held

  • LTRIB[6] = indicates whether the entity represents the MATLAB program or the C forking program

As a note, ATRIBs are reals, whereas LTRIBs are integers. ATRIB[2] deserves more explanation: the LRU was approximated by recording a timestamp every time a memory reference was made.

However, instead of using a timestamp for every page held (which admittedly would have been much better), a single timestamp was used for each process. The logic is that whatever process least recently referenced memory also has the least recently used page. Once this algorithm was determined acceptable, another problem came up: swapping pages. If process A has a page miss, and process B has the LRU (least recently used page), process A can take a page from process B with a preempt node using ATRIB[2] to indicate priority. The problem occurs, however, when the process itself has the LRU, since a process can clearly not preempt itself. This was solved with the combination of the node HaveLRU and a user-defined function.

At the HaveLRU node, if process A has the LRU, it will go onto the SwapWSelf node; otherwise, the entity will be routed to the preempt node. The user-defined function merely examines the timestamp of all entities possessing at least one page of memory. It then returns the value of the LRU's timestamp. A branch can then be used to see if the entity's timestamp equals the LRU's timestamp. Whenever an entity does preempt another entity in this manner, it becomes necessary to increment LTRIB[5] (since it now holds more than one page). Once the preceding concepts are understood, the rest of the model becomes simple to understand.
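The per-process timestamp approximation described above can be sketched as follows. This is a hypothetical illustration; the names echo the model's HaveLRU and SwapWSelf nodes but the code is not part of the AWESIM model.

```python
# Sketch of the per-process timestamp LRU approximation: one timestamp per
# process (not per page); the process with the oldest timestamp is assumed
# to hold the least recently used page. Names are illustrative.

def lru_victim(timestamps):
    """Return the process whose most recent memory reference is oldest."""
    return min(timestamps, key=timestamps.get)

def handle_page_miss(proc, timestamps):
    """Decide the routing taken at the HaveLRU node on a page miss."""
    victim = lru_victim(timestamps)
    if victim == proc:
        return "swap_with_self"            # HaveLRU -> SwapWSelf path
    return ("preempt_page_from", victim)   # preempt node, priority = timestamp

timestamps = {"A": 10.0, "B": 3.5, "C": 7.2}  # last reference time per process
print(handle_page_miss("A", timestamps))  # -> ('preempt_page_from', 'B')
```

The special case the text describes falls out naturally: when the requesting process itself holds the oldest timestamp, `lru_victim` returns that process, and the swap-with-self branch is taken instead of a (self-)preemption.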

Model

Processes are created at the create node, create_proc, but they are not told whether they are to simulate the MATLAB program or the C forking program until the goon-node, Initialize, randomly chooses to forward the entity to the assign-node forking or the assign-node MATLAB. Once the correct attributes have been set, the process must obtain a page of memory by entering the await node, mem_wait. After the page of memory has been obtained, the fact is recorded by incrementing the value of LTRIB[5] via the assign-node GetFrstPg. Once the page has been obtained, the process waits in the ready queue. From the ready queue the process goes to the preempt-node, cpu_wait, where it will either preempt the current process (provided ATRIB[3] gives it a higher priority than the current process) and send the preempted process back to the ready queue, or it will simply wait until the current running process releases the CPU. Once the entity has control of the CPU, the next sequence of steps is determined by which process the entity is to simulate.

Forking

The entity modeling the forking program will first go to the goon-node, fork, from where it will release the CPU at the free-node, cpu_free2, and its memory at the free-node, mem_free2. Once both resources have been freed, the entity will split, sending one copy of itself (the parent) to the terminate-node end and the other copy of itself (the child) to the assign-node child, where it will record its number of pages as being zero. The child then attempts to get its first page of memory by going to the await-node, mem_wait.

MATLAB

The entity modeling the MATLAB program will first go to the goon-node, processmatlab, from where it will branch, depending on whether or not it has any reads or writes to perform. If there are no reads or writes to perform, the entity will be routed to the assign-node Ionotreq, where ATRIB[1] will be decremented by the duration of the time slice. Next, the CPU is released through the release-node cpu_free. If there is any processing remaining to be done, the entity will be routed back to the ready queue; otherwise, it will release its memory via the free-node mem_free and terminate by going to the terminate-node end.

If, however, there are any reads or writes that must be performed, the entity will first be routed to the assign-node Ioreq, where the value of ATRIB[1] will be decremented by between zero and the value of the time slice, depending on when the I/O is requested. It will then release the CPU via the free-node cpuio_free, from where it will be routed to the goon-node requestpage.

Once a page is requested, it must first be determined whether or not the process already owns that page. The model deals with this by assuming that the probability a page is already owned is equal to the current number of pages held, divided by the total number of pages needed. The value of this probability is used for the branches. If it is determined that the process does in fact have the necessary page, the entity will be routed to the goon-node GetPageFrame, which simulates finding the page frame, and is then routed to the goon-node addoffset, which indicates that the offset is being added to the page frame. After addoffset, if there are any writes to perform, the entity is routed to the goon-node write and then to the assign-node DecWrite, where the timestamp is obtained and the number of writes remaining to be performed is decremented. After the write, the entity returns to the ready queue. If no writes remain, then a read needs to be performed (since we wouldn't be here in the first place if there were neither reads nor writes). The entity will first be routed to the goon-node read, from where it will be routed to the assign-node DecRead, which is similar to the DecWrite node. After changing the attributes, the entity is routed to the ready queue.
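The branch probability described above can be sketched as a one-line predicate (illustrative; the function name is not from the model):

```python
# Model assumption from the text: the probability that a requested page is
# already owned equals pages_held / pages_needed. The rng parameter is a
# hypothetical hook so the branch can be made deterministic for testing.
import random

def page_already_owned(pages_held, pages_needed, rng=random.random):
    """Return True if the branch decides the process already owns the page."""
    return rng() < pages_held / pages_needed
```

At the extremes the model behaves sensibly: a process holding none of its pages always takes the page-miss branch, and one holding all of them (probability 1) always finds the page.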

If the process does not have the page being requested, it needs to find the LRU and exchange it with the needed page. First, it must be determined whether the process itself contains the LRU, so the entity is first routed to the goon-node HaveLRU, where it uses the user-defined function to determine whether or not it has the LRU and then branches accordingly. If it does have the LRU, the entity only needs to swap out its LRU page and swap in the new page; this is handled through the goon-node SwapWSelf, after which the entity can be routed to the goon-node GetPageFrame. If the entity does not have the LRU, it goes to a preempt node to try to take a page away from the process with the LRU (done by using the timestamp, ATRIB[2], as the priority) and use it for itself. Once it gets a page, the number of currently held pages is incremented with the assign node, and the entity is then routed to the goon-node GetPageFrame. Figure 13.1 illustrates the model for LINUX 7.2.

Figure 13.1: Model for LINUX 7.2.

AWESIM model for Windows XP

Modeling of the system

We have modeled CPU scheduling based on the Windows NT architecture. We have provided a high-level model and implemented the AWESIM model based on this high-level model. The following text describes how the CPU is scheduled in the operating system.

CPU scheduling

This operating system uses a preemptive multithreading system. That is, it lets several processes execute simultaneously and switches among them rapidly to create the illusion that each process is the only process running on the machine. This scenario assumes a uniprocessor environment.

The basic scheduling unit is a thread. The system assigns each thread a priority number from 1 to 31, where higher numbers signal higher priorities. It reserves priorities 16 through 31 (real-time priorities) for time-critical operations. Each process has a priority class, a base priority level around which its threads execute. New processes inherit the priority class of their parent, and a process's threads start at the priority level associated with its priority class.

The relative priorities that can change a thread's priority from its process class priority are highest, above normal, normal, below normal, and lowest. Threads must take turns running on the CPU so that one thread doesn't prevent other threads from performing work. One of the scheduler's jobs is to assign units of CPU time (quanta) to threads. A quantum is typically very short in duration, but threads receive quanta so frequently that the system appears to run smoothly, even when many threads are performing work. The scheduler must make a CPU scheduling decision every time one of the following three situations occurs:

  1. A thread's quantum on the CPU expires.

  2. A thread waits for an event to occur.

  3. A thread becomes ready to execute.

The scheduler executes the FindReadyThread algorithm to decide whether another thread needs to take over the CPU. If a higher-priority thread is ready to execute, it replaces (or preempts) the thread that was running. FindReadyThread and ReadyThread are the key algorithms the scheduler uses to determine how threads take turns on the CPU. FindReadyThread locates the highest-priority thread that is ready to execute. The scheduler keeps track of all ready-to-execute threads in the Dispatcher Ready List. The FindReadyThread algorithm scans the Dispatcher Ready List and picks the front thread in the highest-priority nonempty queue. ReadyThread is the algorithm that places threads in the Dispatcher Ready List. When ReadyThread receives a ready-to-execute thread, it checks to see whether the thread has a higher priority than the executing thread. If the new thread has a higher priority, it preempts the current thread and the current thread goes to the Dispatcher Ready List. Otherwise, ReadyThread places the ready-to-execute thread in the appropriate Dispatcher Ready List. At the front of the queue, ReadyThread places threads that the scheduler pulls off the CPU before they complete at least one quantum; all other threads (including blocked threads) go to the end of the queue.
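A minimal sketch of the FindReadyThread/ReadyThread logic described above, assuming one FIFO queue per priority level. This is an illustration of the described algorithms, not NT's actual implementation.

```python
# Sketch of a Dispatcher Ready List: one FIFO queue per priority (1..31).
# ReadyThread places threads in the list; FindReadyThread picks the front
# thread of the highest-priority nonempty queue. Simplified illustration.
from collections import deque

NUM_PRIORITIES = 31
ready_list = [deque() for _ in range(NUM_PRIORITIES + 1)]  # index = priority

def ready_thread(thread, priority, preempted=False):
    """Place a ready-to-execute thread in the Dispatcher Ready List."""
    if preempted:
        # Pulled off the CPU before completing a quantum: front of the queue.
        ready_list[priority].appendleft(thread)
    else:
        # Finished a quantum or just unblocked: end of the queue.
        ready_list[priority].append(thread)

def find_ready_thread():
    """Return the front thread of the highest-priority nonempty queue."""
    for pri in range(NUM_PRIORITIES, 0, -1):
        if ready_list[pri]:
            return ready_list[pri].popleft()
    return None
```

The preemption check the text describes sits on top of this: when `ready_thread` receives a thread whose priority exceeds the running thread's, the running thread is pushed back into the list and the new thread takes the CPU.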

High-level model

Figure 13.2 illustrates the high-level model of the CPU scheduler that we implemented in AWESIM.

Figure 13.2: High-level model of CPU scheduler implemented in AWESIM.

Assumptions

We have made a number of assumptions while implementing the network model. These are:

  1. Processes that have I/O are given a fixed time for those operations to occur.

  2. The quantum size is fixed.

  3. Preempted processes return to the end of the queue instead of going to the head.

  4. The model will only consider I/O operations. Interrupts and forking operations will not be considered.

AWESIM model

The AWESIM model is the implementation of the high-level model described previously. The following are the working details of the model:

  1. Each created process has the following attributes: total process time, priority I/O, and whether or not the process will perform an I/O operation.

  2. If a process has an I/O operation, then the time of occurrence and the total time for the I/O are allocated.

  3. Processes with different priorities go to the different queues.

  4. Since the CPU is allocated to a process for one quantum at a time, calculations are done to determine the remaining time of the process after the quantum has expired.

  5. Preemption is done based on priority using the preempt-node, and, again, the remaining time for the process to complete is calculated.

  6. The preempted processes are sent to a queue, which goes back to the different priority queues.

  7. All the processes go to the await-node, where they wait for the CPU. When the CPU becomes available, they use it for either the full length of the quantum or until a preemption or an I/O request occurs.

  8. When an I/O request occurs, the process waits in the I/O await-node where it will be assigned one unit of the I/O resource. The processes that have finished I/O go back to the ready queue.

  9. After a resource (I/O or the CPU) is used, it is freed to be allocated to the next entity (process).

  10. After a process completes an I/O operation, its status is changed to reflect that it does not require more I/O operations.

  11. A process is terminated upon completion of the total time allocated to it.

  12. Collect-nodes were implemented to collect statistics such as the number of processes being preempted, the number of processes with different priorities, and the number of processes performing I/O.
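The remaining-time calculation in item 4 above can be sketched as follows. The function name and quantum value are assumptions (the model only states that the quantum is fixed).

```python
# Sketch of the quantum accounting in item 4: run a process for at most one
# quantum and recompute its remaining execution time. Values are assumed.
QUANTUM = 20  # ms; the model assumes a fixed quantum size

def run_for_quantum(remaining_time):
    """Run a process for one quantum; return (time_used, time_left)."""
    used = min(QUANTUM, remaining_time)
    return used, remaining_time - used

used, left = run_for_quantum(50)   # needs more turns on the CPU
used, left = run_for_quantum(15)   # finishes within this quantum
```

A process whose `time_left` reaches zero terminates (item 11); otherwise it returns to its priority queue to wait for another quantum.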

Figure 13.3 illustrates the AWESIM model for CPU scheduling.

Figure 13.3: AWESIM model for CPU scheduling.

The design description of the AWESIM model is as follows. The high-level model concentrates on CPU scheduling; therefore the two resources in the AWESIM model are CPU and I/O. The create-node creates 100 entities (processes), which are assigned different attributes, with each attribute defining a specific function. LTRIB[0] gives the total time of execution of the process, including the time taken for executing the I/O operation. Whether a process has I/O or not is defined by LTRIB[1]. The time at which the I/O occurs within the total execution time is indicated by LTRIB[2]. Each process is assigned a priority, which is given by LTRIB[4].

In our model, a process departing the create-node will have a priority of 1, 2, or 3. Depending on the priority value, the entities are sent to their respective queues. Here the entities branch, depending on the value of LTRIB[1]. The time required by the process for execution is compared with the time slice available on the CPU, and, accordingly, the resource is made available to execute the process.

If a process with a low priority is currently being serviced by the CPU and a process with high priority comes in, then the preempt-node preempts the low-priority process and sends it to queue 4 for future service. Also, any process with I/O requirements after being serviced by the CPU will be sent to the I/O await-node, and subsequently serviced by the I/O resource. Any process with leftover execution time is sent back to the initial queues to complete its execution. The two resources are freed by using a free-node after the service.

AWESIM model results

The summary of the output is given in a report, from which we are able to find the utilization rates of the different queues, queue lengths, utilization of service activities, and the parameters related to the other nodes. From this evaluation we have been able to find the percentages of CPU utilization and I/O utilization. The inclusion of collect-nodes at every stage of the AWESIM model yields results that give details about the different priorities attributed to the processes, the number of processes that are preempted, and whether a process has I/O or not.

AWESIM model for Windows ME

High-level model

In the basic high-level model for Windows ME, every incoming process is directed to the operating system. The OS, depending on the type of service requested, directs the process either to the I/O or memory. If it is a memory request, then the operating system checks to see whether the requested data are present in the cache. If not, it checks for the data in the main memory and transfers the block of data into the cache. If it is not a memory request, then it is an I/O request. Eventually, after an I/O request or memory access, it finally goes to the CPU. Once the processing in the CPU is complete, it can get into the I/O or memory chain or terminate. Figure 13.4 illustrates the high-level network model.

Figure 13.4: High-level network model.

The simulation model for Windows ME focuses on the basic functioning of two aspects: memory access and I/O. The model deals with them in the simplest way, with the level of detail necessary to simulate the conditions realistically. There is one entity creation node. The entities emanating from it are fed into three assign-nodes, which assign a set of attributes to these entities to differentiate their behavior in the system. These are as follows:

  1. ATRIB[1] corresponds to the type of service needed: I/O or simple memory access.

  2. ATRIB[2] is for priority that ranges from 5 to 15.

  3. ATRIB[3] corresponds to the number of reinstalls or the number of times the entity needs to be serviced.

These entities pass through the assign-nodes and queue in the input queue, waiting to acquire their respective resources (cache, main memory, and I/O) at the await-nodes. The resources are assigned based on a process's priority. After acquiring the resources, they all queue at the await-node for the CPU resource (CPU clock pulses). After being serviced by the CPU, the entities move on, free their respective resources at the free-node, and terminate. Entities that need multiple services go back to the input queue; this is done by checking ATRIB[3] and changing ATRIB[1] to create realistic behavior. The model also tries to incorporate forking by creating some entities within the process. This is visible in the entity count on the activities.

Figure 13.5 illustrates the AWESIM model for Windows ME.

Figure 13.5: AWESIM model for Windows ME.

AWESIM model for Windows NT

Figure 13.6 illustrates the AWESIM model for Windows NT.

Figure 13.6: AWESIM model for Windows NT.

In developing the AWESIM model (virtual memory part), the NT team worked closely with the LINUX team. The NT team focused mainly on CPU process scheduling, I/O scheduling, and virtual memory. We could not get into details such as file management and object manager, because the internals of the operating system were not available.

Before getting into the actual model, it is essential to look at what kind of resources are being modeled and what attributes they have.

The resources are memory, CPU, and I/O manager. Of these, memory and CPU are each modeled as a single resource, whereas the I/O manager (which manages various I/O devices) is modeled as a group resource, meaning it has n resources instead of only one. The CPU has an allocation unit of one, since, at any one point in time, only one process acquires the CPU; memory has as many allocation units as the system being modeled has pages.

The attributes are as follows:

  • ATRIB[1] = The total CPU time required by a process.

  • ATRIB[2] = Defines the priority of a process. This value is assigned using a probability function. If this value is 1, then it is assumed that the process has a higher priority than the rest of the processes, and zero indicates a lower priority. There was a probability of 0.1 that a process would have a priority of one.

  • LTRIB[1] = The total number of pages a process needs. Again, to assign the number of pages, we used a random function, UNFRM, which generates values from 1 to 5.

  • LTRIB[2] = The probability that the processes might require an I/O operation such as printing, waiting on a subroutine, and so on.

  • LTRIB[3] = Number of pages requested from the memory, initially bearing a value of 0.

  • LTRIB[4] = The total number of pages currently held in memory.

  • LTRIB[5] = The timestamp, used to compute the LRU.

The way the LRU works is as follows. Every time a memory reference is made, the timestamp is updated. However, instead of using a timestamp for every page held, a single timestamp was used for each process. The logic is that whatever process least recently referenced memory also has the least recently used page.

Suppose X and Y are two processes: X has a page miss and Y has the LRU. Process X can take a page from process Y with a preempt-node using LTRIB[5] to indicate priority. But this logic has a problem whenever process X itself has the LRU, since it cannot preempt itself. This is solved by using a HaveLRU-node and a user-defined function, which simply examines the timestamps of all entities possessing at least one page of memory and returns the value of the LRU's timestamp. A branch can then be used to see if the entity's timestamp equals the LRU's timestamp. At the HaveLRU-node, the processes check for themselves whether they have the LRU. If so, they go on to the SwapSelf-node, where they simply swap pages with themselves; otherwise, they are routed to the preempt-node. Whenever an entity does preempt another entity in this manner, it becomes necessary to increment LTRIB[3] (since it now holds more than one page).

Network model

Processes are created at the create-node, create_proc. Once the correct attributes have been set, the processes must obtain a page of memory by waiting at the await-node, mem_wait. Initially all the processes are allocated a single page of memory. After acquiring the page, this fact is recorded by incrementing the value of LTRIB[3] via the assign-node GetFrstPage. Once the page has been obtained, the processes wait in the ready queue for the CPU. Thereafter, they go to a preempt-node, cpu_wait, where they will either preempt the current process (provided the CPU encounters a higher-priority process than the current process) and send the preempted process back to the ready queue, or they will simply wait until the current running process releases the CPU. Once the entity has control of the CPU, the next sequence of steps is determined by the node io_reqmnt, which determines whether it requires I/O or not depending on the value of LTRIB[2]. Here two issues arise: What if an entity requires I/O, and what if it doesn't?

What if the entities do not require I/O? They proceed from the node io_reqmnt to the node pages_mem, which assigns the value of LTRIB[3]/LTRIB[1] to LTRIB[4]. The logic behind doing this is to check whether or not the corresponding process has the required number of pages, depending on the value of LTRIB[4]. Here two further issues arise: What if a page fault occurs, and what if it does not?

If there isn't a page fault, then the processes get processed and go to an assign-node, where we update ATRIB[1] and the timestamp in LTRIB[5], thereafter freeing the CPU and sending the processes back to the ready queue.

If there is a page fault, then we free the CPU from that process, since it is waiting for another page. After doing this, we need to update ATRIB[1], since we need to take into account the time that the process already spent executing in the CPU up to that moment, that is, before the page fault occurred. After updating ATRIB[1], the process goes to the node mem_full_or_not, which checks whether the memory is full or not.

If the memory is not full, then the process gets the required number of pages by updating LTRIB[3] in an assign-node, page_alloc, and thereafter it waits for memory at the await-node, mem_wait.

If the memory is full, the entity is routed through a check_lru node, which computes the LRU using the user-defined function discussed previously and branches accordingly.

If the entity does have the LRU, the entity only needs to swap out its LRU page and swap in the new page. This is handled through the go-on node SwapSelf, after which the entity can be routed through the go-on node GetPageFrame, which simulates finding the page frame and adding the offset to it. After adding offset, the required read/write operation is performed by routing the entity through the go-on node perform_rw. After doing this, the entities are routed through an assign node, upd_time, which updates the timestamp of the entities that accessed the memory. After updating the timestamp, the entities are sent back to the ready queue, since they are done with their operation.

If the entity does not have the LRU, it goes to a preempt-node, mem_preempt, to try to take a page away from the process with the LRU (done by using the timestamp, LTRIB[5], as the priority) and use it for itself. After preempting the LRU, the timestamp, LTRIB[5], is again updated, and the entity gets the requested page by waiting at the await-node, mem_wait, for the memory resource.

What if the entities do require I/O? Entities requiring I/O are freed from the CPU and routed through an assign-node, ioreq, which updates the total execution time already spent by the entity in the CPU. After this, entities wait at an await-node, iowait, on the group resource I/O block, which has various standard I/O resources, such as monitor, mouse, keyboard, and printer. After the entities are serviced by the resource, the attribute LTRIB[2] is set to zero, assuming that the entity no longer requires additional I/O. After this, the entities are routed through a free-node, free_io, where all the resources allocated to that entity are freed. Finally, the entities are routed back to the ready queue, since they are done with their execution.

Why isn't the AWESIM model validated by experimental results?

The results of AWESIM were so obviously wrong (e.g., CPU utilization equal to 100 percent) that, even without formal validation, it was instantly clear the model was invalid. Considering that parts of the model were left out, this is not a surprise. Without those parts a valid model is impossible, so rather than trying to force the data to fit into a validation scheme, we will instead explain why the results were so poor and why that cannot be changed.

The first obviously incorrect piece of data is that the CPU is 100 percent utilized. This is because we do not simulate any I/O, so the only things that make a process give up the CPU are termination or the end of its time slice. In either case another process instantly replaces it, yielding an unrealistically high utilization rate.

All other problems stem from the same general issue: we could not obtain appropriate values for attributes. For example, we would like to know the number of reads or writes the MATLAB program performs, but all we are given are particular operations, and these operations could represent any number of reads and writes. This forced us to use an arbitrary number for reads and writes. The total number of pages a process will ever request was also unknown to the simulator, forcing an arbitrary number there as well. Other arbitrarily chosen numbers were the duration of the time slice, the time to perform a single read, the time to perform a single write, and the total length of CPU time (not counting I/O time) to complete the process.

Obviously, even if the model nodes and activities are perfect, with so many arbitrary numbers the results cannot possibly be expected to be valid. The model is useful for explaining, from a high-level point of view, how the LINUX operating system works, and was also a useful educational experience to design, but as far as determining real-world behavior goes, the model is useless.






Computer Systems Performance Evaluation and Prediction
ISBN: 1555582605
Year: 2002
Pages: 136