The overall design of the application does not change very much from the design shown in Chapter 9. But before implementing the optimization features, we must take a few considerations into account, chiefly how to implement the increased responsiveness of the GUI. Before we start making design decisions, let's look at some background information about performance optimization.
10.3.1 Performance Optimization Options
There are several possibilities for improving the overall performance of an application. These opportunities for performance optimizations can be divided into six categories:
To optimize the photo editor application, we try to address as many of the six categories as possible. To optimize utilization, we make sure that the compiler uses optimizations that generate efficient code. To improve efficiency, we will use the data we collected with the profiling tool, which identified a hot spot of inefficient execution resulting from the use of the GetPixel and SetPixel methods of GDI+.
Another aspect we keep in mind is reducing latency, or slow GUI response, within the application. In addition, we will try to increase the concurrency of the application. Unfortunately, we cannot improve throughput very much, because that is usually achieved by using multiple processors; we have no influence over the hardware users are running, and it would not be acceptable to restrict the application to run only on dual-processor machines. Last but not least, we will try to eliminate bottlenecks as we improve the application's performance.
Now that we know what we are intending to improve, we need to have some knowledge of how these improvements can be implemented.
10.3.2 Multithreading and Symmetric Multiprocessing
When talking about performance optimizations, we often mention multithreading and symmetric multiprocessing (SMP). It is no different in the case of the photo editor application.
Symmetric multiprocessing refers to the hardware of a system. If a system has more than one microprocessor available, then we talk about SMP. Each CPU executes a distinct set of program threads. The operating system is responsible for scheduling the tasks for each processor. SMP improves the throughput of a system.
Multithreading is a software construct that allows for parallel execution of software. Each thread has its own call stack and CPU state, but all threads share the memory of the process they are bound to. Multithreaded programming is used to improve concurrency, reduce latency, and improve throughput.
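The photo editor itself is implemented in C# on the .NET Framework, but the idea is language-independent. Purely as an illustration, the following sketch uses Python's threading module to show several threads, each running on its own call stack, writing into memory shared by the whole process (the names, such as worker, are placeholders, not part of the photo editor code):

```python
import threading

results = {}  # shared process memory: every thread can see this dict

def worker(name, value):
    # each thread executes this function on its own call stack
    results[name] = value * 2

threads = [threading.Thread(target=worker, args=("t%d" % i, i)) for i in range(4)]
for t in threads:
    t.start()   # begin execution alongside the main thread
for t in threads:
    t.join()    # wait for each thread to finish
```

Because all four threads write into the same dictionary, no copying between them is needed; this is the convenience, and the danger, of shared memory discussed below.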
10.3.3 Design of the Multithreaded GUI
When we talk about multithreading, a question often arises: Why not use multiple processes? The answer is that we can create a thread with very little overhead. In addition, it is convenient for the programmer that all threads created within one process share the same memory. This makes data sharing very easy (sometimes too easy, as when we must synchronize data access in order for the application to perform its task correctly).
To decide on the design of the photo editor application, let's first look at the goals of this iteration and the techniques we can use to implement them:
For the first three points on our list, multithreading seems to be a viable option for the photo editor application. Therefore, we analyze the use of multithreading in more detail in the following sections.
A common strategy to improve a GUI's user responsiveness is to implement multiple threads that can execute independently. A thread is an entity that can execute by itself within the process it is bound to. Usually an application uses one thread, which is started automatically by the system at program startup. The developer can create additional threads in a program that run in parallel.
If a computer with multiple processors is used, then two threads can run truly in parallel. If a single-processor machine is used, the operating system uses a technique called preemptive multitasking, which makes multiple threads look as if they are running in parallel even though they share one processor. It is entirely up to the operating system to schedule execution time for each thread. In addition, Intel recently released processors that support hyperthreading: the ability to execute two threads concurrently on a single physical processor.
The action of the operating system in switching between the executions of different threads is known as context switching. Windows uses a round-robin algorithm to schedule execution time for the created threads. The developer has no influence on exactly when the context switches occur. It is therefore the developer's task to synchronize access to data that is used by two or more concurrent threads. This process, known as thread synchronization, is done using data-locking methods such as mutual exclusion. In the simplest case, a thread locks the data by requesting a lock on it. Another thread requesting the same lock must wait until the first thread releases it. This implies that if a thread never releases a lock, the system might hang. Therefore, it is important to be very careful when working with multiple threads and locking mechanisms.
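Mutual exclusion can be sketched in a few lines. The example below again uses Python's threading.Lock in place of the .NET mutex used by the photo editor; the point is only that the enter/exit pair of one thread can never be interleaved with that of the other, no matter where the context switches fall:

```python
import threading

lock = threading.Lock()   # plays the role of the mutex
log = []

def critical_section(name):
    with lock:            # acquire; blocks while another thread holds the lock
        log.append(name + " enter")
        log.append(name + " exit")
    # the lock is released automatically when the with-block is left

a = threading.Thread(target=critical_section, args=("A",))
b = threading.Thread(target=critical_section, args=("B",))
a.start(); b.start()
a.join(); b.join()
# log now contains two complete enter/exit pairs, never interleaved
```

Note that the `with` statement guarantees the release even if the protected code raises an exception, which guards against the hang described above.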
In addition to increased user responsiveness, an implementation using multiple threads can result in improved performance and throughput. If computers with multiple processors or Intel processors with hyperthreading support are used, threads will execute in parallel, and that shortens the execution time. Multithreaded applications achieve improved throughput because if one thread is waiting for a lengthy operation (such as a file read from disk) to finish, another thread can use the processor's idle time for execution.
Multithreading should be used with caution because of potential problems such as race conditions and deadlocks. Both of these are very unpleasant and should be avoided by careful design and implementation.
Race conditions result in corrupted data and unexpected behavior. Because race conditions depend highly on the timing of the system's context switches, they manifest as many kinds of seemingly random misbehavior.
Deadlocks are easier to detect because the application simply appears to hang. In many cases, however, deadlocks also depend on race conditions, whose timing, as mentioned before, is not predictable.
Let's look closely at what constitutes a race condition. Imagine that thread A is applying the contrast operation to image 1. In the meantime, a second thread, thread B, applies the color correction function to the same image, image 1. In this case the image is being read and written by two threads at the same time. It is up to the operating system to distribute time to each thread. For the developer, this means that there is no way to find out which operation is applied first to a pixel. In some cases, the order of operations does not matter, but in many cases it is essential to the output.
The problem can be solved by synchronizing access to the resource (in this case, the image) that is used to perform the requested operation. This synchronization is also called locking. Locking enables programmers to restrict access to a resource to a single thread at a time. In our example, the loaded image needs to be locked while it is altered by thread A. If thread B tries to acquire the lock, it will fail because the image is locked. When thread A finishes its calculation, it releases the lock, and thread B can acquire the lock and perform its operation on the image. Whenever two threads are trying to write to the same data or whenever one thread is writing data that another thread is reading, a lock should be used to ensure that the result of the applied operation is as expected (situations in which two threads are reading the same data are obviously OK as long as the data is not altered). If a lock for an object is acquired by a thread, a second thread that tries to acquire the same lock must wait until the first thread releases it.
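The contrast/color-correction scenario can be sketched as follows, again in illustrative Python rather than the C# of the photo editor. A list of pixel values stands in for the image, and each read-modify-write on a pixel is performed under the lock, so the two operations are serialized per pixel (the names adjust and image_lock are placeholders):

```python
import threading

image_lock = threading.Lock()
pixels = [128] * 8            # stand-in for the shared image data

def adjust(delta):
    # the read-modify-write on each pixel must happen under the lock;
    # otherwise a context switch between the read and the write could
    # make one thread overwrite the other's result
    for i in range(len(pixels)):
        with image_lock:
            pixels[i] = max(0, min(255, pixels[i] + delta))

t1 = threading.Thread(target=adjust, args=(30,))    # "contrast" thread
t2 = threading.Thread(target=adjust, args=(-50,))   # "color correction" thread
t1.start(); t2.start()
t1.join(); t2.join()
```

With the lock in place, each pixel receives both adjustments exactly once, whichever thread the scheduler runs first, so the final image is deterministic.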
This type of serialization is very important when you deal with threads, but it can cause the application to hang if a deadlock is caused by a failure to release the lock when appropriate. Usually, locks should be acquired just before the start of the operation for which the lock is needed and should be released as soon as the data is no longer being used or altered.
A deadlock is a situation in which two or more threads block one another because each holds a lock that another thread needs in order to proceed. Deadlocks can arise in many scenarios. Imagine three threads called thread A, thread B, and thread C. Thread A holds a lock on object A, thread B holds a lock on object B, and thread C holds a lock on object C. Now thread A tries to acquire a lock on object B, which is held by thread B; thread B waits for a lock on object C; and thread C tries to get a lock on object A. None of the threads can proceed, because each needs an object that is locked by another thread. Figure 10.3 shows this deadlock scenario.
Figure 10.3. Deadlock Example
To recover from this situation, the threads must release the locks on all the objects they are working on and retry after different back-off intervals.
The best way to recover from deadlocks is to avoid them. For example, the threads could acquire all the locks they need in the same order. This would mean that thread C should acquire the lock first on object A and then on object C. In this way, if object A is locked by thread A, thread C would have to wait until object A was released. At the same time, thread A would have to wait for thread B to release object B, something that eventually will happen because thread B can now acquire the lock to object C, do the necessary work, and release the locks for objects B and C, thereby enabling thread A to do its work, and so on. The described scenario can be seen in Figure 10.4.
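The lock-ordering rule can be demonstrated in a short sketch (illustrative Python; the lock names and the global order A-before-B-before-C are taken from the scenario above). Because every thread acquires its locks in the same global order, the circular wait of Figure 10.3 cannot form and all three threads run to completion:

```python
import threading

lock_a, lock_b, lock_c = threading.Lock(), threading.Lock(), threading.Lock()
completed = []

# Global acquisition order: lock_a before lock_b before lock_c.
def worker(name, first, second):
    with first:          # always the earlier lock in the global order
        with second:     # then the later one
            completed.append(name)

threads = [
    threading.Thread(target=worker, args=("A", lock_a, lock_b)),
    threading.Thread(target=worker, args=("B", lock_b, lock_c)),
    # thread C needs objects A and C; it takes lock_a first, not lock_c,
    # to respect the global order and break the potential cycle
    threading.Thread(target=worker, args=("C", lock_a, lock_c)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()   # returns: no thread can be stuck in a circular wait
```

If thread C instead took lock_c first and lock_a second, the cycle of Figure 10.3 would again be possible.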
Figure 10.4. Avoiding a Deadlock
Design of the Multithreading Feature
The strategy for the photo editor application is to provide a single mutex (mutual exclusion) object, which enables the developer to protect data against corruption by another thread during a calculation. The threads are created using the thread pool capabilities provided by the .NET Framework. The image-processing operations are calculated on a copy of the image data; after the thread finishes, the resulting image is shown on the screen.
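The shape of this design, one pool-managed worker thread per operation and a single mutex guarding the image, can be sketched as follows. This is illustrative Python, not the C# of the sample solution: ThreadPoolExecutor stands in for the .NET thread pool, threading.Lock for the mutex, and apply_image_processing is a placeholder loosely corresponding to the ApplyImageProcessing entry point described below:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

mutex = threading.Lock()                 # the single mutex of the design
image = {"data": [0] * 4, "version": 0}  # stand-in for the shared image

def apply_image_processing(delta):
    # entry point queued on the thread pool; the mutex protects the
    # image for the duration of the whole operation
    with mutex:
        image["data"] = [p + delta for p in image["data"]]
        image["version"] += 1

with ThreadPoolExecutor(max_workers=2) as pool:
    for delta in (10, 10, 10):
        pool.submit(apply_image_processing, delta)
# leaving the with-block waits for all queued operations to finish
```

Queuing work items on a pool avoids the cost of creating a fresh thread per operation, which is exactly why the design relies on the Framework's pool rather than raw threads.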
In the design review for this chapter, we discover that the class diagram provided in the sixth iteration was not complete. Therefore, we log a documentation defect on the defect tracking sheet (provided in the doc directory of this chapter's sample solution on the accompanying CD) and update the class diagram to fix the issue. The class diagram of the photo editor changes slightly: the ApplyImageProcessing method and the necessary properties move into a new class called PlugInInterface. This interface locks the data and invokes the plugin at run time. In addition, the ApplyImageProcessing method of the PlugInInterface class serves as the entry point of the thread created to do the image processing. The updated class diagram is shown in Figure 10.5.
Figure 10.5. Class Diagram with PlugInInterface
For a multithreaded application, it is good practice to provide a state chart for the typical lock scenario. This clarifies the states of the multithreaded system in the critical scenario.
State Chart of the Lock Scenario
The critical scenario for the image-processing operations of the photo editor occurs if two threads are trying to use the same image data for read or write access at the same time. The state chart in Figure 10.6 shows the handling of such scenarios.
Figure 10.6. State Chart of the Critical Lock Scenario
The state chart shows that a second execution of a plugin component must wait until the first thread has finished the image processing and has released the lock. After the lock has been released, the second thread can acquire the lock and start the calculation.
After the design changes for the optimization functionality are reviewed and released, the project team is ready to move on to the implementation workflow.