19.14 Postmortem | UNIX Systems Programming: Communication, Concurrency and Threads

Team-FLY

This section describes common pitfalls and mistakes that we have observed in student implementations of the servers described in this chapter.

19.14.1 Threading and timing errors

Most timing errors for this type of program result from an incorrect understanding of TCP. Do not assume that an entire request can be read in a single read, even if you provide a large enough buffer. TCP provides an abstraction of a stream of bytes without packet or message boundaries. You have no control over how much will be delivered in a single read operation because the amount depends on how the message was encapsulated into packets and how those packets were delivered through an unreliable channel. Unfortunately, a program that makes this assumption works most of the time when tested on a fast local area network.

Whether writing a tunnel, proxy or gateway, do not assume that a client first sends its entire request and then the server responds. A program that reads from the client until it detects the end of the HTTP request does not follow the specification. Your program should simultaneously monitor the incoming file descriptors for both the client and the origin server. (See Sections 12.1 and 12.2 for approaches to do this.)

According to the specification, passmonitor should measure the time it takes to process each client request. How you approach this depends, to some extent, on your method of handling multiple file descriptors. In any case, do not measure the start time before the accept call because doing so incorporates an indefinite client "think" time. Do not measure the end time right after the fork call if you are using multiple processes, right after pthread_create if you are using multiple threads, or right after select if you are monitoring multiple descriptors in a single thread of execution. Why not?

Be sure that the time values you measure are reasonable. Most time- related library functions return seconds and milliseconds , seconds and microseconds, or seconds and nanoseconds. A common mistake is to confuse the units of the second element. Another common mistake is to subtract the start and end times without allowing for wrap-around . If you come out with a time value in days or months, you know that you made a mistake.

Do not use sleep to "cover up" incorrectly synchronized code. These programs should not need sleep to work correctly, and the presence of a sleep call in the code is a tip-off that something is seriously wrong.

Logging of headers also presents a timing problem. If you write one header line at a time to the log file, it is possible that headers for responses and requests will be interleaved. Accumulate each header in a buffer and write it by using a single write function when your program detects that the header is complete.

Do not connect to the destination web server in the tunnel programs before accepting a client connection. If you type fast enough during testing, you might not detect a problem. However, most web servers disconnect after a fairly short time when no incoming request appears.

19.14.2 Uncaught errors and bad exits

If you did not seriously or correctly address how your servers react to errors and when they should exit, your running programs may represent a system threat, particularly if they run with heightened privileges.

A server usually should run until the system reboots, so think about exit strategies. Do not exit from any functions except the main function. In general, other functions should either handle the error or return an error code to the caller. Do not exit if the proxy fails to connect to the destination web server ”the problem may be temporary or may just be for that particular server. In general, a client should not be able to cause a server to exit. The server should exit only if there is an unrecoverable error due to lack of resources (memory, descriptors, etc.) that would jeopardize future correct execution. Remember the Mars Pathfinder (see page 483)! For these programs, a server should exit only when it fails to create a socket for listening to client requests. You should think about what actions to take in other situations.

Programs in C continue to execute even when a library function returns an error, possibly causing a fatal and virtually untrackable error later in the execution. To avoid this type of problem, check the return value for every library function that can return an error.

Releasing resources is always important. In servers, it is critical. Close all appropriate file descriptors when the client communication is finished. If a function allocates buffers, be sure to free them somewhere. Check to see that resources are freed on all paths through each function, paying particular attention to what happens when an error occurs.

Decide when a function should output an error message as well as return an error code. Use conditional compilation to leave informational messages in the source without having them appear in the released application. Remember that in the real world those messages have to go somewhere ”probably to some unfortunate console log. Write messages to standard error, not to standard output. Usually, standard error is redirected to a console log ”where someone might actually read the message. Also, the system does not buffer standard error, so the message appears when the error occurs.

19.14.3 Writing style and presentation

Most significant projects have an accompanying report or auxiliary documentation. Here are some things to think about in producing such a report.

Clean up the spelling and grammar. No one is going to believe that the code is debugged if the report isn't. Using (and paying attention to) a grammar checker won't make you a great writer, but it will help you avoid truly awful writing. Be consistent in your style, typeface, numbering scheme and use of bullets. Not only does this attention to detail result in a more visually pleasing report, but it helps readers who may use style as a cue to meaning. Put some thought into the layout and organization of your report. Use section titles and subsection titles to make the organization of the report clear. Use paragraph divisions that are consistent with meaning. If your report contains single- spaced paragraphs that are a third of a page or longer, you probably need more paragraphs or more conciseness. Avoid excessive use of code in the report. Use outlines, pseudocode or block diagrams to convey implementation details. If readers want to see code, they can look at the programs.

Pay attention to the introduction. Be sure that it has enough information for readers to understand the project. However, irrelevant information is sometimes worse than no information at all.

Diagrams are useful and can greatly improve the clarity of the presentation, but a diagram that conveys the wrong idea is worse than no diagram. Ask yourself what information you are trying to convey by the diagram, and distinguish that information with carefully chosen and consistent symbols. For example, don't use the same style box to represent both a process and a port, or the same type of arrow to represent a connection request and a thread.

Use architectural diagrams to convey overall structure and differences in design. For example, if contrasting the implementations of the tunnel and the proxy, give separate architectural diagrams for each that are clearly distinct. Alternatively, you could give one diagram for both (not two copies of the same diagrams) and emphasize that the two implementations have the same communication structure but differ in other ways.

On your final pass, verify that the report agrees with the implementation. For example, you might describe a resource-naming scheme in the report and then modify it in the program during testing. It is easy to forget to change the documentation to reflect the modifications. Section 22.12 gives some additional discussion about technical reports .

19.14.4 Poor testing and presentation of results

Each of the tunnel and proxy programs should be tested in a controlled environment before being tested with browsers and web servers. Otherwise, you are contending with three linked systems, each with unknown behavior. This configuration is impossible to test in a meaningful way.

A good way to start is to test the tunnel programs with simple copying programs such as Programs 18.1 and 18.3 to be sure that tunnel correctly transfers all of the information. Be sure that ordinary and binary files are correctly transmitted for all versions. Testing that the program transmitted data is not the same as testing to see that it transmitted correctly. Use diff or other utilities to make sure that files were exactly transmitted.

Avoid random test syndrome by organizing the test cases before writing the programs. Think about what factors might affect program behavior ”different types of web pages, different types of servers, different network connections, different times of day, etc., and clearly organize the tests.

State clearly in the report what tests were performed, what the results were, and what aspect of the program these tests were designed to exercise. The typical beginner's approach to test reporting is to write a short paragraph saying the program worked and then append a large log file of test results to the report. A better approach might be to organize the test results into a table with annotations of the outcomes and a column with page numbers in the output so that the reader can actually find the tests.

Always record and state the conditions under which tests or performance experiments were run (machines, times of day, etc.). These factors may not appear to be important at the time, but you usually can't go back later and reconstruct these details accurately. Include in your report an analysis of what you expected to happen and what actually did happen.

19.14.5 Programming errors and bad style

Well-written programs are always easier to debug and modify. If you try to produce clean code from the initial design, you will usually spend less time debugging.

Avoid large or inconsistent indentation ”it generally makes complicated code difficult to follow. Also avoid big loops ”use functions to reduce complexity. For example, parsing the GET line of an HTTP request should be done in a function and tested separately.

Don't reinvent the wheel. Use libraries if available. Consolidate common code. For example, in the proxy, call the same function for each direction once the GET line is parsed. Do not assume that a header or other data will never exceed some arbitrary, predetermined size . It is best to include code to resize arrays (by realloc ) when necessary. Be careful of memory leaks. Alternatively, you could use a fixed-size buffer and report longer requests as invalid. Be sure your buffer size is large enough. In no circumstance should you write past the end of an array. However, be cognizant of when a badly behaved program (e.g., a client that tries to write an infinitely long HTTP request) might cause trouble and be prepared to take appropriate action.

Always free allocated resources such as buffers, but don't free them more than once because this can cause later allocations to fail. Good programming practice suggests setting the pointer argument of free to NULL after the call, since the free function ignores NULL pointers. Often, a function will correctly free a buffer or other resource when successful but will miss freeing it when certain error conditions occur.

Do not use numeric values for buffer sizes and other parameters within the program. Use predefined constants for default and initial values so that you know what they mean and only have to modify them in one place. Be careful about when to use a default value and when not to. Mistakes here can be difficult to detect during testing. For example, the absolute URL contains an optional port number. You should not assume port 80 if this optional number is present. Be sure that all command-line arguments meet their specifications.

Parsing the HTTP headers is quite difficult. If you implement robust parsing, you need to assume that lines can end in a carriage return followed by a line feed, by just a line feed, or by just a carriage return. The line feed is the same as the newline character. If you did this parsing inline in the main loop, you probably didn't test parsing very well ”how could you?

Headers in HTTP are in ASCII format, but resources may be in binary format. You will need to switch strategies in the middle of handling input.

Team-FLY