It would be impossible for us to cover every conceivable area of computer knowledge that could appear on a résumé or in an interview. Instead, this chapter provides a representative sample of knowledge-based questions. These questions focus on system-level issues, trade-offs between various ways of programming, and advanced features of languages. All these topic areas make sense from the interviewer’s perspective. A candidate who claims to know a lot about computers but who isn’t aware of basic system-level issues such as virtual memory and disk cache certainly doesn’t seem very knowledgeable. Furthermore, many job assignments are not of the variety “Solve this problem by implementing this algorithm in this language,” but may be more along the lines of “We have this problem that we need solved.” A candidate who understands the trade-offs between various solutions and when to use each one is always preferred to a candidate who does not understand these differences. Finally, these questions enable the interviewer to assess experience and filter out résumé padding. It’s unlikely that an experienced developer would have problems answering questions about advanced features in a language that she had used in development for some time. However, an inexperienced programmer or a resume padder might stumble. These questions can help interviewers separate the wheat from the chaff.
Offer specific and thorough responses.
One final note: The answers presented here are researched answers that result from many people thinking about a question and coming up with the best answer. As a candidate hearing a question for the first time, you are not expected to replicate such detailed solutions. Consider these answers to be the goal, and try to get as close to these solutions as possible.
What are the differences between C++ and Java?
C++ and Java are syntactically very similar. Java’s designers intended this to make it easy for C++ developers to learn Java. Apart from this area of similarity, Java and C++ differ in a variety of ways, largely because of their different design goals. Security, portability, and rapid development were of paramount importance in the design of Java, whereas C++ is more concerned with performance and backward compatibility with C. Java is compiled to virtual machine byte-code and requires a virtual machine to run; C++ is compiled to native machine code. This usually makes C++ faster, but it gives Java greater potential for portability and security.
C++ is a superset of C and maintains features such as programmer-controlled memory management, pointers, and a preprocessor for full backward compatibility with C. In contrast, Java eliminates these and other bug-prone features. Java replaces programmer memory deallocations with garbage collection. Java further dispenses with C++ features such as operator overloading and multiple inheritance. These choices are seen by some to make Java a better choice for rapid development and for projects where portability and security are more important than performance.
A limited form of multiple inheritance can be simulated in Java using interfaces.
In Java, all objects are passed by reference, whereas in C++, the default behavior is to pass objects by value. Java does not perform automatic type casting as C++ does, though newer Java features such as generics and autoboxing now handle many common cases. In Java, all methods are virtual, meaning the implementation for a method is selected according to the type of the object as opposed to the type of the reference. In C++, methods must be explicitly declared as virtual. Java has defined sizes for primitive data types, while type sizes are implementation-dependent in C++ (and C).
In situations where there is legacy C code and a great need for performance, C++ has certain benefits, especially when low-level system access is required. In situations where portability, security, and speed of development are emphasized, Java (or a similar language such as C#) may be a better choice.
Discuss friend classes in C++ and give an example of when you would use one.
The friend keyword is applied to either a function or a class. It gives the friend function or friend class access to the private members of the class in which the declaration occurs. Some programmers feel this feature violates the principles of object-oriented programming because it allows a class to operate on another class’s private members. This violation can, in turn, lead to unexpected bugs when a change in the internal implementation of a class causes problems with the friend class that accesses it.
In some cases, however, the benefits of a friend class outweigh its drawbacks. For example, suppose you implemented a sophisticated dynamic array class. Imagine that you wanted a separate class to iterate through your array. The iterator class would probably need access to the dynamic array class’s private members to function correctly. It would make sense to declare the iterator as a friend to the array class. The workings of the two classes are inextricably tied together already, so it probably doesn’t make sense to enforce a meaningless separation between the two.
Note that while Java and C# do not support the concept of friend classes, they do support nested classes, and nested classes have access to their enclosing class’ private data and methods. nested classes can therefore take the place of friend classes in many instances.
Assume you have the class hierarchy shown in Figure 14-1.
You are given a method that takes a B as an argument. Which of the classes can you pass to the method?
Clearly, you can pass B because that’s exactly what the method takes. You can’t possibly pass D because it may have totally different characteristics than B. A is the parent class of B. Consider that a child class is required to implement all of the methods of the parent, but the parent does not necessarily have all of the methods of a child. Thus, the parent class, A, cannot be passed to the method. C is the child class of B and is guaranteed to have all of the methods of B, so you can pass C to the method.
A hash table does one thing well. It can store and retrieve data quickly (in O(1) or constant time). However, its uses beyond this are limited.
Discuss what garbage collection is and explain different ways it can be implemented.
Garbage collection is the process by which a program automatically finds and reclaims memory that is no longer used or no longer accessible by an application. This reclamation occurs without programmer assistance. C#, Java, Lisp, and Perl are examples of languages with garbage-collection facilities.
Garbage collection provides several advantages over having a programmer explicitly deallocate memory. It eliminates bugs due to dangling pointers and memory leaks. It also promotes greater simplicity in program and interface design because the complicated mechanisms traditionally used to ensure that memory is properly freed (such as “smart pointers” in C++) are unnecessary. In addition, because programmers don’t have to worry about memory deallocation, program development proceeds at a more rapid pace.
Garbage collection is not without its disadvantages, however. Garbage-collected programs often run more slowly because of the overhead needed for the system to determine when to deallocate and reclaim memory no longer needed. In addition, the system will occasionally over-allocate memory and may not free memory at the best possible time.
One method of garbage collection is to use reference counting. This involves tracking how many variables reference an object. Initially, there will be one reference to a piece of memory. The reference count will increase if the variable referencing it is copied. When a variable referencing an object changes value or goes out of scope, the object’s reference count is decremented. If a reference count ever goes to 0, the memory associated with the object is freed: If no one is keeping a reference to the object, then the object (and hence its memory) is no longer needed.
Reference counting is simple and relatively fast. However, it doesn’t handle circular references. Consider what happens in the case of a circularly linked list with nothing external pointing to it. Every element in the list has a nonzero reference count, yet the memory isn’t referenced by any object outside the list itself. Thus, the memory could safely be deallocated, but reference-based garbage collection won’t free it.
A second method of garbage collection is known as mark and sweep. In the first pass, the memory manager will mark all objects that can be accessed by any thread in the program. In the second pass, all unmarked objects are deallocated, or swept away.
Mark and sweep handles circular references, but it’s also less efficient. The garbage collector runs at different points in an application’s execution and may cause the application to pause while the garbage collecting occurs.
What are the two major issues in networking performance?
Any network can be measured by two major characteristics: latency and bandwidth. Latency refers to how long it takes a given bit of information to get through the network. Bandwidth refers to the rate at which data moves through the network once communication is established. The perfect network would have infinite bandwidth and no latency.
A pipe is a good analog for a network. The time it takes for a molecule of water to go through the whole pipe is determined by the length; this is analogous to the latency. The width of the pipe determines the bandwidth: how much water can pass in a given time.
Latency and bandwidth problems are often encountered when searching the Web. If you wait a long time for a page to display and then it appears quickly, this indicates good bandwidth but high latency. On the other hand, if a page starts loading right away but takes a long time to load, that is a symptom of a low-latency, low-bandwidth connection.
Discuss the differences between symmetric key cryptography and public key cryptography. Give an example of when you would use each.
Symmetric key cryptography, also called shared key cryptography, involves two people using the same key to encrypt and decrypt information. Public key cryptography makes use of two different keys: a public key for encryption and a private key for decryption. Symmetric key cryptography has the advantage that it’s much faster than public key cryptography. It is also generally easier to implement, less likely to involve patented algorithms, and usually requires less processing power. On the downside, the two parties sending messages must agree on the same private key before securely transmitting information. This is often inconvenient or even impossible. If the two parties are geographically separated, then a secure means of communication is needed for one to tell the other what the key will be. In a pure symmetric key scenario, secure communication is generally not available. If it were, there would be little need for encryption to create another secure channel.
Public key cryptography has the advantage that the public key, used for encryption, does not need to be kept secret for encrypted messages to remain secure. This means public keys can be transmitted over insecure channels. Often, people use public key cryptography to establish a shared session key and then communicate via symmetric key cryptography using the shared session key. This solution provides the convenience of public key cryptography with the performance of shared key cryptography.
Both public key and symmetric key cryptography are used to get secure information from the Web. First your browser establishes a shared session key with the Web site using public key cryptography. Then you communicate with the Web site using symmetric key cryptography to actually obtain the private information.
If you discover a new cryptography algorithm, should you use it immediately?
This is not a trick question, but goes to the heart of modern cryptography. Basically, no algorithm stays secret for long, and almost every algorithm has at least minor bugs in it or in its implementation. It’s virtually impossible to hide an algorithm given the number of people who develop it and know about it and the advanced techniques used by today’s best crackers. If your security is based on the secrecy of your algorithm, you have what is called security by obscurity, which is effectively no security at all. It’s very likely that a determined cracker could discover your algorithm. This can render your security worse than useless because you think you have security when in fact you have none.
Thus, it’s best to make any algorithm public from the beginning and flush out the bugs, rather than keep it secret and encounter a lot of security problems when it is discovered. Only keys should be kept secret. If, after extensive public review and discussion, your algorithm is accepted as secure, then you can probably securely use the algorithm.
Compare and contrast a hash table and a binary search tree. If you were designing the address book data structure for a personal digital assistant (PDA) with limited memory, which one would you use?
A binary search tree can insert and retrieve in O(log(n)). This is fast, though not as fast as a hash table’s O(1). A binary search tree, however, also maintains its data in sorted order.
In a PDA, you want to keep as much memory as possible available for data storage. If you use an unordered data structure such as a hash table, you need additional memory to sort the values, as you undoubtedly want to display the values in alphabetical order. Therefore, if you use a hash table, you have to set aside memory for sorting that could otherwise be used as storage space.
If you use a binary search tree, you won’t have to waste memory or processing time on sorting records for display. Although binary tree operations are slower than hash table operations, a device like this is likely to have no more than about 10,000 entries, so a binary search tree’s O(log(n)) lookup will undoubtedly be fast enough. For these reasons, a binary search tree is better suited for this kind of task than a hash table.