Once you’ve answered the problem, you may be asked about the efficiency of your implementation. Often, you have to compare trade-offs between your implementation and another possible solution and identify the conditions that make each option more favorable. Common questions focus on memory or space usage, especially when recursion is involved.
A good understanding of big-O analysis is critical to making a good impression with the interviewer. Big-O analysis is a form of run-time analysis that measures the efficiency of an algorithm in terms of the time it takes for the algorithm to run as a function of the input size. It’s not a formal benchmark, just a simple way to classify algorithms by relative efficiency.
Most coding problem solutions in this book include a run-time analysis to help you solidify your understanding of the algorithms.
Let’s start with an example of big-O analysis in action. Consider a simple function that returns the maximum value stored in an array of non-negative numbers. The size of the array is n. There are at least two easy ways to implement the function. In the first alternative, you keep track of the current largest number as the function iterates through the array and return that value when you are done iterating. This implementation, called CompareToMax, looks like this:
/* Returns the largest integer in the array */
int CompareToMax(int array[], int n)
{
    int curMax, i;

    /* Make sure that there is at least one element in the array. */
    if (n <= 0)
        return -1;

    /* Set the largest number so far to the first array value. */
    curMax = array[0];

    /* Compare every number with the largest number so far. */
    for (i = 1; i < n; i++) {
        if (array[i] > curMax) {
            curMax = array[i];
        }
    }

    return curMax;
}
The second alternative compares each value to all the other values. If all other values are less than or equal to a given value, that value must be the maximum value. This implementation, called CompareToAll, looks like this:
/* Returns the largest integer in the array */
int CompareToAll(int array[], int n)
{
    int i, j;
    bool isMax;

    /* Make sure that there is at least one element in the array. */
    if (n <= 0)
        return -1;

    for (i = 0; i < n; i++) {
        isMax = true;
        for (j = 0; j < n; j++) {
            /* See if any value is greater. */
            if (array[j] > array[i])
                isMax = false; /* array[i] is not the largest value. */
        }

        /* If isMax is true, no larger value exists; array[i] is the max. */
        if (isMax)
            break;
    }

    return array[i];
}
Both of these functions correctly return the maximum value. Which one is more efficient? You could try benchmarking them, but it’s usually impractical (and inefficient) to implement and benchmark every possible alternative. You need to be able to predict an algorithm’s performance without having to implement it. Big-O analysis enables you to do exactly that: compare the predicted relative performance of different algorithms.
In big-O analysis, input size is assumed to be n. In this case, n simply represents the number of elements in an array. In other problems, n may represent the number of nodes in a linked list, the number of bits in a data type, the number of entries in a hash table, and so on. After figuring out what n represents, you determine how many times the n input items are examined, expressed as a function of n. “Examined” is a fuzzy word because algorithms differ greatly. Commonly, an examination might be something like adding an input value to a constant, creating a new input item, or deleting an input value. In big-O analysis, these operations are all considered equivalent. In both CompareToMax and CompareToAll, “examine” means comparing an array value to another value.
In CompareToMax, each array element was compared once to a maximum value. Thus, the n input items are each examined once, resulting in n examinations. This is considered O(n), usually referred to as linear time: The time required to run the algorithm increases linearly with the number of input items.
You may notice that in addition to examining each element once, there is a check to ensure that the array is not empty and a step that initializes the curMax variable. It may seem more accurate to call this an O(n + 2) function to reflect these extra operations. Big-O analysis, however, yields the asymptotic running time, the limit of the running time as n gets very large. As n approaches infinity, the difference between n and n + 2 is insignificant, so the constant term can be ignored. Similarly, for an algorithm running in n + n^2 time, the difference between n^2 and n + n^2 is negligible for very large n. Thus, in big-O analysis you eliminate all but the highest-order term, the term that is largest as n gets very large. In this case, n is the highest-order term. Therefore, the CompareToMax function is O(n).
The analysis of CompareToAll is a little more difficult. First, you have to make an assumption about where the largest number occurs in the array. For now, assume that the maximum element is at the end of the array. In this case, this function may compare each of n elements to n other elements. Thus, there are n × n = n^2 examinations, and this is an O(n^2) algorithm.
The analysis so far has shown that CompareToMax is O(n) and CompareToAll is O(n^2). This means that as the array grows, the number of comparisons in CompareToAll will become much larger than in CompareToMax. Consider an array with 30,000 elements. CompareToMax compares on the order of 30,000 elements, whereas CompareToAll compares on the order of 900,000,000 elements. You would expect CompareToMax to be much faster because it examines 30,000 times fewer elements. In fact, one benchmark timed CompareToMax at less than 0.01 seconds, while CompareToAll took 23.99 seconds.
The fastest-possible running time for any run-time analysis is O(1), commonly referred to as constant running time. An algorithm with constant running time always takes the same amount of time to execute, regardless of the input size.
You may think this comparison was stacked against CompareToAll because the maximum value was at the end. This is true, and it raises the important issues of best-case, average-case, and worst-case running times. The analysis of CompareToAll was a worst-case scenario: the maximum value was at the end of the array. Consider, however, the average case, where the largest value is in the middle. You end up checking half the values (n/2 of them) against all n values before finding the maximum, which comes to n(n/2) = n^2/2 checks. This would appear to be an O(n^2/2) running time. Consider, though, what the 1/2 factor means. The actual time to check each value depends heavily on the machine instructions that the code translates to and on the speed at which the CPU can execute them, so the 1/2 doesn’t mean very much. You could even come up with an O(n^2) algorithm that was faster than an O(n^2/2) algorithm. In big-O analysis, you drop all constant factors, so the average case for CompareToAll is no better than the worst case. It is still O(n^2).
The best-case running time for CompareToAll is better than O(n^2). In this case, the maximum value is at the beginning of the array. The maximum value is compared to all other values only once, so the result is an O(n) running time.
Note that in CompareToMax, the best-case, average-case, and worst-case running times are identical. Regardless of the arrangement of the values in the array, the algorithm is always O(n).
Ask the interviewer which scenario they’re most interested in. Sometimes there are clues to this in the problem itself. Some sorting algorithms with terrible worst cases for unsorted data may nonetheless be well suited for a problem if the input is already sorted.
The general procedure for big-O run-time analysis is as follows:
1. Figure out what the input is and what n represents.
2. Express the number of operations the algorithm performs in terms of n.
3. Eliminate all but the highest-order terms.
4. Remove all constant factors.
For the algorithms you’ll be tested on, big-O analysis should be straightforward as long as you correctly identify the operations that are dependent on the input size.
Algorithm optimizations do not always yield the expected changes in their overall running times. Consider the following optimization to CompareToAll: Instead of comparing each number to every other number, compare each number only with the numbers that follow it in the array. In essence, every number before the current number has already been compared to the current number. Thus, the algorithm is still correct if you compare only to numbers occurring after the current number.
What’s the worst-case running time for this implementation? The first number is compared with n – 1 numbers, the second with n – 2, the third with n – 3, and so on, resulting in a number of comparisons equal to (n – 1) + (n – 2) + (n – 3) + … + 1. This is a very common result, a mathematical series whose sum is expressed as n^2/2 – n/2. n^2 is the highest-order term, so this version of the algorithm still has an O(n^2) running time in the worst case. For large input values, the change you make to the algorithm has no real effect on its running time.