Evaluation - The Four Levels and ROI


Without measurement there is no performance improvement.

—W. Edwards Deming

Evaluation: the most powerful and underutilized tool in the entire arsenal of instructional systems design.

—Roger Chevalier

Kirkpatrick's Four Levels

Together with needs assessments, evaluations are among the most strategic tools available to the trainer or performance improvement practitioner. The evaluation of training programs involves four levels, originally suggested by Donald Kirkpatrick in the 1950s. In a series of four articles on "Techniques for Evaluating Training Programs," published in the Journal of the American Society of Training Directors, Kirkpatrick described how courses could be evaluated from four different perspectives (see Figure 3). The terms he used for these four levels were reaction, learning, behavior, and results. In what follows, they are referred to as evaluation (attitude), knowledge (cognitive), application (behavioral skills), and impact (financial results):

  • Level 1: Evaluating the Course (How to Improve the Course)

  • Level 2: Knowledge of What Was Taught (e.g., Certification of the Learner)

  • Level 3: Applying It on the Job (Transfer to the Real World)

  • Level 4: Impact on the Business (ROI and Bottom-line Profit)

Figure 3: The Four Levels of Evaluation.

Level 1: Evaluating the Course (How to Improve the Course)

He made measurements everywhere, so that not one inch would be unaccounted for.

—Sherlock Holmes, The Sign of the Four, 1890

Level 1 evaluations survey students' opinions of the course, including its content, tests, and delivery.

  • Evaluation Source: Survey at the end of the course.

  • Benefit: Gives direction on how the course and its delivery can be improved in the future.

  • Other Uses: The survey can be administered not just at the end of the class but monthly thereafter, to determine which elements of the course are most useful on the job.

  • Challenge: Devising survey questions that are genuinely helpful in evaluating how the course might be improved.
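A minimal sketch of how end-of-course reaction data might be summarized; the survey items and the 1-to-5 rating scale are illustrative assumptions, not taken from the text:

```python
from statistics import mean

# Hypothetical Level 1 responses: each dict is one student's ratings (1-5)
# of the course areas surveyed. All item names and values are made up.
responses = [
    {"content": 4, "tests": 3, "delivery": 5},
    {"content": 5, "tests": 4, "delivery": 4},
    {"content": 3, "tests": 2, "delivery": 4},
]

def summarize(responses):
    """Average each survey item so weak areas of the course stand out."""
    return {item: round(mean(r[item] for r in responses), 2)
            for item in responses[0]}

print(summarize(responses))
```

The lowest-scoring items point to where the course or its delivery most needs improvement.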

Teaching by lecture makes men mere scholars, but instructing by examination makes them learned: the student has the best chance of becoming actually great.

—Oliver Goldsmith, An Enquiry into the Present State of Polite Learning in Europe, 1759

Level 2: Knowledge of What Was Taught (Certification of the Learner)

Level 2 evaluations measure what the student has learned in the course.

  • Evaluation Source: Written exam at the end of the course.

  • Benefit: Allows for certifications or verifications of what was learned in the course. Benefits both learner and organization.

  • Other Uses: Evaluates indirectly the effectiveness of the organization's curriculum, tests, and delivery.

  • Challenge: Devising test questions that link course content to the real world of the job to be performed can be difficult.

Evaluating is the most valuable treasure of all that we value: only through evaluation does value exist.

—Friedrich Nietzsche, Thus Spoke Zarathustra, 1883

start sidebar
Test Scoring

Tests should not be graded "on a curve" (this is called "norm-referenced"), but rather on an objective scale, which is called "criterion-referenced." (A criterion is an objective standard.)

end sidebar
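The sidebar's distinction can be made concrete with a short sketch; the cut score and the sample scores are illustrative assumptions:

```python
# Six hypothetical exam scores (all values are made up for illustration).
scores = [62, 71, 78, 83, 88, 95]

# Criterion-referenced: pass/fail against a fixed, objective standard.
CUT_SCORE = 80
criterion_passes = [s for s in scores if s >= CUT_SCORE]

# Norm-referenced ("on a curve"): standing depends on the other examinees;
# here, only the top half "passes," regardless of absolute mastery.
norm_passes = sorted(scores, reverse=True)[: len(scores) // 2]

print(criterion_passes)  # mastery measured against the standard
print(norm_passes)       # mastery measured against classmates
```

Under the criterion-referenced approach, every learner can pass once the standard is met; under the curve, someone always ranks at the bottom regardless of mastery, which is why certification relies on a criterion.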

Level 3: Applying It on the Job (Transfer to the Real World)

Level 3 surveys participants one to three months after the course to determine whether they are applying their newfound knowledge on the job.

  • Evaluation Source: Several methods are used, including self-surveys as well as surveys of co-workers, managers, and direct reports. If all of these methods are utilized, Level 3 constitutes a 360-degree feedback survey.

  • Benefit: Level 3 feeds back valuable information on how effectively the learner is transferring what was learned to the job.

  • Other Uses: Level 3 evaluates not only the student's performance but also the effectiveness of the organization's courses, tests, and follow-on coaching and support programs.

  • Challenge: Designing the surveys so that they screen out factors other than training and measure only the impact of the course itself can be difficult.

Using Level 3 to Improve the Course

Level 3 surveys of learners, similar to Level 1 surveys, can be helpful in determining how courses, tests, and follow-on reinforcements can be improved. Simple Web surveys can be deployed, asking such direct questions as, "Of the 12 items you learned last month in class, which 3 do you find most useful on the job today?" Such surveys are not difficult to administer and can be of tremendous value to course developers. A side benefit of such surveys is that they also provide students with indirect reminders of what they were taught.
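A simple tally of such a Web survey might look like the following sketch; the course items and the responses are illustrative assumptions:

```python
from collections import Counter

# Each list is one learner's answer to: "Of the 12 items you learned last
# month in class, which 3 do you find most useful on the job today?"
# Item names and picks are hypothetical.
survey_picks = [
    ["active listening", "objection handling", "call planning"],
    ["call planning", "active listening", "closing"],
    ["active listening", "closing", "call planning"],
]

tally = Counter(pick for picks in survey_picks for pick in picks)
for item, votes in tally.most_common():
    print(f"{item}: {votes} votes")
```

The most-picked items tell course developers what to emphasize; rarely picked items are candidates for rework or removal.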

Level 4: Impact on the Business (ROI and Bottom-Line Profit)

He whipped out a tape measure and hurried about the room—measuring, comparing, and evaluating.

—Sherlock Holmes, The Sign of the Four, 1890

Level 4 evaluation measures the cost savings and/or added revenue that can be attributed to a course. This level is the most difficult to establish, although there are several workarounds (see "Tips for Level 4" below).

  • Evaluation Source: There are numerous sources for Level 4, which is part of the problem in evaluating this level. Calculations should consider the cost of the training (easiest to compute), financial reports of the organization, and measures of on-the-job performance improvement of students following the course. Because on-the-job improvement is difficult to verify, this measure is often softened from a "proof" to a "correlation"—see "Tips for Level 4" below.

  • Benefit: Measures bottom-line results (company profits or return on investment) resulting from the training. If these can't be confirmed, there are other measures possible: cost reductions, reductions in cycle time, time-to-market, or time-to-competency.

  • Other Uses: Level 4 indicates not only the training department's contribution to the company's bottom line, but also the organization's overall effectiveness—which can be motivational feedback for employers and employees alike. Results should be celebrated and rewarded.

  • Challenge: This level of evaluation is time-consuming and costly to perform. ROI due to training is difficult to substantiate, for many factors are at work. As we've stated, it may be more realistic to demonstrate plausible cases or correlations in data trends (see Tips below).
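When the numbers can be assembled, the bottom-line measure is typically the standard ROI formula: net program benefits divided by program costs. A minimal sketch with made-up figures:

```python
def training_roi(benefits, costs):
    """ROI (%) = (program benefits - program costs) / program costs * 100."""
    return (benefits - costs) / costs * 100

# Illustrative figures only; real Level 4 data would come from the
# sources listed above (training costs, financial reports, job performance).
costs = 50_000      # development, delivery, travel, learner time
benefits = 80_000   # cost savings plus added revenue attributed to the course
print(f"ROI = {training_roi(benefits, costs):.0f}%")  # ROI = 60%
```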

Tips for Level 4

start example
  • Cluster soft skills into a group. It is particularly difficult to evaluate the financial impact of soft skills courses. Tip: Instead of trying to measure the impact of, say, listening skills on your company's bottom line, group this class with related classes, such as selling techniques and proposal writing. Then measure their combined impact on the bottom line by treating them as a mini-curriculum called, for instance, "Effective Selling." The skill cluster functions more like a competency and therefore can be more readily correlated to the bottom line.

  • Speak of retention and satisfaction levels. If hard data is unavailable for determining the financial impact of training, reach for softer data. Look instead, for instance, at "employee retention levels" as measured by turnover, or at "employee satisfaction levels." Ask questions such as "Do you feel prepared to return and apply these skills on the job?" or "Was the course worthwhile?" These factors can be viewed as "key differentiators," and therefore as a competitive advantage for the firm.

  • Use opinion surveys to project correlated approximations of financial impact. If no Level 4 financial data is available, John Noonan has proposed an inventive solution: projecting Level 4 financial impact from Level 1 attitude surveys containing such questions as "Rate your productivity, following the training, on the tasks that utilize what was presented in the course." This is not as strange as it sounds at first. Although the method is too detailed to reproduce in its entirety here, it basically approximates financial payback by extrapolating from field surveys filled out by learners. Billed not as ROI but as plausible cases and ranges of magnitude, and offered with the appropriate disclaimers ("we're building a business case, not trying to publish in an academic journal"), Noonan's "directional indicators" can still help managers decide whether the training was effective. If the only alternative is to produce no Level 4 results at all, Noonan's suggestion is an interesting and creative one.

  • Tackle the easiest first. ROI is gathered from four sources; arrange these from easiest to most difficult and tackle the easiest first:

    • Cost Savings: This is the simplest ROI factor to link to training because it involves fewer variables than do revenue or earnings estimates. This factor is most often cited when making the case for Web courses, which will presumably eliminate the cost of trainers, classrooms, travel, and hotels.

    • Time Savings: Reduction in time-to-competency, time-to-market, cycle time, etc.

    • Increase in Revenue (Sales): A more difficult ROI factor to link to training because more variables are involved.

    • Increase in Earnings (Profit): The most complex ROI factor to link to training, for it involves the most variables.

end example