Research Findings: Current Practices in Gathering Learner Feedback

Respondents

Over 200 learning professionals responded to Work-Learning Research’s 2017-2018 survey on current practices in gathering learner feedback, and today I will reveal the results. The survey ran from November 29th, 2017 to September 16th, 2018. The sample of respondents was drawn from Work-Learning Research’s mailing list and through extensive calls for participation in a variety of social media. Because of this sampling methodology, the survey results are likely skewed toward professionals who care about and/or pay attention to research-based practice recommendations more than the workplace learning field as a whole does. They are also likely more interested and experienced in learning evaluation.

Feel free to share this link with others.

Goal of the Research

The goal of the research was to determine what people are doing in the way of evaluating their learning interventions through the practice of asking learners for their perspectives.

Questions the Research Hoped to Answer

  1. Are smile sheets (learner-feedback questions) still the most common method of doing learning evaluation?
  2. How does their use compare with other methods? Are other methods growing in prominence/use?
  3. How satisfied are learning professionals with their organizations’ learner-feedback methods?
  4. To what extent are organizations looking for alternatives to their current learner-feedback methods?
  5. What kinds of questions are used on smile sheets? Has Thalheimer’s new approach, performance-focused questioning, gained any traction?
  6. What do learning professionals think their current smile sheets are good at measuring (Satisfaction, Reputation, Effectiveness, Nothing)?
  7. What tools are organizations using to gather learner feedback?
  8. How useful are current learner-feedback questions in helping guide improvements in learning design and delivery?
  9. How widely are the target metrics of LTEM (The Learning-Transfer Evaluation Model) currently being measured?

A summary of the findings indexed to these questions can be found at the end of this post.

Situating the Practice of Gathering Learner Feedback

When we gather feedback from learners, we are using a Tier 3 methodology on the LTEM (Learning-Transfer Evaluation Model) or Level 1 on the Kirkpatrick-Katzell Four-Level Model of Training Evaluation.

Demographic Background of Respondents

Respondents came from a wide range of organizations, including small, midsize, and large organizations.

Respondents play a wide range of roles in the learning field.

Most respondents live in the United States and Canada, but there was some significant representation from many predominantly English-speaking countries.

Learner-Feedback Findings

About 67% of respondents report that learners are asked about their perceptions on more than half of their organization’s learning programs, including elearning. Only about 22% report that they survey learners on less than half of their learning programs. This finding is consistent with past findings—surveying learners is the most common form of learning evaluation and is widely practiced.

The two most common question types in use are Likert-like questions and numeric-scale questions. I have argued against their use*, and I am pleased that Performance-Focused Smile Sheet questions have been utilized by so many so quickly. Of course, this sample of respondents is composed of folks on my mailing list, so this result surely doesn’t represent current practice in the field as a whole. Not yet! LOL.

*Likert-like questions and numeric-scale questions are problematic for several reasons. First, because they offer fuzzy response choices, learners have a difficult time deciding between them and this likely makes their responses less precise. Second, such fuzziness may inflate bias as there are not concrete anchors to minimize biasing effects of the question stems. Third, Likert-like options and numeric scales likely deflate learner responding because learners are habituated to such scales and because they may be skeptical that data from such scales will actually be useful. Finally, Likert-like options and numeric scales produce indistinct results—averages all in the same range. Such results are difficult to assess, failing to support decision-making—the whole purpose for evaluation in the first place. To learn more, check out Performance-Focused Smile Sheets: A Radical Rethinking of a Dangerous Art Form (book website here).
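The "indistinct results" concern can be illustrated with a minimal Python sketch. The three courses and all of their responses are invented for illustration and are not data from this survey. The sketch shows how three quite different patterns of responses on a 5-point scale can collapse into the same average, which is one reason averages alone may be hard to act on.

```python
from collections import Counter
from statistics import mean

# Invented 5-point responses (1 = strongly disagree ... 5 = strongly agree).
# These are illustrative numbers only, NOT data from the survey.
courses = {
    "Course A": [4] * 20,                      # everyone picks "agree"
    "Course B": [5] * 10 + [3] * 10,           # split between 5s and 3s
    "Course C": [5] * 12 + [4] * 2 + [2] * 6,  # mostly 5s, with a cluster of 2s
}

def summarize(responses):
    """Return the average rating and the count of each response choice."""
    return mean(responses), dict(sorted(Counter(responses).items()))

summaries = {name: summarize(r) for name, r in courses.items()}
for name, (average, counts) in summaries.items():
    print(f"{name}: average = {average}, counts = {counts}")
```

All three hypothetical courses report the same average, yet Course C has six learners who answered 2. Reporting the counts per response choice, rather than only the average, is one simple way to keep that signal visible.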

The most common tools used to gather feedback from learners were paper surveys and SurveyMonkey. Questions delivered from within an LMS were the next highest. High-end evaluation systems like Metrics that Matter were not highly represented in our respondents.

Our respondents did not rate their learner-feedback efforts as very effective. Their learner surveys were seen as most effective in gauging learner satisfaction. Only about 33% of respondents thought their learner surveys gave them insights on the effectiveness of the learning.

Only about 15% of respondents found their data very useful in providing them feedback about how to improve their learning interventions.

Respondents report that their organizations are somewhat open to alternatives to their current learner-feedback approaches, but overall they are not actively looking for alternatives.

Most respondents report that their organizations are at least “modestly happy” with their learner-feedback assessments. Yet only 22% reported being “generally happy” with them. Combining this finding with the one above showing that lots of organizations are open to alternatives, it seems that organizational satisfaction with current learner-feedback approaches is soft.

We asked respondents about their organizations’ attempts to measure the following:

  • Learner Attendance
  • Whether Learner is Paying Attention
  • Learner Perceptions of the Learning (e.g., Smile Sheets, Learner Feedback)
  • Amount or Quality of Learner Participation
  • Learner Knowledge of the Content
  • Learner Ability to Make Realistic Decisions
  • Learner Ability to Complete Realistic Tasks
  • Learner Performance on the Job (or in another future performance situation)
  • Impact of Learning on the Learner
  • Impact of Learning on the Organization
  • Impact of Learning on Coworkers, Family, Friends of the Learner
  • Impact of Learning on the Community or Society
  • Impact of Learning on the Environment

These evaluation targets are encouraged in LTEM (The Learning-Transfer Evaluation Model).

Results are difficult to show—because our question was very complicated (admittedly too complicated)—but I will summarize the findings below.

As you can see, learner attendance and learner perceptions (smile sheets) were the most commonly measured factors, with learner knowledge a distant third. The least common measures involved the impact of the learning on the environment, community/society, and the learner’s coworkers/family/friends.

The flip side—methods rarely utilized in respondents’ organizations—shows pretty much the same thing.

Note that the question above, because it was too complicated, probably produced some spurious results, even if the trends at the extremes are probably indicative of the whole range. In other words, it’s likely that attendance and smile sheets are the most utilized and measures of impact on the environment, community/society, and learners’ coworkers/family/friends are the least utilized.

Questions Answered Based on Our Sample

  1. Are smile sheets (learner-feedback questions) still the most common method of doing learning evaluation?

    Yes! Smile sheets are clearly the most popular evaluation method, along with measuring attendance (if we include that as a metric).

  2. How does their use compare with other methods? Are other methods growing in prominence/use?

    Except for Attendance, nothing else comes close. The next most common method is measuring knowledge. Remarkably, given the known importance of decision-making (Tier 5 in LTEM) and task competence (Tier 6 in LTEM), these are used in evaluation at a relatively low level. Similar low levels are found in measuring work performance (Tier 7 in LTEM) and organizational results (part of Tier 8 in LTEM). We’ve known about these relatively low levels from many previous research surveys.

    Hardly any measurement is being done on the impact of learning on the learner or his/her coworkers/family/friends, the impact of the learning on the community/society/environment, or on learner participation/attention.

  3. How satisfied are learning professionals with their organizations’ learner-feedback methods?

    Learning professionals are moderately satisfied.

  4. To what extent are organizations looking for alternatives to their current learner-feedback methods?

    Organizations are open to alternatives, with some actively seeking alternatives and some not looking.

  5. What kinds of questions are used on smile sheets? Has Thalheimer’s new approach, performance-focused questioning, gained any traction?

    Likert-like options and numeric scales are the most commonly used. Thalheimer’s performance-focused smile-sheet method has gained traction in this sample of respondents—people likely more in the know about Thalheimer’s approach than the industry at large.

  6. What do learning professionals think their current smile sheets are good at measuring (Satisfaction, Reputation, Effectiveness, Nothing)?

    Learning professionals think their current smile sheets are fairly good at measuring the satisfaction of learners. A full one-third of respondents feel that their current approaches are not valid enough to provide them with meaningful insights about the learning interventions.

  7. What tools are organizations using to gather learner feedback?

    The two most common methods for collecting learner feedback are paper surveys and SurveyMonkey. Questions from LMSs are the next most widely used. Sophisticated evaluation tools are not much in use in our respondent sample.

  8. How useful are current learner-feedback questions in helping guide improvements in learning design and delivery?

    This may be the most important question we might ask, given that evaluation is supposed to aid us in maintaining our successes and improving on our deficiencies. Only 15% of respondents found learner feedback “very helpful” in helping them improve their learning. Many found the feedback “somewhat helpful” but a full one-third found the feedback “not very useful” in enabling them to improve learning.

  9. How widely are the target metrics of LTEM (The Learning-Transfer Evaluation Model) currently being measured?

    As described in Question 2 above, many of the targets of LTEM are not being adequately measured at this point in time (November 2017 to September 2018, during the time immediately before and after LTEM was introduced). This indicates that LTEM is poised to help organizations uncover evaluation targets that can be helpful in setting goals for learning improvements.

Lessons to be Drawn

The results of this survey reinforce what we’ve known for years. In the workplace learning industry, we default to learner-feedback questions (smile sheets) as our most common learning-evaluation method. This is a big freakin’ problem for two reasons. First, our learner-feedback methods are inadequate. We often use poor survey methodologies and ones particularly unsuited to learner feedback, including the use of fuzzy Likert-like options and numeric scales. Second, even if we used the most advanced learner-feedback methods, we still would not be doing enough to gain insights into the strengths and weaknesses of our learning interventions.

Evaluation is meant to provide us with data we can use to make our most critical decisions. We need to know, for example, whether our learning designs are supporting learner comprehension, learner motivation to apply what they’ve learned, learner ability to remember what they’ve learned, and the supports available to help learners transfer their learning to their work. We typically don’t know these things. As a result, we don’t make design decisions we ought to. We don’t make improvements in the learning methods we use or the way we deploy learning. The research captured here should be seen as a wake-up call.

The good news from this research is that learning professionals are often aware of and sensitized to the deficiencies of their learning-evaluation methods. This seems like a good omen. When improved methods are introduced, they will seek to encourage their use.

LTEM, the new learning-evaluation model (which I developed with the help of some of the smartest folks in the workplace learning field) is targeting some of the most critical learning metrics—metrics that have too often been ignored. It is too new to be certain of its impact, but it seems like a promising tool.

Why I Have Turned My Attention to Evaluation (and Why You Should Too!)

For 20 years, I’ve focused on compiling scientific research on learning in the belief that research-based information—when combined with a deep knowledge of practice—can drastically improve learning results. I still believe that wholeheartedly! What I’ve also come to understand is that we as learning professionals must get valid feedback on our everyday efforts. It’s simply our responsibility to do so.

We have to create learning interventions based on the best blend of practical wisdom and research-based guidance. We have to measure key indices that tell us how our learning interventions are doing. We have to find out what their strengths are and what their weaknesses are. Then we have to analyze and assess and make decisions about what to keep and what to improve. Then we have to make improvements and again measure our results and continue the cycle—working always toward continuous improvement.

Here’s a quick-and-dirty outline of the recommended cycle for using learning to improve work performance. “Quick-and-dirty” means I might be missing something!

  1. Learn about and/or work to uncover performance-improvement needs.
  2. If you determine that learning can help, continue. Otherwise, build or suggest alternative methods to get to improved work performance.
  3. Deeply understand the work-performance context.
  4. Sketch out a very rough draft for your learning intervention.
  5. Specify your evaluation goals—the metrics you will use to measure your intervention’s strengths and weaknesses.
  6. Sketch out a rough draft for your learning intervention.
  7. Specify your learning objectives (notice that evaluation goals come first!).
  8. Review the learning research and consider your practical constraints (two separate efforts subsequently brought together).
  9. Sketch out a reasonably good draft for your learning intervention.
  10. Build your learning intervention and your learning evaluation instruments (Iteratively testing and improving).
  11. Deploy your “ready-to-go” learning intervention.
  12. Measure your results using the previously determined evaluation instruments, which were based on your previously determined evaluation objectives.
  13. Analyze your results.
  14. Determine what to keep and what to improve.
  15. Make improvements.
  16. Repeat (maybe not every step, but at least from Step 6 onward).

And here is a shorter version:

  1. Know the learning research.
  2. Understand your project needs.
  3. Outline your evaluation objectives—the metrics you will use.
  4. Design your learning.
  5. Deploy your learning and your measurement.
  6. Analyze your results.
  7. Make improvements.
  8. Repeat.

More Later Maybe

The results shared here are from all respondents. If I get the time, I’d like to look at subsets of respondents. For example, I’d like to look at how learning executives and managers might differ from learning practitioners. Let me know how interested you would be in these results.

Also, I will be conducting other surveys on learning-evaluation practices, so stay tuned. We have been too long frustrated with our evaluation practices and more work needs to be done in understanding the forces that keep us from doing what we want to do. We could also use more and better learning-evaluation tools because the truth is that learning evaluation is still a nascent field.

Finally, because I learn a ton by working with clients who challenge themselves to do more effective interventions, please get in touch with me if you’d like a partner in thinking things through and trying new methods to build more effective evaluation practices. Also, please let me know how you’ve used LTEM (The Learning-Transfer Evaluation Model).

Appreciations

As always, I am grateful to all the people I learn from, including clients, researchers, thought leaders, conference attendees, and more… Thanks also to all who acknowledge and share my work! It means a lot!

Brinkerhoff Case Method — A Better Name for a Great Learning-Evaluation Innovation

Updated July 3rd, 2018—a week after the original post. See end of post for the update, featuring Rob Brinkerhoff’s response.

Rob Brinkerhoff’s “Success Case Method” needs a subtle name change. I think a more accurate name would be the “Brinkerhoff Case Method.”

I’m one of Rob’s biggest fans, having selected him in 2008 as the Neon Elephant Award Winner for his evaluation work.

Thirty-five years ago, in 1983, Rob published an article in which he introduced the “Success Case Method.” Here is a picture of the first page of that article:

In that article, the Success-Case Method was introduced as a way to find the value of training when it works. Rob wrote, “The success-case method does not purport to produce a balanced assessment of the total results of training. It does, however, attempt to answer the question: When training works, how well does it work?” (page 58, which is visible above).

The Success-Case Method didn’t stand still. It evolved and improved as Rob refined it based on his research and his work with clients. In his landmark 2006 book detailing the methodology, Telling Training’s Story: Evaluation Made Simple, Credible, and Effective, Rob describes how to first survey learners and then sample some of them for interviews, selecting them based on their level of success in applying the training: “Once the sorting is complete, the next step is to select the interviewees from among the high and low success candidates, and perhaps from the middle categories.” (page 102).
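To make the balance of that sampling step concrete, here is a minimal Python sketch of the idea in the quotation: rank survey respondents by reported success, then draw interviewees from the high and low ends and, optionally, from the middle. The function name, the parameters, the respondent names, and the scores are all hypothetical, and the real method relies on judgment in sorting and selecting people, so this is only a rough illustration and not Rob's actual procedure.

```python
def select_interviewees(scores, per_group=2, include_middle=False):
    """Pick interview candidates from the extremes of a success ranking.

    `scores` maps a respondent's name to a (hypothetical) self-reported
    success score from the initial survey; higher means more success.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # highest first
    picks = {
        "high": ranked[:per_group],   # most successful appliers
        "low": ranked[-per_group:],   # least successful appliers
    }
    if include_middle:
        start = max((len(ranked) - per_group) // 2, 0)
        picks["middle"] = ranked[start:start + per_group]
    return picks

# Invented survey results, for illustration only.
survey = {"Ana": 9, "Ben": 2, "Cy": 7, "Dee": 5, "Eli": 1, "Fay": 8, "Gus": 4}
picks = select_interviewees(survey, per_group=2, include_middle=True)
print(picks)
```

Note that the low-success group is selected just as deliberately as the high-success group. With very small respondent pools the groups in this sketch can overlap, so a real selection would need more care.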

To call this the success-case method seems more aligned with the original naming than the actual recommended practice. For that reason, I recommend that we simply call it the Brinkerhoff Case Method. This gives Rob the credit he deserves, and it more accurately reflects the rigor and balance of the method itself.

As soon as I posted the original post, I reached out to Rob Brinkerhoff to let him know. After some reflection, Rob wrote this and asked me to post it:

“Thank you for raising the issue of the currency of the name Success Case Method (SCM). It is kind of you to also think about identifying it more closely with my name. Your thoughts are not unlike others and on occasion even myself. 

It is true the SCM collects data from extreme portions of the respondent distribution including likely successes, non-successes, and ‘middling’ users of training. Digging into these different groups yields rich and useful information. 

Interestingly the original name I gave to the method some 40 years ago when I first started forging it was the “Pioneer” method since when we studied the impact of a new technology or innovation we felt we learned the most from the early adopters – those out ahead of the pack that tried out new things and blazed a trail for others to follow. I refined that name to a more familiar term but the concept and goal remained identical: accelerate the pace of change and learning by studying and documenting the work of those who are using it the most and the best. Their experience is where the gold is buried. 

Given that, I choose to stick with the “success” name. It expresses our overall intent: to nurture and learn from and drive more success. In a nutshell, this name expresses best not how we do it, but why we do it. 

Thanks again for your thoughtful reflections. We’re on the same page.”

Rob’s response is thoughtful, as usual. Yet my feelings on this remain steady. As I’ve written in my report on the new Learning-Transfer Evaluation Model (LTEM), our models should nudge appropriate actions. The same is true for the names we give things. Mining for success stories is good, but it has to be balanced. After all, if evaluation doesn’t look for the full truth, without putting a thumb on the scale, then we are not evaluating; we are doing something else.

I know Rob’s work. I know that he is not advocating for, nor does he engage in, unbalanced evaluations. I do fear that the name Success Case Method may give permission or unconsciously nudge lesser practitioners to find more success and less failure than is warranted by the facts.

Of course, the term “Success Case Method” has one brilliant advantage. Where people are hesitant to evaluate for fear of uncovering unpleasant results, the name “Success Case Method” may lessen the worry of moving forward and engaging in evaluation—and so it may actually enable the balanced evaluation that is necessary to uncover the truth of learning’s level of success.

Whatever we call it, the Success Case Method or the Brinkerhoff Case Method—and this is the most important point—it is one of the best learning-evaluation innovations in the past half century.

I also agree that since Rob is the creator, his voice should have the most influence in terms of what to call his invention.

I will end with one of my all-time favorite quotations from the workplace learning field, from Tim Mooney and Robert Brinkerhoff’s excellent book, Courageous Training:

“The goal of training evaluation is not to prove the value of training; the goal of evaluation is to improve the value of training.” (pp. 94-95)

On this we should all agree!

The Learning-Transfer Evaluation Model (LTEM) Updated to Version 12

The Learning-Transfer Evaluation Model (LTEM) and accompanying Report were updated today with two major changes:

  • The model has been inverted to put the better evaluation methods at the top instead of at the bottom.
  • The model now uses the word “Tier” to refer to the different levels within the model—to distinguish these from the levels of the Kirkpatrick-Katzell model.

This will be the last update to LTEM for the foreseeable future.

You can find the latest version of LTEM and the accompanying report by clicking here.

The Learning-Transfer Evaluation Model (LTEM)

NOTICE OF UPDATE (17 May 2018):

The LTEM Model and accompanying Report were updated today and can be found below.

This blog post introduces a new learning-evaluation model, the Learning-Transfer Evaluation Model (LTEM).

 

Why We Need a New Evaluation Model

It is well past time for a new learning-evaluation model for the workplace learning field. The Kirkpatrick-Katzell Model is over 60 years old. It was born in a time before computers, before cognitive psychology revolutionized the learning field, before the training field was transformed from one that focused on the classroom learning experience to one focused on work performance.

The Kirkpatrick-Katzell model—created by Raymond Katzell and popularized by Donald Kirkpatrick—is the dominant standard in our field. It has also done a tremendous amount of harm, pushing us to rely on inadequate evaluation practices and poor learning designs.

I am not the only critic of the Kirkpatrick-Katzell model. There are legions of us. If you do a Google search starting with these letters, “Criticisms of the Ki,” Google anticipates the following: “Criticisms of the Kirkpatrick Model” as one of the most popular searches.

Here’s what a seminal research review said about the Kirkpatrick-Katzell model (before the model’s name change):

The Kirkpatrick framework has a number of theoretical and practical shortcomings. [It] is antithetical to nearly 40 years of research on human learning, leads to a checklist approach to evaluation (e.g., ‘we are measuring Levels 1 and 2, so we need to measure Level 3’), and, by ignoring the actual purpose for evaluation, risks providing no information of value to stakeholders…

The New Model

For the past year or so I’ve been working to develop a new learning-evaluation model. The current version is the eleventh iteration, improved after reflection, after asking some of the smartest people in our industry to provide feedback, after sharing earlier versions with conference attendees at the 2017 ISPI innovation and design-thinking conference and the 2018 Learning Technologies conference in London.

Special thanks to the following people who provided significant feedback that improved the model and/or the accompanying article:

Julie Dirksen, Clark Quinn, Roy Pollock, Adam Neaman, Yvon Dalat, Emma Weber, Scott Weersing, Mark Jenkins, Ingrid Guerra-Lopez, Rob Brinkerhoff, Trudy Mandeville, Mike Rustici

The model, which I’ve named the Learning-Transfer Evaluation Model (LTEM, pronounced L-tem), is a one-page, eight-level model, augmented with color coding and descriptive explanations. In addition to the model itself, I’ve prepared a 34-page report to describe the need for the model, the rationale for its design, and recommendations on how to use it.

You can access the model and the report by clicking on the following links:

 

 

Release Notes

The LTEM model and report were researched, conceived, and written by Dr. Will Thalheimer of Work-Learning Research, Inc., with significant and indispensable input from others. No one sponsored or funded this work. It was a labor of love and is provided as a valentine for the workplace learning field on February 14th, 2018 (Version 11). Version 12 was released on May 17th, 2018 based on feedback from its use. The model and report are copyrighted by Will Thalheimer, but you are free to share them as is, as long as you don’t sell them.

If you would like to contact me (Will Thalheimer), you can do that at this link: https://www.worklearning.com/contact/

If you would like to sign up for my list, you can do that here: https://www.worklearning.com/sign-up/

 

 

What Do Senior Business Leaders Want in Terms of Learning Evaluation?

Let’s find out by asking them!

And, let’s ask ourselves (workplace learning professionals) what we think senior leaders will tell us.

NOTE: This may take some effort on our part. Please complete the survey yourself and ask senior leaders at your organization (if your organization is 1000 people or more) to complete the survey.

 

The Survey Below is for both Senior Organizational Leaders AND for Workplace Learning Professionals.

We will branch you to a separate set of questions!

Answer the survey questions below, or if you need it, here is a link to the survey.

 



Send me an email if you want to talk more about learning evaluation...

Donald Kirkpatrick was NOT the Originator of the Four-Level Model of Learning Evaluation

Donald Kirkpatrick (1924-2014) was a giant in the workplace learning and development field, widely known for creating the four-level model of learning evaluation. Evidence, however, contradicts this creation myth and points to Raymond Katzell, a distinguished industrial-organizational psychologist, as the true originator. This, of course, does not diminish Don Kirkpatrick’s contribution to framing and popularizing the four-level framework of learning evaluation.

The Four-Levels Creation Myth

The four-level model is traditionally traced back to a series of four articles Donald Kirkpatrick wrote in 1959 and 1960, each article covering one of the four levels, Reaction, Learning, Behavior, Results. These articles were published in the magazine of ASTD (then called the American Society of Training Directors). Here’s a picture of the first page of the first article:

In June of 1977, ASTD (known by then as the American Society for Training and Development, now ATD, the Association for Talent Development) reissued Kirkpatrick’s original four articles, combining them into one article in the Training and Development Journal. The story has always been that it was those four articles that introduced the world to the four-level model of training evaluation.

In 1994, in the first edition of his book, Evaluating Training Programs: The Four Levels, Donald Kirkpatrick wrote:

“In 1959, I wrote a series of four articles called ‘Techniques for Evaluating Training Programs,’ published in Training and Development, the journal of the American Society for Training and Development (ASTD). The articles described the four levels of evaluation that I had formulated. I am not sure where I got the idea for this model, but the concept originated with work on my Ph.D. dissertation at the University of Wisconsin, Madison.” (p. xiii). [Will’s Note: Kirkpatrick was slightly inaccurate here. At the time of his four articles, the initials ASTD stood for the American Society of Training Directors and the four articles were published in the Journal of the American Society of Training Directors. This doesn’t diminish Kirkpatrick’s central point: that he was the person who formulated the four levels of learning evaluation].

In a 2011 video tribute, Dr. Kirkpatrick was asked how he came up with the four levels. This is what he said:

“[after I finished my dissertation in 1954], between 54 and 59 I did some research on behavior and results. I went into companies. I found out are you using what you learned and if so what can you show any evidence of productivity or quality or more sales or anything from it. So I did some research and then in 1959 Bob Craig, editor of the ASTD journal, called me and said, ‘Don, I understand you’ve done some research on evaluation would you write an article?’ I said, ‘Bob, I’ll tell you what I’ll do, I’ll write four articles, one on reaction, one on learning, one on behavior, and one on results.’”

In 2014, when asked to reminisce on his legacy, Dr. Kirkpatrick said this:

“When I developed the four levels in the 1950s, I had no idea that they would turn into my legacy. I simply needed a way to determine if the programs I had developed for managers and supervisors were successful in helping them perform better on the job. No models available at that time quite fit the bill, so I created something that I thought was useful, implemented it, and wrote my dissertation about it.” (Quote from blog post published January 22, 2014).

As recently as this month (January 2018), on the Kirkpatrick Partners website, the following is written:

“Don was the creator of the Kirkpatrick Model, the most recognized and widely used training evaluation model in the world. The four levels were developed in the writing of his Ph.D. dissertation, Evaluating a Human Relations Training Program for Supervisors.”

Despite these public pronouncements, Kirkpatrick’s legendary 1959-1960 articles were not the first published evidence of a four-level evaluation approach.

Raymond Katzell’s Four-Step Framework of Evaluation

In an article written by Donald Kirkpatrick in 1956, the following “steps” were laid out and were attributed to “Raymond Katzell, a well known authority in the field [of training evaluation].”

  1. To determine how the trainees feel about the program.
  2. To determine how much the trainees learn in the form of increased knowledge and understanding.
  3. To measure the changes in the on-the-job behavior of the trainees.
  4. To determine the effects of these behavioral changes on objective criteria such as production, turnover, absenteeism, and waste.

These four steps are the same as Kirkpatrick’s four levels, except there are no labels.

Raymond Katzell went on to a long and distinguished career as an industrial-organizational psychologist, even winning the Society for Industrial and Organizational Psychology’s Distinguished Scientific Contributions Award.

Raymond Katzell. Picture used by SIOP (Society for Industrial and Organizational Psychology) when they talk about The Raymond A. Katzell Media Award in I-O Psychology.

The first page of Kirkpatrick’s 1956 article—written three years before his famous 1959 introduction to the four levels—is pictured below:

And here is a higher-resolution view of the quote from that front page, regarding Katzell’s contribution:

So Donald Kirkpatrick mentions Katzell’s four-step model in 1956, but not in 1959 when he—Kirkpatrick—introduces the four labels in his classic set of four articles.

It Appears that Kirkpatrick Never Mentions Katzell’s Four Steps Again

As far as I can tell, after searching for and examining many publications, Donald Kirkpatrick never mentioned Katzell’s four steps after his 1956 article.

Three years after the 1956 article, Kirkpatrick did not mention Katzell’s taxonomy when he wrote his four famous articles in 1959. He did mention an unrelated article where Katzell was a co-author (Merrihue & Katzell, 1955), but he did not mention Katzell’s four steps.

Neither did Kirkpatrick mention Katzell in his 1994 book, Evaluating Training Programs: The Four Levels.

Nor did Kirkpatrick mention Katzell in the third edition of the book, written with Jim Kirkpatrick, his son.

Nor was Katzell mentioned in a later version of the book written by Jim and Wendy Kirkpatrick in 2016. I spoke with Jim and Wendy recently (January 2018), and they seemed as surprised as I was about the 1956 article and about Raymond Katzell.

Nor did Donald Kirkpatrick mention Katzell in any of the interviews he did to mark the many anniversaries of his original 1959-1960 articles.

To summarize, Katzell, despite coming up with the four-step taxonomy of learning evaluation, was only given credit by Kirkpatrick once, in the 1956 article, three years prior to the articles that introduced the world to the Kirkpatrick Model’s four labels.

Kirkpatrick’s Dissertation

Kirkpatrick did not introduce the four levels in his 1954 dissertation. There is not even a hint of a four-level framework.

In his dissertation, Kirkpatrick cited two publications by Katzell. The first was an article from 1948, “Testing a Training Program in Human Relations.” That article studies the effect of leadership training, but makes no mention of Katzell’s four steps. It does, however, hint at the value of measuring on-the-job performance, in this case the value of leadership behaviors. Katzell writes, “Ideally, a training program of this sort [a leadership training program] should be evaluated in terms of the on-the-job behavior of those with whom the trainees come in contact.”

The second Katzell article cited by Kirkpatrick in his dissertation was an article entitled, “Can We Evaluate Training?” from 1952. Unfortunately, it was a mimeographed article published by the Industrial Management Institute at the University of Wisconsin, and seems to be lost to history. Even after several weeks of effort (in late 2017), the University of Wisconsin Archives could not locate the article. Interestingly, in a 1955 publication entitled, “Monthly Checklist of State Publications,” a subtitle was added to Katzell’s “Can We Evaluate Training?” The subtitle was: “A summary of a one day Conference for Training Managers” from April 23, 1952.

To be clear, Kirkpatrick did not mention the four levels in his 1954 dissertation. The four levels notion came later.

How I Learned about Katzell’s Contribution

I’ve spent the last several years studying learning evaluation, and as part of these efforts, I decided to find Kirkpatrick’s original four articles and reread them. ATD (The Association for Talent Development) in 2017 had a wonderful archive of the articles it had published over the years. As I searched for “Kirkpatrick,” several other articles, besides the famous four, came up, including the 1956 article. I was absolutely freaking stunned when I read it. Donald Kirkpatrick had cited Katzell as the originator of the four-level notion!!!

I immediately began searching for more information on the Kirkpatrick-Katzell connection and found that I wasn’t the first person to uncover the connection. I found an article by Stephen Smith, who acknowledged Katzell’s contribution in 2008, also in an ASTD publication. I communicated with Smith recently (December 2017) and he had nothing but kind words to say about Donald Kirkpatrick, who he said coached him on training evaluations. Here is a graphic taken directly from Smith’s 2008 article:

Smith’s article was not focused on Katzell’s contribution to the four levels, which is probably why it wasn’t more widely cited. In 2011, Cynthia Lewis wrote a dissertation and directly compared the Katzell and Kirkpatrick formulations. She appears to have learned about Katzell’s contribution from Smith’s 2008 article. Lewis’s (2011) comparison chart is reproduced below:

In 2004, four years before Smith wrote his article with the Katzell sidebar, ASTD republished Kirkpatrick’s 1956 article—the one in which Kirkpatrick acknowledges Katzell’s four steps. Here is the front page of that article:

In 2016, an academic article appeared in a book that referred to the Katzell-Kirkpatrick connection. The book is only available in French and the article appears to have had little impact in the English-speaking learning field. Whereas neither Kirkpatrick’s 2004 reprint nor Smith’s 2008 article offered commentary about Katzell’s contribution except to acknowledge it, Bouteiller, Cossette, & Bleau (2016) were clear in stating that Katzell deserves to be known as the person who conceptualized the four levels of training evaluation, while Kirkpatrick should get credit for popularizing it. The authors also lamented that Kirkpatrick, who himself recognized Katzell as the father of the four-level model of evaluation in his 1956 article, completely ignored Katzell for the next 55 years and declared himself in all his books and on his website as the sole inventor of the model. I accessed their chapter through Google Scholar and used Google Translate to make sense of it. I also followed up with two of the authors (Bouteiller and Cossette in January 2018) to confirm I was understanding their messaging clearly.

Is There Evidence of a Transgression?

Raymond Katzell seems to be the true originator of the four-level framework of learning evaluation and yet Donald Kirkpatrick on multiple occasions claimed to be the creator of the four-level model.

Of course, we can never know the full story. Kirkpatrick and Katzell are dead. Perhaps Katzell willingly gave his work away. Perhaps Kirkpatrick asked Katzell if he could use it. Perhaps Kirkpatrick cited Katzell because he wanted to bolster the credibility of a framework he developed himself. Perhaps Kirkpatrick simply forgot Katzell’s four steps when he went on to write his now-legendary 1959-1960 articles. This last explanation may seem a bit forced given that Kirkpatrick referred to the Merrihue and Katzell work in the last of his four articles, and we might expect that the name “Katzell” would trigger memories of Katzell’s four steps, especially given that Katzell was cited by Kirkpatrick as a “well known authority.” This forgetting hypothesis also doesn’t explain why Kirkpatrick would continue to fail to acknowledge Katzell’s contribution after ASTD republished Kirkpatrick’s 1956 article in 2004 or after Smith’s 2008 article showed Katzell’s four steps. Smith was well known to Kirkpatrick and is likely to have at least mentioned his article to Kirkpatrick.

We can’t know for certain what transpired, but we can analyze the possibilities. Plagiarism means that we take another person’s work and claim it as our own. Plagiarism, then, has two essential features (see this article for details). First, an idea or creation is copied in some way. Second, no attribution is offered. That is, no credit is given to the originator. Kirkpatrick had clear contact with the essential features of Katzell’s four-level framework. He wrote about them in 1956! This doesn’t guarantee that he copied them intentionally. He could have generated the four levels subconsciously, without knowing that Katzell’s ideas were influencing his thinking. Alternatively, he could have spontaneously created them without any influence from Katzell’s ideas. People often generate similar ideas when the stimuli they encounter are similar. How many people claim that they invented the term, “email?” Plagiarism does not require intent, but intentional plagiarism is generally considered a higher-level transgression than sloppy scholarship.

A personal example of how easy it is to think you invented something: In the 1990s or early 2000s, I searched for just the right words to explain a concept. I wrestled with it for several weeks. Finally, I came up with the perfect wording, with just the right connotation: “Retrieval Practice.” It was better than the prevailing terminology at the time, the testing effect, because people could retrieve without being tested. Eureka, I thought! Brilliant, I thought! It was several years later, rereading Robert Bjork’s 1988 article, “Retrieval practice and the maintenance of knowledge,” that I realized that my label was not original to me, and that even if I did generate it without consciously thinking of Bjork’s work, my previous contact with the term “retrieval practice” almost certainly influenced my creative construction.

The second requirement for plagiarism is that the original creator is not given credit. This is evident in the case of the four levels of learning evaluation. Donald Kirkpatrick never mentioned Katzell after 1956. He certainly never mentioned Katzell when it would have been most appropriate, for example when he first wrote about the four levels in 1959, when he first published a book on the four levels in 1994, and when he received awards for the four levels.

Finally, one comment may be telling, Kirkpatrick’s statement from his 1994 book: “I am not sure where I got the idea for this model, but the concept originated with work on my Ph.D. dissertation at the University of Wisconsin, Madison.” The statement seems to suggest that Kirkpatrick recognized that there was a source for the four-level model—a source that was not Kirkpatrick himself.

Here is the critical timeline:

  • Katzell was doing work on learning evaluation as early as 1948.
  • Kirkpatrick’s 1954 dissertation offers no trace of a four-part learning-evaluation framework.
  • In 1956, the first reference to a four-part learning evaluation framework was offered by Kirkpatrick and attributed to Raymond Katzell.
  • In 1959, the first mention of the Kirkpatrick terminology (i.e., Reaction, Learning, Behavior, Results) was published, but Katzell was not credited.
  • In 1994, Kirkpatrick published his book on the four levels, saying specifically that he formulated the four levels. He did not mention Katzell’s contribution.
  • In 2004, Kirkpatrick’s 1956 article was republished, repeating Kirkpatrick’s acknowledgement that Katzell invented the four-part framework of learning evaluation.
  • In 2008, Smith published the article where he cited Katzell’s contribution.
  • In 2014, Kirkpatrick claimed to have developed the four levels in the 1950s.
  • As far as I’ve been able to tell—corroborated by Bouteiller, Cossette, & Bleau (2016)—Donald Kirkpatrick never mentioned Katzell’s four-step formulation after 1956.

Judge Not Too Quickly

I have struggled writing this article, and have rewritten it dozens of times. I shared an earlier version with four trusted colleagues in the learning field and asked them if I was being fair. I’ve searched exhaustively for source documents. I reached out to key players to see if I was missing something.

It is not a trifle to curate evidence that impacts other people’s reputations. It is a sacred responsibility. I as the writer have the most responsibility, but you as a reader have a responsibility too to weigh the evidence and make your own judgments.

First, we should not be too quick to judge. We simply don’t know why Donald Kirkpatrick never mentioned Katzell after the original 1956 article. Indeed, perhaps he did mention Katzell in his workshops and teachings. We just don’t know.

Here are some distinct possibilities:

  • Perhaps Katzell and Kirkpatrick had an agreement that Kirkpatrick could write about the four levels. Let’s remember the 1959-1960 articles were not written to boost Kirkpatrick’s business interests. He didn’t have any business interests at that time—he was an employee—and his writing seemed aimed specifically at helping others do better evaluation.
  • Perhaps Kirkpatrick, being a young man without much of a résumé in 1956, had developed a four-level framework but felt he needed to cite Katzell in 1956 to add credibility to his own ideas. Perhaps later in 1959 he dropped this false attribution to give himself the credit he deserved.
  • Perhaps Kirkpatrick felt that citing Katzell once was enough. Where many academics and researchers see plagiarism as one of the deadly sins, others have not been acculturated into the strongest form of this ethos. Let’s remember that in 1959 Kirkpatrick was not intending to create a legendary meme; he was just writing some articles. Perhaps at the time it didn’t seem important to acknowledge Katzell’s contribution. I don’t mean to dismiss this lightly. All of us are raised to believe in fairness and giving credit where credit is due. Indeed, research suggests that even the youngest infants have a sense of fairness. Kirkpatrick earned his doctorate at a prestigious research university. He should have been aware of the ethic of attribution, but perhaps because the 1959-1960 articles seemed so insignificant at the time, it didn’t seem important to cite Katzell.
  • Perhaps Kirkpatrick intended to cite Katzell’s contribution in his 1959-1960 articles but the journal editor talked him out of it or disallowed it.
  • Perhaps Kirkpatrick realized that Katzell’s four steps were simply not resonant enough to be important. Let’s admit that Kirkpatrick’s framing of the four levels into the four labels was a brilliant marketing masterstroke. If Kirkpatrick believed this, he might have seen Katzell’s contribution as minimal and not deserving of acknowledgement.
  • Perhaps Kirkpatrick completely forgot Katzell’s four-step taxonomy. This would mean that it didn’t influence him when he created his four labels, that he didn’t think of Katzell’s contribution when he wrote about Katzell’s article with Merrihue, that for the rest of his life he never remembered Katzell’s formulation, that he never saw the 2004 reprinting of his 1956 article, that he never saw Smith’s 2008 article, and that he never talked with Smith about Katzell’s work even though Smith has claimed a working relationship. Admittedly, this last possibility seems unlikely.

Let us also not judge Jim and Wendy Kirkpatrick, proprietors of Kirkpatrick Partners, a global provider of learning-evaluation workshops and consulting. None of this is on them! They were genuinely surprised to hear the news when I told them. They seemed to have no idea about Katzell or his contribution. What is past is past, and Jim and Wendy bear no responsibility for the history recounted here. What they do henceforth is their responsibility. Already, since we spoke last week, they have updated their website to acknowledge Katzell’s contribution!

Article Update (two days after original publication of this article): Yesterday, on the 31st of January 2018, Jim and Wendy Kirkpatrick posted a blog entry (copied here for the historic record) that admitted Katzell’s contribution but ignored Donald Kirkpatrick’s failure to acknowledge Katzell’s contribution as the originator of the four-level concept.

What about our trade associations and their responsibilities? It seems that ASTD bears responsibility for its actions over the years. As the American Society of Training Directors, it published the 1959-1960 articles without insisting that Katzell be acknowledged, even though it had itself published the 1956 article in which Katzell’s four-step framework appeared on the first page. As the American Society for Training and Development, it republished the 1959-1960 articles in 1977 and Kirkpatrick’s 1956 article in 2004. Recently rebranded as ATD (Association for Talent Development), the organization should now make amends. Other trade associations should also help set the record straight by acknowledging Katzell’s contribution to the four-level model of learning evaluation.

Donald Kirkpatrick’s Enduring Contribution

Regardless of who invented the four-level model of evaluation, it was Donald Kirkpatrick who framed it to perfection with the four labels and popularized it, helping it spread worldwide throughout the workplace learning and performance field.

As I have communicated elsewhere, I think the four-level model has issues—that it sends messages about learning evaluation that are not helpful.

On the other hand, the four-level model has been instrumental in pushing the field toward a focus on performance improvement. This shift—away from training as our sole responsibility, toward a focus on how to improve on-the-job performance—is one of the most important paradigm shifts in the long history of workplace learning. Kirkpatrick’s popularization of the four levels enabled us—indeed, it pushed us—to see the importance of focusing on work outcomes. For this, we owe Donald Kirkpatrick a debt of gratitude.

And we owe Raymond Katzell our gratitude as well. Not only did he originate the four levels, but he also put forth the idea that it was valuable to measure the impact learners have on their organizations.

What Should We Do Now?

What now is our responsibility as workplace learning professionals? What is ethical? The preponderance of the evidence points to Katzell as the originator of the four levels and Donald Kirkpatrick as the creator of the four labels (Reaction, Learning, Behavior, Results) and the person responsible for the popularization of the four levels. Kirkpatrick himself in 1956 acknowledged Katzell’s contribution, so it seems appropriate that we acknowledge it too.

Should we call them Katzell’s Four Levels of Evaluation? Or, the Katzell-Kirkpatrick Four Levels? I can’t answer this question for you, but it seems that we should acknowledge that Katzell was the first to consider a four-part taxonomy for learning evaluation.

For me, for the foreseeable future, I will either call it the Kirkpatrick Model and then explain that Raymond Katzell was the originator of the four levels, or I’ll simply call it the Kirkpatrick-Katzell Model.

Indeed, in fairness to both men (Kirkpatrick for the powerful framing of his four labels and his exhaustive efforts to popularize the model, and Katzell for the original formulation), I recommend that we call it the Kirkpatrick-Katzell Four-Level Model of Training Evaluation. Or simply, the Kirkpatrick-Katzell Model.

Research Cited

Bjork, R. A. (1988). Retrieval practice and the maintenance of knowledge. In M. M. Gruneberg, P. E. Morris, R. N. Sykes (Eds.), Practical Aspects of Memory: Current Research and Issues, Vol. 1., Memory in Everyday Life (pp. 396-401). NY: Wiley.

Bouteiller, D., Cossette, M., & Bleau, M-P. (2016). Modèle d’évaluation de la formation de Kirkpatrick: retour sur les origines et mise en perspective. Dans M. Lauzier et D. Denis (éds.), Accroître le transfert des apprentissages: Vers de nouvelles connaissances, pratiques et expériences. Presses de l’Université du Québec, Chapitre 10, 297-339. [In English: Bouteiller, D., Cossette, M., & Bleau, M-P. (2016). Kirkpatrick training evaluation model: back to the origins and put into perspective. In M. Lauzier and D. Denis (eds.), Increasing the Transfer of Learning: Towards New Knowledge, Practices and Experiences. Presses de l’Université du Québec, Chapter 10, 297-339.]

Katzell, R. A. (1948). Testing a training program in human relations. Personnel Psychology, 1, 319-329.

Katzell, R. A. (1952). Can we evaluate training? A summary of a one day conference for training managers. A publication of the Industrial Management Institute, University of Wisconsin, April, 1952.

Kirkpatrick, D. L. (1956). How to start an objective evaluation of your training program. Journal of the American Society of Training Directors, 10, 18-22.

Kirkpatrick, D. L. (1959a). Techniques for evaluating training programs. Journal of the American Society of Training Directors, 13(11), 3-9.

Kirkpatrick, D. L. (1959b). Techniques for evaluating training programs: Part 2—Learning. Journal of the American Society of Training Directors, 13(12), 21-26.

Kirkpatrick, D. L. (1960a). Techniques for evaluating training programs: Part 3—Behavior. Journal of the American Society of Training Directors, 14(1), 13-18.

Kirkpatrick, D. L. (1960b). Techniques for evaluating training programs: Part 4—Results. Journal of the American Society of Training Directors, 14(2), 28-32.

Kirkpatrick, D. L. (1956-2004). A T+D classic: How to start an objective evaluation of your training program. T+D, 58(5), 1-3.

Lewis, C. J. (2011). A study of the impact of the workplace learning function on organizational excellence by examining the workplace learning practices of six Malcolm Baldridge Quality Award recipients. San Diego: CA. Available at http://sdsu-dspace.calstate.edu/bitstream/handle/10211.10/1424/Lewis_Cynthia.pdf.

Merrihue, W. V., & Katzell, R. A. (1955). ERI: Yardstick of employee relations. Harvard Business Review, 33, 91-99.

Salas, E., Tannenbaum, S. I., Kraiger, K., & Smith-Jentsch, K. A. (2012). The science of training and development in organizations: What matters in practice. Psychological Science in the Public Interest, 13(2), 74–101.

Smith, S. (2008). Why follow levels when you can build bridges? T+D, September 2008, 58-62.


Updated Smile-Sheet Questions for 2018

Since the publication of my book, Performance-Focused Smile Sheets: A Radical Rethinking of a Dangerous Art Form, I’ve been working with clients to help them craft questions they can use to get the learner feedback they want. I’ve learned a ton through this process. The most important things I’ve learned are:

  • The process of writing questions benefits from thoughtful iterations utilizing multiple stakeholders.
  • We, as question writers, must maintain humility and be aggressive in working to create continuous improvement.

To honor these bits of wisdom, let me share some improvements to the questions I’ve been recommending.

The questions here represent a culmination of a long line of improvements that rely on hundreds of helpful comments from learning-and-development professionals and numerous data points from real learners and real workplace learning.

Gauging Learning to Performance

Let me start with the great-granddaughter of a question I once called, “The World’s Best Smile-Sheet Question.” The original seemed great at the time, but I have learned that it was too wordy and missed some critical elements. This is a much stronger version:

HOW ABLE ARE YOU to put what you’ve learned into practice in your work? CHOOSE THE ONE OPTION that best describes your current readiness.

  • My CURRENT ROLE DOES NOT ENABLE me to use what I learned.
  • I AM STILL UNCLEAR about what to do, and/or why to do it.
  • I NEED MORE GUIDANCE before I know how to use what I learned.
  • I NEED MORE EXPERIENCE to be good at using what I learned.
  • I CAN BE SUCCESSFUL NOW in using what I learned (even without more guidance or experience).
  • I CAN PERFORM NOW AT AN EXPERT LEVEL in using what I learned.

This question gauges the learners’ perspectives on how well they will be able to use what they learned in their work.

The following are the recommended standards for each answer choice above:

  • Unacceptable (Learning Not Relevant)
  • Unacceptable (Learning Did Not Work)
  • Unacceptable (Learning Still Needed)
  • Acceptable (Enabled for Action)
  • Superior (Enabled for Performance)
  • Unlikely/Overconfident (Maybe Not Attending to Question)

Standards should be negotiated with your stakeholders, so you can use the recommended standards as a starting point for discussions.
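If you want to see how these standards might be applied to a batch of responses, here is a minimal Python sketch. It is only an illustration, not part of the question itself: the short labels used as dictionary keys, the `summarize` function, and the sample responses are all hypothetical, and the mapping simply mirrors the recommended standards listed above (which you may renegotiate with your stakeholders).

```python
from collections import Counter

# Hypothetical shorthand labels for the six answer choices, mapped to the
# recommended standards above. Adjust the mapping to whatever standards
# you negotiate with your stakeholders.
STANDARDS = {
    "CURRENT ROLE DOES NOT ENABLE": "Unacceptable",
    "STILL UNCLEAR": "Unacceptable",
    "NEED MORE GUIDANCE": "Unacceptable",
    "NEED MORE EXPERIENCE": "Acceptable",
    "CAN BE SUCCESSFUL NOW": "Superior",
    "CAN PERFORM NOW AT AN EXPERT LEVEL": "Unlikely/Overconfident",
}


def summarize(responses):
    """Return the share of responses that fall under each standard."""
    counts = Counter(STANDARDS[choice] for choice in responses)
    total = len(responses)
    return {standard: count / total for standard, count in counts.items()}


# Made-up responses from four learners, for illustration only.
sample = [
    "NEED MORE EXPERIENCE",
    "CAN BE SUCCESSFUL NOW",
    "STILL UNCLEAR",
    "NEED MORE EXPERIENCE",
]
print(summarize(sample))
```

Reporting the share of learners at each standard, rather than an average score, keeps the results tied to the decisions the standards were meant to support.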

Gauging Learner Comprehension

Another key goal of training is to ensure that learners fully comprehend what was taught. The next question is focused on that:

Now that you’ve completed the learning experience, how well do you feel you understand the concepts taught? CHOOSE ONE.

  • I am still at least SOMEWHAT CONFUSED about the concepts.
  • I am now SOMEWHAT FAMILIAR WITH the concepts.
  • I have a SOLID UNDERSTANDING of the concepts.
  • I AM FULLY READY TO USE the concepts in my work.
  • I have an EXPERT-LEVEL ABILITY to use the concepts.

Standards recommended:

  • Unacceptable (Learning Insufficient)
  • Unacceptable (Awareness Not Enough)
  • Acceptable (Learned Sufficiently)
  • Superior (Ready to Use)
  • Unlikely/Overconfident (Maybe Not Attending to Question)

Gauging After-Learning Support

Another goal of training is to provide learners with after-learning support, to increase the likelihood that learning will transfer:

After the course, when you begin to apply your new knowledge at your worksite, which of the following supports are likely to be in place for you? SELECT AS MANY ITEMS as are likely to be true.

  • MY MANAGER WILL ACTIVELY SUPPORT ME with key supports like time, resources, advice, and/or encouragement.
  • I will use A COACH OR MENTOR to guide me in applying the learning to my work.
  • I will regularly receive support from A COURSE INSTRUCTOR to help me in applying the learning to my work.
  • I will use JOB AIDS like checklists, search tools, or reference materials to guide me in applying the learning to my work.
  • I will be PERIODICALLY REMINDED (for at least several weeks) of key concepts and skills that were taught.
  • I will NOT get much direct support, but will rely on my own initiative.

Using Open-Ended Questions

Open-ended questions are some of the most powerful questions you can ask. Here are two I recommend:

Which aspects of the learning helped you the most in learning what was taught?

What could have been done better to make this a more effective learning experience? Remember, your feedback is critical, especially in providing us with constructive ideas for improvement.

Final Thoughts

These are just a few of the improved questions I’m now recommending. If you want help with your smile-sheet questions, please get in touch.

If you want to read the book, go to the book’s website.

If you want to learn more about my smile-sheet workshop or rebuilds, check this out.

If you want a better question than the Net Promoter Score, use this one.

If you want to see questions I recommended six months after the book was published, look here.

One of the Biggest Lies in Learning Evaluation — Asking Learners about Level 3 and 4.

The Kirkpatrick four-level model of evaluation includes Level 1 learner reactions, Level 2 learning, Level 3 behavior, and Level 4 results. Because of the model’s ubiquity and popularity, many learning professionals and organizations are influenced or compelled by the model to measure the two higher levels (Behavior and Results) even when it doesn’t make sense to do so and even if poor methods are used to do the measurement. This pressure has led many of us astray. It has also enabled vendors to lie to us.

Let me get right to the point. When we ask learners whether a learning intervention will improve their job performance, we are getting their Level 1 reactions. We are NOT getting Level 3 data. More specifically, we are not getting information we can trust to tell us whether a person’s on-the-job behavior has improved due to the learning intervention.

Similarly, when we ask learners about the organizational results that might come from a training or elearning program, we are getting learners’ Level 1 reactions. We are NOT getting Level 4 data. More specifically, we are not getting information we can trust to tell us whether organizational results improved due to the learning intervention.

One key question is, “Are we getting information we can trust?” Another is, “Are we sure the learning intervention caused the outcome we’re targeting—or whether, at least, it was significant in helping to create the targeted outcomes?”

Whenever we gather learner answers, we have to remember that people’s subjective opinions are not always accurate. First, there are general problems with human subjectivity, including people’s tendencies to want to be nice, to see themselves and their organizations in a positive light, and to believe they themselves are more productive, intelligent, and capable than they actually are. In addition, learners don’t always know how different learning methods affect learning outcomes, so asking them to assess learning designs has to be done with great care to avoid bias.

The Foolishness of Measuring Level 3 and 4 with Learner-Input Alone

There are also specific difficulties in having learners rate Level 3 and 4 results.

  • Having learners assess Level 3 is fraught with peril because of all the biases that are entailed. Learners may want to look good to others or to themselves. They may suffer from the Dunning-Kruger effect and rate their performance at a higher level than what is deserved.
  • Assessing Level 4 organizational results is particularly problematic. It is very difficult to track all the things that influence organizational performance, and most employees cannot observe or may not fully understand the many influences that affect organizational outcomes. Asking learners for Level 4 results is therefore a dubious enterprise.

Many questions we ask learners in measuring Level 3 and 4 are biased in and of themselves. These four questions are highly biasing, and yet sadly they were taken directly from two of our industry’s best-known learning-evaluation vendors:

  • “Estimate the degree to which you improved your performance related to this course?” (Rated on a scale of percentages to 100)
  • “The training has improved my job performance.” (Rated on a numeric scale)
  • “I will be able to apply on the job what I learned during this session.” (Rated with a Likert-like scale)
  • “I anticipate that I will eventually see positive results as a result of my efforts.” (Rated with a Likert-like scale)

At least two of our top evaluation vendors make the case explicitly that smile sheets can gather Level 3 and 4 data. This is one of the great lies in the learning industry. A smile sheet garners Level 1 results! It does not capture data at any other levels.

What about delayed smile sheets, that is, questions delivered to learners weeks or months after a learning experience? Can these get Level 2, 3, and 4 data? No! Asking learners for their perspectives, regardless of when their answers are collected, still gives us only Level 1 outcomes! Yes, learners’ answers can provide hints, but the data can only be a proxy for outcomes beyond Level 1.

On top of that, the problems cited above regarding learner perspectives on their job performance and on organizational results still apply even when questions are asked well after a learning event. Remember, the key to measurement is always whether we can trust the data we are collecting! To reiterate, asking learners for their perspectives on behavior and results suffers from the following:

  • Learners’ biases skew the data
  • Learners’ blind spots make their answers suspect
  • Biased questioning spoils the data
  • The complexity in determining the network of causal influences makes assessments of learning impact difficult or impossible

In situations where learner perspectives are so in doubt, asking learners questions may generate some reasonable hypotheses, but these hypotheses must then be tested by other means.

The Ethics of the Practice

It is unfair to call Level 1 data Level 3 data or Level 4 data.

In truth, it is not only unfair, it is deceptive, disingenuous, and harmful to our learning efforts.

How Widespread is this Misconception?

If two of our top vendors are spreading this misconception, we can be pretty sure that our friend-and-neighbor foot soldiers are marching to the beat.

Last week, I posted a Twitter poll asking the following question:

If you ask your learners how the training will impact their job performance, what #Kirkpatrick level is it?

Twitter polls allow only four choices, so I offered the four Kirkpatrick levels: Level 1 (Reaction), Level 2 (Learning), Level 3 (Behavior), and Level 4 (Results).

Over 250 people responded (253). Here are the results:

  • Level 1 — Reaction (garnered 31% of the votes)
  • Level 2 — Learning (garnered 15% of the votes)
  • Level 3 — Behavior (garnered 38% of the votes)
  • Level 4 — Results (garnered 16% of the votes)

Level 1 is the correct answer! Level 3 is the most common misconception!
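For readers who want to check the arithmetic, here is a minimal sketch in Python that turns the reported percentages back into approximate vote counts. Only the 253 respondents and the four rounded percentages come from the poll; the function and variable names are my own, and because the percentages are rounded, the counts are estimates rather than the actual tallies.

```python
# Figures reported above: 253 respondents and rounded percentages per level.
TOTAL_VOTES = 253
reported_share = {
    "Level 1 (Reaction)": 31,
    "Level 2 (Learning)": 15,
    "Level 3 (Behavior)": 38,
    "Level 4 (Results)": 16,
}


def approximate_votes(total, share_percent):
    """Estimate vote counts from rounded percentages (approximate by construction)."""
    return {level: round(total * pct / 100) for level, pct in share_percent.items()}


votes = approximate_votes(TOTAL_VOTES, reported_share)

# Share of respondents who picked anything other than the correct Level 1.
misclassified_pct = sum(
    pct for level, pct in reported_share.items() if not level.startswith("Level 1")
)

for level, count in votes.items():
    print(f"{level}: about {count} votes")
print(f"Chose a level other than Level 1: {misclassified_pct}%")
```

By this estimate, roughly 78 people chose Level 1 and roughly 96 chose Level 3, and 69% of respondents chose something other than Level 1.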

And note, given that Twitter is built on a social-media follower-model—and many people who follow me have read my book on Performance-Focused Smile Sheets, where I specifically debunk this misconception—I’m sure this result is NOT representative of the workplace learning field in general. I’m certain that in the field, more people believe that the question represents a Level 3 measure.

Yes, it is true what they say! People like you who read my work are more informed and less subject to the vagaries of vendor promotions. Also better looking, more bold, and more likely to be humble humanitarians!

My tweet offered one randomly-chosen winner a copy of my award-winning book. And the winner is:

Sheri Kendall-DuPont, known on Twitter as:

Thanks to everyone who participated in the poll…

Replacement for the Net Promoter Score—For Learning Assessments

The Net Promoter Score is one of the most popular smile-sheet questions in use. Unfortunately, it is fatally flawed for learning. I’ve written about NPS’s problems before. Essentially, NPS was designed for marketing purposes to get people’s feelings about the products they were using. NPS was NOT designed for learning. Also, the wording and choices of the question are too fuzzy to be meaningful. Finally, and most damning, NPS follows traditional smile sheets in focusing on learner satisfaction and course reputation—even though research has shown that traditional smile sheets are uncorrelated with learning!!

Despite these problems, organizations continue their blind allegiance to NPS.

Oftentimes, we are forced into doing stupid things by our organizational stakeholders, mostly because there seems to be no alternative. Let me provide one.

Can we gauge learner satisfaction in a way that focuses the question toward learning effectiveness and less on entertainment, enjoyment, ease of attendance, etc.? Yes. We. Can!

Net Effectiveness Score (NES)

Here’s the question:

If someone asked you about the effectiveness of the learning experience, would you recommend the learning to them? CHOOSE ONE.

  • The learning was TOO INEFFECTIVE to recommend.
  • The learning was INEFFECTIVE ENOUGH THAT I WOULD BE HESITANT to recommend it.
  • The learning was NOT FULLY EFFECTIVE, BUT I would recommend it IF IMPROVEMENTS WERE MADE to the learning.
  • The learning was NOT FULLY EFFECTIVE, BUT I would still recommend it EVEN IF NO CHANGES WERE MADE to the learning.
  • The learning was EFFECTIVE, SO I WOULD RECOMMEND IT.
  • The learning was VERY EFFECTIVE, SO I WOULD HIGHLY RECOMMEND IT.

This question has several benefits over the NPS question.

  1. It focuses on learning.
  2. It prompts learners to think about learning effectiveness.
  3. It has concrete answer choices, not fuzzy numeric ones.
  4. It will create meaningful results.
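One possible way to report answers to this question is simply to show the percentage of learners who selected each choice. The sketch below, in Python, does that. It is an illustration under my own assumptions, not a prescribed scoring method: the short labels, the `summarize` function, and the sample responses are all invented for the example, and it deliberately does not collapse the answers into a single "net" number.

```python
from collections import Counter

# Answer choices abbreviated from the question above; these short labels are
# hypothetical shorthand, not official wording.
CHOICES = [
    "too ineffective to recommend",
    "hesitant to recommend",
    "recommend if improved",
    "recommend even without changes",
    "effective, would recommend",
    "very effective, would highly recommend",
]


def summarize(responses):
    """Return the percentage of respondents selecting each choice, in question order."""
    total = len(responses)
    if total == 0:
        return {choice: 0.0 for choice in CHOICES}
    counts = Counter(responses)
    return {choice: round(100 * counts[choice] / total, 1) for choice in CHOICES}


# Invented sample data for illustration only (indexes into CHOICES).
sample = [CHOICES[i] for i in (5, 4, 4, 2, 3, 4, 1, 5, 2, 4)]
summary = summarize(sample)

for choice, pct in summary.items():
    print(f"{pct:5.1f}%  {choice}")
```

Keeping the full distribution visible preserves the meaning of the concrete answer choices, which is the point of replacing a fuzzy numeric scale.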

By the way, this question should be delivered after other smile-sheet questions that nudge learners to think about learning factors that really matter.

To learn more about performance-focused learner-feedback questions, either get in touch with me or check out my book.

Brett Christensen Uses the Performance-Focused Smile Sheet Methodology

This week, Brett Christensen published an article on how he’s used a Performance-Focused Smile Sheet to support him in teaching one of ISPI’s flagship workshops.

What I found particularly striking is how Brett used the smile-sheet results to make sense of learning effectiveness. His goal was to help his learners be able to take what they’ve learned and use it back on the job.

Results from one smile-sheet question suggested that learners felt they had gained awareness of concepts but might not be fully able to put what they learned into practice. This raised a red flag, so Brett examined results from another question on the amount of practice received in the workshop. The learners told him that practice was only a little more than 50% of the workshop, and Brett used this information to consider adding more practice.

He also used a question to get a sense of whether the spacing effect, a key research-based learning approach, was being used to support long-term remembering. The news there was good: even in a one-day workshop, many learners felt repetitions were delivered after a delay of an hour or more. Good instructional design!

For a century or more, our learner-feedback questions have focused on satisfaction, course reputation, and other factors that are NOT directly related to learning effectiveness. Now we have a new methodology, first described in the award-winning book, Performance-Focused Smile Sheets: A Radical Rethinking of a Dangerous Art Form. We ought to use this to get feedback about what we can do better.

Brett offers a wonderful case study from his work teaching a course offered through ISPI (developed by Dr. Roger Chevalier). We are no longer hogtied by evaluations that provide us with bogus information. We can look for ways to get better feedback, improve our learning interventions, and get better results.

To read Brett’s full article, click here…