The Backfire Effect is NOT Prevalent: Good News for Debunkers, Humans, and Learning Professionals!


An exhaustive new research study reveals that the backfire effect is not as prevalent as previous research once suggested. This is good news for debunkers, those who attempt to correct misconceptions. This may be good news for humanity as well. If we cannot reason from truth, if we cannot reliably correct our misconceptions, we as a species will certainly be diminished—weakened by realities we have not prepared ourselves to overcome. For those of us in the learning field, the removal of the backfire effect as an unbeatable Goliath is good news too. Perhaps we can correct the misconceptions about learning that every day wreak havoc on our learning designs, hurt our learners, push ineffective practices, and cause an untold waste of time and money spent chasing mythological learning memes.



The Backfire Effect

The backfire effect is a fascinating phenomenon. It occurs when a person is confronted with information that contradicts an incorrect belief they hold. The term refers to the surprising finding that attempts to persuade others with truthful information may actually lead believers to hold the untruth even more strongly than if they had not been confronted in the first place.

The term “backfire effect” was coined by Brendan Nyhan and Jason Reifler in a 2010 scientific article on political misperceptions. Their article caused an international sensation, both in the scientific community and in the popular press. At a time when dishonesty in politics seems to be at historically high levels, this is no surprise.

In their article, Nyhan and Reifler concluded:

“The experiments reported in this paper help us understand why factual misperceptions about politics are so persistent. We find that responses to corrections in mock news articles differ significantly according to subjects’ ideological views. As a result, the corrections fail to reduce misperceptions for the most committed participants. Even worse, they actually strengthen misperceptions among ideological subgroups in several cases.”

Subsequently, other researchers found similar backfire effects, and notable researchers working in the area (e.g., Lewandowsky) have expressed the rather fatalistic view that attempts at correcting misinformation were unlikely to work—that believers would not change their minds even in the face of compelling evidence.


Debunking the Myths in the Learning Field

As I have communicated many times, there are dozens of dangerously harmful myths in the learning field, including learning styles, neuroscience as fundamental to learning design, and the myth that “people remember 10% of what they read, 20% of what they hear, 30% of what they see…etc.” I even formed a group to confront these myths (The Debunker Club), although, and I must apologize, I have not had the time to devote to enabling our group to be more active.

The “backfire effect” was a direct assault on attempts to debunk myths in the learning field. Why bother if we would make no difference? If believers of untruths would continue to believe? If our actions to persuade would have a boomerang effect, causing false beliefs to be believed even more strongly? It was a leg-breaking, breath-taking finding. I wrote a set of recommendations to debunkers in the learning field on how best to be successful in debunking, but admittedly many of us, me included, were left feeling somewhat paralyzed by the backfire finding.

Ironically perhaps, I was not fully convinced. Indeed, some may think I suffered from my own backfire effect. In reviewing a scientific research review in 2017 on how to debunk, I implored that more research be done so we could learn more about how to debunk successfully, but I also argued that misinformation simply couldn’t be a permanent condition, that there was ample evidence to show that people could change their minds even on issues that they once believed strongly. Racist bigots have become voices for diversity. Homophobes have embraced the rainbow. Religious zealots have become agnostic. Lovers of technology have become anti-technology. Vegans have become paleo meat lovers. Devotees of Coke have switched to Pepsi.

The bottom line is that organizations waste millions of dollars every year when they use faulty information to guide their learning designs. As professionals in the learning field, it is our responsibility to avoid the danger of misinformation! But is this even possible?


The Latest Research Findings

There is good news in the latest research! Thomas Wood and Ethan Porter (2018) have just published an article reporting that they could find no evidence of a backfire effect. They replicated the Nyhan and Reifler research, expanded tenfold the number of misinformation instances studied, modified the wording of their materials, used over 10,000 participants, and varied their methods for obtaining those participants. Across all of these variations, they found no evidence of a backfire effect.

“We find that backfire is stubbornly difficult to induce, and is thus unlikely to be a characteristic of the public’s relationship to factual information. Overwhelmingly, when presented with factual information that corrects politicians—even when the politician is an ally—the average subject accedes to the correction and distances himself from the inaccurate claim.”

There is additional research to show that people can change their minds, that fact-checking can work, that feedback can correct misconceptions. Rich and Zaragoza (2016) found that misinformation can be fixed with corrections. Rich, Van Loon, Dunlosky, and Zaragoza (2017) found that corrective feedback could work, if it was designed to be believed. More directly, Nyhan and Reifler (2016), in work cited by the American Press Institute Accountability Project, found that fact checking can work to debunk misinformation.


Some Perspective

First of all, let’s acknowledge that science sometimes works slowly. We don’t yet know all we will know about these persuasion and information-correction effects.

Also, let’s please be careful to note that backfire effects, when they are actually evoked, are typically found in situations where people are ideologically inclined toward a system of beliefs with which they strongly identify. Backfire effects have been studied most often in situations where someone identifies as a conservative or a liberal and that political identity is singularly or strongly important to their sense of self. Are folks in the learning field such strong believers in a system of beliefs and identity that they would easily suffer from the backfire effect? Maybe sometimes, but perhaps less likely than in the area of political belief, which seems to consume many of us.

Here are some learning-industry beliefs that may be so deeply held that the light of truth may not penetrate easily:

  • Belief that learners know what is best for their learning.
  • Belief that learning is about conveying information.
  • Belief that we as learning professionals must kowtow to our organizational stakeholders, that we have no grounds to stand by our own principles.
  • Belief that our primary responsibility is to our organizations not our learners.
  • Belief that learner feedback is sufficient in revealing learning effectiveness.

These beliefs seem to undergird other beliefs, and I have seen in my own work how they can make it difficult to convey important truths. Let me be clear, though: it is speculative on my part that these beliefs have substantial influence; this is a conjecture. Note also that, given that the research on the “backfire effect” has now been shown to be tenuous, I’m not claiming that challenging such foundational beliefs will cause damage. On the contrary, it seems like it might be worth doing.


Knowledge May Be Modifiable, But Attitudes and Belief Systems May Be Harder to Change

The original backfire-effect research suggested that people believed falsehoods more strongly after being confronted with correcting information, but this framing misses an important distinction. There are facts, and there are attitudes, belief systems, and policy preferences.

A fascinating thing happened when Wood and Porter looked for—but didn’t find—the backfire effect. They talked with the original researchers, Nyhan and Reifler, and they began working together to solve the mystery. Why did the backfire effect happen sometimes but not regularly?

In a recent episode (January 28, 2018) of the “You Are Not So Smart” podcast, Wood, Porter, and Nyhan were interviewed by David McRaney, and they nicely clarified the distinction between factual backfire and attitudinal backfire.


“People often focus on changing factual beliefs with the assumption that it will have consequences for the opinions people hold, or the policy preferences that they have, but we know from lots of social science research…that people can change their factual beliefs and it may not have an effect on their opinions at all.”

“The fundamental misconception here is that people use facts to form opinions and in practice that’s not how we tend to do it as human beings. Often we are marshaling facts to defend a particular opinion that we hold and we may be willing to discard a particular factual belief without actually revising the opinion that we’re using it to justify.”


“Factual backfire if it exists would be especially worrisome, right? I don’t really believe we are going to find it anytime soon… Attitudinal backfire is less worrisome, because in some ways attitudinal backfire is just another description for failed persuasion attempts… that doesn’t mean that it’s impossible to change your attitude. That may very well just mean that what I’ve done to change your attitude has been a failure. It’s not that everyone is immune to persuasion, it’s just that persuasion is really, really hard.”

McRaney (Podcast Host):

“And so the facts suggest that the facts do work, and you absolutely should keep correcting people’s misinformation because people do update their beliefs and that’s important, but when we try to change people’s minds by only changing their [factual] beliefs, you can expect to end up engaging in belief whack-a-mole, correcting bad beliefs left and right as the person on the other side generates new ones to support, justify, and protect the deeper psychological foundations of the self.”


“True backfire effects, when people are moving overwhelmingly in the opposite direction, are probably very rare, they are probably on issues where people have very strong fixed beliefs….”


Rise Up! Debunk!

Here’s the takeaway for us in the learning field who want to be helpful in moving practice to more effective approaches.

  • While there may be some underlying beliefs that influence thinking in the learning field, they are unlikely to be as strongly held as the political beliefs that researchers have studied.
  • The research seems fairly clear that factual backfire effects are extremely unlikely to occur, so we should not be afraid to debunk factual inaccuracies.
  • Persuasion is difficult but not impossible, so it is worth making attempts to debunk. Such attempts are likely to be more effective if we take a change-management approach, look to the science of persuasion, and persevere respectfully and persistently over time.

Here is the message that one of the researchers, Tom Wood, wants to convey:

“I want to affirm people. Keep going out and trying to provide facts in your daily lives and know that the facts definitely make some difference…”

Here are some methods of persuasion from a recent article by Flynn, Nyhan, and Reifler (2017) that have worked even with people’s strongly-held beliefs:

  • When the persuader is seen to be ideologically sympathetic with those who might be persuaded.
  • When the correct information is presented in a graphical form rather than a textual form.
  • When an alternative causal account of the original belief is offered.
  • When credible or professional fact-checkers are utilized.
  • When multiple “related stories” are also encountered.

The stakes are high! Bad information permeates the learning field and makes our learning interventions less effective, harming our learners and our organizations while wasting untold resources.

We owe it to our organizations, our colleagues, and our fellow citizens to debunk bad information when we encounter it!

Let’s not be assholes about it! Let’s do it with respect, with openness to being wrong, and with all our persuasive wisdom. But let’s do it. It’s really important that we do!


Research Cited

Nyhan, B., & Reifler, J. (2010). When corrections fail: The persistence of political misperceptions. Political Behavior, 32(2), 303–330.

Nyhan, B., & Reifler, J. (2016). Do people actually learn from fact-checking? Evidence from a longitudinal study during the 2014 campaign.

Rich, P. R., Van Loon, M. H., Dunlosky, J., & Zaragoza, M. S. (2017). Belief in corrective feedback for common misconceptions: Implications for knowledge revision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 43(3), 492-501.

Rich, P. R., & Zaragoza, M. S. (2016). The continued influence of implied and explicitly stated misinformation in news reports. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42(1), 62-74.

Wood, T., & Porter, E. (2018). The elusive backfire effect: Mass attitudes’ steadfast factual adherence. Political Behavior. Advance online publication.


Donald Kirkpatrick was NOT the Originator of the Four-Level Model of Learning Evaluation


Donald Kirkpatrick (1924-2014) was a giant in the workplace learning and development field, widely known for creating the four-level model of learning evaluation. The evidence, however, contradicts this creation myth and points to Raymond Katzell, a distinguished industrial-organizational psychologist, as the true originator. This, of course, does not diminish Don Kirkpatrick’s contribution to framing and popularizing the four-level framework of learning evaluation.

The Four-Levels Creation Myth

The four-level model is traditionally traced back to a series of four articles Donald Kirkpatrick wrote in 1959 and 1960, each article covering one of the four levels: Reaction, Learning, Behavior, and Results. These articles were published in the magazine of ASTD (then called the American Society of Training Directors). Here’s a picture of the first page of the first article:

In June of 1977, ASTD (known by then as the American Society for Training and Development, now ATD, the Association for Talent Development) reissued Kirkpatrick’s original four articles, combining them into one article in the Training and Development Journal. The story has always been that it was those four articles that introduced the world to the four-level model of training evaluation.

In 1994, in the first edition of his book, Evaluating Training Programs: The Four Levels, Donald Kirkpatrick wrote:

“In 1959, I wrote a series of four articles called ‘Techniques for Evaluating Training Programs,’ published in Training and Development, the journal of the American Society for Training and Development (ASTD). The articles described the four levels of evaluation that I had formulated. I am not sure where I got the idea for this model, but the concept originated with work on my Ph.D. dissertation at the University of Wisconsin, Madison.” (p. xiii). [Will’s Note: Kirkpatrick was slightly inaccurate here. At the time of his four articles, the initials ASTD stood for the American Society of Training Directors and the four articles were published in the Journal of the American Society of Training Directors. This doesn’t diminish Kirkpatrick’s central point: that he was the person who formulated the four levels of learning evaluation].

In 2011, in a video tribute to Dr. Kirkpatrick, he was asked how he came up with the four levels. This is what he said:

“[after I finished my dissertation in 1954], between 54 and 59 I did some research on behavior and results. I went into companies. I found out are you using what you learned and if so what can you show any evidence of productivity or quality or more sales or anything from it. So I did some research and then in 1959 Bob Craig, editor of the ASTD journal, called me and said, ‘Don, I understand you’ve done some research on evaluation would you write an article?’ I said, ‘Bob, I’ll tell you what I’ll do, I’ll write four articles, one on reaction, one on learning, one on behavior, and one on results.'”

In 2014, when asked to reminisce on his legacy, Dr. Kirkpatrick said this:

“When I developed the four levels in the 1950s, I had no idea that they would turn into my legacy. I simply needed a way to determine if the programs I had developed for managers and supervisors were successful in helping them perform better on the job. No models available at that time quite fit the bill, so I created something that I thought was useful, implemented it, and wrote my dissertation about it.” (Quote from blog post published January 22, 2014).

As recently as this month (January 2018), on the Kirkpatrick Partners website, the following is written:

“Don was the creator of the Kirkpatrick Model, the most recognized and widely used training evaluation model in the world. The four levels were developed in the writing of his Ph.D. dissertation, Evaluating a Human Relations Training Program for Supervisors.”

Despite these public pronouncements, Kirkpatrick’s legendary 1959-1960 articles were not the first published evidence of a four-level evaluation approach.

Raymond Katzell’s Four-Step Framework of Evaluation

In an article written by Donald Kirkpatrick in 1956, the following “steps” were laid out and were attributed to “Raymond Katzell, a well known authority in the field [of training evaluation].”

  1. To determine how the trainees feel about the program.
  2. To determine how much the trainees learn in the form of increased knowledge and understanding.
  3. To measure the changes in the on-the-job behavior of the trainees.
  4. To determine the effects of these behavioral changes on objective criteria such as production, turnover, absenteeism, and waste.

These four steps are the same as Kirkpatrick’s four levels, except there are no labels.

Raymond Katzell went on to a long and distinguished career as an industrial-organizational psychologist, even winning the Society for Industrial and Organizational Psychology’s Distinguished Scientific Contributions Award.

Raymond Katzell. Picture used by SIOP (Society for Industrial and Organizational Psychology) when they talk about The Raymond A. Katzell Media Award in I-O Psychology.

The first page of Kirkpatrick’s 1956 article—written three years before his famous 1959 introduction to the four levels—is pictured below:

And here is a higher-resolution view of the quote from that front page, regarding Katzell’s contribution:

So Donald Kirkpatrick mentions Katzell’s four-step model in 1956, but not in 1959 when he—Kirkpatrick—introduces the four labels in his classic set of four articles.

It Appears that Kirkpatrick Never Mentions Katzell’s Four Steps Again

As far as I can tell, after searching for and examining many publications, Donald Kirkpatrick never mentioned Katzell’s four steps after his 1956 article.

Three years after the 1956 article, Kirkpatrick did not mention Katzell’s taxonomy when he wrote his four famous articles in 1959. He did mention an unrelated article where Katzell was a co-author (Merrihue & Katzell, 1955), but he did not mention Katzell’s four steps.

Neither did Kirkpatrick mention Katzell in his 1994 book, Evaluating Training Programs: The Four Levels.

Nor did Kirkpatrick mention Katzell in the third edition of the book, written with Jim Kirkpatrick, his son.

Nor was Katzell mentioned in a later version of the book written by Jim and Wendy Kirkpatrick in 2016. I spoke with Jim and Wendy recently (January 2018), and they seemed as surprised as I was about the 1956 article and about Raymond Katzell.

Nor did Donald Kirkpatrick mention Katzell in any of the interviews he did to mark the many anniversaries of his original 1959-1960 articles.

To summarize, Katzell, despite coming up with the four-step taxonomy of learning evaluation, was only given credit by Kirkpatrick once, in the 1956 article, three years prior to the articles that introduced the world to the Kirkpatrick Model’s four labels.

Kirkpatrick’s Dissertation

Kirkpatrick did not introduce the four levels in his 1954 dissertation. There is not even a hint of a four-level framework.

In his dissertation, Kirkpatrick cited two publications by Katzell. The first was an article from 1948, “Testing a Training Program in Human Relations.” That article studies the effect of leadership training but makes no mention of Katzell’s four steps. It does, however, hint at the value of measuring on-the-job performance, in this case the value of leadership behaviors. Katzell writes, “Ideally, a training program of this sort [a leadership training program] should be evaluated in terms of the on-the-job behavior of those with whom the trainees come in contact.”

The second Katzell article cited by Kirkpatrick in his dissertation was an article entitled “Can We Evaluate Training?” from 1952. Unfortunately, it was a mimeographed article published by the Industrial Management Institute at the University of Wisconsin, and it seems to be lost to history. Even after several weeks of effort (in late 2017), the University of Wisconsin Archives could not locate the article. Interestingly, in a 1955 publication entitled “Monthly Checklist of State Publications,” a subtitle was added to Katzell’s Can We Evaluate Training? The subtitle was: “A summary of a one day Conference for Training Managers” from April 23, 1952.

To be clear, Kirkpatrick did not mention the four levels in his 1954 dissertation. The four levels notion came later.

How I Learned about Katzell’s Contribution

I’ve spent the last several years studying learning evaluation, and as part of these efforts, I decided to find Kirkpatrick’s original four articles and reread them. In 2017, ATD (the Association for Talent Development) had a wonderful archive of the articles it had published over the years. As I searched for “Kirkpatrick,” several other articles—besides the famous four—came up, including the 1956 article. I was absolutely freaking stunned when I read it. Donald Kirkpatrick had cited Katzell as the originator of the four-level notion!!!

I immediately began searching for more information on the Kirkpatrick-Katzell connection and found that I wasn’t the first person to uncover it. I found an article by Stephen Smith, who acknowledged Katzell’s contribution in 2008, also in an ASTD publication. I communicated with Smith recently (December 2017), and he had nothing but kind words to say about Donald Kirkpatrick, who he said coached him on training evaluations. Here is a graphic taken directly from Smith’s 2008 article:

Smith’s article was not focused on Katzell’s contribution to the four levels, which is probably why it wasn’t more widely cited. In 2011, Cynthia Lewis wrote a dissertation and directly compared the Katzell and Kirkpatrick formulations. She appears to have learned about Katzell’s contribution from Smith’s 2008 article. Lewis’s (2011) comparison chart is reproduced below:

In 2004, four years before Smith wrote his article with the Katzell sidebar, ASTD republished Kirkpatrick’s 1956 article—the one in which Kirkpatrick acknowledges Katzell’s four steps. Here is the front page of that article:

In 2016, an academic article appeared in a book that referred to the Katzell-Kirkpatrick connection. The book is only available in French and the article appears to have had little impact in the English-speaking learning field. Whereas neither Kirkpatrick’s 2004 reprint nor Smith’s 2008 article offered commentary about Katzell’s contribution except to acknowledge it, Bouteiller, Cossette, & Bleau (2016) were clear in stating that Katzell deserves to be known as the person who conceptualized the four levels of training evaluation, while Kirkpatrick should get credit for popularizing it. The authors also lamented that Kirkpatrick, who himself recognized Katzell as the father of the four-level model of evaluation in his 1956 article, completely ignored Katzell for the next 55 years and declared himself in all his books and on his website as the sole inventor of the model. I accessed their chapter through Google Scholar and used Google Translate to make sense of it. I also followed up with two of the authors (Bouteiller and Cossette in January 2018) to confirm I was understanding their messaging clearly.

Is There Evidence of a Transgression?

Raymond Katzell seems to be the true originator of the four-level framework of learning evaluation and yet Donald Kirkpatrick on multiple occasions claimed to be the creator of the four-level model.

Of course, we can never know the full story. Kirkpatrick and Katzell are dead. Perhaps Katzell willingly gave his work away. Perhaps Kirkpatrick asked Katzell if he could use it. Perhaps Kirkpatrick cited Katzell because he wanted to bolster the credibility of a framework he developed himself. Perhaps Kirkpatrick simply forgot Katzell’s four steps when he went on to write his now-legendary 1959-1960 articles. This last explanation may seem a bit forced given that Kirkpatrick referred to the Merrihue and Katzell work in the last of his four articles—and we might expect that the name “Katzell” would trigger memories of Katzell’s four steps, especially given that Katzell was cited by Kirkpatrick as a “well known authority.” This forgetting hypothesis also doesn’t explain why Kirkpatrick would continue to fail to acknowledge Katzell’s contribution after ASTD republished Kirkpatrick’s 1956 article in 2004 or after Stephen Smith’s 2008 article showed Katzell’s four steps. Smith was well-known to Kirkpatrick and is likely to have at least mentioned his article to Kirkpatrick.

We can’t know for certain what transpired, but we can analyze the possibilities. Plagiarism means that we take another person’s work and claim it as our own. Plagiarism, then, has two essential features (see this article for details). First, an idea or creation is copied in some way. Second, no attribution is offered. That is, no credit is given to the originator. Kirkpatrick had clear contact with the essential features of Katzell’s four-level framework. He wrote about them in 1956! This doesn’t guarantee that he copied them intentionally. He could have generated the four levels subconsciously, without knowing that Katzell’s ideas were influencing his thinking. Alternatively, he could have spontaneously created them without any influence from Katzell’s ideas. People often generate similar ideas when the stimuli they encounter are similar. How many people claim that they invented the term, “email?” Plagiarism does not require intent, but intentional plagiarism is generally considered a higher-level transgression than sloppy scholarship.

A personal example of how easy it is to think you invented something: In the 1990s or early 2000s, I searched for just the right words to explain a concept. I wrangled with it for several weeks. Finally, I came up with the perfect wording, with just the right connotation: “Retrieval Practice.” It was better than the prevailing terminology at the time—the testing effect—because people could retrieve without being tested. Eureka, I thought! Brilliant, I thought! It was several years later, rereading Robert Bjork’s 1988 article, “Retrieval practice and the maintenance of knowledge,” that I realized my label was not original to me, and that even if I did generate it without consciously thinking of Bjork’s work, my previous contact with the term “retrieval practice” almost certainly influenced my creative construction.

The second requirement for plagiarism is that the original creator is not given credit. This is evident in the case of the four levels of learning evaluation. Donald Kirkpatrick never mentioned Katzell after 1956. He certainly never mentioned Katzell when it would have been most appropriate, for example when he first wrote about the four levels in 1959, when he first published a book on the four levels in 1994, and when he received awards for the four levels.

Finally, one comment may be telling: Kirkpatrick’s statement from his 1994 book that “I am not sure where I got the idea for this model, but the concept originated with work on my Ph.D. dissertation at the University of Wisconsin, Madison.” The statement seems to suggest that Kirkpatrick recognized that there was a source for the four-level model—a source that was not Kirkpatrick himself.

Here is the critical timeline:

  • Katzell was doing work on learning evaluation as early as 1948.
  • Kirkpatrick’s 1954 dissertation offers no trace of a four-part learning-evaluation framework.
  • In 1956, the first reference to a four-part learning evaluation framework was offered by Kirkpatrick and attributed to Raymond Katzell.
  • In 1959, the first mention of the Kirkpatrick terminology (i.e., Reaction, Learning, Behavior, Results) was published, but Katzell was not credited.
  • In 1994, Kirkpatrick published his book on the four levels, saying specifically that he formulated the four levels. He did not mention Katzell’s contribution.
  • In 2004, Kirkpatrick’s 1956 article was republished, repeating Kirkpatrick’s acknowledgement that Katzell invented the four-part framework of learning evaluation.
  • In 2008, Smith published the article where he cited Katzell’s contribution.
  • In 2014, Kirkpatrick claimed to have developed the four levels in the 1950s.
  • As far as I’ve been able to tell—corroborated by Bouteiller, Cossette, & Bleau (2016)—Donald Kirkpatrick never mentioned Katzell’s four-step formulation after 1956.

Judge Not Too Quickly

I have struggled writing this article, and have rewritten it dozens of times. I shared an earlier version with four trusted colleagues in the learning field and asked them if I was being fair. I’ve searched exhaustively for source documents. I reached out to key players to see if I was missing something.

It is not a trifle to curate evidence that impacts other people’s reputations. It is a sacred responsibility. I as the writer have the most responsibility, but you as a reader have a responsibility too to weigh the evidence and make your own judgments.

First we should not be too quick to judge. We simply don’t know why Donald Kirkpatrick never mentioned Katzell after the original 1956 article. Indeed, perhaps he did mention Katzell in his workshops and teachings. We just don’t know.

Here are some distinct possibilities:

  • Perhaps Katzell and Kirkpatrick had an agreement that Kirkpatrick could write about the four levels. Let’s remember the 1959-1960 articles were not written to boost Kirkpatrick’s business interests. He didn’t have any business interests at that time—he was an employee—and his writing seemed aimed specifically at helping others do better evaluation.
  • Perhaps Kirkpatrick, being a young man without much of a résumé in 1956, had developed a four-level framework but felt he needed to cite Katzell in 1956 to add credibility to his own ideas. Perhaps later, in 1959, he dropped this false attribution to give himself the credit he deserved.
  • Perhaps Kirkpatrick felt that citing Katzell once was enough. While many academics and researchers see plagiarism as one of the deadly sins, others have not been acculturated into the strongest form of this ethos. Let’s remember that in 1959 Kirkpatrick was not intending to create a legendary meme; he was just writing some articles. Perhaps at the time it didn’t seem important to acknowledge Katzell’s contribution. I don’t mean to dismiss this lightly. All of us are raised to believe in fairness and giving credit where credit is due. Indeed, research suggests that even the youngest infants have a sense of fairness. Kirkpatrick earned his doctorate at a prestigious research university. He should have been aware of the ethic of attribution, but perhaps because the 1959-1960 articles seemed so insignificant at the time, it didn’t seem important to cite Katzell.
  • Perhaps Kirkpatrick intended to cite Katzell’s contribution in his 1959-1960 articles but the journal editor talked him out of it or disallowed it.
  • Perhaps Kirkpatrick realized that Katzell’s four steps were simply not resonant enough to be important. Let’s admit that Kirkpatrick’s framing of the four levels into the four labels was a brilliant marketing masterstroke. If Kirkpatrick believed this, he might have seen Katzell’s contribution as minimal and not deserving of acknowledgement.
  • Perhaps Kirkpatrick completely forgot Katzell’s four-step taxonomy. Perhaps it didn’t influence him when he created his four labels; perhaps he didn’t think of Katzell’s contribution when he wrote about Katzell’s article with Merrihue; perhaps for the rest of his life he never remembered Katzell’s formulation, never saw the 2004 reprinting of his 1956 article, never saw Smith’s 2008 article, and never talked with Smith about Katzell’s work even though Smith has claimed a working relationship. Admittedly, this last possibility seems unlikely.

Let us also not judge Jim and Wendy Kirkpatrick, proprietors of Kirkpatrick Partners, a global provider of learning-evaluation workshops and consulting. None of this is on them! They were genuinely surprised to hear the news when I told them. They seemed to have no idea about Katzell or his contribution. What is past is past, and Jim and Wendy bear no responsibility for the history recounted here. What they do henceforth is their responsibility. Already, since we spoke last week, they have updated their website to acknowledge Katzell’s contribution!

Article Update (two days after original publication of this article): Yesterday, on the 31st of January 2018, Jim and Wendy Kirkpatrick posted a blog entry (copied here for the historic record) that acknowledged Katzell’s contribution but ignored Donald Kirkpatrick’s failure to credit Katzell as the originator of the four-level concept.

What about our trade associations and their responsibilities? It seems that ASTD bears responsibility for its actions over the years: first as the American Society of Training Directors, which published the 1959-1960 articles without insisting that Katzell be acknowledged even though it had itself published the 1956 article with Katzell’s four-step framework on the first page; and later as the American Society for Training and Development, which republished the 1959-1960 articles in 1977 and Kirkpatrick’s 1956 article in 2004. Recently rebranded as ATD (the Association for Talent Development), the organization should now make amends. Other trade associations should also help set the record straight by acknowledging Katzell’s contribution to the four-level model of learning evaluation.

Donald Kirkpatrick’s Enduring Contribution

Regardless of who invented the four-level model of evaluation, it was Donald Kirkpatrick who framed it to perfection with the four labels and popularized it, helping it spread worldwide throughout the workplace learning and performance field.

As I have communicated elsewhere, I think the four-level model has issues—that it sends messages about learning evaluation that are not helpful.

On the other hand, the four-level model has been instrumental in pushing the field toward a focus on performance improvement. This shift—away from training as our sole responsibility, toward a focus on how to improve on-the-job performance—is one of the most important paradigm shifts in the long history of workplace learning. Kirkpatrick’s popularization of the four levels enabled us—indeed, it pushed us—to see the importance of focusing on work outcomes. For this, we owe Donald Kirkpatrick a debt of gratitude.

And we owe Raymond Katzell our gratitude as well. Not only did he originate the four levels, but he also put forth the idea that it was valuable to measure the impact learners have on their organizations.

What Should We Do Now?

What now is our responsibility as workplace learning professionals? What is ethical? The preponderance of the evidence points to Katzell as the originator of the four levels and Donald Kirkpatrick as the creator of the four labels (Reaction, Learning, Behavior, Results) and the person responsible for the popularization of the four levels. Kirkpatrick himself in 1956 acknowledged Katzell’s contribution, so it seems appropriate that we acknowledge it too.

Should we call them Katzell’s Four Levels of Evaluation? Or, the Katzell-Kirkpatrick Four Levels? I can’t answer this question for you, but it seems that we should acknowledge that Katzell was the first to consider a four-part taxonomy for learning evaluation.

For me, for the foreseeable future, I will either call it the Kirkpatrick Model and then explain that Raymond Katzell was the originator of the four levels, or I’ll simply call it the Kirkpatrick-Katzell Model.

Indeed, I think in fairness to both men—Kirkpatrick for the powerful framing of his four labels and his exhaustive efforts to popularize the model and Katzell for the original formulation—I recommend that we call it the Kirkpatrick-Katzell Four-Level Model of Training Evaluation. Or simply, the Kirkpatrick-Katzell Model.

Research Cited

Bjork, R. A. (1988). Retrieval practice and the maintenance of knowledge. In M. M. Gruneberg, P. E. Morris, R. N. Sykes (Eds.), Practical Aspects of Memory: Current Research and Issues, Vol. 1., Memory in Everyday Life (pp. 396-401). NY: Wiley.

Bouteiller, D., Cossette, M., & Bleau, M-P. (2016). Modèle d’évaluation de la formation de Kirkpatrick: retour sur les origines et mise en perspective. Dans M. Lauzier et D. Denis (éds.), Accroître le transfert des apprentissages: Vers de nouvelles connaissances, pratiques et expériences. Presses de l’Université du Québec, Chapitre 10, 297-339. [In English: Bouteiller, D., Cossette, M., & Bleau, M-P. (2016). Kirkpatrick training evaluation model: back to the origins and put into perspective. In M. Lauzier and D. Denis (eds.), Increasing the Transfer of Learning: Towards New Knowledge, Practices and Experiences. Presses de l’Université du Québec, Chapter 10, 297-339.]

Katzell, R. A. (1948). Testing a training program in human relations. Personnel Psychology, 1, 319-329.

Katzell, R. A. (1952). Can we evaluate training? A summary of a one day conference for training managers. A publication of the Industrial Management Institute, University of Wisconsin, April, 1952.

Kirkpatrick, D. L. (1956). How to start an objective evaluation of your training program. Journal of the American Society of Training Directors, 10, 18-22.

Kirkpatrick, D. L. (1959a). Techniques for evaluating training programs. Journal of the American Society of Training Directors, 13(11), 3-9.

Kirkpatrick, D. L. (1959b). Techniques for evaluating training programs: Part 2—Learning. Journal of the American Society of Training Directors, 13(12), 21-26.

Kirkpatrick, D. L. (1960a). Techniques for evaluating training programs: Part 3—Behavior. Journal of the American Society of Training Directors, 14(1), 13-18.

Kirkpatrick, D. L. (1960b). Techniques for evaluating training programs: Part 4—Results. Journal of the American Society of Training Directors, 14(2), 28-32.

Kirkpatrick, D. L. (1956-2004). A T+D classic: How to start an objective evaluation of your training program. T+D, 58(5), 1-3.

Lewis, C. J. (2011). A study of the impact of the workplace learning function on organizational excellence by examining the workplace learning practices of six Malcolm Baldridge Quality Award recipients. Dissertation, San Diego, CA.

Merrihue, W. V., & Katzell, R. A. (1955). ERI: Yardstick of employee relations. Harvard Business Review, 33, 91-99.

Salas, E., Tannenbaum, S. I., Kraiger, K., & Smith-Jentsch, K. A. (2012). The science of training and development in organizations: What matters in practice. Psychological Science in the Public Interest, 13(2), 74–101.

Smith, S. (2008). Why follow levels when you can build bridges? T+D, September 2008, 58-62.





One of the Biggest Lies in Learning Evaluation — Asking Learners about Level 3 and 4.


The Kirkpatrick four-level model of evaluation includes Level 1 learner reactions, Level 2 learning, Level 3 behavior, and Level 4 results. Because of the model’s ubiquity and popularity, many learning professionals and organizations are influenced or compelled by the model to measure the two higher levels—Behavior and Results—even when it doesn’t make sense to do so and even if poor methods are used to do the measurement. This pressure has led many of us astray. It has also enabled vendors to lie to us.

Let me get right to the point. When we ask learners whether a learning intervention will improve their job performance, we are getting their Level 1 reactions. We are NOT getting Level 3 data. More specifically, we are not getting information we can trust to tell us whether a person’s on-the-job behavior has improved due to the learning intervention.

Similarly, when we ask learners about the organizational results that might come from a training or elearning program, we are getting learners’ Level 1 reactions. We are NOT getting Level 4 data. More specifically, we are not getting information we can trust to tell us whether organizational results improved due to the learning intervention.

One key question is, “Are we getting information we can trust?” Another is, “Are we sure the learning intervention caused the outcome we’re targeting—or whether, at least, it was significant in helping to create the targeted outcomes?”

Whenever we gather learner answers, we have to remember that people’s subjective opinions are not always accurate. First, there are general problems with human subjectivity, including people’s tendencies toward wanting to be nice, toward seeing themselves and their organizations in a positive light, and toward believing they themselves are more productive, intelligent, and capable than they actually are. In addition, learners don’t always know how different learning methods affect learning outcomes, so asking them to assess learning designs has to be done with great care to avoid bias.

The Foolishness of Measuring Level 3 and 4 with Learner-Input Alone

There are also specific difficulties in having learners rate Level 3 and 4 results.

  • Having learners assess Level 3 is fraught with peril because of all the biases that are entailed. Learners may want to look good to others or to themselves. They may suffer from the Dunning-Kruger effect and rate their performance at a higher level than what is deserved.
  • Assessing Level 4 organizational results is particularly problematic. First, it is very difficult to track all the things that influence organizational performance. Asking learners for Level 4 results is a dubious enterprise because most employees cannot observe or may not fully understand the many influences that impact organizational outcomes.

Many questions we ask learners in measuring Level 3 and 4 are biased in and of themselves. These four questions are highly biasing, and yet sadly they were taken directly from two of our industry’s best-known learning-evaluation vendors:

  • “Estimate the degree to which you improved your performance related to this course?” (Rated on a scale of percentages to 100)
  • “The training has improved my job performance.” (Rated on a numeric scale)
  • “I will be able to apply on the job what I learned during this session.” (rated with a Likert-like scale)
  • “I anticipate that I will eventually see positive results as a result of my efforts.” (rated with a Likert-like scale)

At least two of our top evaluation vendors make the case explicitly that smile sheets can gather Level 3 and 4 data. This is one of the great lies in the learning industry. A smile sheet garners Level 1 results! It does not capture data at any other levels.

What about delayed smile sheets—questions delivered to learners weeks or months after a learning experience? Can these get Level 2, 3, and 4 data? No! Asking learners for their perspectives, regardless of when their answers are collected, still gives us only Level 1 outcomes! Yes, learners’ answers can provide hints, but the data can only be a proxy for outcomes beyond Level 1.

On top of that, the problems cited above regarding learner perspectives on their job performance and on organizational results still apply even when questions are asked well after a learning event. Remember, the key to measurement is always whether we can trust the data we are collecting! To reiterate, asking learners for their perspectives on behavior and results suffers from the following:

  • Learners’ biases skew the data
  • Learners’ blind spots make their answers suspect
  • Biased questioning spoils the data
  • The complexity in determining the network of causal influences makes assessments of learning impact difficult or impossible

In situations where learner perspectives are so in doubt, asking learners questions may generate some reasonable hypotheses, but then these hypotheses must be tested with other means.

The Ethics of the Practice

It is unfair to call Level 1 data Level 3 data or Level 4 data.

In truth, it is not only unfair, it is deceptive, disingenuous, and harmful to our learning efforts.

How Widespread is this Misconception?

If two of our top vendors are spreading this misconception, we can be pretty sure that our friend-and-neighbor foot soldiers are marching to the beat.

Last week, I posted a Twitter poll asking the following question:

If you ask your learners how the training will impact their job performance, what #Kirkpatrick level is it?

Twitter polls only allow four choices, so I gave people the choice of Level 1 — Reaction, Level 2 — Learning, Level 3 — Behavior, or Level 4 — Results.

Over 250 people responded (253). Here are the results:

  • Level 1 — Reaction (garnered 31% of the votes)
  • Level 2 — Learning (garnered 15% of the votes)
  • Level 3 — Behavior (garnered 38% of the votes)
  • Level 4 — Results (garnered 16% of the votes)

Level 1 is the correct answer! Level 3 is the most common misconception!

And note, given that Twitter is built on a social-media follower-model—and many people who follow me have read my book on Performance-Focused Smile Sheets, where I specifically debunk this misconception—I’m sure this result is NOT representative of the workplace learning field in general. I’m certain that in the field, more people believe that the question represents a Level 3 measure.

Yes, it is true what they say! People like you who read my work are more informed and less subject to the vagaries of vendor promotions. Also better looking, more bold, and more likely to be humble humanitarians!

My tweet offered one randomly-chosen winner a copy of my award-winning book. And the winner is:

Sheri Kendall-DuPont, known on Twitter as:

Thanks to everyone who participated in the poll…

Guest Post from Robert O. Brinkerhoff: 70-20-10: The Good, the Bad, and the Ugly


This is a guest post by Robert O. Brinkerhoff.

Rob is a renowned expert on learning evaluation and performance improvement. His books, Telling Training’s Story and Courageous Training, are classics.


70-20-10: The Good, the Bad, and the Ugly

The 70-20-10 framework may not have much, if any, research basis, but it is still a good reminder to all of us in the L&D and performance improvement professions that the workplace is a powerful teacher and poses many opportunities for practice, feedback, and improvement.

But we must also recognize that a lot of the learning that is taking place on the job may not be for the good. I have held jobs in agencies, corporations and the military where I learned many things that were counter to what the organization wanted me to learn: how to fudge records, how to take unfair advantage of reimbursement policies, how to extend coffee breaks well beyond their prescribed limits, how to stretch sick leave, and so forth.

These were relatively benign instances. Consider this: Where did VW engineers learn how to falsify engine emission results? Where did Wells Fargo staff learn how to create and sell fake accounts to their unwitting customers?

Besides these egregiously ugly examples, we also have to recognize that, in the case of L&D programming intended to support new strategic and other change initiatives, the last thing the organization needs is more people learning how to do their jobs in the old way. AT&T, for example, worked very hard to drive new beliefs and actions to enable the business to shift from landline technologies to wireless; on-the-job learning dragged it backward and still creates problems today. As Allstate Insurance tries to shift its sales focus away from casualty policies to financial planning services, the old guard teaches the opposite actions, as they continue to harvest the financial benefits of policy renewals. Any organization that has to make wholesale and fundamental shifts to execute new strategies will have to cope with the negative effects of years of on-the-job learning.

When strategy is new, there are few if any on-the-job pockets of expertise and role models. Training new employees for existing jobs is a different story. Here, obviously, the on-job space is an entirely appropriate learning resource.

In short, we have to recognize that not all on-the-job learning is learning that we want. Yet on-the-job learning remains an inexorable force that we in L&D must learn how to understand, leverage, guide, and manage.

Learning Styles Notion Still Prevalent on Google


Two and a half years ago, in writing a blog post on learning styles, I did a Google search using the words “learning styles.” I found that the top 17 search items were all advocating for learning styles, even though there was clear evidence that learning-styles approaches DO NOT WORK.

Today, I replicated that search and found the following in the top 17 search items:

  • 13 advocated/supported the learning-styles idea.
  • 4 debunked it.

That’s progress, but clearly Google is not up to the task of providing valid information on learning styles.

Scientific Research that clearly Debunks the Learning-Styles Notion:

  • Kirschner, P. A. (2017). Stop propagating the learning styles myth. Computers & Education, 106, 166-171.
  • Willingham, D. T., Hughes, E. M., & Dobolyi, D. G. (2015). The scientific status of learning styles theories. Teaching of Psychology, 42(3), 266-271.
  • Pashler, H., McDaniel, M., Rohrer, D., & Bjork, R. (2008). Learning styles: Concepts and evidence. Psychological Science in the Public Interest, 9(3), 105-119.
  • Rohrer, D., & Pashler, H. (2012). Learning styles: Where’s the evidence? Medical Education, 46(7), 634-635.

Follow the Money

  • Still no one has come forward to prove the benefits of learning styles, even though it’s been over 10 years since $1,000 was offered, and 3 years since $5,000 was offered.

Great 70-20-10 Debate

The Debunker Club, one of my hobbies, is hosting a debate about the potency/viability of the 70-20-10 model. For more information, go directly to the Debunker Club’s event page.

New Meta-Analysis on Debunking — Still an Unclear Way to Potency


A new meta-analysis on debunking was released last week, and I was hoping to get clear guidelines on how to debunk misinformation. Unfortunately, the science still seems somewhat equivocal about how to debunk. Either that, or there’s just no magic bullet.

Let’s break this down. We all know misinformation exists. People lie, people get confused and share bad information, people don’t vet their sources, incorrect information is easily spread, et cetera. Debunking is the act of providing information or inducing interactions intended to correct misinformation.

Misinformation is a huge problem in the world today, especially in our political systems. Democracy is difficult if political debate and citizen conversations are infused with bad information. Misinformation is also a huge problem for citizens themselves and for organizations. People who hear false health-related information can make themselves sick. Organizations whose employees make decisions based on bad information can hurt their bottom line.

In the workplace learning field, there’s a ton of misinformation that has incredibly damaging effects. People believe in the witchcraft of learning styles, neuroscience snake oil, traditional smile sheets, and all kinds of bogus information.

It would be nice if misinformation could be easily thwarted, but too often it lingers. For example, the idea that people remember 10% of what they read, 20% of what they hear, 30% of what they see, etc., has been around since 1913 if not before, but it still gets passed around every year on bastardized versions of Dale’s Cone.

A meta-analysis is a scientific study that compiles many other scientific studies using advanced statistical procedures to enable overall conclusions to be drawn. The study I reviewed (the one that was made available online last week) is:

Chan, M. S., Jones, C. R., Jamieson, K. H., & Albarracin, D. (2017). Debunking: A meta-analysis of the psychological efficacy of messages countering misinformation. Psychological Science, early online publication (print page numbers not yet determined).
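As an aside, for readers unfamiliar with the mechanics: at its simplest, a meta-analysis pools the effect sizes reported by individual studies, typically weighting each study by the precision of its estimate. The sketch below shows the textbook fixed-effect (inverse-variance) version of that idea with made-up numbers. It is only an illustration of the general technique, not the specific statistical procedure Chan and colleagues used.

```python
import math

# (effect_size, variance) pairs for three hypothetical studies -- made-up numbers
studies = [(0.45, 0.04), (0.30, 0.09), (0.60, 0.02)]

# Weight each study by the inverse of its variance (more precise studies count more)
weights = [1.0 / var for _, var in studies]

# Weighted average of the effect sizes = the pooled estimate
pooled = sum(w * es for (es, _), w in zip(studies, weights)) / sum(weights)

# Standard error of the pooled estimate under the fixed-effect model
se = math.sqrt(1.0 / sum(weights))

print(f"Pooled effect size: {pooled:.2f} (SE = {se:.2f})")  # ~0.52 (SE = 0.11)
```

In a real meta-analysis, the effect sizes, their variances, and the weighting model (fixed versus random effects) all have to be derived carefully from each study’s reported statistics; the point here is only the pooling logic that lets many studies support one overall conclusion.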

This study compiled scientific studies that:

  1. First presented people with misinformation (except a control group that got no misinformation).
  2. Then presented them with a debunking procedure.
  3. Then looked at what effect the debunking procedure had on people’s beliefs.

There are three types of effects examined in the study:

  1. Misinformation effect = Difference between the group that just got misinformation and a control group that didn’t get misinformation. This determined how much the misinformation hurt.
  2. Debunking effect = Difference between the group that just got misinformation and a group that got misinformation and later debunking. This determined how much debunking could lessen the effects of the misinformation.
  3. Misinformation-Persistence effect = Difference between the group that got misinformation-and-debunking and the control group that didn’t get misinformation. This determined how much debunking could fully reverse the effects of the misinformation.
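To make these three comparisons concrete, here is a minimal sketch using made-up belief scores; the numbers are invented for illustration and are not taken from the meta-analysis.

```python
# Hypothetical mean "belief in the falsehood" scores (say, on a 1-7 scale)
# for the three groups described above. Illustrative numbers only.
control = 2.0            # saw no misinformation
misinfo_only = 5.5       # saw misinformation, no debunking
misinfo_debunked = 3.5   # saw misinformation, then a debunking message

misinformation_effect = misinfo_only - control       # 3.5 -> how much the misinformation hurt
debunking_effect = misinfo_only - misinfo_debunked   # 2.0 -> how much debunking helped
persistence_effect = misinfo_debunked - control      # 1.5 -> misbelief remaining after debunking

print(misinformation_effect, debunking_effect, persistence_effect)
```

In these made-up numbers, debunking helps (the debunking effect is positive) but does not return people all the way to the no-misinformation baseline (the persistence effect is still positive), which is the pattern the three comparisons are designed to reveal.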

They looked at three sets of factors.

First, the study examined what happens when people encounter misinformation. They found that the more people thought of explanations for the false information, the more they would believe this misinformation later, even in the face of debunking. From a practical standpoint then, if people are receiving misinformation, we should hope they don’t think too deeply about it. Of course, this is largely out of our control as learning practitioners, because people come to us after they’ve gotten misinformation. On the other hand, it may provide hints for us as we use knowledge management or social media. The research findings suggest that we might need to intervene immediately when bad information is encountered to prevent people from elaborating on the misinformation.

Second, the meta-analysis examined whether debunking messages that included procedures to induce people to make counter-arguments to the misinformation would outperform debunking messages that did not include such procedures (or that included less potent counter-argument-inducing procedures). They found consistent benefits to these counter-argument inducing procedures. These procedures helped reduce misinformation. This suggests strongly that debunking should induce counter-arguments to the misinformation. And though specific mechanisms for doing this may be difficult to design, it is probably not enough to present the counter-arguments ourselves without getting our learners to fully process the counter-arguments themselves to some sufficient level of mathemagenic (learning-producing) processing.

Third, the meta-analysis looked at whether debunking messages that included explanatory information for why the misinformation was wrong would outperform debunking messages that included just contradictory claims (for example, statements to the effect that the misinformation was wrong). They found mixed results here. Providing debunking messages with explanatory information was more effective in debunking misinformation (getting people to move from being misinformed to being less misinformed), but these more explanatory messages were actually less effective in fully ridding people of the misinformation. This was a conflicting finding, so it’s not clear whether greater explanations make a difference, or how they might be designed to make a difference. One wild conjecture: perhaps where the explanations can induce relevant counter-arguments to the misinformation, they will be effective.

Overall, I came away disappointed that we haven’t been able to learn more about how to debunk. This is NOT these researchers’ fault. The data is the data. Rather, the research community as a whole has to double down on debunking and persuasion and figure out what works.

People certainly change their minds on heartfelt issues. Just think about the acceptance of gays and lesbians over the last twenty years. Dramatic changes! Many people are much more open and embracing. Well, how the hell did this happen? Some people died out, but many other people’s minds were changed.

My point is that misinformation cannot possibly be a permanent condition and it behooves the world to focus resources on fixing this problem — because it’s freakin’ huge!


Note that a review of this research in the New York Times painted this in a more optimistic light.


Some additional thoughts (added one day after original post).

To do a thorough job of analyzing any research paradigm, we should, of course, go beyond meta-analyses to the original studies being meta-analyzed. Most of us don’t have time for that, so we often take the short-cut of just reading the meta-analysis or just reading research reviews, etc. This is generally okay, but there is a caveat that we might be missing something important.

One thing that struck me in reading the meta-analysis is that the authors commented on the typical experimental paradigm used in the research. It appeared that the actual experiment might have lasted 30 minutes or less, maybe 60 minutes at most. This includes reading (learning) the misinformation, completing a ten-minute distractor task, receiving one of the treatment manipulations (that is, one of the types of debunking methods), and answering questions assessing their final state of belief. To ensure I wasn’t misinterpreting the authors’ message that the experiments were short, I looked at several of the studies compiled in the meta-analysis. The research I looked at used very short experimental sessions. Here is one of the treatments the experimental participants received (it includes both misinformation and a corrective, so it is one of the longer treatments):

Health Care Reform and Death Panels: Setting the Record Straight

Published: November 15, 2009

WASHINGTON, DC – With health care reform in full swing, politicians and citizen groups are taking a close look at the provisions in the Affordable Health Care for America Act (H.R. 3962) and the accompanying Medicare Physician Payment Reform Act (H.R. 3961).

Discussion has focused on whether Congress intends to establish “death panels” to determine whether or not seniors can get access to end-of-life medical care. Some have speculated that these panels will force the elderly and ailing into accepting minimal end-of-life care to reduce health care costs. Concerns have been raised that hospitals will be forced to withhold treatments simply because they are costly, even if they extend the life of the patient. Now talking heads and politicians are getting into the act.

Betsy McCaughey, the former Lieutenant Governor of New York State has warned that the bills contain provisions that would make it mandatory that “people in Medicare have a required counseling session that will tell them how to end their life sooner.”

Iowa Senator Chuck Grassley, the ranking Republican member of the Senate Finance Committee, chimed into the debate as well at a town-hall meeting, telling a questioner, “You have every right to fear…[You] should not have a government-run plan to decide when to pull the plug on Grandma.”

However, a close examination of the bill by non-partisan organizations reveals that the controversial proposals are not death panels at all. They are nothing more than a provision that allows Medicare to pay for voluntary counseling.

The American Medical Association and the National Hospice and Palliative Care Organization support the provision. For years, federal laws and policies have encouraged Americans to think ahead about end-of-life decisions.

The bills allow Medicare to pay doctors to provide information about living wills, pain medication, and hospice care. John Rother, executive vice president of AARP, the seniors’ lobby, repeatedly has declared the “death panel” rumors false.

The new provision is similar to a proposal in the last Congress to cover an end-of-life planning consultation. That bill was co-sponsored by three Republicans, including John Isakson, a Republican Senator from Georgia.

Speaking about the end of life provisions, Senator Isakson has said, “It’s voluntary. Every state in America has an end of life directive or durable power of attorney provision… someone said Sarah Palin’s web site had talked about the House bill having death panels on it where people would be euthanized. How someone could take an end of life directive or a living will as that is nuts.”

That’s it. That’s the experimental treatment.

Are we truly to believe that such short exposures are representative of real-world debunking? Surely not! In the real world, people who get misinformation often hold that misinformation over months or years while occasionally thinking about the misinformation again or encountering additional supportive misinformation or non-supportive information that may modify their initial beliefs in the misinformation. This all happens and then we try our debunking treatments.

Finally, it should be emphasized that the meta-analysis compiled only eight research articles, many using the same (or similar) experimental paradigm. This is further grounds for skepticism. We should be very skeptical of these findings, and my plea above for more study of debunking, especially in more ecologically valid situations, is reinforced!

Research Reflections — Take a Selfie Here; The Examined Life is Worth Living!


As professionals in the learning field, we rely on memory; it is central to our work. If we don’t help our learners preserve their memories (of what they learned), we have not really done our job. I’m oversimplifying here; sometimes we want to guide our learners toward external memory aids instead of memory. But mostly, we aim to support learning and memory.

Glacier View

You might have learned that people who take photographs remember less than those who do not. Several research studies showed this (see, for example, Henkel, 2014).

The internet buzzed with this information a few years ago:

  • The Telegraph
  • NPR
  • Slate
  • CNN
  • Fox News

Well, that was then. This is now.

Research Wisdom

There are CRITICAL LESSONS to be learned here — about using science intelligently… with wisdom.

Science is a self-correcting system that, with the arc of time, bends toward the truth. So, at any point in time, when we ask science for its conclusions, it tells us what it knows, while it apologizes for not knowing everything. Scientists can be wrong. Science can take wrong turns on the long road toward better understanding.

Does this mean we should reject scientific conclusions because they can’t guarantee omniscience, because they can’t guarantee truth? I’ve written about this in more depth elsewhere, but I’ll say it here briefly: recommendations from science are better than our own intuitions, especially in regard to learning, given all the ways we humans are blind to how learning works.

Memory With Photography

Earlier studies showed that people who photographed images were less able to remember them than people who simply examined the images. Researchers surmised that people who off-loaded their memories to an external memory aid — to the photographs — freed up memory for other things.

We can look back at this now and see that this was a time of innocence; that science had kept some confidences hidden. New research by Barasch, Diehl, Silverman, and Zauberman (2017) found that people “who could freely take photographs during an experience recognized more of what they saw” and that those “with a camera had better recognition of aspects of the scene that they photographed than of aspects they did not photograph.”

Of course, this is just one set of studies… we must be patient with science. More research will be done, and you and I will benefit by knowing more than we know now, and with more confidence… but this will take time.

What is the difference between the earlier studies and this latest set of studies? As argued by Barasch, Diehl, Silverman, and Zauberman (2017), the older studies did not give people the choice of which objects to photograph. In the words of the researchers, people did not have volitional control of their photographing experience. They didn’t go through the normal process we might go through in real-world situations, where we must decide what to photograph and determine how to photograph the objects we target (e.g., the angles, borders, focus).

In a series of four experiments, the new research showed that attention was at the center of the memory effect. Indeed, people taking photographs “recognized more of what they saw and less of what they heard, compared with those who could not take any photographs.”

Interestingly, some of the same researchers had found, just the year before, that taking photographs actually improved people’s enjoyment of their experiences (Diehl, Zauberman, & Barasch, 2016).

Practical Considerations for Learning Professionals

You might be asking yourself, “How should I handle the research-based recommendations I encounter?” Here is my advice:

  1. Be skeptical, but not too skeptical.
  2. Determine whether the research comes from a trusted source. Best sources are top-tier refereed scientific journals — especially where many studies find the same results. Worst sources are survey-based compilations of opinions. Beware of recommendations based on one scientific article. Beware of vendor-sponsored research. Beware of research that is not refereed; that is, not vetted by other researchers.
  3. Find yourself a trusted research translator. These people (and I count myself among them) have spent substantial time exploring the practical aspects of the research, so they are likely to have wisdom about what the research means and what its boundary conditions might be.
  4. Pay your research translators — so they can continue doing their work.
  5. Be good and prosper. Use the research in your learning programs and test it. Do good evaluation so you can get valid feedback to make your learning initiatives maximally effective.

Inscribed in My High School Yearbook in 1976

Time it was, and what a time it was, it was
A time of innocence, A time of confidences
Long ago, it must be, I have a photograph
Preserve your memories; They’re all that’s left you

Written by Paul Simon

The Photograph Above

Taken in Glacier National Park, Montana, USA; July 1, 2017
And incidentally, the glaciers are shrinking permanently.

Research Cited

Barasch, A., Diehl, K., Silverman, J., & Zauberman, G. (2017). Photographic memory: The effects of volitional photo taking on memory for visual and auditory aspects of an experience. Psychological Science, early online publication.

Diehl, K., Zauberman, G., & Barasch, A. (2016). How taking photos increases enjoyment of experiences. Journal of Personality and Social Psychology, 111, 119–140.

Henkel, L. A. (2014). Point-and-shoot memories: The influence of taking photos on memory for a museum tour. Psychological Science, 25, 396–402.

What’s Wrong With This Picture?

I must be in a bad mood — or maybe I’ve been unlucky in clicking on links — but this graphic is horrifying. Indeed, it’s so obviously flawed that I’m not even going to point out its most glaring problem. You decide!

One more editorial comment before the big reveal:  Why, why, why is the gloriously noble and important field of learning besieged by such crap!!!!

[The graphic in question appeared here.]
Why is the goal of a learning-focused game “fun”?

The Last Two Decades of Neuroscience Research (via fMRI) Called Into Question!


Updated July 11, 2016. An earlier version was more apocalyptic.


THIS IS HUGE. A large number of studies from the last 15 years of neuroscience research (via fMRI) could be INVALID!

A recent study in the journal PNAS looked at the three most commonly used software packages for analyzing fMRI data. Where the researchers expected to find the nominal familywise error rate of 5%, they found error rates of up to 70%.

Here’s what the authors wrote:

“Using mass empirical analyses with task-free fMRI data, we have found that the parametric statistical methods used for group fMRI analysis with the packages SPM, FSL, and AFNI can produce FWE-corrected cluster P values that are erroneous, being spuriously low and inflating statistical significance. This calls into question the validity of countless published fMRI studies based on parametric clusterwise inference. It is important to stress that we have focused on inferences corrected for multiple comparisons in each group analysis, yet some 40% of a sample of 241 recent fMRI papers did not report correcting for multiple comparisons (26), meaning that many group results in the fMRI literature suffer even worse false-positive rates than found here (37).”

In a follow-up blog post, the authors estimated that up to 3,500 scientific studies may be affected, which is down from their initial published estimate of 40,000. The discrepancy results because only studies at the edge of statistical reliability are likely to have results that might be affected. For an easy-to-read review of their walk-back, Wired has a nice piece.

The authors also point out that there is more to worry about than those 3,500 studies. An additional 13,000 studies don’t use any statistical correction at all (so they’re not affected by the software glitch reported in the scientific paper). However, these 13,000 studies use an approach that “has familywise error rates well in excess of 50%.” (cited from the blog post)
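To get a feel for why uncorrected testing produces such high familywise error rates, here is a minimal sketch. It assumes independent tests at a 5% per-test threshold, which is a simplification of the voxel- and cluster-level testing actually used in fMRI analysis, but it shows how quickly the chance of at least one false positive climbs.

    # How the familywise error rate (FWER) grows when many statistical tests
    # are run with no correction for multiple comparisons.
    # Simplifying assumption: independent tests at a 5% per-test threshold.

    alpha = 0.05  # per-test false-positive rate

    for num_tests in [1, 10, 20, 100, 1000]:
        # Probability of at least one false positive across all the tests
        fwer = 1 - (1 - alpha) ** num_tests
        print(f"{num_tests:>5} uncorrected tests -> FWER = {fwer:.1%}")

Even 20 uncorrected tests push the familywise error rate past 60%, so it is easy to see how analyses with no correction at all can end up with error rates “well in excess of 50%.”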

Here’s what the authors say in their walk-back:

“So, are we saying 3,500 papers are “wrong”? It depends. Our results suggest CDT P=0.01 results have inflated P-values, but each study must be examined… if the effects are really strong, it likely doesn’t matter if the P-values are biased, and the scientific inference will remain unchanged. But if the effects are really weak, then the results might indeed be consistent with noise. And, what about those 13,000 papers with no correction, especially common in the earlier literature? No, they shouldn’t be discarded out of hand either, but a particularly jaded eye is needed for those works, especially when comparing them to new references with improved methodological standards.”


Some Perspective

Let’s take a deep breath here. Science works slowly, and we need to see what other experts have to say in the coming months.

The authors reported that about 40,000 fMRI studies were published in the last 15 years. Of these, at most 3,500 + 13,000 = 16,500 are potentially affected. That’s roughly 41% of the published articles with the potential to have invalid results.

But, of course, in the learning field, we don’t care about all these studies as most of them have very little to do with learning or memory. Indeed, a search of the whole history of PsycINFO (a social-science database) finds a total of 22,347 articles mentioning fMRI at all. Searching for articles that have a learning or memory aspect culls this number down to 7,056. This is a very rough accounting, but it does put the overall findings in some perspective.
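As a quick back-of-envelope check, here is the arithmetic behind the proportions above, using only the rough counts already mentioned (the 40,000, 3,500, and 13,000 figures from the authors, plus my own rough PsycINFO counts):

    # Back-of-envelope arithmetic on the rough counts discussed above.
    total_fmri_studies = 40_000   # published fMRI studies, last 15 years (authors' estimate)
    edge_cases = 3_500            # studies near the edge of statistical reliability
    uncorrected = 13_000          # studies using no correction at all

    potentially_affected = edge_cases + uncorrected                    # 16,500
    print(f"{potentially_affected / total_fmri_studies:.0%}")          # ~41%

    # Rough PsycINFO counts mentioned above
    fmri_articles = 22_347             # articles mentioning fMRI at all
    learning_memory_articles = 7_056   # subset with a learning or memory aspect
    print(f"{learning_memory_articles / fmri_articles:.0%}")           # ~32%

Nothing here is new data; it simply re-runs the arithmetic already stated above.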

As the authors warn, it’s not appropriate to dismiss the validity of all the research articles, even if they’re in one of the suspect groups of studies. Instead, when looking at the potentially invalidated articles, each one has to be examined individually to know whether it has problems.

Despite these comforting caveats, the findings by the scientists have implications for many neuroscience research studies over the past 15 years (when the bulk of neuroscience research has been done).

On the other hand, there truly haven’t been many neuroscience findings that have much practical relevance to the learning field as of yet. See my review for a critique of overblown claims about neuroscience and learning. Indeed, as I’ve argued elsewhere, neuroscience’s potential to aid learning professionals probably rests in the future. So, being optimistic, maybe these statistical glitches will end up being a good thing. First, perhaps they’ll propel greater scrutiny to research methodologies, improving future neuroscience research. Second, perhaps they’ll put the brakes on the myth-creating industrial complex around neuroscience until we have better data to report and utilize.

Still, a dark cloud of low credibility may settle over the whole neuroscience field itself, hampering researchers from getting funding, and making future research results difficult for practitioners to embrace. Time will tell.


Popular Press Articles Citing the Original Article (Published Before the Walk-Backs).

Here are some articles from the scientific press pointing out the potential danger:

[Links to press articles appeared here.]
From Wikipedia (July 11, 2016): “In statistics, family-wise error rate (FWER) is the probability of making one or more false discoveries, or type I errors, among all the hypotheses when performing multiple hypotheses tests.”