People keep asking me for references to the claim that learner surveys are not correlated—or are virtually uncorrelated—with learning results. In this post, I include them, with commentary.



Major Meta-Analyses

Here are the major meta-analyses (studies that compile the results of many other scientific studies using statistical means to ensure fair and valid comparisons):

For Workplace Training

Alliger, G. M., Tannenbaum, S. I., Bennett, W., Jr., Traver, H., & Shotland, A. (1997). A meta-analysis of the relations among training criteria. Personnel Psychology, 50, 341-357.

Hughes, A. M., Gregory, M. E., Joseph, D. L., Sonesh, S. C., Marlow, S. L., Lacerenza, C. N., Benishek, L. E., King, H. B., & Salas, E. (2016). Saving lives: A meta-analysis of team training in healthcare. Journal of Applied Psychology, 101(9), 1266-1304.

Sitzmann, T., Brown, K. G., Casper, W. J., Ely, K., & Zimmerman, R. D. (2008). A review and meta-analysis of the nomological network of trainee reactions. Journal of Applied Psychology, 93, 280-295.

For University Teaching

Uttl, B., White, C. A., & Gonzalez, D. W. (2017). Meta-analysis of faculty’s teaching effectiveness: Student evaluation of teaching ratings and student learning are not related. Studies in Educational Evaluation, 54, 22-42.

What These Results Say

These four meta-analyses, covering more than 200 scientific studies, find that correlations between smile-sheet ratings and learning average about r = .10, which is virtually no correlation at all. Statisticians typically consider correlations below r = .30 to be weak, so r = .10 is very weak indeed.

What These Results Mean

These results suggest that typical learner surveys are not correlated with learning results.

From a practical standpoint:


If you get HIGH MARKS on your smile sheets:

You are almost equally likely to have

(1) An Effective Course

(2) An Ineffective Course


If you get LOW MARKS on your smile sheets:

You are almost equally likely to have

(1) A Poorly-Designed Course

(2) A Well-Designed Course
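To see why "almost equally likely" is a fair reading of r = .10, consider a minimal simulation sketch (the numbers are illustrative, not taken from the studies): simulate ratings and learning results that correlate at r = .10, then ask how often a course with above-median smile-sheet marks also shows above-median learning.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
r = 0.10  # roughly the correlation the meta-analyses report

# Simulate standardized smile-sheet ratings and learning results
# constructed so that they correlate at r = .10
ratings = rng.standard_normal(n)
learning = r * ratings + np.sqrt(1 - r**2) * rng.standard_normal(n)

# Among courses with above-median ratings, what fraction also show
# above-median learning? With r = .10 it is barely better than a coin flip.
high_marks = ratings > np.median(ratings)
p_effective = (learning[high_marks] > np.median(learning)).mean()
print(f"P(above-median learning | high marks) = {p_effective:.2f}")
```

The simulated probability lands near 53% — only a whisker above the 50% you would get from flipping a coin.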



It is very likely that the traditional smile sheets used in these scientific studies, while capturing data on learner satisfaction, were inadequately designed to capture data on learning effectiveness.

I have developed a new approach to learner surveys to capture data on learning effectiveness: the Performance-Focused Smile Sheet approach, as originally conveyed in my 2016 award-winning book. As yet, no scientific studies have been conducted to correlate these new smile sheets with measures of learning. However, many organizations are reporting substantial benefits. Researchers or learning professionals who want my updated list of recommended questions can access them here.


  1. Although I have written a book on learner surveys, in the new learning evaluation model, LTEM (Learning-Transfer Evaluation Model), I place these smile sheets at Tier 3, out of eight tiers, less valuable than measures of knowledge, decision-making, task performance, transfer, and transfer effects. Yes, learner surveys are worth doing, if done right, but they should not be the only tool we use when we evaluate learning.
  2. The earlier belief—and one notably advocated by Donald, Jim, and Wendy Kirkpatrick—that there was a causal chain from learner reactions to learning, behavior, and results has been shown to be false.
  3. There are three types of questions we can utilize on our smile sheets: (1) Questions that focus on learner satisfaction and the reputation of the learning, (2) Questions that support learning, and (3) Questions that capture information about learning effectiveness.
  4. It is my belief that we focus too much on learner satisfaction, which has been shown to be uncorrelated with learning results—and we also focus too little on questions that gauge learning effectiveness (the main impetus for the creation of Performance-Focused Smile Sheets).
  5. I do believe that learner satisfaction is important, but it is not most important.

Learning Opportunities regarding Learner Surveys

This post is for research geeks, and it's really just an introduction — maybe a gentle warning — as I don't have time or the statistical expertise to explore this deeply.


The Basics

When scientific experiments get done, researchers typically compare one experimental treatment to a second one (or to no treatment at all). For example, we might compare two versions of the same elearning program: one that uses spaced repetitions and a second that uses unspaced repetitions. When we make such comparisons, we need to know two things before we can draw conclusions:

  1. Statistical Significance:
    How likely is it that the experimental results were caused by random chance? Social scientists conventionally require p < .05; in other words, if there were truly no difference between the treatments, results at least as large as those observed would occur by chance less than 5% of the time.
  2. Effect Size:
    How different are the actual results? Are they large enough to be meaningful?

If we don't take effect sizes into account, we can have an experiment that is statistically significant but not practically significant. That is, we can have statistical significance, but not effect-size significance. Without looking at effect-size calculations, we can be fooled into thinking that an experimental result is meaningful when it actually shows no substantial advantage for one learning method compared with another.

So, for example, suppose that a new mobile-learning app improves learning by less than one-half of one percent but costs $10,000 per learner. The result might be statistically significant, yet the effect would be far too small to justify the investment.

Meta-analyses are statistical studies that compile many scientific studies, looking at the results as a whole. They have been a potent source of wisdom because they take complex results from a range of studies and combine them in a way that reveals the overall trends. Meta-analyses rely on effect sizes to calculate the overall importance of the factors being studied.
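In its simplest (fixed-effect) form, this pooling weights each study's effect size by the inverse of its sampling variance, so more precise studies count for more. A toy sketch with made-up numbers:

```python
# Hypothetical per-study effect sizes (Cohen's d) and sampling variances
studies = [
    {"d": 0.35, "var": 0.02},
    {"d": 0.10, "var": 0.01},
    {"d": 0.22, "var": 0.04},
    {"d": -0.05, "var": 0.03},
]

# Fixed-effect model: weight each study by the inverse of its variance,
# so more precise (usually larger) studies count for more
weights = [1.0 / s["var"] for s in studies]
pooled = sum(w * s["d"] for w, s in zip(weights, studies)) / sum(weights)
print(f"Pooled effect size: d = {pooled:.3f}")  # → Pooled effect size: d = 0.150
```

Real meta-analyses add refinements on top of this (random-effects models, moderator analyses, publication-bias corrections), which is exactly where the subtleties below come in.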


Some Subtleties

As with all things in science, scientists make improvements and refinements in their work over time, and effect sizes are no different. Recently, researchers have found that meta-analyses must be interpreted with care; otherwise the results may not be what they seem. Of specific concern is the finding that published studies tend to report higher effect sizes than unpublished studies, and that quasi-experimental designs tend to report higher effect sizes than randomized controlled studies.

Here are some recommendations for researchers from Cheung and Slavin (2016), who are focused on educational research, but whose recommendations are widely applicable:

  • In doing a meta-analysis, don't just look at published studies; work diligently to gather all relevant studies, published or unpublished.
  • Researchers, in general, should utilize randomized trials whenever possible. Those doing meta-analyses should look at these separately because they are likely to have the least-biased data.
  • Policy makers and educators (and I, Will Thalheimer, would add all workplace learning professionals) should "insist on large, randomized evaluations to validate promising programs."


Some Research Articles of Relevance

Cheung, A. C. K. & Slavin, R. E. (2016). How methodological features affect effect sizes in education. Educational Researcher, 45(5), 283-292.
Ueno, T., Fastrich, G. M., & Murayama, K. (2016). Meta-analysis to integrate effect sizes within an article: Possible misuse and Type I error inflation. Journal of Experimental Psychology: General, 145(5), 643-654. http://dx.doi.org/10.1037/xge0000159
van Assen, M. A. L. M., van Aert, R. C. M., & Wicherts, J. M. (2015). Meta-analysis using effect size distributions of only statistically significant studies. Psychological Methods, 20(3), 293-309. http://dx.doi.org/10.1037/met0000025