
I read a brilliantly clear article today by Karen Hao of the MIT Technology Review. It explains what machine learning is and includes a diagram I really like.

Now, I am not a machine learning expert, but I have a hypothesis that has a ton of face validity when I look in the mirror. My hypothesis is this:

Machine learning will return meaningful results to the extent that the data it uses is representative of the domain of interest.

A simple thought experiment will demonstrate my point. If a learning machine is given data about professional baseball in the United States from 1890 to 2000, it would learn all kinds of things, including the benefits of pulling the ball as a batter. Pulling the ball occurs when a right-handed batter hits the ball to left field or a left-handed batter hits the ball to right field. In the long history of baseball, many hitters benefited by trying to pull the ball because it produces a more natural swing and one that generates more power. Starting in the 2000s, with the advent of advanced analytics that show where each player is likely to hit the ball, a maneuver called “the shift” has been used more and more, and pulling the ball consistently has become a disadvantage. In the shift, players in the field migrate to positions where the batter is most likely to hit the ball, thus negating the power benefits of pulling the ball. Our learning machine would not know about the decreased benefits of pulling the ball because it would never have seen that data (the data from 2000 to now).
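As a minimal sketch of this thought experiment, here is a toy simulation (all numbers invented, nothing like real baseball data) in which the training data ends in 2000 and therefore never reveals that the benefit of pulling the ball later turned negative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, invented data: the run value of pulling the ball, by year.
# Before 2000 pulling helps (positive); after 2000, "the shift" makes
# it hurt (negative). Real data would look nothing like this tidy.
years = np.arange(1890, 2021)
pull_benefit = np.where(
    years < 2000,
    0.05 + rng.normal(0.0, 0.01, years.shape),   # pre-shift era
    -0.03 + rng.normal(0.0, 0.01, years.shape),  # shift era
)

# The "learning machine" sees only 1890-1999, as in the thought experiment.
train = years < 2000
mean_seen = pull_benefit[train].mean()
mean_unseen = pull_benefit[~train].mean()

print(f"Estimated benefit from training data: {mean_seen:+.3f}")
print(f"Actual benefit after 2000:            {mean_unseen:+.3f}")
# The estimate is confidently positive, while the truth in the unseen
# era is negative: the data never contained the regime change.
```

No amount of cleverness in the model fixes this; the information simply is not in the data it was given.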

Machine Learning about Learning

I raise this point because of a creeping danger in the world of learning and education. My concern is relevant to all domains where it is difficult to collect data on the most meaningful factors and outcomes, but easy to collect data on less meaningful ones. In such cases, our learning machines will only have access to the data that is easy to collect, not to the data that is difficult or impossible to collect. People using machine learning on inadequate data sets will certainly find some interesting relationships in the data, but they will have no way of knowing what they're missing. The worst part is that they'll report some fanciful finding, we'll all jump up and down in excitement, and then we'll make bad decisions based on the bad learning caused by the incomplete data.

In the learning field—where trainers, instructional designers, elearning developers, and teachers reside—we have learned a great deal about research-based methods of improving learning results, but we don't know everything. And many of the factors that we know work are not tracked in most big data sets. Do we track the spacing effect? The number of concepts repeated with attention-grabbing variation? The alignment between the context cues present in learning materials and the cues that will be present in our learners' future performance situations? Ha! Our large data sets certainly miss many of these causal factors.

Our large data sets also fail to capture the most important outcome metrics. Indeed, as I have been recounting regularly for years now, typical learning measurements are often biased by measuring immediately at the end of learning (before memories fade), by measuring in the learning context (where contextual cues offer inauthentic hints or subconscious triggering of recall targets), and by measuring with tests of low-level knowledge (rather than more relevant skill-focused decision-making or task performances). We also overwhelmingly rely on learner feedback surveys, both in workplace learning and in higher education. Learner surveys—at least traditional ones—have been found to be virtually uncorrelated with learning results. To use these meaningless metrics as a primary dependent variable (or just a variable) in a machine-learning data set is complete malpractice.
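To make concrete what "virtually uncorrelated" means in practice, here is a toy simulation (all numbers invented, not drawn from any study) in which smile-sheet ratings are driven mostly by instructor likability while later performance is driven mostly by actual skill:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500  # hypothetical number of courses evaluated

# Invented latent factors: ratings track likability, while later
# performance tracks actual skill; the two are largely independent.
likability = rng.normal(size=n)
skill = rng.normal(size=n)

smile_sheet = 0.90 * likability + 0.05 * skill + rng.normal(0, 0.3, n)
performance = 0.05 * likability + 0.90 * skill + rng.normal(0, 0.3, n)

r = np.corrcoef(smile_sheet, performance)[0, 1]
print(f"Correlation between ratings and performance: r = {r:.2f}")
# r lands near zero: a metric can be easy to collect, precise-looking,
# and still carry almost no signal about the outcome we care about.
```

A machine-learning model trained to maximize such a rating would, under these assumptions, be learning to optimize likability, not learning.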

So if our machine learning data sets have a poor handle on both the inputs and outputs to learning, how can we see machine learning interpretations of learning data as anything but a shiny new alchemy?

 

Measurement Illuminates Some Things But Leaves Others Hidden

In my learning-evaluation workshops, I often show this image.

The theme expressed in the picture is relevant to all types of evaluation, but it is especially relevant for machine learning.

When we review our smile-sheet data, we should not fool ourselves into thinking that we have learned the truth about the success of our learning. When we see a beautiful data-visualized dashboard, we should not deceive ourselves and our organizations that what we see is all there is to see.

So it is with machine learning, especially in domains where the data is not all the data, where the data is flawed, and where the boundaries of the full population of domain data are not known.

 

With Apologies to Karen Hao

I don’t know Karen, but I do love her diagram. It’s clear and makes some very cogent points—as does her accompanying article.

Here is her diagram, which you can see in the original at this URL.

Like measurement itself, I think the diagram illuminates some aspects of machine learning but fails to illuminate the danger of incomplete or unrepresentative data sets. So, I made a modification in the flow chart.

And yes, that seven-letter provocation is a new machine-learning term that arises from the data as I see it.

Corrective Feedback Welcome

As I said at the start of this invective, my hypothesis about machine learning and data is just that: a semi-educated hypothesis that deserves a review from people who know more about machine learning than I do. So, what do you think, machine-learning gurus?

 

Karen Hao Responds

I’m so delighted! One day after I posted this, Karen Hao responded:


Dear Readers,

Many of you are now following me and my social-media presence because you're interested in LEARNING MEASUREMENT, probably because of my recent book on Performance-Focused Smile Sheets (which you can learn about at the book's website, SmileSheets.com).

More and more, I’m meeting people who have jobs that focus on learning measurement. For some, that’s their primary focus. For most, it’s just a part of their job.

Today, I got an email from a guy looking for a job in learning measurement and analytics. He's a good guy, smart and passionate, so he ought to be able to find a good job where he can really help. So here's what I'm thinking. You, my readers, are some of the best and brightest in the industry: you care about our work, and you look to the scientific research as a source of guidance. Many of you are also enlightened employers, looking to recruit and hire the best and brightest. So it seems obvious that I should try to connect you…

So here’s what we’ll try. If you’ve got a job in learning measurement, let me know about it. I’ll post it here on my blog. This will be an experiment to see what happens. Maybe nothing… but it’s worth a try.

Now, I know many of you are also loyal readers because of things BESIDES learning measurement, for example, learning research briefs, research-based insights, elearning, subscription learning, learning audits, and great jokes… but let’s keep this experiment to LEARNING MEASUREMENT JOBS at first.

BOTTOM LINE: If you know of a learning-measurement job, let me know. Email me here…

In a recent research article, Tobias Wolbring and Patrick Riordan report the results of a study looking into the effects of instructor "beauty" on college course evaluations. What they found might surprise you — or worry you — depending on your views on the vagaries of fairness in life.

Before I reveal the results, let me say that this is one study (two experiments), and that the findings were very weak in the sense that the effects were small.

Their first study used a large data set involving university students. Given that the data was previously collected through routine evaluation procedures, the researchers could not be sure of the quality of the actual teaching, nor the true “beauty” of the instructors (they had to rely on online images).

The second study was a laboratory study where they could precisely vary the level of beauty of the instructor and their gender, while keeping the actual instructional materials consistent. Unfortunately, “the instruction” consisted of an 11-minute audio lecture taught by relatively young instructors (young adults), so it’s not clear whether their results would generalize to more realistic instructional situations.

In both studies they relied on beauty as represented by facial beauty. While previous research shows that facial beauty is the primary way we rate each other on attractiveness, body beauty has also been found to have effects.

Their most compelling results:

  1. They found that ratings of attractiveness are very consistent across raters. People seem to know who is attractive and who is not. This confirms the findings of many studies.
  2. Instructors who are more attractive get better smile-sheet ratings. Note that the effect was very small in both experiments. They confirmed what many other research studies have found, although their results were generally weaker than in previous studies — probably due to the better controls utilized.
  3. They found that better-looking instructors engender less absenteeism. That is, students were more likely to show up for class when their instructor was attractive.
  4. They found that the genders of the raters and instructors did not make a difference. It was hypothesized that female raters might respond differently to male and female instructors, and vice versa, but this was not found. Previous studies have shown mixed results.
  5. In the second experiment, where they actually gave learners a test of what they'd learned, attractive instructors engendered higher scores on a difficult test, but not on an easy test. The researchers hypothesize that learners engage more fully when their instructors are attractive.
  6. In the second experiment, they asked learners either to (a) take a test first and then evaluate the course, or (b) do the evaluation first and then take the test. Did it matter? Yes! The researchers hypothesized that highly attractive instructors would be penalized for giving a hard test more than their less attractive colleagues. This prediction was confirmed: when the difficult test came before the evaluation, better-looking instructors were rated more poorly than less attractive instructors. Not much difference was found for the easy test.

Ramifications for Learning Professionals

First, let me caveat these thoughts with the reminder that this is just one study! Second, the study’s effects were relatively weak. Third, their results — even if valid — might not be relevant to your learners, your instructors, your organization, your situation, et cetera!

  1. If you’re a trainer, instructor, teacher, professor — get beautiful! Obviously, you can’t change your bone structure or symmetry, but you can do some things to make yourself more attractive. I drink raw spinach smoothies and climb telephone poles with my bare hands to strengthen my shoulders and give me that upside-down-triangle attractiveness, while wearing the most expensive suits I can afford — $199 at Men’s Wearhouse; all with the purpose of pushing myself above the threshold of … I can’t even say the word. You’ll have to find what works for you.
  2. If you refuse to sell your soul or put in time at the gym, you can always become a behind-the-scenes instructional designer or a research translator. As Clint said, “A man’s got to know his limitations.”
  3. Okay, I’ll be serious. We shouldn’t discount attractiveness entirely. It may make a small difference. On the other hand, we have more important, more leverageable actions we can take. I like the research-based finding that we all get judged primarily on two dimensions: warmth/trust and competence. Be personable, be authentically trustworthy, and work hard to do good work.
  4. The finding from the second experiment that better looking instructors might prompt more engagement and more learning — that I find intriguing. It may suggest, more generally, that the likability/attractiveness of our instructors or elearning narrators may be important in keeping our learners engaged. The research isn’t a slam dunk, but it may be suggestive.
  5. In terms of learning measurement, the results may suggest that evaluations should come before difficult performance tests. I don’t know, though, how this relates to adults in workplace learning. They might be more thankful for instructional rigor if it helps them perform better in their jobs.
  6. More research is needed!

Research Reviewed

Wolbring, T., & Riordan, P. (2016). How beauty works: Theoretical mechanisms and two empirical applications on students’ evaluation of teaching. Social Science Research, 57, 253–272.

Social media is hot, but it is not clear how well we are measuring it.

A couple of years ago I wrote an article for the eLearning Guild about measuring social media. But it's not clear that we've got this nailed yet.

With this worry in mind, I've created a research survey to begin exploring how social media (of the kind we might use to bolster workplace learning and performance) can best be measured.

Here's the survey link. Please take the survey yourself. You don't have to be an expert to take it.

Here's my thinking so far on this. Please send wisdom if I've missed something.

  1. We can think about measuring social media the same way we measure any learning intervention.
  2. We can also create a list of all the proposed benefits for social media, and the proposed costs, and all the proposed harms, and we can see how people are measuring these now. The survey will help us with this second approach.

Note: Survey results will be made available for free. If you take the survey, you'll get early releases of the survey results and recommendations.

Also, this is not the kind of survey that needs good representative sampling, so feel free to share this far and wide.

Here is the direct link to the survey:   http://tinyurl.com/4tlslol

Here is the direct link to this blog post:   http://tinyurl.com/465ekpa

This Friday February 6th, Dr. Roy Pollock will join me for a Brown Bag Learning Webinosh (short webinar) to talk about How to Build Measurement into Our Training-Development Processes.

We'll talk about this by reviewing our newly released job aid. Click the link below to get the job aid:

Building Measurement Into Your Training-Development Plan

Roy is co-author of the groundbreaking book, Six Disciplines of Breakthrough Learning, and the just-released book, "Getting your Money's Worth from Training-and-Development," which is fantastic by the way (see my blog post on this tomorrow).

Click to learn more about the webinar (or sign up now)…

Today, Roy Pollock (CLO of the Fort Hill Company) and I release our job aid, "Building Measurement Into Your Training-Development Plan."

It's not rocket science, but it is our attempt to provide some guidance for how you might better utilize learning measurement.

Good learning measurement enables us to:

  1. Boost Learning Results
  2. Improve Our Learning Designs
  3. Prove Learning's Benefits

Unfortunately, in general we aren't very good at measuring learning. This is not only an embarrassment, but a big missed opportunity to improve our practices and our profession–and to grab a competitive advantage for our organizations.

Roy and I wanted to develop a job aid that would help (1) remind us to plan for measurement, (2) see where and how measurement should be integrated into our training-development plans, and (3) provide the reasoning behind the key steps.

There are two ways to use the job aid. You can use it "as is" to guide your training development. Or, you can utilize the wisdom from the job aid and add the key measurement steps to your own training-development process.

Roy and I will be teaching our learning-measurement workshop at the upcoming eLearning Guild conference in March. We'd be delighted if you would join us. Click to learn more…