I read a brilliantly clear article today by Karen Hao from the MIT Technology Review. It explains what machine learning is and provides a very clear diagram, which I really like.

Now, I am not a machine learning expert, but I have a hypothesis that has a ton of face validity when I look in the mirror. My hypothesis is this:

Machine learning will return meaningful results to the extent that the data it uses is representative of the domain of interest.

A simple thought experiment will demonstrate my point. If a learning machine were given data about professional baseball in the United States from 1890 to 2000, it would learn all kinds of things, including the benefits of pulling the ball as a batter. Pulling the ball occurs when a right-handed batter hits the ball to left field or a left-handed batter hits the ball to right field. Through most of baseball's history, many hitters benefited from trying to pull the ball because pulling produces a more natural swing, one that generates more power. Starting in the 2000s, with the advent of advanced analytics showing where each player is likely to hit the ball, a defensive maneuver called "the shift" has been used more and more, and consistently pulling the ball has become a disadvantage. In the shift, fielders migrate to the positions where the batter is most likely to hit the ball, thus negating the power benefits of pulling. Our learning machine would not know about the decreased benefits of pulling the ball because it would never have seen that data (the data from 2000 to now).
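To make the thought experiment concrete, here is a minimal sketch in code. Every number is invented for illustration; the point is only that a model trained exclusively on pre-2000 data will confidently recommend pulling the ball, because nothing in its training data hints that the world has changed.

```python
# Hypothetical illustration: every number below is invented.
# A "learning machine" trained only on 1890-2000 data estimates the
# value of pulling the ball from the data it has seen. It has no way
# to know that defenses changed after 2000.

# Invented averages: (era, pulled the ball?) -> batting average
training_data = {                # 1890-2000: all the model ever sees
    ("pre_2000", True): 0.285,   # pulling paid off historically
    ("pre_2000", False): 0.260,
}

held_out_reality = {             # 2000-present: never shown to the model
    ("post_2000", True): 0.240,  # the shift negates pulling
    ("post_2000", False): 0.265,
}

# The "model" simply recommends the strategy with the higher average
# in its training data.
best = max(training_data, key=training_data.get)
pulled = best[1]
print("Model's advice -- pull the ball?", pulled)  # True

# Applying that advice in the post-2000 world backfires:
advised = held_out_reality[("post_2000", pulled)]
alternative = held_out_reality[("post_2000", not pulled)]
print(f"Advised strategy now bats {advised:.3f} "
      f"vs. {alternative:.3f} for the alternative.")
```

The model isn't wrong about its data; it's wrong about the world, and nothing in the data can tell it so.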

Machine Learning about Learning

I raise this point because of a creeping danger in the world of learning and education. My concern applies to any domain where it is difficult to collect data on the most meaningful factors and outcomes but easy to collect data on less meaningful ones. In such cases, our learning machines will have access only to the data that is easy to collect, not to the data that is difficult or impossible to collect. People using machine learning on inadequate data sets will certainly find some interesting relationships in the data, but they will have no way of knowing what they're missing. The worst part is that they'll report some fanciful finding, we'll all jump up and down in excitement, and then we'll make bad decisions based on the bad learning caused by the incomplete data.

In the learning field, where trainers, instructional designers, elearning developers, and teachers reside, we have learned a great deal about research-based methods of improving learning results, but we don't know everything. And many of the factors that we know work are not tracked in most big data sets. Do we track the spacing effect, the number of concepts repeated with attention-grabbing variation, or the alignment between the contextual cues present in our learning materials and the cues that will be present in our learners' future performance situations? Ha! Our large data sets certainly miss many of these causal factors.

Our large data sets also fail to capture the most important outcome metrics. Indeed, as I have been recounting regularly for years now, typical learning measurements are often biased by measuring immediately at the end of learning (before memories fade), by measuring in the learning context (where contextual cues offer inauthentic hints or subconsciously trigger recall targets), and by measuring with tests of low-level knowledge (rather than more relevant skill-focused decision-making or task performances). We also rely overwhelmingly on learner feedback surveys, both in workplace learning and in higher education. Learner surveys, at least traditional ones, have been found to be virtually uncorrelated with learning results. To use these meaningless metrics as a primary dependent variable (or just a variable) in a machine-learning data set is complete malpractice.
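To see why this matters, here is a toy simulation (all numbers invented). Suppose smile-sheet ratings are driven mostly by a course's likability and only slightly by actual skill gains. Then any model fit to those ratings, however sophisticated, is a model of likability, not of learning.

```python
# Toy simulation with invented data: if survey scores barely track
# skill gains, a model optimizing them tells us nothing about learning.
import random

random.seed(1)

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Simulate 1,000 learners. True skill gain comes from design factors
# (spacing, realistic practice); survey ratings come mostly from the
# likability of the course, hardly at all from skill gain.
skill_gain = [random.gauss(0, 1) for _ in range(1000)]
likability = [random.gauss(0, 1) for _ in range(1000)]
survey = [0.05 * g + 0.95 * l for g, l in zip(skill_gain, likability)]

print(f"survey vs. skill gain: r = {pearson(survey, skill_gain):.2f}")
# r comes out near zero: a learning machine optimizing survey scores
# would be optimizing likability, not learning.
```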

So if our machine learning data sets have a poor handle on both the inputs and outputs to learning, how can we see machine learning interpretations of learning data as anything but a shiny new alchemy?

 

Measurement Illuminates Some Things But Leaves Others Hidden

In my learning-evaluation workshops, I often show this image.

The theme expressed in the picture is relevant to all types of evaluation, but it is especially relevant for machine learning.

When we review our smile-sheet data, we should not fool ourselves into thinking that we have learned the truth about the success of our learning. When we see a beautiful data-visualized dashboard, we should not deceive ourselves and our organizations into believing that what we see is all there is to see.

So it is with machine learning, especially in domains where the data is not all the data, where the data is flawed, and where the boundaries of the full population of domain data are not known.

 

With Apologies to Karen Hao

I don’t know Karen, but I do love her diagram. It’s clear and makes some very cogent points—as does her accompanying article.

Here is her diagram, which you can see in the original at this URL.

I think that, like measurement itself, the diagram illuminates some aspects of machine learning but fails to illuminate the danger of incomplete or unrepresentative data sets. So I made a modification to the flow chart.

And yes, that seven-letter provocation is a new machine-learning term that arises from the data as I see it.

Corrective Feedback Welcome

As I said at the start of this invective, my hypothesis about machine learning and data is just that: a semi-educated hypothesis that deserves review from people more knowledgeable than me about machine learning. So, what do you think, machine learning gurus?

 

Karen Hao Responds

I’m so delighted! One day after I posted this, Karen Hao responded:

Geese are everywhere these days, crapping all over everything. Where we might have nourishment, we get poop on our shoes.

Big data is everywhere these days…

Even flocking into the learning field.

For big-data practitioners to NOT crap up the learning field, they’ll need to find good sources of data (good luck with that!), use intelligence about learning to understand what the data means (blind arrogance will prevent this, at least at first), and then find where the data is actually useful in practice (will there be patience and practice, or just shiny new objects for sale?).

Beware of the wild goose chase! It’s already here.

Is anyone else getting completely annoyed watching someone’s hand draw and write on videos and elearning?

OMG! It’s beginning to drive me nuts! What the hell is wrong with us?

Here’s the thing. When this technique was new, it was engaging. Now it’s cliché! Most people have become habituated to it. What we’re doing is taking one of our tools and completely overusing it.

Let’s be smarter.

Somebody sent me a link to a YouTube video today — a video created to explain to laypeople what instructional design is. Most of it was reasonable, until it gave the following example, narrated as follows:

“… and testing is created to clear up confusion and make sure learners got it right.”


Something is obviously wrong here — something an instructional designer ought to know. What is it?

Before you scroll down, come up with your own answer…

Then scroll down for the answer…

.

.

.

.

.

.

.

.

.

.

.

.

Answer: 

The test question is devoid of real-world context. Instead of asking a text-based question, we could provide an image and ask learners to point to the access panel.

Better yet, we could have learners work on a simulated real-world task, one they could complete only by using the access panel.

Better yet, we could have them work on an actual real-world task… et cetera…

Better yet, we might first ask ourselves whether anybody really needs to “LEARN” where the access panel is, or whether they would just find it on their own without being trained or tested on it.

Better yet, we might first ask ourselves whether we really need a course in the first place. Maybe we’d be better off creating a performance-support tool that would take technicians through the troubleshooting steps, with zero or very little training required.

Better yet, we might first ask ourselves whether we could design our equipment so that technicians don’t need training or performance support.

.

.

.

Or we could ask ourselves existential questions about the meaning and potency of instructional design, about whether a career devoted to helping people learn work skills is a worthy life’s work…

Or we could just get back to work and crank out that test…

SMILE…

 

 

I'm a bad blogger. I don't analyze my site traffic. I don't drive my readers down a purchase funnel. I don't sell advertising on my blog site. I'm a bad, bad blogger.

Two months ago, I set up Google Analytics to capture my blog's traffic. Holy heck, Batman! I found out something amazing, and I'm not really sure how to think about it.

Over April and May 2014, my most popular blog post (that is, the one most visited) was a blog post I published in 2006. How popular was this 2006 blog post? It accounted for 50% of all my blog traffic! Fifty freakin' percent! And April and May have been relatively busy blog-posting months for me, so it wasn't as if I wasn't creating new traffic.

What blog post was the envy of all the others?

It was this one, on one of the biggest myths in the learning field.

I guess this makes sense: (1) it's important, (2) the myth keeps resurfacing, and (3) by now the link has been posted in hundreds of places.

If I die today, at least I will have made a bit of a difference with this one blog post.

I'm a bad, bad blogger.  <<<<<WINK>>>>>

My friend Jonathan Kaye, elearning-guy-extraordinaire, has posted the first parody of the Serious eLearning Manifesto, and I am proud to share it with you. Read his dLearning Manifesto principles (in this blog post) to get a LOL experience.


 

When searching for "learning research," the Work-Learning Research website is ranked as follows:

  • #4 on Google
  • #4 on Bing
  • #7 on Yahoo

Interestingly, we hardly ever get paid to do research. Mostly we get paid to use research wisdom to make practical recommendations, for example in the following areas:

  1. Learning Design
  2. E-Learning
  3. Training
  4. Onboarding
  5. Safety
  6. Learning Evaluation
  7. Organizational Change
  8. Leadership Development
  9. Improving the Learning Department's Results
  10. Making Fundamental Change in Your Organization's Learning Practices

Research, for me, is a labor of love and also a way to help clients cut through opinions and get to practical truths that will actually make a difference.

But still, we are happy that the world-according-to-search-engines (WOTSE) values the research perspective we offer.

And here's a secret. We don't pay any search-optimizer companies, nor do we do anything intentional to raise our search profile (who has time or money for that?). Must be our dance moves or something…

Dear President Obama,

You're a technophile, I have heard. So I have an improvement to suggest for the FDA, particularly in how it deals with food-safety issues.

Here's what the FDA does now.

In the age of web technology, the FDA's methodology is just plain laughable.

I propose a webpage with a database that would enable citizens to submit food-safety alerts.

This should be damn simple. The post office has a list of all addresses in the country. Why can't the FDA create a list of all foods sold in the U.S., plus a list of all food sellers (grocery stores, restaurants, etc.)?

Consumers who suspect they have some bad food could go online and, within a few clicks, select their product and where they bought it. They could describe the issue, etc.

In the background, the system would monitor products for unusual activity (a larger-than-normal number of alerts) and trigger an alerting response when something looks wrong.
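For what it's worth, the monitoring logic could start out very simple. Here is a minimal sketch; the product names, baselines, and threshold are all hypothetical, and a real system would use a proper statistical test (for example, a Poisson model of report counts) rather than a fixed multiplier.

```python
# Hypothetical sketch: flag any product whose recent report count
# far exceeds its historical baseline.

# Invented data: product -> (avg. weekly reports, reports this week)
reports = {
    "Brand A fat-free plain yogurt, 32 oz": (0.5, 19),
    "Brand B cheddar cheese, 8 oz": (1.2, 2),
    "Brand C orange juice, 64 oz": (0.8, 1),
}

THRESHOLD = 5.0  # alert when reports exceed 5x the weekly baseline

def flag_unusual(reports, threshold=THRESHOLD):
    """Return products whose current report volume looks abnormal."""
    flagged = []
    for product, (baseline, current) in reports.items():
        # Use a floor of 1 report/week so rarely reported products
        # don't trigger an alert on a single complaint.
        if current > threshold * max(baseline, 1.0):
            flagged.append((product, baseline, current))
    return flagged

for product, baseline, current in flag_unusual(reports):
    print(f"ALERT: {product}: {current} reports this week "
          f"(baseline ~{baseline}/week)")
```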

If the FDA doesn't have the wherewithal to design and create such a system, I would be glad to take this on with my strategic partner Centrax Corporation (they build high-premium e-learning and web programs and could whip this up, no problem).

Seriously, the FDA could save lives very simply and at a relatively low cost. Let's just do it.

Thank you, Mr. President, for considering this.

Please let me know what I'm supposed to do with the yogurt in my refrigerator that tastes bad. If you think I'm going to call one of those numbers, you just don't get it.

–A worried citizen/consumer

Update: Thursday, April 16th

Yesterday I decided I should make those calls. I called the yogurt manufacturer, went to their website, and called my regional FDA hotline person (who called me back today, a day later). Stonyfield Farm has posted the following recall information (their phone complaint line was horribly implemented, with long wait times, and no one has gotten back to me from their online complaint system):

Londonderry, NH – April 3, 2009 – Stonyfield Farm is conducting a voluntary recall of Fat Free Plain Quarts in Stonyfield Farm branded containers limited to specific dates. The products are being recalled because they may contain a presence of food grade sanitizer.

Affected products are limited to Stonyfield Farm 32 ounce Fat Free Plain yogurt UPC # 52159 00006 carrying one of the following product codes printed along the cup bottom that start with the following date codes:

· May 06 09 Time stamped 22:17 thru 23:59 (limited to these specific time stamps only)
· May 07 09 All time stamps

Approximately 44,000 quarts were distributed to retail accounts nationally.

We have received several reports of people noticing an off-taste when eating the product. We have received no reports of illness of any kind after consuming the product.

The issue was a result of human error in not following our Company's standard operating procedures. Stonyfield has taken all the necessary corrective action to prevent this from occurring again.

Consumers are advised not to consume the product and to return opened and unopened containers to the store where it was purchased. Anyone returning these products will be reimbursed for the full value of their purchase.

Customers with questions should contact Stonyfield Farm Consumer Relations at 1-800-Pro-Cows (776-2697) or visit our website at www.stonyfield.com.

This was listed on their website when I checked today. I didn't notice it yesterday (they have a very busy home page), but it probably was there.

Note to Stonyfield Farm: 

I am not satisfied with your announcement stating, "We have received several reports of people noticing an off-taste when eating the product. We have received no reports of illness of any kind after consuming the product."

THAT IS NOT GOOD ENOUGH!! You should (1) tell us exactly what we ingested and (2) get health experts to give us guidance on what symptoms or dangers we might be subject to.

More:

I just called the Stonyfield Farm consumer hotline again (and actually got through today), and the guy said it was a food-grade sanitizer: FDA approved, organic, etc. He told me ingesting it wouldn't hurt me, but I'm not convinced. I told him I wanted to know what it was I had ingested. He wouldn't, or couldn't, tell me. I asked him whether it would hurt me if I ate a whole container… He said no.

Hey Stonyfield. You can do better…

Video Overview:

The following video provides an entertaining and, I hope, enlightening look at the humble job aid.

Featuring:

  • This is only the second video that I shot and edited. See how I did.
  • Allison Rossett, co-author (with Lisa Schafer) of the book Job Aids and Performance Support, is interviewed.
  • Worldwide public introduction of an incredible new talent, the incomparable Alena.
  • Brewer the dog has a cameo role.

Video Notes:

Because of YouTube size restrictions, the video is divided into two parts.

Enjoy in HD (if your computer can handle it) by:

  1. Starting the video
  2. Clicking on HD at the lower right, AND
  3. Clicking on the full-screen display (the box in a box) at the lower right

If the audio doesn't track, your computer can't handle HD.



Part 1



Part 2

Purchasing (or learning more about) Allison's Book: