Which Algorithm is Better? SuperMemo's or Anki's?
This question comes up frequently in r/Anki and on help.supermemo.org, almost as often as “Is SuperMemo better than Anki?”
If you’re Pro-Anki, you’d probably link this page
If you’re Pro-SuperMemo, you’d link this or this
Quick Terminology Recap
SRS = Spaced Repetition Software
SM = SuperMemo from super-memo.com; NOT the courses, mobile app, or application from supermemo.com. They are different products.
Items in SuperMemo = Cards in Anki; IR = Incremental Reading
If you’re Pro-Anki, you’d believe SM-2 is as good as SM-17:
“These SM-17 data are all experimental and theoretical. Anyway, Anki has a better UI and add-ons, and you can’t beat that.”
If you’re Pro-SuperMemo, you would not trust the Anki manual’s comments on the algorithm because:
“Information about flaws in earlier algorithms is baseless. All documented flaws have been described at supermemo.com. SuperMemo official sites are more authoritative source of information than Wikipedia (or in this case Anki site).”
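For readers who have never looked under the hood: SM-2 is the algorithm Wozniak published in 1990 and the ancestor of Anki’s scheduler, and it is simple enough to sketch in a few lines of Python (function and variable names here are mine):

```python
def sm2_review(quality: int, repetitions: int, ef: float, interval: int):
    """One review under the published SM-2 algorithm (Wozniak, 1990).

    quality: self-grade from 0 to 5; ef: the item's E-Factor (ease).
    Returns the updated (repetitions, ef, interval-in-days).
    """
    if quality >= 3:                        # successful recall
        if repetitions == 0:
            interval = 1                    # first interval: 1 day
        elif repetitions == 1:
            interval = 6                    # second interval: 6 days
        else:
            interval = round(interval * ef) # then grow by the E-Factor
        repetitions += 1
    else:                                   # lapse: relearn from the start
        repetitions = 0
        interval = 1
    # E-Factor update from the 1990 spec; it never drops below 1.3
    ef = max(1.3, ef + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return repetitions, ef, interval
```

SM-17, by contrast, fits your repetition history to a full memory model and cannot be summarized this compactly; that asymmetry is what the whole debate is about.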
Is SM-17 better than SM-2?
I asked myself this question a lot when I was considering switching.
No experiments have been done on these algorithms (apart from SuperMemo’s own simulations). There is no randomized controlled experiment where you split people into two groups, have them memorize the same list of words with different algorithms, and test their recall 10 years later.
Ultimately, there are too many variables. Woz would, of course, say newer is better.
…… and I simply trust him. I think SM-17 is superior to every previous algorithm (including SM-2). My arguments that SM-17 is better than SM-2 are largely subjective.
First, he is THE MAN when it comes to spaced repetition algorithms.
He came up with the algorithms. No one in the world knows spaced repetition algorithms better than he does. My faith and trust in him are simply the result of his extensive knowledge of human memory and his algorithms. He has even considered the impact of circadian rhythm on memory and included SleepChart in SM. He’s THE ONE who has been tweaking spaced repetition algorithms for more than 30 years. When he says SM-17 is better, I say SM-17 is better.
There are pages written by Woz himself describing some parameters and behaviors of the newer algorithms. I didn’t try to understand them in depth; I certainly could, but if I kept digging, it would ultimately lead to his three-component model of long-term memory.
“X increases as Y decreases because of Z. Why? Because of the variable B in the three-component model of long-term memory.”
If you believe in the validity of that model and those theories, you’d be inclined to believe that SM-17 is better than SM-2, as is the case for me.
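I won’t pretend to reproduce SM-17 here, but the flavor of the model is easy to show. A minimal sketch, assuming the textbook exponential-forgetting formulation, with stability expressed as the interval at which recall drops to 90% (SuperMemo’s default forgetting index is 10%); the real SM-17 formulas are considerably more involved:

```python
import math

def retrievability(t_days: float, stability_days: float) -> float:
    """Probability of recall t_days after the last review.

    Assumes exponential forgetting with stability defined as the
    90%-recall interval; a sketch, not SM-17's actual formula.
    """
    return math.exp(math.log(0.9) * t_days / stability_days)

print(f"{retrievability(10, 10):.0%}")  # 90% right at t = stability
print(f"{retrievability(30, 10):.0%}")  # ~73% if you triple the delay
```

Roughly speaking, the third component, item difficulty, governs how much each successful review increases stability; SM-2 collapses all of this into a single E-Factor per item.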
Second, throughout the years I’ve read textbooks and research papers on memory and become familiar with different theories, for example the new theory of disuse and the generative theory of learning.
You can’t accurately judge whether a guy speaks fluent Japanese unless you’re a native or have reached native fluency. By the same token, you can’t judge how much Woz knows about human memory unless you know (at least a little bit) about human memory.
One peculiar characteristic of memory I learned about is desirable difficulty: the more difficult the retrieval, the greater the subsequent increase in retrieval strength and storage strength. It turns out Woz addressed this “desirable difficulty” concept in his wiki entry Pleasure of Learning.
Visit his site and you’ll see he has written extensive articles on how sleep, nutrition, and exercise, among other things, affect memory. I am, more or less,
“Wow, this guy certainly knows a lot about memory. See, he’s even considered this and that in his algorithms!”
You can regard this as blind faith, but I don’t see anyone more authoritative and knowledgeable on this subject than him. Maybe Woz is wrong; maybe the new theory of disuse and the generative theory of learning are also wrong. I don’t know. One thing I know for certain is that they are more trustworthy than my wild guesses about how memory works.
Comparing the Algorithms
There will never be a definitive answer on whether SM-17 is better than SM-2, because ultimately it relies on trusting the memory models, the algorithms, and the validity of the metrics. I believe in his theories and models, so I believe the simulation result that SM-17 is better than SM-2. Still, for a more objective argument, we can look at the simulation comparing SM-2 and SM-17:
“Feb 22, 2018: We have finally added a simulation of Algorithm SM-2 to SuperMemo for Windows and have come up with the average least squares metric of 53.5685% (for Algorithm SM-2).
For comparison, Algorithm SM-17 results in 37.1202% (a million repetitions dataset). This may not sound impressive, however, for shorter intervals, the load of repetitions might easily be 2-10x greater assuming no delays (i.e. executing repetitions as prescribed). Back in 1989, we could see that even Algorithm SM-5 would reduce repetition loads twice as fast as SM-2.
The least squares metric for Alg SM-2 equals ~54% as compared to ~37% for Algorithm SM-17. This does not sound like a lot, but it may easily double or triple the review workload (esp. for shorter intervals).”
If it means roughly 31% more efficient (see the arithmetic below), then THAT’S A LOT over a lifetime. The more items you have, the more time the newer algorithm will save you. Snowball effect.
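To check that figure against the quoted numbers:

```python
# Least-squares metrics quoted above (lower = a better fit to recall data)
sm2_metric = 53.5685
sm17_metric = 37.1202

# Relative reduction in the error metric
reduction = (sm2_metric - sm17_metric) / sm2_metric
print(f"{reduction:.1%}")  # -> 30.7%
```

Strictly speaking this is a ~31% reduction in a model-fit metric, not a measured reduction in review workload; the quote itself says the workload difference can reach 2-10x for shorter intervals.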
The way I understand it, SM-17 blows SM-2 away, but the difference between SM-17 and SM-15 is not that large.
PS: In case anyone is wondering about the algorithm discrepancy, I use SM-17 whenever SM asks me about it.
From Intricacies of Spaced Retrieval: a Resolution:
“Remembering is greatly aided if the first presentation is forgotten to some extent before the repetition occurs.”
“There are many instances where the rate and level of initial learning is very good relative to some other condition, yet these seemingly beneficial conditions ultimately produce poor long-term retention as assessed on delayed tests.”
“Stated another way, conditions that make initial learning slower and more difficult might produce worse initial learning performance but lead to gains in long-term retention. Some difficulty that makes initial learning slower and more effortful can make long-term retention better.”
The difficulty of the first retrieval in the typical expanding scheme is critical to later performance. - Making Things Hard on Yourself, But in a Good Way: Creating Desirable Difficulties to Enhance Learning
Performance ≠ learning
Like the graph I showed above, making the initial retrieval easier can lead to worse long-term retention. Changing the initial step from a 6-day interval to a 1-day interval will produce better initial recall (performance) during learning, but worse long-term retention.
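To make that concrete, here is a quick sketch reusing the exponential retrievability formulation from earlier, with a hypothetical initial stability of 4 days (both numbers are illustrative assumptions, not measurements):

```python
import math

def retrievability(t_days: float, stability_days: float) -> float:
    # Same sketch as before: exponential forgetting, stability = 90% interval
    return math.exp(math.log(0.9) * t_days / stability_days)

# How hard is the FIRST retrieval if the first step is 1 day vs 6 days?
print(f"after 1 day:  {retrievability(1, 4):.0%}")   # ~97%: easy retrieval
print(f"after 6 days: {retrievability(6, 4):.0%}")   # ~85%: harder retrieval
```

The 6-day first step forces a harder retrieval, and per desirable difficulty, the harder retrieval produces the bigger gain in storage strength.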
After a night of cramming, you’ll perform well enough to pass the exam, but that isn’t “learning”: you’ll forget the material as soon as the exam is over.