Anki for mental performance measurement

Nick, in your presentation you said that you looked into using Anki to quantify mental power.

Did you write code and calculate some measurement that didn’t work out, or did you simply see that it’s a lot of work to get a decent measurement?

I actually wrote some code and did some machine learning to try to measure my SRS studies with Skritter, which is the Chinese character SRS that I built. This should have been a much better measure because everything was the same type of item, so reviews never varied much in total day-to-day difficulty. I also had access to more raw review information.

I did spend several days trying to make it work out, but it just showed no correlation with my self-reported energy levels around the time I was studying. So I can’t say whether it’s a bad measure or whether my own estimation of my alertness is a bad measure.

With Anki, it would be hard to use it as a regular measure of mental performance unless you were going to learn a lot of items of the same difficulty over a long period of time. However, if you wanted to use it to evaluate different learning strategies, it would be really easy. Just take some material, divide it evenly into two decks, study the two decks in parallel with a different strategy for each, and use Anki’s stats to see how long each one takes to learn.
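A rough sketch of that two-deck comparison in Python, in case it helps. Everything here is made up for illustration: the `Review` record, the deck labels, and the idea that you have exported per-review time and correctness somewhere. It is not Anki's API or export format.

```python
import random
from dataclasses import dataclass


@dataclass
class Review:
    deck: str        # hypothetical deck label, e.g. "strategy_a"
    seconds: float   # time spent answering this review
    correct: bool    # whether the answer was remembered


def split_into_decks(items, seed=0):
    """Shuffle the material and deal it alternately into two decks,
    so neither deck systematically gets the harder items."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled[0::2], shuffled[1::2]


def summarize(reviews):
    """Total study minutes and retention rate per deck."""
    totals = {}
    for r in reviews:
        t = totals.setdefault(r.deck, {"seconds": 0.0, "reviews": 0, "correct": 0})
        t["seconds"] += r.seconds
        t["reviews"] += 1
        t["correct"] += int(r.correct)
    return {
        deck: {
            "minutes": t["seconds"] / 60,
            "retention": t["correct"] / t["reviews"],
        }
        for deck, t in totals.items()
    }
```

The random split matters more than the arithmetic: if one deck happens to get the easier half of the material, the strategy studied with it will look better than it is.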

[quote]I actually wrote some code and did some machine learning to try to measure my SRS studies with Skritter, which is the Chinese character SRS that I built.[/quote]What data did you use to train the machine learning algorithm?

[quote]This should have been a much better measure because everything was the same type of item, so reviews never varied much in total day-to-day difficulty. [/quote]So you didn’t make an attempt at separating difficult Chinese characters from easy ones?

[quote]I also had access to more raw review information.[/quote]Anki also provides a bunch of raw information.
cardId/Time/lastInterval/nextInterval/ease/delay/lastFactor/nextFactor/reps/thinkingTime/yesCount/noCount seems pretty comprehensive.

The only advantage I see with Skritter is that you have memory data for many people.

[quote]I did spend several days trying to make it work out, but it just showed no correlation with my self-reported energy levels around the time of when I was studying. So I can’t say whether it’s a bad measure or whether my own estimation of my alertness is a bad measure.[/quote]There are two separate questions:

  1. Do you actually measure something, or is the measurement just noise?
  2. If you do measure something, does it have something to do with alertness?

As Skritter seems to give you a lot of data from different people, I would start by focusing on the first question.
Separate every session into two halves and compute the measure on each half.

If the two halves correlate decently, then the measure captures something rather than noise.
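Here is a minimal sketch of that split-half check in Python. It assumes each session is just a list of thinking times in seconds, which is my simplification. It also splits each session into odd and even reviews rather than first and second half, so that fatigue within a session lands in both halves equally. That is my choice, not a requirement.

```python
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def split_half_correlation(sessions):
    """sessions: list of sessions, each a list of thinking times (seconds).

    Computes the mean thinking time of the even-indexed and odd-indexed
    reviews of every session, then correlates the two across sessions.
    """
    half_a, half_b = [], []
    for times in sessions:
        if len(times) < 2:
            continue  # cannot split a single review into two halves
        half_a.append(mean(times[0::2]))
        half_b.append(mean(times[1::2]))
    return pearson(half_a, half_b)
```

A correlation near zero would suggest the per-session number is mostly noise. A clearly positive one says the sessions differ in some stable way, without yet saying what that is.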

Finding out what that something is will be the next adventure :wink:

I used Weka’s 10-fold cross-validation, so all the data was used both as training and test data. The data was a year of my daily energy levels, exercise, and sleep, set against my thinking times, reaction times, and retention rates.
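For anyone unfamiliar with how 10-fold cross-validation uses every row for both training and testing, here is a small Python sketch of the splitting step. This is a generic illustration, not Weka’s implementation, and the function name and `seed` parameter are my own.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Each of the n rows lands in exactly one test fold and in the
    training set of the other k - 1 folds.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

With roughly 365 daily rows, each fold would hold about 36 days for testing while the model is trained on the remaining days.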

I did attempt to separate difficult Chinese characters from easy ones, but to my surprise, it didn’t make any difference. The scheduling had already normalized them out. I was using my metric of standard usage frequency weighted toward learner-relevant sources, but it had no correlation with difficulty. Now if I had generated the difficulty index for each character based not on usage frequency but average Skritterer difficulty with that character, I probably would have gotten something.
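A difficulty index of that second kind could be as simple as each character’s miss rate pooled over all users. The Python sketch below assumes the review log boils down to `(character, correct)` pairs, which is a hypothetical format and not Skritter’s actual data model.

```python
from collections import defaultdict


def difficulty_index(reviews):
    """reviews: iterable of (character, correct) pairs pooled across users.

    Returns the fraction of reviews answered wrong for each character.
    """
    counts = defaultdict(lambda: [0, 0])  # character -> [wrong, total]
    for character, correct in reviews:
        counts[character][0] += 0 if correct else 1
        counts[character][1] += 1
    return {ch: wrong / total for ch, (wrong, total) in counts.items()}
```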

Ah, I didn’t know that about Anki’s raw review data. If you had a lot of cards for Anki that were all the same type over a long time period, then you could definitely do the same experiment.

I guess you’re right about determining measurement noise by comparing to other Skritter users, but it does take a long time to organize and to run these jobs even with just my data from last year. I think I will put off the analysis of that question until later. If someone else wants to do it, I can provide support.