[Question] Ease/Difficulty Hell in fsrs4anki #189

Closed
Leocadion opened this issue Mar 21, 2023 · 17 comments · Fixed by #283
Labels
question Further information is requested

Comments

@Leocadion

  • [x] I have checked the FAQ and could not find an answer to my question
  • [x] I have read the wiki and still felt confused
  • [x] I have searched for similar existing questions here

Question
Hello! One of my decks seems to be stuck in ease/difficulty hell when using fsrs4anki.

When training the parameters, this is the distribution of difficulty that I'm getting.

prediction.tsv saved.
difficulty
4 0.000188
5 0.094309
6 0.000658
7 0.385508
8 0.000846
9 0.239227
10 0.279263
Name: count, dtype: float64

As you can see, more than 90% of cards in this deck are evaluated to be at difficulty 7 or higher.

I suspect this is due to the parameter w[5], which is normally used to avoid ease hell. The default value for w[5] is 0.2, but after training mine is 0.0012 (about 166x smaller). Here are my parameters:

var w = [0.8945, 0.9921, 5.3858, -0.9833, -0.9239, 0.0012, 1.0714, -0.1468, 0.4514, 2.0159, -0.1871, 0.6128, 1.1116];

This means that after getting a card wrong a few times, it goes to max difficulty, and it takes many correct answers for it to drop by even 0.1. This also leads to my intervals increasing extremely slowly. Here's an example:

rating history: 1,3,1,3,1,3,3,3,3,3,3,3,3,3,3,3,3,3
interval history: 0,2,3,4,6,6,7,9,11,13,15,18,21,25,30,35,41,48,56
difficulty history: 0,7.4,7.4,9.2,9.2,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,10.0,9.9,9.9,9.9,9.9
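
For reference, here's a minimal Python sketch of how I understand the v3 scheduler to update difficulty (function names are mine): the rating shifts difficulty by w[4]*(rating - 3), and mean reversion then pulls it back toward the initial difficulty w[2] with weight w[5]. With w[5] = 0.0012 that pull is almost nothing:

def constrain_difficulty(d):
    # The scheduler clamps difficulty to the range [1, 10].
    return min(max(d, 1.0), 10.0)

def next_difficulty(d, rating, w):
    # rating: 1 = Again ... 4 = Easy; w[4] is negative, so "Again" pushes difficulty up
    new_d = d + w[4] * (rating - 3)
    # mean reversion toward the initial difficulty w[2], weighted by w[5]
    new_d = w[5] * w[2] + (1 - w[5]) * new_d
    return constrain_difficulty(new_d)

w = [0.8945, 0.9921, 5.3858, -0.9833, -0.9239, 0.0012, 1.0714,
     -0.1468, 0.4514, 2.0159, -0.1871, 0.6128, 1.1116]

print(next_difficulty(7.4, 1, w))   # ~9.24: one "Again" jumps difficulty by ~1.8
print(next_difficulty(10.0, 3, w))  # ~9.99: a "Good" at max difficulty barely moves it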

So yeah, what do you think the issue might be? Is it that my deck is just extremely difficult, or is there something else at play here? Either way, I would have thought that after training, the difficulties of cards within a deck would be more evenly distributed.

Let me know your thoughts.

Also attaching the deck if anyone would like to recreate the issue (all cards used, GMT timezone).

test.zip

@Leocadion added the question label Mar 21, 2023
@kuroahna

I've also got similar results to yours and opened a similar thread. I do feel like it should be evenly distributed.

My parameters

[1.2958, 0.9669, 5.428, -1.248, -1.0036, 0.0257, 1.2816, -0.0125, 0.7222, 1.7748, -0.4153, 0.6615, 0.7742]

and distribution

1     0.000659
2     0.004772
3     0.004594
4     0.006419
5     0.115627
6     0.017076
7     0.030618
8     0.227834
9     0.111830
10    0.480571

48% are apparently stuck at a difficulty of 10.

And looking at my ease graph in Anki, most of my cards have 250% ease (roughly the equivalent of difficulty in FSRS), with an average of 246%:

[screenshot: Anki ease distribution graph]

I still haven't switched to FSRS and am still using Anki SM-2 because I feel like the distribution should be more even/uniform after training

@Leocadion
Author

Hi @kuroahna. Glad to see I'm not the only one with this issue.

I had a look at the thread that you linked, and yes, technically there is mean reversion built into this algorithm, but if it is this weak, it's effectively the same as it being non-existent.

For instance, in the example that I linked above, I would have to get the card right 10 times in a row (spanning 155 days) to bring the difficulty down from 10 to 9.9. Taking this into account, how long would it actually take for the card's difficulty to return to the mean?
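
To put rough numbers on it (assuming the update on a correct answer is d -> w[5]*w[2] + (1 - w[5])*d, with w[2] ≈ 5.39 and w[5] = 0.0012 taken from my parameters above, and ignoring display rounding):

import math

w2, w5 = 5.3858, 0.0012  # taken from my trained parameters

def good_reviews_needed(d_start, d_target):
    # Each "Good" answer maps d -> w5*w2 + (1 - w5)*d, so the gap (d - w2)
    # shrinks geometrically by a factor of (1 - w5) per review.
    return math.ceil(math.log((d_target - w2) / (d_start - w2)) / math.log(1 - w5))

print(good_reviews_needed(10.0, 9.0))  # 204 consecutive correct answers to reach 9.0
print(good_reviews_needed(10.0, 7.0))  # 875 to get back down to 7.0

So with w[5] this small, a card that has hit difficulty 10 essentially never comes back down.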

Currently I'm hitting 90% retention daily, but the number of daily reviews is barely going down at all, meaning I am unable to add any new cards without increasing my daily workload.

P.S. I had a look at some of the previous parameters produced by training, and last month my w[5] value was 0.00, which basically means there was no mean reversion of difficulty. What does it mean when training produces such a value?

@spartandrew18

I personally feel like my reviews have been piling up and not going down as well.

@user1823
Collaborator

I also seem to be affected by this issue.

  • Currently, my w[5] value is 0.0008, which is also much smaller than the default value.
  • Till last week, my w[5] value was 0.0, which means that there was no mean reversion of difficulty. I reiterate one of the questions asked above: What does it mean when training produces such a value?

Fortunately, my difficulty distribution is not as skewed as others here.

prediction.tsv saved.
difficulty
4     0.011870
5     0.391594
6     0.006120
8     0.319956
9     0.009600
10    0.260861
Name: count, dtype: float64

@L-M-Sherlock
Member

You can manually increase w[5] if you think it is too small.

@Leocadion
Author

You can manually increase w[5] if you think it is too small.

Yes, I've considered it, but, as you said in another thread, you should just trust the algorithm. I do kinda trust it, so before I change any of the parameters myself I would like to understand them a bit better. Do you have any ideas or scenarios where it would make sense for the w[5] value to be so low in the first place?

@user1823
Collaborator

user1823 commented Mar 23, 2023

You can manually increase w[5] if you think it is too small.

This is not a solution actually. How can we know how much to increase the w[5] value?

Also, the main question remains unanswered: Why does the training produce such values of w[5]? Is it actually better to use that value of w[5]? Or is the w[5] value produced by the optimizer erroneous?

Edit: @Leocadion beat me to the punch. Our comments convey the same thing, just in different words.

@L-M-Sherlock
Member

If the low w[5] induces a lower loss than the high w[5], the low w[5] is better. You can compare them in the optimizer:

[screenshot: comparing the losses of the two parameter sets in the optimizer]

@kieranlblack
Contributor

I have been experiencing a similar phenomenon and have noticed a discrepancy between the future due distribution produced when the helper reschedules cards and the one produced when cards are scheduled naturally by the custom scheduler code. When I don't use the helper, reviews tend to pile up into a moving ridge, but the helper (load balance is not enabled) always smooths things out.

After 4 or so days of scheduling with the custom scheduler code, the distribution tends to look like this:
[screenshot: future due distribution under the custom scheduler code]

However, when rescheduling with the helper, the distribution snaps to something more like this:
[screenshot: future due distribution after rescheduling with the helper]

These are drastically different distributions; note how many more cards are scheduled for review under the custom scheduler code.

@L-M-Sherlock
Member

To be honest, the original version of FSRS doesn't have the mean reversion of difficulty. It doesn't exist in my paper, either.

If we really need it in the memory model, we probably need more evidence to support it.

By the way, maybe I need to remove the upper limit of difficulty; then w[5] would perhaps increase and the loss could decrease.

@user1823
Collaborator

I am not completely sure about how the FSRS algorithm calculates the difficulty. But, I noted that the way SuperMemo calculates the difficulty was changed with SM-18. (https://supermemo.guru/wiki/Item_difficulty_in_Algorithm_SM-18)

Is the difficulty in FSRS calculated like SM-17 or like SM-18?

If FSRS calculates the difficulty like SM-17, can this issue be solved by calculating the difficulty like SM-18?

@L-M-Sherlock
Member

Is the difficulty in FSRS calculated like SM-17 or like SM-18?

The difficulty in FSRS is calculated like SM-18, because the scheduler code can only know the last memory states; it doesn't know the entire review history.

@user1823
Collaborator

What if the card is rescheduled using the helper?

@L-M-Sherlock
Member

What if the card is rescheduled using the helper?

It's the same.

@Leocadion
Author

Hello, just a small update, which might be slightly interesting/relevant.

Since I made the original post 2 months ago, I've kept using the parameters given by the algorithm. The progress in bringing down my daily review count (despite not adding any new cards) was negligible: it went down from approximately 160 to 150, and that was over more than 60 days of reviews while maintaining roughly an 87% success rate.

I usually adjust my parameters and retrain the algorithm at the end of the month, so yesterday I wanted to do that, but first I decided to have a look at the scheduling history for some individual cards. I found that sometimes, even if I answered a card correctly, the new interval would stay the same, or even get reduced (!). I think I had previously noticed this behaviour from time to time, but it usually coincided with me changing my algorithm parameters, so I didn't think much of it. This time, after a more thorough inspection, I could see that this was happening quite often and was not related to parameter changes.

I thought about what might be causing it, and I realised that it is probably the implementation of fuzz. Since the difficulty of most of my cards is extremely high, each time I answer a card correctly its interval only increases by 1 or 2 days. But since fuzz adds or subtracts a couple of days from each new interval, there is a reasonably high chance that the new interval ends up the same as or lower than the previous one.
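
A toy illustration of what I mean, assuming fuzz can move the new interval by up to a couple of days in either direction (the scheduler's actual fuzz rule may differ):

import random

random.seed(42)

old_interval = 30
new_interval = 31  # high difficulty: a correct answer only adds about one day

trials = 100_000
shrunk = sum(
    1 for _ in range(trials)
    if new_interval + random.randint(-2, 2) <= old_interval  # toy +/-2-day fuzz
)
print(f"{shrunk / trials:.0%} of fuzzed intervals end up no longer than the old one")
# With a 1-day increase and +/-2 days of fuzz, that's roughly 40% of reviews.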

I set fuzz to 'false' in my deck settings, rescheduled all my cards, and this seems to have alleviated the problem quite a lot: there are fewer reviews in the upcoming days, and it looks like the number of reviews I'll have to do over the next few days will actually start going down (slowly, but still). It's too early to assess how big a difference removing fuzz will make in the long run, but at least now I have hope that I might make my way out of this ease/difficulty/interval (?) hell.

@L-M-Sherlock
Member

I thought about what might be causing it, and I realised that it is probably the implementation of fuzz. Since the difficulty of most of my cards is extremely high, each time I answer a card correctly its interval only increases by 1 or 2 days. But since fuzz adds or subtracts a couple of days from each new interval, there is a reasonably high chance that the new interval ends up the same as or lower than the previous one.

It has been solved by #249.

For the ease/difficulty hell, I think we can clamp w[5] to ensure it is larger than 0.1 or some other value.
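
A minimal sketch of what that clamp could look like, applied to the parameter list posted above (the 0.1 floor is just the value mentioned here, not a final choice):

w = [0.8945, 0.9921, 5.3858, -0.9833, -0.9239, 0.0012, 1.0714,
     -0.1468, 0.4514, 2.0159, -0.1871, 0.6128, 1.1116]

W5_FLOOR = 0.1  # proposed lower bound for the mean-reversion weight
w[5] = max(w[5], W5_FLOOR)
print(w[5])  # 0.1, so every review now pulls difficulty back toward w[2] a little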

@L-M-Sherlock linked a pull request on Jun 6, 2023 that will close this issue
@L-M-Sherlock
Member

After #283, the difficulty will be distributed more uniformly.

[screenshot: difficulty distribution after #283]
