[Feature Request] Produce workload graph for an individual #596

Closed
Xemorr opened this issue Jan 23, 2024 · 31 comments · Fixed by #598
Labels
enhancement New feature or request

Comments

@Xemorr

Xemorr commented Jan 23, 2024

Which module is related to your feature request?
Scheduler, Optimizer, or Simulator?
Simulator I guess?

Is your feature request related to a problem? Please describe.
The optimal retention calculator outputs a single number; mine is 0.75 or 0.76, depending on how the parameters are varied. I'm preparing for exams, so I can't afford a retention that low, but I would still like to see a graph of how my workload rises as desired retention is varied.

Describe the solution you'd like
The ideal solution would be a button under compute optimal retention that pops up with a graph.

Describe alternatives you've considered
Maybe instead of purely minimising workload (i.e., finding the minimum point on that graph), it could take retention into account. For example, if 90% retention is only 5% more work than 75% retention, but you get questions wrong 2/5ths as often, then that is much better. I think it's possible to write an equation that optimises while also considering that.
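The trade-off in that example can be checked with a few lines of Python. This is only a sketch of the arithmetic, not an optimizer feature: the function name and the workload values (100 and 105, in arbitrary units) are assumptions chosen to match the "5% more work" figure.

```python
def compare_retention_targets(retention_a, workload_a, retention_b, workload_b):
    """Compare target B against baseline A.

    Returns (relative extra work of B over A, failure rate of B relative to A).
    """
    extra_work = workload_b / workload_a - 1.0
    failure_ratio = (1.0 - retention_b) / (1.0 - retention_a)
    return extra_work, failure_ratio


# Hypothetical workloads: 100 units at 75% retention, 105 units at 90%.
extra_work, failure_ratio = compare_retention_targets(0.75, 100.0, 0.90, 105.0)
print(f"{extra_work:.0%} more work, failures {failure_ratio:.2f}x as frequent")
```

At 90% retention the failure rate is 10%, against 25% at 75% retention, which is where the 2/5ths figure comes from.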

Additional context
[Image] The graph in the tutorial I'm referencing

@Xemorr Xemorr added the enhancement New feature or request label Jan 23, 2024
@L-M-Sherlock
Member

@Expertium, do you have the code to draw the workload graph?

@Expertium
Collaborator

Expertium commented Jan 24, 2024

I do, but beware that for large decks and large values of "Days to simulate" it will take a lot of time to generate the graph, even if we limit the range to 0.7-0.99 instead of 0.5-0.99.
Plotting a graph from Sherlock's data.zip

EDIT: the upper limit for the y axis was hard-coded, I made it adaptable. I also changed the lower limit of the x axis from 0.5 to 0.7 and added text to make it more clear what the horizontal dashed lines mean.
EDIT 2: I want to make the colors adaptable too, for example, the green area should cover everything from min. workload to 2x min. workload. But that's going to be a bit complicated, give me some time.
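The "min. workload to 2x min. workload" rule for the green area could be sketched as below. This is not the plotting code from this thread; the function name and the sample workloads are made up for illustration, and only the 2x factor comes from the comment above.

```python
def green_zone(retentions, workloads, factor=2.0):
    """Return the retention values whose workload is at most factor x the minimum workload."""
    if not workloads or len(retentions) != len(workloads):
        raise ValueError("retentions and workloads must be non-empty and of equal length")
    limit = factor * min(workloads)
    return [r for r, w in zip(retentions, workloads) if w <= limit]


# Made-up workloads (minutes per day) for five desired retention values.
retentions = [0.70, 0.80, 0.90, 0.95, 0.99]
workloads = [12.0, 10.0, 14.0, 21.0, 45.0]
print(green_zone(retentions, workloads))
```

With these sample numbers the minimum workload is 10, so every retention with a workload of 20 or less counts as green.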

@Expertium
Collaborator

Expertium commented Jan 24, 2024

Plotting a graph from Sherlock's data.zip

Here's the code. However, with the way I defined the green area, it may be possible for the output of "Compute optimal retention" to fall within the yellow area.

Figure_1

EDIT: there are still some edge cases. Give me some more time.
EDIT 2: OK, I covered all the edge cases I could think of, though the code is a bit complicated as a result. I've also defined the green area differently to decrease the chances that the optimal value will fall into the yellow area.
Figure_1

@Expertium
Collaborator

For example, if 90% retention is only 5% more work than 75% retention, but you get questions wrong 2/5ths as often, then that is much better.

By the way, "Compute optimal retention" doesn't minimize the workload, rather, it maximizes the sum of all probabilities of recall of all cards. In simple terms, it maximizes knowledge acquisition within given time constraints.
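As a rough illustration of that objective, the sketch below picks the desired retention whose simulated cards have the highest summed probability of recall. It is not the optimizer's code: `best_retention` and `toy_simulate` are hypothetical names, and the toy cost model (time per card growing as retention approaches 1) is an assumption made only so the example runs.

```python
def best_retention(candidates, simulate):
    """Return the candidate retention with the largest sum of recall probabilities."""
    return max(candidates, key=lambda r: sum(simulate(r)))


def toy_simulate(retention, budget=1000.0):
    """Toy stand-in for a real simulation (assumption, not FSRS).

    Each card costs more time at higher retention, so fewer cards fit the
    fixed time budget; every learned card is recalled with probability
    equal to the desired retention.
    """
    cost_per_card = 1.0 + 0.1 / (1.0 - retention)
    cards_learned = round(budget / cost_per_card)
    return [retention] * cards_learned


print(best_retention([0.7, 0.8, 0.9], toy_simulate))
```

In this toy model a higher retention makes each card more likely to be recalled but lets fewer cards fit in the budget, so the maximum falls at an intermediate value.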

@Xemorr
Author

Xemorr commented Jan 24, 2024

For example, if 90% retention is only 5% more work than 75% retention, but you get questions wrong 2/5ths as often, then that is much better.

By the way, "Compute optimal retention" doesn't minimize the workload, rather, it maximizes the sum of all probabilities of recall of all cards. In simple terms, it maximizes knowledge acquisition within given time constraints.

Ah, thank you. I took that understanding from what someone said on Reddit; your explanation sounds more sensible. I've realised that the values I was getting from "Compute optimal retention" were due to the large deck size I entered (which explains the difference between what I'm managing to do and what the algorithm spat out). I entered the deck size I expect to have by the time of my exams, since the feature can't account for a varying deck size. Maybe what I'm really interested in is an option for "Compute optimal retention" to linearly interpolate between the current deck size and the end deck size over the number of days of study.

With an hour's worth of Anki per day over 180 days, my retention declines rapidly between a deck size of 1000 and a deck size of 2000, from around 90% to the high 70s (but I think that might be due in part to not accounting for the cards that have already matured, etc.).
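The linear interpolation idea could look something like the sketch below. It is not an existing option in Anki or the optimizer; the function name and the clamping of `day` to the study period are assumptions, and `learn_span` is borrowed from the optimizer's parameter name for the number of days.

```python
def interpolated_deck_size(current_size, end_size, day, learn_span):
    """Linearly interpolate the deck size on a given day of the study period."""
    if learn_span <= 0:
        raise ValueError("learn_span must be positive")
    # Clamp so that days outside the study period map to its endpoints.
    day = min(max(day, 0), learn_span)
    return round(current_size + (end_size - current_size) * day / learn_span)


# Deck growing from 1000 to 2000 cards over 180 days.
for day in (0, 90, 180):
    print(day, interpolated_deck_size(1000, 2000, day, 180))
```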

@L-M-Sherlock
Member

Added in the optimizer:

image

@aleksejrs

aleksejrs commented Jan 26, 2024

Thanks!

My two arithmetic decks (where I fail a card if I didn't answer it quickly enough) got graphs where 98-99% is best, so the whole graph is red.

Compared to the values stored in Anki, Anki's Evaluate shows that optimizer's RMSE is usually a little higher, and Log Loss a little lower. (corrected: usually, not always)

@Expertium
Collaborator

Can you show that graph here?

@aleksejrs

imp 0 veryimp mathbas 12%.apkg
[3.1008, 3.1008, 8.3177, 25.1118, 5.3022, 1.7371, 1.1355, 0.0697, 1.6848, 0.1068, 1.1423, 1.5057, 0.1995, 0.4903, 0.5435, 0.0127, 3.9198]
Loss before training: 0.2054
Loss after training: 0.2014
imp 0 veryimp mathbas 12% apkg

imp post math-bas 14%.apkg
[1.1157, 2.5695, 8.9803, 20.0419, 4.9463, 0.7911, 0.7905, 0.0424, 1.6001, 0.1579, 1.0157, 2.0435, 0.1945, 0.3917, 1.3053, 0.1832, 3.161]

imp post math-bas 14% apkg

@Expertium
Collaborator

Expertium commented Jan 26, 2024

Huh. I guess the default values of "Days to simulate" and other stuff just aren't very good for you.
@L-M-Sherlock what are the defaults? 365 days, 10 000 cards, 30 minutes of study per day? Maybe you should add an option to configure all of that, just like in "Compute optimal retention" in Anki itself

@aleksejrs

aleksejrs commented Jan 26, 2024

@L-M-Sherlock what are the defaults? 365 days, 10 000 cards, 30 minutes of study per day?

So it still depends on the choice of time of study?

Past year:

imp 0 veryimp mathbas 12%.apkg:
Days studied: 77% (281 of 364)
Total: 2508 reviews
Average for days studied: 8.9 reviews/day
If you studied every day: 6.9 reviews/day
Total: 2 hours
Average for days studied: 0.5 minutes/day
If you studied every day: 0.4 minutes/day
Average answer time: 3.2s (18.74 cards/minute)

imp post math-bas 14%.apkg:
Days studied: 65% (237 of 364)
Total: 1099 reviews
Average for days studied: 4.6 reviews/day
If you studied every day: 3.0 reviews/day
Total: 1 hours
Average for days studied: 0.3 minutes/day
If you studied every day: 0.2 minutes/day
Average answer time: 4.03s (14.88 cards/minute)

@Xemorr
Author

Xemorr commented Jan 26, 2024

Huh. I guess the default values of "Days to simulate" and other stuff just aren't very good for you. @L-M-Sherlock what are the defaults? 365 days, 10 000 cards, 30 minutes of study per day? Maybe you should add an option to configure all of that, just like in "Compute optimal retention" in Anki itself

Would it be possible to set the default number of cards, to the number of cards the user currently has in their deck? I imagine that might be a more sensible default than 10k.

@L-M-Sherlock
Member

Maybe you should add an option to configure all of that, just like in "Compute optimal retention" in Anki itself

Actually, the current implementation is configurable: https://github.com/open-spaced-repetition/fsrs-optimizer/blob/bddc4fe67184fdcfaf79284b0e5430a11fcb3c31/src/fsrs_optimizer/fsrs_optimizer.py#L1220-L1227

@Expertium
Collaborator

But I don't see those options in fsrs4anki_optimizer.ipynb.

@L-M-Sherlock
Member

You can pass the arguments into the function in fsrs4anki_optimizer.ipynb.

@Expertium
Collaborator

That's not obvious. Maybe you should add text explaining that, and an example? You can't expect most people to know that.

@L-M-Sherlock
Member

I will do it tomorrow.

@aleksejrs

aleksejrs commented Jan 26, 2024

So this graph is much better than trying different values to get individual numbers and then thinking about them, but it suffers from all the other problems of Evaluate.

I opened the stats for "imp 0 veryimp mathbas 12%.apkg", "deck:current introduced:100", and it shows that there have been 3 reviews per day (2 with -rated:1), which is about 8.7 seconds.
There are 569 new cards.

figs = optimizer.find_optimal_retention(deck_size=569, learn_span=365, max_cost_perday=9)
Untitled

figs = optimizer.find_optimal_retention(deck_size=569, learn_span=365, max_cost_perday=6)
Just over 140 cards memorized in the year.
Untitled

I introduce only 1 card per day, so it should be 569 days. And I will probably have backlogs again, so I try changing 6s to 5s.
figs = optimizer.find_optimal_retention(deck_size=569, learn_span=569, max_cost_perday=5)

But apparently 6s was already not enough, so I can learn a new card less often than every other day if I am to fit in 5s?
Untitled
Untitled
Untitled
Untitled
Untitled

It seems that the Average answer time is 3.2-3.5s, not 2.92 that I used for some reason.

@Expertium
Collaborator

And I will probably have backlogs again, so I try changing 6s to 5s.

max_cost_perday is in minutes, and it's not per card, it's per all cards of that day.

@aleksejrs

1800 seconds = 30 minutes, 1800 minutes won't fit in a day.

@Expertium
Collaborator

Ok, my bad. I looked at the code again, and it appears to be in seconds. But I'm still pretty sure that it's per all cards, not just one card.
This would've been much easier if only it was properly documented in the .ipynb itself.

@aleksejrs

I was talking about the time per review that I multiplied by the number of reviews (since Anki rounds per-day stats down to 0).
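To make the units concrete: per the exchange above, `max_cost_perday` appears to be a daily total in seconds, covering all cards of the day. The helpers below are hypothetical and only do the conversions discussed in this thread. The 3 reviews per day and 2.92 s per review are the figures mentioned earlier; treating "about 8.7 seconds" as their product is an assumption.

```python
SECONDS_PER_MINUTE = 60


def minutes_to_seconds(minutes):
    """Convert a daily study budget in minutes to seconds."""
    return minutes * SECONDS_PER_MINUTE


def daily_review_cost(reviews_per_day, seconds_per_review):
    """Total seconds per day spent on reviews (all cards of that day combined)."""
    return reviews_per_day * seconds_per_review


print(minutes_to_seconds(30))                # 30 minutes of study per day, in seconds
print(round(daily_review_cost(3, 2.92), 2))  # 3 reviews/day at 2.92 s each
```

The first value is the 1800 seconds mentioned above; the second is the kind of per-day total that would be passed as `max_cost_perday`.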

@Expertium
Collaborator

Ah, so you're plotting a graph for a "What if I only learn 1 card per day?" scenario? I don't think this feature is intended to be used that way, but ok.

@aleksejrs

aleksejrs commented Jan 26, 2024

Ah, so you're plotting a graph for a "What if I only learn 1 card per day?" scenario? I don't think this feature is intended to be used that way, but ok.

Yes, I introduce 1 new card per day in that deck. It's not 1 review per day.

As I understand it,

  • Optimizer limits overall repetitions. It introduces as many new cards as it can fit in max_cost_perday. It doesn't support new_cards_limits, but it can't review overtime.
  • Simulator limits overall repetitions + obeys new_cards_limits. It can't review overtime.
  • Anki can limit overall repetitions and obey new_cards_limits. You can review overtime. But the only built-in way to get a dynamic new card limit like with Optimizer or Cardistry is to limit reviews and review overtime by choice.
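The difference between the first two bullets can be sketched as a one-day planner. This follows the description in the list above and is not taken from the optimizer or simulator source. The function name, the reviews-before-new-cards ordering, and the `review_cost` and `learn_cost` parameters are assumptions; `max_cost_perday` is taken to be in seconds.

```python
def plan_day(due_reviews, review_cost, learn_cost, max_cost_perday, new_cards_limit=None):
    """Return (reviews_done, new_cards_introduced) for one simulated day.

    Reviews are done first (assumption); new cards then fill whatever time
    is left. With new_cards_limit=None the day behaves like the Optimizer
    bullet; with a limit it behaves like the Simulator bullet. Neither can
    exceed max_cost_perday, i.e. there is no reviewing overtime.
    """
    reviews_done = min(due_reviews, int(max_cost_perday // review_cost))
    remaining = max_cost_perday - reviews_done * review_cost
    new_cards = int(remaining // learn_cost)
    if new_cards_limit is not None:
        new_cards = min(new_cards, new_cards_limit)
    return reviews_done, new_cards


print(plan_day(3, 3.0, 10.0, 60.0))                     # no new card limit
print(plan_day(3, 3.0, 10.0, 60.0, new_cards_limit=1))  # 1 new card per day
```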

@Expertium
Collaborator

  • Optimizer limits overall repetitions. It introduces as many new cards as it can fit in max_cost_perday. It doesn't support new_cards_limits, but it can't review overtime.
  • Simulator limits overall repetitions + obeys new_cards_limits. It can't review overtime.

I don't think the simulator obeys the new card limits, actually. But I'm not sure.

@Expertium
Collaborator

I'm looking at the simulator too
image

@Expertium
Collaborator

Ah, that one. Yeah, I was looking at the one that is used for calculating optimal retention

@aleksejrs

aleksejrs commented Jan 26, 2024

Simulator needs a switch to disable Anki simulation.
Optimizer needs to support a new card limit.

Still not sure what to do about the existing review cards, since they are causing the most workload in the decks where I am not learning new cards.

For the above deck of 569 cards, with 1 new card/day in 600 days and pretty much no review limits (100 reps, 60 seconds), I got
Columns: DR, time, remembered cards, time per remembered card (rem. = remembered cards ÷ 569).
Remembered cards is int(card["retrievability"].sum()); the rounded value is then used to calculate time per remembered card.

0.01  46.5 201 0.23 (anki: 569 180.7 560 0.32)
0.02  46.5 202 0.23 (rem. 0.355)
0.05  47.2 203 0.23 (rem. 0.357)
0.08  48.2 206 0.23 (rem. 0.362)
0.09  51.1 211 0.24 (rem. 0.371)
0.10  53.0 203 0.26 (rem. 0.357, anki: 569 181.4 559 0.32)
0.20  90.6 269 0.34 (rem. 0.473)
0.30 129.4 334 0.39 (rem. 0.587)
0.40 140.6 411 0.34 (anki: 569 180.3 559 0.32)
0.50 134.7 456 0.30 (anki: 569 178.8 560 0.32)
0.60 122.5 481 0.25 (anki: 569 176.6 560 0.32)
0.70 119.9 511 0.23, 2 leeches
0.75 113.6 525 0.216 (rem. 0.923)
0.83 115.2 535 0.225

0.84 112.5 536 0.201
0.84 113.9 535 0.213
0.84 115.6 537 0.215

0.85 114.0 539 0.212 (rem. 0.947)
0.85 114.6 539 0.2127 (no internal rounding)
0.85 114.7 539 0.213
0.85 115.7 540 0.214 (no internal rounding)
0.85 115.2 539 0.214

0.85 116.7 536 0.218 — with the FSRS values I had before today

0.86 115.5 539 0.214
0.86 115.6 539 0.214
0.86 117.4 540 0.216, 1 leech

0.87 116.9 541 0.216
0.87 118.2 540 0.219
0.87 119.2 540 0.221, 1 leech

0.88 119.2 542 0.22
0.89 118.1 543 0.218
0.92 127.6 552 0.231 (rem. 0.970)
0.99 306.1 566 0.54 (rem. 0.995)
1.00 604.4  24 25.18 (unlike the other lines, only 25 cards were "learned" at all)
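The derived columns can be reproduced with the sketch below. It uses plain Python lists instead of the pandas column quoted above, and the function names are hypothetical; the deck size of 569 and the two rows checked (DR 0.75 and one of the DR 0.85 rows) are taken from the table.

```python
DECK_SIZE = 569


def remembered_cards(retrievabilities):
    """Sum the per-card recall probabilities and truncate to an integer."""
    return int(sum(retrievabilities))


def per_card_stats(total_time, remembered, deck_size=DECK_SIZE):
    """Return (time per remembered card, fraction of the deck remembered)."""
    if remembered <= 0:
        raise ValueError("remembered must be positive")
    return total_time / remembered, remembered / deck_size


for dr, total_time, remembered in [(0.75, 113.6, 525), (0.85, 114.0, 539)]:
    time_per_card, fraction = per_card_stats(total_time, remembered)
    print(dr, round(time_per_card, 3), round(fraction, 3))
```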

114 ÷ 600 = 0.19 min/day avg = 11.4 seconds
11.4 ÷ 4 = 2.85
11.4 ÷ 3 = 3.8
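The same arithmetic in Python, reading 114 as roughly the total minutes from the table rows near DR 0.85, 600 as the number of days, and 4 and 3 as candidate numbers of reviews per day. That reading of the divisors is an assumption, as is the function name.

```python
def seconds_per_day(total_minutes, days):
    """Average study time per day in seconds."""
    return total_minutes / days * 60


daily = seconds_per_day(114, 600)
print(round(daily, 1))      # average seconds of study per day
print(round(daily / 4, 2))  # seconds per review at 4 reviews/day
print(round(daily / 3, 2))  # seconds per review at 3 reviews/day
```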

@L-M-Sherlock
Member

Simulator needs a switch to disable Anki simulation. Optimizer needs to support a new card limit.

Please open a new issue to submit the feature request.

4 participants