Over the last few months, Redis outages have been the main cause of serlo.org downtime, so we need to make Redis more reliable.
Redis failed for the following reasons:
not enough storage
the version of Redis we use doesn't automatically remove stale keys
upgrading the Redis image would require refactoring the module (the helm chart would need to be upgraded)
not enough memory -> might be solvable with auto-scaling
sometimes Redis restarts itself and we don't yet know the real reason (not a big problem, since it only lasts a few minutes, but it is annoying and hard to debug)
We could fix all of these issues in the cluster, but it sounds easier to outsource Redis hosting so that we no longer have to manage its availability ourselves.
Note that our API gateway depends on the Redis cache, so the gateway goes down whenever the connection to Redis fails.
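Independently of where Redis is hosted, the hard dependency described above could be softened by treating a failed cache connection as a cache miss. The following is only a minimal sketch of that idea, not the gateway's actual code: the `Cache` interface, `getWithFallback`, `loadFromSource`, and `brokenCache` are hypothetical names introduced for illustration, and a real Redis client would have to be wrapped to fit the interface.

```typescript
// Hypothetical minimal cache interface (an assumption, not the gateway's real API).
interface Cache {
  get(key: string): Promise<string | null>
  set(key: string, value: string): Promise<void>
}

// Cache-aside read that treats every cache failure as a cache miss,
// so an unreachable cache slows requests down instead of failing them.
async function getWithFallback(
  cache: Cache,
  key: string,
  loadFromSource: (key: string) => Promise<string>,
): Promise<string> {
  try {
    const cached = await cache.get(key)
    if (cached !== null) return cached
  } catch {
    // Cache unreachable: fall through and load from the source instead.
  }

  const value = await loadFromSource(key)

  try {
    await cache.set(key, value)
  } catch {
    // Ignore write failures; the value is still returned to the caller.
  }

  return value
}

// Stand-in for a Redis outage: every operation rejects.
const brokenCache: Cache = {
  get: async () => {
    throw new Error('connection refused')
  },
  set: async () => {
    throw new Error('connection refused')
  },
}
```

With `brokenCache`, a call such as `getWithFallback(brokenCache, 'some-key', load)` still resolves to whatever `load` returns. The trade-off is that during an outage every request hits the underlying data source, so this only helps if that source can handle the uncached load.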
RICE Score
Reach and impact: 8 -- Will not directly make work on features easier, but saves the developer time that is currently spent whenever Redis fails.
Confidence: 7 -- Should be possible, but needs a small investigation into the options and the concrete implementation.
Effort: 6 -- Does not sound too complicated at first, but there are probably many details to consider to get it right.