Our state machine may be overly complex, and users may need to monitor several individual states to determine whether a container is healthy. A couple of times now we have run into a situation where FF was able to create/attach a new doc and get an ID back, but was unable to interact with the op stream because the op sequencing service was down. In terms of user experience, it was not obvious that an error was present, since successful doc creation implied the services were up and running.
Can we create more holistic "healthy system" indicators/API?
Do we have the needed service monitoring capabilities in place?
Should we allow operations on the historian/storage service if alfred is (for example) down?
I think the key here is minimizing complexity and the necessity for plumbing by the client. In this particular case the client attached a new container and then saw that it was stuck in the dirty state. It turned out the container had never connected, so all ops after attach were waiting to be sent.
When I think about this in a scenario-focused way, I can see two different but related scenarios:
1. Attaching a new container
2. Tracking saved/dirty state
For the first, should a container be considered successfully attached if it can't send ops? There are performance reasons not to wait, but ideally the defaults make it easy, and we have ways to get performance with more work.
For the second, a container that never connects will never move to saved. Potentially we need a better model around saved/dirty so that we can express that saving is blocked or having trouble.
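To make the second scenario concrete, here is a minimal sketch of what a richer saved/dirty model could look like. The `ContainerSnapshot` shape, the `SaveStatus` values, and `deriveSaveStatus` are all hypothetical names used for illustration; none of them are existing Fluid Framework APIs.

```typescript
// Hypothetical snapshot of the two states discussed above. The field names
// are illustrative only and are not taken from the Fluid Framework API.
interface ContainerSnapshot {
    connected: boolean; // connected to the delta stream and caught up
    dirty: boolean;     // local changes have not been acknowledged yet
}

// "blocked" is the extra state proposed in this thread: dirty, but unable to send ops.
type SaveStatus = "saved" | "saving" | "blocked";

function deriveSaveStatus(snapshot: ContainerSnapshot): SaveStatus {
    if (!snapshot.dirty) {
        return "saved";
    }
    // Dirty and connected: pending ops can be sent, so saving is in progress.
    if (snapshot.connected) {
        return "saving";
    }
    // Dirty and not connected: ops are queued and cannot be sent.
    return "blocked";
}
```

With a model like this, the container from the scenario above (attached, never connected, with pending ops) would report `"blocked"` instead of an indefinite dirty state.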
A note about the current state of things: the "connected" event fires once the container is connected to the delta stream and "caught up" (well, pending #9377). So maybe we want to key other things off this as well.
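As a rough illustration of keying other behavior off "connected", the sketch below runs a callback as soon as a source is connected. `ConnectionSource` is a hypothetical, minimal interface assumed for this example (a `connected` flag plus `on`/`off` for the "connected" event); it is not the actual container type, and `onceConnected` is not an existing helper.

```typescript
// Hypothetical minimal view of something that reports connectivity.
// This shape is an assumption for illustration, not the real container interface.
interface ConnectionSource {
    readonly connected: boolean;
    on(event: "connected", listener: () => void): void;
    off(event: "connected", listener: () => void): void;
}

// Runs `callback` exactly once: immediately if the source is already
// connected, otherwise on the first "connected" event.
function onceConnected(source: ConnectionSource, callback: () => void): void {
    if (source.connected) {
        callback();
        return;
    }
    const listener = (): void => {
        // Unsubscribe first so a later reconnect does not run the callback again.
        source.off("connected", listener);
        callback();
    };
    source.on("connected", listener);
}
```

A client could use a helper like this to report "attach succeeded" to the user only after the first connection, at the cost of waiting longer than the attach call itself.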
This PR has been automatically marked as stale because it has had no activity for 60 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework!