Raft snapshots disappear during machine crashes #3362
Fixed with 454b3a2.
@preetapan thanks for the fast fix.
Hi @rom-stratoscale, we've done some local testing, but it would be great if you could give this a spin in your local environment and see if it looks fixed there as well. Are you ok with making a build locally off master?
@slackpad we're using 0.8.4 (just upgraded). I can try to cherry-pick it (btw, aren't you planning to backport the fix to 0.8.x?). Also, do you have a containerized build environment for Consul, or should I just build it on my laptop?
@slackpad @preetapan strange. I cherry-picked your commit on top of 0.8.4 and ran the crash tests (IPMI power-off of a server, CentOS 7.1, kernel 3.10). Approximately 40 seconds afterwards the test crashed the node. Then the node starts: ==> Log data will now stream in as it occurs:
Consul did not manage to connect (probably test network issues). After a while we restarted it, and then got the following errors: ==> Log data will now stream in as it occurs:
which looks to me like snapshot corruption. Not sure why, though, since on the first iteration it was loaded successfully. state.pdf Thanks,
Hi @rom-stratoscale, thanks for the test feedback. From the output it looks like the Raft changes worked ok, but you got a few corrupted lines in the Serf snapshots, which are a totally different thing. Those lines are skipped on startup, so the agent can still start up but may have lost some information about the rest of the cluster, though that should heal itself automatically in most situations. We've done some work on improving this (6d172b7#diff-07aceaceda81f1c08ffb4c11f488ba45), but there's probably room for a change similar to what was done with Raft to better avoid the corruption there in the first place.
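To make the "skipped on startup" behavior concrete, here is a minimal, hypothetical Go sketch of replaying a line-oriented snapshot while tolerating a torn line. It is not the actual Serf code; the `replaySnapshot`/`applyEvent` names and the "type: payload" line shape are assumptions for illustration only.

```go
package serfsketch

import (
	"bufio"
	"log"
	"os"
	"strings"
)

// replaySnapshot reads a line-oriented snapshot and skips any line that does
// not parse, instead of failing the whole startup.
func replaySnapshot(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		line := scanner.Text()
		// A torn write from a crash leaves a line that doesn't match the
		// expected "type: payload" shape; log it and move on.
		parts := strings.SplitN(line, ": ", 2)
		if len(parts) != 2 {
			log.Printf("skipping corrupted snapshot line: %q", line)
			continue
		}
		applyEvent(parts[0], parts[1])
	}
	return scanner.Err()
}

// applyEvent is a placeholder for rebuilding in-memory state from one event.
func applyEvent(kind, payload string) {
	_ = kind
	_ = payload
}
```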
@rom-stratoscale @slackpad I tested the most recent Raft changes by having a background task unmount the file system partition that the Consul data directory was on, rather than crashing the whole machine. Thought that was functionally equivalent (it simulates the file system disappearing before it could fsync). I appreciate the test done above in a real environment to confirm, though. As for the
@preetapan I think I'm missing something - doesn't all snapshotting go into a .tmp directory that is only then renamed, meaning the data is synced to disk at the source (.tmp) and no other process touches it? So why would there be partial lines?
Serf snapshots don't work that way right now - they do a periodic atomic rename using a tmp file (
I can see why it was designed this way: in a large cluster there are a lot of gossip events, and having each one of them trigger a write/fsync of a tmp file would hurt performance. This does predate me, so I could be missing some of the reasoning for why it works this way. We could consider not appending those intermediate lines at all and only doing atomic compacted writes to a snapshot file after we've received enough events, but there are other edge cases to consider there.
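For illustration only, here is a rough Go sketch of the "atomic compacted writes" idea mentioned above, using the write-tmp/fsync/rename pattern. It is not how Serf is actually implemented; the `writeCompacted` name and the line-per-event format are assumptions.

```go
package serfsketch

import (
	"bufio"
	"fmt"
	"os"
)

// writeCompacted writes a full, compacted set of events to <path>.tmp,
// fsyncs it, and atomically renames it over the live snapshot, so a crash
// leaves either the old snapshot or the new one but never a torn file.
func writeCompacted(path string, events []string) error {
	tmp := path + ".tmp"
	f, err := os.Create(tmp)
	if err != nil {
		return err
	}
	cleanup := func(err error) error {
		f.Close()
		os.Remove(tmp)
		return err
	}

	w := bufio.NewWriter(f)
	for _, e := range events {
		if _, err := fmt.Fprintln(w, e); err != nil {
			return cleanup(err)
		}
	}
	if err := w.Flush(); err != nil {
		return cleanup(err)
	}
	// fsync before rename so the data is durable before it becomes visible.
	if err := f.Sync(); err != nil {
		return cleanup(err)
	}
	if err := f.Close(); err != nil {
		os.Remove(tmp)
		return err
	}
	// Atomic on POSIX filesystems: readers see either the old or new file.
	return os.Rename(tmp, path)
}
```

The trade-off is exactly the one noted above: every compacted write costs an fsync, so it only pays off when enough events are batched rather than doing it per gossip message.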
@preetapan thanks for the detailed answer. My bad, I thought it was about the same snapshot; I missed the fact that there is a separate Serf snapshot (and not only Raft). Btw, it seems I just managed to reproduce the original issue (Raft). Unfortunately I used the wrong test branch, though I did get the snapshot backups. Rerunning. Will update later / tomorrow.
Mmm, ok. So I have a reproduction :(

{"Version":1,"ID":"2-8269-1502052768345","Index":8269,"Term":2,"Peers":"k68xLjcxLjE5Mi41OjgzMDCxMS43MS4xOTIuMTE0OjgzMDCxMS43MS4yMTQuMTE4OjgzMDA=","Configuration":{"Servers":[{"Suffrage":0,"ID":"1.71.192.5:8300","Address":"1.71.192.5:8300"},{"Suffrage":0,"ID":"1.71.192.114:8300","Address":"1.71.192.114:8300"},{"Suffrage":0,"ID":"1.71.214.118:8300","Address":"1.71.214.118:8300"}]},"ConfigurationIndex":1787,"Size":619147,"CRC":"goN31aG/9aU="}

How do we want to proceed?
@rom-stratoscale I'll have to try to reproduce on my end - I can't easily tell how this can still be happening given the latest changes. If you can attach logs from the container that you crashed, that would help. Was the above meta.json file left in a directory that did not end with .tmp?
I have tried to reproduce this in the following way: I have the latest version of Consul with my changes running in a Docker container. A script runs
This might be a very rare edge case. The container going away and coming back is sort of like restarting the machine, but it's not exactly the same thing.
Also learned about implementation differences between ext3/ext4 when it comes to fsync (http://blog.httrack.com/blog/2013/11/15/everything-you-always-wanted-to-know-about-fsync/). What kernel version and file system type are you using?
I don't see this happening with docker kill either (by default it sends the same SIGKILL signal that kill -9 does). Given that the snapshots are so fast, we're talking about a 50 microsecond window in which things could potentially be left in a bad state like you describe above. So far, I haven't been able to replicate it. Please attach whatever logs you do have, even if they are partial, in case there's a hint in there somewhere. It is expected that .tmp will not exist - we either delete it if there is any write error, or do a rename after the fsync of state.bin and metadata.json. Also, you mention your kernel version, but not the filesystem type. Can you tell us what the output of
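As a rough sketch of the sequence described above (write into a .tmp directory, fsync the files, then rename into place; remove the .tmp directory on any write error), here is a minimal Go illustration. The `finalizeSnapshot`/`writeAndSync` names and the exact file names are assumptions, not the actual hashicorp/raft code.

```go
package raftsketch

import (
	"os"
	"path/filepath"
)

// finalizeSnapshot writes the snapshot payload and metadata into <dir>.tmp,
// syncs both files, and only then renames the directory into place. Any
// write error removes the partial .tmp directory so it is never loaded.
func finalizeSnapshot(dir string, state, meta []byte) error {
	tmp := dir + ".tmp"
	if err := os.MkdirAll(tmp, 0o755); err != nil {
		return err
	}
	fail := func(err error) error {
		os.RemoveAll(tmp)
		return err
	}

	if err := writeAndSync(filepath.Join(tmp, "state.bin"), state); err != nil {
		return fail(err)
	}
	if err := writeAndSync(filepath.Join(tmp, "meta.json"), meta); err != nil {
		return fail(err)
	}

	// The rename is the commit point for the snapshot.
	if err := os.Rename(tmp, dir); err != nil {
		return fail(err)
	}

	// On ext3/ext4 the rename itself only survives a crash once the parent
	// directory has been fsynced too.
	parent, err := os.Open(filepath.Dir(dir))
	if err != nil {
		return err
	}
	defer parent.Close()
	return parent.Sync()
}

func writeAndSync(path string, data []byte) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	if err := f.Sync(); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}
```

The final parent-directory fsync matters because on ext3/ext4 a rename is only guaranteed to survive a crash once the containing directory has been synced, which appears to be the kind of gap the root-cause issue linked at the end of this thread deals with.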
@preetapan Consul logs are just missing. Also, I do see fsck in the journal log after the node reboots: Aug 07 00:00:07 localhost systemd-fsck[888]: ROOT: Clearing orphaned inode 1198285 (uid=162, gid=162, mode=040755, size=4096) but I cannot confirm that those are the relevant inodes. Can you try not using the Docker filesystem, but mounting the raft directory to the host, and then do some killing?
@rom-stratoscale had a chance to revisit this today - I've tested this with the raft directory mounted to the host as a volume, which is the default way Consul works in Docker anyway. No luck reproducing. Looks like
[root@stratonode1 ~]# df -T /mnt/data/consul
It's an LVM volume with an ext4 filesystem.
Given this issue has been quiet and we haven't been able to reproduce it as of the last try, I'm going to close this, but do feel free to comment if you continue to see this behavior.
@pearkes it happened one more time. What data do you need to debug these issues?
Root cause - hashicorp/raft#229