WHYPFS gateway cluster + Distributed Filesystem #14

Open
snissn opened this issue Dec 19, 2022 · 1 comment

snissn commented Dec 19, 2022

Idea/Proposal: Estuary Gateway cluster based on WHYPFS + Distributed filesystem

Contributors: @snissn
Status: Draft
Revision:

Proposal

NOTE: This is a draft and is not finalized yet. We'll have to polish it until we all agree on the approach.

NOTE: This draft is based on the proposal by @alvin-reyes here and is built off its formatting!

WHYPFS, with SeaweedFS exposed as a distributed filesystem over FUSE, allows for a very safe and scalable architecture for file storage, IPFS pinning of files, and file serving over HTTP.

  • We want user data to be highly available and resilient against hardware failures, and even against geographic or data center failures.

  • Moreover, it is a disadvantage and a confusing UX for users to have to declare and dedicate themselves to a particular gateway.

  • whypfs + a distributed filesystem will allow us to create a distributed pinning system that is resilient to data loss from individual drive failures, individual server failures, and even entire data center outages.

  • Example: we have two data centers with 5 nodes per data center. Files uploaded to SeaweedFS can be replicated twice per data center, so we would have 10 servers across 2 data centers holding 4 copies of each piece of data to guard against hardware failure. We will be able to add nodes to scale, and each node can serve as a pinning gateway.

  • SeaweedFS can be set up as a mount point on each node. Each such node would have whypfs-gateway installed and would use flatfs with the distributed filesystem mount point as its flat filesystem datastore (see the sketch after this list).

  • The put/upload and delete APIs would be protected by a secret password that only the API node has, but the get and gw API endpoints would be fully public.

  • The API node can be changed moderately to rely on a highly available, very fast whypfs + SeaweedFS cluster, without the end user knowing anything about provisioning a gateway!

  • Over time, data can be deleted from the whypfs cluster, with Filecoin as a long-term storage backup!
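A minimal sketch of the per-node storage layer, assuming a standard SeaweedFS master/volume/filer/mount layout; the replication code, hostnames, paths, and the whypfs-gateway flag are illustrative assumptions, not final choices:

```sh
# Master with a cluster-wide default replication policy. "010" (one extra copy
# on another rack in the same data center) is only an illustration; the real
# code depends on the data center / rack topology we settle on.
weed master -mdir=/var/lib/seaweedfs/master -defaultReplication=010

# A volume server on each storage node, tagged with its data center and rack
# so the replication policy can place copies correctly.
weed volume -dir=/var/lib/seaweedfs/volume -mserver=master-host:9333 \
  -dataCenter=dc1 -rack=rack1

# A filer plus FUSE mount on each gateway node (each of these runs as its own
# long-lived service, e.g. under systemd).
weed filer -master=master-host:9333
weed mount -filer=localhost:8888 -dir=/mnt/seaweedfs

# whypfs-gateway then points its flatfs blockstore at the mount.
# (The flag name below is hypothetical; the actual gateway configuration may differ.)
whypfs-gateway --blockstore-dir=/mnt/seaweedfs/whypfs-blocks
```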

Detailed plan:

SeaWhypfs

Step 0. Edit this master plan document

Step 1
Investigate how much disk space the IPFS pin cluster uses in production.
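A quick way to take this measurement on the existing nodes (paths are placeholders):

```sh
# Size of the blockstore directory on each existing pinning node (path is a placeholder).
du -sh /path/to/ipfs/blockstore

# If kubo/go-ipfs is running locally, its repo stats report the same numbers.
ipfs repo stat --human
```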

Step 2. Spec out how many servers we want in production, given how much data we need to store, how much headroom we want before needing to add more nodes, and how much redundancy we want in the data set. Identify concretely what we will need, i.e. we will need 2x data centers with 5x servers each, for 10 servers total, with each server having xyz terabytes of disk with raidx redundancy.
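As a rough worked example (the numbers are purely illustrative): with 10 servers of 100 TB raw disk each and 4 copies of every object, usable capacity would be about 10 × 100 / 4 = 250 TB, before reserving headroom for growth and for rebuilds after a failure.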

Step 3. Code deploy scripts
Write the Ansible playbooks and other required software for deploying and managing a SeaweedFS + whypfs cluster.
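One possible way to drive the deployment, assuming an inventory grouped by role; the group names and playbook filename are hypothetical:

```sh
# Hypothetical inventory: one group per role, spread over two data centers.
cat > inventory.ini <<'EOF'
[seaweed_masters]
dc1-node1
dc1-node2
dc2-node1

[seaweed_volumes]
dc1-node[1:5]
dc2-node[1:5]

[whypfs_gateways]
dc1-node[1:5]
dc2-node[1:5]
EOF

# Run the (hypothetical) playbook that installs SeaweedFS, creates the FUSE
# mount, and deploys whypfs-gateway on every node.
ansible-playbook -i inventory.ini seawhypfs.yml
```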

Step 4. Set up a test cluster
Using Ansible, build a three-node cluster with 2x disk replication in SeaweedFS, put whypfs-gateway on each of the SeaweedFS nodes, load the nodes with at least 1 TB of data using the whypfs API endpoints, and verify that whypfs put and get work.
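A sketch of the put/get smoke test; the port, routes, auth header, and response shape are assumptions about whypfs-gateway rather than its documented API:

```sh
# Upload through the protected write endpoint (endpoint, port, and auth are hypothetical).
cid=$(curl -s -H "Authorization: Bearer $WHYPFS_SECRET" \
  -F "data=@./testfile.bin" "http://dc1-node1:1313/upload" | jq -r '.cid')

# Read it back through the public gateway path on a *different* node, which
# proves the block landed on the shared SeaweedFS-backed blockstore.
curl -sf "http://dc2-node1:1313/gw/ipfs/$cid" -o /tmp/roundtrip.bin
cmp ./testfile.bin /tmp/roundtrip.bin && echo "put/get OK"
```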

Step 5. Set up the full-scale cluster
Deploy the large-scale cluster for production.

Step 6. Backfill
Clone Estuary's pin data into SeaWhypfs.
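One way the backfill could be scripted, assuming the existing nodes can export pinned DAGs as CAR files and the new cluster exposes some authenticated CAR-ingest endpoint (that endpoint is a placeholder):

```sh
# On an existing pinning node: push every recursively pinned DAG into the new
# cluster as a CAR stream. The ingest URL and auth are placeholders.
ipfs pin ls --type=recursive --quiet | while read -r cid; do
  ipfs dag export "$cid" \
    | curl -sf -H "Authorization: Bearer $WHYPFS_SECRET" \
        --data-binary @- "http://seawhypfs-gw.internal/ingest-car?cid=$cid" \
    || echo "$cid" >> backfill-failures.log
done
```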

Step 7.
Change the Estuary code base to take advantage of the data lake: change add-pin in Estuary to push to the data lake, and use the cluster's gateway DNS for reads. Make sure the deal-making API also uses the new URL if it needs it.


Zorlin commented Dec 20, 2022

Very keen to prototype this with Mike. Looks like a solid plan.

Some additional thoughts

  • We should rely on Tailscale, WireGuard or Netbird to create a converged network across all of our nodes for sending traffic between them. It will give us near wire-speed connectivity between nodes and regions.
  • ZFS is a good idea for bit-rot protection (see the sketch after this list)
  • If we trust SeaweedFS to store our data, we probably trust it to handle replication (or erasure coding) for us
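For the ZFS point above, a minimal single-disk layout could look like the following (device ID, pool and dataset names are placeholders); ZFS only provides checksumming and scrub-based bit-rot detection here, while redundancy stays with SeaweedFS replication:

```sh
# One pool per data disk, no RAID at the ZFS layer.
zpool create -o ashift=12 seaweed1 /dev/disk/by-id/DISK1_ID

# Dataset for SeaweedFS volume files, with compression enabled.
zfs create -o compression=lz4 -o mountpoint=/var/lib/seaweedfs/volume1 seaweed1/volumes

# Periodic scrubs surface silent corruption early.
zpool scrub seaweed1
```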

After a lot of discussion with @snissn, here is my suggestion for a revised set of steps:

  1. Create a toy/proof-of-concept cluster (either using Ansible or by hand), spread across at least 3 geographical regions (ideally with 2 datacenters each) and with the explicit goal of testing expansion by adding servers once it is "full".
  2. Verify WhyPFS functionality on the proof of concept cluster.
  3. Investigate how much disk space is used by the shuttles/IPFS infrastructure.
  4. Spec out how many regions, servers and total data capacity we want to have. Run with the assumption that we'll be using standard 2x replication, sealing to 10x4 erasure coding (the standard for SeaweedFS), and using ZFS single-disk volumes for bitrot protection (using SeaweedFS replication instead of RAID, in other words); see the sealing sketch after this list.
  5. Build an Ansible playbook for deploying SeaweedFS + WhyPFS
  6. Deploy a new test cluster (which will later become a staging cluster for this setup). It should consist of at least 3 regions, at least 3 servers per region, and at least 1 TB of storage per server.
  7. Verify WhyPFS functionality on the test cluster.
  8. Deploy a production cluster, per earlier requirements.
  9. Verify the production cluster as fully as possible (WhyPFS functionality, DR including complete site failure testing, pulled disks, etc.).
  10. Backfill the new cluster from the existing infrastructure
  11. Change the Estuary code base to take advantage of the data lake: change add-pin in Estuary to push to the data lake, and use the cluster's gateway DNS for reads. Make sure the deal-making API also uses the new URL if it needs it.
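For the sealing assumption in step 4, SeaweedFS can convert full, quiet volumes from plain replication into 10+4 erasure-coded shards from `weed shell`; a rough sketch, with the collection name and thresholds as placeholders:

```sh
# Seal volumes that are ~full and have been quiet for an hour into 10+4
# erasure-coded shards.
echo "ec.encode -collection whypfs -fullPercent=95 -quietFor=1h" \
  | weed shell -master=master-host:9333
```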
