This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] Guazi Dynamic - Open Sustainability Data (3/3) #1337

Closed
Fatman13 opened this issue Nov 29, 2022 · 40 comments

@Fatman13
Contributor

Large Dataset Notary Application

To apply for DataCap to onboard your dataset to Filecoin, please fill out the following.

Core Information

  • Organization Name: Guazi Dynamic
  • Website / Social Media: guazi.io / Slack: @Fatman13
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 5PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB): 500TiB
  • On-chain address for first allocation: f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

Please respond to the questions below by replacing the text saying "Please answer here". Include as much detail as you can in your answer.

Building on the success of our previous [project](https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/84), in which we distributed close to 3 PiB of DataCap deals to a pool of 33 storage providers (nodes), we would like to continue working with the amazing SPs across Asia and bring open datasets on critical sustainability research to the Filecoin network.

Project details

Share a brief history of your project and organization.

Guazi has been with Filecoin since Space Race. We continue to bridge the gap between storage clients and Filecoin.
We have developed a system that is ready to onboard large amounts of data onto the Filecoin network.

What is the primary source of funding for this project?

The company will cover the cost of proper equipment and bandwidth, as well as the cost of communicating with other SPs.

What other projects/ecosystem stakeholders is this project associated with?

CN SPWG, incubation program, SPs we have closely worked with in the past.

Use-case details

Describe the data being stored onto Filecoin

The datasets we will be onboarding onto the Filecoin network are critical open source research datasets on sustainability, which may include but are not limited to the following...

  • [Prediction Of Worldwide Energy Resources (POWER)](https://power.larc.nasa.gov/) Project
  • GOES-18 Data
  • Next Generation Weather Radar (NEXRAD)
  • Global Mangrove Watch (GMW) dataset
  • Digital Earth Africa (DE Africa)
  • Water Observations from Space (WOfS)
  • ...

More information can be found at https://registry.opendata.aws/tag/sustainability/

Where was the data in this dataset sourced from?

https://registry.opendata.aws/tag/sustainability/

Can you share a sample of the data? A link to a file, an image, a table, etc., are good ways to do this.

https://www.globalmangrovewatch.org/

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Licensed under https://creativecommons.org/licenses/by/4.0/

What is the expected retrieval frequency for this data?

A couple of times a year.

For how long do you plan to keep this dataset stored on Filecoin?

A year and a half.

DataCap allocation plan

In which geographies (countries, regions) do you plan on making storage deals?

All regions, but mainly Asia, where Guazi has the most network bandwidth.

How will you be distributing your data to storage providers? Is there an offline data transfer process?

Online, or offline if an SP prefers that.

How do you plan on choosing the storage providers with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We will contact the SPWG and Incubation program admins, as well as any other SPs interested in accepting the deals.

How will you be distributing deals across storage providers?

In our previous [project](https://github.com/filecoin-project/filecoin-plus-large-datasets/issues/84#issuecomment-1112128366), we successfully distributed 3 PiB of DataCap deals to a pool of 33 different storage provider nodes.

Deals will be spread evenly across all SPs, up to what each can handle. If an SP is itself a notary, it will receive no more than 10% of the total granted DataCap.

Do you have the resources/funding to start making deals as soon as you receive DataCap? What support from the community would help you onboard onto Filecoin?

We have enough funding and resources to carry out the project.
@large-datacap-requests

Thanks for your request!

Heads up, you’re requesting more than the typical weekly onboarding rate of DataCap!

@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and contact you back pretty soon.

@simonkim0515
Collaborator

Datacap Request Trigger

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

500TiB

Client address

f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

@large-datacap-requests

large-datacap-requests bot commented Dec 5, 2022

DataCap Allocation requested

Multisig Notary address

f02049625

Client address

f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

DataCap allocation requested

250TiB

Id

f9d46ada-0afc-461b-ae68-873792ae4d1d
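
The thread does not say how the 250TiB first allocation was derived from the 5PiB total and the 500TiB weekly rate. One possible reading, which is an assumption and not something stated here, is a tranche schedule in which each allocation is the lesser of a share of the total request and a multiple of the weekly rate. The shares and multiples in `TRANCHES`, and the function name `tranche_tib`, are illustrative assumptions; the sketch only checks whether such a schedule would be consistent with the amounts that appear in this thread.

```python
# Hypothetical tranche schedule (an assumption, not stated in this thread):
# each allocation is the lesser of a share of the total request and a
# multiple of the weekly rate.
TRANCHES = [
    (0.05, 0.5),  # 1st allocation
    (0.10, 1.0),  # 2nd allocation
    (0.20, 2.0),  # 3rd allocation
    (0.40, 4.0),  # 4th allocation
]

TIB_PER_PIB = 1024


def tranche_tib(n: int, total_tib: float, weekly_tib: float) -> float:
    """Return the size in TiB of the n-th allocation (1-indexed)."""
    total_share, weekly_multiple = TRANCHES[n - 1]
    return min(total_share * total_tib, weekly_multiple * weekly_tib)


total = 5 * TIB_PER_PIB  # 5PiB requested in total
weekly = 500             # 500TiB requested per week

for n in range(1, len(TRANCHES) + 1):
    print(n, tranche_tib(n, total, weekly), "TiB")
```

Under these assumed parameters the first two values are 250 and 500 TiB, which match the two bot requests in this thread, and the fourth is 2000 TiB, or about 1.95PiB.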

@filplus-checker

DataCap and CID Checker Report [1]

  • Organization: Guazi Dynamic
  • Client: f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

Storage Provider Distribution

The table below shows the distribution of storage providers that have stored data for this client.

If this is the first time a provider has taken a verified deal, it is marked as new.

For most DataCap applications, the following restrictions should apply.

  • A storage provider should not hold more than 25% of the total DataCap.
  • No more than 20% of the data stored by a storage provider should be duplicate data.
  • A storage provider should have published its public IP address.
  • All storage providers should be located in different regions.

⚠️ 43.76% of total deal sealed by f0522948 are duplicate data.

⚠️ 43.27% of total deal sealed by f0867300 are duplicate data.

⚠️ 40.39% of total deal sealed by f01228008 are duplicate data.

⚠️ 39.84% of total deal sealed by f01228000 are duplicate data.

⚠️ 41.33% of total deal sealed by f01228087 are duplicate data.

⚠️ 41.52% of total deal sealed by f01228105 are duplicate data.

⚠️ 41.43% of total deal sealed by f01228100 are duplicate data.

⚠️ 41.88% of total deal sealed by f01228089 are duplicate data.

⚠️ 70.92% of total deal sealed by f01114587 are duplicate data.

⚠️ 44.97% of total deal sealed by f01228065 are duplicate data.

⚠️ 44.56% of total deal sealed by f01228009 are duplicate data.

⚠️ 40.07% of total deal sealed by f023651 are duplicate data.

⚠️ f023651 has unknown IP location.

⚠️ 38.33% of total deal sealed by f01114589 are duplicate data.

⚠️ f01114589 has unknown IP location.

⚠️ 43.84% of total deal sealed by f01114827 are duplicate data.

⚠️ f020378 has unknown IP location.

| Provider | Location | Total Deals Sealed | Percentage | Unique Data | Duplicate Deals |
| --- | --- | --- | --- | --- | --- |
| f0522948 | Singapore, Singapore, SG | 511.91 TiB | 8.27% | 287.91 TiB | 43.76% |
| f0867300 | Tokyo, Tokyo, JP | 507.54 TiB | 8.20% | 287.91 TiB | 43.27% |
| f01228008 | Sydney, New South Wales, AU | 487.20 TiB | 7.87% | 290.41 TiB | 40.39% |
| f01228000 | Seoul, Seoul, KR | 477.52 TiB | 7.71% | 287.28 TiB | 39.84% |
| f01228087 | London, England, GB | 447.49 TiB | 7.23% | 262.52 TiB | 41.33% |
| f01228105 | Hong Kong, Central and Western, HK | 441.52 TiB | 7.13% | 258.21 TiB | 41.52% |
| f01228100 | San Jose, California, US | 440.79 TiB | 7.12% | 258.18 TiB | 41.43% |
| f01228089 | Frankfurt am Main, Hesse, DE | 438.91 TiB | 7.09% | 255.08 TiB | 41.88% |
| f01114587 | Tokyo, Tokyo, JP | 266.09 TiB | 4.30% | 77.39 TiB | 70.92% |
| f01228065 | Singapore, Singapore, SG | 261.23 TiB | 4.22% | 143.77 TiB | 44.97% |
| f01228009 | Hong Kong, Central and Western, HK | 259.33 TiB | 4.19% | 143.77 TiB | 44.56% |
| f023651 | Unknown | 251.03 TiB | 4.05% | 150.45 TiB | 40.07% |
| f01114589 | Unknown | 233.12 TiB | 3.76% | 143.77 TiB | 38.33% |
| f0134516 | Hong Kong, Central and Western, HK | 224.05 TiB | 3.62% | 216.13 TiB | 3.54% |
| f0118330 | Hong Kong, Central and Western, HK | 218.26 TiB | 3.52% | 215.76 TiB | 1.15% |
| f01353593 (new) | Hong Kong, Central and Western, HK | 130.05 TiB | 2.10% | 106.98 TiB | 17.74% |
| f0522949 | Thessaloníki, Central Macedonia, GR | 105.69 TiB | 1.71% | 96.17 TiB | 9.00% |
| f0118317 | Frankfurt am Main, Hesse, DE | 105.58 TiB | 1.70% | 96.11 TiB | 8.97% |
| f0401135 | Helsinki, Uusimaa, FI | 103.38 TiB | 1.67% | 96.05 TiB | 7.09% |
| f0522364 | Al Ain City, Abu Dhabi, AE | 102.72 TiB | 1.66% | 95.23 TiB | 7.29% |
| f01114827 | Tokyo, Tokyo, JP | 100.13 TiB | 1.62% | 56.23 TiB | 43.84% |
| f033463 (new) | Hong Kong, Central and Western, HK | 41.82 TiB | 0.68% | 41.82 TiB | 0.00% |
| f01509930 | Guangzhou, Guangdong, CN | 13.32 TiB | 0.22% | 10.82 TiB | 18.76% |
| f023467 | Oslo, Oslo, NO | 3.69 TiB | 0.06% | 3.65 TiB | 1.06% |
| f01199442 | Heerhugowaard, North Holland, NL | 2.49 TiB | 0.04% | 2.45 TiB | 1.57% |
| f0187709 | Moscow, Moscow, RU | 2.46 TiB | 0.04% | 2.46 TiB | 0.00% |
| f01402814 | Singapore, Singapore, SG | 2.40 TiB | 0.04% | 2.37 TiB | 1.30% |
| f01208862 | Heerhugowaard, North Holland, NL | 2.07 TiB | 0.03% | 2.04 TiB | 1.32% |
| f01201327 | Heerhugowaard, North Holland, NL | 1.82 TiB | 0.03% | 1.79 TiB | 1.72% |
| f01207045 | Heerhugowaard, North Holland, NL | 1.77 TiB | 0.03% | 1.76 TiB | 0.88% |
| f01199430 | Heerhugowaard, North Holland, NL | 1.67 TiB | 0.03% | 1.66 TiB | 0.70% |
| f03488 | Seoul, Seoul, KR | 1.23 TiB | 0.02% | 1.23 TiB | 0.00% |
| f020378 | Unknown | 1.11 TiB | 0.02% | 1.10 TiB | 0.35% |
| f033356 | Seoul, Seoul, KR | 1.06 TiB | 0.02% | 1.06 TiB | 0.00% |
| f097777 | Kivertsi, Volyn, UA | 764.00 GiB | 0.01% | 764.00 GiB | 0.00% |
| f024184 | Seoul, Seoul, KR | 584.00 GiB | 0.01% | 584.00 GiB | 0.00% |
| f0440429 | Seoul, Seoul, KR | 442.00 GiB | 0.01% | 442.00 GiB | 0.00% |
| f030379 | Seoul, Seoul, KR | 380.00 GiB | 0.01% | 380.00 GiB | 0.00% |
| f010617 | Surrey, British Columbia, CA | 152.00 GiB | 0.00% | 152.00 GiB | 0.00% |
| f01157288 | Sydney, New South Wales, AU | 24.00 GiB | 0.00% | 24.00 GiB | 0.00% |
| f010088 | Everett, Washington, US | 8.00 GiB | 0.00% | 8.00 GiB | 0.00% |
| f01163272 | Perm, Perm Krai, RU | 8.00 GiB | 0.00% | 8.00 GiB | 0.00% |
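
The report does not state how the Duplicate Deals column is computed. For the rows checked here, the figures are consistent with (total deals sealed minus unique data) divided by total deals sealed. The sketch below works under that assumption; the function name `duplicate_pct` and the choice of sample rows are illustrative, and the 20% limit is taken from the restrictions listed above.

```python
def duplicate_pct(total_tib: float, unique_tib: float) -> float:
    """Share of sealed deal volume that is not unique data, in percent."""
    return (total_tib - unique_tib) / total_tib * 100


# (provider, total deals sealed in TiB, unique data in TiB), copied from the table
rows = [
    ("f0522948", 511.91, 287.91),
    ("f01114587", 266.09, 77.39),
    ("f0118330", 218.26, 215.76),
]

DUPLICATE_LIMIT_PCT = 20  # duplicate-data limit from the restrictions above

for provider, total, unique in rows:
    pct = duplicate_pct(total, unique)
    flag = "over limit" if pct > DUPLICATE_LIMIT_PCT else "ok"
    print(f"{provider}: {pct:.2f}% duplicate ({flag})")
```

For these three rows the result agrees with the table (43.76%, 70.92% and 1.15%). Other rows can differ in the last digit, for example f0134516, because the table values are already rounded.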

[Chart: Provider Distribution]

Deal Data Replication

The table below shows how much unique data is replicated across how many storage providers.

  • No more than 25% of unique data should be stored with fewer than 4 providers.

✔️ Data replication looks healthy.

| Unique Data Size | Total Deals Made | Number of Providers | Deal Percentage |
| --- | --- | --- | --- |
| 155.98 TiB | 179.05 TiB | 1 | 2.89% |
| 1.10 TiB | 2.23 TiB | 2 | 0.04% |
| 157.34 TiB | 796.75 TiB | 3 | 12.87% |
| 273.11 TiB | 1.84 PiB | 4 | 30.50% |
| 146.21 TiB | 1.14 PiB | 5 | 18.79% |
| 132.01 TiB | 997.17 TiB | 6 | 16.10% |
| 22.53 TiB | 262.81 TiB | 7 | 4.24% |
| 55.55 TiB | 828.88 TiB | 8 | 13.38% |
| 640.00 GiB | 8.13 TiB | 9 | 0.13% |
| 128.00 GiB | 1.72 TiB | 10 | 0.03% |
| 4.28 TiB | 63.59 TiB | 11 | 1.03% |
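
The replication rule above can be read in two ways: as a share of deal volume or as a share of unique data size. The report does not say which one the checker applies, so the sketch below simply computes both from the table. Variable names are illustrative, and the GiB rows are converted to TiB by hand.

```python
# (unique data in TiB, deal percentage, number of providers), copied from the table.
# GiB rows are converted to TiB: 640 GiB = 0.625 TiB, 128 GiB = 0.125 TiB.
rows = [
    (155.98, 2.89, 1),
    (1.10, 0.04, 2),
    (157.34, 12.87, 3),
    (273.11, 30.50, 4),
    (146.21, 18.79, 5),
    (132.01, 16.10, 6),
    (22.53, 4.24, 7),
    (55.55, 13.38, 8),
    (0.625, 0.13, 9),
    (0.125, 0.03, 10),
    (4.28, 1.03, 11),
]

MIN_PROVIDERS = 4
LIMIT_PCT = 25

low_rows = [r for r in rows if r[2] < MIN_PROVIDERS]

# Reading 1: share of deal volume held by fewer than 4 providers
by_deals = sum(r[1] for r in low_rows)

# Reading 2: share of unique data size held by fewer than 4 providers
by_unique = sum(r[0] for r in low_rows) / sum(r[0] for r in rows) * 100

print(f"by deal percentage:  {by_deals:.2f}% (limit {LIMIT_PCT}%)")
print(f"by unique data size: {by_unique:.2f}% (limit {LIMIT_PCT}%)")
```

The first reading gives about 15.8%, which is under the limit and consistent with the "healthy" verdict. The second gives roughly 33%, so the two readings do not lead to the same conclusion for this client.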

[Chart: Replication Distribution]

Deal Data Shared with other Clients

The table below shows how much unique data is shared with other clients.
Usually, different applications own different data, which should not resolve to the same CID.

⚠️ CID sharing has been observed.

| Other Client | Application | Total Deals Affected | Unique CIDs | Verifier |
| --- | --- | --- | --- | --- |
| f1ymfz2mqdrkrdpjmrwh4qaqtuknfpsq3lp3r3auq | Venus team | 99.95 TiB | 1,429 | LDN v3 multisig |
| f1qlw5qik62kvrzvpa7bsst65uobtt3jmkrh3ajsq | アローズコーポレーション(Arrows Corporation) | 50.00 TiB | 218 | LDN v3 multisig |
| f3wgfwtrs5p6jrkwfl2mksqa2ivgbgdjjrhjbefy3n7qzvotc3y6sazmp5gfyj7um6jlgdvlbiepzawnc6wxtq | FileDrive Labs | 30.50 TiB | 368 | LDN v3 multisig |
| f1pkrmygbvweykpjcut36lf7ewgqdfhjklbhvepda | Protocol Labs ( project: Slingshot Evergreen ) | 7.54 TiB | 262 | LDN # 293 |

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

@newwebgroup

Fatman13 is an active developer in the community. The CID Checker results look good, and I am willing to support this application.


Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzaceb6tn7swvasajcoad67mhp4qg7fhrumhymvw5kgyd4yjov2ewbc6m

Address

f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

Datacap Allocated

250.00TiB

Signer Address

f1e77zuityhvvw6u2t6tb5qlnsegy2s67qs4lbbbq

Id

f9d46ada-0afc-461b-ae68-873792ae4d1d

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceb6tn7swvasajcoad67mhp4qg7fhrumhymvw5kgyd4yjov2ewbc6m

@herrehesse

It seems (given my crowded inbox) that in the past 15 minutes @newwebgroup has approved 10+ datacap requests.

I wonder if this blind approval is allowed?

@newwebgroup, I would like to see, for each project, what due diligence you did to justify the approval.


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacedelgc7kk7j2rifmqgw27ikag7vssjj4h5yiwzjscyf6lfiuewcqk

Address

f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

Datacap Allocated

250.00TiB

Signer Address

f1yjhnsoga2ccnepb7t3p3ov5fzom3syhsuinxexa

Id

f9d46ada-0afc-461b-ae68-873792ae4d1d

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedelgc7kk7j2rifmqgw27ikag7vssjj4h5yiwzjscyf6lfiuewcqk

@large-datacap-requests

large-datacap-requests bot commented Jan 11, 2023

DataCap Allocation requested

Request number 2

Multisig Notary address

f02049625

Client address

f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

DataCap allocation requested

500TiB

Id

3fe6adf8-dcda-45b0-b498-649ea3e9b410

@filplus-checker-app

DataCap and CID Checker Report Summary [1]

Storage Provider Distribution

⚠️ 14 storage providers sealed too much duplicate data - f01228087: 46.89%, f01228105: 49.12%, f01228100: 49.10%, f01228089: 49.63%, f01228008: 40.39%, f01228000: 39.84%, f0522948: 42.77%, f0867300: 42.20%, f01114587: 70.92%, f01228065: 44.97%, f01228009: 44.56%, f023651: 36.23%, f01114589: 30.99%, f01114827: 43.84%

⚠️ 3 storage providers have unknown IP location - f023651, f01114589, f020378

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients [2]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
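
The two footnotes describe a comment command with optional extra addresses. How the checker actually parses comments is not shown in this thread; the sketch below is a hypothetical parser for the syntax as written in the footnotes. The function name `parse_trigger` and the placeholder addresses in the usage lines are made up for illustration.

```python
TRIGGER = "checker:manualTrigger"


def parse_trigger(comment: str):
    """Return the list of extra addresses if the comment is a trigger, else None.

    Hypothetical helper: it treats the first whitespace-separated token as the
    command and every following token as a related address.
    """
    parts = comment.split()
    if not parts or parts[0] != TRIGGER:
        return None
    return parts[1:]


# Placeholder addresses, not real ones
print(parse_trigger("checker:manualTrigger"))
print(parse_trigger("checker:manualTrigger f1example1 f1example2"))
print(parse_trigger("Thanks for your request!"))
```

Returning an empty list for the bare command and `None` for any other comment keeps "triggered with no extra addresses" distinct from "not a trigger".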

@Fatman13
Contributor Author

checker:manualTrigger

@filplus-checker-app

DataCap and CID Checker Report Summary [1]

Storage Provider Distribution

⚠️ 14 storage providers sealed too much duplicate data - f01228087: 46.89%, f01228105: 49.12%, f01228100: 49.10%, f01228089: 49.63%, f01228008: 40.39%, f01228000: 39.84%, f0522948: 42.77%, f0867300: 42.20%, f01114587: 70.92%, f01228065: 44.97%, f01228009: 44.56%, f023651: 36.23%, f01114589: 30.99%, f01114827: 43.84%

⚠️ 3 storage providers have unknown IP location - f023651, f01114589, f020378

Deal Data Replication

✔️ Data replication looks healthy.

Deal Data Shared with other Clients [2]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the full report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...


Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacecsfvzjzte2z36gy4tff5pr6ylexvcs7mhyz4lbe5jlskoc5jbkpk

Address

f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

Datacap Allocated

1.95PiB

Signer Address

f1bp3tzp536edm7dodldceekzbsx7zcy7hdfg6uzq

Id

15444189-7201-4801-9184-285dca5b0318

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecsfvzjzte2z36gy4tff5pr6ylexvcs7mhyz4lbe5jlskoc5jbkpk

@laurarenpanda

Discussed this program and the future distribution plan with @Fatman13.
I am willing to support this round and hope to see significant improvement in the Checker report.

@mikezli

mikezli commented Mar 27, 2023

[Image attachment: 3ea47c1804310cc7d2c171c21bf18ab]


mikezli commented Mar 27, 2023

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaced2lzytsfzmcixdzxenpb3vlutjqmmxic663kh7dk4t44i3265vsk

Address

f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

Datacap Allocated

1.95PiB

Signer Address

f1dnb3uz7sylxk6emti3ififcvu3nlufnnsjui6ea

Id

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaced2lzytsfzmcixdzxenpb3vlutjqmmxic663kh7dk4t44i3265vsk

@large-datacap-requests

The issue has reached the total DataCap requested and should be closed.

@large-datacap-requests

Stats & Info for DataCap Allocation

Multisig Notary address

f02049625

Client address

f3qaipvxrz2gxexc7mcvjsxifmscfiw7c7zfhrmq76j5ee4hcbvg3gbtpea7wgz72kkjcjmwzhm5uo2onxyocq

Rule to calculate the allocation request amount

total dc reached

DataCap allocation requested

0

Total DataCap granted for client so far

1.8160790205001834e+267YiB

Datacap to be granted to reach the total amount requested by the client (5PiB)

-2.19B

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 95388 | 20 | 1.95PiB | 7.03 | 459.90TiB |

Client f0215074 does not follow the DataCap usage rules. More info here.
This application has been failing the requirements for 7 days.
Please take appropriate action to fix the following DataCap usage problems.

| Criteria | Threshold | Reason |
| --- | --- | --- |
| Percent of used DataCap stored with top provider | < 75 | The percent of data from the client that is stored with their top provider is 100%. This should be less than 75%. |
| Shared data percent | < 20% | 20.04% of the client's data is shared with other clients. This should be less than 20%. |
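
Both criteria in the table are strict upper bounds on a percentage. As a minimal sketch, assuming the comparison is a plain "observed value below threshold" test, the two failures can be reproduced as follows. The `Criterion` class is a hypothetical helper, not part of any Filecoin Plus tooling.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    """Hypothetical representation of one row of the criteria table."""
    name: str
    observed_pct: float
    threshold_pct: float  # observed value must be strictly below this

    def passes(self) -> bool:
        return self.observed_pct < self.threshold_pct


# Values copied from the table above
criteria = [
    Criterion("Percent of used DataCap stored with top provider", 100.0, 75.0),
    Criterion("Shared data percent", 20.04, 20.0),
]

for c in criteria:
    status = "pass" if c.passes() else "fail"
    print(f"{c.name}: {c.observed_pct}% vs < {c.threshold_pct}% -> {status}")
```

The shared-data criterion fails by only 0.04 percentage points, so a strict comparison and a rounded one would give different verdicts here.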
