This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

[DataCap Application] USC Shoah Foundation #53

Closed
galen-mcandrew opened this issue Sep 30, 2021 · 86 comments

@galen-mcandrew
Collaborator

galen-mcandrew commented Sep 30, 2021

Large Dataset Notary Application

To apply for a DataCap allocation for your dataset, please fill out the following information.

Core Information

  • Organization Name: USC Shoah Foundation
  • Website / Social Media: http://sfi.usc.edu/ & https://www.youtube.com/user/USCShoahFoundation
  • Total amount of DataCap being requested (between 500 TiB and 5 PiB): 5 PiB
  • Weekly allocation of DataCap requested (usually between 1-100TiB): 100TiB
  • On-chain address for first allocation: f17g7h52bsi53rb263xwne573dusskit4mieqkgry

Please respond to the questions below in paragraph form, replacing the text saying "Please answer here". Include as much detail as you can in your answer!

Project details

Share a brief history of your project and organization.

USC Shoah Foundation – The Institute for Visual History and Education develops empathy, understanding and respect through testimony, using its Visual History Archive of more than 55,000 video testimonies, award-winning IWitness education program, and the Center for Advanced Genocide Research. USC Shoah Foundation's interactive programming, research and materials are accessed in museums and universities, cited by government leaders and NGOs, and taught in classrooms around the world. Now in its third decade, USC Shoah Foundation reaches millions of people on six continents from its home at the Dornsife College of Letters, Arts and Sciences at the University of Southern California.

What is the primary source of funding for this project?

Filecoin Foundation for the Decentralized Web (FFDW)

What other projects/ecosystem stakeholders is this project associated with?

Starling Labs

Use-case details

Describe the data being stored onto Filecoin

Digital Library of Survivor Testimonies: a compilation of audiovisual content from Holocaust and genocide survivors. The majority of the data consists of lossless copies of the collected material, but some of the dataset consists of lower-quality replicas of the content.

Where was the data in this dataset sourced from?

Audiovisual content was recorded by volunteers and is stored on tape drives at the University of Southern California. It consists of live interviews with survivors of the Holocaust and genocide. Maintaining the integrity of the original content is extremely important, and USC constantly runs fixity checks on the content.

Can you share a sample of what is in the dataset? A link to a file, an image, a table, etc., are good examples of this.

A set of testimonies is available on YouTube and viewable by anyone: https://www.youtube.com/playlist?list=PLWIFgIFN2QqiDdkA-MXpsvZOSvTYkEGsL. 

Confirm that this is a public dataset that can be retrieved by anyone on the Network (i.e., no specific permissions or access rights are required to view the data).

Much of the data is already publicly viewable via the Visual History Archive (https://vhaonline.usc.edu/). Some of the data is to remain private for a period of time at the request of the interviewees.

What is the expected retrieval frequency for this data?

This is primarily for long-term archiving. Viewing and retrieval of this content should primarily happen through the Visual History Archive (https://vhaonline.usc.edu/).

For how long do you plan to keep this dataset stored on Filecoin? Will this be a permanent archival or a one-time storage deal?

Yes, permanent archival. 

DataCap allocation plan

In which geographies do you plan on making storage deals?

We will be prioritizing making deals globally in any geography where our content can be legally stored by storage providers. 

What is your expected data onboarding rate? How many deals can you make in a day, in a week? How much DataCap do you plan on using per day, per week?

The current plan is to use offline data transfer mechanisms that will enable 100s of terabytes of content to be stored on a weekly basis. We hope to have access to at least 100TiB of DataCap per week.
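
As a rough check on the figures in this application, the requested total (5 PiB) and the requested weekly rate (100TiB) imply roughly a year of onboarding. The sketch below assumes binary units (1 PiB = 1024 TiB) and a constant weekly rate; the function name is illustrative only and is not part of any Filecoin tooling.

```python
TIB_PER_PIB = 1024  # assumption: binary units, 1 PiB = 1024 TiB


def weeks_to_onboard(total_pib: float, weekly_tib: float) -> float:
    """Weeks needed to consume total_pib of DataCap at a constant weekly_tib rate."""
    if weekly_tib <= 0:
        raise ValueError("weekly_tib must be positive")
    return total_pib * TIB_PER_PIB / weekly_tib


# Figures taken from the Core Information section of this application.
weeks = weeks_to_onboard(total_pib=5, weekly_tib=100)
print(f"{weeks:.1f} weeks")  # 51.2 weeks
```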

How will you be distributing your data to miners? Is there an offline data transfer process?

Yes, there is going to be an offline data transfer process either through hosting files online where storage providers can download them or (where logistically feasible) through the distribution of content on physical drives. 

How do you plan on choosing the miners with whom you will be making deals? This should include a plan to ensure the data is retrievable in the future both by you and others.

We would like to work with several reputable large-scale storage provider operations to ensure geo-distribution and reliability of storage.  

How will you be distributing data and DataCap across miners storing data?

This project aims to onboard 4+PiB of original data, for which we’d like to store multiple replicas (2-5) with separate storage providers for each replica. Deals will be structured to be as close to sector size as possible for a storage provider. 
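
The replica plan above can be turned into a stored-volume range with simple arithmetic. This is a minimal sketch that assumes each replica is stored through its own deals and therefore counts separately toward the stored volume; it uses 4 PiB as the lower bound implied by "4+PiB", and the function name is hypothetical.

```python
def replicated_footprint_pib(original_pib: float, replicas: int) -> float:
    """Total stored volume when every replica is a full copy of the original data."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return original_pib * replicas


# Replica range (2-5) and original size (4+ PiB) as stated in the application.
low = replicated_footprint_pib(4, 2)
high = replicated_footprint_pib(4, 5)
print(f"{low:.0f} to {high:.0f} PiB stored across all replicas")
```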
@large-datacap-requests

Thanks for your request!
Everything looks good. 👌

A Governance Team member will review the information provided and get back to you soon.

@galen-mcandrew
Collaborator Author

@starling-admin Here is the new large dataset application issue, per the new LDN process.

@galen-mcandrew
Collaborator Author

Multisig Notary requested

Total DataCap requested

5PiB

Expected weekly DataCap usage rate

100TiB

@large-datacap-requests

Multisig created and sent to RKH f01322626

@large-datacap-requests

DataCap Allocation requested

Multisig Notary address

f01322626

Client address

f3w5fx6wta4ewl2iyf7xcogmzffz2fmrngpzdpduj3xmk3dwjxc6dyq36gdf3rflkkrblh5nci5xymc5hal3qq

DataCap allocation requested

50TiB


Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacebmb5eshdszn6mike6iohpbstp7yin4evmiaszriuu2ccxlpimjpo

Address

f3w5fx6wta4ewl2iyf7xcogmzffz2fmrngpzdpduj3xmk3dwjxc6dyq36gdf3rflkkrblh5nci5xymc5hal3qq

Datacap Allocated

50TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacebmb5eshdszn6mike6iohpbstp7yin4evmiaszriuu2ccxlpimjpo


dannyob commented Oct 28, 2021

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzaceav3tvsfrq57kzh4ldst2uynaf6ukbtpljyvcavsw7dmqtvkcr4yy

Address

f3w5fx6wta4ewl2iyf7xcogmzffz2fmrngpzdpduj3xmk3dwjxc6dyq36gdf3rflkkrblh5nci5xymc5hal3qq

Datacap Allocated

50TiB

Signer Address

f1k6wwevxvp466ybil7y2scqlhtnrz5atjkkyvm4a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzaceav3tvsfrq57kzh4ldst2uynaf6ukbtpljyvcavsw7dmqtvkcr4yy


dannyob commented Oct 28, 2021

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecqgfrguyxw33sg5saijaj2jtbkcwzxfcmyhslznvzesvx7rtamzc

Address

f3w5fx6wta4ewl2iyf7xcogmzffz2fmrngpzdpduj3xmk3dwjxc6dyq36gdf3rflkkrblh5nci5xymc5hal3qq

Datacap Allocated

50TiB

Signer Address

f1k6wwevxvp466ybil7y2scqlhtnrz5atjkkyvm4a

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecqgfrguyxw33sg5saijaj2jtbkcwzxfcmyhslznvzesvx7rtamzc

@dkkapur
Collaborator

dkkapur commented Oct 29, 2021

^ this went as a proposal, looks like a dup. I'm going to remove the ready to sign label here since the allocation was already made.

@starling-admin

@dkkapur + the Datacap notaries,

We'd like to move the next tranche of our approved DataCap allocation to this wallet: f17g7h52bsi53rb263xwne573dusskit4mieqkgry

We are actively sealing deals and ready to scale.

Thanks!

@galen-mcandrew
Collaborator Author

According to glif, address f3w5fx6wta4ewl2iyf7xcogmzffz2fmrngpzdpduj3xmk3dwjxc6dyq36gdf3rflkkrblh5nci5xymc5hal3qq has 14 TiB remaining. With the initial allocation of 50TiB, that means the next allocation should kick off at ~12 TiB.

Checking lotus, seeing 2 pending transactions with only 1 approval, so I want to check in with the team before I make any changes to the client address.

@fabriziogianni7 @ialberquilla
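
The threshold arithmetic in the comment above can be sketched as follows. The thread only gives the figures (50TiB allocated, 14 TiB remaining, next allocation at about 12 TiB); the sketch assumes, as a labeled assumption rather than something stated here, that a follow-up allocation is triggered once the remaining DataCap falls to about a quarter of the previous allocation. `THRESHOLD_FRACTION` and both function names are hypothetical.

```python
THRESHOLD_FRACTION = 0.25  # assumption: trigger when ~25% of the last allocation remains


def trigger_level_tib(previous_allocation_tib: float,
                      fraction: float = THRESHOLD_FRACTION) -> float:
    """Remaining DataCap (TiB) at or below which the next allocation would start."""
    return previous_allocation_tib * fraction


def next_allocation_due(remaining_tib: float,
                        previous_allocation_tib: float,
                        fraction: float = THRESHOLD_FRACTION) -> bool:
    """True once the remaining DataCap has dropped to the trigger level."""
    return remaining_tib <= trigger_level_tib(previous_allocation_tib, fraction)


level = trigger_level_tib(50)      # 12.5 TiB, close to the "~12 TiB" quoted above
due = next_allocation_due(14, 50)  # 14 TiB remaining is still above the trigger level
print(level, due)
```

Under this assumption, the 14 TiB balance quoted in the comment would not yet trigger a new allocation, which matches the comment's wording that the next allocation "should kick off" only at roughly 12 TiB.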

@galen-mcandrew
Collaborator Author

Additionally flagging for notaries:
@starling-admin according to a quick audit (https://filplus.d.interplanetary.one/clients/f0700600/breakdown), while you have worked with 20 storage providers, the overwhelming majority of your deal-making (84%) has been with a single storage provider.

Per your application and the large dataset process, can you provide some more details about your deal distribution plan?

@starling-admin

starling-admin commented Dec 17, 2021 via email

@large-datacap-requests

large-datacap-requests bot commented Dec 17, 2021

DataCap Allocation requested

Multisig Notary address

f01322626

Client address

f17g7h52bsi53rb263xwne573dusskit4mieqkgry

DataCap allocation requested

100TiB

@large-datacap-requests

Stats for DataCap Allocation

Multisig Notary address

f01322626

Client address

f3w5fx6wta4ewl2iyf7xcogmzffz2fmrngpzdpduj3xmk3dwjxc6dyq36gdf3rflkkrblh5nci5xymc5hal3qq

Last two approvers

dannyob & dannyob

DataCap allocation requested

100TiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 2762 | 20 | 50TiB | 84.06% | 11.9TiB |
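
To make the concentration in the stats row above concrete, the sketch below converts the top-provider figure into an approximate deal count. It assumes that "Top provider" is a percentage of deals by count (it could equally be a share of data size, which the report does not say); the function name is illustrative only.

```python
def implied_top_provider_deals(total_deals: int, top_provider_percent: float) -> int:
    """Approximate number of deals held by the top provider, rounded to a whole deal."""
    return round(total_deals * top_provider_percent / 100)


total_deals = 2762   # "Number of deals" from the stats row
providers = 20       # "Number of storage providers" from the stats row

top = implied_top_provider_deals(total_deals, 84.06)
others = total_deals - top
print(f"top provider: ~{top} deals; remaining {providers - 1} providers: ~{others} deals")
```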

@large-datacap-requests

DataCap Allocation requested

Request number 8

Multisig Notary address

f02049625

Client address

f17g7h52bsi53rb263xwne573dusskit4mieqkgry

DataCap allocation requested

400TiB

Id

6fad9f7c-2f64-43fb-826c-c8515a728dac

@large-datacap-requests

Stats & Info for DataCap Allocation

Multisig Notary address

f01858410

Client address

f17g7h52bsi53rb263xwne573dusskit4mieqkgry

Rule to calculate the allocation request amount

400% of weekly dc amount requested

DataCap allocation requested

400TiB

Total DataCap granted for client so far

7.275957614183434e+80YiB

Datacap to be granted to reach the total amount requested by the client (5 PiB)

7.275957614183434e+80YiB

Stats

| Number of deals | Number of storage providers | Previous DC Allocated | Top provider | Remaining DC |
| --- | --- | --- | --- | --- |
| 129574 | 13 | 800TiB | 13.85% | 201.00TiB |
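
The allocation rule quoted in this report ("400% of weekly dc amount requested") reduces to one multiplication. A minimal sketch, using the 100TiB weekly figure from the application; the function name is hypothetical and this is not the bot's actual code.

```python
def allocation_request_tib(weekly_tib: float, percent_of_weekly: float) -> float:
    """Allocation size expressed as a percentage of the weekly DataCap request."""
    return weekly_tib * percent_of_weekly / 100


requested = allocation_request_tib(weekly_tib=100, percent_of_weekly=400)
print(f"{requested:.0f}TiB")  # 400TiB, matching the amount in the report
```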

@cryptowhizzard

checker:manualTrigger

@filplus-checker-app
Copy link

DataCap and CID Checker Report Summary [1]

Retrieval Statistics

  • Overall Graphsync retrieval success rate: 13.69%
  • Overall HTTP retrieval success rate: 0.00%
  • Overall Bitswap retrieval success rate: 0.00%

Storage Provider Distribution

✔️ Storage provider distribution looks healthy.

Deal Data Replication

⚠️ 88.51% of deals are for data replicated across fewer than 4 storage providers.

Deal Data Shared with other Clients [2]

⚠️ CID sharing has been observed. (Top 3)

Full report

Click here to view the CID Checker report.
Click here to view the Retrieval report.

Footnotes

  1. To manually trigger this report, add a comment with text checker:manualTrigger

  2. To manually trigger this report with deals from other related addresses, add a comment with text checker:manualTrigger <other_address_1> <other_address_2> ...
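
A replication figure like the one in the report above could be derived along the following lines. This is only a sketch under stated assumptions: it counts deals rather than bytes, treats each piece as identified by a single ID, and runs on made-up sample deals. How the checker actually computes its percentage is not described in this thread, and every name and value below is hypothetical.

```python
from collections import defaultdict


def under_replicated_share(deals, min_providers=4):
    """Fraction of deals whose piece is stored with fewer than min_providers distinct providers.

    deals is a list of (piece_id, provider_id) pairs.
    """
    if not deals:
        return 0.0
    providers_by_piece = defaultdict(set)
    for piece_id, provider_id in deals:
        providers_by_piece[piece_id].add(provider_id)
    low = sum(1 for piece_id, _ in deals
              if len(providers_by_piece[piece_id]) < min_providers)
    return low / len(deals)


# Hypothetical sample data, not taken from the report.
sample_deals = [
    ("piece-a", "sp1"), ("piece-a", "sp2"), ("piece-a", "sp3"), ("piece-a", "sp4"),
    ("piece-b", "sp1"), ("piece-b", "sp2"),
    ("piece-c", "sp1"), ("piece-c", "sp1"),  # two deals with the same provider
]
share = under_replicated_share(sample_deals)
print(f"{share:.2%} of deals are under-replicated")
```

Repeated deals with the same provider do not raise the replica count here, since providers are collected in a set per piece.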

@cryptowhizzard

PiKNiK, in collaboration with USC Shoah, kindly requested our attention to review and sign the LDN. Their data exhibits exceptional value and holds significance for the ecosystem. To ensure wider acceptance of the CID report, it is recommended to distribute it to one additional global service provider.


Request Proposed

Your Datacap Allocation Request has been proposed by the Notary

Message sent to Filecoin Network

bafy2bzacedwvpuzrvf4eunpl47cwccck3hd5ovouxxylxjgwtv4o45nwejyg4

Address

f17g7h52bsi53rb263xwne573dusskit4mieqkgry

Datacap Allocated

400.00TiB

Signer Address

f1krmypm4uoxxf3g7okrwtrahlmpcph3y7rbqqgfa

Id

6fad9f7c-2f64-43fb-826c-c8515a728dac

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacedwvpuzrvf4eunpl47cwccck3hd5ovouxxylxjgwtv4o45nwejyg4

Contributor

xinaxu commented Jul 6, 2023

Request Approved

Your Datacap Allocation Request has been approved by the Notary

Message sent to Filecoin Network

bafy2bzacecaamvtmuetlnrxxond3mmdhjvn4ivi4rbcjd42wprkq6pt6cc5xc

Address

f17g7h52bsi53rb263xwne573dusskit4mieqkgry

Datacap Allocated

400.00TiB

Signer Address

f1k3ysofkrrmqcot6fkx4wnezpczlltpirmrpsgui

Id

6fad9f7c-2f64-43fb-826c-c8515a728dac

You can check the status of the message here: https://filfox.info/en/message/bafy2bzacecaamvtmuetlnrxxond3mmdhjvn4ivi4rbcjd42wprkq6pt6cc5xc

@github-actions

This application has not seen any responses in the last 10 days. This issue will be marked with Stale label and will be closed in 4 days. Comment if you want to keep this application open.

@github-actions github-actions bot added the Stale label Jul 21, 2023
@github-actions

This application has not seen any responses in the last 14 days, so for now it is being closed. Please feel free to contact the Fil+ Gov team to re-open the application if it is still being processed. Thank you!


16 participants