Limit how much data we retrieve for a given CID #16

bajtos · 2023-09-05T16:11:45Z

At the moment, the SPARK node tries to retrieve the entire content of a CID, regardless of its size. Some CIDs represent gigabytes of data.

IMO, this is a problem - we don't want Stations to use so much bandwidth.

It also creates a problem in spark-api, where we currently represent byte_length as a 32-bit signed integer, which overflows just above 2 GiB (the maximum value is 2,147,483,647).

2023-09-05T15:54:19Z app[17814d5b527638] cdg [info]error: value "2753993443" is out of range for type integer
2023-09-05T15:54:19Z app[17814d5b527638] cdg [info]    at /app/node_modules/pg-pool/index.js:45:11
2023-09-05T15:54:19Z app[17814d5b527638] cdg [info]    at runMicrotasks (<anonymous>)
2023-09-05T15:54:19Z app[17814d5b527638] cdg [info]    at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-09-05T15:54:19Z app[17814d5b527638] cdg [info]    at async setRetrievalResult (file:///app/index.js:74:5)
2023-09-05T15:54:19Z app[17814d5b527638] cdg [info]    at async handler (file:///app/index.js:12:5)
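
To make the overflow concrete, here is a minimal sketch that checks the byte length from the log against the 32-bit signed range. The helper name `fitsInInt32` is hypothetical and is not part of spark-api.

```js
// Largest value a 32-bit signed integer can hold.
const INT32_MAX = 2 ** 31 - 1 // 2147483647
const INT32_MIN = -(2 ** 31)

// Hypothetical helper: would this value fit into a 32-bit signed column?
function fitsInInt32 (n) {
  return Number.isInteger(n) && n >= INT32_MIN && n <= INT32_MAX
}

const reported = 2753993443 // the value rejected in the log above

console.log(fitsInInt32(reported)) // false
console.log(reported - INT32_MAX) // 606509796 bytes over the limit
console.log(Number.isSafeInteger(reported)) // true, so a JS number is not the bottleneck
```

The value is well within what a JavaScript number can represent exactly, so the limit comes only from the column type. A 64-bit integer type (for example `bigint` in PostgreSQL) would hold it, although that alone does not address the bandwidth concern.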

I am proposing to introduce a new retrieval error status - content too large.
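
A rough sketch of how such a status could be produced while reading the response, assuming the content arrives as an async iterable of byte chunks. The names `readWithLimit`, `CONTENT_TOO_LARGE`, and the 200 MiB limit are placeholders for illustration, not the values or identifiers used in the SPARK module.

```js
// Hypothetical limit; the real value is a design decision for the module.
const MAX_BYTES = 200 * 1024 * 1024

// Count bytes as chunks arrive and stop as soon as the limit is exceeded.
// `chunks` is any async iterable yielding Uint8Array values.
async function readWithLimit (chunks, maxBytes = MAX_BYTES) {
  let byteLength = 0
  for await (const chunk of chunks) {
    byteLength += chunk.byteLength
    if (byteLength > maxBytes) {
      // Returning from `for await` closes the iterator, so no more chunks are pulled.
      return { status: 'CONTENT_TOO_LARGE', byteLength }
    }
  }
  return { status: 'OK', byteLength }
}
```

With this shape, the reported byte_length can exceed the limit by at most one chunk, so it stays far below the 32-bit range as long as the limit itself does.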

Tasks

Implement the limit in SPARK module
Release a new version of SPARK & Station Core & Station Desktop
Update SPARK API to record the new measurement field flagging CARs too large
Update Grafana charts to handle this new flag
juliangruber · 2023-09-07T11:43:33Z

Spark clients should be allowed to abort a retrieval if it is too large, without getting penalized. Ideally, they then won't even report the result. However, Spark shouldn't have a problem with retrieval testing for large CIDs: a result is a result, and it is useful.

I therefore think we want the solution to be on the Station module side: the module should abort the request, and it should not be penalized for not reporting a large retrieval.
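
One way the module-side abort could look, sketched with the standard `AbortController` and Fetch APIs. The function name `fetchUpTo` and the injectable `fetchImpl` parameter are assumptions made for illustration; the actual Station module may be structured differently.

```js
// Download at most `maxBytes`, then cancel the underlying request.
// `fetchImpl` is injectable so the sketch can be exercised without a network.
async function fetchUpTo (url, maxBytes, fetchImpl = globalThis.fetch) {
  const controller = new AbortController()
  const res = await fetchImpl(url, { signal: controller.signal })
  const reader = res.body.getReader()
  let received = 0
  while (true) {
    const { done, value } = await reader.read()
    if (done) return { aborted: false, received }
    received += value.byteLength
    if (received > maxBytes) {
      // Signal the HTTP layer to stop the transfer; nothing more is read.
      controller.abort()
      return { aborted: true, received }
    }
  }
}
```

The caller can then decide what to do when `aborted` is true, for example skipping the report altogether as suggested above.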

bajtos added this to Space Meridian Sep 5, 2023

bajtos moved this to 📥 todo in Space Meridian Sep 5, 2023

bajtos mentioned this issue Oct 3, 2023

feat: limit max download size + calculate SHA-256 checksum #28

Merged

bajtos moved this from 📥 todo to 🏗 in progress in Space Meridian Oct 3, 2023

bajtos mentioned this issue Oct 3, 2023

SPARK Public Launch at LabWeek23 space-meridian/roadmap#46

Closed

bajtos self-assigned this Oct 12, 2023

bajtos mentioned this issue Oct 16, 2023

SPARK Roadmap space-meridian/roadmap#47

Open

bajtos moved this from 🏗 in progress to 🧊 icebox in Space Meridian Oct 23, 2023

bajtos mentioned this issue Nov 6, 2023

feat: new measurement field car_too_large filecoin-station/spark-api#128

Merged

bajtos moved this from 🧊 icebox to 📥 todo in Space Meridian Nov 6, 2023

bajtos closed this as completed Nov 29, 2023

github-project-automation bot moved this from 📥 todo to ✅ done in Space Meridian Nov 29, 2023
