[feature] Add instance-stats-randomize config option #3718

Merged: 2 commits, Jan 31, 2025

tsmethurst
Contributor

Description

If this is a code change, please include a summary of what you've coded, and link to the issue(s) it closes/implements.

If this is a documentation change, please briefly describe what you've changed and why.

Adds a config option instance-stats-randomize to randomize user + post stats at /api/v1|v2/instance and /nodeinfo/2.0 endpoints. Does not affect web views. Stats are re-randomized every hour thanks to the ttlcache.

Checklist

Please put an x inside each checkbox to indicate that you've read and followed it: [ ] -> [x]

If this is a documentation change, only the first checkbox must be filled (you can delete the others if you want).

  • I/we have read the GoToSocial contribution guidelines.
  • I/we have discussed the proposed changes already, either in an issue on the repository, or in the Matrix chat.
  • I/we have not leveraged AI to create the proposed changes.
  • I/we have performed a self-review of added code.
  • I/we have written code that is legible and maintainable by others.
  • I/we have commented the added code, particularly in hard-to-understand areas.
  • I/we have made any necessary changes to documentation.
  • I/we have added tests that cover new code.
  • I/we have run tests and they pass locally with the changes.
  • I/we have run go fmt ./... and golangci-lint run.

@tsmethurst
Contributor Author

Hurray!

@tsmethurst tsmethurst merged commit a55bd6d into main Jan 31, 2025
4 checks passed
@tsmethurst tsmethurst deleted the instance_stats_randomize branch January 31, 2025 18:27
@rimu

rimu commented Feb 1, 2025

Tools like FediDB rely on these statistics to make pretty graphs. If you want to hide the number of users, consider setting it to zero instead?

@martijndeb
Contributor

> Tools like FediDB rely on these statistics to make pretty graphs. If you want to hide the number of users, consider setting it to zero instead?

I support this. Providing faked statistics creates noise in the counting, and some promotion of the fediverse relies on those numbers. Opinions may differ on whether that's a good thing, but intentionally corrupting data is never a great idea.

Whilst I do support the idea of the PR, I think that providing 0 instead of random data is the better option, as it basically creates an opt-out situation.

@shleeable

shleeable commented Feb 2, 2025

Shouldn't the randomized number be between 0 and the actual number, so that this isn't acting as a bad actor?

Between 0 and 9223372036854775807 is a bit of a stretch?

@tsmethurst
Contributor Author

> Between 0 and 9223372036854775807 is a bit of a stretch?

The number is limited between 0 and 1,000,000 (for users), and 0 and 10,000,000 (for statuses):

https://github.com/superseriousbusiness/gotosocial/pull/3718/files#diff-7b929d08989a3f96f39b47a817eff54d4f0a0d3e1df866144bd20f652d73b2b7R73-R76

Still ridiculous, but not enough to break json parsing etc.

Re: other concerns about graphs etc.: GtS has shipped a restrictive robots.txt file disallowing crawling of /.well-known and /api endpoints for years now. The fediverse.observer dev agreed three years ago that this should be taken into account. Crawlers can easily avoid putting wonky data in their data sets by respecting robots.txt. If they want accurate user + status counts, they can parse the HTML of /, once we add a way for instance admins to mark the whole instance as discoverable and thereby remove the robots noindex meta tag from the homepage (see #776 (comment)). Crawlers can also just avoid crawling GoToSocial instances entirely, which is what fedidb and fediverse.observer both do now.
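As a rough idea of what respecting robots.txt means on the crawler side, here is a deliberately simplified check. The sample file is illustrative rather than the exact file GtS serves, the helper names are hypothetical, and the parser only handles `Disallow` prefixes in the `User-agent: *` group (no `Allow` rules or wildcards).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// disallowedPrefixes collects Disallow paths from the "User-agent: *" group.
func disallowedPrefixes(robots string) []string {
	var prefixes []string
	inWildcardGroup := false
	sc := bufio.NewScanner(strings.NewReader(robots))
	for sc.Scan() {
		line := sc.Text()
		if i := strings.Index(line, "#"); i >= 0 {
			line = line[:i] // drop trailing comments
		}
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		key = strings.ToLower(strings.TrimSpace(key))
		val = strings.TrimSpace(val)
		switch key {
		case "user-agent":
			inWildcardGroup = val == "*"
		case "disallow":
			if inWildcardGroup && val != "" {
				prefixes = append(prefixes, val)
			}
		}
	}
	return prefixes
}

// mayCrawl reports whether path falls outside every disallowed prefix.
func mayCrawl(prefixes []string, path string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(path, p) {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative robots.txt, assumed for this example.
	robots := "User-agent: *\nDisallow: /api/\nDisallow: /.well-known/\n"
	prefixes := disallowedPrefixes(robots)
	for _, path := range []string{"/api/v1/instance", "/.well-known/nodeinfo", "/"} {
		fmt.Println(path, mayCrawl(prefixes, path))
	}
}
```

A crawler that runs a check like this before fetching never requests the stats endpoints, so it never ingests the randomized numbers.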

Will consider changing the setting so that admins can choose between serving actual numbers, serving 0, and serving random numbers though, that's a nice idea :)
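One way to picture the three-way setting floated above is a small mode switch. Everything here is hypothetical: the mode names `serve`, `zero`, and `random`, the `statsMode` type, and `reportedCount` are invented for illustration and are not taken from the PR or the follow-up issue.

```go
package main

import (
	"fmt"
)

// statsMode is a hypothetical three-way setting.
type statsMode string

const (
	statsModeServe  statsMode = "serve"  // report the actual count
	statsModeZero   statsMode = "zero"   // always report 0
	statsModeRandom statsMode = "random" // report a generated count
)

// reportedCount picks the number to expose for the given mode.
// randomize is only invoked in random mode.
func reportedCount(mode statsMode, actual int64, randomize func() int64) (int64, error) {
	switch mode {
	case statsModeServe:
		return actual, nil
	case statsModeZero:
		return 0, nil
	case statsModeRandom:
		return randomize(), nil
	default:
		return 0, fmt.Errorf("unknown stats mode %q", mode)
	}
}

func main() {
	fake := func() int64 { return 123456 } // fixed stand-in for a random generator
	for _, m := range []statsMode{statsModeServe, statsModeZero, statsModeRandom} {
		n, err := reportedCount(m, 42, fake)
		fmt.Println(m, n, err)
	}
}
```

Rejecting unknown values keeps a typo in the config from silently falling back to one of the three behaviours.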

@tsmethurst
Contributor Author

tsmethurst commented Feb 2, 2025

Anyway I'm going to lock this discussion since a now-closed PR is not really the place for it, but I'll open a new issue for the more granular setting :) EDIT: Here's the issue: #3723

@superseriousbusiness superseriousbusiness locked and limited conversation to collaborators Feb 2, 2025