Remove files from MongoDB after the associated document is deleted #6780
Comments
I think your suggested feature would be a useful addition to Parse Server. There have been many questions about the fact that files are not deleted from the file storage when the referencing Parse Object is deleted. I have come across solutions similar to the one you suggest, and they work as expected. Adding this as a feature would certainly be welcomed by the community. I think the 3 most commonly expected behaviors would be:
Actually c) covers b) but is more difficult to implement. If you want to start with b), it would make sense to explicitly define the classes (ideally in regex) in the Parse Server configuration, so that projects that use a If you want to shoot straight for c) you would most likely implement a file reference counter so that Parse Server knows when a file is not referenced anymore and deletes it. Whichever solution you go for, we'd be happy to review your PR.
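A minimal sketch of the file reference counter idea behind option c), assuming an in-memory map keyed by file name. The class `FileRefCounter` and its methods are hypothetical and not part of Parse Server; a real implementation would need to persist the counts in the database.

```javascript
// Hypothetical in-memory reference counter; NOT a Parse Server API.
class FileRefCounter {
  constructor() {
    this.counts = new Map(); // file name -> number of references
  }

  // Record one new reference to a file.
  addRef(fileName) {
    this.counts.set(fileName, (this.counts.get(fileName) || 0) + 1);
  }

  // Drop one reference. Returns true only when a tracked file has just
  // lost its last reference and can be deleted from storage.
  // Untracked files return false so they are never deleted by accident.
  removeRef(fileName) {
    const current = this.counts.get(fileName) || 0;
    if (current === 0) {
      return false;
    }
    if (current === 1) {
      this.counts.delete(fileName);
      return true;
    }
    this.counts.set(fileName, current - 1);
    return false;
  }
}

const counter = new FileRefCounter();
counter.addRef('avatar.png');
counter.addRef('avatar.png');
const afterFirst = counter.removeRef('avatar.png');  // one reference left
const afterSecond = counter.removeRef('avatar.png'); // last reference gone
const untracked = counter.removeRef('unknown.png');  // never tracked
```

Returning false for untracked files is a deliberate choice in this sketch: files that were uploaded before any counting started are left alone.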
Thanks for the input, I appreciate it. Indeed option c) seems to be best one to implement, but let me dive into the code first and see what I find there. I will probably start working on this next week.
Great, let me know whenever you need any guidance.
Really interesting addition to clean up GridFS or some linked object storage!
A "clean-up", as I understand it, would also delete unreferenced files retroactively from the file storage. That means scanning through the files in the storage and deleting the unreferenced ones. That's a (one-time) script with its own complexities, so I am not sure it should be mixed into this PR, unless someone comes up with an ingenious concept. Such a script existed for Parse.com; I wonder if it's still around somewhere or whether anyone brought it over after the shutdown.
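The core of such a retroactive scan is a set difference between the files in storage and the files still referenced by objects. A rough sketch, assuming both lists have already been gathered as plain arrays of file names; the function name `findUnreferencedFiles` is hypothetical, and listing the storage and scanning the database are left out.

```javascript
// Returns the names of stored files that no object references.
// Both inputs are plain arrays of file names; how they are collected
// (storage listing, database scan) is outside this sketch.
function findUnreferencedFiles(storedFileNames, referencedFileNames) {
  const referenced = new Set(referencedFileNames);
  return storedFileNames.filter((name) => !referenced.has(name));
}

const stored = ['a.png', 'b.png', 'c.pdf'];
const referenced = ['a.png', 'c.pdf', 'c.pdf']; // duplicates are harmless
const orphans = findUnreferencedFiles(stored, referenced);
console.log(orphans); // [ 'b.png' ]
```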
@vegidio Are you still working on this? If you need any guidance, please feel free to reach out.
I am happy to work on this if you're not @vegidio, as I've worked on similar features. How do we see the best approach to tackling this? Create a new class _FileObject and track references?
I am going to work on a PR that:
-Tracking of file references for cleanup
Although most of these features cover different issues, most of them depend on a _File class. We could easily add an option to delete the file on the destroy hook of a Parse Object, if the file references are 0. However, this provides some challenges for existing deployments as the _File object is created on create of a file, and the reference count would change on the create / update event of a
@Moumouls would you or @mtrezza have any suggestions around how to build the _File class for existing deployments? The only thing I can think of is creating an example script for users to run on update to the newest version that links _File objects with their respective parent objects.
Good question @dblythy, here are 2 options I see:
For continuous deployment, and to avoid a huge breaking change, we should maybe first try to support both systems (easy to detect via Parse.File type or a Parse.File pointer on a field). I'm in favor of a non-breaking change strategy to let developers migrate at their own speed 😃
Here is my point of view, @dblythy, on the next Parse.File system:
Note: I have an idea on how to support both systems at the same time easily without breaking everything.
Could we also deprecate the file triggers in place of It wouldn't be that it would be a breaking change for existing deployments, it's just that the "references" to files wouldn't be accurate on existing deployments, as the reference count would be stored on creation / update of a It would mean that without the correct "building" of a _File collection, some of the new features wouldn't work properly. For example: Remove files from MongoDB after file is deleted would require:
When a
The problem is that if the _File class is created on create of a Parse.File, how do we create the _File class for existing Parse Servers, when the files are kept in a number of different locations, and references could be complicated (e.g. they could have Files set as
So you'd be able to access I think for simplicity the _File class should be a special internal class that only has objectId, ACL, createdAt, updatedAt, createdBy, and maybe fileSize / fileType (isn't url dependent on publicServerURL?), and the only change to the SDKs should be the ability to set an ACL on a Parse.File. I think this way we can get these features built internally without causing any breaking changes or major updates. That's just my thoughts though.
Disclaimer: this is a feature request, but I'm willing to contribute by coding it myself since I already coded a similar solution in many of my projects that already use the Parse Server.
Is your feature request related to a problem? Please describe.
I've been using Parse for quite some time in different projects. When I create a document with File fields, a problem arises when the document is deleted, or when the field itself is removed or replaced, using one of the Parse SDKs (JS, Android, etc.): the reference to the file is removed, but the actual file remains. Over time the database fills up with orphan files that only occupy space.
Describe the solution you'd like
To solve this problem I created a couple of triggers in my Parse projects (Parse.Cloud.afterSave, Parse.Cloud.afterDelete, etc.) to properly remove the files whenever the document that they are associated with is deleted or when the field is removed/updated.
The way I implemented this was by creating an environment variable called DELETE_ORPHANS_FROM where I put the list of classes that should have the files removed when a document is deleted/updated. Like this: DELETE_ORPHANS_FROM=Post,Comment. So in my cloud code I read the content of this environment variable and send a Parse.Cloud.httpRequest to delete any orphan file in the classes that I listed there.
The orphan files in the classes that are not listed in this variable are of course untouched. This gives flexibility to the users if for some reason they want to keep the orphan files in certain classes.
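A rough sketch of the two pure steps in this approach: reading the class list from DELETE_ORPHANS_FROM, and working out which files a change has orphaned. It is not the original cloud code. The helper names are hypothetical, only top-level fields are inspected, and File values are assumed to use the Parse JSON encoding with `__type: 'File'` and a `name`.

```javascript
// Parse a comma-separated class list such as "Post,Comment".
function parseOrphanClasses(value) {
  return (value || '')
    .split(',')
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}

// Collect the names of all File values among an object's top-level fields.
// Assumes the JSON encoding { __type: 'File', name, url }.
function fileNamesOf(objectJson) {
  return Object.values(objectJson || {})
    .filter((v) => v !== null && typeof v === 'object' && v.__type === 'File')
    .map((v) => v.name);
}

// Files referenced before the change but not after it.
// Pass undefined as `updated` when the whole object was deleted.
function orphanedFileNames(original, updated) {
  const stillUsed = new Set(fileNamesOf(updated));
  return fileNamesOf(original).filter((name) => !stillUsed.has(name));
}

// Example data; the URLs are placeholders.
const classes = parseOrphanClasses('Post, Comment');
const before = {
  title: 'Hi',
  image: { __type: 'File', name: 'old.png', url: 'http://example.invalid/old.png' },
};
const after = {
  title: 'Hi',
  image: { __type: 'File', name: 'new.png', url: 'http://example.invalid/new.png' },
};
const replaced = orphanedFileNames(before, after);   // field was replaced
const deleted = orphanedFileNames(after, undefined); // object was deleted
```

In Cloud Code these helpers would presumably be called from the Parse.Cloud.afterSave and Parse.Cloud.afterDelete triggers for each class in the list. Note that this sketch does not check whether another object still references the same file.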
Additional context
Before I start working on this I would like to hear from the maintainers and other members of the community what they think of this new feature. Maybe there is a reason why it has not been implemented yet. Or maybe the solution that I'm proposing is not optimal, despite the fact that it works in my projects.
Please let me know your thoughts and depending on the feedback I will start working on this.