Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Access to the original file in events #21

Closed
mitom opened this issue Jun 21, 2013 · 22 comments
Closed

Access to the original file in events #21

mitom opened this issue Jun 21, 2013 · 22 comments

Comments

@mitom
Copy link
Contributor

mitom commented Jun 21, 2013

Hey,

Could you also pass the original file to the events?

The reason is that the namer gets called before the events, so what we get is already a renamed file, so we don't know the original name. For me it would be important to know the exact filename (case-sensitive) and store it in the db with other properties, but rename the files to a reasonable one.

If the file is not chunked and there is only 1 in the request, this can be done by accessing it directly in the request, however it won't work for chunked files, or if there are more files/request.

It would be possible to manage the database thing in the namer too, but I'd need to fetch the entity again in postPersist to generate a proper response.

Let me know your opinion, maybe there is a better way. Thanks.

@sheeep
Copy link
Contributor

sheeep commented Jun 21, 2013

Could you also pass the original file to the events?

Hmm. Does not sound like a good idea to me. The time the POST_UPLOAD / POST_PERSIST event is fired, the original file does not exist anymore, as it was already moved. Even though we could keep the original object, you would get a Stat failed error when calling getSize and most likely run in other problems.

cf.: https://gist.github.com/sheeep/5833126

But I've seen now that the original object will be kept in the request object, so this works as expected:

var_dump($request->files->get('meep')->getClientOriginalName());

If the file is not chunked and there is only 1 in the request, this can be done by accessing it directly in the request, however it won't work for chunked files, or if there are more files/request.

If you are using the chunked upload controller, a Chunk will be treated like a single file, but given to the ChunkManager instead of directly calling the handleUpload method. And given the fact that every chunk is handled in a separate request you should theoretically be able to get the information from the request object. Moreover, every frontend that is implemented in this bundle is just sending one single file per request, and I think this is quite common in multiple file uploaders :)

Nonetheless this is a problem. A possible solution would be to introduce a PRE_UPLOAD event where the UploadedFile instance is passed and which will be fired before the namer is started. You could then extract the original name of the uploaded file and write it to the request object.

May I ask: What uploader frontend are you using? Some of them pass the filename and other properties in the request. And if you're using chunked uploads: Is the last chunk accessible?

@mitom
Copy link
Contributor Author

mitom commented Jun 21, 2013

Thank you for your answer!

I am using blueimp as frontend, and you were right, the last chunk as object is accessible, so the filename I can tell. However telling other information, such as mime-type depends on what storage are you using. If it is Gaufrette and an adapter that does not support metadata, you have no chance to tell the mime-type of the file then, except if you upload it, and stream it back immediately.

However, if I do

$file = $request->files->all()["files"][0];
var_dump($file->getSize());

on a not-chunked file in postupload/postpersist I can access it. The reason is, I am using Gaufrette, so the file is not moved, as it would be in your example, rather copied over.

If it is chunked, then the file coming from the request is the last chunk, which does not exist by then so it throws a FileNotFoundException, as it should.

Also blueimp offers you the option to upload multiple files in one request as documented here:
https://github.com/blueimp/jQuery-File-Upload/wiki/Options#singlefileuploads
I don't intend to use it, just thought it would be good for you to know.

At least I like the idea of a preUpload, it would definitely add more flexibility than just passing another thing to the current events.

@sheeep
Copy link
Contributor

sheeep commented Jun 21, 2013

If it is Gaufrette and an adapter that does not support metadata, you have no chance to tell the mime-type of the file then, except if you upload it, and stream it back immediately.

Is this the mime type sent by the browser? If so, I'd rather not rely on information sent from a client. Take a look at the AllowedMimeTypeValidationListener. You can determine the mime type by using the built in FileInfo extension. (edit: Given that the temporary file still exists)

The reason is, I am using Gaufrette, so the file is not moved, as it would be in your example, rather copied over.

This makes perfectly sense. But I think an implementation should behave the same (as far as it is possible) if you're using Gaufrette or the native filesystem storage.

Also blueimp offers you the option to upload multiple files in one request [...]

You're absolutely right! It seems I've implemented this unintentionally. :)

At least I like the idea of a preUpload [...]

Good to know. I'll implement it as soon as possible. Thanks for your detailed report!

@mitom
Copy link
Contributor Author

mitom commented Jun 21, 2013

Just to clarify, the mime type is not for validation, it is to tell if I should create a thumbnail or not, and to later on properly set the content type when the file is accessed.

Thanks for the quick reply.

sheeep added a commit that referenced this issue Jun 21, 2013
This event will be given an instance of UploadedFile instead of a moved file.
Addresses #21.
@sheeep
Copy link
Contributor

sheeep commented Jun 21, 2013

Let me know if this works for you. :) I also added a test to assure that it is possible to pass variables through the request object. (If you want to know how..)

@mitom
Copy link
Contributor Author

mitom commented Jun 22, 2013

Yes it does, also solves my problem. Thank you!

@sheeep sheeep closed this as completed Jun 22, 2013
@mitom
Copy link
Contributor Author

mitom commented Jun 24, 2013

Could you trim "0_" from the beginning of the reassembled file (it would be fine only in the object too)? I can access the correct name from the request, but it would be cleaner that way.

@sheeep
Copy link
Contributor

sheeep commented Jun 24, 2013

I don't think this is something we should have added to this bundle because this is part of your personal business-logic. Is the 0_ prefix something the blueimp-frontend is adding to the file name?

@mitom
Copy link
Contributor Author

mitom commented Jun 24, 2013

It is actually you.

The first chunk is called 0_filename and the rest are assembled to that, so the final file will be the same name.

I noticed your comment in the above function, try:

$chunks->getIterator()->getInnerIterator()->current();

Accessing the filename in the request gives the original, so blueimp does not append/prepend anything to it.

@sheeep
Copy link
Contributor

sheeep commented Jun 24, 2013

Ah I see. My bad! :) I'll have a look as soon as possible.

@sheeep sheeep reopened this Jun 24, 2013
@mitom
Copy link
Contributor Author

mitom commented Jun 24, 2013

Happens to the best of us ;) Thank you, yet again.

sheeep added a commit that referenced this issue Jun 24, 2013
Removed a rather hacky way of determine which of the chunks to use as a base, and replaced it with an Iterator as mentioned in #21.
sheeep added a commit that referenced this issue Jun 24, 2013
Until now, the reassembled file shared the name with the very first chunk uploaded.
It had the form {prefix}_{filename}. This commit fixes this issue and addresses #21 with it.
@sheeep
Copy link
Contributor

sheeep commented Jun 24, 2013

ChunkManager is now using an Iterator the way you mentioned and is also removing the prefix after assembling. Could you drop me a short note if this worked for you? :)

@mitom
Copy link
Contributor Author

mitom commented Jun 24, 2013

Yes, it gives back the correct name. Also looks better ;)

@mitom mitom closed this as completed Jun 24, 2013
@mitom
Copy link
Contributor Author

mitom commented Jul 25, 2013

The renaming broke pretty critically after 9dbd905.

In the following for better understanding:
base -> the part to which the chunk is be appended
chunk -> the last uploaded part

The base gets renamed from 0_filename to filename and in the next request it will be renamed from filename it still matches the regexp.

Here comes the real problem:
from foo_100.ext it will be 100.ext which will mess up the order of the chunks in the next request, so base and chunk will switch places and then the renaming will happen again to foo_100.ext - since the chunk's name was correctly generated. This means the 3rd request the order of base and chunk will be correct again and we will get a 100.ext file. So actually depending on the file-size (number of chunks) we can get out
foo_100.ext (odd amount of chunks) and 100.ext (even amount of chunks).

The last request should (I am assuming here) always be the last, as it is named by PHP_MAX_INT.

But the critical part is, from a file where we assume the order of chunks is 1-2-3-4, we end up getting a file where the order is 3-1-2-4. Md5sum definitely differs.

A solution could be to only call the rename with the last chunk - in the mean time users should be warned that the files uploaded in the meantime are probably broken.

@mitom mitom reopened this Jul 25, 2013
@sheeep
Copy link
Contributor

sheeep commented Jul 25, 2013

This is pretty critical. Let's see if I can give it a shot.

The base gets renamed from 0_filename to filename and in the next request it will be renamed from filename it still matches the regexp.

I guess the line which causes all these troubles is the following

$assembled = $assembled->move($base->getPath(), preg_replace('/^(.+?)_/', '', $base->getBasename()));

If I get you right, the file should not be renamed until it is the last chunk to assemble. Is this correct? If so, the best solution would be to add a parameter rename to the method ChunkManager::assembleChunks IMO.

@mitom
Copy link
Contributor Author

mitom commented Jul 25, 2013

I agree, that was my conclusion too. Happy to see your quick reaction. Just to be sure you could also switch the regexp to match only numbers, if something breaks again that will lower the "casualties".

sheeep pushed a commit that referenced this issue Jul 25, 2013
sheeep added a commit that referenced this issue Jul 25, 2013
Otherwise you will end up having a completly broken order of reassembling.
Addresses: #21
@sheeep
Copy link
Contributor

sheeep commented Jul 25, 2013

Renaming of chunks will now only happen if it is the last chunk. @mitom Can you verify this?

Just to be sure you could also switch the regexp to match only numbers [...]

Changed. Good point.

@mitom
Copy link
Contributor Author

mitom commented Jul 25, 2013

The md5sum of the original file and the uploaded file match now.

But I don't get this what if the load-distribution is off and the files need to be reassembled in the end?

@sheeep
Copy link
Contributor

sheeep commented Jul 25, 2013

I was a bit jazzed ;) Readded shortly after the push in 52865bd.

@mitom
Copy link
Contributor Author

mitom commented Jul 25, 2013

Then I believe this is once again fixed.

For the sakes of those who have un-replaceable files corrupted, it would be worth a try to reorder them, however it is really dependant on the original filename and size.

@mitom mitom closed this as completed Jul 25, 2013
@sheeep
Copy link
Contributor

sheeep commented Jul 25, 2013

Hmm. I can't think of any solution for reordering that would work in any case.

sheeep pushed a commit that referenced this issue Jul 25, 2013
@p365labs
Copy link

Hi, I'm testing the bundle with Symfony 2.1.6 and the last bundle tag version 0.9.7 with bluimp as frontend, but I have I problem with chunks upload.
The problem is that the bundle don't re-assemble the file leaving the chunks not assembled. I've tried to debug a bit and I've noticed that the problem is $iterator = $chunks->getIterator()->getInnerIterator(); in ChunkManager.php.
If I remove the getInnerIterator() leaving only getIterator() all works fine : the file is correctly assembled and the chunks removed. Am I doing something wrong ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants