TEZ-4295: Could not decompress data. Buffer length is too small. #130
🎊 +1 overall
This message was automatically generated.
Hi, Regards
Thanks @tprelle for letting us know! It's very strange: I haven't been able to reproduce it with a unit test, which is why I only fixed CodecUtils.getDecompressedInputStreamWithBufferSize. Let me think about this again before committing the fix. Could you share some details about your workload? What exception trace did you get (the same as reported in the Jira)? What's your upstream application: Hive, or something else? I'm 100% for including the fix for every compressor/decompressor instantiation (including the changes in IFile you mentioned); I just want to cover it with a unit test to prevent regressions in the future. UPDATE: I've just reproduced it from Hive. I'll update the PR once I've reproduced this problem from a UT.
@tprelle: could you please try the latest commit? 7ffc783#diff-7ae0f0c86feac929c1490f9938cad5c25d46435d6f80de2455e64f11a49da1c8R44 Very interestingly, I reproduced the problem and found that synchronization didn't solve the issue for me. The problem turned out to be the default buffer size I chose regardless of codec type (128K), which led to errors in the case of snappy; with the default 256K, everything worked fine. Could you please try this patch on your cluster to confirm whether we're hitting the very same issue?
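To illustrate the comment above: a minimal sketch of choosing the decompression buffer size per codec instead of using one fixed value for every codec. This is not the code from the patch. The class `CodecBufferSizes` and its methods are hypothetical names, and the idea that a block written with a 256K buffer may not fit into a 128K buffer is an assumption about why the error appears.

```java
// Hypothetical helper, NOT the code from the PR: it only illustrates
// picking a buffer size per codec rather than one size for all codecs.
final class CodecBufferSizes {
    // Fixed size mentioned in the thread as the problematic choice (128K).
    static final int FIXED_SIZE = 128 * 1024;

    // Snappy default mentioned in the thread (256K).
    static final int SNAPPY_DEFAULT = 256 * 1024;

    private CodecBufferSizes() {}

    // Returns the codec-specific default when one is known, else the fallback.
    // Only snappy is covered here; other codecs are left to the caller.
    static int bufferSizeFor(String codecName, int fallback) {
        if ("snappy".equalsIgnoreCase(codecName)) {
            return SNAPPY_DEFAULT;
        }
        return fallback;
    }

    // Assumption: a decompressor buffer must be at least as large as the
    // uncompressed block it has to hold.
    static boolean canHold(int uncompressedBlockLength, int bufferSize) {
        return uncompressedBlockLength <= bufferSize;
    }
}
```

Under these assumptions, `canHold(256 * 1024, CodecBufferSizes.FIXED_SIZE)` is false, while `canHold(256 * 1024, bufferSizeFor("snappy", CodecBufferSizes.FIXED_SIZE))` is true, which matches the behaviour described in the comment.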
Hi @abstractdog, thanks for looking into it.
I was not able to reproduce it in a unit test, but I run this type of query on a large dataset.
I just saw the previous message; I was searching for the stack trace. I will test the latest patch.
@tprelle: a kind reminder to try out the latest patch on your cluster if you have the chance.
Hi @abstractdog, sorry, I was OOO yesterday.
Thanks for the feedback again, @tprelle!
Yes, which logs do you want me to add?
I'm about to add a few lines to my patch and will share it with you later.
Hi again @abstractdog,
@abstractdog FYI, if I combine your last commit and the IFile patch, it seems to work for both of my issues.
@tprelle: what's very strange is that those seem to be the same issue that disappeared in my environment after applying only my patch. Before guessing, I've uploaded a new patch with some additional INFO messages (0d04a2f). Could you please try this one and upload the app logs somewhere in case of a failure? UPDATE: never mind, I finally reproduced it from a unit test and am creating the final fix.
Nice, thanks :)
@tprelle: could you please do final testing with the last patch?
🎊 +1 overall
This message was automatically generated.
Hi @abstractdog, it seems to be working with the last patch.
Discussed offline: Ashutosh's +1 still holds after the latest fixes, so I'm merging this to master.
… (Laszlo Bodor reviewed by Ashutosh Chauhan)