Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not able to upload a compressed S3 file using close()/without context manager #837

Closed
jbarragan-bridge opened this issue Sep 24, 2024 · 4 comments

Comments

@jbarragan-bridge
Copy link
Contributor

Problem description

  • What are you trying to achieve?

We are using smart_open to compress and upload files to s3 without a context manager:
e.g.

hook = S3Hook()
s3_params = {
    'client': hook.get_session().client('s3'),
}
f = smart_open.open('s3://bucket/file.gz', 'w', transport_params=s3_params)
f.write('Some content')
f.close()
  • What is the expected result?
    The expected result (and the result previous version 7.0.0) is to have the MultipartWriter.close() been called to perform a S3 complete multipart upload and see the file uploaded into s3

  • What are you seeing instead?
    The file is not uploaded and MultipartWriter.close() is not called.

Steps/code to reproduce the problem

Adding the following tests to test_s3.py (class MultipartWriterTest) shows our problem:

    def test_write_gz_using_context_manager(self):
        """Does s3 multipart upload create a compressed file using context manager?"""
        contents = b'get ready for a surprise'
        with smart_open.open(
                f's3://{BUCKET_NAME}/{WRITE_KEY_NAME}.gz',
                mode="wb",
                transport_params={
                    "multipart_upload": True,
                    "min_part_size": 10,
                }
        ) as fout:
            fout.write(contents)

        with smart_open.open(f's3://{BUCKET_NAME}/{WRITE_KEY_NAME}.gz', 'rb') as fin:
            actual = fin.read()

        assert actual == contents

    def test_write_gz_not_using_context_manager(self):
        """Does s3 multipart upload create a compressed file not using context manager but close()?"""
        contents = b'get ready for a surprise'
        fout = smart_open.open(
            f's3://{BUCKET_NAME}/{WRITE_KEY_NAME}.gz',
            mode="wb",
            transport_params={
                "multipart_upload": True,
                "min_part_size": 10,
            }
        )
        fout.write(contents)
        fout.close()

        with smart_open.open(f's3://{BUCKET_NAME}/{WRITE_KEY_NAME}.gz', 'rb') as fin:
            actual = fin.read()

        assert actual == contents

Versions

This looks to be affecting from v7.0.0 to current version

Please provide the output of:

Linux-6.5.7-arch1-1-x86_64-with-glibc2.38
Python 3.11.5 (main, Sep  2 2023, 14:16:33) [GCC 13.2.1 20230801]
smart_open 7.0.4

Probably related to this PR #786 here:

https://github.com/piskvorky/smart_open/pull/786/files#diff-ba9d790d5607c344a97e86f15b3706586deb24d52d52cdf70b9776e3f3412b69L103

Recommendation

Probably closing both, outer and inner in FileLikeProxy will help:

class FileLikeProxy(wrapt.ObjectProxy):
    __inner = ...  # initialized before wrapt disallows __setattr__ on certain objects
...
    def __next__(self):
        return self.__wrapped__.__next__()

    def close(self):
        try:
            self.__wrapped__.close()
        finally:
            self.__inner.close()

I can open a PR if that helps.

Thanks for reading!!

Checklist

Before you create the issue, please make sure you have:

  • [ x ] Described the problem clearly
  • [ x ] Provided a minimal reproducible example, including any required data
  • [ x ] Provided the version numbers of the relevant software
@ddelange
Copy link
Contributor

Sounds like indeed this case was not covered. Can you open a PR?

@jbarragan-bridge
Copy link
Contributor Author

Thanks @ddelange , I created PR #838 to address the issue

I am NOT sure if that is a good solution; please let me know if I should consider another approach. Thanks for all the help!!

@ddelange
Copy link
Contributor

@jbarragan-bridge @mpenkov this issue can be closed with 7.0.5 released

@jbarragan-bridge
Copy link
Contributor Author

Yes, thank you!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants