Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

S3OpenRead doesn't work with TextIOWrapper #64

Closed
mjwillson opened this issue Mar 7, 2016 · 7 comments
Closed

S3OpenRead doesn't work with TextIOWrapper #64

mjwillson opened this issue Mar 7, 2016 · 7 comments

Comments

@mjwillson
Copy link

Currently only binary-mode read is supported, which would be OK if the stream you returned worked with TextIOWrapper -- but it doesn't, at least not S3OpenRead. TextIOWrapper expects readable, writable, seekable, closed etc methods. If you make S3OpenRead inherit (or quack like) IOBase that should fix it.

@piskvorky
Copy link
Owner

I'm not familiar with hacking these classes. If you send a PR, we can add that support.

Is it possible to add it globally to all readers? (not just S3OpenRead)

@tmylk tmylk added the wishlist label Sep 18, 2016
@tmylk
Copy link
Contributor

tmylk commented Oct 5, 2016

Ping @mjwillson

@mjwillson
Copy link
Author

mjwillson commented Oct 5, 2016

Hey, sorry I'm unlikely to get around to this as it stands -- we've worked around it like so, although this won't work for everyone:

            # smart_open's S3OpenRead doesn't support TextIOWrapper
            # and smart_open doesn't support text-mode open. So
            # we return a poor-man's line-by-line TextIOWrapper:
            return (str(line, "utf-8") for line in text_stream)

To be honest smart_open feels like quite a leaky abstraction due to patchy / partial support for various features, meaning you need extra special-cased dispatch and wrapping code after parsing the URL yourself. E.g. gzip only working for local files, some other common compression formats not supported, text mode not working for S3, https issues. Maybe some of those are fixed now, but if I was to revisit I'd be tempted to just manually wrap the streams from the underlying libraries myself.

If you wanted to address that in a deeper way, I'd suggest making sure any stream implementations here respect the stream interfaces from the standard library ( https://docs.python.org/3/library/io.html ), which would make it easier to write general and composable code that builds on them. Perhaps some of that would be easier if you give up on Python 2. Then try and implement features like compression and text mode support in a general way where possible via wrapping streams, and make the dispatch mechanisms (for protocols, compression formats etc) extensible.

@tmylk
Copy link
Contributor

tmylk commented Oct 5, 2016

Thanks for the idea of stream interfaces. Leaving open as it is an interesting direction to explore.

@mpenkov
Copy link
Collaborator

mpenkov commented Aug 10, 2017

This is related to #91. If we make our S3 objects behave like the base classes from the io package, this issue will be resolved.

@mpenkov
Copy link
Collaborator

mpenkov commented Dec 10, 2017

@menshikh-iv This is also done (PR #131 fixed it).

@menshikh-iv
Copy link
Contributor

great @mpenkov, fixed by #131

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants