PumpReader.createInputStream(...) returns EOF for supplementary code points #658

Closed
Marcono1234 opened this issue Feb 23, 2021 · 0 comments

Comments

@Marcono1234
Contributor

The InputStream created using org.jline.utils.PumpReader.createInputStream(Charset) returns EOF (-1) when encountering a supplementary code point (i.e. > U+FFFF) in the input, e.g.:

PumpReader reader = new PumpReader();
reader.getWriter().append("\uD83D\uDE0Atest");
InputStream in = reader.createInputStream(StandardCharsets.UTF_8);
int b = in.read(); // returns -1 (EOF); expected 0xF0, the first UTF-8 byte of U+1F60A

The reason for this is that the buffer is sized incorrectly here:

this.buffer = ByteBuffer.allocate((int) Math.ceil(encoder.maxBytesPerChar()));

It appears that encoders often encode a supplementary code point (represented by a surrogate pair of two chars) either completely or not at all. A buffer of size encoder.maxBytesPerChar() is therefore too small: for UTF-8, maxBytesPerChar() is 3, but a supplementary code point needs 4 bytes. Using 2 * encoder.maxBytesPerChar() should solve this issue.
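The sizing problem can be reproduced with the JDK encoder alone, without JLine. The following is a minimal sketch; the class name SurrogateBufferDemo and the helper encodedBytes are made up for illustration and are not part of PumpReader. It encodes a single surrogate pair into a buffer of maxBytesPerChar() bytes and then into one of twice that size.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class SurrogateBufferDemo {
    // Hypothetical helper: encodes U+1F60A (one surrogate pair) into a buffer
    // of the given capacity and returns how many bytes the encoder wrote.
    static int encodedBytes(int capacity) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        CharBuffer input = CharBuffer.wrap("\uD83D\uDE0A");
        ByteBuffer output = ByteBuffer.allocate(capacity);
        // endOfInput=false mirrors the encode call quoted in this issue
        CoderResult result = encoder.encode(input, output, false);
        System.out.println("capacity=" + capacity
                + " result=" + result
                + " bytesWritten=" + output.position());
        return output.position();
    }

    public static void main(String[] args) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        int perChar = (int) Math.ceil(encoder.maxBytesPerChar());
        encodedBytes(perChar);      // sized like the current PumpReader buffer
        encodedBytes(2 * perChar);  // sized as proposed above
    }
}
```

With OpenJDK's UTF-8 encoder, the first call should report OVERFLOW with no bytes written, and the second should write all 4 bytes. Other charsets or JDK implementations may behave differently.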

However, even when this is fixed, it would be good to improve the encoding logic by checking for OVERFLOW (instead of silently ignoring it) and throwing an AssertionError here:

CoderResult result = encoder.encode(readBuffer, output, false);
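One way such a check could look is sketched below. This is not the JLine code: CheckedEncodeSketch and encodeChecked are hypothetical names, and the sketch assumes that only an OVERFLOW which produced no bytes at all indicates a wrongly sized buffer, since an OVERFLOW after some progress is normal when more input is pending.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class CheckedEncodeSketch {
    // Hypothetical helper (not part of JLine): encodes and returns the number
    // of bytes written, failing loudly instead of silently producing nothing.
    static int encodeChecked(CharsetEncoder encoder, CharBuffer in, ByteBuffer out) {
        int before = out.position();
        CoderResult result = encoder.encode(in, out, false);
        // Assumption of this sketch: overflow without any progress means the
        // output buffer can never hold the next code point.
        if (result.isOverflow() && out.position() == before) {
            throw new AssertionError(
                    "Output buffer too small: " + out.remaining() + " bytes free");
        }
        if (result.isError()) {
            throw new IllegalStateException("Encoding failed: " + result);
        }
        return out.position() - before;
    }

    public static void main(String[] args) {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder();
        int size = 2 * (int) Math.ceil(encoder.maxBytesPerChar());
        int written = encodeChecked(encoder,
                CharBuffer.wrap("\uD83D\uDE0A"), ByteBuffer.allocate(size));
        System.out.println("bytes written: " + written);
    }
}
```

Throwing here turns the current silent EOF into an immediate failure that points at the buffer size.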
